Is your feature request related to a problem? Please describe
During a recent incident we experienced partial search failures caused by an unhandled exception at the shard level. Even though we had full logging enabled on the OpenSearch service, the exception was not logged to CloudWatch. The only place the error surfaced was in the HTTP response. Because we were not aware that this kind of error could occur, we did not expect per-shard errors in the response, and our application did not log them.
Our understanding is that this happens because OpenSearch treats a search with failed shards as a successful response. AWS Support pointed us at allow_partial_search_results, and based on their guidance we are now implementing it. However, we would appreciate a solution at the OpenSearch level that would have helped us identify the issue sooner.
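For anyone else landing here, a minimal sketch of what setting that parameter on a request could look like. It assumes the parameter is passed as a query-string parameter on the `_search` endpoint; the helper name `build_search_url`, the host, and the index name are hypothetical and only for illustration.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, index: str, allow_partial: bool = False) -> str:
    """Build a _search URL that sets allow_partial_search_results explicitly.

    With the value "false", the intent is that a search with failed shards
    is returned as an error instead of a partial (but "successful") result.
    """
    # The parameter value is sent as the lowercase strings "true" / "false".
    query = urlencode({"allow_partial_search_results": str(allow_partial).lower()})
    return f"{base_url.rstrip('/')}/{index}/_search?{query}"


# Hypothetical host and index, for illustration only.
print(build_search_url("https://search.example.internal:9200", "my-index"))
```

Note that this changes the failure mode rather than the logging: the caller gets an error to handle instead of silently incomplete results.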
Describe the solution you'd like
It would be preferable if shard-level partial errors were logged to CloudWatch Logs, so that they surface immediately when investigating a failure, instead of requiring us to deploy changes to our running application (and to understand the client SDK's event hook mechanism).
Related component
Search:Resiliency
Describe alternatives you've considered
We have since implemented an on-response listener that handles this and logs the errors at the application level. I imagine most users don't implement such a listener, given that the examples don't cover it and the relevant elements of the response only appear when failures actually occur.
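A rough sketch of the kind of check such a listener performs, not our actual implementation. It assumes the search response has already been parsed into a dict and relies on the `_shards` section (`total`, `failed`, `failures`) of the response body; the function name `log_shard_failures` and the logger name are hypothetical.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger("search.shard_failures")  # hypothetical logger name


def log_shard_failures(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Log per-shard failures found in a parsed search response.

    Returns the list of failure entries so callers can also act on them.
    """
    shards = response.get("_shards", {})
    failures = shards.get("failures", [])
    failed = shards.get("failed", len(failures))

    if failed or failures:
        logger.warning(
            "search returned partial results: %s of %s shards failed",
            failed,
            shards.get("total", "unknown"),
        )
    for failure in failures:
        reason = failure.get("reason", {})
        logger.error(
            "shard failure index=%s shard=%s node=%s type=%s reason=%s",
            failure.get("index"),
            failure.get("shard"),
            failure.get("node"),
            reason.get("type"),
            reason.get("reason"),
        )
    return failures
```

Calling this on every search response, for example from the client's response hook, means the failures end up in the application logs even though the HTTP status indicates success.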
Additional context
Based on an AWS Support case; the case number can be provided on request, as I'm not sure whether case numbers are sensitive.
Thanks @sander-bol for bringing it here.
This is related to AWS OpenSearch Service and not OpenSearch, so you may want to reach out to AWS support directly for this.
No problem! As with the other issue, I appreciate your response and will push it back to AWS Support for internal follow-up as improvements to the service.