Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translate errors to PromQL errors by wrapping queryable. #2941

Merged
merged 7 commits into from
Jul 31, 2020
Merged

Translate errors to PromQL errors by wrapping queryable. #2941

merged 7 commits into from
Jul 31, 2020

Conversation

pstibrany
Copy link
Contributor

@pstibrany pstibrany commented Jul 28, 2020

What this PR does: This is a refinement of PR #2571, which however only translated errors coming from the store, and ignored errors from ingesters. By registrering error-translation layer at API creation time, it will cover all possible error paths.

This PR also changes errors reported by various queryables to default to 500 instead of previous 422.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Peter Štibraný <[email protected]>
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a CHANGELOG entry, please? This should fix incorrect error status codes propagated back from the querier.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick test injecting return httpgrpc.Errorf(http.StatusRequestEntityTooLarge, "mocked error") in v2QueryStream() and then I run ./development/tsdb-blocks-storage-s3/compose-up.sh.

I've got a 500 response both when querying the querier and query-frontend. The response content is:

{"status":"error","errorType":"internal","error":"expanding series: rpc error: code = Code(413) desc = mocked error"}

All unknown errors are translated to promql.ErrStorage
already at API level of the querier.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
@pstibrany
Copy link
Contributor Author

Thanks Marco, I've fixed the problem -- it was caused by distributor_queryable.go wrapping all errors coming from ingester into promql.ErrStorage -- which is reported as 500.

I've removed promql.ErrStorage from most places where it was used, as new queryable error-translation feature introduced in this PR reports all unknown errors as promql.ErrStorage by default.

This change of default is a potentionally breaking change for other places which didn't use promql.ErrStorage wrapping before -- such errors were reported as 422. Now they will be wrapped into ErrStorage and thus 500. (In chunk_store_queryable.go, unknown errors already defaulted to ErrStorage wrapping though).

I think user-errors reported by queryables should use some explicit errors and we will add them to pkg/api/queryable.go.

"github.com/cortexproject/cortex/pkg/util"
)

func TestApiStatusCodes(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be worth adding a test for context.DeadlineExceeded (we agreed to return 500).

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did few manual tests and LGTM! Good job 👏

@pstibrany pstibrany merged commit 527e1c4 into cortexproject:master Jul 31, 2020
@pstibrany
Copy link
Contributor Author

@gotjosh @jtlisi Should ruler use this error-translation as well? (How exactly does ruler handle errors from querier?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants