-
Notifications
You must be signed in to change notification settings - Fork 621
cloud_storage_clients
: check Content-Type
header before parsing REST error responses in s3_client
#25738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cloud_storage_clients
: check Content-Type
header before parsing REST error responses in s3_client
#25738
Conversation
cloud_storage_clients
: check Content-Type
header before parsing REST error responsescloud_storage_clients
: check Content-Type
header before parsing REST error responses in s3_client
CI test resultstest results on build#64237
test results on build#64258
test results on build#64526
|
4ef35e1
to
e060ecb
Compare
e060ecb
to
78455b6
Compare
`ss::httpd::function_handler` calls `rep->done(_type)` on the reply before returning it. Because certain code paths in imposter services may want to return a different `Content-Type` in their response, this impl's `handle()` function allows for modifying the `type` set in the function handler's body before the final call to `done(type)` is performed. This class also handles an oversight in `mime_types.cc`, in which `xml` is not handled as a possible `mime` type.
All S3 error codes should be returned using `application/xml` as the `Content-Type` header.
There are code paths here that possibly return a `xml` body for their response. They need to ensure the response will be properly marked with `application/xml` in the `Content-Type` header.
…ly_with()` Sometimes we need to be able to set the headers and the body of a response. Add a new overload that allows us to do so.
…tor_test` The `not_found_response` is XML encoded.
…t_test` More XML encoded responses that need their `Content-Type` headers properly set.
Future commits will add use of this function to the `s3_client`. For that reason, move the `enum class response_content_type` to `types.h`, and `get_response_content_type()` helper function to `util.cc/.h`.
According to the S3 docs (`https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html`), REST error responses _should_ have the Content-Type header set to `application/xml`. However, response like `503 Service Unavailable` may not be of this assumed `Content-Type`. As seen in CI (abridged output): ``` WARN s3 - s3_client.cc:841 - S3 PUT request failed for key "...": Service Unavailable [Content-Type: text/plain; charset=utf-8]; ... ERROR s3 - util.cc:209 - !!parsing error boost::wrapexcept<boost::property_tree::xml_parser::xml_parser_error> (<unspecified file>(1): expected <) ERROR s3 - s3_client.cc:441 - !!error parse error boost::wrapexcept<boost::property_tree::xml_parser::xml_parser_error> (<unspecified file>(1): expected <) ``` To fix this issue, read the `Content-Type` header in `s3_client` instead of assuming `xml` as the `Content-Type` and attempting to parse it. If the `Content-Type` is not `xml`, we will return a partially filled `rest_error_response`.
78455b6
to
1cc76c7
Compare
} | ||
} | ||
|
||
auto result_prefix = buf.empty() ? "Empty error response, " : ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we dump few bytes of the response at trace log level to aid debugging in the future if we fall into this unexpected branch with non-xml/non-empty response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that would be a good idea, I can add it in a follow up
/cdt |
Running a cdt test for a sanity check, will merge after. |
/ci-repeat |
/backport v25.1.x |
/backport v24.3.x |
Failed to create a backport PR to v24.3.x branch. I tried:
|
According to the S3 docs, REST error responses should have the Content-Type header set to
application/xml
. However, response like503 Service Unavailable
may not be of this assumedContent-Type
.As seen in CI (abridged output):
To fix this issue, read the
Content-Type
header ins3_client
instead of assumingxml
as theContent-Type
and attempting to parse it. If theContent-Type
is notxml
, we will return a partially filledrest_error_response
.There are a number of JIRA tickets associated with this
xml_parser_error
- however, there may be an actual root cause for these errors that are not fixed by simply fixing thexml
parsing.Commits TL;DR
The first 6 commits deal with our unit and fixture testing expectations, since previously we would always parse S3 REST responses as
XML
without checking theContent-Type
header.Now that we don't, many changes had to be made through
http
,cloud_storage_clients
andcloud_storage
tests to ensure responses from imposter services which have content encoded withXML
also now have a properly setContent-Type
header.The last 2 commits are the real implementation changes for this PR.
Backports Required
Release Notes
Improvements
Content-Type
REST error responses in thes3_client
.