Skip to content

Fix multipart Content-Type headers with both boundary and charset parameters #1516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 9, 2023

Conversation

mgaillard
Copy link
Contributor

This PR is for Issue #1515.

If the webserver returns a multipart response and include both boundary and charset parameters in the Content-Type header, then only the boundary needs to be read by parse_multipart_boundary(). I am not sure that including boundary + charset is something that follows the standard, but when querying an openresty server, we get this case.
I tried my best to read the RFC standards and do something that works in the general case. According to section 5.1.1 of the RFC2046, the boundary cannot contain a semicolon, so I just search for the first semicolon after the "boundary=" keyword and cut the substring at this position. I added four tests to check that it works.

httplib.h Outdated
@@ -3890,7 +3890,8 @@ inline bool parse_multipart_boundary(const std::string &content_type,
std::string &boundary) {
auto pos = content_type.find("boundary=");
if (pos == std::string::npos) { return false; }
boundary = content_type.substr(pos + 9);
auto end = content_type.find(';', pos);
boundary = content_type.substr(pos + 9, end - pos - 9);
Copy link
Owner

@yhirose yhirose Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could something like below be easier to understand?

auto boundary_keyword = "boundary=";
auto pos = content_type.find(boundary_keyword);
auto beg = pos + strlen(boundary_keyword);
boundary = content_type.substr(beg, end - beg);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, totally. I agree. I will make the modification tomorrow. Thanks for the suggestion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and pushed on my branch.

@yhirose
Copy link
Owner

yhirose commented Mar 8, 2023

@mgaillard could you please take a look at some compile errors on GitHub Actions CI (ubuntu-latest)? Thanks!

@mgaillard
Copy link
Contributor Author

I could not reproduce the problem on my side, but I think it has to do with the fact that the build is split into .h and .cpp maybe? I noticed that there was no forward declaration for parse_multipart_boundary() so I added it in hope that it would help.

@yhirose yhirose merged commit df74526 into yhirose:master Mar 9, 2023
@yhirose
Copy link
Owner

yhirose commented Mar 9, 2023

@mgaillard thanks for the fine contribution!

ExclusiveOrange pushed a commit to ExclusiveOrange/cpp-httplib-exor that referenced this pull request May 2, 2023
…ameters (yhirose#1516)

* Fix multipart Content-Type headers with both boundary and charset parameters

* Improve code readability

* Add missing forward declaration

---------

Co-authored-by: Mathieu Gaillard <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants