
fix: disables automatic content decompression #7513

Closed
wants to merge 1 commit into from


@jeremylong jeremylong requested a review from aikebah March 8, 2025 11:12
@boring-cyborg boring-cyborg bot added the utils changes to utils label Mar 8, 2025
@aikebah
Collaborator

aikebah commented Mar 9, 2025

I don't think this is the correct fix. HTTP compression (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding) is meant for compress-at-server, decompress-at-client transfers that save network bandwidth, and it can be applied to any file.

If the original media is already encoded (e.g., as a .zip file), this information is not included in the Content-Encoding header.

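When .gz files are served by a server with a Content-Encoding: gzip header set, they should be double-compressed: once the client undoes the content-encoding layer, it should still be left with the original .gz bytes.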

I would be curious to know whether the Bitbucket mirror from the source issue is really hosting the .gz files.
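The double-compression point can be sketched with the JDK's own gzip classes. This is a minimal illustration, not code from this PR: the class name `DoubleCompressionDemo`, its helpers, and the sample JSON payload are all hypothetical. It models a .gz resource that gets a second gzip layer as content-encoding, and shows that undoing only that outer layer still leaves valid gzip bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; none of these names come from the PR.
public class DoubleCompressionDemo {

    /** Wraps the given bytes in one gzip layer. */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Removes exactly one gzip layer. */
    static byte[] gunzip(byte[] data) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when the bytes start with the gzip magic number 0x1f 0x8b. */
    static boolean hasGzipMagic(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) {
        // Sample payload, assumed for illustration only.
        byte[] json = "{\"vulnerabilities\":[]}".getBytes(StandardCharsets.UTF_8);

        byte[] gzFile = gzip(json);         // the resource itself: a .json.gz file
        byte[] onTheWire = gzip(gzFile);    // Content-Encoding: gzip applied on top
        byte[] received = gunzip(onTheWire); // client undoes the content-encoding only

        // The client is left with the original .gz file, still gzip-framed.
        System.out.println("still gzip after transport decoding: " + hasGzipMagic(received));
        System.out.println(new String(gunzip(received), StandardCharsets.UTF_8));
    }
}
```

With a correctly configured server, code that expects a .gz file keeps working whether or not content-encoding was negotiated, because the transport layer is removed before the application sees the body.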

@aikebah
Collaborator

aikebah commented Apr 6, 2025

@jeremylong I got confused by the PR title. Your change would not disable automatic content decompression, but would prevent the entire use of content compression.

It would increase overall network traffic for any resource that would normally use content compression on the connection.

In my view the root of the issue is still a (proxy?) configuration error: an explicit request for a (binary) .json.gz file is wrongly answered with the .json file served with gzip content-encoding whenever the client indicates that it knows how to handle gzip content-encoding.
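A rough sketch of how such a misconfiguration would surface on the client, using only the JDK. The class `MisservedGzDemo`, its helpers, and the sample payload are hypothetical and not part of this PR. The assumption is that the HTTP client has already removed the gzip content-encoding transparently, so the application receives plain JSON where it expected .gz bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipException;

// Hypothetical demo class; none of these names come from the PR.
public class MisservedGzDemo {

    /** Wraps the given bytes in one gzip layer. */
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /**
     * Tries to read the body the way code expecting a .gz file would.
     * Returns null on success, or the ZipException message on failure.
     */
    static String tryReadAsGzip(byte[] body) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(body))) {
            gz.readAllBytes();
            return null;
        } catch (ZipException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample payload, assumed for illustration only.
        byte[] json = "{\"vulnerabilities\":[]}".getBytes(StandardCharsets.UTF_8);

        // Correct serving: after transport decoding the body is the .gz file.
        byte[] correctBody = gzip(json);
        System.out.println("correct serving error: " + tryReadAsGzip(correctBody));

        // Misconfigured serving: the .json was sent with gzip content-encoding,
        // the client decoded it, and the body is now plain JSON.
        byte[] misservedBody = json;
        System.out.println("misserved error: " + tryReadAsGzip(misservedBody));
        // The second line reports "Not in GZIP format".
    }
}
```

Disabling content compression on the client hides this symptom only because the server then no longer has a reason to substitute the .json representation; the mis-served mapping itself would remain.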

@aikebah
Collaborator

aikebah commented Apr 6, 2025

As an example, it would increase data traffic for the KEV JSON from 127_750 to 1_154_748 bytes.

Tracing the headers with and without the fix shows that KEV serving is apparently optimized for gzip content-encoding: when the resource is requested without gzip as an accepted encoding, the server switches to streaming the file with Transfer-Encoding: chunked:

With your fix:

[INFO] Updating CISA Known Exploited Vulnerability list: https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json
Sending header - Host: www.cisa.gov
Sending header - Connection: keep-alive
Sending header - User-Agent: Apache-HttpClient/5.4.2 (Java/21.0.6)
Protocol version: HTTP/1.1
Received header - Content-Type: application/json
Received header - Server: Apache
Received header - X-Content-Type-Options: nosniff
Received header - Last-Modified: Fri, 04 Apr 2025 18:55:10 GMT
Received header - ETag: "119ebc-631f86d00a468"
Received header - Cache-Control: max-age=2803
Received header - Expires: Sun, 06 Apr 2025 13:10:24 GMT
Received header - Date: Sun, 06 Apr 2025 12:23:41 GMT
Received header - Transfer-Encoding: chunked
Received header - Connection: keep-alive
Received header - Connection: Transfer-Encoding
Received header - Strict-Transport-Security: max-age=31536000 ; includeSubDomains

Without your fix (.disableContentCompression() commented out):

[INFO] Updating CISA Known Exploited Vulnerability list: https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json
Sending header - Accept-Encoding: gzip, x-gzip, deflate
Sending header - Host: www.cisa.gov
Sending header - Connection: keep-alive
Sending header - User-Agent: Apache-HttpClient/5.4.2 (Java/21.0.6)
Protocol version: HTTP/1.1
Received header - Content-Type: application/json
Received header - Server: Apache
Received header - X-Content-Type-Options: nosniff
Received header - Last-Modified: Fri, 04 Apr 2025 18:55:10 GMT
Received header - ETag: "119ebc-631f86d00a468"
Received header - Accept-Ranges: bytes
Received header - Content-Encoding: gzip
Received header - Content-Length: 127750
Received header - Cache-Control: max-age=3069
Received header - Expires: Sun, 06 Apr 2025 13:10:24 GMT
Received header - Date: Sun, 06 Apr 2025 12:19:15 GMT
Received header - Connection: keep-alive
Received header - Vary: Accept-Encoding
Received header - Strict-Transport-Security: max-age=31536000 ; includeSubDomains
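
The size figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the class `KevTransferSizes` and its method names are hypothetical. It also parses the leading hex field of the ETag from the traces, which happens to decode to the same number as the uncompressed size; reading that field as a file size is an inference from the numbers matching, not something the traces state.

```java
// Hypothetical helper class; names are illustrative, figures come from the thread.
public class KevTransferSizes {

    static final long COMPRESSED_BYTES = 127_750L;     // Content-Length with gzip
    static final long UNCOMPRESSED_BYTES = 1_154_748L; // size quoted for the plain JSON

    /** How many times larger the transfer becomes without content compression. */
    static double inflationFactor() {
        return (double) UNCOMPRESSED_BYTES / COMPRESSED_BYTES;
    }

    /** Percentage of bytes saved by gzip content-encoding. */
    static double savedPercent() {
        return 100.0 * (1.0 - (double) COMPRESSED_BYTES / UNCOMPRESSED_BYTES);
    }

    /** Parses the hex field before the dash of a quoted ETag such as "119ebc-631f86d00a468". */
    static long etagLeadingField(String etag) {
        String stripped = etag.replace("\"", "");
        return Long.parseLong(stripped.substring(0, stripped.indexOf('-')), 16);
    }

    public static void main(String[] args) {
        System.out.printf("inflation factor: %.2f%n", inflationFactor());
        System.out.printf("saved by gzip: %.1f%%%n", savedPercent());
        System.out.println("ETag leading field: " + etagLeadingField("\"119ebc-631f86d00a468\""));
    }
}
```

In other words, disabling content compression makes this one download roughly nine times larger.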

@jeremylong jeremylong closed this Apr 11, 2025
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 12, 2025
Development

Successfully merging this pull request may close these issues.

java.util.zip.ZipException: Not in GZIP format when using a ndv datafeed mirror created with vulnz
2 participants