Google Cloud Storage bucket: putObject access broken since 2.30.0 #5987

Open
1 task done
uwolfer opened this issue Mar 25, 2025 · 8 comments
Labels
bug This issue is a bug. p2 This is a standard priority issue

Comments

@uwolfer

uwolfer commented Mar 25, 2025

Describe the bug

When using the library against a Google Cloud Storage bucket, uploading an object fails starting with 2.30.0 (tested up to 2.31.6).

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

PutObject should work as it did up to 2.29.52.

Current Behavior

It throws

software.amazon.awssdk.services.s3.model.S3Exception: Invalid argument. (Service: S3, Status Code: 403, Request ID: null) (SDK Attempt Count: 1)

Stepping into the code a bit, I can see that a SignatureDoesNotMatch error is received.


Reproduction Steps

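      import java.net.URI;
      import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
      import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
      import software.amazon.awssdk.core.sync.RequestBody;
      import software.amazon.awssdk.regions.Region;
      import software.amazon.awssdk.services.s3.S3Client;
      import software.amazon.awssdk.services.s3.S3Configuration;
      import software.amazon.awssdk.services.s3.model.PutObjectRequest;

      // Client pointed at the GCS S3-compatible endpoint, using path-style addressing.
      var credentials = AwsBasicCredentials.create("...", "...");
      var s3Client = S3Client.builder()
              .credentialsProvider(StaticCredentialsProvider.create(credentials))
              .endpointOverride(URI.create("https://storage.googleapis.com"))
              .region(Region.of("auto"))
              .serviceConfiguration(S3Configuration.builder().pathStyleAccessEnabled(true).build())
              .build();
      // bucketName, key, contentType and contents (a byte[]) are defined elsewhere.
      var request =
          PutObjectRequest.builder()
              .bucket(bucketName)
              .key(key)
              .contentType(contentType)
              .contentLength((long) contents.length)
              .build();
      s3Client.putObject(request, RequestBody.fromBytes(contents));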

Possible Solution

No response

Additional Information/Context

Note: I noticed the following in the changelog, which could be related:

This change enhances integrity protections for new SDK requests to S3. S3 SDKs now support the CRC64NVME checksum algorithm, full object checksums for multipart S3 objects, and new default integrity protections for S3 requests.

AWS Java SDK version used

2.31.6

JDK version used

21.0.6

Operating System and version

archlinux (up-to-date)

@uwolfer uwolfer added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 25, 2025
@uwolfer
Author

uwolfer commented Mar 25, 2025

Looks like S3ClientBuilder#requestChecksumCalculation(WHEN_REQUIRED) resolves this issue. Is this expected?

@bhoradc

bhoradc commented Mar 25, 2025

Hello @uwolfer,

Thank you for reporting this issue. Yes, using S3ClientBuilder#requestChecksumCalculation(WHEN_REQUIRED) is one of the supported ways to handle this scenario, particularly when working with S3-compatible storage services that don't support the S3 Default integrity protection change which was released in Java SDK v2.30.0.

I notice that your code works fine when connecting to Amazon S3 but fails with Google Cloud Storage. This is expected behavior as:

  1. Starting from AWS SDK for Java 2.x version 2.30.0, we introduced enhanced data integrity protections that are enabled by default. These include automatic checksum calculations for requests and validations for responses.

  2. These integrity features work seamlessly with Amazon S3 as they're fully supported by the service. However, S3-compatible storage services may not support these enhanced features, resulting in the SignatureDoesNotMatch error you encountered. You would need to follow up with the third-party storage provider for a resolution.

You may refer to the Disclaimer section from this Announcement regarding the specific change.

Disclaimer: the AWS SDKs and CLI are designed for usage with official AWS services. We may introduce and enable new features by default, such as these new default integrity protections, prior to them being supported or otherwise handled by third-party service implementations. You can disable the new behavior with the WHEN_REQUIRED value for the request_checksum_calculation and response_checksum_validation configuration options covered in Data Integrity Protections for Amazon S3.
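
A minimal sketch of the opt-out described in the disclaimer, applied JVM-wide instead of per client. It assumes that the two options are exposed to the Java SDK as the system properties `aws.requestChecksumCalculation` and `aws.responseChecksumValidation`; these names do not appear in this thread and should be checked against the "Data Integrity Protections for Amazon S3" reference. The class and method names are hypothetical.

```java
// Hypothetical helper: opts the whole JVM out of the 2.30.0+ default checksum behavior.
final class ChecksumDefaults {

    // Assumed property names; verify them against the SDK reference guide.
    static final String REQUEST_CHECKSUM_CALCULATION = "aws.requestChecksumCalculation";
    static final String RESPONSE_CHECKSUM_VALIDATION = "aws.responseChecksumValidation";

    // Must run before any S3 client is built so that the SDK picks the values up.
    static void useWhenRequired() {
        System.setProperty(REQUEST_CHECKSUM_CALCULATION, "WHEN_REQUIRED");
        System.setProperty(RESPONSE_CHECKSUM_VALIDATION, "WHEN_REQUIRED");
    }

    public static void main(String[] args) {
        useWhenRequired();
        System.out.println(REQUEST_CHECKSUM_CALCULATION + "="
                + System.getProperty(REQUEST_CHECKSUM_CALCULATION));
        System.out.println(RESPONSE_CHECKSUM_VALIDATION + "="
                + System.getProperty(RESPONSE_CHECKSUM_VALIDATION));
    }
}
```

The per-client alternative is the S3ClientBuilder#requestChecksumCalculation(WHEN_REQUIRED) call mentioned earlier in this thread; setting it on the builder keeps the change scoped to the client that talks to the third-party endpoint.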

Regards,
Chaitanya

@bhoradc bhoradc added closing-soon This issue will close in 4 days unless further comments are made. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Mar 25, 2025
@uwolfer
Author

uwolfer commented Mar 25, 2025

Thanks a lot @bhoradc for the detailed feedback.

I have not found the linked announcement. I would somehow have expected this in changelogs or release notes.

But still, I see two things to improve:

  1. properly map the error (SignatureDoesNotMatch should not be swallowed and result in a generic "Invalid argument." exception).
  2. add a hint in the exception to the docs in case this error happens, this would have saved me a few hours of analysis.

@github-actions github-actions bot removed the closing-soon This issue will close in 4 days unless further comments are made. label Mar 25, 2025
@bhoradc

bhoradc commented Apr 1, 2025

Hello @uwolfer,

I have not found the linked announcement. I would somehow have expected this in changelogs or release notes.

Thank you again for your valuable feedback, as it helps us improve our communication and documentation for future releases.

  1. properly map the error (SignatureDoesNotMatch should not be swallowed and result in a generic "Invalid argument." exception).

Regarding the above suggestion, it's important to note that the AWS SDK for Java is primarily designed and optimized for usage with official AWS services like Amazon S3.

In this particular case, the SignatureDoesNotMatch error is likely being returned by the third-party S3-compatible service because it does not support the enhanced integrity protections introduced in Java SDK v2.30.0. We may not have comprehensive error mapping and handling for every possible error code and scenario from third-party implementations.

  2. add a hint in the exception to the docs in case this error happens, this would have saved me a few hours of analysis.

It is challenging for AWS SDK for Java to include specific hints or documentation references for errors originating from third-party services.

As I previously mentioned, the same code works fine when using the AWS SDK for Java with Amazon S3, but encounters the SignatureDoesNotMatch error when used with Google Cloud Storage. This highlights the compatibility challenges that arise when using the SDK with third-party S3-compatible services, which aligns with the disclaimer mentioned in our announcement.

Regards,
Chaitanya

@bhoradc bhoradc added the closing-soon This issue will close in 4 days unless further comments are made. label Apr 1, 2025
@uwolfer
Author

uwolfer commented Apr 1, 2025

@bhoradc Thanks for your detailed feedback. I get your point.

I was under the impression that SignatureDoesNotMatch is a generic S3 error, not specific to GCS, but I might be wrong here.

In the linked announcement, there is a comment regarding MD5 checksum validation. Do you know whether that is required to preserve pre-2.30 behavior when using WHEN_REQUIRED? Or does WHEN_REQUIRED restore exactly the pre-2.30 integrity-check behavior without any further additions?

@github-actions github-actions bot removed the closing-soon This issue will close in 4 days unless further comments are made. label Apr 1, 2025
@bhoradc

bhoradc commented Apr 2, 2025

Hello @uwolfer,

The Java SDK uses CRC32 as the default checksum algorithm from v2.30.0 onwards. Though setting WHEN_REQUIRED limits when checksums are calculated, it won't change the algorithm back to MD5.

For third-party storage providers that only accept MD5 checksums, implementing the MD5RequiredOperationInterceptor would be the workaround.

Therefore, WHEN_REQUIRED alone won't be sufficient to maintain compatibility with third-party storage providers for operations that only accept MD5.
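
To make the difference between the two algorithms concrete, here is a small sketch using only the JDK. It computes the Base64 values that would be carried in an `x-amz-checksum-crc32` header and in a `Content-MD5` header for the same payload. It is not the MD5RequiredOperationInterceptor mentioned above and does not use the SDK; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.zip.CRC32;

// Hypothetical helper that shows what each checksum header value looks like.
final class ChecksumHeaderSketch {

    // Base64 of the 4-byte big-endian CRC32, the form used by x-amz-checksum-crc32.
    static String crc32Base64(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        byte[] bigEndian = ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
        return Base64.getEncoder().encodeToString(bigEndian);
    }

    // Base64 of the 16-byte MD5 digest, the form used by Content-MD5.
    static String md5Base64(byte[] payload) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(payload);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println("x-amz-checksum-crc32: " + crc32Base64(payload));
        System.out.println("Content-MD5: " + md5Base64(payload));
    }
}
```

A provider that validates only Content-MD5 has no use for the CRC32 value, which is why limiting when checksums are sent does not by itself restore MD5.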

Hope that clarifies.

Regards,
Chaitanya

@bhoradc bhoradc added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. closing-soon This issue will close in 4 days unless further comments are made. labels Apr 2, 2025
@uwolfer
Author

uwolfer commented Apr 3, 2025

Thanks @bhoradc for your support.

While I still think it is a pity that v2.30.0 broke support with many 3rd party storage providers (and this was not announced in a way I would have noticed before opening this ticket), I also get your point that you officially only want to support AWS S3.

It would be nice to have a fallback that switches to MD5 (or at least a config option for it) without having to write our own code to handle this.

For now, I will stay on a pre-v2.30.0 version and hope the situation improves over time (either Google's S3 implementation adds those advanced integrity checks, or this library adds out-of-the-box support for third parties again).

@github-actions github-actions bot removed closing-soon This issue will close in 4 days unless further comments are made. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. labels Apr 3, 2025
@steveloughran

@uwolfer you are not alone apache/hadoop#7494

Hadoop has just stopped updating the SDK, at least for the next release or two. 2.29.52 addresses most of our known issues, but not all.


3 participants