v4 signing inconsistent with version v1.36.3 #3056
Comments
This is weird. Signing is pretty stable and there are not many changes that we introduce to the package, with the exception of manually blocking or allowing certain headers (the most recent example is #2991). I see you added this comment:
Can you elaborate on what that extra string is that you see? One thing I can imagine interfering (though I don't think it would be related to SDK versions) is headers that are modified on a hop-by-hop basis. There have been other cases where certain headers caused issues, such as #2594, so knowing which headers you use may be helpful as well.
I've been doing more tests, and I don't think the version of the SDK is relevant anymore, since I've had the issue happen with v1.36.0 too.
When the v4 signing fails, we see EXTRA string=. That value is always the random UUID we create to identify the request, and it is sent in a header that looks like
Mostly what I see is that this behaviour comes and goes. I can send tens of requests and none of them will fail, but then a new batch will fail. It seems that when a request succeeds, all subsequent requests also succeed. So I might run a test batch of 10 requests and they all succeed, while in another batch of 10 requests the first 2 or 3 fail but all the subsequent ones succeed.
The IAM Role we use for this test is also used for other v4 signing tests against a different endpoint that never fail, and the endpoint we have trouble with is also used by a client who v4 signs their requests with the Python SDK and never has issues either.
It seems like the more I test, the less often the issue happens, but it never fully goes away. I originally wrote that it failed around 50% of the time, while now I have to run many tests before one fails, so it's also difficult to tell whether a change has any impact.
So most of the time, when we see this kind of error, it is due to inconsistent headers. For example, retries that modify the header, proxies that modify the header, that sort of thing. The general guidance we give is to sign
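Not part of the original comment (the guidance above is cut off in the thread): a minimal sketch, assuming an API Gateway endpoint with IAM authorization, of signing an *http.Request explicitly with the SDK's v4 signer so that the signed headers are exactly the headers that go out on the wire. The endpoint URL, region, and X-Request-Id header below are placeholders, not values taken from the report.

```go
package example

import (
	"context"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"strings"
	"time"

	v4 "github.com/aws/aws-sdk-go-v2/aws/signer/v4"
	"github.com/aws/aws-sdk-go-v2/config"
)

// signAndSend builds a request, sets every header up front, signs it with the
// SigV4 signer, and sends it.
func signAndSend(ctx context.Context, body string) (*http.Response, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, err
	}
	creds, err := cfg.Credentials.Retrieve(ctx)
	if err != nil {
		return nil, err
	}

	// Placeholder endpoint; "execute-api" is the signing name for API Gateway.
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"https://example.execute-api.eu-west-1.amazonaws.com/prod/resource",
		strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Set all headers BEFORE signing so the signed headers match what is sent.
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Request-Id", "fixed-uuid-goes-here") // hypothetical custom header

	// SigV4 needs the hex-encoded SHA-256 of the exact payload being sent.
	sum := sha256.Sum256([]byte(body))
	payloadHash := hex.EncodeToString(sum[:])

	signer := v4.NewSigner()
	if err := signer.SignHTTP(ctx, creds, req, payloadHash, "execute-api", cfg.Region, time.Now()); err != nil {
		return nil, fmt.Errorf("sign request: %w", err)
	}
	return http.DefaultClient.Do(req)
}
```

If any signed header is changed after signing, by middleware, a proxy, or a retry that mutates the request, the server-side signature check fails with a 403.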
For now we "fixed" the issue by retrying after 2 seconds if the v4 signing fails. The retried call is given the exact same headers, but the request is recreated and resigned. We'll probably try to hunt down the issue further, maybe try to sign the request ourselves as per the guide you sent, or add some debugging to our current signing code. I'll close this issue, as the more I look into it the more it seems the problem is some inconsistency somewhere in our code or infrastructure rather than in the codebase here. Thanks for the answers!
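For reference, a rough sketch of the workaround described above, not the reporter's actual code. buildSignedRequest is a hypothetical callback that creates and SigV4-signs a fresh request from the same inputs on every call.

```go
package example

import (
	"context"
	"net/http"
	"time"
)

// doWithResignRetry sends a signed request and, if it comes back 403,
// rebuilds and resigns it once after a 2-second pause.
func doWithResignRetry(ctx context.Context,
	buildSignedRequest func(context.Context) (*http.Request, error)) (*http.Response, error) {

	req, err := buildSignedRequest(ctx)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil || resp.StatusCode != http.StatusForbidden {
		return resp, err
	}
	resp.Body.Close() // discard the 403 before retrying

	// Recreate and resign rather than reusing the old request: the body can
	// only be read once, and the retry needs a fresh signature anyway.
	time.Sleep(2 * time.Second)
	req, err = buildSignedRequest(ctx)
	if err != nil {
		return nil, err
	}
	return http.DefaultClient.Do(req)
}
```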
This issue is now closed. Comments on closed issues are hard for our team to see.
Acknowledgements
go get -u github.com/aws/aws-sdk-go-v2/...
Describe the bug
When signing requests with a v4 signature, we see inconsistent responses. Around 50% of the requests I tested fail, while the rest succeed.
This issue happens in the latest version at the time of testing, v1.36.3, but works as expected in v1.36.0.
Regression Issue
Expected Behavior
I expect that v4 signed requests work consistently, and that for the same request signed and sent 10 times, I get the same response 10 times.
Current Behavior
With version v1.36.3, around 50% of the requests fail with this kind of response:
Status code 403
Response body:
The other 50% of the requests respond as expected. In our case the API Gateway endpoint invokes a Lambda, and we see the correct response from the Lambda.
Reproduction Steps
We use the following code to create a v4 signed request. All imports are reasonably up to date, including the aws-sdk-go-v2 library at v1.36.3. Once we downgrade aws-sdk-go-v2 to v1.36.0, leaving all other imported libraries at their same versions, the issue can no longer be reproduced.
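Not part of the original report: pinning the core module back to v1.36.0 while leaving the other modules at their current versions can be done with commands along these lines (shown with the canonical module path):

```
go get github.com/aws/aws-sdk-go-v2@v1.36.0
go mod tidy
```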
Possible Solution
No response
Additional Information/Context
While testing to figure out the cause of the inconsistency, we tried adding delays in the form of sleeps before sending the requests; otherwise we were sending 10 or so requests one after another. Adding half a second did not seem to have an effect, but adding 1 second or more resulted in all requests succeeding. No idea why that would be.
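A sketch of the kind of test loop being described, with the inter-request pause as the knob being varied; sendSignedRequest is a hypothetical helper wrapping whatever signing code is in use.

```go
package example

import (
	"context"
	"net/http"
	"time"
)

// runBatch sends n signed requests back to back with a fixed pause between
// them and reports how many were rejected with a 403.
func runBatch(ctx context.Context, n int, pause time.Duration,
	sendSignedRequest func(context.Context) (*http.Response, error)) (int, error) {

	failed := 0
	for i := 0; i < n; i++ {
		if i > 0 {
			time.Sleep(pause) // the pause varied in the tests: 0, 500ms, 1s+
		}
		resp, err := sendSignedRequest(ctx)
		if err != nil {
			return failed, err
		}
		if resp.StatusCode == http.StatusForbidden {
			failed++
		}
		resp.Body.Close()
	}
	return failed, nil
}
```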
AWS Go SDK V2 Module Versions Used
Compiler and Version used
go 1.22.0
Operating System and version
Lambda running Amazon Linux 2