-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(s2n-quic-transport): Fixes s2n-quic TLS state incongruity when s2n-tls returns pending frequently #2574
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request addresses issue #2484 by ensuring that the handshake space is correctly discarded when a TLS handshake completes during unexpected wakeup events.
- Added logic in on_wakeup() to discard the handshake space when confirmed
- Inline documentation referencing RFC 9001 for context
Comments suppressed due to low confidence (1)
quic/s2n-quic-transport/src/connection/connection_impl.rs:1142
- [nitpick] Consider adding a unit test that simulates multiple wakeups with pending TLS responses to verify that the handshake space is discarded as expected.
if self.space_manager.handshake().is_some() && self.space_manager.is_handshake_confirmed() {
pub server_endpoint: Option<E>, | ||
pub client_endpoint: Option<E>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to have 2 different types here. Nor do they need to be Option
s. I'd also prefer to make the field private with a new
function, just so we have flexibility to add fields in the future.
pub server_endpoint: Option<E>, | |
pub client_endpoint: Option<E>, | |
endpoint: E, |
} | ||
|
||
fn max_tag_length(&self) -> usize { | ||
todo!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be delegated to the inner endpoint type, which is easier if you just have the one.
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
#[cfg(feature = "s2n-quic-tls")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also test the default
one since these features aren't always enabled.
.expect("Server should exist") | ||
.new_server_session(transport_parameters); | ||
SlowSession { | ||
defer: 10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we picking 10? I think even 1 is probably good enough for most cases. Maybe we should take this as a parameter?
Release Summary:
Resolved issues:
resolves #2484
Description of changes:
Basically #2484 hit a debug assert failure where the TLS implementation had confirmed the handshake, but the s2n-quic space manager had not yet discarded its handshake space. In a normal handshake, the function handle_handshake_packet() calls update_crypto_state(), which eventually completes the handshake. Then the function on_processed packet() discards the handshake space.
However, in a weird handshake with a bunch of wakeups and Pendings returned from the TLS implementation, the handshake completes during the call to update_crypto_state() from the on_wakeup() function. And in that case the handshake space doesn't get discarded.
To fix this, I added code to discard the handshake space in the on_wakeup() function.
Call-outs:
An alternative approach that works is to move the code to discard the handshake space from on_processed_packet() to update_crypto_state(). I don't have strong intuition on which approach is better.
Testing:
Added a new integ test that reproduces the issue. Therefore, we can assert that this fix does solve the issue.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.