Commit a1d21c6

ramfox and “ramfox” authored
fix(iroh): remove quinn::Endpoint::wait_idle from iroh::Endpoint::close process (#3165)
## Description

`quinn::Endpoint::wait_idle` adds a minimum 3-second closing time to `iroh::Endpoint::close` if you have any connections that have not closed gracefully.

While we *want* users to close connections gracefully, we should not force users to wait 3 seconds to close the `iroh::Endpoint` when a connection has "gone wrong" and they don't want to wait.

So instead, we are taking out `quinn::Endpoint::wait_idle` and adding more context for how to close a connection gracefully.

## Notes & open questions

Before 1.0 we will be re-visiting closing to make sure the APIs make sense.

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text), if relevant.

---------

Co-authored-by: “ramfox” <“[email protected]”>
1 parent 9a75d14 commit a1d21c6

2 files changed

+25
-20
lines changed

iroh/src/endpoint.rs

Lines changed: 17 additions & 19 deletions
@@ -991,10 +991,13 @@ impl Endpoint {
     /// of `0` and an empty reason. Though it is best practice to close those
     /// explicitly before with a custom error code and reason.
     ///
-    /// It will then make a best effort to wait for all close notifications to be
-    /// acknowledged by the peers, re-transmitting them if needed. This ensures the
-    /// peers are aware of the closed connections instead of having to wait for a timeout
-    /// on the connection. Once all connections are closed or timed out, the magic socket is closed.
+    /// This will not wait for the [`quinn::Endpoint`] to drain connections.
+    ///
+    /// To ensure no data is lost, design protocols so that the last *sender*
+    /// of data in the protocol calls [`Connection::closed`], and `await`s until
+    /// it receives a "close" message from the *receiver*. Once the *receiver*
+    /// gets the last data in the protocol, it should call [`Connection::close`]
+    /// to inform the *sender* that all data has been received.
     ///
     /// Be aware however that the underlying UDP sockets are only closed
     /// on [`Drop`], bearing in mind the [`Endpoint`] is only dropped once all the clones
@@ -1393,21 +1396,16 @@ impl Connection {
     ///
     /// # Gracefully closing a connection
     ///
-    /// Only the peer last receiving application data can be certain that all data is
-    /// delivered. The only reliable action it can then take is to close the connection,
-    /// potentially with a custom error code. The delivery of the final CONNECTION_CLOSE
-    /// frame is very likely if both endpoints stay online long enough, calling
-    /// [`Endpoint::close`] will wait to provide sufficient time. Otherwise, the remote peer
-    /// will time out the connection, provided that the idle timeout is not disabled.
-    ///
-    /// The sending side can not guarantee all stream data is delivered to the remote
-    /// application. It only knows the data is delivered to the QUIC stack of the remote
-    /// endpoint. Once the local side sends a CONNECTION_CLOSE frame in response to calling
-    /// [`close`] the remote endpoint may drop any data it received but is as yet
-    /// undelivered to the application, including data that was acknowledged as received to
-    /// the local endpoint.
-    ///
-    /// [`close`]: Connection::close
+    /// Only the peer last **receiving** application data can be certain that all data is
+    /// delivered.
+    ///
+    /// To communicate to the last **sender** of the application data that all the data was received, we recommend designing protocols that follow this pattern:
+    ///
+    /// 1) The **sender** sends the last data. It then calls [`Connection::closed`]. This will wait until it receives a CONNECTION_CLOSE frame from the other side.
+    /// 2) The **receiver** receives the last data. It then calls [`Connection::close`] and provides an error_code and/or reason.
+    /// 3) The **sender** checks that the error_code is the expected error code.
+    ///
+    /// If the `close`/`closed` dance is not done, or is interrupted at any point, the connection will eventually time out, provided that the idle timeout is not disabled.
     #[inline]
     pub fn close(&self, error_code: VarInt, reason: &[u8]) {
         self.inner.close(error_code, reason)
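
The three-step `close`/`closed` pattern from the new `Connection::close` doc comment can be modelled without any networking. The sketch below is a std-only simulation and not iroh code: two `std::sync::mpsc` channels stand in for the connection, `recv_timeout` stands in for awaiting `Connection::closed` under an idle timeout, and every name in it (`Close`, `Outcome`, `ALL_DATA_RECEIVED`, `run_sender`, `run_receiver`, `simulate`) is hypothetical and chosen only for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical application-level code meaning "all data received".
const ALL_DATA_RECEIVED: u32 = 42;

/// Stand-in for a CONNECTION_CLOSE frame: an error code plus a reason.
#[derive(Debug, PartialEq)]
struct Close {
    error_code: u32,
    reason: Vec<u8>,
}

/// How the sender's wait for the peer's close ended.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The receiver closed with the expected error code.
    Delivered,
    /// The receiver closed, but with a different error code.
    UnexpectedClose(u32),
    /// No close arrived (models the connection timing out).
    NoCloseReceived,
}

/// Sender side: send the last data, then wait for the receiver's close.
fn run_sender(
    data_tx: mpsc::Sender<Vec<u8>>,
    close_rx: mpsc::Receiver<Close>,
    idle_timeout: Duration,
) -> Outcome {
    // Step 1: send the last data.
    if data_tx.send(b"last chunk".to_vec()).is_err() {
        return Outcome::NoCloseReceived;
    }
    // Then wait for the close, bounded by the (modelled) idle timeout.
    match close_rx.recv_timeout(idle_timeout) {
        // Step 3: check that the error code is the expected one.
        Ok(close) if close.error_code == ALL_DATA_RECEIVED => Outcome::Delivered,
        Ok(close) => Outcome::UnexpectedClose(close.error_code),
        Err(_) => Outcome::NoCloseReceived,
    }
}

/// Receiver side: read the last data, then optionally close.
fn run_receiver(
    data_rx: mpsc::Receiver<Vec<u8>>,
    close_tx: mpsc::Sender<Close>,
    send_close: bool,
) -> Option<Vec<u8>> {
    let data = data_rx.recv().ok()?;
    if send_close {
        // Step 2: tell the sender that all data has been received.
        let close = Close {
            error_code: ALL_DATA_RECEIVED,
            reason: b"done".to_vec(),
        };
        let _ = close_tx.send(close);
    }
    Some(data)
}

/// Runs one sender/receiver exchange and reports what the sender observed.
fn simulate(receiver_closes: bool) -> Outcome {
    let (data_tx, data_rx) = mpsc::channel();
    let (close_tx, close_rx) = mpsc::channel();
    let receiver = thread::spawn(move || run_receiver(data_rx, close_tx, receiver_closes));
    let outcome = run_sender(data_tx, close_rx, Duration::from_secs(2));
    receiver.join().expect("receiver thread panicked");
    outcome
}

fn main() {
    println!("{:?}", simulate(true));
    println!("{:?}", simulate(false));
}
```

In real code the sender would await `Connection::closed` instead of calling `recv_timeout`, and the receiver would call `Connection::close(error_code, reason)` with the signature shown in the hunk above. The second call in `main` models the interrupted case, where the sender never sees a close and can only conclude that delivery is unconfirmed.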

iroh/src/magicsock.rs

Lines changed: 8 additions & 1 deletion
@@ -1785,12 +1785,19 @@ impl Handle {
     /// Only the first close does anything. Any later closes return nil.
     /// Polling the socket ([`AsyncUdpSocket::poll_recv`]) will return [`Poll::Pending`]
     /// indefinitely after this call.
+    ///
+    /// This will not wait for the [`quinn::Endpoint`] to drain connections.
+    ///
+    /// To ensure no data is lost, design protocols so that the last *sender*
+    /// of data in the protocol calls [`crate::endpoint::Connection::closed`], and `await`s until
+    /// it receives a "close" message from the *receiver*. Once the *receiver*
+    /// gets the last data in the protocol, it should call [`crate::endpoint::Connection::close`]
+    /// to inform the *sender* that all data has been received.
     #[instrument(skip_all, fields(me = %self.msock.me))]
     pub(crate) async fn close(&self) {
         trace!("magicsock closing...");
         // Initiate closing all connections, and refuse future connections.
         self.endpoint.close(0u16.into(), b"");
-        self.endpoint.wait_idle().await;
 
         if self.msock.is_closed() {
             return;
