fix(iroh): Re-batch received relay datagram batches in case they exceed max_receive_segments #3414

Merged
merged 6 commits into from
Jul 29, 2025

Conversation

matheus23
Member

@matheus23 matheus23 commented Jul 29, 2025

Description

Fixes a panic in RelayTransport::poll_recv with the new relay code.

Since #3389 we forward the whole GSO batch via the relay. The receiving endpoint then reads that batch and tries to fit it into its buf_out. That buffer might not be big enough, because it is sized according to the receiver's max_receive_segments, which can differ from the sending endpoint's max_transmit_segments.

So in case we get a batch that's too big, we use a DatagramReBatcher to split up our datagram batch into chunk sizes we can consume.
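To illustrate the re-batching idea, here is a minimal sketch that uses only the standard library. It is not the code from this PR: the `Datagrams` type below is a simplified stand-in (assumed to hold a `Vec<u8>` plus an optional segment size, whereas the real type is backed by `Bytes`), and `rebatch` is a hypothetical helper, not the actual `DatagramReBatcher`.

```rust
use std::num::NonZeroUsize;

/// Simplified, hypothetical stand-in for iroh's `Datagrams`.
#[derive(Debug, Clone, PartialEq)]
pub struct Datagrams {
    /// All datagram payloads, concatenated.
    pub contents: Vec<u8>,
    /// Size of each segment; `None` means `contents` is one datagram.
    pub segment_size: Option<NonZeroUsize>,
}

impl Datagrams {
    /// Splits up to `num_segments` segments off the front of the batch.
    /// Returns `None` when the batch is empty.
    pub fn take_segments(&mut self, num_segments: usize) -> Option<Datagrams> {
        if self.contents.is_empty() {
            return None;
        }
        let split_at = match self.segment_size {
            // A single datagram cannot be split, so hand over everything.
            None => self.contents.len(),
            // Assumption of this sketch: treat 0 as 1 so that every call
            // makes progress.
            Some(size) => size
                .get()
                .saturating_mul(num_segments.max(1))
                .min(self.contents.len()),
        };
        // Keep the tail in `self` and return the head.
        let rest = self.contents.split_off(split_at);
        let taken = std::mem::replace(&mut self.contents, rest);
        Some(Datagrams {
            contents: taken,
            segment_size: self.segment_size,
        })
    }
}

/// Hypothetical helper: splits `batch` so that no chunk carries more than
/// `max_receive_segments` segments.
pub fn rebatch(mut batch: Datagrams, max_receive_segments: usize) -> Vec<Datagrams> {
    let mut chunks = Vec::new();
    while let Some(chunk) = batch.take_segments(max_receive_segments) {
        chunks.push(chunk);
    }
    chunks
}
```

With five 2-byte segments and `max_receive_segments = 2`, this sketch yields three chunks of 4, 4 and 2 bytes.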

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.

@matheus23 matheus23 self-assigned this Jul 29, 2025

github-actions bot commented Jul 29, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3414/docs/iroh/

Last updated: 2025-07-29T13:51:06Z

@matheus23 matheus23 force-pushed the matheus23/fragment-datagrams branch from 1ba8ebc to 1cae9a2 Compare July 29, 2025 13:28

github-actions bot commented Jul 29, 2025

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: c15d489

/// will result in making `self` empty and returning essentially a clone of `self`.
///
/// Calling this on an empty datagram batch (i.e. one where `contents.is_empty()`) will return `None`.
pub fn take_segments(&mut self, num_segments: usize) -> Option<Datagrams> {
Contributor

Goodness I love Bytes.

segment_size = ?dm.datagrams.segment_size,
"dropping received datagram: quinn buffer too small"
);
break;
Contributor

One, in my opinion large, downside of your approach is that you have to plumb the signal of how large this can be all the way through to the ActiveRelayActor. I find all that plumbing really unfortunate and complex.

Have you considered having an Option<Datagrams> on the RelayTransport itself? I think that way you could store the split off ones here locally. When you get called next you first consume that and then continue reading off the channel. Would that not be a lot simpler?

Member Author

Didn't consider that.
But thinking about how I'd do this, I'm somewhat afraid of messing up the waking logic in that case.
I'd prefer not to have to mess with wakers :S

Member Author

FWIW I think this is very similar to the ipv6_reported: Arc<AtomicBool> we already use. I tried to move the max_receive_segments declarations close to that if possible.

Contributor

Do you really need to mess with wakers? When you receive something that doesn't fit you will return a Poll::Ready and no waker is expected to be installed. Next time it gets called it again returns Poll::Ready because there's something in the Option. When the Option is empty you poll the channel and things behave as before.
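A rough sketch of that suggestion, using only the standard library. Everything here is hypothetical and simplified: `Datagrams` is modelled as a list of segments, `FakeChannel` stands in for the real channel receiver, and the `pending` field name is made up for illustration. The point is only to show that leftovers are returned as `Poll::Ready` before the channel is polled again, so no extra waker handling is needed.

```rust
use std::collections::VecDeque;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Simplified, hypothetical batch: one `Vec<u8>` per segment.
#[derive(Debug, Clone, PartialEq)]
pub struct Datagrams {
    pub segments: Vec<Vec<u8>>,
}

impl Datagrams {
    /// Splits up to `n` segments off the front; `None` if the batch is empty.
    pub fn take_segments(&mut self, n: usize) -> Option<Datagrams> {
        if self.segments.is_empty() {
            return None;
        }
        let n = n.clamp(1, self.segments.len());
        let rest = self.segments.split_off(n);
        let taken = std::mem::replace(&mut self.segments, rest);
        Some(Datagrams { segments: taken })
    }
}

/// Hypothetical stand-in for the channel. A real receiver would register
/// the waker from `cx` before returning `Poll::Pending`.
pub struct FakeChannel {
    pub queue: VecDeque<Datagrams>,
}

impl FakeChannel {
    pub fn poll_recv(&mut self, _cx: &mut Context<'_>) -> Poll<Option<Datagrams>> {
        match self.queue.pop_front() {
            Some(batch) => Poll::Ready(Some(batch)),
            None => Poll::Pending,
        }
    }
}

pub struct RelayTransport {
    pub channel: FakeChannel,
    /// Leftover segments that did not fit into the previous call.
    pub pending: Option<Datagrams>,
    pub max_receive_segments: usize,
}

impl RelayTransport {
    pub fn poll_recv(&mut self, cx: &mut Context<'_>) -> Poll<Option<Datagrams>> {
        let max = self.max_receive_segments;
        loop {
            // Serve leftovers first. This returns `Poll::Ready`, so no
            // waker has to be installed for them.
            if let Some(chunk) = self.pending.as_mut().and_then(|p| p.take_segments(max)) {
                return Poll::Ready(Some(chunk));
            }
            self.pending = None;
            // Nothing left over: poll the channel as before.
            match self.channel.poll_recv(cx) {
                Poll::Ready(Some(batch)) => self.pending = Some(batch),
                Poll::Ready(None) => return Poll::Ready(None),
                Poll::Pending => return Poll::Pending,
            }
        }
    }
}

/// No-op waker, only needed to drive the sketch by hand.
pub struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWake))
}
```

In this sketch, a three-segment batch with `max_receive_segments = 2` is delivered across two consecutive `Poll::Ready` results, and only the third call falls through to the channel and returns `Poll::Pending`.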

Member Author

Yeah, I think you're right. I'll look into following up this PR with a little bit of a cleanup.

@n0bot n0bot bot added this to iroh Jul 29, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Jul 29, 2025
@matheus23 matheus23 added this pull request to the merge queue Jul 29, 2025
Merged via the queue into main with commit a8485ad Jul 29, 2025
29 checks passed
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in iroh Jul 29, 2025
@matheus23 matheus23 deleted the matheus23/fragment-datagrams branch July 29, 2025 15:22
3 participants