
Added new QUIC_STREAM_EVENT: QUIC_STREAM_EVENT_COPIED_TO_FRAME #5094


Open · Johan511 wants to merge 2 commits into main

Conversation

Johan511

Description

Right now there is no way of knowing when the stream writes data into frames. This PR adds a new event that fires when data from a SEND_REQUEST is written to a QUIC frame.

Signaling this event every time data is written to a frame would carry a severe performance penalty, since we would keep invoking the associated callback. Hence the event is signaled unconditionally only the first time data from a given SEND_REQUEST is written to a frame. In the event handler, the application must set a value (BytesCopiedBeforeNextEvent) indicating how many more bytes from this send request need to be written into frames before the next QUIC_STREAM_EVENT_COPIED_TO_FRAME is triggered. By default (if the application does not set BytesCopiedBeforeNextEvent), it is set to a very high value, effectively disabling the event.
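
To make the proposed flow concrete, here is a minimal sketch of a stream callback consuming the new event; the exact payload layout (a COPIED_TO_FRAME struct carrying BytesCopiedBeforeNextEvent) is my reading of the description above, not a confirmed API:

```c
#include <msquic.h>

// Sketch only: assumes the event carries a COPIED_TO_FRAME payload with the
// BytesCopiedBeforeNextEvent field described above; the struct layout here
// is an assumption, not the PR's confirmed definition.
QUIC_STATUS
QUIC_API
StreamCallback(
    _In_ HQUIC Stream,
    _In_opt_ void* Context,
    _Inout_ QUIC_STREAM_EVENT* Event
    )
{
    UNREFERENCED_PARAMETER(Stream);
    UNREFERENCED_PARAMETER(Context);
    if (Event->Type == QUIC_STREAM_EVENT_COPIED_TO_FRAME) {
        //
        // Fires unconditionally the first time data from this SEND_REQUEST is
        // framed. Ask to be signaled again after ~64 KB more has been framed;
        // leaving the field untouched effectively disables further events.
        //
        Event->COPIED_TO_FRAME.BytesCopiedBeforeNextEvent = 64 * 1024;
    }
    return QUIC_STATUS_SUCCESS;
}
```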

Possible issues: There is no way to set BytesCopiedBeforeNextEvent from outside the event handler, effectively forcing the application to decide, at the moment the callback runs, when it wants the next QUIC_STREAM_EVENT_COPIED_TO_FRAME.

Testing

Very basic test written; I have not executed it, hoping CI checks it for me.

Documentation

Yes, documentation updated in QUIC_STREAM_EVENT.md
Appropriate comments added in source

Johan511 requested a review from a team as a code owner on May 12, 2025 05:39
@Johan511
Author

@microsoft-github-policy-service agree

@nibanks
Member

nibanks commented May 12, 2025

My preference here would be to not expose this, and instead implement the deadline scheduling logic inside MsQuic. This exposes too much internal detail, IMO, which can cause perf issues (very frequent callbacks) as well as fragility (what if we offload this layer to HW in the future?).

@nibanks added the external (Proposed by non-MSFT), Area: API, and Area: Core (Related to the shared, core protocol logic) labels on May 12, 2025
@Johan511
Author

Johan511 commented May 12, 2025

Context: this feature request is based on a requirement for deadline-bounded delivery of objects.

and instead implement the deadline scheduling logic inside MsQuic.

I would love to see that, but don't you agree it is a very niche request that would only make sense for real-time applications that can afford to lose some objects?
Also, this would require bandwidth estimates, which I do not believe the default congestion control algorithm (CUBIC) provides; even with perfect bandwidth measurements it would still be a heuristic, and not a very popular or widely tested one.

This exposes too much internal detail

I have modelled it very closely on QUIC_STREAM_EVENT_SEND_COMPLETE and do not believe we are exposing any internals. Can you please be more specific?

which can cause perf issues (very frequent callbacks)

Yes, that was a concern I had too; that is why we only trigger the event once unconditionally. After that, it is up to the application to decide if it wants to nuke transport-layer performance; I believe we can give the application that freedom. Also, the default behaviour if the application does not handle the callback is that it never gets called again, so existing MsQuic applications will not see much performance regression. I am open to the idea of disabling the event unless a flag is set in StreamSend; that was my preferred implementation, but I wanted a draft for opinions before delving into such API design choices.

as well as fragility (what if we offload this layer to HW in the future?).

What is the fragility in question here?
What does HW mean? My first thought is hardware, but I do not see how we could reliably offload this layer to it.

@nibanks
Member

nibanks commented May 12, 2025

If we don't completely implement the deadline feature in MsQuic, I'm wondering if instead we should use some variation of QUIC_SEND_FLAG_CANCEL_ON_LOSS and/or QUIC_SEND_FLAG_CANCEL_ON_BLOCKED. For instance, we deliver a callback up to the app if the data failed to send or can't be sent right now, and provide some kind of ETA as to when it might be sent and/or reach the peer. Then the app can decide what to do, instead of signaling every single time we frame a packet.
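
For reference, a rough sketch of how the existing cancel-on-loss path already gets the app into the loop today (QUIC_SEND_FLAG_CANCEL_ON_LOSS and its event are part of MsQuic; the error code here is an arbitrary app-defined value):

```c
#include <msquic.h>

// Queue a send that MsQuic cancels (by resetting the stream) if any of its
// packets are lost, rather than retransmitting the lost data.
QUIC_STATUS
QueueCancellableSend(
    const QUIC_API_TABLE* MsQuic,
    HQUIC Stream,
    QUIC_BUFFER* Buffer
    )
{
    return MsQuic->StreamSend(
        Stream, Buffer, 1, QUIC_SEND_FLAG_CANCEL_ON_LOSS, NULL);
}

// In the stream callback, the app is notified before the reset happens and
// supplies the application error code carried on the RESET_STREAM frame.
QUIC_STATUS
QUIC_API
StreamCallback(
    _In_ HQUIC Stream,
    _In_opt_ void* Context,
    _Inout_ QUIC_STREAM_EVENT* Event
    )
{
    UNREFERENCED_PARAMETER(Stream);
    UNREFERENCED_PARAMETER(Context);
    if (Event->Type == QUIC_STREAM_EVENT_CANCEL_ON_LOSS) {
        Event->CANCEL_ON_LOSS.ErrorCode = 0x1; // app-defined "abandoned" code
    }
    return QUIC_STATUS_SUCCESS;
}
```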

That being said, I suspect many real-time applications will have a need for deadline-aware scheduling, so it might be worth adding to MsQuic.

@Johan511
Author

My only concern with a deadline-timeout solution within MsQuic is the possibility of misuse. I need it for timeouts in the range of 100 ms, but I can see people trying to use it for all ranges of timeouts (a few to several seconds). Is it possible to have a model that works well across all ranges? If we did hope to arrive at one, that sounds like a line of research of its own, to ensure the model works well in most scenarios.

@Johan511
Author

Johan511 commented May 12, 2025

To motivate this PR, let me phrase it a bit differently:

We add a feature that lets the user request an event once x bytes of a certain SEND_REQUEST have been written to a frame.

mvfst (another implementation of QUIC) does have a similar feature

mvfst provides an option to receive a signal when a particular stream offset was first transmitted (or acked). It's not exactly what you are asking for since it's retroactive -- eg the stream has already written at least that offset and possibly more.

as quoted by a maintainer of mvfst over here

I believe the API for the feature can be found over here.

@nibanks
Member

nibanks commented May 12, 2025

the possibility of misuse

What kind of misuse are you worried about, exactly? So what if they set a deadline of 10 seconds for a 1 GB upload?

```diff
@@ -1521,6 +1521,7 @@ typedef enum QUIC_STREAM_EVENT_TYPE {
     QUIC_STREAM_EVENT_IDEAL_SEND_BUFFER_SIZE = 8,
     QUIC_STREAM_EVENT_PEER_ACCEPTED = 9,
     QUIC_STREAM_EVENT_CANCEL_ON_LOSS = 10,
+    QUIC_STREAM_EVENT_COPIED_TO_FRAME = 11,
```
Member

If we were to go forward with this model, and I'm not saying we would, this new event MUST be opt-in. We cannot deliver it to old clients that have no idea about it.

Author

Do you suggest opting in at compile time or at runtime?

I was thinking of adding a new STREAM_SEND flag and delivering QUIC_STREAM_EVENT_COPIED_TO_FRAME events only if that flag is set.
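
Something along these lines, where the flag name and value are purely hypothetical:

```c
#include <msquic.h>

// Hypothetical opt-in flag -- not part of QUIC_SEND_FLAGS today. The new
// event would be delivered only for sends queued with this flag.
#define QUIC_SEND_FLAG_COPIED_TO_FRAME_EVENT 0x1000 // hypothetical value

QUIC_STATUS
QueueSendWithFrameEvents(
    const QUIC_API_TABLE* MsQuic,
    HQUIC Stream,
    QUIC_BUFFER* Buffer
    )
{
    return MsQuic->StreamSend(
        Stream, Buffer, 1,
        QUIC_SEND_FLAG_COPIED_TO_FRAME_EVENT,
        NULL); // client send context
}
```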

Contributor

I would put this under the #PREVIEW_FEATURES macro.
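
For instance, assuming the macro meant here is the existing QUIC_API_ENABLE_PREVIEW_FEATURES guard in msquic.h:

```c
typedef enum QUIC_STREAM_EVENT_TYPE {
    /* earlier values elided */
    QUIC_STREAM_EVENT_IDEAL_SEND_BUFFER_SIZE = 8,
    QUIC_STREAM_EVENT_PEER_ACCEPTED = 9,
    QUIC_STREAM_EVENT_CANCEL_ON_LOSS = 10,
#ifdef QUIC_API_ENABLE_PREVIEW_FEATURES
    QUIC_STREAM_EVENT_COPIED_TO_FRAME = 11, // only visible to opted-in builds
#endif
} QUIC_STREAM_EVENT_TYPE;
```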

@Johan511
Author

Johan511 commented May 12, 2025

What kind of misuse are you worried about, exactly? So what if they set a deadline of 10 seconds for a 1 GB upload?

For a deadline of 100 ms, it'd be as simple as checking the current bandwidth and, every time data from the stream is written into a frame, checking whether the rest can be transmitted within the deadline. If we believe it can't, we simply reset the stream, because the deadline is small enough that we do not expect network conditions to suddenly improve drastically.

Now, for a 1 GB upload over 10 seconds, we need wider context on how the network has been behaving and must predict further into the future how it will behave; if I am right, congestion control algorithms only try to predict 2-3 seconds ahead. We also need more tolerance for adverse network conditions, as over that horizon there is a better chance conditions can drastically improve.

I have a feeling a simple model would not be the best for all use cases, and it might be beneficial to let the application decide what is best.
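
For what it's worth, here is a rough sketch of that 100 ms check running inside the proposed event handler. The bandwidth estimate, remaining-byte bookkeeping, and clock helper are hypothetical application-side pieces; StreamShutdown with QUIC_STREAM_SHUTDOWN_FLAG_ABORT is the real MsQuic call for resetting a stream:

```c
#include <msquic.h>
#include <stdint.h>

// Hypothetical per-send bookkeeping kept by the application.
typedef struct SEND_STATE {
    uint64_t DeadlineUs;     // absolute deadline, in microseconds
    uint64_t RemainingBytes; // bytes of the send not yet framed
    uint64_t BandwidthBps;   // app's own bandwidth estimate, bytes/second
} SEND_STATE;

uint64_t NowUs(void); // hypothetical monotonic clock helper

// Called from the proposed QUIC_STREAM_EVENT_COPIED_TO_FRAME handler: if the
// rest of the send cannot plausibly be transmitted before the deadline,
// abort (reset) the stream instead of letting it limp past the deadline.
void
CheckDeadline(
    const QUIC_API_TABLE* MsQuic,
    HQUIC Stream,
    const SEND_STATE* State
    )
{
    if (State->BandwidthBps == 0) {
        return; // no estimate yet; nothing sensible to decide
    }
    uint64_t EtaUs = (State->RemainingBytes * 1000000) / State->BandwidthBps;
    if (NowUs() + EtaUs > State->DeadlineUs) {
        MsQuic->StreamShutdown(
            Stream,
            QUIC_STREAM_SHUTDOWN_FLAG_ABORT, // abortive reset of the stream
            0x1);                            // app-defined "deadline missed"
    }
}
```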


codecov bot commented May 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.05%. Comparing base (f7890e3) to head (6f5557b).
Report is 4 commits behind head on main.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #5094      +/-   ##
==========================================
- Coverage   87.00%   86.05%   -0.95%
==========================================
  Files          59       59
  Lines       18035    18047      +12
==========================================
- Hits        15692    15531     -161
- Misses       2343     2516     +173
```
