fix(bin/bench): on upload, do not also request large download #2546

Draft
wants to merge 2 commits into
base: main

mxinden
Collaborator

@mxinden mxinden commented Mar 31, 2025

The neqo-bin Download benchmark has the client do a single HTTP GET to the server, requesting the number of bytes to download encoded in the URL path, i.e. http://[::1]:12345/104857600.

The neqo-bin Upload benchmark has the client do a single HTTP POST to the server, sending 104857600 random bytes along with it. That said, previously it would also request the same amount of bytes to download from the server. This accidentally made it both an Upload and a Download benchmark.

This explains the Download / Upload difference seen in #2538.
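
To make the difference between the two benchmarks concrete, here is a minimal sketch of the request each one would issue after this change. `Direction`, `Request` and `build_request` are hypothetical names for illustration, not the actual `neqo-bin` API; only the URL shape and the byte count come from the description above.

```rust
/// Which direction a benchmark run exercises (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Direction {
    Download,
    Upload,
}

/// What the client sends: method, URL, and number of body bytes (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct Request {
    method: &'static str,
    url: String,
    body_len: usize,
}

/// Build the single request a benchmark run issues.
fn build_request(direction: Direction, base: &str, size: usize) -> Request {
    match direction {
        // Download: the byte count travels in the URL path; there is no body.
        Direction::Download => Request {
            method: "GET",
            url: format!("{base}/{size}"),
            body_len: 0,
        },
        // Upload: the byte count is the POST body length; the path no longer
        // carries a download size.
        Direction::Upload => Request {
            method: "POST",
            url: format!("{base}/"),
            body_len: size,
        },
    }
}
```

With `base = "http://[::1]:12345"` and `size = 104857600`, the download case yields the URL quoted above, while the upload case keeps the byte count out of the path.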

Although the extra download request was a bug, removing it does not fix #2538, because the HTTP/3 server simply ignores the `/104857600` URL path when it receives a POST:

    if headers.contains_header(":method", "POST") {
        self.posts.insert(stream, 0);
        continue;
    }
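
The dispatch above could be pictured roughly as follows. This is a simplified sketch, not the server's actual code: `Server`, `pending_responses` and `on_headers` are assumed names, and only the POST branch mirrors the quoted snippet.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the server's per-stream bookkeeping.
#[derive(Default)]
struct Server {
    /// Bytes received so far on each POST stream.
    posts: HashMap<u64, usize>,
    /// Bytes still to be sent on each non-POST stream (assumed field).
    pending_responses: HashMap<u64, usize>,
}

impl Server {
    /// Handle the request headers that arrived on `stream`.
    fn on_headers(&mut self, stream: u64, method: &str, path: &str) {
        if method == "POST" {
            // A POST is only tracked as an upload; the path is never
            // inspected, so a size encoded there has no effect.
            self.posts.insert(stream, 0);
            return;
        }
        // For other methods, interpret the path as the number of bytes
        // to send back, e.g. "/104857600".
        if let Ok(size) = path.trim_start_matches('/').parse::<usize>() {
            self.pending_responses.insert(stream, size);
        }
    }
}
```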

Anyway, this is still worth merging.


My mistake. Sorry for the trouble!

The `neqo-bin` Download benchmark has the client do a single HTTP GET to
the server, requesting the number of bytes to download encoded in the
URL path, i.e. `http://[::1]:12345/104857600`.

The `neqo-bin` Upload benchmark has the client do a single HTTP POST to
the server, sending `104857600` random bytes along with it. That said,
previously it would also request the same amount of bytes to download
from the server.

This explains the Download / Upload difference seen in
mozilla#2538.

Fixed in this commit.
@larseggert
Collaborator

Oh, good catch.

@larseggert
Collaborator

Can we keep a bidirectional transfer as part of the benches?

@mxinden
Collaborator Author

mxinden commented Mar 31, 2025

> Can we keep a bidirectional transfer as part of the benches?

For sure. I will do a follow-up pull request. Sounds good?

github-actions bot commented Mar 31, 2025

Failed Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

github-actions bot commented Mar 31, 2025

Benchmark results

Performance differences relative to e72ae0c.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold.
       time:   [717.32 ms 722.63 ms 727.96 ms]
       thrpt:  [137.37 MiB/s 138.38 MiB/s 139.41 MiB/s]
change:
       time:   [+0.5586% +1.6106% +2.6568%] (p = 0.00 < 0.05)
       thrpt:  [-2.5880% -1.5851% -0.5555%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [348.55 ms 350.05 ms 351.54 ms]
       thrpt:  [28.447 Kelem/s 28.567 Kelem/s 28.690 Kelem/s]
change:
       time:   [-0.8823% -0.2353% +0.4030%] (p = 0.49 > 0.05)
       thrpt:  [-0.4014% +0.2359% +0.8902%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: 💚 Performance has improved.
       time:   [24.890 ms 25.033 ms 25.181 ms]
       thrpt:  [39.712  elem/s 39.947  elem/s 40.176  elem/s]
change:
       time:   [-4.1887% -3.3991% -2.5881%] (p = 0.00 < 0.05)
       thrpt:  [+2.6568% +3.5187% +4.3718%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💚 Performance has improved.
       time:   [1.9373 s 1.9589 s 1.9805 s]
       thrpt:  [50.493 MiB/s 51.050 MiB/s 51.617 MiB/s]
change:
       time:   [-17.554% -16.358% -15.154%] (p = 0.00 < 0.05)
       thrpt:  [+17.860% +19.557% +21.292%]
decode 4096 bytes, mask ff: No change in performance detected.
       time:   [12.003 µs 12.044 µs 12.091 µs]
       change: [+0.0555% +0.7942% +1.9294%] (p = 0.08 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
1 (1.00%) low severe
3 (3.00%) low mild
3 (3.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [2.9475 ms 2.9551 ms 2.9645 ms]
       change: [-0.5920% -0.1424% +0.3118%] (p = 0.54 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
6 (6.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [20.018 µs 20.073 µs 20.131 µs]
       change: [-0.0088% +0.6273% +1.5471%] (p = 0.10 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
1 (1.00%) low severe
1 (1.00%) low mild
15 (15.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [4.7949 ms 4.8065 ms 4.8198 ms]
       change: [-0.3169% +0.0455% +0.4015%] (p = 0.80 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
14 (14.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [6.3248 µs 6.3582 µs 6.3977 µs]
       change: [-0.0435% +0.3699% +0.8187%] (p = 0.10 > 0.05)

Found 19 outliers among 100 measurements (19.00%)
3 (3.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [2.1487 ms 2.1544 ms 2.1615 ms]
       change: [-0.6239% -0.1589% +0.3046%] (p = 0.52 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe

1 streams of 1 bytes/multistream: No change in performance detected.
       time:   [70.260 µs 71.309 µs 72.808 µs]
       change: [-2.1571% +0.0676% +2.3125%] (p = 0.95 > 0.05)

Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe

1000 streams of 1 bytes/multistream: No change in performance detected.
       time:   [25.333 ms 25.371 ms 25.409 ms]
       change: [-0.2905% -0.0666% +0.1483%] (p = 0.55 > 0.05)
10000 streams of 1 bytes/multistream: No change in performance detected.
       time:   [1.7053 s 1.7071 s 1.7091 s]
       change: [-0.1856% -0.0296% +0.1280%] (p = 0.71 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
6 (6.00%) low mild
11 (11.00%) high mild

1 streams of 1000 bytes/multistream: No change in performance detected.
       time:   [71.748 µs 72.396 µs 73.503 µs]
       change: [-0.9538% +0.0118% +1.5579%] (p = 0.98 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

100 streams of 1000 bytes/multistream: No change in performance detected.
       time:   [3.3864 ms 3.3932 ms 3.4003 ms]
       change: [-0.2811% +0.0112% +0.3095%] (p = 0.93 > 0.05)

Found 22 outliers among 100 measurements (22.00%)
22 (22.00%) high severe

1000 streams of 1000 bytes/multistream: Change within noise threshold.
       time:   [145.53 ms 145.61 ms 145.70 ms]
       change: [-0.1769% -0.0965% -0.0212%] (p = 0.02 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [94.897 ns 95.250 ns 95.600 ns]
       change: [-1.9455% -0.5878% +0.2478%] (p = 0.39 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) high mild
6 (6.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [112.94 ns 113.22 ns 113.52 ns]
       change: [-0.2997% -0.0435% +0.2345%] (p = 0.75 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) low severe
2 (2.00%) high mild
7 (7.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [112.52 ns 112.93 ns 113.43 ns]
       change: [-1.1281% -0.4127% +0.2555%] (p = 0.25 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
4 (4.00%) low severe
2 (2.00%) low mild
2 (2.00%) high mild
9 (9.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [94.266 ns 94.739 ns 95.261 ns]
       change: [+0.0285% +2.9431% +7.9162%] (p = 0.16 > 0.05)

Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) high mild
6 (6.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [116.53 ms 116.58 ms 116.63 ms]
       change: [-0.3771% -0.3150% -0.2495%] (p = 0.00 < 0.05)

Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low severe
4 (4.00%) low mild
2 (2.00%) high mild

SentPackets::take_ranges: No change in performance detected.
       time:   [8.1537 µs 8.4174 µs 8.6544 µs]
       change: [-5.3000% -1.4801% +2.4891%] (p = 0.43 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
8 (8.00%) low severe
7 (7.00%) low mild

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [36.067 ms 36.131 ms 36.194 ms]
       change: [+2.0038% +2.2525% +2.4929%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [35.914 ms 35.989 ms 36.064 ms]
       change: [+0.5757% +0.8401% +1.1055%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [35.423 ms 35.484 ms 35.543 ms]
       change: [+0.2720% +0.5055% +0.7278%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [35.889 ms 35.937 ms 35.987 ms]
       change: [+0.3653% +0.5549% +0.7298%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
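
As a sanity check on the client benchmark numbers above, the sketch below shows how the reported throughput and throughput change can be derived from the reported times. It assumes the `100mb` transfers are 104857600 bytes (100 MiB), as in the PR description; the function names are illustrative only.

```rust
/// Throughput in MiB/s for `bytes` transferred in `seconds`.
fn throughput_mib_per_s(bytes: u64, seconds: f64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0) / seconds
}

/// Relative throughput change implied by a relative time change.
/// Both values are fractions, e.g. -0.16358 for -16.358%.
/// Throughput is inversely proportional to time for a fixed transfer size.
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}
```

For the Upload benchmark, 104857600 bytes in 1.9589 s gives about 51.05 MiB/s, and a time change of -16.358% corresponds to a throughput change of about +19.557%, matching the report.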

Client/server transfer results

Performance differences relative to e72ae0c.

Transfer of 33554432 bytes over loopback, 30 runs. All unit-less numbers are in milliseconds.

| Client | Server | CC | Pacing | Mean ± σ | Min | Max | MiB/s ± σ | Δ main (ms) | Δ main (%) |
|---|---|---|---|---|---|---|---|---|---|
| neqo | neqo | reno | on | 418.1 ± 36.3 | 387.9 | 587.0 | 76.5 ± 0.9 | -1.6 | -0.4% |
| neqo | neqo | reno | | 451.5 ± 180.7 | 383.2 | 1383.4 | 70.9 ± 0.2 | 19.8 | 4.6% |
| neqo | neqo | cubic | on | 412.8 ± 39.4 | 349.6 | 565.2 | 77.5 ± 0.8 | 8.1 | 2.0% |
| neqo | neqo | cubic | | 405.4 ± 12.6 | 385.1 | 434.4 | 78.9 ± 2.5 | -2.5 | -0.6% |
| google | neqo | reno | on | 767.1 ± 89.8 | 561.6 | 959.0 | 41.7 ± 0.4 | 7.7 | 1.0% |
| google | neqo | reno | | 758.9 ± 91.2 | 566.9 | 949.1 | 42.2 ± 0.4 | -1.9 | -0.3% |
| google | neqo | cubic | on | 759.7 ± 88.9 | 551.9 | 960.2 | 42.1 ± 0.4 | -8.8 | -1.1% |
| google | neqo | cubic | | 753.7 ± 81.2 | 569.0 | 875.6 | 42.5 ± 0.4 | -5.2 | -0.7% |
| google | google | | | 577.7 ± 19.0 | 559.4 | 652.6 | 55.4 ± 1.7 | -2.0 | -0.3% |
| neqo | msquic | reno | on | 263.6 ± 15.6 | 241.0 | 313.0 | 121.4 ± 2.1 | -8.6 | -3.1% |
| neqo | msquic | reno | | 269.6 ± 47.3 | 242.0 | 508.1 | 118.7 ± 0.7 | -7.9 | -2.8% |
| neqo | msquic | cubic | on | 261.2 ± 12.2 | 242.7 | 290.6 | 122.5 ± 2.6 | -2.4 | -0.9% |
| neqo | msquic | cubic | | 269.2 ± 34.0 | 245.2 | 436.7 | 118.9 ± 0.9 | 1.1 | 0.4% |
| msquic | msquic | | | 171.7 ± 20.1 | 147.9 | 223.1 | 186.4 ± 1.6 | -6.4 | -3.6% |


@mxinden
Collaborator Author

mxinden commented Mar 31, 2025

I was wrong. Updated the PR description. Still worth merging.

Although the extra download request was a bug, removing it does not fix #2538, because the HTTP/3 server simply ignores the `/104857600` URL path when it receives a POST:

    if headers.contains_header(":method", "POST") {
        self.posts.insert(stream, 0);
        continue;
    }

@martinthomson
Member

Can we add separate options for upload and download size, so that the request URL contains the download size and the client just uploads/POSTs the upload size?
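
One way to picture this suggestion is a pair of independent size options. The sketch below is hypothetical: `TransferSizes`, `upload_size` and `download_size` are assumed names, not existing `neqo-bin` flags. Note that, per the server snippet quoted earlier in the thread, the server currently ignores the path on a POST, so a bidirectional run would presumably need a server-side change as well.

```rust
/// Hypothetical benchmark options with an independent size per direction.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct TransferSizes {
    /// Number of bytes the client sends in the request body.
    upload_size: usize,
    /// Number of bytes the client asks the server to send back.
    download_size: usize,
}

impl TransferSizes {
    /// The request URL always encodes the download size in the path.
    fn url(&self, base: &str) -> String {
        format!("{base}/{}", self.download_size)
    }

    /// POST whenever there is something to upload, GET otherwise.
    fn method(&self) -> &'static str {
        if self.upload_size > 0 {
            "POST"
        } else {
            "GET"
        }
    }

    /// The request body length is exactly the upload size.
    fn body_len(&self) -> usize {
        self.upload_size
    }
}
```

Under this shape, a pure download sets `upload_size` to 0, a pure upload sets `download_size` to 0, and a bidirectional run sets both.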

Successfully merging this pull request may close these issues.

Why is there a large throughtput difference between upload and download tests?