Skip to content

fix client side binds in TPU #7455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 12, 2025

Conversation

alexpyattaev
Copy link

@alexpyattaev alexpyattaev commented Aug 11, 2025

Problem

CI is flaking due to "bug" in port assignment in fn new_with_external_ip

This should go in together with #7456 else we can randomly run out of ports in some CI runs.

Summary of Changes

  • Bind TPU client ports inside given port-range too
  • bump localhost-port-range-for-tests allocation to 25 ports from 20

@alexpyattaev alexpyattaev marked this pull request as ready for review August 11, 2025 19:23
@codecov-commenter
Copy link

codecov-commenter commented Aug 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.2%. Comparing base (a09996b) to head (fad9733).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master    #7455     +/-   ##
=========================================
- Coverage    83.3%    83.2%   -0.1%     
=========================================
  Files         800      800             
  Lines      362783   362784      +1     
=========================================
- Hits       302227   302153     -74     
- Misses      60556    60631     +75     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@gregcusack gregcusack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple thoughts

@@ -203,12 +202,15 @@ impl Node {
bind_in_range_with_config(bind_ip_addr, port_range, socket_config)
.expect("Alpenglow port bind should succeed");
// These are client sockets, so the port is set to be 0 because it must be ephimeral.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update this comment to match the new behavior please

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

CHANGELOG.md Outdated
@@ -40,6 +40,7 @@ Release channels have their own copy of this changelog:
* `--skip-poh-verify`
* Deprecated snapshot archive formats have been removed and are no longer loadable.
* Using `--snapshot-interval-slots 0` to disable generating snapshots has been removed. Use `--no-snapshots` instead.
* Validator will now bind all ports within provided `--dynamic-port-range`, a minimum of 25 ports are required.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is 25 enforced in the actual code for --dynamic-port-range?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, but it will actually allocate up to 23 ports, 25 is a safe number to avoid confusion =)

@gregcusack gregcusack self-requested a review August 11, 2025 21:48
Copy link

@gregcusack gregcusack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@alexpyattaev alexpyattaev merged commit 13f5a3f into anza-xyz:master Aug 12, 2025
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants