-
Notifications
You must be signed in to change notification settings - Fork 659
Gossip: Add dynamic stake weighting based on fraction of unstaked nodes #7070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gossip: Add dynamic stake weighting based on fraction of unstaked nodes #7070
Conversation
3a3b813
to
d67d4d8
Compare
d67d4d8
to
509b595
Compare
// (small d(alpha_target)/dt). If future code changes make `alpha_target` jump | ||
// larger, we must retune `TC`/`K` or use a higher‑order filter to avoid | ||
// lag/overshoot. | ||
let alpha = K * alpha_target + (1.0 - K) * self.alpha; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we extract the IIR filter into its own module and stick it in net-utils or something like it? Would make it far easier to test and also to guard against possible invalid inputs and such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya i think that's a good idea. locally, i have it pulled out into a separate file within the gossip module. not sure it fits in net-utils
since its not really a network-related method. maybe we leave in gossip module for now. but if anyone wants to reuse the filter in the future, we can pull it into it's own standalone module?
// 30_000ms convergence time | ||
const TC: f64 = 30_000.0; | ||
// Low pass filter smoothing constant | ||
// K ~ 0.611 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does rust-analyzer not compute consts? Also would be nice to get an idea of how long this filter takes to reach 0.5 in response to a step input from 0 to 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh it does on my end, i can remove the comment: // K ~ 0.611
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call. will add a comment in the code. But this is what it is:
/// - From a step change in target from 0 to 1, `alpha` reaches:
/// - ~61% of the way to target after 1 update
/// - ~85% after 2
/// - ~94% after 3
/// - ~98% after 4
/// - ~99% after 5
Each update occurs every 7.5s. So we reach 94% of the way to the target in 22.5 seconds
if (1.0..=2.0).contains(&alpha) { | ||
self.alpha = alpha; | ||
} else { | ||
// We should never see this, but if we do, log and reset to safe default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not be better to have an explicit is_finite check followed by a clamp to the target range? Then we would not need to have an error here.
Overall the PR looks good, thank you! |
@alexpyattaev Does it make sense to keep floating point math or switch over to fixed point? I have a fixed point implementation. Sounds like trent would prefer fixed point to avoid possible bugs even though we can try and make them as unlikely as possible. |
i suggest we switch over to this pr with fixed point math: #7098 |
In this case it is not critically important, but floats require some extra caution around possible NANs. So if we can avoid them let us do so. |
moving to: #7098 |
Problem
This PR addresses this Issue: #6903
Unstaked and low staked nodes in gossip do not receive many messages on testnet due to the testnet stake distribution and the aggressive quadratic weighting of peers by stake. As a result, unstaked nodes have a poor and inconsistent view of the peers in the network.
Linear weighting is not aggressive enough and causes a disproportionate number of messages to be sent to unstaked nodes on mainnet.
We need weighting that falls in between linear and quadratic.
Summary of Changes
Add a dynamic weighting element
alpha
.alpha
fluctuates between 1 and 2 at runtime and is linearly proportional to the number of unstaked nodes in the cluster.For example, on mainnet, there are 82% unstaked nodes, so
alpha
will converge on 1.82. On testnet, there are 4% unstaked nodes, soalpha
will converge on 1.04.We use a first-order, low-pass filter to update
alpha
to avoid constantalpha
thrashing and overshooting the new target. Unit tests show that alpha converges on the targetalpha
within approximately 4-5 push active set rotations (30-38s).Note: floating point operations are expensive but
rotate()
is called once every 7.5s, so should not add any significant overhead.Testing has shown that flattening out the weight curve when their aren't many unstaked nodes (as on testnet), increases an unstaked nodes ingress gossip traffic as desired. On the other hand, high staked nodes don't see a significant reduction in received traffic.
All calculations and data that backup and support this PR here: