SIMD-0326: Alpenglow #326
Conversation
and *safe-to-skip*, explained in the white paper) that cause the validators to | ||
vote *notarize-fallback* or *skip-fallback*. | ||
|
||
Votes are distributed by broadcasting them directly to all other validators. |
nit: should we say all other staked validators?
In Alpenglow terminology every validator is staked.
In this proposal we make sure that nodes that do not participate in the protocol | ||
will not be rewarded. Towards this end, all nodes prove that they are voting | ||
actively. In slot *s*+8 (and only in that slot), the corresponding leader can | ||
post up to two vote aggregates (a notarization aggregate and/or skip aggregate) |
dumb question: would the notarization aggregate include all notarization-fallback votes with the same block-id?
I assume a notarization with the wrong block-id will be ignored?
Do we reward a notarization with the wrong block-id? What if someone cast skip and notar-fallback, but the skip vote got lost and, unfortunately, the notar-fallback is for the wrong block-id?
Only skip and notarization, no fallbacks. Everybody just gets one point at most.
These funny scenarios could arise on successful block equivocation by the leader. In the future, if we want to be nerdy about this possibility, we could do something to count multiple block-ids (or just issue the one-vote reward to everyone while we slash the equivocating leader lol).
per slot. The submitter (leader) gets the same amount of SOL as each of the | ||
voters included in the aggregate. Nodes receiving 0 SOL at the end of the epoch | ||
are removed from the active set of nodes. This scheme will practically eliminate | ||
today’s voting transaction overhead while still rewarding voting. |
No reward for someone who voted both Notarization and Skip right?
If we had slashing, it would be slashable. That's even stronger than just not getting rewards. But since we don't have slashing, we might consider punishing it by not giving any rewards.
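To make the scheme concrete, here is a minimal sketch of the per-slot reward accounting described in the quoted text and refined in this thread: at most one credit per included voter, nothing for a validator that appears in both aggregates, the submitter earning the same as a single voter, and zero-credit nodes dropping out of the active set at epoch end. All type and function names are hypothetical, not part of the SIMD:

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a validator's vote pubkey.
type ValidatorId = u64;

/// The up-to-two aggregates a leader may post in slot s+8 for slot s.
struct VoteAggregates {
    submitter: ValidatorId,
    notarize: HashSet<ValidatorId>, // voters included in the notarization aggregate
    skip: HashSet<ValidatorId>,     // voters included in the skip aggregate
}

/// Credit each included voter at most once; a validator that shows up in both
/// aggregates (voted notarize *and* skip) gets nothing, per the discussion above.
/// The submitting leader earns the same single credit as each rewarded voter.
fn apply_rewards(credits: &mut HashMap<ValidatorId, u64>, agg: &VoteAggregates) {
    for v in agg.notarize.union(&agg.skip) {
        if !(agg.notarize.contains(v) && agg.skip.contains(v)) {
            *credits.entry(*v).or_insert(0) += 1;
        }
    }
    *credits.entry(agg.submitter).or_insert(0) += 1;
}

/// At epoch end, nodes that earned nothing are removed from the active set.
fn active_set(credits: &HashMap<ValidatorId, u64>, all: &[ValidatorId]) -> Vec<ValidatorId> {
    all.iter()
        .copied()
        .filter(|v| credits.get(v).copied().unwrap_or(0) > 0)
        .collect()
}
```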
## Impact | ||
|
||
The most visible change will be that optimistic confirmation is superseded by | ||
faster (actual) finality. |
So the proposal is to make `Confirmed` the same as `Finalized` in Alpenglow?
Optimistic Confirmation is a concept from TowerBFT, and Finality is a concept from Alpenglow. In this sentence we want to argue that the second is strictly better than the first (faster and actually final). Maybe we need to reformulate to make this clear?
Okay that's fine. I just wasn't sure whether we are announcing an API change here.
Should we write TowerBFT's Optimistic Confirmation and Alpenglow's (actual) finality to make it 100% clear, or do you think it's okay like this?
I think it's okay like this.
## Drawbacks | ||
|
||
The main drawback is the risk related to implementing a big protocol change. | ||
Migrating to Alpenglow will be challenging. |
nit: should we mention that the migration will be designed and proposed in a following SIMD?
I'm not an expert, but I don't think migration should have a SIMD. At least I would not announce it here.
Wouldn't the migration also require a specific implementation for any client that would want to be part of the network during the switchover? Seems like it would need a SIMD then to me?
We would like to get a general consensus about Alpenglow, so that the engineers working on it know that it will come. The switching mechanism is not really a votable issue (as long as it's done in the best possible way). But of course we have to find an agreement between all the clients which will be involved in the switch. That could be covered by a separate (technical) SIMD if necessary.
Anything that is a consensus-critical change should have a SIMD, even if it does not go through governance. The purpose of SIMDs is to communicate and agree upon breaking (consensus-critical) changes that are coming to different validator teams.
Rotor is not actually defined anywhere, btw. |
thanks to the anza research team for doing the research to create and propose this consensus protocol. approving on behalf of firedancer -- excited for the improvement this will bring to the network
Rotor is not part of this SIMD. |
The Alpenglow whitepaper states:
Why would we need transactions to record this information? The stake value is already stored on-chain in stake accounts. The public keys are stored on-chain (validator identity key, vote account key). The IP address and port number are ephemeral values that operators might change at any time. Why would we want or need transactions to "record" these things? And I also assume that these transactions update accounts state, so where are the additional values going to be stored? |
The Alpenglow whitepaper states (in section 1.5 under Broadcast):
Since all messages fit into one UDP packet, all messages take the same amount of time to transmit. What is meant here by voting messages "need even less time" due to being "shorter"? |
The Alpenglow whitepaper states (in section 1.5 under Time):
Given that clock drift is very likely to exist for all real-world machines, I think it would really be worthwhile to incorporate some notion of expected drift (likely minimum, likely maximum) into the timeout periods used to derive the expected block completion times as proposed by this whitepaper. Time synchronization is done totally differently under Alpenglow than under Proof of History, and extra care and "proving out" of the likely expected values is warranted given that we'd be switching from a known and proven method to a theoretical method. EDIT: I see that this is more directly addressed in the Timeout section. I would like to see some evidence of typical clock drift for known data-center-level hardware and for that value to be incorporated into any Timeouts used in this paper. Thank you. |
With regards to the 20% threshold for liveness vs. Solana's current 33%: With the current stake distribution, this takes the "halt line" from 22 nodes down to 9 nodes. And since two of those nodes are operated by the same entity (Figment and Ledger by Figment) this is actually 8 operating entities. Does this seem troubling? I may be misunderstanding the "halt line" analogy when considering Alpenglow. Is it the case that this 20% is actually just the fraction of stake that could prevent "fast finalization" (80%) but that the cluster would just fall back to the slower finalization (two 60% rounds), meaning that to truly halt Alpenglow would require 40% of stake to refuse/fail to vote? But that 20% could "slow down" consensus by preventing fast single-vote consensus? |
The Alpenglow whitepaper states (in section 2.6):
What happens if a node or subset of nodes observe enough notarization votes, but others do not? Some nodes will observe the block as finalized, and others will not. This is a transient situation that could result in the subset of nodes that believed the block was finalized finding out later that it was in fact not finalized, because some malicious nodes produced (presumably slashable) double-voting that sent Notarization Votes to some nodes and Skip Votes to other nodes. While it is true that slashing would deter such malicious behavior, the fact that "finalization" is so easily foiled seems problematic. In the existing Solana consensus model, "optimistically confirmed" blocks are subject to the same weak finalization criteria that Alpenglow "finalized" blocks are. And presumably, each block subsequently chained decreases the likelihood of "rollback" in Alpenglow just as it did for classic Solana. The only significant difference I can see is that classic Solana defines a slashing schedule that makes it exponentially more costly for malicious nodes to cause a rollback of older and older confirmed blocks, but Alpenglow doesn't define its slashing mechanism, so it effectively has a weaker guarantee on finalization than classic Solana. I probably am misunderstanding, but why isn't Alpenglow "finalized" the same as classic Solana "optimistic confirmation", with the same chances of being rolled back? |
Thank you for the reply. Is the analysis the same for the situation in which: Group A 19.99% malicious sends Notarize to group B and Skip to group C Repeat this for two rounds and then group B sees the slot as finalized and group C sees the slot as skipped. EDIT: Oops I just saw the flaw. Group C will only see 59.99% of votes for Skip, so will not conclude skipped. |
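To spell out the arithmetic behind this scenario and the 80%/60% thresholds mentioned above, here is a tiny sketch (illustrative only; the constant and function names are not the SIMD's API, and the even split of honest stake is an assumption):

```rust
/// Stake fractions expressed in basis points of total stake (10_000 = 100%).
const FAST_FINALIZE_BPS: u64 = 8_000; // >= 80% notarize stake: fast-finalized in one round
const NOTARIZE_BPS: u64 = 6_000;      // >= 60% notarize stake: notarized (slow path)
const SKIP_BPS: u64 = 6_000;          // >= 60% skip stake: slot skipped

fn fast_finalized(notarize_bps: u64) -> bool { notarize_bps >= FAST_FINALIZE_BPS }
fn notarized(notarize_bps: u64) -> bool { notarize_bps >= NOTARIZE_BPS }
fn skipped(skip_bps: u64) -> bool { skip_bps >= SKIP_BPS }

fn main() {
    // Scenario from the comment above, assuming honest stake splits evenly:
    // ~19.99% equivocating stake sends Notarize to group B and Skip to group C,
    // while B and C each hold ~40% of total stake.
    let notarize_seen_by_b = 1_999 + 4_000; // 59.99%
    let skip_seen_by_c = 1_999 + 4_000;     // 59.99%

    assert!(!fast_finalized(notarize_seen_by_b));
    assert!(!notarized(notarize_seen_by_b)); // B cannot even notarize under this split
    assert!(!skipped(skip_seen_by_c));       // C falls just short of the skip certificate
}
```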
Doesn't allowing multiple ParentReady(s,...) to proceed, as illustrated in Figure 8, allow a leader to extend its slot time? The ParentReady "function" calls SetTimeouts(), which uses the current clock to set a new set of timeouts for the slot. Doesn't this then allow a leader to increase its total "window time"? In Figure 8 this is illustrated: when the second ParentReady occurs, the timeouts reset, so the leader can effectively get a much longer slot time for its remaining blocks? I see that a further rule is "In this case, slices 1, ..., t − 1 are ignored for the purpose of execution." So the leader gives up the first t slices it emitted; but since the timeouts have been extended, it can emit the same number of slices again, possibly in an advantageous way given that the expanded time duration of its slot has allowed it to survey many more transactions and build more profitable blocks. |
For efficiency, shouldn't there be a "getShreds" instead of "getShred", which must specify a large number of parameters that are duplicated if a contiguous range of shreds is desired? |
In case of multiple parent readies, the timeouts do not reset. The timeouts are set when a node observes the first parent ready for a slot. For the fast leader handover, the leader optimistically starts producing entries before the ParentReady is observed, with the hope that the previous leader's block receives a notarization-fallback certificate. If in fact the previous leader's block ends up getting skipped the current leader will observe a ParentReady with a different parent. At this point, it gives up on the produced entries and sends a special marker that indicates that the transmitted entries should be ignored, and that the remainder of the block is to be replayed on the new parent instead. |
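A rough sketch of the handover logic described in this reply, purely for illustration (the type names, the marker encoding, and `on_parent_ready` are hypothetical and not the actual block format defined by the SIMD or whitepaper):

```rust
type Slot = u64;
type BlockId = [u8; 32];

/// What a leader streams in this sketch: ordinary entry batches, or a marker
/// telling replicas to discard everything streamed so far for this slot and to
/// replay the rest of the block on a different parent.
enum SliceContent {
    /// Entries produced optimistically, possibly before ParentReady was observed.
    Entries(Vec<u8>),
    /// "Ignore slices 0..=last_ignored; the block continues on `new_parent`."
    SwitchParent { last_ignored: u32, new_parent: (Slot, BlockId) },
}

/// Leader-side decision: if the ParentReady we eventually observe names a
/// different parent than the one we optimistically built on, emit the marker
/// instead of further entries; otherwise keep streaming as usual.
fn on_parent_ready(
    optimistic_parent: (Slot, BlockId),
    observed_parent: (Slot, BlockId),
    slices_sent: u32,
) -> Option<SliceContent> {
    if observed_parent != optimistic_parent {
        Some(SliceContent::SwitchParent {
            last_ignored: slices_sent.saturating_sub(1),
            new_parent: observed_parent,
        })
    } else {
        None
    }
}
```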
we like the idea of sending one request for one shred. this way it's basically stateless. sending the shred is the costly part anyway, so sending more requests is not that harmful. this way we can also load-balance work very effectively. with Rotor we will have very little repair no matter what. |
Not sure who claims the second part of this comparison, but I do have some doubts. In any case, we believe that 20% byzantine attack resilience is plenty (still needs tens of billions potentially slashable USD to execute). We also believe that crash failures are "more important" (in the sense that they happen very often, while actual byzantine attacks are very rare). So Alpenglow's 20+20 security is practically more secure than conventional 33% protocols. |
True, clocks are not perfectly accurate. However, system clocks drift by roughly 1 second per day only. Per leader window (and that's the maximum timeouts we have) this drift is only 0.2 ms. If you want to wait 399.8 or 400.2 ms instead of 400 ms before you skip, that's perfectly fine. In fact, Alpenglow could tolerate much bigger deviations by all nodes. Note that clocks are naturally synced with every new timeout, so drifts don't accumulate at all. |
If you send 1000 messages with 1500 bytes each it will take more time than sending 1000 messages with 200 bytes each. |
Maybe the term "transaction" is a bit misleading here? What we meant to say is exactly this: You can update your information by updating (for instance) your accounts state. It's just important that all your information is present. So "transaction" does not need to be a financial transaction. |
(All comments so far should be answered?) |
Overall a great improvement for the network. Will bolster and formalize the security of Solana's consensus protocol while greatly decreasing finalization times.
Allows us to clean up a lot of tech debt, and makes it easier to reason about further consensus improvements like Asynchronous Execution and MCP.
But it is an on-chain transaction right? Presumably one that updates some kind of accounts state with the details that you have listed? Is that what the docs mean when they say "transaction"? |
There are still comments on the changed file that haven't been addressed. |
Yes. But we're open to an alternative term. |
Indeed, there was one more comment. Thanks for noticing. |
@bji some more clarifications:
|
Excited for this change!! 🚀
**Vote** is an existing term already, but votes are different in Alpenglow. In | ||
Alpenglow, votes are not transactions on chain anymore, but just sent directly | ||
between validators. Also, votes do not include lockouts. |
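For orientation, here is the whitepaper's vote vocabulary modeled very loosely in Rust (the names and field layout are illustrative, not the actual message format): notarize and skip votes plus their fallback variants, and a finalize vote, all broadcast directly rather than submitted as transactions, and none carrying lockouts.

```rust
type Slot = u64;
type BlockId = [u8; 32];

/// Loose model of the Alpenglow vote kinds; not a wire format.
enum Vote {
    Notarize { slot: Slot, block_id: BlockId },
    NotarizeFallback { slot: Slot, block_id: BlockId },
    Skip { slot: Slot },
    SkipFallback { slot: Slot },
    Finalize { slot: Slot },
}
```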
Will there still be a vote processor program? If so, will this be a bpf program? Would we migrate over from the current vote processing program to a new alpenglow bpf vote program? It might be worth specifying that in this SIMD.
Engineers should have the final word here, but I would assume the accounting for vote aggregations (and first certificates) can be done in the same way as the accounting for votes now (with a different program, though).
For Alpenglow v1 we do not plan to create a new vote program, because the votes are not transactions any more, so there is not much for a vote program to do. We will add bls_pubkey to the current vote program for verification and pubkey management. It has been added to Vote Account v4 (#185).
For the rewards scheme described in this SIMD, I suppose we need to add the two BLS certs to the block footer (#307), and we need to add the parent bankhash to the block footer as well, as agreed in NYC. The rewards calculation will probably be similar to now, done at the beginning of the epoch.
Does that answer your question?
Thanks for clarifying!! A few more questions: will we need to keep some data on-chain for reward computation/distribution? How will that data be updated?
That's a very good question; I think what needs to be done for the block footer is still up in the air. But I would imagine the block footer will be kept in Bigtable, as it contains important information. It's not "on-chain" the way a normal transaction is, but at least the raw data is kept around.
I was asking more about rewards - I'm assuming there will be some data that is needed to calculate how much each voter/staker account gets paid, based on their performance during the epoch. Will that be stored on-chain? How will it get updated? Sorry for the noob question :)
That's another good question; the conclusion from the last meeting was:
- BLS certs go in the block footer
- but credits are updated in the vote accounts
- non-leaders check the BLS certs and derive the credit updates during replay (see the sketch below)
- the rest of the rewards calculation stays the same as now
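Roughly what that replay-time step could look like, as a sketch only (the footer layout and names below are placeholders; the real accounting is still being designed, per this thread):

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a vote account's key.
type VoteAccount = u64;

/// Placeholder for the two BLS certificates carried in the block footer (#307).
struct FooterCerts {
    notarization_voters: Vec<VoteAccount>,
    skip_voters: Vec<VoteAccount>,
}

/// Non-leader path during replay: after verifying the certificates (elided),
/// translate the included voters into credit bumps on their vote accounts;
/// the rest of the rewards calculation stays as it is today. (The double-vote
/// exclusion discussed earlier in this PR is omitted here for brevity.)
fn apply_footer_credits(credits: &mut HashMap<VoteAccount, u64>, certs: &FooterCerts) {
    for v in certs.notarization_voters.iter().chain(certs.skip_voters.iter()) {
        *credits.entry(*v).or_insert(0) += 1;
    }
}
```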
but credits are updated in vote accounts
this is the part I'm not clear about - how exactly will this happen? will there need to be some kind of bpf program updating the vote account state? if this is documented anywhere feel free to point me to it 😄
Sure, I shared the meeting notes. The current plan is to "do it outside the VM before all transactions", so we have to implement it in validator code.
We can discuss the pros and cons of this though; this is just the current proposal.
proposals/0326-alpenglow.md
Outdated
### Rewards | ||
|
||
In this SIMD we focus on the consensus-related benefits of Alpenglow. Below, we | ||
translate the existing vote rewards as they are, while removing some harmful |
Implementation-wise, are the rewards still computed and distributed using the same mechanisms (via state stored in the on-chain stake and vote accounts)? Might be good to specify that here.
Yes, also here, same mechanisms.
There are more, but perhaps my comments don't merit response? |
higher resilience and better performance. | ||
|
||
This SIMD comes with an extensive companion paper. The Alpenglow White Paper | ||
v1.1 is available at https://www.anza.xyz/alpenglow-1-1. |
I’d prefer to see the whitepaper here as part of the PR, so that it’s clear the document has changed (via commits). Even now, when I follow the link in the document title, I see Solana Alpenglow White Paper 2025-05-19 v1.1.pdf, while in the document header it says White Paper v1.1, July 22, 2025.