Commit 1655e46

SIMD-0149: Migrate Snapshot Epoch Stakes (#149)
* SIMD-0149: Migrate Snapshot Epoch Stakes
* fix lint
* feedback
* Update proposals/0149-migrate-snapshot-epoch-stakes.md

  Co-authored-by: lheeger-jump <[email protected]>
* Update proposals/0149-migrate-snapshot-epoch-stakes.md

  Co-authored-by: lheeger-jump <[email protected]>

---------

Co-authored-by: lheeger-jump <[email protected]>
1 parent 4b1eaaf commit 1655e46

Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
---
simd: '0149'
title: Migrate Snapshot Serialized Epoch Stakes
authors: Justin Starry (Anza)
category: Standard
type: Core
status: Draft
created: 2024-05-09
feature: (fill in with feature tracking issues once accepted)
---

## Summary

Migrate bank snapshots to a new "epoch stakes" field in order to store
additional stake state needed for calculating partitioned rewards.

## Motivation

In order to properly support recalculating partitioned rewards (SIMD-0118)
when rebooting from a snapshot in the middle of the rewards distribution
period at the beginning of an epoch, additional stake state should be
stored in the epoch stakes snapshot field.

Since the currently used epoch stakes field doesn't support versioning, we
propose migrating the snapshot format to use a new epoch stakes field which is
versioned and supports storing the required stake state.

## New Terminology

NA

## Alternatives Considered

We have discussed the following alternative approach:

1. The missing stake state information can be retrieved from the current epoch's
stakes cache instead of epoch stakes. However, this alternative is risky because
the current epoch's stakes cache is updated during rewards distribution with
newly rewarded stake accounts and possibly newly created stake accounts as well.

## Detailed Design

### Serialized Data Changes

To be more specific, the missing stake information is the `credits_observed`
field from the `Stake` struct:

```rust
struct Stake {
    delegation: Delegation,
    credits_observed: u64,
}
```

The currently used `epoch_stakes` field in the bank snapshot serializes a map of
epochs to `EpochStakes` structs which have special logic for serializing stake
state:

```rust
struct (BankFieldsToDeserialize | BankFieldsToSerialize) {
    ..
    epoch_stakes: HashMap<Epoch, EpochStakes>,
    ..
}

struct EpochStakes {
    #[serde(with = "crate::stakes::serde_stakes_enum_compat")]
    stakes: Arc<StakesEnum>,
    total_stake: u64,
    node_id_to_vote_accounts: Arc<NodeIdToVoteAccounts>,
    epoch_authorized_voters: Arc<EpochAuthorizedVoters>,
}

enum StakesEnum {
    Accounts(Stakes<StakeAccount>),
    Delegations(Stakes<Delegation>),
}
```

When serializing `EpochStakes` in snapshots, all `StakesEnum` variants first map
stake entry values to their `Delegation` value before serialization. The goal of
this proposal is to migrate to a new epoch stakes field which maps stake entry
values to their full `Stake` value before serialization so that
`credits_observed` will be included in the snapshot and available after
snapshot deserialization.
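The difference between the two mappings can be sketched as follows. This is a minimal illustration, not the validator's actual serialization code: `Pubkey` is reduced to a byte array, `Delegation` is cut down to two fields, and the helper names `to_old_field` and `to_new_field` are hypothetical.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real types (assumption: fields reduced for brevity).
type Pubkey = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct Delegation {
    voter_pubkey: Pubkey,
    stake: u64,
}

#[derive(Clone, Debug, PartialEq)]
struct Stake {
    delegation: Delegation,
    credits_observed: u64,
}

/// Hypothetical helper mirroring the old field: each entry is reduced to its
/// `Delegation`, so `credits_observed` is dropped before serialization.
fn to_old_field(stakes: &HashMap<Pubkey, Stake>) -> HashMap<Pubkey, Delegation> {
    stakes
        .iter()
        .map(|(key, stake)| (*key, stake.delegation.clone()))
        .collect()
}

/// Hypothetical helper mirroring the new field: each entry keeps the full
/// `Stake`, so `credits_observed` survives a snapshot round trip.
fn to_new_field(stakes: &HashMap<Pubkey, Stake>) -> HashMap<Pubkey, Stake> {
    stakes.clone()
}
```

Once an entry has been reduced to a `Delegation`, there is nothing left in the old field from which `credits_observed` could be recovered, which is why a new field is needed rather than a change to the reader alone.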

The proposed `new_epoch_stakes` bank snapshot field will instead serialize a map
of epochs to `VersionedEpochStakes` structs which can be updated in the future
to serialize different information if needed. This field will be appended to the
end of the serialized bank snapshot:

```rust
struct (BankFieldsToDeserialize | BankFieldsToSerialize) {
    ..
    new_epoch_stakes: HashMap<Epoch, VersionedEpochStakes>,
}

enum VersionedEpochStakes {
    Current {
        stakes: Stakes<Stake>,
        total_stake: u64,
        node_id_to_vote_accounts: Arc<NodeIdToVoteAccounts>,
        epoch_authorized_voters: Arc<EpochAuthorizedVoters>,
    },
}
```

### Snapshot Migration Phases

Handling snapshot format changes is always a delicate operation to coordinate
given that old software releases will not be able to deserialize snapshots from
new software releases properly. The rollout will require two phases:

1. Introduce support for deserializing the migrated epoch stakes field
2. Enable serializing epoch stakes to the new field and phase out the old field

During the first phase, validator software will be updated to attempt to
deserialize the new epoch stakes field appended at the end of the bank snapshot
and merge those entries with the epoch stakes entries deserialized from the old
epoch stakes field. If the new field doesn't exist, validators can assume that
all epoch stakes entries were serialized in the old field.
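The phase 1 merge can be sketched roughly as below. This is an illustrative sketch only: `StakeDetail` and `merge_epoch_stakes` are hypothetical names, and the absent field is modeled as `None`. The proposal does not say what happens if the same epoch appears in both fields, so letting the new-field entry win is an assumption made here.

```rust
use std::collections::HashMap;

type Epoch = u64;

// Hypothetical marker for how much stake state an entry carries.
#[derive(Clone, Copy, Debug, PartialEq)]
enum StakeDetail {
    DelegationOnly, // came from the old field
    Full,           // came from the new field
}

/// Merge entries from the old field with entries from the optional new field.
/// `new_field` is `None` when the snapshot was written by older software.
fn merge_epoch_stakes(
    old_field: HashMap<Epoch, StakeDetail>,
    new_field: Option<HashMap<Epoch, StakeDetail>>,
) -> HashMap<Epoch, StakeDetail> {
    let mut merged = old_field;
    if let Some(new_entries) = new_field {
        // Assumption: on a duplicate epoch, the new-field entry replaces the
        // old one because it carries strictly more information.
        merged.extend(new_entries);
    }
    merged
}
```

With `new_field` set to `None`, the result is exactly the old field's contents, which matches the fallback described above for snapshots produced before phase 2.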

During the second phase, validator software will be updated to start serializing
epoch stakes entries to the new epoch stakes field. Note, however, that there
will still potentially be some epoch stakes entries that are incompatible with
the new epoch stakes field that must still be serialized to the old epoch stakes
field.

There are 3 different epoch stakes entry variants:

1. Entries created during epoch boundaries which have _full_ stake account data.
2. Entries deserialized from the old snapshot epoch stakes field which only have
stake delegation state.
3. Entries deserialized from the new snapshot epoch stakes field which have full
stake state.

Only variants 1 and 3 can be serialized into the new epoch stakes field, so any
variant 2 epoch stakes entries will continue being serialized into the old epoch
stakes field. There are normally 6 epoch stakes entries serialized in each
snapshot, so nodes will have to cross 6 epoch boundaries before they completely
stop serializing any entries to the old epoch stakes snapshot field.
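The routing of the three variants, and the reason six epoch boundaries are needed, can be modeled with a small sketch. The names `Origin`, `split_for_serialization`, and `cross_epoch_boundary` are hypothetical, and the eviction rule (drop the lowest epoch once more than `max_entries` are held) is a simplifying assumption rather than the validator's actual retention logic.

```rust
use std::collections::BTreeMap;

type Epoch = u64;

// Where an in-memory epoch stakes entry came from (the three variants above).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Origin {
    EpochBoundary,    // variant 1: full stake account data
    OldSnapshotField, // variant 2: delegation state only
    NewSnapshotField, // variant 3: full stake state
}

/// Returns `(old_field_epochs, new_field_epochs)` for the next snapshot.
fn split_for_serialization(entries: &BTreeMap<Epoch, Origin>) -> (Vec<Epoch>, Vec<Epoch>) {
    let mut old_field = Vec::new();
    let mut new_field = Vec::new();
    for (&epoch, &origin) in entries {
        match origin {
            Origin::OldSnapshotField => old_field.push(epoch),
            Origin::EpochBoundary | Origin::NewSnapshotField => new_field.push(epoch),
        }
    }
    (old_field, new_field)
}

/// Adds the entry created at an epoch boundary and evicts the oldest entries
/// beyond `max_entries` (assumed retention rule for this sketch).
fn cross_epoch_boundary(entries: &mut BTreeMap<Epoch, Origin>, new_epoch: Epoch, max_entries: usize) {
    entries.insert(new_epoch, Origin::EpochBoundary);
    while entries.len() > max_entries {
        entries.pop_first();
    }
}
```

Under these assumptions, a node that boots with 6 entries from the old field replaces one of them at each boundary, so the old-field list only becomes empty after the sixth boundary.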

### Snapshot Rollout

We propose merging the implementation for phase 1 into the next minor software
release (e.g. v2.0) and then merging the implementation for phase 2 into the
following minor software release (e.g. v2.1). This way, we can ensure that the
whole cluster is running a version capable of reading the new snapshot field
before any nodes start producing snapshots with that new field.

## Impact

No major impact beyond backwards compatibility concerns. Snapshots will
be a few MB larger than before.

## Security Considerations

Missing or corrupted epoch stakes entries caused by faulty snapshot migration
can cause validators to fork off from the cluster or cause the cluster to lose
consensus if sufficient stake is affected.

Care should be taken to ensure that implementations can detect corrupted
or incorrect values.

## Backwards Compatibility

Snapshot changes must be made in a backwards compatible way. Handling
compatibility is thoroughly discussed in the proposal above.

## Open Questions

NA
