## Summary

This proposal introduces a new parallelism-aware CU tracking algorithm and
serial execution constraint. The goal is to properly bound worst case
slow-replay edge cases that aren't covered by the existing block CU limit and
per-account CU write limit. Properly bounding worst case replay time will allow
us to increase the global CU limit faster.

## Motivation

Block limits exist to ensure that blocks can be replayed quickly by most of the
cluster running a default client on reference hardware. The more accurately the
protocol can bound worst case replay time, the more throughput can be allowed
in the average case.

Solana currently enforces 4 block level constraints:
| Max Block Accounts Data Size Delta | 100MB | 100MB |

The Max Writable Account Unit limit, which specifies the maximum number of CUs
that can be consumed by transactions writing to a single account, was motivated
by the desire to enforce parallelism in replay. It was originally set to `12M`
CU back when the Global Block CU limit was `48m = 12m*4` because the default
client implementation called for execution on `4` execution threads.
Unfortunately, this constraint doesn't properly bound the worst case replay
behavior of a block. It is possible to construct a block with overlapping write
account accesses such that all `50m` CU must be replayed serially on a single
thread but no single account is written more than the `12m` CU limit.

The per-account limit doesn't only fail in these pathological cases. Data
sampled from mainnet shows that among blocks with 40M+ CUs, the **median**
serial workload (the critical path as described in the Terminology section
below) already exceeds the 12M CU limit. Additionally, many blocks exceed the
limit by significant amounts. This will only get worse as the block CU limit
increases.

This proposal introduces a new block level constraint that would prevent this
worst case scenario, allowing us to safely increase Global CU limits more
quickly.

## New Terminology

- **TX-DAG**: a dependency graph where each node is a transaction and each edge
  represents execution dependencies between transactions that contend for the
  same account lock. The direction of the edge is determined by the relative
  position of each transaction in the block order.
- **Track**: ordered list of transactions belonging to a subset of transactions
  in a given block. Analogous to a serial execution schedule on a single
  thread.
- **Vote Track**: track dedicated to only simple vote transactions.
- **Deterministic Transaction Assignment Algorithm (DTAA)**: streaming
  algorithm that adds each incoming transaction to the TX-DAG and assigns it
  to a track based on previous assignments and the dependencies encoded by the
  TX-DAG.
- **Critical Path**: longest CU-weighted path in a TX-DAG.
- **Makespan**: the CU total of the highest-CU track produced by applying DTAA
  to an entire block.
- **Serial Execution Limit**: cap on makespan.

The change introduces a new Max Serial Execution Units constraint at 25M CUs.

High-level: the protocol cost tracker maintains CU counts for each execution
track. *After* transactions are executed, they are sent to a cost tracker. As
the cost tracker receives transactions, it processes them in block order (based
on their relative positions in the block): it deterministically assigns each
executed transaction to a virtual execution track and updates that track's CU
amount to account for the used CUs of the transaction, all of its parents in
the `TX-DAG`, and all previous transactions assigned to the same track. This is
equivalent to virtual scheduling, where each task's virtual start time depends
on completion of tasks that must complete before it.

Note that this change is purely a resource constraint on which blocks are
considered valid and not a prescription on how transactions should be
scheduled during either block production or replay. If validators are running
more performant hardware they are welcome to use additional cores to schedule
or replay transactions. The goal of this proposal is to ensure that blocks can
replay in a timely manner on reference hardware.

Example applications of DTAA to mainnet blocks (the red transactions are on
the critical path):
Side note: these are two blocks with similar total CUs, but despite this the
first has a much smaller makespan. In a 4-thread, parallel execution context,
the first block can be executed nearly twice as fast if using CUs as a proxy
for time.

### Protocol Changes and Additions

- Deterministic Transaction Assignment Algorithm:
  - `DTAA(tx_stream, N, M) -> track_cus[]`
  - `track_cus[]` contains the serial execution CUs for each track
  - `tx_stream` is a real-time transaction stream representing input during
    block production or verification.
  - `DTAA` processes transactions from this stream as they arrive: assigning
    each to an appropriate track and adding it to the `TX-DAG`.
- Assignment to a track sets that track's total CUs to `end(tx, track)`:
  - `max_path(tx) = max(end(p, p.track) for p in TX-DAG.parents(tx))`
- Note: `tx.CUs` is the real CUs consumed during execution, not the CU limit
  requested by the fee payer.
- Simple vote transactions are assigned to the `M` vote tracks, others to the
  `N` standard tracks.
- A block is valid iff:
  - `max(track.CUs for track in DTAA(tx_stream, N, M)) ≤ L`
  - Vote tracks are ignored

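As a concrete illustration, the per-track accounting above can be exercised on a toy three-transaction block. Note that the exact `end(tx, track)` formula used here is an assumption inferred from the high-level description (only `max_path` is defined above); the normative definition may differ.

```python
# Hypothetical sketch of the per-track CU accounting described above.
# ASSUMPTION: end(tx, track) = max(track.CUs, max_path(tx)) + tx.CUs,
# i.e. a transaction virtually starts only once both its assigned track
# and all of its TX-DAG parents have finished.

def end(track_cus, max_path, tx_cus):
    return max(track_cus, max_path) + tx_cus

# Toy block: tx_a writes account X, tx_b also writes account X (so tx_a is
# its TX-DAG parent), tx_c touches an unrelated account. N = 2 tracks.
tracks = [0, 0]

# tx_a: 5M CUs, no parents -> track 0
tracks[0] = end(tracks[0], 0, 5_000_000)          # track 0 now at 5M

# tx_b: 4M CUs, assigned to the empty track 1, but its parent tx_a ends at
# 5M, so it cannot virtually start before then despite the free track.
tracks[1] = end(tracks[1], tracks[0], 4_000_000)  # track 1 now at 9M

# tx_c: 3M CUs, no parents -> back on track 0
tracks[0] = end(tracks[0], 0, 3_000_000)          # track 0 now at 8M

makespan = max(tracks)
print(tracks, makespan)  # [8000000, 9000000] 9000000
```

Even though track 1 was idle when tx_b arrived, the write conflict with tx_a dominates its virtual start time, which is exactly the serial behavior the per-account limit alone fails to capture.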
### Requirements

- **Consensus**: the serial execution limit doesn't apply to simple vote
  transactions. This ensures consensus performance isn't negatively impacted.
- **Determinism**: for a given block, all validators must assign transactions
  to tracks identically so they agree on the makespan calculation. Otherwise,
  the cluster can fork due to block validity equivocation.
- **Real-time**: the assignment algorithm must work in a real-time, streaming
  context so CU-tracking can occur as shreds are being received by a validator
  or while a leader is building a block.


A simple greedy scheduling algorithm fulfills these requirements. Pseudocode
for one such algorithm is provided below. Note that optimal assignment is
NP-hard so sub-optimality is unavoidable.

- `APPLY_TX_COST(tx)` must be called with transactions in the same order they
  appear in the block. This applies to both replayers and the leader.
- Vote tracks aren't subject to the serial execution limit but we need to
  track them because they can still contend for the same accounts as
  transactions in the standard tracks, and be parents in the `TX-DAG` for
  those transactions.

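As a rough, non-normative sketch of such a greedy algorithm: the `Tx` and `CostTracker` names, the least-loaded-track heuristic, and the simplification of tracking only write-write conflicts (rather than full account-lock contention) are all illustrative assumptions, not the proposal's normative pseudocode.

```python
from dataclasses import dataclass

@dataclass
class Tx:
    cus: int                    # real CUs consumed during execution
    write_accounts: frozenset   # accounts written by this transaction
    is_vote: bool = False

class CostTracker:
    """Illustrative greedy DTAA sketch: N standard tracks, M vote tracks."""

    def __init__(self, n_standard, m_vote, serial_limit):
        self.std = [0] * n_standard
        self.vote = [0] * m_vote
        self.limit = serial_limit
        # Virtual end time of the last writer seen per account, standing in
        # for explicit TX-DAG parent edges. Only write-write conflicts are
        # modeled here for brevity.
        self.account_ends = {}

    def apply_tx_cost(self, tx):
        # Must be called with transactions in block order (see note above).
        max_path = max(
            (self.account_ends.get(a, 0) for a in tx.write_accounts),
            default=0,
        )
        tracks = self.vote if tx.is_vote else self.std
        # Greedy heuristic: assign to the least-loaded eligible track.
        i = min(range(len(tracks)), key=lambda j: tracks[j])
        tracks[i] = max(tracks[i], max_path) + tx.cus
        for a in tx.write_accounts:
            self.account_ends[a] = tracks[i]
        # Only standard tracks are subject to the serial execution limit.
        return max(self.std) <= self.limit
```

Vote transactions still update `account_ends`, so they can act as TX-DAG parents for standard-track transactions, matching the note above, while the vote tracks themselves never count against the limit.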
### Implementation

- Block Verification (Replay): because real CUs rather than requested CUs are
  used for determining if constraints are satisfied, cost tracking must occur
  post-execution (i.e. `APPLY_TX_COST` must be called after `tx` is executed).
  Block execution during replay doesn't guarantee that transactions in a block
  will complete execution in the same order they appear in the block, so cost
  tracking must account for this somehow. For example, the cost tracker can
  handle re-ordering internally or a synchronization mechanism in the bank can
  enforce order.
- Block Production: similar post-execution requirements apply here as well;
  the main difference being that the position of a transaction in the block,
  in addition to the real CUs it consumes, isn't determined until
  post-execution when the transaction is processed by the PoH recorder.
  Caveat: this implies failure to satisfy the serial execution constraint may
  occur **after** a transaction has already been executed, which would waste
  compute resources. This can be mitigated partially with additional,
  speculative pre-execution checks (as is done currently by applying the
  requested CUs to the cost tracker).

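One possible (non-normative) way to have the cost tracker handle re-ordering internally is to buffer out-of-order completions and drain them in block order; `OrderedCostApplier` and `apply_fn` are illustrative names standing in for whatever wraps `APPLY_TX_COST`:

```python
import heapq

class OrderedCostApplier:
    """Buffers out-of-order execution results and releases them in block
    order, so the wrapped apply function always observes block order.
    Illustrative sketch only."""

    def __init__(self, apply_fn):
        self.apply_fn = apply_fn  # stands in for APPLY_TX_COST
        self.next_index = 0       # next block position eligible to apply
        self.pending = []         # min-heap of (block_index, tx)

    def on_tx_executed(self, block_index, tx):
        """Called from execution threads as transactions finish, in any
        order. Applies costs for every contiguous run now ready."""
        heapq.heappush(self.pending, (block_index, tx))
        applied = []
        while self.pending and self.pending[0][0] == self.next_index:
            _, ready = heapq.heappop(self.pending)
            self.apply_fn(ready)
            applied.append(ready)
            self.next_index += 1
        return applied
```

A transaction that finishes early simply waits in the heap until every earlier block position has completed, trading a little latency for deterministic ordering.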
### DTAA Optimality

For a block `B`, `max(critical_path(B), total_cus(B) / N)` represents a lower
bound on the optimal makespan `OPT`. If `lb(OPT)` is this lower bound and
`makespan` is the makespan calculated by `DTAA`, then `r = lb(OPT) / makespan`
can be considered a lower bound estimate of the true optimality of `DTAA`. The
closer `r` is to 1, the more optimal DTAA is estimated to be. Empirical
analysis shows that when applying `DTAA` retroactively to all mainnet blocks
with total CUs greater than 40M in a sample thousand-slot range, the median
`r` is `0.8`.

Applying DTAA to historical mainnet blocks also provides an upper and lower
bound on how much leftover capacity there is:

- lower bound (worst-case): subtract the makespan of the block from the serial
  execution constraint of 25M CUs
- upper bound (best-case): sum the differences between each track's CUs and
  the serial execution constraint

- critical path constraint: a limit on the critical path of the `TX-DAG`. This
  does restrict long serial chains of transactions but fails as a general
  serial execution constraint because it cannot account for the degree of
  parallelism (e.g. number of threads).

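The two leftover-capacity bounds described above reduce to a few lines, assuming `track_cus` is a block's DTAA output (the sample values are illustrative only):

```python
SERIAL_EXECUTION_LIMIT = 25_000_000  # CUs, per this proposal

def leftover_capacity_bounds(track_cus, limit=SERIAL_EXECUTION_LIMIT):
    """Worst-case and best-case leftover CU capacity for a block, per the
    bounds described above. Illustrative sketch."""
    makespan = max(track_cus)
    lower = limit - makespan                   # worst case: one track fills
    upper = sum(limit - t for t in track_cus)  # best case: all tracks fill
    return lower, upper

lo, hi = leftover_capacity_bounds(
    [20_000_000, 12_000_000, 8_000_000, 5_000_000]
)
print(lo, hi)  # 5000000 55000000
```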
## Impact

This proposal will likely require updates to transaction scheduling during
block production, which includes MEV infrastructure like JITO. Cost tracking
will no longer be independent of order, so shuffling transactions around while
optimizing for profitability may impact block validity.

## Security Considerations

All validator client implementations must use the same DTAA in order to
prevent equivocation on block validity. Otherwise, a partition of the network
may occur.