Commit 092a67c

SIMD-0186: Transaction Data Size Specification
1 parent 58ab7ca commit 092a67c

1 file changed: 143 additions, 0 deletions

@@ -0,0 +1,143 @@
---
simd: '0186'
title: Transaction Data Size Specification
authors:
  - Hanako Mumei
category: Standard
type: Core
status: Review
created: 2024-10-20
feature: (fill in with feature tracking issues once accepted)
---

## Summary

Before a transaction can be executed, every account it may read from or write to
must be loaded, including any programs it may call. The amount of data a
transaction is allowed to load is capped, and if it exceeds that limit, loading
is aborted. This functionality is already implemented in the validator. The
purpose of this SIMD is to explicitly define how transaction data size is
calculated.

## Motivation

Transaction data size accounting is currently unspecified, and the
implementation-defined algorithm used in the Agave client exhibits some
surprising behaviors:

* BPF loaders required by top-level programs are counted against transaction
data size. BPF loaders required by CPI programs are not. If a required BPF
loader is also included in the accounts list, it is counted twice.
* The size of a program owned by LoaderV3 may or may not include the size of its
programdata depending on how the program account is used in the transaction.
Programdata is also itself counted if included in the transaction accounts list.
This means programdata may be counted zero, one, or two times per transaction.

All validator clients must arrive at precisely the same transaction data size
for all transactions because a difference of one byte can determine whether a
transaction is executed or fails, and thus affects consensus. Also, we want the
calculated transaction data size to correspond well with the actual amount of
data the transaction requests.

Therefore, this SIMD seeks to specify an algorithm that is straightforward to
implement in a client-agnostic way, while also accurately accounting for the
total data required by the transaction.

## New Terminology

One term is defined within the scope of this SIMD:

* Valid program: a program that has been loaded, or a builtin. This definition
excludes programs that have failed verification, or LoaderV3 programs that have
been closed or have delayed visibility due to being deployed or modified in the
current slot.

These terms are not new, but we define them for clarity:

* Top-level program: the program corresponding to the program id on a given
instruction.
* Instruction account: an account passed to an instruction, which allows its
program to view the actual bytes of the account. CPI can only happen through
programs provided as instruction accounts.
* Transaction accounts list: all accounts for the transaction, which includes
top-level programs, the fee-payer, all instruction accounts, and any extra
accounts added to the list but not used for any purpose.

## Detailed Design

The proposed algorithm is as follows:

1. Every account explicitly included on the transaction accounts list is counted
once and only once.
2. A valid program owned by LoaderV3 also includes the size of its programdata.
3. Other than point 2, no accounts are implicitly added to the total data size.
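
The three points above can be sketched in a few lines of Python. This is an illustrative model only, not the Agave implementation: the `Account` record, the string keys, the `"LoaderV3"` owner label, and the `is_valid_program` flag are hypothetical stand-ins for real account metadata, and the byte sizes are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Placeholder owner label (an assumption), not the real on-chain loader address.
LOADER_V3 = "LoaderV3"


@dataclass(frozen=True)
class Account:
    owner: str
    data_len: int
    # Only set for LoaderV3 program accounts: key of the programdata account.
    programdata_key: Optional[str] = None
    # True if the program meets the "valid program" definition in this SIMD.
    is_valid_program: bool = False


def transaction_data_size(account_keys: List[str], accounts: Dict[str, Account]) -> int:
    """Sum the data size for a transaction accounts list under points 1 to 3."""
    total = 0
    # Point 1: each listed account is counted once and only once.
    for key in dict.fromkeys(account_keys):
        account = accounts[key]
        total += account.data_len
        # Point 2: a valid LoaderV3 program also counts its programdata.
        if (
            account.owner == LOADER_V3
            and account.is_valid_program
            and account.programdata_key is not None
        ):
            total += accounts[account.programdata_key].data_len
    # Point 3: nothing else (for example, an unlisted loader) is added.
    return total


# Hypothetical accounts with illustrative sizes in bytes.
accounts = {
    "payer": Account(owner="System", data_len=0),
    "token_account": Account(owner="Token", data_len=165),
    "program": Account(owner=LOADER_V3, data_len=36,
                       programdata_key="programdata", is_valid_program=True),
    "programdata": Account(owner=LOADER_V3, data_len=100_000),
}

size = transaction_data_size(["payer", "token_account", "program"], accounts)
```

Here `size` is the sum of the three listed accounts plus the programdata pulled in by point 2. The loader that owns `program` contributes nothing because it is not on the accounts list.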

Transactions may include a
`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define
a data size limit for the transaction. Otherwise, the default limit is 64MiB
(`64 * 1024 * 1024` bytes).

If a transaction exceeds its data size limit, account loading is aborted and the
transaction fails. Fees will be charged once
`enable_transaction_loading_failure_fees` is enabled.
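
As a rough sketch of the limit check described above, the following Python models a running total that aborts as soon as the limit is exceeded. The function names are hypothetical, and treating a total exactly equal to the limit as acceptable is an assumption based on the word "exceeds", not a statement about any client's code.

```python
from typing import Iterable, Optional

# Default limit stated in this SIMD: 64MiB.
DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT = 64 * 1024 * 1024


def effective_limit(requested_limit: Optional[int] = None) -> int:
    """Use the limit requested by the transaction, or the default if none was set."""
    if requested_limit is None:
        return DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT
    return requested_limit


def loading_succeeds(counted_sizes: Iterable[int],
                     requested_limit: Optional[int] = None) -> bool:
    """Return False as soon as the running total exceeds the limit."""
    limit = effective_limit(requested_limit)
    total = 0
    for size in counted_sizes:
        total += size
        if total > limit:
            return False  # loading is aborted and the transaction fails
    return True


at_limit = loading_succeeds([DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT])
one_byte_over = loading_succeeds([DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT, 1])
custom_ok = loading_succeeds([600, 400], requested_limit=1000)
custom_over = loading_succeeds([600, 401], requested_limit=1000)
```

The `one_byte_over` and `custom_over` cases show why every client must compute the same total: a single byte decides the outcome.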

Adding required loaders to transaction data size is abolished. They are treated
the same as any other account: counted if on the transaction accounts list, not
counted otherwise.

Read-only and writable accounts are treated the same. In the future, when direct
mapping is enabled, this SIMD may be amended to count them differently.

As a consequence of points 1 and 2, for LoaderV3 programs, programdata is counted
twice if a transaction includes both programdata and the program account itself
in the accounts list, unless the program is not valid for execution. This is
partly done for ease of implementation: we always want to count programdata when
the program is included, and there is no reason for any transaction to include
both accounts except during initial deployment, in which case the program is not
yet valid.
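
A small worked example may make the double-counting case concrete. The sizes below are illustrative assumptions, and `counted_bytes` is a hypothetical helper that applies points 1 and 2 to a single LoaderV3 program and its programdata account.

```python
# Illustrative sizes in bytes (assumptions for this example).
PROGRAM_ACCOUNT_LEN = 36
PROGRAMDATA_LEN = 200_000


def counted_bytes(includes_program: bool,
                  includes_programdata: bool,
                  program_is_valid: bool) -> int:
    """Bytes counted for one LoaderV3 program and its programdata account."""
    total = 0
    if includes_program:
        total += PROGRAM_ACCOUNT_LEN          # point 1: the program account itself
        if program_is_valid:
            total += PROGRAMDATA_LEN          # point 2: implicit programdata
    if includes_programdata:
        total += PROGRAMDATA_LEN              # point 1: programdata listed explicitly
    return total


typical_call = counted_bytes(True, False, True)       # program only, valid
both_listed = counted_bytes(True, True, True)         # programdata counted twice
initial_deploy = counted_bytes(True, True, False)     # program not yet valid
programdata_only = counted_bytes(False, True, True)   # counted once, by point 1
```

In the deployment case the program is not yet valid, so point 2 does not apply and programdata is counted only once.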

We include programdata size in program size for LoaderV3 programs because in
nearly all cases a transaction will include the program account (the only way to
invoke the program) and will not include the programdata account because
including it serves no purpose. Including the program account forces an
unconditional load of the programdata account because it is required to compile
the program for execution. Therefore, we always count it, even when the program
is an instruction account.

There is no special handling for programs owned by the native loader, LoaderV1,
or LoaderV2.

Account size for programs owned by LoaderV4 is left undefined. This SIMD should
be amended to define the required semantics before LoaderV4 is enabled on any
network.

## Alternatives Considered

* Transaction data size accounting is already enabled, so the null option is to
enshrine the current Agave behavior in the protocol. This is undesirable because
the current behavior is highly idiosyncratic, and LoaderV3 program sizes are
routinely undercounted.
* Builtin programs are backed by accounts that only contain the program name as
a string, typically making them 15-40 bytes. We could make them free when not
instruction accounts, since they're part of the validator. However, this
adds complexity for no real benefit.

## Impact

The primary impact is that this SIMD makes correctly implementing transaction
data size accounting much easier for other validator clients.

It makes transactions which include program accounts for CPI somewhat larger,
but given the generous 64MiB limit, it is unlikely that any existing users will
be affected.

## Security Considerations

Security impact is minimal because this SIMD merely simplifies an existing
feature.

This SIMD requires a feature gate.

## Backwards Compatibility

Transactions that call LoaderV3 programs via CPI and are extremely close to the
64MiB limit may now exceed it.
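
The following arithmetic sketch illustrates the kind of transaction that could be affected. It assumes, as the Motivation section suggests but does not spell out, that the earlier accounting counted only the program account itself when a LoaderV3 program was passed as an instruction account for CPI. All sizes and names are hypothetical.

```python
LIMIT = 64 * 1024 * 1024  # default limit from this SIMD, in bytes

# Illustrative sizes in bytes (assumptions for this example).
other_accounts_len = LIMIT - 1_000   # everything else the transaction loads
cpi_program_len = 36                 # LoaderV3 program account passed for CPI
cpi_programdata_len = 50_000         # its programdata, not on the accounts list

# Assumed earlier accounting: programdata of the CPI program is not counted.
size_before = other_accounts_len + cpi_program_len

# Accounting under this SIMD: point 2 adds programdata for a valid program.
size_after = other_accounts_len + cpi_program_len + cpi_programdata_len

fits_before = size_before <= LIMIT
fits_after = size_after <= LIMIT
```

With these numbers the transaction fits under the assumed earlier accounting but exceeds the limit once programdata is counted.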
