Commit a364ce7

SIMD-0186: Transaction Data Size Specification
1 parent 58ab7ca commit a364ce7

1 file changed

+121
-0
lines changed

@@ -0,0 +1,121 @@
1+
---
2+
simd: '0186'
3+
title: Transaction Data Size Specification
4+
authors:
5+
- Hanako Mumei
6+
category: Standard
7+
type: Core
8+
status: Review
9+
created: 2024-10-20
10+
feature: (fill in with feature tracking issues once accepted)
11+
---
12+
13+
## Summary
14+
15+
Before a transaction can be executed, every account it may read from or write to
16+
must be loaded, including any programs it may call. The amount of data a
17+
transaction is allowed to load is capped, and if it exceeds that limit, loading
18+
is aborted. This functionality is already implemented in the validator. The
19+
purpose of this SIMD is to explicitly define how transaction data size is calculated.
20+
21+
## Motivation
22+
23+
Transaction data size accounting is currently unspecified, and the
24+
implementation-defined algorithm used in the Agave client exhibits some
25+
surprising behaviors:
26+
27+
* BPF loaders required by top-level invoked programs are counted against
28+
transaction data size. BPF loaders required by CPI invoked programs are not. If
29+
a required BPF loader is also invoked or included in the accounts list, it is
30+
counted twice.
31+
* The size of a program owned by the upgradeable BPF loader (henceforth
32+
LoaderV3) may or may not include the size of its programdata depending on how it
33+
is used in the transaction. Programdata is also counted if it is itself included
34+
in the transaction. This means programdata may be counted zero, one, or
35+
two times.
36+
37+
All validator clients must arrive at precisely the same transaction data size
38+
for all transactions because a difference of one byte can determine whether a
39+
transaction executes or fails. Also, we want the calculated transaction data
40+
size to correspond closely to the actual amount of data the transaction
41+
requests.
42+
43+
Therefore, this SIMD seeks to specify an algorithm that is straightforward to
44+
implement in a client-agnostic way, while also accurately accounting for the
45+
total data required by the transaction.
46+
47+
## New Terminology
48+
49+
N/A
50+
51+
## Detailed Design
52+
53+
The proposed algorithm is as follows:
54+
55+
1. Every account explicitly included on the transaction accounts list is counted
56+
once and only once.
57+
2. A program owned by LoaderV3 also includes the size of its programdata.
58+
3. Other than point 2, no accounts are implicitly added to the total data size.
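
The three rules above can be sketched in Rust as follows. This is a minimal illustration, not the Agave implementation: `Address`, `Account`, its `programdata` field, and `transaction_data_size` are hypothetical names, and the LoaderV3 program ID is passed in as a parameter rather than taken from any real crate. Treating an account that does not exist as zero bytes is also an assumption of this sketch.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical 32-byte account address (stand-in for a real pubkey type).
type Address = [u8; 32];

/// Hypothetical, simplified view of a loaded account.
struct Account {
    /// Length of the account's data in bytes.
    data_len: usize,
    /// Address of the program that owns this account.
    owner: Address,
    /// For a LoaderV3 program account: the address of its programdata account.
    programdata: Option<Address>,
}

/// Sums the data size of a transaction under the three proposed rules.
fn transaction_data_size(
    account_keys: &[Address],
    accounts: &HashMap<Address, Account>,
    loader_v3: &Address,
) -> usize {
    let mut seen = HashSet::new();
    let mut total = 0usize;
    for key in account_keys {
        // Rule 1: every listed account is counted once and only once.
        // The dedup is defensive; it only matters if a key is repeated.
        if !seen.insert(*key) {
            continue;
        }
        // Assumption: an account that does not exist contributes zero bytes.
        let Some(account) = accounts.get(key) else {
            continue;
        };
        total = total.saturating_add(account.data_len);
        // Rule 2: a LoaderV3 program also includes the size of its programdata.
        if account.owner == *loader_v3 {
            if let Some(pd) = account.programdata.and_then(|a| accounts.get(&a)) {
                total = total.saturating_add(pd.data_len);
            }
        }
        // Rule 3: nothing else is added implicitly (no loaders, no CPI targets).
    }
    total
}
```

Because rule 2 is applied per program account and independently of rule 1, listing the programdata account explicitly adds its size a second time.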
59+
60+
Transactions may include a
61+
`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define
62+
a data size limit for the transaction. Otherwise, the default limit is 64MiB
63+
(`64 * 1024 * 1024` bytes). In the future, this default may be changed by
64+
amending this SIMD.
65+
66+
If a transaction exceeds its data size limit, account loading is aborted and the
67+
transaction fails. Fees will be charged once
68+
`enable_transaction_loading_failure_fees` is enabled.
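
The limit selection and abort behavior described in the two paragraphs above might look like the following sketch. It assumes the requested limit is a `u32`; the constant name, the `LoadError` enum, and both functions are hypothetical and local to this example. Fee charging is outside its scope.

```rust
/// Default limit stated in this SIMD: 64MiB.
const DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT: u32 = 64 * 1024 * 1024;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum LoadError {
    MaxLoadedAccountsDataSizeExceeded,
}

/// Returns the limit requested by the transaction, or the default if none was set.
fn effective_limit(requested: Option<u32>) -> u32 {
    requested.unwrap_or(DEFAULT_LOADED_ACCOUNTS_DATA_SIZE_LIMIT)
}

/// Accumulates per-account sizes and aborts as soon as the running total
/// exceeds the limit. A total exactly equal to the limit is still allowed.
fn accumulate_checked(sizes: &[usize], requested: Option<u32>) -> Result<usize, LoadError> {
    let limit = effective_limit(requested) as usize;
    let mut total = 0usize;
    for size in sizes {
        total = total.saturating_add(*size);
        if total > limit {
            return Err(LoadError::MaxLoadedAccountsDataSizeExceeded);
        }
    }
    Ok(total)
}
```

Checking after each account, rather than once at the end, lets loading stop at the first account that pushes the total over the limit.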
69+
70+
Read-only and writable accounts are treated the same. In the future, when direct
71+
mapping is enabled, this SIMD may be amended to count them differently.
72+
73+
As a consequence of points 1 and 2, programdata is counted twice if a transaction
74+
includes both programdata and the program account itself in the accounts list.
75+
This is partly done for ease of implementation: we always want to count
76+
programdata when the program is included, and there is no reason for any
77+
transaction to include both accounts except during initial deployment.
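
As a worked illustration of the double counting, let $p$ be the size of the program account and $d$ the size of its programdata (both symbols are introduced here only for this example). When both accounts appear in the accounts list, their contribution to the total is:

$$
\underbrace{(p + d)}_{\text{program account, points 1 and 2}} + \underbrace{d}_{\text{programdata account, point 1}} = p + 2d
$$

When only the program account is listed, the contribution is $p + d$.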
78+
79+
There is no special handling for programs owned by the native loader or the
80+
non-upgradeable BPF loaders.
81+
82+
Account size for programs owned by LoaderV4 is left undefined. This SIMD should
83+
be amended before LoaderV4 is enabled.
84+
85+
## Alternatives Considered
86+
87+
* Transaction data size accounting is already enabled, so the null option is to
88+
enshrine the current Agave behavior in the protocol. This is undesirable because
89+
the current behavior is highly idiosyncratic, and LoaderV3 program sizes are
90+
routinely undercounted.
91+
* Builtin programs are backed by accounts that only contain the program name as
92+
a string, typically making them 15-40 bytes. We could make them free when not
93+
instruction accounts, since they're part of the validator. However, this
94+
adds complexity for no real benefit.
95+
* We include LoaderV3 programdata size in program size because almost all
96+
transactions will use the program account, which forces a load of programdata,
97+
and not use programdata directly. To be truly consistent, we might want to count
98+
LoaderV1 and LoaderV2 programs twice if they're instruction accounts, since the
99+
data does have to be loaded twice. However, this adds complexity for what may be
100+
an Agave-specific implementation detail, and these programs are rarely used.
101+
102+
## Impact
103+
104+
The primary impact is that this SIMD makes correctly implementing transaction data
105+
size accounting much easier for other validator clients.
106+
107+
It makes transactions that include program accounts for CPI somewhat larger,
108+
but given the generous 64MiB limit, it is unlikely that any existing users will
109+
be affected.
110+
111+
## Security Considerations
112+
113+
Security impact is minimal because this SIMD merely simplifies an existing
114+
feature.
115+
116+
This SIMD requires a feature gate.
117+
118+
## Backwards Compatibility
119+
120+
Transactions that call LoaderV3 programs via CPI and are extremely close to the
121+
64MiB limit may now exceed it.
