Commit 7816e67

update to new algorithm
1 parent c694155 commit 7816e67

2 files changed

+165
-152
lines changed

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
1+
---
2+
simd: '0186'
3+
title: Loaded Transaction Data Size Specification
4+
authors:
5+
- Hanako Mumei
6+
category: Standard
7+
type: Core
8+
status: Review
9+
created: 2024-10-20
10+
feature: (fill in with feature tracking issues once accepted)
11+
---
12+
13+
## Summary
14+
15+
Before a transaction can be executed, every account it may read from or write to
16+
must be loaded, including any programs it may call. The amount of data a
17+
transaction is allowed to load is capped, and if it exceeds that limit, loading
18+
is aborted. This functionality is already implemented in the validator. The
19+
purpose of this SIMD is to explicitly define how loaded transaction data size is
20+
calculated.
21+
22+
## Motivation
23+
24+
Transaction data size accounting is currently unspecified, and the
25+
implementation-defined algorithm used in the Agave client exhibits some
26+
surprising behaviors:
27+
28+
* BPF loaders required by instructions' program IDs are counted against
29+
transaction data size. BPF loaders required by CPI programs are not. If a
30+
required BPF loader is also included in the accounts list, it is counted twice.
31+
* The size of a program owned by LoaderV3 may or may not include the size of its
32+
programdata depending on how the program account is used on the transaction.
33+
Programdata is also itself counted if included in the transaction accounts list.
34+
This means programdata may be counted zero, one, or two times per transaction.
35+
* Due to certain quirks of implementation, loader-owned accounts which do not
36+
contain valid programs for execution may or may not be counted against the
37+
transaction data size total depending on how they are used on the transaction.
38+
This includes, but is not limited to, LoaderV3 buffer accounts, and accounts
39+
which fail ELF validation.
40+
* Accounts can be included on a transaction account list without being an
41+
instruction account, fee-payer, or program ID. These accounts are presently
42+
loaded and counted against transaction data size, although they can never be
43+
used for any purpose by the transaction.
44+
45+
All validator clients must arrive at precisely the same transaction data size
46+
for all transactions because a difference of one byte can determine whether a
47+
transaction is executed or failed, and thus affects consensus. Also, we want the
48+
calculated transaction data size to correspond well with the actual amount of
49+
data the transaction requests.
50+
51+
Therefore, this SIMD seeks to specify an algorithm that is straightforward to
52+
implement in a client-agnostic way, while also accurately accounting for all
53+
account data required by the transaction.
54+
55+
## New Terminology
56+
57+
No new terms are introduced by this SIMD; however, we define these for clarity:
58+
59+
* Instruction account: an account passed to an instruction in its accounts
60+
array, which allows the program to view the actual bytes contained in the
61+
account. CPI can only happen through programs provided as instruction accounts.
62+
* Transaction accounts list: all accounts for the transaction, which includes
63+
instruction accounts, the fee-payer, program IDs, and any extra accounts added
64+
to the list but not used for any purpose.
65+
* LoaderV3 program account: an account owned by
66+
`BPFLoaderUpgradeab1e11111111111111111111111` which contains in its account data
67+
the first four bytes `02 00 00 00` followed by a pubkey. The account that pubkey
68+
points to is defined as the program's programdata account.
69+
70+
For the purposes of this SIMD, we make no assumptions about the contents of the
71+
programdata account.
72+
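The LoaderV3 program account definition above can be sketched as a small predicate. This is an illustrative sketch, not the Agave implementation: the function name `programdata_key`, the representation of the owner as a base58 string, and the choice to accept data longer than tag plus pubkey are all assumptions made for the example. The 32-byte pubkey length is the standard size of an account address.

```python
from typing import Optional

# Base58 address of the upgradeable BPF loader (LoaderV3), as given in the text.
LOADER_V3 = "BPFLoaderUpgradeab1e11111111111111111111111"

# Four-byte prefix that marks a LoaderV3 program account.
PROGRAM_TAG = bytes([0x02, 0x00, 0x00, 0x00])
PUBKEY_LEN = 32


def programdata_key(owner: str, data: bytes) -> Optional[bytes]:
    """Return the referenced programdata pubkey, or None if this is not
    a LoaderV3 program account under the definition above.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if owner != LOADER_V3:
        return None
    # Assumption: data must hold at least the tag plus one full pubkey.
    if len(data) < len(PROGRAM_TAG) + PUBKEY_LEN:
        return None
    if data[:len(PROGRAM_TAG)] != PROGRAM_TAG:
        return None
    start = len(PROGRAM_TAG)
    return data[start:start + PUBKEY_LEN]
```

Note that the predicate never inspects the programdata account itself, in keeping with the statement that no assumptions are made about its contents.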
73+
## Detailed Design
74+
75+
The proposed algorithm is as follows:
76+
77+
1. Given a transaction, take the unique set of account keys which are used as:
78+
79+
* An instruction account.
80+
* A program ID for an instruction.
81+
* The fee-payer.
82+
83+
2. Each account's size is determined solely by the byte length of its data prior
84+
to transaction execution.
85+
3. For any `LoaderV3` program account, add the size of the programdata account
86+
it references, if it exists.
87+
4. The total loaded transaction data size is the sum of these sizes.
88+
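The four steps above can be sketched in Python as follows. This is a minimal model under stated assumptions, not a client implementation: `Account`, `Instruction`, and `loaded_data_size` are hypothetical names, account keys are modeled as raw 32-byte values while owners are base58 strings, and an account that does not exist is assumed to contribute zero bytes.

```python
from dataclasses import dataclass
from typing import Dict, List

LOADER_V3 = "BPFLoaderUpgradeab1e11111111111111111111111"
PROGRAM_TAG = bytes([0x02, 0x00, 0x00, 0x00])
PUBKEY_LEN = 32


@dataclass
class Account:
    owner: str   # base58 owner address (illustrative representation)
    data: bytes  # account data as it stands before execution


@dataclass
class Instruction:
    program_id: bytes
    accounts: List[bytes]


def loaded_data_size(
    fee_payer: bytes,
    instructions: List[Instruction],
    accounts: Dict[bytes, Account],
) -> int:
    # Step 1: unique set of keys used as fee-payer, program ID,
    # or instruction account. Other listed accounts are ignored.
    keys = {fee_payer}
    for ix in instructions:
        keys.add(ix.program_id)
        keys.update(ix.accounts)

    total = 0
    for key in keys:
        account = accounts.get(key)
        if account is None:
            continue  # assumption: a nonexistent account adds nothing
        # Step 2: size is the byte length of the account data.
        total += len(account.data)
        # Step 3: LoaderV3 program accounts also add their programdata.
        is_program = (
            account.owner == LOADER_V3
            and len(account.data) >= len(PROGRAM_TAG) + PUBKEY_LEN
            and account.data[:4] == PROGRAM_TAG
        )
        if is_program:
            programdata = accounts.get(account.data[4:4 + PUBKEY_LEN])
            if programdata is not None:
                total += len(programdata.data)
    # Step 4: the total is the sum of these sizes.
    return total
```

As a hypothetical example, a transaction whose fee-payer holds no data and whose instructions invoke one LoaderV3 program (36-byte program account, 1000-byte programdata) with one 50-byte instruction account comes to 36 + 1000 + 50 = 1086 bytes under this sketch.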
89+
Transactions may include a
90+
`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define
91+
a data size limit for the transaction. Otherwise, the default limit is 64MiB
92+
(`64 * 1024 * 1024` bytes).
93+
94+
If a transaction exceeds its data size limit, the transaction fails. Fees
95+
will be charged once `enable_transaction_loading_failure_fees` is enabled.
96+
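The limit check described above can be sketched as a strict comparison against either the requested limit or the default. The names `DEFAULT_LIMIT` and `exceeds_limit` are hypothetical, and the sketch assumes "exceeds" means strictly greater than the limit. Any validation or capping of the requested value is not described in this text and is not modeled here.

```python
from typing import Optional

# Default limit stated in the text: 64MiB.
DEFAULT_LIMIT = 64 * 1024 * 1024


def exceeds_limit(loaded_size: int, requested_limit: Optional[int] = None) -> bool:
    """Return True if the loaded data size is over the applicable limit.

    requested_limit stands in for a value supplied through
    SetLoadedAccountsDataSizeLimit; None means no such instruction
    was included, so the default applies.
    """
    limit = DEFAULT_LIMIT if requested_limit is None else requested_limit
    return loaded_size > limit
```

A size exactly equal to the limit passes in this sketch, while one additional byte fails, which is why every client must compute the same total.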
97+
Required loaders are no longer added to transaction data size automatically.
98+
They are treated the same as any other account: counted if used in a manner
99+
described by step 1, not counted otherwise.
100+
101+
No account that falls outside of the three categories listed in step 1 is counted
102+
against transaction data size. Validator clients are free to decline to load
103+
them.
104+
105+
Read-only and writable accounts are treated the same. In the future, when direct
106+
mapping is enabled, this SIMD may be amended to count them differently.
107+
108+
As a consequence of steps 1 and 3, for LoaderV3 programs, programdata is counted twice
109+
if a transaction explicitly references the program account and its programdata
110+
account. This is done partly for simplicity, and partly to account for the cost
111+
of maintaining the compiled program in addition to the actual bytes of
112+
the programdata account.
113+
114+
We include programdata size in account size for LoaderV3 programs because using
115+
the program account on a transaction forces an unconditional load of programdata
116+
to compile the program for execution. We always count it, even when the program
117+
is an instruction account, because the program must be available for CPI.
118+
119+
There is no special handling for any account owned by the native loader,
120+
LoaderV1, or LoaderV2.
121+
122+
Account size for programs owned by LoaderV4 is left undefined. This SIMD should
123+
be amended to define the required semantics before LoaderV4 is enabled on any
124+
network.
125+
126+
## Alternatives Considered
127+
128+
* Transaction data size accounting is already enabled, so the null option is to
129+
enshrine the current Agave behavior in the protocol. This is undesirable because
130+
the current behavior is highly idiosyncratic, and LoaderV3 program sizes are
131+
routinely undercounted.
132+
* Builtin programs are backed by accounts that only contain the program name as
133+
a string, typically making them 15-40 bytes. We could impose a larger fixed cost
134+
for these. However, they must be made available for all programs anyway, and
135+
most of them are likely to be ported to BPF eventually, so this adds complexity
136+
for no real benefit.
137+
* Several slightly different algorithms were considered for handling LoaderV3
138+
programs in particular, for instance only counting programs that are valid for
139+
execution in the current slot. However, this would implicitly couple transaction
140+
data size with the results of ELF validation, which is highly undesirable.
141+
* We considered loading and counting sizes for accounts on the transaction
142+
account list which are not used for any purpose. This is the current behavior,
143+
but there is no reason to load such accounts at all.
144+
145+
## Impact
146+
147+
The primary impact is that this SIMD makes correctly implementing transaction data
148+
size accounting much easier for other validator clients.
149+
150+
It makes the calculated size of transactions which include program accounts for
151+
CPI somewhat larger, but given the generous 64MiB limit, it is unlikely that any
152+
existing users will be affected. Based on an investigation of a 30-day window,
153+
transactions larger than 30MiB are virtually never seen.
154+
155+
## Security Considerations
156+
157+
Security impact is minimal because this SIMD merely simplifies an existing
158+
feature. Care must be taken to implement the rules exactly.
159+
160+
This SIMD requires a feature gate.
161+
162+
## Backwards Compatibility
163+
164+
Transactions that call LoaderV3 programs via CPI and currently have a total
165+
transaction data size close to the 64MiB limit may now exceed it and fail.

proposals/0186-transaction-data-size-specification.md

Lines changed: 0 additions & 152 deletions
This file was deleted.
