| 1 | +--- |
| 2 | +simd: '0186' |
| 3 | +title: Transaction Data Size Specification |
| 4 | +authors: |
| 5 | + - Hanako Mumei |
| 6 | +category: Standard |
| 7 | +type: Core |
| 8 | +status: Review |
| 9 | +created: 2024-10-20 |
| 10 | +feature: (fill in with feature tracking issues once accepted) |
| 11 | +--- |
| 12 | + |
| 13 | +## Summary |
| 14 | + |
| 15 | +Before a transaction can be executed, every account it may read from or write to |
| 16 | +must be loaded, including any programs it may call. The amount of data a |
| 17 | +transaction is allowed to load is capped, and if it exceeds that limit, loading |
| 18 | +is aborted. This functionality is already implemented in the validator. The |
| 19 | +purpose of this SIMD is to explicitly define how transaction size is calculated. |
| 20 | + |
| 21 | +## Motivation |
| 22 | + |
| 23 | +Transaction data size accounting is currently unspecified, and the |
| 24 | +implementation-defined algorithm used in the Agave client exhibits some |
| 25 | +surprising behaviors: |
| 26 | + |
| 27 | +* BPF loaders required by top-level programs are counted against transaction |
| 28 | +data size. BPF loaders required by CPI programs are not. If a required BPF |
| 29 | +loader is also included in the accounts list, it is counted twice. |
| 30 | +* The size of a program owned by LoaderV3 may or may not include the size of its |
| 31 | +programdata depending on how the program account is used in the transaction. |
| 32 | +Programdata is also itself counted if included in the transaction accounts list. |
| 33 | +This means programdata may be counted zero, one, or two times per transaction. |
| 34 | + |
| 35 | +All validator clients must arrive at precisely the same transaction data size |
| 36 | +for all transactions because a difference of one byte can determine whether a |
| 37 | +transaction is executed or failed, and thus affects consensus. Also, we want the |
| 38 | +calculated transaction data size to correspond well with the actual amount of |
| 39 | +data the transaction requests. |
| 40 | + |
| 41 | +Therefore, this SIMD seeks to specify an algorithm that is straightforward to |
| 42 | +implement in a client-agnostic way, while also accurately accounting for the |
| 43 | +total data required by the transaction. |
| 44 | + |
| 45 | +## New Terminology |
| 46 | + |
| 47 | +One term is defined within the scope of this SIMD: |
| 48 | + |
| 49 | +* Valid program: a program that has been loaded, or a builtin. This definition |
| 50 | +excludes programs that have failed verification, or LoaderV3 programs that have |
| 51 | +been closed or have delayed visibility due to being deployed or modified in the |
| 52 | +current slot. |
| 53 | + |
| 54 | +These terms are not new, but we define them here for clarity: |
| 55 | + |
| 56 | +* Top-level program: the program corresponding to the program id on a given |
| 57 | +instruction. |
| 58 | +* Instruction account: an account passed to an instruction, which allows its |
| 59 | +program to view the actual bytes of the account. CPI can only happen through |
| 60 | +programs provided as instruction accounts. |
| 61 | +* Transaction accounts list: all accounts for the transaction, which includes |
| 62 | +top-level programs, the fee-payer, all instruction accounts, and any extra |
| 63 | +accounts added to the list but not used for any purpose. |
| 64 | + |
| 65 | +## Detailed Design |
| 66 | + |
| 67 | +The proposed algorithm is as follows: |
| 68 | + |
| 69 | +1. Every account explicitly included on the transaction accounts list is counted |
| 70 | +once and only once. |
| 71 | +2. A valid program owned by LoaderV3 also includes the size of its programdata. |
| 72 | +3. Other than point 2, no accounts are implicitly added to the total data size. |
| 73 | + |
| 74 | +Transactions may include a |
| 75 | +`ComputeBudgetInstruction::SetLoadedAccountsDataSizeLimit` instruction to define |
| 76 | +a data size limit for the transaction. Otherwise, the default limit is 64MiB |
| 77 | +(`64 * 1024 * 1024` bytes). |
| 78 | + |
| 79 | +If a transaction exceeds its data size limit, account loading is aborted and the |
| 80 | +transaction is failed. Fees will be charged once |
| 81 | +`enable_transaction_loading_failure_fees` is enabled. |
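
The three rules and the limit check above can be sketched roughly as follows. This is a minimal illustration, not the Agave implementation: `ListedAccount`, `transaction_data_size`, and `check_data_size` are hypothetical names, and the sketch assumes the caller has already determined whether each account is a valid LoaderV3 program and, if so, how large its programdata is.

```rust
/// Default limit stated in the text: 64 MiB.
const DEFAULT_DATA_SIZE_LIMIT: usize = 64 * 1024 * 1024;

/// Hypothetical, simplified view of one entry on the transaction accounts list.
struct ListedAccount {
    /// Size in bytes of the account's own data.
    data_len: usize,
    /// Size of the associated programdata account. Assumed to be `Some` only
    /// when this account is a valid program owned by LoaderV3.
    valid_loader_v3_programdata_len: Option<usize>,
}

/// Sums the data size of a transaction. Assumes `accounts` holds each listed
/// account exactly once.
fn transaction_data_size(accounts: &[ListedAccount]) -> usize {
    accounts.iter().fold(0usize, |total, account| {
        // Rule 1: the listed account itself is counted once.
        let own = account.data_len;
        // Rule 2: a valid LoaderV3 program also counts its programdata.
        let extra = account.valid_loader_v3_programdata_len.unwrap_or(0);
        // Rule 3: nothing else is added implicitly.
        total.saturating_add(own).saturating_add(extra)
    })
}

/// Returns `Ok(size)` when the transaction fits within its limit and
/// `Err(size)` when it exceeds it. `requested_limit` stands in for a value
/// set via `SetLoadedAccountsDataSizeLimit`, if any.
fn check_data_size(
    accounts: &[ListedAccount],
    requested_limit: Option<usize>,
) -> Result<usize, usize> {
    let limit = requested_limit.unwrap_or(DEFAULT_DATA_SIZE_LIMIT);
    let size = transaction_data_size(accounts);
    if size > limit {
        Err(size)
    } else {
        Ok(size)
    }
}
```

In this sketch a transaction whose size equals the limit exactly is accepted, matching the wording "exceeds its data size limit".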
| 82 | + |
| 83 | +Adding required loaders to transaction data size is abolished. They are treated |
| 84 | +the same as any other account: counted if on the transaction accounts list, not |
| 85 | +counted otherwise. |
| 86 | + |
| 87 | +Read-only and writable accounts are treated the same. In the future, when direct |
| 88 | +mapping is enabled, this SIMD may be amended to count them differently. |
| 89 | + |
| 90 | +As a consequence of 1 and 2, for LoaderV3 programs, programdata is counted twice |
| 91 | +if a transaction includes both programdata and the program account itself in the |
| 92 | +accounts list, unless the program is not valid for execution. This is partly |
| 93 | +done for ease of implementation: we always want to count programdata when the |
| 94 | +program is included, and there is no reason for any transaction to include both |
| 95 | +accounts except during initial deployment, in which case the program is not yet |
| 96 | +valid. |
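
To make the double-counting case concrete, here is a small sketch of how a single LoaderV3 program and its programdata would contribute under rules 1 and 2. The function name, its parameters, and the byte sizes in the usage note are illustrative assumptions, not values or APIs taken from any client.

```rust
/// Hypothetical helper: bytes contributed to transaction data size by one
/// LoaderV3 program account and its programdata account.
fn loader_v3_contribution(
    program_len: usize,
    programdata_len: usize,
    program_listed: bool,
    programdata_listed: bool,
    program_valid: bool,
) -> usize {
    let mut total = 0;
    if program_listed {
        // Rule 1: the program account is on the accounts list.
        total += program_len;
        // Rule 2: a valid program also counts its programdata.
        if program_valid {
            total += programdata_len;
        }
    }
    if programdata_listed {
        // Rule 1 again: programdata is counted as an ordinary listed account.
        total += programdata_len;
    }
    total
}
```

With an assumed 36-byte program account and 1000 bytes of programdata, listing only the valid program yields 1036 bytes, listing both yields 2036 bytes (programdata counted twice), and listing both while the program is not yet valid, as during initial deployment, yields 1036 bytes.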
| 97 | + |
| 98 | +We include programdata size in program size for LoaderV3 programs because in |
| 99 | +nearly all cases a transaction will include the program account (the only way to |
| 100 | +invoke the program) and will not include the programdata account because |
| 101 | +including it serves no purpose. Including the program account forces an |
| 102 | +unconditional load of the programdata account because it is required to compile |
| 103 | +the program for execution. Therefore we always count it, even when the program |
| 104 | +is an instruction account. |
| 105 | + |
| 106 | +There is no special handling for programs owned by the native loader, LoaderV1, |
| 107 | +or LoaderV2. |
| 108 | + |
| 109 | +Account size for programs owned by LoaderV4 is left undefined. This SIMD should |
| 110 | +be amended to define the required semantics before LoaderV4 is enabled on any |
| 111 | +network. |
| 112 | + |
| 113 | +## Alternatives Considered |
| 114 | + |
| 115 | +* Transaction data size accounting is already enabled, so the null option is to |
| 116 | +enshrine the current Agave behavior in the protocol. This is undesirable because |
| 117 | +the current behavior is highly idiosyncratic, and LoaderV3 program sizes are |
| 118 | +routinely undercounted. |
| 119 | +* Builtin programs are backed by accounts that only contain the program name as |
| 120 | +a string, typically making them 15-40 bytes. We could make them free when not |
| 121 | +instruction accounts, since they're part of the validator. However, this |
| 122 | +adds complexity for no real benefit. |
| 123 | + |
| 124 | +## Impact |
| 125 | + |
| 126 | +The primary impact is that this SIMD makes correctly implementing transaction data |
| 127 | +size accounting much easier for other validator clients. |
| 128 | + |
| 129 | +It makes transactions which include program accounts for CPI somewhat larger, |
| 130 | +but given the generous 64MiB limit, it is unlikely that any existing users will |
| 131 | +be affected. |
| 132 | + |
| 133 | +## Security Considerations |
| 134 | + |
| 135 | +Security impact is minimal because this SIMD merely simplifies an existing |
| 136 | +feature. |
| 137 | + |
| 138 | +This SIMD requires a feature gate. |
| 139 | + |
| 140 | +## Backwards Compatibility |
| 141 | + |
| 142 | +Transactions that call LoaderV3 programs via CPI and are extremely close to the |
| 143 | +64MiB limit may now exceed it. |