| 1 | +--- |
| 2 | +simd: '0166' |
| 3 | +title: Dynamic stack frames in SBF |
| 4 | +authors: |
| 5 | + - Alexander Meißner |
| 6 | + - Alessandro Decina |
| 7 | + - Lucas Steuernagel |
| 8 | +category: Standard |
| 9 | +type: Core |
| 10 | +status: Draft |
| 11 | +created: 2024-08-19T00:00:00.000Z |
| 12 | +feature: null |
| 13 | +supersedes: null |
| 14 | +superseded-by: null |
| 15 | +extends: null |
| 16 | +--- |
| 17 | + |
| 18 | +## Summary |
| 19 | + |
| 20 | +The SVM currently allocates a fixed amount of stack space to each function |
| 21 | +frame. We propose allowing programs to dynamically manage their stack space |
| 22 | +through the introduction of an explicit stack pointer register. |
| 23 | + |
| 24 | +## Motivation |
| 25 | + |
| 26 | +The SVM allocates a fixed amount of memory to hold a program’s stack. Within |
| 27 | +the stack region, the virtual machine reserves 4096 bytes of stack space for |
| 28 | +each function frame. This is simultaneously limiting for functions that |
| 29 | +require more space, and wasteful for functions that require less space. |
| 30 | + |
| 31 | +For well-optimized programs that don’t allocate large amounts of stack, the |
| 32 | +virtual machine currently still reserves 4096 bytes of stack for each |
| 33 | +function call, leading to suboptimal memory usage, which may cause |
| 34 | +unnecessary page faults. |
| 35 | + |
| 36 | +On the other hand, some programs are known to create large function frames - |
| 37 | +this seems common with programs that serialize a lot of data - and they have |
| 38 | +to jump through hoops to avoid overflowing the stack. The virtual machine |
| 39 | +detects when a stack overflow occurs, and it does so by implementing a stack |
| 40 | +frame gaps system whereby it inserts a virtual sentinel frame following a |
| 41 | +valid function frame. If the sentinel frame is accessed, the executing program |
| 42 | +is aborted. This system is fragile and is incompatible with direct mapping - |
| 43 | +a feature we expect to enable soon. |
| 44 | + |
| 45 | +The changes proposed in this document would allow us to optimize stack memory |
| 46 | +usage and remove the fragile stack frame gaps system. Note that we do not |
| 47 | +propose to remove the existing maximum stack space limit: stack space stays |
| 48 | +unchanged, what changes is how it is partitioned internally. |
| 49 | + |
| 50 | +## Detailed Design |
| 51 | + |
| 52 | +Bringing dynamic stack frames to the Solana Bytecode Format and its |
| 53 | +corresponding virtual machine entails changes in several aspects of the |
| 54 | +execution environment. |
| 55 | + |
| 56 | +### SBF architecture modifications |
| 57 | + |
| 58 | + |
| 59 | +We will introduce a new register R11 in the virtual machine, which is going |
| 60 | +to hold the stack pointer. The program can only write to this register, and |
| 61 | +only through the `add64 reg, imm` instruction. The verifier will enforce |
| 62 | +these constraints on deployed programs. For further information about the |
| 63 | +changes in the ISA, refer to the [SBF spec document](https://github.com/solana-labs/rbpf/blob/main/doc/bytecode.md). |
| 64 | + |
| 65 | +The R11 register will work in tandem with the R10 (frame pointer) register. |
| 66 | +The former is write-only to the program, and the latter is read-only to the |
| 67 | +program, forming a common design pattern in hardware engineering. More |
| 68 | +details of this usage are in the following section. |
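
The register constraints above can be illustrated with a small verifier-style check. This is a minimal sketch over a toy three-instruction subset, not the actual verifier: the `Insn` enum, the `VerifyError` variants, and the function name `check_stack_registers` are all hypothetical and exist only for illustration.

```rust
/// Toy instruction subset (hypothetical, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Insn {
    /// `add64 dst, imm`
    Add64Imm { dst: u8, imm: i32 },
    /// `mov64 dst, src`
    Mov64Reg { dst: u8, src: u8 },
    /// Store of `src` to memory addressed relative to `base`.
    StoreDw { base: u8, src: u8 },
}

pub const FRAME_PTR: u8 = 10; // R10: read-only to the program
pub const STACK_PTR: u8 = 11; // R11: write-only, and only via add64 reg, imm

/// Hypothetical error names; the real verifier may report these differently.
#[derive(Debug, PartialEq)]
pub enum VerifyError {
    WriteToFramePointer,
    InvalidStackPointerWrite,
    ReadOfStackPointer,
}

/// Rejects instructions that write R10, write R11 with anything other than
/// `add64 reg, imm`, or read R11.
pub fn check_stack_registers(insn: &Insn) -> Result<(), VerifyError> {
    match *insn {
        Insn::Add64Imm { dst, .. } => {
            if dst == FRAME_PTR {
                Err(VerifyError::WriteToFramePointer)
            } else {
                // Covers general-purpose registers and the allowed R11 update.
                Ok(())
            }
        }
        Insn::Mov64Reg { dst, src } => {
            if dst == FRAME_PTR {
                Err(VerifyError::WriteToFramePointer)
            } else if dst == STACK_PTR {
                Err(VerifyError::InvalidStackPointerWrite)
            } else if src == STACK_PTR {
                Err(VerifyError::ReadOfStackPointer)
            } else {
                Ok(())
            }
        }
        Insn::StoreDw { base, src } => {
            // Both operands are read; R10 is a valid base, R11 is not readable.
            if base == STACK_PTR || src == STACK_PTR {
                Err(VerifyError::ReadOfStackPointer)
            } else {
                Ok(())
            }
        }
    }
}
```

In this sketch, `add64 r11, -64` passes, while `mov64 r11, r1` and `mov64 r1, r11` are both rejected.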
| 69 | + |
| 70 | +### Changes in the execution environment |
| 71 | + |
| 72 | +The R10 register will continue to hold the frame pointer, but we will manage |
| 73 | +it differently. With fixed frames, when there is a function call we add 4096 |
| 74 | +to R10 and subtract it when the function returns. In the new scheme, we will |
| 75 | +assign the value of R11 to R10 at function calls, and save R10’s former value |
| 76 | +so that we can restore it when the function returns. |
| 77 | + |
| 78 | +The introduction of dynamic stack frames will change the direction of stack |
| 79 | +growth. Presently, we stack frames on top of each other, but the memory usage |
| 80 | +in them grows downward. In the new frame setting, both the placement of new |
| 81 | +frames and the memory usage inside frames will be downward. |
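
The call and return handling described above can be sketched as a tiny register model. This is an illustrative sketch only: the `Vm` struct, its method names, and the assumption that R10 and R11 both start at the top of the stack region are not taken from the proposal.

```rust
/// Hypothetical model of the two stack registers and the saved frame pointers.
pub struct Vm {
    pub r10: u64, // frame pointer, read-only to the program
    pub r11: u64, // stack pointer, adjusted by the program via add64
    saved_frame_pointers: Vec<u64>,
}

impl Vm {
    /// Assumption: both registers start at the top of the stack region.
    pub fn new(stack_top: u64) -> Self {
        Vm { r10: stack_top, r11: stack_top, saved_frame_pointers: Vec::new() }
    }

    /// Models `add64 r11, imm`. A negative immediate reserves space because
    /// the stack grows downward; a positive one releases it.
    pub fn add64_r11(&mut self, imm: i64) {
        self.r11 = self.r11.wrapping_add(imm as u64);
    }

    /// On a call, save the caller's R10 and start the callee's frame at R11.
    pub fn call(&mut self) {
        self.saved_frame_pointers.push(self.r10);
        self.r10 = self.r11;
    }

    /// On return, restore the caller's R10. R11 is left alone here because
    /// the callee is expected to have released its own space beforehand.
    pub fn ret(&mut self) {
        if let Some(frame_pointer) = self.saved_frame_pointers.pop() {
            self.r10 = frame_pointer;
        }
    }
}
```

For example, if a caller reserves 64 bytes and then calls a function, the callee's R10 equals the caller's R11, so the new frame sits directly below the caller's data with no fixed 4096-byte stride.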
| 82 | + |
| 83 | +The stack frame gaps feature, which creates a memory layout where frames are |
| 84 | +interleaved with equally sized gaps, is not compatible with dynamic stack |
| 85 | +frames and will be deactivated. |
| 86 | + |
| 87 | +### Changes in code generation |
| 88 | + |
| 89 | +On the compiler side, dynamic stack frames allow for some optimizations. |
| 90 | +First, when a function does not need any stack-allocated variable, code |
| 91 | +generation will not emit any instruction to modify R11. In addition, we |
| 92 | +can stop using R5 as a stack spill register when a function call receives |
| 93 | +more than five arguments. With dynamic stack frames, the compiler will use |
| 94 | +registers R1 to R5 for the first five arguments and place the remaining arguments |
| 95 | +in the callee’s frame, instead of placing them in the caller’s frame. This new |
| 96 | +calling convention obviates the need to use R5 for retrieving the caller’s frame |
| 97 | +pointer address to access those parameters. |
| 98 | + |
| 99 | +### Identification of programs |
| 100 | + |
| 101 | +As described in SIMD-0161, programs compiled with dynamic stack |
| 102 | +frames will contain the XX flag in the `e_flags` field of their ELF header. |
| 103 | + |
| 104 | +## Impact |
| 105 | + |
| 106 | +We foresee a positive impact on smart contract development. Developers won’t |
| 107 | +need to worry about exceeding the maximum frame space allowed for a function |
| 108 | +and won’t face any case of stack access violation if their code follows |
| 109 | +conventional Rust safety rules. Likewise, when we update the Rust version of |
| 110 | +our platform tools, developers will not have the burden of modifying their |
| 111 | +contract just because the newer version is using more stack than the previous |
| 112 | +one, often reaching the 4096-byte limit. Refer to issues |
| 113 | +[#1186](https://github.com/anza-xyz/agave/issues/1186) and |
| 114 | +[#1158](https://github.com/anza-xyz/agave/issues/1158). |
| 115 | + |
| 116 | +We also expect some improvements in program execution. For functions with no |
| 117 | +stack usage, we will not emit the additional instruction that modifies R11, |
| 118 | +saving some execution time. Furthermore, for function calls that handle more |
| 119 | +than five arguments, there will be one fewer store and one fewer load operation |
| 120 | +due to the new calling convention. |
| 121 | + |
| 122 | +## Security Considerations |
| 123 | + |
| 124 | +Stack gaps will be disabled for dynamic stack frames to work. Stack gaps could |
| 125 | +detect invalid accesses between two function frames, if the accessed address |
| 126 | +fell between them. With dynamic stack frames, all stack accesses will be |
| 127 | +valid, provided that their addresses are within the allowed range. We already |
| 128 | +allow functions to read and modify the memory inside the frames of other |
| 129 | +functions, so removing the stack gaps should not have any security |
| 130 | +implications. |
| 131 | + |
| 132 | +Although one can change R11 to any value that fits in a 64-bit integer, every |
| 133 | +memory access is verified, so there is no risk of invalid accesses from a |
| 134 | +corrupt register. |
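
To illustrate why an arbitrary R11 value is harmless, the per-access check can be sketched as an overflow-safe range test. This is a simplified stand-in for the virtual machine's memory mapping, not its actual API: the function name `stack_access_allowed` and its parameters are hypothetical.

```rust
/// Returns true only if the `len` bytes starting at `addr` lie entirely
/// inside the stack region `[stack_start, stack_start + stack_len)`.
/// Written with subtractions so that no intermediate value can overflow,
/// even when `addr` comes from a wrapped or otherwise corrupt register.
pub fn stack_access_allowed(addr: u64, len: u64, stack_start: u64, stack_len: u64) -> bool {
    if addr < stack_start {
        return false;
    }
    let offset = addr - stack_start;
    offset <= stack_len && len <= stack_len - offset
}
```

With a region starting at `0x1000` and `0x400` bytes long, an 8-byte access at `0x13f8` is accepted, while the same access at `0x13f9` or at an address near `u64::MAX` is rejected.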
| 135 | + |
| 136 | +## Drawbacks |
| 137 | + |
| 138 | +Programs will consume more compute units, as most functions will include two |
| 139 | +extra instructions: one to increment the stack pointer and another one to |
| 140 | +decrement it. |
| 141 | + |
| 142 | +## Alternatives Considered |
| 143 | + |
| 144 | +To cope with the SBF limitation of 4096 bytes for the frame size, we could |
| 145 | +have increased that number. Even though this would solve the original |
| 146 | +problem, it would supply an unnecessary amount of memory to functions even |
| 147 | +when they do not need it. In addition, such a solution would increase |
| 148 | +pressure on the total memory available for the call stack. Either we would |
| 149 | +need to increase the total allocation for the virtual machine or decrease the |
| 150 | +maximum call depth. |
| 151 | + |
| 152 | +## New Terminology |
| 153 | + |
| 154 | +None. |