Skip to content

SIMD-0219: Stricter ABI and Runtime Constraints #219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

Lichtso
Copy link
Contributor

@Lichtso Lichtso commented Jan 6, 2025

No description provided.

abrahem79

This comment was marked as spam.

@Lichtso Lichtso force-pushed the stricter-vm-verification-constraints branch from 9c1578c to eaf4fe6 Compare May 29, 2025 14:12
@Lichtso Lichtso force-pushed the stricter-vm-verification-constraints branch from eaf4fe6 to 1f3dda9 Compare June 6, 2025 15:37
Copy link
Contributor

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I think the fixes for the frame gaps and the memory accesses across regions make sense. However, I pushed back on the stack/heap pointers for CPI verification.

Also, I'm wondering if it makes sense to break this into three SIMDs? It seems like we could do each in isolation, which might help speed up the approval & implementation process.

Comment on lines 64 to 78
- The following pointers must be on the stack or heap,
meaning their virtual address is inside `0x200000000..0x400000000`,
otherwise `SyscallError::InvalidPointer` must be thrown:
- The pointer in the array of `&[AccountInfo]` / `SolAccountInfo*`
- The `AccountInfo::data` field,
which is a `RefCell<&[u8]>` in `sol_invoke_signed_rust`
- The `AccountInfo::lamports` field,
which is a `RefCell<&u64>` in `sol_invoke_signed_rust`
- The following pointers must point to what was originally serialized in the
input regions by the program runtime,
otherwise `SyscallError::InvalidPointer` must be thrown:
- `AccountInfo::key` / `SolAccountInfo::key`
- `AccountInfo::owner` / `SolAccountInfo::owner`
- `AccountInfo::lamports` / `SolAccountInfo::lamports`
- `AccountInfo::data::ptr` / `SolAccountInfo::data`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, this constraint forces programs to waste limited stack/heap space just to hold these pointer structures, when we can probably validate these in a different way, and keep them in the input region.

On Agave, when we serialize all of the accounts into the input region, we actually hang onto a bunch of the pointers in SyscallContext/SerializedAccountMetadata. We can just store more pointer information in the syscall context, and perform a very quick pointer analysis:

  • Before each CPI frame is created
  • When the program execution finishes

We can catch the violations then, and throw the SyscallError::InvalidPointer. This way, instead of imposing location constraints, we simply validate that all AccountInfo pointers match their original pointers when the VM was created.

As mentioned in the proposal, when direct mapping is enabled, these kinds of pointer violations will throw immediately, which means pushing the pointers to the stack/heap would only be necessary as a temporary measure, until the DM feature is enabled. Afterwards, programs are stuck with this constraint for no reason.

Copy link
Contributor Author

@Lichtso Lichtso Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are misunderstanding. This SIMD says that the AccountInfo structs need to be on stack or heap, not that the data they point to has to be there. The data they point to remains in the account serialization. This is compatible with the SDK entrypoint as is, and has to be so that we don't break existing programs.

Think part of the confusion comes from the fact that AccountInfo::data and AccountInfo::lamports have two levels of indirection and both outer and inner pointer need to not be in the account serialization (the final deref however must be), what no program ever does anyway.

We can just store more pointer information in the syscall context, and perform a very quick pointer analysis

Yes exactly, that is how direct mapping is implemented and what the SIMD is ought to describe. Seems I should reformulate it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are misunderstanding. This SIMD says that the AccountInfo structs need to be on stack or heap, not that the data they point to has to be there.

I never considered this SIMD to be implying we should move any serialized data to the stack or heap. I believe I'm understanding that part correctly.

What I'm maybe misunderstanding is what the goal is by limiting where AccountInfo pointers can be created. The sol_invoke_signed syscall accepts only pointers.

fn sol_invoke_signed_c(
    instruction_addr: *const u8,
    account_infos_addr: *const u8, // <-- Pointer to slice of `SolAccountInfo`
    account_infos_len: u64,
    signers_seeds_addr: *const u8,
    signers_seeds_len: u64,
) -> u64

struct SolAccountInfo {
    key_addr: u64,
    lamports_addr: u64,
    data_len: u64,
    data_addr: u64,
    owner_addr: u64,
    rent_epoch: u64,
    is_signer: bool,
    is_writable: bool,
    executable: bool,
}

Since translation of accounts just deref's out of the RefCell trackers, we can consider the two translations (Rust and C) identical for serialized accounts.

Enforcing that each account pointer lives on stack or heap doesn't seem to actually solve the problem, which is the ability to pass a pointer to an invalid input-region SolAccountInfo into CPI. Furthermore, there are multiple perfectly valid reasons you'd pass a pointer to already-serialized legitimate accounts, such as avoiding copies.

My initial point of just evaluating all pointers against the SerializedAccountMetadata would solve this problem at the fundamental level. It can happen during translate_accounts or translate_account_infos, where you've already got the stack/heap check implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial point of just evaluating all pointers against the SerializedAccountMetadata would solve this problem at the fundamental level.

That is what the SIMD already proposes anyway: "The following pointers must point to what was originally serialized in the input regions by the program runtime" which refers to the SerializedAccountMetadata.

But, that alone is insufficient, CPI also writes to AccountInfo as it returns, it should never be writing to an account during the returning (returning takes multiple steps after all) as that can result in iterate-while-modifying issues and violates Rust borrow checker rules.

@Lichtso
Copy link
Contributor Author

Lichtso commented Jun 24, 2025

Also, I'm wondering if it makes sense to break this into three SIMDs? It seems like we could do each in isolation, which might help speed up the approval & implementation process.

This entire SIMD is describing the account data direct mapping feature, which is already implemented, just not feature gated. I know from a dApp perspective they look like a bunch thrown together, but in the program runtime they are all interconnected. So I don't think that splitting them would speed things up, but rather make it much more complex.

@buffalojoec
Copy link
Contributor

I know from a dApp perspective they look like a bunch thrown together, but in the program runtime they are all interconnected.

Actually I am talking about the program runtime!

Right now, the plan is to just switch on direct mapping all at once with a feature gate, which will enable all of these constraints - as well as the implementation itself - immediately. That's a lot of change area in one feature gate, and I think some contributors are getting a bit nervous about the "all at once" approach.

What I'm instead suggesting is that we introduce these constraints piecemeal. You can break VM constraints into three separate SIMDs, with three separate feature gates, and activate them one by one:

  1. Frame gaps
  2. Memory access violations
  3. AccountInfo pointer regions

This approach would allow us to more easily address any security issues that might arise from just one "phase" of direct mapping constraints.

Later, once all three are activated, you can make direct mapping a validator startup flag for a while, before we just remove the flag altogether and make it the de facto hot path.

agave validator <...> --direct-memory-mapping

Overall I think this seems like a much safer approach to getting this all in. What do you think?

@Lichtso
Copy link
Contributor Author

Lichtso commented Jul 8, 2025

I think the change to the stack frame gaps is risky and not necessary anymore for the current implementation of direct mapping so we could revert it.

The restrictions to the AccountInfo pointer regions could be done first in a separate SIMD, but we know that it affects next to no programs, and it seems to have little risk associated.

Finally, the tough nut is the memory access violations. These are interwoven with the implementation switching to direct mapping. It is hard to correctly emulate these restrictions without implementing direct mapping. Decoupling them into a feature gate of pure restrictions which still uses the copy based serialization path is complex.

@Lichtso Lichtso changed the title SIMD-0219: Stricter VM verification constraints SIMD-0219: Stricter VM constraints Jul 17, 2025
@Lichtso Lichtso force-pushed the stricter-vm-verification-constraints branch from 40f9989 to 56f2cc5 Compare July 18, 2025 19:33
@Lichtso Lichtso changed the title SIMD-0219: Stricter VM constraints SIMD-0219: Stricter ABI and Runtime Constraints Jul 18, 2025
@Lichtso Lichtso force-pushed the stricter-vm-verification-constraints branch from 56f2cc5 to 25933db Compare July 18, 2025 19:39
- The access is completely within the rest of the account growth budget of the
transaction, otherwise `InstructionError::InvalidRealloc` must be thrown.
- The access is completely within the current length of the account,
otherwise extend the the account with zeros to the maximum allowed by the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
otherwise extend the the account with zeros to the maximum allowed by the
otherwise extend the account with zeros to the maximum allowed by the

@Lichtso Lichtso force-pushed the stricter-vm-verification-constraints branch from 25933db to 2c6fc74 Compare August 1, 2025 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants