Skip to content

remove stake program builtin #7203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

2501babe
Copy link
Member

@2501babe 2501babe commented Jul 28, 2025

Problem

the bpf stake program is now active on all clusters and can be removed from builtins

Summary of Changes

remove it from BUILTINS, add the bpf stake program 1.0.0 binary blob to CORE_BPF_PROGRAMS, and fix various tests and test setups to deal with this:

  • solana-test-validator and solana-program-test now ship with the core bpf stake program as a drop-in replacement
  • runtime, ledger, and rpc tests are changed in ad hoc ways to no longer expect the stake program to exist; in general they used it for convenience and were not testing actual stake program functionality
  • one partitioned epoch rewards test which tested stake program non-operation during the rewards period is superceded by a new, comprehensive test in the stake program repo
  • the solana-stake-accounts cli adds the stake program to its bank for its tests, which are not separable into the stake repo
  • delete programs/stake-tests/. these were tests for MoveStake and MoveLamports which have long since been duplicated in the bpf stake repo

this pr is already quite difficult to review. if everyone is happy with this idea, id like to merge this as a standalone pr without removing the stake program code, and do that in a followup pr. removing the stake program code is nontrivial because it contains a bunch of utility functions which are used in tests, during genesis, and possibly elsewhere. it also contains some of the rewards-related code still running in agave. and we may want to reexport various things to not break downstream too badly

these changes will have to be debated (particularly breaking the interface, whether we move things into new packages, etc) and itd be better to look at them in isolation

@2501babe 2501babe self-assigned this Jul 28, 2025
@2501babe 2501babe force-pushed the 20250728_stakebpf branch 3 times, most recently from a6cde57 to f502627 Compare July 29, 2025 13:00
@2501babe
Copy link
Member Author

2501babe commented Jul 29, 2025

playing whackamole with what i am supposed to change now that the stake program is migrated...

  • builtins-default-costs/src/lib.rs:
    • decrement TOTAL_COUNT_BUILTINS
    • delete stake program from MIGRATING_BUILTINS_COSTS
  • builtins/src/lib.rs:
    • delete prototype from BUILTINS
    • delete pubkey from buffer_accounts
  • program-test/src/programs.rs: add binary blob to CORE_BPF_PROGRAMS
  • runtime/src/genesis_utils.rs: decrement NUM_BUILTIN_PROGRAMS
  • test-validator/src/lib.rs: dont deactivate migrate_stake_program_to_core_bpf

plus various test changes already committed...

still todo:

  • cli/tests/stake.rs: delegate fails because we need some slack in compute usage, it uses more than simulate because votes get added to vote state
  • runtime/tests/stake.rs: bank doesnt have stake program, add it or move these tests to the stake repo... it uses some bank functionality explicitly, not sure how much it matters
  • ledger/src/staking_utils.rs: just creates and delegates a stake account, we can mock this
  • runtime/src/bank/partitioned_epoch_rewards/mod.rs: test_program_execution_restricted_for_stake_account_in_reward_period explicitly tests the stake program blocks operation in rewards period, i dont think we can do away with it
  • LocalCluster: there was a test if stake is available at genesis (its not) which i removed but unclear if there are tests that actually use stake here, local cluster tests just fail a ton on this branch

more todo:

  • rpc_pubsub::tests::test_account_subscribe' panicked at rpc/src/rpc_pubsub.rs:934
  • bank::tests::test_rent_state_changes_sysvars' panicked at runtime/src/bank/tests.rs:9794
  • bank::tests::test_shrink_candidate_slots_cached' panicked at runtime/src/bank/tests.rs:5408
  • bank::tests::test_bank_cloned_stake_delegations' panicked at runtime/src/bank/tests.rs:3151
  • bank::tests::test_bank_hash_consistency' panicked at runtime/src/bank/tests.rs:5174
  • bank::tests::test_vote_epoch_panic' panicked at runtime/src/bank/tests.rs:8207
  • stake-accounts/src/stake_accounts.rs

still no clear reason for the local cluster failures, the logs all just say timeout

the most perplexing thing to fix in this pr was that no LocalCluster tests failed, they just all timed out. this was because they have a setup function to create a stake account via the stake program (which did not exist), but they "succeed" as long as the transaction is processed, then infinitely loop waiting to confirm the account has a nonzero balance

note we had test failures in cli/tests/stake.rs from the fact that compute cost for DelegateStake has some jitter, but we did not need to edit this file at all in this pr because these pass after #7447

Copy link

mergify bot commented Aug 7, 2025

If this PR represents a change to the public RPC API:

  1. Make sure it includes a complementary update to rpc-client/ (example)
  2. Open a follow-up PR to update the JavaScript client @solana/kit (example)

Thank you for keeping the RPC clients in sync with the server API @2501babe.

@2501babe 2501babe force-pushed the 20250728_stakebpf branch 3 times, most recently from 16e51ec to d09ad5a Compare August 11, 2025 10:45
@2501babe 2501babe force-pushed the 20250728_stakebpf branch 3 times, most recently from 9504c99 to 246f038 Compare August 12, 2025 13:46
@2501babe 2501babe changed the title program-test: add bpf stake program 1.0.0 remove stake program builtin Aug 12, 2025
Comment on lines -515 to -516
#[test]
fn test_builtin_program_migration() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is kind of a shame to delete this whole test but i dont see how to realistically keep is since we no longer have any migrating programs. maybe make a mental note of the fact that we removed it, so we can add it back when we have a new migration

Comment on lines +64 to +68
(
solana_sdk_ids::stake::ID,
None,
include_bytes!("programs/core_bpf_stake-1.0.0.so"),
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw im very pleased to report that all the pain of this pr was in dealing with non-ProgramTest setups that are suddenly missing the stake program. zero issues with any of the ProgramTest consumers that now have a different stake program

Comment on lines -878 to +877
let new_stake_authority = solana_pubkey::new_rand();
let stake_authority = Keypair::new();
let validator = Keypair::new();
let voter = Keypair::new();
Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test has a complicated diff but the tldr is it created a stake account and wanted to see pubsub updates when it was reauthorized, i just reauthorize the vote account instead for the same effect

Comment on lines +860 to +862
/// Test that lamports can be sent to stake accounts regardless of rewards period.
#[test]
fn test_program_execution_restricted_for_stake_account_in_reward_period() {
use solana_transaction_error::TransactionError::InstructionError;

fn test_rewards_period_system_transfer() {
Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic stripped out here is replaced by solana-program/stake#78, which tests all stake program instructions (except deactivate deliquent rip) instead of just withdraw

Comment on lines +3141 to +3158
// insert a stake account delegated to it, since we do not have a stake program
let stake_pubkey = Pubkey::new_unique();
let stake_account = stake_state::create_account_with_activation_epoch(
&stake_pubkey,
&vote_keypair.pubkey(),
&bank.get_account(&vote_keypair.pubkey()).unwrap(),
&Rent::default(),
stake_balance,
bank.epoch(),
);

bank.store_account(&stake_pubkey, &stake_account);

// the stakes cache is updated by any successful transaction that touches a stake account
let transaction =
system_transaction::transfer(&mint_keypair, &stake_pubkey, 1, bank.last_blockhash());

bank.process_transaction(&transaction).unwrap();
Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highlighting this in case it is contentious. this is the only runtime test that uses a stake program interaction for something "real." but i believe my change is just as good in accomplishing the intent of the test

Comment on lines 5181 to 5186
assert_eq!(bank.epoch(), 0);
assert_eq!(
bank.hash().to_string(),
"AyXhbqmPsC46x7MHAuW89pQcNZVrUZnAND6ABWJ24svx",
"EzyLJJki4ALhQAq5wbmiNctDhytQckGJRXnk9APKXv7r",
);
}
Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i joked i feel like claude code just replacing the failing static value with the actual one, but per joe this is correct. the bank hash is different because we no longer have the builtin stake program account. but there is no consensus divergence because this is already the case on live clusters

Comment on lines +302 to +311
let core_programs = core_bpf_programs(&genesis_config.rent, |_| true);
let stake_idx = core_programs
.iter()
.position(|(key, _)| *key == stake::program::id())
.unwrap();
let program = core_programs[stake_idx].1.clone();
let (programdata_address, programdata) = core_programs[stake_idx + 1].clone();

genesis_config.add_account(stake::program::id(), program);
genesis_config.add_account(programdata_address, programdata);
Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the same as what i originally did in LocalCluster (but changed it to load all core programs). but unlike there, just throwing the program in here is totally unobjectionable, because its just testing a stake cli

fwiw these are the only two places i needed the core program binary blobs outside ProgramTest, so i didnt consider it important to break out the core bpf programs into a new package. in fact i think its better they live there: circular dependencies create a barrier to dragging these magic elves (lol) all over the code base where they dont belong

Comment on lines +255 to +260
for core_program_account in &core_bpf_programs(&Rent::default(), |_| true) {
config
.additional_accounts
.push(core_program_account.clone());
}

Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im mildly concerned this is a hack, i dont know what mechanism a real cluster genesis uses to make sure it gets the core bpf programs, and if it is viable for LocalCluster to use it as well. i guess my view is "if we have some kind of test coverage of the multinode setup or similar, that ensures fresh clusters actually work, what we do in LocalCluster doesnt really matter. if LocalCluster is the only test surface we have over setting up clusters, probably we should use the formal mechanism that real clusters use to get core bpf programs"

also i assume that because it goes through the same kind of GenesisConfig setup as bank tests, LocalCluster is implicitly FeatureSet::all_enabled(), but i didnt find where this actually happens

@codecov-commenter
Copy link

codecov-commenter commented Aug 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (master@453a73f). Learn more about missing BASE report.
⚠️ Report is 7 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #7203   +/-   ##
=========================================
  Coverage          ?    83.3%           
=========================================
  Files             ?      801           
  Lines             ?   362669           
  Branches          ?        0           
=========================================
  Hits              ?   302155           
  Misses            ?    60514           
  Partials          ?        0           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@2501babe 2501babe marked this pull request as ready for review August 12, 2025 17:15
@2501babe 2501babe requested a review from a team as a code owner August 12, 2025 17:15
@2501babe 2501babe requested a review from buffalojoec August 12, 2025 17:15
@2501babe 2501babe requested a review from joncinque August 12, 2025 17:15
@2501babe
Copy link
Member Author

2501babe commented Aug 12, 2025

added you both as reviewers, you dont have to both review everything i was thinking @buffalojoec could sign off that i got the changes to all the hardcoded builtins/core programs/builtin costs/etc stuff right, and @joncinque could look at the stake-related test changes

i decided to do the stake program removal in phases, for ease of review. this pr removes it from the builtins list, adds it to program-test, and fixes all the breakage that results from that. the next pr would delete as much of the stake program code as possible

i wanted to do that as its own thing since we cant just delete it all; there is a bunch of stuff in the directory used in tests, used for genesis, used for rewards calculation etc. we may also want to reexport things from the stake program interface crate to reduce downstream breakage, make new packages for things, etc. so since there will be some debate and probably a few iterations, better to do that standalone

i annotated the parts of the pr that may be particularly interesting, most of the diff is whole-file deletions

edit: i have to rebase again to catch the fix in master for a security issue dependency update, but this already passed ci

position: 0,
}),
)];
pub const MIGRATING_BUILTINS_COSTS: &[(Pubkey, BuiltinCost)] = &[];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now empty, could technically remove in a follow-up PR.
However, I wonder if we should clean it up or leave it for future migrations. There are only a few programs left:

    (vote::id(), BuiltinCost::NotMigrating),
    (system_program::id(), BuiltinCost::NotMigrating),
    (compute_budget::id(), BuiltinCost::NotMigrating),
    (bpf_loader_upgradeable::id(), BuiltinCost::NotMigrating),
    (bpf_loader_deprecated::id(), BuiltinCost::NotMigrating),
    (bpf_loader::id(), BuiltinCost::NotMigrating),
    (loader_v4::id(), BuiltinCost::NotMigrating),
    (secp256k1_program::id(), BuiltinCost::NotMigrating),
    (ed25519_program::id(), BuiltinCost::NotMigrating),

Do we have plans or anticipate migrating any of these?

Copy link
Member Author

@2501babe 2501babe Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, i asked jon before about whether to leave the migration scaffolding in place for the future or if we were done. we want to migrate loaderv4 and the zk-elgamal program (which i dont see on this list, not sure its status), but they need some syscalls first, so i left migration logic for future use

originally we had planned on migrating vote and system. but my understanding is alpenglow will not need an actual vote program. and we nixed system because its simple enough for fd to keep their own version and creating a "privileged" syscall for a bpf system to allocate accounts was not worth it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants