-
Notifications
You must be signed in to change notification settings - Fork 177
SIMD-0315: Loader-v3 to loader-v4 migration #315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
2446501
to
061586c
Compare
- the migration authority | ||
(pubkey is `3Scf35jMNk2xXBD6areNjgMtXgp5ZspDhms8vdcbzC42`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea for anyone to have the unilateral authority to migrate programs even if it's a relatively harmless operation. I prefer the migration to be done by the runtime, maybe automatically at an epoch boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explained the reasons against doing so in the alternatives section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing that up. Is there a reason we don't just allow the migration instruction to be run permissionlessly by anyone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be done too fast (too many at once). I would rather have them be done one at a time and slowly work though a fixed list of loader-v3 programs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, we could deploy an on-chain program to manage rate-limiting on this. So, it would be effectively permissionless, but the program could track how many have been submitted per-slot and deny past a certain limit.
I have to agree I'm also apprehensive about a keypair that can single-handedly change even a frozen program's state, although I recognize the similarity to a feature keypair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lichtso any thoughts on doing this through an on-chain program? It could give us some better security guarantees and relax any concerns over a god mode key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean giving the migration authority to an on-chain program? A program only runs if somebody sends an transaction. It is still the same controlling of the timing, with extra steps.
god mode key
Again, the migration authority has almost no say in anything, but the timing.
- Clear the program data account (setting its size to zero) | ||
- Assign ownership of the program data account to the system program | ||
|
||
## Impact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the migrate instruction was not added to the list of bpf loader instructions that can be invoked via cpi. So a multisig owned program would need to first transfer the authority to a keypair based signer to do the migration, right?
Once new programs can not be deployed on loader-v3 anymore, the list of all | ||
loader-v3 programs becomes fixed and can be extracted from a snapshot. Using | ||
the added loader-v3 migration instruction and the global migration authority, | ||
the core protocol developers will then migrate all loader-v3 programs to | ||
loader-v4 programs, which once completed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if the mass migration was defined and activated in a separate SIMD + feature gate but could be convinced otherwise.
- This SIMD is useful already without global migration
- This SIMD is already big and complex enough without global migration
- Global migration could use some more discussion about alternative approaches to the global migration authority
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This SIMD was already split off from the loader-v4 one and is not even that long anymore. I think the global migration very much belongs in here as well as the program local one. Both use the same logic / mechanism, only the trigger is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we ever done anything like this before? What kind of coordination goes into a global migration like the one proposed? Just curious if we have any precedents.
- the migration authority | ||
(pubkey is `3Scf35jMNk2xXBD6areNjgMtXgp5ZspDhms8vdcbzC42`) | ||
- or the upgrade authority stored in the program data account | ||
- or the program signer if the program is finalized, closed or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of migrating closed programs to allow resuscitation but I think migrating finalized programs should be done as part of the mass migration because they are more sensitive because they are presumed to be in use, never unavailable (delayed visibility), and there is no guarantee that the program keypair is even stored or secured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the trigger matters for the delayed visibility, does it? Seems like no matter who calls the migration, there will be a delayed visibility slot.
they are more sensitive because they are presumed to be in use, never unavailable (delayed visibility)
This is a really salient point. We should see if there's a way to avoid halting execution on finalized programs.
program data account minus the loader-v3 header size and use the migration | ||
authority. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this should say the "provided authority" because "migration authority" in this SIMD refers to the static core dev one, right?
- Assign ownership of the program account to loader-v4 | ||
- If the program data account was not closed / empty or uninitialized: | ||
- CPI loader-v4 `SetProgramLength` the program account to the size of the | ||
program data account minus the loader-v3 header size and use the migration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should specify that it's the programdata header and also call out the exact byte length here: 45
otherwise throw `InvalidAccountData` | ||
- Check that the program account points to the program data account, | ||
otherwise throw `InvalidArgument` | ||
- Clear the program account (setting its size to zero) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: clear the program account data (setting its data length to zero)
- Check that the program data was last modified before the current slot | ||
if the program data has the state `ProgramData`, | ||
otherwise throw `InvalidArgument` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should explicitly specify that the last modification slot comes from the programdata state
- otherwise, if the program data account was not finalized and the | ||
migration authority (as opposed to the upgrade authority) was provided: | ||
- CPI loader-v4 `TransferAuthority` to the upgrade authority | ||
- Clear the program data account (setting its size to zero) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly we need to handle the programdata account differently depending on whether it's system or loader-v3 owned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, programdata account is always owned by loader-v3 for valid programs. Edit: Closed programs can get in this situation.
type: Core | ||
status: Review | ||
created: 2024-08-15 | ||
feature: 2aQJYqER2aKyb3cZw22v4SL2xMX7vwXBRWfvS4pTrtED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it should be separate. I also was under the impression separate SIMDs meant separate feature gates as well. Not sure though.
The two loader-v3 accounts per program in their sum are always larger than | ||
the one resulting loader-v4 account. Thus there is no need for additional | ||
funding. This would not be the case when migrating from older loaders. Meaning | ||
that expanding this SIMD to cover these as well would require a funding source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, why not include it? If we just include the caveat that it requires pre-funding, then anyone who wants the benefits of v4 can choose to pay for it. Lots of the SPL programs come to mind as well.
- the migration authority | ||
(pubkey is `3Scf35jMNk2xXBD6areNjgMtXgp5ZspDhms8vdcbzC42`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, we could deploy an on-chain program to manage rate-limiting on this. So, it would be effectively permissionless, but the program could track how many have been submitted per-slot and deny past a certain limit.
I have to agree I'm also apprehensive about a keypair that can single-handedly change even a frozen program's state, although I recognize the similarity to a feature keypair.
- enable loader-v4 `LoaderV411111111111111111111111111111111111` program | ||
management and execution (see SIMD-0167). | ||
- enable the loader-v3 `BPFLoaderUpgradeab1e11111111111111111111111` | ||
instruction `UpgradeableLoaderInstruction::Migrate`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two feature gates might not be bad, just to let a bunch of first-time v4 deployments go out and bake before we enable mass migrations of v3 programs. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still vote for two separate features!
will count as a redeployment and thus render the program unavailable for the | ||
rest of the slot (delay visibility). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When finalized programs are migrated it would be best to not make them temporarily unavailable even for one slot.
I would even extend this idea to non-finalized programs as well. If the ELF didn't change, there's no reason the program should have to pay with a slot.
@Lichtso since the ELF isn't changing, and we are guaranteeing that in the loader v3 instruction, could we add a codepath to the program cache that handles in-place mutation of a loaded entry? It sounds sketchy typing it out, but it might be worth exploring.
Once new programs can not be deployed on loader-v3 anymore, the list of all | ||
loader-v3 programs becomes fixed and can be extracted from a snapshot. Using | ||
the added loader-v3 migration instruction and the global migration authority, | ||
the core protocol developers will then migrate all loader-v3 programs to | ||
loader-v4 programs, which once completed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we ever done anything like this before? What kind of coordination goes into a global migration like the one proposed? Just curious if we have any precedents.
084c02f
to
9b02e90
Compare
### Loader-v2 programs | ||
|
||
The two loader-v3 accounts per program in their sum are always larger than | ||
the one resulting loader-v4 account. Thus there is no need for additional | ||
funding. This would not be the case when migrating from loader-v2. Meaning | ||
that expanding this SIMD to cover it as well would require a funding source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why we couldn't just make the instruction support v2 -> v4 migrations, even if we don't plan to crank those ourselves?
- enable loader-v4 `LoaderV411111111111111111111111111111111111` program | ||
management and execution (see SIMD-0167). | ||
- enable the loader-v3 `BPFLoaderUpgradeab1e11111111111111111111111` | ||
instruction `UpgradeableLoaderInstruction::Migrate`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still vote for two separate features!
- Check that the last modified slot (stored in the program data accounts | ||
header) is less than the current slot if the program data has the state | ||
`ProgramData`, | ||
otherwise throw `InvalidArgument` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe verify the header is ProgramData
first?
- the migration authority | ||
(pubkey is `3Scf35jMNk2xXBD6areNjgMtXgp5ZspDhms8vdcbzC42`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lichtso any thoughts on doing this through an on-chain program? It could give us some better security guarantees and relax any concerns over a god mode key.
- the migration authority | ||
(pubkey is `3Scf35jMNk2xXBD6areNjgMtXgp5ZspDhms8vdcbzC42`) | ||
- or the upgrade authority stored in the program data account | ||
- or the program signer if the program is finalized, closed or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the trigger matters for the delayed visibility, does it? Seems like no matter who calls the migration, there will be a delayed visibility slot.
they are more sensitive because they are presumed to be in use, never unavailable (delayed visibility)
This is a really salient point. We should see if there's a way to avoid halting execution on finalized programs.
- CPI loader-v4 `Copy` the program data account into the program account | ||
- CPI loader-v4 `Deploy` the program account | ||
- If the program data account was finalized (upgrade authority is `None`): | ||
- CPI loader-v4 `Finalize` without a next version forwarding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- CPI loader-v4 `Finalize` without a next version forwarding | |
- CPI loader-v4 `Finalize` |
- otherwise, if the program account is empty, has the state `Buffer` or has | ||
the state `Program` but the program data account is not owned by loader-v3: | ||
- Set the length of the program account to 0 | ||
- Set the `is_executable` flag of the program account to `false` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just error if someone presents a v3 buffer account here? It can still be used to deploy a v4 program, since Copy
allows it, so there's no need to support it here. If we just need a way to close them, maybe we keep the Close
instruction active on loader v3?
9b02e90
to
d335c61
Compare
d335c61
to
897fd93
Compare
897fd93
to
1e2b4ad
Compare
As requested in #167 (comment)