Skip to content

asm: add LOCK-prefixed instructions #10477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Mar 29, 2025
Merged

Conversation

abrown
Copy link
Contributor

@abrown abrown commented Mar 27, 2025

These changes add new instructions that add the LOCK prefix. The LOCK prefix is only valid for (1) instruction forms where the destination is a memory operand and (2) instructions in the following set: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCH8B,
CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG (not all of which are defined yet). To handle (1) we add a new memory-only operand kind: m{8|16|32|64}.

With new LOCK-prefixed instructions defined, we use them in the ISLE lowering of read-modify-write instructions. The top-level lower.isle lowerings underwent some refactoring to handle the fact that these instructions return a SideEffectNoResult but the lowering rule expects an invalid register to be returned in a Reg-to-InstOutput auto conversion.

abrown added 10 commits March 26, 2025 17:26
This change is the first in a series to add new instructions that emit
the `LOCK` prefix during encoding.
This defines all `LOCK`-prefixed variants for `and*`.
This allows the `LOCK`-prefixed definitions to still fit on a single
line.
`LOCK`-prefixed instructions can only write to memory operands (see
prior commits). This change wires up the necessary codegen to handle the
new `m*` operands.
It is unclear why this would happen now, though perhaps the addition of
all the `LOCK`-prefixed instructions could do it (?).
@abrown abrown requested a review from a team as a code owner March 27, 2025 00:36
@abrown abrown requested review from fitzgen and removed request for a team March 27, 2025 00:36
@@ -1,5 +1,5 @@
# This extra configuration allows defining extra-long lines in
# `src/instructions`.
fn_call_width = 100
max_width = 110
max_width = 120
Copy link
Contributor Author

@abrown abrown Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be helpful to review this commit (13fb00d) separately since it contains all the formatting changes (cargo fmt) and no functional changes.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. isle Related to the ISLE domain-specific language labels Mar 27, 2025
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:x64", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Comment on lines +43 to +63
// `LOCK`-prefixed memory-writing instructions.
inst("lock_addb", fmt("MI", [rw(m8), r(imm8)]), rex([0xf0, 0x80]).digit(0).ib(), _64b | compat),
inst("lock_addw", fmt("MI", [rw(m16), r(imm16)]), rex([0xf0, 0x66, 0x81]).digit(0).iw(), _64b | compat),
inst("lock_addl", fmt("MI", [rw(m32), r(imm32)]), rex([0xf0, 0x81]).digit(0).id(), _64b | compat),
inst("lock_addq", fmt("MI_SXL", [rw(m64), sxq(imm32)]), rex([0xf0, 0x81]).w().digit(0).id(), _64b),
inst("lock_addl", fmt("MI_SXB", [rw(m32), sxl(imm8)]), rex([0xf0, 0x83]).digit(0).ib(), _64b | compat),
inst("lock_addq", fmt("MI_SXB", [rw(m64), sxq(imm8)]), rex([0xf0, 0x83]).w().digit(0).ib(), _64b),
inst("lock_addb", fmt("MR", [rw(m8), r(r8)]), rex([0xf0, 0x0]).r(), _64b | compat),
inst("lock_addw", fmt("MR", [rw(m16), r(r16)]), rex([0xf0, 0x66, 0x1]).r(), _64b | compat),
inst("lock_addl", fmt("MR", [rw(m32), r(r32)]), rex([0xf0, 0x1]).r(), _64b | compat),
inst("lock_addq", fmt("MR", [rw(m64), r(r64)]), rex([0xf0, 0x1]).w().r(), _64b),
inst("lock_adcb", fmt("MI", [rw(m8), r(imm8)]), rex([0xf0, 0x80]).digit(2).ib(), _64b | compat),
inst("lock_adcw", fmt("MI", [rw(m16), r(imm16)]), rex([0xf0, 0x66, 0x81]).digit(2).iw(), _64b | compat),
inst("lock_adcl", fmt("MI", [rw(m32), r(imm32)]), rex([0xf0, 0x81]).digit(2).id(), _64b | compat),
inst("lock_adcq", fmt("MI_SXL", [rw(m64), sxq(imm32)]), rex([0xf0, 0x81]).w().digit(2).id(), _64b),
inst("lock_adcl", fmt("MI_SXB", [rw(m32), sxl(imm8)]), rex([0xf0, 0x83]).digit(2).ib(), _64b | compat),
inst("lock_adcq", fmt("MI_SXB", [rw(m64), sxq(imm8)]), rex([0xf0, 0x83]).w().digit(2).ib(), _64b),
inst("lock_adcb", fmt("MR", [rw(m8), r(r8)]), rex([0xf0, 0x10]).r(), _64b | compat),
inst("lock_adcw", fmt("MR", [rw(m16), r(r16)]), rex([0xf0, 0x66, 0x11]).r(), _64b | compat),
inst("lock_adcl", fmt("MR", [rw(m32), r(r32)]), rex([0xf0, 0x11]).r(), _64b | compat),
inst("lock_adcq", fmt("MR", [rw(m64), r(r64)]), rex([0xf0, 0x11]).w().r(), _64b),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for this: would it would to have .lock() as a builder on the return value of inst(..) and that would be an indicator that the lock prefix and non-lock prefix would both be generated? That way we wouldn't have to duplicate the opcodes/feature flags and we could bake-in logic that .lock() requires the first operand to be memory

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: I'd want the same generated code as what this PR does today, just with less boilerplate on the definitions here by having .lock() as an agumentation rather than a new set of instructions. It could even internally "expand" somewhere along the way to two instruction values perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I started with that route but abandoned it due to "too much magic." Right now these definitions have the property that each line here shows up as a generated instruction, and either doing that under the hood with .lock() or alternately setting up a for loop in this file to do the same ends up breaking that simple correspondence. And these new magically-created instructions are actually different enough that, combined with the "new instruction magic," I abandoned that route: they have a new mnemonic (lock_*), they have a new prefix, and--crucially--they only accept memory (not reg-mem) as their destination operand.

The other approach I thought through here was to actually model LOCK as its own separate instruction that would take a Box<LockableInst> or something like that. Intel's manual conceptually treats LOCK as its own instruction, so it made sense from that perspective. And it would have forced me to create some way to integrate hand-crafted instructions, i.e., instructions that we simply implement by hand but somehow get integrated into the auto-generated Inst enum. But ultimately I decided that, because in cranelift-codegen-meta we end up needing new ISLE constructors for any LOCK-prefixed variants, just creating more boilerplate was the simplest and clearest.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@fitzgen fitzgen added this pull request to the merge queue Mar 28, 2025
Merged via the queue into bytecodealliance:main with commit a968b7d Mar 29, 2025
53 checks passed
@abrown abrown deleted the asm-locks branch May 2, 2025 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants