Skip to content

Initial support for vex encoding for the new assembler. #10754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jlb6740
Copy link
Contributor

@jlb6740 jlb6740 commented May 8, 2025

No description provided.

@jlb6740 jlb6740 requested a review from a team as a code owner May 8, 2025 23:05
@jlb6740 jlb6740 requested review from abrown and removed request for a team May 8, 2025 23:05
@jlb6740
Copy link
Contributor Author

jlb6740 commented May 8, 2025

The review for this patch started here #10695. This attempts to add support for vex encoding and in the process adds the lowering of couple of initial vex instructions vaddpd and vaddps.
@rahulchaphalkar

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels May 9, 2025
@jlb6740 jlb6740 force-pushed the add-initial-vex-encoding branch 2 times, most recently from 78ea103 to 29c9182 Compare May 9, 2025 16:21
Copy link
Contributor

@abrown abrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some smaller review comments mixed but I wanted to highlight the VexInstruction encoding as the only major thing left to resolve; otherwise this looks good to me. I left some comments here about that encoding part but we can talk offline in more detail.

@@ -408,6 +418,23 @@ impl Prefixes {
&& self.group3.is_none()
&& self.group4.is_none()
}

pub fn bits(&self) -> u8 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is for VEX encoding maybe it should go with the VEX-related code. Plus, don't we need to check the actual values in each group?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, do we even use this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No .. will remove. I believe it remained as an artifact of a rebase when in fact it could/should have been purged at that time.

VexPP::None => write!(f, "None"),
VexPP::_66 => write!(f, "_66"),
VexPP::_F3 => write!(f, "_F3"),
VexPP::_F2 => write!(f, "_F2"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For printing VexPP and VexMMMMM, if we print using . instead of _ then we can match the manual a bit more closely. E.g., VEX.128.0F.WIG 58 /r.

Copy link
Contributor Author

@jlb6740 jlb6740 May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the underscore is needed because this naming is not just used for printing, but it also used as part of syntax for an associated item:

vex.vvvv = Some(self.xmm2.enc()); // cranelift/assembler-x64/meta/src/generate/format.rs:135
vex.prefix = LegacyPrefix::.66; // cranelift/assembler-x64/meta/src/generate/format.rs:136
vex.map = OpcodeMap::_0F; // cranelift/assembler-x64/meta/src/generate/format.rs:137

where here the LegacyPrefix::.66 is not valid syntax.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I can get around this. Will see.

rm8 | rm16 | rm32 | rm64 | xmm_m32 | xmm_m64 | xmm_m128 | m8 | m16 | m32 | m64 => true,
al | cl | ax | eax | rax | imm8 | imm16 | imm32 | r8 | r16 | r32 | r64 | xmm1
| xmm2 | xmm3 | ymm1 | ymm2 | ymm3 | zmm1 | zmm2 | zmm3 => false,
rm8 | rm16 | rm32 | rm64 | xmm_m32 | xmm_m64 | m8 | m16 | m32 | m64 | xmm_m128
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: move the xmm_* locations together?

Suggested change
rm8 | rm16 | rm32 | rm64 | xmm_m32 | xmm_m64 | m8 | m16 | m32 | m64 | xmm_m128
rm8 | rm16 | rm32 | rm64 | m8 | m16 | m32 | m64 | xmm_m32 | xmm_m64 | xmm_m128

xmm1,
xmm2,
xmm3,
ymm1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not convinced we need these yet. Maybe we should wait to add YMM and ZMM when we actually need those instructions.

@@ -32,6 +31,10 @@ impl dsl::Format {
self.generate_immediate(f);
}

pub fn generate_vex_encoding(&self, f: &mut Formatter, vex: &dsl::Vex) {
self.generate_vex(f, vex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well inline generate_vex here since there are no other uses.

/// REP/REPE, but no specific meaning here -- is just an opcode extension.
_F3,
/// Operand size override and same effect as F3.
_66F3,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only possibilities for the pp field are: None, 66, F3, and F2. We can pare this down and rename it Pp or Prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe this is copied over or merged from the codegen. I called this vexPP in the DSL. Will do the same here.

/// formats pack this information as bits in a prefix (e.g. VEX / EVEX).
pub fn bits(&self) -> u8 {
match self {
OpcodeMap::None => 0b00,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manual says 00000 is reserved for the m-mmmm field and would result in a #UD. Maybe we should remove None since we can't use it (?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None is used as default value when creating a VexInstruction. I could change that to be an option, but the way it is now is the exact same as the current implementation in the codegen encoding:

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/isa/x64/encoding/vex.rs#L27-L41

Do you think we should change this to be an option?

pub w: bool,
pub reg: u8,
//pub rm: XmmMem<R::ReadXmm, R::ReadGpr>,
pub rm: Option<XmmMem<R::ReadXmm, R::ReadGpr>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this (or reg) is quite right: we won't always have a memory operand here. It would seem we need to do the REX flags encoding, extract the R, X, and B fields, and pass those around here.

There are instructions like the following that we won't be able to handle with this approach:

  • VMOVHLPS xmm1, xmm2, xmm3
  • VBLENDVPS xmm1, xmm2, xmm3/m128, xmm4

If we alter the VexInstruction thing to just emit the VEX prefixes, not the whole instruction, then I think things will work better here (no more default instructions) and will be more flexible for the future. Plus, we can reuse all of the suffix stuff that should be the same (see encode_rex_suffixes).

@@ -351,7 +363,7 @@ pub fn generate_isle_inst_decls(f: &mut Formatter, inst: &Inst) {
.format
.operands
.iter()
.filter(|o| o.mutability.is_read())
//.filter(|o| o.mutability.is_read())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//.filter(|o| o.mutability.is_read())

@@ -3799,7 +3799,9 @@
(decl x64_addps (Xmm XmmMem) Xmm)
(rule 1 (x64_addps src1 src2)
(if-let true (use_avx))
(xmm_rmir_vex (AvxOpcode.Vaddps) src1 src2))
;;(xmm_rmir_vex (AvxOpcode.Vaddps) src1 src2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
;;(xmm_rmir_vex (AvxOpcode.Vaddps) src1 src2))


impl VexLength {
/// Encode the `L` bit.
pub fn bits(&self) -> u8 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not use this either in the DSL phase; isn't this only necessary when we encode in the cranelift-assembler-x64 crate?

abrown added a commit to abrown/wasmtime that referenced this pull request May 9, 2025
This change adds support to the new assembler for write-only operands.
This implementation appeared first in [bytecodealliance#10754] but is split out here to
unblock implementation of instructions that require it: multiplication,
conversions, moves, etc. This starts roughly the same as what was
implemented for write-only XMMs in [bytecodealliance#10754] but includes support for
write-only GPRs as well and generates the temporary registers which are
needed.

[bytecodealliance#10754]: bytecodealliance#10754

Co-authored-by: Johnnie Birch <[email protected]>
abrown added a commit to abrown/wasmtime that referenced this pull request May 9, 2025
This change adds support to the new assembler for write-only operands.
This implementation appeared first in [bytecodealliance#10754] but is split out here to
unblock implementation of instructions that require it: multiplication,
conversions, moves, etc. This starts roughly the same as what was
implemented for write-only XMMs in [bytecodealliance#10754] but includes support for
write-only GPRs as well and generates the temporary registers which are
needed.

[bytecodealliance#10754]: bytecodealliance#10754

Co-authored-by: Johnnie Birch <[email protected]>
github-merge-queue bot pushed a commit that referenced this pull request May 9, 2025
* asm: implement write-only operands

This change adds support to the new assembler for write-only operands.
This implementation appeared first in [#10754] but is split out here to
unblock implementation of instructions that require it: multiplication,
conversions, moves, etc. This starts roughly the same as what was
implemented for write-only XMMs in [#10754] but includes support for
write-only GPRs as well and generates the temporary registers which are
needed.

[#10754]: #10754

Co-authored-by: Johnnie Birch <[email protected]>

* fix: use `to_reg()` to extract the Cranelift type from `Writable`

---------

Co-authored-by: Johnnie Birch <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants