Skip to content

Initial support for vex encoding for the new assembler. #10695

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jlb6740
Copy link
Contributor

@jlb6740 jlb6740 commented Apr 29, 2025

Patch is first, one or two more will be pushed for enabling.

@jlb6740 jlb6740 requested review from a team as code owners April 29, 2025 01:34
@jlb6740 jlb6740 requested review from abrown and alexcrichton and removed request for a team April 29, 2025 01:34
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels Apr 29, 2025
@jlb6740 jlb6740 force-pushed the add-initial-vex-encoding-support-1-of-2 branch 5 times, most recently from cb9474d to b41bba8 Compare April 30, 2025 18:40
Patch is first, one or two more will be pushed for enabling.
use crate::api::{AsReg, CodeSink, KnownOffsetTable, Registers};
use crate::mem::emit_modrm_sib_disp;
use crate::Amode;
use cranelift_assembler_x64_meta::dsl::{OpcodeMap, Vex};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to duplicate necessary types here? The intent is that the meta crate is only used at build-time and isn't a runtime dependency but this is otherwise adding the meta crate as a runtime dependency as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the cranelift-codegen cinematic universe of crates, we have -meta and -shared; the -shared crate is depended on by both the meta and main crate. Perhaps the same split makes sense here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alexcrichton, I think so. I did it this way as it seemed to be required for a previous iteration of this patch set, plus it assures consistency and avoids duplication, but I understand we don't want a dependency on the meta crate and that duplication is ok. If I do believe there is a dependency we want to keep, I'll follow @cfallin advise and see if a shared crate makes sense here. Also note, there are a few unnecessary elements in the vex struct in the meta crate that I am removing along with commented out code intended to help develop the follow-up patch 2 of 2 to be submitted later when we enable this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton .. I believe this is addressed now.

Also removes unneeded code and code cleans-up extraneous comments.
Copy link
Contributor

@abrown abrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got about halfway through reviewing this; here are some comments.

pub w: bool,
pub r: bool,
pub wig: bool,
pub rxb: u8,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at the DSL level we may want to have separate booleans for each bit, e.g.:

r: bool,
x: bool,
b: bool,
w: bool,

As it stands, this duplicates the r bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am not positive, but I don't even think any of these bits will need to be set at the DSL level. For now I'll just remove the rxb and add the b and w bits as suggested here (but maybe we can remove later).

Self { mmmmm, ..self }
}

fn validate(&self, _operands: &[Operand]) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be helpful to check here for invalid instruction definitions: that way we can find them early, at codegen time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed .. can we implement this validate in the next PR where we turn the plumbing on and we can see that the validation is actually doing the right thing against those first few instructions or would you like to see a little more of this filled out now?

pub fn bytes(&self) -> u8 {
self.bits() / 8
pub fn bytes(&self) -> u16 {
self.bits() / 16
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not right:

Suggested change
self.bits() / 16
self.bits() / 8

zmm2,
zmm3,

xmm_m128,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as rm128... I'm not convinced adding these unused variants is necessary at this point, either.

Copy link
Contributor Author

@jlb6740 jlb6740 May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wanting to be consistent with the manual here:

ADDPD xmm1, xmm2/m128
VADDPD xmm1, xmm2, xmm3/m128

and writing something like this:

inst(
            "vaddpd",
            fmt("B", [w(xmm1), r(xmm2), r(xmm_m128)]),
            vex(0x58).length(_128).pp(_66).mmmmm(_OF),
            _64b | compat | sse,
        ),

instead of like this:

inst(
            "vaddpd",
            fmt("B", [w(xmm1), r(xmm2), r(rm128)]),
            vex(0x58).length(_128).pp(_66).mmmmm(_OF),
            _64b | compat | sse,
        ),

I kept the rm128 because I wasn't sure of opinions of this during review that I replaced rm128 with xmm_m128 in all the rex encoded instructions just yet (maybe a separate patch), but using the naming xmm_m128 is more consistent with the manual. What do you think?

Comment on lines +149 to +152
fmtln!(f, "match &self.xmm_m128 {{");
fmtln!(f, "XmmMem::Xmm(r) => {{vex.rm = XmmMem::Xmm(r.clone());}}");
fmtln!(f, "XmmMem::Mem(m) => {{vex.rm = XmmMem::Mem(m.clone());}}");
fmtln!(f, "}}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you're adding brace-delimited blocks, you can use Formatter::add_block to get the indentation right, avoid forgetting to add a closing brace, and to match the rest of this crate:

Suggested change
fmtln!(f, "match &self.xmm_m128 {{");
fmtln!(f, "XmmMem::Xmm(r) => {{vex.rm = XmmMem::Xmm(r.clone());}}");
fmtln!(f, "XmmMem::Mem(m) => {{vex.rm = XmmMem::Mem(m.clone());}}");
fmtln!(f, "}}");
f.add_block("match &self.xmm_m128", |f| {
fmtln!(f, "XmmMem::Xmm(r) => {{vex.rm = XmmMem::Xmm(r.clone());}}");
fmtln!(f, "XmmMem::Mem(m) => {{vex.rm = XmmMem::Mem(m.clone());}}");
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit redundant to match here in any case...


pub fn list() -> Vec<Inst> {
vec![
inst("addpd", fmt("A", [rw(xmm1), r(align(rm128))]), rex([0x66, 0x0F, 0x58]).r(), _64b | compat | sse),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates one of the definitions in add.rs, right? Let's just keep all the addition in that file to avoid this kind of mistake.

Copy link
Contributor Author

@jlb6740 jlb6740 May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the final vex support I removed the one in add.rs (but put it back for this pr). The reason is simple, I was organizing the instructions the same way they are organized in the manual by opcode (specifically Vol 2A 3-34 for this instruction). The ADDPD opcode instruction section covers rex, vex, and evex. What do you think about this organization of grouping instructions consistent with the manual?

@@ -274,6 +277,6 @@ mod test {
.budget_ms(1_000);

// This will run the `roundtrip` fuzzer for one second. To repeatably
// test a single input, append `.seed(0x<failing seed>)`.
// test a single0 input, append `.seed(0x<failing seed>)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// test a single0 input, append `.seed(0x<failing seed>)`.
// test a single input, append `.seed(0x<failing seed>)`.

imm: None,
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to create a default (i.e., invalid) VEX instruction ever, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. Will remove this.

fn generate_vex(&self, f: &mut Formatter, vex: &dsl::Vex) {
f.empty_line();
f.comment("Emit VEX prefix.");
fmtln!(f, "let vex: VexInstruction<R> = VexInstruction::default();");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating a default (i.e., invalid) instruction here, why not use a new constructor or a builder pattern (e.g., op(...).reg(...)...) to create only valid instructions?

@jlb6740
Copy link
Contributor Author

jlb6740 commented May 8, 2025

Hi @cfallin @alexcrichton @abrown I've attempted to address all comments or at least touch base with the reviewer offline. I've moved this PR to #10754 where instead of doing this in 2 parts, well just attempt to switch on the lowering of vaddpd and vaddps. I'll close this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants