Skip to content

[NFC] Add useFPRegsForHalfType(). #74147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 2, 2024
Merged

Conversation

hvdijk
Copy link
Contributor

@hvdijk hvdijk commented Dec 1, 2023

Currently, half operations can be promoted in one of two ways.

  • If softPromoteHalfType() returns false, fp16 values are passed around in fp32 registers, and whole chains of fp16 operations are promoted to fp32 in one go.
  • If softPromoteHalfType() returns true, fp16 values are passed around in i16 registers, and individual fp16 operations are promoted to fp32 and the result truncated to fp16 right away.

The softPromoteHalfType behavior is necessary for correctness, but changing this for an existing target breaks the ABI. Therefore, this commit adds a third option:

  • If softPromoteHalfType() returns true and useFPRegsForHalfType() returns true as well, fp16 values are passed around in fp32 registers, but individual fp16 operations are promoted to fp32 and the result truncated to fp16 right away.

This change does not yet update any target to make use of it.

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 1, 2023

An example of how this can be used, which I plan to submit after this: hvdijk@softpromotehalf-fpreg-arm

@kparzysz
Copy link
Contributor

kparzysz commented Dec 2, 2023

You need to update all the "SoftPromoteHalf..." functions in llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp.

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 2, 2023

You need to update all the "SoftPromoteHalf..." functions in llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp.

Could you elaborate a bit on this? During testing with the aforementioned hvdijk@softpromotehalf-fpreg-arm, things seemed to work. Are you imagining anything in particular that doesn't work?

@kparzysz
Copy link
Contributor

kparzysz commented Dec 3, 2023

There are a handful of functions there that assume MVT::i16 for representing half values, e.g. https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp#L2806-L2812

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 3, 2023

Ah, thank you, I see what you mean and what's going wrong. What you say shouldn't be an issue directly, but you are right that this PR is incorrect. Inside functions, I continue to use MVT::i16 for representing half values. The conversion to/from f32 is handled before that, so the issue in those SoftPromoteHalf functions should never come up, their assumptions about i16 should continue to hold. However, that conversion to/from f32 is done wrong, so rather than f16 values promoted to f32, those registers hold f16 values as i16, extended to i32, bitcast to f32. And this did not come up in my testing because it is an ABI-incompatibility that can only be seen by mixing object files from current LLVM with my patched LLVM, which I had not done in my testing. I'll work on fixing that.

Edit: Actually... Isn't that already what happens on current LLVM too? In which case, isn't that what should continue to happen?

@kparzysz
Copy link
Contributor

kparzysz commented Dec 3, 2023

Inside functions, I continue to use MVT::i16 for representing half values.

Are you trying to only apply this promotion to function arguments and return values? If so, then it should be entirely done in handling a calling convention. This PR changes type legalization which will apply to all values of type MVT::fp16.

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 3, 2023

Kind of. I'm only trying to use f32 to maintain compatibility with current LLVM, which uses it implicitly for function arguments and return values because that's what it's promoted to. Any place that doesn't affect compatibility, it doesn't matter whether it's promoted to i16 or to f32 so long as the promotion is done correctly, and reversed correctly. I'll have a look at implementing it in the way that you suggest.

@hvdijk
Copy link
Contributor Author

hvdijk commented Dec 6, 2023

There are extra complications because f16 is not a legal type, and non-legal types are ignored in calling conventions, so the calling convention can set that they should be handled as f32, but that means nothing, they will be passed in whatever NumRegistersForVT[MVT::f16] is set to. But it is possible to override getRegisterTypeForCallingConv to handle that, and getNumRegistersForCallingConv, if we want.

The bigger issue is that legalising to i16 causes internal errors because on ARM, like f16, i16 is also not a legal type. We can legalise to f32, or we can legalise to i32, but not to i16. Since legalising to f32 also avoids the former problem as well, I'd prefer to stick with that.

I'll have another look at your "You need to update all the "SoftPromoteHalf..." functions" comment and see if I can find a test case that shows something that needs updating.

@hvdijk
Copy link
Contributor Author

hvdijk commented Jan 2, 2024

So, revisiting this... The TransformToType has two uses. Both are for what type to legalize to, but at different times. There is the type legalization of function parameters and return values, which bypasses the calling convention so is tricky to handle elsewhere. And there is the type legalization for internal values within functions. However, for TypeSoftPromoteHalf, TransformToType is almost entirely ignored for internal values within functions, most TypeSoftPromoteHalf always promotes to i16, and that is fine, there is no requirement for type legalization to actually legalize to the type set in TransformToType. If I go over the lot and ensure all promotion is to i16, would that then make this PR acceptable?

Copy link
Contributor

@kparzysz kparzysz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable.

// the size matters.
// Return true if the half type should be promoted using soft promotion rules
// where each operation is promoted to f32 individually, then converted to
// fp16. The default behavior is to promote links of operations, keeping
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean "chains of operations"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you. I struggled to find the right wording, managed to remember to update it in the commit message and MR description, but forgot to also update the comment. Fixed.

virtual bool softPromoteHalfType() const { return false; }

// Return true if, for soft-promoted half, the half type should be passed
// around as f32. The default behavior is to pass around as i16. If
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of "should be passed around" have "should be passed in function calls". Does this apply to return values as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, and agreed that that would be clearer. Updated to say "passed to and returned from functions".

Currently, half operations can be promoted in one of two ways.

* If softPromoteHalfType() returns false, fp16 values are passed to and
  returned from functions in fp32 registers, and whole chains of fp16
  operations are promoted to fp32 in one go.
* If softPromoteHalfType() returns true, fp16 values are passed to and
  returned from functions in i16 registers, and individual fp16
  operations are promoted to fp32 and the result truncated to fp16 right
  away.

The softPromoteHalfType behavior is necessary for correctness, but
changing this for an existing target breaks the ABI. Therefore, this
commit adds a third option:

* If softPromoteHalfType() returns true and useFPRegsForHalfType()
  returns true as well, fp16 values are passed to and returned from
  functions in fp32 registers, but individual fp16 operations are
  promoted to fp32 and the result truncated to fp16 right away.

This change does not yet update any target to make use of it.
@hvdijk hvdijk force-pushed the softpromotehalf-fpreg branch from 6b65afa to 07916dc Compare January 12, 2024 01:25
@hvdijk
Copy link
Contributor Author

hvdijk commented Jan 12, 2024

In my previous comment, when I wrote TypeToPromoteTo, I meant RegisterTypeForVT. Apologies for the confusion. TypeToPromoteTo is used, but is always set to MVT::f32 and I am not changing that.

I have gone over all the SoftPromoteHalfRes_* functions. Two (BITCAST and EXTRACT_VECTOR_ELT) return an integer the same with as the float input, which for f16 always results in i16. Three (FCOPYSIGN, SELECT, SELECT_CC) return the same type as a promoted operand (GetSoftPromotedHalf) where the operand was originally f16. Two (VECREDUCE, VECREDUCE_SEQ) keep the element type as f16 for further legalization. Everything else unconditionally returns i16, nothing there appears to check RegisterTypeForVT directly or indirectly, so I think no functions there need updating to behave as we talked about, all of them already do. With the updated comments, can we merge this as is after all?

I have updated both this and my softpromotehalf-fpreg-arm branch which enables this for ARM, which I would hope to submit as a PR right after this one.

@hvdijk
Copy link
Contributor Author

hvdijk commented Feb 2, 2024

ping. We had a path forwards and it turned out that LLVM already did exactly what that path forwards entailed so it required no further changes beyond what was already in the PR, but then it was silent again. We've now probably missed the opportunity to get this fixed in LLVM 18, with the release branch already being created. Can I do anything to speed this along so we can get this into LLVM 19, at least?

Copy link
Contributor

@kparzysz kparzysz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hvdijk hvdijk merged commit 274d1b0 into llvm:main Feb 2, 2024
@hvdijk hvdijk deleted the softpromotehalf-fpreg branch February 2, 2024 14:05
@kparzysz
Copy link
Contributor

kparzysz commented Feb 2, 2024

Are you planning to have another PR with changes to targets? It's not yet too late to get it in 18.1.0.

@hvdijk
Copy link
Contributor Author

hvdijk commented Feb 2, 2024

Thanks!

Are you planning to have another PR with changes to targets? It's not yet too late to get it in 18.1.0.

I have the commit for ARM ready to create a PR as soon as I confirm no new tests need updating, will create that later today. I would be very happy if that can get into 18.1.0.

@kparzysz
Copy link
Contributor

kparzysz commented Feb 2, 2024

There is a process for cherry-picking fixes into release branches. I don't remember exactly what it is though...

@hvdijk
Copy link
Contributor Author

hvdijk commented Feb 2, 2024

Follow-up: #80440. If that is approved and merged, I think I'll just create an issue and ask Tom Stellard's opinion on cherry-picking.

agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
Currently, half operations can be promoted in one of two ways.

* If softPromoteHalfType() returns false, fp16 values are passed around
in fp32 registers, and whole chains of fp16 operations are promoted to
fp32 in one go.
* If softPromoteHalfType() returns true, fp16 values are passed around
in i16 registers, and individual fp16 operations are promoted to fp32
and the result truncated to fp16 right away.

The softPromoteHalfType behavior is necessary for correctness, but
changing this for an existing target breaks the ABI. Therefore, this
commit adds a third option:

* If softPromoteHalfType() returns true and useFPRegsForHalfType()
returns true as well, fp16 values are passed around in fp32 registers,
but individual fp16 operations are promoted to fp32 and the result
truncated to fp16 right away.

This change does not yet update any target to make use of it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants