-
Notifications
You must be signed in to change notification settings - Fork 5.1k
JIT: Cast UInt64 to Single directly during const folding #106419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 9 commits
Commits
Show all changes
14 commits
Select commit
Hold shift + click to select a range
ceb81bf
Fix cast folding on ARM64
amanasifkhalid d729037
Add test
amanasifkhalid 0a36998
Fix weird comment spacing
amanasifkhalid a740f04
Fix test
amanasifkhalid c98af0a
Fix test
amanasifkhalid 08370ac
Flip params
amanasifkhalid 70c60b3
Run test on CoreCLR only
amanasifkhalid 8dd3473
Add RequiresProcessIsolation
amanasifkhalid 163d581
Merge branch 'main' into fp-cast-arm64
amanasifkhalid 3900e5d
Do single-step conversion on all platforms
amanasifkhalid de0d0c9
Expand test coverage
amanasifkhalid 08fd38c
Update test
amanasifkhalid 6ee12fc
Try running test on all CoreCLR platforms
amanasifkhalid 0ea8525
Fix condition
amanasifkhalid File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
22 changes: 22 additions & 0 deletions
22
src/tests/JIT/Regression/JitBlue/Runtime_106338/Runtime_106338.cs
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
|
||
// Generated by Fuzzlyn v2.2 on 2024-08-13 00:04:04 | ||
// Run on Arm64 MacOS | ||
// Seed: 13207615092246842583-vectort,vector64,vector128,armadvsimd,armadvsimdarm64,armaes,armarmbase,armarmbasearm64,armcrc32,armcrc32arm64,armdp,armrdm,armrdmarm64,armsha1,armsha256 | ||
// Reduced from 226.8 KiB to 0.4 KiB in 00:02:12 | ||
// Debug: Outputs 1600094603 | ||
// Release: Outputs 1600094604 | ||
using System; | ||
using Xunit; | ||
|
||
public class Runtime_106338 | ||
{ | ||
[Fact] | ||
public static void TestEntryPoint() | ||
{ | ||
ulong vr10 = 16105307123914158031UL; | ||
float vr11 = 4294967295U | vr10; | ||
Assert.Equal(1600094603U, BitConverter.SingleToUInt32Bits(vr11)); | ||
} | ||
} |
11 changes: 11 additions & 0 deletions
11
src/tests/JIT/Regression/JitBlue/Runtime_106338/Runtime_106338.csproj
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<Project Sdk="Microsoft.NET.Sdk"> | ||
<PropertyGroup> | ||
<Optimize>True</Optimize> | ||
<!-- Needed for CLRTestTargetUnsupported --> | ||
<RequiresProcessIsolation>true</RequiresProcessIsolation> | ||
<CLRTestTargetUnsupported Condition="'$(TargetArchitecture)' != 'arm64' or '$(RuntimeFlavor)' != 'coreclr'">true</CLRTestTargetUnsupported> | ||
</PropertyGroup> | ||
<ItemGroup> | ||
<Compile Include="$(MSBuildProjectName).cs" /> | ||
</ItemGroup> | ||
</Project> |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this bug exists on all platforms, not just Arm64.
The code in the test is
This boils down to
4294967295U | 16105307123914158031UL
which is16105307124325679103
The correct
float
result is then16105306574569865216.0f
, or rather the raw bits1600094603
(the DEBUG output).The result produced by the two step conversion, however, is
16105307674081492992.0f
, or rather the raw bits1600094604
(the RELEASE output).I would expect that all our current compilers (MSVC, Clang, and GCC) are producing correct results for
return (float)u64
and we no longer need to do a two step conversion. If any are still producing incorrect results, we should log bugs against them and we'd need to hand roll an implementation that does the conversion instead (I can give reference to one if we need it, but I don't think we do).Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. I tried both the one- and two-step conversion on x64, and in Debug and Release, I get
1600094604
. So it looks like the MSVC/Clang/GCC codegen on the const-folding path is matching what RyuJIT currently emits in Debug for x64. For this test case, is the definition of "correct" architecture-dependent? I'm guessing there's some subtle difference in rounding behavior between ARM64 and x64?Regardless of the answer to that, I think you're right that we can update the cast logic for all platforms. I can do that in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? I don't see the same behavior in godbolt: https://godbolt.org/z/s5MrTWaYM
-- note that
m1
andm2
return the same result (raw bits0x5f5f818b
) whilem3
returns one higher (raw bits0x5f5f818c
)No, we should be deterministic across all platforms for such a case; and correspondingly the added test should be enabled for all platforms as well, not just be Arm64 specific
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out the
ulong -> float
cast on x64 is represented in IR asulong -> double -> float
, while on ARM64, we combine it into one cast during morph. I'm guessing we have to enable that path for all platforms, too.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the importer is creating
ulong -> double -> float
forulong -> float
conversions in IL? That sounds wrong given that those are not equivalent (as the example shows). If morph is making that transformation in the opposite direction it likewise sounds wrong.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roslyn is likely relying on the JIT implementation for the constant folding here, not realizing it's incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree the helper has downsides too. But Roslyn could avoid using it when not targeting something with it available. For example, I imagine it could emit the equivalent of
in those cases (or whatever the right way is to do the conversion manually). Signed long -> float conversion is representable with just
conv.r4
, I believe.I don't think we need to try to fix this in .NET 9, but we may want to unify the debug/release behavior on something. We should probably open an issue for more discussion about the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened #106646 to track this. I think ARM64 was the only platform where Debug/Release behavior could diverge, due to const-folding always doing a two-step cast and
ucvtf/scvtf
being able to encodeulong/long -> float
casts. Should we take this PR as-is for .NET 9?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect this also repros on AVX512 capable hardware for x64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I can repro it on my AVX512 machine, though this PR's changes seem to fix it. Here's the Debug codegen:
And here's Release:
Both now output
1600094603
. I can try tweaking the test to only run on ARM64, or on x64 ifAvx512.IsSupported
is true.