Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8353237: [AArch64] Incorrect result of VectorizedHashCode intrinsic on Cortex-A53 #24489

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

voitylov
Copy link

@voitylov voitylov commented Apr 7, 2025

The root of the problem is that VectorizedHashCode intrinsic introduced by JDK-8341194 is not aware of JDK-8079203. JDK-8079203 generates additional nop with madd instruction on Cortex-A53 as a workaround for Cortex-A53 erratum 835769 "AArch64 multiply-accumulate instruction might produce incorrect result". Current VectorizedHashCode intrinsic calculates byte offset to jump inside the unrolled loop code. It assumes 2 instructions per each unrolled iteration (load and madd). JDK-8079203 adds additional nop for Cortex-A53, which breaks offset calculation logic.
 
Current offset calculation logic is using shift instead of multiplication, power-of-2 number instructions are present in each unrolled loop iteration. To keep it simple, this fix adds one more nop into each loop iteration on Cortex-A53 in order to have 4 instruction per iteration, which is also a power-of-2. To account for that, the shift argument for offset calculation logic is increased by 1, because each loop iteration has 2 times more instructions on Cortex-A53.
 
This fix is tested on Raspberry Pi 3 (based on Cortex-A53) by running initially reported application and by running hotspot jtreg tests (not a single test could be run on Cortex-A53 before the fix). After the fix, the specialized test hotspot/jtreg/compiler/intrinsics/TestArraysHashCode.java passes.

The performance gain from the intrinsic is also observed on Cortex-A53 using the ArraysHashCode benchmark.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8353237: [AArch64] Incorrect result of VectorizedHashCode intrinsic on Cortex-A53 (Bug - P2)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24489/head:pull/24489
$ git checkout pull/24489

Update a local copy of the PR:
$ git checkout pull/24489
$ git pull https://git.openjdk.org/jdk.git pull/24489/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24489

View PR using the GUI difftool:
$ git pr show -t 24489

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24489.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 7, 2025

👋 Welcome back avoitylov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Apr 7, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 7, 2025
@openjdk
Copy link

openjdk bot commented Apr 7, 2025

@voitylov The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

mlbridge bot commented Apr 7, 2025

Webrevs

@eme64
Copy link
Contributor

eme64 commented Apr 8, 2025

@voitylov How can this bug be reproduced? Can you write a reproducer to attach to this PR?

Also: the title sounds a bit generic: "on some hardware" ... is it all aarch64, or something more specific? You could change the PR title.

@koutheir
Copy link

koutheir commented Apr 8, 2025

How can this bug be reproduced? Can you write a reproducer to attach to this PR?

I define a Java source file like this:

// A.java
public class A {
    public static void main(String[] args) {
        System.out.println("HelloWorld!");
    }
}

Then I build its class file on my computer (javac A.java). I transfer both files to the Rock64 device running Linux/AArch64, and I get the following behaviors.

First, the good behaviors:

# Load A.class:
$ jdk-25/bin/java A
HelloWorld!

# Compile A.java then load it:
$ jdk-25/bin/java -XX:+UnlockDiagnosticVMOptions -XX:-UseVectorizedHashCodeIntrinsic A.java
HelloWorld!

Then, the wrong behavior:

$ jdk-25/bin/java A.java
An exception has occurred in the compiler ((version info not available)). Please file a bug against the Java compiler via the Java bug reporting page (https://bugreport.java.com) after checking the Bug Database (https://bugs.java.com) for duplicates. Include your program, the following diagnostic, and the parameters passed to the Java compiler in your report. Thank you.
java.lang.NoClassDefFoundError: com/sun/tools/javac/jvm/ClassReader$AttributeReader
        at jdk.compiler/com.sun.tools.javac.jvm.ClassReader.initAttributeReaders(ClassReader.java:885)
        at jdk.compiler/com.sun.tools.javac.jvm.ClassReader.<init>(ClassReader.java:308)
        at jdk.compiler/com.sun.tools.javac.jvm.ClassReader.instance(ClassReader.java:270)
        at jdk.compiler/com.sun.tools.javac.code.ClassFinder.<init>(ClassFinder.java:186)
        at jdk.compiler/com.sun.tools.javac.code.ClassFinder.instance(ClassFinder.java:178)
        at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.<init>(JavaCompiler.java:400)
        at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.instance(JavaCompiler.java:129)
        at jdk.compiler/com.sun.tools.javac.processing.JavacProcessingEnvironment.<init>(JavacProcessingEnvironment.java:209)
        at jdk.compiler/com.sun.tools.javac.processing.JavacProcessingEnvironment.instance(JavacProcessingEnvironment.java:194)
        at jdk.compiler/com.sun.tools.javac.api.BasicJavacTask.initPlugins(BasicJavacTask.java:218)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.prepareCompiler(JavacTaskImpl.java:204)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.parseInternal(JavacTaskImpl.java:257)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.invocationHelper(JavacTaskImpl.java:152)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.parse(JavacTaskImpl.java:248)
        at jdk.compiler/com.sun.tools.javac.launcher.ProgramDescriptor.of(ProgramDescriptor.java:71)
        at jdk.compiler/com.sun.tools.javac.launcher.SourceLauncher.run(SourceLauncher.java:132)
        at jdk.compiler/com.sun.tools.javac.launcher.SourceLauncher.main(SourceLauncher.java:76)
Caused by: java.lang.ClassNotFoundException: com.sun.tools.javac.jvm.ClassReader$AttributeReader
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:580)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:490)
        ... 17 more
Exception in thread "main" java.lang.IllegalStateException: java.lang.NoClassDefFoundError: com/sun/tools/javac/jvm/ClassReader$AttributeReader
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.parse(JavacTaskImpl.java:252)
        at jdk.compiler/com.sun.tools.javac.launcher.ProgramDescriptor.of(ProgramDescriptor.java:71)
        at jdk.compiler/com.sun.tools.javac.launcher.SourceLauncher.run(SourceLauncher.java:132)
        at jdk.compiler/com.sun.tools.javac.launcher.SourceLauncher.main(SourceLauncher.java:76)
Caused by: java.lang.NoClassDefFoundError: com/sun/tools/javac/jvm/ClassReader$AttributeReader
        at jdk.compiler/com.sun.tools.javac.jvm.ClassReader.initAttributeReaders(ClassReader.java:885)
        at jdk.compiler/com.sun.tools.javac.jvm.ClassReader.<init>(ClassReader.java:308)
        at jdk.compiler/com.sun.tools.javac.jvm.ClassReader.instance(ClassReader.java:270)
        at jdk.compiler/com.sun.tools.javac.code.ClassFinder.<init>(ClassFinder.java:186)
        at jdk.compiler/com.sun.tools.javac.code.ClassFinder.instance(ClassFinder.java:178)
        at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.<init>(JavaCompiler.java:400)
        at jdk.compiler/com.sun.tools.javac.main.JavaCompiler.instance(JavaCompiler.java:129)
        at jdk.compiler/com.sun.tools.javac.processing.JavacProcessingEnvironment.<init>(JavacProcessingEnvironment.java:209)
        at jdk.compiler/com.sun.tools.javac.processing.JavacProcessingEnvironment.instance(JavacProcessingEnvironment.java:194)
        at jdk.compiler/com.sun.tools.javac.api.BasicJavacTask.initPlugins(BasicJavacTask.java:218)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.prepareCompiler(JavacTaskImpl.java:204)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.parseInternal(JavacTaskImpl.java:257)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.invocationHelper(JavacTaskImpl.java:152)
        at jdk.compiler/com.sun.tools.javac.api.JavacTaskImpl.parse(JavacTaskImpl.java:248)
        ... 3 more
Caused by: java.lang.ClassNotFoundException: com.sun.tools.javac.jvm.ClassReader$AttributeReader
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:580)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:490)
        ... 17 more

For reference:

$ jdk-25/bin/java -version
openjdk version "25-ea" 2025-09-16
OpenJDK Runtime Environment (build 25-ea+16-1816)
OpenJDK 64-Bit Server VM (build 25-ea+16-1816, mixed mode, sharing)

$ uname -a
Linux rock64-aarch64-2 6.13.5-3-aarch64-ARCH #1 SMP PREEMPT_DYNAMIC Wed Mar  5 18:09:23 MST 2025 aarch64 GNU/Linux

$ cat /proc/cpuinfo
processor       : 0
BogoMIPS        : 48.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor       : 1
BogoMIPS        : 48.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor       : 2
BogoMIPS        : 48.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor       : 3
BogoMIPS        : 48.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4

$ cat /sys/devices/system/cpu/cpu0/regs/identification/revidr_el1
0x0000000000000180

# The erratum 835769 for Cortex A53 is fixed on this CPU:
$ python3
>>> (0x0000000000000180 >> 7) & 1
1

@eme64
Copy link
Contributor

eme64 commented Apr 8, 2025

@voitylov @koutheir I am asking because I see no such reproducer attached with the PR, and that is generally required for approval of PRs ;)

Or are you saying there is already a JTREG test that triggers this bug?

@koutheir
Copy link

koutheir commented Apr 8, 2025

I'm saying that a simple HelloWorld reproduces the issue every time on the Rock64.

@voitylov voitylov changed the title 8353237: [AArch64] Incorrect result of VectorizedHashCode intrinsic on some hardware 8353237: [AArch64] Incorrect result of VectorizedHashCode intrinsic on Cortex-A53 Apr 8, 2025
@voitylov
Copy link
Author

voitylov commented Apr 8, 2025

The title now reflects the concrete core affected by the issue. The description now lists a concrete jtreg test that can be used to reproduce the issue on that core, but like described previously, any hotspot jtreg test used to fail on that core before the fix. JDK 24 is basically DOA on Cortex-A53.

@voitylov
Copy link
Author

voitylov commented Apr 8, 2025

Before:

bellsoft@rpi3-0:~ $ ./jdk-24.0/bin/java -Xcomp -version
openjdk version "24-internal" 2025-03-18
OpenJDK Runtime Environment (build 24-internal-adhoc.bellsoft.jdkgit)
null (build null, null)

After:

bellsoft@rpi3-0:~ $ ./jdk-24.0/bin/java -Xcomp -version
openjdk version "24-internal" 2025-03-18
OpenJDK Runtime Environment (build 24-internal-adhoc.bellsoft.jdkgit)
OpenJDK 64-Bit Server VM (build 24-internal-adhoc.bellsoft.jdkgit, compiled mode)

@theRealAph
Copy link
Contributor

I see that the hashcode acceleration is a recent addition, so this doesn't need backports.
Whatever we do here is going to be ugly, as far as I can see.

I suppose the interesting question is how long we intend to support Cortex A53 for.

When you build HotSpot for Cortex A53, do you use the -mfix-cortex-a53-835769 GCC option, or is that option always on, by default, on the GCC that you use? I have no objection to this patch, but I wonder if this Cortex A53 bug might also fall foul of GCC-generated code on standard HotSpot builds.

@voitylov
Copy link
Author

voitylov commented Apr 9, 2025

It will have to be backported to 21u, as the intrinsic is already in 21u-dev. I'll do that once we address 8353237 in jdk/jdk.

Our cross compiler has, among other erratum flags, -mfix-cortex-a53-835769 on by default. IIRC Linaro also does this, but hard to tell if all vendors building OpenJDK do that.

Note that JDK-8079203 does not check the specific erratum flags in a processor. Instead, it adds nop based on processor type, penalizing all A53s. I could not find the history of that decision, maybe because not all A53 implementors used the erratum flags, or for the sake of simplicity. It's also consistent with a typical GCC setting. I decided not to touch this part because of that.

@theRealAph
Copy link
Contributor

Our cross compiler has, among other erratum flags, -mfix-cortex-a53-835769 on by default. IIRC Linaro also does this, but hard to tell if all vendors building OpenJDK do that.

Hmm. If we're going to fix this problem properly I guess we should add that flag to the build config. Maybe another PR?

@voitylov
Copy link
Author

voitylov commented Apr 9, 2025

Maybe, but I'm not sure this is the right thing to enforce in the build system across the board either. For example, a cloud vendor with a big Arm fleet without A53s may find such a setting undesirable.

@theRealAph
Copy link
Contributor

Maybe, but I'm not sure this is the right thing to enforce in the build system across the board either. For example, a cloud vendor with a big Arm fleet without A53s may find such a setting undesirable.

I don't think there is any possibility that a big cloud vendor would notice. From what I see of GCC-generated code, added NOPs are very rare, because GCC tends to schedule loads long before uses. It's likely that there would be no difference to any GCC-generated code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler [email protected] rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

4 participants