RFR: JDK-8288719: [arm32] SafeFetch32 thumb interleaving causes random crashes

Thomas Stuefe stuefe at openjdk.org
Mon Jun 20 10:37:11 UTC 2022


After [JDK-8284997](https://bugs.openjdk.org/browse/JDK-8284997) delivered just a bandaid, this is hopefully the real fix.

JDK-8283326 re-implemented SafeFetch as static assembler functions. This broke arm: the VM would crash at random points, usually in Atomic::add(), usually right at startup. In most cases the VM could not even be built correctly, see JDK-8284997.

This was only reproducible if the VM was built natively, on a Raspberry Pi, inside an Ubuntu18-derived container. Buiding natively on Raspberry Pi OS was fine. Cross-building was fine too. The difference is the default instruction set the toolchain uses. We don't explicitly specify `-mthumb` or `-marm`, so we use the toolchain's default. That default seems to depend on how GCC itself was built. Ubuntu ships a GCC that has been built in thumb mode, thus defaulting to `-mthumb`, whereas Raspberry Pi OS and Fedora ship GCCs that default to `-marm`.

So, the VM proper is compiled either to arm or thumb code. The `SafeFetch32` assembly function itself uses arm code always. Why this is I don't know for sure, I assume if I wanted thumb I need to specify `.thumb_func` in the assembly. 

If the VM uses thumb, it needs to call SafeFetch32 with a switching branch instruction (BX). But the compiler-generated BL. The instruction set was not switched upon entering SafeFetch32 and garbage thumb code was executed. VM crashes soon after.

This seems to be a common problem when writing arm assembly by hand, the solution is specify `.type function`. See also [1]: "As of GCC 4.7, the .type directive is pretty much required for functions. Or, rather, it is required if you want ARM and Thumb interworking to work."

A remaining question is whether we should specify the instruction set explicitly when building on arm32, to prevent surprises like this. Preferably with a configure option.

----

Testing:
- GHAs are green, but that does not say much: they just do the usual cross building without running the executables. Even if they would run, they would be compiled with -marm and not show the default
- Both @marchof and me tested the fix with a native build on Raspberry Pi. I confirmed that the patch fixes the problem. I ran gtests, which also tests SafeFetch

-------------

Commit messages:
 - Use real static safefetch on linux arm

Changes: https://git.openjdk.org/jdk/pull/9213/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=9213&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8288719
  Stats: 3 lines in 2 files changed: 1 ins; 1 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/9213.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/9213/head:pull/9213

PR: https://git.openjdk.org/jdk/pull/9213


More information about the hotspot-dev mailing list