RFR: 8326936: RISC-V: Shenandoah GC crashes due to incorrect atomic memory operations

Fei Yang fyang at openjdk.org
Wed Feb 28 11:28:53 UTC 2024


On Wed, 28 Feb 2024 09:55:37 GMT, MaxXing <duke at openjdk.org> wrote:

> With compressed oops enabled, Shenandoah GC crashes in the concurrent marking phase due to some incorrect atomic memory operations. This resulted in the failure of multiple related tests, including `gc/shenandoah/TestSmallHeap.java`, `gc/metaspace/TestMetaspacePerfCounters.java#Shenandoah-64`, and so on, tested on XuanTie C910 and QEMU.
> 
> This failure is related to a GCC bug we recently discovered: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114130.
> 
> In detail, there's a pattern in method `ShenandoahHeap::conc_update_with_forwarded`:
> 
> 
> if (in_collection_set(obj)) {
>   // Dereferences `obj` without explicit null check.
>   oop fwd = ShenandoahBarrierSet::resolve_forwarded_not_null(obj);
>   // Then calls atomic built-in.
>   atomic_update_oop(fwd, p, obj);
> }
> 
> 
> `atomic_update_oop` then compresses `obj` into a 32-bit word and calls the built-in `__atomic_compare_exchange`. The latter produces incorrect assembly that comparing this unsigned 32-bit word with the sign-extended result of `lr.w` instructions.
> 
> This bug can be bypassed by adding an explicit null check (like `if (obj && in_collection_set(obj))`), or adding compiler flag `-fno-delete-null-pointer-checks`.
> 
> In previous commits, JDK-8316186 removed RISC-V's `PlatformCmpxchg<4>` but left the flag `-fno-delete-null-pointer-checks` enabled. Then JDK-8316893 removed the flag and made the bug visible. This patch adds `PlatformCmpxchg<4>` back.

Hi, I don't witness such an issue when running those regression tests with jdk binaries built with native GCC-12 on Licheepi-4a or Unmatched boards. Do you know why? Does that mean this will only trigger for jdk binaries built with GCC-13? Guess @robehn who owns https://bugs.openjdk.org/browse/JDK-8316186 might also be interested in this.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18039#issuecomment-1968778651


More information about the hotspot-runtime-dev mailing list