RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent

Andrew Haley aph at openjdk.org
Wed May 28 09:57:52 UTC 2025


On Wed, 28 May 2025 09:27:56 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken.
>> 
>> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits.
>> 
>> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times.
>
> Good catch.
> 
> It's useful to look at how this mistake was made. The code was edited by four or five different authors, none of whom were intimately familiar with the port, leaving traps into which others fell. Silently clobbering a register in a convenience method is so dangerous that it (eh, probably) should never be done.
> 
> `do_oop_store` does nothing useful. Please delete it and replace its usages with explicit calls to `store_heap_oop`, with clearly labelled parameters, like this:
> 
> 
> diff --git a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> index 4c1e4ce3a05..bf688cc01b7 100644
> --- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> +++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
> @@ -1144,7 +1144,7 @@ void TemplateTable::aastore() {
>    // Get the value we will store
>    __ ldr(r0, at_tos());
>    // Now store using the appropriate barrier
> -  do_oop_store(_masm, element_address, r0, IS_ARRAY);
> +  __ store_heap_oop(element_address, r0, /*temps*/ r10, r11, r3, IS_ARRAY);
>    __ b(done);
>  
>    // Have a null in r0, r3=array, r2=index.  Store null at ary[idx]

> I agree with you @theRealAph, but I wonder if we should separate the bug fix (which probably needs extensive back porting), from the very reasonable refactoring you propose.

I could live with that, but IMO it's reasonable to do it now.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25483#issuecomment-2915710842


More information about the hotspot-runtime-dev mailing list