RFR: 8351997: AArch64: Interpreter volatile reference stores with G1 are not sequentially consistent [v2]
Aleksey Shipilev
shade at openjdk.org
Mon Jun 2 08:25:00 UTC 2025
On Mon, 2 Jun 2025 07:48:39 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:
>> The optimized fast_aputfield bytecode on AArch64 stores the field flags in r3, and performs the leading and trailing fencing depending on its volatile bit being set or not. However, r3 is also the last temp register passed in to the barrier set for reference stores, and G1 clobbers it in a way that may clear the volatile bit. Then the trailing fence won't get executed, and sequential consistency is broken.
>>
>> My fix puts the flags in r5 instead, which is the register that was used by normal aputfield bytecodes. This way, barriers don't clobber the volatile bits.
>>
>> This bug has been observed to mess up a classic Dekker duality in the java.util.concurrent.Exchanger class, leading to a hang in the test/jdk/java/util/concurrent/Exchanger/ExchangeLoops.java test that exercises it. Using G1 and -Xint a reproducer hangs 30/100 times in mach5. With the fix, the same reproducer hangs 0/100 times.
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
>
> Add comment about clobbered registers
Well, we are now introducing hunks near the `do_oop_store` calls, and thus extending the scope of the patch. At this point, we can just inline `do_oop_store` (and maybe `do_oop_load`?), like Andrew initially suggested. This would also match what RISC-V already did: https://github.com/openjdk/jdk/commit/c5a1543ee3e68775f09ca29fb07efd9aebfdb33e
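
For readers less familiar with the "Dekker duality" mentioned in the quoted description, here is a minimal sketch of the pattern, assuming two threads that each perform a volatile reference store followed by a volatile load of the other field. This is not the mach5 reproducer and not the Exchanger code; the class, field, and method names (`DekkerSketch`, `x`, `y`, `runTrials`) are made up for illustration. Under the Java memory model, the outcome where both threads read `null` is forbidden, so the count should be zero on a correct JVM.

```java
import java.util.concurrent.CountDownLatch;

public class DekkerSketch {
    // Hypothetical instance fields: volatile *reference* fields, so stores go
    // through the GC barrier path that the quoted description talks about.
    volatile Object x;
    volatile Object y;

    // Runs the two-thread race `trials` times and returns how many times
    // both threads observed null, which sequential consistency forbids.
    static int runTrials(int trials) {
        int bothNull = 0;
        for (int i = 0; i < trials; i++) {
            final DekkerSketch s = new DekkerSketch();
            final Object[] seen = new Object[2];
            final CountDownLatch start = new CountDownLatch(1);
            Thread a = new Thread(() -> {
                awaitStart(start);
                s.x = new Object(); // volatile reference store
                seen[0] = s.y;      // volatile load of the other field
            });
            Thread b = new Thread(() -> {
                awaitStart(start);
                s.y = new Object(); // volatile reference store
                seen[1] = s.x;      // volatile load of the other field
            });
            a.start();
            b.start();
            start.countDown();
            try {
                a.join(); // join() makes the writes to `seen` visible here
                b.join();
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            if (seen[0] == null && seen[1] == null) {
                bothNull++;
            }
        }
        return bothNull;
    }

    // Blocks until the latch opens so both threads start racing together.
    static void awaitStart(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("both-null outcomes: " + runTrials(1000));
    }
}
```

Running it with something like `java -Xint -XX:+UseG1GC DekkerSketch` matches the configuration named in the description, but whether a sketch this small actually hits the affected interpreter path on an unfixed AArch64 build is an assumption, not something verified here.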
-------------
PR Review: https://git.openjdk.org/jdk/pull/25483#pullrequestreview-2887283595