RFR: 8343541: C1: Plain memory accesses are emitted with membars with +AlwaysAtomicAccesses [v2]
Xiaolong Peng
xpeng at openjdk.org
Tue Nov 19 21:33:57 UTC 2024
On Mon, 18 Nov 2024 09:30:54 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:
>> C1 and C2 have different implementations for `+AlwaysAtomicAccesses`. The [C2 impl](https://github.com/openjdk/jdk/blob/4a7ce1d7c1bd4b751063b98cf8bedcd27055760b/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp#L410) only guarantees atomicity, so no membars are emitted for plain memory accesses, but C1 treats such accesses the same as volatile accesses and therefore emits membars. This change removes the unnecessary membars in C1 for `+AlwaysAtomicAccesses`.
>>
>> The change has been verified with very simple JMH benchmarks that measure the latency of reading/writing a long/volatile long variable 10000 times, run with the VM options `-XX:TieredStopAtLevel=3 -XX:+UnlockExperimentalVMOptions -XX:+AlwaysAtomicAccesses`:
>>
>> Before the fix:
>>
>> Benchmark                                   Mode  Cnt       Score      Error  Units
>> AlwaysAtomicAccesses.testReadLong           avgt    5   58711.131 ±  716.940  ns/op
>> AlwaysAtomicAccesses.testReadVolatileLong   avgt    5   59014.735 ±  675.354  ns/op
>> AlwaysAtomicAccesses.testWriteLong          avgt    5  115817.978 ±  302.089  ns/op
>> AlwaysAtomicAccesses.testWriteVolatileLong  avgt    5  116317.835 ± 1451.365  ns/op
>>
>>
>> After the fix:
>>
>> Benchmark                                   Mode  Cnt       Score      Error  Units
>> AlwaysAtomicAccesses.testReadLong           avgt    5   49651.527 ±  159.948  ns/op
>> AlwaysAtomicAccesses.testReadVolatileLong   avgt    5   58668.844 ±  316.029  ns/op
>> AlwaysAtomicAccesses.testWriteLong          avgt    5   23008.361 ±   10.947  ns/op
>> AlwaysAtomicAccesses.testWriteVolatileLong  avgt    5  116440.017 ± 1240.832  ns/op
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
> renaming
Thanks all for the review!
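
For anyone who wants to reproduce numbers like those quoted above, the benchmark bodies might look roughly like the sketch below. This is not the benchmark source from the PR: the class name, the field names, and the loop shape are assumptions based only on the description ("reading/writing long/volatile long variable 10000 times") and on the method names in the JMH output. The JMH annotations are left out so that the class compiles without the JMH dependency.

```java
// Hypothetical sketch; class and field names are illustrative and are not
// taken from the PR. In a real JMH harness the class would carry
// @State(Scope.Thread) and each test method @Benchmark, and the read results
// would be returned (as here) so the JIT cannot discard the loads.
final class AlwaysAtomicAccessesSketch {
    static final int ITERATIONS = 10_000;

    long plainField = 1L;             // plain 64-bit field
    volatile long volatileField = 1L; // volatile 64-bit field

    // Reads the plain field ITERATIONS times and returns the sum.
    long testReadLong() {
        long sum = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            sum += plainField;
        }
        return sum;
    }

    // Reads the volatile field ITERATIONS times and returns the sum.
    long testReadVolatileLong() {
        long sum = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            sum += volatileField;
        }
        return sum;
    }

    // Writes the plain field ITERATIONS times.
    void testWriteLong() {
        for (int i = 0; i < ITERATIONS; i++) {
            plainField = i;
        }
    }

    // Writes the volatile field ITERATIONS times.
    void testWriteVolatileLong() {
        for (int i = 0; i < ITERATIONS; i++) {
            volatileField = i;
        }
    }
}
```

With a shape like this, only the plain-field methods should be affected by the fix, which matches the quoted results: the volatile variants stay roughly where they were, while the plain write drops sharply.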
-------------
PR Comment: https://git.openjdk.org/jdk/pull/22191#issuecomment-2486793350