RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code
Andrew Dinn
adinn at openjdk.java.net
Fri Mar 5 16:02:11 UTC 2021
On Fri, 5 Mar 2021 11:46:07 GMT, Andrew Dinn <adinn at openjdk.org> wrote:
>>> > OK, this makes sense to us.
>>> > Is it OK to keep the exclusive part of this patch? :-)
>>> > As far as we know, the exclusive instructions are not being revised.
>>> > And we see that `ldxr+stlxr+dmb` has been used in the Linux kernel since 2014 [1], and is still used now [2].
>>>
>>> I know that, but the Linux definition of a "full barrier" isn't quite as strong as HotSpot's `memory_order_conservative`, so we'd need a much more detailed analysis of what behaviours we can permit. Also, we'd have to find a strong reason to invest time in AArch64 without LSE instructions.
>>>
>>> > BTW, the barrier-ordered-before applies with stlxr according to the architecture specification:
>>>
>>> Sure, but so what? This is about the entire ldxr/stlxr combination and `memory_order_conservative`, in which we try to mimic Intel's "Loads and Stores Are Not Reordered with Locked Instructions" specification.
>>
>> Hi,
>>
>> For us, we still have servers used by our customers that do not support the LSE extension.
>>
>> Hm, from our point of view, `ldaxr+stlxr+dmb` and `ldxr+stlxr+dmb` provide the same ordering semantics.
>> The acquire is used to ensure that all loads/stores that follow an `ldaxr` in program order (actually the loads/stores after the `dmb` of `atomic_*default*_impl` in this case) are ordered after it, while the `dmb` already guarantees this for us.
>> Without the acquire, the loads/stores after the atomic operations still cannot pass the `dmb`.
>> Removing the acquire does not change the ordering between preceding loads/stores and `stlxr`.
>
> I agree that the code will still be correct if you change the ldaxr to ldar. While this may make some difference on machines which do not support LSE, I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case?
Correction:
I agree that the code will still be correct if you change the ldaxr to *ldxr*.
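
For readers following along, the two sequences under discussion can be sketched in portable C++ roughly as below. This is only an illustration, not the HotSpot code: the function names are hypothetical, and the instruction sequences named in the comments are what a compiler would typically emit for AArch64 without LSE, not something the C++ standard guarantees.

```cpp
#include <atomic>

// Hypothetical helper: acquire+release read-modify-write, then a full fence.
// Assumed typical AArch64 (no LSE) lowering: ldaxr / stlxr retry loop + dmb ish.
inline int add_acq_rel_then_fence(std::atomic<int>& v, int inc) {
  int old = v.fetch_add(inc, std::memory_order_acq_rel);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return old + inc;
}

// Hypothetical helper: release-only read-modify-write, then a full fence.
// Assumed typical AArch64 (no LSE) lowering: ldxr / stlxr retry loop + dmb ish.
// The argument in this thread is that the trailing barrier already keeps
// later loads/stores from moving ahead of the atomic operation, so the
// acquire on the exclusive load adds nothing.
inline int add_release_then_fence(std::atomic<int>& v, int inc) {
  int old = v.fetch_add(inc, std::memory_order_release);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return old + inc;
}
```

Both helpers return the updated value and differ only in the ordering requested on the load half of the operation. Whether the second form is strong enough for `memory_order_conservative` is the question being debated above; the sketch does not settle it.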
-------------
PR: https://git.openjdk.java.net/jdk/pull/2788