RFR: 8338727: RISC-V: Avoid synthetic data dependency in nmethod barrier on Ztso
Fei Yang
fyang at openjdk.org
Sat Aug 24 14:56:03 UTC 2024
On Wed, 21 Aug 2024 10:01:21 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
> Hi please consider,
>
> On TSO we don't need the synthetic data dependency in between the loads.
> Also added some comment about this.
>
> Sanity tested
This change looks fine. But I have a question about the code comment.
src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp line 281:
> 279: // Embed an synthetic data dependency to order the guard load
> 280: // before the epoch load. (xor + add is standard way)
> 281: // Note: This may be slower than using a membar(load|load) (fence r,r).
But the RISC-V ISA spec describes this as a lightweight ordering mechanism compared with a FENCE R,R.
Here is what I read in the spec:
Like other modern memory models, the RVWMO memory model uses syntactic rather than semantic dependencies.
In other words, this definition depends on the identities of the registers being accessed by different instructions,
not the actual contents of those registers. This means that an address, control, or data dependency must be enforced
even if the calculation could seemingly be “optimized away”. This choice ensures that RVWMO remains compatible
with code that uses these false syntactic dependencies as a lightweight ordering mechanism.
    ld  a1,0(s0)
    xor a2,a1,a1
    add s1,s1,a2
    ld  a5,0(s1)
Figure A.10: A syntactic address dependency
For example, there is a syntactic address dependency from the memory operation generated by the
first instruction to the memory operation generated by the last instruction in Figure A.10, even
though a1 XOR a1 is zero and hence has no effect on the address accessed by the second load.
The benefit of using dependencies as a lightweight synchronization mechanism is that the ordering
enforcement requirement is limited only to the specific two instructions in question.
Other non-dependent instructions may be freely reordered by aggressive implementations.
One alternative would be to use a load-acquire, but this would enforce ordering for the first load
with respect to all subsequent instructions. Another would be to use a FENCE R,R, but this would
include all previous and all subsequent loads, making this option more expensive.
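To make the two alternatives in that last paragraph concrete, here is a minimal C++ sketch of ordering one load before another with a load-acquire and with a fence. The names guard_value, global_epoch and Snapshot are hypothetical stand-ins, not HotSpot code. Two caveats: the C++ acquire fence is an approximation, since compilers typically lower it to something stronger than FENCE R,R, and portable C++ has no way to express the syntactic dependency itself (memory_order_consume is generally treated as acquire), which is why the barrier is written in assembly.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the two values the barrier reads.
std::atomic<std::uint32_t> guard_value{0};
std::atomic<std::uint32_t> global_epoch{0};

struct Snapshot {
    std::uint32_t guard;
    std::uint32_t epoch;
};

// Alternative 1: load-acquire on the first load.
// Orders the guard load before ALL later memory accesses, not just the epoch load.
Snapshot read_with_load_acquire() {
    std::uint32_t g = guard_value.load(std::memory_order_acquire);
    std::uint32_t e = global_epoch.load(std::memory_order_relaxed);
    return {g, e};
}

// Alternative 2: two relaxed loads separated by a fence.
// Orders ALL earlier loads before all later accesses, not just this pair.
Snapshot read_with_fence() {
    std::uint32_t g = guard_value.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    std::uint32_t e = global_epoch.load(std::memory_order_relaxed);
    return {g, e};
}
```

Both functions return the same values; they differ only in how many other memory accesses get ordered along with the pair, which is the cost the spec text is comparing against the dependency approach.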
-------------
PR Review: https://git.openjdk.org/jdk/pull/20661#pullrequestreview-2258753836
PR Review Comment: https://git.openjdk.org/jdk/pull/20661#discussion_r1730010163