RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v2]

Aleksey Shipilev shade at openjdk.org
Wed May 22 11:11:04 UTC 2024


On Wed, 22 May 2024 02:53:23 GMT, kuaiwei <duke at openjdk.org> wrote:

>> he origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues:
>> 1 It show regression in some platform, like Apple silicon in mac os
>> 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld"
>> 
>> It can be fixed by:
>> 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2)
>> 2 Check the special pattern and merge the subsequent dmb.
>> 
>> It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these.
>> 
>> This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern.
>> 
>> In previous PR https://github.com/openjdk/jdk/pull/18467 , I tried an implementation to use state machine for merging. But it looks risky to pending instruction during emitting.
>
> kuaiwei has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Make MacroAssembler::merge more clear

Cursory review:

src/hotspot/cpu/aarch64/globals_aarch64.hpp line 127:

> 125:   product(ccstr, UseBranchProtection, "none",                           \
> 126:           "Branch Protection to use: none, standard, pac-ret")          \
> 127:   product(bool, AlwaysMergeDMB, true, DIAGNOSTIC,                      \

Suggestion:

  product(bool, AlwaysMergeDMB, true, DIAGNOSTIC,                       \

test/hotspot/gtest/aarch64/test_assembler_aarch64.cpp line 93:

> 91: }
> 92: 
> 93: TEST_VM(AssemblerAArch64, merge_dmb) {

Given the previous experience with barrier merges that prompted the backout, I would prefer to have a more comprehensive test here, maybe an additional one. I am thinking something like the exhaustive combination of 4 back-to-back barriers of each of 5 types. This gives us 5^4 = 625 test cases, which I think is still manageable.

test/hotspot/gtest/aarch64/test_assembler_aarch64.cpp line 198:

> 196: }
> 197: 
> 198: TEST_VM(AssemblerAArch64, merge_ldst) {

This test seems to be irrelevant for the issue at hand? Tests `ld/st` -> `ldp/stp` merging, not the barrier merges?

-------------

PR Review: https://git.openjdk.org/jdk/pull/19278#pullrequestreview-2070795468
PR Review Comment: https://git.openjdk.org/jdk/pull/19278#discussion_r1609708317
PR Review Comment: https://git.openjdk.org/jdk/pull/19278#discussion_r1609759284
PR Review Comment: https://git.openjdk.org/jdk/pull/19278#discussion_r1609707981


More information about the hotspot-dev mailing list