RFR: 8325821: [REDO] use "dmb.ishst+dmb.ishld" for release barrier [v7]

Andrew Haley aph at openjdk.org
Tue Apr 16 14:08:42 UTC 2024


On Fri, 12 Apr 2024 08:00:20 GMT, kuaiwei <duke at openjdk.org> wrote:

>> The origin patch for https://bugs.openjdk.org/browse/JDK-8324186 has 2 issues:
>> 1 It show regression in some platform, like Apple silicon in mac os
>> 2 Can not handle instruction sequence like "dmb.ishld; dmb.ishst; dmb.ishld; dmb.ishld"
>> 
>> It can be fixed by:
>> 1 Enable AlwaysMergeDMB by default, only disable it in architecture we can see performance improvement (N1 or N2)
>> 2 Check the special pattern and merge the subsequent dmb.
>> 
>> It also fix a bug when code buffer is expanding, st/ld/dmb can not be merged. I added unit tests for these.
>> 
>> This patch still has a unhandled case. Insts like "dmb.ishld; dmb.ishst; dmb.ish", it will merge the last 2 instructions and can not merge all three. Because when emitting dmb.ish, if merge all previous dmbs, the code buffer will shrink the size. I think it may break some resumption and think it's not a common pattern.
>> 
>> - Update:
>> After discussion, I made a new implementation based on finite state machine for merging instruction. The mergeable instruction will be pending in fsm until next unmergeable instruction.
>
> kuaiwei has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix arm build error

Argh, I found it. It happens because C2 calls `masm->offset()` from `PhaseOutput::fill_buffer()` after every node is emitted. So that trick isn't going to work.

It was worth a try, but given that C2 expects offset() to be correct after every node, I think we're stuck. Maybe the last idea you had is the best possible without C2 tinkering.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18467#issuecomment-2059182775


More information about the hotspot-dev mailing list