RFR: 8351949: RISC-V: Cleanup and enable store-load peephole for membars [v3]

Fei Yang fyang at openjdk.org
Mon Mar 17 11:19:55 UTC 2025


On Mon, 17 Mar 2025 07:58:50 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> Hi please consider.
>> 
>> |RVWMO| Patched|
>> | ---------- | ---------- |
>> |fence iorw,iorw| fence iorw,ow|
>> |sw  t4,120(t2) | sw  t4,120(t2) |
>> |fence ow,ir | unnecessary_membar_volatile_rvwmo |
>> | sw  t6,128(t2) // Non-volatile | sw  t6,128(t2) // Non-volatile |
>> |fence iorw,ow | fence iorw,ow|
>> |sw  t5,124(t2) |sw  t5,124(t2) |
>> 
>> |TSO                                | Patched|
>> | ---------- | ---------- |
>> | lw	a4,120(t2) | lw	a6,120(t2) |
>> | sw	a0,124(t2) | sw	t6,124(t2) |
>> | fence	iorw,iorw | unnecessary_membar_volatile_tso |
>> | sw	t4,120(t2) | sw	t4,120(t2) |
>> | fence	ow,ir | unnecessary_membar_volatile_tso |
>> | sw	t6,128(t2) | sw	t5,128(t2) |
>> | sw	t5,124(t2) // Non-volatile| sw	a1,124(t2)  // Non-volatile |
>> | fence	iorw,iorw  | unnecessary_membar_volatile_tso |
>> |...  | ... |
>> | sw	a3,120(t2) | sw	a0,120(t2) |
>> | fence	ow,ir | fence	ow,ir |
>> | lw	a7,124(t2) | lw	a5,124(t2) |
>> 
>> For the specific rvwmo volatile store + store + volatile store is around 30% faster on VF2.
>> 
>> The patch do:
>> - Separate ztso and rvwmo in ad by using UseZtso predicate.
>> - Match all that requires the same membar.
>> - Make fence/fencei protected as they shouldn't be using directly.
>> - Increased cost of membars to VOLATILE_REF_COST.
>> - Added a real_empty pipe.
>> - Change to pipe_slow on TSO (as x86).
>> 
>> Note that C2-rv64 is now superior to gcc/clang regrading fencing:
>> https://godbolt.org/z/6E3YTP15j
>> 
>> Testing jcstress, tier1 and manually reading the generated assembly.
>> Doing additional testing, but RFR it now as it may need some consideration.
>> 
>> /Robbin
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
> 
>  - Merge branch 'master' into tso-merge
>  - Review comments
>  - Fixed ws
>  - Revert NC
>  - Fixed comment
>  - UseNewCode

Hi, Thanks for the update. I have checked the changes, seems fine to me modulo several minor comments about naming. And I having been running jcstress on two of my OoO machines over the weekend. So far so good. I will let the test continue for some more time to see.

src/hotspot/cpu/riscv/riscv.ad line 7967:

> 7965:   size(0);
> 7966: 
> 7967:   format %{ "no_membar_rvtso elided/tso (empty encoding)" %}

Here: s/no_membar_rvtso/unnecessary_membar_rvtso/

src/hotspot/cpu/riscv/riscv.ad line 7969:

> 7967:   format %{ "no_membar_rvtso elided/tso (empty encoding)" %}
> 7968:   ins_encode %{
> 7969:     __ block_comment("no_membar_rvtso");

And here: s/no_membar_rvtso/unnecessary_membar_rvtso/

src/hotspot/cpu/riscv/riscv.ad line 8007:

> 8005: // RVWMO
> 8006: 
> 8007: instruct membar_rvwmo_aqcuire() %{

Maybe it's better to put `_rvwmo` as a suffix? Like `membar_aqcuire_rvwmo`. Similar for other match rules like `membar_rvwmo_release`, `membar_rvwmo_storestore`, `membar_rvwmo_lock` and `membar_rvwmo_volatile`.

-------------

PR Review: https://git.openjdk.org/jdk/pull/24035#pullrequestreview-2688965550
PR Review Comment: https://git.openjdk.org/jdk/pull/24035#discussion_r1998503604
PR Review Comment: https://git.openjdk.org/jdk/pull/24035#discussion_r1998503875
PR Review Comment: https://git.openjdk.org/jdk/pull/24035#discussion_r1998510917


More information about the hotspot-dev mailing list