RFR: 8351140: RISC-V: Intrinsify Unsafe::setMemory [v12]

Fei Yang fyang at openjdk.org
Tue May 20 10:18:55 UTC 2025


On Tue, 20 May 2025 08:46:07 GMT, Anjian-Wen <duke at openjdk.org> wrote:

>> Following [JDK-8329331](https://bugs.openjdk.org/browse/JDK-8329331), add the RISC-V generator `generate_unsafe_setmemory` for the Unsafe::setMemory intrinsic. The intrinsic substantially reduces the time spent in Unsafe setMemory.
>> 
>> On my musebook, the JMH test micro:java.lang.foreign.MemorySegmentZeroUnsafe gives the results below.
>> 
>> Before the patch:
>> 
>> Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
>> MemorySegmentZeroUnsafe.panama       true       1  avgt   30   24.198 ± 0.392  ns/op
>> MemorySegmentZeroUnsafe.panama       true       2  avgt   30   20.688 ± 0.013  ns/op
>> MemorySegmentZeroUnsafe.panama       true       3  avgt   30   20.703 ± 0.045  ns/op
>> MemorySegmentZeroUnsafe.panama       true       4  avgt   30   20.053 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       5  avgt   30   20.682 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       6  avgt   30   20.732 ± 0.061  ns/op
>> MemorySegmentZeroUnsafe.panama       true       7  avgt   30   21.403 ± 0.096  ns/op
>> MemorySegmentZeroUnsafe.panama       true       8  avgt   30   25.268 ± 0.197  ns/op
>> MemorySegmentZeroUnsafe.panama       true      15  avgt   30   27.481 ± 0.195  ns/op
>> MemorySegmentZeroUnsafe.panama       true      16  avgt   30   27.577 ± 0.019  ns/op
>> MemorySegmentZeroUnsafe.panama       true      63  avgt   30  208.893 ± 2.795  ns/op
>> MemorySegmentZeroUnsafe.panama       true      64  avgt   30  199.167 ± 0.936  ns/op
>> MemorySegmentZeroUnsafe.panama       true     255  avgt   30  220.672 ± 0.879  ns/op
>> MemorySegmentZeroUnsafe.panama       true     256  avgt   30  246.256 ± 0.756  ns/op
>> MemorySegmentZeroUnsafe.panama      false       1  avgt   30   23.849 ± 0.088  ns/op
>> MemorySegmentZeroUnsafe.panama      false       2  avgt   30   20.671 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false       3  avgt   30   20.694 ± 0.037  ns/op
>> MemorySegmentZeroUnsafe.panama      false       4  avgt   30   20.048 ± 0.010  ns/op
>> MemorySegmentZeroUnsafe.panama      false       5  avgt   30   20.684 ± 0.020  ns/op
>> MemorySegmentZeroUnsafe.panama      false       6  avgt   30   20.685 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama      false       7  avgt   30   21.383 ± 0.086  ns/op
>> MemorySegmentZeroUnsafe.panama      false       8  avgt   30   25.684 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false      15  avgt   30   27.593 ± 0.043  ns/op
>> MemorySegmentZeroUnsafe.panama      false      16  avgt   30   28.437 ± 0.228  ns/o...
>
> Anjian-Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update code for optimize

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1696:

> 1694:     // One byte misalignment happens.
> 1695:     __ test_bit(t0, dest, 0);
> 1696:     __ beqz(t0, L_skip_align1);

Can we use `tmp_reg` in places where `t0` is used in this function?

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1726:

> 1724:     // Remaining count is less than 8 bytes and address is heapword aligned.
> 1725:     {
> 1726:       Label L_fill_2, L_fill_1;

You can declare a local `L_exit` and remove `L_exit1`.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1730:

> 1728:       __ beqz(t0, L_fill_2);
> 1729:       __ sw(value, Address(dest, 0));
> 1730:       __ addi(dest, dest, 4);

Leave a blank line after this.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1735:

> 1733:       __ beqz(t0, L_fill_1);
> 1734:       __ sh(value, Address(dest, 0));
> 1735:       __ addi(dest, dest, 2);

Leave a blank line after this.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1749:

> 1747:     __ bind(L_fill_elements);
> 1748:     {
> 1749:       Label L_fill_2, L_fill_1;

You can declare a local `L_exit` and remove `L_exit2`.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1769:

> 1767:       __ beqz(t0, L_exit2);
> 1768:       __ sb(value, Address(dest, 0));
> 1769:       __ addi(dest, dest, 1);

No need to update `dest` here.
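
To illustrate the point about the final `addi`, here is a minimal plain C++ sketch of the tail-fill ladder quoted above. It is not the HotSpot stub: `fill_tail` is a hypothetical helper, and it assumes `value` already holds the fill byte replicated across all lanes and that fewer than 8 bytes remain. Each bit of the remaining count selects one store, and `dest` only needs to advance when another store may follow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not HotSpot code. Fills the last `count` (< 8) bytes
// at `dest`, mirroring the test_bit/beqz ladder from the quoted stub.
// Assumption: `value` holds the fill byte replicated in every byte lane.
static void fill_tail(uint8_t* dest, size_t count, uint32_t value) {
  assert(count < 8);
  if (count & 4) {                 // bit 2 set: one 4-byte store (like sw)
    std::memcpy(dest, &value, 4);
    dest += 4;                     // a later store may still follow
  }
  if (count & 2) {                 // bit 1 set: one 2-byte store (like sh)
    std::memcpy(dest, &value, 2);
    dest += 2;                     // a later store may still follow
  }
  if (count & 1) {                 // bit 0 set: one 1-byte store (like sb)
    std::memcpy(dest, &value, 1);
    // Last store: nothing reads dest afterwards, so it is not advanced.
  }
}
```

Calling `fill_tail(buf, 7, 0x5A5A5A5Au)` on an 8-byte buffer writes bytes 0 to 6 and leaves byte 7 untouched, without any pointer update after the byte store.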

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2097576972
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2097572103
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2097573965
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2097574187
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2097572321
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2097570585


More information about the hotspot-compiler-dev mailing list