RFR: 8351140: RISC-V: Intrinsify Unsafe::setMemory [v14]

Hamlin Li mli at openjdk.org
Wed May 21 09:22:54 UTC 2025


On Tue, 20 May 2025 12:50:08 GMT, Anjian-Wen <duke at openjdk.org> wrote:

>> Following [JDK-8329331](https://bugs.openjdk.org/browse/JDK-8329331), add the RISC-V Unsafe::setMemory intrinsic generator, generate_unsafe_setmemory. This intrinsic substantially reduces the time spent in Unsafe setMemory.
>> 
>> On my MuseBook, the JMH test micro:java.lang.foreign.MemorySegmentZeroUnsafe gives the results below.
>> 
>> before the patch
>> 
>> Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
>> MemorySegmentZeroUnsafe.panama       true       1  avgt   30   24.198 ± 0.392  ns/op
>> MemorySegmentZeroUnsafe.panama       true       2  avgt   30   20.688 ± 0.013  ns/op
>> MemorySegmentZeroUnsafe.panama       true       3  avgt   30   20.703 ± 0.045  ns/op
>> MemorySegmentZeroUnsafe.panama       true       4  avgt   30   20.053 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       5  avgt   30   20.682 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       6  avgt   30   20.732 ± 0.061  ns/op
>> MemorySegmentZeroUnsafe.panama       true       7  avgt   30   21.403 ± 0.096  ns/op
>> MemorySegmentZeroUnsafe.panama       true       8  avgt   30   25.268 ± 0.197  ns/op
>> MemorySegmentZeroUnsafe.panama       true      15  avgt   30   27.481 ± 0.195  ns/op
>> MemorySegmentZeroUnsafe.panama       true      16  avgt   30   27.577 ± 0.019  ns/op
>> MemorySegmentZeroUnsafe.panama       true      63  avgt   30  208.893 ± 2.795  ns/op
>> MemorySegmentZeroUnsafe.panama       true      64  avgt   30  199.167 ± 0.936  ns/op
>> MemorySegmentZeroUnsafe.panama       true     255  avgt   30  220.672 ± 0.879  ns/op
>> MemorySegmentZeroUnsafe.panama       true     256  avgt   30  246.256 ± 0.756  ns/op
>> MemorySegmentZeroUnsafe.panama      false       1  avgt   30   23.849 ± 0.088  ns/op
>> MemorySegmentZeroUnsafe.panama      false       2  avgt   30   20.671 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false       3  avgt   30   20.694 ± 0.037  ns/op
>> MemorySegmentZeroUnsafe.panama      false       4  avgt   30   20.048 ± 0.010  ns/op
>> MemorySegmentZeroUnsafe.panama      false       5  avgt   30   20.684 ± 0.020  ns/op
>> MemorySegmentZeroUnsafe.panama      false       6  avgt   30   20.685 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama      false       7  avgt   30   21.383 ± 0.086  ns/op
>> MemorySegmentZeroUnsafe.panama      false       8  avgt   30   25.684 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false      15  avgt   30   27.593 ± 0.043  ns/op
>> MemorySegmentZeroUnsafe.panama      false      16  avgt   30   28.437 ± 0.228  ns/o...
>
> Anjian-Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   change all the t0 with tmp_reg

> @feilongjiang @RealFYang The MemorySegmentFillUnsafe.unsafe test shows that the time drops from `29.728 ± 0.294` to `23.747 ± 0.215` when the count is 7, which is a very good result. Thanks for the commit! The JMH test results are below.

Based on the performance data after unrolling, comparing the unaligned and aligned results of `MemorySegmentFillUnsafe.unsafe` suggests there could be some benefit in merging these two pieces of code, i.e. keeping only the unaligned one and removing the aligned one.
I'm not sure, but it may be worth a try. At the very least it would reduce the generated code size.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1724:

> 1722:     }
> 1723: 
> 1724:     // Remaining count is less than 8 bytes and address is heapword aligned.

remove this aligned code.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 1748:

> 1746:     }
> 1747: 
> 1748:     // Handle copies less than 8 bytes

keep this unaligned code.
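
To illustrate the idea of keeping a single tail path, here is a minimal sketch in plain C++ rather than the MacroAssembler code in the stub. The helper name `fill_tail_any_alignment` and its signature are hypothetical and do not come from the patch. It fills a remainder of fewer than 8 bytes by testing the bits of the count and makes no assumption about the alignment of the destination, so the same routine also covers the aligned case. How the 4-byte and 2-byte writes are lowered for unaligned addresses is left to the compiler here, whereas the real stub has to choose that explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not HotSpot code: fill the last `count` (< 8) bytes
// starting at `dst` with `value`, without assuming anything about the
// alignment of `dst`.
inline void fill_tail_any_alignment(unsigned char* dst, std::size_t count,
                                    std::uint8_t value) {
  assert(count < 8);

  // Replicate the fill byte across a 32-bit pattern.
  const std::uint32_t pattern = 0x01010101u * value;

  if (count & 4) {                  // bit 2 set: write 4 bytes
    std::memcpy(dst, &pattern, 4);  // memcpy is valid for any alignment
    dst += 4;
  }
  if (count & 2) {                  // bit 1 set: write 2 bytes
    std::memcpy(dst, &pattern, 2);
    dst += 2;
  }
  if (count & 1) {                  // bit 0 set: write the final byte
    *dst = value;
  }
}
```

Because the routine is correct for every alignment, a separate aligned-only tail would only be worth keeping if it were measurably faster, which is the question raised above.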

-------------

PR Review: https://git.openjdk.org/jdk/pull/23890#pullrequestreview-2856957909
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2099796385
PR Review Comment: https://git.openjdk.org/jdk/pull/23890#discussion_r2099796462


More information about the hotspot-compiler-dev mailing list