RFR: 8351140: RISC-V: Intrinsify Unsafe::setMemory [v14]

Anjian-Wen duke at openjdk.org
Wed May 21 09:40:52 UTC 2025


On Tue, 20 May 2025 12:50:08 GMT, Anjian-Wen <duke at openjdk.org> wrote:

>> Following [JDK-8329331](https://bugs.openjdk.org/browse/JDK-8329331), add the RISC-V Unsafe::setMemory intrinsic generator, generate_unsafe_setmemory. This intrinsic substantially reduces Unsafe::setMemory time.
>> 
>> On my musebook, the JMH test micro:java.lang.foreign.MemorySegmentZeroUnsafe gives the results below.
>> 
>> before the patch
>> 
>> Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
>> MemorySegmentZeroUnsafe.panama       true       1  avgt   30   24.198 ± 0.392  ns/op
>> MemorySegmentZeroUnsafe.panama       true       2  avgt   30   20.688 ± 0.013  ns/op
>> MemorySegmentZeroUnsafe.panama       true       3  avgt   30   20.703 ± 0.045  ns/op
>> MemorySegmentZeroUnsafe.panama       true       4  avgt   30   20.053 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       5  avgt   30   20.682 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       6  avgt   30   20.732 ± 0.061  ns/op
>> MemorySegmentZeroUnsafe.panama       true       7  avgt   30   21.403 ± 0.096  ns/op
>> MemorySegmentZeroUnsafe.panama       true       8  avgt   30   25.268 ± 0.197  ns/op
>> MemorySegmentZeroUnsafe.panama       true      15  avgt   30   27.481 ± 0.195  ns/op
>> MemorySegmentZeroUnsafe.panama       true      16  avgt   30   27.577 ± 0.019  ns/op
>> MemorySegmentZeroUnsafe.panama       true      63  avgt   30  208.893 ± 2.795  ns/op
>> MemorySegmentZeroUnsafe.panama       true      64  avgt   30  199.167 ± 0.936  ns/op
>> MemorySegmentZeroUnsafe.panama       true     255  avgt   30  220.672 ± 0.879  ns/op
>> MemorySegmentZeroUnsafe.panama       true     256  avgt   30  246.256 ± 0.756  ns/op
>> MemorySegmentZeroUnsafe.panama      false       1  avgt   30   23.849 ± 0.088  ns/op
>> MemorySegmentZeroUnsafe.panama      false       2  avgt   30   20.671 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false       3  avgt   30   20.694 ± 0.037  ns/op
>> MemorySegmentZeroUnsafe.panama      false       4  avgt   30   20.048 ± 0.010  ns/op
>> MemorySegmentZeroUnsafe.panama      false       5  avgt   30   20.684 ± 0.020  ns/op
>> MemorySegmentZeroUnsafe.panama      false       6  avgt   30   20.685 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama      false       7  avgt   30   21.383 ± 0.086  ns/op
>> MemorySegmentZeroUnsafe.panama      false       8  avgt   30   25.684 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false      15  avgt   30   27.593 ± 0.043  ns/op
>> MemorySegmentZeroUnsafe.panama      false      16  avgt   30   28.437 ± 0.228  ns/o...
>
> Anjian-Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   change all the t0 with tmp_reg

Thanks for your review!
I think the test results above may not fully reflect how aligned and unaligned destinations differ in their impact on the tail. As I understand it, if the dest address is aligned, the aligned section above executes 0 to 4 fewer store instructions than the section that follows it.
I can remove the aligned section and run the JMH test to see how it performs.
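
To make the store-count argument concrete, here is a minimal sketch of a hypothetical fill strategy, not the code generated by generate_unsafe_setmemory. It assumes a head that uses 1-, 2-, and 4-byte stores to reach 8-byte alignment, a body of 8-byte stores, and a tail of at most one 4-, 2-, and 1-byte store. The class name `FillStoreCount` and the method `stores` are made up for illustration, and the counts apply to this model only.

```java
public final class FillStoreCount {
    /**
     * Hypothetical model: number of stores needed to fill `count` bytes
     * starting at byte address `addr`. This is an assumption for
     * illustration, not the RISC-V stub's actual instruction sequence.
     */
    public static int stores(long addr, long count) {
        int n = 0;
        // Head: 1-, 2-, then 4-byte stores until addr is 8-byte aligned,
        // as long as enough bytes remain for that store width.
        for (int width = 1; width < 8 && count >= width; width <<= 1) {
            if ((addr & width) != 0) {
                addr += width;
                count -= width;
                n++;
            }
        }
        // Body: one 8-byte store per full word.
        n += (int) (count / 8);
        count %= 8;
        // Tail: one store per set bit in the remaining 0..7 bytes.
        n += Long.bitCount(count);
        return n;
    }

    public static void main(String[] args) {
        for (long count : new long[] {7, 16, 64}) {
            System.out.println("count=" + count
                    + " aligned=" + stores(0, count)
                    + " unaligned=" + stores(1, count));
        }
    }
}
```

In this model an aligned destination skips the head entirely, so any difference from the unaligned case comes from the head stores plus whatever shift they cause in the tail. Whether that difference is visible in JMH is exactly what the measurement would have to show.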

> > @feilongjiang @RealFYang The MemorySegmentFillUnsafe.unsafe test shows that the time drops from `29.728 ± 0.294` to `23.747 ± 0.215` when the count is 7, which is a very good result. Thanks for the commit! The JMH test results are below.
> 
> Based on the performance data after unrolling, comparing the unaligned and aligned results of `MemorySegmentFillUnsafe.unsafe` suggests that merging these 2 pieces of code could bring some benefit, i.e. keep only the unaligned one and remove the aligned one. I'm not sure, but maybe it's worth a try? At the least, it would reduce the generated code size.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23890#issuecomment-2897287517


More information about the hotspot-compiler-dev mailing list