RFR: 8351140: RISC-V: Intrinsify Unsafe::setMemory [v14]
Anjian-Wen
duke at openjdk.org
Wed May 21 09:40:52 UTC 2025
On Tue, 20 May 2025 12:50:08 GMT, Anjian-Wen <duke at openjdk.org> wrote:
>> From [JDK-8329331](https://bugs.openjdk.org/browse/JDK-8329331), add generate_unsafe_setmemory, the RISC-V generator for the Unsafe::setMemory intrinsic. This intrinsic substantially reduces the time spent in Unsafe setMemory calls.
>>
>> On my MuseBook, the JMH test micro:java.lang.foreign.MemorySegmentZeroUnsafe gives the results below.
>>
>> before the patch
>>
>> Benchmark                       (aligned)  (size)  Mode  Cnt    Score    Error  Units
>> MemorySegmentZeroUnsafe.panama       true       1  avgt   30   24.198 ± 0.392  ns/op
>> MemorySegmentZeroUnsafe.panama       true       2  avgt   30   20.688 ± 0.013  ns/op
>> MemorySegmentZeroUnsafe.panama       true       3  avgt   30   20.703 ± 0.045  ns/op
>> MemorySegmentZeroUnsafe.panama       true       4  avgt   30   20.053 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       5  avgt   30   20.682 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       6  avgt   30   20.732 ± 0.061  ns/op
>> MemorySegmentZeroUnsafe.panama       true       7  avgt   30   21.403 ± 0.096  ns/op
>> MemorySegmentZeroUnsafe.panama       true       8  avgt   30   25.268 ± 0.197  ns/op
>> MemorySegmentZeroUnsafe.panama       true      15  avgt   30   27.481 ± 0.195  ns/op
>> MemorySegmentZeroUnsafe.panama       true      16  avgt   30   27.577 ± 0.019  ns/op
>> MemorySegmentZeroUnsafe.panama       true      63  avgt   30  208.893 ± 2.795  ns/op
>> MemorySegmentZeroUnsafe.panama       true      64  avgt   30  199.167 ± 0.936  ns/op
>> MemorySegmentZeroUnsafe.panama       true     255  avgt   30  220.672 ± 0.879  ns/op
>> MemorySegmentZeroUnsafe.panama       true     256  avgt   30  246.256 ± 0.756  ns/op
>> MemorySegmentZeroUnsafe.panama      false       1  avgt   30   23.849 ± 0.088  ns/op
>> MemorySegmentZeroUnsafe.panama      false       2  avgt   30   20.671 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false       3  avgt   30   20.694 ± 0.037  ns/op
>> MemorySegmentZeroUnsafe.panama      false       4  avgt   30   20.048 ± 0.010  ns/op
>> MemorySegmentZeroUnsafe.panama      false       5  avgt   30   20.684 ± 0.020  ns/op
>> MemorySegmentZeroUnsafe.panama      false       6  avgt   30   20.685 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama      false       7  avgt   30   21.383 ± 0.086  ns/op
>> MemorySegmentZeroUnsafe.panama      false       8  avgt   30   25.684 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false      15  avgt   30   27.593 ± 0.043  ns/op
>> MemorySegmentZeroUnsafe.panama      false      16  avgt   30   28.437 ± 0.228  ns/o...
>
> Anjian-Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> change all the t0 with tmp_reg
Thanks for your review!
I think the test results above may not fully reflect how differently the aligned and unaligned cases affect the tail. As I understand it, if the dest address is aligned, the aligned section above executes 0 to 4 fewer store instructions than the section that follows it.
I can remove it and run the JMH test to see how it performs.
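
To make the store-count reasoning concrete, here is a small Java model of one way a gap of 0 to 4 stores could arise. It assumes, purely for illustration, that the aligned tail issues one store per set bit of the remaining byte count (a 4-byte, a 2-byte and a 1-byte store as needed), while the other tail issues one byte store per remaining byte. The class and method names are hypothetical, and this is not the code in the PR.

```java
final class TailStoreCount {
    // Assumed "aligned" tail: one store per set bit of the remaining count,
    // i.e. a 4-byte, a 2-byte and a 1-byte store as needed (tail < 8 bytes).
    static int alignedTailStores(int remaining) {
        return Integer.bitCount(remaining & 7);
    }

    // Assumed byte-wise tail: one single-byte store per remaining byte.
    static int bytewiseTailStores(int remaining) {
        return remaining & 7;
    }

    // Stores saved by the assumed aligned tail for a given tail length.
    static int storesSaved(int remaining) {
        return bytewiseTailStores(remaining) - alignedTailStores(remaining);
    }

    public static void main(String[] args) {
        for (int n = 0; n < 8; n++) {
            System.out.printf("tail=%d aligned=%d bytewise=%d saved=%d%n",
                    n, alignedTailStores(n), bytewiseTailStores(n), storesSaved(n));
        }
    }
}
```

Under this model a 7-byte tail takes 3 stores on the aligned path and 7 on the byte-wise path, so the saving is at most 4 stores per call, which is small relative to the fixed call overhead visible in the benchmark.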
> > @feilongjiang @RealFYang The MemorySegmentFillUnsafe.unsafe test shows that the time drops from `29.728 ± 0.294` to `23.747 ± 0.215` when the count is 7, which is a very good result. Thanks for the commit! The JMH test results are below.
>
> Based on the performance data after unrolling, comparing the unaligned and aligned results of `MemorySegmentFillUnsafe.unsafe` suggests that merging these two pieces of code could bring some benefit, i.e. keep only the unaligned one and remove the aligned one. I'm not sure, but maybe it's worth a try? At the very least it would reduce the generated code size.
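
As a rough illustration of the "keep only one path" idea, the sketch below fills a byte range in Java with a single code path that does not branch on the alignment of the start address: a bulk loop of 8-byte stores followed by a byte-wise tail. It is only a model of the shape of such a merged routine. The class name, method name and parameters are hypothetical, and it says nothing about how the RISC-V stub is actually written.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class MergedFillSketch {
    // Fills dest[offset, offset + count) with value, using one path for any alignment.
    static void fill(byte[] dest, int offset, int count, byte value) {
        ByteBuffer buf = ByteBuffer.wrap(dest).order(ByteOrder.nativeOrder());
        // Broadcast the byte into all eight lanes of a long.
        long pattern = (value & 0xFFL) * 0x0101010101010101L;
        int pos = offset;
        int end = offset + count;
        // Bulk: 8-byte stores, issued without checking how pos is aligned.
        while (end - pos >= 8) {
            buf.putLong(pos, pattern);
            pos += 8;
        }
        // Tail: at most 7 single-byte stores.
        while (pos < end) {
            dest[pos++] = value;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[32];
        fill(data, 3, 15, (byte) 0x5A);
        System.out.println(Arrays.toString(data));
    }
}
```

Whether a single path like this wins on real hardware depends on how the target handles misaligned wide stores, so the JMH comparison remains the deciding factor.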
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23890#issuecomment-2897287517