RFR: 8351140: RISC-V: Intrinsify Unsafe::setMemory [v12]
Anjian-Wen
duke at openjdk.org
Tue May 20 08:51:56 UTC 2025
On Tue, 20 May 2025 08:46:07 GMT, Anjian-Wen <duke at openjdk.org> wrote:
>> Following [JDK-8329331](https://bugs.openjdk.org/browse/JDK-8329331), add the RISC-V generator generate_unsafe_setmemory for the Unsafe::setMemory intrinsic. The intrinsic substantially reduces Unsafe setMemory time.
>>
>> On my musebook, the JMH test micro:java.lang.foreign.MemorySegmentZeroUnsafe gives the results below.
>>
>> before the patch
>>
>> Benchmark (aligned) (size) Mode Cnt Score Error Units
>> MemorySegmentZeroUnsafe.panama true 1 avgt 30 24.198 ± 0.392 ns/op
>> MemorySegmentZeroUnsafe.panama true 2 avgt 30 20.688 ± 0.013 ns/op
>> MemorySegmentZeroUnsafe.panama true 3 avgt 30 20.703 ± 0.045 ns/op
>> MemorySegmentZeroUnsafe.panama true 4 avgt 30 20.053 ± 0.016 ns/op
>> MemorySegmentZeroUnsafe.panama true 5 avgt 30 20.682 ± 0.016 ns/op
>> MemorySegmentZeroUnsafe.panama true 6 avgt 30 20.732 ± 0.061 ns/op
>> MemorySegmentZeroUnsafe.panama true 7 avgt 30 21.403 ± 0.096 ns/op
>> MemorySegmentZeroUnsafe.panama true 8 avgt 30 25.268 ± 0.197 ns/op
>> MemorySegmentZeroUnsafe.panama true 15 avgt 30 27.481 ± 0.195 ns/op
>> MemorySegmentZeroUnsafe.panama true 16 avgt 30 27.577 ± 0.019 ns/op
>> MemorySegmentZeroUnsafe.panama true 63 avgt 30 208.893 ± 2.795 ns/op
>> MemorySegmentZeroUnsafe.panama true 64 avgt 30 199.167 ± 0.936 ns/op
>> MemorySegmentZeroUnsafe.panama true 255 avgt 30 220.672 ± 0.879 ns/op
>> MemorySegmentZeroUnsafe.panama true 256 avgt 30 246.256 ± 0.756 ns/op
>> MemorySegmentZeroUnsafe.panama false 1 avgt 30 23.849 ± 0.088 ns/op
>> MemorySegmentZeroUnsafe.panama false 2 avgt 30 20.671 ± 0.006 ns/op
>> MemorySegmentZeroUnsafe.panama false 3 avgt 30 20.694 ± 0.037 ns/op
>> MemorySegmentZeroUnsafe.panama false 4 avgt 30 20.048 ± 0.010 ns/op
>> MemorySegmentZeroUnsafe.panama false 5 avgt 30 20.684 ± 0.020 ns/op
>> MemorySegmentZeroUnsafe.panama false 6 avgt 30 20.685 ± 0.016 ns/op
>> MemorySegmentZeroUnsafe.panama false 7 avgt 30 21.383 ± 0.086 ns/op
>> MemorySegmentZeroUnsafe.panama false 8 avgt 30 25.684 ± 0.006 ns/op
>> MemorySegmentZeroUnsafe.panama false 15 avgt 30 27.593 ± 0.043 ns/op
>> MemorySegmentZeroUnsafe.panama false 16 avgt 30 28.437 ± 0.228 ns/o...
>
> Anjian-Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> update code for optimize
@feilongjiang @RealFYang
The MemorySegmentFillUnsafe test shows that the time for MemorySegmentFillUnsafe.unsafe (aligned = false) drops from `29.728 ± 0.294` to `23.747 ± 0.215` ns/op when the size is 7, which is a very good result. Thanks for the commit! The JMH test results are below.
before unroll
Benchmark (aligned) (size) Mode Cnt Score Error Units
MemorySegmentFillUnsafe.panama true 1 avgt 30 23.235 ± 0.092 ns/op
MemorySegmentFillUnsafe.panama true 2 avgt 30 20.672 ± 0.005 ns/op
MemorySegmentFillUnsafe.panama true 3 avgt 30 20.686 ± 0.008 ns/op
MemorySegmentFillUnsafe.panama true 4 avgt 30 19.599 ± 0.116 ns/op
MemorySegmentFillUnsafe.panama true 5 avgt 30 20.793 ± 0.144 ns/op
MemorySegmentFillUnsafe.panama true 6 avgt 30 20.707 ± 0.058 ns/op
MemorySegmentFillUnsafe.panama true 7 avgt 30 21.387 ± 0.093 ns/op
MemorySegmentFillUnsafe.panama true 8 avgt 30 25.170 ± 0.113 ns/op
MemorySegmentFillUnsafe.panama true 15 avgt 30 31.145 ± 0.284 ns/op
MemorySegmentFillUnsafe.panama true 16 avgt 30 26.315 ± 0.009 ns/op
MemorySegmentFillUnsafe.panama true 63 avgt 30 46.668 ± 0.611 ns/op
MemorySegmentFillUnsafe.panama true 64 avgt 30 49.265 ± 0.569 ns/op
MemorySegmentFillUnsafe.panama true 255 avgt 30 62.224 ± 1.244 ns/op
MemorySegmentFillUnsafe.panama true 256 avgt 30 61.213 ± 0.788 ns/op
MemorySegmentFillUnsafe.panama false 1 avgt 30 23.224 ± 0.077 ns/op
MemorySegmentFillUnsafe.panama false 2 avgt 30 20.673 ± 0.005 ns/op
MemorySegmentFillUnsafe.panama false 3 avgt 30 20.679 ± 0.016 ns/op
MemorySegmentFillUnsafe.panama false 4 avgt 30 19.779 ± 0.349 ns/op
MemorySegmentFillUnsafe.panama false 5 avgt 30 20.672 ± 0.004 ns/op
MemorySegmentFillUnsafe.panama false 6 avgt 30 20.803 ± 0.077 ns/op
MemorySegmentFillUnsafe.panama false 7 avgt 30 21.329 ± 0.037 ns/op
MemorySegmentFillUnsafe.panama false 8 avgt 30 25.131 ± 0.086 ns/op
MemorySegmentFillUnsafe.panama false 15 avgt 30 31.021 ± 0.227 ns/op
MemorySegmentFillUnsafe.panama false 16 avgt 30 26.939 ± 0.008 ns/op
MemorySegmentFillUnsafe.panama false 63 avgt 30 47.253 ± 0.397 ns/op
MemorySegmentFillUnsafe.panama false 64 avgt 30 47.614 ± 0.267 ns/op
MemorySegmentFillUnsafe.panama false 255 avgt 30 61.818 ± 0.407 ns/op
MemorySegmentFillUnsafe.panama false 256 avgt 30 62.879 ± 0.901 ns/op
MemorySegmentFillUnsafe.unsafe true 1 avgt 30 20.561 ± 0.212 ns/op
MemorySegmentFillUnsafe.unsafe true 2 avgt 30 22.979 ± 0.196 ns/op
MemorySegmentFillUnsafe.unsafe true 3 avgt 30 25.152 ± 0.545 ns/op
MemorySegmentFillUnsafe.unsafe true 4 avgt 30 27.713 ± 0.243 ns/op
MemorySegmentFillUnsafe.unsafe true 5 avgt 30 27.877 ± 0.433 ns/op
MemorySegmentFillUnsafe.unsafe true 6 avgt 30 28.356 ± 0.159 ns/op
MemorySegmentFillUnsafe.unsafe true 7 avgt 30 29.442 ± 0.008 ns/op
MemorySegmentFillUnsafe.unsafe true 8 avgt 30 34.050 ± 0.497 ns/op
MemorySegmentFillUnsafe.unsafe true 15 avgt 30 34.128 ± 0.215 ns/op
MemorySegmentFillUnsafe.unsafe true 16 avgt 30 33.516 ± 0.157 ns/op
MemorySegmentFillUnsafe.unsafe true 63 avgt 30 35.779 ± 0.094 ns/op
MemorySegmentFillUnsafe.unsafe true 64 avgt 30 38.035 ± 0.113 ns/op
MemorySegmentFillUnsafe.unsafe true 255 avgt 30 50.912 ± 0.142 ns/op
MemorySegmentFillUnsafe.unsafe true 256 avgt 30 50.586 ± 0.070 ns/op
MemorySegmentFillUnsafe.unsafe false 1 avgt 30 20.307 ± 0.211 ns/op
MemorySegmentFillUnsafe.unsafe false 2 avgt 30 22.574 ± 0.017 ns/op
MemorySegmentFillUnsafe.unsafe false 3 avgt 30 24.593 ± 0.240 ns/op
MemorySegmentFillUnsafe.unsafe false 4 avgt 30 27.805 ± 0.206 ns/op
MemorySegmentFillUnsafe.unsafe false 5 avgt 30 26.974 ± 0.058 ns/op
MemorySegmentFillUnsafe.unsafe false 6 avgt 30 28.188 ± 0.011 ns/op
MemorySegmentFillUnsafe.unsafe false 7 avgt 30 29.728 ± 0.294 ns/op
MemorySegmentFillUnsafe.unsafe false 8 avgt 30 31.559 ± 0.104 ns/op
MemorySegmentFillUnsafe.unsafe false 15 avgt 30 36.024 ± 0.149 ns/op
MemorySegmentFillUnsafe.unsafe false 16 avgt 30 37.215 ± 0.201 ns/op
MemorySegmentFillUnsafe.unsafe false 63 avgt 30 38.211 ± 0.011 ns/op
MemorySegmentFillUnsafe.unsafe false 64 avgt 30 39.056 ± 0.221 ns/op
MemorySegmentFillUnsafe.unsafe false 255 avgt 30 53.070 ± 0.351 ns/op
MemorySegmentFillUnsafe.unsafe false 256 avgt 30 53.406 ± 0.178 ns/op
after unroll
Benchmark (aligned) (size) Mode Cnt Score Error Units
MemorySegmentFillUnsafe.panama true 1 avgt 30 23.424 ± 0.200 ns/op
MemorySegmentFillUnsafe.panama true 2 avgt 30 20.679 ± 0.009 ns/op
MemorySegmentFillUnsafe.panama true 3 avgt 30 20.769 ± 0.105 ns/op
MemorySegmentFillUnsafe.panama true 4 avgt 30 19.432 ± 0.018 ns/op
MemorySegmentFillUnsafe.panama true 5 avgt 30 20.675 ± 0.008 ns/op
MemorySegmentFillUnsafe.panama true 6 avgt 30 20.734 ± 0.089 ns/op
MemorySegmentFillUnsafe.panama true 7 avgt 30 21.305 ± 0.010 ns/op
MemorySegmentFillUnsafe.panama true 8 avgt 30 24.605 ± 0.466 ns/op
MemorySegmentFillUnsafe.panama true 15 avgt 30 31.731 ± 0.521 ns/op
MemorySegmentFillUnsafe.panama true 16 avgt 30 26.319 ± 0.007 ns/op
MemorySegmentFillUnsafe.panama true 63 avgt 30 46.153 ± 0.413 ns/op
MemorySegmentFillUnsafe.panama true 64 avgt 30 48.146 ± 0.345 ns/op
MemorySegmentFillUnsafe.panama true 255 avgt 30 61.937 ± 0.301 ns/op
MemorySegmentFillUnsafe.panama true 256 avgt 30 61.462 ± 0.546 ns/op
MemorySegmentFillUnsafe.panama false 1 avgt 30 23.202 ± 0.077 ns/op
MemorySegmentFillUnsafe.panama false 2 avgt 30 20.692 ± 0.019 ns/op
MemorySegmentFillUnsafe.panama false 3 avgt 30 20.678 ± 0.009 ns/op
MemorySegmentFillUnsafe.panama false 4 avgt 30 19.808 ± 0.373 ns/op
MemorySegmentFillUnsafe.panama false 5 avgt 30 21.633 ± 0.859 ns/op
MemorySegmentFillUnsafe.panama false 6 avgt 30 20.775 ± 0.116 ns/op
MemorySegmentFillUnsafe.panama false 7 avgt 30 21.395 ± 0.092 ns/op
MemorySegmentFillUnsafe.panama false 8 avgt 30 25.065 ± 0.012 ns/op
MemorySegmentFillUnsafe.panama false 15 avgt 30 31.904 ± 0.384 ns/op
MemorySegmentFillUnsafe.panama false 16 avgt 30 27.172 ± 0.199 ns/op
MemorySegmentFillUnsafe.panama false 63 avgt 30 48.113 ± 1.377 ns/op
MemorySegmentFillUnsafe.panama false 64 avgt 30 48.306 ± 0.413 ns/op
MemorySegmentFillUnsafe.panama false 255 avgt 30 61.440 ± 0.128 ns/op
MemorySegmentFillUnsafe.panama false 256 avgt 30 62.360 ± 0.342 ns/op
MemorySegmentFillUnsafe.unsafe true 1 avgt 30 21.759 ± 0.176 ns/op
MemorySegmentFillUnsafe.unsafe true 2 avgt 30 22.074 ± 0.068 ns/op
MemorySegmentFillUnsafe.unsafe true 3 avgt 30 21.303 ± 0.011 ns/op
MemorySegmentFillUnsafe.unsafe true 4 avgt 30 23.178 ± 0.006 ns/op
MemorySegmentFillUnsafe.unsafe true 5 avgt 30 23.189 ± 0.011 ns/op
MemorySegmentFillUnsafe.unsafe true 6 avgt 30 23.848 ± 0.072 ns/op
MemorySegmentFillUnsafe.unsafe true 7 avgt 30 23.393 ± 0.151 ns/op
MemorySegmentFillUnsafe.unsafe true 8 avgt 30 33.539 ± 0.169 ns/op
MemorySegmentFillUnsafe.unsafe true 15 avgt 30 36.204 ± 0.391 ns/op
MemorySegmentFillUnsafe.unsafe true 16 avgt 30 34.218 ± 0.730 ns/op
MemorySegmentFillUnsafe.unsafe true 63 avgt 30 35.807 ± 0.124 ns/op
MemorySegmentFillUnsafe.unsafe true 64 avgt 30 37.984 ± 0.065 ns/op
MemorySegmentFillUnsafe.unsafe true 255 avgt 30 50.843 ± 0.133 ns/op
MemorySegmentFillUnsafe.unsafe true 256 avgt 30 50.643 ± 0.078 ns/op
MemorySegmentFillUnsafe.unsafe false 1 avgt 30 21.782 ± 0.413 ns/op
MemorySegmentFillUnsafe.unsafe false 2 avgt 30 22.102 ± 0.073 ns/op
MemorySegmentFillUnsafe.unsafe false 3 avgt 30 21.727 ± 0.406 ns/op
MemorySegmentFillUnsafe.unsafe false 4 avgt 30 23.175 ± 0.007 ns/op
MemorySegmentFillUnsafe.unsafe false 5 avgt 30 23.402 ± 0.203 ns/op
MemorySegmentFillUnsafe.unsafe false 6 avgt 30 23.791 ± 0.007 ns/op
MemorySegmentFillUnsafe.unsafe false 7 avgt 30 23.747 ± 0.215 ns/op
MemorySegmentFillUnsafe.unsafe false 8 avgt 30 31.518 ± 0.073 ns/op
MemorySegmentFillUnsafe.unsafe false 15 avgt 30 36.252 ± 0.071 ns/op
MemorySegmentFillUnsafe.unsafe false 16 avgt 30 37.290 ± 0.236 ns/op
MemorySegmentFillUnsafe.unsafe false 63 avgt 30 38.373 ± 0.163 ns/op
MemorySegmentFillUnsafe.unsafe false 64 avgt 30 38.947 ± 0.300 ns/op
MemorySegmentFillUnsafe.unsafe false 255 avgt 30 52.648 ± 0.189 ns/op
MemorySegmentFillUnsafe.unsafe false 256 avgt 30 53.219 ± 0.195 ns/op
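To put the before/after scores in relative terms, here is a minimal sketch that computes the percentage reduction in time per operation for a few rows. The class `UnrollSpeedup` and the method `percentReduction` are hypothetical names used only for illustration; they are not part of the patch or of JMH, and the scores are copied from the MemorySegmentFillUnsafe.unsafe (aligned = false) rows above.

```java
// Hypothetical helper for reading the tables above; not part of the patch or of JMH.
final class UnrollSpeedup {
    /** Percentage by which the average time per operation dropped. */
    static double percentReduction(double beforeNsPerOp, double afterNsPerOp) {
        return (beforeNsPerOp - afterNsPerOp) / beforeNsPerOp * 100.0;
    }

    public static void main(String[] args) {
        // Columns: size, score before unroll, score after unroll (ns/op),
        // taken from the MemorySegmentFillUnsafe.unsafe rows with aligned = false.
        double[][] rows = {
            {3, 24.593, 21.727},
            {7, 29.728, 23.747},
            {8, 31.559, 31.518},
        };
        for (double[] r : rows) {
            System.out.printf("size %d: %.1f%% less time per op%n",
                    (int) r[0], percentReduction(r[1], r[2]));
        }
    }
}
```

For size 7 this works out to roughly a 20% reduction, while size 8 is essentially unchanged, so the gain appears only at the smaller sizes in these runs.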
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23890#issuecomment-2893516321