RFR: 8351140: RISC-V: Intrinsify Unsafe::setMemory [v12]

Anjian-Wen duke at openjdk.org
Tue May 20 08:51:56 UTC 2025


On Tue, 20 May 2025 08:46:07 GMT, Anjian-Wen <duke at openjdk.org> wrote:

>> Following [JDK-8329331](https://bugs.openjdk.org/browse/JDK-8329331), this adds `generate_unsafe_setmemory`, the generator for the RISC-V `Unsafe::setMemory` intrinsic. The intrinsic noticeably reduces the time spent in `Unsafe::setMemory`.
>> 
>> On my MuseBook, the JMH test `micro:java.lang.foreign.MemorySegmentZeroUnsafe` gives the results below.
>> 
>> before the patch
>> 
>> Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
>> MemorySegmentZeroUnsafe.panama       true       1  avgt   30   24.198 ± 0.392  ns/op
>> MemorySegmentZeroUnsafe.panama       true       2  avgt   30   20.688 ± 0.013  ns/op
>> MemorySegmentZeroUnsafe.panama       true       3  avgt   30   20.703 ± 0.045  ns/op
>> MemorySegmentZeroUnsafe.panama       true       4  avgt   30   20.053 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       5  avgt   30   20.682 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama       true       6  avgt   30   20.732 ± 0.061  ns/op
>> MemorySegmentZeroUnsafe.panama       true       7  avgt   30   21.403 ± 0.096  ns/op
>> MemorySegmentZeroUnsafe.panama       true       8  avgt   30   25.268 ± 0.197  ns/op
>> MemorySegmentZeroUnsafe.panama       true      15  avgt   30   27.481 ± 0.195  ns/op
>> MemorySegmentZeroUnsafe.panama       true      16  avgt   30   27.577 ± 0.019  ns/op
>> MemorySegmentZeroUnsafe.panama       true      63  avgt   30  208.893 ± 2.795  ns/op
>> MemorySegmentZeroUnsafe.panama       true      64  avgt   30  199.167 ± 0.936  ns/op
>> MemorySegmentZeroUnsafe.panama       true     255  avgt   30  220.672 ± 0.879  ns/op
>> MemorySegmentZeroUnsafe.panama       true     256  avgt   30  246.256 ± 0.756  ns/op
>> MemorySegmentZeroUnsafe.panama      false       1  avgt   30   23.849 ± 0.088  ns/op
>> MemorySegmentZeroUnsafe.panama      false       2  avgt   30   20.671 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false       3  avgt   30   20.694 ± 0.037  ns/op
>> MemorySegmentZeroUnsafe.panama      false       4  avgt   30   20.048 ± 0.010  ns/op
>> MemorySegmentZeroUnsafe.panama      false       5  avgt   30   20.684 ± 0.020  ns/op
>> MemorySegmentZeroUnsafe.panama      false       6  avgt   30   20.685 ± 0.016  ns/op
>> MemorySegmentZeroUnsafe.panama      false       7  avgt   30   21.383 ± 0.086  ns/op
>> MemorySegmentZeroUnsafe.panama      false       8  avgt   30   25.684 ± 0.006  ns/op
>> MemorySegmentZeroUnsafe.panama      false      15  avgt   30   27.593 ± 0.043  ns/op
>> MemorySegmentZeroUnsafe.panama      false      16  avgt   30   28.437 ± 0.228  ns/o...
>
> Anjian-Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   update code for optimize

@feilongjiang @RealFYang 
The MemorySegmentFillUnsafe test shows that, for `MemorySegmentFillUnsafe.unsafe` with unaligned access and size 7, the time drops from `29.728 ± 0.294` ns/op to `23.747 ± 0.215` ns/op. That is a very good result, thanks for the commit! The JMH results before and after the unroll are below.

before unroll

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentFillUnsafe.panama       true       1  avgt   30  23.235 ± 0.092  ns/op
MemorySegmentFillUnsafe.panama       true       2  avgt   30  20.672 ± 0.005  ns/op
MemorySegmentFillUnsafe.panama       true       3  avgt   30  20.686 ± 0.008  ns/op
MemorySegmentFillUnsafe.panama       true       4  avgt   30  19.599 ± 0.116  ns/op
MemorySegmentFillUnsafe.panama       true       5  avgt   30  20.793 ± 0.144  ns/op
MemorySegmentFillUnsafe.panama       true       6  avgt   30  20.707 ± 0.058  ns/op
MemorySegmentFillUnsafe.panama       true       7  avgt   30  21.387 ± 0.093  ns/op
MemorySegmentFillUnsafe.panama       true       8  avgt   30  25.170 ± 0.113  ns/op
MemorySegmentFillUnsafe.panama       true      15  avgt   30  31.145 ± 0.284  ns/op
MemorySegmentFillUnsafe.panama       true      16  avgt   30  26.315 ± 0.009  ns/op
MemorySegmentFillUnsafe.panama       true      63  avgt   30  46.668 ± 0.611  ns/op
MemorySegmentFillUnsafe.panama       true      64  avgt   30  49.265 ± 0.569  ns/op
MemorySegmentFillUnsafe.panama       true     255  avgt   30  62.224 ± 1.244  ns/op
MemorySegmentFillUnsafe.panama       true     256  avgt   30  61.213 ± 0.788  ns/op
MemorySegmentFillUnsafe.panama      false       1  avgt   30  23.224 ± 0.077  ns/op
MemorySegmentFillUnsafe.panama      false       2  avgt   30  20.673 ± 0.005  ns/op
MemorySegmentFillUnsafe.panama      false       3  avgt   30  20.679 ± 0.016  ns/op
MemorySegmentFillUnsafe.panama      false       4  avgt   30  19.779 ± 0.349  ns/op
MemorySegmentFillUnsafe.panama      false       5  avgt   30  20.672 ± 0.004  ns/op
MemorySegmentFillUnsafe.panama      false       6  avgt   30  20.803 ± 0.077  ns/op
MemorySegmentFillUnsafe.panama      false       7  avgt   30  21.329 ± 0.037  ns/op
MemorySegmentFillUnsafe.panama      false       8  avgt   30  25.131 ± 0.086  ns/op
MemorySegmentFillUnsafe.panama      false      15  avgt   30  31.021 ± 0.227  ns/op
MemorySegmentFillUnsafe.panama      false      16  avgt   30  26.939 ± 0.008  ns/op
MemorySegmentFillUnsafe.panama      false      63  avgt   30  47.253 ± 0.397  ns/op
MemorySegmentFillUnsafe.panama      false      64  avgt   30  47.614 ± 0.267  ns/op
MemorySegmentFillUnsafe.panama      false     255  avgt   30  61.818 ± 0.407  ns/op
MemorySegmentFillUnsafe.panama      false     256  avgt   30  62.879 ± 0.901  ns/op
MemorySegmentFillUnsafe.unsafe       true       1  avgt   30  20.561 ± 0.212  ns/op
MemorySegmentFillUnsafe.unsafe       true       2  avgt   30  22.979 ± 0.196  ns/op
MemorySegmentFillUnsafe.unsafe       true       3  avgt   30  25.152 ± 0.545  ns/op
MemorySegmentFillUnsafe.unsafe       true       4  avgt   30  27.713 ± 0.243  ns/op
MemorySegmentFillUnsafe.unsafe       true       5  avgt   30  27.877 ± 0.433  ns/op
MemorySegmentFillUnsafe.unsafe       true       6  avgt   30  28.356 ± 0.159  ns/op
MemorySegmentFillUnsafe.unsafe       true       7  avgt   30  29.442 ± 0.008  ns/op
MemorySegmentFillUnsafe.unsafe       true       8  avgt   30  34.050 ± 0.497  ns/op
MemorySegmentFillUnsafe.unsafe       true      15  avgt   30  34.128 ± 0.215  ns/op
MemorySegmentFillUnsafe.unsafe       true      16  avgt   30  33.516 ± 0.157  ns/op
MemorySegmentFillUnsafe.unsafe       true      63  avgt   30  35.779 ± 0.094  ns/op
MemorySegmentFillUnsafe.unsafe       true      64  avgt   30  38.035 ± 0.113  ns/op
MemorySegmentFillUnsafe.unsafe       true     255  avgt   30  50.912 ± 0.142  ns/op
MemorySegmentFillUnsafe.unsafe       true     256  avgt   30  50.586 ± 0.070  ns/op
MemorySegmentFillUnsafe.unsafe      false       1  avgt   30  20.307 ± 0.211  ns/op
MemorySegmentFillUnsafe.unsafe      false       2  avgt   30  22.574 ± 0.017  ns/op
MemorySegmentFillUnsafe.unsafe      false       3  avgt   30  24.593 ± 0.240  ns/op
MemorySegmentFillUnsafe.unsafe      false       4  avgt   30  27.805 ± 0.206  ns/op
MemorySegmentFillUnsafe.unsafe      false       5  avgt   30  26.974 ± 0.058  ns/op
MemorySegmentFillUnsafe.unsafe      false       6  avgt   30  28.188 ± 0.011  ns/op
MemorySegmentFillUnsafe.unsafe      false       7  avgt   30  29.728 ± 0.294  ns/op
MemorySegmentFillUnsafe.unsafe      false       8  avgt   30  31.559 ± 0.104  ns/op
MemorySegmentFillUnsafe.unsafe      false      15  avgt   30  36.024 ± 0.149  ns/op
MemorySegmentFillUnsafe.unsafe      false      16  avgt   30  37.215 ± 0.201  ns/op
MemorySegmentFillUnsafe.unsafe      false      63  avgt   30  38.211 ± 0.011  ns/op
MemorySegmentFillUnsafe.unsafe      false      64  avgt   30  39.056 ± 0.221  ns/op
MemorySegmentFillUnsafe.unsafe      false     255  avgt   30  53.070 ± 0.351  ns/op
MemorySegmentFillUnsafe.unsafe      false     256  avgt   30  53.406 ± 0.178  ns/op


after unroll

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentFillUnsafe.panama       true       1  avgt   30  23.424 ± 0.200  ns/op
MemorySegmentFillUnsafe.panama       true       2  avgt   30  20.679 ± 0.009  ns/op
MemorySegmentFillUnsafe.panama       true       3  avgt   30  20.769 ± 0.105  ns/op
MemorySegmentFillUnsafe.panama       true       4  avgt   30  19.432 ± 0.018  ns/op
MemorySegmentFillUnsafe.panama       true       5  avgt   30  20.675 ± 0.008  ns/op
MemorySegmentFillUnsafe.panama       true       6  avgt   30  20.734 ± 0.089  ns/op
MemorySegmentFillUnsafe.panama       true       7  avgt   30  21.305 ± 0.010  ns/op
MemorySegmentFillUnsafe.panama       true       8  avgt   30  24.605 ± 0.466  ns/op
MemorySegmentFillUnsafe.panama       true      15  avgt   30  31.731 ± 0.521  ns/op
MemorySegmentFillUnsafe.panama       true      16  avgt   30  26.319 ± 0.007  ns/op
MemorySegmentFillUnsafe.panama       true      63  avgt   30  46.153 ± 0.413  ns/op
MemorySegmentFillUnsafe.panama       true      64  avgt   30  48.146 ± 0.345  ns/op
MemorySegmentFillUnsafe.panama       true     255  avgt   30  61.937 ± 0.301  ns/op
MemorySegmentFillUnsafe.panama       true     256  avgt   30  61.462 ± 0.546  ns/op
MemorySegmentFillUnsafe.panama      false       1  avgt   30  23.202 ± 0.077  ns/op
MemorySegmentFillUnsafe.panama      false       2  avgt   30  20.692 ± 0.019  ns/op
MemorySegmentFillUnsafe.panama      false       3  avgt   30  20.678 ± 0.009  ns/op
MemorySegmentFillUnsafe.panama      false       4  avgt   30  19.808 ± 0.373  ns/op
MemorySegmentFillUnsafe.panama      false       5  avgt   30  21.633 ± 0.859  ns/op
MemorySegmentFillUnsafe.panama      false       6  avgt   30  20.775 ± 0.116  ns/op
MemorySegmentFillUnsafe.panama      false       7  avgt   30  21.395 ± 0.092  ns/op
MemorySegmentFillUnsafe.panama      false       8  avgt   30  25.065 ± 0.012  ns/op
MemorySegmentFillUnsafe.panama      false      15  avgt   30  31.904 ± 0.384  ns/op
MemorySegmentFillUnsafe.panama      false      16  avgt   30  27.172 ± 0.199  ns/op
MemorySegmentFillUnsafe.panama      false      63  avgt   30  48.113 ± 1.377  ns/op
MemorySegmentFillUnsafe.panama      false      64  avgt   30  48.306 ± 0.413  ns/op
MemorySegmentFillUnsafe.panama      false     255  avgt   30  61.440 ± 0.128  ns/op
MemorySegmentFillUnsafe.panama      false     256  avgt   30  62.360 ± 0.342  ns/op
MemorySegmentFillUnsafe.unsafe       true       1  avgt   30  21.759 ± 0.176  ns/op
MemorySegmentFillUnsafe.unsafe       true       2  avgt   30  22.074 ± 0.068  ns/op
MemorySegmentFillUnsafe.unsafe       true       3  avgt   30  21.303 ± 0.011  ns/op
MemorySegmentFillUnsafe.unsafe       true       4  avgt   30  23.178 ± 0.006  ns/op
MemorySegmentFillUnsafe.unsafe       true       5  avgt   30  23.189 ± 0.011  ns/op
MemorySegmentFillUnsafe.unsafe       true       6  avgt   30  23.848 ± 0.072  ns/op
MemorySegmentFillUnsafe.unsafe       true       7  avgt   30  23.393 ± 0.151  ns/op
MemorySegmentFillUnsafe.unsafe       true       8  avgt   30  33.539 ± 0.169  ns/op
MemorySegmentFillUnsafe.unsafe       true      15  avgt   30  36.204 ± 0.391  ns/op
MemorySegmentFillUnsafe.unsafe       true      16  avgt   30  34.218 ± 0.730  ns/op
MemorySegmentFillUnsafe.unsafe       true      63  avgt   30  35.807 ± 0.124  ns/op
MemorySegmentFillUnsafe.unsafe       true      64  avgt   30  37.984 ± 0.065  ns/op
MemorySegmentFillUnsafe.unsafe       true     255  avgt   30  50.843 ± 0.133  ns/op
MemorySegmentFillUnsafe.unsafe       true     256  avgt   30  50.643 ± 0.078  ns/op
MemorySegmentFillUnsafe.unsafe      false       1  avgt   30  21.782 ± 0.413  ns/op
MemorySegmentFillUnsafe.unsafe      false       2  avgt   30  22.102 ± 0.073  ns/op
MemorySegmentFillUnsafe.unsafe      false       3  avgt   30  21.727 ± 0.406  ns/op
MemorySegmentFillUnsafe.unsafe      false       4  avgt   30  23.175 ± 0.007  ns/op
MemorySegmentFillUnsafe.unsafe      false       5  avgt   30  23.402 ± 0.203  ns/op
MemorySegmentFillUnsafe.unsafe      false       6  avgt   30  23.791 ± 0.007  ns/op
MemorySegmentFillUnsafe.unsafe      false       7  avgt   30  23.747 ± 0.215  ns/op
MemorySegmentFillUnsafe.unsafe      false       8  avgt   30  31.518 ± 0.073  ns/op
MemorySegmentFillUnsafe.unsafe      false      15  avgt   30  36.252 ± 0.071  ns/op
MemorySegmentFillUnsafe.unsafe      false      16  avgt   30  37.290 ± 0.236  ns/op
MemorySegmentFillUnsafe.unsafe      false      63  avgt   30  38.373 ± 0.163  ns/op
MemorySegmentFillUnsafe.unsafe      false      64  avgt   30  38.947 ± 0.300  ns/op
MemorySegmentFillUnsafe.unsafe      false     255  avgt   30  52.648 ± 0.189  ns/op
MemorySegmentFillUnsafe.unsafe      false     256  avgt   30  53.219 ± 0.195  ns/op
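
To put the size-7 figures quoted above in relative terms, here is a minimal sketch of the arithmetic. The class name `SpeedupSketch` and the helper `relativeReduction` are made up for illustration and are not part of the patch or of JMH. The inputs are the mean scores copied from the `MemorySegmentFillUnsafe.unsafe` rows of the two tables, and the error margins are ignored.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of the patch or of JMH.
final class SpeedupSketch {
    /** Fraction of time saved when going from `before` to `after` (both in ns/op). */
    static double relativeReduction(double before, double after) {
        return (before - after) / before;
    }

    public static void main(String[] args) {
        // Mean scores for size 7, taken from the "before unroll" and "after unroll" tables.
        double unaligned = relativeReduction(29.728, 23.747);
        double aligned = relativeReduction(29.442, 23.393);
        // Locale.ROOT keeps the decimal separator as '.' regardless of system settings.
        System.out.println(String.format(Locale.ROOT, "unaligned: %.1f%%", 100 * unaligned));
        System.out.println(String.format(Locale.ROOT, "aligned:   %.1f%%", 100 * aligned));
    }
}
```

Under these assumptions the reduction comes out at about 20% for both the aligned and the unaligned size-7 case.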

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23890#issuecomment-2893516321


More information about the hotspot-compiler-dev mailing list