RFR: 8352972: PPC64: Intrinsify Unsafe::setMemory

Martin Doerr mdoerr at openjdk.org
Wed Mar 26 14:21:17 UTC 2025


This patch adds an intrinsic for Unsafe::setMemory on PPC64, similar to the x86 implementation.

Before this patch:

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30  15.048 ± 0.095  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30  15.054 ± 0.089  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30  15.161 ± 0.089  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30  15.147 ± 0.082  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30  15.198 ± 0.089  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30  15.128 ± 0.099  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30  19.234 ± 0.148  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30  15.060 ± 0.090  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30  19.229 ± 0.171  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30  15.030 ± 0.082  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  85.290 ± 0.431  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30  84.273 ± 0.843  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  89.551 ± 0.706  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30  87.736 ± 0.679  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30  15.044 ± 0.073  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30  14.980 ± 0.058  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30  15.138 ± 0.126  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30  15.025 ± 0.049  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30  15.192 ± 0.118  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30  15.464 ± 0.667  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30  19.179 ± 0.143  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30  15.278 ± 0.130  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30  19.428 ± 0.146  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30  18.011 ± 1.233  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  87.090 ± 0.989  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  86.513 ± 0.623  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  89.415 ± 0.831  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  90.665 ± 0.798  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30  86.530 ± 0.504  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30  84.540 ± 0.399  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30  86.954 ± 0.768  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30  86.409 ± 0.801  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30  86.774 ± 0.808  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30  86.128 ± 0.804  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30  86.512 ± 0.434  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30  85.680 ± 0.335  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30  88.098 ± 0.660  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30  86.162 ± 0.634  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  87.605 ± 0.606  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30  86.423 ± 0.667  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  89.882 ± 0.416  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30  89.026 ± 0.555  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30  86.808 ± 0.250  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30  86.504 ± 0.427  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30  87.304 ± 0.570  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30  85.787 ± 0.395  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30  86.032 ± 0.517  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30  85.668 ± 0.414  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30  85.621 ± 0.457  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30  85.744 ± 0.384  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30  85.898 ± 0.380  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30  86.993 ± 0.532  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  86.700 ± 0.558  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  87.678 ± 0.721  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  91.774 ± 0.860  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  89.748 ± 0.749  ns/op


With this patch:

Benchmark                       (aligned)  (size)  Mode  Cnt   Score   Error  Units
MemorySegmentZeroUnsafe.panama       true       1  avgt   30  15.206 ± 0.113  ns/op
MemorySegmentZeroUnsafe.panama       true       2  avgt   30  15.106 ± 0.094  ns/op
MemorySegmentZeroUnsafe.panama       true       3  avgt   30  15.314 ± 0.118  ns/op
MemorySegmentZeroUnsafe.panama       true       4  avgt   30  15.067 ± 0.078  ns/op
MemorySegmentZeroUnsafe.panama       true       5  avgt   30  15.192 ± 0.094  ns/op
MemorySegmentZeroUnsafe.panama       true       6  avgt   30  15.145 ± 0.098  ns/op
MemorySegmentZeroUnsafe.panama       true       7  avgt   30  19.353 ± 0.176  ns/op
MemorySegmentZeroUnsafe.panama       true       8  avgt   30  15.164 ± 0.070  ns/op
MemorySegmentZeroUnsafe.panama       true      15  avgt   30  19.201 ± 0.103  ns/op
MemorySegmentZeroUnsafe.panama       true      16  avgt   30  15.138 ± 0.092  ns/op
MemorySegmentZeroUnsafe.panama       true      63  avgt   30  27.875 ± 0.783  ns/op
MemorySegmentZeroUnsafe.panama       true      64  avgt   30  19.560 ± 0.252  ns/op
MemorySegmentZeroUnsafe.panama       true     255  avgt   30  91.272 ± 0.568  ns/op
MemorySegmentZeroUnsafe.panama       true     256  avgt   30  19.582 ± 0.089  ns/op
MemorySegmentZeroUnsafe.panama      false       1  avgt   30  15.049 ± 0.117  ns/op
MemorySegmentZeroUnsafe.panama      false       2  avgt   30  15.096 ± 0.095  ns/op
MemorySegmentZeroUnsafe.panama      false       3  avgt   30  15.094 ± 0.073  ns/op
MemorySegmentZeroUnsafe.panama      false       4  avgt   30  15.012 ± 0.068  ns/op
MemorySegmentZeroUnsafe.panama      false       5  avgt   30  15.130 ± 0.121  ns/op
MemorySegmentZeroUnsafe.panama      false       6  avgt   30  15.079 ± 0.090  ns/op
MemorySegmentZeroUnsafe.panama      false       7  avgt   30  19.121 ± 0.120  ns/op
MemorySegmentZeroUnsafe.panama      false       8  avgt   30  15.153 ± 0.136  ns/op
MemorySegmentZeroUnsafe.panama      false      15  avgt   30  19.516 ± 0.101  ns/op
MemorySegmentZeroUnsafe.panama      false      16  avgt   30  19.054 ± 0.091  ns/op
MemorySegmentZeroUnsafe.panama      false      63  avgt   30  28.211 ± 0.742  ns/op
MemorySegmentZeroUnsafe.panama      false      64  avgt   30  30.415 ± 0.368  ns/op
MemorySegmentZeroUnsafe.panama      false     255  avgt   30  93.071 ± 0.785  ns/op
MemorySegmentZeroUnsafe.panama      false     256  avgt   30  93.184 ± 0.594  ns/op
MemorySegmentZeroUnsafe.unsafe       true       1  avgt   30  19.361 ± 0.085  ns/op
MemorySegmentZeroUnsafe.unsafe       true       2  avgt   30  19.415 ± 0.101  ns/op
MemorySegmentZeroUnsafe.unsafe       true       3  avgt   30  19.198 ± 0.111  ns/op
MemorySegmentZeroUnsafe.unsafe       true       4  avgt   30  19.380 ± 0.066  ns/op
MemorySegmentZeroUnsafe.unsafe       true       5  avgt   30  19.107 ± 0.057  ns/op
MemorySegmentZeroUnsafe.unsafe       true       6  avgt   30  19.097 ± 0.064  ns/op
MemorySegmentZeroUnsafe.unsafe       true       7  avgt   30  19.520 ± 0.381  ns/op
MemorySegmentZeroUnsafe.unsafe       true       8  avgt   30  19.406 ± 0.093  ns/op
MemorySegmentZeroUnsafe.unsafe       true      15  avgt   30  19.210 ± 0.049  ns/op
MemorySegmentZeroUnsafe.unsafe       true      16  avgt   30  19.459 ± 0.092  ns/op
MemorySegmentZeroUnsafe.unsafe       true      63  avgt   30  29.300 ± 0.235  ns/op
MemorySegmentZeroUnsafe.unsafe       true      64  avgt   30  19.200 ± 0.080  ns/op
MemorySegmentZeroUnsafe.unsafe       true     255  avgt   30  91.678 ± 0.243  ns/op
MemorySegmentZeroUnsafe.unsafe       true     256  avgt   30  19.793 ± 0.139  ns/op
MemorySegmentZeroUnsafe.unsafe      false       1  avgt   30  19.430 ± 0.082  ns/op
MemorySegmentZeroUnsafe.unsafe      false       2  avgt   30  19.469 ± 0.106  ns/op
MemorySegmentZeroUnsafe.unsafe      false       3  avgt   30  19.264 ± 0.123  ns/op
MemorySegmentZeroUnsafe.unsafe      false       4  avgt   30  19.260 ± 0.080  ns/op
MemorySegmentZeroUnsafe.unsafe      false       5  avgt   30  19.210 ± 0.068  ns/op
MemorySegmentZeroUnsafe.unsafe      false       6  avgt   30  19.240 ± 0.066  ns/op
MemorySegmentZeroUnsafe.unsafe      false       7  avgt   30  20.132 ± 0.375  ns/op
MemorySegmentZeroUnsafe.unsafe      false       8  avgt   30  20.148 ± 0.358  ns/op
MemorySegmentZeroUnsafe.unsafe      false      15  avgt   30  19.405 ± 0.154  ns/op
MemorySegmentZeroUnsafe.unsafe      false      16  avgt   30  19.375 ± 0.119  ns/op
MemorySegmentZeroUnsafe.unsafe      false      63  avgt   30  29.458 ± 0.491  ns/op
MemorySegmentZeroUnsafe.unsafe      false      64  avgt   30  29.554 ± 0.817  ns/op
MemorySegmentZeroUnsafe.unsafe      false     255  avgt   30  93.094 ± 0.789  ns/op
MemorySegmentZeroUnsafe.unsafe      false     256  avgt   30  93.630 ± 0.869  ns/op


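`Unsafe` cases with small sizes are significantly faster (about 86 ns/op before, about 19 ns/op with this patch for sizes up to 16). Aligned large cases are faster, too: size 256 drops from about 88 ns/op to about 20 ns/op. Size 255 and the unaligned size 256 case do not improve.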
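
To put rough numbers on the comparison, here is a minimal sketch that computes before/after ratios for a few rows copied from the two tables above. The class and method names (`SpeedupSketch`, `speedup`) are made up for illustration and are not part of the patch or the benchmark. The ratios ignore the reported error bars.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SpeedupSketch {
    // Ratio of average times; values above 1.0 mean the patched build is faster.
    static double speedup(double beforeNs, double afterNs) {
        return beforeNs / afterNs;
    }

    public static void main(String[] args) {
        // {before, after} scores in ns/op, copied from the tables in this mail.
        Map<String, double[]> rows = new LinkedHashMap<>();
        rows.put("unsafe aligned   size=1",   new double[] {86.530, 19.361});
        rows.put("unsafe aligned   size=64",  new double[] {86.423, 19.200});
        rows.put("unsafe aligned   size=255", new double[] {89.882, 91.678});
        rows.put("unsafe aligned   size=256", new double[] {89.026, 19.793});
        rows.put("panama aligned   size=256", new double[] {87.736, 19.582});
        rows.put("unsafe unaligned size=256", new double[] {89.748, 93.630});

        for (Map.Entry<String, double[]> e : rows.entrySet()) {
            double[] v = e.getValue();
            System.out.printf("%-28s %.2fx%n", e.getKey(), speedup(v[0], v[1]));
        }
    }
}
```

Rows with a ratio well above 1.0 are the cases that benefit from the intrinsic; rows near 1.0 (size 255, unaligned size 256) are the ones that do not.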

-------------

Commit messages:
 - 8352972: PPC64: Intrinsify Unsafe::setMemory

Changes: https://git.openjdk.org/jdk/pull/24254/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24254&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8352972
  Stats: 109 lines in 1 file changed: 109 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/24254.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24254/head:pull/24254

PR: https://git.openjdk.org/jdk/pull/24254


More information about the hotspot-compiler-dev mailing list