RFR: 8372285: G1: Micro-optimize x86 barrier code [v3]
Aleksey Shipilev
shade at openjdk.org
Fri Nov 21 12:35:24 UTC 2025
On Fri, 21 Nov 2025 10:04:56 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:
> I assume that the jmh writebarrier micros were run just in case.
As expected, I see no real impact on an EPYC machine, as we realistically only touch GC-active and/or slow paths:
Benchmark Mode Cnt Score Error Units
# ----- Baseline
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullLarge avgt 12 2074.042 ± 33.941 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullSmall avgt 12 31.908 ± 0.020 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungLarge avgt 12 2052.188 ± 2.993 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungSmall avgt 12 31.923 ± 0.127 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge avgt 12 2648.758 ± 12.689 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall avgt 12 41.843 ± 6.851 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealLarge avgt 12 1860.052 ± 41.707 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealSmall avgt 12 29.635 ± 0.026 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge avgt 12 2647.011 ± 3.035 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall avgt 12 40.217 ± 0.053 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge avgt 12 1838.099 ± 11.536 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall avgt 12 29.637 ± 0.031 ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPath avgt 12 1.694 ± 0.001 ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPathYoungRef avgt 12 2.709 ± 0.001 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullLarge avgt 12 2245.868 ± 1.523 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullSmall avgt 12 36.056 ± 0.008 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungLarge avgt 12 2247.127 ± 7.293 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungSmall avgt 12 36.046 ± 0.012 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge avgt 12 2812.237 ± 32.421 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall avgt 12 44.899 ± 0.258 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealLarge avgt 12 2251.210 ± 18.101 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealSmall avgt 12 36.018 ± 0.011 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge avgt 12 2821.869 ± 32.633 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall avgt 12 44.800 ± 0.018 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge avgt 12 2247.837 ± 14.136 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall avgt 12 36.021 ± 0.015 ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPath avgt 12 1.694 ± 0.001 ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPathYoungRef avgt 12 2.710 ± 0.001 ns/op
# ----- Patched
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullLarge avgt 12 2058.748 ± 11.193 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullSmall avgt 12 31.943 ± 0.031 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungLarge avgt 12 2052.097 ± 1.134 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathNullYoungSmall avgt 12 31.927 ± 0.021 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge avgt 12 2661.495 ± 36.916 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall avgt 12 40.327 ± 0.463 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealLarge avgt 12 1841.228 ± 7.491 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathRealSmall avgt 12 29.644 ± 0.021 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge avgt 12 2671.222 ± 45.797 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall avgt 12 40.214 ± 0.073 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge avgt 12 1833.984 ± 9.946 ns/op
WriteBarrier.WithDefaultUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall avgt 12 29.635 ± 0.070 ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPath avgt 12 1.694 ± 0.001 ns/op
WriteBarrier.WithDefaultUnrolling.testFieldWriteBarrierFastPathYoungRef avgt 12 2.710 ± 0.001 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullLarge avgt 12 2244.271 ± 37.550 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullSmall avgt 12 36.044 ± 0.006 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungLarge avgt 12 2245.466 ± 18.204 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathNullYoungSmall avgt 12 36.036 ± 0.009 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungLarge avgt 12 2811.951 ± 26.061 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathOldToYoungSmall avgt 12 44.692 ± 0.041 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealLarge avgt 12 2241.369 ± 0.614 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathRealSmall avgt 12 36.019 ± 0.014 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldLarge avgt 12 2827.016 ± 43.966 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToOldSmall avgt 12 44.700 ± 0.060 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungLarge avgt 12 2242.395 ± 5.700 ns/op
WriteBarrier.WithoutUnrolling.testArrayWriteBarrierFastPathYoungToYoungSmall avgt 12 36.018 ± 0.010 ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPath avgt 12 1.693 ± 0.001 ns/op
WriteBarrier.WithoutUnrolling.testFieldWriteBarrierFastPathYoungRef avgt 12 2.710 ± 0.001 ns/op
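A quick way to sanity-check "no real impact" from the rows above is to treat each Score ± Error as an interval and see whether the baseline and patched intervals intersect. The sketch below is only an illustration: the class, record, and method names are hypothetical, the numbers are copied from three of the rows above, and interval overlap is a rough heuristic rather than a proper significance test.

```java
import java.util.List;

public class IntervalOverlap {
    // One JMH result row: score and error half-width, both in ns/op.
    record Result(String name, double score, double error) {
        double lo() { return score - error; }
        double hi() { return score + error; }
    }

    // True when [score - error, score + error] of a and b intersect.
    static boolean overlaps(Result a, Result b) {
        return a.lo() <= b.hi() && b.lo() <= a.hi();
    }

    // Relative change of the patched score against the baseline, in percent.
    static double deltaPercent(Result baseline, Result patched) {
        return (patched.score() - baseline.score()) / baseline.score() * 100.0;
    }

    public static void main(String[] args) {
        // Values taken from the WithDefaultUnrolling rows in the tables above.
        List<Result[]> pairs = List.of(
            new Result[] {
                new Result("NullLarge", 2074.042, 33.941),
                new Result("NullLarge", 2058.748, 11.193) },
            new Result[] {
                new Result("OldToYoungSmall", 41.843, 6.851),
                new Result("OldToYoungSmall", 40.327, 0.463) },
            new Result[] {
                new Result("RealLarge", 1860.052, 41.707),
                new Result("RealLarge", 1841.228, 7.491) });

        for (Result[] p : pairs) {
            System.out.printf("%-16s overlap=%b delta=%+.2f%%%n",
                p[0].name(), overlaps(p[0], p[1]), deltaPercent(p[0], p[1]));
        }
    }
}
```

Rows with very tight error bars can fail the overlap check on differences of a fraction of a percent, so the relative delta is printed alongside to keep such cases in proportion.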
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28446#issuecomment-3562839053