status of VM long loop optimizations - call for action

Fri Dec 10 21:34:59 UTC 2021

A simple write benchmark I had already written for specialized 
VarHandles (i.e. handles created with insertCoordinates) seems to run 
about 1 ns faster consistently, so I guess these changes helped a bit?

Before:

Benchmark                                    Mode  Cnt   Score   Error  Units
VarHandleBenchmark.genericHandleBenchmark    avgt    5  21.155 ± 0.145  ns/op
VarHandleBenchmark.specFinalHandleBenchmark  avgt    5   0.678 ± 0.201  ns/op
VarHandleBenchmark.specHandleBenchmark       avgt    5  17.323 ± 1.324  ns/op

After:

Benchmark                                    Mode  Cnt   Score   Error  Units
VarHandleBenchmark.genericHandleBenchmark    avgt    5  20.304 ± 1.466  ns/op
VarHandleBenchmark.specFinalHandleBenchmark  avgt    5   0.652 ± 0.156  ns/op
VarHandleBenchmark.specHandleBenchmark       avgt    5  17.266 ± 1.712  ns/op

Benchmark:

     public static final MemorySegment SEGMENT =
             MemorySegment.allocateNative(ValueLayout.JAVA_INT,
                     ResourceScope.newSharedScope());

     public static final VarHandle GENERIC_HANDLE =
             MemoryHandles.varHandle(ValueLayout.JAVA_INT);

     public static VarHandle SPEC_HANDLE =
             MemoryHandles.insertCoordinates(GENERIC_HANDLE, 0, SEGMENT, 0);

     public static final VarHandle SPEC_HANDLE_FINAL =
             MemoryHandles.insertCoordinates(GENERIC_HANDLE, 0, SEGMENT, 0);

     @Benchmark
     @BenchmarkMode(Mode.AverageTime)
     @OutputTimeUnit(TimeUnit.NANOSECONDS)
     public void genericHandleBenchmark()
     {
         GENERIC_HANDLE.set(SEGMENT, 0, 5);
     }

     @Benchmark
     @BenchmarkMode(Mode.AverageTime)
     @OutputTimeUnit(TimeUnit.NANOSECONDS)
     public void specHandleBenchmark()
     {
         SPEC_HANDLE.set(5);
     }

     @Benchmark
     @BenchmarkMode(Mode.AverageTime)
     @OutputTimeUnit(TimeUnit.NANOSECONDS)
     public void specFinalHandleBenchmark()
     {
         SPEC_HANDLE_FINAL.set(5);
     }

Sort of off-topic, but... I don't remember anyone saying previously that 
insertCoordinates would make that big a difference (or any at all!), so 
it's surprising to me. I was expecting a performance decrease due to the 
handle no longer being static final. Could javac maybe optimize this, so 
that wherever a call such as

GENERIC_HANDLE.set(SEGMENT, 0, 5);

appears, an optimized VarHandle equivalent to SPEC_HANDLE is created at 
compile time and used there instead?
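
To make the "equivalent to SPEC_HANDLE" part concrete, here is a minimal 
sketch of what binding the leading coordinates means semantically. It is 
only an illustration under a few assumptions: it uses a plain int[] and 
the standard java.lang.invoke API (MethodHandles.arrayElementVarHandle 
plus MethodHandles.insertArguments on the SET method handle) as a 
stand-in for MemorySegment and MemoryHandles.insertCoordinates, and the 
class and method names are made up for this example. It says nothing 
about performance.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical stand-in for the benchmark: an int[] replaces the MemorySegment.
public class BoundHandleSketch {
    static final int[] ARRAY = new int[1];

    // Generic handle: coordinates are (int[] array, int index).
    static final VarHandle GENERIC =
            MethodHandles.arrayElementVarHandle(int[].class);

    // "Specialized" setter: the two leading coordinates are bound up front,
    // leaving a method handle of type (int)void.
    static final MethodHandle BOUND_SET = MethodHandles.insertArguments(
            GENERIC.toMethodHandle(VarHandle.AccessMode.SET), 0, ARRAY, 0);

    // Write through the generic handle, passing both coordinates explicitly.
    static int genericSet(int value) {
        GENERIC.set(ARRAY, 0, value);
        return ARRAY[0];
    }

    // Write through the bound handle, passing only the value.
    static int boundSet(int value) {
        try {
            BOUND_SET.invokeExact(value);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
        return ARRAY[0];
    }

    public static void main(String[] args) {
        System.out.println(genericSet(5)); // prints 5
        System.out.println(boundSet(7));   // prints 7
    }
}
```

Both paths write to the same element; the only difference is whether the 
coordinates are supplied at the call site or baked into the handle.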

On 12/10/21 4:55 AM, Maurizio Cimadamore wrote:
> (resending since mailing lists were down yesterday - I apologize if 
> this results in duplicates).
>
> Hi,
> a few days ago some VM enhancements were integrated [1, 2], so it is 
> time to take a look again at where we are.
>
> I put together a branch which removes all workarounds (both for long 
> loops and for alignment checks):
>
> https://github.com/mcimadamore/jdk/tree/long_loop_workarounds_removal
>
> I also ran memory access benchmarks before/after, to see what the 
> difference is like - here's a visual report:
>
> https://jmh.morethan.io/?gists=dfa7075db33f7e6a2690ac80a64aa252,7f894f48460a6a0c9891cbe3158b43a7 
>
>
> Overall, I think the numbers are solid. The branch w/o workarounds 
> keeps up with mainline in basically all cases but one (UnrolledAccess - 
> this code pattern needs more work in the VM, but Roland Westrelin has 
> identified a possible fix for it). In some cases (parallel tests) we 
> see quite a big jump forward.
>
> I think it's hard to say how these results will translate to the real 
> world - my gut feeling is that the simpler bound checking logic will 
> almost invariably result in performance improvements with more complex 
> code patterns, despite what synthetic benchmarks might say (the current 
> logic in mainline is fragile as it has to guard against integer 
> overflow, which in turn sometimes kills BCE optimizations).
>
> So I'd be inclined to integrate these changes in 18.
>
> If you have a project that works against the Java 18 API, it would be 
> very helpful for us if you could try it on the above branch and report 
> back. This will help us make a more informed decision.
>
> Cheers
> Maurizio
>
> [1] - https://bugs.openjdk.java.net/browse/JDK-8276116
> [2] - https://bugs.openjdk.java.net/browse/JDK-8277850
>