Weird performance behavior involving VarHandles
Jorn Vernee
jorn.vernee at oracle.com
Thu Apr 25 17:56:43 UTC 2024
I can reproduce this locally:
Benchmark                               Mode  Cnt  Score   Error  Units
ReproducerBenchmarks.control            avgt    5  1.280 ± 0.015  ns/op
ReproducerBenchmarks.gwt2_methodhandle  avgt    5  1.690 ± 0.008  ns/op
ReproducerBenchmarks.gwt_methodhandle   avgt    5  1.305 ± 0.038  ns/op
Disabling tiered compilation 'fixes' the performance of gwt2:
Benchmark                               Mode  Cnt  Score   Error  Units
ReproducerBenchmarks.control            avgt    5  1.299 ± 0.016  ns/op
ReproducerBenchmarks.gwt2_methodhandle  avgt    5  1.312 ± 0.030  ns/op
ReproducerBenchmarks.gwt_methodhandle   avgt    5  1.303 ± 0.034  ns/op
In both cases the generated assembly looks identical, though, so this may
just be down to a different code cache layout (or something like that).
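When comparing runs like the two above, it can help to confirm that the benchmark JVM really ran with tiered compilation switched off (for example via -XX:-TieredCompilation). A minimal sketch of such a check, assuming a HotSpot JVM where the com.sun.management API is available; the class name TieredCheck is only illustrative and not part of the reproducer:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative helper (hypothetical name): reports the effective value of
// the TieredCompilation flag in the JVM this code is running in.
public class TieredCheck {

    // Returns "true" or "false" as reported by the HotSpot diagnostic bean.
    static String tieredCompilationSetting() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("TieredCompilation").getValue();
    }

    public static void main(String[] args) {
        System.out.println("TieredCompilation = " + tieredCompilationSetting());
    }
}
```

Running it once with default flags and once with -XX:-TieredCompilation should show the two different settings.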
Jorn
On 25/04/2024 00:28, Maurizio Cimadamore wrote:
>
> Cool benchmark/test case!
>
> I don't know off-hand where the difference could be coming from - but
> just curious: did you try accessing in a loop (e.g. to see if checks
> are hoisted as expected) ?
>
> I seem to recall that the lambda forms for guards-with-test are rather
> complex, as they need to profile the various branches. I wonder if
> some "leftover" from the profiling code stays there and pollutes the
> benchmark?
>
> Maurizio
>
> On 24/04/2024 07:37, Remi Forax wrote:
>> I get
>>
>> Benchmark                               Mode  Cnt  Score   Error  Units
>> ReproducerBenchmarks.control            avgt    5  1.250 ± 0.024  ns/op
>> ReproducerBenchmarks.gwt2_methodhandle  avgt    5  1.852 ± 0.024  ns/op
>>
>> and I don't understand why there is a difference in performance because
>> for c2, the strings "x" and "y" are constant so the corresponding
>> VarHandles should be constant thus optimized the same way.
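For readers without the reproducer at hand, the general shape being discussed (a guardWithTest method handle that picks between two field accessors derived from VarHandles) can be sketched roughly as follows. This is not the original benchmark: the Point class, the GwtSketch class, and the read helper are assumptions made up for illustration, and the real gwt/gwt2 variants may be built differently.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.VarHandle;

// Hypothetical sketch, not the original ReproducerBenchmarks code.
public class GwtSketch {

    // Hypothetical holder with two int fields named "x" and "y".
    static final class Point {
        int x = 1;
        int y = 2;
    }

    // (String, Point)int: reads x if the name is "x", otherwise reads y.
    static final MethodHandle GWT;

    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            VarHandle xHandle = lookup.findVarHandle(Point.class, "x", int.class);
            VarHandle yHandle = lookup.findVarHandle(Point.class, "y", int.class);

            // Plain getters of type (Point)int derived from the VarHandles.
            MethodHandle getX = xHandle.toMethodHandle(VarHandle.AccessMode.GET);
            MethodHandle getY = yHandle.toMethodHandle(VarHandle.AccessMode.GET);

            // Add an ignored leading String parameter: (String, Point)int.
            MethodHandle targetX = MethodHandles.dropArguments(getX, 0, String.class);
            MethodHandle targetY = MethodHandles.dropArguments(getY, 0, String.class);

            // Test of type (String)boolean, equivalent to "x".equals(name).
            MethodHandle equals = lookup.findVirtual(String.class, "equals",
                    MethodType.methodType(boolean.class, Object.class));
            MethodHandle isX = equals.bindTo("x")
                    .asType(MethodType.methodType(boolean.class, String.class));

            GWT = MethodHandles.guardWithTest(isX, targetX, targetY);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Invokes the static final handle; with a constant name the JIT can in
    // principle fold the test and inline the selected field read.
    static int read(String name, Point p) {
        try {
            return (int) GWT.invokeExact(name, p);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Point p = new Point();
        System.out.println(read("x", p)); // prints 1
        System.out.println(read("y", p)); // prints 2
    }
}
```

Keeping the handle in a static final field is what lets the compiler treat it as a constant, which is the precondition for the "optimized the same way" expectation above.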