MethodHandle performance

Sat Jan 14 16:23:19 UTC 2017

That makes the code simpler. The performance doesn't change. I've
updated the gist in place.

On 2017-01-14 15:53, Remi Forax wrote:
> Attila,
> you can use @Stable initialized with null instead of final when you declare the spreaderCache so you do not have to use Unsafe.
>
> Rémi
>
> ----- Original Message -----
>> From: "Hontvári Attila" <attila at hontvari.net>
>> To: jigsaw-dev at openjdk.java.net
>> Sent: Saturday, January 14, 2017 13:56:58
>> Subject: Re: MethodHandle performance
>> As an experiment, I have reimplemented MethodHandle::invokeWithArguments
>> so that it generates a spreader only on the first invocation and reuses
>> it afterwards. It is now about 10 times faster, which brings it to the
>> performance of reflection. If we don't pass primitive arguments, the
>> performance is close to that of MethodHandle::invoke.
>>
>> https://gist.github.com/hoat4/b459938cf7ae93e64bba3208c69af567
>>
>> On the first invocation of iWA (invokeWithArguments), the new code
>> checks whether the MH is a fixed-arity MH or a varargs collector. The
>> fixed-arity case is simple: it stores the spreadInvoker in a field to
>> be called by iWA. If the MH is a varargs collector, it creates a new
>> object that caches the spreaders by argument count, and the iWA calls
>> are forwarded to this object.
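
A minimal sketch of caching spreaders by argument count, using only the public MethodHandles.spreadInvoker API. This is not the code from the gist: the class name CachedSpreadInvoker and the use of a ConcurrentHashMap are assumptions made for illustration, and the real patch lives inside MethodHandle itself.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not from the gist): keeps one spreader per argument count.
final class CachedSpreadInvoker {
    private final MethodHandle target;
    private final ConcurrentHashMap<Integer, MethodHandle> spreaders = new ConcurrentHashMap<>();

    CachedSpreadInvoker(MethodHandle target) {
        this.target = target;
    }

    Object invokeWithArguments(Object... args) throws Throwable {
        // The spreader is generated only the first time a given argument count is seen.
        MethodHandle spreader =
                spreaders.computeIfAbsent(args.length, CachedSpreadInvoker::makeSpreader);
        // Spreader type is (MethodHandle, Object[])Object.
        return (Object) spreader.invokeExact(target, args);
    }

    private static MethodHandle makeSpreader(int argCount) {
        // Invoker for any handle convertible to (Object, ..., Object)Object,
        // with all arguments passed in a single trailing Object[].
        return MethodHandles.spreadInvoker(MethodType.genericMethodType(argCount), 0);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle max = MethodHandles.lookup().findStatic(
                Math.class, "max", MethodType.methodType(int.class, int.class, int.class));
        CachedSpreadInvoker invoker = new CachedSpreadInvoker(max);
        System.out.println(invoker.invokeWithArguments(3, 7)); // prints 7
    }
}
```

For a fixed-arity target the map only ever holds one entry, so a single field would do, which is the simple case described above.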
>>
>> To enable inlining of a constant MH's iWA, the spreader is stored in a
>> final field. The field's initial value is an MH pointing to a setup
>> method, and when it is called, it generates the spreader, and rewrites
>> the final field with the generated spreader. This is risky, but I
>> couldn't induce the JVM to inline the wrong spreader method. I haven't
>> considered concurrency problems.
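
A rough sketch of the same "setup method replaces itself" idea, written against public API. It swaps in a different mechanism: instead of rewriting a final field, it uses a MutableCallSite whose initial target is the setup method. The class LazySpreader and its members are hypothetical names, it assumes a fixed-arity target, and like the original it makes no attempt to handle concurrent first calls carefully.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

// Hypothetical illustration (not from the gist): the first call runs setup(),
// which installs the generated spreader as the call site's new target.
final class LazySpreader {
    private final MethodHandle target;
    private final MutableCallSite site;
    private final MethodHandle invoker;
    int setupCalls; // only here so the demo can show that setup() runs once

    LazySpreader(MethodHandle target) throws ReflectiveOperationException {
        this.target = target;
        MethodType siteType = MethodType.methodType(Object.class, Object[].class);
        this.site = new MutableCallSite(siteType);
        MethodHandle setup = MethodHandles.lookup()
                .findVirtual(LazySpreader.class, "setup", siteType)
                .bindTo(this);
        site.setTarget(setup);
        this.invoker = site.dynamicInvoker();
    }

    Object invokeWithArguments(Object... args) throws Throwable {
        return (Object) invoker.invokeExact(args);
    }

    // Runs on the first call only: generates the spreader and replaces itself.
    Object setup(Object[] args) throws Throwable {
        setupCalls++;
        MethodHandle spreader = MethodHandles
                .spreadInvoker(MethodType.genericMethodType(args.length), 0)
                .bindTo(target); // type is now (Object[])Object
        site.setTarget(spreader);
        return (Object) spreader.invokeExact(args);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle max = MethodHandles.lookup().findStatic(
                Math.class, "max", MethodType.methodType(int.class, int.class, int.class));
        LazySpreader lazy = new LazySpreader(max);
        System.out.println(lazy.invokeWithArguments(3, 7)); // prints 7
        System.out.println(lazy.invokeWithArguments(9, 2)); // prints 9
        System.out.println(lazy.setupCalls);                // prints 1
    }
}
```

Because the installed spreader is tied to the argument count of the first call, a later call with a different count fails, which is why this only fits the fixed-arity case.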
>>
>> I ran Michael Rasmussen's benchmark. This is the original JDK 8
>> MethodHandle:
>>
>> Benchmark                              Mode  Cnt    Score Error  Units
>> MyBenchmark.invoke                     avgt    5   25,611 ± 0,256  ns/op
>> MyBenchmark.invokeExact                avgt    5   25,658 ± 0,116  ns/op
>> MyBenchmark.invokeWithArguments        avgt    5  397,023 ± 39,137  ns/op
>> MyBenchmark.reflective                 avgt    5   42,578 ± 4,206  ns/op
>> MyBenchmark.staticInvoke               avgt    5   18,863 ± 0,417  ns/op
>> MyBenchmark.staticInvokeExact          avgt    5   18,918 ± 0,461  ns/op
>> MyBenchmark.staticInvokeWithArguments  avgt    5  390,777 ± 41,888  ns/op
>>
>> And this is the new code's performance:
>>
>> Benchmark                              Mode  Cnt   Score Error  Units
>> MyBenchmark.invoke                     avgt    5  25,623 ± 0,249 ns/op
>> MyBenchmark.invokeExact                avgt    5  25,623 ± 0,390 ns/op
>> MyBenchmark.invokeWithArguments        avgt    5  44,167 ± 0,774 ns/op
>> MyBenchmark.reflective                 avgt    5  42,549 ± 4,202 ns/op
>> MyBenchmark.staticInvoke               avgt    5  19,025 ± 0,417 ns/op
>> MyBenchmark.staticInvokeExact          avgt    5  18,910 ± 0,304 ns/op
>> MyBenchmark.staticInvokeWithArguments  avgt    5  32,013 ± 2,749 ns/op
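
For reference, the speedups implied by the two tables can be worked out directly from the average scores. The small program below only divides the numbers quoted above; the class name Speedup is made up for this illustration.

```java
import java.util.Locale;

// Divides the "before" score by the "after" score for the two iWA benchmarks.
final class Speedup {
    static double ratio(double beforeNsPerOp, double afterNsPerOp) {
        return beforeNsPerOp / afterNsPerOp;
    }

    public static void main(String[] args) {
        // Scores in ns/op, copied from the two benchmark tables.
        System.out.printf(Locale.ROOT, "invokeWithArguments: %.1fx%n",
                ratio(397.023, 44.167));        // prints 9.0x
        System.out.printf(Locale.ROOT, "staticInvokeWithArguments: %.1fx%n",
                ratio(390.777, 32.013));        // prints 12.2x
    }
}
```

So the improvement is roughly 9x for the virtual case and roughly 12x for the static case, with the virtual case landing within a few percent of the reflective score.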
>>
>>   Attila
>>
>> On 2017-01-13 20:04, John Rose wrote:
>>> On Jan 12, 2017, at 12:29 PM, Claes Redestad <claes.redestad at oracle.com> wrote:
>>>> Right, I was just looking at the micro Stephen provided me, and it does
>>>> seem that the added cost for this case is due to invokeWithArguments
>>>> creating a new invoker every time.
>>> This is a good workaround, and Stephen's report is a helpful reminder
>>> that our performance story has a sharp edge.
>>>
>>> We cache spreaders in the case of varargs methods,
>>> for full performance, but not for the ad hoc spreader used by MH.iWA.
>>>
>>> We should cache them, to remove this sharp edge (or performance pothole).
>>> There are small technical challenges to do so.  Claes and I added
>>> some notes to the bug report; maybe someone can look into it more.
>>>
>>> — John