MethodHandle performance

Sat Jan 14 14:53:46 UTC 2017

Attila,
you can declare spreaderCache as a @Stable field initialized to null instead of as a final field, so you do not have to use Unsafe to rewrite it.
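
A minimal sketch of the null-means-unset pattern, for the fixed-arity case only. @Stable is a JDK-internal annotation and is not usable from application code, so the sketch uses a plain non-final field; inside java.lang.invoke the field would carry @Stable. The class name CachedSpreader and its layout are hypothetical and are not taken from the gist.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class CachedSpreader {
    private final MethodHandle target;   // assumed to be a fixed-arity handle

    // Inside the JDK this would be declared "@Stable MethodHandle spreader".
    // null means "not generated yet"; it is written once and then only read.
    private MethodHandle spreader;

    public CachedSpreader(MethodHandle target) {
        this.target = target;
    }

    public Object invokeWithArguments(Object... args) throws Throwable {
        MethodHandle s = spreader;
        if (s == null) {
            // First call: erase the type to (Object,...)Object, then spread
            // a single Object[] over all parameters. Two racing threads may
            // both build a spreader; either result is equally valid.
            MethodType generic = target.type().generic();
            s = target.asType(generic)
                      .asSpreader(Object[].class, generic.parameterCount());
            spreader = s;
        }
        // The cached spreader has the exact type (Object[])Object.
        return s.invokeExact(args);
    }

    public static void main(String[] unused) throws Throwable {
        MethodHandle max = MethodHandles.lookup().findStatic(
                Math.class, "max",
                MethodType.methodType(int.class, int.class, int.class));
        CachedSpreader cached = new CachedSpreader(max);
        System.out.println(cached.invokeWithArguments(3, 7)); // prints 7
        System.out.println(cached.invokeWithArguments(9, 2)); // prints 9, reusing the spreader
    }
}
```

Only the first call pays for asType and asSpreader; later calls go straight to invokeExact on the cached handle.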

Rémi

----- Original Message -----
> From: "Hontvári Attila" <attila at hontvari.net>
> To: jigsaw-dev at openjdk.java.net
> Sent: Saturday, January 14, 2017 13:56:58
> Subject: Re: MethodHandle performance

> As an experiment I have reimplemented MethodHandle::invokeWithArguments
> so that it generates a spreader only on the first invocation and reuses
> it afterwards. It is now roughly 10 times faster, which brings it to
> the performance of reflection. If we don't pass primitive arguments, the
> performance is close to MethodHandle::invoke.
> 
> https://gist.github.com/hoat4/b459938cf7ae93e64bba3208c69af567
> 
> On the first invocation of invokeWithArguments (iWA), the new code checks
> whether the MH is a fixed-arity MH or a varargs collector. The fixed-arity
> case is simple: it stores the spreadInvoker in a field to be called by
> iWA. If the MH is a varargs collector, it creates a new object that
> caches the spreaders by argument count, and the iWA calls are
> forwarded to this object.
> 
> To enable inlining of a constant MH's iWA, the spreader is stored in a
> final field. The field's initial value is an MH pointing to a setup
> method, and when it is called, it generates the spreader, and rewrites
> the final field with the generated spreader. This is risky, but I
> couldn't induce the JVM to inline the wrong spreader method. I haven't
> considered concurrency problems.
> 
> I ran Michael Rasmussen's benchmark. This is the original JDK 8
> MethodHandle:
> 
> Benchmark                              Mode  Cnt    Score     Error  Units
> MyBenchmark.invoke                     avgt    5   25,611 ±   0,256  ns/op
> MyBenchmark.invokeExact                avgt    5   25,658 ±   0,116  ns/op
> MyBenchmark.invokeWithArguments        avgt    5  397,023 ±  39,137  ns/op
> MyBenchmark.reflective                 avgt    5   42,578 ±   4,206  ns/op
> MyBenchmark.staticInvoke               avgt    5   18,863 ±   0,417  ns/op
> MyBenchmark.staticInvokeExact          avgt    5   18,918 ±   0,461  ns/op
> MyBenchmark.staticInvokeWithArguments  avgt    5  390,777 ±  41,888  ns/op
> 
> And this is the new code's performance:
> 
> Benchmark                              Mode  Cnt   Score    Error  Units
> MyBenchmark.invoke                     avgt    5  25,623 ±  0,249  ns/op
> MyBenchmark.invokeExact                avgt    5  25,623 ±  0,390  ns/op
> MyBenchmark.invokeWithArguments        avgt    5  44,167 ±  0,774  ns/op
> MyBenchmark.reflective                 avgt    5  42,549 ±  4,202  ns/op
> MyBenchmark.staticInvoke               avgt    5  19,025 ±  0,417  ns/op
> MyBenchmark.staticInvokeExact          avgt    5  18,910 ±  0,304  ns/op
> MyBenchmark.staticInvokeWithArguments  avgt    5  32,013 ±  2,749  ns/op
> 
>  Attila
> 
> On 2017-01-13 at 20:04, John Rose wrote:
>> On Jan 12, 2017, at 12:29 PM, Claes Redestad <claes.redestad at oracle.com> wrote:
>>> Right, I was just looking at the micro Stephen provided me, and it does
>>> seem that the added cost for this case is due to invokeWithArguments
>>> creating a new invoker every time.
>> This is a good workaround, and Stephen's report is a helpful reminder
>> that our performance story has a sharp edge.
>>
>> We cache spreaders in the case of varargs methods,
>> for full performance, but not for the ad hoc spreader used by MH.iWA.
>>
>> We should cache them, to remove this sharp edge (or performance pothole).
>> There are small technical challenges to do so.  Claes and I added
>> some notes to the bug report; maybe someone can look into it more.
>>
>> — John
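
For the varargs-collector case described in the quoted message, caching the spreaders by argument count could look roughly like the sketch below. It uses only the public java.lang.invoke API and a ConcurrentHashMap. The class name ArityCachedInvoker is hypothetical, and this is not the code from the gist or from the JDK.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.ConcurrentHashMap;

public class ArityCachedInvoker {
    private final MethodHandle target;   // expected to be a varargs collector

    // One spreader per argument count, created on first use.
    private final ConcurrentHashMap<Integer, MethodHandle> spreaders =
            new ConcurrentHashMap<>();

    public ArityCachedInvoker(MethodHandle target) {
        this.target = target;
    }

    public Object invokeWithArguments(Object... args) throws Throwable {
        MethodHandle s = spreaders.computeIfAbsent(args.length, this::makeSpreader);
        // Every cached spreader has the exact type (Object[])Object.
        return s.invokeExact(args);
    }

    private MethodHandle makeSpreader(int argCount) {
        // asType on a varargs collector collects the trailing arguments into
        // the array parameter; asSpreader then accepts them as one Object[].
        MethodType generic = MethodType.genericMethodType(argCount);
        return target.asType(generic).asSpreader(Object[].class, argCount);
    }

    public static void main(String[] unused) throws Throwable {
        // String.format(String, Object...) is looked up as a varargs collector.
        MethodHandle format = MethodHandles.lookup().findStatic(
                String.class, "format",
                MethodType.methodType(String.class, String.class, Object[].class));
        ArityCachedInvoker invoker = new ArityCachedInvoker(format);
        System.out.println(invoker.invokeWithArguments("%s-%s", 1, 2)); // prints 1-2
        System.out.println(invoker.invokeWithArguments("%s", "x"));     // prints x
    }
}
```

A map lookup on every call is simpler than a field per arity but probably harder for the JIT to fold away, so this shows the caching structure rather than a tuned implementation.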