Unusually high polymorphic dispatch costs?

Charles Oliver Nutter headius at headius.com
Fri Apr 29 13:12:03 PDT 2011


I think my strategy is going to be something like this:

* Bind the first method encountered straight through with a new GWT
* Upon encountering the nth new method, rebuild the GWT chain to check the
first, second, ..., nth targets PIC-style
* After some N encountered methods, fail over to a simple IC

Some variation of these (profiling GWT misses versus hits, direct call
+ IC chain, etc) should work around the GWT creation overhead.
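
A minimal sketch of that strategy in plain Java, not JRuby's actual code. Everything named here is hypothetical: the PicSite class, the MAX_PIC threshold, the A/B/C receiver classes, and the exact-class check standing in for a real guard. The idea is that a miss rebuilds the whole guardWithTest chain once per newly seen class, and once the threshold is passed the site is bound to a single-entry cache and never relinked again.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.ArrayList;
import java.util.List;

public class PicSketch {
    // Hypothetical threshold: beyond this many receiver classes, stop relinking.
    static final int MAX_PIC = 2;

    public static class A { public Object foo() { return "A"; } }
    public static class B { public Object foo() { return "B"; } }
    public static class C { public Object foo() { return "C"; } }

    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static final MethodType SITE = MethodType.methodType(Object.class, Object.class);

    // Stand-in for a real guard such as a class-generation check.
    static boolean isExact(Class<?> k, Object self) { return self.getClass() == k; }

    static class PicSite extends MutableCallSite {
        final List<Class<?>> seen = new ArrayList<>();
        int relinks = 0;           // number of times the target was replaced
        Class<?> cachedClass;      // single-entry cache used after failover
        MethodHandle cachedTarget;

        PicSite() {
            super(SITE);
            setTarget(bound("miss"));
        }

        // Looks up an instance method of this site and binds it to this site.
        MethodHandle bound(String name) {
            try {
                return LOOKUP.findVirtual(PicSite.class, name, SITE).bindTo(this);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }

        // Direct handle to k.foo(), adapted to the site's (Object)Object type.
        static MethodHandle direct(Class<?> k) throws ReflectiveOperationException {
            return LOOKUP.findVirtual(k, "foo", MethodType.methodType(Object.class)).asType(SITE);
        }

        // Slow path: runs only when no guard in the current chain matched.
        Object miss(Object self) throws Throwable {
            seen.add(self.getClass());
            relinks++;
            if (seen.size() > MAX_PIC) {
                setTarget(bound("inlineCache")); // fail over; no further relinking
            } else {
                MethodHandle test = LOOKUP.findStatic(PicSketch.class, "isExact",
                        MethodType.methodType(boolean.class, Class.class, Object.class));
                MethodHandle chain = bound("miss");
                // Build back to front so the first class seen is tested first.
                for (int i = seen.size() - 1; i >= 0; i--) {
                    Class<?> k = seen.get(i);
                    chain = MethodHandles.guardWithTest(test.bindTo(k), direct(k), chain);
                }
                setTarget(chain);
            }
            return direct(self.getClass()).invoke(self);
        }

        // Simple IC: one cached class/handle pair, updated without relinking.
        Object inlineCache(Object self) throws Throwable {
            Class<?> k = self.getClass();
            if (k != cachedClass) {
                cachedTarget = direct(k);
                cachedClass = k;
            }
            return cachedTarget.invoke(self);
        }

        // Convenience wrapper so callers need not handle Throwable.
        Object call(Object self) {
            try {
                return dynamicInvoker().invoke(self);
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
    }

    public static void main(String[] args) {
        PicSite site = new PicSite();
        Object[] receivers = { new A(), new B(), new A(), new B(), new C(), new A(), new C() };
        StringBuilder out = new StringBuilder();
        for (Object r : receivers) out.append(site.call(r));
        System.out.println(out + " relinks=" + site.relinks);
    }
}
```

In this sketch the GWT construction cost is paid at most MAX_PIC + 1 times per site, no matter how often the receivers alternate afterwards.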

I suppose you could pre-create GWTs, but that seems more cumbersome
since in many/most cases the test will be time-sensitive (e.g. in
JRuby where it's based on the current "generation" of a class, which
will change over time).
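
To illustrate why pre-creating is awkward, here is a rough sketch of a generation-style test. It is not JRuby's real guard: Meta, Obj, sameGeneration and testFor are made-up names. The point is only that the test handle captures a generation value at link time, so a GWT built in advance would carry a value that may already be stale by the time it is used.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GenerationGuardSketch {
    // Hypothetical class metadata whose generation bumps on every redefinition.
    static class Meta {
        int generation = 0;
        void redefineMethod() { generation++; }
    }

    // Hypothetical receiver that points at its class metadata.
    static class Obj {
        final Meta meta;
        Obj(Meta meta) { this.meta = meta; }
    }

    // Passes only if the receiver has the expected metadata and that
    // metadata is still at the generation captured when the test was built.
    static boolean sameGeneration(Meta expected, int generation, Object self) {
        return self instanceof Obj
                && ((Obj) self).meta == expected
                && expected.generation == generation;
    }

    // Builds an (Object)boolean test with the current generation baked in.
    static MethodHandle testFor(Meta meta) {
        try {
            MethodHandle raw = MethodHandles.lookup().findStatic(
                    GenerationGuardSketch.class, "sameGeneration",
                    MethodType.methodType(boolean.class, Meta.class, int.class, Object.class));
            return MethodHandles.insertArguments(raw, 0, meta, meta.generation);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Convenience wrapper so callers need not handle Throwable.
    static boolean passes(MethodHandle test, Object self) {
        try {
            return (boolean) test.invoke(self);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Meta meta = new Meta();
        Obj obj = new Obj(meta);
        MethodHandle test = testFor(meta);
        System.out.println(passes(test, obj)); // generation unchanged since link time
        meta.redefineMethod();
        System.out.println(passes(test, obj)); // generation moved on, test is stale
    }
}
```

A test like this would be used as the first argument to guardWithTest, which is why it has to be rebuilt whenever the generation it was linked against changes.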

For monomorphic, GWT + direct handle will certainly be faster than
CachingCallSite. For some N-morphic, a GWT "PIC chain" will still be
faster than CCS. And then the failover should be no slower than CCS.
Not a bad trade-off.

- Charlie

On Fri, Apr 29, 2011 at 2:59 PM, Ola Bini <ola.bini at gmail.com> wrote:
> Hi,
>
> Given that creating GWTs is expensive, is it a really bad idea to
> create them and bind them on a cache miss then? My current logic for
> call sites looks something like this:
>
>   invoke call site
>        if fallback, check if current morphism is < 10.
>            If so, create a new GWT with the currently found method and
> appropriate test.
>
> How would you recommend doing this without creating GWTs at runtime?
> Have ten slots in the call site and pre-create the GWTs that use them?
>
> Cheers
>
> On 2011-04-29 09.59, Rémi Forax wrote:
>> On 04/28/2011 09:58 PM, Charles Oliver Nutter wrote:
>>> I'm trying to figure out why polymorphic dispatch is incredibly slow
>>> in JRuby + indy. Take this benchmark, for example:
>>>
>>> class A; def foo; end; end
>>> class B; def foo; end; end
>>>
>>> a = A.new
>>> b = B.new
>>>
>>> 5.times { puts Benchmark.measure { 1000000.times { a, b = b, a; a.foo;
>>> b.foo } } }
>>>
>>> a.foo and b.foo are bimorphic here. Under stock JRuby, using
>>> CachingCallSite, this benchmark runs in about 0.13s per iteration.
>>> Using invokedynamic, it takes 9s!!!
>>>
>>> This is after a patch I just committed that caches the target method
>>> handle for direct paths. I believe the only thing created when GWT
>>> fails now is a new GWT.
>>
>> If you want to emulate a bimorphic cache, you should have two GWTs.
>> So no construction of new GWT after discovering all possible targets
>> for the two callsites.
>>
>> Relying on a mutable MethodHandle, i.e. a method handle that changes
>> on every call, will not work well because the JIT will not be able to
>> inline through this mutable method handle.
>>
>>> Is it expected that rebinding a call site or constructing a GWT would
>>> be very expensive? If yes...I will have to look into having a hard
>>> failover to inline caching or a PIC-like handle chain for polymorphic
>>> cases. That's not necessarily difficult. If no...I'm happy to update
>>> my build and play with patches to see what's happening here.
>>
>> Yes, it's expensive.
>> The target of a CallSite should be stable.
>> So yes, it's expensive, and yes, it's intended.
>>
>>> A sampled profile produced the following output:
>>>
>>>           Stub + native   Method
>>>   57.6%     0  +  5214    java.lang.invoke.MethodHandleNatives.init
>>>   30.9%     0  +  2798    java.lang.invoke.MethodHandleNatives.init
>>>    2.1%     0  +   189    java.lang.invoke.MethodHandleNatives.getTarget
>>>    0.1%     0  +     7    java.lang.Object.getClass
>>>    0.0%     0  +     3    java.lang.Class.isPrimitive
>>>    0.0%     0  +     3    java.lang.System.arraycopy
>>>   90.7%     0  +  8214    Total stub
>>>
>>> Of course we all know how accurate sampled profiles are, but this is
>>> a pretty dismal result.
>>>
>>> I suspect that this polymorphic cost is a *major* factor in slowing
>>> down some benchmarks under invokedynamic. FWIW, the above benchmark
>>> without the a,b swap runs in 0.06s, better than 2x faster than stock
>>> JRuby (yay!).
>>>
>>> - Charlie
>>
>> Rémi
>>
>> _______________________________________________
>> mlvm-dev mailing list
>> mlvm-dev at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>
>
>
> --
>  Ola Bini (http://olabini.com)
>  Ioke - JRuby - ThoughtWorks
>
>  "Yields falsehood when quined" yields falsehood when quined.
>

