Inlining
Charles Oliver Nutter
headius at headius.com
Mon Aug 24 08:29:50 PDT 2009
On Mon, Aug 24, 2009 at 5:08 AM, Raffaello
Giulietti<raffaello.giulietti at gmail.com> wrote:
> The target of the call site is a method handle to a method similar to
>
> static Obj invoke_0(Stub stub, Obj self) {
>     if (self.klass == stub.klass_0)
>         return stub.mh_0.<Obj>invoke(self);
>     // some other similar lines, depending on how polymorphic the inline
>     // cache shall be
>
>     // otherwise do a slow lookup, cache the results in the stub
>     // according to some strategy and invoke the final method
>     return lookupCacheInvoke(stub, self);
> }
>
> It is true that each call site's target refers to the same method
> handle, hence to the same dispatching method, i.e., invoke_0 above.
> However, every call site has its own Stub instance, so every call site
> caches its own information.
>
> But you state
>
>> It sounds like you're still dispatching through a generic piece of
>> code, yes? If you have a piece of code in the call path that all calls
>> pass through, you essentially defeat inlining entirely.
>
> Why is this so? I cannot see why invoke_0 couldn't be inlined at the
> invokedynamic call site. Is there a fundamental reason?
invoke_0 will be inlined, but the further calls through the method
handles may not be. If many different call sites all route through
this same method, then the <Obj>invoke(self) calls inside it become
polymorphic (even megamorphic) call sites. To the JVM, each one is a
single call site with a large number of possible targets, and HotSpot,
at least, can't inline across such a boundary.
What you actually want to do is install a MH into the indy call site
that points either directly at the eventual method to be called or at
a guardWithTest handle that performs your polymorphic check and then
decides on a slow or fast path call.
Here's what that looks like in JRuby:
    private static MethodHandle createGWT(MethodHandle test,
            MethodHandle target, MethodHandle fallback,
            CacheEntry entry, CallSite site) {
        MethodHandle myTest = MethodHandles.insertArguments(test, 0, entry);
        MethodHandle myTarget = MethodHandles.insertArguments(target, 0, entry);
        MethodHandle myFallback =
                MethodHandles.insertArguments(fallback, 0, site);
        MethodHandle guardWithTest =
                MethodHandles.guardWithTest(myTest, myTarget, myFallback);
        return MethodHandles.convertArguments(guardWithTest, site.type());
    }
The result of this call is installed directly into the indy call site.
The "test" boils down to this:
    public static boolean test(CacheEntry entry, IRubyObject self) {
        return entry.typeOk(self.getMetaClass());
    }
This just confirms that the CacheEntry (a tuple of class token and
method) is valid for the incoming self.
The fallback path ends up like this, a megamorphic slow-path bit of logic:
    public static IRubyObject fallback(JRubyCallSite site,
            ThreadContext context, IRubyObject caller, IRubyObject self,
            String name) {
        RubyClass selfClass = pollAndGetClass(context, self);
        CacheEntry entry = selfClass.searchWithCache(name);
        if (methodMissing(entry, site.callType(), name, caller)) {
            return callMethodMissing(entry, site.callType(), context,
                    self, name);
        }
        site.setTarget(createGWT(TEST_0, TARGET_0, FALLBACK_0, entry, site));
        return entry.method.call(context, self, selfClass, name);
    }
Notice that it reinstalls a *new* GWT with the new fast-path target,
and then calls the method directly.
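For anyone without the JRuby sources handy, here is a minimal,
self-contained sketch of the same relinking pattern. It is not JRuby
code. It assumes the java.lang.invoke API as it later shipped
(MutableCallSite, asType) rather than the preview API used above, it
caches only one receiver class per site, and CachingSite, newSite,
call and the relinks counter are names made up for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class InlineCacheSketch {
    // Hypothetical call site: remembers a method name and counts slow-path hits.
    static final class CachingSite extends MutableCallSite {
        final String name;
        int relinks; // for illustration only

        CachingSite(String name) {
            super(MethodType.methodType(Object.class, Object.class));
            this.name = name;
        }
    }

    static final MethodHandle TEST;
    static final MethodHandle FALLBACK;
    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            TEST = l.findStatic(InlineCacheSketch.class, "test",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
            FALLBACK = l.findStatic(InlineCacheSketch.class, "fallback",
                    MethodType.methodType(Object.class, CachingSite.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Guard: is the cached class still the class of the incoming receiver?
    static boolean test(Class<?> cached, Object self) {
        return self.getClass() == cached;
    }

    // Slow path: look the method up, install a *new* guard, then call the target.
    // Assumes a public, no-argument method returning String on a public class.
    static Object fallback(CachingSite site, Object self) throws Throwable {
        site.relinks++;
        Class<?> klass = self.getClass();
        MethodHandle target = MethodHandles.publicLookup()
                .findVirtual(klass, site.name, MethodType.methodType(String.class))
                .asType(site.type());
        MethodHandle guard = MethodHandles.insertArguments(TEST, 0, klass);
        MethodHandle slow = MethodHandles.insertArguments(FALLBACK, 0, site);
        site.setTarget(MethodHandles.guardWithTest(guard, target, slow));
        return target.invoke(self);
    }

    // A fresh site starts out pointing straight at the slow path.
    static CachingSite newSite(String name) {
        CachingSite site = new CachingSite(name);
        site.setTarget(MethodHandles.insertArguments(FALLBACK, 0, site));
        return site;
    }

    // Stand-in for an invokedynamic instruction linked to this site.
    static Object call(CachingSite site, Object self) {
        try {
            return site.dynamicInvoker().invoke(self);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        CachingSite site = newSite("toString");
        System.out.println(call(site, "abc")); // first call links the site
        System.out.println(call(site, "def")); // same class, guard passes
        System.out.println(call(site, 42));    // different class, relinks
        System.out.println(site.relinks);
    }
}
```

Note that nothing on the fast path is shared between sites: each site
holds its own guard, and the guard's target is a handle to the method
itself.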
The targets don't have any generic piece of code; they all bind
eventually to a virtual call to DynamicMethod.call, where
DynamicMethod is JRuby's method object abstraction. In each case, the
eventual implementation of "call" invoked should be monomorphic for a
given GWT fast path, but I will also be removing that last phase in
favor of a DirectMethodHandle that goes straight to the actual target
code (DynamicMethod impls are frequently generated code, since we
don't have method handles pre-JDK7). But the basic idea is that your
fast path should not call through any generalized Java code; it needs
to be handles all the way to a unique target, or inlining is defeated.
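To make the "last phase" concrete, here is a small sketch of the two
shapes a fast-path target can take. MethodObject is a hypothetical
stand-in for a method-object abstraction with one shared entry point
(it is not DynamicMethod), and shout plays the role of the actual
target code. Both handles return the same result; the second simply
has no intermediate Java method between the call site and the target.
It assumes the java.lang.invoke API as it later shipped.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class DirectTargetSketch {
    // Hypothetical method-object abstraction: every method is reached via call().
    interface MethodObject {
        Object call(Object self);
    }

    // The "actual target code" in this toy example.
    static String shout(String s) {
        return s.toUpperCase();
    }

    static final MethodType SITE_TYPE =
            MethodType.methodType(Object.class, Object.class);

    // Handle to the shared entry point, bound to one particular method object.
    static MethodHandle viaMethodObject(MethodObject m) {
        try {
            return MethodHandles.lookup()
                    .findVirtual(MethodObject.class, "call", SITE_TYPE)
                    .bindTo(m);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Handle straight to the target, adapted to the call site's type.
    static MethodHandle direct() {
        try {
            return MethodHandles.lookup()
                    .findStatic(DirectTargetSketch.class, "shout",
                            MethodType.methodType(String.class, String.class))
                    .asType(SITE_TYPE);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes a handle of type (Object)Object, wrapping any failure.
    static Object call(MethodHandle mh, Object arg) {
        try {
            return mh.invoke(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle wrapped = viaMethodObject(self -> shout((String) self));
        System.out.println(call(wrapped, "hi"));
        System.out.println(call(direct(), "hi"));
    }
}
```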
With indy now showing some perf improvement over my basic code, I'm
going to fix that last phase and prepare a blog post on all this.
- Charlie