Studying LF performance

Vladimir Kozlov vladimir.kozlov at oracle.com
Sun Dec 23 14:04:02 PST 2012


Hi Charlie,

If you want to experiment :) you can try the code Roland and Christian 
pushed.

Roland just pushed incremental inlining changes for C2, which should help 
LF inlining:

http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/d092d1b31229

You also need Christian's inlining-related changes in the JDK:

http://hg.openjdk.java.net/hsx/hotspot-main/jdk/rev/12fa4d7ecaf5

Regards,
Vladimir

On 12/23/12 11:21 AM, Charles Oliver Nutter wrote:
> A thread emerges!
>
> I'm going to be taking some time this holiday to explore the
> performance of the new LF indy impl in various situations. This will
> be the thread where I gather observations.
>
> A couple preliminaries...
>
> My perf exploration so far seems to show LF performing nearly
> equivalent to the old impl for the smallest benchmarks, with
> performance rapidly degrading as the size of the code involved grows.
> Recursive fib and tak have nearly identical perf on LF and the old
> impl. Red/black performs about the same on LF as with indy disabled,
> well behind the old indy performance. At some point, LF falls
> completely off the cliff and can't even compete with non-indy logic,
> as in a benchmark I ran today of Ruby constant access (heavily
> SwitchPoint-dependent).
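
For readers unfamiliar with the pattern, "SwitchPoint-dependent" constant
access can be sketched in plain Java roughly as below. This is only an
illustration of the general technique, not JRuby's actual implementation:
the class ConstantCacheSketch, its fields, and the methods slowLookup,
redefine and get are hypothetical names. Only SwitchPoint, MutableCallSite
and MethodHandles come from java.lang.invoke.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.invoke.SwitchPoint;

// Hypothetical sketch: a call site that caches a "constant" value and is
// invalidated through a SwitchPoint when the constant is redefined.
class ConstantCacheSketch {
    // Assumed backing store for the constant (stands in for a runtime table).
    static volatile Object currentValue = "v1";
    // Guards every cached value bound while it is still valid.
    static volatile SwitchPoint validity = new SwitchPoint();

    static final MutableCallSite SITE =
        new MutableCallSite(MethodType.methodType(Object.class));
    static final MethodHandle SLOW;
    static final MethodHandle INVOKER;

    static {
        try {
            SLOW = MethodHandles.lookup().findStatic(
                ConstantCacheSketch.class, "slowLookup",
                MethodType.methodType(Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
        SITE.setTarget(SLOW);              // start on the slow path
        INVOKER = SITE.dynamicInvoker();
    }

    // Slow path: read the value, then rebind the site to a guarded fast path.
    static Object slowLookup() {
        Object value = currentValue;
        MethodHandle fast = MethodHandles.constant(Object.class, value);
        // While 'validity' is valid the site returns the cached value;
        // once invalidated it falls back to SLOW and rebinds.
        SITE.setTarget(validity.guardWithTest(fast, SLOW));
        return value;
    }

    // Redefine the constant and invalidate every site guarded by the old SwitchPoint.
    static void redefine(Object newValue) {
        currentValue = newValue;
        SwitchPoint old = validity;
        validity = new SwitchPoint();
        SwitchPoint.invalidateAll(new SwitchPoint[] { old });
    }

    // Stand-in for the invokedynamic instruction a compiler would emit.
    static Object get() {
        try {
            return (Object) INVOKER.invokeExact();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

When the guarded fast path inlines, the access should reduce to the cached
value. When it does not, every access goes through the method handle chain,
which is the case the benchmark above appears to be hitting.
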
>
> Discussions with Christian seem to indicate that the fall-off is
> because non-inlined LF indy call sites perform very poorly compared to
> the old impl. I'll be trying to explore this and correlate the perf
> cliff with failure to inline. Christian has told me that (upcoming?)
> work on incremental inlining will help reduce the performance impact
> of the fall-off, but I'm not sure of the status of this work.
>
> Some early ASM output from a trivial benchmark: loop 500M times
> calling #foo, which immediately calls #bar, which just returns the
> self object (ALOAD 2; ARETURN in essence). I've been comparing the new
> ASM to the old, both presented in a gist here:
> https://gist.github.com/4365103
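
A rough plain-Java analogue of that benchmark shape, for anyone who wants to
reproduce the experiment without JRuby, might look like the sketch below. It
is an assumption-laden stand-in, not the code behind the gist: the class
FooBarBench and its methods are hypothetical, and a ConstantCallSite invoked
through its dynamic invoker only approximates a real invokedynamic call site.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical analogue of the #foo -> #bar benchmark described above.
class FooBarBench {
    // Stand-in for Ruby's #bar: simply returns the receiver.
    static Object bar(Object self) {
        return self;
    }

    // Invoker for a constant call site bound to bar.
    static final MethodHandle BAR;

    static {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                FooBarBench.class, "bar",
                MethodType.methodType(Object.class, Object.class));
            CallSite site = new ConstantCallSite(target);
            BAR = site.dynamicInvoker();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Stand-in for Ruby's #foo: immediately calls bar through the call site.
    static Object foo(Object self) {
        try {
            return (Object) BAR.invokeExact(self);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Calls foo the requested number of times and returns the last result.
    static Object run(long iterations, Object self) {
        Object last = null;
        for (long i = 0; i < iterations; i++) {
            last = foo(self);
        }
        return last;
    }

    public static void main(String[] args) {
        Object self = new Object();
        long start = System.nanoTime();
        Object result = run(500_000_000L, self);   // 500M iterations, as in the text
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println((result == self) + " in " + elapsedMs + " ms");
    }
}
```

Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining shows
which calls were inlined; -XX:+PrintAssembly additionally needs the hsdis
disassembler plugin to be installed.
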
>
> As you can see, the code resulting from both impls boils down to
> almost nothing, but there's one difference...
>
> New code not present in old:
>
> 0x0000000111ab27ef: je     0x0000000111ab2835  ;*ifnull
>                                                  ; - java.lang.Class::cast@1 (line 3007)
>                                                  ; - java.lang.invoke.LambdaForm$MH/763053631::guard@12
>                                                  ; - java.lang.invoke.LambdaForm$MH/518216626::linkToCallSite@14
>                                                  ; - ruby.__dash_e__::method__0$RUBY$foo@3 (line 1)
>
> A side effect of inlining through LFs, I presume? Checking to ensure
> non-null call site? If so, shouldn't this have folded away, since the
> call site is constant?
>
> In any case, it's hardly damning to have an extra branch. This output
> is, at least, proof that LF *can* inline and optimize as well as the
> old impl...so we can put that aside for now. The questions to explore
> then are:
>
> * Do cases expected to inline actually do so under LF impl?
> * When inlining, does code optimize as it should (across the various
> shapes of call sites in JRuby, at least)?
> * When code does not inline, how does it impact performance?
>
> My expectation is that cases which should inline do so under LF, but
> that the non-inlined performance is significantly worse than under the
> old impl. The critical bit will be ensuring that even when LF call
> sites do not inline, they at least still compile to avoid
> interpretation and LF-to-LF overhead. At a minimum, it seems like we
> should be able to expect that all LFs between a call site and its DMH
> target will get compiled into a single unit, if not inlined into the caller.
> I still contend that call site + LFs should be heavily prioritized for
> inlining either into the caller or along with the called method, since
> they really *are* the shape of the call site. If there has to be a
> callq somewhere in that chain, there should ideally be only one.
>
> So...here we go.
>
> - Charlie
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>

