[9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

Fri Jan 16 23:13:48 UTC 2015

On Jan 16, 2015, at 9:16 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/ <http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/>
> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/ <http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/>
> https://bugs.openjdk.java.net/browse/JDK-8063137 <https://bugs.openjdk.java.net/browse/JDK-8063137>
> ...
> PS: as a summary, my experiments show that fixes for 8063137 & 8068915 [2] almost completely recovers peak performance after LambdaForm sharing [3]. There's one more problem left (non-inlined MethodHandle invocations are more expensive when LFs are shared), but it's a story for another day.

This performance bump is excellent news.  LFs are supposed to express emergently common behaviors, like hidden classes.  We are much closer to that goal now.

I'm glad to see that the library-assisted profiling turns out to be relatively clean.

In effect this restores the pre-LF CountingMethodHandle logic from 2011, which was so beneficial in JDK 7:
  http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/02de5cdbef21/src/share/classes/java/lang/invoke/CountingMethodHandle.java

I have some suggestions to make this version a little cleaner; see below.

Starting with the JDK changes:

In LambdaForm.java, I'm feeling flag pressure from all the little boolean fields and constructor parameters.

(Is it time to put in a bit-encoded field "private byte LambdaForm.flags", or do we wait for another boolean to come along?  But see next questions, which are more important.)

What happens when a GWT LF gets inlined into a larger LF?  Then there might be two or more selectAlternative calls.
Will this confuse anything or will it Just Work?  The combined LF will get profiled as usual, and the selectAlternative calls will also collect profile (or not?).

This leads to another question:  Why have a boolean 'isGWT' at all?  Why not just check for one or more occurrence of selectAlternative, and declare that those guys override (some of) the profiling.  Something like:

  -+ if (PROFILE_GWT && lambdaForm.isGWT) 
  ++ if (PROFILE_GWT && lambdaForm.containsFunction(NF_selectAlternative))
(...where LF.containsFunction(NamedFunction) is a variation of LF.contains(Name).)

I suppose the answer may be that you want to inline GWTs (if ever) into customized code where the JVM profiling should get maximum benefit.  In that case case you might want to set the boolean to "false" to distinguish "immature" GWT combinators from customized ones.

If that's the case, perhaps the real boolean flag you want is not 'isGWT' but 'sharedProfile' or 'immature' or some such, or (inverting) 'customized'.  (I like the feel of a 'customized' flag.)  Then @IgnoreProfile would get attached to a LF that (a ) contains selectAlternative and (b ) is marked as non-customized/immature/shared.  You might also want to adjust the call to 'profileBranch' based on whether the containing LF was shared or customized.

What I'm mainly poking at here is that 'isGWT' is not informative about the intended use of the flag.

In 'updateCounters', if the counter overflows, you'll get continuous creation of ArithmeticExceptions.  Will that optimize or will it cause a permanent slowdown?  Consider a hack like this on the exception path:
   counters[idx] = Integer.MAX_VALUE / 2;

On the Name Bikeshed:  It looks like @IgnoreProfile (ignore_profile in the VM) promises too much "ignorance", since it suppresses branch counts and traps, but allows type profiles to be consulted.  Maybe something positive like "@ManyTraps" or "@SharedMegamorphic"?  (It's just a name, and this is just a suggestion.)

Going to the JVM:

In library_call.cpp, I think you should change the assert to a guard:
  -+     assert(aobj->length() == 2, "");
  ++     && aobj->length() == 2) {

In Parse::dynamic_branch_prediction, the mere presence of the Opaque4 node is enough to trigger replacement of profiling.  I think there should *not* be a test of method()->ignore_profile().  That should provide better integration between the two sources of profile data to JVM profiling?

Also, I think the name 'Opaque4Node' is way too… opaque.  Suggest 'ProfileBranchNode', since that's exactly what it does.

Suggest changing the log element "profile_branch" to "observe source='profileBranch'", to make a better hint as to the source of the info.

— John