[9] RFR (M): 8063137: Never-taken branches should be pruned when GWT LambdaForms are shared

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Jan 20 19:22:56 UTC 2015


Looks good.

The webrev shows empty changes for macro.cpp. Please make sure nothing in
it is changed when you push.

Thanks,
Vladimir

On 1/20/15 11:10 AM, Vladimir Ivanov wrote:
>>>> You forgot to mark Opaque4Node as a macro node. I would suggest basing
>>>> it on Opaque2Node; then you will get some methods from it.
>>> Do I really need to do so? I expect it to go away during the IGVN pass
>>> right after parsing is over. That's why I register the node for igvn in
>>> LibraryCallKit::inline_profileBranch(). Changes in macro.cpp &
>>> compile.cpp are leftovers from the version when Opaque4 was a macro
>>> node. I plan to remove them.
>>
>> I see, this is why you did not inherit from it. Okay. I would suggest
>> leaving an assert in compile.cpp to make sure no Opaque4Node is left
>> behind.
>>
>> I found a typo when I looked today (should be '&&'):
>>
>> + Node *Opaque4Node::Ideal(PhaseGVN *phase, bool can_reshape) {
>> +   if (can_reshape & _delay_removal) {
> Good catch! Fixed in the latest version:
> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.01/hotspot
>
> Best regards,
> Vladimir Ivanov
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>> On 1/16/15 9:16 AM, Vladimir Ivanov wrote:
>>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/hotspot/
>>>>> http://cr.openjdk.java.net/~vlivanov/8063137/webrev.00/jdk/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8063137
>>>>>
>>>>> After GuardWithTest (GWT) LambdaForms became shared, profile pollution
>>>>> significantly distorted compilation decisions. It affected inlining and
>>>>> hindered some optimizations, causing significant performance
>>>>> regressions for Nashorn (on Octane benchmarks).
>>>>>
>>>>> Inlining was fixed by 8059877 [1], but that fix didn't cover the case
>>>>> when a branch is never taken. That case can cause missed optimization
>>>>> opportunities, not just an increase in code size. For example, a
>>>>> non-pruned branch can break escape analysis.
>>>>>
>>>>> Currently, there are 2 problems:
>>>>>    - branch frequencies profile pollution
>>>>>    - deoptimization counts pollution
>>>>>
>>>>> Branch frequency pollution hides from the JIT the fact that a branch is
>>>>> never taken. Since GWT LambdaForms (and hence their bytecode) are
>>>>> heavily shared, but the behavior is specific to each MethodHandle,
>>>>> there's no way for the JIT to understand how a particular GWT instance
>>>>> behaves.
>>>>>
>>>>> The solution I propose is to do profiling in Java code and feed it to
>>>>> the JIT. Every GWT MethodHandle holds an auxiliary array (int[2]) where
>>>>> profiling info is stored. Once the JIT kicks in, it can retrieve these
>>>>> counts if the corresponding MethodHandle is a compile-time constant
>>>>> (which is usually the case). To communicate the profile data from Java
>>>>> code to the JIT, MethodHandleImpl::profileBranch() is used.
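
The Java-side profiling described above can be sketched roughly as follows. This is a minimal illustration, not the code from the webrev: the class name BranchProfileSketch, the helper neverTaken(), the saturating increment, and the exact signature of profileBranch() are assumptions. Only the int[2] counter array and the method name come from the message.

```java
// Hypothetical sketch of per-MethodHandle branch profiling.
// Names and signatures are illustrative assumptions, not the webrev's code.
final class BranchProfileSketch {

    // counters[0] counts "false" outcomes, counters[1] counts "true" outcomes.
    // The test result is returned unchanged so the call can wrap the GWT test.
    static boolean profileBranch(boolean result, int[] counters) {
        int idx = result ? 1 : 0;
        if (counters[idx] < Integer.MAX_VALUE) { // saturate instead of overflowing
            counters[idx]++;
        }
        return result;
    }

    // A branch whose counter is still zero has never been observed,
    // which makes it a candidate for pruning.
    static boolean neverTaken(int[] counters, boolean branch) {
        return counters[branch ? 1 : 0] == 0;
    }

    public static void main(String[] args) {
        int[] counters = new int[2]; // one array per GWT MethodHandle
        for (int i = 0; i < 1000; i++) {
            if (profileBranch(i >= 0, counters)) {
                // "target" path
            } else {
                // "fallback" path, never reached in this loop
            }
        }
        // prints "false=0 true=1000"
        System.out.println("false=" + counters[0] + " true=" + counters[1]);
    }
}
```

Because the array belongs to one MethodHandle rather than to the shared LambdaForm bytecode, its counts are not mixed with those of other GWT instances.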
>>>>>
>>>>> If a GWT MethodHandle isn't a compile-time constant, profiling should
>>>>> proceed. This happens when the corresponding LambdaForm is already
>>>>> shared: for newly created GWT MethodHandles, profiling can occur only
>>>>> in native code (a dedicated nmethod for a single LambdaForm). So, when
>>>>> compilation of the whole MethodHandle chain is triggered, the profile
>>>>> should already be gathered.
>>>>>
>>>>> Overriding branch frequencies is not enough. Statistics on
>>>>> deoptimization events are also polluted. Even if a branch is never
>>>>> taken, the JIT issues an uncommon trap there only if the corresponding
>>>>> bytecode doesn't trap too much and doesn't cause too many recompiles.
>>>>>
>>>>> I added @IgnoreProfile and placed it only on GWT LambdaForms. When the
>>>>> JIT sees it on some method, Compile::too_many_traps and
>>>>> Compile::too_many_recompiles always return false for that method. This
>>>>> allows the JIT to prune the branch based on the custom profile and
>>>>> recompile the method if the branch is visited.
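
The effect of the annotation on the pruning decision can be modelled in plain Java. This is only a toy model under stated assumptions: the real checks live in HotSpot's C++ code (Compile::too_many_traps and Compile::too_many_recompiles), and the class TrapPolicySketch, the threshold TRAP_LIMIT, the helper method(), and mayPruneBranch() are hypothetical names introduced here for illustration.

```java
// Toy model of how an @IgnoreProfile-style marker could bypass polluted
// deoptimization statistics. All names and the threshold are hypothetical.
final class TrapPolicySketch {

    @java.lang.annotation.Retention(java.lang.annotation.RetentionPolicy.RUNTIME)
    @java.lang.annotation.Target(java.lang.annotation.ElementType.METHOD)
    @interface IgnoreProfile {}

    static final int TRAP_LIMIT = 3; // assumed limit, not HotSpot's real value

    @IgnoreProfile
    static void sharedGuardWithTest() {} // stands in for a shared GWT LambdaForm

    static void ordinaryMethod() {}      // stands in for any other method

    // Looks up one of the stand-in methods above by name.
    static java.lang.reflect.Method method(String name) {
        try {
            return TrapPolicySketch.class.getDeclaredMethod(name);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Marked methods never report "too many traps", whatever the shared count says.
    static boolean tooManyTraps(java.lang.reflect.Method m, int recordedTraps) {
        if (m.isAnnotationPresent(IgnoreProfile.class)) {
            return false;
        }
        return recordedTraps >= TRAP_LIMIT;
    }

    // A branch is pruned (replaced by an uncommon trap) only if the custom
    // profile says it was never taken and trapping is still allowed.
    static boolean mayPruneBranch(java.lang.reflect.Method m, int branchCount, int recordedTraps) {
        return branchCount == 0 && !tooManyTraps(m, recordedTraps);
    }

    public static void main(String[] args) {
        int pollutedTrapCount = 1000; // inflated by other users of the shared code
        System.out.println(mayPruneBranch(method("sharedGuardWithTest"), 0, pollutedTrapCount));
        System.out.println(mayPruneBranch(method("ordinaryMethod"), 0, pollutedTrapCount));
    }
}
```

In this model the marked method can still have its never-taken branch pruned despite the polluted trap count, while the unmarked method cannot.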
>>>>>
>>>>> For now, I wanted to keep the fix very focused. The next thing I
>>>>> plan to
>>>>> do is to experiment with ignoring deoptimization counts for other
>>>>> LambdaForms which are heavily shared. I already saw problems caused by
>>>>> deoptimization counts pollution (see JDK-8068915 [2]).
>>>>>
>>>>> I plan to backport the fix into 8u40, once I finish extensive
>>>>> performance testing.
>>>>>
>>>>> Testing: JPRT, java/lang/invoke tests, nashorn (nashorn testsuite,
>>>>> Octane).
>>>>>
>>>>> Thanks!
>>>>>
>>>>> PS: as a summary, my experiments show that the fixes for 8063137 &
>>>>> 8068915 [2] almost completely recover peak performance after LambdaForm
>>>>> sharing [3]. There's one more problem left (non-inlined MethodHandle
>>>>> invocations are more expensive when LFs are shared), but it's a story
>>>>> for another day.
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8059877
>>>>>      8059877: GWT branch frequencies pollution due to LF sharing
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8068915
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8046703
>>>>>      JEP 210: LambdaForm Reduction and Caching
>>>>> _______________________________________________
>>>>> mlvm-dev mailing list
>>>>> mlvm-dev at openjdk.java.net
>>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev


More information about the hotspot-compiler-dev mailing list