RFR(XS): 8030976: untaken paths should be more vigorously pruned at highest optimization level
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Jun 4 16:42:08 UTC 2014
Rickard,
Thanks for the explanation.
Looks good then.
I agree with VladimirK that it'd be good to extend recording and track
all deopt reasons. But that is for future enhancement.
Best regards,
Vladimir Ivanov
On 6/4/14 9:21 AM, Rickard Bäckman wrote:
> 1) Like Igor said, one was missing. I'll move Reason_unstable_if above
> Reason_tenured and 21 should be enough.
>
> 2) We only record when an uncommon trap is taken for the first 7
> reasons. So I map unstable_if to intrinsic because intrinsic should
> probably never happen on the bci of an if.
>
> /R
>
> On 06/03, Vladimir Ivanov wrote:
>> 2 questions:
>> - why do you set _trap_hist_limit = 22 and not 21?
>>
>> - why do you map Reason_unstable_if to Reason_intrinsic in
>> reason_recorded_per_bytecode_if_any?
>> + else if (reason == Reason_unstable_if)
>> + return Reason_intrinsic;
>>
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 6/3/14 5:07 PM, Rickard Bäckman wrote:
>>> Updated: http://cr.openjdk.java.net/~rbackman/8030976.2/
>>>
>>> Thanks
>>> /R
>>>
>>> On 06/03, Roland Westrelin wrote:
>>>> Hi Rickard,
>>>>
>>>>> Updated webrev: http://cr.openjdk.java.net/~rbackman/8030976.1/
>>>>
>>>> You need to pass Deoptimization::Reason_unstable_if to uncommon_trap() in Parse::adjust_map_after_if(). Also the code below in Parse::adjust_map_after_if() is no longer needed AFAICT.
>>>>
>>>> Roland.
>>>>
>>>> 1185 // If this might possibly turn into an implicit null check,
>>>> 1186 // and the null has never yet been seen, we need to generate
>>>> 1187 // an uncommon trap, so as to recompile instead of suffering
>>>> 1188 // with very slow branches. (We'll get the slow branches if
>>>> 1189 // the program ever changes phase and starts seeing nulls here.)
>>>> 1190 //
>>>> 1191 // We do not inspect for a null constant, since a node may
>>>> 1192 // optimize to 'null' later on.
>>>> 1193 //
>>>> 1194 // Null checks, and other tests which expect inequality,
>>>> 1195 // show btest == BoolTest::eq along the non-taken branch.
>>>> 1196 // On the other hand, type tests, must-be-null tests,
>>>> 1197 // and other tests which expect pointer equality,
>>>> 1198 // show btest == BoolTest::ne along the non-taken branch.
>>>> 1199 // We prune both types of branches if they look unused.
>>>> 1200 repush_if_args();
>>>> 1201 // We need to mark this branch as taken so that if we recompile we will
>>>> 1202 // see that it is possible. In the tiered system the interpreter doesn't
>>>> 1203 // do profiling and by the time we get to the lower tier from the interpreter
>>>> 1204 // the path may be cold again. Make sure it doesn't look untaken
>>>> 1205 if (is_fallthrough) {
>>>> 1206 profile_not_taken_branch(!ProfileInterpreter);
>>>> 1207 } else {
>>>> 1208 profile_taken_branch(iter().get_dest(), !ProfileInterpreter);
>>>> 1209 }
>>>>
>>>>>
>>>>> Thanks
>>>>> /R
>>>>>
>>>>> On 05/22, Rickard Bäckman wrote:
>>>>>> Roland pointed out a problem with the Reason used. New webrev coming
>>>>>> shortly.
>>>>>>
>>>>>> Thanks Roland.
>>>>>>
>>>>>> On 05/22, Rickard Bäckman wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> can I please have reviews for this change.
>>>>>>> The patch makes C2 place uncommon traps on previously untaken branches
>>>>>>> much more aggressively (we are simply trusting the profiling more).
>>>>>>> This improves performance for a couple of different patterns.
>>>>>>>
>>>>>>> Example:
>>>>>>>
>>>>>>> class Test {
>>>>>>> public int[] array = new int[] = { 1, 1, 2, 3, 5, 8, 13 };
>>>>>>>
>>>>>>> public void some_method() {
>>>>>>> for (int i = 0; i < array.length; i++) {
>>>>>>> if (array[i] < 255) {
>>>>>>> some_call();
>>>>>>> } else {
>>>>>>> some_other_call();
>>>>>>> }
>>>>>>> }
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> Where we previously if the else branch had never been taken rarely would
>>>>>>> inline the some_other_call and when array escapes we can't make
>>>>>>> assumptions on non-changing lengths, call killing registers, etc.
>>>>>>>
>>>>>>> On some of the Nashorn benchmark this patch increases score by 35%,
>>>>>>> others don't see any change at all. No difference on SpecJBB 2005.
>>>>>>> More performance numbers / microbenchmark in the comments of the bug.
>>>>>>>
>>>>>>> Webrev: http://cr.openjdk.java.net/~rbackman/8030976/
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8030976
>>>>>>>
>>>>>>> Thanks
>>>>>>> /R
>>>>>>
>>>>>>
>>>>>> /R
>>>>
More information about the hotspot-compiler-dev
mailing list