[15] RFR(M): 8240227: Loop predicates should be copied to unswitched loops

Christian Hagedorn christian.hagedorn at oracle.com
Wed Mar 18 10:41:01 UTC 2020


Thank you Nils and Tobias for reviewing it again!

Best regards,
Christian

On 18.03.20 10:48, Nils Eliasson wrote:
> For the record - I'm okay with the cleaned up version too.
> 
> Best regards,
> 
> Nils
> 
> On 2020-03-13 13:10, Christian Hagedorn wrote:
>> Hi Tobias
>>
>> Thank you for your review and your comments!
>>
>> On 13.03.20 09:24, Tobias Hartmann wrote:
>>> - loopPredicate.cpp:157 Wrong indentation
>>> - loopPredicate.cpp:363 Should be "sanity" but in general, I prefer 
>>> are more meaningful message
>>> - Maybe replace all these predicate->as_Proj() / 
>>> predicate->as_IfProj() by declaring a new local
>>> "proj" or something
>>> - loopPredicate.cpp:389 Can be replaced by 
>>> igvn->replace_input_of(iff, 1 igvn->intcon(proj->_con));
>>> - loopUnswitch.cpp:160 Wrong indentation
>>> - PartialPeelingUnswitch.java:1811 Should be "} else if {" (also in 
>>> the following lines)
>>> - PartialPeelingUnswitch.java:1838 Missing whitespace
>>
>> Fixed.
>>
>>> - loopPredicate.cpp:415/422 Why are two versions of 
>>> PhaseIdealLoop::clone_loop_predicates required?
>>
>> There was originally a method PhaseIterGVN::clone_loop_predicates 
>> which did not have any uses. I removed it in the current webrev. But 
>> in addition, we can also remove one of those remaining 
>> clone_loop_predicates methods which is not required anymore and make 
>> all involved methods non-static. I removed all related dead code.
>>
>>> - loopPredicate.cpp:421 Isn't that code only used for unswitching? 
>>> I.e., the comment needs to be
>>> updated and maybe the assert from line 308 should be moved upwards.
>>
>> Yes, it is only used for loop unswitching. Fixed.
>>
>>> Since we want to backport both 8240227 and 8237859, it may be good to 
>>> run some more extra days of
>>> Lucene testing on a server machine while this is backing in JDK 15.
>>
>> Definitely a good idea. I'll run some testing of Lucene and ZGC on a 
>> machine over the weekend.
>>
>> I created new updated webrevs with the cleaned up code:
>> Full: http://cr.openjdk.java.net/~chagedorn/8240227/webrev.01/
>> Inc: http://cr.openjdk.java.net/~chagedorn/8240227/webrev.00-01/
>>
>> I'll run again some testing over the weekend, just to be sure (and 
>> also together with the fix for JDK-8237859).
>>
>> Best regards,
>> Christian
>>
>>> On 11.03.20 13:20, Christian Hagedorn wrote:
>>>> Hi
>>>>
>>>> Please review the following patch:
>>>> https://bugs.openjdk.java.net/browse/JDK-8240227
>>>> http://cr.openjdk.java.net/~chagedorn/8240227/webrev.00/
>>>>
>>>> This is the same problem as for JDK-8203915 [1] but now with 
>>>> additional loop unswitching which was
>>>> not handled in the fix for JDK-8203915.
>>>>
>>>> Background:
>>>> The original idea of inserting skeleton predicates [2] for range 
>>>> check predicates involving the loop
>>>> induction variable was to copy those concrete predicates down to the 
>>>> main loop when creating
>>>> pre/main/post loops. When unrolling the loop later, the limit check 
>>>> predicate for the induction
>>>> variable is updated to become stronger and cover all accesses of the 
>>>> unrolled loop body. If a
>>>> predicate for the main loop later turns out to always fail due to 
>>>> type information, then the main
>>>> loop can be completely removed.
>>>>
>>>> If we did not do that (which exactly happens in the test case since 
>>>> we do not copy the range check
>>>> predicates to the unswitched loops which are therefore not found 
>>>> when creating pre/main/post loops
>>>> for the slow and fast loop) then the C2 matcher can hit the bad AD 
>>>> file assert when over-unrolling
>>>> the main loop. In the test case some data paths die because CastII 
>>>> nodes for an array load are
>>>> replaced by TOP. This happens after Opaque1 nodes are removed and 
>>>> the type information about the
>>>> maximum number of iterations for the pre-loop propagate to the 
>>>> induction variable of the main loop.
>>>> IGVN concludes that the induction variable is out-of-bounds for some 
>>>> CastII nodes which are then
>>>> replaced by TOP. As a result, the graph contains vectorized results 
>>>> with a TOP memory input which
>>>> cannot be handled by the matcher.
>>>>
>>>> [3] shows the IR before doing the first IGVN iteration where 
>>>> major_progress() is false (Opaque1
>>>> nodes can be removed). The 719 Opaque1 is removed and the type 
>>>> information of 1063 MinI [int:0..13]
>>>> propagates to the induction variable of the pre-loop 701 Phi 
>>>> [int:5..13] (loop starts with j = 5, at
>>>> most 8 iterations of the pre-loop) which then propagates over 716 
>>>> CastII [int:6..14] -> 1665 Phi
>>>> [int:6..max-31] (32 iterations after unrolling) -> 1537 AddI 
>>>> [int:30..max-7] (adds 24) which is then
>>>> compared with 1027 CastII [int:5..7] (iArr[i]). [int:5..7] does not 
>>>> contain [int:30..max-7] and
>>>> therefore 1027 CastII is replaced with TOP.
>>>>
>>>> The main loop is dead but is not removed since there is no range 
>>>> check predicate for the main loop
>>>> that would notice that some loads are always out-of-bounds due to 
>>>> the type information. The fix for
>>>> JDK-8203915 avoids that (for loops without unswitching) by adding 
>>>> range check predicates for the
>>>> main loop which are then updated while unrolling and later allow the 
>>>> main loop to be removed.
>>>>
>>>> The only thing missing is that the created range check predicates 
>>>> for a loop to be unswitched are
>>>> not cloned to both unswitched loop versions. Therefore, the changes 
>>>> for [1] cannot be applied since
>>>> it does not find any predicates when creating pre/main/post loops 
>>>> for the slow and fast loop. This
>>>> fix addresses that and clones all range check predicates that have 
>>>> at least one control edge to a
>>>> data node which is part of the loop to be unswitched. All control 
>>>> edges to data nodes which are part
>>>> of either the slow or fast loop are updated to the new cloned 
>>>> predicates accordingly. All old range
>>>> predicates of the original loop to be unswitched that do not have 
>>>> any control edges to data nodes
>>>> can be removed.
>>>>
>>>> This patch is also a prerequisite for the fix of JDK-8237859 [4]. I 
>>>> included some renaming for
>>>> predicate related code that should make the difference between the 
>>>> various predicate types and how
>>>> they are used clearer (skeleton, concrete, empty, updated...). I 
>>>> also included many more unswitch
>>>> tests (PartialPeelingUnswitch.java) that I originally wrote as part 
>>>> of an RFE in progress [5] but
>>>> also helped with testing this fix.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8203915
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8193130
>>>> [3] 
>>>> https://bugs.openjdk.java.net/secure/attachment/87257/IR_before_IGVN_major_progress_false.png 
>>>>
>>>> [4] https://bugs.openjdk.java.net/browse/JDK-8237859
>>>> [5] https://bugs.openjdk.java.net/browse/JDK-8236722


More information about the hotspot-compiler-dev mailing list