[15] RFR(M): 8240227: Loop predicates should be copied to unswitched loops
Christian Hagedorn
christian.hagedorn at oracle.com
Wed Mar 18 10:41:01 UTC 2020
Thank you Nils and Tobias for reviewing it again!
Best regards,
Christian
On 18.03.20 10:48, Nils Eliasson wrote:
> For the record - I'm okay with the cleaned up version too.
>
> Best regards,
>
> Nils
>
> On 2020-03-13 13:10, Christian Hagedorn wrote:
>> Hi Tobias
>>
>> Thank you for your review and your comments!
>>
>> On 13.03.20 09:24, Tobias Hartmann wrote:
>>> - loopPredicate.cpp:157 Wrong indentation
>>> - loopPredicate.cpp:363 Should be "sanity" but in general, I prefer
>>> are more meaningful message
>>> - Maybe replace all these predicate->as_Proj() /
>>> predicate->as_IfProj() by declaring a new local
>>> "proj" or something
>>> - loopPredicate.cpp:389 Can be replaced by
>>> igvn->replace_input_of(iff, 1 igvn->intcon(proj->_con));
>>> - loopUnswitch.cpp:160 Wrong indentation
>>> - PartialPeelingUnswitch.java:1811 Should be "} else if {" (also in
>>> the following lines)
>>> - PartialPeelingUnswitch.java:1838 Missing whitespace
>>
>> Fixed.
>>
>>> - loopPredicate.cpp:415/422 Why are two versions of
>>> PhaseIdealLoop::clone_loop_predicates required?
>>
>> There was originally a method PhaseIterGVN::clone_loop_predicates
>> which did not have any uses. I removed it in the current webrev. But
>> in addition, we can also remove one of those remaining
>> clone_loop_predicates methods which is not required anymore and make
>> all involved methods non-static. I removed all related dead code.
>>
>>> - loopPredicate.cpp:421 Isn't that code only used for unswitching?
>>> I.e., the comment needs to be
>>> updated and maybe the assert from line 308 should be moved upwards.
>>
>> Yes, it is only used for loop unswitching. Fixed.
>>
>>> Since we want to backport both 8240227 and 8237859, it may be good to
>>> run some more extra days of
>>> Lucene testing on a server machine while this is backing in JDK 15.
>>
>> Definitely a good idea. I'll run some testing of Lucene and ZGC on a
>> machine over the weekend.
>>
>> I created new updated webrevs with the cleaned up code:
>> Full: http://cr.openjdk.java.net/~chagedorn/8240227/webrev.01/
>> Inc: http://cr.openjdk.java.net/~chagedorn/8240227/webrev.00-01/
>>
>> I'll run again some testing over the weekend, just to be sure (and
>> also together with the fix for JDK-8237859).
>>
>> Best regards,
>> Christian
>>
>>> On 11.03.20 13:20, Christian Hagedorn wrote:
>>>> Hi
>>>>
>>>> Please review the following patch:
>>>> https://bugs.openjdk.java.net/browse/JDK-8240227
>>>> http://cr.openjdk.java.net/~chagedorn/8240227/webrev.00/
>>>>
>>>> This is the same problem as for JDK-8203915 [1] but now with
>>>> additional loop unswitching which was
>>>> not handled in the fix for JDK-8203915.
>>>>
>>>> Background:
>>>> The original idea of inserting skeleton predicates [2] for range
>>>> check predicates involving the loop
>>>> induction variable was to copy those concrete predicates down to the
>>>> main loop when creating
>>>> pre/main/post loops. When unrolling the loop later, the limit check
>>>> predicate for the induction
>>>> variable is updated to become stronger and cover all accesses of the
>>>> unrolled loop body. If a
>>>> predicate for the main loop later turns out to always fail due to
>>>> type information, then the main
>>>> loop can be completely removed.
>>>>
>>>> If we did not do that (which exactly happens in the test case since
>>>> we do not copy the range check
>>>> predicates to the unswitched loops which are therefore not found
>>>> when creating pre/main/post loops
>>>> for the slow and fast loop) then the C2 matcher can hit the bad AD
>>>> file assert when over-unrolling
>>>> the main loop. In the test case some data paths die because CastII
>>>> nodes for an array load are
>>>> replaced by TOP. This happens after Opaque1 nodes are removed and
>>>> the type information about the
>>>> maximum number of iterations for the pre-loop propagate to the
>>>> induction variable of the main loop.
>>>> IGVN concludes that the induction variable is out-of-bounds for some
>>>> CastII nodes which are then
>>>> replaced by TOP. As a result, the graph contains vectorized results
>>>> with a TOP memory input which
>>>> cannot be handled by the matcher.
>>>>
>>>> [3] shows the IR before doing the first IGVN iteration where
>>>> major_progress() is false (Opaque1
>>>> nodes can be removed). The 719 Opaque1 is removed and the type
>>>> information of 1063 MinI [int:0..13]
>>>> propagates to the induction variable of the pre-loop 701 Phi
>>>> [int:5..13] (loop starts with j = 5, at
>>>> most 8 iterations of the pre-loop) which then propagates over 716
>>>> CastII [int:6..14] -> 1665 Phi
>>>> [int:6..max-31] (32 iterations after unrolling) -> 1537 AddI
>>>> [int:30..max-7] (adds 24) which is then
>>>> compared with 1027 CastII [int:5..7] (iArr[i]). [int:5..7] does not
>>>> contain [int:30..max-7] and
>>>> therefore 1027 CastII is replaced with TOP.
>>>>
>>>> The main loop is dead but is not removed since there is no range
>>>> check predicate for the main loop
>>>> that would notice that some loads are always out-of-bounds due to
>>>> the type information. The fix for
>>>> JDK-8203915 avoids that (for loops without unswitching) by adding
>>>> range check predicates for the
>>>> main loop which are then updated while unrolling and later allow the
>>>> main loop to be removed.
>>>>
>>>> The only thing missing is that the created range check predicates
>>>> for a loop to be unswitched are
>>>> not cloned to both unswitched loop versions. Therefore, the changes
>>>> for [1] cannot be applied since
>>>> it does not find any predicates when creating pre/main/post loops
>>>> for the slow and fast loop. This
>>>> fix addresses that and clones all range check predicates that have
>>>> at least one control edge to a
>>>> data node which is part of the loop to be unswitched. All control
>>>> edges to data nodes which are part
>>>> of either the slow or fast loop are updated to the new cloned
>>>> predicates accordingly. All old range
>>>> predicates of the original loop to be unswitched that do not have
>>>> any control edges to data nodes
>>>> can be removed.
>>>>
>>>> This patch is also a prerequisite for the fix of JDK-8237859 [4]. I
>>>> included some renaming for
>>>> predicate related code that should make the difference between the
>>>> various predicate types and how
>>>> they are used clearer (skeleton, concrete, empty, updated...). I
>>>> also included many more unswitch
>>>> tests (PartialPeelingUnswitch.java) that I originally wrote as part
>>>> of an RFE in progress [5] but
>>>> also helped with testing this fix.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8203915
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8193130
>>>> [3]
>>>> https://bugs.openjdk.java.net/secure/attachment/87257/IR_before_IGVN_major_progress_false.png
>>>>
>>>> [4] https://bugs.openjdk.java.net/browse/JDK-8237859
>>>> [5] https://bugs.openjdk.java.net/browse/JDK-8236722
More information about the hotspot-compiler-dev
mailing list