[16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by zero
Christian Hagedorn
christian.hagedorn at oracle.com
Tue Jul 14 12:39:32 UTC 2020
> [..] Since the DivINode has a control outside of the main loop [..]
Edit: I actually meant that get_ctrl() returns a node outside of the
main-loop (i.e. the DivINode is not part of the main-loop body). The
DivINode still has NULL as control input.
Best regards,
Christian
On 14.07.20 14:32, Christian Hagedorn wrote:
> Hi Vladimir
>
> I had a closer look at the failing testcase with webrev.00. The original
> DivNode has its zero check removed based on correct type information.
> Afterwards its split through an induction variable phi for which both
> inputs have non-zero types. So, the DivNode end up after an AddINode
> (which adds a positive constant) which has a non-zero type. All good so
> far.
>
> Now we add pre/main/post loops and the induction variable phi for the
> pre-loop gets type int>=1 since the limit for the pre-loop is hidden
> behind an Opaque1 node which just returns int as type. The AddINode
> belonging to the loop induction variable phi in the pre-loop is
> therefore updated to have the type int as well (int>=1 + positive_int
> could overflow). This type information propagates to the main-loop and
> its AddINode belonging to the loop induction variable phi (which is an
> input to the DivNode) also gets its type set to int.
>
> Later, we add a vector post loop where we clone the main loop and add a
> phi p for the the AddINode node and its new clone. Since the DivINode
> has a control outside of the main loop, it is not cloned and gets the
> phi p as an input. At a later point in time, we want to split through p.
> But then we detect zero as possible value due to the type range of both
> AddINodes being int.
>
> Even though the type information is not accurate enough, the DivINode is
> never zero and we could safely apply the split through the phi. We could
> think about doing a bail out for all kinds of phis but I think it should
> only be an actual problem for loop induction variable phis.
>
> Thinking about this type propagation problem, couldn't we somehow set
> the type of the Opaque1 node hiding the pre-loop limit to the same type
> as the pre-loop limit to allow this information to flow to the pre and
> main loop? Or would that cause other problems? I guess there probably
> must be a reason why we don't do it like that.
>
> Best regards,
> Christian
>
> On 13.07.20 19:16, Vladimir Kozlov wrote:
>> This rise question: why zero check was removed if one of merged types
>> has 0?
>> Should we be more careful when we remove zero check?
>>
>> Thanks,
>> Vladimir
>>
>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>> A test in some later tier testing revealed that the assertion code is
>>> actually too strong. There can be a Div/Mod node whose zero check was
>>> removed but that is then spilt through a non-induction-variable phi
>>> whose inputs have zero in their type range (which is fine, this
>>> happens in some loop opts after partial peeling was applied earlier).
>>> This happened, for example, for a phi which merged two nodes from the
>>> original and a cloned loop. I think we just need to remove the
>>> additional assertion code.
>>>
>>> New webrev:
>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>> Thank you Vladimir for your review!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>> Looks good.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>> Hi
>>>>>>
>>>>>> Please review the following patch:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>
>>>>>> In the failing testcase, C2 removes a zero check for a
>>>>>> division/modulo node n based on the type information of the loop
>>>>>> induction variable phi p (always between 1 and 50 and never 0).
>>>>>> However, n is later split through p and ends up after the AddNode
>>>>>> which updates the induction variable p. In the last iteration j
>>>>>> equals 2 and is then updated to 0. The division/modulo node n is
>>>>>> now executed before the loop limit check which results in a SIGFPE.
>>>>>>
>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division
>>>>>> or modulo node has its zero check removed (i.e. control in NULL)
>>>>>> and is split through a phi which has an input that could be zero.
>>>>>> This should only happen for an induction variable phi of a
>>>>>> trip-counted (integer) loop.
>>>>>>
>>>>>> Best regards,
>>>>>> Christian
More information about the hotspot-compiler-dev
mailing list