[16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by zero
Christian Hagedorn
christian.hagedorn at oracle.com
Tue Jul 14 12:32:19 UTC 2020
Hi Vladimir
I had a closer look at the failing testcase with webrev.00. The original
DivNode has its zero check removed based on correct type information.
Afterwards its split through an induction variable phi for which both
inputs have non-zero types. So, the DivNode end up after an AddINode
(which adds a positive constant) which has a non-zero type. All good so far.
Now we add pre/main/post loops and the induction variable phi for the
pre-loop gets type int>=1 since the limit for the pre-loop is hidden
behind an Opaque1 node which just returns int as type. The AddINode
belonging to the loop induction variable phi in the pre-loop is
therefore updated to have the type int as well (int>=1 + positive_int
could overflow). This type information propagates to the main-loop and
its AddINode belonging to the loop induction variable phi (which is an
input to the DivNode) also gets its type set to int.
Later, we add a vector post loop where we clone the main loop and add a
phi p for the the AddINode node and its new clone. Since the DivINode
has a control outside of the main loop, it is not cloned and gets the
phi p as an input. At a later point in time, we want to split through p.
But then we detect zero as possible value due to the type range of both
AddINodes being int.
Even though the type information is not accurate enough, the DivINode is
never zero and we could safely apply the split through the phi. We could
think about doing a bail out for all kinds of phis but I think it should
only be an actual problem for loop induction variable phis.
Thinking about this type propagation problem, couldn't we somehow set
the type of the Opaque1 node hiding the pre-loop limit to the same type
as the pre-loop limit to allow this information to flow to the pre and
main loop? Or would that cause other problems? I guess there probably
must be a reason why we don't do it like that.
Best regards,
Christian
On 13.07.20 19:16, Vladimir Kozlov wrote:
> This rise question: why zero check was removed if one of merged types
> has 0?
> Should we be more careful when we remove zero check?
>
> Thanks,
> Vladimir
>
> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>> A test in some later tier testing revealed that the assertion code is
>> actually too strong. There can be a Div/Mod node whose zero check was
>> removed but that is then spilt through a non-induction-variable phi
>> whose inputs have zero in their type range (which is fine, this
>> happens in some loop opts after partial peeling was applied earlier).
>> This happened, for example, for a phi which merged two nodes from the
>> original and a cloned loop. I think we just need to remove the
>> additional assertion code.
>>
>> New webrev:
>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>
>> Best regards,
>> Christian
>>
>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>> Thank you Vladimir for your review!
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>> Looks good.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>> Hi
>>>>>
>>>>> Please review the following patch:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>
>>>>> In the failing testcase, C2 removes a zero check for a
>>>>> division/modulo node n based on the type information of the loop
>>>>> induction variable phi p (always between 1 and 50 and never 0).
>>>>> However, n is later split through p and ends up after the AddNode
>>>>> which updates the induction variable p. In the last iteration j
>>>>> equals 2 and is then updated to 0. The division/modulo node n is
>>>>> now executed before the loop limit check which results in a SIGFPE.
>>>>>
>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division
>>>>> or modulo node has its zero check removed (i.e. control in NULL)
>>>>> and is split through a phi which has an input that could be zero.
>>>>> This should only happen for an induction variable phi of a
>>>>> trip-counted (integer) loop.
>>>>>
>>>>> Best regards,
>>>>> Christian
More information about the hotspot-compiler-dev
mailing list