[16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by zero

Christian Hagedorn christian.hagedorn at oracle.com
Tue Jul 14 12:39:32 UTC 2020


> [..] Since the DivINode has a control outside of the main loop [..]

Edit: I actually meant that get_ctrl() returns a node outside of the 
main-loop (i.e. the DivINode is not part of the main-loop body). The 
DivINode still has NULL as control input.

Best regards,
Christian

On 14.07.20 14:32, Christian Hagedorn wrote:
> Hi Vladimir
> 
> I had a closer look at the failing testcase with webrev.00. The original 
> DivNode has its zero check removed based on correct type information. 
> Afterwards its split through an induction variable phi for which both 
> inputs have non-zero types. So, the DivNode end up after an AddINode 
> (which adds a positive constant) which has a non-zero type. All good so 
> far.
> 
> Now we add pre/main/post loops and the induction variable phi for the 
> pre-loop gets type int>=1 since the limit for the pre-loop is hidden 
> behind an Opaque1 node which just returns int as type. The AddINode 
> belonging to the loop induction variable phi in the pre-loop is 
> therefore updated to have the type int as well (int>=1 + positive_int 
> could overflow). This type information propagates to the main-loop and 
> its AddINode belonging to the loop induction variable phi (which is an 
> input to the DivNode) also gets its type set to int.
> 
> Later, we add a vector post loop where we clone the main loop and add a 
> phi p for the the AddINode node and its new clone. Since the DivINode 
> has a control outside of the main loop, it is not cloned and gets the 
> phi p as an input. At a later point in time, we want to split through p. 
> But then we detect zero as possible value due to the type range of both 
> AddINodes being int.
> 
> Even though the type information is not accurate enough, the DivINode is 
> never zero and we could safely apply the split through the phi. We could 
> think about doing a bail out for all kinds of phis but I think it should 
> only be an actual problem for loop induction variable phis.
> 
> Thinking about this type propagation problem, couldn't we somehow set 
> the type of the Opaque1 node hiding the pre-loop limit to the same type 
> as the pre-loop limit to allow this information to flow to the pre and 
> main loop? Or would that cause other problems? I guess there probably 
> must be a reason why we don't do it like that.
> 
> Best regards,
> Christian
> 
> On 13.07.20 19:16, Vladimir Kozlov wrote:
>> This rise question: why zero check was removed if one of merged types 
>> has 0?
>> Should we be more careful when we remove zero check?
>>
>> Thanks,
>> Vladimir
>>
>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>> A test in some later tier testing revealed that the assertion code is 
>>> actually too strong. There can be a Div/Mod node whose zero check was 
>>> removed but that is then spilt through a non-induction-variable phi 
>>> whose inputs have zero in their type range (which is fine, this 
>>> happens in some loop opts after partial peeling was applied earlier). 
>>> This happened, for example, for a phi which merged two nodes from the 
>>> original and a cloned loop. I think we just need to remove the 
>>> additional assertion code.
>>>
>>> New webrev:
>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>> Thank you Vladimir for your review!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>> Looks good.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>> Hi
>>>>>>
>>>>>> Please review the following patch:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>
>>>>>> In the failing testcase, C2 removes a zero check for a 
>>>>>> division/modulo node n based on the type information of the loop 
>>>>>> induction variable phi p (always between 1 and 50 and never 0). 
>>>>>> However, n is later split through p and ends up after the AddNode 
>>>>>> which updates the induction variable p. In the last iteration j 
>>>>>> equals 2 and is then updated to 0. The division/modulo node n is 
>>>>>> now executed before the loop limit check which results in a SIGFPE.
>>>>>>
>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division 
>>>>>> or modulo node has its zero check removed (i.e. control in NULL) 
>>>>>> and is split through a phi which has an input that could be zero. 
>>>>>> This should only happen for an induction variable phi of a 
>>>>>> trip-counted (integer) loop.
>>>>>>
>>>>>> Best regards,
>>>>>> Christian


More information about the hotspot-compiler-dev mailing list