[jdk16] RFR: 8257822: C2 crashes with SIGFPE due to a division that floats above its zero check

Christian Hagedorn chagedorn at openjdk.java.net
Fri Dec 11 10:12:09 UTC 2020


This bug manifests itself very similarly to [JDK-8229496](https://bugs.openjdk.java.net/browse/JDK-8229496). The only difference is that this bug is triggered by the split-if optimization instead of loop unswitching.

![Zero_check_after_split_if](https://user-images.githubusercontent.com/17833009/101883951-44300e80-3b98-11eb-9556-8d6a4ebac770.png)

The division `331 DivI` is initially added with its zero check. The zero check is then split through a region which creates the two zero checks `507 If` and `508 If`. The division `331 DivI` itself had a phi as input from the region through which the zero check was split. It therefore gets a new `520 Phi` as input to merge the previous inputs but is not split itself (state in IR above). 

IGVN is then able to remove the zero check `508 If` because it is dominated by the same check `If 125`. Only later, IGVN removes `512 Region`and rewires the control input of `331 DivI` directly to the range check projection `475 IfFalse`.

A loop predicate is then later added to the outer loop in `PhaseIdealLoop::loop_predication_impl_helper()` for `474 RangeCheck`. Since `331 DivI` is now control dependent on its projection `475 IfFalse`, it is also moved out of the loop by `PhaseIdealLoop::dominated_by()` because `depends_only_on_test()` is true for `Div/Mod` nodes:
https://github.com/openjdk/jdk16/blob/58dca9253d3ec7bc5745d5b814b33e1b4b8b08e8/src/hotspot/share/opto/loopopts.cpp#L283-L287

As a result, the division floats above its zero check and is indeed scheduled before it, resulting in a SIGFPE crash.


One solution would be to extend split-if to also split and clone a `Div/Mod` node to ensure that it still directly has a zero check as control input such that `IfNode::dominated_by()` can correctly update the control input again later when replacing the zero check by a dominating `If`. However, I reckon that this is not the only place where a zero check could be separated from its `Div/Mod` node. Therefore, I suggest to do something similar as in JDK-8229496 but allow a `Div/Mod` node to float out of the loop if its type does not include zero anymore (and therefore does not have a zero check anymore). This is less restrictive as in JDK-8229496 where we skipped all `Div/Mod` nodes by adding additional `CastLL` nodes with `carry_dependency` which unfortunately resulted in a preformance regession ([JDK-8242108](https://bugs.openjdk.java.net/browse/JDK-8242108)) and a backout of the fix again (it was throught that another fix had solved the problem in the meantime).

I could not observe any regressions running standard performance testing and the reported JMH benchmark from the regression in JDK-8242108 (results posted in JBS).

Thanks,
Christian

-------------

Commit messages:
 - 

Changes: https://git.openjdk.java.net/jdk16/pull/9/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk16&pr=9&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8257822
  Stats: 96 lines in 3 files changed: 91 ins; 0 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk16/pull/9.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk16 pull/9/head:pull/9

PR: https://git.openjdk.java.net/jdk16/pull/9


More information about the hotspot-compiler-dev mailing list