RFR: 8331717: C2: Crash with SIGFPE Because Loop Predication Wrongly Hoists Division Requiring Zero Check [v2]

Theo Weidmann tweidmann at openjdk.org
Thu Jan 9 15:23:37 UTC 2025


On Thu, 26 Dec 2024 13:28:29 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Theo Weidmann has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Combine test files
>
> I have looked more deeply into the issue with `depends_only_on_test` in general and I have a really deep concern regarding the current state of how it is handled.
> 
> To start with, what is `depends_only_on_test`? Given a node `n` with the control input `c`, if `c` can be deduced from `c'` and `n->depends_only_on_test() == true`, then we can rewire the control input of `n` to `c'`. This means that `depends_only_on_test` does not mean that the node depends on a test, it means that the node depends on the test that is its control input.
> 
> For example:
> 
>     if (y != 0) {
>         if (x > 0) {
>             if (y != 0) {
>                 x / y;
>             }
>         }
>     }
> 
> Then `x/y` `depends_only_on_test` because its control input is the test `y != 0`. Then, we can rewire the control input of the division to the outer `y != 0`, resulting in:
> 
>     if (y != 0) {
>         x / y;
>         if (x > 0) {
>         }
>     }
> 
> On the other hand, consider this case:
> 
>     if (x > 0) {
>         if (y != 0) {
>             if (x > 0) {
>                 x / y;
>             }
>         }
>     }
> 
> Then `x/y` does not `depends_only_on_test` because its control input is the test `x > 0` which is unrelated, we can see that if we rewire the division to the outer `x > 0` test, the division floats above the actual test `y != 0`. This means that `depends_only_on_test` is a dynamic property of a node, and not a static property of the division operation. It can change when we transform the graph and it can be different for different nodes of the same kind.
> 
> So, what is the issue in [JDK-8257822](https://bugs.openjdk.org/browse/JDK-8257822)? When the zero check of the division is split through `Phi` but the division is not, it is wired to the merge point, not the tests themselves. This means that the division no longer `depends_only_on_test` and should return `false`. However, it still reports that it `depends_only_on_test`, which makes `PhaseIdealLoop::dominated_by` move it out of the loop when its control input is moved.
> 
> The fix was to make `PhaseIdealLoop::dominated_by` treat a division as if it does not `depends_only_on_test` when we cannot prove that the divisor is non-zero. This fixed the issue in that particular instance, as it achieved the same result as an actual correct fix of making `depends_only_on_test` return `false`. However, the node still reports itself as `depends_only_on_test`, and that opens more opportunities of miscompilation.
> 
> One important consideration regarding nodes that `depen...

@merykitty Thank you for your detailed write-up! @chhagedorn and I talked about it and we also agree with you that there seems to be some fundamental flaw here that needs to be addressed. We should definitely address it soon. It looks like a bigger endeavor though and we should probably file your write-up as RFE so it does not get buried here.

Do you think we should give up on this point fix then? Or do you think it's fine if we merge it and address the underlying cause separately? We still believe that there's no harm in applying this band-aid patch in the way I proposed, while, of course, we have to address the underlying issue here too. 

@rwestrel added pin_array_access_node. Maybe you also want to weigh in on this?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22666#issuecomment-2580546412


More information about the hotspot-compiler-dev mailing list