RFR: 8324517: C2: crash in compiled code because of dependency on removed range check CastIIs [v2]
Tobias Hartmann
thartmann at openjdk.org
Mon Apr 22 11:47:32 UTC 2024
On Thu, 28 Mar 2024 14:14:57 GMT, Roland Westrelin <roland at openjdk.org> wrote:
>> Range check `CastII` nodes are removed once loop opts are over. The
>> test case for this change includes 3 cases where elimination of a
>> range check `CastII` causes a crash in compiled code because either a
>> out of bounds array load or a division by zero happen.
>>
>> In `test1`:
>>
>> - the range checks for the `array[otherArray.length]` loads constant
>> fold: `otherArray.length` is a `CastII` of i at the `otherArray`
>> allocation. `i` is less than 9. The `CastII` at the allocation
>> narrows the type down further to `[0-9]`.
>>
>> - the `array[otherArray.length]` loads are control dependent on the
>> unrelated:
>>
>>
>> if (flag == 0) {
>>
>>
>> test. There's an identical dominating test which replaces that one. As
>> a consequence, the `array[otherArray.length]` loads become control
>> dependent on the dominating test.
>>
>> - The `CastII` nodes at the `otherArray` allocations are replaced by a
>> dominating range check `CastII` nodes for:
>>
>>
>> newArray[i] = 42;
>>
>>
>> - After loop opts, the range check `CastII` nodes are removed and the
>> 2 `array[otherArray.length]` loads common at the first:
>>
>>
>> if (flag == 0) {
>>
>>
>> test before the:
>>
>>
>> float[] otherArray = new float[i];
>>
>>
>> and
>>
>>
>> newArray[i] = 42;
>>
>>
>> that guarantee `i` is positive.
>>
>> - `test1` is called with `i = -1`, the array load proceeds with an out
>> of bounds index and the crash occurs.
>>
>>
>> `test2` and `test3` are mostly identical except for the check that's
>> eliminated (a null divisor check) and the instruction that causes a
>> fault (an integer division).
>>
>> The fix I propose is to not eliminate range check `CastII` nodes after
>> loop opts. When range check`CastII` nodes were introduced, performance
>> was observed to regress. Removing them after loop opts was found to
>> preserve both correctness and performance. Today, the performance
>> regression still exists when `CastII` nodes are left in. So I propose
>> we keep them until the end of optimizations (so the 2 array loads
>> above don't lose a dependency and wrongly common) but remove them at
>> the end of all optimizations.
>>
>> In the case of the array loads, they are dependent on a range check
>> for another array through a range check `CastII` and we must not lose
>> that dependency otherwise the array loads could float above the range
>> check at gcm time. I propose we deal with that problem the way it's
>> handled for `CastPP` nodes: add the dependency to the load (or
>> division)nodes ...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - review
> - Merge branch 'master' into JDK-8324517
> - test and fix
Great job coming up with these tests, Roland!
Did you check if the other usages of `_range_check_dependency` via `CastIINode::has_range_check` are still needed? Seems to me as if at least the checks in `PhaseIdealLoop::match_fill_loop` can be removed.
src/hotspot/share/opto/compile.cpp line 3471:
> 3469: remove_range_check_cast(n->as_CastII());
> 3470: }
> 3471: break;
Indentation is off.
src/hotspot/share/opto/compile.cpp line 3896:
> 3894: // Range check CastII nodes feed into an address computation subgraph. Remove them to let that subgraph float freely.
> 3895: // For memory access or integer divisions nodes that depend on the cast, record the dependency on the cast's control
> 3896: // as a precedence edge, so they can't float above the cast in case that cast's narrowed type helped eliminated a
Suggestion:
// as a precedence edge, so they can't float above the cast in case that cast's narrowed type helped eliminate a
src/hotspot/share/opto/compile.cpp line 3906:
> 3904: for (DUIterator_Fast imax, i = m->fast_outs(imax); i < imax; i++) {
> 3905: Node* use = m->fast_out(i);
> 3906: if (use->is_Mem() || use->Opcode() == Op_DivI || use->Opcode() == Op_DivL) {
`Op_ModI` and `Op_ModL` are missing here. And isn't this too strong in cases where we can prove that the operand is non-zero? Could you re-use `PhaseIterGVN::no_dependent_zero_check`? Please also add corresponding tests.
Looking at `PhaseIterGVN::no_dependent_zero_check`, I noticed that `UDiv[I/L]Node` and `UMod[I/L]Node` are not handled but I think they should. I think this was missed when these nodes where added by [JDK-8282221](https://bugs.openjdk.org/browse/JDK-8282221). One can probably extend @chhagedorn's test from [JDK-8259227](https://bugs.openjdk.org/browse/JDK-8259227) to trigger the same issue.
-------------
Changes requested by thartmann (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18377#pullrequestreview-2014468911
PR Review Comment: https://git.openjdk.org/jdk/pull/18377#discussion_r1574576215
PR Review Comment: https://git.openjdk.org/jdk/pull/18377#discussion_r1574577525
PR Review Comment: https://git.openjdk.org/jdk/pull/18377#discussion_r1574616943
More information about the hotspot-compiler-dev
mailing list