RFR: 8324517: C2: crash in compiled code because of dependency on removed range check CastIIs [v5]

David Holmes dholmes at openjdk.org
Thu May 16 09:10:16 UTC 2024


On Wed, 15 May 2024 12:07:22 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> Range check `CastII` nodes are removed once loop opts are over. The
>> test case for this change includes 3 cases where elimination of a
>> range check `CastII` causes a crash in compiled code because either a
>> out of bounds array load or a division by zero happen.
>> 
>> In `test1`:
>> 
>> - the range checks for the `array[otherArray.length]` loads constant
>>   fold: `otherArray.length` is a `CastII` of i at the `otherArray`
>>   allocation. `i` is less than 9. The `CastII` at the allocation
>>   narrows the type down further to `[0-9]`.
>>   
>> - the `array[otherArray.length]` loads are control dependent on the
>> unrelated:
>> 
>> 
>> if (flag == 0) {
>> 
>> 
>> test. There's an identical dominating test which replaces that one. As
>> a consequence, the `array[otherArray.length]` loads become control
>> dependent on the dominating test.
>> 
>> - The `CastII` nodes at the `otherArray` allocations are replaced by a
>>   dominating range check `CastII` nodes for:
>>   
>> 
>> newArray[i] = 42;
>> 
>> 
>> - After loop opts, the range check `CastII` nodes are removed and the
>>   2 `array[otherArray.length]` loads common at the first:
>>   
>> 
>> if (flag == 0) {
>> 
>> 
>> test before the:
>> 
>> 
>> float[] otherArray = new float[i];
>> 
>> 
>> and
>> 
>> 
>> newArray[i] = 42;
>> 
>> 
>> that guarantee `i` is positive.
>> 
>> - `test1` is called with `i = -1`, the array load proceeds with an out
>>   of bounds index and the crash occurs.
>>   
>>   
>> `test2` and `test3` are mostly identical except for the check that's
>> eliminated (a null divisor check) and the instruction that causes a
>> fault (an integer division).
>>   
>> The fix I propose is to not eliminate range check `CastII` nodes after
>> loop opts. When range check`CastII` nodes were introduced, performance
>> was observed to regress. Removing them after loop opts was found to
>> preserve both correctness and performance. Today, the performance
>> regression still exists when `CastII` nodes are left in. So I propose
>> we keep them until the end of optimizations (so the 2 array loads
>> above don't lose a dependency and wrongly common) but remove them at
>> the end of all optimizations.
>> 
>> In the case of the array loads, they are dependent on a range check
>> for another array through a range check `CastII` and we must not lose
>> that dependency otherwise the array loads could float above the range
>> check at gcm time. I propose we deal with that problem the way it's
>> handled for `CastPP` nodes: add the dependency to the load (or
>> division)nodes ...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision:
> 
>  - Node::is_div_or_mod()
>  - Merge branch 'master' into JDK-8324517
>  - test fix
>  - review
>  - Merge branch 'master' into JDK-8324517
>  - Merge branch 'master' into JDK-8324517
>  - review
>  - Merge branch 'master' into JDK-8324517
>  - test and fix

This is causing a crash in compiler/rangechecks/TestArrayAccessAboveRCAfterRCCastIIEliminated.java

Aarch64 only so far:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/opt/mach5/mesos/work_dir/slaves/a4a7850a-7c35-410a-b879-d77fbb2f6087-S6223/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c30fdcec-3ee7-44a9-bfc5-38869fd48e4b/runs/4e34865c-00ca-43a6-87eb-21db2db1c8ab/workspace/open/src/hotspot/share/opto/gcm.cpp:1423), pid=1811107, tid=1811123
#  assert(false) failed: graph should be schedulable

will get a bug filed

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18377#issuecomment-2114617357


More information about the hotspot-compiler-dev mailing list