RFR: 8342496: C2/Shenandoah: SEGV in compiled code when running jcstress

Roland Westrelin roland at openjdk.org
Thu Oct 17 13:17:46 UTC 2024


The reason for the crash is that compiled code reads from an object
that's null. All field loads from an object are guarded by a null
check. Where is the null check in that case? After the field load:


0x00007ffaac91261f:  44 8B 69 0C                      mov    r13d, dword ptr [rcx + 0xc] <- field load
0x00007ffaac912623:  85 C9                            test   ecx, ecx                    <- null check (oops!)
0x00007ffaac912625:  74 5C                            je     0x7ffaac912683


When the IR graph is constructed for the test case, the field load is
correctly made dependent on the null check (through a `CastPP` node)
but then something happens that's shenandoah specific and that causes
the field load to become dependent on another check so it can execute
before the null check.

There are several load barriers involved in the process. One of them
is expanded at the null check projection. In the process, control for
the nodes that are control dependent on the null check is updated to
be the region at the end of the just expanded barrier. The `CastPP`
node for the null check gets the `Region` as new control.

Another barrier is expanded right after that one. The 2 are back to
back. They are merged. The `Region` that the `CastPP` depends on goes
away, the `CastPP` is cloned in both branches at the `Region` and one
of them becomes control dependent on the heap stable test of the first
expanded barrier. At this point, one of the `CastPP` is control
dependent on a heap stable test that's after the null check. But then,
the heap stable test is moved out of loop and 2 copies of the loop are
made so one can run without any overhead from barriers. When that
happens, the `CastPP` becomes dependent on a test that dominates the
null check and so the field load that depends on the `CastPP` can be
scheduled before the null check.

The fix I propose is not update the control when the barrier is
expanded for nodes that can float when the test they depend on
moves. This way the `CastPP` remains dependent on the null check.

-------------

Commit messages:
 - more
 - fix & test

Changes: https://git.openjdk.org/jdk/pull/21562/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21562&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8342496
  Stats: 70 lines in 2 files changed: 70 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/21562.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21562/head:pull/21562

PR: https://git.openjdk.org/jdk/pull/21562


More information about the hotspot-compiler-dev mailing list