[lworld] RFR: 8343423: [lworld] compiler/valhalla/inlinetypes/TestArrays.java IR mismatches after after merging JDK-8334060 in jdk-24+18

Christian Hagedorn chagedorn at openjdk.org
Thu Nov 7 07:50:06 UTC 2024


There are several events that need to be taken into consideration to understand why `TestArray::test97-100()` are now failing after the merge of JDK-24+18:

### Background
#### Original Intent of `test95-101()`:
Orignally, the tests were added with [JDK-8251442](https://bugs.openjdk.org/browse/JDK-8251442) to check that sub type checks are properly folded with a new regex `CHECKCAST_ARRAY`. Back there, we did not have the possibility to match on compilation phases. We could only match on the output of `PrintIdeal` or `PrintOptoAssembly`. `PrintIdeal` is after the macro node expansion and thus `SubTypeCheck` nodes cannot be found anymore. As an alternative, a matching on `PrintOptoAssembly` was chosen by matching multiple lines.

Note that only `test101()` is a positive test for a sub type check (it requires that one is found) while `test95-100()` are negative tests that checks the absense of a sub type check.

#### Matching on Multiple Lines in Opto Assembly Is Fragile
As seen in the past, matching multiples lines on Opto Assembly is fragile when some unrelated instructions are inserted in between (for example spills). The regexes can become quite complicated and also slow to process.

#### `CHECKCAST_ARRAY` Stopped Working Correctly at some Point Due to Scheduling Bool and If Apart
At some point, code emission started to schedule the nodes for the bool/comparison earlier, then scheduled an unrelated If and only then the actual jump for the sub type check. Example:

![image](https://github.com/user-attachments/assets/bc9fb4b9-3493-48a1-8c0a-f3cb5a22808f)

The nodes for `74 compP_reg` are first emitted. Then we emit `101 testL_reg` together with `100 jmpCon_short` and only then `73 jmpConU`:

--- 74 compP_rReg nodes ---
0c4     B7: #	out( B9 B8 ) <- in( B32 B21 B19 B6 )  Freq: 0.87277
0c4     # checkcastPP of RBP
0c4     movl    R11, [RBP + #8 (8-bit)]	# compressed klass ptr
0c8     encode_heap_oop_not_null R8,RBX
10d     movl    [R13], R8	# compressed ptr
111     movq    R10, R13	# ptr -> long
114     movq    R8, RBX	# ptr -> long
117     xorq    R8, R10	# long
11a     shrq    R8, #22
11e     movq    RBX, precise [precise compiler/valhalla/inlinetypes/MyValue1: 0x00006ffd384d65a8 (compiler/valhalla/inlinetypes/MyInterface):Constant:exact * (java/lang/Cloneable,java/io/Serializable): :Constant:exact *	# ptr
128     decode_and_move_klass_not_null R9,R11
12b     movq    RBP, [R9 + #80 (8-bit)]	# class

--- 100 jmpCon_short nodes ---
12f     testq   R8, R8
132     je,s   B9  P=0.001000 C=-1.000000

--- True Path of 100 jmpCon_short  ---
134     B8: #	out( B22 B9 ) <- in( B7 )  Freq: 0.871897
134     shrq    R10, #9
138     movq    RDI, 0x00006ffd51000000	# ptr
142     addq    RDI, R10	# ptr
145     cmpb    [RDI], #2
148     jne     B22  P=0.001000 C=-1.000000

--- False Path of 100 jmpCon_short ---
14e     B9: #	out( B18 B10 ) <- in( B24 B25 B22 B8 B7 )  Freq: 0.87277

--- 73 jmpConU ---
14e     cmpq    RBP, RBX	# ptr


#### `test101()` still Working
Unfortunately, this only seemed to have affected the negative tests and not the positive `test101()`. I have not investigated further there why but it's not important for this bug fix.

#### Starting to Track `ArrayStoreExceptions` in Interpreter
With [JDK-8275908](https://bugs.openjdk.org/browse/JDK-8275908), we started to profile `ArrayStoreExceptions` in the interpreter already and propagate that information over the MDO to the C2 compiler:
https://github.com/openjdk/valhalla/blob/115d3dea17c1c8cc2f06f0b83ca9617a56099744/src/hotspot/share/interpreter/interpreterRuntime.cpp#L566-L567

#### Cannot Propagate Improved Type Information with `CheckCastPP` Due to Non-Zero Trap Count
With the additional profiling in the interpreter with JDK-8275908, we are no longer able to propagate the additional type information with the `CheckCastPP` because we require that the array store check trap was never taken at runtime:
https://github.com/openjdk/valhalla/blob/115d3dea17c1c8cc2f06f0b83ca9617a56099744/src/hotspot/share/opto/parseHelper.cpp#L196
https://github.com/openjdk/valhalla/blob/115d3dea17c1c8cc2f06f0b83ca9617a56099744/src/hotspot/share/opto/compile.cpp#L4617
As a result, we miss to propagate the exactness here:
https://github.com/openjdk/valhalla/blob/115d3dea17c1c8cc2f06f0b83ca9617a56099744/src/hotspot/share/opto/parseHelper.cpp#L215
And thus cannot remove the sub type checks in `test97-100()`.

#### Not Noticed with Broken `CHECKCAST_ARRAY` but only after the Late Barrier Expansion for G1 JEP
We should actually have found this after JDK-8275908 but as described above, `CHECKCAST_ARRAY` was broken. We only started to notice it when merging in the Late Barrier Expansion for G1 JEP which must have changed the scheduling decisions.

### Proposed Fix
To simulate the pre-JDK-8275908 behavior, we can simply set a warm-up of 0 which ensures that no trap is ever recorded to have been taken. We additionally require that the IR rules are only evaluated if `MonomorphicArrayCheck` is set which is a requirement to propagate the type:
https://github.com/openjdk/valhalla/blob/115d3dea17c1c8cc2f06f0b83ca9617a56099744/src/hotspot/share/opto/parseHelper.cpp#L171

#### Additional Tweaks
To get this working, I've also changed the following:
- Using `IRNode.SUBTYPE_CHECK` instead of `CHECKCAST_ARRAY` is easier and less fragile which does the matching on `CompilePhase.BEFORE_MACRO_EXPANSION`. 
  - Allowed us to remove `CHECKCAST_ARRAY`. 
- Noticed that `IRNode.SUBTYPE_CHECK` had the wrong default `CompilePhase` set (`CompilePhase.PRINT_IDEAL`). I've changed that by introducing a new `IRNode.macroNodes()` method which sets it to `CompilePhase.BEFORE_MACRO_EXPANSION` (this means if you omit `phase = ...` it will fall back to `CompilePhase.BEFORE_MACRO_EXPANSION` when using `IRNode.SUBTYPE_CHECK`).
  - This could also be upstreamed to mainline. 
  - As a result, I've noticed that `InlineTypeIRNode` is not initialized when we trying to fetch the default compile phases in the flag VM. Added `InlineTypeIRNode.forceStaticInitialization()` to `IRNode` and removed the other call. This should now ensure that we always have properly initialized the Valhalla specific IR nodes.

Thanks,
Christian

-------------

Commit messages:
 - Only run test97-100 with MonomorphicArrayCheck enabled
 - 8343423: [lworld] compiler/valhalla/inlinetypes/TestArrays.java IR mismatches after after merging JDK-8334060 in jdk-24+18

Changes: https://git.openjdk.org/valhalla/pull/1296/files
  Webrev: https://webrevs.openjdk.org/?repo=valhalla&pr=1296&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8343423
  Stats: 52 lines in 5 files changed: 31 ins; 7 del; 14 mod
  Patch: https://git.openjdk.org/valhalla/pull/1296.diff
  Fetch: git fetch https://git.openjdk.org/valhalla.git pull/1296/head:pull/1296

PR: https://git.openjdk.org/valhalla/pull/1296


More information about the valhalla-dev mailing list