RFR: 8308869: C2: use profile data in subtype checks when profile has more than one class [v9]

Tobias Hartmann thartmann at openjdk.org
Wed Aug 23 09:13:49 UTC 2023


On Wed, 19 Jul 2023 13:36:27 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> In this simple micro benchmark:
>> 
>> https://github.com/franz1981/java-puzzles/blob/main/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L70
>> 
>> Performance drops sharply with a polluted profile:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt     Score    Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1            false  thrpt   10  1453.372 ± 24.919  ops/us
>> 
>> 
>> to:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt   Score   Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  28.579 ± 2.280  ops/us
>> 
>> 
>> The test has 2 type checks to 2 different interfaces so caching with
>> `secondary_super_cache` doesn't help.
>> 
>> The micro-benchmark only uses 2 different concrete classes
>> (`DuplicatedContext` and `NonDuplicatedContext`) and they are recorded
>> in profile data at the type checks. But C2 only takes advantage of
>> profile data at type checks if it reports a single class.
>> 
>> What I propose is that the full-blown type check expanded in
>> `Phase::gen_subtype_check()` takes advantage of profile data. So in
>> the case of the micro benchmark, before checking the
>> `secondary_super_cache`, generated code checks whether the object
>> being type checked is a `DuplicatedContext` or a
>> `NonDuplicatedContext`.
>> 
>> This works fairly well on this micro benchmark:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt    Score    Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  871.224 ± 20.750  ops/us
>> 
>> 
>> It also scales much better if there are multiple threads running the
>> same test (`secondary_super_cache` doesn't scale well: see
>> JDK-8180450).
>> 
>> Now if the micro-benchmark is changed according to the comment:
>> 
>> https://github.com/franz1981/java-puzzles/blob/d2d60af3d0dfe7a2567807395138edcb1d1c24f5/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L62
>> 
>> so the type check hits in the `secondary_super_cache`, the current
>> code performs much better:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt    Score    Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  871.224 ± 20.750  ops/us
>> 
>> 
>> but leveraging profiling as explained above performs even better:
>> 
>> 
>> Benchmark                   ...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:
> 
>  - riscv support
>  - improvements to test
>  - Merge branch 'master' into JDK-8308869
>  - never common SubTypeCheckNode nodes
>  - keep both ways of doing profile
>  - whitespace
>  - reworked change
>  - Merge branch 'master' into JDK-8308869
>  - more test failures
>  - Merge branch 'master' into JDK-8308869
>  - ... and 6 more: https://git.openjdk.org/jdk/compare/674d5f17...8d9a08d1
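
As a rough Java-level analogy of the proposal quoted above (not the C2 implementation, which works on the compiler's IR), the sketch below tries exact-class comparisons against the classes seen in the profile before falling back to the full subtype check. Only the names `DuplicatedContext` and `NonDuplicatedContext` come from the description; the interfaces, the helper methods, and the class `ProfiledTypeCheckSketch` are hypothetical and exist only for illustration.

```java
final class ProfiledTypeCheckSketch {
    // Hypothetical interfaces standing in for the benchmark's type-check targets.
    interface Context {}
    interface InternalContext extends Context { boolean isDuplicated(); }

    // The two concrete classes named in the quoted description.
    static final class DuplicatedContext implements InternalContext {
        public boolean isDuplicated() { return true; }
    }
    static final class NonDuplicatedContext implements InternalContext {
        public boolean isDuplicated() { return false; }
    }

    // Stand-in for the full subtype check (the path that would consult
    // secondary_super_cache in generated code).
    static boolean fullCheck(Object o) {
        return o instanceof InternalContext;
    }

    // Profile-guided variant: compare against the profiled classes first,
    // and only fall back to the full check when neither matches.
    static boolean profiledCheck(Object o) {
        if (o == null) {
            return false;
        }
        Class<?> k = o.getClass();
        if (k == DuplicatedContext.class || k == NonDuplicatedContext.class) {
            return true; // fast path: exact class seen in the profile
        }
        return fullCheck(o); // slow path: unprofiled class
    }

    public static void main(String[] args) {
        System.out.println(profiledCheck(new DuplicatedContext()));
        System.out.println(profiledCheck(new NonDuplicatedContext()));
        System.out.println(profiledCheck("not a context"));
    }
}
```

The fast path never touches shared state, which is the intuition behind the better multi-threaded scaling mentioned in the description.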

I haven't reviewed this yet but plan to, probably after a short vacation next week. I did run some performance and correctness testing, though. Performance looks good (neutral). Correctness testing looks good too, except for this single failure:

`ProfileAtTypeCheck` fails IR verification with `-ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation`:


Failed IR Rules (11) of Methods (9)
-----------------------------------
1) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test10(boolean)" - [Failed IR rules: 1]:
   * @IR rule 2: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={ITER_GVN1}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "Iter GVN 1":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(SubTypeCheck.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 2 = 1 [given]
             - Matched nodes (2):
               * 45  SubTypeCheck  === _ 36 27  [[ 130 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test10:3 !jvms: ProfileAtTypeCheck::test10 @ bci:3 (line 270)
               * 90  SubTypeCheck  === _ 82 27  [[ 126 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test10:16 !jvms: ProfileAtTypeCheck::test10 @ bci:16 (line 272)

2) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test12()" - [Failed IR rules: 2]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={AFTER_PARSING}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#SUBTYPE_CHECK#_", "3"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "After Parsing":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(SubTypeCheck.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 0 = 3 [given]
           - No nodes matched!
   * @IR rule 2: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={PHASEIDEALLOOP_ITERATIONS}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "PhaseIdealLoop iterations":
       - NO compilation output found for this phase! Make sure this phase is emitted or remove it from the list of compile phases in the @IR rule to match on.

3) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test15(java.lang.Object)" - [Failed IR rules: 1]:
   * @IR rule 3: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={MACRO_EXPANSION}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CMP_P#_", "5", "_#LOAD_KLASS#_", "1", "_#LOAD_NKLASS#_", "1", "_#PARTIAL_SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={"UseCompressedClassPointers", "true", "UseParallelGC", "true"}, applyIfOr={}, applyIfNot={})"
     > Phase "Macro expand":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(CmpP.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 3 = 5 [given]
             - Matched nodes (3):
               * 39  CmpP  === _ 10 38  [[ 40 ]]  !jvms: ProfileAtTypeCheck::test15 @ bci:5 (line 429)
               * 117  CmpP  === _ 116 38  [[ 118 ]] 
               * 123  CmpP  === _ 97 35  [[ 125 ]]  !orig=[100]

4) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test2(java.lang.Object)" - [Failed IR rules: 2]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={AFTER_PARSING}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={}, failOn={"_#SUBTYPE_CHECK#_"}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "After Parsing":
       - failOn: Graph contains forbidden nodes:
         * Constraint 1: "(\\d+(\\s){2}(SubTypeCheck.*)+(\\s){2}===.*)"
           - Matched forbidden node:
             * 39  SubTypeCheck  === _ 30 21  [[ 53 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test2:1 !jvms: ProfileAtTypeCheck::test2 @ bci:1 (line 102)
   * @IR rule 3: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={AFTER_PARSING}, applyIf={"UseCompressedClassPointers", "true"}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CMP_P#_", "2", "_#LOAD_NKLASS#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "After Parsing":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(CmpP.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 2 [given]
             - Matched node:
               * 25  CmpP  === _ 10 24  [[ 26 ]]  !jvms: ProfileAtTypeCheck::test2 @ bci:1 (line 102)
         * Constraint 2: "(\\d+(\\s){2}(LoadNKlass.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 0 = 1 [given]
           - No nodes matched!

5) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test3(java.lang.Object)" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={AFTER_PARSING}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "After Parsing":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(SubTypeCheck.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 2 = 1 [given]
             - Matched nodes (2):
               * 40  SubTypeCheck  === _ 30 21  [[ 46 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test3:1 !jvms: ProfileAtTypeCheck::test3 @ bci:1 (line 120)
               * 62  SubTypeCheck  === _ 30 21  [[ 67 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test3:8 !jvms: ProfileAtTypeCheck::test3 @ bci:8 (line 121)

6) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test5(java.lang.Object)" - [Failed IR rules: 1]:
   * @IR rule 3: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={MACRO_EXPANSION}, applyIf={"UseCompressedClassPointers", "true"}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CMP_P#_", "5", "_#LOAD_KLASS#_", "1", "_#LOAD_NKLASS#_", "1", "_#PARTIAL_SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "Macro expand":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(CmpP.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 3 = 5 [given]
             - Matched nodes (3):
               * 25  CmpP  === _ 10 24  [[ 26 ]]  !jvms: ProfileAtTypeCheck::test5 @ bci:1 (line 161)
               * 108  CmpP  === _ 107 24  [[ 109 ]] 
               * 114  CmpP  === _ 88 21  [[ 116 ]]  !orig=[91]

7) Method "public static boolean compiler.c2.irTests.ProfileAtTypeCheck.test7(java.lang.Object)" - [Failed IR rules: 1]:
   * @IR rule 3: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={MACRO_EXPANSION}, applyIf={"UseCompressedClassPointers", "true"}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CMP_P#_", "5", "_#LOAD_KLASS#_", "1", "_#LOAD_NKLASS#_", "1", "_#PARTIAL_SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "Macro expand":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(CmpP.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 3 = 5 [given]
             - Matched nodes (3):
               * 26  CmpP  === _ 10 25  [[ 27 ]]  !jvms: ProfileAtTypeCheck::test7 @ bci:1 (line 198)
               * 86  CmpP  === _ 85 25  [[ 87 ]] 
               * 92  CmpP  === _ 66 22  [[ 94 ]]  !orig=[69]

8) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test8(java.lang.Object)" - [Failed IR rules: 1]:
   * @IR rule 3: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={MACRO_EXPANSION}, applyIf={"UseCompressedClassPointers", "true"}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#CMP_P#_", "5", "_#LOAD_KLASS#_", "1", "_#LOAD_NKLASS#_", "1", "_#PARTIAL_SUBTYPE_CHECK#_", "1"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "Macro expand":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(CmpP.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 3 = 5 [given]
             - Matched nodes (3):
               * 25  CmpP  === _ 10 24  [[ 26 ]]  !jvms: ProfileAtTypeCheck::test8 @ bci:1 (line 216)
               * 108  CmpP  === _ 107 24  [[ 109 ]] 
               * 114  CmpP  === _ 88 21  [[ 116 ]]  !orig=[91]

9) Method "public static void compiler.c2.irTests.ProfileAtTypeCheck.test9(boolean,boolean,java.lang.Object,java.lang.Object)" - [Failed IR rules: 1]:
   * @IR rule 2: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={PHASEIDEALLOOP1}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={"_#SUBTYPE_CHECK#_", "2"}, failOn={}, applyIfAnd={}, applyIfOr={}, applyIfNot={})"
     > Phase "PhaseIdealLoop 1":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(SubTypeCheck.*)+(\\s){2}===.*)"
           - Failed comparison: [found] 1 = 2 [given]
             - Matched node:
               * 104  SubTypeCheck  === _ 166 87  [[ 140 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test9:56 !jvms: ProfileAtTypeCheck::test9 @ bci:56 (line 253)
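
To make the `counts` failures above easier to read, here is a minimal, self-contained sketch that applies the `SubTypeCheck` constraint regex from the log to a few of the dumped node lines and compares the number of matches with the expected count. It only uses `java.util.regex`; the class `IrCountSketch` and its members are hypothetical, and this is not how the IR framework itself is implemented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IrCountSketch {
    // Constraint regex copied from the failure output above.
    static final Pattern SUBTYPE_CHECK =
        Pattern.compile("(\\d+(\\s){2}(SubTypeCheck.*)+(\\s){2}===.*)");

    // Node lines copied from the log: two SubTypeCheck nodes (test10)
    // and one CmpP node (test15) that must not be counted.
    static final String DUMP = String.join("\n",
        "45  SubTypeCheck  === _ 36 27  [[ 130 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test10:3",
        "90  SubTypeCheck  === _ 82 27  [[ 126 ]]  profiled at:  compiler.c2.irTests.ProfileAtTypeCheck::test10:16",
        "39  CmpP  === _ 10 38  [[ 40 ]]  !jvms: ProfileAtTypeCheck::test15 @ bci:5 (line 429)");

    // Counts how many times the pattern matches in the dump.
    static int countMatches(Pattern pattern, String dump) {
        Matcher m = pattern.matcher(dump);
        int found = 0;
        while (m.find()) {
            found++;
        }
        return found;
    }

    public static void main(String[] args) {
        int given = 1; // expected count from the @IR rule for test10
        int found = countMatches(SUBTYPE_CHECK, DUMP);
        // Mirrors the log line "Failed comparison: [found] 2 = 1 [given]".
        System.out.println("[found] " + found + " = " + given + " [given] -> "
            + (found == given ? "ok" : "failed"));
    }
}
```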

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14375#issuecomment-1689324669

