RFR: 8308869: C2: use profile data in subtype checks when profile has more than one class [v12]

Tobias Hartmann thartmann at openjdk.org
Tue Sep 5 11:45:46 UTC 2023


On Tue, 5 Sep 2023 09:47:38 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> In this simple micro benchmark:
>> 
>> https://github.com/franz1981/java-puzzles/blob/main/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L70
>> 
>> Performance drops sharply with polluted profile:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt     Score    Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1            false  thrpt   10  1453.372 ± 24.919  ops/us
>> 
>> 
>> to:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt   Score   Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  28.579 ± 2.280  ops/us
>> 
>> 
>> The test has 2 type checks to 2 different interfaces so caching with
>> `secondary_super_cache` doesn't help.
>> 
>> The micro-benchmark only uses 2 different concrete classes
>> (`DuplicatedContext` and `NonDuplicatedContext`) and they are recorded
>> in profile data at the type checks. But c2 only take advantage of
>> profile data at type checks if they report a single class.
>> 
>> What I propose is that the full blown type check expanded in
>> `Phase::gen_subtype_check()` takes advantage of profile data. So in
>> the case of the micro benchmark, before checking the
>> `secondary_super_cache`, generated code checks whether the object
>> being type checked is a `DuplicatedContext` or a
>> `NonDuplicatedContext`.
>> 
>> This works fairly well on this micro benchmark:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt    Score    Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  871.224 ± 20.750  ops/us
>> 
>> 
>> It also scales much better if there are multiple threads running the
>> same test (`secondary_super_cache` doesn't scale well: see
>> JDK-8180450).
>> 
>> Now if the micro-benchmark is changed according to the comment:
>> 
>> https://github.com/franz1981/java-puzzles/blob/d2d60af3d0dfe7a2567807395138edcb1d1c24f5/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L62
>> 
>> so the type check hits in the `secondary_super_cache`, the current
>> code performs much better:
>> 
>> 
>> Benchmark                                         (typePollution)   Mode  Cnt    Score    Error   Units
>> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  871.224 ± 20.750  ops/us
>> 
>> 
>> but leveraging profiling as explained above performs even better:
>> 
>> 
>> Benchmark                   ...
>
> Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 31 additional commits since the last revision:
> 
>  - review
>  - Merge branch 'master' into JDK-8308869
>  - Update src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/cpu/riscv/c1_LIRAssembler_riscv.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/cpu/riscv/c1_LIRAssembler_riscv.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/share/opto/graphKit.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - Update src/hotspot/share/opto/graphKit.cpp
>    
>    Co-authored-by: Tobias Hartmann <tobias.hartmann at oracle.com>
>  - ... and 21 more: https://git.openjdk.org/jdk/compare/6ae37ff5...d6e9fa0f

Thanks for making these changes. Looks good to me.

src/hotspot/share/opto/subtypenode.hpp line 64:

> 62: 
> 63: private:
> 64:   // method/bci that for this subtype check so profile data can be retrieved after parsing is over

Suggestion:

  // method/bci for this subtype check so profile data can be retrieved after parsing is over

-------------

Marked as reviewed by thartmann (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14375#pullrequestreview-1610856488
PR Review Comment: https://git.openjdk.org/jdk/pull/14375#discussion_r1315775294


More information about the hotspot-compiler-dev mailing list