RFR: 8308869: C2: use profile data in subtype checks when profile has more than one class [v7]
Roland Westrelin
roland at openjdk.org
Thu Jun 29 14:54:30 UTC 2023
> In this simple micro benchmark:
>
> https://github.com/franz1981/java-puzzles/blob/main/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L70
>
> Performance drops sharply with a polluted profile, from:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt     Score     Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1            false  thrpt   10  1453.372  ± 24.919  ops/us
>
>
> to:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt   Score    Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  28.579  ± 2.280  ops/us
>
>
> The test has 2 type checks to 2 different interfaces so caching with
> `secondary_super_cache` doesn't help.
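
To illustrate why a single cached entry does not help here, the following is a minimal model and not HotSpot code. It assumes a class keeps a one-element cache of the last interface that passed a slow-path check; `KlassModel`, `InterfaceA` and `InterfaceB` are hypothetical names chosen for the sketch. Alternating between two interfaces evicts the entry on every check, so each check ends up on the slow path.

```java
import java.util.List;

public class SecondarySuperCacheSketch {
    // Hypothetical stand-in for a class descriptor: a list of implemented
    // interfaces plus a one-element cache of the last successful lookup.
    static final class KlassModel {
        final List<String> secondarySupers;
        String secondarySuperCache; // single entry shared by all type checks
        long slowPathScans;         // number of times the list had to be scanned

        KlassModel(List<String> secondarySupers) {
            this.secondarySupers = secondarySupers;
        }

        boolean isSubtypeOf(String iface) {
            if (iface.equals(secondarySuperCache)) {
                return true; // cache hit: no scan needed
            }
            slowPathScans++; // cache miss: linear scan of the interfaces
            if (secondarySupers.contains(iface)) {
                secondarySuperCache = iface; // overwrite the single entry
                return true;
            }
            return false;
        }
    }

    // Two checks per iteration against two different interfaces.
    static long scansWhenAlternating(int iterations) {
        KlassModel k = new KlassModel(List.of("InterfaceA", "InterfaceB"));
        for (int i = 0; i < iterations; i++) {
            k.isSubtypeOf("InterfaceA");
            k.isSubtypeOf("InterfaceB");
        }
        return k.slowPathScans;
    }

    // Two checks per iteration against the same interface.
    static long scansWhenSingleInterface(int iterations) {
        KlassModel k = new KlassModel(List.of("InterfaceA", "InterfaceB"));
        for (int i = 0; i < iterations; i++) {
            k.isSubtypeOf("InterfaceA");
            k.isSubtypeOf("InterfaceA");
        }
        return k.slowPathScans;
    }

    public static void main(String[] args) {
        System.out.println(scansWhenAlternating(1000));     // prints 2000
        System.out.println(scansWhenSingleInterface(1000)); // prints 1
    }
}
```

In this model the alternating pattern scans on every check, while the single-interface pattern scans once and then always hits the cache.
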
>
> The micro-benchmark only uses 2 different concrete classes
> (`DuplicatedContext` and `NonDuplicatedContext`) and they are recorded
> in profile data at the type checks. But C2 only takes advantage of
> profile data at type checks if it reports a single class.
>
> What I propose is that the full-blown type check expanded in
> `Phase::gen_subtype_check()` takes advantage of profile data. So in
> the case of the micro benchmark, before checking the
> `secondary_super_cache`, generated code checks whether the object
> being type checked is a `DuplicatedContext` or a
> `NonDuplicatedContext`.
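
The idea can be sketched at the Java level, purely as an illustration: the real change is in the code C2 generates, not in source like this. The sketch assumes the profile recorded exactly `DuplicatedContext` and `NonDuplicatedContext`. The class bodies, the `Context` and `Duplicable` interfaces, `OtherContext` and the counters are all hypothetical. The receiver's exact class is compared against the profiled classes first, where the answer is already known, and only unprofiled classes fall through to the general check.

```java
public class ProfiledSubtypeCheckSketch {
    // Hypothetical interfaces and class bodies; only the two class names
    // DuplicatedContext and NonDuplicatedContext come from the benchmark.
    interface Context {}
    interface Duplicable {}
    static final class DuplicatedContext implements Context, Duplicable {}
    static final class NonDuplicatedContext implements Context {}
    static final class OtherContext implements Context, Duplicable {}

    static int fastPathHits;   // checks answered by an exact class comparison
    static int slowPathChecks; // checks that fell through to the general test

    // Models "o instanceof Duplicable" with a profile-guided prefix.
    static boolean isDuplicable(Object o) {
        if (o == null) {
            return false; // instanceof is false for null
        }
        Class<?> k = o.getClass();
        // Profiled classes: the result is a constant for each exact class.
        if (k == DuplicatedContext.class) {
            fastPathHits++;
            return true;
        }
        if (k == NonDuplicatedContext.class) {
            fastPathHits++;
            return false;
        }
        // Any other class: stand-in for the full subtype check,
        // which is where the secondary super cache would be consulted.
        slowPathChecks++;
        return Duplicable.class.isInstance(o);
    }

    public static void main(String[] args) {
        System.out.println(isDuplicable(new DuplicatedContext()));
        System.out.println(isDuplicable(new NonDuplicatedContext()));
        System.out.println(isDuplicable(new OtherContext()));
        System.out.println(fastPathHits + " fast, " + slowPathChecks + " slow");
    }
}
```

The fast path only reads the object's class and compares it, so in this model it touches no shared mutable state, in contrast to a cache that is rewritten on each miss.
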
>
> This works fairly well on this micro benchmark:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt    Score     Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  871.224  ± 20.750  ops/us
>
>
> It also scales much better if there are multiple threads running the
> same test (`secondary_super_cache` doesn't scale well: see
> JDK-8180450).
>
> Now if the micro-benchmark is changed according to the comment:
>
> https://github.com/franz1981/java-puzzles/blob/d2d60af3d0dfe7a2567807395138edcb1d1c24f5/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L62
>
> so the type check hits in the `secondary_super_cache`, the current
> code performs much better:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt    Score     Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10  871.224  ± 20.750  ops/us
>
>
> but leveraging profiling as explained above performs even better:
>
>
> Benchmark (typePollution) Mode Cnt Score Error Units
> RequireNonNullCheckcastScalability.isDuplic...
Roland Westrelin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision:
- whitespace
- reworked change
- Merge branch 'master' into JDK-8308869
- more test failures
- Merge branch 'master' into JDK-8308869
- whitespaces
- test failures
- review
- 32 bit fix
- white spaces
- ... and 1 more: https://git.openjdk.org/jdk/compare/591890fc...101399eb
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/14375/files
- new: https://git.openjdk.org/jdk/pull/14375/files/684f7520..101399eb
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=14375&range=06
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=14375&range=05-06
Stats: 10540 lines in 525 files changed: 5401 ins; 2092 del; 3047 mod
Patch: https://git.openjdk.org/jdk/pull/14375.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14375/head:pull/14375
PR: https://git.openjdk.org/jdk/pull/14375