RFR: 8308869: C2: use profile data in subtype checks when profile has more than one class [v5]
Roland Westrelin
roland at openjdk.org
Thu Jun 15 13:08:36 UTC 2023
> In this simple micro benchmark:
>
> https://github.com/franz1981/java-puzzles/blob/main/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L70
>
> Performance drops sharply with a polluted profile, from:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt     Score    Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1            false  thrpt   10  1453.372 ± 24.919  ops/us
>
>
> to:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt     Score    Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10    28.579 ±  2.280  ops/us
>
>
> The test has 2 type checks to 2 different interfaces so caching with
> `secondary_super_cache` doesn't help.
>
> The micro-benchmark only uses 2 different concrete classes
> (`DuplicatedContext` and `NonDuplicatedContext`) and they are recorded
> in profile data at the type checks. But C2 only takes advantage of
> profile data at a type check if the profile reports a single class.
>
> What I propose is that the full-blown type check expanded in
> `Phase::gen_subtype_check()` takes advantage of profile data. So in
> the case of the micro benchmark, before checking the
> `secondary_super_cache`, generated code checks whether the object
> being type checked is a `DuplicatedContext` or a
> `NonDuplicatedContext`.
>
> This works fairly well on this micro benchmark:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt     Score    Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10   871.224 ± 20.750  ops/us
>
>
> It also scales much better if there are multiple threads running the
> same test (`secondary_super_cache` doesn't scale well: see
> JDK-8180450).
>
> Now if the micro-benchmark is changed according to the comment:
>
> https://github.com/franz1981/java-puzzles/blob/d2d60af3d0dfe7a2567807395138edcb1d1c24f5/src/main/java/red/hat/puzzles/polymorphism/RequireNonNullCheckcastScalability.java#L62
>
> so the type check hits in the `secondary_super_cache`, the current
> code performs much better:
>
>
> Benchmark                                         (typePollution)   Mode  Cnt     Score    Error   Units
> RequireNonNullCheckcastScalability.isDuplicated1             true  thrpt   10   871.224 ± 20.750  ops/us
>
>
> but leveraging profiling as explained above performs even better:
>
>
> Benchmark (typePollution) Mode Cnt Score Error Units
> RequireNonNullCheckcastScalability.isDuplic...
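
For readers skimming the quoted description, the proposed fast path can be sketched in plain Java as below. This is only an illustration of the idea, not the C2 change itself: the real code is emitted as compiler IR, and the class `ProfiledSubtypeCheckSketch`, the interfaces `ContextA` and `ContextB`, the class `OtherContext`, and the `slowPathCalls` counter are all hypothetical names made up for this sketch. It also assumes the outcome for each profiled class can be decided when the check is compiled; `Class.isInstance` stands in for the full check that probes `secondary_super_cache`.

```java
final class ProfiledSubtypeCheckSketch {
    // Hypothetical interfaces standing in for the two interfaces the benchmark checks against.
    interface ContextA {}
    interface ContextB {}

    // The two concrete classes named in the description.
    static final class DuplicatedContext implements ContextA, ContextB {}
    static final class NonDuplicatedContext implements ContextA, ContextB {}

    // Hypothetical class that never shows up in the profile.
    static final class OtherContext implements ContextA {}

    // Counts how often the sketch falls back to the full check.
    static int slowPathCalls = 0;

    // Stand-in for the full subtype check (the path that would probe secondary_super_cache).
    static boolean fullSubtypeCheck(Object obj, Class<?> target) {
        slowPathCalls++;
        return target.isInstance(obj);
    }

    // Compare the receiver's exact class against each profiled class first.
    // Assumption: the answer for a profiled class is known when the check is compiled,
    // so a hit costs one pointer comparison; only a miss takes the full check.
    static boolean profiledInstanceOf(Object obj, Class<?> target, Class<?>... profiled) {
        if (obj == null) {
            return false;
        }
        Class<?> klass = obj.getClass();
        for (Class<?> p : profiled) {
            if (klass == p) {
                return target.isAssignableFrom(p);
            }
        }
        return fullSubtypeCheck(obj, target);
    }

    public static void main(String[] args) {
        Class<?>[] profile = { DuplicatedContext.class, NonDuplicatedContext.class };
        Object[] inputs = { new DuplicatedContext(), new NonDuplicatedContext(), new OtherContext() };
        for (Object o : inputs) {
            boolean result = profiledInstanceOf(o, ContextB.class, profile);
            System.out.println(o.getClass().getSimpleName() + " instanceof ContextB: " + result
                    + ", full checks so far: " + slowPathCalls);
        }
    }
}
```

In this sketch, objects of the two profiled classes never reach `fullSubtypeCheck`, which mirrors why the polluted-profile case stops depending on `secondary_super_cache`; an object of any other class still falls through to the full check.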
Roland Westrelin has updated the pull request incrementally with one additional commit since the last revision:
whitespaces
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/14375/files
- new: https://git.openjdk.org/jdk/pull/14375/files/a2c8055c..6daa01d0
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=14375&range=04
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=14375&range=03-04
Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/14375.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14375/head:pull/14375
PR: https://git.openjdk.org/jdk/pull/14375