RFR: 8331658: secondary_super_cache does not scale well: C1
Andrew Haley
aph at openjdk.org
Tue May 28 15:17:17 UTC 2024
This is the C1 version of [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450).
The new logic in this PR is as simple as I can make it. It is a somewhat-simplified version of the C2 change in [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450). In order to reduce risk I haven't touched the existing slow subtype stub.
The register allocation logic in the existing code is pretty gnarly, and I have no desire to break anything at this point in the release cycle, so I have allocated just one register more than the existing code.
Performance is pretty good. Before and after:
x64, AMD 2950X, 8 cores:
Before:
Benchmark                                   Mode  Cnt    Score     Error  Units
SecondarySuperCacheHits.test                avgt    5    0.959 ±  0.091  ns/op
SecondarySuperCacheInterContention.test     avgt    5   42.931 ±  6.951  ns/op
SecondarySuperCacheInterContention.test:t1  avgt    5   42.397 ±  7.708  ns/op
SecondarySuperCacheInterContention.test:t2  avgt    5   43.466 ±  8.238  ns/op
SecondarySuperCacheIntraContention.test     avgt    5   74.660 ±  0.127  ns/op
After:
Benchmark                                   Mode  Cnt    Score     Error  Units
SecondarySuperCacheHits.test                avgt    5    1.480 ±  0.077  ns/op
SecondarySuperCacheInterContention.test     avgt    5    1.461 ±  0.063  ns/op
SecondarySuperCacheInterContention.test:t1  avgt    5    1.767 ±  0.078  ns/op
SecondarySuperCacheInterContention.test:t2  avgt    5    1.155 ±  0.052  ns/op
SecondarySuperCacheIntraContention.test     avgt    5    1.421 ±  0.002  ns/op
AArch64, Mac M3, 8 cores:
Before:
Benchmark                                   Mode  Cnt    Score     Error  Units
SecondarySuperCacheHits.test                avgt    5    0.835 ±  0.021  ns/op
SecondarySuperCacheInterContention.test     avgt    5   74.078 ± 18.095  ns/op
SecondarySuperCacheInterContention.test:t1  avgt    5   81.863 ± 42.492  ns/op
SecondarySuperCacheInterContention.test:t2  avgt    5   66.293 ± 11.254  ns/op
SecondarySuperCacheIntraContention.test     avgt    5  335.563 ±  6.171  ns/op
After:
Benchmark                                   Mode  Cnt    Score     Error  Units
SecondarySuperCacheHits.test                avgt    5    1.212 ±  0.004  ns/op
SecondarySuperCacheInterContention.test     avgt    5    0.871 ±  0.002  ns/op
SecondarySuperCacheInterContention.test:t1  avgt    5    0.626 ±  0.003  ns/op
SecondarySuperCacheInterContention.test:t2  avgt    5    1.115 ±  0.006  ns/op
SecondarySuperCacheIntraContention.test     avgt    5    0.696 ±  0.001  ns/op
The first test, `SecondarySuperCacheHits`, shows a small regression. It's the "happy path", which simply checks the same subclass again and again in a loop, in a single thread. I suspect that, as with the C2 experiments we did, this will never be noticeable. All the other tests, though, show a huge improvement, so performance is a lot more predictable.
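For readers unfamiliar with the benchmarks, the following is a minimal Java sketch of the two access patterns being contrasted. It is not the JMH benchmark source: the names `I1`, `I2` and `Impl` are hypothetical, and it assumes that the contended tests check one class against two different interfaces in turn, as in the scenario described in JDK-8180450. A plain loop like this is also not a valid measurement (the JIT may fold the checks away); it only illustrates the shape of the work.

```java
public class SecondarySuperPatterns {
    // Hypothetical types for illustration. Interfaces are the typical
    // example of "secondary" supertypes in HotSpot.
    public interface I1 {}
    public interface I2 {}
    public static final class Impl implements I1, I2 {}

    // Happy path: the same object is checked against the same interface
    // on every iteration, in a single thread.
    public static int sameCheck(Object o, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (o instanceof I1) hits++;
        }
        return hits;
    }

    // Alternating pattern (assumed shape of the contended cases): the same
    // object is checked against two different interfaces in turn.
    public static int alternatingCheck(Object o, int iterations) {
        int hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (o instanceof I1) hits++;
            if (o instanceof I2) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        Object o = new Impl();
        System.out.println(sameCheck(o, 1_000));
        System.out.println(alternatingCheck(o, 1_000));
    }
}
```

To exercise pure C1 with code like this, one option is to run with `-XX:TieredStopAtLevel=1`, although real numbers should come from the JMH benchmarks listed above.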
This patch only affects `checkcast` and `instanceof`. The performance of `Class::isInstance()` isn't affected because it's not intrinsified in C1, and neither is any of the logic for arraycopy intrinsics.
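As a rough illustration of which source-level constructs are covered, here is a small Java sketch contrasting the three forms of subtype check mentioned above. The type names `Marker` and `Tagged` and the method names are made up for the example; the comments restate what this message says about C1 and are not a claim about other compilers.

```java
public class SubtypeCheckForms {
    // Hypothetical types for illustration only.
    public interface Marker {}
    public static final class Tagged implements Marker {}

    // Compiles to the `instanceof` bytecode: affected by this patch.
    public static boolean byInstanceof(Object o) {
        return o instanceof Marker;
    }

    // Compiles to the `checkcast` bytecode: affected by this patch.
    // Throws ClassCastException if o is non-null and not a Marker.
    public static Marker byCheckcast(Object o) {
        return (Marker) o;
    }

    // A call to Class::isInstance: per the text above, not intrinsified
    // in C1, so its performance is not changed by this patch.
    public static boolean byIsInstance(Object o) {
        return Marker.class.isInstance(o);
    }

    public static void main(String[] args) {
        Object tagged = new Tagged();
        System.out.println(byInstanceof(tagged));
        System.out.println(byCheckcast(tagged) == tagged);
        System.out.println(byIsInstance(tagged));
    }
}
```

All three forms give the same answer for a given object; the difference is only in which code path the compiler emits for them.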
After the next release is done, I'd like to do a big cleanup and simplification of subtype checking, which should include the still-missing parts of C1 and the interpreter and make everything much more maintainable.
Finally, this patch doesn't help much with tiered compilation, because there the run time of subtype checks is heavily affected by profile counter updates. It's really all about pure C1, which seems to be popular in some short-lived cloud applications.
-------------
Commit messages:
- JDK-8331341: secondary_super_cache does not scale well: C1 and interpreter
- JDK-8331341: secondary_super_cache does not scale well: C1 and interpreter
- Merge branch 'clean' into C1-hash-supers
- JDK-8331341: secondary_super_cache does not scale well: C1 and interpreter
- JDK-8331341: secondary_super_cache does not scale well: C1 and interpreter
- JDK-8331341: secondary_super_cache does not scale well: C1 and interpreter
- JDK-8331341: secondary_super_cache does not scale well: C1 and interpreter
- Test
- Merge branch 'C1-hash-supers' of https://github.com/theRealAph/jdk into C1-hash-supers
- Merge branch 'C1-hash-supers' of https://github.com/theRealAph/jdk into C1-hash-supers
- ... and 95 more: https://git.openjdk.org/jdk/compare/235ba9a7...8c05732c
Changes: https://git.openjdk.org/jdk/pull/19426/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19426&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8331658
Stats: 232 lines in 9 files changed: 205 ins; 3 del; 24 mod
Patch: https://git.openjdk.org/jdk/pull/19426.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/19426/head:pull/19426
PR: https://git.openjdk.org/jdk/pull/19426