RFR: 8331658: secondary_super_cache does not scale well: C1 [v2]

Vladimir Ivanov vlivanov at openjdk.org
Mon Jun 3 19:44:41 UTC 2024


On Wed, 29 May 2024 09:32:41 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> This is the C1 version of [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450).
>> 
>> The new logic in this PR is as simple as I can make it. It is a somewhat-simplified version of the C2 change in [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450). In order to reduce risk I haven't touched the existing slow subtype stub.
>> The register allocation logic in the existing code is pretty gnarly, and I have no desire to break anything at this point in the release cycle, so I have allocated just one register more than the existing code does.
>> 
>> Performance is pretty good. Before and after:
>> 
>> x64, AMD 2950X, 8 cores:
>> 
>> 
>> Benchmark                                   Mode  Cnt   Score   Error  Units
>> SecondarySuperCacheHits.test                avgt    5   0.959 ± 0.091  ns/op
>> SecondarySuperCacheInterContention.test     avgt    5  42.931 ± 6.951  ns/op
>> SecondarySuperCacheInterContention.test:t1  avgt    5  42.397 ± 7.708  ns/op
>> SecondarySuperCacheInterContention.test:t2  avgt    5  43.466 ± 8.238  ns/op
>> SecondarySuperCacheIntraContention.test     avgt    5  74.660 ± 0.127  ns/op
>> 
>> SecondarySuperCacheHits.test                avgt    5  1.480 ± 0.077  ns/op
>> SecondarySuperCacheInterContention.test     avgt    5  1.461 ± 0.063  ns/op
>> SecondarySuperCacheInterContention.test:t1  avgt    5  1.767 ± 0.078  ns/op
>> SecondarySuperCacheInterContention.test:t2  avgt    5  1.155 ± 0.052  ns/op
>> SecondarySuperCacheIntraContention.test     avgt    5  1.421 ± 0.002  ns/op
>> 
>> AArch64, Mac M3, 8 cores:
>> 
>> 
>> Benchmark                                   Mode  Cnt    Score    Error  Units
>> SecondarySuperCacheHits.test                avgt    5    0.835 ±  0.021  ns/op
>> SecondarySuperCacheInterContention.test     avgt    5   74.078 ± 18.095  ns/op
>> SecondarySuperCacheInterContention.test:t1  avgt    5   81.863 ± 42.492  ns/op
>> SecondarySuperCacheInterContention.test:t2  avgt    5   66.293 ± 11.254  ns/op
>> SecondarySuperCacheIntraContention.test     avgt    5  335.563 ±  6.171  ns/op
>> 
>> SecondarySuperCacheHits.test                avgt    5  1.212 ±  0.004  ns/op
>> SecondarySuperCacheInterContention.test     avgt    5  0.871 ±  0.002  ns/op
>> SecondarySuperCacheInterContention.test:t1  avgt    5  0.626 ±  0.003  ns/op
>> SecondarySuperCacheInterContention.test:t2  avgt    5  1.115 ±  0.006  ns/op
>> SecondarySuperCacheIntraContention.test     avgt    5  0.696 ±  0.001  ns/op
>> 
>> 
>> 
>> The first test, `SecondarySuperCacheHits`, shows a small regression. It's...
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
> 
>   JDK-8331658: secondary_super_cache does not scale well: C1

It's unfortunate to see a C1-specific version of the secondary supers table lookup. Why don't you reuse `MacroAssembler::lookup_secondary_supers_table` instead?

Also, in the context of C1, do the performance benefits justify the additional implementation complexity? As an alternative, migrating `MacroAssembler::check_klass_subtype_slow_path` away from linear search to a table lookup would also do the job and would cover all cases of subtype checks in the JVM.
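
To make the contrast between the two strategies concrete, here is a minimal Java sketch of a linear scan versus a bitmap-indexed table lookup. It is only an illustration under stated assumptions, not HotSpot code: the class `SubtypeLookupSketch`, the `slot` hash function, the use of plain `int` ids in place of `Klass*`, and the 64-slot open-addressing layout are all hypothetical choices made for this example.

```java
import java.util.Arrays;

// Illustrative model only: ints stand in for klass pointers.
final class SubtypeLookupSketch {

    // Linear search over the secondary supers array: O(n) per check.
    static boolean linearLookup(int[] secondarySupers, int superId) {
        for (int s : secondarySupers) {
            if (s == superId) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical hash: maps an id to a home slot in the range 0..63.
    static int slot(int superId) {
        return (superId * 0x9E3779B9) >>> 26;
    }

    // Immutable per-class table: a 64-bit occupancy bitmap plus the
    // entries packed in slot order. Nothing is written during lookup.
    static final class HashedSupers {
        final long bitmap;
        final int[] packed;

        HashedSupers(int[] secondarySupers) {
            if (secondarySupers.length > 64) {
                throw new IllegalArgumentException("sketch supports at most 64 entries");
            }
            int[] table = new int[64];
            long bm = 0L;
            for (int id : secondarySupers) {
                int s = slot(id);
                // Open addressing: probe forward (wrapping) to a free slot.
                while ((bm & (1L << s)) != 0) {
                    s = (s + 1) & 63;
                }
                table[s] = id;
                bm |= 1L << s;
            }
            int[] p = new int[secondarySupers.length];
            int n = 0;
            for (int s = 0; s < 64; s++) {
                if ((bm & (1L << s)) != 0) {
                    p[n++] = table[s];
                }
            }
            bitmap = bm;
            packed = p;
        }

        boolean contains(int superId) {
            int s = slot(superId);
            // Index of slot s in the packed array = occupied slots below s.
            int i = Long.bitCount(bitmap & ((1L << s) - 1));
            for (int probes = 0; probes < packed.length; probes++) {
                if ((bitmap & (1L << s)) == 0) {
                    return false; // empty slot ends the probe sequence
                }
                if (packed[i] == superId) {
                    return true;
                }
                s = (s + 1) & 63;
                i = (i + 1) % packed.length;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        int[] supers = {3, 17, 42, 99, 1000};
        HashedSupers hashed = new HashedSupers(supers);
        System.out.println(Arrays.toString(supers));
        System.out.println(linearLookup(supers, 42) + " " + hashed.contains(42));
        System.out.println(linearLookup(supers, 7) + " " + hashed.contains(7));
    }
}
```

The point of the sketch is that the table is read-only after construction, so concurrent lookups do not write to shared state the way a one-element cache does, and a miss is usually detected from the bitmap alone.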

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19426#issuecomment-2145980997


More information about the hotspot-compiler-dev mailing list