RFR: 8331117: [PPC64] secondary_super_cache does not scale well [v5]
Richard Reingruber
rrich at openjdk.org
Thu Jun 13 15:02:17 UTC 2024
On Wed, 12 Jun 2024 14:35:42 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:
>> PPC64 implementation of [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450). Please review!
>> I noticed that `r_array_length` is sometimes 0 and I don't see code for that on x86. Any idea?
>> How can we verify it? By comparing the performance using the micro benchmarks?
>>
>> Micro benchmark results without patch (measured on Power10 with 2*8 hardware threads):
>>
>> Original
>> SecondarySuperCacheHits: 13.033 ±(99.9%) 0.058 ns/op [Average]
>> SecondarySuperCacheInterContention.test avgt 15 432.366 ± 8.364 ns/op
>> SecondarySuperCacheInterContention.test:t1 avgt 15 432.310 ± 8.460 ns/op
>> SecondarySuperCacheInterContention.test:t2 avgt 15 432.422 ± 10.819 ns/op
>> SecondarySuperCacheIntraContention.test avgt 15 355.192 ± 3.597 ns/op
>> SecondarySupersLookup.testNegative00 avgt 15 12.274 ± 0.026 ns/op
>> SecondarySupersLookup.testNegative01 avgt 15 12.300 ± 0.039 ns/op
>> SecondarySupersLookup.testNegative02 avgt 15 12.304 ± 0.034 ns/op
>> SecondarySupersLookup.testNegative03 avgt 15 12.276 ± 0.050 ns/op
>> SecondarySupersLookup.testNegative04 avgt 15 12.235 ± 0.044 ns/op
>> SecondarySupersLookup.testNegative05 avgt 15 12.308 ± 0.156 ns/op
>> SecondarySupersLookup.testNegative06 avgt 15 12.291 ± 0.048 ns/op
>> SecondarySupersLookup.testNegative07 avgt 15 12.307 ± 0.052 ns/op
>> SecondarySupersLookup.testNegative08 avgt 15 12.398 ± 0.075 ns/op
>> SecondarySupersLookup.testNegative09 avgt 15 12.552 ± 0.122 ns/op
>> SecondarySupersLookup.testNegative10 avgt 15 12.490 ± 0.083 ns/op
>> SecondarySupersLookup.testNegative16 avgt 15 12.565 ± 0.092 ns/op
>> SecondarySupersLookup.testNegative20 avgt 15 19.059 ± 0.958 ns/op
>> SecondarySupersLookup.testNegative30 avgt 15 19.268 ± 0.124 ns/op
>> SecondarySupersLookup.testNegative32 avgt 15 20.059 ± 0.114 ns/op
>> SecondarySupersLookup.testNegative40 avgt 15 25.117 ± 0.368 ns/op
>> SecondarySupersLookup.testNegative50 avgt 15 32.735 ± 0.359 ns/op
>> SecondarySupersLookup.testNegative55 avgt 15 34.866 ± 0.152 ns/op
>> SecondarySupersLookup.testNegative56 avgt 15 35.492 ± 0.276 ns/op
>> SecondarySupersLookup.testNegative57 avgt 15 36.620 ± 0.334 ns/op
>> SecondarySupersLookup.testNegative58 avgt 15 37.226 ± 0.180 ns/op
>> SecondarySupersLookup.testNegative59 avgt 15 37.774 ± 0.241 ns/op
>> SecondarySupersLookup.testNegative60 avgt 15 38.627 ± 1.451 ns/op
>> SecondarySupersLookup.testNegative61 avgt 15 ...
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove pointless assertion.
Hi Martin,
thanks for the port. It looks good. I've only got a few minor comments.
Cheers, Richard.
src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2166:
> 2164:
> 2165: // Return true: we succeeded in generating this code
> 2166: bool MacroAssembler::lookup_secondary_supers_table(Register r_sub_klass,
The method always returns `true`. Should even return a value?
src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2176:
> 2174: assert_different_registers(r_sub_klass, r_super_klass, temp1, temp2, temp3, temp4, result);
> 2175:
> 2176: Label L_fallthrough;
`L_done` would be a better name.
src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2232:
> 2230:
> 2231: // Linear probe. Rotate the bitmap so that the next bit to test is
> 2232: // in Bit 1.
It's bit 2 that's tested next after the rotation, isn't it? See L2331 in `lookup_secondary_supers_table_slow_path`
Suggestion:
// in Bit 2.
src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2274:
> 2272: LOOKUP_SECONDARY_SUPERS_TABLE_REGISTERS;
> 2273:
> 2274: Label L_fallthrough;
`L_done` would be a better name.
src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2328:
> 2326: ldx(result, r_array_base, r_array_index);
> 2327: xor_(result, result, r_super_klass);
> 2328: beq(CCR0, L_fallthrough);
You might add a comment `success (result == 0)`.
src/hotspot/cpu/ppc/ppc.ad line 12059:
> 12057: u1 super_klass_slot = ((Klass*)$super_con$$constant)->hash_slot();
> 12058: if (InlineSecondarySupersTest) {
> 12059: success = __ lookup_secondary_supers_table($sub$$Register, $super_reg$$Register,
`success` is always true. Can it be removed?
-------------
PR Review: https://git.openjdk.org/jdk/pull/19368#pullrequestreview-2113235016
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1638232248
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1636593154
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1637747037
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1637842324
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1637847137
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1638229107
More information about the hotspot-compiler-dev
mailing list