RFR: 8331117: [PPC64] secondary_super_cache does not scale well [v5]

Richard Reingruber rrich at openjdk.org
Thu Jun 13 15:02:17 UTC 2024


On Wed, 12 Jun 2024 14:35:42 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450). Please review!
>> I noticed that `r_array_length` is sometimes 0 and I don't see code for that on x86. Any idea?
>> How can we verify it? By comparing the performance using the micro benchmarks?
>> 
>> Micro benchmark results without patch (measured on Power10 with 2*8 hardware threads):
>> 
>> Original 
>> SecondarySuperCacheHits: 13.033 ±(99.9%) 0.058 ns/op [Average]
>> SecondarySuperCacheInterContention.test     avgt   15  432.366 ±  8.364  ns/op
>> SecondarySuperCacheInterContention.test:t1  avgt   15  432.310 ±  8.460  ns/op
>> SecondarySuperCacheInterContention.test:t2  avgt   15  432.422 ± 10.819  ns/op
>> SecondarySuperCacheIntraContention.test  avgt   15  355.192 ± 3.597  ns/op
>> SecondarySupersLookup.testNegative00  avgt   15  12.274 ± 0.026  ns/op
>> SecondarySupersLookup.testNegative01  avgt   15  12.300 ± 0.039  ns/op
>> SecondarySupersLookup.testNegative02  avgt   15  12.304 ± 0.034  ns/op
>> SecondarySupersLookup.testNegative03  avgt   15  12.276 ± 0.050  ns/op
>> SecondarySupersLookup.testNegative04  avgt   15  12.235 ± 0.044  ns/op
>> SecondarySupersLookup.testNegative05  avgt   15  12.308 ± 0.156  ns/op
>> SecondarySupersLookup.testNegative06  avgt   15  12.291 ± 0.048  ns/op
>> SecondarySupersLookup.testNegative07  avgt   15  12.307 ± 0.052  ns/op
>> SecondarySupersLookup.testNegative08  avgt   15  12.398 ± 0.075  ns/op
>> SecondarySupersLookup.testNegative09  avgt   15  12.552 ± 0.122  ns/op
>> SecondarySupersLookup.testNegative10  avgt   15  12.490 ± 0.083  ns/op
>> SecondarySupersLookup.testNegative16  avgt   15  12.565 ± 0.092  ns/op
>> SecondarySupersLookup.testNegative20  avgt   15  19.059 ± 0.958  ns/op
>> SecondarySupersLookup.testNegative30  avgt   15  19.268 ± 0.124  ns/op
>> SecondarySupersLookup.testNegative32  avgt   15  20.059 ± 0.114  ns/op
>> SecondarySupersLookup.testNegative40  avgt   15  25.117 ± 0.368  ns/op
>> SecondarySupersLookup.testNegative50  avgt   15  32.735 ± 0.359  ns/op
>> SecondarySupersLookup.testNegative55  avgt   15  34.866 ± 0.152  ns/op
>> SecondarySupersLookup.testNegative56  avgt   15  35.492 ± 0.276  ns/op
>> SecondarySupersLookup.testNegative57  avgt   15  36.620 ± 0.334  ns/op
>> SecondarySupersLookup.testNegative58  avgt   15  37.226 ± 0.180  ns/op
>> SecondarySupersLookup.testNegative59  avgt   15  37.774 ± 0.241  ns/op
>> SecondarySupersLookup.testNegative60  avgt   15  38.627 ± 1.451  ns/op
>> SecondarySupersLookup.testNegative61  avgt   15 ...
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove pointless assertion.

Hi Martin,
thanks for the port. It looks good. I've only got a few minor comments.
Cheers, Richard.

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2166:

> 2164: 
> 2165: // Return true: we succeeded in generating this code
> 2166: bool MacroAssembler::lookup_secondary_supers_table(Register r_sub_klass,

The method always returns `true`. Should even return a value?

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2176:

> 2174:   assert_different_registers(r_sub_klass, r_super_klass, temp1, temp2, temp3, temp4, result);
> 2175: 
> 2176:   Label L_fallthrough;

`L_done` would be a better name.

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2232:

> 2230: 
> 2231:   // Linear probe. Rotate the bitmap so that the next bit to test is
> 2232:   // in Bit 1.

It's bit 2 that's tested next after the rotation, isn't it? See L2331 in `lookup_secondary_supers_table_slow_path`
Suggestion:

  // in Bit 2.

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2274:

> 2272:   LOOKUP_SECONDARY_SUPERS_TABLE_REGISTERS;
> 2273: 
> 2274:   Label L_fallthrough;

`L_done` would be a better name.

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2328:

> 2326:     ldx(result, r_array_base, r_array_index);
> 2327:     xor_(result, result, r_super_klass);
> 2328:     beq(CCR0, L_fallthrough);

You might add a comment `success (result == 0)`.

src/hotspot/cpu/ppc/ppc.ad line 12059:

> 12057:     u1 super_klass_slot = ((Klass*)$super_con$$constant)->hash_slot();
> 12058:     if (InlineSecondarySupersTest) {
> 12059:       success = __ lookup_secondary_supers_table($sub$$Register, $super_reg$$Register,

`success` is always true. Can it be removed?

-------------

PR Review: https://git.openjdk.org/jdk/pull/19368#pullrequestreview-2113235016
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1638232248
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1636593154
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1637747037
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1637842324
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1637847137
PR Review Comment: https://git.openjdk.org/jdk/pull/19368#discussion_r1638229107


More information about the hotspot-compiler-dev mailing list