RFR: 8332587: RISC-V: secondary_super_cache does not scale well
Hamlin Li
mli at openjdk.org
Mon Jun 3 11:12:04 UTC 2024
On Tue, 21 May 2024 08:31:53 GMT, Gui Cao <gcao at openjdk.org> wrote:
> Implementation of subtype checking [JDK-8180450](https://bugs.openjdk.org/browse/JDK-8180450) for linux-riscv64.
> This optimization depends on availability of the Zbb extension which has the cpop instruction.
>
> ### Correctness testing:
>
> - [x] Run tier1-3, hotspot:tier4 tests on SOPHON SG2042 (release)
> - [x] Run tier1-3 tests with -XX:+UseRVV on qemu 8.1.0 (fastdebug)
> - [x] Run all of tier1 with `-XX:+VerifySecondarySupers`
>
> ### JMH tested on Banana Pi BPI-F3 board (has Zbb) and Enable UseZbb:
> Original:
>
> Benchmark Mode Cnt Score Error Units
> SecondarySuperCacheHits.test avgt 15 11.375 ± 0.071 ns/op
> SecondarySuperCacheInterContention.test avgt 15 646.087 ± 32.587 ns/op
> SecondarySuperCacheInterContention.test:t1 avgt 15 600.090 ± 83.779 ns/op
> SecondarySuperCacheInterContention.test:t2 avgt 15 692.084 ± 73.218 ns/op
> SecondarySupersLookup.testNegative00 avgt 15 16.420 ± 0.239 ns/op
> SecondarySupersLookup.testNegative01 avgt 15 18.307 ± 0.260 ns/op
> SecondarySupersLookup.testNegative02 avgt 15 21.695 ± 0.458 ns/op
> SecondarySupersLookup.testNegative03 avgt 15 24.855 ± 0.664 ns/op
> SecondarySupersLookup.testNegative04 avgt 15 27.305 ± 0.522 ns/op
> SecondarySupersLookup.testNegative05 avgt 15 29.719 ± 0.385 ns/op
> SecondarySupersLookup.testNegative06 avgt 15 32.231 ± 0.498 ns/op
> SecondarySupersLookup.testNegative07 avgt 15 33.747 ± 0.603 ns/op
> SecondarySupersLookup.testNegative08 avgt 15 35.856 ± 0.629 ns/op
> SecondarySupersLookup.testNegative09 avgt 15 37.077 ± 0.546 ns/op
> SecondarySupersLookup.testNegative10 avgt 15 39.408 ± 0.465 ns/op
> SecondarySupersLookup.testNegative16 avgt 15 51.041 ± 0.547 ns/op
> SecondarySupersLookup.testNegative20 avgt 15 58.722 ± 0.922 ns/op
> SecondarySupersLookup.testNegative30 avgt 15 77.310 ± 0.654 ns/op
> SecondarySupersLookup.testNegative32 avgt 15 81.116 ± 0.854 ns/op
> SecondarySupersLookup.testNegative40 avgt 15 96.311 ± 0.840 ns/op
> SecondarySupersLookup.testNegative50 avgt 15 115.427 ± 0.838 ns/op
> SecondarySupersLookup.testNegative55 avgt 15 124.371 ± 1.076 ns/op
> SecondarySupersLookup.testNegative56 avgt 15 126.796 ± 0.916 ns/op
> SecondarySupersLookup.testNegative57 avgt 15 127.952 ± 1.202 ns/op
> SecondarySupersLookup.testNegative58 avgt 15 131.956 ± 4.515 ns/op
> SecondarySupersLookup.testNegative59 avgt 15 131.858 ± 1.066 ns/op
> SecondarySupersLookup.testNegative60...
Thanks for explanation. Seems this patch will also use some instructions in zba/zbs extension. Will they also impact the final performance?
Also has some minor comments first.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3484:
> 3482: }
> 3483:
> 3484: // Ensure that the inline code and the stub are using the same registers.
Can you add comment about why inline code and the stub should use the same registers?
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3485:
> 3483:
> 3484: // Ensure that the inline code and the stub are using the same registers.
> 3485: #define LOOKUP_SECONDARY_SUPERS_TABLE_REGISTERS \
Can `r_super_klass ` and so on be passed as arguments of macro?
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3530:
> 3528: // the bit is zero, we are certain that super_klass is not one of
> 3529: // the secondary supers.
> 3530: test_bit(t0, r_bitmap, bit);
it uses zbs if possible, I wonder what's the performance data when zbs is diabled or unsupported.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3549:
> 3547: assert(Array<Klass*>::length_offset_in_bytes() == 0, "Adjust this code");
> 3548:
> 3549: shadd(result, r_array_index, r_array_base, result, LogBytesPerWord);
similar here, it uses zba if possible, so good to know what's the performance data when zba not supported.
src/hotspot/cpu/riscv/riscv.ad line 10127:
> 10125: %{
> 10126: match(Set result (PartialSubtypeCheck sub (Binary super_reg super_con)));
> 10127: predicate(UseSecondarySupersTable);
suggestion: move `predicate` before `match`.
src/hotspot/cpu/riscv/vm_version_riscv.hpp line 280:
> 278: constexpr static bool supports_recursive_lightweight_locking() { return true; }
> 279:
> 280: constexpr static bool supports_secondary_supers_table() { return true; }
If it depends on zbb, should it return UseZbb here?
-------------
PR Review: https://git.openjdk.org/jdk/pull/19320#pullrequestreview-2093273846
PR Review Comment: https://git.openjdk.org/jdk/pull/19320#discussion_r1624154904
PR Review Comment: https://git.openjdk.org/jdk/pull/19320#discussion_r1624122884
PR Review Comment: https://git.openjdk.org/jdk/pull/19320#discussion_r1624212710
PR Review Comment: https://git.openjdk.org/jdk/pull/19320#discussion_r1624214685
PR Review Comment: https://git.openjdk.org/jdk/pull/19320#discussion_r1624119194
PR Review Comment: https://git.openjdk.org/jdk/pull/19320#discussion_r1624090583
More information about the hotspot-dev
mailing list