RFR: 8332587: RISC-V: secondary_super_cache does not scale well [v3]
Gui Cao
gcao at openjdk.org
Fri Jun 7 09:49:18 UTC 2024
On Tue, 4 Jun 2024 08:55:24 GMT, Hamlin Li <mli at openjdk.org> wrote:
> There are a bit regression in cases of testNegative63/64, although these might be rare cases or not very common cases, but it's worth to have a try to improve it if possible. I guess it's related to the implementation for the cases when bitmap is full. When it's full, before go to `repne_scan`, there're some instructions to execute. I wonder if it will help to have another "bitmap full test" just after "bitmap false test" (which is `test_bit(t0, r_bitmap, bit);`). But I'm not sure if it's feasible, maybe worth a try.
Hi, Sorry for being late. Thanks for the suggestion, I gave it a try.
``` diff
diff --git a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
index 61e8016db4a..af1649c061f 100644
--- a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
@@ -3664,6 +3664,32 @@ bool MacroAssembler::lookup_secondary_supers_table(Register r_sub_klass,
test_bit(t0, r_bitmap, bit);
beqz(t0, L_fallthrough);
+ // We will consult the secondary-super array.
+ ld(r_array_base, Address(r_sub_klass, in_bytes(Klass::secondary_supers_offset())));
+
+ mv(tmp3, r_bitmap);
+ if (bit != 0) {
+ ror_imm(tmp3, tmp3, bit);
+ }
+
+ Label skip;
+ addi(t0, tmp3, (u1)1);
+ bnez(t0, skip);
+
+ // Load the array length.
+ lwu(r_array_length, Address(r_array_base, Array<Klass*>::length_offset_in_bytes()));
+ // And adjust the array base to point to the data.
+ // NB! Effectively increments current slot index by 1.
+ assert(Array<Klass*>::base_offset_in_bytes() == wordSize, "");
+ addi(r_array_base, r_array_base, Array<Klass*>::base_offset_in_bytes());
+
+ repne_scan(r_array_base, r_super_klass, r_array_length, t0);
+ bne(r_super_klass, t0, L_fallthrough);
+ mv(result, zr);
+ beqz(result, L_fallthrough);
+
+ bind(skip);
+
// Get the first array index that can contain super_klass into r_array_index.
if (bit != 0) {
slli(r_array_index, r_bitmap, (Klass::SECONDARY_SUPERS_TABLE_MASK - bit));
@@ -3672,9 +3698,6 @@ bool MacroAssembler::lookup_secondary_supers_table(Register r_sub_klass,
mv(r_array_index, (u1)1);
}
- // We will consult the secondary-super array.
- ld(r_array_base, Address(r_sub_klass, in_bytes(Klass::secondary_supers_offset())));
-
// The value i in r_array_index is >= 1, so even though r_array_base
// points to the length, we don't need to adjust it to point to the data.
assert(Array<Klass*>::base_offset_in_bytes() == wordSize, "Adjust this code");
#### JMH tested on Banana Pi BPI-F3 board (has Zbb) and Enable UseZbb, Not Enable UseZba, UseZbs
Benchmark Mode Cnt Score Error Units
SecondarySupersLookup.testNegative00 avgt 15 13.262 ± 0.197 ns/op
SecondarySupersLookup.testNegative01 avgt 15 13.273 ± 0.222 ns/op
SecondarySupersLookup.testNegative02 avgt 15 13.264 ± 0.199 ns/op
SecondarySupersLookup.testNegative03 avgt 15 13.275 ± 0.222 ns/op
SecondarySupersLookup.testNegative04 avgt 15 13.264 ± 0.198 ns/op
SecondarySupersLookup.testNegative05 avgt 15 13.259 ± 0.192 ns/op
SecondarySupersLookup.testNegative06 avgt 15 13.260 ± 0.195 ns/op
SecondarySupersLookup.testNegative07 avgt 15 13.275 ± 0.221 ns/op
SecondarySupersLookup.testNegative08 avgt 15 13.261 ± 0.196 ns/op
SecondarySupersLookup.testNegative09 avgt 15 13.267 ± 0.201 ns/op
SecondarySupersLookup.testNegative10 avgt 15 13.272 ± 0.211 ns/op
SecondarySupersLookup.testNegative16 avgt 15 13.271 ± 0.200 ns/op
SecondarySupersLookup.testNegative20 avgt 15 13.271 ± 0.210 ns/op
SecondarySupersLookup.testNegative30 avgt 15 13.277 ± 0.219 ns/op
SecondarySupersLookup.testNegative32 avgt 15 13.280 ± 0.224 ns/op
SecondarySupersLookup.testNegative40 avgt 15 13.285 ± 0.232 ns/op
SecondarySupersLookup.testNegative50 avgt 15 13.288 ± 0.237 ns/op
SecondarySupersLookup.testNegative55 avgt 15 54.940 ± 0.771 ns/op
SecondarySupersLookup.testNegative56 avgt 15 54.934 ± 0.798 ns/op
SecondarySupersLookup.testNegative57 avgt 15 54.909 ± 0.766 ns/op
SecondarySupersLookup.testNegative58 avgt 15 54.679 ± 0.830 ns/op
SecondarySupersLookup.testNegative59 avgt 15 54.941 ± 0.819 ns/op
SecondarySupersLookup.testNegative60 avgt 15 76.957 ± 0.945 ns/op
SecondarySupersLookup.testNegative61 avgt 15 76.956 ± 1.007 ns/op
SecondarySupersLookup.testNegative62 avgt 15 76.938 ± 0.976 ns/op
SecondarySupersLookup.testNegative63 avgt 15 138.371 ± 1.192 ns/op
SecondarySupersLookup.testNegative64 avgt 15 140.137 ± 1.077 ns/op
SecondarySupersLookup.testPositive01 avgt 15 10.734 ± 0.149 ns/op
SecondarySupersLookup.testPositive02 avgt 15 10.730 ± 0.150 ns/op
SecondarySupersLookup.testPositive03 avgt 15 10.727 ± 0.146 ns/op
SecondarySupersLookup.testPositive04 avgt 15 10.735 ± 0.157 ns/op
SecondarySupersLookup.testPositive05 avgt 15 10.730 ± 0.149 ns/op
SecondarySupersLookup.testPositive06 avgt 15 10.735 ± 0.155 ns/op
SecondarySupersLookup.testPositive07 avgt 15 10.730 ± 0.149 ns/op
SecondarySupersLookup.testPositive08 avgt 15 10.730 ± 0.149 ns/op
SecondarySupersLookup.testPositive09 avgt 15 10.731 ± 0.151 ns/op
SecondarySupersLookup.testPositive10 avgt 15 10.728 ± 0.148 ns/op
SecondarySupersLookup.testPositive16 avgt 15 10.733 ± 0.151 ns/op
SecondarySupersLookup.testPositive20 avgt 15 10.728 ± 0.146 ns/op
SecondarySupersLookup.testPositive30 avgt 15 10.737 ± 0.162 ns/op
SecondarySupersLookup.testPositive32 avgt 15 10.733 ± 0.154 ns/op
SecondarySupersLookup.testPositive40 avgt 15 10.729 ± 0.148 ns/op
SecondarySupersLookup.testPositive50 avgt 15 10.732 ± 0.151 ns/op
SecondarySupersLookup.testPositive60 avgt 15 10.729 ± 0.148 ns/op
SecondarySupersLookup.testPositive63 avgt 15 10.730 ± 0.150 ns/op
SecondarySupersLookup.testPositive64 avgt 15 10.735 ± 0.158 ns/op
Finished running test 'micro:vm.lang.SecondarySupersLookup'
After the fix it got better with the cases of testNegative63/64. As said above, this case is very rare, Is there a need for this additional complexity?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19320#issuecomment-2154485874
More information about the hotspot-dev
mailing list