RFR: 8344232: [PPC64] secondary_super_cache does not scale well: C1 and interpreter
Richard Reingruber
rrich at openjdk.org
Wed Jan 22 17:46:49 UTC 2025
On Wed, 22 Jan 2025 12:21:54 GMT, Richard Reingruber <rrich at openjdk.org> wrote:
>> PPC64 implementation of https://github.com/openjdk/jdk/commit/ead0116f2624e0e34529e47e4f509142d588b994. I have implemented a couple of rotate instructions.
>> The first commit only implements `lookup_secondary_supers_table_var` and uses it in C2. The second commit makes the changes to use it in the interpreter, runtime and C1.
>> The C1 part is refactored so that the same code as before this patch is generated when `UseSecondarySupersTable` is disabled. Some stubs are modified to provide one more temp register.
>>
>> Performance difference can be observed when C2 is disabled (measured on Power10):
>>
>>
>> -XX:TieredStopAtLevel=1 -XX:-UseSecondarySupersTable:
>> SecondarySuperCacheHits.test                 avgt   15   13.028 ±  0.005  ns/op
>> SecondarySuperCacheInterContention.test      avgt   15  417.746 ± 19.046  ns/op
>> SecondarySuperCacheInterContention.test:t1   avgt   15  417.852 ± 17.814  ns/op
>> SecondarySuperCacheInterContention.test:t2   avgt   15  417.641 ± 23.431  ns/op
>> SecondarySuperCacheIntraContention.test      avgt   15  340.995 ±  5.620  ns/op
>>
>>
>>
>> -XX:TieredStopAtLevel=1 -XX:+UseSecondarySupersTable:
>> SecondarySuperCacheHits.test                 avgt   15   14.539 ±  0.002  ns/op
>> SecondarySuperCacheInterContention.test      avgt   15   25.667 ±  0.576  ns/op
>> SecondarySuperCacheInterContention.test:t1   avgt   15   25.709 ±  0.655  ns/op
>> SecondarySuperCacheInterContention.test:t2   avgt   15   25.626 ±  0.820  ns/op
>> SecondarySuperCacheIntraContention.test      avgt   15   22.466 ±  1.554  ns/op
>>
>>
>> `SecondarySuperCacheHits` seems to be slightly slower, but `SecondarySuperCacheInterContention` and `SecondarySuperCacheIntraContention` are much faster (when C2 is disabled).
>
> src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2157:
>
>> 2155:
>> 2156: bind(fallthru);
>> 2157: if (L_success != nullptr && result_reg == noreg) {
>
> Is there a problem if `L_success == nullptr && result_reg == noreg` and there aren't any secondary supers?
> In that case we would reach this point with CR0.eq from L2134 and fall through with CR0.eq still set. Because of the change in `C1StubId::slow_subtype_check_id`, we would then return from the stub with CR0.eq.
This is a reproducer:
public class InstanceOfTest {

    public static interface TestInterfaceI {
    }

    public static class TestClassNegative {
    }

    public static void main(String[] args) {
        Object obj = new TestClassNegative();
        for (int i = 100_000; i > 0; i--) {
            dontinline_testMethod(obj);
        }
        boolean result = dontinline_testMethod(obj);
        System.out.println("result: " + result);
    }

    static boolean dontinline_testMethod(Object obj) {
        return obj instanceof TestInterfaceI;
    }
}
./jdk/bin/java -XX:TieredStopAtLevel=1 -XX:-UseSecondarySupersTable InstanceOfTest
result: true
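
A self-checking variant of the reproducer may be handy for verifying a fix. The sketch below is not part of the patch: the class name `InstanceOfCheck`, the helper `warmUpAndTest` and the added `TestClassPositive` are hypothetical names introduced here, and it is only an assumption that the warm-up loop drives the `instanceof` check through the same C1 slow path as the original reproducer. It covers both the negative case above and a positive case, and fails loudly if either result is wrong.

```java
public class InstanceOfCheck {

    public static interface TestInterfaceI {
    }

    // Does not implement TestInterfaceI, so instanceof must be false.
    public static class TestClassNegative {
    }

    // Hypothetical positive counterpart: implements TestInterfaceI, so instanceof must be true.
    public static class TestClassPositive implements TestInterfaceI {
    }

    static boolean dontinline_testMethod(Object obj) {
        return obj instanceof TestInterfaceI;
    }

    // Calls the test method often enough that it is likely to be JIT-compiled,
    // then returns the result of one more call.
    public static boolean warmUpAndTest(Object obj) {
        for (int i = 100_000; i > 0; i--) {
            dontinline_testMethod(obj);
        }
        return dontinline_testMethod(obj);
    }

    public static void main(String[] args) {
        boolean negative = warmUpAndTest(new TestClassNegative());
        boolean positive = warmUpAndTest(new TestClassPositive());
        System.out.println("negative: " + negative + ", positive: " + positive);
        if (negative || !positive) {
            throw new AssertionError("unexpected instanceof result");
        }
    }
}
```

It can be run with the same flags as above, e.g. `./jdk/bin/java -XX:TieredStopAtLevel=1 -XX:-UseSecondarySupersTable InstanceOfCheck`.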
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22881#discussion_r1925725737