RFR: 8349727: [PPC] C1: Improve Class.isInstance intrinsic [v3]

Richard Reingruber rrich at openjdk.org
Wed Feb 19 15:26:53 UTC 2025


On Wed, 19 Feb 2025 13:25:09 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>> PPC64 implementation of [JDK-8337251](https://bugs.openjdk.org/browse/JDK-8337251).
>> The new runtime stub is called like a C function. The initial version therefore used a `FunctionDescriptor` with relocation on PPC64 with ABIv1. I've changed that in the third commit: `rt_call` now jumps directly to the entry point.
>> 
>> Performance measured on Power10: `make run-test TEST="micro:SecondarySupersLookup" MICRO="VM_OPTIONS=-XX:TieredStopAtLevel=1"`
>> 
>> Before this patch (C code)
>> 
>> Benchmark                             Mode  Cnt   Score   Error  Units
>> SecondarySupersLookup.testNegative00  avgt   15  18.570 ± 0.009  ns/op
>> ...
>> SecondarySupersLookup.testNegative30  avgt   15  18.566 ± 0.002  ns/op
>> SecondarySupersLookup.testNegative32  avgt   15  19.177 ± 1.347  ns/op
>> SecondarySupersLookup.testNegative40  avgt   15  18.569 ± 0.006  ns/op
>> SecondarySupersLookup.testNegative50  avgt   15  19.207 ± 1.334  ns/op
>> SecondarySupersLookup.testNegative55  avgt   15  19.708 ± 1.338  ns/op
>> SecondarySupersLookup.testNegative56  avgt   15  19.132 ± 0.137  ns/op
>> SecondarySupersLookup.testNegative57  avgt   15  19.133 ± 0.134  ns/op
>> SecondarySupersLookup.testNegative58  avgt   15  19.772 ± 1.316  ns/op
>> SecondarySupersLookup.testNegative59  avgt   15  19.109 ± 0.014  ns/op
>> SecondarySupersLookup.testNegative60  avgt   15  22.381 ± 0.016  ns/op
>> SecondarySupersLookup.testNegative61  avgt   15  22.331 ± 0.011  ns/op
>> SecondarySupersLookup.testNegative62  avgt   15  22.352 ± 0.029  ns/op
>> SecondarySupersLookup.testNegative63  avgt   15  30.371 ± 0.031  ns/op
>> SecondarySupersLookup.testNegative64  avgt   15  29.927 ± 0.221  ns/op
>> SecondarySupersLookup.testPositive01  avgt   15  18.571 ± 0.006  ns/op
>> ...
>> SecondarySupersLookup.testPositive09  avgt   15  18.599 ± 0.140  ns/op
>> SecondarySupersLookup.testPositive10  avgt   15  19.210 ± 1.332  ns/op
>> SecondarySupersLookup.testPositive16  avgt   15  18.603 ± 0.142  ns/op
>> SecondarySupersLookup.testPositive20  avgt   15  19.210 ± 1.333  ns/op
>> SecondarySupersLookup.testPositive30  avgt   15  18.600 ± 0.140  ns/op
>> SecondarySupersLookup.testPositive32  avgt   15  18.637 ± 0.189  ns/op
>> SecondarySupersLookup.testPositive40  avgt   15  19.137 ± 0.190  ns/op
>> SecondarySupersLookup.testPositive50  avgt   15  18.567 ± 0.002  ns/op
>> SecondarySupersLookup.testPositive60  avgt   15  19.069 ± 0.004  ns/op
>> SecondarySupersLookup.testPositive63  avgt   15  26.024 ± 0.017  ns/op
>> SecondarySupersLookup.tes...
>
> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Unroll repne_scan loop for better performance.

src/hotspot/cpu/ppc/c1_LIRAssembler_ppc.cpp line 2845:

> 2843:   if (dest == Runtime1::entry_for(C1StubId::register_finalizer_id) ||
> 2844:       dest == Runtime1::entry_for(C1StubId::new_multi_array_id   ) ||
> 2845:       dest == Runtime1::entry_for(C1StubId::is_instance_of_id    )) {

Should there be an assertion that `dest` is in the CodeCache?
Or should that check even be used as the condition for emitting the optimized call?
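
To make the two alternatives concrete, here is a minimal standalone sketch. It is not HotSpot code: `MockCodeCache`, `CallKind`, `select_by_list` and `select_by_range` are hypothetical names invented for illustration, and the code cache is modelled as a single address range. In the real patch the check would presumably go through the VM's own code cache query instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VM's code cache: one contiguous address range.
struct MockCodeCache {
  std::uintptr_t low;   // inclusive
  std::uintptr_t high;  // exclusive
  bool contains(std::uintptr_t addr) const { return addr >= low && addr < high; }
};

// Hypothetical result type: which call sequence would be emitted.
enum class CallKind { Direct, ViaDescriptor };

// Alternative 1: keep the explicit stub list and assert that every listed
// destination really lives in the code cache.
inline CallKind select_by_list(const MockCodeCache& cc, std::uintptr_t dest,
                               const std::uintptr_t* stubs, int n) {
  for (int i = 0; i < n; i++) {
    if (dest == stubs[i]) {
      assert(cc.contains(dest) && "listed stub must be in the code cache");
      return CallKind::Direct;
    }
  }
  return CallKind::ViaDescriptor;
}

// Alternative 2: drop the list and use the containment check itself as the
// condition for the direct call.
inline CallKind select_by_range(const MockCodeCache& cc, std::uintptr_t dest) {
  return cc.contains(dest) ? CallKind::Direct : CallKind::ViaDescriptor;
}
```

With the second variant, a stub added later would get the direct call without having to extend the list, as long as its entry point is inside the code cache.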

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23602#discussion_r1961888058

