RFR: 8318157: RISC-V: implement ensureMaterializedForStackWalk intrinsic
Olga Mikhaltsova
omikhaltcova at openjdk.org
Wed Nov 29 00:00:05 UTC 2023
On Tue, 28 Nov 2023 14:02:49 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> Please, review addition of the ensureMaterializedForStackWalk C2 intrinsic for RISC-V.
>> It's implemented by analogy to AArch64 and x86_64 [1].
>>
>> The benchmark shows the following performance improvement on the T-Head RVB-ICE board:
>> **Before**
>>
>> Benchmark Mode Cnt Score Error Units
>> ScopedValues.CreateBindThenGetThenRemove_ScopedValue avgt 10 873.072 12.051 ns/op
>> ScopedValues.bindThenGetNoRemove_ThreadLocal avgt 10 55.507 18.006 ns/op
>> ScopedValues.bindThenGetThenRemove_ScopedValue avgt 10 758.185 14.585 ns/op
>> ScopedValues.bindThenGetThenRemove_ThreadLocal avgt 10 922.742 9.597 ns/op
>> ScopedValues.bindViaGet_ScopedValue avgt 10 603.164 13.694 ns/op
>> ScopedValues.bind_ScopedValue avgt 10 557.377 9.591 ns/op
>> ScopedValues.bind_ThreadLocal avgt 10 854.636 17.014 ns/op
>> ScopedValues.counter_ScopedValue avgt 10 35.367 0.421 ns/op
>> ScopedValues.counter_ThreadLocal avgt 10 45.345 0.283 ns/op
>> ScopedValues.setNoRemove_ScopedValue avgt 10 39.838 0.432 ns/op
>> ScopedValues.setNoRemove_ThreadLocal avgt 10 44.769 0.404 ns/op
>> ScopedValues.sixValues_ScopedValue avgt 10 2.566 0.006 us/op
>> ScopedValues.sixValues_ThreadLocal avgt 10 9.067 0.020 us/op
>> ScopedValues.thousandAdds_ScopedValue avgt 10 0.158 0.004 us/op
>> ScopedValues.thousandAdds_ThreadLocal avgt 10 9.662 0.019 us/op
>> ScopedValues.thousandIsBoundQueries avgt 10 31.203 0.451 ns/op
>> ScopedValues.thousandMaybeGets avgt 10 156.356 4.469 ns/op
>>
>> **After**
>>
>> Benchmark Mode Cnt Score Error Units
>> ScopedValues.CreateBindThenGetThenRemove_ScopedValue avgt 10 367.616 4.382 ns/op
>> ScopedValues.bindThenGetNoRemove_ThreadLocal avgt 10 55.935 18.048 ns/op
>> ScopedValues.bindThenGetThenRemove_ScopedValue avgt 10 371.733 2.835 ns/op
>> ScopedValues.bindThenGetThenRemove_ThreadLocal avgt 10 902.112 12.709 ns/op
>> ScopedValues.bindViaGet_ScopedValue avgt 10 158.404 2.920 ns/op
>> ScopedValues.bind_ScopedValue avgt 10...
>
> Hi, thanks for finding this. Looks OK. Any correctness test performed?
@RealFYang, yes, tier1 tests passed.
The only thing, I'm hesitating about, is the size of __ nop(). I assume that it'll be 4 bytes because, for example, I see in code related to deoptimization the following:
[src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp#L1290](https://github.com/openjdk/jdk/blob/a40d8d97e84d88d1a65aba81bfc09339be95e427/src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp#L1290)
// First instruction must be a nop as it may need to be patched on deoptimisation
{
Assembler::IncompressibleRegion ir(masm); // keep the nop as 4 bytes for patching.
MacroAssembler::assert_alignment(__ pc());
__ nop(); // 4 bytes
}
But from the other hand, according to the comments, in case of RVC __ nop() shall be 2 bytes:
[src/hotspot/cpu/riscv/riscv.ad#L1292](https://github.com/openjdk/jdk/blob/98f7d263915e20a78f1ab53016079e3f15988aa3/src/hotspot/cpu/riscv/riscv.ad#L1292)
Assembler::CompressibleRegion cr(&_masm); // nops shall be 2-byte under RVC for alignment purposes.
for (int i = 0; i < _count; i++) {
__ nop();
}
Is there possibility of error?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16808#issuecomment-1830964623
More information about the hotspot-compiler-dev
mailing list