RFR: 8257882: Implement linkToNative intrinsic on AArch64 [v3]
Andrew Haley
aph at openjdk.java.net
Fri Dec 11 09:38:01 UTC 2020
On Fri, 11 Dec 2020 07:42:19 GMT, Nick Gasson <ngasson at openjdk.org> wrote:
>> This is more-or-less a straight port of the x86 code to AArch64.
>> GraphKit::make_native_call() calls SharedRuntime::make_native_invoker()
>> to generate a blob that jumps to the native function entry point. This
>> simply switches the thread state from Java to native and handles the
>> safepoint poll on return from native code.
>>
>> AArch64 suffers the same problem as x86 in JDK-8251047 where R29 (frame
>> pointer) may hold a live oop over the MachCallNative node. Normally this
>> would be saved by RegisterSaver::save_live_registers() but the native
>> invoker blob is not a "proper" stub routine so doesn't have an oop map.
>> I copied the x86 solution to this where the frame pointer register is
>> saved to the stack and a pointer to that is stored in the frame anchor.
>> This is then read back and added to the register map when walking the
>> stack. I saw in the PR comments [1] that this might be a temporary fix,
>> but I'm not sure if there's any plan to change that now?
>>
>> Another potential fix might be to change the C2 register definition of
>> R29 and R29_H to be save-on-call as below. This works for the
>> TestStackWalk.java test case but I don't know whether it has other
>> unintended side-effects.
>>
>> reg_def R29 ( SOC, NS, Op_RegI, 29, r29->as_VMReg() ); // fp
>> reg_def R29_H ( SOC, NS, Op_RegI, 29, r29->as_VMReg()->next());
>>
>> JMH results from jdk/incubator/foreign/CallOverhead.java to show it's
>> working:
>>
>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false:
>>
>> Benchmark Mode Cnt Score Error Units
>> CallOverhead.jni_blank avgt 30 51.450 ? 0.363 ns/op
>> CallOverhead.jni_identity avgt 30 54.145 ? 0.627 ns/op
>> CallOverhead.panama_args10 avgt 30 1914.431 ? 73.771 ns/op
>> CallOverhead.panama_args5 avgt 30 1394.274 ? 49.369 ns/op
>> CallOverhead.panama_blank avgt 30 872.878 ? 20.716 ns/op
>> CallOverhead.panama_blank_trivial avgt 30 873.852 ? 21.350 ns/op
>> CallOverhead.panama_identity avgt 30 1058.729 ? 20.229 ns/op
>> CallOverhead.panama_identity_memory_address avgt 30 1041.648 ? 22.930 ns/op
>> CallOverhead.panama_identity_struct avgt 30 3212.483 ? 151.627 ns/op
>> CallOverhead.panama_identity_trivial avgt 30 1056.398 ? 18.779 ns/op
>>
>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=true:
>>
>> Benchmark Mode Cnt Score Error Units
>> CallOverhead.jni_blank avgt 30 51.519 ? 0.345 ns/op
>> CallOverhead.jni_identity avgt 30 54.689 ? 0.687 ns/op
>> CallOverhead.panama_args10 avgt 30 42.856 ? 0.760 ns/op
>> CallOverhead.panama_args5 avgt 30 42.192 ? 0.712 ns/op
>> CallOverhead.panama_blank avgt 30 41.934 ? 0.349 ns/op
>> CallOverhead.panama_blank_trivial avgt 30 2.806 ? 0.545 ns/op
>> CallOverhead.panama_identity avgt 30 44.043 ? 0.398 ns/op
>> CallOverhead.panama_identity_memory_address avgt 30 45.021 ? 0.409 ns/op
>> CallOverhead.panama_identity_struct avgt 30 3206.829 ? 175.750 ns/op
>> CallOverhead.panama_identity_trivial avgt 30 7.849 ? 1.651 ns/op
>>
>> [1] https://github.com/openjdk/panama-foreign/pull/279#issuecomment-679172326
>
> Nick Gasson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>
> - Separate RegSet and FloatRegSet
> - Merge branch 'master' into 8257882
> - Review comments
> - 8257882: Implement linkToNative intrinsic on AArch64
>
> This is more-or-less a straight port of the x86 code to AArch64.
> GraphKit::make_native_call() calls SharedRuntime::make_native_invoker()
> to generate a blob that jumps to the native function entry point. This
> simply switches the thread state from Java to native and handles the
> safepoint poll on return from native code.
>
> AArch64 suffers the same problem as x86 in JDK-8251047 where R29 (frame
> pointer) may hold a live oop over the MachCallNative node. Normally this
> would be saved by RegisterSaver::save_live_registers() but the native
> invoker blob is not a "proper" stub routine so doesn't have an oop map.
> I copied the x86 solution to this where the frame pointer register is
> saved to the stack and a pointer to that is stored in the frame anchor.
> This is then read back and added to the register map when walking the
> stack. I saw in the PR comments [1] that this might be a temporary fix,
> but I'm not sure if there's any plan to change that now?
>
> Another potential fix might be to change the C2 register definition of
> R29 and R29_H to be save-on-call as below. This works for the
> TestStackWalk.java test case but I don't know whether it has other
> unintended side-effects.
>
> reg_def R29 ( SOC, NS, Op_RegI, 29, r29->as_VMReg() ); // fp
> reg_def R29_H ( SOC, NS, Op_RegI, 29, r29->as_VMReg()->next());
>
> JMH results from jdk/incubator/foreign/CallOverhead.java to show it's
> working:
>
> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false:
>
> Benchmark Mode Cnt Score Error Units
> CallOverhead.jni_blank avgt 30 51.450 ? 0.363 ns/op
> CallOverhead.jni_identity avgt 30 54.145 ? 0.627 ns/op
> CallOverhead.panama_args10 avgt 30 1914.431 ? 73.771 ns/op
> CallOverhead.panama_args5 avgt 30 1394.274 ? 49.369 ns/op
> CallOverhead.panama_blank avgt 30 872.878 ? 20.716 ns/op
> CallOverhead.panama_blank_trivial avgt 30 873.852 ? 21.350 ns/op
> CallOverhead.panama_identity avgt 30 1058.729 ? 20.229 ns/op
> CallOverhead.panama_identity_memory_address avgt 30 1041.648 ? 22.930 ns/op
> CallOverhead.panama_identity_struct avgt 30 3212.483 ? 151.627 ns/op
> CallOverhead.panama_identity_trivial avgt 30 1056.398 ? 18.779 ns/op
>
> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=true:
>
> Benchmark Mode Cnt Score Error Units
> CallOverhead.jni_blank avgt 30 51.519 ? 0.345 ns/op
> CallOverhead.jni_identity avgt 30 54.689 ? 0.687 ns/op
> CallOverhead.panama_args10 avgt 30 42.856 ? 0.760 ns/op
> CallOverhead.panama_args5 avgt 30 42.192 ? 0.712 ns/op
> CallOverhead.panama_blank avgt 30 41.934 ? 0.349 ns/op
> CallOverhead.panama_blank_trivial avgt 30 2.806 ? 0.545 ns/op
> CallOverhead.panama_identity avgt 30 44.043 ? 0.398 ns/op
> CallOverhead.panama_identity_memory_address avgt 30 45.021 ? 0.409 ns/op
> CallOverhead.panama_identity_struct avgt 30 3206.829 ? 175.750 ns/op
> CallOverhead.panama_identity_trivial avgt 30 7.849 ? 1.651 ns/op
>
> [1] https://github.com/openjdk/panama-foreign/pull/279#issuecomment-679172326
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3140:
> 3138: // State transition
> 3139: __ mov(rscratch1, _thread_in_native);
> 3140: __ strw(rscratch1, Address(rthread, JavaThread::thread_state_offset()));
I think this should be a releasing store.
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3159:
> 3157:
> 3158: __ safepoint_poll(L_safepoint_poll_slow_path, rthread, true /* at_return */, false /* in_nmethod */);
> 3159:
This looks wrong: please look at the definition of MacroAssembler::safepoint_poll, which has an acquire flag.
-------------
PR: https://git.openjdk.java.net/jdk/pull/1711
More information about the core-libs-dev
mailing list