RFR: 8257882: Implement linkToNative intrinsic on AArch64 [v3]

Andrew Haley aph at openjdk.java.net
Fri Dec 11 09:38:01 UTC 2020


On Fri, 11 Dec 2020 07:42:19 GMT, Nick Gasson <ngasson at openjdk.org> wrote:

>> This is more-or-less a straight port of the x86 code to AArch64.
>> GraphKit::make_native_call() calls SharedRuntime::make_native_invoker()
>> to generate a blob that jumps to the native function entry point. This
>> simply switches the thread state from Java to native and handles the
>> safepoint poll on return from native code.
>> 
>> AArch64 suffers the same problem as x86 in JDK-8251047 where R29 (frame
>> pointer) may hold a live oop over the MachCallNative node. Normally this
>> would be saved by RegisterSaver::save_live_registers() but the native
>> invoker blob is not a "proper" stub routine so doesn't have an oop map.
>> I copied the x86 solution to this where the frame pointer register is
>> saved to the stack and a pointer to that is stored in the frame anchor.
>> This is then read back and added to the register map when walking the
>> stack. I saw in the PR comments [1] that this might be a temporary fix,
>> but I'm not sure if there's any plan to change that now?
>> 
>> Another potential fix might be to change the C2 register definition of
>> R29 and R29_H to be save-on-call as below. This works for the
>> TestStackWalk.java test case but I don't know whether it has other
>> unintended side-effects.
>> 
>>   reg_def R29     ( SOC,  NS, Op_RegI, 29, r29->as_VMReg()        ); // fp
>>   reg_def R29_H   ( SOC,  NS, Op_RegI, 29, r29->as_VMReg()->next());
>> 
>> JMH results from jdk/incubator/foreign/CallOverhead.java to show it's
>> working:
>> 
>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false:
>> 
>> Benchmark                                    Mode  Cnt     Score     Error  Units
>> CallOverhead.jni_blank                       avgt   30    51.450 ?   0.363  ns/op
>> CallOverhead.jni_identity                    avgt   30    54.145 ?   0.627  ns/op
>> CallOverhead.panama_args10                   avgt   30  1914.431 ?  73.771  ns/op
>> CallOverhead.panama_args5                    avgt   30  1394.274 ?  49.369  ns/op
>> CallOverhead.panama_blank                    avgt   30   872.878 ?  20.716  ns/op
>> CallOverhead.panama_blank_trivial            avgt   30   873.852 ?  21.350  ns/op
>> CallOverhead.panama_identity                 avgt   30  1058.729 ?  20.229  ns/op
>> CallOverhead.panama_identity_memory_address  avgt   30  1041.648 ?  22.930  ns/op
>> CallOverhead.panama_identity_struct          avgt   30  3212.483 ? 151.627  ns/op
>> CallOverhead.panama_identity_trivial         avgt   30  1056.398 ?  18.779  ns/op
>> 
>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=true:
>> 
>> Benchmark                                    Mode  Cnt     Score     Error  Units
>> CallOverhead.jni_blank                       avgt   30    51.519 ?   0.345  ns/op
>> CallOverhead.jni_identity                    avgt   30    54.689 ?   0.687  ns/op
>> CallOverhead.panama_args10                   avgt   30    42.856 ?   0.760  ns/op
>> CallOverhead.panama_args5                    avgt   30    42.192 ?   0.712  ns/op
>> CallOverhead.panama_blank                    avgt   30    41.934 ?   0.349  ns/op
>> CallOverhead.panama_blank_trivial            avgt   30     2.806 ?   0.545  ns/op
>> CallOverhead.panama_identity                 avgt   30    44.043 ?   0.398  ns/op
>> CallOverhead.panama_identity_memory_address  avgt   30    45.021 ?   0.409  ns/op
>> CallOverhead.panama_identity_struct          avgt   30  3206.829 ? 175.750  ns/op
>> CallOverhead.panama_identity_trivial         avgt   30     7.849 ?   1.651  ns/op
>> 
>> [1] https://github.com/openjdk/panama-foreign/pull/279#issuecomment-679172326
>
> Nick Gasson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
> 
>  - Separate RegSet and FloatRegSet
>  - Merge branch 'master' into 8257882
>  - Review comments
>  - 8257882: Implement linkToNative intrinsic on AArch64
>    
>    This is more-or-less a straight port of the x86 code to AArch64.
>    GraphKit::make_native_call() calls SharedRuntime::make_native_invoker()
>    to generate a blob that jumps to the native function entry point. This
>    simply switches the thread state from Java to native and handles the
>    safepoint poll on return from native code.
>    
>    AArch64 suffers the same problem as x86 in JDK-8251047 where R29 (frame
>    pointer) may hold a live oop over the MachCallNative node. Normally this
>    would be saved by RegisterSaver::save_live_registers() but the native
>    invoker blob is not a "proper" stub routine so doesn't have an oop map.
>    I copied the x86 solution to this where the frame pointer register is
>    saved to the stack and a pointer to that is stored in the frame anchor.
>    This is then read back and added to the register map when walking the
>    stack. I saw in the PR comments [1] that this might be a temporary fix,
>    but I'm not sure if there's any plan to change that now?
>    
>    Another potential fix might be to change the C2 register definition of
>    R29 and R29_H to be save-on-call as below. This works for the
>    TestStackWalk.java test case but I don't know whether it has other
>    unintended side-effects.
>    
>      reg_def R29     ( SOC,  NS, Op_RegI, 29, r29->as_VMReg()        ); // fp
>      reg_def R29_H   ( SOC,  NS, Op_RegI, 29, r29->as_VMReg()->next());
>    
>    JMH results from jdk/incubator/foreign/CallOverhead.java to show it's
>    working:
>    
>    -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false:
>    
>    Benchmark                                    Mode  Cnt     Score     Error  Units
>    CallOverhead.jni_blank                       avgt   30    51.450 ?   0.363  ns/op
>    CallOverhead.jni_identity                    avgt   30    54.145 ?   0.627  ns/op
>    CallOverhead.panama_args10                   avgt   30  1914.431 ?  73.771  ns/op
>    CallOverhead.panama_args5                    avgt   30  1394.274 ?  49.369  ns/op
>    CallOverhead.panama_blank                    avgt   30   872.878 ?  20.716  ns/op
>    CallOverhead.panama_blank_trivial            avgt   30   873.852 ?  21.350  ns/op
>    CallOverhead.panama_identity                 avgt   30  1058.729 ?  20.229  ns/op
>    CallOverhead.panama_identity_memory_address  avgt   30  1041.648 ?  22.930  ns/op
>    CallOverhead.panama_identity_struct          avgt   30  3212.483 ? 151.627  ns/op
>    CallOverhead.panama_identity_trivial         avgt   30  1056.398 ?  18.779  ns/op
>    
>    -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=true:
>    
>    Benchmark                                    Mode  Cnt     Score     Error  Units
>    CallOverhead.jni_blank                       avgt   30    51.519 ?   0.345  ns/op
>    CallOverhead.jni_identity                    avgt   30    54.689 ?   0.687  ns/op
>    CallOverhead.panama_args10                   avgt   30    42.856 ?   0.760  ns/op
>    CallOverhead.panama_args5                    avgt   30    42.192 ?   0.712  ns/op
>    CallOverhead.panama_blank                    avgt   30    41.934 ?   0.349  ns/op
>    CallOverhead.panama_blank_trivial            avgt   30     2.806 ?   0.545  ns/op
>    CallOverhead.panama_identity                 avgt   30    44.043 ?   0.398  ns/op
>    CallOverhead.panama_identity_memory_address  avgt   30    45.021 ?   0.409  ns/op
>    CallOverhead.panama_identity_struct          avgt   30  3206.829 ? 175.750  ns/op
>    CallOverhead.panama_identity_trivial         avgt   30     7.849 ?   1.651  ns/op
>    
>    [1] https://github.com/openjdk/panama-foreign/pull/279#issuecomment-679172326

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3140:

> 3138:   // State transition
> 3139:   __ mov(rscratch1, _thread_in_native);
> 3140:   __ strw(rscratch1, Address(rthread, JavaThread::thread_state_offset()));

I think this should be a releasing store.

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3159:

> 3157: 
> 3158:   __ safepoint_poll(L_safepoint_poll_slow_path, rthread, true /* at_return */, false /* in_nmethod */);
> 3159: 

This looks wrong: please look at the definition of MacroAssembler::safepoint_poll, which has an acquire flag.

-------------

PR: https://git.openjdk.java.net/jdk/pull/1711


More information about the hotspot-dev mailing list