RFR: 8257882: Implement linkToNative intrinsic on AArch64 [v2]

Jorn Vernee jvernee at openjdk.java.net
Thu Dec 10 13:58:36 UTC 2020


On Thu, 10 Dec 2020 09:15:47 GMT, Nick Gasson <ngasson at openjdk.org> wrote:

>> This is more-or-less a straight port of the x86 code to AArch64.
>> GraphKit::make_native_call() calls SharedRuntime::make_native_invoker()
>> to generate a blob that jumps to the native function entry point. This
>> simply switches the thread state from Java to native and handles the
>> safepoint poll on return from native code.
>> 
>> AArch64 suffers the same problem as x86 in JDK-8251047 where R29 (frame
>> pointer) may hold a live oop over the MachCallNative node. Normally this
>> would be saved by RegisterSaver::save_live_registers() but the native
>> invoker blob is not a "proper" stub routine so doesn't have an oop map.
>> I copied the x86 solution to this where the frame pointer register is
>> saved to the stack and a pointer to that is stored in the frame anchor.
>> This is then read back and added to the register map when walking the
>> stack. I saw in the PR comments [1] that this might be a temporary fix,
>> but I'm not sure if there's any plan to change that now?
>> 
>> Another potential fix might be to change the C2 register definition of
>> R29 and R29_H to be save-on-call as below. This works for the
>> TestStackWalk.java test case but I don't know whether it has other
>> unintended side-effects.
>> 
>>   reg_def R29     ( SOC,  NS, Op_RegI, 29, r29->as_VMReg()        ); // fp
>>   reg_def R29_H   ( SOC,  NS, Op_RegI, 29, r29->as_VMReg()->next());
>> 
>> JMH results from jdk/incubator/foreign/CallOverhead.java to show it's
>> working:
>> 
>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false:
>> 
>> Benchmark                                    Mode  Cnt     Score     Error  Units
>> CallOverhead.jni_blank                       avgt   30    51.450 ?   0.363  ns/op
>> CallOverhead.jni_identity                    avgt   30    54.145 ?   0.627  ns/op
>> CallOverhead.panama_args10                   avgt   30  1914.431 ?  73.771  ns/op
>> CallOverhead.panama_args5                    avgt   30  1394.274 ?  49.369  ns/op
>> CallOverhead.panama_blank                    avgt   30   872.878 ?  20.716  ns/op
>> CallOverhead.panama_blank_trivial            avgt   30   873.852 ?  21.350  ns/op
>> CallOverhead.panama_identity                 avgt   30  1058.729 ?  20.229  ns/op
>> CallOverhead.panama_identity_memory_address  avgt   30  1041.648 ?  22.930  ns/op
>> CallOverhead.panama_identity_struct          avgt   30  3212.483 ? 151.627  ns/op
>> CallOverhead.panama_identity_trivial         avgt   30  1056.398 ?  18.779  ns/op
>> 
>> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=true:
>> 
>> Benchmark                                    Mode  Cnt     Score     Error  Units
>> CallOverhead.jni_blank                       avgt   30    51.519 ?   0.345  ns/op
>> CallOverhead.jni_identity                    avgt   30    54.689 ?   0.687  ns/op
>> CallOverhead.panama_args10                   avgt   30    42.856 ?   0.760  ns/op
>> CallOverhead.panama_args5                    avgt   30    42.192 ?   0.712  ns/op
>> CallOverhead.panama_blank                    avgt   30    41.934 ?   0.349  ns/op
>> CallOverhead.panama_blank_trivial            avgt   30     2.806 ?   0.545  ns/op
>> CallOverhead.panama_identity                 avgt   30    44.043 ?   0.398  ns/op
>> CallOverhead.panama_identity_memory_address  avgt   30    45.021 ?   0.409  ns/op
>> CallOverhead.panama_identity_struct          avgt   30  3206.829 ? 175.750  ns/op
>> CallOverhead.panama_identity_trivial         avgt   30     7.849 ?   1.651  ns/op
>> 
>> [1] https://github.com/openjdk/panama-foreign/pull/279#issuecomment-679172326
>
> Nick Gasson has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Review comments

Thanks for the amazing work!

FWIW, on x86 RBP was being passed as debug info, so the solution Vlad I proposed would be to override MachCallNativeNode::in_RegMask to not include it IIRC. I haven't had time to look into it yet though.

The situation on AArch64 seems to be different though? The RFP is not passed as debug info but as part of the normal calling convention maybe?

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3104:

> 3102:                                                const GrowableArray<VMReg>& output_registers) {
> 3103:   BufferBlob* _invoke_native_blob =
> 3104:     BufferBlob::create("nep_invoker_blob", MethodHandles::adapter_code_size);

That reminds me; this should _not_ use MethodHandles::adapter_code_size, but a separate constant, since the former is tailored specifically to method handle stubs (and is too large for this case). I still need to update the one for x86 as well (looks like I forgot to do that one before when I changed them for invoker/upcall handler). I think 1024 bytes should be more than enough, but would need to test it.

-------------

Marked as reviewed by jvernee (Committer).

PR: https://git.openjdk.java.net/jdk/pull/1711


More information about the core-libs-dev mailing list