RFR: 8257882: Implement linkToNative intrinsic on AArch64
Andrew Haley
aph at openjdk.java.net
Wed Dec 9 10:52:37 UTC 2020
On Wed, 9 Dec 2020 08:20:38 GMT, Nick Gasson <ngasson at openjdk.org> wrote:
> This is more-or-less a straight port of the x86 code to AArch64.
> GraphKit::make_native_call() calls SharedRuntime::make_native_invoker()
> to generate a blob that jumps to the native function entry point. This
> simply switches the thread state from Java to native and handles the
> safepoint poll on return from native code.
>
> AArch64 suffers the same problem as x86 in JDK-8251047 where R29 (frame
> pointer) may hold a live oop over the MachCallNative node. Normally this
> would be saved by RegisterSaver::save_live_registers() but the native
> invoker blob is not a "proper" stub routine so doesn't have an oop map.
> I copied the x86 solution to this where the frame pointer register is
> saved to the stack and a pointer to that is stored in the frame anchor.
> This is then read back and added to the register map when walking the
> stack. I saw in the PR comments [1] that this might be a temporary fix,
> but I'm not sure if there's any plan to change that now?
>
> Another potential fix might be to change the C2 register definition of
> R29 and R29_H to be save-on-call as below. This works for the
> TestStackWalk.java test case but I don't know whether it has other
> unintended side-effects.
>
> reg_def R29 ( SOC, NS, Op_RegI, 29, r29->as_VMReg() ); // fp
> reg_def R29_H ( SOC, NS, Op_RegI, 29, r29->as_VMReg()->next());
>
> JMH results from jdk/incubator/foreign/CallOverhead.java to show it's
> working:
>
> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=false:
>
> Benchmark Mode Cnt Score Error Units
> CallOverhead.jni_blank avgt 30 51.450 ? 0.363 ns/op
> CallOverhead.jni_identity avgt 30 54.145 ? 0.627 ns/op
> CallOverhead.panama_args10 avgt 30 1914.431 ? 73.771 ns/op
> CallOverhead.panama_args5 avgt 30 1394.274 ? 49.369 ns/op
> CallOverhead.panama_blank avgt 30 872.878 ? 20.716 ns/op
> CallOverhead.panama_blank_trivial avgt 30 873.852 ? 21.350 ns/op
> CallOverhead.panama_identity avgt 30 1058.729 ? 20.229 ns/op
> CallOverhead.panama_identity_memory_address avgt 30 1041.648 ? 22.930 ns/op
> CallOverhead.panama_identity_struct avgt 30 3212.483 ? 151.627 ns/op
> CallOverhead.panama_identity_trivial avgt 30 1056.398 ? 18.779 ns/op
>
> -Djdk.internal.foreign.ProgrammableInvoker.USE_INTRINSICS=true:
>
> Benchmark Mode Cnt Score Error Units
> CallOverhead.jni_blank avgt 30 51.519 ? 0.345 ns/op
> CallOverhead.jni_identity avgt 30 54.689 ? 0.687 ns/op
> CallOverhead.panama_args10 avgt 30 42.856 ? 0.760 ns/op
> CallOverhead.panama_args5 avgt 30 42.192 ? 0.712 ns/op
> CallOverhead.panama_blank avgt 30 41.934 ? 0.349 ns/op
> CallOverhead.panama_blank_trivial avgt 30 2.806 ? 0.545 ns/op
> CallOverhead.panama_identity avgt 30 44.043 ? 0.398 ns/op
> CallOverhead.panama_identity_memory_address avgt 30 45.021 ? 0.409 ns/op
> CallOverhead.panama_identity_struct avgt 30 3206.829 ? 175.750 ns/op
> CallOverhead.panama_identity_trivial avgt 30 7.849 ? 1.651 ns/op
>
> [1] https://github.com/openjdk/panama-foreign/pull/279#issuecomment-679172326
src/hotspot/cpu/aarch64/aarch64.ad line 16057:
> 16055:
> 16056: format %{ "CALL, native $meth" %}
> 16057:
It might be better to expand this a little to indicate a near or a far call, if that's possible.
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3156:
> 3154:
> 3155: // Store a pointer to the previous R29 saved on the stack as it may
> 3156: // contain an oop if PreserveFramePointer is off. This value is
This is a bit confusing: please say "R29 (RFP)" here.
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3186:
> 3184: // Make sure that native code does not change SVE vector length.
> 3185: __ verify_sve_vector_length();
> 3186: }
Do we have to surround every call to verify_sve_vector_length() with if (UseSVE > 0) ?
Why is that check not inside verify_sve_vector_length() ? What is the meaning of the
> 0 comparison? Can it be negative? So many questions. :-)
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3227:
> 3225: __ mov(c_rarg0, rthread);
> 3226: #ifndef PRODUCT
> 3227: assert(frame::arg_reg_save_area_bytes == 0, "not expecting frame reg save area");
I don't think we need #ifndef PRODUCT here.
src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 3098:
> 3096: MacroAssembler* masm = _masm;
> 3097: if (reg->is_Register()) {
> 3098: __ push(RegSet::of(reg->as_Register()), sp);
Is this right? SP is 16-aligned, so this will use 16 bytes of stack for every 8-byte register. Does it get used a lot?
-------------
PR: https://git.openjdk.java.net/jdk/pull/1711
More information about the core-libs-dev
mailing list