[foreign-abi] RFR: 8248331: Intrinsify downcall handles in C2
Jorn Vernee
jvernee at openjdk.java.net
Tue Jun 30 13:55:37 UTC 2020
----------------------------------------------------------------------
On Mon, 29 Jun 2020 11:20:22 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:
> Hi,
>
> This patch adds intrinsification of downcall handles.
>
> This is done through a new method handle intrinsic called linkToNative. We create a NativeMethodHandle that calls this
> intrinsic, which then replaces the leaf method handle in ProgrammableInvoker::getBoundMethodHandle, basically replacing
> the call to invokeMoves. Before C2 kicks in, this intrinsic will call a fallback method handle that we pass to it. The
> handle that is passed is a handle that points to ProgrammableInvoker::invokeMoves, thus simulating the current
> behaviour. However, when a call to linkToNative is inlined, C2 will instead generate either a direct call to the
> target function, or a call to a wrapper stub (which is generated on demand) that also does the thread state transitions
> needed for long-running native functions. Information about the ABI and which registers to use is captured in a so-called
> 'appendix argument' of the type NativeEntryPoint, which is passed as the last argument. This captures all the
> information needed to generate the call in C2 (note that previously in the patch shared in the discussion thread this
> information was split over several classes, but I've condensed the info into just NativeEntryPoint in order to reduce
> the amount of code needed in the vm to be able to access the information). With this, the overhead for downcalls is on
> par or slightly lower than with JNI for calls that need to do thread state transitions, and it is even lower when the
> thread state transitions are omitted (see the *_trivial runs).
>
> Benchmark                             Mode  Cnt   Score   Error  Units
> CallOverhead.jni_blank                avgt   30   8.461 ± 0.892  ns/op
> CallOverhead.jni_identity             avgt   30  12.585 ± 0.066  ns/op
> CallOverhead.panama_blank             avgt   30   8.562 ± 0.029  ns/op
> CallOverhead.panama_blank_trivial     avgt   30   1.372 ± 0.008  ns/op
> CallOverhead.panama_identity          avgt   30  11.813 ± 0.073  ns/op
> CallOverhead.panama_identity_trivial  avgt   30   6.042 ± 0.024  ns/op
> Finished running test 'micro:CallOverhead'
>
> Thanks,
> Jorn
>
> I see the switch between trivial and non-trivial invocation (working my way through the code from bottom up) is
> triggered by a boolean attribute on a function descriptor. For the moment, is that just for experimental purposes, or
> is it also intended for end users?
For now it's intended as a tool to help explore what is possible (by us or by end users), and how much of a difference
dropping the thread state transitions makes. Whether it will be in the final product remains to be seen. Perhaps it can
be replaced with a heuristic that turns off thread state transitions automatically, or maybe it's ultimately just not
worth the complexity to get a small speed-up (though it seems that attributes on FunctionDescriptor are more broadly
applicable to other use cases, such as capturing information about how a call might affect CPU status registers, or for
turning on and off certain dynamic safety checks on arguments).
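
For readers following along, here is a rough sketch of the arrangement described in the quoted message: a leaf handle
that carries a trailing appendix argument and, until the JIT takes over, simply runs a fallback. It uses only
java.lang.invoke and is not the code from the patch. EntryPoint, fallback, leafHandle and callShape are hypothetical
names made up for illustration; EntryPoint stands in for NativeEntryPoint and fallback for
ProgrammableInvoker::invokeMoves.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LinkToNativeSketch {
    // Hypothetical stand-in for the appendix argument (NativeEntryPoint in the patch).
    static final class EntryPoint {
        final String name;
        final boolean needsTransition; // false models the "trivial" attribute

        EntryPoint(String name, boolean needsTransition) {
            this.name = name;
            this.needsTransition = needsTransition;
        }
    }

    // Hypothetical stand-in for the fallback path used before the call is compiled.
    // It models an identity function; the appendix arrives as the last argument.
    static long fallback(long arg, EntryPoint ep) {
        return arg;
    }

    // Describes which of the two call shapes from the message would apply once compiled.
    static String callShape(EntryPoint ep) {
        return ep.needsTransition
                ? "wrapper stub with thread state transitions"
                : "direct call";
    }

    // Builds the leaf handle by binding the appendix as the trailing argument.
    static MethodHandle leafHandle(EntryPoint ep) throws ReflectiveOperationException {
        MethodHandle fb = MethodHandles.lookup().findStatic(
                LinkToNativeSketch.class, "fallback",
                MethodType.methodType(long.class, long.class, EntryPoint.class));
        return MethodHandles.insertArguments(fb, 1, ep);
    }

    public static void main(String[] args) throws Throwable {
        EntryPoint ep = new EntryPoint("identity", true);
        MethodHandle h = leafHandle(ep);
        long result = (long) h.invokeExact(42L);
        System.out.println(result + " via " + callShape(ep));
    }
}
```

In the actual patch the choice between the two call shapes is made by C2 when the call to linkToNative is inlined; the
sketch only models the fallback path and labels the choice as a string.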
-------------
PR: https://git.openjdk.java.net/panama-foreign/pull/219