RFR: 8254231: Implementation of Foreign Linker API (Incubator) [v4]

Coleen Phillimore coleenp at openjdk.java.net
Thu Oct 15 23:18:14 UTC 2020


On Thu, 15 Oct 2020 17:08:28 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> This patch contains the changes associated with the first incubation round of the foreign linker API
>> (see JEP 389 [1]). This work is meant to sit on top of the foreign memory access support (see JEP 393 [2] and
>> associated pull request [3]).
>>
>> The main goal of this API is to provide a way to call native functions from Java code without the need for
>> intermediate JNI glue code. In order to do this, native calls are modeled through the MethodHandle API. I suggest
>> reading the writeup [4] I put together a few weeks ago, which illustrates what the foreign linker support is, and how
>> it should be used by clients.
>>
>> Disclaimer: the pull request mechanism isn't great at managing *dependent* reviews. For this reason, I'm attaching a
>> webrev which contains only the differences between this PR and the memory access PR. I will be periodically uploading
>> new webrevs, as new iterations come out, to try and make the life of reviewers as simple as possible.
>>
>> A big thanks to Jorn Vernee and Vladimir Ivanov - they are the main architects of all the hotspot changes you see
>> here, and without their help, the foreign linker support wouldn't be what it is today. As usual, a big thanks to Paul
>> Sandoz, who provided many insights (often by trying the bits first hand).
>>
>> Thanks
>> Maurizio
>>
>> Webrev:
>>
>> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/webrev
>> 
>> Javadoc:
>> 
>> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/javadoc/jdk/incubator/foreign/package-summary.html
>> 
>> Specdiff (relative to [3]):
>> 
>> http://cr.openjdk.java.net/~mcimadamore/8254231_v1/specdiff_delta/overview-summary.html
>> 
>> CSR:
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8254232
>> 
>> 
>> 
>> ### API Changes
>> 
>> The API changes are actually rather slim:
>> 
>> * `LibraryLookup`
>>   * This class allows clients to look up symbols in native libraries; the interface is fairly simple; you can load a library
>>     by name, or absolute path, and then look up symbols on that library.
>> * `FunctionDescriptor`
>>   * This is an abstraction that is very similar, in spirit, to `MethodType`; it is, at its core, an aggregate of memory
>>     layouts for the function arguments/return type. A function descriptor is used to describe the signature of a native
>>     function.
>> * `CLinker`
>>   * This is the real star of the show. A `CLinker` has two main methods: `downcallHandle` and `upcallStub`; the first takes
>>     a native symbol (as obtained from `LibraryLookup`), a `MethodType` and a `FunctionDescriptor` and returns a
>>     `MethodHandle` instance which can be used to call the target native symbol. The second takes an existing method handle,
>>     and a `FunctionDescriptor` and returns a new `MemorySegment` corresponding to a code stub allocated by the VM which
>>     acts as a trampoline from native code to the user-provided method handle. This is very useful for implementing upcalls.
>>   * This class also contains the various layout constants that should be used by clients when describing native signatures
>>     (e.g. `C_LONG` and friends); these layouts contain additional ABI classification information (in the form of layout
>>     attributes) which is used by the runtime to *infer* how Java arguments should be shuffled for the native call to take
>>     place.
>>   * Finally, this class provides some helper functions e.g. so that clients can convert Java strings into C strings and
>>     back.
>> * `NativeScope`
>>   * This is a helper class which allows clients to group together logically related allocations; that is, rather than
>>     allocating separate memory segments using separate *try-with-resources* constructs, a `NativeScope` allows clients to
>>     use a _single_ block, and allocate all the required segments there. This is not only a usability boost, but also a
>>     performance boost, since not all allocation requests will be turned into `malloc` calls.
>> * `MemorySegment`
>>   * Only one method added here - namely `handoff(NativeScope)` which allows a segment to be transferred onto an existing
>>     native scope.
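A rough illustration of the `NativeScope` idea from the list above: one backing allocation is made up front, and each allocation request is served as a slice of it. This is only a sketch under stated assumptions. `ToyScope`, `allocate` and `allocatedBytes` are hypothetical names, not part of `jdk.incubator.foreign`, and a plain direct `ByteBuffer` stands in for a memory segment.

```java
import java.nio.ByteBuffer;

// Hypothetical toy arena; NOT the jdk.incubator.foreign NativeScope implementation.
final class ToyScope implements AutoCloseable {
    private final ByteBuffer block; // single backing allocation shared by all slices
    private int offset = 0;         // bump pointer into the block
    private boolean open = true;

    ToyScope(int capacityBytes) {
        this.block = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Serve the request as a slice of the shared block instead of a fresh allocation.
    // Alignment is relative to the start of the block, not to an absolute address.
    ByteBuffer allocate(int bytes, int alignment) {
        if (!open) {
            throw new IllegalStateException("scope is closed");
        }
        int start = (offset + alignment - 1) / alignment * alignment; // round up
        if (start + bytes > block.capacity()) {
            throw new IllegalStateException("toy scope exhausted");
        }
        ByteBuffer view = block.duplicate();
        view.position(start);
        view.limit(start + bytes);
        offset = start + bytes;
        return view.slice();
    }

    int allocatedBytes() {
        return offset;
    }

    // Closing only marks the scope as unusable for new requests in this sketch.
    @Override
    public void close() {
        open = false;
    }

    public static void main(String[] args) {
        try (ToyScope scope = new ToyScope(64)) {
            ByteBuffer a = scope.allocate(4, 4);
            ByteBuffer b = scope.allocate(8, 8);
            System.out.println(a.capacity() + " " + b.capacity() + " " + scope.allocatedBytes());
        }
    }
}
```

With a bounded scope like this one, both requests are carved out of the same block inside a single try-with-resources statement, which is the usability and performance point made for `NativeScope`.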
>> 
>> ### Safety
>> 
>> The foreign linker API is intrinsically unsafe; many things can go wrong when requesting a native method handle. For
>> instance, the description of the native signature might be wrong (e.g. have too many arguments) - and the runtime has,
>> in the general case, no way to detect such mismatches. For these reasons, obtaining a `CLinker` instance is
>> a *restricted* operation, which can be enabled by specifying the usual JDK property `-Dforeign.restricted=permit` (as
>> is the case for other restricted methods in the foreign memory API).
>>
>> ### Implementation changes
>>
>> The Java changes associated with `LibraryLookup` are relatively straightforward; the only interesting thing to note
>> here is that library loading does _not_ depend on class loaders, so `LibraryLookup` is not subject to the same
>> restrictions which apply to JNI library loading (e.g. same library cannot be loaded by different classloaders).
>>
>> As for `NativeScope`, the changes are again relatively straightforward; it is an API which sits neatly on top of the
>> foreign memory access API, providing some kind of allocation service which shares the same underlying memory
>> segment(s), and turns an allocation request into a segment slice, which is a much less expensive operation.
>> `NativeScope` comes in two variants: there are native scopes for which the allocation size is known a priori, and
>> native scopes which can grow - these two schemes are implemented by two separate subclasses of
>> `AbstractNativeScopeImpl`.
>>
>> Of course the bulk of the changes are to support the `CLinker` downcall/upcall routines. These changes cut pretty deep
>> into the JVM; I'll briefly summarize the goal of some of these changes - for further details, Jorn has put together a
>> detailed writeup which explains the rationale behind the VM support, with some references to the code [5].
>>
>> The main idea behind the foreign linker is to infer, given a Java method type (expressed as a `MethodType` instance)
>> and the description of the signature of a native function (expressed as a `FunctionDescriptor` instance), a _recipe_
>> that can be used to turn a Java call into the corresponding native call targeting the requested native function.
>>
>> This inference scheme can be defined in a pretty straightforward fashion by looking at the various ABI specifications
>> (for instance, see [6] for the SysV ABI, which is the one used on Linux/Mac). The various `CallArranger` classes, of
>> which we have a flavor for each supported platform, do exactly that kind of inference.
>>
>> For the inference process to work, we need to attach extra information to memory layouts; it is no longer sufficient
>> to know e.g. that a layout is 32/64 bits - we need to know whether it is meant to represent a floating point value, or
>> an integral value; this knowledge is required because floating point values are passed in different registers by most
>> ABIs. For this reason, `CLinker` offers a set of pre-baked, platform-dependent layout constants which contain the
>> required classification attributes (e.g. a `CLinker.TypeKind` enum value). The runtime extracts this attribute, and
>> performs classification accordingly.
>>
>> A native call is decomposed into a sequence of basic, primitive operations, called `Binding` (see the great javadoc on
>> the `Binding.java` class for more info). There are many such bindings - for instance the `Move` binding is used to
>> move a value into a specific machine register/stack slot. So, the main job of the various `CallArranger` classes is to
>> determine, given a Java `MethodType` and `FunctionDescriptor`, what is the set of bindings associated with the
>> downcall/upcall.
>>
>> At the heart of the foreign linker support is the `ProgrammableInvoker` class. This class effectively generates a
>> `MethodHandle` which follows the steps described by the various bindings obtained by `CallArranger`. There are
>> actually various strategies to interpret these bindings - listed below:
>>
>> * basic interpreted mode; in this mode, all bindings are interpreted using a stack-based machine written in Java (see
>>   `BindingInterpreter`), except for the `Move` bindings. For these bindings, the move is implemented by allocating
>>   a *buffer* (whose size is ABI specific) and by moving all the lowered values into positions within this buffer. The
>>   buffer is then passed to a piece of assembly code inside the VM which takes values from the buffer and moves them in
>>   their expected registers/stack slots (note that each position in the buffer corresponds to a different register). This
>>   is the most general invocation mode, and the most "customizable" one, but also the slowest - since for every call there
>>   is some extra allocation which takes place.
>> 
>> * specialized interpreted mode; same as before, but instead of interpreting the bindings with a stack-based interpreter,
>>   we generate a method handle chain which effectively interprets all the bindings (again, except `Move` ones).
>> 
>> * intrinsified mode; this is typically used in combination with the specialized interpreted mode described above
>>   (although it can also be used with the Java-based binding interpreter). The goal here is to remove the buffer
>>   allocation and copy by introducing an additional JVM intrinsic. If a native call recipe is constant (e.g. the set of
>>   bindings is constant, which is probably the case if the native method handle is stored in a `static`, `final` field),
>>   then the VM can generate specialized assembly code which interprets the `Move` binding without the need to go for an
>>   intermediate buffer. This gives us back performance that is on par with JNI.
>> 
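The basic interpreted mode described above can be pictured with a toy stack machine: non-`Move` steps operate on a value stack, while `Move` steps store the lowered value into a buffer slot that stands for a register. This is a minimal sketch, not the actual `BindingInterpreter`. The class `ToyBindings` and the binding kinds `PUSH_ARG`, `WIDEN_INT_TO_LONG` and `MOVE` are hypothetical simplifications, and the assembly stub that would read the buffer is left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class ToyBindings {
    // Hypothetical, simplified binding kinds; the real Binding class defines its own operators.
    enum Kind { PUSH_ARG, WIDEN_INT_TO_LONG, MOVE }

    static final class Binding {
        final Kind kind;
        final int index; // argument index for PUSH_ARG, buffer slot for MOVE, unused otherwise

        Binding(Kind kind, int index) {
            this.kind = kind;
            this.index = index;
        }
    }

    // Interpret a recipe: values flow through a stack, MOVE writes them into the buffer.
    static long[] interpret(List<Binding> recipe, Object[] args, int slots) {
        Deque<Object> stack = new ArrayDeque<>();
        long[] buffer = new long[slots]; // one slot per (pretend) register
        for (Binding b : recipe) {
            switch (b.kind) {
                case PUSH_ARG:
                    stack.push(args[b.index]);
                    break;
                case WIDEN_INT_TO_LONG:
                    stack.push(((Integer) stack.pop()).longValue());
                    break;
                case MOVE:
                    buffer[b.index] = (Long) stack.pop();
                    break;
                default:
                    throw new IllegalArgumentException("unknown binding " + b.kind);
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Recipe for a pretend call f(int, long): widen the int, then move both values.
        List<Binding> recipe = Arrays.asList(
            new Binding(Kind.PUSH_ARG, 0),
            new Binding(Kind.WIDEN_INT_TO_LONG, 0),
            new Binding(Kind.MOVE, 0),
            new Binding(Kind.PUSH_ARG, 1),
            new Binding(Kind.MOVE, 1));
        long[] buffer = interpret(recipe, new Object[] {7, 9L}, 2);
        System.out.println(Arrays.toString(buffer)); // prints [7, 9]
    }
}
```

The per-call `buffer` array is the extra allocation that makes this mode the slowest one; the intrinsified mode exists to avoid it.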
>> For upcalls, the support is not (yet) as advanced, and only the basic interpreted mode is available there. We plan to
>> add support for intrinsified modes there as well, which should considerably boost performance (probably well beyond
>> what JNI can offer at the moment, since the upcall support in JNI is not very well optimized).
>>
>> Again, for more reading on the internals of the foreign linker support, please refer to [5].
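To make the classification step from the implementation notes concrete (integral and floating point values travel in different registers), here is a small sketch of how argument locations could be assigned for scalar arguments under the SysV x86-64 convention. It is not the real `CallArranger`. `ToyCallArranger`, `TypeClass` and `assign` are hypothetical names, structs and varargs are ignored, and stack offsets are simply counted in 8-byte slots.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyCallArranger {
    // Hypothetical stand-in for the classification attribute carried by a layout.
    enum TypeClass { INTEGER, FLOAT }

    // Argument registers of the SysV x86-64 calling convention, in assignment order.
    private static final String[] INT_REGS = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    private static final String[] FLOAT_REGS =
        {"xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"};

    // Assign each scalar argument to the next free register of its class,
    // falling back to a stack slot once that register class is exhausted.
    static List<String> assign(TypeClass... args) {
        List<String> locations = new ArrayList<>();
        int nextInt = 0;
        int nextFloat = 0;
        int nextStackSlot = 0;
        for (TypeClass c : args) {
            if (c == TypeClass.INTEGER && nextInt < INT_REGS.length) {
                locations.add(INT_REGS[nextInt++]);
            } else if (c == TypeClass.FLOAT && nextFloat < FLOAT_REGS.length) {
                locations.add(FLOAT_REGS[nextFloat++]);
            } else {
                locations.add("stack+" + (8 * nextStackSlot++));
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        // prints [rdi, xmm0, rsi]
        System.out.println(assign(TypeClass.INTEGER, TypeClass.FLOAT, TypeClass.INTEGER));
    }
}
```

Because the output is just a list of locations, logic of this kind can be checked on any host platform without performing a native call.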
>> #### Test changes
>> 
>> Many new tests have been added to validate the foreign linker support; we have high level tests (see `StdLibTest`)
>> which aim at testing the linker from the perspective of code that clients could write. But we also have deeper
>> combinatorial tests (see `TestUpcall` and `TestDowncall`) which are meant to stress every corner of the ABI
>> implementation. There are also some great tests (see the `callarranger` folder) which test the various `CallArranger`s
>> for all the possible platforms; these tests adopt more of a white-box approach - that is, instead of treating the
>> linker machinery as a black box and verifying that the support works by checking that the native call returned the
>> results we expected, these tests aim at checking that the set of bindings generated by the call arranger is correct.
>> This also means that we can test the classification logic for Windows, Mac and Linux regardless of the platform we're
>> executing on.
>>
>> Some additional microbenchmarks have been added to compare the performance of downcall/upcall with JNI.
>>
>> [1] - https://openjdk.java.net/jeps/389
>> [2] - https://openjdk.java.net/jeps/393
>> [3] - https://git.openjdk.java.net/jdk/pull/548
>> [4] - https://github.com/openjdk/panama-foreign/blob/foreign-jextract/doc/panama_ffi.md
>> [5] - http://cr.openjdk.java.net/~jvernee/docs/Foreign-abi%20downcall%20intrinsics%20technical%20description.html
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Re-add file erroneously deleted (detected as rename)

I looked through some Hotspot runtime code and that looks ok.  I saw a couple of strange things on my way through the
code.  See comments.

src/hotspot/cpu/x86/foreign_globals_x86.cpp line 2:

> 1: /*
> 2:  * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.

The copyright year should be 2020.  All the new files should have 2020 as the copyright year; a bunch don't.

src/hotspot/cpu/x86/foreign_globals_x86.cpp line 56:

> 54: }
> 55:
> 56: const ABIDescriptor parseABIDescriptor(JNIEnv* env, jobject jabi) {

I don't know if you care about performance, but each of these env-> calls transitions into the VM and back out again.
You should prefix all the code that is called from Java into native with JNI_ENTRY and just use native JVM code to
implement these.

src/hotspot/cpu/x86/foreign_globals_x86.hpp line 32:

> 30: #define __ _masm->
> 31:
> 32: struct VectorRegister {

Why are these structs and not classes?

src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp line 3885:

> 3883:
> 3884:   __ flush();
> 3885: }

I think as a future RFE we should refactor this function and generate_native_wrapper since they're similar (this is
nicer to read).  If I can remove is_critical_native code they will be more similar.

-------------

Changes requested by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/634


