RFR: 8254162: Implementation of Foreign-Memory Access API (Third Incubator)

Aleksey Shipilev shade at openjdk.java.net
Thu Oct 8 07:44:43 UTC 2020


On Wed, 7 Oct 2020 17:13:22 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

> This patch contains the changes associated with the third incubation round of the foreign memory access API
> (see JEP 393 [1]). This iteration focuses on improving the usability of the API in 3 main ways:
> * first, by providing a way to obtain truly *shared* segments, which can be accessed and closed concurrently from
>   multiple threads
> * second, by providing a way to register a memory segment against a `Cleaner`, so as to have some (optional) guarantee
>   that the memory will be deallocated, eventually
> * third, by not requiring users to dive deep into var handles when they first pick up the API; a new `MemoryAccess` class
>   has been added, which defines several useful dereference routines; these are really just thin wrappers around memory
>   access var handles, but they make the barrier of entry for using this API somewhat lower.
> 
> A big conceptual shift that comes with this API refresh is that the role of `MemorySegment` and `MemoryAddress` is not
> the same as it used to be; it used to be the case that a memory address could (sometimes, not always) have a back link
> to the memory segment which originated it; additionally, memory access var handles used `MemoryAddress` as a basic unit
> of dereference.  This has all changed as per this API refresh;  now a `MemoryAddress` is just a dumb carrier which
> wraps a pair of object/long addressing coordinates; `MemorySegment` has become the star of the show, as far as
> dereferencing memory is concerned. You cannot dereference memory if you don't have a segment. This improves usability
> in a number of ways - first, it is a lot easier to wrap native addresses (`long`, essentially) into a `MemoryAddress`;
> secondly, it is crystal clear what a client has to do in order to dereference memory: if a client has a segment, it can
> use that; otherwise, if the client only has an address, it will have to create a segment *unsafely* (this can be done
> by calling `MemoryAddress::asSegmentRestricted`).
> 
> A list of the API, implementation and test changes is provided below. If you have any questions, or need more
> detailed explanations, I (and the rest of the Panama team) will be happy to point at existing discussions, and/or to
> provide the feedback required.
> 
> A big thank you to Erik Osterlund, Vladimir Ivanov and David Holmes, without whom the work on shared memory segments
> would not have been possible; also I'd like to thank Paul Sandoz, whose insights on API design have been very helpful
> in this journey.
> 
> Thanks
> Maurizio
> 
> Javadoc:
> 
> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html
> 
> Specdiff:
> 
> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html
> 
> CSR:
> 
> https://bugs.openjdk.java.net/browse/JDK-8254163
> 
> 
> 
> ### API Changes
> 
> * `MemorySegment`
>   * drop factory for restricted segment (this has been moved to `MemoryAddress`, see below)
>   * add a no-arg factory for a native restricted segment representing the entire native heap
>   * rename `withOwnerThread` to `handoff`
>   * add new `share` method, to create shared segments
>   * add new `registerCleaner` method, to register a segment against a cleaner
>   * add more helpers to create arrays from a segment e.g. `toIntArray`
>   * add some `asSlice` overloads (to make up for the fact that now segments are more frequently used as cursors)
>   * rename `baseAddress` to `address` (so that `MemorySegment` can implement `Addressable`)
> * `MemoryAddress`
>   * drop `segment` accessor
>   * drop `rebase` method and replace it with `segmentOffset` which returns the offset (a `long`) of this address relative
>     to a given segment
> * `MemoryAccess`
>   * New class supporting several static dereference helpers; the helpers are organized by carrier and access mode, where a
>     carrier is one of the usual suspects (a Java primitive, minus `boolean`); the access mode can be simple (e.g. access
>     the base address of a given segment), or indexed, in which case the accessor takes a segment and either a low-level
>     byte offset, or a high-level logical index. The classification is reflected in the naming scheme (e.g. `getByte` vs.
>     `getByteAtOffset` vs. `getByteAtIndex`).
> * `MemoryHandles`
>   * drop `withOffset` combinator
>   * drop `withStride` combinator
>   * the basic memory access handle factory now returns a var handle which takes a `MemorySegment` and a `long` - from which
>     it is easy to derive all the other handles using plain var handle combinators.
> * `Addressable`
>   * This is a new interface which is attached to entities which can be projected to a `MemoryAddress`. For now, both
>     `MemoryAddress` and `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more implementations. Clients
>     can largely ignore this interface, which comes in really handy when defining native bindings with tools like `jextract`.
> * `MemoryLayouts`
>   * A new layout, for machine addresses, has been added to the mix.
> 
> 
> 
> ### Implementation changes
> 
> There are two main things to discuss here: support for shared segments, and the general simplification of the memory
> access var handle support.
> 
> #### Shared segments
> 
> The support for shared segments cuts in pretty deep in the VM. Support for shared segments is notoriously hard to
> achieve, at least in a way that guarantees optimal access performance. This is caused by the fact that, if a segment
> is shared, it would be possible for a thread to close it while another is accessing it.
> 
> After considering several options (see [3]), we zeroed in on an approach which is inspired by a happy idea that
> Andrew Haley had (and that he reminded me of at this year's OpenJDK committer workshop - thanks!). The idea is that if
> we could *freeze* the world (e.g. with a GC pause), while a segment is closed, we could then prevent segments from
> being accessed concurrently to a close operation. For this to work, it is crucial that no GC safepoints can occur
> between a segment liveness check and the access itself (otherwise it would be possible for the accessing thread to
> stop just right before an unsafe call). It also relies on the fact that hotspot/C2 should not be able to propagate
> loads across safepoints.
> 
> Sadly, none of these conditions seems to be valid in the current implementation, so we needed to resort to a bit of
> creativity. First, we noted that, if we could mark so-called *scoped* methods with an annotation, it would be very
> simple to check whether a thread was in the middle of a scoped method when we stopped the world for a close operation
> (btw, instead of stopping the world, we do a much more efficient, thread-local polling, thanks to JEP 312 [4]).
> 
> The question is, then, once we detect that a thread is accessing the very segment we're about to close, what should
> happen? We first experimented with a solution which would install an *asynchronous* exception on the accessing thread,
> thus making it fail. This solution has some desirable properties, in that a `close` operation always succeeds.
> Unfortunately the machinery for async exceptions is a bit fragile (e.g. not all the code in hotspot checks for async
> exceptions); to minimize risks, we decided to revert to a simpler strategy, where `close` might fail when it finds
> that another thread is accessing the segment being closed.
> 
> As written in the javadoc, this doesn't mean that clients should just catch and try again; an exception on `close` is
> a bug in the user code, likely arising from lack of synchronization, and should be treated as such.
> 
> In terms of gritty implementation, we needed to centralize memory access routines in a single place, so that we could
> have a set of routines closely mimicking the primitives exposed by `Unsafe` but which, in addition, also provided a
> liveness check. This way we could mark all these routines with the special `@Scoped` annotation, which tells the VM
> that something important is going on.
> 
> To achieve this, we created a new (autogenerated) class, called `ScopedMemoryAccess`. This class contains all the main
> memory access primitives (including bulk access, like `copyMemory`, or `setMemory`), and accepts, in addition to the
> access coordinates, also a scope object, which is tested before access. A reachability fence is also thrown in the mix
> to make sure that the scope is kept alive during access (which is important when registering segments against
> cleaners).
> 
> Of course, to make memory access safe, memory access var handles, byte buffer var handles, and byte buffer API should
> use the new `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can be triggered (in case a scope
> is present).
> 
> `ScopedMemoryAccess` has a `closeScope` method, which initiates the thread-local handshakes, and returns `true` if the
> handshake completed successfully.
> 
> `MemoryScope` (now significantly simplified from what we had before) has two implementations, one for confined
> segments and one for shared segments; the main difference between the two is what happens when the scope is closed; a
> confined segment sets a boolean flag to false, and returns, whereas a shared segment goes into a `CLOSING` state, then
> starts the handshake, and then updates the state again, to either `CLOSED` or `ALIVE` depending on whether the
> handshake was successful or not. Note that when a shared segment is in the `CLOSING`
> state, `MemorySegment::isAlive` will still return `true`, while the liveness check upon memory access will fail.
> 
> #### Memory access var handles overhaul
> 
> The key realization here was that if all memory access var handles took a coordinate pair of `MemorySegment` and
> `long`, all other access types could be derived from this basic var handle form.
> 
> This allowed us to remove the on-the-fly var handle generation, and to simply derive structural access var handles
> (such as those obtained by calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so that e.g.
> additional offset is injected into a base memory access var handle.
> 
> This also helped in simplifying the implementation by removing the special `withStride` and `withOffset` combinators,
> which previously needed low-level access on the innards of the memory access var handle. All that code is now gone.
> 
> #### Test changes
> 
> Not much to see here - most of the tests needed to be updated because of the API changes. Some were beefed up (like
> the array test, since now segments can be projected into many different kinds of arrays). A test has been added to
> test the `Cleaner` functionality, and another stress test has been added for shared segments (`TestHandshake`). Some
> of the microbenchmarks also needed some tweaks - and some of them were also updated to also test performance in the
> shared segment case.
> 
> [1] - https://openjdk.java.net/jeps/393
> [2] - https://openjdk.java.net/jeps/389
> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html
> [4] - https://openjdk.java.net/jeps/312
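
To make the shared-scope close protocol described above easier to follow, here is a toy model of the `ALIVE` / `CLOSING` / `CLOSED` transitions. This is only a sketch under assumptions, not the patch's `MemoryScope`: the class name `SharedScopeSketch`, the method `checkValidState`, and the `BooleanSupplier` standing in for the thread-local handshake are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Toy model of the shared-scope close protocol; NOT the JDK's MemoryScope. */
final class SharedScopeSketch {
    static final int ALIVE = 0, CLOSING = 1, CLOSED = 2;

    private final AtomicInteger state = new AtomicInteger(ALIVE);

    // Hypothetical stand-in for the thread-local handshake: returns true
    // when no thread was found accessing the segment.
    private final BooleanSupplier handshake;

    SharedScopeSketch(BooleanSupplier handshake) {
        this.handshake = handshake;
    }

    /** As described for isAlive: still true while the scope is CLOSING. */
    boolean isAlive() {
        return state.get() != CLOSED;
    }

    /** Liveness check performed before each access: fails unless ALIVE. */
    void checkValidState() {
        if (state.get() != ALIVE) {
            throw new IllegalStateException("Already closed");
        }
    }

    /** ALIVE -> CLOSING, then CLOSED on success or back to ALIVE on failure. */
    void close() {
        if (!state.compareAndSet(ALIVE, CLOSING)) {
            throw new IllegalStateException("Already closed");
        }
        boolean ok = handshake.getAsBoolean();
        state.set(ok ? CLOSED : ALIVE);
        if (!ok) {
            throw new IllegalStateException(
                "Cannot close while another thread is accessing the segment");
        }
    }
}
```

In this model a failed `close` leaves the scope usable again, which matches the description that a failing `close` signals a bug in user code rather than something to retry.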

Drive-by review.

src/hotspot/share/prims/scopedMemoryAccess.cpp line 1:

> 1:

Missing copyright header.

src/hotspot/share/prims/scopedMemoryAccess.cpp line 81:

> 79:       CompiledMethod* cm = last_frame.cb()->as_compiled_method();
> 80:
> 81:       //FIXME: this doesn't work if reachability fences are violated by C2

Maybe turn this into a `/* */` block, so that it is obvious the whole thing relates to the `FIXME`?

src/hotspot/share/prims/scopedMemoryAccess.cpp line 96:

> 94:     int depth = 0;
> 95:     vframeStream stream(jt);
> 96:     for (; !stream.at_end(); stream.next()) {

Can the `stream` initialization move right into the `for` statement?

src/hotspot/share/prims/scopedMemoryAccess.cpp line 138:

> 136: /// JVM_RegisterUnsafeMethods
> 137:
> 138: #define LANG "Ljdk/internal/misc/"

This is weirdly defined as `LANG`. I suppose this would change to `Ljava/lang` soon. But maybe `PACKAGE` is better.

src/hotspot/share/prims/scopedMemoryAccess.cpp line 130:

> 128:  * Top frames containg obj will be deoptimized.
> 129:  */
> 130: JVM_ENTRY(jboolean, ScopedMemoryAccess_closeScope(JNIEnv *env, jobject receiver, jobject deopt, jobject exception)) {

`JVM_ENTRY` does not require a brace, it is braced already. See existing uses of `JVM_ENTRY`.

src/hotspot/share/prims/scopedMemoryAccess.cpp line 134:

> 132:   Handshake::execute(&cl);
> 133:   return !cl._found;
> 134: } JVM_END

Ditto for `JVM_END`.

src/hotspot/share/prims/scopedMemoryAccess.cpp line 166:

> 164:   int ok = env->RegisterNatives(scopedMemoryAccessClass, jdk_internal_misc_ScopedMemoryAccess_methods, sizeof(jdk_internal_misc_ScopedMemoryAccess_methods)/sizeof(JNINativeMethod));
> 165:   guarantee(ok == 0, "register jdk.internal.misc.ScopedMemoryAccess natives");
> 166: } JVM_END

`JVM_ENTRY`/`JVM_END` braces again.

src/java.base/share/classes/java/lang/invoke/MemoryAccessVarHandleBase.java line 45:

> 43:     final boolean skipAlignmentMaskCheck;
> 44:
> 45:     MemoryAccessVarHandleBase(VarForm form, boolean skipOffetCheck, boolean be, long length, long alignmentMask) {

Typo: `skipOff*s*etCheck`. Should it be `skipAlignmentMaskCheck` to begin with?

test/jdk/java/foreign/TestMismatch.java line 26:

> 24: /*
> 25:  * @test
> 26:  * @run testng/othervm -XX:MaxDirectMemorySize=5000000000 TestMismatch

Whoa, allocating 5 GB? That might fail on 32-bit platforms... Anyhow, this flag accepts suffixes, so
`-XX:MaxDirectMemorySize=5g`.
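
One caveat on the numbers, as a quick sketch: assuming the usual HotSpot convention that the `g` suffix means 1024^3 bytes, `5g` is 5,368,709,120 bytes, slightly more than the 5,000,000,000 in the current `@run` line, and both are above the 2^32 bytes a 32-bit address space can cover. The class and method names below are made up for illustration.

```java
/** Arithmetic behind the flag values; names are hypothetical. */
final class DirectMemorySizeSketch {
    /** Size of a 32-bit address space: 2^32 bytes. */
    static final long ADDRESS_SPACE_32_BIT = 1L << 32;

    /** Bytes denoted by an "<n>g" value, assuming g = 1024^3. */
    static long gibibytes(long n) {
        return n * 1024L * 1024L * 1024L;
    }

    /** True if a limit of this many bytes fits below 2^32. */
    static boolean fitsIn32BitAddressSpace(long bytes) {
        return bytes < ADDRESS_SPACE_32_BIT;
    }

    public static void main(String[] args) {
        long decimal = 5_000_000_000L;  // value in the current @run line
        long suffixed = gibibytes(5);   // value denoted by 5g
        System.out.println("5g - 5000000000 = " + (suffixed - decimal));
        System.out.println("fits in 32 bits: " + fitsIn32BitAddressSpace(decimal));
    }
}
```

So switching to the suffix form changes the limit by roughly 369 MB, which probably does not matter for the test but is worth knowing.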

-------------

PR: https://git.openjdk.java.net/jdk/pull/548


