[foreign-memaccess] on shared segments
Brian Goetz
brian.goetz at oracle.com
Thu Sep 26 18:20:05 UTC 2019
I think this approach balances the requirements cleanly. It puts the
cost of concurrent access on the concurrent use cases -- by requiring an
extra step to set up the shared buffer -- without perturbing the rest of
the API or the performance of the confined use cases.
On 9/26/2019 2:10 PM, Maurizio Cimadamore wrote:
> Hi,
> in a previous document [1] I explored the problem of allowing
> concurrent access to a memory segment in a safe fashion. From that
> exploration, it emerged that there was one type of race that was
> particularly nasty: that is, a race between a thread A attempting to
> close a segment S while a thread B is attempting to access (read or
> write) S.
>
> The presence of this race makes it really hard to generalize the
> existing memory access API to cases where concurrent/shared access is
> needed. Of course one naive solution would be to synchronize every
> access on the liveness check, but that makes performance really poor -
> which would defeat the point of having such an API in the first place.
>
> Instead, to solve that problem, in the document I proposed a
> solution which uses an explicit acquire/release mechanism - that is
> clients of a shared segment will need to explicitly acquire the
> segment in order to be able to operate on it, and release it when
> done. A shared segment can only be closed when all clients are done
> with the segment - this is what ensures temporal safety. Moreover,
> since each client works on its own 'acquired' copy of the shared
> segment, everything is a constant and the JIT can see through the code
> and optimize it in the same way as it does for confined access. That
> said, we never fully committed to that solution, since the resulting
> API was very complex: for things to work, part of the MemorySegment
> API has to be moved under a new abstraction (in the document called
> MemoryHandle) - more specifically the bits that are responsible for
> creating addresses. While it's possible to devise a confined segment
> that is both a MemorySegment and a MemoryHandle (thus giving us back
> the old API), the general feedback I've received is that this solution
> seems a bit too convoluted.
>
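A minimal sketch of the acquire/release idea described above, assuming a plain reference count. The names (AcquireReleaseSketch, SharedResource) are hypothetical and are not taken from the document or the webrev; this only illustrates why close can succeed only when no client holds the resource.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the MemorySegment/MemoryHandle API.
final class AcquireReleaseSketch {

    /** A shared resource that can be closed only when no client holds it. */
    static final class SharedResource {
        private static final int CLOSED = -1;
        // >= 0: number of outstanding acquires; CLOSED: resource released.
        private final AtomicInteger holders = new AtomicInteger(0);

        /** Registers a client; fails if the resource is already closed. */
        void acquire() {
            int n;
            do {
                n = holders.get();
                if (n == CLOSED) {
                    throw new IllegalStateException("already closed");
                }
            } while (!holders.compareAndSet(n, n + 1));
        }

        /** Unregisters a client previously registered with acquire(). */
        void release() {
            int n;
            do {
                n = holders.get();
                if (n <= 0) {
                    throw new IllegalStateException("not acquired");
                }
            } while (!holders.compareAndSet(n, n - 1));
        }

        /** Closes the resource only if no client currently holds it. */
        void close() {
            if (!holders.compareAndSet(0, CLOSED)) {
                throw new IllegalStateException("still acquired or already closed");
            }
        }

        boolean isClosed() {
            return holders.get() == CLOSED;
        }
    }

    public static void main(String[] args) {
        SharedResource r = new SharedResource();
        r.acquire();
        try {
            r.close(); // rejected: one client still holds the resource
        } catch (IllegalStateException e) {
            System.out.println("close rejected: " + e.getMessage());
        }
        r.release();
        r.close();     // succeeds now that the count is back to zero
        System.out.println("closed: " + r.isClosed());
    }
}
```

The cost shows up in the API rather than in the access path: every client has to bracket its work with acquire/release, which is the complexity the message goes on to avoid.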
> When discussing this problem with Jim, he pointed out a useful
> connection and a possible way out: after all, all these
> acquire/release and reference counting schemes are there to perform a
> job that a JVM knows exactly how to do at speed: determining whether
> an object is still used or not. So, instead of inventing new
> machinery, we could simply piggyback on the mechanisms we already
> have - that is GC and Cleaners.
>
> The key realization, in the shared case, can be summarized as:
> performance, safety, deterministic deallocation, pick two! Since we're
> not willing to compromise on safety, or on performance, letting go of
> the deterministic de-allocation goal (only for shared segments) seems
> a reasonable conclusion.
>
> In other words, there are now two kinds of segments: /confined/
> segments and /shared/ segments. A segment always starts off as
> confined, and has an owning thread. You can update the owning thread -
> effectively nuking the existing segment and obtaining a new segment
> that is confined on a new thread. This allows clients to achieve
> serialized thread-confinement use cases - where multiple threads
> operate on a piece of memory one at a time. Confined segments are
> operated upon as usual: you allocate a segment, you use it, you close
> it (or you use a try-with-resources to do it all automagically).
>
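A minimal sketch of confinement and ownership transfer as described above, assuming a byte[] as a stand-in for off-heap memory. ConfinedSegment and withOwner are hypothetical names chosen for illustration; the actual API in the webrev may differ.

```java
// Hypothetical illustration only; not the actual MemorySegment API.
final class ConfinedSketch {

    static final class ConfinedSegment implements AutoCloseable {
        private final byte[] memory;   // stand-in for off-heap memory
        private final Thread owner;    // the only thread allowed to access
        private boolean alive = true;  // plain field: only the owner touches it

        ConfinedSegment(int size) {
            this(new byte[size], Thread.currentThread());
        }

        private ConfinedSegment(byte[] memory, Thread owner) {
            this.memory = memory;
            this.owner = owner;
        }

        private void checkAccess() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("wrong thread");
            }
            if (!alive) {
                throw new IllegalStateException("segment is not alive");
            }
        }

        byte get(int i) { checkAccess(); return memory[i]; }

        void set(int i, byte v) { checkAccess(); memory[i] = v; }

        /** Kills this segment and returns a new one confined to newOwner. */
        ConfinedSegment withOwner(Thread newOwner) {
            checkAccess();
            alive = false;
            return new ConfinedSegment(memory, newOwner);
        }

        @Override
        public void close() { checkAccess(); alive = false; }
    }

    public static void main(String[] args) throws InterruptedException {
        ConfinedSegment a = new ConfinedSegment(16);
        a.set(0, (byte) 42);

        ConfinedSegment[] handoff = new ConfinedSegment[1];
        Thread worker = new Thread(() -> {
            // The worker owns the new segment: it may use it and close it.
            try (ConfinedSegment b = handoff[0]) {
                System.out.println("worker reads " + b.get(0));
            }
        });
        handoff[0] = a.withOwner(worker); // 'a' is dead from here on
        worker.start();
        worker.join();

        try {
            a.get(0);
        } catch (IllegalStateException e) {
            System.out.println("old segment: " + e.getMessage());
        }
    }
}
```

Because liveness is only ever read and written by the owning thread, the check needs no synchronization, which is what keeps the confined case cheap.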
> If clients want more - e.g. full concurrent access, an API point is
> provided to turn a confined segment into a shared one. Again, what
> happens here is that the existing segment will be nuked, and a new
> shared segment will be created. But, this shared segment _cannot be
> closed_ (i.e. it is pinned, using the existing API terminology). So,
> how are off-heap resources released if we can't close the segment?
> Well, we let the GC take care of it - by registering the segment on a
> Cleaner, and have the cleaner call some cleanup code once the segment
> is no longer referenced (in reality, things are a bit different, in
> the sense that what we really key on is the _scope_ of a segment,
> which might be shared across multiple views, but the essence is the
> same). In other words, deallocation for shared segments works pretty
> much the same way deallocation of direct buffers works.
>
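A minimal sketch of Cleaner-based deallocation as described above, using the standard java.lang.ref.Cleaner API. SharedSegment and Release are hypothetical names, and a flag stands in for actually freeing off-heap memory; the real implementation keys on the scope rather than on the segment itself.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; not the actual shared scope implementation.
final class CleanerSketch {

    static final Cleaner CLEANER = Cleaner.create();

    /**
     * Cleanup action. It must not reference the segment, otherwise the
     * segment would stay reachable and the action would never run.
     */
    static final class Release implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);

        @Override
        public void run() {
            released.set(true); // real code would free the off-heap memory here
        }
    }

    /** A shared segment has no close(); the Cleaner releases it. */
    static final class SharedSegment {
        final Release release = new Release();

        SharedSegment() {
            CLEANER.register(this, release);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedSegment s = new SharedSegment();
        Release r = s.release;
        s = null; // drop the last strong reference to the segment

        // System.gc() is only a hint, so poll for a bounded amount of time.
        for (int i = 0; i < 50 && !r.released.get(); i++) {
            System.gc();
            Thread.sleep(100);
        }
        System.out.println("released by Cleaner: " + r.released.get());
    }
}
```

When the cleanup runs depends on the collector, which is exactly the deterministic deallocation that shared segments give up.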
> With this move, we are able to retain the simplicity of the existing
> API, while also being able to support efficient and safe concurrent
> access.
>
> A webrev implementing this change is available here:
>
> http://cr.openjdk.java.net/~mcimadamore/panama/shared-segments_v2/
>
> Implementation-wise things are, I think, quite straightforward. I took
> some time to refactor the code, to make the various scope subclasses
> disappear. We now have a single memory segment implementation and two
> scopes: shared and confined. The confined scope takes a 'Runnable'
> cleanup action which is used (i) when closing the confined segment or
> (ii) passed onto the Cleaner by the shared scope if the segment is
> upgraded to 'shared' state. Also, since shared segments can now be
> picked up by the Cleaner when no longer referenced, it is crucial that
> we add in reachability fences around Unsafe operations (the same way
> direct buffers do, really). This is because sometimes the GC can
> aggressively collect unused objects stored in local variables during
> method execution. Adding these fences doesn't negatively impact
> performance (in fact, I'm told these fences are a no-op in HotSpot).
>
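A minimal sketch of the reachability-fence pattern mentioned above, using the standard Reference.reachabilityFence method. The Segment class is hypothetical, and a byte[] stands in for the raw address that the real code accesses through Unsafe, so the fence here only shows where it would go.

```java
import java.lang.ref.Reference;

// Hypothetical illustration only; the real code wraps Unsafe accesses.
final class FenceSketch {

    static final class Segment {
        // Stand-in for a raw off-heap address that a Cleaner would free.
        private final byte[] memory = new byte[16];

        byte get(int i) {
            try {
                return memory[i]; // stands in for an Unsafe read
            } finally {
                // Keep 'this' strongly reachable until the access is done,
                // so the Cleaner cannot free the memory mid-access.
                Reference.reachabilityFence(this);
            }
        }

        void set(int i, byte v) {
            try {
                memory[i] = v;    // stands in for an Unsafe write
            } finally {
                Reference.reachabilityFence(this);
            }
        }
    }

    public static void main(String[] args) {
        Segment s = new Segment();
        s.set(3, (byte) 9);
        System.out.println(s.get(3));
    }
}
```

With a raw address the fence matters: once the address has been loaded into a local, nothing else in the method keeps the segment object alive.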
> I also took some effort to update some of the javadoc which is
> rendered invalid by this change.
>
> Comments welcome
>
> Maurizio
>
> [1] - http://cr.openjdk.java.net/~mcimadamore/panama/confinement.html
>