[foreign-memaccess] on shared segments

Fri Sep 27 11:29:32 UTC 2019

Most definitely 2). If the developer writes generalized code, they may not control where a segment originates, so asShared/asConfined acting as no-ops are a form of guarantee. Of course, asConfined on a shared segment has an interesting story to tell. I think it has to have copying semantics; I'm not sure an error is helpful there.
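
To make 2) concrete, here is a minimal toy sketch. It is not the real MemorySegment API: ToySegment, its byte[] backing store and allocateConfined are made up purely for illustration, and the model is not thread-safe. It assumes that asConfined/asShared return the same segment when no change is needed, that a confined-to-confined or confined-to-shared change kills the old segment, and that asConfined on a shared segment copies (the shared segment stays alive).

```java
import java.util.Objects;

/** Toy model only (hypothetical names); not the real foreign-memaccess API. */
final class ToySegment {
    private final byte[] data;   // stand-in for the off-heap memory
    private final Thread owner;  // null means "shared"
    private boolean alive = true;

    private ToySegment(byte[] data, Thread owner) {
        this.data = data;
        this.owner = owner;
    }

    /** A segment always starts off confined to the allocating thread. */
    static ToySegment allocateConfined(int size) {
        return new ToySegment(new byte[size], Thread.currentThread());
    }

    boolean isShared() { return owner == null; }
    boolean isAlive()  { return alive; }
    Thread owner()     { return owner; }

    private void checkAlive() {
        if (!alive) throw new IllegalStateException("segment is no longer alive");
    }

    /** Option 2: same owner is a no-op; shared -> confined copies (assumption). */
    ToySegment asConfined(Thread newOwner) {
        Objects.requireNonNull(newOwner);
        checkAlive();
        if (owner == newOwner) return this;   // no change needed
        if (isShared()) {
            // Copying semantics: the shared segment is left untouched.
            return new ToySegment(data.clone(), newOwner);
        }
        alive = false;                        // ownership transfer kills the old segment
        return new ToySegment(data, newOwner);
    }

    /** Option 2: already shared is a no-op; otherwise the confined segment is killed. */
    ToySegment asShared() {
        checkAlive();
        if (isShared()) return this;          // no change needed
        alive = false;
        return new ToySegment(data, null);
    }
}
```

With this shape, generalized code can call asShared() or asConfined(Thread.currentThread()) without knowing how the segment it was handed was created.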

> On Sep 26, 2019, at 5:07 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
> 
> 
> On 26/09/2019 19:20, Brian Goetz wrote:
>> I think this approach balances the requirements cleanly.  It puts the cost of concurrent access on the current use cases -- by requiring an extra step to set up the shared buffer -- without perturbing the rest of the API or the performance of the confined use cases.
> 
> Thanks.
> 
> Btw, forgot to mention - this opens up another round of "how should the asXYZ" methods be called. Before we had:
> 
> slice(long, long) -> segment with smaller bounds
> 
> asReadOnly() -> read only segment
> asPinned() -> non-closeable segment
> 
> Now we also have
> 
> asConfined(Thread) -> make a new confined segment with different owner
> asShared() -> make a new shared segment
> 
> 
> I think slice, asReadOnly and asPinned can be seen as 'views' - that is, their temporal scope is the same as the segment they come from. asConfined and asShared are different beasts.
> 
> For now I used the asXYZ for everything, but I'm conscious that maybe a better naming scheme exists.
> 
> Also, there is a question of what asConfined and asShared should do in the case where no change is needed - that is, if you call asConfined(A) on a segment already owned by A - what happens? Similarly, if you call asShared() on an already shared segment, what happens? I think we have three choices:
> 
> 1) give an error - seems harsh
> 2) return same segment - less harsh, but seems irregular - sometimes the segment is killed, sometimes it is not
> 3) kill current segment and return new segment every time - seems also a bit harsh
> 
> (the patch I shared is a bit inconsistent in that it does (2) for asShared, but (3) for asConfined)
> 
> I don't have a particularly strong preference for any of the choices, other than I kind of dislike (3). (2) seems reasonable overall. Opinions?
> 
> Maurizio
> 
>> 
>> On 9/26/2019 2:10 PM, Maurizio Cimadamore wrote:
>>> Hi,
>>> in a previous document [1] I explored the problem of allowing concurrent access to a memory segment in a safe fashion. From that exploration, it emerged that there was one type of race that was particularly nasty: that is, a race between a thread A attempting to close a segment S while a thread B is attempting to access (read or write) S.
>>> 
>>> The presence of this race makes it really hard to generalize the existing memory access API to cases where concurrent/shared access is needed. Of course one naive solution would be to synchronize every access on the liveness check, but that makes performance really poor - which would defeat the point of having such an API in the first place.
>>> 
>>> Instead, to solve that problem, in the document I proposed a solution which uses an explicit acquire/release mechanism - that is, clients of a shared segment need to explicitly acquire the segment in order to operate on it, and release it when done. A shared segment can only be closed when all clients are done with the segment - this is what ensures temporal safety. Moreover, since each client works on its own 'acquired' copy of the shared segment, everything is a constant and the JIT can see through the code and optimize it in the same way as it does for confined access. That said, we never fully committed to that solution, since the resulting API was very complex: for things to work, part of the MemorySegment API has to be moved under a new abstraction (called MemoryHandle in the document) - more specifically the bits that are responsible for creating addresses. While it's possible to devise a confined segment that is both a MemorySegment and a MemoryHandle (thus giving us back the old API), the general feedback I've received is that this solution seems a bit too convoluted.
>>> 
>>> When discussing this problem with Jim, he pointed out a useful connection and a possible way out: after all, all these acquire/release and reference counting schemes are there to perform a job that a JVM knows exactly how to do at speed: determining whether an object is still used or not. So, instead of inventing new machinery, we could simply piggyback on the mechanisms we already have - that is, GC and Cleaners.
>>> 
>>> The key realization, in the shared case, can be summarized as: performance, safety, deterministic deallocation, pick two! Since we're not willing to compromise on safety, or on performance, letting go of the deterministic de-allocation goal (only for shared segments) seems a reasonable conclusion.
>>> 
>>> In other words, there are now two kinds of segments: /confined/ segments and /shared/ segments. A segment always starts off as confined, and has an owning thread. You can update the owning thread - effectively nuking the existing segment and obtaining a new segment that is confined on a new thread. This allows clients to achieve serialized thread-confinement use cases - where multiple threads operate on a piece of memory one at a time. Confined segments are operated upon as usual: you allocate a segment, you use it, you close it (or you use a try-with-resources to do it all automagically).
>>> 
>>> If clients want more - e.g. full concurrent access - an API point is provided to turn a confined segment into a shared one. Again, what happens here is that the existing segment will be nuked, and a new shared segment will be created. But this shared segment _cannot be closed_ (i.e. it is pinned, using the existing API terminology). So, how are off-heap resources released if we can't close the segment? Well, we let the GC take care of it - by registering the segment on a Cleaner, and having the cleaner call some cleanup code once the segment is no longer referenced (in reality, things are a bit different, in the sense that what we really key on is the _scope_ of a segment, which might be shared across multiple views, but the essence is the same). In other words, deallocation for shared segments works pretty much the same way as deallocation of direct buffers.
>>> 
>>> With this move, we are able to retain the simplicity of the existing API, while also being able to support efficient and safe concurrent access.
>>> 
>>> A webrev implementing this change is available here:
>>> 
>>> http://cr.openjdk.java.net/~mcimadamore/panama/shared-segments_v2/
>>> 
>>> Implementation-wise things are, I think, quite straightforward. I took some time to refactor the code, to make the various scope subclasses disappear. We now have a single memory segment implementation and two scopes: shared and confined. The confined scope takes a 'Runnable' cleanup action which is either (i) run when closing the confined segment or (ii) passed on to the Cleaner by the shared scope if the segment is upgraded to the 'shared' state. Also, since shared segments can now be picked up by the Cleaner when no longer referenced, it is crucial that we add reachability fences around Unsafe operations (the same way direct buffers do, really). This is because the GC can sometimes aggressively collect unused objects stored in local variables during method execution. Adding these fences doesn't negatively impact performance (in fact, I'm told these fences are a no-op in Hotspot).
>>> 
>>> I also took some effort to update some of the javadoc that was rendered invalid by this change.
>>> 
>>> Comments welcome
>>> 
>>> Maurizio
>>> 
>>> [1] - http://cr.openjdk.java.net/~mcimadamore/panama/confinement.html
>>> 
>>
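
For readers following along, the two-scope arrangement described in the original post (a Runnable cleanup that is either run on close or handed to a Cleaner on upgrade, plus reachability fences around accesses) can be sketched roughly as below. This is only an illustration under assumptions, not the code in the webrev: ScopeSketch, ConfinedScope, SharedScope and toShared are hypothetical names, and a long[] stands in for off-heap memory (so the fence only shows placement here; it matters when the access goes through a raw address). java.lang.ref.Cleaner and Reference.reachabilityFence are the standard JDK 9+ APIs.

```java
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;

/** Illustrative sketch only (hypothetical names); not the webrev implementation. */
final class ScopeSketch {
    private static final Cleaner CLEANER = Cleaner.create();

    /** Confined: deterministic deallocation through close(). */
    static final class ConfinedScope implements AutoCloseable {
        private final long[] memory;     // stand-in for off-heap memory
        private final Runnable cleanup;  // must not capture the scope itself
        private boolean alive = true;

        ConfinedScope(int size, Runnable cleanup) {
            this.memory = new long[size];
            this.cleanup = cleanup;
        }

        boolean isAlive() { return alive; }

        @Override
        public void close() {
            if (!alive) throw new IllegalStateException("scope is not alive");
            alive = false;
            cleanup.run();               // (i) cleanup runs on close
        }

        /** Kills this scope and hands the cleanup action to a shared scope. */
        SharedScope toShared() {
            if (!alive) throw new IllegalStateException("scope is not alive");
            alive = false;
            return new SharedScope(memory, cleanup);
        }
    }

    /** Shared: no close(); cleanup runs some time after the scope becomes unreachable. */
    static final class SharedScope {
        private final long[] memory;

        SharedScope(long[] memory, Runnable cleanup) {
            this.memory = memory;
            CLEANER.register(this, cleanup);  // (ii) cleanup is passed on to the Cleaner
        }

        long get(int index) {
            try {
                return memory[index];
            } finally {
                // Keep the scope reachable until the access has completed.
                Reference.reachabilityFence(this);
            }
        }

        void set(int index, long value) {
            try {
                memory[index] = value;
            } finally {
                Reference.reachabilityFence(this);
            }
        }
    }
}
```

Note that when the shared scope's cleanup actually runs is up to the GC, which is exactly the deterministic-deallocation trade-off discussed above.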