[foreign-memaccess] musing on the memory access API

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon Jan 4 18:25:37 UTC 2021


Apologies - this email was not meant for public consumption (as the 
somewhat informal tone suggests). That said, since there's nothing in 
here which has not been discussed in this mailing list in the past, in 
one way or another, we might as well discuss it :-)

On 04/01/2021 16:11, Maurizio Cimadamore wrote:
> Hi,
> now that the foreign memory access API has been around for a year, I 
> think it’s time we start asking ourselves if this is the API we want, 
> and how comfortable we are in finalizing it. Overall, I think that 
> there are some aspects of the memory access API which are definitely 
> a success:
>
>  * memory layouts, and the way they connect with dereference var
>    handles, are definitely a success story; now that we have added
>    even more var handle combinators, it is really possible to get crazy
>    with expressing exotic memory access
>
>  * the new shape of memory access var handles, (MemorySegment,
>    long)->X, makes a lot of sense, and it allowed us to greatly simplify
>    and unify the implementation (as well as to give users a cheap way
>    to do unsafe dereference of random addresses, which they sometimes
>    want)
>
>  * the distinction between MemorySegment and MemoryAddress is largely
>    beneficial - and, when explained, it’s pretty obvious where the
>    difference comes from: to do dereference we need to attach bounds (of
>    various kinds) to a raw pointer - after we do that, dereference
>    operations are safe. I think this model makes it very natural to
>    think about which places in your program might introduce invalid
>    assumptions, especially when dealing with native code
>
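> To make the first two points concrete, here is a minimal sketch against
> the incubator API roughly as it stands today (names such as
> MemoryLayout.ofSequence, MemoryLayouts.JAVA_INT and SequenceLayout are
> quoted from memory, so treat them as approximate):
>
>     import jdk.incubator.foreign.*;
>     import jdk.incubator.foreign.MemoryLayout.PathElement;
>     import java.lang.invoke.VarHandle;
>
>     public class LayoutDemo {
>         public static void main(String[] args) {
>             // a layout describing 10 contiguous ints (names approximate)
>             SequenceLayout ints = MemoryLayout.ofSequence(10, MemoryLayouts.JAVA_INT);
>             // layout-derived var handle; coordinates are (MemorySegment, long index)
>             VarHandle intElem = ints.varHandle(int.class, PathElement.sequenceElement());
>             try (MemorySegment segment = MemorySegment.allocateNative(ints)) {
>                 for (long i = 0; i < ints.elementCount().getAsLong(); i++) {
>                     intElem.set(segment, i, (int) i);   // safe, bounds-checked write
>                 }
>                 // plain (MemorySegment, long) -> int dereference
>                 int first = MemoryAccess.getIntAtOffset(segment, 0);
>             }
>         }
>     }
>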
> I also think that there are aspects of the API where it’s less clear 
> we made the right call:
>
>  * slicing behavior: closing the slice closes everything. This was
>    mostly a forced move: there are basically two use cases for slices.
>    Sometimes you slice soon after creation (e.g. to align), in which
>    case you want the new slice to have the same properties as the old
>    one (e.g. deallocate on close). There are other cases where you are
>    just creating a dumb sub-view, and you don’t really want to expose
>    close() on those. This led to the creation of the “access modes”
>    mechanism: each segment has some access modes - if the client wants
>    to prevent calls to MemorySegment::close it can do so by
>    /restricting/ the segment, and removing the corresponding CLOSE
>    access mode (e.g. before the segment is shared with other clients);
>    a sketch of this idiom follows after this list. While this allows us
>    to express all the use cases we care about, it also seems a tad
>    convoluted. Moreover, a client wrapping a MemorySegment inside a TWR
>    is always unsure as to whether the segment will support close() or
>    not.
>
>  * not all segments are created equal: some memory segments are just
>    dumb views over memory that has been allocated somewhere else - e.g.
>    a Java heap array or a byte buffer. In such cases, it seems odd to
>    feature a close() operation (and, I might add, even thread
>    confinement, given the original API did not feature that to begin
>    with).
>
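> Here is roughly what the restriction idiom above looks like today
> (withAccessModes, accessModes() and the CLOSE constant are quoted from
> memory, so the exact names may be slightly off):
>
>     import jdk.incubator.foreign.MemorySegment;
>
>     public class RestrictDemo {
>         public static void main(String[] args) {
>             MemorySegment segment = MemorySegment.allocateNative(100);
>             // drop CLOSE before handing the view out, so receivers cannot free the memory
>             MemorySegment view =
>                 segment.withAccessModes(segment.accessModes() & ~MemorySegment.CLOSE);
>             consume(view);
>             segment.close();   // only the original holder can still close
>         }
>
>         static void consume(MemorySegment view) {
>             // calling view.close() here would be rejected at runtime
>         }
>     }
>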
> Sidebar: on numerous occasions it has been suggested to solve issues 
> such as the one above by allowing close() to be a no-op in certain 
> cases. While that is doable, I’ve never been too convinced about it, 
> mainly because of this:
>
>     MemorySegment s = ...
>     s.close();
>     assertFalse(s.isAlive()); // I expect this to never fail!!!!
>
> In other words, a world where some segments are stateful and respond 
> to close() requests accordingly and some do not seems very confusing 
> to me.
>
>  * the various operations for managing confinement of segments are
>    rapidly turning into a distraction. For instance, recently, the
>    Netty guys have created a port on top of the memory access API,
>    since we have added support for shared segments. Their use of shared
>    segments was a bit strange, in the sense that, while they allocated a
>    segment in shared mode, they wanted to be able to confine the
>    segment near where the segment is used, to catch potential mistakes.
>    To do so, they resorted to calling handoff on a shared segment
>    repeatedly, which performance-wise doesn’t work. Closing a shared
>    segment (even if just for handing it off to some other thread) is a
>    very expensive operation which needs to be used carefully - but the
>    Netty developers were not aware of the trade-off (despite it being
>    described in the javadoc - but that’s understandable, as it’s pretty
>    subtle). Of course, if they just worked with a shared segment, and
>    avoided handoff, things would have worked just fine (closing shared
>    segments is perfectly fine for long lived segments). In other words,
>    this is a case where, by featuring many different modes of
>    interacting with segments (confined, shared) as well as ways to go
>    back and forth between these states, we create extra complexity,
>    both for ourselves and for the user.
>
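> To illustrate the pattern that bites here (share() and handoff(Thread)
> quoted from memory), the costly idiom looks roughly like this:
>
>     import jdk.incubator.foreign.MemorySegment;
>
>     public class HandoffDemo {
>         public static void main(String[] args) {
>             // long-lived shared segment, visible to all threads
>             MemorySegment shared = MemorySegment.allocateNative(1024).share();
>
>             // anti-pattern: re-confining near every use; each handoff has to close
>             // the shared segment first, which is a very expensive operation
>             MemorySegment confined = shared.handoff(Thread.currentThread());
>             // ... access confined ...
>             MemorySegment sharedAgain = confined.share();
>         }
>     }
>
> Accessing the shared segment directly, without the handoff round trip,
> avoids that cost entirely.
>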
> I’ve been thinking quite a bit about these issues, trying to find a 
> more stable position in the design space. While I can’t claim to have 
> found a 100% solution, I think I might be onto something worth 
> exploring. On a recent re-read of the C# Span API doc [1], it dawned 
> on me that there is a sibling abstraction to the Span abstraction in 
> C#, namely Memory [2]. While some of the reasons behind the Span vs. 
> Memory split have to do with stack vs. heap allocation (e.g. Span can 
> only be used for local vars, not fields), and so not directly related 
> to our design choices, I think some of the concepts of the C# solution 
> hinted at a possibly better way to tackle the problem of memory access.
>
> We have known at least for the last 6 months that a MemorySegment is 
> playing multiple roles at once: a MS is both a memory allocation (e.g. 
> result of a malloc, or mmap), and a /view/ over said memory. This 
> dual role creates most of the problems listed above, as it’s clear 
> that, while close() is a method that should belong to an allocation 
> abstraction, it is less clear that close() should also belong to a 
> view-like abstraction. We have tried, in the past, to come up with a 
> 3-pronged design, where we had not only MemorySegment and 
> MemoryAddress, but also a MemoryResource abstraction from which /all/ 
> segments were derived. These experiments have failed, pretty much all 
> for the same reason: the return on complexity seemed thin.
>
> Recently, I found myself going back slightly to that approach, 
> although in a quite different way. Here’s the basic idea I’m playing 
> with:
>
>  * introduce a new abstraction: AllocationHandle (name TBD) - this
>    wraps an allocation, whether generated by malloc, mmap, or some
>    future allocator TBD (Jim’s QBA?)
>  * We provide many AllocationHandle factories: { confined, shared } x {
>    cleaner, no cleaner }
>  * AllocationHandle is thin: just has a way to get size, alignment and
>    a method to release memory - e.g. close(); in other words,
>    AllocationHandle <: AutoCloseable
>  * crucially, an AllocationHandle has a way to obtain a segment /view/
>    out of it (MemorySegment)
>  * a MemorySegment is the same thing it used to be, /minus/ the
>    terminal operations (close, handoff, … methods)
>  * we still keep all the factories for constructing MemorySegments out
>    of heap arrays and byte buffers
>  * there’s no way to go from a MemorySegment back to an AllocationHandle
>
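> In code, the idea might look roughly like this (every name below is
> hypothetical and just mirrors the bullets above):
>
>     import jdk.incubator.foreign.MemoryLayout;
>     import jdk.incubator.foreign.MemorySegment;
>
>     interface AllocationHandle extends AutoCloseable {
>         long byteSize();              // size of the underlying allocation
>         long byteAlignment();         // alignment of the underlying allocation
>         MemorySegment asSegment();    // obtain a (non-closeable) segment view
>         @Override void close();       // release the memory; terminal operation
>
>         // hypothetical factories: { confined, shared } x { cleaner, no cleaner }
>         static AllocationHandle allocateNativeConfined(MemoryLayout layout) {
>             throw new UnsupportedOperationException("sketch only");
>         }
>         static AllocationHandle allocateNativeShared(MemoryLayout layout) {
>             throw new UnsupportedOperationException("sketch only");
>         }
>     }
>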
> This approach solves quite a few issues:
>
>  * Since MemorySegment does not have a close() method, we don’t have to
>    worry about specifying what close() does in problematic cases
>    (slices, on-heap, etc.)
>  * There is an asymmetry between the actor which does an allocation
>    (the holder of the AllocationHandle) and the rest of the world,
>    which just deals with (non-closeable) MemorySegment - this seems to
>    reflect how memory is allocated in the real world (one actor
>    allocates, then shares a pointer to allocated memory to some other
>    actors)
>  * AllocationHandles come in many shapes and forms, but instead of
>    having dynamic state transitions, users will have to choose the
>    flavor they like ahead of time, knowing pros and cons of each
>  * This approach removes the need for access modes and restricted views
>    - we probably still need a readOnly property in segments to support
>    mapped memory, but that’s pretty much it
>
> Of course there are also things that can be perceived as disadvantages:
>
>  * Conciseness. Code dealing in native memory segments will have to
>    first obtain an allocation handle, then obtain a segment. For
>    instance, code like this:
>
>     try (MemorySegment s = MemorySegment.allocateNative(layout)) {
>         ...
>         MemoryAccess.getIntAtOffset(s, 42);
>         ...
>     }
>
> Will become:
>
>     try (AllocationHandle ah = AllocationHandle.allocateNativeConfined(layout)) {
>         MemorySegment s = ah.asSegment();
>         ...
>         MemoryAccess.getIntAtOffset(s, 42);
>         ...
>     }
>
>  * It would no longer be possible for the linker API to just allocate
>    memory and return a segment based on that memory - since now the
>    user cannot free that memory anymore (no close method in segments).
>    We could solve this either by having the linker API return an
>    allocation handle or, better, by having the linker API accept a
>    NativeScope where allocation should occur (since that’s how clients
>    are likely to interact with that API point anyway; the NativeScope
>    idiom is sketched after this list). In fact, we have already
>    considered doing something similar in the past (doing a malloc for
>    each struct returned by value is a performance killer in certain
>    contexts).
>
>  * At least in this form, we give up state transitions between confined
>    and shared. Users will have to pick which side of the world they
>    want to play in and stick with it. For simple lexically scoped use
>    cases, confined is fine and efficient - in more complex cases,
>    shared might be unavoidable. While handing off an entire
>    AllocationHandle is totally doable, doing so (e.g. killing an
>    existing AH instance to return a new AH instance confined on a
>    different thread) will also kill all segments derived from the
>    original AH. So it’s not clear such an API would be very useful: to
>    be able to do a handoff, clients will need to pass around an
>    AllocationHandle, not a MemorySegment (like now). Note that adding a
>    handoff operation directly on MemorySegment, under this design, is
>    not feasible: handoff is a terminal operation, so we would allow
>    clients to do nonsensical things like:
>
> 1. obtain a segment
> 2. create two identical segments via slicing
> 3. set the owner of the two segments to two different threads
>
> For this reason, it makes sense to think about ownership as a property 
> on the /allocation/, not on the /view/.
>
>  * While the impact of these changes on clients using the memory access
>    API directly is somewhat big (no TWR on heap/buffer segments, need
>    to go through an AllocationHandle for native stuff), clients of the
>    extracted API are largely unchanged, thanks to the fact that most
>    such clients use NativeScope anyway to abstract over how segments
>    are allocated.
>
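> For reference, the NativeScope idiom that the last two bullets lean on
> already looks like this (unboundedScope and allocate(MemoryLayout)
> quoted from memory), and it would be unaffected by the changes sketched
> above, since scope-allocated segments are never closed individually
> anyway:
>
>     import jdk.incubator.foreign.*;
>
>     public class ScopeDemo {
>         public static void main(String[] args) {
>             try (NativeScope scope = NativeScope.unboundedScope()) {
>                 // allocation goes through the scope; the segment itself is never closed
>                 MemorySegment point =
>                     scope.allocate(MemoryLayout.ofSequence(2, MemoryLayouts.JAVA_INT));
>                 MemoryAccess.setIntAtOffset(point, 0, 10);
>                 MemoryAccess.setIntAtOffset(point, 4, 20);
>             } // closing the scope frees everything allocated through it
>         }
>     }
>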
> Any thoughts? I think the first question is whether we’re ok 
> with the loss in conciseness, and with the addition of a new (albeit 
> very simple) abstraction.
>
> [1] - 
> https://docs.microsoft.com/en-us/dotnet/api/system.span-1?view=net-5.0
> [2] - 
> https://docs.microsoft.com/en-us/dotnet/standard/memory-and-spans/memory-t-usage-guidelines
>
>

