memory access - pulling all the threads

Tue Jan 26 11:03:32 UTC 2021

On 26/01/2021 07:36, Ty Young wrote:
> The basic idea behind a NativeAllocator makes sense, but is keeping the 
> current MemorySegment.close() and access modes out of the question? 
> Would it not be possible to introduce a free(MemorySegment) method to 
> this NativeAllocator interface, which MemorySegment.close() calls, so 
> that the MemorySegment abstraction may be marked as **not** alive but 
> the underlying memory may still be alive?

In the proposed design, single segments do not have a liveness bit of 
their own. Of course you can use an allocator which creates a new scope 
for each new segment and land exactly where you were before. As for 
free, it doesn't make sense on "all" allocators. I think whether 
segments are independently freeable, and how, is a detail of the 
allocator implementation.

As for adding close() to MemorySegment, I believe having MemorySegment 
(as well as other resource-like abstractions such as VaList) be 
AutoCloseable looks like a win in short examples where you have only a 
single segment - but it fails to scale to more complex examples where 
you have multiple segments (and other resources) whose life-cycle is 
shared. And when you get to the latter case, having a close() becomes 
problematic, because you are never quite sure what the side-effects of 
calling close() will be: will you close only the segment you are 
operating on? Or is that segment a slice from a bigger chunk, so that 
you will also end up killing 50 other segments?

The proposed API, by making the transition from segment to scope (many 
to one) explicit, makes it clearer that a segment is just a view that is 
attached to some other abstraction which will "release" the resources 
attached to the segment; this makes the segment API roughly half the 
size of what it was, and allows us to capture common traits of all 
managed resources in a single abstraction. Now, MemorySegment has a 
scope, VaList has a scope, LibraryLookup has a scope (although that is 
not exposed - yet), and if we tweak CLinker::upcallStub to return 
something other than a segment (e.g. a FunctionPointer), that will have 
a scope too.
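
To make the many-to-one relationship concrete, here is a rough sketch. 
It is a toy model, not the proposed API: ToyScope and ToyView are 
hypothetical names, and a plain byte[] stands in for native memory. The 
point is only that slices share the scope of the view they came from, and 
that the one place where liveness changes is the scope:

```java
import java.util.Objects;

/** Hypothetical stand-in for a scope: one liveness bit shared by many views. */
final class ToyScope implements AutoCloseable {
    private boolean alive = true;

    boolean isAlive() { return alive; }

    @Override
    public void close() { alive = false; }
}

/** Hypothetical stand-in for a segment: a view with no close() of its own. */
final class ToyView {
    private final byte[] bytes;   // stands in for the underlying memory
    private final int offset;
    private final int length;
    private final ToyScope scope;

    ToyView(byte[] bytes, int offset, int length, ToyScope scope) {
        Objects.checkFromIndexSize(offset, length, bytes.length);
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
        this.scope = scope;
    }

    ToyScope scope() { return scope; }

    /** A slice shares the scope of the view it was derived from. */
    ToyView slice(int from, int len) {
        Objects.checkFromIndexSize(from, len, length);
        return new ToyView(bytes, offset + from, len, scope);
    }

    /** Every access checks the shared scope, not the view itself. */
    byte get(int index) {
        if (!scope.isAlive()) throw new IllegalStateException("scope is closed");
        Objects.checkIndex(index, length);
        return bytes[offset + index];
    }
}
```

With this shape, closing via tail.scope().close() visibly acts on the 
scope, so it is no surprise that every other view attached to that scope 
stops working too.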

Of course, clients can implement closeable segments using the 
abstractions provided, e.g.

```
class CloseableSegment implements AutoCloseable {
    private final MemorySegment segment;

    private CloseableSegment(MemorySegment segment) {
        this.segment = segment;
    }

    // each closeable segment gets its own, fresh confined scope
    public static CloseableSegment make(long size) {
        return new CloseableSegment(
                MemorySegment.allocateNative(size, ResourceScope.ofConfined()));
    }

    // maybe sprinkle some accessors based on MemoryAccess ?

    @Override
    public void close() {
        segment.scope().close(); // closing the scope releases the segment
    }

    public MemorySegment toSegment() {
        return segment; // interop with linker API
    }
}
```

You can even implement handoff primitives like the ones we had using the 
proposed API (e.g. on handoff from A to B close the scope and return a 
new CloseableSegment that has a scope confined on B). So, the segment 
abstraction included in the proposal is more primitive, which is, I 
think, a good sign.

P.S.

Now that segments are simpler and do not have any real lifecycle logic, 
one might wonder why we don't allow users to define custom 
implementations of the MemorySegment interface. While that's possible in 
practice, I think most of the time it will run into trouble when 
integrating with var handles and method handles, which prefer exact type 
matches. That's the main reason why we have not relied on subtyping much 
for linker-related abstractions: it's easy to run into performance 
pitfalls, and there doesn't seem to be enough to gain in exchange.

Maurizio

>
>
> On 1/25/21 11:52 AM, Maurizio Cimadamore wrote:
>> Hi,
>> as you know, I've been looking at both internal and external feedback 
>> on usage of the memory access API, in an attempt to understand what 
>> the problems with the API are, and how to move forward. As discussed 
>> here [1], there are some things which work well, such as structured 
>> access, or the recent addition of shared segment support (the latter 
>> seems to have enabled a wide variety of experiments which allowed us 
>> to gather more feedback - thanks!). But there are still some issues to 
>> be resolved - which could be summarized as "the MemorySegment 
>> abstraction is trying to do too many things at once" (again, please 
>> refer to [1] for a more detailed description of the problems involved).
>>
>> In [1] I described a possible approach where every allocation method 
>> (MemorySegment::allocateNative and MemorySegment::mapFile) returns an 
>> "allocation handle", not a segment directly. The handle is the 
>> closeable entity, while the segment is just a view. While this 
>> approach is workable (and something very similar has indeed been 
>> explored here [2]), after implementing some parts of it, I was left 
>> not satisfied with how this approach integrates with respect to the 
>> foreign linker support. For instance, defining the behavior of 
>> methods such as CLinker::toCString becomes quite convoluted: where 
>> does the allocation handle associated with the returned string come 
>> from? If the segment has no pointer to the handle, how can the memory 
>> associated to the string be closed? What is the relationship between 
>> an allocation handle and a NativeScope? All these questions led me to 
>> conclude that the proposed approach was not enough, and that we 
>> needed to try harder.
>>
>> The above approach does one thing right: it splits memory segments 
>> from the entity managing allocation/closure of memory resources, thus 
>> turning memory segments into dumb views. But it doesn't go far enough 
>> in this direction; as it turns out, what we really want here is a way 
>> to capture the concept of the lifecycle that is associated to one or 
>> more (logically related) resources - which, unsurprisingly, is part 
>> of what NativeScope does too. So, let's try to model this abstraction:
>>
>> ```
>> interface ResourceScope extends AutoCloseable {
>>    void addOnClose(Runnable action); // adds a new cleanup action to this scope
>>    void close(); // closes the scope
>>
>>    // creates a confined resource scope
>>    static ResourceScope ofConfined() { ... }
>>    // creates a shared resource scope
>>    static ResourceScope ofShared() { ... }
>>    // creates a confined resource scope - managed by cleaner
>>    static ResourceScope ofConfined(Cleaner cleaner) { ... }
>>    // creates a shared resource scope - managed by cleaner
>>    static ResourceScope ofShared(Cleaner cleaner) { ... }
>> }
>> ```
>>
>> It's a very simple interface - you can basically add new cleanup 
>> actions to it, which will be called when the scope is closed; note 
>> that ResourceScope supports implicit close (via a Cleaner), or 
>> explicit close (via the close method) - it can even support both (not 
>> shown here).
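
A minimal sketch of the explicit-close half of this idea, assuming 
nothing beyond the two methods listed above. ToyResourceScope is a 
hypothetical name, not the proposed interface; the order in which cleanup 
actions run and the no-op on a second close() are assumptions made for the 
sketch, and confinement and Cleaner support are left out entirely:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy scope: collects cleanup actions and runs them once, when closed. */
final class ToyResourceScope implements AutoCloseable {
    private final List<Runnable> cleanups = new ArrayList<>();
    private boolean closed;

    /** Registers a cleanup action; rejected once the scope is closed. */
    void addOnClose(Runnable action) {
        if (closed) throw new IllegalStateException("scope already closed");
        cleanups.add(action);
    }

    boolean isAlive() { return !closed; }

    @Override
    public void close() {
        if (closed) return;          // assumption: closing twice is a no-op
        closed = true;
        for (Runnable action : cleanups) {
            action.run();            // assumption: registration order
        }
    }
}

final class ToyResourceScopeDemo {
    /** Attaches two pretend resources to one scope and records what happens. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (ToyResourceScope scope = new ToyResourceScope()) {
            scope.addOnClose(() -> log.add("free segment"));
            scope.addOnClose(() -> log.add("free upcall stub"));
            log.add("working");
        } // both cleanup actions run here
        return log;
    }
}
```

ToyResourceScopeDemo.run() returns the entries "working", "free segment", 
"free upcall stub", in that order: nothing is released until the scope 
itself is closed.
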
>>
>> Armed with this new abstraction, let's try to see if we can shine new 
>> light onto some of the existing API methods and abstractions.
>>
>> Let's start with heap segments - these are allocated using one of the 
>> MemorySegment::ofArray() factories; one of the issues with heap 
>> segments is that it doesn't make much sense to close them. In the 
>> proposed approach, this can be handled nicely: heap segments are 
>> associated with a _global_ scope that cannot be closed - a scope that 
>> is _always alive_. This clarifies the role of heap segments (and also 
>> of buffer segments) nicely.
>>
>> Let's proceed to MemorySegment::allocateNative/mapFile - what should 
>> these factories do? Under the new proposal, these methods should 
>> accept a ResourceScope parameter, which defines the lifecycle to 
>> which the newly created segment should be attached. If we want to 
>> still provide ResourceScope-less overloads (as the API does now) we 
>> can pick a useful default: a shared, non-closeable, cleaner-backed 
>> scope. This choice gives us essentially the same semantics as a byte 
>> buffer, so it would be an ideal starting point for developers coming 
>> from the ByteBuffer API who are trying to familiarize themselves with 
>> the new memory access API. Note that, when using these more compact factories, 
>> scopes are almost entirely hidden from the client - so no extra 
>> complexity is added (compared e.g. to the ByteBuffer API).
>>
>> As it turns out, ResourceScope is not only useful for segments, but 
>> it is also useful for a number of entities which need to be attached 
>> to some lifecycle, such as:
>>
>> * upcall stubs
>> * va lists
>> * loaded libraries
>>
>> The upcall stub case is particularly telling: in that case, we have 
>> decided to model an upcall stub as a MemorySegment not because it 
>> makes sense to dereference an upcall stub - but simply because we 
>> need to have a way to _release_ the upcall stub once we're done using 
>> it. Under the new proposal, we have a new, powerful option: the 
>> upcall stub API point can accept a user-provided ResourceScope which 
>> will be responsible for managing the lifecycle of the upcall stub 
>> entity. That is, we are now free to turn the result of a call to 
>> upcallStub into something other than a MemorySegment (e.g. a 
>> FunctionPointer?) w/o loss of functionality.
>>
>> Resource scopes are very useful to manage _groups_ of resources - 
>> there are in fact cases where one or more segments share the same 
>> lifecycle - that is, they need to be all alive at the same time; to 
>> handle some of these use cases, the status quo adds the NativeScope 
>> abstraction, which can accept registration of external memory segments 
>> (via the MemorySegment::handoff API). This use case is naturally 
>> handled by the ResourceScope API:
>>
>> ```
>> try (ResourceScope scope = ResourceScope.ofConfined()) {
>>    MemorySegment.allocateNative(layout, scope);
>>    MemorySegment.mapFile(... , scope);
>>    CLinker.upcallStub(..., scope);
>> } // release all resources
>> ```
>>
>> Does this remove the need for NativeScope ? Not so fast: NativeScope 
>> is used to group logically related resources, yes, but is also used 
>> as a faster, arena-based allocator - which attempts to minimize the 
>> number of system calls (e.g. to malloc) by allocating bigger memory 
>> blocks and then handing over slices to clients. Let's try to model 
>> the allocation-nature of a NativeScope with a separate interface, as 
>> follows:
>>
>> ```
>> @FunctionalInterface
>> interface NativeAllocator {
>>    MemorySegment allocate(long size, long align);
>>    default MemorySegment allocateInt(MemoryLayout intLayout, int value) { ... }
>>    default MemorySegment allocateLong(MemoryLayout longLayout, long value) { ... }
>>    ... // all allocation helpers in NativeScope
>> }
>> ```
>>
>> At first, it seems this interface doesn't add much. But it is quite 
>> powerful - for instance, a client can create a simple, malloc-like 
>> allocator, as follows:
>>
>> ```
>> NativeAllocator malloc = (size, align) ->
>>     MemorySegment.allocateNative(size, align, ResourceScope.ofConfined());
>>
>> ```
>>
>> This is an allocator which allocates a new region of memory on each 
>> allocation request, backed by a fresh confined scope (which can be 
>> closed independently). This idiom is in fact so common that the API 
>> allows clients to create these allocators in a more compact fashion:
>>
>> ```
>> NativeAllocator confinedMalloc =
>>     NativeAllocator.ofMalloc(ResourceScope::ofConfined);
>> NativeAllocator sharedMalloc =
>>     NativeAllocator.ofMalloc(ResourceScope::ofShared);
>> ```
>>
>> But other strategies are also possible:
>>
>> * arena allocation (e.g. the allocation strategy currently used by 
>> NativeScope)
>> * recycling allocation (a single segment, with given layout, is 
>> allocated, and allocation requests are served by repeatedly slicing 
>> that very segment) - this is a critical optimization in e.g. loops
>> * interop with custom allocators
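
To illustrate the arena strategy from the list above, here is a rough, 
self-contained sketch. It does not use the proposed NativeAllocator 
interface: ToyArena is a hypothetical name, a direct ByteBuffer stands in 
for the big native block, and sizes are ints rather than longs. Alignment 
is computed relative to the start of the block, which is a simplification; 
a real allocator would align actual addresses.

```java
import java.nio.ByteBuffer;

/** Toy arena: one up-front allocation, then cheap slices handed to clients. */
final class ToyArena {
    private final ByteBuffer block;
    private int top; // offset of the first free byte in the block

    ToyArena(int capacity) {
        this.block = ByteBuffer.allocateDirect(capacity); // the only real allocation
    }

    /** Returns a slice of {@code size} bytes; {@code align} must be a power of two. */
    ByteBuffer allocate(int size, int align) {
        if (size < 0 || align <= 0 || (align & (align - 1)) != 0) {
            throw new IllegalArgumentException("bad size or alignment");
        }
        int start = (top + align - 1) & -align; // round top up to a multiple of align
        if (start < 0 || start > block.capacity() - size) {
            throw new IllegalStateException("arena exhausted");
        }
        ByteBuffer view = block.duplicate();    // independent position and limit
        view.limit(start + size);
        view.position(start);
        top = start + size;
        return view.slice();                    // covers exactly [start, start + size)
    }

    /** Number of bytes consumed so far, including alignment padding. */
    int used() { return top; }
}
```

For example, on a fresh ToyArena, allocate(3, 1) followed by 
allocate(8, 8) places the second slice at offset 8, so used() is 16 
afterwards. A recycling allocator would look much the same, except that it 
would reset top to zero before each request.
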
>>
>> So, where would we accept a NativeAllocator in our API? Turns out 
>> that accepting an allocator is handy whenever an API point needs to 
>> allocate some native memory - so, instead of
>>
>> ```
>> MemorySegment toCString(String)
>> ```
>>
>> This is better:
>>
>> ```
>> MemorySegment toCString(String, NativeAllocator)
>> ```
>>
>> Of course, we need to tweak the foreign linker, so that in all 
>> foreign calls returning a struct by value (which require some 
>> allocation), a NativeAllocator prefix argument is added to the method 
>> handle, so that the user can specify which allocator should be used 
>> by the call; this is a straightforward change which greatly enhances 
>> the expressive power of the linker API.
>>
>> So, we are in a place where some methods (e.g. factories which create 
>> some resource) take an additional ResourceScope argument - and some 
>> other methods (e.g. methods that need to allocate native segments) 
>> take an additional NativeAllocator argument. Now, it would be 
>> inconvenient for the user to have to create both, at least in simple 
>> use cases - but, since these are interfaces, nothing prevents us from 
>> creating a new abstraction which implements _both_ ResourceScope 
>> _and_ NativeAllocator - in fact this is exactly the role of the 
>> already existing NativeScope!
>>
>> ```
>> interface NativeScope extends NativeAllocator, ResourceScope { ... }
>> ```
>>
>> In other words, we have retconned the existing NativeScope 
>> abstraction, by explaining its behavior in terms of more primitive 
>> abstractions (scopes and allocators). This means that clients can, 
>> for the most part, just create a NativeScope and then pass it 
>> whenever a ResourceScope or a NativeAllocator is required (which is 
>> what is already happening in all of our jextract examples).
>>
>> There are some additional bonus points of this approach.
>>
>> First, ResourceScope features some locking capabilities - e.g. you 
>> can do things like:
>>
>> ```
>> try (ResourceScope.Lock lock = segment.scope().lock()) {
>>    <critical operation on segment>
>> }
>> ```
>>
>> This allows clients to perform critical operations on a segment w/o 
>> worrying that the segment's memory will be reclaimed while in the 
>> middle of the operation. This solves the problem with async operations 
>> on byte buffers derived from shared segments (see [3]).
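
One way to picture the locking idea is a counter of outstanding locks 
that close() has to respect. The sketch below is a toy under stated 
assumptions, not the proposed API: ToyLockableScope is a hypothetical 
name, and having close() throw while locks are held (rather than, say, 
waiting for them to be released) is a choice made for this sketch only.

```java
/** Toy scope whose close() is refused while any lock on it is still held. */
final class ToyLockableScope implements AutoCloseable {
    private int locks;      // number of locks currently held
    private boolean closed;

    /** Handle for one acquired lock; closing it releases the lock. */
    final class Lock implements AutoCloseable {
        private boolean released;

        @Override
        public void close() {
            synchronized (ToyLockableScope.this) {
                if (!released) {   // releasing twice must not underflow the count
                    released = true;
                    locks--;
                }
            }
        }
    }

    synchronized Lock lock() {
        if (closed) throw new IllegalStateException("scope already closed");
        locks++;
        return new Lock();
    }

    synchronized boolean isAlive() { return !closed; }

    @Override
    public synchronized void close() {
        // assumption: fail fast instead of blocking until locks are released
        if (locks > 0) throw new IllegalStateException("scope is locked");
        closed = true;
    }
}
```

Inside a try (ToyLockableScope.Lock lock = scope.lock()) { ... } block, an 
attempt to close the scope fails and the scope stays alive; once the block 
exits, close() succeeds.
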
>>
>> Another bonus point is that the ResourceScope interface is completely 
>> segment-agnostic - in fact, we have now a way to describe APIs which 
>> return resources which must be cleaned by the user (or, implicitly, 
>> by the GC). For instance, it would be entirely reasonable to imagine, 
>> one day, the ByteBuffer API to provide an additional factory - e.g. 
>> allocateDirect(int size, ResourceScope scope) - which gives you a 
>> direct buffer attached to a given (closeable) scope. The same trick 
>> can probably be used in other APIs as well where implicit cleanup has 
>> been preferred for performance and/or safety reasons.
>>
>> tl;dr;
>>
>> The restacking described in this email enhances the Foreign Memory 
>> Access API in many different ways, and allows clients to approach the 
>> API in increasing degrees of complexity (depending on needs):
>>
>> * for a smoother transition, coming from the ByteBuffer API, users 
>> only have to swap ByteBuffer::allocateDirect with 
>> MemorySegment::allocateNative - not much else changes, no need to 
>> think about lifecycles (and ResourceScope); GC is still in charge of 
>> deallocation
>> * users that want tighter control over resources, can dive deeper and 
>> learn how segments (and other resources) are attached to a resource 
>> scope (which can be closed safely, if needed)
>> * for the native interop case, the NativeScope abstraction is 
>> retconned to be both a ResourceScope *and* a NativeAllocator - so it 
>> can be used whenever an API needs to know how to _allocate_ or which 
>> _lifecycle_ should be used for a newly created resource
>> * scopes can be locked, which allows clients to write critical 
>> sections in which a segment has to be operated upon w/o fear of it 
>> being closed
>> * the idiom described here can be used to e.g. enhance the ByteBuffer 
>> API and to add close capabilities there
>>
>> All the above requires very few changes to the clients of the 
>> memory access API. The biggest change is that a MemorySegment no 
>> longer supports the AutoCloseable interface, which is instead moved 
>> to ResourceScope. While this can get a little more verbose in case 
>> you need a single segment, the code scales _a lot_ better in case you 
>> need multiple segments/resources. Existing clients using 
>> jextract-generated APIs, on the other hand, are not affected much, 
>> since they are mostly dependent on the NativeScope API, which this 
>> proposal does not alter (although the role of a NativeScope is now 
>> retconned to be allocator + scope).
>>
>> You can find a branch which implements some of the changes described 
>> above (except the changes to the foreign linker API) here:
>>
>> https://github.com/mcimadamore/panama-foreign/tree/resourceScope
>>
>> While an initial javadoc of the API described in this email can be 
>> found here:
>>
>> http://cr.openjdk.java.net/~mcimadamore/panama/resourceScope-javadoc_v2/javadoc/jdk/incubator/foreign/package-summary.html 
>>
>>
>>
>> Cheers
>> Maurizio
>>
>> [1] - 
>> https://mail.openjdk.java.net/pipermail/panama-dev/2021-January/011700.html
>> [2] - https://datasketches.apache.org/docs/Memory/MemoryPackage.html
>> [3] - 
>> https://mail.openjdk.java.net/pipermail/panama-dev/2021-January/011810.html
>>
>>