[foreign-memaccess] RFC: to scope or not to scope?

Tue Jun 4 10:33:38 UTC 2019

Maurizio Cimadamore wrote on 2019-06-04 02:59:
> Summing up from the discussions in this thread, I hear 3 main concerns:
> 
> 1) default read/write access of newly minted segments (John claims it
> should be NO_ACCESS so that a thread has to explicitly acquire access)
> 
> 2) when creating views (e.g. asConfined), we have to be mindful of
> aliasing - e.g. the original segment will still be unrestricted
> 
> 3) we'd like sub-regions to be created statelessly, but if we merge
> scope and segment we can't do that
> 
> 
> I think (3) is a non-issue. We can still have _immutable_ segments,
> which use some 'scope' object which is in charge of the liveness
> check. Creating a subregion will create a new segment with different
> bounds and the same scope object. So if we use inline classes, no
> allocation is required here.

Agreed.
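
A minimal sketch of that arrangement, to make the idea concrete. All names
here (SharedScope, Segment, slice, checkedAddress) are hypothetical and are
not taken from the foreign-memaccess prototype; addresses are plain longs so
the example stays self-contained.

```java
// Hypothetical illustration only: not the prototype's API.
final class SharedScope {
    // The single piece of mutable state, shared by every derived segment.
    private volatile boolean alive = true;

    boolean isAlive() { return alive; }

    void close() { alive = false; }
}

final class Segment {
    private final SharedScope scope; // shared, mutable liveness holder
    private final long base;         // immutable bounds
    private final long length;

    Segment(SharedScope scope, long base, long length) {
        this.scope = scope;
        this.base = base;
        this.length = length;
    }

    long length() { return length; }

    // Creating a sub-region updates no state: new bounds, same scope object.
    Segment slice(long offset, long newLength) {
        if (offset < 0 || newLength < 0 || offset > length - newLength) {
            throw new IndexOutOfBoundsException("slice out of bounds");
        }
        return new Segment(scope, base + offset, newLength);
    }

    // An access first checks liveness through the scope, then the bounds.
    long checkedAddress(long offset) {
        if (!scope.isAlive()) {
            throw new IllegalStateException("segment is closed");
        }
        if (offset < 0 || offset >= length) {
            throw new IndexOutOfBoundsException("offset out of bounds");
        }
        return base + offset;
    }
}
```

Closing the scope invalidates the master segment and every slice at once,
while each Segment stays immutable and could in principle become an inline
class.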

> As for (2), aliasing is always a problem when creating restricted
> views. Even MemorySegment::resize is problematic if the caller assumes
> that from that point on, memory will only be accessible within those
> bounds (it will not; there will still be a master segment which is
> fully accessible within its whole initial bounds). So I'd file this in
> the category of ByteBuffer::asReadOnlyBuffer - callers need to know
> that what these methods do is create a *view* with certain
> characteristics, as opposed to changing the underlying characteristics
> of the original segment.

This makes sense to me as far as API goes.

I was wondering whether we could effectively optimize around this
aliasing problem, or would instead have to conservatively assume that
full access can occur.
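
The ByteBuffer analogy shows why the conservative assumption is hard to
avoid. The sketch below uses only the standard java.nio API; the class and
method names (AliasingDemo and its two methods) are made up for the example.
A read-only view rejects writes made through the view, but a write through
the original buffer is still visible through it.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

final class AliasingDemo {
    // Writes through the original buffer and reads the same byte back
    // through a read-only view that was created beforehand.
    static byte writeThroughOriginal() {
        ByteBuffer original = ByteBuffer.allocate(4);
        ByteBuffer view = original.asReadOnlyBuffer();
        original.put(0, (byte) 42); // the view cannot prevent this write
        return view.get(0);         // the view observes it
    }

    // Confirms that the restriction applies to the view itself.
    static boolean viewRejectsWrites() {
        ByteBuffer view = ByteBuffer.allocate(4).asReadOnlyBuffer();
        try {
            view.put(0, (byte) 1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }
}
```

So holding a restricted view says nothing about what other aliases of the
same memory can do.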

> As for (1), I believe the API should create segments which are
> accessible with lowest guarantees, unless a client opts in for more.
> This gives us a performance model that scales gracefully (you pay for
> what you ask for). It's up to the client to make sure they're using
> the API correctly to get what they want. I've given examples of
> cooperative threads sharing different, confined subregions of the same
> master region, and I think that use case of 'divide et impera' can be
> implemented nicely with the API I proposed.

Thanks, this use case gives a good perspective on how you want
asConfined() to be used. I think this solves the problem of a rogue
thread X coming in and creating race conditions by assuming instead that
there are no rogue threads and that all threads are under the control of
the user. That is, the segment is not published for any thread to
access, but is created in private and accessed by controlled worker
threads. The user just has to make sure that all worker threads, and any
subsequent threads reading the finished work, go through a common
synchronization point.
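
A rough sketch of that pattern, using ByteBuffer slices as a stand-in for
confined sub-regions. The stand-in is an assumption made for the example:
ByteBuffer does not enforce confinement, so the discipline is purely by
convention here. The names DivideAndConquerDemo and fillInParallel are
hypothetical. Thread.join() plays the role of the common synchronization
point.

```java
import java.nio.ByteBuffer;

final class DivideAndConquerDemo {
    // Each worker fills its own disjoint slice of a privately created
    // buffer; the caller reads the result only after joining all workers.
    static byte[] fillInParallel(int workers, int chunk) {
        ByteBuffer master = ByteBuffer.allocate(workers * chunk);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            // Carve out the sub-region [i * chunk, (i + 1) * chunk).
            ByteBuffer dup = master.duplicate();
            dup.position(i * chunk);
            dup.limit((i + 1) * chunk);
            ByteBuffer region = dup.slice();
            byte tag = (byte) (i + 1);
            threads[i] = new Thread(() -> {
                for (int j = 0; j < region.capacity(); j++) {
                    region.put(j, tag);
                }
            });
            threads[i].start();
        }
        try {
            // Common synchronization point: join() makes every worker's
            // writes visible to the thread that reads the result.
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return master.array();
    }
}
```

The master buffer never escapes the method before the join, which is the
"created in private" part of the argument.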

Jorn

> Maurizio
> 
> 
> On 31/05/2019 21:44, Maurizio Cimadamore wrote:
>> Hi,
>> lately I've been thinking hard about the relationship between scopes 
>> and memory segments in the foreign-memaccess API. I think some of the 
>> decisions we made lately - e.g. making scopes either global (pinned) 
>> or closeable (and confined) - are good ones. But now I wonder, why do 
>> we need scopes in this API?
>> 
>> A scope is useful to manage lifecycle of memory resources; you create 
>> a scope, do some allocation within it, close the scope and the 
>> associated resources are gone. This is a good programming model for 
>> high level layers, but is it still a good model for low-level layers? 
>> Over the last few weeks I noticed a few things:
>> 
>> * creating a scope is quite expensive, as there are several data 
>> structures associated to it
>> 
>> * the ownership/parent mechanism is completely useless in this API, 
>> given that the scope parent-of relationship is really only useful when 
>> storing pointers from scope A into scope B, an operation that is not 
>> available in this API (since we have no pointers here)
>> 
>> * the API forces you through quite a few hops to get to what you 
>> want; code like this is pretty idiomatic:
>> 
>> try (MemoryScope scope = MemoryScope.globalScope().fork()) {
>>     MemorySegment segment = scope.allocate(32);
>>     MemoryAddress address = segment.baseAddress();
>>     ...
>> }
>> 
>> * Several of the new MemorySegment sources just use the 'pinned' 
>> UNCHECKED scope - de-facto turning scope checks off
>> 
>> 
>> So, we have quite a complex API, which is used (in part) only when 
>> dealing with native memory; in the remaining case it's just noise, 
>> mostly.
>> 
>> This set me off thinking... what if we brought some of the aspects of 
>> memory scope into the memory segment itself (and then got rid of 
>> memory scopes)? That is, let's see if we can add the following things 
>> to a segment:
>> 
>> * make it AutoCloseable - this way you can use a segment in a 
>> try-with-resources, as you would do with scopes
>> 
>> * add an 'isAlive' state; a segment starts off in the 'alive' state 
>> and then, when it's closed, it goes into the 'closed' state - meaning that 
>> all addresses originating from it (as well as all the segments) will 
>> become invalid
>> 
>> * add confinement, so that only the owner thread is allowed to call 
>> methods in the segment (e.g. to resize or close it)
>> 
>> * add some methods to obtain a read-only segment, a confined one (one 
>> which only allows read/writes from owning thread), or a pinned one 
>> (one that cannot be closed)
>> 
>> In this new world, the above code can be rewritten as follows:
>> 
>> try (MemorySegment segment = MemorySegment.ofNative(32)) {
>>     MemoryAddress address = segment.baseAddress();
>>     ...
>> }
>> 
>> This is more compact and goes straight to the point. I like how 
>> compact the API is - and I also like that now the API is very 
>> symmetric with the heap cases as well - for instance:
>> 
>> try (MemorySegment segment = MemorySegment.ofArray(new byte[32])) {
>>     MemoryAddress address = segment.baseAddress();
>>     ...
>> }
>> 
>> The only thing that changes is the resource declaration in the 
>> try-with-resources!
>> 
>> 
>> It's not all rosy of course; this choice has some consequences:
>> 
>> * memory segments are no longer immutable; that was possible before, 
>> since the mutable state was confined into the scope which was then 
>> attached to the segment. Now it's the segment itself that is mutable 
>> (in the liveness bit). While this could make transition to value types 
>> harder, I don't think it's really a blocker - in reality we could 
>> implement a very similar trick where we push the mutable state 
>> somewhere else, and then the segment becomes immutable again. Also, in 
>> real world cases I expect clients will do some kind of pooling, 
>> allocating big segments and then returning small pinned sub-regions to 
>> clients (thus avoiding one system call per allocation). In such cases, 
>> there is only one master mutable segment - and a lot of small 
>> immutable ones. Which is a happy case.
>> 
>> * Thinking about what happens when you e.g. resize a region is a bit 
>> harder if you can close a region. Should closing a sub-region also 
>> close the one it comes from? Or should we throw an exception? Or 
>> should we do nothing? Or use reference counting? After reading the 
>> very good docs on Netty [1], I came to the conclusion that closing a 
>> derived segment should also result in the closure of the root one. 
>> This choice would of course not be a very good one for pooled 
>> sub-regions, but for this we can always use the pinning operation - 
>> that is: create a resized sub-region, pin it, and return it to the 
>> client. I think the combination of resizing + pinning gives quite a 
>> bit of power and I don't see any immediate need for doing something 
>> with reference counting.
>> 
>> 
>> I then realized that the pointer scopes we have in foreign right now 
>> can in fact be implemented *cleanly* on top of this lower level memory 
>> segment mechanism. Here's a snippet of code which demonstrates how one 
>> would go about writing a PointerScope which allocates a slab of memory 
>> and returns portions of it to the clients:
>> 
>> 
>> class PointerScopeImpl implements PointerScope {
>>     long SEGMENT_SIZE = 64 * 1024;
>> 
>>     List<MemorySegment> usedSegments;
>>     MemorySegment currentSegment;
>>     long offsetInSegment;
>> 
>>     <X> Pointer<X> allocate(LayoutType<X> type) {
>>        MemorySegment segment = allocateInternal(
>>                type.layout().bytesSize(), type.layout().alignmentBytes());
>>        return new BoundedPointer<X>(type, segment.baseAddress());
>>     }
>> 
>>     ...
>> 
>>     MemorySegment allocateInternal(long bytes, long align) {
>>          long size = bytes;
>>          // start a new slab when the request does not fit in the current one
>>          if (offsetInSegment + size > SEGMENT_SIZE) {
>>              usedSegments.add(currentSegment);
>>              // a request larger than SEGMENT_SIZE gets a slab of its own size
>>              currentSegment = MemorySegment.ofNative(Math.max(SEGMENT_SIZE, size), align);
>>              offsetInSegment = 0;
>>          }
>>          // hand out a sub-region of the current slab
>>          MemorySegment segment = currentSegment.resize(offsetInSegment, size);
>>          offsetInSegment += size;
>>          return segment;
>>     }
>> 
>>     void close() {
>>         currentSegment.close();
>>         usedSegments.forEach(MemorySegment::close);
>>     }
>> }
>> 
>> That is, we can get the same functionality we have in Panama, 
>> essentially using segments as a way to get at the Unsafe allocation 
>> facilities. This seems pretty cool!
>> 
>> I've put together a prototype of this approach (should apply cleanly 
>> on top of foreign-memaccess):
>> 
>> http://cr.openjdk.java.net/~mcimadamore/panama/scope-removal/
>> 
>> I was pleased at how the tests could be simplified with this approach. 
>> I was also pleased to see that the performance numbers took a 
>> significant jump forward, essentially bringing this within reach of 
>> raw Unsafe usage (but with the extra safety sprinkled on top).
>> 
>> Concluding, this seems yet another of those cases where we were trying 
>> to conflate high-level concerns with lower-level concerns, and once we 
>> push everything into the right place of the stack, things seem to slot 
>> into a lower energy state. What do you think?
>> 
>> Cheers
>> Maurizio
>> 
>> [1] - https://netty.io/wiki/reference-counted-objects.html