[foreign-memaccess] RFC: to scope or not to scope?

Tue Jun 4 10:08:33 UTC 2019

Thinking more about this - I think with (3) we have an issue. This is 
not an issue that is new in this proposal, but it's nevertheless a big 
point of contention, which I think might have been where some of the 
discomfort was coming from. I'll write a separate email about (3), to 
avoid polluting this thread.

Thanks
Maurizio

On 04/06/2019 01:59, Maurizio Cimadamore wrote:
> Summing up from the discussions in this thread, I hear 3 main concerns:
>
> 1) default read/write access of newly minted segments (John claims it 
> should be NO_ACCESS so that a thread has to explicitly acquire access)
>
> 2) when creating views (e.g. asConfined), we have to be mindful of 
> aliasing - e.g. the original segment will still be unrestricted
>
> 3) we'd like sub-regions to be created statelessly, but if we merge 
> scope and segment we can't do that
>
>
> I think (3) is a non-issue. We can still have _immutable_ segments, 
> which use some 'scope' object which is in charge of the liveness 
> check. Creating a subregion will create a new segment with different 
> bounds and the same scope object. So if we use inline classes, no 
> allocation is required here.
>
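To make the point about (3) concrete, here is a minimal sketch of an immutable segment that delegates its liveness check to a shared object. All names here (SharedScopeSketch, Liveness, Segment, slice, addressOf) are hypothetical and are not taken from the prototype; the sketch only assumes that bounds are fixed per segment and that liveness lives elsewhere.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: these types are illustrative, not the prototype's API.
final class SharedScopeSketch {

    /** The only mutable state: a liveness flag shared by a segment and its slices. */
    static final class Liveness {
        private final AtomicBoolean alive = new AtomicBoolean(true);

        void close() {
            alive.set(false);
        }

        void check() {
            if (!alive.get()) {
                throw new IllegalStateException("segment is closed");
            }
        }
    }

    /** Immutable segment: fixed bounds plus a reference to the shared liveness object. */
    static final class Segment {
        final long base;
        final long length;
        final Liveness scope;

        Segment(long base, long length, Liveness scope) {
            this.base = base;
            this.length = length;
            this.scope = scope;
        }

        /** Creates a sub-region: new bounds, same scope object, no state copied. */
        Segment slice(long offset, long newLength) {
            if (offset < 0 || newLength < 0 || offset > length - newLength) {
                throw new IndexOutOfBoundsException("slice out of bounds");
            }
            return new Segment(base + offset, newLength, scope);
        }

        /** Checks liveness and bounds, then returns the absolute address. */
        long addressOf(long offset) {
            scope.check();
            if (offset < 0 || offset >= length) {
                throw new IndexOutOfBoundsException("offset out of bounds");
            }
            return base + offset;
        }
    }

    public static void main(String[] args) {
        Liveness scope = new Liveness();
        Segment root = new Segment(1000, 64, scope);
        Segment slice = root.slice(16, 8);
        System.out.println(slice.addressOf(0)); // prints 1016
        scope.close();
        try {
            slice.addressOf(0);
        } catch (IllegalStateException e) {
            System.out.println("slice rejected after close");
        }
    }
}
```

Since Segment has only final fields, it is the kind of class that could plausibly become an inline class later; the one mutable bit stays in Liveness.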
> As for (2), aliasing is always a problem when creating restricted 
> views. Even MemorySegment::resize is problematic if the caller assumes 
> that from that point on, memory will only be accessible within those 
> bounds (it will not, there will still be a master segment which is 
> fully accessible in its whole initial bounds). So I'd file this in the 
> category of ByteBuffer::asReadOnlyBuffer - callers need to know that what 
> these methods do is to create a *view* with certain characteristics, 
> as opposed to change the underlying characteristics of the original 
> segment.
>
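The ByteBuffer analogy for (2) can be checked with the standard java.nio API. The small demo below (the class and method names are mine, for illustration only) shows that a read-only view rejects writes, while writes made through the original buffer remain visible through the view.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

final class ViewAliasingDemo {

    /** Returns true if a write made through the original is visible through the view. */
    static boolean writeThroughOriginalIsVisible() {
        ByteBuffer original = ByteBuffer.allocate(4);
        ByteBuffer view = original.asReadOnlyBuffer(); // shares the same content
        original.put(0, (byte) 42);                    // the original is still writable
        return view.get(0) == 42;
    }

    /** Returns true if the read-only view refuses a write. */
    static boolean viewRejectsWrites() {
        ByteBuffer view = ByteBuffer.allocate(4).asReadOnlyBuffer();
        try {
            view.put(0, (byte) 1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughOriginalIsVisible()); // prints true
        System.out.println(viewRejectsWrites());             // prints true
    }
}
```

The restriction belongs to the view only; anyone who still holds the original keeps full access.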
> As for (1), I believe the API should create segments which are 
> accessible with lowest guarantees, unless a client opts in for more. 
> This gives us a performance model that scales gracefully (you pay for 
> what you ask for). It's up to the client to make sure they're using 
> the API correctly to get what they want. I've given examples of 
> cooperative threads sharing different, confined subregions of the same 
> master region, and I think that use case of 'divide et impera' can be 
> implemented nicely with the API I proposed.
>
> Maurizio
>
>
> On 31/05/2019 21:44, Maurizio Cimadamore wrote:
>> Hi,
>> lately I've been thinking hard about the relationship between scopes 
>> and memory segments in the foreign-memaccess API. I think some of the 
>> decisions we made lately - e.g. making scopes either global (pinned) 
>> or closeable (and confined) - are good ones. But now I wonder, why do 
>> we need scopes in this API?
>>
>> A scope is useful to manage the lifecycle of memory resources; you 
>> create a scope, do some allocation within it, close the scope and the 
>> associated resources are gone. This is a good programming model for 
>> high level layers, but is it still a good model for low-level layers? 
>> Over the last few weeks I noticed a few things:
>>
>> * creating a scope is quite expensive, as there are several data 
>> structures associated with it
>>
>> * the ownership/parent mechanism is completely useless in this API, 
>> given that the scope parent-of relationship is really only useful 
>> when storing pointers from scope A into scope B, an operation that is 
>> not available in this API (since we have no pointers here)
>>
>> * the API forces you through quite a few hops to get to what you 
>> want; code like this is pretty idiomatic:
>>
>> try (MemoryScope scope = MemoryScope.globalScope().fork()) {
>>     MemorySegment segment = scope.allocate(32);
>>     MemoryAddress address = segment.baseAddress();
>>     ...
>> }
>>
>> * Several of the new MemorySegment sources just use the 'pinned' 
>> UNCHECKED scope - de-facto turning scope checks off
>>
>>
>> So, we have quite a complex API, which is used (in part) only when 
>> dealing with native memory; in the remaining case it's just noise, 
>> mostly.
>>
>> This set me off thinking... what if we brought some of the aspects of 
>> memory scopes into the memory segment itself (and then got rid of 
>> memory scopes)? That is, let's see if we can add the following 
>> things to a segment:
>>
>> * make it AutoCloseable - this way you can use a segment in a 
>> try-with-resources, as you would do with scopes
>>
>> * add an 'isAlive' state; a segment starts off in the 'alive' state 
>> and then, when it's closed, it goes into the 'closed' state - meaning 
>> that all addresses originating from it (as well as all the segments) 
>> will become invalid
>>
>> * add confinement, so that only the owner thread is allowed to call 
>> methods in the segment (e.g. to resize or close it)
>>
>> * add some methods to obtain a read-only segment, a confined one (one 
>> which only allows read/writes from owning thread), or a pinned one 
>> (one that cannot be closed)
>>
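The first three bullets above could be sketched roughly as follows. This is not the prototype: ConfinedSegmentSketch and its methods are hypothetical, and a byte[] stands in for native memory so the example stays self-contained. The sketch only assumes that every access checks both the owner thread and the alive state.

```java
// Hypothetical sketch of a closeable, thread-confined segment.
final class ConfinedSegmentSketch implements AutoCloseable {
    private final byte[] memory;                          // stand-in for native memory
    private final Thread owner = Thread.currentThread();  // the creating thread owns the segment
    private volatile boolean alive = true;

    ConfinedSegmentSketch(int size) {
        memory = new byte[size];
    }

    boolean isAlive() {
        return alive;
    }

    /** Rejects access from non-owner threads and access after close. */
    private void checkAccess() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("not the owner thread");
        }
        if (!alive) {
            throw new IllegalStateException("segment is closed");
        }
    }

    byte get(int offset) {
        checkAccess();
        return memory[offset];
    }

    void set(int offset, byte value) {
        checkAccess();
        memory[offset] = value;
    }

    @Override
    public void close() {
        checkAccess();
        alive = false;
    }

    /** Returns true if an access attempted from a fresh thread is rejected. */
    static boolean rejectedFromOtherThread(ConfinedSegmentSketch segment) {
        boolean[] rejected = { false };
        Thread other = new Thread(() -> {
            try {
                segment.get(0);
            } catch (IllegalStateException e) {
                rejected[0] = true;
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        ConfinedSegmentSketch segment = new ConfinedSegmentSketch(32);
        try (ConfinedSegmentSketch s = segment) {
            s.set(0, (byte) 7);
            System.out.println(s.get(0));                   // prints 7
            System.out.println(rejectedFromOtherThread(s)); // prints true
        }
        System.out.println(segment.isAlive());              // prints false
    }
}
```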
>> In this new world, the above code can be rewritten as follows:
>>
>> try (MemorySegment segment = MemorySegment.ofNative(32)) {
>>     MemoryAddress address = segment.baseAddress();
>>     ...
>> }
>>
>> This is more compact and goes straight to the point. I like how 
>> compact the API is - and I also like that now the API is very 
>> symmetric with the heap cases as well - for instance:
>>
>> try (MemorySegment segment = MemorySegment.ofArray(new byte[32])) {
>>     MemoryAddress address = segment.baseAddress();
>>     ...
>> }
>>
>> The only thing that changes is the resource declaration in the try 
>> with resources!
>>
>>
>> It's not all rosy of course; this choice has some consequences:
>>
>> * memory segments are no longer immutable; that was possible before, 
>> since the mutable state was confined into the scope which was then 
>> attached to the segment. Now it's the segment itself that is mutable 
>> (in the liveness bit). While this could make transition to value 
>> types harder, I don't think it's really a blocker - in reality we 
>> could implement a very similar trick where we push the mutable state 
>> somewhere else, and then the segment becomes immutable again. Also, 
>> in real world cases I expect clients will do some kind of pooling, 
>> allocating big segments and then returning small pinned sub-regions 
>> to clients (thus avoiding one system call per allocation). In such 
>> cases, there is only one master mutable segment - and a lot of small 
>> immutable ones, which is a happy case.
>>
>> * Thinking about what happens when you e.g. resize a region is a bit 
>> harder if you can close a region. Should closing a sub-region also 
>> close the one it comes from? Or should we throw an exception? Or 
>> should we do nothing? Or use reference counting? After reading the 
>> very good docs on Netty [1], I came to the conclusion that closing a 
>> derived segment should also result in the closure of the root one. 
>> This choice would of course not be a very good one for pooled 
>> sub-regions, but for this we can always use the pinning operation - 
>> that is: create a resized sub-region, pin it, and return it to the 
>> client. I think the combination of resizing + pinning gives quite a 
>> bit of power and I don't see any immediate need for doing something 
>> with reference counting.
>>
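As an illustration of the close semantics just described, the sketch below shares one liveness cell between a root segment and everything derived from it, so closing a derived segment closes the root, while a pinned segment refuses to be closed. The names (DerivedCloseSketch, allocate, pin) are hypothetical, and throwing UnsupportedOperationException on closing a pinned segment is my assumption; the text above only says a pinned segment cannot be closed.

```java
// Hypothetical sketch: close propagates to the root, pinned segments cannot be closed.
final class DerivedCloseSketch {

    static final class Segment implements AutoCloseable {
        private final boolean[] alive; // one cell shared by the root and all derived segments
        private final long offset;
        private final long length;
        private final boolean pinned;

        private Segment(boolean[] alive, long offset, long length, boolean pinned) {
            this.alive = alive;
            this.offset = offset;
            this.length = length;
            this.pinned = pinned;
        }

        static Segment allocate(long length) {
            return new Segment(new boolean[] { true }, 0, length, false);
        }

        /** Derives a sub-region that shares the liveness cell of this segment. */
        Segment resize(long newOffset, long newLength) {
            if (newOffset < 0 || newLength < 0 || newOffset > length - newLength) {
                throw new IndexOutOfBoundsException("sub-region out of bounds");
            }
            return new Segment(alive, offset + newOffset, newLength, pinned);
        }

        /** Returns a view with the same bounds that cannot be closed. */
        Segment pin() {
            return new Segment(alive, offset, length, true);
        }

        boolean isAlive() {
            return alive[0];
        }

        @Override
        public void close() {
            if (pinned) {
                // Assumption: closing a pinned segment is an error rather than a no-op.
                throw new UnsupportedOperationException("pinned segment cannot be closed");
            }
            alive[0] = false; // closes the root and every segment derived from it
        }
    }

    public static void main(String[] args) {
        Segment root = Segment.allocate(64);
        Segment pooled = root.resize(0, 16).pin(); // what a pool would hand to a client
        try {
            pooled.close();
        } catch (UnsupportedOperationException e) {
            System.out.println("client cannot close a pinned sub-region");
        }
        root.resize(16, 16).close();               // unpinned derived segment
        System.out.println(root.isAlive());        // prints false
    }
}
```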
>>
>> I then realized that the pointer scopes we have in foreign right now 
>> can in fact be implemented *cleanly* on top of this lower level 
>> memory segment mechanism. Here's a snippet of code which demonstrates 
>> how one would go about writing a PointerScope which allocates a slab 
>> of memory and returns portions of it to the clients:
>>
>>
>> class PointerScopeImpl implements PointerScope {
>>     long SEGMENT_SIZE = 64 * 1024;
>>
>>     List<MemorySegment> usedSegments;
>>     MemorySegment currentSegment;
>>     long offsetInSegment;
>>
>>     <X> Pointer<X> allocate(LayoutType<X> type) {
>>         MemorySegment segment = allocateInternal(
>>                 type.layout().bytesSize(), type.layout().alignmentBytes());
>>         return new BoundedPointer<X>(type, segment.baseAddress());
>>     }
>>
>>     ...
>>
>>     MemorySegment allocateInternal(long bytes, long align) {
>>         long size = bytes;
>>         if (offsetInSegment + size > SEGMENT_SIZE) {
>>             // The current slab cannot fit the request: retire it and
>>             // allocate a new one, big enough even for requests larger
>>             // than SEGMENT_SIZE.
>>             usedSegments.add(currentSegment);
>>             currentSegment =
>>                 MemorySegment.ofNative(Math.max(SEGMENT_SIZE, size), align);
>>             offsetInSegment = 0;
>>         }
>>         // Carve the requested bytes out of the current slab.
>>         MemorySegment segment = currentSegment.resize(offsetInSegment, size);
>>         offsetInSegment += size;
>>         return segment;
>>     }
>>
>>     void close() {
>>         currentSegment.close();
>>         usedSegments.forEach(MemorySegment::close);
>>     }
>> }
>>
>> That is, we can get the same functionality we have in Panama, 
>> essentially using segments as a way to get at the Unsafe allocation 
>> facilities. This seems pretty cool!
>>
>> I've put together a prototype of this approach (should apply cleanly 
>> on top of foreign-memaccess):
>>
>> http://cr.openjdk.java.net/~mcimadamore/panama/scope-removal/
>>
>> I was pleased at how the tests could be simplified with this 
>> approach. I was also pleased to see that the performance numbers took 
>> a significant jump forward, essentially bringing this within reach of 
>> raw Unsafe usage (but with the extra safety sprinkled on top).
>>
>> Concluding, this seems yet another of those cases where we were 
>> trying to conflate high-level concerns with lower-level concerns, and 
>> once we push everything into the right place of the stack, things seem 
>> to slot into a lower energy state. What do you think?
>>
>> Cheers
>> Maurizio
>>
>> [1] - https://netty.io/wiki/reference-counted-objects.html
>>
>>
>>
>>