[foreign-memaccess] RFC: to scope or not to scope?

Fri May 31 20:44:42 UTC 2019

Hi,
lately I've been thinking hard about the relationship between scopes and 
memory segments in the foreign-memaccess API. I think some of the 
decisions we made lately - e.g. making scopes either global (pinned) or 
closeable (and confined) - are good ones. But now I wonder, why do we 
need scopes in this API?

A scope is useful to manage the lifecycle of memory resources; you create 
a scope, do some allocation within it, close the scope and the associated 
resources are gone. This is a good programming model for high level 
layers, but is it still a good model for low-level layers? Over the last 
few weeks I noticed a few things:

* creating a scope is quite expensive, as there are several data 
structures associated with it

* the ownership/parent mechanism is completely useless in this API, 
given that the scope parent-of relationship is really only useful when 
storing pointers from scope A into scope B, an operation that is not 
available in this API (since we have no pointers here)

* the API forces you through quite a few hops to get to what you 
want; code like this is pretty idiomatic:

try (MemoryScope scope = MemoryScope.globalScope().fork()) {
     MemorySegment segment = scope.allocate(32);
     MemoryAddress address = segment.baseAddress();
     ...
}

* several of the new MemorySegment sources just use the 'pinned' 
UNCHECKED scope - de facto turning scope checks off

So, we have quite a complex API, which is used (in part) only when 
dealing with native memory; in the remaining cases it's mostly just noise.

This set me off thinking... what if we brought some aspects of memory 
scopes into the memory segment itself (and then got rid of memory 
scopes)? That is, let's see if we can add the following things to a 
segment:

* make it AutoCloseable - this way you can use a segment in a 
try-with-resources, as you would do with scopes

* add an 'isAlive' state; a segment starts off in the 'alive' state and 
then, when it's closed, it goes into the 'closed' state - meaning that 
all addresses originating from it (as well as all the segments derived 
from it) will become invalid

* add confinement, so that only the owner thread is allowed to call 
methods in the segment (e.g. to resize or close it)

* add some methods to obtain a read-only segment, a confined one (one 
which only allows reads/writes from the owning thread), or a pinned one 
(one that cannot be closed)

In this new world, the above code can be rewritten as follows:

try (MemorySegment segment = MemorySegment.ofNative(32)) {
     MemoryAddress address = segment.baseAddress();
     ...
}

This is more compact and goes straight to the point. I like how compact 
the API is - and I also like that now the API is very symmetric with the 
heap cases as well - for instance:

try (MemorySegment segment = MemorySegment.ofArray(new byte[32])) {
     MemoryAddress address = segment.baseAddress();
     ...
}

The only thing that changes is the resource declaration in the 
try-with-resources!
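
To make the liveness and confinement rules listed above a bit more 
concrete, here is a minimal toy model in plain Java. It is only a sketch: 
ToySegment and its methods are made-up names, it sits on top of a heap 
ByteBuffer rather than native memory, and it is not the prototype's 
MemorySegment. It only shows where the checks would sit.

```java
import java.nio.ByteBuffer;

// Hypothetical toy model, NOT the prototype's MemorySegment: it only mimics
// the liveness and confinement checks, on top of a heap ByteBuffer.
final class ToySegment implements AutoCloseable {
    private final ByteBuffer buffer;
    private final Thread owner = Thread.currentThread(); // creating thread owns it
    private boolean alive = true;                        // the mutable liveness bit

    private ToySegment(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    static ToySegment ofSize(int bytes) {
        return new ToySegment(ByteBuffer.allocate(bytes));
    }

    boolean isAlive() {
        return alive;
    }

    // Every access first checks confinement, then liveness.
    private void checkAccess() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("accessed outside the owner thread");
        }
        if (!alive) {
            throw new IllegalStateException("segment is closed");
        }
    }

    byte get(int offset) {
        checkAccess();
        return buffer.get(offset);
    }

    void put(int offset, byte value) {
        checkAccess();
        buffer.put(offset, value);
    }

    @Override
    public void close() {
        checkAccess();
        alive = false; // all later accesses through this segment now fail
    }
}
```

Used in a try-with-resources, the segment is alive inside the block and 
every access after the block throws, which is the behaviour the bullets 
above ask for.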

It's not all rosy of course; this choice has some consequences:

* memory segments are no longer immutable; that was possible before, 
since the mutable state was confined to the scope, which was then 
attached to the segment. Now it's the segment itself that is mutable (in 
the liveness bit). While this could make the transition to value types 
harder, I don't think it's really a blocker - in reality we could 
implement a very similar trick where we push the mutable state somewhere 
else, and then the segment becomes immutable again. Also, in real world 
cases I expect clients will do some kind of pooling, allocating big 
segments and then returning small pinned sub-regions to clients (thus 
avoiding one system call per allocation). In such cases, there is only 
one master mutable segment - and a lot of small immutable ones, which is 
a happy case.

* Thinking about what happens when you e.g. resize a region is a bit 
harder if you can close a region. Should closing a sub-region also close 
the one it comes from? Or should we throw an exception? Or should we do 
nothing? Or use reference counting? After reading the very good docs on 
Netty [1], I came to the conclusion that closing a derived segment 
should also result in the closure of the root one. This choice would of 
course not be a very good one for pooled sub-regions, but for this we 
can always use the pinning operation - that is: create a resized 
sub-region, pin it, and return it to the client. I think the combination 
of resizing + pinning gives quite a bit of power and I don't see any 
immediate need for doing something with reference counting.
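
As a rough illustration of the two points above, the sketch below keeps 
the liveness flag in a token shared by a root and everything derived from 
it, so each view object is itself immutable, closing any unpinned view 
kills the whole family, and a pinned view refuses to be closed. All names 
(SharedView, root, pin) are invented for the example, no memory is 
actually allocated, and two behaviours are assumptions of mine: a pinned 
view throws on close(), and views derived from a pinned view stay pinned.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical toy model: only offsets, lengths and liveness, no real memory.
final class SharedView implements AutoCloseable {
    private final AtomicBoolean alive; // shared by the root and all derived views
    private final long offset;
    private final long length;
    private final boolean pinned;

    private SharedView(AtomicBoolean alive, long offset, long length, boolean pinned) {
        this.alive = alive;
        this.offset = offset;
        this.length = length;
        this.pinned = pinned;
    }

    static SharedView root(long length) {
        return new SharedView(new AtomicBoolean(true), 0, length, false);
    }

    // Derives a sub-region; it shares the liveness token of its parent.
    SharedView resize(long newOffset, long newLength) {
        if (newOffset < 0 || newLength < 0 || newOffset + newLength > length) {
            throw new IndexOutOfBoundsException("sub-region out of bounds");
        }
        return new SharedView(alive, offset + newOffset, newLength, pinned);
    }

    // Returns a view of the same region that cannot be closed by its holder.
    SharedView pin() {
        return new SharedView(alive, offset, length, true);
    }

    boolean isAlive() { return alive.get(); }
    long offset()     { return offset; }
    long length()     { return length; }

    @Override
    public void close() {
        if (pinned) {
            // Assumption: closing a pinned view is an error rather than a no-op.
            throw new UnsupportedOperationException("pinned views cannot be closed");
        }
        alive.set(false); // closes the root and every view derived from it
    }
}
```

A pool would hand out root.resize(off, len).pin() to its clients and keep 
the unpinned root to itself, so only the pool can release the memory.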

I then realized that the pointer scopes we have in foreign right now can 
in fact be implemented *cleanly* on top of this lower level memory 
segment mechanism. Here's a snippet of code which demonstrates how one 
would go about writing a PointerScope which allocates a slab of memory 
and returns portions of it to the clients:

class PointerScopeImpl implements PointerScope {
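     static final long SEGMENT_SIZE = 64 * 1024;

     // slabs that are full; kept around so that close() can release them
     List<MemorySegment> usedSegments = new ArrayList<>();
     MemorySegment currentSegment;   // slab currently being carved up
     long offsetInSegment;           // first free byte in currentSegment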

     <X> Pointer<X> allocate(LayoutType<X> type) {
        MemorySegment segment = allocateInternal(
                type.layout().bytesSize(), type.layout().alignmentBytes());
        return new BoundedPointer<X>(type, segment.baseAddress());
     }

     ...

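     MemorySegment allocateInternal(long bytes, long align) {
          // start a new slab if there is none yet, or if the request
          // does not fit in what is left of the current one
          if (currentSegment == null || offsetInSegment + bytes > SEGMENT_SIZE) {
              if (currentSegment != null) {
                  usedSegments.add(currentSegment);
              }
              // a slab holds SEGMENT_SIZE bytes, or the whole request if
              // that is larger
              currentSegment = MemorySegment.ofNative(
                      Math.max(SEGMENT_SIZE, bytes), align);
              offsetInSegment = 0;
          }
          MemorySegment segment = currentSegment.resize(offsetInSegment, bytes);
          offsetInSegment += bytes;
          return segment;
     }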

     void close() {
         if (currentSegment != null) {
             currentSegment.close();
         }
         usedSegments.forEach(MemorySegment::close);
     }
}

That is, we can get the same functionality we have in Panama, 
essentially using segments as a way to get at the Unsafe allocation 
facilities. This seems pretty cool!
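
One detail the snippet above glosses over is that the align argument is 
only passed along when a new slab is allocated, not used when computing 
the offset inside the slab. A possible way to handle that, sketched here 
with plain long arithmetic and hypothetical names (BumpOffsets, alignUp) 
that are not part of the prototype, is to round the offset up before 
carving out each block. The sketch assumes the alignment is a power of 
two and only does the bookkeeping; there is no memory behind it.

```java
// Hypothetical helper, not part of the prototype: bump-pointer bookkeeping
// for a single slab, with alignment-aware offsets.
final class BumpOffsets {
    private final long capacity; // size of the slab in bytes
    private long next = 0;       // first free byte in the slab

    BumpOffsets(long capacity) {
        this.capacity = capacity;
    }

    // Rounds offset up to the next multiple of align.
    // Assumption: align is a power of two.
    static long alignUp(long offset, long align) {
        return (offset + align - 1) & -align;
    }

    // Returns the start offset of the new block, or -1 if it does not fit
    // in this slab (the caller would then move on to a fresh slab).
    long allocate(long bytes, long align) {
        long start = alignUp(next, align);
        if (start + bytes > capacity) {
            return -1;
        }
        next = start + bytes;
        return start;
    }
}
```

In the PointerScope snippet, the returned offset would take the place of 
offsetInSegment in the call to resize; the rest of the logic is unchanged.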

I've put together a prototype of this approach (should apply cleanly on 
top of foreign-memaccess):

http://cr.openjdk.java.net/~mcimadamore/panama/scope-removal/

I was pleased at how the tests could be simplified with this approach. I 
was also pleased to see that the performance numbers took a significant 
jump forward, essentially bringing this within reach of raw Unsafe usage 
(but with the extra safety sprinkled on top).

To conclude, this seems yet another of those cases where we were trying 
to conflate high-level concerns with lower-level concerns, and once we 
push everything to the right place in the stack, things seem to slot 
into a lower energy state. What do you think?

Cheers
Maurizio

[1] - https://netty.io/wiki/reference-counted-objects.html