[foreign] Pulling the string on Scope

Thu Dec 20 21:32:21 UTC 2018

Maurizio Cimadamore schreef op 2018-12-20 21:55:
> On 20/12/2018 16:47, Jorn Vernee wrote:
>> Actually, I realized my idea doesn't work since it leaves the old 
>> pointer in a 'stale' state, where it's still attached to the old Scope 
>> and thus can't check it's liveliness accurately any more :/
>> 
> Heh - I was about to ask you about that :-)
>> 
>> I also had another idea that shifts the memory management 
>> responsibility from Scope to BoundedMemoryRegion, and gives each 
>> resource it's own memory region. In that case the resource does not 
>> have a reference to the scope, but uses the memory region to check 
>> it's liveliness, so transferring it to another scope doesn't have to 
>> effect the immutability. That solution seems to work, but it also does 
>> not solve the problem of leaking resources when putting them in a 
>> struct which I described. (That's the one additional test that fails, 
>> which I've just commented out for now)
> 
> How do you transfer a resource from a scope to another? Do you
> transfer its memory region from one scope to another?

Because the memory management is done by BoundedMemoryRegion, Scope is 
more or less just Set of Resource objects and an API to allocate them, 
so there is no transferring of memory regions between scopes. The 
transfer consists of removing the Resource from the Set<Resource> in one 
scope and adding it to the Set<Resource> in the other scope. And then 
Scope is also used to close() resources in bulk, which works nicely 
together with try-with-resources.

> I think that could be ok as long as there's, as you say, one memory 
> region per
> resource - but what if we want more resources to share the same
> region? Do we really want to allocate a new memory region every time
> we allocate a new pointer?

We don't allocate a new memory region every time. I have created a 
BoundedMemoryRegion subclass that can have another region as a 
dependency. So, for instance when accessing an element of an array I 
just create a BoundedMermoryRegion that is dependent on the 
BoundedMemoryRegion of the array, but not actually allocates any new 
native memory. If the array's BoundedMemoryRegion is closed, the 
dependent region will also no longer be alive.

> If the two concepts are so connected, then
> why do we have two abstractions (and not one) in the first place?

Well, we want Pointer to be a value type, but if BoundedMemoryRegion 
manages the memory it also needs an `isAlive` flag to make sure that 
dead memory is not being accessed or memory is being freed twice, so it 
needs to be mutable. The implementation we have right now also has 
multiple sub classes of BoundedMemoryRegion, which I've also expanded on 
further. This is also not possible with value types AFAIK. So, we can't 
just merge the two as far as I can see.

In a way we've 'cheated' the immutability of Pointer by having a box for 
the liveliness flag (which is mutable). Currently this is Scope, but 
I've changed it to BoundedMemoryRegion.

I will try and make an RFC for the draft patch tomorrow, or otherwise 
next week (there's still room for cleanup imho, but it's looking good so 
far).

Jorn

> Maurizio
> 
>> 
>> I will do some more testing of that solution and then submit a draft.
>> 
>> Jorn
>> 
>> Jorn Vernee schreef op 2018-12-20 14:54:
>>>> Transferring resources from one scope to another is something we are
>>>> actively looking into. In a way, we can model 'freeing' a resource 
>>>> as
>>>> moving it into the 'NullScope' (or BitBucketScope :-)).
>>>> 
>>>> Now, as I started to exploring that model more seriously, I found a
>>>> pretty serious blocker: the current Pointer/Scope API is designed
>>>> around two principles:
>>>> 
>>>> 1) Pointers are immutable, and value-ready (hint hint Valhalla :-))
>>>> 2) Obtaining the scope of a given Pointer is an O(1) operation
>>>> 
>>>> (2) is important: since liveness is a property of Scope and since we
>>>> can go in O(1) from Pointer to its owning Scope, we can, in O(1)
>>>> establish if a pointer is alive!
>>>> 
>>>> If we start supporting moving resources from one Scope to another, 
>>>> I'm
>>>> having hard time in figuring out how that can be achieved in an
>>>> efficient manner. If Pointer is immutable (as per (1)) that means we
>>>> can't simply change its owning scope - if we did that, it would be
>>>> impossible to migrate Pointer to values once Valhalla is ready.
>>>> 
>>>> That leaves us few options, one of which is that Pointer do not 
>>>> store
>>>> owning scope in a field, but instead Scopes keep them in tables. But
>>>> this gives up (2) - now determining the scope owning a resource is 
>>>> no
>>>> longer O(1), as there's at least a table lookup.
>>>> 
>>>> I still have not found a satisfactory way to resolve this tension.
>>> 
>>> So for 1), instead of `void transferTo(Scope scope)` we can use
>>> `Resource withScope(Scope scope)`? Then for the transfer we shallow
>>> copy the Resource object (_not_ the underlying memory region), and
>>> give it a new Pointer with the new Scope. The underlying memory 
>>> region
>>> is then transferred between the scopes (which are already mutable). 
>>> If
>>> all the Resource implementation will be VBCs/value types any ways
>>> what's one more copy?
>>> 
>>> I don't think anything changes w.r.t. 2) in that scenario.
>>> 
>>> I've experimented with this idea (rough draft [1]), but I don't quite
>>> have it working yet (seeing some crashes). I think this is up to the
>>> change in memory management strategy needed for NativeScope so that
>>> resources are transferable. There's probably some mistake there.
>>> 
>>> While making this I also ran into an inconsistency:
>>> 
>>>     @NativeStruct("[" +
>>>             "   u64(get=getPtr)(set=setPtr):i32" +
>>>             "](MyStruct)")
>>>     interface MyStruct extends Struct<MyStruct> {
>>>         Pointer<Integer> getPtr();
>>>         void setPtr(Pointer<Integer> ptr);
>>>     }
>>> 
>>>     public static void main(String[] args) {
>>>         Pointer<Integer> outside;
>>>         try(Scope s = Scope.newNativeScope()) {
>>>             MyStruct struct = s.allocateStruct(MyStruct.class);
>>>             struct.setPtr(s.allocate(NativeTypes.INT32));
>>>             outside = struct.getPtr();
>>>         }
>>>         int i = outside.get(); // should throw, because no longer in 
>>> scope.
>>>     }
>>> 
>>> The above example does not actually throw an exception. There is a
>>> solution for this problem for Callbacks; if you have a
>>> Pointer<Callback<...>> then when the pointer is de-referenced the
>>> callback is added to the Scope of the enclosing Pointer, and in the
>>> case of struct fields this just happens to be the same scope that the
>>> struct belongs to, but this seems like an unwanted side-effect. If 
>>> the
>>> Callback was not in the same scope as the Pointer we inadvertently 
>>> add
>>> it to the wrong scope. The right way to solve this seems to be to
>>> track the Scope of a Struct's field when setting it, and then setting
>>> it again when getting. But, we can't ask native code to do that, so
>>> maybe the only solution is to say that when you get a struct's field
>>> it does not belong to any scope, and the missing exception in the
>>> above example is intended behavior. Adding a resource field to the
>>> scope of the enclosing struct when getting seems to make sense for
>>> nested structs and fused arrays only, since you can say for sure that
>>> when the struct is freed so are it's contents.
>>> 
>>> Thoughts?
>>> 
>>> Jorn
>>> 
>>> [1] :
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/transferresources/webrev.01/ 
>>> Maurizio Cimadamore schreef op 2018-12-19 16:29:
>>>> On 19/12/2018 15:08, Jorn Vernee wrote:
>>>>> Hi,
>>>>> 
>>>>> I have been thinking for a while that there could be some 
>>>>> improvements made to the Scope API. (And I'm currently in the 
>>>>> process of debugging what seems to be a life-cycle problem)
>>>>> 
>>>>> * Scopes could optionally have names. If an exception is thrown 
>>>>> because a Resource's scope is closed, and don't know where this 
>>>>> scope/resource is coming from, you have to debug that. By allowing 
>>>>> scopes to have optional names, we can add additional debugging 
>>>>> information to a scope which can be displayed in the exception 
>>>>> message; "Scope is not alive" vs. "Scope 'MyScope' is not alive".
>>>> Not feeling too strongly on this...
>>>> 
>>>>> 
>>>>> * There should imho be a way to transfer resources between scopes. 
>>>>> I suggest adding a `void transferTo(Scope scope)` method to 
>>>>> Resource for this. Transferring resources between scopes could be 
>>>>> useful for instance in the case where a Resource is returned from a 
>>>>> native function and it's up to the caller to manage that resource. 
>>>>> Currently the resource is leaked, which makes sense when the return 
>>>>> value is not supposed to be managed by the caller, but for the 
>>>>> former case it's nice to be able to transfer the resource into some 
>>>>> other, non-leaked, scope. (From what I can tell the allocation 
>>>>> strategy of NativeScope would have to be changed to allow for 
>>>>> this).
>>>> 
>>>> Transferring resources from one scope to another is something we are
>>>> actively looking into. In a way, we can model 'freeing' a resource 
>>>> as
>>>> moving it into the 'NullScope' (or BitBucketScope :-)).
>>>> 
>>>> Now, as I started to exploring that model more seriously, I found a
>>>> pretty serious blocker: the current Pointer/Scope API is designed
>>>> around two principles:
>>>> 
>>>> 1) Pointers are immutable, and value-ready (hint hint Valhalla :-))
>>>> 2) Obtaining the scope of a given Pointer is an O(1) operation
>>>> 
>>>> (2) is important: since liveness is a property of Scope and since we
>>>> can go in O(1) from Pointer to its owning Scope, we can, in O(1)
>>>> establish if a pointer is alive!
>>>> 
>>>> If we start supporting moving resources from one Scope to another, 
>>>> I'm
>>>> having hard time in figuring out how that can be achieved in an
>>>> efficient manner. If Pointer is immutable (as per (1)) that means we
>>>> can't simply change its owning scope - if we did that, it would be
>>>> impossible to migrate Pointer to values once Valhalla is ready.
>>>> 
>>>> That leaves us few options, one of which is that Pointer do not 
>>>> store
>>>> owning scope in a field, but instead Scopes keep them in tables. But
>>>> this gives up (2) - now determining the scope owning a resource is 
>>>> no
>>>> longer O(1), as there's at least a table lookup.
>>>> 
>>>> I still have not found a satisfactory way to resolve this tension.
>>>> 
>>>> 
>>>>> 
>>>>> * boxing/unboxing code should take a Scope as an argument. When 
>>>>> returning values they can be put into this scope, or in the case of 
>>>>> Windows a copy of a struct might be needed for the duration of the 
>>>>> call. For upcalls the boxed arguments can also be added to this 
>>>>> scope. Each call will have 2 scopes, one for the duration of the 
>>>>> call and one for return values, which is 'leaked', and it's up to 
>>>>> the caller to manage this scope and it's resources. Which of the 
>>>>> two is passed to box/unbox depends on if we're doing an upcall or 
>>>>> downcall. So in abstract you'd have this;
>>>>> 
>>>>> ```
>>>>> try(var callScope = Scope.newNativeScope()) {
>>>>>     var returnScope = Scope.newNativeScope(); // leaked to caller
>>>>> 
>>>>>     unbox(callScope, ...);
>>>>> 
>>>>>     // do call
>>>>> 
>>>>>     box(returnScope, ...);
>>>>> }
>>>>> ```
>>>>> 
>>>>> Where for upcalls box and unbox are switched.
>>>> 
>>>> This is not related to the Scope API, but I agree that the code 
>>>> should
>>>> be more symmetric between upcalls and downcalls - while right now we
>>>> create the scope in the box routine only; I agree with your proposal
>>>> to add a scope parameter, and let these methods be fully symmetric.
>>>> 
>>>> 
>>>> Maurizio
>>>> 
>>>>> 
>>>>> What do you think about about Scope?
>>>>> 
>>>>> Thanks,
>>>>> Jorn