resource scopes and close actions

Thu Feb 3 11:31:20 UTC 2022

Thanks, this is informative.

Your approach seems generally correct. Since you create all the 
structures when you open the file (openInput method) it obviously makes 
sense for all the intermediate structs and arrays to share the same 
lifecycle/scope. Right now that is possible thanks to the scope() 
accessor; if we take that away it gets harder, I get that. But let's 
deal with that issue in a separate thread.

> static final MethodHandle avformat_open_input$FH = 
> Memory.downcall("avformat_open_input", 
> FunctionDescriptor.of(Memory.INT, Memory.POINTER, Memory.POINTER, 
> Memory.POINTER, Memory.POINTER));
>     public static AVFormatContext openInput(String url, AVInputFormat 
> fmt, HandleArray<AVDictionary> options, ResourceScope scope$) {
>         MemoryAddress ps;
>         int result$;
>         try (Frame frame$ = Memory.createFrame()) {
>             PointerArray ps$h = PointerArray.createArray(1, frame$);
>             result$ = 
> (int)avformat_open_input$FH.invokeExact((jdk.incubator.foreign.Addressable)Memory.address(ps$h), 
> (jdk.incubator.foreign.Addressable)frame$.copy(url), 
> (jdk.incubator.foreign.Addressable)Memory.address(fmt), 
> (jdk.incubator.foreign.Addressable)Memory.address(options));
>             ps = ps$h.get(0);
>             if (result$ == 0) {
>                 AVFormatContext res$ = AVFormatContext.create(ps, 
> scope$);
>                 scope$.addCloseAction(() -> close(ps));
>                 return res$;
>             }
>         } catch (Throwable t) {
>             throw new RuntimeException(t);
>         }
>         throw new RuntimeException("error="+result$);
>     }
>
>     public static void close(MemoryAddress s) {...}
>
> So it's only a minor inconvenience - it needs to keep another copy of 
> the MemoryAddress (ps) around, and it needs to have a static 'close' 
> function that takes a MemoryAddress rather than a static or 
> objectfunction that uses 'segment'.  The former is just a wasteful 
> copy and the latter pollutes the api a bit.  ffmpeg 
> avformat_close_input actually takes an AVFormatContext ** and so can 
> be invoked multiple times safely, so another explicit close function 
> would be useful as well but that's another matter.
>
So, this is the main factory for AVFormatContext, and it takes a scope - 
that seems to make sense to me. When the user closes the scope, some 
native function will be called to release the memory that has been 
allocated so far. I'm not sure I see the issues you mention though: with 
respect to duplication, you refer to the fact that you have to keep a 
MemoryAddress (local!) variable around? Not to dismiss the concern, but 
that doesn't seem _that_ terrible.

As for the close(MemoryAddress), yes, something like that would be 
needed, but why is it a public method? Couldn't it just be some private 
method somewhere? I guess I'm a bit fuzzy as to where your API 
boundaries are. It seems to me that you want your clients to deal in 
AVFormatContext, AVRational and friends, and you don't really want them 
to think (too much) about scope, segments and addresses - which I think 
is a fine move. But if that's the case, why having a 
close(MemoryAddress) in the API?

> Anyway i'm still trying to work out how to fit this all in, i'm 
> passing around scopes because MemorySegment.ofAddress() needs it and 
> now i'm trying to work out how to fit them to the lifecycle of 
> objects.   This was the first attempt at utilising the facilities for 
> some benefit.
I think having at least one scope does make sense. After all it is 
possible for a client to create an AVFormatContext, get an AVStream out 
of it and keep hold of it for too long (e.g. after the AVFormatContext 
has been released). In which case, if the client tries to access that 
AVStream you will have a JVM crash. If, on the other hand, the AVStream 
is backed by the same scope as the AVFormatContext, clients can't cause 
crashes in this particular way: once the AVFormatContext has been 
closed, nothing associated with its scope will be accessible. Which 
seems like an important invariant in your API? Also, I'm not 100% clear 
on how you handle writes; if the user sets a custom AVStream on your 
array, clearly you can have a problem if the lifecycle of the custom 
AVStream doesn't match that of the enclosing AVFormatContext (which 
seems to be another reason to keep track of lifecycles?)
>
> But it seems very difficult to get right and somewhat clumsy to use 
> and only provides a very marginal increase in live-ness safety because 
> you can always turn a MemoryAddress into a MemorySegment in another 
> scope.  And ideally it's not something  you'd expose to the api user.  
> With JNI there were other more straightforward mechanisms to manage 
> the java<>c lifecycle matching (e.g. nulling out the pointer), ones 
> that could be hidden from the library user.
It is true that if you have a MemoryAddress you can always turn it into 
something - but here you are writing an high-level API - again, I'm not 
100% sure as to what the boundaries of your API are. If your API wraps 
segments and addresses, then a user, ideally, will not be able to 
circumvent scopes by creating a segment backed by same address but a 
different scope. If, on the other hand, your struct abstractions also 
expose segments and addresses, there is a chance for users to break away 
from scope restrictions, but they will always need a restricted 
operation to do so. So, assuming your API is the only "privileged" part 
running in your system, the user can't mess with scopes, even if it can 
sees the addresses.
>
> I'm starting to think of changing to use globalScope() and to mostly 
> pretend ResourceScope doesn't exist.  For handle-based apis (anonymous 
> pointers) like OpenCL it's already effectively the case anyway.

Of course that decision is up to you. Using globalScope() might be a 
fine alternative in many cases where native lifecycle is just too hard 
to track correctly. In this case, it seems to be that you have a clear 
point where the file is openened, so it seems like there's a clear 
"scope" associated to the whole set of structures you create, and it 
would be a bit sad to throw that info away. But it all depends on what 
are the requirements of your API, and how much you want protect clients 
from crashing when accessing already freed resources. Is temporal safety 
important to you? If it is, use MemorySegment and scopes and propagate 
them and check them as/where needed (and, of course, the API should not 
make it impossible to do so, but that's a separate discussion); if not, 
just use MemoryAddress and global scopes (in case you need to 
temporarily turn an address into a segment).

Maurizio

>
>
> On 2/2/22 21:20, Maurizio Cimadamore wrote:
>> Hi Michael,
>> I understand the concern. Running close actions _after_ the scope is 
>> closed (and so segments associated with it are inaccessible) is a 
>> sort of a forced move: you need to run the close actions when you are 
>> 100% sure that nobody else is accessing the scope. Otherwise you 
>> might have a race between a close action attempting to free some 
>> resource and another thread attempting to access it.
>>
>> The result is that, as you have noticed, adding close actions is 
>> mostly useful when working with APIs that turn raw MemoryAddress into 
>> segments manually (MemorySegment::ofAddress).
>>
>
>
>> Can you please share a problematic case with ffmpeg?
>>
>> Thanks
>> Maurizio
>>
>> On 02/02/2022 01:34, Michael Zucchi wrote:
>>>
>>> Morning,
>>>
>>> I was hoping to use resource scopes to auto-cleanup native-allocated 
>>> resources, but it looks like it's going to be a bit messy because 
>>> segments get closed before their close actions are invoked.
>>>
>>> Is this the intended behaviour or a side-effect of 'no particular 
>>> order'?
>>>
>>> For some api's it doesn't matter as they don't use api-allocated 
>>> public structures (e.g. opencl) and the same approach is fine using 
>>> MemoryAddress, but for others it's an issue (e.g. ffmpeg) where the 
>>> obvious mapping is to use a segment and varhandles.
>>>
>>> Probably as a workaround - if i want to have this feature at any 
>>> rate because i'm not sure if it's a good idea yet - i'll just save a 
>>> copy of the address before it's turned into a segment and pass that 
>>> to any close function instead.
>>>
>>> I've only really started looking at seeing how to 'scope' things 
>>> 'properly' but it feels a bit clumsy and sort of not worth it 
>>> because it has to be worked around all the time either with hacks 
>>> like this or resorting to simply using the global scope which 
>>> effectively does nothing.
>>>
>>>  Z
>>>
>>>
>