resource scopes and close actions
Michael Zucchi
notzed at gmail.com
Sat Feb 5 09:16:33 UTC 2022
On 3/2/22 22:01, Maurizio Cimadamore wrote:
>
>> So it's only a minor inconvenience - it needs to keep another copy of
>> the MemoryAddress (ps) around, and it needs to have a static 'close'
>> function that takes a MemoryAddress rather than a static or
>> objectfunction that uses 'segment'. The former is just a wasteful
>> copy and the latter pollutes the api a bit. ffmpeg
>> avformat_close_input actually takes an AVFormatContext ** and so can
>> be invoked multiple times safely, so another explicit close function
>> would be useful as well but that's another matter.
>>
> So, this is the main factory for AVFormatContext, and it takes a scope
> - that seems to make sense to me. When the user closes the scope, some
> native function will be called to release the memory that has been
> allocated so far. I'm not sure I see the issues you mention though:
> with respect to duplication, you refer to the fact that you have to
> keep a MemoryAddress (local!) variable around? Not to dismiss the
> concern, but that doesn't seem _that_ terrible.
>
Well it's only semantically a local, physically it's copied to the
lambda handle. Like i said it's only a minor thing.
> As for the close(MemoryAddress), yes, something like that would be
> needed, but why is it a public method? Couldn't it just be some
> private method somewhere? I guess I'm a bit fuzzy as to where your API
> boundaries are. It seems to me that you want your clients to deal in
> AVFormatContext, AVRational and friends, and you don't really want
> them to think (too much) about scope, segments and addresses - which I
> think is a fine move. But if that's the case, why having a
> close(MemoryAddress) in the API?
>
Well the situation is a bit messier than i thought, as below.
>> Anyway i'm still trying to work out how to fit this all in, i'm
>> passing around scopes because MemorySegment.ofAddress() needs it and
>> now i'm trying to work out how to fit them to the lifecycle of
>> objects. This was the first attempt at utilising the facilities for
>> some benefit.
> I think having at least one scope does make sense. After all it is
> possible for a client to create an AVFormatContext, get an AVStream
> out of it and keep hold of it for too long (e.g. after the
> AVFormatContext has been released). In which case, if the client tries
> to access that AVStream you will have a JVM crash. If, on the other
> hand, the AVStream is backed by the same scope as the AVFormatContext,
> clients can't cause crashes in this particular way: once the
> AVFormatContext has been closed, nothing associated with its scope
> will be accessible. Which seems like an important invariant in your
> API? Also, I'm not 100% clear on how you handle writes; if the user
> sets a custom AVStream on your array, clearly you can have a problem
> if the lifecycle of the custom AVStream doesn't match that of the
> enclosing AVFormatContext (which seems to be another reason to keep
> track of lifecycles?)
Hmm yes I see the utility there, without a scope-thing you lose that
downstream tracking although it also means you have to be all in on it
for it to be effective. In this specific case possibly AVFormatContext
should have it's own shared scope and an explicit close function so it
could close that scope at the right time, rather than going the other
way around.
Sorry I got that part about writing streams completely wrong as it's
been a long time since i looked at it. It's a bit more complicated as
the close function you use depends on the constructor and potentially
later calls.
reading streams:
avformat_open_input() allocates/initialises struct AVFormatContext
you can read and modify some of the structures it creates
avformat_close_input() closes it.
or also reading streams (e.g. to provide custom i/o):
avformat_alloc_context()
modify some fields
avformat_open_input()
read/modify things
avformat_close_input()
writing:
avformat_alloc_contex()
modify some fields
avformat_new_stream()
lookup/modify some fields
avformat_write_header()
avformat_free()
So 'auto-closing' isn't quite straightforward anyway, not without some
higher level calls or state tracking. In the JNI version I used the
'opaque' user-data field in AVFormatContext to track which close
function should be used. Same approach could be done here I guess,
although it would need a copy of the MemoryAddress to be saved in any
constructors so the close function could access it in an onClose handler.
>>
>> But it seems very difficult to get right and somewhat clumsy to use
>> and only provides a very marginal increase in live-ness safety
>> because you can always turn a MemoryAddress into a MemorySegment in
>> another scope. And ideally it's not something you'd expose to the
>> api user. With JNI there were other more straightforward mechanisms
>> to manage the java<>c lifecycle matching (e.g. nulling out the
>> pointer), ones that could be hidden from the library user.
> It is true that if you have a MemoryAddress you can always turn it
> into something - but here you are writing an high-level API - again,
> I'm not 100% sure as to what the boundaries of your API are. If your
> API wraps segments and addresses, then a user, ideally, will not be
> able to circumvent scopes by creating a segment backed by same address
> but a different scope. If, on the other hand, your struct abstractions
> also expose segments and addresses, there is a chance for users to
> break away from scope restrictions, but they will always need a
> restricted operation to do so. So, assuming your API is the only
> "privileged" part running in your system, the user can't mess with
> scopes, even if it can sees the addresses.
Oh right, well I didn't really think about hiding them completely from
the api, I suppose that makes sense and maybe in some cases it will be
possible. In other cases they are just handles, so even if they are
hidden they are still represented by a MemoryAddress which are all in
the global scope so they can't go out of scope even if they are closed -
which will still potentially crash the jvm if they are passed to
function calls. See a couple of paragraphs below about those.
I don't really see what practical difference the restricted functions
make for a library - it's up to the application author and ultimately
the user as to which modules have native access and not the library.
MemorySegment and maybe MemoryAddress will have to be exposed for some
of the data structures for use and interoperation with other libraries.
Specifically for ffmpeg there's at least AVFrame { uint8_t *data[8]; },
which are memory buffers for pixels and samples (and could be any
primitive type not just ubyte) and the main input/output one is
interested in. While the jni version has some helpers to handle
conversion to/from java2d/javafx/java audio, it exposes a jni-allocated
ByteBuffer for other cases like opencl where you want to avoid big
copies. It's also got ref/unref of the data fields and some functions
will implicitly unref, so that might be tricky to get to work within the
confines of resource scopes. Possibly a case where it just doesn't
really fit the mechanism.
OpenCL (& Vulkan) does everything via handles (typed anonymous
pointers), so presumably represented by MemoryAddress. What approach to
use here to make them scope-safe? Have explicit scope tests before
using them in calls?
If one wanted a scope-protected MemoryAddress this seems to be the only
solution in the present api:
CLCommandQueue {
ResourceScope scope;
MemoryAddress handle;
void enqueueNDRangeKernel(...) {
try (ResourceScope scope = ResourceScope.newConfinedScope()) {
scope.keepAlive(this.scope);
enqueueNDRangeKernel$MH.invokeExact((Addressable)handle, ...);
} catch (IllegalStateException ex) {
}
}
}
Actually it's a bit worse because there will be handles (possibly in
different scopes) in the argument list and they would all need to be
checked in the same way.
So for practicality they would have to be MemorySegment (ideally
length=0) since they are always checked?
OpenCL adds another wrinkle here, it's possible to use generic getters
to retrieve handles to other objects, like retrieving the
cl_command_queue a given cl_event was written to. In the jni bindings I
did for opencl I used a global hash table to resolve handles to unique
java objects, that seems the only approach but it seems heavy. Side
note: MemoryAddress.hashCode() is very poor for hash tables since most
addresses have at least the 3 least significant bits set to 0 on a
64-bit machine.
!Z
More information about the panama-dev
mailing list