[foreign-abi] RFR: JDK-8243669: Improve library loading for Panama libraries

Thu Apr 30 01:19:36 UTC 2020

On 4/28/20 8:56 PM, Maurizio Cimadamore wrote:
>> The idea with a standard API for reference counting would be to offer 
>> a framework that could be used for any native resources, not just 
>> memory segments or libraries or whatever next Panama is going to 
>> decide is "important", and that could be shared across any number of 
>> native libraries that are often used together to manage their 
>> resources in a sane way. Just for reference, here's an example with 
>> OpenCV and TensorFlow using JavaCPP's PointerScope:
>> http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/
> 
> My take on scope is that they work generally well - but, to make them 
> fully safe, you have to start throwing in assumptions about thread 
> confinement (which is described in this proposal). Your pointer API has 
> explicit retain/release methods - if you call release() and the refCount 
> == 0 you deallocate. Right? (sorry if I got the library names wrong). 

Yes, that's the basic idea, but the point is that we can do everything 
via "scopes". We don't usually need to know that it's doing reference 
counting in the background, sort of like ARC in Swift.
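
To make the idea concrete, here is a minimal sketch in plain Java of 
scopes hiding the reference counting. The names `Counted` and `Scope` 
are made up for illustration and are not JavaCPP's actual classes; the 
"free" step is only a flag where a real implementation would call a 
native deallocator.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

public class ScopeSketch {
    /** Hypothetical stand-in for a native resource with a reference count. */
    static class Counted {
        private final AtomicInteger refCount = new AtomicInteger(0);
        private volatile boolean freed = false;

        void retain() {
            refCount.incrementAndGet();
        }

        /** Marks the resource as freed once the count drops to zero. */
        void release() {
            if (refCount.decrementAndGet() == 0) {
                freed = true; // assumption: a real version would deallocate here
            }
        }

        boolean isFreed() {
            return freed;
        }
    }

    /** Hypothetical scope: retains what is attached, releases it all on close(). */
    static class Scope implements AutoCloseable {
        private final Deque<Counted> attached = new ArrayDeque<>();

        <T extends Counted> T attach(T resource) {
            resource.retain();
            attached.push(resource);
            return resource;
        }

        @Override
        public void close() {
            // Release in reverse order of attachment.
            while (!attached.isEmpty()) {
                attached.pop().release();
            }
        }
    }

    public static void main(String[] args) {
        Counted shared = new Counted();
        Counted temp = new Counted();
        try (Scope outer = new Scope()) {
            outer.attach(shared);
            try (Scope inner = new Scope()) {
                inner.attach(shared); // count on shared is now 2
                inner.attach(temp);   // count on temp is 1
            }
            System.out.println("temp freed: " + temp.isFreed());     // true
            System.out.println("shared freed: " + shared.isFreed()); // false
        }
        System.out.println("shared freed: " + shared.isFreed());     // true
    }
}
```

The user code only ever sees try-with-resources blocks; retain() and 
release() are never called by hand.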

> So, how do you solve problems like this:
> 
> Thread A accessing a pointer while thread B is 'releasing' it, where A 
> is not well-behaved and did NOT perform a retain() ?

Right, it's not perfect, but I think these kinds of issues are solvable, 
if we're willing to spend time and work on them. For example, if 
something like `PointerScope` could be integrated into the Java language 
itself, we would be able to guarantee that what you describe above never 
happens, making everything thread-safe. I don't see any limitations in 
that regard, but I may be missing something. Could you provide an 
example that fails? Or is there just concern about the performance hit 
that could be incurred (in which case I'd still say "let's work on it")?
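
As a rough illustration of what "solvable" could look like in a library 
today, here is a sketch where the only way to touch the resource goes 
through a retain that fails once the count has reached zero. Everything 
here (`Guarded`, `tryRetain`, `read`) is hypothetical and not taken from 
JavaCPP or Panama; it assumes nobody calls release() more times than 
they retained, and it only works because the raw value is never handed 
out unguarded.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedAccessSketch {
    /** Hypothetical resource whose accessor retains around every read. */
    static class Guarded {
        // Starts at 1: the reference held by the owner.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private int value = 42; // stand-in for native memory

        /** Increments the count only if the resource is still alive. */
        boolean tryRetain() {
            while (true) {
                int n = refCount.get();
                if (n == 0) {
                    return false; // already released, refuse to resurrect it
                }
                if (refCount.compareAndSet(n, n + 1)) {
                    return true;
                }
            }
        }

        void release() {
            if (refCount.decrementAndGet() == 0) {
                value = -1; // stand-in for freeing the native memory
            }
        }

        /** A caller cannot forget to retain: the accessor does it. */
        int read() {
            if (!tryRetain()) {
                throw new IllegalStateException("already released");
            }
            try {
                return value;
            } finally {
                release();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Guarded g = new Guarded();
        System.out.println(g.read()); // 42, resource still alive

        Thread b = new Thread(g::release); // thread B drops the owner's reference
        b.start();
        b.join();

        try {
            g.read(); // thread A arrives too late
        } catch (IllegalStateException e) {
            System.out.println("A was refused: " + e.getMessage());
        }
    }
}
```

The compare-and-set on every access is the kind of performance cost I 
have in mind above.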

> So, in my mental model, refcounts, scopes are _tools_. If the _goal_ is 
> to write a safe API, these tools, alone, are not going to save the day. 
> A scope-like abstraction can of course be made to work, if you bring 
> together _other_ restrictions (e.g. pointers in a scope can only be used 
> by one thread). Assuming your API (and your clients) are ok with that 
> restriction, of course. Otherwise, we're basically discussing how to 
> build an _unsafe_ API, which is a much simpler problem and not 
> what the Foreign Memory Access API is trying to do.
> 
> 
> As for the claim that library and memory resources are in the same 
> league, I think even that claim is questionable. Memory can be short 
> lived - you allocate something on the stack, pass it to a function and 
> then clear the memory. But a library has (typically) a much longer 
> lifespan. So, while it is in general not great to rely on the GC to 
> auto-clean the memory allocated off-heap (there are many war stories as 
> to why this fails to scale at some point), I see very little gain in 
> adding a lot of complexity to allow for deterministic library unloading, 
> when the GC is probably going to do fine for such longer-lived objects - 
> at the same time avoiding the "same thread" restrictions, and providing 
> a guarantee that all native method handles derived from a library will 
> keep the library alive.

My point is that this can all be part of a standard API to deal with all 
these issues, *in one place*, instead of forcing users to come up with 
their own heuristics, over and over again, leaving us with a large 
number of mental models to deal with. JavaCPP also relies on the GC to 
clean things up, and as I mentioned in another thread, it even 
tries to call System.gc() and malloc_trim(0) a few times as a last-ditch 
effort to not throw OutOfMemoryError. Why not put that kind of thing in 
the JDK where everyone can agree on something that makes sense instead 
of having them come up with different heuristics for these things? 
That's what I think we should agree on first: not the technical details 
of what should ultimately be done, but that there is a need to 
standardize this, as has been done with varying 
levels of success in other languages like C++, Python, and Swift. If 
Java is different in that sense, could you explain why it needs to *not* 
provide a standard way of doing these kinds of things?

Samuel