Usage of global scope by Linker is unsafe for pointers to `static` data in loaded library

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Fri Sep 27 09:20:04 UTC 2024


Thanks for the detailed explanation of the issue. This is very helpful.

The crux of the issue is this reinterpret call:

https://github.com/tree-sitter/java-tree-sitter/blob/078a1c2b7c7ad823850444c3ba35c47379c5bd46/src/main/java/io/github/treesitter/jtreesitter/Language.java#L44

This is going to assert that the lifetime of the provided segment is 
that of LIBRARY_ARENA, which is probably incorrect - at least in this 
specific instance. The segment being reinterpreted comes from a native 
call performed on some external native library - so it's fair to assume 
that whatever memory is being allocated there will probably not outlive 
the external library.

In other words, you have a situation with two lifetimes - you have the 
lifetime of the tree sitter library, call it lifetime L - and then you 
have the lifetime of an external library (where the language-specific 
parser is defined), call it E. The relationship between the two 
lifetimes is that E is *nested* in L. That is, L will generally outlive E.

The code you linked mixes E and L together. That is, when we obtain the 
language-specific parser, with lifetime E, we reinterpret it to lifetime 
L - this is incorrect, as L is *bigger* than E. As such, this will lead 
to a situation where the segment in E can be accessed (because it's now 
backed by L), even though it is no longer there.

There is also another issue which can arise when you have nested 
lifetimes: you probably don't want L to terminate *before* E either. 
This typically means setting up some sort of logic to make sure that one 
cannot "close" L while E is still around. In most cases though, this is 
not a problem, as L is an automatic lifetime, and so it will be kept 
around "long enough".

We know this problem very well. Lifetimes in general are described here:

https://cr.openjdk.org/~mcimadamore/panama/why_lifetimes.html

And a proposal to model dependent lifetimes is described here:

https://inside.java/2021/10/12/panama-scope-dependencies/

This proposal was never implemented as it was deemed too complex. But 
the degrees of freedom discussed in this proposal are all there in the 
problem you are facing.

So, enough "theory". How can this problem be addressed?

In the FFM lingo, each lifetime needs an arena. So, there is an arena 
for L (LIBRARY_ARENA). What we're missing is an arena for E. The "load" 
method takes a symbol lookup - but a lookup is an object that "consumes" 
an arena - and does not expose it directly. So, if the client wants the 
library to install cleanup action in E, it must be provide the library 
with the arena for that E. This might feel redudnant, but in part this 
is also how things are intended to work. An arena is a capability: the 
ability to allocate segments, reinterpret them and attach cleanup action 
is something that should be closely guarded by the owner of that arena.

For instance a non-cooperative client might install a never-terminating 
cleanup action on the arena - so it would be somewhat unfortunate if 
clients were able to attach cleanup actions on random segments for which 
they don't have an arena. When a client passes an arena to a library, it 
is effectively *transferring ownership* of that arena to the library. 
The library is now able to fully act as if on behalf of the client.

And, since in this case the client would be passing both a symbol lookup 
*and* an arena, it might be wise to check that the segment we get back 
from the lookup indeed has the same scope as that of the provided arena 
- if not we're in a case where there's a third lifetime that is neither 
L nor E - and I believe that's too much :-)

One possible way to avoid duplication of parameters being passed would 
be for "load" to take some function from Arena -> SymbolLookup, instead 
of providing a symbol lookup that is completely external. But, passing 
the arena seems to make general sense here, as you want that ownership 
transfer to occur here, so that the library can do what it needs to do 
to set up the language segment.

I hope this helps.

Maurizio


In general, the idiom where you have a method that receives a random 
segment and you do a reinterpret on it, attaching a cleanup action also 
looks suspicious. The method doesn't have "ownership" of the segment, so 
how do we make sure we're not really adding _two_ cleanup actions for 
the same segment?

I think the cleanest solution for what the API wants to do is:

* in "load", accept (as you suggest) an Arena as well

On 27/09/2024 00:19, some-java-user-99206970363698485155 at vodafonemail.de 
wrote:
>
> But only while using jtreesitter more I noticed this issue, that the 
> TSLanguage pointer actually points to static data, and if you unload 
> the library while you are still using it, the JVM crashes. It turns 
> out this is partially due to an incorrect (?) `reinterpret` call, but 
> also due to the Linker using the global scope for the result, see also 
> [3]. I am suggesting there now to not only take a SymbolLookup but 
> also the related Arena, to be able to set the Arena on the TSLanguage 
> pointer to fix its scope.
>
> So maybe this is really a niche use case, and it can be solved by also 
> separately providing the Arena (though having to provide it separately 
> is a bit inconvenient).
>
> As for why I am not simply using `Arena.global()` when loading the 
> parser library: I was thinking more about the case where you 
> accidentally not use the global Arena, and was hoping for it to still 
> be (relatively) safe when the Arena is closed too early, throwing an 
> exception in the worst case, but not crashing the JVM. And also a bit 
> similar to my previous message in 
> https://mail.openjdk.org/pipermail/jextract-dev/2024-August/001919.html, 
> I am still exploring ways to use a temporary file for the library and 
> be able to unload and delete it before JVM exit (therefore not use 
> `Arena.global()`).
>
> But you are right, some of my concerns were also a bit theoretical, 
> extending from the issues I actually encountered.
>
>
> I hope this topic did not cause too much distraction, and it would 
> likely be good to see if other use cases actually benefit from 
> JDK-8340641 (reinterpret with Scope) and JDK-8340642 (scope from 
> SymbolLookup) as well. Maybe they are really not needed, and users of 
> the foreign API can easily work around this or adjust their API 
> slightly to accommodate for this.
> Thanks nonetheless for investigating this and thinking about possible 
> solutions.
>


More information about the panama-dev mailing list