Usage of global scope by Linker is unsafe for pointers to `static` data in loaded library
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Sep 27 09:20:04 UTC 2024
Thanks for the detailed explanation of the issue. This is very helpful.
The crux of the issue is this reinterpret call:
https://github.com/tree-sitter/java-tree-sitter/blob/078a1c2b7c7ad823850444c3ba35c47379c5bd46/src/main/java/io/github/treesitter/jtreesitter/Language.java#L44
This is going to assert that the lifetime of the provided segment is
that of LIBRARY_ARENA, which is probably incorrect - at least in this
specific instance. The segment being reinterpreted comes from a native
call performed on some external native library - so it's fair to assume
that whatever memory is being allocated there will probably not outlive
the external library.
In other words, you have a situation with two lifetimes - you have the
lifetime of the tree sitter library, call it lifetime L - and then you
have the lifetime of an external library (where the language-specific
parser is defined), call it E. The relationship between the two
lifetimes is that E is *nested* in L. That is, L will generally outlive E.
The code you linked mixes E and L together. That is, when we obtain the
language-specific parser, with lifetime E, we reinterpret it to lifetime
L - this is incorrect, as L is *bigger* than E. As such, this will lead
to a situation where the segment in E can be accessed (because it's now
backed by L), even though it is no longer there.
There is also another issue which can arise when you have nested
lifetimes: you probably don't want L to terminate *before* E either.
This typically means setting up some sort of logic to make sure that one
cannot "close" L while E is still around. In most cases though, this is
not a problem, as L is an automatic lifetime, and so it will be kept
around "long enough".
We know this problem very well. Lifetimes in general are described here:
https://cr.openjdk.org/~mcimadamore/panama/why_lifetimes.html
And a proposal to model dependent lifetimes is described here:
https://inside.java/2021/10/12/panama-scope-dependencies/
This proposal was never implemented as it was deemed too complex. But
the degrees of freedom discussed in this proposal are all there in the
problem you are facing.
So, enough "theory". How can this problem be addressed?
In the FFM lingo, each lifetime needs an arena. So, there is an arena
for L (LIBRARY_ARENA). What we're missing is an arena for E. The "load"
method takes a symbol lookup - but a lookup is an object that "consumes"
an arena - and does not expose it directly. So, if the client wants the
library to install cleanup action in E, it must be provide the library
with the arena for that E. This might feel redudnant, but in part this
is also how things are intended to work. An arena is a capability: the
ability to allocate segments, reinterpret them and attach cleanup action
is something that should be closely guarded by the owner of that arena.
For instance a non-cooperative client might install a never-terminating
cleanup action on the arena - so it would be somewhat unfortunate if
clients were able to attach cleanup actions on random segments for which
they don't have an arena. When a client passes an arena to a library, it
is effectively *transferring ownership* of that arena to the library.
The library is now able to fully act as if on behalf of the client.
And, since in this case the client would be passing both a symbol lookup
*and* an arena, it might be wise to check that the segment we get back
from the lookup indeed has the same scope as that of the provided arena
- if not we're in a case where there's a third lifetime that is neither
L nor E - and I believe that's too much :-)
One possible way to avoid duplication of parameters being passed would
be for "load" to take some function from Arena -> SymbolLookup, instead
of providing a symbol lookup that is completely external. But, passing
the arena seems to make general sense here, as you want that ownership
transfer to occur here, so that the library can do what it needs to do
to set up the language segment.
I hope this helps.
Maurizio
In general, the idiom where you have a method that receives a random
segment and you do a reinterpret on it, attaching a cleanup action also
looks suspicious. The method doesn't have "ownership" of the segment, so
how do we make sure we're not really adding _two_ cleanup actions for
the same segment?
I think the cleanest solution for what the API wants to do is:
* in "load", accept (as you suggest) an Arena as well
On 27/09/2024 00:19, some-java-user-99206970363698485155 at vodafonemail.de
wrote:
>
> But only while using jtreesitter more I noticed this issue, that the
> TSLanguage pointer actually points to static data, and if you unload
> the library while you are still using it, the JVM crashes. It turns
> out this is partially due to an incorrect (?) `reinterpret` call, but
> also due to the Linker using the global scope for the result, see also
> [3]. I am suggesting there now to not only take a SymbolLookup but
> also the related Arena, to be able to set the Arena on the TSLanguage
> pointer to fix its scope.
>
> So maybe this is really a niche use case, and it can be solved by also
> separately providing the Arena (though having to provide it separately
> is a bit inconvenient).
>
> As for why I am not simply using `Arena.global()` when loading the
> parser library: I was thinking more about the case where you
> accidentally not use the global Arena, and was hoping for it to still
> be (relatively) safe when the Arena is closed too early, throwing an
> exception in the worst case, but not crashing the JVM. And also a bit
> similar to my previous message in
> https://mail.openjdk.org/pipermail/jextract-dev/2024-August/001919.html,
> I am still exploring ways to use a temporary file for the library and
> be able to unload and delete it before JVM exit (therefore not use
> `Arena.global()`).
>
> But you are right, some of my concerns were also a bit theoretical,
> extending from the issues I actually encountered.
>
>
> I hope this topic did not cause too much distraction, and it would
> likely be good to see if other use cases actually benefit from
> JDK-8340641 (reinterpret with Scope) and JDK-8340642 (scope from
> SymbolLookup) as well. Maybe they are really not needed, and users of
> the foreign API can easily work around this or adjust their API
> slightly to accommodate for this.
> Thanks nonetheless for investigating this and thinking about possible
> solutions.
>
More information about the panama-dev
mailing list