library loading - continued

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon May 10 12:49:27 UTC 2021


Hi,
in my email [1] I described a plan to simplify the library loading 
mechanism based on the `LibraryLookup` abstraction. The motivations 
behind the restructuring were three-fold:

* The semantics of `LibraryLookup::ofDefault` are unstable: on some 
platforms, the default lookup leaks symbols from other loaded libraries; on 
others (e.g. Windows), implementing it required the use of debugging APIs.
* `LibraryLookup` has its own concept of lifecycle which is incompatible 
with `ResourceScope`, and also with that typically associated with JNI 
loaded libraries (whose lifecycle is associated with that of the owning 
classloader)
* `LibraryLookup` does not offer a migration path for JNI-based 
frameworks. That is, many frameworks are written with the assumption 
that `System::loadLibrary` will affect the set of available symbols 
_everywhere_ else (at least within the same classloader context)

For these reasons, we decided to remove `LibraryLookup` and to provide a 
more basic lookup capability instead, as a static method 
(`CLinker::findNative`). This method would look up symbols in libraries 
loaded by the classloader used by the caller; in doing so, the new 
method addresses the migration issues which affected `LibraryLookup`.

While writing this patch [2, 3], we realized that this new scheme was 
not without its own issues:

* Loading symbols in the standard C library is now incredibly difficult, 
sometimes requiring surprising workarounds [4, 5]
* There is no way to e.g. implement a lookup which filters the set of 
symbols available to users

After staring at these issues, we realized that having a lookup 
abstraction was not, in itself, the root cause of the problems we were 
seeing in [1]. These problems were, rather, caused by the fact that the 
lookup abstraction we had was introducing its own concept of library 
loading which had nothing to do with JNI library loading. In other 
words, `LibraryLookup`, despite its name, wasn't a *pure* lookup 
abstraction; it was attempting to do a little more (e.g. keeping 
libraries loaded using GC reachability), without too much success.

Having realized that, we now believe we can safely reintroduce some kind 
of lookup abstraction, in the form of `SymbolLookup`: a simple 
functional interface which, given a symbol name, gives us the address of 
that symbol (if available). The use of a more neutral name here 
(`SymbolLookup` instead of `LibraryLookup`) is deliberate: a lookup is 
not always the result of library loading; lookup objects might in fact be 
created (or composed) by users, using existing resources.
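
To make the "simple functional interface" idea concrete, here is a minimal sketch. The `SymbolLookup` type below is a local stand-in written for illustration (it returns a raw `long` wrapped in an `Optional`, whereas the real interface would return a native address type), and `ofTable` is a hypothetical helper showing a lookup created by a user from an existing resource, with no library loading involved.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the proposed interface (assumption: the real
// one would return a native address type rather than a raw long).
@FunctionalInterface
interface SymbolLookup {
    Optional<Long> lookup(String name);
}

final class TableLookupDemo {
    // Hypothetical helper: a lookup backed by an existing name/address table.
    static SymbolLookup ofTable(Map<String, Long> table) {
        return name -> Optional.ofNullable(table.get(name));
    }

    public static void main(String[] args) {
        // The address is made up for the example.
        SymbolLookup lookup = ofTable(Map.of("strlen", 0x1000L));
        System.out.println(lookup.lookup("strlen").isPresent()); // known symbol
        System.out.println(lookup.lookup("qsort").isPresent());  // unknown symbol
    }
}
```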

How do developers obtain a `SymbolLookup`? Two ways:

* They can obtain a `SymbolLookup` for a given class loader, which 
allows searching for symbols in all the libraries loaded by that class loader
* They can ask `CLinker` for a so-called *system lookup*: a lookup which 
allows searching for basic C symbols (such as `strlen` and `qsort`)

Note that the first mechanism allows us to solve the migration problem, 
by exposing the lookup abstraction associated with a specific class 
loader. This means frameworks will still be able to call 
`System::loadLibrary` to *side-effect* the results of a class 
loader-based symbol lookup. Note that, since `SymbolLookup` is a simple 
functional interface, it would be possible for developers to set up 
symbol lookup chains, e.g. where the lookup for the parent class loader is 
consulted first (following class loader delegation). This 
flexibility will certainly come in handy in real-world use cases.
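
A rough sketch of such composition follows. Again, `SymbolLookup` here is a local stand-in rather than the actual API, and the `thenTry` and `filter` combinators, as well as `ofTable`, are hypothetical names introduced only for this example. `thenTry` models a parent-first chain, while `filter` shows how a lookup could restrict the set of symbols visible to users.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical stand-in for the proposed interface (not the real API).
@FunctionalInterface
interface SymbolLookup {
    Optional<Long> lookup(String name);

    // Hypothetical combinator: consult this lookup first, then fall back to 'next'.
    default SymbolLookup thenTry(SymbolLookup next) {
        return name -> lookup(name).or(() -> next.lookup(name));
    }

    // Hypothetical combinator: expose only symbols whose names satisfy 'allowed'.
    default SymbolLookup filter(Predicate<String> allowed) {
        return name -> allowed.test(name) ? lookup(name) : Optional.<Long>empty();
    }
}

final class LookupChainDemo {
    // Hypothetical helper: a lookup backed by a name/address table.
    static SymbolLookup ofTable(Map<String, Long> table) {
        return name -> Optional.ofNullable(table.get(name));
    }

    public static void main(String[] args) {
        // Addresses are made up; "parent" and "child" mimic class loader delegation.
        SymbolLookup parent = ofTable(Map.of("foo", 1L));
        SymbolLookup child = ofTable(Map.of("foo", 2L, "bar", 3L));

        SymbolLookup chain = parent.thenTry(child);
        System.out.println(chain.lookup("foo")); // resolved by the parent
        System.out.println(chain.lookup("bar")); // falls through to the child

        SymbolLookup onlyFoo = chain.filter("foo"::equals);
        System.out.println(onlyFoo.lookup("bar").isPresent()); // hidden by the filter
    }
}
```

Because each combinator returns another `SymbolLookup`, chains and filters can be nested freely without the interface needing to know about class loaders or libraries.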

The second mechanism gives us something similar to the previous concept 
of *default lookup* - but without the messy bits: the system lookup is 
simply a system-dependent symbol lookup object, which might help in 
retrieving common C symbols. The API makes no promises as to *which* 
libraries will be consulted by this lookup (this is an implementation 
detail). That said, the implementation (unlike before) will not make use 
of `RTLD_DEFAULT`, whose semantics are brittle and cannot be implemented 
across all OSs.

It is possible that, in the future, we might add more ways to obtain a 
symbol lookup - for instance:

```
SymbolLookup.ofLibrary(String libName, ResourceScope scope)
```

This is not too different from what we had in the original 
`LibraryLookup` abstraction, and would allow developers to load a 
library and associate its lifecycle with a `ResourceScope` (rather than 
a class loader). That is, when the scope is closed, the library will be 
unloaded. However, adding this new mode will require some additional 
foundational work on the `CLinker` support, as we need to make sure 
that the library containing the address targeted by a downcall method 
handle cannot be unloaded while that handle is being invoked. This means 
that, at the very minimum, the linker will need to acquire/release the 
scope associated with the address of the native function, to prevent 
premature closing of said scope.
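
The acquire/release idea can be sketched as follows. This is only an illustration under stated assumptions: `SketchScope` is a hypothetical class, not `ResourceScope`, and `GuardedCall.invoke` stands in for whatever the linker would do around a downcall. The point is simply that a close attempt is rejected while a call is in flight, and succeeds once the call has released the scope.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical scope that counts in-flight users (not the real ResourceScope).
final class SketchScope {
    private static final int CLOSED = -1;
    private final AtomicInteger users = new AtomicInteger(0);

    // Register one user; fails if the scope has already been closed.
    void acquire() {
        int n;
        do {
            n = users.get();
            if (n == CLOSED) throw new IllegalStateException("scope already closed");
        } while (!users.compareAndSet(n, n + 1));
    }

    // Unregister one user; must pair with a successful acquire().
    void release() {
        users.decrementAndGet();
    }

    // Closing succeeds only when no user is registered.
    void close() {
        if (!users.compareAndSet(0, CLOSED)) {
            throw new IllegalStateException("scope is in use or already closed");
        }
        // A real implementation would unload the library here.
    }

    boolean isClosed() {
        return users.get() == CLOSED;
    }
}

final class GuardedCall {
    // Hypothetical wrapper: keep the scope alive for the duration of the call.
    static <T> T invoke(SketchScope scope, Supplier<T> call) {
        scope.acquire();
        try {
            return call.get();
        } finally {
            scope.release();
        }
    }
}
```

Whether a concurrent close should fail (as here) or wait for pending calls is a design choice this sketch does not try to settle.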

Summing up, we believe that while investigating a more minimal 
library lookup mechanism we have found a way to provide most (all?) of 
the functionality available before, in a more disciplined and 
compositional manner.

Cheers
Maurizio

[1] - https://mail.openjdk.java.net/pipermail/panama-dev/2021-April/013577.html
[2] - https://git.openjdk.java.net/panama-foreign/pull/526
[3] - https://git.openjdk.java.net/panama-foreign/pull/529
[4] - https://github.com/sundararajana/panama-foreign/blob/197db28d5097b1689b4befcd7129eb7af13f41c2/test/jdk/java/foreign/libStdLibTest.c
[5] - https://github.com/openjdk/panama-foreign/pull/527




