Questions about the Hermetic Java project
David Holmes
david.holmes at oracle.com
Tue Jun 3 01:22:26 UTC 2025
On 3/06/2025 9:29 am, Jiangli Zhou wrote:
> On Sun, Jun 1, 2025 at 7:55 PM David Holmes <david.holmes at oracle.com> wrote:
>>
>> On 31/05/2025 7:20 am, Jiangli Zhou wrote:
>>> On Thu, May 29, 2025 at 11:54 PM David Holmes <david.holmes at oracle.com> wrote:
>>>>
>>>> On 30/05/2025 9:26 am, Jiangli Zhou wrote:
>>>>>
>>>>> I just thought of one more thing related to this discussion. Is there
>>>>> any concern if the implementation does not ignore JNI_OnLoad_L etc. when
>>>>> they are defined in an application's dynamically linked native libraries?
>>>>> Or is that unspecified behavior that is up to the implementation to decide?
>>>>
>>>> For Internal libraries or external? For external you have to follow the
>>>> spec - if both methods exist you only want to execute one of them.
>>>
>>> It's for the external (non-JDK) library that I'm a bit more cautious.
>>>
>>> In the existing code in JDK mainline,
>>> https://github.com/openjdk/jdk/blob/3cc630985d47be6ba4cf991698e999f17dbde203/src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java#L117,
>>> loadLibrary() first tries to find the built-in library using
>>> JNI_OnLoad_L symbol (L is the library name). When dlsym is called to
>>> find the symbol from the main process, any of the already loaded
>>> shared libraries are also searched, as described by the dlsym man page
>>> (included related part below).
>>>
>>> https://man7.org/linux/man-pages/man3/dlsym.3.html:
>>> RTLD_DEFAULT
>>> Find the first occurrence of the desired symbol using the
>>> default shared object search order. The search will
>>> include global symbols in the executable and its
>>> dependencies, as well as symbols in shared objects that
>>> were dynamically loaded with the RTLD_GLOBAL flag.
>>>
>>> I think it would be rare, but it is possible to construct such a case:
>>>
>>> There are user JNI libraries A and B, with B built as a dependency
>>> of A. A defines JNI_OnLoad_A and JNI_OnLoad. B defines JNI_OnLoad_B
>>> and JNI_OnLoad. When A is being loaded using loadLibrary(),
>>> loadLibrary() first tries to look up JNI_OnLoad_A, which is not found.
>>> A is then loaded dynamically, which causes B to be loaded implicitly
>>> as a dependency of A. Later, when loadLibrary() is called for B,
>>> JNI_OnLoad_B would be found and then called. This is existing
>>> behavior. I think it's unspecified behavior and we don't need to
>>> add any additional checks to prevent JNI_OnLoad_B from being called.
>>
>> That sounds like a significant design flaw to me. You can't specify that
>> JNI_OnLoad_L will only be called if L is statically linked, if the
>> existence of JNI_OnLoad_L is used to infer that L is statically linked!
>> I would expect libraries to have both versions of the OnLoad functions
>> to allow for them being statically or dynamically linked - which the
>> spec allows for by saying the alternate variant is ignored. But then the
>> JDK will execute the wrong method if it finds JNI_OnLoad_L in a
>> dynamically linked library.
>
> JNI_OnLoad_L is used to determine if a requested JNI native library is
> a built-in (statically linked) library. Thus, it can avoid the
> operation of explicit loading for the shared library, e.g. with dlopen
> on Linux. JNI_OnLoad_L is expected to provide the same implementation
> as JNI_OnLoad besides being used as an identifier of a built-in
> library, IIUC.
>
> In the scenario that I described in the previous message, when a JNI
> shared library is already implicitly loaded as a dependency of another
> native library, dlopen for explicitly loading the shared library is
> not necessary. From the implementation point of view, the code seems
> to have been doing the right thing since JDK 8. I did some search and
> found https://stackoverflow.com/questions/32302262/does-dlopen-re-load-already-loaded-dependencies-if-so-what-are-the-implication,
> which points to the following in the POSIX spec (latest
> https://pubs.opengroup.org/onlinepubs/9799919799/):
>
> "Only a single copy of an executable object file shall be brought into
> the address space, even if dlopen() is invoked multiple times in
> reference to the executable object file, and even if different
> pathnames are used to reference the executable object file."
>
> Then avoiding calling dlopen for the already loaded native library
> doesn't cause any undesired side effects.
>
> Perhaps we can update the JNI spec to include the "already loaded" JNI
> native libraries case, in addition to the built-in native libraries,
> regarding JNI_OnLoad_L. Also clarify the "these functions will be
> ignored" part in JNI spec for the dynamically linked libraries.
>
> "If a dynamically linked library defines JNI_OnLoad_L and/or
> JNI_OnUnload_L functions, these functions will be ignored."
>
> Thoughts?

The problem is, as I see it, that the spec assumes that if it finds the
JNI_OnLoad_L symbol then L must be a statically linked library. But that
ignores the case you highlight where L was implicitly dynamically loaded
as a dependency of another library. Hence the existence test for the
symbol is not sufficient to determine if L was statically linked.
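
To make the point concrete, here is a toy Java model of the existence test.
It is only a sketch: it makes no real dlopen/dlsym calls, the class and
method names are invented for illustration, and it assumes (as in the
scenario described above) that the symbols of an implicitly loaded
dependency become visible to the lookup.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only: no real dlopen/dlsym calls are made. */
final class BuiltinLookupModel {
    // Symbols reachable from the process-wide lookup (stand-in for what dlsym would search).
    private final Set<String> visibleSymbols = new HashSet<>();
    // Libraries that really are statically linked into the executable.
    private final Set<String> staticallyLinked = new HashSet<>();

    /** Models linking lib into the executable: its JNI_OnLoad_L symbol is visible. */
    void linkStatically(String lib) {
        staticallyLinked.add(lib);
        visibleSymbols.add("JNI_OnLoad_" + lib);
    }

    /** Models a dynamic load that also pulls in dependencies (ASSUMED to become visible). */
    void loadDynamically(String lib, String... dependencies) {
        visibleSymbols.add("JNI_OnLoad_" + lib);
        for (String dep : dependencies) {
            visibleSymbols.add("JNI_OnLoad_" + dep);
        }
    }

    /** The existence test: "JNI_OnLoad_L is present, so L must be built in". */
    boolean looksBuiltin(String lib) {
        return visibleSymbols.contains("JNI_OnLoad_" + lib);
    }

    /** Ground truth that the existence test cannot see. */
    boolean isReallyStatic(String lib) {
        return staticallyLinked.contains(lib);
    }

    public static void main(String[] args) {
        BuiltinLookupModel m = new BuiltinLookupModel();
        System.out.println(m.looksBuiltin("A"));  // false: A is not loaded yet
        m.loadDynamically("A", "B");               // B comes in as a dependency of A
        System.out.println(m.looksBuiltin("B") + " " + m.isReallyStatic("B")); // true false
    }
}
```

In this model B passes the existence test even though it was never
statically linked, which is exactly the misclassification at issue.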

If JNI_OnLoad_L and JNI_OnLoad were guaranteed to always do exactly the
same thing it would not make any practical difference, but surely that
is not always the case? I can certainly postulate the existence of a
library that only needs the "on load" hook for the statically linked
case, in which case invoking it when actually dynamically linked would
be incorrect and potentially harmful.
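
A second toy sketch, again in plain Java with invented names and no real
JNI entry points, shows what such a postulated library would look like to
a loader that chooses the hook purely from whether the JNI_OnLoad_L symbol
was found. The "static-only setup" behavior is a hypothetical example, not
something taken from an actual library.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: the hooks are plain Java methods, not real JNI entry points. */
final class HookChoiceModel {
    /** Hypothetical library L whose two load hooks intentionally differ. */
    static final class Library {
        final String name;
        final List<String> log = new ArrayList<>();

        Library(String name) {
            this.name = name;
        }

        // Stand-in for JNI_OnLoad: nothing special to do when dynamically linked.
        void onLoad() {
            log.add("JNI_OnLoad");
        }

        // Stand-in for JNI_OnLoad_L: setup that only makes sense when statically linked.
        void onLoadL() {
            log.add("JNI_OnLoad_" + name + " (static-only setup)");
        }
    }

    /** Chooses the hook purely from whether the JNI_OnLoad_L symbol was found. */
    static void runLoadHook(Library lib, boolean onLoadLSymbolFound) {
        if (onLoadLSymbolFound) {
            lib.onLoadL();
        } else {
            lib.onLoad();
        }
    }

    public static void main(String[] args) {
        Library b = new Library("B");
        // B was only loaded implicitly as a dependency, yet its symbol was found.
        runLoadHook(b, true);
        System.out.println(b.log); // [JNI_OnLoad_B (static-only setup)]
    }
}
```

Here the static-only hook runs for a library that is in fact dynamically
linked, which is the incorrect outcome described above.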

It seems we lack a way to know if a given library is truly statically
linked, or to be advised when a library is implicitly loaded as a
dependency. I am no expert on linking but I've been unable to locate any
information on programmatically determining these conditions.

If there is no real solution then documenting the problem may be all we
can do.

David
-----
> Thanks!
> Jiangli