Questions about the Hermetic Java project
David Holmes
david.holmes at oracle.com
Wed Jun 4 04:31:20 UTC 2025
On 4/06/2025 5:00 am, Jiangli Zhou wrote:
> On Mon, Jun 2, 2025 at 6:22 PM David Holmes <david.holmes at oracle.com> wrote:
>>
>> On 3/06/2025 9:29 am, Jiangli Zhou wrote:
>>> On Sun, Jun 1, 2025 at 7:55 PM David Holmes <david.holmes at oracle.com> wrote:
>>>>
>>>> On 31/05/2025 7:20 am, Jiangli Zhou wrote:
>>>>> On Thu, May 29, 2025 at 11:54 PM David Holmes <david.holmes at oracle.com> wrote:
>>>>>>
>>>>>> On 30/05/2025 9:26 am, Jiangli Zhou wrote:
>>>>>>>
>>>>>>> I just thought of one more thing related to the discussion. Any
>>>>>>> concern if the implementation does not ignore JNI_OnLoad_L etc. if
>>>>>>> they are defined in an application's dynamically linked native
>>>>>>> libraries? Or is that unspecified behavior, left to the
>>>>>>> implementation to decide?
>>>>>>
>>>>>> For Internal libraries or external? For external you have to follow the
>>>>>> spec - if both methods exist you only want to execute one of them.
>>>>>
>>>>> It's for the external (non-JDK) library that I'm a bit more cautious.
>>>>>
>>>>> In the existing code in JDK mainline,
>>>>> https://github.com/openjdk/jdk/blob/3cc630985d47be6ba4cf991698e999f17dbde203/src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java#L117,
>>>>> loadLibrary() first tries to find the built-in library using
>>>>> JNI_OnLoad_L symbol (L is the library name). When dlsym is called to
>>>>> find the symbol from the main process, any already loaded shared
>>>>> libraries are also searched, as described by the dlsym man page
>>>>> (relevant part included below).
>>>>>
>>>>> https://man7.org/linux/man-pages/man3/dlsym.3.html:
>>>>> RTLD_DEFAULT
>>>>> Find the first occurrence of the desired symbol using the
>>>>> default shared object search order. The search will
>>>>> include global symbols in the executable and its
>>>>> dependencies, as well as symbols in shared objects that
>>>>> were dynamically loaded with the RTLD_GLOBAL flag.
>>>>>
>>>>> I think it would be rare, but it is possible to construct such a case:
>>>>>
>>>>> There are user JNI libraries A and B, with B built as a dependency
>>>>> of A. A defines JNI_OnLoad_A and JNI_OnLoad. B defines JNI_OnLoad_B
>>>>> and JNI_OnLoad. When A is being loaded using loadLibrary(),
>>>>> loadLibrary() first tries to look up JNI_OnLoad_A, which is not found.
>>>>> A is then loaded dynamically, which causes B to be loaded implicitly
>>>>> as a dependency of A. Later when loadLibrary() is called for B,
>>>>> JNI_OnLoad_B would be found and then called. This is existing
>>>>> behavior. I think it's unspecified behavior and we don't need to
>>>>> add any additional checks to prevent JNI_OnLoad_B from being called.
>>>>
>>>> That sounds like a significant design flaw to me. You can't specify that
>>>> JNI_OnLoad_L will only be called if L is statically linked, if the
>>>> existence of JNI_OnLoad_L is used to infer that L is statically linked!
>>>> I would expect libraries to have both versions of the OnLoad functions
>>>> to allow for them being statically or dynamically linked - which the
>>>> spec allows for by saying the alternate variant is ignored. But then the
>>>> JDK will execute the wrong method if it finds JNI_OnLoad_L in a
>>>> dynamically linked library.
>>>
>>> JNI_OnLoad_L is used to determine if a requested JNI native library is
>>> a built-in (statically linked) library. Thus, it can avoid the
>>> operation of explicit loading for the shared library, e.g. with dlopen
>>> on Linux. JNI_OnLoad_L is expected to provide the same implementation
>>> as JNI_OnLoad besides being used as an identifier of a built-in
>>> library, IIUC.
>>>
>>> In the scenario that I described in the previous message, when a JNI
>>> shared library is already implicitly loaded as a dependency of another
>>> native library, dlopen for explicitly loading the shared library is
>>> not necessary. From the implementation point of view, the code seems
>>> to have been doing the right thing since JDK 8. I did some searching and
>>> found https://stackoverflow.com/questions/32302262/does-dlopen-re-load-already-loaded-dependencies-if-so-what-are-the-implication,
>>> which points to the following in the POSIX spec (latest
>>> https://pubs.opengroup.org/onlinepubs/9799919799/):
>>>
>>> "Only a single copy of an executable object file shall be brought into
>>> the address space, even if dlopen() is invoked multiple times in
>>> reference to the executable object file, and even if different
>>> pathnames are used to reference the executable object file."
>>>
>>> Then avoiding calling dlopen for the already loaded native library
>>> doesn't cause any undesired side effects.
>>>
>>> Perhaps we can update the JNI spec to include the "already loaded" JNI
>>> native libraries case, in addition to the built-in native libraries,
>>> regarding JNI_OnLoad_L. Also clarify the "these functions will be
>>> ignored" part in the JNI spec for the dynamically linked libraries.
>>>
>>> "If dynamically linked library defines JNI_OnLoad_L and/or
>>> JNI_OnUnload_L functions, these functions will be ignored."
>>>
>>> Thoughts?
>>
>> The problem is, as I see it, that the spec assumes that if it finds the
>> JNI_OnLoad_L symbol then L must be a statically linked library. But that
>> ignores the case you highlight where L was implicitly dynamically loaded
>> as a dependency of another library. Hence the existence test for the
>> symbol is not sufficient to determine if L was statically linked.
>
> Right.
>
>>
>> If JNI_OnLoad_L and JNI_OnLoad were guaranteed to always do exactly the
>> same thing it would not make any practical difference, but surely that
>> is not always the case? I can certainly postulate the existence of a
>> library that only needs the "on load" hook for the statically linked
>> case, in which case invoking it when actually dynamically linked would
>> be incorrect and potentially harmful.
>
> Hmmm, I haven't thought of such a case. David, could you please give a
> concrete example for the case?

I don't have a concrete example, that's why I just said I could
"postulate the existence" of such a case. :)

>>
>> It seems we lack a way to know if a given library is truly statically
>> linked, or to be advised when a library is implicitly loaded as a
>> dependency. I am no expert on linking but I've been unable to locate any
>> information on programmatically determining these conditions.
>
> I haven't found anything either.
>
>>
>> If there is no real solution then documenting the problem may be all we
>> can do.
>
> That sounds reasonable to me.
>
> I mentioned the new mailing list discussions to Ron, Alan and Magnus
> today during the hermetic Java meeting. They may have follow up
> thoughts.

Okay.

David
-----

> Thanks!
> Jiangli
>
>>
>> David
>> -----
>>
>>> Thanks!
>>> Jiangli
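
The library A / library B scenario discussed in this thread can be sketched as a toy simulation. This is only an illustration, not the JDK's NativeLibraries code: the `Process` class, its `dlopen` and `load_library` methods, and the `visible` set are hypothetical names invented here. It also assumes, as the scenario in the thread does, that symbols of an implicitly loaded dependency are visible to the process-wide symbol lookup; whether that holds on a real system depends on how the libraries were loaded (for example, the RTLD_GLOBAL condition in the dlsym man page excerpt).

```python
class Process:
    """Toy model of one process's dynamic-linking state (not a real linker)."""

    def __init__(self, shared_libs, static_symbols=()):
        # shared_libs: name -> {"symbols": exported names, "deps": [names]}
        self.shared_libs = shared_libs
        # Symbols that a process-wide, dlsym-style lookup can currently see.
        self.visible = set(static_symbols)
        self.loaded = []

    def dlopen(self, name):
        """Load a library and, implicitly, its dependencies (once each)."""
        if name in self.loaded:
            return
        self.loaded.append(name)
        lib = self.shared_libs[name]
        # ASSUMPTION: exported symbols become globally visible on load.
        self.visible |= lib["symbols"]
        for dep in lib["deps"]:
            self.dlopen(dep)

    def load_library(self, name):
        """Return (how the library was classified, which hook would run)."""
        hook = f"JNI_OnLoad_{name}"
        # The existence of JNI_OnLoad_<name> is the only "built-in" test.
        if hook in self.visible:
            return ("built-in", hook)
        self.dlopen(name)
        return ("dynamic", "JNI_OnLoad")


# Both libraries define both hooks; B is a dependency of A.
libs = {
    "A": {"symbols": {"JNI_OnLoad", "JNI_OnLoad_A"}, "deps": ["B"]},
    "B": {"symbols": {"JNI_OnLoad", "JNI_OnLoad_B"}, "deps": []},
}
proc = Process(libs)
first = proc.load_library("A")   # JNI_OnLoad_A not visible yet
second = proc.load_library("B")  # JNI_OnLoad_B visible via A's load
print(first)
print(second)
```

In this model B is never statically linked, yet the second call classifies it as built-in and selects JNI_OnLoad_B, which is the ambiguity raised in the thread: the presence of the symbol alone does not distinguish a statically linked library from one that was implicitly loaded as a dependency.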
More information about the build-dev mailing list