Hermetic Java project meeting

Magnus Ihse Bursie magnus.ihse.bursie at oracle.com
Thu Apr 25 16:28:16 UTC 2024


> Just to be more clear, that's with using `objcopy` to localize 
> non-exported symbols for all JDK static libraries and libjvm.a, not 
> just libjvm.a right?
Yes.
>
> Can you please include the compiler or linker errors on linux/clang?

It is a bit tricky. The problem arises at the partial linking step. The 
problem seems to arise out of how clang converts a request to link into 
an actual call to ld. I enabled debug code (printing the command line, 
and running clang with `-v` to get it to print the actual command line 
used to run ld) and ran it on GHA, where it worked fine. This is how it 
looks there:

WILL_RUN: /usr/bin/clang -v -m64 -r -o 
/home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o 
/home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m 
elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o 
/home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o 
-L/usr/bin/../lib/gcc/x86_64-linux-gnu/13 
-L/usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../lib64 
-L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu 
-L/usr/lib/../lib64 -L/usr/lib/llvm-14/bin/../lib -L/lib -L/usr/lib -r 
/home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o

In contrast, on my machine it looks like this:

WILL_RUN: 
/usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/clang -v 
-m64 -r -o 
/localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o 
/localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o
clang version 13.0.1
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
"/usr/bin/ld" --hash-style=both --eh-frame-hdr -m elf_x86_64 
-dynamic-linker /lib64/ld-linux-x86-64.so.2 -o 
/localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o 
/lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o 
/usr/lib/gcc/x86_64-linux-gnu/9/crtbegin.o 
-L/usr/lib/gcc/x86_64-linux-gnu/9 
-L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 
-L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu 
-L/usr/lib/../lib64 
-L/usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/../lib 
-L/lib -L/usr/lib -r 
/localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o 
-lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s 
--no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtend.o 
/lib/x86_64-linux-gnu/crtn.o
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
clang-13: error: linker command failed with exit code 1 (use -v to see 
invocation)

I don't understand what makes clang think it should include "-lgcc 
--as-needed -lgcc_s" and the crt*.o files when doing a partial link. In 
fact, the entire process by which clang (and gcc) builds up the linker 
command line is bordering on black magic to me. I think it can be 
affected by variables set at compile time (at least this was the case 
for gcc, last I checked), or maybe it picks up some kind of script from 
the environment. That's why I believe my machine could just be messed up.

I could get a bit further by passing "-nodefaultlibs" (or whatever it 
was), but then the generated .o files were messed up with respect to 
library symbols, and the final link of the static java launcher failed 
dramatically.
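
A minimal sketch of how the two driver logs above could be compared 
mechanically. It is written in Python purely for illustration, and the 
function name is hypothetical. The helper takes the ld command line 
printed by `clang -v` and lists the inputs that look out of place in a 
partial link: crt*.o startup objects and -l default libraries. It only 
inspects the text of the command and does not explain why the driver 
added them.

```python
import os
import re
import shlex

# Startup objects such as crt1.o, crti.o, crtbegin.o, crtend.o, crtn.o.
CRT_OBJECT = re.compile(r"^crt\w*\.o$")


def unexpected_partial_link_inputs(ld_command: str) -> list[str]:
    """List crt*.o files and -l options found in a partial-link ld command.

    `ld_command` is the quoted ld invocation as printed by `clang -v`.
    Line breaks inside the string are treated as ordinary whitespace.
    """
    args = shlex.split(ld_command)
    if "-r" not in args and "--relocatable" not in args:
        raise ValueError("not a partial link: no -r/--relocatable argument")
    # Skip args[0], which is the linker executable itself.
    return [
        arg
        for arg in args[1:]
        if arg.startswith("-l") or CRT_OBJECT.match(os.path.basename(arg))
    ]


if __name__ == "__main__":
    # Shortened version of the failing invocation shown above.
    failing = (
        '"/usr/bin/ld" --hash-style=both -o out.o '
        "/lib/x86_64-linux-gnu/crt1.o -L/usr/lib -r GC.o "
        "-lgcc --as-needed -lgcc_s --no-as-needed -lc "
        "/lib/x86_64-linux-gnu/crtn.o"
    )
    for item in unexpected_partial_link_inputs(failing):
        print(item)
```

Run against the GHA command line, the helper returns an empty list. Run 
against the failing one, it returns the crt*.o objects and the -lgcc, 
-lgcc_s and -lc entries.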

>
>     I have also tried to extract all the changes (and only the changes)
>     related to static build from the hermetic-java-runtime branch
>     (ignoring the JavaHome/resource loading changes), to see if I could
>     get something like StaticLink.gmk in mainline. I thought I was doing
>     quite fine, but after a while I realized my testing was botched since
>     the launcher had actually loaded the libraries dynamically instead,
>     even though they were statically linked. :-( I am currently trying to
>     bisect my way through my repo to understand where things went wrong.
>
>
> Did you run with `bin/javastatic`? The system automatically detects if 
> the binary contains statically linked native libraries and avoids 
> loading the dynamic libraries. Can you please share which test(s) ran 
> into the library loading issue? I'll see if I can reproduce the 
> problem that you are running into.

It was in fact not a problem. I was fooled by an error message. To be 
sure I was not loading any dynamically linked libraries, I removed the 
jdk/lib directory. Now the launcher failed, saying something like:

"Error: Cannot locate dynamic library libjava.dylib".

which was a bit of a jump scare.

However, it turned out that it actually tried to load lib/jvm.cfg, and 
failed in loading this (since I had removed the entire lib directory), 
and this failure caused the above error message to be printed. When I 
restored lib/jvm.cfg (but not any dynamic libraries), the launcher worked.

There are several bugs lurking here. First, the error message is 
incorrect and should be fixed. Second, a statically linked 
launcher has just a single JVM included and should not have to look for 
the lib/jvm.cfg file at all.

After looking around a bit in the launcher/jli code, my conclusion is 
that this code will need some additional care and loving attention to 
make it properly adjusted to function as a static launcher. We can't 
have a static launcher that tries to load a jvm.cfg file it does not 
need, and when it fails, complains that it is missing a dynamic library 
that it should not load.

I'll try to get this fixed as part of my efforts to get the static 
launcher into mainline.

>     This was done haphazardly in StaticLink.gmk in the
>     hermetic-java-runtime branch, where an arbitrary subset of external
>     libraries were hard-coded. Before integration in mainline can be
>     possible, this information needs to be collected correctly and
>     automatically for all included JDK libraries. Fortunately, it is not
>     likely to be too hard. I basically just need to store the information
>     from the LIBS provided to the NativeCompilation, and pick that up for
>     all static libraries we include in the static launcher. (A
>     complication is that we need to de-duplicate the list, and that some
>     libraries are specified using two words, like "-framework Application"
>     on macos, so it will take some care getting it right.)
>
>
> Right, currently the hermetic-java-runtime branch specifies a list of 
> hard-coded dependency libraries for linking. One of the goals of the 
> hermetic prototype was avoiding/reducing refactoring/restructuring the 
> existing code whenever possible. The reason is to reduce merge 
> overhead when integrating with new changes from the mainline. We can 
> do the proper refactoring and cleanups when getting the changes into 
> the mainline.

That is basically what I am doing right now. I am looking at your 
prototype and trying to reimplement this functionality properly so that 
it can be merged into mainline. The first step along that path was to 
actually get your prototype running.
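
The de-duplication of LIBS mentioned in the quoted text can be sketched 
as follows. This is only an illustration in Python (the real logic would 
live in the make-based build), and every name in it is hypothetical. It 
assumes that "-framework" is the only option that takes its argument as 
a separate word, and that keeping the first occurrence of each entry is 
acceptable. The second assumption may not hold where link order matters.

```python
# Options whose argument is a separate word (assumption: only -framework).
TWO_WORD_OPTIONS = {"-framework"}


def group_link_flags(flags: str) -> list[str]:
    """Split a LIBS string into entries, keeping two-word options together."""
    tokens = flags.split()
    grouped = []
    i = 0
    while i < len(tokens):
        if tokens[i] in TWO_WORD_OPTIONS and i + 1 < len(tokens):
            # Treat e.g. "-framework Application" as a single entry.
            grouped.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            grouped.append(tokens[i])
            i += 1
    return grouped


def merge_libs(per_library_libs: list[str]) -> list[str]:
    """Merge the LIBS of several libraries, dropping duplicate entries.

    The first occurrence of each entry is kept, in its original position.
    """
    seen = set()
    merged = []
    for libs in per_library_libs:
        for entry in group_link_flags(libs):
            if entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return merged


if __name__ == "__main__":
    # Illustrative input only; not the actual LIBS of any JDK library.
    libs = ["-lm -ldl", "-framework Application -lm"]
    print(" ".join(merge_libs(libs)))
```

Grouping before de-duplicating is the point of the sketch: a naive 
token-level pass would drop the second "-framework" word and leave its 
argument orphaned.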

Now I have managed to get a version of your prototype that only includes 
the minimal set of changes needed to support the static launcher, and 
that works on mac and linux/gcc. Since your prototype is based on 
586396cbb55a31 from March, I am trying to merge the patch with the 
latest master. This worked fine for macOS, but I hit an unexpected 
snag on Linux, which I'm currently investigating.

> We have only briefly touched on the spec change topic (for the naming 
> of native libraries) during the zoom meetings. I also agree that we 
> should get that part started now. It's unclear to me if there's any 
> existing blocker for that.
>
I don't think there is. It's just that someone needs to step up and do it.

/Magnus