Hermetic Java project meeting
Jiangli Zhou
jianglizhou at google.com
Tue May 7 04:04:56 UTC 2024
On Tue, Apr 30, 2024 at 5:42 AM Magnus Ihse Bursie
<magnus.ihse.bursie at oracle.com> wrote:
>
>
> On 2024-04-26 03:15, Jiangli Zhou wrote:
> > On Thu, Apr 25, 2024 at 9:28 AM Magnus Ihse Bursie
> > <magnus.ihse.bursie at oracle.com> wrote:
> >>
> >> Just to be clear, that's with using `objcopy` to localize non-exported symbols for all JDK static libraries and libjvm.a, not just libjvm.a, right?
> >>
> >> Yes.
> >>
> >>
> >> Can you please include the compiler or linker errors on linux/clang?
> >>
> >> It is a bit tricky. The problem arises at the partial linking step. It seems to stem from how clang converts a request to link into an actual call to ld. I enabled debug code (printing the command line, and running clang with `-v` to get it to print the actual command line used to run ld) and ran it on GHA, where it worked fine. This is how it looks there:
> >>
> >> WILL_RUN: /usr/bin/clang -v -m64 -r -o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
> >> Ubuntu clang version 14.0.0-1ubuntu1.1
> >> Target: x86_64-pc-linux-gnu
> >> Thread model: posix
> >> InstalledDir: /usr/bin
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
> >> Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
> >> Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
> >> Candidate multilib: .;@m64
> >> Selected multilib: .;@m64
> >> "/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/lib/llvm-14/bin/../lib -L/lib -L/usr/lib -r /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
> >>
> >> In contrast, on my machine it looks like this:
> >>
> >> WILL_RUN: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/clang -v -m64 -r -o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o
> >> clang version 13.0.1
> >> Target: x86_64-unknown-linux-gnu
> >> Thread model: posix
> >> InstalledDir: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin
> >> Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
> >> Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
> >> Candidate multilib: .;@m64
> >> Candidate multilib: 32;@m32
> >> Candidate multilib: x32;@mx32
> >> Selected multilib: .;@m64
> >> "/usr/bin/ld" --hash-style=both --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o /lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/../lib -L/lib -L/usr/lib -r /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtend.o /lib/x86_64-linux-gnu/crtn.o
> >> /usr/bin/ld: cannot find -lgcc_s
> >> /usr/bin/ld: cannot find -lgcc_s
> >> clang-13: error: linker command failed with exit code 1 (use -v to see invocation)
> >>
> >> I don't understand what makes clang think it should include "-lgcc --as-needed -lgcc_s" and the crt*.o files when doing a partial link. In fact, the entire process on how clang (and gcc) builds up the linker command line is bordering on black magic to me. I think it can be affected by variables set at compile time (at least this was the case for gcc, last I checked), or maybe it picks up some kind of script from the environment. That's why I believe my machine could just be messed up.
> >>
> >> I could get a bit further by passing "-nodefaultlibs" (or whatever it was), but then the generated .o files were messed up with respect to library symbols, and it failed dramatically when trying to do the final link of the static java launcher.
> >>
> >>
> > Looks like you are using /usr/bin/ld and not lld. I haven't run into
> > this type of issue. Have you tried -fuse-ld=lld?
>
> I am not sure why clang insisted on picking up ld and not lld. I remember
> trying with -fuse-ld=lld, and that it did not work either.
> Unfortunately, I don't remember exactly what the problems were.
>
> I started reinstalling my Linux workstation yesterday, but something
> went wrong, and it failed so hard that it got semi-bricked by the new
> installation, so I need to redo everything from scratch. :-( After that
> is done, I'll re-test. Hopefully this was just my old installation that
> was too broken.
>
>
> >
> >>>
> >>> I have also tried to extract all the changes (and only the changes)
> >>> related to static build from the hermetic-java-runtime branch (ignoring
> >>> the JavaHome/resource loading changes), to see if I could get something
> >>> like StaticLink.gmk in mainline. I thought I was doing quite fine, but
> >>> after a while I realized my testing was botched since the launcher had
> >>> actually loaded the libraries dynamically instead, even though they were
> >>> statically linked. :-( I am currently trying to bisect my way through my
> >>> repo to understand where things went wrong.
> >>
> >> Did you run with `bin/javastatic`? The system automatically detects if the binary contains statically linked native libraries and avoids loading the dynamic libraries. Can you please share which test(s) ran into the library loading issue? I'll see if I can reproduce the problem that you are running into.
> >>
> >> It was in fact not a problem. I was fooled by an error message. To be sure I was not loading any dynamically linked libraries, I removed the jdk/lib directory. Now the launcher failed, saying something like:
> >>
> >> "Error: Cannot locate dynamic library libjava.dylib".
> >>
> >> which was a bit of a jump scare.
> >>
> >> However, it turned out that it actually tried to load lib/jvm.cfg, and failed in loading this (since I had removed the entire lib directory), and this failure caused the above error message to be printed. When I restored lib/jvm.cfg (but not any dynamic libraries), the launcher worked.
> >>
> > Sounds like you are running into problems immediately during startup.
> > Does the problem occur with just running bin/javastatic using a simple
> > HelloWorld? Can you please send me your command line for reproducing?
>
> Maybe I was not clear enough: I did resolve the problem.
>
> > For the static Java support, I changed CreateExecutionEnvironment to
> > return immediately if it executes statically. jvm.cfg is not loaded.
> > Please see https://github.com/openjdk/leyden/blob/c1c5fc686c1452550e1b3663a320fba652248505/src/java.base/unix/native/libjli/java_md.c#L296.
> > Sounds like the JLI_IsStaticJDK() check is not working properly in
> > your case.
>
> I've been trying to extract from your port a minimal set of patches that
> is needed to get static build to work. In that process, JavaHome and
> JLI_IsStaticJDK have been removed. It might be that this issue arose
> only in my slimmed-down branch, and not on your leyden branch (at this
> point I don't recall exactly). But, we need to fix this separately,
> since we must be able to build a static launcher without the hermetic
> changes.
The JDK and VM code have pre-existing assumptions about the JDK
directories and dynamic linking (e.g. the .so).
JLI_IsStaticJDK|JLI_SetStaticJDK|JVM_IsStaticJDK|JVM_SetStaticJDK are
needed for static JDK support to handle those cases correctly.
The CreateExecutionEnvironment change that I mentioned earlier is one
example.
I'm quite certain the issue that you are running into is due to
incorrect static check/handling in CreateExecutionEnvironment.
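
To illustrate the kind of handling I mean, here is a minimal sketch in C.
It is not the actual libjli code: set_static_jdk, is_static_jdk,
LaunchSteps and create_execution_environment are hypothetical stand-ins
for the JLI_SetStaticJDK/JLI_IsStaticJDK checks and
CreateExecutionEnvironment, and the struct only records which steps
would have run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime "static JDK" flag. */
static bool static_jdk = false;

void set_static_jdk(bool value) { static_jdk = value; }
bool is_static_jdk(void) { return static_jdk; }

/* Records which setup steps the sketch performed, so that the
 * behavior is observable without touching the file system. */
typedef struct {
    bool read_jvm_cfg;
    bool located_libjvm;
} LaunchSteps;

/* Hypothetical stand-in for CreateExecutionEnvironment. */
void create_execution_environment(LaunchSteps *steps) {
    assert(steps != NULL);
    steps->read_jvm_cfg = false;
    steps->located_libjvm = false;

    if (is_static_jdk()) {
        /* The VM is already linked into the launcher, so there is
         * no jvm.cfg to read and no shared library to locate. */
        return;
    }

    /* Dynamic case: consult jvm.cfg, then find the shared libjvm. */
    steps->read_jvm_cfg = true;
    steps->located_libjvm = true;
}
```

If the static check is dropped or evaluates incorrectly, the launcher
falls through to the dynamic path and goes looking for lib/jvm.cfg,
which matches the symptom described above.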
>
> In my branch, I am only using compile-time #ifdef checks for static vs
> dynamic. In the long run, the runtime checks that you have done are a
> good thing, but at the moment they are just adding intrusive changes
> without providing any benefit -- if we can't reuse .o files between
> dynamic and static compilation, there is no point in introducing a
> runtime check when we already have a working compile-time check.
I haven't seen your branch/code. I'd suggest not going with the #ifdef
checks, as that's the opposite direction of what we want to achieve. It
doesn't seem worth your effort to add more #ifdef checks in
order to make the static linking build work, even if those are only for
temporary testing.
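
For comparison, a rough sketch of the two approaches in C. The macro
STATIC_BUILD_SKETCH and the functions is_static_compile_time,
mark_static and is_static_runtime are made up for illustration and are
not taken from either branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Compile-time variant: the answer is baked into the object file.
 * STATIC_BUILD_SKETCH is a hypothetical macro, not a real JDK define. */
bool is_static_compile_time(void) {
#ifdef STATIC_BUILD_SKETCH
    return true;
#else
    return false;
#endif
}

/* Runtime variant: the same object file can serve both launchers.
 * The static launcher sets the flag once during startup. */
static bool static_flag = false;

void mark_static(void) {
    assert(!static_flag);   /* expected to be set only once */
    static_flag = true;
}

bool is_static_runtime(void) { return static_flag; }
```

With the compile-time form, the same source has to be compiled twice to
produce both kinds of launcher. With the runtime form, a single object
file can be linked either way.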
>
> I did think I correctly changed every dynamic check that you had added
> to a compile-time check, so it bewilders me somewhat when you say that
> jvm.cfg is not needed in your branch.
>
> Can you verify and confirm that the static launcher actually works in
> your branch, if there is no "lib/jvm.cfg" present?
In my <path>/leyden/build/linux-x86_64-server-slowdebug/images/jdk directory:
$ mv lib/jvm.cfg lib/jvm.cfg.no_used
$ find . | grep jvm.cfg
./lib/jvm.cfg.no_used
$ bin/javastatic -cp <my_jar> HelloWorld
HelloWorld
Thanks!
Jiangli
>
> /Magnus
>
>
> >
> > Best,
> > Jiangli
> >
> >> There are several bugs lurking here. For one, the error message is incorrect and should be corrected. Secondly, a statically linked launcher has just a single JVM included and should not have to look for the lib/jvm.cfg file at all.
> >>
> >> After looking around a bit in the launcher/jli code, my conclusion is that this code will need some additional care and loving attention to make it properly adjusted to function as a static launcher. We can't have a static launcher that tries to load a jvm.cfg file it does not need, and when it fails, complains that it is missing a dynamic library that it should not load.
> >>
> >> I'll try to get this fixed as part of my efforts to get the static launcher into mainline.
> >>> This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime
> >>> branch, where an arbitrary subset of external libraries were hard-coded.
> >>> Before integration in mainline can be possible, this information needs
> >>> to be collected correctly and automatically for all included JDK
> >>> libraries. Fortunately, it is not likely to be too hard. I basically
> >>> just need to store the information from the LIBS provided to the
> >>> NativeCompilation, and pick that up for all static libraries we include
> >>> in the static launcher. (A complication is that we need to de-duplicate
> >>> the list, and that some libraries are specified using two words, like
> >>> "-framework Application" on macos, so it will take some care getting it
> >>> right.)
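
The de-duplication concern quoted above could be sketched roughly as
follows, in C purely for illustration (the real logic would live in the
make files). dedup_libs, LibEntry and the assumption that only
"-framework" takes a following word are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Options that consume the following word as their argument.
 * Assumption for this sketch: only "-framework" does. */
static bool takes_argument(const char *word) {
    return strcmp(word, "-framework") == 0;
}

/* One logical entry: an option plus an optional argument word. */
typedef struct {
    const char *flag;
    const char *arg;   /* NULL for single-word entries such as "-lm" */
} LibEntry;

static bool same_entry(const LibEntry *a, const LibEntry *b) {
    if (strcmp(a->flag, b->flag) != 0) return false;
    if (a->arg == NULL || b->arg == NULL) return a->arg == b->arg;
    return strcmp(a->arg, b->arg) == 0;
}

/* Copies the unique entries from words[0..n) into out (capacity n),
 * keeping first-seen order. Returns the number of entries written. */
size_t dedup_libs(const char *const *words, size_t n, LibEntry *out) {
    assert(words != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        LibEntry e = { words[i], NULL };
        if (takes_argument(words[i]) && i + 1 < n) {
            e.arg = words[++i];   /* consume the argument word as well */
        }
        bool seen = false;
        for (size_t j = 0; j < count && !seen; j++) {
            seen = same_entry(&out[j], &e);
        }
        if (!seen) out[count++] = e;
    }
    return count;
}
```

Treating the option and its argument as one unit is what keeps a naive
word-by-word de-duplication from dropping the second "-framework" and
leaving a dangling framework name behind.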
> >>
> >> Right, currently the hermetic-java-runtime branch specifies a list of hard-coded dependency libraries for linking. One of the goals of the hermetic prototype was avoiding/reducing refactoring/restructuring the existing code whenever possible. The reason is to reduce merge overhead when integrating with new changes from the mainline. We can do the proper refactoring and cleanups when getting the changes into the mainline.
> >>
> >> That is basically what I am doing right now. I am looking at your prototype and trying to reimplement this functionality properly so that it can be merged into mainline. The first step on that way was to actually get your prototype running.
> >>
> >> Now I have managed to get a version of your prototype that only includes the minimal set of changes needed to support the static launcher, and that works on mac and linux/gcc. Since your prototype is based on 586396cbb55a31 from March, I am trying to merge the patch with the latest master. This worked fine for macOS, but I hit some unexpected snag on Linux which I'm currently investigating.
> >>
> >> We have only briefly touched on the spec change topic (for the naming of native libraries) during the zoom meetings. I also agree that we should get that part started now. It's unclear to me if there's any existing blocker for that.
> >>
> >> I don't think there is. It's just that someone needs to step up and do it.
> >>
> >> /Magnus