Hermetic Java project meeting
This is my summary of what was said in today's meeting:

* Jiangli reported that her team had done extensive testing and not seen any problems, both with just the static launcher as generated by the leyden branch, and with bundled user applications using the One Jar (?) tool. They tested JDK tier 1, and a suite of Google's internal tests. When testing JTReg tests with native libraries, these were dynamically loaded.

* Alan asked about Hotspot JTReg tests that launched "java". Jiangli reported that they had not seen any problems, but my understanding was that there was some confusion if any such tests were actually run. I think this is something that will need further attention, but if someone said they would look into it, I missed it.

* Jiangli will get numbers on how much time is added to the GHA testing if we add building and linking of static libraries, without the fix that would let us compile to a single set of object files.

* We did not fully come to a conclusion on whether compiling a single set of object files for both static and dynamic linking is a hard requirement or not, but at a minimum it is a desirable goal. (My personal opinion is that it is a hard requirement if the GHA build times are otherwise seriously affected.) There are basically two problems prohibiting single object file compilation: 1) using runtime checks instead of #ifdefs for code that differs between static and dynamic builds, and 2) handling the difference between JNI_OnLoad (as required for dynamic libraries) and JNI_OnLoad_<libname> (as required for static libraries).

* The leyden branch has basically solved both these problems. The first one could more or less be integrated already (given perhaps some discussion on exactly *how* the JDK should discover at runtime whether it is static or dynamic), but the latter requires a spec change before it can be integrated.
* I think everyone agreed that moving on with a spec change was a good idea, regardless of whether this is a blocker or not, but I don't recall that any concrete next steps were decided. Ron and Alan said that we do spec changes all the time, so it will not add as much bureaucracy as one might fear.

* Regarding which native libraries to include, I think we agreed on the following:
- Static linking will only support headless-only builds (in which the build system excludes the AWT library that does "headful" stuff -- otherwise there would be duplicate symbols).
- As a first delivery, the build system will just create a static version of the "java" launcher (not jar, javac, etc). This will include all native libraries from all modules that are included in the build.
- Going forward, the correct solution is to make jlink create a launcher that includes just the native libraries from the modules that are included in the jlink command. This will require jlink to understand how to call the native linker.
- Somewhere in there we probably also need to have jlink know about headless-only vs normal (headless or "headful" determined at runtime), so it can create a java.desktop output that includes only the headless library.

* Magnus reported that the refactoring and fixing of technical debt that was required for doing static builds properly has just been finished, and that his attention is now turning to creating a properly integrated system for generating static builds alongside dynamic builds.

* Jiangli and Magnus will work outside the meeting to resolve the build issues Magnus faced with the hermetic java branch in the Leyden repo.

* Just before the meeting unfortunately had to be aborted, Jiangli mentioned that they had discovered issues with some JDK native libraries when using objcopy to localize all non-visible symbols. It was at the time of writing not clear what those issues were. Jiangli will report back with what they found.
(And while I did not have time to mention it at the meeting, I will also look into this.)

/Magnus
Magnus, thanks for the meeting summary! Please see clarifications below, to avoid any confusion. On Tue, Apr 16, 2024 at 11:27 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
This is my summary of what was said in today's meeting:
* Jiangli reported that her team had done extensive testing and not seen any problems, both with just the static launcher as generated by the leyden branch, and with bundled user applications using the One Jar (?) tool. They tested JDK tier 1, and a suite of Google's internal tests. When testing JTReg tests with native libraries, these were dynamically loaded.
Clarifications, as discussed in the meeting:

We have done multiple levels of testing for the static/hermetic Java prototype with our internal codebase on JDK 11, JDK 21 and JDK head (mainline based). We ran into bugs/failures and have addressed the issues found along the way. The leyden branch contains most of the static/hermetic Java work from the prototype. In general, static/hermetic Java support looks robust in its current state on JDK 21 in our prototype based on testing (we already communicated some of the testing outcome on JDK 11 at last year's JVMLS), with some remaining areas (e.g. how to handle user code accessing java.home, and https://github.com/openjdk/leyden/blob/hermetic-java-runtime/src/java.base/s... is still an internal class in the prototype) that require broad discussions involving potential spec updates, and a small number of remaining failures to be looked into.

Testing we have done includes:
- jtreg tier1 using `javastatic` (a "fully" statically linked Java launcher) on JDK 11 and JDK 21
- Explicit functional and integration tests (most of them developed based on existing jtreg tests) to test the final hermetic Java image. The image is built using singlejar, and the test and JDK are in a single image.
- Scattered hermetic Java testing using our internal tests. The scattered hermetic image is a special mode for testing that emulates hermetic Java image execution without building the final hermetic Java image. This requires some additional launcher changes that are not in the leyden branch currently.
- Some production testing on JDK 11 and JDK 21
* Alan asked about Hotspot JTReg tests that launched "java". Jiangli reported that they had not seen any problems,
Also a clarification: As mentioned during the meeting, most of the issues that we found with jtreg tier1 testing were due to the tests assuming that dynamic libraries are used. We have addressed those. Alan had some questions about launching sub-processes. In the prototype, we had done work to support the POSIX_SPAWN launch mechanism for ProcessBuilder.start() on hermetic Java, e.g. https://github.com/openjdk/leyden/commit/931b71b0845d24b1949a23333ef1cdb3d66.... We need to look into tier1 testing to verify if it covers sub-process testing (Alan mentioned there are some such tests in tier1).
but my understanding was that there was some confusion if any such tests were actually run. I think this is something that will need further attention, but if someone said they would look into it, I missed it.
Magnus, can you please elaborate on the above about "if any such tests were actually run"?
* Jiangli will get numbers on how much time is added to the GHA testing if we add building and linking of static libraries, without the fix that would let us compile to a single set of object files.
Will follow up on this.
* We did not fully come to a conclusion on whether compiling a single set of object files for both static and dynamic linking is a hard requirement or not, but at a minimum it is a desirable goal. (My personal opinion is that it is a hard requirement if the GHA build times are otherwise seriously affected.)
There are basically two problems prohibiting single object file compilation:
1) Using runtime checks instead of #ifdefs for code that differs between static and dynamic builds.
2) Handling the difference between JNI_OnLoad (as required for dynamic libraries) and JNI_OnLoad_<libname> (as required for static libraries).
* The leyden branch has basically solved both these problems. The first one could more or less be integrated already (given perhaps some discussion on exactly *how* the JDK should discover at runtime whether it is static or dynamic), but the latter requires a spec change before it can be integrated.
* I think everyone agreed that moving on with a spec change was a good idea, regardless of whether this is a blocker or not, but I don't recall that any concrete next steps were decided. Ron and Alan said that we do spec changes all the time, so it will not add as much bureaucracy as one might fear.
There was also a question raised by Dave during the meeting on the timeline for the spec/JSR related work.
* Regarding which native libraries to include, I think we agreed on the following:
- Static linking will only support headless-only builds (in which the build system excludes the AWT library that does "headful" stuff -- otherwise there would be duplicate symbols).
Yes.
- As a first delivery, the build system will just create a static version of the "java" launcher (not jar, javac, etc). This will include all native libraries from all modules that are included in the build.
Yes. It would be headless based.
- Going forward, the correct solution is to make jlink create a launcher that includes just the native libraries from the modules that are included in the jlink command. This will require jlink to understand how to call the native linker.
Yes. That would be one of the N steps for supporting hermetic Java.
- Somewhere in there we probably also need to have jlink know about headless-only vs normal (headless or "headful" determined at runtime), so it can create a java.desktop output that includes only the headless library.
Alan has described an idea of dealing with java.desktop using jlink.
* Magnus reported that the refactoring and fixing of technical debt that was required for doing static builds properly has just been finished, and that his attention is now turning to creating a properly integrated system for generating static builds alongside dynamic builds.
Thank you, Magnus!
* Jiangli and Magnus will work outside the meeting to resolve the build issues Magnus faced with the hermetic java branch in the Leyden repo.
Yes.
* Just before the meeting unfortunately had to be aborted, Jiangli mentioned that they had discovered issues with some JDK native libraries when using objcopy to localize all non-visible symbols. It was at the time of writing not clear what those issues were. Jiangli will report back with what they found. (And while I did not have time to mention it at the meeting, I will also look into this.)
Best, Jiangli
/Magnus
On Tue, Apr 16, 2024 at 12:39 PM Jiangli Zhou <jianglizhou@google.com> wrote:
Magnus, thanks for the meeting summary! Please see clarifications below, to avoid any confusion.
On Tue, Apr 16, 2024 at 11:27 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
This is my summary of what was said in today's meeting:
* Jiangli reported that her team had done extensive testing and not seen any problems, both with just the static launcher as generated by the leyden branch, and with bundled user applications using the One Jar (?) tool. They tested JDK tier 1, and a suite of Google's internal tests. When testing JTReg tests with native libraries, these were dynamically loaded.
Clarifications, as discussed in the meeting:
We have done multiple levels of testing for the static/hermetic Java prototype with our internal codebase on JDK 11, JDK 21 and JDK head (mainline based). We ran into bugs/failures and have addressed the issues found along the way. The leyden branch contains most of the static/hermetic Java work from the prototype. In general, static/hermetic Java support looks robust in its current state on JDK 21 in our prototype based on testing (we already communicated some of the testing outcome on JDK 11 at last year's JVMLS), with some remaining areas (e.g. how to handle user code accessing java.home, and https://github.com/openjdk/leyden/blob/hermetic-java-runtime/src/java.base/s... is still an internal class in the prototype) that require broad discussions involving potential spec updates, and a small number of remaining failures to be looked into.
Testing we have done includes:
- jtreg tier1 using `javastatic` (a "fully" statically linked Java launcher) on JDK 11 and JDK 21
- Explicit functional and integration tests (most of them developed based on existing jtreg tests) to test the final hermetic Java image. The image is built using singlejar, and the test and JDK are in a single image.
- Scattered hermetic Java testing using our internal tests. The scattered hermetic image is a special mode for testing that emulates hermetic Java image execution without building the final hermetic Java image. This requires some additional launcher changes that are not in the leyden branch currently.
- Some production testing on JDK 11 and JDK 21
* Alan asked about Hotspot JTReg tests that launched "java". Jiangli reported that they had not seen any problems,
Also clarification:
As mentioned during the meeting, most of the issues that we found with jtreg tier1 testing were due to the assumption of using dynamic libraries in the tests. We have addressed those.
Alan had some questions about launching sub-processes. In the prototype, we had done work to support the POSIX_SPAWN launch mechanism for ProcessBuilder.start() on hermetic Java, e.g. https://github.com/openjdk/leyden/commit/931b71b0845d24b1949a23333ef1cdb3d66.... We need to look into tier1 testing to verify if it covers sub-process testing (Alan mentioned there are some such tests in tier1).
but my understanding was that there was some confusion if any such tests were actually run. I think this is something that will need further attention, but if someone said they would look into it, I missed it.
Magnus, can you please elaborate on the above about "if any such tests were actually run"?
* Jiangli will get numbers on how much time is added to the GHA testing if we add building and linking of static libraries, without the fix that would let us compile to a single set of object files.
Will follow up on this.
I did some measurements using the https://github.com/openjdk/leyden/tree/hermetic-java-runtime branch on my Linux desktop today. Here are the details:

- Build machine: Debian 6.6.*
- gcc version 13.2.0
- The `time` command is used to do the measurements
- The same configuration is used for both: --with-boot-jdk=<jdk_path> --with-debug-level=slowdebug --with-static-java=yes
- `make clean` is done before both of the measurements

#
# Building JDK image with just dynamic libraries
# Build command: make images
#
real 5m27.582s
user 75m11.272s
sys 8m2.783s

#
# Building JDK image with both dynamic libraries and static libraries, also linking bin/javastatic
# Build command: make static-java-image
#
real 6m5.958s
user 115m59.353s
sys 15m11.664s

There is some overhead in the overall build time, but it's not as significant as doubling the time. I think we should consider prioritizing the launcher static linking work for now.

I also want to reiterate the importance of compiling the .o files only once and using them for both the dynamic libraries and static libraries. In our prototype on JDK 11, we did that. When moving the work to JDK head for the leyden branch and mainline, we opted to use the existing `static-libs-image` for building the full set of static libraries with separately compiled .o files (https://bugs.openjdk.org/browse/JDK-8307858). As mentioned in some of the earlier meetings with Alan and Ron last year, we ran into memory issues during JDK build time on our system due to the duplication of the .o files. I did some workaround to mitigate the memory issue. So we should also prioritize this work, although it is a non-blocker for integrating some of the leyden branch work into the mainline at this point. I'd recommend addressing it immediately after the launcher static linking work. As you are already aware, it would require some runtime work to clean up the STATIC_BUILD macro usages.

Best, Jiangli
* We did not fully come to a conclusion on whether compiling a single set of object files for both static and dynamic linking is a hard requirement or not, but at a minimum it is a desirable goal. (My personal opinion is that it is a hard requirement if the GHA build times are otherwise seriously affected.)
There are basically two problems prohibiting single object file compilation:
1) Using runtime checks instead of #ifdefs for code that differs between static and dynamic builds.
2) Handling the difference between JNI_OnLoad (as required for dynamic libraries) and JNI_OnLoad_<libname> (as required for static libraries).
* The leyden branch has basically solved both these problems. The first one could more or less be integrated already (given perhaps some discussion on exactly *how* the JDK should discover at runtime whether it is static or dynamic), but the latter requires a spec change before it can be integrated.
* I think everyone agreed that moving on with a spec change was a good idea, regardless of whether this is a blocker or not, but I don't recall that any concrete next steps were decided. Ron and Alan said that we do spec changes all the time, so it will not add as much bureaucracy as one might fear.
There was also a question raised by Dave during the meeting on the timeline for the spec/JSR related work.
* Regarding which native libraries to include, I think we agreed on the following:
- Static linking will only support headless-only builds (in which the build system excludes the AWT library that does "headful" stuff -- otherwise there would be duplicate symbols).
Yes.
- As a first delivery, the build system will just create a static version of the "java" launcher (not jar, javac, etc). This will include all native libraries from all modules that are included in the build.
Yes. It would be headless based.
- Going forward, the correct solution is to make jlink create a launcher that includes just the native libraries from the modules that are included in the jlink command. This will require jlink to understand how to call the native linker.
Yes. That would be one of the N steps for supporting hermetic Java.
- Somewhere in there we probably also need to have jlink know about headless-only vs normal (headless or "headful" determined at runtime), so it can create a java.desktop output that includes only the headless library.
Alan has described an idea of dealing with java.desktop using jlink.
* Magnus reported that the refactoring and fixing of technical debt that was required for doing static builds properly has just been finished, and that his attention is now turning to creating a properly integrated system for generating static builds alongside dynamic builds.
Thank you, Magnus!
* Jiangli and Magnus will work outside the meeting to resolve the build issues Magnus faced with the hermetic java branch in the Leyden repo.
Yes.
* Just before the meeting unfortunately had to be aborted, Jiangli mentioned that they had discovered issues with some JDK native libraries when using objcopy to localize all non-visible symbols. It was at the time of writing not clear what those issues were. Jiangli will report back with what they found. (And while I did not have time to mention it at the meeting, I will also look into this.)
Best, Jiangli
/Magnus
I will not be able to participate in the meeting today. Let me report a bit on my work this week.

I have made a proof of concept branch which can properly compile static java on macos with clang and on linux with gcc. (By "properly" I mean hiding non-exported symbols.) I still have problems replicating the build with clang on linux. I suspect my linux workstation has a broken installation of clang. That machine has grown more erratic over time, and I need to reinstall it, but I'm procrastinating spending the time doing that... So for now I'm just skipping the clang-on-linux part.

I have also made great progress on windows. Julian's hint about objcopy working fine on COFF, and being present in cygwin, got me to realize that this was in effect a possible way forward. My PoC currently manages to extract a list of exported symbols from the static libs using dumpbin. This is necessary since --localize-hidden does not work on COFF (the concept of "hidden" symbols is ELF-only), so instead we need to use the --keep-global-symbols option, which (despite the name) converts all symbols in the given list to global and all others to local. I am currently working actively on getting the next steps done in this PoC.

I have initiated talks with SAP, and they are interested in helping out getting static linking working on AIX (given that it is possible to do without too much effort).

I have also tried to extract all the changes (and only the changes) related to static builds from the hermetic-java-runtime branch (ignoring the JavaHome/resource loading changes), to see if I could get something like StaticLink.gmk into mainline. I thought I was doing quite fine, but after a while I realized my testing was botched, since the launcher had actually loaded the libraries dynamically instead, even though they were statically linked. :-( I am currently trying to bisect my way through my repo to understand where things went wrong.
This problem was exacerbated by the fact that we cannot build *only* static libs, so the dynamic ones were there alongside, ready to be (improperly) picked up. I might want to spend some time on fixing this first. That will help both with speeding up the build/test cycle for static builds, and with avoiding that kind of issue repeating. That will require some more refactoring in the core build/link code though.

Experimenting with the static launcher on linux/gcc and macos made me realize that we will need to know the set of external libraries needed by each individual JDK library. When building a dynamic library, this knowledge (e.g. -liconv -framework Application) is encoded into the lib*.so file by the linker. But not so for a static library. Instead, we need to store this information for each JDK library, and then in the end, when we want to pick up all static libraries and link them together to form the "javastatic" executable, we need to pass this set of external libraries to the linker.

This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime branch, where an arbitrary subset of external libraries was hard-coded. Before integration into mainline is possible, this information needs to be collected correctly and automatically for all included JDK libraries. Fortunately, it is not likely to be too hard. I basically just need to store the information from the LIBS provided to NativeCompilation, and pick that up for all static libraries we include in the static launcher. (A complication is that we need to de-duplicate the list, and that some libraries are specified using two words, like "-framework Application" on macos, so it will take some care to get it right.)

I have also been thinking about possible ways that we can share compiled obj files between static and dynamic libraries, even if we cannot do it fully. Most files do not need the STATIC_BUILD define and will thus be identical for both static and dynamic builds.
It might be possible to just hard-code the exact files that need to be different. It's ugly, and I still would like to make sure we press forward with the spec changes to JNI/JVMTI, but it would work as a stop-gap measure.

/Magnus
Magnus, thanks for the update! Looks like you've made some good progress. Please see my comments below. On Tue, Apr 23, 2024 at 3:42 AM Magnus Ihse Bursie < magnus.ihse.bursie@oracle.com> wrote:
I will not be able to participate in the meeting today.
Let me report a bit on my work this week.
I have made a proof of concept branch which can properly compile static java on macos with clang and on linux with gcc. (By "properly" I mean hiding non-exported symbols.) I still have problems replicating the build with clang on linux. I suspect my linux workstation has a broken installation of clang. That machine has grown more erratic over time, and I need to reinstall it, but I'm procrastinating spending the time doing that... So for now I'm just skipping the clang-on-linux part.
Just to be more clear, that's with using `objcopy` to localize non-exported symbols for all JDK static libraries and libjvm.a, not just libjvm.a, right? Can you please include the compiler or linker errors on linux/clang?
I have also made great progress on windows. Julian's hint about objcopy working fine on COFF, and being present in cygwin, got me to realize that this was in effect a possible way forward. My PoC currently manages to extract a list of exported symbols from the static libs using dumpbin. This is necessary since --localize-hidden does not work on COFF (the concept of "hidden" symbols is ELF-only), so instead we need to use the --keep-global-symbols option, which (despite the name) converts all symbols in the given list to global and all others to local. I am currently working actively on getting the next steps done in this PoC.
I have initiated talks with SAP, and they are interested in helping out getting static linking working on AIX (given that it is possible to do without too much effort).
I have also tried to extract all the changes (and only the changes) related to static builds from the hermetic-java-runtime branch (ignoring the JavaHome/resource loading changes), to see if I could get something like StaticLink.gmk into mainline. I thought I was doing quite fine, but after a while I realized my testing was botched, since the launcher had actually loaded the libraries dynamically instead, even though they were statically linked. :-( I am currently trying to bisect my way through my repo to understand where things went wrong.
Did you run with `bin/javastatic`? The system automatically detects if the binary contains statically linked native libraries and avoids loading the dynamic libraries. Can you please share which test(s) ran into the library loading issue? I'll see if I can reproduce the problem that you are running into. For your work to get StaticLink.gmk into the mainline, I suggest taking all changes in the hermetic-java-runtime branch for testing to simplify things on your side. Then you can just extract the makefile part to integrate with the mainline. As we've discussed in the zoom meeting, I'll work on getting the runtime changes for static support into the mainline incrementally.
This problem was exacerbated by the fact that we cannot build *only* static libs, so the dynamic ones were there alongside, ready to be (improperly) picked up. I might want to spend some time on fixing this first. That will help both with speeding up the build/test cycle for static builds, and with avoiding that kind of issue repeating. That will require some more refactoring in the core build/link code though.
Experimenting with the static launcher on linux/gcc and macos made me realize that we will need to know the set of external libraries needed by each individual JDK library. When building a dynamic library, this knowledge (e.g. -liconv -framework Application) is encoded into the lib*.so file by the linker. But not so for a static library. Instead, we need to store this information for each JDK library, and then in the end, when we want to pick up all static libraries and link them together to form the "javastatic" executable, we need to pass this set of external libraries to the linker.
This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime branch, where an arbitrary subset of external libraries was hard-coded. Before integration into mainline is possible, this information needs to be collected correctly and automatically for all included JDK libraries. Fortunately, it is not likely to be too hard. I basically just need to store the information from the LIBS provided to NativeCompilation, and pick that up for all static libraries we include in the static launcher. (A complication is that we need to de-duplicate the list, and that some libraries are specified using two words, like "-framework Application" on macos, so it will take some care to get it right.)
Right, currently the hermetic-java-runtime branch specifies a list of hard-coded dependency libraries for linking. One of the goals of the hermetic prototype was avoiding/reducing refactoring/restructuring the existing code whenever possible. The reason is to reduce merge overhead when integrating with new changes from the mainline. We can do the proper refactoring and cleanups when getting the changes into the mainline.
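For the de-duplication with two-word options mentioned above, something along these lines could work (a sketch using a made-up aggregated LIBS list, not the actual makefile logic):

```shell
# Glue "-framework X" pairs into single tokens, drop duplicates while
# keeping first-seen order, then print one linker argument per line.
LIBS='-liconv -framework CoreFoundation -lz -liconv -framework CoreFoundation'
printf '%s\n' $LIBS | awk '
  $0 == "-framework" { getline fw; $0 = "-framework " fw }
  !seen[$0]++'
```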
I have also been thinking about possible ways that we can share compiled obj files between static and dynamic libraries, even if we cannot do it fully. Most files do not need the STATIC_BUILD define and will thus be identical for both static and dynamic builds. It might be possible to just hard-code the exact files that need to be different. It's ugly, and I still would like to make sure we press forward with the spec changes to JNI/JVMTI, but it would work as a stop-gap measure.
We have only briefly touched on the spec change topic (for the naming of native libraries) during the zoom meetings. I also agree that we should get that part started now. It's unclear to me if there's any existing blocker for that. Best, Jiangli
/Magnus
Just to be more clear, that's with using `objcopy` to localize non-exported symbols for all JDK static libraries and libjvm.a, not just libjvm.a, right?

Yes.
Can you please include the compiler or linker errors on linux/clang?
It is a bit tricky. The problem arises at the partial linking step, and seems to come from how clang converts a request to link into an actual call to ld. I enabled debug code (printing the command line, and running clang with `-v` to get it to print the actual command line used to run ld) and ran it on GHA, where it worked fine. This is how it looks there:

WILL_RUN: /usr/bin/clang -v -m64 -r -o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/lib/llvm-14/bin/../lib -L/lib -L/usr/lib -r /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o

In contrast, on my machine it looks like this:

WILL_RUN: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/clang -v -m64 -r -o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o
clang version 13.0.1
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
"/usr/bin/ld" --hash-style=both --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o /lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/../lib -L/lib -L/usr/lib -r /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtend.o /lib/x86_64-linux-gnu/crtn.o
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)

I don't understand what makes clang think it should include "-lgcc --as-needed -lgcc_s" and the crt*.o files when doing a partial link. In fact, the entire process by which clang (and gcc) builds up the linker command line is bordering on black magic to me. I think it can be affected by variables set at compile time (at least this was the case for gcc, last I checked), or maybe it picks up some kind of script from the environment. That's why I believe my machine could just be messed up.
I could get a bit further by passing "-nodefaultlibs" (or whatever it was), but then the generated .o file were messed up wrt to library symbols and it failed dramatically when trying to do the final link of the static java launcher.
I have also tried to extract all the changes (and only the changes) related to static build from the hermetic-java-runtime branch (ignoring the JavaHome/resource loading changes), to see if I could get something like StaticLink.gmk in mainline. I thought I was doing quite fine, but after a while I realized my testing was botched since the launcher had actually loaded the libraries dynamically instead, even though they were statically linked. :-( I am currently trying to bisect my way thought my repo to understand where things went wrong.
Did you run with `bin/javastatic`? The system automatically detects if the binary contains statically linked native libraries and avoids loading the dynamic libraries. Can you please share which test(s) ran into the library loading issue? I'll see if I can reproduce the problem that you are running into.
It was in fact not a problem. I was fooled by an error message. To be sure I was not loading any dynamically linked libraries, I removed the jdk/lib directory. Now the launcher failed, saying something like: "Error: Cannot locate dynamic library libjava.dylib". which was a bit of a jump scare. However, it turned out that it actually tried to load lib/jvm.cfg, and failed in loading this (since I had removed the entire lib directory), and this failure caused the above error message to be printed. When I restored lib/jvm.cfg (but not any dynamic libraries), the launcher worked. There are several bugs lurking here. For once, the error message is incorrect and should be corrected. Secondly, a statically linked launcher has just a single JVM included and should not have to look for the lib/jvm.cfg file at all. After looking around a bit in the launcher/jli code, my conclusion is that this code will need some additional care and loving attention to make it properly adjusted to function as a static launcher. We can't have a static launcher that tries to load a jvm.cfg file it does not need, and when it fails, complains that it is missing a dynamic library that it should not load. I'll try to get this fixed as part of my efforts to get the static launcher into mainline.
This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime branch, where an arbitrary subset of external libraries were hard-coded. Before integration in mainline can be possible, this information needs to be collected correctly and automatically for all included JDK libraries. Fortunately, it is not likely to be too hard. I basically just need to store the information from the LIBS provided to the NativeCompilation, and pick that up for all static libraries we include in the static launcher. (A complication is that we need to de-duplicate the list, and that some libraries are specified using two words, like "-framework Application" on macos, so it will take some care getting it right.)
Right, currently the hermetic-java-runtime branch specifies a list of hard-coded dependency libraries for linking. One of the goals of the hermetic prototype was avoiding/reducing refactoring/restructuring the existing code whenever possible. The reason is to reduce merge overhead when integrating with new changes from the mainline. We can do the proper refactoring and cleanups when getting the changes into the mainline.
That is basically what I am doing right now. I am looking at your prototype and trying to reimplement this functionality properly so that it can be merged into mainline. The first step on that way was to actually get your prototype running. Now I have managed to get a version of your prototype that only includes the minimal set of changes needed to support the static launcher, and that works on mac and linux/gcc. Since your prototype is based on 586396cbb55a31 from March, I am trying to merge the patch with the latest master. This worked fine for macOS, but I hit some unexpected snag on Linux which I'm currently investigating.
We have only briefly touched on the spec change topic (for the naming of native libraries) during the zoom meetings. I also agree that we should get that part started now. It's unclear to me if there's any existing blocker for that.
I don't think there is. It's just that someone needs to step up and do it. /Magnus
On 2024-04-26 03:15, Jiangli Zhou wrote:
On Thu, Apr 25, 2024 at 9:28 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
Just to be clear, that's with using `objcopy` to localize non-exported symbols for all JDK static libraries and libjvm.a, not just libjvm.a, right?
Yes.
Can you please include the compiler or linker errors on linux/clang?
It is a bit tricky. The problem arises at the partial linking step, and seems to come from how clang converts a request to link into an actual call to ld. I enabled debug code (printing the command line, and running clang with `-v` so it prints the actual command line used to run ld) and ran it on GHA, where it worked fine. This is how it looks there:
WILL_RUN: /usr/bin/clang -v -m64 -r -o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/10
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/librmi_relocatable.o -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13 -L/usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/lib/llvm-14/bin/../lib -L/lib -L/usr/lib -r /home/runner/work/jdk/jdk/build/linux-x64/support/native/java.rmi/librmi/static/GC.o
In contrast, on my machine it looks like this:
WILL_RUN: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/clang -v -m64 -r -o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o
clang version 13.0.1
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
"/usr/bin/ld" --hash-style=both --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/librmi_relocatable.o /lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/9/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/usr/local/clang+llvm-13.0.1-x86_64-linux-gnu-ubuntu-18.04/bin/../lib -L/lib -L/usr/lib -r /localhome/git/jdk-ALT/build/clangherm/support/native/java.rmi/librmi/static/GC.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/9/crtend.o /lib/x86_64-linux-gnu/crtn.o
/usr/bin/ld: cannot find -lgcc_s
/usr/bin/ld: cannot find -lgcc_s
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)
I don't understand what makes clang think it should include "-lgcc --as-needed -lgcc_s" and the crt*.o files when doing a partial link. In fact, the entire process of how clang (and gcc) build up the linker command line is bordering on black magic to me. I think it can be affected by variables set at compile time (at least this was the case for gcc, last I checked), or maybe it picks up some kind of script from the environment. That's why I suspect my machine could just be messed up.
I could get a bit further by passing "-nodefaultlibs" (or whatever it was), but then the generated .o files were messed up with respect to library symbols, and the final link of the static java launcher failed dramatically.
Looks like you are using /usr/bin/ld and not lld. I haven't run into this type of issue. Have you tried -fuse-ld=lld?
I am not sure why clang insisted on picking up ld and not lld. I remember trying with -fuse-ld=lld, and that it did not work either; unfortunately, I don't remember exactly what the problems were. I started reinstalling my Linux workstation yesterday, but something went wrong, and the new installation failed so badly that the machine is now semi-bricked, so I need to redo everything from scratch. :-( After that is done, I'll re-test. Hopefully the problem was just that my old installation was too broken.
I have also tried to extract all the changes (and only the changes) related to the static build from the hermetic-java-runtime branch (ignoring the JavaHome/resource loading changes), to see if I could get something like StaticLink.gmk into mainline. I thought I was doing quite well, but after a while I realized my testing was botched, since the launcher had actually loaded the libraries dynamically instead, even though they were statically linked. :-( I am currently trying to bisect my way through my repo to understand where things went wrong.
Did you run with `bin/javastatic`? The system automatically detects if the binary contains statically linked native libraries and avoids loading the dynamic libraries. Can you please share which test(s) ran into the library loading issue? I'll see if I can reproduce the problem that you are running into.
It was in fact not a problem. I was fooled by an error message. To be sure I was not loading any dynamically linked libraries, I removed the jdk/lib directory. Now the launcher failed, saying something like:
"Error: Cannot locate dynamic library libjava.dylib".
which was a bit of a jump scare.
However, it turned out that it actually tried to load lib/jvm.cfg, and failed in loading this (since I had removed the entire lib directory), and this failure caused the above error message to be printed. When I restored lib/jvm.cfg (but not any dynamic libraries), the launcher worked.
Sounds like you are running into problems immediately during startup. Does the problem occur with just running bin/javastatic using a simple HelloWorld? Can you please send me your command line for reproducing?
Maybe I was not clear enough: I did resolve the problem.
For the static Java support, I changed CreateExecutionEnvironment to return immediately if it executes statically. jvm.cfg is not loaded. Please see https://github.com/openjdk/leyden/blob/c1c5fc686c1452550e1b3663a320fba652248.... Sounds like the JLI_IsStaticJDK() check is not working properly in your case.
I've been trying to extract from your port the minimal set of patches needed to get the static build to work. In that process, JavaHome and JLI_IsStaticJDK have been removed. It might be that this issue arose only in my slimmed-down branch, and not on your leyden branch (at this point I don't recall exactly). But we need to fix this separately, since we must be able to build a static launcher without the hermetic changes.

In my branch, I am only using compile-time #ifdef checks for static vs dynamic. In the long run, the runtime checks that you have added are a good thing, but at the moment they just add intrusive changes without providing any benefit -- if we can't reuse .o files between dynamic and static compilation, there is no point in introducing a runtime check when we already have a working compile-time check.

I did think I correctly changed every dynamic check that you had added into a compile-time check, so it bewilders me somewhat when you say that jvm.cfg is not needed in your branch. Can you verify and confirm that the static launcher actually works in your branch if there is no "lib/jvm.cfg" present? /Magnus
Best, Jiangli
There are several bugs lurking here. For one, the error message is incorrect and should be fixed. Secondly, a statically linked launcher has just a single JVM included and should not have to look for the lib/jvm.cfg file at all.
After looking around a bit in the launcher/jli code, my conclusion is that this code will need some additional care and loving attention to make it properly adjusted to function as a static launcher. We can't have a static launcher that tries to load a jvm.cfg file it does not need, and when it fails, complains that it is missing a dynamic library that it should not load.
I'll try to get this fixed as part of my efforts to get the static launcher into mainline.
This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime branch, where an arbitrary subset of external libraries were hard-coded. Before integration in mainline can be possible, this information needs to be collected correctly and automatically for all included JDK libraries. Fortunately, it is not likely to be too hard. I basically just need to store the information from the LIBS provided to NativeCompilation, and pick that up for all static libraries we include in the static launcher. (A complication is that we need to de-duplicate the list, and that some libraries are specified using two words, like "-framework Application" on macOS, so it will take some care to get it right.)
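[One possible way to do that de-duplication while keeping two-word entries together, sketched as a hypothetical shell helper -- this is not the actual build logic, and "-framework Cocoa" is just an example pair.]

```shell
#!/bin/sh
# De-duplicate a linker LIBS list, preserving order, while treating
# "-framework <name>" as a single two-word entry.
dedup_libs() {
  echo "$1" | tr ' ' '\n' | awk '
    # glue "-framework" to the word that follows it
    $0 == "-framework" { pending = $0; next }
    pending != "" { $0 = pending " " $0; pending = "" }
    # print each distinct entry the first time it is seen
    !seen[$0]++ { out = out $0 " " }
    END { sub(/ $/, "", out); print out }
  '
}

dedup_libs "-lm -framework Cocoa -lm -lz -framework Cocoa"
# prints: -lm -framework Cocoa -lz
```
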
Right, currently the hermetic-java-runtime branch specifies a hard-coded list of dependency libraries for linking. One of the goals of the hermetic prototype was to avoid or reduce refactoring and restructuring of the existing code whenever possible, in order to reduce merge overhead when integrating new changes from the mainline. We can do the proper refactoring and cleanups when getting the changes into the mainline.
That is basically what I am doing right now. I am looking at your prototype and trying to reimplement this functionality properly so that it can be merged into mainline. The first step on that way was to actually get your prototype running.
Now I have managed to get a version of your prototype that only includes the minimal set of changes needed to support the static launcher, and that works on mac and linux/gcc. Since your prototype is based on 586396cbb55a31 from March, I am trying to merge the patch with the latest master. This worked fine for macOS, but I hit an unexpected snag on Linux which I'm currently investigating.
We have only briefly touched on the spec change topic (for the naming of native libraries) during the zoom meetings. I also agree that we should get that part started now. It's unclear to me if there's any existing blocker for that.
I don't think there is. It's just that someone needs to step up and do it.
/Magnus
On Tue, Apr 30, 2024 at 5:42 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
I've been trying to extract from your port a minimal set of patches that is needed to get the static build to work. In that process, JavaHome and JLI_IsStaticJDK have been removed. It might be that this issue arose only in my slimmed-down branch, and not in your leyden branch (at this point I don't recall exactly). But we need to fix this separately, since we must be able to build a static launcher without the hermetic changes.
The JDK and VM code has pre-existing assumptions about the JDK directories and dynamic linking (e.g. the .so). JLI_IsStaticJDK|JLI_SetStaticJDK|JVM_IsStaticJDK|JVM_SetStaticJDK is needed for static JDK support to handle those cases correctly. CreateExecutionEnvironment that I mentioned earlier is one of the examples. I'm quite certain the issue that you are running into is due to the incorrect static check/handling in CreateExecutionEnvironment.
In my branch, I am only using compile-time #ifdef checks for static vs dynamic. In the long run, the runtime checks that you have done are a good thing, but at the moment they are just adding intrusive changes without providing any benefit -- if we can't reuse .o files between dynamic and static compilation, there is no point in introducing a runtime check when we already have a working compile-time check.
I haven't seen your branch/code. I'd suggest not going with the #ifdef checks, as that's the opposite direction of what we want to achieve. It doesn't seem to be worth your effort to add more #ifdef checks in order to make the static linking build work, even if those are only for temporary testing purposes.
I did think I correctly changed every dynamic check that you had added to a compile-time check, so it bewilders me somewhat when you say that jvm.cfg is not needed in your branch.
Can you verify and confirm that the static launcher actually works in your branch, if there is no "lib/jvm.cfg" present?
In my <path>/leyden/build/linux-x86_64-server-slowdebug/images/jdk directory:

$ mv lib/jvm.cfg lib/jvm.cfg.no_used
$ find . | grep jvm.cfg
./lib/jvm.cfg.no_used
$ bin/javastatic -cp <my_jar> HelloWorld
HelloWorld

Thanks!
Jiangli
/Magnus
Best, Jiangli
There are several bugs lurking here. For one, the error message is incorrect and should be corrected. Secondly, a statically linked launcher has just a single JVM included and should not have to look for the lib/jvm.cfg file at all.
After looking around a bit in the launcher/jli code, my conclusion is that this code will need some additional care and loving attention to make it properly adjusted to function as a static launcher. We can't have a static launcher that tries to load a jvm.cfg file it does not need, and when it fails, complains that it is missing a dynamic library that it should not load.
I'll try to get this fixed as part of my efforts to get the static launcher into mainline.
This was done haphazardly in StaticLink.gmk in the hermetic-java-runtime branch, where an arbitrary subset of external libraries was hard-coded. Before integration into mainline is possible, this information needs to be collected correctly and automatically for all included JDK libraries. Fortunately, it is not likely to be too hard. I basically just need to store the information from the LIBS provided to NativeCompilation, and pick that up for all static libraries we include in the static launcher. (A complication is that we need to de-duplicate the list, and that some libraries are specified using two words, like "-framework Application" on macOS, so it will take some care to get it right.)
Right, currently the hermetic-java-runtime branch specifies a list of hard-coded dependency libraries for linking. One of the goals of the hermetic prototype was avoiding/reducing refactoring/restructuring the existing code whenever possible. The reason is to reduce merge overhead when integrating with new changes from the mainline. We can do the proper refactoring and cleanups when getting the changes into the mainline.
That is basically what I am doing right now. I am looking at your prototype and trying to reimplement this functionality properly so that it can be merged into mainline. The first step on that way was to actually get your prototype running.
Now I have managed to get a version of your prototype that includes only the minimal set of changes needed to support the static launcher, and that works on macOS and Linux/gcc. Since your prototype is based on 586396cbb55a31 from March, I am trying to merge the patch with the latest master. This worked fine for macOS, but I hit an unexpected snag on Linux, which I'm currently investigating.
We have only briefly touched on the spec change topic (for the naming of native libraries) during the zoom meetings. I also agree that we should get that part started now. It's unclear to me if there's any existing blocker for that.
I don't think there is. It's just that someone needs to step up and do it.
/Magnus
On 2024-05-07 06:04, Jiangli Zhou wrote:
On Tue, Apr 30, 2024 at 5:42 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
I am not sure why clang insisted on picking up ld and not lld. I remember trying with -fuse-ld=lld, and that it did not work either. Unfortunately, I don't remember exactly what the problems were.
I started reinstalling my Linux workstation yesterday, but something went wrong, and it failed so hard that it got semi-bricked by the new installation, so I need to redo everything from scratch. :-( After that is done, I'll re-test. Hopefully this was just my old installation that was too broken.
I decided to spend the time to reinstall my machine. Now linking with clang works. Kind of. For some reason, it still picks up binutils ld and not lld, and then -l:libc++.a does not work, but when I replaced it with -l:libstdc++.a it worked just fine. I guess we need to either forcefully add -fuse-ld=lld to our clang compilation lines, or figure out whether clang is going to call the binutils or llvm ld, and select the right option. I still find the logic for how clang and gcc locate the default linker to be mostly magic. I guess I need to take a deep dive into understanding this to be able to resolve this properly.
The JDK and VM code has pre-existing assumptions about the JDK directories and dynamic linking (e.g. the .so). JLI_IsStaticJDK|JLI_SetStaticJDK|JVM_IsStaticJDK|JVM_SetStaticJDK is needed for static JDK support to handle those cases correctly. CreateExecutionEnvironment that I mentioned earlier is one of the examples.
I'm quite certain the issue that you are running into is due to the incorrect static check/handling in CreateExecutionEnvironment.
I'll have a look at that, thanks for the pointer.
In my branch, I am only using compile-time #ifdef checks for static vs dynamic. In the long run, the runtime checks that you have done are a good thing, but at the moment they are just adding intrusive changes without providing any benefit -- if we can't reuse .o files between dynamic and static compilation, there is no point in introducing a runtime check when we already have a working compile-time check.

I haven't seen your branch/code. I'd suggest not going with the #ifdef checks, as that's the opposite direction of what we want to achieve. It doesn't seem to be worth your effort to add more #ifdef checks in order to make the static linking build work, even if those are only for temporary testing purposes.
Okaaaaay... My understanding was that you wanted to push for the quickest possible integration of building a static java launcher into mainline. To do that as fast as possible, we need to use the existing framework for separating statically and dynamically linked libraries, which means doing compile-time checks using #ifdefs.

Are you saying now that the priorities have changed, and that you want to start by introducing your framework for the runtime lookup of whether we are static or dynamic? To be honest, I think your prototype is rather hacky in how it implements this, and I reckon that it will require quite a lot of work to be accepted into mainline. I also think you need a CSR for changing the Hotspot/JDK behavior wrt this, which further adds to the process.

If you want to go that route instead, then I'll put my work on hold until you have gotten a working solution for the runtime lookup in mainline. I gather this means that there is no real stress for me anymore.

/Magnus
On Tue, May 7, 2024 at 5:26 AM Magnus Ihse Bursie < magnus.ihse.bursie@oracle.com> wrote:
On 2024-05-07 06:04, Jiangli Zhou wrote:
On Tue, Apr 30, 2024 at 5:42 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
I am not sure why clang insisted on picking up ld and not lld. I remember trying with -fuse-ld=lld, and that it did not work either. Unfortunately, I don't remember exactly what the problems were.
I started reinstalling my Linux workstation yesterday, but something went wrong, and it failed so hard that it got semi-bricked by the new installation, so I need to redo everything from scratch. :-( After that is done, I'll re-test. Hopefully this was just my old installation that was too broken.
I decided to spend the time to reinstall my machine. Now linking with clang works. Kind of. For some reason, it still picks up binutils ld and not lld, and then -l:libc++.a does not work, but when I replaced it with -l:libstdc++.a it worked just fine. I guess we need to either forcefully add -fuse-ld=lld to our clang compilation lines, or figure out if clang is going to call the binutils or llvm ld, and select the right option.
https://lld.llvm.org/#using-lld has some information on using lld instead of the default linker.
I still find the logic for how clang and gcc locate the default linker to be mostly magic. I guess I need to take a deep dive into understanding this to be able to resolve this properly.
The JDK and VM code has pre-existing assumptions about the JDK directories and dynamic linking (e.g. the .so). JLI_IsStaticJDK|JLI_SetStaticJDK|JVM_IsStaticJDK|JVM_SetStaticJDK is needed for static JDK support to handle those cases correctly. CreateExecutionEnvironment that I mentioned earlier is one of the examples.
I'm quite certain the issue that you are running into is due to the incorrect static check/handling in CreateExecutionEnvironment.
I'll have a look at that, thanks for the pointer.
In my branch, I am only using compile-time #ifdef checks for static vs dynamic. In the long run, the runtime checks that you have done are a good thing, but at the moment they are just adding intrusive changes without providing any benefit -- if we can't reuse .o files between dynamic and static compilation, there is no point in introducing a runtime check when we already have a working compile-time check.
I haven't seen your branch/code. I'd suggest not going with the #ifdef checks, as that's the opposite direction of what we want to achieve. It doesn't seem to be worth your effort to add more #ifdef checks in order to make the static linking build work, even if those are only for temporary testing purposes.
Okaaaaay... My understanding was that you wanted to push for the quickest possible integration of building a static java launcher into mainline.
That's correct. Please see more details below.
To do that as fast as possible, we need to use the existing framework for separating statically and dynamically linked libraries, which means doing compile time checks using #ifdefs.
Using #ifdefs is not the most efficient path for us to get static Java launcher support into mainline. That's because most of the runtime changes for static Java support in the hermetic-java-runtime branch are already done using `JLI_IsStaticJDK|JVM_IsStaticJDK` checks. We should not convert those to use #ifdefs and then later convert the #ifdefs back to runtime checks again during the integration work.

As suggested and discussed earlier, we can aim to get the static Java related changes into mainline incrementally. Following is a path that I think would work effectively and "fast" by limiting potentially wasted effort:

Step 1 - Get the makefile changes for linking `javastatic` in, without any of the runtime changes; don't enable any building or testing of `javastatic` in this step yet

Step 2 - Incrementally get the runtime changes reviewed and integrated into mainline; enable building `javastatic` as a test in the github workflow once we can run HelloWorld using the static launcher in mainline; enable tier 1 testing for `javastatic` in the workflow once we can run jtreg tests with the static launcher - could be done in a later step

Step 3 - Remove all STATIC_BUILD macros in JDK runtime code; also clean up the macros in tests (can be done later); CSR and JNI specification work to support JNI_OnLoad_<lib_name> and friends for JNI dynamic and builtin libraries

Step 4 - Build (makefile) changes to support linking .a and .so libraries from the same set of .o objects, to avoid compiling the .c/.c++ sources twice

Those lay the foundation for the hermetic Java work in mainline.
Are you saying now that the priorities have changed, and that you want to start by introducing your framework for the runtime lookup of whether we are static or dynamic?
By "runtime lookup", I think you were referring to the JNI native library lookup. We can handle that as part of step 2 above. I think for any of the runtime changes, we need to be able to build in mainline (although initially not included in the github workflow).
To be honest, I think your prototype is rather hacky in how you implement this, and I reckon that it will require quite a lot of work to be accepted into mainline. I also think you need a CSR for changing the Hotspot/JDK behavior wrt this, which further adds to the process.
For CSR work, we can do that as part of step #3. Actually, for the builtin/dynamic library lookup support, I think the enhancements in hermetic-java-runtime are already close to the proper shape (not hacky).
If you want to go that route instead, then I'll put my work on hold until you have gotten a working solution for the runtime lookup in mainline. I gather this means that there is no real stress for me anymore.
Ron and Alan mentioned Tuesday morning PT may not work the best for you. Would you be open for a separate time to discuss the details on moving forward? Best, Jiangli
/Magnus
We had a discussion on static Java this morning. Outcome of today's discussion: for the first step in the earlier outlined steps, there's a preference to have a minimal buildable and runnable (able to run HelloWorld) `javastatic` as the initial integration point for mainline. Following are the updated plan/steps:

Step 1 - Get the makefile changes for linking `javastatic`, with the minimum needed runtime changes, into mainline; `javastatic` is buildable and runnable - can run HelloWorld; can enable building and testing of `javastatic` in the github workflow

Step 2 - Incrementally get the runtime changes reviewed and integrated into mainline; revert any of the #ifdef changes if they were introduced in the first step; enable tier 1 testing for `javastatic` in the workflow once we can run jtreg tests with the static launcher - could be done in a later step

Step 3 - Remove all STATIC_BUILD macros in JDK runtime code; also clean up the macros in tests (can be done later); CSR and JNI specification work to support JNI_OnLoad_<lib_name> and friends for JNI dynamic and builtin libraries

Step 4 - Build (makefile) changes to support linking .a and .so libraries from the same set of .o objects, to avoid compiling the .c/.c++ sources twice

According to Magnus, his #ifdef changes only affect about half a dozen files. Those #ifdefs are inserted in the places where the JLI_IsStaticJDK|JVM_IsStaticJDK checks are used in the hermetic-java-runtime branch. Magnus will send out his changes as a draft PR for initial review, for deciding how to move forward with the non-makefile changes. If the #ifdef changes are not too disruptive, we could include those in the initial integration work. Then the followup runtime changes would revert the #ifdef changes.

Best,
Jiangli

On Mon, May 20, 2024 at 10:17 PM Jiangli Zhou <jianglizhou@google.com> wrote:
On Tue, May 7, 2024 at 5:26 AM Magnus Ihse Bursie < magnus.ihse.bursie@oracle.com> wrote:
On 2024-05-07 06:04, Jiangli Zhou wrote:
On Tue, Apr 30, 2024 at 5:42 AM Magnus Ihse Bursie <magnus.ihse.bursie@oracle.com> wrote:
I am not sure why clang insisted on picking up ld and not lld. I remember trying with -fuse-ld=lld, and that it did not work either. Unfortunately, I don't remember exactly what the problems were.
I started reinstalling my Linux workstation yesterday, but something went wrong, and it failed so hard that it got semi-bricked by the new installation, so I need to redo everything from scratch. :-( After that is done, I'll re-test. Hopefully this was just my old installation that was too broken.
I decided to spend the time to reinstall my machine. Now linking with clang works. Kind of. For some reason, it still picks up binutils ld and not lld, and then -l:libc++.a does not work, but when I replaced it with -l:libstdc++.a it worked just fine. I guess we need to either forcefully add -fuse-ld=lld to our clang compilation lines, or figure out if clang is going to call the binutils or llvm ld, and select the right option.
https://lld.llvm.org/#using-lld has some information on using lld instead of the default linker.
I still find the logic for how clang and gcc locate the default linker to be mostly magic. I guess I need to take a deep dive into understanding this to be able to resolve this properly.
The JDK and VM code has pre-existing assumptions about the JDK directories and dynamic linking (e.g. the .so). JLI_IsStaticJDK|JLI_SetStaticJDK|JVM_IsStaticJDK|JVM_SetStaticJDK is needed for static JDK support to handle those cases correctly. CreateExecutionEnvironment that I mentioned earlier is one of the examples.
I'm quite certain the issue that you are running into is due to the incorrect static check/handling in CreateExecutionEnvironment.
I'll have a look at that, thanks for the pointer.
In my branch, I am only using compile-time #ifdef checks for static vs dynamic. In the long run, the runtime checks that you have done are a good thing, but at the moment they are just adding intrusive changes without providing any benefit -- if we can't reuse .o files between dynamic and static compilation, there is no point in introducing a runtime check when we already have a working compile-time check.
I haven't seen your branch/code. I'd suggest not going with the #ifdef checks, as that's the opposite direction of what we want to achieve. It doesn't seem to be worth your effort to add more #ifdef checks in order to make the static linking build work, even if those are only for temporary testing purposes.
Okaaaaay... My understanding was that you wanted to push for the quickest possible integration of building a static java launcher into mainline.
That's correct. Please see more details below.
To do that as fast as possible, we need to use the existing framework for separating statically and dynamically linked libraries, which means doing compile time checks using #ifdefs.
Using #ifdefs is not the most efficient path for us to get static Java launcher support in mainline. That's because most of the runtime changes for static Java support in hermetic-java-runtime branch are already done using `JLI_IsStaticJDK|JVM_IsStaticJDK` checks. We should not convert those to use #ifdefs then later convert the #ifdef back to runtime checks again during the integration work.
As suggested and discussed earlier we can aim to get the static Java related changes into mainline incrementally. Following is a path that I think would work effectively and "fast" by limiting potentially wasted efforts:
Step 1 - Get the makefile changes for linking `javastatic` in, without any of the runtime changes; don't enable any building or testing of `javastatic` in this step yet

Step 2 - Incrementally get the runtime changes reviewed and integrated into mainline; enable building `javastatic` as a test in the github workflow once we can run HelloWorld using the static launcher in mainline; enable tier 1 testing for `javastatic` in the workflow once we can run jtreg tests with the static launcher - could be done in a later step

Step 3 - Remove all STATIC_BUILD macros in JDK runtime code; also clean up the macros in tests (can be done later); CSR and JNI specification work to support JNI_OnLoad_<lib_name> and friends for JNI dynamic and builtin libraries

Step 4 - Build (makefile) changes to support linking .a and .so libraries from the same set of .o objects, to avoid compiling the .c/.c++ sources twice
Those lay the foundation for the hermetic Java work in mainline.
Are you saying now that the priorities have changed, and that you want to start by introducing your framework for the runtime lookup of whether we are static or dynamic?
By "runtime lookup", I think you were referring to the JNI native library lookup. We can handle that as part of step 2 above.
I think for any of the runtime changes, we need to be able to build in the mainline (although initially not included in the github workflow).
To be honest, I think your prototype is rather hacky in how you implement this, and I reckon that it will require quite a lot of work to be accepted into mainline. I also think you need a CSR for changing the Hotspot/JDK behavior wrt this, which further adds to the process.
For CSR work, we can do that as part of step #3. Actually, for the builtin/dynamic library lookup support, I think the enhancements in hermetic-java-runtime are already close to the proper shape (not hacky).
If you want to go that route instead, then I'll put my work on hold until you have gotten a working solution for the runtime lookup in mainline. I gather this means that there is no real stress for me anymore.
Ron and Alan mentioned Tuesday morning PT may not work the best for you. Would you be open for a separate time to discuss the details on moving forward?
Best, Jiangli
/Magnus
On 2024-05-21 20:51, Jiangli Zhou wrote:
Magnus will send out his changes as PR draft for initial review for deciding on how to move forward with the non-makefile changes.
This is now published in https://github.com/openjdk/jdk/pull/19478.

/Magnus
On 2024-05-07 06:04, Jiangli Zhou wrote:
I did think I correctly changed every dynamic check that you had added to a compile-time check, so it bewilders me somewhat when you say that jvm.cfg is not needed in your branch.
Can you verify and confirm that the static launcher actually works in your branch, if there is no "lib/jvm.cfg" present?

In my <path>/leyden/build/linux-x86_64-server-slowdebug/images/jdk directory:
$ mv lib/jvm.cfg lib/jvm.cfg.no_used
$ find . | grep jvm.cfg
./lib/jvm.cfg.no_used
$ bin/javastatic -cp <my_jar> HelloWorld
HelloWorld
I was very much misled by this. I was sure I had made some mistake when I picked out the changes you had made for static builds (and removed all the other changes, e.g. for the hermetic jar files), since you said this worked for you. I have been scrutinizing the difference between your branch and mine, over and over again, without understanding what the difference could be.

Finally I did what I should have done at the very beginning, and actually tested building and running your branch. It did not work either.

So why did you claim it worked? I kept digging, and I found out the reason. You had indeed implemented a fix for this, but only on Linux. I was testing on macOS. (It is also not implemented for Windows, but since I'm still struggling to find a way to create proper static builds there, it is less of a problem for now.)

My branch worked just as well as yours on Linux. I have now fixed it so it works on macOS too. With this hurdle out of the way, I can get back to doing real work on the patch. Unfortunately this detour took far too much time. :-(

/Magnus
On Tue, May 28, 2024 at 9:09 AM Magnus Ihse Bursie < magnus.ihse.bursie@oracle.com> wrote:
On 2024-05-07 06:04, Jiangli Zhou wrote:
I did think I correctly changed every dynamic check that you had added to a compile-time check, so it bewilders me somewhat when you say that jvm.cfg is not needed in your branch.
Can you verify and confirm that the static launcher actually works in your branch, if there is no "lib/jvm.cfg" present?

In my <path>/leyden/build/linux-x86_64-server-slowdebug/images/jdk directory:
$ mv lib/jvm.cfg lib/jvm.cfg.no_used
$ find . | grep jvm.cfg
./lib/jvm.cfg.no_used
$ bin/javastatic -cp <my_jar> HelloWorld
HelloWorld
I was very much misled by this. I was sure I had made some mistake when I picked out the changes you had made for static builds (and removed all the other changes, e.g. for the hermetic jar files), since you said this worked for you. I have been scrutinizing the difference between your branch and mine, over and over again, without understanding what the difference could be.
Finally I did what I should have done at the very beginning, and actually tested building and running your branch.
It did not work either.
So why did you claim it worked? I kept digging, and I found out the reason. You had indeed implemented a fix for this, but only on Linux. I was testing on macOS. (It is also not implemented for Windows, but since I'm still struggling to find a way to create proper static builds there it is less of a problem for now.)
My branch worked just as well as yours on Linux. I have now fixed it so it works on macOS too. With this hurdle out of the way, I can get back to doing real work on the patch. Unfortunately this detour took far too much time. :-(
Mystery solved! Yes, our prototype was for linux-x64 only (communicated in https://mail.openjdk.org/pipermail/leyden-dev/2023-February/000106.html and our discussion meetings). I was not aware that you were testing on macOS. The supported platforms were also discussed in the hermetic Java meetings. My understanding was that we can focus on Linux (or unix-like) initially. Let's have the initial integration point support Linux.

Glad to see that you are making progress! Please let me know if you run into any other issues.

Best,
Jiangli
/Magnus
participants (2):
- Jiangli Zhou
- Magnus Ihse Bursie