Questions about the Hermetic Java project

Magnus Ihse Bursie magnus.ihse.bursie at oracle.com
Wed Feb 14 09:00:27 UTC 2024


First some background for build-dev: I have spent some time looking at 
the build implications of the Hermetic Java effort, which is part of 
Project Leyden. A high-level overview is available here: 
https://cr.openjdk.org/~jiangli/hermetic_java.pdf and the current source 
code is here: https://github.com/openjdk/leyden/tree/hermetic-java-runtime.

Hermetic Java faces several challenges, but the part that is relevant 
for the build system is the ability to create static libraries. We've 
had this functionality (in three different ways...) for some time, but 
it is rather badly implemented.

As a result of my investigations, I have a bunch of questions. :-) I 
have gotten some answers in private discussion, but for the sake of 
transparency I will repeat them here, to foster an open dialogue.

1. Am I correct in understanding that the ultimate goal of this exercise 
is to be able to have jmods which include static libraries (*.a) of the 
native code which the module uses, and that the user can then run a 
special jlink command to have this linked into a single executable 
binary (which also bundles the *.class files and any additional 
resources needed)?

2. If so, is the idea to create special kinds of static jmods, like 
java.base-static.jmod, that contains *.a files instead of lib*.so files? 
Or is the idea that the normal jmod should contain both?

3. Linking .o and .a files into an executable is a formidable task. Is 
the intention to have jlink call a system-provided ld, or to bundle ld 
with jlink, or to reimplement this functionality in Java?

4. Is the intention to allow users to create their own jmods with 
static libraries, and have these linked in as well? This seems to be the 
case. If so, there will always be a risk of name collisions, and we can 
only minimize that risk by making sure any global names are as unique as 
possible.
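
To make the collision risk in question 4 concrete, here is a minimal 
sketch of a check that reports global symbols defined by more than one 
static library. The library and symbol names are hypothetical; in 
practice the symbol sets would have to be extracted from the real *.a 
files with a tool such as nm, which this sketch does not do.

```python
from collections import defaultdict


def find_collisions(globals_by_library):
    """Map each global symbol defined by several libraries to those libraries."""
    owners = defaultdict(set)
    for library, symbols in globals_by_library.items():
        for symbol in symbols:
            owners[symbol].add(library)
    # Only symbols with more than one defining library are collisions.
    return {sym: sorted(libs) for sym, libs in owners.items() if len(libs) > 1}


# Hypothetical input: global symbols per static library (not real JDK data).
example = {
    "libfoo.a": {"foo_api_open", "throw_error", "util_hash"},
    "libuser.a": {"user_init", "throw_error"},
}

for symbol, libraries in find_collisions(example).items():
    print(f"{symbol} is defined in: {', '.join(libraries)}")
```

A generic helper name like throw_error is exactly the kind of global 
that is harmless in a shared library but clashes once everything is 
linked into one executable.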

5. The original implementation of static builds in the JDK, created for 
the Mobile project, used a configure flag, --enable-static-builds, to 
change the entire behavior of the build system to only produce *.a files 
instead of lib*.so. In contrast, the current system is using a special 
target instead. In my eyes, this is a much worse solution. Apart from 
the conceptual principle (whether the build should generate static or 
dynamic libraries is definitely a property of what a "configuration" 
means), this makes it much harder to implement efficiently, since we 
cannot make changes in NativeCompilation.gmk, where they are needed.

That was not as much a question as a statement. 🙂 But here is the 
question: Do you think it would be reasonable to restore the old 
behavior but with the new methods, so that we don't use special targets, 
but instead tell configure to generate static libraries? I'm thinking 
we should have a flag like "--with-library-type=" that can have the 
values "dynamic" (which is the default), "static" or "both". I am not 
sure if "both" is needed, but if we want to bundle both lib*.so and *.a 
files into a single jmod file (see question 2 above), then it definitely 
is. In general, the cost of producing two kinds of libraries is quite 
small compared to the cost of compiling the source code to object files.
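
As a rough illustration of the proposed semantics, the sketch below 
maps a library type to the files the build would produce for one 
library. The flag itself is only a proposal, and the function name and 
the lib<name>.a naming are my own assumptions, not existing build 
system behavior.

```python
# Assumed mapping from the proposed --with-library-type values to
# output file name patterns (hypothetical, not existing build logic).
LIBRARY_KINDS = {
    "dynamic": ["lib{}.so"],
    "static": ["lib{}.a"],
    "both": ["lib{}.so", "lib{}.a"],
}


def library_outputs(name, library_type="dynamic"):
    """Return the library files to produce for one native library."""
    if library_type not in LIBRARY_KINDS:
        raise ValueError(f"unknown library type: {library_type}")
    return [pattern.format(name) for pattern in LIBRARY_KINDS[library_type]]


print(library_outputs("java", "both"))
```

With "both", the object files are compiled once and only the final 
link/archive step is done twice, which is why the extra cost is small.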

Finally, I have looked at how to manipulate symbol visibility. There 
seem to be many ways forward, so I feel confident that we can find a 
good solution.

One way forward is to use objcopy to manipulate symbol status 
(global/local). The --localize-symbol option has been available in 
objcopy since at least version 2.15, which was released in 2004, so it 
should be safe to use. But ideally we should avoid using objcopy and do 
this as part of the linking process. This should be possible, provided 
that we can make changes in NativeCompilation.gmk -- see question 5 
above.
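
A minimal sketch of the objcopy approach: given the global symbols of 
an object file and the set that should stay exported, build an objcopy 
command line that localizes everything else. The function and the 
symbol names are hypothetical, and the command is only constructed 
here, not executed.

```python
def localize_command(object_file, global_symbols, exported):
    """Build an objcopy command that makes all non-exported globals local."""
    to_hide = sorted(set(global_symbols) - set(exported))
    # With a single file argument, objcopy rewrites the file in place.
    return (
        ["objcopy"]
        + [f"--localize-symbol={symbol}" for symbol in to_hide]
        + [object_file]
    )


# Hypothetical symbols: one API function and two internal helpers.
command = localize_command(
    "foo.o",
    global_symbols=["foo_api_open", "throw_error", "util_hash"],
    exported=["foo_api_open"],
)
print(" ".join(command))
```

The resulting list could be passed to subprocess.run on a system where 
objcopy is installed.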

As a fallback, it is also possible to rename symbols, either piecewise 
or wholesale, using objcopy. There are many ways to do this, using 
--prefix-symbols, --redefine-sym or --redefine-syms (note the -s, this 
takes a file with a list of symbols). Thus we can always introduce a 
"post factum namespace" by renaming symbols.
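
The renaming fallback could look roughly like this: generate the symbol 
map that --redefine-syms reads (one "old new" pair per line) and the 
matching objcopy command. The prefix, the file names and the symbol 
names are hypothetical; nothing is written to disk or executed.

```python
def redefine_map(symbols, prefix):
    """Return the text of a --redefine-syms file: one 'old new' pair per line."""
    return "".join(f"{symbol} {prefix}{symbol}\n" for symbol in sorted(symbols))


def rename_command(object_file, map_file):
    """Build an objcopy command that applies the symbol map to object_file."""
    return ["objcopy", f"--redefine-syms={map_file}", object_file]


# Hypothetical internal helpers that get a library-specific prefix.
map_text = redefine_map(["util_hash", "throw_error"], prefix="libfoo_")
print(map_text, end="")
print(" ".join(rename_command("foo.o", "foo.symmap")))
```

Renaming only a selected list of symbols this way leaves the exported 
API names untouched, which is the "piecewise" variant.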

So in the end, I think it will be fully possible to produce .a files 
that only have global symbols for the functions that are part of the 
API exposed by that library, with all other symbols local, and to do 
this in a way that is consistent with the rest of the build system.

Finally, a note on Hotspot. For debugging reasons, we export basically 
all symbols in hotspot as global. This is not reasonable to do for a 
static build. The effect of not exporting those symbols will be that SA 
will not be fully functional. On the other hand, I have no idea if SA 
works at all with a static build. Have you tested this? Is supporting SA 
part of the plan, or will it be officially dropped for Hermetic Java?

/Magnus


