From sgehwolf at redhat.com  Tue Mar 14 10:36:17 2023
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 14 Mar 2023 11:36:17 +0100
Subject: jmods-less jlinking prototype
Message-ID: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>

The Problem:
------------
Why do we need the 'jmods' directory when running jlink when
'java --list-modules' lists all the modules and all the files from the
JDK build are in place (only the 'jmods' directory is missing)[1]?

Example:

$ rm -rf ./images/jdk/jmods/
$ ./images/jdk/bin/jlink --add-modules java.base --output build/test-jlink-jdk
Error: --module-path is not specified and this runtime image does not contain jmods directory.
Usage: jlink <options> --module-path <modulepath> --add-modules <module>[,<module>...]
Use --help for a list of possible options

The Proposed Solution:
----------------------
Instead of using the .jmod archives when jlinking an application image,
use the JDK installation on the file system together with the module
contents from the jimage to produce the desired result, provided we
initially start with a full JDK (including 'jmods' directory). Use this
mode *if and only if* the 'jmods' directory is missing in the JDK image.
Use a jlink plugin that records resource files - other than classes and
resources which are in the jimage file - in a new resource file
'jmod_resources' and add it to the jimage.

In later jlink passes use the 'jmod_resources' resource file of the
jimage to track the other files that are part of the application image.
The union of 'jmod_resources' (if any[2]) and the classes and resources
files from the jimage forms the equivalent of what '.jmod' archives
track today.

The size of those extra 'jmod_resources' files is fairly small: about
170 bytes on average over all current JDK modules. The one for
'java.base' is "largest" with 1308 bytes on my system. This seems a
good trade-off.

In the proposed prototype the supporting jlink plugin is enabled by
default, but it could instead be enabled only when a specific option is
given.
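
The union described above can be sketched in a few lines of Java. This is only an illustration and not the prototype's code: the class and method names are hypothetical, and the assumption that 'jmod_resources' holds one relative path per line is made purely for the example (the actual file format is defined in the linked branch).

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and file format are assumptions, not the prototype's.
class JmodLessContents {

    /**
     * Returns every file that belongs to one module: the entries already
     * stored in the jimage plus the extra files listed in 'jmod_resources'.
     * Assumption: 'jmod_resources' contains one relative path per line.
     */
    static Set<String> moduleContents(Set<String> jimageEntries,
                                      List<String> jmodResourcesLines) {
        Set<String> all = new TreeSet<>(jimageEntries);
        for (String line : jmodResourcesLines) {
            String path = line.strip();
            if (!path.isEmpty()) {      // skip blank lines
                all.add(path);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        Set<String> inJimage = Set.of("java/lang/Object.class", "module-info.class");
        List<String> recorded = List.of("lib/libjava.so", "bin/java", "");
        System.out.println(moduleContents(inJimage, recorded));
    }
}
```

A module without native libraries or launchers would simply have no recorded lines, in which case the result equals the jimage entries.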

It seems conceivable that such recursive jlink passes would become more
common in the context of Leyden, provided jlink serves as the tool for
driving condensation passes. Supporting jmods-less jlinking would
further strengthen that use-case.

An additional advantage of such an approach would be that distributors
could ship a lighter-weight JDK distribution (without jmods), yet still
support jlinking specific application images.

The code is here:
https://github.com/jerboaa/jdk/compare/jlink-jmods-base...jerboaa:jdk:jlink-jmods-less?expand=1
Or the branch:
https://github.com/jerboaa/jdk/tree/jlink-jmods-less

Current Limitations:
--------------------
The prototype only works on the actual target runtime platform.
Cross-jlinking isn't supported. However, in the context of Leyden this
might be OK, as some AOT compilations in some condenser runs might have
similar restrictions.

Also, only modules already present in the base jimage can be included
in derivatives. Consider a base JDK image including the modules
'java.base', 'app.module1' and 'app.module2'. Further jlink passes
would only be able to create images including a *subset* of those
modules.

Alternatives:
-------------
There is a '--keep-packaged-modules' option, but that seems
unsatisfying since it copies large amounts of data for no good reason,
especially when the java.base.jmod includes debuginfo in native
libraries itself (in particular, libjvm.so).
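
The subset limitation above can be checked up front. The following Java sketch uses the standard java.lang.module.ModuleFinder API to list the modules of the running image and report requested modules it cannot supply. The class name and the module name 'app.module3' are hypothetical, and the check is deliberately simplistic (it does not resolve transitive dependencies the way jlink does).

```java
import java.lang.module.ModuleFinder;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the prototype.
class SubsetCheck {

    /** Names of all modules linked into the currently running image. */
    static Set<String> systemModules() {
        return ModuleFinder.ofSystem().findAll().stream()
                .map(ref -> ref.descriptor().name())
                .collect(Collectors.toCollection(TreeSet::new));
    }

    /** Requested modules that are not available in the given set. */
    static Set<String> missing(Set<String> requested, Set<String> available) {
        Set<String> result = new TreeSet<>(requested);
        result.removeAll(available);
        return result;
    }

    public static void main(String[] args) {
        Set<String> requested = Set.of("java.base", "app.module3");
        Set<String> absent = missing(requested, systemModules());
        if (absent.isEmpty()) {
            System.out.println("All requested modules are in the base image.");
        } else {
            System.out.println("Not in the base image: " + absent);
        }
    }
}
```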

Example Usage:
--------------
$ ./images/jdk/bin/jlink --add-modules ALL-MODULE-PATH --output build/jlinked-image
WARNING: Using incubator modules: jdk.incubator.concurrent, jdk.incubator.vector
$ ls ./build/jlinked-image/jmods
ls: cannot access './build/jlinked-image/jmods': No such file or directory
$ ./build/jlinked-image/bin/jlink --verbose --add-modules java.base \
    --output ./build/java.base-image
java.base jrt:/java.base (jmod-less)

Providers:
  java.base provides java.nio.file.spi.FileSystemProvider used by java.base
  java.base provides java.util.random.RandomGenerator used by java.base
$ ./build/java.base-image/bin/java --version
openjdk 21-internal 2023-09-19
OpenJDK Runtime Environment (build 21-internal-adhoc.sgehwolf.jdk-jdk)
OpenJDK 64-Bit Server VM (build 21-internal-adhoc.sgehwolf.jdk-jdk, mixed mode)

Thoughts? Opinions?

Note that depending on feedback, we could consider getting something
like this into mainline JDK as it seems useful outside of Leyden as
well.

Thanks,
Severin

[1] In the Leyden context there is also this issue:
    Let 'jlink' be the tool to "drive" condenser passes. Would I be
    able to use it recursively, except after the last step? Not in its
    current form without using '--keep-packaged-modules':
    $ ./images/jdk/bin/jlink --add-modules jdk.jlink --output build/test-jlink-image
    $ ./build/test-jlink-image/bin/jlink --add-modules java.base --output build/test-java.base-image
    Error: --module-path is not specified and this runtime image does not contain jmods directory.
    Usage: jlink <options> --module-path <modulepath> --add-modules <module>[,<module>...]
    Use --help for a list of possible options

[2] Not all modules contain non-class-resource data such as native
    libraries or binaries; some contain only Java class files and
    resources. If so, all required files are already in the jimage
    file.
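
For readers who want to see which classes and resources of a module are already in the jimage (the case footnote [2] refers to), here is a small Java sketch that lists them through the standard 'jrt:/' file system. The class name is hypothetical, and this only illustrates reading a module's contents from the running image; it is not how the prototype itself is implemented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the prototype.
class JimageListing {

    /** Relative paths of all classes and resources of a module in the running image. */
    static List<String> entries(String module) {
        // The built-in jrt file system exposes the jimage under /modules/<name>.
        FileSystem jrt = FileSystems.getFileSystem(URI.create("jrt:/"));
        Path root = jrt.getPath("/modules", module);
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .map(p -> root.relativize(p).toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> files = entries("java.base");
        System.out.println("java.base entries in the jimage: " + files.size());
    }
}
```

Native libraries and launchers do not show up in this listing, which is why a separate record such as 'jmod_resources' is needed for them.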

From zjx001202 at gmail.com  Tue Mar 14 12:15:03 2023
From: zjx001202 at gmail.com (Glavo)
Date: Tue, 14 Mar 2023 20:15:03 +0800
Subject: jmods-less jlinking prototype
In-Reply-To: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
Message-ID:

As a reference, I implemented a tool in the past:

https://github.com/Glavo/fallback-jmod

This tool creates an incomplete jmod file, which contains a list of
missing files. This tool also includes a jlink plugin. When performing
a jlink, it will use the classes and binaries in the specified JRE path
to complete the contents of this jmod according to this file list.

I'm glad to see that other people are paying attention to this issue.
The huge size of JDK 9+ makes me feel irritated. I really hope to
remove the duplicate files in the jmod file in some way.

From ron.pressler at oracle.com  Wed Mar 15 00:14:51 2023
From: ron.pressler at oracle.com (Ron Pressler)
Date: Wed, 15 Mar 2023 00:14:51 +0000
Subject: jmods-less jlinking prototype
In-Reply-To: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
Message-ID:

> On 14 Mar 2023, at 10:36, Severin Gehwolf wrote:
>
> The Problem:
> ------------
> Why do we need the 'jmods' directory when running jlink when
> 'java --list-modules' lists all the modules and all the files from the
> JDK build are in place (only the 'jmods' directory is missing)[1]?

Hi.

The way that's stated it looks like the problem you're trying to solve
is of a JDK tool that stops working when you delete some JDK files.

Is the actual issue reducing the JDK's size? If so, can you give some
more motivation for why reducing the size - not of a runtime but of an
SDK - by ~80 MB or 25% is something worth doing given that the solution
is not entirely trivial?

--
Ron

From sgehwolf at redhat.com  Wed Mar 15 10:42:50 2023
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 15 Mar 2023 11:42:50 +0100
Subject: jmods-less jlinking prototype
In-Reply-To:
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
Message-ID: <7fb63fb2730bd5df7b47f77050e600ed0af5c08e.camel@redhat.com>

Hi Ron,

On Wed, 2023-03-15 at 00:14 +0000, Ron Pressler wrote:
> > On 14 Mar 2023, at 10:36, Severin Gehwolf wrote:
> >
> > The Problem:
> > ------------
> > Why do we need the 'jmods' directory when running jlink when
> > 'java --list-modules' lists all the modules and all the files from
> > the JDK build are in place (only the 'jmods' directory is
> > missing)[1]?
>
> Hi.
>
> The way that's stated it looks like the problem you're trying to
> solve is of a JDK tool that stops working when you delete some JDK
> files.

It's less a question of wanting to delete some JDK files, and more
something we had to do in order not to put off too many users by the
size change between JDK 8 and JDK 9+. I'm trying to explain some of it
below. The footnote had another motivational point: being able to
jlink recursively.

-----
[1] In the Leyden context there is also this issue:
    Let 'jlink' be the tool to "drive" condenser passes. Would I be
    able to use it recursively, except after the final step? Not in its
    current form without using '--keep-packaged-modules':
    $ ./images/jdk/bin/jlink --add-modules jdk.jlink --output build/test-jlink-image
    $ ./build/test-jlink-image/bin/jlink --add-modules java.base --output build/test-java.base-image
    Error: --module-path is not specified and this runtime image does not contain jmods directory.
    Usage: jlink <options> --module-path <modulepath> --add-modules <module>[,<module>...]
    Use --help for a list of possible options
-----

Note that --keep-packaged-modules isn't on by default, and I'd argue
it's less compelling to have to use it in the Leyden context because of
the size issue mentioned below. What if somebody wants to run a
condenser in a cloud setup?
Is this not something we want to consider?

> Is the actual issue reducing the JDK's size? If so, can you give some
> more motivation for why reducing the size - not of a runtime but of
> an SDK - by ~80 MB or 25% is something worth doing given that the
> solution is not entirely trivial?

Reducing JDK's size is important and would in our opinion be worth some
extra complexity in jlink code. Why?

  1. Allow recursive jlink runs (see above).
  2. Installed JDK size is something everyone is paying a tax for,
     even though they might not even use jlink for their application
     needs. For example, installing the *full* JDK on Fedora or Red Hat
     Enterprise Linux by picking the 'java-17-openjdk-jmods' package
     would have users download a whopping extra ~230MB of data.
     Network bandwidth and storage aren't free, and this is exacerbated
     in the cloud world with JDK container image sizes. Note that the
     size difference varies depending on how the JDK itself is built.
     Here is what I'm seeing for Fedora and Red Hat Enterprise
     Linux[1]:

     $ du -sh $(rpm -ql java-17-openjdk-jmods | head -n1)
     254M /usr/lib/jvm/java-17-openjdk-[...]/jmods

     So in our case it's not 25% but up to 55% of the entire JDK
     install. The reason for this is the debuginfo symbols included in
     native libraries, which get packaged up in jmods and won't get
     stripped by the build system because jmods are opaque artifacts.
  3. Considering a cloud setup where a full JDK container image is
     being used to generate an application-specific image including
     the Java runtime, such a JDK container image would have to
     include the jmods archives. The full JDK container image is an
     infrastructure component in such a setup. Even a ~80MB extra for
     such images results in extra money needing to be spent (for
     storage or network bandwidth).
     What's more, the size difference makes using the same JDK image
     for application runtime - yes some users want the full JDK in
     containers - as well as for the build-your-own-application-image
     jlink use-case uncompelling.

Thanks,
Severin

[1] See 'Size' in https://koji.fedoraproject.org/koji/rpminfo?rpmID=33327839

From ron.pressler at oracle.com  Wed Mar 15 12:36:15 2023
From: ron.pressler at oracle.com (Ron Pressler)
Date: Wed, 15 Mar 2023 12:36:15 +0000
Subject: [External] : Re: jmods-less jlinking prototype
In-Reply-To: <7fb63fb2730bd5df7b47f77050e600ed0af5c08e.camel@redhat.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com> <7fb63fb2730bd5df7b47f77050e600ed0af5c08e.camel@redhat.com>
Message-ID:

> On 15 Mar 2023, at 10:42, Severin Gehwolf wrote:
>
> Reducing JDK's size is important and would in our opinion be worth some
> extra complexity in jlink code. Why?
>
> 1. Allow recursive jlink runs (see above).

I understand it's an added capability. I'd like to understand why it's
important.

> 2. Installed JDK size is something everyone is paying a tax for,
>    even though they might not even use jlink for their application
>    needs.

But if they don't use jlink, the easier solution is to just delete the
jmod files.

>    For example installing the *full* JDK on Fedora or Red Hat
>    Enterprise Linux by picking the 'java-17-openjdk-jmods' package,
>    would have users download a whopping extra ~230MB of data.

If size is that important you can get an even bigger reduction by not
including debug info.

> 3. Considering a cloud setup where a full JDK container image is
>    being used to generate an application specific image including
>    the Java runtime, such a JDK container image would have to
>    include the jmods archives. The full JDK container image is an
>    infrastructure component in such a setup. Even a ~80MB extra for
>    such images results in extra money needing to be spent (for
>    storage or network bandwidth).

I still don't understand.
How many containers are used for building? Assuming a nice JDK build
where jmods are 25%, we're talking about a 25% difference in the
bandwidth and storage for the *build* infra. How big of an impact is
it?

I think THIS is the main motivation here, and so more data about the
impact is what's required to assess the importance of the proposal.

>    What's more, the size difference
>    makes using the same JDK image for application runtime - yes some
>    users want the full JDK in containers - as well as for the
>    build-your-own-application-image jlink use-case uncompelling.
>

This one is really confusing to me. If you're concerned with runtime
size, with jlink you can reduce the size to 40MB in total; that's a
much, much bigger impact than removing the jmod files.
So if size is important, jlink has a far bigger positive impact than a
negative one, and a bigger positive impact than what you're proposing -
running jlink reduces the size by 85% as opposed to 25% - and if you
don't want to use jlink you can just delete the jmod files and be done
with it.

I understand that what you're saying is that with a bit more complexity
you can get the best of both worlds. It's just that without more
information about the impact, it's unclear how significantly both
worlds are better than just one world.

-- Ron

From heidinga at redhat.com  Wed Mar 15 14:15:37 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Wed, 15 Mar 2023 10:15:37 -0400
Subject: [External] : Re: jmods-less jlinking prototype
In-Reply-To:
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com> <7fb63fb2730bd5df7b47f77050e600ed0af5c08e.camel@redhat.com>
Message-ID:

On Wed, Mar 15, 2023 at 8:36 AM Ron Pressler wrote:

>
> > On 15 Mar 2023, at 10:42, Severin Gehwolf wrote:
> >
> > Reducing JDK's size is important and would in our opinion be worth some
> > extra complexity in jlink code. Why?
> >
> > 1. Allow recursive jlink runs (see above).
>
> I understand it's an added capability.
> I'd like to understand why it's
> important.
>

Read this in the context of previous discussions on this list. In
particular, the previous Jlink thread between Severin, Brian, and
myself [0] where we discuss jlink as a proxy for the eventual Leyden
condenser tool.

With condensers, we're going to want to apply condensers and get a
runnable image. We may also want to apply additional condensers to
that image and produce an even more condensed image. The key idea here
is the output of the process (assuming it's non-terminal) needs to also
be usable as an input to the process:

{ JDK + modules | condensed image } --> condensers --> condensed image
condensed image --> runnable image

[0] https://mail.openjdk.org/pipermail/leyden-dev/2023-February/000133.html

>
> > 2. Installed JDK size is something everyone is paying a tax for,
> >    even though they might not even use jlink for their application
> >    needs.
>
> But if they don't use jlink, the easier solution is to just delete the
> jmod files.
>

Sure. One way to look at this is many existing deployments are
"condensed" already in that they've been opted into a single runnable
platform (ie: linux x86-64) and by allowing an image without jmods to
be further jlinked, we enable them to be condensed further.

>
> >    For example installing the *full* JDK on Fedora or Red Hat
> >    Enterprise Linux by picking the 'java-17-openjdk-jmods' package,
> >    would have users download a whopping extra ~230MB of data.
>
> If size is that important you can get an even bigger reduction by not
> including debug info.
>
> > 3. Considering a cloud setup where a full JDK container image is
> >    being used to generate an application specific image including
> >    the Java runtime, such a JDK container image would have to
> >    include the jmods archives. The full JDK container image is an
> >    infrastructure component in such a setup. Even a ~80MB extra for
> >    such images results in extra money needing to be spent (for
> >    storage or network bandwidth).
>
> I still don't understand. How many containers are used for building?
> Assuming a nice JDK build where jmods are 25%, we're talking about a 25%
> difference in the bandwidth and storage for the *build* infra.
> How big of an impact is it?
>
> I think THIS is the main motivation here, and so more data about the
> impact is what's required to assess the importance of the proposal.
>

Size reduction for existing installs is actually the nice-to-have
benefit we get by allowing a jlinked image to act as an input to jlink.
The core benefit we care about for Leyden is being able to condense
(jlink) an already condensed (jlinked) runtime.

>
> >    What's more, the size difference
> >    makes using the same JDK image for application runtime - yes some
> >    users want the full JDK in containers - as well as for the
> >    build-your-own-application-image jlink use-case uncompelling.
> >
>
> This one is really confusing to me. If you're concerned with runtime size,
> with jlink you can reduce the size to 40MB in total; that's a much, much
> bigger impact than removing the jmod files.
> So if size is important, jlink has a far bigger positive impact than a
> negative one, and a bigger positive impact than what you're proposing -
> running jlink reduces the size by 85% as opposed to 25% -
> and if you don't want to use jlink you can just delete the jmod files and
> be done with it.

Here I'm confused. Severin has prototyped changes to make it easier to
use jlink for a broader set of consumers, so of course we want to use
jlink! This is about making it easier to use jlink - to produce an
initial image and then to further condense that image.

>
> I understand that what you're saying is that with a bit more complexity
> you can get the best of both worlds. It's just that without more
> information about the impact, it's unclear how significantly both
> worlds are better than just one world.
>

The high order bit in this approach is being able to use jlink to
condense a runtime image, and then use jlink to condense it further.
Do you have concerns with the high order bit? We can talk about size
benefits (a secondary benefit) once we all agree on the high order bit.

--Dan

>
> -- Ron

From ron.pressler at oracle.com  Wed Mar 15 14:42:21 2023
From: ron.pressler at oracle.com (Ron Pressler)
Date: Wed, 15 Mar 2023 14:42:21 +0000
Subject: [External] : Re: jmods-less jlinking prototype
In-Reply-To:
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com> <7fb63fb2730bd5df7b47f77050e600ed0af5c08e.camel@redhat.com>
Message-ID: <36A061CE-A482-4D03-8E7C-E295E08376C5@oracle.com>

On 15 Mar 2023, at 14:15, Dan Heidinga wrote:

> The high order bit in this approach is being able to use jlink to
> condense a runtime image, and then use jlink to condense it further.
> Do you have concerns with the high order bit? We can talk about size
> benefits (a secondary benefit) once we all agree on the high order bit.

I think we've seen two different high-order bits, but I can more
readily understand your motivation, as you state it: it will allow an
image to be an intermediate form in a condensing pipeline. It then
depends on the question of whether or not that's the desired
intermediate format, which is a clear question with a clear outcome.

-- Ron

From Alan.Bateman at oracle.com  Wed Mar 15 14:51:18 2023
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 15 Mar 2023 14:51:18 +0000
Subject: jmods-less jlinking prototype
In-Reply-To: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
Message-ID: <7c67f4f2-a794-8852-5aca-8bb4055803b5@oracle.com>

On 14/03/2023 10:36, Severin Gehwolf wrote:
> The Problem:
> ------------
> Why do we need the 'jmods' directory when running jlink when
> 'java --list-modules' lists all the modules and all the files from the
> JDK build are in place (only the 'jmods' directory is missing)[1]?

Just some history around this. During JDK 9, there was a suggestion
that the packaged modules should be a separate download but that had
its own set of issues. There were also suggestions that jlink should
use the classes/resources from the current run-time image to avoid
needing to include the complete contents of all packaged modules; it
just wasn't interesting to pursue at the time. That was 6+ years ago
and there is a lot of experience with jlink since then, plus we're now
on a road where jlink may be working with condensers in the future. So
it could be interesting to explore to see how it might fit in.

More background is that generating a run-time image may involve a
number of transformations, e.g. jlink --strip-debug may remove
debugging symbols and strip debug-related class file attributes; other
plugins, like generate-jli-classes and system-modules, generate code
and/or modify a number of classes. Post generation, there may also be
changes to configuration files in conf/**. I think the point is that
all types of resources (native libs, executables, classes/resources,
conf) can be filtered out, stripped, or irreversibly transformed.
Even if the run-time image has the jdk.jlink module, it might not have
many of the modules that were in the original JDK builds, so it's not
really a good starting point for stamping out further images.

In container environments, we see a lot of examples where jlink is run
to produce a run-time image with a subset of the modules and with some
stripping and compression to get a smaller run-time image. Smaller
container means faster to transport across the network and deploy,
great! Most of the recipes that I've seen published don't allow for
further jlink-ing, either because they only have a small subset of the
modules or they don't have the jdk.jlink module so they don't have the
tool. Maybe this effort might change that.

So I think it could be interesting to explore to at least see how far
it can get. I read your proposal/prototype as generating a patch/diff
of the bits in the run-time image so you can essentially go back to the
contents of the original packaged modules. If I read your mail
correctly, if debug symbols are stripped then it doesn't currently
recover the original libjvm.so or other libraries. Also I couldn't see
if the original conf/templates are saved for future images or whether
configuration edits get copied.

-Alan.

From sgehwolf at redhat.com  Wed Mar 15 15:02:49 2023
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 15 Mar 2023 16:02:49 +0100
Subject: jmods-less jlinking prototype
In-Reply-To:
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com> <7fb63fb2730bd5df7b47f77050e600ed0af5c08e.camel@redhat.com>
Message-ID:

On Wed, 2023-03-15 at 12:36 +0000, Ron Pressler wrote:
> > On 15 Mar 2023, at 10:42, Severin Gehwolf wrote:
> > Reducing JDK's size is important and would in our opinion be worth some
> > extra complexity in jlink code. Why?
> >
> >   1. Allow recursive jlink runs (see above).
>
> I understand it's an added capability. I'd like to understand why
> it's important.

From [i], one of the 3 critical properties is "Condensation is
composable." Consider jlink as a tool driving condensation. The
proposed prototype satisfies the composable property. So its importance
derives from that. It seems conceivable that JDK developers would run
some condensation. If Leyden supports condensation for application
developers, this jlink approach could be a step in that direction. It
enables that use-case on a broader spectrum of JDK installs (not just
the ones including jmods files). So in a way it would also support
"Condensation is selectable".

I understand that this is a moot point if jlink ends up not being used
as a tool for condensation. The premise of this prototype work was that
it might be. Does that make sense?

> >   2. Installed JDK size is something everyone is paying a tax for,
> >      even though they might not even use jlink for their application
> >      needs.
>
> But if they don't use jlink, the easier solution is to just delete the jmod files.

That's right. Thus, our approach of having jmods in a separate RPM
sub-package. Users can selectively install jmods or not. However, that
use-case becomes harder for container images where image owners would
either install it by default or provide two different images (with and
without jmods).

I've phrased it the way I did, since there seemed to have been
reservations about a jmods-less JDK install in your initial reply.
Therefore, my argument was in light of a full JDK install (of which
jmods are a part). I hope that makes some sense.

> >      For example installing the *full* JDK on Fedora or Red Hat
> >      Enterprise Linux by picking the 'java-17-openjdk-jmods' package,
> >      would have users download a whopping extra ~230MB of data.
>
> If size is that important you can get an even bigger reduction by not
> including debug info.

Debug info is an important support tool.
In our world, native libraries of a base JDK build have debuginfo
in-file (internal). The RPM/deb build system strips it from executables
and shared libraries and transplants it into corresponding `-debuginfo`
subpackages so that it can be installed when need be. Not including
debuginfo would break this. Including it has the issue described with
jmods size. But we are digressing...

> >   3. Considering a cloud setup where a full JDK container image is
> >      being used to generate an application specific image including
> >      the Java runtime, such a JDK container image would have to
> >      include the jmods archives. The full JDK container image is an
> >      infrastructure component in such a setup. Even a ~80MB extra for
> >      such images results in extra money needing to be spent (for
> >      storage or network bandwidth).
>
> I still don't understand. How many containers are used for building?
> Assuming a nice JDK build where jmods are 25%, we're talking about a
> 25% difference in the bandwidth and storage for the *build* infra.
> How big of an impact is it?

How many containers? One per JDK version. Say for JDK versions 21, 17
and 11 we'd each have an image. Then each such image receives quarterly
updates (at the very least; there are base image updates as well). So
on a cloud build system, updated images would get re-pulled at least
once a quarter so as to get the latest update. Add 25% in size to that
and it quickly adds up.

>
> >      What's more, the size difference
> >      makes using the same JDK image for application runtime - yes some
> >      users want the full JDK in containers - as well as for the
> >      build-your-own-application-image jlink use-case uncompelling.
> >
>
> This one is really confusing to me. If you're concerned with runtime
> size, with jlink you can reduce the size to 40MB in total; that's a
> much, much bigger impact than removing the jmod files.
> So if size is important, jlink has a far bigger positive impact than
> a negative one, and a bigger positive impact than what you're
> proposing - running jlink reduces the size by 85% as opposed to 25% -
> and if you don't want to use jlink you can just delete the jmod files
> and be done with it.

IMHO an important point in this scenario is that there are different
user groups (owning teams) involved. One group are application owners
(A), another JDK providers (B), and possibly a third group container
image providers (C). Groups need not be the same people (but could be in
some cases).

Going back to the cloud jlink use-case where the end goal would be for
group (A) to get application code into a container as small as possible
we have: input application source code (i) and output the application
binary with a jlink-ed runtime image all bundled up in a container (ii).
In order to do this, artifacts from (C) are being used (artifacts from
group (C) in turn use artifacts from group (B)). Note that group (A)
doesn't want/have access to the JDK themselves. It's provided to them by
(C) and they don't want to change/maintain their own image for this.

Now, group (A) builds their apps in the cloud. So they need a "builder
image" with jmods (from group (C)), since they will be using jlink
without knowing it. So from that aspect, a size reduction for a
jlink-capable "builder image" is already a win. The end result of group
(A), namely (ii), would be even smaller, but you have to produce this
artifact first. Thus, the jlink prototype wins here.

Also, for group (C) it's a win in terms of maintenance when there need
not be two different container images for satisfying use cases from
group (A) where they either 1) want just a runtime (full JDK) + app in a
container or 2) want to use jlink to create a container app image. This
would be a scenario where the "best of both worlds" approach wins.
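
To put rough numbers on the builder-image argument, here is a
back-of-the-envelope sketch in Python. Only the 25% jmods share, the
~80MB figure, the three JDK versions and the quarterly update cadence
come from this thread. The function name, the 240 MB base size and the
number of pulls per update are made-up assumptions, and the 25% is read
here as the jmods share of the *full* image.

```python
def extra_transfer_mb(image_mb_without_jmods, jmods_share,
                      jdk_versions, updates_per_year, pulls_per_update):
    """Extra MB pulled per year because builder images carry jmods.

    jmods_share is assumed to be the fraction of the full image
    (JDK plus jmods) that is taken up by the jmods directory.
    """
    full_image_mb = image_mb_without_jmods / (1.0 - jmods_share)
    extra_per_pull_mb = full_image_mb - image_mb_without_jmods
    pulls_per_year = jdk_versions * updates_per_year * pulls_per_update
    return extra_per_pull_mb * pulls_per_year


# Hypothetical inputs: a 240 MB JDK image without jmods (so jmods add
# 80 MB at a 25% share), JDK 21/17/11, quarterly updates, and an assumed
# 1000 re-pulls of each updated image across the build infrastructure.
extra = extra_transfer_mb(240, 0.25, 3, 4, 1000)
print(f"{extra / 1000:.0f} GB of extra transfer per year")
```

With these made-up inputs the model gives 960 GB per year. The real
figure depends entirely on how often images are actually re-pulled,
which nobody in this thread has measured.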
> I understand that what you're saying is that with a bit more
> complexity you can get the best of both worlds. It's just that
> without more information about the impact, it's unclear how
> significantly are both worlds better than just one world.

Understood.

Thanks,
Severin

[i] https://openjdk.org/projects/leyden/notes/02-shift-and-constrain

From sgehwolf at redhat.com Wed Mar 15 16:57:28 2023
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 15 Mar 2023 17:57:28 +0100
Subject: jmods-less jlinking prototype
In-Reply-To: <7c67f4f2-a794-8852-5aca-8bb4055803b5@oracle.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
 <7c67f4f2-a794-8852-5aca-8bb4055803b5@oracle.com>
Message-ID: <6de512743b1ccf060a61bb55aad2904a680423fb.camel@redhat.com>

On Wed, 2023-03-15 at 14:51 +0000, Alan Bateman wrote:
> On 14/03/2023 10:36, Severin Gehwolf wrote:
> > The Problem:
> > ------------
> > Why do we need the 'jmods' directory when running jlink when 'java
> > --list-modules' lists all the modules and all the files from the JDK
> > build are in (only 'jmods' directory is
> > missing)[1]?
>
> Just some history around this. During JDK 9, there was a suggestion that
> the packaged modules should be a separate download but that had its own
> set of issues. There were also suggestions that jlink should use the
> classes/resources from the current run-time image to avoid needing to
> include the complete contents of all packaged modules, it just wasn't
> interesting to pursue at the time. That was 6+ years ago and there is a
> lot of experience with jlink since then, plus we've now on a road where
> jlink may be working with condensers in the future. So it could be
> interesting to explore to see how it might fit in.

Thanks for the history lesson!
> More background is that generating a run-time image may involve a number
> of transformations, e.g jlink --strip-debug may remove debugging symbols
> and strip debug related class file attributes, other plugins, like
> generate-jli-classes and system-modules, generate code and/or modify a
> number of classes. Post generation, there may also be changes to
> configuration files in conf/**. I think the point is that all types of
> resources (native libs, executables, classes/resources, conf) can be
> filtered out, stripped, or irreversibly transformed.

Agreed with the above. Incidentally, the generate-jli-classes plugin
isn't designed for recursive jlinking so I had to work around it[1].

In the Leyden context this does not seem so bad a trade-off, though.
With this mode you can only subset the JDK image you start out with. So
if the libraries had debug info stripped away the result would have that
too.

I'd argue that the case where users actually want things back (add
back, rather than remove) from any base JDK image is rather small.
Though, I have no numbers about this so this is only anecdotal. Have
you seen use-cases doing that (add-back, taking from jmods)?

> Even if the run-time image has the jdk.jlink module, it might not have many of the
> modules that were in the original JDK builds so it's not really a good
> starting point for stamping out further images.

While it might not have many modules in the JDK, the prime use case of
this prototype is focused on further reduction, so it might in fact be a
good starting point for many cases. In particular, the base case of
ALL-MODULE-PATH -> jlink again to get a small application image.

> On container environments, we see a lot of examples where jlink is run
> to produce a run-time image with a subset of the modules and with some
> stripping and compression to get a smaller run-time image. Smaller
> container means faster to transport across network and deploy, great!
> Most of the recipes that I've seen published don't allow for further
> jlink-ing, either because they only have a small subset of the modules
> or they don't have the jdk.jlink module so they don't have the tool.
> Maybe this effort might change that.

+1

> So I think it could be interesting to explore to at least see how far it
> can get. I read your proposal/prototype as generating a patch/diff of
> the bits in the run-time image so you can essentially go back to the
> contents of the original packages modules. If I read your mail
> correctly, if debug symbols are stripped then it doesn't currently
> recover the original libjvm.so or other libraries.

Correct. Recovering something that wasn't in the JDK image to begin
with is not a goal. That's where the original jmods-full use case would
come in (although, if the JDK build had debug symbols already stripped,
there is also no way to go back and recover those in the jmods-full
case).

> Also I couldn't see
> if the original conf/templates are saved for future images or whether
> configuration edits get copied.

Original conf/templates aren't saved. They're taken as is from the base
image, i.e. modifications propagate throughout.

I'd like to stress that this proposal isn't supposed to replace the
jmods-full jlink. At least the first stage needs to be jmods-full (e.g.
initial JDK build). Or at least it wasn't designed for that.
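
The constraints discussed above (subset only, stripping is irreversible,
conf edits propagate) can be summarised in a small toy model. This is
only an illustrative sketch in Python, not the prototype's code: the
`Image` type and the `relink` function are hypothetical names, and the
model deliberately ignores module dependency resolution and everything
else jlink really does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """Toy stand-in for a run-time image (hypothetical, not a jlink type)."""
    modules: frozenset
    debug_stripped: bool
    conf_edits: frozenset


def relink(base, requested, strip_debug=False):
    """Derive a new image from `base` without consulting any jmods."""
    requested = frozenset(requested)
    missing = requested - base.modules
    if missing:
        # Nothing outside the base image can be added back in this mode.
        raise ValueError(f"not in base image: {sorted(missing)}")
    return Image(
        modules=requested,
        # Stripping is irreversible: once stripped, always stripped.
        debug_stripped=base.debug_stripped or strip_debug,
        # conf/ files are taken as-is from the base image.
        conf_edits=base.conf_edits,
    )


# Assumed example data: a base image whose debug info was already
# stripped and whose conf/net.properties was edited after generation.
base = Image(frozenset({"java.base", "java.logging", "jdk.jlink"}),
             debug_stripped=True,
             conf_edits=frozenset({"conf/net.properties"}))
small = relink(base, {"java.base", "java.logging"})
print(sorted(small.modules), small.debug_stripped)
```

Asking this model for a module that is not in the base image (say
java.sql) raises an error, which mirrors the "reduction only" nature of
the jmods-less mode.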
[1] https://github.com/jerboaa/jdk/commit/41339bf7938c2070b68b34a64cf86241282e1c66#diff-fe9ad54f129e10e65432e62154d18797929677180122263f50ab454b59658cd8R65

From Alan.Bateman at oracle.com Fri Mar 17 09:34:29 2023
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 17 Mar 2023 09:34:29 +0000
Subject: jmods-less jlinking prototype
In-Reply-To: <6de512743b1ccf060a61bb55aad2904a680423fb.camel@redhat.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
 <7c67f4f2-a794-8852-5aca-8bb4055803b5@oracle.com>
 <6de512743b1ccf060a61bb55aad2904a680423fb.camel@redhat.com>
Message-ID: <981ca9f9-bf75-c031-b89f-4d1d060fbc91@oracle.com>

On 15/03/2023 16:57, Severin Gehwolf wrote:
> :
> I'd argue that the case where users actually want things back (add
> back, rather than remove) from any base JDK image is rather small.
> Though, I have no numbers about this so this is only anecdotal. Have
> you seen use-cases doing that (add-back, taking from jmods)?

Almost every usage that I've observed has been reduction and one jlink
step to create a small run-time image. It always starts with a JDK build
with packaged modules and then jlink to generate a run-time image with
the subset of the standard and JDK modules that the application needs.
Some sightings have included jlink options to compress or strip debug
attributes, just to get the size down. I haven't seen any usages of
--keep-packaged-modules so the resulting run-time couldn't be used to
stamp out another run-time image, at least not without specifying
--module-path and of course including the jdk.jlink module.

The only additive case that I've come across is with the GraalVM build
where they use the jlink --add-options option to add VM options to
select the Graal JIT. This works by adding a resource to the java.base
module. More recently, --save-jlink-argfiles has been added so the
options are preserved for future stamping out of images.
I haven't seen any usages of this yet but it would need to be paired
with --keep-packaged-modules to be useful. Your prototype avoids needing
that of course.

-Alan

From mike at hydraulic.software Fri Mar 17 11:33:47 2023
From: mike at hydraulic.software (Mike Hearn)
Date: Fri, 17 Mar 2023 12:33:47 +0100
Subject: jmods-less jlinking prototype
In-Reply-To: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
Message-ID:

Hi Severin,

There's an alternative that can resolve some of the problems raised in
this thread, preserve the ability to cross-link (which is extremely
useful - many of our users rely on it), and also tidy up some rough
edges in the current packaging/distribution story. The idea is to stop
shipping JMODs with the JDK but enhance the release metadata file to
point to the URLs of those JMODs for each supported platform. JLink
can then use that metadata file to download any JMODs that are missing
before linking.

Advantages:

- Shrinks the base JDK download without the user needing to delete the
jmods directory.

- Ducks the issue of JLink transforming the files irreversibly.

- Preserves/enhances the ability to cross-link, without needing JMOD
format changes.

- The upgraded release metadata file can also be put online
separately, so tools can then be pointed to a single URL in order to
discover, download and generate a JDK of any platform from any vendor
(as long as they have access to a jlink already).

This would be similar to the approach Conveyor already uses, without
the proprietary parts. Currently we ship "stdlib" config files for
each JDK vendor and version that contain the URLs of the JDK
downloads, the tool then extracts that to get at the jmods inside,
then downloads a standard OpenJDK JDK for the host platform to get a
jlink, then uses jlink on each platform JDK to construct the JVM
that'll then be bundled with the app.
The configs are in turn generated by probing the foojay discovery API,
but we've had stability issues with it in the past. It'd be much simpler
if users could just be told to specify a magic URL for their vendor that
covers all versions of all platforms, especially if the JMODs were then
"loose" as it'd mean only the needed modules would be downloaded.
Obviously, it'd require vendors to be on board and change how they do
uploads but the current approach with flaky discovery APIs and
zips-inside-zips is kind of indirect and inefficient.

With respect to recursive linking/condensation, it might be a bit
early to think about that. As previously discussed people may want
optimized JDK images specifically for development purposes, and going
"sideways" from optimized-for-dev to optimized-for-release would be a
major complication. Making it efficient to start from a set of JMODs
and JARs seems like the way to go.

From heidinga at redhat.com Fri Mar 17 13:53:08 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Fri, 17 Mar 2023 09:53:08 -0400
Subject: jmods-less jlinking prototype
In-Reply-To: <981ca9f9-bf75-c031-b89f-4d1d060fbc91@oracle.com>
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
 <7c67f4f2-a794-8852-5aca-8bb4055803b5@oracle.com>
 <6de512743b1ccf060a61bb55aad2904a680423fb.camel@redhat.com>
 <981ca9f9-bf75-c031-b89f-4d1d060fbc91@oracle.com>
Message-ID:

On Fri, Mar 17, 2023 at 5:35 AM Alan Bateman wrote:

> On 15/03/2023 16:57, Severin Gehwolf wrote:
> > :
> > I'd argue that the case where users actually want things back (add
> > back, rather than remove) from any base JDK image is rather small.
> > Though, I have no numbers about this so this is only anecdotal. Have
> > you seen use-cases doing that (add-back, taking from jmods)?
>
> Almost every usage that I've observed has been reduction and one jlink
> step to create a small run-time image.
> It's always start with a JDK
> build with packaged modules and then jlink to generate a run-time image
> with the subset of the standard and JDK modules that the application
> needs. Some sightings have included jlink options to compress or strip
> debug attributes, just to get the size down. I haven't seen any usages
> of --keep-packaged-modules so the resulting run-time couldn't be used to
> stamp out another run-time image, at least not without specifying
> -module-path and of course including the jdk.jlink module.
>

Severin and I talked about whether to always include jdk.jlink as part
of his prototype but finally opted not to as some users may not want it.

We also talked about some of the ways jlink (condensers) can be set up.
The obvious first case is as it is today: generating an image starting
from the JDK & jmod files.

JDK+jmods --- jlink ---> runtime image + modules file

Severin's prototype is another point on the spectrum where some
condensers (jlink plugins) have been run and modified the image in
various ways (i.e. making it platform-specific, pre-generating the jli
classes, etc.) making it more ready for deployment. Many of the
customizations have been captured in the modules file - new classes,
modified classes, etc. - so using the modules file as input to the next
jlink is important to not lose those customizations. As Severin's
prototype shows, the modules file is not sufficient on its own - we
still need the rest of the JDK and the various config files, etc. that
go along with each module. Today, we don't have users doing this kind of
re-jlink as they don't have the jmods in their image anymore.

JDK+jmods --- jlink ---> runtime image + modules file --- jlink ---> new runtime image + new modules file

There's another option though which avoids needing jdk.jlink to be
included in the intermediate images. We could use a separate JDK to do
the jlink process and allow it to use the runtime image+modules file as
the input to the process.
This has jlink act as a tool on the side, operating on either the
JDK+jmods or the runtime image+modules file. We don't need jlink to be
in the image, just available in some JDK that can operate on the image.

JDK+jmods --- jlink ---> runtime image + modules file
runtime image + modules --- JDK's jlink ---> new runtime image + new modules file

We can grow Severin's prototype to eventually allow this latter form as
well once we've nailed down the current behaviour.

> The only of additive case that I've come across is with the GraalVM
> build where they use the jlink --add-option to add VM options to select
> the Graal JIT. This works by adding a resource to the java.base module.
> More recently, --save-jlink-argfiles has been added so the options are
> preserved for future stamping of out of images. I haven't seen any
> usages of this yet but it would need to be paired with
> --keep-packaged-modules to be useful. Your prototype avoid needing that
> of course.
>

Thanks for mentioning the --save-jlink-argfiles option. I missed that
when it went in.

--Dan

>
> -Alan
>

From heidinga at redhat.com Mon Mar 20 13:58:34 2023
From: heidinga at redhat.com (Dan Heidinga)
Date: Mon, 20 Mar 2023 09:58:34 -0400
Subject: jmods-less jlinking prototype
In-Reply-To:
References: <5a81ad40b9b20d92a73ddedf437187c7f141f9ce.camel@redhat.com>
Message-ID:

On Fri, Mar 17, 2023 at 7:34 AM Mike Hearn wrote:

> Hi Severin,
>
> There's an alternative that can resolve some of the problems raised in
> this thread, preserve the ability to cross-link (which is extremely
> useful - many of our users rely on it), and also tidy up some rough
> edges in the current packaging/distribution story. The idea is to stop
> shipping JMODs with the JDK but enhance the release metadata file to
> point to the URLs of those JMODs for each supported platform.
> JLink
> can then use that metadata file to download any JMODs that are missing
> before linking.
>
> Advantages:
>
> - Shrinks the base JDK download without the user needing to delete the
> jmods directory.
>
> - Ducks the issue of JLink transforming the files irreversibly.
>
> - Preserves/enhances the ability to cross-link, without needing JMOD
> format changes.
>
> - The upgraded release metadata file can also be put online
> separately, so tools can then be pointed to a single URL in order to
> discover, download and generate a JDK of any platform from any vendor
> (as long as they have access to a jlink already).
>
> This would be similar to the approach Conveyor already uses, without
> the proprietary parts. Currently we ship "stdlib" config files for
> each JDK vendor and version that contains the URLs of the JDK
> downloads, the tool then extracts that to get at the jmods inside,
> then downloads a standard OpenJDK JDK for the host platform to get a
> jlink, then uses jlink on each platform JDK to construct the JVM
> that'll then be bundled with the app. The configs are in turn
> generated by probing the foojay discovery API, but we've had stability
> issues with it in the past. It'd be much simpler if users could just
> be told to specify a magic URL for their vendor that covers all
> versions of all platforms, especially if the JMODs were then "loose"
> as it'd mean only the needed modules would be downloaded. Obviously,
> it'd require vendors to be on board and change how they do uploads but
> the current approach with flaky discovery APIs and zips-inside-zips is
> kind of indirect and inefficient.
>
> With respect to recursive linking/condensation, it might be a bit
> early to think about that. As previously discussed people may want
> optimized JDK images specifically for development purposes, and going
> "sideways" from optimized-for-dev to optimized-for-release would be a
> major complication.
> Making it efficient to start from a set of JMODs
> and JARs seems like the way to go.
>

Hi Mike,

There are certainly benefits to this approach. I think there's a fair
bit of synergy between your approach and the larger set of use cases I
wrote up in my response to Alan (see other fork of this email thread).

A common pattern across some of the use cases is the need to decouple
the "inputs" (modules, jmods, etc.) from the JDK running jlink. Once
that's possible - as Severin shows in his prototype - it becomes simpler
to talk about different sources of modules including remote sources.

From a Leyden perspective, I'm more interested in exploring how to
extend jlink to allow condensing previously condensed images as I see
that developing into a more common usage pattern in large enterprises
with platform teams providing preconfigured supported JDKs to their
teams which support further customization. Kind of akin to a build pack
approach.

--Dan