From mark.reinhold at oracle.com Fri May 20 14:42:01 2022
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Fri, 20 May 2022 07:42:01 -0700 (PDT)
Subject: Welcome to Project Leyden!
Message-ID: <20220520144201.B214553774B@eggemoggin.niobe.net>

I've published an initial Project page, here:

https://openjdk.java.net/projects/leyden/

At the moment Leyden has no repositories, since I expect that at the
start we'll focus more upon concepts than code. We'll create
repositories for documents and code as we need them, and assign roles
at that time.

- Mark

From mark.reinhold at oracle.com Fri May 20 14:42:02 2022
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Fri, 20 May 2022 07:42:02 -0700 (PDT)
Subject: Project Leyden: Beginnings
Message-ID: <20220520144202.C213853774D@eggemoggin.niobe.net>

The ultimate goal of this Project, as stated in the Call for Discussion
[1], is to address the long-term pain points of Java's slow startup
time, slow time to peak performance, and large footprint.

In the Call for Discussion I proposed that we address these pain points
by introducing a concept of _static run-time images_ to the Java
Platform, and to the JDK.

- A static image is a standalone program, derived from an application
  and a JDK, which runs that application -- and no other.

- A static image is a _closed world_ with respect to the classes that
  it can load: At run time it cannot load classes from outside the
  image, nor can it create classes dynamically.

The closed-world constraint imposes strict limits on Java's natural
dynamism, particularly on the run-time reflection and class-loading
features upon which so many existing Java libraries and frameworks
depend. Not all applications are well suited to this constraint, and
not all developers are willing to live with it.

So rather than adopt the closed-world constraint at the start, I propose
that we instead pursue a gradual, incremental approach.
We will explore a spectrum of constraints, weaker than the closed-world
constraint, and discover what optimizations they enable. The resulting
optimizations will almost certainly be weaker than those enabled by the
closed-world constraint. Because the constraints are weaker, however,
the optimizations will likely be applicable to a broader range of
existing code -- thus they will be more useful to more developers.

We will work incrementally along this spectrum of constraints, starting
small and simple so that we can develop a firm understanding of the
changes required to the Java Platform Specification. Along the way we
will strive, of course, to preserve Java's core values of readability,
compatibility, and generality.

We will lean heavily on existing components of the JDK including the
HotSpot JVM, the C2 compiler, application class-data sharing (CDS), and
the `jlink` linking tool.

In the long run we will likely embrace the full closed-world constraint
in order to produce fully-static images. Between now and then, however,
we will develop and deliver incremental improvements which developers
can use sooner rather than later.

Let us begin!

- Mark


[1] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html

// https://openjdk.java.net/projects/leyden/notes/01-beginnings

From volker.simonis at gmail.com Fri May 20 16:16:44 2022
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 20 May 2022 18:16:44 +0200
Subject: Project Leyden: Beginnings
Message-ID:

Hi Mark,

Thanks for all your efforts to finally get Leyden started!

> The ultimate goal of this Project, as stated in the Call for Discussion
> [1], is to address the long-term pain points of Java's slow startup
> time, slow time to peak performance, and large footprint.
>
You're probably aware of the CRaC project [1], which also addresses
the first two of these pain points (slow startup & slow time to peak
performance) by leveraging checkpointing (a.k.a. snapshotting) and
restoring (a.k.a.
resuming) of fully warmed up JVM instances.

[1] https://openjdk.java.net/projects/crac/

> In the Call for Discussion I proposed that we address these pain points
> by introducing a concept of _static run-time images_ to the Java
> Platform, and to the JDK.
>
> - A static image is a standalone program, derived from an application
> and a JDK, which runs that application -- and no other.
>
> - A static image is a _closed world_ with respect to the classes that
> it can load: At run time it cannot load classes from outside the
> image, nor can it create classes dynamically.
>
> The closed-world constraint imposes strict limits on Java's natural
> dynamism, particularly on the run-time reflection and class-loading
> features upon which so many existing Java libraries and frameworks
> depend. Not all applications are well suited to this constraint, and
> not all developers are willing to live with it.
>
> So rather than adopt the closed-world constraint at the start, I propose
> that we instead pursue a gradual, incremental approach.
>
Now that exploring "static images" is no longer the main goal of
Leyden (you should probably update the Leyden project page [2]
to reflect this), the goals of CRaC and Leyden seem to match even
more. CRaC's new execution model doesn't impose any constraints on
"Java's natural dynamism", so it should naturally support most
server-side applications out of the box. Instead, CRaC imposes a new
constraint on Java applications, which we call "snapsafety". A
snapsafe application can operate correctly and securely after it has
been restored from a previously checkpointed (and possibly cloned)
state. The main challenge for CRaC is to first make the JVM and the
core libraries snapsafe before it exposes hooks to libraries and
applications to give them a chance to become snapsafe as well.
[2] https://openjdk.java.net/projects/leyden/

> We will explore a spectrum of constraints, weaker than the closed-world
> constraint, and discover what optimizations they enable. The resulting
> optimizations will almost certainly be weaker than those enabled by the
> closed-world constraint. Because the constraints are weaker, however,
> the optimizations will likely be applicable to a broader range of
> existing code -- thus they will be more useful to more developers.
>
> We will work incrementally along this spectrum of constraints, starting
> small and simple so that we can develop a firm understanding of the
> changes required to the Java Platform Specification. Along the way we
> will strive, of course, to preserve Java's core values of readability,
> compatibility, and generality.
>
It seems to me that "snapsafety" could be such a constraint and I hope
for a fruitful and successful cooperation between the two projects.

Thank you and best regards,
Volker

> We will lean heavily on existing components of the JDK including the
> HotSpot JVM, the C2 compiler, application class-data sharing (CDS), and
> the `jlink` linking tool.
>
> In the long run we will likely embrace the full closed-world constraint
> in order to produce fully-static images. Between now and then, however,
> we will develop and deliver incremental improvements which developers
> can use sooner rather than later.
>
> Let us begin!
>
> - Mark
>
>
> [1] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html
>
> // https://openjdk.java.net/projects/leyden/notes/01-beginnings

From Matthew.Carter at microsoft.com Fri May 20 22:50:46 2022
From: Matthew.Carter at microsoft.com (Mat Carter)
Date: Fri, 20 May 2022 22:50:46 +0000
Subject: Project Leyden: Beginnings
Message-ID:

Hi Mark

This is great news!
We look forward to the discussions and collaborations ahead, as this is
a major focus area for us and our 1st and 3rd party customers.

The Microsoft Java engineering group have started looking at the tooling
(performance counters and JFR events) that is currently available for
developers to understand the impact of their decisions on
start-up/warmup (reaching steady-state). We'll surface our initial
thoughts and questions about the current state of profiling in the
serviceability-dev mailing list.

Cheers
Mat

From zjx001202 at gmail.com Mon May 23 08:49:23 2022
From: zjx001202 at gmail.com (Glavo)
Date: Mon, 23 May 2022 16:49:23 +0800
Subject: Question about Project Leyden: why choose "static run-time images"?
Message-ID:

Hello, everyone. I'm excited to see OpenJDK's quest to improve startup
speed, but I am confused about one aspect of Project Leyden.

As far as I know, there are currently two efforts in the Java community to
improve startup speed:

* OpenJ9[2], Zing JVM[1] and Dragonwell[3] reduce JIT warm-up time by
  caching profiling data or JIT results.
  This approach greatly improves startup speed with little sacrifice in
  functionality and flexibility.
* GraalVM native-image introduces the "closed-world assumption": at the
  expense of flexibility and some functionality, it statically compiles
  Java code, which reduces startup time and size.
  Relatively speaking, it is more expensive, and we need to limit ourselves
  to use it well.

When I got to know Project Leyden, it seemed to me that its purpose and cost
are very much the same as those of GraalVM native-image.
So, why did Project Leyden choose this approach? Why not opt for a more
general way like OpenJ9?
What decisive advantage does it have over GraalVM native-image that makes us
need to start over?
Thanks

[1]: https://docs.azul.com/prime/Use-ReadyNow.html
[2]: https://www.eclipse.org/openj9/docs/shrc/#aot-code-and-jit-data
[3]: https://github.com/alibaba/dragonwell8/wiki/Alibaba-Dragonwell8-User-Guide#jwarmup

From claes.redestad at oracle.com Mon May 23 09:36:34 2022
From: claes.redestad at oracle.com (Claes Redestad)
Date: Mon, 23 May 2022 11:36:34 +0200
Subject: Question about Project Leyden: why choose "static run-time images"?
In-Reply-To:
References:
Message-ID:

Hi,

I think there's some confusion here since earlier communication -
including the current Leyden project page - is a bit outdated and does
not reflect the recent change in direction. To quote Mark[1]:

"So rather than adopt the closed-world constraint at the start, I propose
that we instead pursue a gradual, incremental approach. [...]

We will lean heavily on existing components of the JDK including the
HotSpot JVM, the C2 compiler, application class-data sharing (CDS), and
the `jlink` linking tool.

In the long run we will likely embrace the full closed-world constraint
in order to produce fully-static images."

Best regards
Claes

[1] https://mail.openjdk.java.net/pipermail/leyden-dev/2022-May/000001.html

On 2022-05-23 10:49, Glavo wrote:
> Hello, everyone. I'm excited to see OpenJDK's quest to improve startup
> speed, but I have a confusion about Project Leyden.
>
> As far as I know, there are currently two efforts in the Java community to
> improve startup speed:
>
> * OpenJ9[2], Zing JVM[1] and Dragonwell[3] reduce JIT warm-up time by
> caching profiling data or JIT results.
> This approach greatly improves startup speed with little sacrifice in
> functionality and flexibility.
> * GraalVM native-image introduces the concept of "closed-world assumption",
> at the expense of
> flexibility and partial functionality, statically compiles Java code,
> reduces startup time and reduces size.
> Relatively speaking, it is more expensive, and we need to limit ourselves
> to use it better.
>
> When I got to know Project Leyden, it seems to me that its purpose and cost
> are very much the same as GraalVM native-image.
> So, why did Project Leyden choose this approach? Why not opt for a more
> general way like OpenJ9?
> What decisive advantage does it have over GraalVM native-image that make us
> need to start over?
>
> Thanks
>
> [1]: https://docs.azul.com/prime/Use-ReadyNow.html
> [2]: https://www.eclipse.org/openj9/docs/shrc/#aot-code-and-jit-data
> [3]:
> https://github.com/alibaba/dragonwell8/wiki/Alibaba-Dragonwell8-User-Guide#jwarmup

From ioi.lam at oracle.com Mon May 23 16:10:25 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 23 May 2022 09:10:25 -0700
Subject: Project Leyden: Beginnings
In-Reply-To:
References:
Message-ID:

On 5/20/2022 9:16 AM, Volker Simonis wrote:
> Hi Mark,
>
> Thanks for all your efforts to finally get Leyden started!
>
>> The ultimate goal of this Project, as stated in the Call for Discussion
>> [1], is to address the long-term pain points of Java's slow startup
>> time, slow time to peak performance, and large footprint.
>>
> You're probably aware of the CRaC project [1] which as well addresses
> the first two of these pain points (slow startup & slow time to peak
> performance) by leveraging checkpointing (aka. snapshotting) and
> restoring (aka. resuming) of fully warmed up JVM instances.
>
> [1] https://openjdk.java.net/projects/crac/
>
>> In the Call for Discussion I proposed that we address these pain points
>> by introducing a concept of _static run-time images_ to the Java
>> Platform, and to the JDK.
>>
>> - A static image is a standalone program, derived from an application
>> and a JDK, which runs that application -- and no other.
>>
>> - A static image is a _closed world_ with respect to the classes that
>> it can load: At run time it cannot load classes from outside the
>> image, nor can it create classes dynamically.
>>
>> The closed-world constraint imposes strict limits on Java's natural
>> dynamism, particularly on the run-time reflection and class-loading
>> features upon which so many existing Java libraries and frameworks
>> depend. Not all applications are well suited to this constraint, and
>> not all developers are willing to live with it.
>>
>> So rather than adopt the closed-world constraint at the start, I propose
>> that we instead pursue a gradual, incremental approach.
>>
> Now that the goal of exploring "static images" is not the main goal of
> Leyden anymore (you should probably update the Leyden project page [2]
> to reflect this), the goals of CRaC and Leyden seem to match even
> more. CRaC's new execution model doesn't impose any constraints on
> "Java's natural dynamism" so it should naturally support most
> server-side applications out of the box. Instead, CRaC imposes a new
> constraint for Java applications which we call "snapsafety". A
> snapsafe application can operate correctly and securely after it has
> been restored from a previously checkpointed (and possibly cloned)
> state. The main challenge for CRaC is to first make the JVM and the
> core libraries snapsafe before it exposes hooks to libraries and
> application to give them a chance to become snapsafe as well.
>
> [2] https://openjdk.java.net/projects/leyden/
>
>> We will explore a spectrum of constraints, weaker than the closed-world
>> constraint, and discover what optimizations they enable. The resulting
>> optimizations will almost certainly be weaker than those enabled by the
>> closed-world constraint. Because the constraints are weaker, however,
>> the optimizations will likely be applicable to a broader range of
>> existing code -- thus they will be more useful to more developers.
>>
>> We will work incrementally along this spectrum of constraints, starting
>> small and simple so that we can develop a firm understanding of the
>> changes required to the Java Platform Specification.
>> Along the way we
>> will strive, of course, to preserve Java's core values of readability,
>> compatibility, and generality.
>>
> It seems to me that "snapsafety" could be such a constraint and I hope
> for a fruitful and successful cooperation between the two projects.

I think we have an opportunity in Leyden to improve the language and
platform to support such concepts. I don't know the details of
"snapsafety", but in general we should have language support to
indicate some sort of "immutable" constraints. These constraints can
be validated (so that we can use pre-optimized snapshot(s)), or
invalidated (so we will go back to the old slow-but-correct
initialization).

Also, in addition to a single snapshot of an app, perhaps we can also
consider multiple snapshots at a lower granularity.

One parallel to draw from is the "constexpr" keyword in C++. However,
"constexpr" only deals with language-level constructs. For Java,
perhaps we need something that includes a wider set of environmental
dependencies. For example, many immutable tables in Java apps are
created from external XML files. Do we want a way to snapshot such
tables? Maybe we can do that if the XML files are statically stored
inside a jlink image?

Again, I don't know what the answer is, but I am excited that we are
able to look for solutions at all levels of the language and platform.

Thanks
- Ioi

>
> Thank you and best regards,
> Volker
>
>> We will lean heavily on existing components of the JDK including the
>> HotSpot JVM, the C2 compiler, application class-data sharing (CDS), and
>> the `jlink` linking tool.
>>
>> In the long run we will likely embrace the full closed-world constraint
>> in order to produce fully-static images. Between now and then, however,
>> we will develop and deliver incremental improvements which developers
>> can use sooner rather than later.
>>
>> Let us begin!
>>
>> - Mark
>>
>>
>> [1] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html
>>
>> // https://openjdk.java.net/projects/leyden/notes/01-beginnings

From tanksherman27 at gmail.com Tue May 24 02:11:08 2022
From: tanksherman27 at gmail.com (Julian Waters)
Date: Tue, 24 May 2022 10:11:08 +0800
Subject: Useful information learnt from Graal
Message-ID:

Hi all,

I'd like to link a document written by Andrew Dinn on information learnt
from Graal here, since it may prove to be a useful starting point:
https://github.com/adinn/leydendoc/blob/main/LeydenLessonsfromGraalNative.pdf

I'm excited that Leyden is finally starting!

Best regards,
Julian

From volker.simonis at gmail.com Tue May 24 17:47:08 2022
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 24 May 2022 19:47:08 +0200
Subject: Please add leyden-dev to the OpenJDK mailinglist info page
Message-ID:

Can you please add leyden-dev to the list of OpenJDK mailing lists at
https://mail.openjdk.java.net/mailman/listinfo?

Thanks,
Volker

From john.r.rose at oracle.com Tue May 24 18:00:21 2022
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 24 May 2022 11:00:21 -0700
Subject: Project Leyden: Beginnings
In-Reply-To:
References:
Message-ID: <58BA7C37-13E4-4E19-A88F-B9127B5C288E@oracle.com>

On 23 May 2022, at 9:10, Ioi Lam wrote:

> On 5/20/2022 9:16 AM, Volker Simonis wrote:
>> ... It seems to me that "snapsafety" could be such a constraint and I
>> hope
>> for a fruitful and successful cooperation between the two projects.

A snappy term indeed! When applied to the existing Java platform, the
concept (probably) leads to all sorts of complicated considerations
about remote and hidden side effects and environmental queries.

As Ioi points out, the big new thing here, not possible outside of
Leyden, is the option to *modify* the Java language specification (and
standard libraries), if we think it helps clarify or simplify the
(suitably modified) definition of snapsafety.
>
> I think we have an opportunity in Leyden to improve the language and
> platform to support such concepts. I don't know the details of
> "snapsafety", but in general we should have language support to
> indicate some sort of "immutable" constraints. These constraints can
> be validated (so that we can use pre-optimized snapshot(s)), or
> invalidated (so we will go back to the old slow-but-correct
> initialization).

The part of the language I like to think about changing is not so much
assertions (maybe `assert`s) about past events (which are those
"immutable constraints"?) but rather relaxation or modification for
rules regarding order of evaluation, for suitably marked expressions and
statements.

The small scale constant-folding rules which every JIT uses are really
order of evaluation changes: An expression like `1+2+x` folding to
`3+x` takes the expression `1+2` and moves it "back in time" to JIT
time. This is safe because the JIT knows there is no way the program
can give evidence of the difference (unless a debugger single-steps
through bytecodes). But I think we should chase after constant-folding
this sort of thing:

```
Object lookup(String x) {
  // hey, can someone please do this just once, at jlink time?
  var mydata = readHashTable(findResourceFile("mydata.xml"));
  // this depends on x, so cannot be moved back in time:
  return mydata.get(x);
}
```

The standard technique is to put `mydata` in a static final variable.
And now that's easy to do inline as well:

```
Object lookup(String x) {
  // like a C++ static, the initializer is executed on first use:
  class Static {
    static final HashMap
      mydata = readHashTable(findResourceFile("mydata.xml"));
    // but still, can someone please do it just once, at jlink time?
  }
  // this depends on x, so cannot be moved back in time:
  return Static.mydata.get(x);
}
```

(Side note: Reading files throws a checked exception.
Does this mean
that the above method should be amended to throw a possible checked
exception, but marked as "somewhere in the past"? If so, then
time-shifted expressions would need to have associated time-shifted
exception checking rules.)

This is a kind of time-shifting currently under programmer control. It
suggests to me that we can and should double down on supporting static
final state (and also lazy statics as in JDK-8209964), by focusing some
effort on time-shifting not so much arbitrary expressions and
statements, but the initialization of classes. If a programmer could
mark a *whole class* as time-shiftable in its initialization, then the
programmer could expect that jlink could make good provisioning
decisions about that class, rather than the current standard policy of
initializing a class on first use (of a static or of an instance
creation).

One more bit of mental framework: A Java class is initialized no
earlier and no later than its first initializing use (static or instance
creation). Certainly there must be other events that the class
initialization could be referred to. "jlink time" is a hazy
concept, but program startup is not: A Java program starts just before
its selected `main` entry point is run. If a class C could be marked
(by the programmer) as being initialized no earlier than entry to
`main`, then the programmer could certify that the class is a candidate
for pre-initialization, regardless of the change of semantics (relative
to Java's current order of class initialization). And that would
solve some (not all) of the problems around making valid jlink-time
evaluations. I guess I'm suggesting that a language-level proxy for
"jlink time" is main method entry.

I suspect that time-shifted class initialization probably needs a
concept of time-shifted dependency (as well as time-shifted exceptions,
see above?) so that if class C is marked as "can initialize around
main entry"
C can also be marked as "but no earlier than
initialization of D", for some other class D that C's initialization
depends on.

(The work on lazies JDK-8209964 is sort of a complementary image of what
Leyden is after, since a lazy variable is time-shifted *after* its
containing class is initialized, another change from standard Java
rules. The two kinds of time shifting, backward and forward, probably
deserve a combined treatment of some sort.)

>
> Also, in addition to a single snapshot of an app, perhaps we can also
> consider multiple snapshots at a lower granularity.
>
> One parallel to draw from is the "constexpr" keyword in C++. However,
> "constexpr" only deals with language-level constructs. For Java,
> perhaps we need something that includes a wider set of environmental
> dependencies. For example, many immutable tables in Java apps are
> created from external XML files. Do we want a way to snapshot such
> tables? Maybe we can do that if the XML files are statically stored
> inside a jlink image?
>
> Again, I don't know what the answer is, but I am excited that we are
> able to look for solutions at all levels of the language and platform.
>
>
> Thanks
> - Ioi
>
>
>>
>> Thank you and best regards,
>> Volker
>>
>>> We will lean heavily on existing components of the JDK including the
>>> HotSpot JVM, the C2 compiler, application class-data sharing (CDS),
>>> and
>>> the `jlink` linking tool.
>>>
>>> In the long run we will likely embrace the full closed-world
>>> constraint
>>> in order to produce fully-static images. Between now and then,
>>> however,
>>> we will develop and deliver incremental improvements which
>>> developers
>>> can use sooner rather than later.
>>>
>>> Let us begin!
>>>
>>> - Mark
>>>
>>>
>>> [1]
>>> https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html
>>>
>>> // https://openjdk.java.net/projects/leyden/notes/01-beginnings

From john.r.rose at oracle.com Tue May 24 18:30:32 2022
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 24 May 2022 11:30:32 -0700
Subject: Project Leyden: Beginnings
In-Reply-To:
References:
Message-ID: <94F08F63-1238-400B-B8E2-C662DB018C10@oracle.com>

On 23 May 2022, at 9:10, Ioi Lam wrote:

(more)

> ...
> One parallel to draw from is the "constexpr" keyword in C++.

Please take a look also at the ideas in D around pure and immutable
computations. They allow time-shifting of very complicated D programs
to compile time. I saw a demo (long ago) of compile-time data weaving
in D which took an immutable bundle of strings, transformed them and ran
the result as D expressions through the D compiler itself (as a subtask,
at compile-time), and took the resulting output as further input to
incorporate into the D program. Basically, it was as if the D
preprocessor suddenly was a metaprogramming framework. And it worked,
not as a special hack, but as a corollary of very cleanly worked out
D-language rules for purity and immutability, plus the fact that much of
the D standard libraries (including the D compiler) were pure enough to
play these games with.

Reference: https://dlang.org/spec/function.html#pure-functions

(I see from dlang.org they have fancy templates now. They probably
interoperate well with ad hoc compile-time computations. The C++
constexpr stuff is moving that way too, I guess.)

Also, unlike a compiled language, we have a virtual machine that can (in
principle) be asked to verify purity of methods on the fly. This means
we can (in principle) have pure functions which are separately linked,
and do not need to be dumped into the current compilation. Of course
native methods and Panama downcalls would have to either be rejected or
manually certified, but that's all part of the game.
> However, "constexpr" only deals with language-level constructs. For
> Java, perhaps we need something that includes a wider set of
> environmental dependencies.

Yes. (I have thought for some years that our keyword `const` has been
waiting to be used to annotate time-shifted computations. That's just
bikeshedding of course.)

I think, given Java's embrace of dynamic linking, it might make sense
to define an idea of a time-shifted *value* as well as a time-shifted
expression or statement. By that I mean a normal method could (perhaps)
declare some but not all of its *parameters* as `static` (or `constexpr`
or whatever) with the meaning that it is requesting that the
corresponding actual arguments be time-shifted (if possible) at every
invocation point of that method. Then a dynamically linked method could
still partially play the time-shifting game, in some of its parameters.
(And similarly for local variables. Sort of a "better static" with a
dependency-driven initialization order.)

If a method's formal parameter is marked `static`, then expressions
using that parameter inside the method are also candidates for early
evaluation. (This means the JVM or somebody has to keep track of
separate derived values for each method call site. Doable but tricky.)

This might give a framework to thread through all the pre-evaluated
values, through an application workload, without disrupting the logic by
partitioning it into disjoint "before" and "after" phases.

And I think all of the above works about as well, not only for
time-shifting back in time to pre-evaluation in jlink, but also for
time-shifting forward in time to lazy evaluation. Java already has lots
of lazy evaluation in it, notably on-demand class initialization and
(under the covers) condy/indy. If we had a way to mark program portions
as time-shiftable, that would naturally parlay out into a way to work
with lazy computations, as well as pre-evaluated ones.
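
A minimal sketch of the on-demand class initialization mentioned above,
using the holder-class idiom available in Java today. The names
`LazyTableDemo`, `Holder`, and `buildTable` are hypothetical, and
`buildTable` merely stands in for the `readHashTable(findResourceFile(...))`
call from the earlier example; it is an illustration, not a proposed API.

```java
import java.util.HashMap;
import java.util.Map;

class LazyTableDemo {
    // Counts how often the (hypothetical) expensive table build runs.
    static int buildCount = 0;

    // Stand-in for reading and parsing a resource file; the contents
    // are made up for illustration.
    static Map<String, Object> buildTable() {
        buildCount++;
        Map<String, Object> table = new HashMap<>();
        table.put("greeting", "hello");
        return table;
    }

    // Holder is initialized by the JVM on the first access to
    // Holder.TABLE, not when LazyTableDemo itself is initialized.
    static final class Holder {
        static final Map<String, Object> TABLE = buildTable();
    }

    // This part depends on the key, so it cannot be shifted in time.
    static Object lookup(String key) {
        return Holder.TABLE.get(key);
    }

    public static void main(String[] args) {
        System.out.println(buildCount);         // prints 0: table not built yet
        System.out.println(lookup("greeting")); // prints hello
        System.out.println(lookup("greeting")); // prints hello
        System.out.println(buildCount);         // prints 1: built exactly once
    }
}
```

Here the initializer runs once, on first use. Time-shifting, as discussed
above, would let that same initializer run earlier (at jlink time) or
later (on demand) without changing the call sites.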
This is the way I would prefer to handle recurrent requests (from me and
others) for APIs which help string templates to support syntax-specific
validation (SQL, XML, etc.). Such validation, for a constant template,
should happen as early as possible, ideally in the IDE, and certainly at
jlink time. Time-shifting can be a foundation for static validation.

From forax at univ-mlv.fr Wed May 25 13:59:04 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 25 May 2022 15:59:04 +0200 (CEST)
Subject: Project Leyden: Beginnings
In-Reply-To: <58BA7C37-13E4-4E19-A88F-B9127B5C288E@oracle.com>
References: <58BA7C37-13E4-4E19-A88F-B9127B5C288E@oracle.com>
Message-ID: <481783435.13293648.1653487144511.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Ioi Lam"
> Cc: leyden-dev at openjdk.java.net
> Sent: Tuesday, May 24, 2022 8:00:21 PM
> Subject: Re: Project Leyden: Beginnings

> On 23 May 2022, at 9:10, Ioi Lam wrote:
>
>> On 5/20/2022 9:16 AM, Volker Simonis wrote:
>>> ... It seems to me that "snapsafety" could be such a constraint and I
>>> hope
>>> for a fruitful and successful cooperation between the two projects.
>
> A snappy term indeed! When applied to the existing Java platform, the
> concept (probably) leads to all sorts of complicated considerations
> about remote and hidden side effects and environmental queries.
>
> As Ioi points out, the big new thing here, not possible outside of
> Leyden, is the option to *modify* the Java language specification (and
> standard libraries), if we think it helps clarify or simplify the
> (suitably modified) definition of snapsafety.
>
>>
>> I think we have an opportunity in Leyden to improve the language and
>> platform to support such concepts. I don't know the details of
>> "snapsafety", but in general we should have language support to
>> indicate some sort of "immutable" constraints.
>> These constraints can
>> be validated (so that we can use pre-optimized snapshot(s)), or
>> invalidated (so we will go back to the old slow-but-correct
>> initialization).
>
> The part of the language I like to think about changing is not so much
> assertions (maybe `assert`s) about past events (which are those
> "immutable constraints"?) but rather relaxation or modification for
> rules regarding order of evaluation, for suitably marked expressions and
> statements.
>
> The small scale constant-folding rules which every JIT uses are really
> order of evaluation changes: An expression like `1+2+x` folding to
> `3+x` takes the expression `1+2` and moves it "back in time" to JIT
> time. This is safe because the JIT knows there is no way the program
> can give evidence of the difference (unless a debugger single-steps
> through bytecodes). But I think we should chase after constant-folding
> this sort of thing:
>
> ```
> Object lookup(String x) {
>   // hey, can someone please do this just once, at jlink time?
>   var mydata = readHashTable(findResourceFile("mydata.xml"));
>   // this depends on x, so cannot be moved back in time:
>   return mydata.get(x);
> }
> ```
>
> The standard technique is to put `mydata` in a static final variable.
> And now that's easy to do inline as well:
>
> ```
> Object lookup(String x) {
>   // like a C++ static, the initializer is executed on first use:
>   class Static {
>     static final HashMap
>       mydata = readHashTable(findResourceFile("mydata.xml"));
>     // but still, can someone please do it just once, at jlink time?
>   }
>   // this depends on x, so cannot be moved back in time:
>   return Static.mydata.get(x);
> }
> ```
>
> (Side note: Reading files throws a checked exception. Does this mean
> that the above method should be amended to throw a possible checked
> exception, but marked as "somewhere in the past"? If so, then
> time-shifted expressions would need to have associated time-shifted
> exception checking rules.)
> > This is a kind of time-shifting currently under programmer control. It > suggests to me that we can and should double down on supporting static > final state (and also lazy statics as in JDK-8209964), by focusing some > effort on time-shifting not so much arbitrary expressions and > statements, but the initialization of classes. If a programmer could > mark a *whole class* as time-shiftable in its initialization, then the > programmer could expect that jlink could make good provisioning > decisions about that class, rather than the current standard policy of > initializing a class on first use (of a static or of an instance > creation). In my opinion, lazy static is enough, we may not need a class wide keyword. Lazy static means - it is not executed as part of the static initialize (not in ) - the initialization expression has to be executed before the first access, it can be just before the first access or a long time before, offline. - if an exception occurs during the execution of the initialization, the exception is wrapped into a (subclass of) LinkageError, any attempt to access to the static variable will throw that exception (the same way constant pool constant are resolved) so perhaps lazy static is not the right term, perhaps "const" is a better term. > > One more bit of mental framework: A Java class is initialized no > earlier and no later than its first initializing use (static or instance > creation). Certainly there must be other events that the class > initialization could be referred to. ?jlink time? is a hazy > concept, but program startup is not: A Java program starts just before > its selected `main` entry point is run. If a class C could be marked > (by the programmer) as being initialized no earlier than entry to > `main`, then the programmer could certify that the class is a candidate > for pre-initialization, regardless of the change of semantics (relative > to Java?s current order of class initialization). 
And that would > solve some (not all) of the problems around making valid jlink-time > evaluations. I guess I?m suggesting that a language-level proxy for > ?jlink time? is main method entry. startup time may be not that well defined because of project Crac. > > I suspect that time-shifted class initialization probably needs a > concept of time-shifted dependency (as well as time-shifted exceptions, > see above?) so that if class C is marked as ?can initialize around > main entry? C can also be marked as ?but no earlier than > initialization of D?, for some other class D that C?s initialization > depends on. > > (The work on lazies JDK-8209964 is sort of a complementary image of what > Leyden is after, since a lazy variable is time-shifted *after* its > containing class is initialized, another change from standard Java > rules. The two kinds of time shifting, backward and forward, probably > deserve a combined treatment of some sort.) or the exact time of the evaluation is not guarantee. Dependencies is an issue. To be allowed to be pre-computed a "const" variable should not depend transitively on the execution of a static init block. > >> >> Also, in addition to a single snapshot of an app, perhaps we can also >> consider multiple snapshots at a lower granularity. >> >> One parallel to draw from is the "constexpr" keyword in C++. However, >> "constexpr" only deals with language-level constructs. For Java, >> perhaps we need something that includes a wider set of environmental >> dependencies. For example, many immutable tables in Java apps are >> created from external XML files. Do we want a way to snapshot such >> tables? Maybe we can do that if the XML files are statically stored >> inside a jlink image ? or if the XML file is read at before-runtime. This is something Quarkus (and Micronaut) does, they uses annotation processor or bytecode patching to inject constants in-between the compilation and runtime. 
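
For reference, a minimal compilable sketch of the static-final holder idiom
from the quoted examples, applied to an immutable lookup table. The names
`TableHolderDemo`, `Holder`, and `loadTable` are hypothetical, and the
in-memory map is only a stand-in for `readHashTable(findResourceFile(...))`.
The checked `IOException` is wrapped because a static initializer cannot
throw a checked exception, which is one concrete face of the "time-shifted
exception" question raised above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;

class TableHolderDemo {
    // Counts how often the (hypothetical) expensive load actually runs.
    static int loads = 0;

    // Stand-in for readHashTable(findResourceFile("mydata.xml")); assumption:
    // a real version would parse a file and could throw IOException.
    static Map<String, Object> loadTable() throws IOException {
        loads++;
        return Map.of("a", 1, "b", 2);
    }

    // Holder class: the JVM initializes it on first use of MYDATA.
    static final class Holder {
        static final Map<String, Object> MYDATA;
        static {
            try {
                MYDATA = loadTable();
            } catch (IOException e) {
                // Static initializers cannot throw checked exceptions.
                throw new UncheckedIOException(e);
            }
        }
    }

    // Depends on x, so only the table construction is a candidate for
    // being moved back in time.
    static Object lookup(String x) {
        return Holder.MYDATA.get(x);
    }

    public static void main(String[] args) {
        System.out.println(lookup("a") + " " + lookup("b") + " loads=" + loads);
    }
}
```

With today's rules the table is built on the first call to `lookup`;
whether a tool may build it earlier is the open question in this thread.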
>>
>> Again, I don't know what the answer is, but I am excited that we are
>> able to look for solutions at all levels of the language and platform.

If we knew the answer, Leyden would not be necessary :)

Rémi

From forax at univ-mlv.fr Wed May 25 14:09:13 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 25 May 2022 16:09:13 +0200 (CEST)
Subject: Project Leyden: Beginnings
In-Reply-To: <94F08F63-1238-400B-B8E2-C662DB018C10@oracle.com>
References: <94F08F63-1238-400B-B8E2-C662DB018C10@oracle.com>
Message-ID: <422053938.13300180.1653487753204.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Ioi Lam"
> Cc: "leyden-dev"
> Sent: Tuesday, May 24, 2022 8:30:32 PM
> Subject: Re: Project Leyden: Beginnings

> On 23 May 2022, at 9:10, Ioi Lam wrote:
>
> (more)
>> ...
>> One parallel to draw from is the "constexpr" keyword in C++.
>
> Please take a look also at the ideas in D around pure and immutable
> computations. They allow time-shifting of very complicated D programs
> to compile time. I saw a demo (long ago) of compile-time data weaving
> in D which took an immutable bundle of strings, transformed them and ran
> the result as D expressions through the D compiler itself (as a subtask,
> at compile-time), and took the resulting output as further input to
> incorporate into the D program. Basically, it was as if the D
> preprocessor suddenly was a metaprogramming framework. And it worked,
> not as a special hack, but as a corollary of very cleanly worked out
> D-language rules for purity and immutability, plus the fact that much of
> the D standard libraries (including the D compiler) were pure enough to
> play these games with.
>
> Reference: https://dlang.org/spec/function.html#pure-functions
>
> (I see from dlang.org they have fancy templates now. They probably
> interoperate well with ad hoc compile-time computations. The C++
> constexpr stuff is moving that way too, I guess.)
>
> Also, unlike a compiled language, we have a virtual machine that can (in
> principle) be asked to verify purity of methods on the fly. This means
> we can (in principle) have pure functions which are separately linked,
> and do not need to be dumped into the current compilation. Of course
> native methods and Panama downcalls would have to either be rejected or
> manually certified, but that's all part of the game.

Yes, Zig does something similar too.
https://kristoff.it/blog/what-is-zig-comptime/

>
>> However, "constexpr" only deals with language-level constructs. For
>> Java, perhaps we need something that includes a wider set of
>> environmental dependencies.
>
> Yes. (I have thought for some years that our keyword `const` has been
> waiting to be used to annotate time-shifted computations. That's just
> bikeshedding of course.) I think, given Java's embrace of dynamic
> linking, it might make sense to define an idea of a time-shifted *value*
> as well as a time-shifted expression or statement. By that I mean a
> normal method could (perhaps) declare some but not all of its
> *parameters* as `static` (or `constexpr` or whatever) with the meaning
> that it is requesting that the corresponding actual arguments be
> time-shifted (if possible) at every invocation point of that method.
> Then a dynamically linked method could still partially play the
> time-shifting game, in some of its parameters. (And similarly for local
> variables. Sort of a "better static" with a dependency-driven
> initialization order.) If a method's formal parameter is marked
> `static`, then expressions using that parameter inside the method are
> also candidates for early evaluation. (This means the JVM or somebody
> has to keep track of separate derived values for each method call site.
> Doable but tricky.) This might give a framework to thread through all
> the pre-evaluated values, through an application workload, without
> disrupting the logic by partitioning it into disjoint "before" and
> "after" phases.

The problem with any keyword like constexpr is that it does not work well
when you have libraries in the middle (like an XML parser) and you require
everything to be transitively a constexpr.

>
> And I think all of the above works about as well, not only for
> time-shifting back in time to pre-evaluation in jlink, but also for
> time-shifting forward in time to lazy evaluation. Java already has lots
> of lazy evaluation in it, notably on-demand class initialization and
> (under the covers) condy/indy. If we had a way to mark program portions
> as time-shiftable, that would naturally parley out into a way to work
> with lazy computations, as well as pre-evaluated ones.

I believe we can conflate the two by saying that an expression can be
evaluated whenever it suits jlink or the runtime, because in both cases you
want something that relaxes the initialization order.
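
One way to picture this relaxed ordering is a library-level sketch rather
than a language feature. The class `Const` and its `force()` method are
hypothetical names invented for illustration; `force()` stands in for
"evaluate whenever it suits the tool", and `get()` is the first access. The
sketch assumes the rules proposed earlier in this thread: evaluate at most
once, at any time up to first access, and rethrow a recorded failure as a
`LinkageError` subclass on every access.

```java
import java.util.function.Supplier;

// Hypothetical library stand-in for a "lazy static" / "const" field.
final class Const<T> {
    private Supplier<? extends T> initializer; // dropped after evaluation
    private T value;
    private ExceptionInInitializerError failure;
    private boolean evaluated;

    Const(Supplier<? extends T> initializer) {
        this.initializer = initializer;
    }

    // May be called at any point before first access (an assumption standing
    // in for "jlink time"); evaluates the initializer at most once.
    synchronized void force() {
        if (evaluated) return;
        try {
            value = initializer.get();
        } catch (RuntimeException e) {
            // ExceptionInInitializerError is a subclass of LinkageError.
            failure = new ExceptionInInitializerError(e);
        } finally {
            evaluated = true;
            initializer = null;
        }
    }

    // First access forces evaluation if nobody did so earlier; a recorded
    // failure is rethrown on every access.
    synchronized T get() {
        force();
        if (failure != null) throw failure;
        return value;
    }
}

class ConstDemo {
    static int evaluations = 0;

    static final Const<String> GREETING = new Const<>(() -> {
        evaluations++;
        return "hello";
    });

    static final Const<String> BROKEN = new Const<>(() -> {
        throw new IllegalStateException("boom");
    });

    public static void main(String[] args) {
        GREETING.force(); // "early" evaluation, before any access
        System.out.println(GREETING.get() + " evaluations=" + evaluations);
        try {
            BROKEN.get();
        } catch (LinkageError e) {
            System.out.println("failed: " + e.getCause().getMessage());
        }
    }
}
```

The caller cannot tell whether the value was computed by `force()` or by the
first `get()`, which is the property being asked for. A real design would
also have to address the dependency question, which this sketch ignores.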
>
> This is the way I would prefer to handle recurrent requests (from me and
> others) for APIs which help string templates to support syntax-specific
> validation (SQL, XML, etc.). Such validation, for a constant template,
> should happen as early as possible, ideally in the IDE, and certainly at
> jlink time. Time-shifting can be a foundation for static validation.

Yes! With less syntactic sugar, please (I wonder if there is a rehab for
syntactic sugar?).

Rémi

From groved at us.ibm.com Thu May 26 16:35:55 2022
From: groved at us.ibm.com (David P Grove)
Date: Thu, 26 May 2022 16:35:55 +0000
Subject: Experimentation with build time and runtime class initialization in qbicc
Message-ID: <2723189E-F51E-447A-9D3B-037E3501B68D@contoso.com>

Hi,

In the qbicc project, we've been exploring options for adapting Java's
class initialization semantics for native images. In particular, we are
trying to arrive at a non-surprising semantics that in native-image
scenarios allows most initialization to happen at build-time while still
enabling runtime initialization of selected static fields.

Our current design and experience is captured here:
https://github.com/qbicc/qbicc/wiki/Class-Initialization-in-qbicc. In a
nutshell, the idea is to initialize classes via build-time execution of
existing methods as per normal Java semantics while adding per-static-field
methods to provide a capability for runtime-reinitialization of a field
before its first access.

--dave

From brian.goetz at oracle.com Thu May 26 18:21:19 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 26 May 2022 14:21:19 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To: <2723189E-F51E-447A-9D3B-037E3501B68D@contoso.com>
References: <2723189E-F51E-447A-9D3B-037E3501B68D@contoso.com>
Message-ID: <2f10087f-7d2a-9236-3818-02088a1a08e7@oracle.com>

Hi David;

Would like to understand more about this, but first, from an IP-hygiene
perspective, documents linked from this list should be under the OpenJDK
terms and conditions. Can you post the contents of that document here, so
there are no issues there?

Thanks,
-Brian

From groved at us.ibm.com Thu May 26 20:22:06 2022
From: groved at us.ibm.com (David P Grove)
Date: Thu, 26 May 2022 20:22:06 +0000
Subject: Experimentation with build time and runtime class initialization in qbicc
Message-ID: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>

Hi,

I've appended the contents of the referenced wiki page in this email.
Apologies in advance if the formatting doesn't come through as intended.

There is a full implementation of this (GPLv2 + Classpath exception) as
part of the qbicc project on GitHub. There is also a GitHub discussion in
the qbicc project that links to various GitHub issues that capture the
history that led to the current design. I will not hyperlink to those here
so that if people have any IP concerns, they can avoid seeing them. They
are easily findable.

Regards,

--dave

## Overview

One of the goals of the qbicc project is to explore technical approaches
for adapting Java's specification of class initialization to fully support
native image compilation. Enabling build-time evaluation of complex class
initialization logic is essential for obtaining many of the benefits of
native image compilation: reduced memory footprint and fast startup.
However, both the core JDK and many frameworks will not primarily be used
in native image scenarios. Therefore, it is essential that the approach
taken for build-time initialization enables both the existing runtime class
initialization and the new build-time class initialization logic to
co-exist. Furthermore, for as many cases as possible, the class
initialization code should be shared between the two usage scenarios and
have non-surprising semantics in both.

## Build-time Initialization

In qbicc, all classes are initialized at build-time. Class initialization
at build time is performed according to the existing semantics of Java
class initialization driven by build-time execution of the `<clinit>`
methods of reachable classes. The set of reachable classes is determined
iteratively, starting with the program entrypoints and adding the methods
and classes they utilize until no further reachable classes are discovered
(a fixed point is reached).

After build-time initialization has completed, a build-time heap has been
constructed that contains the objects that were created during the
build-time execution of the `<clinit>` methods. Using the reachable static
fields of the reachable program as roots, this build-time heap is
serialized into the native image. This set of objects will form the initial
runtime heap of the program when it is executed.

## Runtime Initializers

There are cases where one or more initialization actions of a class
**must** be executed at program runtime. Most typically these involve the
creation of native resources (open files, threads, etc.) that cannot be
successfully serialized into the build-time heap.

Qbicc supports runtime initialization by allowing static fields of a class
to be declared as runtime initialized. These fields will be initialized
lazily, at first access, by executing a runtime initializer (`<rtinit>`)
associated with the accessed field. Runtime initialization is localized:
accessing a particular static field will cause its runtime initializer to
be executed but has no implications for other runtime initializers defined
either in the field's defining class or any superclass or implemented
interface of the field's defining class.

When serialized from the build-time heap to the runtime heap, all
runtime-initialized fields will be serialized with the zero (uninitialized)
value appropriate for their type.

Qbicc allows related static fields in the same class to share a common
`<rtinit>` method. The first access to any of the fields will cause the
execution of the associated `<rtinit>` method and the initialization of all
the fields.

## Adjusting Heap Serialization

For some objects it is necessary to initialize them during build-time
initialization, but "reset" them before they are used at runtime.
Qbicc supports this by allowing fields to be annotated to be serialized as
the type-appropriate zero value or as a primitive constant value. This
value replacement happens as the build-time heap is serialized.

One common scenario is to invalidate objects that are wrapping native
resources. For example, when a `FileDescriptor` is serialized its `fd` and
`handle` instance fields are serialized as `-1` and its `closed` field is
serialized as `true`. Thus, any attempt to use the build-time
FileDescriptor at runtime will raise the appropriate exception.

## Patching: Migration for Existing Classes

The runtime initialization mechanisms described above are currently enabled
via a set of annotations. This allows qbicc to implement the desired
semantics without requiring any changes to the Java compiler, class file
format, or language specification. In the long term, we believe small
modifications to the Java specification, for example defining a
`rtinit { ... }` similar to the existing `static { ... }` construct, could
enable a simpler specification.

The primary annotation for runtime initialization is `RuntimeAspect`. This
annotation is defined on a class and is interpreted as meaning that the
`<clinit>` method of the class should be interpreted as an `<rtinit>`
method. This method will not be executed during build-time initialization
and instead will be deferred until the first access of one of the static
fields defined in the class.

To allow us to "externally" modify JDK core classes for qbicc, we have
developed an annotation-driven patcher infrastructure. The patcher allows
the declaration of patch classes that add, remove, and modify the methods
and fields of an existing class. This modification includes the replacement
of the `<clinit>` method and the declaration of multiple `RuntimeAspect`
patch classes.

The best way to explore what is possible with the patcher is to examine the
java.base/src directory in the qbicc-class-library project. It makes
extensive use of the patcher annotations to adapt the core JDK classes to
qbicc while still allowing us to consume the upstream OpenJDK code base via
an unmodified git submodule.

## Design Alternatives

A number of alternatives were considered before arriving at the final
design documented here.
The technical discussions and options considered can be explored starting in qbicc discussion #764 on GitHub.
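
The serialization and runtime-initializer rules described in the wiki text
can be pictured with a toy model. This is only a sketch under stated
assumptions and is not qbicc's implementation: static fields are modeled as
map entries keyed by name, the zero value is always `null`, and the names
`ImageModel`, `buildTimeStore`, `declareRuntimeInitialized`, `serialize`,
and `read` are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Toy model only: static fields are map entries keyed by name.
final class ImageModel {
    private final Map<String, Object> heap = new HashMap<>();
    private final Map<String, Supplier<Object>> runtimeInitializers = new HashMap<>();
    private final Set<String> initializedAtRuntime = new HashSet<>();

    // Records a value computed while running initializers at build time.
    void buildTimeStore(String field, Object value) {
        heap.put(field, value);
    }

    // Marks a field as runtime-initialized with its own initializer.
    void declareRuntimeInitialized(String field, Supplier<Object> initializer) {
        runtimeInitializers.put(field, initializer);
    }

    // "Serialization": runtime-initialized fields are written as the zero
    // value (null in this model); everything else is copied as-is.
    ImageModel serialize() {
        ImageModel image = new ImageModel();
        heap.forEach((field, value) ->
            image.heap.put(field, runtimeInitializers.containsKey(field) ? null : value));
        image.runtimeInitializers.putAll(runtimeInitializers);
        return image;
    }

    // Runtime read: the first access to a runtime-initialized field runs
    // that field's initializer only; other fields are unaffected.
    Object read(String field) {
        Supplier<Object> initializer = runtimeInitializers.get(field);
        if (initializer != null && initializedAtRuntime.add(field)) {
            heap.put(field, initializer.get());
        }
        return heap.get(field);
    }
}

class ImageModelDemo {
    static int runtimeInitRuns = 0;

    static ImageModel buildImage() {
        ImageModel build = new ImageModel();
        build.buildTimeStore("Config.NAME", "app");
        build.buildTimeStore("Log.START", "build-time value");
        build.declareRuntimeInitialized("Log.START", () -> {
            runtimeInitRuns++;
            return "runtime value";
        });
        return build.serialize();
    }

    public static void main(String[] args) {
        ImageModel image = buildImage();
        System.out.println(image.read("Config.NAME"));
        System.out.println(image.read("Log.START"));
    }
}
```

The model deliberately leaves out shared initializers for groups of fields
and the annotation-driven field replacement for instance state.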
--dave

From brian.goetz at oracle.com Fri May 27 01:01:10 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 26 May 2022 21:01:10 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>
Message-ID: <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com>

Thanks for providing this.

Something about the qbicc approach here doesn't seem to add up to me.
Maybe you can tell me what I'm missing.

From reading your notes, it seems that at build time, you start with the
root class(es), execute their <clinit>, which will cause loading of more
classes, more <clinit>, and you iterate until there are no new classes to
initialize. You then treat the statics as roots, and serialize those
objects to the initial heap image. But before doing that, you exclude (zero
out) any which are marked as "reinitialize at runtime."

The rationale for this clearly is that you want to continue the graph walk
to find all the loadable classes, but then don't want to use the polluted
value. But what happens in cases like this:

    class Aliased {
        @RuntimeInitialized private static final Socket s = ...;
        private static final Socket copy = s;
    }

Do you throw on reads of runtime-initialized fields from a <clinit>? Do
you walk the heap and find aliases to runtime-initialized values, and
replace them with something (if so, what?) Or is the Aliased class above
just "broken" according to this model, and I encounter a stale/nonworking
socket in `copy` at runtime, and one that is not properly aliased to `s`?
Once an object is initialized at build time, its state can escape into all
sorts of other places, and just zeroing out the static root isn't enough to
stamp it out.

Am I missing something?

Thanks,
-Brian

From kasperni at gmail.com Fri May 27 11:53:09 2022
From: kasperni at gmail.com (Kasper Nielsen)
Date: Fri, 27 May 2022 12:53:09 +0100
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>
Message-ID:

Hi David,

Thanks for the write-up.

One thing that isn't completely clear to me after reading this is why
language changes (<rtinit>) are needed. It seems to me this could be
entirely implemented via a standard API.

Using ClassValue as the main inspiration you could have something like:

    abstract class RuntimeLocal<T> {
        protected RuntimeLocal() {
            checkBuildTime();
            VM.registerForRuntimeInitialization(this);
        }
        protected abstract T computeValue();
        public final T get(); // Calls to get are optimized by the vm
    }

Usage would be something similar to:

    class Usage {
        static final LocalDateTime BUILD_TIME = LocalDateTime.now();
        static final RuntimeLocal<LocalDateTime> RUNTIME_TIME = new RuntimeLocal<>() {
            protected LocalDateTime computeValue() {
                return LocalDateTime.now();
            }
        };
    }

I might be missing some details, but it seems to me that this approach
would be strongly preferable to changing the language and adding new
bytecodes.

/Kasper

From aph at redhat.com Fri May 27 13:33:15 2022
From: aph at redhat.com (Andrew Haley)
Date: Fri, 27 May 2022 14:33:15 +0100
Subject: Project Leyden: Beginnings
In-Reply-To: <20220520144202.C213853774D@eggemoggin.niobe.net>
References: <20220520144202.C213853774D@eggemoggin.niobe.net>
Message-ID:

On 5/20/22 15:42, mark.reinhold at oracle.com wrote:
> Let us begin!

As you'd expect, here at Red Hat there's a variety of opinions. Rather than
simply post my own response to this, I've been talking to Middleware
architects (the likely _users_ of Leyden!) as well as our OpenJDK team
members.

Here's what we think:

We're excited to see Leyden taking shape and will be active participants.
Our customers are benefiting from GraalVM today and as such we'll continue
to engage with that project as the Leyden ideas are explored and take
shape.

Bringing standardization to this space is important for developers as it
will clarify the behaviours they can depend on. As we engage in updating
the standard, we should consider not just new behavior, but also exceptions
and variations that can accommodate some of the existing behavior of
GraalVM such as build-time initialization.

The proposed incremental approach will ensure we bring along the current
ecosystem and devtools while carefully introducing any new constraints. We
also see the benefit in segmenting the problem space into discrete areas
that can be introduced sooner rather than waiting for a big-bang
integration of multiple constraints.

> In the long run we will likely embrace the full closed-world
> constraint in order to produce fully-static images.
Our experience with Java on K8s and containers (notably Quarkus) has demonstrated real-world benefits of a closed-world approach, so it's good to see it explicitly listed as a likely goal in the long run. It will be important that Leyden is careful to specify its efforts in the fast start / small footprint space while being mindful of that constraint.

Hopefully, we can all work towards a future that converges both GraalVM's efforts and those of Leyden.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From heidinga at redhat.com Fri May 27 15:35:48 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Fri, 27 May 2022 11:35:48 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To: <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com>
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com> <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com>
Message-ID:

On Thu, May 26, 2022 at 9:01 PM Brian Goetz wrote:
>
> Thanks for providing this.
>
> Something about the qbicc approach here doesn't seem to add up to me.
> Maybe you can tell me what I'm missing.
>
> From reading your notes, it seems that at build time, you start with
> the root class(es), execute their <clinit>, which will cause loading of
> more classes, more <clinit>, and you iterate until there are no new
> classes to initialize.

With qbicc we embraced the closed-world constraint and mandated that all class initialization happens at build time. While we started with runtime class initialization to bootstrap being able to run more code, we quickly switched to being all-in on build time init (BTI) due to the virtuous cycle between BTI and dead code elimination.

> You then treat the statics as roots, and
> serialize those objects to the initial heap image. But before doing
> that, you exclude (zero out) any which are marked as "reinitialize at
> runtime."

Right.
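The discovery step that Brian summarizes and Dan confirms here is, in effect, a fixed-point iteration over reachable classes. A minimal sketch in plain Java follows. It assumes a hypothetical `uses` map recording which classes each class's initializer touches; qbicc finds this out by actually executing `<clinit>` at build time, so the map, the class names, and the `reachable` helper are illustrative stand-ins only and not qbicc APIs.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReachabilitySketch {

    /**
     * Worklist iteration: start from the roots and keep adding the classes
     * each newly seen class refers to, until nothing new turns up.
     */
    static Set<String> reachable(Map<String, List<String>> uses, Collection<String> roots) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String cls = work.pop();
            if (seen.add(cls)) {
                // First visit: this is where a build-time initializer would run.
                for (String dep : uses.getOrDefault(cls, List.of())) {
                    work.push(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical program: "Unused" is never reached from "Main".
        Map<String, List<String>> uses = Map.of(
                "Main", List.of("Config", "Logger"),
                "Config", List.of("Logger"),
                "Logger", List.of(),
                "Unused", List.of("Logger"));
        System.out.println(reachable(uses, List.of("Main")));
    }
}
```

Classes that never enter the set (here `Unused`) are never initialized, which is one way to picture the interplay between build-time initialization and dead code elimination that Dan mentions.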
>
> The rationale for this clearly is that you want to continue the graph
> walk to find all the loadable classes, but then don't want to use the
> polluted value. But what happens in cases like this:
>
> class Aliased {
>     @RuntimeInitialized private static final Socket s = ...;
>     private static final Socket copy = s;
> }
>
> Do you throw on reads of runtime-initialized fields from a <clinit>? Do
> you walk the heap and find aliases to runtime-initialized values, and
> replace them with something (if so, what?) Or is the Aliased class
> above just "broken" according to this model, and I encounter a
> stale/nonworking socket in `copy` at runtime, and one that is not
> properly aliased to `s`? Once an object is initialized at build time,
> its state can escape into all sorts of other places, and just zeroing
> out the static root isn't enough to stamp it out.

This is where the "soupy" nature of <clinit> becomes evident. <clinit> is a single method that has tremendous side effects: setting static fields, initializing other classes, starting threads, caching computed values, etc. It's very hard to automatically reason about what has happened in a <clinit> method and what the user intends for those side effects (if they're even aware of what they all may be!).

What was the user's intent when they initialized `copy`? To record what the original Socket connection - set up at build time - had been rather than separately storing the address/port? If they had a semantic meaning for `copy` even after `s` had been nulled out, then automatically resetting `copy` would violate their expectation.

We need the user to tell us their intent. If they wanted both `s` & `copy` to be reset, then they need to be explicit about that and annotate both fields. We don't attempt to null all copies of the value of a @RuntimeInitialized field.

>
> Am I missing something?
You seem to have grasped it correctly =)

If that field had been a primitive, such as a long, we'd be unable to track down which other longs in the heap were copies of it or derived from it. We wouldn't reset some other location with the value 42 because a @RuntimeInitialized field was set to 42 at build time. The programmer has to take responsibility for which fields need to be reset. With qbicc, that's annotations. With Leyden we may be able to give them a better way to group fields and express how & when they should be initialized.

--Dan

>
> Thanks,
> -Brian
>
>
> On 5/26/2022 4:22 PM, David P Grove wrote:
> > Hi,
> >
> > I've appended the contents of the referenced wiki page in this email. Apologies in advance if the formatting doesn't come through as intended.
> >
> > There is a full implementation of this (GPLv2 + Classpath exception) as part of the qbicc project on GitHub. There is also a GitHub discussion in the qbicc project that links to various GitHub issues that capture the history that led to the current design. I will not hyperlink to those here so that if people have any IP concerns, they can avoid seeing them. They are easily findable.
> >
> > Regards,
> >
> > --dave
> >
> > ## Overview
> >
> > One of the goals of the qbicc project is to explore technical approaches for adapting Java's specification of class initialization to fully support native image compilation. Enabling build-time evaluation of complex class initialization logic is essential for obtaining many of the benefits of native image compilation: reduced memory footprint and fast startup. However, both the core JDK and many frameworks will not primarily be used in native image scenarios. Therefore, it is essential that the approach taken for build-time initialization enables both the existing runtime class initialization and the new build-time class initialization logic to co-exist.
> > Furthermore, for as many cases as possible, the class initialization code should be shared between the two usage scenarios and have non-surprising semantics in both.
> >
> > ## Build-time Initialization
> >
> > In qbicc, all classes are initialized at build-time. Class initialization at build time is performed according to the existing semantics of Java class initialization driven by build-time execution of the `<clinit>` methods of reachable classes. The set of reachable classes is determined iteratively, starting with the program entrypoints and adding the methods and classes they utilize until no further reachable classes are discovered (a fixed point is reached).
> >
> > After build-time initialization has completed, a build-time heap has been constructed that contains the objects that were created during the build-time execution of the `<clinit>` methods. Using the reachable static fields of the reachable program as roots, this build-time heap is serialized into the native image. This set of objects will form the initial runtime heap of the program when it is executed.
> >
> > ## Runtime Initializers
> >
> > There are cases where one or more initialization actions of a class **must** be executed at program runtime. Most typically these involve the creation of native resources (open files, threads, etc) that cannot be successfully serialized into the build time heap.
> >
> > Qbicc supports runtime initialization by allowing static fields of a class to be declared as runtime initialized. These fields will be initialized lazily, at first access, by executing a runtime initializer (`<rtinit>`) associated with the accessed field. Runtime initialization is localized: accessing a particular static field will cause its runtime initializer to be executed but has no implications for other runtime initializers defined either in the field's defining class or any superclass or implemented interface of the field's defining class.
> >
> > When serialized from the build-time heap to the runtime heap, all runtime-initialized fields will be serialized with the zero (uninitialized) value appropriate for their type.
> >
> > Qbicc allows related static fields in the same class to share a common `<rtinit>` method. The first access to any of the fields will cause the execution of the associated `<rtinit>` method and the initialization of all the fields.
> >
> > ## Adjusting Heap Serialization
> >
> > For some objects it is necessary to initialize them during build-time initialization, but "reset" them before they are used at runtime.
> > Qbicc supports this by allowing fields to be annotated to be serialized as the type-appropriate zero value or as a primitive constant value. This value replacement happens as the build time heap is serialized.
> >
> > One common scenario is to invalidate objects that are wrapping native resources. For example, when a `FileDescriptor` is serialized its `fd` and `handle` instance fields are serialized as `-1` and its `closed` field is serialized as `true`. Thus, any attempt to use the build-time FileDescriptor at runtime will raise the appropriate exception.
> >
> > ## Patching: Migration for Existing Classes
> >
> > The runtime initialization mechanisms described above are currently enabled via a set of annotations. This allows qbicc to implement the desired semantics without requiring any changes to the Java compiler, class file format, or language specification. In the long term, we believe small modifications to the Java specification, for example defining a `rtinit { ... }` similar to the existing `static { ... }` construct could enable a simpler specification.
> >
> > The primary annotation for runtime initialization is `RuntimeAspect`. This annotation is defined on a class and is interpreted as meaning that the `<clinit>` method of the class should be interpreted as an `<rtinit>` method.
> > This method will not be executed during build-time initialization and instead will be deferred until the first access of one of the static fields defined in the class.
> >
> > To allow us to "externally" modify JDK core classes for qbicc, we have developed an annotation-driven patcher infrastructure. The patcher allows the declaration of patch classes that add, remove, and modify the methods and fields of an existing class. This modification includes the replacement of the `<clinit>` method and the declaration of multiple `RuntimeAspect` patch classes.
> >
> > The best way to explore what is possible with the patcher is to examine the java.base/src directory in the qbicc-class-library project. It makes extensive use of the patcher annotations to adapt the core JDK classes to qbicc while still allowing us to consume the upstream OpenJDK code base via an unmodified git submodule.
> >
> > ## Design Alternatives
> >
> > A number of alternatives were considered before arriving at the final design documented here. The technical discussions and options considered can be explored starting in qbicc discussion #764 on GitHub.
> >
> >
> > From: Brian Goetz
> > Date: Thursday, May 26, 2022 at 2:21 PM
> > To: David P Grove,"leyden-dev at openjdk.java.net"
> > Subject: [EXTERNAL] Re: Experimentation with build time and runtime class initialization in qbicc
> >
> > Hi David;
> >
> > Would like to understand more about this, but first, from an IP-hygiene perspective, documents linked from this list should be under the OpenJDK terms and conditions.
> > Can you post the contents of that document here, so there are no issues there?
> >
> > Thanks,
> > -Brian
> > On 5/26/2022 12:35 PM, David P Grove wrote:
> >
> > Hi,
> >
> > In the qbicc project, we've been exploring options for adapting Java's class initialization semantics for native images. In particular, we are trying to arrive at non-surprising semantics that, in native-image scenarios, allow most initialization to happen at build-time while still enabling runtime initialization of selected static fields.
> >
> > Our current design and experience is captured here: https://github.com/qbicc/qbicc/wiki/Class-Initialization-in-qbicc. In a nutshell, the idea is to initialize classes via build-time execution of existing <clinit> methods as per normal Java semantics while adding per-static-field <rtinit> methods to provide a capability for runtime-reinitialization of a field before its first access.
> >
> > --dave
> >
> >

From groved at us.ibm.com Fri May 27 16:15:30 2022
From: groved at us.ibm.com (David P Grove)
Date: Fri, 27 May 2022 16:15:30 +0000
Subject: Experimentation with build time and runtime class initialization in qbicc
Message-ID:

From: Brian Goetz
> From reading your notes, it seems that at build time, you start with the root class(es), execute their <clinit>,
> which will cause loading of more classes, more <clinit>, and you iterate until there are no new classes to initialize.
> You then treat the statics as roots, and serialize those objects to the initial heap image. But before doing that,
> you exclude (zero out) any which are marked as "reinitialize at runtime."

This is correct.
In addition, as qbicc serializes each object, it also looks for annotations on instance fields that indicate that instead of serializing the build-time value of the instance field it should substitute a different value (FileDescriptor is a motivating example: we want to serialize a closed FileDescriptor to ensure any runtime reads/writes through it will result in the proper exception being raised).

> ... what happens in cases like this:
>
> class Aliased {
>     @RuntimeInitialized private static final Socket s = ...;
>     private static final Socket copy = s;
> }

First, I'll say what this code snippet would do with qbicc, then I'll say what the program should be to get the semantics the programmer probably intended.

At build time, qbicc will execute the <clinit> of Aliased; presumably a Socket object will be allocated by ... and references to that Socket object will be stored in s and copy. Any build-time usage of either s or copy via a build-time executed getfield will get a reference to that Socket object in the build-time heap. The @RuntimeInitialized has no impact on the build-time execution of code. At the end of compilation, when we serialize the static fields for Aliased, we will write null for s and a reference to the serialized Socket object for copy. In the generated code, all getfields to s will be preceded by checks to ensure that the <rtinit> method for s has been executed (similar to how a clinit check would be generated in a JVM). Since copy does not have a <rtinit>, getfields to copy in the generated code will not be preceded by any checks. The first time s is accessed at runtime, the ... code will be executed by the <rtinit> method and a new Socket object will be created and stored in s. The fields s and copy will now point to distinct Socket objects.
Usages of the Socket object reachable from copy would likely result in an exception because the backing FileDescriptor for the Socket object referenced from copy would have been modified during the serialization process so that its instance fields have values as if the FileDescriptor had been closed.

Using the syntax above, one would need to write this code to get the intended aliasing at both build-time and runtime.

    class Aliased {
        @RuntimeInitialized private static final Socket s = ...;
        @RuntimeInitialized private static final Socket copy = s;
    }

The way we would actually write this pattern in qbicc today is a little more indirect because (1) we don't want to change javac and (2) we don't want to directly edit OpenJDK source code (to make it easier to consume updates). Therefore, we define a "patch class" with a @RuntimeAspect annotation that qbicc combines with the unmodified bytecodes of the Aliased class to get what we need. I've added a third field just to emphasize that we need to allow the <rtinit> of a class to be a subset of its <clinit>.

    class Aliased {
        private static final Socket s = ...;
        private static final Socket copy = s;
        private static final Object anotherField = ...;
    }

    @RuntimeAspect(Aliased.class)
    class Aliased_RT {
        private static final Socket s = ...;
        private static final Socket copy = s;
    }

The only part of the Aliased_RT class we are interested in is the <clinit> method that javac generated for it. The qbicc compiler takes Aliased_RT's <clinit> and uses it as the <rtinit> method for the fields s and copy of the Aliased class. The rest of the Aliased_RT class is ignored.

If one was able to change javac, then the simpler @RuntimeInitialized syntax you had used would be better. From a single class definition, javac could generate both a <clinit> method that initialized s, copy, and anotherField and an <rtinit> method that initialized s and copy.
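The two outcomes described above can be emulated in plain Java without qbicc. The sketch below is only an illustration under stated assumptions: `Resource` is a hypothetical stand-in for `Socket`, the accessor methods stand in for the checks qbicc would generate before reads of a runtime-initialized field, and the fields are set up by hand to mimic what the serialized image would contain. None of the names are qbicc APIs.

```java
public class AliasingSketch {

    /** Hypothetical stand-in for a native resource such as a Socket. */
    static final class Resource {
        final String createdAt;
        Resource(String createdAt) { this.createdAt = createdAt; }
    }

    /** Only `s` is runtime-initialized; `copy` keeps the build-time object. */
    static final class OnlySReinitialized {
        private static Resource s = null;                            // image holds null for s
        private static final Resource copy = new Resource("build");  // image holds the build-time object

        static synchronized Resource s() {
            if (s == null) {             // emulates the generated check before reads of s
                s = new Resource("run");
            }
            return s;
        }

        static Resource copy() { return copy; }  // no check: copy has no runtime initializer
    }

    /** `s` and `copy` share one runtime initializer, so the alias survives. */
    static final class BothReinitialized {
        private static Resource s;
        private static Resource copy;
        private static boolean initialized;

        private static synchronized void ensureInitialized() {
            if (!initialized) {
                s = new Resource("run");
                copy = s;                // same order as the source-level initializers
                initialized = true;
            }
        }

        static synchronized Resource s() { ensureInitialized(); return s; }
        static synchronized Resource copy() { ensureInitialized(); return copy; }
    }

    public static void main(String[] args) {
        System.out.println(OnlySReinitialized.s() == OnlySReinitialized.copy()); // false
        System.out.println(BothReinitialized.s() == BothReinitialized.copy());   // true
    }
}
```

In the first class the two access paths end up at distinct objects, which is the identity difference a program could observe at runtime; in the second, a shared initializer for both fields keeps them pointing at the same object.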
Finally, qbicc does not attempt to recognize when an object that is directly referred to by a @RuntimeInitialized static field is also reachable in some other (perhaps deeply nested) way. As a result, it is certainly possible to write programs where build-time and runtime identity (==) of two access paths is different. So far, this hasn't been an issue for us, but it is one of the ways in which one could detect at runtime that something non-standard has happened.

Hope this explains more clearly without being tediously long,

--dave

From brian.goetz at oracle.com Fri May 27 18:11:16 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 27 May 2022 14:11:16 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To:
References:
Message-ID:

Thanks for the explanation.

I realize that qbicc is largely an exploration, and the current status is a proof-of-concept, so I'm not trying to denigrate the work you've done or score cheap points by picking holes; I am sure this hole is one you are already well aware of. Instead, I'm trying to highlight (as DanH also has) the fact that, no matter how much we might like to think of this as a "tool-based transformation", what is really going on is that a new language is being invented, and its semantics are being retroactively applied to the existing Java language -- which is of course a dangerous and difficult game. (The issue I raised demonstrates that this particular new language is not yet up to the level of safety we expect from Java, but again, the current goal of qbicc is to explore the possibilities, not to seriously propose a new model for initialization. So, all good.)

As a language designer, I have no quibble with inventing new languages, or with evolving the Java language to do new things -- indeed, that's what I do every day. But if we are to evolve the Java language, we must do so honestly and holistically.
How we approach shifting the timing constraints of initialization is going to require careful thought, and likely multiple iterations, before we arrive at a programming model that meets the safety, transparency, and performance requirements such a feature would demand.

As to scope, as already indicated in Mark's announcement, we intend to take an incremental approach. We are definitely interested in exploring whether there is a sufficiently safe and transparent programming model that could get us to build-time initialization, but it is not currently the first priority of Leyden -- there are more foundational (and less intrusive) things we should address first. But the topic is definitely fair game for discussion, and I suspect it will take several iterations before we would get to anything we could consider putting in "everybody's Java."

Cheers,
-Brian

From volker.simonis at gmail.com Fri May 27 19:46:37 2022
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 27 May 2022 12:46:37 -0700
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To:
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com> <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com>
Message-ID:

Dan Heidinga wrote on Fri, 27 May 2022, 08:36:

> On Thu, May 26, 2022 at 9:01 PM Brian Goetz
> wrote:
> >
> > Thanks for providing this.
> >
> > Something about the qbicc approach here doesn't seem to add up to me.
> > Maybe you can tell me what I'm missing.
> >
> > From reading your notes, it seems that at build time, you start with
> > the root class(es), execute their <clinit>, which will cause loading of
> > more classes, more <clinit>, and you iterate until there are no new
> > classes to initialize.
>
> With qbicc we embraced the closed-world constraint and mandated that
> all class initialization happens at build time. While we started with
> runtime class initialization to bootstrap being able to run more code,
> we quickly switched to being all-in on build time init (BTI) due to
> the virtuous cycle between BTI and dead code elimination.
>
> > You then treat the statics as roots, and
> > serialize those objects to the initial heap image. But before doing
> > that, you exclude (zero out) any which are marked as "reinitialize at
> > runtime."
>
> Right.
>
> >
> > The rationale for this clearly is that you want to continue the graph
> > walk to find all the loadable classes, but then don't want to use the
> > polluted value. But what happens in cases like this:
> >
> > class Aliased {
> >     @RuntimeInitialized private static final Socket s = ...;
> >     private static final Socket copy = s;
> > }
> >
> > Do you throw on reads of runtime-initialized fields from a <clinit>?
> > Do you walk the heap and find aliases to runtime-initialized values, and
> > replace them with something (if so, what?) Or is the Aliased class
> > above just "broken" according to this model, and I encounter a
> > stale/nonworking socket in `copy` at runtime, and one that is not
> > properly aliased to `s`? Once an object is initialized at build time,
> > its state can escape into all sorts of other places, and just zeroing
> > out the static root isn't enough to stamp it out.
>
> This is where the "soupy" nature of <clinit> becomes evident. <clinit>
> is a single method that has tremendous side effects, setting static
> fields, initializing other classes, starting threads, caching computed
> values, etc. It's very hard to automatically reason about what has
> happened in a <clinit> method and what the user intends for those side
> effects (if they're even aware of what they all may be!).
>
> What was the user's intent when they initialized 'copy'? To record
> what the original Socket connection - set up at build time - had been
> rather than separately storing the address/port? If they had a
> semantic meaning for `copy` even after `s` had been nulled out, then
> automatically resetting `copy` would violate their expectation.
>
> We need the user to tell us their intent. If they wanted both `s` &
> `copy` to be reset, then they need to be explicit about that and
> annotate both fields. We don't attempt to null all copies of the
> value of a @RuntimeInitialized field.
>
> >
> > Am I missing something?
>
> You seemed to have grasped it correctly =)
>
> If that field had been a primitive, such as a long, we'd be unable to
> track down which other longs in the heap were copies of it or derived
> from it. We wouldn't reset some other location with the value 42
> because a @RuntimeInitialized field was set to 42 at build time. The
> programmer has to take responsibility for which fields need to be
> reset. With qbicc, that's annotations.
With Leyden we may be able to > give them a better way to group fields and express how & when they > should be initialized. > And with CRaC we don't have to care for build-time initialization at all. Instead we just have to make sure that "relevant" fields are being reset before snapshot and correctly re-initialized on resume. The question is which fields have to be considered "relevant" in the CRaC context? Intuitively this will be a subset of the @RuntimeInitialized fields. But for CRaC this question also depends on the snapshot mechanism. If we're using CRIU to checkpoint a single process, sockets and file descriptions will certainly be hot candidates for @RuntimeInitialized fields. On the other hand, if we're snapshotting a complete virtual machine (e.g. with Firecracker) there's no need to reset/re-init file descriptors and even sockets might be handled transparently by the OS. Docker checkpoint is another interesting snapshotting possibility somewhere between single process and whole VM snapshotting. > > --Dan > > > > > Thanks, > > -Brian > > > > > > On 5/26/2022 4:22 PM, David P Grove wrote: > > > Hi, > > > > > > I?ve appended the contents of the referenced wiki > page in this email. Apologies in advance if the formatting doesn?t come > through as intended. > > > > > > There is a full implementation of this (GPLv2 + > Classpath exception) as part of the qbicc project on GitHub. There is also > a GitHub discussion in the qbicc project that links to various GitHub > issues that capture the history that led to the current design. I will not > hyperlink to those here so that if people have any IP concerns, they can > avoid seeing them. They are easily findable. > > > > > > Regards, > > > > > > --dave > > > > > > ## Overview > > > > > > One of the goals of the qbicc project is to explore technical > approaches for adapting Java's specification of class initialization to > fully support native image compilation. 
Enabling build-time evaluation of > complex class initialization logic is essential for obtaining much of the > benefits of native image compilation: reduced memory footprint and fast > startup. However, both the core JDK and many frameworks will not be > primarily be used in native image scenarios. Therefore, it is essential > that the approach taken for build-time initialization enables both the > existing runtime class initialization and the new build-time class > initialization logic to co-exist. Furthermore, for as many cases as > possible, the class initialization code should be shared between the two > usage scenarios and have non-surprising semantics in both. > > > > > > ## Build-time Initialization > > > > > > In qbicc, all classes are initialized at build-time. Class > initialization at build time is performed according to the existing > semantics of Java class initialization driven by build-time execution of > the `` methods of reachable classes. The set of reachable classes > is determined iteratively, starting with the program entrypoints and adding > the methods and classes they utilize until no further reachable classes are > discovered (a fixed point is reached). > > > > > > After build-time initialization has completed, a build-time heap has > been constructed that contains the objects that were created during the > build-time execution of the `` methods. Using the reachable static > fields of the reachable program as roots, this build-time heap is > serialized into the native image. This set of objects will form the > initial runtime heap of the program when it is executed. > > > > > > ## Runtime Initializers > > > > > > There are cases where one or more initialization actions of a class > **must** be executed at program runtime. Most typically these involve the > creation of native resources (open files, threads, etc) that cannot be > successfully serialized into the build time heap. 
> > > > > > Qbicc supports runtime initialization by allowing static fields of a > classes to be declared as runtime initialized. These fields will be > initialized lazily, at first access, by executing a runtime initializer > (``) associated with the accessed field. Runtime initialization is > localized: accessing a particular static field will cause its runtime > initializer to be executed but has no implications for other runtime > initializers defined either in the field's defining class or any superclass > or implemented interface of the field's defining class. > > > > > > When serialized from the build-time heap to the runtime heap, all > runtime-initialized fields will be serialized with the zero (uninitialized) > value appropriate for their type. > > > > > > Qbicc allows related static fields in the same class to share a common > `` method. The first access to any of the fields will cause the > execution of the associated `` method and the initialization of all > the fields. > > > > > > ## Adjusting Heap Serialization > > > > > > For some objects it is necessary to initialize them during build-time > initialization, but "reset" them before they are used at runtime. > > > Qbicc supports this by allowing fields to be annotated to be > serialized as the type-appropriate zero value or as a primitive constant > value. This value replacement happens as the build time heap is serialized. > > > > > > One common scenario is to invalidate objects that are wrapping native > resources. For example, when a `FileDescriptor` is serialized its `fd` and > `handle` instance fields are serialized as `-1` and its `closed` field is > serialize as `true`. Thus, any attempt to use the build-time FileDescriptor > at runtime will raise the appropriate exception. > > > > > > ## Patching: Migration for Existing Classes > > > > > > The runtime initialization mechanisms described above are currently > enabled via a set of annotations. 
> > > This allows qbicc to implement the desired semantics without requiring any changes to the Java compiler, class file format, or language specification. In the long term, we believe small modifications to the Java specification, for example defining a `rtinit { ... }` construct similar to the existing `static { ... }` construct, could enable a simpler specification.
> > >
> > > The primary annotation for runtime initialization is `RuntimeAspect`. This annotation is defined on a class and is interpreted as meaning that the `<clinit>` method of the class should be interpreted as an `<rtinit>` method. This method will not be executed during build-time initialization and instead will be deferred until the first access of one of the static fields defined in the class.
> > >
> > > To allow us to "externally" modify JDK core classes for qbicc, we have developed an annotation-driven patcher infrastructure. The patcher allows the declaration of patch classes that add, remove, and modify the methods and fields of an existing class. This modification includes the replacement of the `<clinit>` method and the declaration of multiple `RuntimeAspect` patch classes.
> > >
> > > The best way to explore what is possible with the patcher is to examine the java.base/src directory in the qbicc-class-library project. It makes extensive use of the patcher annotations to adapt the core JDK classes to qbicc while still allowing us to consume the upstream OpenJDK code base via an unmodified git submodule.
> > >
> > > ## Design Alternatives
> > >
> > > A number of alternatives were considered before arriving at the final design documented here. The technical discussions and options considered can be explored starting in qbicc discussion #764 on GitHub.
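The lazy, per-field behaviour described under "Runtime Initializers" above can be approximated in ordinary Java. The sketch below is not qbicc's mechanism, which the quoted page says is driven by annotations and `<rtinit>` methods; `LazyField`, `START_NANOS`, and `WORKER` are hypothetical names chosen for illustration. It shows only the localized semantics: touching one runtime-initialized field runs that field's initializer and leaves the others alone.

```java
import java.util.function.Supplier;

public class RuntimeInitSketch {
    /** Hypothetical wrapper emulating one runtime-initialized static field. */
    public static final class LazyField<T> {
        private final Supplier<T> initializer; // plays the role of the field's runtime initializer
        private T value;                       // starts as the zero value (null)
        private boolean initialized;

        public LazyField(Supplier<T> initializer) {
            this.initializer = initializer;
        }

        /** Runs the initializer on first access only, then returns the cached value. */
        public synchronized T get() {
            if (!initialized) {
                value = initializer.get();
                initialized = true;
            }
            return value;
        }

        public synchronized boolean isInitialized() {
            return initialized;
        }
    }

    // Ordinary static state: in a native image this could be computed at build time.
    public static final String BANNER = "computed eagerly";

    // State that only makes sense at run time, so it is deferred to first access.
    public static final LazyField<Long> START_NANOS = new LazyField<>(System::nanoTime);
    public static final LazyField<Thread> WORKER =
            new LazyField<>(() -> new Thread(() -> { }, "worker"));

    public static void main(String[] args) {
        START_NANOS.get(); // triggers only the initializer of START_NANOS
        System.out.println("START_NANOS initialized: " + START_NANOS.isInitialized());
        System.out.println("WORKER initialized: " + WORKER.isInitialized());
    }
}
```

The cost of this emulation is an explicit `get()` at every use site, which is one reason a language-level or compiler-level construct is attractive.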
> > >
> > > From: Brian Goetz
> > > Date: Thursday, May 26, 2022 at 2:21 PM
> > > To: David P Grove, "leyden-dev at openjdk.java.net" <leyden-dev at openjdk.java.net>
> > > Subject: [EXTERNAL] Re: Experimentation with build time and runtime class initialization in qbicc
> > >
> > > Hi David;
> > >
> > > Would like to understand more about this, but first, from an IP-hygiene perspective, documents linked from this list should be under the OpenJDK terms and conditions. Can you post the contents of that document here, so there are no issues there?
> > >
> > > Thanks,
> > > -Brian
> > > On 5/26/2022 12:35 PM, David P Grove wrote:
> > >
> > > Hi,
> > >
> > > In the qbicc project, we've been exploring options for adapting Java's class initialization semantics for native images. In particular, we are trying to arrive at a non-surprising semantics that, in native-image scenarios, allows most initialization to happen at build time while still enabling runtime initialization of selected static fields.
> > >
> > > Our current design and experience is captured here: https://github.com/qbicc/qbicc/wiki/Class-Initialization-in-qbicc. In a nutshell, the idea is to initialize classes via build-time execution of existing `<clinit>` methods as per normal Java semantics while adding per-static-field `<rtinit>` methods to provide a capability for runtime-reinitialization of a field before its first access.
> > >
> > > --dave

From brian.goetz at oracle.com Fri May 27 20:44:55 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 27 May 2022 16:44:55 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To:
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com> <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com>
Message-ID: <4ba1f1fa-a165-c176-5c09-271dc8925354@oracle.com>

> If that field had been a primitive, such as a long, we'd be unable to
> track down which other longs in the heap were copies of it or derived
> from it. We wouldn't reset some other location with the value 42
> because a @RuntimeInitialized field was set to 42 at build time. The
> programmer has to take responsibility for which fields need to be
> reset. With qbicc, that's annotations. With Leyden we may be able to
> give them a better way to group fields and express how & when they
> should be initialized.

I know Dan knows this, but for the broader audience, let me remind everyone that annotations are not likely to be a suitable mechanism for anything other than prototyping here. If something affects language semantics (and all of this does), it needs to be part of the language. But they're a fine tool for prototypes and proofs-of-concept.

From christian.wimmer at oracle.com Sat May 28 16:39:13 2022
From: christian.wimmer at oracle.com (Christian Wimmer)
Date: Sat, 28 May 2022 09:39:13 -0700
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To:
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com> <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com>
Message-ID: <5d85deac-4a14-3898-c4ac-54f7cc141d11@oracle.com>

Hi,

I agree with the "soupy nature" of <clinit> methods mentioned below. This makes it impossible in general to reverse-engineer which parts of <clinit> initialize which static field.
One suggestion for how that could be improved: instead of emitting a single <clinit> method, javac can emit separate methods for each static field that is initialized inline as part of the field declaration, as well as each static{} block. With a consistent naming scheme of these methods, it would be much easier to run some initializations at build time and some at run time. For compatibility, the <clinit> method could be a chain of invocations of the methods (or maybe <clinit> itself is no longer necessary at all).

So for example a class

class MyClass {
  static Object o1 = "abc";
  static {
    foo();
  }
  static Object o2 = 42;
}

the Java compiler would create the methods (written here with disassembled bytecode)

() {
  o1 = "abc"
}
() {
  foo();
}
() {
  o2 = 42;
}
<clinit>() {
  ();
  ();
  ();
}

Why such a scheme? It is much easier to prove here that the field o2 can be initialized at build time regardless of what foo() is doing, and then remove the run-time initialization of o2 by replacing o2's initializer method with an empty method. All of that can be done without analyzing and modifying the bytecode soup of the current <clinit> method.

-Christian

On 5/27/22 08:35, Dan Heidinga wrote:
> On Thu, May 26, 2022 at 9:01 PM Brian Goetz wrote:
>> Thanks for providing this.
>>
>> Something about the qbicc approach here doesn't seem to add up to me.
>> Maybe you can tell me what I'm missing.
>>
>> From reading your notes, it seems that at build time, you start with
>> the root class(es), execute their <clinit>, which will cause loading of
>> more classes, more <clinit>, and you iterate until there are no new
>> classes to initialize.
> With qbicc we embraced the closed-world constraint and mandated that
> all class initialization happens at build time. While we started with
> runtime class initialization to bootstrap being able to run more code,
> we quickly switched to being all-in on build time init (BTI) due to
> the virtuous cycle between BTI and dead code elimination.
>
>> You then treat the statics as roots, and
>> serialize those objects to the initial heap image. But before doing
>> that, you exclude (zero out) any which are marked as "reinitialize at
>> runtime."
> Right.
>
>> The rationale for this clearly is that you want to continue the graph
>> walk to find all the loadable classes, but then don't want to use the
>> polluted value. But what happens in cases like this:
>>
>> class Aliased {
>>     @RuntimeInitialized private static final Socket s = ...;
>>     private static final Socket copy = s;
>> }
>>
>> Do you throw on reads of runtime-initialized fields from a <clinit>? Do
>> you walk the heap and find aliases to runtime-initialized values, and
>> replace them with something (if so, what?) Or is the Aliased class
>> above just "broken" according to this model, and I encounter a
>> stale/nonworking socket in `copy` at runtime, and one that is not
>> properly aliased to `s`? Once an object is initialized at build time,
>> its state can escape into all sorts of other places, and just zeroing
>> out the static root isn't enough to stamp it out.
> This is where the "soupy" nature of <clinit> becomes evident. <clinit>
> is a single method that has tremendous side effects, setting static
> fields, initializing other classes, starting threads, caching computed
> values, etc. It's very hard to automatically reason about what has
> happened in a <clinit> method and what the user intends for those side
> effects (if they're even aware of what they all may be!).
>
> What was the user's intent when they initialized 'copy'? To record
> what the original Socket connection - set up at build time - had been
> rather than separately storing the address/port? If they had a
> semantic meaning for `copy` even after `s` had been nulled out, then
> automatically resetting `copy` would violate their expectation.
>
> We need the user to tell us their intent. If they wanted both `s` &
> `copy` to be reset, then they need to be explicit about that and
> annotate both fields.
> We don't attempt to null all copies of the
> value of a @RuntimeInitialized field.
>
>> Am I missing something?
> You seemed to have grasped it correctly =)
>
> If that field had been a primitive, such as a long, we'd be unable to
> track down which other longs in the heap were copies of it or derived
> from it. We wouldn't reset some other location with the value 42
> because a @RuntimeInitialized field was set to 42 at build time. The
> programmer has to take responsibility for which fields need to be
> reset. With qbicc, that's annotations. With Leyden we may be able to
> give them a better way to group fields and express how & when they
> should be initialized.
>
> --Dan
>
>> Thanks,
>> -Brian
>>
>>
>> On 5/26/2022 4:22 PM, David P Grove wrote:
>>> Hi,
>>>
>>> I've appended the contents of the referenced wiki page in this email. Apologies in advance if the formatting doesn't come through as intended.
>>>
>>> There is a full implementation of this (GPLv2 + Classpath exception) as part of the qbicc project on GitHub. There is also a GitHub discussion in the qbicc project that links to various GitHub issues that capture the history that led to the current design. I will not hyperlink to those here so that if people have any IP concerns, they can avoid seeing them. They are easily findable.
>>>
>>> Regards,
>>>
>>> --dave

From christian.wimmer at oracle.com Sat May 28 16:46:45 2022
From: christian.wimmer at oracle.com (Christian Wimmer)
Date: Sat, 28 May 2022 09:46:45 -0700
Subject: Project Leyden: Beginnings
In-Reply-To:
References: <20220520144202.C213853774D@eggemoggin.niobe.net>
Message-ID: <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>

Hi Andrew,

Since you mentioned GraalVM: note that the GraalVM team at Oracle was advised to not talk about Native Image on this mailing list. Only when concrete questions arise, we will be happy to explain how things are handled in Native Image.

-Christian

On 5/27/22 06:33, Andrew Haley wrote:
> On 5/20/22 15:42, mark.reinhold at oracle.com wrote:
>
> > Let us begin!
>
> As you'd expect, here at Red Hat there's a variety of opinions.
> Rather than simply post my own response to this, I've been talking to
> Middleware architects (the likely _users_ of Leyden!) as well as our
> OpenJDK team members. Here's what we think:
>
>   We're excited to see Leyden taking shape and will be active
>   participants. Our customers are benefiting from GraalVM today and as
>   such we'll continue to engage with that project as the Leyden ideas
>   are explored and take shape.
>
>   Bringing standardization to this space is important for developers
>   as it will clarify the behaviours they can depend on. As we engage
>   in updating the standard, we should consider not just new behavior,
>   but also exceptions and variations that can accommodate some of the
>   existing behavior of GraalVM such as build-time initialization.
>
>   The proposed incremental approach will ensure we bring along the
>   current ecosystem and devtools while carefully introducing any new
>   constraints. We also see the benefit in segmenting the problem space
>   into discrete areas that can be introduced sooner rather than
>   waiting for a big-bang integration of multiple constraints.
>
>   > In the long run we will likely embrace the full closed-world
>   > constraint in order to produce fully-static images.
>
>   Our experience with Java on K8s and containers (notably Quarkus) has
>   demonstrated real world benefits of a closed-world approach, so it's
>   good to see it explicitly listed as a likely goal in the long
>   run. It will be important that Leyden is careful to specify its
>   efforts in the fast start / small footprint space while being
>   mindful of that constraint. Hopefully, we can all work towards a
>   future that converges both GraalVM's efforts and those of Leyden.
>

From brian.goetz at oracle.com Sat May 28 16:58:37 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 28 May 2022 12:58:37 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To: <5d85deac-4a14-3898-c4ac-54f7cc141d11@oracle.com>
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com> <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com> <5d85deac-4a14-3898-c4ac-54f7cc141d11@oracle.com>
Message-ID: <9f86c5f9-bf30-aca8-73bf-461ab4b8cd1d@oracle.com>

I too agree that the "soupy" nature of <clinit> makes reverse-engineering difficult, and that this alternate translation would make things easier for an after-the-fact analysis tool that is trying to reason about what computations could be safely shifted in time.

But, keep in mind that it's not a free lunch. To point out the obvious tradeoff: this turns into a startup hit for every dynamically executed Java program (larger classfiles, more bytecodes, more methods). This is a tradeoff we would have to consider carefully, since making Java startup slower in general is not a cost we should take on lightly, especially given the charter of this project. So, something for the "could consider" list, but not a slam-dunk.

On 5/28/2022 12:39 PM, Christian Wimmer wrote:
> Hi,
>
> I agree with the "soupy nature" of <clinit> methods mentioned below.
> This makes it impossible in general to reverse-engineer which parts of
> <clinit> initialize which static field. One suggestion how that could
> be improved: Instead of emitting a single <clinit> method, javac can
> emit separate methods for each static field that is
> initialized inline as part of the field declaration, as well as each
> static{} block. With a consistent naming scheme of these methods, it
> would be much easier to run some initializations at build time and
> some at run time. For compatibility, the <clinit> method could be a
> chain of invocations of the methods (or maybe <clinit>
> itself is no longer necessary at all).
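The per-initializer split that Christian proposes can be mimicked with ordinary static methods, which makes the "chain of invocations" idea concrete. This is a sketch under stated assumptions, not what javac emits: the method names `initO1`, `initBlock0`, and `initO2`, the `trace` list, and the `chain` helper are hypothetical, and "build time" is simulated by calling one step early.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitInitSketch {
    public static final List<String> trace = new ArrayList<>();
    public static Object o1;
    public static Object o2;

    static void foo() { /* arbitrary side effects would go here */ }

    // One method per field initializer or static block (hypothetical names).
    public static void initO1()     { o1 = "abc"; trace.add("o1"); }
    public static void initBlock0() { foo();      trace.add("block0"); }
    public static void initO2()     { o2 = 42;    trace.add("o2"); }

    /**
     * The compatibility chain: the steps run in declaration order.
     * If o2 was already initialized ahead of time, its step is replaced
     * by an empty method, as suggested in the thread.
     */
    public static List<Runnable> chain(boolean o2DoneAtBuildTime) {
        List<Runnable> steps = new ArrayList<>();
        steps.add(SplitInitSketch::initO1);
        steps.add(SplitInitSketch::initBlock0);
        steps.add(o2DoneAtBuildTime ? () -> { } : SplitInitSketch::initO2);
        return steps;
    }

    public static void main(String[] args) {
        initO2();                        // simulated build-time step
        for (Runnable step : chain(true)) {
            step.run();                  // simulated run-time initialization
        }
        System.out.println("order: " + trace + ", o2 = " + o2);
    }
}
```

Hoisting `initO2` is safe here only because it assigns a constant and does not depend on anything `foo()` does; the extra methods are also the classfile-size and startup cost that Brian's reply points out.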
> > So for example a class > > class MyClass { > ? static Object o1 = "abc"; > ? static { > ??? foo(); > ? } > ? static Object o2 = 42; > } > > the Java compiler would create the methods (written here with > disassembled bytecode) > > () { > ? o1 = "abc" > } > () { > ? foo(); > } > () { > ? o2 = 42; > } > () { > ? (); > ? (); > ? (); > } > > Why such a scheme? It is much easier to prove here that the field o2 > can be initialized at build time regardless of what foo() is doing, > and then remove the run-time initialization of o2 by replacing > with an empty method. All of that can be done without > analyzing and modifying the bytecode soup of the current method. > > -Christian > > > On 5/27/22 08:35, Dan Heidinga wrote: >> On Thu, May 26, 2022 at 9:01 PM Brian Goetz >> wrote: >>> Thanks for providing this. >>> >>> Something about the qbicc approach here doesn't seem to add up to me. >>> Maybe you can tell me what I'm missing. >>> >>> ? From reading your notes, it seems that at build time, you start with >>> the root class(es), execute their , which will cause loading of >>> more classes, more , and you iterate until there are no new >>> classes to initialize. >> With qbicc we embraced the closed-world constraint and mandated that >> all class initialization happens at build time.? While we started with >> runtime class initialization to bootstrap being able to run more code, >> we quickly switched to being all-in on build time init (BTI) due to >> the virtuous cycle between BTI and dead code elimination. >> >>> You then treat the statics as roots, and >>> serialize those objects to the initial heap image.? But before doing >>> that, you exclude (zero out) any which are marked as "reinitialize at >>> runtime." >> Right. >> >>> The rationale for this clearly is that you want to continue the graph >>> walk to find all the loadable classes, but then don't want to use the >>> polluted value.? But what happens in cases like this: >>> >>> ????? 
class Aliased { >>> ????????? @RuntimeInitialized private static final Socket s = ...; >>> ????????? private static final Socket copy = s; >>> ????? } >>> >>> Do you throw on reads of runtime-initialized fields from a >>> ?? Do >>> you walk the heap and find aliases to runtime-initialized values, and >>> replace them with something (if so, what?)? Or is the Aliased class >>> above just "broken" according to this model, and I encounter a >>> stale/nonworking socket in `copy` at runtime, and one that is not >>> properly aliased to `s`?? Once an object is initialized at build time, >>> its state can escape into all sorts of other places, and just zeroing >>> out the static root isn't enough to stamp it out. >> This is where the "soupy" nature of becomes evident. >> is a single method that has tremendous side effects, setting static >> fields, initializing other classes, starting threads, caching computed >> values, etc.? It's very hard to automatically reason about what has >> happened in a method and what the user intends for those side >> effects (if they're even aware of what they all may be!). >> >> What was the user's intent when they initialized 'copy'?? To record >> what the original Socket connection - set up at build time - had been >> rather than separately storing the address/port?? If they had a >> semantic meaning for `copy` even after `s` had been nulled out, then >> automatically resetting `copy` would violate their expectation. >> >> We need the user to tell us their intent.? If they wanted both `s` & >> `copy` to be reset, then they need to be explicit about that and >> annotate both fields.? We don't attempt to null all copies of the >> value of a @RuntimeInitialized field. >> >>> Am I missing something? >> You seemed to have grasped it correctly =) >> >> If that field had been a primitive, such as a long, we'd be unable to >> track down which other longs in the heap were copies of it or derived >> from it.? 
We wouldn't reset some other location with the value 42 >> because a @RuntimeInitialized field was set to 42 at build time.? The >> programmer has to take responsibility for which fields need to be >> reset.? With qbicc, that's annotations.? With Leyden we may be able to >> give them a better way to group fields and express how & when they >> should be initialized. >> >> --Dan >> >>> Thanks, >>> -Brian >>> >>> >>> On 5/26/2022 4:22 PM, David P Grove wrote: >>>> Hi, >>>> >>>> ????????????????? I?ve appended the contents of the referenced wiki >>>> page in this email.? Apologies in advance if the formatting doesn?t >>>> come through as intended. >>>> >>>> ????????????????? There is a full implementation of this (GPLv2 + >>>> Classpath exception) as part of the qbicc project on GitHub.? There >>>> is also a GitHub discussion in the qbicc project that links to >>>> various GitHub issues that capture the history that led to the >>>> current design.? I will not hyperlink to those here so that if >>>> people have any IP concerns, they can avoid seeing them.? They are >>>> easily findable. >>>> >>>> Regards, >>>> >>>> --dave >>>> >>>> ## Overview >>>> >>>> One of the goals of the qbicc project is to explore technical >>>> approaches for adapting Java's specification of class >>>> initialization to fully support native image compilation.? Enabling >>>> build-time evaluation of complex class initialization logic is >>>> essential for obtaining much of the benefits of native image >>>> compilation: reduced memory footprint and fast startup.? However, >>>> both the core JDK and many frameworks will not be primarily be used >>>> in native image scenarios.? Therefore, it is essential that the >>>> approach taken for build-time initialization enables both the >>>> existing runtime class initialization and the new build-time class >>>> initialization logic to co-exist. 
Furthermore, for as many cases as >>>> possible, the class initialization code should be shared between >>>> the two usage scenarios and have non-surprising semantics in both. >>>> >>>> ## Build-time Initialization >>>> >>>> In qbicc, all classes are initialized at build-time. Class >>>> initialization at build time is performed according to the existing >>>> semantics of Java class initialization driven by build-time >>>> execution of the `` methods of reachable classes. The set >>>> of reachable classes is determined iteratively, starting with the >>>> program entrypoints and adding the methods and classes they utilize >>>> until no further reachable classes are discovered (a fixed point is >>>> reached). >>>> >>>> After build-time initialization has completed, a build-time heap >>>> has been constructed that contains the objects that were created >>>> during the build-time execution of the `` methods.? Using >>>> the reachable static fields of the reachable program as roots, this >>>> build-time heap is serialized into the native image.? This set of >>>> objects will form the initial runtime heap of the program when it >>>> is executed. >>>> >>>> ## Runtime Initializers >>>> >>>> There are cases where one or more initialization actions of a class >>>> **must** be executed at program runtime.? Most typically these >>>> involve the creation of native resources (open files, threads, etc) >>>> that cannot be successfully serialized into the build time heap. >>>> >>>> Qbicc supports runtime initialization by allowing static fields of >>>> a classes to be declared as runtime initialized. These fields will >>>> be initialized lazily, at first access, by executing a runtime >>>> initializer (``) associated with the accessed field.? 
>>>> Runtime initialization is localized: accessing a particular static >>>> field will cause its runtime initializer to be executed but has no >>>> implications for other runtime initializers defined either in the >>>> field's defining class or any superclass or implemented interface >>>> of the field's defining class. >>>> >>>> When serialized from the build-time heap to the runtime heap, all >>>> runtime-initialized fields will be serialized with the zero >>>> (uninitialized) value appropriate for their type. >>>> >>>> Qbicc allows related static fields in the same class to share a >>>> common `` method. The first access to any of the fields >>>> will cause the execution of the associated `` method and >>>> the initialization of all the fields. >>>> >>>> ## Adjusting Heap Serialization >>>> >>>> For some objects it is necessary to initialize them during >>>> build-time initialization, but "reset" them before they are used at >>>> runtime. >>>> Qbicc supports this by allowing fields to be annotated to be >>>> serialized as the type-appropriate zero value or as a primitive >>>> constant value. This value replacement happens as the build time >>>> heap is serialized. >>>> >>>> One common scenario is to invalidate objects that are wrapping >>>> native resources. For example, when a `FileDescriptor` is >>>> serialized its `fd` and `handle` instance fields are serialized as >>>> `-1` and its `closed` field is serialize as `true`. Thus, any >>>> attempt to use the build-time FileDescriptor at runtime will raise >>>> the appropriate exception. >>>> >>>> ## Patching: Migration for Existing Classes >>>> >>>> The runtime initialization mechanisms described above are currently >>>> enabled via a set of annotations.? This allows qbicc to implement >>>> the desired semantics without requiring any changes to the Java >>>> compiler, class file format, or language specification. 
In the long
>>>> term, we believe small modifications to the Java specification, for
>>>> example defining a `rtinit { ... }` similar to the existing `static
>>>> { ... }` construct, could enable a simpler specification.
>>>>
>>>> The primary annotation for runtime initialization is
>>>> `RuntimeAspect`. This annotation is defined on a class and is
>>>> interpreted as meaning that the `<clinit>` method of the class
>>>> should be interpreted as an `<rtinit>` method. This method will
>>>> not be executed during build-time initialization and instead will
>>>> be deferred until the first access of one of the static fields
>>>> defined in the class.
>>>>
>>>> To allow us to "externally" modify JDK core classes for qbicc, we
>>>> have developed an annotation-driven patcher infrastructure. The
>>>> patcher allows the declaration of patch classes that add, remove,
>>>> and modify the methods and fields of an existing class. This
>>>> modification includes the replacement of the `<clinit>` method and
>>>> the declaration of multiple `RuntimeAspect` patch classes.
>>>>
>>>> The best way to explore what is possible with the patcher is to
>>>> examine the java.base/src directory in the qbicc-class-library
>>>> project. It makes extensive use of the patcher annotations to adapt
>>>> the core JDK classes to qbicc while still allowing us to consume
>>>> the upstream OpenJDK code base via an unmodified git submodule.
>>>>
>>>> ## Design Alternatives
>>>>
>>>> A number of alternatives were considered before arriving at the
>>>> final design documented here. The technical discussions and
>>>> options considered can be explored starting in qbicc discussion
>>>> #764 on GitHub.
>>>>
>>>>
>>>> From: Brian Goetz
>>>> Date: Thursday, May 26, 2022 at 2:21 PM
>>>> To: David P Grove, "leyden-dev at openjdk.java.net"
>>>> Subject: [EXTERNAL] Re: Experimentation with build time and runtime
>>>> class initialization in qbicc
>>>>
>>>> Hi David;
>>>>
>>>> Would like to understand more about this, but first, from an
>>>> IP-hygiene perspective, documents linked from this list should be
>>>> under the OpenJDK terms and conditions. Can you post the contents
>>>> of that document here, so there are no issues there?
>>>>
>>>> Thanks,
>>>> -Brian
>>>> On 5/26/2022 12:35 PM, David P Grove wrote:
>>>>
>>>> Hi,
>>>>
>>>> In the qbicc project, we've been exploring options for adapting
>>>> Java's class initialization semantics for native images. In
>>>> particular, we are trying to arrive at a non-surprising semantics
>>>> that in native-image scenarios allows most initialization to
>>>> happen at build-time while still enabling runtime initialization of
>>>> selected static fields.
>>>>
>>>> Our current design and experience is captured here:
>>>> https://github.com/qbicc/qbicc/wiki/Class-Initialization-in-qbicc.
>>>> In a nutshell, the idea is to initialize classes via build-time
>>>> execution of existing <clinit> methods as per normal Java semantics
>>>> while adding per-static-field <rtinit> methods to provide a
>>>> capability for runtime-reinitialization of a field before its first
>>>> access.
>>>>
>>>> --dave
>>>>

From christian.wimmer at oracle.com Sat May 28 17:15:23 2022
From: christian.wimmer at oracle.com (Christian Wimmer)
Date: Sat, 28 May 2022 10:15:23 -0700
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To: <9f86c5f9-bf30-aca8-73bf-461ab4b8cd1d@oracle.com>
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com> <2de6008c-7e3e-c25e-1ba3-356532d498e7@oracle.com> <5d85deac-4a14-3898-c4ac-54f7cc141d11@oracle.com> <9f86c5f9-bf30-aca8-73bf-461ab4b8cd1d@oracle.com>
Message-ID: <71a04d1c-1e70-444c-e787-55dc131df8b6@oracle.com>

Certainly everything comes with a tradeoff. But I would argue that the
cost of the current workaround to influence static field
initializations - make a separate static inner class for a static field
that should be initialized separately - is even higher because it
requires a full class data structure just to hold a single field. Even
in the JDK, the number of inner classes named "Lazy" is growing. A more
fine-grained initialization of fields within a class can help to reduce
such overhead.

-Christian

On 5/28/22 09:58, Brian Goetz wrote:
> I too agree that the "soupy" nature of <clinit> makes
> reverse-engineering difficult, and that this alternate translation
> would make things easier for an after-the-fact analysis tool that is
> trying to reason about what computations could be safely shifted in time.
>
> But, keep in mind that it's not a free lunch. To point out the
> obvious tradeoff: this turns into a startup hit for every dynamically
> executed Java program (larger classfiles, more bytecodes, more
> methods). This is a tradeoff we would have to consider carefully,
> since making Java startup slower in general is not a cost we should
> take on lightly, especially given the charter of this project. So,
> something for the "could consider" list, but not a slam-dunk.
>
> On 5/28/2022 12:39 PM, Christian Wimmer wrote:
>> Hi,
>>
>> I agree with the "soupy nature" of <clinit> methods mentioned below.
>> This makes it impossible in general to reverse-engineer which parts
>> of <clinit> initialize which static field. One suggestion how that
>> could be improved: Instead of emitting a single <clinit> method,
>> javac can emit separate methods for each static field
>> that is initialized inline as part of the field declaration, as well
>> as each static{} block. With a consistent naming scheme of these
>> methods, it would be much easier to run some initializations at build
>> time and some at run time. For compatibility, the <clinit> method
>> could be a chain of invocations of the methods (or maybe <clinit>
>> itself is no longer necessary at all).
>>
>> So for example a class
>>
>> class MyClass {
>>   static Object o1 = "abc";
>>   static {
>>     foo();
>>   }
>>   static Object o2 = 42;
>> }
>>
>> the Java compiler would create the methods (written here with
>> disassembled bytecode)
>>
>> () {
>>   o1 = "abc"
>> }
>> () {
>>   foo();
>> }
>> () {
>>   o2 = 42;
>> }
>> <clinit>() {
>>   ();
>>   ();
>>   ();
>> }
>>
>> Why such a scheme? It is much easier to prove here that the field o2
>> can be initialized at build time regardless of what foo() is doing,
>> and then remove the run-time initialization of o2 by replacing
>> with an empty method. All of that can be done without
>> analyzing and modifying the bytecode soup of the current <clinit>
>> method.
>>
>> -Christian
>>
>>
>> On 5/27/22 08:35, Dan Heidinga wrote:
>>> On Thu, May 26, 2022 at 9:01 PM Brian Goetz
>>> wrote:
>>>> Thanks for providing this.
>>>>
>>>> Something about the qbicc approach here doesn't seem to add up to me.
>>>> Maybe you can tell me what I'm missing.
>>>>
>>>> From reading your notes, it seems that at build time, you start with
>>>> the root class(es), execute their <clinit>, which will cause
>>>> loading of
>>>> more classes, more <clinit>, and you iterate until there are no new
>>>> classes to initialize.
>>> With qbicc we embraced the closed-world constraint and mandated that
>>> all class initialization happens at build time. While we started with
>>> runtime class initialization to bootstrap being able to run more code,
>>> we quickly switched to being all-in on build time init (BTI) due to
>>> the virtuous cycle between BTI and dead code elimination.
>>>
>>>> You then treat the statics as roots, and
>>>> serialize those objects to the initial heap image. But before doing
>>>> that, you exclude (zero out) any which are marked as "reinitialize at
>>>> runtime."
>>> Right.
>>>
>>>> The rationale for this clearly is that you want to continue the graph
>>>> walk to find all the loadable classes, but then don't want to use the
>>>> polluted value. But what happens in cases like this:
>>>>
>>>>     class Aliased {
>>>>         @RuntimeInitialized private static final Socket s = ...;
>>>>         private static final Socket copy = s;
>>>>     }
>>>>
>>>> Do you throw on reads of runtime-initialized fields from a
>>>> <clinit>? Do
>>>> you walk the heap and find aliases to runtime-initialized values, and
>>>> replace them with something (if so, what?)? Or is the Aliased class
>>>> above just "broken" according to this model, and I encounter a
>>>> stale/nonworking socket in `copy` at runtime, and one that is not
>>>> properly aliased to `s`? Once an object is initialized at build time,
>>>> its state can escape into all sorts of other places, and just zeroing
>>>> out the static root isn't enough to stamp it out.
>>> This is where the "soupy" nature of <clinit> becomes evident. <clinit>
>>> is a single method that has tremendous side effects, setting static
>>> fields, initializing other classes, starting threads, caching computed
>>> values, etc. It's very hard to automatically reason about what has
>>> happened in a <clinit> method and what the user intends for those side
>>> effects (if they're even aware of what they all may be!).
>>>
>>> What was the user's intent when they initialized 'copy'? To record
>>> what the original Socket connection - set up at build time - had been
>>> rather than separately storing the address/port? If they had a
>>> semantic meaning for `copy` even after `s` had been nulled out, then
>>> automatically resetting `copy` would violate their expectation.
>>>
>>> We need the user to tell us their intent. If they wanted both `s` &
>>> `copy` to be reset, then they need to be explicit about that and
>>> annotate both fields. We don't attempt to null all copies of the
>>> value of a @RuntimeInitialized field.
>>>
>>>> Am I missing something?
>>> You seemed to have grasped it correctly =)
>>>
>>> If that field had been a primitive, such as a long, we'd be unable to
>>> track down which other longs in the heap were copies of it or derived
>>> from it. We wouldn't reset some other location with the value 42
>>> because a @RuntimeInitialized field was set to 42 at build time. The
>>> programmer has to take responsibility for which fields need to be
>>> reset. With qbicc, that's annotations. With Leyden we may be able to
>>> give them a better way to group fields and express how & when they
>>> should be initialized.
>>>
>>> --Dan
>>>
>>>> Thanks,
>>>> -Brian
>>>>
>>>>
>>>> On 5/26/2022 4:22 PM, David P Grove wrote:
>>>>> Hi,
>>>>>
>>>>> I've appended the contents of the referenced
>>>>> wiki page in this email. Apologies in advance if the formatting
>>>>> doesn't come through as intended.
>>>>>
>>>>> There is a full implementation of this (GPLv2 +
>>>>> Classpath exception) as part of the qbicc project on GitHub.
>>>>> There is also a GitHub discussion in the qbicc project that links
>>>>> to various GitHub issues that capture the history that led to the
>>>>> current design. I will not hyperlink to those here so that if
>>>>> people have any IP concerns, they can avoid seeing them. They are
>>>>> easily findable.
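The `Aliased` case discussed above can be sketched in plain Java. This is only an illustration of the aliasing behavior, not qbicc code: `FakeSocket`, `AliasedSketch`, and `simulateSerializationReset()` are hypothetical names, and the reset method merely stands in for the build-time heap serializer zeroing an annotated static root.

```java
// Hypothetical stand-in for a native resource; not a JDK or qbicc type.
final class FakeSocket {
}

final class AliasedSketch {
    // Assumption: imagine this field carries the runtime-initialized annotation.
    static FakeSocket s = new FakeSocket();

    // Unannotated field that captured the same object during class initialization.
    static final FakeSocket copy = s;

    // Assumption: models the serializer zeroing only the annotated static root.
    static void simulateSerializationReset() {
        s = null;
    }
}

final class AliasDemo {
    public static void main(String[] args) {
        AliasedSketch.simulateSerializationReset();
        // The annotated root is cleared, but the alias still refers to the
        // object that was created at "build time".
        System.out.println("s is null: " + (AliasedSketch.s == null));
        System.out.println("copy is null: " + (AliasedSketch.copy == null));
    }
}
```

In this sketch only the annotated root is cleared, which matches the statement that both fields would need to be annotated if both are meant to be reset.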
From aph at redhat.com Sat May 28 18:13:05 2022
From: aph at redhat.com (Andrew Haley)
Date: Sat, 28 May 2022 19:13:05 +0100
Subject: Project Leyden: Beginnings
In-Reply-To: <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>
References: <20220520144202.C213853774D@eggemoggin.niobe.net> <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>
Message-ID:

On 5/28/22 17:46, Christian Wimmer wrote:
> Since you mentioned GraalVM: note that the GraalVM team at Oracle was
> advised to not talk about Native Image on this mailing list. Only when
> concrete questions arise, we will be happy to explain how things are
> handled in Native Image.

I see. Thanks for letting us know.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From tanksherman27 at gmail.com Mon May 30 13:07:51 2022
From: tanksherman27 at gmail.com (Julian Waters)
Date: Mon, 30 May 2022 21:07:51 +0800
Subject: Can Ahead of Time code benefit regular Java applications too?
Message-ID:

Hi all,

Since Leyden's goal has shifted from originally exploring only binaries
compiled directly to native code, to "address the long-term pain points
of Java's slow startup time, slow time to peak performance, and large
footprint", would there be any merit in looking at allowing native code
to be embedded within jars to bypass the Interpreter at runtime?
(Maybe have Ahead of Time code that replaces the Interpreter be compiled
by C1, and treat it as part of the C1 pipeline so it can be profiled
while being run.) Ideally it'd be similar to the now defunct jaotc, but
more compact (within the jar itself or perhaps the classfiles somehow)
instead of compiling the Ahead of Time code into an entirely separate
file which then needs to be explicitly passed to the JVM at runtime.
This may or may not be a good starting point before advancing to
entirely standalone Java binaries, but I digress. Perhaps the experience
of the CRaC team would be of some help in this area?

best regards,
Julian

From akozlov at azul.com Mon May 30 17:24:55 2022
From: akozlov at azul.com (Anton Kozlov)
Date: Mon, 30 May 2022 20:24:55 +0300
Subject: Project Leyden: Beginnings
Message-ID: <7c59af5c-9ede-19fb-7865-7bb854e93ca7@azul.com>

Great to see Leyden started!

I also think that Leyden and CRaC are working toward the same goal from
different sides. Assuming a program to be a series of actions,

    A B C D E F

In CRaC, we don't need to specify dependencies and correctness of all
permutations; the only difference from the usual Java programming model
is that between each subsequent pair of actions checkpoint and restore
may happen. Those are rather transparent, except they may invalidate
some assumptions about the program environment. So a non-trivial task
for the CRaC Project is how to find these assumptions implemented in the
program state. If an explicit language-level mechanism is developed in
Leyden to mark executions for build time, the same mechanism will
probably be useful to mark now-wrong assumptions after restore for CRaC.
So I'm excited and curious about what's going to be developed in Leyden.

I also hope the CRaC Project will be useful for Leyden. Not sure about
pieces of the program or VM state, like standalone JIT compilations;
those would be orthogonal tasks for CRaC.
But why not a closed world start image that is restored into an open
world Java instance, to achieve a great start-up, warm-up, and footprint
of Leyden, without sacrificing the dynamic Java features in the runtime.

Thanks,
Anton

From adinn at redhat.com Tue May 31 09:32:59 2022
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 31 May 2022 10:32:59 +0100
Subject: Project Leyden: Beginnings
In-Reply-To: <7c59af5c-9ede-19fb-7865-7bb854e93ca7@azul.com>
References: <7c59af5c-9ede-19fb-7865-7bb854e93ca7@azul.com>
Message-ID:

Hi Anton,

On 30/05/2022 18:24, Anton Kozlov wrote:
> . . . But why not a closed world start
> image that is restored into an open world Java instance, to achieve a
> great start-up, warm-up, and footprint of Leyden, without sacrificing
> the dynamic Java features in the runtime.

I'm not sure what you mean by 'a closed world start image' but the
picture you are painting here sounds a little bit too rosy. There are
some complexities here that are perhaps worth articulating, even if I am
misreading your comment and you are already aware of them.

One has to bear in mind that a closed world as defined by full program
analysis (possibly supplemented with user directives to embrace things
like reflective targets) can exclude everything that is not marked as
reachable during the analysis from its generated image, maybe whole
classes in some cases, or maybe just static/instance fields and methods
of some classes.

Whereas, a bootstrap image that needs to allow for operation of code
loaded in an open world must still include exported, public (or, in
some cases, protected) elements of the code base even if they are not
reachable during the analysis. That is needed because yet to be loaded
code may end up referencing those public (or protected) elements.
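The difference between the two retention rules described above can be sketched as a small worklist reachability computation. This is a toy model under stated assumptions, not how any real image builder works: `RetentionSketch`, the string element names, and the `exported` set are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RetentionSketch {
    // Worklist reachability: everything transitively referenced from the roots.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String element = work.pop();
            if (seen.add(element)) {
                work.addAll(refs.getOrDefault(element, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical reference graph: which element uses which other elements.
        Map<String, List<String>> refs = Map.of(
                "App.main", List.of("Util.parse"),
                "Util.parse", List.of("Util.trim"),
                "Util.format", List.of("Util.pad"));
        // Hypothetical exported, public API that later-loaded code could call.
        Set<String> exported = Set.of("Util.parse", "Util.format");

        // Closed world: only the application entry point is a root.
        Set<String> closed = reachable(refs, List.of("App.main"));

        // Open world: exported elements must also be treated as roots.
        List<String> openRoots = new ArrayList<>(exported);
        openRoots.add("App.main");
        Set<String> open = reachable(refs, openRoots);

        System.out.println("closed world keeps: " + closed);
        System.out.println("open world keeps:   " + open);
    }
}
```

In this toy graph the closed-world set omits `Util.format` and `Util.pad`, while the open-world set has to retain them because unseen code might call the exported `Util.format`.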
This is especially relevant to build time initialization as the closed
world assumption means

  - great swathes of public static state and its initialization code can
    be squeezed out of the final image

  - great swathes of the static state that is included can be treated as
    effectively final constants

  - multiple occurrences of those final constants which represent
    independent, effectively static fields can be coalesced to a single,
    read-only datum in the initial heap

GraalVM performs exactly those optimizations as part of its build time
initialization and this provides a significant reduction in (initial)
heap size and application startup. Note that it is not just JDK data
that this applies to either. The closed world analysis enables this
optimization to be performed across the whole application code base.

Now that does not mean an initial image for use in an open world cannot
perform the optimization described above on a partial basis. It simply
has to limit itself to parts of the code base that are not accessible
to/extensible by later loaded code. So, for example, an analysis of an
individual JDK module might be able to identify some non-exported subset
of the module's code elements which effectively constitute a closed
world and optimize an initial image accordingly. A global analysis of
multiple modules that considered all possible client uses of exported
code might be able to take these optimizations further.

Clearly, this approach does not immediately stretch to 'whole program',
global optimization, as performed by Graal. It would only do so for a
fully modularized application. However, it may still give significant
improvements to JDK runtime startup and footprint -- hence benefit all
apps running on the JVM.
Note also that I said 'immediately' because I am also hopeful that this
approach might be combined with some change to the programming model in
order to allow a move towards a broader analysis that includes library
and application code, essentially moving nearer toward the closed world
analysis and attendant optimizations employed by GraalVM.

regards,

Andrew Dinn
-----------

From volker.simonis at gmail.com Tue May 31 14:28:50 2022
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 31 May 2022 16:28:50 +0200
Subject: Project Leyden: Beginnings
In-Reply-To: <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>
References: <20220520144202.C213853774D@eggemoggin.niobe.net> <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>
Message-ID:

On Sat, May 28, 2022 at 6:47 PM Christian Wimmer wrote:
>
> Hi Andrew,
>
> Since you mentioned GraalVM: note that the GraalVM team at Oracle was
> advised to not talk about Native Image on this mailing list. Only when
> concrete questions arise, we will be happy to explain how things are
> handled in Native Image.
>

Sorry, but this seems really weird to me. With OpenJDK and GraalVM
Oracle is running two major open source projects and now Oracle forbids
its own employees working on one of the projects to communicate with the
other ones? Is Oracle afraid of its own Terms of Use [1] or am I missing
something obvious?

From my understanding, Oracle employees working on the OpenJDK have
started project Leyden to introduce "a concept of _static images_ to the
Java Platform" [2] and "take inspiration from past efforts to explore
this" like "the Native Image feature of GraalVM" [2]. And then it took
some other Oracle employees another two years just to find out that they
don't want their first open source project to get too much inspired by
the second one?

I'm puzzled...
[1] https://openjdk.java.net/legal/tou/terms
[2] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html

> -Christian
>
>
> On 5/27/22 06:33, Andrew Haley wrote:
> > On 5/20/22 15:42, mark.reinhold at oracle.com wrote:
> >
> > > Let us begin!
> >
> > As you'd expect, here at Red Hat there's a variety of opinions.
> > Rather than simply post my own response to this, I've been talking to
> > Middleware architects (the likely _users_ of Leyden!) as well as our
> > OpenJDK team members. Here's what we think:
> >
> > We're excited to see Leyden taking shape and will be active
> > participants. Our customers are benefiting from GraalVM today and as
> > such we'll continue to engage with that project as the Leyden ideas
> > are explored and take shape.
> >
> > Bringing standardization to this space is important for developers
> > as it will clarify the behaviours they can depend on. As we engage
> > in updating the standard, we should consider not just new behavior,
> > but also exceptions and variations that can accommodate some of the
> > existing behavior of GraalVM such as build-time initialization.
> >
> > The proposed incremental approach will ensure we bring along the
> > current ecosystem and devtools while carefully introducing any new
> > constraints. We also see the benefit in segmenting the problem space
> > into discrete areas that can be introduced sooner rather than
> > waiting for a big-bang integration of multiple constraints.
> >
> > > In the long run we will likely embrace the full closed-world
> > > constraint in order to produce fully-static images.
> >
> > Our experience with Java on K8s and containers (notably Quarkus) has
> > demonstrated real world benefits of a closed-world approach, so it's
> > good to see it explicitly listed as a likely goal in the long
> > run. It will be important that Leyden is careful to specify its
> > efforts in the fast start / small footprint space while being
> > mindful of that constraint.
Hopefully, we can all work towards a
> > future that converges both GraalVM's efforts and those of Leyden.
> >

From tanksherman27 at gmail.com Tue May 31 15:32:06 2022
From: tanksherman27 at gmail.com (Julian Waters)
Date: Tue, 31 May 2022 23:32:06 +0800
Subject: Project Leyden: Beginnings
In-Reply-To:
References: <20220520144202.C213853774D@eggemoggin.niobe.net> <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>
Message-ID:

To my knowledge Leyden had always planned to have a different
implementation than Graal when it came to native binaries, instead of
just merging code from Graal into HotSpot. The former is envisioned to
be entirely native to the official JVM and only working through
enhancing HotSpot itself for code generation (Andrew mentioned
discussions of it being a reference implementation of AOT Java
binaries), while the latter focuses more on "Java on Java". My guess is
the advisory is so that too much of Graal's design and code doesn't seep
into Leyden. Inspiration from Graal is helpful; turning Leyden into
Graal itself wouldn't be (hence why it's only advised and not outright
forbidden).

best regards,
Julian

On Tue, May 31, 2022 at 10:29 PM Volker Simonis wrote:
> On Sat, May 28, 2022 at 6:47 PM Christian Wimmer
> wrote:
> >
> > Hi Andrew,
> >
> > Since you mentioned GraalVM: note that the GraalVM team at Oracle was
> > advised to not talk about Native Image on this mailing list. Only when
> > concrete questions arise, we will be happy to explain how things are
> > handled in Native Image.
> >
>
> Sorry, but this seems really weird to me. With OpenJDK and GraalVM
> Oracle is running two major open source projects and now Oracle
> forbids its own employees working on one of the projects to communicate
> with the other ones? Is Oracle afraid of its own Terms of Use [1] or
> am I missing something obvious?
> From my understanding, Oracle
> employees working on the OpenJDK have started project Leyden to
> introduce "a concept of _static images_ to the Java Platform" [2] and
> "take inspiration from past efforts to explore this" like "the Native
> Image feature of GraalVM" [2]. And then it took some other Oracle
> employees another two years just to find out that they don't want its
> first open source project to get too much inspired by the second one?
>
> I'm puzzled...
>
> [1] https://openjdk.java.net/legal/tou/terms
> [2] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html
>
> > -Christian
> >
> >
> > On 5/27/22 06:33, Andrew Haley wrote:
> > > On 5/20/22 15:42, mark.reinhold at oracle.com wrote:
> > >
> > > > Let us begin!
> > >
> > > As you'd expect, here at Red Hat there's a variety of opinions.
> > > Rather than simply post my own response to this, I've been talking to
> > > Middleware architects (the likely _users_ of Leyden!) as well as our
> > > OpenJDK team members. Here's what we think:
> > >
> > > We're excited to see Leyden taking shape and will be active
> > > participants. Our customers are benefiting from GraalVM today and as
> > > such we'll continue to engage with that project as the Leyden ideas
> > > are explored and take shape.
> > >
> > > Bringing standardization to this space is important for developers
> > > as it will clarify the behaviours they can depend on. As we engage
> > > in updating the standard, we should consider not just new behavior,
> > > but also exceptions and variations that can accommodate some of the
> > > existing behavior of GraalVM such as build-time initialization.
> > >
> > > The proposed incremental approach will ensure we bring along the
> > > current ecosystem and devtools while carefully introducing any new
> > > constraints.
> > > We also see the benefit in segmenting the problem space
> > > into discrete areas that can be introduced sooner rather than
> > > waiting for a big-bang integration of multiple constraints.
> > >
> > > > In the long run we will likely embrace the full closed-world
> > > > constraint in order to produce fully-static images.
> > >
> > > Our experience with Java on K8s and containers (notably Quarkus) has
> > > demonstrated real world benefits of a closed-world approach, so it's
> > > good to see it explicitly listed as a likely goal in the long
> > > run. It will be important that Leyden is careful to specify its
> > > efforts in the fast start / small footprint space while being
> > > mindful of that constraint. Hopefully, we can all work towards a
> > > future that converges both GraalVM's efforts and those of Leyden.
> > >
>

From heidinga at redhat.com  Tue May 31 15:50:30 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Tue, 31 May 2022 11:50:30 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To:
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>
Message-ID:

On Fri, May 27, 2022 at 7:53 AM Kasper Nielsen wrote:
>
> Hi David,
>
> Thanks for the write-up.
>
> One thing that isn't completely clear to me after reading this is why
> language
> changes () are needed?

The model was a convenient way for us to explore a model that
put all class initialization at build time, while allowing a small set
of fields to be reinitialized at runtime. It also minimized the
changes we had to make to the core JDK classes which makes maintaining
the changes much easier given the rate of JDK updates. SubstrateVM
uses a similar approach with their Substitutions for what I assume are
similar reasons.

Leyden will be able to update the JDK core classes directly and can
take a more direct approach to indicating in which phase a static
field should be initialized.
> It seems to me this could be entirely
> implemented via a standard API. Using ClassValue as the main inspiration you
> could have something like:
>
> abstract class RuntimeLocal<T> {
>     protected RuntimeLocal() {
>         checkBuildTime();
>         VM.registerForRuntimeInitialization(this);
>     }
>     protected abstract T computeValue();
>     public final T get(); // Calls to get are optimized by the vm
> }
>
>
> Usage would be something similar to:
>
> class Usage {
>
>     static final LocalDateTime BUILD_TIME = LocalDateTime.now();
>
>     static final RuntimeLocal<LocalDateTime> RUNTIME_TIME = new
>     RuntimeLocal<>() {
>         protected LocalDateTime computeValue() {
>             return LocalDateTime.now();
>         }
>     };
> }
>
> I might be missing some details, but it seems to me that this approach would
> be strongly favorable to needing to change the language as well as adding
> new bytecodes.

This is a good starting point. I went a fair ways looking at how to
group static fields into different classes to decouple their lifetimes
and found that I couldn't cleanly split them into two groups. I used
the Initialization on demand holder pattern (IODH) rather than your
RuntimeLocal but the idea is very similar.

The problem is that while it's clear that some fields can be
initialized early (build time) and others must be initialized late
(runtime), there is a third group that needs to be reinitialized. I
list 3 buckets: early, late, and reinit, but that's a minimum number.
There may be more than 3. And due to the "soupy" nature of ,
it's not always easy to avoid depending on a field that's in a
different bucket. And values in that 3rd bucket - the fields that
need to be reinitialized - don't have a clear meaning when their value
propagates around the program. Does it need to be cleared everywhere
and force reinit of all consumers? Lots to figure out here.
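
For readers who have not met the initialization-on-demand holder idiom mentioned above, here is a minimal sketch. All names in it (`InitLog`, `Config`, `Late`, `HolderDemo`) are hypothetical and come from neither qbicc nor the JDK. It only shows how a nested holder class defers one static field until first use; it does not model build-time versus runtime phases, and it offers no answer for the "reinit" bucket.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; an illustration only, not qbicc or JDK code.
final class InitLog {
    // Records the order in which the static initializers below actually run.
    static final List<String> EVENTS = new ArrayList<>();
}

final class Config {
    // Initialized when Config itself is initialized.
    static final String EARLY = record("early");

    // Holder class: the JVM initializes it only on first use of Late.VALUE.
    private static final class Late {
        static final String VALUE = record("late");
    }

    static String late() {
        return Late.VALUE; // the first call triggers Late's static initializer
    }

    private static String record(String name) {
        InitLog.EVENTS.add(name);
        return name;
    }
}

final class HolderDemo {
    public static void main(String[] args) {
        String early = Config.EARLY;          // initializes Config, not Late
        System.out.println(InitLog.EVENTS);   // [early]
        Config.late();                        // now Late is initialized
        System.out.println(InitLog.EVENTS);   // [early, late]
    }
}
```

The holder splits the fields of one class into two lifetimes, which is the two-group split described above. A field that must be computed once at build time and again at run time fits neither the outer class nor the holder.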

We need a better model - whether that's library features or new
language features - that makes it easier to express when (which phase)
an operation should occur and some way to talk about the dependency
chain of that value (all the classes that have to be initialized,
values calculated, etc).

--Dan

>
> /Kasper
>
> On Thu, 26 May 2022 at 21:22, David P Grove wrote:
>
> > Hi,
> > I've appended the contents of the referenced wiki page in this email.
> > Apologies in advance if the formatting doesn't come through as intended.
> >
> > There is a full implementation of this (GPLv2 + Classpath
> > exception) as part of the qbicc project on GitHub. There is also a GitHub
> > discussion in the qbicc project that links to various GitHub issues that
> > capture the history that led to the current design. I will not hyperlink
> > to those here so that if people have any IP concerns, they can avoid seeing
> > them. They are easily findable.
> >
> > Regards,
> >
> > --dave
> >
> >
>

From adinn at redhat.com  Tue May 31 15:51:35 2022
From: adinn at redhat.com (Andrew Dinn)
Date: Tue, 31 May 2022 16:51:35 +0100
Subject: Project Leyden: Beginnings
In-Reply-To:
References: <20220520144202.C213853774D@eggemoggin.niobe.net>
 <0e5128c8-6db4-effb-437f-0086267387ec@oracle.com>
Message-ID: <7fc8ee04-d19a-c6e6-ab62-dfb3a6834609@redhat.com>

Hi Volker,

On 31/05/2022 15:28, Volker Simonis wrote:
> On Sat, May 28, 2022 at 6:47 PM Christian Wimmer
>> Since you mentioned GraalVM: note that the GraalVM team at Oracle was
>> advised to not talk about Native Image on this mailing list. Only when
>> concrete questions arise, we will be happy to explain how things are
>> handled in Native Image.
>>
>
> Sorry, but this seems really weird to me. With OpenJDK and GraalVM
> Oracle is running two major open source projects and now Oracle
> forbids its own employees working on one of the project to communicate
> with the other ones?
> Is Oracle afraid of its own Terms of Use [1] or
> am I missing something obvious? From my understanding, Oracle
> employees working on the OpenJDK have started project Leyden to
> introduce "a concept of _static images_ to the Java Platform" [2] and
> "take inspiration from past efforts to explore this" like "the Native
> Image feature of GraalVM" [2]. And then it took some other Oracle
> employees another two years just to find out that they don't want its
> first open source project to get too much inspired by the second one?
>
> I'm puzzled...
>
> [1] https://openjdk.java.net/legal/tou/terms
> [2] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html

I am not sure how you arrived at the conclusion that "Oracle forbids
its own employees working on one of the project to communicate with the
other ones". That's particularly weird since I know that the two teams
have been working together for a long time -- for example to ensure
GraalVM is ready to implement Loom.

I took Christian's first statement merely to indicate that this would
not be the correct forum for general discussion of GraalVM Native.
That's not surprising. This project is primarily about arriving at a
satisfactory model for static Java. That may ultimately lead to some
revisions in how GraalVM Native operates and -- we hope - some
revisions to the language or runtime that might simplify the life of
GraalVM developers. It may also need to consider some of the things
that were done or are being done in the GraalVM Native project.
However, mostly what is discussed here will have no direct bearing on
how the GraalVM Native implementation proceeds. The GraalVM project has
its own forums for those discussions and that is where they should
proceed.

If you think that first statement means anything more than that then I
recommend you read what he said in follow up to that. Christian
explicitly stated that the GraalVM team will be "happy to talk about
how things are handled in Native Image ...
when concrete questions arise". That looks to me to be a fairly
unambiguous and generous offer to provide advice, based on the
experience garnered by the GraalVM team, where it applies to problems
Leyden needs to address.

I must say I find the rest of your post to be equally as marred by
speculation and misreading of the prior and current situation. However,
that's not my main takeaway; most of all I find what you said to be
irrelevant. I would much prefer to drop all speculation about what
might or might not be going on inside Oracle and instead get on with
looking into how we can move this project forward. Oracle staff from
both the OpenJDK and GraalVM projects have stated that they are willing
to provide input to this project. Can we not simply thank them for that
and work with them?

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

From brian.goetz at oracle.com  Tue May 31 16:17:05 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 31 May 2022 12:17:05 -0400
Subject: Experimentation with build time and runtime class initialization in qbicc
In-Reply-To:
References: <0EE27016-2D6A-46A8-825A-1AFF788A5C67@us.ibm.com>
Message-ID:

I think Dan is homing in on one of the key questions, which is the
nature of the third bucket (static finals that require
reinitialization.) It would be useful for everyone following the
discussion if we had a more complete list of situations you've
encountered where this seems essential, and their notable aspects.

As you point out, there are a host of potential "solutions"; while it
is surely premature to try to propose a solution, it is never too early
to come to a better understanding of the problem.

On 5/31/2022 11:50 AM, Dan Heidinga wrote:
> On Fri, May 27, 2022 at 7:53 AM Kasper Nielsen wrote:
>> Hi David,
>>
>> Thanks for the write-up.
>>
>> One thing that isn't completely clear to me after reading this is why
>> language
>> changes () are needed?
> The model was a convenient way for us to explore a model that
> put all class initialization at build time, while allowing a small set
> of fields to be reinitialized at runtime. It also minimized the
> changes we had to make to the core JDK classes which makes maintaining
> the changes much easier given the rate of JDK updates. SubstrateVM
> uses a similar approach with their Substitutions for what I assume are
> similar reasons.
>
> Leyden will be able to update the JDK core classes directly and can
> take a more direct approach to indicating in which phase a static
> field should be initialized.
>
>> It seems to me this could be entirely
>> implemented via a standard API. Using ClassValue as the main inspiration you
>> could have something like:
>>
>> abstract class RuntimeLocal<T> {
>>     protected RuntimeLocal() {
>>         checkBuildTime();
>>         VM.registerForRuntimeInitialization(this);
>>     }
>>     protected abstract T computeValue();
>>     public final T get(); // Calls to get are optimized by the vm
>> }
>>
>>
>> Usage would be something similar to:
>>
>> class Usage {
>>
>>     static final LocalDateTime BUILD_TIME = LocalDateTime.now();
>>
>>     static final RuntimeLocal<LocalDateTime> RUNTIME_TIME = new
>>     RuntimeLocal<>() {
>>         protected LocalDateTime computeValue() {
>>             return LocalDateTime.now();
>>         }
>>     };
>> }
>>
>> I might be missing some details, but it seems to me that this approach would
>> be strongly favorable to needing to change the language as well as adding
>> new bytecodes.
> This is a good starting point. I went a fair ways looking at how to
> group static fields into different classes to decouple their lifetimes
> and found that I couldn't cleanly split them into two groups. I used
> the Initialization on demand holder pattern (IODH) rather than your
> RuntimeLocal but the idea is very similar.
>
> The problem is that while it's clear that some fields can be
> initialized early (build time) and others must be initialized late
> (runtime), there is a third group that needs to be reinitialized. I
> list 3 buckets: early, late, and reinit, but that's a minimum number.
> There may be more than 3. And due to the "soupy" nature of ,
> it's not always easy to avoid depending on a field that's in a
> different bucket. And values in that 3rd bucket - the fields that
> need to be reinitialized - don't have a clear meaning when their value
> propagates around the program. Does it need to be cleared everywhere
> and force reinit of all consumers? Lots to figure out here.
>
> We need a better model - whether that's library features or new
> language features - that makes it easier to express when (which phase)
> an operation should occur and some way to talk about the dependency
> chain of that value (all the classes that have to be initialized,
> values calculated, etc).
>
> --Dan
>
>> /Kasper
>>
>> On Thu, 26 May 2022 at 21:22, David P Grove wrote:
>>
>>> Hi,
>>> I've appended the contents of the referenced wiki page in this email.
>>> Apologies in advance if the formatting doesn't come through as intended.
>>>
>>> There is a full implementation of this (GPLv2 + Classpath
>>> exception) as part of the qbicc project on GitHub. There is also a GitHub
>>> discussion in the qbicc project that links to various GitHub issues that
>>> capture the history that led to the current design. I will not hyperlink
>>> to those here so that if people have any IP concerns, they can avoid seeing
>>> them. They are easily findable.
>>>
>>> Regards,
>>>
>>> --dave
>>>
>>>

From ioi.lam at oracle.com  Tue May 31 21:21:33 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 31 May 2022 14:21:33 -0700
Subject: Improve determinism in the Java language
Message-ID:

A lot of the recent Leyden discussion has been around "what
optimizations can be done ahead of time" (e.g., static field
initialization). However, I think we also need to look at a lower
level.

One reason that Java has been difficult to optimize ahead-of-time is
the tremendous dynamism in the language. Here are a few things that I
think we can do to make Java programs more deterministic so that ahead
of time optimizations can be applied:

1 Deterministic Program Code

  A Java program can essentially rewrite itself and even
  the libraries it uses. Here's an example:

  class App {
      static {
          if (...) {
              MethodHandles.lookup()
                  .defineClass(.. hacked App$Bar ...);
          }
      }
      static final Bar bar = new Bar();
      static class Bar {
          ....
      }
  }

  - We can't effectively AOT-compile the program code because
    the native code may not match the runtime generated
    bytecodes.
  - We can't pre-initialize the App.bar field because its shape
    may be different.

  One idea is to disallow such code patching when Leyden is enabled.
  For example, we can require that to use Leyden, an application
  must be "prelinked", which means that as soon as the application
  is loaded, the classes App and App$Bar are already loaded. The
  defineClass() call will fail with a LinkageError (duplicated class
  definition).

2 Decouple class namespaces from dynamic bytecode generation

  This is a corollary of the above item. Java uses
  ClassLoader.defineClass() for BOTH namespace and dynamic
  bytecode generation. I would stipulate that most users
  of Leyden want to do the former and not the latter.

  We should have a new API to load a fixed set of classes
  into a namespace.

3 order

  Java allows s that recursively depend on each other. The
  result depends on the reference order of these classes.

  class A {  static final int a = B.b++; }
  class B {  static final int b = A.a++; }

  We could have a problem if the application assumes that A is
  always initialized before B, but the Leyden optimizer
  initializes them in the opposite order.

  We could:

  - Refuse to optimize classes that have mutually recursive
    , or
  - Change the language spec to give the JVM more freedom to
    decide the initialization order.

From ioi.lam at oracle.com  Tue May 31 21:42:08 2022
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 31 May 2022 14:42:08 -0700
Subject: Can Ahead of Time code benefit regular Java applications too?
In-Reply-To:
References:
Message-ID: <9f70a2d5-5cb1-e615-b76b-957f95ac9928@oracle.com>

On 5/30/2022 6:07 AM, Julian Waters wrote:
> Hi all,
>
> Since Leyden's goal has shifted from originally exploring only binaries
> compiled directly to native code, to "address the long-term pain points of
> Java's slow startup time, slow time to peak performance, and large
> footprint", would there be any merit in looking at allowing native code to
> be embedded within jars to bypass the Interpreter at runtime? (Maybe have
> Ahead of Time code that replaces the Interpreter be compiled by C1, and
> treat it as part of the C1 pipeline so it can be profiled while being run)
> Ideally it'd be similar to the now defunct jaotc, but more compact (within
> the jar itself or perhaps the classfiles somehow) instead of compiling the
> Ahead of Time code into an entirely separate file which then needs to be
> explicitly passed to the JVM at runtime. This may or may not be a good
> starting point before advancing to entirely standalone Java binaries, but I
> digress. Perhaps the experience of the CRaC team would be of some help in
> this area?
>
> best regards,
> Julian

What kind of interface and dependency between the JVM and the native
code would be needed to support this?

As far as I can tell, the Leyden discussions have been about producing
artifacts (native code or heap dumps) that are tightly bound to a
specific build of the JDK. If you want a (version agnostic) JAR file to
contain native code that can be used by arbitrary JDKs, that would
raise the complexity quite significantly.

Thanks
- Ioi
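
The initialization-order dependence raised as item 3 of "Improve determinism in the Java language" earlier in this thread can be reproduced with a small sketch. The class names here are hypothetical. Unlike the `A`/`B` snippet in that message, this variant uses non-final fields and plain reads so that it compiles (applying `++` to a `final` field is rejected by javac). Two pairs of classes are used so that both orders can be observed in a single JVM run.

```java
// Hypothetical classes illustrating mutually recursive static initializers.
final class OrderDemo {
    static final class A1 { static int a = B1.b + 1; }
    static final class B1 { static int b = A1.a + 1; }

    static final class A2 { static int a = B2.b + 1; }
    static final class B2 { static int b = A2.a + 1; }

    // Touches A1 first: A1's initializer triggers B1, which still sees A1.a == 0.
    static int[] aFirst() {
        int a = A1.a;
        int b = B1.b;
        return new int[] {a, b};
    }

    // Touches B2 first: B2's initializer triggers A2, which still sees B2.b == 0.
    static int[] bFirst() {
        int b = B2.b;
        int a = A2.a;
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int[] x = aFirst();
        int[] y = bFirst();
        System.out.println("A first: a=" + x[0] + ", b=" + x[1]); // a=2, b=1
        System.out.println("B first: a=" + y[0] + ", b=" + y[1]); // a=1, b=2
    }
}
```

Whichever class is referenced first ends up with the larger value, because the class whose initialization is already in progress is observed with its field still at the default of 0. An ahead-of-time initializer that picked the other order would therefore bake in different values.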