From mark.reinhold at oracle.com  Thu Dec  1 22:35:16 2022
From: mark.reinhold at oracle.com (Mark Reinhold)
Date: Thu, 1 Dec 2022 22:35:16 +0000
Subject: Selectively Shifting and Constraining Computation
In-Reply-To: 
References: <20221013105630.787752083@eggemoggin.niobe.net>
 <20221027172839.945357016@eggemoggin.niobe.net>
Message-ID: <20221201143515.194094123@eggemoggin.niobe.net>

2022/11/7 7:57:50 -0800, Dan Heidinga:
> On Thu, Oct 27, 2022 at 8:28 PM mark.reinhold at oracle.com wrote:
>> 2022/10/21 19:49:36 -0700, Dan Heidinga:
>>> Thanks for posting this document. It provides a clear starting point for
>>> us to move forward to start to address Leyden.
>>
>> Thanks, Dan, and thanks for your comments and questions.
>>
>>> A few questions based on my read through:
>>>
>>>> As long as a condenser preserves program meaning and does not impose
>>>> constraints other than those accepted by the developer, an implementation
>>>> has wide latitude to optimize the result.
>>>
>>> Does "an implementation" refer only to the JDK or are other parts of the
>>> ecosystem - frameworks, libraries, etc - able to add their own condensers
>>> as well?
>>
>> Good question!
>>
>> We're concerned here primarily with any software system that's claimed
>> to be a conformant implementation of the Java Platform -- the JDK is
>> merely the dominant and best-known example.  The components of such a
>> system -- including any condensers -- can be held to the rigors of the
>> Platform Specification and the JCK, thereby ensuring that they preserve
>> the meanings of Java programs.
>>
>> Some people will, no doubt, want to write their own condensers, and
>> obviously we'd like all condensers to preserve program meaning.  The
>> current conformance regime (specification/RI/JCK), however, applies only
>> to entire Platform implementations.
>
> Interesting. I'm hesitating to raise this as I don't want to derail the
> technical topics, but I recall that after jlink was released, it was
> necessary to update the definition of the "platform" to handle the subset
> runtimes created by jlink.

We updated the definition of the Platform Implementation concept to allow
conforming module subsets [1] prior to releasing jlink in JDK 9, not after
the release.

> With the kinds of changes envisioned for
> condensers - including removing unused classes, fields & methods - will we
> need to update the definition of the platform again?

Almost certainly.

> Updates of this
> nature tend to take a while on the legal front so I'm raising this early
> (and hoping it's already being thought about) so the project doesn't end up
> with condensers that can't be shipped due to conflicts with the platform
> definition.

We'll evolve the Platform Specification to admit condensers, and whatever
opt-in constraints they might impose, prior to shipping any actual
condensers.  As usual, moreover, we'll evolve the Specification in parallel
with the implementation, so that the work on each is informed by the work
on the other.  (I know I don't need to explain that to you, but some people
assume that these are non-concurrent efforts, with one driving the other.)

>>>> ...
>>>>
>>>> Condensation is meaning preserving.
>>>
>>> Have you given thought to precisely specifying and then validating this
>>> property? It's definitely something we want but it may be challenging to
>>> enforce ...
>>
>> The meaning of a Java program is already given by the Specification, and
>> the JCK already validates a Java implementation by running a large
>> collection of Specification-derived tests.  My initial thought here is
>> to extend the JCK to test a condenser by applying the condenser to each
>> JCK test and then verifying that the condensed test still passes.  (To
>> keep JCK run times tractable, do this in large batches.)
>
> That's a great starting point. I hadn't considered using the JCKs in that
> way.
>
> The devil will be in the details as there will be conflicts between the
> condenser's changes (make this module unmodifiable) and the test's
> expectations (redefine class X).

Yes.  Many existing tests will likely need adjustment.

> This approach also feels a bit weighted
> toward the end - a condenser can be developed, merged, and only later be
> discovered to conflict with the JCKs. I've been trying to think of ways to
> bring this forward in the development process and to make it more open but
> the answer may be as it is today - lots of jtregs, clearly written specs,
> and the final check is done by the JCKs.

Agreed.

>>>> ...
>>>>
>>>> Eliminate apparently-unused classes and class members (i.e., stripping, which
>>>> requires the constraint that selected classes and members cannot be reflected
>>>> upon)
>>>
>>> Being slightly pedantic, it means the removed members can't be reflected
>>> on, correct?
>>
>> Yes, but ...
>>
>>> A class that's been stripped of unused methods / fields can
>>> still have its remaining members reflectively looked up is my assumption.
>>
>> That's a reasonable assumption.
>>
>> In detail, though, we could in principle retain reflective metadata even
>> for removed program elements, if there's a need to do so.  We could,
>> likewise, remove reflective metadata for some retained program elements
>> if we're confident that those elements will never be reflected upon.
>> More exploration required ...
>
> That makes sense. If I'm following that correctly, we can mark some
> classes as "reflection hostile" and others as supporting reflection of all
> members (even the removed ones). What I'm really hearing is a willingness
> to completely change the model so users can make different tradeoffs.

Exactly.

>>> Using jlink is the natural starting point for this work. It would be
>>> unfortunate if condensers were limited to only work on modules given the
>>> (unfortunately) slow update of them. Is the intention to extend jlink to
>>> work with classpath entities (ie: non-modular code) as well? We foresee
>>> the need for condensers to be able to "condense" non-modules as well.
>>
>> The jlink tool doesn't support arbitrary class path code, or automatic
>> modules, in order to guarantee that the resulting image is structurally
>> correct by construction.  I'm reluctant to give up on that guarantee.
>> Viable workarounds exist for non-modular code (jdeps, ModiTect, etc.),
>> so developers who want to risk building a potentially-broken image from
>> a potentially-broken class path can do so, at their option.
>
> Libraries are always the challenge here. Modularizing before your
> dependencies is difficult and while Leyden may help drive modularization in
> the ecosystem (a plus!), I hope we can take a pragmatic approach and bring
> class path code into the Leyden fold. As you said above, lots to explore
> here but I ask that we add an explicit goal to bring class path code in as
> much as we can.
If all condensers must support the class path then none will be able to
leverage the constraints imposed by modules -- and as I said, I suspect
that there are some interesting ways to leverage those constraints.

It might make sense to extend jlink to copy the entire content of the class
path (modulo duplicates) into a custom run-time image, but I don't think we
should require every condenser to support the class path.

- Mark


[1] https://cr.openjdk.java.net/~iris/se/9/java-se-9-fr-spec/#Constraints-on-all-modules-in-an-Implementation

From heidinga at redhat.com  Wed Dec  7 15:17:53 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Wed, 7 Dec 2022 10:17:53 -0500
Subject: Class init progression discussion
Message-ID: 

Previously, we at Red Hat had done some investigation into static final
fields and how existing classes could be adapted to enable their statics
to be initialized at build time.  I'd like to replay some of that
investigation here, both to share our findings and also to gather
different perspectives on the problem space.

Our work had been very focused on moving work from runtime to build time
and on legitimizing build time initialization - recognizing two phases:
build & run time.  With Leyden's approach to shifting constraints there
may be different outcomes from these kinds of investigations, especially
in light of the recognition that there may be many phases beyond just
build and run times.

I'd like to break the document up into a series of threads on each topic
as that often leads to more focused discussion.  The full document is also
available at [0].

Let's start off with some background on class initialization and a key
question: why don't users group their static fields together in ways that
give them different lifetimes?  We have tools to do this today, like the
IODH pattern, so why aren't they used more broadly?

Key question: As we look at shifting work from runtime to build time, we
should understand what drives user behaviour today.  What tools are
available for controlling lifecycles of static fields and how popular are
those tools?  Broadly used or narrowly?  Why?

Background: Class initialization is a bit of a magic process from the
perspective of users.  Users typically believe it happens when the class
is loaded rather than understanding the actual triggers specified in the
JVMS { new, get/putstatic, invokestatic, subclass init, reflection }.

Class init is also difficult to reason about because it takes all the
static field initializations (apart from those done using the
ConstantValue attribute) and puts them together into a single "soupy"
<clinit> method.  This method is "soupy" in that it mixes all the
interactions and operations - even those expressed separately in the
source - into a single method that allows unbounded side effects.

To make matters even less clear, the language and VM have slightly
different rules for how static final fields are written: at the language
level, the compiler enforces that static finals are only written to once,
but the VM allows them to be written as many times as needed if done
within the <clinit> method (this used to be: if done from within the
defining class).

All this results in the life cycle of every static field in a class being
conjoined, even if they would be better handled in different life cycles
(a small sketch of this folding follows below).
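To make that folding concrete, here is a minimal sketch (the class and
field names are made up for illustration): two static finals that look
independent in the source end up sharing one lifecycle because javac
compiles both initializers, in textual order, into a single <clinit>.

```java
class Config {
    // Neither initializer is a compile-time constant expression, so neither
    // field gets a ConstantValue attribute; both initializers are folded,
    // in textual order, into Config.<clinit>.
    static final int TIMEOUT = Integer.getInteger("app.timeout", 30);
    static final java.util.List<String> NAMES = java.util.List.of("a", "b");
}
// The first access to either field (a getstatic) triggers Config.<clinit>,
// which runs *both* initializers: their lifecycles are conjoined.
```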
There is:
* One initialization point
* One method that is called to write the fields
* One duration for all of them (the life of the class or, equivalently,
  its classloader)
* One thread that executes the method (good for exactly-once semantics)

There is a common case where users recognize these conditions and use them
to produce obviously correct code: the initialization-on-demand holder
idiom.  This is used to create singletons that ensure there is only a
single instance of the class, and occasionally used to defer expensive
operations until first use (lazy init).

--Dan

[0] http://cr.openjdk.java.net/~heidinga/leyden/ClassInitPlan.pdf

From heidinga at redhat.com  Wed Dec  7 15:52:33 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Wed, 7 Dec 2022 10:52:33 -0500
Subject: A class per static field? Why or why not?
Message-ID: 

Continuing on the Class init progression discussion....

Why don't we put every static field in its own class?

The obvious answer is that it's too much mental load for developers.  But
if we put that aside for a moment, and assume that we have infinitely
smart developers, it might be useful to understand why we don't program
like this now.  Or what programming like this might actually look like.

Putting every static field in its own class trivially gives us lazy static
fields (sorry John, no new JEP required in this world) with each static
only being initialized when actually accessed.

It gives each static field a clear initialization point where we can more
easily tell what caused a particular static to be initialized.

It makes it easier to determine the true dependency graph between static
fields rather than today's "soupy" model.

It doesn't solve the "soupy" problem as developers can still do arbitrary
things in the <clinit>, but it does reduce the problem as it moves a lot
of code out of the common <clinit> as each static now has its own
<clinit>.  Does this make analysis more tractable?

In our investigation [0], we focused on the underlying JVM physics of
classes and looked at the memory use of this approach, which was estimated
to average out to under 1K per class.

What do other languages do with their equivalent of static state?  Are
there different design points for expressing static state we should be
investigating to better enable shifting computation to different points in
time?

--Dan

[0] http://cr.openjdk.java.net/~heidinga/leyden/ClassInitPlan.pdf

From brian.goetz at oracle.com  Wed Dec  7 16:39:51 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Dec 2022 11:39:51 -0500
Subject: A class per static field? Why or why not?
In-Reply-To: 
References: 
Message-ID: <8c4f2d5d-11a4-6619-fa5f-f2ee2e29e68d@oracle.com>

Thanks, Dan, for sharing the investigation and for asking the right
questions.  A few comments inline.

On 12/7/2022 10:52 AM, Dan Heidinga wrote:
> Continuing on the Class init progression discussion....
>
> Why don't we put every static field in its own class?

Pedantic correction: we're only talking about static finals with
initializers.  Mutable statics have arbitrarily complicated initialization
lifecycles, and that's just how it is; static finals that are initialized
in `static { }` blocks already have their lifecycle complected with other
writes in those blocks.

> The obvious answer is that it's too much mental load for developers.
> But if we put that aside for a moment, and assume that we have
> infinitely smart developers, it might be useful to understand why we
> don't program like this now.  Or what programming like this might
> actually look like.
>
> Putting every static field in its own class trivially gives us lazy
> static fields (sorry John, no new JEP required in this world) with
> each static only being initialized when actually accessed.
>
> It gives each static field a clear initialization point where we can
> more easily tell what caused a particular static to be initialized.
>
> It makes it easier to determine the true dependency graph between
> static fields rather than today's "soupy" model.

Some possible reasons (just brainstorming here):

 - It's more code, both at the declaration site (wrap it in a class) and
the use site (qualify it with a class name).  Developers instantly see
this cost, but it may take longer to see the benefit.

 - Perception that this is more heavyweight, since classes are "obviously"
more heavyweight than variables.

 - Thinking about lifecycles is hard.  If the easy thing -- declare a
bunch of statics and initialize them -- works, this is what developers
will do, and are unlikely to revisit it until something doesn't work.

 - More importantly, lifecycle mostly becomes relevant when your code is
used in a bigger system, and at coding time, that's a distant-future
worry.  Like other crosscutting concerns such as concurrency and security,
thinking about deployment / redeployment / startup characteristics is hard
to focus on when you're trying to get your code to work, and it's easy to
forget to go back and think about it after you get your code to work.

So, I think the answer is: people follow the path of least resistance, and
the path of least resistance here leads to someplace "good enough" to get
things working but which sows the seed for long-term technical debt.  The
PoLR today is good enough that people can get to something that mostly
works without thinking very hard.  If we can make the PoLR lead someplace
better, that's what winning will look like.

> It doesn't solve the "soupy" problem as developers can still
> do arbitrary things in the <clinit> but it does reduce the problem as
> it moves a lot of code out of the common <clinit> as each static now
> has its own <clinit>.  Does this make analysis more tractable?

I agree with your (implicit) intuition that if we could get to a world
where we only complected initialization lifecycles rarely, rather than
routinely, then it would be more practical to characterize those as
"weirdo" cases for which the answer is "rewrite/don't use that code if you
want ".  The problem today is that way too much code uses the existing
soupy mechanisms -- but only some smaller fraction of it, which is hard to
identify either by human or automated analysis, implicitly depends on the
initialization-order semantics of the existing mechanisms.

> In our investigation [0], we focused on the underlying JVM physics of
> classes and looked at the memory use of this approach, which was
> estimated to average out to under 1K per class.

Semantics and boilerplate aside, this seems amenable to a "Loomy" move,
which is: "make the expensive thing less expensive, rather than asking
users to resort to complex workarounds."

> What do other languages do with their equivalent of static state?  Are
> there different design points for expressing static state we should be
> investigating to better enable shifting computation to different
> points in time?

One of the things that accidentally makes our lives harder here is that
most other languages do not specify semantics as carefully as Java does,
so the answer is sometimes "whatever the implementation does."  For better
or worse, Java is much more precise at specifying what triggers class
initialization.
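As a concrete reminder of how that precision gets used in practice, here
is a minimal sketch of the initialization-on-demand holder idiom mentioned
earlier (class names are illustrative): laziness and exactly-once
semantics fall directly out of the specified initialization triggers.

```java
class Service {
    private Service() { /* expensive setup happens here */ }

    // Loading Service does NOT initialize Holder; Holder.<clinit> runs only
    // on the first getstatic of INSTANCE, and the JVM guarantees it runs
    // exactly once, even under concurrent access.
    private static class Holder {
        static final Service INSTANCE = new Service();
    }

    static Service instance() {
        return Holder.INSTANCE;
    }
}
```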
One of the things that accidentally makes our lives harder here is that most other languages do not specify semantics as carefully as Java does, so the answer is sometimes "whatever the implementation does."? For better or worse, Java is much more precise at specifying what triggers class initialization. Looking at the most Java-like languages: ?- C# allows members to be declared static, supports field initializers like Java, and supports "static constructors" (similar to `static { }` blocks in Java, but with a constructor-like syntax) which are run at class initialization time.? If a static constructor is present, it does the same soupy thing, where field initializers are run in textual order prior to running the static constructor; if no static constructor is present, the spec is cagey about when static field initializers are run, but they appear to all be run in the textual order: > 14.5.6.2 Static field initialization > The static field variable initializers of a class correspond to a > sequence of assignments that are executed in the textual order in > which they appear in the class declaration (?14.5.6.1). Within a > partial class, the meaning of ?textual order? is specified by > ?14.5.6.1. If a static constructor (?14.12) exists in the class, > execution of the static field initializers occurs immediately prior to > executing that static constructor. Otherwise, the static field > initializers are executed at an implementation-dependent time prior to > the first use of a static field of that class. ?- Scala and Kotlin ditched "static" as a modifier, instead offering "companion objects" (singleton classes).? While the two models are equally expressive, companion objects have us syntactically segregate the static parts of a class into a single entity, and encourage us to think about the static parts as a whole rather than individual members. Kotlin: ??? class X { ??????? companion object { ??????????? // per-class fields and methods here ??????? } ??? } Members of the companion object can be qualified with the class name, or used unqualified, just as in Java. Scala lets you declare something similar as a top level entity: ??? class X { ... } ??? object X { ... } with more complex rules that treat a class and an object with the same name as being two facets of the same entity.? (You can have an object separate from a class; it's just a class whose members are effectively static and which is initialized the first time one of its members is accessed.) The approach of companion objects rather than static members provides a useful nudge to thinking of the static parts of a class as being a single, independent entity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From maurizio.cimadamore at oracle.com Wed Dec 7 16:58:51 2022 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Wed, 7 Dec 2022 16:58:51 +0000 Subject: A class per static field? Why or why not? In-Reply-To: References: Message-ID: On 07/12/2022 15:52, Dan Heidinga wrote: > Why don't we put every static field in its own class? There's a war story to be shared here:-)? - you might want to take a look at what the jextract tool does: https://github.com/openjdk/jextract To generate native bindings, jextract needs _a lot_ of constants, downcall method handles, memory layouts, var handles, ... The first iteration of jextract used static final fields, and startup time was horrible given to the presence of so many field inits. 
The current iteration of jextract uses a class holder for N constants
(with N configurable).  This is a trade-off, to get good startup while
reducing the number of classes generated by the tool.

That said, the number of classes generated by jextract is _still_
astonishingly high (with big libraries you can easily get 200 of them).
While they are only loaded on-demand, users are often scared by this.  For
this reason jextract also provides filtering capabilities: if you only
ever need to interact with 5 native functions, skip generation for
everything else.

This model has been serving us well so far, which confirms your "hunch".

Also, recent "tip" - since we can now have static fields inside methods,
we can also do this:

```java
X computeOnce() {
    class Holder {
        static final X x = ...
    }
    return Holder.x;
}
```

Which seems better than having N unrelated classes scattered around
(although you will see the classfiles for them if you run javac).  We have
not tweaked jextract to do this (yet).

Of course, tools like jextract are a natural "killer application" for lazy
statics.  Lazy statics would allow jextract to generate a lot less code.
And not only would the source code be more compact, but the size of the
generated classfiles would be much, much shorter too.

Cheers
Maurizio

From heidinga at redhat.com  Mon Dec 12 16:56:20 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Mon, 12 Dec 2022 11:56:20 -0500
Subject: A class per static field? Why or why not?
In-Reply-To: <8c4f2d5d-11a4-6619-fa5f-f2ee2e29e68d@oracle.com>
References: <8c4f2d5d-11a4-6619-fa5f-f2ee2e29e68d@oracle.com>
Message-ID: 

On Wed, Dec 7, 2022 at 11:40 AM Brian Goetz wrote:

> Thanks, Dan, for sharing the investigation and for asking the right
> questions.  A few comments inline.
>
> On 12/7/2022 10:52 AM, Dan Heidinga wrote:
> > Continuing on the Class init progression discussion....
> > Why don't we put every static field in its own class?
>
> Pedantic correction: we're only talking about static finals with
> initializers.  Mutable statics have arbitrarily complicated initialization
> lifecycles, and that's just how it is; static finals that are initialized
> in `static { }` blocks already have their lifecycle complected with other
> writes in those blocks.
>
> The obvious answer is that it's too much mental load for developers.  But
> if we put that aside for a moment, and assume that we have infinitely smart
> developers, it might be useful to understand why we don't program like this
> now.  Or what programming like this might actually look like.
>
> Putting every static field in its own class trivially gives us lazy static
> fields (sorry John, no new JEP required in this world) with each static
> only being initialized when actually accessed.
>
> It gives each static field a clear initialization point where we can more
> easily tell what caused a particular static to be initialized.
>
> It makes it easier to determine the true dependency graph between static
> fields rather than today's "soupy" model.
>
> Some possible reasons (just brainstorming here):
>
> - It's more code, both at the declaration site (wrap it in a class) and
> the use site (qualify it with a class name).  Developers instantly see this
> cost, but it may take longer to see the benefit.
> - Perception that this is more heavyweight, since classes are "obviously"
> more heavyweight than variables.
> - Thinking about lifecycles is hard.
> If the easy thing -- declare a
> bunch of statics and initialize them -- works, this is what developers will
> do, and are unlikely to revisit it until something doesn't work.
> - More importantly, lifecycle mostly becomes relevant when your code is
> used in a bigger system, and at coding time, that's a distant-future
> worry.  Like other crosscutting concerns such as concurrency and security,
> thinking about deployment / redeployment / startup characteristics is hard
> to focus on when you're trying to get your code to work, and it's easy to
> forget to go back and think about it after you get your code to work.
>
> So, I think the answer is: people follow the path of least resistance, and
> the path of least resistance here leads to someplace "good enough" to get
> things working but which sows the seed for long-term technical debt.  The
> PoLR today is good enough that people can get to something that mostly
> works without thinking very hard.  If we can make the PoLR lead someplace
> better, that's what winning will look like.
>

+1.  One additional challenge here is that the deployment model affects
the end destination for the path.  Paving the PoLR to make lazy init more
common / easier can result in making earlier init (build time or shifted
early) more difficult.  And vice versa.  The PoLR should ideally lead
developers to say "as early as possible (build time) or as lazy as
possible, I don't care which" so the VM has as much freedom as possible.
Really what they often want to say is "don't affect my startup time with
this operation", but they don't have a good way to express that both early
& late are valid solutions.

> It doesn't solve the "soupy" problem as developers can still do
> arbitrary things in the <clinit> but it does reduce the problem as it moves
> a lot of code out of the common <clinit> as each static now has its own
> <clinit>.  Does this make analysis more tractable?
>
> I agree with your (implicit) intuition that if we could get to a world
> where we only complected initialization lifecycles rarely, rather than
> routinely, then it would be more practical to characterize those as
> "weirdo" cases for which the answer is "rewrite/don't use that code if you
> want ".  The problem today is that way too much code uses the
> existing soupy mechanisms -- but only some smaller fraction of it, which is
> hard to identify either by human or automated analysis, implicitly depends
> on the initialization-order semantics of the existing mechanisms.
>
> In our investigation [0], we focused on the underlying JVM physics of
> classes and looked at the memory use of this approach, which was estimated
> to average out to under 1K per class.
>
> Semantics and boilerplate aside, this seems amenable to a "Loomy" move,
> which is: "make the expensive thing less expensive, rather than asking
> users to resort to complex workarounds."
>
> What do other languages do with their equivalent of static state?  Are
> there different design points for expressing static state we should be
> investigating to better enable shifting computation to different points in
> time?
>
> One of the things that accidentally makes our lives harder here is that
> most other languages do not specify semantics as carefully as Java does, so
> the answer is sometimes "whatever the implementation does."  For better or
> worse, Java is much more precise at specifying what triggers class
> initialization.
>
> Looking at the most Java-like languages:
>
> - C# allows members to be declared static, supports field initializers
> like Java, and supports "static constructors" (similar to `static { }`
> blocks in Java, but with a constructor-like syntax) which are run at class
> initialization time.  If a static constructor is present, it does the same
> soupy thing, where field initializers are run in textual order prior to
> running the static constructor; if no static constructor is present, the
> spec is cagey about when static field initializers are run, but they appear
> to all be run in the textual order:
>
> 14.5.6.2 Static field initialization
> The static field variable initializers of a class correspond to a sequence
> of assignments that are executed in the textual order in which they appear
> in the class declaration (§14.5.6.1). Within a partial class, the meaning
> of "textual order" is specified by §14.5.6.1. If a static constructor
> (§14.12) exists in the class, execution of the static field initializers
> occurs immediately prior to executing that static constructor. Otherwise,
> the static field initializers are executed at an implementation-dependent
> time prior to the first use of a static field of that class.
>
> - Scala and Kotlin ditched "static" as a modifier, instead offering
> "companion objects" (singleton classes).  While the two models are equally
> expressive, companion objects have us syntactically segregate the static
> parts of a class into a single entity, and encourage us to think about the
> static parts as a whole rather than individual members.
>
> Kotlin:
>     class X {
>         companion object {
>             // per-class fields and methods here
>         }
>     }
>
> Members of the companion object can be qualified with the class name, or
> used unqualified, just as in Java.
>
> Scala lets you declare something similar as a top level entity:
>
>     class X { ... }
>     object X { ... }
>
> with more complex rules that treat a class and an object with the same
> name as being two facets of the same entity.  (You can have an object
> separate from a class; it's just a class whose members are effectively
> static and which is initialized the first time one of its members is
> accessed.)
>
> The approach of companion objects rather than static members provides a
> useful nudge to thinking of the static parts of a class as being a single,
> independent entity.
>

Independent entity, yes.  Single, maybe.  We group the statics of a class
into a single <clinit> today but we may want multiple groupings if we can
give them different initialization points (lifecycles).  Maybe that's just
putting them in a different class but whatever we pick here will affect
the PoLR discussed above and "single" may not be the right model given the
classes we already have.

--Dan

From brian.goetz at oracle.com  Mon Dec 12 18:11:16 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Dec 2022 13:11:16 -0500
Subject: A class per static field? Why or why not?
In-Reply-To: 
References: <8c4f2d5d-11a4-6619-fa5f-f2ee2e29e68d@oracle.com>
Message-ID: <383fa7dc-7aba-bbe1-1510-cc63d88caa90@oracle.com>

>> The approach of companion objects rather than static members
>> provides a useful nudge to thinking of the static parts of a class
>> as being a single, independent entity.
>
> Independent entity, yes.  Single, maybe.

Right.
The companion mechanism in Kotlin pushes pretty hard at "single"; the
companion mechanism in Scala is somewhere in the middle, where it uses a
magic name association between a class called X and an object called X,
but you can also have objects whose name is separate from any class and it
can stand as an independent sub-part.  If we went down this road, we would
probably go even farther, where the analogue of `object` would be more
like a general-purpose singleton class which you could freely mix and
match with.  It's not all that different from using IODH today from an
expressiveness perspective, but (like with enums) it moves instance
management from the user's side of the ledger to the language's side.
Let's say that this is a possibility we could explore if we suspected
there were a bigger potential payoff.

From asmehra at redhat.com  Fri Dec 23 17:46:10 2022
From: asmehra at redhat.com (Ashutosh Mehra)
Date: Fri, 23 Dec 2022 12:46:10 -0500
Subject: Portability of compiled code
Message-ID: 

Hello,

Early this year we were doing some experiments on portability of
checkpoints under project CRaC and prepared a document [0] on our
findings.  We feel parts of it would be relevant under Project Leyden as
well, as it *may* enable ahead-of-time compilation in future (I know we
are not there yet!)

For the CRaC checkpoints, we found the code generated by the C1/C2
compilers is not always portable due to the use of architecture-specific
instructions.  The same would hold true in the context of AOT compilations
as well.
- Generate perfectly portable everything, even if that means generating worse everything. This is the route you are suggesting, but obviously this has limits that will be reached pretty quickly anyway. Nothing is going to save you from trying to resume on a different architecture / OS / incompatible JVM version / wrong class path / etc. - Record environmental dependencies, but be willing to toss them if they don?t match the runtime environment. For things like an AOT?ed code cache, the code cache could be flushed and we could fall back to dynamic code generation. - Train on a variety of architectures and include multiple versions of the AOT?ed code in the binary. ? and plenty of others. A key aspect of Leyden is not only ahead-of-time condensation, but recording constraints that capture assumptions inherent in those condensations, and prevent operations that would contradict those assumptions (e.g., prohibit redefinition of classes that have been used to generate AOT?ed code.) On Dec 23, 2022, at 12:46 PM, Ashutosh Mehra > wrote: Hello, Early this year we were doing some experiments on portability of checkpoints under project CRaC and prepared a document [0] on our findings. We feel parts of it would be relevant under Project Leyden as well, as it may enable ahead-of-time compilation in future (I know we are not there yet!) For the CRaC checkpoints, we found the code generated by C1/C2 compiler is not always portable due to the use of architecture specific instructions. The same would hold true in the context of AoT compilations as well. To make the code portable C1/C2 compilers should be provided with a minimal set of cpu features that they are allowed to exploit during codegen phase. However, this can also negatively impact the performance of the generated code as it would now not be utilizing all the features of the underlying architecture. So the performance may have to be traded for gaining portability. If anyone has thoughts on this aspect of the problem, please share them. [0] http://cr.openjdk.java.net/~heidinga/crac/Portability_of_checkpoints.pdf - Ashutosh Mehra -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmehra at redhat.com Fri Dec 23 21:11:01 2022 From: asmehra at redhat.com (Ashutosh Mehra) Date: Fri, 23 Dec 2022 16:11:01 -0500 Subject: Portability of compiled code In-Reply-To: References: Message-ID: Brian, thanks for your comments! - Generate perfectly portable everything, even if that means generating > worse everything. This is the route you are suggesting, but obviously this > has limits that will be reached pretty quickly anyway > I wouldn't call for generating perfectly portable code. Portability would depend on what micro-architectures we are willing to support and finding the common set of cpu features supported by them. -Record environmental dependencies, but be willing to toss them if they > don?t match the runtime environment. For things like an AOT?ed code cache, > the code cache could be flushed and we could fall back to dynamic code > generation. > I agree that recording the environmental dependencies would always be required. Even when generating the portable code, we should record the dependencies, at the least to verify the AOT'ed code can be executed in the runtime environment. To expand a bit more on the portability aspect, the idea behind choosing a set of cpu features is to make the code executable on a broader range of micro architectures. 
Let's say the application needs to be deployed on a cloud where the user may not know the micro-architecture of the systems where the code would eventually be executed. In such a case if the ahead-of-time condensation produces code using a specific cpu feature available on the system it is running on, then the AOT'ed code may not be usable on deployment systems. We can fall back to dynamic code generation, but we lose the startup benefits we could get if only the code had been more portable. Even if the portable code is inferior in quality, it can do the job of providing comparatively quicker start up. At runtime the code can be upgraded (I guess based on profiling) to a more efficient version to exploit all the cpu features of the underlying architecture. - Ashutosh Mehra On Fri, Dec 23, 2022 at 3:15 PM Brian Goetz wrote: > You identify a problem (generated code has environmental dependencies > which may not carry into other environments) and a solution (detune the > generated code to something generic.) I agree with the problem, but there > are many other possible solutions, and we shouldn?t prematurely snap to one > of them. > > Moving some work from runtime to earlier phases (whether through a Leyden > condenser, a classfile rewriter, etc) always creates the possibility that > this work depends on some environmental characteristic of the early > execution environment. The list is endless: hardware, operating system, > environment variables, time and time zone, JDK and other dependency > versions, etc. > > Strategies for dealing with this include: > > - Record environmental dependencies, and fail if the code is restarted > when the dependencies are not met. This is entirely reasonable when > applied to things like ?you ran your checkpoint on Intel/Windows, but > you?re resuming on ARM/MacOS?; it is merely a matter of judgment how > fine-grained to record and enforce. > > - Generate perfectly portable everything, even if that means generating > worse everything. This is the route you are suggesting, but obviously this > has limits that will be reached pretty quickly anyway. Nothing is going to > save you from trying to resume on a different architecture / OS / > incompatible JVM version / wrong class path / etc. > > - Record environmental dependencies, but be willing to toss them if they > don?t match the runtime environment. For things like an AOT?ed code cache, > the code cache could be flushed and we could fall back to dynamic code > generation. > > - Train on a variety of architectures and include multiple versions of > the AOT?ed code in the binary. > > ? and plenty of others. > > A key aspect of Leyden is not only ahead-of-time condensation, but > recording constraints that capture assumptions inherent in those > condensations, and prevent operations that would contradict those > assumptions (e.g., prohibit redefinition of classes that have been used to > generate AOT?ed code.) > > > On Dec 23, 2022, at 12:46 PM, Ashutosh Mehra wrote: > > Hello, > > Early this year we were doing some experiments on portability of > checkpoints under project CRaC and prepared a document [0] on our findings. > We feel parts of it would be relevant under Project Leyden as well, as it > *may* enable ahead-of-time compilation in future (I know we are not there > yet!) > > For the CRaC checkpoints, we found the code generated by C1/C2 compiler is > not always portable due to the use of architecture specific instructions. > The same would hold true in the context of AoT compilations as well. 
> To make the code portable C1/C2 compilers should be provided with a > minimal set of cpu features that they are allowed to exploit during codegen > phase. > However, this can also negatively impact the performance of the generated > code as it would now not be utilizing all the features of the underlying > architecture. > So the performance may have to be traded for gaining portability. > > If anyone has thoughts on this aspect of the problem, please share them. > > [0] > http://cr.openjdk.java.net/~heidinga/crac/Portability_of_checkpoints.pdf > > - Ashutosh Mehra > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: