From daniel.smith at oracle.com Thu Jun 4 19:16:59 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 4 Jun 2020 13:16:59 -0600
Subject: Evolving CONSTANT_Class
In-Reply-To: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com>
References: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com>
Message-ID:

Some thoughts, working backwards from species, that may inform this decision.

A *species* is a specialization of a (generic) class or interface, where by "specialization" we mean the class/interface declaration interpreted in the context of a constant pool that has been modified by inserting certain resolved constants.

At the use site (think 'new'), we informally talk about a species like 'List[Val]'. What this means is "the species produced by resolving 'List', resolving 'Val', and modifying the constant pool of 'List' with the resolved 'Val'". It will also be common to talk about species like 'List[T]', where 'T' is represented by a constant pool entry that will be filled in with a live constant.

This suggests that our representation of a species should combine 1) a pointer to a Class constant, and 2) pointers to other resolvable constants (typically, but maybe not exclusively, representing types).

I think we intuitively want to encode a species with something like 'Class("LList[QVal;];")', but this encoding is flawed:

- There's no constant pool entry to cache the resolution of List

- There's no constant pool entry to cache the resolution of Val

- There's no way to encode a live type argument (List[T]), so we'd need a separate encoding for that

- Depending on the domain of type arguments (can I use an integer?), there's no descriptor string encoding for many other type arguments; again, we'd need a separate encoding

I'm appealing here to a design principle that seems to have driven the original constant pool design: Class constants are for things that get resolved (and can be cached); descriptor strings are little more than fancy names.
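[Editorial sketch, not part of the original message.] The representation described above, a Class constant plus pointers to other resolvable constants with each entry caching its own resolution, can be modeled roughly as below. This is a toy under stated assumptions: `SpeciesSketch`, `Resolvable`, `ClassConstant`, `SpeciesConstant`, and `resolve` are hypothetical names invented for illustration, and strings stand in for resolved classes. It is not a proposed class-file layout or JVM API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model only: none of these types exist in the JDK or the JVMS.
public class SpeciesSketch {

    /** Any constant pool entry that is resolved once and then cached. */
    public interface Resolvable {
        String resolve(List<String> log);
    }

    /** Stand-in for CONSTANT_Class: a named entry with its own cache slot. */
    public static final class ClassConstant implements Resolvable {
        private final String name;
        private String cached; // filled in by the first resolution

        public ClassConstant(String name) { this.name = name; }

        @Override
        public String resolve(List<String> log) {
            if (cached == null) {
                log.add("load " + name); // record each real resolution
                cached = name;
            }
            return cached;
        }
    }

    /** Hypothetical CONSTANT_Species: a generic class plus resolvable type arguments. */
    public static final class SpeciesConstant implements Resolvable {
        private final ClassConstant generic;
        private final List<Resolvable> typeArgs;
        private String cached;

        public SpeciesConstant(ClassConstant generic, Resolvable... typeArgs) {
            this.generic = generic;
            this.typeArgs = Arrays.asList(typeArgs);
        }

        @Override
        public String resolve(List<String> log) {
            if (cached == null) {
                StringBuilder sb = new StringBuilder(generic.resolve(log)).append('[');
                for (int i = 0; i < typeArgs.size(); i++) {
                    if (i > 0) sb.append(", ");
                    sb.append(typeArgs.get(i).resolve(log)); // each argument resolves via its own entry
                }
                cached = sb.append(']').toString();
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ClassConstant list = new ClassConstant("List");
        ClassConstant val = new ClassConstant("Val");
        SpeciesConstant listOfVal = new SpeciesConstant(list, val);
        SpeciesConstant nested = new SpeciesConstant(list, listOfVal);

        System.out.println(listOfVal.resolve(log)); // prints "List[Val]"
        System.out.println(nested.resolve(log));    // prints "List[List[Val]]"
        System.out.println(log);                    // prints "[load List, load Val]"
    }
}
```

Because a type argument is just another `Resolvable`, a live constant or a non-type argument would fit the same slot without a separate encoding, and 'List' and 'Val' are each loaded once no matter how many species mention them. A single descriptor string such as "LList[QVal;];" offers no per-component place to hang that cache.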
This principle doesn't always get followed: the verifier sometimes loads classes named by descriptors; array type class constants resolve their element types without a separate entry; more recently, StackMapTables use Class constants to represent types, and MethodTypes resolve method descriptors "as if" there were class constants for all of the parameter types. But I think these, especially the recent ones, are mistakes, and I still think the original notion is a useful separation of concerns that we should try to follow in our design.

Implications, if you buy this argument:

- There's got to be some sort of new CONSTANT_Species entry consisting of pointers to the generic class and the type arguments.

- For class-flavored references that allow species (super_class, interfaces, new, maybe this_class), either a Class can point to a Species, or a Species can appear as an alternative to a Class.

- For type-flavored references (Methodref, instanceof, anewarray), again we need either a Class/Type that can point to the Species, or we allow the Species as an alternative to be referenced directly. A distinct problem here is that we need a way to express whether the species type is an L type or a Q type. Maybe that's an extra layer, or maybe it's built into CONSTANT_Species. (This is really the same problem as what we do about L vs. Q class types, but without the legacy constraints.)

- For bare descriptors (type of a field), it's fine to use something like "LList[QVal;];". Or maybe it's useful to describe descriptors in terms of Class/Species constants. In any case, there's still a need to figure out how to parameterize a descriptor with live constants ("LList[$T];"), but I think this can be set aside as a separate problem.

-----

Bonus round: generic methods.

Generic methods work a lot like species: at the use site, we need to be able to refer to a method in the context of a constant pool that has been modified by inserting certain resolved constants.
(We might even want to use the term "species" here, too. Or maybe it's "specialized method", where "specialized class" = "species".)

The existing representation of a method to be invoked is a Methodref, which has pointers to a Class constant, a name string, and a descriptor string.

So I think we need CONSTANT_SpecializedMethodref, which has 1) a pointer to a Methodref constant, and 2) pointers to some resolvable constants (typically, but maybe not exclusively, representing types).

(Caveat: there are some details about the interaction between type arguments, overriding, and method resolution that I'm hand-waving about. Maybe the encoding will be stacked a little differently.)

Again, we can either somehow wrap the SpecializedMethodref in a Methodref (this seems a lot more awkward than it does when wrapping a Species in a Class), or we can allow the use sites (invoke instructions, mostly) to point to either Methodrefs or SpecializedMethodrefs.

-----

Where this leaves me (acknowledging that I've made some leaps that some people might be more skeptical of) is pretty down on options (1) and (2).

If we do (4), CONSTANT_Type is going to be heavily overloaded: it can refer to a descriptor, a SpeciesType, an ArrayType (for arrays of species types), a type variable, etc. Basically, the distinction between (3) and (4) amounts to whether outside references can point to one of many alternatives, or whether they're all routed through a CONSTANT_Type, which then points to one of the alternatives. I can imagine good arguments for both of those alternatives.

From kevinb at google.com Fri Jun 5 02:00:19 2020
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 4 Jun 2020 19:00:19 -0700
Subject: The fate of int, int.ref, and Integer
Message-ID:

Hello friends,

A couple thoughts on the fate of the primitives and wrappers.

First, on nomenclature, I think the most useful definitions of what it means to be an "inline type" are those that reveal the primitives to *already be* inline types. Java's always had them, but it hasn't had *user-defined* inline types, because it hasn't had *inline classes* (and classes are how we user-define types). That's clean, and it's not even a retcon.

Also on nomenclature, I want to avoid phrases like "you can expand the set of primitives"; no, I still think that "primitives" should always apply to the eight *predefined*, irreducible inline types. User-defined inline types are always composite (how could they not be?).

I approve of the idea of writing int.java etc. files in order to add methods to `int`, and add interfaces to `int.ref`. It is fine if these files are essentially "fake" (they don't actually bring the primitives into existence as other classes do). I think attempts to try to make them look "real" would mean letting them do things other inline types can't and it definitely wouldn't seem worth it to me.

What I would explain is "In Java [...] to be because it had no members. Now in Java >=X that it has members and implements `Comparable`, it is a class for that reason, but the type itself is still predefined with or without that class."

(wart: yeah, arrays have no class, yet sure seem to have members `length` and `clone` anyway. oh well.)

I also approve of giving the new `int` class everything it needs so that the `Integer` class becomes obsolete; that is, there would no longer be any good reason to use it except when forced by legacy code. (Of course, anything that wants to depend on identity or locking of such an object I will just declare to be legacy code, so it works!) Really though, don't bring `getInteger` over when you do.

However, I am highly skeptical of attempts to do anything else beyond that. I've seen, at least, the allusions to some kind of aliasing between `int.ref` and `Integer`.
That seems unnecessary to me, and more to the point, I feel that it can only make things more confusing to users; that in fact it will cause a large share of all the confusion they do feel. So wait, what IS the wrapper class then? What IS this reference projection then? I see no benefit to blurring that line, at this point.

Reactions?

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From john.r.rose at oracle.com Fri Jun 5 02:16:15 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 4 Jun 2020 19:16:15 -0700
Subject: The fate of int, int.ref, and Integer
In-Reply-To:
References:
Message-ID: <3B91AB6D-A93C-4623-8F1E-6B865F609D8E@oracle.com>

On Jun 4, 2020, at 7:00 PM, Kevin Bourrillion wrote:
>
> Hello friends,
>
> A couple thoughts on the fate of the primitives and wrappers.
>
> First, on nomenclature, I think the most useful definitions of what it means to be an "inline type" are those that reveal the primitives to already be inline types. Java's always had them, but it hasn't had user-defined inline types, because it hasn't had inline classes (and classes are how we user-define types). That's clean, and it's not even a retcon.

+1 We have tried to keep "works like an int" as a goal. I don't think we've compromised too much away from that; I think your formula works, except for the technical fact that "inline" is always followed by "class".

> Also on nomenclature, I want to avoid phrases like "you can expand the set of primitives"; no, I still think that "primitives" should always apply to the eight predefined, irreducible inline types. User-defined inline types are always composite (how could they not be?).

Yes. But: I expect that JVMs will sometimes secretly define things that look like inline classes but in fact are physically atomic (except bitwise of course). Vectors in AVX are like this: They go in one register, not many.
I expect such things to be hidden from the end user, in places like jdk.internal.types, and wrapped in ordinary inline wrapper classes for public consumption.

>
> I approve of the idea of writing int.java etc. files in order to add methods to `int`, and add interfaces to `int.ref`. It is fine if these files are essentially "fake" (they don't actually bring the primitives into existence as other classes do). I think attempts to try to make them look "real" would mean letting them do things other inline types can't and it definitely wouldn't seem worth it to me.

Yep. We are a long way from doing so, I think. We might like some kind of Haskell-flavored fu that lets us relate those things to their operators. At least, I'd like to know something about that road ahead, before committing to the initial contents of int.java.

> What I would explain is "In Java [...] to be because it had no members. Now in Java >=X that it has members and implements `Comparable`, it is a class for that reason, but the type itself is still predefined with or without that class."

I think people would not be satisfied with such an explanation, until we can explain why 42.toString() does or doesn't work, and how 42 < 43 connects to a call to Comparable.compareTo, and (worst of all) how 1.0 == 1.0 connects to the Java == operator on classes and/or Comparable.compareTo (pick one).

So we're pretty far from making int into a class, or from writing int.java. But, yes, we can say that primitives are (in some sense to be defined or hand waved away) "inline types".

>
> (wart: yeah, arrays have no class, yet sure seem to have members `length` and `clone` anyway. oh well.)
>
> I also approve of giving the new `int` class everything it needs so that the `Integer` class becomes obsolete; that is, there would no longer be any good reason to use it except when forced by legacy code. (Of course, anything that wants to depend on identity or locking of such an object I will just declare to be legacy code, so it works!)
> Really though, don't bring `getInteger` over when you do.

This is a maze of twisty passages. I agree there are ways through it. We want to choose a way through that doesn't leave us disgusted with ourselves in the morning.

> However, I am highly skeptical of attempts to do anything else beyond that. I've seen, at least, the allusions to some kind of aliasing between `int.ref` and `Integer`. That seems unnecessary to me, and more to the point, I feel that it can only make things more confusing to users; that in fact it will cause a large share of all the confusion they do feel. So wait, what IS the wrapper class then? What IS this reference projection then? I see no benefit to blurring that line, at this point.

Interesting. I don't have a strong feeling, but I *do* hope that we could define by fiat that Integer is the ref-projection of int, sooner rather than later. You are prompting me to re-examine this idea, and see what it might buy us. At the very least, I'd like to be able to say List instead of List and get away with it. This is touching on exactly what we can do (short and long term) with generics, which is an open question.

Are you saying that it would be risky to declare that int.ref == Integer, because it would make it harder to get rid of Integer? Isn't it going to be impossible anyway to get rid of Integer? I think the problem is mainly to make Integer as palatable as possible in the future, perhaps deprecating some of the oldest cruft, and (at least conceptually) attributing the useful parts to int, even before we venture to write int.java.

My $0.02. Thanks for raising the question.

--
John

From kevinb at google.com Fri Jun 5 02:46:08 2020
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 4 Jun 2020 19:46:08 -0700
Subject: The fate of int, int.ref, and Integer
In-Reply-To: <3B91AB6D-A93C-4623-8F1E-6B865F609D8E@oracle.com>
References: <3B91AB6D-A93C-4623-8F1E-6B865F609D8E@oracle.com>
Message-ID:

On Thu, Jun 4, 2020 at 7:16 PM John Rose wrote:

On Jun 4, 2020, at 7:00 PM, Kevin Bourrillion wrote:
>
>
> Hello friends,
>
> A couple thoughts on the fate of the primitives and wrappers.
>
> First, on nomenclature, I think the most useful definitions of what it
> means to be an "inline type" are those that reveal the primitives to *already
> be* inline types. Java's always had them, but it hasn't had *user-defined* inline
> types, because it hasn't had *inline classes* (and classes are how we
> user-define types). That's clean, and it's not even a retcon.
>
>
> +1 We have tried to keep "works like an int" as a goal. I don't think
> we've
> compromised too much away from that; I think your formula works, except
> for the technical fact that "inline" is always followed by "class".
>

I don't know what this means -- I see separate usages of "inline class" and "inline type" and they make sense to me.

> Also on nomenclature, I want to avoid phrases like "you can expand the set
> of primitives"; no, I still think that "primitives" should always apply to
> the eight *predefined*, irreducible inline types. User-defined inline
> types are always composite (how could they not be?).
>
>
> Yes. But: I expect that JVMs will sometimes secretly define things that
> look
> like inline classes but in fact are physically atomic (except bitwise of
> course).
> Vectors in AVX are like this: They go in one register, not many. I expect
> such things to be hidden from the end user, in places like
> jdk.internal.types,
> and wrapped in ordinary inline wrapper classes for public consumption.
>

(I should probably open every email with a reminder that my concerns always live purely within the language model.)

> I approve of the idea of writing int.java etc. files in order to add
> methods to `int`, and add interfaces to `int.ref`. It is fine if these
> files are essentially "fake" (they don't actually bring the primitives into
> existence as other classes do). I think attempts to try to make them look
> "real" would mean letting them do things other inline types can't and it
> definitely wouldn't seem worth it to me.
>
>
> Yep. We are a long way from doing so, I think. We might like some kind of
> Haskell-flavored fu that lets us relate those things to their operators.
> At
> least, I'd like to know something about that road ahead, before committing
> to the initial contents of int.java.
>

That kind of fu is the sort I'm skeptical of; the only thing you really need this class for is to augment the existing predefined type with methods so that its reference projection can be the better version of Integer (and if static int-accepting methods on Integer become callable as myInt.method(), that's nice gravy too). That file doesn't need to "explain" anything about how ints already work, imho. It's okay that primitives are special.

> What I would explain is "In Java [...]
> to be because it had no members. Now in Java >=X that it has members and
> implements `Comparable`, it is a class for that reason, but the type itself
> is still predefined with or without that class."
>
>
> I think people would not be satisfied with such an explanation, until we
> can
> explain why 42.toString() does or doesn't work, and how 42 < 43 connects
> to a call to Comparable.compareTo, and (worst of all) how 1.0 == 1.0
> connects
> to the Java == operator on classes and/or Comparable.compareTo (pick one).
>

Satisfied? No, it wasn't meant to be an explanation of very *much*, just why int became a class.
Off the cuff, it's clean to make 42.toString() and myInt.toString() work if possible, to emphasize that there's nothing about ref-ness that somehow enables method calls (and further demolish the popular misconception that "dot means dereference").

> So we're pretty far from making int into a class, or from writing int.java.
> But, yes, we can say that primitives are (in some sense to be defined or
> hand
> waved away) "inline types".
>

I don't feel like it should require much handwaving, just avoiding an unnecessarily specific definition for "inline type". That means there will be a little more to say about *user-defined* inline types beyond the general case, but that's okay. I have no idea if what I'm saying is controversial; I'm just saying that `int` has the *essential* qualities of any inline type already.

> (wart: yeah, arrays have no class, yet sure seem to have members `length`
> and `clone` anyway. oh well.)
>
>
> I also approve of giving the new `int` class everything it needs so that
> the `Integer` class becomes obsolete; that is, there would no longer be any
> good reason to use it except when forced by legacy code. (Of course,
> anything that wants to depend on identity or locking of such an object I
> will just declare to be legacy code, so it works!) Really though, don't
> bring `getInteger` over when you do.
>
>
> This is a maze of twisty passages. I agree there are ways through it.
> We want to choose a way through that doesn't leave us disgusted with
> ourselves in the morning.
>

My view was that it's the *next* part below that becomes twisty and that we would need to do carefully.

> However, I am highly skeptical of attempts to do anything else beyond that.
> I've seen, at least, the allusions to some kind of aliasing between
> `int.ref` and `Integer`. That seems unnecessary to me, and more to the
> point, I feel that it can only make things more confusing to users; that in
> fact it will cause a large share of all the confusion they do feel.
> So wait, what IS the wrapper class then? What IS this reference projection
> then? I see no benefit to blurring that line, at this point.
>
>
> Interesting. I don't have a strong feeling, but I *do* hope that we could
> define
> by fiat that Integer is the ref-projection of int, sooner rather than
> later.
> You are prompting me to re-examine this idea, and see what it might buy us.
> At the very least, I'd like to be able to say List instead of
> List
> and get away with it.
>

Not sure if by "instead of" you mean those types actually being convertible; I'm expressing that I could accept a world where those types are just different, but people have incentive and some means to make discrete migrations over. But don't get me started about type migrations. :-)

> This is touching on exactly what we can do (short
> and long term) with generics, which is an open question.
>
> Are you saying that it would be risky to declare that int.ref == Integer,
> because
> it would make it harder to get rid of Integer? Isn't it going to be
> impossible
> anyway to get rid of Integer? I think the problem is mainly to make
> Integer
> as palatable as possible in the future, perhaps deprecating some of the
> oldest
> cruft, and (at least conceptually) attributing the useful parts to int,
> even before
> we venture to write int.java.
>

Absolutely, impossible to get rid of Integer. But there can eventually be codebases that ban it, and use only int.ref, like Google's one day would.

My basic orientation is that the more `Integer` is left alone, the less confusing everything is. Everyone is very accustomed to Living With Certain Things That Suck, and it's better when we can store our knowledge in neat boxes that don't all bleed together. Anyway, the details could override that orientation as we get into them.

--
Kevin Bourrillion | Java Librarian | Google, Inc.
| kevinb at google.com

From brian.goetz at oracle.com Fri Jun 5 12:43:11 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 5 Jun 2020 08:43:11 -0400
Subject: The fate of int, int.ref, and Integer
In-Reply-To:
References:
Message-ID: <66761060-44ce-c1d5-4226-9e4e2f75c180@oracle.com>

> I approve of the idea of writing int.java etc. files in order to add
> methods to `int`, and add interfaces to `int.ref`. It is fine if these
> files are essentially "fake" (they don't actually bring the primitives
> into existence as other classes do). I think attempts to try to make
> them look "real" would mean letting them do things other inline types
> can't and it definitely wouldn't seem worth it to me.

Not to mention this circularity:

    native inline class int implements BlahBlah {
        int theValue;  <-- oops
    }

So to the extent these are written as .java files, some fakery is inevitable.

> I also approve of giving the new `int` class everything it needs so
> that the `Integer` class becomes obsolete; that is, there would no
> longer be any good reason to use it except when forced by legacy code.
> (Of course, anything that wants to depend on identity or locking of
> such an object I will just declare to be legacy code, so it works!)
> Really though, don't bring `getInteger` over when you do.
>
> However, I am highly skeptical of attempts to do anything else beyond
> that. I've seen, at least, the allusions to some kind of aliasing
> between `int.ref` and `Integer`. That seems unnecessary to me, and
> more to the point, I feel that it can only make things more confusing
> to users; that in fact it will cause a large share of all the
> confusion they do feel. So wait, what IS the wrapper class then? What
> IS this reference projection then? I see no benefit to blurring that
> line, at this point.
>
> Reactions?

I certainly understand your gut sense that trying to retcon Integer to be something it was never meant to be will likely have some unexpected consequences. But it's not like the alternative is great either.

Suppose we have `native inline class int { ... }`. So it gets a reference projection, `int.ref`, and an inline widening conversion from `int` to `int.ref`. And also, it already has a boxing conversion (with the same semantics, and applicable in exactly the same places) to `Integer`. Now what happens when someone does:

    Object o = i;

There are gazillions of lines of code that hard-code the assumption that this results in `Integer`. Every client of reflection is rife with this assumption. So we're probably going to conclude that the boxing conversion has to win over the widening conversion. And now every use of primitives -- the most important inline types -- will be saddled with accidental identity when they box. Which means that none of the boxing cases we have now -- part of the motivation for doing Valhalla in the first place -- will ever get better.

OK, that's existing code. What about new code? It gets worse! Some libraries might use `List` (because they're better boxes), and others might use `List`. And they won't interop. And this problem won't ever go away.

The move of saying `Integer` *is* `int.ref` makes these problems go away. This seems too good to pass up preemptively.

From john.r.rose at oracle.com Sat Jun 6 19:08:37 2020
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 6 Jun 2020 12:08:37 -0700
Subject: The fate of int, int.ref, and Integer
In-Reply-To: <66761060-44ce-c1d5-4226-9e4e2f75c180@oracle.com>
References: <66761060-44ce-c1d5-4226-9e4e2f75c180@oracle.com>
Message-ID:

On Jun 5, 2020, at 5:43 AM, Brian Goetz wrote:
>
> The move of saying `Integer` *is* `int.ref` makes these problems go away. This seems too good to pass up preemptively.

I agree. And this leads us into a maze of twisty passages.
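[Editorial sketch, not part of the original message.] The existing-code assumption discussed in this thread, that `Object o = i;` yields an `Integer` and that reflection hands `int` results back as `Integer`, can be checked on any current JDK. The class and method names below (`BoxingToday`, `boxedClassOf`, `reflectiveLength`) are made up for illustration; only standard `java.lang` and `java.lang.reflect` calls are used.

```java
import java.lang.reflect.Method;

public class BoxingToday {

    /** Boxes an int by assigning it to Object and reports the runtime class. */
    public static Class<?> boxedClassOf(int i) {
        Object o = i; // boxing conversion, as in "Object o = i;"
        return o.getClass();
    }

    /** Calls String.length() reflectively; the int result comes back as an Object. */
    public static Object reflectiveLength(String s) {
        try {
            Method length = String.class.getMethod("length");
            return length.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(boxedClassOf(42) == Integer.class);          // prints "true"
        System.out.println(reflectiveLength("abc") instanceof Integer); // prints "true"
        System.out.println(reflectiveLength("abc"));                    // prints "3"
    }
}
```

Any design in which the widening conversion to `int.ref` produced something that is not an `Integer` would change what both checks report, which is the compatibility pressure described above.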
Full of uninsulated electrified wires and third rails to avoid.

One tactic I like to get through the maze is to find a way to add some ad hoc polymorphism to Integer, while keeping it sealed up as a final class, like today. It seems to need to cover both good old identity objects (like new Integer(42)) and also new inline objects ((Integer)42, boxed as before, but via a new subtype relation).

This means there are two object types floating around, identity-Integer and int, plus an API type, Integer-as-super. I think the notion of species (as a finer grained subdivision of types, under class) can be used to create the necessary distinctions, without introducing a lot of new types, and breaking reflection.

In the spirit of brainstorming, here are more details on such a path, one that might lead us through the maze.

Given this:

    int.ref id = new Integer(42);  //identity object
    int.val x = id;
    int.val y = 42;
    int.ref z = y;

...we could choose to arrange things so that all of x, y, z are inline objects (true "ints"), while id retains its special flavor.

Also, Object.getClass could report Integer.class for *all* of those values (even y). This could be justified by revealing "int" as, not a class, but a *species* of Integer. So that Object.getSpecies would report the further details: For id it is the version of Integer which holds identity (which doesn't need a name I suppose) and for x/y/z it reports the species reflector for int.

If we further use a muddied java.lang.Class to continue to represent non-classes like "int", we have to double down on the idea of a "crass", or "runtime class-like type".
In that case we can have getSpecies return a crass, and then:

    assert 42.getClass() == Integer.class;
    assert 42.getSpecies() == int.class;
    assert new Integer(42).getSpecies() == (something else);

From daniel.smith at oracle.com Mon Jun 8 23:29:33 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 8 Jun 2020 17:29:33 -0600
Subject: access control for withfield bytecode, compared to putfield
In-Reply-To: <5DD7A25B-BBDD-4BFA-BA86-E0BAC92101D6@oracle.com>
References: <5DD7A25B-BBDD-4BFA-BA86-E0BAC92101D6@oracle.com>
Message-ID: <8D570D5A-EB65-406E-BBAA-7F1AF6043929@oracle.com>

> On Apr 8, 2020, at 11:29 PM, John Rose wrote:
>
> To summarize: The simplest rule for access checking a
> withfield instruction is to say, "pretend the field was
> declared private, and perform access checks". That's
> it; the rest follows from the rules we have already laid
> down.

Just had a chance to read this old mail...

FWIW, this *is* the specified behavior in the most recent JVMS iteration:

http://cr.openjdk.java.net/~dlsmith/lw2/lw2-20190628/specs/inline-classes-jvms.html#jvms-6.5.withfield

I agree, private access seems to be the right model. (Plus, maybe at some point, giving the class file the ability to express a 'withfield' access restriction as one of { public, protected, package, private }.)

From john.r.rose at oracle.com Tue Jun 9 06:38:00 2020
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 8 Jun 2020 23:38:00 -0700
Subject: access control for withfield bytecode, compared to putfield
In-Reply-To: <8D570D5A-EB65-406E-BBAA-7F1AF6043929@oracle.com>
References: <8D570D5A-EB65-406E-BBAA-7F1AF6043929@oracle.com>
Message-ID: <4EBB6331-7EA0-465E-86FF-3C371A5C85A1@oracle.com>

+1

> On Jun 8, 2020, at 4:29 PM, Dan Smith wrote:
>
>
>>
>> On Apr 8, 2020, at 11:29 PM, John Rose wrote:
>>
>> To summarize: The simplest rule for access checking a
>> withfield instruction is to say, "pretend the field was
>> declared private, and perform access checks".
>> That's it; the rest follows from the rules we have already laid
>> down.
>
> Just had a chance to read this old mail...
>
> FWIW, this *is* the specified behavior in the most recent JVMS iteration:
>
> http://cr.openjdk.java.net/~dlsmith/lw2/lw2-20190628/specs/inline-classes-jvms.html#jvms-6.5.withfield
>
> I agree, private access seems to be the right model. (Plus, maybe at some point, giving the class file the ability to express a 'withfield' access restriction as one of { public, protected, package, private }.)

From daniel.smith at oracle.com Tue Jun 9 21:18:21 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 9 Jun 2020 15:18:21 -0600
Subject: Valhalla basic concepts / terminology
In-Reply-To:
References:
Message-ID:

I think there remain some finer details of the usage of these terms to be nailed down. Here's an overview of how I think about it.

(Please note that I'm talking about the language model here. Exactly how this translates into the JVM model is a separate problem.)

- The *values* of the Java Programming Language are *reference values* (references to objects) and *inline values* (the objects themselves). An *object* is either a *class instance* or an *array*. (See JLS 4.3.1.) All objects can be manipulated via a reference value. Only some objects can also be manipulated directly as inline values.

- A *class* describes the structure of a class instance. A *concrete class* can be *instantiated* (typically via a class instance creation expression). An *inline class* is a concrete class whose instances can be treated as inline values. An *identity class* is a concrete class whose instances support identity-sensitive behaviors, and so must always be handled via references.

- A *type* describes a set of values. An *inline type* consists of inline values, the instances of a particular inline class. A *reference type* consists of references to objects with a particular property, or the null reference. Inline types are disjoint.
  Reference types have subsetting relationships, captured by the *subtype* relation.

- A *type expression* is the syntax used to refer to a particular type. A class name is one example of a type expression, with a variety of rules used to map this name to a specific type. The type expression 'ClassName' often denotes a reference type, but for some inline classes denotes its inline type. The type expression 'ClassName.ref' denotes the reference type of an inline class, and the type expression 'ClassName.val' denotes the inline type of an inline class. (Whether these decorations are allowed redundantly is TBD.)

Where did the primitives go? Primitive values are inline values, specifically instances of certain inline classes (hopefully the class java.lang.Integer, etc., if we can make the migration work). Primitive types are inline types (e.g., 'int' is shorthand for 'java.lang.Integer.val').

---

A few things that still make me a bit uneasy, maybe could use more noodling:

- "Inline value" vs. "reference value" makes sense. Then re-using "inline" for "inline class" vs. "identity class" is potentially confusing. In this context, we're using "inline" as shorthand for "inline-capable" and "identity-free". It would sort of be nice if we could flip the world and make 'identity' the class declaration keyword (although we'd still need a term for the absence of that keyword).

- The syntax ".val" used to denote an "inline type" is a bit of a mismatch. Maybe we want a new syntax. Or maybe we want to rework the word "value" into the story so that "inline type" becomes "value type".

- The term "class type" now has multiple possible interpretations. I guess, unless it's qualified further, it ought to refer to all types "derived from" a particular class, including reference types, inline types, parameterized types, raw types, ... The taxonomy of types, including appropriate terms, needs to be sorted out.

- I'm ignoring generic inline classes. We're all ignoring generic inline classes.
:-) Generics in the inline type world are, I think, a somewhat different beast than generics in the reference type world, because inline types are disjoint. More work to be done here. From kevinb at google.com Tue Jun 9 23:00:39 2020 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 9 Jun 2020 16:00:39 -0700 Subject: Valhalla basic concepts / terminology In-Reply-To: References: Message-ID: On Tue, Jun 9, 2020 at 2:18 PM Dan Smith wrote: A few things that still make me a bit uneasy, maybe could use more noodling: > > - "Inline value" vs. "reference value" makes sense. Then re-using "inline" > for "inline class" vs. "identity class" is potentially confusing. In this > context, we're using "inline" as shorthand for "inline-capable" and > "identity-free". It would sort of be nice if we could flip the world and > make 'identity' the class declaration keyword (although we'd still need a > term for the absence of that keyword). > My current way of spinning this to myself is that a class being "inline" means it *enables* possible inlining, *and* that referring to it as an "inline" type (its concrete type) is another step that also *enables* possible inlining, and in the end the VM will do what it wants, subject to those limitations. So, I can see the term working at both levels... - The syntax ".val" used to denote an "inline type" is a bit of a mismatch. > Maybe we want a new syntax. Or maybe we want to rework the word "value" > into the story so that "inline type" becomes "value type". > This was my reaction too. ".val" means "the value itself, that you care about", and ".ref" means "a reference value that points to the value you care about", but I used the word "value" *more* times in the second phrase. It doesn't feel like this will be clear. > - The term "class type" now has multiple possible interpretations. 
I > guess, unless it's qualified further, it ought to refer to all types > "derived from" a particular class, including reference types, inline types, > parameterized types, raw types, ... The taxonomy of types, including > appropriate terms, needs to be sorted out. > > - I'm ignoring generic inline classes. We're all ignoring generic inline > classes. :-) Generics in the inline type world are, I think, a somewhat > different beast than generics in the reference type world, because inline > types are disjoint. More work to be done here. > I haven't had a chance to learn much about this or give it much thought, but here's my naive thoughts/questions: - Type parameters that are identity-bounded (or get marked as identity-bounded) work like always, even in a generic inline class, not much trouble there? - Existing code would mostly work for inline types via their reference projections, but identity stuff would fail at runtime, and of course performance gains are left on the table - If you could declare a type parameter as inclusive of inline types you could at least move some runtime failures to compile-time - But I don't see how to get performance benefits of flattening unless these generics are reified somehow? -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com From brian.goetz at oracle.com Wed Jun 10 00:00:12 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 9 Jun 2020 20:00:12 -0400 Subject: Valhalla basic concepts / terminology In-Reply-To: References: Message-ID: <3efb4997-a742-7c9e-418d-a7e77c6da7eb@oracle.com> On 6/9/2020 7:00 PM, Kevin Bourrillion wrote: > On Tue, Jun 9, 2020 at 2:18 PM Dan Smith > wrote: > > A few things that still make me a bit uneasy, maybe could use more > noodling: > > - "Inline value" vs. "reference value" makes sense. Then re-using > "inline" for "inline class" vs. "identity class" is potentially > confusing. 
In this context, we're using "inline" as shorthand for > "inline-capable" and "identity-free". It would sort of be nice if > we could flip the world and make 'identity' the class declaration > keyword (although we'd still need a term for the absence of that > keyword). > > > My current way of spinning this to myself is that a class being > "inline" means it /enables/ possible inlining, /and/ that referring to > it as an "inline" type (its concrete type) is another step that also > /enables/ possible inlining, and in the end the VM will do what it > wants, subject to those limitations. So, I can see the term working at > both levels... Perhaps the problem is that we've put the term on the secondary attribute rather than the primary. The primary attribute is id-freedom, from which we derive the option to inline. But we're calling them "inline". > > > - The syntax ".val" used to denote an "inline type" is a bit of a > mismatch. Maybe we want a new syntax. Or maybe we want to rework > the word "value" into the story so that "inline type" becomes > "value type". > > > This was my reaction too. ".val" means "the value itself, that you > care about", and ".ref" means "a reference value that points to the > value you care about", but I used the word "value" /more/ times in the > second phrase. It doesn't feel like this will be clear. My intention here was to appeal to terms many users already understand: pass by value and pass by reference. That's why `V.val` is not `V.inline`.
From daniel.smith at oracle.com Wed Jun 10 15:20:12 2020 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 10 Jun 2020 09:20:12 -0600 Subject: Valhalla basic concepts / terminology In-Reply-To: <3efb4997-a742-7c9e-418d-a7e77c6da7eb@oracle.com> References: <3efb4997-a742-7c9e-418d-a7e77c6da7eb@oracle.com> Message-ID: <8423A2E1-960F-42A8-94CB-8E4AAEEF4ED2@oracle.com> > On Jun 9, 2020, at 6:00 PM, Brian Goetz wrote: > >> - The syntax ".val" used to denote an "inline type" is a bit of a mismatch. Maybe we want a new syntax. Or maybe we want to rework the word "value" into the story so that "inline type" becomes "value type". >> >> This was my reaction too. ".val" means "the value itself, that you care about", and ".ref" means "a reference value that points to the value you care about", but I used the word "value" more times in the second phrase. It doesn't feel like this will be clear. > > My intention here was to appeal to terms many users already understand: pass by value and pass by reference. That's why `V.val` is not `V.inline`. Sure. And maybe that value/reference dichotomy can be extended into the terms we use in the model. So, the "values" of the language (using the term formally) are *values* (objects) and *references to values*. Now there's a nice alignment between the syntax and the terminology. Or given that "objects" parenthetical, maybe "object" is the right term: the values of the language are *objects* and *references to objects*. In that case, maybe the syntax should be 'Foo.obj'. The objects themselves, not references. 
From brian.goetz at oracle.com Wed Jun 10 16:15:20 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 10 Jun 2020 12:15:20 -0400 Subject: Valhalla basic concepts / terminology In-Reply-To: <8423A2E1-960F-42A8-94CB-8E4AAEEF4ED2@oracle.com> References: <3efb4997-a742-7c9e-418d-a7e77c6da7eb@oracle.com> <8423A2E1-960F-42A8-94CB-8E4AAEEF4ED2@oracle.com> Message-ID: <7dc0ed13-8f83-71bd-df07-d28003ad51ee@oracle.com> FWIW, one of the things that made this take so long is that there are a number of related dichotomies, which we kept trying to collapse together (or use the wrong one): - primitive type vs reference type - primitive vs class - inline class vs identity class - pass/store by reference vs pass/store by value - nullable vs non-nullable - direct vs indirect On 6/10/2020 11:20 AM, Dan Smith wrote: >> On Jun 9, 2020, at 6:00 PM, Brian Goetz wrote: >> >>> - The syntax ".val" used to denote an "inline type" is a bit of a mismatch. Maybe we want a new syntax. Or maybe we want to rework the word "value" into the story so that "inline type" becomes "value type". >>> >>> This was my reaction too. ".val" means "the value itself, that you care about", and ".ref" means "a reference value that points to the value you care about", but I used the word "value" more times in the second phrase. It doesn't feel like this will be clear. >> My intention here was to appeal to terms many users already understand: pass by value and pass by reference. That's why `V.val` is not `V.inline`. > Sure. And maybe that value/reference dichotomy can be extended into the terms we use in the model. So, the "values" of the language (using the term formally) are *values* (objects) and *references to values*. Now there's a nice alignment between the syntax and the terminology. > > Or given that "objects" parenthetical, maybe "object" is the right term: the values of the language are *objects* and *references to objects*. In that case, maybe the syntax should be 'Foo.obj'.
The objects themselves, not references. From brian.goetz at oracle.com Mon Jun 15 19:28:39 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 15 Jun 2020 15:28:39 -0400 Subject: Evolving CONSTANT_Class In-Reply-To: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com> References: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com> Message-ID: <75875e7d-7166-ef75-440f-570dcc9ee0f9@oracle.com> > > Here's a table listing all the type-flavored uses (where "X" means > "allowed here" and "~" means "maybe not essential, but the semantics > would be clear"): More specifically, in the first two columns X means "allowed now", and in the later columns, X means "proposed."? Note too that the proposed Species column is identical to the proposed Class name column. The primitive column is interesting as we probably are going to translate away all of these to some sort of `Qint` type when they appear in these places, so in the JVM, are probably not needed. > Another way to handle it is to distinguish between a *species*, which > is a class-like entity, and a *species type*. It's helpful to remember > that there may be inline types of species (that is, a "Q envelope" of > a species). I think this is a fruitful direction; I can have `ArrayList[T] extends List[T]` where it is a class-like use, and I can have `Foo[T].x` where it is a type-like use. > 1) Treat everything in the class/interface table as a degenerate use > of a type. A class name is always interpreted as an L type. Given that a specializable class Foo gives rise to species Foo[x] and Foo[y], _and_ a class type Foo such that Foo[t] <: Foo for all t, the duality between class and type here seems inevitable. > - When a Class constant is viewed as a type (for (1) that's always, > for (2) that's for type-flavored references), the implicit L envelope > is a historical wart. Do we also support explicit L descriptors? Do we > try to migrate the world away from the implicit envelopes? 
I would love to migrate away, but I suspect the cost/benefit isn't there. Historical warts are OK. > - Should we add primitive types? How are they spelled? (The standard > descriptor syntax for primitives is already interpreted as a bare > class name.) Given the way we are thinking for translation, where there is going to be some Q type that stands in for primitives when used in class-y contexts (if for no other reason than the double-slot thing), I don't think this is needed. > - How do we handle type variables, both top-level and nested? Either > we embed constant pool pointers in Utf8 entries (yuck!), or we need to > extend Class constants to support references both to Utf8 entries and > to [some new thing]. This is the stringy-vs-tree problem we've been wrestling with for a long time. The solution to this problem seems to hinge on the solution to that one. > - Should we revisit "naked" descriptor references, allowing them to > point to either bare Utf8 entries or Class constants and > MethodType/[something else] constants? Do we try to migrate the world > away from naked descriptor references? I think this may well fall out of the "trees vs strings" discussion. > I'm appealing here to a design principle that seems to have driven the original constant pool design: Class constants are for things that get resolved (and can be cached); descriptor strings are little more than fancy names. This principle doesn't always get followed: the verifier sometimes loads classes named by descriptors; array type class constants resolve their element types without a separate entry; more recently, StackMapTables use Class constants to represent types, and MethodTypes resolve method descriptors "as if" there were class constants for all of the parameter types. But I think these, especially the recent ones, are mistakes, and I still think the original notion is a useful separation of concerns that we should try to follow in our design.
The tension that comes up here is that we want to be able to match descriptors between clients and declarations. I don't want to invent one way to describe class constants for species, and another way to embed species in descriptors. Now, it may be possible (depending on our translation strategy) that we don't need to embed species in descriptors, because we're just going to erase descriptors, and put the specialization information somewhere else, for the VM to use opportunistically. That would make the splitting strategy more appealing. > - For bare descriptors (type of a field), it's fine to use something like "LList[QVal;];". Or maybe it's useful to describe descriptors in terms of Class/Species constants. In any case, there's still a need to figure out how to parameterize a descriptor with live constants ("LList[$T];"), but I think this can be set aside as a separate problem. This is the one I'm alluding to above. > So I think we need CONSTANT_SpecializedMethodref, which has 1) a pointer to a Methodref constant, and 2) pointers to some resolvable constants (typically, but maybe not exclusively, representing types). (Caveat: there are some details about the interaction between type arguments, overriding, and method resolution that I'm hand-waving about. Maybe the encoding will be stacked a little differently.) We've been around this merry-go-round a few times too, going back and forth between cramming stuff into the descriptor string and putting the method types somewhere else. Again, the translation story (can we leave descriptors alone) impinges on this. Don't forget that when you have a local generic class nested in a generic method, the method args implicitly parameterize the nested class. Which means that when we refer to a species of the local class, we have to supply the type arguments for both the method and for the local class (and any other enclosing classes.)
Again, there is a lump/split choice here; we can smoosh together the arguments, or provide a trail of witnesses to the enclosing arguments. If we choose the latter, then it might be a mix of C_SMRef and C_Species. From daniel.smith at oracle.com Mon Jun 15 20:54:14 2020 From: daniel.smith at oracle.com (Dan Smith) Date: Mon, 15 Jun 2020 14:54:14 -0600 Subject: Evolving CONSTANT_Class In-Reply-To: <75875e7d-7166-ef75-440f-570dcc9ee0f9@oracle.com> References: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com> <75875e7d-7166-ef75-440f-570dcc9ee0f9@oracle.com> Message-ID: <967A41F1-BDC6-4776-B2E1-43831B675CFD@oracle.com> > On Jun 15, 2020, at 1:28 PM, Brian Goetz wrote: > >> Another way to handle it is to distinguish between a *species*, which is a class-like entity, and a *species type*. It's helpful to remember that there may be inline types of species (that is, a "Q envelope" of a species). > > I think this is a fruitful direction; I can have `ArrayList[T] extends List[T]` where it is a class-like use, and I can have `Foo[T].x` where it is a type-like use. Concretely, what does this mean for the class file? Are you suggesting that 'List[T]' and 'Foo[T]', above, should have different encodings? Or at least represent different entities? What seems attractive to me for now is that we have CONSTANT_Species for the first one, and some sort of type encoding (probably referencing CONSTANT_Species) for the second one. >> 1) Treat everything in the class/interface table as a degenerate use of a type. A class name is always interpreted as an L type. > > Given that a specializable class Foo gives rise to species Foo[x] and Foo[y], _and_ a class type Foo such that Foo[t] <: Foo for all t, the duality between class and type here seems inevitable. There *are* two concepts here. That seems inevitable. But it's possible that, as a lumping move, we'll *encode* all class-flavored uses as types, and then infer the intended class from whatever type gets used.
So, e.g.: a CONSTANT_Class encodes a type, full stop. 'this_class' refers to a type that is the type of 'this' in the current class. 'new' refers to a type that is the class type of a new class instance. NestHost refers to the type of 'this' for the class that acts as the nest host. Etc. >> - How do we handle type variables, both top-level and nested? Either we embed constant pool pointers in Utf8 entries (yuck!), or we need to extend Class constants to support references both to Utf8 entries and to [some new thing]. > > This is the stringy-vs-tree problem we've been wrestling with for a long time. The solution to this problem seems to hinge on the solution to that one. >> - Should we revisit "naked" descriptor references, allowing them to point to either bare Utf8 entries or Class constants and MethodType/[something else] constants? Do we try to migrate the world away from naked descriptor references? > > I think this may well fall out of the "trees vs strings" discussion. Without getting in the weeds on "trees vs. strings", let's just assume we come up with a solution. That solution is very likely not going to embed constant pool pointers in a Utf8 (because tools that manipulate constant pool pointers would be sad to be in the business of parsing/rewriting Utf8 strings). The solution is thus going to need at least 4 bytes (two pointers), so it can express "List[T]" with some encoding of "List" and a pointer to T. The implication is that it's a new flavor of constant. Call that CONSTANT_SpecializedDescriptor. So, to rephrase my questions in terms of the class file format: - What does checkcast point to? A CONSTANT_Class is already allowed. We need to add either CONSTANT_SpecializedDescriptor, or , or CONSTANT_Class, where CONSTANT_Class can then point to a CONSTANT_SpecializedDescriptor. - What does the descriptor_index of a field_info point to? A Utf8 is already allowed. CONSTANT_SpecializedDescriptor seems like a natural fit, too.
What about CONSTANT_Class instead, or in addition? Is it a "bug" that descriptor_index can't be a CONSTANT_Class already, or is that an intentional design choice? - What does the descriptor_index of a method_info point to? Same questions, except CONSTANT_MethodType seems to be the analog to CONSTANT_Class here. Or maybe we want to invent a new analog. >> I'm appealing here to a design principle that seems to have driven the original constant pool design: Class constants are for things that get resolved (and can be cached); descriptor strings are little more than fancy names. This principle doesn't always get followed: the verifier sometimes loads classes named by descriptors; array type class constants resolve their element types without a separate entry; more recently, StackMapTables use Class constants to represent types, and MethodTypes resolve method descriptors "as if" there were class constants for all of the parameter types. But I think these, especially the recent ones, are mistakes, and I still think the original notion is a useful separation of concerns that we should try to follow in our design. > > The tension that comes up here is that we want to be able to match descriptors between clients and declarations. I don't want to invent one way to describe class constants for species, and another way to embed species in descriptors. But this is what the class file has already done! There's the descriptor 'Ljava/lang/Object;', and the constant CONSTANT_Class('java/lang/Object'). An over-arching thing here is whether we think that dual encoding is a mistake, or whether it's a feature of the design. My take is that CONSTANT_Classes (along with, say, CONSTANT_Methodrefs) are designed for resolution, while Utf8 descriptors are designed for matching. Whatever we want to do about descriptors, I think we should at least have a species encoding that is designed for resolution. (Of course, we can define a resolution algorithm that can handle any encoding. 
But the idea of breaking up steps of resolution into separate constant pool pointers seems quite useful, directly encoding the "resolution tree" that gets activated when you ask to resolve the species.) > Now, it may be possible (depending on our translation strategy) that we don't need to embed species in descriptors, because we're just going to erase descriptors, and put the specialization information somewhere else, for the VM to use opportunistically. That would make the splitting strategy more appealing. Back to my taxonomy in the first mail, we really need up to three things: - A resolvable encoding of the species itself (e.g., for 'new') - A resolvable encoding of the species type (e.g., for 'checkcast' or as a type argument) - A descriptor-like encoding of the species type (e.g., for 'field_info' and CONSTANT_NameAndType) Some of these you may be able to remove from the requirements list, but I don't think that gets you very far. > Don't forget that when you have a local generic class nested in a generic method, the method args implicitly parameterize the nested class. Which means that when we refer to a species of the local class, we have to supply the type arguments for both the method and for the local class (and any other enclosing classes.) Again, there is a lump/split choice here; we can smoosh together the arguments, or provide a trail of witnesses to the enclosing arguments. If we choose the latter, then it might be mix of C_SMRef and C_Species. Yeah, if we don't flatten these nests into a top-level class with a long list of type arguments, the outer class/method is one more step in the resolution algorithm that would map nicely to one more pointer in the constant pool encoding. From daniel.smith at oracle.com Tue Jun 16 18:13:52 2020 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 16 Jun 2020 12:13:52 -0600 Subject: EG meeting, 2020-06-17 Message-ID: The next EG Zoom meeting is tomorrow, 4pm UTC (9am PDT, 12pm EDT). 
I've volunteered to take over from David leading the EG discussions at these bi-weekly meetings. I'd like to try to focus our meetings a little better on specific topics that people have already had a chance to think about. Generally, I think this means we should draw agenda items from recent emails to the list. With that in mind, here are some threads that are potential topics for tomorrow's meeting. If you want to add something, send an email about it!: - "Evolving CONSTANT_Class": I described some design choices involving constant pool encodings of types and classes. I'd love to make some progress here, let's discuss. - "The fate of int, int.ref, and Integer": Kevin asked some questions about terminology surrounding primitives and the plans for wrapper classes. I think this is settled, other than the fact that we owe a fleshed-out proposal detailing our plans for wrapper classes. - "Valhalla basic concepts / terminology": I added some thoughts about language model terminology. Nothing specific to resolve here. From brian.goetz at oracle.com Tue Jun 16 21:35:12 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 16 Jun 2020 17:35:12 -0400 Subject: EG meeting, 2020-06-17 In-Reply-To: References: Message-ID: <12764284-49CF-45A1-9660-AC49C6AB46B7@oracle.com> Thanks Dan! > I'd like to try to focus our meetings a little better on specific topics that people have already had a chance to think about. Generally, I think this means we should draw agenda items from recent emails to the list. Good plan. > - "Evolving CONSTANT_Class": I described some design choices involving constant pool encodings of types and classes. I'd love to make some progress here, let's discuss. This seems a good place to start: it's been out there a while, and the concepts shouldn't depend TOO much on stuff that is still swirling.
From frederic.parain at oracle.com Wed Jun 17 13:16:32 2020 From: frederic.parain at oracle.com (Frederic Parain) Date: Wed, 17 Jun 2020 09:16:32 -0400 Subject: Evolving CONSTANT_Class In-Reply-To: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com> References: <8F687AE4-38AB-4578-9EA7-311D1385877E@oracle.com> Message-ID: <81D62B71-FE4E-4AB2-AC9D-F493AA8EFD47@oracle.com> This is probably a remnant of the L/Q model, where a value class could be referenced either as an L-type or a Q-type. L-Foo.x and Q-Foo.x pointed to the same place in the layout, so there wasn't a justification to make a L/Q distinction here. With LW3, L-Foo and Q-Foo are two different types, so FieldRef/MethodRef should definitely accept types there (with an explicit Q-envelope or an implicit L-envelope). Fred > On Jun 2, 2020, at 18:13, Dan Smith wrote: > > Fieldref/Methodref are an anomaly, where the current Valhalla design tries to maintain that it's a class/interface reference, not a type (e.g., we prohibit inline types here), even though array types can also be used. We might be happier embracing that the class_index of a Fieldref/Methodref is the type of the first argument and a type to search for members. (I'm on the fence about this.) From daniel.smith at oracle.com Wed Jun 17 19:34:27 2020 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 17 Jun 2020 13:34:27 -0600 Subject: EG meeting, 2020-06-17 In-Reply-To: References: Message-ID: > On Jun 16, 2020, at 12:13 PM, Dan Smith wrote: > > - "Evolving CONSTANT_Class": I described some design choices involving constant pool encodings of types and classes. I'd love to make some progress here, let's discuss. Here's a brief summary of the discussion (I'll take better notes next time): - John described some ideas he's exploring in the design of specialization.
The relevant point is that descriptors and Class constants may not need to express species types at all, leading to an approach where those things stay mostly unchanged, and new type information (maybe even 'Q' types?) is represented with new constants. - I asked some questions about the relative merits of approaches that enforce "new" types (e.g., for putfield) via i) static validation in the verifier; ii) dynamic checks at run time; or iii) tolerating pollution as a "slow path". - Dan H. expressed a preference for encodings that avoid overloading, particularly when it has the effect of checking flags and branching at run time. - Frederic had some concerns about putting things in the "class" bucket, rather than the "type" bucket, where there's useful information in the type. We talked about whether we like modeling fields/methods as members of classes or as members of types. - I talked about the distinction between resolution-optimized type encodings (CONSTANT_Class) and matching-optimized type encodings (Utf8 descriptors), arguing that it may make sense to preserve both styles as we evolve. My takeaway is that the requirements are in flux, so it's hard to draw firm conclusions right now, but that we are leaning towards something like CONSTANT_Type (#4 in my email) to be used where we need to resolve and dynamically act on "types" (checkcast, anewarray). (There may be a stronger distinction made between verification/descriptor types and "live", possibly client-specified, runtime types.) From daniel.smith at oracle.com Wed Jun 17 21:38:39 2020 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 17 Jun 2020 15:38:39 -0600 Subject: Evolving the wrapper classes Message-ID: Here's a concrete proposal for how we'll evolve the wrapper classes (Byte, Short, Integer, Long, Float, Double, Character, and Boolean) to be inline classes whose ".val" representations are (in the Java language) the primitive types. 
This has the effect of replacing boxing conversions in the Java language with lighter-weight reference conversions (no identity is imposed), and will facilitate specialization in the JVM by wrapping primitive values in lightweight inline class instances. Important concepts in this approach: - The wrapper classes are reference-default classes: 'Integer' is a '.ref' type. - In the language, 'int' is an alias for 'Integer.val'. These are the same type. - In the JVM, there are three distinct types: 'Ljava/lang/Integer;', 'Qjava/lang/Integer$val;', and 'I' The below outline feels pretty complete to me, as far as the core library/JVM/compiler components are concerned, and quite achievable. Please raise anything I'm overlooking (I'm sure there's something...). --- Step 1: Warnings In the near future, we implement a variety of warnings for clients of the wrapper classes who rely on features that will break when the wrapper classes are inline classes: Library changes: - The constructors, currently marked deprecated, are deprecated for removal. This should amplify warnings about their use. Java compiler changes: - Attempts to synchronize on or invoke wait/notify methods of expressions with wrapper class static types produce a new warning. - Possibly, uses of '==', 'identityHashCode', or 'clone' on these expressions produce a warning. - Possibly, any uses of 'getClass' that compare with '==' to wrapper class literals produce a warning. JVM changes: - Possibly, runtime warnings occur mimicking some of the compiler warnings, but using runtime types. (Note that all of these warnings may also apply to Value-based Classes. The wrapper classes happen to fall short of the value-based class requirements in their factories' guarantees about identity; these rules about factories and equality are probably unnecessary limitations, given the current deterministic behavior of acmp.)
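To make the Step 1 warning targets concrete, here is a minimal sketch of client code that should keep working once the wrapper classes lose identity. The names (`WrapperIdentitySketch`, `sameValue`, `Counter`) are made up for illustration and are not part of the proposal; the comments point at the identity-sensitive alternatives that the warnings listed above are aimed at.

```java
public class WrapperIdentitySketch {

    // Compares by value, so it does not depend on wrapper identity.
    // The identity-sensitive alternative, 'a == b' on two Integer
    // references, is one of the operations Step 1 may warn about.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Guards its state with a dedicated lock object. Synchronizing on a
    // wrapper instance (e.g. 'synchronized (someInteger)') is what the
    // proposed compiler warning targets.
    static final class Counter {
        private final Object lock = new Object();
        private int count;

        int increment() {
            synchronized (lock) {
                return ++count;
            }
        }
    }

    public static void main(String[] args) {
        // Factories rather than the deprecated constructors.
        Integer a = Integer.valueOf(1000);
        Integer b = Integer.valueOf(1000);
        System.out.println(sameValue(a, b)); // true

        Counter counter = new Counter();
        counter.increment();
        System.out.println(counter.increment()); // 2
    }
}
```

Code written in this style makes no assumption about whether two equal wrapper instances are the same object, which is the assumption the later steps take away.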
--- Step 2: Preview Feature When, or sometime after, we ship inline classes as a preview feature, we support treating the wrapper classes as inline. JVM changes (when --enable-preview is set): - References to java/lang/Integer and java/lang/Integer$val are hacked to load special class files corresponding to the .ref and .val types of inline class Integer. - The type [I is considered by the verifier to be equivalent to [java/lang/Integer$val. Array operations (aaload, iaload, etc.) support this. Java language/compiler changes (when --enable-preview is set): - The class file reader knows how to find the special Integer.class and Integer$val.class. - The type 'Integer.val' is equivalent to 'int'. Primitive types are inline types: they have members, support method invocation, etc. - Where necessary (depending on the operations being performed), the compiler generates conversions between 'I' and 'java/lang/Integer$val'. 'I' is preferred wherever possible. - Boxing can be specified with stronger guarantees about '=='. --- Step 3: Standard Feature When we're ready to leave preview, we'll need to raise the profile of "those things we warned about are going to blow up now!". Library changes: - The wrapper classes are declared in source as reference-default inline classes. - The constructors are removed, replaced with private constructors. JVM changes: - Wrapper classes are loaded using standard processes. Java language/compiler changes: - The wrapper classes have special permission to declare fields of their own type. - Wrapper classes are read using standard processes. From brian.goetz at oracle.com Wed Jun 17 22:13:28 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 17 Jun 2020 18:13:28 -0400 Subject: Evolving the wrapper classes In-Reply-To: References: Message-ID: <23c14a8b-d2aa-5ef7-0d28-d7ccd8bc3a15@oracle.com> This is pretty much what I was expecting.
A few comments: > Step 1: Warnings A dynamic warning, initially only activated by opt-in, when someone attempts to synchronize on an instance of a wrapper class. There's a changeset in review now for 16. > - Where necessary (depending on the operations being performed), the compiler generates conversions between 'I' and 'java/lang/Integer$val'. 'I' is preferred wherever possible. We have to use QInteger$val whenever we use int as a type parameter; the rest of the time, we can use I. > Library changes: > - The constructors are removed, replaced with private constructors. This can happen earlier if we want; we can just remove them (after suitable DFR.) There are factories. From kevinb at google.com Wed Jun 17 22:30:37 2020 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 17 Jun 2020 15:30:37 -0700 Subject: Evolving the wrapper classes In-Reply-To: References: Message-ID: Hmm, just maybe this will be less confusing than I was fearing. I'm seeing now that "Integer is the real class, int is alias for Integer.val" is a whole lot cleaner than "int becomes a val-default class and Integer is demoted to alias for int.ref", which for some reason was the way I was thinking of it. On Wed, Jun 17, 2020 at 2:39 PM Dan Smith wrote: Here's a concrete proposal for how we'll evolve the wrapper classes (Byte, > Short, Integer, Long, Float, Double, Character, and Boolean) to be inline > classes whose ".val" representations are (in the Java language) the > primitive types. > > This has the effect of replacing boxing conversions in the Java language > with lighter-weight reference conversions (no identity is imposed), and > will facilitate specialization in the JVM by wrapping primitive values in > lightweight inline class instances. > > Important concepts in this approach: > > - The wrapper classes are reference-default classes: 'Integer' is a '.ref' > type. > - In the language, 'int' is an alias for 'Integer.val'. These are the same > type.
- In the JVM, there are three distinct types: 'Ljava/lang/Integer;', > 'Qjava/lang/Integer$val;', and 'I' > > The below outline feels pretty complete to me, as far as the core > library/JVM/compiler components are concerned, and quite achievable. Please > raise anything I'm overlooking (I'm sure there's something...). > > --- > > Step 1: Warnings > > In the near future, we implement a variety of warnings for clients of the > wrapper classes who rely on features that will break when the wrapper > classes are inline classes: > Cool, these would have been nice to have years ago anyway. :-) We have several of them on for Google code already and I'll see if we can roll out the rest to see what kinds of things blow up. > Library changes: > - The constructors, currently marked deprecated, are deprecated for > removal. This should amplify warnings about their use. > > Java compiler changes: > - Attempts to synchronize on or invoke wait/notify methods of expressions > with wrapper class static types produce a new warning. > - Possibly, uses of '==', 'identityHashCode', or 'clone' on these > expressions produce a warning. > - Possibly, any uses of 'getClass' that compare with '==' to wrapper class > literals produce a warning. > > JVM changes: > - Possibly, runtime warnings occur mimicking some of the compiler > warnings, but using runtime types. > > (Note that all of these warnings may also apply to Value-based Classes. > The wrapper classes happen to fall short of the value-based class > requirements in their factories' guarantees about identity; these rules > about factories and equality are probably unnecessary limitations, given > the current deterministic behavior of acmp.) > > --- > > Step 2: Preview Feature > > When, or sometime after, we ship inline classes as a preview feature, we > support treating the wrapper classes as inline. 
> > JVM changes (when --enable-preview is set): > - References to java/lang/Integer and java/lang/Integer$val are hacked to > load special class files corresponding to the .ref and .val types of inline > class Integer. > - The type [I is considered by the verifier to be equivalent to > [java/lang/Integer$val. Array operations (aaload, iaload, etc.) support > this. > > Java language/compiler changes (when --enable-preview is set): > - The class file reader knows how to find the special Integer.class and > Integer$val.class. > - The type 'Integer.val' is equivalent to 'int'. Primitive types are > inline types?they have members, support method invocation, etc. > This at least *suggests *that `42L.hashCode()` would begin to work just as `"foo".hashCode()` does? > - Where necessary (depending on the operations being performed), the > compiler generates conversions between 'I' and 'java/lang/Integer$val'. 'I' > is preferred wherever possible. > - Boxing can be specified with stronger guarantees about '=='. > > --- > > Step 3: Standard Feature > > When we're ready to leave preview, we'll need to raise the profile of > "those things we warned about are going to blow up now!". > > Library changes: > - The wrapper classes are declared in source as reference-default inline > classes. > - The constructors are removed, replaced with private constructors. > > JVM changes: > - Wrapper classes are loaded using standard processes. > > Java language/compiler changes: > - The wrapper classes have special permission to declare fields of their > own type. > Fair. I actually see this as rather parallel to the way the class Object in Object.java gets to be its own superclass. I think I'm able to make sense of this plan. Users *can* write `Integer.val` in their code, but would there ever be a good reason to? I assume we would always prefer `int`. 
And this actually makes me wonder if it's worth considering also allowing `int.ref` to be an alias for `Integer` because it would allow users to drop the word `Integer` from their code more completely, and therefore `int` would look more and more like it was just an inline type like any other. It reminds you that the old boxing/unboxing isn't in play anymore. And `int.ref` is more self-evidently something you can't synchronize on, etc. But, what would remain weird is that you don't *actually *find a val-default class called `int` sitting in an `int.java` file. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com From kevinb at google.com Wed Jun 17 22:34:42 2020 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 17 Jun 2020 15:34:42 -0700 Subject: The fate of int, int.ref, and Integer In-Reply-To: References: <66761060-44ce-c1d5-4226-9e4e2f75c180@oracle.com> Message-ID: Just want to record in *this* thread that the new thread "Evolving the wrapper classes " is easing some of my confusion and some of my concerns; anyone coming across this should probably go there now. On Sat, Jun 6, 2020 at 12:10 PM John Rose wrote: > On Jun 5, 2020, at 5:43 AM, Brian Goetz wrote: > > > The move of saying `Integer` *is* `int.ref` makes these problems go away. > This seems too good to pass up preemptively. > > > I agree. And this leads us into a maze of twisty passages. > Full of uninsulated electrified wires and third rails to avoid. > > One tactic I like to get through the maze is to find a way to > add some ad hoc polymorphism to Integer, while keeping it > sealed up as a final class, like today. It seems to need to cover > both good old identity objects (like new Integer(42)) and also > new inline objects ((Integer)42, boxed as before, but via a new > subtype relation). This means there are two object types > floating around, identity-Integer and int, plus an API type, > Integer-as-super. 
I think the notion of species (as a finer grained subdivision of types, under class) can be used to create the necessary distinctions, without introducing a lot of new types, and breaking reflection.
>
> In the spirit of brainstorming, here are more details on such a path, one that might lead us through the maze.
>
> Given this:
>
>     int.ref id = new Integer(42); //identity object
>     int.val x = id;
>     int.val y = 42;
>     int.ref z = y;
>
> ...we could choose to arrange things so that all of x, y, z are inline objects (true "ints"), while id retains its special flavor. Also, Object.getClass could report Integer.class for *all* of those values (even y). This could be justified by revealing "int" as, not a class, but a *species* of Integer. So that Object.getSpecies would report the further details: For id it is the version of Integer which holds identity (which doesn't need a name I suppose) and for x/y/z it reports the species reflector for int.
>
> If we further use a muddied java.lang.Class to continue to represent non-classes like "int", we have to double down on the idea of a "crass", or "runtime class-like type". In that case we can have getSpecies return a crass, and then:
>
>     assert 42.getClass() == Integer.class;
>     assert 42.getSpecies() == int.class;
>     assert new Integer(42).getSpecies() == (something else);

-- Kevin Bourrillion | Java Librarian | Google, Inc.
| kevinb at google.com

From daniel.smith at oracle.com Wed Jun 17 22:37:18 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 17 Jun 2020 16:37:18 -0600
Subject: Evolving the wrapper classes
In-Reply-To: <23c14a8b-d2aa-5ef7-0d28-d7ccd8bc3a15@oracle.com>
References: <23c14a8b-d2aa-5ef7-0d28-d7ccd8bc3a15@oracle.com>
Message-ID: <95A23D33-8D46-4E79-B28F-CF6BFF5CF7AC@oracle.com>

> On Jun 17, 2020, at 4:13 PM, Brian Goetz wrote:
>
>> - Where necessary (depending on the operations being performed), the compiler generates conversions between 'I' and 'java/lang/Integer$val'. 'I' is preferred wherever possible.
>
> We have to use QInteger$val whenever we use int as a type parameter, the rest of the time, we can use I.

Right. Specifically, if we support inline types as type arguments before we get to specialization, we'll use erasure, which looks like:

    new java/util/ArrayList; // dup, init
    astore 1
    aload 1
    iconst_0
    invokestatic Qjava/lang/Integer$val;.(I)Qjava/lang/Integer$val; // <-- conversion
    invokevirtual java/util/ArrayList.add:(Ljava/lang/Object;)Z
    pop
    aload 1
    iconst_0
    invokevirtual java/util/ArrayList.get(I)Ljava/lang/Object;
    checkcast Qjava/lang/Integer$val;
    invokevirtual Qjava/lang/Integer$val;.intValue()I // <-- conversion

And then there's also instance method invocations:

    iconst_0
    invokestatic Qjava/lang/Integer$val;.(I)Qjava/lang/Integer$val; // <-- conversion
    invokevirtual Qjava/lang/Integer$val;.floatValue()F

(Note that none of these conversions are "boxing" or "unboxing". They're strictly compilation artifacts. It may be useful to come up with a new word for them.)

From daniel.smith at oracle.com Wed Jun 17 22:44:57 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 17 Jun 2020 16:44:57 -0600
Subject: Evolving the wrapper classes
In-Reply-To:
References:
Message-ID: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com>

> On Jun 17, 2020, at 4:30 PM, Kevin Bourrillion wrote:
>
>> - The type 'Integer.val' is equivalent to 'int'.
Primitive types are inline types: they have members, support method invocation, etc.
>
> This at least suggests that `42L.hashCode()` would begin to work just as `"foo".hashCode()` does?

Yep, that's what I mean. Member accesses are now allowed (assuming the member you're looking for exists) for all types.

> Users can write `Integer.val` in their code, but would there ever be a good reason to? I assume we would always prefer `int`. And this actually makes me wonder if it's worth considering also allowing `int.ref` to be an alias for `Integer` because it would allow users to drop the word `Integer` from their code more completely, and therefore `int` would look more and more like it was just an inline type like any other. It reminds you that the old boxing/unboxing isn't in play anymore. And `int.ref` is more self-evidently something you can't synchronize on, etc. But, what would remain weird is that you don't actually find a val-default class called `int` sitting in an `int.java` file.

Agreed, I think it would be reasonable to consider both i) supporting 'int.ref' as type syntax, and ii) prohibiting 'Integer.val'.

Although, redundancies aside, if you have some discomfort with bare 'Integer' being a reference type for an inline class, you may have similar discomfort with our "reference default" story in general (e.g., 'LocalDateTime' will have the same properties).

From brian.goetz at oracle.com Wed Jun 17 22:45:06 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 17 Jun 2020 18:45:06 -0400
Subject: Evolving the wrapper classes
In-Reply-To:
References:
Message-ID:

> Hmm, just maybe this will be less confusing than I was fearing. I'm seeing now that "Integer is the real class, int is alias for Integer.val" is a whole lot cleaner than "int becomes a val-default class and Integer is demoted to alias for int.ref", which for some reason was the way I was thinking of it.
I'm sure that, sometime during the evolution, I probably said it the way you remember. Then Dan came along and cleaned it up :)

> Java language/compiler changes (when --enable-preview is set):
> - The class file reader knows how to find the special Integer.class and Integer$val.class.
> - The type 'Integer.val' is equivalent to 'int'. Primitive types are inline types: they have members, support method invocation, etc.
>
> This at least /suggests/ that `42L.hashCode()` would begin to work just as `"foo".hashCode()` does?

We certainly have that option. We could decide to not take it, on the theory that it scares the neighbors, but it does seem sensible to just say "0L is a long-valued expression" and "long implements XYZ interfaces", and let the neighbors be scared for a few minutes.

> Users /can/ write `Integer.val` in their code, but would there ever be a good reason to? I assume we would always prefer `int`. And this actually makes me wonder if it's worth considering also allowing `int.ref` to be an alias for `Integer` because it would allow users to drop the word `Integer` from their code more completely, and therefore `int` would look more and more like it was just an inline type like any other. It reminds you that the old boxing/unboxing isn't in play anymore. And `int.ref` is more self-evidently something you can't synchronize on, etc. But, what would remain weird is that you don't /actually/ find a val-default class called `int` sitting in an `int.java` file.

Right, there's a set of pros and cons here, none of which are technical. Being able to say `int.ref` makes it more clear that `int` and `Point` are the same thing, but on the other hand, it raises issues of "there are two ways to say the same thing, so let's have an endless debate about it."
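For comparison, here is how the `42L.hashCode()` question looks in Java as it stands today, where a primitive literal has no members and the call has to go through the wrapper class. This is only a sketch of current behavior, not of the proposal: the class name `PrimitiveMembersToday` and its two helper methods are made up for illustration, and `42L.hashCode()` itself does not compile today.

```java
public class PrimitiveMembersToday {
    // Today's spelling of what the thread writes as 42L.hashCode():
    // a static helper on the wrapper class, called directly on the primitive.
    static int viaStaticHelper(long value) {
        return Long.hashCode(value);
    }

    // The boxed spelling: convert to a Long reference first,
    // then invoke the instance method on it.
    static int viaBox(long value) {
        return Long.valueOf(value).hashCode();
    }

    public static void main(String[] args) {
        // Both spellings are specified to agree for every long value.
        System.out.println(viaStaticHelper(42L) == viaBox(42L));
    }
}
```

Under the option discussed above, both spellings would presumably collapse into a plain member access on the `long` expression.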
From Tobi_Ajila at ca.ibm.com Fri Jun 19 15:59:08 2020 From: Tobi_Ajila at ca.ibm.com (Tobi Ajila) Date: Fri, 19 Jun 2020 11:59:08 -0400 Subject: Evolving the wrapper classes In-Reply-To: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> Message-ID: Hi Dan S. >>> - Where necessary (depending on the operations being performed), the compiler generates conversions between 'I' and 'java/lang/Integer$val'. 'I' is preferred wherever possible. >> >> We have to use QInteger$val whenever we use int as a type parameter, the rest of the time, we can use I. ... > (Note that none of these conversions are "boxing" or "unboxing". They're strictly compilation artifacts. It may be useful to come up with a new word for them.) Given your examples can we assume that the JVM will never need to do an implicit `Qjava/lang/Integer$val;` to `I` conversion? And these will always be explicit conversions performed by javac? > - The type [I is considered by the verifier to be equivalent to [java/lang/Integer$val. Array operations (aaload, iaload, etc.) support this. Could you please explain the motivation behind this? Specifically, in which cases are iaload and aaload operations both performed on `[I` ? If `I` and `Qjava/lang/Integer$val;` will require explicit javac conversions, shouldn't `[I` and `[java/lang/Integer$val` also? Regards, --Tobi From brian.goetz at oracle.com Fri Jun 19 16:04:35 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 19 Jun 2020 12:04:35 -0400 Subject: Evolving the wrapper classes In-Reply-To: References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> Message-ID: <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> > Given your examples can we assume that the JVM will never need to do an implicit `Qjava/lang/Integer$val;` to `I` conversion? And these will always be explicit conversions performed by javac? > Correct. 
>> - The type [I is considered by the verifier to be equivalent to [java/lang/Integer$val. Array operations (aaload, iaload, etc.) support this.
>
> Could you please explain the motivation behind this? Specifically, in which cases are iaload and aaload operations both performed on `[I` ?
>
> If `I` and `Qjava/lang/Integer$val;` will require explicit javac conversions, shouldn't `[I` and `[java/lang/Integer$val` also?

Because arrays have identity (not to mention potentially large copying costs), there is simply no reasonable conversion we can define; any "conversion" would involve copying all the data, changing identity, or both. Just as with the array subtyping requirements (Point[] <: Point.ref[] <: Object[]), these are things only the VM can do for us.

From Tobi_Ajila at ca.ibm.com Fri Jun 19 17:07:25 2020
From: Tobi_Ajila at ca.ibm.com (Tobi Ajila)
Date: Fri, 19 Jun 2020 13:07:25 -0400
Subject: Evolving the wrapper classes
In-Reply-To: <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com>
Message-ID:

> Because arrays have identity (not to mention potentially large copying costs), there is simply no reasonable conversion we can define; any "conversion" would involve copying all the data, changing identity, or both. Just as with the array subtyping requirements (Point[] <: Point.ref[] <: Object[]), these are things only the VM can do for us.

I suspected that this was likely due to the large cost of converting between `[I` and `[java/lang/Integer$val`. However, I am still a little unclear as to what the motivation is for this. Is this solely for specialized generics?

In Dan's examples with `I` and `java/lang/Integer$val`, the only places where conversions are needed are when primitives are used as type parameters or to call instance methods on them, both of which can already be done with primitive arrays.
So in the LW3 - LW20 timeframe would we have any need for these conversions? If so, could you provide some examples?

In the case of specialized generics, is the intention that `[I` (and I suppose `I` as well) will appear in generic code?

From daniel.smith at oracle.com Fri Jun 19 17:32:38 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 19 Jun 2020 11:32:38 -0600
Subject: Evolving the wrapper classes
In-Reply-To:
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com>
Message-ID: <754AE337-F271-476E-ADAF-FB648E92757F@oracle.com>

> On Jun 19, 2020, at 11:07 AM, Tobi Ajila wrote:
>
> I am still a little unclear as to what the motivation is for this. Is this solely for specialized generics?
>
> In Dan's examples with `I` and `java/lang/Integer$val`, the only places where conversions are needed are when primitives are used as type parameters or to call instance methods on them, both of which can already be done with primitive arrays. So in the LW3 - LW20 timeframe would we have any need for these conversions? If so, could you provide some examples?

I think it comes down to specialization and subtyping.

Pre-specialization, here's one example that uses subtyping:

    int[] arr = { 1 };
    Object[] objs = arr; // just like Point[] <: Object[]
    Object obj = objs[0];
    Integer i = (Integer) obj;

This would compile to something like:

    iconst_1
    newarray T_INT
    dup
    iconst_0
    iconst_1
    iastore
    astore_0

    aload_0
    astore_1

    aload_1
    iconst_0
    aaload
    astore_2

    aload_2
    checkcast java/lang/Integer
    astore_3

Going in the other direction (allocating a [Qjava/lang/Integer; and then using iaload/iastore on it) may not be necessary unless/until the language supports "new T[]" in specialized code, but it tentatively makes sense to support now anyway, rather than having to come back and fix it up later.

> In the case of specialized generics, is the intention that `[I` (and I suppose `I` as well) will appear in generic code?
If you mean "can '[' be specialized to '[I'?", the answer is no. The primitive types cannot act as type arguments.

From daniel.smith at oracle.com Fri Jun 19 17:43:22 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 19 Jun 2020 11:43:22 -0600
Subject: Evolving the wrapper classes
In-Reply-To: <754AE337-F271-476E-ADAF-FB648E92757F@oracle.com>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <754AE337-F271-476E-ADAF-FB648E92757F@oracle.com>
Message-ID:

PSA: this thread has been polluted with the address:

valhalla-spec-experts

Which just generates admin notifications. Please delete that address from any replies. :-)

From Tobi_Ajila at ca.ibm.com Fri Jun 19 18:12:08 2020
From: Tobi_Ajila at ca.ibm.com (Tobi Ajila)
Date: Fri, 19 Jun 2020 14:12:08 -0400
Subject: Evolving the wrapper classes
In-Reply-To: <754AE337-F271-476E-ADAF-FB648E92757F@oracle.com>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <754AE337-F271-476E-ADAF-FB648E92757F@oracle.com>
Message-ID:

Thanks for the example Dan, this "Object[] objs = arr; // just like Point[] <: Object[]" makes it very clear. Brian's response makes more sense to me now.

From brian.goetz at oracle.com Fri Jun 19 18:18:09 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 19 Jun 2020 14:18:09 -0400
Subject: Evolving the wrapper classes
In-Reply-To:
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com>
Message-ID: <6c0763af-9029-534a-e3ab-219500326964@oracle.com>

Zooming out, what we've been trying to do is shake out the places where the JVM treats primitives and references differently, and align them, so that we are able to broaden the approach of "generics erase T to Object" to include inlines and primitives. The war cry might be:
    Object is the new Any

L-World does much of this for inlines, but we don't want to leave primitives out in the cold in the programming model; being able to get good behavior for Foo but not the same for Foo would be a missed opportunity to provide a uniform programming model. Much of this is either handled by existing L-World behavior (e.g., behavior of ==), but this seam is one that needs to be covered. We can cover some in the static compiler (conversions between I and Qint) but when it comes to arrays, the invariance of arrays would expose our tricks, and we'd have to have awful restrictions like "you can't use arrays in generics."

Note that [I and [QInteger$val have the exact same layout, so it is really a matter of treating the two type names as referring to the same underlying runtime type.
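The array argument in this part of the thread can be seen in today's Java, where reference arrays can be widened without copying but `int[]` cannot. The following is a minimal sketch of current behavior only, not of the proposed `[I` / `[QInteger$val` equivalence; the class and method names are invented for illustration.

```java
import java.util.Arrays;

public class ArrayIdentityToday {
    // Reference arrays are covariant: the same array object can be viewed
    // as Object[] without copying, so its identity is preserved.
    static boolean sameIdentityAfterWidening(Integer[] boxed) {
        Object[] objs = boxed;
        return objs == boxed;
    }

    // int[] has no such relationship today: the only way to obtain an
    // Integer[] is to copy every element into a freshly allocated array.
    static Integer[] copyToBoxed(int[] values) {
        return Arrays.stream(values).boxed().toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3};
        Integer[] copy = copyToBoxed(ints);
        copy[0] = 99; // writes to the copy only
        System.out.println(ints[0]); // the original array is untouched
        System.out.println(sameIdentityAfterWidening(copy));
    }
}
```

Because the copy is a different object, a write through one array is invisible through the other, which is why a compiler-inserted conversion cannot stand in for the VM treating the two array types as the same runtime type.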
From forax at univ-mlv.fr Fri Jun 19 18:54:15 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 19 Jun 2020 20:54:15 +0200 (CEST)
Subject: Evolving the wrapper classes
In-Reply-To: <6c0763af-9029-534a-e3ab-219500326964@oracle.com>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com>
Message-ID: <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Tobi Ajila"
> Cc: "valhalla-spec-experts" , "valhalla-spec-experts"
> Sent: Friday, 19 June 2020 20:18:09
> Subject: Re: Evolving the wrapper classes

> Zooming out, what we've been trying to do is shake out the places where the JVM treats primitives and references differently, and aligning them, so that we are able to broaden the approach of "generics erase T to Object" to include inlines and primitives. The war cry might be:
>
>     Object is the new Any
>
> L-World does much of this for inlines, but we don't want to leave primitives out in the cold in the programming model; being able to get good behavior for Foo but not the same for Foo would be a missed opportunity to provide a uniform programming model. Much of this is either handled by existing L-World behavior (e.g., behavior of ==), but this seam is one that needs to be covered. We can cover some in the static compiler (conversions between I and Qint) but when it comes to arrays, the invariance of arrays would expose our tricks, and we'd have to have awful restrictions like "you can't use arrays in generics."
>
> Note that [I and [QInteger$val have the exact same layout, so it is really a matter of treating the two type names as referring to the same underlying runtime type.

Yes, but at the same time descriptors are matched by name and you need to have the proper descriptor when overriding/implementing a method, so the strategy of blindly replacing every I by QInteger$val; doesn't really work.
Usually the solution is to use bridges, but bridges only work with a subtyping relationship, not an equivalence relationship (because you can travel in both directions). I believe we need to bring the forward/bridge-o-matic at the same time we retrofit primitives to inline.

Rémi
From brian.goetz at oracle.com Fri Jun 19 18:59:35 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 19 Jun 2020 14:59:35 -0400
Subject: Evolving the wrapper classes
In-Reply-To: <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr>
Message-ID:

> yes, but at the same time descriptor are matched by name and you need to have the proper descriptor when overriding/implementing a method, so the strategy of blindly replacing every I by QInteger$val; doesn't really work.

Who said blindly?

From forax at univ-mlv.fr Fri Jun 19 19:09:05 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 19 Jun 2020 21:09:05 +0200 (CEST)
Subject: Evolving the wrapper classes
In-Reply-To:
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr>
Message-ID: <2125272433.1507024.1592593745412.JavaMail.zimbra@u-pem.fr>

"Blindly" is perhaps too strong a word; let's say we have to come up with a plan, a good plan, and I fail to see how it can work with only the current bridge mechanism we have.

Rémi
From brian.goetz at oracle.com Fri Jun 19 19:13:12 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 19 Jun 2020 15:13:12 -0400 Subject: Evolving the wrapper classes In-Reply-To: <2125272433.1507024.1592593745412.JavaMail.zimbra@u-pem.fr> References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <2125272433.1507024.1592593745412.JavaMail.zimbra@u-pem.fr> Message-ID: I hope to surprise you positively! On 6/19/2020 3:09 PM, forax at univ-mlv.fr wrote: > Blindly is perhaps a word too strong, let say we have to come with a > plan, a good plan, > and i fail to see how it can work with only the current bridge > mechanism we have. > From daniel.smith at oracle.com Fri Jun 19 19:27:12 2020 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 19 Jun 2020 13:27:12 -0600 Subject: Evolving the wrapper classes In-Reply-To: <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> Message-ID: <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com> > On Jun 19, 2020, at 12:54 PM, Remi Forax wrote: > >> Note that [I and [QInteger$val have the exact same layout, so it is really a matter of treating the two type names as referring to the same underlying runtime type. > > yes, but at the same time descriptor are matched by name and you need to have the proper descriptor when overriding/implementing a method, > so the strategy of blindly replacing every I by QInteger$val; doesn't really work. > > Usually the solution is to use bridges but bridges only work with subtyping relationship not equivalence relationship (because you can travel in both direction). 
> I believe we need to bring the forward/bridge-o-matic at the same time we retrofit primitive to inline.

In the VM this is mostly a verification problem: have a '[Qjava/lang/Integer$val;', need a '[I'? You're good! ("Mostly", because there is still the matter of ensuring there's a single encoding for both kinds of objects, or that the instructions are capable of handling two different encodings.)

I'm not sure we'd get into any situations where a '([I)V' descriptor needs to override a '([Qjava/lang/Integer$val)V' descriptor, or vice versa, until we get to specialization, and then I'm not sure this is any different than other forms of bridging. All existing code will continue to use 'I' in its compiled descriptors.

From forax at univ-mlv.fr Fri Jun 19 19:51:00 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 19 Jun 2020 21:51:00 +0200 (CEST)
Subject: Evolving the wrapper classes
In-Reply-To: <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com>
Message-ID: <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "daniel smith"
> To: "Remi Forax"
> Cc: "Brian Goetz" , "valhalla-spec-experts"
> Sent: Friday, 19 June 2020 21:27:12
> Subject: Re: Evolving the wrapper classes

>> On Jun 19, 2020, at 12:54 PM, Remi Forax wrote:
>>
>>> Note that [I and [QInteger$val have the exact same layout, so it is really a matter of treating the two type names as referring to the same underlying runtime type.
>>
>> yes, but at the same time descriptor are matched by name and you need to have the proper descriptor when overriding/implementing a method, so the strategy of blindly replacing every I by QInteger$val; doesn't really work.
>>
>> Usually the solution is to use bridges but bridges only work with subtyping
>> relationship not equivalence relationship (because you can travel in both
>> direction).
>> I believe we need to bring the forward/bridge-o-matic at the same time we
>> retrofit primitive to inline.
>
> In the VM this is mostly a verification problem: have a
> '[Qjava/lang/Integer$val;', need a '[I'? You're good! ("Mostly", because there
> is still the matter of ensuring there's a single encoding for both kinds of
> objects, or that the instructions are capable of handling two different
> encodings.)
>
> I'm not sure we'd get into any situations where a '([I)V' descriptor needs to
> override a '([Qjava/lang/Integer$val;)V' descriptor, or vice versa, until we get
> to specialization,

covariant return type
interface I {
  int foo();
}
interface J {
  Object foo();
}
class A implements I, J {
  int foo();
}

with I.java compiled a long time ago.

> and then I'm not sure this is any different than other forms
> of bridging. All existing code will continue to use 'I' in its compiled
> descriptors.

if everything is compiled at the same time, there is no issue, otherwise you can create a loop.
Rémi

From daniel.smith at oracle.com Fri Jun 19 20:16:48 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 19 Jun 2020 14:16:48 -0600
Subject: Evolving the wrapper classes
In-Reply-To: <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com> <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr>
Message-ID: <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com>

> On Jun 19, 2020, at 1:51 PM, forax at univ-mlv.fr wrote:
>
> covariant return type
> interface I {
>   int foo();
> }
> interface J {
>   Object foo();
> }
> class A implements I, J {
>   int foo();
> }
>
> with I.java compiled a long time ago.

Nothing in this proposal changes how these classes are compiled. The following methods will exist:

I.foo()I
J.foo()Ljava/lang/Object;
A.foo()I
A.foo()Ljava/lang/Object; // bridge

I think you may be mixed up thinking that we'll sometimes translate 'int' in a descriptor to 'Ljava/lang/Integer$val;', but that's not the case. 'I' is preferred wherever possible.
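
A minimal sketch of the covariant-override bridging described above, runnable on a current JDK. Because an 'int foo()' overriding 'Object foo()' is rejected by today's compiler, this sketch substitutes 'Integer' for 'int' (an assumption made purely for illustration); the class name BridgeDemo and the helper fooDescriptors are hypothetical. It lists the 'foo' methods the compiler emits for A, with their descriptors, and marks the synthetic bridge.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class BridgeDemo {
    // Stand-ins for the thread's I, J and A; Integer replaces int (assumption).
    interface I { Integer foo(); }
    interface J { Object foo(); }

    static class A implements I, J {
        @Override
        public Integer foo() { return 42; }
    }

    /** Maps the descriptor of each declared foo method to whether it is a bridge. */
    static Map<String, Boolean> fooDescriptors(Class<?> c) {
        Map<String, Boolean> result = new TreeMap<>();
        for (Method m : c.getDeclaredMethods()) {
            if (!m.getName().equals("foo")) continue;
            // Rebuild the JVM method descriptor from the reflected signature.
            String descriptor = MethodType
                    .methodType(m.getReturnType(), m.getParameterTypes())
                    .toMethodDescriptorString();
            result.put(descriptor, m.isBridge());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints:
        //   A.foo()Ljava/lang/Integer;
        //   A.foo()Ljava/lang/Object; // bridge
        fooDescriptors(A.class).forEach((descriptor, bridge) ->
                System.out.println("A.foo" + descriptor + (bridge ? " // bridge" : "")));
    }
}
```

The two entries differ only in return type, which is why method lookup by name and descriptor needs the bridge to reach the real implementation from a call site compiled against J.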
From daniel.smith at oracle.com Fri Jun 19 20:20:26 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 19 Jun 2020 14:20:26 -0600
Subject: Evolving the wrapper classes
In-Reply-To: <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com>
References: <4784D849-69E4-47B7-BFF5-53A8483E086A@oracle.com> <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com> <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr> <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com>
Message-ID: <9859D11D-2325-4F1A-AA80-4360C0824B12@oracle.com>

> On Jun 19, 2020, at 2:16 PM, Dan Smith wrote:
>
>> On Jun 19, 2020, at 1:51 PM, forax at univ-mlv.fr wrote:
>>
>> covariant return type
>> interface I {
>>   int foo();
>> }
>> interface J {
>>   Object foo();
>> }
>> class A implements I, J {
>>   int foo();
>> }
>>
>> with I.java compiled a long time ago.
>
> Nothing in this proposal changes how these classes are compiled.

Oh wait, this would actually be a compiler error currently. So the fact that we allow it is new, but it works the same as other covariant override bridges. (The bridge *body* is a little different, because there's an I -> Object conversion needed.)
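
As a rough illustration of the bridge body mentioned above, the sketch below writes by hand what such a bridge might do. Everything here is an assumption for illustration: the names BridgeBodySketch and fooBridge are hypothetical, and the thread does not say how the I -> Object conversion would be performed; ordinary boxing via Integer.valueOf is used as a stand-in.

```java
public class BridgeBodySketch {
    static class A {
        // The real method: primitive return, descriptor ()I.
        int foo() { return 42; }

        // Hypothetical stand-in for the bridge with descriptor ()Ljava/lang/Object;.
        // A compiler-generated bridge would also be named foo; Java source cannot
        // declare two methods that differ only in return type, hence the new name.
        Object fooBridge() {
            // Assumed form of the I -> Object conversion: plain boxing.
            return Integer.valueOf(foo());
        }
    }

    public static void main(String[] args) {
        Object boxed = new A().fooBridge();
        // Prints: java.lang.Integer 42
        System.out.println(boxed.getClass().getName() + " " + boxed);
    }
}
```

An ordinary covariant bridge only forwards the call, since the narrower reference type is already a subtype of the wider one; here the value must additionally be converted before it can be returned as an Object.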
From forax at univ-mlv.fr Fri Jun 19 20:44:05 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 19 Jun 2020 22:44:05 +0200 (CEST)
Subject: Evolving the wrapper classes
In-Reply-To: <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com>
References: <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com> <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr> <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com>
Message-ID: <10669452.1518752.1592599445117.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "daniel smith"
> To: "Remi Forax"
> Cc: "Brian Goetz" , "valhalla-spec-experts"
> Sent: Friday, June 19, 2020 22:16:48
> Subject: Re: Evolving the wrapper classes

>> On Jun 19, 2020, at 1:51 PM, forax at univ-mlv.fr wrote:
>>
>> covariant return type
>> interface I {
>>   int foo();
>> }
>> interface J {
>>   Object foo();
>> }
>> class A implements I, J {
>>   int foo();
>> }
>>
>> with I.java compiled a long time ago.
>
> Nothing in this proposal changes how these classes are compiled. The following
> methods will exist:
>
> I.foo()I
> J.foo()Ljava/lang/Object;
> A.foo()I
> A.foo()Ljava/lang/Object; // bridge
>
> I think you may be mixed up thinking that we'll sometimes translate 'int' in a
> descriptor to 'Ljava/lang/Integer$val;', but that's not the case. 'I' is
> preferred wherever possible.

If the VM sees I as a subtype of Object, there is no need for a Ljava/lang/Integer$val; at all, it's better to have aload/astore/etc to work on I directly.
Rémi

From daniel.smith at oracle.com Fri Jun 19 21:01:34 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 19 Jun 2020 15:01:34 -0600
Subject: Evolving the wrapper classes
In-Reply-To: <10669452.1518752.1592599445117.JavaMail.zimbra@u-pem.fr>
References: <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com> <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr> <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com> <10669452.1518752.1592599445117.JavaMail.zimbra@u-pem.fr>
Message-ID: <7F202960-B147-40FD-B935-168AE29EA196@oracle.com>

> On Jun 19, 2020, at 2:44 PM, forax at univ-mlv.fr wrote:
>
> If the VM see I as a subtype of Object, there is no need for a Ljava/lang/Integer$val; at all, it's better to have aload/astore/etc to work on I directly.

That's initially somewhat attractive, but think about the implications for things like slot size. Primitives and inline objects are really two very different things in the current JVM design.

From brian.goetz at oracle.com Sun Jun 21 18:50:40 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 21 Jun 2020 14:50:40 -0400
Subject: Evolving the wrapper classes
In-Reply-To: <7F202960-B147-40FD-B935-168AE29EA196@oracle.com>
References: <712D2630-6718-4E2A-B64C-85F133C20B2B@oracle.com> <6c0763af-9029-534a-e3ab-219500326964@oracle.com> <1435018147.1504867.1592592855433.JavaMail.zimbra@u-pem.fr> <7C6DB9A4-FB38-4FC8-AB3D-CFE3045D8E5E@oracle.com> <1877293090.1513285.1592596260851.JavaMail.zimbra@u-pem.fr> <510D461F-292D-4742-9C77-5FCB68E6B959@oracle.com> <10669452.1518752.1592599445117.JavaMail.zimbra@u-pem.fr> <7F202960-B147-40FD-B935-168AE29EA196@oracle.com>
Message-ID: <4f81684d-e667-8269-c703-caaf50f36983@oracle.com>

Indeed.
We can blur the distinction between I and QInteger$val at the _language_ level -- where it matters -- but there's no return in trying to do so at the VM level. Except in the cases we already described -- array types -- where we cannot effectively blur it in the language.

On 6/19/2020 5:01 PM, Dan Smith wrote:
>> On Jun 19, 2020, at 2:44 PM, forax at univ-mlv.fr wrote:
>>
>> If the VM see I as a subtype of Object, there is no need for a Ljava/lang/Integer$val; at all, it's better to have aload/astore/etc to work on I directly.
> That's initially somewhat attractive, but think about the implications for things like slot size. Primitives and inline objects are really two very different things in the current JVM design.

From brian.goetz at oracle.com Tue Jun 23 21:11:45 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 23 Jun 2020 17:11:45 -0400
Subject: Erasure
Message-ID:

As background for some upcoming docs on specialized generics, I've posted a doc here

http://cr.openjdk.java.net/~briangoetz/valhalla/erasure.html

outlining the historical rationale for erasure in generics. Many of the same factors are going to come up again...

From daniel.smith at oracle.com Tue Jun 30 19:08:09 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Tue, 30 Jun 2020 13:08:09 -0600
Subject: EG meeting, 2020-06-30
Message-ID: <208E2EF9-6304-4B60-9A1E-D1F2F72DC935@oracle.com>

The next EG Zoom meeting is tomorrow, 4pm UTC (9am PDT, 12pm EDT). I'll be away, and Brian volunteered to lead the meeting. (I've heard from Tobi and Dan that they'll also be away.)

Recent threads that we may want to discuss:

- "Evolving the wrapper classes": I summarized the steps needed to make the primitive wrapper classes inline classes: warnings (soon), inline as a preview feature (when inline classes are preview), and then full support (when inline classes are final)

- "Erasure": Brian posted a summary of the history and design rationale for erased generics