From daniel.smith at oracle.com Wed Sep 8 00:15:08 2021 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 8 Sep 2021 00:15:08 +0000 Subject: [External] : Re: Draft JVMS changes for Primitive Objects (JEP 401) In-Reply-To: References: Message-ID: > On Aug 11, 2021, at 1:24 PM, Dan Heidinga wrote: > > And continuing on the "long-overdue" theme, here's my long-overdue > review of the spec changes. > > A big thank you to you, Dan S., for the careful spec writeup efforts. > I think this captures our discussions well. > > --Dan > > == Section 2.11.5 Object Creation and Manipulation >> Create a new class instance: new, withfield. > Should that also include "defaultvalue"? The semantics aren't quite > the same because of the structural equality of primitive class types > but it is conceptually very similar. And in the instruction section, > we state "The defaultvalue instruction is similar to the new > instruction" which lends credence to including it in this list. Hmm, that is a bit inconsistent. There are two conflicting perspectives: - It's analogous to 'ldc' or 'aconst_null', putting a well-known constant on the stack (so belongs in 2.11.2) - It's analogous to 'new', putting a fresh instance on the stack (so belongs in 2.11.5) I think I lean towards emphasizing the "load a constant" nature of the operation, but you're right that it's not a consistent view throughout the document. > == Section 4.1 The ClassFile Structure > The `ACC_PRIM_SUPER` flag is introduced and restrictions on classes > with the flag are called out in various sections such as: > 4.5 Fields > ACC_PRIM_SUPER flag set, each field must have its > ACC_STATIC flag set. > 4.6 Methods > In a primitive class, and in an abstract class that has > its ACC_PRIM_SUPER flag set, a method that has its ACC_SYNCHRONIZED > flag set must also have its ACC_STATIC flag set. > 5.3 Creation and Loading > implements PrimitiveObject if the opposite > is true (ACC_PRIM_SUPER, no instance initialization method). 
>
> I didn't see static constraints called out to enforce these
> restrictions (should they be?). Having the handling of the
> ACC_PRIM_SUPER in one place would make the VM's job of validating it
> easier.

Most of Chapter 4 provides the specification for format checking, including 4.5 and 4.6. Compare the restrictions on fields and methods of interfaces appearing in 4.5 and 4.6.

There's a perspective shift here from what you're looking for: rather than saying "an ACC_PRIM_SUPER class file is validated as follows: ...", we treat ACC_PRIM_SUPER as a statement of fact, and then *other* things in the class file are validated with that fact in mind. This avoids forcing everything about the new feature into a sidebar, as if it's not part of the "core" JVM.

Anyway, I'd suggest implementing & thinking about validation in the same way you implement validation of ACC_INTERFACE-related constraints.

> == 4.6 Methods
>> Design discussion: this section requires that unnamed factory methods (named <new>) are static
>> and have a return type that matches their declaring class or interface. By restricting the
>> descriptor in this way, clients can rely on a predictable, useful return type.
>>
>> Alternatively, we could allow a subtype or supertype as the return type, or impose no constraints
>> at all. One potential use case is a hidden class, which is incapable of naming its class type in
>> a descriptor.
>
> Because these are static methods, I thought we had agreed they could
> name any superclass as the return value due to the hidden class
> requirements. Even though this allows some strange behaviour (ie:
> after some bytecode manipulation) such as the following pseudo-code
> shows:
> ```
> primitive class Strange {
>     Strange() { // <new>()Ljava/lang/Object;
>         return new String();
>     }
> }
> ```
> The contract on `<new>` is more convention than requirement. In cases
> where the return value needs to be used as a primitive value, it would
> need to go through a checkcast to validate it when a different return
> type is named.
>
> While this doesn't give the hidden class full powers to be checked in
> the checkcast, it can still be checked against the PrimitiveObject
> interface or its ACC_PRIM_SUPER type. Seems like a reasonable setup
> and avoids the VM having to check the name matches on the return type
> of the descriptor.

Something we need to investigate more is how factory methods get surfaced in reflection. I could imagine clients like reflection really wanting a guarantee that when you call Foo.<new>, you get a Foo. But it depends on how it is presented in the API. So, still an open question, which is why I listed both alternatives.

Hidden classes did indeed push us to be more permissive, but in retrospect, that really shouldn't drive the choice: if you insist on putting a factory-like method in your hidden class, but can't follow the JVM's rules, you're fully capable of spinning your own static method using whatever name you want.

>> A method of a class or interface named <new> (2.9.4) must have its ACC_STATIC flag set.
>
> Should interfaces be able to implement `<new>` or should we prevent
> that like we do for `<init>`? Preventing it now gives us the most
> freedom later if we retcon this as a general object factory.
>
>> If the name of the method is <new>, then the descriptor must denote a return type
>> that is a type of the current class or interface. For a primitive class, the return
>> type must be an inlinable reference type.
>
> Same questions about requiring the return type to match (I don't think
> we should) and if we should prevent interfaces from implementing it (I
> think yes).

Yeah, more of the same question about what these things are really for, and how they are meant to be interpreted.
I picked one point in the space ("a factory method creates an instance of this class or interface via ad hoc program logic"); there are others, ranging from "a factory method is an arbitrary method with a funny name" to "a factory method is, exclusively, the construction mechanism for a primitive class". > == 4.7.31 The JavaFlags Attribute >> We're having some good internal discussions about default values & null, and will send something out when that settles into something stable. > I expect (hope?) this will change as the internal discussion > solidifies. I'm not a fan of this "kitchen sink" approach as it > becomes an attractive nuisance to wedge other flags into. The > suggestion to use more focused attributes (`PrimitiveClassProperties` > & `ReferenceDefaultPrimitiveClass`) matches the existing conventions > for naming / using attributes. Yes, worth revisiting when we get further along. The idea here is that javac just needs some space to stash its special metadata, and things have suffered because we don't have that space (see, e.g., ACC_ENUM). But as we tweak things, TBD how much stuff ends up being "javac special metadata". > == 5.3.5 Deriving a Class from a class File Representation >> Alternatively, we could more uniformly claim that the class is "considered to implement" the >> expected interface, regardless of what it implements by inheritance. The difference in >> behavior might be observable, say, via reflection. > I think we need to honour programmer intent here. If a class > implements one of the interfaces by inheritance, then the programmer > has specified their intent and we should go with it (or flag it as an > error). There would be no error. The scenario is whether: class Foo implements IdentityObject class Bar extends Foo gets interpreted as class Bar extends Foo [implements IdentityObject] or just class Bar extends Foo In other words, how much do we "prune" redundant inferred supers? 
More work for you to prune them, but maybe the results of reflection are more intuitive with pruning. >> An abstract class implements IdentityObject if it declares an instance initialization method >> and does not have its ACC_PRIM_SUPER flag set; and implements PrimitiveObject if the opposite >> is true (ACC_PRIM_SUPER, no instance initialization method). Instance initialization methods >> and ACC_PRIM_SUPER represent two channels for subclass instance creation, and this analysis >> determines whether only one channel is "open". > > The rules expressed here don't cover the cases outlined in chapter 4 - > namely that a class that has ACC_PRIM_SUPER must only have static > fields and only its static methods can be synchronized. Correct. The Chapter 4 rules are enforced at Step 2 of 5.3.5. The inference of supertypes (and related constraints) is left to Step 5. >> Alternatively, we could ignore instance initialization methods and rely entirely on >> ACC_PRIM_SUPER. In practice, abstract classes written in the Java programming language always >> have instance initialization methods, so the difference in behavior is only relevant to classes >> produced via other languages or tools. > > My preference is for the VM to check the rules against the explicitly > set ACC_PRIM_SUPER bit. It means the language can own when to set it > and the VM only has to do consistency checks on it. Aesthetically, I like the idea that there are parallel mechanisms for inferring IdentityObject and inferring PrimitiveObject. But, yeah, our particular needs are met whether you can infer PrimitiveObject or not. > == 5.4.3.1 Field Resolution >> Thus, a field reference with a type like Qjava/lang/String; is permitted. Since it's impossible >> to declare a field with such a type (see 5.4.2), resolution of the reference will fail anyway >> with a NoSuchFieldError. 
> > I'm a little confused by "resolution of the reference will fail with > NoSuchFieldError" as my reading of 5.4.2 says we would reject any > class that has a Q descriptor that doesn't name a Q type. How would > resolution of such a field reference ever occur? The choices are: 1) Explicitly check descriptors of field references at resolution time for Q types in the descriptors that don't name valid primitive classes. If any are found, resolution fails with an ICCE. 2) Allow resolution to run its course, which will inevitably fail with a NSFE. Just a question of which error happens (with associated rules for checking). > == Bytecodes > new > tolerable because the Identity class requires no initialization. > Do we need changes to the JVMTI spec to indicate that Identity isn't > passed to the ClassFileLoadHook and is not modifiable? And maybe a > rule in static constraints that says Identity has an abstract > method? (Or have we dropped that idea?) We need a way to say, via the > spec, that code in Identity.()V will never run, no matter how a > user adds it there. At a minimum, the spec for 'new' describes what happens if you say 'new java/lang/Object', and if you try to hack the Identity class via JVMTI, you need to be aware of that spec. Whether JVMTI wants to do require something more for sanity checking is a good question... For constraining the standard library, eh, I'm pretty happy leaving that to the standard library. (And, no, we don't have abstract methods anymore.) No need to make sure standard library classes are defined the way they're supposed to be defined. If somebody hacks those binaries, again, the spec for 'new' says what happens, the hacker shouldn't be surprised. > withfield > use of the withfield instruction is restricted to > nestmates of the field's declaring class. > I'm glad to see this as discussion on this went around and around. > Limiting to the nest still seems like the right choice to me. So +1 > to this. Cool. 
Perhaps someday we'll expand it with normal access controls, but it requires a separate mechanism from the (read-oriented) field access flags.

From daniel.smith at oracle.com Wed Sep 8 02:04:08 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 8 Sep 2021 02:04:08 +0000
Subject: EG meeting *canceled*, 2021-09-08
Message-ID:

Well, I was hoping to be in a place to have a good status update conversation tomorrow, but the long weekend interfered with those plans. I think we'll be best off skipping this meeting once more, and regrouping on Sep 22.

In the meantime, if there's anything you think deserves some attention, feel free to send a mail about it!

From daniel.smith at oracle.com Thu Sep 9 14:23:41 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Thu, 9 Sep 2021 14:23:41 +0000
Subject: Factory methods & the language model
Message-ID: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com>

JEP 401 includes special JVM factory methods, spelled <new> (or, alternatively, <init> with a non-void return), which are needed as a standardized way to encode the Java language's primitive class constructors.

We have a lot of flexibility in how much we restrict use of these methods. Too many restrictions seem arbitrary and incoherent from the JVM's point of view; but too few restrictions risk untested corner cases, unfortunate compatibility obligations, and difficulties mapping back to the Java language model.

Expanding on that last one: for tools that operate with a Java language model, there are essentially three strategies for dealing with factory methods outside of the core primitive class construction use case:

1) Have the JVM reject them
2) Ignore them
3) Expand the model to include them

Taking javac as an example, here's what that looks like:

1) If factory methods outside of primitive classes are illegal, javac can treat classes with such methods as malformed and report an error.

2) Or if javac sees a factory method in a non-primitive class, it can just leave it out when it maps the class file to a language-level class. (There's precedent for this in, e.g., the treatment of fields with the same name and different descriptors.)

3) Or we can allow javac to view factory methods in any class as constructors. A few complications:

- Constructors of non-final classes have both 'new Foo()' and 'super()' entry points; factories only support the first. So we either need to validate that a matching pair of <new> and <init> exist, or expand the language to model factories independently from constructors.

- The language expects instance creation expressions to create fresh instances. We need to either validate this behavior (does the factory look like "new/dup/<init>"?) or relax the language semantics (perhaps this is in the grey area of mixed binaries?)

- Factories can appear in abstract classes and interfaces. Again, are we willing to change the language model to support these use cases? Perhaps to even allow their declaration?

- If a factory method has a mismatched return type (declared in Foo, but returns a Bar), are we willing to support a type system where the type of a factory invocation is not the type of the class to which the factory belongs?

There are probably limits to what we're willing to do with (3), which pushes at least some cases into the (1) or (2) buckets.

So, my question: what should we expect from (3), now and in the foreseeable future? And for the cases that fall outside of it, should we fall back to (1), (2), or a mixture of both?
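The "fresh instances" complication can be illustrated in today's Java. The sketch below is only an analogy, not the JVM factory-method feature itself: `Label`, `of`, and `fresh` are hypothetical names, and the named static methods stand in for what an unnamed factory might be allowed to do. It contrasts a factory that wraps the constructor (always a fresh object) with one that is free to hand back a cached object, which an instance creation expression is not expected to do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical identity class: the constructor is hidden and static
// factories are the only entry points.
final class Label {
    private static final Map<String, Label> CACHE = new HashMap<>();
    final String text;

    private Label(String text) { this.text = text; }

    // A factory with no freshness guarantee: it may return a cached object.
    static Label of(String text) {
        return CACHE.computeIfAbsent(text, Label::new);
    }

    // A factory that simply wraps the constructor: always a fresh object.
    static Label fresh(String text) {
        return new Label(text);
    }
}

class FreshnessDemo {
    public static void main(String[] args) {
        // Identity comparison of two results from the caching factory.
        System.out.println(Label.of("a") == Label.of("a"));
        // Identity comparison of two results from the constructor-wrapping factory.
        System.out.println(Label.fresh("a") == Label.fresh("a"));
    }
}
```

If javac treated every factory as a constructor, it would need some way to tell these two shapes apart, or would have to stop promising freshness for 'new Foo()'.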
From daniel.smith at oracle.com Thu Sep 9 18:00:39 2021 From: daniel.smith at oracle.com (Dan Smith) Date: Thu, 9 Sep 2021 18:00:39 +0000 Subject: Factory methods & the language model In-Reply-To: References: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com> Message-ID: <865F5DDA-9A4D-4144-9162-B3FC46533A64@oracle.com> To clarify a bit that I left out: this discussion assumes a pretty fixed JVM feature: a factory method is a static method with a special name, invoked via invokestatic, and possibly subject to certain constraints about the descriptor/enclosing class. I'm not proposing any changes to that basic approach, although choices we make for the Java language & tools _might_ influence the set of constraints we choose to impose in JVMS. > On Sep 9, 2021, at 10:15 AM, Dan Heidinga wrote: > > On Thu, Sep 9, 2021 at 10:24 AM Dan Smith wrote: >> >> JEP 401 includes special JVM factory methods, spelled (or, alternatively, with a non-void return), which are needed as a standardized way to encode the Java language's primitive class constructors. >> >> We have a lot of flexibility in how much we restrict use of these methods. Too many restrictions seem arbitrary and incoherent from the JVM's point of view; but too few restrictions risk untested corner cases, unfortunate compatibility obligations, and difficulties mapping back to the Java language model. >> >> Expanding on that last one: for tools that operate with a Java language model, there are essentially three strategies for dealing with factory methods outside of the core primitive class construction use case: >> >> 1) Have the JVM reject them > > This gives us the maximum flexibility to expand factories in the > future and let's us concentrate on the inline types use cases. Seems > like a pretty safe fallback position on factories. Yeah. Seems a little... lacking in vision to impose this restriction on class files of all languages, but it also avoids over-committing. 
>
>> 2) Ignore them
>
> I strongly dislike this. If javac were to ignore them, and just not
> generate them, they are effectively dead code.

Dead to the Java language and tools, but perhaps a useful way to compile a Scala feature or something?

> It'd be much clearer
> to users if javac flagged them as such and refused to compile unless
> they were deleted. If javac ignores them, we still need an answer on
> what the JVM does with them - reject them? load them but prevent them
> from being invoked? drop them when loading the classfile? This seems
> like it collapses back to option 1.

The JVM semantics are clean and wouldn't change: if you want to use a factory, invoke it with invokestatic. It's just that the Java language wouldn't provide any mechanism to do so (because <new> or <init> aren't legal Java method names).

Ignoring does feel a bit like the feature is incomplete or something, but this sort of behavior does show up from time to time where Java and the JVM aren't perfectly in sync. For example:

- If there are two fields with the same name, one of them is effectively invisible

- If there are two methods with the same params and different returns, they're considered overloads that are impossible to disambiguate

- If there's a stray method in an interface (before we outlawed this), javac either filters it out or treats it as a normal method, but anyway you can't call it because of its name

>> 3) Expand the model to include them
>
> How much expanding does the model need? We had originally modeled the
> factory methods as regular static methods and only gave them the
> specialized name to make them easy to detect, to deal with withfield
> being limited to the nest, and to allow reflective operations like
> Class::getConstructor() and Class::newInstance() to identify the
> inline type "constructors". Am I forgetting a case?

Talking here about expanding the *language* model in some way so that factory methods appearing in non-primitive classes and interfaces can somehow be recognized or invoked. (1) and (2) are reasonable options, too, but here I'm exploring other approaches that go beyond rejecting or ignoring.

>> 3) Or we can allow javac to view factory methods in any class as constructors. A few complications:
>>
>> - Constructors of non-final classes have both 'new Foo()' and 'super()' entry points; factories only support the first. So we either need to validate that a matching pair of <new> and <init> exist, or expand the language to model factories independently from constructors.
>
> I don't think we want to touch the "new/dup/<init>" sequence and
> trying to allow factories to operate in that delicate dance would be a
> mistake. Factories, beyond the inline types uses, give us a chance to
> encapsulate the "new/dup/<init>" dance and present a cleaner model.
> We shouldn't attempt to mix the two.

Not sure which direction you're going here?

One stance we could take: new/dup/<init> is fine for identity classes, we're not going to do anything different.

Another stance we could take: new/dup/<init> is painful, let's try to migrate to a different convention where factory methods encapsulate new/dup/<init>, and clients just call the factory.

I'm saying if we take the latter stance, there's a problem in that constructors would then be compiled down to factory methods *and* (for super calls) <init> methods, and we might need some validation to ensure they are aligned.

>> - The language expects instance creation expressions to create fresh instances. We need to either validate this behavior (does the factory look like "new/dup/<init>"?) or relax the language semantics (perhaps this is in the grey area of mixed binaries?)
>>
>
> Only the invokestatic bytecode should be used to invoke a factory.
> Classes can have both factories and constructors, but they serve
> different purposes and only overlap due to reflective operations.
> Keeping them completely separate at the bytecode level is cleanest. Sure, it's nice at the JVM level to treat them as independent features. But that doesn't match the Java language, so there's a mismatch to work out (either by changing the language, or restricting the VM, or having javac ignore code shapes that don't match). > >> - Factories can appear in abstract classes and interfaces. Again, are we willing to change the language model to support these use cases? Perhaps to even allow their declaration? > > This makes sense. Factories are just static methods with a special > name. A factory on an abstract class or interface makes sense if the > concrete implementations are all package-private (sealed?) so users > only reference the one public abstract class. Yep, could be a useful feature. Is it one we could actually see implementing? TBD... >> - If a factory method has a mismatched return type (declared in Foo, but returns a Bar), are we willing to support a type system where the type of a factory invocation is not the type of the class to which the factory belongs? >> > > I thought we needed this capability for anonymous inline classes as > they can't name themselves in the return type of the factory. We concluded that "need" is too strong a word here. It's a corner case that can be handled without using the factory method feature. > And I > don't see a problem with it as long as we don't touch the new/dup/init > dance. Is there another problem here I'm not seeing? Clients like the Java language will expect the return type to match, and will have to work around the issue if it doesn't (again, with any of these strategies: reject as malformed, ignore, or expand the language to allow it). Specifically, even if we limit the feature to primitive classes, if a primitive class can have a factory that returns something other than the primitive class's type, javac needs to decide what to do about that. 
>> There are probably limits to what we're willing to do with (3), which pushes at least some cases into the (1) or (2) buckets. >> >> So, my question: what should we expect from (3), now and in the foreseeable future? And for the cases that fall outside of it, should we fall back to (1), (2), or a mixture of both? >> > > (1), limiting to inline types, is the easiest and safest option while > allowing the most flexibility to change in the future. > > For (3), it seems like all the complexity goes away if we don't try to > make factories == constructors at the bytecode level. Am I missing > something that would force us to do so? No, it's not necessarily about JVM bytecode constraints. It's about how javac interprets whatever class files are thrown at it. But you're right, if we limit the feature in the JVM to the minimal needs of the Java language (in a primitive class, matching return type), we can avoid these issues. From daniel.smith at oracle.com Thu Sep 9 21:31:57 2021 From: daniel.smith at oracle.com (Dan Smith) Date: Thu, 9 Sep 2021 21:31:57 +0000 Subject: [External] : Re: Factory methods & the language model In-Reply-To: References: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com> <865F5DDA-9A4D-4144-9162-B3FC46533A64@oracle.com> Message-ID: <67D21363-2592-426D-AE58-ECCE2E072DE3@oracle.com> On Sep 9, 2021, at 1:13 PM, Dan Heidinga > wrote: but to keep the door open to having both factories and constructors in identity classes, should we use a different syntax for factories in primitive classes now? That way factories would be "spelled" consistently between primitive and identity classes. Doing so diminishes the "codes like a class" story but leaves the door open for more compatibility in the future. Enthusiastic +1. 
I don't really *want* to do that, but if we think that's where we're headed, it is pretty weird that, say, a factory declaration in an Java interface declaration looks completely different from a factory declaration in a Java primitive class declaration. Or maybe both styles of declaration are supported by primitive classes? And does reflection treat them differently, too? Not sure if this leads anywhere good, but I want to do a bit of thinking through the implications... From daniel.smith at oracle.com Fri Sep 10 16:20:35 2021 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 10 Sep 2021 16:20:35 +0000 Subject: Factory methods & the language model In-Reply-To: References: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com> <865F5DDA-9A4D-4144-9162-B3FC46533A64@oracle.com> <67D21363-2592-426D-AE58-ECCE2E072DE3@oracle.com> Message-ID: > On Sep 10, 2021, at 7:56 AM, Dan Heidinga wrote: > > On Thu, Sep 9, 2021 at 5:32 PM Dan Smith wrote: >> >> On Sep 9, 2021, at 1:13 PM, Dan Heidinga wrote: >> >> but to keep the door open to having both factories and >> constructors in identity classes, should we use a different syntax for >> factories in primitive classes now? That way factories would be >> "spelled" consistently between primitive and identity classes. Doing >> so diminishes the "codes like a class" story but leaves the door open >> for more compatibility in the future. >> >> >> Enthusiastic +1. >> >> I don't really *want* to do that, but if we think that's where we're headed, it is pretty weird that, say, a factory declaration in an Java interface declaration looks completely different from a factory declaration in a Java primitive class declaration. Or maybe both styles of declaration are supported by primitive classes? And does reflection treat them differently, too? Not sure if this leads anywhere good, but I want to do a bit of thinking through the implications... >> > > Do you want to tackle this on list or wait for the next EG meeting? 
> If you have a model / syntax in mind, we can start to work through the
> implications. Otherwise, we can all pull out the bikeshed paint....
>

Both are fine. :-)

I'm not particularly interested in settling on a bikeshed color, but am interested in the general mood for pursuing this direction at all. (And not necessarily right away, just: is this a direction we think we'll be going?)

A few observations/questions:

- 'new Foo()' traditionally guarantees fresh instance creation for identity classes. Primitive classes relax this, since of course there is no unique identity to talk about. Would we be happy with a language that relaxes this further, such that 'new Foo()' can return an arbitrary Foo instance (or maybe even null)? Or would we want to pursue a different, factory-specific invocation syntax? (And if so, should primitive classes use it too?)

- The JVM's factory methods are unnamed, but in practice it's often useful to give your factory methods names. Of course the Java language already supports *named* factory methods. Does an unnamed factory method feature significantly improve the language?

- Identity classes don't have a 'withfield' operation, which means we can't mimic the declaration syntax of primitive classes in identity class factories. Instead, identity class factories probably look just like normal static methods. Can we try to make primitive class factories *also* look like normal static methods? (Would require a 'withfield' Java expression. Honestly not sure I want to write that code.)

My initial sense: no, trying to generalize like this isn't useful, primitive class constructors are best modeled as real constructors, even though they use the factory method JVM encoding. It's not particularly likely that an identity class factory feature will be worthwhile at all, but if it is, identity and primitive classes may use a similar JVM encoding but we shouldn't view these as similar language features.
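For comparison, here is roughly what "identity class factories look just like normal static methods" means in current Java. This is a sketch with made-up names (`Coord`, `of`, `withX`, `withY`); the `with...` methods are only a source-level stand-in for the effect of 'withfield' (a copy that differs in one field), not a proposal for how a 'withfield' expression would look.

```java
// Hypothetical immutable identity class, written in today's Java.
final class Coord {
    private final int x;
    private final int y;

    private Coord(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // A factory in an identity class is just a named static method.
    static Coord of(int x, int y) {
        return new Coord(x, y);
    }

    // Stand-ins for 'withfield': return a copy that differs in one field.
    Coord withX(int newX) { return new Coord(newX, y); }
    Coord withY(int newY) { return new Coord(x, newY); }

    int x() { return x; }
    int y() { return y; }
}

class WitherDemo {
    public static void main(String[] args) {
        Coord p = Coord.of(1, 2);
        Coord q = p.withX(5);
        // p is unchanged; q differs from p only in x.
        System.out.println(p.x() + "," + p.y() + " -> " + q.x() + "," + q.y());
    }
}
```

Each `with...` call goes through the constructor again, so any invariants checked there are re-checked; a bytecode-level 'withfield' inside a factory has no such built-in step.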
From brian.goetz at oracle.com Fri Sep 10 18:25:50 2021
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 10 Sep 2021 14:25:50 -0400
Subject: Factory methods & the language model
In-Reply-To: References: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com> <865F5DDA-9A4D-4144-9162-B3FC46533A64@oracle.com> <67D21363-2592-426D-AE58-ECCE2E072DE3@oracle.com>
Message-ID: <617ab0ce-4a93-7705-77c1-039cbae21676@oracle.com>

> I'm not particularly interested in settling on a bikeshed color, but am interested in the general mood for pursuing this direction at all. (And not necessarily right away, just: is this a direction we think we'll be going?)
>
> A few observations/questions:
>
> - 'new Foo()' traditionally guarantees fresh instance creation for identity classes. Primitive classes relax this, since of course there is no unique identity to talk about. Would we be happy with a language that relaxes this further, such that 'new Foo()' can return an arbitrary Foo instance (or maybe even null)? Or would we want to pursue a different, factory-specific invocation syntax? (And if so, should primitive classes use it too?)

Let me offer some context from Amber that might be helpful, regarding whether we might want "factory" to be a language feature.

Example 1 -- records. One of the complaints about records is that you have to instantiate them through constructors, not factories; over the decades, people have come to view exposed constructors as "dirty" and therefore would rather see

    List.of(Foo.of(a, b), Foo.of(c, d))

than

    List.of(new Foo(a, b), new Foo(c, d))

This is a cosmetic concern, but it still does elicit "but why" comments from developers when they first encounter records. The reason, of course, is that records are a language feature, and therefore have to be built on top of other language features. Constructors are a language feature, but factories are not, they are just a library structuring convention. If the language had some way to declare factories, we might have (but still might not have) encapsulated the constructor and exposed a factory instead. (Though, since we'd have to make up a name for the factory, and we try to avoid magic naming, we might still have gone with ctors.)

Example 2 -- "with" expressions / reconstructors. A number of interesting features come out of the pairing of constructors and deconstruction patterns with the same (or compatible) argument lists, such as `with` expressions (`point with { x = 3 }`). Handling this involves doing multiple overload selection operations, first to find a deconstructor that yields the right bindings, and then to find a compatible constructor that accepts the resulting exploded state.

Among the many problems of doing this (including the fact that parameter names are not required to be stable or meaningful), we also have the problem of "what if the class doesn't have a constructor, but a factory." This would require the language to have a notion of factory method (e.g., `factory newPoint(int x, int y)`) so the compiler could try to match up a compatible factory with the deconstructor.

The same questions apply here -- what constraints do we put on a "factory" at the source level? Implicit null checks on exit? Do we want to pun `new Foo()` with factory invocation, or treat them separately? Many questions (which I'm not trying to answer here.)

I'll just note that there is already an asymmetry in primitive classes here, in that the user writes a constructor, but the compiler generates a factory in the classfile.
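Example 1 can be made concrete with a small sketch (assuming Java 16 or later for records; `Foo` and `of` are just the placeholder names from the example). The `of` method is an ordinary static method added by convention. The canonical constructor stays exposed, because a record's canonical constructor must be at least as accessible as the record itself.

```java
import java.util.List;

// Hypothetical record; 'of' is a library convention, not a language feature.
record Foo(int a, int b) {
    static Foo of(int a, int b) {
        return new Foo(a, b);
    }
}

class RecordFactoryDemo {
    public static void main(String[] args) {
        List<Foo> viaFactory = List.of(Foo.of(1, 2), Foo.of(3, 4));
        List<Foo> viaConstructor = List.of(new Foo(1, 2), new Foo(3, 4));
        // Records compare by component values, so the two lists are equal.
        System.out.println(viaFactory.equals(viaConstructor));
    }
}
```

Nothing ties `of` to the record's components: the compiler cannot pair it with a deconstruction pattern the way it can pair the canonical constructor, which is the gap Example 2 runs into.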
From forax at univ-mlv.fr Tue Sep 14 07:56:19 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 14 Sep 2021 09:56:19 +0200 (CEST)
Subject: Factory methods & the language model
In-Reply-To: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com>
References: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com>
Message-ID: <314827016.760933.1631606179313.JavaMail.zimbra@u-pem.fr>

I will take the scenic route to answer :)

There is currently an issue: we present primitive class initialization as a constructor in Java the language, but it is not translated to a constructor in the class file. This introduces a false sense of compatibility: the source code is identical for a classical class and a primitive class, but that is a trap because the two are not binary compatible.

We started with the idea of "code like a class", but the current runtime model differs between a primitive class and a classical class (Q-types, and a factory at construction). I think Java should reflect that difference in the code instead of trying to retrofit the constructor syntax to work with primitive classes.

Apart from the "code like a class" mantra, it seems we do not want users to use a syntax corresponding to withfield directly; the withfield instructions are implicitly generated by the compiler from this.x = x. In Amber, for compact constructors, we chose another way: to spare users from writing this.x = x, the assignments are generated automatically by the compiler. I think we can use the same idea here.

In terms of syntax, instead of having a constructor, we can have a method __new__ (syntax up for debate, obviously), and the compiler generates the withfield instructions to initialize the primitive object before returning it:

  public primitive class Point {
    private final int x;
    private final int y;

    public __new__(int x, int y) {
      if (x > 0 || y > 0) {
        throw ...
      }
      // at the end the compiler generates the withfield to initialize
      // the fields x and y with the value of the local variable x and y
    }
  }

The idea is that the method __new__
- is implicitly static (so "this" is not available)
- must have a local variable defined inside __new__ with the same name as each field
- can have as many overloads as you want, using __new__() instead of "this()" to delegate between the overloads, the same way record constructors delegate to one another

For the reflection API, we can either introduce a new kind of member, reflected by a class like j.l.r.NewFactoryMethod, or add a flag-like method on j.l.r.Method (isNewFactoryMethod()) saying that it's a special kind of static method. I prefer the latter because most existing code will still work with that model.

To summarize, I think the syntax should reflect the fact that a primitive class is not initialized the same way a classical class is.

So if we want to be consistent with that idea, it means that for the JVM, those factory methods __new__ are only available to primitive classes. So the VM should reject
- a primitive class with a method
- a reference class with a method

This is, I believe, your solution (1).

regards,
Rémi

----- Original Message -----
> From: "daniel smith"
> To: "valhalla-spec-experts"
> Sent: Thursday 9 September 2021 16:23:41
> Subject: Factory methods & the language model
> JEP 401 includes special JVM factory methods, spelled (or, alternatively,
> with a non-void return), which are needed as a standardized way to
> encode the Java language's primitive class constructors.
>
> We have a lot of flexibility in how much we restrict use of these methods. Too
> many restrictions seem arbitrary and incoherent from the JVM's point of view;
> but too few restrictions risk untested corner cases, unfortunate compatibility
> obligations, and difficulties mapping back to the Java language model.
>
> Expanding on that last one: for tools that operate with a Java language model,
> there are essentially three strategies for dealing with factory methods outside
> of the core primitive class construction use case:
>
> 1) Have the JVM reject them
> 2) Ignore them
> 3) Expand the model to include them
>
> Taking javac as an example, here's what that looks like:
>
> 1) If factory methods outside of primitive classes are illegal, javac can treat
> classes with such methods as malformed and report an error.
>
> 2) Or if javac sees a factory method in a non-primitive class, it can just leave
> it out when it maps the class file to a language-level class. (There's
> precedent for this in, e.g., the treatment of fields with the same name and
> different descriptors.)
>
> 3) Or we can allow javac to view factory methods in any class as constructors. A
> few complications:
>
> - Constructors of non-final classes have both 'new Foo()' and 'super()' entry
> points; factories only support the first. So we either need to validate that a
> matching pair of and exist, or expand the language to model
> factories independently from constructors.
>
> - The language expects instance creation expressions to create fresh instances.
> We need to either validate this behavior (does the factory look like
> "new/dup/"?) or relax the language semantics (perhaps this is in the grey
> area of mixed binaries?)
>
> - Factories can appear in abstract classes and interfaces. Again, are we willing
> to change the language model to support these use cases? Perhaps to even allow
> their declaration?
>
> - If a factory method has a mismatched return type (declared in Foo, but returns
> a Bar), are we willing to support a type system where the type of a factory
> invocation is not the type of the class to which the factory belongs?
>
> There are probably limits to what we're willing to do with (3), which pushes at
> least some cases into the (1) or (2) buckets.
>
> So, my question: what should we expect from (3), now and in the foreseeable
> future? And for the cases that fall outside of it, should we fall back to (1),
> (2), or a mixture of both?

From forax at univ-mlv.fr Tue Sep 14 08:38:31 2021
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 14 Sep 2021 10:38:31 +0200 (CEST)
Subject: Factory methods & the language model
In-Reply-To: <617ab0ce-4a93-7705-77c1-039cbae21676@oracle.com>
References: <83A0DA05-45F2-43FB-9E67-7DEEFD46BB34@oracle.com> <865F5DDA-9A4D-4144-9162-B3FC46533A64@oracle.com> <67D21363-2592-426D-AE58-ECCE2E072DE3@oracle.com> <617ab0ce-4a93-7705-77c1-039cbae21676@oracle.com>
Message-ID: <173911978.797118.1631608711740.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "daniel smith" , "Dan Heidinga"
> Cc: "valhalla-spec-experts"
> Sent: Friday 10 September 2021 20:25:50
> Subject: Re: Factory methods & the language model

>> I'm not particularly interested in settling on a bikeshed color, but am
>> interested in the general mood for pursuing this direction at all. (And not
>> necessarily right away, just -- is this a direction we think we'll be going?)
>>
>> A few observations/questions:
>>
>> - 'new Foo()' traditionally guarantees fresh instance creation for identity
>> classes. Primitive classes relax this, since of course there is no unique
>> identity to talk about. Would we be happy with a language that relaxes this
>> further, such that 'new Foo()' can return an arbitrary Foo instance (or maybe
>> even null)? Or would we want to pursue a different, factory-specific invocation
>> syntax? (And if so, should primitive classes use it too?)
>
> Let me offer some context from Amber that might be helpful, regarding
> whether we might want "factory" to be a language feature.
> [...]
>
> Example 2 -- "with" expressions / reconstructors. A number of
> interesting features come out of the pairing of constructors and
> deconstruction patterns with the same (or compatible) argument lists,
> such as `with` expressions (`point with { x = 3 }`). Handling this
> involves doing multiple overload selection operations, first to find a
> deconstructor that yields the right bindings, and then to find a
> compatible constructor that accepts the resulting exploded state.
>
> Among the many problems of doing this (including the fact that parameter
> names are not required to be stable or meaningful), we also have the
> problem of "what if the class doesn't have a constructor, but a
> factory." This would require the language to have a notion of factory
> method (e.g., `factory newPoint(int x, int y)`) so the compiler could
> try to match up a compatible factory with the deconstructor.

I think there is a way to avoid introducing a weird "with" expression syntax, by piggybacking on the fact that a record is a weird tuple.

A record in Java is not just a tuple, i.e. a vector of values; because all its components are named, it is also a compact key/value set. The "with" expression is a case where we want to see a record as a key/value set more than as a vector of values.

If we have a syntax to construct a record as a key/value set, this syntax can be slightly extended to express a "with" expression.

For example, if we have a syntax like
  Point p = Point { x: 3, y: 4 };

then the syntax for the "with" expression would be something like
  Point p2 = Point { x: 7, p };

I can hear you saying that I'm trying to trick you into adding a new syntax for creating a record, which is bigger than just the "with" expression. And that's partially true. I believe that the "key/value set" syntax for a record is something we should introduce in Amber anyway because it's a declarative syntax, the same way a Stream is, making the code easier to read.
And people want a syntax like this so badly that they are willing to write a full builder pattern in their code just to be able to see the names of the parameters when creating complex objects.

regards,
Rémi

From daniel.smith at oracle.com Wed Sep 22 00:30:31 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 22 Sep 2021 00:30:31 +0000
Subject: Project status summary
Message-ID: <4FF76A7B-43E7-420C-930F-2117846E413E@oracle.com>

As I've mentioned, I've been wanting to put together a broad summary of where the project is at. I've grouped this into three areas or tracks: Primitive Objects, Unified Primitives, and Universal Generics.

------

Primitive Objects

JEP 401: This is the core preview feature, including primitive class declarations, primitive object semantics, and primitive value types (with reference companions).
- Awaiting finalization of some outstanding design issues before trying to target a release
- Working towards an Early Access release, with the goal of substantially aligning with the JEP 401 description
- Our design focus recently has been on the "Enforcing instance validation" section of the JEP; our best candidate solution is to support a kind of primitive class that is both strictly-enforced and nullable. I'll flesh this out in a separate email in the next few days.
- There are still some complexities regarding reflection, 'getClass', and MethodHandles that we'd like to refine
- The behavior of weak references is still an open question
- JVMS changes are written, with some iteration necessary to fill in gaps and respond to feedback
- JLS changes are pending the above instance validation revisions, along with some validation of the type system (see discussion in Universal Generics)

---

JEP 8267650: This is a supplementary task focusing on JVMS rules and some corner-case JVM behaviors. We'd like to complete it before or at the same time as the JEP 401 release.
- JEP is nearly ready for Submission, but I need to iterate on it
- Some initial JVMS changes were created; Alex suggested some significant revisions that need to be applied

---

Future work:
- We hope to work on migrating a number of standard library classes (such as java.time.*) once JEP 401 is done (probably to be released after the features are final)
- Other projects like Amber and Panama hope to take advantage of primitive objects as well

------

Unified Primitives

JEP 402: This involves making the wrapper classes primitive and treating 'int', 'short', etc., as their value types.
- Expect to target a release concurrently with JEP 401
- I don't think we've tried implementing this yet (in javac or the special JVM treatment for arrays). It's probably best being handled downstream of the JEP 401 design issues.
- Some lingering discomfort with the proposed reflection story
- Some vague ideas about pushing this equivalence deeper into the JVM, but no concrete proposals
- JVMS changes aren't done, will be pretty small and narrowly-focused
- JLS changes will be fleshed out in parallel with JEP 401

---

JEP TBD: Wrapper Constructor Tooling. JEP 390 provided migration warnings about wrapper class constructors in 16+. We need to follow this up with some tooling to convert legacy class files so that they'll run on a release that doesn't provide Integer., etc., methods.
- Should release before or at the same time as JEP 401.
- Could also integrate other suggested followups to JEP 390, like runtime logging of deprecation warnings.

---

Future work:
- There are a lot of opportunities to enhance the API provided by the wrapper classes after we've completed the primitive class migration.

------

Universal Generics

JEP 8261529: This is the set of language changes needed to allow generics over value types and to facilitate safe migration.
- Has now been Submitted, awaiting Candidate status.
- The type system rules are being developed.
High level intuitions are pretty straightforward, but the details of type variable types (now in two flavors!) and intersection types need some fleshing out and validation, particularly since these have historically been neglected.
- JLS changes will come when the type system design is clearer
- A prototype is implemented, subject to specification clarifications
- A near-term goal is to validate the user experience of the proposed compilation warnings by addressing them in a subset of standard library code

----

JEP TBD: Java Type System Refinements. Not clear exactly what this will entail, but there is probably a significant chunk of spec work that can be spun off independently and address some longstanding issues with the current type system.

---

Future work:
- Applying changes to address warnings in standard libraries (definitely java.base, plus some others, maybe not everything, potentially in stages)
- Parametric JVM, as discussed earlier this year -- we have a reasonable picture of what this will look like, but there are lots of details to work through both in the design and prototyping. Type restrictions could be spun off as a separate feature, as they may have other use cases.

From daniel.smith at oracle.com Wed Sep 22 00:35:16 2021
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 22 Sep 2021 00:35:16 +0000
Subject: EG meeting, 2021-09-22
Message-ID: <6D295804-FABA-43F2-B935-F04F26CAC118@oracle.com>

Tomorrow's EG Zoom meeting is on! Wednesday at 4pm UTC (9am PDT, 12pm EDT).

Topics to discuss:

"Factory methods & the language model": I raised some questions about how we think we should treat JVM factory methods in tools/libraries that present a language-level view.

"Project status summary": I summarized where the different pieces of the project are at. Would like to know if there are substantial pieces/issues that I missed.