From alex.buckley at oracle.com  Mon Jun  3 19:20:39 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 03 Jun 2019 12:20:39 -0700
Subject: Draft language spec for JEP 355: Text Blocks
In-Reply-To: <5CE45564.5090405@oracle.com>
References: <5CE33352.4030008@oracle.com> <5CE34CF1.7000005@oracle.com> <16C84E21-26F7-4527-99B5-CD2434501E36@oracle.com> <5CE45564.5090405@oracle.com>
Message-ID: <5CF57307.20507@oracle.com>

On 5/21/2019 12:45 PM, Alex Buckley wrote:
> On 5/21/2019 5:51 AM, Brian Goetz wrote:
>> As string literals get longer, the cost-benefit of interning gets worse, and eventually turns negative; it is super-unlikely that two compilation units will use the same 14-line snippet of JSON (no benefit), and at the same time, we're taking up much more space in the intern table (more cost).
>>
>> Surely today we'll use Constant_String_info because that's the sensible translation target, and if the same string appears twice in a single class, it'll automatically get merged by the constant pool writer. But committing forever to interning seems likely to be something we'll eventually regret, without buying us very much. Even the migration benefit seems questionable.
>
> OK, I have walked back the requirement to intern text blocks in 3.10.6 and 12.5. Spec updated in place (http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html), old version available (http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls-20190520.html).

Thanks to the scrutiny of the CSR process, we realized the need to state plainly that all text blocks are constant expressions. And, since text blocks are of type String, and all String-typed constant expressions are interned, the outcome is that all text blocks must be interned. I have updated http://cr.openjdk.java.net/~abuckley/jep355/text-blocks-jls.html to reflect this.
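For concreteness, that interning outcome is directly observable: a text block and a string literal with the same content are the same constant (runnable on JDK 13 with --enable-preview, or any JDK 15+; the class name here is illustrative):

```java
public class TextBlockInterning {
    public static void main(String[] args) {
        String literal = "hello";
        String block = """
                hello""";
        // Both are String-typed constant expressions, so both are interned
        // and reference identity holds between them:
        System.out.println(literal == block); // true
    }
}
```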
Alex

From amalloy at google.com  Mon Jun  3 22:19:10 2019
From: amalloy at google.com (Alan Malloy)
Date: Mon, 3 Jun 2019 15:19:10 -0700
Subject: Revisiting field references
Message-ID:

Hello, amber-spec-experts. I understand that "field references" is an idea that was considered when other member references were being implemented, and it seems to have been a "well, maybe someday" feature: nothing fundamentally wrong with it, just not worth delaying method references for. Google is interested in reopening that discussion, and in working on the implementation if a satisfactory design can be found.

For the remainder of this message, we will use this class definition for examples:

public class Holder {
    private int field;
}

This class contains only an instance field, but everything in this document applies equally in the case of static fields, except of course that they can't be bound and won't expect a receiver argument.

Additionally, most of this document will assume that Holder::field is the syntax used for creating an unbound reference to that field. This feels very natural of course, but there is a section about the tradeoffs of reusing the :: token for fields.

Getter as Supplier

The most obvious thing you can do with a field is read it. Holder::field could be a ToIntFunction<Holder>, and this::field an IntSupplier (assuming this is a Holder). I suspect that a feature that does this and no more would actually cover a majority of use cases: most people who today want a field reference probably just want a shorter version of a lambda that reads a field. However, this by itself is not really a very compelling addition, simply because it doesn't buy us much: the "workaround" of writing the lambda by hand is not very painful or error-prone, so permitting a reference instead, while nice, is not transformative. However, there are some other things we could do with field references, which may make the feature more worthwhile.
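For concreteness, here are the hand-written lambdas that the proposed Holder::field and this::field forms would abbreviate (the field-reference syntax itself is not valid Java today; the lambdas are, and the class/field names follow the example above):

```java
import java.util.function.IntSupplier;
import java.util.function.ToIntFunction;

public class Getters {
    static class Holder {
        int field = 42; // package-private so the lambdas below can read it
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        // What the unbound reference Holder::field would mean:
        ToIntFunction<Holder> unbound = holder -> holder.field;
        // What the bound reference h::field would mean:
        IntSupplier bound = () -> h.field;
        System.out.println(unbound.applyAsInt(h)); // 42
        System.out.println(bound.getAsInt());      // 42
    }
}
```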
Setter as Consumer

The most obvious difference between fields and methods is that while there's only one thing to do with a method (invoke it), you can either read a field or write it. So, while we've already established that this::field could be an IntSupplier, it could, depending on context, instead be an IntConsumer, setting the field when invoked. Likewise Holder::field could be an ObjIntConsumer<Holder> instead of a ToIntFunction<Holder>.

This seems natural enough, but merits discussion instead of just being included in the feature because it is "obvious". Setters are more complicated than getters. The most obvious complication is that they should be illegal if the field is final. More subtly, source code may become harder to understand when the expression this::field may mean two very different things, either a read or a write. The compiler should have enough type information in context to disambiguate, or give an appropriate diagnostic when a use site is ambiguous, e.g. due to overloading, but this information can be difficult for a developer to sort out manually, making every use site a debugging puzzle. They must figure out the target type of the reference to determine whether it is a read or a write.

Increased Transparency

Another appealing thing to do with a field reference is to make it more transparent than a simple lambda. We could have some sort of FieldReference object describing the class in which the field lives, the name and type of the field, and which object (if any) is bound as the receiver. This FieldReference object would expose get, and possibly set, methods for the referred-to field. Of course this looks a lot like java.lang.reflect.Field; but instead of one final class using reflection to handle all fields of all classes, we can use the lambda meta-factory (or something like it) to generate specialized subclasses, which can conveniently be bound to receivers as well as being faster.
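A hand-written sketch of what such a descriptor could look like; every name here (FieldReference, its methods, holderField) is a hypothetical shape for the proposal, not an existing API. Today the anonymous class must be written by hand; the idea is that the compiler plus a metafactory would generate its equivalent:

```java
import java.util.function.IntSupplier;

// Hypothetical descriptor interface for a referenced field.
interface FieldReference<T, V> {
    Class<T> declaringClass();
    String fieldName();
    Class<V> fieldType();
    V get(T receiver);
    void set(T receiver, V value);
}

public class Transparency {
    static class Holder {
        int field;
    }

    // Hand-rolled specialization for Holder.field; under the proposal this
    // object would be generated, much as the lambda metafactory generates
    // lambda classes.
    static FieldReference<Holder, Integer> holderField() {
        return new FieldReference<>() {
            public Class<Holder> declaringClass() { return Holder.class; }
            public String fieldName() { return "field"; }
            public Class<Integer> fieldType() { return int.class; }
            public Integer get(Holder h) { return h.field; }
            public void set(Holder h, Integer v) { h.field = v; }
        };
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        FieldReference<Holder, Integer> ref = holderField();
        ref.set(h, 7);
        // The same descriptor can also be viewed as a bound SAM:
        IntSupplier bound = () -> ref.get(h);
        System.out.println(ref.fieldName() + " = " + bound.getAsInt()); // field = 7
    }
}
```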
An advantage of supporting this is that it could enable libraries that currently accept lambdas to generate more efficient code. For example, consider

Comparator c =
    Comparator.comparing(a -> a.name)
              .thenComparing(a -> a.species)
              .thenComparingInt(a -> a.mass);

A perfectly reasonable Comparator, and much more readable than a nest of if-conditions written by hand. But if used in a tight loop to compare many animals, this is quite expensive compared to the hand-written version, because each comparison may dispatch through many lambdas, and this is not easy for the JIT to inline. If we really wanted to allow Comparator combinators to be used in performance-sensitive situations, Comparator could have an optimize() method that attempts to generate bytecode for an efficient comparator in a way similar to what the lambda meta-factory does.

Even without field references, that optimize() method could eliminate some lambda calls: instead of a chain of lambdas for each .thenComparing call, it could be unrolled into 3 if statements. But we'd still have 3 lambdas left, to compute the values to compare to each other. If we could pass in a field reference, the optimize() method could introspect on those, allowing it to emit getfield bytecodes directly, saving more indirection, and resulting in the same bytecode you could get by writing this comparator by hand.

I hope it goes without saying that I am not proposing to actually implement Comparator.optimize any time soon: it's just a convenient, well-known example of the kind of library that could be gradually improved by promoting field references from "sugar for a lambda" to reified objects.

Note that if we reify field references, there will surely be some people who ask, "why not method references?" I think it is much more difficult to do this, because methods can be overloaded. Which overload of String::valueOf did you want to reify as a MethodReference object?
When we use these as lambdas, context can give us a hint; when crystalizing them as a descriptor object we will have no context. So, there seems to me to be good reason to push back against this request, but it is a choice we should make deliberately.

Annotation parameters

Last, if we had such a FieldRef descriptor, we might like to be able to use them as annotation parameters, making it possible to be more formal about annotations like

class Stream {
    private Lock lock;

    @GuardedBy(Stream::lock) // next() only called while holding lock
    public int next() {...}
}

Probably this would mean having FieldReference implement Constable, so that Holder::field could be put in the constant pool, along with other annotation parameters. This also suggests that a FieldReference object should not directly store the bound receiver, since that could not be put in the constant pool; instead we would want a FieldReference to always be unbound, and then some sort of decorator or wrapper that holds a FieldReference tied to a captured receiver.

Open Questions

The first set of questions is: are these all reasonable, useful features? Am I missing any pitfalls that they imply?

One looming design question is unfortunately syntax: is Foo::x really the best syntax? It's very natural, but it will be ambiguous if Foo also has a method named x. To preserve backwards compatibility with code written before the introduction of field references, we would obviously need to resolve this ambiguity in favor of any applicable method reference over any applicable field reference. It would surely be too extreme to say that it's impossible to get a field reference when a method with the same name exists. So if you really want the field reference in a context like this, we could introduce some alternate syntax to clarify that: Foo:::x, or Foo..x, for example: the details don't have to be sorted out at this time, as much as we need to decide whether to use any new token at all or just reuse the :: token.
But this tie-breaker strategy has a problem: it solves backwards compatibility, while leaving a subtle forward-compatibility pitfall. Holder::field currently resolves to a field reference, but suppose in the future someone adds a method with the same name. As discussed before, we must resolve conflicts in favor of methods, and so Holder::field suddenly becomes a method reference next time you compile the client code. Now class authors can change which member is being accessed by adding a new member, which seems dangerous. But maybe it's fine - adding new overloads of an existing method can already do that, if clients were relying on autoboxing or other type coercions.

We could avoid the difficulty by having no syntactic overlap between field and method references: Holder::toString for methods only, Holder:::field for fields only. That's unlikely to be popular, and indeed it is a bit ugly. Is it better to accept the small danger of ambiguity?

Finally, if anyone has implementation tips I would be happy to hear them. I am pretty new to javac, and while I've thrown together an implementation that desugars field references into getter lambdas it's far from a finished feature, and I'm sure what I've already done wasn't done the best way. Finding all the places that would need to change is no small task.

From forax at univ-mlv.fr  Tue Jun  4 20:50:57 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 4 Jun 2019 22:50:57 +0200 (CEST)
Subject: Revisiting field references
In-Reply-To:
References:
Message-ID: <2069084817.756393.1559681457779.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Alan Malloy"
> To: "amber-spec-experts"
> Sent: Tuesday, June 4, 2019 00:19:10
> Subject: Revisiting field references

Hi Alan,
I'm sorry but I've more questions than answers,

> Hello, amber-spec-experts.
> I understand that "field references" is an idea that was considered when other member references were being implemented, and it seems to have been a "well, maybe someday" feature: nothing fundamentally wrong with it, just not worth delaying method references for. Google is interested in reopening that discussion, and in working on the implementation if a satisfactory design can be found.

The problem with field refs is not the implementation, it's the semantics; the question being: is a field ref a way to expose the getter method ref (and the setter method ref), or is it more like a reified Property?

> For the remainder of this message, we will use this class definition for examples:
>
> public class Holder {
>     private int field;
> }
>
> This class contains only an instance field, but everything in this document applies equally in the case of static fields, except of course that they can't be bound and won't expect a receiver argument.
>
> Additionally, most of this document will assume that Holder::field is the syntax used for creating an unbound reference to that field. This feels very natural of course, but there is a section about the tradeoffs of reusing the :: token for fields.
>
> Getter as Supplier
>
> The most obvious thing you can do with a field is read it. Holder::field could be a ToIntFunction<Holder>, and this::field an IntSupplier (assuming this is a Holder). I suspect that a feature that does this and no more would actually cover a majority of use cases: most people who today want a field reference probably just want a shorter version of a lambda that reads a field. However, this by itself is not really a very compelling addition, simply because it doesn't buy us much: the "workaround" of writing the lambda by hand is not very painful or error-prone, so permitting a reference instead, while nice, is not transformative.
> However, there are some other things we could do with field references, which may make the feature more worthwhile.

We already have that for free with a record; if you write:

record Holder(int field) {}

then these statements already compile:

ToIntFunction<Holder> fun = Holder::field;
IntSupplier fun2 = new Holder(0)::field;

because the generated accessor has the same name as the field.

> Setter as Consumer
>
> The most obvious difference between fields and methods is that while there's only one thing to do with a method (invoke it), you can either read a field or write it. So, while we've already established that this::field could be an IntSupplier, it could, depending on context, instead be an IntConsumer, setting the field when invoked. Likewise Holder::field could be an ObjIntConsumer<Holder> instead of a ToIntFunction<Holder>.
>
> This seems natural enough, but merits discussion instead of just being included in the feature because it is "obvious". Setters are more complicated than getters. The most obvious complication is that they should be illegal if the field is final. More subtly, source code may become harder to understand when the expression this::field may mean two very different things, either a read or a write. The compiler should have enough type information in context to disambiguate, or give an appropriate diagnostic when a use site is ambiguous, e.g. due to overloading, but this information can be difficult for a developer to sort out manually, making every use site a debugging puzzle. They must figure out the target type of the reference to determine whether it is a read or a write.
>
> Increased Transparency
>
> Another appealing thing to do with a field reference is to make it more transparent than a simple lambda. We could have some sort of FieldReference object describing the class in which the field lives, the name and type of the field, and which object (if any) is bound as the receiver.
> This FieldReference object would expose get, and possibly set, methods for the referred-to field. Of course this looks a lot like java.lang.reflect.Field; but instead of one final class using reflection to handle all fields of all classes, we can use the lambda meta-factory (or something like it) to generate specialized subclasses, which can conveniently be bound to receivers as well as being faster.

That's the Property I was talking about above, but don't we already have VarHandle for that? A VarHandle can be initialized using a lazy static final field (see issue 8209964) or by compiler intrinsics (JEP 348).

> An advantage of supporting this is that it could enable libraries that currently accept lambdas to generate more efficient code. For example, consider
>
> Comparator c =
>     Comparator.comparing(a -> a.name)
>               .thenComparing(a -> a.species)
>               .thenComparingInt(a -> a.mass);
>
> A perfectly reasonable Comparator, and much more readable than a nest of if-conditions written by hand. But if used in a tight loop to compare many animals, this is quite expensive compared to the hand-written version, because each comparison may dispatch through many lambdas, and this is not easy for the JIT to inline. If we really wanted to allow Comparator combinators to be used in performance-sensitive situations, Comparator could have an optimize() method that attempts to generate bytecode for an efficient comparator in a way similar to what the lambda meta-factory does.

The issue with that code is that C2 thinks it's a recursive call and stops inlining, but I see this more as a limitation of the way C2 currently works.

Anyway, yes, that's the main difference between the two semantics: either you have method references or you have a full-blown reified object. But it can also be seen as a limitation of the current method reference, which cannot be seen as an expression tree like you can in C#.
In that case, the user can choose which representation it wants.

> Even without field references, that optimize() method could eliminate some lambda calls: instead of a chain of lambdas for each .thenComparing call, it could be unrolled into 3 if statements. But we'd still have 3 lambdas left, to compute the values to compare to each other. If we could pass in a field reference, the optimize() method could introspect on those, allowing it to emit getfield bytecodes directly, saving more indirection, and resulting in the same bytecode you could get by writing this comparator by hand.
>
> I hope it goes without saying that I am not proposing to actually implement Comparator.optimize any time soon: it's just a convenient, well-known example of the kind of library that could be gradually improved by promoting field references from "sugar for a lambda" to reified objects.
>
> Note that if we reify field references, there will surely be some people who ask, "why not method references?" I think it is much more difficult to do this, because methods can be overloaded. Which overload of String::valueOf did you want to reify as a MethodReference object? When we use these as lambdas, context can give us a hint; when crystalizing them as a descriptor object we will have no context. So, there seems to me to be good reason to push back against this request, but it is a choice we should make deliberately.
>
> Annotation parameters
>
> Last, if we had such a FieldRef descriptor, we might like to be able to use them as annotation parameters, making it possible to be more formal about annotations like
>
> class Stream {
>     private Lock lock;
>
>     @GuardedBy(Stream::lock) // next() only called while holding lock
>     public int next() {...}
> }
>
> Probably this would mean having FieldReference implement Constable, so that Holder::field could be put in the constant pool, along with other annotation parameters.
> This also suggests that a FieldReference object should not directly store the bound receiver, since that could not be put in the constant pool; instead we would want a FieldReference to always be unbound, and then some sort of decorator or wrapper that holds a FieldReference tied to a captured receiver.

We still cannot use a method ref (the unbound one) in an annotation.

> Open Questions
>
> The first set of questions is: are these all reasonable, useful features? Am I missing any pitfalls that they imply?
>
> One looming design question is unfortunately syntax: is Foo::x really the best syntax? It's very natural, but it will be ambiguous if Foo also has a method named x. To preserve backwards compatibility with code written before the introduction of field references, we would obviously need to resolve this ambiguity in favor of any applicable method reference over any applicable field reference. It would surely be too extreme to say that it's impossible to get a field reference when a method with the same name exists. So if you really want the field reference in a context like this, we could introduce some alternate syntax to clarify that: Foo:::x, or Foo..x, for example: the details don't have to be sorted out at this time, as much as we need to decide whether to use any new token at all or just reuse the :: token.
>
> But this tie-breaker strategy has a problem: it solves backwards compatibility, while leaving a subtle forward-compatibility pitfall. Holder::field currently resolves to a field reference, but suppose in the future someone adds a method with the same name. As discussed before, we must resolve conflicts in favor of methods, and so Holder::field suddenly becomes a method reference next time you compile the client code. Now class authors can change which member is being accessed by adding a new member, which seems dangerous.
> But maybe it's fine - adding new overloads of an existing method can already do that, if clients were relying on autoboxing or other type coercions.
>
> We could avoid the difficulty by having no syntactic overlap between field and method references: Holder::toString for methods only, Holder:::field for fields only. That's unlikely to be popular, and indeed it is a bit ugly. Is it better to accept the small danger of ambiguity?
>
> Finally, if anyone has implementation tips I would be happy to hear them. I am pretty new to javac, and while I've thrown together an implementation that desugars field references into getter lambdas it's far from a finished feature, and I'm sure what I've already done wasn't done the best way. Finding all the places that would need to change is no small task.

I will just add that there is also no syntax for an array ref. Something you get with a VarHandle.

Rémi

From brian.goetz at oracle.com  Tue Jun  4 23:22:11 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 4 Jun 2019 19:22:11 -0400
Subject: Revisiting field references
In-Reply-To:
References:
Message-ID:

Alan;

Thanks for these thoughts. Indeed, field references are something that keep coming up in the "if we only had" department, and your analysis covers many of the important points. They are especially useful in APIs such as the Equivalence one that Kevin and Liam have already written about, as well as similar APIs (e.g., Comparator factories). Let me make some connections with other roads not taken, as well as some connections with other features that are being considered.

> Getter as Supplier
> Setter as Consumer

It surely seems nice to be able to SAM-convert a field ref to an appropriately-shaped SAM type based on target typing. Another question we should ask is: should field refs (and method refs, for that matter) have a standalone type?
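For context, target typing already lets one method reference take several SAM shapes today, and a method ref has no standalone type at all; the question is whether field refs should behave the same way. A runnable illustration:

```java
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.function.ToIntFunction;

public class TargetTyping {
    public static void main(String[] args) {
        // The same unbound reference, converted to two different SAM shapes:
        Function<String, Integer> f = String::length;
        ToIntFunction<String> g = String::length;
        // A bound reference, capturing its receiver:
        Supplier<Integer> s = "abc"::length;
        System.out.println(f.apply("abc"));    // 3
        System.out.println(g.applyAsInt("abc")); // 3
        System.out.println(s.get());           // 3
        // There is no standalone type: `var x = String::length;` does not compile.
    }
}
```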
In the original JSR-335 discussions, another "left for the future" item was whether method refs should be "enhanced" compared to lambdas. Such areas include: better equals() implementation (two method refs for the same method could be equal), better toString() implementation (using the name of the method, rather than an opaque descriptor), reflective behavior as described in your Increased Transparency section (retrieving the nominal metadata for the method class, name, and descriptor), etc.

Two more areas that were discussed, but left for the future, for method references were:

- Alternate target types, such as Method or MethodHandle (yes, Remi, we see you);
- Explicit method references (with explicit parameter types, such as `Foo::bar(int,int)`), which could be used in contexts where a SAM target type is not available -- such as Method.

Further, the work done on constant folding exposes interesting optimization opportunities for APIs that can truck in member references; for a field ref such as `Holder::field`, foldable (compile-time invocable) APIs could be brought to bear. I don't want to deep dive on this here, other than to point out that there is a significant connection.

> More subtly, source code may become harder to understand when the expression this::field may mean two very different things, either a read or a write.

Thanks for bringing up this point, as it speaks to the "we _can_, but should we?" question. Target typing is a powerful thing, but as you say, it is a little scary for the same locution to represent such different behaviors.

> We could have some sort of FieldReference object describing the class in which the field lives, the name and type of the field, and which object (if any) is bound as the receiver.

A few points here:

- Some of these items are security-sensitive, and others are much less so.
The name of the method or field, for example, is pretty safe to expose (it is already exposed through stack traces), whereas exposing bound receivers or captured arguments seems a pretty clear no-go (I doubt anyone passing `foo::bar` to a library method thinks they are sharing `foo` too.) I think the line here is: expose nominal metadata (class names, method names, method descriptors), and not live objects (class mirrors, method handles, bound arguments, etc), except as mediated by access control (e.g., caller provides a Lookup.)

- It's not either-or; MethodReference / FieldReference could be interfaces, and when SAM-converting a method/field reference to a suitable SAM type, the resulting object is of type `(SamType & MethodReference)`.

- FieldReference/MethodReference could be the standalone type of field refs (and either non-overloaded method references, or method references with explicit parameter types.)

> An advantage of supporting this is that it could enable libraries that currently accept lambdas to generate more efficient code.

This connects with the constant-folding, as well as offering some API-design options, such as:

    static Comparator ofFields(FieldReference... fields)

which would allow for invocations like

    Comparator.ofFields(Foo::a, Foo::b)

Having a factory that takes a varargs of field references is a good trigger to try optimizing transparently, as you know you have optimizable references in hand.

> I hope it goes without saying that I am not proposing to actually implement Comparator.optimize any time soon: it's just a convenient, well-known example of the kind of library that could be gradually improved by promoting field references from "sugar for a lambda" to reified objects.

A good API is both easy to use and easy to optimize. Capturing higher-level semantics such as transparent field refs vs opaque lambdas scores well on both counts, regardless of specific optimization avenues you might have in mind.
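Since neither Comparator.ofFields nor FieldReference exists, the factory shape can only be approximated today with ordinary key extractors; this sketch (ofKeys, Animal, and the unchecked casts are all stand-ins of my own) chains the keys the way ofFields would, minus the ability to introspect and emit getfields:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class OfFields {
    record Animal(String name, String species, int mass) {}

    // Stand-in for the hypothetical Comparator.ofFields(FieldReference...):
    // takes opaque key extractors (here, record accessor references) and
    // compares by each in turn. The casts assume each key is Comparable.
    @SafeVarargs
    @SuppressWarnings({"unchecked", "rawtypes"})
    static <T> Comparator<T> ofKeys(Function<T, ?>... keys) {
        return (a, b) -> {
            for (Function<T, ?> key : keys) {
                int c = ((Comparable) key.apply(a)).compareTo(key.apply(b));
                if (c != 0) return c;
            }
            return 0;
        };
    }

    public static void main(String[] args) {
        List<Animal> zoo = new ArrayList<>(List.of(
                new Animal("rex", "dog", 30),
                new Animal("bob", "cat", 5),
                new Animal("ada", "cat", 4)));
        zoo.sort(ofKeys(Animal::species, Animal::name));
        System.out.println(zoo.get(0).name()); // ada
    }
}
```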
> Annotation parameters
>
> Last, if we had such a FieldRef descriptor, we might like to be able to use them as annotation parameters, making it possible to be more formal about annotations like
>
> class Stream {
>     private Lock lock;
>
>     @GuardedBy(Stream::lock) // next() only called while holding lock
>     public int next() {...}
> }

Method refs in annotations are another one on the "would like to do eventually" list; adding ConstantDynamic removed one of the major impediments, but there's still a bunch of work in between here and there. To your comment about method refs, the main impediment is the lack of explicitly typed method references, but we know how to do that too.

> Probably this would mean having FieldReference implement Constable, so that Holder::field could be put in the constant pool, along with other annotation parameters.

Yes. The key is that any symbolic metadata (Class, MethodType) is mediated by a Lookup (for the reflective path), and by the implicit resolution context (when loading the classfile.)

> This also suggests that a FieldReference object should not directly store the bound receiver, since that could not be put in the constant pool; instead we would want a FieldReference to always be unbound, and then some sort of decorator or wrapper that holds a FieldReference tied to a captured receiver.

A bound method ref is not a constant; only unbound / static method refs would be. Same story for fields.

> Open Questions
>
> The first set of questions is: are these all reasonable, useful features? Am I missing any pitfalls that they imply?
>
> One looming design question is unfortunately syntax: is Foo::x really the best syntax? It's very natural, but it will be ambiguous if Foo also has a method named x.

... which will be the case very often. And in fact, that method is probably (but we can't guarantee) an accessor for the field.
And, while we can construct precedence rules that make sense for various use cases, in reality, sometimes you want Foo::x to be the field (such as in the comparator example), and sometimes you want it to be the method (such as when you're sharing it with foreign code, because you want the defensive copy that the accessor might do.) So even if we bias towards the method (which as you say, is a forced move), there still needs to be a way to denote the field ref explicitly in case of conflict.

> Finally, if anyone has implementation tips I would be happy to hear them. I am pretty new to javac, and while I've thrown together an implementation that desugars field references into getter lambdas it's far from a finished feature, and I'm sure what I've already done wasn't done the best way. Finding all the places that would need to change is no small task.

I know that Dan went through a similar exercise a long time ago, identifying the places in the spec/compiler that would take stress if we wanted to broaden the set of target types for method refs. While that's not exactly the same problem, it would be useful to dredge that up, as it will likely cover some of the same ground.

For prototyping purposes, I wouldn't try to use the same token; that's just making your life harder. Pick something easily parsable, even if it's hideous.

More thoughts later.

From john.r.rose at oracle.com  Tue Jun  4 23:34:48 2019
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 4 Jun 2019 16:34:48 -0700
Subject: Revisiting field references
In-Reply-To:
References:
Message-ID: <60048B26-FA1D-4586-A62A-C2E3D8AC3F00@oracle.com>

On Jun 4, 2019, at 4:22 PM, Brian Goetz wrote:
>
> For prototyping purposes, I wouldn't try to use the same token; that's just making your life harder. Pick something easily parsable, even if it's hideous.
In fact, it's often beneficial to adopt a __ClearlyHideousSyntax for prototyping or semantic discussions, because it sends the clear signal that we are not painting the syntax bikeshed yet. If you pick a merely ugly syntax, someone is liable to waste bandwidth trying to fix it for you. No shame, we've all done it, but let's not...

From brian.goetz at oracle.com  Wed Jun  5 13:30:23 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 5 Jun 2019 09:30:23 -0400
Subject: Revisiting field references
In-Reply-To:
References:
Message-ID: <1A7D3A05-F5A1-43B1-A54D-DB6F9DECCBF7@oracle.com>

More random comments.

> Getter as Supplier

It is worth asking whether bound field refs are worth it at all. On the one hand, they make perfect sense from a type-system perspective, but on the other, I'm struggling to think of real-world use cases where they pay for themselves (vs a lambda like () -> holder.field). Bound method refs are the least commonly used kind of method refs, and I would think bound fields would be even less useful. On the other other hand, the asymmetry might seem glaring.

On the subject of whether the Foo::m syntax has enough juice to get all the way there, let's keep in mind that there's a new kind of class member coming down the road -- patterns -- and "pattern reference" might want to eventually join this party. (Another precedent we can lean on for field ref syntax is the one we use for class literals -- Foo.class. This is less pretty but clearly more extensible to other kinds of members.)

Starting from where we are now, there are a number of (mostly orthogonal!) directions we can branch into:

- Explicit method refs. This is a method ref with explicit parameter types (Foo::m(int, int)); it is marginally useful for disambiguation today, but an enabler for other features on this list (such as alternate target types, or standalone method refs.)

- Alternate target types.
Currently a method ref can be converted only to a SAM type; conversion to j.l.r.Method is potentially useful.
- Sharper method refs. Adding in a MethodRef interface that carries symbolic information about the method ref; when converting a mref to a SAM type, we'd get an object of type (SAM & MethodRef).
- Better equals/toString for method refs vs lambdas.
- Standalone types. This would give an exact method ref a standalone type of MethodRef.
- Fields.
- Usability in annotations.
- Abbreviated member refs (::m).
- Generalized syntax. If we choose to keep going with ::, we'll have to face down the ambiguities, which likely means a way to disambiguate. Clearly we can disambiguate in the direction of methods with explicit parameter types, but how do we disambiguate in the direction of fields (or patterns)? Perhaps (just throwing out a dumb idea for sake of exposition), the method ref Foo::m is really shorthand for Foo::method::m, where the "method" is inferred most of the time.
I realize tackling these all at once is overwhelming, but having a full map of the design space is likely to be helpful in identifying interactions and directions of highest leverage. From james.laskey at oracle.com Thu Jun 6 15:46:48 2019 From: james.laskey at oracle.com (Jim Laskey) Date: Thu, 6 Jun 2019 12:46:48 -0300 Subject: Text blocks Message-ID: <15FB6F91-496F-4E15-A6DE-E10B9B06C950@oracle.com> Just a note that text block (https://openjdk.java.net/jeps/355 ) changesets have been pushed to the jdk. Thank you all for your input. Beat them up when you have a chance. I'll be preparing a proper release note in the coming weeks. But in the meantime (jshell changesets should be in tomorrow); Preview Feature Text blocks exist as a Preview feature of the Java Language JEP 12 .
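For readers unfamiliar with the feature being announced, a text block is a multi-line string literal delimited by triple quotes, with incidental indentation stripped. A minimal illustrative snippet (mine, not from the JEP):

```java
public class Main {
    public static void main(String[] args) {
        // A text block: the closing delimiter's indentation determines
        // how much leading whitespace is stripped from each line.
        String json = """
                {
                    "name": "amber"
                }
                """;
        System.out.println(json.contains("\"name\": \"amber\""));
    }
}
```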
This means that in order to use text blocks in your Java code, you must use the --enable-preview and -source 13 flags on the javac command line and --enable-preview on the java command line:

    javac --enable-preview -source 13 ...
    java --enable-preview ...

If you are using jshell to experiment with text blocks then you must also use the --enable-preview flag:

    jshell --enable-preview

Cheers, -- Jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Thu Jun 6 20:48:33 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 6 Jun 2019 16:48:33 -0400 Subject: Records and annotations Message-ID: Recall that some time ago, we were discussing some different directions for how to handle annotations on record components.
Approach A: Record components are a new kind of annotation target location; if an annotation is meta-annotated with this target kind, it can be applied to record components. Expose reflection over annotations on record components as with other features.
Approach B: Annotations on record components are merely "pushed down" to the corresponding JLS-mandated API elements (constructor parameters, accessor methods, fields), according to the allowed target kinds of the annotation (if the annotation is only valid on fields, it is only pushed down to fields).
Approach B+: Like B, except that we continue to reify the provenance of the annotations, and expose them through reflection as annotations on the record component _in addition to_ annotations on the mandated API elements.
In an alternate universe where we had done records first, and were now adding annotations, we'd surely pick A.
However, in the current universe, picking A would put us in an adoption bind; we have to wait for specific annotations to acquire knowledge of the new target kinds (through the @Target meta-annotation), and for frameworks to be aware of annotations on record components, before we can migrate classes dependent on those annotations/frameworks to be records. Further, library authors suffer a familiar problem: if @Foo is meta-annotated with a target kind of RECORD_COMPONENT, then that means it must have been compiled against a Java 14+ JDK, which means that the resulting classes are dependent on JDK 14+, unless they use something like MR Jars to have two versions in one JAR. This would further impede adoption. For guidance in our A/B choice, we can look to enums. Enum constants are surely a first-class language element, and can be annotated, but they do not have their own annotation target kind; instead, the compiler pushes down the annotations onto the fields that carry the enum constants. While this might be an uneasy dependence on the translation strategy, in fact this translation strategy is mandated (because we want migrating between a class with static constant fields and an enum to be a binary-compatible migration). Records are in a similar boat as enums; while there is a translation strategy going on here, the elements of it are mandated by the language specification. So I think the trick that enums use is a reasonable one to carry forward to records, allowing us to seriously consider B/B+. (Strategy A also has a lot of accidental detail: class file attributes for various kinds of options and bookkeeping to manage exactly what is being annotated, reflection API surface, etc.)
The following type-checking strategy applies to B and B+:
- A record component may be annotated by a declaration annotation with no target kind meta-annotation, or whose target kind includes one or more of PARAMETER, FIELD, or METHOD
- The type of a record component may be annotated by a type annotation
Strategy B then entails pushing down annotations through tree manipulation to the right places. For PARAMETER annotations, they are pushed down to the parameters of the implicit constructor; for FIELD annotations, to the fields; for METHOD annotations, to the accessor. And for type annotations, to the corresponding type use in constructor parameters, field declarations, and accessor methods. (And if the annotation is applicable to more than one of these, it is pushed down to all applicable targets.) But wait! What if the author also explicitly declares, say, the accessor method?

    record R(int a) {
        int a() { return a; }
    }

No problem, we can still push the annotation down, and there is precedent for annotations being "inherited" in this way. But wait! What if the author explicitly declares the same annotation, but with conflicting values?

    record R(@Foo(1) int a) {
        @Foo(2) int a() { return a; }
    }

We can still push down @Foo(1), and then look to see if @Foo is a repeating annotation. If it is, great; if not, then a() has two @Foo annotations, which results in a compilation error. So we always push down, and then enforce arity rules. By pushing annotations down in this manner, existing reflection can pick up the annotations on the various class members with no additional work or reflection API surface. Are we done? We might be done, or we might want to do more (strategy B+). In B+, we _additionally_ reify which annotations were present on the component, and (possibly) expose additional reflection API surface to query annotations on record components. Why would we want to do this?
Well, one reason that occurs to me is that we've been holding the move of "abstract records", and records extending abstract records, in our back pocket. In this case, we might wish to copy annotations down from a record component in a superclass to the corresponding pseudo-component in the subclass, for example. But I'm not particularly compelled by this -- I think the strategy we took for enums is mostly good enough. So I'm voting for pure B. From john.r.rose at oracle.com Thu Jun 6 22:18:17 2019 From: john.r.rose at oracle.com (John Rose) Date: Thu, 6 Jun 2019 15:18:17 -0700 Subject: Records and annotations In-Reply-To: References: Message-ID: On Jun 6, 2019, at 1:48 PM, Brian Goetz wrote: > ... > But, I'm not particularly compelled by this -- I think the strategy we took for enums is mostly good enough. So I'm voting for pure B. Yes, I track with your reasoning. Pure B is perfectly fine wherever there is a mandated translation strategy to inform the user where (in the reflective APIs) to look for the annotation. You can achieve A-like effects with appropriate conventions. For example, if I have a little library of annotations just for record components, I can target them to fields, and look for them in Class.getDeclaredFields, even if the fields happen to be private. (Right?) I'd kind of like to call this B+-. No new channels or API points, but a known way to find component annotations, by looking at the fields. If we wanted to make things more explicit, we could incrementally modify a plan B with A-like conventions to a plan B+ in which there's a new annotation target for components, but it still gets passed through the field channels. This wouldn't require new API points or classfile formats. Strictly speaking a new annotation target isn't required either, just some marker or other; another meta-annotation, but not a target type.
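The field-channel convention described here can be sketched with plain reflection, as things shipped: a component annotation targeted at fields is pushed down to the private backing field, where a framework can find it via getDeclaredField(s). (@Foo is a hypothetical annotation; the record syntax assumes a JDK with records.)

```java
import java.lang.annotation.*;
import java.lang.reflect.Field;

public class Main {
    // Hypothetical component annotation, targeted at fields per plan B.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Foo { int value(); }

    record R(@Foo(1) int a) {}

    public static void main(String[] args) throws Exception {
        // The compiler pushes @Foo down to the private backing field,
        // where a framework can retrieve it reflectively.
        Field f = R.class.getDeclaredField("a");
        System.out.println(f.getAnnotation(Foo.class).value());
    }
}
```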
The net of the above is that B seems sufficient, although it also seems necessary to specify a deterministic place to find component annotations per se (fields, I suppose). And if we want to do more it's easy to add a bit of meta-data with a meta-annotation, not necessarily a target type. From brian.goetz at oracle.com Fri Jun 7 19:11:05 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 7 Jun 2019 15:11:05 -0400 Subject: Looking beyond records: better constructors (and deconstructors) Message-ID: <9159b986-caea-9240-26f9-09f9f595d5b0@oracle.com> With most of the decisions regarding records being settled, let's take a few minutes to look down the road. Records are great for where they apply, but there are plenty of classes that suffer from error-prone boilerplate that do not qualify to be records. We would like for some of the record goodies to filter down to ordinary classes, where possible. The lowest-hanging fruit here is constructors: many constructors look vaguely like (or can be made to look like) this:

    Foo(ARGS) {
        if (ARGS NOT VALID)
            throw new IllegalArgumentException(...);
        if (ARGS NEED TO BE NORMALIZED / COPIED) {
            ARGS = normalize(ARGS);
        }
        this.ARGS = ARGS;
    }

That is, many constructors take arguments that are candidate values for their fields, and then validate the arguments, possibly normalize or defensively copy them, and write them to the corresponding fields with the error-prone boilerplate of:

    this.x = x;
    this.y = y;

Similarly, when we add deconstruction patterns, a deconstructor will likely have the similar idiom, in reverse:

    x = this.x;
    y = this.y;

Records sidestep this because we have already committed to a deterministic relationship between the public construction/deconstruction protocol and the internal representation. And records let you skip the initialization boilerplate, even if you have an explicit constructor:

    record Range(int low, int high) {
        public Range {
            if (low > high)
                throw new IAE("Bad range: [%d, %d]".formatted(low, high));
            // Implicit field initialization FTW!
        }
    }

The author provides the explicit validity check, but the compiler fills in the boilerplate field initialization. Now, the constructor only contains the "non-obvious" code. Can we share this with ordinary classes? What we would need is to tell the compiler that the constructor argument "int low" and the field "int low" are describing the same thing. This is commonly the case, but purely convention. We could, through a variety of syntactic indicators, capture this relationship. (Please, let's decide whether we like the feature before we bikeshed the syntax.) For example:

    class Foo {
        private int x;
        public Foo(this int x) { }
    }

where `this int x` means that the constructor has a parameter `int x`, which corresponds to the field `int x` of the current class. The compiler can reciprocate by filling in the `this.x = x` boilerplate where needed, and the same with a deconstruction pattern:

    class Foo {
        private int x;
        public Foo(this int x) { }
        public pattern Foo(this int x) { }
    }

If the constructor wants to do validation and/or normalization, it is like what we do with records -- we put that in the constructor, mutating the arguments if needed, and then the argument values are committed to the fields implicitly if they are DU on all paths out of the ctor. With such a feature, then the special constructor form for records:

    public Range { STUFF }

becomes simply shorthand for

    public Range(this int low, this int high) { STUFF }

reducing some of the "magic" associated with records.
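The compact-constructor behavior that this proposal generalizes can be seen with records as they shipped; a small runnable sketch (with IAE spelled out, since IAE is shorthand in the mail):

```java
public class Main {
    record Range(int low, int high) {
        // Compact constructor: after the body runs, the (possibly
        // mutated) parameters are implicitly committed to the fields.
        Range {
            if (low > high)
                throw new IllegalArgumentException(
                        "Bad range: [%d, %d]".formatted(low, high));
        }
    }

    public static void main(String[] args) {
        Range r = new Range(1, 5);
        System.out.println(r.low() + ".." + r.high());
        try {
            new Range(5, 1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```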
From brian.goetz at oracle.com Tue Jun 11 19:21:16 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 15:21:16 -0400 Subject: Towards better serialization Message-ID: I've posted a document at: http://cr.openjdk.java.net/~briangoetz/amber/serialization.html on an exploration we've been doing to address some of the shortcomings of Java serialization, building on other tools that have been (or will be) added to the platform. Rather than attempt to add band-aids on existing serialization, it addresses the risks of serialization at their root. It is somewhat of a shift -- it cannot represent all object graphs, and it makes some additional work for the author -- but it brings object serialization into the light, where it needs to be in order to be safer. Comments welcome! -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Jun 11 19:40:52 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 15:40:52 -0400 Subject: Records: supertype? Message-ID: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> We've gone back and forth a few times on whether records should have a special supertype, as enums do (java.lang.Enum). Enums benefit from having an explicit supertype for a number of reasons:
- It provides a type bound that is useful in API signatures (e.g., EnumSet<E extends Enum<E>>);
- The base class actually has state;
- It provides declarations of public methods such as `ordinal()`, `compareTo()`, and `getDeclaringClass()`, and implementations of methods such as `toString()` and `readObject()`;
- It declares supertypes such as `Comparable<E>`;
- It provides a place to capture specification of enum-specific behavior, in a way that is more discoverable within the IDE than putting it in the JLS.
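The first of these benefits -- an Enum-bounded type parameter in an API signature -- is the pattern EnumSet itself uses. A small illustrative sketch (the method name is invented):

```java
import java.util.EnumSet;

public class Main {
    enum Color { RED, GREEN, BLUE }

    // An Enum-bounded signature in the style of EnumSet<E extends Enum<E>>:
    // only enum types can instantiate E.
    static <E extends Enum<E>> E first(EnumSet<E> set) {
        return set.iterator().next();
    }

    public static void main(String[] args) {
        // EnumSet iterates in declaration (natural) order, regardless of
        // the order the constants were added, so RED comes out first.
        System.out.println(first(EnumSet.of(Color.BLUE, Color.RED)));
    }
}
```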
One could make the same argument that records should similarly have an explicit supertype (e.g., `AbstractRecord`), but the arguments aren't as decisive:
- No state in the base class;
- No record-specific methods or supertypes (currently);
- Hard to imagine `<R extends AbstractRecord<R>>` showing up in APIs, though it's possible.
Which is to say, the choice of whether or not to have a base class for records is less obvious. Still:
- The specification for equals/toString/hashCode for records is somewhat refined over that of Object, and putting it in the Javadoc is the sensible place to put it;
- One can imagine wanting to add methods like `toJson()` to records in the future, and an abstract base class is a sensible place to put it.
So, one could make a case for "do the simplest thing" (no supertype), or, alternately, be more like enums, and still get some benefits. If we decide to go for the supertype, there's a bikeshed to paint. Record is the obvious analogue to Enum, but I worry that it will create clashes with user code (java.lang names are always auto-imported). AbstractRecord feels a little clunky. (Once you've weighed in on the first question, you can proceed to the bikeshed.) From kevinb at google.com Tue Jun 11 20:03:42 2019 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 11 Jun 2019 13:03:42 -0700 Subject: Records: supertype? In-Reply-To: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> Message-ID: It sounds to me like nothing bad whatsoever will come from leaving it out. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Jun 11 20:07:18 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 16:07:18 -0400 Subject: Records: supertype?
In-Reply-To: References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> Message-ID: <536bdc05-888b-763b-05cc-a2f86cb1b5cd@oracle.com> > It sounds to me like nothing bad whatsoever will come from leaving it out. We lose out on some future flexibility to add new methods, which might amount to nothing, or might be a big deal. The main thing we gain immediately is that we have a place to hang specification, such as the refined specification for `equals()`, or general constraints on record-ness, where there is at least some chance users will see it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Jun 11 20:16:53 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 16:16:53 -0400 Subject: Records and serialization Message-ID: <4ae86763-040a-f834-1fee-843d9ab3837d@oracle.com> So far, we've taken a middle-of-the-road position on records and serialization, where we generate a `readResolve()` method that pipes the at-rest state through the canonical constructor, to gain the benefit of validation checks. Some have argued to do less, but I think we can do a little more, and in light of the direction outlined earlier today for serialization, we should do more. The semantics of a record are that we derive all of an object's standard protocols -- construction, deconstruction (whether through pattern match or accessors), equality, hashing, and string representation -- from the state description. Deriving the serialization protocol from the state description similarly makes sense. Which would mean: the state description is also the serialized form.
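The generated `readResolve()` behavior described above can be approximated by hand for an ordinary class; a sketch of the idea (illustrative names, not the actual generated code):

```java
import java.io.*;

public class Main {
    // Hand-written analogue of the generated behavior: readResolve()
    // pipes the deserialized at-rest state back through the canonical
    // constructor, so its validation runs on the back-door path too.
    static class Range implements Serializable {
        final int low, high;
        Range(int low, int high) {
            if (low > high) throw new IllegalArgumentException("bad range");
            this.low = low;
            this.high = high;
        }
        private Object readResolve() throws ObjectStreamException {
            return new Range(low, high); // re-validate on deserialization
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(new Range(1, 5));
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            Range r = (Range) in.readObject();
            System.out.println(r.low + ".." + r.high);
        }
    }
}
```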
So not only do we want to generate a readResolve() method as we currently have, but we probably want to prohibit specifying explicit readResolve(), readObject(), and writeObject() methods (and other serialization knobs) on Serializable records -- the canonical constructor should be the line of defense against bad data for both the front-door and back-door APIs. This also, as it turns out, yields the serialization protocol we'd get if we implicitly marked the canonical constructor with `@Deserializer` and the canonical deconstruction pattern with `@Serializer`, as per the doc dropped today. From kevinb at google.com Tue Jun 11 20:51:39 2019 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 11 Jun 2019 13:51:39 -0700 Subject: Records: supertype? In-Reply-To: <536bdc05-888b-763b-05cc-a2f86cb1b5cd@oracle.com> References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> <536bdc05-888b-763b-05cc-a2f86cb1b5cd@oracle.com> Message-ID: On Tue, Jun 11, 2019 at 1:07 PM Brian Goetz wrote: > > It sounds to me like nothing bad whatsoever will come from leaving it out. > > > We lose out on some future flexibility to add new methods, which might > amount to nothing, or might be a big deal. > Wouldn't we just introduce the type then once we needed it? It would be awkward, but would it be impossible or inadvisable? > The main thing we gain immediately is that we have a place to hang > specification, such as the refined specification for `equals()`, or general > constraints on record-ness, where there is at least some chance users will > see it. > That is nice, but I think any tools that generate/show documentation ought to do something useful as a special case for records even without the supertype. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Tue Jun 11 21:05:38 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 17:05:38 -0400 Subject: Records: supertype? In-Reply-To: References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> <536bdc05-888b-763b-05cc-a2f86cb1b5cd@oracle.com> Message-ID: > We lose out on some future flexibility to add new methods, which > might amount to nothing, or might be a big deal. > > > Wouldn't we just introduce the type then once we needed it? It would > be awkward, but would it be impossible or inadvisable? That would not be binary-compatible. Let's say we had

    record Foo(int x) {}

which was translated without a supertype. Now we later try to add in the supertype, say with a `.toJson()` method, but don't recompile Foo. Then:

    Record r = (Record) aFoo;
    String s = r.toJson();

compiles, but throws some form of IncompatibleClassChangeError, since a Foo does not extend Record at runtime. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Jun 11 21:29:51 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 17:29:51 -0400 Subject: Towards better serialization In-Reply-To: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> Message-ID: > About the details, > - using factories, constructor and matchers: yes > - using annotations @Serializer/@Deserializer + version, > this part is still too magic, no?, an annotation is not better than an empty interface, it's still not integrated with the language, @Serializer/@Deserializer should be keywords, and version can be a parameter, so you can disambiguate versions in code (framework may use annotation on top of that mechanism to provide a more declarative API). While you know that I reject about 99% of the proposed uses of annotations as "that's not what annotations are for", this one really does fit the bill.
Because, what these annotations do is _capture design intent_ that these members are intended, by the author, to be usable by serialization frameworks for certain activities. But it does not affect their accessibility, or semantics, or the generated bytecode. The language has (almost) no interest in these annotations; they are a side-channel signal between a class and a serialization framework (and not just Java serialization) that these members are suitable and intended for a certain purpose. (Yes, the compiler may wish to do additional type checking, like checking that the argument lists for the serializer and deserializer are compatible, and issue diagnostics accordingly, but that's also within the rules for annotations, like @Override.) They are not empty; they are parameterized (at least) by version. And the version cannot be a runtime parameter to the serializer; the caller has no idea of the class-specific current version numbers of the zillion classes in an object graph. They are properties of the class itself, as of the time it was compiled. > - the keyword "open", i think it's not needed, in fact i hope it's not needed, we already have enough visibility keywords in Java. Frameworks can access a JDK API that will provide access to the method marked with the keywords "serializer" and "deserializer" (also your use of the keyword open is close to the initially proposed use of the keyword module in the JSR 294 which was withdrawn). Now THAT would move it over the line where annotations would not be OK, because then they would affect the semantics of the class. There are surely a range of options here, but the one you propose takes two orthogonal considerations and couples them -- which is reinventing one of the sins of original serialization. (And, other frameworks (e.g., dependency injection, mocking, etc) have similar need for dynamic access to members that are not intended as part of the "front door" API anyway.)
Having something that means both "use me for serialization" and "throw the usual accessibility rules out the window" is not a primitive. These are separate things that want separate markings. > And nitpicking, can we agree that in a pattern the parameters act more as return values than as parameters, so instead of writing > public pattern serializeMe(String serverName) { > serverName = conn.getName(); > } > I prefer > public pattern (String serverName) serializeMe { > return (conn.getName()); > } OMG, are you seriously going to bikeshed the syntax of a *different feature* here? Really? Really? (And no, we cannot agree that.) From kevinb at google.com Tue Jun 11 21:48:30 2019 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 11 Jun 2019 14:48:30 -0700 Subject: Records: supertype? In-Reply-To: References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> <536bdc05-888b-763b-05cc-a2f86cb1b5cd@oracle.com> Message-ID: D'argh, sorry you sometimes have to remind me of basic things. Well, that turns the thing upside down then. I'm not on Team Leave It Out, then. Naming... sure, something like 0.02% of all our classes/interfaces are named exactly `Record`. That's higher than I'd have guessed. Clashing with a java.lang name isn't a great thing; on the other hand we also have dozens of classes named exactly `String` and nothing ever really exploded from it. Of course, `AbstractRecord` is wide open. The main problem I see with that name is that it will be weird if we end up allowing users to write their own abstract records. (It's *slightly* weird to have it named "*Record" when it *is not a record* but it's already the case that Enum is not an enum, so.) I guess I would lean toward just using the straightforward name `Record`. On Tue, Jun 11, 2019 at 2:05 PM Brian Goetz wrote: > > We lose out on some future flexibility to add new methods, which might >> amount to nothing, or might be a big deal. >> > > Wouldn't we just introduce the type then once we needed it?
It would be > awkward, but would it be impossible or inadvisable? > > > That would not be binary-compatible. > > Let's say we had > > record Foo(int x) {} > > which was translated without a supertype. Now we later try to add in the > supertype, say with a `.toJson()` method, but don't recompile Foo. Then: > > Record r = (Record) aFoo; > String s = r.toJson(); > > compiles, but throws some form of IncompatibleClassChangeError, since a > Foo does not extend Record at runtime. > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Tue Jun 11 21:51:54 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 17:51:54 -0400 Subject: Records: supertype? In-Reply-To: References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> <536bdc05-888b-763b-05cc-a2f86cb1b5cd@oracle.com> Message-ID: > Of course, `AbstractRecord` is wide open. The main problem I see with > that name is that it will be weird if we end up allowing users to > write their own abstract records. Which may well happen someday. (Which also makes giving Record an F-bounded type parameter a little more dodgy.) > (It's *slightly* weird to have it named "*Record" when it *is not a > record* but it's already the case that Enum is not an enum, so.) > > I guess I would lean toward just using the straightforward name `Record`. Yeah, that seems the most straightforward choice. Along with the same rule we have for Enum, where it cannot be extended by ordinary classes. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From forax at univ-mlv.fr Tue Jun 11 22:41:31 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 12 Jun 2019 00:41:31 +0200 (CEST) Subject: Towards better serialization In-Reply-To: References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> Message-ID: <1296837984.561786.1560292891629.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "Remi Forax" > Cc: "amber-spec-experts" > Sent: Tuesday, June 11, 2019 23:29:51 > Subject: Re: Towards better serialization [...] > >> And nitpicking, can we agree that in a pattern the parameters act more as return >> values than as parameters, so instead of writing >> public pattern serializeMe(String serverName) { >> serverName = conn.getName(); >> } >> I prefer >> public pattern (String serverName) serializeMe { >> return (conn.getName()); >> } > > OMG, are you seriously going to bikeshed the syntax of a *different > feature* here? Really? Really? > > (And no, we cannot agree that.) i will not respond to the bikeshedding bait, i'm talking about the semantics here and like you i'm using a Java-like syntax to explain my point, this is an extractor, it's a method that extracts information from an instance, it's not a method parameterized by that information.
Rémi From forax at univ-mlv.fr Tue Jun 11 22:47:36 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 12 Jun 2019 00:47:36 +0200 (CEST) Subject: Towards better serialization In-Reply-To: References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> Message-ID: <998633524.561933.1560293256055.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "Remi Forax" > Cc: "amber-spec-experts" > Sent: Tuesday, June 11, 2019 23:29:51 > Subject: Re: Towards better serialization >> About the details, >> - using factories, constructor and matchers: yes >> - using annotations @Serializer/@Deserializer + version, >> this part is still too magic, no?, an annotation is not better than an empty >> interface, it's still not integrated with the language, >> @Serializer/@Deserializer should be keywords, and version can be a parameter, >> so you can disambiguate versions in code (framework may use annotation on top >> of that mechanism to provide a more declarative API). > [..] > > They are not empty; they are parameterized (at least) by version. And > the version cannot be a runtime parameter to the serializer; the caller > has no idea of the class-specific current version numbers of the zillion > classes in an object graph. They are properties of the class itself, as > of the time it was compiled. Being able to pass the version in the serializer allows a new version of the class that uses an instance created from an old version of the stream to be encoded as an old version. > >> - the keyword "open", i think it's not needed, in fact i hope it's not needed, >> we already have enough visibility keywords in Java. Frameworks can access a JDK >> API that will provide access to the method marked with the keywords >> "serializer" and "deserializer" (also your use of the keyword open is close to >> the initially proposed use of the keyword module in the JSR 294 which was >> withdrawn).
> Now THAT would move it over the line where annotations would not be OK, > because then they would affect the semantics of the class. now you start to see my point > > There are surely a range of options here, but the one you propose takes > two orthogonal considerations and couples them -- which is reinventing > one of the sins of original serialization. (And, other frameworks > (e.g., dependency injection, mocking, etc) have similar need for dynamic > access to members that are not intended as part of the "front door" API > anyway.) you want a special mechanism for the serialization, no? Otherwise, we already have java.lang.invoke.Lookup.privateLookupIn() and the open keyword on the module/package. It already encodes the semantics "we believe in encapsulation but sometimes we don't". Having a finer grain version of "open" goes against what was decided by the JPMS EG in my humble opinion. Rémi From brian.goetz at oracle.com Tue Jun 11 22:54:16 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 18:54:16 -0400 Subject: Towards better serialization In-Reply-To: <1296837984.561786.1560292891629.JavaMail.zimbra@u-pem.fr> References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> <1296837984.561786.1560292891629.JavaMail.zimbra@u-pem.fr> Message-ID: <045e30a0-c795-97e0-ad3a-ecae937e1199@oracle.com> > i'm talking about the semantics here and like you i'm using a Java-like syntax to explain my point, > this is an extractor, it's a method No -- extractors are not methods. In any case, this is not the appropriate time or place to discuss the surfacing of patterns in the language syntax now. I know you've been wanting to have this discussion for a long time, but we're not there yet. I'm working on it.
From brian.goetz at oracle.com Tue Jun 11 23:01:38 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Jun 2019 19:01:38 -0400 Subject: Towards better serialization In-Reply-To: <998633524.561933.1560293256055.JavaMail.zimbra@u-pem.fr> References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> <998633524.561933.1560293256055.JavaMail.zimbra@u-pem.fr> Message-ID: <1a720fcd-617d-c82d-0f15-f19479a67eb1@oracle.com> >> There are surely a range of options here, but the one you propose takes >> two orthogonal considerations and couples them -- which is reinventing >> one of the sins of original serialization. (And, other frameworks >> (e.g., dependency injection, mocking, etc.) have similar need for dynamic >> access to members that are not intended as part of the "front door" API >> anyway.) > you want a special mechanism for the serialization, no? No, I want an _unspecial_ mechanism for the serialization. I want the user to code with ordinary constructors and pattern extractors, and have serialization just call them, informed by metadata (which is how we pass information to frameworks.) > Otherwise, we already have java.lang.invoke.MethodHandles.privateLookupIn() and the open keyword on the module/package. Yes, Lookup is a possible implementation mechanism for such a relaxed encapsulation mechanism. But aligning with what modules do, just at a finer-grained level, is extending an existing concept in a natural way, which seems preferable to adding a new concept to the programming model. > It already encodes the semantics "we believe in encapsulation but sometimes we don't". > Having a finer-grained version of "open" goes against what was decided by the JPMS EG, in my humble opinion. > We could surely ditch it, and say "If you want to have serializable classes in your module, open the whole module." And some users might be OK with that, even though it is squashing a bug with a tank. 
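For readers unfamiliar with the mechanism Rémi and Brian are referring to: `privateLookupIn` already provides dynamic access to members that are not statically accessible, subject to module open-ness. A minimal sketch (the `Secret` class and its field are invented for illustration):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class Secret {
    private final int value = 42;   // not part of the "front door" API
}

public class LookupDemo {
    public static void main(String[] args) throws Exception {
        // privateLookupIn succeeds here because both classes live in the
        // same (unnamed) module; across module boundaries the target
        // package would have to be opened to the caller's module.
        MethodHandles.Lookup lookup =
            MethodHandles.privateLookupIn(Secret.class, MethodHandles.lookup());
        VarHandle vh = lookup.findVarHandle(Secret.class, "value", int.class);
        System.out.println((int) vh.get(new Secret()));   // prints 42
    }
}
```

This is exactly the "dynamic access regardless of static accessibility" that the proposed `open` modifier would express at member granularity rather than package or module granularity.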
But, in reality this _is_ a new category of accessibility modifier -- that a method is dynamically accessible regardless of its static accessibility. And having this appear clearly in the source file makes it much more obvious what is going on. The high-order bit here is "banish the magic". From amaembo at gmail.com Wed Jun 12 04:57:21 2019 From: amaembo at gmail.com (Tagir Valeev) Date: Wed, 12 Jun 2019 11:57:21 +0700 Subject: Records: supertype? In-Reply-To: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> References: <1c9c28fb-2e71-b566-b4e2-b0c36bc71c8d@oracle.com> Message-ID: While java.lang is the most obvious place to put the new base class, it's not the only possibility. I think such a base class is not important enough to be imported by default. It could be placed in a separate package like java.lang.record to avoid any name clash with existing code. With best regards, Tagir Valeev Wed, 12 Jun 2019, 2:41 Brian Goetz : > We've gone back and forth a few times on whether records should have a > special supertype, as enums do (java.lang.Enum.) > > Enums benefit from having an explicit supertype for a number of reasons: > > - It provides a type bound that is useful in API signatures (e.g., > EnumSet<E extends Enum<E>>); > - The base class actually has state; > - It provides declarations of public methods such as `ordinal()`, > `compareTo()`, and `getDeclaringClass()`, and implementations of methods > such as `toString()` and `readObject()`; > - It declares supertypes such as `Comparable`; > - It provides a place to capture specification of enum-specific > behavior, in a way that is more discoverable within the IDE than putting > it in the JLS. > > One could make the same argument that records should similarly have an > explicit supertype (e.g., `AbstractRecord`), but the arguments aren't as > decisive: > > - No state in the base class; > - No record-specific methods or supertypes (currently); > - Hard to imagine `<R extends AbstractRecord<R>>` showing up in APIs, > though it's possible. 
> > Which is to say, the choice of whether or not to have a base class for > records is less obvious. Still: > > - The specification for equals/toString/hashCode for records is > somewhat refined over that of Object, and putting it in the Javadoc is > the sensible place to put it; > - One can imagine wanting to add methods like `toJson()` to records in > the future, and an abstract base class is a sensible place to put it. > > So, one could make a case for "do the simplest thing" (no supertype), > or, alternately, be more like enums, and still get some benefits. > > If we decide to go for the supertype, there's a bikeshed to paint. > Record is the obvious analogue to Enum, but I worry that it will create > clashes with user code (java.lang names are always auto-imported.) > AbstractRecord feels a little clunky. (Once you've weighed in on the > first question, you can proceed to the bikeshed.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Jun 12 07:35:13 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 12 Jun 2019 09:35:13 +0200 (CEST) Subject: Towards better serialization In-Reply-To: References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> Message-ID: <275973379.624365.1560324913591.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > À: "Remi Forax" > Cc: "amber-spec-experts" > Envoyé: Mardi 11 Juin 2019 23:29:51 > Objet: Re: Towards better serialization >> About the details, >> - using factories, constructors and matchers: yes >> - using annotations @Serializer/@Deserializer + version, >> this part is still too magic, no?, an annotation is not better than an empty >> interface, it's still not integrated with the language, >> @Serializer/@Deserializer should be keywords, and version can be a parameter, >> so you can disambiguate versions in code (frameworks may use annotations on top >> of that mechanism to provide a more declarative API). 
> > While you know that I reject about 99% of the proposed uses of > annotations as "that's not what annotations are for", this one really > does fit the bill. Because, what these annotations do is _capture > design intent_ that these members are intended, by the author, to be > usable by serialization frameworks for certain activities. But it does > not affect their accessibility, or semantics, or the generated > bytecode. The language has (almost) no interest in these annotations; > they are a side-channel signal between a class and a serialization > framework (and not just Java serialization) that these members are > suitable and intended for a certain purpose. (Yes, the compiler may > wish to do additional type checking, like checking that the argument > lists for the serializer and deserializer are compatible, and issue > diagnostics accordingly, but that's also within the rules for > annotations, like @Override.) The main difference between @Override, @FunctionalInterface and @Serializer/@Deserializer is that the former are a form of documentation: the code still works if you remove them. That's not true of the two annotations you are proposing. Those annotations add a new contract at the language level, so they should be keywords and not annotations. 
Rémi From forax at univ-mlv.fr Wed Jun 12 08:04:21 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 12 Jun 2019 10:04:21 +0200 (CEST) Subject: Towards better serialization In-Reply-To: <1a720fcd-617d-c82d-0f15-f19479a67eb1@oracle.com> References: <1441489070.554776.1560287280181.JavaMail.zimbra@u-pem.fr> <998633524.561933.1560293256055.JavaMail.zimbra@u-pem.fr> <1a720fcd-617d-c82d-0f15-f19479a67eb1@oracle.com> Message-ID: <733188012.647849.1560326661439.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > À: "Remi Forax" > Cc: "amber-spec-experts" > Envoyé: Mercredi 12 Juin 2019 01:01:38 > Objet: Re: Towards better serialization >>> There are surely a range of options here, but the one you propose takes >>> two orthogonal considerations and couples them -- which is reinventing >>> one of the sins of original serialization. (And, other frameworks >>> (e.g., dependency injection, mocking, etc.) have similar need for dynamic >>> access to members that are not intended as part of the "front door" API >>> anyway.) >> you want a special mechanism for the serialization, no? > > No, I want an _unspecial_ mechanism for the serialization. I want the > user to code with ordinary constructors and pattern extractors, and have > serialization just call them, informed by metadata (which is how we pass > information to frameworks.) > >> Otherwise, we already have java.lang.invoke.MethodHandles.privateLookupIn() and the >> open keyword on the module/package. > > Yes, Lookup is a possible implementation mechanism for such a relaxed > encapsulation mechanism. But aligning with what modules do, just at a > finer-grained level, is extending an existing concept in a natural way, > which seems preferable to adding a new concept to the programming model. > >> It already encodes the semantics "we believe in encapsulation but sometimes we >> don't". 
>> Having a finer-grained version of "open" goes against what was decided by the JPMS >> EG, in my humble opinion. >> > > We could surely ditch it, and say "If you want to have serializable > classes in your module, open the whole module." And some users might be > OK with that, even though it is squashing a bug with a tank. But, in > reality this _is_ a new category of accessibility modifier -- that a > method is dynamically accessible regardless of its static > accessibility. And having this appear clearly in the source file makes > it much more obvious what is going on. The high-order bit here is > "banish the magic". I understand that you want a more fine-grained mechanism, but I don't think we need it (see a sibling thread), and changing the security model of Java AGAIN is really time-consuming for a lot of people (every framework/language that uses reflection has to be fixed). I would prefer that part to be considered separately, so that we developers don't have to wait a long time for all the other good fixes to serialization that you are proposing. Rémi From amaembo at gmail.com Wed Jun 12 11:49:21 2019 From: amaembo at gmail.com (Tagir Valeev) Date: Wed, 12 Jun 2019 18:49:21 +0700 Subject: Towards better serialization In-Reply-To: References: Message-ID: Hello! Nice reading, thanks! What about inheritance? Could a factory-method deserializer declared in class X produce an object of type Y which is a subclass of X? In that case, where should the serializer pattern be declared? In Y or in X? Assuming the serialized stream contains the class name Y, then probably both serializer and deserializer should be looked up by the serialization framework in Y. However, Y is probably a private implementation detail that we don't want to expose, and we may already have some factory method in X which can produce Y and which we'd like to use for serialization. More concrete example: immutable lists created via List.of(...). 
There are at least two implementations inside and, I think, it's desirable not to expose their count and structure. E.g., a future Java version might have more or fewer implementations. What would the serializer and deserializer look like for these objects? Another question: is it expected that static checks will be applied to annotated methods/patterns? I expect the following: - The deserialization annotation is applied only to a constructor or to a static factory method whose return type is the same as the containing class. If the class is parameterized like Map<K,V>, then the static factory method should be parameterized in the same way, like static <K,V> Map<K,V> createMap(...) (or is such a restriction redundant?). A parameterized constructor is not allowed (or is it?) - The serialization annotation is applied only to patterns. Probably it could also be applied to a getter-like no-arg method if the object is serialized to a single simpler value, like File::toString could be used to serialize File. - No two members of the same class may have the same annotation with the same version number - If a class contains both a serializer and a deserializer with the same version number, their parameter counts and types should match. What about singleton objects, or objects without state in general? Seems there's no problem with deserialization (e.g. class Singleton { @Deserializer public static Singleton getInstance() {...} }). But how to declare the serializer? Is it expected to have patterns which deconstruct the object into zero components? Will java.lang.String have a serializer (toCharArray?) and a deserializer (String(char[])), or is it considered basic enough that all serialization frameworks should handle it as a primitive? With best regards, Tagir Valeev. 
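[The consistency checks proposed above can be sketched reflectively. The @Serializer/@Deserializer annotations below are locally defined stand-ins -- the real names, shapes, and checking rules are still undecided -- and the sketch simplifies to single-component serial forms:]

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class SerializationChecks {
    // Hypothetical annotations standing in for the ones the proposal discusses.
    @Retention(RetentionPolicy.RUNTIME) @interface Serializer { int version(); }
    @Retention(RetentionPolicy.RUNTIME) @interface Deserializer { int version(); }

    static class Color {
        private final int rgb;
        @Deserializer(version = 1)
        Color(int rgb) { this.rgb = rgb; }
        @Serializer(version = 1)
        int toRGB() { return rgb; }
    }

    // One of the proposed static checks: a serializer and deserializer that
    // share a version number must agree on the serialized component type.
    static boolean versionsAgree(Class<?> cls) {
        Map<Integer, Class<?>> serialized = new HashMap<>();
        for (Method m : cls.getDeclaredMethods()) {
            Serializer s = m.getAnnotation(Serializer.class);
            if (s != null) serialized.put(s.version(), m.getReturnType());
        }
        for (Constructor<?> c : cls.getDeclaredConstructors()) {
            Deserializer d = c.getAnnotation(Deserializer.class);
            if (d == null) continue;
            Class<?> expected = serialized.get(d.version());
            if (expected != null
                    && !Arrays.equals(c.getParameterTypes(), new Class<?>[] { expected }))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(versionsAgree(Color.class));  // both sides use int
    }
}
```

In the proposal this checking would live in the compiler, not in a runtime helper; the reflective version is just the easiest way to state the rule precisely.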
On Wed, Jun 12, 2019 at 2:21 AM Brian Goetz wrote: > I've posted a document at: > > http://cr.openjdk.java.net/~briangoetz/amber/serialization.html > > > on an exploration we've been doing to address some of the shortcomings of > Java serialization, building on other tools that have been (or will be) > added to the platform. Rather than attempt to add band-aids on existing > serialization, it addresses the risks of serialization at their root. It > is somewhat of a shift -- it cannot represent all object graphs, and it > makes some additional work for the author -- but it brings object > serialization into the light, where it needs to be in order to be safer. > Comments welcome! From brian.goetz at oracle.com Wed Jun 12 15:11:16 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 12 Jun 2019 11:11:16 -0400 Subject: Towards better serialization In-Reply-To: References: Message-ID: > What about inheritance? Good question! First, note that this is two questions -- one is "what _can_ a serialization framework do here?", and the other is "what should Java serialization do here?" Remember, the high-order bit here is to banish the magic; one thing this enables is that writing a serialization framework reduces to (a) mapping instances to serializers (which could be informed by @Serializer, or not) and (b) encoding, all without need of privilege, and of course the reverse. This means that one of the parameters of a serialization framework is deserialization fidelity. For example, when confronted with a List, and after inspecting the instance class for a suitable serializer and not finding one, what could it do? It clearly could fail, but it could also encode the List using a generic List wrapper, which might deserialize to ArrayList or List.of(...). Clearly, if the implementation class has a suitable serializer, we should probably use that. 
But if it doesn't, what if List has a serializer? A serialization framework could use that as a fallback. Let's take a simple inheritance example, without serializers first.

class A {
    int a;
    public A(this int a) { }  // this-bound parameter
}

class B extends A {
    int b;
    public B(int a, this int b) { super(a); }
}

Note that as part of A's design, it exposes an accessible-to-subclasses constructor, which subclass constructors will delegate to. Let's add in the patterns (no syntax please!):

class A {
    int a;
    public A(this int a) { }
    public pattern A(this int a) { }
}

class B extends A {
    int b;
    public B(int a, this int b) { super(a); }
    public pattern B(int a, this int b) {
        // deliberately stupid syntax
        match this to A(binding a);  // binds a
        // b implicitly bound, because of this-declaration
    }
}

The point here is, just as A must provide some construction service to its subclasses, it must also provide some deconstruction services (patterns, accessible fields, accessible accessors, whatever) to its subclasses too. Now, B is fully ready to be serializable, as it has both a constructor and deconstructor which are suitable for serialization. > Could a factory-method deserializer declared in class X produce an object of type Y which is a subclass of X? Yes. An example of this would be if we wanted to put a deserializer in List, where it returns some default implementation (ArrayList, List.of(), whatever). Remember, serialization frameworks get to decide how they are going to map instances to serializers/deserializers; the above is not a statement that Java serialization _will_ support finding deserializers in supertypes, but that a serialization framework _could_ do so, and it is on them to define the search process. (A serialization framework could also, for example, provide a registry where you could register serializers/deserializers explicitly -- "if you find a FooList, serialize it as a BarList". Java serialization probably will not, but others could.) 
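[The registry idea in that parenthetical fits in a few lines. Everything here -- names, lookup policy -- is invented purely to illustrate that the search process is the framework's choice, not anything the proposal mandates:]

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical sketch: a framework-owned registry mapping classes to
// explicitly registered serializer functions.
public class SerializerRegistry {
    private final Map<Class<?>, Function<Object, Object>> serializers = new HashMap<>();

    <T> void register(Class<T> type, Function<? super T, ?> serializer) {
        serializers.put(type, obj -> serializer.apply(type.cast(obj)));
    }

    // The framework defines the search process; this one walks up the
    // superclass chain to find the nearest registered serializer.
    Function<Object, Object> lookup(Class<?> type) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            Function<Object, Object> s = serializers.get(c);
            if (s != null) return s;
        }
        throw new NoSuchElementException("no serializer for " + type);
    }

    public static void main(String[] args) {
        SerializerRegistry registry = new SerializerRegistry();
        // Registered against the supertype; subclasses fall back to it.
        registry.register(Number.class, n -> "number:" + n);
        Object encoded = registry.lookup(Integer.class).apply(42);
        System.out.println(encoded);  // prints number:42
    }
}
```

A different framework could instead search interfaces, or refuse to fall back at all; that policy choice is exactly the "deserialization fidelity" parameter described above.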
> In that case, where should the serializer pattern be declared? In Y or in X? Assuming the serialized stream contains the class name Y, then probably both serializer and deserializer should be looked up by the serialization framework in Y. Again, how a serialization framework searches for a serializer/deserializer is part of its differentiation from other frameworks. We provide authors with the ability to easily and defensively expose API points for use by serialization; the framework wires up its choices of which API points are called in response to what. > More concrete example: immutable lists created via List.of(...). There are at least two implementations inside and, I think, it's desirable not to expose their count and structure. E.g., a future Java version might have more or fewer implementations. What would the serializer and deserializer look like for these objects? Probably so. In this case, I would think one of the annotation parameters on @Serializer() would map to "look in this other class for deserializers", so that InternalListImpl42 could serialize to something that says "deserialize me with the deserializer for PublicListWrapper", such as:

private class InternalListImpl42 {
    @Serializer(version = PublicListWrapper.VERSION,
                deserializationClass = PublicListWrapper.class)
    open pattern InternalListImpl42(Object[] elements) { ... }
}

in which case the serialization version corresponds to that of PLW, not ILI42. > Another question: is it expected that static checks will be applied to annotated methods/patterns? I expect the following: > - The deserialization annotation is applied only to a constructor or to a static factory method whose return type is the same as the containing class. If the class is parameterized like Map<K,V>, then the static factory method should be parameterized in the same way, like static <K,V> Map<K,V> createMap(...) (or is such a restriction redundant?). A parameterized constructor is not allowed (or is it?) > - The serialization annotation is applied only to patterns. 
Probably it could also be applied to a getter-like no-arg method if the object is serialized to a single simpler value, like File::toString could be used to serialize File. > - No two members of the same class may have the same annotation with the same version number > - If a class contains both a serializer and a deserializer with the same version number, their parameter counts and types should match. These are the sort of checks I would expect a compiler to want to do. There is room for adjustment and interpretation (the parameter types need not be equal, just assignment-compatible) but you've got the right spirit. For super-simple classes, we could consider allowing an accessor to act as a serializer, if the serial form only has one component. If it has multiple components, now we're in the business of either capturing the order somewhere, or writing names to the stream so they can be matched to constructor parameter names (yuck), which starts to feel like the wrong end of the convenience-complexity lever, but again -- serialization frameworks can do what they want. > What about singleton objects, or objects without state in general? Seems there's no problem with deserialization (e.g. class Singleton { @Deserializer public static Singleton getInstance() {...} }). But how to declare the serializer? Is it expected to have patterns which deconstruct the object into zero components? A pattern with zero state components is perfectly allowable. If we think this is a common case, then we could have a @Serializer.Singleton annotation to optimize its expression. There's a significant API design problem ahead of us in picking the right annotations, defining the consistency-checking rules, etc.; the examples in the document should be considered placeholders. > Will java.lang.String have a serializer (toCharArray?) and a deserializer (String(char[])), or is it considered basic enough that all serialization frameworks should handle it as a primitive? 
I would think "primitive", but if you wanted to make an argument for the other way, I'd listen! From brian.goetz at oracle.com Thu Jun 13 16:46:09 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 13 Jun 2019 12:46:09 -0400 Subject: Records and annotations In-Reply-To: References: Message-ID: <5037a213-8dd7-0056-a899-b4cbff090bc4@oracle.com> There's a sub-option of B that has been suggested, call it B-, which we can also consider: only push down annotations from record components into _implicit_ declarations of mandated members, rather than all declarations (implicit or explicit.) The benefit of this (compared to B) is transparency: you get what is present in the source file. The downside is that if you explicitly declare a member, you have to explicitly replicate its annotations, and it's easy to forget to do so. On 6/6/2019 4:48 PM, Brian Goetz wrote: > Recall that some time ago, we were discussing some different > directions for how to handle annotations on record components. > > Approach A: Record components are a new kind of annotation target > location; if an annotation is meta-annotated with this target kind, it > can be applied to record components. Expose reflection over > annotations on record components as with other features. > > Approach B: Annotations on record components are merely "pushed down" > to the corresponding JLS-mandated API elements (constructor > parameters, accessor methods, fields), according to the allowed target > kinds of the annotation (if the annotation is only valid on fields, it > is only pushed down to fields.). > > Approach B+: Like B, except that we continue to reify the provenance > of the annotations, and expose them through reflection as annotations > on the record component _in addition to_ annotations on the mandated > API elements. > > In an alternate universe where we had done records first, and were now > adding annotations, we'd surely pick A. 
However, in the current > universe, picking A would put us in an adoption bind; we have to wait > for specific annotations to acquire knowledge of the new target kinds > (through the @Target meta-annotation), and for frameworks to be aware > of annotations on record components, before we can migrate classes > dependent on those annotations/frameworks to be records. Further, > library authors suffer a familiar problem: if @Foo is meta-annotated > with a target kind of RECORD_COMPONENT, then that means it must have > been compiled against a Java 14+ JDK, which means that the resulting > classes are dependent on JDK 14+, unless they use something like MR > Jars to have two versions in one JAR. This would further impede > adoption. > > For guidance in our A/B choice, we can look to enums. Enum constants > are surely a first-class language element, and can be annotated, but > they do not have their own annotation target kind; instead, the > compiler pushes down the annotations onto the fields that carry the > enum constants. While this might be an uneasy dependence on the > translation strategy, in fact this translation strategy is mandated > (because we want migrating between a class with static constant fields > and an enum to be a binary-compatible migration.). > > Records are in a similar boat as enums; while there is a translation > strategy going on here, the elements of it are mandated by the > language specification. So I think the trick that enums use is a > reasonable one to carry forward to records, allowing us to seriously > consider B/B+. (Strategy A also has a lot of accidental detail; > class file attributes for various kinds of options and bookkeeping to > manage exactly what is being annotated, reflection API surface, etc.). 
> > The following type-checking strategy applies to B and B+: > > - A record component may be annotated by a declaration annotation > with no target kind meta-annotation, or whose target kind includes one > or more of PARAMETER, FIELD, or METHOD > - The type of a record component may be annotated by a type annotation > > Strategy B then entails pushing down annotations through tree > manipulation to the right places. For PARAMETER annotations, they are > pushed down to the parameters of the implicit constructor; for FIELD > annotations, to the fields; for METHOD annotations, to the accessor. > And for type annotations, to the corresponding type use in > constructor parameters, field declarations, and accessor methods. > (And if the annotation is applicable to more than one of these, it is > pushed down to all applicable targets.) > > But wait! What if the author also explicitly declares, say, the > accessor method? > > record R(int a) { >     int a() { return a; } > } > > No problem, we can still push the annotation down, and there is > precedent for annotations being "inherited" in this way. > > But wait! What if the author explicitly declares the same annotation, > but with conflicting values? > > record R(@Foo(1) int a) { >     @Foo(2) int a() { return a; } > } > > We can still push down @Foo(1), and then look to see if @Foo is a > repeating annotation. If it is, great; if not, then a() has two @Foo > annotations, which results in a compilation error. So we always push > down, and then enforce arity rules. > > By pushing annotations down in this manner, existing reflection can > pick up the annotations on the various class members with no > additional work or reflection API surface. Are we done? > > We might be done, or we might want to do more (strategy B+). 
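[What push-down means reflectively can be checked directly: an annotation applicable to fields, methods, and parameters, placed on a record component, becomes visible in all three derived places. A quick sketch, assuming a JDK with records and a locally defined @Foo:]

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

public class PushDownDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER})
    @interface Foo { int value(); }

    record R(@Foo(1) int a) { }

    public static void main(String[] args) throws Exception {
        // The one component annotation is pushed down to three members:
        Field field = R.class.getDeclaredField("a");                       // the field
        Method accessor = R.class.getDeclaredMethod("a");                  // the accessor
        Parameter param =
            R.class.getDeclaredConstructor(int.class).getParameters()[0]; // the ctor param
        System.out.println(field.getAnnotation(Foo.class) != null);
        System.out.println(accessor.getAnnotation(Foo.class) != null);
        System.out.println(param.getAnnotation(Foo.class) != null);
    }
}
```

All three lines print true: plain `getAnnotation` calls on the mandated members see the annotation, with no new reflection API surface needed.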
In B+, > we _additionally_ reify which annotations were present on the > component, and (possibly) expose additional reflection API surface to > query annotations on record components. Why would we want to do this? > Well, one reason that occurs to me is that we've been holding the > move of "abstract records" and records extending abstract records in > our back pocket. In this case, we might wish to copy annotations down > from a record component in a superclass to the corresponding > pseudo-component in the subclass, for example. But, I'm not > particularly compelled by this -- I think the strategy we took for > enums is mostly good enough. So I'm voting for pure B. > From kevinb at google.com Thu Jun 13 18:11:27 2019 From: kevinb at google.com (Kevin Bourrillion) Date: Thu, 13 Jun 2019 11:11:27 -0700 Subject: Towards better serialization In-Reply-To: References: Message-ID: I think the discussion I'm seeing on-list may be ignoring the forest for a few of its trees. Java's implementation of serialization has been a gaping wound for a long time, for all the reasons the doc so carefully lays out. And every framework to support other wire formats has always had to start from scratch. Isolating how we get data in and out of objects from how we read and write that data on the wire is self-evidently the right thing to do, and this looks like a very clean way to do it, and I can find little fault with it. I don't know that it's necessary to remove the old serialization. It works just fine for symmetric serialization (a controlled environment where you can guarantee the reader and writer use the exact same code). Under that constraint, it actually delivers on its promise. It's just that it's dangerous for almost anything else. The main thing that I have a problem with is `open`. It seems we are turning 4 visibility levels into a 4x2 matrix (with one cell, `public open`, being redundant) ... and just because of reflection? This feels wrong to me. 
It might be what we need, but I would *really* like to know that we've exhausted all alternatives. Again, though, this is a tree. The forest is good. On Tue, Jun 11, 2019 at 12:21 PM Brian Goetz wrote: > I've posted a document at: > > http://cr.openjdk.java.net/~briangoetz/amber/serialization.html > > > on an exploration we've been doing to address some of the shortcomings of > Java serialization, building on other tools that have been (or will be) > added to the platform. Rather than attempt to add band-aids on existing > serialization, it addresses the risks of serialization at their root. It > is somewhat of a shift -- it cannot represent all object graphs, and it > makes some additional work for the author -- but it brings object > serialization into the light, where it needs to be in order to be safer. > Comments welcome! > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com From brian.goetz at oracle.com Thu Jun 13 18:25:03 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 13 Jun 2019 14:25:03 -0400 Subject: Towards better serialization In-Reply-To: References: Message-ID: <9C338332-4D24-4AD7-8123-FE28257CBDAD@oracle.com> > I don't know that it's necessary to remove the old serialization. It works just fine for symmetric serialization (a controlled environment where you can guarantee the reader and writer use the exact same code). Under that constraint, it actually delivers on its promise. It's just that it's dangerous for almost anything else. Yes. And, note as well that it is not all-or-nothing; we can "demote" default serialization without disabling it, by (say) requiring an opt-in (e.g., --enable-default-unsafe-legacy-serialization-yes-i-know) so that applications that want to use it can, and applications that don't can be protected from it. There's a huge range of policy choices here, which can be phased in over time. 
There are two parts to the approach outlined here: - Upgrading the programming model to allow authors to have explicit but unobtrusive control over extracting state from objects, reconstituting them safely, and capturing schema evolution; - Adjusting Java serialization (and other serialization frameworks) to use this mechanism. The first is obviously a precursor to the second, but within the second, there is a range of policy choices, which can play out over time, and different serialization frameworks can make different choices. > The main thing that I have a problem with is `open`. It seems we are turning 4 visibility levels into a 4x2 matrix (with one cell, `public open`, being redundant) ... and just because of reflection? This feels wrong to me. It might be what we need, but I would really like to know that we've exhausted all alternatives. Again, though, this is a tree. The forest is good. Actually, "public open" may not actually be redundant; "public open" in a non-exported package of a module is only statically public within the module. But again, this is a tree, so let's talk about the forest: the distinction between "front door" and "back door" APIs is real. Backdoor APIs include serialization, but they also include members that exist to support, say, mocking or dependency injection. While the "backdoor" API members are effectively public, we don't necessarily want to expose them statically to front-door consumers, because they are not intended for front-door consumers. The module system has recognized this need, where a package can be opened but not exported, allowing for dynamic access but not static access. We can piggyback on this (if a module is open, it is not actually required to open the individual members, though doing so has benefit to readers anyway), but opening an entire module for the sake of serializing a few classes is a pretty coarse hammer. 
I see a lot of benefit in using the _same_ mechanism to segregate front-door and back-door API members at the different granularities (member, class, package, module) rather than inventing different ones, but that's a possibility too. Another possibility is to break the 4x2 down into 4+1, where "open" implies "statically private".

From john.r.rose at oracle.com Thu Jun 13 20:53:49 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Jun 2019 13:53:49 -0700
Subject: Records and annotations
In-Reply-To: <5037a213-8dd7-0056-a899-b4cbff090bc4@oracle.com>
References: <5037a213-8dd7-0056-a899-b4cbff090bc4@oracle.com>
Message-ID: <98EC52CC-D9BE-476C-A913-4BE61BB9EC0E@oracle.com>

On Jun 13, 2019, at 9:46 AM, Brian Goetz wrote:
>
> The benefit of this (compared to B) is transparency: you get what is present in the source file.

That's very good.

> The downside is that if you explicitly declare a member, you have to explicitly replicate its annotations, and it's easy to forget to do so.

Maybe we could tolerate a warning there? Along with the antidote @SuppressWarnings("annotations"). Meh, maybe not; I'd grumble at all the make-work to turn the warnings off.

More complicated: A meta-annotation that documents which annotations are pushed down from record components to members (explicit or not). Then there's not even a choice for pushed-down annotations.

A little more complicated: A multi-way meta-annotation:

@ComponentContribution("alwaysPushDown")
@ComponentContribution("pushDownToGeneratedOnly")
@ComponentContribution("dontPushDown")
@ComponentContribution("dontPushButWarn")
...

The default behavior would have to be chosen with care. @ComponentContribution could be left for future enhancement if the initial default behavior was workable in more cases. As both compiler folks and procrastinators know (I would know), pushing something to the future often pushes it to "never".
I think "dontPushDown" is an OK default, and starting point, and maybe ending point. People will make mistakes and grumble about that -- but not about make-work or complexity. If it's a problem we can eventually add @ComponentContribution to create new points of control.

-- John

From kevinb at google.com Thu Jun 13 22:22:03 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 13 Jun 2019 15:22:03 -0700
Subject: Records and annotations
In-Reply-To: References:
Message-ID:

On Thu, Jun 6, 2019 at 1:51 PM Brian Goetz wrote:

> library authors suffer a familiar problem: if @Foo is meta-annotated with a target kind of RECORD_COMPONENT, then that means it must have been compiled against a Java 14+ JDK, which means that the resulting classes are dependent on JDK 14+, unless they use something like MR Jars to have two versions in one JAR. This would further impede adoption.

This has been one of my concerns about A. Multi-release jars make a solution possible, but it is still a lot of headache for the library owner to build them (I assume I would need to branch or use a preprocessor of some kind). I think B or B+ is what we want.

From amaembo at gmail.com Thu Jun 20 09:18:02 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 20 Jun 2019 16:18:02 +0700
Subject: Different serialization strategies for different formats?
Message-ID:

Hello!

Consider we have a Color class which represents a color in RGB format:

class Color { private final int rgb; }

The most obvious and efficient way to serialize/deserialize its state is to extract this int field:

class Color {
    private final int rgb;

    @Deserializer
    public Color(int rgb) {
        if (rgb < 0 || rgb > 0xFFFFFF) throw new IllegalArgumentException();
        this.rgb = rgb;
    }

    @Serializer
    public int toRGB() { return this.rgb; }
}

It's great for binary serialization. However if I serialize to JSON I would not like to see `color: 16711680`.
JSON or XML are intended to be at least partially human-readable. So probably I want to see `color: red` or at least `color: #FF0000`. Well, no problem, we can alternatively serialize as a string:

@Serializer
@Override
public String toString() {
    var name = colorToName.get(this);
    return name == null ? String.format("#%06X", rgb) : name;
}

@Deserializer
public static Color parse(String str) {
    var color = nameToColor.get(str);
    if (color != null) return color;
    if (str.startsWith("#")) return new Color(Integer.valueOf(str.substring(1), 16));
    throw new IllegalArgumentException();
}

private static final Map<String, Color> nameToColor = Map.of(
    "black", new Color(0x000000),
    "red", new Color(0xFF0000),
    "green", new Color(0x00FF00),
    "blue", new Color(0x0000FF)
    // ...
);

private static final Map<Color, String> colorToName = nameToColor.entrySet()
    .stream().collect(Collectors.toMap(Entry::getValue, Entry::getKey));

The problem is to indicate that we provide two alternative ways to serialize the object, and that one of them (producing an int) is better suited to binary formats while the other (producing a String) is better suited to human-readable formats. Note that this is orthogonal to serialization versions: we may also have several versions of both binary and human-readable serializers.

What do you think? Should the new Java serialization support several serialization kinds and provide some hints to the serialization engine about which kind is preferred for a given format? E.g. @Serializer(kind = SerializationKind.Binary)

With best regards,
Tagir Valeev

From brian.goetz at oracle.com Thu Jun 20 13:43:46 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 20 Jun 2019 09:43:46 -0400
Subject: Different serialization strategies for different formats?
In-Reply-To: References:
Message-ID: <7D0F4A7D-A9B8-47E8-AED3-7AB3BB4CB360@oracle.com>

Yes, a similar question came up in an internal discussion as well.
> Consider we have a Color class which represents a color in RGB format:
>
> class Color { private final int rgb; }
>
> The most obvious and efficient way to serialize/deserialize its state is to extract this int field:
>
> It's great for binary serialization. However if I serialize to JSON I would not like to see `color: 16711680`. JSON or XML are intended to be at least partially human-readable. So probably I want to see `color: red` or at least `color: #FF0000`. Well no problem, we can alternatively serialize as string:

Good example. There's no problem in the model with multiple serializers, but it raises the question: how would a client select which form? Suppose instead of (or in addition to) the version property on the annotation, we had some other selectors. Suppose for sake of argument that Color has the following serializers:

@Serializer(selector = "binary")
public pattern Color(int colorValue) { ... }

@Serializer(selector = "text")
public pattern Color(int r, int g, int b) { ... }

These tags are selected by the author of Color at development time. But the ultimate user of serialization is someone in some other maintenance domain, asking to serialize a whole graph that has colors in it. Without some sort of global agreement on the taxonomy of selectors, a given graph might have many classes which reflect the text/binary distinction (just one possible distinction) in a dozen different ways. And the text/binary distinction might not be the only distinction one wants to reflect; one could imagine varying degrees of detail preservation, for example.

So I like the idea of treating the set of serializers as something that can be queried over by a serialization library -- the question is: what is the structure of these queries, such that would-be queriers don't have to "join" 100 different "tables", each with their own schema style?
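To make the "queried over" idea concrete, here is a minimal reflection-based sketch. The @Serializer annotation and its selector element are hypothetical (modeled on the examples in this thread; nothing like them exists in the JDK), and ordinary methods stand in for the proposed pattern declarations:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Optional;

// Hypothetical @Serializer annotation, modeled on the thread's examples.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Serializer {
    String selector() default "";
}

class Color {
    private final int rgb;
    Color(int rgb) { this.rgb = rgb; }

    @Serializer(selector = "binary")
    public int toRgb() { return rgb; }

    @Serializer(selector = "text")
    public String toHex() { return String.format("#%06X", rgb); }
}

public class SerializerQuery {
    // A serialization library "queries" the set of serializers and picks one
    // by selector -- the open question above is what structure such queries
    // should have; plain string matching stands in for it here.
    static Optional<Method> find(Class<?> cls, String selector) {
        return Arrays.stream(cls.getDeclaredMethods())
                .filter(m -> {
                    Serializer s = m.getAnnotation(Serializer.class);
                    return s != null && s.selector().equals(selector);
                })
                .findFirst();
    }

    public static void main(String[] args) throws Exception {
        Color red = new Color(0xFF0000);
        System.out.println(find(Color.class, "text").orElseThrow().invoke(red));   // #FF0000
        System.out.println(find(Color.class, "binary").orElseThrow().invoke(red)); // 16711680
    }
}
```

This also makes the taxonomy problem visible: nothing constrains the selector strings, so every class author can invent a different vocabulary.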
From kevinb at google.com Thu Jun 20 20:18:07 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 20 Jun 2019 13:18:07 -0700
Subject: Looking beyond records: better constructors (and deconstructors)
In-Reply-To: <9159b986-caea-9240-26f9-09f9f595d5b0@oracle.com>
References: <9159b986-caea-9240-26f9-09f9f595d5b0@oracle.com>
Message-ID:

Sorry to have only a wishy-washy reply to offer.

On Fri, Jun 7, 2019 at 12:11 PM Brian Goetz wrote:

> With most of the decisions regarding records being settled, let's take a few minutes to look down the road. Records are great for where they apply, but there are plenty of classes that suffer from error-prone boilerplate that do not qualify to be records. We would like for some of the record goodies to filter down to ordinary classes, where possible.

FWIW, I am probably a lot less concerned about the "cliff" between records and non-record classes than most. I suspect that most classes that have a lot of record-like state but can't quite be records would probably be best served by bundling up their record-like state into an actual record.

> What we would need is to tell the compiler that the constructor argument "int low" and the field "int low" are describing the same thing.

I definitely recognize this problem in today's code; that adding a single piece of state requires too many bits of code to be sprayed all over your class. There is certainly appeal in the idea of being able to represent that state as a single "thing" in the code, tying constructor/deconstructor/getter/validation/etc. all together. You had a past proposal that was trying to do something like that. I did have some reservations with that proposal, but now when I look at the current proposal, it's aiming for so much *less* than that that it's not *clear* to me it delivers enough benefit to bother with. You still have to spray changes in almost as many places.
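For reference, the "spray" in question, in ordinary Java today: one logical property touches a field, a constructor parameter, a validation check, an assignment, an accessor, equals, hashCode, and toString. A minimal illustration (the class and its property names are invented for this example):

```java
import java.util.Objects;

// Adding a second property ("high", say) would require edits at each
// numbered site below -- the duplication a unified declaration would
// fold into one place.
final class Range {
    private final int low;                        // 1. field declaration

    Range(int low) {                              // 2. constructor parameter
        if (low < 0)                              // 3. validation
            throw new IllegalArgumentException("low < 0");
        this.low = low;                           // 4. assignment
    }

    int low() { return low; }                     // 5. accessor

    @Override public boolean equals(Object o) {   // 6. equals
        return o instanceof Range && ((Range) o).low == low;
    }

    @Override public int hashCode() {             // 7. hashCode
        return Objects.hash(low);
    }

    @Override public String toString() {          // 8. toString
        return "Range[low=" + low + "]";
    }
}

public class RangeDemo {
    public static void main(String[] args) {
        System.out.println(new Range(5));  // Range[low=5]
    }
}
```

A record collapses all eight sites into one component declaration, which is exactly the "cliff" being discussed for classes that cannot be records.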
Even if this is one step that we intend to be followed by future steps, the intermediate version would be used for some length of time and the set of features it chose should feel like a worthwhile "sweet spot"... maybe this clears that bar, but I'm not sure.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Thu Jun 20 20:25:28 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 20 Jun 2019 16:25:28 -0400
Subject: Looking beyond records: better constructors (and deconstructors)
In-Reply-To: References: <9159b986-caea-9240-26f9-09f9f595d5b0@oracle.com>
Message-ID: <6b09acee-a61a-b86a-4af6-4b53fb10cc54@oracle.com>

I don't disagree with any of this; while the boilerplate of constructor bodies is O(n), the coefficient is small, compared to other O(n) boilerplates that shall remain nameless. What motivates me to put this forward now is that, as we get ready to support pattern declarations, most deconstruction patterns will have the same error-prone O(n) boilerplate as constructors. So whatever you think the benefit of this is, multiply it by 2, for the constructor and the deconstructor -- killing a whole category of boilerplate before it is even born.

On 6/20/2019 4:18 PM, Kevin Bourrillion wrote:
> Sorry to have only a wishy-washy reply to offer.
>
> On Fri, Jun 7, 2019 at 12:11 PM Brian Goetz wrote:
>
>> With most of the decisions regarding records being settled, let's take a few minutes to look down the road. Records are great for where they apply, but there are plenty of classes that suffer from error-prone boilerplate that do not qualify to be records. We would like for some of the record goodies to filter down to ordinary classes, where possible.
>
> FWIW, I am probably a lot less concerned about the "cliff" between records and non-record classes than most.
> I suspect that most classes that have a lot of record-like state but can't quite be records would probably be best served by bundling up their record-like state into an actual record.
>
>> What we would need is to tell the compiler that the constructor argument "int low" and the field "int low" are describing the same thing.
>
> I definitely recognize this problem in today's code; that adding a single piece of state requires too many bits of code to be sprayed all over your class. There is certainly appeal in the idea of being able to represent that state as a single "thing" in the code, tying constructor/deconstructor/getter/validation/etc. all together. You had a past proposal that was trying to do something like that. I did have some reservations with that proposal, but now when I look at the current proposal, it's aiming for so much *less* than that that it's not *clear* to me it delivers enough benefit to bother with. You still have to spray changes in almost as many places.
>
> Even if this is one step that we intend to be followed by future steps, the intermediate version would be used for some length of time and the set of features it chose should feel like a worthwhile "sweet spot"... maybe this clears that bar, but I'm not sure.
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From forax at univ-mlv.fr Sun Jun 23 11:02:39 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 23 Jun 2019 13:02:39 +0200 (CEST)
Subject: Different serialization strategies for different formats?
In-Reply-To: <7D0F4A7D-A9B8-47E8-AED3-7AB3BB4CB360@oracle.com>
References: <7D0F4A7D-A9B8-47E8-AED3-7AB3BB4CB360@oracle.com>
Message-ID: <686815509.2501374.1561287759196.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Tagir Valeev"
> Cc: "amber-spec-experts"
> Sent: Thursday, June 20, 2019 15:43:46
> Subject: Re: Different serialization strategies for different formats?

> Yes, a similar question came up in an internal discussion as well.
>
>> Consider we have a Color class which represents a color in RGB format:
>>
>> class Color { private final int rgb; }
>>
>> The most obvious and efficient way to serialize/deserialize its state is to extract this int field:
>>
>> It's great for binary serialization. However if I serialize to JSON I would not like to see `color: 16711680`. JSON or XML are intended to be at least partially human-readable. So probably I want to see `color: red` or at least `color: #FF0000`. Well no problem, we can alternatively serialize as string:
>
> Good example. There's no problem in the model with multiple serializers, but it raises the question: how would a client select which form? Suppose instead of (or in addition to) the version property on the annotation, we had some other selectors. Suppose for sake of argument that Color has the following serializers:
>
> @Serializer(selector = "binary")
> public pattern Color(int colorValue) { ... }
>
> @Serializer(selector = "text")
> public pattern Color(int r, int g, int b) { ... }
>
> These tags are selected by the author of Color at development time. But the ultimate user of serialization is someone in some other maintenance domain, asking to serialize a whole graph that has colors in it. Without some sort of global agreement on the taxonomy of selectors, a given graph might have many classes which reflect the text/binary distinction (just one possible distinction) in a dozen different ways.
> And the text/binary distinction might not be the only distinction one wants to reflect; one could imagine varying degrees of detail preservation, for example.
>
> So I like the idea of treating the set of serializers as something that can be queried over by a serialization library -- the question is: what is the structure of these queries, such that would-be queriers don't have to "join" 100 different "tables", each with their own schema style?

I don't think we should specify a query scheme; it seems more future-proof to only provide a way to expose all serializers (resp. deserializers) and let the serialization libraries provide their own annotations and do the selection on top of what we expose.

So the JDK will only provide @Serializer/@Deserializer, and a hypothetical JSON library will provide @JsonFormat to indicate a supplementary way of selecting:

@Serializer @JsonFormat(selector = "binary")
public pattern Color(int colorValue) { ... }

@Serializer @JsonFormat(selector = "text")
public pattern Color(int r, int g, int b) { ... }

Rémi

From brian.goetz at oracle.com Sun Jun 23 11:32:24 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 23 Jun 2019 07:32:24 -0400
Subject: Different serialization strategies for different formats?
In-Reply-To: <686815509.2501374.1561287759196.JavaMail.zimbra@u-pem.fr>
References: <7D0F4A7D-A9B8-47E8-AED3-7AB3BB4CB360@oracle.com> <686815509.2501374.1561287759196.JavaMail.zimbra@u-pem.fr>
Message-ID:

This works fine when all classes in a graph are in the same maintenance domain. But what about libraries?

Sent from my iPad

> On Jun 23, 2019, at 7:02 AM, Remi Forax wrote:
>
> ----- Original Message -----
>> From: "Brian Goetz"
>> To: "Tagir Valeev"
>> Cc: "amber-spec-experts"
>> Sent: Thursday, June 20, 2019 15:43:46
>> Subject: Re: Different serialization strategies for different formats?
>
>> Yes, a similar question came up in an internal discussion as well.
>> [...]
> I don't think we should specify a query scheme; it seems more future-proof to only provide a way to expose all serializers (resp. deserializers) and let the serialization libraries provide their own annotations and do the selection on top of what we expose.
>
> So the JDK will only provide @Serializer/@Deserializer, and a hypothetical JSON library will provide @JsonFormat to indicate a supplementary way of selecting:
>
> @Serializer @JsonFormat(selector = "binary")
> public pattern Color(int colorValue) { ... }
>
> @Serializer @JsonFormat(selector = "text")
> public pattern Color(int r, int g, int b) { ... }
>
> Rémi
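Remi's layering can be sketched with plain reflection. Both annotations here are hypothetical (neither exists in the JDK or in any actual JSON library): the platform would expose the @Serializer-marked members, and the format library would add its own selection annotation on top:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical annotations following Remi's sketch: a bare JDK-side
// @Serializer marker, plus a library-side @JsonFormat selector.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface Serializer {}

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface JsonFormat { String selector(); }

class Color {
    private final int rgb;
    Color(int rgb) { this.rgb = rgb; }

    @Serializer @JsonFormat(selector = "binary")
    public int asInt() { return rgb; }

    @Serializer @JsonFormat(selector = "text")
    public String asHex() { return String.format("#%06X", rgb); }
}

public class JsonLibrary {
    // The platform exposes @Serializer-marked members; the JSON library
    // performs its own selection using its own annotation.
    static Method pick(Class<?> cls, String selector) {
        for (Method m : cls.getDeclaredMethods()) {
            JsonFormat f = m.getAnnotation(JsonFormat.class);
            if (m.isAnnotationPresent(Serializer.class)
                    && f != null && f.selector().equals(selector)) {
                return m;
            }
        }
        throw new IllegalArgumentException("no serializer for: " + selector);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pick(Color.class, "text").invoke(new Color(0xFF0000))); // #FF0000
    }
}
```

Note that this sketch leaves Brian's objection standing: each format library invents its own annotation vocabulary, so classes in other maintenance domains (third-party libraries) will not carry @JsonFormat at all.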