From brian.goetz at oracle.com Tue Jul 23 18:32:08 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 23 Jul 2019 14:32:08 -0400
Subject: Records: migration compatibility
Message-ID:

In the course of exploring serialization support for records, Chris asked
about the compatible evolution modes for records. We have explored this
briefly before, but let's put it all down in one place.

Since we are saying that records are a lot like enums, let's start with:

 A. Migrating a record-like class to a record
 B. Migrating a record to a record-like class

(which is analogous to refactoring between an enum and a class using the
type-safe enum pattern.)

Migration A should be both source- and binary-compatible, provided the
original class has all the members the record would have -- ctor, dtor,
accessors. Which in turn requires being able to declare those members,
including the dtor, but we'll come back to that.

What about serialization compatibility? It depends on our serialization
story (Chris will chime in with more here), but it's fair to note that
migrating from a TSE to an enum is not serialization-compatible either.

Migration B is slightly more problematic (for both records and enums), as a
record will extend Record just as enums extend Enum. This means casting to,
or invoking Record methods on, a migrated record would fail. (The same is
true for enums.) Again, I'll leave it to Chris to fill in the serialization
compatibility story; we have a variety of possible approaches there.

What about changing the descriptor of a record?

 C. Removing components
 D. Reordering components
 E. Adding components

Removals of all sorts are generally not source- or binary-compatible;
removing components will cause public members to disappear and constructors
to change their signatures. So we should have no compatibility expectations
of C.

D will cause the signature of the canonical ctor and dtor to change.
If the types of the permuted components are different, it may be possible for
the author to explicitly implement the old ctor/dtor signature, so that the
existing set of members is preserved. However, I think we should describe
this as not being a compatible migration, even if it is possible (in some
cases) to make up the difference.

E is like D, in that it is possible to add back the old ctor/dtor
implementations, and rescue existing call sites, but I think it should be put
in the same category.

From chris.hegarty at oracle.com Wed Jul 24 13:36:50 2019
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Wed, 24 Jul 2019 14:36:50 +0100
Subject: Exploring Record Serialization [ was: Records: migration compatibility ]
In-Reply-To:
References:
Message-ID: <35373c44-9bb3-885f-98e4-18f6f933ddb5@oracle.com>

TL;DR: class-to-record and record-to-class refactoring is a very attractive
property, which we should explore further in the context of serialization.

----

Current state of Serialization in amber/amber (July 2019):

- Middle-of-the-road position on records and Serialization.
- Auto-generate a `readResolve()` method that pipes the record's state
  through the canonical constructor.
- Advantages: constructor validation checks.
- Disadvantages: the deserialization process always creates two record
  instances; may leak a "bad" record through back references in the serial
  stream; brittle / fragile.

Doubling down on the current approach: a record's serial form should be that
of its state descriptor. Prohibit customization of this. Retain the
auto-generated `readResolve()`, but also prohibit specifying other
Serialization magic methods. Specifically, prohibit:

1. explicit `readResolve` / `writeReplace`
2. explicit `readObject` / `readObjectNoData` / `writeObject`
3. explicit `serialPersistentFields`

The canonical constructor defends against "bad" data for both the front-door
and back-door APIs. But can we do better?
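The "canonical constructor defends" point can be made concrete with a small sketch. The `Range` record below is hypothetical (not from the thread): any path that funnels state through the canonical constructor -- front-door construction, or a generated `readResolve()` that re-runs it -- hits the same validation.

```java
import java.io.Serializable;

// Hypothetical record: the compact canonical constructor validates the
// state, so both ordinary construction and any deserialization scheme
// that re-invokes the constructor reject "bad" data.
record Range(int lo, int hi) implements Serializable {
    Range {
        if (lo > hi)
            throw new IllegalArgumentException("lo > hi: " + lo + " > " + hi);
    }
}
```

Today's field stuffing by `ObjectInputStream` bypasses this check entirely, which is exactly the weakness the approaches above are trying to close.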
Should records be a first-class citizen in the Serialization protocol?
(spoiler: possibly, but probably not)

We could update the Java Object Serialization Specification to provide
explicit support for records (in a similar(ish) way to what was done for the
Serialization of Enum Constants). Support directly within the Object
Serialization Stream Protocol avoids the brittleness of the auto-generated
magic methods above, and possible user interaction with, or bypasses of, the
code/implementation. It ensures that construction is always, and only,
performed through the constructor.

The serial format of a record could be the record descriptor and the
record's state. Possible format:

  record-marker record-class record-descriptor field field field ...

Advantages: simple and clean, less fragile, prevents leaking a "bad" record
through a back reference in the stream.

Disadvantages: the new format is incompatible with pre-record releases
(stream failure), and we need to consider a compatible record evolution
strategy (the N-1 problem) - putting records in the stream protocol requires
this issue to be given serious consideration now (crystal ball!). We need to
have an evolution story if we're going to make records a first-class citizen
in the Serialization protocol. And that story is somewhat dependent on the
general evolution of records.

The N-1 problem: it should be possible for JDK N-1 to deserialize an object
graph that was serialized with JDK N. The Serialization specification goes to
great lengths to specify how Serializable classes can be compatibly evolved.
That said, there are pitfalls everywhere, and it is incredibly difficult to
guarantee that evolving a Serializable class has been done safely.

Looking at another, relatively recent, addition to the Serialization
protocol - Enum constants - it is surprising that their serial format is not
all that sympathetic to evolution. Enum constants have an effective format of
`Enum class + string value`.
During deserialization, `Enum.valueOf(Class, String)` is invoked to retrieve
the actual Enum constant.

JLS 13.4.26, Evolution of Enums, says: _"Adding or reordering constants in an
enum type will not break compatibility with pre-existing binaries"_.

Take 'adding', for example. `java.util.concurrent.TimeUnit` was introduced in
Java 1.5; a few constants were then added in Java 1.6, e.g. _MINUTES_. OK. If
a TimeUnit is part of a class's serial form then, depending on its actual
value, N-1 compatibility may be broken. For example, a fictional
`Timeout(long value, TimeUnit unit) implements Serializable`, serialized with
Java 1.6 when the unit is _MINUTES_, will fail to deserialize with Java 1.5 -
it fails with `java.io.InvalidObjectException: enum constant MINUTES does not
exist in class java.util.concurrent.TimeUnit`, `Caused by:
java.lang.IllegalArgumentException: No enum const class
java.util.concurrent.TimeUnit.MINUTES`.

This is not great. A lot of care needs to be taken if an enum finds its way
into the serial form of a class, since that enum may be evolved in the future
to contain additional values. While Enum constants have direct support in the
Serialization protocol, operationally `ObjectInputStream` doesn't handle the
N-1 case very gracefully.

Given this, and the myriad of other minefields that evolving a Serializable
class brings (too many to enumerate here), maybe we can come up with a
Serialization format and evolution policy for records that will be _no
worse_ than that of Enums, or other existing aspects (minefields) of
Serialization compatibility.

Compatible Record Evolution & Migration

Brian has provided details in a prior post on this thread, but it seems clear
that the higher-order bit is migrating from a record-like class to a record,
and migrating from a record to a record-like class (as opposed to evolving a
record itself). Wouldn't it be nice if serialization of these just worked
across refactorings?
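The class-to-record migration (Migration A in Brian's taxonomy) can be illustrated concretely. This is a hedged sketch -- the `PointV1`/`PointV2` names are invented for side-by-side comparison; in a real migration the class and record would share one name across releases. The point is that the class already declares every member the record would generate, so callers compile and link against either shape.

```java
// "Before": a record-like class that already has the members a record
// would generate -- a canonical-style constructor and accessors whose
// names match the components.
final class PointV1 {
    private final int x;
    private final int y;
    PointV1(int x, int y) { this.x = x; this.y = y; }
    int x() { return x; }
    int y() { return y; }
}

// "After": the migrated record declares the same API implicitly
// (plus equals/hashCode/toString, and the Record superclass).
record PointV2(int x, int y) { }
```

It is the serialization behavior across this refactoring, not the source/binary story, that the rest of this message is concerned with.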
Given this, maybe pushing records down into the serialization format itself
is not the way to go. Instead, it should be possible to use the existing
standard serialization format to encode the record class and its component
names + values (just like any other regular Serializable class). But rather
than having the serialization framework create the record instance followed
by field stuffing, have it locate the canonical constructor (or best-match
constructor) and invoke it with the deserialized stream fields. "Best match"
here needs a little more prototyping, to determine how best to allow for
possible future evolution of a record while still being able to deserialize
on an N-1 runtime.

Additionally, some level of prohibition or limitation could optionally still
be applied to the serialization magic methods, to preserve and restrict, by
default, the stream fields to just the record's state. There are various
options here, ranging from an error/warning during compilation to the
serialization framework specifying that it effectively ignores these magic
methods for records.

-Chris.

On 23/07/2019 19:32, Brian Goetz wrote:
> In the course of exploring serialization support for records, Chris
> asked about the compatible evolution modes for records. We have
> explored this briefly before but let's put this down in one place.
>
> Since we are saying that records are a lot like enums, let's start with:
>
>  A. Migrating a record-like class to a record
>  B. Migrating a record to a record-like class
>
> (which is analogous to refactoring between an enum and a class using the
> type-safe enum pattern.)
>
> Migration A should be both source- and binary-compatible, provided the
> original class has all the members the record would have -- ctor, dtor,
> accessors. Which in turn requires being able to declare the members,
> including dtor, but we'll come back to that.
>
> What about serialization compatibility?
> It depends on our serialization
> story (Chris will chime in with more here), but it's fair to note that
> migrating from a TSE to an enum is not serialization compatible
> either.
>
> Migration B is slightly more problematic (for both records and enums),
> as a record will extend Record just as enums extend Enum. Which means
> casting to, or invoking Record methods on, a migrated record would
> fail. (Same is true for enums.) Again, I'll leave it to Chris to fill
> in the serialization compatibility story; we have a variety of possible
> approaches there.
>
> What about changing the descriptor of a record?
>
>  C. Removing components
>  D. Reordering components
>  E. Adding components
>
> Removals of all sorts are generally not source- or binary-compatible;
> removing components will cause public members to disappear and
> constructors to change their signatures. So we should have no
> compatibility expectations of C.
>
> D will cause the signature of the canonical ctor and dtor to change. If
> the types of the permuted components are different, it may be possible
> for the author to explicitly implement the old ctor/dtor signature, so
> that the existing set of members is preserved. However, I think we
> should describe this as not being a compatible migration, even if it is
> possible (in some cases) to make up the difference.
>
> E is like D, in that it is possible to add back the old ctor/dtor
> implementations, and rescue existing call sites, but I think it should
> be put in the same category.
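[Editorial note on Chris's proposal above: "locate the canonical constructor and invoke it with the deserialized stream fields" can be sketched with the record reflection API as it later shipped in Java 16. The `RecordReconstructor` name, and the nested `Pair` record, are invented for illustration; this is a sketch of the idea, not the framework's implementation.]

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.RecordComponent;
import java.util.Arrays;

// Hypothetical sketch: the canonical constructor's parameter types match
// the record components, in declaration order, so it can be located from
// the component metadata and invoked with the deserialized stream fields.
final class RecordReconstructor {
    static <T> T reconstruct(Class<T> recordClass, Object... streamFields)
            throws ReflectiveOperationException {
        Class<?>[] paramTypes = Arrays.stream(recordClass.getRecordComponents())
                .map(RecordComponent::getType)
                .toArray(Class<?>[]::new);
        Constructor<T> canonical = recordClass.getDeclaredConstructor(paramTypes);
        canonical.setAccessible(true);
        // Construction goes through the constructor, so validation runs;
        // there is no field stuffing and no intermediate instance.
        return canonical.newInstance(streamFields);
    }

    record Pair(int a, String b) { }  // example record for illustration
}
```

The "best match" refinement Chris mentions would generalize the exact-types lookup above to tolerate added or reordered components.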
>
>

From forax at univ-mlv.fr Fri Jul 26 01:01:00 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 26 Jul 2019 03:01:00 +0200 (CEST)
Subject: Records: migration compatibility
In-Reply-To:
References:
Message-ID: <1177773204.1766005.1564102860185.JavaMail.zimbra@u-pem.fr>

Hi Brian,

----- Original message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Tuesday, 23 July 2019 20:32:08
> Subject: Records: migration compatibility

> In the course of exploring serialization support for records, Chris
> asked about the compatible evolution modes for records. We have
> explored this briefly before but let's put this down in one place.
>
> Since we are saying that records are a lot like enums, let's start with:
>
>  A. Migrating a record-like class to a record
>  B. Migrating a record to a record-like class
>
> (which is analogous to refactoring between an enum and a class using the
> type-safe enum pattern.)

Enums are a counter-example here, no? For enums, the only refactoring
allowed is from a class to an enum; the other direction doesn't work,
because all enums inherit from java.lang.Enum. The JDK/javac implementation
goes as far as:
- not allowing a regular class to inherit from java.lang.Enum
- using java.lang.Enum as a type in several public methods
- testing at runtime that the class of an enum has the enum bit set AND is
  a direct subclass of java.lang.Enum (in case you generate your own
  bytecode)

Given that a record already has different behavior at runtime than a normal
class (the Record attribute) and a public abstract class (AbstractRecord),
migration B seems unlikely.

BTW, AbstractRecord also has the nasty side effect of not allowing inline
records.
Rémi

From brian.goetz at oracle.com Fri Jul 26 01:05:20 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 25 Jul 2019 21:05:20 -0400
Subject: Records: migration compatibility
In-Reply-To: <1177773204.1766005.1564102860185.JavaMail.zimbra@u-pem.fr>
References: <1177773204.1766005.1564102860185.JavaMail.zimbra@u-pem.fr>
Message-ID:

> enums is a counter example here, no ?
>
> for enums, the only refactoring allowed is from a class to an enum,
> the other direction doesn't work because all enums inherit from
> java.lang.Enum.

In reality, compatibility is not a binary thing. Yes, if the client depends
on Enum methods, then migrating in the other direction will fail. But if the
client depends only on the enum constants, it is fine, because resolving
EnumClass.X is done with getstatic either way.

It's not unlike adding a method; adding a method is generally considered
source-compatible, but not if there is another method of the same name and
same arguments but a different return type. There are all sorts of asterisks
associated with "binary compatible" and "source compatible."

> given that a record already has different behavior at runtime than a
> normal class (the Record attribute) and a public abstract class
> (AbstractRecord), migration B seems unlikely.

With enums, migration in both directions is common. Class-to-enum was common
when we had classes using the type-safe enum pattern; but it's not uncommon
to exceed what can be done with an enum, and refactor back to a class. I
could imagine the same happening with records quite easily.

> BTW, AbstractRecord also has the nasty side effect of not allowing inline
> records.

Good point. We've been talking about whether inline classes should be able
to inherit from abstract classes with no state...
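[Editorial note: the type-safe enum pattern Brian refers to, sketched with a hypothetical `Suit` class. A client expression like `Suit.HEARTS` compiles to a getstatic of that field whether `Suit` is this class or a real `enum`, which is why clients that touch only the constants survive migration in either direction.]

```java
// Type-safe enum pattern (pre-Java-5 style): a final class with a private
// constructor and a fixed set of public static final instances, so the
// only values of the type are the declared constants.
final class Suit {
    static final Suit HEARTS = new Suit("HEARTS");
    static final Suit SPADES = new Suit("SPADES");

    private final String name;
    private Suit(String name) { this.name = name; }

    @Override public String toString() { return name; }
}
```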
From forax at univ-mlv.fr Fri Jul 26 01:26:28 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 26 Jul 2019 03:26:28 +0200 (CEST)
Subject: Records: migration compatibility
In-Reply-To:
References: <1177773204.1766005.1564102860185.JavaMail.zimbra@u-pem.fr>
Message-ID: <149470348.1766546.1564104388453.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, 26 July 2019 03:05:20
> Subject: Re: Records: migration compatibility

>> enums is a counter example here, no ?

>> for enums, the only refactoring allowed is from a class to an enum,
>> the other direction doesn't work because all enums inherit from
>> java.lang.Enum.

> In reality, compatibility is not a binary thing. Yes, if the client
> depends on Enum methods, then migrating in the other direction will fail.
> But if the client depends only on the enum constants, it is fine, because
> resolving EnumClass.X is done with getstatic either way.

> It's not unlike adding a method; adding a method is generally considered
> source-compatible, but not if there is another method of the same name and
> same arguments but a different return type. There are all sorts of
> asterisks associated with "binary compatible" and "source compatible."

It's true if a user controls all the code that uses the enum values, but
once you use external libraries (for example a JSON serializer like Jackson,
which serializes objects and enums differently), you start to have a
stricter view of what is binary-compatible or not.

>> given that a record already has different behavior at runtime than a
>> normal class (the Record attribute) and a public abstract class
>> (AbstractRecord), migration B seems unlikely.

> With enums, migration in both directions is common. Class-to-enum was
> common when we had classes using the type-safe enum pattern; but it's not
> uncommon to exceed what can be done with an enum, and refactor back to a
> class.
> I could imagine the same happening with records quite easily.

For records, we have a reflection method to extract the names of the
components of the primary constructor; you cannot do that with a class. At
best you have a deconstructor, but that is a positional way to see the
values of a record, not a named way. So again, migration from a record to a
class will depend on whether you use third-party libraries that rely on
reflection.

>> BTW, AbstractRecord also has the nasty side effect of not allowing
>> inline records.

> Good point. We've been talking about whether inline classes should be
> able to inherit from abstract classes with no state...

Rémi
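[Editorial note: Rémi's reflection point can be made concrete with the API as it later shipped in Java 16; the class names here are invented for illustration. A record exposes its component names and order reflectively, while a structurally identical class does not.]

```java
import java.lang.reflect.RecordComponent;
import java.util.Arrays;
import java.util.List;

public class ComponentNames {
    record Point(int x, int y) { }

    // Structurally the same state, but a plain class: no component metadata.
    static final class PointClass {
        final int x;
        final int y;
        PointClass(int x, int y) { this.x = x; this.y = y; }
    }

    // Returns the component names in declaration order for a record class,
    // or null: getRecordComponents() yields null for non-record classes.
    static List<String> componentNames(Class<?> c) {
        RecordComponent[] rcs = c.getRecordComponents();
        return rcs == null ? null
                : Arrays.stream(rcs).map(RecordComponent::getName).toList();
    }
}
```

So a serialization framework (or a Jackson-style library) can recover named state from `Point` but sees nothing comparable on `PointClass` -- which is why record-to-class migration can break third-party libraries that lean on this reflective view.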