Re: Data Oriented Programming, Beyond Records

Remi Forax forax at univ-mlv.fr
Wed Feb 25 17:43:02 UTC 2026


I like the message here:
it's simple, and as you said, we can build on it.

""" 
A class with a state description means that it has accessors for each 
component, and has a canonical deconstruction pattern -- that's it 
"""

One missing discussion is serialization, or more exactly: what about the easy serialization we get with records ?

If a deconstructible class implements Serializable, should a canonical constructor, with the same visibility as the class, be required * ?

Sadly, it's not a decision we can postpone, because it's not a backward compatible change. 

So should we have a special case for Serializable classes ? 
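
To make the question concrete, here is a minimal sketch of what records give us today: deserialization of a record goes through its canonical constructor, so the invariants are re-checked. The names `Range` and `SerialDemo` and the call counter are made up for this illustration only; nothing here is proposed syntax.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical record, used only for illustration.
record Range(int lo, int hi) implements Serializable {
    // Counts canonical constructor calls so the effect of deserialization is observable.
    static int constructed = 0;

    Range {
        if (lo > hi)
            throw new IllegalArgumentException("lo > hi");
        constructed++;
    }
}

final class SerialDemo {
    // Serializes a value to a byte array and reads it back.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Range original = new Range(1, 5);
        Range copy = (Range) roundTrip(original);
        // The copy is equal to the original, and the canonical constructor
        // ran once for `new` and once more during deserialization.
        System.out.println(copy.equals(original));
        System.out.println(Range.constructed);
    }
}
```

A deconstructible class without a canonical constructor has no such entry point, which is why the question cannot be postponed.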

regards, 
Rémi 

* unless a writeReplace or one of the other weird serialization mechanisms is present.

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, February 25, 2026 3:55:59 PM
> Subject: Re: Data Oriented Programming, Beyond Records

> We've had some good discussions here, and I've got some updates to the ideas
> outlined in the previous snapshot. There's a lot here, including a lot that I'm
> identifying as likely future directions but which should not be the topic of
> current discussion; I want to focus on the highest-priority aspects first (even
> though some of the lower-priority aspects are surely going to be tempting to
> discuss.)

> As we saw with records, there are two forces operating here, which are to some
> degree in competition:

> - A desire for stronger semantics. A class commits to almost nothing; a record
> commits to quite a lot -- by disavowing the freedom to do things that are
> outside the profile of "transparent, shallowly-immutable data carrier". From
> these stronger semantics, we can derive features such as deconstructability
> (pattern matching), reconstructability (withers), etc, as well as deriving a
> number of usually-boilerplate class members.

> - A desire for concise representation, not necessarily because of the effort of
> writing that code (IDEs will write it for you), but because _reading code that
> adds no value_ obfuscates intent. (As always, reading code is more important
> than writing code.)

> With records, we managed to get these entirely aligned; we were able to derive
> the desirable concise representation entirely from having picked the right
> semantics; this is what winning looks like. With the "carrier classes"
> suggestion as outlined in the previous snapshot, we are not quite there yet,
> and we've likely slid a little too far into the "unprincipled concision" camp,
> so we need to make some adjustments.

> Additionally, there was something very uncomfortable about the "carrier classes"
> proposal; it was a lot harder to look at a class and tell whether the absence
> of explicit declarations of certain members (especially equals and hashCode)
> meant they didn't exist, or whether it meant they were derived. The current
> situation where records get a lot of derived stuff and no one else does, while
> inconvenient for all non-record classes, has more clarity.

> In this mail, I outline a slightly different, somewhat more principled position,
> that addresses these concerns. Concision fans may be disappointed in some ways.

> ## Sizing up the problem

> The strong semantic guarantee of records gives us:

> - API contracts
>   - construction protocol (canonical constructor)
>   - deconstruction protocol (record patterns)
>   - component access protocol (accessor methods)
>   - state-based equality / hashCode / toString
> - Implementation convenience
>   - representation (fields)
>   - canonical constructor (modulo parameter validation and normalization)
>   - option to use compact constructor form
>   - component accessors
>   - record pattern (derived from accessors)
>   - implementations of equals, hashCode, toString
> - Potential future semantic features
>   - reconstruction (withers)
>   - nominal construction and deconstruction

> It will not be possible to give all of these benefits to the almost-record
> classes, because some derive directly from giving up flexibility that
> almost-records don't want to give up, such as the ability to have an internal
> representation that differs from the external API. In these cases, the
> implementation will have to "connect the dots" to some degree; our measure of
> progress will be the degree to which the required code is proportional to the
> deviation from the ideal.

> To get there, we will take the approach of first solving a smaller but more
> coherent problem -- how we get to deconstruction for arbitrary classes and
> interfaces -- and then come back as needed with more targeted tools for
> chipping away at the declaration overhead.

> ## Incomplete, canonical, nominal state descriptions

> In the previous round, we described the state description of a carrier class as
> a "complete, canonical, nominal state description" -- but this reflected a sort
> of wishful thinking. What it really was was an _incomplete_, canonical, nominal
> state description! Because we cannot stop the user from declaring
> additional state that might affect core behavior -- and in fact, the whole
> point is to give the user more freedom in defining the representation.

> Acknowledging this means we gain some clarity but lose some concision. The
> gained clarity is that a state description on an interface or class means
> something narrower than initially claimed -- that this class has these specific
> named components, and that it can be deconstructed with a canonical
> deconstruction pattern, whose shape matches that of the state description. The
> lost concision derives from lost semantics -- we really don't have a principled
> basis for deriving the Object method implementations in the face of arbitrary
> representation.

> A class with a state description means that it has accessors for each
> component, and has a canonical deconstruction pattern -- that's it

> Saying that the state description is solely about access to components (both
> individually and in bulk) allows us to drop all of the structural restrictions
> we might be inclined to impose on such interfaces or classes -- they can be
> final or not, extend other classes or not, they can have whatever constructors
> they like, carriers can extend non-carriers and vice versa, etc.

> We will call a class or interface that has a state description in its
> declaration a _deconstructible class_, and the elements of the state
> description _class components_. Records become a restricted form of
> deconstructible classes.

> The rules about overriding components largely derive from the existing rules
> about overriding their corresponding accessor methods; subclasses can
> covariantly refine the type of an inherited superclass component, but cannot
> have a component of the same name but a (sufficiently) different type.
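
The accessor rule this builds on can already be seen with plain classes today. A small sketch, with hypothetical names (`Measured`, `Counted`) and no state-description syntax, assuming the component rules really do mirror the accessor rules as described:

```java
// Hypothetical hand-written classes; only the accessor methods matter here.
class Measured {
    private final Number size;

    Measured(Number size) { this.size = size; }

    // Accessor for a "size" component of type Number.
    Number size() { return size; }
}

class Counted extends Measured {
    Counted(Integer size) { super(size); }

    // Covariant refinement: Integer is a subtype of Number, so this override is legal.
    @Override
    Integer size() { return (Integer) super.size(); }

    // A `String size()` here would be rejected by the compiler,
    // because String is not a subtype of Number.
}
```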

> ## What's left?

> So, what's left? Quite a lot, of course; we've addressed the "how do other
> classes and interfaces get pattern matching", but nothing else yet. The "what's
> left" includes:

> - Reconstruction (withers)
> - Deriving implementations of accessors
> - Compact constructors for deconstructible classes
> - Deriving / streamlining object method implementation

> We'll take these in turns, though I'm going to label some of these as "for
> future discussion" to indicate that they are lower on the priority list and
> guide discussions to the higher-priority items.

> ### Reconstruction

> Reconstruction (withers) requires an underlying canonical
> constructor-deconstructor pair. Records always have these, so they always
> qualify for reconstruction. But we said earlier that we make no assumptions
> about the construction protocol of arbitrary deconstructible classes, so how do
> they qualify for reconstructibility?

> We have long struggled with the question of what aspects of this have to be
> explicitly declared vs what aspects can be reasonably inferred by structure.
> Given that we've raised deconstructibility not only to a language feature, but
> to a prominent place in the class declaration, it seems reasonable to say that:

> - A constructor of a deconstructible class D is canonical if it matches the
> state description of D.
> - A deconstructible class D is reconstructible by client C if D has a canonical
> constructor and that constructor is accessible to C.

> That is, given a deconstructible class, which has a canonical state description,
> we can structurally recognize when a constructor is canonical, and derive
> reconstruction if that constructor is present and accessible.

> It may further be desirable to restrict reconstruction to final classes, as this
> reduces the risk of "decapitation", which seems to freak people out quite a lot
> when they learn about the risk (I think this is mostly "unfamiliarity bias",
> but is a restriction worth considering.)

> ### Derived fields in records

> By far the most common profile of "almost records" is "records that want to
> derive some state from their components and cache it." The previous proposal
> addressed this through carrier classes; after some evaluation, I think it is
> better to handle this within records themselves.

> We've recently exposed the "lazy constants" work through an API class, but the
> long-term goal has always been to sediment these into the language eventually.
> Over there, there are discussions going on about "cached instance methods" or
> "lazy instance fields", which would allow us to expose lazily derived, cached state in
> records without undermining the record imperative. Of course, this mechanism
> would be available to classes other than records as well.

> Extending the reach of records takes some pressure off of the use cases for
> carrier classes, as more things that are "almost record" can become real
> records. So we'll let the work on laziness play out, and see to what extent it
> addresses the concerns about "records aren't expressive enough."

> ### Nominal invocation

> Reconstruction leans on the nominality of components, but people have been
> wishing for nominal invocation of constructors and deconstructors as well for a
> long time. To the extent we allow for nominal creation and deconstruction of
> records, deconstructible classes would be able to come along for that ride.
> (But please, let's not discuss this now.)

> ### Derived accessors and compact constructors

> Deconstructible classes get the requirement for accessors for each component,
> but as currently stated, get no help in declaring them, because, unlike with
> records, the language is unaware of the mapping between the external API (as
> defined by the state description) and the internal representation.

> In the previous iteration, we filled in that mapping with a `component` modifier
> on fields, which connected those dots, and therefore allowed derivation of the
> (often numerous) accessors whose external API _does_ align with the internal
> representation. (This was a semantic claim: that the field called `x` and the
> component called `x` describe the same thing.) We will come back later to
> explore whether this concept still carries its weight in the slimmed-down role
> for carrier classes.

> Having severed the state description from the construction protocol, the
> implementation is free to choose its own construction protocol. But many
> deconstructible classes are likely to choose a constructor that matches the
> state description, and in these cases, at the very least, we can elide the
> redundant signature declaration of the canonical constructor using the "compact
> constructor" syntax, as we do with records. We may also further use the
> `component` fields to streamline the constructor, but given the slimmed-down
> role for carrier classes, this would also have to be reevaluated.

> ### Object methods

> Having dropped the derivation of Object methods -- because we didn't really have
> a sufficiently principled basis for doing so -- these implementations are
> likely to be either significant sources of boilerplate, or worse, forgotten
> about in deconstructible classes.

> Our best story for this builds on the currently-dormant "concise method bodies"
> JEP, that allows us to delegate method implementations either to method
> references or to objects that implement the method, such as:

> boolean equals(Object other) __delegates_to <equalator-object>

> paired with an API for constructing such objects (which could drive all of the
> Object methods, not just `equals`). This is something that would benefit all
> classes, not just deconstructible ones. (We will return to this topic when
> concise method bodies comes closer to the top of the priority queue. )
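
Even without the delegation syntax, the "object that implements the method" half can be approximated in today's Java. The sketch below is only an assumption about what such an API could look like: `Equalator`, its `of`/`equal`/`hash` methods, and the `Interval` class are all hypothetical names, not an existing or proposed JDK API.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper built from a list of accessor functions.
final class Equalator<T> {
    private final Class<T> type;
    private final List<Function<? super T, ?>> accessors;

    private Equalator(Class<T> type, List<Function<? super T, ?>> accessors) {
        this.type = type;
        this.accessors = accessors;
    }

    @SafeVarargs
    static <T> Equalator<T> of(Class<T> type, Function<? super T, ?>... accessors) {
        return new Equalator<>(type, List.of(accessors));
    }

    // Two objects are equal if every accessor yields equal values.
    boolean equal(T self, Object other) {
        if (self == other) return true;
        if (!type.isInstance(other)) return false;
        T that = type.cast(other);
        for (Function<? super T, ?> accessor : accessors) {
            if (!Objects.equals(accessor.apply(self), accessor.apply(that))) return false;
        }
        return true;
    }

    // Hash code combined from the same accessors, so it agrees with equal().
    int hash(T self) {
        int result = 1;
        for (Function<? super T, ?> accessor : accessors) {
            result = 31 * result + Objects.hashCode(accessor.apply(self));
        }
        return result;
    }
}

// A class delegating its Object methods by hand.
final class Interval {
    private static final Equalator<Interval> EQ =
        Equalator.of(Interval.class, Interval::start, Interval::end);

    private final int start;
    private final int end;

    Interval(int start, int end) { this.start = start; this.end = end; }

    int start() { return start; }
    int end() { return end; }

    @Override public boolean equals(Object other) { return EQ.equal(this, other); }
    @Override public int hashCode() { return EQ.hash(this); }
}
```

The two one-line overrides in `Interval` are the part a `__delegates_to` form would presumably shorten.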

> ## Summary

> What we see here eventually gets to the same place -- suitable classes can
> participate in deconstruction, reconstruction, and any future nominal
> construction/deconstruction; the most common forms of "almost records" are
> absorbed into records; classes that are largely data holder classes can get
> more concise expression. Some of these are deferred into the (possibly
> infinite) future, but almost all of these are more broadly applicable than what
> was outlined in the previous version. And we reclaim the clarity that comes
> from records being the locus of derived members, rather than sprinkling
> invisible members into other classes.

> On 1/13/2026 4:52 PM, Brian Goetz wrote:

>> Here's a snapshot of where my head is at with respect to extending the record
>> goodies (including pattern matching) to a broader range of classes,
>> deconstructors for classes and interfaces, and compatible evolution of records.
>> Hopefully this will unblock quite a few things.

>> As usual, let's discuss concepts and directions rather than syntax.

>> # Data-oriented Programming for Java: Beyond records

>> Everyone loves records; they allow us to create shallowly immutable data holder
>> classes -- which we can think of as "nominal tuples" -- derived from a concise
>> state description, and to destructure records through pattern matching. But
>> records have strict constraints, and not all data holder classes fit into the
>> restrictions of records. Maybe they have some mutable state, or derived or
>> cached state that is not part of the state description, or their representation
>> and their API do not match up exactly, or they need to break up their state
>> across a hierarchy. In these classes, even though they may also be “data
>> holders”, the user experience is like falling off a cliff. Even a small
>> deviation from the record ideal means one has to go back to a blank slate and
>> write explicit constructor declarations, accessor method declarations, and
>> Object method implementations -- and give up on destructuring through pattern
>> matching.

>> Since the start of the design process for records, we’ve kept in mind the goal
>> of enabling a broader range of classes to gain access to the "record goodies":
>> reduced declaration burden, participating in destructuring, and soon,
>> [reconstruction](https://openjdk.org/jeps/468). During the design of records, we
>> also explored a number of weaker semantic models that would allow for greater
>> flexibility. While at the time they all failed to live up to the goals _for
>> records_, there is a weaker set of semantic constraints we can impose that
>> allows for more flexibility and still enables the features we want, along with
>> some degree of syntactic concision that is commensurate with the distance from
>> the record-ideal, without fall-off-the-cliff behaviors.

>> Records, sealed classes, and destructuring with record patterns constitute the
>> first feature arc of "data-oriented programming" for Java. After considering
>> numerous design ideas, we're now ready to move forward with the next "data
>> oriented programming" feature arc: _carrier classes_ (and interfaces.)

>> ## Beyond record patterns

>> Record patterns allow a record instance to be destructured into its components.
>> Record patterns can be used in `instanceof` and `switch`, and when a record
>> pattern is also exhaustive, will be usable in the upcoming [_pattern assignment
>> statement_](https://mail.openjdk.org/pipermail/amber-spec-experts/2026-January/004306.html)
>> feature.

>> In exploring the question "how will classes be able to participate in the same
>> sort of destructuring as records", we had initially focused on a new form of
>> declaration in a class -- a "deconstructor" -- that operated as a constructor in
>> reverse. Just as a constructor takes component values and produces an aggregate
>> instance, a deconstructor would take an aggregate instance and recover its
>> component values.

>> But as this exploration played out, the more interesting question turned out to
>> be: which classes are suitable for destructuring in the first place? And the
>> answer to that question led us to a different approach for expressing
>> deconstruction. The classes that are suitable for destructuring are those that,
>> like records, are little more than carriers for a specific tuple of data. This
>> is not just a thing that a class _has_, like a constructor or method, but
>> something a class _is_. And as such, it makes more sense to describe
>> deconstruction as a top-level property of a class. This, in turn, leads to a
>> number of simplifications.

>> ## The power of the state description

>> Records are a semantic feature; they are only incidentally concise. But they
>> _are_ concise; when we declare a record

>> record Point(int x, int y) { ... }

>> we automatically get a sensible API (canonical constructor, deconstruction
>> pattern, accessor methods for each component) and implementation (fields,
>> constructor, accessor methods, Object methods.) We can explicitly specify most
>> of these (except the fields) if we like, but most of the time we don't have to,
>> because the default is exactly what we want.

>> A record is a shallowly-immutable, final class whose API and representation are
>> _completely defined_ by its _state description_. (The slogan for records is
>> "the state, the whole state, and nothing but the state.") The state description
>> is the ordered list of _record components_ declared in the record's header. A
>> component is more than a mere field or accessor method; it is an API element on
>> its own, describing a state element that instances of the class have.

>> The state description of a record has several desirable properties:

>> - The components, in the order specified, are the _canonical_ description of the
>> record's state.
>> - The components are the _complete_ description of the record’s state.
>> - The components are _nominal_; their names are a committed part of the
>> record's API.

>> Records derive their benefits from making two commitments:

>> - The _external_ commitment that the data-access API of a record (constructor,
>> deconstruction pattern, and component accessor methods) is defined by the
>> state description.
>> - The _internal_ commitment that the _representation_ of the record (its
>> fields) is also completely defined by the state description.

>> These semantic properties are what enable us to derive almost everything about
>> records. We can derive the API of the canonical constructor because the state
>> description is canonical. We can derive the API for the component accessor
>> methods because the state description is nominal. And we can derive a
>> deconstruction pattern from the accessor methods because the state description
>> is complete (along with sensible implementations for the state-related `Object`
>> methods.)

>> The internal commitment that the state description is also the representation
>> allows us to completely derive the rest of the implementation. Records get a
>> (private, final) field for each component, but more importantly, there is a
>> clear mapping between these fields and their corresponding components, which is
>> what allows us to derive the canonical constructor and accessor method
>> implementations.

>> Records can additionally declare a _compact constructor_ that allows us to elide
>> the boilerplate aspects of record constructors -- the argument list and field
>> assignments -- and just specify the code that is _not_ mechanically derivable.
>> This is more concise, less error-prone, and easier to read:

>> record Rational(int num, int denom) {
>>     Rational {
>>         if (denom == 0)
>>             throw new IllegalArgumentException("denominator cannot be zero");
>>     }
>> }

>> is shorthand for the more explicit

>> record Rational(int num, int denom) {
>>     Rational(int num, int denom) {
>>         if (denom == 0)
>>             throw new IllegalArgumentException("denominator cannot be zero");
>>         this.num = num;
>>         this.denom = denom;
>>     }
>> }

>> While compact constructors are pleasantly concise, the more important benefit is
>> that by eliminating the mechanically derivable code, the "more interesting" code
>> comes to the fore.
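
A compact constructor can also normalize, since its parameters may be reassigned before they are committed to the fields. A small sketch in current Java; `Fraction` is a hypothetical variant of the `Rational` record above, and the choice to reduce to lowest terms with a positive denominator is an assumption for illustration:

```java
// Hypothetical variant of Rational that also normalizes its components.
record Fraction(int num, int denom) {
    Fraction {
        if (denom == 0)
            throw new IllegalArgumentException("denominator cannot be zero");
        // Reassigning the parameters is allowed in a compact constructor;
        // their final values are what end up in the fields.
        int g = gcd(Math.abs(num), Math.abs(denom));
        if (denom < 0)
            g = -g;          // keep the sign on the numerator
        num /= g;
        denom /= g;
    }

    // Euclid's algorithm on non-negative arguments.
    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}
```

Only the validation and normalization appear in the body; the field assignments remain implicit.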

>> Looking ahead, the state description is a gift that keeps on giving. These
>> semantic commitments are enablers for a number of potential future language and
>> library features for managing object lifecycle, such as:

>> - [Reconstruction](https://openjdk.org/jeps/468) of record instances, allowing
>> the appearance of controlled mutation of record state.
>> - Automatic marshalling and unmarshalling of record instances.
>> - Instantiating or destructuring record instances identifying components
>> nominally rather than positionally.

>> ### Reconstruction

>> JEP 468 proposes a mechanism by which a new record instance can be derived from
>> an existing one using syntax that is evocative of direct mutation, via a `with`
>> expression:

>> record Complex(double re, double im) { }
>> Complex c = ...
>> Complex cConjugate = c with { im = -im; };

>> The block on the right side of `with` can contain any Java statements, not just
>> assignments. The block is enhanced with mutable variables (_component variables_),
>> one for each component of the record, each initialized to the value of that component
>> in the record instance on the left. The block is executed, and a new record instance
>> is created whose component values are the ending values of the component variables.

>> A reconstruction expression implicitly destructures the record instance using
>> the canonical deconstruction pattern, executes the block in a scope enhanced
>> with the component variables, and then creates a new record using the canonical
>> constructor. Invariant checking is centralized in the canonical constructor, so
>> if the new state is not valid, the reconstruction will fail. JEP 468 has been
>> "on hold" for a while, primarily because we were waiting for sufficient
>> confidence that there was a path to extending it to suitable classes before
>> committing to it for records. The ideal path would be for those classes to also
>> support a notion of canonical constructor and deconstruction pattern.

>> Careful readers will note a similarity between the transformation block of a
>> `with` expression and the body of a compact constructor. In both cases, the
>> block is "preloaded" with a set of component variables, initialized to suitable
>> starting values, the block can mutate those variables as desired, and upon
>> normal completion of the block, those variables are passed to a canonical
>> constructor to produce the final result. The main difference is where the
>> starting values come from; for a compact constructor, it is from the constructor
>> parameters, and for a reconstruction expression, it is from the canonical
>> deconstruction pattern of the source record to the left of `with`.
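
Written out by hand in current Java, the three steps described above look roughly like this. It is a sketch of the described behavior, not the translation specified by JEP 468; `WithByHand` and `conjugate` are hypothetical names.

```java
record Complex(double re, double im) { }

final class WithByHand {
    // Hand-written equivalent of: c with { im = -im; }
    static Complex conjugate(Complex c) {
        // 1. Destructure into mutable component variables.
        double re = c.re();
        double im = c.im();
        // 2. Run the body of the block against those variables.
        im = -im;
        // 3. Rebuild through the canonical constructor, which re-checks any invariants.
        return new Complex(re, im);
    }
}
```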

>> ### Breaking down the cliff

>> Records make a strong semantic commitment to derive both their API and
>> representation from the state description, and in return get a lot of help from
>> the language. We can now turn our attention to smoothing out "the cliff" --
>> identifying weaker semantic commitments that classes can make that would still
>> allow classes to get _some_ help from the language. And ideally, the amount of
>> help you give up would be proportional to the degree of deviation from the
>> record ideal.

>> With records, we got a lot of mileage out of having a complete, canonical,
>> nominal state description. Where the record contract is sometimes too
>> constraining is the _implementation_ contract that the representation aligns
>> exactly with the state description, that the class is final, that the fields are
>> final, and that the class may not extend anything but `Record`.

>> Our path here takes one step back and one step forward: keeping the external
>> commitment to the state description, but dropping the internal commitment that
>> the state description _is_ the representation -- and then _adding back_ a simple
>> mechanism for mapping fields representing components back to their corresponding
>> components, where practical. (With records, because we derive the
>> representation from the state description, this mapping can be safely inferred.)

>> As a thought experiment, imagine a class that makes the external commitment to a
>> state description -- that the state description is a complete, canonical,
>> nominal description of its state -- but is on its own to provide its
>> representation. What can we do for such a class? Quite a bit, actually. For
>> all the same reasons we can for records, we can derive the API requirement for a
>> canonical constructor and component accessor methods. From there, we can derive
>> both the requirement for a canonical deconstruction pattern, and also the
>> implementation of the deconstruction pattern (as it is implemented in terms of
>> the accessor methods). And since the state description is complete, we can
>> further derive sensible default implementations of the Object methods `equals`,
>> `hashCode`, and `toString` in terms of the accessor methods as well. And given
>> that there is a canonical constructor and deconstruction pattern, it can also
>> participate in reconstruction. The author would just have to provide the
>> fields, accessor methods, and canonical constructor. This is good progress, but
>> we'd like to do better.

>> What enables us to derive the rest of the implementation for records (fields,
>> constructor, accessor methods, and Object methods) is the knowledge of how the
>> representation maps to the state description. Records commit to their state
>> description _being_ the representation, so it is a short leap from there to a
>> complete implementation.

>> To make this more concrete, let's look at a typical "almost record" class, a
>> carrier for the state description `(int x, int y, Optional<String> s)` but which
>> has made the representation choice to internally store `s` as a nullable
>> `String`.

>> ```
>> class AlmostRecord {
>>     private final int x;
>>     private final int y;
>>     private final String s; // *

>>     public AlmostRecord(int x, int y, Optional<String> s) {
>>         this.x = x;
>>         this.y = y;
>>         this.s = s.orElse(null); // *
>>     }

>>     public int x() { return x; }
>>     public int y() { return y; }
>>     public Optional<String> s() {
>>         return Optional.ofNullable(s); // *
>>     }

>>     public boolean equals(Object other) { ... } // derived from x(), y(), s()
>>     public int hashCode() { ... } // "
>>     public String toString() { ... } // "
>> }
>> ```

>> The main differences between this class and the expansion of its record analogue
>> are the lines marked with a `*`; these are the ones that deal with the disparity
>> between the state description and the actual representation. It would be nice
>> if the author of this class _only_ had to write the code that was different from
>> what we could derive for a record; not only would this be pleasantly concise,
>> but it would mean that all the code that _is_ there exists to capture the
>> differences between its representation and its API.
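
For reference, here is one plausible way to fill in the elided `Object` methods in current Java, written purely in terms of the accessors. The bodies of `equals`, `hashCode`, and `toString` (and the `toString` format) are my assumption of what "derived from x(), y(), s()" would mean, not something taken from the proposal.

```java
import java.util.Objects;
import java.util.Optional;

class AlmostRecord {
    private final int x;
    private final int y;
    private final String s;                 // nullable internal representation

    public AlmostRecord(int x, int y, Optional<String> s) {
        this.x = x;
        this.y = y;
        this.s = s.orElse(null);            // API type -> representation
    }

    public int x() { return x; }
    public int y() { return y; }
    public Optional<String> s() {
        return Optional.ofNullable(s);      // representation -> API type
    }

    // Assumed derivation: compare component by component through the accessors.
    @Override
    public boolean equals(Object other) {
        return other instanceof AlmostRecord that
            && x() == that.x()
            && y() == that.y()
            && s().equals(that.s());
    }

    @Override
    public int hashCode() {
        return Objects.hash(x(), y(), s());
    }

    @Override
    public String toString() {
        return "AlmostRecord[x=" + x() + ", y=" + y() + ", s=" + s() + "]";
    }
}
```

Everything outside the three commented lines is what a record would have derived for us.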

>> ## Carrier classes

>> A _carrier class_ is a normal class declared with a state description. As with
>> a record, the state description is a complete, canonical, nominal description of
>> the class's state. In return, the language derives the same API constraints as
>> it does for records: canonical constructor, canonical deconstruction pattern,
>> and component accessor methods.

>> class Point(int x, int y) { // class, not record!
>>     // explicitly declared representation

>>     ...

>>     // must have a constructor taking (int x, int y)
>>     // must have accessors for x and y
>>     // supports a deconstruction pattern yielding (int x, int y)
>> }

>> Unlike a record, the language makes no assumptions about the object's
>> representation; the class author has to declare that just as with any other
>> class.

>> Saying the state description is "complete" means that it carries all the
>> “important” state of the class -- if we were to extract this state and recreate
>> the object, that should yield an “equivalent” instance. As with records, this
>> can be captured by tying together the behavior of construction, accessors, and
>> equality:

>> ```
>> Point p = ...
>> Point q = new Point(p.x(), p.y());
>> assert p.equals(q);
>> ```

>> We can also derive _some_ implementation from the information we have so far; we
>> can derive sensible implementations of the `Object` methods (implemented in
>> terms of component accessor methods) and we can derive the canonical deconstruction
>> pattern (again in terms of the component accessor methods). And from there, we
>> can derive support for reconstruction (`with` expressions.) Unfortunately, we
>> cannot (yet) derive the bulk of the state-related implementation: the canonical
>> constructor and component accessor methods.

>> ### Component fields and accessor methods

>> One of the most tedious aspects of data-holder classes is the accessor methods;
>> there are often many of them, and they are almost always pure boilerplate. Even
>> though IDEs can reduce the writing burden by generating these for us, readers
>> still have to slog through a lot of low-information code -- just to learn that
>> they didn't actually need to slog through that code after all. We can derive
>> the implementation of accessor methods for records because records make the
>> internal commitment that the components are all backed with individual fields
>> whose name and type align with the state description.

>> For a carrier class, we don't know whether _any_ of the components are directly
>> backed by a single field that aligns to the name or type of the component. But
>> it is a pretty good bet that many carrier class components will do exactly this
>> for at least _some_ of their fields. If we can tell the language that this
>> correspondence is not merely accidental, the language can do more for us.

>> We do so by allowing suitable fields of a carrier class to be declared as
>> `component` fields. (As usual at this stage, syntax is provisional, but not
>> currently a topic for discussion.) A component field must have the same name
>> and type as a component of the current class (though it need not be `private` or
>> `final`, as record fields are.) This signals that this field _is_ the
>> representation for the corresponding component, and hence we can derive the
>> accessor method for this component as well.

>> ```
>> class Point(int x, int y) {
>>     private /* mutable */ component int x;
>>     private /* mutable */ component int y;

>>     // must have a canonical constructor, but (so far) must be explicit
>>     public Point(int x, int y) {
>>         this.x = x;
>>         this.y = y;
>>     }

>>     // derived implementations of accessors for x and y
>>     // derived implementations of equals, hashCode, toString
>> }
>> ```

>> This is getting better; the class author had to bring the representation and the
>> mapping from representation to components (in the form of the `component`
>> modifier), and the canonical constructor.

>> ### Compact constructors

>> Just as we are able to derive the accessor method implementation if we are
>> given an explicit correspondence between a field and a component, we can do the
>> same for constructors. For this, we build on the notion of _compact
>> constructors_ that was introduced for records.

>> As with a record, a compact constructor in a carrier class is a shorthand for a
>> canonical constructor, which has the same shape as the state description, but
>> which is freed of the responsibility of actually committing the ending value of
>> the component parameters to the fields. The main difference is that for a
>> record, _all_ of the components are backed by a component field, whereas for a
>> carrier class, only some of them might be. But we can generalize compact
>> constructors by freeing the author of the responsibility to initialize the
>> _component_ fields, while leaving them responsible for initializing the rest of
>> the fields. In the limiting case where all components are backed by component
>> fields, and there is no other logic desired in the constructor, the compact
>> constructor may be elided.

>> For our mutable `Point` class, this means we can elide nearly everything, except
>> the field declarations themselves:

>> ```
>> class Point(int x, int y) {
>>     private /* mutable */ component int x;
>>     private /* mutable */ component int y;

>>     // derived compact constructor
>>     // derived accessors for x, y
>>     // derived implementations of equals, hashCode, toString
>> }
>> ```

>> We can think of this class as having an implicit empty compact constructor,
>> which in turn means that the component fields `x` and `y` are initialized from
>> their corresponding constructor parameters. There are also implicitly derived
>> accessor methods for each component, and implementations of `Object` methods
>> based on the state description.
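
For comparison, here is a hand-written sketch in today's Java of roughly what those two `component` declarations would stand in for. The class name `MutablePoint` is hypothetical, and the exact form of the derived `equals`, `hashCode` and `toString` is an assumption modelled on what records do; the proposal text above does not pin it down.

```java
import java.util.Objects;

// Hypothetical hand-expansion of the mutable Point carrier class.
class MutablePoint {
    private int x;
    private int y;

    // Canonical constructor: same shape as the state description (int x, int y).
    public MutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // One accessor per component.
    public int x() { return x; }
    public int y() { return y; }

    // Object methods based on the state description (assumed to be record-like).
    @Override
    public boolean equals(Object o) {
        return o instanceof MutablePoint p && x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "MutablePoint[x=" + x + ", y=" + y + "]";
    }
}
```

Everything in this class except the two field declarations is what the carrier class version would leave to the compiler.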

>> This is great for a class where all the components are backed by fields, but
>> what about our `AlmostRecord` class? The story here is good as well; we can
>> derive the accessor methods for the components backed by component fields, and
>> we can elide the initialization of the component fields from the compact
>> constructor, meaning that we _only_ have to specify the code for the parts that
>> deviate from the "record ideal":

>> ```
>> class AlmostRecord(int x,
>>                    int y,
>>                    Optional<String> s) {

>>     private final component int x;
>>     private final component int y;
>>     private final String s;

>>     public AlmostRecord {
>>         this.s = s.orElse(null);
>>         // x and y fields implicitly initialized
>>     }

>>     public Optional<String> s() {
>>         return Optional.ofNullable(s);
>>     }

>>     // derived implementation of x and y accessors
>>     // derived implementation of equals, hashCode, toString
>> }
>> ```

>> Because so many real-world almost-records differ from their record ideal in
>> minor ways, we expect to get a significant concision benefit for most carrier
>> classes, as we did for `AlmostRecord`. As with records, if we want to
>> explicitly implement the constructor, accessor methods, or `Object` methods, we
>> are still free to do so.
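
To make the concision claim concrete, this is a sketch of the same class written out by hand in today's Java. The name `AlmostRecordToday` is hypothetical, and the `equals`/`hashCode`/`toString` bodies are an assumption about what "derived from the state description" would amount to.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical hand-written version of AlmostRecord in today's Java.
final class AlmostRecordToday {
    private final int x;
    private final int y;
    private final String s;   // nullable representation of the Optional<String> component

    public AlmostRecordToday(int x, int y, Optional<String> s) {
        this.x = x;                 // boilerplate the compact constructor would take over
        this.y = y;                 // boilerplate the compact constructor would take over
        this.s = s.orElse(null);    // the only part that deviates from the "record ideal"
    }

    public int x() { return x; }
    public int y() { return y; }
    public Optional<String> s() { return Optional.ofNullable(s); }

    @Override
    public boolean equals(Object o) {
        return o instanceof AlmostRecordToday a
                && x == a.x && y == a.y && Objects.equals(s, a.s);
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y, s);
    }

    @Override
    public String toString() {
        return "AlmostRecordToday[x=" + x + ", y=" + y + ", s=" + s() + "]";
    }
}
```

Only the `s` field, its initialization and its accessor carry information that is not already in the state description.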

>> ### Derived state

>> One of the most frequent complaints about records is the inability to derive
>> state from the components and cache it for fast retrieval. With carrier
>> classes, this is simple: declare a non-component field for the derived quantity,
>> initialize it in the constructor, and provide an accessor:

>> ```
>> class Point(int x, int y) {
>>     private final component int x;
>>     private final component int y;
>>     private final double norm;

>>     Point {
>>         norm = Math.hypot(x, y);
>>     }

>>     public double norm() { return norm; }

>>     // derived implementation of x and y accessors
>>     // derived implementation of equals, hashCode, toString
>> }
>> ```
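
As a point of reference, the two options available in today's Java look roughly like this. Both type names are hypothetical; the second class omits `equals`, `hashCode` and `toString` for brevity, which is exactly the boilerplate a carrier class would derive.

```java
// Today's record: it cannot declare an extra instance field,
// so the norm is recomputed on every call.
record RecordPoint(int x, int y) {
    double norm() {
        return Math.hypot(x, y);
    }
}

// Hypothetical hand-written class: the norm is computed once in the constructor,
// at the cost of writing the fields, constructor and accessors by hand.
final class CachedPoint {
    private final int x;
    private final int y;
    private final double norm;   // derived state, not a component

    CachedPoint(int x, int y) {
        this.x = x;
        this.y = y;
        this.norm = Math.hypot(x, y);
    }

    int x() { return x; }
    int y() { return y; }
    double norm() { return norm; }
}
```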

>> ### Deconstruction and reconstruction

>> Like records, carrier classes automatically acquire deconstruction patterns that
>> match the canonical constructor, so we can destructure our `Point` class as if
>> it were a record:

>>     case Point(var x, var y):

>> Because reconstruction (`with`) derives from a canonical constructor and
>> corresponding deconstruction pattern, when we support reconstruction of records,
>> we will also be able to do so for carrier classes:

>>     point = point with { x = 3; }
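
Since `with` is not available yet, the following sketch spells out by hand the "deconstruct, change, reconstruct" reading of `point with { x = 3; }`, using a plain record and a record pattern (Java 21 or later). The helper class `Reconstruct` and its method `withX` are hypothetical names, and this is only one plausible reading of how reconstruction would behave.

```java
// A plain record stands in for the carrier class, since records can already be deconstructed.
record Point(int x, int y) { }

final class Reconstruct {
    // Hand-written stand-in for `point with { x = newX; }`.
    static Point withX(Point point, int newX) {
        if (point instanceof Point(var x, var y)) {   // deconstruct into components
            return new Point(newX, y);                // reconstruct: x replaced, y carried over
        }
        // A record pattern never matches null, so this branch is only reached for null.
        throw new IllegalArgumentException("point must not be null");
    }
}
```

For example, `Reconstruct.withX(new Point(1, 2), 3)` yields a point equal to `new Point(3, 2)`.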

>> ## Carrier interfaces

>> A state description makes sense on interfaces as well. It makes the statement
>> that the state description is a complete, canonical, nominal description of the
>> interface's state (subclasses are allowed to add additional state), and
>> accordingly, implementations must provide accessor methods for the components.
>> This enables such interfaces to participate in pattern matching:

>> ```
>> interface Pair<T,U>(T first, U second) {
>>     // implicit abstract accessors for first() and second()
>> }

>> ...

>> if (o instanceof Pair(var a, var b)) { ... }
>> ```

>> Along with the upcoming feature for pattern assignment in foreach-loop headers,
>> if `Map.Entry` became a carrier interface (which it will), we would be able to
>> iterate a `Map` like:

>>     for (Map.Entry(var key, var val) : map.entrySet()) { ... }
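
Until both features exist, the same loop has to bind the entry and call its accessors by hand. A small sketch, with a hypothetical helper `EntryLoop.render` standing in for the loop body:

```java
import java.util.Map;

final class EntryLoop {
    // Joins the entries as "key=value" pairs separated by commas, in iteration order.
    static String render(Map<String, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            String key = e.getKey();      // what `var key` would bind
            Integer val = e.getValue();   // what `var val` would bind
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(key).append('=').append(val);
        }
        return sb.toString();
    }
}
```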

>> It is a common pattern in libraries to export an interface that is sealed to a
>> single private implementation. In this pattern, the interface and
>> implementation can share a common state description:

>> ```
>> public sealed interface Pair<T,U>(T first, U second) { }

>> private record PairImpl<T, U>(T first, U second) implements Pair<T, U> { }
>> ```

>> Compared to the old way of doing this, we get enhanced semantics, better type
>> checking, and more concision.
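
For contrast, a sketch of "the old way" in today's Java (17 or later). The factory method `of` and the client class `PairDemo` are hypothetical additions for illustration, and the implementation is package-private here rather than truly private.

```java
// Today the interface has no state description, so the accessors are declared by hand.
sealed interface Pair<T, U> permits PairImpl {
    T first();
    U second();

    static <T, U> Pair<T, U> of(T first, U second) {
        return new PairImpl<>(first, second);
    }
}

// In a real library this type would be hidden from clients.
record PairImpl<T, U>(T first, U second) implements Pair<T, U> { }

final class PairDemo {
    // Without a carrier interface, clients cannot write `o instanceof Pair(var a, var b)`;
    // they match on the type and then call the accessors.
    static String describe(Object o) {
        if (o instanceof Pair<?, ?> p) {
            return p.first() + " / " + p.second();
        }
        return "not a pair";
    }
}
```

Nothing in this version tells the compiler that `first()` and `second()` together make up the whole state of a `Pair`, which is the extra semantics the shared state description would supply.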

>> ### Extension

>> The main obligation of a carrier class author is to ensure that the fundamental
>> claim -- that the state description is a complete, canonical, nominal
>> description of the object's state -- is actually true. This does not rule out
>> having the representation of a carrier class spread out over a hierarchy, so
>> unlike records, carrier classes are not required to be final or concrete, nor
>> are they restricted in their extension.

>> There are several cases that arise when carrier classes can participate in
>> extension:

>> - A carrier class extends a non-carrier class;
>> - A non-carrier class extends a carrier class;
>> - A carrier class extends another carrier class, where all of the superclass
>> components are subsumed by the subclass state description;
>> - A carrier class extends another carrier class, but there are one or more
>> superclass components that are not subsumed by the subclass state
>> description.

>> Extending a non-carrier class with a carrier class will usually be motivated by
>> the desire to "wrap" a state description around an existing hierarchy which we
>> cannot or do not want to modify directly, but we wish to gain the benefits of
>> deconstruction and reconstruction. Such an implementation would have to ensure
>> that the class actually conforms to the state description, and that the
>> canonical constructor and component accessors are implemented.

>> When one carrier class extends another, the more straightforward case is that it
>> simply adds new components to the state description of the superclass. For
>> example, given our `Point` class:

>> ```
>> class Point(int x, int y) {
>>     component int x;
>>     component int y;

>>     // everything else for free!
>> }
>> ```

>> we can use this as the base class for a 3d point class:

>> ```
>> class Point3d(int x, int y, int z) extends Point {
>>     component int z;

>>     Point3d {
>>         super(x, y);
>>     }
>> }
>> ```

>> In this case -- because the superclass components are all part of the subclass
>> state description -- we can actually omit the constructor as well, because we
>> can derive the association between subclass components and superclass
>> components, and thereby derive the needed super-constructor invocation. So we
>> could actually write:

>> ```
>> class Point3d(int x, int y, int z) extends Point {
>>     component int z;

>>     // everything else for free!
>> }
>> ```

>> One might think that we would need some marking on the `x` and `y` components of
>> `Point3d` to indicate that they map to the corresponding components of `Point`,
>> as we did for associating component fields with their corresponding components.
>> But in this case, we need no such marking, because there is no way that an `int
>> x` component of `Point` and an `int x` component of its subclass could possibly
>> refer to different things -- since they both are tied to the same `int x()`
>> accessor methods. So we can safely infer which subclass components are managed
>> by superclasses, just by matching up their names and types.
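
Written out by hand in today's Java, the same hierarchy looks roughly as follows. The names `PlainPoint` and `PlainPoint3d` are hypothetical, and the `Object` methods are left out to keep the sketch short.

```java
// Hypothetical hand-written counterpart of the Point carrier class.
class PlainPoint {
    private final int x;
    private final int y;

    PlainPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }
}

// Hypothetical hand-written counterpart of Point3d.
class PlainPoint3d extends PlainPoint {
    private final int z;

    PlainPoint3d(int x, int y, int z) {
        super(x, y);   // the call that could be derived by matching x and y by name and type
        this.z = z;
    }

    // x() and y() are inherited, so they necessarily mean the same thing as in PlainPoint.
    int z() { return z; }
}
```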

>> In the other carrier-to-carrier extension case, where one or more superclass
>> components are _not_ subsumed by the subclass state description, it is necessary
>> to provide an explicit `super` constructor call in the subclass constructor.

>> A carrier class may also be declared abstract; the main effect of this is that
>> we will not derive `Object` method implementations, instead leaving that for the
>> subclass to do.

>> ### Abstract records

>> This framework also gives us an opportunity to relax one of the restrictions on
>> records: that records can't extend anything other than `java.lang.Record`. We
>> can also allow records to be declared `abstract`, and for records to extend
>> abstract records.

>> Just as with carrier classes that extend other carrier classes, there are two
>> cases: when the component list of the superclass is entirely contained within
>> that of the subclass, and when one or more superclass components are derived
>> from subclass components (or are constant), but are not components of the
>> subclass itself. And just as with carrier classes, the main difference is
>> whether an explicit `super` call is required in the subclass constructor.

>> When a record extends an abstract record, any components of the subclass that
>> are also components of the superclass do not implicitly get component fields in
>> the subclass (because they are already in the superclass), and they inherit the
>> accessor methods from the superclass.

>> ### Records are carriers too

>> With this framework in place, records can now be seen to be "just" carrier
>> classes that are implicitly final, extend `java.lang.Record`, that implicitly
>> have private final component fields for each component, and can have no other
>> fields.

>> ## Migration compatibility

>> There will surely be some existing classes that would like to become carrier
>> classes. This is a compatible migration as long as none of the mandated members
>> conflict with existing members of the class, and the class adheres to the
>> requirement that the state description is a complete, canonical, and nominal
>> description of the object state.

>> ### Compatible evolution of records and carrier classes

>> To date, libraries have been reluctant to use records in public APIs because
>> of the difficulty of evolving them compatibly. For a record:

>> ```
>> record R(A a, B b) { }
>> ```

>> that wants to evolve by adding new components:

>> ```
>> record R(A a, B b, C c, D d) { }
>> ```

>> we have several compatibility challenges to manage. As long as we are only
>> adding and not removing/renaming, accessor method invocations will continue to
>> work. And existing constructor invocations can be allowed to continue to work by
>> explicitly adding back a constructor that has the old shape:

>> ```
>> record R(A a, B b, C c, D d) {

>>     // Explicit constructor for old shape required
>>     public R(A a, B b) {
>>         this(a, b, DEFAULT_C, DEFAULT_D);
>>     }

>> }
>> ```

>> But, what can we do about existing uses of record _patterns_? While the
>> translation of record patterns would make adding components binary-compatible,
>> it would not be source-compatible, and there is no way to explicitly add a
>> deconstruction pattern for the old shape as we did with the constructor.

>> We can take advantage of the simplification offered by there being _only_ the
>> canonical deconstruction pattern, and allow uses of deconstruction patterns to
>> supply nested patterns for any _prefix_ of the component list. So for the
>> evolved record R:

>>     case R(P1, P2)

>> would be interpreted as:

>>     case R(P1, P2, _, _)

>> where `_` is the match-all pattern. This means that one can compatibly evolve a
>> record by only adding new components at the end, and adding a suitable
>> constructor for compatibility with existing constructor invocations.
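
A runnable sketch of the evolved record as it has to be used today (Java 21 or later). The concrete component types (`String`, `int`), the default values and the client class `OldClient` are assumptions made for illustration; the text above leaves `A`, `B`, `C` and `D` abstract.

```java
// Evolved record: c and d were added at the end of the component list.
record R(String a, int b, String c, int d) {
    static final String DEFAULT_C = "";
    static final int DEFAULT_D = 0;

    // Old-shape constructor, kept so that existing `new R(a, b)` call sites still compile.
    R(String a, int b) {
        this(a, b, DEFAULT_C, DEFAULT_D);
    }
}

final class OldClient {
    // Originally written against the two-component R. Today a record pattern must list
    // every component, so the client has to be edited to mention c and d as well.
    static String describe(Object o) {
        if (o instanceof R(var a, var b, var c, var d)) {
            return a + ":" + b;
        }
        return "?";
    }
}
```

Under the prefix rule described above, the client could have kept its original `R(var a, var b)` pattern, which would be read as the four-component pattern with match-all patterns for `c` and `d`.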