Data Oriented Programming, Beyond Records
Brian Goetz
brian.goetz at oracle.com
Wed Feb 25 14:55:59 UTC 2026
We've had some good discussions here, and I've got some updates to the
ideas outlined in the previous snapshot. There's a lot here, including
a lot that I'm identifying as likely future directions but which should
not be the topic of current discussion; I want to focus on the
highest-priority aspects first (even though some of the lower-priority
aspects are surely going to be tempting to discuss.)
As we saw with records, there are two forces operating here, which are
to some degree in competition:
- A desire for stronger semantics. A class commits to almost nothing;
a record commits to quite a lot -- by disavowing the freedom to do
things that are outside the profile of "transparent, shallowly-immutable
data carrier". From these stronger semantics, we can derive features
such as deconstructability (pattern matching), reconstructability
(withers), etc, as well as deriving a number of usually-boilerplate
class members.
- A desire for concise representation, not necessarily because of the
effort of writing that code (IDEs will write it for you), but because
_reading code that adds no value_ obfuscates intent. (As always,
reading code is more important than writing code.)
With records, we managed to get these entirely aligned; we were able to
derive the desirable concise representation entirely from having picked
the right semantics; this is what winning looks like. With the "carrier
classes" suggestion as outlined in the previous snapshot, we are not
quite there yet, and we've likely slid a little too far into the
"unprincipled concision" camp, so we need to make some adjustments.
Additionally, there was something very uncomfortable about the "carrier
classes" proposal; it was a lot harder to look at a class and tell
whether the absence of explicit declarations of certain members
(especially equals and hashCode) meant they didn't exist, or whether it
meant they were derived. The current situation where records get a lot
of derived stuff and no one else does, while inconvenient for all
non-record classes, has more clarity.
In this mail, I outline a slightly different, somewhat more principled
position, that addresses these concerns. Concision fans may be
disappointed in some ways.
## Sizing up the problem
The strong semantic guarantee of records gives us:
- API contracts
- construction protocol (canonical constructor)
- deconstruction protocol (record patterns)
- component access protocol (accessor methods)
- state-based equality / hashCode / toString
- Implementation convenience
- representation (fields)
- canonical constructor (modulo parameter validation and normalization)
- Option to use compact constructor form
- component accessors
- record pattern (derived from accessors)
- implementations of equals, hashCode, toString
- Potential future semantic features
- reconstruction (withers)
- nominal construction and deconstruction
It will not be possible to give all of these benefits to the
almost-record classes, because some derive directly from giving up
flexibility that almost-records don't want to give up, such as the
ability to have an internal representation that differs from the
external API. In these cases, the implementation will have to "connect
the dots" to some degree; our measure of progress will be the degree to
which the required code is proportional to the deviation from the ideal.
To get there, we will take the approach of first solving a smaller but
more coherent problem -- how we get to deconstruction for arbitrary
classes and interfaces -- and then come back as needed with more
targeted tools for chipping away at the declaration overhead.
## Incomplete, canonical, nominal state descriptions
In the previous round, we described the state description of a carrier
class as a "complete, canonical, nominal state description" -- but this
reflected a sort of wishful thinking. What it really was was an
_incomplete_, canonical, nominal state description! Because we cannot
not stop the user from declaring additional state that might affect core
behavior -- and in fact, the whole point is to give the user more
freedom in defining the representation.
Acknowledging this means we gain some clarity but lose some concision.
The gained clarity is that a state description on an interface or class
means something narrower than initially claimed -- that this class has
these specific named components, and that it can be deconstructed with a
canonical deconstruction pattern, whose shape matches that of the state
description. The lost concision derives from lost semantics -- we
really don't have a principled basis for deriving the Object method
implementations in the face of arbitrary representation.
A class with a state description means that it has accessors for each
component, and has a canonical deconstruction pattern -- that's it
Saying that the state description is solely about access to components
(both individually and in bulk) allows us to drop all of the structural
restrictions we might be inclined to impose on such interfaces or
classes -- they can be final or not, extend other classes or not, they
can have whatever constructors they like, carriers can extend
non-carriers and vice versa, etc.
We will call a class or interface that has a state description in its
declaration a _deconstructible class_, and the elements of the state
description _class components_. Records become a restricted form of
deconstructible classes.
The rules about overriding components largely derive from the existing
rules about overriding their corresponding accessor methods; subclasses
can covariantly refine the type of an inherited superclass component,
but cannot have a component of the same name but a (sufficiently)
different type.
## What's left?
So, what's left? Quite a lot, of course; we've addressed the "how do
other classes and interfaces get pattern matching", but nothing else
yet. The "what's left" includes:
- Reconstruction (withers)
- Deriving implementations of accessors
- Compact constructors for deconstructible classes
- Deriving / streamlining object method implementation
We'll take these in turns, though I'm going to label some of these as
"for future discussion" to indicate that they are lower on the priority
list and guide discussions to the higher-priority items.
### Reconstruction
Reconstruction (withers) requires an underlying canonical
constructor-deconstructor pair. Records always have these, so they
always qualify for reconstruction. But we said earlier that we make no
assumptions about the construction protocol of arbitrary deconstructible
classes, so how do they qualify for reconstructibility?
We have long struggled with the question of what aspects of this have to
be explicitly declared vs what aspects can be reasonably inferred by
structure. Given that we've raised deconstructibility not only to a
language feature, but to a prominent place in the class declaration, it
seems reasonable to say that:
- A constructor of a deconstructible class D is canonical if it
matches the state description of D.
- A deconstructible class D is reconstructible by client C if D has a
canonical constructor and that constructor is accessible to C.
That is, given a deconstructible class, which has a canonical state
description, we can structurally recognize when a constructor is
canonical, and derive reconstruction if that constructor is present and
accessible.
It may further be desirable to restrict reconstruction to final classes,
as this reduces the risk of "decapitation", which seems to freak people
out quite a lot when they learn about the risk (I think this is mostly
"unfamiliarity bias", but is a restriction worth considering.)
### Derived fields in records
By far the most common profile of "almost records" is "records that want
to derive some state from their components and cache it." The previous
proposal addressed this through carrier classes; after some evaluation,
I think it is better to handle this within records themselves.
We've recently exposed the "lazy constants" work through an API class,
but the long-term goal has always been to sediment these into the
language eventually. Over there, there are discussions going on about
"cached instance methods" or "lazy instance fields" would allow us to
expose lazily derived, cached state in records without undermining the
record imperative. Of course, this mechanism would be available to
classes other than records as well.
Extending the reach of records takes some pressure off of the use cases
for carrier classes, as more things that are "almost record" can become
real records. So we'll let the work on laziness play out, and see to
what extent it addresses the concerns about "records aren't expressive
enough."
### Nominal invocation
Reconstruction leans on the nominality of components, but people have
been wishing for nominal invocation of constructors and deconstructors
as well for a long time. To the extent we allow for nominal creation
and deconstruction of records, deconstructible classes would be able to
come along for that ride. (But please, let's not discuss this now.)
### Derived accessors and compact constructors
Deconstructible classes get the requirement for accessors for each
component, but as currently stated, get no help in declaring them,
because, unlike with records, the language is unaware of the mapping
between the external API (as defined by the state description) and the
internal representation.
In the previous iteration, we filled in that mapping with a `component`
modifier on fields, which connected those dots, and therefore allowed
derivation of the (often numerous) accessors whose external API _does_
align with the internal representation. (This was a semantic claim: that
the field called `x` and the component called `x` describe the same
thing.) We will come back later to explore whether this concept still
carries its weight in the slimmed-down role for carrier classes.
Having severed the state description from the construction protocol, the
implementation is free to choose its own construction protocol. But
many deconstructible classes are likely to choose a constructor that
matches the state description, and in these cases, at the very least, we
can elide the redundant signature declaration of the canonical
constructor using the "compact constructor" syntax, as we do with
records. We may also further use the `component` fields to streamline
the constructor, but given the slimmed-down role for carrier classes,
this would also have to be reevaluated.
### Object methods
Having dropped the derivation of Object methods -- because we didn't
really have a sufficiently principled basis for doing so -- these
implementations are likely to be either significant sources of
boilerplate, or worse, forgotten about in deconstructible classes.
Our best story for this builds on the currently-dormant "concise method
bodies" JEP, that allows us to delegate method implementations either to
method references or to objects that implement the method, such as:
boolean equals(Object other) __delegates_to <equalator-object>
paired with an API for constructing such objects (which could drive all
of the Object methods, not just `equals`). This is something that would
benefit all classes, not just deconstructible ones. (We will return to
this topic when concise method bodies comes closer to the top of the
priority queue.)
## Summary
What we see here eventually gets to the same place -- suitable classes
can participate in deconstruction, reconstruction, and any future
nominal construction/deconstruction; the most common forms of "almost
records" are absorbed into records; classes that are largely data holder
classes can get more concise expression. Some of these are deferred
into the (possibly infinite) future, but almost all of these are more
broadly applicable than what was outlined in the previous version. And
we reclaim the clarity that comes from records being the locus of
derived members, rather than sprinkling invisible members into other
classes.
On 1/13/2026 4:52 PM, Brian Goetz wrote:
> Here's a snapshot of where my head is at with respect to extending the
> record goodies (including pattern matching) to a broader range of
> classes, deconstructors for classes and interfaces, and compatible
> evolution of records. Hopefully this will unblock quite a few things.
>
> As usual, let's discuss concepts and directions rather than syntax.
>
>
>
> # Data-oriented Programming for Java: Beyond records
>
> Everyone loves records; they allow us to create shallowly immutable
> data holder
> classes -- which we can think of as "nominal tuples" -- derived from a
> concise
> state description, and to destructure records through pattern
> matching. But
> records have strict constraints, and not all data holder classes fit
> into the
> restrictions of records. Maybe they have some mutable state, or
> derived or
> cached state that is not part of the state description, or their
> representation
> and their API do not match up exactly, or they need to break up their
> state
> across a hierarchy. In these classes, even though they may also be “data
> holders”, the user experience is like falling off a cliff. Even a small
> deviation from the record ideal means one has to go back to a blank
> slate and
> write explicit constructor declarations, accessor method declarations, and
> Object method implementations -- and give up on destructuring through
> pattern
> matching.
>
> Since the start of the design process for records, we’ve kept in mind
> the goal
> of enabling a broader range of classes to gain access to the "record
> goodies":
> reduced declaration burden, participating in destructuring, and soon,
> [reconstruction](https://openjdk.org/jeps/468). During the design of
> records, we
> also explored a number of weaker semantic models that would allow for
> greater
> flexibility. While at the time they all failed to live up to the goals
> _for
> records_, there is a weaker set of semantic constraints we can impose that
> allows for more flexibility and still enables the features we want,
> along with
> some degree of syntactic concision that is commensurate with the
> distance from
> the record-ideal, without fall-off-the-cliff behaviors.
>
> Records, sealed classes, and destructuring with record patterns
> constitute the
> first feature arc of "data-oriented programming" for Java. After
> considering
> numerous design ideas, we're now ready to move forward with the next "data
> oriented programming" feature arc: _carrier classes_ (and interfaces.)
>
> ## Beyond record patterns
>
> Record patterns allow a record instance to be destructured into its
> components.
> Record patterns can be used in `instanceof` and `switch`, and when a
> record
> pattern is also exhaustive, will be usable in the upcoming [_pattern
> assignment
> statement_](https://mail.openjdk.org/pipermail/amber-spec-experts/2026-January/004306.html)
> feature.
>
> In exploring the question "how will classes be able to participate in
> the same
> sort of destructuring as records", we had initially focused on a new
> form of
> declaration in a class -- a "deconstructor" -- that operated as a
> constructor in
> reverse. Just as a constructor takes component values and produces an
> aggregate
> instance, a deconstructor would take an aggregate instance and recover its
> component values.
>
> But as this exploration played out, the more interesting question
> turned out to
> be: which classes are suitable for destructuring in the first place?
> And the
> answer to that question led us to a different approach for expressing
> deconstruction. The classes that are suitable for destructuring are
> those that,
> like records, are little more than carriers for a specific tuple of
> data. This
> is not just a thing that a class _has_, like a constructor or method, but
> something a class _is_. And as such, it makes more sense to describe
> deconstruction as a top-level property of a class. This, in turn,
> leads to a
> number of simplifications.
>
> ## The power of the state description
>
> Records are a semantic feature; they are only incidentally concise.
> But they
> _are_ concise; when we declare a record
>
> record Point(int x, int y) { ... }
>
> we automatically get a sensible API (canonical constructor, deconstruction
> pattern, accessor methods for each component) and implementation (fields,
> constructor, accessor methods, Object methods.) We can explicitly
> specify most
> of these (except the fields) if we like, but most of the time we don't
> have to,
> because the default is exactly what we want.
>
> A record is a shallowly-immutable, final class whose API and
> representation are
> _completely defined_ by its _state description_. (The slogan for
> records is
> "the state, the whole state, and nothing but the state.") The state
> description
> is the ordered list of _record components_ declared in the record's
> header. A
> component is more than a mere field or accessor method; it is an API
> element on
> its own, describing a state element that instances of the class have.
>
> The state description of a record has several desirable properties:
>
> - The components in the order specified, are the _canonical_
> description of the
> record's state.
> - The components are the _complete_ description of the record’s state.
> - The components are _nominal_; their names are a committed part of the
> record's API.
>
> Records derive their benefits from making two commitments:
>
> - The _external_ commitment that the data-access API of a record
> (constructor,
> deconstruction pattern, and component accessor methods) is defined
> by the
> state description.
> - The _internal_ commitments that the _representation_ of the record (its
> fields) is also completely defined by the state description.
>
> These semantic properties are what enable us to derive almost
> everything about
> records. We can derive the API of the canonical constructor because
> the state
> description is canonical. We can derive the API for the component
> accessor
> methods because the state description is nominal. And we can derive a
> deconstruction pattern from the accessor methods because the state
> description
> is complete (along with sensible implementations for the state-related
> `Object`
> methods.)
>
> The internal commitment that the state description is also the
> representation
> allows us to completely derive the rest of the implementation. Records
> get a
> (private, final) field for each component, but more importantly, there
> is a
> clear mapping between these fields and their corresponding components,
> which is
> what allows us to derive the canonical constructor and accessor method
> implementations.
>
> Records can additionally declare a _compact constructor_ that allows
> us to elide
> the boilerplate aspects of record constructors -- the argument list
> and field
> assignments -- and just specify the code that is _not_ mechanically
> derivable.
> This is more concise, less error-prone, and easier to read:
>
> record Rational(int num, int denom) {
> Rational {
> if (denom == 0)
> throw new IllegalArgumentException("denominator cannot
> be zero");
> }
> }
>
> is shorthand for the more explicit
>
> record Rational(int num, int denom) {
> Rational(int num, int denom) {
> if (denom == 0)
> throw new IllegalArgumentException("denominator cannot
> be zero");
> this.num = num;
> this.denom = denom;
> }
> }
>
> While compact constructors are pleasantly concise, the more important
> benefit is
> that by eliminating the mechanically derivable code, the "more
> interesting" code
> comes to the fore.
>
> Looking ahead, the state description is a gift that keeps on giving.
> These
> semantic commitments are enablers for a number of potential future
> language and
> library features for managing object lifecycle, such as:
>
> - [Reconstruction](https://openjdk.org/jeps/468) of record instances,
> allowing
> the appearance of controlled mutation of record state.
> - Automatic marshalling and unmarshalling of record instances.
> - Instantiating or destructuring record instances identifying components
> nominally rather than positionally.
>
> ### Reconstruction
>
> JEP 468 proposes a mechanism by which a new record instance can be
> derived from
> an existing one using syntax that is evocative of direct mutation, via
> a `with`
> expression:
>
> record Complex(double re, double im) { }
> Complex c = ...
> Complex cConjugate = c with { im = -im; };
>
> The block on the right side of `with` can contain any Java statements,
> not just
> assignments. It is enhanced with mutable variables (_component
> variables_) for
> each component of the record, initialized to the value of that
> component in the
> record instance on the left, the block is executed, and a new record
> instance is
> created whose component values are the ending values of the component
> variables.
>
> A reconstruction expression implicitly destructures the record
> instance using
> the canonical deconstruction pattern, executes the block in a scope
> enhanced
> with the component variables, and then creates a new record using the
> canonical
> constructor. Invariant checking is centralized in the canonical
> constructor, so
> if the new state is not valid, the reconstruction will fail. JEP 468
> has been
> "on hold" for a while, primarily because we were waiting for sufficient
> confidence that there was a path to extending it to suitable classes
> before
> committing to it for records. The ideal path would be for those
> classes to also
> support a notion of canonical constructor and deconstruction pattern.
>
> Careful readers will note a similarity between the transformation
> block of a
> `with` expression and the body of a compact constructor. In both
> cases, the
> block is "preloaded" with a set of component variables, initialized to
> suitable
> starting values, the block can mutate those variables as desired, and upon
> normal completion of the block, those variables are passed to a canonical
> constructor to produce the final result. The main difference is where the
> starting values come from; for a compact constructor, it is from the
> constructor
> parameters, and for a reconstruction expression, it is from the canonical
> deconstruction pattern of the source record to the left of `with`.
>
> ### Breaking down the cliff
>
> Records make a strong semantic commitment to derive both their API and
> representation from the state description, and in return get a lot of
> help from
> the language. We can now turn our attention to smoothing out "the
> cliff" --
> identifying weaker semantic commitments that classes can make that
> would still
> allow classes to get _some_ help from the language. And ideally, the
> amount of
> help you give up would be proportional to the degree of deviation from the
> record ideal.
>
> With records, we got a lot of mileage out of having a complete, canonical,
> nominal state description. Where the record contract is sometimes too
> constraining is the _implementation_ contract that the representation
> aligns
> exactly with the state description, that the class is final, that the
> fields are
> final, and that the class may not extend anything but `Record`.
>
> Our path here takes one step back and one step forward: keeping the
> external
> commitment to the state description, but dropping the internal
> commitment that
> the state description _is_ the representation -- and then _adding
> back_ a simple
> mechanism for mapping fields representing components back to their
> corresponding
> components, where practical. (With records, because we derive the
> representation from the state description, this mapping can be safely
> inferred.)
>
> As a thought experiment, imagine a class that makes the external
> commitment to a
> state description -- that the state description is a complete, canonical,
> nominal description of its state -- but is on its own to provide its
> representation. What can we do for such a class? Quite a bit,
> actually. For
> all the same reasons we can for records, we can derive the API
> requirement for a
> canonical constructor and component accessor methods. From there, we
> can derive
> both the requirement for a canonical deconstruction pattern, and also the
> implementation of the deconstruction pattern (as it is implemented in
> terms of
> the accessor methods). And since the state description is complete, we can
> further derive sensible default implementations of the Object methods
> `equals`,
> `hashCode`, and `toString` in terms of the accessor methods as well.
> And given
> that there is a canonical constructor and deconstruction pattern, it
> can also
> participate in reconstruction. The author would just have to provide the
> fields, accessor methods, and canonical constructor. This is good
> progress, but
> we'd like to do better.
>
> What enables us to derive the rest of the implementation for records
> (fields,
> constructor, accessor methods, and Object methods) is the knowledge of
> how the
> representation maps to the state description. Records commit to their
> state
> description _being_ the representation, so is is a short leap from
> there to a
> complete implementation.
>
> To make this more concrete, let's look at a typical "almost record"
> class, a
> carrier for the state description `(int x, int y, Optional<String> s)`
> but which
> has made the representation choice to internally store `s` as a nullable
> `String`.
>
> ```
> class AlmostRecord {
> private final int x;
> private final int y;
> private final String s; // *
>
> public AlmostRecord(int x, int y, Optional<String> s) {
> this.x = x;
> this.y = y;
> this.s = s.orElse(null); // *
> }
>
> public int x() { return x; }
> public int y() { return y; }
> public Optional<String> s() {
> return Optional.ofNullable(s); // *
> }
>
> public boolean equals(Object other) { ... } // derived from
> x(), y(), s()
> public int hashCode() { ... } // "
> public String toString() { ... } // "
> }
> ```
>
> The main differences between this class and the expansion of its
> record analogue
> are the lines marked with a `*`; these are the ones that deal with the
> disparity
> between the state description and the actual representation. It would
> be nice
> if the author of this class _only_ had to write the code that was
> different from
> what we could derive for a record; not only would this be pleasantly
> concise,
> but it would mean that all the code that _is_ there exists to capture the
> differences between its representation and its API.
>
> ## Carrier classes
>
> A _carrier class_ is a normal class declared with a state
> description. As with
> a record, the state description is a complete, canonical, nominal
> description of
> the class's state. In return, the language derives the same API
> constraints as
> it does for records: canonical constructor, canonical deconstruction
> pattern,
> and component accessor methods.
>
> class Point(int x, int y) { // class, not record!
> // explicitly declared representation
>
> ...
>
> // must have a constructor taking (int x, int y)
> // must have accessors for x and y
> // supports a deconstruction pattern yielding (int x, int y)
> }
>
> Unlike a record, the language makes no assumptions about the object's
> representation; the class author has to declare that just as with any
> other
> class.
>
> Saying the state description is "complete" means that it carries all the
> “important” state of the class -- if we were to extract this state and
> recreate
> the object, that should yield an “equivalent” instance. As with
> records, this
> can be captured by tying together the behavior of construction,
> accessors, and
> equality:
>
> ```
> Point p = ...
> Point q = new Point(p.x(), p.y());
> assert p.equals(q);
> ```
>
> We can also derive _some_ implementation from the information we have
> so far; we
> can derive sensible implementations of the `Object` methods
> (implemented in terms
> of component accessor methods) and we can derive the canonical
> deconstruction
> pattern (again in terms of the component accessor methods). And from
> there, we
> can derive support for reconstruction (`with` expressions.)
> Unfortunately, we
> cannot (yet) derive the bulk of the state-related implementation: the
> canonical
> constructor and component accessor methods.
>
> ### Component fields and accessor methods
>
> One of the most tedious aspects of data-holder classes is the accessor
> methods;
> there are often many of them, and they are almost always pure
> boilerplate. Even
> though IDEs can reduce the writing burden by generating these for us,
> readers
> still have to slog through a lot of low-information code -- just to
> learn that
> they didn't actually need to slog through that code after all. We can
> derive
> the implementation of accessor methods for records because records
> make the
> internal commitment that the components are all backed with individual
> fields
> whose name and type align with the state description.
>
> For a carrier class, we don't know whether _any_ of the components are
> directly
> backed by a single field that aligns to the name or type of the
> component. But
> it is a pretty good bet that many carrier class components will do
> exactly this
> for at least _some_ of their fields. If we can tell the language that
> this
> correspondence is not merely accidental, the language can do more for us.
>
> We do so by allowing suitable fields of a carrier class to be declared as
> `component` fields. (As usual at this stage, syntax is provisional,
> but not
> currently a topic for discussion.) A component field must have the
> same name
> and type as a component of the current class (though it need not be
> `private` or
> `final`, as record fields are.) This signals that this field _is_ the
> representation for the corresponding component, and hence we can
> derive the
> accessor method for this component as well.
>
> ```
> class Point(int x, int y) {
> private /* mutable */ component int x;
> private /* mutable */ component int y;
>
> // must have a canonical constructor, but (so far) must be explicit
> public Point(int x, int y) {
> this.x = x;
> this.y = y;
> }
>
> // derived implementations of accessors for x and y
> // derived implementations of equals, hashCode, toString
> }
> ```
>
> This is getting better; the class author had to bring the
> representation and the
> mapping from representation to components (in the form of the `component`
> modifier), and the canonical constructor.
>
> ### Compact constructors
>
> Just as we are able to derive the accessor method implementation if we are
> given an explicit correspondence between a field and a component, we
> can do the
> same for constructors. For this, we build on the notion of _compact
> constructors_ that was introduced for records.
>
> As with a record, a compact constructor in a carrier class is a
> shorthand for a
> canonical constructor, which has the same shape as the state
> description, but
> which is freed of the responsibility of actually committing the ending
> value of
> the component parameters to the fields. The main difference is that for a
> record, _all_ of the components are backed by a component field,
> whereas for a
> carrier class, only some of them might be. But we can generalize compact
> constructors by freeing the author of the responsibility to initialize the
> _component_ fields, while leaving them responsible for initializing
> the rest of
> the fields. In the limiting case where all components are backed by
> component
> fields, and there is no other logic desired in the constructor, the
> compact
> constructor may be elided.
>
> For our mutable `Point` class, this means we can elide nearly
> everything, except
> the field declarations themselves:
>
> ```
> class Point(int x, int y) {
> private /* mutable */ component int x;
> private /* mutable */ component int y;
>
> // derived compact constructor
> // derived accessors for x, y
> // derived implementations of equals, hashCode, toString
> }
> ```
>
> We can think of this class as having an implicit empty compact
> constructor,
> which in turn means that the component fields `x` and `y` are
> initialized from
> their corresponding constructor parameters. There are also implicitly
> derived
> accessor methods for each component, and implementations of `Object`
> methods
> based on the state description.
>
> This is great for a class where all the components are backed by
> fields, but
> what about our `AlmostRecord` class? The story here is good as well;
> we can
> derive the accessor methods for the components backed by component
> fields, and
> we can elide the initialization of the component fields from the compact
> constructor, meaning that we _only_ have to specify the code for the
> parts that
> deviate from the "record ideal":
>
> ```
> class AlmostRecord(int x,
> int y,
> Optional<String> s) {
>
> private final component int x;
> private final component int y;
> private final String s;
>
> public AlmostRecord {
> this.s = s.orElse(null);
> // x and y fields implicitly initialized
> }
>
> public Optional<String> s() {
> return Optional.ofNullable(s);
> }
>
> // derived implementation of x and y accessors
> // derived implementation of equals, hashCode, toString
> }
> ```
>
> Because so many real-world almost-records differ from their record
> ideal in
> minor ways, we expect to get a significant concision benefit for most
> carrier
> classes, as we did for `AlmostRecord`. As with records, if we want to
> explicitly implement the constructor, accessor methods, or `Object`
> methods, we
> are still free to do so.
>
> ### Derived state
>
> One of the most frequent complaints about records is the inability to
> derive
> state from the components and cache it for fast retrieval. With carrier
> classes, this is simple: declare a non-component field for the derived
> quantity,
> initialize it in the constructor, and provide an accessor:
>
> ```
> class Point(int x, int y) {
> private final component int x;
> private final component int y;
> private final double norm;
>
> Point {
> norm = Math.hypot(x, y);
> }
>
> public double norm() { return norm; }
>
> // derived implementation of x and y accessors
> // derived implementation of equals, hashCode, toString
> }
> ```
>
> ### Deconstruction and reconstruction
>
> Like records, carrier classes automatically acquire deconstruction
> patterns that
> match the canonical constructor, so we can destructure our `Point`
> class as if
> it were a record:
>
> case Point(var x, var y):
>
> Because reconstruction (`with`) derives from a canonical constructor and
> corresponding deconstruction pattern, when we support reconstruction
> of records,
> we will also be able to do so for carrier classes:
>
> point = point with { x = 3; }
>
> ## Carrier interfaces
>
> A state description makes sense on interfaces as well. It makes the
> statement
> that the state description is a complete, canonical, nominal
> description of the
> interface's state (subclasses are allowed to add additional state), and
> accordingly, implementations must provide accessor methods for the
> components.
> This enables such interfaces to participate in pattern matching:
>
> ```
> interface Pair<T,U>(T first, U second) {
> // implicit abstract accessors for first() and second()
> }
>
> ...
>
> if (o instanceof Pair(var a, var b)) { ... }
> ```
>
> Along with the upcoming feature for pattern assignment in foreach-loop
> headers,
> if `Map.Entry` became a carrier interface (which it will), we would be
> able to
> iterate a `Map` like:
>
> for (Map.Entry(var key, var val) : map.entrySet()) { ... }
>
> It is a common pattern in libraries to export an interface that is
> sealed to a
> single private implementation. In this pattern, the interface and
> implementation can share a common state description:
>
> ```
> public sealed interface Pair<T,U>(T first, U second) { }
>
> private record PairImpl<T, U>(T first, U second) implements Pair<T, U> { }
> ```
>
> Compared to the old way of doing this, we get enhanced semantics,
> better type
> checking, and more concision.
>
> ### Extension
>
> The main obligation of a carrier class author is to ensure that the
> fundamental
> claim -- that the state description is a complete, canonical, nominal
> description of the object's state -- is actually true. This does not
> rule out
> having the representation of a carrier class spread out over a
> hierarchy, so
> unlike records, carrier classes are not required to be final or
> concrete, nor
> are they restricted in their extension.
>
> There are several cases that arise when carrier classes can participate in
> extension:
>
> - A carrier class extends a non-carrier class;
> - A non-carrier class extends a carrier class;
> - A carrier class extends another carrier class, where all of the
> superclass
> components are subsumed by the subclass state description;
> - A carrier class extends another carrier class, but there are one or
> more
> superclass components that are not subsumed by the subclass state
> description.
>
> Extending a non-carrier class with a carrier class will usually be
> motiviated by
> the desire to "wrap" a state description around an existing hierarchy
> which we
> cannot or do not want to modify directly, but we wish to gain the
> benefits of
> deconstruction and reconstruction. Such an implementation would have
> to ensure
> that the class actually conforms to the state description, and that the
> canonical constructor and component accessors are implemented.
>
> When one carrier class extends another, the more straightforward case
> is that it
> simply adds new components to the state description of the
> superclass. For
> example, given our `Point` class:
>
> ```
> class Point(int x, int y) {
> component int x;
> component int y;
>
> // everything else for free!
> }
> ```
>
> we can use this as the base class for a 3d point class:
>
> ```
> class Point3d(int x, int y, int z) extends Point {
> component int z;
>
> Point3d {
> super(x, y);
> }
> }
> ```
>
> In this case -- because the superclass components are all part of the
> subclass
> state description -- we can actually omit the constructor as well,
> because we
> can derive the association between subclass components and superclass
> components, and thereby derive the needed super-constructor
> invocation. So we
> could actually write:
>
> ```
> class Point3d(int x, int y, int z) extends Point {
> component int z;
>
> // everything else for free!
> }
> ```
>
> One might think that we would need some marking on the `x` and `y`
> components of
> `Point3d` to indicate that they map to the corresponding components of
> `Point`,
> as we did for associating component fields with their corresponding
> components.
> But in this case, we need no such marking, because there is no way
> that an `int
> x` component of `Point` and an `int x` component of its subclass could
> possibly
> refer to different things -- since they both are tied to the same `int
> x()`
> accessor methods. So we can safely infer which subclass components
> are managed
> by superclasses, just by matching up their names and types.
>
> In the other carrier-to-carrier extension case, where one or more
> superclass
> components are _not_ subsumed by the subclass state description, it is
> necessary
> to provide an explicit `super` constructor call in the subclass
> constructor.
>
> A carrier class may be also declared abstract; the main effect of this
> is that
> we will not derive `Object` method implementations, instead leaving
> that for the
> subclass to do.
>
> ### Abstract records
>
> This framework also gives us an opportunity to relax one of the
> restrictions on
> records: that records can't extend anything other than
> `java.lang.Record`. We
> can also allow records to be declared `abstract`, and for records to
> extend
> abstract records.
>
> Just as with carrier classes that extend other carrier classes, there
> are two
> cases: when the component list of the superclass is entirely contained
> within
> that of the subclass, and when one or more superclass components are
> derived
> from subclass components (or are constant), but are not components of the
> subclass itself. And just as with carrier classes, the main difference is
> whether an explicit `super` call is required in the subclass constructor.
>
> When a record extends an abstract record, any components of the
> subclass that
> are also components of the superclass do not implicitly get component
> fields in
> the subclass (because they are already in the superclass), and they
> inherit the
> accessor methods from the superclass.
>
> ### Records are carriers too
>
> With this framework in place, records can now be seen to be "just" carrier
> classes that are implicitly final, extend `java.lang.Record`, that
> implicitly
> have private final component fields for each component, and can have
> no other
> fields.
>
> ## Migration compatibility
>
> There will surely be some existing classes that would like to become
> carrier
> classes. This is a compatible migration as long as none of the
> mandated members
> conflict with existing members of the class, and the class adheres to the
> requirement that the state description is a complete, canonical, and
> nominal
> description of the object state.
>
> ### Compatible evolution of records and carrier classes
>
> To date, libraries have been reluctant to use records in public APIs
> because
> of the difficulty of evolving them compatibly. For a record:
>
> ```
> record R(A a, B b) { }
> ```
>
> that wants to evolve by adding new components:
>
> ```
> record R(A a, B b, C c, D d) { }
> ```
>
> we have several compatibility challenges to manage. As long as we are
> only
> adding and not removing/renaming, accessor method invocations will
> continue to
> work. And existing constructor invocations can be allowed to continue
> work by
> explicitly adding back a constructor that has the old shape:
>
> ```
> record R(A a, B b, C c, D d) {
>
> // Explicit constructor for old shape required
> public R(A a, B b) {
> this(a, b, DEFAULT_C, DEFAULT_D);
> }
>
> }
> ```
>
> But, what can we do about existing uses of record _patterns_? While the
> translation of record patterns would make adding components
> binary-compatible,
> it would not be source-compatible, and there is no way to explicitly add a
> deconstruction pattern for the old shape as we did with the constructor.
>
> We can take advantage of the simplification offered by there being
> _only_ the
> canonical deconstruction pattern, and allow uses of deconstruction
> patterns to
> supply nested patterns for any _prefix_ of the component list. So for the
> evolved record R:
>
> case R(P1, P2)
>
> would be interpreted as:
>
> case R(P1, P2, _, _)
>
> where `_` is the match-all pattern. This means that one can
> compatibly evolve a
> record by only adding new components at the end, and adding a suitable
> constructor for compatibility with existing constructor invocations.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-experts/attachments/20260225/bb1d4039/attachment-0001.htm>
More information about the amber-spec-experts
mailing list