Data Oriented Programming, Beyond Records

Wed Jan 21 22:21:53 UTC 2026

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Viktor Klang" <viktor.klang at oracle.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, January 21, 2026 8:54:40 PM
> Subject: Re: Data Oriented Programming, Beyond Records

>>> But also, you pay a big complexity tax when the new concept is almost like but
>>> can't quite fully meet up with the old concept; it again means that refactoring
>>> from record <--> carrier comes with a significant sharp edge, and this is a big
>>> warning signal. So we would need a much stronger reason than "hmm, kind of like
>>> it better this way" to choose this divergent path. So far I'm not seeing it?
>> There is a big sharp edge between a record and a class which is due to
>> mutability, this is true inside the class but also outside, the user code when
>> something is mutable or not is quite different. So refactoring from a mutable
>> class to an unmodifiable record is not a battle we should be interested in.

> Just because carrier class state _can_ be mutable, doesn't mean it _must_ be. So
> you're skipping over the interesting case, which is:

> record R(int x, ...) { }

> and

> final class R(int x, ...) { private final component int x; ... } // not
> equivalent to above!

> In your model, the class version of R is painful to write, because you have to
> write equals, hashCode, and toString that delegate to each of the components.
If you want to model something mutable, you have to maintain the invariants; if you want some fields to be components, you have to write equals/hashCode/toString.
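
To make the boilerplate concrete, here is a minimal sketch in today's Java (not the proposed carrier syntax). The names PointRecord, PointClass and EqualityBoilerplateDemo are made up for illustration; the point is only that the class has to spell out what the record derives.

```java
import java.util.Objects;

// Record: equals/hashCode/toString are derived from the components.
record PointRecord(int x, int y) { }

// Hand-written final class with the same state: everything is spelled out.
final class PointClass {
    private final int x;
    private final int y;

    PointClass(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // Each of these three methods has to mention every component by hand.
    @Override
    public boolean equals(Object o) {
        return o instanceof PointClass other && x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    @Override
    public String toString() {
        return "PointClass[x=" + x + ", y=" + y + "]";
    }
}

class EqualityBoilerplateDemo {
    public static void main(String[] args) {
        System.out.println(new PointRecord(1, 2).equals(new PointRecord(1, 2))); // true
        System.out.println(new PointClass(1, 2).equals(new PointClass(1, 2)));   // true
        System.out.println(new PointRecord(1, 2)); // PointRecord[x=1, y=2]
    }
}
```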

We can provide a more declarative syntax: 
- for the former, the syntax can declare the preconditions and, for each field, how to do the defensive copy
- for the latter, the syntax can declare which fields are part of the equality dance, so equals/hashCode/toString can be derived.
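
For the former case, this is roughly what has to be written by hand today, and what a declarative syntax would be replacing. It is only a sketch in current Java; Team, its fields and TeamDemo are hypothetical names, and the particular preconditions are chosen just for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// A mutable class that maintains its invariants by hand.
final class Team {
    private String name;
    private final List<String> members;

    Team(String name, List<String> members) {
        setName(name);
        // Defensive copy on the way in; List.copyOf also rejects null elements.
        this.members = new ArrayList<>(List.copyOf(members));
    }

    void setName(String name) {
        // Precondition: the name must be present and not blank.
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        this.name = name;
    }

    String name() { return name; }

    void addMember(String member) {
        members.add(Objects.requireNonNull(member));
    }

    // Defensive copy on the way out: callers get an unmodifiable snapshot.
    List<String> members() {
        return List.copyOf(members);
    }
}

class TeamDemo {
    public static void main(String[] args) {
        List<String> source = new ArrayList<>(List.of("alice"));
        Team team = new Team("core", source);
        source.add("bob");                  // does not leak into the team
        System.out.println(team.members()); // [alice]
    }
}
```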

A record is easier to write because it's a product type, so it's unmodifiable *and* you can derive equals/hashCode/toString; that's the sweet spot.

> But that's not even the main point; it is that while there is no theoretical
> distinction between a record and a final class all of whose components are
> backed by final component fields, there is a big and hard-to-explain
> discontinuity when you start from either of those and try to refactor to the
> other.
You are focusing on the gap between an unmodifiable class and a record. Even if we fill that gap by adding the "component" feature,
the gap between modifiable and unmodifiable will still exist. And for me, I do not see why one gap is more important than the other.

> But you are still not justifying your preference; WHY is identity-based equality
> the *obviously right* choice for carriers? Be semantic please! Tell me what you
> think a carrier *means*.
For an enum, the semantics of equals() has to be ==, so a carrier enum should use the identity-based semantics.
For a data class, the semantics of equals() is likely not to be ==, so a carrier data class has to override equals/hashCode.

Basically, a carrier class does not take a side on what the semantics of equals should be, both are fine depending on the use case. 
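
A small sketch of the two cases in today's Java, with no carrier syntax involved. Color, Money and EqualsSemanticsDemo are hypothetical names used only for this example.

```java
import java.util.Objects;

// Enum: Enum.equals is final and identity-based, so equals() agrees with ==.
enum Color { RED, GREEN }

// Data class: equality is defined by the state, so equals/hashCode are overridden.
final class Money {
    private final long cents;

    Money(long cents) { this.cents = cents; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Money other && cents == other.cents;
    }

    @Override
    public int hashCode() {
        return Objects.hash(cents);
    }
}

class EqualsSemanticsDemo {
    public static void main(String[] args) {
        Money a = new Money(500);
        Money b = new Money(500);
        System.out.println(Color.RED.equals(Color.RED) == (Color.RED == Color.RED)); // true
        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same state
    }
}
```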

regards, 
Rémi 