Data Oriented Programming, Beyond Records

Thu Jan 22 08:58:45 UTC 2026

I still think that reducing identity-based equality to == would be too
short-sighted. I understand wanting it to be the default, but saying that
"data identity is the object identity, and if you want anything beyond
that, go write everything from scratch" would be a rather harsh answer to
the demand for key/identity-based equality for data. More often than not,
especially for persistent data, the identity of the object is part of the
object's state, often exposed through accessors, which basically makes it a
component (external commitment from the accessor, internal from the field).
Providing a mechanism to state that the object's identity is state-based
would, in my opinion, be a good way to get around the tension and would
ease the pressure of choosing the defaults.
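
To make the state-based identity idea concrete, here is a minimal sketch in
today's Java, written by hand because no such declarative mechanism exists
yet. The Customer class, its id key and its name payload are hypothetical
names chosen for illustration; only the key takes part in equals/hashCode.

```java
// Hypothetical persistent entity: the `id` field is the state-based identity.
final class Customer {
    private final long id;   // key: the only state used by equals/hashCode
    private String name;     // mutable payload: deliberately excluded from equality

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }

    // Accessors expose both the key and the payload.
    long id() { return id; }
    String name() { return name; }

    void rename(String newName) { this.name = newName; }

    @Override
    public boolean equals(Object o) {
        // Two Customer objects denote the same data when their keys match.
        return o instanceof Customer other && id == other.id;
    }

    @Override
    public int hashCode() { return Long.hashCode(id); }

    @Override
    public String toString() { return "Customer[id=" + id + ", name=" + name + "]"; }
}
```

With this, two distinct objects loaded for the same key are equal under
.equals while remaining different under ==, and renaming one of them does
not change its hash code.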

I know that what I have just described is basically a value class (except
that a value class doesn't have an identity; its == equality mimics one by
comparing state, virtually saying that the state IS the identity, so two
value objects that are equal by state are *identical*). It stumbles into
basically the same issue, though: == compares the full state, not just
part of it (even though values are immutable, so mutability-related issues
are simply not observed). And the elephant in the room: for carriers we
are talking about .equals, not ==, so we are talking about something like
data identity, not object identity.
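
The full-state issue can be sketched with a plain record standing in for
any type whose equality is derived from all of its components. Account, its
components and the sameIdentityAs helper are hypothetical names used only
for this illustration.

```java
// Hypothetical record: the derived equals() compares ALL components.
record Account(long id, String owner, long balanceCents) {

    // Hypothetical helper: a "data identity" comparison restricted to the key.
    boolean sameIdentityAs(Account other) {
        return other != null && id == other.id;
    }
}

final class AccountDemo {
    // Compares two snapshots of the same account taken at different times.
    static String describe() {
        Account before = new Account(7L, "Ada", 1_000L);
        Account after = new Account(7L, "Ada", 2_500L);
        return "equals=" + before.equals(after)
                + ", sameIdentity=" + before.sameIdentityAs(after);
    }
}
```

Both snapshots describe the same account, yet full-state equality reports
them as different; only the key-based comparison captures that they are the
same piece of data.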

So I believe the actual solution would be to explore the direction of
acknowledging and embracing the concept of data identity as separate from
object identity and equality. That said, data identity can be a good default
for data equality.

On Thu, Jan 22, 2026 at 12:22 AM <forax at univ-mlv.fr> wrote:

>
>
> ------------------------------
>
> *From: *"Brian Goetz" <brian.goetz at oracle.com>
> *To: *"Remi Forax" <forax at univ-mlv.fr>
> *Cc: *"Viktor Klang" <viktor.klang at oracle.com>, "amber-spec-experts" <
> amber-spec-experts at openjdk.java.net>
> *Sent: *Wednesday, January 21, 2026 8:54:40 PM
> *Subject: *Re: Data Oriented Programming, Beyond Records
>
> But also, you pay a big complexity tax when the new concept is almost like
> but can't quite fully meet up with the old concept; it again means that
> refactoring from record <--> carrier comes with a significant sharp edge,
> and this is a big warning signal.  So we would need a much stronger reason
> than "hmm, kind of like it better this way" to choose this divergent path.
> So far I'm not seeing it?
>
>
> There is a big sharp edge between a record and a class which is due to
> mutability. This is true inside the class but also outside: user code
> is quite different depending on whether something is mutable or not. So
> refactoring from a mutable class to an unmodifiable record is not a
> battle we should be interested in.
>
>
> Just because carrier class state _can_ be mutable, doesn't mean it _must_
> be.  So you're skipping over the interesting case, which is:
>
>     record R(int x, ...) { }
>
> and
>
>     final class R(int x, ...) { private final component int x; ... }  // not equivalent to above!
>
> In your model, the class version of R is painful to write, because you
> have to write equals, hashCode, and toString that delegate to each of the
> components.
>
>
> If you want to model something mutable, you have to maintain the
> invariants; if you want some fields to be components, you have to
> write equals/hashCode/toString.
>
> We can provide a more declarative syntax:
> - for the former, the syntax can declare the preconditions and, for each
> field, how to do the defensive copy
> - for the latter, the syntax can declare which fields are part of the
> equality dance so equals/hashCode/toString can be derived.
>
> A record is easier to write because it's a product type, so it's unmodifiable
> *and* you can derive equals/hashCode/toString; that's the sweet spot.
>
> But that's not even the main point; it is that while there is no
> theoretical distinction between a record and a final class all of whose
> components are backed by final component fields, there is a big and
> hard-to-explain discontinuity when you start from either of those and try
> to refactor to the other.
>
>
> You are focusing on the gap between an unmodifiable class and a record.
> Even if we fill that gap by adding the "component" feature,
> the gap between modifiable and unmodifiable will still exist. And for me,
> I do not see why one gap is more important than the other.
>
>
>
> But you are still not justifying your preference; WHY is identity-based
> equality the *obviously right* choice for carriers?  Be semantic please!
> Tell me what you think a carrier *means*.
>
>
> For an enum, the semantics of equals() has to be ==, so a carrier enum
> should use the identity-based semantics.
> For a data class, the semantics of equals is likely to not be == so a
> carrier data class has to override equals/hashCode.
>
> Basically, a carrier class does not take a side on what the semantics of
> equals should be, both are fine depending on the use case.
>
> regards,
> Rémi
>
>