Data Oriented Programming, Beyond Records
Brian Goetz
brian.goetz at oracle.com
Sat Jan 17 16:46:11 UTC 2026
> In my opinion, providing a way to automatically generate
> equals/hashCode and toString() for a mutable class is just a giant
> footgun.
This is actually one of the fundamental design questions here, so I'm
glad you brought it up. (But I will point out that the word "footgun"
is not a magic wand; the claim that "there is risk" does not, in itself,
mean the approach is flawed. Very often there is risk in both
directions, and we have to choose the lesser.)
Shall I assume that, modulo the handling of carriers with mutable
fields, you agree with the rest? I would be happy to have only one
topic to discuss.
> With a mutable class with equals/hashCode/toString generated, it's too
> easy to store an object in a collection, mutate it, and then never
> been able to find it again.
Yes, but also: everyone here knows about this risk. You don't need to
belabor the example :)
This is a reflection of a problem we already have: equals is a semantic
part of the type's definition, about when two instances represent the
"same" value, and mutability is part of the type's definition, but
"whether you put it in a hash-based collection and then mutate it" is
about _how the instances are used by clients_.
While immutability is a good default, it's not always _wrong_ to use
mutability; it's just riskier. And for a mutable class, state-based
equality is _still_ a sensible possible implementation of equality; it's
just riskier. And putting mutable objects in hash-based collections is
also not wrong; it's just riskier. For the bad thing to happen, all of
these have to happen _and then it has to be mutated_. But if we have to
assign primary blame here, it is not the guy who didn't write `final` on
the fields, and not the guy who said that equality was state-based, but
the guy who put it in the collection and mutated it.
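To make the failure mode concrete, here is a minimal sketch in plain Java (not carrier syntax). The `Point` class and the method names are hypothetical, and equals/hashCode are written by hand to stand in for whatever a state-based default would generate. It shows that a mutable class with state-based equality in a `HashSet` is fine on its own; the lookup only fails once the object is mutated after insertion.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableKeyDemo {
    // Hypothetical mutable class; equals/hashCode are hand-written over all fields.
    static final class Point {
        int x;
        int y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            return o instanceof Point p && p.x == x && p.y == y;
        }

        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Mutable class + state-based equality + HashSet, but no mutation.
    static boolean foundWithoutMutation() {
        Set<Point> set = new HashSet<>();
        Point p = new Point(1, 2);
        set.add(p);
        return set.contains(p);
    }

    // Same setup, plus a mutation after the object is already in the set.
    static boolean foundAfterMutation() {
        Set<Point> set = new HashSet<>();
        Point p = new Point(1, 2);
        set.add(p);
        p.x = 42; // hash code changes; the set still files p under the old hash
        return set.contains(p);
    }

    public static void main(String[] args) {
        System.out.println(foundWithoutMutation()); // true
        System.out.println(foundAfterMutation());   // false
    }
}
```

Only the last step, the client-side mutation, turns the working version into the broken one.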
If we decided that avoiding this risk were the primary design goal, then
we would have to either disallow mutable fields, or change the way we
define the default equals/hashCode behavior. Potential ways to do the
latter include:
- never provide a default implementation, inherit the `Object` default
- don't provide a default implementation if there are any mutable fields
- leave mutable fields out of the default implementation, but use the
other fields
While "disallow mutable fields" is a potentially principled answer, it
is pretty restrictive. Of the others, I claim that the proposed
behavior is better than any of them.
Carrier classes are about data, and come with a semantic claim: that the
state description is a complete, canonical description of the state. It
seems pretty questionable then to use identity equality for such a
class. But the other two alternatives listed are both some form of
"action at a distance": they are harder to keep track of, and are still
only guesses at what the user actually wants. The two principled options are "don't
provide equals/hashCode", and "state-based equals/hashCode", and of the
two, the latter makes much more sense.
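The alternatives can be compared side by side with a small sketch, again in plain Java with hand-written methods. All class and method names here are hypothetical illustrations, not part of any proposal. `IdentityPair` inherits the `Object` default (which is also what "no default if any field is mutable" would give this class), `FinalOnlyPair` leaves the mutable field out of equality, and `StatePair` compares the full state.

```java
import java.util.Objects;

class EqualityChoicesDemo {
    // No equals/hashCode declared: inherits identity-based Object behavior.
    static final class IdentityPair {
        final String key;
        int count;
        IdentityPair(String key, int count) { this.key = key; this.count = count; }
    }

    // Equality over the final field only; the mutable `count` is silently ignored.
    static final class FinalOnlyPair {
        final String key;
        int count;
        FinalOnlyPair(String key, int count) { this.key = key; this.count = count; }

        @Override public boolean equals(Object o) {
            return o instanceof FinalOnlyPair p && p.key.equals(key);
        }
        @Override public int hashCode() { return key.hashCode(); }
    }

    // State-based equality over every field in the state description.
    static final class StatePair {
        final String key;
        int count;
        StatePair(String key, int count) { this.key = key; this.count = count; }

        @Override public boolean equals(Object o) {
            return o instanceof StatePair p && p.key.equals(key) && p.count == count;
        }
        @Override public int hashCode() { return Objects.hash(key, count); }
    }

    static boolean identitySameState() {
        return new IdentityPair("a", 1).equals(new IdentityPair("a", 1));
    }

    static boolean finalOnlyDifferentCount() {
        return new FinalOnlyPair("a", 1).equals(new FinalOnlyPair("a", 2));
    }

    static boolean stateSameState() {
        return new StatePair("a", 1).equals(new StatePair("a", 1));
    }

    static boolean stateDifferentCount() {
        return new StatePair("a", 1).equals(new StatePair("a", 2));
    }

    public static void main(String[] args) {
        System.out.println(identitySameState());       // false: same data, different objects
        System.out.println(finalOnlyDifferentCount()); // true: different data, reported equal
        System.out.println(stateSameState());          // true
        System.out.println(stateDifferentCount());     // false
    }
}
```

In this sketch, the identity version denies equality to two objects carrying identical data, and the final-fields-only version reports equality for objects whose visible state differs; only the state-based version tracks the state description in both directions.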
It is not a bug to put a mutable object in a HashSet; it is a bug to do
that _and_ to later mutate it. So detuning the semantics of carriers,
from something simple and principled to something complicated that is
just a guess about what the user really wants, just because someone
might do two things that are each individually OK but together not OK,
seems like an over-rotation.
> So I disagree that the component can be non final and that the class
> can be non final (you need those two to be non-modifiable).
For your code, sure. And for mine, most of the time. But you are
saying that "no one should be allowed to have mutable carriers"?
> As Ganapathi Vara Prasad said, maybe you want the reverse semantics,
> instead of indicating the components, you may want to indicate the
> non-component fields, the derived fields. But for me, it does not
> seems useful enough.
This proposal seemed more about code golf?
> A de-constructor becomes an instance method that must return a carrier
> class/carrier interface, a type that has the information to be
> destructured and the structure has to match the one defined by the type.
Careful there. A deconstructor is like a constructor; it applies to an
instance, BUT it cannot be inherited. It is not like an instance method.