Data Oriented Programming, Beyond Records

Anthony Vanelverdinghe anthonyv.be at outlook.com
Sun Jan 18 10:59:46 UTC 2026


On 1/18/2026 9:54 AM, Olexandr Rotan wrote:
>
>     Correcting myself: what about "state-based equals with immutable
>     fields-based hashCode", where "immutable fields" would be defined
>     as "final fields of types for which it is known that their
>     hashCode is immutable" (primitive types, enums, records and
>     carrier classes with the default-generated hashCode method, value
>     classes)?
>
> I would like to point out once more that none of the listed cases
> except the first two actually guarantees hashCode immutability, as
> records and value types are only shallowly immutable. So this whole
> discussion seems to tackle a much broader topic than just carrier
> class equals/hashCode: challenging the way hashCode is generated
> with regard to mutability in carriers would mean challenging the
> same behaviour in records as well. Using only final fields to
> compute the hash merely defers the problem one layer of indirection
> deeper, to any possibly-mutable value of a final field/component.

Thanks, you're right, of course. And I should've said "... for which it 
is known that their hashCode is constant", not immutable.
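
To make the shallow-immutability point concrete, here is a minimal 
sketch. The `Names` record and the demo class are hypothetical names 
used only for illustration, and the record spec does not fix the exact 
hash algorithm, so the changed hash is what current JDKs do in practice 
rather than a guarantee:

```java
import java.util.ArrayList;
import java.util.List;

class ShallowImmutabilityDemo {
    // Hypothetical record: the component reference is final, but the list
    // it points to may still be mutable.
    record Names(List<String> value) {}

    // Returns true if mutating the backing list changed the record's hashCode.
    static boolean hashChangesAfterMutation() {
        List<String> backing = new ArrayList<>(List.of("a"));
        Names names = new Names(backing);
        int before = names.hashCode();
        backing.add("b"); // mutates state reachable through the final component
        return before != names.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hashChangesAfterMutation());
    }
}
```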

So I propose that records, carrier classes, and value classes have a 
default (1) state-based equals and (2) hashCode whose result is constant 
for any given instance. This would require changing the default hashCode 
for records, but I believe this would be a backward compatible change. 
And hashCode would be able to use at least components/final fields of 
primitive types, enums, and records/carrier classes/value classes with a 
default hashCode. Moreover, javac could recognize patterns like

    record Names(List<String> value) {
        public Names { value = List.copyOf(value); }
    }

to use the `value` component as well.
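
As a rough sketch of what such a default could look like if written by 
hand today (carrier classes are not a released feature, so `Account` is 
an ordinary class standing in for one, and all names here are 
hypothetical): equals compares the full state, while hashCode only uses 
the final primitive field, so its result never changes for a given 
instance.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {
    // Hypothetical stand-in for a carrier class with one final and one mutable field.
    static final class Account {
        private final long id;      // final primitive: safe to use in hashCode
        private String displayName; // mutable: compared by equals, ignored by hashCode

        Account(long id, String displayName) {
            this.id = id;
            this.displayName = displayName;
        }

        void rename(String newName) {
            this.displayName = newName;
        }

        // State-based equals: every field takes part.
        @Override
        public boolean equals(Object o) {
            return o instanceof Account a
                    && id == a.id
                    && displayName.equals(a.displayName);
        }

        // Constant per instance: derived from the final field only.
        @Override
        public int hashCode() {
            return Long.hashCode(id);
        }
    }

    // Returns true if the instance can still be found after being mutated.
    static boolean foundAfterMutation() {
        Set<Account> accounts = new HashSet<>();
        Account account = new Account(42L, "old name");
        accounts.add(account);
        account.rename("new name"); // hashCode is unaffected
        return accounts.contains(account);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
    }
}
```

The equals/hashCode contract still holds, since equal instances 
necessarily have equal ids. The cost is the one mentioned earlier in the 
thread: a class without any usable final fields would hash all of its 
instances to the same value.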

Anthony

>
> On Sun, Jan 18, 2026 at 8:01 AM Anthony Vanelverdinghe 
> <anthonyv.be at outlook.com> wrote:
>
>     On 1/17/2026 6:58 PM, Anthony Vanelverdinghe wrote:
>>     On 1/17/2026 5:46 PM, Brian Goetz wrote:
>>>>     With a mutable class with equals/hashCode/toString generated,
>>>>     it's too easy to store an object in a collection, mutate it,
>>>>     and then never be able to find it again.
>>>
>>>     Yes, but also: everyone here knows about this risk.  You don't
>>>     need to belabor the example :)
>>>
>>>     This is a reflection of a problem we already have: equals is a
>>>     semantic part of the type's definition, about when two instances
>>>     represent the "same" value, and mutability is part of the type's
>>>     definition, and "whether you put it in a hash-based collection
>>>     and then mutate it" is about _how the instances are used by
>>>     clients_.
>>>
>>>     While immutability is a good default, it's not always _wrong_ to
>>>     use mutability; it's just riskier.  And for a mutable class,
>>>     state-based equality is _still_ a sensible possible
>>>     implementation of equality; it's just riskier.  And putting
>>>     mutable objects in hash-based collections is also not wrong; it's
>>>     just riskier.  For the bad thing to happen, all of these have to
>>>     happen _and then it has to be mutated_.  But if we have to
>>>     assign primary blame here, it is not the guy who didn't write
>>>     `final` on the fields, and not the guy who said that equality
>>>     was state-based, but the guy who put it in the collection and
>>>     mutated it.
>>>
>>>     If we decided that avoiding this risk were the primary design
>>>     goal, then we would have to either disallow mutable fields, or
>>>     change the way we define the default equals/hashCode behavior. 
>>>     Potential ways to do the latter include:
>>>
>>>      - never provide a default implementation, inherit the object
>>>     default
>>>      - don't provide a default implementation if there are any
>>>     mutable fields
>>>      - leave mutable fields out of the default implementation, but
>>>     use the other fields
>>>
>>>     While "disallow mutable fields" is a potentially principled
>>>     answer, it is pretty restrictive.  Of the others, I claim that
>>>     the proposed behavior is better than any of them.
>>>
>>>     Carrier classes are about data, and come with a semantic claim:
>>>     that the state description is a complete, canonical description
>>>     of the state.  It seems pretty questionable then to use identity
>>>     equality for such a class.  But the other two alternatives
>>>     listed are both some form of "action at a distance", harder to
>>>     keep track of, and are still only guesses at what the user actually
>>>     wants.  The two principled options are "don't provide
>>>     equals/hashCode", and "state-based equals/hashCode", and of the
>>>     two, the latter makes much more sense.
>     Correcting myself: what about "state-based equals with immutable
>     fields-based hashCode", where "immutable fields" would be defined
>     as "final fields of types for which it is known that their
>     hashCode is immutable" (primitive types, enums, records and
>     carrier classes with the default-generated hashCode method, value
>     classes)?
>>
>>     What about "state-based equals with final fields-based hashCode"?
>>     (Maybe this is actually what you meant with `leave mutable fields
>>     out of the default implementation, but use the other fields`, but
>>     then I don't understand how that's "action at a distance" and
>>     "harder to keep track of".) That would solve the HashSet issue
>>     and be a safe, intuitive default. There might be performance
>>     issues for carrier classes without final fields that are used in
>>     large HashSets, but in that case it's easy enough to provide
>>     one's own implementation of `hashCode`. And by doing so, one
>>     would implicitly consent to the implications of doing so (I could
>>     imagine javac issuing a lint warning for this and/or javadoc
>>     adding a warning to the Javadoc that the carrier class suffers
>>     from the HashSet issue).
>>
>>     Kind regards, Anthony
>>
>>>     It is not a bug to put a mutable object in a HashSet; it is a
>>>     bug to do that _and_ to later mutate it.  So detuning the
>>>     semantics of carriers, from something simple and principled to
>>>     something complicated and which is just a guess about what the
>>>     user really wants, just because someone might do two things that
>>>     are each individually OK but together not OK, seems like an
>>>     over-rotation.
>>>
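
For reference, the HashSet issue discussed in the quoted thread can be 
reproduced with a few lines. This is only an illustrative sketch: `Point` 
is a hypothetical hand-written class with state-based equals and 
hashCode over mutable fields, not something proposed in the thread.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class LostInHashSetDemo {
    // Hypothetical mutable class with state-based equals and hashCode.
    static final class Point {
        int x;
        int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Point p && x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // depends on mutable state
        }
    }

    // Returns whether the set still finds the very same instance after mutation.
    static boolean foundAfterMutation() {
        Set<Point> points = new HashSet<>();
        Point p = new Point(1, 2);
        points.add(p);  // stored under the hash of (1, 2)
        p.x = 5;        // the hash is now that of (5, 2)
        return points.contains(p);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints "false"
    }
}
```

Neither step is a bug on its own; the lookup only fails because the 
object was mutated after it was added to the set.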


More information about the amber-spec-observers mailing list