Feedback regarding default equals, hashCode and toString implementions for records
Brian Goetz
brian.goetz at oracle.com
Mon Apr 27 14:52:15 UTC 2020
> Hi all,
>
> While experimenting with the new records, I found what I believe is an
> oversight in the implementation.
> The generated equals method for records is based on equality of the
> components, but when the record component is an array, it still uses
> regular Object#equals(Object) method.
Thanks for your thoughts.
Unfortunately, this was no oversight -- this was an extensively
discussed topic, and none of the options were terribly satisfying. In
the end, we concluded that doing what you suggest would cause more
trouble than it was worth, and went back to the uniform appeal to
Object::equals.
Yes, arrays are irritating; they cannot be made immutable, and you
cannot override their equals computation. Many developers believe that
the default equals computation is "obviously wrong", but unfortunately,
it is only "wrong" a lot of the time. The rest of the time, the "other
default" would be "wrong". This is a no-win situation.
We considered doing exactly as you say -- using some sort of deeper
equals when the static type is an array. But, this "fix" creates a
number of inconsistencies, which quickly turn into whack-a-mole, with
extra complexity as a bonus. As an example, suppose you had the
following two records:
record A(String[] ss) { }
record B(Object ss) { }
String[] ss = ...
Then the equals() behavior of `new A(ss)` and `new B(ss)` would be
different! Try explaining that.
By-reference comparison and mutability go hand in hand; if an object is
mutable, then identity equality is probably the right thing. (Mutable
objects whose equals/hashCode are state-based are a perennial sorts of
puzzlers, when you put one in a map, and then the object is mutated, and
then (maybe) it can't be found in the map even though it is there.
(Ironically, List itself is one of the offenders here.)) Arrays make
this choice, for better or worse (but at least make it consistently.)
But arrays are not the only ones who make this choice, and the propose
fix does nothing for them.
> I believe that in order for records to succeed, they need to be truly
> boilerplate-reducing, so I'm suggesting the following changes to the
> default equals, hashCode and toString implementations:
>
> IF the field in the record is statically known to be a one-dimensional
> array:
> THEN java.util.Arrays.equals(array1, array2) is used to compare the
> component of two record instances (applies to
> <recordclass>#equals(Object))
> AND java.util.Arrays.hashCode(array) is used to compute the hashcode of
> the component (applies to <recordclass>#hashCode())
> AND java.util.Arrays.toString(array) is used to compute the string
> representation of the component (applies to
> <recordclass>#toString())
> ELSE IF the field in the record is statically known to be a
> multidimensional array:
> THEN java.util.Arrays.deepEquals(array1, array2) is used to compare the
> component of two record instances (applies to
> <recordclass>#equals(Object))
> AND java.util.Arrays.deepHashCode(array) is used to compute the hashcode
> of the component (applies to <recordclass>#hashCode())
> AND java.util.Arrays.deepToString(array) is used to compute the string
> representation of the component (applies to
> <recordclass>#toString())
> ELSE
> just use the old code generation algorithm from Java 14 that uses
> java.lang.Object#equals
I think you've just illustrated how complex the alternative is -- and it
is still not "right" 100% of the time. So more complexity, just to be
"wrong" less often, is not necessarily a good trade.
The basic problem is that sometimes people think that the by-contents
behavior is natural, and sometimes they think that the by-identity
behavior is natural, but we can't have both defaults. A complex
algorithm like this is trying to read the developer's mind. The current
choice -- treat arrays as any other object -- is sometimes not what
developers want, but it is far easier for developers to reason about
what it means. "Call the equals() method" is something every developer
can understand. They may grumble about "but the equals method for
arrays is wrong", but at least they can understand why they are getting
the result they are.
In the end, we settled on the simpler alternative, which is to uniformly
appeal to Object::equals, and advise users to take care when using
arrays and records, and consider using List instead if content equality
is what is desired.
> I am not familiar with the compiler implementation details, but it looks
> like a trivial change to me.
The complexity of the compiler implementation is generally a
non-consideration in the design process.
More information about the amber-dev
mailing list