[records] equals / hashCode (was: Records -- current status)
Brian Goetz
brian.goetz at oracle.com
Fri Apr 13 17:17:10 UTC 2018
Along the lines of the previous mail, people have asked, and will keep
asking, "why can't I redefine equals/hashCode?" The answer has two layers:
- The constraints on equals/hashCode are stronger for records, and
users might inadvertently violate them. (They can be specified in the
overrides of equals/hashCode in AbstractRecord, so there at least can be
a place where this specification lives, even if no one reads it.)
- In conjunction with ancillary fields, the constraints are sure to be
violated, whether inadvertently or deliberately.
Let's take a look at what sorts of modifications to equals/hashCode
would be OK, should we decide to relax this restriction. Equality
should still derive from the record's state, but there might be
acceptable variations.
Would it be OK to _widen_ the definition of equality, by ignoring a
component of the record?
This is an example of what Gunnar asked for, which is to restrict
equality to the primary key fields:
record PersonEntity(int primaryKey, String name, int age) {
// equality based only on primaryKey
}
Is this OK? Well, let's look at our model:
- Does ctor(dtor(c)) == c? Yes.
- if S1==S2, does ctor(S1) == ctor(S2)? Yes.
- For equal instances, does mutating them in the same way yield equal
instances? Yes.
- For equal instances, does calling the same method on both with the
same parameters yield equivalent results? No.
So, if p1 == p2, we cannot rely on p1.age() == p2.age(), so this fails
the requirements of our pseudo-formal model. (Assuming our model is the
right one.)
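
To make the failure concrete, here is a minimal sketch. It assumes a Java version in which records exist and allow equals/hashCode to be declared explicitly (Java 16 or later); the wrapper class name and the sample values are made up for illustration.

```java
public class PrimaryKeyEquality {
    // Hypothetical variant of PersonEntity: equality looks only at primaryKey.
    record PersonEntity(int primaryKey, String name, int age) {
        @Override
        public boolean equals(Object o) {
            return o instanceof PersonEntity other && other.primaryKey == primaryKey;
        }

        @Override
        public int hashCode() {
            // Must agree with equals, so it hashes only the key.
            return Integer.hashCode(primaryKey);
        }
    }

    public static void main(String[] args) {
        PersonEntity p1 = new PersonEntity(1, "Alice", 30);
        PersonEntity p2 = new PersonEntity(1, "Alice", 31);

        System.out.println(p1.equals(p2));        // true: same primary key
        System.out.println(p1.age() == p2.age()); // false: equal records, different accessor results
    }
}
```

The two instances are equals() to each other, yet age() distinguishes them, which is exactly the fourth check failing.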
So, how would we feel about that? Two records that are equals() to each
other, but not substitutable?
A more subtle version of this would be to consider all components, but
use a more inclusive notion of equality for some component, such as
comparing array components by contents.
record Numbers(int[] numbers) {
// equality based on Arrays.equals()
}
- Does ctor(dtor(c)) == c? Yes.
- Do equal state vectors produce equal records? Yes.
- Do identical mutations on equal records produce equal records? Yes.
- Do identical operations on equal records produce equal results?
Almost...
The Almost qualification can be seen here:
int[] a1 = {1, 2, 3};
int[] a2 = Arrays.copyOf(a1, a1.length);
Numbers r1 = new Numbers(a1), r2 = new Numbers(a2);
// false: equals() on arrays compares references, not contents
boolean same = r1.numbers().equals(r2.numbers());
The accessor will yield up the array references, which will not be
equals() to each other. This is essentially the same problem as above.
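
A self-contained sketch of the same situation, again assuming Java 16 or later with explicitly declared equals/hashCode; the wrapper class name and array contents are illustrative only.

```java
import java.util.Arrays;

public class ArrayContentEquality {
    // Hypothetical Numbers record whose equality compares array contents.
    record Numbers(int[] numbers) {
        @Override
        public boolean equals(Object o) {
            return o instanceof Numbers other && Arrays.equals(numbers, other.numbers);
        }

        @Override
        public int hashCode() {
            // Content-based hash, consistent with the content-based equals.
            return Arrays.hashCode(numbers);
        }
    }

    public static void main(String[] args) {
        int[] a1 = {1, 2, 3};
        int[] a2 = Arrays.copyOf(a1, a1.length);
        Numbers r1 = new Numbers(a1), r2 = new Numbers(a2);

        System.out.println(r1.equals(r2));                             // true: same contents
        System.out.println(r1.numbers().equals(r2.numbers()));         // false: distinct array references
        System.out.println(Arrays.equals(r1.numbers(), r2.numbers())); // true: contents still match
    }
}
```

The records are equal, and the accessor results are equivalent under Arrays.equals(), but not under the arrays' own equals(); hence "Almost".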
You get a similar result if your record represents something like a
rational number and you don't normalize to lowest terms in the
constructor; then you can have q1 equal q2, but q1.numerator() !=
q2.numerator().
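
The rational-number case can be sketched the same way. This assumes Java 16 or later; the Rational record, its cross-multiplication equality, and the gcd helper are hypothetical names chosen for illustration, and overflow is ignored for brevity.

```java
import java.util.Objects;

public class RationalEquality {
    // Hypothetical record that does NOT reduce to lowest terms in its constructor.
    record Rational(long numerator, long denominator) {
        Rational {
            if (denominator == 0) {
                throw new IllegalArgumentException("denominator must be non-zero");
            }
        }

        @Override
        public boolean equals(Object o) {
            // a/b equals c/d when a*d == c*b (overflow ignored in this sketch).
            return o instanceof Rational other
                    && numerator * other.denominator == other.numerator * denominator;
        }

        @Override
        public int hashCode() {
            // Hash the reduced form so that equal values hash alike.
            long g = gcd(Math.abs(numerator), Math.abs(denominator));
            long sign = denominator < 0 ? -1 : 1;
            return Objects.hash(sign * numerator / g, sign * denominator / g);
        }

        private static long gcd(long a, long b) {
            return b == 0 ? a : gcd(b, a % b);
        }
    }

    public static void main(String[] args) {
        Rational q1 = new Rational(1, 2);
        Rational q2 = new Rational(2, 4);

        System.out.println(q1.equals(q2));                     // true: same value
        System.out.println(q1.numerator() == q2.numerator());  // false: 1 vs 2
    }
}
```

Normalizing in the constructor would restore the property, since equal values would then carry identical state.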
Are any of these variations compelling enough to suggest we've got the
wrong model?
On 3/16/2018 2:55 PM, Brian Goetz wrote:
> There are a number of potentially open details on the design for
> records. My inclination is to start with the simplest thing that
> preserves the flexibility and expectations we want, and consider
> opening up later as necessary.
>
> One of the biggest issues, which Kevin raised as a must-address issue,
> is having sufficient support for precondition validation. Without
> foreclosing on the ability to do more later with declarative guards, I
> think the recent construction proposal meets the requirement for
> lightweight enforcement with minimal or no duplication. I'm hopeful
> that this bit is "there".
>
> Our goal all along has been to define records as being “just macros”
> for a finer-grained set of features. Some of these are motivated by
> boilerplate; some are motivated by semantics (coupling semantics of
> API elements to state.) In general, records will get there first, and
> then ordinary classes will get the more general feature, but the
> default answer for "can you relax records, so I can use it in this
> case that almost but doesn't quite fit" should be "no, but there will
> probably be a feature coming that makes that class simpler, wait for
> that."
>
>
> Some other open issues (please see my writeup at
> http://cr.openjdk.java.net/~briangoetz/amber/datum.html for
> reference), and my current thoughts on these, are outlined below.
> Comments welcome!
>
> - Extension. The proposal outlines a notion of abstract record,
> which provides a "width subtyped" hierarchy. Some have questioned
> whether this carries its weight, especially given how Scala doesn't
> support case-to-case extension (some see this as a bug, others as an
> existence proof.) Records can implement interfaces.
>
> - Concrete records are final. Relaxing this adds complexity to the
> equality story; I'm not seeing good reasons to do so.
>
> - Additional constructors. I don't see any reason why additional
> constructors are problematic, especially if they are constrained to
> delegate to the default constructor (which in turn is made far simpler
> if there can be statements ahead of the this() call.) Users may find
> the lack of additional constructors to be an arbitrary limitation (and
> they'd probably be right.)
>
> - Static fields. Static fields seem harmless.
>
> - Additional instance fields. These are a much bigger concern. While
> the primary arguments against them are of the "slippery slope"
> variety, I still have deep misgivings about supporting unrestricted
> non-principal instance fields, and I also haven't found a reasonable
> set of restrictions that makes this less risky. I'd like to keep
> looking for a better story here, before just caving on this, as I
> worry doing so will end up biting us in the back.
>
> - Mutability and accessibility. I'd like to propose an odd choice
> here, which is: fields are final and package (protected for abstract
> records) by default, but finality can be explicitly opted out of
> (non-final) and accessibility can be explicitly widened (public).
>
> - Accessors. Perhaps the most controversial aspect is that records
> are inherently transparent to read; if something wants to truly
> encapsulate state, it's not a record. Records will eventually have
> pattern deconstructors, which will expose their state, so we should go
> out of the gate with the equivalent. The obvious choice is to expose
> read accessors automatically. (These will not be named getXxx; we are
> not burning the ill-advised Javabean naming conventions into the
> language, no matter how much people think it already is.) The obvious
> naming choice for these accessors is fieldName(). No provision for
> write accessors; that's bring-your-own.
>
> - Core methods. Records will get equals, hashCode, and toString.
> There's a good argument for making equals/hashCode final (so they
> can't be explicitly redeclared); this gives us stronger preservation
> of the data invariants that allow us to safely and mechanically
> snapshot / serialize / marshal (we'd definitely want this if we ever
> allowed additional instance fields.) No reason to suppress override
> of toString, though. Records could be safely made cloneable() with
> automatic support too (like arrays), but not clear if this is worth it
> (it's darn useful for arrays, though.) I think the auto-generated
> getters should be final too; this leaves arrays as second-class
> components, but I am not sure that bothers me.
>
>
>
>
>