Re: Consolidating the user model
Brian Goetz
brian.goetz at oracle.com
Wed Nov 3 14:05:21 UTC 2021
> I haven't caught up on the plans for equality in a long time.
This is a good time to catch up on this.
Today, the JVM provides an equality operation on objects in the form of
the `ACMP` instructions. It also provides per-primitive equality
operations (`ICMP`, `FCMP`, etc) for the various primitive types. (The
JVM mostly erases boolean, byte, char, and short to int, so some of
these instructions are "missing".)
Today, the language translates the `==` operator to the appropriate ACMP
/ ICMP / etc instruction, depending on the static type of the operands.
(JLS Ch5 (Conversions and Contexts) does the lifting of managing
mismatches when we, say, compare an object to a primitive.) The
important thing to take away here is that there really are multiple `==`
operators, they are just spelled the same way, and disambiguated by
static typing; let's call them `id==`, `int==`, etc if there's any
ambiguity. Note that `float==` and `double==` are weird when it comes
to `NaN`, so `==` on primitives is not necessarily just a straight
bitwise comparison.
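
To make the "multiple `==` operators" point concrete, here is a small sketch in today's Java. The class and method names are made up for illustration; each method pins the static types of its operands so you can see which `==` gets selected.

```java
final class EqualsOperatorsDemo {
    // Both operands are int, so this is int==.
    static boolean intEq(int a, int b) { return a == b; }

    // Both operands are references, so this is id==.
    static boolean idEq(Object a, Object b) { return a == b; }

    // Mixed operands: the Integer is unboxed first, then int== applies.
    static boolean mixedEq(Integer a, int b) { return a == b; }

    // Both operands are float, so this is float== (IEEE 754 comparison).
    static boolean floatEq(float a, float b) { return a == b; }

    public static void main(String[] args) {
        System.out.println(intEq(1000, 1000));                     // true
        System.out.println(idEq(new String("a"), new String("a"))); // false: two distinct objects
        System.out.println(mixedEq(Integer.valueOf(1000), 1000));   // true: compared as ints
        System.out.println(floatEq(Float.NaN, Float.NaN));          // false: NaN is never == to itself
    }
}
```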
Object has an `equals` method; the default implementation is:
    boolean equals(Object other) {
        return this == other;
    }
So in the absence of code to the contrary, two objects are `equals` if
they are the same object.
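
As a quick illustration of that default, assuming a hypothetical identity class `Counter` that does not override `equals`:

```java
final class DefaultEqualsDemo {
    // Hypothetical identity class; it inherits Object::equals unchanged.
    static final class Counter {
        final int value;
        Counter(int value) { this.value = value; }
    }

    static boolean viaEquals(Object a, Object b) { return a.equals(b); }

    public static void main(String[] args) {
        Counter c1 = new Counter(1);
        Counter c2 = new Counter(1);
        System.out.println(viaEquals(c1, c1)); // true: same object
        System.out.println(viaEquals(c1, c2)); // false: same state, different identity
    }
}
```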
Extrapolating, ACMP is a _substitutability test_; it says that
substituting one for the other would have no detectable differences.
Because all objects have a unique identity, comparing the identities is
both necessary and sufficient for a substitutability test. This is the
foundation on which we abstract `==` on the new classes.
If C is a class with no identity, that means an instance is the state,
the whole state, and nothing but the state. So the natural way to ask
"could I substitute instance c1 for instance c2" is to compare each of
their fields with a substitutability test, which is exactly what `ACMP`
does on primitive objects. In keeping with the notion that each
primitive type has its own `==`, we'll write `Point==` for the equality
on `Point`.
For a simple `Point` primitive class, this is obvious, but it gets
tricky when a primitive is hiding behind a broader static type like
Object or an interface type. Consider:
    primitive class Box {
        Object contents;
    }
How do we compare two boxes? By comparing their contents. How do we
compare contents? With a substitutability test. If we have identity
objects in the box, then the box comparison is "are you both boxes, and
are your contents `id==`". What if we have Points in the box? We need
to compare them with `Point==`. How do we know we have Points in the
box? By looking at their dynamic type. So the `==` operation on
primitive objects not only recurses into fields, but for fields that
could hold _either_ identity or primitive objects (these are `Object`,
interfaces, and some abstract classes), we dynamically select the `==`
operator to use on that field. (Edge cases: an id object is never `==`
to a primitive object; null is always `==` to itself.)
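
The recursion described above can be roughly modeled in today's Java. This is only a sketch under loud assumptions: records stand in for classes without identity (real records still have identity), the method name `substitutable` is invented here, and the reflection is nothing like how the JVM would actually implement `ACMP`. Boxed primitives are compared with `equals`, since reflection hands us wrappers rather than raw field values.

```java
import java.lang.reflect.RecordComponent;
import java.util.Set;

final class SubstitutabilityDemo {
    // Stand-ins for identity-free classes (assumption: records model them).
    record Point(int x, int y) {}
    record Box(Object contents) {}

    private static final Set<Class<?>> WRAPPERS = Set.of(
            Boolean.class, Byte.class, Short.class, Character.class,
            Integer.class, Long.class, Float.class, Double.class);

    // Hypothetical model of the substitutability test; not a real API.
    static boolean substitutable(Object a, Object b) {
        if (a == b) return true;                        // same identity object, or both null
        if (a == null || b == null) return false;
        if (a.getClass() != b.getClass()) return false; // dynamic types must match
        Class<?> type = a.getClass();
        if (WRAPPERS.contains(type)) return a.equals(b); // boxed primitive field values
        if (!type.isRecord()) return false;             // two distinct identity objects
        try {
            // Recurse into each field, selecting the test by dynamic type.
            for (RecordComponent c : type.getRecordComponents()) {
                Object fa = c.getAccessor().invoke(a);
                Object fb = c.getAccessor().invoke(b);
                if (!substitutable(fa, fb)) return false;
            }
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object shared = new Object();
        System.out.println(substitutable(new Box(new Point(1, 2)), new Box(new Point(1, 2)))); // true
        System.out.println(substitutable(new Box(shared), new Box(shared)));                   // true
        System.out.println(substitutable(new Box(new Object()), new Box(new Object())));       // false
        System.out.println(substitutable(new Box(null), new Box(null)));                       // true
        System.out.println(substitutable(new Point(1, 2), shared));                            // false
    }
}
```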
Note that `.ref` is transparent here; in order to get a `Point` into the
`Object` field, we (probably silently) converted it to `Point.ref`. But
`Point.ref` uses the same `==` computation as `Point`. The same is true
for the B2/B3 distinction; no difference. Objects without identity are
equal when their state is equal, whether they're a B2, B3, or B3.ref.
Possibly surprisingly, this has been pushed all the way into `ACMP`.
This means that existing code like the default implementation of
`Object::equals` just works; if you give it primitive objects, it knows
what to do, and performs the proper substitutability test. One rough
edge is that we don't use `==` as the test for float and double fields,
because it's not a proper substitutability test; we use the semantics of
`Float::equals` and `Double::equals` instead. Historical wart.
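
To see why `float==` falls short as a substitutability test, compare it with `Float::equals` on the two awkward cases. The class and method names below are illustrative only.

```java
final class FloatFieldSemanticsDemo {
    // The language-level float== operator.
    static boolean operatorEq(float a, float b) { return a == b; }

    // The semantics used for float fields: Float::equals on the boxed values.
    static boolean fieldEq(float a, float b) {
        return Float.valueOf(a).equals(Float.valueOf(b));
    }

    public static void main(String[] args) {
        // NaN: == says "different", yet swapping one NaN for the other is undetectable.
        System.out.println(operatorEq(Float.NaN, Float.NaN)); // false
        System.out.println(fieldEq(Float.NaN, Float.NaN));    // true

        // Signed zero: == says "same", yet 1/0.0f and 1/-0.0f differ.
        System.out.println(operatorEq(0.0f, -0.0f));          // true
        System.out.println(fieldEq(0.0f, -0.0f));             // false
    }
}
```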
The bottom line is that `==` is preserved as a substitutability test on
instances of all primitive classes, whether they're "stored" by
reference or value. A corollary is that (finally) Integer instances
provide reliable `==` semantics, rather than the old unreliable
cache-based semantics. (One rift healed.)
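
For anyone who has not been bitten by the cache-based behavior, here is what it looks like today. The JLS only guarantees that boxing shares instances for values from -128 to 127; outside that range the result of `==` depends on the implementation and its settings, so the sketch deliberately makes no claim about what the second comparison prints. The class name is made up.

```java
final class IntegerCacheDemo {
    // Both operands are references, so today this is id== on Integer.
    static boolean idEq(Integer a, Integer b) { return a == b; }

    public static void main(String[] args) {
        Integer a = 127, b = 127;
        System.out.println(idEq(a, b));   // true: boxing of small values is cached

        Integer c = 1000, d = 1000;
        System.out.println(idEq(c, d));   // not guaranteed; typically false with default settings
        System.out.println(c.equals(d));  // true: compares the int values
    }
}
```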