[External] : Equality operator for identityless classes
John Rose
john.r.rose at oracle.com
Wed Nov 3 17:10:49 UTC 2021
One of the long standing fixtures in the ecosystem is the
set of idioms for correct use of op==/acmp. Another is the
many articles and IDE checkers that flag other, dubious
uses. It’s a problem that you cannot use op==/acmp
by itself in most cases; you have to accompany it with a call
to Object::equals. We might try to fix this problem, but
it cannot be expunged from our billions of lines of
pre-existing Java code.
I like to call these equals-accompanying idioms L.I.F.E.,
or Legacy Idiom(s) For Equality. It shows up, canonically,
in this method of java.util.Objects:

    public static boolean equals(Object a, Object b) {
        return (a == b) || (a != null && a.equals(b));
    }

Thus, the defective character of op==/acmp is just
(wait for it) a fact of L.I.F.E. and we cannot fight it too
much without hurting ourselves.
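
To make the defect concrete, here is a minimal sketch of why op==/acmp alone is rarely enough and why the idiom pairs it with Object::equals. The class and method names (LifeDemo, identityOnly, life) are made up for illustration; only java.util.Objects.equals is a real API.

```java
import java.util.Objects;

public class LifeDemo {
    // Identity only: true when both references denote the very same object.
    static boolean identityOnly(Object a, Object b) {
        return a == b;
    }

    // L.I.F.E. via the library method quoted above.
    static boolean life(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents.
        String a = new String("valhalla");
        String b = new String("valhalla");

        System.out.println(identityOnly(a, b)); // false: different objects
        System.out.println(life(a, b));         // true: equal contents
        System.out.println(life(a, a));         // true: identity fast path
        System.out.println(life(null, b));      // false, and no NullPointerException
    }
}
```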
Turning that around, if L.I.F.E. is a dynamically common
occurrence (as it is surely statically common) then we
can expend JIT complexity budget to deal with it, and
(maybe even) adjust JVM rules around the optimizations
to make more edgy versions of the optimizations legal.
Specifically, this JIT-time transform has the potential to
radically reduce the frequency of op==/acmp:

    (a == b) || (a != null && a.equals(b))
    =>
    (a == null ? b == null : a.equals(b))

This only works if all possible methods selected from
a.equals permit the dropping of op==. The contract
of Object::equals does indeed allow this, but it is not
enforced; the JVMS allows the contract to be broken,
and the transform will expose the breakage. And yet,
there are things we can do here to unlock this transform.
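
A small sketch of how the transform exposes a broken contract. The two static methods restate the source-level forms from the transform above; NeverEqual is a hypothetical class, invented here, whose equals violates reflexivity. This illustrates the semantics only and says nothing about what any actual JIT does.

```java
public class TransformDemo {
    // The legacy idiom, as in java.util.Objects.equals.
    static boolean legacy(Object a, Object b) {
        return (a == b) || (a != null && a.equals(b));
    }

    // The transformed form: a is never compared to b by identity.
    static boolean transformed(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }

    // Hypothetical contract-breaking class: equals is not reflexive.
    static final class NeverEqual {
        @Override public boolean equals(Object o) { return false; }
        @Override public int hashCode() { return 0; }
    }

    public static void main(String[] args) {
        Object ok = "text";
        Object bad = new NeverEqual();

        // A compliant equals: both forms agree.
        System.out.println(legacy(ok, ok) + " " + transformed(ok, ok));         // true true
        System.out.println(legacy(null, null) + " " + transformed(null, null)); // true true
        System.out.println(legacy(null, ok) + " " + transformed(null, ok));     // false false

        // A non-reflexive equals: the dropped op== changes the answer.
        System.out.println(legacy(bad, bad) + " " + transformed(bad, bad));     // true false
    }
}
```

The two forms differ only when a.equals(a) can return false, which is exactly the case the Object::equals contract forbids but the JVMS does not prevent.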
More generally, for other L.I.F.E.-forms, I am confident
we can build JIT transforms that reduce reliance on
acmp, which is suddenly more expensive than its coders
(and the original designers of Java) expect.
Programmers who override Object::equals to (as you
nicely say) disavow identity-based substitutability
will probably write, prompted by their IDE, in a
ceremonial mood, that one occurrence of op==/acmp
to short-circuit the rest of their Foo::equals method.
Or they may erase it, in a purifying mood.
In either case, the above transform requires the JIT
to examine such methods as either actually or potentially
starting with a short-circuiting op==/acmp.
In any case, such an identity comparison will be
monomorphic in the receiver type, not a
polymorphic multi-way dispatch on Object
references.
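
For illustration, the two moods might look like this at the source level. Both classes are hypothetical, written only for this sketch. In the first, the leading `this == o` is the ceremonial short-circuit; inside the method the receiver is statically known to be a Ceremonial, which is the sense in which the comparison is monomorphic in the receiver type. The second omits the check and, because its equals is otherwise compliant, gives the same answers.

```java
public class EqualsStyles {
    // "Ceremonial" mood: identity short-circuit first, then field comparison.
    static final class Ceremonial {
        final int x, y;
        Ceremonial(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;                 // short-circuiting op==/acmp
            if (!(o instanceof Ceremonial)) return false;
            Ceremonial that = (Ceremonial) o;
            return x == that.x && y == that.y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    // "Purifying" mood: the identity check is erased; only fields are compared.
    static final class Purified {
        final int x, y;
        Purified(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Purified)) return false;
            Purified that = (Purified) o;
            return x == that.x && y == that.y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    public static void main(String[] args) {
        Ceremonial c = new Ceremonial(1, 2);
        Purified p = new Purified(1, 2);
        System.out.println(c.equals(c) + " " + p.equals(p));                       // true true
        System.out.println(c.equals(new Ceremonial(1, 2)) + " " + p.equals(new Purified(1, 2))); // true true
        System.out.println(c.equals(new Ceremonial(3, 4)) + " " + p.equals(new Purified(3, 4))); // false false
    }
}
```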
So this is not just moving around costs that stay the
same; you can de-virtualize op==/acmp by moving
it into the prologue of all Object::equals methods.
(Non-compliant ones can be handled by splitting
the entry point.) Once the actual or potential
op==/acmp is found at the start of Foo::equals, we
can then inline and reorder the checks in the body
of the equals method. At that point the cost of op==
starts to go to zero.
This is old news; we discussed it in Burlington
many years ago. But I thought I’d remind
us of it. And this is really a more hopeful approach
to L.I.F.E. That is, even if we don’t do these JIT
transforms in the first release, there is a path forward
that eventually removes the unintentional costs of
op==/acmp when L.I.F.E. throws them at us.
All this can work without requiring a global move to a
completely new operator (op===), surely an alien form
of L.I.F.E. within our ecosystem.
(Ba-DUM-ch!)
More information about the valhalla-spec-observers
mailing list