Value equality

Brian Goetz brian.goetz at oracle.com
Wed May 18 14:57:24 UTC 2016


Great summary of the options.

For those who didn't read the whole thing:
  - CE is bitwise equality -- "are these two things identical copies"
  - OE is calling Object.equals()
  - NE (for values) is the synthetic "recurse with == on primitive 
components, NE on value components, and OE on reference components"
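
To make the three definitions concrete, here is a rough sketch in
today's Java. It rests on assumptions: Point and Labeled are ordinary
final classes standing in for value types, and ce() and ne() are
hypothetical helpers written out by hand, not anything the language or
libraries provide.

```java
import java.util.Objects;

// Stand-in for a value type: a final class with final fields (assumption).
final class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

// A "value" with one value component, one reference component, one primitive.
final class Labeled {
    final Point at;
    final String name;
    final int weight;
    Labeled(Point at, String name, int weight) {
        this.at = at;
        this.name = name;
        this.weight = weight;
    }
}

final class ValueEquality {
    // CE: "identical copies". Every component is compared bitwise,
    // so a reference component is compared by identity.
    static boolean ce(Point a, Point b) {
        return a.x == b.x && a.y == b.y;
    }
    static boolean ce(Labeled a, Labeled b) {
        return ce(a.at, b.at) && a.name == b.name && a.weight == b.weight;
    }

    // NE: == on primitives, NE on value components, OE on references.
    static boolean ne(Point a, Point b) {
        return a.x == b.x && a.y == b.y;
    }
    static boolean ne(Labeled a, Labeled b) {
        return ne(a.at, b.at)
            && Objects.equals(a.name, b.name)
            && a.weight == b.weight;
    }

    public static void main(String[] args) {
        String n1 = new String("origin");
        String n2 = new String("origin");   // equal text, distinct object
        Labeled a = new Labeled(new Point(0, 0), n1, 1);
        Labeled b = new Labeled(new Point(0, 0), n2, 1);
        Labeled c = new Labeled(new Point(0, 0), n1, 1);
        System.out.println(ce(a, b));  // false: name references differ
        System.out.println(ne(a, b));  // true: names are equals()
        System.out.println(ce(a, c));  // true: same bits throughout
    }
}
```

CE and NE only diverge once a value holds a reference component; for
a value built purely from primitives (like Point here) they coincide.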

If it were 1995, and we were inventing Java (and we didn't have our 
heads addled with an interpreter-based cost model), what would we do?  I 
think we'd bind ==(ref,ref) to OE, with an (uglier-named) API point for 
CE (e.g., Objects.isSameReference) which would be used (a) for 
known-interned things, (b) for IdentityHashMap, (c) as a default 
implementation of Object.equals(), and (d) possibly as a 
short-circuiting optimization *inside* overrides of equals().

This hypothetical world (call it J') still gives users the choice of CE 
vs OE whenever they want, while nudging users towards OE (by giving it 
the prime syntactic real estate) which is probably what they want most 
of the time.
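
A rough sketch of how J' might look when emulated in today's Java.
Everything here is hypothetical: isSameReference is only the example
name used above (it is not a method of java.util.Objects), eq() stands
in for what ==(ref,ref) would mean in J', and its null handling is my
assumption.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class JPrime {
    // Hypothetical, uglier-named API point for CE on references.
    static boolean isSameReference(Object a, Object b) {
        return a == b;
    }

    // Stand-in for J' ==(ref,ref): bound to OE.
    // Null handling is an assumption; the message does not specify it.
    static boolean eq(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }
}

final class Name {
    final String text;
    Name(String text) { this.text = text; }

    @Override public boolean equals(Object o) {
        // Use (d): CE as a short-circuit inside an equals() override.
        if (JPrime.isSameReference(this, o)) return true;
        return o instanceof Name && text.equals(((Name) o).text);
    }

    @Override public int hashCode() { return text.hashCode(); }
}

final class JPrimeDemo {
    public static void main(String[] args) {
        Name a = new Name("x");
        Name b = new Name("x");

        System.out.println(JPrime.eq(a, b));               // true  (OE)
        System.out.println(JPrime.isSameReference(a, b));  // false (CE)

        // Use (b): IdentityHashMap keys on CE, HashMap keys on OE.
        Map<Name, Integer> byEquals = new HashMap<>();
        Map<Name, Integer> byIdentity = new IdentityHashMap<>();
        byEquals.put(a, 1);
        byEquals.put(b, 2);
        byIdentity.put(a, 1);
        byIdentity.put(b, 2);
        System.out.println(byEquals.size());    // 1
        System.out.println(byIdentity.size());  // 2
    }
}
```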

Why didn't we do this in 1995?  Hard to know (I'll ask James next time I 
see him), but I'd posit two main forces:

  - C bias.  Since C has *only* CE (and it was desirable to make Java 
feel like "a safer C") it probably seemed like a big improvement already 
to offer programmers both CE and OE on all references, and binding == to 
OE probably seemed too radical at the time.

  - Cost-model bias.  In the Java 1.0 days, pointer comparison was 
probably 100x faster in the interpreter than a virtual call to 
Object.equals().  If binding == to OE was even considered, it was 
probably deemed implausible.

Of course, both of these feel a bit silly 20 years later, but here we 
are.  So, in a J' world, what would we do with ==(val,val)?  I think it 
would be a no-brainer -- bind it to NE, since Java developers would 
already associate == with a deeper comparison.  Then we'd just have to 
adjust whatever the API point for CE is to also accommodate CE on values, 
and we'd be done.

But, we don't live in J' world.  So our choices become:

P1: Bind ==(val,val) to CE, as we do with refs.  Optimization challenges 
with the usual (a==b || a.equals(b)) idiom [1], but the rules work the 
same for values and refs.

P2: Bind ==(val,val) to NE.  This is J' world for values and J world for 
refs.  (With even bigger optimization challenges for the (a==b || 
a.equals(b)) idiom.)  Rules are different for values and refs, meaning 
(a) users will have to keep in mind which world they're in, (b) when 
migrating a class from ref to value they'll have to find and update all 
equality comparisons (!), (c) writing code that's generic over values 
and refs has to use an idiom that works on both, (d) when migrating code 
from ref-generic to any-generic, they'll have to inspect every equality 
comparison to make sure it's still what was intended.

P3: Add a new equality operator.  I've already been laughed at enough, 
thank you.

P4: Ban ==(val,val).  This might be fine in value-only code, but it 
complicates writing generic code, especially migrating generic code.


[1] John points out that if == is CE, then (a==b||a.equals(b)) will 
redundantly load the fields on failed ==.  But, many equals 
implementations start with "a==b" as a short-circuiting optimization, 
which means "a==b" will be a common (pure) subexpression in the 
resulting expansion (and for values, methods are monomorphic and will 
get inlined more frequently), so the two checks can be collapsed.
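
To illustrate the footnote, here is a hand-written sketch of the
expansion. Assumptions: Pair is an ordinary class standing in for a
value, so == below is today's reference comparison rather than CE on
values, and inlined() and collapsed() are written by hand to mimic
what an optimizer might do; they are not actual JIT output.

```java
final class Pair {
    final int first;
    final int second;
    Pair(int first, int second) { this.first = first; this.second = second; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;            // short-circuiting optimization
        if (!(o instanceof Pair)) return false;
        Pair p = (Pair) o;
        return first == p.first && second == p.second;
    }

    @Override public int hashCode() { return 31 * first + second; }
}

final class IdiomExpansion {
    // The usual idiom (a must be non-null, as in the original idiom).
    static boolean idiom(Pair a, Pair b) {
        return a == b || a.equals(b);
    }

    // equals() inlined by hand: "a == b" now appears twice.
    static boolean inlined(Pair a, Pair b) {
        return a == b
            || (a == b
                || (b != null && a.first == b.first && a.second == b.second));
    }

    // The repeated pure subexpression collapsed into one check.
    static boolean collapsed(Pair a, Pair b) {
        return a == b
            || (b != null && a.first == b.first && a.second == b.second);
    }

    public static void main(String[] args) {
        Pair a = new Pair(1, 2);
        Pair b = new Pair(1, 2);
        Pair c = new Pair(1, 3);
        System.out.println(idiom(a, b) + " " + inlined(a, b) + " " + collapsed(a, b));
        System.out.println(idiom(a, c) + " " + inlined(a, c) + " " + collapsed(a, c));
    }
}
```

All three forms return the same answer for any non-null a; the
collapsed form just performs the == check once.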


> Going back to op==, there are two plausible options for binding it to
> new types:
>
> (P1) Syntax of op==(val,val) and op==(any,any) binds to CE as with
> op==(ref,ref).  Therefore, NE is uniformly reached by today's idiom,
> which traverses value fields twice.
>
> (P2) Syntax of op==(val,val) and op==(any,any) is direct access to
> NE.  CE is reachable by experts at System.isEqualCopy.  The old idiom
> for NE still works, but it calls equals twice.
>
> (P3) Same as P1, op== is uniform access to CE.  New op (spelled
> "===", ".==", "=~", etc.) is uniform, optimizable access to NE,
> attracting users away from legacy idiom for NE.



More information about the valhalla-spec-observers mailing list