Value equality

Wed May 18 19:17:40 UTC 2016

Hi John, Brian,
As John suggested in his mail, ==(val, val) and ==(any, any) need not be aligned; that would be weird, but it has its own merits.

P5: merge P3 and P4: ban ==(valuetype, valuetype), but make ==(any, any) the new operator,
    which does x == y || x.equals(y) on a reference type, x.equals(y) on a value type, and x == y on a primitive type.
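
A rough sketch of the P5 dispatch, emulated in today's Java. Everything named here is hypothetical: ValueLike is only a marker standing in for "value type", Point is an ordinary final class playing the role of a value, and anyEquals stands in for the proposed ==(any, any). The null guard on the reference path is my own assumption; the one-line rule above does not say what happens with null.

```java
final class P5Equality {
    /** Hypothetical marker standing in for "this is a value type"; not a real API. */
    interface ValueLike {}

    /** Emulated value type: equality is component-wise via equals(). */
    static final class Point implements ValueLike {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        @Override public int hashCode() { return 31 * x + y; }
    }

    /** Primitive case: plain ==. */
    static boolean anyEquals(int x, int y) {
        return x == y;
    }

    /** Value and reference cases, selected at run time by the marker. */
    static boolean anyEquals(Object x, Object y) {
        if (x instanceof ValueLike) {
            // Value type: no identity check, go straight to equals().
            return x.equals(y);
        }
        // Reference type: identity first, then equals().
        // The null check is an assumption added so the sketch never throws.
        return x == y || (x != null && x.equals(y));
    }
}
```

With this, anyEquals(new Point(1, 2), new Point(1, 2)) and anyEquals(new String("a"), new String("a")) both hold, while a plain == on either pair would not.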

Rémi

----- Original message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "John Rose" <john.r.rose at oracle.com>
> Cc: valhalla-spec-experts at openjdk.java.net
> Sent: Wednesday, May 18, 2016 16:57:24
> Subject: Re: Value equality
> 
> Great summary of the options.
> 
> For those who didn't read the whole thing:
>   - CE is bitwise equality -- "are these two things identical copies"
>   - OE is calling Object.equals()
>   - NE (for values) is the synthetic "recurse with == on primitive
> components, NE on value components, and OE on reference components"
> 
> If it were 1995, and we were inventing Java (and we didn't have our
> heads addled with an interpreter-based cost model), what would we do?  I
> think we'd bind ==(ref,ref) to OE, with an (uglier-named) API point for
> CE (e.g., Objects.isSameReference) which would be used (a) for
> known-interned things, (b) for IdentityHashMap, (c) as a default
> implementation of Object.equals(), and (d) possibly as a
> short-circuiting optimization *inside* overrides of equals().
> 
> This hypothetical world (call it J') still gives users the choice of CE
> vs OE whenever they want, while nudging users towards OE (by giving it
> the prime syntactic real estate) which is probably what they want most
> of the time.
> 
> Why didn't we do this in 1995?  Hard to know (I'll ask James next time I
> see him), but I'd posit two main forces:
> 
>   - C bias.  Since C has *only* CE (and it was desirable to make Java
> feel like "a safer C") it probably seemed like a big improvement already
> to offer programmers both CE and OE on all references, and binding == to
> OE probably seemed too radical at the time.
> 
>   - Cost-model bias.  In the Java 1.0 days, pointer comparison was
> probably 100x faster in the interpreter than a virtual call to
> Object.equals().  If binding == to OE was even considered, it was
> probably deemed implausible.
> 
> Of course, both of these feel a bit silly 20 years later, but here we
> are.  So, in a J' world, what would we do with ==(val,val)?  I think it
> would be a no-brainer -- bind it to NE, since Java developers would
> already associate == with a deeper comparison.  Then we'd just have to
> adjust whatever the API point for CE is to also accommodate CE on values,
> and we'd be done.
> 
> But, we don't live in J' world.  So our choices become:
> 
> P1: Bind ==(val,val) to CE, as we do with refs.  Optimization challenges
> with the usual (a==b || a.equals(b)) idiom [1], but the rules work the
> same for values and refs.
> 
> P2: Bind ==(val,val) to NE.  This is J' world for values and J world for
> refs.  (With even bigger optimization challenges for the (a==b ||
> a.equals(b)) idiom.)  Rules are different for values and refs, meaning
> (a) users will have to keep in mind which world they're in, (b) when
> migrating a class from ref to value they'll have to find and update all
> equality comparisons (!), (c) writing code that's generic over values
> and refs has to use an idiom that works on both, (d) when migrating code
> from ref-generic to any-generic, inspect every equality comparison to
> make sure it's still what was intended.
> 
> P3: Add a new equality operator.  I've already been laughed at enough,
> thank you.
> 
> P4: Ban ==(val,val).  This might be fine in value-only code, but it
> complicates writing generic code, especially migrating generic code.
> 
> 
> [1] John points out that if == is CE, then (a==b||a.equals(b)) will
> redundantly load the fields on failed ==.  But, many equals
> implementations start with "a==b" as a short-circuiting optimization,
> which means "a==b" will be a common (pure) subexpression in the
> resulting expansion (and for values, methods are monomorphic and will
> get inlined more frequently), so the two checks can be collapsed.
> 
> 
> > Going back to op==, there are two plausible options for binding it to
> > new types:
> >
> > (P1) Syntax of op==(val,val) and op==(any,any) binds to CE as with
> > op==(ref,ref).  Therefore, NE is uniformly reached by today's idiom,
> > which traverses value fields twice.
> >
> > (P2) Syntax of op==(val,val) and op==(any,any) is direct access to
> > NE.  CE is reachable by experts at System.isEqualCopy.  The old idiom
> > for NE still works, but also calls equals twice.
> >
> > (P3) Same as P1, op== is uniform access to CE.  New op (spelled
> > "===", ".==", "=~", etc.) is uniform, optimizable access to NE,
> > attracting users away from legacy idiom for NE.
> 
>