Equality for values -- new analysis, same conclusion

Brian Goetz brian.goetz at oracle.com
Mon Aug 12 17:37:41 UTC 2019



> I think we should take a step back on that subject,
> because you are all jumping to the conclusion too fast in my opinion.
>
> Let starts by the beginning,
> the question about supporting == on inline type should first be guided by what we should have decided if inline types were present from the inception of Java, it's the usual trick when you want to retcon a feature.

This is a good question, and worth discussing.

> If we had inline types from the beginning, i believe we will never had allowed == on Object, the root type of the hierarchy, but have a special method call that will only work on indirect type like in C#.

Talk about jumping to conclusions too fast :)  This is surely one of the 
options, but far from the only one.

If we had the benefits of hindsight (both for how Java is used, and 
where it was going), we might instead have chosen the following total 
operators:

  - `==` is a substitutability test for all types
  - `===` delegates to equals() for all class types, and == for 
primitives (let's not discuss this further, as it is separable and 
surely not the problem on the table.)

Note that on "traditional" object references, Object== _is already_ a 
substitutability test.  In fact, on every type that `==` is defined 
today, it is a substitutability test (modulo NaN.)  So while we might 
have chosen a different path back then, we can still choose a path that 
is consistent with where we might have gone, by extending == to be a 
substitutability test for the new types.  This also seems the path of 
least astonishment.
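
To make "modulo NaN" concrete, here is a minimal sketch of how `==` 
behaves today on references and on doubles.  The class and method names 
are made up for illustration; only `Double.NaN`, `Double.valueOf` and 
`Double.equals` are real library API.

```java
class EqualityModuloNaN {
    // Primitive == on doubles follows IEEE 754: NaN is not equal to itself.
    static boolean primitiveEq(double a, double b) {
        return a == b;
    }

    // Double.equals compares the bits from doubleToLongBits,
    // so two NaN boxes compare equal.
    static boolean boxedEquals(double a, double b) {
        return Double.valueOf(a).equals(Double.valueOf(b));
    }

    // Reference == asks "is this the very same object?", which is
    // exactly the substitutability question for identity objects.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        Object x = new Object();
        Object y = x;
        System.out.println(sameReference(x, y));                  // true
        System.out.println(sameReference(x, new Object()));       // false
        System.out.println(primitiveEq(1.5, 1.5));                // true
        System.out.println(primitiveEq(Double.NaN, Double.NaN));  // false
        System.out.println(boxedEquals(Double.NaN, Double.NaN));  // true
    }
}
```

NaN is the one case shown here where `==` says "different" for two 
values that nothing else can tell apart.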

The problem with Object== is not that it is unsound, it's that it is 
_badly overused_.  This largely comes from coding conventions set very 
early in Java's lifetime, such as using `==` as a quick check both in 
the implementation of `equals()` methods, and before calling `equals()` 
(e.g., `x == y || x.equals(y)`).  And this overuse comes from 
performance assumptions from the Java 1.0 days, which were that 
everything was interpreted and virtual method calls like equals() were 
super-expensive.  This was true for the first few years, but in 
hindsight, these coding patterns are the boat anchor, not the semantics 
of Object== itself.  These patterns went from "necessary for 
performance" to "useless but harmless", and it is their harmlessness 
that has allowed them to survive.
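
For reference, the two conventions look roughly like this.  `Point` and 
`sameOrEqual` are hypothetical names chosen for the sketch, not code 
from any particular library.

```java
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        // The traditional quick check inside equals(): same reference.
        if (this == other) {
            return true;
        }
        if (!(other instanceof Point)) {
            return false;
        }
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }

    // The call-site form of the idiom: try == first, then equals().
    static boolean sameOrEqual(Object a, Object b) {
        return a == b || (a != null && a.equals(b));
    }
}
```

In both places the `==` check is a shortcut only; removing it would not 
change the result, which is why it is harmless on today's references.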

Also, let's be honest: the sole reason we're having this conversation is 
that we are concerned about the performance impact. That should surely 
be considered, but letting that dictate the semantics of language-level 
equality would be an extremely risky move -- and something we should 
consider with the utmost care and skepticism.

> i propose
> - to banned V== (compile error)
> - to make Object==and T== emit a compiler warning explaining that the code should be changed
> - add a method System.identityEquals(RefObject, RefObject) as replacement

I get why this is attractive to you, but I think it will be a constant 
source of confusion to users.  First, we've told users that one of the 
key use cases for value types is numerics.  Numerics are frequently 
compared for equality.  That users can't use `==` on numeric values at 
all will surely be a puzzlement, and not just once per user.  (There are 
things that we can explain to users, and they'll say "OK, I don't like 
it but I get it", but if we try to explain to them why they can't 
compare two Float16s for equality, their eyes will likely glaze over and 
they will say "you guys have gone off the deep end.")

Further, many algorithms need to use == to say "have I reached the 
sentinel value" or "is this the element I am looking for."  In 
performance-sensitive code, users want to use == in preference to 
equals().  This again will be a source of puzzlement.
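
A sketch of the kind of code I mean, using plain object references as 
they work today.  `END`, `countUntilSentinel` and `indexOfSame` are 
hypothetical names for illustration.

```java
class SentinelSearch {
    // A unique marker object; == is how we recognize it.
    static final Object END = new Object();

    // "Have I reached the sentinel value?"
    static int countUntilSentinel(Object[] items) {
        int count = 0;
        for (Object item : items) {
            if (item == END) {
                break;
            }
            count++;
        }
        return count;
    }

    // "Is this the element I am looking for?"
    static int indexOfSame(Object[] items, Object target) {
        for (int i = 0; i < items.length; i++) {
            if (items[i] == target) {
                return i;
            }
        }
        return -1;
    }
}
```

If the element type of such an algorithm were an inline type and `==` 
were banned on it, every loop like this would have to be rewritten.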

So this approach, while viable, has a much higher cognitive-load cost 
than you are imagining.  (Yes, you could say "when we have operator 
overloading, it won't be a problem."  Given that this is not coming for 
a while, if at all, I don't see this as an answer. And again, let's not 
discuss this now, as it is a distraction.)

I do agree that we should seek to discourage the over-use of `==` 
through compiler warnings and other tools.  But I think that's a 
separate and separable problem.

> Now, the second thing that disturb me is that no email of this thread, lists the two issues of the substitutibility test that make it unsuitable as an implementation of Object==.
> - it's not compatible with the primitive == on float and double, by example,

I think this is mostly a "whatabout" argument.  Yes, it's irritating.  
Yes, it's tiring to keep saying "modulo NaN".  Yes, it was probably a 
mistake.  But given the choice between:

  - NaN is so weird that we should just treat it as a removable 
discontinuity
  - See, NaN does it, so we have precedent, and now can do it wherever 
we like

there's a reasonable choice, and an insane choice.  The reason no one 
has brought it up is that no one wanted to advocate for the insane 
choice.  That seems sane to me.

>    has the stupid property of having == being true and equals() being false if value is NaN.

Yes, it's stupid.  Do we want to say "oops, we made a mistake there", or 
emulate that mistake forevermore?

> - it can be really slow
>      1) Object== can be megamorphic
>      2) Object== can do a recursive call
>    so it destroys the assumption that Object== is faster than equals.

This has been discussed extensively, so it puzzles me why you think it 
hasn't been discussed.  Yes, this is a big concern.  Yes, we should look 
for ways to mitigate this.  Yes, we should seek to discourage the 
rampant overuse of ==, and to the extent that the performance model has 
shifted, educate users about the new performance model.  But it is not, 
in itself, an argument why we should pick the wrong (or no) semantics 
for val==.
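
To show where the cost comes from, here is a hand-written model of a 
substitutability test.  This is only a sketch under stated assumptions, 
not how the VM would implement it: ordinary final classes stand in for 
inline types, the `ValueLike` interface and its `fields()` method are 
invented for the example, double fields are assumed to compare by raw 
bits, and other boxed primitives are not handled.

```java
class SubstitutabilitySketch {
    // Hypothetical stand-in for "this class is an inline type".
    interface ValueLike {
        Object[] fields();
    }

    static final class Complex implements ValueLike {
        final double re;
        final double im;
        Complex(double re, double im) { this.re = re; this.im = im; }
        public Object[] fields() { return new Object[] { re, im }; }
    }

    static final class Segment implements ValueLike {
        final Complex from;
        final Complex to;
        Segment(Complex from, Complex to) { this.from = from; this.to = to; }
        public Object[] fields() { return new Object[] { from, to }; }
    }

    static boolean substitutable(Object a, Object b) {
        if (a == b) {
            return true;                      // same identity, or both null
        }
        if (a == null || b == null || a.getClass() != b.getClass()) {
            return false;
        }
        if (a instanceof ValueLike) {
            // The dispatch on the dynamic type is the megamorphic part.
            Object[] fa = ((ValueLike) a).fields();
            Object[] fb = ((ValueLike) b).fields();
            for (int i = 0; i < fa.length; i++) {
                if (!substitutable(fa[i], fb[i])) {   // the recursive part
                    return false;
                }
            }
            return true;
        }
        if (a instanceof Double) {
            // Assumption of this sketch: doubles compare by raw bits.
            return Double.doubleToRawLongBits((Double) a)
                    == Double.doubleToRawLongBits((Double) b);
        }
        return false;                         // distinct identity objects
    }
}
```

A call on two `Segment` values recurses into their `Complex` fields, 
while a call on two ordinary objects stays a single reference compare.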

> so the only choice we have is to return false is the left or the right operand is an inline type.

OK, my turn to be disturbed.  Yes, this is a valid choice, and we can 
discuss it.  But to claim that it is the only choice ... well, to 
misquote Lord Vader: "I find your lack of imagination ... disturbing."

     https://youtu.be/m0XuKORufGk?t=20



More information about the valhalla-spec-observers mailing list