Equality for values -- new analysis, same conclusion

forax at univ-mlv.fr forax at univ-mlv.fr
Mon Aug 12 19:12:38 UTC 2019


----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Monday, 12 August 2019 19:37:41
> Subject: Re: Equality for values -- new analysis, same conclusion

>> I think we should take a step back on that subject,
>> because you are all jumping to the conclusion too fast in my opinion.
>>
>> Let's start at the beginning:
>> the question of supporting == on inline types should first be guided by what
>> we would have decided if inline types had been present from the inception of Java;
>> it's the usual trick when you want to retcon a feature.
> 
> This is a good question, and worth discussing.
> 
>> If we had had inline types from the beginning, I believe we would never have allowed
>> == on Object, the root type of the hierarchy, but would instead have had a special
>> method call that only works on indirect types, like in C#.
> Talk about jumping to conclusions too fast :)  This is surely one of the
> options, but by far not the only.
> 
> If we had the benefits of hindsight (both for how Java is used, and
> where it was going), we might instead have chosen the following total
> operators:
> 
>  - `==` is a substitutability test for all types
>  - `===` delegates to equals() for all class types, and == for
> primitives (let's not discuss this further, as it is separable and
> surely not the problem on the table.)
> 
> Note that on "traditional" object references, Object== _is already_ a
> substitutability test.  In fact, on every type that `==` is defined
> today, it is a substitutability test (modulo NaN.)  So while we might
> have chosen a different path back then, we can still choose a path that
> is consistent with where we might have gone, by extending == to be a
> substitutability test for the new types.  This also seems the path of
> least astonishment.

And here we disagree.
First, you don't have to extend ==; you can let it die.

Second, Object== on an indirect type is not a substitutability test: two strings can be equal according to equals() and yet not equal according to Object==. So it is a substitutability test only when it returns true; when it returns false, you don't know.
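
To make the string case concrete, here is a minimal Java sketch (the class and method names are only illustrative). Two distinct String objects hold the same characters, so equals() says yes while the reference comparison says no:

```java
final class StringIdentityDemo {
    // Reference comparison of two distinct String objects with equal content.
    static boolean sameReference() {
        String a = new String("valhalla");
        String b = new String("valhalla"); // 'new' always creates a fresh object
        return a == b;
    }

    // Content comparison of the same two kinds of objects.
    static boolean sameContent() {
        String a = new String("valhalla");
        String b = new String("valhalla");
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameReference()); // false
        System.out.println(sameContent());   // true
    }
}
```

A false result from == therefore says nothing about whether the two values are interchangeable.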


> 
> The problem with Object== is not that it is unsound, it's that it is
> _badly overused_.  This largely comes from coding conventions set very
> early in Java's lifetime, such as using `==` as a quick check both in
> the implementation of `equals()` methods, and before calling `equals()`
> (e.g., `x == y || x.equals(y)`).  And this overuse comes from
> performance assumptions from the Java 1.0 days, which were that
> everything was interpreted and virtual method calls like equals() were
> super-expensive.  This was true for the first few years, but in
> hindsight, these coding patterns are the boat anchor, not the semantics
> of Object== itself.  These patterns went from "necessary for
> performance" to "useless but harmless", and it is their harmlessness
> that has allowed them to survive.
> 
> Also, let's be honest: the sole reason we're having this conversation is
> that we are concerned about the performance impact. That should surely
> be considered, but letting that dictate the semantics of language-level
> equality would be an extremely risky move -- and something we should
> consider with the utmost of care and skepticism.

Don't use ==, use equals(). It's something I repeat over and over (and over) to my students:
if you want to test equality, use equals().

> 
>> I propose
>> - to ban V== (compile error)
>> - to make Object== and T== emit a compiler warning explaining that the code
>> should be changed
>> - add a method System.identityEquals(RefObject, RefObject) as replacement
> 
> I get why this is attractive to you, but I think it will be a constant
> source of confusion to users.  First, we've told users that one of the
> key use cases for value types is numerics.  Numerics are frequently
> compared for equality.  That users can't use `==` on numeric values at
> all will surely be a puzzlement, and not just once per user.  (There are
> things that we can explain to users, and they'll say "OK, I don't like
> it but I get it", but if we try to explain to them why they can't
> compare two Float16s for equality, their eyes will likely glaze over and
> will say "you guys have gone off the deep end.")

Don't use ==, use equals().

> 
> Further, many algorithms need to use == to say "have I reached the
> sentinel value" or "is this the element I am looking for."  In
> performance-sensitive code, users want to use == in preference to
> equals().  This again will be a source of puzzlement.

Either you are using indirect types and you can use System.identityEquals, or you have to find a creative way to mark an inline object as the sentinel. For example, the C# equivalent of HashMap uses the sign bit of the hashCode field of the inline object corresponding to a hashtable entry to mark that entry as a sentinel.
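
A minimal Java sketch of that sign-bit trick, under stated assumptions: Entry, storedHash, occupied, free and isFree are hypothetical names, Entry is an ordinary class standing in for an inline type, and this is not the actual .NET implementation. The point is only that the sentinel is recognised from a field of the entry, without ever comparing entries with ==:

```java
final class SignBitSentinelDemo {
    // Hypothetical table entry; with Valhalla this could be an inline class.
    static final class Entry {
        int hash;      // >= 0 for an occupied entry, negative for a free one
        Object key;
        Object value;
    }

    // Clear the sign bit so that a stored hash is never negative.
    static int storedHash(Object key) {
        return key.hashCode() & 0x7FFFFFFF;
    }

    static Entry occupied(Object key, Object value) {
        Entry e = new Entry();
        e.hash = storedHash(key);
        e.key = key;
        e.value = value;
        return e;
    }

    // A free entry is marked by a negative hash (sign bit set).
    static Entry free() {
        Entry e = new Entry();
        e.hash = -1;
        return e;
    }

    // The sentinel test reads one int field; no == on the entry is needed.
    static boolean isFree(Entry e) {
        return e.hash < 0;
    }
}
```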

> 
> So this approach, while viable, has a much higher cognitive-load cost
> than you are imagining.  (Yes, you could say "when we have operator
> overloading, it won't be a problem."  Given that this is not coming for
> a while, if at all, I don't see this as an answer. And again, let's not
> discuss this now, as it is a distraction.)
> 
> I do agree that we should seek to discourage the over-use of `==`
> through compiler warnings and other tools.  But I think that's a
> separate and separable problem.

We are moving to a world with 3 kinds of values (primitive, indirect, inline). The way to go back to 2 kinds of values is to retcon primitive types as inline types; if we achieve that, we will be able to call .equals() on primitives too, making the last "safe" usage of == disappear.

> 
>> Now, the second thing that disturbs me is that no email in this thread lists the
>> two issues with the substitutability test that make it unsuitable as an
>> implementation of Object==.
>> - it's not compatible with the primitive == on float and double, for example,
> 
> I think this is mostly a "whatabout" argument.  Yes, it's irritating.
> Yes, it's tiring to keep saying "modulo NaN".  Yes, it was probably a
> mistake.  But given the choice between:
> 
>  - NaN is so weird that we should just treat it as a removable
> discontinuity
>  - See, NaN does it, so we have precedent, and now can do it wherever
> we like
> 
> there's a reasonable choice, and an insane choice.  The reason no one
> has brought it up is because no one wanted to advocate for the insane
> choice.  That seems sane to me.
> 
>>    has the stupid property of having == being true and equals() being false if
>>    value is NaN.
> 
> Yes, it's stupid.  Do we want to say "oops, we made a mistake there", or
> emulate that mistake forevermore?

It exposes a flaw in the proposed implementation of the substitutability test:
- it shows that we will have to change the definition of Object.equals to force it to be more precise than the substitutability test, something not required by the current javadoc/spec.
- it's the kind of issue that may make our lives miserable when we want to see the primitive types as inline types.
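
For reference, here is how the existing primitive and boxed comparisons already disagree today, as a small Java sketch (the class and method names are only illustrative). The bitwise comparison is used here as a stand-in for what a field-by-field substitutability test would see; that reading is an assumption, not something specified in this thread:

```java
final class NaNEqualityDemo {
    // Primitive == on double: NaN is never equal to itself.
    static boolean primitiveNaN() {
        double x = Double.NaN;
        double y = Double.NaN;
        return x == y;
    }

    // Double.equals compares the bit patterns, so NaN equals NaN.
    static boolean boxedNaN() {
        return Double.valueOf(Double.NaN).equals(Double.valueOf(Double.NaN));
    }

    // Raw bit comparison of the same NaN constant.
    static boolean bitwiseNaN() {
        return Double.doubleToRawLongBits(Double.NaN)
            == Double.doubleToRawLongBits(Double.NaN);
    }

    // The opposite disagreement: 0.0 and -0.0 are == but not equals().
    static boolean primitiveZeros() {
        double x = 0.0;
        double y = -0.0;
        return x == y;
    }

    static boolean boxedZeros() {
        return Double.valueOf(0.0).equals(Double.valueOf(-0.0));
    }

    public static void main(String[] args) {
        System.out.println(primitiveNaN());   // false
        System.out.println(boxedNaN());       // true
        System.out.println(bitwiseNaN());     // true
        System.out.println(primitiveZeros()); // true
        System.out.println(boxedZeros());     // false
    }
}
```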

> 
>> - it can be really slow
>>      1) Object== can be megamorphic
>>      2) Object== can do a recursive call
>>    so it destroys the assumption that Object== is faster than equals.
> 
> This has been discussed extensively, so it puzzles me why you think it
> hasn't been discussed.  Yes, this is a big concern.  Yes, we should look
> for ways to mitigate this.  Yes, we should seek to discourage the
> rampant overuse of ==, and to the extent that the performance model has
> shifted, educate users about the new performance model.  But it is not,
> in itself, an argument why we should pick the wrong (or no) semantics
> for val==.

If the perf model has shifted, that is exactly why we should not try to provide a remotely useful val==; otherwise more people will start to use == more.
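
To illustrate the cost concern, here is a hand-written Java model of a substitutability test. Everything in it is an assumption for illustration: Point and Segment are ordinary final classes standing in for inline classes, and substitutable() is a hypothetical helper, not how a JVM would implement val==. It shows the two points raised earlier: the test has to dispatch on the runtime class of its operands, and it recurses when a value contains other values:

```java
final class SubstitutabilityCostDemo {
    // Stand-ins for inline classes (hypothetical).
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static final class Segment {
        final Point start, end;
        Segment(Point start, Point end) { this.start = start; this.end = end; }
    }

    // Hand-written model of a substitutability test on Object operands.
    static boolean substitutable(Object a, Object b) {
        if (a == b) return true;                 // same reference or both null
        if (a == null || b == null) return false;
        if (a.getClass() != b.getClass()) return false;
        if (a instanceof Point) {                // dispatch on the runtime class
            Point p = (Point) a;
            Point q = (Point) b;
            return p.x == q.x && p.y == q.y;
        }
        if (a instanceof Segment) {              // recursive call per nested value
            Segment s = (Segment) a;
            Segment t = (Segment) b;
            return substitutable(s.start, t.start) && substitutable(s.end, t.end);
        }
        return false;                            // identity objects: references differ
    }
}
```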

> 
>> so the only choice we have is to return false if the left or the right operand
>> is an inline type.
> 
> OK, my turn to be disturbed.  Yes, this is a valid choice, and we can
> discuss it.  But to claim that it is the only choice ... well, to
> misquote Lord Vader: "I find your lack of imagination ... disturbing."
> 
>      https://youtu.be/m0XuKORufGk?t=20

It's the only choice if you agree that we want to demote ==.

Rémi

