Proposal: Sameness operators

Fri May 15 08:05:08 PDT 2009

That's all fine, Joe, but what I was trying to illustrate is this:

If one argues that .compareTo is not a good fit for the engine behind  
an object based comparison operator, then one should also argue  
that .equals is not the right method to power the engine behind an  
object based equals operator. Somehow everyone agrees that .equals is  
congruent with ==, but < isn't congruent with compareTo, and I'm  
saying that makes no sense whatsoever. Either both are a good fit, or  
both are a bad fit.

I'm very /very/ much in the camp of: Both are good. The situations  
where floats and doubles make you go 'wahuh' are annoying but we  
already have them (the second point I was trying to make: This stuff  
is complicated. It always has been, it always will be; it's an  
intractably complex problem!)
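
To make the 'wahuh' cases concrete, here is a small sketch. The class and helper names are made up for illustration; the only real API used is java.lang.Double. It shows that == and .equals disagree about NaN in the same way that < and .compareTo disagree about -0.0:

```java
public class CongruenceDemo {
    // Primitive ==, which follows IEEE 754: NaN is never equal to anything.
    static boolean primitiveEquals(double a, double b) {
        return a == b;
    }

    // Boxed equals, which treats every NaN as equal to every other NaN.
    static boolean boxedEquals(double a, double b) {
        return Double.valueOf(a).equals(Double.valueOf(b));
    }

    // Primitive <, which treats -0.0 and 0.0 as the same value.
    static boolean primitiveLess(double a, double b) {
        return a < b;
    }

    // compareTo, which orders -0.0 strictly before 0.0.
    static boolean compareToLess(double a, double b) {
        return Double.valueOf(a).compareTo(b) < 0;
    }

    public static void main(String[] args) {
        double nan = 0 / 0.0;
        System.out.println(primitiveEquals(nan, nan));  // false
        System.out.println(boxedEquals(nan, nan));      // true
        System.out.println(primitiveLess(-0.0, 0.0));   // false
        System.out.println(compareToLess(-0.0, 0.0));   // true
    }
}
```

Both pairs disagree on edge cases only, which is why I say they stand or fall together.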

The reason it's intractable is that compareTo and equals are both  
used in totally separate use cases:

  1. mathematical operations,
  2. non-mathematical operations.

and they should act very differently depending on how they are used.

When I have a list of doubles that are the result of running a formula  
a couple of times with an input set of numbers, I want NaNs to  
consider themselves equal to each other, and to sort below (or above -  
either way) all other numbers, even the infinities. That way I can  
render the results in a GUI like so:

NaN: 10 times
-Infinity: 4 times
0: 2 times
5: 8 times
Infinity: 1 time

I certainly don't want each NaN to show up separately, or for  
compareTo to throw an exception, making my sort fail, because  
technically you can't compare NaNs to either themselves or any other  
number. In this case, the numbers are just end results; their numeric  
nature has no further computational meaning. They could be strings  
for all the program cares. *I* care about the numbers (the end user  
looking at the display), but to the end user there's no difference  
between "1" and 1, after all.
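
A rough sketch of that tally, assuming a TreeMap keyed on Double so that Double.compareTo decides both the grouping and the order. ResultTally and tally are hypothetical names, and Map.merge needs Java 8 or later:

```java
import java.util.Map;
import java.util.TreeMap;

public class ResultTally {
    // Counts how often each result occurs. Because the keys are compared
    // with Double.compareTo, all NaNs collapse into a single entry.
    static Map<Double, Integer> tally(double[] results) {
        Map<Double, Integer> counts = new TreeMap<>();
        for (double r : results) {
            counts.merge(r, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] results = {
            5, 0 / 0.0, Double.NEGATIVE_INFINITY, 0, Math.sqrt(-1), 5,
            Double.POSITIVE_INFINITY
        };
        for (Map.Entry<Double, Integer> e : tally(results).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + "x");
        }
    }
}
```

With Double.compareTo the NaN entry ends up after Infinity rather than before -Infinity, which is fine for this purpose. Note that -0.0 and 0.0 would be tallied as separate keys.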

However, when I have a list of doubles that represents the final score  
in a competition, and I want to use compareTo to determine the winner,  
I certainly don't want the guy that scored NaN to get first place  
because of a fluke of compareTo.
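
Here is a sketch of that fluke. ScoreBoard and both methods are invented for this example, and skipping NaN scores is just one possible policy, not something the JDK prescribes:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ScoreBoard {
    // Uses the natural ordering, i.e. Double.compareTo, where NaN is
    // greater than everything else, including positive infinity.
    static double naiveWinner(List<Double> scores) {
        return Collections.max(scores);
    }

    // Uses the primitive > operator, which is false whenever either side
    // is NaN, so a NaN score can never displace the current best.
    // Returns negative infinity if there is no usable score at all.
    static double winnerIgnoringNaN(List<Double> scores) {
        double best = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            if (s > best) {
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Double> scores = Arrays.asList(9.5, 0 / 0.0, 7.0);
        System.out.println(naiveWinner(scores));        // NaN
        System.out.println(winnerIgnoringNaN(scores));  // 9.5
    }
}
```

Same data, same types, and the 'right' answer depends entirely on what the numbers mean to the caller.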

There's no way for compareTo to ever know which case it's being  
called for. I posit that any system that could make it possible to  
know this is far too complicated to be worth it. Therefore, we can  
safely assume this is an intractable problem. This dichotomy will  
ALWAYS lead to java puzzlers. Period.

We must choose which one we want to accept here:

  A) The 'Perfection isn't feasible, but we're okay with almost  
perfect' option: We accept some puzzlers, but consider .compareTo  
and .equals both as adequate engines behind running comparison  
operations on *any* 2 objects, or

  B) The 'If it isn't perfect (and it will never be), then I don't  
want any part of it' option: We accept that java does not, *and  
will not ever*, have operators to compare 2 objects (be it for  
equality or for ordering).

There's no "C) We need to come up with something better" - this  
problem isn't a solvable one. It's one of those two.

I vote A. I bet, given the choice between A and B and nothing else,  
most other java programmers would too. We can discuss if I'm wrong  
about these being the only two options, of course. We can argue about  
what these operators should look like, and what they should do when  
you attempt to compare NaNs this way, but the answer is never going to  
satisfy everybody.

  --Reinier Zwitserloot

On May 15, 2009, at 07:22, Joseph D. Darcy wrote:

> Catching up on commenting...
>
> Reinier Zwitserloot wrote:
>> The argument that .compareTo should not be used because it isn't   
>> entirely congruent with either the meaning of .equals() or the   
>> meanings of all the comparison operators (from == to <) on the   
>> primitives, just doesn't hold water, for this simple reason:
>>
>> They make absolutely no sense now, either, and nobody cares about  
>> that.
>>
>
> I care about that! ;-)
>
> The situation is complicated, but there are reasons for the design  
> of the various floating-point comparison operations.
>
>> Here's some fun facts (all asserts are true):
>>
>> int x = 0/0; //ArithmeticException
>> double x = 0/0.0; //okay, x = NaN
>> double y = 0/0.0;
>>
>> assert x != y;
>> assert ! (x == y);
>> assert Double.valueOf(x).equals(Double.valueOf(y)); //WTF!
>> assert Double.valueOf(x).equals(y); //WTF!
>>
>> Clearly, equals is fundamentally broken. We need a new equals! Oh no!
>>
>
> Yes, strange but true, the "equals" relation on floating-point  
> values defined by IEEE 754 is *not* an equivalence relation.  This  
> odd situation arose to accommodate NaNs.  NaNs break the trichotomy  
> of exactly one of the less than, equal to, or greater than relations  
> holding between values.  A NaN is *unordered* compared to any  
> floating-point value, even itself.  Since NaNs are Not-a-Numbers  
> they do not obey the rules of normal numerical values.
>
> While the IEEE 754 design is driven by numerical programming  
> concerns, at times there is a need for a true equivalence relation  
> on floating-point values as well as a total ordering.  For example,  
> numerical regression tests in the JDK include a boolean  
> equivalent(double, double) method that returns true if both values  
> are NaN or if the values are ==.  Another oddity in the IEEE 754  
> equals relation is that two distinct values, -0.0 and +0.0, are  
> equal to each other.  These two values are *not* the same under IEEE  
> 754 operations because 1.0/-0.0 is negative infinity while 1.0/+0.0  
> is positive infinity.  For sorting, a real total order is needed.   
> Arrays.sort(double[]) uses the total ordering from Double.compareTo  
> where -0.0 is less than 0.0 and NaNs are equal to one another and  
> greater than positive infinity.
>
> [snip]
>
>
>> 4. BigDecimal and BigInteger are changed to implement   
>> Comparable<Number> instead. The fact that they currently don't is   
>> something I don't really understand and would consider filing as a bug  
>> if  I worked more with mixed BI/BD and primitives math. By doing  
>> this,  something like:
>>
>
> Number should really have been an interface more clearly just  
> meaning "convertible to a primitive type;" all Number lets you do is  
> convert to the  primitive types and converting to primitive is not  
> necessarily sufficient to let you do anything else with the value.   
> For example, one could write a Complex class that extended Number  
> and a quaternion class that extended Number too.  A class like  
> BigDecimal can't implement a sensible comparison on an unknown class  
> that just happens to implement Number.  Even properly comparing  
> numerical values of Number classes within the JDK is tricky; without  
> specific instanceof checks, both the double value and long value may  
> need to be extracted even for the primitive types.
>
> -Joe