Value object equality & floating-point values

Fri Feb 9 02:43:34 UTC 2024

Remi asked about the spec change last May that switched the `==` behavior on value objects that wrap floating points from a `doubleToLongBits` comparison to a `doubleToRawLongBits` comparison. Here's my recollection of the motivation.

First, a good summary of the different versions of floating point equality can be found here:
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation

It discusses three different concepts of equality for type 'double'.

- Numerical equality: The behavior of == acting on doubles, with special treatment for NaNs (never equal to themselves) and +0/-0 (distinct but considered equal)

- Representational equivalence: The behavior of `Double.equals` and `doubleToLongBits`-based comparisons, distinguishing +0 from -0, but with all NaN bit patterns considered equal to each other

- Bitwise equivalence: The behavior of `doubleToRawLongBits`-based comparisons, distinguishing +0 from -0, and with every NaN bit pattern distinguished from every other
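
To make the three relations concrete, here is a minimal sketch using only the standard `Double` methods. The class and method names (`EqualityKinds`, `numericallyEqual`, and so on) are made up for illustration. The two NaN values are quiet NaNs with different payloads, chosen on the assumption that the platform preserves quiet-NaN payloads through `longBitsToDouble`.

```java
final class EqualityKinds {
    // Numerical equality: the primitive == operator on doubles.
    static boolean numericallyEqual(double a, double b) {
        return a == b;
    }

    // Representational equivalence: doubleToLongBits collapses every NaN
    // to the canonical NaN bit pattern before comparing.
    static boolean representationallyEquivalent(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    // Bitwise equivalence: doubleToRawLongBits compares the bits as stored.
    static boolean bitwiseEquivalent(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    public static void main(String[] args) {
        // Two quiet NaNs that differ only in their payload bits (illustrative values).
        double nan1 = Double.longBitsToDouble(0x7ff8000000000001L);
        double nan2 = Double.longBitsToDouble(0x7ff8000000000002L);

        System.out.println("+0 vs -0, numerical:        " + numericallyEqual(0.0, -0.0));
        System.out.println("+0 vs -0, representational: " + representationallyEquivalent(0.0, -0.0));
        System.out.println("+0 vs -0, bitwise:          " + bitwiseEquivalent(0.0, -0.0));
        System.out.println("NaN vs itself, numerical:   " + numericallyEqual(nan1, nan1));
        System.out.println("NaNs, representational:     " + representationallyEquivalent(nan1, nan2));
        System.out.println("NaNs, bitwise:              " + bitwiseEquivalent(nan1, nan2));
    }
}
```

Only the last comparison separates the two NaNs, which is the distinction the rest of this note turns on.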

-----

Now turning to value objects.

Discussing the general concept of equivalence classes, the above reference has this to say: "At least for some purposes, all the members of an equivalence class are substitutable for each other. In particular, in a numeric expression equivalent values can be substituted for one another without changing the result of the expression, meaning changing the equivalence class of the result of the expression."

Value classes that wrap primitive floating point values will have their own notion of what version of "substitutable" they wish to work with, and so what equivalence classes they need. But, at bottom, the JVM and other applications need to have some least common denominator equivalence relation that supports substitutability for *all* value classes. That equivalence relation is bitwise equivalence.

That is, consider this class:

value class C {
    private double d;
    C(double d) { this.d = d; }
    long bits() { return Double.doubleToRawLongBits(d); }
}

C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
assert c1.bits() != c2.bits();

Will this assert ever fail? Well, it depends on whether the JVM treats c1 and c2 as belonging to the same equivalence class. If it does, it's allowed to substitute c1 for c2 at any time, and then c1.bits() could return c2's bit pattern. I think it's pretty clear that would be a mistake. So the JVM internals need to be operating in terms of bitwise equivalence of nested floating-point values.

Now consider another class:

value class D {
    double d;
    D(double d) { this.d = d; }
    public boolean equals(Object o) {
        return o instanceof D that && Math.abs(this.d - that.d) < 0.00001d;
    }
}

D d1 = new D(0.3);
D d2 = new D(0.1+0.2);
assert d1.d != d2.d;

Now we've got a class that wants to work with a much chunkier equivalence relation. (I kind of suspect this isn't an equivalence relation at all, sorry, floating-point experts. But you get the idea.) This class wouldn't mind if the VM *did* randomly swap out d1 for d2, because *in this application*, they're substitutable.
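
For what it's worth, the suspicion is easy to check: a fixed-tolerance comparison is not transitive, so it is not an equivalence relation. A minimal sketch, using the same 0.00001 tolerance as D.equals; the class name `ToleranceCheck` and the three sample values are picked purely for illustration:

```java
final class ToleranceCheck {
    // Same tolerance as D.equals above.
    static final double EPS = 0.00001d;

    // The comparison D.equals performs on the wrapped doubles.
    static boolean close(double a, double b) {
        return Math.abs(a - b) < EPS;
    }

    public static void main(String[] args) {
        // Illustrative values: neighbours are 0.000006 apart, the ends 0.000012 apart.
        double a = 0.0;
        double b = 0.000006;
        double c = 0.000012;

        System.out.println("a ~ b: " + close(a, b));
        System.out.println("b ~ c: " + close(b, c));
        System.out.println("a ~ c: " + close(a, c)); // transitivity would require this to hold
    }
}
```

Here a is "equal" to b and b is "equal" to c, but a is not "equal" to c. The point about domain-specific substitutability still stands.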

So: different classes will have different needs, we can't anticipate them all, but in certain contexts that lack domain knowledge (like VM optimizations), bitwise equivalence must be used.

Finally: must '==' be defined to reflect "least common denominator" substitutability, or could it be something else? Perhaps representational equivalence, which has some nice properties and can be conveniently expressed in terms of Double.equals?

In theory, sure, there's no reason we couldn't use representational equivalence for '==', and provide some other path to bitwise equivalence (Objects.isSubstitutable?).

But again, note that every class has its own domain-specific equivalence relation needs. This is captured by 'equals'. (Beyond floating point interpretations, don't forget that '==' will often not be the equivalence relation that value classes want for their identity object fields, so they'll need to override the default equals and make some recursive 'equals' calls.)
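
The identity-field case can be sketched in today's Java with an ordinary class standing in for a value class. The assumptions here: `Label` and `fieldwiseSame` are hypothetical names, and `fieldwiseSame` only emulates a field-by-field '==' comparison; it is not an actual API.

```java
final class Label {
    private final String text; // a field that holds an identity object

    Label(String text) {
        this.text = text;
    }

    // Emulation (an assumption, not a real API) of comparing two wrappers
    // field by field with ==, i.e. by the identity of the String field.
    static boolean fieldwiseSame(Label a, Label b) {
        return a.text == b.text;
    }

    // Domain equivalence: recurse into the field's own equals.
    @Override
    public boolean equals(Object o) {
        return o instanceof Label that && text.equals(that.text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct String objects with equal contents.
        Label a = new Label(new String("x"));
        Label b = new Label(new String("x"));

        System.out.println("field-by-field ==: " + fieldwiseSame(a, b));
        System.out.println("equals:            " + a.equals(b));
    }
}
```

The two wrappers differ under the field-by-field comparison but are equal under 'equals', which is why such a class would want to override the default.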

So we know Java programmers need to be conversant in at least two versions of value object equality: universal substitutability (using bitwise equivalence for floating points), and domain equivalence (defined by 'equals' methods). And traditionally, '==' on objects has been understood to mean universal substitutability. Do we really want to complicate matters further by asking programmers to keep track of *three* object equivalence relations, and teaching them that '==' doesn't *really* mean substitutability anymore? We decided that wasn't worth the trouble—ultimately, we just want to continue to encourage them to use 'equals' in most contexts.