Value object equality & floating-point values
Dan Smith
daniel.smith at oracle.com
Mon Feb 12 21:00:24 UTC 2024
Replying to some spec-observers discussion in this thread.
> On Feb 10, 2024, at 12:57 PM, Remi Forax <forax at univ-mlv.fr> wrote:
>
>> From: "Stephen Colebourne" <scolebourne at joda.org>
>> To: "Valhalla Expert Group Observers" <valhalla-spec-observers at openjdk.org>
>> Cc: "daniel smith" <daniel.smith at oracle.com>
>> Sent: Saturday, February 10, 2024 1:57:47 PM
>> Subject: Re: Value object equality & floating-point values
>
>> Note that the outcome of this is that all value types consisting only
>> of primitive type fields have == the same as the record-like .equals()
>> definition, which is a very good outcome.
>
> yes !
>
> And also all wrappers of primitive types have == the same as their .equals() definition.
Sounds like the main concern here is that '==' is too discriminating for the domain-specific equality tests normal programmers want to make.
I agree, but I don't think this is anything new. The '==' operator has never been the appropriate tool for domain-specific equality tests. It will coincidentally do the "right thing" for a subset of value classes, many of which are cases that only use primitive fields and other value class types (but see more on this below). How we handle floating-point values will affect a tiny fraction of use cases. It won't make a difference one way or another on the broader issue, which is whether programmers should routinely rely on '==' for domain-specific equality tests (answer: no).
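The divergence is already visible today with plain 'double' and java.lang.Double: IEEE 754 '==' and the bit-based 'equals' disagree on exactly the NaN and signed-zero cases under discussion. A minimal illustration:

```java
public class EqualityDemo {
    public static void main(String[] args) {
        // Primitive '==' follows IEEE 754: NaN is never equal to itself,
        // and +0.0 compares equal to -0.0.
        System.out.println(Double.NaN == Double.NaN);  // false
        System.out.println(0.0 == -0.0);               // true

        // Double.equals uses the bit-based ("representational") definition:
        // NaN equals NaN, and +0.0 does not equal -0.0.
        System.out.println(Double.valueOf(Double.NaN).equals(Double.NaN)); // true
        System.out.println(Double.valueOf(0.0).equals(-0.0));              // false
    }
}
```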
Even for something as simple as Double, it may initially seem obvious that '==' should match 'equals'. But think about other kinds of floating-point types that value classes are meant to enable. What about HalfFloat?
value class HalfFloat {
    private short bits;
}
How is '==' going to behave here? It's going to do a raw bit comparison. Are there multiple NaN encodings for HalfFloats? I'm not an FP expert, but I presume so. Should those different encodings be treated as equivalent by 'equals'? Given the precedent of Float and Double, definitely yes.
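For the record, IEEE 754 binary16 does have many NaN encodings (any pattern with an all-ones exponent and a nonzero significand). A sketch of what such an 'equals' might look like, following the Float/Double precedent of collapsing every NaN encoding onto one canonical pattern. The class shape, the canonical pattern 0x7E00, and the use of an ordinary final class (since 'value class' is still a Valhalla draft feature) are all illustrative assumptions:

```java
public final class HalfFloat {
    private final short bits;

    public HalfFloat(short bits) { this.bits = bits; }

    // In IEEE 754 binary16, a value is NaN when the exponent bits (0x7C00)
    // are all ones and the significand bits (0x03FF) are nonzero.
    private static boolean isNaN(short bits) {
        return (bits & 0x7C00) == 0x7C00 && (bits & 0x03FF) != 0;
    }

    // Collapse every NaN encoding onto one canonical pattern, mirroring
    // what Float.floatToIntBits does for float.
    private static short canonical(short bits) {
        return isNaN(bits) ? (short) 0x7E00 : bits;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof HalfFloat other
                && canonical(bits) == canonical(other.bits);
    }

    @Override
    public int hashCode() {
        return canonical(bits);
    }
}
```

Note that, like Double.equals, this is still a "representational" definition: +0.0 (0x0000) and -0.0 (0x8000) remain unequal.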
This is just to emphasize: "only declare primitive fields" or any similar rule is not going to be enough to guarantee that '==' will give you an appropriate domain-specific equality test for free. Equality is ultimately something we need programmers to define with 'equals'.
---
Stephen Colebourne suggested a normalization approach to floating-point field storage:
>> * For each `float` or `double` field in a value class, the constructor
>> will generate normalization code
>> * The normalization is equivalent to `longBitsToDouble(doubleToLongBits(field))`
>> * Normalization also applies to java.lang.Float and java.lang.Double
>> * == is a Bitwise implementation, but behaves like Representational
>> for developers
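The effect of the proposed normalization round-trip is easy to demonstrate: doubleToLongBits already collapses every NaN onto the canonical pattern, so running a field's value through it and back erases any payload bits. A small demo:

```java
public class NormalizeDemo {
    public static void main(String[] args) {
        // A non-canonical quiet NaN: canonical exponent and quiet bit,
        // plus extra payload bits in the low significand.
        double weirdNaN = Double.longBitsToDouble(0x7ff8000000001234L);

        // The proposed normalization round-trips through doubleToLongBits,
        // which maps every NaN to the canonical 0x7ff8000000000000L pattern.
        double normalized =
                Double.longBitsToDouble(Double.doubleToLongBits(weirdNaN));

        System.out.printf("normalized bits: %016x%n",
                Double.doubleToRawLongBits(normalized));
    }
}
```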
The Oracle-internal discussion last spring covered similar ground. There are different ways to stack it, but what they have in common is an interest in eradicating NaN encoding variations as some sort of unwanted anomaly. I get that this is often the case (for the tiny fraction of programmers who ever encounter NaN in the first place). But let's not overlook the fact that, since 1.3, there's an API that explicitly supports these encodings and promises to preserve them (Double.doubleToRawLongBits and Double.longBitsToDouble). Note that Double is a value class that wraps a field of type 'double'. Flattening out NaN encoding differences in the wrapped field would break that API.
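The contract at stake is the contrast between the two bit-extraction methods: doubleToRawLongBits hands back the stored encoding, payload included, while doubleToLongBits collapses every NaN. (One caveat: the longBitsToDouble spec allows some platforms to quiet or canonicalize NaNs on the way in, so the payload round-trip is best-effort there; on mainstream JVMs it survives.) A sketch:

```java
public class RawBitsDemo {
    public static void main(String[] args) {
        // A quiet NaN with a nonzero payload in the low significand bits.
        double payloadNaN = Double.longBitsToDouble(0x7ff8000000000042L);

        // doubleToRawLongBits (since 1.3) returns the encoding as stored...
        long raw = Double.doubleToRawLongBits(payloadNaN);

        // ...while doubleToLongBits maps every NaN to the canonical pattern.
        long collapsed = Double.doubleToLongBits(payloadNaN);

        System.out.printf("raw:       %016x%n", raw);
        System.out.printf("collapsed: %016x%n", collapsed);
    }
}
```

Normalizing the wrapped field in Double's constructor would make `raw` indistinguishable from `collapsed`, which is exactly the API break described above.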
(Could we work around that by changing the type of the wrapped field to 'long'? I mean, abstractly speaking, I guess... But now we're back to a class Double whose '==' and 'equals' methods disagree.)