raw floating-point bits in '==' value object comparisons (again/still)

Mon Mar 11 17:43:36 UTC 2024

> From: "-" <liangchenblue at gmail.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "valhalla-spec-experts"
> <valhalla-spec-experts at openjdk.org>
> Sent: Monday, March 11, 2024 5:19:15 PM
> Subject: Re: raw floating-point bits in '==' value object comparisons
> (again/still)

> Hi Remi,
> I believe we can stick with bitwise equivalence, at least for now. The bitwise
> equivalence, in essence, is a remedy for the compatibility issues around ==
> with the removal of identity. Thus, it should not replace and is still not
> preferable to equals(), even though the migration happens to make == a good
> implementation in many more cases.

This is true for both the bitwise equivalence and the representational equivalence. The idea is not to replace equals(), but to provide an implementation of == for value types.

> For a Double object d, there are sets S0 that are all objects with the same
> identity as d, S1 that are all objects with the same bitwise representation
> (headers ignored of course) as d, S2 that are all objects with the same
> representational equivalence as d, and Sequals that are all objects that
> equals(d). We can see (<= for "is subset of") {d} == S0 <= S1 <= S2 <=
> (actually ==) Sequals, and whatever the set d == holds true against (call it
> S==) has S== <= Sequals.

> With the removal of identity, S0 is gone, so S1 becomes S==. What you call for
> is to use S2 for S==. I believe that the move from S1 to S2 will be a
> backward-compatible change in the future, but not from S2 to S1. Given the
> significant performance benefits of using S1 for S== instead of S2, I believe
> we can stay with the bitwise equivalence and investigate using S2 for S== if
> the future hardware improvements make it feasible.

I think you are too optimistic about the performance benefits of the bitwise equivalence over the representational equivalence. And having the right semantics is more important.

The representational equivalence semantics is the same as the bitwise equivalence semantics for all types except the floating-point ones. So for a lot of value classes, there is no difference.
You may think that the bitwise equivalence semantics allows comparing the contents of the fields using wider registers, but this is not true if you are using a concurrent GC (references may need patching), and it is not true if a field is a non-null value type represented as a pointer (the default value may be encoded as null). So it's far from clear to me that there is a "significant" performance benefit. Yes, checking whether a field is not the nominal NaN has a cost, but given that in most cases the branch will never be taken, we may have a hard time seeing the difference.
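To make the cost argument concrete, here is a minimal sketch of the two comparisons for a single double field, written in plain Java. The class and method names are hypothetical, and this is only one possible shape for the check, not what a VM would actually emit: the fast path is the same raw-bits comparison in both cases, and the NaN test only runs when the bits differ.

```java
public final class RepresentationalEq {
    // Bitwise equivalence: compare the raw 64-bit patterns,
    // so two NaNs with different payloads are considered different.
    public static boolean bitwiseEquals(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    // Representational equivalence (sketch): same fast path as above;
    // the extra branch only matters when the bits differ, and then the
    // two values are equivalent only if both are NaN.
    public static boolean representationalEquals(double a, double b) {
        if (Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b)) {
            return true;
        }
        return a != a && b != b; // x != x holds only for NaN
    }

    public static void main(String[] args) {
        // Quiet NaNs with different payloads (assumed to survive the
        // conversion unchanged on the platform running this code).
        double nan1 = Double.longBitsToDouble(0x7ff8000000000001L);
        double nan2 = Double.longBitsToDouble(0x7ff8000000000002L);

        System.out.println(bitwiseEquals(nan1, nan2));
        System.out.println(representationalEquals(nan1, nan2));
        // 0.0 and -0.0 stay distinct under both semantics, as with Double.equals().
        System.out.println(representationalEquals(0.0, -0.0));
    }
}
```

Under these assumptions, representationalEquals gives the same answer as comparing Double.doubleToLongBits(a) with Double.doubleToLongBits(b), which collapses every NaN to a single canonical bit pattern.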

And again, I prefer a clear and simple semantics to a mostly identical semantics with a foot gun attached that fires if a NaN is encoded in a funny way.
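The foot gun can be illustrated with standard Java today, using a record as a stand-in for a value class. This is a sketch under stated assumptions: the Temperature record and its bitwiseSame method are hypothetical, and bitwiseSame only models what == would observe for this one field under the bitwise equivalence.

```java
public final class NanFootGun {
    // Hypothetical stand-in for a value class. A record's generated equals()
    // compares double components the way Double.compare does, so all NaNs match.
    record Temperature(double celsius) {
        // Models '==' on this field under the bitwise equivalence (assumption).
        boolean bitwiseSame(Temperature other) {
            return Double.doubleToRawLongBits(celsius)
                    == Double.doubleToRawLongBits(other.celsius);
        }
    }

    public static void main(String[] args) {
        Temperature t1 = new Temperature(Double.NaN);
        // A NaN with a non-canonical payload (assumed to be preserved as-is
        // by the platform running this code).
        Temperature t2 = new Temperature(Double.longBitsToDouble(0x7ff8000000000001L));

        System.out.println(Double.isNaN(t2.celsius())); // both fields hold a NaN
        System.out.println(t1.equals(t2));              // equals() treats them as the same
        System.out.println(t1.bitwiseSame(t2));         // raw bits may still differ
    }
}
```

When the payload is preserved, the two instances are equal according to equals() yet differ bit for bit, which is the case where the bitwise equivalence and the representational equivalence disagree.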

> Regards,
> Chen Liang

Rémi 

> On Mon, Mar 11, 2024 at 8:34 AM Remi Forax <forax at univ-mlv.fr> wrote:

>> Last week, I explained at JChateau (think JCrete in France, less sun, more
>> chateaux) how value types work from the user POV, among other subjects
>> describing the semantics of ==.

>> First, most of the attendees knew the semantic difference between == on double
>> and Double.equals(). I suppose it's because people who attend such
>> (un-)conferences have a more intimate knowledge of Java than the average
>> developer. Second, no attendee knew that NaN was a prefix.

>> So it made me think again about that subject.

>> 1) Dan's argument that we want to be able to create a class with two
>> different NaNs does not hold, because instead of storing the values as double,
>> the values can be stored as long.

>> value class C {
>>   private double d;
>>   C(double d) { this.d = d; }
>>   long bits() { return Double.doubleToRawLongBits(d); }
>> }

>> C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
>> C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
>> assert c1.bits() != c2.bits();

>> can be rewritten as

>> value class C {
>>   private long l;
>>   C(double d) { this.l = Double.doubleToRawLongBits(d); }
>>   long bits() { return l; }
>> }

>> 2) The de-duplication of value instances by the GC works with both the bitwise
>> equivalence and the representational equivalence.

>> If the GC de-duplicates value instances based only on the bitwise
>> equivalence, it is still a valid algorithm under the representational equivalence.

>> So I am not convinced that the bitwise equivalence should be chosen instead of the
>> representational equivalence. For me, two semantics instead of three is a win.

>> Rémi