From john.r.rose at oracle.com Tue Feb 6 15:56:32 2024 From: john.r.rose at oracle.com (John Rose) Date: Tue, 06 Feb 2024 07:56:32 -0800 Subject: JVMS changes for JEP 401 In-Reply-To: References: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> <1119609899.115393289.1706547974233.JavaMail.zimbra@univ-eiffel.fr> Message-ID: On 29 Jan 2024, at 10:22, Dan Smith wrote: >> >> As I understand the current design intent, a constructor that calls >> 'this' is forbidden from assigning to a strict field, because that >> will lead to a duplicate assignment in a constructor that calls >> 'super'. >> >> This language appears to say the opposite. It seems to state that only >> the first constructor of the current class can assign the class's >> strict instance fields. If that is the case, then further constructors >> (including the one that calls 'super') are forbidden from assigning to >> strict instance fields. > > Hello, > the problem is not duplicate assignment, as you said that part of the spec allows that, > the problem is assignment of final fields after a call to super() or this() because when the assignment occurs, 'this' may have already leaked to another context. > > Right. > > There is a *language* rule that final fields must not be assigned more than once. (Specifically, they must be definitely unassigned before their assignment; we'll add a rule that final fields must be definitely assigned before a 'this()' call.) > > In the *JVM*, the only new thing we're guarding against is assignments after the constructor call. All assignments before a constructor call, duplicate or not, are allowed. This might be worth a non-normative note, to the effect of, "One may notice that the VM permits strict fields to be assigned before this-calls. Java compilers do not issue putfield instructions at such points, and we do not expect them to do so in the future. A special restriction on such assignments would add no value." 
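For source-level intuition, the language rule discussed above has the same shape as today's definite-assignment rules for ordinary final fields; a minimal sketch (strict fields are not expressible in released Java, so plain final fields and a made-up class name stand in):

```java
class Point {
    final int x;
    final int y;

    // The constructor that (implicitly) calls super() assigns the final fields.
    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // A constructor that calls this() must not assign them again: by the time
    // this() returns, x and y are definitely assigned, and a final field must
    // be definitely unassigned at the point of any assignment to it.
    Point() {
        this(0, 0);
        // this.x = 1;  // would not compile: x might already have been assigned
    }
}
```

The proposed rule that final fields be definitely assigned before a 'this()' call would make the delegating constructor above the only legal form.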
From john.r.rose at oracle.com Tue Feb 6 16:00:01 2024 From: john.r.rose at oracle.com (John Rose) Date: Tue, 06 Feb 2024 08:00:01 -0800 Subject: JVMS changes for JEP 401 In-Reply-To: <96F459D5-1F04-47A8-A1DD-3C7E9B743C8D@oracle.com> References: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> <1119609899.115393289.1706547974233.JavaMail.zimbra@univ-eiffel.fr> <96F459D5-1F04-47A8-A1DD-3C7E9B743C8D@oracle.com> Message-ID: <310B66B3-FAC7-412A-A1DC-901B9923445F@oracle.com> On 29 Jan 2024, at 10:24, Dan Smith wrote: > On Jan 29, 2024, at 10:22 AM, daniel.smith at oracle.com wrote: > > we'll add a rule that final fields must be definitely assigned before a 'this()' call Or that, I suppose. Seems like busy-work to me, though. We never had such restrictions before on putfields. Since a single init-method can issue duplicate putfields (in the VM), there does not seem to be any harm in a this-calling init method issuing additional duplicate putfields. Really, I'm OK with a rule here, but would prefer not to issue a rule. From liangchenblue at gmail.com Tue Feb 6 19:17:34 2024 From: liangchenblue at gmail.com (-) Date: Tue, 6 Feb 2024 13:17:34 -0600 Subject: JVMS changes for JEP 401 In-Reply-To: References: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> <1119609899.115393289.1706547974233.JavaMail.zimbra@univ-eiffel.fr> Message-ID: On a side note, does the VM spec require all strict fields to be already assigned via putfield (i.e. no missing assignment, even ones with default zero values) before a super constructor call? (This doesn't affect the correctness of strict fields, as unassigned fields will just have default zero values published.) If it doesn't, having no restriction on strict assignment for this-delegating constructors would be a great step toward simplification; but if it does, I think the super-calling constructor has diverged from the this-calling constructor enough that adding another no-assignment-in-this-caller rule doesn't hurt that much. 
On Tue, Feb 6, 2024 at 10:01 AM John Rose wrote: > On 29 Jan 2024, at 10:22, Dan Smith wrote: > > >> > >> As I understand the current design intent, a constructor that calls > >> 'this' is forbidden from assigning to a strict field, because that > >> will lead to a duplicate assignment in a constructor that calls > >> 'super'. > >> > >> This language appears to say the opposite. It seems to state that only > >> the first constructor of the current class can assign the class's > >> strict instance fields. If that is the case, then further constructors > >> (including the one that calls 'super') are forbidden from assigning to > >> strict instance fields. > > > > Hello, > > the problem is not duplicate assignment, as you said that part of the > spec allows that, > > the problem is assignment of final fields after a call to super() or > this() because when the assignment occurs, 'this' may have already leaked > to another context. > > > > Right. > > > > There is a *language* rule that final fields must not be assigned more > than once. (Specifically, they must be definitely unassigned before their > assignment; we'll add a rule that final fields must be definitely assigned > before a 'this()' call.) > > > > In the *JVM*, the only new thing we're guarding against is assignments > after the constructor call. All assignments before a constructor call, > duplicate or not, are allowed. > > This might be worth a non-normative note, to the effect of, "One may > notice that the VM may assign strict fields before this-calls in the VM. > Java compilers do not issue putfield instructions at such points, and we do > not expect them to do so in the future. A special restriction on such > assignments would add no value." 
From daniel.smith at oracle.com Tue Feb 6 20:05:15 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 6 Feb 2024 20:05:15 +0000 Subject: JVMS changes for JEP 401 In-Reply-To: References: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> <1119609899.115393289.1706547974233.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <41CC8C54-28D9-4A32-B698-CCD4FBA5ECF9@oracle.com> > On Feb 6, 2024, at 11:17 AM, liangchenblue at gmail.com wrote: > > On a side note, does the VM spec require all strict fields to be already assigned via putfield (i.e. no missing assignment, even ones with default zero values) before a super constructor call? No. The JVM doesn't care how many assignments you make to a field; you just can't make assignments outside of an authorized context. (For final instance fields, "authorized context" = "<init> method of the same class"; for strict-final instance fields, "authorized context" = "<init> method of the same class, before a this/super call".) > (This doesn't affect the correctness of strict fields, as unassigned fields will just have default zero values published) Right. We did explore applying these same concepts to null-restricted, non-implicitly-constructible fields, and in that case, it can be important for the JVM to prove that a field is written *at least once*, because there is no default value. But mandatory writes are not a capability we need for now. From daniel.smith at oracle.com Wed Feb 7 15:29:01 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 7 Feb 2024 15:29:01 +0000 Subject: EG meeting *canceled*, 2024-02-07 Message-ID: Nothing new to discuss this time, so today's EG meeting is canceled. 
From forax at univ-mlv.fr Wed Feb 7 19:52:16 2024 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 7 Feb 2024 20:52:16 +0100 (CET) Subject: JVMS changes for JEP 401 In-Reply-To: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> References: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> Message-ID: <2001541550.999867.1707335536135.JavaMail.zimbra@univ-eiffel.fr> Hello, in section 6.5, the specified behavior of acmp_eq/acmp_ne for fields of type float and double has been changed. The last time we discussed that subject, the idea was to align the semantics of a field of type float (double) to the semantics of Float.equals() (Double.equals()), but now the spec has been changed to compare the bitwise representations. What is the underlying reason for that change? Rémi ----- Original Message ----- > From: "daniel smith" > To: "valhalla-spec-experts" > Sent: Wednesday, January 24, 2024 2:11:18 AM > Subject: JVMS changes for JEP 401 > I've posted a revised spec change document for JEP 401, Value Classes and > Objects, here: > > https://cr.openjdk.org/~dlsmith/jep401/jep401-20240116/specs/value-objects-jvms.html > > Not covered by JVMS is anything to do with our internal-only null restriction > annotations, or (naturally) anything about scalarization/flattening. Of course > those things represent a large chunk of the code changes that we anticipate > delivering with JEP 401. 
> > We've already identified a few issues from internal reviews, which I'll address > in a followup iteration: > > - 5.3.5 needs to allow JVM implementations to "speculatively" load classes > mentioned by Preload, enabling early field layout computations ("speculative" > because the attempt may fail due to circularities, but succeed later) > > - The categorization of attributes in 4.7 is not a great fit for Preload; maybe > we need to redefine the categories a bit > > - The definition of the Preload attribute could use more (perhaps non-normative) > description of how it works, even though this is mostly below the level of the > spec; 5.4 could also help clarify this > > - The reference to the "start of object construction" in the table about > ACC_STRICT is vague; maybe we can do better > > - The 4.9.2 rule about ACC_STRICT is hard to parse, will be rephrased somehow > > - 2.4 should say more about "sameness" as it relates to "identity" From daniel.smith at oracle.com Wed Feb 7 21:21:34 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 7 Feb 2024 21:21:34 +0000 Subject: JVMS changes for JEP 401 In-Reply-To: <2001541550.999867.1707335536135.JavaMail.zimbra@univ-eiffel.fr> References: <147936FE-F316-4C29-BFD4-DAE6B0DA09D5@oracle.com> <2001541550.999867.1707335536135.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <6425DB18-9707-4C9D-B097-0D7F9D0E35A8@oracle.com> > On Feb 7, 2024, at 11:52 AM, Remi Forax wrote: > > Hello, > in section 6.5, the specified behavior of acmp_eq/acmp_ne for fields of type float and double has been changed. > > The last time we discussed that subject, the idea was to align the semantics of a field of type float (double) to the semantics of Float.equals() (Double.equals()), but now the spec has been changed to compare the bitwise representations. > > What is the underlying reason for that change ? Yes, changed with the last refresh in May. 
I don't see a discussion about this in the EG mail archive, so I'll do some digging and see if I can reconstruct the arguments for doing it this way. From daniel.smith at oracle.com Fri Feb 9 02:43:34 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 9 Feb 2024 02:43:34 +0000 Subject: Value object equality & floating-point values Message-ID: Remi asked about the spec change last May that switched the `==` behavior on value objects that wrap floating points from a `doubleToLongBits` comparison to a `doubleToRawLongBits` comparison. Here's my recollection of the motivation. First, a good summary of the different versions of floating point equality can be found here: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation It discusses three different concepts of equality for type 'double'. - Numerical equality: The behavior of == acting on doubles, with special treatment for NaNs (never equal to themselves) and +0/-0 (distinct but considered equal) - Representational equivalence: The behavior of `Double.equals` and `doubleToLongBits`-based comparisons, distinguishing +0 from -0, but with all NaN bit patterns considered equal to each other - Bitwise equivalence: The behavior of `doubleToRawLongBits`-based comparisons, distinguishing +0 from -0, and with every NaN bit pattern distinguished from every other ----- Now turning to value objects. Discussing the general concept of equivalence classes, the above reference has this to say: "At least for some purposes, all the members of an equivalence class are substitutable for each other. In particular, in a numeric expression equivalent values can be substituted for one another without changing the result of the expression, meaning changing the equivalence class of the result of the expression." 
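The three relations above can be checked directly with ordinary doubles; a quick sketch (the class name is made up; note that whether longBitsToDouble preserves a non-canonical NaN payload is implementation-dependent, though common JVMs do preserve it):

```java
public class FpEquality {
    public static void main(String[] args) {
        // A NaN with a non-canonical payload (assumed preserved by this JVM).
        double nan = Double.longBitsToDouble(0x7ff0000000000001L);

        // Numerical equality (==): NaN is never equal to itself; +0.0 == -0.0.
        System.out.println(nan == nan);   // false
        System.out.println(0.0 == -0.0);  // true

        // Representational equivalence (Double.equals / doubleToLongBits):
        // every NaN bit pattern collapses to the one canonical NaN,
        // but +0.0 and -0.0 are distinguished.
        System.out.println(Double.valueOf(nan).equals(Double.valueOf(Double.NaN))); // true
        System.out.println(Double.valueOf(0.0).equals(Double.valueOf(-0.0)));       // false

        // Bitwise equivalence (doubleToRawLongBits): distinct NaN payloads
        // stay distinct (on JVMs that preserve them).
        System.out.println(Double.doubleToRawLongBits(nan)
                           == Double.doubleToRawLongBits(Double.NaN));
    }
}
```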
Value classes that wrap primitive floating point values will have their own notion of what version of "substitutable" they wish to work with, and so what equivalence classes they need. But, at bottom, the JVM and other applications need to have some least common denominator equivalence relation that supports substitutability for *all* value classes. That equivalence relation is bitwise equivalence. That is, consider this class:

value class C {
    private double d;
    C(double d) { this.d = d; }
    long bits() { return Double.doubleToRawLongBits(d); }
}

C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
assert c1.bits() != c2.bits();

Will this assert ever fail? Well, it depends on whether the JVM treats c1 and c2 as belonging to the same equivalence class. If it does, it's allowed to substitute c1 for c2 at any time. I think it's pretty clear that would be a mistake. So the JVM internals need to be operating in terms of bitwise equivalence of nested floating-point values. Now consider another class:

value class D {
    double d;
    D(double d) { this.d = d; }
    public boolean equals(Object o) {
        return o instanceof D that && Math.abs(this.d - that.d) < 0.00001d;
    }
}

D d1 = new D(0.3);
D d2 = new D(0.1+0.2);
assert d1.d != d2.d;

Now we've got a class that wants to work with a much chunkier equivalence relation. (I kind of suspect this isn't an equivalence relation at all, sorry, floating-point experts. But you get the idea.) This class wouldn't mind if the VM *did* randomly swap out d1 for d2, because *in this application*, they're substitutable. So: different classes will have different needs, we can't anticipate them all, but in certain contexts that lack domain knowledge (like VM optimizations), bitwise equivalence must be used. Finally: must '==' be defined to reflect "least common denominator" substitutability, or could it be something else? 
Perhaps representational equivalence, which has some nice properties and can be conveniently expressed in terms of Double.equals? In theory, sure, there's no reason we couldn't use representational equivalence for '==', and provide some other path to bitwise equivalence (Objects.isSubstitutable?). But again, note that every class has its own domain-specific equivalence relation needs. This is captured by 'equals'. (Beyond floating point interpretations, don't forget that '==' will often not be the equivalence relation that value classes want for their identity object fields, so they'll need to override the default equals and make some recursive 'equals' calls.) So we know Java programmers need to be conversant in at least two versions of value object equality: universal substitutability (using bitwise equivalence for floating points), and domain equivalence (defined by 'equals' methods). And traditionally, '==' on objects has been understood to mean universal substitutability. Do we really want to complicate matters further by asking programmers to keep track of *three* object equivalence relations, and teaching them that '==' doesn't *really* mean substitutability anymore? We decided that wasn't worth the trouble; ultimately, we just want to continue to encourage them to use 'equals' in most contexts. From kevinb9n at gmail.com Fri Feb 9 02:32:47 2024 From: kevinb9n at gmail.com (Kevin Bourrillion) Date: Thu, 8 Feb 2024 18:32:47 -0800 Subject: Simplifying 'value' and 'identity' Message-ID: Hey everyone, [I assume Google will put forth a replacement representative at some point, but either way I've joined this group as an individual member. kevinb9n at gmail.com is how to reach me now!] Here's a response to Dan's December thread which I can't reply to from this other account. So as far as I can tell I think this dovetails perfectly with what I've always wanted, which is the "identity is just an attribute" model. I like pretty much everything about it. 
Under that idea, identity is absent at the root of the hierarchy, can be added by any subtype, then is inherited by everything below that. Any type that wants to be mutable, lockable, etc. would have to flip that identity switch on. So I don't see that there needs to be any difference between "value" and "indeterminate". Both are just "doesn't have the identity attribute". I feel like there must have been some reason we thought that wouldn't pan out, but I don't remember. In some sense, adding the "attribute" of identity also *removes* a capability at the same time (the capability of being copyable and collapsible at the VM's whims), and at one point I thought that sunk the whole model. But that's not a *client*-facing capability, and I think maybe that excuses it. So anyway, I *think* the only adjustment to my preferred dream model that I recognize in Dan's email is "interfaces don't get to add this identity attribute", which okay, no great loss I suppose. Of course, I don't remember what past discussions I don't remember, so please let me know what I'm missing here. From kevinb9n at gmail.com Fri Feb 9 05:06:18 2024 From: kevinb9n at gmail.com (Kevin Bourrillion) Date: Thu, 8 Feb 2024 21:06:18 -0800 Subject: Value object equality & floating-point values In-Reply-To: References: Message-ID: Sounds right to me. The best meaning for `==` is "there is no difference you can even possibly care about". How it behaves on float and double now is the anomaly. imho ALL we need from `==` on value classes is just to be consistent *enough* with identity class behavior to allow migration. Good practice would be to clean up those usages anyway. On Thu, Feb 8, 2024 at 6:43 PM Dan Smith wrote: > Remi asked about the spec change last May that switched the `==` behavior > on value objects that wrap floating points from a `doubleToLongBits` > comparison to a `doubleToRawLongBits` comparison. 
Here's my recollection of > the motivation. > > First, a good summary of the different versions of floating point equality > can be found here: > > https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation > > It discusses three different concepts of equality for type 'double'. > > - Numerical equality: The behavior of == acting on doubles, with special > treatment for NaNs (never equal to themselves) and +0/-0 (distinct but > considered equal) > > - Representational equivalence: The behavior of `Double.equals` and > `doubleToLongBits`-based comparisons, distinguishing +0 from -0, but with > all NaN bit patterns considered equal to each other > > - Bitwise equivalence: The behavior of `doubleToRawLongBits`-based > comparisons, distinguishing +0 from -0, and with every NaN bit pattern > distinguished from every other > > ----- > > Now turning to value objects. > > Discussing the general concept of equivalence classes, the above reference > has this to say: "At least for some purposes, all the members of an > equivalence class are substitutable for each other. In particular, in a > numeric expression equivalent values can be substituted for one another > without changing the result of the expression, meaning changing the > equivalence class of the result of the expression." > > Value classes that wrap primitive floating point values will have their > own notion of what version of "substitutable" they wish to work with, and > so what equivalence classes they need. But, at bottom, the JVM and other > applications need to have some least common denominator equivalence > relation that support substitutability for *all* value classes. That > equivalence relation is bitwise equivalence. 
> > That is, consider this class: > > value class C { > private double d; > C(double d) { this.d = d; } > long bits() { return Double.doubleToRawLongBits(d); } > } > > C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L)); > C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L)); > assert c1.bits() != c2.bits(); > > Will this assert ever fail? Well, it depends on the JVM treats c1 and c2 > as belonging to the same equivalence class. If they are, it's allowed to > substitute c1 for c2 at any time. I think it's pretty clear that would be a > mistake. So the JVM internals need to be operating in terms of bitwise > equivalence of nested floating-point values. > > Now consider another class: > > value class D { > double d; > D(double d) { this.d = d; } > public boolean equals(Object o) { > return o instanceof D that && Math.abs(this.d - that.d) < 0.00001d; > } > } > > D d1 = new D(0.3); > D d2 = new D(0.1+0.2); > assert d1.d != d2.d; > > Now we've got a class that wants to work with a much chunkier equivalence > relation. (I kind of suspect this isn't an equivalence relation at all, > sorry, floating-point experts. But you get the idea.) This class wouldn't > mind if the VM *did* randomly swap out d1 for d2, because *in this > application*, they're substitutable. > > So: different classes will have different needs, we can't anticipate them > all, but in certain contexts that lack domain knowledge (like VM > optimizations), bitwise equivalence must be used. > > Finally: must '==' be defined to reflect "least common denominator" > substitutability, or could it be something else? Perhaps representation > equivalence, which has some nice properties and can be conveniently > expressed in terms of Double.equals? > > In theory, sure, there's no reason we couldn't use representational > equivalence for '==', and provide some other path to bitwise equivalence > (Objects.isSubstitutable?). 
> > But again, note that every class has its own domain-specific equivalence > relation needs. This is captured by 'equals'. (Beyond floating point > interpretations, don't forget that '==' will often not be the equivalence > relation that value classes want for their identity object fields, so > they'll need to override the default equals and make some recursive > 'equals' calls.) > > So we know Java programmers need to be conversant in at least two versions > of value object equality: universal substitutability (using bitwise > equivalence for floating points), and domain equivalence (defined by > 'equals' methods). And traditionally, '==' on objects has been understood > to mean universal substitutability. Do we really want to complicate matters > further by asking programmers to keep track of *three* object equivalence > relations, and teaching them that '==' doesn't *really* mean > substitutability anymore? We decided that wasn't worth the > trouble; ultimately, we just want to continue to encourage them to use > 'equals' in most contexts. > > From forax at univ-mlv.fr Fri Feb 9 18:13:03 2024 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 9 Feb 2024 19:13:03 +0100 (CET) Subject: Value object equality & floating-point values In-Reply-To: References: Message-ID: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> ----- Original Message ----- > From: "daniel smith" > To: "valhalla-spec-experts" > Sent: Friday, February 9, 2024 3:43:34 AM > Subject: Value object equality & floating-point values > Remi asked about the spec change last May that switched the `==` behavior on > value objects that wrap floating points from a `doubleToLongBits` comparison to > a `doubleToRawLongBits` comparison. Here's my recollection of the motivation. 
Hello, > > First, a good summary of the different versions of floating point equality can > be found here: > https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation > > It discusses three different concepts of equality for type 'double'. > > - Numerical equality: The behavior of == acting on doubles, with special > treatment for NaNs (never equal to themselves) and +0/-0 (distinct but > considered equal) > > - Representational equivalence: The behavior of `Double.equals` and > `doubleToLongBits`-based comparisons, distinguishing +0 from -0, but with all > NaN bit patterns considered equal to each other > > - Bitwise equivalence: The behavior of `doubleToRawLongBits`-based comparisons, > distinguishing +0 from -0, and with every NaN bit pattern distinguished from > every other > > ----- > > Now turning to value objects. > > Discussing the general concept of equivalence classes, the above reference has > this to say: "At least for some purposes, all the members of an equivalence > class are substitutable for each other. In particular, in a numeric expression > equivalent values can be substituted for one another without changing the > result of the expression, meaning changing the equivalence class of the result > of the expression." > > Value classes that wrap primitive floating point values will have their own > notion of what version of "substitutable" they wish to work with, and so what > equivalence classes they need. But, at bottom, the JVM and other applications > need to have some least common denominator equivalence relation that support > substitutability for *all* value classes. That equivalence relation is bitwise > equivalence. 
> > That is, consider this class: > > value class C { > private double d; > C(double d) { this.d = d; } > long bits() { return Double.doubleToRawLongBits(d); } > } > > C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L)); > C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L)); > assert c1.bits() != c2.bits(); > > Will this assert ever fail? Well, it depends on the JVM treats c1 and c2 as > belonging to the same equivalence class. If they are, it's allowed to > substitute c1 for c2 at any time. I think it's pretty clear that would be a > mistake. So the JVM internals need to be operating in terms of bitwise > equivalence of nested floating-point values. As you said, there are 3 possible equivalence classes. Numerical equality is not really an equivalence relation, so let's rule it out. So we have the choice between the bitwise equivalence and the representational equivalence. Whichever equivalence we choose will be the definition of substitutability. If we choose the representational equivalence, the VM will have more leeway to optimize, because it may substitute one instance of C for another whatever the encoding of NaN is. If we choose the bitwise equivalence, the VM will not be able to optimize unless the bitwise representations of NaN match exactly. > I think it's pretty clear that would be a mistake. I do not compute that statement :) Why do you want users to care about the bitwise representation of NaN? Both 0x7ff0000000000001L and 0x7ff0000000000002L represent NaN; if we print c1.d and c2.d, both will print NaN, and if we use c1.d or c2.d in numeric computation, they will both behave as NaN. [...] > So we know Java programmers need to be conversant in at least two versions of > value object equality: universal substitutability (using bitwise equivalence > for floating points), and domain equivalence (defined by 'equals' methods). And > traditionally, '==' on objects has been understood to mean universal > substitutability. 
> Do we really want to complicate matters further by asking > programmers to keep track of *three* object equivalence relations, and teaching > them that '==' doesn't *really* mean substitutability anymore? We decided that > wasn't worth the trouble; ultimately, we just want to continue to encourage them > to use 'equals' in most contexts. Your example is compatible with both the bitwise equivalence and the representational equivalence, because the only difference between the two equivalence classes is the behavior of NaN. So the only case where using the representational equivalence as substitutability is an issue is if you want equals() to use the bitwise equivalence. In that specific case, it will not work. If we were mathematicians, that would be the end of the discussion, but we are designing a programming language, so we have to take care of the drawbacks of not using the representational equivalence and balance them against the fact that if we choose the representational equivalence for value classes, a class that has an equals() that uses the bitwise equivalence cannot be declared as a value class. For me, there are serious drawbacks to using the bitwise equivalence; it will clash with the other places where we are already using the representational equivalence: - the bitwise equivalence is pretty obscure and hard to debug, given that the string representation is compatible with the representational equivalence, - the behavior of java.lang.Double and java.lang.Float becomes different from the other wrapper types, for which both == and equals() have the same semantics, - the semantics of equals() in a record is based on the representational equivalence, so a value record with primitive components will also have an == and an equals() that disagree. 
Using your example, but with a value record (supposing the bitwise equivalence):

value record C(double d) { }

C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
System.out.println(c1); // C[d=NaN]
System.out.println(c2); // C[d=NaN]
System.out.println(c1 == c2); // false ??
System.out.println(c1.equals(c2)); // true

Rémi From daniel.smith at oracle.com Fri Feb 9 18:56:14 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Fri, 9 Feb 2024 18:56:14 +0000 Subject: Value object equality & floating-point values In-Reply-To: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> References: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com> On Feb 9, 2024, at 10:13 AM, Remi Forax wrote:

value class C {
    private double d;
    C(double d) { this.d = d; }
    long bits() { return Double.doubleToRawLongBits(d); }
}

C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
assert c1.bits() != c2.bits();

Will this assert ever fail? Well, it depends on whether the JVM treats c1 and c2 as belonging to the same equivalence class. If it does, it's allowed to substitute c1 for c2 at any time. I think it's pretty clear that would be a mistake. I do not compute that statement :) Why do you want users to care about the bitwise representation of NaN? Both 0x7ff0000000000001L and 0x7ff0000000000002L represent NaN; if we print c1.d and c2.d, both will print NaN, and if we use c1.d or c2.d in numeric computation, they will both behave as NaN. To be very specific about this example, I think it's bad if the result of the 'c.bits()' method is nondeterministic, making the 'assert' result unpredictable. Two instances of C with representationally-equivalent state can produce different results from their 'c.bits()' method. 
So they aren't substitutable (per the "can be substituted for one another without changing the result of the expression" definition). I would be uncomfortable with the JVM substituting one for the other whenever it wants to. Sure, the *native* operations on *double* almost never distinguish between different NaN encodings. But a *custom* operation on *a class that wraps a double* certainly can. (The example could be improved by doing a better job of illustrating that the double is private internal state, and that the operations exposed by the class need not look at all like floating-point operations, as far as the client of the class is concerned. All they know is they've got an object that is randomly producing nondeterministic results.) This, by itself, is not an argument for '==' being defined to use bitwise equivalence, but it is an argument for a well-defined concept of "substitutable value object" that is based on bitwise equivalence. Using your example, but with a value record (supposing the bitwise equivalence):

value record C(double d) { }

C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
System.out.println(c1); // C[d=NaN]
System.out.println(c2); // C[d=NaN]
System.out.println(c1 == c2); // false ??
System.out.println(c1.equals(c2)); // true

Sure. I mean, this is the exact same result as you get with an identity record, so I don't think it should be surprising. The fallacy, I think, is in expecting '==' for value objects to be something more than a substitutability test. (Which, like I said, could be done, but it seems like a distraction when what you really want to use is 'equals'.) 
Compare:

value record S(String s) {}
S s1 = new S("abc");
S s2 = new S("abcd".substring(0,3));

System.out.println(s1); // S[s=abc]
System.out.println(s2); // S[s=abc]
System.out.println(s1 == s2); // false
System.out.println(s1.equals(s2)); // true

This may be a new concept to learn: value objects with double fields can be 'equals' but not '==', just like value objects with reference fields can be 'equals' but not '=='. But I think that's a better quirk to learn than '==' sometimes not meaning "substitutable".

From liangchenblue at gmail.com Fri Feb 9 19:52:24 2024
From: liangchenblue at gmail.com (-)
Date: Fri, 9 Feb 2024 13:52:24 -0600
Subject: Value object equality & floating-point values
In-Reply-To: <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com>
References: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com>
Message-ID:

Hi Dan and Remi,

In addition to Dan's reference pointer example, which I strongly agree with, I wish to share another example that more closely aligns with the existing double or float: OptionalInt. OptionalInt has 2 fields: boolean isPresent and int value. However, not all possible combinations of the 2 fields are actually used; like NaN, which has many representations, the absent value has many representations (its value field can take any int value). When we have an OptionalInt(false, 0) and an OptionalInt(false, 1), should == hold for them? No, even though their behaviors are otherwise the same. "But OptionalInt's constructor prevents these values!" one might argue. We wish that double had constructors that prevented the creation of distinct NaNs, too, as it is not a Cartesian product of its components.
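The multiple NaN encodings under discussion are observable with today's java.lang.Double API alone. A small illustration (quiet-NaN bit patterns are used here, since the spec does not promise that signaling-NaN payloads survive longBitsToDouble on every platform):

```java
public class NanEncodings {
    public static void main(String[] args) {
        // Two quiet NaNs with different payload bits.
        double n1 = Double.longBitsToDouble(0x7ff8000000000001L);
        double n2 = Double.longBitsToDouble(0x7ff8000000000002L);

        System.out.println(Double.isNaN(n1) && Double.isNaN(n2)); // true: both behave as NaN
        System.out.println(n1 == n2);                             // false: NaN is never == to anything
        System.out.println(Double.valueOf(n1).equals(n2));        // true: representational equivalence
        System.out.println(Double.doubleToRawLongBits(n1)
                        == Double.doubleToRawLongBits(n2));       // false on typical JVMs: bitwise-distinct
    }
}
```

Every ordinary double operation treats n1 and n2 identically; only the raw-bits view can tell them apart.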
Even though we must live with this vestige of positive and negative zero, the many representations of NaN, and the special == operator logic for double, I don't think it's a convincing argument for us to complicate the object == logic so that non-Cartesian values in corner-case states compare the same as their "normal" states. We now have constructors to normalize and sanitize input values if you really love ==, so the OptionalInt scenario never happens in practice; nothing prevents you from normalizing the double values you write to your strict fields, and you can move on to use == as you please.

Regards,
Chen Liang

On Fri, Feb 9, 2024 at 12:56 PM Dan Smith wrote:
> [...]

From forax at univ-mlv.fr Fri Feb 9 20:32:29 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 9 Feb 2024 21:32:29 +0100 (CET)
Subject: Value object equality & floating-point values
In-Reply-To: <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com>
References: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com>
Message-ID: <1198513186.3270854.1707510749886.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "daniel smith"
> To: "Remi Forax"
> Cc: "valhalla-spec-experts"
> Sent: Friday, February 9, 2024 7:56:14 PM
> Subject: Re: Value object equality & floating-point values
> [...]
> This may be a new concept to learn: value objects with double fields can be
> 'equals' but not '==', just like value objects with reference fields can be
> 'equals' but not '=='. But I think that's a better quirk to learn than '=='
> sometimes not meaning "substitutable".

That's the main problem with using bitwise equivalence: it's a new concept that is specific to NaN and alien to developers. Basically, you are pushing for a third option which is neither the semantics of == on double nor of equals() on j.l.Double, and worse, it works like equals() on j.l.Double whenever there is no NaN.

Your example using references is fine because it works the same way for all references; the problem with bitwise equivalence is that it *almost* works like representational equivalence, except for NaN. Also, most NaNs in a program have the same bitwise representation, so == will return false only sometimes, to the point where most people will think that the semantics is representational equivalence (even if the spec says differently).

For me the difference between bitwise equivalence and representational equivalence is too subtle, so nobody will care; people will use == instead of equals, and Java programs will sometimes fail randomly (from the developer's POV). "We fix ==, but for NaN" looks like a terrible slogan.
Rémi

From scolebourne at joda.org Sat Feb 10 12:57:47 2024
From: scolebourne at joda.org (Stephen Colebourne)
Date: Sat, 10 Feb 2024 12:57:47 +0000
Subject: Value object equality & floating-point values
In-Reply-To: References: Message-ID:

The Java SE specification in java.lang.Double says this:

| No IEEE 754 floating-point operation provided by Java can distinguish
| between two NaN values of the same type with different bit
| patterns. Distinct values of NaN are only distinguishable by
| use of the {@code Double.doubleToRawLongBits} method.

(Same text since at least Java 6 AFAICT)

Note the *only distinguishable* part. The proposal below breaks this, as it provides a second way to observe different kinds of NaN. Not only that, but unlike doubleToRawLongBits, which very few people know about, == on values would be a very mainstream part of the language.

Under your proposal developers would need to handle all three kinds of equivalence:
- Numeric - for double == double
- Representational - for a double wrapped by a record
- Bitwise - for a double wrapped by a value class

Surely it doesn't make sense for two different kinds of wrapping to result in two different behaviours, neither of which matches the unwrapped behaviour??!!

It is my opinion that exposing the concept of different bit patterns of NaN to most developers would be a significant retrograde step for Java. The rules of Java have always been simple wrt doubles - representational equivalence, except for math-style rules on primitive doubles.

A proposed solution - normalization

I believe there is a simple approach that also works to explain the behaviour of java.lang.Float and java.lang.Double equals():

* For each `float` or `double` field in a value class, the constructor will generate normalization code
* The normalization is equivalent to `longBitsToDouble(doubleToLongBits(field))`
* Normalization also applies to java.lang.Float and java.lang.Double
* == is a Bitwise implementation, but behaves like Representational for developers

If deemed important, there could be a mechanism to opt out of auto-generated normalization (I personally don't think the use case is strong enough).

Note that the outcome of this is that all value types consisting only of primitive type fields have == the same as the record-like .equals() definition, which is a very good outcome.

Stephen

On Fri, 9 Feb 2024 at 02:43, Dan Smith wrote:
>
> Remi asked about the spec change last May that switched the `==` behavior on value objects that wrap floating points from a `doubleToLongBits` comparison to a `doubleToRawLongBits` comparison. Here's my recollection of the motivation.
>
> First, a good summary of the different versions of floating point equality can be found here:
> https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation
>
> It discusses three different concepts of equality for type 'double'.
>
> - Numerical equality: The behavior of == acting on doubles, with special treatment for NaNs (never equal to themselves) and +0/-0 (distinct but considered equal)
>
> - Representational equivalence: The behavior of `Double.equals` and `doubleToLongBits`-based comparisons, distinguishing +0 from -0, but with all NaN bit patterns considered equal to each other
>
> - Bitwise equivalence: The behavior of `doubleToRawLongBits`-based comparisons, distinguishing +0 from -0, and with every NaN bit pattern distinguished from every other
>
> -----
>
> Now turning to value objects.
>
> Discussing the general concept of equivalence classes, the above reference has this to say: "At least for some purposes, all the members of an equivalence class are substitutable for each other. In particular, in a numeric expression equivalent values can be substituted for one another without changing the result of the expression, meaning changing the equivalence class of the result of the expression."
>
> Value classes that wrap primitive floating point values will have their own notion of what version of "substitutable" they wish to work with, and so what equivalence classes they need. But, at bottom, the JVM and other applications need to have some least common denominator equivalence relation that supports substitutability for *all* value classes. That equivalence relation is bitwise equivalence.
>
> That is, consider this class:
>
> value class C {
>   private double d;
>   C(double d) { this.d = d; }
>   long bits() { return Double.doubleToRawLongBits(d); }
> }
>
> C c1 = new C(Double.longBitsToDouble(0x7ff0000000000001L));
> C c2 = new C(Double.longBitsToDouble(0x7ff0000000000002L));
> assert c1.bits() != c2.bits();
>
> Will this assert ever fail? Well, it depends on whether the JVM treats c1 and c2 as belonging to the same equivalence class. If they are, it's allowed to substitute c1 for c2 at any time. I think it's pretty clear that would be a mistake. So the JVM internals need to be operating in terms of bitwise equivalence of nested floating-point values.
>
> Now consider another class:
>
> value class D {
>   double d;
>   D(double d) { this.d = d; }
>   public boolean equals(Object o) {
>     return o instanceof D that && Math.abs(this.d - that.d) < 0.00001d;
>   }
> }
>
> D d1 = new D(0.3);
> D d2 = new D(0.1+0.2);
> assert d1.d != d2.d;
>
> Now we've got a class that wants to work with a much chunkier equivalence relation. (I kind of suspect this isn't an equivalence relation at all, sorry, floating-point experts. But you get the idea.) This class wouldn't mind if the VM *did* randomly swap out d1 for d2, because *in this application*, they're substitutable.
>
> So: different classes will have different needs, we can't anticipate them all, but in certain contexts that lack domain knowledge (like VM optimizations), bitwise equivalence must be used.
>
> Finally: must '==' be defined to reflect "least common denominator" substitutability, or could it be something else? Perhaps representational equivalence, which has some nice properties and can be conveniently expressed in terms of Double.equals?
>
> In theory, sure, there's no reason we couldn't use representational equivalence for '==', and provide some other path to bitwise equivalence (Objects.isSubstitutable?).
>
> But again, note that every class has its own domain-specific equivalence relation needs. This is captured by 'equals'. (Beyond floating point interpretations, don't forget that '==' will often not be the equivalence relation that value classes want for their identity object fields, so they'll need to override the default equals and make some recursive 'equals' calls.)
>
> So we know Java programmers need to be conversant in at least two versions of value object equality: universal substitutability (using bitwise equivalence for floating points), and domain equivalence (defined by 'equals' methods). And traditionally, '==' on objects has been understood to mean universal substitutability. Do we really want to complicate matters further by asking programmers to keep track of *three* object equivalence relations, and teaching them that '==' doesn't *really* mean substitutability anymore? We decided that wasn't worth the trouble -- ultimately, we just want to continue to encourage them to use 'equals' in most contexts.
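The three equivalence relations discussed in this thread can be checked directly against today's primitives and boxes; a small sketch:

```java
public class DoubleEquivalence {
    public static void main(String[] args) {
        double nan = Double.NaN;

        // Numerical equality (==): NaN is unequal to itself; +0.0 equals -0.0.
        System.out.println(nan == nan);                       // false
        System.out.println(0.0 == -0.0);                      // true

        // Representational equivalence (Double.equals / doubleToLongBits):
        // all NaN bit patterns collapse together; +0.0 and -0.0 are distinct.
        System.out.println(Double.valueOf(nan).equals(nan));  // true
        System.out.println(Double.valueOf(0.0).equals(-0.0)); // false

        // Bitwise equivalence (doubleToRawLongBits): raw bit patterns compared.
        System.out.println(Double.doubleToRawLongBits(0.0)
                        == Double.doubleToRawLongBits(-0.0)); // false
    }
}
```

Each relation refines the classification of NaN and signed zero differently, which is exactly the divergence the thread is debating.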
From forax at univ-mlv.fr Sat Feb 10 20:57:06 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Feb 2024 21:57:06 +0100 (CET)
Subject: Value object equality & floating-point values
In-Reply-To: References: Message-ID: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Stephen Colebourne"
> To: "Valhalla Expert Group Observers"
> Cc: "daniel smith"
> Sent: Saturday, February 10, 2024 1:57:47 PM
> Subject: Re: Value object equality & floating-point values
> [...]
> It is my opinion that exposing the concept of different bit patterns
> of NaN to most developers would be a significant retrograde step for
> Java. The rules of Java have always been simple wrt doubles -
> Representational equivalence except for math-style rules on primitive
> doubles.

Yes !

> [...]
> Note that the outcome of this is that all value types consisting only
> of primitive type fields have == the same as the record-like .equals()
> definition, which is a very good outcome.

yes !

And also all wrappers of primitive types have == the same as their .equals() definition.

> Stephen

Rémi

> On Fri, 9 Feb 2024 at 02:43, Dan Smith wrote:
>> [...]

From ccherlin at gmail.com Mon Feb 12 20:09:59 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Mon, 12 Feb 2024 14:09:59 -0600
Subject: Value object equality & floating-point values
In-Reply-To: <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com>
References: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com>
Message-ID:

On Fri, Feb 9, 2024 at 12:56 PM Dan Smith wrote:
> [...]
> This may be a new concept to learn: value objects with double fields can be 'equals' but not '==', just like value objects with reference fields can be 'equals' but not '=='. But I think that's a better quirk to learn than '==' sometimes not meaning "substitutable".

Question: What is the proposed behavior of == between Double values in Valhalla? I hope it's consistent with other value classes that have double fields.

Commentary: I'd prefer for == to compare value class float/double fields using canonical equivalence, same as .equals(), rather than raw bitwise equality. However, I'll grudgingly accept "raw" equality as long as it's consistent across the board and Float/Double don't receive special treatment.

Cheers,
Clement Cherlin

From daniel.smith at oracle.com Mon Feb 12 21:00:24 2024
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 12 Feb 2024 21:00:24 +0000
Subject: Value object equality & floating-point values
In-Reply-To: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr>
References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

Reply to some spec-observers discussion about this thread.
> On Feb 10, 2024, at 12:57 PM, Remi Forax wrote:
>
>> From: "Stephen Colebourne"
>> To: "Valhalla Expert Group Observers"
>> Cc: "daniel smith"
>> Sent: Saturday, February 10, 2024 1:57:47 PM
>> Subject: Re: Value object equality & floating-point values
>
>> Note that the outcome of this is that all value types consisting only
>> of primitive type fields have == the same as the record-like .equals()
>> definition, which is a very good outcome.
>
> yes !
>
> And also all wrappers of primitive types have == the same as their .equals() definition.

Sounds like the main concern here is that '==' is too discriminating for the domain-specific equality tests normal programmers want to make.

I agree, but I don't think this is anything new. The '==' operator has never been the appropriate tool for domain-specific equality tests. It will coincidentally do the "right thing" for a subset of value classes, many of which are cases that only use primitive fields and other value class types (but see more on this below). How we handle floating-points will affect a tiny fraction of use cases. It won't make a difference one way or another on the broader issue, which is whether programmers should routinely rely on '==' for domain-specific equality tests (answer: no).

Even for something as simple as Double, it may initially seem obvious that '==' should match 'equals'. But think about other kinds of floating-points that value classes are meant to enable. What about HalfFloat?

value class HalfFloat { private short bits; }

How is '==' going to behave here? It's going to do a raw bit comparison. Are there multiple NaN encodings for HalfFloats? I'm not an FP expert, but I presume so. Should those different encodings be treated as equivalent by 'equals'? Given the precedent of Float and Double, definitely yes.
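The presumption holds: IEEE 754 binary16 reserves an all-ones exponent with a nonzero significand for NaN, giving many distinct NaN encodings. A hypothetical sketch of the NaN test such a HalfFloat might use (class name and bit layout follow the standard binary16 format; nothing here is from an actual Valhalla API):

```java
public class HalfFloatSketch {
    // IEEE 754 binary16: 1 sign bit, 5 exponent bits (mask 0x7C00),
    // 10 significand bits (mask 0x03FF). All-ones exponent with a
    // nonzero significand is NaN.
    static boolean isNaN(short bits) {
        return (bits & 0x7C00) == 0x7C00 && (bits & 0x03FF) != 0;
    }

    public static void main(String[] args) {
        short nanA = (short) 0x7C01; // NaN with payload 1
        short nanB = (short) 0x7E00; // the canonical quiet NaN
        System.out.println(isNaN(nanA));  // true
        System.out.println(isNaN(nanB));  // true
        System.out.println(nanA == nanB); // false: bitwise-distinct encodings,
                                          // so raw '==' on wrapped bits separates them
    }
}
```

A raw bit comparison on the wrapped short distinguishes the two NaNs, while a Float/Double-style 'equals' would be expected to collapse them.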
This is just to emphasize: "only declare primitive fields" or any similar rule is not going to be enough to guarantee that '==' will give you an appropriate domain-specific equality test for free. Equality is ultimately something we need programmers to define with 'equals'. --- Stephen Colebourne suggested a normalization approach to floating-point field storage: >> * For each `float` or `double` field in a value class, the constructor >> will generate normalization code >> * The normalization is equivalent to `longBitsToDouble(doubleToLongBits(field))` >> * Normalization also applies to java.lang.Float and java.lang.Double >> * == is a Bitwise implementation, but behaves like Representational >> for developers The Oracle-internal discussion last spring covered similar ground. There are different ways to stack it, but what they have in common is an interest in eradicating NaN encoding variations as some sort of unwanted anomaly. I get that this is often the case (for the tiny fraction of programmers who ever encounter NaN in the first place). But let's not overlook the fact that, since 1.3, there's an API that explicitly supports these encodings and promises to preserve them (Double.doubleToRawLongBits and Double.longBitsToDouble). Note that Double is a value class that wraps a field of type 'double'. Flattening out NaN encoding differences in the wrapped field would break that API. (Could we work around that by changing the type of the wrapped field to 'long'? I mean, abstractly speaking, I guess... But now we're back to a class Double whose '==' and 'equals' methods disagree.) 
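[Editorial note: the distinction Dan relies on between the two conversion methods can be demonstrated directly. The payload bits below are an arbitrary illustration; by specification, 'doubleToLongBits' collapses every NaN to 0x7ff8000000000000L, while the raw variant is permitted to expose the payload.]

```java
public class NaNEncodingDemo {
    public static void main(String[] args) {
        // A NaN with a non-canonical payload (arbitrary bits for illustration).
        double payloadNaN = Double.longBitsToDouble(0x7ff8000000001234L);

        // The non-raw view canonicalizes: every NaN maps to the same bits.
        System.out.println(Long.toHexString(Double.doubleToLongBits(payloadNaN)));
        // -> 7ff8000000000000

        // The raw view may preserve the payload (it does on mainstream JVMs,
        // though the spec allows some NaN bit patterns to be lost in transit).
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(payloadNaN)));
    }
}
```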
From ccherlin at gmail.com Tue Feb 13 14:38:12 2024 From: ccherlin at gmail.com (Clement Cherlin) Date: Tue, 13 Feb 2024 08:38:12 -0600 Subject: Value object equality & floating-point values In-Reply-To: References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> Message-ID: On Mon, Feb 12, 2024 at 3:16 PM Dan Smith wrote: > > Reply to some spec-observers discussion about this thread. > > > On Feb 10, 2024, at 12:57 PM, Remi Forax wrote: > > > >> From: "Stephen Colebourne" > >> To: "Valhalla Expert Group Observers" > >> Cc: "daniel smith" > >> Sent: Saturday, February 10, 2024 1:57:47 PM > >> Subject: Re: Value object equality & floating-point values > > > >> Note that the outcome of this is that all value types consisting only > >> of primitive type fields have == the same as the record-like .equals() > >> definition, which is a very good outcome. > > > > yes ! > > > > And also all wrappers of primitive types have == the same as their .equals() definition. > > Sounds like the main concern here is that '==' is too discriminating for the domain-specific equality tests normal programmers want to make. > > I agree, but I don't think this is anything new. The '==' operator has never been the appropriate tool for domain-specific equality tests. It will coincidentally do the "right thing" for a subset of value classes, many of which are cases that only use primitive fields and other value class types (but see more on this below). How we handle floating-points will affect a tiny fraction of use cases. It won't make a difference one way or another on the broader issue, which is whether programmers should routinely rely on '==' for domain-specific equality tests (answer: no). > > Even for something as simple as Double, it may initially seem obvious that '==' should match 'equals'. But think about other kinds of floating-points that value classes are meant to enable. What about HalfFloat?
> > value class HalfFloat { > private short bits; > } > > How is '==' going to behave here? It's going to do a raw bit comparison. Are there multiple NaN encodings for HalfFloats? I'm not an FP expert, but I presume so. Should those different encodings be treated as equivalent by 'equals'? Given the precedent of Float and Double, definitely yes. > > This is just to emphasize: "only declare primitive fields" or any similar rule is not going to be enough to guarantee that '==' will give you an appropriate domain-specific equality test for free. Equality is ultimately something we need programmers to define with 'equals'. > > --- > > Stephen Colebourne suggested a normalization approach to floating-point field storage: > > >> * For each `float` or `double` field in a value class, the constructor > >> will generate normalization code > >> * The normalization is equivalent to `longBitsToDouble(doubleToLongBits(field))` > >> * Normalization also applies to java.lang.Float and java.lang.Double > >> * == is a Bitwise implementation, but behaves like Representational > >> for developers > > The Oracle-internal discussion last spring covered similar ground. There are different ways to stack it, but what they have in common is an interest in eradicating NaN encoding variations as some sort of unwanted anomaly. I get that this is often the case (for the tiny fraction of programmers who ever encounter NaN in the first place). But let's not overlook the fact that, since 1.3, there's an API that explicitly supports these encodings and promises to preserve them (Double.doubleToRawLongBits and Double.longBitsToDouble). Note that Double is a value class that wraps a field of type 'double'. Flattening out NaN encoding differences in the wrapped field would break that API. > > (Could we work around that by changing the type of the wrapped field to 'long'? I mean, abstractly speaking, I guess... But now we're back to a class Double whose '==' and 'equals' methods disagree.) 
The proposed bitwise implementation of == for value objects exposes internal state, breaking encapsulation. And the class implementor cannot change that behavior. Has anyone performed a security analysis? If == is potentially hazardous and has extremely limited use for value objects, I question whether it should be part of the public API surface of a value class at all. I'd rather get a syntax error from MyDouble a == MyDouble b (or HalfFloat a == HalfFloat b), when invoked outside the class itself or a descendant / enclosed class, than have it provide an actively misleading answer. Since Java doesn't have operator overloading, == really only makes sense for primitives and same-reference/nullness checking. Why not transform == to .equals() for non-nullable and (possibly inlined) Objects.equals() for nullable types? Cheers, Clement Cherlin From ccherlin at gmail.com Tue Feb 13 14:56:07 2024 From: ccherlin at gmail.com (Clement Cherlin) Date: Tue, 13 Feb 2024 08:56:07 -0600 Subject: Value object equality & floating-point values In-Reply-To: References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> Message-ID: On Tue, Feb 13, 2024 at 8:38 AM Clement Cherlin wrote: > > On Mon, Feb 12, 2024 at 3:16 PM Dan Smith wrote: > > > > Reply to some spec-observers discussion about this thread. > > > > > On Feb 10, 2024, at 12:57 PM, Remi Forax wrote: > > > > > >> From: "Stephen Colebourne" > > >> To: "Valhalla Expert Group Observers" > > >> Cc: "daniel smith" > > >> Sent: Saturday, February 10, 2024 1:57:47 PM > > >> Subject: Re: Value object equality & floating-point values > > > > > >> Note that the outcome of this is that all value types consisting only > > >> of primitive type fields have == the same as the record-like .equals() > > >> definition, which is a very good outcome. > > > > > > yes ! > > > > > > And also all wrappers of primitive types have == the same as their .equals() definition.
> > > > Sounds like the main concern here is that '==' is too discriminating for the domain-specific equality tests normal programmers want to make. > > > > I agree, but I don't think this is anything new. The '==' operator has never been the appropriate tool for domain-specific equality tests. It will coincidentally do the "right thing" for a subset of value classes, many of which are cases that only use primitive fields and other value class types (but see more on this below). How we handle floating-points will affect a tiny fraction of use cases. It won't make a difference one way or another on the broader issue, which is whether programmers should routinely rely on '==' for domain-specific equality tests (answer: no). > > > > Even for something as simple as Double, it may initially seem obvious that '==' should match 'equals'. But think about other kinds of floating-points that value classes are meant to enable. What about HalfFloat? > > > > value class HalfFloat { > > private short bits; > > } > > > > How is '==' going to behave here? It's going to do a raw bit comparison. Are there multiple NaN encodings for HalfFloats? I'm not an FP expert, but I presume so. Should those different encodings be treated as equivalent by 'equals'? Given the precedent of Float and Double, definitely yes. > > > > This is just to emphasize: "only declare primitive fields" or any similar rule is not going to be enough to guarantee that '==' will give you an appropriate domain-specific equality test for free. Equality is ultimately something we need programmers to define with 'equals'. 
> > > > --- > > > > Stephen Colebourne suggested a normalization approach to floating-point field storage: > > > > >> * For each `float` or `double` field in a value class, the constructor > > >> will generate normalization code > > >> * The normalization is equivalent to `longBitsToDouble(doubleToLongBits(field))` > > >> * Normalization also applies to java.lang.Float and java.lang.Double > > >> * == is a Bitwise implementation, but behaves like Representational > > >> for developers > > > > The Oracle-internal discussion last spring covered similar ground. There are different ways to stack it, but what they have in common is an interest in eradicating NaN encoding variations as some sort of unwanted anomaly. I get that this is often the case (for the tiny fraction of programmers who ever encounter NaN in the first place). But let's not overlook the fact that, since 1.3, there's an API that explicitly supports these encodings and promises to preserve them (Double.doubleToRawLongBits and Double.longBitsToDouble). Note that Double is a value class that wraps a field of type 'double'. Flattening out NaN encoding differences in the wrapped field would break that API. > > > > (Could we work around that by changing the type of the wrapped field to 'long'? I mean, abstractly speaking, I guess... But now we're back to a class Double whose '==' and 'equals' methods disagree.) > > The proposed bitwise implementation of == for value objects exposes > internal state, breaking encapsulation. And the class implementor > cannot change that behavior. Has anyone performed a security analysis? > > If == is potentially hazardous and has extremely limited use for value > objects, I question whether it should be part of the public API > surface of a value class at all. 
I'd rather get a syntax error from > MyDouble a == MyDouble b (or HalfFloat a == HalfFloat b), when invoked > outside the class itself or a descendant / enclosed class, than have > it provide an actively misleading answer. > > Since Java doesn't have operator overloading, == really only makes > sense for primitives and same-reference/nullness checking. Why not > transform == to .equals() for non-nullable and (possibly inlined) > Objects.equals() for nullable types? I meant to refer exclusively to "value types" here, including primitive wrappers. Nothing should change for identity types or primitives. One more thought: Since we've cracked the seal on "implicit" already for constructors, those value classes that truly want bitwise equals for performance could opt-in with "implicit boolean equals(Object o);", preventing the compiler from generating equals() calls when the type of both arguments is statically known to be the same type, and that type has implicit equals. Cheers (and my apologies for the self-reply), Clement Cherlin From daniel.smith at oracle.com Tue Feb 13 15:48:03 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 13 Feb 2024 15:48:03 +0000 Subject: Value object equality & floating-point values In-Reply-To: References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> Message-ID: [PSA that this discussion has ended up in the wrong mailing list, please be sure to send further replies to valhalla-spec-observers and remove amber-spec-observers.] 
On Feb 12, 2024, at 6:54 PM, Joe Darcy wrote: On 2/12/2024 3:50 PM, Stephen Colebourne wrote: On Mon, 12 Feb 2024 at 21:16, Dan Smith wrote: Stephen Colebourne suggested a normalization approach to floating-point field storage: * For each `float` or `double` field in a value class, the constructor will generate normalization code * The normalization is equivalent to `longBitsToDouble(doubleToLongBits(field))` * Normalization also applies to java.lang.Float and java.lang.Double * == is a Bitwise implementation, but behaves like Representational for developers The Oracle-internal discussion last spring covered similar ground. There are different ways to stack it, but what they have in common is an interest in eradicating NaN encoding variations as some sort of unwanted anomaly. I get that this is often the case (for the tiny fraction of programmers who ever encounter NaN in the first place). But let's not overlook the fact that, since 1.3, there's an API that explicitly supports these encodings and promises to preserve them (Double.doubleToRawLongBits and Double.longBitsToDouble). Note that Double is a value class that wraps a field of type 'double'. Flattening out NaN encoding differences in the wrapped field would break that API. This claim doesn't stand up to scrutiny, I'm afraid. longBitsToDouble() says this: "Note that this method may not be able to return a double NaN with exactly same bit pattern as the long argument" https://docs.oracle.com/javase/8/docs/api/java/lang/Double.html#longBitsToDouble-long- The whole paragraph is a vital read for anyone following this thread. See also the more recent added text discussing "Floating-point Equality, Equivalence, and Comparison:" https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Double.html#equivalenceRelation To put it simply, the spec *already allows* normalization (for the very good reason that no Java developer should ever be exposed to different NaNs).
The origin of that paragraph stems from idiosyncrasies of 64-bit double support on the x87 FPU -- tl;dr a format-changing load/store instruction rather than a move instruction had to be used to get values on and off the floating-point stack. Fortunately, JEP 306: "Restore Always-Strict Floating-Point Semantics" removed most such issues from a spec perspective, which had not been a serious concern from an implementation perspective for some time. I'm really serious about this topic - because the proposed solution is simply not appropriate for Java as a blue collar language. So-called "NaN boxing" -- storing and extracting extra information from NaN significands -- seems to be a more-common-than-rare implementation technique for some language runtimes; it comes up reasonably frequently on Hacker News as one data point. I don't think it is necessary for the Java SE specs or JDK implementation to go out of their way to support NaN boxing, but I don't think it is necessary to prevent the technique from being used either. -Joe From john.r.rose at oracle.com Tue Feb 13 20:22:49 2024 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Feb 2024 12:22:49 -0800 Subject: Value object equality & floating-point values In-Reply-To: References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <8B2DD8F4-3A04-4807-AABB-E8442FA7BF5A@oracle.com> On 12 Feb 2024, at 13:00, Dan Smith wrote: >> For each `float` or `double` field in a value class, the constructor will generate normalization code Indeed we considered this, but consider the tradeoff: Benefit: Somebody who is looking very closely indeed at NaNs sees a different op== behavior. That person is pleased, yay!
Cost: Everybody who stores floats and doubles in value classes sees object creation slow down due to branchy normalization logic, at least wherever hardware creates unpredictable NaNs and/or NaNs that don't fit the Java model of "good" NaNs. The cost is too risky, just to please a few NaN-detectives. From daniel.smith at oracle.com Tue Feb 13 21:00:36 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 13 Feb 2024 21:00:36 +0000 Subject: Value object equality & floating-point values In-Reply-To: <8B2DD8F4-3A04-4807-AABB-E8442FA7BF5A@oracle.com> References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> <8B2DD8F4-3A04-4807-AABB-E8442FA7BF5A@oracle.com> Message-ID: > On Feb 13, 2024, at 12:22 PM, John Rose wrote: > > On 12 Feb 2024, at 13:00, Dan Smith wrote: > >>> For each `float` or `double` field in a value class, the constructor will generate normalization code > > Indeed we considered this, but consider the tradeoff: > > Benefit: Somebody who is looking very closely indeed at NaNs sees a different op== behavior. That person is pleased, yay! > > Cost: Everybody who stores floats and doubles in value classes sees object creation slow down due to branchy normalization logic, at least wherever hardware creates unpredictable NaNs and/or NaNs that don't fit the Java model of "good" NaNs. > > The cost is too risky, just to please a few NaN-detectives. Another thing worth pointing out about field storage normalization: as it applies to the (value) class Double, it effectively deprecates the 'doubleToRawLongBits' method and implies that, from now on, in the boxed Double world, there's only one NaN. Stephen pointed out in valhalla-spec-observers that the factory method 'longBitsToDouble' turns out to allow for some amount of normalization, countering my claim that it must preserve all bits as provided. That is true, yet the Double API as a whole is clearly designed to support multiple NaNs.
Without them, the introductory discussion about bit-wise vs. representational equivalence, and the distinction between 'doubleToLongBits' and 'doubleToRawLongBits', become moot. So the 'longBitsToDouble' specification may give us an opening to change the space of things encodable with Double, but I don't find much enthusiasm for taking advantage of that opening. An API that is fully conversant in the full 64 bits of the IEEE binary64 type seems like the right fit for the class Double. Then we can turn to other value classes that want to wrap 'double' fields. Some of them very well may want to normalize, and their authors are free to do so. Others might want to do encoding tricks with NaN, as far as the hardware allows. Yet others might opt for an integral type instead for their internal encoding (e.g., in my HalfFloat example above), and again these classes are free to normalize or not normalize in their own program logic. This leaves the platform itself to take a hands-off, least-common-denominator approach: bitwise equivalence, with classes responsible for managing their own bits and their own 'equals' relations. From daniel.smith at oracle.com Tue Feb 13 21:09:43 2024 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 13 Feb 2024 21:09:43 +0000 Subject: Value object equality & floating-point values In-Reply-To: References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <378C86C4-134C-48A3-B6C3-7282E83305ED@oracle.com> > On Feb 12, 2024, at 6:54 PM, Joe Darcy wrote: >> >> >> To put it simply, the spec *already allows* normalization (for the >> very good reason that no Java developer should ever be exposed to >> different NaNs). > > The origin of that paragraph stems from idiosyncrasies of 64-bit double support on the x87 FPU -- tl;dr a format-changing load/store instruction rather than a move instruction had to be used to get values on and off the floating-point stack.
Fortunately, JEP 306: "Restore Always-Strict Floating-Point Semantics" removed most such issues from a spec perspective, which had not been a serious concern from an implementation perspective for some time. Joe, does this imply that primitive floating-point data movement *in general* may not preserve NaN bits? E.g., the following assert may fail?

value class C {
    double d;
    C(double d) { this.d = d; }
    static double TMP;
}

double d1 = ...;
C.TMP = d1;
double d2 = C.TMP;
C c1 = new C(d1);
C c2 = new C(d2);
assert c1 == c2; // bitwise equivalence test

From john.r.rose at oracle.com Tue Feb 13 21:18:24 2024 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Feb 2024 13:18:24 -0800 Subject: Value object equality & floating-point values In-Reply-To: <1198513186.3270854.1707510749886.JavaMail.zimbra@univ-eiffel.fr> References: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com> <1198513186.3270854.1707510749886.JavaMail.zimbra@univ-eiffel.fr> Message-ID: <39EA432A-88C4-42CD-873B-1854366DA1A6@oracle.com> On 9 Feb 2024, at 12:32, forax at univ-mlv.fr wrote: > > We fix == but for NaN looks like a terrible slogan. There's no question of fixing anything. Actually, float== (fcmp/dcmp) is permanently broken for NaN. It's the inevitable consequence of having Java import IEEE 754. You already teach your students that, surely. We aren't trying to fix ==. We are steering towards a semantics for values which is compatible with ref== (acmp) today. And that is substitutability, in all its details, including the unwelcome details. Two objects which store distinguishable NaNs are distinct, hence not the same, and hence not ref==. (Whether they are float== is a very different matter.) We are not trying to align ref== with float==, because that is an impossible goal.
We could try to file off an unimportant rough edge by making some NaNs less distinct from each other, by normalizing NaNs stored in some fields in some classes. (All value class fields is the proposal of the week.) That won't change the result of float== on such values; they are still never float==, whether distinct or not. Filing off that rough edge would shift around the envelope of the sameness predicate ref==. Nobody will care! That ref== is also known (to all well-informed programmers) to be rather mysterious because of accidental object identity. We make ref== somewhat less mysterious by making value objects more likely to compare the same (under ref==). But we don't undertake to fold in all possible kinds of equality checks -- that is what Object::equals is for. I don't buy the claim that people will be more surprised by our changes to NaN behavior and object identity (already very surprising!). We have Float::floatToRawIntBits for reasons, and it would be a heavy lift to get rid of it. (Would also delay Valhalla.) We might wish it were not necessary, because it necessarily brings in questions of bitwise equivalence. But the reasons are there: A. IEEE 754 leaves open the possibility that distinct NaNs might carry interesting information. Java doesn't have to respect that, but it is reasonable to do so. B. Hardware platforms produce varying NaN values in varying conditions. It is simpler to let sleeping NaNs lie, rather than to give each new floating point value a bath to make it look normal, on the chance it might have been an ugly NaN. This is why (I think) we added the 'raw' (bitwise) versions of the float and double conversion methods: The non-raw versions do too much work for too little benefit. C. Following up on B, routine normalization (say, on every data store, not just as requested explicitly by Float::floatToIntBits) would make all float-related data structures slower.
Also it would divert JIT optimizer work to solve a problem we made for ourselves, to avoid normalizations for values already proved normal. D. Some programmers actually use the full 64-bit bandwidth of IEEE 754 double values, for exotic purposes. No Java libraries I know of do this, but I've heard of it in JavaScript interpreters which encode managed pointers as NaNs. If we start normalizing stored float values, we might cause a compatibility problem with some clever Java code out there. I put this last because although it is a nonzero risk, the downside is likely limited. Point C (an endemic normalization cost) is my main concern. I also agree with Chen's characterization of the plan of record as not messing with Cartesian products. They are efficiently stored in memory, component-wise (bitwise!!) as long as you don't try to filter out the "ugly" combinations you don't prefer. This is equivalent to my observation about the "64-bit bandwidth" of doubles. If I can't get 64 independent bits out of my double field, then I know somebody is probably spending extra cycles to suppress unwanted bit combinations. From john.r.rose at oracle.com Tue Feb 13 21:31:07 2024 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Feb 2024 13:31:07 -0800 Subject: Value object equality & floating-point values In-Reply-To: References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> <8B2DD8F4-3A04-4807-AABB-E8442FA7BF5A@oracle.com> Message-ID: <7A397C81-93B5-4000-BEDA-29FE41197DBE@oracle.com> +1; thanks Dan. Value class authors can work with the tools we give them, if they prefer normal forms. (And they will usually prefer to write their own .equals as well, although it's not our place to force them to.)
On 13 Feb 2024, at 13:00, Dan Smith wrote: > Another thing worth pointing out about field storage normalization: as it applies to the (value) class Double, it effectively deprecates the 'doubleToRawLongBits' method and implies that, from now on, in the boxed Double world, there's only one NaN. > > Stephen pointed out in valhalla-spec-observers that the factory method 'longBitsToDouble' turns out to allow for some amount of normalization, countering my claim that it must preserve all bits as provided. That is true, yet the Double API as a whole is clearly designed to support multiple NaNs. Without them, the introductory discussion about bit-wise vs. representational equivalence, and the distinction between 'doubleToLongBits' and 'doubleToRawLongBits', become moot. > > So the 'longBitsToDouble' specification may give us an opening to change the space of things encodable with Double, but I don't find much enthusiasm for taking advantage of that opening. An API that is fully conversant in the full 64 bits of the IEEE binary64 type seems like the right fit for the class Double. > > Then we can turn to other value classes that want to wrap 'double' fields. Some of them very well may want to normalize, and their authors are free to do so. Others might want to do encoding tricks with NaN, as far as the hardware allows. Yet others might opt for an integral type instead for their internal encoding (e.g., in my HalfFloat example above), and again these classes are free to normalize or not normalize in their own program logic. > > This leaves the platform itself to take a hands-off, least-common-denominator approach: bitwise equivalence, with classes responsible for managing their own bits and their own 'equals' relations.
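[Editorial note: a sketch of what "work with the tools we give them" could look like for an author who prefers normal forms — a hypothetical wrapper (again a plain final class, since value classes are unreleased) that canonicalizes NaNs once, in its constructor, so that a bitwise '==' and its 'equals' would agree. It assumes the JVM preserves stored double bits, which the thread notes holds on mainstream implementations.]

```java
// Hypothetical normalizing wrapper. The class name is invented; the
// constructor collapses every NaN encoding to the canonical one, so the
// bitwise comparison used by 'equals' (mimicking value-class '==') behaves
// representationally for clients.
final class NormalizedDouble {
    private final double d;

    NormalizedDouble(double d) {
        // doubleToLongBits maps every NaN to the canonical quiet NaN bits.
        this.d = Double.longBitsToDouble(Double.doubleToLongBits(d));
    }

    double value() { return d; }

    // Bitwise comparison, as value-class '==' would do on the stored field.
    @Override public boolean equals(Object o) {
        return o instanceof NormalizedDouble other
            && Double.doubleToRawLongBits(d) == Double.doubleToRawLongBits(other.d);
    }

    @Override public int hashCode() { return Double.hashCode(d); }
}
```

This is the pay-per-class version of the normalization idea: the branchy canonicalization cost John objects to is opted into by one class, not imposed on every value-class field store.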
From forax at univ-mlv.fr Tue Feb 13 21:33:28 2024 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 13 Feb 2024 22:33:28 +0100 (CET) Subject: Value object equality & floating-point values In-Reply-To: <8B2DD8F4-3A04-4807-AABB-E8442FA7BF5A@oracle.com> References: <1908581584.3891098.1707598626337.JavaMail.zimbra@univ-eiffel.fr> <8B2DD8F4-3A04-4807-AABB-E8442FA7BF5A@oracle.com> Message-ID: <981417195.6193163.1707860008800.JavaMail.zimbra@univ-eiffel.fr> ----- Original Message ----- > From: "John Rose" > To: "daniel smith" > Cc: "valhalla-spec-experts" , "Remi Forax" > Sent: Tuesday, February 13, 2024 9:22:49 PM > Subject: Re: Value object equality & floating-point values > On 12 Feb 2024, at 13:00, Dan Smith wrote: > >>> For each `float` or `double` field in a value class, the constructor will >>> generate normalization code > > Indeed we considered this, but consider the tradeoff: > > Benefit: Somebody who is looking very closely indeed at NaNs sees a different > op== behavior. That person is pleased, yay! > > Cost: Everybody who stores floats and doubles in value classes sees object > creation slow down due to branchy normalization logic, at least wherever > hardware creates unpredictable NaNs and/or NaNs that don't fit the Java model > of "good" NaNs. > > The cost is too risky, just to please a few NaN-detectives. It depends on whether the normalization is done at creation time or when == is executed. For the latter, in most cases, no non-canonical NaN will be seen, so it will be a branch not taken (especially if the test only branches for non-canonical NaNs). Rémi From john.r.rose at oracle.com Tue Feb 13 21:38:51 2024 From: john.r.rose at oracle.com (John Rose) Date: Tue, 13 Feb 2024 13:38:51 -0800 Subject: Simplifying 'value' and 'identity' In-Reply-To: References: Message-ID: <0CDBB8B8-01E7-412C-A474-A54B5470E100@oracle.com> Good summary, Kevin. And, regardless of your email domain, please do keep your chair at this table!
I think one smallish reason we had more value-related distinctions in the past, among supers, was our inability to imagine how abstract classes might contribute constructors and fields to values. We have a story for that now, but only since last July (at the earliest). Now, more super classes are value-capable, so there is less need to mark them explicitly value-capable, and then to check them structurally against such markings. No marks are the best marks. On 8 Feb 2024, at 18:32, Kevin Bourrillion wrote: > Hey everyone, > > [I assume Google will put forth a replacement representative at some > point, > but either way I've joined this group as an individual member. > kevinb9n at gmail.com is how to reach me now!] > > Here's a response to Dan's December thread > > which I can't reply to from this other account. > > So as far as I can tell I think this dovetails perfectly with what > I've > always wanted, which is the "identity is just an attribute" model. I > like > pretty much everything about it. Under that idea, identity is absent > at the > root of the hierarchy, can be added by any subtype, then is inherited > by > everything below that. Any type that wants to be mutable, lockable, > etc. > would have to flip that identity switch on. > > So I don't see that there needs to be any difference between "value" > and > "indeterminate". Both are just "doesn't have the identity attribute". > > I feel like there must have been some reason we thought that wouldn't > pan > out, but I don't remember. In some sense, adding the "attribute" of > identity also *removes* a capability at the same time (the capability > of > being copyable and collapsible at the VM's whims), and at one point I > thought that sunk the whole model. But that's not a *client*-facing > capability, and I think maybe that excuses it. 
> So anyway, I *think* the only adjustment to my preferred dream model > that I > recognize in Dan's email is "interfaces don't get to add this identity > attribute", which okay, no great loss I suppose. > > Of course, I don't remember what past discussions I don't remember, so > please let me know what I'm missing here. From forax at univ-mlv.fr Tue Feb 13 21:56:33 2024 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 13 Feb 2024 22:56:33 +0100 (CET) Subject: Value object equality & floating-point values In-Reply-To: <39EA432A-88C4-42CD-873B-1854366DA1A6@oracle.com> References: <617932476.3148578.1707502383237.JavaMail.zimbra@univ-eiffel.fr> <2A11AB2C-C21D-4483-AB81-9EC2210035A5@oracle.com> <1198513186.3270854.1707510749886.JavaMail.zimbra@univ-eiffel.fr> <39EA432A-88C4-42CD-873B-1854366DA1A6@oracle.com> Message-ID: <1225514754.6195814.1707861393622.JavaMail.zimbra@univ-eiffel.fr>
I teach my students the difference between == and Double.equals(), and why both exist.

The main problem I see with value types and == being bitwise equivalence is that the wrapper types will now mostly work with ==. Because the case where it does not work is a corner case of a corner case (you need NaNs that are not Double.NaN), students will never see the issue. If an issue is not easily reproducible, it's hard to convince students that the issue exists (we have the same problem with teaching the difference between the different hardware memory models).

> We are not trying to align ref== with float==, because that is an impossible goal.
>
> We could try to file off an unimportant rough edge by making some NaNs less distinct from each other, by normalizing NaNs stored in some fields in some classes. (All value class fields is the proposal of the week.) That won't change the result of float== on such values; they are still never float==, whether distinct or not.

I think that normalizing NaNs is an operation user code can do, and maybe something we will have to teach students if == on value types uses bitwise equivalence. But it's not something the execution model should do, at least not only for value classes. I agree with you on that.

That does not change the fact that == on value classes can be representational equivalence instead of bitwise equivalence. It's what Double.equals()/Float.equals() do.

BTW, I've never heard anyone complain that Double.equals()/Float.equals() should not do NaN normalization for performance reasons.

Rémi

From daniel.smith at oracle.com Wed Feb 21 03:11:22 2024
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 21 Feb 2024 03:11:22 +0000
Subject: EG meeting, 2024-02-21
Message-ID:

We'll hold an EG meeting Wednesday, 5pm UTC (9am PST, 12pm EST).
We can review the discussion in the thread "Value object equality & floating-point values", which addressed the use of raw floating-point bits in '==' value object comparisons.

From daniel.smith at oracle.com Wed Feb 21 20:30:46 2024
From: daniel.smith at oracle.com (Dan Smith)
Date: Wed, 21 Feb 2024 20:30:46 +0000
Subject: Timeline for wrapper class constructor removal from JDK?
In-Reply-To:
References:
Message-ID: <7FFCFB86-7F14-45FF-9119-CCE553B799BF@oracle.com>

I got this question about the fate of wrapper class constructors, and it's worth broadcasting the reply:

> On Feb 21, 2024, at 8:59 AM, Kurt Alfred Kluever wrote:
>
> I'm wondering if there's any ETA for when the wrapper class constructors will actually be removed from the JDK?
>
> More info: I'm an engineer on the Java Core Libraries team at Google. I've mostly scrubbed our own depot of the wrapper class constructors (using ErrorProne's BoxedPrimitiveConstructor refactoring), but the situation for third party libraries is much, much worse. We have literally hundreds of third party libraries using these constructors.
>
> And of course, most of these libraries are legacy / unmaintained upstream. We can locally patch some of them, try to upstream some fixes, but the sheer number of libraries is making me fairly nervous about trying to eventually update our JDK to a version w/o these constructors.
>
> Any insight about when we should expect the removal would be helpful in terms of planning.

Yes, when we did JEP 390 [1], we understood that there would continue to be class files in the wild calling 'new java/lang/Integer', and those class files would need to continue to work. The JEP refers to "Tooling to support execution of binaries that are unable to update their usages of wrapper class constructors" as future work. We envisioned some sort of bytecode rewriting capability to clean up these class files.
Since then, we've updated our construction approach for value classes (see [2]), and it is no longer the case that 'new java/lang/Integer' in bytecode is incompatible with a value class Integer. Thus, we no longer need any special tooling; the stale binaries will just continue working!

That said, these old classes will stop working if we ever go through with the threatened removal, as advertised by JEP 390. It probably makes sense to roll that back and just treat these constructors as normal @Deprecated APIs, with no specific plans for future removal.

[1] https://openjdk.org/jeps/390
[2] https://openjdk.org/jeps/401#Value-object-construction
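[For readers following along: the source-level migration that ErrorProne's BoxedPrimitiveConstructor automates, and the identity difference that makes the deprecated constructors observable, can be sketched as below. This is an illustrative example under today's pre-value-class semantics; the class name is invented for the sketch.]

```java
public class WrapperMigration {
    @SuppressWarnings({"deprecation", "removal"})
    public static void main(String[] args) {
        // Deprecated for removal: each call allocates a fresh identity object.
        Integer a = new Integer(127);
        Integer b = new Integer(127);

        // Preferred factory: valueOf may return a cached instance
        // (the range -128..127 is guaranteed to be cached).
        Integer c = Integer.valueOf(127);
        Integer d = Integer.valueOf(127);

        System.out.println(a == b);      // false: distinct identities today
        System.out.println(c == d);      // true: same cached instance
        System.out.println(a.equals(c)); // true: same int value either way
    }
}
```

Under JEP 401's revised construction model, the point above is that if Integer becomes a value class, the stale 'new java/lang/Integer' bytecodes still just produce a value object, so only code that depends on the distinct-identity result of 'a == b' would notice any change.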