From karen.kinnear at oracle.com  Wed Feb 13 15:41:37 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 13 Feb 2019 10:41:37 -0500
Subject: Valhalla Meeting Notes Jan 30, 2019
Message-ID: <460521D8-FFD2-43F8-890C-2098536FABDA@oracle.com>

Next meeting Feb 13, 2019

Attendees: John, Brian, Remi, Tobi, Dan H, Frederic, Karen

AI: Remi: send GitHub POC experiment outside of vm with bridge forwarding

I. Problem: Where do I put privileged, final, value-specific behaviors?
   Initial example: Object.wait*/notify*.
   Is there a clean way to do this without JVM magic?
   You cannot have final methods in interfaces, and Object methods are always found first.

Proposal:
   JVM injects a new super class, RefObject, for all reference objects, i.e. all existing class files.
   Require ValObject as super class for all value classes.
   java.lang.Object is treated as an honorary interface top type, e.g. in descriptors.
   new Object[] uses Object as the interface top type and can hold any RefObject or ValObject.
BG: This fits with a longer-term new top type for arrays, e.g. interface Array

Proposal 2:
   Extend the canonical factory model used by e.g. List.of().
   Allow java.lang.Objects and Generics to create a default factory for interfaces and abstract classes.
   "new" bytecode (without new/dup/ would invoke the default factory, e.g. a static
   This would be used to allow instantiation of Object(), so new Object() would create a canonical Object which has RefObject as a super class.
   Brian does not want RefObject or ValObject to be instantiable.

DH: Are factories responsible for super calls?
KK: How does the vm tell a new "new" bytecode from an old one?
JR: At link resolution time, you know the type you wanted
DH: What if the CP entry is used by ldc also?
JR: For "new" might need to renumber the CP index or store resolution separately

Concerns:
1. Class.getSuperclass() explicitly assumed to be java.lang.Object
   * need to find frequency of this in codebases
2.
   Code that assumes explicit hierarchy depth -> can adapt source

Alternative I: both as interface
RF: easier to inject
BG: VM must enforce not allowing implementing both at once
    Teaching - this model is not clean from a user expectation perspective

Alternative II (fallback): ValObject as super class, RefObject as super interface
BG: wait/notify: need final implementation of wait/notify that normal tools understand vs. ad hoc rules
DH: Retcon Object -> RefObject? What if we were to change wait/notify to only exist for RefObject?
BG: concern about speed of translation
JR: Generics have parallel reflective problems with ArrayList and Object, with getClass and getSpecies
KK: sync is special in the vm for ValObject anyway, so not significant to handle wait/notify specially
BG: if from scratch would do classes, if too hard to retrofit could explore Alternative II

BG: List can allow any value class
RF: [ValObject]
BG: abstract type, nullable and not flattened

RF: Lambda super is Object
BG: Actually it is not specified
FP: JVMS: Interface: link to Object methods must succeed, not required from Object

II. Bridge Methods - BG
Goals:
1. wildcards for specialized generics
2. least important: current bridge methods are a constant source of pain
   separate compilation can give the wrong answer, generating them in the vm could help

   Foo<T> { T f1; }
   Foo f1 has type Object (past and future), can reference through a wildcard
   methods: could do bridges (also want asType() conversions)
   fields: want to link with asType() conversions
   capture that two signatures describe the same member
      language level: 1 member
      vm level: 2 unrelated bridges

Farther future goal:
3. Member type signature migration
   e.g.
   Collection.size() int -> C.s()long or OptionalInt -> Optional
RF: goal 3 is not just covariance
BG: clients and subtypes

Proposal: Forwarding bridges
   use info: resolution and selection time
   if the compiler generated bridge, only used at selection time
   if resolution hits the bridge, rerun resolution with new descriptor
   if only at selection time: miss ability to correct bridge loops
RF: ok with forward
    asType must be done by target
BG: "asType" e.g. Date -> LocalDate: need plug-in code
RF: which class decides xform?
BG: declaration of migration must provide asType conversion
BG: Change signature type: declare signature and provide asType
JR: behave as if bridge was there as synthetic
BG: if trivial, vm knows widening rules
JR: upfront - do not want loops
    combinatorial complexity with species
    dynamic injection of synthetic methods
KK: like having forward info in the vm, need to walk the details - e.g. table of client/subtype & overriding, could use use cases
BG: Dan Smith said we need to check mixing old style bridges and forwarding
JR: thought experiment: Mindy - what if we had a class with no method, BSM for all linkage requests for methods
BG: prototype strategy
    use annotations
    all member access -> indy
    easier to prototype resolution behavior than selection behavior
RF: have a POC on GitHub
JR: helpful to experiment with semantics of resolution without changing vm
BG: Vicente has a javac mode that does not generate current bridges
BG: or could use a bytecode weaver
    javac allows overloads with different return types today - could use those as experiments
DH/TA: look at J9 - spinning MethodHandles for the new behavior
JR: Maybe parallel tables, hard to wedge MH
KK: still need to study expected behaviors, e.g. for virtual fields
RF: vtable for fields
DH: will the vm know all signatures at vtable build time?
BG: yes, classfile contains orig sig and forward sig
JR: fixed set of members at class load time
    with BSM adaptors - could change (ed. note - what could change?)
BG: old signature in vtable
    "as if final" = migrated, can't override migrated methods
    linktime forwarding: change slot, have receiver do selection time forwarding
RF: Can't call the BSM until the first call
KK: Two sets of asType adaptations

III. Equality
JR: 1. Substitutability check != Reference equality
       except for RefObject, does not require identity
       Value Types: fields pairwise substitutable
       Interface/Object: need dynamic check for Val vs Ref
          if Ref: ref equality, if Val: val substitutability
    2. SubstitutabilityHash != Object.hashCode, != identityHashCode
       system.substitutability
       retcon acmp as substitutability test
BG: cost benefits discussion
DH: one design center: numerics will write methods w/operator overloading
    == fits nicely into operator overloading
BG: more fundamental case
    pass sentinel into code that takes m(Object) or m()Object
    == to check sentinel or existence of element
    if values, invisible even though allowed as input/return
JR: operator overload if statically know type - great
    other: legacy code, generic code
DH: what about generic code
    == followed by .equals "most" use
JR: well constructed code - is not helped, we must ensure not harmed
DH: .equals may start with ==, so pay cost twice
BG: JIT - if can inline .equals, can CSE to only have 1 acmp
DH: not if generic
BG: all generics over VT are specialized
DH: ArrayList is not specialized
JR: either a fountain of razor blades for users or engineering tasks for us
    e.g. LIFE (legacy idiom for equality): JIT: if dynamically value -> ?
    explore - maybe spec? .equals allowed to absorb a preceding substitutability test?
BG: need data on costs rather than "living in fear"
KK: Need to gather performance data on larger apps & code cache costs
    Need to deal with deep recursion/SOE
JR: agree there are risks - need to experiment and measure

Corrections/clarifications welcome,
thanks,
Karen

From forax at univ-mlv.fr  Wed Feb 13 16:13:22 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 13 Feb 2019 17:13:22 +0100 (CET)
Subject: Valhalla Meeting Notes Jan 30, 2019
In-Reply-To: <460521D8-FFD2-43F8-890C-2098536FABDA@oracle.com>
References: <460521D8-FFD2-43F8-890C-2098536FABDA@oracle.com>
Message-ID: <1482107981.644632.1550074402276.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Karen Kinnear"
> To: "valhalla-spec-experts"
> Sent: Wednesday, February 13, 2019 16:41:37
> Subject: Valhalla Meeting Notes Jan 30, 2019

> Next meeting Feb 13, 2019
>
> Attendees:
> John, Brian, Remi, Tobi, Dan H, Frederic, Karen
>
> AI: Remi: send GitHub POC experiment outside of vm with bridge forwarding

https://github.com/forax/indy-everywhere

Rémi

From brian.goetz at oracle.com  Wed Feb 20 20:09:43 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 20 Feb 2019 15:09:43 -0500
Subject: Value types, encapsulation, and uninitialized values
In-Reply-To:
References:
Message-ID:

Closing the loop on this story....

To summarize what's been said on this thread:

 - Everyone agrees that there are at least some value types that don't
   have a natural (non-null) default, and that making up a default with
   zeros for these types is, at best, surprising.  As Kevin put it:

> Most value types, I think, don't have a zero, and I believe it will be
> quite damaging to treat them as if they do. If Java doesn't either
> provide or simulate nullability, then users will be left jumping
> through a lot of hoops to simulate nullability themselves (`implements
> IsValid`??).

 - John made an impassioned plea for not inventing new,
   null-like-but-not-null mechanisms, which can be summarized as "no new
   nulls":
   http://cr.openjdk.java.net/~jrose/values/nullable-values.html

 - The motivation for supporting "nullable" values is not because we
   think values should have null in their value set; this is better
   handled by Optional or type combinators like `Foo?`.  This is really
   about what happens when someone stumbles across a value that has not
   been initialized by a constructor (and the most common case here is
   array elements.)

 - From a user-model perspective, there are a few options.  Several
   folks were bullish on letting the user provide an initial value (say,
   via a no-arg constructor), but I think this idea runs off the road,
   since there are some types that _simply have no reasonable (non-null)
   default value_.  These include domain types like

       value record PersonName(String first, String last);

   a default name of ("", "") is only slightly less stupid a default
   than (null, null).  These also include inner class instances; if
   there's no enclosing instance available, what are we going to do?

Separately, we have explored a number of ways we might implement this in
the VM, and I think we have a sensible story.  Some value types are
_zero intolerant_ -- this means that the all-zero value is not a member
of their value set.  The key observation is:

    nullability, zero-tolerance, flattenability -- pick two

That is, you can have nullable, zero-tolerant values (think `Point?`),
but they don't get flattened; or you can have zero-tolerant, flattenable
values, but they can't be null.  The third combination (thanks
Frederic!) is that it is possible to have nullable, flattenable values,
if we make the all-zero representation illegal, and then we use the
all-zero representation in the heap to represent `null`, and `getfield`
/ `aaload` will check for zero on fetch and if zero, put a null on the
stack.  (There's a much bigger writeup on this coming; this is the
executive summary.)  And because values are monomorphic, different value
types can make different choices.
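
As a rough illustration of the third combination (nullable and
flattenable, at the cost of zero-intolerance), here is a minimal sketch
in today's Java.  Everything in it is hypothetical: `FlatDateArray`, its
`long[]` backing store and the packing scheme only stand in for what the
VM might do internally on `getfield` / `aaload`; none of it comes from
an actual implementation.

```java
/**
 * Hypothetical sketch: a "flattened" array of dates in which the all-zero
 * bit pattern is reserved to mean null. This is an illustration only,
 * not how the VM implements null-default values.
 */
public final class FlatDateArray {
    /** Plain final class standing in for a null-default value type. */
    public static final class Date {
        public final int year, month, day;

        public Date(int year, int month, int day) {
            // month and day start at 1, so a valid Date never packs to 0L:
            // the type is "zero intolerant" by construction.
            if (month < 1 || month > 12 || day < 1 || day > 31) {
                throw new IllegalArgumentException("invalid date");
            }
            this.year = year;
            this.month = month;
            this.day = day;
        }
    }

    // One long per element: no per-element object header, no pointer.
    private final long[] cells;

    public FlatDateArray(int length) {
        this.cells = new long[length]; // all zero, i.e. every element reads as null
    }

    /** Store: null is written as the all-zero pattern. */
    public void store(int index, Date d) {
        cells[index] = (d == null) ? 0L : pack(d);
    }

    /** Load: check for zero on fetch and, if zero, produce null. */
    public Date load(int index) {
        long bits = cells[index];
        return (bits == 0L) ? null : unpack(bits);
    }

    private static long pack(Date d) {
        return ((long) d.year << 16) | ((long) d.month << 8) | d.day;
    }

    private static Date unpack(long bits) {
        int year = (int) (bits >> 16);
        int month = (int) ((bits >> 8) & 0xFF);
        int day = (int) (bits & 0xFF);
        return new Date(year, month, day);
    }
}
```

The extra branch in `load` corresponds to the "slightly slower" zero
check on the heap-to-stack path; the storage itself stays dense.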
Further, a key use case is _migrating_ value-based classes
(LocalDateTime, Optional) to value types.  The key impediment so far
here has been nullability; we can represent them as nullable +
flattenable if we're willing to give up zeros.  Since zeros are a pure
implementation detail, a class that wants to migrate can always find a
representation where there is at least one non-zero bit.

So, the sweet spot seems to be:

 - Values, by default, are non-nullable and flattenable.  The compiler
   translates value `Point` as `QPoint;`.

 - Users can denote the union of the value set and { null } using an
   emotional type: `Point?`, which the compiler translates as
   `LPoint;`.  If a user wants a nullable `Point`, they ask for it; what
   they give up is flattenability / scalarization.  (I resisted the
   emotional types as long as I could, but the alignment with the VM
   implementation was too strong to resist, and this yields significant
   dividends when we get to the generics story.)  Let's not harp on the
   details of these types just yet; that's a separate shed to paint.

 - For values that need to defend against uninitialized data, or values
   that are migrated from references, they can declare themselves to be
   "null-default"; the cost of these is they must be intolerant of the
   all-zero value.  These are always translated with `L` carriers, since
   they are nullable.  Users of these classes pay the extra penalty of
   checking for zeroes when we go between heap and stack, so they are
   slightly slower, but they still are flattened and scalarized, which
   is the big benefit.  (Again, I resisted John's point about nulls, but
   eventually the gravity was too strong; if we don't use null here,
   we'll reinvent a worse null.)

Which correspond to the 3-choose-2 combinations deriving from the
observation above.

From a user model perspective, users choose between zero-default values
(the default) and null-default values (opt in), as the semantics
demands.
This is easy to understand (in fact, the biggest risk might be that
users will like it _too much_, and they'll reach for null-default value
classes more often than they should.)  And if you want to represent
"maybe Point", you use `Point?` or `Optional` as needed.

From a VM perspective, we need to support null-default values; while
we've not implemented this yet, it seems pretty reasonable.

The bonus is that we have cleared the last blocker to migrating
value-based classes to value types; for migrated values, we implicitly
make them null-default (also: same treatment for inner value classes),
and then migrating Optional and LocalDateTime becomes a completely
compatible, in-place move.

On 10/11/2018 10:14 AM, Brian Goetz wrote:
> Our story is "Codes like a class, works like an int".  A key part of
> this is that value types support the same lifecycle as objects, and
> the same ability to hide their internals.
>
> Except, the current story falls down here, because authors must
> contend with the special all-zero default value, because, unlike
> classes, we cannot guarantee that instances are the result of a
> constructor.  For some classes (e.g., Complex, Point, etc), this
> forced-on-you default is just fine, but for others (e.g., wrappers for
> native resources), this is not unlike the regrettable situation with
> serialization, where class implementations may confront instances that
> could not have resulted from a constructor.
>
> Classes guard against this through the magic of null; an instance
> method will never have to contend with a null receiver, because by the
> time we transfer control to the method, we'd already have gotten an
> NPE.  Values do not have this protection.  While there are many things
> for which we can say "users will learn", I do not think this is one of
> them; if a class has a constructor, it will be assumed that the
> receiver in a method invocation will be on an instance that has
> resulted from construction.
> I do not think we can expose the
> programming model as-is; it claims to be like classes, but in this
> aspect is more like structs.
>
> So, some values (but not all) will want some sort of protection
> against uninitialized values.  One approach here would be to try to
> emulate null, by, say, injecting checks for the default value prior to
> dereferences.  Another would be to take the route C# did, and allow
> users to specify a no-arg constructor, which would customize the
> default value.  (Since both are opt-ins, we can educate users about
> the costs of selecting these tools, and users can get the benefits of
> flatness and density even if these have additional runtime costs.)
> The latter route is less rich, but probably workable.  Both eliminate
> the (likely perennial) surprise over uninitialized values for
> zero-sensitive classes.

From forax at univ-mlv.fr  Wed Feb 20 20:33:14 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 20 Feb 2019 21:33:14 +0100 (CET)
Subject: acmp again !
Message-ID: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>

I still think we have not finished exploring how to implement acmp for value types.

We currently have two semantics for value types:
- always return false,
- recursively compare each component

Support of acmp, like the support of synchronized or System.identityHashCode, is not a part of the original model where we can just say "it should work like an int", because it's a consequence of lworld, making value types subtypes of Object, not something inherent to the concept of value types. Or said differently, if it should work like an int, a == on value types at Java level should be converted to a vcmp (like there is an icmp_*), which is not an answer to how acmp is supposed to work.

The issue with the first semantics is that it will be very surprising and will not be compatible with all the existing code that only uses == instead of a == followed by a call to equals().
The issue with the second semantics is that it's an unbounded computation, shaking the Java performance model everybody has in mind by moving == from one of the fastest operations to a potentially very slow operation.

I wonder if there is not an intermediate semantics: return false if one field is an Object or an interface, or do a component-wise comparison otherwise?

For value types like Point, Complex, == is the component-wise comparison; for a value type that works like Optional, == returns false.

It seems not that bad, but it means that doing an == on a wildcard of a reified generics like Atomic depends on the class of T at runtime?

Rémi

From brian.goetz at oracle.com  Wed Feb 20 20:52:01 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 20 Feb 2019 15:52:01 -0500
Subject: acmp again !
In-Reply-To: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
Message-ID:

Yes, it's a tempting direction.  John and I both walked that road, in
the hope that there was a reasonable place to cut off the regress.  (And
we both walked back disappointed.)

One problem is that with erased generics, all Ts become Object:

    /* erased */ value class Box<T> {
        T t;
    }

translates to

    value class Box {
        Object t;
    }

So no two boxes would be == to each other -- in fact, a given Box would
not be == to itself.  I think users would find

    Optional o = Optional.of(p);
    o == o // false

to be surprising!  Similarly, classes like

    value record Twople<T, U>(T t, U u);

    Twople x = new Twople("a", "b"),
           y = x;
    x == x // false
    x == y // false

Whereas the non-generic version:

    value record TwoStrings(String a, String b);

    TwoStrings x = new TwoStrings("a", "b"),
               y = x;
    x == x  // true
    x == y  // true

So I think what this does is just move the surprise to where you don't
trip over it 100% of the time, but you still trip often enough that you
curse it.
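
To make the objection concrete, the cutoff rule discussed above (return
false as soon as a field is typed as Object or an interface, otherwise
compare component-wise) can be simulated with reflection in today's
Java.  This is only a sketch under stated assumptions:
`intermediateEquals` and the three stand-in classes are hypothetical,
ordinary final classes play the role of value classes, and reflection
replaces whatever the VM would actually do.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public final class IntermediateAcmpSketch {
    /** Generic stand-in: after erasure the field type is Object. */
    public static final class Box<T> {
        final T t;
        public Box(T t) { this.t = t; }
    }

    /** Non-generic stand-in: both fields have the concrete type String. */
    public static final class TwoStrings {
        final String a, b;
        public TwoStrings(String a, String b) { this.a = a; this.b = b; }
    }

    /** Primitive-only stand-in. */
    public static final class Point {
        final int x, y;
        public Point(int x, int y) { this.x = x; this.y = y; }
    }

    /**
     * Hypothetical "intermediate" comparison: false if any instance field is
     * declared as Object or as an interface; otherwise primitives are compared
     * by value and other references by identity.
     */
    public static boolean intermediateEquals(Object x, Object y) {
        if (x == null || y == null || x.getClass() != y.getClass()) {
            return false;
        }
        try {
            for (Field f : x.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue;
                }
                Class<?> type = f.getType();
                if (type == Object.class || type.isInterface()) {
                    return false; // the cutoff: give up on "open" field types
                }
                f.setAccessible(true);
                Object a = f.get(x);
                Object b = f.get(y);
                // Primitives arrive boxed, so compare them with equals().
                boolean same = type.isPrimitive() ? a.equals(b) : a == b;
                if (!same) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this rule a `Box` is not even equal to itself, because its erased
field type is Object, while a `TwoStrings` compared with itself is.
That asymmetry is the surprise being described.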

From brian.goetz at oracle.com  Wed Feb 20 21:25:51 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 20 Feb 2019 16:25:51 -0500
Subject: acmp again !
In-Reply-To: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
Message-ID: <1c0f0702-a6ec-8845-17c1-be290211ce3f@oracle.com>

> The issue with the first semantics is that it will be very surprising and will not be compatible with all the existing code that only uses == instead of a == followed by a call to equals().
> The issue with the second semantics is that it's an unbounded computation, shaking the Java performance model everybody has in mind by moving == from one of the fastest operations to a potentially very slow operation.

I'll just add a few more points to this.  (All of these amount to: you
have a point, but please, let's not overstate it.)

1.  Java users are used to unbounded equality computations; calling
    .equals() on a List or Map does a ton of work, and nobody is shocked
    by this.  What's potentially surprising is to see this behavior
    behind `==`.  That's the new bit.

2.  Generic code that only uses == and not .equals() is generally
    broken, so I am not too bothered by this "compatibility" concern.

3.  We don't actually know what the performance impact is.  (And for
    common cases, it will likely be small.)  We can speculate, but
    better to gather some performance data before we write off this
    approach.  There will be a prototype soon that we can play with.

From forax at univ-mlv.fr  Wed Feb 20 21:55:14 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 20 Feb 2019 22:55:14 +0100 (CET)
Subject: acmp again !
In-Reply-To:
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
Message-ID: <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax" , "valhalla-spec-experts"
> Sent: Wednesday, February 20, 2019 21:52:01
> Subject: Re: acmp again !

> Yes, it's a tempting direction.  John and I both walked that road, in
> the hope that there was a reasonable place to cut off the regress.
> (And we both walked back disappointed.)
>
> One problem is that with erased generics, all Ts become Object:
>
>     /* erased */ value class Box<T> {
>         T t;
>     }
>
> translates to
>
>     value class Box {
>         Object t;
>     }
>
> So no two boxes would be == to each other -- in fact, a given Box would
> not be == to itself.  I think users would find
>
>     Optional o = Optional.of(p);
>     o == o // false
>
> to be surprising!  Similarly, classes like
>
>     value record Twople<T, U>(T t, U u);
>
>     Twople x = new Twople("a", "b"),
>            y = x;
>     x == x // false
>     x == y // false
>
>
> Whereas the non-generic version:
>
>     value record TwoStrings(String a, String b);
>
>     TwoStrings x = new TwoStrings("a", "b"),
>                y = x;
>     x == x  // true
>     x == y  // true
>
> So I think what this does is just move the surprise to where you don't
> trip over it 100% of the time, but you still trip often enough that you
> curse it.

I'm still not sure allowing == when the compiler knows that the operands are value types is a good idea, whatever the semantics of acmp is.

All your examples work, BTW, if they are declared as reified generics:

    var x = new Twople("a", "b");
    x == x  // true, will compare the references "a" and "b" respectively

    var x = new Twople(2, 3);
    x == x  // true, will compare 2 and 3 respectively

I think we can agree that no solution will be perfect, you have to pick your poison.
And yes, always returning false is perhaps the lesser evil after all.

Rémi

> On 2/20/2019 3:33 PM, Remi Forax wrote:
>> I still think we have not finished exploring how to implement acmp for value
>> types.

From john.r.rose at oracle.com  Wed Feb 20 23:03:34 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 20 Feb 2019 15:03:34 -0800
Subject: acmp again !
In-Reply-To: <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Feb 20, 2019, at 1:55 PM, forax at univ-mlv.fr wrote:
>
> I think we can agree that no solution will be perfect, you have to pick your poison.
> And yes, always returning false is perhaps the lesser evil after all.

Yep, at this point we are picking poisons, not cherries.

Returning false (either always, or after some cutoff rule)
breaks the very fundamental property of reflexivity.

Though that's my preference from a "mechanism purist"
point of view, it will please only would-be mechanism
purists like me.  The other 90% of our community will
be rubbing our nose in puzzlers and other "those idiots"
blog entries, basically forever.

The other alternative is to push the expense of a
substitutability test into acmp and op==.  This will
please "semantics purists" (which I can be also), and
displease only those people who happen to run over
the performance potholes.  We don't have any proof
that those potholes will be significant; what we do
have is many ideas for working around those problems
if they should occur.  It feels like making acmp
potentially expensive also entails making it
potentially *inexpensive* again also, at the cost
of some engineering in the JVM and JDK.

Many uses of acmp which will turn into a
substitutability test are (we think) in generic
code which is using runtime typing and is
reused for a range of different types, some
values and some classic references.  You
could probably write a book on how such
codes are optimized today, and most of the
existing techniques that provide necessary
devirtualization would also solve for expensive
acmp.
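
For readers who want something concrete, the substitutability test
mentioned above (reference equality for identity objects, pairwise
field comparison for values) might be sketched as follows.  The sketch
is hypothetical throughout: `ValueLike` is an invented marker interface
standing in for "is a value class", `isSubstitutable` is not an existing
API, and reflection is used only because today's Java has no value
types.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public final class SubstitutabilitySketch {
    /** Hypothetical marker: implementing classes are treated as values. */
    public interface ValueLike {}

    public static final class Point implements ValueLike {
        final int x, y;
        public Point(int x, int y) { this.x = x; this.y = y; }
    }

    /** A value nested inside a value, so the test has to recurse. */
    public static final class Line implements ValueLike {
        final Point from, to;
        public Line(Point from, Point to) { this.from = from; this.to = to; }
    }

    /**
     * Sketch of a substitutability check: identity objects compare by
     * reference; "values" of the same class compare field by field,
     * recursively. Only fields declared directly on the class are visited,
     * which is enough for the final classes used here.
     */
    public static boolean isSubstitutable(Object a, Object b) {
        if (a == b) {
            return true;   // same reference, or both null
        }
        if (a == null || b == null) {
            return false;
        }
        if (!(a instanceof ValueLike) || a.getClass() != b.getClass()) {
            return false;  // distinct identity objects, or different value classes
        }
        try {
            for (Field f : a.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue;
                }
                f.setAccessible(true);
                Object fa = f.get(a);
                Object fb = f.get(b);
                boolean same = f.getType().isPrimitive()
                        ? fa.equals(fb)              // boxed primitives: compare by value
                        : isSubstitutable(fa, fb);   // references: recurse
                if (!same) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The recursion is where the cost concern comes from: the work grows with
the depth and width of the nested values, and a deeply nested structure
could in principle exhaust the stack.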

I think also we will want to bake into the JVMS
some kind of permission to strength-reduce acmp
with a nearby Object.equals call, by making equals
into a semi-intrinsic that the JVM can reorder
and merge with acmp.  Reflexivity of equals
is in the javadoc but not in the JVMS and so
JITs don't make use of it now.  But they might
want to if acmp became a subject of optimization.

Another idea we may wish to play with is an API
point which provides the fixed-cost approximate
equality test that acmp provides today.  It would
not be acmp but rather something with a name
like "System.fastSubstitutabilityTest".  Programmers
of sophisticated libraries could use this as an
alternative component to the famous LIFE
(Legacy Idiom For Equality).  The Objects.equals
API point, for example, would convert to a new
LIFE instead of remaining stuck in its old LIFE.
(We could spend a LIFE-time punning about this.)

So the proposed System.isSubstitutable would
be implemented as "return x==y", using the
slow-but-steady acmp instruction, while the
other thing would be an intrinsic native method,
with complicated weasel words in its spec.

Where there's an equals there's a hash code, so...

Next Question: Should System.identityHashCode
throw an exception, return zero, or do a deep
hash code when presented with a value instance?
Since it's a niche function used by experts, maybe
the surprising behavior of throwing an exception
is permissible, where making op== non-reflexive
would surprise everyone.  I do believe we want a
new API point System.substitutabilityHashCode
so that code can truly and explicitly opt into
processing that aspect of value types, rather than
getting accidental results from System.iHC.

(Oh, and also the JVM gets to pick the hash code
here, not the JDK.  Let's not proliferate base-31
hashing any further, please.  This is a vectorizable
operation, so hardware should determine the fastest
good hash, in any particular run of the JVM.
As folks know, I'm talking about modern hashes
based on high-precision multiply, AES-step, etc.)

-- John

From forax at univ-mlv.fr  Thu Feb 21 00:37:12 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 21 Feb 2019 01:37:12 +0100 (CET)
Subject: acmp again !
In-Reply-To:
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
Message-ID: <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz" , "valhalla-spec-experts"
> Sent: Thursday, February 21, 2019 00:03:34
> Subject: Re: acmp again !

> On Feb 20, 2019, at 1:55 PM, forax at univ-mlv.fr wrote:
>>
>> I think we can agree that no solution will be perfect, you have to pick your
>> poison.
>> And yes, always returning false is perhaps the lesser evil after all.
>
> Yep, at this point we are picking poisons, not cherries.
>
> Returning false (either always, or after some cutoff rule)
> breaks the very fundamental property of reflexivity.
>
> Though that's my preference from a "mechanism purist"
> point of view, it will please only would-be mechanism
> purists like me.  The other 90% of our community will
> be rubbing our nose in puzzlers and other "those idiots"
> blog entries, basically forever.

It doesn't match my experience: currently people have no expectation of what == means on a value type. If you explain that a value type has no identity, and that == doesn't work on a value type, most people are OK with that.
And because they are used to the fact that doing a == on Integers is like playing Russian roulette, people do not bat an eye when I show the slide with acmp returning false; not being able to store null is a bigger issue.

>
> The other alternative is to push the expense of a
> substitutability test into acmp and op==.
> This will
> please "semantics purists" (which I can be also), and
> displease only those people who happen to run over
> the performance potholes.  We don't have any proof
> that those potholes will be significant; what we do
> have is many ideas for working around those problems
> if they should occur.  It feels like making acmp
> potentially expensive also entails making it
> potentially *inexpensive* again also, at the cost
> of some engineering in the JVM and JDK.
>
> Many uses of acmp which will turn into a
> substitutability test are (we think) in generic
> code which is using runtime typing and is
> reused for a range of different types, some
> values and some classic references.  You
> could probably write a book on how such
> codes are optimized today, and most of the
> existing techniques that provide necessary
> devirtualization would also solve for expensive
> acmp.

I will be harsh here because I don't share your optimism. We are introducing value types, departing from the comfy model where everything is an object, in part because countless hours have been spent trying to make escape analysis more reliable.
You are talking about the same kind of optimisations, which will work great for some cases and not for others. Substituting one unreliable mechanism for another unreliable mechanism doesn't look like a win.

>
> I think also we will want to bake into the JVMS
> some kind of permission to strength-reduce acmp
> with a nearby Object.equals call, by making equals
> into a semi-intrinsic that the JVM can reorder
> and merge with acmp.  Reflexivity of equals
> is in the javadoc but not in the JVMS and so
> JITs don't make use of it now.  But they might
> want to if acmp became a subject of optimization.
>
> Another idea we may wish to play with is an API
> point which provides the fixed-cost approximate
> equality test that acmp provides today.  It would
> not be acmp but rather something with a name
> like "System.fastSubstitutabilityTest".
> Programmers
> of sophisticated libraries could use this as an
> alternative component to the famous LIFE
> (Legacy Idiom For Equality).  The Objects.equals
> API point, for example, would convert to a new
> LIFE instead of remaining stuck in its old LIFE.
> (We could spend a LIFE-time punning about this.)
>
> So the proposed System.isSubstitutable would
> be implemented as "return x==y", using the
> slow-but-steady acmp instruction, while the
> other thing would be an intrinsic native method,
> with complicated weasel words in its spec.

Or better, provide a direct replacement for the LIFE pattern, System.fastEquals(), that does an acmp if the arguments are references before calling equals, i.e. get rid of LIFE instead of trying to emulate it, by providing what people want, which is a faster equals when possible. It's time to unplug LIFE from its life support and let it go :)

>
> Where there's an equals there's a hash code, so...
>
> Next Question:  Should System.identityHashCode
> throw an exception, return zero, or do a deep
> hash code when presented with a value instance?
> Since it's a niche function used by experts, maybe
> the surprising behavior of throwing an exception
> is permissible, where making op== non-reflexive
> would surprise everyone.  I do believe we want a
> new API point System.substitutabilityHashCode
> so that code can truly and explicitly opt into
> processing that aspect of value types, rather than
> getting accidental results from System.iHC.
Both of these methods are nice candidates for being (Java) compiler intrinsics, which is something I've already contemplated. Instead of specifying that a record's equals/hashCode and toString() should be desugared to invokedynamic, I think it's better to introduce three new methods, structuralEquals, structuralHashCode and structuralToString, in the API and say in the spec that for a record the generated equals/hashCode/toString are semantically equivalent to calling structuralEquals/structuralHashCode/structuralToString.

>
> (Oh, and also the JVM gets to pick the hash code
> here, not the JDK.  Let's not proliferate base-31
> hashing any further, please.  This is a vectorizable
> operation, so hardware should determine the fastest
> good hash, in any particular run of the JVM.  As
> folks know, I'm talking about modern hashes
> based on high-precision multiply, AES-step, etc.)

yes

>
> -- John

Rémi

From john.r.rose at oracle.com  Thu Feb 21 00:56:56 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 20 Feb 2019 16:56:56 -0800
Subject: acmp again !
In-Reply-To: <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Feb 20, 2019, at 4:37 PM, forax at univ-mlv.fr wrote:
>
> ----- Original message -----
>> From: "John Rose"
>> To: "Remi Forax"
>> Cc: "Brian Goetz", "valhalla-spec-experts"
>> Sent: Thursday, 21 February 2019 00:03:34
>> Subject: Re: acmp again !
>
>> On Feb 20, 2019, at 1:55 PM, forax at univ-mlv.fr wrote:
>>>
>>> I think we can agree that no solution will be perfect, you have to pick your
>>> poison.
>>> And yes, always returning false is perhaps the lesser evil after all.
>>
>> Yep, at this point we are picking poisons, not cherries.
>>
>> Returning false (either always, or after some cutoff rule)
>> breaks the very fundamental property of reflexivity.
>>
>> Though that's my preference from a "mechanism purist"
>> point of view, it will please only would-be mechanism
>> purists like me.  The other 90% of our community will
>> be rubbing our nose in puzzlers and other "those idiots"
>> blog entries, basically forever.
>
> It doesn't match my experience. Currently people have no expectation of what == means on a value type; if you explain that a value type has no identity, and that == doesn't work on a value type, most people are OK with that. And because they are used to the fact that doing a == on Integers is like playing Russian roulette, people do not bat an eye when I show the slide with acmp returning false; not being able to store null is a bigger issue.

Interesting data point.  My inner mechanism purist
is looking for a straw to grasp at.

>
>> The other alternative is to push the expense of a
>> substitutability test into acmp and op==.  This will
>> please "semantics purists" (which I can be also), and
>> displease only those people who happen to run over
>> the performance potholes.  We don't have any proof
>> that those potholes will be significant; what we do
>> have is many ideas for working around those problems
>> if they should occur.  It feels like making acmp
>> potentially expensive also entails making it
>> potentially *inexpensive* again also, at the cost
>> of some engineering in the JVM and JDK.
>>
>> Many uses of acmp which will turn into a
>> substitutability test are (we think) in generic
>> code which is using runtime typing and is
>> reused for a range of different types, some
>> values and some classic references.  You
>> could probably write a book on how such
>> codes are optimized today, and most of the
>> existing techniques that provide necessary
>> devirtualization would also solve for expensive
>> acmp.
>
> I will be harsh here because I don't share your optimism. We are introducing value types, departing from the comfy model of "everything is an object", in part because countless hours have been spent trying to make escape analysis more reliable. You are talking about the same kind of optimisations, which will work great for some cases and not for others. Substituting one unreliable mechanism for another unreliable mechanism doesn't look like a win.

And if there are no reliable mechanisms, you pick your poison.

So the paradoxical "always return false" is reliable,
except that it will produce an unpredictable amount
of future confusion for people who have been trained
to expect a reflexive op==.  I.e., there are risks in every
direction.

(Remi, I wish everyone were harsh like you.  It would
be a better world.)

>
>> I think also we will want to bake into the JVMS
>> some kind of permission to strength-reduce acmp
>> with a nearby Object.equals call, by making equals
>> into a semi-intrinsic that the JVM can reorder
>> and merge with acmp.  Reflexivity of equals
>> is in the javadoc but not in the JVMS and so
>> JITs don't make use of it now.  But they might
>> want to if acmp became a subject of optimization.
>>
>> Another idea we may wish to play with is an API
>> point which provides the fixed-cost approximate
>> equality test that acmp provides today.  It would
>> not be acmp but rather something with a name
>> like "System.fastSubstitituabilityTest".  Programmers
>> of sophisticated libraries could use this as an
>> alternative component to the famous LIFE
>> (Legacy Idiom For Equality).  The Objects.equals
>> API point, for example, would convert to a new
>> LIFE instead of remaining stuck in its old LIFE.
>> (We could spend a LIFE-time punning about this.)
>>
>> So the proposed System.isSubstitutable would
>> be implemented as "return x==y", using the
>> slow-but-steady acmp instruction, while the
>> other thing would be an intrinsic native method,
>> with complicated weasel words in its spec.
>
> Or better, provide a direct replacement for the LIFE pattern, System.fastEquals(), that does an acmp if the arguments are references before calling equals, i.e. get rid of LIFE instead of trying to emulate it, by providing what people want, which is a faster equals when possible. It's time to unplug LIFE from its life support and let it go :)

Well, that's what Objects.equals is, right?  That's the
correct formula to use instead of writing out the LIFE
formula by hand.  (It's a better method than a LIFE
sentence for reforming wayward code.)

>
>>
>> Where there's an equals there's a hash code, so...
>>
>> Next Question:  Should System.identityHashCode
>> throw an exception, return zero, or do a deep
>> hash code when presented with a value instance?
>> Since it's a niche function used by experts, maybe
>> the surprising behavior of throwing an exception
>> is permissible, where making op== non-reflexive
>> would surprise everyone.  I do believe we want a
>> new API point System.substitutabilityHashCode
>> so that code can truly and explicitly opt into
>> processing that aspect of value types, rather than
>> getting accidental results from System.iHC.
>
> Both of these methods are nice candidates for being (Java) compiler intrinsics, which is something I've already contemplated. Instead of specifying that a record's equals/hashCode and toString() should be desugared to invokedynamic, I think it's better to introduce three new methods, structuralEquals, structuralHashCode and structuralToString, in the API and say in the spec that for a record the generated equals/hashCode/toString are semantically equivalent to calling structuralEquals/structuralHashCode/structuralToString.

Ooh, I like this story.
I'll gladly trade in "substitutability"
for "structural" if I can get a side order of toString.

-- John

>
>>
>> (Oh, and also the JVM gets to pick the hash code
>> here, not the JDK.  Let's not proliferate base-31
>> hashing any further, please.  This is a vectorizable
>> operation, so hardware should determine the fastest
>> good hash, in any particular run of the JVM.  As
>> folks know, I'm talking about modern hashes
>> based on high-precision multiply, AES-step, etc.)
>
> yes
>
>>
>> -- John
>
> Rémi

From brian.goetz at oracle.com  Thu Feb 21 01:48:09 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 20 Feb 2019 20:48:09 -0500
Subject: acmp again !
In-Reply-To: <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
Message-ID: <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>

>
> It doesn't match my experience. Currently people have no expectation of what == means on a value type; if you explain that a value type has no identity, and that == doesn't work on a value type, most people are OK with that.

I think this is fantasy.

Value types are OBJECTS. People think they know what == means on an Object, and for most users, the word "identity" does not come to mind. Further, we've told people that "values have only state, no identity -- if two values have the same state, they are the same." For such values not to be == will be astonishing.

Consider:

    value class UnsignedInt {
        public static UnsignedInt ZERO = ...

        private int i;

        public static add(UnsignedInt a, UnsignedInt b) { ... }
    }

    UnsignedInt x = ...
    if (x != UnsignedInt.ZERO) { ... }

People will never, ever, ever get used to the idea that test is always false. It's a fantasy to think otherwise.
And the WHOLE POINT of L-World is to allow people to not sweat the small details between values and refs. This is asking them to be acutely aware all the time.

From john.r.rose at oracle.com  Thu Feb 21 03:25:27 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 20 Feb 2019 19:25:27 -0800
Subject: acmp again !
In-Reply-To: <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
Message-ID: <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>

On Feb 20, 2019, at 5:48 PM, Brian Goetz wrote:
>
>>
>> It doesn't match my experience. Currently people have no expectation of what == means on a value type; if you explain that a value type has no identity, and that == doesn't work on a value type, most people are OK with that.
>
> I think this is fantasy.
>
> Value types are OBJECTS.  People think they know what == means on an Object, and for most users, the word "identity" does not come to mind.  Further, we've told people that "values have only state, no identity -- if two values have the same state, they are the same." For such values not to be == will be astonishing.

I think I have to agree with Brian here.  The key idea here is
"will be astonishing", emphasis on "will".  I can believe that
at first introduction to non-reflexive equality people might
shrug, but actually living with it in the long term is surely
a different matter.  If my car makes a funny noise in the
used car lot, I might shrug it off and buy the car, but if it
makes it every time I drive to work I will start to pay
attention, with some buyer's regret.  I am afraid we will
regret non-reflexive op==.  Imagine a world where
many numbers act like NaN (n!=n).  That's kind of what
we would be signing up for.
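
[Editorial illustration, not part of the original message.] For readers who want the NaN behaviour referred to above spelled out, here is a minimal sketch in present-day Java, with no value types involved. The class and method names are hypothetical and chosen only for this example; the only library calls used are `Double.valueOf`, `Double.equals` and `Double.doubleToLongBits`.

```java
// Hypothetical helper class, for illustration only.
final class NanReflexivity {

    // Primitive comparison: `d == d` is false exactly when d is NaN.
    static boolean primitiveEqualsSelf(double d) {
        return d == d;
    }

    // Boxed comparison via Double.equals, which compares the
    // doubleToLongBits representations and so is reflexive for NaN.
    static boolean boxedEqualsSelf(double d) {
        return Double.valueOf(d).equals(Double.valueOf(d));
    }

    // Bitwise comparison in the style of Double.equals.
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        System.out.println(primitiveEqualsSelf(Double.NaN)); // false
        System.out.println(boxedEqualsSelf(Double.NaN));     // true
        System.out.println(0.0 == -0.0);                     // true
        System.out.println(sameBits(0.0, -0.0));             // false
    }
}
```

The sketch only shows that the primitive `==` and the bitwise view already disagree today, on NaN in one direction and on signed zero in the other; it says nothing about which of the two a future acmp on value types would or should use.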
> Consider:
>
>     value class UnsignedInt {
>         public static UnsignedInt ZERO = ...
>
>         private int i;
>
>         public static add(UnsignedInt a, UnsignedInt b) { ... }
>     }
>
>     UnsignedInt x = ...
>     if (x != UnsignedInt.ZERO) { ... }
>
> People will never, ever, ever get used to the idea that test is always false.  It's a fantasy to think otherwise.  And the WHOLE POINT of L-World is to allow people to not sweat the small details between values and refs.  This is asking them to be acutely aware all the time.

To put it another way, in terms of buyer's remorse:  If people
shrug off op== anomalies on first glance, it can only get worse
in the future, as they hit bugs in their code coming from those
anomalies.

This last example reminds me of another mitigating
circumstance with structural acmp:  Just as many comparisons
of references today are against null, and that's cheap
no matter what odd stuff is going on under the JVM,
many value comparisons will be against initial or
default values.  In those cases, one of the two compared
values will have little or no deep structure, so the
comparison will be O(1).  The O(N) comparisons will
show up only when pairs of deeply structured values
are being compared.  If those are places where
Object.equals (part of LIFE) is also in play, then
we can neglect the O(N) comparison as a constant
multiplier on Object.equals, or even merge it into
Object.equals as I suggested previously.

From forax at univ-mlv.fr  Thu Feb 21 08:29:12 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 21 Feb 2019 09:29:12 +0100 (CET)
Subject: acmp again !
In-Reply-To: <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
Message-ID: <120513136.908975.1550737752869.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "Brian Goetz"
> To: "Remi Forax", "John Rose"
> Cc: "valhalla-spec-experts"
> Sent: Thursday, 21 February 2019 02:48:09
> Subject: Re: acmp again !

>>
>> It doesn't match my experience. Currently people have no expectation of what ==
>> means on a value type; if you explain that a value type has no identity, and that
>> == doesn't work on a value type, most people are OK with that.
>
> I think this is fantasy.

Yes, and == doing a component-wise equals is hell, and there is no good place in between ... again, pick your poison.

>
> Value types are OBJECTS.  People think they know what == means on an
> Object, and for most users, the word "identity" does not come to mind.

Devs already know what an identity is because of
 - the way boxing/unboxing is defined in Java 5
 - the way value-based classes are defined in Java 8 (classes defined as not identity-sensitive).

So they may not know the term "identity", but they know the concept under the term "boxing".

> Further, we've told people that "values have only state, no identity --
> if two values have the same state, they are the same." For such values
> not to be == will be astonishing.
>
> Consider:
>
>     value class UnsignedInt {
>         public static UnsignedInt ZERO = ...
>
>         private int i;
>
>         public static add(UnsignedInt a, UnsignedInt b) { ... }
>     }
>
>     UnsignedInt x = ...
>     if (x != UnsignedInt.ZERO) { ... }
>
> People will never, ever, ever get used to the idea that test is always
> false.
This example does not compile: you cannot do a == or a != on a value type, you have to use equals(), just as you have to use equals() on String, otherwise you get unpredictable results.

Compare your example with

    Integer x = ...
    if (x != Integer.ZERO) { ... }

Most of my undergraduate students (you still have one or two black sheep) will tell you that you should use equals() here and not ==.

The mental model for our users is that seeing a value type as an Object or an interface is a kind of boxing, but it's better because
 - the compiler doesn't let you write code that has unpredictable results (that's why == is disabled on value types)
 - if you compare boxed value types as Object using ==, it will always return false; again, that's better than sometimes returning true and sometimes returning false, as when you box Integers.

BTW, using a static field in your example is a premature optimization; you can use a static method if you want a name, given that values are not heap allocated.

> It's a fantasy to think otherwise.  And the WHOLE POINT of
> L-World is to allow people to not sweat the small details between values
> and refs.  This is asking them to be acutely aware all the time.

No, the whole point of L-World is to see value types as ref types, but you can see the artifice for all identity-sensitive operations. Synchronized throwing an ISE is in the same category. Trying to hide that under the rug and pretend it's just small details is wrong to me.

I prefer a model that is reliable, even if surprising at first, to a model that allows unbounded computations.

Rémi

From forax at univ-mlv.fr  Thu Feb 21 09:04:41 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 21 Feb 2019 10:04:41 +0100 (CET)
Subject: acmp again !
In-Reply-To: <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
 <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>
Message-ID: <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "John Rose"
> To: "Brian Goetz"
> Cc: "Remi Forax", "valhalla-spec-experts"
> Sent: Thursday, 21 February 2019 04:25:27
> Subject: Re: acmp again !

> On Feb 20, 2019, at 5:48 PM, Brian Goetz wrote:
>>
>>>
>>> It doesn't match my experience. Currently people have no expectation of what ==
>>> means on a value type; if you explain that a value type has no identity, and that
>>> == doesn't work on a value type, most people are OK with that.
>>
>> I think this is fantasy.
>>
>> Value types are OBJECTS.  People think they know what == means on an Object, and
>> for most users, the word "identity" does not come to mind.  Further, we've told
>> people that "values have only state, no identity -- if two values have the same
>> state, they are the same." For such values not to be == will be astonishing.
>
> I think I have to agree with Brian here.  The key idea here is
> "will be astonishing", emphasis on "will".  I can believe that
> at first introduction to non-reflexive equality people might
> shrug, but actually living with it in the long term is surely
> a different matter.  If my car makes a funny noise in the
> used car lot, I might shrug it off and buy the car, but if it
> makes it every time I drive to work I will start to pay
> attention, with some buyer's regret.  I am afraid we will
> regret non-reflexive op==.  Imagine a world where
> many numbers act like NaN (n!=n).  That's kind of what
> we would be signing up for.
I like the car metaphor. The problem here is that there are only two types of cars available: one that reliably makes a funny noise when you turn the windscreen wipers on, and another that drives you off the road when you turn the wipers on, but only on particular roads (your O(N) case below). Again, pick your poison.

NaN is interesting because it creates another corner case where == will be surprising if it's implemented as a component-wise comparison:

    value record Box(double value);

    var box = new Box(Double.NaN);
    box == box  // false

so neither semantics is reflexive.

>
>> Consider:
>>
>>     value class UnsignedInt {
>>         public static UnsignedInt ZERO = ...
>>
>>         private int i;
>>
>>         public static add(UnsignedInt a, UnsignedInt b) { ... }
>>     }
>>
>>     UnsignedInt x = ...
>>     if (x != UnsignedInt.ZERO) { ... }
>>
>> People will never, ever, ever get used to the idea that test is always false.
>> It's a fantasy to think otherwise.  And the WHOLE POINT of L-World is to allow
>> people to not sweat the small details between values and refs.  This is asking
>> them to be acutely aware all the time.
>
> To put it another way, in terms of buyer's remorse:  If people
> shrug off op== anomalies on first glance, it can only get worse
> in the future, as they hit bugs in their code coming from those
> anomalies.

acmp on value types will be a source of anomalies whichever semantics we choose, because in both cases it's not the reference semantics, but == on a value type doesn't have to be mapped to acmp.

>
> This last example reminds me of another mitigating
> circumstance with structural acmp:  Just as many comparisons
> of references today are against null, and that's cheap
> no matter what odd stuff is going on under the JVM,
> many value comparisons will be against initial or
> default values.  In those cases, one of the two compared
> values will have little or no deep structure, so the
> comparison will be O(1).
> The O(N) comparisons will
> show up only when pairs of deeply structured values
> are being compared.  If those are places where
> Object.equals (part of LIFE) is also in play, then
> we can neglect the O(N) comparison as a constant
> multiplier on Object.equals, or even merge it into
> Object.equals as I suggested previously.

Deeply recursive values will sometimes drive you off the road, because not all acmps are part of a LIFE pattern, so you have no guarantee that there is no acmp somewhere that will trigger a recursive structural comparison at a location where your program cannot swallow it.

Rémi

From brian.goetz at oracle.com  Thu Feb 21 13:33:06 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 21 Feb 2019 08:33:06 -0500
Subject: acmp again !
In-Reply-To: <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
 <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>
 <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>
Message-ID:

> NaN is interesting because it creates another corner case where == will be surprising if it's implemented as a component-wise comparison:
>
>     value record Box(double value);
>
>     var box = new Box(Double.NaN);
>     box == box  // false
>
> so neither semantics is reflexive.

Go re-read the definition of substitutability; you'll see that it is indeed reflexive. (Even though `==` on double is not.) So you'll need to find another counterexample :)

From brian.goetz at oracle.com  Thu Feb 21 13:37:31 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 21 Feb 2019 08:37:31 -0500
Subject: acmp again !
In-Reply-To: <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
 <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>
 <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>
Message-ID: <0dbcdad8-ed9c-9a92-6b63-ef1581eef807@oracle.com>

> Again pick your poison.

The phrase "pick your poison" is misleading because it suggests all poisons are equally fatal. And these two poisons are very, very different. (Alcohol is a poison, and when taken recursively can indeed be fatal, but that's not the usual outcome, nor is it usually a deterrent.)

With the "always false" semantics, a fundamental building block of the language has perennially astonishing semantics in situations that will be routinely encountered by all users. (We've seen languages where the `==` operator has semantics people can't understand; we don't want to be them, or have to deprecate `==` in favor of `===` because the language failed so hard the first time.)

With the second, the performance will sometimes be mildly surprising when some wise guy thinks he's being clever and writes some ridiculous code, like a recursive value list. Then he'll be told to cut that out, and life will go back to normal.

This argument feels to me like "let's snatch defeat from the jaws of victory." For years, we thought it was impractical that we could unify values and references. But we are now 95% of the way there! Substitutability is a sound, intuitive generalization of `==` over both refs and values.

While we're on the subject of fantasy, let me call attention to another fantasy that we've been engaging in: that somehow values can remain this "weird, off-to-the-side thing."
In Q-world they were -- and when you were writing code, you always had to be aware of whether you were dealing with values or objects (and generic code had to learn new rules because it might be dealing with either). But that's not the world we've built (thankfully!). Here, values will be a common, everyday occurrence (Optional, LocalDateTime) that all Java code will have to deal with. (They're Objects!) We have to give people a sound, intuitive model for dealing with the union of refs and values, because they're going to have to deal with that. And we're almost there, as long as we don't blow it.

I realize that we started out hyper-focused on the performance aspects (because there'd be no point in doing values in the first place if we didn't care about performance). But users will not thank us if we routinely choose confusing semantics because it's faster. Now it's time to focus on delivering a programming model that makes users say "why didn't you do that 20 years ago" -- and we can do that. This is winning; let's take it.

From forax at univ-mlv.fr  Thu Feb 21 14:01:04 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 21 Feb 2019 15:01:04 +0100 (CET)
Subject: acmp again !
In-Reply-To:
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
 <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>
 <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>
Message-ID: <1439047274.1022419.1550757664923.JavaMail.zimbra@u-pem.fr>

You mean the double comparison uses doubleToLongBits? In that case,

    value record Box(double value);

    new Box(0.0) == new Box(-1 * 0.0)  // false

which is not better.

Rémi

----- Original message -----
> From: "Brian Goetz"
> To: "Remi Forax", "John Rose"
> Cc: "valhalla-spec-experts"
> Sent: Thursday, 21 February 2019 14:33:06
> Subject: Re: acmp again !
>> NaN is interesting because it creates another corner case where == will be
>> surprising if it's implemented as a component-wise comparison:
>>
>>     value record Box(double value);
>>
>>     var box = new Box(Double.NaN);
>>     box == box  // false
>>
>> so neither semantics is reflexive.
>
> Go re-read the definition of substitutability; you'll see that it
> is indeed reflexive.  (Even though `==` on double is not.)  So you'll need to
> find another counterexample :)

From forax at univ-mlv.fr  Thu Feb 21 15:07:43 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 21 Feb 2019 16:07:43 +0100 (CET)
Subject: acmp again !
In-Reply-To: <0dbcdad8-ed9c-9a92-6b63-ef1581eef807@oracle.com>
References: <2060922000.853402.1550694794988.JavaMail.zimbra@u-pem.fr>
 <1848357154.861075.1550699714073.JavaMail.zimbra@u-pem.fr>
 <623620049.869077.1550709432496.JavaMail.zimbra@u-pem.fr>
 <8d19897c-1005-cb00-321e-20c02dad7bc9@oracle.com>
 <18F4B236-2005-4971-8BDD-472229037B24@oracle.com>
 <476639663.923467.1550739881745.JavaMail.zimbra@u-pem.fr>
 <0dbcdad8-ed9c-9a92-6b63-ef1581eef807@oracle.com>
Message-ID: <873072624.1049683.1550761663358.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "Brian Goetz"
> To: "Remi Forax", "John Rose"
> Cc: "valhalla-spec-experts"
> Sent: Thursday, 21 February 2019 14:37:31
> Subject: Re: acmp again !

>> Again pick your poison.
>
> The phrase "pick your poison" is misleading because it suggests all
> poisons are equally fatal.  And these two poisons are very, very
> different.  (Alcohol is a poison, and when taken recursively can indeed
> be fatal, but that's not the usual outcome, nor is it usually a deterrent.)
>
> With the "always false" semantics, a fundamental building block of the
> language has perennially astonishing semantics in situations that will
> be routinely encountered by all users.
> (We've seen languages where the
> `==` operator has semantics people can't understand; we don't want to be
> them, or have to deprecate `==` in favor of `===` because the language
> failed so hard the first time.)

I think you are mixing up acmp and ==.
You can have acmp with the always-false semantics and an opt-in operator overloading mechanism, so that calling == on a value type is equivalent to calling equals(), but at the language level, not at the VM level.

This is very similar to what we have now with int and Integer: == on ints has its own special semantics while == on Integers is acmp.

>
> With the second, the performance will sometimes be mildly surprising
> when some wise guy thinks he's being clever and writes some ridiculous
> code, like a recursive value list.  Then he'll be told to cut that out,
> and life will go back to normal.

Composing lambdas that are parameterized creates the kind of recursive value types the substitutability test will not like.

>
> This argument feels to me like "let's snatch defeat from the jaws of
> victory."  For years, we thought it was impractical that we could unify
> values and references.  But we are now 95% of the way there!
> Substitutability is a sound, intuitive generalization of `==` over both
> refs and values.

Substitutability is not a sound generalization. But that doesn't mean there is no way to win, it's just that the substitutability test is not the way to win.

>
> While we're on the subject of fantasy, let me call attention to another
> fantasy that we've been engaging in: that somehow values can remain this
> "weird, off-to-the-side thing."  In Q-world they were -- and when you
> were writing code, you always had to be aware of whether you were
> dealing with values or objects (and generic code had to learn new rules
> because it might be dealing with either).  But that's not the world
> we've built (thankfully!).
> Here, values will be a common, everyday
> occurrence (Optional, LocalDateTime) that all Java code will have to
> deal with.  (They're Objects!)  We have to give people a sound,
> intuitive model for dealing with the union of refs and values, because
> they're going to have to deal with that.  And we're almost there, as
> long as we don't blow it.

Yes, that's exactly my point: we should not blow it by having all the object operations (==, synchronized, identityHashCode, etc.) try to mask the fact that those operations will never work exactly like their reference counterparts. A 95% emulation is not a good idea because we want a sound and reliable model.

>
> I realize that we started out hyper-focused on the performance aspects
> (because there'd be no point in doing values in the first place if we
> didn't care about performance).  But users will not thank us if we
> routinely choose confusing semantics because it's faster.  Now it's time
> to focus on delivering a programming model that makes users say "why
> didn't you do that 20 years ago" -- and we can do that. This is winning;
> let's take it.

It's not about being faster, it's about not being randomly, unpredictably slower.

Rémi

From brian.goetz at oracle.com  Thu Feb 21 17:59:02 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 21 Feb 2019 12:59:02 -0500
Subject: Finding the spirit of L-World
In-Reply-To:
References:
Message-ID: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>

More on substitutability and why it is desirable...

> #### Equality
>
> Now we need to define equality.  The terminology is messy, as so many
> of the terms we might want to use (object, value, instance) already
> have associations.  For now, we'll describe a _substitutability_
> predicate on two instances:
>
>  - Two refs are substitutable if they refer to the same object
>    identity.
> - Two primitives are substitutable if they are `==` (modulo special
>   pleading for `NaN` -- see `Float::equals` and `Double::equals`).
> - Two values `a` and `b` are substitutable if they are of the same
>   type, and for each of the fields `f` of that type, `a.f` and `b.f`
>   are substitutable.
>
> We then say that for any two objects, `a == b` iff a and b are
> substitutable.

Currently, our type system has refs and primitives, and the == predicate applies to all of them.  And for all the types we have today (with the almost-too-small-to-mention anomaly of NaN), == *already is* a substitutability predicate (where substitutability means, informally: "no observable difference between the two arguments.")  Two refs are substitutable if they refer to the same object identity; two primitives are substitutable if they refer to the same value (modulo NaN.)

VM engineers like to refer to `==` on refs as "identity equality", but that's really an implementation detail.  What it really means is: are the two things the same.  And that's what `==` means for primitives too, and that's how the other 99.99% of users think of it too.

The natural interpretation of `==` in a world with values is to extend this "are these two things the same" to values too.  The substitutability relation above applies the same "are you the same" logic equally to refs, values, and primitives.  No sharp edges (except the NaNsense that we are already stuck with.)

From kevinb at google.com  Fri Feb 22 19:42:48 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 22 Feb 2019 11:42:48 -0800
Subject: Finding the spirit of L-World
In-Reply-To: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
Message-ID:

Fair point that `==` has always been the test of *absolute* substitutability. But I think this is overlooking something big: People implement equals() in order to ask for "substitutability for virtually all intents and purposes".
Of course, most code should never be going anywhere near identity hash maps or synchronizing on value-like things, etc. And that means that equals() has become the substitutability test that people WANT.

This in turn means that every usage of `==` on a non-primitive type (named class) is always suspicious. As a reader and maintainer of code, I need to think about this carefully. Is it a Class -- if so == is harmless but also .equals() is harmless and it's not worth switching idioms. Is it an enum type? I have to go look it up to find out, in which case it is once again both harmless and pointless (especially if I can replace with switch!). Barring those, then it's either a risky micro-optimization or some other bizarre coding choice that I need to be very careful around.

I think we should make users write `equals` to test value types. If they write `==`, they are indicating a special situation where they need identity semantics, which don't make sense for value types, and that should be an error.

One of the concerns I've always had about value types is that developers would be forced to maintain a mental database of which types are value types and which are reference types, and that they could not hope to assess the correctness of code they read or write without having that. In a world where users commonly need to do "absolutely substitutable" checks, then this proposal would be the way to achieve that. But, I don't think that's the world we're in.

Thoughts?

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Fri Feb 22 21:46:34 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 22 Feb 2019 16:46:34 -0500
Subject: Finding the spirit of L-World
In-Reply-To: <4259E8F8-DD5A-4FB2-98B1-3372479CC2F6@oracle.com>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
 <4259E8F8-DD5A-4FB2-98B1-3372479CC2F6@oracle.com>
Message-ID: <8f543923-a98d-1c19-8e15-02152eb88ce2@oracle.com>

Thanks for bringing up .equals, and the possibility to ban == on values, both of which have been touched on in the past but neither of which we've really focused on.  It also helped me clarify why I think there's really only one answer here.

As to equals(), having == be a substitutability test does not make equals() obsolete -- far from it.  The existing analogue of == vs .equals() with reference types holds (pretty much exactly) for values when == is a substitutability check.  For example, consider a class like

    value class StringWrapper { String s; }

The substitutability test here would be to compute (w, u) -> (w.s == u.s);  This _is_ the "are they exactly the same value" subst test, and something we should be able to express.  But it is not the implementation we'd often want for equals -- we'd likely want to delegate to String::equals:

    value class StringWrapper {
        String s;
        public boolean equals(Object o) { return o instanceof StringWrapper sw && s.equals(sw.s); }
    }

So == still means "are you exactly the same thing"; .equals() still means "are you logically the same thing", as with refs today.  And as with refs, the former is a sensible starting point for the latter (sometimes it's good enough), but we may want to refine it to allow physically different things to be treated as logically the same.  (The difference is really just a quantitative one; there are just more values than refs (e.g., Complex) for which `==` is already the right answer for `equals()`.)
So while the substitutability test is quantitatively _closer_ to what equals() is likely to be, it's still not always going to be the same (and it's usually going to be simpler.)  And just as we support both now, for reasons, we probably will still want to do so.

Coming back to our choices, there are four possible interpretations of `==` on values so far:

 - The LW1 interpretation is "== means identity, and values have no identity, so == always says false"
 - The substitutability interpretation is whether the two operands have no observable differences
 - The third is: upcall to .equals()
 - The fourth is: Don't allow it at all.

IMO the first is considerably worse than useless; the user is allowed to ask a harmless and familiar question, and is guaranteed to get a surprising answer.  If that's the case, don't let them ask at all.  (And you agree, but, see below.)

The second interpretation is a sound generalization of `==` on refs and primitives; it means "are you exactly the same thing", which can be given a precise linguistic meaning, and can be further refined with logical equality tests where desired.  The main objection is that it is more expensive for the VM to implement, and the cost model has a broader variance.  (These are not nothing, I just don't think they trump intuitive semantics or compatibility.)

If the second interpretation gives VM engineers fits, the third one is even worse, as it means upcalling to arbitrary Java code from the ACMP bytecode.  It also gives me fits, because it tries to go back 25 years and rewrite what == means for objects.  (And it probably gives you fits, because you've commented frequently on how often equals() implementations are wrong.)

The fourth answer, ban it, is surely better than the first, but let's pull on that string.

Let's say `==` is meaningless on values, so we ban it.  But, just as with the first, we have a problem for code that trucks in Object (including erased generics).
If this code uses == to compare user-provided values to user-provided sentinels, this code will /just stop working/.  And even if they are willing to rewrite it, there's now no convenient and reliable way to write code that tests "are you the same thing I saw before".

One of the subtle (but ultimately, good, I think) things about L-world is that /values are Objects/.  That means, if you take an Object parameter (or a T, for erased generics), someone can pass you a value, and your code should still work.  (If it does still work, we've achieved (yet again) that elusive form of /forward compatibility/ -- code that was written before the language had feature X, can deal perfectly well with X.)

OK, so if we ban == on values, should we ban it on generic code too?  That's the sound choice, since we're quantifying over types for which == may not be defined.  But that's neither source- nor binary-compatible with existing code.  Which means, at least to keep this code working, we still have to assign a meaning to == on T when one or both of the operands are a value.  (Of which there are three choices so far, detailed above.)  So for existing sources and binaries, we should give `==` on T a meaning, otherwise this code breaks.  Now, what about Object?  There exists plenty of code which accepts Object and uses == on it.  So we have to continue to assign a meaning to `==` on Object too.  Again, we have three choices.  And if we can assign a sound meaning to `==` on T and on Object, which works when you pass a value in, why not use that for `==` on values too?

So my claim is: banning it is effectively impossible; we at least have to pick one of the other interpretations to fall back on for existing sources and binaries, and if we're going to do that, we should just do that.

Upleveling... your concern about "mental database" is a valid one, and one I've been worried about too.
This is why I've been on a search-and-destroy mission to eliminate gratuitous asymmetries between values and references as we bring them closer together in the type system.  (I don't want people who write code that trucks in Object, or erased T, to have to be writing two versions of their code, one for refs and one for values, or even to be thinking much about the differences.)  On that score (downleveling again):

 - The "false" interpretation means that you can ask == of values, but the question is meaningless.  That means, if you are ever to be exposed to values, you have the following bad choices: give up on discriminating between values, or do something different for values and refs, or just use equals() all the time.  If these are your only options, that's pretty terrible.
 - The "ban it" interpretation is similar; you don't get to ask the question, so you're stuck with doing one thing for refs and one for values, or always using .equals().  It also seems impractical; we will end up reinventing one of the other solutions for compatibility reasons only.
 - The "call up to Java" interpretation means that the treatment of == on refs and values are about as different as you can possibly get!  Again, this means people will end up either using .equals() all the time, or writing different code for refs and values.
 - *The substitutability test is the only interpretation that is consistent with existing understanding and coding idioms, and which will "just work" when values start getting injected into code that was compiled years ago that takes Object / erased T and has no conception of values.  It is the only version that doesn't require that people rewrite their code when values start showing up in their HashMap, or constantly ask themselves "is this instance a value or a ref."*

The distinction between == and .equals() in Java may have its problems, but it's how Object works, and people have learned idioms that work for it.
Preserving that intuition, and that code, seems to me to be the highest priority.  Option 4 feels to me like a wishful attempt to try and go back and fix history, which is a worthy goal but we've all watched enough science fiction to know how that ends.

From brian.goetz at oracle.com  Fri Feb 22 22:04:36 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 22 Feb 2019 17:04:36 -0500
Subject: Finding the spirit of L-World
In-Reply-To: <8f543923-a98d-1c19-8e15-02152eb88ce2@oracle.com>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
 <4259E8F8-DD5A-4FB2-98B1-3372479CC2F6@oracle.com>
 <8f543923-a98d-1c19-8e15-02152eb88ce2@oracle.com>
Message-ID:

Forgot to comment on this:

> And that means that equals() has become the substitutability test that
> people WANT.

I agree with this completely.  But, where I think I disagree (and this is often the place where you and I reach for different tools) is that I don't want to use the language spec to discourage this.  I want to have the language spec assign it a clear, precise, sound, principled meaning, and then we can use lore, advice, static analysis, style guides, code samples, and electric shocks to encourage people to use the language sensibly.

In order to be able to write low-level code like IdentityHashMap in Java, we need to be able to express both subst== and deep==.  It may be that 25 years ago we picked the wrong meaning to bind to ==, but that's how the language works already.  For dynamically typed (Object-consuming) and generic code, we have guided users towards the LIFE (Legacy Idiom for Equality) idiom:

    x == y || x.equals(y)

/strictly because of the cost model of 1995, because virtual calls were expensive/.  Sad, but the world is full of this code, so we shouldn't break it.
But, as the cost model shifts, we can also guide people away from it and towards just doing x.equals(y) /because the cost model has changed/ (again).  Which we can do through recommendations, style guides, education, and static analysis.  But I don't disagree with your conclusion that /most of the time/, == comparisons are suspect (on refs and on values), and most of these LIFE instances should, eventually, be replaced by plain old `x.equals(y)`.

From john.r.rose at oracle.com  Sat Feb 23 01:17:59 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 22 Feb 2019 17:17:59 -0800
Subject: Finding the spirit of L-World
In-Reply-To:
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
 <4259E8F8-DD5A-4FB2-98B1-3372479CC2F6@oracle.com>
 <8f543923-a98d-1c19-8e15-02152eb88ce2@oracle.com>
Message-ID:

On Feb 22, 2019, at 2:04 PM, Brian Goetz wrote:
> I want to have the language spec assign it a clear, precise, sound, principled meaning

+100

> most of these LIFE instances should, eventually, be replaced by plain old `x.equals(y)`.

FWIW, in some cases where y is a constant, the comparison can also be done with a pattern match (`x instanceof y`). This is a generalization of the tried and true `x == null`.

The main benefit of using a pattern match, versus a simple method call, is that it tolerates nulls. Today's version is `Objects.equals(x, y)`, or the version of LIFE which handles nulls: `x == y || x != null && x.equals(y)`.

From john.r.rose at oracle.com  Sat Feb 23 02:57:56 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 22 Feb 2019 18:57:56 -0800
Subject: Finding the spirit of L-World
In-Reply-To:
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
Message-ID: <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com>

On Feb 22, 2019, at 11:42 AM, Kevin Bourrillion wrote:
>
> I think we should make users write `equals` to test value types.
> If they write `==`, they are indicating a special situation where they need identity semantics, which don't make sense for value types, and that should be an error.

This sounds like a proposal for the future, but as Brian points out it is also a constraint on large amounts of generic code that has already been written.

Let's make the best of op==; it's in our past and the future of comparison logic in Java is too tightly coupled with the past.

-- John

From john.r.rose at oracle.com  Sat Feb 23 03:02:15 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 22 Feb 2019 19:02:15 -0800
Subject: Finding the spirit of L-World
In-Reply-To: <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com>
 <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com>
Message-ID: <4F6FA819-C4FE-42A3-9103-3F5C680888D3@oracle.com>

On Feb 22, 2019, at 6:57 PM, John Rose wrote:
>
> On Feb 22, 2019, at 11:42 AM, Kevin Bourrillion wrote:
>>
>> I think we should make users write `equals` to test value types. If they write `==`, they are indicating a special situation where they need identity semantics, which don't make sense for value types, and that should be an error.
>
> This sounds like a proposal for the future, but as Brian points
> out it is also a constraint on large amounts of generic code
> that has already been written.
>
> Let's make the best of op==; it's in our past and the future
> of comparison logic in Java is too tightly coupled with the past.
>
> -- John

P.S. Also, in exactly-typed code, the substitutability test will *often* be exactly what the user wants. The default implementation of ValObject.equals will use that test, and users will surely opt to inherit that rather than write a local equals method that does the same.

(Not always, of course; there are value types and object types where the default equals method doesn't DTRT. But often, for cases like small numerics and tuples.)
In that value class, requiring the user to call x.equals(y) instead of writing the known exact equivalent x==y feels like somebody's code style policy, rather than a useful discipline.

From john.r.rose at oracle.com  Sat Feb 23 04:23:35 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 22 Feb 2019 20:23:35 -0800
Subject: Finding the spirit of L-World
In-Reply-To:
References:
Message-ID: <27559D54-A17B-47ED-A05A-7C736A3848FA@oracle.com>

On Jan 23, 2019, at 9:51 AM, Brian Goetz wrote:
>
>> Because values have no identity, in LW1 `System::identityHashCode`
>> throws `UnsupportedOperationException`. However, this is
>> unnecessarily harsh; for values, `identityHashCode` could simply
>> return `hashCode`. This would enable classes like `IdentityHashMap`
>> (used by serialization frameworks) to accept values without
>> modification, with reasonable semantics -- two objects would be deemed
>> the same if they are `==`. (For serialization, this means that equal
>> values would be interned in the stream, which is probably what is
>> wanted.)
>>
>> By returning `hashCode`, do you mean calling a user-defined hashCode function? Would the VM enforce that all values must implement `hashCode()`? Is the intention that they are stored (growing the size of the flattened values), or would calling the hashCode() method each time be sufficient?
>
> I would prefer to call the "built-in" value hashCode -- the one that is deterministically derived from state. That way, we preserve the invariant that == values have equal identity hash codes.

Just as op== (acmp) is a built-in and equals is the user-coded variation on it, System.identityHashCode is a built-in and hashCode is the user-coded variation. In both cases, the default implementation of the latter is the former.

When we get to values, generics and dynamically typed code lead us to consider retrofitting op== (acmp) and System.iHC from references to values also. Which is what we are talking about.
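The pairing of built-ins and user-coded variations described above can be checked on today's reference types. The following is a small illustrative sketch, not part of the original thread; the class names `Plain` and `BuiltinsDemo` are made up for the example. It relies only on documented behavior: `Object.equals` compares references, and `System.identityHashCode` returns what the default `Object.hashCode` would return.

```java
/** Hypothetical class that inherits Object.equals and Object.hashCode unchanged. */
final class Plain {}

/** Hypothetical demo of built-ins versus user-coded variations on references. */
final class BuiltinsDemo {
    private BuiltinsDemo() {}

    /** With no override, equals() gives the same answer as ==. */
    static boolean defaultEqualsIsIdentity() {
        Plain a = new Plain();
        Plain b = new Plain();
        return a.equals(a) && (a.equals(b) == (a == b));
    }

    /** With no override, hashCode() is the identity hash code. */
    static boolean defaultHashIsIdentityHash() {
        Plain a = new Plain();
        return a.hashCode() == System.identityHashCode(a);
    }

    /** String overrides both, so distinct objects can be equal with equal hashes. */
    static boolean overrideCoarsensIdentity() {
        String s = new String("valhalla");
        String t = new String("valhalla");
        return s != t && s.equals(t) && s.hashCode() == t.hashCode();
    }
}
```

All three methods return true on a current JDK. The third shows the user-coded variation diverging from the built-in: `==` still distinguishes the two strings, while `equals` and `hashCode` do not.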
And, at the very least, the existence of such code forces us to define two *new* operations which extend op== (acmp) and System.iHC. The first is a (total) substitutability test. The second is a total hash algorithm that computes a hash code compatible with the first. That second is *not* the same as any coder's override of O.hashCode. It is an intrinsic function, just like op== (acmp).

I like Remi's names for them: structuralEquals, structuralHashCode, and also my own isSubstitutable, substitutabilityHashCode.

(Problem with "structuralEquals": When applied to references, it does *not* look at structure. Bummer. The word "structural" works for values and not for references. I guess I'm back to substitutability.)

Should op== (acmp) be bound to the first and System.identityHashCode bound to the second? OK, if we do that then we don't need to find new names for them. But even then, maybe we *want* the new names so that programmers can advertise their intentions more clearly. We can then deprecate System.identityHashCode, and/or add lint-style warnings to op== on certain cases (as with String today), or whatever, and let users refactor those warnings away by using the newer names.

I think where we will end up is with making op== the same as isSubstitutable, but we will still want isSubstitutable for certain coding tasks, such as code which works with floats and doubles and wants to side-step the NaN behavior of op==. Today's workaround for that boxes the values and calls Object.equals.

Since equals and hashCode are on parallel tracks, we can also extend System.iHC "in place" to handle values by doing a structural hash code on their component fields. But we are *not* obligated to do so. We can (and should) make System.substitutabilityHashCode a new API point (which calls System.iHC on references only) and then decide what is the most useful thing to do for System.iHC when applied to a value type (and, some day, a primitive).
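The NaN workaround mentioned above (boxing and calling equals) can be seen with plain doubles today. This is a minimal sketch added for illustration; `NaNDemo` and its method names are hypothetical, while `Double.valueOf` and `Double.equals` are the standard library calls.

```java
/** Hypothetical demo: op== on doubles versus the boxed equals() workaround. */
final class NaNDemo {
    private NaNDemo() {}

    /** Primitive comparison: NaN is never == to anything, and 0.0 == -0.0. */
    static boolean primitiveEq(double a, double b) {
        return a == b;
    }

    /** Boxed comparison: Double.equals compares bit patterns via doubleToLongBits. */
    static boolean boxedEq(double a, double b) {
        return Double.valueOf(a).equals(Double.valueOf(b));
    }
}
```

With these definitions, `primitiveEq(Double.NaN, Double.NaN)` is false while `boxedEq(Double.NaN, Double.NaN)` is true, and `primitiveEq(0.0, -0.0)` is true while `boxedEq(0.0, -0.0)` is false. The boxed form is the one that behaves like a substitutability test, since `0.0` and `-0.0` are observably different (for example, under division).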
I think a thoughtful prototyping move would be to make it throw an exception or log a warning, and then ask our users to debug the resulting diagnostics. Maybe if the diagnostics are useful they can be turned into JFR events or something. A conservative *final design* (especially if we had to choose one *today*) would be to extend System.iHC in place, so as to keep the trains running on all tracks; we could deprecate System.iHC as a gentle way of encouraging folks to re-evaluate their code (as opposed to a non-gentle exception or log blather).

I guess I'm saying that we would be right to make System.iHC throw, at first, when presented with a value instance. Then we can react to what we learn.

The obvious reason to make System.substitutabilityHashCode be a different API point from System.identityHashCode is that the latter mentions "identity". So there's a clear pedagogical problem here, if we are telling students "value = object - identity".

A final reason to have a separate API point for System.subHC is that it's 2019 already, and 32-bit systems are long gone. The return type of System.subHC should be a full range int64, mixed (in a few CPU cycles) without egregious funnels. In short, new hashCode API points should be routinely upgraded to work well with modern hardware.

Why do op== (acmp) and System.iHC deserve different treatments? Because op== is used by 100% and System.iHC by 0.01% of Java programmers. That means that having System.iHC break is a reasonable way to force a few programmers (like Doug Lea) to go and inspect their code for bugs. Deprecating op== for values is not an analogous move; that would amount to saying something like, "all generic Java code is now buggy, please debug".

-- John

P.S. The word "substitutable" is exactly correct as a way of defining equality for any two nameable terms, in any of a wide range of formalisms, including programming languages.
In Java a term is a reference, value, or primitive, and substitutability inspects the type, the identity of the reference, and the structure of the value (but not the mutable parts of a mutable object).

For a discussion of the connections between equality and substitution, see

http://intrologic.stanford.edu/extras/equality.html

The short version is: Two values x, y are equal if and only if the sentence "P(x) <=> P(y)" is true for all predicates "P" (in some relevant domain of discourse also including x and y).

This logical pattern of testing for equality depends on substituting x for y (in some P(y)) and seeing if anything changes. If nothing changes, x and y cannot be different. (Or if they still differ, your work is done; go home and sleep it off.) This test is extremely robust (though also impossible to compute directly); it makes no appeal to any intrinsic property of x or y, other than "P can ask it any questions it wants". This exercise demonstrates that equality, in its deepest form, is really an appeal to substitutability.

What, then, about other forms of equality which are defined as equivalence relations? Those are built on top of the basic logic of relations and objects (sets, categories, dependent types, choose one). Those extra forms of equality can only "coarsen" the basic equality as determined by substitutability. If two names x, y are the same thing, as proven by substitutability, then they cannot be unequal in a derived equivalence relation E. (Otherwise, that relation E would provide a hook for breaking the substitutability condition, via P(z) := E(x,z). Then, P(x)=true and P(y)=false since E(x,x) must be true and E(x,y) is by assumption false. But this P breaks the substitutability proof that x==y.) Thus, derived equivalence relations cannot distinguish x and y if they are known to be substitutable for each other. Substitutability is the unique "ground condition" for equality.
This logic applies exactly to Java and the JVM (as soon as we block out functions P which give non-deterministic or global-state-dependent answers). Thus, the most sensitive, exact, finely discriminating equality test the JVM can provide is the substitutability test. Although the JVM cannot run a theoretical substitutability proof in every acmp instruction, it has enough control over its little universe to test efficiently whether such a proof would succeed or fail, since it "knows all the tricks" that any predicate P could possibly play.

Recap: There can only be one most-exact equivalence relation, and that coincides with the substitutability test. Other equivalence relations can be derived (as Object.equals is derived from primitive tests) but they cannot distinguish objects which are found to be substitutable. The JVM can implement this test.

P.P.S. Is there such a thing as substitutabilityToString? Alas, no. But there is something deeper than it which underlies it, and equality, and hashCode. That is a "something" that would enumerate all the components of a value's substitutability state. Maybe call it System.visitSubstitutabilityState or some such. For a reference type it would visit just that ref and indicate ==ref; for a value of two int fields it would visit each field and indicate ==int. Maybe we don't want something so space-age in the core JDK, but we do need enough reflective API points to be able to build one, or do its job for it. I think Core Reflection, as extended by Valhalla, succeeds in this.
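The idea of walking a value's substitutability state can be approximated with today's Core Reflection. What follows is a rough sketch under stated assumptions, not the Valhalla design: `ValueLike`, `Point`, `StringWrapper`, and the `Subst` helper are hypothetical names, ordinary final classes with final fields stand in for value classes, only fields declared directly in the class are visited, and the hash returns an `int` rather than the 64-bit result suggested earlier in this message.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical marker standing in for "this class is a value class". */
interface ValueLike {}

/** Stand-in value with two int fields. */
final class Point implements ValueLike {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

/** Stand-in value wrapping a reference. */
final class StringWrapper implements ValueLike {
    final String s;
    StringWrapper(String s) { this.s = s; }
}

/** Hypothetical helper: reflective substitutability test and a compatible hash. */
final class Subst {
    private Subst() {}

    static boolean isSubstitutable(Object a, Object b) {
        if (a == b) return true;                        // same reference, or both null
        if (a == null || b == null) return false;
        if (a.getClass() != b.getClass()) return false; // must be the same type
        if (!(a instanceof ValueLike)) return false;    // identity objects: only == counts
        try {
            for (Field f : a.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                Object fa = f.get(a);
                Object fb = f.get(b);
                // Primitives arrive boxed; boxed equals() treats NaN as equal to NaN.
                boolean same = f.getType().isPrimitive()
                        ? fa.equals(fb)
                        : isSubstitutable(fa, fb);
                if (!same) return false;
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    static int substitutabilityHashCode(Object o) {
        if (o == null) return 0;
        if (!(o instanceof ValueLike)) return System.identityHashCode(o);
        int h = o.getClass().hashCode();
        try {
            for (Field f : o.getClass().getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                Object v = f.get(o);
                int fh = f.getType().isPrimitive() ? v.hashCode() : substitutabilityHashCode(v);
                // Addition keeps the result independent of field enumeration order.
                h += 31 * f.getName().hashCode() + fh;
            }
            return h;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, two `Point` instances with the same coordinates are substitutable and hash alike, while two `StringWrapper` instances holding distinct but equal `String` objects are not substitutable, because the wrapped references differ in identity. That matches the `(w, u) -> (w.s == u.s)` test from earlier in the thread, and it is exactly the case where a user-written `equals` would choose to be coarser.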
From forax at univ-mlv.fr  Sat Feb 23 11:25:58 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 23 Feb 2019 12:25:58 +0100 (CET)
Subject: Finding the spirit of L-World
In-Reply-To: <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com> <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com>
Message-ID: <2142437387.1385175.1550921158333.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Kevin Bourrillion"
> Cc: "valhalla-spec-experts"
> Sent: Saturday, 23 February 2019 03:57:56
> Subject: Re: Finding the spirit of L-World

> On Feb 22, 2019, at 11:42 AM, Kevin Bourrillion wrote:
>>
>> I think we should make users write `equals` to test value types. If they write
>> `==`, they are indicating a special situation where they need identity
>> semantics, which don't make sense for value types, and that should be an error.
>
> This sounds like a proposal for the future, but as Brian points
> out it is also a constraint on large amounts of generic code
> that has already been written.

No, it's not, because there is no reified generics code yet. You don't need to support == on a T that can be reified, because at the same time you add 'any' in front of T (or whatever the way is to say that the container class is a reified generic), you can also replace the use of == + equals (LIFE) with substituableEquals().

>
> Let's make the best of op==; it's in our past and the future
> of comparison logic in Java is too tightly coupled with the past.

This reminds me of another issue. We are discussing Java the language a lot, but it is not the only language that runs on the JVM. By making acmp equivalent to substituableEquals(), we are choosing semantics that may be OK for Java the language, but this change clearly also impacts the other languages; I'm thinking of Clojure's identical?, for example.

>
> -- John

Rémi

From forax at univ-mlv.fr  Sat Feb 23 11:37:12 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 23 Feb 2019 12:37:12 +0100 (CET)
Subject: Finding the spirit of L-World
In-Reply-To: <4F6FA819-C4FE-42A3-9103-3F5C680888D3@oracle.com>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com> <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com> <4F6FA819-C4FE-42A3-9103-3F5C680888D3@oracle.com>
Message-ID: <1328673173.1386157.1550921832325.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Kevin Bourrillion"
> Cc: "valhalla-spec-experts"
> Sent: Saturday, 23 February 2019 04:02:15
> Subject: Re: Finding the spirit of L-World

> On Feb 22, 2019, at 6:57 PM, John Rose wrote:
>
> P.S. Also, in exactly-typed code, the substitutability test
> will *often* be exactly what the user wants. The default
> implementation of ValObject.equals will use that test,
> and users will surely opt to inherit that rather than write
> a local equals method that does the same.
>
> (Not always, of course; there are value types and object
> types where the default equals method doesn't DTRT.
> But often, for cases like small numerics and tuples.)
>
> In that value class, requiring the user to call x.equals(y) instead
> of writing the known exact equivalent x==y feels like somebody's
> code style policy, rather than a useful discipline.

While I agree, I think we can say that by default == on value types is a compile error, but you can opt in to have == redirected to equals() at the compiler level. You don't need to change acmp for that.

Rémi

From john.r.rose at oracle.com  Sat Feb 23 23:44:37 2019
From: john.r.rose at oracle.com (John Rose)
Date: Sat, 23 Feb 2019 15:44:37 -0800
Subject: Finding the spirit of L-World
In-Reply-To: <2142437387.1385175.1550921158333.JavaMail.zimbra@u-pem.fr>
References: <1c4374be-99ea-850b-1a1c-3535e0b440fb@oracle.com> <187341CC-87D9-4A4F-B902-C52476EA0532@oracle.com> <2142437387.1385175.1550921158333.JavaMail.zimbra@u-pem.fr>
Message-ID: <6148A0E7-A618-483A-AA74-638CEFF73800@oracle.com>

On Feb 23, 2019, at 3:25 AM, Remi Forax wrote:
>
> This reminds me of another issue. We are discussing Java the language a lot, but it is not the only language that runs on the JVM. By making acmp equivalent to substituableEquals(), we are choosing semantics that may be OK for Java the language, but this change clearly also impacts the other languages; I'm thinking of Clojure's identical?, for example.

JVM languages can and will change their translation strategy if they don't like what acmp does (regardless of what we end up doing).

That's one reason it seems likely to me that the losers of the acmp sweepstakes (including my old favorite, the one that is allowed to return false if the JVM wants to punt) should be given useful jobs under other names, not "acmp". My working title for the punting acmp is System.fastSubstitutabilityCheck.

The reason acmp is hard is that it's running in old code that hasn't yet been upgraded. Also, we want upgrading to be easy. This is one reason today's generics (though not reified) place constraints on tomorrow's reified generics. So it's not quite correct to say "that code isn't written yet" as if it could compile under whatever new rules we want.

From forax at univ-mlv.fr  Mon Feb 25 10:11:28 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 25 Feb 2019 11:11:28 +0100 (CET)
Subject: The substituability test is breaking the encapsulation
Message-ID: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr>

Hi all,
there is another issue with making the component-wise test available for any value type: it leaks the implementation.

Let's say we have this class:

  import java.util.Random;
  import java.util.stream.IntStream;

  // 'value class' is the prototype syntax under discussion, not released Java
  public value class GuessANumber {
    private final int value;

    public GuessANumber(int value) {
      this.value = value;
    }

    public enum Response { LOWER, GREATER, FOUND };

    public Response guess(int guess) {
      if (value < guess) {
        return Response.LOWER;
      }
      if (value > guess) {
        return Response.GREATER;
      }
      return Response.FOUND;
    }

    public static GuessANumber random(int seed) {
      return new GuessANumber(new Random(seed).nextInt(1024));
    }
  }

You might naively think that if we have an instance of GuessANumber
  var number = GuessANumber.random(0);
you cannot get the value of the private field of that instance, but using == you can find it, because == lets you test whether number is substitutable for a user-created GuessANumber.

Here is how to find the value without using the method guess():
  System.out.println(IntStream.range(0, 1024).filter(n -> new GuessANumber(n) == number).findFirst());

Rémi

From brian.goetz at oracle.com  Mon Feb 25 14:32:18 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 25 Feb 2019 09:32:18 -0500
Subject: The substituability test is breaking the encapsulation
In-Reply-To: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr>
References: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr>
Message-ID: <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com>

Good -- let's drill into this.

At a high level, you're saying there's a tension between encapsulation and a state-based comparison primitive; that the state-based comparison is a side channel through which encapsulated state may be leaked. That's true. (Just as there is a tension between "values are objects" and "objects have identity".)

To pick up on John's note from v-dev over the weekend, value-objects are more easily _forgeable_ than identity-objects. There are infinitely many possible java.lang.Integers, because of the unique-per-instance identity; there are only finitely many instances of

  value class IntWrapper { public int i; }

and, given access to the constructor, you can construct them all, and readily stamp out whatever instance you like, and it is just as good as all other instances with that state.

We want values to have as many of the things that classes have as possible, within the constraints that values eschew identity. So they can't have mutability or layout polymorphism. But they can have methods, fields, constructors, type variables, etc. And we'd like for "encapsulation" to be in this set.

As a trivial observation, the concern you raise here goes away if the constructor is not accessible to the attacker. That suggests there are at least two paths to plugging this leak: tighten state-based comparison, or require classes that want to encapsulate their state to also encapsulate the constructors that can produce arbitrary state.

So, rather than blaming ==, or blaming encapsulation, let's set out some expectations for how we want to use encapsulation in values.

(I think this problem may be related to another problem: that of when a client should be allowed to use `withfield`. For an unencapsulated class like Point, where the ctor expresses no constraints, it seems desirable to let clients say "p __with x = 2" (with whatever syntax), without making the author expose yet more accessor methods, but clearly for encapsulated values, that's not OK.)

From john.r.rose at oracle.com  Mon Feb 25 20:37:00 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 25 Feb 2019 12:37:00 -0800
Subject: The substituability test is breaking the encapsulation
In-Reply-To: <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com>
References: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr> <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com>
Message-ID:

On Feb 25, 2019, at 6:32 AM, Brian Goetz wrote:
> ...
> To pick up on John's note from v-dev over the weekend, value-objects are more easily _forgeable_ than identity-objects. There are infinitely many possible java.lang.Integers, because of the unique-per-instance identity; there are only finitely many instances of
>
> value class IntWrapper { public int i; }
>
> and, given access to the constructor, you can construct them all, and readily stamp out whatever instance you like, and it is just as good as all other instances with that state.

My current favorite metaphor for explaining the difference between value* objects and reference* objects is the one you coined, Brian, of the little band placed around the infant's wrist just after birth. Similarly the JVM adds such a unique extra identity* (what else do you call it?) to every Integer but not to any int. Values are not values because they have something different from objects, but because they don't have the wristband; you can't tell them apart anymore, compared to objects which always carry their little wristband around. (Metaphor failures: We don't use infant wristbands to *tell infants apart*. And the infant eventually loses the wristband. Tattoo? Let's not; besides those are forgeable. Regardless, the wristband is helpful.)

So if an implementor wants every new value to be not-same* to every other new value, the implementor of the value* class can just add a wristband. That is:

> value class UnforgeableIntWrapper {
>   public int i;
>   private Object wristband = new Object();
> }

And done, I hope.

Does the JVM have anything to add here? I don't think so. If we were to create unforgeable value objects as a third or fourth kind of type, this is the implementation we'd use.

So encapsulation, controlled by user choice, provides adequate control over this corner case in op== semantics.

-- John

From forax at univ-mlv.fr  Mon Feb 25 21:23:42 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 25 Feb 2019 22:23:42 +0100 (CET)
Subject: The substituability test is breaking the encapsulation
In-Reply-To: <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com>
References: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr> <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com>
Message-ID: <1034373372.256371.1551129822096.JavaMail.zimbra@u-pem.fr>

What I'm saying is that using a component-wise test as == has a security implication, something I was not aware of before thinking about it, and something I'm sure our users don't want to be aware of.

Having two different meanings for "encapsulation", one for references and one for values, is possibly a solution, but it moves the problem to the users by saying: you will have to be careful enough to know that class encapsulation and value class encapsulation work differently.
The first part of the motto is "code like a class" and not "code like a class but beware because the encapsulation model is different".

It also makes the implementation of an interface by a value class more hazardous. For example, can a Panama Address be implemented by a value class? The answer is not easy because the encapsulation model is leaky.

> if the constructor is not accessible to the attacker

So a serializable value class is a security liability?

And a component-wise test is also prone to timing attacks: you can guess the value of the fields far faster than by checking all combinations.

Rémi

From forax at univ-mlv.fr  Mon Feb 25 21:28:03 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 25 Feb 2019 22:28:03 +0100 (CET)
Subject: The substituability test is breaking the encapsulation
In-Reply-To:
References: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr> <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com>
Message-ID: <902152766.256696.1551130083866.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Brian Goetz"
> Cc: "Remi Forax" , "valhalla-spec-experts"
> Sent: Monday, 25 February 2019 21:37:00
> Subject: Re: The substituability test is breaking the encapsulation

> So encapsulation, controlled by user choice, provides adequate
> control over this corner case in op== semantics.

It's unforgeable, but you can still guess the content of 'i' using a timing attack, no?

Rémi

From brian.goetz at oracle.com  Mon Feb 25 21:38:58 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 25 Feb 2019 16:38:58 -0500
Subject: The substituability test is breaking the encapsulation
In-Reply-To: <1034373372.256371.1551129822096.JavaMail.zimbra@u-pem.fr>
References: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr> <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com> <1034373372.256371.1551129822096.JavaMail.zimbra@u-pem.fr>
Message-ID:

In the absolute worst case, we could give up on encapsulation of fields. I wouldn't want to do that, but right now, I'd rather do that than accept a non-reflexive ==. But I'm sure there's a better alternative -- let's find it.

From john.r.rose at oracle.com  Mon Feb 25 21:45:02 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 25 Feb 2019 13:45:02 -0800
Subject: The substituability test is breaking the encapsulation
In-Reply-To: <902152766.256696.1551130083866.JavaMail.zimbra@u-pem.fr>
References: <54193999.81105.1551089488401.JavaMail.zimbra@u-pem.fr> <9751BA9A-28B1-40DE-AB02-3F883BC6F64F@oracle.com> <902152766.256696.1551130083866.JavaMail.zimbra@u-pem.fr>
Message-ID: <0E341196-3A06-4861-804D-01773DCD6F56@oracle.com>

On Feb 25, 2019, at 1:28 PM, forax at univ-mlv.fr wrote:
>
>> So encapsulation, controlled by user choice, provides adequate
>> control over this corner case in op== semantics.
>
> It's unforgeable, but you can still guess the content of 'i' using a timing attack, no?

Yes. These points are not unique to values. They are well known for non-values. If you use value types as security tokens, there's a set of best practices you need to follow.

For example, here is this kind of thing today:

  class PasswordWrapper {
    private String password;
    public PasswordWrapper(String password) { ... }
    boolean checkPassword(String attempt) { ... }
    // and an anti-pattern:
    enum Status { GOT_IT, LOWER, HIGHER, HOTTER, COLDER, ... }
    Status sniffPassword(String attempt) { ... }
  }

The timing attacks are equivalent to the presence of a sniffPassword method. None of this depends particularly on value-ness.

The new thing you have noticed, Remi, is that value* classes lack a routine means of forge prevention, the identity wristband. This is the same thing that int lacks and String lacks (as usually employed) and VBCs lack (if clients follow the VBC rules). Does this mean that ints and strings and VBCs are less secure than opaque objects? Yes, I suppose they are, by some measures of security.
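
[Editorial sketch] The forging loop and the wristband remedy discussed in this thread can be tried out in released Java, with one loud assumption: since value classes are not available, ordinary final classes stand in for them, and a state-only equals() plays the role that == would play on a value. The names WristbandSketch, IntWrapper, BandedIntWrapper, and forge are hypothetical.

```java
import java.util.OptionalInt;
import java.util.function.IntFunction;
import java.util.stream.IntStream;

final class WristbandSketch {

    // Stand-in for "value class IntWrapper": equality looks at state only.
    static final class IntWrapper {
        private final int i;
        IntWrapper(int i) { this.i = i; }
        @Override public boolean equals(Object o) {
            return o instanceof IntWrapper && ((IntWrapper) o).i == i;
        }
        @Override public int hashCode() { return Integer.hashCode(i); }
    }

    // Stand-in for "UnforgeableIntWrapper": the wristband field holds an
    // identity object, so a freshly constructed instance never compares equal.
    static final class BandedIntWrapper {
        private final int i;
        private final Object wristband = new Object();
        BandedIntWrapper(int i) { this.i = i; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof BandedIntWrapper)) return false;
            BandedIntWrapper other = (BandedIntWrapper) o;
            return other.i == i && other.wristband == wristband;
        }
        @Override public int hashCode() {
            return 31 * Integer.hashCode(i) + System.identityHashCode(wristband);
        }
    }

    // The attacker's loop: build candidates through the accessible
    // constructor and compare each one against the secret instance.
    static OptionalInt forge(Object secret, IntFunction<Object> constructor) {
        return IntStream.range(0, 1024)
                .filter(n -> constructor.apply(n).equals(secret))
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(forge(new IntWrapper(42), IntWrapper::new));
        System.out.println(forge(new BandedIntWrapper(42), BandedIntWrapper::new));
    }
}
```

The sketch only addresses forging by construction. It says nothing about the timing side channel raised above, which applies to the banded variant as well.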

If you want a really insecure value, make it Comparable, so that it can be guessed by binary search. Does this mean Comparable is insecure?

-- John

From karen.kinnear at oracle.com  Wed Feb 27 15:30:47 2019
From: karen.kinnear at oracle.com (Karen Kinnear)
Date: Wed, 27 Feb 2019 10:30:47 -0500
Subject: Valhalla EG notes Feb 13, 2019
Message-ID: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com>

Attendees: John, Remi, Dan H, Tobi, Simms, Karen

AI: Karen: resend John's Template Class proposal: http://cr.openjdk.java.net/~jrose/values/template-classes.html
AI: Remi: write up proposal on specializing parameters to defineClass

I. DH: Locking options: put stake in the ground: Throw exception - consensus

II. JR: Generic Specialization: Template class refinement
Propose we do at least one prototype for specialization mechanisms this year.

1. Class file is still chief entity
   constant pool: more articulated, as are class, field and method
   goal: share constants and bytecodes as much as possible
   LWorld helps here
   Constants: change signatures to add reified type parameters in descriptors - model 3 generated new classfiles, lost all sharing
   Therefore: constant pool needs to be partially shared and partially specialized
   Proposal: constant pool segments
   Holes: fill with parameterized types
   Requirement: No holes in the concrete by the time you get to a reference. Actually no holes by the time you get to verifying the species

RF: Concern about resolving too early if specialize CP
JR: Risk - we need an experiment
RF: wants greater dynamicity, possibly fully dynamic
JR: In the VM: we can't always do late binding - e.g.
heap layout - need full information early
RF: agree layout needs early info
   generate specialized class by filling hole when defining the class
JR: Entity model: segmented constant pool
   1 global segment
   local segments depending on 1 or more holes, tree structured to at least depth 2
   Hole kinds: field type, dynamic constant, method type, MethodHandle
   structural inheritance of constraints
   Fill hole when we specialize a CP segment
DH: global segment seen by others?
JR: yes - resolved at most once
DH: condy and MH: lookup dependent on instantiation
JR: fill class holes when load/define a class/species, before referencing - i.e. field or method reference
   class_info, method_info, field_info refer to segment
   class template, method template, field template - must specialize constant pool and then instantiate
   Load class for specialization by providing hole values (ed. note: provide live types)
   Open question: how to represent generic method more generic than enclosing class
DH: CP indices globally numbered?
JR: yes. segments are not overlays
DH: named segments?
JR: based on 1st constant in the segment
   rules for referential integrity and placement of constants in segment
DH: each specialization CFLH/redefinition or 1 per template?
JR: open - default yes
   (ed. note - need to revisit this one - earlier assumption was redefinition of template class, not each species)
JR: also nested generics with a shared constant pool, e.g. in future an inner class could share the same class file
RF: What should be shared/not shared? Would like to see done dynamically, specialize parameters to defineClass
   - JR had to leave
DH: JVMTI and general tooling issues
KK: working on sharing requirements
   class species specialized conditional
   note: sharing requirements are: class-wide and per-species, there is nothing shared across a subset of species. Conditional (possibly "where" syntax) determines if a method for example will be part of any given species
KK: other open issue: raw vs.
erased - and best way to deal with backward compatibility
   During the meeting asked if virtual methods/virtual fields are only needed to deal with raw/wild types - answer was yes.
   (ed. note: after meeting - found another case - which I can't recall at the moment)
RF: client level proposal: old generics vs. new generics
   option 1: client with old generics not reference code with new generics
   option 2: not have to recompile client code to use it, need virtual dispatch
   Proposal: embrace reuse as central design.
   Constant pool specialization - want to be careful about adopting java generics semantics. Other languages, e.g. Scala can't use this - slightly different generics semantics
KK: Is there anything we could add to the class file that would make it easier to support generics in other languages?
RF: wants to do a prototype at runtime, with no java semantics in the design
KK: would it be useful to have information in the class file and language-specific specializers?
RF: future: use Lookup.defineClass with 1 dynamic parameter
   at runtime the dynamic object is like a static of the species
   no representation of species at compile time
   no reified type in constant pool
KK: Does this imply no sharing at all?
RF: Yes
RF: Derive species when needed - ask for specialization, create new if none exists, and intern
   Mark if a field or method is specialized
DH: Can a descriptor refer to specific specialization?
RF: No: dynamic check at runtime
DH: JIT engineers: if this model depends on JIT magic to work, concerns about startup especially in constrained environments
RF: specializing a class vs. method are different, maybe JR's model for classes
   tradeoffs: not want java generic semantics in vm
   if full template specialization, can't have sharing
   Swift: template: either generic code or compile time inline and specialize - based on caller/callee
KK: concerns about performance cost of virtual field/virtual method additional indirections
   What are the sharing requirements?
RF: generic methods - just want to share resolution, not share data
  segment for each combination of type parameters too much
DH: only specialize for value types, reduces the problem
RF: Haskell e.g. linked list which encodes the type of the next link, or tuple as linked list
  if CP specialization - never call with exactly the same type
DH: If it hurts ...
RF: currently works with erasure. Concern slow if new specialization for each link

III. RF DynamicValue attribute
Another project Remi will lead and create JEP
language level: static lazy final
improve startup by allowing init with Condy at first access of individual static

Drawbacks: opt-in at source
change in semantics
in static block - there is a lock
condy BSM can execute multiple times

corrections welcome,
thanks,
Karen

From john.r.rose at oracle.com Wed Feb 27 20:33:30 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 27 Feb 2019 12:33:30 -0800
Subject: lazy statics design notes
In-Reply-To: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com>
Message-ID: <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com>

On Feb 27, 2019, at 7:30 AM, Karen Kinnear wrote:
> Subject: Valhalla EG notes Feb 13, 2019
> To: valhalla-spec-experts
> ...
> III. [Remi Forax] DynamicValue attribute
> Another project Remi will lead and create JEP
> language level: static lazy final
> improve startup by allowing init with Condy at first access of individual static
>
> Drawbacks: opt-in at source
> change in semantics
> in static block - there is a lock
> condy BSM can execute multiple times

I was just talking with Vladimir Ivanov about lazy statics. He is working on yet another performance pothole with <clinit>, generated by Clojure this time. (It's not their fault; the system had to clean up a problem with correct initialization order, and <clinit> execution is over-constrained already, so the JIT has to generate more conservative code now.)

I believe lazy statics would allow programmers (and even more, language implementors) to use much smaller <clinit>s, or none at all, in favor of granular lazy statics.

So, here's a brain dump, fresh from my recent lunch with Vladimir:

Big problem #1: If you touch one static, you buy them all.
Big problem #2: If any one static misbehaves (blocking, bad bootstrap), all statics misbehave.
Big problem #3: If <clinit> hasn't run yet, you need initialization barriers on all use points of statics; result is that <clinit> itself, and anything it calls, is uniquely non-optimizable.
Big problem #4: After touching one static, the program cannot make progress until the mutex on the whole Class object is released.
Big problem #5: Setting up multiple statics is not transactional; you can observe erroneous intermediate states during the run of the <clinit>.
Big problem #6: Statics are really, really hard to process in an AOT engine, because nearly every pre-compiled code path must assume that the static might not be booted up yet, and if boot-up happens (just once per execution) it invalidates many of the assumptions the AOT engine wants to make about nearby code.

Solutions from lazy statics:

Solution #1: If you touch one that's the one you buy (plus what's in the vestigial <clinit> if there is one at all).
Solution #2: Misbehaving statics don't misbehave until they are used (yes, bug masking, boo hoo).
Solution #3: Initialization barriers are trivial: Just detect the T.default value of the variable.
Solution #4: There is no mutex, just a CAS at the end of the BSM for the lazy static; no critical section.
Solution #5: The CAS at the end of the BSM is inherently transactional.
Solution #6: AOT engines can generate somewhat simpler fast-path code by just testing for T.default; the slow-path code is still hard to optimize, but the limits are from the complexity of the BSM that initializes the lazy static, not the total complexity of the <clinit> code.

Objection: What if you *want* a mutex? I didn't like the JVM blocking everything in <clinit> but I don't want a million racing threads computing the same BSM value either. Ans: Fine, but make that an opt-in mechanism, by folding some kind of flow control into the relevant BSM, for your particular use case. The JVM doesn't have to know about it.

Objection: What if I want several statics to initialize in one event, with or without mutex or transactions? Ans: Easy, just have the BSM for each touch the others, or run a common BSM that sets everything up (and then returns the same value). (Note: At the cost of an idempotency requirement during lazy init.) In the most demanding cases, define a private static nested class to serialize everything, which is today's workaround.

Objection: Those aren't real statics, because you can't set them to their T.default values! Ans: They are as real as you are going to get without creating lots of side metadata to track the N+1st variable state, which is a cost nobody wants to pay.

Objection: But I do want to opt into the overhead and you aren't giving me my T.default; I need the full range of values for my special use case. Ans: Then add an indirection for your use case, to a wrapped copy of your desired value; the null wrapper value is the T.default in this case. It's at least as cheap as anything the JVM would have done intrinsically.

Objection: You disrespect 'boolean'. It only has one state left after you filch 'false' to denote non-initialization. My VM hack can do much better than that. Ans: Let me introduce you to java.lang.Boolean. It has three states.

Objection: What if someone uses bytecode to monkey with the state of my lazy static? Your design is broken! Ans: This is the sort of corner case that needs extra VM support. In this case, it is sufficient to privatize write access to a variable, even though it may be public, to its declaring class. You can trust the declaring class not to compile subverting assignments into itself, because javac won't let it.

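The mechanics described so far (a default-value test on the fast path, a bootstrap that may run more than once, and a single CAS that publishes the result) can be imitated in plain Java today. What follows is only a sketch under stated assumptions, not the proposed VM feature: `LazyStaticSketch`, `Lazy`, `GREETING`, `FLAG` and `BOOTSTRAP_RUNS` are hypothetical names, a `Supplier` stands in for the BSM, `null` stands in for T.default, and an `AtomicReference` stands in for whatever the VM would do natively. Note that the `Lazy` holders themselves are still created by the ordinary static initializer; only the values they hold are computed lazily.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public final class LazyStaticSketch {

    /** Hypothetical stand-in for one lazy static; null plays the role of T.default. */
    public static final class Lazy<T> {
        private final AtomicReference<T> slot = new AtomicReference<>(); // null = not initialized yet
        private final Supplier<T> bootstrap;                              // assumption: stands in for the BSM

        public Lazy(Supplier<T> bootstrap) {
            this.bootstrap = bootstrap;
        }

        public T get() {
            T current = slot.get();
            if (current != null) {
                return current;                 // fast path: only a test against the default value
            }
            T computed = bootstrap.get();       // no lock held: racing threads may all run this
            if (computed == null) {
                throw new IllegalStateException("bootstrap returned the default value");
            }
            // One CAS publishes the winner; a losing thread discards its own result.
            return slot.compareAndSet(null, computed) ? computed : slot.get();
        }
    }

    public static final AtomicInteger BOOTSTRAP_RUNS = new AtomicInteger();

    public static final Lazy<String> GREETING = new Lazy<>(() -> {
        BOOTSTRAP_RUNS.incrementAndGet();
        return "hello";
    });

    // The Boolean wrapper gives three states: null (unset), TRUE and FALSE.
    public static final Lazy<Boolean> FLAG = new Lazy<>(() -> Boolean.FALSE);

    public static void main(String[] args) {
        System.out.println(GREETING.get());
        System.out.println(FLAG.get());
        System.out.println(BOOTSTRAP_RUNS.get());
    }
}
```

Because the bootstrap runs outside any lock, it has to tolerate being executed by several threads at once, which is the idempotency requirement mentioned in the answers above. Mutual exclusion, if wanted, would go inside the supplier rather than inside `Lazy`.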
Objection: I can't imagine the language design for this; surely there are difficulties you haven't foreseen. Ans: Neither can I, and there certainly are. The sooner we start trying out prototypes the sooner we'll shake out the issues. There are several things to try:

http://openjdk.java.net/jeps/8209964
http://cr.openjdk.java.net/~jrose/draft/lazy-final.html

Bonus: The T.default hack scales to non-static fields as well. So laziness is a separable tool from the decision to make things static or not; it survives more refactorings. The technique is abundantly optimizable (both static and non-static versions) as proven by the good track record of @Stable inside the JDK. We should share this gem outside the JDK, which requires language and (more) VM support. Language design issue: It's easier to do the lazy static with an attribute than doing the lazy non-static; you need an instance-specific callback for the latter. TBD.

The nice thing about this is that the OpenJDK JITs have been making good use of @Stable annotations for a long time. So the main problem here is finding a language and VM framework that legitimizes this sort of pattern (including safety checks and rule enforcement on state changes). When that is done, the JITs should make use of it with little extra effort.

- John

From brian.goetz at oracle.com Wed Feb 27 20:58:09 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 27 Feb 2019 15:58:09 -0500
Subject: lazy statics design notes
In-Reply-To: <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com> <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com>
Message-ID: <72a9ea69-581c-80b3-dde9-ffec8f8ec5da@oracle.com>

I think the answer to all the objections is "then just use <clinit>".

Programmers are lazy; we can use this to our advantage.
If the users get the benefit of lazy initialization with one new keyword (lazy-final), they will likely use it because programmers love to prematurely optimize and sprinkling "lazy" is easy. The result will be, most <clinit> will empty out, except for the ones doing weird stuff.

(This strategy is analogous to: to lose weight, you need not control what you eat, you just need to control what food is in your house. And you don't even have to do anything here other than "don't buy more unhealthy food"; our natural snacking tendencies will empty out the pantry fast enough, and then all that's left will be kale soon enough.)

From john.r.rose at oracle.com Wed Feb 27 21:01:07 2019
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 27 Feb 2019 13:01:07 -0800
Subject: lazy statics design notes
In-Reply-To: <72a9ea69-581c-80b3-dde9-ffec8f8ec5da@oracle.com>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com> <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com> <72a9ea69-581c-80b3-dde9-ffec8f8ec5da@oracle.com>
Message-ID: <76B29D56-5D18-43FA-B514-70F5EBEB82EC@oracle.com>

On Feb 27, 2019, at 12:58 PM, Brian Goetz wrote:
>
> I think the answer to all the objections is "then just use <clinit>".

That's fair. Maybe I was overthinking that part.

> Programmers are lazy; we can use this to our advantage.
> If the users get the benefit of lazy initialization with one new keyword (lazy-final), they will likely use it because programmers love to prematurely optimize and sprinkling "lazy" is easy. The result will be, most <clinit> will empty out, except for the ones doing weird stuff.

Bravo, yes.

> (This strategy is analogous to: to lose weight, you need not control what you eat, you just need to control what food is in your house. And you don't even have to do anything here other than "don't buy more unhealthy food"; our natural snacking tendencies will empty out the pantry fast enough, and then all that's left will be kale soon enough.)

You can *DO* that??

From forax at univ-mlv.fr Thu Feb 28 10:00:01 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 28 Feb 2019 11:00:01 +0100 (CET)
Subject: lazy statics design notes
In-Reply-To: <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com> <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com>
Message-ID: <609925803.719671.1551348001599.JavaMail.zimbra@u-pem.fr>

In terms of semantics, I don't think transitioning from T.default, i.e. the @Stable semantics, is the 'right' semantics.

Here is my current mental model of the relation between indy, condy, static lazy final and Brian's forward bridges.

|             | use site      | declaration site                 |
| method call | invokedynamic | Brian's forward bridges          |
| field       | invokedynamic | Brian's forward bridges          |
| constant    | ldc condy     | getstatic + DynamicValue + condy |

static lazy final is lazy because it shares the same late binding semantics as indy and condy, i.e. it uses the CAS of condy, so it doesn't need a specific CAS from T.default.

I think that trying to come up with a design that will encompass both lazy instance fields and lazy static fields is a trap, because in the lazy instance case there is no constant pool to store the value.
Given that we already have condy, I prefer to see the lazy static field as a way to provide a symbolic name to get the condy value from outside the class that stores the condy.

I believe the static lazy final is a kind of virtual static field: not virtual because there is a virtual dispatch, but virtual in the sense that there is no memory associated with this field in the part of the class that stores the static field values, because the value of the field is stored in the constant pool. Or said differently, constant dynamic stores the value in the constant pool, and a getstatic on a static lazy final is a way to retrieve that value using a symbolic name.

In terms of implementation, getstatic + DynamicValue + condy is really close to a ldc condy, but there is first an access check and it can trigger clinit; then, instead of returning the value of current_constant_pool[condy_index] like condy, it returns owner_of_getstatic_constantpool[condy_index].

For the interpreter, a getstatic on a static lazy final field can be quickened into a kind of "long ldc condy" once the access check and class init are done.

Rémi

From forax at univ-mlv.fr Thu Feb 28 10:02:10 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 28 Feb 2019 11:02:10 +0100 (CET)
Subject: lazy statics design notes
In-Reply-To: <76B29D56-5D18-43FA-B514-70F5EBEB82EC@oracle.com>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com> <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com> <72a9ea69-581c-80b3-dde9-ffec8f8ec5da@oracle.com> <76B29D56-5D18-43FA-B514-70F5EBEB82EC@oracle.com>
Message-ID: <2035995566.720923.1551348130109.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Brian Goetz"
> Cc: "valhalla-spec-experts"
> Sent: Wednesday, February 27, 2019 22:01:07
> Subject: Re: lazy statics design notes

> On Feb 27, 2019, at 12:58 PM, Brian Goetz wrote:
>>
>> I think the answer to all the objections is "then just use <clinit>".
>
> That's fair. Maybe I was overthinking that part.

yes :)

>> Programmers are lazy; we can use this to our advantage. If the users get the
>> benefit of lazy initialization with one new keyword (lazy-final), they will
>> likely use it because programmers love to prematurely optimize and sprinkling
>> "lazy" is easy. The result will be, most <clinit> will empty out, except for
>> the ones doing weird stuff.
>
> Bravo, yes.

yes, for this specific case, premature optimization goes in the direction we want.

Rémi

>> (This strategy is analogous to: to lose weight, you need not control what you
>> eat, you just need to control what food is in your house. And you don't even
>> have to do anything here other than "don't buy more unhealthy food"; our
>> natural snacking tendencies will empty out the pantry fast enough, and then all
>> that's left will be kale soon enough.)
>
> You can *DO* that??

From brian.goetz at oracle.com Thu Feb 28 15:54:31 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 28 Feb 2019 10:54:31 -0500
Subject: lazy statics design notes
In-Reply-To: <609925803.719671.1551348001599.JavaMail.zimbra@u-pem.fr>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com> <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com> <609925803.719671.1551348001599.JavaMail.zimbra@u-pem.fr>
Message-ID:

> I think that trying to come up with a design that will encompass both lazy instance fields and lazy static fields is a trap, because in the lazy instance case there is no constant pool to store the value.

I tend to agree with Remi on this one. Condy may be a "mere" implementation tactic, but it's a darn good one, it has the semantics we want, and the use case of lazy statics is far more important than lazy instance vars. (We would be happy if we only had lazy statics and not lazy instances.)

Note too that the DynamicConstantValue story is a tradeoff to enable a binary compatible migration from non-lazy to lazy; I think this is a sort of corner case, as the vast majority of field accesses are within the same class. If we didn't care about binary-compatible migrations for public static final fields, there's a translation-based story that is way simpler: cross-class field access desugars to invocations of a synthetic accessor.

From brian.goetz at oracle.com Thu Feb 28 17:17:43 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 28 Feb 2019 12:17:43 -0500
Subject: Finding the spirit of L-World
In-Reply-To:
References:
Message-ID:

> class Object { ... }
> class RefObject extends Object { ... }
> class ValObject extends Object { ... }

Let's talk about this one some more. There are some obvious practical benefits of bringing these key concepts into the type system. The first is that talking about ref vs value can be done with tools we already have:

    if (x instanceof RefObject) { synchronized(x) { ... } }

    void m(ValObject vo) { ... }

    class Foo { ... }

rather than inventing new ways to talk about "ref only" or "val only".

Another is that ref- or val-specific behavior has an obvious place to live, and can be implemented with tools we already have (e.g., final methods.)

A third is that all objects are no longer created equally; having the "top types" reflect this will help users learn and understand this.

A minor benefit is we no longer need an ACC_VALUE flag; we can just trigger off of "extends ValObject".

So, what are the costs and risks?

 - JVMs have to rewrite hierarchies so when a class is loaded that extends Object, it is rewritten to extend RefObject.
 - Existing code that relies on .getSuperclass() == Object.class might break.
 - Inheritance hierarchies get one deeper, making searches of superclass chains one longer.

Any others?

From forax at univ-mlv.fr Thu Feb 28 17:40:04 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 28 Feb 2019 18:40:04 +0100 (CET)
Subject: Finding the spirit of L-World
In-Reply-To:
References:
Message-ID: <1904649202.844853.1551375604971.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "valhalla-spec-experts"
> Sent: Thursday, February 28, 2019 18:17:43
> Subject: Re: Finding the spirit of L-World

>> class Object { ... }
>> class RefObject extends Object { ... }
>> class ValObject extends Object { ... }

> Let's talk about this one some more. There are some obvious practical benefits
> of bringing these key concepts into the type system. The first is that talking
> about ref vs value can be done with tools we already have:

> if (x instanceof RefObject) { synchronized(x) { ... } }
> void m(ValObject vo) { ... }
> class Foo { ... }

> rather than inventing new ways to talk about "ref only" or "val only".

One can note that all these examples only need ValObject, as the instanceof test can be rewritten as !(x instanceof ValObject), and introducing only ValObject is not an issue.
Obviously, it's not enough apart if we introduce a kind of negation, like class IdentityHashMap { } > Another is that ref- or val-specific behavior has an obvious place to live, and > can be implemented with tools we already have (e.g., final methods.) > A third is that all objects are no longer created equally; having the ?top > types? reflect this will help users learn and understand this. > A minor benefit is we no longer need an ACC_VALUE flag; we can just trigger off > of ?extends ValObject?. > So, what are the costs and risks? > - JVMs have to rewrite hierarchies so when a class is loaded that extends > Object, it is rewritten to extend RefObject. > - Existing code that relies on .getSuperclass() == Object.class might break. > - Inheritance hierarchies get one deeper, making searches of superclass chains > one longer. > Any others? the example from John, Object o = new Object(); o instanceof RefObject R?mi From rschmitt at pobox.com Thu Feb 28 17:57:59 2019 From: rschmitt at pobox.com (Ryan Schmitt) Date: Thu, 28 Feb 2019 09:57:59 -0800 Subject: Finding the spirit of L-World In-Reply-To: References: Message-ID: >From a language user's perspective, I find it strange that a widening conversion from ValObject to Object would give me *more* abilities (such as synchronization), rather than less: ValObject v = ...; synchronized (v) { ... } // (presumably) fails to compile Object o = v; // widening conversion synchronized (o) { ... } // presumably compiles What if this were flipped around? interface ValObject { ... } interface RefObject { ... } class Object implements RefObject, ValObject { ... } // values are objects now! Object o = new Point(...); synchronized (o) { ... } // so far so good ValObject v = (ValObject) o; // narrowing conversion; I can now do fewer things synchronized (v) { ... } // fails to compile RefObject r = (RefObject) o; // ClassCastException On Thu, Feb 28, 2019 at 9:20 AM Brian Goetz wrote: > > > class Object { ... 
}
> > class RefObject extends Object { ... }
> > class ValObject extends Object { ... }
>
> Let's talk about this one some more. There are some obvious practical
> benefits of bringing these key concepts into the type system. The first is
> that talking about ref vs value can be done with tools we already have:
>
> if (x instanceof RefObject) { synchronized(x) { ... } }
>
> void m(ValObject vo) { ... }
>
> class Foo { ... }
>
> rather than inventing new ways to talk about "ref only" or "val only".
>
> Another is that ref- or val-specific behavior has an obvious place to
> live, and can be implemented with tools we already have (e.g., final
> methods.)
>
> A third is that all objects are no longer created equally; having the "top
> types" reflect this will help users learn and understand this.
>
> A minor benefit is we no longer need an ACC_VALUE flag; we can just
> trigger off of "extends ValObject".
>
>
> So, what are the costs and risks?
>
> - JVMs have to rewrite hierarchies so when a class is loaded that extends
> Object, it is rewritten to extend RefObject.
> - Existing code that relies on .getSuperclass() == Object.class might
> break.
> - Inheritance hierarchies get one deeper, making searches of superclass
> chains one longer.
>
> Any others?
>
>
>

From brian.goetz at oracle.com Thu Feb 28 18:13:38 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 28 Feb 2019 13:13:38 -0500
Subject: Finding the spirit of L-World
In-Reply-To: <1904649202.844853.1551375604971.JavaMail.zimbra@u-pem.fr>
References: <1904649202.844853.1551375604971.JavaMail.zimbra@u-pem.fr>
Message-ID:

> the example from John,
> Object o = new Object();
> o instanceof RefObject

One way to fix this is: have `new Object` instantiate not an Object, but a RefObject. (This sort of recognizes that Object is half interface, half class.)

The weird part is that then new Object().getClass() != Object.class.
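
The reflective costs discussed above (code relying on .getSuperclass() == Object.class, and superclass chains getting one level deeper) can be illustrated with a minimal sketch in today's Java. RefObject does not exist in current JDKs, so `RefObjectStandIn`, `Widget`, and `SuperclassAssumptions` below are hypothetical names, and the injected superclass is only simulated by an explicit `extends`; this is not how a VM would perform the rewrite.

```java
// Hypothetical stand-in for the proposed injected superclass (an assumption;
// no such class exists in the JDK today).
class RefObjectStandIn { }

// A user would write "class Widget { }"; the explicit "extends" simulates
// what the proposed hierarchy rewrite would effectively produce.
class Widget extends RefObjectStandIn { }

class SuperclassAssumptions {
    // Fragile idiom: assumes a class with no declared superclass sits
    // directly under Object.
    static boolean fragileIsDirectlyUnderObject(Class<?> c) {
        return c.getSuperclass() == Object.class;
    }

    // Number of superclass links between c and the root; Object itself is 0.
    static int depth(Class<?> c) {
        int d = 0;
        for (Class<?> k = c.getSuperclass(); k != null; k = k.getSuperclass()) {
            d++;
        }
        return d;
    }

    // Holds on current JDKs; the thread notes it would stop holding if
    // "new Object" instantiated a RefObject instead.
    static boolean newObjectHasClassObject() {
        return new Object().getClass() == Object.class;
    }
}
```

With the simulated rewrite, the fragile check returns false for `Widget` and the chain depth becomes 2 rather than 1, which is the kind of breakage the cost list anticipates. Code that walks the chain until it reaches `null` or `Object.class`, rather than comparing the direct superclass, would be unaffected.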
From john.r.rose at oracle.com Thu Feb 28 20:44:46 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 28 Feb 2019 12:44:46 -0800
Subject: Finding the spirit of L-World
In-Reply-To:
References: <1904649202.844853.1551375604971.JavaMail.zimbra@u-pem.fr>
Message-ID: <99E03AAF-B432-4127-854F-0BB52F8E4FD5@oracle.com>

On Feb 28, 2019, at 10:13 AM, Brian Goetz wrote:
>
>> the example from John,
>> Object o = new Object();
>> o instanceof RefObject
>
> One way to fix this is: have `new Object` instantiate not an Object, but a RefObject. (This sort of recognizes that Object is half interface, half class.)

Yes. This proposal factors out the non-interface parts of Object into RefObject, which is a great move.

> The weird part is that then new Object().getClass() != Object.class.

One way to fix that and other reflective artifacts, is to piggy-back on specialized generics and their refined type system. In the spirit of brainstorming, let's pull on that string a bit.

We could declare that `RefObject` is a *species* of `Object`, not a subclass. Then, `Object` is revealed to be a template with just two species.

(The species parameter of `Object` could be `` for `RefObject`, etc. Shots fired! Look out, he's got a non-type in his parameter list! Or the species parameter could be a placeholder type of some sort.)

Then, given `var x = new Object()`, the value of `x.getClass()` would truly be `Object.class`, but the value of `x.getSpecies()` would be `RefObject.class`.

The Core Reflection API could be tweaked in a similar way, so that `String.class.getSuperclass()` would return `Object.class` but `String.class.getSuperSpecies()` would return `RefObject.class`.

Here I'm assuming a particular approach to updating Core Reflection to handle the new specialized types which will be popping up everywhere. My point here is that whatever detailed adaptations we make to Core Reflection for species could be made to apply to Object/RefObject.

Since templates aren't ready yet, this can't be a fully-baked proposal, but it could work in the long term. Maybe that suggests a short term approximation? Should we be talking about `Object` and `Object` as injected species, rather than `RefObject` and `ValObject`? They would have conditional members, whatever that means. That also feels like an unnecessary complexity now, but it might be more appropriate in a world with templates.

Maybe `RefObject` and `ValObject` can be specially-marked classes today, and promoted to species tomorrow. With the risk that the promotion would fail and they'd be specially-marked one-offs forever. Maybe that risk is acceptable. `Object` is already a one-off special class, so it's not terribly surprising that it would refactor into three one-offs. We'd try to rationalize `RefObject` with templates about the same time we rationalize the primitives as `ValObject`s.

-- John

From brian.goetz at oracle.com Thu Feb 28 21:04:27 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 28 Feb 2019 16:04:27 -0500
Subject: Finding the spirit of L-World
In-Reply-To: <99E03AAF-B432-4127-854F-0BB52F8E4FD5@oracle.com>
References: <1904649202.844853.1551375604971.JavaMail.zimbra@u-pem.fr> <99E03AAF-B432-4127-854F-0BB52F8E4FD5@oracle.com>
Message-ID: <81e50a3e-96e0-6dd4-0d61-ecce8f0570ff@oracle.com>

> One way to fix that and other reflective artifacts, is to piggy-back
> on specialized generics and their refined type system. In the spirit
> of brainstorming, let's pull on that string a bit.
>
> We could declare that `RefObject` is a *species* of `Object`, not
> a subclass. Then, `Object` is revealed to be a template with just
> two species.

As I was writing, I knew that this was where you were going to go. What that means is that we layer another requirement onto species; that they be denotable with simple names. I am not sure we want to go there, just for the sake of this one example?

> (The species parameter of `Object` could be ``
> for `RefObject`, etc. Shots fired! Look out, he's got a non-type in
> his parameter list! Or the species parameter could be a placeholder
> type of some sort.)

This is a sensible possible place to get to, but it means we give up the pedagogical benefit of RefObject/ValObject being "just" Java classes. If we can't write these in Java, that's a big loss; having to have generics with non-type parameters on Day 1 does a lot of damage to the delivery plan.

> Since templates aren't ready yet, this can't be a fully-baked proposal,
> but it could work in the long term. Maybe that suggests a short term
> approximation? Should we be talking about `Object` and
> `Object` as injected species, rather than `RefObject` and
> `ValObject`? They would have conditional members, whatever that
> means. That also feels like an unnecessary complexity now, but
> it might be more appropriate in a world with templates.

Yeah, also, think about the first day of class. Everything is an Object, they say. OK, fine, whatever an Object is. Now, some objects are Object, and some are Object, which are sub-species .... WTF? This is putting a grad-school concept on the 3rd grade curriculum.

An alternate approach (which in general I don't love, but could be made to work) is to make {Ref,Val}Object interfaces. I think it's a weaker semantic story, but maybe it's viable. In that case, we could have a (private) sub-species of Object that implements RefObject, and that's what you'd get when you say "new Object()". Then .getClass() is stable. But to do it for this reason seems like flea-wagging-tail-wagging-dog.

The hybrid approach, which happens to work in this case, is to have RefObject be an interface and ValObject be a class. Again, I don't like it as much as subclasses, but it's not outright unacceptable. It's a little better on the implementation-perturbations and a bit worse on the providing-a-sensible-user-model side.
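
To make the hybrid approach concrete, here is a rough sketch in today's Java, with RefObject modelled as an interface and ValObject as a class. All names (`RefObjectLike`, `ValObjectLike`, `Account`, `PointValue`, `HybridSketch`) are hypothetical stand-ins, not JDK types, and ordinary classes are used to imitate value classes, which do not exist yet.

```java
// Hypothetical stand-ins for the hybrid model; none of these are JDK types.
interface RefObjectLike { }        // RefObject modelled as an interface
abstract class ValObjectLike { }   // ValObject modelled as a class

// An identity-bearing class opts in by implementing the interface.
final class Account implements RefObjectLike {
    int balance;
}

// An ordinary class imitating a value class by extending the ValObject stand-in.
final class PointValue extends ValObjectLike {
    final int x;
    final int y;

    PointValue(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class HybridSketch {
    // Mirrors "if (x instanceof RefObject) { synchronized (x) { ... } }"
    // from earlier in the thread: lock only identity-bearing objects.
    // Returns true if the action ran under the object's monitor.
    static boolean runLockedIfRef(Object x, Runnable action) {
        if (x instanceof RefObjectLike) {
            synchronized (x) {
                action.run();
            }
            return true;
        }
        return false;
    }
}
```

Nothing in this sketch stops a class from both extending `ValObjectLike` and implementing `RefObjectLike`; as the meeting notes observe for the interface-based alternatives, the VM would have to enforce that exclusion itself.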
From john.r.rose at oracle.com Thu Feb 28 23:50:39 2019
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 28 Feb 2019 15:50:39 -0800
Subject: lazy statics design notes
In-Reply-To: <609925803.719671.1551348001599.JavaMail.zimbra@u-pem.fr>
References: <7CAFC675-248C-477A-9A4E-D9F331559EA6@oracle.com> <0755C5BD-59BE-4B73-9D09-C71421CBE8C5@oracle.com> <609925803.719671.1551348001599.JavaMail.zimbra@u-pem.fr>
Message-ID: <8008118D-8666-4E76-AB2A-18971BCD95C8@oracle.com>

On Feb 28, 2019, at 2:00 AM, Remi Forax wrote:
>
> static lazy final is lazy because it shares the same late late binding semantics as indy and condy, i.e., it uses the CAS of condy so it doesn't need a specific CAS from T.default.
> I think that trying to come up with something that will encompass lazy instance fields and lazy static fields is a trap, because in the lazy instance case there is no constant pool to store the value.

Hmm... smells like cheese! Let's go look. :-).~

In this case it's more than a trap; the trap can be turned around to become an opportunity.

> Given that we already have condy, I prefer to see the lazy static field as a way to provide a symbolic name to get the condy value from outside the class that stores the condy.

Sure we already have condy, and that should provide (a) an easy extension of the current spec. to cover linkage of these guys, and (b) a good implementation.

Looking at the language and user experiences, we see the cheese in the trap: "Hey, why do those fields have to be static? What if I want a non-static one? In fact, I'm in the mood to refactor my code to get rid of statics."

What I'm proposing is not to back away from the condy semantics for lazy-statics, but rather add a strategic "tweak" to them so that non-statics can be lazy too. In short: If the BSM returns T.default, don't accept it; throw BSME (BootstrapMethodError).

At the cost of excluding one point from the value space of a lazy variable of type T, we gain a wider range of efficient implementation choices for JVMs.
The excluded point is almost always `null`, which is a second-class value already; we are putting it to work in its usual supporting role as a sentinel value.

The excluded point can serve as a sentinel for reflection, too; a Field::getUnresolved call can return the sentinel if the lazy field is not yet resolved.

This tweak is very simple. It may irritate us a little, but will pave the way for a comfortable story that extends laziness to non-statics also.

-- John
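
The sentinel idea above can be approximated at the library level. The following is a minimal sketch, not the proposed JVM mechanism: `LazySlot` and `peek()` are hypothetical names, `null` plays the role of T.default, and IllegalStateException stands in for the BSME the proposal would throw.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical library-level analogue of a lazy field whose default value
// (null here) doubles as the "not yet resolved" sentinel.
final class LazySlot<T> {
    // null means "unresolved"; any non-null value is the resolved value.
    private final AtomicReference<T> slot = new AtomicReference<>();
    private final Supplier<? extends T> initializer;

    LazySlot(Supplier<? extends T> initializer) {
        this.initializer = initializer;
    }

    T get() {
        T current = slot.get();
        if (current != null) {
            return current; // already resolved
        }
        T computed = initializer.get();
        if (computed == null) {
            // Stand-in for rejecting T.default from the bootstrap method.
            throw new IllegalStateException("initializer returned the default value (null)");
        }
        // First writer wins. In this sketch a racing thread may run the
        // initializer too, but every caller sees the same resolved value.
        return slot.compareAndSet(null, computed) ? computed : slot.get();
    }

    // Loosely analogous to the Field::getUnresolved idea in the message:
    // returns the sentinel (null) if the slot has not been resolved yet.
    T peek() {
        return slot.get();
    }
}
```

Because the sentinel lives in the variable itself, the same scheme works for a per-instance slot as well as a static one, which is the flexibility the tweak is meant to buy; the price is that `null` can never be a legitimate resolved value.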