From forax at univ-mlv.fr Fri May 6 09:05:15 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 6 May 2022 11:05:15 +0200 (CEST)
Subject: [External] : Re: Record pattern and side effects
In-Reply-To:
References: <448085588.13668630.1650188930595.JavaMail.zimbra@u-pem.fr> <2897C3E9-8DD2-49AE-9421-479753BB51A5@oracle.com> <1535408822.14991807.1650533325656.JavaMail.zimbra@u-pem.fr> <399dc2fc-0179-4d7c-6568-10032c34d135@oracle.com> <2092001046.15312550.1650584924891.JavaMail.zimbra@u-pem.fr>
Message-ID: <1667419546.21849041.1651827915638.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, April 22, 2022 3:34:29 PM
> Subject: Re: [External] : Re: Record pattern and side effects

>>> Let's imagine that dtor D throws. The wrapping happens when a dtor/accessor is
>>> invoked _implicitly_ as a result of evaluating a pattern match. In both cases,
>>> we will wrap the thrown exception and throw MatchException. In this way, both
>>> instanceof and switch are "clients of" pattern matching, and it is pattern
>>> matching that throws.
>>> I don't see any destruction here.

>> I'm thinking about the refactoring from code using accessors to code using a
>> deconstructor. For example, IDEs may propose to refactor this code
>>
>>   if (x instanceof D d) A(d.p()); else B;
>>
>> to
>>
>>   if (x instanceof D(P p)) A(p); else B;
>>
>> or vice versa.
>> If you wrap deconstructor exceptions but not accessor exceptions, you have a
>> mismatch.

> OK, sure. This bothers me zero. Having an accessor (or dtor) throw is already
> really^3 weird; having a program depend on which specific exception it throws
> is really^32 weird. (In both cases, they still throw an exception that you
> probably shouldn't be catching, with a clear stack trace explaining where it
> went wrong.) Not a case to design the language around.
> Still not seeing any "destruction" here.
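To make the refactoring pair concrete, here is a minimal, hedged sketch; the record shapes D and P come from the example above, while the class name and the stand-in actions a/b are invented for illustration (requires a JDK with record patterns):

```java
record P() { }
record D(P p) { }

class RefactorDemo {
    static String a(P p) { return "A(" + p + ")"; }
    static String b()    { return "B"; }

    // Accessor form: test the type, then invoke the accessor explicitly.
    static String withAccessor(Object x) {
        if (x instanceof D d) { return a(d.p()); } else { return b(); }
    }

    // Record-pattern form: the accessor is invoked implicitly by the match,
    // which is why the two forms differ in *who* throws if p() misbehaves.
    static String withPattern(Object x) {
        if (x instanceof D(P p)) { return a(p); } else { return b(); }
    }

    public static void main(String[] args) {
        Object x = new D(new P());
        System.out.println(withAccessor(x));  // A(P[])
        System.out.println(withPattern(x));   // A(P[])
        System.out.println(withPattern("?")); // B
    }
}
```

For a non-throwing accessor the two forms are observably identical; the thread's question is only about which exception escapes when the accessor does throw.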
Let's try with a puzzler. I have a recursive list with a slight twist: the Cons can store the size of the list or not (using -1 if not). The accessor throws an exception, and with the semantics you propose it will happily be wrapped by as many MatchExceptions as possible.

public sealed interface RecChain {
  default int size() {
    return switch (this) {
      case Nil __ -> 0;
      case Cons(var v, var s, var next) -> 1 + next.size();
    };
  }

  record Nil() implements RecChain { }

  record Cons(int value, int size, RecChain next) implements RecChain {
    @Override
    public int size() {
      return size != -1 ? size : RecChain.super.size();
    }
  }

  public static void main(String[] args) {
    RecChain chain = new Nil();
    for (var i = 0; i < 100; i++) {
      chain = new Cons(i, -1, chain);
    }
    System.out.println(chain.size());
  }
}

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Fri May 6 13:11:43 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 6 May 2022 09:11:43 -0400
Subject: [External] : Re: Record pattern and side effects
In-Reply-To: <1667419546.21849041.1651827915638.JavaMail.zimbra@u-pem.fr>
References: <448085588.13668630.1650188930595.JavaMail.zimbra@u-pem.fr> <2897C3E9-8DD2-49AE-9421-479753BB51A5@oracle.com> <1535408822.14991807.1650533325656.JavaMail.zimbra@u-pem.fr> <399dc2fc-0179-4d7c-6568-10032c34d135@oracle.com> <2092001046.15312550.1650584924891.JavaMail.zimbra@u-pem.fr> <1667419546.21849041.1651827915638.JavaMail.zimbra@u-pem.fr>
Message-ID: <0a2ddeb6-e6e3-6dc8-4bb9-78fe3aea65a4@oracle.com>

> > The accessor throws an exception and with the semantics you propose it
> > will happily be wrapped by as many MatchExceptions as possible.

But this class is already so deeply questionable, because accessors should not throw exceptions, that we've lost the game before we started.
Whether the exception comes wrapped or not is like asking "do you want whipped cream on your mud and axle-grease pie" :)

(Also, I don't see where the exception is wrapped multiple times? So I'm not even sure you are clear on what is being proposed.)

> public sealed interface RecChain {
>   default int size() {
>     return switch (this) {
>       case Nil __ -> 0;
>       case Cons(var v, var s, var next) -> 1 + next.size();
>     };
>   }
>
>   record Nil() implements RecChain { }
>
>   record Cons(int value, int size, RecChain next) implements RecChain {
>     @Override
>     public int size() {
>       return size != -1 ? size : RecChain.super.size();
>     }
>   }
>
>   public static void main(String[] args) {
>     RecChain chain = new Nil();
>     for (var i = 0; i < 100; i++) {
>       chain = new Cons(i, -1, chain);
>     }
>     System.out.println(chain.size());
>   }
> }
>
> Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Fri May 6 15:23:12 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 6 May 2022 17:23:12 +0200 (CEST)
Subject: [External] : Re: Record pattern and side effects
In-Reply-To: <0a2ddeb6-e6e3-6dc8-4bb9-78fe3aea65a4@oracle.com>
References: <448085588.13668630.1650188930595.JavaMail.zimbra@u-pem.fr> <2897C3E9-8DD2-49AE-9421-479753BB51A5@oracle.com> <1535408822.14991807.1650533325656.JavaMail.zimbra@u-pem.fr> <399dc2fc-0179-4d7c-6568-10032c34d135@oracle.com> <2092001046.15312550.1650584924891.JavaMail.zimbra@u-pem.fr> <1667419546.21849041.1651827915638.JavaMail.zimbra@u-pem.fr> <0a2ddeb6-e6e3-6dc8-4bb9-78fe3aea65a4@oracle.com>
Message-ID: <514644374.22115515.1651850592442.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, May 6, 2022 3:11:43 PM
> Subject: Re: [External] : Re: Record pattern and side effects

>> The accessor throws an exception and with the semantics you propose it will
>> happily be wrapped by as many MatchExceptions as possible.

> But this class is already so deeply questionable, because accessors should not
> throw exceptions, that we've lost the game before we started. Whether the
> exception comes wrapped or not is like asking "do you want whipped cream on
> your mud and axle-grease pie" :)

People make mistakes, and others have to debug them.

> (Also, I don't see where the exception is wrapped multiple times? So I'm not
> even sure you are clear on what is being proposed.)

"I don't see" -> that's exactly my point! Nobody will see it.

Rémi

>> public sealed interface RecChain {
>>   default int size() {
>>     return switch (this) {
>>       case Nil __ -> 0;
>>       case Cons(var v, var s, var next) -> 1 + next.size();
>>     };
>>   }
>>
>>   record Nil() implements RecChain { }
>>
>>   record Cons(int value, int size, RecChain next) implements RecChain {
>>     @Override
>>     public int size() {
>>       return size != -1 ? size : RecChain.super.size();
>>     }
>>   }
>>
>>   public static void main(String[] args) {
>>     RecChain chain = new Nil();
>>     for (var i = 0; i < 100; i++) {
>>       chain = new Cons(i, -1, chain);
>>     }
>>     System.out.println(chain.size());
>>   }
>> }
>>
>> Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Wed May 18 19:18:01 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 18 May 2022 15:18:01 -0400
Subject: Pattern matching: next steps after JEP 405
Message-ID: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com>

JEP 405 has been proposed to target for 19. But it has some loose ends that I'd like to refine before it eventually becomes a final feature. These include:

- *Inference for record patterns.* Right now, we make you spell out the type parameters for a generic record pattern, such as:

    case Box<String>(String s):

We also don't allow parameterizations that are not consistent with the match target.
While this is clear, it is verbose (and gets worse when there is nesting), and also, because of the consistency restriction, the parameterization is entirely inferrable. So we would like to allow the parameters to be inferred. (Further, inference is relevant to GADTs, because we may be able to gather constraints on the pattern from the nested patterns, if they provide some type parameter specialization.)

- *Refined type checking for GADTs.* Given a hierarchy like:

    sealed interface Node<T> { }
    record IntNode(int i) implements Node<Integer> { }
    record FloatNode(float f) implements Node<Float> { }

we currently cannot type-check programs like:

    <T> Node<T> twice(Node<T> n) {
        return switch (n) {
            case IntNode(int x) -> new IntNode(x*2);
            case FloatNode(float x) -> new FloatNode(x*2);
        };
    }

because, while the match constrains the instantiation of T in each arm of the switch, the compiler doesn't know this yet.

- *Varargs patterns.* Records can be varargs, but we have an asymmetry where we can use varargs in constructors but not in deconstruction. This should be rectified. The semantics of this is straightforward; given

    record Foo(int x, int y, int... zs) { }

just as

    new Foo(x, y, z1, z2)

is shorthand for

    new Foo(x, y, new int[] { z1, z2 })

we also can express

    case Foo(var x, var y, var z1, var z2)

as being shorthand for

    case Foo(var x, var y, int[] { var z1, var z2 })

This means that varargs drags in array patterns.

- *Array patterns.* The semantics of array patterns are a pretty simple extension to that of record patterns; the rules for exhaustiveness, applicability, nesting, etc. are a relatively light transformation of the corresponding rules for record patterns. The only new wrinkle is the ability to say "exactly N elements" or "N or more elements".

- *Primitive patterns.
* This is driven by another existing asymmetry; we can use conversions (boxing, widening) when constructing records, but not when deconstructing them.? There is a straightforward (and in hindsight, obvious) interpretation for primitive patterns that is derived entirely from existing cast conversion rules. Obviously there is more we will want to do, but this set feels like what we have to do to "complete" what we started in JEP 405.? I'll post detailed summaries, in separate threads, of each over the next few days. -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed May 18 19:57:46 2022 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 18 May 2022 21:57:46 +0200 (CEST) Subject: Pattern matching: next steps after JEP 405 In-Reply-To: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> Message-ID: <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "amber-spec-experts" > Sent: Wednesday, May 18, 2022 9:18:01 PM > Subject: Pattern matching: next steps after JEP 405 > JEP 405 has been proposed to target for 19. But, it has some loose ends that I'd > like to refine before it eventually becomes a final feature. These include: > - Inference for record patterns. Right now, we make you spell out the type > parameters for a generic record pattern, such as: > case Box(String s): > We also don't allow parameterizations that are not consistent with the match > target. While this is clear, it is verbose (and gets worse when there is > nesting), and also, because of the consistency restriction, the > parameterization is entirely inferrable. So we would like to allow the > parameters to be inferred. (Further, inference is relevant to GADTs, because we > may be able to gather constraints on the pattern from the nested patterns, if > they provide some type parameter specialization.) 
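The before/after for record-pattern inference can be sketched with a hypothetical generic Box record (the explicit parameterization is what JEP 405 requires; the inferred form is the proposal, and the type argument is recoverable from the static type of the target; requires a JDK with record patterns):

```java
record Box<T>(T value) { }

class InferenceDemo {
    // JEP 405: the parameterization must be spelled out.
    static String explicit(Box<String> box) {
        if (box instanceof Box<String>(String s)) { return s; }
        return "<null>";
    }

    // Proposed: Box(String s) with Box<String> inferred from the target type.
    static String inferred(Box<String> box) {
        if (box instanceof Box(String s)) { return s; }
        return "<null>";
    }

    public static void main(String[] args) {
        System.out.println(explicit(new Box<>("hi")));  // hi
        System.out.println(inferred(new Box<>("hi")));  // hi
    }
}
```

Both forms destructure the same way; the only difference is whether the reader (and writer) must repeat a parameterization the compiler could derive.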
Inference is also something we will need for pattern assignment Box<>(var s) = box; > - Refined type checking for GADTs. Given a hierarchy like: > sealed interface Node { } > record IntNode(int i) implements Node { } > record FloatNode(float f) implements Node { } > we currently cannot type-check programs like: > Node twice(Node n) { > return switch (n) { > case IntNode(int x) -> new IntNode(x*2); > case FloatNode(float x) -> new FloatNode(x*2); > } > } > because, while the match constraints the instantiation of T in each arm of the > switch, the compiler doesn't know this yet. > - Varargs patterns. Records can be varargs, but we have an asymmetry where we > can use varargs in constructors but not in deconstruction. This should be > rectified. The semantics of this is straightforward; given > record Foo(int x, int y, int... zs) { } > just as > new Foo(x, y, z1, z2) > is shorthand for > new Foo(x, y, new int[] { z1, z2 }) > we also can express > case Foo(var x, var y, var z1, var z2) > as being shorthand for > case Foo(var x, var y, int[] { var z1, var z2 }) > This means that varargs drags in array patterns. > - Array patterns. The semantics of array patterns are a pretty simple extension > to that of record patterns; the rules for exhaustiveness, applicability, > nesting, etc, are a relatively light transformation of the corresponding rules > for record patterns. The only new wrinkle is the ability to say "exactly N > elements" or "N or more elements". I wonder if we should not at least work a little on patterns on collections, just to be sure that the syntax and semantics of the patterns on collections and patterns on arrays are not too dissimilar. > - Primitive patterns. This is driven by another existing asymmetry; we can use > conversions (boxing, widening) when constructing records, but not when > deconstructing them. 
> There is a straightforward (and in hindsight, obvious)
> interpretation for primitive patterns that is derived entirely from existing
> cast conversion rules.

When calling a method or constructing an object, you can have several overloads, so you need conversions; those conversions are known at compile time. (Note that Java has overloads mostly because there is a rift between primitives and objects; if there were a common supertype, I'm not sure overloads would have carried their own weight.)

When doing pattern matching, there are no overloads; otherwise you would have to decide which conversion to apply at runtime. Given that one mechanism decides which overload to call at compile time and the other decides which pattern to match at runtime, they are not symmetric. You can restrict the set of conversions so that there is a bijection between the classes checked at runtime and the types the user writes, but that is a new mechanism, not an existing symmetry.

In the case of a pattern assignment, this is different, because there is a single pattern that has to cover the whole type; conversions are not an issue there, because there are no runtime checks.

> Obviously there is more we will want to do, but this set feels like what we have
> to do to "complete" what we started in JEP 405. I'll post detailed summaries,
> in separate threads, of each over the next few days.

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Wed May 18 21:08:41 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 18 May 2022 17:08:41 -0400
Subject: Pattern matching: next steps after JEP 405
In-Reply-To: <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr>
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr>
Message-ID: <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com>

> Inference is also something we will need for pattern assignment
>
>   Box<>(var s) = box;

Yes, it would work the same in all pattern contexts -- instanceof as well. Every pattern context has a match target whose static type is known.

> - *Array patterns.* The semantics of array patterns are a pretty
> simple extension to that of record patterns; the rules for
> exhaustiveness, applicability, nesting, etc. are a relatively
> light transformation of the corresponding rules for record
> patterns. The only new wrinkle is the ability to say "exactly N
> elements" or "N or more elements".
>
> I wonder if we should not at least work a little on patterns on
> collections, just to be sure that the syntax and semantics of the
> patterns on collections and patterns on arrays are not too dissimilar.

This is a big new area; collection patterns would have to be co-designed with collection literals, and both almost surely depend on some sort of type class mechanism if we want to avoid the feature being lame. I don't think it's realistic to wait this long, nor am I aiming at doing anything that looks like a generic array query mechanism. Arrays have a first element, a second element, etc.; the nesting semantics are very straightforward, and the only real question that needs additional support seems to be "match exactly N" or "match first N".

> - *Primitive patterns.* This is driven by another existing
> asymmetry; we can use conversions (boxing, widening) when
> constructing records, but not when deconstructing them. There is
> a straightforward (and in hindsight, obvious) interpretation for
> primitive patterns that is derived entirely from existing cast
> conversion rules.
>
> When calling a method / constructing an object, you can have several
> overloads so you need conversions; those conversions are known at
> compile time.
> (Note that Java has overloads mostly because there is a rift between
> primitives and objects; if there were a common supertype, I'm not sure
> overloads would have carried their own weight.)
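The asymmetry being discussed, in a standalone, hedged sketch: on the construction side the usual invocation conversions (here int -> double widening) apply, while under JEP 405 the deconstruction side accepts only the declared component type (a `case Measurement(int v)` is what the primitive-patterns item would add; the record name is invented):

```java
record Measurement(double value) { }

class AsymmetryDemo {
    static double build() {
        // Construction: the int argument 42 widens to double automatically.
        return new Measurement(42).value();
    }

    static double destructure(Object o) {
        // Deconstruction: only the exact component type double matches today.
        if (o instanceof Measurement(double v)) { return v; }
        return Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println(build());                           // 42.0
        System.out.println(destructure(new Measurement(42)));  // 42.0
    }
}
```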
> When doing pattern matching, there are no overloads; otherwise you would
> have to decide which conversion to apply at runtime.

There are no overloads YET, because the only deconstruction patterns are in records, and records have only one state description. But that is a short-lived state of affairs. When we do declared deconstruction patterns, we will need overload selection, and it will surely need to dualize the existing three-phase overload selection for constructors (e.g., strict, loose, and varargs invocation contexts).
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Thu May 19 06:52:39 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 May 2022 08:52:39 +0200 (CEST)
Subject: Collections patterns
In-Reply-To: <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com>
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com>
Message-ID: <170769614.9486186.1652943159193.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Wednesday, May 18, 2022 11:08:41 PM
> Subject: Re: Pattern matching: next steps after JEP 405

>> Inference is also something we will need for pattern assignment
>> Box<>(var s) = box;

> Yes, it would work the same in all pattern contexts -- instanceof as well. Every
> pattern context has a match target whose static type is known.

>>> - Array patterns. The semantics of array patterns are a pretty simple extension
>>> to that of record patterns; the rules for exhaustiveness, applicability,
>>> nesting, etc. are a relatively light transformation of the corresponding rules
>>> for record patterns. The only new wrinkle is the ability to say "exactly N
>>> elements" or "N or more elements".

>> I wonder if we should not at least work a little on patterns on collections,
>> just to be sure that the syntax and semantics of the patterns on collections
>> and patterns on arrays are not too dissimilar.

> This is a big new area; collection patterns would have to be co-designed with
> collection literals, and both almost surely depend on some sort of type class
> mechanism if we want to avoid the feature being lame. I don't think it's
> realistic to wait this long, nor am I aiming at doing anything that looks like
> a generic array query mechanism. Arrays have a first element, a second element,
> etc.; the nesting semantics are very straightforward, and the only real question
> that needs additional support seems to be "match exactly N" or "match first N".

We may want to extract sub-parts of the array / collections, for example, and I would prefer to have the same semantics and a similar syntax.

And I don't think we need type classes here, because we can use the target-typing mechanism instead, as we did with lambdas: instead of

  Type x -> x + 1

or whatever the exact syntax, we have

  (Type) x -> x + 1

We may want [1, 2, 3] to have a type, so it's maybe a little more complicated than just using target typing, but I don't think type classes are needed for collection literals. For operator overloading on numeric value classes, that's another story.

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Thu May 19 07:21:13 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 19 May 2022 09:21:13 +0200 (CEST)
Subject: Pattern matching: next steps after JEP 405
In-Reply-To: <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com>
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com>
Message-ID: <129730903.9506417.1652944873906.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Wednesday, May 18, 2022 11:08:41 PM
> Subject: Re: Pattern matching: next steps after JEP 405

>>> - Primitive patterns. This is driven by another existing asymmetry; we can use
>>> conversions (boxing, widening) when constructing records, but not when
>>> deconstructing them. There is a straightforward (and in hindsight, obvious)
>>> interpretation for primitive patterns that is derived entirely from existing
>>> cast conversion rules.

>> When calling a method / constructing an object, you can have several overloads
>> so you need conversions; those conversions are known at compile time.
>> (Note that Java has overloads mostly because there is a rift between primitives
>> and objects; if there were a common supertype, I'm not sure overloads would have
>> carried their own weight.)

>> When doing pattern matching, there are no overloads; otherwise you would have to
>> decide which conversion to apply at runtime.

> There are no overloads YET, because the only deconstruction patterns are in
> records, and records have only one state description. But that is a short-lived
> state of affairs. When we do declared deconstruction patterns, we will need
> overload selection, and it will surely need to dualize the existing three-phase
> overload selection for constructors (e.g., strict, loose, and varargs
> invocation contexts).
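For reference, the three phases being dualized are the strict / loose / variable-arity phases of applicability from JLS 15.12.2; a minimal sketch of how they already pick among overloads (the methods here are invented for illustration):

```java
class Phases {
    static String which(int i)     { return "(int)"; }
    static String which(Integer i) { return "(Integer)"; }
    static String which(int... is) { return "(int...)"; }

    public static void main(String[] args) {
        // Phase 1 (strict) matches without boxing: an int literal -> (int).
        System.out.println(which(1));                   // (int)
        // (Integer) is also found in phase 1, via the identity conversion;
        // unboxing Integer -> int would only be consulted in phase 2 (loose).
        System.out.println(which(Integer.valueOf(1)));  // (Integer)
        // Phase 3 (variable arity) is tried only when the first two fail.
        System.out.println(which(1, 2));                // (int...)
    }
}
```

A deconstruction-pattern analogue would have to run this kind of selection "in reverse", picking which declared deconstructor a pattern like Box(X x) refers to.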
When you have a type pattern X in the middle of a pattern *and* you have conversions, then there is an ambiguity: does instanceof Box(X x) mean

  Box(var v) && v instanceof X x

or

  Box(var v) && X x = (X) v;

For the deconstruction pattern, if we have overloads, there is the same kind of ambiguity: does Box(X x) mean calling Box.deconstructor(X), or calling Box.deconstructor(Y) with y instanceof X?

So if we want to support overloads, and we may have to, at least to support adding components while keeping the class backward compatible, we need to introduce new rules to resolve that ambiguity.

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Thu May 19 13:03:55 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 19 May 2022 09:03:55 -0400
Subject: Collections patterns
In-Reply-To: <170769614.9486186.1652943159193.JavaMail.zimbra@u-pem.fr>
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com> <170769614.9486186.1652943159193.JavaMail.zimbra@u-pem.fr>
Message-ID: <695ee8a6-c05e-acaf-0bfe-c22a87335383@oracle.com>

> We may want to extract sub-parts of the array / collections, for
> example, and I would prefer to have the same semantics and a similar
> syntax.

This is pretty vague, so I'll have to guess about what you might mean.

Maybe you mean: "I want to match a list if it contains a subsequence that matches this sequence of patterns", something like:

    [ ... p1, p2, ... ]

There is surely room to have APIs that query lists like this, but I think this is way out of scope for a pattern matching feature in the language. Pattern matching is about _destructuring_. (Destructuring can be conditional.) An array is a linear sequence of elements; it can be destructured by a linear sequence of patterns.

Maybe you mean: "I want to decompose a list into the head element and a tail list". In Haskell, we iterate a list by recursion:

    len :: [a] -> Int
    len [] = 0
    len (x:xs) = 1 + len xs

But again, this is *mere destructuring*, because the cons operator (:) is the linguistic primitive for aggregation, and [ ... ] lists are just sugar over cons. So matching to `x:xs` is again destructuring. We could try to apply this to Java, but it gets very clunky (inefficient, no tail recursion, yada yada) because our lists are *built differently*. Further, arrays are another step removed from lists even.

Or maybe you mean something else; if so, please share!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Fri May 20 05:27:28 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 20 May 2022 07:27:28 +0200 (CEST)
Subject: Pattern matching: next steps after JEP 405
In-Reply-To:
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com> <129730903.9506417.1652944873906.JavaMail.zimbra@u-pem.fr>
Message-ID: <1100767285.10417932.1653024448461.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, May 19, 2022 3:05:07 PM
> Subject: Re: Pattern matching: next steps after JEP 405

>> When you have a type pattern X in the middle of a pattern *and* you have
>> conversions, then there is an ambiguity:
>> does instanceof Box(X x) mean
>>   Box(var v) && v instanceof X x
>> or
>>   Box(var v) && X x = (X) v;

> This is not an ambiguity in the language, it is confusion on the part of the
> reader :)

> In any case, I'm not following your argument here.

If you have both a type pattern and allow conversions, you have

  Box(X) is equivalent to Box(var v) && v instanceof Y y && X x = (X) y

How do you find Y?

And yes, the bar is not only that Y has to be unique for the compiler; it also has to be obvious to the human reader.

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From forax at univ-mlv.fr Fri May 20 06:09:16 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Fri, 20 May 2022 08:09:16 +0200 (CEST) Subject: Collections patterns In-Reply-To: <695ee8a6-c05e-acaf-0bfe-c22a87335383@oracle.com> References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com> <170769614.9486186.1652943159193.JavaMail.zimbra@u-pem.fr> <695ee8a6-c05e-acaf-0bfe-c22a87335383@oracle.com> Message-ID: <610702102.10427194.1653026956333.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" > Cc: "amber-spec-experts" > Sent: Thursday, May 19, 2022 3:03:55 PM > Subject: Re: Collections patterns >> We may want to extract sub-parts of the array / collections by example, and i >> would prefer to have the same semantics and a similar syntax. > This is pretty vague, so I'll have to guess about what you might mean. > Maybe you mean: "I want to match a list if it contains the a subsequence that > matches this sequence of patterns", something like: > [ ... p1, p2, ... ] > There is surely room to have APIs that query lists like this, but I think this > is way out of scope for a pattern matching feature the language. Pattern > matching is about _destructuring_. (Destructuring can be conditional.) An array > is a linear sequence of elments; it can be destructured by a linear sequence of > patterns. > Maybe you mean: "I want to decompose a list into the head element and a tail > list". > In Haskell, we iterate a list by recursion: > len :: [a] -> Int > len [] = 0 > len x:xs = 1 + len xs > But again, this is *mere destructuring*, because the cons operator (:) is the > linguistic primitive for aggregation, and [ ... ] lists are just sugar over > cons. So matching to `x:xs` is again destructuring. 
> We could try to apply this to Java, but it gets very clunky (inefficient,
> no tail recursion, yada yada) because our lists are *built differently*.
> Further, arrays are another step removed from lists even.
>
> Or maybe you mean something else; if so, please share!

The current proposal is more about matching and extracting the first arguments; matching/extracting the last arguments, or the rest, is also useful IMO. For example, if I want to parse command-line arguments composed of options and a filename, I may want to write something like

  case [String... options, String filename] -> ...
  case [String...] -> help()

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Fri May 20 12:46:05 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 20 May 2022 14:46:05 +0200 (CEST)
Subject: Pattern matching: next steps after JEP 405
In-Reply-To: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com>
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com>
Message-ID: <209135538.10840473.1653050765862.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Wednesday, May 18, 2022 9:18:01 PM
> Subject: Pattern matching: next steps after JEP 405

> JEP 405 has been proposed to target for 19. But it has some loose ends that I'd
> like to refine before it eventually becomes a final feature. These include:

[...]

> - Varargs patterns. Records can be varargs, but we have an asymmetry where we
> can use varargs in constructors but not in deconstruction. This should be
> rectified. The semantics of this is straightforward; given
>
>   record Foo(int x, int y, int... zs) { }
>
> just as
>
>   new Foo(x, y, z1, z2)
>
> is shorthand for
>
>   new Foo(x, y, new int[] { z1, z2 })
>
> we also can express
>
>   case Foo(var x, var y, var z1, var z2)
>
> as being shorthand for
>
>   case Foo(var x, var y, int[] { var z1, var z2 })
>
> This means that varargs drags in array patterns.
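Before mirroring varargs in patterns, it is worth recalling what the compiler generates for an array-typed (including varargs) record component; a standalone, hedged demo of the default semantics (the record name is invented):

```java
import java.util.Arrays;

record Chain(int x, int... zs) { }

class VarargsRecordDemo {
    public static void main(String[] args) {
        Chain a = new Chain(1, 2, 3);
        Chain b = new Chain(1, 2, 3);

        // The generated equals() compares the array component by reference,
        // so two structurally identical values are "not equal":
        System.out.println(a.equals(b));                    // false
        System.out.println(Arrays.equals(a.zs(), b.zs()));  // true

        // The generated accessor returns the stored array itself, so the
        // record's state can be mutated from outside:
        a.zs()[0] = 99;
        System.out.println(a.zs()[0]);                      // 99
    }
}
```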
Thinking a bit about the varargs pattern, introducing them is not a good idea, because a varargs record is not a safe construct by default:
- varargs are arrays, and arrays are mutable in Java, so varargs records are not immutable by default;
- equals() and hashCode() do not work as-is either.

The record Foo should be written

  record Foo(int x, int y, int... zs) {
    Foo {
      zs = zs.clone();
    }
    public int[] zs() {
      return zs.clone();
    }
    public boolean equals(Object o) {
      return o instanceof Foo foo && x == foo.x && y == foo.y && Arrays.equals(zs, foo.zs);
    }
    public int hashCode() {
      return Objects.hash(x, y, Arrays.hashCode(zs));
    }
  }

Given that most people will forget that the default behavior of a varargs record is not the right one, introducing a specific pattern for varargs records to mirror them is like giving a gentle nudge to somebody on a cliff.

Note that this does not mean we will not support varargs records, because one can still write either

  case Foo(int x, int y, int[] zs)

or

  case Foo(int x, int y, int[] { int... zs })  // or a similar syntax that mixes a record pattern and an array pattern

but just that there will be no streamlined syntax for a varargs record.

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Fri May 20 13:04:27 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 20 May 2022 09:04:27 -0400
Subject: Collections patterns
In-Reply-To: <610702102.10427194.1653026956333.JavaMail.zimbra@u-pem.fr>
References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com> <170769614.9486186.1652943159193.JavaMail.zimbra@u-pem.fr> <695ee8a6-c05e-acaf-0bfe-c22a87335383@oracle.com> <610702102.10427194.1653026956333.JavaMail.zimbra@u-pem.fr>
Message-ID: <5112fe88-bb9c-b922-38e0-379c83f4550e@oracle.com>

> > Or maybe you mean something else; if so, please share!
> > > The current proposal is more about matching and extracting the first > arguments It is really about matching *the whole array*. Pattern matching is about destructuring. Arrays are part of the language. They have structure. We give people a way to make arrays by specifying all the elements; pattern matching deconstructs the array by matching all the elements. > than matching/extracting the last arguments or the rest is also > useful IMO. > For example, if I want to parse command-line arguments composed of > options and a filename, I may want to write something like > > case [String... options, String filename] -> ... > case [String...] -> help() Just because something is useful doesn't mean it has an equal claim to be a language feature. (Arrays are a language feature; inserting into the middle of a sequence is useful, but arrays don't offer that -- we have lists for that. That doesn't make arrays broken. Lists are a different feature.) From brian.goetz at oracle.com Fri May 20 13:05:26 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 20 May 2022 09:05:26 -0400 Subject: Pattern matching: next steps after JEP 405 In-Reply-To: <1100767285.10417932.1653024448461.JavaMail.zimbra@u-pem.fr> References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <388880150.9336891.1652903866654.JavaMail.zimbra@u-pem.fr> <825a17d3-b844-c2dc-feac-5cca4d5d50ff@oracle.com> <129730903.9506417.1652944873906.JavaMail.zimbra@u-pem.fr> <1100767285.10417932.1653024448461.JavaMail.zimbra@u-pem.fr> Message-ID: I'm sorry, I have no idea what argument you are trying to make. Start from the beginning.
On 5/20/2022 1:27 AM, forax at univ-mlv.fr wrote: > > > ------------------------------------------------------------------------ > > *From: *"Brian Goetz" > *To: *"Remi Forax" > *Cc: *"amber-spec-experts" > *Sent: *Thursday, May 19, 2022 3:05:07 PM > *Subject: *Re: Pattern matching: next steps after JEP 405 > > > > When you have a type pattern X in the middle of a pattern *and* > you have conversions, then there is an ambiguity: > does instanceof Box(X x) mean > Box(var v) && v instanceof X x > or > Box(var v) && X x = (X) v; > > > This is not an ambiguity in the language, it is confusion on the > part of the reader :) > > In any case, I'm not following your argument here. > > > If you have both a type pattern and allow conversions, you have > Box(X) is equivalent to Box(var v) && v instanceof Y y && X x = (X) y > > How do you find Y? > > And yes, the bar is not only that Y has to be unique for the compiler, > it also has to be obvious for the human reader too. > > Rémi > From brian.goetz at oracle.com Fri May 20 13:10:10 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 20 May 2022 09:10:10 -0400 Subject: Pattern matching: next steps after JEP 405 In-Reply-To: <209135538.10840473.1653050765862.JavaMail.zimbra@u-pem.fr> References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> <209135538.10840473.1653050765862.JavaMail.zimbra@u-pem.fr> Message-ID: <33fbb9f3-ee40-8b8b-e3c1-3f81a052589b@oracle.com> You are right that varargs records are dancing on the edge of a cliff. But (a) we have varargs records, and (b) array/varargs patterns are not only for records. If you're arguing that they are not essential *right now* and can be deferred, that's a reasonable argument, but you'd have to actually make that argument. But it seems you are arguing that array and varargs patterns are *fundamentally incoherent.*
This argument seems way overblown, and as you've seen, overblown arguments are usually counterproductive. On 5/20/2022 8:46 AM, Remi Forax wrote: > > > ------------------------------------------------------------------------ > > *From: *"Brian Goetz" > *To: *"amber-spec-experts" > *Sent: *Wednesday, May 18, 2022 9:18:01 PM > *Subject: *Pattern matching: next steps after JEP 405 > > JEP 405 has been proposed to target for 19. But, it has some > loose ends that I'd like to refine before it eventually becomes a > final feature. These include: > > [...] > > > > > - *Varargs patterns.* Records can be varargs, but we have an > asymmetry where we can use varargs in constructors but not in > deconstruction. This should be rectified. The semantics of this > is straightforward; given > > record Foo(int x, int y, int... zs) { } > > just as > > new Foo(x, y, z1, z2) > > is shorthand for > > new Foo(x, y, new int[] { z1, z2 }) > > we also can express > > case Foo(var x, var y, var z1, var z2) > > as being shorthand for > > case Foo(var x, var y, int[] { var z1, var z2 }) > > This means that varargs drags in array patterns. > > > > Thinking a bit about the varargs pattern, introducing them is not a > good idea, because a varargs record is not a safe construct by default: > - varargs are arrays, and arrays are mutable in Java, so varargs > records are not immutable by default > - equals() and hashCode() do not work as is either. > > The record Foo should be written > > record Foo(int x, int y, int... zs) { > Foo { > zs = zs.clone(); > } > > public int[] zs() { > return zs.clone(); > } > > public boolean equals(Object o) { > return o instanceof Foo foo && x == foo.x && y == foo.y && > Arrays.equals(zs, foo.zs); > } > > public int hashCode() { > return Objects.hash(x, y, Arrays.hashCode(zs)); > } >
} > > Given that most people will forget that the default behavior of a > varargs record is not the right one, introducing a specific pattern > for varargs records to mirror them is like giving a gentle nudge to > somebody on a cliff. > > Note that it does not mean that we will not support varargs records, > because > one can still write either > case Foo(int x, int y, int[] zs) > > or > case Foo(int x, int y, int[] { int... zs }) // or a similar syntax > that mixes a record pattern and an array pattern > > but just that there will be no streamlined syntax for a varargs record. > > Rémi > From forax at univ-mlv.fr Sat May 21 11:55:52 2022 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 21 May 2022 13:55:52 +0200 (CEST) Subject: Guard variable and being effectively final Message-ID: <928450346.11322832.1653134152729.JavaMail.zimbra@u-pem.fr> Not sure if it's an implementation bug (bad error message from the compiler) or a spec bug, hence this message to both amber-dev and amber-spec-experts.
If I try to compile this code with Java 19 (which currently still uses && instead of when for a guard) interface reverse_polish_notation { static Map<String, IntBinaryOperator> OPS = Map.of("+", (a, b) -> a + b, "*", (a, b) -> a * b); static int eval(List<String> expr) { var stack = new ArrayDeque<Integer>(); for(var token: expr) { final IntBinaryOperator op; stack.push(switch (token) { case String __ && (op = OPS.get(token)) != null -> { var value1 = stack.pop(); var value2 = stack.pop(); yield op.applyAsInt(value1, value2); } default -> Integer.parseInt(token); }); } return stack.pop(); } static void main(String[] args) { var expr = List.of("1", "2", "+", "3", "*", "4"); System.out.println(eval(expr)); } } I get the following error java --enable-preview --source 19 reverse_polish_notation.java reverse_polish_notation.java:17: error: local variables referenced from a guard must be final or effectively final case String __ && (op = OPS.get(token)) != null -> { ^ Note: reverse_polish_notation.java uses preview features of Java SE 19. Note: Recompile with -Xlint:preview for details. 1 error error: compilation failed Obviously the error message is funny: op is declared final, so it is effectively final. In case of a lambda, final IntBinaryOperator op; Supplier<IntBinaryOperator> supplier = () -> op = null; supplier.get() can be called several times, so "op = null" does not compile. But in the example above, "op" cannot be assigned more than once, so maybe it should compile.
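For comparison, a version of the evaluator in which every variable read by the condition is effectively final, written as a plain if/else instead of a guarded pattern; this sketch compiles on Java 17 without preview features (the stray "4" from the original expression is dropped so every operand is consumed):

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Map;
import java.util.function.IntBinaryOperator;

public class ReversePolishNotation {
    static final Map<String, IntBinaryOperator> OPS =
        Map.of("+", (a, b) -> a + b, "*", (a, b) -> a * b);

    static int eval(List<String> expr) {
        var stack = new ArrayDeque<Integer>();
        for (var token : expr) {
            // op is assigned exactly once, so it is effectively final and
            // could legally be read from a `when` guard as well.
            final IntBinaryOperator op = OPS.get(token);
            if (op != null) {
                var value1 = stack.pop();
                var value2 = stack.pop();
                stack.push(op.applyAsInt(value1, value2));
            } else {
                stack.push(Integer.parseInt(token));
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // ((1 + 2) * 3) = 9
        System.out.println(eval(List.of("1", "2", "+", "3", "*")));
    }
}
```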
regards, Rémi From forax at univ-mlv.fr Sat May 21 18:35:48 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 21 May 2022 20:35:48 +0200 (CEST) Subject: Guard variable and being effectively final In-Reply-To: <24659138-76aa-fedd-77e8-2f1ad778d4fe@gmail.com> References: <928450346.11322832.1653134152729.JavaMail.zimbra@u-pem.fr> <24659138-76aa-fedd-77e8-2f1ad778d4fe@gmail.com> Message-ID: <103092596.11370644.1653158148387.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "cay horstmann" > To: "Remi Forax" , "amber-spec-experts" > Sent: Saturday, May 21, 2022 7:50:44 PM > Subject: Re: Guard variable and being effectively final > Hi Rémy, > > it compiles with build 19-ea+23-1706 if you replace && with when. Sadly, assigning op also compiles :( IntBinaryOperator op = null; I think the new implementation is missing the check that local variables referenced from a "when" guard have to be effectively final > > Also, remove "4" from the list or add an operator :-) yes > > Cheers, > > Cay Rémi > > On 21/05/2022 13:55, Remi Forax wrote: >> interface reverse_polish_notation { >> static Map<String, IntBinaryOperator> OPS = >> Map.of("+", (a, b) -> a + b, "*", (a, b) -> a * b); >> >> static int eval(List<String> expr) { >> var stack = new ArrayDeque<Integer>(); >> for(var token: expr) { >> final IntBinaryOperator op; >> stack.push(switch (token) { >> case String __ && (op = OPS.get(token)) != null -> { >> var value1 = stack.pop(); >> var value2 = stack.pop(); >> yield op.applyAsInt(value1, value2); >> } >> default -> Integer.parseInt(token); >> }); >> } >> return stack.pop(); >> } >> >> static void main(String[] args) { >> var expr = List.of("1", "2", "+", "3", "*", "4"); >> System.out.println(eval(expr)); >> } >> } > > -- > > Cay S.
Horstmann | http://horstmann.com | mailto:cay at horstmann.com From maurizio.cimadamore at oracle.com Mon May 23 17:45:03 2022 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 23 May 2022 18:45:03 +0100 Subject: Guard variable and being effectively final In-Reply-To: <928450346.11322832.1653134152729.JavaMail.zimbra@u-pem.fr> References: <928450346.11322832.1653134152729.JavaMail.zimbra@u-pem.fr> Message-ID: The compiler behavior seems to be in sync with the spec. From [1]: "Any variable that is used but not declared in the guarding expression of a guarded pattern must either be final or effectively final (4.12.4)." And, from [2]: "Any variable that is used but not declared in a when expression must be either final or effectively final (4.12.4)" As for your question of "why doesn't it work", I think it can be decomposed into two questions: * why is a "when" expression restricted to only mention final/effectively-final variables? * even under the constraint of final/effectively final, why isn't "when" allowed to "initialize" a final variable? Both of these add some asymmetries when it comes to refactoring the switch into a chain of if/else. Maurizio [1]: https://docs.oracle.com/javase/specs/jls/se18/preview/specs/patterns-switch-jls.html#jls-6.3.3.1 [2]: http://cr.openjdk.java.net/~gbierman/PatternSwitchPlusRecordPatterns/PatternSwitchPlusRecordPatterns-20220407/specs/patterns-switch-jls.html On 21/05/2022 12:55, Remi Forax wrote: > Not sure if it's an implementation bug (bad error message from the compiler) or a spec bug, > hence this message to both amber-dev and amber-spec-experts.
> > If I try to compile this code with Java 19 (which currently still uses && instead of when for a guard) > > interface reverse_polish_notation { > static Map<String, IntBinaryOperator> OPS = > Map.of("+", (a, b) -> a + b, "*", (a, b) -> a * b); > > static int eval(List<String> expr) { > var stack = new ArrayDeque<Integer>(); > for(var token: expr) { > final IntBinaryOperator op; > stack.push(switch (token) { > case String __ && (op = OPS.get(token)) != null -> { > var value1 = stack.pop(); > var value2 = stack.pop(); > yield op.applyAsInt(value1, value2); > } > default -> Integer.parseInt(token); > }); > } > return stack.pop(); > } > > static void main(String[] args) { > var expr = List.of("1", "2", "+", "3", "*", "4"); > System.out.println(eval(expr)); > } > } > > I get the following error > > java --enable-preview --source 19 reverse_polish_notation.java > reverse_polish_notation.java:17: error: local variables referenced from a guard must be final or effectively final > case String __ && (op = OPS.get(token)) != null -> { > ^ > Note: reverse_polish_notation.java uses preview features of Java SE 19. > Note: Recompile with -Xlint:preview for details. > 1 error > error: compilation failed > > Obviously the error message is funny: op is declared final, so it is effectively final. > > In case of a lambda, > final IntBinaryOperator op; > Supplier<IntBinaryOperator> supplier = () -> op = null; > > supplier.get() can be called several times, so "op = null" does not compile. > > But in the example above, "op" cannot be assigned more than once, so maybe it should compile. > > regards, > Rémi
From brian.goetz at oracle.com Tue May 24 18:56:31 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 24 May 2022 14:56:31 -0400 Subject: Refined type checking for GADTs (was: Pattern matching: next steps after JEP 405) In-Reply-To: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> References: <9ec630a9-573f-d59c-693b-ca5844ad4517@oracle.com> Message-ID: <23fa3ae0-6cee-f0b3-8841-c3bccc3cca9b@oracle.com> > > - *Refined type checking for GADTs.* Given a hierarchy like: > > sealed interface Node<T> { } > record IntNode(int i) implements Node<Integer> { } > record FloatNode(float f) implements Node<Float> { } > > we currently cannot type-check programs like: > > <T> Node<T> twice(Node<T> n) { > return switch (n) { > case IntNode(int x) -> new IntNode(x*2); > case FloatNode(float x) -> new FloatNode(x*2); > } > } > > because, while the match constrains the instantiation of T in each > arm of the switch, the compiler doesn't know this yet. Much of this problem has already been explored by "Generalized Algebraic Data Types and Object Oriented Programming" (Kennedy and Russo, 2005); there's a subset of the formalism from that paper which I think can apply somewhat cleanly to Java. The essence of the approach is that in certain scopes (which coincide exactly with the scope of pattern binding variables), additional _type variable equality constraints_ are injected. For a switch like that above, we inject a T=Integer constraint into the first arm, and a T=Float into the second arm, and do our type checking with these additional constraints. (The paper uses equational constraints only (T=Integer), but we may want additional upper bounds as well (T <: Comparable)). The way it works in this example is: we gather the constraint Node<Integer> = Node<T> from the switch (by walking up the hierarchy and doing substitution), and unifying, which gives us the new equational constraint T=Integer.
We then type-check the RHS using the additional constraints. The type checking adds some new rules to reflect equational constraints, FJ-style:

   \Gamma |- T=U   \Gamma |- C<T> OK
   --------------------------------- abstraction
       \Gamma |- C<T> = C<U>

   \Gamma |- C<T> = C<U>
   --------------------- reduction
       \Gamma |- T=U

   \Gamma |- X OK
   -------------- reflexivity
   \Gamma |- X=X

   \Gamma |- U=T
   ------------- symmetry
   \Gamma |- T=U

   \Gamma |- T=U  \Gamma |- U=V
   ---------------------------- transitivity
   \Gamma |- T=V

    \Gamma |- T=U
   ---------------- subtyping
   \Gamma |- T <: U

The key is that this only affects type checking; it doesn't rewrite any types. Since in the first arm we are trying to assign an IntNode to a Node<T>, and IntNode <: Node<Integer>, by symmetry + subtyping we get IntNode <: Node<T>, and yay, it type-checks. The main moving parts of this sub-feature are: - Defining scopes for additional constraints/bounds. This can piggyback on the existing language of the form "if v is introduced when P is true, then v is definitely matched at X"; we can trivially extend this to say "a constraint is definitely matched at X". This is almost purely mechanical. - Defining additional type-checking rules to support scope-specific constraints, along the lines above, in 4.10 (Subtyping). - In the description of type and record patterns (14.30.x), appeal to inference to gather equational constraints, and which patterns introduce an equational constraint. This is obviously only a sketch; more details to follow. From forax at univ-mlv.fr Mon May 30 12:33:51 2022 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 30 May 2022 14:33:51 +0200 (CEST) Subject: It's the data, stupid !
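Until this refined checking exists, a method like twice only compiles with unchecked casts; a sketch of the status-quo workaround (class name hypothetical, using instanceof instead of switch patterns so it compiles on Java 17):

```java
public class GadtSketch {
    sealed interface Node<T> permits IntNode, FloatNode { }
    record IntNode(int i) implements Node<Integer> { }
    record FloatNode(float f) implements Node<Float> { }

    // Without the T=Integer / T=Float constraints being injected per arm,
    // the compiler cannot see that IntNode <: Node<T> inside the first
    // branch, so each branch needs an unchecked cast back to Node<T>.
    @SuppressWarnings("unchecked")
    static <T> Node<T> twice(Node<T> n) {
        if (n instanceof IntNode in) {
            return (Node<T>) new IntNode(in.i() * 2);
        } else if (n instanceof FloatNode fn) {
            return (Node<T>) new FloatNode(fn.f() * 2);
        }
        throw new AssertionError("sealed hierarchy is exhaustive");
    }

    public static void main(String[] args) {
        System.out.println(((IntNode) twice(new IntNode(21))).i()); // 42
    }
}
```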
Message-ID: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> Hi all, I think the recent discussions about pattern matching are too much about implementation details, and I fear we are losing the big picture, so let me explain why I (we?) want to add pattern matching to Java. Java's roots are OOP; encapsulation has served us well for the last 25+ years. It emphasizes API above everything else; data are not important, because they are just a possible implementation of the API. But OOP / encapsulation is really important for libraries, less so for applications. For an application, data are more important, or at least as important as the API. The goal of pattern matching is to make data the center of the universe. Data are more important than code: if the data change because the business requirements change, the code should be updated accordingly. Pattern matching allows us to write code depending on the data, and it will fail to compile if the data change, indicating every place in the code that needs to be updated to take care of the new data shape. The data can change in different ways: 1) a new kind of a type (a subtype of an interface) can be introduced; we have added sealed types and made switches on types exhaustive, so if a developer adds a new subtype of an interface, the compiler will refuse to compile all patterns that are not exhaustive anymore, indicating that the code must be updated. 2) a data type can have a new field/component; we have introduced record patterns that match the exact shape of a record, so if a developer adds a new component, the compiler will refuse to compile a record pattern with a wrong shape, indicating that the code must be updated. So experts, do you agree that this is what we want, or did I miss something? Rémi PS: the title is a nod to James Carville. From brian.goetz at oracle.com Mon May 30 14:31:19 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 30 May 2022 10:31:19 -0400 Subject: It's the data, stupid !
In-Reply-To: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> Message-ID: <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> Indeed, this is a big part of the motivation. And it's not just pattern matching; it's the combination of records (representing data as data), sealed classes (the other half of algebraic data types, enabling richer data-as-data descriptions), and pattern matching (ad-hoc polymorphism, great for data). The catchphrase we've been using in the last few years has been to make it easier to do "data-oriented programming" in Java. This isn't a departure from OO; it's a recognition that not everything is best modeled as a stateful entity that communicates by sending and receiving messages. Rolling back to the origin of this feature set (several years ago at this point), we observed that programs are getting smaller; monoliths give way to smaller services. And the smaller the unit of code is, the closer it is to the boundary, at which it is exchanging messy untyped (or differently typed) data with the messy real world -- JSON, database result sets, etc. (Gone are the days where it was Java objects all the way down, including across the wire boundary.) We needed a simpler way to represent strongly typed ad-hoc data in Java, one that is easy to use, and which can be easily mapped to and from the external messy formats at the boundary. OO is great at defining and defending boundaries (it totally shines at platform libraries), but when it comes to modeling ordinary data, it costs just as much but offers us less in return. And pattern matching is key to being able to easily act on that data, take it apart, put it back together differently, etc. The future installments of pattern matching are aimed at simplifying the code at that boundary; using pattern matching to mediate conversion from untyped, schema-free envelopes like JSON to illegal-states-are-unrepresentable data.
So, yes: records + sealed classes + pattern matching = embracing data as data. I've got a piece I've been writing on this very topic; I'll send a link when it's up. And yes, we've been talking a lot about the details, because that's what this group is for. But I don't think we have lost sight of the big picture. Is there something you think we've missed? On 5/30/2022 8:33 AM, Remi Forax wrote: > Hi all, > I think the recent discussions about pattern matching are too much about implementation details, and I fear we are losing the big picture, so let me explain why I (we?) want to add pattern matching to Java. > > Java's roots are OOP; encapsulation has served us well for the last 25+ years. It emphasizes API above everything else; data are not important, because they are just a possible implementation of the API. > > But OOP / encapsulation is really important for libraries, less so for applications. For an application, data are more important, or at least as important as the API. > > The goal of pattern matching is to make data the center of the universe. Data are more important than code: if the data change because the business requirements change, the code should be updated accordingly. Pattern matching allows us to write code depending on the data, and it will fail to compile if the data change, indicating every place in the code that needs to be updated to take care of the new data shape. > > The data can change in different ways: > 1) a new kind of a type (a subtype of an interface) can be introduced; we have added sealed types and made switches on types exhaustive, so if a developer adds a new subtype of an interface, the compiler will refuse to compile all patterns that are not exhaustive anymore, indicating that the code must be updated.
> 2) a data type can have a new field/component; we have introduced record patterns that match the exact shape of a record, so if a developer adds a new component, the compiler will refuse to compile a record pattern with a wrong shape, indicating that the code must be updated. > > So experts, do you agree that this is what we want, or did I miss something? > > Rémi > > PS: the title is a nod to James Carville. From forax at univ-mlv.fr Mon May 30 15:18:21 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 30 May 2022 17:18:21 +0200 (CEST) Subject: It's the data, stupid ! In-Reply-To: <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> Message-ID: <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" , "amber-spec-experts" > Sent: Monday, May 30, 2022 4:31:19 PM > Subject: Re: It's the data, stupid ! > Indeed, this is a big part of the motivation. And it's not just pattern matching; it's the combination of records (representing data as data), sealed classes (the other half of algebraic data types, enabling richer data-as-data descriptions), and pattern matching (ad-hoc polymorphism, great for data). The catchphrase we've been using in the last few years has been to make it easier to do "data-oriented programming" in Java. This isn't a departure from OO; it's a recognition that not everything is best modeled as a stateful entity that communicates by sending and receiving messages. > Rolling back to the origin of this feature set (several years ago at this point), we observed that programs are getting smaller; monoliths give way to smaller services.
> And the smaller the unit of code is, the closer it is to the boundary, at which it is exchanging messy untyped (or differently typed) data with the messy real world -- JSON, database result sets, etc. (Gone are the days where it was Java objects all the way down, including across the wire boundary.) We needed a simpler way to represent strongly typed ad-hoc data in Java, one that is easy to use, and which can be easily mapped to and from the external messy formats at the boundary. OO is great at defining and defending boundaries (it totally shines at platform libraries), but when it comes to modeling ordinary data, it costs just as much but offers us less in return. And pattern matching is key to being able to easily act on that data, take it apart, put it back together differently, etc. The future installments of pattern matching are aimed at simplifying the code at that boundary; using pattern matching to mediate conversion from untyped, schema-free envelopes like JSON to illegal-states-are-unrepresentable data. > So, yes: records + sealed classes + pattern matching = embracing data as data. I've got a piece I've been writing on this very topic; I'll send a link when it's up. > And yes, we've been talking a lot about the details, because that's what this group is for. But I don't think we have lost sight of the big picture. > Is there something you think we've missed? First, I overlooked the importance of the record pattern as a check of the shape of the data. Then, if we say that data are more important than code and that the aim of pattern matching is to detect changes to the shapes of the data, it changes the usefulness of some features/patterns: it makes the varargs pattern kind of harmful, because it matches data of several shapes, so the code may still compile if the shape of the record/data type changes.
Given that we have already established that - the varargs pattern can be emulated by an array pattern, and it's even better because an array pattern checks that the shape is an array, and - varargs are dangerous because records with varargs are hard to get right, the result is that I'm not sure the varargs pattern is a target worth pursuing. Deconstructors of a class also become a kind of war ground between OOP and pattern matching: OOP says that the API is important, while pattern matching says it's ok to change the data, changing the API, because the compiler will point to where the code should be updated. We still want encapsulation because it's a class, but we want to detect if its shape changes, so having a class with several shapes becomes not as useful as I first envisioned. And we have some pattern methods, which work more like extension methods, because you can define them outside of the class you do the matching on (they are not instance pattern methods); how are those supposed to work when a shape is updated? Rémi > On 5/30/2022 8:33 AM, Remi Forax wrote: >> Hi all, >> I think the recent discussions about pattern matching are too much about >> implementation details, and I fear we are losing the big picture, so let me >> explain why I (we?) want to add pattern matching to Java. >> Java's roots are OOP; encapsulation has served us well for the last 25+ years. It >> emphasizes API above everything else; data are not important, because they are just a >> possible implementation of the API. >> But OOP / encapsulation is really important for libraries, less so for >> applications. For an application, data are more important, or at least as >> important as the API. >> The goal of pattern matching is to make data the center of the universe. Data are >> more important than code: if the data change because the business requirements >> change, the code should be updated accordingly.
>> Pattern matching allows us to >> write code depending on the data, and it will fail to compile if the data >> change, indicating every place in the code that needs to be updated >> to take care of the new data shape. From amaembo at gmail.com Mon May 30 16:36:12 2022 From: amaembo at gmail.com (Tagir Valeev) Date: Mon, 30 May 2022 18:36:12 +0200 Subject: Named record pattern Message-ID: Hello! I'm reading the spec draft near "14.30.1 Kinds of Patterns" [1] and I wonder how the variable declared by a named record pattern differs from the variable declared by a type test pattern. Assuming record Point(int x, int y) {} One can use a pattern like obj instanceof Point p or use a pattern like obj instanceof Point(int x, int y) p It looks like the variable 'p' should be quite similar in both cases. However: - In the first case we are free to declare 'p' as final or not. In the second case it's unclear from the spec whether the variable 'p' is final or not and whether the user has control over this.
It looks like the "obj instanceof final Point(int x, int y) p" syntax is not allowed, which brings some asymmetry. - In the first case I can use LOCAL_VARIABLE annotations like 'obj instanceof @Cartesian Point p'. It looks like I cannot do the same in the second case, which is another asymmetry. So if I want to upgrade a type test pattern on a record type to a record pattern to match components, I need to give up some features like finality and annotations. Is this intended? With best regards, Tagir Valeev [1] https://cr.openjdk.java.net/~gbierman/PatternSwitchPlusRecordPatterns/PatternSwitchPlusRecordPatterns-20220407/specs/patterns-switch-jls.html#jls-14.30.1 From brian.goetz at oracle.com Mon May 30 16:40:22 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Mon, 30 May 2022 12:40:22 -0400 Subject: It's the data, stupid ! In-Reply-To: <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr> References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr> Message-ID: > First, I overlooked the importance of the record pattern as a check > of the shape of the data. > > Then, if we say that data are more important than code and that the aim > of pattern matching is to detect changes to the shapes of the data, > it changes the usefulness of some features/patterns. OK, now that I see what argument you are really winding up for, I think I'm going to disagree. Yes, data-as-data is a huge benefit; it is something we were not so good at before, and something that has become more important over time. That has motivated us to *prioritize* the data-centric features of pattern matching over more general ones, because they deliver direct value the soonest. But if you're trying to leverage that into a "this is the only benefit" (or even the main benefit), I think that's taking it too far.
The truly big picture here is that pattern matching is the dual of aggregation. Java gives us lots of ways to put things together (constructors, factories, builders, maybe some day collection literals), but the reverse of each of these is ad-hoc, different, and usually harder to use / more error-prone. The big picture here is that pattern matching *completes the object model*, by providing the missing reverse link. (In mathematical terms, a constructor and deconstructor (or factory and static pattern, or builder and "unbuilder", or collection literal and collection pattern) form an *embedding-projection pair*.) Much of this was laid out in Pattern Matching in the Java Object Model: https://github.com/openjdk/amber-docs/blob/master/site/design-notes/patterns/pattern-match-object-model.md > it makes the varargs pattern kind of harmful, because it matches > data of several shapes, so the code may still compile if the shape of > the record/data type changes. I think you've stretched your argument to the breaking point. No one said that each pattern can only match *one* structure of data. But for each way of putting together the data, there should be a corresponding way to take it apart. > - the varargs pattern can be emulated by an array pattern, and it's > even better because an array pattern checks that the shape is an array, and Well, we don't have array patterns yet either, but just as varargs invocation is shorthand for a manually created array, varargs patterns are shorthand for an explicit array pattern. > The result is that I'm not sure the varargs pattern is a target worth > pursuing. I think it's fine to be "not sure", and it's doubly fine to say "I'm not sure the cost-benefit is so compelling, maybe there are other features that we should do first" (like array patterns.) But if you're trying to make the argument that varargs patterns are actually harmful, you've got a much bigger uphill battle.
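The constructor/deconstructor duality can be seen concretely with a record pattern (record patterns were later finalized by JEP 440, so this sketch requires Java 21; the Point record is illustrative):

```java
public class Duality {
    record Point(int x, int y) { }

    public static void main(String[] args) {
        // Aggregation: the canonical constructor puts the state together...
        Object o = new Point(3, 4);

        // ...and the record pattern is its exact reverse: the same shape,
        // read right-to-left, taking the state back apart.
        if (o instanceof Point(int x, int y)) {
            System.out.println(x + y); // 7
        }
    }
}
```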
And don't forget, records are just the first vehicle here; this is coming for arbitrary classes too. And being able to construct things via varargs construction, but not take them apart by varargs patterns, seems a gratuitous inconsistency. (Again, maybe we decide that better type inference is worth doing first, but the lack of varargs will still be a wart.)

> Deconstructors of a class also become a kind of war ground between
> the OOP and the pattern matching; OOP says that API is important and
> pattern matching says it's OK for a change to the data to change the
> API, because the compiler will point to where the code should be updated.
> We still want encapsulation because it's a class, but we want to detect
> if its shape changes, so having a class with several shapes becomes not
> as useful as I first envisioned.

No, these are not in conflict at all. The biggest tool OOP offers us is encapsulation; it gives us a way to decide how much state we want to expose, in what form, etc., fully decoupled from the representation. (Records don't have this option for decoupling representation from API, which is what makes it so easy to deliver these features first for records.) Most classes still choose to give clients _some_ way to access most of the state we pass into the constructor and other API points; it's just that this part of the API is usually gratuitously different (e.g., accessors, wrapping with Optional) from the part where state goes in. Which means that we *do* expose the state to readers, just in a gratuitously different way than we do to writers. What pattern matching does is give us exactly the same control we have today over what to expose, and in what form, but lets us do it in a way that is structurally related to how we put state into objects. It does so by combining multiple return, conditionality, and flow analysis in an integrated way, so we don't have to reinvent these in an ad-hoc way in every class.
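The "gratuitously different" reader/writer asymmetry can be illustrated with a small class (a hypothetical example, not from the thread): state goes in as a plain constructor argument, but comes back out wrapped in Optional.

```java
import java.util.Optional;

// Writer-side API takes a raw (possibly null) String; reader-side API
// returns Optional<String> -- the "in" and "out" shapes don't line up.
final class Nickname {
    private final String name;  // null means "no nickname"

    Nickname(String name) {
        this.name = name;
    }

    Optional<String> name() {
        return Optional.ofNullable(name);
    }
}

public class InOutAsymmetry {
    public static void main(String[] args) {
        System.out.println(new Nickname("Bob").name().orElse("none"));
        System.out.println(new Nickname(null).name().orElse("none"));
    }
}
```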
So while we agree that records + sealed classes + pattern matching enable a nice form of data-oriented programming, and that was indeed a big goal, I think the model you're trying to extrapolate about what the "point" of pattern matching is may be missing its mark. There's a bigger picture here.

From brian.goetz at oracle.com Mon May 30 16:53:55 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 30 May 2022 12:53:55 -0400
Subject: Named record pattern
In-Reply-To:
References:
Message-ID: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com>

Thank you so much for catching this.

On 5/30/2022 12:36 PM, Tagir Valeev wrote:
> Hello!
>
> I'm reading the spec draft near "14.30.1 Kinds of Patterns" [1] and I
> wonder how the variable declared as named record pattern differs from
> the variable declared in the type test pattern
>
> Assuming record Point(int x, int y) {}
>
> One can use a pattern like
> obj instanceof Point p
> or use a pattern like
> obj instanceof Point(int x, int y) p
> It looks like the variable 'p' should be quite similar in both cases. However:
> - In the first case we are free to declare 'p' as final or not.

I must admit to being very surprised that you can do this at all! I don't recall discussion on this, and had you asked me, I would have said that `final` has no place in type-test patterns. Yet, I just tried it with jshell and it does work as you say. I am surprised.

Can someone recall any discussion over this? (Surely now someone will point me to where I agreed to this.)

Worse, it even works in switch labels! This is definitely not what I had in mind. Did this happen because we reused the local variable production for type patterns? Since switch patterns are about to exit preview, I think we need to fix this ASAP, before switch exits preview.
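What surprised Brian here can be reproduced directly; at the time of this thread (and still with a current javac), `final` is accepted on type-test pattern variables both in instanceof and in switch labels, because type patterns reuse the local-variable-declaration grammar production. A small sketch (not from the thread):

```java
// `final` on pattern variables compiles in both positions.
public class FinalPatterns {
    static int len(Object o) {
        if (o instanceof final String s) {  // final binding in instanceof
            return s.length();
        }
        return switch (o) {
            case final Integer i -> i;      // final binding in a case label
            default -> -1;
        };
    }

    public static void main(String[] args) {
        System.out.println(len("abc"));
        System.out.println(len(42));
    }
}
```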
> It looks like the
> "obj instanceof final Point(int x, int y) p" syntax is not allowed
> which brings some asymmetry

This very question is why I would not have encouraged us to try to do this for type test patterns at all!

> - In the first case I can use LOCAL_VARIABLE annotations like 'obj
> instanceof @Cartesian Point p'. It looks like I cannot do the same in
> the second case, which is another asymmetry.

We definitely intended to not allow declaration annotations. As to type-use annotations; well, that's a different problem, and I'm not quite sure what to do. For sure, we are not going to amend the XxxTypeAnnotations attributes to reify the position of these annotations. If we allow them and make them available to annotation processors only, that's another kind of asymmetry, that someone else will complain about.

> So if I want to upgrade the type test pattern on a record type to a
> record pattern to match components, I need to give up some features
> like finality and annotations. Is this intended?

It was not really intended that you got those features in the first place.

From forax at univ-mlv.fr Mon May 30 17:43:27 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 30 May 2022 19:43:27 +0200 (CEST)
Subject: It's the data, stupid !
In-Reply-To:
References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr>
Message-ID: <1810446329.15098102.1653932607503.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, May 30, 2022 6:40:22 PM
> Subject: Re: It's the data, stupid !

>> First, I've overlooked the importance of the record pattern as a check of the
>> shape of the data.
>> Then if we say that data are more important than code and that the aim of the
>> pattern matching is to detect changes of the shapes of the data,
>> it changes the usefulness of some features/patterns.

> OK, now that I see what argument you are really winding up for, I think I'm
> going to disagree. Yes, data-as-data is a huge benefit; it is something we were
> not so good at before, and something that has become more important over time.
> That has motivated us to *prioritize* the data-centric features of pattern
> matching over more general ones, because they deliver direct value the soonest.
> But if you're trying to leverage that into a "this is the only benefit" (or
> even the main benefit), I think that's taking it too far.

> The truly big picture here is that pattern matching is the dual of aggregation.
> Java gives us lots of ways to put things together (constructors, factories,
> builders, maybe some day collection literals), but the reverse of each of these
> is ad-hoc, different, and usually harder-to-use / more error-prone. The big
> picture here is that pattern matching *completes the object model*, by
> providing the missing reverse link. (In mathematical terms, a constructor and
> deconstructor (or factory and static pattern, or builder and "unbuilder", or
> collection literal and collection pattern) form an *embedding-projection
> pair*.)

> Much of this was laid out in Pattern Matching in the Java Object Model:
> https://github.com/openjdk/amber-docs/blob/master/site/design-notes/patterns/pattern-match-object-model.md

The problem is that what you propose is a leaky abstraction, because pattern matching works on classes and not on types, so it's not a reverse link.

Let's say we have a class with two shapes/deconstructions:

class A {
  deconstructor (B) { ... }
  deconstructor (C) { ...
  }
}

With the pattern A(D d), D is a runtime class, not a type, so you have no idea if it means

  instanceof A a && B b = a.deconstructor() && b instanceof D

or

  instanceof A a && C c = a.deconstructor() && c instanceof D

Unlike with a method call (constructor call), where the types of the arguments are available, with pattern matching you do not have the types of the arguments, only runtime classes to match. So while a deconstructor can be seen as the inverse of a constructor, a type pattern does not give you the type information that would allow you to do the method selection on the deconstructors at compile time.

>> it makes the varargs pattern a kind of harmful, because it matches data of
>> several shapes, so the code may still compile if the shape of the
>> record/data-type changes.

> I think you've stretched your argument to the breaking point. No one said that
> each pattern can only match *one* structure of data.

I think it's a very good question. The reason we may want to match several structures of data is backward compatibility, but it does not make a lot of sense to offer backward compatibility on data if at the same time the data are more important than the code, i.e. if the data drive the code. As I said, it's a question where OOP and DOD (data-oriented design?) disagree with one another. And this is a problem specific to the deconstructor; for named pattern methods there is no such problem, obviously a user can add as many pattern methods as he/she wants.

> But for each way of putting together the data, there should be a corresponding
> way to take it apart.

If the pattern matching were a real inverse link, yes, maybe.

>> - the varargs pattern can be emulated by an array pattern and it's even better
>> because an array pattern checks that the shape is an array and

> Well, we don't have array patterns yet either, but just as varargs invocation is
> shorthand for a manually created array, varargs patterns are shorthand for an
> explicit array pattern.
The problem is that a varargs pattern can also recognize a record with no varargs component, or a class with a deconstructor with no varargs.

>> The result is that I'm not sure the varargs pattern is a target worth pursuing.

> I think it's fine to be "not sure", and it's doubly fine to say "I'm not sure the
> cost-benefit is so compelling, maybe there are other features that we should do
> first" (like array patterns). But if you're trying to make the argument that
> varargs patterns are actually harmful, you've got a much bigger uphill battle.

> And don't forget, records are just the first vehicle here; this is coming for
> arbitrary classes too. And being able to construct things via varargs
> construction, but not take them apart by varargs patterns, seems a gratuitous
> inconsistency. (Again, maybe we decide that better type inference is worth
> doing first, but the lack of varargs will still be a wart.)

You think in terms of inverse functions: we have varargs constructors, so we should have varargs patterns. But a pattern is not an inverse function. We have the freedom to provide a simpler model.

>> Deconstructors of a class also become a kind of war ground between the OOP
>> and the pattern matching; OOP says that API is important and pattern matching
>> says it's OK for a change to the data to change the API because the compiler
>> will point to where the code should be updated.
>> We still want encapsulation because it's a class, but we want to detect if its
>> shape changes, so having a class with several shapes becomes not as useful as I
>> first envisioned.

> No, these are not in conflict at all. The biggest tool OOP offers us is
> encapsulation; it gives us a way to decide how much state we want to expose, in
> what form, etc., fully decoupled from the representation. (Records don't have
> this option for decoupling representation from API, which is what makes it so
> easy to deliver these features first for records.)
> Most classes still choose to
> give clients _some_ way to access most of the state we pass into the
> constructor and other API points; it's just that this part of the API is usually
> gratuitously different (e.g., accessors, wrapping with Optional) from the part
> where state goes in. Which means that we *do* expose the state to readers, just
> in a gratuitously different way than we do to writers. What pattern matching
> does is give us exactly the same control we have today over what to expose,
> and in what form, but lets us do it in a way that is structurally related to
> how we put state into objects. It does so by combining multiple return,
> conditionality, and flow analysis in an integrated way, so we don't have to
> reinvent these in an ad-hoc way in every class.

You are choosing the OOP view here; I'm not sure if I disagree or not, I don't know, but I'm just saying that this is a choice, and it is a choice that is far from obvious to me.

> So while we agree that records + sealed classes + pattern matching enable a nice
> form of data-oriented programming, and that was indeed a big goal, I think the
> model you're trying to extrapolate about what the "point" of pattern matching
> is may be missing its mark. There's a bigger picture here.

Maybe, or maybe not; I don't want to invent something nobody will use in the future except on slides.

Rémi

From brian.goetz at oracle.com Mon May 30 18:40:33 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 30 May 2022 14:40:33 -0400
Subject: It's the data, stupid !
In-Reply-To: <1810446329.15098102.1653932607503.JavaMail.zimbra@u-pem.fr>
References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr> <1810446329.15098102.1653932607503.JavaMail.zimbra@u-pem.fr>
Message-ID: <6fde1524-b10c-3566-ea88-c3cf6ff9629e@oracle.com>

> The problem is that what you propose is a leaky abstraction, because
> pattern matching works on classes and not on types, so it's not a
> reverse link.

("Leaky abstraction" is sort of an inflammatory term.)

What I think you're getting at is that some objects will have state that you can "put in", but can't "take out". The mathematical relationship here is "embedding projection pair" (this is similar to an adjoint functor pair in some ways.)

A good example of this relationship is int and Integer. Every int corresponds to an Integer, and *almost* every Integer (except null) corresponds to an int. Imagine there are two functions e : int -> Integer and p : Integer -> int, where p(null) = bottom. Composing e-then-p is an identity; composing p-then-e can lose some information, but we can characterize the information loss. Records form this same relationship with their cartesian product space (assuming you follow the refined contract outlined in Record::equals). When you have this relationship, you get some very nice properties, such as "withers" and serialization basically for free. The relationship between a ctor and the corresponding dtor also has this structure. So yes, going backwards is "lossy", but in a controlled way. This turns out to be good enough for a lot of things.

> Let say we have a class with two shapes/deconstruction
>
> class A {
>   deconstructor (B) { ... }
>   deconstructor (C) { ... }
> }
>
> With the pattern A(D d), D is a runtime class not a type, you have no
> idea if it means
>   instanceof A a && B b = a.deconstructor() && b instanceof D
> or
>   instanceof A a && C c = a.deconstructor() && c instanceof D

You can have types in the dtor bindings, just as you can have types in the constructor arguments. Both may use the class type variables, as they are instance "members".

> Unlike with a method call (constructor call) where the type of the
> arguments are available, with the pattern matching, you do not have
> the types of the arguments, only runtime classes to match.

This is where the rule of "downcast compatible" comes in. We see this show up in GADT-like examples (the rules of which are next on our parade.) For example, if we have:

    sealed interface Node<T> { }
    record IntNode(int x) implements Node<Integer> { }

then when we switch on a Node<T>:

    switch (aNode) {
        case IntNode n: ...
    }

we may conclude, in the consequent of the appropriate case, that T=int. (Read the Kennedy and Russo paper for details.)

Similarly, if we have:

    List<Integer> list = ...

then when matching, we may conclude that if it's an ArrayList, it's an ArrayList<Integer>:

    switch (list) {
        case ArrayList<Integer> a: ...
    }

but could not say `case ArrayList<String>`, because that is inconsistent with the target type.

So, while we can't necessarily distinguish between Foo<Integer> and Foo<String> because of erasure, that doesn't mean we can't use types; it's just that we can't conclude things that the generic type system won't let us.

> As I said, it's a question where OOP and DOD (data-oriented design?)
> disagree with one another.

I don't think they disagree at all. They are both useful tools for modeling things; one is good for modeling entities and processes, the other for modeling data, using a common vocabulary. Our systems may have both!

> And this is a problem specific to the deconstructor, for named pattern
> methods, there is no such problem, obviously a user can add as many
> pattern methods as he/she wants.

Because there's no name, we are limited to overloads that are distinct up to erasure; constructors have the same restriction.
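The List/ArrayList point can be made runnable with today's pattern switch (a sketch, assuming a Java 21 compiler): the pattern may use a generic type, but only one consistent with the target's static type.

```java
import java.util.ArrayList;
import java.util.List;

public class GenericPatterns {
    static String kind(List<Integer> list) {
        return switch (list) {
            // OK: ArrayList<Integer> is cast-compatible with List<Integer>.
            // `case ArrayList<String> a` would be a compile-time error here,
            // since it is inconsistent with the target type.
            case ArrayList<Integer> a -> "ArrayList of size " + a.size();
            default -> "some other List";
        };
    }

    public static void main(String[] args) {
        System.out.println(kind(new ArrayList<>(List.of(1, 2, 3))));
        System.out.println(kind(List.of(1, 2, 3)));
    }
}
```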
> But for each way of putting together the data, there should be a
> corresponding way to take it apart.
>
> if the pattern matching were a real inverse link, yes, maybe.

I think the word "real" is doing too much lifting in that sentence.

> The problem is that varargs pattern can also recognize a record with
> no varargs component, or a class with a deconstructor with no varargs.

As can a constructor.

> You think in terms of inverse functions: we have varargs constructors so we
> should have varargs patterns, but a pattern is not an inverse function.

You are interpreting "inverse" too strictly -- and then attempting to use that to prematurely bury the concept.

> We have the freedom to provide a simpler model.

I think the reality is that you want this to be a smaller, less ambitious feature than is being planned here. That's a totally valid opinion! But I think it's just a difference of opinion on how much to invest vs how much we get out.

But by all means, try to outline (in a single mail, please) your vision for a simpler model.

From forax at univ-mlv.fr Mon May 30 19:45:26 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 30 May 2022 21:45:26 +0200 (CEST)
Subject: It's the data, stupid !
In-Reply-To: <6fde1524-b10c-3566-ea88-c3cf6ff9629e@oracle.com>
References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr> <1810446329.15098102.1653932607503.JavaMail.zimbra@u-pem.fr> <6fde1524-b10c-3566-ea88-c3cf6ff9629e@oracle.com>
Message-ID: <1185196492.15115659.1653939926620.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, May 30, 2022 8:40:33 PM
> Subject: Re: It's the data, stupid !
>> The problem is that what you propose is a leaky abstraction, because pattern
>> matching works on classes and not on types, so it's not a reverse link.

> ("Leaky abstraction" is sort of an inflammatory term.)

> What I think you're getting at is that some objects will have state that you can
> "put in", but can't "take out". The mathematical relationship here is
> "embedding projection pair" (this is similar to an adjoint functor pair in some
> ways.)

> A good example of this relationship is int and Integer. Every int corresponds to
> an Integer, and *almost* every Integer (except null) corresponds to an int.
> Imagine there are two functions e : int -> Integer and p : Integer -> int,
> where p(null) = bottom. Composing e-then-p is an identity; composing p-then-e
> can lose some information, but we can characterize the information loss.
> Records form this same relationship with their cartesian product space
> (assuming you follow the refined contract outlined in Record::equals). When you
> have this relationship, you get some very nice properties, such as "withers"
> and serialization basically for free. The relationship between a ctor and the
> corresponding dtor also has this structure. So yes, going backwards is "lossy",
> but in a controlled way. This turns out to be good enough for a lot of things.

I don't disagree with most of what you are saying; the problem is elsewhere.

>> Let's say we have a class with two shapes/deconstructions:
>> class A {
>>   deconstructor (B) { ... }
>>   deconstructor (C) { ... }
>> }
>> With the pattern A(D d), D is a runtime class not a type, you have no idea if it
>> means
>>   instanceof A a && B b = a.deconstructor() && b instanceof D
>> or
>>   instanceof A a && C c = a.deconstructor() && c instanceof D

> You can have types in the dtor bindings, just as you can have types in the
> constructor arguments. Both may use the class type variables, as they are
> instance "members".
The problem is not at the callee site; as you said, you have deconstructor bindings like you have constructor parameters. The problem is at the call site: when you have a type pattern, a type pattern does not declare a type that can be used at compile time, but a class that is used at runtime (to do the instanceof).

So the problem is not how to declare several deconstructors, it's how to select the right one without type information.

This is a problem specific to switch or instanceof. If you have an assignment context, because the matching is total, you can extract the types from the expression you are matching. For example, with

  A(D d) = a;

here D is a type, not a class, so finding the right deconstructor is easy: you can apply the same algorithm as the overloading selection. But with instanceof,

  a instanceof A(D d)

here D is a class; you cannot find the corresponding type, so you cannot run the overloading selection.

>> As I said, it's a question where OOP and DOD (data-oriented design?) disagree
>> with one another.

> I don't think they disagree at all. They are both useful tools for modeling
> things; one is good for modeling entities and processes, the other for modeling
> data, using a common vocabulary. Our systems may have both!

Yes; the question is, when they disagree, who wins, for all the cases where they disagree?

>> And this is a problem specific to the deconstructor, for named pattern methods,
>> there is no such problem, obviously a user can add as many pattern methods
>> he/she wants.

> Because there's no name, we are limited to overloads that are distinct up to
> erasure; constructors have the same restriction.

Nope; because if we have a type pattern / record pattern, we have no type information. If named pattern methods allow overloading, we will have the same issue: there is not enough type information at the call site.

>>> But for each way of putting together the data, there should be a corresponding
>>> way to take it apart.

>> if the pattern matching were a real inverse link, yes, maybe.

> I think the word "real" is doing too much lifting in that sentence.

>> The problem is that varargs pattern can also recognize a record with no varargs
>> component, or a class with a deconstructor with no varargs.

> As can a constructor.

No: there is no spread operator in Java, unlike, for example, in JavaScript, so you cannot write

  class A {
    A(int i, int j) { ... }
  }
  int[] array = ...
  new A(...array)

> I think the reality is that you want this to be a smaller, less ambitious
> feature than is being planned here. That's a totally valid opinion! But I think
> it's just a difference of opinion on how much to invest vs how much we get out.

> But by all means, try to outline (in a single mail, please) your vision for a
> simpler model.

By simpler model, I mean we do not have to mirror all the method call quirks as patterns, only the ones that make sense from a "data oriented design" POV.

Rémi

From brian.goetz at oracle.com Mon May 30 20:23:12 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 30 May 2022 20:23:12 +0000
Subject: It's the data, stupid !
In-Reply-To: <1185196492.15115659.1653939926620.JavaMail.zimbra@u-pem.fr>
References: <1036716303.14854318.1653914031900.JavaMail.zimbra@u-pem.fr> <206d71da-82d0-79f9-0fd6-ad43ae2f4654@oracle.com> <864944952.15037557.1653923901560.JavaMail.zimbra@u-pem.fr> <1810446329.15098102.1653932607503.JavaMail.zimbra@u-pem.fr> <6fde1524-b10c-3566-ea88-c3cf6ff9629e@oracle.com> <1185196492.15115659.1653939926620.JavaMail.zimbra@u-pem.fr>
Message-ID: <129AB3EB-4B95-4992-ACA7-AD60BB3BDCA6@oracle.com>

> The problem is not at the callee site; as you said, you have deconstructor
> bindings like you have constructor parameters. The problem is at the call site:
> when you have a type pattern, a type pattern does not declare a type that can
> be used at compile time, but a class that is used at runtime (to do the
> instanceof).
> So the problem is not how to declare several deconstructors, it's how to
> select the right one without type information.

Overload selection works largely the same as it does with constructors, just with some of the "arrows reversed". But I think you're extrapolating from deconstructors too much, which are a very constrained kind of declared pattern.

A pattern combines an applicability test (does the target match the pattern), zero or more conditional extractions (operations performed on the target only when it is known to match), and creation of variables to receive the extracted data. For a deconstruction pattern, the applicability test is highly constrained -- it is a class-based instanceof test, and the language knows it. If it's of the right type, the deconstruction must succeed -- it is *total* on that type. But this is a very special case (and a very useful case, which is why we are doing them first.)

Consider a more complex situation, such as querying whether a regex matches a string, and if so, extracting the groups. Here, we can use whatever state we like as our match criteria; we are not constrained to work only with runtime classes. More importantly, the code to determine the match and to extract the parameters is commingled; if we exposed this as a boolean-returning method ("does it match"), we'd have to do all the work over again when we go to extract the groups (it's like containsKey and get in Map.) This is why the current regex implementation returns a stateful matcher object. Class-based matching is just the easy case.

> nope, because if we have a type pattern / record pattern we have no type
> information. If named pattern methods allow overloading, we will have the same
> issue: there is not enough type information at the call site.

I'm not following this argument. Can you try to write out some end-to-end examples where you say what you are trying to accomplish?
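The regex case mentioned above is real API; java.util.regex.Matcher is exactly such a stateful "test, then extract" object, so the matching work is not repeated:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatch {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\d{4})-(\\d{2})");
        Matcher m = p.matcher("2022-05");
        if (m.matches()) {             // the applicability test...
            String year = m.group(1);  // ...and the conditional extraction,
            String month = m.group(2); // without re-running the match
            System.out.println(year + "/" + month);
        }
    }
}
```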
> By simpler model, I mean we do not have to mirror all the method call quirks as
> patterns, only the ones that make sense from a "data oriented design" POV.

OK, but all your arguments are "against" -- you just keep saying "we're doing it wrong, we can make it simpler." You haven't outlined *how* it can be simpler, or what problems you are actually worried about. Which makes them pretty hard to understand, or respond to.

From amaembo at gmail.com Tue May 31 13:42:47 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 31 May 2022 15:42:47 +0200
Subject: Named record pattern
In-Reply-To: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com>
References: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com>
Message-ID:

Hello!

On Mon, May 30, 2022 at 6:54 PM Brian Goetz wrote:
> I must admit to being very surprised that you can do this at all! I
> don't recall discussion on this, and had you asked me, I would have said
> that `final` has no place in type-test patterns. Yet, I just tried it
> with jshell and it does work as you say. I am surprised.
>
> Can someone recall any discussion over this? (Surely now someone will
> point me to where I agreed to this.)
>
> Worse, it even works in switch labels! This is definitely not what I
> had in mind. Did this happen because we reused the local variable
> production for type patterns? Since switch patterns are about to exit
> preview, I think we need to fix this ASAP, before switch exits preview.

Erm... I actually thought that it was your idea to allow the 'final' modifier on patterns. This change was introduced in Java 16 (when patterns for instanceof were finalized). Here's the initial e-mail from you (item 2):
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-August/002433.html
There was also subsequent discussion but eventually nobody was really against and this was finalized.
> > It looks like the
> > "obj instanceof final Point(int x, int y) p" syntax is not allowed
> > which brings some asymmetry
>
> This very question is why I would not have encouraged us to try to do
> this for type test patterns at all!
>
> > - In the first case I can use LOCAL_VARIABLE annotations like 'obj
> > instanceof @Cartesian Point p'. It looks like I cannot do the same in
> > the second case, which is another asymmetry.
>
> We definitely intended to not allow declaration annotations.

But they are allowed for type test patterns, since Java 16.

> As to
> type-use annotations; well, that's a different problem, and I'm not
> quite sure what to do. For sure, we are not going to amend the
> XxxTypeAnnotations attributes to reify the position of these
> annotations. If we allow them and make them available to annotation
> processors only, that's another kind of asymmetry, that someone else
> will complain about.
>
> > So if I want to upgrade the type test pattern on a record type to a
> > record pattern to match components, I need to give up some features
> > like finality and annotations. Is this intended?
>
> It was not really intended that you got those features in the first place.

From brian.goetz at oracle.com Tue May 31 14:49:02 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 31 May 2022 10:49:02 -0400
Subject: Named record pattern
In-Reply-To:
References: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com>
Message-ID: <7fe8ca64-c9af-8075-d029-bbfebfffe46e@oracle.com>

> Erm... I actually thought that it was your idea to allow the 'final'
> modifier on patterns. This change was introduced in Java 16 (when
> patterns for instanceof were finalized). Here's the initial e-mail
> from you (item 2):
> https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-August/002433.html

That mail is exactly the discussion point I was thinking of. But I never said that there should be a way to declare them as final at all!
I said it was a mistake to have made them automatically final, and that patterns should introduce ordinary mutable locals. I wasn't suggesting an option, just that we'd picked the wrong default (and created a new category of complexity in the process.)

But, it's an honest leap from there to "well of course we must have meant you could declare them final." But had this been explicitly raised, I would have not been in favor of this option, for two reasons:

- The conversation we are having now -- it was clear that eventually, some more complex pattern would introduce variables in a way such that there was not an obvious "local variable" declaration, and that we would eventually be having a "for consistency" discussion;

- The value of being able to declare these things final is almost zero; the only reason we are having this conversation at all is "for consistency" with local variables. But if someone said "should we add a feature to let you make pattern variables final", the "meh" would have been deafening.

>>> instanceof @Cartesian Point p'. It looks like I cannot do the same in
>>> the second case, which is another asymmetry.
>> We definitely intended to not allow declaration annotations.
> But they are allowed for type test patterns, since Java 16.

Yeah, we've got a problem.

From brian.goetz at oracle.com Tue May 31 16:12:06 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 31 May 2022 12:12:06 -0400
Subject: Named record pattern
In-Reply-To: <7fe8ca64-c9af-8075-d029-bbfebfffe46e@oracle.com>
References: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com> <7fe8ca64-c9af-8075-d029-bbfebfffe46e@oracle.com>
Message-ID:

Gavin reminded me that we are not finalizing patterns in switch in 19 (hard to keep track, sometimes), so we have a little bit of time to figure out what we want here.

One thing that is potentially confusing is that patterns work indirectly in a number of ways.
For example, if we have a declared deconstruction pattern for Point,
you can't *invoke* it as you can a method; the language runtime invokes
it on your behalf in the right situations. (In this way, a
deconstructor is a little like a static initializer; it is a body of
code that you declare, but you can't invoke it directly -- the runtime
invokes it for you at the right time, and that's fine.) I had always
imagined the relationship with locals being similar; a pattern causes a
local to be injected into certain scopes, but the pattern itself is not
a local variable declaration. Obviously there is more than one way to
interpret this, so we should make a more deliberate decision.

As a confounding example that suggests that pattern variables are not
"just locals", in the past we talked about various forms of "merging":

    if (t instanceof Box(String s) || t instanceof Bag(String s)) { ... }

or

    case Box(String s):
    case Bag(String s):
        common-code;

If pattern variables could be annotated, then the language would be in
the position of deciding what happens with

    case Box(@Foo(1) String s):
    case Bag(@Foo(2) String s):

(This is the "annotation merging" problem, which is why annotations are
not inherited in the first place.)

I don't have an answer here, but I'm going to think about the various
issues and try to capture them in more detail before proposing an
answer.

On 5/31/2022 10:49 AM, Brian Goetz wrote:
>
>> Erm... I actually thought that it was your idea to allow the 'final'
>> modifier on patterns. This change was introduced in Java 16 (when
>> patterns for instanceof were finalized). Here's the initial e-mail
>> from you (item 2):
>> https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-August/002433.html
>
> That mail is exactly the discussion point I was thinking of. But I
> never said that there should be a way to declare them as final at
> all!
> I said it was a mistake to have made them automatically final, and
> that patterns should introduce ordinary mutable locals. I wasn't
> suggesting an option, just that we'd picked the wrong default (and
> created a new category of complexity in the process).
>
> But it's an honest leap from there to "well, of course he must have
> meant you could declare them final." Had this been explicitly raised,
> though, I would not have been in favor of this option, for two reasons:
>
>  - The conversation we are having now -- it was clear that eventually,
> some more complex pattern would introduce variables in a way such that
> there was not an obvious "local variable" declaration, and that we
> would eventually be having a "for consistency" discussion;
>
>  - The value of being able to declare these things final is almost
> zero; the only reason we are having this conversation at all is "for
> consistency" with local variables. But if someone said "should we add
> a feature to let you make pattern variables final", the "meh" would
> have been deafening.
>
>>>> instanceof @Cartesian Point p'. It looks like I cannot do the same in
>>>> the second case, which is another asymmetry.
>>> We definitely intended to not allow declaration annotations.
>> But they are allowed for type test patterns, since Java 16.
>
> Yeah, we've got a problem.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From maurizio.cimadamore at oracle.com Tue May 31 17:41:21 2022
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 31 May 2022 18:41:21 +0100
Subject: Named record pattern
In-Reply-To: 
References: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com>
 <7fe8ca64-c9af-8075-d029-bbfebfffe46e@oracle.com>
Message-ID: 

While merging is an issue (if supported), I think early in the design
we settled on the principle that a binding variable should always only
have /one/ corresponding declaration (in the places where it is
defined).
So, from a language perspective, I don't see an immediate problem, in
the sense that the annotations and finality for "s" would come from the
place where "s" has been declared.

From a compiler perspective, the big question is whether, even in the
absence of merging in the flow scoping rules, we could still perform
merging under the hood, in order to speed up computation. Consider:

    sealed interface Node
    record Pair(Node fst, Node snd) { }
    record A(String s) implements Node {}
    record B(int i) implements Node {}

And then:

    case Pair(A(String s1), A(String s2)) -> ...
    case Pair(A(String s3), B(int i1)) -> ...
    case Pair(B(int i2), A(String s4)) -> ...
    case Pair(B(int i3), B(int i4)) -> ...

(For simplicity, all binding names are different, so that we do not
depend on whether the block after -> ... completes normally or not.)

Of course the first two patterns share the common subpattern
A(String s1) (if you ignore naming differences). So it /might/ be
possible, in principle, for the compiler/runtime to generate an
optimized decision tree in which the String value for the first A(...)
sub-pattern is computed only once, and then shared in the two cases.
(A similar reasoning applies to the last two patterns.)

But if the types in the first and second pattern could be annotated
differently, then we are faced with a number of challenges, as the
compiler would not be able to just "blindly" reuse the cached values
(as those values would be shared, and, therefore, unannotated).
Instead, the compiler would have to assign the cached value into a
/fresh/, correctly annotated local variable that is used only inside
the given case. This is not impossible of course, but adds complexity
to the translation strategy, and/or might affect the number of moves
we might be able to do.

Maurizio

On 31/05/2022 17:12, Brian Goetz wrote:
> As a confounding example that suggests that pattern variables are not
> "just locals", in the past we talked about various forms of "merging":
>
> if (t instanceof Box(String s) || t instanceof Bag(String s)) { ... }
>
> or
>
>     case Box(String s):
>     case Bag(String s):
>         common-code;
>
> If pattern variables could be annotated, then the language would be in
> the position of deciding what happens with
>
>     case Box(@Foo(1) String s):
>     case Bag(@Foo(2) String s):
>
> (This is the "annotation merging" problem, which is why annotations
> are not inherited in the first place.)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From forax at univ-mlv.fr Tue May 31 22:15:20 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 1 Jun 2022 00:15:20 +0200 (CEST)
Subject: Named record pattern
In-Reply-To: 
References: <8bc9cbde-83d6-e400-21f6-f6faf75d6034@oracle.com>
 <7fe8ca64-c9af-8075-d029-bbfebfffe46e@oracle.com>
Message-ID: <885938758.16408066.1654035320588.JavaMail.zimbra@u-pem.fr>

> From: "Maurizio Cimadamore"
> To: "Brian Goetz", "Tagir Valeev"
> Cc: "amber-spec-experts"
> Sent: Tuesday, May 31, 2022 7:41:21 PM
> Subject: Re: Named record pattern

> While merging is an issue (if supported), I think early in the design
> we settled on the principle that a binding variable should always only
> have one corresponding declaration (in the places where it is defined).

> So, from a language perspective, I don't see an immediate problem, in
> the sense that the annotations and finality for "s" would come from
> the place where "s" has been declared.

> From a compiler perspective, the big question is whether, even in the
> absence of merging in the flow scoping rules, we could still perform
> merging under the hood, in order to speed up computation. Consider:

> sealed interface Node
> record Pair(Node fst, Node snd) { }
> record A(String s) implements Node {}
> record B(int i) implements Node {}

> And then:

> case Pair(A(String s1), A(String s2)) -> ...
> case Pair(A(String s3), B(int i1)) -> ...
> case Pair(B(int i2), A(String s4)) -> ...
> case Pair(B(int i3), B(int i4)) -> ...

> (for simplicity, all binding names are different, so that we do not
> depend on whether the block after -> ... completes normally or not)

> Of course the first two patterns share the common subpattern
> A(String s1) (if you ignore naming differences). So it might be
> possible, in principle, for the compiler/runtime to generate an
> optimized decision tree in which the String value for the first A(...)
> sub-pattern is computed only once, and then shared in the two cases.
> (A similar reasoning applies to the last two patterns.)

> But if the types in the first and second pattern could be annotated
> differently, then we are faced with a number of challenges, as the
> compiler would not be able to just "blindly" reuse the cached values
> (as those values would be shared, and, therefore, unannotated).
> Instead, the compiler would have to assign the cached value into a
> fresh, correctly annotated local variable that is used only inside the
> given case. This is not impossible of course, but adds complexity to
> the translation strategy, and/or might affect the number of moves we
> might be able to do.

The same issue arises if the pattern parts are generated at runtime
using an invokedynamic. I don't think it's an issue if we consider
that, inside the pattern tree, we have bindings, and that once "->" is
crossed those bindings are materialized as local variables with
annotations. It is easier to see if there is an invokedynamic and a
carrier object. A switch(value) is translated to

  Object carrier = invokedynamic pattern_match(value) [send the patterns as a constant pool constant];
  int index = invokedynamic extract(carrier)[binding 0 /* index */];
  switch (index) {
    case 0 -> {
      // here we rematerialize the local variables
      String s1 = invokedynamic extract(carrier)[binding 1];
      String s2 = invokedynamic extract(carrier)[binding 2];
      ...
    }
    case 1 -> {
      // we materialize s3 and i1
      String s3 = invokedynamic extract(carrier)[binding 1];
      int i1 = invokedynamic extract(carrier)[binding 2];
      ...
    }
    ...
  }

In a sense, this is similar to a lambda: it's not the parameter of the
method of the lambda proxy which is annotated, but the parameter of the
static method desugared from the lambda body. Here, it's not the
pattern part (the one that matches) which stores the annotations, but
the part that extracts the values from the carrier into local
variables.

I think this idea also works if we do not use invokedynamic; the
matching part can use a tree or a DAG that fuses several bindings into
one. It's not an issue if we generate local variables initialized from
the binding values afterward.

regards

> Maurizio

Rémi

> On 31/05/2022 17:12, Brian Goetz wrote:
>> As a confounding example that suggests that pattern variables are not
>> "just locals", in the past we talked about various forms of "merging":
>>
>> if (t instanceof Box(String s) || t instanceof Bag(String s)) { ... }
>>
>> or
>>
>> case Box(String s):
>> case Bag(String s):
>>     common-code;
>>
>> If pattern variables could be annotated, then the language would be in
>> the position of deciding what happens with
>>
>> case Box(@Foo(1) String s):
>> case Bag(@Foo(2) String s):
>>
>> (This is the "annotation merging" problem, which is why annotations
>> are not inherited in the first place.)

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
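For reference, Maurizio's Node/Pair example above is expressible today with record patterns; the following self-contained sketch (the class name PairMatch and the describe method are illustrative additions, not from the thread) shows the four cases being matched on a Java 21+ compiler:

```java
// Self-contained version of the Node/Pair example from the thread.
// Requires Java 21+ (record patterns). PairMatch and describe are
// illustrative names added for this sketch.
public class PairMatch {
    sealed interface Node permits A, B {}
    record A(String s) implements Node {}
    record B(int i) implements Node {}
    record Pair(Node fst, Node snd) {}

    static String describe(Pair p) {
        // The switch is exhaustive without a default because Node is
        // sealed: the four record patterns cover every A/B combination.
        return switch (p) {
            case Pair(A(String s1), A(String s2)) -> "A+A: " + s1 + "," + s2;
            case Pair(A(String s3), B(int i1))    -> "A+B: " + s3 + "," + i1;
            case Pair(B(int i2), A(String s4))    -> "B+A: " + i2 + "," + s4;
            case Pair(B(int i3), B(int i4))       -> "B+B: " + (i3 + i4);
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Pair(new A("x"), new B(1))));
    }
}
```

Note that all eight binding variables here are distinct, mirroring Maurizio's simplification; whether a compiler may share the extraction of the common A(...) sub-pattern across the first two cases is exactly the optimization question discussed above.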