From attila.kelemen85 at gmail.com  Mon Apr  1 15:07:38 2024
From: attila.kelemen85 at gmail.com (Attila Kelemen)
Date: Mon, 1 Apr 2024 17:07:38 +0200
Subject: Member Patterns -- the bikeshed
In-Reply-To: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID: 

### Sibling problem of comethods

I would like to have this proposal explicitly address the relationship with failure handling, because whatever syntax is chosen for pattern matching, it can prohibit future failure handling syntaxes. Comethods can be interpreted as methods returning `UNION{Success(values), Failure}`, where the `Failure` in pattern matching has only a single possibility (i.e., it is non-descriptive). While this is okay in some cases, it is quite limiting in others. For example, I can imagine a simple `JsonNode.toString` wanting to use the pattern matching feature, but in this case it is worthwhile to return additional information in case of failure (which the parser likely already has available).

Also, while pattern matching is proposed to be used in switches and boolean conditions, there is another use case that comes up frequently: matching failure is catastrophic. In this case, I don't really want the ceremony. And I would prefer to avoid alternative method names if pattern matching is already available. So, for example, the statement (the syntax is not a proposal, just a simple example):

```
jsonStr instanceof JsonNode.toString(var jsonNode);
```

The above example would throw a `PatternMatchingException` if the pattern matching fails.

### Syntax

All of the examples I have seen force duplicating a lot of information (return / parameter types, and the name of the method). Wouldn't it be simpler to just force the comethods to be physically next to their pairs? Even if physical separation were allowed, a convenient syntax for the common case when it is not necessary seems very useful to me.
That is, for example (with implicit failure):

```
invertible String commaSeparatedPair(int x, int y) {
    return x + "," + y;
} inverse {
    String[] parts = that.split(",");
    if (parts.length != 2) fail;
    if (
        parts[0] instanceof Integer.toString(var x) &&
        parts[1] instanceof Integer.toString(var y)
    ) {
        succeed(x, y);
    }
}
```

Though I have chosen the functional body type for this example, I personally don't mind the imperative one either. Also, I'm using `that` as an implicitly declared variable (though I don't care much what a final version would call it). I can't imagine it would be useful to declare its name explicitly, since I have never heard anyone say: "If only I could declare a different name for `this`."

An additional note: while I have used "invertible" and "inverse", this is not meant to deviate from the "pattern" keyword. I just lacked the imagination for what would be used for "invertible" if "pattern" is used for "inverse" (and didn't want to think too much about such a minor detail). Though I have to admit that I find the word "inverse" here a bit misleading, since the method is not a true inverse in general.

Note that in the above example the keyword `inverse` is not strictly necessary, but I think it is useful to make the two parts more visually separate (it also allows Javadoc to be declared separately for the inverse, if that is deemed beneficial).

Also, I personally find the `__bind` idea good. I don't think it makes finding assignments more difficult. In fact, I think it would be good regardless of the functional vs. imperative question, because you could use it to conveniently declare fall-back values. Such as:

```
int value = 42;
if (!(str instanceof Integer.toString(__bind value))) {
    log.warn("Using fallback value.");
}
```

This of course assumes that a failing pattern method does not change `value` (which I guess can have consequences for the JIT compiler in an imperative style).

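For comparison, here is a rough sketch of how the `commaSeparatedPair` pair and the fall-back idiom above might be approximated in current Java, with the inverse written by hand as an ordinary method returning `Optional`. The names `PairDemo`, `IntPair`, `parseInt`, and `inverseOfCommaSeparatedPair` are hypothetical stand-ins chosen for illustration, and none of this is proposed syntax.

```java
import java.util.Optional;

public class PairDemo {
    // Hypothetical carrier for the two bindings (x, y).
    public record IntPair(int x, int y) {}

    public static String commaSeparatedPair(int x, int y) {
        return x + "," + y;
    }

    // Stand-in for the pattern Integer.toString(var x): empty means "no match".
    public static Optional<Integer> parseInt(String s) {
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Hand-written "inverse": an empty Optional plays the role of `fail`.
    public static Optional<IntPair> inverseOfCommaSeparatedPair(String that) {
        String[] parts = that.split(",");
        if (parts.length != 2) {
            return Optional.empty();
        }
        Optional<Integer> x = parseInt(parts[0]);
        Optional<Integer> y = parseInt(parts[1]);
        if (x.isEmpty() || y.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new IntPair(x.get(), y.get()));
    }

    public static void main(String[] args) {
        System.out.println(inverseOfCommaSeparatedPair(commaSeparatedPair(3, 4)));
        System.out.println(inverseOfCommaSeparatedPair("3,four"));
        // Fall-back idiom: the default survives when the match fails.
        int value = parseInt("not a number").orElse(42);
        System.out.println(value);
    }
}
```

The sketch also shows the duplication the message complains about: the method name, the parameter types, and a carrier type for the bindings all have to be spelled out a second time.
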
### Clarification needs

There was a brief mention of "single-abstract-pattern", but I don't see how it is intended to be used in pattern matching. It seems to me that it would require some additional syntax burden. Can you clarify the idea?

From brian.goetz at oracle.com  Mon Apr  1 15:42:39 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 1 Apr 2024 11:42:39 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: 
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID: 

There's a lot here, so let me try to separate.

On 4/1/2024 11:07 AM, Attila Kelemen wrote:
> ### Sibling problem of comethods
>
> I would like to have this proposal explicitly address the relationship with failure handling, because whatever syntax is chosen for pattern matching, it can prohibit future failure handling syntaxes. Comethods can be interpreted as methods returning `UNION{Success(values), Failure}`, where the `Failure` in pattern matching has only a single possibility (i.e., it is non-descriptive). While this is okay in some cases, it is quite limiting in others. For example, I can imagine a simple `JsonNode.toString` wanting to use the pattern matching feature, but in this case it is worthwhile to return additional information in case of failure (which the parser likely already has available).

This is not really related to the "sibling problem", though; it is orthogonal. The one-bit failure channel is indeed a potential limitation; when there is a complex nested pattern, it might be useful to be able to debug "why did this big pattern fail." Not a problem for simple type and record patterns, but as we have more opportunities to compose patterns (imagine: matching an entire JSON document in one big pattern), debugging "why didn't it match" could benefit from more context.

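To make the contrast between a one-bit failure channel and a descriptive one concrete, the `UNION{Success(values), Failure}` reading from the quoted text can be written in current Java as a sealed interface whose failure case carries a reason. This is only a sketch under that assumption; `FailureDemo`, `MatchResult`, `Success`, `Failure`, `parsePort`, and `describe` are hypothetical names, not part of any proposal.

```java
public class FailureDemo {
    // Sealed sum type: a match either succeeds with a value or fails with a reason.
    public sealed interface MatchResult<T> permits Success, Failure {}
    public record Success<T>(T value) implements MatchResult<T> {}
    public record Failure<T>(String reason) implements MatchResult<T> {}

    // Hypothetical conditional conversion that explains why it failed.
    public static MatchResult<Integer> parsePort(String s) {
        int port;
        try {
            port = Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return new Failure<>("not a number: " + s);
        }
        if (port < 0 || port > 65535) {
            return new Failure<>("out of range: " + port);
        }
        return new Success<>(port);
    }

    public static String describe(String s) {
        // Exhaustive switch over the sealed hierarchy (record patterns, Java 21).
        return switch (parsePort(s)) {
            case Success<Integer>(var port) -> "port " + port;
            case Failure<Integer>(var reason) -> "no match (" + reason + ")";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe("8080"));
        System.out.println(describe("99999"));
        System.out.println(describe("http"));
    }
}
```

A pattern that can only report "did not match" corresponds to dropping the `reason` component; the two failing inputs above would then be indistinguishable to the caller.
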
So noted: when making syntactic choices surrounding failure, try not to foreclose on future extension that might carry additional failure-related information.

> Also, while pattern matching is proposed to be used in switches and boolean conditions, there is another use case that comes up frequently: matching failure is catastrophic. In this case, I don't really want the ceremony. And I would prefer to avoid alternative method names if pattern matching is already available. So, for example, the statement (the syntax is not a proposal, just a simple example):
>
> ```
> jsonStr instanceof JsonNode.toString(var jsonNode);
> ```
>
> The above example would throw a `PatternMatchingException` if the pattern matching fails.

What you are looking for is the ability to combine the effects of a match (binding the components) with an assertion that the match must succeed, so that you can assert that any failures are "unexpected" and can be handled implicitly by the runtime. We considered something like this under the guise of a `let` or `match` statement, which is on hold pending a more disciplined analysis of totality requirements for pattern bodies, and may come back after that.

Again, this is an orthogonal problem (and we are working hard to avoid loading the requirements up with "wish list" items, because otherwise we'll never ship anything.)

> ### Syntax
>
> All of the examples I have seen force duplicating a lot of information (return / parameter types, and the name of the method).
>
> Wouldn't it be simpler to just force the comethods to be physically next to their pairs?

The "with inverse" suggestion is one we explored (and something similar was explored in the "JMatch" exploration (Liu and Myers, various publications)). It has a few serious problems:

 - It is not like anything else in the language, nor is it a mechanism that seems likely applicable to other situations, which has the effect of making it appear "nailed on the side".
   We are used to members having stand-alone declarations.
 - Since not every pattern is going to be the dual of an actual method, we still need a full-blown syntax for declaring patterns, in which case this just becomes a "weird shorthand" for a common case. (The same argument was in play in deciding against declaring an exhaustive group of patterns with a single header.)

> There was a brief mention of "single-abstract-pattern", but I don't see how it is intended to be used in pattern matching. It seems to me that it would require some additional syntax burden. Can you clarify the idea?

Lambdas. A SAM interface is one with a single abstract method, and we extract the function shape and use it as a target type for lambdas. Similarly, a SAP interface is one with a single abstract pattern. We can play the same game, except the shape is (match candidate) -> { pattern body that binds declared bindings }.

This allows APIs to be extended with patterns, such as a `match` method in streams:

    objects.stream().match(e -> e instanceof String s)....

where Stream::match takes a SAP interface with a single binding, such as:

    interface Matchy<T, U> {
        pattern(T that) p(U u);
    }
    ...
    // in Stream
    <U> Stream<U> match(Matchy<T, U> m);

From brian.goetz at oracle.com  Mon Apr  1 16:16:45 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 1 Apr 2024 12:16:45 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID: 

I've received a number of private mails, each trying to "rescue" the imperative approach somehow. First, an analogy to set the stage (at the risk of derailing the actual discussion.)

BEGIN ANALOGY

Elsewhere in Amber, we are working on reconstruction expressions for immutable objects:

    point with { x = 2*x; y = 2*y; }

A number of people have seized on this as the backdoor through which they'll get by-name constructor invocation, asking for some variant of

    Point with { x = 3; y = 4; }

as a constructor syntax. While there's nothing *locally* wrong with this as a linguistic form, I think there's something very wrong with the motivation.

The motivation is that some people really, really want by-name invocation for everything: constructors, methods, patterns. And I totally get that; they've seen it in other languages and like it, it makes code more readable and less error-prone, etc etc. Unfortunately, this feature is just much harder to add to Java than it seems to 99.999% of Java developers. (If it were as easy and harmless as everyone thinks it is, we would have done it decades ago. It's not impossible, but it is significant, and we'd rather invest that effort elsewhere.)

When confronted with the perceived hostility towards a feature they want, people start bargaining ("I realize I can't have it in general, but maybe I can have it for records?") The above pseudo-constructor syntax for records is exactly this sort of bargaining -- but it's a terrible bargain.

It's a terrible bargain because there's no chance of extending this to other contexts where people want it just as much (e.g., methods), and, to make it worse, if we ever did decide to do by-name invocation in general, the reconstructor-inspired syntax would be a white elephant next to

    new Point(x: 1, y: 2)

which is more direct and concise. So it's a hard no on the reconstructor-inspired constructor syntax, because of one or both of:

 - Having it for records only will quickly feel like "glass 90% empty", and
 - If we eventually fill the glass, there'll be some sludgy residue in it.

But, the bargaining instinct is strong. First, another digression.

BEGIN NESTED DIGRESSION

I was at an AMA (Ask Me Anything) at a conference about eight years ago.
The audience peppered me with the usual "Will Java ever..." questions. We had a great time. One of the questions was "what about making semicolons optional." I went through some examples where removing a semicolon in existing code could change the meaning in surprising ways. But, the questioner was not to be deterred; he shot back with "OK, since records are a new construct, how about you make semicolons optional inside record declarations?"

Serious props to this dude for dogged persistence towards his goals, but I almost fell off my chair laughing. What a stupid language that would result in! Asking developers to reason about "in which constructs are semicolons optional" would be worse than either extreme.

The moral of this story is that when we want something badly enough, we become blind to the obvious negative side-effects. We convince ourselves that getting it in a limited context, in a stilted way, is better than nothing. But it's usually worse than nothing, because it creates new edge cases, and new constraints for making things globally better. When it comes to language design, "local optimizations" usually aren't.

END NESTED DIGRESSION

The point of the nested digression was to illustrate the danger of trying to locally re-invent a feature that is better handled globally. The wither-inspired constructor syntax is clearly dominated by a more general by-name invocation mechanism, but in our depression that we might never get one, we embrace clearly suboptimal choices.

END ANALOGY

So what was the point of the analogy? AFAICS, the "imperative" syntax has two motivations:

 - To look like the syntactic dual of a constructor, and
 - As a way of providing "by name" matcher completion, rather than only "positional" matcher completion.

As to the first, I think the document does a good job of dismissing it; this is a local maximum, but it falls apart in the bigger context of patterns in the object model.

The point of the analogy is to highlight that the second is a similar form of bargaining over the lack of by-name invocation. But if we were ever to have by-name invocation for methods, there's an obvious syntax that extends to binding production:

    matches Point(x: 1, y: 2)

as well as pattern use sites:

    case Point(x: var first, y: var second): ...

which suggests that we should avoid the temptation to use the imperative style as a "consolation prize" for the lack of general by-name invocation.

On 3/29/2024 5:58 PM, Brian Goetz wrote:
> We now come to the long-awaited bikeshed discussion on what member patterns should look like.
>
> Bikeshed disclaimer for EG:
>   - This is likely to evoke strong opinions, so please take pains to be especially constructive
>   - Long reply-to-reply threads should be avoided even more than usual
>   - Holistic, considered replies preferred
>   - Please change subject line if commenting on a sub-topic or tangential concern
>
> Special reminders for Remi:
>  - Use of words like "should", "must", "shouldn't", "mistake", "wrong", "broken" is strictly forbidden.
>  - If in doubt, ask questions first.
>
> Notes for external observers:
>  - This is a working document for the EG; the discussion may continue for a while before there is an official proposal. Please be patient.
>
>
> # Pattern declaration: the bikeshed
>
> We've largely identified the model for what kinds of patterns we need to express, but there are still several degrees of freedom in the syntax.
>
> As the model has simplified during the design process, the space of syntax choices has been pruned back, which is a good thing. However, there are still quite a few smaller decisions to be made. Not all of the considerations are orthogonal, so while they are presented individually, this is not a "pick one from each column" menu.
>
> Some of these simplifications include:
>
>  - Patterns with "input arguments" have been removed; another way to get to what this gave us may come back in another form.
>  - I have grown increasingly skeptical of the value of the imperative `match` statement. With better totality analysis, I think it can be eliminated.
>
> We can discuss these separately but I would like to sync first on the broad strokes for how patterns are expressed.
>
> ## Object model requirements
>
> As outlined in "Towards Member Patterns", the basic model is that patterns are the dual of other executable members (constructors, static methods, instance methods.) While they are like methods in that they have inputs, outputs, names, and an imperative body, they have additional degrees of freedom that constructors and methods lack:
>
>  - Patterns are, in general, _conditional_ (they can succeed or fail), and only produce bindings (outputs) when they succeed. This conditionality is understood by the language's flow analysis, and is used for computing scoping and definite assignment.
>  - Methods can return at most one value; when a pattern completes successfully, it may bind multiple values.
>  - All patterns have a _match candidate_, which is a distinguished, possibly-implicit parameter. Some patterns also have a receiver, which is also a distinguished, possibly-implicit parameter. In some such cases the receiver and match candidate are aliased, but in others these may refer to different objects.
>
> So a pattern is a named executable member that takes a _match candidate_ as a possibly-implicit parameter, maybe takes a receiver as an implicit parameter, and has zero or more conditional _bindings_. Its body can perform imperative computation, and can terminate either with match failure or success. In the success case, it must provide a value for each binding.
>
> Deconstruction patterns are special in many of the same ways constructors are: they are constrained in their name, inheritance, and probably their conditionality (they should probably always succeed). Just as the syntax for constructors differs slightly from that of instance methods, the syntax for deconstructors may differ slightly from that of instance patterns. Static patterns, like static methods, have no receiver and do not have access to the type parameters of the enclosing class.
>
> Like constructors and methods, patterns can be overloaded, but in accordance with their duality to constructors and methods, the overloading happens on the _bindings_, not the inputs.
>
> ## Use-site syntax
>
> There are several kinds of type-driven patterns built into the language: type patterns and record patterns. A type pattern in a `switch` looks like:
>
>     case String s: ...
>
> And a record pattern looks like:
>
>     case MyRecord(P1, P2, ...): ...
>
> where `P1..Pn` are nested patterns that are recursively matched to the components of the record. This use-site syntax for record patterns was chosen for its similarity to the construction syntax, to highlight that a record pattern is the dual of record construction.
>
> **Deconstruction patterns.** The simplest kind of member pattern, a deconstruction pattern, will have the same use-site syntax as a record pattern; record patterns can be thought of as a deconstruction pattern "acquired for free" by records, just as records do with constructors, accessors, object methods, etc. So the use of a deconstruction pattern for `Point` looks like:
>
>     case Point(var x, var y): ...
>
> whether `Point` is a record or an ordinary class equipped with a suitable deconstruction pattern.
>
> **Static patterns.**
> Continuing with the idea that the destructuring syntax should evoke the aggregation syntax, there is an obvious candidate for the use-site syntax for static patterns:
>
>     case Optional.of(var e): ...
>     case Optional.empty(): ...
>
> **Instance patterns.** Uses of instance patterns will likely come in two forms, analogous to bound and unbound instance method references, depending on whether the receiver and the match candidate are the same object. In the unbound form, used when the receiver is the same object as the match candidate, the pattern name is qualified by a _type_:
>
> ```
> Class k = ...
> switch (k) {
>     // Qualified by type
>     case Class.arrayClass(var componentType): ...
> }
> ```
>
> This means that we _resolve_ the pattern `arrayClass` starting at `Class` and _select_ the pattern using the receiver, `k`. We may also be able to omit the class qualifier if the static type of the match candidate is sufficient to resolve the desired pattern.
>
> In the bound form, used when the receiver is distinct from the match candidate, the pattern name is qualified with an explicit _receiver expression_. As an example, consider an interface that captures primitive widening and narrowing conversions, such as those between `int` and `long`. In the widening direction, conversion is unconditional, so this can be modeled as a method from `int` to `long`. In the other direction, conversion is conditional, so this is better modeled as a _pattern_ whose match candidate is `long` and which binds an `int` on success. Since these are instance methods of some class (say, `NumericConversion`), we need to provide the receiver instance in order to resolve the pattern:
>
> ```
> NumericConversion nc = ...
>
> switch (aLong) {
>     case nc.narrowed(int i):
>     ...
> }
> ```
>
> The explicit receiver syntax would also be used if we exposed regular expression matching as a pattern on the `j.u.r.Pattern` object (the name collision on `Pattern` is unfortunate). Imagine we added a `matching` instance pattern to `j.u.r.Pattern`; then we could use it in `instanceof` as follows:
>
> ```
> static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)");
> ...
> if (aString instanceof P.matching(String as, String bs)) { ... }
> ```
>
> Each of these use-site syntaxes is modeled after the use-site syntax for a method invocation or method reference.
>
> ## Declaration-site syntax
>
> To avoid being biased by the simpler cases, we're going to work all the cases concurrently rather than starting with the simpler cases and working up. (It might seem sensible to start with deconstructors, since they are the "easy" case, but if we did that, we would likely be biased by their simplicity and then find ourselves painted into a corner.) As our example gallery, we will consider:
>
>  - Deconstruction pattern for `Point`;
>  - Static patterns for `Optional::of` and `Optional::empty`;
>  - Static pattern for "power of two" (illustrating a computation where success or failure, and computation of bindings, cannot easily be separated);
>  - Instance pattern for `Class::arrayClass` (used unbound);
>  - Instance pattern for `Pattern::matching` on regular expressions (used bound).
>
> Member patterns, like methods, have _names_. (We can think of constructors as being named for their enclosing classes, and the same for deconstructors.) All member patterns have a (possibly empty) ordered list of _bindings_, which are the dual of constructor or method parameters. Bindings, in turn, have names and types. And like constructors and methods, member patterns have a _body_ which is a block statement.
> Member patterns also have a _match candidate_, which is a likely-implicit method parameter.
>
> ### Member patterns as inverse methods and constructors
>
> Regardless of syntax, let us remind ourselves that deconstructors are the categorical dual to constructors (coconstructors), and pattern methods are the categorical dual to methods (comethods). They are dual in their structure: a constructor or method takes N arguments and produces a result, the corresponding member pattern consumes a match candidate and (conditionally) produces N bindings.
>
> Moreover, they are semantically dual: the return value produced by construction or factory invocation is the match candidate for the corresponding member pattern, and the bindings produced by a member pattern are the answers to the _Pattern Question_ -- "could this object have come from an invocation of my dual, and if so, with what arguments."
>
> ### What do we call them?
>
> Given the significant overlap between methods and patterns, the first question about the declaration we need to settle is how to identify a member pattern declaration as distinct from a method or constructor declaration. _Towards Member Patterns_ tried out a syntax that recognized these as _inverse_ methods and constructors:
>
>     public Point(int x, int y) { ... }
>     public inverse Point(int x, int y) { ... }
>
> While this is a principled choice which clearly highlights the duality, and one that might be good for specification and verbal description, it is questionable whether this would be a great syntax for reading and writing programs.
>
> A more traditional option is to choose a "noun" (conditional) keyword, such as `pattern`, `matcher`, `extractor`, `view`, etc:
>
>     public pattern Point(int x, int y) { ...
>     }
>
> If we are using a noun keyword to identify pattern declarations, we could use the same noun for all of them, or we could choose a different one for deconstruction patterns:
>
>     public deconstructor Point(int x, int y) { ... }
>
> Alternately, we could reach for a symbol to indicate that we are talking about an inverted member. C++ fans might suggest
>
>     public ~Point(int x, int y) { ... }
>
> but this is too cryptic (it's evocative once you see it, but then it becomes less evocative as we move away from deconstructors towards instance patterns.)
>
> If we wish to offer finer-grained control over conditionality, we might additionally need a `total` / `partial` modifier, though I would prefer to avoid that.
>
> Of the keyword candidates, there is one that stands out (for good and bad) because it connects to something that is already in the language: `pattern`. On the one hand, using the term `pattern` for the declaration is a slight abuse; on the other, users will immediately connect it with "ah, so that's how I make a new pattern" or "so that's what happens when I match against this pattern." (Lisps would resolve this tension by calling it `defpattern`.)
>
> The others (`matcher`, `view`, `extractor`, etc) are all made-up terms that don't connect to anything else in the language, for better or worse. If we pick one of these, we are asking users to sort out _three_ separate new things in their heads: (use-site) patterns, (declaration-site) matchers, and the rules of how patterns and matchers are connected. Calling them both "patterns", despite the mild abuse of terminology, ties them together in a way that recognizes their connection.
>
> My personal position: `pattern` is the strongest candidate here, despite some flaws.
>
> ### Binding lists and match candidates
>
> There are two obvious alternatives for describing the binding list and match candidate of a pattern declaration, both with their roots in the constructor and method syntax:
>
>  - Pretend that a pattern declaration is like a method with multiple return, and put the binding list in the "return position", and make the match candidate an ordinary parameter;
>  - Lean into the inverse relationship between constructors and methods (and consistency with the use-site syntax), and put the binding list in the "parameter list position". For static patterns and some instance patterns, which need to explicitly identify the match candidate type, there are several sub-options:
>    - Lean further into the duality, putting the match candidate type in the "return position";
>    - Put the match candidate type somewhere else, where it is less likely to be confused for a method return.
>
> The "method-like" approach might look like this:
>
> ```
> class Point {
>     // Constructor and deconstructor
>     public Point(int x, int y) { ... }
>     public pattern (int x, int y) Point(Point target) { ... }
>     ...
> }
>
> class Optional<T> {
>     // Static factory and pattern
>     public static <T> Optional<T> of(T t) { ... }
>     public static <T> pattern (T t) of(Optional<T> target) { ... }
>     ...
> }
> ```
>
> The "inverse" approach might look like:
>
> ```
> class Point {
>     // Constructor and deconstructor
>     public Point(int x, int y) { ... }
>     public pattern Point(int x, int y) { ... }
>     ...
> }
>
> class Optional<T> {
>     // Static factory and pattern (using the first sub-option)
>     public static <T> Optional<T> of(T t) { ... }
>     public static <T> pattern Optional<T> of(T t) { ... }
>     ...
> }
> ```
>
> With the "method-like" approach, the match candidate gets an explicit name selected by the author; with the inverse approach, we can go with a predefined name such as `that`. (Because deconstructors do not have receivers, we could by abuse of notation arrange for the keyword `this` to refer instead to the match candidate within the body of a deconstructor. While this might seem to lead to a more familiar notation for writing deconstructors, it would create a gratuitous asymmetry between the bodies of deconstruction patterns and those of other patterns.)
>
> Between these choices, nearly all the considerations favor the "inverse" approach:
>
>  - The "inverse" approach makes the declaration look like the use site. This highlights that `pattern Point(int x, int y)` is what gets invoked when you match against the pattern use `Point(int x, int y)`. (This point is so strong that we should probably just stop here.)
>  - The "inverse" members also look like their duals; the only difference is the `pattern` keyword (and possibly the placement of the match candidate type). This makes matched pairs much more obvious, and such matched pairs will be critical both for future language features and for library idioms.
>  - The method-like approach is suggestive of multiple return or tuples, which is probably helpful for the first few minutes but actually harmful in the long term. This feature is _not_ (much as some people would like to believe) about multiple return or tuples, and playing into this misperception will only make it harder to truly understand. So this suggestion ends up propping up the wrong mental model.
>
> The main downside of the "inverse" approach is the one-time speed bump of the unfamiliarity of the inverted syntax. (The "method-like" syntax also has its own speed bumps, it is just unfamiliar in different ways.)
> But unlike the advantages of the inverse approach, which continue to add value forever, this speed bump is a one-time hurdle to get over.
>
> To smooth out the speed bumps of the inverse approach, we can consider moving the position of the match candidate for static and (suitable) instance pattern declarations, such as:
>
> ```
> class Optional<T> {
>     // the usual static factory
>     public static <T> Optional<T> of(T t) { ... }
>
>     // Various ways of writing the corresponding pattern
>     public static <T> pattern of(T t) for Optional<T> { ... }
>     // or ...
>     public static <T> pattern(Optional<T>) of(T t) { ... }
>     // or ...
>     public static <T> pattern(Optional<T> that) of(T t) { ... }
>     // or ...
>     public static <T> pattern<Optional<T>> of(T t) { ... }
>     ...
> }
> ```
>
> (The deconstructor example looks the same with either variant.) Of these, treating the match candidate like a "parameter" of "pattern" is probably the most evocative:
>
> ```
> public static <T> pattern(Optional<T> that) of(T t) { ... }
> ```
>
> as it can be read as "pattern taking the parameter `Optional<T> that` called `of`, binding `T`", and is a short departure from the inverse syntax.
>
> The main value of the various rearrangements is that users don't need to think about things operating in reverse to parse the syntax. This trades some of the secondary point (patterns looking almost exactly like their inverses) for a certain amount of cognitive load, while maintaining the most important consideration: that the declaration site look like the use site.
>
> For instance pattern declarations, if the match candidate type is the same as the receiver type, the match candidate type can be elided as it is with deconstructors.
>
> My personal position: the "multiple return" version is terrible; all the sub-variants of the inverse version are probably workable.
>
> ### Naming the match candidate
>
> We've been assuming so far that the match candidate always has a fixed name,
> such as `that`; this is an entirely workable approach. Some of the variants are
> also amenable to allowing authors to explicitly select a name for the match
> candidate. For example, if we put the match candidate as a "parameter" to the
> `pattern` keyword, there is an obvious place to put the name:
>
> ```
> static pattern(Optional target) of(T t) { ... }
> ```
>
> My personal opinion: I don't think this degree of freedom buys us much, and in
> the long run readability probably benefits by picking a fixed name like `that`
> and sticking with it. Even with a fixed name, if there is a sensible position
> for the name, allowing users to type `that` for explicitness is fine (as we do
> with instance methods, though many people don't know this.) We may even want to
> require it.
>
> ## Body types
>
> Just as there are two obvious approaches for the declaration, there are two
> obvious approaches we could take for the body (though there is some coupling
> between them.) We'll call the two body approaches _imperative_ and
> _functional_.
>
> The imperative approach treats bindings as initially-DU variables that must be
> DA on successful completion, getting their value through ordinary assignment;
> the functional approach sets all the bindings at once, positionally. Either
> way, member patterns (except maybe deconstructors) also need a way to
> differentiate a successful match from a failed match.
>
> Here is the `Point` deconstructor with both imperative and functional style. The
> functional style uses a placeholder `match` statement to indicate a successful
> match and provision of bindings:
>
> ```
> class Point {
>     int x, y;
>
>     Point(int x, int y) {
>         this.x = x;
>         this.y = y;
>     }
>
>     // Imperative style, deconstructor always succeeds
>     pattern Point(int x, int y) {
>         x = that.x;
>         y = that.y;
>     }
>
>     // Functional style
>     pattern Point(int x, int y) {
>         match(that.x, that.y);
>     }
> }
> ```
>
> There are some obvious differences here. In the imperative style, the dtor body
> looks much more like the reverse of the ctor body. The functional style is more
> concise (and amenable to further concision via the "concise method bodies"
> mechanism in the future), as well as a number of less obvious differences. For
> deconstructors, the imperative approach is likely to feel more natural because
> of the obvious symmetry with constructors.
>
> In reality, it is _premature at this point to have an opinion_, because we
> haven't yet seen the full scope of the problem; deconstructors are a special
> case in many ways, which almost surely is distorting our initial opinion. As we
> move towards conditional patterns (and pattern lambdas), our opinions may flip.
>
> Regardless of which we pick, there are some additional syntactic choices to be
> made -- what syntax to use to indicate success (we used `match` in the above
> example) or failure. (We should be especially careful around trying to reuse
> words like `return`, `break`, or `yield` because, in the case where there are
> zero bindings (which is allowable), it becomes unclear whether they mean "fail"
> or "succeed with zero bindings".)
>
> ### Success and failure
>
> Except for possibly deconstructors, which we may require to be total, a pattern
> declaration needs a way to indicate success and failure. In the examples above,
> we posited a `match` statement to indicate success in the functional approach,
> and in both examples leaned on the "implicit success" of deconstructors (under
> the assumption they always succeed). Now let's look at the more general case to
> figure out what else is needed.
>
> For a static pattern like `Optional::of`, success is conditional.
> Using `match-fail` as a placeholder for "the match failed", this might look like
> (functional version):
>
> ```
> public static pattern(Optional that) of(T t) {
>     if (that.isPresent())
>         match (that.get());
>     else
>         match-fail;
> }
> ```
>
> The imperative version is less pretty, though. Using `match-success` as a
> placeholder:
>
> ```
> public static pattern(Optional that) of(T t) {
>     if (that.isPresent()) {
>         t = that.get();
>         match-success;
>     }
>     else
>         match-fail;
> }
> ```
>
> Both arms of the `if` feel excessively ceremonial here. And if we chose to not
> make all deconstruction patterns unconditional, deconstructors would likely need
> some explicit success as well:
>
> ```
> pattern Point(int x, int y) {
>     x = that.x;
>     y = that.y;
>     match-success;
> }
> ```
>
> It might be tempting to try and eliminate the need for explicit success by
> inferring it from whether or not the bindings are DA or not, but this is
> error-prone, is less type-checkable, and falls apart completely for patterns
> with no bindings.
>
> ### Implicit failure in the functional approach
>
> One of the ceremonial-seeming aspects of `Optional::of` above is having to say
> `else match-fail`, which doesn't feel like it adds a lot of value. Perhaps we
> can be more concise without losing clarity.
>
> Most conditional patterns will have a predicate to determine matching, and then
> some conditional code to compute the bindings and claim success. Having to say
> "and if the predicate didn't hold, then I fail" seems like ceremony for the
> author and noise for the reader. Instead, if a conditional pattern falls off
> the end without matching, we could treat that as simply not matching:
>
> ```
> public static pattern(Optional that) of(T t) {
>     if (that.isPresent())
>         match (that.get());
> }
> ```
>
> This says what we mean: if the optional is present, then this pattern succeeds
> and binds the contents of the `Optional`. As long as our "succeed" construct
> strongly enough connotes that we are terminating abruptly and successfully, this
> code is perfectly clear. And most conditional patterns will look a lot like
> `Optional::of`; do some sort of test and if it succeeds, extract the state and
> bind it.
>
> At first glance, this "implicit fail" idiom may seem error-prone or sloppy. But
> after writing a few dozen patterns, one quickly tires of saying "else
> match-fail" -- and the reader doesn't necessarily appreciate reading it either.
>
> Implicit failure also simplifies the selection of how we explicitly indicate
> failure; using `return` in a pattern for "no match" becomes pretty much a forced
> move. We observe that (in a void method), "return" and "falling off the end"
> are equivalent; if "falling off the end" means "no match", then so should an
> explicit `return`. So in those few cases where we need to explicitly signal "no
> match", we can just use `return`. It won't come up that often, but here's an
> example where it does:
>
> ```
> static pattern(int that) powerOfTwo(int exp) {
>     int exp = 0;
>
>     if (that < 1)
>         return; // explicit fail
>
>     while (that > 1) {
>         if (that % 2 == 0) {
>             that /= 2;
>             ++exp;
>         }
>         else
>             return; // explicit fail
>     }
>     match (exp);
> }
> ```
>
> As a bonus, if `return` as match failure is a forced move, we need only select a
> term for "successful match" (which obviously can't be `return`). We could use
> `match` as we have in the examples, or a variant like `matched` or `matches`.
> But rather than just creating a new control operator, we have an opportunity to
> lean into the duality a little harder, by including the pattern syntax in the
> match:
>
> ```
> matches of(that.get());
> ```
>
> or the (optionally?) qualified (inferring type arguments, as we do at the use
> site):
>
> ```
> matches Optional.of(that.get());
> ```
>
> These "use the name" approaches trade a small amount of verbosity to gain a
> higher degree of fidelity to the pattern use site (and to evoke the comethod
> completion.)
>
> If we don't choose "implicit fail", we would have to invent _two_ new control
> flow statements to indicate "success" and "failure".
>
> My personal position: for the functional approach, implicit failure both makes
> the code simpler and clearer, and after you get used to it, you don't want to go
> back. Whether we say `match` or `matches` or `matches ` are all
> workable, though I like some variant that names the pattern.
>
> ### Implicit success in the imperative approach
>
> In the imperative approach, we can be implicit as well, but it feels more
> natural (at least, initially) to choose implicit success rather than failure.
> This works great for unconditional patterns:
>
> ```
> pattern Point(int x, int y) {
>     x = that.x;
>     y = that.y;
>     // implicit success
> }
> ```
>
> but not quite as well for conditional patterns:
>
> ```
> static pattern(Optional that) of(T t) {
>     if (that.isPresent()) {
>         t = that.get();
>     }
>     else
>         match-fail;
>     // implicit success
> }
> ```
>
> We can eliminate one of the arms of the if, with the more concise (but
> convoluted) inversion:
>
> ```
> static pattern(Optional that) of(T t) {
>     if (!that.isPresent())
>         match-fail;
>     t = that.get();
>     // implicit success
> }
> ```
>
> Just as with the functional approach, if we choose imperative and "implicit
> success", using `return` to indicate success is pretty much a forced move.
>
> ### Imperative is a trap
>
> If we assume that functional implies implicit failure, and imperative implies
> implicit success, then our choices become:
>
> ```
> class Optional {
>     public static Optional of(T t) { ... }
>
>     // imperative, implicit success
>     public static pattern(Optional that) of(T t) {
>         if (that.isPresent()) {
>             t = that.get();
>         }
>         else
>             match-fail;
>     }
>
>     // functional, implicit failure
>     public static pattern(Optional that) of(T t) {
>         if (that.isPresent())
>             matches of(that.get());
>     }
> }
> ```
>
> Once we get past deconstructors, the imperative approach looks worse by
> comparison because we need to assign all the bindings (which is _O(n)_
> assignments) _and also_ indicate success or failure somehow, whereas in the
> functional style all can be done together with a single `matches` statement.
>
> Looking at the alternatives, except maybe for unconditional patterns, the
> functional example above seems a lot more natural. The imperative approach
> works with deconstructors (assuming they are not conditional), but does not
> scale so well to conditionality -- which is the essence of patterns.
>
> From a theoretical perspective, the method-comethod duality also gives us a
> forceful nudge towards the functional approach. In a method, the method
> arguments are specified as a positional list of expressions at the use site:
>
>     m(a, b, c)
>
> and these values are invisibly copied into the parameter slots of the method
> prior to frame activation.
> The dual to that is for a comethod to similarly convey the bindings in a
> positional list of expressions (as they must either all be produced or none),
> where they are copied into the slots provided at the use site, as is indicated
> by `matches` in the above examples.
>
> My personal position: the imperative style feels like a trap. It seems
> "obvious" at first if we start with deconstructors, but becomes increasingly
> difficult when we get past this case, and gets in the way of other
> opportunities. The last gasp before acceptance is the discomfort that dtor and
> ctor bodies are written in different styles, but in the rear-view mirror, this
> feels like a non-issue.
>
> ### Derive imperative from functional?
>
> If we start with "functional with implicit failure", we can possibly rescue
> imperative by deriving a version of imperative from functional, by "overloading"
> the match-success operator.
>
> If we have a pattern whose binding names are `b1..bn` of types `B1..Bn`, then
> the `matches` operator must take a list of expressions `e1..en` whose arity and
> types are compatible with `B1..Bn`. But we could allow `matches` to also have a
> nilary form, which would have the effect of being shorthand for
>
>     matches (b1, b2, ..., bn)
>
> where each of `b1..bn` must be DA at the point of matching. This means that we
> could express patterns in either form:
>
> ```
> class Optional {
>     public static Optional of(T t) { ... }
>
>     // imperative, derived from functional with implicit failure
>     public static pattern(Optional that) of(T t) {
>         if (that.isPresent()) {
>             t = that.get();
>             matches of;
>         }
>     }
>
>     public static pattern(Optional that) of(T t) {
>         if (that.isPresent())
>             matches of(that.get());
>     }
> }
> ```
>
> This flexibility allows users to select a more verbose expression in exchange
> for a clearer association of expressions and bindings, though as we'll see, it
> does come with some additional constraints.
>
> ### Wrapping an existing API
>
> Nearly every library has methods (sometimes sets of methods) that are patterns
> in disguise, such as the pair of methods `isArray` and `getComponentType` in
> `Class`, or the `Matcher` helper type in `java.util.regex`. Library maintainers
> will likely want to wrap (or replace) these with real patterns, so these can
> participate more effectively in conditional contexts, and in some cases,
> highlight their duality with factory methods.
>
> Matching a string against a `j.u.r.Pattern` regular expression has all the same
> elements as a pattern, just with an ad-hoc API (and one that I have to look up
> every time). But we can fairly easily wrap a true pattern around the existing
> API. To match against a `Pattern` today, we pass the match candidate to
> `Pattern::matcher`, which returns a `Matcher` with accessors `Matcher::matches`
> (did it match) and `Matcher::group` (conditionally extract a particular capture
> group.) If we want to wrap this with a pattern called `regexMatch`:
>
> ```
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = this.matcher(that);
>     if (m.matches())
>         matches Pattern.regexMatch(IntStream.range(1, m.groupCount())
>                                             .map(Matcher::group)
>                                             .toArray(String[]::new));
>     // whole lotta matchin' goin' on
> }
> ```
>
> This says that a `j.u.r.Pattern` has an instance pattern called `regexMatch`,
> whose match candidate is `String`, and which binds a varargs of `String`
> corresponding to the capture groups. The implementation simply delegates to the
> existing `j.u.r.Matcher` API.
> This means that `j.u.r.Pattern` becomes a sort of "pattern object", and we can
> use it as a receiver at the use site:
>
> ```
> static Pattern As = Pattern.compile("(a*)");
> static Pattern Bs = Pattern.compile("(b*)");
> ...
> switch (string) {
>     case As.regexMatch(var as): ...
>     case Bs.regexMatch(var bs): ...
>     ...
> }
> ```
>
> ### Odds and ends
>
> There are a number of loose ends here. We could choose other names for the
> match-success and match-fail operations, including trying to reuse `break` or
> `yield`. But, this reuse is tricky; it must be very clear whether a given form
> of abrupt completion means "success" or "failure", because in the case of
> patterns with no bindings, we will have no other syntactic cues to help
> disambiguate. (I think having a single `matches`, with implicit failure and
> `return` meaning failure, is the sweet spot here.)
>
> Another question is whether the binding list introduces corresponding variables
> into the scope of the body. For imperative, the answer is "surely yes"; for
> functional, the answer is "maybe" (unless we want to do the trick where we
> derive imperative from functional, in which case the answer is "yes" again.)
>
> If the binding list does not correspond to variables in the body, this may be
> initially discomforting; because they do not declare program elements, they may
> feel that they are left "dangling". But even if they are not declaring
> _program_ elements, they are still declaring _API_ elements (similar to the
> return type of a method.) We will want to provide Javadoc on the bindings, just
> like with parameters; we will want to match up binding names in deconstructors
> with parameter names in constructors; we may even someday want to support
> by-name binding at the use site (e.g., `case Foo(a: var a)`). The names are
> needed for all of these, just not for the body. Names still matter.
> My take here is that this is a transient "different is scary" reaction, one
> that we would get over quickly.
>
> A final question is whether we should consider unqualified names as implicitly
> qualified by `that` (and also `this`, for instance patterns, with some conflict
> resolution). Users will probably grow tired of typing `that.` all the
> time, and most of the time, the unqualified use is perfectly readable.
>
> ## Exhaustiveness
>
> There is one last syntax question in front of us: how to indicate that a set of
> patterns are (claimed to be) exhaustive on a given match candidate type. We see
> this with `Optional::of` and `Optional::empty`; it would be sad if the compiler
> did not realize that these two patterns together were exhaustive on `Optional`.
> This is not a feature that will be used often, but not having it at all will be
> a repeated irritant.
>
> The best I've come up with is to call these `case` patterns, where a set of
> `case` patterns for a given match candidate type in a given class are asserted
> to be an exhaustive set:
>
> ```
> class Optional {
>     static Optional of(T t) { ... }
>     static Optional empty() { ... }
>
>     static case pattern of(T t) for Optional { ... }
>     static case pattern empty() for Optional { ... }
> }
> ```
>
> Because they may not be truly exhaustive, `switch` constructs will have to back
> up the static assumption of exhaustiveness with a dynamic check, as we do for
> other sets of exhaustive patterns that may have remainder.
>
> I've experimented with variants of `sealed` but it felt more forced, so this is
> the best I've come up with.
>
> ## Example: patterns delegating to other patterns
>
> Pattern implementations must compose. Just as a subclass constructor delegates
> to a superclass constructor, the same should be true for deconstructors.
> Here's a typical superclass-subclass pair:
>
> ```
> class A {
>     private final int a;
>
>     public A(int a) { this.a = a; }
>     public pattern A(int a) { matches A(that.a); }
> }
>
> class B extends A {
>     private final int b;
>
>     public B(int a, int b) {
>         super(a);
>         this.b = b;
>     }
>
>     // Imperative style
>     public pattern B(int a, int b) {
>         if (that instanceof super(var aa)) {
>             a = aa;
>             b = that.b;
>             matches B;
>         }
>     }
>
>     // Functional style
>     public pattern B(int a, int b) {
>         if (that instanceof super(var a))
>             matches B(a, b);
>     }
> }
> ```
>
> (Ignore the flow analysis and totality for the time being; we'll come back to
> this in a separate document.)
>
> The first thing that jumps out at us is that, in the imperative version, we had
> to create a "garbage" variable `aa` to receive the binding, because `a` was
> already in scope, and then we have to copy the garbage variable into the real
> binding variable. Users will surely balk at this, and rightly so. In the
> functional version (depending on the choices from "Odds and Ends") we are free
> to use the more natural name and avoid the roundabout locution.
>
> We might be tempted to fix the "garbage variable" problem by inventing another
> sub-feature: the ability to use an existing variable as the target of a binding,
> such as:
>
> ```
> pattern Point(int a, int b) {
>     if (this instanceof A(__bind a))
>         b = this.b;
> }
> ```
>
> But, I think the language is stronger without this feature, for two reasons.
> First, having to reason about whether a pattern match introduces a new binding
> or assigns to an existing variable is additional cognitive load for users to
> reason about, and second, having assignment to locals happening through
> something other than assignment introduces additional complexity in finding
> where a variable is modified.
> While we can argue about the general utility of this feature, bringing it in
> just to solve the garbage-variable problem is particularly unattractive.
>
> ## Pattern lambdas
>
> One final consideration is that patterns may also have a lambda form. Given
> a single-abstract-pattern (SAP) interface:
>
> ```
> interface Converter {
>     pattern(T t) convert(U u);
> }
> ```
>
> one can implement such a pattern with a lambda. Such a lambda has one parameter
> (the match candidate), and its body looks like the body of a declared pattern:
>
> ```
> Converter c =
>     i -> {
>         if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
>             matches Converter.convert((short) i);
>     };
> ```
>
> Because the bindings of the pattern lambda are defined in the interface, not in
> the lambda, this is one more reason not to like the imperative version: it is
> brittle, and alpha-renaming bindings in the interface would be a
> source-incompatible change.
>
> ## Example gallery
>
> Here's all the pattern examples so far, and a few more, using the suggested
> style (functional, implicit fail, implicit `that`-qualification):
>
> ```
> // Point dtor
> pattern Point(int x, int y) {
>     matches Point(x, y);
> }
>
> // Optional -- static patterns for Optional::of, Optional::empty
> static case pattern(Optional that) of(T t) {
>     if (isPresent())
>         matches of(t);
> }
>
> static case pattern(Optional that) empty() {
>     if (!isPresent())
>         matches empty();
> }
>
> // Class -- instance pattern for arrayClass (match candidate type inferred)
> pattern arrayClass(Class componentType) {
>     if (that.isArray())
>         matches arrayClass(that.getComponentType());
> }
>
> // regular expression -- instance pattern in j.u.r.Pattern
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = matcher(that);
>     if (m.matches())
>         matches Pattern.regexMatch(IntStream.range(1, m.groupCount())
>                                             .map(Matcher::group)
>                                             .toArray(String[]::new));
> }
>
> // power of two (somewhere)
> static pattern(int that) powerOfTwo(int exp) {
>     int exp = 0;
>
>     if (that < 1)
>         return;
>
>     while (that > 1) {
>         if (that % 2 == 0) {
>             that /= 2;
>             exp++;
>         }
>         else
>             return;
>     }
>     matches powerOfTwo(exp);
> }
> ```
>
> ## Closing thoughts
>
> I came out of this exploration with very different conclusions than I expected
> when going in. At first, the "inverse" syntax seemed stilted, but over time it
> started to seem more obvious. Similarly, I went in expecting to prefer the
> imperative approach for the body, but over time, started to warm to the
> functional approach, and eventually concluded it was basically a forced move if
> we want to support more than just deconstructors. And I started out skeptical
> of "implicit fail", but after writing a few dozen patterns with it, going back
> to fully explicit felt painful. All of this is to say, you should hold your
> initial opinions at arm's length, and give the alternatives a chance to sink in.
>
> For most _conditional_ patterns (and conditionality is at the heart of pattern
> matching), the functional approach cleanly highlights both the match predicate
> and the flow of values, and is considerably less fussy than the imperative
> approach in the same situation; `Optional::of`, `Class::arrayClass`, and `regex`
> look great here, much better than they would with imperative. None of these
> illustrate delegation, but in the presence of delegation, the gap gets even
> wider.
From attila.kelemen85 at gmail.com Mon Apr 1 16:31:14 2024
From: attila.kelemen85 at gmail.com (Attila Kelemen)
Date: Mon, 1 Apr 2024 18:31:14 +0200
Subject: Member Patterns -- the bikeshed
In-Reply-To:
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID:

>
> Lambdas. A SAM interface is one with a single abstract method, and we
> extract the function shape and use it as a target type for lambdas.
> Similarly, a SAP interface is one with a single abstract pattern. We can
> play the same game, except the shape is (match candidate) -> { pattern body
> that binds declared bindings }. This allows APIs to be extended with
> patterns, such as a `match` method in streams:
>
>     objects.stream().match(e -> e instanceof String s)....
>
> where Stream::match takes a SAP interface with a single binding, such as:
>
>     interface Matchy {
>         pattern(T that) p(U u);
>     }
>     ...
>     // in Stream
>     Stream match(Matchy m);
>

I see, thanks. What I was missing is that I thought a "pattern" method cannot exist without its pair. And that explains why the short "inverse" syntax wasn't chosen: from the point of view of the compiler, the fact that the "pattern" method has the same name / types is just a coincidence (hopefully I'm not misunderstanding something again).

From brian.goetz at oracle.com Mon Apr 1 16:34:49 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 1 Apr 2024 12:34:49 -0400
Subject: Deconstructor (and pattern) overload selection
Message-ID: <01758bfd-55ab-4d8d-97ec-e20d885ff2a3@oracle.com>

The next big pattern matching JEP will be about deconstruction patterns. (Static and instance patterns will likely come separately.) Now that we've got the bikeshed painting underway, there are a few other loose ends here, and one of them is overload selection.

We explored taking the existing overload selection algorithm and turning it inside out, but after going down that road a bit, I think this is both unnecessarily complex for not enough value and potentially fraught with nasty corner cases. I think there is a much simpler answer here which is entirely good enough.

First, let's remind ourselves, why do we have constructor overloading in the first place? There are three main reasons:

 - Concision. If a fully-general constructor takes many parameters, but not all are essential to the use case, then the construction site becomes a site of accidental complexity. Being able to handle common grouping of parameters simplifies use sites.
 - Flexibility. Related to the above, not only might the user not need to specify a given constructor parameter, but they want the flexibility of saying "let the implementation pick the best value". Constructors with fewer parameters reserve more flexibility for the implementation.
 - Alternative representations. Some objects may take multiple representations as input, such as accepting a Date, a LocalDate, or a LocalDateTime.

The first two cases are generally handled with "telescoping constructor nests", where we have:

    Foo(A a)
    Foo(A a, B b)
    Foo(A a, B b, C c, D d)

Sometimes the telescopes don't fold perfectly, and become "trees":

    Foo(A a)
    Foo(A a, B b)
    Foo(A a, C c, D d)
    Foo(A a, B b, C c, D d)

Which constructors to include are subjective judgments on the part of class authors to find good tradeoffs between code size and concision/flexibility.

We had initially assumed that each constructor overload would have a corresponding deconstructor, but further experimentation suggests this is not an ideal assumption.
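
For readers who want the telescoping-nest idea in concrete form, here is a minimal sketch in plain Java. The class `Endpoint`, its fields, and its default values are hypothetical and chosen only for illustration; nothing here comes from the proposal itself. It shows the "Concision" and "Flexibility" points: the shorter constructors leave the choice of port and timeout to the implementation.

```java
// Hypothetical example class; names and defaults are illustrative assumptions.
final class Endpoint {
    static final int DEFAULT_PORT = 80;
    static final int DEFAULT_TIMEOUT_MS = 5_000;

    final String host;
    final int port;
    final int timeoutMs;

    // Maximal constructor: the caller supplies every piece of state.
    Endpoint(String host, int port, int timeoutMs) {
        this.host = host;
        this.port = port;
        this.timeoutMs = timeoutMs;
    }

    // Shorter overloads delegate "up the telescope", letting the
    // implementation pick the values the caller did not specify.
    Endpoint(String host, int port) {
        this(host, port, DEFAULT_TIMEOUT_MS);
    }

    Endpoint(String host) {
        this(host, DEFAULT_PORT);
    }
}
```

A call such as `new Endpoint("example.org")` ends up in the three-argument constructor with both defaults filled in, which is the sense in which only the maximal constructor carries the full representation.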

Clue One that it is not a good assumption comes from the asymmetry between constructors and deconstructors; if we have constructors and deconstructors of shape C(List), then it is OK to invoke C's constructor with List or its subtypes, but we can invoke C's deconstructor with List or its subtypes or its supertypes.

Clue Two is that applicability for constructors is based on method invocation context, but applicability for deconstructors is based on cast context, which has different rules. It seems unlikely that we will ever get symmetry given this.

The "Flexibility" requirement does not really apply to deconstructors; having a deconstructor that accepts additional bindings does not constrain anything, not in the same way as a constructor taking needlessly specific arguments. Imagine if ArrayList had only constructors that take int (for array capacity); this is terrible for the constructor, because it forces a resource management decision onto users who will not likely make a very good decision, and one that is hard to change later, but pretty much harmless for deconstructors.

The "Concision" requirement does not really apply as much to deconstructors as constructors; matching with `Foo(var a, _, _)` is not nearly as painful as invoking with lots of parameters, each of which requires an explicit choice by the user.

So the main reason for overloading deconstructors is to match representations with the constructor overloads -- but with a given "representation set", there probably do not need to be as many deconstructors as constructors. What we really need is to match the "maximal" constructor in a telescoping nest with a corresponding deconstructor, or for a tree-shaped set, one for each "maximal" representation.

So for a class with constructors

    Foo()
    Foo(A a)
    Foo(A a, B b)
    Foo(X x)
    Foo(X x, Y y)

we would want dtors for (A,B) and (X,Y), but don't really need the others.

So, let's start fresh on overload selection.
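
As a rough illustration of "one deconstructor per maximal representation", here is a sketch that emulates the idea in today's Java (records, so Java 16 or later). Because the proposed `pattern` declarations are not valid Java yet, each would-be deconstructor is stood in for by a method returning a carrier record. The class `Money`, the records `CentsForm` and `DecimalForm`, and the `"EUR"` default are all hypothetical names invented for this sketch.

```java
import java.math.BigDecimal;

// Hypothetical class with two constructor telescopes, i.e. two "maximal"
// representations: (long, String) and (BigDecimal, String).
final class Money {
    // Carrier records standing in for the bindings of the two deconstructors.
    record CentsForm(long cents, String currency) {}
    record DecimalForm(BigDecimal amount, String currency) {}

    private final long cents;
    private final String currency;

    // Telescope 1: Money(long) -> Money(long, String)
    Money(long cents) { this(cents, "EUR"); }
    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    // Telescope 2: Money(BigDecimal) -> Money(BigDecimal, String)
    Money(BigDecimal amount) { this(amount, "EUR"); }
    Money(BigDecimal amount, String currency) {
        // longValueExact throws if the amount has sub-cent precision.
        this(amount.movePointRight(2).longValueExact(), currency);
    }

    // Only the maximal shapes get a "deconstructor"; there is no
    // counterpart for Money(long) or Money(BigDecimal).
    CentsForm asCents() { return new CentsForm(cents, currency); }
    DecimalForm asDecimal() {
        return new DecimalForm(BigDecimal.valueOf(cents, 2), currency);
    }
}
```

A client that only cares about the amount can ignore the currency component of the carrier, which mirrors the point that `Foo(var a, _, _)` is cheap at a pattern use site.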

Deconstructors have a set of applicability rules based on arity first (eventually, varargs, but not yet) and then on applicability of type patterns, which is in turn rooted in castability. Because we don't have the compatibility problem introduced by autoboxing, we can ignore the distinction between phase 1 and 2 of overload selection (we will have this problem with varargs later, though.)

Given this, the main question we have to resolve is to what degree -- if any -- we may deem one overload "more applicable" than others. I think there is one rule here that is forced: an exact type match (modulo erasure) is more applicable than an inexact type match. So given:

    D(Object o)
    D(String s)

then

    case D(String s)

should choose the latter. This allows the client to (mostly) steer to a specific overload just by using the right types (rather than `var` or a subtype.) It is not clear to me whether we need anything more here; in the event of ambiguity, a client can pick the right overload with the right type patterns. (Nested patterns may need to be manually unrolled to subsequent clauses in some cases.)

So basically (on a per-binding basis): an exact match is more applicable than an inexact match, and ... that's it. Users can steer towards a particular overload by selecting exact matches on enough bindings. Libraries can provide their own "joins" if they want to disambiguate problematic overloads like:

    D(Object o, String s)
    D(String s, Object o)

From brian.goetz at oracle.com Mon Apr 1 17:06:32 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 1 Apr 2024 13:06:32 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To:
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID: <3ea34476-63dc-4e06-9ffa-35e64c5667c5@oracle.com>

> I see, thanks. What I was missing is that I thought a "pattern" method
> cannot exist without its pair.
> And that explains why the short "inverse" syntax wasn't chosen:
> Because the fact that the "pattern" method has the same name / types
> is just a coincidence from the point of view of the compiler
> (hopefully I'm not misunderstanding something again).

Yes. Pairs of ctor/dtor (and similar for methods) are an extremely useful *API structuring mechanism*, but they are not required by the language. The language will likely provide some help in pairing them up, for use in contexts like withers and serialization, but that's it. Maybe this help will be implicit (same name + same arity + same parameter/binding types + same parameter/binding names means they are paired), maybe it will be explicit (some sort of "invertible" modifier); we are working through use cases now to figure this out. But we want pairing / invertibility to be something that developers choose.

From robbepincket at live.be Mon Apr 1 19:53:52 2024
From: robbepincket at live.be (Robbe Pincket)
Date: Mon, 1 Apr 2024 19:53:52 +0000
Subject: Deconstructor (and pattern) overload selection
In-Reply-To: <01758bfd-55ab-4d8d-97ec-e20d885ff2a3@oracle.com>
References: <01758bfd-55ab-4d8d-97ec-e20d885ff2a3@oracle.com>
Message-ID:

Brian Goetz 2024-04-01 16:34Z:

> [...]
>
> - Concision. If a fully-general constructor takes many parameters, but not all are essential to the use case, then the construction site becomes a site of accidental complexity. Being able to handle common grouping of parameters simplifies use sites.
>
> - Flexibility. Related to the above, not only might the user not need to specify a given constructor parameter, but they want the flexibility of saying "let the implementation pick the best value". Constructors with fewer parameters reserve more flexibility for the implementation.
>
> - Alternative representations.
>   Some objects may take multiple representations as input, such as accepting a Date, a LocalDate, or a LocalDateTime.
>
> The first two cases are generally handled with "telescoping constructor nests", where we have:
>
>     Foo(A a)
>     Foo(A a, B b)
>     Foo(A a, B b, C c, D d)
>
> [...]
>
> The "Flexibility" requirement does not really apply to deconstructors; having a deconstructor that accepts additional bindings does not constrain anything, not in the same way as a constructor taking needlessly specific arguments. Imagine if ArrayList had only constructors that take int (for array capacity); this is terrible for the constructor, because it forces a resource management decision onto users who will not likely make a very good decision, and one that is hard to change later, but pretty much harmless for deconstructors.
>
> The "Concision" requirement does not really apply as much to deconstructors as constructors; matching with `Foo(var a, _, _)` is not nearly as painful as invoking with lots of parameters, each of which requires an explicit choice by the user.
>
> So the main reason for overloading deconstructors is to match representations with the constructor overloads -- but with a given "representation set", there probably does not need to be as many deconstructors as constructors. What we really need is to match the "maximal" constructor in a telescoping nest with a corresponding deconstructor, or for a tree-shaped set, one for each "maximal" representation.
>
> So for a class with constructors
>
>     Foo()
>     Foo(A a)
>     Foo(A a, B b)
>     Foo(X x)
>     Foo(X x, Y y)
>
> we would want dtors for (A,B) and (X,Y), but don't really need the others.

Well, that does depend on whether you are going to allow deconstructors to be not total. If they don't have to be total, the deconstructor `Foo(X x)` could be short for `Foo(X x, null)` if constructor `Foo(X x)` calls `Foo(x, null)`. However, it would become more complex when `Foo(X x)` calls `Foo(x, Foo.DEFAULT_Y)`.
Now the deconstructor `Foo(X x)` doesn't really have a longer form equivalent.

Kind regards
Robbe Pincket

From ccherlin at gmail.com Tue Apr 2 22:16:19 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Tue, 2 Apr 2024 17:16:19 -0500
Subject: Member Patterns -- the bikeshed
In-Reply-To: <6a243154-82fe-4a7b-88c7-a70c1605a3bf@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <6a243154-82fe-4a7b-88c7-a70c1605a3bf@oracle.com>
Message-ID:

# Pattern input (candidate) / output (binding) syntax

Java already has a familiar syntax for separating a variable binding from an input value: the enhanced for loop header.

    for ( : )

becomes

    pattern ( : )

Examples:

## Optional

    public static pattern of(T t : Optional that)

    case Optional.of(T t)

## Map.Entry

    public static pattern of(K key, V value : Map.Entry that)

    case Map.Entry.of(String key, Integer value)

# Covering pattern groups

Java already has labels; it could also have "covering pattern group labels".

## Integer

    case PARITY pattern even(int even : Integer that)
    case PARITY pattern odd(int odd : Integer that)
    case (SIGNUM, FACTORING) pattern negative(int n : Integer that)
    case (SIGNUM, FACTORING) pattern zero(int z : Integer that)
    case SIGNUM pattern positive(int p : Integer that)
    case FACTORING pattern unit(int u : Integer that)
    case FACTORING pattern prime(int p : Integer that)
    case FACTORING pattern composite(int c : Integer that)

The case sets {even, odd}, {positive, zero, negative} and {negative, zero, unit, prime, composite} are all total, but the case set {even, negative, composite} is not.

Cheers,
Clement Cherlin

On Sat, Mar 30, 2024 at 3:53 PM Brian Goetz wrote:

>
>
> On 3/30/2024 3:23 PM, Victor Nazarov wrote:
>
> I have two points that I think may be good to consider in the list of options.
>
> 1.
I'm not sure if this was considered, but I find explicit lists of > covering patterns > rather natural and more flexible than using case as a pattern-modifier. > > > Agreed (this is how F# does it), and we tried that, but it is so contrary > to how members are done in Java. (One might think that one could declare > a "sealed" pattern, which "permits" a list of other patterns, and this > sounds perfectly natural, but it looks pretty weird.) > > The important feature of explicit lists is that there may be more than one > covering set of patterns. > > > Yes, been down this road too, but the reality is that this is not likely > to come up nearly as often as one might imagine. > > 2. I think that there is a middle ground between functional and imperative > pattern body definition style that may look cumbersome at first, but > nevertheless gives you best of both worlds: > > > The `match` block is an interesting idea, will consider. > > > > * deconstructor patterns look dual to constructors > * names from the list of pattern variables are actually used and > checked by the compiler > * control flow is still functional, which is more natural > > The downside that is retained from the imperative style is the need for > alpha-renaming, > but I think we still have to deal with shadowing and renaming > local-variable seems natural and easy. > > Middle ground may be used like a special form that can be used in the > pattern body. > This form works mostly the same way as `with`-clause as defined in the > "Derived Record Instances" JEP. > > Here is the long list of examples to fully illustrate different > interactions: > > ```` > class Optional matches (of|empty) { > public static pattern> of(T value) { > if (that.isPresent()) { > match { > value = that.get(); > } > } > } > > public static pattern> empty() { > if (that.isEmpty()) > match {} > } > } > > class Pattern { > public pattern regexMatch(String... 
groups) { > Matcher m = this.matcher(that); > if (m.matches()) { > match { > groups = > IntStream.range(1, m.groupCount()) > .map(Matcher::group) > .toArray(String[]::new); > } > } > } > } > > class A { > private final int a; > > public A(int a) { > this.a = a; > } > public pattern A(int a) { > match { > a = that.a; > } > } > } > > class B extends A { > private final int b; > > public B(int a, int b) { > super(a); > this.b = b; > } > > public pattern B(int a, int b) { > if (that instanceof super(var aa)) { > match { > a = aa; > b = that.b; > } > } > } > } > > interface Converter { > pattern convert(U u); > } > Converter c = > pattern (s) -> { > if (that >= Short.MIN_VALUE && that <= Short.MAX_VALUE) > match { > s = (short) that; > } > }; > ```` > > -- > Victor Nazarov > > > On Fri, Mar 29, 2024 at 10:59 PM Brian Goetz > wrote: > >> We now come to the long-awaited bikeshed discussion on what member >> patterns should look like. >> >> Bikeshed disclaimer for EG: >> - This is likely to evoke strong opinions, so please take pains to be >> especially constructive >> - Long reply-to-reply threads should be avoided even more than usual >> - Holistic, considered replies preferred >> - Please change subject line if commenting on a sub-topic or tangential >> concern >> >> Special reminders for Remi: >> - Use of words like "should", "must", "shouldn't", "mistake", "wrong", >> "broken" >> are strictly forbidden. >> - If in doubt, ask questions first. >> >> Notes for external observers: >> - This is a working document for the EG; the discussion may continue for >> a >> while before there is an official proposal. Please be patient. >> >> >> # Pattern declaration: the bikeshed >> >> We've largely identified the model for what kinds of patterns we need to >> express, but there are still several degrees of freedom in the syntax. >> >> As the model has simplified during the design process, the space of syntax >> choices has been pruned back, which is a good thing.
However, there are >> still >> quite a few smaller decisions to be made. Not all of the considerations >> are >> orthogonal, so while they are presented individually, this is not a "pick >> one >> from each column" menu. >> >> Some of these simplifications include: >> >> - Patterns with "input arguments" have been removed; another way to get >> to what >> this gave us may come back in another form. >> - I have grown increasingly skeptical of the value of the imperative >> `match` >> statement. With better totality analysis, I think it can be >> eliminated. >> >> We can discuss these separately but I would like to sync first on the >> broad >> strokes for how patterns are expressed. >> >> ## Object model requirements >> >> As outlined in "Towards Member Patterns", the basic model is that >> patterns are >> the dual of other executable members (constructors, static methods, >> instance >> methods.) While they are like methods in that they have inputs, outputs, >> names, >> and an imperative body, they have additional degrees of freedom that >> constructors and methods lack: >> >> - Patterns are, in general, _conditional_ (they can succeed or fail), >> and only >> produce bindings (outputs) when they succeed. This conditionality is >> understood by the language's flow analysis, and is used for computing >> scoping >> and definite assignment. >> - Methods can return at most one value; when a pattern completes >> successfully, >> it may bind multiple values. >> - All patterns have a _match candidate_, which is a distinguished, >> possibly-implicit parameter. Some patterns also have a receiver, >> which is >> also a distinguished, possibly-implicit parameter. In some such cases >> the >> receiver and match candidate are aliased, but in others these may >> refer to >> different objects. 
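For readers who want to experiment before any pattern declaration syntax exists, the three properties listed above (conditionality, multiple bindings, an explicit match candidate) can be approximated in current Java with a method that returns an `Optional` of a small record. This is only an emulation for illustration, not part of the proposal, and it loses the flow-analysis integration the note describes; the names `KeyValue` and `keyValue` are hypothetical.

```java
import java.util.Optional;

public class ConditionalPatternSketch {
    // Hypothetical carrier for the two "bindings" produced on success.
    record KeyValue(String key, String value) {}

    // The match candidate is an ordinary parameter; an empty Optional
    // stands in for match failure, a present one carries the bindings.
    static Optional<KeyValue> keyValue(String candidate) {
        int eq = candidate.indexOf('=');
        if (eq < 0) {
            return Optional.empty();          // no match
        }
        return Optional.of(new KeyValue(
                candidate.substring(0, eq),   // first binding
                candidate.substring(eq + 1)));// second binding
    }

    public static void main(String[] args) {
        keyValue("a=1").ifPresent(kv ->
                System.out.println(kv.key() + " -> " + kv.value()));
        System.out.println(keyValue("no separator").isPresent());
    }
}
```

The caller has to unpack the `Optional` by hand, so the bindings are not in scope in the way they would be after a successful pattern match; that gap is part of what the member pattern proposal is meant to close.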
>> >> So a pattern is a named executable member that takes a _match candidate_ >> as a >> possibly-implicit parameter, maybe takes a receiver as an implicit >> parameter, >> and has zero or more conditional _bindings_. Its body can perform >> imperative >> computation, and can terminate either with match failure or success. In >> the >> success case, it must provide a value for each binding. >> >> Deconstruction patterns are special in many of the same ways constructors >> are: >> they are constrained in their name, inheritance, and probably their >> conditionality (they should probably always succeed). Just as the syntax >> for >> constructors differs slightly from that of instance methods, the syntax >> for >> deconstructors may differ slightly from that of instance patterns. Static >> patterns, like static methods, have no receiver and do not have access to >> the >> type parameters of the enclosing class. >> >> Like constructors and methods, patterns can be overloaded, but in >> accordance >> with their duality to constructors and methods, the overloading happens >> on the >> _bindings_, not the inputs. >> >> ## Use-site syntax >> >> There are several kinds of type-driven patterns built into the language: >> type >> patterns and record patterns. A type pattern in a `switch` looks like: >> >> case String s: ... >> >> And a record pattern looks like: >> >> case MyRecord(P1, P2, ...): ... >> >> where `P1..Pn` are nested patterns that are recursively matched to the >> components of the record. This use-site syntax for record patterns was >> chosen >> for its similarity to the construction syntax, to highlight that a record >> pattern is the dual of record construction. 
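The two built-in use-site forms described above already work in Java 21 and later, so they can be tried directly. The sketch below is a minimal example assuming a hypothetical `Point` record and a hypothetical `describe` helper; it uses the arrow form of `case` rather than the colon form shown in the note.

```java
public class UseSiteSketch {
    // Hypothetical record; its record pattern comes for free.
    record Point(int x, int y) {}

    static String describe(Object o) {
        return switch (o) {
            // Type pattern: binds s when o is a String.
            case String s -> "string of length " + s.length();
            // Record pattern: mirrors `new Point(x, y)` and binds both components.
            case Point(int x, int y) -> "point " + x + "," + y;
            default -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe(42));
    }
}
```

The record pattern `Point(int x, int y)` reads like the constructor call that produced the value, which is the duality the use-site syntax was chosen to highlight.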
>> >> **Deconstruction patterns.** The simplest kind of member pattern, a >> deconstruction pattern, will have the same use-site syntax as a record >> pattern; >> record patterns can be thought of as a deconstruction pattern "acquired >> for >> free" by records, just as records do with constructors, accessors, object >> methods, etc. So the use of a deconstruction pattern for `Point` looks >> like: >> >> case Point(var x, var y): ... >> >> whether `Point` is a record or an ordinary class equipped with a suitable >> deconstruction pattern. >> >> **Static patterns.** Continuing with the idea that the destructuring >> syntax >> should evoke the aggregation syntax, there is an obvious candidate for the >> use-site syntax for static patterns: >> >> case Optional.of(var e): ... >> case Optional.empty(): ... >> >> **Instance patterns.** Uses of instance patterns will likely come in two >> forms, >> analogous to bound and unbound instance method references, depending on >> whether >> the receiver and the match candidate are the same object. In the unbound >> form, >> used when the receiver is the same object as the match candidate, the >> pattern >> name is qualified by a _type_: >> >> ``` >> Class k = ... >> switch (k) { >> // Qualified by type >> case Class.arrayClass(var componentType): ... >> } >> ``` >> >> This means that we _resolve_ the pattern `arrayClass` starting at `Class` >> and >> _select_ the pattern using the receiver, `k`. We may also be able to >> omit the >> class qualifier if the static type of the match candidate is sufficient to >> resolve the desired pattern. >> >> In the bound form, used when the receiver is distinct from the match >> candidate, >> the pattern name is qualified with an explicit _receiver expression_. As >> an >> example, consider an interface that captures primitive widening and >> narrowing >> conversions, such as those between `int` and `long`. 
In the widening >> direction, >> conversion is unconditional, so this can be modeled as a method from >> `int` to >> `long`. In the other direction, conversion is conditional, so this is >> better >> modeled as a _pattern_ whose match candidate is `long` and which binds an >> `int` >> on success. Since these are instance methods of some class (say, >> `NumericConversion`), we need to provide the receiver instance in >> order to >> resolve the pattern: >> >> ``` >> NumericConversion nc = ... >> >> switch (aLong) { >> case nc.narrowed(int i): >> ... >> } >> ``` >> >> The explicit receiver syntax would also be used if we exposed regular >> expression >> matching as a pattern on the `j.u.r.Pattern` object (the name collision on >> `Pattern` is unfortunate). Imagine we added a `matching` instance >> pattern to >> `j.u.r.Pattern`; then we could use it in `instanceof` as follows: >> >> ``` >> static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)"); >> ... >> if (aString instanceof P.matching(String as, String bs)) { ... } >> ``` >> >> Each of these use-site syntaxes is modeled after the use-site syntax for a >> method invocation or method reference. >> >> ## Declaration-site syntax >> >> To avoid being biased by the simpler cases, we're going to work all the >> cases >> concurrently rather than starting with the simpler cases and working up. >> (It >> might seem sensible to start with deconstructors, since they are the >> "easy" >> case, but if we did that, we would likely be biased by their simplicity >> and then >> find ourselves painted into a corner.) 
As our example gallery, we will >> consider: >> >> - Deconstruction pattern for `Point`; >> - Static patterns for `Optional::of` and `Optional::empty`; >> - Static pattern for "power of two" (illustrating a computation where >> success >> or failure, and computation of bindings, cannot easily be separated); >> - Instance pattern for `Class::arrayClass` (used unbound); >> - Instance pattern for `Pattern::matching` on regular expressions (used >> bound). >> >> Member patterns, like methods, have _names_. (We can think of >> constructors as >> being named for their enclosing classes, and the same for >> deconstructors.) All >> member patterns have a (possibly empty) ordered list of _bindings_, which >> are >> the dual of constructor or method parameters. Bindings, in turn, have >> names and >> types. And like constructors and methods, member patterns have a _body_ >> which >> is a block statement. Member patterns also have a _match candidate_, >> which is a >> likely-implicit method parameter. >> >> ### Member patterns as inverse methods and constructors >> >> Regardless of syntax, let us remind ourselves that deconstructors >> are the >> categorical dual to constructors (coconstructors), and pattern methods >> are the >> categorical dual to methods (comethods). They are dual in their >> structure: a >> constructor or method takes N arguments and produces a result, the >> corresponding >> member pattern consumes a match candidate and (conditionally) produces N >> bindings. >> >> Moreover, they are semantically dual: the return value produced by >> construction >> or factory invocation is the match candidate for the corresponding member >> pattern, and the bindings produced by a member pattern are the answers to >> the >> _Pattern Question_ -- "could this object have come from an invocation of >> my >> dual, and if so, with what arguments." >> >> ### What do we call them?
>> >> Given the significant overlap between methods and patterns, the first >> question >> about the declaration we need to settle is how to identify a member >> pattern >> declaration as distinct from a method or constructor declaration. >> _Towards >> Member Patterns_ tried out a syntax that recognized these as _inverse_ >> methods >> and constructors: >> >> public Point(int x, int y) { ... } >> public inverse Point(int x, int y) { ... } >> >> While this is a principled choice which clearly highlights the duality, >> and one >> that might be good for specification and verbal description, it is >> questionable >> whether this would be a great syntax for reading and writing programs. >> >> A more traditional option is to choose a "noun" (conditional) keyword, >> such as >> `pattern`, `matcher`, `extractor`, `view`, etc: >> >> public pattern Point(int x, int y) { ... } >> >> If we are using a noun keyword to identify pattern declarations, we could >> use >> the same noun for all of them, or we could choose a different one for >> deconstruction patterns: >> >> public deconstructor Point(int x, int y) { ... } >> >> Alternately, we could reach for a symbol to indicate that we are talking >> about >> an inverted member. C++ fans might suggest >> >> public ~Point(int x, int y) { ... } >> >> but this is too cryptic (it's evocative once you see it, but then it >> becomes >> less evocative as we move away from deconstructors towards instance >> patterns.) >> >> If we wish to offer finer-grained control over conditionality, we might >> additionally need a `total` / `partial` modifier, though I would prefer >> to avoid >> that. >> >> Of the keyword candidates, there is one that stands out (for good and bad) >> because it connects to something that is already in the language: >> `pattern`. 
On >> the one hand, using the term `pattern` for the declaration is a slight >> abuse; on >> the other, users will immediately connect it with "ah, so that's how I >> make a >> new pattern" or "so that's what happens when I match against this >> pattern." >> (Lisps would resolve this tension by calling it `defpattern`.) >> >> The others (`matcher`, `view`, `extractor`, etc) are all made-up terms >> that >> don't connect to anything else in the language, for better or worse. If >> we pick >> one of these, we are asking users to sort out _three_ separate new things >> in >> their heads: (use-site) patterns, (declaration-site) matchers, and the >> rules of >> how patterns and matchers are connected. Calling them both "patterns", >> despite >> the mild abuse of terminology, ties them together in a way that >> recognizes their >> connection. >> >> My personal position: `pattern` is the strongest candidate here, despite >> some >> flaws. >> >> ### Binding lists and match candidates >> >> There are two obvious alternatives for describing the binding list and >> match >> candidate of a pattern declaration, both with their roots in the >> constructor and >> method syntax: >> >> - Pretend that a pattern declaration is like a method with multiple >> return, and >> put the binding list in the "return position", and make the match >> candidate >> an ordinary parameter; >> - Lean into the inverse relationship between constructors and methods >> (and >> consistency with the use-site syntax), and put the binding list in the >> "parameter list position". For static patterns and some instance >> patterns, >> which need to explicitly identify the match candidate type, there are >> several >> sub-options: >> - Lean further into the duality, putting the match candidate type in >> the >> "return position"; >> - Put the match candidate type somewhere else, where it is less likely >> to be >> confused for a method return. 
>> >> The "method-like" approach might look like this: >> >> ``` >> class Point { >> // Constructor and deconstructor >> public Point(int x, int y) { ... } >> public pattern (int x, int y) Point(Point target) { ... } >> ... >> } >> >> class Optional { >> // Static factory and pattern >> public static Optional of(T t) { ... } >> public static pattern (T t) of(Optional target) { ... } >> ... >> } >> ``` >> >> The "inverse" approach might look like: >> >> ``` >> class Point { >> // Constructor and deconstructor >> public Point(int x, int y) { ... } >> public pattern Point(int x, int y) { ... } >> ... >> } >> >> class Optional { >> // Static factory and pattern (using the first sub-option) >> public static Optional of(T t) { ... } >> public static pattern Optional of(T t) { ... } >> ... >> } >> ``` >> >> With the "method-like" approach, the match candidate gets an explicit name >> selected by the author; with the inverse approach, we can go with a >> predefined >> name such as `that`. (Because deconstructors do not have receivers, we >> could by >> abuse of notation arrange for the keyword `this` to refer instead to the >> match >> candidate within the body of a deconstructor. While this might seem to >> lead to >> a more familiar notation for writing deconstructors, it would create a >> gratuitous asymmetry between the bodies of deconstruction patterns and >> those of >> other patterns.) >> >> Between these choices, nearly all the considerations favor the "inverse" >> approach: >> >> - The "inverse" approach makes the declaration look like the use site. >> This >> highlights that `pattern Point(int x, int y)` is what gets invoked >> when you >> match against the pattern use `Point(int x, int y)`. (This point is so >> strong that we should probably just stop here.) >> - The "inverse" members also look like their duals; the only difference >> is the >> `pattern` keyword (and possibly the placement of the match candidate >> type). 
>> This makes matched pairs much more obvious, and such matched pairs >> will be >> critical both for future language features and for library idioms. >> - The method-like approach is suggestive of multiple return or tuples, >> which is >> probably helpful for the first few minutes but actually harmful in the >> long >> term. This feature is _not_ (much as some people would like to >> believe) about >> multiple return or tuples, and playing into this misperception will >> only make >> it harder to truly understand. So this suggestion ends up propping up >> the >> wrong mental model. >> >> The main downside of the "inverse" approach is the one-time speed bump of >> the >> unfamiliarity of the inverted syntax. (The "method-like" syntax also has >> its >> own speed bumps, it is just unfamiliar in different ways.) But unlike the >> advantages of the inverse approach, which continue to add value forever, >> this >> speed bump is a one-time hurdle to get over. >> >> To smooth out the speed bumps of the inverse approach, we can consider >> moving >> the position of the match candidate for static and (suitable) instance >> pattern >> declarations, such as: >> >> ``` >> class Optional { >> // the usual static factory >> public static Optional of(T t) { ... } >> >> // Various ways of writing the corresponding pattern >> public static pattern of(T t) for Optional { ... } >> // or ... >> public static pattern(Optional) of(T t) { ... } >> // or ... >> public static pattern(Optional that) of(T t) { ... } >> // or ... >> public static pattern> of(T t) { ... } >> ... >> } >> ``` >> >> (The deconstructor example looks the same with either variant.) Of these, >> treating the match candidate like a "parameter" of "pattern" is probably >> the >> most evocative: >> >> ``` >> public static pattern(Optional that) of(T t) { ... } >> ``` >> >> as it can be read as "pattern taking the parameter `Optional that` >> called >> `of`, binding `T`", and is a short departure from the inverse syntax.
>> >> The main value of the various rearrangements is that users don't need to >> think >> about things operating in reverse to parse the syntax. This trades some >> of the >> secondary point (patterns looking almost exactly like their inverses) for >> a >> certain amount of cognitive load, while maintaining the most important >> consideration: that the declaration site look like the use site. >> >> For instance pattern declarations, if the match candidate type is the >> same as >> the receiver type, the match candidate type can be elided as it is with >> deconstructors. >> >> My personal position: the "multiple return" version is terrible; all the >> sub-variants of the inverse version are probably workable. >> >> ### Naming the match candidate >> >> We've been assuming so far that the match candidate always has a fixed >> name, >> such as `that`; this is an entirely workable approach. Some of the >> variants are >> also amenable to allowing authors to explicitly select a name for the >> match >> candidate. For example, if we put the match candidate as a "parameter" >> to the `pattern` keyword, there is an obvious place to put the name: >> >> ``` >> static pattern(Optional target) of(T t) { ... } >> ``` >> >> My personal opinion: I don't think this degree of freedom buys us much, >> and in >> the long run readability probably benefits by picking a fixed name like >> `that` >> and sticking with it. Even with a fixed name, if there is a sensible >> position >> for the name, allowing users to type `that` for explicitness is fine (as >> we do >> with instance methods, though many people don't know this.) We may even >> want to >> require it. >> >> ## Body types >> >> Just as there are two obvious approaches for the declaration, there are >> two >> obvious approaches we could take for the body (though there is some >> coupling >> between them.) We'll call the two body approaches _imperative_ and >> _functional_. 
>> >> The imperative approach treats bindings as initially-DU variables that >> must be >> DA on successful completion, getting their value through ordinary >> assignment; >> the functional approach sets all the bindings at once, positionally. >> Either >> way, member patterns (except maybe deconstructors) also need a way to >> differentiate a successful match from a failed match. >> >> Here is the `Point` deconstructor with both imperative and functional >> style. The >> functional style uses a placeholder `match` statement to indicate a >> successful >> match and provision of bindings: >> >> ``` >> class Point { >> int x, y; >> >> Point(int x, int y) { >> this.x = x; >> this.y = y; >> } >> >> // Imperative style, deconstructor always succeeds >> pattern Point(int x, int y) { >> x = that.x; >> y = that.y; >> } >> >> // Functional style >> pattern Point(int x, int y) { >> match(that.x, that.y); >> } >> } >> ``` >> >> There are some obvious differences here. In the imperative style, the >> dtor body >> looks much more like the reverse of the ctor body. The functional style >> is more >> concise (and amenable to further concision via the "concise method bodies" >> mechanism in the future), as well as a number of less obvious >> differences. For >> deconstructors, the imperative approach is likely to feel more natural >> because >> of the obvious symmetry with constructors. >> >> In reality, it is _premature at this point to have an opinion_, because we >> haven't yet seen the full scope of the problem; deconstructors are a >> special >> case in many ways, which almost surely is distorting our initial >> opinion. As we >> move towards conditional patterns (and pattern lambdas), our opinions may >> flip. >> >> Regardless of which we pick, there are some additional syntactic choices >> to be >> made -- what syntax to use to indicate success (we used `match` in the >> above >> example) or failure. 
(We should be especially careful around trying to >> reuse >> words like `return`, `break`, or `yield` because, in the case where there >> are >> zero bindings (which is allowable), it becomes unclear whether they mean >> "fail" >> or "succeed with zero bindings".) >> >> ### Success and failure >> >> Except for possibly deconstructors, which we may require to be total, a >> pattern >> declaration needs a way to indicate success and failure. In the examples >> above, >> we posited a `match` statement to indicate success in the functional >> approach, >> and in both examples leaned on the "implicit success" of deconstructors >> (under >> the assumption they always succeed). Now let's look at the more general >> case to >> figure out what else is needed. >> >> For a static pattern like `Optional::of`, success is conditional. Using >> `match-fail` as a placeholder for "the match failed", this might look like >> (functional version): >> >> ``` >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) >> match (that.get()); >> else >> match-fail; >> } >> ``` >> >> The imperative version is less pretty, though. Using `match-success` as a >> placeholder: >> >> ``` >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) { >> t = that.get(); >> match-success; >> } >> else >> match-fail; >> } >> ``` >> >> Both arms of the `if` feel excessively ceremonial here. And if we chose >> to not >> make all deconstruction patterns unconditional, deconstructors would >> likely need >> some explicit success as well: >> >> ``` >> pattern Point(int x, int y) { >> x = that.x; >> y = that.y; >> match-success; >> } >> ``` >> >> It might be tempting to try and eliminate the need for explicit success by >> inferring it from whether or not the bindings are DA or not, but this is >> error-prone, is less type-checkable, and falls apart completely for >> patterns >> with no bindings. 
>> >> ### Implicit failure in the functional approach >> >> One of the ceremonial-seeming aspects of `Optional::of` above is having >> to say >> `else match-fail`, which doesn't feel like it adds a lot of value. >> Perhaps we >> can be more concise without losing clarity. >> >> Most conditional patterns will have a predicate to determine matching, >> and then >> some conditional code to compute the bindings and claim success. Having >> to say >> "and if the predicate didn't hold, then I fail" seems like ceremony for >> the >> author and noise for the reader. Instead, if a conditional pattern falls >> off >> the end without matching, we could treat that as simply not matching: >> >> ``` >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) >> match (that.get()); >> } >> ``` >> >> This says what we mean: if the optional is present, then this pattern >> succeeds >> and bind the contents of the `Optional`. As long as our "succeed" >> construct >> strongly enough connotes that we are terminating abruptly and >> successfully, this >> code is perfectly clear. And most conditional patterns will look a lot >> like >> `Optional::of`; do some sort of test and if it succeeds, extract the >> state and >> bind it. >> >> At first glance, this "implicit fail" idiom may seem error-prone or >> sloppy. But >> after writing a few dozen patterns, one quickly tires of saying "else >> match-fail" -- and the reader doesn't necessarily appreciate reading it >> either. >> >> Implicit failure also simplifies the selection of how we explicitly >> indicate >> failure; using `return` in a pattern for "no match" becomes pretty much a >> forced >> move. We observe that (in a void method), "return" and "falling off the >> end" >> are equivalent; if "falling off the end" means "no match", then so should >> an >> explicit `return`. So in those few cases where we need to explicitly >> signal "no >> match", we can just use `return`. 
It won't come up that often, but >> here's an >> example where it does: >> >> ``` >> static pattern(int that) powerOfTwo(int exp) { >> int exp = 0; >> >> if (that < 1) >> return; // explicit fail >> >> while (that > 1) { >> if (that % 2 == 0) { >> that /= 2; >> ++exp; >> } >> else >> return; // explicit fail >> } >> match (exp); >> } >> ``` >> >> As a bonus, if `return` as match failure is a forced move, we need only >> select a >> term for "successful match" (which obviously can't be `return`). We >> could use >> `match` as we have in the examples, or a variant like `matched` or >> `matches`. >> But rather than just creating a new control operator, we have an >> opportunity to >> lean into the duality a little harder, by including the pattern syntax in >> the >> match: >> >> ``` >> matches of(that.get()); >> ``` >> >> or the (optionally?) qualified (inferring type arguments, as we do at the >> use >> site): >> >> ``` >> matches Optional.of(that.get()); >> ``` >> >> These "use the name" approaches trades a small amount of verbosity to >> gain a >> higher degree of fidelity to the pattern use site (and to evoke the >> comethod >> completion.) >> >> If we don't choose "implicit fail", we would have to invent _two_ new >> control >> flow statements to indicate "success" and "failure". >> >> My personal position: for the functional approach, implicit failure both >> makes >> the code simpler and clearer, and after you get used to it, you don't >> want to go >> back. Whether we say `match` or `matches` or `matches ` >> are all >> workable, though I like some variant that names the pattern. >> >> ### Implicit success in the imperative approach >> >> In the imperative approach, we can be implicit as well, but it feels more >> natural (at least, initially) to choose implicit success rather than >> failure. 
>> This works great for unconditional patterns: >> >> ``` >> pattern Point(int x, int y) { >> x = that.x; >> y = that.y; >> // implicit success >> } >> ``` >> >> but not quite as well for conditional patterns: >> >> ``` >> static pattern(Optional that) of(T t) { >> if (that.isPresent()) { >> t = that.get(); >> } >> else >> match-fail; >> // implicit success >> } >> ``` >> >> We can eliminate one of the arms of the if, with the more concise (but >> convoluted) inversion: >> >> ``` >> static pattern(Optional that) of(T t) { >> if (!that.isPresent()) >> match-fail; >> t = that.get(); >> // implicit success >> } >> ``` >> >> Just as with the functional approach, if we choose imperative and >> "implicit >> success", using `return` to indicate success is pretty much a forced >> move. >> >> ### Imperative is a trap >> >> If we assume that functional implies implicit failure, and imperative >> implies >> implicit success, then our choices become: >> >> ``` >> class Optional { >> public static Optional of(T t) { ... } >> >> // imperative, implicit success >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) { >> t = that.get(); >> } >> else >> match-fail; >> } >> >> // functional, implicit failure >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) >> matches of(that.get()); >> } >> } >> ``` >> >> Once we get past deconstructors, the imperative approach looks worse by >> comparison because we need to assign all the bindings (which is _O(n)_ >> assignments) _and also_ indicate success or failure somehow, whereas in >> the >> functional style all can be done together with a single `matches` >> statement. >> >> Looking at the alternatives, except maybe for unconditional patterns, the >> functional example above seems a lot more natural. The imperative >> approach >> works with deconstructors (assuming they are not conditional), but does >> not >> scale so well to conditionality -- which is the essence of patterns. 
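As a loose analogy in current Java (not the proposed syntax), the two styles resemble "out parameters plus a boolean" versus "return one result object". The sketch below assumes a hypothetical `key=value` matcher with two bindings; all names are illustrative.

```java
// Hypothetical emulation in current Java; not the proposed pattern syntax.
final class BindingStyleSketch {
    /** Functional flavor: all bindings are delivered together. */
    record KeyValue(String key, String value) {}

    /**
     * Imperative flavor: each binding is assigned separately into a
     * one-element "out" array, and success is signaled separately.
     */
    static boolean matchImperative(String that, String[] keyOut, String[] valueOut) {
        int eq = that.indexOf('=');
        if (eq <= 0) {
            return false; // no '=' or empty key: match fails
        }
        keyOut[0] = that.substring(0, eq);
        valueOut[0] = that.substring(eq + 1);
        return true; // success has to be stated in addition to the assignments
    }

    /** Functional flavor: one statement both succeeds and supplies the bindings. */
    static KeyValue matchFunctional(String that) {
        int eq = that.indexOf('=');
        if (eq > 0) {
            return new KeyValue(that.substring(0, eq), that.substring(eq + 1));
        }
        return null; // falling through plays the role of "implicit failure"
    }
}
```

The imperative variant needs one assignment per binding plus a separate success signal, while the functional variant does both at once.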
>> >> From a theoretical perspective, the method-comethod duality also gives us >> a >> forceful nudge towards the functional approach. In a method, the method >> arguments are specified as a positional list of expressions at the use >> site: >> >> m(a, b, c) >> >> and these values are invisibly copied into the parameter slots of the >> method >> prior to frame activation. The dual to that for a comethod to similarly >> convey >> the bindings in a positional list of expressions (as they must either all >> be >> produced or none), where they are copied into the slots provided at the >> use >> site, as is indicated by `matches` in the above examples. >> >> My personal position: the imperative style feels like a trap. It seems >> "obvious" at first if we start with deconstructors, but becomes >> increasingly >> difficult when we get past this case, and gets in the way of other >> opportunities. The last gasp before acceptance is the discomfort that >> dtor and >> ctor bodies are written in different styles, but in the rear-view mirror, >> this >> feels like a non-issue. >> >> ### Derive imperative from functional? >> >> If we start with "functional with implicit failure", we can possibly >> rescue >> imperative by deriving a version of imperative from functional, by >> "overloading" >> the match-success operator. >> >> If we have a pattern whose binding names are `b1..bn` of types `B1..Bn`, >> then >> the `matches` operator must take a list of expressions `e1..en` whose >> arity and >> types are compatible with `B1..Bn`. But we could allow `matches` to also >> have a >> nilary form, which would have the effect of being shorthand for >> >> matches (b1, b2, ..., bn) >> >> where each of `b1..bn` must be DA at the point of matching. This means >> that we >> could express patterns in either form: >> >> ``` >> class Optional { >> public static Optional of(T t) { ... 
} >> >> // imperative, derived from functional with implicit failure >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) { >> t = that.get(); >> matches of; >> } >> } >> >> public static pattern(Optional that) of(T t) { >> if (that.isPresent()) >> matches of(that.get()); >> } >> } >> ``` >> >> This flexibility allows users to select a more verbose expression in >> exchange >> for a clearer association of expressions and bindings, though as we'll >> see, it >> does come with some additional constraints. >> >> ### Wrapping an existing API >> >> Nearly every library has methods (sometimes sets of methods) that are >> patterns >> in disguise, such as the pair of methods `isArray` and `getComponentType` >> in >> `Class`, or the `Matcher` helper type in `java.util.regex`. Library >> maintainers >> will likely want to wrap (or replace) these with real patterns, so these >> can >> participate more effectively in conditional contexts, and in some cases, >> highlight their duality with factory methods. >> >> Matching a string against a `j.u.r.Pattern` regular expression has all >> the same >> elements as a pattern, just with an ad-hoc API (and one that I have to >> look up >> every time). But we can fairly easily wrap a true pattern around the >> existing >> API. To match against a `Pattern` today, we pass the match candidate to >> `Pattern::matcher`, which returns a `Matcher` with accessors >> `Matcher::matches` >> (did it match) and `Matcher::group` (conditionally extract a particular >> capture >> group.) If we want to wrap this with a pattern called `regexMatch`: >> >> ``` >> pattern(String that) regexMatch(String... 
groups) { >> Matcher m = this.matcher(that); >> if (m.matches()) >> matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount()) >> .mapToObj(m::group) >> .toArray(String[]::new)); >> // whole lotta matchin' goin' on >> } >> ``` >> >> This says that a `j.u.r.Pattern` has an instance pattern called `regexMatch`, >> whose >> match candidate is `String`, and which binds a varargs of `String` >> corresponding >> to the capture groups. The implementation simply delegates to the >> existing >> `j.u.r.Matcher` API. This means that `j.u.r.Pattern` becomes a sort of >> "pattern >> object", and we can use it as a receiver at the use site: >> >> ``` >> static Pattern As = Pattern.compile("(a*)"); >> static Pattern Bs = Pattern.compile("(b*)"); >> ... >> switch (string) { >> case As.regexMatch(var as): ... >> case Bs.regexMatch(var bs): ... >> ... >> } >> ``` >> >> ### Odds and ends >> >> There are a number of loose ends here. We could choose other names for >> the >> match-success and match-fail operations, including trying to reuse >> `break` or >> `yield`. But, this reuse is tricky; it must be very clear whether a >> given form >> of abrupt completion means "success" or "failure", because in the case of >> patterns with no bindings, we will have no other syntactic cues to help >> disambiguate. (I think having a single `matches`, with implicit failure >> and >> `return` meaning failure, is the sweet spot here.) >> >> Another question is whether the binding list introduces corresponding >> variables >> into the scope of the body. For imperative, the answer is "surely yes"; >> for >> functional, the answer is "maybe" (unless we want to do the trick where we >> derive imperative from functional, in which case the answer is "yes" >> again.) >> >> If the binding list does not correspond to variables in the body, this >> may be >> initially discomforting; because they do not declare program elements, >> they may >> feel that they are left "dangling".
But even if they are not declaring >> _program_ elements, they are still declaring _API_ elements (similar to >> the >> return type of a method.) We will want to provide Javadoc on the >> bindings, just >> like with parameters; we will want to match up binding names in >> deconstructors >> with parameter names in constructors; we may even someday want to support >> by-name binding at the use site (e.g., `case Foo(a: var a)`). The names >> are >> needed for all of these, just not for the body. Names still matter. My >> take >> here is that this is a transient "different is scary" reaction, one that >> we >> would get over quickly. >> >> A final question is whether we should consider unqualified names as >> implicitly >> qualified by `that` (and also `this`, for instance patterns, with some >> conflict >> resolution). Users will probably grow tired of typing `that.` all the >> time, and most of the time, the unqualified use is perfectly readable. >> >> ## Exhaustiveness >> >> There is one last syntax question in front of us: how to indicate that a >> set of >> patterns are (claimed to be) exhaustive on a given match candidate type. >> We see >> this with `Optional::of` and `Optional::empty`; it would be sad if the >> compiler >> did not realize that these two patterns together were exhaustive on >> `Optional`. >> This is not a feature that will be used often, but not having it at all >> will be >> a repeated irritant. >> >> The best I've come up with is to call these `case` patterns, where a set >> of >> `case` patterns for a given match candidate type in a given class are >> asserted >> to be an exhaustive set: >> >> ``` >> class Optional { >> static Optional of(T t) { ... } >> static Optional empty() { ... } >> >> static case pattern of(T t) for Optional { ... } >> static case pattern empty() for Optional { ... 
} >> } >> ``` >> >> Because they may not be truly exhaustive, `switch` constructs will have >> to back >> up the static assumption of exhaustiveness with a dynamic check, as we do >> for >> other sets of exhaustive patterns that may have remainder. >> >> I've experimented with variants of `sealed` but it felt more forced, so >> this is >> the best I've come up with. >> >> ## Example: patterns delegating to other patterns >> >> Pattern implementations must compose. Just as a subclass constructor >> delegates >> to a superclass constructor, the same should be true for deconstructors. >> Here's a typical superclass-subclass pair: >> >> ``` >> class A { >> private final int a; >> >> public A(int a) { this.a = a; } >> public pattern A(int a) { matches A(that.a); } >> } >> >> class B extends A { >> private final int b; >> >> public B(int a, int b) { >> super(a); >> this.b = b; >> } >> >> // Imperative style >> public pattern B(int a, int b) { >> if (that instanceof super(var aa)) { >> a = aa; >> b = that.b; >> matches B; >> } >> } >> >> // Functional style >> public pattern B(int a, int b) { >> if (that instanceof super(var a)) >> matches B(a, b); >> } >> } >> ``` >> >> (Ignore the flow analysis and totality for the time being; we'll come >> back to >> this in a separate document.) >> >> The first thing that jumps out at us is that, in the imperative version, >> we had >> to create a "garbage" variable `aa` to receive the binding, because `a` >> was >> already in scope, and then we have to copy the garbage variable into the >> real >> binding variable. Users will surely balk at this, and rightly so. In the >> functional version (depending on the choices from "Odds and Ends") we are >> free >> to use the more natural name and avoid the roundabout locution. 
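The delegation idea can be approximated in current Java (Java 21 for the record pattern) by letting each class expose a method that returns a carrier record. This is only a sketch under that assumption; `DelegationSketch`, `deconstructA`, `deconstructB` and the carrier records are hypothetical names, not part of the proposal.

```java
// Hypothetical emulation; requires Java 21 for the record pattern in sum().
final class DelegationSketch {
    static class A {
        private final int a;

        A(int a) { this.a = a; }

        record ABindings(int a) {}

        /** Plays the role of the deconstructor of A. */
        ABindings deconstructA() { return new ABindings(a); }
    }

    static class B extends A {
        private final int b;

        B(int a, int b) {
            super(a);
            this.b = b;
        }

        record BBindings(int a, int b) {}

        /** Delegates to the superclass "deconstructor", then adds its own state. */
        BBindings deconstructB() {
            A.ABindings fromSuper = deconstructA();
            return new BBindings(fromSuper.a(), b);
        }
    }

    /** Use site: destructure the carrier with a record pattern. */
    static int sum(B candidate) {
        if (candidate.deconstructB() instanceof B.BBindings(int a, int b)) {
            return a + b;
        }
        throw new IllegalStateException("deconstructB never returns null");
    }
}
```

As in the functional style above, the subclass passes the superclass binding straight through in a single expression, without an intermediate copy into a second variable.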
>> >> We might be tempted to fix the "garbage variable" problem by inventing >> another >> sub-feature: the ability to use an existing variable as the target of a >> binding, >> such as: >> >> ``` >> pattern Point(int a, int b) { >> if (this instanceof A(__bind a)) >> b = this.b; >> } >> ``` >> >> But, I think the language is stronger without this feature, for two >> reasons. >> First, having to reason about whether a pattern match introduces a new >> binding >> or assigns to an existing variable is additional cognitive load for >> users, and second, having assignment to locals happening through >> something other than assignment introduces additional complexity in >> finding >> where a variable is modified. While we can argue about the general >> utility of >> this feature, bringing it in just to solve the garbage-variable problem is >> particularly unattractive. >> >> ## Pattern lambdas >> >> One final consideration is that patterns may also have a lambda form. >> Given >> a single-abstract-pattern (SAP) interface: >> >> ``` >> interface Converter<T, U> { >> pattern(T t) convert(U u); >> } >> ``` >> >> one can implement such a pattern with a lambda. Such a lambda has one >> parameter >> (the match candidate), and its body looks like the body of a declared >> pattern: >> >> ``` >> Converter c = >> i -> { >> if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) >> matches Converter.convert((short) i); >> }; >> ``` >> >> Because the bindings of the pattern lambda are defined in the interface, >> not in >> the lambda, this is one more reason not to like the imperative version: >> it is >> brittle, and alpha-renaming bindings in the interface would be a >> source-incompatible change.
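For comparison, a pattern lambda can be loosely emulated today with an ordinary functional interface whose result is an `Optional`. This sketch assumes that encoding; `PatternLambdaSketch` and `NARROW` are hypothetical names, and `Optional` merely stands in for match/no-match.

```java
import java.util.Optional;

// Hypothetical emulation in current Java; not the proposed pattern-lambda syntax.
final class PatternLambdaSketch {
    /** Stand-in for a SAP interface: T is the match candidate, U the binding. */
    @FunctionalInterface
    interface Converter<T, U> {
        Optional<U> convert(T candidate);
    }

    /** Narrowing from int to short, which only matches when the value fits. */
    static final Converter<Integer, Short> NARROW = i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) {
            return Optional.of((short) i.intValue()); // "matches" with one binding
        }
        return Optional.empty(); // no match
    };
}
```

Note that the lambda body never mentions a binding name; the binding is described only by the interface, which mirrors the argument made above about renaming.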
>> >> ## Example gallery >> >> Here are all the pattern examples so far, and a few more, using the >> suggested >> style (functional, implicit fail, implicit `that`-qualification): >> >> ``` >> // Point dtor >> pattern Point(int x, int y) { >> matches Point(x, y); >> } >> >> // Optional -- static patterns for Optional::of, Optional::empty >> static case pattern(Optional that) of(T t) { >> if (isPresent()) >> matches of(t); >> } >> >> static case pattern(Optional that) empty() { >> if (!isPresent()) >> matches empty(); >> } >> >> // Class -- instance pattern for arrayClass (match candidate type >> inferred) >> pattern arrayClass(Class componentType) { >> if (that.isArray()) >> matches arrayClass(that.getComponentType()); >> } >> >> // regular expression -- instance pattern in j.u.r.Pattern >> pattern(String that) regexMatch(String... groups) { >> Matcher m = matcher(that); >> if (m.matches()) >> matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount()) >> .mapToObj(m::group) >> .toArray(String[]::new)); >> } >> >> // power of two (somewhere) >> static pattern(int that) powerOfTwo(int exp) { >> int exp = 0; >> >> if (that < 1) >> return; >> >> while (that > 1) { >> if (that % 2 == 0) { >> that /= 2; >> exp++; >> } >> else >> return; >> } >> matches powerOfTwo(exp); >> } >> ``` >> >> ## Closing thoughts >> >> I came out of this exploration with very different conclusions than I >> expected >> when going in. At first, the "inverse" syntax seemed stilted, but over >> time it >> started to seem more obvious. Similarly, I went in expecting to prefer >> the >> imperative approach for the body, but over time, started to warm to the >> functional approach, and eventually concluded it was basically a forced >> move if >> we want to support more than just deconstructors. And I started out >> skeptical >> of "implicit fail", but after writing a few dozen patterns with it, going >> back >> to fully explicit felt painful.
All of this is to say, you should hold >> your >> initial opinions at arm's length, and give the alternatives a chance to >> sink in. >> >> For most _conditional_ patterns (and conditionality is at the heart of >> pattern >> matching), the functional approach cleanly highlights both the match >> predicate >> and the flow of values, and is considerably less fussy than the imperative >> approach in the same situation; `Optional::of`, `Class::arrayClass`, and >> `regexMatch` >> look great here, much better than they would with imperative. None of >> these >> illustrate delegation, but in the presence of delegation, the gap gets >> even >> wider. >> >> >

From forax at univ-mlv.fr Wed Apr 3 10:21:35 2024 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 3 Apr 2024 12:21:35 +0200 (CEST) Subject: Member Patterns -- the bikeshed In-Reply-To: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> Message-ID: <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr>

Hello,

I think it is also interesting, instead of starting from deconstruction and then trying to expand, to go the other way: start with a pattern backed by a method, and then see deconstruction as a special case of a pattern backed by a method. So instead of using a top-down approach, try a bottom-up approach.

Let's take Optional as a first example. Optional is defined like this:

    public final class Optional<T> {
      private T value;

      private Optional(T value) {
        this.value = value;
      }

      public static <T> Optional<T> empty() {
        return new Optional<>(null);
      }

      public static <T> Optional<T> of(T value) {
        Objects.requireNonNull(value);
        return new Optional<>(value);
      }
    }

This is a final class, so obviously it cannot be matched with a record pattern, but we can have methods that view an instance of Optional as a record.
For example, we can have a method "asPresent" that returns a record with one component containing the value, if the value inside the Optional is present. I'm using the prefix "as" here because this is the one commonly used in Java (it also has the same semantics as the keyword "as" in C#/Kotlin). The return value of the method asPresent() is a record, here named "$CarrierPresent".

So we get something like this at declaration site:

    public /*value*/ record $CarrierPresent<T>(T value) {}

    public $CarrierPresent<T> asPresent() {
      if (value == null) {
        return null;
      }
      return new $CarrierPresent<>(value);
    }

and at call site, we can use "when" + instanceof inside a switch like this:

    var optional = ..
    var result = switch (optional) {
      case Optional opt when opt.asPresent() instanceof Optional.$CarrierPresent(String s) -> ...
      ...
    };

In the same way, we also want a method "asEmpty()" defined like this:

    public /*value*/ record $CarrierEmpty() {}

    public $CarrierEmpty asEmpty() {
      if (value != null) {
        return null;
      }
      return new $CarrierEmpty();
    }

so we can switch on both asPresent() and asEmpty() like this:

    var optional = ..
    var result = switch (optional) {
      case Optional opt when opt.asPresent() instanceof Optional.$CarrierPresent(String s) -> ...
      case Optional opt when opt.asEmpty() instanceof Optional.$CarrierEmpty() -> ...
      default -> throw new MatchException("boom !", null);
    };

So we are able to write the code in today's Java, which means that adding a pattern that calls a method can be seen as an exercise in adding syntactic sugar. Brian, yes, I'm well aware of the shortcomings of that approach, but the idea here is to try to meet you in the middle.

At use site, we want to simplify the code to write something like:

    var optional = ...
    var result = switch (optional) {
      case Optional.asPresent(String s) -> ...
      case Optional.asEmpty() -> ...
      ...
    };

We can note that the '.'
just after the type (here Optional) does not work exactly like a method call. In the example, this is *not* a static method call; it is more a reference to a method, which happens to be an instance method in our case. As Brian said, as with the semantics of '::', we can expect to reference an instance method, a static method or a bound method. Imagine a static method "asInteger" declared in String that works like parseInt but returns null instead of throwing an exception, and a method asInteger() on a Matcher that calls String.asInteger() on matcher.group(1):

instance method:
    OptionalInt.asPresent(int v) -> ...
static method:
    String.asInteger(int v) -> ...
bound method:
    matcher.asInteger(int v) -> ...

Like '::', I think we should not allow a method call without a prefix, because it is not clear whether the prefix should be the instance switched upon or the current "this" in that case. So as a user, if you want to bind "this", you will have to write it explicitly.

The method referenced by a method pattern can be an abstract method, an instance method, a static method, a default method, a varargs method, etc. What is important is that this method must return either a record or something that can be deconstructed as a record.

At declaration site, we have to decide two things: how to represent a record without requiring users to declare one, and how to represent match/no-match, i.e. whether returning null (or any other signal for no match) should be hidden or not. I see no reason to couple those two points, so they can be seen as two different features.

How to declare a carrier on the spot?

Technically, it does not have to be a record; it has to be something that describes all its components at runtime. Let's call it a carrier. So it should be something like this for asPresent and asEmpty:

    public carrier (T value) asPresent() { ... }
    public carrier () asEmpty() { ...
    }

We can note that carrier can be a keyword on the method itself, a keyword on the return type of the method, or even a way to define a type, e.g. carrier(int a, int b) corresponds to the type of an instance of the product of int a and int b. Another question is whether, if carrier is a way to create a carrier type, it can be used in places other than just as a return type.

How to create a carrier?

For asPresent(), we want to be able to express that a carrier can either represent several values or no match. This can be done either by using a pair of keywords like match/nomatch:

    public carrier(T value) asPresent() {
      if (value == null) {
        return nomatch;
      }
      return match (value);
    }

or by using null and reusing the keyword carrier after "new", so the syntax looks like the instantiation of a normal class:

    public carrier(T value) asPresent() {
      if (value == null) {
        return null;
      }
      return new carrier (value);
    }

With that in mind, we can introduce a second example, destructuring Map.Entry. If we add a method "asEntry", with today's Java we can write it like this:

    public interface Map<K, V> {
      public interface Entry<K, V> {
        public /*value*/ record $Carrier<K, V>(K key, V value) {}

        abstract $Carrier<K, V>/*!*/ asEntry();
      }

      public static <K, V> Map.Entry<K, V> entry(K key, V value) { ... }
    }

Using a carrier, Entry can be written like this:

    public interface Entry<K, V> {
      abstract carrier!(K key, V value) asEntry();
    }

You can notice that if we introduce '!' and '?' in the future, they can be used to indicate that the match is total, and perhaps this means that returning null should be explicit and not implicit.

And pattern matching looks like this:

    var entry = ...
    var result = switch (entry) {
      case Map.Entry.asEntry(String key, Integer value) -> ...
    };

Unlike in Java, in the example above Map.Entry is defined as a functional interface, so because asEntry() is a method, a lambda can be used like this:

    Map.Entry<String, String> entry = () -> new carrier("foo", "bar");

The last example is the class Point and how to specify a deconstructor.
Again, with today's Java we can write:

    class Point {
      private int x, y;

      public Point(int x, int y) {
        this.x = x;
        this.y = y;
      }

      public /*value*/ record $Carrier(int x, int y) {}

      public static $Carrier/*!*/ deconstructor(Point that) {
        Objects.requireNonNull(that);
        return new $Carrier(that.x, that.y);
      }
    }

Note that here, we want the method to be static so it cannot be overridden by subclasses.

Or, using the carrier syntax:

    public static carrier(int x, int y) deconstructor(Point that) {
      Objects.requireNonNull(that);
      return new carrier(that.x, that.y);
    }

Here, we can make "deconstructor" a local keyword. In that case, the compiler will verify that it returns a carrier and that it cannot return null.

Another interesting thing about the carrier notation is that it makes the syntax for destructuring an object quite obvious:

    var point = ...
    carrier(int x, int y) = point;

I think that by not starting from the deconstructor, the notion of inverse methods makes less sense. I think that the notion of carrier / carrier type is less disruptive than the notion of member patterns.

regards,
Rémi

> From: "Brian Goetz" > To: "amber-spec-experts" > Sent: Friday, March 29, 2024 10:58:54 PM > Subject: Member Patterns -- the bikeshed > We now come to the long-awaited bikeshed discussion on what member patterns > should look like. > Bikeshed disclaimer for EG: > - This is likely to evoke strong opinions, so please take pains to be especially > constructive > - Long reply-to-reply threads should be avoided even more than usual > - Holistic, considered replies preferred > - Please change subject line if commenting on a sub-topic or tangential > concern > Special reminders for Remi: > - Use of words like "should", "must", "shouldn't", "mistake", "wrong", "broken" > are strictly forbidden. > - If in doubt, ask questions first. > Notes for external observers: > - This is a working document for the EG; the discussion may continue for a > while before there is an official proposal.
Please be patient. > # Pattern declaration: the bikeshed > We've largely identified the model for what kinds of patterns we need to > express, but there are still several degrees of freedom in the syntax. > As the model has simplified during the design process, the space of syntax > choices has been pruned back, which is a good thing. However, there are still > quite a few smaller decisions to be made. Not all of the considerations are > orthogonal, so while they are presented individually, this is not a "pick one > from each column" menu. > Some of these simplifications include: > - Patterns with "input arguments" have been removed; another way to get to what > this gave us may come back in another form. > - I have grown increasingly skeptical of the value of the imperative `match` > statement. With better totality analysis, I think it can be eliminated. > We can discuss these separately but I would like to sync first on the broad > strokes for how patterns are expressed. > ## Object model requirements > As outlined in "Towards Member Patterns", the basic model is that patterns are > the dual of other executable members (constructors, static methods, instance > methods.) While they are like methods in that they have inputs, outputs, names, > and an imperative body, they have additional degrees of freedom that > constructors and methods lack: > - Patterns are, in general, _conditional_ (they can succeed or fail), and only > produce bindings (outputs) when they succeed. This conditionality is > understood by the language's flow analysis, and is used for computing scoping > and definite assignment. > - Methods can return at most one value; when a pattern completes successfully, > it may bind multiple values. > - All patterns have a _match candidate_, which is a distinguished, > possibly-implicit parameter. Some patterns also have a receiver, which is > also a distinguished, possibly-implicit parameter. 
In some such cases the > receiver and match candidate are aliased, but in others these may refer to > different objects. > So a pattern is a named executable member that takes a _match candidate_ as a > possibly-implicit parameter, maybe takes a receiver as an implicit parameter, > and has zero or more conditional _bindings_. Its body can perform imperative > computation, and can terminate either with match failure or success. In the > success case, it must provide a value for each binding. > Deconstruction patterns are special in many of the same ways constructors are: > they are constrained in their name, inheritance, and probably their > conditionality (they should probably always succeed). Just as the syntax for > constructors differs slightly from that of instance methods, the syntax for > deconstructors may differ slightly from that of instance patterns. Static > patterns, like static methods, have no receiver and do not have access to the > type parameters of the enclosing class. > Like constructors and methods, patterns can be overloaded, but in accordance > with their duality to constructors and methods, the overloading happens on the > _bindings_, not the inputs. > ## Use-site syntax > There are several kinds of type-driven patterns built into the language: type > patterns and record patterns. A type pattern in a `switch` looks like: > case String s: ... > And a record pattern looks like: > case MyRecord(P1, P2, ...): ... > where `P1..Pn` are nested patterns that are recursively matched to the > components of the record. This use-site syntax for record patterns was chosen > for its similarity to the construction syntax, to highlight that a record > pattern is the dual of record construction. 
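The two built-in forms described above can be tried directly on Java 21 or later. The following is a small sketch; `UseSiteSketch`, `Point` and `Line` are illustrative names, and the arrow form of `case` is used instead of the colon form shown in the text.

```java
// Runnable on Java 21 or later; Point and Line are illustrative records.
final class UseSiteSketch {
    record Point(int x, int y) {}

    record Line(Point start, Point end) {}

    static String describe(Object o) {
        return switch (o) {
            // Type pattern: matches any String and binds it to s.
            case String s -> "String of length " + s.length();
            // Record pattern whose components are matched by nested record patterns.
            case Line(Point(var x1, var y1), Point(var x2, var y2)) ->
                    "Line from (" + x1 + "," + y1 + ") to (" + x2 + "," + y2 + ")";
            // Record pattern whose components are matched by nested type patterns.
            case Point(int x, int y) -> "Point(" + x + "," + y + ")";
            default -> "something else";
        };
    }
}
```

The `Line` case reads like the constructor call `new Line(new Point(...), new Point(...))` run in reverse, which is the similarity to construction syntax that the text refers to.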
> **Deconstruction patterns.** The simplest kind of member pattern, a > deconstruction pattern, will have the same use-site syntax as a record pattern; > record patterns can be thought of as a deconstruction pattern "acquired for > free" by records, just as records do with constructors, accessors, object > methods, etc. So the use of a deconstruction pattern for `Point` looks like: > case Point(var x, var y): ... > whether `Point` is a record or an ordinary class equipped with a suitable > deconstruction pattern. > **Static patterns.** Continuing with the idea that the destructuring syntax > should evoke the aggregation syntax, there is an obvious candidate for the > use-site syntax for static patterns: > case Optional.of(var e): ... > case Optional.empty(): ... > **Instance patterns.** Uses of instance patterns will likely come in two forms, > analogous to bound and unbound instance method references, depending on whether > the receiver and the match candidate are the same object. In the unbound form, > used when the receiver is the same object as the match candidate, the pattern > name is qualified by a _type_: > ``` > Class k = ... > switch (k) { > // Qualified by type > case Class.arrayClass(var componentType): ... > } > ``` > This means that we _resolve_ the pattern `arrayClass` starting at `Class` and > _select_ the pattern using the receiver, `k`. We may also be able to omit the > class qualifier if the static type of the match candidate is sufficient to > resolve the desired pattern. > In the bound form, used when the receiver is distinct from the match candidate, > the pattern name is qualified with an explicit _receiver expression_. As an > example, consider an interface that captures primitive widening and narrowing > conversions, such as those between `int` and `long`. In the widening direction, > conversion is unconditional, so this can be modeled as a method from `int` to > `long`. 
In the other direction, conversion is conditional, so this is better > modeled as a _pattern_ whose match candidate is `long` and which binds an `int` > on success. Since these are instance methods of some class (say, > `NumericConversion`), we need to provide the receiver instance in order to > resolve the pattern: > ``` > NumericConversion nc = ... > switch (aLong) { > case nc.narrowed(int i): > ... > } > ``` > The explicit receiver syntax would also be used if we exposed regular expression > matching as a pattern on the `j.u.r.Pattern` object (the name collision on > `Pattern` is unfortunate). Imagine we added a `matching` instance pattern to > `j.u.r.Pattern`; then we could use it in `instanceof` as follows: > ``` > static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)"); > ... > if (aString instanceof P.matching(String as, String bs)) { ... } > ``` > Each of these use-site syntaxes is modeled after the use-site syntax for a > method invocation or method reference. > ## Declaration-site syntax > To avoid being biased by the simpler cases, we're going to work all the cases > concurrently rather than starting with the simpler cases and working up. (It > might seem sensible to start with deconstructors, since they are the "easy" > case, but if we did that, we would likely be biased by their simplicity and then > find ourselves painted into a corner.) As our example gallery, we will consider: > - Deconstruction pattern for `Point`; > - Static patterns for `Optional::of` and `Optional::empty`; > - Static pattern for "power of two" (illustrating a computation where success > or failure, and computation of bindings, cannot easily be separated); > - Instance pattern for `Class::arrayClass` (used unbound); > - Instance pattern for `Pattern::matching` on regular expressions (used bound). > Member patterns, like methods, have _names_. (We can think of constructors as > being named for their enclosing classes, and the same for deconstructors.)
All > member patterns have a (possibly empty) ordered list of _bindings_, which are > the dual of constructor or method parameters. Bindings, in turn, have names and > types. And like constructors and methods, member patterns have a _body_ which > is a block statement. Member patterns also have a _match candidate_, which is a > likely-implicit method parameter. > ### Member patterns as inverse methods and constructors > Regardless of syntax, let us remind ourselves that deconstructors are the > categorical dual to constructors (coconstructors), and pattern methods are the > categorical dual to methods (comethods). They are dual in their structure: a > constructor or method takes N arguments and produces a result; the corresponding > member pattern consumes a match candidate and (conditionally) produces N > bindings. > Moreover, they are semantically dual: the return value produced by construction > or factory invocation is the match candidate for the corresponding member > pattern, and the bindings produced by a member pattern are the answers to the > _Pattern Question_ -- "could this object have come from an invocation of my > dual, and if so, with what arguments." > ### What do we call them? > Given the significant overlap between methods and patterns, the first question > about the declaration we need to settle is how to identify a member pattern > declaration as distinct from a method or constructor declaration. _Towards > Member Patterns_ tried out a syntax that recognized these as _inverse_ methods > and constructors: > public Point(int x, int y) { ... } > public inverse Point(int x, int y) { ... } > While this is a principled choice which clearly highlights the duality, and one > that might be good for specification and verbal description, it is questionable > whether this would be a great syntax for reading and writing programs.
> A more traditional option is to choose a "noun" (conditional) keyword, such as > `pattern`, `matcher`, `extractor`, `view`, etc: > public pattern Point(int x, int y) { ... } > If we are using a noun keyword to identify pattern declarations, we could use > the same noun for all of them, or we could choose a different one for > deconstruction patterns: > public deconstructor Point(int x, int y) { ... } > Alternately, we could reach for a symbol to indicate that we are talking about > an inverted member. C++ fans might suggest > public ~Point(int x, int y) { ... } > but this is too cryptic (it's evocative once you see it, but then it becomes > less evocative as we move away from deconstructors towards instance patterns.) > If we wish to offer finer-grained control over conditionality, we might > additionally need a `total` / `partial` modifier, though I would prefer to avoid > that. > Of the keyword candidates, there is one that stands out (for good and bad) > because it connects to something that is already in the language: `pattern`. On > the one hand, using the term `pattern` for the declaration is a slight abuse; on > the other, users will immediately connect it with "ah, so that's how I make a > new pattern" or "so that's what happens when I match against this pattern." > (Lisps would resolve this tension by calling it `defpattern`.) > The others (`matcher`, `view`, `extractor`, etc) are all made-up terms that > don't connect to anything else in the language, for better or worse. If we pick > one of these, we are asking users to sort out _three_ separate new things in > their heads: (use-site) patterns, (declaration-site) matchers, and the rules of > how patterns and matchers are connected. Calling them both "patterns", despite > the mild abuse of terminology, ties them together in a way that recognizes their > connection. > My personal position: `pattern` is the strongest candidate here, despite some > flaws. 
> ### Binding lists and match candidates > There are two obvious alternatives for describing the binding list and match > candidate of a pattern declaration, both with their roots in the constructor and > method syntax: > - Pretend that a pattern declaration is like a method with multiple return, and > put the binding list in the "return position", and make the match candidate > an ordinary parameter; > - Lean into the inverse relationship between constructors and methods (and > consistency with the use-site syntax), and put the binding list in the > "parameter list position". For static patterns and some instance patterns, > which need to explicitly identify the match candidate type, there are several > sub-options: > - Lean further into the duality, putting the match candidate type in the > "return position"; > - Put the match candidate type somewhere else, where it is less likely to be > confused for a method return. > The "method-like" approach might look like this: > ``` > class Point { > // Constructor and deconstructor > public Point(int x, int y) { ... } > public pattern (int x, int y) Point(Point target) { ... } > ... > } > class Optional { > // Static factory and pattern > public static Optional of(T t) { ... } > public static pattern (T t) of(Optional target) { ... } > ... > } > ``` > The "inverse" approach might look like: > ``` > class Point { > // Constructor and deconstructor > public Point(int x, int y) { ... } > public pattern Point(int x, int y) { ... } > ... > } > class Optional { > // Static factory and pattern (using the first sub-option) > public static Optional of(T t) { ... } > public static pattern Optional of(T t) { ... } > ... > } > ``` > With the "method-like" approach, the match candidate gets an explicit name > selected by the author; with the inverse approach, we can go with a predefined > name such as `that`. 
(Because deconstructors do not have receivers, we could by > abuse of notation arrange for the keyword `this` to refer instead to the match > candidate within the body of a deconstructor. While this might seem to lead to > a more familiar notation for writing deconstructors, it would create a > gratuitous asymmetry between the bodies of deconstruction patterns and those of > other patterns.) > Between these choices, nearly all the considerations favor the "inverse" > approach: > - The "inverse" approach makes the declaration look like the use site. This > highlights that `pattern Point(int x, int y)` is what gets invoked when you > match against the pattern use `Point(int x, int y)`. (This point is so > strong that we should probably just stop here.) > - The "inverse" members also look like their duals; the only difference is the > `pattern` keyword (and possibly the placement of the match candidate type). > This makes matched pairs much more obvious, and such matched pairs will be > critical both for future language features and for library idioms. > - The method-like approach is suggestive of multiple return or tuples, which is > probably helpful for the first few minutes but actually harmful in the long > term. This feature is _not_ (much as some people would like to believe) about > multiple return or tuples, and playing into this misperception will only make > it harder to truly understand. So this suggestion ends up propping up the > wrong mental model. > The main downside of the "inverse" approach is the one-time speed bump of the > unfamiliarity of the inverted syntax. (The "method-like" syntax also has its > own speed bumps, it is just unfamiliar in different ways.) But unlike the > advantages of the inverse approach, which continue to add value forever, this > speed bump is a one-time hurdle to get over. 
> To smooth out the speed bumps of the inverse approach, we can consider moving > the position of the match candidate for static and (suitable) instance pattern > declarations, such as: > ``` > class Optional { > // the usual static factory > public static Optional of(T t) { ... } > // Various ways of writing the corresponding pattern > public static pattern of(T t) for Optional { ... } > // or ... > public static pattern(Optional) of(T t) { ... } > // or ... > public static pattern(Optional that) of(T t) { ... } > // or ... > public static pattern<Optional<T>> of(T t) { ... } > ... > } > ``` > (The deconstructor example looks the same with either variant.) Of these, > treating the match candidate like a "parameter" of "pattern" is probably the > most evocative: > ``` > public static pattern(Optional that) of(T t) { ... } > ``` > as it can be read as "pattern taking the parameter `Optional that` called > `of`, binding `T`", and is a short departure from the inverse syntax. > The main value of the various rearrangements is that users don't need to think > about things operating in reverse to parse the syntax. This trades away some of the > secondary point (patterns looking almost exactly like their inverses) for a > reduction in cognitive load, while maintaining the most important > consideration: that the declaration site look like the use site. > For instance pattern declarations, if the match candidate type is the same as > the receiver type, the match candidate type can be elided as it is with > deconstructors. > My personal position: the "multiple return" version is terrible; all the > sub-variants of the inverse version are probably workable. > ### Naming the match candidate > We've been assuming so far that the match candidate always has a fixed name, > such as `that`; this is an entirely workable approach. Some of the variants are > also amenable to allowing authors to explicitly select a name for the match > candidate.
For example, if we put the match candidate as a "parameter" to the > `pattern` keyword, there is an obvious place to put the name: > ``` > static pattern(Optional target) of(T t) { ... } > ``` > My personal opinion: I don't think this degree of freedom buys us much, and in > the long run readability probably benefits by picking a fixed name like `that` > and sticking with it. Even with a fixed name, if there is a sensible position > for the name, allowing users to type `that` for explicitness is fine (as we do > with instance methods, though many people don't know this.) We may even want to > require it. > ## Body types > Just as there are two obvious approaches for the declaration, there are two > obvious approaches we could take for the body (though there is some coupling > between them.) We'll call the two body approaches _imperative_ and > _functional_. > The imperative approach treats bindings as initially-DU variables that must be > DA on successful completion, getting their value through ordinary assignment; > the functional approach sets all the bindings at once, positionally. Either > way, member patterns (except maybe deconstructors) also need a way to > differentiate a successful match from a failed match. > Here is the `Point` deconstructor with both imperative and functional style. The > functional style uses a placeholder `match` statement to indicate a successful > match and provision of bindings: > ``` > class Point { > int x, y; > Point(int x, int y) { > this.x = x; > this.y = y; > } > // Imperative style, deconstructor always succeeds > pattern Point(int x, int y) { > x = that.x; > y = that.y; > } > // Functional style > pattern Point(int x, int y) { > match(that.x, that.y); > } > } > ``` > There are some obvious differences here. In the imperative style, the dtor body > looks much more like the reverse of the ctor body. 
The functional style is more > concise (and amenable to further concision via the "concise method bodies" > mechanism in the future), as well as a number of less obvious differences. For > deconstructors, the imperative approach is likely to feel more natural because > of the obvious symmetry with constructors. > In reality, it is _premature at this point to have an opinion_, because we > haven't yet seen the full scope of the problem; deconstructors are a special > case in many ways, which almost surely is distorting our initial opinion. As we > move towards conditional patterns (and pattern lambdas), our opinions may flip. > Regardless of which we pick, there are some additional syntactic choices to be > made -- what syntax to use to indicate success (we used `match` in the above > example) or failure. (We should be especially careful around trying to reuse > words like `return`, `break`, or `yield` because, in the case where there are > zero bindings (which is allowable), it becomes unclear whether they mean "fail" > or "succeed with zero bindings".) > ### Success and failure > Except for possibly deconstructors, which we may require to be total, a pattern > declaration needs a way to indicate success and failure. In the examples above, > we posited a `match` statement to indicate success in the functional approach, > and in both examples leaned on the "implicit success" of deconstructors (under > the assumption they always succeed). Now let's look at the more general case to > figure out what else is needed. > For a static pattern like `Optional::of`, success is conditional. Using > `match-fail` as a placeholder for "the match failed", this might look like > (functional version): > ``` > public static pattern(Optional that) of(T t) { > if (that.isPresent()) > match (that.get()); > else > match-fail; > } > ``` > The imperative version is less pretty, though. 
Using `match-success` as a > placeholder: > ``` > public static pattern(Optional that) of(T t) { > if (that.isPresent()) { > t = that.get(); > match-success; > } > else > match-fail; > } > ``` > Both arms of the `if` feel excessively ceremonial here. And if we chose to not > make all deconstruction patterns unconditional, deconstructors would likely need > some explicit success as well: > ``` > pattern Point(int x, int y) { > x = that.x; > y = that.y; > match-success; > } > ``` > It might be tempting to try and eliminate the need for explicit success by > inferring it from whether or not the bindings are DA, but this is > error-prone, is less type-checkable, and falls apart completely for patterns > with no bindings. > ### Implicit failure in the functional approach > One of the ceremonial-seeming aspects of `Optional::of` above is having to say > `else match-fail`, which doesn't feel like it adds a lot of value. Perhaps we > can be more concise without losing clarity. > Most conditional patterns will have a predicate to determine matching, and then > some conditional code to compute the bindings and claim success. Having to say > "and if the predicate didn't hold, then I fail" seems like ceremony for the > author and noise for the reader. Instead, if a conditional pattern falls off > the end without matching, we could treat that as simply not matching: > ``` > public static pattern(Optional that) of(T t) { > if (that.isPresent()) > match (that.get()); > } > ``` > This says what we mean: if the optional is present, then this pattern succeeds > and binds the contents of the `Optional`. As long as our "succeed" construct > strongly enough connotes that we are terminating abruptly and successfully, this > code is perfectly clear. And most conditional patterns will look a lot like > `Optional::of`; do some sort of test and if it succeeds, extract the state and > bind it. > At first glance, this "implicit fail" idiom may seem error-prone or sloppy.
But > after writing a few dozen patterns, one quickly tires of saying "else > match-fail" -- and the reader doesn't necessarily appreciate reading it either. > Implicit failure also simplifies the selection of how we explicitly indicate > failure; using `return` in a pattern for "no match" becomes pretty much a forced > move. We observe that (in a void method), "return" and "falling off the end" > are equivalent; if "falling off the end" means "no match", then so should an > explicit `return`. So in those few cases where we need to explicitly signal "no > match", we can just use `return`. It won't come up that often, but here's an > example where it does: > ``` > static pattern(int that) powerOfTwo(int exp) { > int exp = 0; > if (that < 1) > return; // explicit fail > while (that > 1) { > if (that % 2 == 0) { > that /= 2; > ++exp; > } > else > return; // explicit fail > } > match (exp); > } > ``` > As a bonus, if `return` as match failure is a forced move, we need only select a > term for "successful match" (which obviously can't be `return`). We could use > `match` as we have in the examples, or a variant like `matched` or `matches`. > But rather than just creating a new control operator, we have an opportunity to > lean into the duality a little harder, by including the pattern syntax in the > match: > ``` > matches of(that.get()); > ``` > or the (optionally?) qualified (inferring type arguments, as we do at the use > site): > ``` > matches Optional.of(that.get()); > ``` > These "use the name" approaches trade a small amount of verbosity to gain a > higher degree of fidelity to the pattern use site (and to evoke the comethod > completion.) > If we don't choose "implicit fail", we would have to invent _two_ new control > flow statements to indicate "success" and "failure". > My personal position: for the functional approach, implicit failure both makes > the code simpler and clearer, and after you get used to it, you don't want to go > back.
Whether we say `match`, `matches`, or `matches <pattern-name>`, all are > workable, though I like some variant that names the pattern. > ### Implicit success in the imperative approach > In the imperative approach, we can be implicit as well, but it feels more > natural (at least, initially) to choose implicit success rather than failure. > This works great for unconditional patterns: > ``` > pattern Point(int x, int y) { > x = that.x; > y = that.y; > // implicit success > } > ``` > but not quite as well for conditional patterns: > ``` > static pattern(Optional that) of(T t) { > if (that.isPresent()) { > t = that.get(); > } > else > match-fail; > // implicit success > } > ``` > We can eliminate one of the arms of the if, with the more concise (but > convoluted) inversion: > ``` > static pattern(Optional that) of(T t) { > if (!that.isPresent()) > match-fail; > t = that.get(); > // implicit success > } > ``` > Just as with the functional approach, if we choose imperative and "implicit > success", using `return` to indicate success is pretty much a forced move. > ### Imperative is a trap > If we assume that functional implies implicit failure, and imperative implies > implicit success, then our choices become: > ``` > class Optional { > public static Optional of(T t) { ... } > // imperative, implicit success > public static pattern(Optional that) of(T t) { > if (that.isPresent()) { > t = that.get(); > } > else > match-fail; > } > // functional, implicit failure > public static pattern(Optional that) of(T t) { > if (that.isPresent()) > matches of(that.get()); > } > } > ``` > Once we get past deconstructors, the imperative approach looks worse by > comparison because we need to assign all the bindings (which is _O(n)_ > assignments) _and also_ indicate success or failure somehow, whereas in the > functional style all can be done together with a single `matches` statement.
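The contrast between the two body styles can be approximated in today's Java, without any of the proposed syntax. The sketch below is only an emulation under stated assumptions: the "imperative" variant writes its binding into a caller-supplied slot and reports success with a separate `boolean`, while the "functional" variant conveys success and the binding in one step through `OptionalInt`. The class and method names (`BodyStyles`, `powerOfTwoImperative`, `powerOfTwoFunctional`) are hypothetical; the logic follows the `powerOfTwo` example quoted earlier.

```java
import java.util.OptionalInt;

public class BodyStyles {
    // Imperative emulation: the binding is assigned into an out-slot,
    // and success or failure is signalled separately by the return value.
    static boolean powerOfTwoImperative(int that, int[] expOut) {
        if (that < 1) return false;          // explicit fail
        int exp = 0;
        while (that > 1) {
            if (that % 2 != 0) return false; // explicit fail
            that /= 2;
            exp++;
        }
        expOut[0] = exp;                     // assign the binding ...
        return true;                         // ... and then claim success
    }

    // Functional emulation: success and the binding travel together.
    static OptionalInt powerOfTwoFunctional(int that) {
        if (that < 1) return OptionalInt.empty();
        int exp = 0;
        while (that > 1) {
            if (that % 2 != 0) return OptionalInt.empty();
            that /= 2;
            exp++;
        }
        return OptionalInt.of(exp);          // one step: "matches powerOfTwo(exp)"
    }

    public static void main(String[] args) {
        int[] slot = new int[1];
        System.out.println(powerOfTwoImperative(8, slot) + " " + slot[0]);
        System.out.println(powerOfTwoFunctional(8));
        System.out.println(powerOfTwoFunctional(6));
    }
}
```

In the imperative emulation nothing stops a caller from reading the slot after a failed match, which is one way to see why tying the bindings to the success signal is attractive.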
> Looking at the alternatives, except maybe for unconditional patterns, the > functional example above seems a lot more natural. The imperative approach > works with deconstructors (assuming they are not conditional), but does not > scale so well to conditionality -- which is the essence of patterns. > From a theoretical perspective, the method-comethod duality also gives us a > forceful nudge towards the functional approach. In a method, the method > arguments are specified as a positional list of expressions at the use site: > m(a, b, c) > and these values are invisibly copied into the parameter slots of the method > prior to frame activation. The dual to that for a comethod to similarly convey > the bindings in a positional list of expressions (as they must either all be > produced or none), where they are copied into the slots provided at the use > site, as is indicated by `matches` in the above examples. > My personal position: the imperative style feels like a trap. It seems > "obvious" at first if we start with deconstructors, but becomes increasingly > difficult when we get past this case, and gets in the way of other > opportunities. The last gasp before acceptance is the discomfort that dtor and > ctor bodies are written in different styles, but in the rear-view mirror, this > feels like a non-issue. > ### Derive imperative from functional? > If we start with "functional with implicit failure", we can possibly rescue > imperative by deriving a version of imperative from functional, by "overloading" > the match-success operator. > If we have a pattern whose binding names are `b1..bn` of types `B1..Bn`, then > the `matches` operator must take a list of expressions `e1..en` whose arity and > types are compatible with `B1..Bn`. But we could allow `matches` to also have a > nilary form, which would have the effect of being shorthand for > matches (b1, b2, ..., bn) > where each of `b1..bn` must be DA at the point of matching. 
This means that we > could express patterns in either form: > ``` > class Optional { > public static Optional of(T t) { ... } > // imperative, derived from functional with implicit failure > public static pattern(Optional that) of(T t) { > if (that.isPresent()) { > t = that.get(); > matches of; > } > } > public static pattern(Optional that) of(T t) { > if (that.isPresent()) > matches of(that.get()); > } > } > ``` > This flexibility allows users to select a more verbose expression in exchange > for a clearer association of expressions and bindings, though as we'll see, it > does come with some additional constraints. > ### Wrapping an existing API > Nearly every library has methods (sometimes sets of methods) that are patterns > in disguise, such as the pair of methods `isArray` and `getComponentType` in > `Class`, or the `Matcher` helper type in `java.util.regex`. Library maintainers > will likely want to wrap (or replace) these with real patterns, so these can > participate more effectively in conditional contexts, and in some cases, > highlight their duality with factory methods. > Matching a string against a `j.u.r.Pattern` regular expression has all the same > elements as a pattern, just with an ad-hoc API (and one that I have to look up > every time). But we can fairly easily wrap a true pattern around the existing > API. To match against a `Pattern` today, we pass the match candidate to > `Pattern::matcher`, which returns a `Matcher` with accessors `Matcher::matches` > (did it match) and `Matcher::group` (conditionally extract a particular capture > group.) If we want to wrap this with a pattern called `regexMatch`: > ``` > pattern(String that) regexMatch(String... 
groups) { > Matcher m = this.matcher(that); > if (m.matches()) > matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount()) > .mapToObj(m::group) > .toArray(String[]::new)); > // whole lotta matchin' goin' on > } > ``` > This says that a `j.u.r.Pattern` has an instance pattern called `regexMatch`, whose > match candidate is `String`, and which binds a varargs of `String` corresponding > to the capture groups. The implementation simply delegates to the existing > `j.u.r.Matcher` API. This means that `j.u.r.Pattern` becomes a sort of "pattern > object", and we can use it as a receiver at the use site: > ``` > static Pattern As = Pattern.compile("(a*)"); > static Pattern Bs = Pattern.compile("(b*)"); > ... > switch (string) { > case As.regexMatch(var as): ... > case Bs.regexMatch(var bs): ... > ... > } > ``` > ### Odds and ends > There are a number of loose ends here. We could choose other names for the > match-success and match-fail operations, including trying to reuse `break` or > `yield`. But, this reuse is tricky; it must be very clear whether a given form > of abrupt completion means "success" or "failure", because in the case of > patterns with no bindings, we will have no other syntactic cues to help > disambiguate. (I think having a single `matches`, with implicit failure and > `return` meaning failure, is the sweet spot here.) > Another question is whether the binding list introduces corresponding variables > into the scope of the body. For imperative, the answer is "surely yes"; for > functional, the answer is "maybe" (unless we want to do the trick where we > derive imperative from functional, in which case the answer is "yes" again.) > If the binding list does not correspond to variables in the body, this may be > initially discomforting; because they do not declare program elements, they may > feel that they are left "dangling".
But even if they are not declaring > _program_ elements, they are still declaring _API_ elements (similar to the > return type of a method.) We will want to provide Javadoc on the bindings, just > like with parameters; we will want to match up binding names in deconstructors > with parameter names in constructors; we may even someday want to support > by-name binding at the use site (e.g., `case Foo(a: var a)`). The names are > needed for all of these, just not for the body. Names still matter. My take > here is that this is a transient "different is scary" reaction, one that we > would get over quickly. > A final question is whether we should consider unqualified names as implicitly > qualified by `that` (and also `this`, for instance patterns, with some conflict > resolution). Users will probably grow tired of typing `that.` all the time, and > most of the time, the unqualified use is perfectly readable. > ## Exhaustiveness > There is one last syntax question in front of us: how to indicate that a set of > patterns are (claimed to be) exhaustive on a given match candidate type. We see > this with `Optional::of` and `Optional::empty`; it would be sad if the compiler > did not realize that these two patterns together were exhaustive on `Optional`. > This is not a feature that will be used often, but not having it at all will be > a repeated irritant. > The best I've come up with is to call these `case` patterns, where a set of > `case` patterns for a given match candidate type in a given class are asserted > to be an exhaustive set: > ``` > class Optional { > static Optional of(T t) { ... } > static Optional empty() { ... } > static case pattern of(T t) for Optional { ... } > static case pattern empty() for Optional { ... } > } > ``` > Because they may not be truly exhaustive, `switch` constructs will have to back > up the static assumption of exhaustiveness with a dynamic check, as we do for > other sets of exhaustive patterns that may have remainder. 
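As a point of comparison, Java 21 already has one way to get compiler-checked exhaustiveness: sealed hierarchies. The sketch below is an analogy only and does not implement the proposed `case pattern` feature. `Maybe`, `Some`, and `None` are hypothetical stand-ins for `Optional` and its `of` / `empty` patterns, introduced just for this illustration.

```java
public class ExhaustiveDemo {
    // Hypothetical stand-in for Optional, modeled as a sealed hierarchy.
    // With no permits clause, the permitted subtypes are those declared
    // in this same compilation unit.
    sealed interface Maybe<T> {}
    record Some<T>(T value) implements Maybe<T> {}
    record None<T>() implements Maybe<T> {}

    static <T> String show(Maybe<T> m) {
        // No default branch: the compiler accepts this switch because
        // Some and None together cover every Maybe.
        return switch (m) {
            case Some<T>(var v) -> "of(" + v + ")";
            case None<T>() -> "empty()";
        };
    }

    public static void main(String[] args) {
        System.out.println(show(new Some<>("x")));
        System.out.println(show(new None<String>()));
    }
}
```

The `case` modifier discussed above would let a class such as `Optional` make a comparable claim about a set of member patterns without being restructured into a sealed hierarchy, with a dynamic check backing up the static assumption.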
> I've experimented with variants of `sealed` but it felt more forced, so this is > the best I've come up with. > ## Example: patterns delegating to other patterns > Pattern implementations must compose. Just as a subclass constructor delegates > to a superclass constructor, the same should be true for deconstructors. > Here's a typical superclass-subclass pair: > ``` > class A { > private final int a; > public A(int a) { this.a = a; } > public pattern A(int a) { matches A(that.a); } > } > class B extends A { > private final int b; > public B(int a, int b) { > super(a); > this.b = b; > } > // Imperative style > public pattern B(int a, int b) { > if (that instanceof super(var aa)) { > a = aa; > b = that.b; > matches B; > } > } > // Functional style > public pattern B(int a, int b) { > if (that instanceof super(var a)) > matches B(a, b); > } > } > ``` > (Ignore the flow analysis and totality for the time being; we'll come back to > this in a separate document.) > The first thing that jumps out at us is that, in the imperative version, we had > to create a "garbage" variable `aa` to receive the binding, because `a` was > already in scope, and then we have to copy the garbage variable into the real > binding variable. Users will surely balk at this, and rightly so. In the > functional version (depending on the choices from "Odds and Ends") we are free > to use the more natural name and avoid the roundabout locution. > We might be tempted to fix the "garbage variable" problem by inventing another > sub-feature: the ability to use an existing variable as the target of a binding, > such as: > ``` > pattern Point(int a, int b) { > if (this instanceof A(__bind a)) > b = this.b; > } > ``` > But, I think the language is stronger without this feature, for two reasons. 
> First, having to work out whether a pattern match introduces a new binding > or assigns to an existing variable is additional cognitive load for users, > and second, having assignment to locals happening through > something other than assignment introduces additional complexity in finding > where a variable is modified. While we can argue about the general utility of > this feature, bringing it in just to solve the garbage-variable problem is > particularly unattractive. > ## Pattern lambdas > One final consideration is that patterns may also have a lambda form. Given > a single-abstract-pattern (SAP) interface: > ``` > interface Converter<T, U> { > pattern(T t) convert(U u); > } > ``` > one can implement such a pattern with a lambda. Such a lambda has one parameter > (the match candidate), and its body looks like the body of a declared pattern: > ``` > Converter<Integer, Short> c = > i -> { > if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) > matches Converter.convert((short) i); > }; > ``` > Because the bindings of the pattern lambda are defined in the interface, not in > the lambda, this is one more reason not to like the imperative version: it is > brittle, and alpha-renaming bindings in the interface would be a > source-incompatible change.
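A rough emulation of a pattern lambda is possible today with an ordinary functional interface whose single method returns an `Optional` of the binding. This is a sketch under that assumption, not the proposed feature: the `Converter` interface below is a hypothetical ordinary interface, and `NARROW` is an illustrative name.

```java
import java.util.Optional;

public class ConverterDemo {
    // Hypothetical emulation of a single-abstract-pattern interface:
    // the match candidate is the argument, and the (conditional) binding
    // is returned as an Optional. An empty result means "no match".
    @FunctionalInterface
    interface Converter<T, U> {
        Optional<U> convert(T candidate);
    }

    // Narrowing int -> short succeeds only when the value fits in a short.
    static final Converter<Integer, Short> NARROW = i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) {
            return Optional.of(i.shortValue());
        }
        return Optional.empty();
    };

    public static void main(String[] args) {
        System.out.println(NARROW.convert(42));
        System.out.println(NARROW.convert(100_000));
    }
}
```

The lambda body never names the binding: it only supplies a value positionally. That matches the observation above that renaming a binding in the interface need not affect a functional-style implementation.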
> ## Example gallery > Here are all the pattern examples so far, and a few more, using the suggested > style (functional, implicit fail, implicit `that`-qualification): > ``` > // Point dtor > pattern Point(int x, int y) { > matches Point(x, y); > } > // Optional -- static patterns for Optional::of, Optional::empty > static case pattern(Optional that) of(T t) { > if (isPresent()) > matches of(t); > } > static case pattern(Optional that) empty() { > if (!isPresent()) > matches empty(); > } > // Class -- instance pattern for arrayClass (match candidate type inferred) > pattern arrayClass(Class componentType) { > if (that.isArray()) > matches arrayClass(that.getComponentType()); > } > // regular expression -- instance pattern in j.u.r.Pattern > pattern(String that) regexMatch(String... groups) { > Matcher m = matcher(that); > if (m.matches()) > matches Pattern.regexMatch(IntStream.rangeClosed(1, m.groupCount()) > .mapToObj(m::group) > .toArray(String[]::new)); > } > // power of two (somewhere) > static pattern(int that) powerOfTwo(int exp) { > int exp = 0; > if (that < 1) > return; > while (that > 1) { > if (that % 2 == 0) { > that /= 2; > exp++; > } > else > return; > } > matches powerOfTwo(exp); > } > ``` > ## Closing thoughts > I came out of this exploration with very different conclusions than I expected > when going in. At first, the "inverse" syntax seemed stilted, but over time it > started to seem more obvious. Similarly, I went in expecting to prefer the > imperative approach for the body, but over time, started to warm to the > functional approach, and eventually concluded it was basically a forced move if > we want to support more than just deconstructors. And I started out skeptical > of "implicit fail", but after writing a few dozen patterns with it, going back > to fully explicit felt painful. All of this is to say, you should hold your > initial opinions at arm's length, and give the alternatives a chance to sink in.
> For most _conditional_ patterns (and conditionality is at the heart of pattern
> matching), the functional approach cleanly highlights both the match predicate
> and the flow of values, and is considerably less fussy than the imperative
> approach in the same situation; `Optional::of`, `Class::arrayClass`, and `regex`
> look great here, much better than they would with imperative. None of these
> illustrate delegation, but in the presence of delegation, the gap gets even
> wider.

From forax at univ-mlv.fr Wed Apr 3 12:15:41 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 3 Apr 2024 14:15:41 +0200 (CEST)
Subject: Deconstructor (and pattern) overload selection
In-Reply-To: <01758bfd-55ab-4d8d-97ec-e20d885ff2a3@oracle.com>
References: <01758bfd-55ab-4d8d-97ec-e20d885ff2a3@oracle.com>
Message-ID: <485816464.46635967.1712146541027.JavaMail.zimbra@univ-eiffel.fr>

Hello,

I would be even more brutal here, because I think that *alternative representations* are better served by factory methods than by constructors.

In the same way, in terms of deconstruction, a named method pattern is better than a deconstructor for an *alternative representation*. So instead of one deconstructor for (A,B) and another one for (X,Y), I think we should steer users to use two methods, asAB() and asXY().

For me, it is enough, at least for a first preview, to disambiguate only on the deconstructor arity.

Note that the deconstructor arity is important, because adding a deconstructor with a supplementary binding is conceptually equivalent to adding a getter to a class; we need that to be able to enhance a class in a backward compatible way.

regards,
Rémi

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Monday, April 1, 2024 6:34:49 PM
> Subject: Deconstructor (and pattern) overload selection

> The next big pattern matching JEP will be about deconstruction patterns.
> (Static and instance patterns will likely come separately.) Now that we've got
> the bikeshed painting underway, there are a few other loose ends here, and one
> of them is overload selection.
>
> We explored taking the existing overload selection algorithm and turning it
> inside out, but after going down that road a bit, I think this is both
> unnecessarily much complexity for not enough value, and also potentially
> fraught with nasty corner cases. I think there is a much simpler answer here
> which is entirely good enough.
>
> First, let's remind ourselves, why do we have constructor overloading in the
> first place? There are three main reasons:
>
> - Concision. If a fully-general constructor takes many parameters, but not all
> are essential to the use case, then the construction site becomes a site of
> accidental complexity. Being able to handle common grouping of parameters
> simplifies use sites.
>
> - Flexibility. Related to the above, not only might the user not need to specify
> a given constructor parameter, but they want the flexibility of saying "let the
> implementation pick the best value". Constructors with fewer parameters reserve
> more flexibility for the implementation.
>
> - Alternative representations. Some objects may take multiple representations as
> input, such as accepting a Date, a LocalDate, or a LocalDateTime.
>
> The first two cases are generally handled with "telescoping constructor nests",
> where we have:
>
>     Foo(A a)
>     Foo(A a, B b)
>     Foo(A a, B b, C c, D d)
>
> Sometimes the telescopes don't fold perfectly, and become "trees":
>
>     Foo(A a)
>     Foo(A a, B b)
>     Foo(A a, C c, D d)
>     Foo(A a, B b, C c, D d)
>
> Which constructors to include are subjective judgments on the part of class
> authors to find good tradeoffs between code size and concision/flexibility.
>
> We had initially assumed that each constructor overload would have a
> corresponding deconstructor, but further experimentation suggests this is not
> an ideal assumption.
>
> Clue One that it is not a good assumption comes from the asymmetry between
> constructors and deconstructors; if we have constructors and deconstructors of
> shape C(List), then it is OK to invoke C's constructor with List or its
> subtypes, but we can invoke C's deconstructor with List or its subtypes or its
> supertypes.
>
> Clue Two is that applicability for constructors is based on method invocation
> context, but applicability for deconstructors is based on cast context, which
> has different rules. It seems unlikely that we will ever get symmetry given
> this.
>
> The "Flexibility" requirement does not really apply to deconstructors; having a
> deconstructor that accepts additional bindings does not constrain anything, not
> in the same way as a constructor taking needlessly specific arguments. Imagine
> if ArrayList had only constructors that take int (for array capacity); this is
> terrible for the constructor, because it forces a resource management decision
> onto users who will not likely make a very good decision, and one that is hard
> to change later, but pretty much harmless for deconstructors.
>
> The "Concision" requirement does not really apply as much to deconstructors as
> constructors; matching with `Foo(var a, _, _)` is not nearly as painful as
> invoking with lots of parameters, each of which requires an explicit choice by
> the user.
>
> So the main reason for overloading deconstructors is to match representations
> with the constructor overloads -- but with a given "representation set", there
> probably do not need to be as many deconstructors as constructors. What we
> really need is to match the "maximal" constructor in a telescoping nest with a
> corresponding deconstructor, or for a tree-shaped set, one for each "maximal"
> representation.
>
> So for a class with constructors
>
>     Foo()
>     Foo(A a)
>     Foo(A a, B b)
>     Foo(X x)
>     Foo(X x, Y y)
>
> we would want dtors for (A,B) and (X,Y), but don't really need the others.
>
> So, let's start fresh on overload selection. Deconstructors have a set of
> applicability rules based on arity first (eventually, varargs, but not yet) and
> then on applicability of type patterns, which is in turn rooted in castability.
> Because we don't have the compatibility problem introduced by autoboxing, we
> can ignore the distinction between phase 1 and 2 of overload selection (we will
> have this problem with varargs later, though.)
>
> Given this, the main question we have to resolve is to what degree -- if any --
> we may deem one overload "more applicable" than others. I think there is one
> rule here that is forced: an exact type match (modulo erasure) is more
> applicable than an inexact type match. So given:
>
>     D(Object o)
>     D(String s)
>
> then
>
>     case D(String s)
>
> should choose the latter. This allows the client to (mostly) steer to a specific
> overload just by using the right types (rather than `var` or a subtype.) It is
> not clear to me whether we need anything more here; in the event of ambiguity,
> a client can pick the right overload with the right type patterns. (Nested
> patterns may need to be manually unrolled to subsequent clauses in some cases.)
>
> So basically (on a per-binding basis): an exact match is more applicable than an
> inexact match, and ... that's it. Users can steer towards a particular overload
> by selecting exact matches on enough bindings. Libraries can provide their own
> "joins" if they want to disambiguate problematic overloads like:
>
>     D(Object o, String s)
>     D(String s, Object o)
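
Editorial sketch: the selection rule quoted above (arity first, then castability, then "exact match beats inexact match" per binding) can be modelled in plain Java over `Class` objects. This is only an illustration under stated assumptions, not the proposed language semantics: `DtorOverloadSketch`, `select`, `NONE` and `AMBIGUOUS` are hypothetical names, castability is crudely approximated by a subtype test in either direction, and `var`, generics and erasure are ignored.

```java
import java.util.ArrayList;
import java.util.List;

class DtorOverloadSketch {
    static final int NONE = -1;       // no applicable overload
    static final int AMBIGUOUS = -2;  // several applicable, none most applicable

    // Crude stand-in for castability: one type is a subtype of the other.
    static boolean castable(Class<?> a, Class<?> b) {
        return a.isAssignableFrom(b) || b.isAssignableFrom(a);
    }

    // Applicability: same arity, and every binding type is castable to the use-site type.
    static boolean applicable(Class<?>[] dtor, Class<?>[] use) {
        if (dtor.length != use.length) return false;
        for (int i = 0; i < dtor.length; i++)
            if (!castable(dtor[i], use[i])) return false;
        return true;
    }

    // a is more applicable than b if a is exact wherever b is exact,
    // and exact on at least one binding where b is not.
    static boolean moreApplicable(Class<?>[] a, Class<?>[] b, Class<?>[] use) {
        boolean strictlyBetter = false;
        for (int i = 0; i < use.length; i++) {
            boolean exactA = a[i] == use[i];
            boolean exactB = b[i] == use[i];
            if (exactB && !exactA) return false;
            if (exactA && !exactB) strictlyBetter = true;
        }
        return strictlyBetter;
    }

    // Returns the index of the selected overload, or NONE / AMBIGUOUS.
    static int select(Class<?>[][] dtors, Class<?>... use) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < dtors.length; i++)
            if (applicable(dtors[i], use)) candidates.add(i);

        int chosen = NONE;
        for (int i : candidates) {
            boolean beaten = false;
            for (int j : candidates)
                if (j != i && moreApplicable(dtors[j], dtors[i], use)) beaten = true;
            if (!beaten) {
                if (chosen != NONE) return AMBIGUOUS;  // two unbeaten candidates
                chosen = i;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        Class<?>[][] d = { {Object.class}, {String.class} };
        System.out.println(select(d, String.class));                // prints 1: D(String s)

        Class<?>[][] joins = { {Object.class, String.class}, {String.class, Object.class} };
        System.out.println(select(joins, String.class, String.class)); // prints -2: ambiguous
    }
}
```

The second call reproduces the "problematic overloads" case from the quoted mail: each overload is exact on a different binding, so neither is more applicable and a library-provided "join" would be needed.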
From brian.goetz at oracle.com Wed Apr 3 12:48:40 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 3 Apr 2024 08:48:40 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com>

I would summarize your comments below as: Let's throw the entire model in the garbage, and replace it with something like Scala's "return an Optional" instead.

We've been discussing the model for several years; you've been asking (and waiting patiently) for "when are we going to talk about declaration syntax", and now that we're there, you want to throw it all out and start over?

We've discussed how strategies that rely on "ask the user to declare a record for every API point" feel clever for about five minutes, but start to feel old quickly.

The "carrier" concept in your examples seems to be just another way of reinventing multiple return -- with the added dis-bonus of being like but not quite the same as records. We've been pretty clear that "multiple return" is not the design center here.

The use of ! for indicating totality is interesting, that's worth thinking about.

On 4/3/2024 6:21 AM, Remi Forax wrote:
> I think that by not starting from the deconstructor, the notion of
> inverse methods makes less sense.
> I think that the notion of carrier / carrier type is less disruptive
> than the notion of member patterns.
From forax at univ-mlv.fr Wed Apr 3 14:23:22 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 3 Apr 2024 16:23:22 +0200 (CEST)
Subject: Member Patterns -- the bikeshed
In-Reply-To: <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr> <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com>
Message-ID: <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Wednesday, April 3, 2024 2:48:40 PM
> Subject: Re: Member Patterns -- the bikeshed

> I would summarize your comments below as: Let's throw the entire model in the
> garbage, and replace it with something like Scala's "return an Optional"
> instead.

> We've been discussing the model for several years; you've been asking (and
> waiting patiently) for "when are we going to talk about declaration syntax",
> and now that we're there, you want to throw it all out and start over?

My makeup job was too big, so you do not recognize your model behind it :)

There are two parts, the declaration part and the use-site part. Correct me if I'm wrong, but apart from the support of a method pattern with no prefix, we are in agreement here.

For the declaration part, I think that

    carrier(int x, int y) asCartesian()

is more readable than

    inverse () asCartesian(int x, int y)

The inverse notation is a leaky abstraction in at least two cases:
- when a modifier or an annotation is used. For an annotation, there is a notion of a target, and the parameter target is at the wrong place,
- when declaring a lambda, because in that case the parameters are not inverted.

Now, what I call a carrier type is what you call a list of bindings.
In terms of syntax, I think it is important to put a name in front of that list of bindings. I've proposed "carrier" so that the feature has a name; that makes it easier to discuss and to google. That does not change the fact that a method that returns a carrier is a special method, because it requires at least a special erasure (because of overloading) and a special reflection API.

But I hope we will not cross the line and have to use new opcodes in the bytecode. For me, a method that returns a carrier is something that can be desugared to classical Java elements, like an enum or a record is desugared to a class.

> We've discussed how strategies that rely on "ask the user to declare a record
> for every API point" feel clever for about five minutes, but start to feel old
> quickly.

Yes, this is what you currently have to do if you simulate the feature in today's Java, not what you should have to do in the future. And the idea is to do better; among other things, we want to support overloading.

> The "carrier" concept in your examples seems to be just another way of
> reinventing multiple return -- with the added dis-bonus of being like but not
> quite the same as records. We've been pretty clear that "multiple return" is
> not the design center here.

The idea behind a carrier is to let users define their binding list in a way that does not feel too strange; that's why I propose to add a name/keyword in front of the binding list.

And I do not know how you define what a binding list is, but multiple return + components description is a good definition for me.

Rémi

> The use of ! for indicating totality is interesting, that's worth thinking
> about.

> On 4/3/2024 6:21 AM, Remi Forax wrote:
>> I think that by not starting from the deconstructor, the notion of inverse
>> methods makes less sense.
>> I think that the notion of carrier / carrier type is less disruptive than the
>> notion of member patterns.
From brian.goetz at oracle.com Wed Apr 3 15:00:55 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 3 Apr 2024 11:00:55 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr> <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com> <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <1882416b-31d1-4160-8418-11e994300a65@oracle.com>

Despite several years of warnings and other attempts at preparing the ground, you seem intent on falling into the trap of thinking that these things are "just methods" and that we are better served by generalizing methods to support patterns. Everything about the model here places patterns as dual to methods; trying to hide that with syntax that makes it look like "just a method" is then putting the ball in our own net, because it props up wrong ideas about what is going on.

(In the classfile translation, we will of course use methods, and some sort of carrier, but that's a compilation trick, and we surely don't want to expose this model to ordinary programmers (though MethodHandle programmers will probably have to deal with it.))

Unlike methods, patterns are conditional.
Unlike methods, patterns can bind zero or more results.
Unlike methods, patterns are overloaded on their bindings, not their arguments.

> Now, what i call a carrier type is what you call a list of bindings.

You can call it that, but the syntax you are proposing presents it as something different -- as a thing that is returned from a method.

> And I do not know how you define what a binding list is but multiple
> return + components description is a good definition for me.

We define it the same way as we define a parameter list.
A parameter list is not a first-class thing in the language; you can't express one separately from a method call, or assign one to a variable, or return one. It is strictly a linguistic mechanism for (a) declaring the shape of a method and (b) passing parameters to a method at runtime.

A binding list is the dual of this; it is strictly a linguistic mechanism for (a) declaring the shape of a pattern and (b) passing bindings _from_ a pattern at runtime.

From brian.goetz at oracle.com Wed Apr 3 18:07:42 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 3 Apr 2024 14:07:42 -0400
Subject: String template interpolation as a two steps process
In-Reply-To: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>
References: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

We've had this discussion before.

In the old model, it *is* a two-step process; processors could cache type and analysis information in the indy call site at capture time through the Linkage mechanism, and then could use that information at application time. It's just that in the old model, this was not exposed to processors outside the JDK, at least not initially. We know that you didn't like that, but we felt that the Linkage API needed a lot more work before we were willing to expose it to arbitrary code, and didn't want to delay the feature for that. This is old news. For the JDK processors, the "shortcomings" you list (e.g., caching of types, duplicated validation) are not present.

In the new model, it's the same story. There's a currently-privileged API that "processors" can use to cache analysis information in the call site, which is computed at first application rather than at capture time, and reused in subsequent applications. Again, it is not yet available to processors outside the JDK. It's the exact same story.

I don't see anything in the mail that isn't already handled for the JDK processors, but if I missed something, let me know?
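
Editorial sketch: the "validate once, then interpolate on every application" shape discussed in this thread can be illustrated in plain Java without any template API. Everything here is an assumption made for illustration: `TwoStepTemplateSketch`, `Validated`, `validate`, `process` and `VALIDATIONS` are hypothetical names, the angle-bracket check is a placeholder for real XML validation, and the map keyed by the fragment list merely stands in for the per-call-site caching that the JDK would do through invokedynamic.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class TwoStepTemplateSketch {
    /** Result of step one: its existence is the proof that the fragments were validated. */
    static final class Validated {
        private final List<String> fragments;

        Validated(List<String> fragments) { this.fragments = fragments; }

        /** Step two: splice the per-call arguments between the constant fragments. */
        String interpolate(Object... args) {
            if (args.length != fragments.size() - 1)
                throw new IllegalArgumentException("expected " + (fragments.size() - 1) + " arguments");
            StringBuilder sb = new StringBuilder(fragments.get(0));
            for (int i = 0; i < args.length; i++)
                sb.append(args[i]).append(fragments.get(i + 1));
            return sb.toString();
        }
    }

    /** Counts how often the (potentially expensive) validation step actually runs. */
    static final AtomicInteger VALIDATIONS = new AtomicInteger();

    // Stand-in for call-site caching: one validated object per distinct fragment list.
    private static final Map<List<String>, Validated> CACHE = new ConcurrentHashMap<>();

    static Validated validate(List<String> fragments) {
        VALIDATIONS.incrementAndGet();
        if (fragments.isEmpty())
            throw new IllegalArgumentException("a template has at least one fragment");
        long open = 0, close = 0;
        for (String f : fragments) {
            open += f.chars().filter(c -> c == '<').count();
            close += f.chars().filter(c -> c == '>').count();
        }
        if (open != close)
            throw new IllegalArgumentException("unbalanced angle brackets");
        return new Validated(List.copyOf(fragments));
    }

    static String process(List<String> fragments, Object... args) {
        return CACHE.computeIfAbsent(fragments, TwoStepTemplateSketch::validate).interpolate(args);
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("<td>", "</td><td>", "</td>");
        System.out.println(process(fragments, "wood", 12));   // <td>wood</td><td>12</td>
        System.out.println(process(fragments, "glass", 24));  // <td>glass</td><td>24</td>
        System.out.println(VALIDATIONS.get());                // 1: validated only once
    }
}
```

Whether such a cache is exposed to third-party processors is exactly the point under debate above; the sketch only shows why the constant part and the per-call arguments are naturally separable.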
On 3/28/2024 5:05 AM, Remi Forax wrote:
> Hello,
> over last weekend, I implemented an XML template processor using the Java 22 state of the spec (using the old template processor syntax), and I would like to propose that we see the processing of a string template as a two-step process.
>
> I will use the XML template processor I've developed as an example,
> https://github.com/forax/html-component/blob/master/src/test/java/Demo.java
>
> Here is how it works: the idea is that if I want to generate the XML of a product, I will write something like this.
>
>   record Product(String name, int price) implements Component {
>     public Renderer render() {
>       return $."""
>
>           \{name}\{price * 1.20}
>
>           """;
>     }
>   }
>
> Component is an interface with only one method render() that returns a Renderer, and a Renderer is also an interface that is able to send XML events.
> And "$" is the name of the template processor defined in Component as a static field.
>
> The code of the template processor is here
> https://github.com/forax/html-component/blob/master/src/main/java/com/github/forax/htmlcomponent/ComponentTemplateProcessor.java#L193
>
> Conceptually, what a template processor should do is a two-step process: first validate the template, in my case validate that the template is a valid XML fragment, and then interpolate the result of the validation using the arguments of the template.
>
> So processing a string template is currently
>   process(StringTemplate) <=> { validate(StringTemplate); interpolate(StringTemplate); }
>
> There are two main shortcomings of the idea that processing a string template is equivalent to calling a method that takes a StringTemplate.
> - (notypes) the types of the holes are not propagated to the StringTemplate, so the validation part cannot verify that the template is correctly typed.
> - (cache) the validation part has to be re-executed each time.
>
> To illustrate the issue (notype), I can have an XML fragment that depends on another class, but I have no way to test whether the referenced Product is a record that takes a name of type String and a price of type int, because while those types are known by the compiler, they are not available in the StringTemplate.
>
>   record Cart() implements Component {
>     public Renderer render() {
>       return $."""
>
>
>
>
>           """;
>     }
>   }
>
> To illustrate the issue (cache): in the code above, I have two calls to render a Product with different attributes, but for each call to Product::render(), the validation step will be re-executed. As an implementer, I can try to cache the result of the validation, but that's far from easy, very bug-prone and ultimately not very efficient.
>
> Given that a string template literal is a literal, I propose that the Java runtime helps by doing the caching of the validation step.
>
> The simplest way I see for that is to separate the string template in two: a constant template part composed of the fragments (List<String>) and the types (List<Class<?>>), and the non-constant part, the arguments of the template (List<Object>).
>
> For that, we need a user-defined intermediary object that corresponds to the result of the validation; the creation of this object is the proof that the string template is validated, and this object can be cached by the JDK runtime.
>
> In that case, processing a string template is equivalent to
>   var cached userDefinedValidatedTemplate = validateAndCreate(List<String> fragment, List<Class<?>> types);
>   process(userDefinedValidatedTemplate, arguments);
>
>
> So
> - I propose that StringTemplate is the tuple List<String> fragment, List<Class<?>> types.
>
> - Users can create a special template validated class, with a factory method that takes a StringTemplate and is tagged as being a template validator
>
>   for example
>   __template_validated__ class ValidatedXMLDOM {
>     ...
>     public static __template__validator__ ValidatedXMLDOM of(StringTemplate stringTemplate) { ... }
>   }
>
> - a processor method is a method that takes a __template_validated__ object followed by parameters storing the template string arguments
>   For example
>     processXML(ValidatedXMLDOM dom, Object... arguments)
>
> At compile time, either processXML is called using an invokedynamic or the __template_validated__ instance is computed with a constant dynamic or both.
> But the idea is that the generated bytecode ensures that the __template_validated__ instance is created once and cached.
>
> This is a rough sketch, a lot of details are up for debate, but I think we should start to think of template processing as a two-step process.
>
> Rémi

From guy.steele at oracle.com Wed Apr 3 18:46:48 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 3 Apr 2024 18:46:48 +0000
Subject: Member Patterns -- the bikeshed
In-Reply-To: <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr> <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com> <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

Rémi,

I get the impression that, in introducing the notion of a "carrier", you seem to be focused on how deconstructors and patterns will necessarily be implemented in terms of the current definitions of Java and the JVM, or at least trying to explain it to the user (Java programmer) in terms of such an implementation.

But taking the point of view of such a user, I just don't see the need to introduce a new notion of "carrier" to explain the set of match results (an ordered sequence) from a pattern, just as I don't see any need to introduce a new notion of "carrier" to explain the set of arguments (an ordered sequence) to a method.

There are other languages that treat a sequence of arguments as a first-class object, and treat methods or functions as simply always taking one argument, which may be one of those sequence objects; conceptually a function body first deconstructs the argument-sequence object. And the same approach works for the value returned, and that is how multiple results are addressed in such a language.

But Java has historically not been that kind of language.
Like C and C++, from the start it has supported the idea of a function/method call that takes a sequence of arguments. That sequence of arguments is not a first-class object, and is not considered to have a type or any associated methods. The way that sequence is represented at run time is really of no concern to the programmer, and that fact has historically made it easier to allocate them on the stack rather than the heap. Yes, sometimes we wish that sequence were really a record (so that we could pass it around as a single object) or a map of some kind (so that we could pass argument values tagged by keywords rather than presenting them in a specific order), but that's just not the way Java is.

And I suggest that in a Java-based model where patterns are regarded as duals of methods, the same observations apply to sequences of match results. There is no need for such a sequence to be an object, or even to be given a special name such as "carrier". The representation of such a sequence at run time is not the programmer's concern, and that in turn may make it easier to allocate them on the stack rather than the heap in some situations. All I care about as a user of patterns is that I supply a match candidate, a pattern that does not fail produces an ordered sequence of match results, and those results are then bound, in order, to variables I specify at the point of pattern use.

For me, a pattern that returns a sequence of match results is something that can be, but need not be, desugared into classical Java elements.

--Guy

From forax at univ-mlv.fr Thu Apr 4 13:30:40 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 4 Apr 2024 15:30:40 +0200 (CEST)
Subject: Member Patterns -- the bikeshed
In-Reply-To:
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr> <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com> <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>

> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Wednesday, April 3, 2024 8:46:48 PM
> Subject: Re: Member Patterns -- the bikeshed

> Rémi,

> I get the impression that, in introducing the notion of a "carrier", you seem to
> be focused on how deconstructors and patterns will necessarily be implemented
> in terms of the current definitions of Java and the JVM, or at least trying to
> explain it to the user (Java programmer) in terms of such an
> implementation.

> But taking the point of view of such a user, I just don't see the need to
> introduce a new notion of "carrier" to explain the set of match results (an
> ordered sequence) from a pattern, just as I don't see any need to introduce a
> new notion of "carrier" to explain the set of arguments (an ordered sequence)
> to a method.

For me the notion of carrier has several advantages:
- it is easy to explain: a carrier is like an anonymous record or a record with
  a predefined name, so it is an object like any other normal object,
- the syntax is close to a type declaration, so it can be extended to add '!' or
  '?' to signal whether the pattern is optional or not, with the caveat that
  this only works well if we actually introduce '!' and '?' in the language.

About what should be shown or not: if people use a debugger, they will see the
object that acts as the carrier, so the idea is that instead of pretending that
this object does not exist, we can make it look like a normal object.
In my mind, this is quite similar to the lambda proxy object: it exists, but it
is opaque enough to be useless (not even a correct toString), so in practice you
do not really care about it.

I agree that not everything should be exposed to the user: how a carrier object
is really created, how the matching works or how the VM disambiguates the
overloads should be hidden, at least until someone takes a look at the bytecode
or uses java.lang.invoke.

But perhaps I'm wrong and there is no need to decorate the binding list to make
it look like a proper object.

> There are other languages that treat a sequence of arguments as a first-class
> object, and treat methods or functions as simply always taking one argument,
> which may be one of those sequence objects; conceptually a function body first
> deconstructs the argument-sequence object. And the same approach works for the
> value returned, and that is how multiple results are addressed in such a
> language.
> But Java has historically not been that kind of language. Like C and C++, from
> the start it has supported the idea of a function/method call that takes a
> sequence of arguments. That sequence of arguments is not a first-class object,
> and is not considered to have a type or any associated methods. The way that
> sequence is represented at run time is really of no concern to the programmer,
> and that fact has historically made it easier to allocate them on the stack
> rather than the heap. Yes, sometimes we wish that sequence were really a record
> (so that we could pass it around as a single object) or a map of some kind (so
> that we could pass argument values tagged by keywords rather than presenting
> them in a specific order), but that's just not the way Java is.

I will just add that with value classes (which can implement an interface like
Map, BTW) we are very close to that; that's why I do not think that exposing a
binding list as an object is an issue.

> And I suggest that in a Java-based model where patterns are regarded as duals of
> methods, the same observations apply to sequences of match results. There is no
> need for such a sequence to be an object, or even to be given a special name
> such as "carrier". The representation of such a sequence at run time is not the
> programmer's concern, and that in turn may make it easier to allocate them on
> the stack rather than the heap in some situations. All I care about as a user
> of patterns is that I supply a match candidate, a pattern that does not fail
> produces an ordered sequence of match results, and those results are then
> bound, in order, to variables I specify at the point of pattern use.

Patterns are not the dual of methods; pattern deconstructors are the dual of
methods, but this is a special case.
A pattern not only has a sequence of match results, it can have parameters too.
For example, I may want to introduce an instance pattern asInteger() in
java.lang.String that works like Integer.parseInt() but does not match, instead
of throwing an exception, if the string does not represent an integer.
I may also want that pattern to decode hexadecimal, so, like
Integer.parseInt(String, int radix), I want my pattern to also take a radix as
parameter.
In that case, my pattern asInteger() has an int value as match result and an
int radix as parameter.

Using the carrier syntax, it's something like
  carrier(int value) asInteger(int radix) { ... }
or, without the carrier syntax but with a keyword pattern, it's something like
  pattern (int value) asInteger(int radix) { ... }

> For me, a pattern that returns a sequence of match results is something that can
> be, but need not be, desugared into classical Java elements.

Yes, it does not need to be; perhaps showing a binding list instead of a return
type is not an issue.

> --Guy

Rémi

From brian.goetz at oracle.com Thu Apr 4 14:04:07 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 4 Apr 2024 10:04:07 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr> <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com> <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr> <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <032ae9e2-4e7a-4fe7-ac4e-5003638d3d0c@oracle.com>

>> But taking the point of view of such a user, I just don't see the
>> need to introduce a new notion of "carrier" to explain the set of
>> match results (an ordered sequence) from a pattern, just as I
>> don't see any need to introduce a new notion of "carrier" to
>> explain the set of arguments (an ordered sequence) to a method.
>
> For me the notion of carrier has several advantages:
> - it is easy to explain: a carrier is like an anonymous record or a
>   record with a predefined name, so it is an object like any other normal
>   object,
> - the syntax is close to a type declaration, so it can be
>   extended to add '!' or '?' to signal whether the pattern is optional or
>   not, with the caveat that this only works well if we actually introduce
>   '!' and '?' in the language.

Yes, this is the old "it's just a method with multiple return" / "method that
returns Optional" analogy.  While comfortable-seeming, because it builds on
things the user is already familiar with, this analogy is flawed.  Embracing a
flawed but familiar model might help users in the first five minutes, but it
will damage the language forever after.

Calling it something new like "carrier" has the obvious disadvantage that it
appears to be like a record, but isn't a record.  This will cause endless
confusion.  If we instead call these "anonymous records", which is at least
more honest (we've been through this flavor of the design, it felt cool at
first, but started to decay almost instantly), you end up trying to reify the
binding list in a way that we deliberately _don't_ reify a method parameter
list.

Scala's "function that returns Optional" is a glass that is half full and half
empty.  It relies on generous structural magic, and therefore does not fully
live within the type system.  And Scala's patterns are less ambitious; they are
much more "static", sitting at the periphery of the object model.

>> And I suggest that in a Java-based model where patterns are
>> regarded as duals of methods, the same observations apply to
>> sequences of match results. There is no need for such a sequence
>> to be an object, or even to be given a special name such as
>> "carrier".
>> The representation of such a sequence at run time is
>> not the programmer's concern, and that in turn may make it easier
>> to allocate them on the stack rather than the heap in some
>> situations. All I care about as a user of patterns is that I
>> supply a match candidate, a pattern that does not fail produces an
>> ordered sequence of match results, and those results are then
>> bound, in order, to variables I specify at the point of pattern use.
>
> Patterns are not the dual of methods; pattern deconstructors are the dual of
> methods, but this is a special case.

Perhaps in Remi-world, Remi-patterns are something else.  But that's not the
feature being designed here.

The design center here is that patterns are the dual of _aggregative_ methods
(methods that take more primitive ingredients and produce something more
abstract), like "make me a list from these elements", "make me a point from
this x and y", or "make me a class for the array type whose component type is
X".  They are not merely "methods that return multiple values", nor are they
"methods that might fail".  The method to which they are dual need not actually
exist (e.g., a pattern that describes a regex match need not have a partner
which generates conformant strings, but it could).

If you disagree about the design center, you have two choices:

 - Agree to disagree, and help design the best feature within the planned
   design center
 - Be honest that you are advocating to throw the design in the garbage, and
   want to replace it with something else (likely, something not as fully
   worked through), rather than merely offering a "tweak" to the syntax

The bar for the latter is very, very high.  You should make your case directly,
honestly, persuasively, and completely, and you should be prepared that you may
still not convince people, in which case it is back to the first choice.

Even if you are not successful, there is value in trying; the value is in
forcing us to come to a clearer statement of what the design center is.
We will have to explain this to others, so refining this story is valuable.

> A pattern not only has a sequence of match results, it can have
> parameters too.
> For example, I may want to introduce an instance pattern asInteger()
> in java.lang.String that works like Integer.parseInt() but does not match,
> instead of throwing an exception, if the string does not represent an
> integer.

So, this is a good illustration of the dangers of "method-think" here.  There
is exactly one correct, obvious name for this pattern: "Integer::toString".
Except that at first, to almost everyone, it will not seem correct and not seem
obvious.

When all we had for designing the pair of conversions `int <--> String` was
methods, we modeled them as arbitrarily distinct, unrelated methods.  Going
from int -> String was easy, since there was an obvious way to do it and it
worked for all integers.  Going the other way is harder, because it is partial,
and because the language didn't offer a canonical way to reflect partiality, we
had to invent one, and every such method did its own thing (maybe it returns a
default; maybe it throws; etc.)  The API author had to make up TWO names, and
we all know naming is hard.  Worse, the two methods toString and parseInt were
not obviously related, which meant that they were not discoverable from each
other.  Worse still, the arm's-length relationship between them meant that they
could easily diverge gratuitously.

When this is all we could do, we did it, and didn't realize there was something
better.  But there is something better, and once you see it, it is so
blindingly obvious that you would not think to go back.

If you name the pattern `toString`, then:

 - The two are easily discoverable from each other;
 - The author will naturally align the semantics of the two, as they are
   obviously two sides of the same coin;
 - Promotion of int to String (aggregation), and the recovery of that int from
   a candidate String (destructuring), look the same:
    if (aString instanceof Integer.toString(int i)) { ... }

This evokes the Pattern Question: "could this string have come from
Integer.toString(i) for some i, and if so, please give me an `i` for which this
is the case."

If you encourage people to keep thinking about parseInt as a mere method, they
will continue to write dramatically worse APIs, which are harder to write,
harder to use, harder to read, and result in gratuitous asymmetries.

> I may also want that pattern to decode hexadecimal

Then find the duality.  Converting from int to hex string could reasonably be a
method called toHexString, and there's your pattern:

    if (aString instanceof Integer.toHexString(var i)) { ... }

> I want my pattern to also take a radix as parameter. In that case, my
> pattern asInteger() has an int value as match result and an int
> radix as parameter.

Here, you are saying "I want patterns with input parameters back."  And I agree
that they were there for a reason.  I think there is a better way to bring back
that functionality, and we can talk about it when we put the core feature to
rest.

> Using the carrier syntax, it's something like

Please stop pretending that this is a mere syntax choice.  This is a complete
reinterpretation of what patterns are.  You want patterns to "just" be
"conditional methods that can return multiple things."  I get how that seems a
useful feature, but if that's what you want, then the burden is on you to be
honest about it and to convince us to make a radical change of direction.  The
mere existence of "here is some useful code I could write with my direction" is
barely even a start at that.
From forax at univ-mlv.fr Thu Apr 4 13:59:00 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 4 Apr 2024 15:59:00 +0200 (CEST)
Subject: Member Patterns -- the bikeshed
In-Reply-To: <1882416b-31d1-4160-8418-11e994300a65@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr> <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com> <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr> <1882416b-31d1-4160-8418-11e994300a65@oracle.com>
Message-ID: <955924241.47570855.1712239140830.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Wednesday, April 3, 2024 5:00:55 PM
> Subject: Re: Member Patterns -- the bikeshed

> Despite several years of warnings and other attempts at preparing the ground,
> you seem intent on falling into the trap of thinking that these things are
> "just methods" and that we are better served by generalizing methods to support
> patterns. Everything about the model here places patterns as dual to methods;
> trying to hide that with syntax that makes it look like "just a method" is then
> putting the ball in our own net, because it props up wrong ideas about what is
> going on.

> (In the classfile translation, we will of course use methods, and some sort of
> carrier, but that's a compilation trick, and we surely don't want to expose
> this model to ordinary programmers (though MethodHandle programmers will
> probably have to deal with it.))

People using a debugger will see it too, as will people using proxies, AOP,
interception, etc.

> Unlike methods, patterns are conditional.
> Unlike methods, patterns can bind zero or more results.
> Unlike methods, patterns are overloaded on their bindings, not their arguments.

- patterns are conditional
  Technically, the code of the pattern is not conditional; it will always be
  executed.
  We may want to tag at the declaration whether a pattern can return no-match
  or not, so that the compiler will not require a default branch in a switch.
  But the body of a pattern behaves the same way as the body of a method.

- Unlike methods, patterns can bind zero or more results
  Yes, but this can be seen as returning one result containing multiple
  components.

- Unlike methods, patterns are overloaded on their bindings, not their arguments
  No, patterns can be overloaded on both their bindings *and* their parameters.

And patterns can also be instance, static, default, abstract, synchronized,
etc., like methods.

You may say that you are proposing a split where I see a lump, but for me it's
like saying that constructors are not methods; that's true in a sense.

>> Now, what i call a carrier type is what you call a list of bindings.

> You can call it that, but the syntax you are proposing presents it as something
> different -- as a thing that is returned from a method.

Yes, the list of bindings can be declared wherever we want, but I think it's
easier to understand a pattern if the binding list is declared where the return
type should be, because seeing a pattern as a method that, instead of returning
a value, propagates several bindings is not a bad approximation.

Similarly, constructors do not specify a return type: as a casual user you want
to think that a constructor "returns" the instance, even though at the VM level
it returns void.

>> And I do not know how you define what a binding list is but multiple return +
>> components description is a good definition for me.

> We define it the same way as we define a parameter list. A parameter list is not
> a first-class thing in the language; you can't express one separately from a
> method call, or assign one to a variable, or return one. It is strictly a
> linguistic mechanism for (a) declaring the shape of a method and (b) passing
> parameters to a method at runtime.
> A binding list is the dual of this; it is strictly a linguistic mechanism for
> (a) declaring the shape of a pattern and (b) passing bindings _from_ a pattern
> at runtime.

I do not disagree. Maybe we do not need the keyword to be attached to the
binding list but to the method itself.
My idea was that people will see the runtime object like they see the lambda
proxy, so attaching the keyword to the binding list was a way to say: if you
want to think of it as a method that returns several components, that's not a
bad approximation.

Rémi

From forax at univ-mlv.fr Thu Apr 4 14:31:43 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 4 Apr 2024 16:31:43 +0200 (CEST)
Subject: String template interpolation as a two steps process
In-Reply-To:
References: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <134731301.47611970.1712241103632.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Wednesday, April 3, 2024 8:07:42 PM
> Subject: Re: String template interpolation as a two steps process

> We've had this discussion before.

Not exactly; I will try to emphasize the differences.

> In the old model, it *is* a two-step process; processors could cache type and
> analysis information in the indy call site at capture time through the Linkage
> mechanism, and then could use that information at application time. It's just
> that in the old model, this was not exposed to processors outside the JDK, at
> least not initially. We know that you didn't like that, but we felt that the
> Linkage API needed a lot more work before we were willing to expose it to
> arbitrary code, and didn't want to delay the feature for that. This is old
> news.

> For the JDK processors, the "shortcomings" you list (e.g., caching of types,
> duplicated validation) are not present.

> In the new model, it's the same story.
> There's a currently-privileged API that "processors" can use to cache analysis
> information in the call site, which is computed at first application rather
> than at capture time, and reused in subsequent applications. Again, it is not
> yet available to processors outside the JDK. It's the exact same story.

> I don't see anything in the mail that isn't already handled for the JDK
> processors, but if I missed something, let me know?

The String Template API serves two masters: the end users, and we have made
some progress on that front, and the library developers, the ones who will
provide libraries with methods that take a StringTemplate as parameter, and
here we have not really helped them; as you said, the "shortcomings" are still
present for them.

So as an end user, if I want to use a library that uses a StringTemplate, I
know it will come with tradeoffs due to those "shortcomings": did the library
bypass the security so that it is faster, does the library use a cache with no
eviction policy, does the library use internal implementation details that will
make my code stop working with a future version of Java, etc.

In a nutshell, we do not provide the tools for library developers to write a
good library using string templates.

In the past, because an interface was used for the processor, it was hard, not
impossible, but hard, to provide a good API without adding security issues to
the platform (the Lookup object being transferred to the string template
processor).
With the new design, I think we can provide a better API to the library
implementors: not a privileged API that only the JDK can use, but a better API
than the one currently proposed.

Rémi

> On 3/28/2024 5:05 AM, Remi Forax wrote:
>> Hello,
>> over last week-end, i've implemented an XML template processor using the Java 22
>> state of the spec (using old template processor syntax) and i would like to
>> propose to see the processing of a string template as a two steps process.
>> I will use the XML template processor I've developed as an example,
>> https://github.com/forax/html-component/blob/master/src/test/java/Demo.java

>> Here is how it works: the idea is that if I want to generate the XML of a
>> product, I will write something like this.

>> record Product(String name, int price) implements Component {
>>   public Renderer render() {
>>     return $."""
>>
>>       \{name}\{price * 1.20}
>>
>>       """;
>>   }
>> }

>> Component is an interface with only one method, render(), that returns a
>> Renderer, and a Renderer is also an interface, one that is able to send XML
>> events.
>> And "$" is the name of the template processor, defined in Component as a static
>> field.

>> The code of the template processor is here:
>> https://github.com/forax/html-component/blob/master/src/main/java/com/github/forax/htmlcomponent/ComponentTemplateProcessor.java#L193

>> Conceptually, what a template processor should do is a two step process:
>> first validate the template, in my case validate that the template is a valid
>> XML fragment, and then interpolate the result of the validation using the
>> arguments of the template.

>> So processing a string template is currently
>>   process(StringTemplate) <=> { validate(StringTemplate);
>>                                 interpolate(StringTemplate); }

>> There are two main shortcomings of the idea that processing a string template is
>> equivalent to calling a method that takes a StringTemplate.
>> - (notype) the types of the holes are not propagated to the StringTemplate, so
>>   the validation part cannot verify that the template is correctly typed.
>> - (cache) the validation part has to be re-executed each time.
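The two-step split and the (cache) shortcoming can be sketched without the preview `StringTemplate` API by modelling a template as a plain list of fragments plus an array of arguments. All names below (`TwoStepTemplate`, `Validated`, the toy bracket check) are hypothetical and only illustrate the idea; this is not the API proposed in the thread, and a real processor would also escape the arguments.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class TwoStepTemplate {

    // Hypothetical stand-in for the "validated template" object: holding one
    // is the proof that the constant fragments passed validation.
    record Validated(List<String> fragments) {}

    // Counts how often the (potentially expensive) validation really runs.
    static final AtomicInteger VALIDATIONS = new AtomicInteger();

    // Naive library-side cache keyed on the constant fragments; no eviction.
    private static final Map<List<String>, Validated> CACHE = new ConcurrentHashMap<>();

    // Step 1: depends only on the constant part of the template.
    static Validated validate(List<String> fragments) {
        return CACHE.computeIfAbsent(List.copyOf(fragments), constant -> {
            VALIDATIONS.incrementAndGet();
            if (constant.isEmpty()) {
                throw new IllegalArgumentException("no fragments");
            }
            // Toy check standing in for real XML validation.
            String text = String.join("", constant);
            long open = text.chars().filter(c -> c == '<').count();
            long close = text.chars().filter(c -> c == '>').count();
            if (open != close) {
                throw new IllegalArgumentException("unbalanced angle brackets");
            }
            return new Validated(constant);
        });
    }

    // Step 2: runs on every call, with the non-constant arguments.
    static String interpolate(Validated template, Object... arguments) {
        List<String> fragments = template.fragments();
        if (arguments.length != fragments.size() - 1) {
            throw new IllegalArgumentException(
                "expected " + (fragments.size() - 1) + " arguments");
        }
        var builder = new StringBuilder(fragments.get(0));
        for (int i = 0; i < arguments.length; i++) {
            builder.append(arguments[i]).append(fragments.get(i + 1));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        var fragments = List.of("<product name=\"", "\" price=\"", "\"/>");
        System.out.println(interpolate(validate(fragments), "pen", 12));
        System.out.println(interpolate(validate(fragments), "ink", 30));
        System.out.println("validations: " + VALIDATIONS.get());
    }
}
```

The map used here is exactly the kind of hand-rolled cache with no eviction policy that a library author has to write today; the proposal in this message is for the runtime to hold the validated object per call site instead.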
>> To illustrate the issue (notype), I can have an XML fragment that depends on
>> another class, but I have no way to test whether the referenced Product is a
>> record that takes a name of type String and a price of type int because, while
>> those types are known by the compiler, they are not available in the String
>> Template.

>> record Cart() implements Component {
>>   public Renderer render() {
>>     return $."""
>>
>>
>>
>>
>>       """;
>>   }
>> }

>> To illustrate the issue (cache), in the code above, I have two calls to render
>> a Product with different attributes, but for each call to Product::render(),
>> the validation step will be re-executed. As an implementer, I can try to cache
>> the result of the validation, but that's far from easy, very bug prone and
>> ultimately not very efficient.

>> Given that a string template literal is a literal, I propose that the Java
>> runtime helps by doing the caching of the validation step.
>> The simplest way I see for that is to separate a string template in two: a
>> constant template part, composed of the fragments (List<String>) and the types
>> (List<Class<?>>), and the non-constant part, the arguments of the template
>> (List<Object>).

>> For that, we need a user-defined intermediary object that corresponds to the
>> result of the validation; the creation of this object is the proof that the
>> string template is validated, and this object can be cached by the JDK runtime.

>> In that case, processing a string template is equivalent to
>>   { var cached userDefinedValidatedTemplate = validateAndCreate(List<String>
>>       fragments, List<Class<?>> types);
>>     process(userDefinedValidatedTemplate, arguments); }

>> So
>> - I propose that StringTemplate is the tuple List<String> fragments,
>>   List<Class<?>> types.
>> - Users can create a special template validated class, with a factory method
>>   that takes a StringTemplate and is tagged as being a template validator,
>>   for example
>>     __template_validated__ class ValidatedXMLDOM {
>>       ...
>>       public static __template__validator__ ValidatedXMLDOM of(StringTemplate
>>           stringTemplate) { ... }
>>     }
>> - a processor method is a method that takes a __template_validated__ object
>>   followed by parameters storing the template string arguments,
>>   for example
>>     processXML(ValidatedXMLDOM dom, Object... arguments)

>> At compile time, either processXML is called using an invokedynamic or the
>> __template_validated__ instance is computed with a constant dynamic or both.
>> But the idea is that the generated bytecode ensures that the
>> __template_validated__ instance is created once and cached.
>> This is a rough sketch, a lot of details are up for debate but I think we should
>> start to think of template processing as a two-step process.
>> Rémi

From guy.steele at oracle.com Thu Apr 4 15:04:30 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 4 Apr 2024 15:04:30 +0000
Subject: Member Patterns -- the bikeshed
In-Reply-To: <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
 <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr>
 <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com>
 <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
 <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <58E4A261-5D40-4549-B406-34A7285755D2@oracle.com>

On Apr 4, 2024, at 9:30 AM, forax at univ-mlv.fr wrote:

Patterns are not dual of methods, pattern deconstructors are dual of methods, but this is a special case.

A pattern not only has a sequence of match results, it can have parameters too.
For example, I may want to introduce an instance pattern asInteger() in java.lang.String that works like Integer.parseInt() but does not match instead of throwing an exception if the string does not represent an integer.
I may also want that pattern to decode hexadecimal, so like Integer.parseInt(String, int radix), I want my pattern to also take a radix as parameter.

In that case, my pattern asInteger() has an int value as match result and has an int radix as parameter.
Using the carrier syntax, it's something like
  carrier(int value) asInteger(int radix) { ... }

or without the carrier syntax but with a keyword pattern, it's something like
  pattern (int value) asInteger(int radix) { ... }

I think what is missing is a necessary shift in terminology under Brian's new proposal: if parameters are needed for a pattern, then in effect you curry it. In this new model, `asInteger` is not a pattern; rather, it is a pattern factory, that is, a method that returns a pattern. This is possible because of the introduction of SAPs, so that a pattern can be expressed using lambda syntax.

So we should speak of method `asInteger` as a pattern factory, and `asInteger(16)` as a pattern.

-- Guy

From ron.pressler at oracle.com Thu Apr 4 15:24:01 2024
From: ron.pressler at oracle.com (Ron Pressler)
Date: Thu, 4 Apr 2024 15:24:01 +0000
Subject: String template interpolation as a two steps process
In-Reply-To: <134731301.47611970.1712241103632.JavaMail.zimbra@univ-eiffel.fr>
References: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>
 <134731301.47611970.1712241103632.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

> On 4 Apr 2024, at 15:31, forax at univ-mlv.fr wrote:
>
> In the past, due to the fact that an interface was used for the processor, it was hard, not impossible, but hard, to provide a good API without adding security issues to the platform (the Lookup object being transferred to the string template processor). With the new design, I think we can provide a better API to the library implementors, not a privileged API like only the JDK can use but a better API than the one currently proposed.

Yes -- assuming you mean some non-privileged mechanism for caching parsing results -- but that shouldn't necessarily be in the first preview of the new design. It is a performance optimisation that can be postponed until we're sure we're on the right track from the perspective of the consumer. Writing good template processors is an interesting and challenging aspect done by "template processing professionals" (e.g.
see what's required of proper HTML template processing: https://rawgit.com/mikesamuel/sanitized-jquery-templates/trunk/safetemplate.html) and there's much for library authors to digest before considering caching (even though it may become an issue later on).

-- Ron

From forax at univ-mlv.fr Thu Apr 4 16:03:39 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 4 Apr 2024 18:03:39 +0200 (CEST)
Subject: String template interpolation as a two steps process
In-Reply-To:
References: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>
 <134731301.47611970.1712241103632.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <1934623723.47675009.1712246619624.JavaMail.zimbra@univ-eiffel.fr>

----- Original Message -----
> From: "Ron Pressler"
> To: "Remi Forax"
> Cc: "Brian Goetz" , "amber-spec-experts"
> Sent: Thursday, April 4, 2024 5:24:01 PM
> Subject: Re: String template interpolation as a two steps process

>> On 4 Apr 2024, at 15:31, forax at univ-mlv.fr wrote:

Hi Ron,

>>
>> In the past, due to the fact that an interface was used for the processor, it
>> was hard, not impossible, but hard, to provide a good API without adding
>> security issues to the platform (the Lookup object being transferred to the
>> string template processor). With the new design, I think we can provide a
>> better API to the library implementors, not a privileged API like only the JDK
>> can use but a better API than the one currently proposed.
>
> Yes -- assuming you mean some non-privileged mechanism for caching parsing
> results -- but that shouldn't necessarily be in the first preview of the new
> design. It is a performance optimisation that can be postponed until we're sure
> we're on the right track from the perspective of the consumer.
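
The two-step split discussed in this thread (validate the constant fragments once, then process the live values) can be approximated in plain Java today. What follows is only a minimal sketch under stated assumptions: `TwoStepTemplate`, `ValidatedTemplate`, `validate` and `process` are hypothetical names, the "validation" is a toy check, and the cache is an ordinary map keyed by the fragment list rather than the per-call-site invokedynamic or constant-dynamic caching the proposal has in mind.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class TwoStepTemplate {
    // Hypothetical stand-in for the "validated template" object: it holds only
    // the constant fragments, never the live values.
    record ValidatedTemplate(List<String> fragments) {}

    // Counts how often the (potentially expensive) validation really runs.
    static final AtomicInteger VALIDATIONS = new AtomicInteger();

    // Assumption: a plain map stands in for the caching the runtime would do.
    static final Map<List<String>, ValidatedTemplate> CACHE = new ConcurrentHashMap<>();

    // Step 1: validate the constant part, once per distinct fragment list.
    static ValidatedTemplate validate(List<String> fragments) {
        return CACHE.computeIfAbsent(List.copyOf(fragments), f -> {
            VALIDATIONS.incrementAndGet();
            if (f.isEmpty()) {
                throw new IllegalArgumentException("a template needs at least one fragment");
            }
            for (String fragment : f) {
                // Toy validation rule, purely illustrative.
                if (fragment.contains("<script")) {
                    throw new IllegalArgumentException("script tags are not allowed");
                }
            }
            return new ValidatedTemplate(f);
        });
    }

    // Step 2: combine the already validated fragments with the live values.
    static String process(ValidatedTemplate template, Object... values) {
        List<String> f = template.fragments();
        if (values.length != f.size() - 1) {
            throw new IllegalArgumentException("expected " + (f.size() - 1) + " values");
        }
        StringBuilder sb = new StringBuilder(f.get(0));
        for (int i = 0; i < values.length; i++) {
            sb.append(values[i]).append(f.get(i + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("<item name=\"", "\" price=\"", "\"/>");
        System.out.println(process(validate(fragments), "tea", 3));
        System.out.println(process(validate(fragments), "coffee", 4));
        System.out.println("validations: " + VALIDATIONS.get()); // validation ran once
    }
}
```

The point of the split is visible in `main`: both calls share one `ValidatedTemplate`, so only the cheap interleaving step is repeated.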

The idea of the API is that instead of taking a StringTemplate as parameter, the StringTemplate is split into two parts: the fragments part is used to create a cacheable object, and then the template processor method takes both the cacheable object and the live values as arguments.

So
  process($" ... \{value1} ... \{value2} ")
is transformed to
  var template = StringTemplate.of(fragments);
  var cachedObject = CachedObject.of(template);
  process(cachedObject, value1, value2)

I agree that there is not enough time to implement that new API in 23. And I can write a bytecode rewriter that patches the bytecode generated for Java 23 to go through an invokedynamic that will do the caching, so we get a good idea of what the compiler should generate and what the bootstrap method should do.

I'm a little worried about the stack trace we expose when the cached object is initialized.

> Writing good
> template processors is an interesting and challenging aspect done by "template
> processing professionals" (e.g. see what's required of proper HTML template
> processing:
> https://rawgit.com/mikesamuel/sanitized-jquery-templates/trunk/safetemplate.html)
> and there's much for library authors to digest before considering caching (even
> though it may become an issue later on).

It's for PHP and PHP templates are far more complex than Java string templates, for example, PHP allows {if} ... {else}.

>
> -- Ron

Rémi

From ron.pressler at oracle.com Thu Apr 4 16:10:46 2024
From: ron.pressler at oracle.com (Ron Pressler)
Date: Thu, 4 Apr 2024 16:10:46 +0000
Subject: [External] : Re: String template interpolation as a two steps process
In-Reply-To: <1934623723.47675009.1712246619624.JavaMail.zimbra@univ-eiffel.fr>
References: <237706846.41203165.1711616747142.JavaMail.zimbra@univ-eiffel.fr>
 <134731301.47611970.1712241103632.JavaMail.zimbra@univ-eiffel.fr>
 <1934623723.47675009.1712246619624.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

> On 4 Apr 2024, at 17:03, forax at univ-mlv.fr wrote:
>
> It's for PHP and PHP templates are far more complex than Java string templates, for example, PHP allows {if} ... {else}.
>

It is not (only) for PHP. It's a general approach, and it's recognised as the recommended approach for modern HTML processors; I think people writing HTML template processors are well familiar with this article (e.g. it's the one used by the HTML templates in Go's standard library).

-- Ron

From brian.goetz at oracle.com Thu Apr 4 17:11:25 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 4 Apr 2024 13:11:25 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
Message-ID: <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>

There's obviously some more discussion coming about "what is a pattern", but let me summarize the points on which we've asked for syntax feedback, and make another call (I can't believe I have to ask) for opinions here.

Use-site syntax.  The document catalogs the use-site syntax for deconstruction, static, and bound/unbound instance pattern uses. I don't think any of these are controversial.  (There are details to be captured, such as qualifier inference, but I think the overall scheme here is sound.)

Identifying a member as a pattern.
The proposed approach is a "pattern" keyword for all pattern kinds, but there are other choices.

Method-style (multiple return) vs inverse-style.  I thought the document made it entirely clear that the method-style declaration was going to be a loser, but I guess we had more work to do there.

Position of match candidate.  Here, there is a reasonable menu of choices:

    static <T> pattern Optional<T> of(T t)
    static <T> pattern<Optional<T>> of(T t)
    static <T> pattern(Optional<T> that) of(T t)
    static <T> pattern of(T t) for Optional<T>

Naming of match candidate.  The document proposes to use `that` uniformly.

Body types.  There is the broad choice of "imperative vs functional"; within that, there are choices about "implicit failure" or "implicit success."  There is also how we indicate success and failure.  The suggested approach is functional, implicit failure, return means fail, success is indicated by `match patternName(BINDINGS)`.

Exhaustiveness.  The document proposes `case` as a modifier for patterns that form exhaustive sets.  This isn't great, but note that this feature is likely to be used less often than we probably think, as new code will likely steer towards sealed classes and deconstruction patterns.

On 3/29/2024 5:58 PM, Brian Goetz wrote:
> We now come to the long-awaited bikeshed discussion on what member
> patterns should look like.
>
> Bikeshed disclaimer for EG:
>   - This is likely to evoke strong opinions, so please take pains to
>     be especially constructive
>   - Long reply-to-reply threads should be avoided even more than usual
>   - Holistic, considered replies preferred
>   - Please change subject line if commenting on a sub-topic or tangential
>     concern
>
> Special reminders for Remi:
>  - Use of words like "should", "must", "shouldn't", "mistake", "wrong",
>    "broken" are strictly forbidden.
>  - If in doubt, ask questions first.
>
> Notes for external observers:
>  - This is a working document for the EG; the discussion may continue for a
>    while before there is an official proposal.  Please be patient.
>
> # Pattern declaration: the bikeshed
>
> We've largely identified the model for what kinds of patterns we need to
> express, but there are still several degrees of freedom in the syntax.
>
> As the model has simplified during the design process, the space of syntax
> choices has been pruned back, which is a good thing.  However, there are still
> quite a few smaller decisions to be made.  Not all of the considerations are
> orthogonal, so while they are presented individually, this is not a "pick one
> from each column" menu.
>
> Some of these simplifications include:
>
>  - Patterns with "input arguments" have been removed; another way to get to what
>    this gave us may come back in another form.
>  - I have grown increasingly skeptical of the value of the imperative `match`
>    statement.  With better totality analysis, I think it can be eliminated.
>
> We can discuss these separately but I would like to sync first on the broad
> strokes for how patterns are expressed.
>
> ## Object model requirements
>
> As outlined in "Towards Member Patterns", the basic model is that patterns are
> the dual of other executable members (constructors, static methods, instance
> methods.)  While they are like methods in that they have inputs, outputs, names,
> and an imperative body, they have additional degrees of freedom that
> constructors and methods lack:
>
>  - Patterns are, in general, _conditional_ (they can succeed or fail), and only
>    produce bindings (outputs) when they succeed.  This conditionality is
>    understood by the language's flow analysis, and is used for computing scoping
>    and definite assignment.
>  - Methods can return at most one value; when a pattern completes successfully,
>    it may bind multiple values.
>  - All patterns have a _match candidate_, which is a distinguished,
>    possibly-implicit parameter.
Some patterns also have a receiver, which is
>    also a distinguished, possibly-implicit parameter.  In some such cases the
>    receiver and match candidate are aliased, but in others these may refer to
>    different objects.
>
> So a pattern is a named executable member that takes a _match candidate_ as a
> possibly-implicit parameter, maybe takes a receiver as an implicit parameter,
> and has zero or more conditional _bindings_.  Its body can perform imperative
> computation, and can terminate either with match failure or success.  In the
> success case, it must provide a value for each binding.
>
> Deconstruction patterns are special in many of the same ways constructors are:
> they are constrained in their name, inheritance, and probably their
> conditionality (they should probably always succeed).  Just as the syntax for
> constructors differs slightly from that of instance methods, the syntax for
> deconstructors may differ slightly from that of instance patterns.  Static
> patterns, like static methods, have no receiver and do not have access to the
> type parameters of the enclosing class.
>
> Like constructors and methods, patterns can be overloaded, but in accordance
> with their duality to constructors and methods, the overloading happens on the
> _bindings_, not the inputs.
>
> ## Use-site syntax
>
> There are several kinds of type-driven patterns built into the language: type
> patterns and record patterns.  A type pattern in a `switch` looks like:
>
>     case String s: ...
>
> And a record pattern looks like:
>
>     case MyRecord(P1, P2, ...): ...
>
> where `P1..Pn` are nested patterns that are recursively matched to the
> components of the record.  This use-site syntax for record patterns was chosen
> for its similarity to the construction syntax, to highlight that a record
> pattern is the dual of record construction.
>
> **Deconstruction patterns.**
The simplest kind of member pattern, a
> deconstruction pattern, will have the same use-site syntax as a record pattern;
> record patterns can be thought of as a deconstruction pattern "acquired for
> free" by records, just as records do with constructors, accessors, object
> methods, etc.  So the use of a deconstruction pattern for `Point` looks like:
>
>     case Point(var x, var y): ...
>
> whether `Point` is a record or an ordinary class equipped with a suitable
> deconstruction pattern.
>
> **Static patterns.**  Continuing with the idea that the destructuring syntax
> should evoke the aggregation syntax, there is an obvious candidate for the
> use-site syntax for static patterns:
>
>     case Optional.of(var e): ...
>     case Optional.empty(): ...
>
> **Instance patterns.**  Uses of instance patterns will likely come in two forms,
> analogous to bound and unbound instance method references, depending on whether
> the receiver and the match candidate are the same object.  In the unbound form,
> used when the receiver is the same object as the match candidate, the pattern
> name is qualified by a _type_:
>
> ```
> Class<?> k = ...
> switch (k) {
>     // Qualified by type
>     case Class.arrayClass(var componentType): ...
> }
> ```
>
> This means that we _resolve_ the pattern `arrayClass` starting at `Class` and
> _select_ the pattern using the receiver, `k`.  We may also be able to omit the
> class qualifier if the static type of the match candidate is sufficient to
> resolve the desired pattern.
>
> In the bound form, used when the receiver is distinct from the match candidate,
> the pattern name is qualified with an explicit _receiver expression_.  As an
> example, consider an interface that captures primitive widening and narrowing
> conversions, such as those between `int` and `long`.  In the widening direction,
> conversion is unconditional, so this can be modeled as a method from `int` to
> `long`.
In the other direction, conversion is conditional, so this is better
> modeled as a _pattern_ whose match candidate is `long` and which binds an `int`
> on success.  Since these are instance methods of some class (say,
> `NumericConversion`), we need to provide the receiver instance in order to
> resolve the pattern:
>
> ```
> NumericConversion nc = ...
>
> switch (aLong) {
>     case nc.narrowed(int i):
>     ...
> }
> ```
>
> The explicit receiver syntax would also be used if we exposed regular expression
> matching as a pattern on the `j.u.r.Pattern` object (the name collision on
> `Pattern` is unfortunate).  Imagine we added a `matching` instance pattern to
> `j.u.r.Pattern`; then we could use it in `instanceof` as follows:
>
> ```
> static final java.util.regex.Pattern P = Pattern.compile("(a*)(b*)");
> ...
> if (aString instanceof P.matching(String as, String bs)) { ... }
> ```
>
> Each of these use-site syntaxes is modeled after the use-site syntax for a
> method invocation or method reference.
>
> ## Declaration-site syntax
>
> To avoid being biased by the simpler cases, we're going to work all the cases
> concurrently rather than starting with the simpler cases and working up.  (It
> might seem sensible to start with deconstructors, since they are the "easy"
> case, but if we did that, we would likely be biased by their simplicity and then
> find ourselves painted into a corner.)  As our example gallery, we will consider:
>
>  - Deconstruction pattern for `Point`;
>  - Static patterns for `Optional::of` and `Optional::empty`;
>  - Static pattern for "power of two" (illustrating a computation where success
>    or failure, and computation of bindings, cannot easily be separated);
>  - Instance pattern for `Class::arrayClass` (used unbound);
>  - Instance pattern for `Pattern::matching` on regular expressions (used bound).
>
> Member patterns, like methods, have _names_.
(We can think of constructors as
> being named for their enclosing classes, and the same for deconstructors.)  All
> member patterns have a (possibly empty) ordered list of _bindings_, which are
> the dual of constructor or method parameters.  Bindings, in turn, have names and
> types.  And like constructors and methods, member patterns have a _body_ which
> is a block statement.  Member patterns also have a _match candidate_, which is a
> likely-implicit method parameter.
>
> ### Member patterns as inverse methods and constructors
>
> Regardless of syntax, let us remind ourselves that deconstructors are the
> categorical dual to constructors (coconstructors), and pattern methods are the
> categorical dual to methods (comethods).  They are dual in their structure: a
> constructor or method takes N arguments and produces a result, the corresponding
> member pattern consumes a match candidate and (conditionally) produces N
> bindings.
>
> Moreover, they are semantically dual: the return value produced by construction
> or factory invocation is the match candidate for the corresponding member
> pattern, and the bindings produced by a member pattern are the answers to the
> _Pattern Question_ -- "could this object have come from an invocation of my
> dual, and if so, with what arguments."
>
> ### What do we call them?
>
> Given the significant overlap between methods and patterns, the first question
> about the declaration we need to settle is how to identify a member pattern
> declaration as distinct from a method or constructor declaration.  _Towards
> Member Patterns_ tried out a syntax that recognized these as _inverse_ methods
> and constructors:
>
>     public Point(int x, int y) { ... }
>     public inverse Point(int x, int y) { ...
}
>
> While this is a principled choice which clearly highlights the duality, and one
> that might be good for specification and verbal description, it is questionable
> whether this would be a great syntax for reading and writing programs.
>
> A more traditional option is to choose a "noun" (conditional) keyword, such as
> `pattern`, `matcher`, `extractor`, `view`, etc:
>
>     public pattern Point(int x, int y) { ... }
>
> If we are using a noun keyword to identify pattern declarations, we could use
> the same noun for all of them, or we could choose a different one for
> deconstruction patterns:
>
>     public deconstructor Point(int x, int y) { ... }
>
> Alternately, we could reach for a symbol to indicate that we are talking about
> an inverted member.  C++ fans might suggest
>
>     public ~Point(int x, int y) { ... }
>
> but this is too cryptic (it's evocative once you see it, but then it becomes
> less evocative as we move away from deconstructors towards instance patterns.)
>
> If we wish to offer finer-grained control over conditionality, we might
> additionally need a `total` / `partial` modifier, though I would prefer to avoid
> that.
>
> Of the keyword candidates, there is one that stands out (for good and bad)
> because it connects to something that is already in the language: `pattern`.  On
> the one hand, using the term `pattern` for the declaration is a slight abuse; on
> the other, users will immediately connect it with "ah, so that's how I make a
> new pattern" or "so that's what happens when I match against this pattern."
> (Lisps would resolve this tension by calling it `defpattern`.)
>
> The others (`matcher`, `view`, `extractor`, etc) are all made-up terms that
> don't connect to anything else in the language, for better or worse.
> If we pick one of these, we are asking users to sort out _three_ separate new
> things in their heads: (use-site) patterns, (declaration-site) matchers, and the
> rules of how patterns and matchers are connected.  Calling them both
> "patterns", despite the mild abuse of terminology, ties them together in a way
> that recognizes their connection.
>
> My personal position: `pattern` is the strongest candidate here, despite some
> flaws.
>
> ### Binding lists and match candidates
>
> There are two obvious alternatives for describing the binding list and match
> candidate of a pattern declaration, both with their roots in the constructor and
> method syntax:
>
>  - Pretend that a pattern declaration is like a method with multiple return, and
>    put the binding list in the "return position", and make the match candidate
>    an ordinary parameter;
>  - Lean into the inverse relationship between constructors and methods (and
>    consistency with the use-site syntax), and put the binding list in the
>    "parameter list position". For static patterns and some instance patterns,
>    which need to explicitly identify the match candidate type, there are several
>    sub-options:
>    - Lean further into the duality, putting the match candidate type in the
>      "return position";
>    - Put the match candidate type somewhere else, where it is less likely to be
>      confused for a method return.
>
> The "method-like" approach might look like this:
>
> ```
> class Point {
>     // Constructor and deconstructor
>     public Point(int x, int y) { ... }
>     public pattern (int x, int y) Point(Point target) { ... }
>     ...
> }
>
> class Optional<T> {
>     // Static factory and pattern
>     public static <T> Optional<T> of(T t) { ... }
>     public static <T> pattern (T t) of(Optional<T> target) { ... }
>     ...
> }
> ```
>
> The "inverse" approach might look like:
>
> ```
> class Point {
>     // Constructor and deconstructor
>     public Point(int x, int y) { ... }
>     public pattern Point(int x, int y) { ... }
>     ...
> }
>
> class Optional<T> {
>     // Static factory and pattern (using the first sub-option)
>     public static <T> Optional<T> of(T t) { ... }
>     public static <T> pattern Optional<T> of(T t) { ... }
>     ...
> }
> ```
>
> With the "method-like" approach, the match candidate gets an explicit name
> selected by the author; with the inverse approach, we can go with a predefined
> name such as `that`.  (Because deconstructors do not have receivers, we could by
> abuse of notation arrange for the keyword `this` to refer instead to the match
> candidate within the body of a deconstructor.  While this might seem to lead to
> a more familiar notation for writing deconstructors, it would create a
> gratuitous asymmetry between the bodies of deconstruction patterns and those of
> other patterns.)
>
> Between these choices, nearly all the considerations favor the "inverse"
> approach:
>
>  - The "inverse" approach makes the declaration look like the use site.  This
>    highlights that `pattern Point(int x, int y)` is what gets invoked when you
>    match against the pattern use `Point(int x, int y)`.  (This point is so
>    strong that we should probably just stop here.)
>  - The "inverse" members also look like their duals; the only difference is the
>    `pattern` keyword (and possibly the placement of the match candidate type).
>    This makes matched pairs much more obvious, and such matched pairs will be
>    critical both for future language features and for library idioms.
>  - The method-like approach is suggestive of multiple return or tuples, which is
>    probably helpful for the first few minutes but actually harmful in the long
>    term. This feature is _not_ (much as some people would like to believe) about
>    multiple return or tuples, and playing into this misperception will only make
>    it harder to truly understand.
So this suggestion ends up propping up the
>    wrong mental model.
>
> The main downside of the "inverse" approach is the one-time speed bump of the
> unfamiliarity of the inverted syntax.  (The "method-like" syntax also has its
> own speed bumps, it is just unfamiliar in different ways.)  But unlike the
> advantages of the inverse approach, which continue to add value forever, this
> speed bump is a one-time hurdle to get over.
>
> To smooth out the speed bumps of the inverse approach, we can consider moving
> the position of the match candidate for static and (suitable) instance pattern
> declarations, such as:
>
> ```
> class Optional<T> {
>     // the usual static factory
>     public static <T> Optional<T> of(T t) { ... }
>
>     // Various ways of writing the corresponding pattern
>     public static <T> pattern of(T t) for Optional<T> { ... }
>     // or ...
>     public static <T> pattern(Optional<T>) of(T t) { ... }
>     // or ...
>     public static <T> pattern(Optional<T> that) of(T t) { ... }
>     // or ...
>     public static <T> pattern<Optional<T>> of(T t) { ... }
>     ...
> }
> ```
>
> (The deconstructor example looks the same with either variant.) Of these,
> treating the match candidate like a "parameter" of "pattern" is probably the
> most evocative:
>
> ```
> public static <T> pattern(Optional<T> that) of(T t) { ... }
> ```
>
> as it can be read as "pattern taking the parameter `Optional<T> that` called
> `of`, binding `T`", and is a short departure from the inverse syntax.
>
> The main value of the various rearrangements is that users don't need to think
> about things operating in reverse to parse the syntax.  This trades some of the
> secondary point (patterns looking almost exactly like their inverses) for a
> certain amount of cognitive load, while maintaining the most important
> consideration: that the declaration site look like the use site.
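
The pairing of a member with its inverse, and the role of the match candidate `that`, can be approximated in today's Java using the widening/narrowing conversion mentioned earlier in the document. This is only a rough emulation under stated assumptions: `NarrowingSketch`, `widen` and `narrowed` are hypothetical names, and `OptionalInt` stands in for "conditionally produces one binding". It uses a method-returning-a-carrier encoding purely because current Java has no member patterns; it is not a proposal for how the feature should be declared.

```java
import java.util.OptionalInt;

public class NarrowingSketch {
    // The "forward" member: widening int -> long is unconditional.
    static long widen(int i) {
        return i;
    }

    // Emulates the inverse: takes the match candidate `that` and
    // conditionally yields the single binding (the int it was widened from).
    static OptionalInt narrowed(long that) {
        if (that >= Integer.MIN_VALUE && that <= Integer.MAX_VALUE) {
            return OptionalInt.of((int) that); // match succeeds, binding provided
        }
        return OptionalInt.empty();            // no match
    }

    public static void main(String[] args) {
        // A value produced by the forward member always matches its inverse.
        System.out.println(narrowed(widen(42)));  // prints OptionalInt[42]
        // A long outside the int range could not have come from widen(int).
        System.out.println(narrowed(1L << 40));   // prints OptionalInt.empty
    }
}
```

The round trip in `main` is the "Pattern Question" in miniature: `narrowed` answers whether its argument could have come from `widen`, and if so with what argument.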
>
> For instance pattern declarations, if the match candidate type is the same as
> the receiver type, the match candidate type can be elided as it is with
> deconstructors.
>
> My personal position: the "multiple return" version is terrible; all the
> sub-variants of the inverse version are probably workable.
>
> ### Naming the match candidate
>
> We've been assuming so far that the match candidate always has a fixed name,
> such as `that`; this is an entirely workable approach.  Some of the variants are
> also amenable to allowing authors to explicitly select a name for the match
> candidate.  For example, if we put the match candidate as a
> "parameter" to the `pattern` keyword, there is an obvious place to put the name:
>
> ```
> static <T> pattern(Optional<T> target) of(T t) { ... }
> ```
>
> My personal opinion: I don't think this degree of freedom buys us much, and in
> the long run readability probably benefits by picking a fixed name like `that`
> and sticking with it.  Even with a fixed name, if there is a sensible position
> for the name, allowing users to type `that` for explicitness is fine (as we do
> with instance methods, though many people don't know this.)  We may even want to
> require it.
>
> ## Body types
>
> Just as there are two obvious approaches for the declaration, there are two
> obvious approaches we could take for the body (though there is some coupling
> between them.)  We'll call the two body approaches _imperative_ and
> _functional_.
>
> The imperative approach treats bindings as initially-DU variables that must be
> DA on successful completion, getting their value through ordinary assignment;
> the functional approach sets all the bindings at once, positionally.  Either
> way, member patterns (except maybe deconstructors) also need a way to
> differentiate a successful match from a failed match.
>
> Here is the `Point` deconstructor with both imperative and functional style.
The functional style uses a placeholder `match` statement to indicate a
> successful match and provision of bindings:
>
> ```
> class Point {
>     int x, y;
>
>     Point(int x, int y) {
>         this.x = x;
>         this.y = y;
>     }
>
>     // Imperative style, deconstructor always succeeds
>     pattern Point(int x, int y) {
>         x = that.x;
>         y = that.y;
>     }
>
>     // Functional style
>     pattern Point(int x, int y) {
>         match(that.x, that.y);
>     }
> }
> ```
>
> There are some obvious differences here.  In the imperative style, the dtor body
> looks much more like the reverse of the ctor body. The functional style is more
> concise (and amenable to further concision via the "concise method bodies"
> mechanism in the future), as well as a number of less obvious differences.  For
> deconstructors, the imperative approach is likely to feel more natural because
> of the obvious symmetry with constructors.
>
> In reality, it is _premature at this point to have an opinion_, because we
> haven't yet seen the full scope of the problem; deconstructors are a special
> case in many ways, which almost surely is distorting our initial opinion.  As we
> move towards conditional patterns (and pattern lambdas), our opinions may flip.
>
> Regardless of which we pick, there are some additional syntactic choices to be
> made -- what syntax to use to indicate success (we used `match` in the above
> example) or failure.  (We should be especially careful around trying to reuse
> words like `return`, `break`, or `yield` because, in the case where there are
> zero bindings (which is allowable), it becomes unclear whether they mean "fail"
> or "succeed with zero bindings".)
>
> ### Success and failure
>
> Except for possibly deconstructors, which we may require to be total, a pattern
> declaration needs a way to indicate success and failure.
In the > examples above, > we posited a `match` statement to indicate success in the functional > approach, > and in both examples leaned on the "implicit success" of > deconstructors (under > the assumption they always succeed).? Now let's look at the more > general case to > figure out what else is needed. > > For a static pattern like `Optional::of`, success is conditional.? Using > `match-fail` as a placeholder for "the match failed", this might look like > (functional version): > > ``` > public static pattern(Optional that) of(T t) { > ??? if (that.isPresent()) > ??????? match (that.get()); > ??? else > ??????? match-fail; > } > ``` > > The imperative version is less pretty, though.? Using `match-success` as a > placeholder: > > ``` > public static pattern(Optional that) of(T t) { > ??? if (that.isPresent()) { > ??????? t = that.get(); > ??????? match-success; > ??? } > ??? else > ??????? match-fail; > } > ``` > > Both arms of the `if` feel excessively ceremonial here.? And if we > chose to not > make all deconstruction patterns unconditional, deconstructors would > likely need > some explicit success as well: > > ``` > pattern Point(int x, int y) { > ??? x = that.x; > ??? y = that.y; > ??? match-success; > } > ``` > > It might be tempting to try and eliminate the need for explicit success by > inferring it from whether or not the bindings are DA or not, but this is > error-prone, is less type-checkable, and falls apart completely for > patterns > with no bindings. > > ### Implicit failure in the functional approach > > One of the ceremonial-seeming aspects of `Optional::of` above is > having to say > `else match-fail`, which doesn't feel like it adds a lot of value.? > Perhaps we > can be more concise without losing clarity. > > Most conditional patterns will have a predicate to determine matching, > and then > some conditional code to compute the bindings and claim success.? 
> Having to say > "and if the predicate didn't hold, then I fail" seems like ceremony > for the > author and noise for the reader.? Instead, if a conditional pattern > falls off > the end without matching, we could treat that as simply not matching: > > ``` > public static pattern(Optional that) of(T t) { > ??? if (that.isPresent()) > ??????? match (that.get()); > } > ``` > > This says what we mean: if the optional is present, then this pattern > succeeds > and bind the contents of the `Optional`.? As long as our "succeed" > construct > strongly enough connotes that we are terminating abruptly and > successfully, this > code is perfectly clear.? And most conditional patterns will look a > lot like > `Optional::of`; do some sort of test and if it succeeds, extract the > state and > bind it. > > At first glance, this "implicit fail" idiom may seem error-prone or > sloppy.? But > after writing a few dozen patterns, one quickly tires of saying "else > match-fail" -- and the reader doesn't necessarily appreciate reading > it either. > > Implicit failure also simplifies the selection of how we explicitly > indicate > failure; using `return` in a pattern for "no match" becomes pretty > much a forced > move.? We observe that (in a void method), "return" and "falling off > the end" > are equivalent; if "falling off the end" means "no match", then so > should an > explicit `return`.? So in those few cases where we need to explicitly > signal "no > match", we can just use `return`.? It won't come up that often, but > here's an > example where it does: > > ``` > static pattern(int that) powerOfTwo(int exp) { > ??? int exp = 0; > > ??? if (that < 1) > ??????? return; // explicit fail > > ??? while (that > 1) { > ??????? if (that % 2 == 0) { > ??????????? that /= 2; > ??????????? ++exp; > ??????? } > ??????? else > ??????????? return; // explicit fail > ??? } > ??? 
match (exp); > } > ``` > > As a bonus, if `return` as match failure is a forced move, we need > only select a > term for "successful match" (which obviously can't be `return`).? We > could use > `match` as we have in the examples, or a variant like `matched` or > `matches`. > But rather than just creating a new control operator, we have an > opportunity to > lean into the duality a little harder, by including the pattern syntax > in the > match: > > ``` > matches of(that.get()); > ``` > > or the (optionally?) qualified (inferring type arguments, as we do at > the use > site): > > ``` > matches Optional.of(that.get()); > ``` > > These "use the name" approaches trades a small amount of verbosity to > gain a > higher degree of fidelity to the pattern use site (and to evoke the > comethod > completion.) > > If we don't choose "implicit fail", we would have to invent _two_ new > control > flow statements to indicate "success" and "failure". > > My personal position: for the functional approach, implicit failure > both makes > the code simpler and clearer, and after you get used to it, you don't > want to go > back.? Whether we say `match` or `matches` or `matches ` > are all > workable, though I like some variant that names the pattern. > > ### Implicit success in the imperative approach > > In the imperative approach, we can be implicit as well, but it feels more > natural (at least, initially) to choose implicit success rather than > failure. > This works great for unconditional patterns: > > ``` > pattern Point(int x, int y) { > ??? x = that.x; > ??? y = that.y; > ??? // implicit success > } > ``` > > but not quite as well for conditional patterns: > > ``` > static pattern(Optional that) of(T t) { > ??? if (that.isPresent()) { > ??????? t = that.get(); > ??? } > ??? else > ??????? match-fail; > ??? 
// implicit success > } > ``` > > We can eliminate one of the arms of the if, with the more concise (but > convoluted) inversion: > > ``` > static pattern(Optional that) of(T t) { > ??? if (!that.isPresent()) > ??????? match-fail; > ??? t = that.get(); > ??? // implicit success > } > ``` > > Just as with the functional approach, if we choose imperative and > "implicit > success", using `return` to indicate success is pretty much a forced > move. > > ### Imperative is a trap > > If we assume that functional implies implicit failure, and imperative > implies > implicit success, then our choices become: > > ``` > class Optional { > ??? public static Optional of(T t) { ... } > > ??? // imperative, implicit success > ??? public static pattern(Optional that) of(T t) { > ??????? if (that.isPresent()) { > ??????????? t = that.get(); > ??????? } > ??????? else > ??????????? match-fail; > ??? } > > ??? // functional, implicit failure > ??? public static pattern(Optional that) of(T t) { > ??????? if (that.isPresent()) > ??????????? matches of(that.get()); > ??? } > } > ``` > > Once we get past deconstructors, the imperative approach looks worse by > comparison because we need to assign all the bindings (which is _O(n)_ > assignments) _and also_ indicate success or failure somehow, whereas > in the > functional style all can be done together with a single `matches` > statement. > > Looking at the alternatives, except maybe for unconditional patterns, the > functional example above seems a lot more natural.? The imperative > approach > works with deconstructors (assuming they are not conditional), but > does not > scale so well to conditionality -- which is the essence of patterns. > > From a theoretical perspective, the method-comethod duality also gives > us a > forceful nudge towards the functional approach.? In a method, the method > arguments are specified as a positional list of expressions at the use > site: > > ??? 
m(a, b, c) > > and these values are invisibly copied into the parameter slots of the > method > prior to frame activation.? The dual to that for a comethod to > similarly convey > the bindings in a positional list of expressions (as they must either > all be > produced or none), where they are copied into the slots provided at > the use > site, as is indicated by `matches` in the above examples. > > My personal position: the imperative style feels like a trap. It seems > "obvious" at first if we start with deconstructors, but becomes > increasingly > difficult when we get past this case, and gets in the way of other > opportunities.? The last gasp before acceptance is the discomfort that > dtor and > ctor bodies are written in different styles, but in the rear-view > mirror, this > feels like a non-issue. > > ### Derive imperative from functional? > > If we start with "functional with implicit failure", we can possibly > rescue > imperative by deriving a version of imperative from functional, by > "overloading" > the match-success operator. > > If we have a pattern whose binding names are `b1..bn` of types > `B1..Bn`, then > the `matches` operator must take a list of expressions `e1..en` whose > arity and > types are compatible with `B1..Bn`.? But we could allow `matches` to > also have a > nilary form, which would have the effect of being shorthand for > > ??? matches (b1, b2, ..., bn) > > where each of `b1..bn` must be DA at the point of matching. This means > that we > could express patterns in either form: > > ``` > class Optional { > ??? public static Optional of(T t) { ... } > > ??? // imperative, derived from functional with implicit failure > ??? public static pattern(Optional that) of(T t) { > ??????? if (that.isPresent()) { > ??????????? t = that.get(); > ??????????? matches of; > ??????? } > ??? } > > ??? public static pattern(Optional that) of(T t) { > ??????? if (that.isPresent()) > ??????????? matches of(that.get()); > ??? 
} > } > ``` > > This flexibility allows users to select a more verbose expression in > exchange > for a clearer association of expressions and bindings, though as we'll > see, it > does come with some additional constraints. > > ### Wrapping an existing API > > Nearly every library has methods (sometimes sets of methods) that are > patterns > in disguise, such as the pair of methods `isArray` and > `getComponentType` in > `Class`, or the `Matcher` helper type in `java.util.regex`. Library > maintainers > will likely want to wrap (or replace) these with real patterns, so > these can > participate more effectively in conditional contexts, and in some cases, > highlight their duality with factory methods. > > Matching a string against a `j.u.r.Pattern` regular expression has all > the same > elements as a pattern, just with an ad-hoc API (and one that I have to > look up > every time).? But we can fairly easily wrap a true pattern around the > existing > API.? To match against a `Pattern` today, we pass the match candidate to > `Pattern::matcher`, which returns a `Matcher` with accessors > `Matcher::matches` > (did it match) and `Matcher::group` (conditionally extract a > particular capture > group.)? If we want to wrap this with a pattern called `regexMatch`: > > ``` > pattern(String that) regexMatch(String... groups) { > ??? Matcher m = this.matcher(that); > ??? if (m.matches()) > ??????? matches Pattern.regexMatch(IntStream.range(1, m.groupCount()) > ??????????????????????????????????????????? .map(Matcher::group) > .toArray(String[]::new)); > ??? // whole lotta matchin' goin' on > } > ``` > > This says that a `j.u.r.Pattern` has an instance pattern called > `regex`, whose > match candidate is `String`, and which binds a varargs of `String` > corresponding > to the capture groups.? The implementation simply delegates to the > existing > `j.u.r.Matcher` API.? 
This means that `j.u.r.Pattern` becomes a sort > of "pattern > object", and we can use it as a receiver at the use site: > > ``` > static Pattern As = Pattern.compile("(a*)"); > static Pattern Bs = Pattern.compile("(b*)"); > ... > switch (string) { > ??? case As.regexMatch(var as): ... > ??? case Bs.regexMatch(var bs): ... > ??? ... > } > ``` > > ### Odds and ends > > There are a number of loose ends here.? We could choose other names > for the > match-success and match-fail operations, including trying to reuse > `break` or > `yield`.? But, this reuse is tricky; it must be very clear whether a > given form > of abrupt completion means "success" or "failure", because in the case of > patterns with no bindings, we will have no other syntactic cues to help > disambiguate.? (I think having a single `matches`, with implicit > failure and > `return` meaning failure, is the sweet spot here.) > > Another question is whether the binding list introduces corresponding > variables > into the scope of the body.? For imperative, the answer is "surely > yes"; for > functional, the answer is "maybe" (unless we want to do the trick where we > derive imperative from functional, in which case the answer is "yes" > again.) > > If the binding list does not correspond to variables in the body, this > may be > initially discomforting; because they do not declare program elements, > they may > feel that they are left "dangling".? But even if they are not declaring > _program_ elements, they are still declaring _API_ elements (similar > to the > return type of a method.)? We will want to provide Javadoc on the > bindings, just > like with parameters; we will want to match up binding names in > deconstructors > with parameter names in constructors; we may even someday want to support > by-name binding at the use site (e.g., `case Foo(a: var a)`). The > names are > needed for all of these, just not for the body. Names still matter.? 
> My take > here is that this is a transient "different is scary" reaction, one > that we > would get over quickly. > > A final question is whether we should consider unqualified names as > implicitly > qualified by `that` (and also `this`, for instance patterns, with some > conflict > resolution).? Users will probably grow tired of typing `that.` all the > time, and most of the time, the unqualified use is perfectly readable. > > ## Exhaustiveness > > There is one last syntax question in front of us: how to indicate that > a set of > patterns are (claimed to be) exhaustive on a given match candidate > type.? We see > this with `Optional::of` and `Optional::empty`; it would be sad if the > compiler > did not realize that these two patterns together were exhaustive on > `Optional`. > This is not a feature that will be used often, but not having it at > all will be > a repeated irritant. > > The best I've come up with is to call these `case` patterns, where a > set of > `case` patterns for a given match candidate type in a given class are > asserted > to be an exhaustive set: > > ``` > class Optional { > ??? static Optional of(T t) { ... } > ??? static Optional empty() { ... } > > ??? static case pattern of(T t) for Optional { ... } > ??? static case pattern empty() for Optional { ... } > } > ``` > > Because they may not be truly exhaustive, `switch` constructs will > have to back > up the static assumption of exhaustiveness with a dynamic check, as we > do for > other sets of exhaustive patterns that may have remainder. > > I've experimented with variants of `sealed` but it felt more forced, > so this is > the best I've come up with. > > ## Example: patterns delegating to other patterns > > Pattern implementations must compose.? Just as a subclass constructor > delegates > to a superclass constructor, the same should be true for deconstructors. > Here's a typical superclass-subclass pair: > > ``` > class A { > ??? private final int a; > > ??? 
public A(int a) { this.a = a; } > ??? public pattern A(int a) { matches A(that.a); } > } > > class B extends A { > ??? private final int b; > > ??? public B(int a, int b) { > ??????? super(a); > ??????? this.b = b; > ??? } > > ??? // Imperative style > ??? public pattern B(int a, int b) { > ??????? if (that instanceof super(var aa)) { > ??????????? a = aa; > ??????????? b = that.b; > ??????????? matches B; > ??????? } > ??? } > > ??? // Functional style > ??? public pattern B(int a, int b) { > ??????? if (that instanceof super(var a)) > ??????????? matches B(a, b); > ??? } > } > ``` > > (Ignore the flow analysis and totality for the time being; we'll come > back to > this in a separate document.) > > The first thing that jumps out at us is that, in the imperative > version, we had > to create a "garbage" variable `aa` to receive the binding, because > `a` was > already in scope, and then we have to copy the garbage variable into > the real > binding variable. Users will surely balk at this, and rightly so.? In the > functional version (depending on the choices from "Odds and Ends") we > are free > to use the more natural name and avoid the roundabout locution. > > We might be tempted to fix the "garbage variable" problem by inventing > another > sub-feature: the ability to use an existing variable as the target of > a binding, > such as: > > ``` > pattern Point(int a, int b) { > ??? if (this instanceof A(__bind a)) > ??????? b = this.b; > } > ``` > > But, I think the language is stronger without this feature, for two > reasons. > First, having to reason about whether a pattern match introduces a new > binding > or assigns to an existing variables is additional cognitive load for > users to > reason about, and second, having assignment to locals happening through > something other than assignment introduces additional complexity in > finding > where a variable is modified.? 
> While we can argue about the general utility of
> this feature, bringing it in just to solve the garbage-variable problem is
> particularly unattractive.
>
> ## Pattern lambdas
>
> One final consideration is that patterns may also have a lambda form. Given
> a single-abstract-pattern (SAP) interface:
>
> ```
> interface Converter {
>     pattern(T t) convert(U u);
> }
> ```
>
> one can implement such a pattern with a lambda. Such a lambda has one
> parameter (the match candidate), and its body looks like the body of a
> declared pattern:
>
> ```
> Converter c =
>     i -> {
>         if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
>             matches Converter.convert((short) i);
>     };
> ```
>
> Because the bindings of the pattern lambda are defined in the interface, not
> in the lambda, this is one more reason not to like the imperative version: it
> is brittle, and alpha-renaming bindings in the interface would be a
> source-incompatible change.
>
> ## Example gallery
>
> Here are all the pattern examples so far, and a few more, using the suggested
> style (functional, implicit fail, implicit `that`-qualification):
>
> ```
> // Point dtor
> pattern Point(int x, int y) {
>     matches Point(x, y);
> }
>
> // Optional -- static patterns for Optional::of, Optional::empty
> static case pattern(Optional that) of(T t) {
>     if (isPresent())
>         matches of(t);
> }
>
> static case pattern(Optional that) empty() {
>     if (!isPresent())
>         matches empty();
> }
>
> // Class -- instance pattern for arrayClass (match candidate type inferred)
> pattern arrayClass(Class componentType) {
>     if (that.isArray())
>         matches arrayClass(that.getComponentType());
> }
>
> // regular expression -- instance pattern in j.u.r.Pattern
> pattern(String that) regexMatch(String... groups) {
>     Matcher m = matcher(that);
>     if (m.matches())
>         matches Pattern.regexMatch(IntStream.range(1, m.groupCount())
>                                             .map(Matcher::group)
>                                             .toArray(String[]::new));
> }
>
> // power of two (somewhere)
> static pattern(int that) powerOfTwo(int exp) {
>     int exp = 0;
>
>     if (that < 1)
>         return;
>
>     while (that > 1) {
>         if (that % 2 == 0) {
>             that /= 2;
>             exp++;
>         }
>         else
>             return;
>     }
>     matches powerOfTwo(exp);
> }
> ```
>
> ## Closing thoughts
>
> I came out of this exploration with very different conclusions than I
> expected when going in. At first, the "inverse" syntax seemed stilted, but
> over time it started to seem more obvious. Similarly, I went in expecting to
> prefer the imperative approach for the body, but over time, started to warm
> to the functional approach, and eventually concluded it was basically a
> forced move if we want to support more than just deconstructors. And I
> started out skeptical of "implicit fail", but after writing a few dozen
> patterns with it, going back to fully explicit felt painful. All of this is
> to say, you should hold your initial opinions at arm's length, and give the
> alternatives a chance to sink in.
>
> For most _conditional_ patterns (and conditionality is at the heart of
> pattern matching), the functional approach cleanly highlights both the match
> predicate and the flow of values, and is considerably less fussy than the
> imperative approach in the same situation; `Optional::of`,
> `Class::arrayClass`, and `regexMatch` look great here, much better than they
> would with imperative. None of these illustrate delegation, but in the
> presence of delegation, the gap gets even wider.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From amaembo at gmail.com Thu Apr 4 17:29:54 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 4 Apr 2024 19:29:54 +0200
Subject: Exhaustiveness and instance patterns
Message-ID:

Hello!

Suppose we declare several instance patterns and define that they form an exhaustive set (using syntax from the bikeshed thread):

class X {
  case pattern(String that) p1() {...}
  case pattern(String that) p2() {...}
}

To match, we need an instance of type X. Could it be an arbitrary expression, or should it be something more limited (e.g., only a local variable)? And how will exhaustiveness be determined? E.g.:

X myX = ...
switch(str) {
  case myX.p1() -> {...}
  case myX.p2() -> {...}
}

Here, we can assume that the set of cases is exhaustive, because p1() and p2() have the same effectively-final qualifier. But what if it's a method call?

switch(str) {
  case getX().p1() -> {...}
  case getX().p2() -> {...}
}

The `getX()` method may return different instances of X, and it's no longer evident whether this set of patterns is exhaustive. Do we have any strategy regarding this case? Or are exhaustive sets not allowed for instance patterns?

With kind regards,
Tagir Valeev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From amaembo at gmail.com Thu Apr 4 17:34:27 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 4 Apr 2024 19:34:27 +0200
Subject: Exhaustiveness and mutual exclusion
Message-ID:

Hello!

Another question: if we declare an exhaustive set of patterns, does it imply that they should be mutually exclusive? This is probably not so important for the compiler, but could be important for IDE functions like inspections or refactorings. E.g.:

switch(opt) {
  case Optional.of(var x) -> {...}
  case Optional.empty() -> {...}
}

Is it safe to reorder branches in this switch? This is true for Optional, but can we say this for any custom set of exhaustive patterns?

int x = 0;
if (opt instanceof Optional.of(var x)) {x++;}
if (opt instanceof Optional.empty()) {x++;}

Can a static analyzer assume that only one `if` body will be visited here (i.e., x is never 2)? Again, an analyzer may hardcode this knowledge for Optional, but would it be true for any set of exhaustive patterns?

With kind regards,
Tagir Valeev.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From brian.goetz at oracle.com Thu Apr 4 17:46:30 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 4 Apr 2024 13:46:30 -0400
Subject: Exhaustiveness and instance patterns
In-Reply-To:
References:
Message-ID: <409ce12b-0902-4dca-988a-d83bdeaeec80@oracle.com>

The examples I had worked out previously were all static patterns, so this is a good one to work through.

For an instance pattern use, we may or may not put some constraints on the receiver expression. I have an intuition that some sort of "effectively constant" constraint is useful, but I haven't fully worked through the details. But that's a separate concern.

In your examples, I think we say that a set of instance patterns { x.p1(), x.p2() } are exhaustive if p1 and p2 form a complete set of case patterns, _and_ the receiver expressions `x` are provably the same. I think this amounts to "if it's a constant or an effectively final variable, otherwise you lose."

BTW, it is not just method invocations. If it's a non-final field, it could be mutated by the p1() implementation (that would be very rude, but allowed).

On 4/4/2024 1:29 PM, Tagir Valeev wrote:
> Hello!
>
> Suppose we declare several instance patterns and define that they form
> an exhaustive set (using syntax from the bikeshed thread):
>
> class X {
>   case pattern(String that) p1() {...}
>   case pattern(String that) p2() {...}
> }
>
> To match, we need an instance of type X. Could it be an arbitrary
> expression, or it should be a limited thing (e.g., only a local
> variable)? And how the exhaustiveness will be determined? E.g.:
>
> X myX = ...
> switch(str) {
>   case myX.p1() -> {...}
>   case myX.p2() -> {...}
> }
>
> Here, we can assume that the set of cases is exhaustive, because p1()
> and p2() have the same effectively-final qualifier. But what if it's a
> method call?
>
> switch(str) {
>   case getX().p1() -> {...}
>   case getX().p2() -> {...}
> }
>
> The `getX()` method may return different instances of X, and it's not
> evident anymore whether this set of patterns is exhaustive. Do we have
> any strategy regarding this case? Or exhaustive sets are not allowed
> for instance patterns?
>
> With kind regards,
> Tagir Valeev

From brian.goetz at oracle.com Thu Apr 4 17:48:29 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 4 Apr 2024 13:48:29 -0400
Subject: Exhaustiveness and mutual exclusion
In-Reply-To:
References:
Message-ID: <91642300-3868-465c-85fa-0c908d3c685f@oracle.com>

I don't think "exhaustiveness" need imply "independence." As an example from the bytecode API, we have factories for aload_0, aload(0), and load(INT, 0), all of which produce the same bytecode. It is reasonable for a "bytecode cursor" to have instance patterns for each of these too, and for the sum of all the patterns for all the bytecodes to be exhaustive, even though there is clearly overlap.

On 4/4/2024 1:34 PM, Tagir Valeev wrote:
> Hello!
>
> Another question: if we declare an exhaustive set of patterns, does it
> imply that they should be mutually exclusive? This is probably not so
> important for compiler, but could be important for IDE functions like
> inspections or refactorings. E.g.:
>
> switch(opt) {
>   case Optional.of(var x) -> {...}
>   case Optional.empty() -> {...}
> }
> Is it safe to reorder branches in this switch? This is true for
> Optional, but can we say this for any custom set of exhaustive patterns?
> > int x = 0; > if (opt instanceof Optional.of(var x)) {x++;} > if (opt instanceof Optional.empty()) {x++;} > > Can a static analyzer assume that only one `if` body will be visited > here (i.e., x is never 2)? Again, an analyzer may hardcode this > knowledge for Optional, but would it be true for any set of exhaustive > patterns? > > With kind regards, > Tagir Valeev. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robbepincket at live.be Thu Apr 4 19:39:19 2024 From: robbepincket at live.be (Robbe Pincket) Date: Thu, 4 Apr 2024 19:39:19 +0000 Subject: Exhaustiveness and instance patterns In-Reply-To: <409ce12b-0902-4dca-988a-d83bdeaeec80@oracle.com> References: <409ce12b-0902-4dca-988a-d83bdeaeec80@oracle.com> Message-ID: "if its a constant or an effectively final variable, otherwise you lose." Guards have the same requirements on variables so this seems logical to me. Kind regards Robbe Pincket ________________________________ Van: amber-spec-observers namens Brian Goetz Verzonden: donderdag 4 april 2024 19:46 Aan: Tagir Valeev ; amber-spec-experts Onderwerp: Re: Exhaustiveness and instance patterns The examples I had worked out previously were all static patterns, so this is a good one to work through. For an instance pattern use, we may or may not put some constraints on the receiver expression. I have an intuition that some sort of "effectively constant" constraint is useful, but I haven't fully worked through the details. But that's a separate concern. In your examples, I think we say that a set of instance patterns { x.p1(), x.p2() } are exhaustive if p1 and p2 form a complete set of case patterns, _and_ the receiver expressions `x` are provably the same. I think this amounts to "if its a constant or an effectively final variable, otherwise you lose." BTW, it is not just method invocations. If its a nonfinal field, it could be mutated by the p1() implementation (that would be very rude, but allowed). 
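
To make the receiver concern concrete, here is a minimal sketch in today's Java. Member patterns do not exist in the language yet, so plain boolean methods `p1` and `p2` stand in for the two instance patterns; the class `X`, its `limit` field, and the alternating `getX()` are all invented for illustration and are not part of the proposal. For any single receiver the two checks cover every string, but when the receiver expression is re-evaluated and yields a different instance, neither check need succeed.

```java
public class ReceiverSketch {
    // Hypothetical stand-in for a class with two instance patterns that are
    // exhaustive for any one receiver: every string is below or at/above the limit.
    static final class X {
        private final int limit;
        X(int limit) { this.limit = limit; }
        boolean p1(String that) { return that.length() < limit; }   // models `case pattern p1()`
        boolean p2(String that) { return that.length() >= limit; }  // models `case pattern p2()`
    }

    private static int calls = 0;

    // Hypothetical getX(): returns a differently configured X on alternate calls.
    static X getX() {
        calls++;
        return new X(calls % 2 == 1 ? 3 : 10);
    }

    public static void main(String[] args) {
        String str = "hello"; // length 5

        X myX = new X(3);     // effectively final receiver, same instance for both checks
        System.out.println(myX.p1(str) || myX.p2(str));        // prints true

        // Two evaluations of getX() give receivers with limits 3 and 10,
        // so p1 fails on the first and p2 fails on the second.
        System.out.println(getX().p1(str) || getX().p2(str));  // prints false
    }
}
```

This is only an analogy for the exhaustiveness argument: with an effectively final receiver the compiler can treat both cases as applying to one object, which is what the "otherwise you lose" rule relies on.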
On 4/4/2024 1:29 PM, Tagir Valeev wrote: > Hello! > > Suppose we declare several instance patterns and define that they form > an exhaustive set (using syntax from the bikeshed thread): > > class X { > case pattern(String that) p1() {...} > case pattern(String that) p2() {...} > } > > To match, we need an instance of type X. Could it be an arbitrary > expression, or it should be a limited thing (e.g., only a local > variable)? And how the exhaustiveness will be determined? E.g.: > > X myX = ... > switch(str) { > case myX.p1() -> {...} > case myX.p2() -> {...} > } > > Here, we can assume that the set of cases is exhaustive, because p1() > and p2() have the same effectively-final qualifier. But what if it's a > method call? > > switch(str) { > case getX().p1() -> {...} > case getX().p2() -> {...} > } > > The `getX()` method may return different instances of X, and it's not > evident anymore whether this set of patterns is exhaustive. Do we have > any strategy regarding this case? Or exhaustive sets are not allowed > for instance patterns? > > With kind regards, > Tagir Valeev -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Thu Apr 4 20:18:28 2024 From: guy.steele at oracle.com (Guy Steele) Date: Thu, 4 Apr 2024 20:18:28 +0000 Subject: Member Patterns -- the bikeshed In-Reply-To: <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com> References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com> Message-ID: On Apr 4, 2024, at 1:11?PM, Brian Goetz wrote: There's obviously some more discussion coming about "what is a pattern", but let me summarize the points on which we've asked for syntax feedback, and make another call (I can't believe I have to ask) for opinions here. ... Body types. There is the broad choice of "imperative vs functional"; within that, there are choices about "implicit failure" or "implicit success." 
There is also how we indicate success and failure. The suggested approach is functional, implicit failure, return means fail, success is indicated by `match patternName(BINDINGS)`.

The draft proposal that Brian sent out on March 29, in the section and subsections with these headings:

    ## Body types
    ### Success and failure
    ### Implicit failure in the functional approach
    ### Implicit success in the imperative approach
    ### Imperative is a trap
    ### Derive imperative from functional?

laid out a version of the functional approach in which failure is implicit, a version of the imperative approach in which success is implicit, and an add-on to the functional approach that allows it to be used in a way that is syntactically similar to the imperative approach. But this was an incomplete presentation of a design space that actually has more possibilities and potential symmetries.

Here I undertake a complete retelling of an imperative approach and a functional approach and then compare them. An important difference is that I will assume a version of the imperative approach in which failure, rather than success, is implicit. The reason for this is that while we expect simple deconstructors always to succeed -- and that motivates us to make success implicit, to make deconstructors one line shorter -- that is not true for other kinds of patterns, and I think it is good to mark pattern success explicitly no matter which approach is used.

Here, then, is my retelling. As part of this retelling, I will explain pattern-match success in terms of a new kind of reason for abrupt completion: "a successful match with match results (z1, z2, ..., zn)" where each zk is some value.

An Imperative Approach (in which failure is implicit)

The parameters of a pattern declaration are definitely unassigned at the start of the body of the declaration. They may be given values through ordinary assignment. For expository purposes, let the names of the parameters be v1, v2, ..., vn.
If execution of the body completes normally, or completes abruptly for any reason other than a successful match, then the invocation of the pattern results in a failed match. In particular, the statement `return;` may be used in the body of a pattern declaration to indicate failure to match.

The statement `match;` (or, if you prefer, `match patternName;`, but I will stick with the shorter form for now) indicates a successful match. It may be used only within the body of a pattern declaration. Execution of `match;` causes the body of the pattern declaration to complete abruptly, the reason being a successful match with match results (v1, v2, ..., vn); that is, the current values of the parameters v1, v2, ..., vn are used as the match results.

It is a compile-time error if any of the parameters of a pattern declaration is not definitely assigned at any `match;` statement.

Optional restriction: It is a compile-time error if the body of a deconstructor pattern declaration can complete normally or contains a `return;` statement. (This restriction would imply that a deconstructor cannot fail to match. This restriction would not apply to static or instance patterns.)

Here is the Point deconstructor written in the imperative style.

    class Point {
        int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Imperative style
        pattern Point(int x, int y) {
            x = that.x;
            y = that.y;
            match;    // Match success must be signaled explicitly
        }
    }

In this imperative style, the deconstructor body looks like the "reverse" of the constructor body, with the sides of each assignment swapped and `that` substituted for `this` (and, of course, the addition of a `match` statement to signal success).

Special convenience feature: Another form of the `match` statement is provided for convenience:

    match (e1, e2, ..., en);

means

    { var t1 = e1, t2 = e2, ..., tn = en; v1 = t1; v2 = t2; ... vn = tn; match; }

where temporaries t1, t2, ..., tn are fresh local variables that do not occur elsewhere in the program.
(It is a compile-time error if the number of expressions does not match the number of parameters, or if for any k the type of ek is not assignment-compatible with the declared type of vk.)

This allows the deconstructor for Point to be written this way instead if desired:

    // Imperative style, but using the extended `match` statement
    // to abbreviate a series of boilerplate assignments
    pattern Point(int x, int y) {
        match (that.x, that.y);
    }

*Alternatively, a Functional Approach (in which failure is likewise implicit)*

[This is very close to what Brian proposed, but I express it in the same detailed terms that I used above to describe the variant imperative approach that assumes failure is implicit.]

If execution of the body completes normally, or completes abruptly for any reason other than a successful match, then the invocation of the pattern results in a failed match. In particular, the statement `return;` may be used in the body of a pattern declaration to indicate failure to match.

For expository purposes, let the names of the parameters of the pattern be v1, v2, ..., vn.

The statement `match (e1, e2, ..., en);` (or, if you prefer, `match patternName(e1, e2, ..., en);`, but I will stick with the shorter form for now) indicates a successful match, using the values of the expressions e1, e2, ..., en. It may be used only within the body of a pattern declaration. Execution of `match (e1, e2, ..., en);` causes the body of the pattern declaration to complete abruptly, the reason being a successful match with match results (z1, z2, ..., zn), where z1, z2, ..., zn are the respective results of evaluating the expressions e1, e2, ..., en (working left-to-right). If evaluation of any ek completes abruptly, then evaluation of `match (e1, e2, ..., en);` completes abruptly for the same reason.

It is a compile-time error if the number of expressions does not match the number of parameters, or if for any k the type of ek is not assignment-compatible with the declared type of vk.
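The functional rules above (failure is implicit, success is signaled explicitly with a tuple of results) can be approximated in present-day Java, which may help when comparing the two styles. The following is only a sketch under stated assumptions: the proposed `pattern` and `match` syntax does not exist yet, so `Optional.empty()` stands in for a failed match and `Optional.of(...)` for `match (e1, e2);`. The names `PatternSketch`, `Bindings`, and `deconstruct` are hypothetical, and `commaSeparatedPair` is borrowed from an example earlier in this thread.

```java
import java.util.Optional;

// Hypothetical emulation in today's Java; not the proposed language feature.
final class PatternSketch {

    // Carrier for the match results (z1, z2) of a two-binding pattern.
    record Bindings(int x, int y) {}

    static final class Point {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Plays the role of: pattern Point(int x, int y) { match (that.x, that.y); }
        // A deconstructor of this shape has no failure path.
        static Optional<Bindings> deconstruct(Point that) {
            return Optional.of(new Bindings(that.x, that.y));
        }
    }

    // A partial pattern: succeeds only for text of the form "<int>,<int>".
    static Optional<Bindings> commaSeparatedPair(String that) {
        String[] parts = that.split(",");
        if (parts.length != 2) {
            return Optional.empty();              // corresponds to `return;` (failure)
        }
        try {
            int x = Integer.parseInt(parts[0].trim());
            int y = Integer.parseInt(parts[1].trim());
            return Optional.of(new Bindings(x, y)); // corresponds to `match (x, y);`
        } catch (NumberFormatException e) {
            // This sketch treats a parse exception as a failed match.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(Point.deconstruct(new Point(1, 2)));
        System.out.println(commaSeparatedPair("3,4"));
        System.out.println(commaSeparatedPair("oops"));
    }
}
```

Note that the results are associated with the bindings by position in the `Bindings` constructor call, not by name, which is the same correspondence the functional `match (e1, e2, ..., en);` statement relies on.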
Optional restriction: It is a compile-time error if the body of a deconstructor pattern declaration can complete normally or contains a `return;` statement. (This restriction would imply that a deconstructor cannot fail to match. This restriction would not apply to static or instance patterns.)

Here is the Point deconstructor written in the functional style.

    class Point {
        int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Functional style
        pattern Point(int x, int y) {
            match (that.x, that.y);
        }
    }

In the functional style, the `match` statement that signals success looks somewhat like an invocation that provides desired values corresponding to the declared parameters.

The parameters of a pattern declaration are in fact declared local variables that are definitely unassigned at the start of the body of the declaration. They may be given values through ordinary assignment, but need not be; the compiler will not complain just because a pattern parameter goes unused as a local variable. One possible use for them is to hold values intended to be match results while other values are still being computed.

Special convenience feature: Another form of the `match` statement is provided for convenience:

    match;

means

    match (v1, v2, ..., vn);

where v1, v2, ..., vn are the names of the declared pattern parameters.

This allows the deconstructor for Point to be written this way instead:

    // Functional style, but using parameter variables for convenience
    // to stash intermediate match result values as they are computed
    pattern Point(int x, int y) {
        x = that.x;
        y = that.y;
        match;
    }

It is a compile-time error if any of the parameters of a pattern declaration is not definitely assigned at any `match;` statement.

*Comparing These Imperative and Functional Approaches*

The two approaches are described from different perspectives, and suggest slightly different implementation techniques, but they allow the programmer to write exactly the same set of programs.
Assuming reasonable compiler optimization of chained assignments and unused local variables, the resulting machine code should be the same in either case. Whether or not to use explicit assignment to the pattern parameter variables becomes entirely a matter of taste. If the number of parameters is, say, 4 or less, I would probably prefer to write a pattern in the functional style, to cut down on clutter. But if the number of parameters is, say, 7 or more, I would probably prefer to write a pattern in the imperative style, to make it easier to see that each match result has been assigned to the correct parameter. In between, my mileage might vary.

It would seem, then, from these explanations and examples, that we could choose either of these models as the "official" explanation of how the bodies of pattern declarations work. I actually thought that for a little while. It does seem that either is easily derived from the other by introducing a plausible "special convenience feature".

But if we want to be able to use the SAP (single-abstract-pattern interfaces) feature that Brian introduces toward the end, in his section

    ## Pattern lambdas

so that patterns can be expressed as lambda expressions, then the functional approach is clearly the better choice. To see why, consider his example:

    interface Converter<T, U> {
        pattern(T t) convert(U u);
    }

    Converter<Integer, Short> c = i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
            match Converter.convert((short) i);
    };

This lambda expression is, of course, written in the functional style. But watch what happens if we try to write it in the imperative style:

    Converter<Integer, Short> c = i -> {
        if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) {
            u = (short) i;    // PROBLEM: u is not in scope
            match;
        }
    };

The problem is that the parameter name `u` is declared in the SAP interface `Converter` but is not in scope within the lambda expression. This, I think, is reason enough to regard the functional approach as the "official explanation"
of what is going on, because, as with methods and the way they bind method parameters to argument values, the baseline mechanism in Java for establishing correspondence between parameters and values is order within a sequence rather than matching of parameter names.

So, in the end, I recommend adopting the functional approach, but I also recommend adopting the "special convenience feature" so that the syntactic style of the imperative approach can be used in certain common cases.

--Guy

From ccherlin at gmail.com Thu Apr 4 21:29:35 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Thu, 4 Apr 2024 16:29:35 -0500
Subject: Member Patterns -- the bikeshed
In-Reply-To: <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>
Message-ID:

Good summary! As always, I have strong opinions and suggestions on how to improve just about everything.

# Use-site syntax

## Deconstruction

    if (let MyClassWithDeconstructor(var a, var b) = myThing) { ... }

    switch (myThing) {
        case MyClassWithDeconstructor(var a, var b) -> ...
    }

## Static

    if (MyClass.staticPattern(var a, var b) matches someThing) { ... }

    switch (someThing) {
        case MyClass.staticPattern(var a, var b) -> ...
    }

## Instance / Bound

If someThing's static type is not MyClass:

    if (MyClass.instancePattern(var a, var b) matches someThing) { ... }

    switch (someThing) {
        case MyClass.instancePattern(var a, var b) -> ...
    }

Or, if the object's static type is known to be MyClass:

    if (myThing.instancePattern(var a, var b)) { ... }

or

    if (instancePattern(var a, var b) matches myThing) { ... }

    switch (myThing) {
        case instancePattern(var a, var b) -> ...
    }

## Unbound

    PatternMatcher matcher = MyClass::myPattern;

    if (matcher.match(var a, var b) matches someThing) { ... }

    switch (someThing) {
        case matcher.match(var a, var b) -> ...
    }

# Identifying a member as a pattern

The "pattern" keyword is fine, as long as it's sufficiently contextual to not break existing usages. For example, `Pattern pattern = Pattern.compile("...")` should still be a valid field declaration.

# Position of match candidate

None of the above! As previously suggested:

    static pattern of(A a, B b : Optional)

# Naming of match candidate

I think always using "that" is fine.

# Body types

I agree with the suggested functional, implicit failure, return means fail.

However, I see no need for "match" to take "patternName"[1]. The more concise `match(BINDINGS)` is preferable to `match patternName(BINDINGS)`.

Indeed, `return` for failure and `return BINDINGS` for success would work just as well, except in the case of a degenerate pattern with no bindings, which could `return true` for success and `return` or `return false` for failure.[2]

# Exhaustiveness

The `case` modifier is fine, but the design should leave room for `case LABEL` or `case (LABEL1, LABEL2)` to delineate membership in exhaustive set(s), as a potential future enhancement.

[1] Java already has enough gratuitous repetition in class declarations.

[2] Sorry, Brian, but member patterns *are* basically "conditional var-return methods". The more the design leans into that, the better. In my opinion, of course.

# Bonus Gratuitous Perl Advocacy

Alternative spelling suggestion for `a matches b`: `a =~ b`
Alternative spelling suggestion for `!(a matches b)`: `a !~ b`

Cheers,
Clement Cherlin
From brian.goetz at oracle.com Thu Apr 4 22:55:32 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 4 Apr 2024 18:55:32 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To:
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>
Message-ID: <1962c204-cd98-4c10-ad9f-562acfbc333a@oracle.com>

>
> if (let MyClassWithDeconstructor(var a, var b) = myThing) { ... }

There may ultimately be room for an imperative match construct. However, we have learned (sometimes, we've had to learn the same lesson more than once) that "frills" should come later, and that thinking about them too early will distort thinking about the more important concepts. "Let" is clearly a frill, as it is largely equivalent to a switch, as you point out. So, maybe later. Or not.

> if (MyClass.staticPattern(var a, var b) matches someThing) { ... }

This ship sailed when we chose to overload `instanceof` for type patterns. We considered this then, chose to stick with instanceof, but even if we were convinced it was a bad choice (which I'm mostly not), trying to switch horses midstream is much worse.

> Or, if the object's static type is known to be MyClass:
>
> if (myThing.instancePattern(var a, var b)) { ... }

It's cute, but it doesn't work. When methods and patterns are paired (which is likely to be the most common outcome), the method invocation and pattern invocation look too similar, and, in the case of nullary methods/patterns, this approach falls apart completely as they are textually the same.

> # Identifying a member as a pattern
>
> The "pattern" keyword is fine, as long as it's sufficiently contextual
> to not break existing usages. For example, `Pattern pattern =
> Pattern.compile("...")` should still be a valid field declaration.

This is not our first rodeo.

> # Position of match candidate
>
> None of the above! As previously suggested:
>
> static pattern of(A a, B b : Optional)

Valid candidate.
> However, I see no need for "match" to take "patternName"[1]. The more
> concise `match(BINDINGS)` is preferable to `match patternName(BINDINGS)`.

Be very careful with pronouncements like "is preferable". I realize you probably mean "I find it preferable" (but also, there's probably a reason it didn't occur to you to phrase it this way), but the two mean vastly different things.

Score: 1 out of N :)

From gavin.bierman at oracle.com Fri Apr 5 14:01:54 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 5 Apr 2024 14:01:54 +0000
Subject: Update on String Templates (JEP 459)
In-Reply-To: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
References: <8EF82E14-DDF9-48A2-8CAE-7E7DC3C2AC9F@oracle.com>
Message-ID: <50609BB6-867A-4352-A3BF-08D22429ECE1@oracle.com>

Thanks for the extensive feedback following Brian's email. I think it's fair to say that there is still a broad range of opinions on exactly what form this feature should take.

The time has come for us to decide what to do about this feature with respect to JDK 23. Given that there is support for a change in the design but a lack of clear consensus on what that new design might look like, the prudent course of action is to (i) NOT ship the current design as a preview feature in JDK 23, and (ii) take our time continuing the design process. We all agree that our favourite language deserves us taking whatever time is needed to perfect our design! Preview features are exactly intended for this - for trying out mature designs before we commit to them for all time. Sometimes we are going to want to change our minds.

So, to be clear: there will be no string template feature, even with --enable-preview, in JDK 23.

For those of you experimenting with string templates in JDK 22 - please continue to do so, and share your experiences with us. This is the best form of feedback! (We really don't need, for example, reminders of what other languages do - we have done all that extensive research already.
But we don't know about your application; kick the tires and maybe you'll unearth something. Play around and send us your feedback - good or bad.)

Thanks,
Gavin

On 8 Mar 2024, at 18:35, Brian Goetz wrote:

Time to check in with where we are with String Templates. We've gone through two rounds of preview, and have received some feedback.

As a reminder, the primary goal of gathering feedback is to learn things about the design or implementation that we don't already know. This could be bug reports, experience reports, code review, careful analysis, novel alternatives, etc. And the best feedback usually comes from using the feature "in anger": trying to actually write code with it. ("Some people would prefer a different syntax" or "some people would prefer we focused on string interpolation only" fall squarely in the "things we already knew" camp.)

In the course of using this feature in the `jextract` project, we did learn quite a few things we didn't already know, and this was conclusive enough that it has motivated us to adjust our approach in this feature. Specifically, the role of processors is "outsized" to the value they offer, and, after further exploration, we now believe it is possible to achieve the goals of the feature without an explicit "processor" abstraction at all! This is a very positive development.

First, I want to affirm that the goals of the project have not changed. From JEP 459:

Goals

- Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
- Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
- Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
- Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
- Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
- Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.

Non-Goals

- It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
- It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.

Another thing that has not changed is our view on the syntax for embedding expressions. While many people did express the opinion of "why not 'just' do what Kotlin/Scala does", this issue was more than fully explored during the initial design round. (In fact, while syntax disagreements are often purely subjective, this one was far more clear: the $-syntax is objectively worse, and would be doubly so if injected into an existing language where there were already string literals in the wild. This has all been more than adequately covered elsewhere, so I won't rehash it here.)

Now, let's talk about what we do think should change: the role of processors and the StringTemplate type.

Processors were envisioned as a means to abstract the transformation of templates to their final form (whether string, or something else.) However, Java already has a well established means of abstracting behavior: methods. (In fact, a processor application can be viewed as merely a new syntax for a method call.)
Our experience using the feature highlighted the question: When converting a SQL query expressed as a template to the form required by the database (such as PreparedStatement), why do we need to say:

    DB."... template ..."

When we could use an ordinary Java library:

    Query q = Query.of("... template ...")

Indeed, one of the worst things about having processors in the language is that API designers are put in the difficult situation of not knowing whether to write a processor or an ordinary API, and often have to make that choice before the consequences are fully understood. (To add to this, processors raise similar questions at the use site.) But the real criticism here is that template capture and processing are complected, when they should be separate, composable features.

This motivated us to revisit some of the reasons why processors were so central to the initial design in the first place. And it turned out, this choice had been influenced, perhaps overly so, by early implementation experiments. (One of the background design goals was to enable expensive operations like `String::format` to be (much) cheaper. Without digressing too deeply on performance, String::format can be more than an order of magnitude worse than the equivalent concatenation operation, and this in turn sometimes motivates developers to use worse idioms for formatting. The FMT processor brought that cost back in line with the equivalent concatenation.) These early experiments biased the design towards needing to know the processor at the point of template capture, but upon reexamination we realized that there are other ways to achieve the desired performance goals without requiring processors to be known at capture time.

This, in turn, enabled us to revisit a point in the design space we had transited through earlier, where string templates were "just a new kind of literal" and the job performed by processors could instead be performed by ordinary APIs.
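To make the "capture is separate from processing" idea concrete, here is a minimal sketch in present-day Java. It is not the JDK's `StringTemplate` API and makes no claim about the final design: `TemplateSketch`, `Template`, `interpolate`, and `Query` are hypothetical names, and a captured template is modeled simply as n+1 text fragments surrounding n values. Two ordinary methods then give the same template two different meanings, which is the job a processor would otherwise have done.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "templates as plain values"; not the JDK API.
final class TemplateSketch {

    // A captured template: fragments.size() == values.size() + 1.
    record Template(List<String> fragments, List<Object> values) {
        Template {
            if (fragments.size() != values.size() + 1) {
                throw new IllegalArgumentException("need exactly one more fragment than values");
            }
        }
    }

    // One API decides that plain interpolation is acceptable for its domain.
    static String interpolate(Template t) {
        StringBuilder sb = new StringBuilder(t.fragments().get(0));
        for (int i = 0; i < t.values().size(); i++) {
            sb.append(t.values().get(i)).append(t.fragments().get(i + 1));
        }
        return sb.toString();
    }

    // Another API decides that values must never be spliced into the text:
    // each value becomes a `?` placeholder plus a separately carried parameter.
    record Query(String sql, List<Object> parameters) {
        static Query of(Template t) {
            return new Query(String.join("?", t.fragments()), new ArrayList<>(t.values()));
        }
    }

    public static void main(String[] args) {
        Template t = new Template(
                List.of("SELECT * FROM users WHERE name = ", ""),
                List.of("O'Brien"));
        System.out.println(interpolate(t));
        Query q = Query.of(t);
        System.out.println(q.sql() + " with parameters " + q.parameters());
    }
}
```

The point of the sketch is only that the choice between interpolation and parameterization lives in the library method that receives the template, not at the place where the template is written.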
At this point, a simpler design and implementation emerged that met the semantic, correctness, and performance goals: template literals ("Hello \{name}") are simply the literal form of StringTemplate:

    StringTemplate st = "Hello \{name}";

String and StringTemplate remain unrelated types. (We explored a number of ways to interconvert them, but they caused more trouble than they solved.) Processing of string templates, including interpolation, is done by ordinary APIs that deal in StringTemplate, aided by some clever implementation tricks to ensure good performance.

For APIs where interpolation is known to be safe in the domain, such as PrintWriter, APIs can make that choice on behalf of the domain, by providing overloads to embody this design choice:

    void println(String) { ... }
    void println(StringTemplate) { ... interpolate and delegate to println(String) ... }

The upshot is that for interpolation-safe APIs like println, we can use a template directly without giving up any safety:

    System.out.println("Hello \{name}");

In this example, the string template evaluates to StringTemplate, not String (no implicit interpolation), and chooses the StringTemplate overload of println, which in turn chooses how to process the template. This stays true to the design principle that interpolation is dangerous enough that it should be an explicit choice in the code, but it allows that choice to be made by libraries when the library is comfortable doing so.

Similarly, the FMT processor is replaced by an overload of String::format that interprets templates with embedded format specifiers (e.g., "%d"):

    String format(String formatString, Object... parameters) { ... same as today ... }
    String format(StringTemplate template) { ... equivalent of FMT ... }

And users can call this as:

    String s = String.format("Hello %12s\{name}");

Here, the String::format API has chosen to interpret string templates according to the rules previously specified in the FMT processor (not ordinary interpolation), but that choice is embedded in the library semantics so no further explicit choice at the use site is required. The user already chose to pass it to String::format; that's all the processing selection that is needed.

Where APIs do not express a choice of what template expansion means, users continue to be free to process them explicitly before passing them, using APIs that do (such as String::format or ordinary interpolation).

The result is:

- The need for use-site "goop" (previously, the processor name; now, static or instance methods to process a template) goes away entirely when dealing with libraries that are already template-friendly.
- Even with libraries that require use-site goop, it is no more intrusive than before, and can be reduced over time as APIs get with the program.
- StringTemplate is just another type that APIs can support if they want. The "DB" processor becomes an ordinary factory method that accepts a string template or an ordinary builder API.
- APIs now can have _more_ control over the timing and meaning of template processing, because we are not biasing so strongly towards early processing.
- It becomes easier to abstract over template processing (i.e., combine or manipulate templates as templates before processing)
- Interpolation remains an explicit choice, but ST-aware libraries can make this choice on behalf of the user.
- The language feature and API surface get considerably smaller, which is good. Core JDK APIs (e.g., println, format, exception constructors) get upgraded to work with string templates.

The remaining question that everyone is probably asking is: "so how do we do interpolation." The answer there is "ordinary library methods".
This might be a static method (String.join(StringTemplate)) or an instance method (template.join()), shed to be painted (but please, not right now).

This is a sketch of direction, so feel free to pose questions/comments on the direction. We'll discuss the details as we go.

From brian.goetz at oracle.com Fri Apr 5 15:18:24 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 5 Apr 2024 11:18:24 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To:
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com> <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>
Message-ID:

Thanks for this more detailed explanation.

Here's one thing I would like to drill into, which I hinted at in my mail the other day: why we find "imperative" comforting. I proposed two theories:

 - Looks like a constructor body in the mirror
 - Nominal association between binding and value is sometimes more clear than positional association

Indeed, your comment that "which I would choose likely depends on arity" suggests that you are mostly aiming at the latter.

Assuming this is most of the answer, it leads me to ask the following questions:

1. In a world where we had a more general mechanism for by-name {invocation,matching}, wouldn't we prefer that? Let's say that by-name invocation looked like:

    new Point(x: 1, y: 2)

The logical companion at the use site would be:

    case Point(x: var a, y: var b):

and the logical companion at the match site would be:

    matches Point(x: this.x, y: this.y)

While I'm not ready to commit to this feature, it seems to me the possibility that we could have a broader way to associate names with values at various parentheses-bounded constructs suggests that inventing a fresh one, with more limited applicability, might not be ideal.

2. Users can already simulate imperative with functional without a language feature, and indeed, can do so more flexibly because it's not all-or-nothing. Suppose we had the following imperative dtor:

    pattern Foo(int a, int b, int c, ... int z) {
        a = this.a;
        b = /* really complicated computation */
        c = this.c;
        ... more trivial bindings ...
        match;
    }

Here, one binding is complex and the rest are trivial. With functional, users can already do:

    pattern Foo(int a, int b, int c, ... int z) {
        var b = /* really complicated computation */
        match Foo(this.a, b, this.c, ...)
    }

Which is to say, if we need to use imperative logic to "outline" a complex calculation, the language provides features for doing so. Now, a dtor with so many bindings (and worse, all of the same type) is at risk for getting "out of sync", but at this point I refer back to argument #1, which is that someday we may be able to provide nominal context for these expressions to prevent such errors, at which point we can write:

        match Foo(a: this.a, b: b, c: this.c, ...)

without having to have two linguistic ways to write a matcher (with the attendant "style wars".)

On 4/4/2024 4:18 PM, Guy Steele wrote:
>
>> On Apr 4, 2024, at 1:11 PM, Brian Goetz wrote:
>>
>> There's obviously some more discussion coming about "what is a
>> pattern", but let me summarize the points on which we've asked for
>> syntax feedback, and make another call (I can't believe I have to
>> ask) for opinions here.
>> ...
>>
>> Body types. There is the broad choice of "imperative vs functional";
>> within that, there are choices about "implicit failure" or "implicit
>> success." There is also how we indicate success and failure. The
>> suggested approach is functional, implicit failure, return means
>> fail, success is indicated by `match patternName(BINDINGS)`.
> > The draft proposal that Brian sent out on March 29, in the section and > subsections with these headings: > > ## Body types > ### Success and failure > ### Implicit failure in the functional approach > ### Implicit success in the imperative approach > ### Imperative is a trap > ### Derive imperative from functional? > > laid out a version of the functional approach in which failure is > implicit, a version of the imperative approach in which success is > implicit, and an add-on to the functional approach that allows it to > be used in a way that is syntactically similar to the imperative > approach. But this was an incomplete presentation of a design space > that actually has more possibilities and potential symmetries. > > Here I undertake a complete retelling of an imperative and approach > and a functional approach and then compare them. An important > difference is that *I will assume a version of the imperative approach > in which failure, rather than success, is implicit*. The reason for > this is while we expect simple deconstructors always to succeed?and > that motivates us to make success implicit, to make deconstructors one > line shorter?that is not true for other kinds of patterns, and I think > it is good to mark pattern success explicitly no matter which approach > is used used. > > Here, then, is my retelling. As part of this retelling, I will explain > pattern-match success in terms of a new kind of reason for abrupt > completion: ?a successful match with match results (z1, z2, ?, zn)? > where each zk is some value. > > > *An Imperative Approach (in which failure is implicit)* > > The parameters of a pattern declaration are definitely unassigned at > the start of the body of the declaration. They may be given values > through ordinary assignment. For expository purposes, let the names of > the parameters be v1, v2, ?, vn. 
>
> If execution of the body completes normally, or completes abruptly for
> any reason other than a successful match, then the invocation of the
> pattern results in a failed match. In particular, the statement
> `return;` may be used in the body of a pattern declaration to indicate
> failure to match.
>
> The statement `match;` (or, if you prefer, `match patternName;`, but I
> will stick with the shorter form for now) indicates a successful
> match. It may be used only within the body of a pattern declaration.
> Execution of `match;` causes the body of the pattern declaration to
> complete abruptly, the reason being a successful match with match
> results (v1, v2, ..., vn); that is, the current values of the
> parameters v1, v2, ..., vn are used as the match results.
>
> It is a compile-time error if any of the parameters of a pattern
> declaration is not definitely assigned at any `match;` statement.
>
> /Optional restriction:/ It is a compile-time error if the body of a
> deconstructor pattern declaration can complete normally or contains a
> `return;` statement. (This restriction would imply that a
> deconstructor cannot fail to match. This restriction would not apply
> to static or instance patterns.)
>
> Here is the Point deconstructor written in the imperative style.
>
>     class Point {
>         int x, y;
>
>         Point(int x, int y) {
>             this.x = x;
>             this.y = y;
>         }
>
>         // Imperative style
>         pattern Point(int x, int y) {
>             x = that.x;
>             y = that.y;
>             match;    // Match success must be signaled explicitly
>         }
>     }
>
> In this imperative style, the deconstructor body looks like the
> "reverse" of the constructor body, with the sides of each assignment
> swapped and `that` substituted for `this` (and, of course, with the
> addition of a `match` statement to signal success).
>
> *Special convenience feature:* Another form of the `match` statement
> is provided for convenience:
>
>     match (e1, e2, ..., en);
>
> means
>
>     { var t1 = e1; var t2 = e2; ...; var tn = en;
>       v1 = t1; v2 = t2; ...; vn = tn;
>       match; }
>
> where temporaries t1, t2, ..., tn are fresh local variables that do
> not occur elsewhere in the program. (It is a compile-time error if the
> number of expressions does not match the number of parameters, or if
> for any k the type of ek is not assignment-compatible with the
> declared type of vk.)
>
> This allows the deconstructor for Point to be written this way instead
> if desired:
>
>     // Imperative style, but using the extended `match` statement to
>     // abbreviate a series of boilerplate assignments
>     pattern Point(int x, int y) {
>         match (that.x, that.y);
>     }
>
>
> *Alternatively, a Functional Approach (in which failure is likewise
> implicit)*
>
> [This is very close to what Brian proposed, but I express it in the
> same detailed terms that I used above to describe the variant
> imperative approach that assumes failure is implicit.]
>
> If execution of the body completes normally, or completes abruptly for
> any reason other than a successful match, then the invocation of the
> pattern results in a failed match. In particular, the statement
> `return;` may be used in the body of a pattern declaration to indicate
> failure to match.
>
> For expository purposes, let the names of the parameters of the
> pattern be v1, v2, ..., vn.
>
> The statement `match (e1, e2, ..., en);` (or, if you prefer, `match
> patternName(e1, e2, ..., en);`, but I will stick with the shorter form
> for now) indicates a successful match, using the values of the
> expressions e1, e2, ..., en. It may be used only within the body of a
> pattern declaration. Execution of `match (e1, e2, ..., en);` causes
> the body of the pattern declaration to complete abruptly, the reason
> being a successful match with match results (z1, z2, ..., zn), where
> z1, z2, ..., zn are the respective results of evaluating the
> expressions e1, e2, ..., en (working left-to-right). If evaluation of
> any ek completes abruptly, then evaluation of `match (e1, e2, ...,
> en);` completes abruptly for the same reason.
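
The functional reading described above can be approximated in current Java, which may help when comparing it with the imperative variant. What follows is only a minimal sketch, assuming Java 17 or later: `Optional.empty()` stands in for a failed match (the implicit-failure `return;`), and `Optional.of(...)` stands in for `match (e1, e2);`. The names `FunctionalMatchSketch`, `Pair`, and `commaSeparatedPair` are hypothetical, and this is not the proposed pattern syntax.

```java
import java.util.Optional;

// Hypothetical emulation of a functional-style pattern body in today's Java.
final class FunctionalMatchSketch {
    // Carrier for the match results (z1, z2) of a two-binding pattern.
    record Pair(int x, int y) {}

    // Emulates a pattern with match candidate `that` and bindings (int x, int y).
    static Optional<Pair> commaSeparatedPair(String that) {
        String[] parts = that.split(",");
        if (parts.length != 2) {
            return Optional.empty();             // plays the role of `return;` (no match)
        }
        try {
            int x = Integer.parseInt(parts[0]);  // e1, evaluated first
            int y = Integer.parseInt(parts[1]);  // e2, evaluated second
            return Optional.of(new Pair(x, y));  // plays the role of `match (e1, e2);`
        } catch (NumberFormatException e) {
            // In this sketch, abrupt completion of e1 or e2 is treated as a failed match.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(commaSeparatedPair("3,4"));
        System.out.println(commaSeparatedPair("3,4,5"));
    }
}
```

The match results are supplied positionally, in the same order as the declared bindings, which is the property the functional approach relies on.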
>
> It is a compile-time error if the number of expressions does not match
> the number of parameters, or if for any k the type of ek is not
> assignment-compatible with the declared type of vk.
>
> /Optional restriction:/ It is a compile-time error if the body of a
> deconstructor pattern declaration can complete normally or contains a
> `return;` statement. (This restriction would imply that a
> deconstructor cannot fail to match. This restriction would not apply
> to static or instance patterns.)
>
> Here is the Point deconstructor written in the functional style.
>
>     class Point {
>         int x, y;
>
>         Point(int x, int y) {
>             this.x = x;
>             this.y = y;
>         }
>
>         // Functional style
>         pattern Point(int x, int y) {
>             match (that.x, that.y);
>         }
>     }
>
> In the functional style, the `match` statement that signals success
> looks somewhat like an invocation that provides desired values
> corresponding to the declared parameters.
>
> The parameters of a pattern declaration are in fact declared local
> variables that are definitely unassigned at the start of the body of
> the declaration. They may be given values through ordinary assignment,
> but need not be; the compiler will not complain just because a pattern
> parameter goes unused as a local variable. One possible use for them
> is to hold values intended to be match results while other values are
> still being computed.
>
> *Special convenience feature:* Another form of the `match` statement
> is provided for convenience:
>
>     match;
>
> means
>
>     match (v1, v2, ..., vn);
>
> where v1, v2, ..., vn are the names of the declared pattern parameters.
>
> This allows the deconstructor for Point to be written this way instead:
>
>     // Functional style, but using parameter variables for convenience
>     // to stash intermediate match result values as they are computed
>     pattern Point(int x, int y) {
>         x = that.x;
>         y = that.y;
>         match;
>     }
>
> It is a compile-time error if any of the parameters of a pattern
> declaration is not definitely assigned at any `match;` statement.
>
>
> *Comparing These Imperative and Functional Approaches*
>
> The two approaches are described from different perspectives, and
> suggest slightly different implementation techniques, but *they allow
> the programmer to write exactly the same set of programs*. Assuming
> reasonable compiler optimization of chained assignments and unused
> local variables, *the resulting machine code should be the same in
> either case*. Whether or not to use explicit assignment to the pattern
> parameter variables becomes entirely a matter of taste. If the number
> of parameters is, say, 4 or fewer, I would probably prefer to write a
> pattern in the functional style, to cut down on clutter. But if the
> number of parameters is, say, 7 or more, I would probably prefer to
> write a pattern in the imperative style, to make it easier to see that
> each match result has been assigned to the correct parameter. In
> between, my mileage might vary.
>
> *It would seem, then, from these explanations and examples, that we
> could choose either of these models as the "official" explanation of
> how the bodies of pattern declarations work.* I actually thought that
> for a little while. It does seem that either is easily derived from
> the other by introducing a plausible "special convenience feature".
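
The claimed equivalence of the two styles can be illustrated in current Java. This is a minimal sketch under stated assumptions (Java 17 or later; `Optional` as a stand-in for match success or failure; the names `StyleComparisonSketch`, `Bindings`, `functional`, and `imperative` are hypothetical). Java's existing definite-assignment rule for locals plays the role of the proposed compile-time check at `match;`.

```java
import java.util.Optional;

// Hypothetical emulation: the same deconstruction written in both styles.
final class StyleComparisonSketch {
    record Point(int x, int y) {}
    record Bindings(int x, int y) {}   // carrier for the match results

    // Functional style: results are supplied positionally in one step.
    static Optional<Bindings> functional(Object that) {
        if (!(that instanceof Point p)) {
            return Optional.empty();                        // no match
        }
        return Optional.of(new Bindings(p.x(), p.y()));     // like `match (that.x, that.y);`
    }

    // Imperative style: binding variables are assigned first, then success is signaled.
    static Optional<Bindings> imperative(Object that) {
        int x;                                              // definitely unassigned at start
        int y;
        if (!(that instanceof Point p)) {
            return Optional.empty();                        // no match
        }
        x = p.x();
        y = p.y();
        // Like `match;`: javac would reject this line if x or y were not definitely assigned.
        return Optional.of(new Bindings(x, y));
    }

    public static void main(String[] args) {
        Object candidate = new Point(1, 2);
        System.out.println(functional(candidate).equals(imperative(candidate)));
    }
}
```

Both methods return equal results for every candidate, which mirrors the observation that the choice between the styles is a matter of taste rather than expressiveness.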
>
> *But* if we want to be able to use the SAP (single-abstract-pattern
> interfaces) feature that Brian introduces toward the end, in his section
>
>     ## Pattern lambdas
>
> so that patterns can be expressed as lambda expressions, then *the
> functional approach is clearly the better choice*. To see why,
> consider his example:
>
>     interface Converter<T, U> {
>         pattern(T t) convert(U u);
>     }
>
>     Converter<Integer, Short> c =
>         i -> {
>             if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE)
>                 match Converter.convert((short) i);
>         };
>
> This lambda expression is, of course, written in the functional style.
> But watch what happens if we try to write it in the imperative style:
>
>     Converter<Integer, Short> c =
>         i -> {
>             if (i >= Short.MIN_VALUE && i <= Short.MAX_VALUE) {
>                 u = (short) i;    // PROBLEM: u is not in scope
>                 match;
>             }
>         };
>
> The problem is that the parameter name `u` is declared in the SAP
> interface `Converter` but is not in scope within the lambda
> expression. This, I think, is reason enough to regard the functional
> approach as the "official explanation" of what is going on, because,
> as with methods and the way they bind method parameters to argument
> values, the baseline mechanism in Java for establishing correspondence
> between parameters and values is order within a sequence rather than
> matching of parameter names.
>
> So, in the end, I recommend adopting the functional approach, but I
> also recommend adopting the "special convenience feature" so that the
> syntactic style of the imperative approach can be used in certain
> common cases.
>
> --Guy

From forax at univ-mlv.fr Mon Apr 8 14:38:07 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 8 Apr 2024 16:38:07 +0200 (CEST)
Subject: Member Patterns -- the bikeshed
In-Reply-To: <58E4A261-5D40-4549-B406-34A7285755D2@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
 <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr>
 <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com>
 <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
 <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>
 <58E4A261-5D40-4549-B406-34A7285755D2@oracle.com>
Message-ID: <1786599334.52238847.1712587087868.JavaMail.zimbra@univ-eiffel.fr>

> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, April 4, 2024 5:04:30 PM
> Subject: Re: Member Patterns -- the bikeshed

>> On Apr 4, 2024, at 9:30 AM, forax at univ-mlv.fr wrote:

>> Patterns are not dual of methods; pattern deconstructors are dual of
>> methods, but this is a special case.

>> A pattern not only has a sequence of match results, it can have
>> parameters too.
>> For example, I may want to introduce an instance pattern asInteger() in
>> java.lang.String that works like Integer.parseInt() but does not match
>> instead of throwing an exception if the string does not represent an
>> integer. I may also want that pattern to decode hexadecimal, so like
>> Integer.parseInt(s, radix), I want my pattern to also take a radix as
>> parameter. In that case, my pattern asInteger() has an int value as
>> match result and has an int radix as parameter.

>> Using the carrier syntax, it's something like
>>   carrier(int value) asInteger(int radix) { ... }
>> or without the carrier syntax but with a keyword pattern, it's
>> something like
>>   pattern (int value) asInteger(int radix) { ... }

> I think what is missing is a necessary shift in terminology under
> Brian's new proposal: if parameters are needed for a pattern, then in
> effect you curry it.
> In this new model, `asInteger` is not a pattern; rather, it is a pattern
> factory, that is, a method that returns a pattern. This is possible
> because of the introduction of SAPs, so that a pattern can be expressed
> using lambda syntax.

This is super confusing.
For currying, you need at least one parameter, so the SAP needs to have
one parameter, but at the same time a SAP abstracts a pattern, and a
pattern cannot have a parameter?

> So we should speak of method `asInteger` as a pattern factory, and
> `asInteger(16)` as a pattern.

Do you mean asInteger().apply(16), or can a SAP, unlike a SAM, be called
directly using parentheses?

You also
- lose the ability to have several overloads: asInteger() and
  asInteger(16) cannot both be valid at the same time if you have
  currying (at least in Java).
- need a new way to distinguish between a pattern and a pattern
  factory: the former has bindings, the latter returns a SAP interface.

So we have a small/medium feature, "pattern method", but for some reason
I do not understand, instead of being a special kind of method, it's an
entirely new concept, pattern.

The body of a pattern works like the body of a method, but for some
reason I do not understand, for returning the binding values you have to
use the name of the pattern (except when it is a single abstract
pattern, because in that case it's the interface), and returning no
match is implicit. Some existing languages work like that, but not Java.

The parameters of a pattern, instead of being method parameters, for
some reason I do not understand use currying. Some languages work like
that, but not Java.

This is very confusing. Again, why is a pattern not a new kind of method
with a binding list instead of a return type, where in the body one can
return match(val1, val2, ...) or return nomatch?

> --Guy

Rémi
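
The "pattern factory" reading discussed in this exchange can be approximated in current Java, which may make the disagreement more concrete. This is a minimal sketch under stated assumptions: the functional interface `IntPattern` is a hypothetical stand-in for a SAP interface, `OptionalInt` is a hypothetical carrier for the single match result, and none of these names come from the proposal. It illustrates only the currying idea (a method taking `radix` that returns a matcher); it does not settle how such a factory would be used in a `case` label.

```java
import java.util.OptionalInt;

// Hypothetical emulation of a parameterized pattern as a "pattern factory".
final class PatternFactorySketch {
    // Stand-in for a single-abstract-pattern interface with one int binding.
    @FunctionalInterface
    interface IntPattern {
        OptionalInt match(String that);   // empty result means "no match"
    }

    // The factory: takes the input parameter (radix) and returns a pattern-like object.
    static IntPattern asInteger(int radix) {
        return that -> {
            try {
                return OptionalInt.of(Integer.parseInt(that, radix));
            } catch (NumberFormatException e) {
                return OptionalInt.empty();   // fail to match instead of throwing
            }
        };
    }

    public static void main(String[] args) {
        IntPattern hex = asInteger(16);
        System.out.println(hex.match("ff"));
        System.out.println(asInteger(10).match("ff"));
    }
}
```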

From brian.goetz at oracle.com Mon Apr 8 14:51:27 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 8 Apr 2024 10:51:27 -0400
Subject: Member Patterns -- the bikeshed
In-Reply-To: <1786599334.52238847.1712587087868.JavaMail.zimbra@univ-eiffel.fr>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
 <407911070.46485631.1712139695181.JavaMail.zimbra@univ-eiffel.fr>
 <50039898-ad37-4d62-9165-f48954f47ecc@oracle.com>
 <152426466.46744752.1712154202764.JavaMail.zimbra@univ-eiffel.fr>
 <675389563.47530211.1712237440439.JavaMail.zimbra@univ-eiffel.fr>
 <58E4A261-5D40-4549-B406-34A7285755D2@oracle.com>
 <1786599334.52238847.1712587087868.JavaMail.zimbra@univ-eiffel.fr>

> This is super confusing.

It is confusing in part because you are jumping ahead to one of the
things that I specifically said "we're going to talk about that later",
which is parameterized patterns. (As I've reminded before, jumping ahead
when you are specifically asked not to means that you forfeit your right
to have an opinion about the topic actually being discussed...)

So, I'll ask again: please stay focused on the discussion at hand,
rather than trying to redesign the next part. (And please, please,
please, stop using words like "not", "can't", "impossible", "doesn't
work", etc., incorrectly. It's OK to not understand something fully. It
is not OK to not understand it fully and declare it to be wrong.)

I will take the feedback "I wish we had parameterized patterns now", and
I understand why that is important to you now. Now, let's get back to
the topic being discussed.

(The irony here is deadly. For several years, I've been talking model,
you've been asking "when can we talk syntax", and I've been saying
"wait." Now, finally I am *asking* for a syntax discussion, and *now*
you want to revisit model decisions?)

If you have no significant syntax opinions here (other than "I prefer
the method-style declaration"), that's fine, just say so and we can move
on.

From guy.steele at oracle.com Tue Apr 16 01:25:45 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 16 Apr 2024 01:25:45 +0000
Subject: Member Patterns -- the bikeshed
In-Reply-To: <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>
References: <989c8197-659a-4fe2-a432-5b4adfcfa4cb@oracle.com>
 <10ff8e67-366e-429f-a0ec-311a05a9b838@oracle.com>

Well, there has not been a lot of traffic during the last week and a
half on these topics ("syntax"), so I will jump in and offer my two
cents' worth.

On Apr 4, 2024, at 1:11 PM, Brian Goetz wrote:

> There's obviously some more discussion coming about "what is a
> pattern", but let me summarize the points on which we've asked for
> syntax feedback, and make another call (I can't believe I have to ask)
> for opinions here.
>
> Use-site syntax. The document catalogs the use-site syntax for
> deconstruction, static, and bound/unbound instance pattern uses. I
> don't think any of these are controversial. (There are details to be
> captured, such as qualifier inference, but I think the overall scheme
> here is sound.)

I do, too.

> Identifying a member as a pattern. The proposed approach is a
> "pattern" keyword for all pattern kinds, but there are other choices.

I like using `pattern` as a (necessarily contextual) keyword.

> Method-style (multiple return) vs inverse-style. I thought the
> document made it entirely clear that the method-style declaration was
> going to be a loser, but I guess we had more work to do there.

Inverse style is good.

> Position of match candidate. Here, there is a reasonable menu of
> choices:
>
>     static <T> pattern Optional<T> of(T t)
>     static <T> pattern<Optional<T>> of(T t)
>     static <T> pattern(Optional<T> that) of(T t)
>     static <T> pattern of(T t) for Optional<T>

I don't feel strongly about this choice.

> Naming of match candidate. The document proposes to use `that`
> uniformly.

I think using `that` uniformly is a good choice.

> Body types. There is the broad choice of "imperative vs functional";
> within that, there are choices about "implicit failure" or "implicit
> success." There is also how we indicate success and failure. The
> suggested approach is functional, implicit failure, return means fail,
> success is indicated by `match patternName(BINDINGS)`.

I am all for this. If we choose to provide the "special convenience
feature" to support the imperative style, I would suggest that it be
simply `match;` and not `match patternName;` because its reason for
existence is to abbreviate.

> Exhaustiveness. The document proposes `case` as a modifier for
> patterns that form exhaustive sets. This isn't great, but note that
> this feature is likely to be used less often than we probably think,
> as new code will likely steer towards sealed classes and
> deconstruction patterns.

Agreed, not great, but I think we could live with it.

From angelos.bimpoudis at oracle.com Fri Apr 19 13:05:42 2024
From: angelos.bimpoudis at oracle.com (Angelos Bimpoudis)
Date: Fri, 19 Apr 2024 13:05:42 +0000
Subject: Exception handling in switch (Preview)

Dear spec experts,

A while ago we discussed on this list enhancing the `switch` construct
to support `case` labels that match exceptions thrown during evaluation
of the selector expression. A draft JEP for this feature is now
available at:

https://bugs.openjdk.org/browse/JDK-8323658

Please take a look at this new JEP and give us your feedback.

Thanks,
Aggelos

From guy.steele at oracle.com Fri Apr 19 14:10:31 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 19 Apr 2024 14:10:31 +0000
Subject: Exception handling in switch (Preview)
Message-ID: <6A01DA6B-51F8-4CC8-8EC2-1E3226DF47F6@oracle.com>

I just read the JEP all the way through, and it all looks good to me.

--Guy

From alex.buckley at oracle.com Fri Apr 19 17:45:42 2024
From: alex.buckley at oracle.com (Alex Buckley)
Date: Fri, 19 Apr 2024 10:45:42 -0700
Subject: Exception handling in switch (Preview)
Message-ID: <9475b481-888c-4901-9575-6c0987f8fe8c@oracle.com>

On 4/19/2024 6:05 AM, Angelos Bimpoudis wrote:
> A while ago we discussed on this list enhancing the `switch` construct
> to support `case` labels that match exceptions thrown during
> evaluation of the selector expression. A draft JEP for this feature is
> now available at:
>
> https://bugs.openjdk.org/browse/JDK-8323658

I tightened things up throughout and added myself as a reviewer.

Alex

From davidalayachew at gmail.com Fri Apr 19 19:49:34 2024
From: davidalayachew at gmail.com (David Alayachew)
Date: Fri, 19 Apr 2024 15:49:34 -0400
Subject: Exception handling in switch (Preview)
In-Reply-To: <9475b481-888c-4901-9575-6c0987f8fe8c@oracle.com>
References: <9475b481-888c-4901-9575-6c0987f8fe8c@oracle.com>

Looks really good. I am especially grateful for the part about grouping
related cases together.

With this feature, switch has officially reached a bedrock status. It
has effectively become our if statement for expressions.

Incredibly excited to play with this when it comes out.

From amaembo at gmail.com Sat Apr 20 08:00:48 2024
From: amaembo at gmail.com (Tagir Valeev)
Date: Sat, 20 Apr 2024 10:00:48 +0200
Subject: Exception handling in switch (Preview)

Dear experts,

looking into this proposal, I'm really not convinced that Java needs it.
We already have try-catch statements, and it sounds strange to provide
another way to express the same semantics. I don't see what the new
construct adds, aside from a bit of syntactic sugar. On the other hand,
it creates a new source of subtle bugs, especially when exceptions are
unchecked. E.g., consider:

    switch(a.b().c().d()) {
      case ...
      case throws RuntimeException ex -> handle(ex);
    }

Now, one may want to refactor the code, extracting a.b(), a.b().c(), or
the whole a.b().c().d() to a separate variable for clarity, or to avoid
a long line. This action is usually safe, and it was totally safe in
switches so far (even with patterns and case null). Now, it's not safe,
as exceptions thrown from the extracted part are not handled by the
'case throws' branch.

I don't see a good way to perform this refactoring in a semantically
equivalent way. The only possibility I see is to duplicate the exception
handler in the external catch:

    try {
      var ab = a.b();
      switch(ab.c().d()) {
        case ...
        case throws RuntimeException ex -> handle(ex);
      }
    }
    catch(RuntimeException ex) {
      handle(ex); // duplicated code
    }

As the switch selector does not allow using several expressions or
declaring new variables, extract/inline refactorings can easily become
very painful, or cause subtle bugs if not performed correctly.
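
The refactoring hazard described above can be reproduced in current Java with a small emulation. This is a minimal sketch under stated assumptions: `selectOrHandle` is a hypothetical helper that stands in for a switch with a `case throws RuntimeException` branch (only exceptions thrown while evaluating the selector reach the handler), and `b`, `c`, `before`, and `afterNaiveExtraction` are illustrative names, not part of the JEP.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical emulation of the scoping of a `case throws` branch.
final class CaseThrowsRefactoringSketch {
    // Evaluates the selector; only exceptions thrown by the selector are handled.
    static <T> T selectOrHandle(Supplier<T> selector, Function<RuntimeException, T> onThrow) {
        T value;
        try {
            value = selector.get();
        } catch (RuntimeException ex) {
            return onThrow.apply(ex);
        }
        return value;
    }

    // Stand-in for a.b(): fails on empty input.
    static String b(String a) {
        if (a.isEmpty()) {
            throw new IllegalStateException("b() failed");
        }
        return a;
    }

    // Stand-in for .c(): may throw NumberFormatException.
    static int c(String ab) {
        return Integer.parseInt(ab);
    }

    // Before refactoring: the whole chain is inside the selector.
    static String before(String a) {
        return selectOrHandle(() -> "ok:" + c(b(a)), ex -> "handled");
    }

    // After a naive "extract variable": b(a) now runs outside the selector,
    // so its exception is no longer seen by the handler.
    static String afterNaiveExtraction(String a) {
        String ab = b(a);
        return selectOrHandle(() -> "ok:" + c(ab), ex -> "handled");
    }

    public static void main(String[] args) {
        System.out.println(before(""));
        try {
            System.out.println(afterNaiveExtraction(""));
        } catch (IllegalStateException ex) {
            System.out.println("escaped: " + ex.getMessage());
        }
    }
}
```

With an empty input, `before` routes the failure of `b` to the handler, while `afterNaiveExtraction` lets the same exception propagate to the caller, which is the behavioral change the message warns about.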
Note that it's not a problem inside a usual try-catch statement (*), as
you can easily add or remove more statements inside the try-body.

(*) Except resource declaration, but it's rarely a problem, and in some
cases it's still possible to extract parts as separate resources,
because you can declare several of them.

I think, instead of repurposing switch to be another form of try-catch,
we could add more love to try-catch, allowing it to be an expression
with yields in branches. The proposed JEP allows something like this:

    Integer toIntOrNull(String s) {
      return switch(Integer.parseInt(s)) {
        case int i -> i;
        case throws NumberFormatException _ -> null;
      };
    }

But we are still limited by a single expression in the selector. An
alternative would be

    Integer toIntOrNull(String s) {
      return try { yield Integer.parseInt(s); }
      catch(NumberFormatException _) { yield null; };
    }

Here, all kinds of refactorings are possible. And we actually don't need
to express pattern matching, because we essentially don't need any
pattern matching.

Also, note that some of the situations which are usually solved with
exception handling in modern Java (e.g. Pattern.compile ->
PatternSyntaxException, or UUID.fromString -> IllegalArgumentException,
or Integer.parseInt above) will be covered in future by member patterns.
So probably if we concentrate more on member patterns, people will need
much less exception handling in business logic, and such an enhancement
will be not so useful anyway? Speaking about the sample from the JEP,
can we imagine something like this in the future (sic!) Java?

    switch(future) {
      case Future.cancelled() -> ...
      case Future.interrupted() -> ...
      case Future.failed(Exception ex) -> ... // no need to unwrap ExecutionException manually
      case Future.successful(Box b) -> ...
    }

One more note about the JEP text. It's unclear to me whether 'case
throws' branches could catch a residual result.
More precisely, if MatchException happens, or NullPointerException happens (selector evaluated to null, but there's no 'case null'), can these exceptions be caught by the 'case throws' branches in the same switch? With best regards, Tagir Valeev. On Fri, Apr 19, 2024 at 3:05?PM Angelos Bimpoudis < angelos.bimpoudis at oracle.com> wrote: > Dear spec experts, > > A while ago we discussed on this list about enhancing the switch? construct > to > support case? labels that match exceptions thrown during evaluation of the > selector expression. A draft JEP for this feature is now available at: > > https://bugs.openjdk.org/browse/JDK-8323658 > > Please take a look at this new JEP and give us your feedback. > > Thanks, > Aggelos > -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Mon Apr 22 09:15:13 2024 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 22 Apr 2024 11:15:13 +0200 (CEST) Subject: Exception handling in switch (Preview) In-Reply-To: References: Message-ID: <2132026114.10511446.1713777313317.JavaMail.zimbra@univ-eiffel.fr> I agree with Tagir here, a lot of exceptions come from the fact that we do not have pattern methods yet (Integer.parseInt, etc) and a try expression (try-with-resources expression, try-catch expression or try-finally expression) seems to compose better than a "case throws". regards, R?mi > From: "Tagir Valeev" > To: "Angelos Bimpoudis" > Cc: "amber-spec-experts" > Sent: Saturday, April 20, 2024 10:00:48 AM > Subject: Re: Exception handling in switch (Preview) > Dear experts, > looking into this proposal, I'm really not convinced that Java needs it. We > already have try-catch statements, and it sounds strange to provide another way > to express the same semantics. I don't see what the new construct adds, aside > from a bit of syntactic sugar. On the other hand, it creates a new source of > subtle bugs, especially when exceptions are unchecked. 
E.g., consider: > switch(a.b().c().d()) { > case ... > case throws RuntimeException ex -> handle(ex); > } > Now, one may want to refactor the code, extracting a.b(), a.b().c(), or the > whole a.b().c().d() to a separate variable for clarity, or to avoid a long > line. > This action is usually safe, and it was totally safe in switches so far (even > with patterns and case null). Now, it's not safe, as exceptions thrown from the > extracted part are not handled by the 'case throws' branch. > I don't see a good way to perform this refactoring in a semantically equivalent > way. The only possibility I see is to duplicate the exception handler in the > external catch: > try { > var ab = a.b(); > switch(ab.c().d()) { > case ... > case throws RuntimeException ex -> handle(ex); > } > } > catch(RuntimeException ex) { > handle(ex); // duplicated code > } > As switch selector does not allow using several expressions or to declare new > variables, extract/inline refactorings can easily become very painful, or cause > subtle bugs if not performed correctly. > Note that it's not a problem inside usual try-catch statement (*), as you can > easily add or remove more statements inside the try-body. > (*) Except resource declaration, but it's rarely a problem, and in some cases > it's still possible to extract parts as separate resources, because you can > declare several of them > I think, instead of repurposing switch to be another form of try-catch we could > add more love to try-catch allowing it to be an expression with yields in > branches. The proposed JEP allows something like this: > Integer toIntOrNull(String s) { > return switch(Integer.parseInt(s)) { > case int i -> i; > case throws NumberFormatException _ -> null; > } > } > But we are still limited by a single expression in the selector. 
An alternative > would be > Integer toIntOrNull(String s) { > return try { yield Integer.parseInt(s); } > catch(NumberFormatException _) { yield null; }; > } > Here, all kinds of refactorings are possible. And we actually don't need to > express pattern matching, because we essentially don't need any pattern > matching. > Also, note that some of the situations which are usually solved with exception > handling in modern Java (e.g. Pattern.compile -> PatternSyntaxException, or > UUID.fromString -> IllegalArgumentException, or Integer.parseInt above) will be > covered in future by member patterns. So probably if we concentrate more on > member patterns, people will need much less exception handling in business > logic, and such an enhancement will be not so useful anyway? Speaking about the > sample from the JEP, can we imagine something like this in the future (sic!) > Java? > switch(future) { > case Future.cancelled() -> ... > case Future.interrupted() -> ... > case Future.failed(Exception ex) -> ... // no need to unwrap ExecutionException > manually > case Future.successful(Box b) -> ... > } > One more note about the JEP text. It's unclear for me whether 'case throw' > branches could catch a residual result. More precisely, if MatchException > happens, or NullPointerException happens (selector evaluated to null, but > there's no 'case null'), can these exceptions be caught by the 'case throws' > branches in the same switch? > With best regards, > Tagir Valeev. > On Fri, Apr 19, 2024 at 3:05 PM Angelos Bimpoudis < [ > mailto:angelos.bimpoudis at oracle.com | angelos.bimpoudis at oracle.com ] > wrote: >> Dear spec experts, >> A while ago we discussed on this list about enhancing the switch ? construct to >> support case ? labels that match exceptions thrown during evaluation of the >> selector expression. 
A draft JEP for this feature is now available at: >> [ https://bugs.openjdk.org/browse/JDK-8323658 | >> https://bugs.openjdk.org/browse/JDK-8323658 ] >> Please take a look at this new JEP and give us your feedback. >> Thanks, >> Aggelos -------------- next part -------------- An HTML attachment was scrubbed... URL: From dan.heidinga at oracle.com Mon Apr 22 14:05:08 2024 From: dan.heidinga at oracle.com (Dan Heidinga) Date: Mon, 22 Apr 2024 14:05:08 +0000 Subject: Exception handling in switch (Preview) In-Reply-To: References: Message-ID: My reading of the JEP is that this is that it?s about treating all the possible results of evaluating the selector uniformly ? both the normal results and the exceptional ones. Being able to treat the results uniformly makes it easier to be precise about what exceptions should be handled in each way. With today?s try/catch, there is no way to distinguish handling of an exception thrown by the selector from one due to the case handling code. This JEP allows those two cases to be distinguished. As to refactoring, my impression is that the new case throws is a win as it allows the handling code to be scoped closer to the potential source of the exception. More responses in line. From: amber-spec-experts on behalf of Tagir Valeev Date: Saturday, April 20, 2024 at 4:01?AM To: Angelos Bimpoudis Cc: amber-spec-experts Subject: Re: Exception handling in switch (Preview) Dear experts, looking into this proposal, I'm really not convinced that Java needs it. We already have try-catch statements, and it sounds strange to provide another way to express the same semantics. I don't see what the new construct adds, aside from a bit of syntactic sugar. On the other hand, it creates a new source of subtle bugs, especially when exceptions are unchecked. E.g., consider: switch(a.b().c().d()) { case ... 
case throws RuntimeException ex -> handle(ex); } Now, one may want to refactor the code, extracting a.b(), a.b().c(), or the whole a.b().c().d() to a separate variable for clarity, or to avoid a long line. This action is usually safe, and it was totally safe in switches so far (even with patterns and case null). Now, it's not safe, as exceptions thrown from the extracted part are not handled by the 'case throws' branch. I don't see a good way to perform this refactoring in a semantically equivalent way. The only possibility I see is to duplicate the exception handler in the external catch: try { var ab = a.b(); switch(ab.c().d()) { case ... case throws RuntimeException ex -> handle(ex); } } catch(RuntimeException ex) { handle(ex); // duplicated code } This doesn?t mean quite the same thing as the new try/catch block will catch RuntimeExceptions thrown by the ?case ?? in addition to those thrown by the selector (?ab.c().d()?). I think we can use more switches to make the refactoring: var ab = switch(a.b()) { case Object o -> o; // need more precise type than Object case throws RuntimeException ex -> handle(ex); } switch(ab.c().d()) { case ... case throws RuntimeException ex -> handle(ex); } As switch selector does not allow using several expressions or to declare new variables, extract/inline refactorings can easily become very painful, or cause subtle bugs if not performed correctly. Note that it's not a problem inside usual try-catch statement (*), as you can easily add or remove more statements inside the try-body. (*) Except resource declaration, but it's rarely a problem, and in some cases it's still possible to extract parts as separate resources, because you can declare several of them I think, instead of repurposing switch to be another form of try-catch we could add more love to try-catch allowing it to be an expression with yields in branches. 
> The proposed JEP allows something like this:
>
> Integer toIntOrNull(String s) {
>     return switch(Integer.parseInt(s)) {
>         case int i -> i;
>         case throws NumberFormatException _ -> null;
>     }
> }
>
> But we are still limited by a single expression in the selector. An
> alternative would be
>
> Integer toIntOrNull(String s) {
>     return try { yield Integer.parseInt(s); }
>     catch(NumberFormatException _) { yield null; };
> }
>
> Here, all kinds of refactorings are possible. And we actually don't need
> to express pattern matching, because we essentially don't need any pattern
> matching.

I'm not sure I follow the point of making try/catch an expression here. We
can write this code today with return:

    Integer toIntOrNull(String s) {
        try {
            return Integer.parseInt(s);
        } catch(NumberFormatException _) {
            return null;
        }
    }

> Also, note that some of the situations which are usually solved with
> exception handling in modern Java (e.g. Pattern.compile ->
> PatternSyntaxException, or UUID.fromString -> IllegalArgumentException, or
> Integer.parseInt above) will be covered in future by member patterns. So
> probably if we concentrate more on member patterns, people will need much
> less exception handling in business logic, and such an enhancement will
> not be so useful anyway? Speaking about the sample from the JEP, can we
> imagine something like this in the future (sic!) Java?
>
> switch(future) {
>     case Future.cancelled() -> ...
>     case Future.interrupted() -> ...
>     case Future.failed(Exception ex) -> ... // no need to unwrap ExecutionException manually
>     case Future.successful(Box b) -> ...
> }
>
> One more note about the JEP text. It's unclear to me whether 'case throws'
> branches could catch a residual result. More precisely, if MatchException
> happens, or NullPointerException happens (selector evaluated to null, but
> there's no 'case null'), can these exceptions be caught by the 'case
> throws' branches in the same switch?
I think if the selector evaluates to "null", then it is the switch, not the
selector, that throws the NPE, so I wouldn't expect a "case throws"
branch for NullPointerException to handle that exception. Similarly, a
MatchException isn't thrown by the selector; it's a result of the exhaustive
switch not matching (i.e., remainder handling), so I'd similarly expect it
not to trigger a "case throws MatchException". But happy to be corrected
here.

--Dan

> With best regards,
> Tagir Valeev.
>
> On Fri, Apr 19, 2024 at 3:05 PM Angelos Bimpoudis wrote:
>
>> Dear spec experts,
>>
>> A while ago we discussed on this list about enhancing the switch
>> construct to support case labels that match exceptions thrown during
>> evaluation of the selector expression. A draft JEP for this feature is
>> now available at: https://bugs.openjdk.org/browse/JDK-8323658
>>
>> Please take a look at this new JEP and give us your feedback.
>>
>> Thanks,
>> Aggelos

From pablogrisafi1975 at gmail.com Mon Apr 22 14:55:24 2024
From: pablogrisafi1975 at gmail.com (Pablo Grisafi)
Date: Mon, 22 Apr 2024 11:55:24 -0300
Subject: Exception handling in switch (Preview)

Dear experts,

Tagir Valeev does not like the proposal and, among other things, suggests:

> But we are still limited by a single expression in the selector. An
> alternative would be
> Integer toIntOrNull(String s) {
>     return try { yield Integer.parseInt(s); }
>     catch(NumberFormatException _) { yield null; };
> }

Why not both? I do like the proposal, but I also like the yield-in-try
option Tagir Valeev proposes. In fact, why can't we have yield in any block?
That would give us if-expressions, try-expressions and even simple
block-expressions:

    int a = {
        var x = ....
        var y = ....
        yield x + y;
    }

Thanks for your time,
Pablo Grisafi
pablogrisafi1975 at gmail.com

From brian.goetz at oracle.com Tue Apr 23 06:56:38 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 23 Apr 2024 06:56:38 +0000
Subject: Exception handling in switch (Preview)
Message-ID: <577E666C-5A6F-45B9-8782-50D4CEEE563B@oracle.com>

To the question "does Java need this?", well, of course we don't *need* it;
we do have try-catch statements. But as Dan points out, the main challenge
of using methods that both return a value and throw exceptions is that we
cannot handle all the results uniformly. And as the JEP points out, backing
off to try-catch is doubly painful: not only do we have to use two
constructs, but we are back to statement-land, which is far more error prone
and less composable than expressions.

As to "why don't we just make try-catch an expression", well, that's where
we started with this feature exploration. It turns out to just be too weak
to be useful. The main constraint is that the try and catch parts have to
yield the same type, and having to produce a "default" value of that type
from the catch arms is just too constraining. If you try some nontrivial
examples this becomes clear pretty quickly.

But there is one criticism of this feature that is, I think, at the root of
what you are getting at, which is that having effect cases in switch
produces only a shallow-ish unification (or, alternately, that it is not the
primitive). Having effect cases lets us deal with all the consequences of
evaluating the selector uniformly in one place, but like try-catch, you have
to deal with them _right there_. Whereas with a Try monad, you could capture
the Try and then process it by passing it to another method, putting it on a
queue and letting some other thread process it, etc.
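
The "capture the result and process it elsewhere" idea can be approximated
in current Java. The following is only a sketch, assuming Java 21 (sealed
interfaces plus record patterns in switch); the names `Try`, `Success`,
`Failure`, `Try.of` and `TryDemo` are hypothetical and are not part of the
JDK or of the draft JEP.

```java
import java.util.concurrent.Callable;

// Hypothetical Try type, used only to illustrate the idea.
sealed interface Try<T> permits Success, Failure {
    // Run the computation once and capture either its value or its exception.
    static <T> Try<T> of(Callable<T> computation) {
        try {
            return new Success<>(computation.call());
        } catch (Exception e) {
            return new Failure<>(e);
        }
    }
}

record Success<T>(T value) implements Try<T> {}
record Failure<T>(Exception error) implements Try<T> {}

class TryDemo {
    // The captured result is an ordinary value, so it can be handled far away
    // from the place where the exception was thrown.
    static String describe(Try<Integer> result) {
        return switch (result) {
            case Success<Integer>(Integer value) -> "parsed " + value;
            case Failure<Integer>(NumberFormatException e) -> "not a number";
            case Failure<Integer>(Exception e) -> "failed: " + e.getClass().getSimpleName();
        };
    }

    public static void main(String[] args) {
        Try<Integer> ok = Try.of(() -> Integer.parseInt("42"));
        Try<Integer> bad = Try.of(() -> Integer.parseInt("forty-two"));
        System.out.println(describe(ok));  // prints "parsed 42"
        System.out.println(describe(bad)); // prints "not a number"
    }
}
```

Unlike a `case throws` clause, the `Failure` value here can be stored,
queued, or passed to another method before anyone decides how to handle it.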
If we were to excavate to the bottom, then we would likely want `try e` to
evaluate to a Try monad, at which point the current proposal is a sugary
representation of:

    switch (try e) {
        case Success(P1) -> ...
        case Success(P2) -> ...
        case Failure(E1) -> ...
    }

We explored this point as well in the exploration, and backed off. But we
can do this in either order; if we have a `try e` primitive that evaluates
to a Try, then we can retroactively redefine a switch with `case throws`
clauses to be sugar for the above.

I will think further about the pattern matching connection you propose.

From forax at univ-mlv.fr Tue Apr 23 16:03:40 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 23 Apr 2024 18:03:40 +0200 (CEST)
Subject: New candidate JEP: 468: Derived Record Creation (Preview)
References: <20240228200401.D42EB6C2F78@eggemoggin.niobe.net>
Message-ID: <406969038.11938212.1713888220443.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "attila kelemen85"
> Cc: "amber-dev", "Gavin Bierman"
> Sent: Tuesday, April 23, 2024 3:34:16 PM
> Subject: Re: New candidate JEP: 468: Derived Record Creation (Preview)

[promoted to amber-spec-experts because I think this discussion is quite
interesting]

> So, a further thing to keep in mind is that currently, adding fields to
> records is not even source compatible to begin with. For example, if we
> have
>     record Point(int x, int y) { }
> And a client uses it in a pattern match:
>     case Point(int x, int y):
> And then we add an `int z` component, the client will break.
> (When we are able to declare deconstruction patterns, such a migration
> could include an XY pattern as well as constructor, but we are not there
> yet.)
> I think your mail rests on the assumption that it should be fine to modify
> records willy-nilly and expect compatibility without a
> recompile-the-world, but I think this is a questionable assumption.

This is a reasonable assumption. Java is well known to be backward
compatible, so people will expect any new feature to be backward compatible
too. As an example, the way enums were compiled in 1.5 was not backward
compatible if the enums were used in a switch. Later, the translation
strategy was changed to be backward compatible.

One possible quick fix is to restrict the access. For sealed classes, we
have restricted the permitted subclasses to be in the same package/same
module to avoid such separate compilation issues. Do you think that
introducing the same restriction on derived record creation is a good idea?

> Records will likely have features that ordinary classes do not yet have
> access to for a while, making such changes risky.

Yes, the idea of derived record creation is based on the fact that there is
a canonical constructor + a way to deconstruct using accessors (for now). I
do not see classes having canonical constructors in the future, so yes, this
feature is limited to records.

Rémi

>> On Apr 20, 2024, at 5:49 PM, Attila Kelemen
>> <attila.kelemen85 at gmail.com> wrote:
>> I have a backward compatibility concern about this JEP. Consider that I
>> have the following record:
>> `record MyRecord(int x, int y) { }`
>> One day I realize that I need that 3rd property, which I want to add in a
>> backward compatible way, which I will do the following way:
>> ```
>> record MyRecord(int x, int y, int z) {
>>     public MyRecord(int x, int y) {
>>         this(x, y, 0);
>>     }
>> }
>> ```
>> As far as I understand, this will still remain binary compatible.
>> However, if I didn't miss anything, then this JEP makes it non-source
>> compatible, because someone might have written the following code:
>> ```
>> var obj1 = new MyRecord(1, 2);
>> int z = 26;
>> var obj2 = obj1 with { y = z; }
>> ```
>> If this code is compiled again, then it will compile without error, but
>> while in the first version `obj2.y == 26`, now `obj2.y == 0`. This seems
>> rather nasty to me because I was once bitten by this in Gradle (I can't
>> recall if it was Groovy or Kotlin, but it doesn't really matter), where
>> this is a threat, and you have to be very careful adding a new property
>> in plugin extensions with a too generic name. Even though Gradle scripts
>> are far less prone to this, since those scripts are usually a lot less
>> complicated than normal code.
>> I saw in the JEP that on the left hand side of the assignment this issue
>> can't happen, but as far as I can see the above problem is not prevented.
>> My proposal would be to, instead of shadowing variables, raise a compile
>> time error when the property name would shadow another variable. That
>> still leaves the above example backward incompatible, but at least I
>> would be notified of it by the compiler, instead of the compiler silently
>> creating a bug.
>> Another solution would be that the shadowing is done in the opposite
>> order, and the `int z = 26;` shadows the record property (with a
>> potential warning). In this case it would be even source compatible, if I
>> didn't miss something.
>> Attila

From ccherlin at gmail.com Wed Apr 24 16:48:40 2024
From: ccherlin at gmail.com (Clement Cherlin)
Date: Wed, 24 Apr 2024 11:48:40 -0500
Subject: Exception handling in switch (Preview)

On Sat, Apr 20, 2024 at 3:01 AM Tagir Valeev wrote:

> Dear experts,
>
> looking into this proposal, I'm really not convinced that Java needs it.
> We already have try-catch statements, and it sounds strange to provide
> another way to express the same semantics. I don't see what the new
> construct adds, aside from a bit of syntactic sugar. On the other hand, it
> creates a new source of subtle bugs, especially when exceptions are
> unchecked. E.g., consider:
>
> switch(a.b().c().d()) {
>     case ...
>     case throws RuntimeException ex -> handle(ex);
> }
>
> Now, one may want to refactor the code, extracting a.b(), a.b().c(), or
> the whole a.b().c().d() to a separate variable for clarity, or to avoid a
> long line.
> This action is usually safe, and it was totally safe in switches so far
> (even with patterns and case null). Now, it's not safe, as exceptions
> thrown from the extracted part are not handled by the 'case throws' branch.
> I don't see a good way to perform this refactoring in a semantically
> equivalent way. The only possibility I see is to duplicate the exception
> handler in the external catch:
>
> try {
>     var ab = a.b();
>     switch(ab.c().d()) {
>         case ...
>         case throws RuntimeException ex -> handle(ex);
>     }
> }
> catch(RuntimeException ex) {
>     handle(ex); // duplicated code
> }
>

There are some straightforward alternatives. To avoid long lines, wrap the
selector expression across multiple lines. To split the selector expression
across multiple statements for clarity, use a private method. I write this
sort of thing all the time:

    private D abcd(A a) {
        var b = a.b();
        var c = b.c();
        return c.d();
    }

    ...

    switch (abcd(a)) {
        case ...
        case throws RuntimeException ex -> handle(ex);
    }

> As switch selector does not allow using several expressions or to declare
> new variables, extract/inline refactorings can easily become very
> painful, or cause subtle bugs if not performed correctly.
> Note that it's not a problem inside usual try-catch statement (*), as you
> can easily add or remove more statements inside the try-body.
>
> (*) Except resource declaration, but it's rarely a problem, and in some
> cases it's still possible to extract parts as separate resources, because
> you can declare several of them
>
> I think, instead of repurposing switch to be another form of try-catch we
> could add more love to try-catch allowing it to be an expression with
> yields in branches. The proposed JEP allows something like this:
>
> Integer toIntOrNull(String s) {
>     return switch(Integer.parseInt(s)) {
>         case int i -> i;
>         case throws NumberFormatException _ -> null;
>     }
> }
>
> But we are still limited by a single expression in the selector. An
> alternative would be
> Integer toIntOrNull(String s) {
>     return try { yield Integer.parseInt(s); }
>     catch(NumberFormatException _) { yield null; };
> }
> Here, all kinds of refactorings are possible. And we actually don't need
> to express pattern matching, because we essentially don't need any pattern
> matching.
>
> Also, note that some of the situations which are usually solved with
> exception handling in modern Java (e.g. Pattern.compile ->
> PatternSyntaxException, or UUID.fromString -> IllegalArgumentException, or
> Integer.parseInt above) will be covered in future by member patterns. So
> probably if we concentrate more on member patterns, people will need much
> less exception handling in business logic, and such an enhancement will be
> not so useful anyway? Speaking about the sample from the JEP, can we
> imagine something like this in the future (sic!) Java?
>

I agree *in part* with this sentiment. Many library methods that currently
throw exceptions would be better written as member patterns. However, there
will always be a need to handle exceptions thrown by code that works with
the outside world (databases, network, filesystem). These are often
libraries and frameworks we do not control. Having "case throws" available
would be a significant improvement for those situations.

    switch (getEntityFromNetworkDatabase(id)) {
        case SomeEntity entity -> ...
        case throws DatabaseException ex -> ...
        case throws NetworkException ex -> ...
    }

> switch(future) {
>     case Future.cancelled() -> ...
>     case Future.interrupted() -> ...
>     case Future.failed(Exception ex) -> ... // no need to unwrap
>                                             // ExecutionException manually
>     case Future.successful(Box b) -> ...
> }
>
> One more note about the JEP text. It's unclear for me whether 'case throw'
> branches could catch a residual result. More precisely, if MatchException
> happens, or NullPointerException happens (selector evaluated to null, but
> there's no 'case null'), can these exceptions be caught by the 'case
> throws' branches in the same switch?
>
> With best regards,
> Tagir Valeev.
>
> On Fri, Apr 19, 2024 at 3:05 PM Angelos Bimpoudis
> <angelos.bimpoudis at oracle.com> wrote:
>
>> Dear spec experts,
>>
>> A while ago we discussed on this list about enhancing the switch
>> construct to support case labels that match exceptions thrown during
>> evaluation of the selector expression. A draft JEP for this feature is
>> now available at:
>>
>> https://bugs.openjdk.org/browse/JDK-8323658
>>
>> Please take a look at this new JEP and give us your feedback.
>>
>> Thanks,
>> Aggelos

From guy.steele at oracle.com Thu Apr 25 03:06:48 2024
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 25 Apr 2024 03:06:48 +0000
Subject: Exception handling in switch (Preview)
References: <577E666C-5A6F-45B9-8782-50D4CEEE563B@oracle.com>
Message-ID: <2C4027C7-B85E-4BD9-ACD4-512E576E6215@oracle.com>

On Apr 24, 2024, at 10:50 PM, Kevin Bourrillion wrote:

> Hey Angelos! For whatever my take is worth here, I'm also skeptical. I
> think the bar for adding another way to catch exceptions should be really
> high, and the benefits here wouldn't clear it. I just don't expect the
> nested switch-in-try will be painful enough often enough.
> This feature would only be applicable when the whole
> possibly-exception-producing code can fit into a single expression in the
> switch header . . .

This raises a very good question: what do we expect the "whole
possibly-exception-producing code" to look like in practice?

I conjecture that the attractive situation for using
switch-with-"case throws"-clauses will NOT involve arbitrarily large
expressions that might need to be refactored, but rather a single method
call (or possibly an expression with a single operator) where all argument
expressions are variables or constants, rather than other stuff that might
cause exceptions.

But I am not sure. So maybe this is a conjecture that could be researched.

--Guy

From gavin.bierman at oracle.com Fri Apr 26 10:07:30 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 26 Apr 2024 10:07:30 +0000
Subject: Draft Spec for Preview of Module Import Declarations (JEP 476)
Message-ID: <30E93260-DC44-4E4E-977B-41C9348FDE62@oracle.com>

Dear experts:

The first draft of a spec covering JEP 476 (Module Import Declarations
(Preview)) is available at:

https://cr.openjdk.org/~gbierman/jep476/latest/

Feel free to contact me directly or on this list with any comments.

Thanks
Gavin

> On 17 Apr 2024, at 19:58, Mark Reinhold wrote:
>
> https://openjdk.org/jeps/476
>
>   Summary: Enhance the Java programming language with the ability to
>   succinctly import all of the packages exported by a module. This
>   simplifies the reuse of modular libraries, but does not require the
>   importing code to be in a module itself. This is a preview language
>   feature.
>
> - Mark

From alex.buckley at oracle.com Fri Apr 26 16:34:22 2024
From: alex.buckley at oracle.com (Alex Buckley)
Date: Fri, 26 Apr 2024 09:34:22 -0700
Subject: Draft Spec for Preview of Module Import Declarations (JEP 476)
In-Reply-To: <30E93260-DC44-4E4E-977B-41C9348FDE62@oracle.com>
References: <30E93260-DC44-4E4E-977B-41C9348FDE62@oracle.com>
Message-ID: <0b21e7a6-e277-4936-8ece-71cabee2ca4a@oracle.com>

A few small things:

- 7.5 says "A single-module-import declaration (7.5.5) imports all the
  accessible classes and interfaces, as needed, from every package exported
  by a given module." The "every package" makes me a little nervous because
  there might be qualified exports which are not importable by the s-m-i
  declaration. That is, we don't want 7.5 to conflict with 7.5.5's "The
  packages exported by the module M ***to the current module.***"
  Recommend: "A single-module-import declaration (7.5.5) imports all the
  accessible classes and interfaces of the packages exported by a given
  module, as needed." Unlike the preceding four bullets, we don't mention
  the canonical name of the module, because at present modules don't have
  canonical names; that's probably still OK.

- Example 7.5.5-1 mentions "All simple compilation units implicitly import
  the module java.base (7.3)." which should be a normative statement, not an
  informative one. (I'm sure it will move/relate to a new section on Simple
  Compilation Units in future, but for now we shouldn't lose it in an
  example.)

- I recommend having Example 7.5.5-1 be "Single-Module-Import in Ordinary
  Compilation Units", forking off Example 7.5.5-2 "Single-Module-Import in
  Modular Compilation Units" at "Import declarations can also appear in a
  modular compilation unit." Eventually there will be an Example of "S-M-I
  in Simple Compilation Units".

Alex

From gavin.bierman at oracle.com Tue Apr 30 11:35:04 2024
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Tue, 30 Apr 2024 11:35:04 +0000
Subject: Draft Spec for Preview of Module Import Declarations (JEP 476)
In-Reply-To: <0b21e7a6-e277-4936-8ece-71cabee2ca4a@oracle.com>
References: <30E93260-DC44-4E4E-977B-41C9348FDE62@oracle.com> <0b21e7a6-e277-4936-8ece-71cabee2ca4a@oracle.com>

Thanks Alex. I updated the online version.

> On 26 Apr 2024, at 17:34, Alex Buckley wrote:
>
> A few small things:
>
> - 7.5 says "A single-module-import declaration (7.5.5) imports all the
>   accessible classes and interfaces, as needed, from every package
>   exported by a given module." The "every package" makes me a little
>   nervous because there might be qualified exports which are not
>   importable by the s-m-i declaration. That is, we don't want 7.5 to
>   conflict with 7.5.5's "The packages exported by the module M ***to the
>   current module.***" Recommend: "A single-module-import declaration
>   (7.5.5) imports all the accessible classes and interfaces of the
>   packages exported by a given module, as needed." Unlike the preceding
>   four bullets, we don't mention the canonical name of the module, because
>   at present modules don't have canonical names; that's probably still OK.

Done, thanks.
>
> - Example 7.5.5-1 mentions "All simple compilation units implicitly import
>   the module java.base (7.3)." which should be a normative statement, not
>   an informative one. (I'm sure it will move/relate to a new section on
>   Simple Compilation Units in future, but for now we shouldn't lose it in
>   an example.)

Oops - that sentence shouldn't appear at all. Removed.

>
> - I recommend having Example 7.5.5-1 be "Single-Module-Import in Ordinary
>   Compilation Units", forking off Example 7.5.5-2 "Single-Module-Import in
>   Modular Compilation Units" at "Import declarations can also appear in a
>   modular compilation unit." Eventually there will be an Example of "S-M-I
>   in Simple Compilation Units".

Yes, good idea, thanks. Done.

Gavin

From forax at univ-mlv.fr Tue Apr 30 13:22:31 2024
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 30 Apr 2024 15:22:31 +0200 (CEST)
Subject: Derived record creation and Data Oriented programming
Message-ID: <1211263117.16028569.1714483351732.JavaMail.zimbra@univ-eiffel.fr>

Hello,
there have been several messages on amber-dev about the compatibility of
derived record creation.

I think part of the issue reported is that with the proposed syntax, the
calls to the deconstructor and the canonical constructor are implicit.

Let's take an example:

    record Point(int x, int y) {}

    var point = new Point(2, 3);

In fact,

    var point2 = point with { x = y; };

is a shortcut for:

    var point2 = point with {
        Point(int x, int y) = this; // i.e. int x = this.x(); int y = this.y();
        x = y;
        yield new Point(x, y);
    };

One problem with the current syntax is that the calls to the
[de]constructors are implicit. I think we should allow users to write the
implicit calls if they want.

I wonder whether
- we should allow yield to be used, so that the compiler adds a yield
  automatically only if there is no yield?
- we should, in the future when deconstructors are introduced, allow users
  to call a deconstructor explicitly and only provide one if not explicitly
  written?
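
The expansion described above can be spelled out by hand in current Java
(records only, so Java 16 or later). This is a sketch of the described
desugaring, not of how a compiler would actually translate `with`; the
helper name `withXFromY` and the class `WithDesugaringDemo` are
hypothetical.

```java
// Point is the record from the example above.
record Point(int x, int y) {}

class WithDesugaringDemo {
    // Hand-written equivalent of `point with { x = y; }` as described above.
    static Point withXFromY(Point point) {
        // Implicit deconstruction: one local variable per record component.
        int x = point.x();
        int y = point.y();
        // The body of the `with` block.
        x = y;
        // Implicit reconstruction through the canonical constructor.
        return new Point(x, y);
    }

    public static void main(String[] args) {
        Point point = new Point(2, 3);
        System.out.println(withXFromY(point)); // prints Point[x=3, y=3]
    }
}
```

Because the constructor call names both components, adding a third component
to `Point` without keeping a two-argument constructor would make this helper
fail to compile, which is the kind of detection the explicit form is meant
to give.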

Being able to write the calls explicitly is important because it's a way to
detect that the record has been modified without the proper
constructor/deconstructor having been written to keep it backward compatible
(like in a switch, a record pattern detects when a record component is added
while a type pattern does not).

Being able to call the deconstructor explicitly also has the advantage of
avoiding declaring a variable/calling the accessor if not needed. For
example:

    var point2 = point with {
        Point(_, var y) = this;
        x = y;
    };

regards,
Rémi

From brian.goetz at oracle.com Tue Apr 30 13:32:35 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 30 Apr 2024 09:32:35 -0400
Subject: Derived record creation and Data Oriented programming
In-Reply-To: <1211263117.16028569.1714483351732.JavaMail.zimbra@univ-eiffel.fr>
References: <1211263117.16028569.1714483351732.JavaMail.zimbra@univ-eiffel.fr>

Interesting idea. Of the two sides, allowing explicit *constructor* calls is
significantly more practical (`yield` has control flow consequences
understood by the language, and obviously means "yield this value from the
current expression", whereas doing a random pattern match with `this` as a
match candidate is not remotely clear that you intend to override the
deconstruction.)

But, let's back up a second: what do we gain from this? Let's say we have
two records with similar states:

    record A(int x, int y, int z) { }
    record B(int x, int y, int z) { }

    A a = ...
    B b = a with { yield new B(x, y, z); }

I don't see how this is more clear than:

    B b = switch (a) {
        case A(var x, var y, var z) -> new B(x, y, z);
    };

(or an imperative match, if we have one.) In fact, it seems less clear,
since it is not really a "with" anything. It's using `with` as a short form
of "shred to components", which is not what `with` is intended to convey.

So let's back up: what problem are we trying to solve here?
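
The switch-based version above already works today. A minimal runnable
sketch, assuming Java 21 (record patterns in switch); the wrapper class
`ConvertDemo` and the method `toB` are hypothetical names added only to make
it executable:

```java
record A(int x, int y, int z) {}
record B(int x, int y, int z) {}

class ConvertDemo {
    static B toB(A a) {
        // The record pattern names every component of A, so adding a
        // component to A turns this into a compile-time error here rather
        // than a silent change in behavior.
        return switch (a) {
            case A(var x, var y, var z) -> new B(x, y, z);
        };
    }

    public static void main(String[] args) {
        System.out.println(toB(new A(1, 2, 3))); // prints B[x=1, y=2, z=3]
    }
}
```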
From forax at univ-mlv.fr Tue Apr 30 15:35:36 2024
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 30 Apr 2024 17:35:36 +0200 (CEST)
Subject: Derived record creation and Data Oriented programming
In-Reply-To:
References: <1211263117.16028569.1714483351732.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <2006254717.16140758.1714491336502.JavaMail.zimbra@univ-eiffel.fr>

> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Tuesday, April 30, 2024 3:32:35 PM
> Subject: Re: Derived record creation and Data Oriented programming

> Interesting idea. Of the two sides, allowing explicit *constructor* calls is
> significantly more practical (`yield` has control flow consequences understood
> by the language, and obviously means "yield this value from the current
> expression", whereas doing a random pattern match with `this` as a match
> candidate is not remotely clear that you intend to override the
> deconstruction.)

> But, let's back up a second: what do we gain from this? Let's say we have two
> records with similar states:

>     record A(int x, int y, int z) { }
>     record B(int x, int y, int z) { }

>     A a = ...
>     B b = a with { yield new B(x, y, z); }

> I don't see how this is more clear than:

>     B b = switch (a) {
>         case A(var x, var y, var z) -> new B(x, y, z);
>     };

> (or an imperative match, if we have one.) In fact, it seems less clear, since it
> is not really a "with" anything. It's using `with` as a short form of "shred to
> components", which is not what `with` is intended to convey.

> So let's back up: what problem are we trying to solve here?

The problem is that the syntax of "with" does not show that with depends on the deconstructor and the constructor of the record, so the behavior in case of separate compilation is not clear.

I see several ways to fix that:
- restrict the uses of "with" to the package/module containing the record (as we have done with sealed types),
- allow a syntax variation where the components of the record are listed,
- link the canonical deconstructor/constructor of the record at runtime (which can be simpler if the local variables declared outside of the block are captured (like the variables inside a "when" expression)).

and maybe there are other solutions ?

Rémi

> On 4/30/2024 9:22 AM, Remi Forax wrote:
>> Hello,
>> there have been several messages on amber-dev about the compatibility of the
>> derived record creation.

>> I think part of the issue reported is that with the proposed syntax, the call to
>> the deconstructor and the canonical constructor is implicit.

>> Let's take an example

>> record Point(int x, int y) {}

>> var point = new Point(2, 3);

>> In fact,
>>   var point2 = point with { x = y; };

>> is a shortcut for:

>>   var point2 = point with {
>>     Point(int x, int y) = this; // i.e. int x = this.x(); int y = this.y();
>>     x = y;
>>     yield new Point(x, y);
>>   };

>> One problem with the current syntax is that the call to the
>> [de]constructors is implicit. I think we should allow users to write the
>> implicit calls if they want.

>> I wonder if
>> - we should not allow yield to be used so the compiler adds yield automatically
>> only if there is no yield ?
>> - we should in the future when deconstructors are introduced, allow users to
>> call a deconstructor explicitly and only provide one if not explicitly written
>> ?

>> Being able to write the calls explicitly is important because it's a way to
>> detect that the record has been modified without the proper
>> constructor/deconstructor having been written to be backward compatible (like in a
>> switch, where a record pattern detects when a record component is added while a type
>> pattern does not).

>> Being able to call the deconstructor explicitly also has the advantage of
>> avoiding declaring a variable/calling the accessor if not needed.
>> For example
>>   var point2 = point with {
>>     Point(_, var y) = this;
>>     x = y;
>>   };

>> regards,
>> Rémi

From brian.goetz at oracle.com Tue Apr 30 16:22:01 2024
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 30 Apr 2024 12:22:01 -0400
Subject: Derived record creation and Data Oriented programming
In-Reply-To: <2006254717.16140758.1714491336502.JavaMail.zimbra@univ-eiffel.fr>
References: <1211263117.16028569.1714483351732.JavaMail.zimbra@univ-eiffel.fr> <2006254717.16140758.1714491336502.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <89e427e1-9e56-46c9-8373-158ccbfd151d@oracle.com>

>> So let's back up: what problem are we trying to solve here?
>
> The problem is that the syntax of "with" does not show that with
> depends on the deconstructor and the constructor of the record, so the
> behavior in case of separate compilation is not clear.

I think the lack of clarity you are concerned about is that the names of the components are being used as a new kind of API, where we lift API elements to variables, and you're concerned that those names are "weakly coupled" between declaration and client? Is this right?

> I see several ways to fix that:
> - restrict the uses of "with" to the package/module containing the
> record (as we have done with sealed types),

With this restriction, I think the feature is not really useful enough to justify. Plus this will annoy people.

> - allow a syntax variation where the components of the record are listed,

So, this would be like a C++ lambda:

    // don't take the syntax seriously, purely meant to evoke C++ lambda syntax
    p = p with [x,y]{ x = 3; }

and then the [x,y] would be used for the ctor/dtor lookup?

> - link the canonical deconstructor/constructor of the record at
> runtime (which can be simpler if the local variables declared outside
> of the block are captured (like the variables inside a "when"
> expression)).

Not sure what you mean here, but it probably doesn't scale to classes?

> and maybe there are other solutions ?
>
> Rémi
>
> On 4/30/2024 9:22 AM, Remi Forax wrote:
>
> Hello,
> there have been several messages on amber-dev about the compatibility of the derived record creation.
>
> I think part of the issue reported is that with the proposed syntax, the call to the deconstructor and the canonical constructor is implicit.
>
> Let's take an example
>
> record Point(int x, int y) {}
>
> var point = new Point(2, 3);
>
> In fact,
>   var point2 = point with { x = y; };
>
> is a shortcut for:
>
>   var point2 = point with {
>     Point(int x, int y) = this; // i.e. int x = this.x(); int y = this.y();
>     x = y;
>     yield new Point(x, y);
>   };
>
> One problem with the current syntax is that the call to the [de]constructors is implicit. I think we should allow users to write the implicit calls if they want.
>
> I wonder if
> - we should not allow yield to be used so the compiler adds yield automatically only if there is no yield ?
> - we should in the future when deconstructors are introduced, allow users to call a deconstructor explicitly and only provide one if not explicitly written ?
>
> Being able to write the calls explicitly is important because it's a way to detect that the record has been modified without the proper constructor/deconstructor having been written to be backward compatible (like in a switch, where a record pattern detects when a record component is added while a type pattern does not).
>
> Being able to call the deconstructor explicitly also has the advantage of avoiding declaring a variable/calling the accessor if not needed.
> For example
>   var point2 = point with {
>     Point(_, var y) = this;
>     x = y;
>   };
>
> regards,
> Rémi

From kevinb9n at gmail.com Thu Apr 25 02:50:22 2024
From: kevinb9n at gmail.com (Kevin Bourrillion)
Date: Wed, 24 Apr 2024 19:50:22 -0700
Subject: Exception handling in switch (Preview)
In-Reply-To: <577E666C-5A6F-45B9-8782-50D4CEEE563B@oracle.com>
References: <577E666C-5A6F-45B9-8782-50D4CEEE563B@oracle.com>
Message-ID:

Hey Angelos!

For whatever my take is worth here, I'm also skeptical. I think the bar for adding another way to catch exceptions should be really high, and the benefits here wouldn't clear it. I just don't expect the nested switch-in-try will be painful enough *often* enough.

This feature would only be applicable when the whole possibly-exception-producing code can fit into a single expression in the switch header, so (a) you'll have to revert to the other form often enough, and (b) highjinks might ensue as users cram too much code into there...

Normally I expect that when I see an expression, with no curly braces (or ->) involved, my mental model is that that expression gets evaluated and *then* the resulting value is passed to the surrounding context. This proposed version of switch seems to work differently than that, but with no curly braces to set it off, and that feels novel to me. Maybe it's more precedented than I think though? (for/while still fit this, considering that execution returns to the point just before the header so the expressions are naturally evaluated again...)

Now, if there was strong motivation to support actual *patterns* in these exception cases then that would seem to justify this. Although in that case I'd ask why regular catch clauses shouldn't accept patterns as well.

~~

If it does proceed then I must take exception (har) with using "throws" to mean "catch"!

Today we have this breakdown

* throws - what might get thrown / is allowed to be thrown
* throw - what IS actually getting thrown
* catch - what HAS actually been thrown

We expect users to learn that, and the current design would ask them to partly unlearn it. Imho, `case catch FooException` is better!

I hope this is helpful...

On Mon, Apr 22, 2024 at 11:56 PM Brian Goetz wrote:

> To the question "does Java need this", well, of course we don't *need* it;
> we do have try-catch statements. But as Dan points out, the main challenge
> of using methods that both return a value and throw exceptions is that we
> cannot handle all the results uniformly. And as the JEP points out,
> backing off to try-catch is doubly painful: not only do we have to use two
> constructs, but we are back to statement-land, which is far more error
> prone and less composable than expressions.
>
> As to "why don't we just make try-catch an expression", well, that's where
> we started with this feature exploration. It turns out to just be too weak
> to be useful. The main constraint is that the try and catch parts have to
> yield the same type, but the constraint to produce a "default" value of
> that type from the catch arms is just too constraining. If you try some
> nontrivial examples this becomes clear pretty quickly.
>
> But there is one criticism of this feature that is I think at the root of
> what you are getting at, which is that having effect cases in switch
> produces only a shallow-ish unification (or, alternately, that it is not
> the primitive.) Having effect cases lets us deal uniformly with all the
> consequences of evaluating the selector in one place, but like
> try-catch, you have to deal with them _right there_. Whereas with a Try
> monad, you could capture the Try and then process it by passing it to
> another method, putting it on a queue and letting some other thread process
> it, etc.
>
> If we were to excavate to the bottom, then we would likely want `try e` to
> evaluate to a Try monad, at which point the current proposal is a sugary
> representation of:
>
>     switch (try e) {
>         case Success(P1) -> ...
>         case Success(P2) -> ...
>         case Failure(E1) -> ...
>     }
>
> We explored this point as well in the exploration, and backed off. But we
> can do this in either order; if we have a `try e` primitive that evaluates
> to a Try, then we can retroactively redefine a switch with `case throws`
> clauses to be sugar for the above.
>
> I will think further about the pattern matching connection you propose.
>
> On Apr 20, 2024, at 10:00 AM, Tagir Valeev wrote:
>
> Dear experts,
>
> looking into this proposal, I'm really not convinced that Java needs it.
> We already have try-catch statements, and it sounds strange to provide
> another way to express the same semantics. I don't see what the new
> construct adds, aside from a bit of syntactic sugar. On the other hand, it
> creates a new source of subtle bugs, especially when exceptions are
> unchecked. E.g., consider:
>
> switch(a.b().c().d()) {
>   case ...
>   case throws RuntimeException ex -> handle(ex);
> }
>
> Now, one may want to refactor the code, extracting a.b(), a.b().c(), or
> the whole a.b().c().d() to a separate variable for clarity, or to avoid a
> long line.
> This action is usually safe, and it was totally safe in switches so far
> (even with patterns and case null). Now, it's not safe, as exceptions
> thrown from the extracted part are not handled by the 'case throws' branch.
> I don't see a good way to perform this refactoring in a semantically
> equivalent way. The only possibility I see is to duplicate the exception
> handler in the external catch:
>
> try {
>   var ab = a.b();
>   switch(ab.c().d()) {
>     case ...
>     case throws RuntimeException ex -> handle(ex);
>   }
> }
> catch(RuntimeException ex) {
>   handle(ex); // duplicated code
> }
>
> As the switch selector does not allow using several expressions or declaring
> new variables, extract/inline refactorings can easily become very
> painful, or cause subtle bugs if not performed correctly.
> Note that it's not a problem inside a usual try-catch statement (*), as you
> can easily add or remove more statements inside the try-body.
>
> (*) Except resource declaration, but it's rarely a problem, and in some
> cases it's still possible to extract parts as separate resources, because
> you can declare several of them
>
> I think, instead of repurposing switch to be another form of try-catch we
> could add more love to try-catch allowing it to be an expression with
> yields in branches. The proposed JEP allows something like this:
>
> Integer toIntOrNull(String s) {
>   return switch(Integer.parseInt(s)) {
>     case int i -> i;
>     case throws NumberFormatException _ -> null;
>   };
> }
>
> But we are still limited by a single expression in the selector. An
> alternative would be
>
> Integer toIntOrNull(String s) {
>   return try { yield Integer.parseInt(s); }
>          catch(NumberFormatException _) { yield null; };
> }
>
> Here, all kinds of refactorings are possible. And we actually don't need
> to express pattern matching, because we essentially don't need any pattern
> matching.
>
> Also, note that some of the situations which are usually solved with
> exception handling in modern Java (e.g. Pattern.compile ->
> PatternSyntaxException, or UUID.fromString -> IllegalArgumentException, or
> Integer.parseInt above) will be covered in future by member patterns. So
> probably if we concentrate more on member patterns, people will need much
> less exception handling in business logic, and such an enhancement will be
> not so useful anyway? Speaking about the sample from the JEP, can we
> imagine something like this in the future (sic!) Java?
>
> switch(future) {
>   case Future.cancelled() -> ...
>   case Future.interrupted() -> ...
>   case Future.failed(Exception ex) -> ... // no need to unwrap ExecutionException manually
>   case Future.successful(Box b) -> ...
> }
>
> One more note about the JEP text. It's unclear to me whether 'case throws'
> branches could catch a residual result. More precisely, if MatchException
> happens, or NullPointerException happens (selector evaluated to null, but
> there's no 'case null'), can these exceptions be caught by the 'case
> throws' branches in the same switch?
>
> With best regards,
> Tagir Valeev.
>
> On Fri, Apr 19, 2024 at 3:05 PM Angelos Bimpoudis <
> angelos.bimpoudis at oracle.com> wrote:
>
>> Dear spec experts,
>>
>> A while ago we discussed on this list about enhancing the switch construct
>> to support case labels that match exceptions thrown during evaluation of
>> the selector expression. A draft JEP for this feature is now available at:
>>
>> https://bugs.openjdk.org/browse/JDK-8323658
>>
>> Please take a look at this new JEP and give us your feedback.
>>
>> Thanks,
>> Aggelos
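
The `switch (try e)` reading discussed in the quoted messages above can be approximated in current Java (21 or later) with a sealed type and record patterns. What follows is a rough sketch under stated assumptions, not the JEP's design: `Try`, `Success`, `Failure` and `attempt` are hypothetical names that exist neither in the JDK nor in the draft JEP, and `attempt` captures only unchecked exceptions to keep the example short.

```java
import java.util.function.Supplier;

public class TryDemo {
    // Hypothetical Try type: the result of evaluating an expression is
    // either a value or the exception that was thrown.
    sealed interface Try<T> { }
    record Success<T>(T value) implements Try<T> { }
    record Failure<T>(Exception error) implements Try<T> { }

    // Stand-in for a `try e` primitive: evaluate the body and capture
    // either its value or the RuntimeException it throws.
    static <T> Try<T> attempt(Supplier<T> body) {
        try {
            return new Success<>(body.get());
        } catch (RuntimeException e) {
            return new Failure<>(e);
        }
    }

    // The toIntOrNull example from the thread, written against the sketch.
    static Integer toIntOrNull(String s) {
        return switch (attempt(() -> Integer.parseInt(s))) {
            case Success<Integer>(var i) -> i;
            case Failure<Integer>(NumberFormatException e) -> null;
            // Any other captured exception is rethrown, wrapped.
            case Failure<Integer>(var e) -> throw new IllegalStateException(e);
        };
    }

    public static void main(String[] args) {
        System.out.println(toIntOrNull("42"));
        System.out.println(toIntOrNull("not a number"));
    }
}
```

Because the `Try` value is an ordinary object, it can be stored in a variable or passed to another method before being switched over, which is the deeper unification the quoted message contrasts with effect cases.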