Guards

Guy Steele guy.steele at oracle.com
Fri Mar 5 22:12:27 UTC 2021


Thanks for this summary, Brian.  But there is just one place where the argument involves a perhaps unnecessary overcommitment.  See below.

> On Mar 5, 2021, at 2:14 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> Let me try and summarize all that has been said on the Guards topic.  
> 
> #### Background and requirements
> 
> For `instanceof`, we don't need any sort of guard right now (with the patterns we have); we can already conjoin arbitrary boolean expressions with `&&` in all the contexts we can use `instanceof`, because it's a boolean expression.  (This may change in the future as patterns get richer.)  So we can already express our canonical guarded Point example with
> 
>     if (p instanceof Point(var x, var y) && x > y) { ... }
> 
> with code that no one will find confusing.  
> 
> For switch, we can't do this, because case labels are not boolean expressions, they're some ad-hoc sub-language.  When the sub-language was so limited that it could only express int and string constants, this wasn't a problem; there was little refinement needed on `case "Foo"`.  
> 
> As we make switch more powerful, we face a problem: if the user drifts out of the territory of what can be expressed as case labels, they fall off the cliff and have to refactor their 50-way switch into an if-else chain.  This will be a really bad user experience.  Some sort of escape hatch to boolean logic buys us insurance against this bad experience -- as long as you can express your non-pattern criteria with a boolean expression (which is pretty rich), you don't have to leave switch-land.  
> 
> So we took as our requirement: 
> 
>     Some sort of guard construct that is usable in switch is a forced move.
> 
> #### Expressing guards in switch
> 
> There are several ways to envision guards:
> 
>  - As patterns that refine other patterns (e.g., a "true" pattern)

A guard construct need not itself be a pattern.  Rather, it can be viewed as a map from patterns to patterns.  Indeed, guards are formulated in exactly that way in Gavin’s BNF in JEP JDK-8213076 “Pattern Matching for switch”: a guard is not a pattern, but can only appear within a pattern as the right-hand operand of `&`:

Pattern:
	PatternOperand
	Pattern & PatternOperandOrGuard
PatternOperandOrGuard:
	PatternOperand
	GuardPattern

As a result, if we curry and squint, we can see that “& GuardPattern” is a map from patterns to patterns.  We can also see that “& Pattern” is a map from patterns to patterns; and finally we can appreciate two other points: (1) GuardPattern need not ever actually be regarded as a pattern, and (2) we have overloaded `&` to mean two rather different things.  While it is possible to express the fact that a guard construct is a map from patterns to patterns by insisting that a guard is itself a pattern and then using the pattern conjunction operator, this is not the only way to express or model that fact.

Now, the quoted BNF has reached its current structure because, as the JEP carefully explains,

	The grammar has been carefully designed to exclude a guard pattern as a valid
	top-level pattern. There is little point in writing pattern matching code such as
	o instanceof true(s.length != 0). Guard patterns are intended to refine
	the meaning of other patterns. The grammar reflects this intuition.

As a result, a guard necessarily appears to the right of a `&` and therefore necessarily to the right of a pattern.  We should also inquire as to whether it is ever desirable in practice, within a chain of `&` (pattern conditional-and) operations, for a pattern to appear to the right of a guard.  If not, then `&` chains always have the simple form

	pattern & pattern & … & pattern & guard & guard & … & guard

where the number of patterns must be positive but the number of guards may be zero.  And if this is the case, it is not unreasonable to ask whether readability might be better served by marking more clearly the transition from patterns to guards in the chain, for example:

	pattern & pattern & … & pattern when guard & guard & … & guard

And then we see that there really is no reason to try to overload `&` (however it is actually spelled) to mean both pattern conjunction and guard conjunction, because guard conjunction already exists in the form of the `&&` expression operator:

	pattern & pattern & … & pattern when guard && guard && … && guard

and therefore we can, after all, simplify this general form to the case of zero or one guards:

	pattern & pattern & … & pattern [when guard]

Finally, given that (an earlier version of) the patterns design already encompasses forms that can bind the entire object as well as components (what is done in other languages with `as`), I have to ask: what are the envisioned practical applications of pattern conjunction other than as a cute way to include guards or a (more verbose) way to bind the entire value as well as components?  Maybe as a way to fake intersection types?
	
Now, all of this has no bearing on whether or not guards are required to be “top level only” in all cases; it argues only that guards need not appear within pattern-conjunction chains.  But I believe it would be perfectly reasonable to write

	case Point(int x when x > 0, int y when y > x):

Rémi has argued that this would be better written

	case Point(int x, int y) when x > 0 && y > x:

but I would argue that this choice is, and should be, a matter of style, and when matching against a record with many fields it might be more readable to mention each field’s constraint next to its binding rather than to make the reader compare a list of bindings against a list of constraints.
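
A minimal sketch of the top-level style, assuming a JDK 21 or later compiler (where `case ... when ...` and record deconstruction patterns are accepted in switch).  The record `Point` and the method `classify` are hypothetical names chosen for illustration.  The per-field form `int x when x > 0` is the syntax proposed in this message and is not compiled here.

```java
// Hypothetical names throughout; assumes JDK 21 or later.
record Point(int x, int y) {}

final class GuardStyles {
    private GuardStyles() {}

    static String classify(Object o) {
        return switch (o) {
            // Every constraint is collected in one top-level guard, so the
            // reader matches the constraint list against the binding list.
            case Point(int x, int y) when x > 0 && y > x -> "ascending point";
            // Reached when the guard above fails for a Point.
            case Point(int x, int y) -> "other point";
            default -> "not a point";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Point(1, 2)));
        System.out.println(classify(new Point(2, 1)));
        System.out.println(classify("hello"));
    }
}
```

With only two fields the difference between the styles is small; the question of style becomes visible as the number of fields, and hence the length of the single `when` clause, grows.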

Bottom line: there are conceptually three distinct combining forms:

	pattern conjunction
	guard conjunction
	to a pattern, attach a guard

and it may be a mistake after all to conflate them by trying to use the same syntax or symbol for all three.
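
The three forms can be modeled as three separate combinators.  This is only a sketch in plain Java, not how the JEP specifies matching: the interface `Pattern`, the class `Combinators`, and the methods `type`, `and`, `both`, and `when` are all hypothetical names, and a pattern is assumed to yield at most one binding.

```java
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical model: a pattern either fails or yields a binding of type B.
interface Pattern<B> {
    Optional<B> match(Object target);
}

final class Combinators {
    private Combinators() {}

    // A type-test pattern that binds the target itself.
    static <T> Pattern<T> type(Class<T> cls) {
        return target -> cls.isInstance(target)
                ? Optional.of(cls.cast(target))
                : Optional.empty();
    }

    // Form 1: pattern conjunction.  Both patterns must match the same target;
    // for simplicity this sketch keeps only the right-hand binding.
    static <A, B> Pattern<B> and(Pattern<A> left, Pattern<B> right) {
        return target -> left.match(target).flatMap(ignored -> right.match(target));
    }

    // Form 2: guard conjunction is ordinary boolean conjunction.
    static <B> Predicate<B> both(Predicate<B> g1, Predicate<B> g2) {
        return g1.and(g2);
    }

    // Form 3: attach a guard to a pattern.  The guard is never a pattern;
    // `when` maps a pattern to a more restrictive pattern.
    static <B> Pattern<B> when(Pattern<B> pattern, Predicate<B> guard) {
        return target -> pattern.match(target).filter(guard);
    }

    public static void main(String[] args) {
        Pattern<Integer> anyInt = type(Integer.class);
        Predicate<Integer> positive = n -> n > 0;
        Predicate<Integer> even = n -> n % 2 == 0;

        // Two attached guards behave like one attached conjunction.
        Pattern<Integer> nested = when(when(anyInt, positive), even);
        Pattern<Integer> single = when(anyInt, both(positive, even));
        System.out.println(nested.match(4).equals(single.match(4)));
        System.out.println(nested.match(3).equals(single.match(3)));
    }
}
```

In this model `when(when(p, g1), g2)` and `when(p, both(g1, g2))` accept the same targets, which is why zero or one guard per pattern is enough once `&&` is available for guards.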

So what I would like to see is the convincing application example where you really do want to write

	pattern & guard & pattern

because then everything I’ve written above falls to the ground.  But if we cannot come up with such an example, then perhaps what I have written should be examined carefully as a serious design alternative.

[more below]


> . . .
> 
> #### Options
> 
> I suspect that we'd get a lot of mileage out of just renaming true to something like "when"; it avoids the "but that's not what true is" reaction, and is readable enough:
> 
>     case Foo(var x) & when(x > 0):

Sorry to bikeshed here, but while “when” is nice, I think “if” is even more appealing (short, familiar, already a keyword), especially if it alone can express attachment of a guard to a pattern (and we can argue about whether the parentheses are required):

	case Foo(var x) if (x > 0):

[more below]

> but I think it will still be perceived as "glass half empty", with lots of "why do I need the &" reactions.  And, in the trivial (but likely quite common, at least initially) case of one pattern and one guard, the answers are not likely to be very satisfying, no matter how solidly grounded in reality, because the generality of the compositional approach is not yet obvious enough to those seeing patterns for the first time.  
> 
> I am not compelled by the direction of "just add guards to switch and be done with it", because that's a job we're going to have to re-do later.  But I think there's a small tweak which may help a lot: do that job now, with only a small shadow of lasting damage:
> 
>  - Expose `grobble(expr)` clauses as an option on pattern switch cases;
> 
>  - When we introduce & combination (which can be deferred if we have a switch guard now), plan for a `grobble(e)` pattern.  At that point, 
> 
>     case Foo(var x) grobble(x > 0): 
> 
> is revealed to be sugar for
> 
>     case Foo(var x) & grobble(x > 0): 
> 
> As a bonus, we can use grobble by itself in pattern switches to incorporate non-target criteria:
> 
>     case grobble(e): 
> 
> which is later revealed to be sugar for:
> 
>     case Foo(var _) & grobble(e): 

I think you meant it is sugar for

    case var _ & grobble(e): 

If so, then compare that to the claim that

    case if (e):

is sugar for

    case var _ if (e): 

This might be very easy to explain to programmers who want to drop a few if-branches into a switch without having to convert the entire switch into an if-else chain.
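
A rough sketch of that use, assuming a JDK 21 or later compiler, where a guard is attached with `when` and a match-anything pattern has to be written out as `Object ignored`.  The class `NonTargetGuard`, the method `route`, and the flag `maintenanceMode` are hypothetical names; the guarded `Object` case stands in for the proposed `case if (e):`.

```java
// Hypothetical names throughout; assumes JDK 21 or later.
final class NonTargetGuard {
    private NonTargetGuard() {}

    static String route(Object message, boolean maintenanceMode) {
        return switch (message) {
            case String s when s.isEmpty() -> "empty";
            // The pattern matches any non-null target, so only the guard,
            // which does not mention the target at all, decides this case.
            case Object ignored when maintenanceMode -> "rejected";
            // If the guard above is false, matching simply continues here.
            case String s -> "text";
            case Integer i -> "number";
            default -> "unknown";
        };
    }

    public static void main(String[] args) {
        System.out.println(route("hi", true));
        System.out.println(route("hi", false));
        System.out.println(route(7, false));
    }
}
```

The if-branch sits between ordinary cases, and a false guard falls through to the later cases, which is the behavior an if-else chain would otherwise have to spell out by hand.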

> The downside here is that in the long run, we have something like the C-style array declarations; in the trivial case of a single pattern with a guard, you can leave in the & or leave it out, not unlike declaring `int[] x` vs `int x[]`.  Like the "transitional" (but in fact permanent) sop of C-style declarations, the "optional &" will surely become an impediment ("why can I leave it out here, but not there, that's inconsistent").  
> 
> All that said, this is probably an acceptable worse-is-better direction, where in the short term users are not forced to confront a model that they don't yet understand (or borrow concepts from the future), with a path to sort-of-almost-unification in the future that is probably acceptable.
> 
> 
