Two new draft pattern matching JEPs

Brian Goetz brian.goetz at oracle.com
Wed Mar 3 16:53:29 UTC 2021



> For starters, at a high level, the idea is to mix patterns and expressions (guards are boolean expressions), but at the same time, we have discussed several times not allowing constants inside patterns, to make a clear distinction between patterns and expressions. We have an inconsistency here.

No, this is not *inconsistent* at all.  (It may be confusing, though.)  
We've already discussed how some patterns (e.g., regex) will take input 
arguments, which are expressions.  We haven't quite nailed down our 
syntactic conventions for separating input expressions from output 
bindings, but the notion of a pattern that accepts expressions as input 
is most decidedly not outside the model.

Your real argument here seems to be that this is a design based on where 
the language is going, and the pieces that we are exposing now don't 
stand on their own as well as we might like.  (We've seen this before; 
users can react very badly to design decisions whose justification 
is based on where we're going but haven't yet arrived.)

> The traditional approach for guards cleanly separates the pattern part from the expression part
>    case Rectangle(Point x, Point y) if x > 0 && y > 0
> which makes far more sense IMO.

Yes, we were stuck on that for a while as the "obvious" choice 
(modulo choice of keyword): treat this as a switch problem, 
and nail new bits of syntax onto switch.

To restate the obvious, the trailing guard is a considerably weaker 
construct.  It is (a) switch-specific, and (b) means we cannot 
intersperse guards with more patterns; we have to do all the patterns 
and then all the guards (i.e., it doesn't compose).  In the case of 
simple switches over records, this extra power is not obviously needed, 
but its lack will bite us as patterns get more powerful.

I think the argument you want to be making here is "OK, so the 
compositional approach is obviously better and more powerful, but it 
will put the users off, especially in the simple cases, and most of the 
cases they will run into for the next few years are simple". This leads 
you to "so we should do something that is objectively worse, but less 
challenging to the users."  (Despite this "loaded" way of writing it, 
this really is the argument you're making.  And it is a valid one -- but 
please, let's not try to convince ourselves that it is remotely better 
in any fundamental way, other than "doesn't make people think as hard.")

Essentially, I think what you're saying is that it is _too early_ to 
introduce & patterns -- because guards are a weak motivation to do so.  
I don't disagree with this point, but it puts us in a bind -- either we 
don't provide a way to write guarded patterns now (which is not a 
problem for instanceof, but is for switch), or we nail some bit of 
terrible syntax onto switch that we're stuck with. These are not good 
choices.
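
To make the "not a problem for instanceof" half concrete, here is a 
minimal sketch (assuming Java 16 or later, where instanceof type 
patterns are final).  The class and method names are made up for 
illustration.  With instanceof, the ordinary && operator already plays 
the role of a guard, so no new construct is needed there; switch has no 
equivalent place to put the condition.

```java
// Sketch only: InstanceofGuardSketch and classify are hypothetical names.
final class InstanceofGuardSketch {

    static String classify(Object o) {
        // The binding `s` is in scope on the right of && because that
        // operand is only evaluated when the instanceof test succeeded.
        if (o instanceof String s && s.length() > 3) {
            return "long string";
        }
        // The earlier `s` is out of scope here, so the name can be reused.
        if (o instanceof String s) {
            return "short string";
        }
        if (o instanceof Integer i && i > 0) {
            return "positive int";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify("hello"));
        System.out.println(classify("hi"));
        System.out.println(classify(42));
        System.out.println(classify(-1));
    }
}
```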

I'm open to "worse is better" arguments, but please, let's not try to 
convince ourselves that it's not objectively worse.

> The current proposal allows
>    case Rectangle(Point x & true(x > 0), Point y & true(y > 0))
> which is IMO far less readable because the clean separation between the patterns and the expressions is missing.

Yes, it does.  But that's not a bug, it's a feature.  It allows users to 
put the guards where they make sense in the specific situation. Just 
like any feature that gives users flexibility, they have the flexibility 
to write more or less readable code.  Any compositional mechanism can 
lead to better or worse compositions.  But that doesn't seem like a very 
good argument for "force them to always write it the same way, even when 
that way isn't always what they want."

> There is also a mismatch in terms of evaluation: an expression is evaluated from left to right, but for a pattern, you have bindings, and the bindings are all populated at the same time by a deconstructor. This may cause issues; for example, this is legal in terms of execution
>    case Rectangle(Point x & true(x > 0 && y > 0), Point y)
> because at the point where the pattern true(...) is evaluated, the Rectangle has already been destructured. Obviously, we can ban this kind of pattern to try to conserve the left-to-right evaluation, but then it will still leak in a debugger: you have access to the value of 'y' before the expression inside true() is called.

I think the existing flow-scoping rules handle this already.  In 
Rectangle(Point x & true(x > 0 && y > 0), Point y), the guard is 
"downstream" of the `Point x` pattern, but not of the `Point y` pattern 
or the `Rectangle(P,Q)` pattern.  So I think y is simply out of scope at 
this use under the existing rules.  (But, it's a great test case!)
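
As an analogy only (this uses instanceof and &&, not the proposed & 
pattern syntax), the same "downstream" idea can be seen in how flow 
scoping already works for instanceof in Java 16 and later.  The `Pair` 
record and `check` method below are hypothetical stand-ins for the 
Rectangle example.

```java
// Sketch only: FlowScopingSketch, Pair and check are hypothetical names.
final class FlowScopingSketch {

    // Stand-in for the Rectangle of the discussion, with untyped components.
    record Pair(Object first, Object second) {}

    static String check(Object o) {
        // Read left to right: each binding is in scope only "downstream"
        // of the test that introduces it.
        if (o instanceof Pair p
                && p.first() instanceof Integer x
                && x > 0                             // x is in scope; y is not yet
                && p.second() instanceof Integer y
                && y > x) {                          // both are in scope here
            return "ok: " + x + " < " + y;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(check(new Pair(1, 2)));
        System.out.println(check(new Pair(0, 2)));
    }
}
```

Changing the first condition to `x > 0 && y > 0` would be rejected by 
the compiler, because `y` has not been introduced at that point.  That 
is the instanceof counterpart of y being out of scope inside the guard 
attached to `Point x`.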

> Also in terms of syntax again, introducing '&' in between patterns overloads the operator '&' with yet another meaning; my students already have trouble making the distinction between & and && in expressions.
> As I already said earlier, and this is also said in the Python design document, we don't really need an explicit 'and' operator in between patterns because there is already an implicit 'and' between the sub-patterns of a deconstructing pattern.

You know the rule: you don't get to pick syntax nits unless you're ready 
to endorse the rest of the design :)

> For me, the cons far outweigh the pros here, but perhaps I've missed something?

You make a lot of arguments here.  I don't buy most of the specifics -- 
they mostly feel like "XY" arguments -- but I do think that there is an 
X here, that you're walking in circles around but not naming. So let's 
try to name that.

The closest you get to is this closing argument:

> To finish, there is also an issue with the lack of familiarity. When we designed lambdas, we took great care to have a syntax similar to the C#, JS, and Scala syntax. The concept of guards is well known; to introduce a competing feature in terms of syntax and semantics, the bar has to be set very high, because we are forcing people to learn a Java-specific syntax not seen in any other mainstream language*.

which can be interpreted as "what you propose is clearly better than any 
language with patterns has done, but we may win anyway by doing the 
silly, ad-hoc, weak thing every other language does, because it will be 
more familiar to Java developers."   I think this is a potentially valid 
argument, but let's be honest about the argument we're making.

The other argument you're making is a slightly different one: that the 
benefits of combining patterns with &, and of having patterns that take 
expressions as input, are motivated more by _future_ use cases than by 
the ones that this JEP is addressing, and justifying a more powerful 
mechanism "because you're going to need it soon" can be a hard sell.

The third argument, which you've not made, but I'll make for you, is one 
of discoverability.  If you wanted to write

     case Foo(var x, var y) & true(x > y):

it is not obvious that the way to refine a pattern is to & it with a 
pattern that ignores its target; this requires a more sophisticated 
notion of what a pattern is than users will have now. In other words, 
this is "too clever"; it requires users to absorb too many new concepts 
and combine them in a novel way, when all they want is a simple 
conditional filter.
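
One way to see what "a pattern that ignores its target" means is to 
model patterns as ordinary values.  The sketch below is a run-time toy 
model, not the JEP's design or any real API: `Pat`, `guard`, `fooXY`, 
and the `Foo` record are all hypothetical names, and bindings are kept 
in a plain map rather than being statically typed.  It assumes Java 16 
or later for records and instanceof patterns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Sketch only: a toy run-time model of patterns, not a proposed API.
final class PatternAndSketch {

    /** A pattern tests a target and may record bindings. */
    interface Pat {
        boolean match(Object target, Map<String, Object> bindings);

        /** Pattern AND: both patterns see the same target, left to right. */
        default Pat and(Pat other) {
            return (t, b) -> this.match(t, b) && other.match(t, b);
        }
    }

    record Foo(int x, int y) {}

    /** Models Foo(var x, var y): matches a Foo and binds its components. */
    static Pat fooXY() {
        return (t, b) -> {
            if (!(t instanceof Foo f)) {
                return false;
            }
            b.put("x", f.x());
            b.put("y", f.y());
            return true;
        };
    }

    /** Models true(cond): ignores the target, tests only the bindings so far. */
    static Pat guard(Predicate<Map<String, Object>> cond) {
        return (t, b) -> cond.test(b);
    }

    private static int intOf(Map<String, Object> b, String name) {
        return (Integer) b.get(name);
    }

    /** Models Foo(var x, var y) & true(x > y). */
    static Pat fooWhereXGreaterThanY() {
        return fooXY().and(guard(b -> intOf(b, "x") > intOf(b, "y")));
    }

    static boolean matches(Pat p, Object target) {
        return p.match(target, new HashMap<>());
    }
}
```

In this model the guard is nothing special: it is just another pattern 
that happens not to look at its target, and `and` works the same way 
for it as for any other pattern.  That is the conceptual step the 
discoverability argument says users may not be ready to make.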

I think *these* arguments are valid stewardship discussions to be having.

As a counterargument, let me point you to this thread from amber-dev 2 
years ago (!).  It takes a few messages to get going, but the general 
idea is "why should we make pattern matching an expression using 
instanceof, and not just have an `ifmatch` statement?"  My reply, which 
is a paean to the glory of composition, is here:

https://mail.openjdk.java.net/pipermail/amber-dev/2018-December/003842.html

This is not to say that this argument is the right answer in all cases, 
but that it is soooo easy to convince ourselves that an ad-hoc, 
non-compositional mechanism is "all we'll ever need", and that usually 
ends in tears.

Let's also realize that we're likely to get here eventually anyway; 
pattern AND is valuable, and eventually you'll be able to write 
true-like patterns as ordinary declared patterns.

So, having laid out the real argument -- that maybe we're getting ahead 
of ourselves -- is either of the alternatives (no guards, or a guard 
construct nailed onto the side of switch) acceptable?  (Remi, I know your 
answer, so I'm asking everyone else.)

(Or, could it be that the real concern here is just that `true` is ugly?)


