Two new draft pattern matching JEPs
Brian Goetz
brian.goetz at oracle.com
Wed Mar 3 16:53:29 UTC 2021
> For starters, at a high level, the idea is to mix patterns and expressions (guards are boolean expressions), but at the same time, we have discussed several times not allowing constants inside patterns, to make a clear distinction between patterns and expressions. We have an inconsistency here.
No, this is not *inconsistent* at all. (It may be confusing, though.)
We've already discussed how some patterns (e.g., regex) will take input
arguments, which are expressions. We haven't quite nailed down our
syntactic conventions for separating input expressions from output
bindings, but the notion of a pattern that accepts expressions as input
is most decidedly not outside the model.
Your real argument here seems to be that this is a design based on where
the language is going, and the pieces that we are exposing now don't
stand on their own as well as we might like. (We've seen this before;
users can often react very badly to design decisions whose justification
is based on where we're going, but haven't yet gotten to.)
> The traditional approach for guards cleanly separates the pattern part from the expression part
> case Rectangle(Point x, Point y) if x > 0 && y > 0
> which makes far more sense IMO.
Yes, we were stuck there for a while as being the "obvious" choice
(modulo choice of keyword); to treat this problem as a switch problem,
and nail new bits of syntax onto switch.
To restate the obvious, this is a considerably weaker construct. It is
(a) switch-specific, and (b) means we cannot intersperse guards with
more patterns, we have to do all the patterns and then all the guards
(i.e., it doesn't compose.) In the case of simple switches over
records, this extra power is not obviously needed, but its lack will
bite us as patterns get more powerful.
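As a rough illustration of what "compose" means here, the sketch below models patterns as ordinary Java objects. It is not the syntax proposed in the JEP drafts: the `Pattern` interface, its `and`, `guard`, and `any` methods, and the `Point` and `Rectangle` records are all hypothetical names invented for this example, and bindings are left out entirely. It assumes JDK 16 or later for records. The point it tries to show is that once a guard is itself a pattern, it can sit anywhere a pattern can, including between sub-patterns.

```java
import java.util.function.Predicate;

class PatternComposition {
    // Hypothetical stand-ins for the types used in the thread.
    record Point(int x, int y) {}
    record Rectangle(Point upperLeft, Point lowerRight) {}

    // A "pattern" modeled as a test on a target; bindings are omitted for brevity.
    interface Pattern<T> {
        boolean matches(T target);

        // Pattern AND: both patterns must match the same target.
        default Pattern<T> and(Pattern<T> other) {
            return t -> matches(t) && other.matches(t);
        }

        // A guard lifted into a pattern, so it can appear wherever a pattern can.
        static <T> Pattern<T> guard(Predicate<T> condition) {
            return condition::test;
        }

        // Matches any target.
        static <T> Pattern<T> any() {
            return t -> true;
        }
    }

    // Deconstruction-style pattern: matches when each component matches its sub-pattern.
    static Pattern<Rectangle> rectangle(Pattern<Point> ul, Pattern<Point> lr) {
        return r -> ul.matches(r.upperLeft()) && lr.matches(r.lowerRight());
    }

    // The guard is attached to a sub-pattern, not bolted onto the end of a case label.
    static final Pattern<Point> POSITIVE =
            Pattern.<Point>any().and(Pattern.guard(p -> p.x() > 0 && p.y() > 0));

    static final Pattern<Rectangle> POSITIVE_RECT = rectangle(POSITIVE, POSITIVE);

    public static void main(String[] args) {
        Rectangle inside = new Rectangle(new Point(1, 2), new Point(3, 4));
        Rectangle outside = new Rectangle(new Point(-1, 2), new Point(3, 4));
        System.out.println(POSITIVE_RECT.matches(inside));   // true
        System.out.println(POSITIVE_RECT.matches(outside));  // false
    }
}
```

A switch-only `if` clause corresponds to applying a single predicate after the whole `rectangle(...)` pattern has matched; in this model that is just one of several places the guard could go.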
I think the argument you want to be making here is "OK, so the
compositional approach is obviously better and more powerful, but it
will put the users off, especially in the simple cases, and most of the
cases they will run into for the next few years are simple". This leads
you to "so we should do something that is objectively worse, but less
challenging to the users." (Despite this "loaded" way of writing it,
this really is the argument you're making. And it is a valid one -- but
please, let's not try to convince ourselves that it is remotely better
in any fundamental way, other than "doesn't make people think as hard.")
Essentially, I think what you're saying is that it is _too early_ to
introduce & patterns -- because guards are a weak motivation to do so.
I don't disagree with this point, but it puts us in a bind -- either we
don't provide a way to write guarded patterns now (which is not a
problem for instanceof, but is for switch), or we nail some bit of
terrible syntax onto switch that we're stuck with. These are not good
choices.
I'm open to "worse is better" arguments, but please, let's not try to
convince ourselves that it's not objectively worse.
> The current proposal allows
> case Rectangle(Point x & true(x > 0), Point y & true(y > 0))
> which is IMO far less readable because the clean separation between the patterns and the expressions is missing.
Yes, it does. But that's not a bug, it's a feature. It allows users to
put the guards where they make sense in the specific situation. Just
like any feature that gives users flexibility, they have the flexibility
to write more or less readable code. Any compositional mechanism can
lead to better or worse compositions. But that doesn't seem like a very
good argument for "force them to always write it the same way, even when
that way isn't always what they want."
> There is also a mismatch in terms of evaluation: an expression is evaluated from left to right, whereas for a pattern, the bindings are all populated at the same time by a deconstructor. This may cause issues; for example, this is legal in terms of execution
> case Rectangle(Point x & true(x > 0 && y > 0), Point y)
> because at the point where the pattern true(...) is evaluated, the Rectangle has already been destructured. Obviously, we can ban this kind of pattern to try to preserve left-to-right evaluation, but then it will still leak in a debugger: you have access to the value of 'y' before the expression inside true() is called.
I think the existing flow-scoping rules handle this already.
In Rectangle(Point x & true(x > 0 && y > 0), Point y), the guard is
"downstream" of the `Point x` pattern, but not of the `Point y` pattern
or the `Rectangle(P,Q)` pattern. So I think y is just out of scope at
this use by the existing rules. (But, it's a great test case!)
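The `&` and `true(...)` forms discussed here are draft syntax and cannot be compiled as shown, but the flow-scoping rules being referred to can be seen with `instanceof` patterns as shipped in JDK 16 and later. The sketch below is only an analogy under that assumption; the `FlowScoping` class and its `Point` record are hypothetical names made up for the example. A binding is in scope only where the pattern that introduces it has definitely matched, which is the same reasoning that would put `y` out of scope in the guard above.

```java
class FlowScoping {
    // Hypothetical record for illustration.
    record Point(int x, int y) {}

    static String describe(Object o) {
        // `p` is in scope on the right of &&, because that operand is
        // evaluated only when the type test on the left has succeeded.
        if (o instanceof Point p && p.x() > 0) {
            return "positive x: " + p.x();
        }

        // Would not compile: `p` is not in scope to the left of the
        // pattern that introduces it.
        // if (p.x() > 0 && o instanceof Point p) { ... }

        if (!(o instanceof Point q)) {
            return "not a point";
        }
        // `q` is in scope here: the `if` above returns whenever the match fails.
        return "other point: " + q.x();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(3, 0)));
        System.out.println(describe("hello"));
        System.out.println(describe(new Point(-1, 5)));
    }
}
```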
> Also, in terms of syntax again, introducing '&' between patterns overloads the operator '&' with yet another meaning; my students already have trouble distinguishing & from && in expressions.
> As I already said earlier, and as the Python design document also says, we don't really need an explicit 'and' operator between patterns, because there is already an implicit 'and' between the sub-patterns of a deconstructing pattern.
You know the rule: you don't get to pick syntax nits unless you're ready
to endorse the rest of the design :)
> For me, the cons far outweigh the pros here, but perhaps I've missed something?
You make a lot of arguments here. I don't buy most of the specifics --
they mostly feel like "XY" arguments -- but I do think that there is an
X here, that you're walking in circles around but not naming. So let's
try to name that.
The closest you get is this closing argument:
> To finish, there is also an issue with the lack of familiarity. When we designed lambdas, we took great care to have a syntax similar to the C#, JS, and Scala syntax. The concept of guards is well known; to introduce a competing feature in terms of syntax and semantics, the bar has to be set very high, because we are forcing people to learn a Java-specific syntax not seen in any other mainstream language*.
which can be interpreted as "what you propose is clearly better than any
language with patterns has done, but we may win anyway by doing the
silly, ad-hoc, weak thing every other language does, because it will be
more familiar to Java developers." I think this is a potentially valid
argument, but let's be honest about the argument we're making.
The other argument you're making is a slightly different one: that the
benefits of combining patterns with &, and of having patterns that take
expressions as input, are motivated more by _future_ use cases than by the
ones that this JEP is addressing, and justifying a more powerful mechanism
"because you're going to need it soon" can be a hard sell.
The third argument, which you've not made, but I'll make for you, is one
of discoverability. If you wanted to write
case Foo(var x, var y) & true(x > y):
it is not obvious that the way to refine a pattern is to & it with a
pattern that ignores its target; this requires a more sophisticated
notion of what a pattern is than users will have now. In other words,
this is "too clever"; it requires users to absorb too many new concepts
and combine them in a novel way, when all they want is a simple
conditional filter.
I think *these* arguments are valid stewardship discussions to be having.
As a counterargument, let me point you to this thread from amber-dev 2
years ago (!). It takes a few messages to get going, but the general
idea is "why should we make pattern matching an expression using
instanceof, and not just have an `ifmatch` statement?" My reply, which
is a paean to the glory of composition, is here:
https://mail.openjdk.java.net/pipermail/amber-dev/2018-December/003842.html
This is not to say that this argument is the right answer in all cases,
but that it is soooo easy to convince ourselves that an ad-hoc,
non-compositional mechanism is "all we'll ever need", and that usually
ends in tears.
Let's also realize that we're likely to get here eventually anyway;
pattern AND is valuable, and eventually you'll be able to write
true-like patterns as ordinary declared patterns.
So, having laid out the real argument -- that maybe we're getting ahead
of ourselves -- is either of the alternatives (no guards, or a guard
construct nailed on the side of switch) acceptable? (Remi, I know your
answer, so I'm asking everyone else.)
(Or, could it be that the real concern here is just that `true` is ugly?)