Two new draft pattern matching JEPs
forax at univ-mlv.fr
Thu Mar 4 15:56:35 UTC 2021
I want to separate the discussion about & between patterns from the discussion about true()/false(), i.e. mixing patterns and expressions,
because the latter is a call for trouble for me, while the former is just a question of adding a new pattern or not.
I see true() and false() as a heresy because conceptually a bunch of patterns is a kind of declarative API while expressions are not.
Allowing expressions right inside a pattern means that you can witness the evaluation order of patterns, something we don't want.
Here is an example with two patterns that start with the same prefix:
private static int COUNTER = 0;
private static boolean inc() { COUNTER++; return true; }
...
var rectangle = new Rectangle(new Point(1, 2), new Point(2, 1));
switch(rectangle) {
case Rectangle(Point x & inc(), null) -> ...
case Rectangle(Point x & inc(), Point y) -> ...
}
What is the value of COUNTER?
The fundamental problem is that you cannot skip the evaluation of an expression, while you can skip the evaluation of a pattern, because either you have already evaluated it or you know that this pattern is not relevant.
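To make the question concrete, here is a minimal sketch in plain Java (records only, no pattern switch) of two ways a compiler could plausibly lower the switch above. Everything in it beyond COUNTER and inc() is an assumption made for illustration: the record component names, the two strategy methods, and the simplification that `Point x` is total on a component declared as Point, so only the call to inc() is modelled for the first component.

```java
class EvaluationOrderDemo {
    // Hypothetical shapes for the types used in the mail's example.
    record Point(int x, int y) {}
    record Rectangle(Point topLeft, Point bottomRight) {}

    static int counter = 0;

    // Side-effecting expression placed inside the pattern, as inc() in the mail.
    static boolean inc() { counter++; return true; }

    static Rectangle sample() {
        return new Rectangle(new Point(1, 2), new Point(2, 1));
    }

    // Strategy A: test each case independently, top to bottom.
    // Returns the index of the selected case (0 if none).
    static int caseByCase(Rectangle r) {
        counter = 0;
        // case Rectangle(Point x & inc(), null)
        if (inc() && r.bottomRight() == null) { return 1; }
        // case Rectangle(Point x & inc(), Point y)
        if (inc() && r.bottomRight() != null) { return 2; }
        return 0;
    }

    // Strategy B: evaluate the prefix shared by both cases only once,
    // then branch on the second component.
    static int sharedPrefix(Rectangle r) {
        counter = 0;
        if (inc()) {
            return r.bottomRight() == null ? 1 : 2;
        }
        return 0;
    }

    public static void main(String[] args) {
        int selectedA = caseByCase(sample());
        System.out.println("case-by-case:  case " + selectedA + ", counter = " + counter);
        int selectedB = sharedPrefix(sample());
        System.out.println("shared prefix: case " + selectedB + ", counter = " + counter);
    }
}
```

Both strategies select the second case, but the first leaves the counter at 2 and the second at 1. Once an arbitrary expression sits inside the pattern, the choice of strategy becomes visible to user code.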
For a Stream, we cannot prevent the lambda of a map or a filter from doing a side effect at runtime; here we have the possibility to say that a pattern is not an expression but a more declarative form.
You may object that a deconstructor can do side effects so the point is moot, but even so there is a big difference between allowing someone to create a class with a deconstructor that does a side effect and allowing anyone who writes a switch to do arbitrary side effects in the middle of the patterns.
Now, you are saying that not allowing expressions anywhere is a weaker construct, and yes, this is true, but patterns are more than just a nice syntactic construct,
they have the potential to also have much nicer semantics that will free people from having to think in terms of side effects.
(other comments inlined)
----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "Gavin Bierman" <gavin.bierman at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, March 3, 2021 17:53:29
> Subject: Re: Two new draft pattern matching JEPs
>> For a starter, at high level, the idea is to mix patterns and expressions
>> (guards are boolean expressions), but at the same time, we have discussed
>> several times to not allow constants inside patterns to make a clear
>> distinction between patterns and expressions. We have a inconsistency here.
>
> No, this is not *inconsistent* at all. (It may be confusing, though.)
> We've already discussed how some patterns (e.g., regex) will take input
> arguments, which are expressions. We haven't quite nailed down our
> syntactic conventions for separating input expressions from output
> bindings, but the notion of a pattern that accepts expressions as input
> is most decidedly not outside the model.
>
> Your real argument here seems to be that this is a design based on where
> the language is going, and the pieces that we are exposing now don't
> stand on their own as well as we might like. (We've seen this before;
> users can often react very badly to design decisions whose justification
> is based on where we're going, but haven't gotten.)
There is a misunderstanding here, I'm referring to the fact that
case Point(1, 1):
is actually rejected because it looks too much like new Point(1, 1), but at the same time you want to allow expressions in the middle of patterns.
>
>> The traditional approach for guards cleanly separate the pattern part from the
>> expression part
>> case Rectangle(Point x, Point y) if x > 0 && y > 0
>> which makes far more sense IMO.
>
> Yes, we were stuck there for a while as being the "obvious" choice
> (modulo choice of keyword); to treat this problem as a switch problem,
> and nail new bits of syntax onto switch.
>
> To restate the obvious, this is a considerably weaker construct. It is
> (a) switch-specific, and (b) means we cannot intersperse guards with
> more patterns, we have to do all the patterns and then all the guards
> (i.e., it doesn't compose.) In the case of simple switches over
> records, this extra power is not obviously needed, but its lack will
> bite us as patterns get more powerful.
For (a), yes, it's switch-specific, and that's great because we don't need it for instanceof: you can already use && inside the if of an instanceof. You don't need it when declaring local variables either, because the pattern has to be total. So being specific to switch is not really an issue.
For (b), you can always shift all the contents of true() and false() to the right into a traditional guard, so we don't need true() and false().
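Both points can be sketched in plain Java. For (a), a type pattern in instanceof already composes with && and no extra guard syntax. For (b), a condition written next to the first component gives the same answer as the same condition moved to the end, as long as the conditions have no side effects. The record shapes, the method names, and the reading of "x > 0" as a test on the point's x coordinate are assumptions made for this illustration.

```java
class TrailingGuardDemo {
    // Hypothetical shapes for the types used in the thread.
    record Point(int x, int y) {}
    record Rectangle(Point topLeft, Point bottomRight) {}

    // (a) instanceof: the binding p is in scope on the right of &&.
    static boolean isPositivePoint(Object o) {
        return o instanceof Point p && p.x() > 0 && p.y() > 0;
    }

    // (b) Conditions interleaved with the components, in the spirit of
    // Rectangle(Point x & true(x > 0), Point y & true(y > 0)).
    static boolean interleaved(Rectangle r) {
        Point first = r.topLeft();
        if (!(first.x() > 0)) { return false; }
        Point second = r.bottomRight();
        return second.x() > 0;
    }

    // (b) The same conditions shifted to the right, in the spirit of
    // case Rectangle(Point x, Point y) if x > 0 && y > 0.
    static boolean trailing(Rectangle r) {
        Point first = r.topLeft();
        Point second = r.bottomRight();
        return first.x() > 0 && second.x() > 0;
    }

    // Compares both forms on a small grid of coordinates.
    static boolean agreeOnGrid() {
        for (int a = -1; a <= 1; a++) {
            for (int b = -1; b <= 1; b++) {
                Rectangle r = new Rectangle(new Point(a, 0), new Point(b, 0));
                if (interleaved(r) != trailing(r)) { return false; }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPositivePoint(new Point(1, 2)));
        System.out.println(agreeOnGrid());
    }
}
```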
[...]
>
>> There is also a mismatch in term of evaluation, an expression is evaluated from
>> left to right, for a pattern, you have bindings and bindings are all populated
>> at the same time by a deconstructor, this may cause issue, by example, this is
>> legal in term of execution
>> case Rectangle(Point x & true(x > 0 && y > 0), Point y)
>> because at the point where the pattern true(...) is evaluated, the Rectangle has
>> already been destructured, obviously, we can ban this kind of patterns to try
>> to conserve the left to right evaluation but the it will still leak in a
>> debugger, you have access to the value of 'y' before the expression inside
>> true() is called.
>
> I think the existing flow-scoping rules handle this already.
> Rectangle(Point x & true(x > 0 && y > 0), Point y). The guard is
> "downstream" of the `Point x` pattern, but not of the `Point y` pattern
> or the `Rectangle(P,Q)` pattern. So I think y is just out of scope at
> this use by the existing rules. (But, it's a great test case!)
My point is that at the time the content of true() is executed, 'x' AND 'y' are already known.
So the patterns
Rectangle(Point x & true(x > 0 && y > 0), Point y)
Rectangle(Point x & true(x > 0), Point y & true(y > 0))
or
Rectangle(Point x, Point y) & true(x > 0 && y > 0)
are all equivalent in terms of runtime execution, so there is little point in having different syntaxes for the same execution.
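Here is a rough model of that claim, under the premise stated above that the deconstructor populates all bindings in a single call. The deconstruct helper, the trace list, and the reading of "x > 0" as a test on the point's x coordinate are hypothetical, and the exact lowering is not something the draft JEPs specify. Under that premise, the spellings listed above end up as the same sequence of steps.

```java
import java.util.ArrayList;
import java.util.List;

class BindingOrderDemo {
    // Hypothetical shapes for the types used in the thread.
    record Point(int x, int y) {}
    record Rectangle(Point topLeft, Point bottomRight) {}

    static final List<String> trace = new ArrayList<>();

    // Hypothetical deconstructor: extracts both components in one call.
    static Point[] deconstruct(Rectangle r) {
        trace.add("deconstruct");
        return new Point[] { r.topLeft(), r.bottomRight() };
    }

    // A guard on one binding, logged so the order is observable.
    static boolean positive(String name, Point p) {
        trace.add("guard:" + name);
        return p.x() > 0;
    }

    // Common lowering, under the premise above, for a guard on x and y
    // wherever it is written inside or after the Rectangle pattern.
    static boolean match(Rectangle r) {
        Point[] bindings = deconstruct(r);  // x and y are both known here
        return positive("x", bindings[0]) && positive("y", bindings[1]);
    }

    // Runs match on a rectangle built from two x coordinates and returns the trace.
    static List<String> traceFor(int firstX, int secondX) {
        trace.clear();
        match(new Rectangle(new Point(firstX, 0), new Point(secondX, 0)));
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println(traceFor(1, 2));
        System.out.println(traceFor(-1, 2));
    }
}
```

In both runs "deconstruct" is the first entry of the trace, so the value bound to y already exists when the guard on x runs, whatever position the guard occupies in the source.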
[...]
>
> The closest you get to is this closing argument:
>
>> To finish, there is also an issue with the lack of familiarity, when we have
>> designed lambdas, we have take a great care to have a syntax similar to the C#,
>> JS, Scala syntax, the concept of guards is well known, to introduce a competing
>> feature in term of syntax and semantics, the bar has to be set very high
>> because we are forcing people to learn a Java specific syntax, not seen in any
>> other mainstream languages*.
>
> which can be interpreted as "what you propose is clearly better than any
> language with patterns has done, but we may win anyway by doing the
> silly, ad-hoc, weak thing every other language does, because it will be
> more familiar to Java developers." I think this is a potentially valid
> argument, but let's be honest about the argument we're making.
And it's not really better, because most of the combinations are just different syntaxes for the same runtime semantics, which is for me more of an issue than anything else.
Rémi