Draft JEP on pattern matching
Brian Goetz
brian.goetz at oracle.com
Sun Jun 4 21:35:45 UTC 2017
> This looks good to me as a first step. I'm curious about the semantics
> of the new switch statement.
>
> Haskell and friends have the behaviour that the matches are
> conceptually evaluated from top to bottom:
>
> data T = A | B | C
>
> f :: T -> Integer
> f x =
> case x of
> A -> 0
> B -> 1
> C -> 2
>
> So:
>
> f A ⇒ 0
> f B ⇒ 1
> f C ⇒ 2
>
> However:
>
> f :: T -> Integer
> f x =
> case x of
> _ -> 0
> B -> 1
> C -> 2
>
> This would yield (with various compilation warnings):
>
> f A ⇒ 0
> f B ⇒ 0
> f C ⇒ 0
>
> ... because the _ pattern matches anything and, as the branches are
> conceptually evaluated from top to bottom, the B and C patterns are
> unreachable.
All things being equal, tests are applied in the order they are
specified in the source file. (If we can statically eliminate a test
with class hierarchy analysis, we don't actually have to do the test.)
And just as we do today with catch blocks, there will be a flow analysis
for detecting dominated tests, which will result in compilation errors
(since the code they guard is unreachable.)
For the flow analysis, we impose a partial order on patterns. The most
obvious source of ordering is subtyping; if T <: U, then "T t" < "U u"
(type test patterns.) One can do the same for deconstruction patterns
(when we have them.) And similarly for AND patterns if we have them: P
< P && Q. And all patterns are < default. Etc. If we find that the
patterns are listed in an order inconsistent with this ordering, that's
an error.
> So:
>
> f :: T -> Integer
> f x =
> case x of
> A -> 0
> _ -> 1
>
> Would yield:
>
> f A ⇒ 0
> f B ⇒ 1
> f C ⇒ 2
With the correction (f C -> 1), that's right.
> Java's existing switch statement obviously doesn't do this: The branch
> of the switch that matches exactly is the only one that's evaluated.
> I'm curious how the new switch statement would change these semantics.
> How would the following behave with guards?
>
> switch (c) {
> case Integer i && i >= 0: return 1;
> case Integer i && i >= 1: return 2;
> default: return 3;
> }
As humans, we see that the first test dominates the second, so we might
hope for a compiler error telling us that the second is unreachable.
But obviously this is not possible in the general case, so the compiler
will probably conclude that these are unordered with respect to each
other, and take the first one.
You can consider this a generalization of the existing switch behavior
(which only supports constant patterns); since constant patterns are all
unordered with respect to each other, there's no possibility that
multiple patterns could match, so there's never a question about program
order.
If the guards (or patterns, for that matter) have side-effects, you will
probably get less than an imperative programmer might want in the way of
promises about the timing, order, or arity of the side-effects.
> For that matter:
>
> switch (c) {
> case Integer i: return 1;
> case Number n: return 2;
> case Object o: return 3;
> default: return 4;
> }
>
> What happens if c is an Integer? A Number? An Object? Is the default
> branch actually necessary if one of the branches specifies Object with
> no guards?
>
Type test patterns apply to the dynamic type, of course. So we test
first against Integer, then against Number, then against Object. The
default branch will never be taken (and can be statically identified as
unreachable.)
More information about the amber-dev
mailing list