Draft JEP on pattern matching

Sun Jun 4 21:35:45 UTC 2017

> This looks good to me as a first step. I'm curious about the semantics
> of the new switch statement.
>
> Haskell and friends have the behaviour that the matches are
> conceptually evaluated from top to bottom:
>
>    data T = A | B | C
>
>    f :: T -> Integer
>    f x =
>      case x of
>        A -> 0
>        B -> 1
>        C -> 2
>
> So:
>
>    f A ⇒ 0
>    f B ⇒ 1
>    f C ⇒ 2
>
> However:
>
>    f :: T -> Integer
>    f x =
>      case x of
>        _ -> 0
>        B -> 1
>        C -> 2
>
> This would yield (with various compilation warnings):
>
>    f A ⇒ 0
>    f B ⇒ 0
>    f C ⇒ 0
>
> ... because the _ pattern matches anything and, as the branches are
> conceptually evaluated from top to bottom, the B and C patterns are
> unreachable.

All things being equal, tests are applied in the order they are 
specified in the source file.  (If we can statically eliminate a test 
with class hierarchy analysis, we don't actually have to do the test.)  
And just as we do today with catch blocks, there will be a flow analysis 
for detecting dominated tests, which will result in compilation errors 
(since the code they guard is unreachable.)

For the flow analysis, we impose a partial order on patterns.  The most 
obvious source of ordering is subtyping; if T <: U, then "T t" < "U u" 
(type test patterns.)  One can do the same for deconstruction patterns 
(when we have them.)  And similarly for AND patterns if we have them: P 
< P && Q.  And all patterns are < default.  Etc.  If we find that the 
patterns are listed in an order inconsistent with this ordering, that's 
an error.

> So:
>
>    f :: T -> Integer
>    f x =
>      case x of
>        A -> 0
>        _ -> 1
>
> Would yield:
>
>    f A ⇒ 0
>    f B ⇒ 1
>    f C ⇒ 2

With the correction (f C -> 1), that's right.
> Java's existing switch statement obviously doesn't do this: The branch
> of the switch that matches exactly is the only one that's evaluated.
> I'm curious how the new switch statement would change these semantics.
> How would the following behave with guards?
>
>    switch (c) {
>      case Integer i && i >= 0: return 1;
>      case Integer i && i >= 1: return 2;
>      default: return 3;
>    }

As humans, we see that the first test dominates the second, so we might 
hope for a compiler error telling us that the second is unreachable.  
But obviously this is not possible in the general case, so the compiler 
will probably conclude that these are unordered with respect to each 
other, and take the first one.

You can consider this a generalization of the existing switch behavior 
(which only supports constant patterns); since constant patterns are all 
unordered with respect to each other, there's no possibility that 
multiple patterns could match, so there's never a question about program 
order.

If the guards (or patterns, for that matter) have side-effects, you will 
probably get less than an imperative programmer might want in the way of 
promises about the timing, order, or arity of the side-effects.

> For that matter:
>
>    switch (c) {
>      case Integer i: return 1;
>      case Number n: return 2;
>      case Object o: return 3;
>      default: return 4;
>    }
>
> What happens if c is an Integer? A Number? An Object? Is the default
> branch actually necessary if one of the branches specifies Object with
> no guards?
>

Type test patterns apply to the dynamic type, of course.  So we test 
first against Integer, then against Number, then against Object. The 
default branch will never be taken (and can be statically identified as 
unreachable.)