New [draft] JEPs for Pattern Matching

Brian Goetz brian.goetz at oracle.com
Fri Dec 14 15:14:38 UTC 2018


> I can see plenty of examples and motivating use cases for a single
> pattern test in an if statement, with optional when and optional else.

OK, now we're getting somewhere.  What you're really saying is "I can 
imagine wanting to do things like...

     if (target matches pattern && result-refinement) {
         ...
     }
     else {
         ...
     }

...but I'm having trouble seeing examples that go much past that. Please 
help me see what you see."

Recall that pattern matching fuses three things:
  - A test
  - Zero or more conditional extractions, if the test succeeds
  - Binding the extracted quantities into fresh variables (which are in 
scope in the code dominated by a successful match)

The reason pattern matching makes sense as a linguistic abstraction is 
that we tend to do these things together _all the time_.  What do you do 
after a successful instanceof test?  Cast the target -- almost 100% of 
the time.  What do you do next?  Put the result in a variable.  What do 
you do next?  Use the variable, because it has a refined type.  Why make 
those three things, when they can be one?

The benefit of transforming

     if (x instanceof Foo) {
         Foo f = (Foo) x;
         // use f
     }

into

     if (x instanceof Foo f) {
         // use f
     }

is obvious; it's not only more direct, but eliminates an opportunity to 
make a silly error.  But that fits into what you've already imagined; 
what's outside that?

I think part of the trouble you're having is that the patterns defined in 
the first round are relatively weak (instanceof), so they don't get 
combined in terribly interesting ways -- yet.

So first, let's digress into where we're going.  Another place where we 
needlessly decompose the test-extract-bind triplet is in APIs like 
Optional.  We have `Optional::isPresent` and `Optional::get`; it is a 
very common mistake to use the latter without checking the former 
first.  The root of this user mistake is in the API design; it is an 
accident waiting to happen, because these two API points really want to 
be _one pattern_.  (Without patterns, we design the APIs with the 
language we've got, and we get this.)  If, instead of having an 
unconditional Optional::get method, we had patterns for 
`Optional.empty()` and `Optional.of(var contents)`, it would be much 
harder to make this mistake, because instead you'd say:

     if (o instanceof Optional.of(var contents)) { /* use contents */ }

The two API points we have -- which must always be used together -- can 
be fused into a single pattern, which is really what the user needs 
anyway, since 99+% of the time we should be calling them together.
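
To make the contrast concrete, here is a minimal sketch in today's Java 
(assuming Java 16 or later).  The `Maybe`, `Some`, and `None` types are 
hypothetical stand-ins invented for this illustration; they are not part 
of the JDK or of any proposal, and the type pattern below only 
approximates the `Optional.of(var contents)` pattern imagined above.

```java
import java.util.Optional;

public class FusedExtraction {
    // Hypothetical Optional-like type with one class per case (an assumption, not a JDK API).
    interface Maybe<T> {}
    record Some<T>(T contents) implements Maybe<T> {}
    record None<T>() implements Maybe<T> {}

    // Two API points: nothing stops a caller from skipping isPresent().
    static int lengthUnchecked(Optional<String> o) {
        return o.get().length(); // throws NoSuchElementException when o is empty
    }

    // One pattern: the contents are reachable only where the test has succeeded.
    static int lengthOrZero(Maybe<String> m) {
        if (m instanceof Some<String> some) {
            return some.contents().length();
        }
        return 0;
    }
}
```

In the second method there is simply no way to reach the contents 
without having passed the test first, which is the property we want.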

OK, now, back to your main question -- why not just do something special 
for `if`?  Boolean expressions can be used in more than `if`; they can 
be used in conditional expressions, while loops, for loops, they can be 
returned from methods, etc.  All of these are useful with patterns, but 
we're not done with "if" yet.

Maybe you have some code that looks like this:

     if (x instanceof OptionHolder oh) {
         // process options
     }
     else {
         throw new InvalidOptionsException();
     }

And suppose the body of the `if` were long.  Very often, people prefer 
(reasonably so) to invert the `if`, so that you first test your 
precondition, fail-fast if it is not met, and then go on to the real work:

     if (!(x instanceof OptionHolder oh)) {
         throw new InvalidOptionsException();
     }
     else {
         // process options
     }

But that only works if I can invert the test.  If `instanceof` is a 
boolean expression, I can do so trivially.  If it's some special form 
for if, I can't.  So now we've invented a language feature that is 
resistant to the most basic of refactorings -- inverting an if-else 
block.  (Which would surely lead to calls for "please add 
if-not-matches".)  So clearly, we want to be able to invert a pattern 
match.

What about preconditions?  Suppose we only want to process the options 
when we're in debug mode.

     if (debugMode && x instanceof OptionHolder oh) {
         // process options
     }

That seems a pretty reasonable thing to want to do.  So not only do we 
want to refine the result of a match, we want to be able to precede it 
with a condition.
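
Both the inverted form and the precondition form can be sketched as 
runnable code, assuming Java 16 or later.  The `OptionHolder` and 
`InvalidOptionsException` definitions here are hypothetical stand-ins 
for the types named in the examples, and the method names are made up 
for illustration.

```java
import java.util.List;

public class InvertedMatch {
    // Hypothetical stand-ins for the types named in the examples above.
    record OptionHolder(List<String> options) {}
    static class InvalidOptionsException extends RuntimeException {}

    static int countOptions(Object x) {
        // Fail fast: the negated match checks the precondition.
        if (!(x instanceof OptionHolder oh)) {
            throw new InvalidOptionsException();
        }
        // The if-body cannot complete normally, so oh must have matched here.
        return oh.options().size();
    }

    static int countOptionsIfDebug(boolean debugMode, Object x) {
        // oh is in scope only where both conditions held.
        if (debugMode && x instanceof OptionHolder oh) {
            return oh.options().size();
        }
        return 0;
    }
}
```

Note that in `countOptions` the binding is usable after the `if`, with 
no `else` needed, because the only way to get past the `if` is for the 
match to have succeeded.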

What about ORing?  Same deal.  Sticking with the "options processing" 
example, imagine we have a command line option processor that returns an 
Optional describing the argument of a switch.  Imagine further we have 
two sources of options: a prefs file we've already parsed, and command 
line switches.

     if (prefsFile.debug()
         || (getOpt("--debug") instanceof Optional(String s)
             && isBooleanTrue(s))) {
         ...
     }

Could I write this without the pattern match?  Sure I could, but it 
would be messier and more error-prone.
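
For comparison, here is a rough sketch of the version without the 
pattern match.  The body of `isBooleanTrue` is a hypothetical 
implementation, and the two parameters stand in for `prefsFile.debug()` 
and the result of `getOpt("--debug")`, whose real definitions aren't 
shown here.

```java
import java.util.Optional;

public class WithoutPatterns {
    // Hypothetical helper: accepts a few common spellings of "true".
    static boolean isBooleanTrue(String s) {
        return s.equalsIgnoreCase("true")
            || s.equalsIgnoreCase("t")
            || s.equalsIgnoreCase("yes");
    }

    // prefsDebug stands in for prefsFile.debug(); opt stands in for getOpt("--debug").
    static boolean debugEnabled(boolean prefsDebug, Optional<String> opt) {
        // The test and the extraction are separate calls that we must remember to pair up.
        return prefsDebug || (opt.isPresent() && isBooleanTrue(opt.get()));
    }
}
```

The Optional has to be named so that it can be mentioned twice, once to 
test it and once to extract from it, and nothing checks that the two 
mentions stay together.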

(Digression: if we wrote an options-processing framework for this, 
rather than capture "is this string the boolean true" as a static method 
from String to boolean, we'd write it as a pattern, and nest the 
patterns, exposing the boolean result we want, and hiding the raw 
string, which is only useful for further parsing:

     if (getOpt("--debug") instanceof Optional(BooleanString(boolean b))) { ... }

BooleanString could fail to match if the string were not a valid boolean 
(i.e., not "true", "false", "t", "f", "yes", "no", etc.) and otherwise 
convert the string to a boolean.

End digression.)

OK, so we've seen that even for if, being able to combine it with other 
conditionals with && and || is valuable, as is inversion.

If something is useful as the operand of `if`, it's useful in a 
conditional expression:

     int maxThreads = getOpt("--max-threads") instanceof Optional(String s)
         ? Integer.parseInt(s)
         : DEFAULT_MAX_THREADS;

Why would we tell people they can't refactor their if statements to 
conditionals?
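
The same shape works with a plain type pattern, which makes for a 
runnable sketch (assuming Java 16 or later).  The `raw` parameter and 
the value of `DEFAULT_MAX_THREADS` are assumptions for illustration; 
`raw` stands in for whatever the option lookup returned, or null if the 
switch was absent.

```java
public class ConditionalMatch {
    static final int DEFAULT_MAX_THREADS = 4; // hypothetical default

    // raw stands in for the value of the --max-threads switch, or null if absent.
    static int maxThreads(Object raw) {
        return raw instanceof String s
            ? Integer.parseInt(s)   // s is in scope only in the "true" arm
            : DEFAULT_MAX_THREADS;
    }
}
```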

If something is useful as the operand of an if, it's probably useful on 
its own:

     boolean debugValue = getOpt("--debug") instanceof Optional(String s)
                              && isBooleanTrue(s);

Sometimes we return booleans from methods.  Like, `Object::equals`.  
Currently, we write `equals()` in a convoluted way (here's some 
IDE-generated code):

     public boolean equals(Object o) {
         if (this == o) return true;
         if (!(o instanceof Name)) return false;

         Name name = (Name) o;

         if (!first.equals(name.first)) return false;
         return last.equals(name.last);
     }

Such control flow.  Much goto.

Instead, I can write this as one expression with a pattern match:

     public boolean equals(Object o) {
         return (o instanceof Name name)
             && first.equals(name.first)
             && last.equals(name.last);
     }

This works because `instanceof` is a boolean expression.
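
Here is that `equals()` in a complete class, as a sketch assuming Java 
16 or later.  The constructor and `hashCode()` are my additions to make 
it self-contained; the field names follow the example above.

```java
import java.util.Objects;

public final class Name {
    private final String first;
    private final String last;

    public Name(String first, String last) {
        this.first = Objects.requireNonNull(first);
        this.last = Objects.requireNonNull(last);
    }

    @Override
    public boolean equals(Object o) {
        // instanceof is false for null, so no separate null check is needed.
        return (o instanceof Name name)
            && first.equals(name.first)
            && last.equals(name.last);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, last);
    }
}
```

The `this == o` shortcut from the generated version is gone; it was 
only ever an optimization, and the result is the same without it.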

OK, what about other control constructs?  We use booleans in do, while, 
and for loops.  And yes, they're useful there too, though these are 
likely more advanced use cases.  (The more computation that can be 
expressed as a pattern match, such as matching a string against a 
regular expression (which is a perfect application for pattern 
matching), the more likely you are to want to do this in a loop header.)


To answer your concrete questions:


> I assume the following is either pointless or a compile error:
>    boolean isString = obj instanceof String s;

It's valid code, but it's silly, in exactly the same way that we can do 
silly stuff today, like declaring a variable and not using it. In fact, 
that's exactly what this is -- declaring a variable and not using it.  
Legal, but silly.

> What about this - is s in scope in the braces?:
>    while (obj instanceof String s) { ... }

Yes.  Follow the control flow, and ask yourself: when I get inside the 
braces, did the instanceof expression always match?  If so, then yes.  
(Scoping for binding variables is just definite assignment, which is 
just "follow the control flow, was it assigned in all paths that could 
lead here?")
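
A small runnable sketch of a match in a loop header, assuming Java 16 
or later.  The `Cons` record is a hypothetical cons-list cell invented 
for this example; any object that is not a `Cons` ends the list.

```java
public class LoopMatch {
    // Hypothetical cons-list cell: a value plus the rest of the list.
    record Cons(int head, Object tail) {}

    static int sum(Object list) {
        int total = 0;
        // c is in scope in the body because the body runs only after a successful match.
        while (list instanceof Cons c) {
            total += c.head();
            list = c.tail();
        }
        return total;
    }
}
```

Each iteration re-evaluates the `instanceof`, so `c` is a fresh binding 
every time around the loop, and it is not in scope after the loop exits.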

> Or this - pointless, or error?:
>   list.stream().filter(obj -> obj instanceof String s);

Same as the first one; you've declared a variable that is never used.  
It is in scope in invisibly small places where you could usefully put 
code if you wanted to, though:

     list.stream().filter(o -> o instanceof String s && !s.isBlank())

Again, follow the control flow -- when I get to the s.isBlank(), must 
the pattern have matched?  If so, then s is in scope.
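
As a self-contained sketch (assuming Java 16 or later; the class and 
method names are made up for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;

public class FilterMatch {
    static List<Object> nonBlankStrings(List<Object> list) {
        return list.stream()
            // s is in scope to the right of &&, which is evaluated only after a match.
            .filter(o -> o instanceof String s && !s.isBlank())
            .collect(Collectors.toList());
    }
}
```

Note that `filter` only sees a boolean, so the stream's element type 
stays `Object`; the binding `s` lives entirely inside the lambda body.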


Finally, as an example of what people might do if we didn't have an 
instanceof expression -- they might instead synthesize one with a switch 
expression:

     String castedToString = null;
     boolean b = switch (o) {
         case String s -> { castedToString = s; break true; }
         default -> false;
     };

Yuck.  This code is worse than `o instanceof String s` in so many ways.  
Would you want to encourage this?




More information about the amber-dev mailing list