New [draft] JEPs for Pattern Matching

Tue Dec 18 01:05:36 UTC 2018

I asked "Does Java need patterns to be usable as arbitrary boolean
expressions?" and said "I think it's right to question whether there
really is enough motivating use case to justify patterns *everywhere*".
I think your mail answers those questions. While it will allow some
stupid code (and no doubt some horrific code if someone uses side
effects in patterns), the consistency of them being boolean expressions
can be a win.

Thanks for the additional use cases, which I think were worth exploring.
Stephen

On Fri, 14 Dec 2018 at 15:14, Brian Goetz <brian.goetz at oracle.com> wrote:
>
> > I can see plenty of examples and motivating use cases for a single
> > pattern test in an if statement, with optional when and optional else.
>
> OK, now we're getting somewhere.  What you're really saying is "I can
> imagine wanting to do things like...
>
>      if (target matches pattern && result-refinement) {
>          ...
>      }
>      else {
>          ...
>      }
>
> ...but I'm having trouble seeing examples that go much past that. Please
> help me see what you see."
>
> Recall that pattern matching fuses three things:
>   - A test
>   - Zero or more conditional extractions, if the test succeeds
>   - Binding the extracted quantities into fresh variables (which are in
> scope in the code dominated by a successful match)
>
> The reason pattern matching makes sense as a linguistic abstraction is
> that we tend to do these things together _all the time_.  What do you do
> after a successful instanceof test?  Cast the target -- almost 100% of
> the time.  What do you do next?  Put the result in a variable.  What do
> you do next?  Use the variable, because it has a refined type.  Why make
> those three things, when they can be one?
>
> The benefit of transforming
>
>      if (x instanceof Foo) {
>          Foo f = (Foo) x;
>          // use f
>      }
>
> into
>
>      if (x instanceof Foo f) {
>          // use f
>      }
>
> is obvious; it's not only more direct, but eliminates an opportunity to
> make a silly error.  But that fits into what you've already
> imagined; what's outside that?
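A minimal sketch of the two forms above, written against released Java (type patterns for instanceof are available from Java 16). The class and method names are hypothetical, and String stands in for Foo so the example is self-contained.

```java
// Hypothetical demo class; names are illustrative, not from the thread.
final class InstanceofDemo {
    // Older style: the test, the cast, and the binding are three separate steps.
    static int lengthOld(Object x) {
        if (x instanceof String) {
            String s = (String) x;   // the cast has to repeat the tested type
            return s.length();
        }
        return -1;
    }

    // Pattern style: one expression tests, casts, and binds s.
    static int lengthNew(Object x) {
        if (x instanceof String s) {
            return s.length();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(lengthOld("hello") == lengthNew("hello"));
    }
}
```

Both methods behave the same; the second simply leaves no place to cast to the wrong type or to use x where the refined variable was intended.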
>
> I think part of the trouble you're having is that the patterns defined in
> the first round are relatively weak (instanceof), so they don't get
> combined in terribly interesting ways -- yet.
>
> So first, let's digress into where we're going.  Another place where we
> needlessly decompose the test-extract-bind triplet is in APIs like
> Optional.  We have `Optional::isPresent` and `Optional::get`; it is a
> very common mistake to use the latter without checking the former
> first.  The root of this user mistake is in the API design; it is an
> accident waiting to happen, because these two API points really want to
> be _one pattern_.  (Without patterns, we design the APIs with the
> language we've got, and we get this.)  If, instead of having an
> unconditional Optional::get method, we had patterns for
> `Optional.empty()` and `Optional.of(var contents)`, it would be much
> harder to make this mistake, because instead you'd say:
>
>      if (o instanceof Optional.of(var contents)) { /* use contents */ }
>
> The two API points we have -- which must always be used together -- can
> be fused into a single pattern, which is really what the user needs
> anyway, since 99+% of the time we should be calling them together.
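java.util.Optional does not expose such patterns, so the following is only a sketch of the idea using a hypothetical stand-in type. Opt, Some, and None are invented names; the point is that the value is reachable only through a successful match, so there is no unconditional get to misuse. It uses type patterns and records (Java 16 or later).

```java
// Hypothetical stand-in for Optional; Opt, Some, and None are illustrative names.
interface Opt<T> {}
record Some<T>(T value) implements Opt<T> {}
record None<T>() implements Opt<T> {}

final class OptDemo {
    static String describe(Opt<String> o) {
        // One expression both tests for presence and binds the holder of the value.
        if (o instanceof Some<String> some) {
            return "got " + some.value();
        }
        // None has no accessor for a value, so nothing can be read by mistake here.
        return "empty";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Some<>("contents")));
        System.out.println(describe(new None<>()));
    }
}
```

With record patterns (Java 21 or later) the match could also bind the contents directly, as in `o instanceof Some<String>(var contents)`, which is closer to the form shown in the mail.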
>
> OK, now, back to your main question -- why not just do something special
> for `if`?  Boolean expressions can be used in more than `if`; they can
> be used in conditional expressions, while loops, for loops, they can be
> returned from methods, etc.  All of these are useful with patterns, but
> we're not done with "if" yet.
>
> Maybe you have some code that looks like this:
>
>      if (x instanceof OptionHolder oh) {
>          // process options
>      }
>      else {
>          throw new InvalidOptionsException();
>      }
>
> And suppose the body of the `if` were long.  Very often, people prefer
> (reasonably so) to invert the `if`, so that you first test your
> precondition, fail-fast if it is not met, and then go on to the real work:
>
>      if (!(x instanceof OptionHolder oh)) {
>          throw new InvalidOptionsException();
>      }
>      else {
>          // process options
>      }
>
> But that only works if I can invert the test.  If `instanceof` is a
> boolean expression, I can do so trivially.  If it's some special form
> for if, I can't.  So now we've invented a language feature that is
> resistant to the most basic of refactorings -- inverting an if-else
> block.  (Which would surely lead to calls for "please add
> if-not-matches".)  So clearly, we want to be able to invert a pattern
> match.
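A sketch of the inverted, fail-fast form in released Java (16 or later). OptionHolder's shape and the exception type are assumptions made for illustration. Because the guard always throws, the binding remains in scope after the `if` statement, so the else block is not needed.

```java
import java.util.List;

// Hypothetical type; the real OptionHolder in the example is not specified.
record OptionHolder(List<String> options) {}

final class InvertDemo {
    static int countOptions(Object x) {
        // Fail fast when the precondition is not met.
        if (!(x instanceof OptionHolder oh)) {
            throw new IllegalArgumentException("not an OptionHolder: " + x);
        }
        // oh is in scope here: this point is reachable only if the match succeeded.
        return oh.options().size();
    }

    public static void main(String[] args) {
        System.out.println(countOptions(new OptionHolder(List.of("-v", "-q"))));
    }
}
```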
>
> What about preconditions?  Suppose we only want to process the options
> when we're in debug mode.
>
>      if (debugMode && x instanceof OptionHolder oh) {
>          // process options
>      }
>
> That seems a pretty reasonable thing to want to do.  So not only do we
> want to refine the result of a match, we want to be able to precede it
> with a condition.
>
> What about ORing?  Same deal.  Sticking with the "options processing"
> example, imagine we have a command line option processor that returns an
> Optional describing the argument of a switch.  Imagine further we have
> two sources of options; a prefs file we've already parsed, and command
> line switches.
>
>      if (prefsFile.debug()
>          || (getOpt("--debug") instanceof Optional(String s)
>              && isBooleanTrue(s))) {
>          ...
>      }
>
> Could I write this without the pattern match?  Sure I could, but it
> would be messier and more error-prone.
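A rough equivalent that compiles on Java 16 or later, under stated assumptions: the command line is modelled as a hypothetical Map from switch to raw argument, a missing switch yields null (which never matches a type pattern), and isBooleanTrue is an invented helper with an assumed definition.

```java
import java.util.Map;

final class OrDemo {
    // Hypothetical helper; the accepted spellings are an assumption.
    static boolean isBooleanTrue(String s) {
        return s.equalsIgnoreCase("true") || s.equalsIgnoreCase("yes");
    }

    // prefsDebug stands in for prefsFile.debug(); cmdLine stands in for getOpt.
    static boolean debugEnabled(boolean prefsDebug, Map<String, Object> cmdLine) {
        return prefsDebug
            || (cmdLine.get("--debug") instanceof String s && isBooleanTrue(s));
    }

    public static void main(String[] args) {
        System.out.println(debugEnabled(false, Map.of("--debug", "yes")));
    }
}
```

The binding s is usable on the right of && because that operand is evaluated only when the match has succeeded.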
>
> (Digression: if we wrote an options-processing framework for this,
> rather than capture "is this string the boolean true" as a static method
> from String to boolean, we'd write it as a pattern, and nest the
> patterns, exposing the boolean result we want, and hiding the raw
> string, which is only useful for further parsing:
>
>      if (getOpt("--debug") instanceof Optional(BooleanString(boolean b))) { ... }
>
> BooleanString could fail to match if the string were not a valid boolean
> (i.e., not "true", "false", "t", "f", "yes", "no", etc) and otherwise
> convert the string to a boolean.
>
> End digression.)
>
> OK, so we've seen that even for if, being able to combine it with other
> conditionals with && and || is valuable, as is inversion.
>
> If something is useful as the operand of `if`, it's useful in a
> conditional expression:
>
>      int maxThreads = getOpt("--max-threads") instanceof Optional(String s)
>          ? Integer.parseInt(s)
>          : DEFAULT_MAX_THREADS;
>
> Why would we tell people they can't refactor their if statements to
> conditionals?
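The same refactoring can be sketched with a type pattern in a conditional expression (Java 16 or later). The Map-based option lookup and the default value of 4 are assumptions for illustration only.

```java
import java.util.Map;

final class ConditionalDemo {
    static final int DEFAULT_MAX_THREADS = 4; // assumed default, for illustration

    // opts maps a switch to its raw argument; a missing switch yields null.
    static int maxThreads(Map<String, Object> opts) {
        // s is in scope only in the arm taken when the match succeeds.
        return opts.get("--max-threads") instanceof String s
            ? Integer.parseInt(s)
            : DEFAULT_MAX_THREADS;
    }

    public static void main(String[] args) {
        System.out.println(maxThreads(Map.of("--max-threads", "8")));
    }
}
```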
>
> If something is useful as the operand of an if, it's probably useful on
> its own:
>
>      boolean debugValue = getOpt("--debug") instanceof Optional(String s)
>                               && isBooleanTrue(s);
>
> Sometimes we return booleans from methods.  Like, `Object::equals`.
> Currently, we write `equals()` in a convoluted way (here's some
> IDE-generated code):
>
>      public boolean equals(Object o) {
>          if (this == o) return true;
>          if (!(o instanceof Name)) return false;
>
>          Name name = (Name) o;
>
>          if (!first.equals(name.first)) return false;
>          return last.equals(name.last);
>      }
>
> Such control flow.  Much goto.
>
> Instead, I can write this as one expression with a pattern match:
>
>      public boolean equals(Object o) {
>          return (o instanceof Name name)
>              && first.equals(name.first)
>              && last.equals(name.last);
>      }
>
> This works because `instanceof` is a boolean expression.
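For completeness, a self-contained version of the single-expression equals that compiles on Java 16 or later. The fields, the constructor, and the hashCode are assumptions added so the class stands alone; only the equals body comes from the example above.

```java
import java.util.Objects;

final class Name {
    private final String first;
    private final String last;

    Name(String first, String last) {
        this.first = Objects.requireNonNull(first);
        this.last = Objects.requireNonNull(last);
    }

    @Override
    public boolean equals(Object o) {
        // A null or non-Name argument fails the match, so the chain yields false.
        return (o instanceof Name name)
            && first.equals(name.first)
            && last.equals(name.last);
    }

    @Override
    public int hashCode() {
        // Kept consistent with equals: same fields, same order.
        return Objects.hash(first, last);
    }
}
```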
>
> OK, what about other control constructs?  We use booleans in do, while,
> and for loops.  And yes, they're useful there too, though these are
> likely more advanced use cases.  (The more computation that can be
> expressed as a pattern match, such as matching a string against a
> regular expression (which is a perfect application for pattern
> matching), the more likely you are to want to do this in a loop header.)
>
>
> To answer your concrete questions:
>
>
> > I assume the following is either pointless or a compile error:
> >    boolean isString = obj instanceof String s;
>
> It's valid code, but it's silly, in exactly the same way that we can do
> silly stuff today, like declaring a variable and not using it. In fact,
> that's exactly what this is -- declaring a variable and not using it.
> Legal, but silly.
>
> > What about this - is s in scope in the braces?:
> >    while (obj instanceof String s) { ... }
>
> Yes.  Follow the control flow, and ask yourself: when I get inside the
> braces, did the instanceof expression always match?  If so, then yes.
> (Scoping for binding variables is just definite assignment, which is
> just "follow the control flow, was it assigned in all paths that could
> lead here?")
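One way to see the while-loop scoping in practice, as a sketch on Java 16 or later. Box is a hypothetical wrapper type invented for this example; the loop peels nested wrappers, and the binding is usable in the body because the body runs only after a successful match.

```java
// Hypothetical wrapper type used only for this illustration.
record Box(Object inner) {}

final class WhileDemo {
    // Removes nested Box layers until something that is not a Box remains.
    static Object unwrap(Object obj) {
        while (obj instanceof Box b) {
            obj = b.inner();   // b is in scope: the match succeeded on this iteration
        }
        // b is not in scope here: the loop exits only when the match fails.
        return obj;
    }

    public static void main(String[] args) {
        System.out.println(unwrap(new Box(new Box("payload"))));
    }
}
```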
>
> > Or this - pointless, or error?:
> >   list.stream().filter(obj -> obj instanceof String s);
>
> Same as the first one; you've declared a variable that is never used.
> It is in scope in invisibly small places where you could usefully put
> code if you wanted to, though:
>
>      list.stream().filter(o -> o instanceof String s && !s.isBlank())
>
> Again, follow the control flow -- when I get to the s.isBlank(), must
> the pattern have matched?  If so, then s is in scope.
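The filter above, placed in a small self-contained method (Java 16 or later). The class and method names are hypothetical.

```java
import java.util.List;

final class FilterDemo {
    // Counts the elements that are Strings containing at least one non-blank character.
    static long countNonBlankStrings(List<Object> items) {
        return items.stream()
            // s is in scope to the right of && because the match must have succeeded.
            .filter(o -> o instanceof String s && !s.isBlank())
            .count();
    }

    public static void main(String[] args) {
        System.out.println(countNonBlankStrings(List.of("a", "  ", 3, "b")));
    }
}
```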
>
>
> Finally, as an example of what people might do if we didn't have an
> instanceof expression -- they might instead synthesize one with a switch
> expression:
>
>      String castedToString = null;
>      boolean b = switch (o) {
>          case String s -> { castedToString = s; break true; }
>          default -> false;
>      };
>
> Yuck.  This code is worse than `o instanceof String s` in so many ways.
> Would you want to encourage this?
>
>