Patterns: declaration

Sun Jan 24 04:14:23 UTC 2021

> On Jan 22, 2021, at 11:08 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> Here's the first half of an *extremely rough* doc-in-progress that starts to show more of the syntax of the declaration of a pattern -- but does not yet dive into the _body_ of a pattern (which we should still leave for a later discussion).  This is still more about object model than directly about syntax; many people have been asking "I'm having a hard time imagining what this looks like in a class file", so this should help a bit.  It also fleshes out some of the requirements, specifically patterns delegating to other patterns.
> ...

Hi, Brian,

This exploration of the different ways of declaring patterns is very useful, as
is the observation that for every sort of constructor or method, there can be a
corresponding sort of pattern.

I believe, however, that it then falls prey to a fallacy: it does not follow,
just because every sort of constructor or method has a corresponding form of
pattern, that for every sort of constructor or method the corresponding form of
pattern should be used.  In particular, I believe that it
is not always best (indeed, it is _seldom_ best) for a static factory method to
be paired with a static pattern; it will usually be better in several ways to
provide an instance pattern instead.

There is an inherent asymmetry between a static factory method and a pattern:
the static method has no target, whereas a pattern (whether static or instance)
_always_ has a target.

I have long felt that static methods in Java are a kind of second-class kludge.
You use a static method for one of four reasons: (1) to paper over the problem
that a constructor _must_ allocate a new object rather than possibly returning
an already existing object; (2) to paper over the problem that all constructors
of a class "have the same name", so you can't have two constructors with the
same signature; (3) to paper over the problem that you can't declare the method
in the class where it logically belongs because the class does not exist, so you
need to stick it somewhere else; (4) to paper over the problem that you can't
declare the method in the class where it logically belongs because you don't
have control over its source code, so you need to stick it somewhere else.

Example of (1): We use the static method `Optional.<T>empty()` rather than
saying `new Optional<T>()` because the latter would require actually creating a
new object every time; the static method allows sharing of a single empty
object.
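
To make reason (1) concrete, here is a minimal sketch using a hypothetical class `Maybe<T>`. It is deliberately not `java.util.Optional`, whose documentation does not promise that `empty()` returns a singleton. The class name, field names, and method bodies are all assumptions made for illustration.

```java
import java.util.Objects;

// Hypothetical class for illustration; not java.util.Optional.
final class Maybe<T> {
    // One shared instance stands in for every empty Maybe<T>.
    private static final Maybe<?> EMPTY = new Maybe<>(null);

    private final T value;

    private Maybe(T value) { this.value = value; }

    // A constructor call would allocate every time; the factory can reuse EMPTY.
    @SuppressWarnings("unchecked")
    static <T> Maybe<T> empty() { return (Maybe<T>) EMPTY; }

    static <T> Maybe<T> of(T value) {
        return new Maybe<>(Objects.requireNonNull(value));
    }

    boolean isEmpty() { return value == null; }
}
```

In this sketch, every call to `Maybe.empty()` hands back the same object, whatever type argument the caller asks for.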

Example of (2): You might want to create a point object by giving it either
x and y double values, or r and theta double values, and we can't just say
`new Point(x, y)` or `new Point(r, theta)` because they have the same signature,
so instead we use a static factory method for one case or both cases,
which allows us to provide different names.
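
A minimal sketch of reason (2), assuming a hypothetical `Point` class; the factory names `ofCartesian` and `ofPolar` are my own choices for illustration and do not come from any library.

```java
// Hypothetical Point class; the factory names are assumptions.
final class Point {
    private final double x;
    private final double y;

    private Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Both factories take (double, double); two constructors could not coexist.
    static Point ofCartesian(double x, double y) {
        return new Point(x, y);
    }

    static Point ofPolar(double r, double theta) {
        return new Point(r * Math.cos(theta), r * Math.sin(theta));
    }

    double x() { return x; }
    double y() { return y; }
}
```

The distinct names carry the information that a shared constructor signature cannot.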

Example of (3): We write `Math.max(x, y)` because there are no classes for
primitive types.  I would much rather write `x.max(y)` (even better would be
`x max y`), but it just can't be done, so we use a static method instead.
It would be possible to write `max(x,y)` if `max` were defined as a static
method in the correct scope for my code, but I don't want to do that over and
over.  I have to say that for the last 25 years I have felt a minor bit of
resentment every time I have to write "Math."; it's just stupid noise that
arises from a deficiency of the language design.
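
For reference, the "correct scope" workaround in Java is a static import, which has to be repeated in every source file that wants the bare name. A small sketch follows; the class `MaxDemo` and the method `largest` are hypothetical names used only for illustration.

```java
import static java.lang.Math.max;

// Hypothetical demo class; only Math.max is a real library method here.
final class MaxDemo {
    private MaxDemo() {}

    static int largest(int x, int y, int z) {
        // The bare name resolves only because of the static import above.
        return max(x, max(y, z));
    }
}
```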

Example of (4): I am working with BigInteger values, and have an application
that computes greatest common divisors and least common multiples a lot.
BigInteger happens to have a gcd method, so hooray, I can write `a.gcd(b)` when
I need to.  But I cannot write `a.lcm(b)` because BigInteger doesn't provide it
and I don't have control over its source code.  So I settle for defining a
static method, and I have that same tiny feeling of resentment every time I have
to say `MyUtilities.lcm(a, b)` rather than `a.lcm(b)`.
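
A minimal sketch of what such a utility might look like. The class name `MyUtilities` is the one used above; the method body is an assumed implementation, including the choice to return zero when either argument is zero.

```java
import java.math.BigInteger;

// The method body below is an assumed implementation, not existing library code.
final class MyUtilities {
    private MyUtilities() {}

    // lcm(a, b) = |a * b| / gcd(a, b); taken to be 0 when either argument is 0.
    static BigInteger lcm(BigInteger a, BigInteger b) {
        if (a.signum() == 0 || b.signum() == 0) {
            return BigInteger.ZERO;
        }
        // Divide before multiplying to keep the intermediate value small.
        return a.divide(a.gcd(b)).multiply(b).abs();
    }
}
```

The call site is then `MyUtilities.lcm(a, b)`, with both operands passed as arguments, while the built-in operation keeps the target-first form `a.gcd(b)`.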

Static methods are clunky.  I believe that their counterpart, static patterns,
are equally clunky.  The problems we are having dealing with the implicit target
of a static pattern are a direct reflection of that clunkiness.

Okay, end of general philosophical complaining; what do I propose to do?

Let us consider three paradigmatic examples.  In the first example, I want to
switch on a value of type Object and match some special cases where it happens
to be an instance of Optional<T> or of OptEither<T,U>.  In the second example,
I want to switch on a value of type Optional<String> and match the two cases where
it is or is not empty.  The third example will be the standard regex example.

If we provide static patterns to correspond to the static factory methods, the code
looks like this (part of this is copied from your original email on this topic):

```
class Optional<T> {
    static<T> Optional<T> of(T t) { ... }
    static<T> pattern(T t) of(Optional<T> target) { ... }

    static<T> Optional<T> empty() { ... }
    static<T> pattern() empty(Optional<T> target) { ... }
}
class OptEither<T,U> {
    static<T,U> OptEither<T,U> left(T t) { ... }
    static<T,U> pattern(T t) left(OptEither<T,U> target) { ... }

    static<T,U> OptEither<T,U> right(U u) { ... }
    static<T,U> pattern(U u) right(OptEither<T,U> target) { ... }

    static<T,U> OptEither<T,U> empty() { ... }
    static<T,U> pattern() empty(OptEither<T,U> target) { ... }
}
```

For the first example, this produces code that looks pretty good, because it is
obviously necessary to indicate a type in addition to the method names:

```
switch (myObject) {
    case Optional.of(var t): ...
    case Optional.empty(): ...
    case OptEither.left(var t): ...
    case OptEither.right(var u): ...
    case OptEither.empty(): ...
}
```

But for the second example, having to indicate the type "Optional" is redundant:

```
switch (myOptionalString) {
    case Optional.of(var t): ...
    case Optional.empty(): ...
}
```

While I might wish to provide that extra clutter to assist the reader, in many
cases I would prefer to omit it:

```
switch (myOptionalString) {
    case of(var t): ...
    case empty(): ...
}
```

Maybe you think that "of" looks a little strange, but that is perhaps just the
result of Java having chosen to use the words "of" and "empty" rather than
"some" and "none" (commonly used in other functional languages).  Maybe it looks
nicer when using OptEither:

```
switch (myOptEitherStringThread) {
    case left(var str): ...
    case right(var thr): ...
    case empty(): ...
}
```

We can get this concision for the second example by declaring instance patterns
rather than static patterns to complement the static factory methods.
Unfortunately, this makes the first example no longer work.

I see three possible ways forward.  Idea (a) is to use pattern conjunction:

```
switch (myObject) {
    case Optional &&& of(var t): ...
    case Optional &&& empty(): ...
    case OptEither &&& left(var t): ...
    case OptEither &&& right(var u): ...
    case OptEither &&& empty(): ...
}
```

with the understanding that the compiler would have to be smart enough to do the
flow analysis and realize that matching a type such as `Optional` guarantees
that the value matched by `of(var t)` is indeed of type Optional, and this information
then allows the compiler to locate the declaration of the pattern.

Idea (b), which right now I slightly favor, is to define a new kind of pattern,
of the form `T P(...)`, which is effectively interpreted the same way as
`T _ &&& P(...)` but is easier for the compiler to recognize (and for humans to
read):

```
switch (myObject) {
    case Optional of(var t): ...
    case Optional empty(): ...
    case OptEither left(var t): ...
    case OptEither right(var u): ...
    case OptEither empty(): ...
}
```

Idea (c) is to regard the meaning of a pattern of the form `T.P(...)` as
depending on whether the pattern P declared in class T is a static pattern or an
instance pattern.  This allows us to use the originally proposed form with dots:

```
switch (myObject) {
    case Optional.of(var t): ...
    case Optional.empty(): ...
    case OptEither.left(var t): ...
    case OptEither.right(var u): ...
    case OptEither.empty(): ...
}
```

at the expense of overloading the notation `T.P`, which some programmers might
find disturbing.

Now let's consider the third example, regex.  If we want to use a pattern
(called, say, "regex") to match a string, ideally it would be an instance
pattern of the class String, so that I can say just

```
switch (myString) {
    case regex(...)(...): ...
    ...
}
```

But suppose I am the author of `regex` but I have no control over the source
code for String; I can't declare an instance pattern for String.  The best I can
do is to declare a static pattern.

The point I want to make here is that the example in your original email:

```
case regex(AS_AND_BS)(String aString, String bString):
```

depends on declaring that static pattern someplace where its bare name is in
scope.  If, as is more likely, the static pattern is declared within some
utility class, then uses of it necessarily get more verbose:

```
case StringUtility.regex(AS_AND_BS)(String aString, String bString):
```
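
The same verbosity already shows up with static helper methods in today's Java, where the matchee must be passed explicitly. Here is a rough sketch under stated assumptions: the class name `StringUtility` is borrowed from the line above, while the method shape (a `Pattern` plus an explicit target, returning the capture groups) is purely hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical utility class; the method shape is an assumption for illustration.
final class StringUtility {
    private StringUtility() {}

    // Returns the capture groups if the whole target matches, otherwise empty.
    static Optional<List<String>> regex(Pattern pattern, String target) {
        Matcher m = pattern.matcher(target);
        if (!m.matches()) {
            return Optional.empty();
        }
        List<String> groups = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return Optional.of(groups);
    }
}
```

A caller has to write `StringUtility.regex(AS_AND_BS, myString)`, naming both the utility class and the target, unless a static import is added to each file.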

Bottom line: Static patterns, like static methods, are clunky to use and
therefore instance patterns should be used wherever possible, even as
complements to static factory methods.  Instance patterns also have the
advantage that you can (very obviously) use `this` to refer to the matchee.
If we follow this design philosophy, then maybe static patterns will be used
much less often than we might have thought, in which case maybe it matters less
exactly how we solve the problem of how to refer to the matchee of a static
pattern.

—Guy
