Feedback: Using pattern context to make type patterns more consistent and manage nulls
Brian Goetz
brian.goetz at oracle.com
Mon Jan 25 12:46:36 UTC 2021
> I'm not concerned
> with null, but with basic code readability and consistency.
The great thing about being motivated by consistency is that you get to
pick what you find it most important to be consistent with :) In a
complex world, there are always going to be preexisting
"inconsistencies", which means you always have a choice of what to
choose to be consistent with. For example, the null-handling treatment
of instanceof and cast are different, which one could call
"inconsistent", except that ... it is right. Consistency is a good
guiding principle, and gratuitous inconsistency is surely bad, but it
is not necessarily the highest good -- nor necessarily even a
well-defined concept.
In any case, your concern has to be at least a little bit about null,
because ... that's the only thing that is varying here; no one disagrees
that `String s` should match all non-null instances of String. You are
suggesting type patterns never match null, so they can be "consistent"
with how the `instanceof` bytecode works. There's nothing wrong with
that particular preference of "things to be consistent with", but it's
not the only choice, and it has costs.
For the record, here are the consistencies we've chosen:
- var should consistently only mean "type inference", so users can
orthogonally choose whether to use manifest or inferred types, according
to what they find more readable, and freely switch back and forth; and
- a nested pattern `x matches P(Q)` should mean nothing more than
`P(var alpha) && alpha matches Q`.
When you make these two choices, the semantics we have pretty much write
themselves.
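To make the nesting principle concrete, here is a minimal sketch (this
assumes the draft semantics under discussion; `Box` is a hypothetical
one-component record):

    record Box(Object contents) { }

    static String describe(Box box) {
        // `box matches Box(String s)` means nothing more than
        // `box matches Box(var alpha) && alpha matches String s`,
        // so a Box holding null matches the var pattern but not the
        // conditional String pattern.
        return switch (box) {
            case Box(String s) -> "a string: " + s;  // fails for new Box(null)
            case Box(var v)    -> "something else";  // matches new Box(null) too
        };
    }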
Of course, just because the semantics derive from consistent and
principled choices, doesn't mean that it results in a good programming
model; we have to validate that the consequences of these choices lead
to the programs we actually want. I think your main concern is that this
might not be the case, but we should let actual code
be our guide here.
It's important to realize that the typical "type switch" examples we've
long wanted in Java don't actually give us that much intuition for what
typically happens with *nested* pattern switches. When you write enough
code using such things, you see tree-shaped patterns of code emerge
where, at each level, cases organize themselves into lattices, with
specialized cases funneling into catch-all cases:
case Box(Prime p):
case Box(Even e):
case Box(Object o): // catch all
...
and the typical pattern of how this works, while "inconsistent" with
respect to who gets the nulls, turns out to be exactly what you want a
great fraction of the time. (If null is not in the domain of what is in
the boxes, it doesn't matter; if it is, the catch-all Box code, which
the last case represents, will be prepared to deal with it.) In the
cases where it is not what you want, you can exclude it (with guards,
with null checks, whatever), as sketched below.
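For example, sticking with the hypothetical Box above, a catch-all that
does not want the null can screen it out with an ordinary null check
(guard syntax was still being designed at this point, so this sketch
avoids it; the handler methods are made up):

    case Box(Object o):
        if (o == null)
            handleEmptyBox();    // hypothetical handler for the null case
        else
            handleContents(o);   // hypothetical handler for everything else
        break;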
Another thing to realize is that the examples with `Box(Chocolate)` are
just simple examples. In the real world, we'll have deconstructors with
handfuls of bindings (as constructors do today), and a deconstruction
pattern that reads `Foo(var a, var b, var c, var d, var e)` may not be
quite as appealing from a readability perspective as when there's only
one parameter and it's obvious what it is. Forcing users to choose
between semantics and readability (whichever way they happen to want to
go with var-vs-manifest) is not a good look.
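Compare, say, these two spellings of a single made-up five-component
pattern (Foo and its component types are invented for illustration):

    // All-var: uniform semantics, but tells the reader nothing about Foo
    case Foo(var a, var b, var c, var d, var e): ...

    // Manifest types: the shape is documented right at the use site
    case Foo(LocalDate start, LocalDate end, String name, int count, boolean active): ...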
> That is a huge inconsistency! A developer has nothing in the code to
> separate the static one from the dynamic one:
>
> switch (box) {
> case Box(Integer i) ...
> case Box(Number n) ...
> }
>
> Reading this code I have no way of knowing what it does. None.
>
> If box is Box<Number> it does one thing. If box is Box<Object> it does
> something else. Sure the impact is only on null, but that is a
> secondary detail and not what is driving my concern. The key point is
> that someone reading the code can't tell what branch the code will
> take, and can get a different outcome for two identical patterns in
> different parts of the codebase.
I get that this seems different and scary when it is all theoretical and
we're extrapolating from almost no examples. Write some code with it,
though, and I think you'll find it is surprisingly natural, and the
things you are worried about don't happen remotely as often as you are
scared they will. And, the alternatives are far worse. We could set
`var` on fire (really not such a good deal); or we could invent multiple
kinds of type patterns, one nullable and one not (a lot of added
complexity, just so users can spend more energy on low-level corner
cases than they ever really want to); or we could tinker with the
semantics so that the razor blades are hidden in even less expected
places; or we could add `T!` type patterns without having `T!` in the
general type system (imagine the rants about inconsistency then).
Having a simple set of rules derived from a small number of clear
principles goes a long way towards helping people reason about the
complexity that we can't make go away.
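Concretely, here is how those rules decide the example you quoted (a
sketch; `Box` is now a hypothetical generic record, and the comments
spell out the draft semantics):

    record Box<T>(T contents) { }

    Box<Number> bn = ...;
    switch (bn) {
        case Box(Integer i): ... // conditional on Integer; never sees null
        case Box(Number n):  ... // Number is total on the component type
                                 // Number, so new Box<Number>(null) lands here
    }

    Box<Object> bo = ...;
    switch (bo) {
        case Box(Integer i): ... // conditional; never sees null
        case Box(Number n):  ... // now also conditional (Number is not total
                                 // on Object), so new Box<Object>(null)
                                 // matches neither case
    }

As the quote itself notes, the only divergence between the two switches
is what happens to a null component; non-null values take the same
branch in both.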
> If you can find an alternative to using `var` in the way I propose
> that is fine by me. As I pointed out in my last email, the situations
> where there is a conflict to resolve are relatively rare, because best
> practice is to use `var` for the final case anyway.
I agree that will be common in the obvious cases. (In which case, the
things you are worried about will happen even less often.) But as
mentioned above, I'm skeptical that this will actually be a "best
practice" when there are many bindings and its not completely obvious
what the types are. This is something users should get to choose for
themselves.
> As it stands, the proposal will never be acceptable to me because it
> fails the code readability test - premature optimization by using the
> static context means the code doesn't do what it says it does.
These rules, for all the parts you don't like, are simple; I have great
faith that you will learn how things work and how to read the patterns
of code that typically emerge. (You might even like it, after writing
some actual code with it.) Regardless, I don't believe that you, or any
other Java developer, are incapable of understanding what the code says.
We see examples of this all over the place in Java, such as:
- Method overloading. How do I know which overload of x.foo() I am
calling, or which overload X::foo refers to? A: when it's not obvious,
ask the IDE, or look it up in the Javadoc.
- Type inference. How do I know what types are being inferred for
generic method calls, or what gets inferred for `var`? A: when it's not
obvious, ask the IDE (or, for masochists, work it out yourself), and
then, put explicit witnesses in the code so other readers can see.