Unnamed variables and match-all patterns
Brian Goetz
brian.goetz at oracle.com
Wed Sep 7 17:41:33 UTC 2022
We've gone around and around a few times on "unnamed variables"
(underscore), starting with JEP 302 (Lambda Leftovers). We reclaimed
the underscore token in Java 9 with the intention of using it for
unnamed variables and "any" patterns. Along the way, we ran into some
hiccups, and it has sat on the shelf for a while. Let's take it down,
dust it off, and see if we have any more clarity than before.
There are three syntactic productions in which we might want to use
underscore as a "don't care" indicator:
- Unnamed variables. Here, underscore stands in for a variable name.
We can declare a local variable, catch formal, pattern variable, etc,
whose name is `_`, which has the effect of entering no new names in
scope. It becomes an "initialize-only" variable.
    try { ... }
    catch (FooException _) { throw new BarException("foo"); }
- Partial inference. Here, underscore stands in for a type name.
Today, we can infer type variables for generic method invocations and
constructor invocations, but it is all-or-nothing. Being able to denote
"infer this type" would allow us to do partial inference:
    foo.<String, _>m(...)
- "Any" patterns. Here, underscore is a pattern, which matches
everything, and binds nothing.
    case Foo(var s, _): ...
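To make the first and third forms concrete, here is a minimal sketch, assuming a compiler that accepts `_` in these positions. The names `UnderscoreSketch`, `Pair`, `parseOrDefault`, and `firstOf` are invented for illustration. Partial inference is left out, since there is no accepted syntax for it.

```java
public class UnderscoreSketch {
    // Hypothetical record, used only for this illustration.
    record Pair(Object first, Object second) {}

    // Unnamed catch formal: the exception is handled but never inspected.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException _) {
            return fallback;
        }
    }

    // "Any" pattern: the second component is matched but nothing is bound.
    static String firstOf(Object o) {
        return switch (o) {
            case Pair(String s, _) -> s;
            default -> "none";
        };
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("12", 0));
        System.out.println(parseOrDefault("twelve", -1));
        System.out.println(firstOf(new Pair("left", 42)));
    }
}
```

In both methods the underscore adds no name to the scope, so there is nothing for a reader (or a static analysis tool) to go looking for a use of.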
We don't have to do all of these; right now we're not considering
partial inference, but the other two are reasonable options. Unnamed
variables have been a long-standing request; any patterns will likely be
a common request soon as well.
For a match-all pattern, there is little to say other than "_" is one of
the alternatives of the Pattern production, it is applicable to all
types, it is unconditional on all types, and it has no bindings. The
specification already has a concept of "any" patterns; this is just
making it denotable.
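One way to see what a denotable "any" pattern would add: the closest spelling available without it is a `var` pattern with a throwaway name, which matches everything but still introduces a binding. A small sketch, assuming a compiler with record patterns; `AnyPatternToday`, `Point`, and `xOf` are names invented for illustration.

```java
public class AnyPatternToday {
    // Hypothetical record, used only for this illustration.
    record Point(int x, Object tag) {}

    static int xOf(Object o) {
        // `var ignored` matches every value of the component, including null,
        // but still puts the name `ignored` in scope. A `_` pattern would
        // match the same values and bind nothing.
        if (o instanceof Point(int x, var ignored)) {
            return x;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(xOf(new Point(3, null)));
        System.out.println(xOf("not a point"));
    }
}
```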
I think there is little controversy about using unnamed local variables
(local variable declaration statements, catch formals, foreach induction
variables, resources in try-with-resources) and unnamed lambda
parameters. What is common to all of these is that these are _pure
implementation details_, where the author has elected to not give a name
to a variable that is entirely implementation-facing. This seems
eminently reasonable. Unnamed parameters can help eliminate errors by
capturing design assumptions and make life easier for static analysis
tools that like to point out unused variables.
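A sketch of three of these implementation-only positions, again assuming a compiler that accepts `_` there. The class `UnnamedLocals`, the resource type `Timer`, and the field `SECOND` are hypothetical names made up for the example.

```java
import java.util.function.BiFunction;

public class UnnamedLocals {
    // Hypothetical resource whose only job is to run code on close.
    static final class Timer implements AutoCloseable {
        static int closed = 0;
        @Override public void close() { closed++; }
    }

    // Foreach induction variable: count elements without looking at them.
    static int count(Iterable<?> items) {
        int n = 0;
        for (var _ : items) {
            n++;
        }
        return n;
    }

    // Try-with-resources: the resource is held for the block, never referenced.
    static int timed(int value) {
        try (var _ = new Timer()) {
            return value * 2;
        }
    }

    // Lambda parameter: the function ignores its first argument.
    static final BiFunction<String, Integer, Integer> SECOND = (_, n) -> n;
}
```

In each case the choice not to name the variable is invisible to callers, which is what makes these positions uncontroversial.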
Where we stumble is on method parameters, because method parameter names
serve two masters -- the implementation (as the declaration of a
variable) and the API (as part of the specification of what the method
does). Among other things, we like to document the semantics of method
parameters in Javadoc with the `@param` tag, but doing so requires a
name (or inventing a new Javadoc mechanism like `@param #4`, likely a
loser). Secondarily, sometimes parameter names are retained in the
MethodParameters attribute, though that attribute (JVMS 4.7.24) already
supports parameters without names by using a zero CP index.
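The two roles can be seen side by side with core reflection. The sketch below uses the existing `java.lang.reflect.Parameter` API; the class `ParameterNames` and the method `copy` are invented for illustration. Whether a real name is reported depends on how the class was compiled, so no particular output is assumed.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

public class ParameterNames {
    /**
     * API role: the names are how the documentation refers to the parameters.
     *
     * @param source where to copy from
     * @param limit  maximum number of characters to copy
     */
    static void copy(String source, int limit) {}

    static String describe() {
        try {
            Method m = ParameterNames.class.getDeclaredMethod("copy", String.class, int.class);
            StringBuilder sb = new StringBuilder();
            for (Parameter p : m.getParameters()) {
                // isNamePresent() reports whether the class file records a name
                // for this parameter (for example, when compiled with -parameters).
                // When it does not, getName() returns a synthesized name.
                sb.append(p.getName())
                  .append(" present=")
                  .append(p.isNamePresent())
                  .append('\n');
            }
            return sb.toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Reflection already copes with a parameter that has no recorded name; it is the `@param` side that has no obvious answer.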
With `var`, we drew a clear line of "implementation only" -- you can't
infer a method return type, even for a private method, you can only use
it for local variables and lambda formals. This has been pretty
successful.
We've explored a number of intermediate points on the spectrum with
varying degrees of stability:
A) Implementation only -- local variables, catch formals, for-loop
induction variables, TWR resources, pattern variables, lambda formals
B) "A++", where we add in method parameters of anonymous classes
C) Adding in method parameters _for non-initial declarations_ -- allow
unnamed parameters only for methods that override a method from a
supertype, ensuring that there is a real specification of what the
parameters mean.
D) Anything goes, any method parameter can be unnamed, throwing
specification to the wind.
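To keep the four points apart, here is the same ignored parameter in each position. This is ordinary Java with named parameters, so it compiles without any new feature; the comments mark where each option would additionally permit `_`. The names `OptionSpectrum`, `Handler`, `CodeOnly`, and `scale` are all invented for this sketch.

```java
public class OptionSpectrum {
    /** Initial declaration: this is where the parameters are specified. */
    interface Handler {
        /**
         * @param source origin of the event
         * @param code   numeric event code
         * @return a result derived from the event
         */
        int handle(String source, int code);
    }

    // Option A: lambda formals are implementation-only, so `source`
    // could be written as `_` here.
    static final Handler LAMBDA = (source, code) -> code;

    // Option B: additionally allows it in this anonymous class method.
    static final Handler ANONYMOUS = new Handler() {
        @Override public int handle(String source, int code) { return code; }
    };

    // Option C: additionally allows it in an override in a named class;
    // the specification of `source` still lives on Handler.handle.
    static final class CodeOnly implements Handler {
        @Override public int handle(String source, int code) { return code; }
    }

    // Option D: additionally allows it in an initial declaration, such as
    // this static method whose `legacy` parameter is unused but retained.
    static int scale(int value, int legacy) { return value * 2; }
}
```

The first three all implement a method that is specified elsewhere; only the last leaves a caller with no description of the parameter at all.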
A is a stable point, and has the advantage of mostly lining up with
where we can use `var`. But users will surely grumble that they can't
use it for implementations of methods from supertypes. As this feature
request predates lambdas and patterns, giving it to lambdas and patterns
but not ordinary methods might feel a bit mean.
The motivation for B is obvious -- to support smooth refactoring between
lambdas and inner classes -- but is not a very stable point, as one will
immediately ask "what about refactoring to named classes".
C feels attractive, though there would surely be complaints too; it
excludes constructors and static methods (which might sometimes want
unnamed parameters when a parameter is no longer used, but stays around
for binary compatibility), and even some initial declarations. But,
these cases are likely to be somewhat more rare, so I don't object to
leaving these aside. The main concern is that this might feel
arbitrary. There is also the possibility for some confusion; it is not
obvious what it means when you override a method that already has an
unnamed parameter. Can you give it a name and use it? It is a little
weird that the lack of name applies only to the implementation of the
method, but somehow bleeds into the specification. There is also some
impact on Javadoc, as well as lingering concerns that there are more
shoes to drop beyond Javadoc and MethodParameters.
D is also stable, but feels like it makes the language less safe, by
making some methods unspecifiable. On the other hand, the people who
might use it for initial declarations, static methods, etc, are also the
sort of people who probably don't write specification anyway (otherwise
they would realize that they are depriving their callers of useful
information.)
In (C), Javadoc could insert an `@implNote` that says something like
"this implementation ignores the value of parameters <x> and <y> from
declaring method Foo::bar". In (D), it could say "ignores its 3rd and
4th parameters", or insert synthetic `@param` tags for parameters whose
name is something like "<unnamed>".
Past discussions seemed to gravitate toward either A or D, which are
also the simplest / most stable points. I guess it becomes a question
of getting over the "makes the language less safe" concerns.
Regardless, I'd like to see if we can quantify the "lingering concerns
about other shoes to drop."