Unnamed variables and match-all patterns

Wed Sep 7 21:43:42 UTC 2022

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 7, 2022 7:41:33 PM
> Subject: Unnamed variables and match-all patterns

> We've gone around and around a few times on "unnamed variables" (underscore),
> starting with JEP 302 (Lambda Leftovers). We reclaimed the underscore token in
> Java 9 with the intention of using it for unnamed variables and "any" patterns.
> Along the way, we ran into some hiccups, and it has sat on the shelf for a
> while. Let's take it down, dust it off, and see if we have any more clarity
> than before.

> There are three syntactic productions in which we might want to use underscore
> as a "don't care" indicator:

> - Unnamed variables. Here, underscore stands in for a variable name. We can
> declare a local variable, catch formal, pattern variable, etc, whose name is
> `_`, which has the effect of entering no new names in scope. It becomes an
> "initialize-only" variable.

> try { ... }
> catch (FooException _) { throw new BarException("foo"); }

> - Partial inference. Here, underscore stands in for a type name. Today, we can
> infer type variables for generic method invocations and constructor
> invocations, but it is all-or-nothing. Being able to denote "infer this type"
> would allow us to do partial inference:

> foo.<String, _>m(...)

> - "Any" patterns. Here, underscore is a pattern, which matches everything, and
> binds nothing.

> case Foo(var s, _): ...
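
A minimal sketch of the first production (unnamed variables), assuming a compiler that accepts `_` as a variable name; in released Java that means Java 22 or later, where this was finalized as JEP 456. The class and method names are made up for illustration.

```java
// Hypothetical names; requires a JDK that accepts `_` as a variable name.
class UnnamedVariableDemo {
    // The caught exception is never inspected, so it gets no name.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException _) {
            return fallback;
        }
    }

    // The loop only counts iterations; the element itself is irrelevant.
    static int count(Iterable<?> items) {
        int total = 0;
        for (Object _ : items) {
            total++;
        }
        return total;
    }
}
```

In both methods the variable is initialized but can never be read, which is the "initialize-only" behaviour described above.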

> We don't have to do all of these; right now we're not considering partial
> inference, but the other two are reasonable options. Unnamed variables have
> been a long-standing request; any patterns will likely be a common request soon
> as well.

> For a match-all pattern, there is little to say other than "_" is one of the
> alternatives of the Pattern production, it is applicable to all types, it is
> unconditional on all types, and it has no bindings. The specification already
> has a concept of "any" patterns; this is just making it denotable.
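
A small sketch of a match-all pattern nested inside record patterns, again assuming Java 22 or later. The types `Point`, `Shape`, `Circle` and `Square` are invented for the example. Note that the version that eventually shipped accepts `_` only in nested position, not as a top-level `case _`.

```java
// Hypothetical types; requires a JDK that accepts `_` as a pattern.
class AnyPatternDemo {
    record Point(int x, int y) {}

    sealed interface Shape permits Circle, Square {}
    record Circle(Point center, double radius) implements Shape {}
    record Square(Point corner, double side) implements Shape {}

    // The first component is matched by `_`: it applies to any type,
    // always matches (even null), and introduces no binding.
    static double size(Shape shape) {
        return switch (shape) {
            case Circle(_, double radius) -> radius;
            case Square(_, double side) -> side;
        };
    }
}
```

Because `_` is unconditional, the two cases remain exhaustive over the sealed `Shape` hierarchy and no `default` branch is needed.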

> I think there is little controversy about using unnamed local variables (local
> variable declaration statements, catch formals, foreach induction variables,
> resources in try-with-resources) and unnamed lambda parameters. What is common
> to all of these is that these are _pure implementation details_, where the
> author has elected to not give a name to a variable that is entirely
> implementation-facing. This seems eminently reasonable. Unnamed parameters can
> help eliminate errors by capturing design assumptions and make life easier for
> static analysis tools that like to point out unused variables.

> Where we stumble is on method parameters, because method parameter names serve
> two masters -- the implementation (as the declaration of a variable) and the
> API (as part of the specification of what the method does.) Among other things,
> we like to document the semantics of method parameters in Javadoc with the
> `@param` tag, but doing so requires a name (or inventing a new Javadoc
> mechanism like `@param #4`, likely a loser.) Secondarily, sometimes parameter
> names are retained in the MethodParameters attribute, though that attribute
> (JVMS 4.7.24) already supports parameters without names by using a zero CP
> index.
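
To make the MethodParameters point concrete, here is a sketch using the standard `java.lang.reflect.Parameter` API, which already has to cope with parameters whose names are not recorded. The class `ParameterNameDemo` and its `scale` method are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

// Hypothetical class; shows how parameter names surface through reflection.
class ParameterNameDemo {
    static int scale(int value, int factor) {
        return value * factor;
    }

    // Returns one line per parameter of scale(int, int).
    static String describe() {
        try {
            Method method =
                ParameterNameDemo.class.getDeclaredMethod("scale", int.class, int.class);
            StringBuilder out = new StringBuilder();
            for (Parameter p : method.getParameters()) {
                // isNamePresent() is true only when the class file records the
                // name (for example when compiled with javac -parameters);
                // otherwise getName() falls back to a synthesized name like arg0.
                out.append(p.getName())
                   .append(p.isNamePresent() ? " (recorded)" : " (synthesized)")
                   .append('\n');
            }
            return out.toString();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Which label appears depends on how the class was compiled, so reflective callers cannot rely on a real name being present even today.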

> With `var`, we drew a clear line of "implementation only" -- you can't infer a
> method return type, even for a private method, you can only use it for local
> variables and lambda formals. This has been pretty successful.

> We've explored a number of intermediate points on the spectrum with varying
> degrees of stability:

> A) Implementation only -- local variables, catch formals, for-loop induction
> variables, TWR resources, pattern variables, lambda formals
> B) "A++", where we add in method parameters of anonymous classes
> C) Adding in method parameters _for non-initial declarations_ -- allow unnamed
> parameters only for methods that override a method from a supertype, ensuring
> that there is a real specification of what the parameters mean.
> D) Anything goes, any method parameter can be unnamed, throwing specification to
> the wind.

> A is a stable point, and has the advantage of mostly lining up with where we can
> use `var`. But users will surely grumble that they can't use it for
> implementations of methods from supertypes. As this feature request predates
> lambdas and patterns, giving it to lambdas and patterns but not ordinary
> methods might feel a bit mean.

> The motivation for B is obvious -- to support smooth refactoring between lambdas
> and inner classes -- but is not a very stable point, as one will immediately
> ask "what about refactoring to named classes".
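
A sketch of the refactoring gap behind option B, assuming Java 22 or later for the lambda form. The names are made up. Under option A the lambda can leave its first parameter unnamed, while the equivalent anonymous class still has to invent a name for it.

```java
import java.util.function.BiFunction;

// Hypothetical example; the lambda form requires a JDK that accepts `_`.
class LambdaVersusAnonymousDemo {
    // Lambda formal: the unused key can be left unnamed.
    static final BiFunction<String, Integer, Integer> AS_LAMBDA =
        (_, count) -> count + 1;

    // Same behaviour as an anonymous class: the method parameter needs a name.
    static final BiFunction<String, Integer, Integer> AS_ANONYMOUS =
        new BiFunction<>() {
            @Override
            public Integer apply(String ignoredKey, Integer count) {
                return count + 1;
            }
        };
}
```

Converting between the two forms therefore means adding or removing a placeholder name such as `ignoredKey`.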

> C feels attractive, though there would surely be complaints too; it excludes
> constructors and static methods (which might sometimes want unnamed parameters
> when a parameter is no longer used, but stays around for binary compatibility),
> and even some initial declarations. But, these cases are likely to be somewhat
> more rare, so I don't object to leaving these aside. The main concern is that
> this might feel arbitrary. There is also the possibility for some confusion; it
> is not obvious what it means when you override a method that already has an
> unnamed parameter. Can you give it a name and use it? It is a little weird that
> the lack of name applies only to the implementation of the method, but somehow
> bleeds into the specification. There is also some impact on Javadoc, as well as
> lingering concerns that there are other shoes to drop other than Javadoc and
> MethodParameters.

> D is also stable, but feels like it makes the language less safe, by making some
> methods unspecifiable. On the other hand, the people who might use it for
> initial declarations, static methods, etc, are also the sort of people who
> probably don't write specification anyway (otherwise they would realize that
> they are depriving their callers of useful information.)

> In (C), Javadoc could insert an `@implNote` that says something like "this
> implementation ignores the value of parameters <x> and <y> from declaring
> method Foo::bar". In (D), it could say "ignores its 3rd and 4th parameter", or
> insert synthetic `@param` tags for parameters whose name is something like
> "<unnamed>".

> Past discussions seemed to gravitate toward either A or D, which are also the
> simplest / most stable points. I guess it becomes a question of getting over
> the "makes the language less safe" concerns.

> Regardless, I'd like to see if we can quantify the "lingering concerns about
> other shoes to drop."

There is also a C-bis, where '_' is allowed for private methods, but that's not important.

As a teacher, I vote for A: APIs should be documented, and giving a good name to a parameter is usually the first step.

Rémi 