Yield as contextual keyword (was: Call for bikeshed -- break replacement in expression switch)

Thu May 23 21:29:58 UTC 2019

> On May 22, 2019, at 9:45 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> The “compromise” strategy is like the smart strategy, except that it trades fixed lookahead for missing a few more method invocation cases.  Here, we look at the tokens that follow the identifier yield, and use those to determine whether to classify yield as a keyword or identifier.  (We’d choose identifier if the next token is an assignment op (=, +=, etc.), left-bracket, dot, or one of a few others, or if a few specific two-token sequences follow (e.g., ++ and then semicolon), which requires lookahead(2).)

> The compromise strategy misses some cases we could parse unambiguously, but also offers a simpler user model: always qualify invocations of methods called yield when used as expression statements.  And it offers the better lookup behavior, which will make life easier for IDEs.  
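For concreteness, here is a rough sketch of the classification rule described in the quoted text. It is only an illustration, not the actual parser logic: the class and method names are made up, the token set is partial (the quoted text says "a few others" without listing them), and treating "--" like "++" is my assumption.

```java
import java.util.List;
import java.util.Set;

public class YieldLookahead {
    enum Kind { KEYWORD, IDENTIFIER }

    // Tokens that, when they directly follow 'yield', mark it as an identifier.
    // Partial list: only the tokens named in the quoted text, plus a few more
    // assignment operators. The real set would be larger.
    private static final Set<String> IDENTIFIER_FOLLOWERS =
            Set.of("=", "+=", "-=", "*=", "/=", "[", ".");

    // 'following' holds the tokens after 'yield' at the start of a statement.
    // At most two of them are inspected, i.e. lookahead(2).
    static Kind classify(List<String> following) {
        if (following.isEmpty()) {
            return Kind.KEYWORD;
        }
        String first = following.get(0);
        if (IDENTIFIER_FOLLOWERS.contains(first)) {
            return Kind.IDENTIFIER;
        }
        // Two-token case: "yield++;" is a use of a variable named yield.
        // Including "--" here is an assumption made by analogy with "++".
        boolean incOrDec = first.equals("++") || first.equals("--");
        if (incOrDec && following.size() > 1 && following.get(1).equals(";")) {
            return Kind.IDENTIFIER;
        }
        // Everything else, including "(", is treated as a yield statement.
        return Kind.KEYWORD;
    }
}
```

Note that "yield(x);" falls through to the keyword case, which is exactly why unqualified invocations of a method named yield are the cases this strategy gives up.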

There's still some space for different design choices within the compromise strategy: what happens to names in contexts *other than* the start of a statement?

I think it's really helpful to split the question into three parts: variable names, type names, and method names.

1) Variable names: we've established that, with a fixed lookahead, every legal use of the variable name 'yield' can be properly interpreted. Great.

2) Type names: 'yield' might be used as the name of a class, type of a method parameter, type of a field, array component type, type of a 'final' local variable etc. Or we can prohibit it entirely as a type name.

We went through this when designing 'var', and settled on the more restrictive position: you can't declare classes/interfaces/type vars or make reference to types with name 'var', regardless of context. That way, there's no risk of confusion between subtly different programs—wherever you see 'var' used as a type, you know it can only mean the keyword.

I think it's best to treat 'yield' like 'var' in this case.

3) Method names: 'yield(' at the start of a statement means YieldStatement, but what about other contexts in which method invocations can appear?

Example:
var v = switch (x) {
    case 1 -> yield(x); // method call?
    default -> { yield(x); } // not a method call: a yield statement that produces x (oops!)
};

Fortunately, the different normal-completion behavior of a method call and a yield statement will probably catch most errors of this form—when I type the braces above, I'll probably also try adding a statement after the attempted 'yield' call, and the compiler will complain that the statement is unreachable. But it's all very subtle (not to mention painful for IDEs).

Taking inspiration from the treatment of type names, my preference here is to make a blanket restriction that's easy to visualize: an *unqualified* method invocation must not use the name 'yield'. Context is irrelevant. The workaround is always to add a qualifier.
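To illustrate the workaround, here is a small sketch of what the qualified form would look like. It assumes a compiler that supports switch expressions with the yield statement; the class name YieldQualified and the methods in it are made up for the example.

```java
public class YieldQualified {
    // A method that happens to be named 'yield' (hypothetical).
    static int yield(int n) {
        return n * 2;
    }

    static int compute(int x) {
        return switch (x) {
            // Qualified, so this can only be a method invocation.
            case 1 -> YieldQualified.yield(x);
            default -> {
                // Qualified method invocation again...
                int doubled = YieldQualified.yield(x);
                // ...and this is the yield statement, producing the switch result.
                yield doubled + 1;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(compute(1));
        System.out.println(compute(5));
    }
}
```

With the qualifier present, the reader never has to work out from context which meaning of 'yield' applies. For an instance method the qualifier would be 'this.' rather than the class name.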

(If, in the future, we introduce local methods or something similar that can't be qualified, we should not allow such methods to be named 'yield'.)

---

Are people generally good with my preferred restrictions, or do you think it's better to be more permissive?