Simplifying switch labels
Dan Smith
daniel.smith at oracle.com
Wed Jun 1 21:51:18 UTC 2022
I recently reviewed the spec changes for patterns in switch, and found the treatment of switch labels pretty hard to work out. Since this is really about the design more than the specification, I thought I'd share some thoughts here. Consider this early feedback on the Java 19 preview.
The status quo: syntactically, a switch label is a set of colon- and comma-separated elements, where an *element* is one of { constant, enum, pattern, null, default }. (I'm oversimplifying delimiters a little: 'case' or a plain 'default' must follow ':', but not ','.) Layered on top of that syntax are a number of restrictions: colon delimiters aren't allowed when using switch *rules*; the entire set of elements must not have repeat elements; constants and enums are only allowed for certain switch types; certain combinations of elements are disallowed, while others are okay. Then there are inter-label restrictions, mainly expressed by the dominance relation.
I find all these non-syntactic rules to be pretty hard to keep in my head, especially when it comes to which combinations of elements are allowed.
A couple of simplifying moves I'd suggest:
(1) Don't try to merge sets of switch block labels (e.g., 'case foo: case bar:').
These consecutive labels have the *effect* of handling multiple cases with a single block of code, but I think we can formally treat them as two unique blocks, the first of which falls through to the second. And I think that framing is more in line with how programmers would typically read the syntax.
In this framing, the restrictions about sets of elements in a single label don't apply, because we're talking about two different labels. But we have rules to prevent various abuses. Examples:
case 23: case Pattern: // illegal before and now, due to fallthrough Pattern rule
case Pattern: case Pattern: // ditto
case null: case Pattern: // allowed before, illegal now: use a comma for null
case Pattern: default: // illegal before, legal now: you fell through to the default case
case Pattern: noop(); default: // legal before and now
case Pattern: case null: // binds the pattern before, pattern is out of scope now
case Pattern: noop(); case null: // legal with pattern out of scope, before and now
case Pattern: case 23: // illegal before, legal now with fallthrough
case 23: case 23: // illegal before and now, due to dominance
case default: default: // illegal before and now, due to "only one default" rule
Another way to argue this is that I think 'case Pattern: case somethingelse' has a lot more in common, syntactically and conceptually, with 'case Pattern: noop(); case somethingelse:' than it does with 'case Pattern, somethingelse'.
A possible limitation is if there are already special rules in the language that treat colon-delimited labels differently than separate blocks with fallthrough. I can't think of any right now, but I may be forgetting something...
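To make the fallthrough framing in (1) concrete, here is a minimal sketch using plain constant labels, which compile on any Java version. The class and method names are made up for illustration. It only shows that 'case 1: case 2:' behaves the same as two separate blocks where the first is empty and falls through; it says nothing about how the spec should treat patterns.

```java
final class FallthroughDemo {
    // Consecutive labels, usually read as one merged set of labels.
    static String merged(int x) {
        switch (x) {
            case 1: case 2:
                return "small";
            default:
                return "other";
        }
    }

    // The same switch written as two blocks: the first block holds only an
    // empty statement and falls through into the second.
    static String explicit(int x) {
        switch (x) {
            case 1:
                ; // empty statement, falls through to the next block
            case 2:
                return "small";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // Print both results side by side for a few inputs.
        for (int x = 0; x <= 3; x++) {
            System.out.println(x + ": " + merged(x) + " / " + explicit(x));
        }
    }
}
```

Both methods return the same value for every input, which is the sense in which the two spellings can be treated as one construct.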
(2) Reduce the syntactic surface of comma-separated switch labels.
There are a lot of combinations of elements (maybe a majority?) that don't make sense and that we prohibit. There are some others that are a bit odd, but we allow them anyway. I'd prefer to cut back on having multiple ways to do things, and syntactically enumerate the few cases that are actually meaningful.
Something like:
SwitchLabel:
    case CaseValue {, CaseValue} :
    case Pattern [Guard] :
    case null, Pattern :
    default :
    case null, default :

CaseValue:
    null
    ConstantExpression
    EnumConstantName
(There's some ambiguity in CaseValue, and I think we should do better with 'ConstantExpression', but okay, set that aside, this is the concept at least.)
Note that the second kind of Pattern SwitchLabel is especially weird—it binds 'null' to a pattern variable, and requires the pattern to be a (possibly parenthesized) type pattern. So it's nice to break it out as its own syntactic case. I'd also suggest rethinking whether "case null," is the right way to express the two kinds of nullable SwitchLabels, but anyway now it's really clear that they are special.
Rules like "no duplicates" and "only for certain switch types" only need to be expressed as constraints on the first kind of SwitchLabel (the CaseValue list). Dominance also seems like it would be more manageable to specify/understand.
Some things this would newly disallow:
- 'case default:'—just say 'default:'
- 'case 23, default:'—just say 'default:' (mention '23' in a comment if it's important to call out)
- 'case default, null:'—could add this case, I guess, or just say that 'default' always goes second
- 'case Pattern, null:'—ditto
- 'case null, Pattern Guard:'—confusing whether the guard is checked when 'null'
This cuts back especially on the degrees of freedom for 'default': the only useful thing you can add to it, and should want to add to it, is making it match null.
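As a rough sketch of what "syntactically enumerate the few cases" means, the five proposed label shapes can be written as a small checker. Everything here is hypothetical (the class name, the Kind enum, the isAllowed signature), constants and enum names are lumped together as VALUE, and the extra rule that the pattern after 'case null,' must be a type pattern is not modeled.

```java
import java.util.Arrays;
import java.util.List;

final class ProposedLabelShapes {
    // Coarse element kinds; VALUE stands for a constant or an enum name.
    enum Kind { VALUE, NULL, PATTERN, DEFAULT }

    // hasCase: the label starts with 'case'; guarded: a guard follows the
    // last element. Returns true only for the five proposed shapes.
    static boolean isAllowed(boolean hasCase, boolean guarded, Kind... elements) {
        List<Kind> e = Arrays.asList(elements);
        if (!hasCase) {
            // default:
            return !guarded && e.equals(Arrays.asList(Kind.DEFAULT));
        }
        // case Pattern [Guard]:  (the only shape that may carry a guard)
        if (e.equals(Arrays.asList(Kind.PATTERN))) return true;
        if (guarded) return false;
        // case null, Pattern:
        if (e.equals(Arrays.asList(Kind.NULL, Kind.PATTERN))) return true;
        // case null, default:
        if (e.equals(Arrays.asList(Kind.NULL, Kind.DEFAULT))) return true;
        // case CaseValue {, CaseValue}:  (CaseValue includes null)
        if (e.isEmpty()) return false;
        for (Kind k : e) {
            if (k != Kind.VALUE && k != Kind.NULL) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // 'case default:' from the list of newly disallowed labels
        System.out.println(isAllowed(true, false, Kind.DEFAULT));
        // 'case null, default:' from the proposed grammar
        System.out.println(isAllowed(true, false, Kind.NULL, Kind.DEFAULT));
    }
}
```

With this encoding, each of the newly disallowed labels listed above is rejected without needing any separate rule about combinations of elements, which is the point of moving those restrictions into the grammar.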