break seen as a C archaism
Brian Goetz
brian.goetz at oracle.com
Tue Mar 13 20:18:45 UTC 2018
Thanks for the detailed analysis. I'm glad to see that a larger
percentage could be converted to expression switch given sufficient
effort; I'm also not surprised to see that a lot had accidental reasons
that kept them off the happy path. And, your analysis comports with
expectations in another way: that some required more significant
intervention than others to lever them into compliance. For example,
the ones with side-effecting activities or logging, while good
candidates for "strongly dissuade", happen in the real world, and not
all codebases have the level of discipline or willingness to refactor
that yours does. So I'm not sure I'm ready to toss either of the size-7
sub-buckets aside so quickly; not everyone is as sanguine about "well, go
refactor then" as you are.
Which is to say, adjusting the data brings the simple bucket from 86%
(which seemed low) to 93-95%, which is "most of the time" but not "all
of the time". So most of the time, the code will not need whatever
escape mechanism (break or block), but it will need one often enough
that some escape hatch is required.
You didn't mention fallthrough, but because that's everyone's favorite
punching bag here, I'll add: most of the time, fallthrough is not needed
in expression switches, but having reviewed a number of low-level
switches in the JDK, it is sometimes desirable even for switches that
can be converted to expressions. One of the motivations for refining
break in this way is so that when fallthrough is needed, the existing
idiom just works as everyone understands it to.
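To make the fallthrough point concrete, here is a minimal sketch of a
statement switch that computes a single value and relies on real
(non-no-op) fallthrough. The class and method names are hypothetical and
the example is not taken from the JDK; it only illustrates the shape of
code being discussed.

```java
final class FallthroughSketch {
    // Hypothetical helper: builds a bit mask with the low `level` bits set,
    // for levels 1..3; any other level yields 0.
    static int maskFor(int level) {
        int mask = 0;
        switch (level) {
            case 3:
                mask |= 0b100; // deliberately falls through to case 2
            case 2:
                mask |= 0b010; // deliberately falls through to case 1
            case 1:
                mask |= 0b001;
                break;         // leaves the switch
            default:
                break;         // mask stays 0
        }
        return mask;
    }
}
```

Here `maskFor(3)` accumulates all three bits, while `maskFor(1)` sets only
the lowest one. The switch exists solely to produce one value, so it is a
candidate for conversion, yet each case depends on the cases below it.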
> imho, early signs suggest that the grossness of `break x` is not
> /nearly/ justified by the actual observed positive value of supporting
> multi-statement cases in expression switch. Are we open to killing
> that, or would we be if I produced more and clearer evidence?
That's one valid interpretation of the data, but not the only one. Whether
making break behave more like return (takes a value in non-void context,
doesn't take one in void context) is gross or natural is subjective.
Here's my subjective interpretation: "great, most of the time, you don't
have to use the escape hatch, but when you do, it has control flow just
like the break we know, extended to take a value in non-void contexts,
so will be fairly familiar."
But setting aside subjective reactions, are there better alternatives?
Let's review what has been considered already, and why they've been
passed over:
- Do nothing; only allow single expressions. Non-starter.
- Traditional "block expressions"; { S; S; e }. Terrible fit for
Java, so no.
- Some other form of block expression. Seems a very big hammer for a
small problem, which will surely interact with other features, and will
likely call for follow-ons of its own.
- Some sort of bespoke "block expression for switch".
On the last of these, one obvious choice is something lambda-like:

    case 1 -> 1;
    case 2 -> { println("two"); return 2; }

You might argue that this is familiar because it's using `return` just
like lambda, but ... yuck. Lambdas are their own invocation scope, so
`return` can be twisted into making sense, but the block of a switch is
not, so `return` is definitely the wrong word here. (Arguably it was
the wrong word for lambdas too; had someone suggested `break` at the
right time back then I would probably have been pretty compelled by this
suggestion, but we picked `return` early (when we were still caught up
in "lambdas are sugar for inner classes") and didn't look back. Oh
well.) But it really seems like a bridge too far to use `return` here.
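The difference in what `return` means in the two contexts can be seen in
existing Java. The following is a small sketch with hypothetical names: in
a lambda body, `return` ends only the lambda's own invocation, whereas
inside the block of a switch it ends the enclosing method.

```java
import java.util.function.IntSupplier;

final class ReturnScopeSketch {
    // `return` inside the lambda body completes only the lambda invocation;
    // the enclosing method keeps running afterwards.
    static int viaLambda() {
        IntSupplier two = () -> { return 2; };
        int value = two.getAsInt();
        return value + 100;
    }

    // `return` inside a switch block completes the whole enclosing method;
    // `break` is what leaves just the switch.
    static int viaSwitch(int x) {
        switch (x) {
            case 2:
                return 2;
            default:
                break;
        }
        return x + 100;
    }
}
```

`viaLambda()` evaluates to 102 because the code after the lambda call still
runs; `viaSwitch(2)` evaluates to 2 because `return` skips the rest of the
method, and `viaSwitch(1)` evaluates to 101 because `break` only exits the
switch.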
The obvious alternative, then, is ... break:

    case 1 -> 1;
    case 2 -> { println("two"); break 2; }

But that is pretty similar to what we have now, just with braces. If
the concern is that we're stretching `break` too far, then this is just
as bad.
Worse, it has two significant additional downsides:
1. You can't fall through at all. (Yes, I know some people think this
is an upside.) But real code does use fallthrough, and this leaves them
without any alternative; it also widens the asymmetry of expression
switch vs statement switch. (Combine this with other suggestions that
widen the asymmetry between pattern and non-pattern switch, and you have
four switch constructs. Oops.)
2. Either you can only use these block expressions in switch, in which
case people hate us for one reason, or you can use them everywhere, and
they hate us for another. (I have a hard time imagining that this
doesn't run into conflicts with other contexts in which one could use
break (how could it not), plus, I don't think this is the block
expression idiom I want in the language anyway.)
So it seems like a half-measure that is worse on nearly every metric.
There might be other alternatives, but I don't see a better one, other
than deprecating switch and designing a whole new mechanism. While I
understand the attraction of that, I don't think it does users a favor
either.
And, to defend what we've proposed: it's exactly the switch we all know,
warts and all. Very little new; very little in the way of asymmetry
between void/value and pattern/constant. The cost is that we have to
accept the existing warts, primarily the weird block expression (blocks
of statements with break not surrounded by braces), the weird scoping,
and fallthrough.
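As a reminder of what the scoping wart looks like in today's statement
switch, here is a minimal sketch (hypothetical names): the whole switch
body is one block, so a local declared under one case label is still in
scope under later labels, though not initialized there.

```java
final class SwitchScopeSketch {
    static String describe(int code) {
        switch (code) {
            case 1:
                String label = "one"; // declared under the first case label
                return label;
            case 2:
                label = "two";        // same variable, still in scope here,
                return label;         // but it must be assigned before use
            default:
                return "other";
        }
    }
}
```

Note that the statements under each label are not wrapped in braces, which
is the "weird block" referred to above; `describe(2)` returns "two" by
reusing the variable declared under `case 1`.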
This choice reminds me of the old Yiddish proverb of the Tree of
Sorrows (https://www.inspirationalstories.com/0/69.html).
If you've got something better ...
On 3/13/2018 3:32 PM, Kevin Bourrillion wrote:
> On Fri, Mar 9, 2018 at 3:21 PM, Louis Wasserman <lowasser at google.com>
> wrote:
>
> Simplifying: let's call normal cases in a switch simple if they're
> a single statement or a no-op fallthrough, and let's call a
> default simple if it's a single statement or it's not there at all.
>
> Among switches apparently convertible to expression switches,
>
> * 81% had all simple normal cases and a simple default.
> * 5% had all simple normal cases and a nonsimple default.
> * 12% had a nonsimple normal case and a simple default.
> * 2% had a nonsimple normal case and a nonsimple default.
>
> I was surprised it was as high as 19%, so I grabbed a random sample of
> these 45 occurrences from Google's codebase and reviewed them. My goal
> was to find evidence that multi-statement cases in expression switches
> are important and common. Spoiler: I found said evidence underwhelming.
>
> There were 3 that I would call false matches (e.g. two that simply
> used a void `return` instead of `break` after every case without reason).
>
> There were fully 20 out of the remaining 42 that I quickly concluded
> should be refactored regardless of anything else, and where that
> refactoring happens to leave them with only simple cases and simple/no
> default. These refactorings were varied (hoist out code common to all
> non-exception cases; simplify unreachable code; change to `if` if only
> 1-2 cases; extract a method (needing only 1-2 parameters) for a case
> that is much bigger than the others; switch from loop to Streams;
> change `if/else` to ?:; move a precondition check to a more
> appropriate location; and a few other varied cleanups).
>
> Next there were 7 examples where the non-simple cases included
> side-effecting code, like setting fields or calling void methods. In
> Google Style I expect that we will probably forbid (or at least
> strongly dissuade) side effects in expression switch. I should
> probably bring this up separately, but I am pretty convinced by now
> that users should see expression switch and procedural switch as two
> completely different things, and by convention should always keep the
> former purely functional.
>
> Next there were 7 examples where a case was "non-simple" only because
> it was using the "log, then return a null object (or null), instead of
> throwing an exception" anti-pattern. I was surprised this was that
> popular. Another 2 used the "log-and-also-throw" anti-pattern.
>
> 2 examples had a use-once local variable that saved a /little/ bit of
> nesting. I wouldn't normally refactor these, but if expression switch
> had no mechanism for multi-statement cases, I wouldn't think twice
> about it.
>
> 1 example had cases that looked nearly identical, 3 statements each,
> that could all be hoisted out of the switch, except that the types
> that differed across the three didn't implement a common interface (as
> they clearly should have). Slightly compelling.
>
> 1 example had all simple cases except that one also wanted to check an
> assertion. Okay, slightly compelling.
>
> Finally, the cases that were the most compelling to me: 3 examples had
> one or more large cases, where factoring them out into helper methods
> would imho be ugly because >=3 parameters would be required. If
> expression switch didn't permit multi-statement cases, I would just
> keep them as procedural switches. It's only 3 out of 42.
>
> Summary:
>
> imho, early signs suggest that the grossness of `break x` is not
> /nearly/ justified by the actual observed positive value of supporting
> multi-statement cases in expression switch. Are we open to killing
> that, or would we be if I produced more and clearer evidence?
>
> On Fri, Mar 9, 2018 at 2:56 PM Brian Goetz <brian.goetz at oracle.com>
> wrote:
>
> Did you happen to calculate what percentage was _not_ the
> "default" case? I would expect that to be a considerable
> fraction.
>
> On 3/9/2018 5:49 PM, Kevin Bourrillion wrote:
>> On Fri, Mar 9, 2018 at 1:19 PM, Remi Forax <forax at univ-mlv.fr>
>> wrote:
>>
>> When I asked what we should do instead, the answer was either:
>> 1/ we should not allow blocks of code in the expression
>> switch, only expressions
>> 2/ we should use the lambda syntax with return,
>> even if the semantics differ from the lambda semantics.
>>
>> I do not like (1) because I think the expression switch
>> will become useless
>>
>>
>> In our (large) codebase, +Louis determined that, among switch
>> statements that appear translatable to expression switch,
>> 13.8% of them seem to require at least one multi-statement case.
>>
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com