[enhanced-switches] My experience of converting old switches to new ones

Sat Sep 17 18:06:09 UTC 2022

Hello!

Our codebase was updated recently from Java 11 to Java 17 level, and
we started gradually using new Java features. Recently, I converted in
a semi-automated manner ~1000 of the old switches to new ones (either
statements or expressions). Here are my thoughts about this. Perhaps
somebody will find them interesting.

1. Knowing that the switch never falls through is a real relief. So
if you see an arrow after the first case, you immediately know that
this is a 'simple switch' (no fallthrough). If you see a colon,
you start looking more carefully: probably something fancy is done in
this switch (otherwise, it's likely that an automated refactoring to
an arrow would have been suggested). So the arrow basically separates
simple switches from complex ones.
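
To illustrate the difference, here is a minimal sketch (the class and
method names are made up for this example). The colon form silently
falls through from case 1 into case 2, while the arrow form runs
exactly one branch:

```java
import java.util.ArrayList;
import java.util.List;

class FallThroughDemo {
    // Old-style switch: the missing break after case 1 lets control
    // fall through into case 2.
    static List<String> colonForm(int level) {
        List<String> log = new ArrayList<>();
        switch (level) {
            case 1:
                log.add("one");
                // no break here: falls through into case 2
            case 2:
                log.add("two");
                break;
            default:
                log.add("other");
        }
        return log;
    }

    // Arrow form: exactly one branch runs, and no break is needed.
    static List<String> arrowForm(int level) {
        List<String> log = new ArrayList<>();
        switch (level) {
            case 1 -> log.add("one");
            case 2 -> log.add("two");
            default -> log.add("other");
        }
        return log;
    }
}
```

With these definitions, `colonForm(1)` collects both "one" and "two",
while `arrowForm(1)` collects only "one".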

2. I really have a desire to use expression switches, even when it
requires some code repetition. E.g., a common pattern:

if (cond) {
  switch(val) {
    case A: return "a";
    case B: return "b";
  }
}
return "default";

I tend to convert it to

if (cond) {
  return switch(val) {
    case A -> "a";
    case B -> "b";
    default -> "default";
  };
}
return "default";

There's a cost of repeating the "default" expression. However, we also
have a benefit. Now, we know that under the condition we always return
no matter what.

Unfortunately, sometimes the default expression can be non-trivial
(e.g., return super.blahblah(all, my, parameters, passed)). In this
case, I'm reluctant to duplicate it many times. It's quite possible
that in fact this is an impossible case, and it's written there just
because something should be written. However, this requires a deeper
understanding of the original code.

3. It's also sad that signaling impossible cases takes so much code.
`default -> assert false;` is not accepted for obvious reasons.
Writing `default -> throw new
IllegalStateException("Unexpected value of "+selectorValue);` every
time is very verbose and distracts from the actual code. Probably some
syntactic sugar to assert that we covered all possible values in a
non-exhaustive switch would be nice (like `default impossible;` or
whatever). E.g., I observed the following (arguably strange) pattern:

switch((cond1() ? 0 : 1) + (cond2() ? 0 : 2)) {
case 0 -> ...
case 1 -> ...
case 2 -> ...
case 3 -> ...
default -> throw new AssertionError("cannot reach here");
}
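
Until something like that exists, one possible way to shorten the
branch is a small helper that builds the exception. This is only a
sketch: `unexpected` is a hypothetical helper written for this
example, not a JDK API, and the branch results are placeholders. The
helper returns the exception instead of throwing it, so the call site
still contains a `throw` and the compiler can see that the branch
never completes normally:

```java
class Impossible {
    // Hypothetical helper (not part of the JDK): creates the exception
    // so that every call site can simply write `throw unexpected(x)`.
    static IllegalStateException unexpected(Object selectorValue) {
        return new IllegalStateException("Unexpected value of " + selectorValue);
    }

    static String describe(boolean cond1, boolean cond2) {
        // Same encoding as in the pattern above: 0..3 are the only possible values.
        int selector = (cond1 ? 0 : 1) + (cond2 ? 0 : 2);
        return switch (selector) {
            case 0 -> "both";
            case 1 -> "only cond2";
            case 2 -> "only cond1";
            case 3 -> "neither";
            // Still required by the compiler, but much shorter to write.
            default -> throw unexpected(selector);
        };
    }
}
```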

4. I really enjoyed exhaustive switch expressions over enums. I
removed probably a hundred redundant default branches. In switch
statements, people do different things when they are forced to write
default, even though they covered all the enum values:
a. throw something (throw new IllegalStateException(), throw new
AssertionError(), etc.)
b. return something simple (return "", return null, etc.)
c. assert+return something simple: assert false; return null;
d. questionable: join default with the last case (case LAST:default: ...)
e. more dangerous: omit the explicit last case and use default instead
of it, while it's clear that default actually handles the non-mentioned case.
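
A small sketch of why variant (e) is dangerous, next to the exhaustive
switch expression that replaces it (the `Color` enum and the method
names are invented for this example):

```java
class DefaultHidesCases {
    enum Color { RED, GREEN, BLUE }

    // Variant (e): BLUE is never mentioned; `default` silently stands in for it.
    // A constant added to Color later would also be reported as "blue".
    static String nameWithDefault(Color c) {
        switch (c) {
            case RED: return "red";
            case GREEN: return "green";
            default: return "blue";
        }
    }

    // Exhaustive switch expression: every constant is listed, no default needed.
    // Adding a constant to Color makes this method fail to compile
    // until the new constant is handled.
    static String nameExhaustive(Color c) {
        return switch (c) {
            case RED -> "red";
            case GREEN -> "green";
            case BLUE -> "blue";
        };
    }
}
```

Both methods agree today, but only the second one forces a decision
when the enum grows.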

Luckily, none of these are necessary anymore if you can use switch
expressions. I even forcibly push switch expressions down inside
something (e.g., a call), just to be able to use them. E.g.:

switch(MY_ENUM) {
  case A -> setSomething("a");
  case B -> setSomething("b");
  case C -> setSomething("c");
  default -> throw new IllegalStateException("impossible; all values are covered");
}

Can be nicely converted to

setSomething(switch(MY_ENUM) {
  case A -> "a";
  case B -> "b";
  case C -> "c";
});

Unfortunately, this is not always possible. Sometimes, you cannot use
a switch expression at all, and in this case, the inability to specify
exhaustiveness is really annoying. We need total switch statements.

5. At first, I thought that switch expressions are best for return
values, assignment rvalues and variable declaration initializers, but
in other contexts they are too verbose and may make things more
complex than necessary. However, I have come to like using them as the
last argument of a call. E.g., before:

switch(x) {
  case "a": return wrap(getA());
  case "b": return wrap(getB());
  case "c": return wrap(getC());
  default: throw new IllegalArgumentException();
}

after:

return wrap(switch(x) {
  case "a" -> getA();
  case "b" -> getB();
  case "c" -> getC();
  default -> throw new IllegalArgumentException();
});

If you don't have tail arguments after switch, then you don't lose the
context, and you immediately know that every non-exceptional return
value is wrapped. It's also possible to extract such a switch into a
separate local variable, but even without extraction it reads nicely.

It's also ok to use switch expressions inside other switch
expressions. Especially useful in double-dispatch enum methods (e.g.,
some kind of lattice operations):

enum Item {
BOTTOM, A, B, AB, TOP;
Item join(Item other) {
  return switch(this) {
    case TOP -> this;
    case BOTTOM -> other;
    case A -> switch(other) {
        case A, AB, TOP -> other;
        case B -> AB;
        case BOTTOM -> this;
      };
    case B -> switch(other) {
        case B, AB, TOP -> other;
        case A -> AB;
        case BOTTOM -> this;
      };
    case AB -> switch(other) {
        case TOP -> other;
        case A, B, AB, BOTTOM -> this;
      };
  };
}
}

Reads much better than the tons of returns we had before. Also, thanks to
exhaustiveness checks, you know that every single case is covered.

6. I started to like yield. In some cases, only a couple of branches
of a long switch that returns from every branch have some complex
intermediate computations or conditional branches. In this case, it's
still better to convert it to a switch expression and replace some
returns with yields. And even if every single branch is complex, using
switch expression + yield may make the code clearer. E.g., it may
clearly show that the purpose of the whole switch is to assign a value
to a single variable, though the computation of that value in every
branch could be complex.
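
A minimal sketch of this shape (the `Unit` enum and the formatting
rules are invented for the example): two branches are one-liners, one
branch needs intermediate work and uses `yield`, and the whole switch
visibly exists only to compute `text`:

```java
class YieldDemo {
    enum Unit { BYTES, KILOBYTES, MEGABYTES }

    static String format(long bytes, Unit unit) {
        // The whole switch has one purpose: computing the value of `text`.
        String text = switch (unit) {
            case BYTES -> bytes + " B";
            case KILOBYTES -> bytes / 1024 + " KB";
            case MEGABYTES -> {
                // Only this branch needs an intermediate computation
                // and a conditional, so only this branch uses yield.
                long mb = bytes / (1024 * 1024);
                if (mb == 0) {
                    yield "<1 MB";
                }
                yield mb + " MB";
            }
        };
        return text;
    }
}
```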

Also, it can be implicitly assumed that even complex switch
expressions with yields don't produce side effects. Of course, this is
not enforced by the compiler, but it would be bad practice to produce
them, so the team could agree on this convention. In this case,
reading the code is simplified a lot. If you see `var something =
switch(...) {...}`, you immediately know that regardless of the switch
complexity, we just calculate the value for `something`, so we can
skip the whole thing if we are not interested in details. If you see a
switch statement, you are less sure whether every single branch does
only this.

7. I really miss `case null`. I saw many switches these days, during
my conversion quest. And it happens quite often in our codebase that
the null case is handled separately before the switch (often the same
as 'default', but sometimes not). In the IntelliJ codebase, we really
use nulls extensively, even though some people may think that it's a
bad idea. It's good that we will have `case null` in the future.
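
For illustration, this is roughly what the Java 17 shape looks like
(the `Level` enum and the returned labels are made up for this
sketch). The null check has to live outside the switch, because
switching on a null selector throws NullPointerException:

```java
class NullBeforeSwitch {
    enum Level { LOW, HIGH }

    static String label(Level level) {
        // In Java 17 the null case cannot be a switch branch; without this
        // check, switch (level) would throw NullPointerException for null.
        if (level == null) {
            return "unknown";
        }
        return switch (level) {
            case LOW -> "low";
            case HIGH -> "high";
        };
    }
}
```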

8. I also miss `case default`. It's strange, but I often see old
switches where `default:` is joined with other cases. Probably more
often with strings, less often with enums. Something like:

switch(valueFromConfig) {
  case "increase": increase(); break;
  case "decrease": decrease(); break;
  case "enable": enable(); break;
  // "disable" is a documented value and we explicitly process it
  case "disable":
  // something unknown, but we still want to fall back to the
  // default value, which is "disable"
  default:
    disable(); break;
}

With Java 17 enhanced switches, we should either delete `case
"disable"`, or duplicate the branch. If we delete it, it will no longer
be clear that this value is deliberately handled as an "official"
value. In the future, I could use `case "disable", default ->
disable();` which would solve the issue.
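
For comparison, a sketch of the "duplicate the branch" option in
Java 17 (the class and the logging methods are stand-ins invented for
this example; the real methods would do actual work):

```java
class ConfigSwitch {
    private final StringBuilder log = new StringBuilder();

    // Stand-ins for the real actions: they only record that they were called.
    void increase() { log.append("increase;"); }
    void decrease() { log.append("decrease;"); }
    void enable() { log.append("enable;"); }
    void disable() { log.append("disable;"); }

    String apply(String valueFromConfig) {
        switch (valueFromConfig) {
            case "increase" -> increase();
            case "decrease" -> decrease();
            case "enable" -> enable();
            // "disable" is a documented value and we explicitly process it
            case "disable" -> disable();
            // something unknown, but we still fall back to "disable";
            // the branch body has to be repeated here
            default -> disable();
        }
        return log.toString();
    }
}
```

The documented value stays visible, at the cost of repeating the
branch body.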

9. Some old switches are actually shorter than new ones, and I'm not
sure about converting them. Usually, it's like this:

if (condition) {
  switch(value) {
  case 1: return "a";
  case 2: return "b";
  case 3: return "c";
  // no default case, execution continues
  }
}
... a lot of common code for when `condition` is false or `value` is
not listed in the cases ...

Here it's hard to use a switch expression, and an enhanced switch
statement only becomes longer and more cluttered with syntax:

switch(value) {
case 1 -> { return "a"; }
case 2 -> { return "b"; }
case 3 -> { return "c"; }
}

Well, it's possible to refactor to something like

String result = !condition ? null : switch(value) {
  case 1 -> "a";
  case 2 -> "b";
  case 3 -> "c";
  default -> null;
  };
if (result != null) return result;
... a lot of common code for when `condition` is false or `value` is
not listed in the cases ...

But it's questionable whether this makes the code more readable.

10. Sometimes, one or a few enum values are peeled off in advance. In
this case, a clean conversion becomes problematic. E.g.:

enum Mode {IGNORE, A, B, C}

void updateMode(Mode mode) {
  if (mode == Mode.IGNORE) return;
  System.out.println("Processing...");
  switch(mode) {
    case A -> process("a");
    case B -> process("b");
    case C -> process("c");
  }
}

It's almost convertible to a switch expression. However, the switch is
non-exhaustive, and you cannot get exhaustiveness benefits. It's
possible to add a throwing branch, though it's also long and verbose:

void updateMode(Mode mode) {
  if (mode == Mode.IGNORE) return;
  System.out.println("Processing...");
  process(switch(mode) {
    case A -> "a";
    case B -> "b";
    case C -> "c";
    case IGNORE -> throw new AssertionError("impossible; handled before");
    // hooray, exhaustive now!
  });
}

Of course, it would be too much for javac to analyze code to this
extent and allow skipping the IGNORE branch because it was checked
before (the IntelliJ analyzer knows this). However, it's still sad.
This is related to item 3. Probably a short syntax for impossible
branches would be nice.

Thank you for reading my very long email.

With best regards,
Tagir Valeev.