Proposal: Strings in Switch

Joseph D. Darcy Joe.Darcy at Sun.COM
Fri Feb 27 15:20:11 PST 2009


To start the discussions, here is my proposal for allowing Strings in 
switch.

-Joe

PROJECT COIN SMALL LANGUAGE CHANGE PROPOSAL FORM v1.0

AUTHOR(S): Joseph D. Darcy

OVERVIEW
Provide a two sentence or shorter description of these five aspects of 
the feature:
FEATURE SUMMARY: Should be suitable as a summary in a language tutorial.

Add the ability to switch on string values analogous to the existing 
ability to switch on values of the primitive types.

MAJOR ADVANTAGE: What makes the proposal a favorable change?

More regular coding patterns can be used for operations selected on the 
basis of a set of constant string values; the meaning of the new 
construct should be obvious to Java developers.

MAJOR BENEFIT: Why is the platform better if the proposal is adopted?

Potentially better performance for string-based dispatch code.

MAJOR DISADVANTAGE: There is always a cost.

Some increased implementation and testing complexity for the compiler.

ALTERNATIVES: Can the benefits and advantages be had some way without a 
language change?

No; chained if-then-else tests for string equality are potentially 
expensive and introducing an enum for its switchable constants, one per 
string value of interest, would add another type to a program without 
good cause.

EXAMPLES
Show us the code!
SIMPLE EXAMPLE: Show the simplest possible program utilizing the new 
feature.
String s = ...
switch(s) {
  case "foo":
    processFoo(s);
    break;
}

ADVANCED EXAMPLE: Show advanced usage(s) of the feature.
String s = ...
switch(s) {
 case "quux":
    processQuux(s);
    // fall-through

  case "foo":
  case "bar":
    processFooOrBar(s);
    break;

  case "baz":
     processBaz(s);
    // fall-through

  default:
    processDefault(s);
    break;
}

DETAILS
SPECIFICATION: Describe how the proposal affects the grammar, type 
system, and meaning of expressions and statements in the Java 
Programming Language as well as any other known impacts.

The lexical grammar is unchanged.  String is added to the set of types 
valid for a switch statement in JLSv3 section 14.11.  Since Strings are 
already included in the definition of constant expressions, JLSv3 
section 15.28, the SwitchLabel production does not need to be augmented. 
The existing restrictions in 14.11 on no duplicate labels, at most one 
default, no null labels, etc. all apply to Strings as well.  The type 
system is unchanged.  The definite assignment analysis of switch 
statement, JLSv3 section 16.2.9, is unchanged as well.

COMPILATION: How would the feature be compiled to class files?

One way to support this change would be to augment the JVM's 
lookupswitch instruction to operate on String values; however, that 
approach is not recommended or necessary.  It would be possible to 
translate the switches to equivalent if-then-else code, but that would 
require unnecessary equality comparisons which are potentially 
expensive.  Instead, a switch should occur on a predictable and fast 
integer (or long) function value computed from the string.  The most 
natural choice for this function is String.hashCode, but other functions 
could also be used either alone or in conjunction with hashCode.  (The 
specification of String.hashCode is assume to be stable at this point.)  
If all the string labels have different lengths, String.length() could 
be used instead of hashCode.  Generally a String.equals() check will be 
needed to verify the candidate string's identity in addition to the 
evaluation of the screening function because multiple string inputs 
could evaluate to the same result.

A single case label, a single case label with a default, and two case 
labels can be special-cased to just equality checks without function 
evaluations.  If there are collisions in String.hashCode on the set of 
case labels in a switch block, a different function without collisions 
on that set of inputs should be used; for example ((long)s.hashCode<<32 
) + s.length()) is another candidate function.

Here are desugarings to currently legal Java source for the two examples 
above where the default hash code do not collide:

// Simple Example
if (s.equals("foo")) { // cause NPE if s is null
  processFoo(s);
}


// Advanced example
{  // new scope for synthetic variables
  boolean $take_default = false;
  boolean $fallthrough = false;
  $default_label: {
      switch(s.hashCode()) { // cause NPE if s is null
      case 3482567: // "quux".hashCode()
          if (!s.equals("quux")) {
              $take_default = true;
              break $default_label;
          }
          processQuux(s);
          $fallthrough = true;
                case 101574: // "foo".hashCode()
          if (!$fallthrough && !s.equals("foo")) {
              $take_default = true;
              break $default_label;
          }
          $fallthrough = true;
      case 97299:  // "bar".hashCode()
          if (!$fallthrough && !s.equals("bar")) {
              $take_default = true;
              break $default_label;
          }
          processFooOrBar(s);
          break;

      case 97307: // "baz".hashCode()
          if (!s.equals("baz")) {
              $take_default = true;
              break $default_label;
          }
          processBaz(s);
          $fallthrough = true;

      default:
          $take_default = true;
          break $default_label;
      }
  }
  if($take_default)
      processDefault(s);
}

In the advanced example, the boolean "fallthrough" variable is needed to 
track whether a fall-through has occurred so the string equality checks 
can be skipped.  If there are no fall-throughs, this variable can be 
removed.  Likewise, if there is no default label in the original code, 
the $take_default variable is not needed and a simple "break" can be 
used instead.

In a translation directly to bytecode, the synthetic state variables can 
be replaced with goto's; expressing this in pseudo Java source with goto:

// Advanced example in pseudo Java with goto
switch(s.hashCode()) { // cause NPE if s is null
case 3482567: // "quux".hashCode()
    if (!s.equals("quux"))
        goto $default_label;
    goto $fooOrBarCode_label;
   
case 101574: // "foo".hashCode()
    if (!s.equals("foo"))
        goto $default_label;
    goto $fooOrBarCode_label;

case 97299:  // "bar".hashCode()
    if (!s.equals("bar"))
        goto $default_label;

    $fooOrBarCode_label:
    processFooOrBar(s);
    break;

case 97307: // "baz".hashCode()
    if (!s.equals("baz"))
        goto $default_label;
    processBaz(s);

default:
$default_label:
    processDefault(s);
    break;
}

Related to compilation, a compiler's existing diagnostics around falling 
through switches, such as java's -Xlint:fallthrough option and 
@SuppressWarnings("fallthrough"), should work identically on switch 
statements based on Strings.

TESTING: How can the feature be tested?

Generating various simple and complex uses of the new structure and 
verifying the proper execution paths occur; combinations to test include 
switch statements with and without fall-throughs, with and without 
collisions in the hash codes, and with and without default labels.

LIBRARY SUPPORT: Are any supporting libraries needed for the feature?

No.

REFLECTIVE APIS: Do any of the various and sundry reflection APIs need 
to be updated? This list of reflective APIs includes but is not limited 
to core reflection (java.lang.Class and java.lang.reflect.*), 
javax.lang.model.*, the doclet API, and JPDA.

Only reflective APIs that model statements in the source language might 
be affected.  None of core reflection, javax.lang.model.*, the doclet 
API, and JDPA model statements; therefore, they are unaffected.  The 
tree API in javac, 
http://java.sun.com/javase/6/docs/jdk/api/javac/tree/index.html, does 
model statements, but the existing API for switch statements is general 
enough to model the revised language without any API changes.

OTHER CHANGES: Do any other parts of the platform need be updated too? 
Possibilities include but are not limited to JNI, serialization, and 
output of the javadoc tool.

No.

MIGRATION: Sketch how a code base could be converted, manually or 
automatically, to use the new feature.

Look for sequences of if ("constant string".equals(foo)) or if 
(foo.equals("constant string")) and replace accordingly.

COMPATIBILITY
BREAKING CHANGES: Are any previously valid programs now invalid? If so, 
list one.

All existing programs remain valid.

EXISTING PROGRAMS: How do source and class files of earlier platform 
versions interact with the feature? Can any new overloadings occur? Can 
any new overriding occur?

The semantics of existing class files and legal source files and are 
unchanged by this feature.

REFERENCES
EXISTING BUGS: Please include a list of any existing Sun bug ids related 
to this proposal.

5012262 Using Strings and Objects in Switch case statements.
http://bugs.sun.com/view_bug.do?bug_id=5012262

URL FOR PROTOTYPE (optional):

No prototype at this time.



More information about the coin-dev mailing list