PROPOSAL: Enhanced for each loop iteration control

Stephen Colebourne scolebourne at joda.org
Sat Mar 21 14:46:08 PDT 2009


Enhanced for each loop iteration control

(re-sent with correct subject line)

This proposal extends the for each loop to allow access to meta data
including the index and the remove method.

(This proposal doesn't really go into enough detail - with JSR-310 my
time here is limited. I'd hope that there is just about enough though....)

-----------------------------------------------------------------------------------
Enhanced for each loop iteration control

AUTHOR(S):
Stephen Colebourne

*OVERVIEW*

FEATURE SUMMARY:
Extends the Java 5 for-each loop to allow access to the loop index,
whether this is the first or last iteration, and to remove the current item.

MAJOR ADVANTAGE:
The for-each loop is almost certainly the most popular new feature from
Java 5. It works because it increases the abstraction level - instead of
having to express the low-level details of how to loop around a list or
array (with an index or iterator), the developer simply states that they
want to loop and the language takes care of the rest. However, all the
benefit is lost as soon as the developer needs to access the index or to
remove an item.

The original Java 5 for each work took a relatively conservative stance
on a number of issues aiming to tackle the 80% case. However, loops are
such a common form in coding that the remaining 20% that was not tackled
represents a significant body of code.

The process of converting the loop back from the for each to be index or
iterator based is painful. This is because the old loop style is
significantly lower-level, more verbose and less clear. It is also
painful as most IDEs don't support this kind of 'de-refactoring'.

MAJOR BENEFIT:
A common coding idiom is expressed at a higher abstraction than at
present. This aids readability and clarity.

Accessing the index currently requires using an int based loop, or
placing a separate int counter outside the loop (which then remains in
scope after the loop). The proposed solution doesn't result in manual
manipulation of the index.

Accessing the iterator remove requires using an iterator based loop.
With generics this is remarkably verbose. The proposed solution is
significantly shorter and cleaner.

MAJOR DISADVANTAGE:
The enhanced for each loop is complicated with additional functionality.
(This is mitigated by being easy and obvious to use)

More code is generated magically by the compiler. (This is mitigated by
a simple desugaring)

ALTERNATIVES:
Use the existing language constructs, typically the standard for loop.

Use BGGA/JCA style closures, with control statements. It should be noted
that these are consistently the most controversial parts of the closure
debate, making the 'let's wait for closures' argument against this
proposal weaker (as any final closures implementation may not include
control statements).


*EXAMPLES*

SIMPLE EXAMPLE:

    for (String str : list : it) {
      if (str == null) {
        it.remove();
      }
    }

whereas, today we write:

    for (Iterator<String> it = list.iterator(); it.hasNext();) {
      String str = it.next();
      if (str == null) {
        it.remove();
      }
    }

ADVANCED EXAMPLE:

Example1:

    for (String str : list : it) {
      System.out.println("Row " + it.index() + " has the value " + str);
    }

whereas, today we might write:

    int index = 0;
    for (String str : list) {
      System.out.println("Row " + index + " has the value " + str);
      index++;
    }

or

    for (int i = 0; i < list.size(); i++) {
      String str = list.get(i);
      System.out.println("Row " + i + " has the value " + str);
    }

Example 2:

    StringBuilder buf = new StringBuilder();
    for (String str : list : it) {
      if (it.isFirst()) {
        buf.append(str);
      } else {
        buf.append(", ").append(str);
      }
    }

or

    StringBuilder buf = new StringBuilder();
    for (String str : list : it) {
      if (it.isLast()) {
        buf.append(str);
      } else {
        buf.append(str).append(", ");
      }
    }
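
For comparison, Example 2 might be written today roughly as follows.
This is only a sketch: the class name JoinToday and both method names
are made up for illustration and are not part of the proposal. The first
method mirrors the isFirst() form using a flag declared outside the
loop, and the second mirrors the isLast() form using an explicit
iterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class JoinToday {

    // Mirrors the isFirst() form: a boolean flag that stays in scope
    // only because it has to be declared outside the for each loop.
    static String joinWithFlag(List<String> list) {
        StringBuilder buf = new StringBuilder();
        boolean first = true;
        for (String str : list) {
            if (first) {
                first = false;
            } else {
                buf.append(", ");
            }
            buf.append(str);
        }
        return buf.toString();
    }

    // Mirrors the isLast() form: an explicit iterator, asking hasNext()
    // again after next() to find out whether this was the last item.
    static String joinWithIterator(List<String> list) {
        StringBuilder buf = new StringBuilder();
        for (Iterator<String> it = list.iterator(); it.hasNext();) {
            buf.append(it.next());
            if (it.hasNext()) {
                buf.append(", ");
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        System.out.println(joinWithFlag(list));
        System.out.println(joinWithIterator(list));
    }
}
```

Both methods build the same string, but neither can stay in the plain
for each form without extra state outside the loop.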


*DETAILS*

SPECIFICATION:

Lexical:
No new tokens are added. The colon token is reused in the extended
enhanced for each statement.

Syntax:

   EnhancedForStatement:
     for ( VariableModifiersopt Type Identifier : Expression ) Statement
     for ( VariableModifiersopt Type Identifier : Expression : Identifier ) Statement

Semantics:

The first enhanced for each statement (the current form) will compile as
it does today.

The extended enhanced for each statement will operate as follows.
The iteration control variable is a standard variable declared to be
final. It will never be null. The type is dependent on whether the
expression is an array or an Iterable. It will either be
ArrayIterationControl<T> or IterableIterationControl<T>. The type is not
specified as it is redundant information, i.e. the type is inferred. It
is scoped for the life of the loop.

public final class IterableIterationControlIterator<T> {
    private final IterableIterationControl<T> control;
    public IterableIterationControlIterator(Iterable<T> iterable) {
      this.control = new IterableIterationControl<T>(iterable.iterator());
    }
    public boolean hasNext() { return control.hasNext(); }
    public T next() { return control.next(); }
    public IterableIterationControl<T> control() { return control; }
}
public final class IterableIterationControl<T> {
    private final Iterator<T> it;
    private int index;              // one-based position, skipping removed items
    private int originalIndex;      // one-based position, counting removed items
    private int removed;            // number of items removed so far
    private boolean lastWasRemoved;
    public IterableIterationControl(Iterator<T> iteratorToWrap) {
      this.it = iteratorToWrap;
    }
    boolean hasNext() { return it.hasNext(); }
    T next() {
      originalIndex++;
      if (lastWasRemoved) { lastWasRemoved = false; } else { index++; }
      return it.next();
    }
    // Iterator.remove() returns void, so this method does too
    public void remove() { it.remove(); removed++; lastWasRemoved = true; }
    // zero-based, as in ArrayIterationControl
    public int index() { return index - 1; }
    public int originalIndex() { return originalIndex - 1; }
    public boolean isFirst() { return index == 1; }
    public boolean isLast() { return it.hasNext() == false; }
}

public final class ArrayIterationControlIterator<T> {
    private final ArrayIterationControl<T> control;
    public ArrayIterationControlIterator(T[] array) {
      this.control = new ArrayIterationControl<T>(array);
    }
    public boolean hasNext() { return control.hasNext(); }
    public T next() { return control.next(); }
    public ArrayIterationControl<T> control() { return control; }
}
public final class ArrayIterationControl<T> {
    private final T[] array;
    private int index;  // number of elements returned so far
    public ArrayIterationControl(T[] array) { this.array = array; }
    boolean hasNext() { return index < array.length; }
    // post-increment, so that the first call returns array[0]
    T next() { return array[index++]; }
    public int index() { return index - 1; }
    public boolean isFirst() { return index == 1; }
    public boolean isLast() { return index == array.length; }
}

Exception analysis:
The method remove() on the iteration control variable can throw an
UnsupportedOperationException. However, this is no different from any
other method call.
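
The same behaviour can be seen with today's iterators. The sketch below
is an illustration only (the class name RemoveUnsupportedSketch and its
method are made up): it calls Iterator.remove() on an ArrayList, which
supports removal, and on a Collections.unmodifiableList view, which does
not. A control variable that delegates to the underlying iterator would
presumably surface the exception in the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, for illustration only.
final class RemoveUnsupportedSketch {

    // Returns true if the list's iterator rejects remove().
    static boolean removeIsRejected(List<String> list) {
        Iterator<String> it = list.iterator();
        it.next();  // remove() is only legal after a call to next()
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> modifiable = new ArrayList<String>(Arrays.asList("a", "b"));
        List<String> readOnly =
            Collections.unmodifiableList(new ArrayList<String>(Arrays.asList("a", "b")));
        System.out.println(removeIsRejected(modifiable));
        System.out.println(removeIsRejected(readOnly));
    }
}
```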

Definite assignment:
The new iteration control variable is a final variable that is
definitely assigned from creation.

COMPILATION:
The extended enhanced for each loop is implemented by wrapping the two
control classes around the Iterable or the array.

The Iterable design is desugared from:

    for (T item : iterable : control) { ... }

to:
    {
      IterableIterationControlIterator<T> $it =
          new IterableIterationControlIterator<T>(iterable);
      IterableIterationControl<T> control = $it.control();
      while ($it.hasNext()) {
        T item = $it.next();
        ...
      }
    }

The array design is desugared similarly, from:

    for (T item : array : control) { ... }

to:
    {
      ArrayIterationControlIterator<T> $it =
          new ArrayIterationControlIterator<T>(array);
      ArrayIterationControl<T> control = $it.control();
      while ($it.hasNext()) {
        T item = $it.next();
        ...
      }
    }
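
To make the Iterable desugaring concrete, here is a runnable sketch that
can be compiled with today's javac. It is an illustration under
assumptions, not the proposed library code: the nested Control class is
a cut-down stand-in for IterableIterationControl<T> that folds the two
wrapper objects into one, and the names DesugarSketch and
removeNullsAndLabel are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, for illustration only.
final class DesugarSketch {

    // Cut-down stand-in for the proposed control class (an assumption,
    // not the class from the specification above).
    static final class Control<T> {
        private final Iterator<T> it;
        private int index = -1;          // zero-based, skipping removed items
        private boolean lastWasRemoved;
        Control(Iterator<T> it) { this.it = it; }
        boolean hasNext() { return it.hasNext(); }
        T next() {
            if (lastWasRemoved) { lastWasRemoved = false; } else { index++; }
            return it.next();
        }
        void remove() { it.remove(); lastWasRemoved = true; }
        int index() { return index; }
    }

    // Hand-desugared form of the (not yet legal) loop:
    //   for (String str : list : control) {
    //     if (str == null) { control.remove(); }
    //     else { out.add(control.index() + "=" + str); }
    //   }
    static List<String> removeNullsAndLabel(List<String> list) {
        List<String> out = new ArrayList<String>();
        {
            final Control<String> control = new Control<String>(list.iterator());
            while (control.hasNext()) {
                String str = control.next();
                if (str == null) {
                    control.remove();
                } else {
                    out.add(control.index() + "=" + str);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", null, "b", null));
        System.out.println(removeNullsAndLabel(list));
        System.out.println(list);
    }
}
```

In this sketch the index skips removed items, so the element that
follows a removed one takes over its index.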

There is the option to optimise this if the iteration control variable
is not assigned to any other variable or passed to any other method.
However, that is out of scope for now.

TESTING:
Testing will be similar to the enhanced for loop. Arrays and Iterables
of various types and sizes will be used. The null expression will also
be tested.

LIBRARY SUPPORT:
Yes, as detailed above.

REFLECTIVE APIS:
No.

OTHER CHANGES:
The javac tree API would need to be updated to model the change.

MIGRATION:
Migration is not required. However, an IDE refactoring could now convert
more int and Iterator based for loops than it does at present.


*COMPATIBILITY*

BREAKING CHANGES:
No breaking changes are known using this conversion scheme.

EXISTING PROGRAMS:
This conversion is pure syntax sugar, so there are no known interactions
with existing programs.


*REFERENCES*

EXISTING BUGS:
I searched the bug database, but nothing came up (which is surprising).

URL FOR PROTOTYPE:
None

DOCUMENTS:
[1] Stephen Colebourne's blog:
http://www.jroller.com/scolebourne/entry/java_7_for_each_loop
[2] Stephen Colebourne's original writeup:
http://docs.google.com/Edit?docID=dfn5297z_15ck7x5ghr


*DESIGN ISSUES*
There are numerous alternative ways in which this feature can be added.
These include:
- using the keyword:
    for (String str : list) {
      for.remove();  // problem is nested for loops
    }
- using a label as a pseudo-variable:
    it: for (String str : list) {
      it:remove();  // note colon and not dot
    }
- using an additional clause before the for each colon:
    for (String str, int index : list) {
      // no access to remove, and conflicts with a possible for each for maps
    }

The chosen solution involves simple Java classes, and a simple desugar.
The downside of the chosen solution is performance, as it involves
creating two wrapping objects and routing hasNext() and next() via
additional layers of method calls. A possible extension would be for the
compiler to identify if the iteration control variable is not passed to
another method. If so, then the code could all be generated inline.
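
As a rough idea of what such inline generation might look like, the
sketch below hand-inlines a loop that only uses index() and isFirst().
It is an assumption about one possible translation, not something the
proposal specifies; the class name InlineSketch, the method name and the
synthetic variable names $it and $index are made up.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical demo class, for illustration only.
final class InlineSketch {

    // Hand-inlined form of the (not yet legal) loop:
    //   for (String str : list : it) {
    //     if (!it.isFirst()) { buf.append(", "); }
    //     buf.append(it.index()).append(':').append(str);
    //   }
    // No wrapper objects are created; the control state is a local int.
    static String label(Iterable<String> list) {
        StringBuilder buf = new StringBuilder();
        {
            Iterator<String> $it = list.iterator();
            int $index = -1;
            while ($it.hasNext()) {
                String str = $it.next();
                $index++;
                if ($index != 0) {          // inlined !it.isFirst()
                    buf.append(", ");
                }
                buf.append($index).append(':').append(str);  // inlined it.index()
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(label(Arrays.asList("a", "b", "c")));
    }
}
```

Since remove() is never called in this loop, the removal bookkeeping
can be dropped entirely from the generated code.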






