Enhanced for each loop iteration control

Stephen Colebourne scolebourne at joda.org
Sat Mar 21 10:33:52 PDT 2009


This proposal extends the for each loop to allow access to metadata 
including the index and the remove method.

(This proposal doesn't really go into enough detail - with JSR-310 my 
time here is limited. I'd hope that there is just about enough though....)

-----------------------------------------------------------------------------------
Enhanced for each loop iteration control

AUTHOR(S):
Stephen Colebourne

*OVERVIEW*

FEATURE SUMMARY:
Extends the Java 5 for-each loop to allow access to the loop index, 
whether this is the first or last iteration, and to remove the current item.

MAJOR ADVANTAGE:
The for-each loop is almost certainly the most popular new feature from 
Java 5. It works because it increases the abstraction level - instead of 
having to express the low-level details of how to loop around a list or 
array (with an index or iterator), the developer simply states that they 
want to loop and the language takes care of the rest. However, all the 
benefit is lost as soon as the developer needs to access the index or to 
remove an item.

The original Java 5 for each work took a relatively conservative stance 
on a number of issues aiming to tackle the 80% case. However, loops are 
such a common form in coding that the remaining 20% that was not tackled 
represents a significant body of code.

The process of converting the loop back from the for each to be index or 
iterator based is painful. This is because the old loop style is 
significantly lower-level, more verbose and less clear. It is also 
painful as most IDEs don't support this kind of 'de-refactoring'.

MAJOR BENEFIT:
A common coding idiom is expressed at a higher abstraction than at 
present. This aids readability and clarity.

Accessing the index currently requires using an int based loop, or 
placing a separate int counter outside the loop (which then remains in 
scope after the loop). The proposed solution doesn't result in manual 
manipulation of the index.

Accessing the iterator remove requires using an iterator based loop. 
With generics this is remarkably verbose. The proposed solution is 
significantly shorter and cleaner.

MAJOR DISADVANTAGE:
The enhanced for each loop is complicated with additional functionality. 
(This is mitigated by being easy and obvious to use.)

More code is generated magically by the compiler. (This is mitigated by 
a simple desugaring.)

ALTERNATIVES:
Use the existing language constructs, typically the standard for loop.

Use BGGA/JCA style closures, with control statements. It should be noted 
that these are consistently the most controversial parts of the closure 
debate, making the 'let's wait for closures' argument against this 
proposal weaker (as any final closures implementation may not include 
control statements).


*EXAMPLES*

SIMPLE EXAMPLE:

   StringBuilder buf = new StringBuilder();
   for (String str : list : it) {
     if (str == null) {
       it.remove();
     }
   }

whereas, today we write:

   StringBuilder buf = new StringBuilder();
   for (Iterator<String> it = list.iterator(); it.hasNext();) {
     String str = it.next();
     if (str == null) {
       it.remove();
     }
   }

ADVANCED EXAMPLE:

Example 1:

   for (String str : list : it) {
     System.out.println("Row " + it.index() + " has the value " + str);
   }

whereas, today we might write:

   int index = 0;
   for (String str : list) {
     System.out.println("Row " + index + " has the value " + str);
     index++;
   }

or

   for (int i = 0; i < list.size(); i++) {
     String str = list.get(i);
     System.out.println("Row " + i + " has the value " + str);
   }

Example 2:

   StringBuilder buf = new StringBuilder();
   for (String str : list : it) {
     if (it.isFirst()) {
       buf.append(str);
     } else {
       buf.append(", ").append(str);
     }
   }

or, testing for the last iteration instead:

   StringBuilder buf = new StringBuilder();
   for (String str : list : it) {
     if (it.isLast()) {
       buf.append(str);
     } else {
       buf.append(str).append(", ");
     }
   }


*DETAILS*

SPECIFICATION:

Lexical:
No new tokens are added. The colon token is reused in the extended 
enhanced for each statement.

Syntax:

  EnhancedForStatement:
    for ( VariableModifiersopt Type Identifier : Expression ) Statement
    for ( VariableModifiersopt Type Identifier : Expression : Identifier ) Statement

Semantics:

The first enhanced for each statement (the current form) will compile as 
it does today.

The extended enhanced for each statement will operate as follows.
The iteration control variable is a standard variable declared to be 
final. It will never be null. Its type depends on whether the 
expression is an array or an Iterable: it will be either 
ArrayIterationControl<T> or IterableIterationControl<T>. The type is not 
written in the source as it is redundant information, i.e. the type is 
inferred. The variable is scoped for the life of the loop.

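import java.util.Iterator;

public final class IterableIterationControlIterator<T> {
   private final IterableIterationControl<T> control;
   public IterableIterationControlIterator(Iterable<T> iterable) {
     this.control = new IterableIterationControl<T>(iterable.iterator());
   }
   public boolean hasNext() { return control.hasNext(); }
   public T next() { return control.next(); }
   public IterableIterationControl<T> control() { return control; }
}
public final class IterableIterationControl<T> {
   private final Iterator<T> it;
   private int index;          // elements returned so far, excluding removed ones
   private int originalIndex;  // elements returned so far, including removed ones
   private int removed;        // elements removed so far
   private boolean lastWasRemoved;
   public IterableIterationControl(Iterator<T> iteratorToWrap) {
     this.it = iteratorToWrap;
   }
   boolean hasNext() { return it.hasNext(); }
   T next() {
     originalIndex++;
     // an item that follows a removed item takes over the removed item's index
     if (lastWasRemoved) { lastWasRemoved = false; } else { index++; }
     return it.next();
   }
   // Iterator.remove() returns void, so this method does too
   public void remove() { it.remove(); removed++; lastWasRemoved = true; }
   // zero-based, consistent with ArrayIterationControl and with Example 1
   public int index() { return index - 1; }
   public int originalIndex() { return originalIndex - 1; }
   public boolean isFirst() { return index == 1; }
   public boolean isLast() { return !it.hasNext(); }
}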

public final class ArrayIterationControlIterator<T> {
   private final ArrayIterationControl<T> control;
   public ArrayIterationControlIterator(T[] array) {
     this.control = new ArrayIterationControl<T>(array);
   }
   public boolean hasNext() { return control.hasNext(); }
   public T next() { return control.next(); }
   public ArrayIterationControl<T> control() { return control; }
}
public final class ArrayIterationControl<T> {
   private final T[] array;
   private int index;  // number of elements returned so far
   public ArrayIterationControl(T[] array) { this.array = array; }
   boolean hasNext() { return index < array.length; }
   // post-increment, so that element 0 is returned first
   T next() { return array[index++]; }
   public int index() { return index - 1; }
   public boolean isFirst() { return index == 1; }
   public boolean isLast() { return index == array.length; }
}
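
To see how index() and originalIndex() drift apart once items are 
removed, here is a minimal runnable sketch. It is not the proposed 
implementation: ControlSketch and ControlSketchDemo are hypothetical 
names, the loop is written out by hand because the new syntax does not 
exist, and the sketch assumes index() is zero-based (as the int counter 
in Example 1 is).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the proposed Iterable control class.
final class ControlSketch<T> {
    private final Iterator<T> it;
    private int index;           // elements returned so far, excluding removed ones
    private int originalIndex;   // elements returned so far, including removed ones
    private boolean lastWasRemoved;

    ControlSketch(Iterator<T> it) { this.it = it; }

    boolean hasNext() { return it.hasNext(); }

    T next() {
        originalIndex++;
        if (lastWasRemoved) {
            lastWasRemoved = false;  // this item takes over the removed item's index
        } else {
            index++;
        }
        return it.next();
    }

    void remove() { it.remove(); lastWasRemoved = true; }
    int index() { return index - 1; }
    int originalIndex() { return originalIndex - 1; }
}

final class ControlSketchDemo {
    // Removes nulls and records "index/originalIndex=value" for each kept item.
    static List<String> trace(List<String> list) {
        List<String> out = new ArrayList<String>();
        ControlSketch<String> control = new ControlSketch<String>(list.iterator());
        while (control.hasNext()) {
            String str = control.next();
            if (str == null) {
                control.remove();
            } else {
                out.add(control.index() + "/" + control.originalIndex() + "=" + str);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(
            Arrays.asList("a", null, "b", null, null, "c"));
        System.out.println(trace(list));  // [0/0=a, 1/2=b, 2/5=c]
        System.out.println(list);         // [a, b, c]
    }
}
```

Under these assumptions index() tracks the position in the list as it 
stands after removals, while originalIndex() tracks the position in the 
list as it was when the loop started.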

Exception analysis:
The method remove() on the iteration control variable can throw an 
UnsupportedOperationException. However, this is no different from any 
other method call.

Definite assignment:
The new iteration control variable is a final variable that is 
definitely assigned from creation.

COMPILATION:
The extended enhanced for each loop is implemented by wrapping the two 
control classes around the Iterable or the array.

The Iterable design is desugared from:

   for (T item : iterable : control) { ... }

to:
   {
     IterableIterationControlIterator<T> $it =
         new IterableIterationControlIterator<T>(iterable);
     final IterableIterationControl<T> control = $it.control();
     while ($it.hasNext()) {
       T item = $it.next();
       ...
     }
   }

The array design is desugared similarly:

   {
     ArrayIterationControlIterator<T> $it =
         new ArrayIterationControlIterator<T>(array);
     final ArrayIterationControl<T> control = $it.control();
     while ($it.hasNext()) {
       T item = $it.next();
       ...
     }
   }
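
The array desugaring can be tried out in today's Java by writing the 
generated block by hand. The sketch below is only an illustration under 
assumptions: ArrayControlSketch, ArrayControlIteratorSketch and 
DesugarSketch are hypothetical names standing in for the proposed 
classes, and next() is assumed to return element 0 first.

```java
// Hypothetical stand-in for the proposed array control class.
final class ArrayControlSketch<T> {
    private final T[] array;
    private int index;  // number of elements returned so far
    ArrayControlSketch(T[] array) { this.array = array; }
    boolean hasNext() { return index < array.length; }
    T next() { return array[index++]; }
    int index() { return index - 1; }
    boolean isFirst() { return index == 1; }
    boolean isLast() { return index == array.length; }
}

// Hypothetical stand-in for the proposed wrapper that drives the loop.
final class ArrayControlIteratorSketch<T> {
    private final ArrayControlSketch<T> control;
    ArrayControlIteratorSketch(T[] array) {
        this.control = new ArrayControlSketch<T>(array);
    }
    boolean hasNext() { return control.hasNext(); }
    T next() { return control.next(); }
    ArrayControlSketch<T> control() { return control; }
}

final class DesugarSketch {
    // Hand-written equivalent of the proposed
    //   for (String str : array : control) { ...body... }
    static String join(String[] array) {
        StringBuilder buf = new StringBuilder();
        {
            ArrayControlIteratorSketch<String> $it =
                new ArrayControlIteratorSketch<String>(array);
            final ArrayControlSketch<String> control = $it.control();
            while ($it.hasNext()) {
                String str = $it.next();
                // loop body: uses index() and isLast() from the control
                buf.append(control.index()).append('=').append(str);
                if (!control.isLast()) {
                    buf.append(", ");
                }
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"x", "y", "z"}));  // 0=x, 1=y, 2=z
    }
}
```

Note that $it is a legal Java identifier, so the desugared form compiles 
as written once the two classes exist.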

There is the option to optimise this if the iteration control variable 
is not assigned to any other variable or passed to any other method. 
However, that is out of scope for now.

TESTING:
Testing will be similar to the enhanced for loop. Arrays and Iterables 
of various types and sizes will be used. The null expression will also 
be tested.

LIBRARY SUPPORT:
Yes, as detailed above.

REFLECTIVE APIS:
No.

OTHER CHANGES:
The javac tree API would need to be updated to model the change.

MIGRATION:
Migration is not required. However, an IDE refactoring could now convert 
more int and Iterator based for loops than it does at present.


*COMPATIBILITY*

BREAKING CHANGES:
No breaking changes are known using this conversion scheme.

EXISTING PROGRAMS:
This conversion is pure syntax sugar, so there are no known interactions 
with existing programs.


*REFERENCES*

EXISTING BUGS:
I searched the bug database, but nothing came up (which is surprising).

URL FOR PROTOTYPE:
None

DOCUMENTS:
[1] Stephen Colebourne's blog: 
http://www.jroller.com/scolebourne/entry/java_7_for_each_loop
[2] Stephen Colebourne's original writeup: 
http://docs.google.com/Edit?docID=dfn5297z_15ck7x5ghr


*DESIGN ISSUES*
There are numerous alternative ways in which this feature can be added. 
These include:
- using the keyword:
   for (String str : list) {
     for.remove();  // problem is nested for loops
   }
- using a label as a pseudo-variable:
   it: for (String str : list) {
     it:remove();  // note colon and not dot
   }
- using an additional clause before the for each colon:
   for (String str, int index : list) {
     // no access to remove, and conflicts with a possible for each for maps
   }

The chosen solution involves simple Java classes, and a simple desugar. 
The downside of the chosen solution is performance, as it involves 
creating two wrapping objects and routing hasNext() and next() via 
additional layers of method calls. A possible extension would be for the 
compiler to identify if the iteration control variable is not passed to 
another method. If so, then the code could all be generated inline.
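
As a rough idea of what such inline generation might look like, the 
sketch below replaces both wrapper objects with synthetic local 
variables. This is an assumption about one possible compiler output, not 
part of the proposal: InlineSketch and the $-prefixed locals are 
hypothetical, and index() is assumed to be zero-based.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class InlineSketch {
    // Possible inline expansion of the proposed loop
    //   for (String str : list : it) {
    //     if (str == null) { it.remove(); }
    //     else { out.add(it.index() + ":" + str); }
    //   }
    static List<String> run(List<String> list) {
        List<String> out = new ArrayList<String>();
        Iterator<String> $it = list.iterator();
        int $index = -1;                 // backs it.index()
        boolean $lastWasRemoved = false; // keeps the index steady after a removal
        while ($it.hasNext()) {
            String str = $it.next();
            if ($lastWasRemoved) {
                $lastWasRemoved = false;
            } else {
                $index++;
            }
            if (str == null) {
                $it.remove();            // was: it.remove()
                $lastWasRemoved = true;
            } else {
                out.add($index + ":" + str);  // was: it.index()
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("a", null, "b"));
        System.out.println(run(list));  // [0:a, 1:b]
        System.out.println(list);       // [a, b]
    }
}
```

No control object is allocated here, which is why the expansion is only 
valid when the iteration control variable never escapes the loop body.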




