PROPOSAL: fold keyword

Gabriel Belingueres belingueres at gmail.com
Thu Mar 5 13:05:28 PST 2009


Hi,

This is a proposal for adding a simple "fold" operation on
Collections. It is in a very draft state IMO, and perhaps it was
already discussed in the process of adding closures to the language.
Here I send it to you, nevertheless.

Regards,
Gabriel

PROJECT COIN SMALL LANGUAGE CHANGE PROPOSAL FORM v1.0

AUTHOR(S): Gabriel Belingueres

OVERVIEW

Provide a "fold" sentence to Java for non trivial iterations over
Collections. See
http://en.wikipedia.org/wiki/Fold_(higher-order_function)

FEATURE SUMMARY:

Add the ability to support a form of "advanced for statement", to
capture typical collection iteration patterns involving tests on some
condition to break the iteration, or calculating a single value from
the collection elements, assigning a default value when the collection
is empty.
The new "fold" keyword acts like a sugared, moderately complex
enhanced for keyword that abstract this typical processing on
collections.

MAJOR ADVANTAGE:

Makes the coding of not so trivial operations on collections more readable.

MAJOR BENEFIT:

Readability: Makes clearer the intention of the programmer when
iterating over a collection to perform those typical but non trivial
iterations.

MAJOR DISADVANTAGE:

Some increased implementation and testing complexity for the compiler.
Add a new keyword for the language "fold".

ALTERNATIVES:

There are some free libraries like Apache's common-collection that
helps in abstracting some aspects of complex collection iteration,
however this library in particular is targeted for older java versions
with no support for generics (others may do though). Libraries would
need to pass the body of the for loop as an object, requiring to
create an anonymous class, or making use of reflection to call the
appropriate statements as if it where the body of the loop, making the
implementation either less readable or less efficient, respectively.

The full fledged alternative would be to add closures to Java (but
closures seems to be not targeted for Java 7 AFAIK.)

EXAMPLES

SIMPLE EXAMPLE:

 List<Integer> list = ArrayList<Integer>();
 l.add(...);
 // sum all numbers in the list
 Integer sum =
   fold(Integer n : list;     // iterating collection element n
        Integer result = 0) { // Where to accumulate the result, and
the initial value
     result += n;
   };
 System.out.println(sum);

ADVANCED EXAMPLES:

 // returns true if all salaries of the employee collection are lower
than 1000 or the collection is empty.
 List<Employee> emps = ....;
 Boolean allLower =
   fold(Employee emp : emps;
        Boolean result = Boolean.TRUE;
        !result) {  // break condition: must be a boolean expression
     result = (emp.getSalary().compareTo(new BigDecimal(1000) < 0);
   }

 // returns a new collection with the salaries of the employees not on vacation
 List<Employee> emps = ...;
 List<BigDecimal> salaries =
   fold(Employee e : emps;
        List<BigDecimal> newList = new ArrayList<BigDecimal>()) {
     if (!e.onVacation()) {
       newList.add(e.getSalary());
     }
   }

 // returns the first N employees whose sum of salaries are lower than 1000000
 List<Employee> emps = ...;
 BigDecimal sum = BigDecimal.ZERO; // sum the salaries
 List<Employee> selectedEmps =
   fold(Employee e : emps;
        List<Employee> result = new ArrayList<Employee>();
        sum.add(e.getSalary()).compareTo(new BigDecimal(1000000)) < 0) {
        result.add(e);
        sum = sum.add(e.getSalary());
   };


DETAILS
SPECIFICATION:
The following is somewhat informal, making emphasis in this draft
document on the main idea instead of in formal grammar and JLS issues.

SYNTAX:

foldStatement:
   fold(VariableModifiers_opt Type Identifier: Expression;
LocalVariableDeclaration; Expression_opt) Statement

The first part is similar to what is required for the enhanced "for" keyword.
The second part is a variable declaration that MUST be initialized
with the value returned when the collection is empty. (It seems to me
that it doesn't make any sense if the variable is declared "final".)
Should be limited in scope to the fold body (to align the behavior
with the for keyword).
The third part is an optional boolean expression which acts as a break
condition.

The fold returns an object that is to be assignable from the same
static type declared in the second part, which can be assigned to a
variable or used as an expression.

The behavior when the collection is null, or in the presence of
exceptions should follow the same criteria than the enhanced for loop.

COMPILATION:

Here are desugarings to currently legal Java source for the two examples above:

   // simple example
   Integer sum;
   { // new scope for synthetic variables
     Integer $result = 0;
     if (!list.isEmpty()) { // NPE if collection is null
       for(Integer n : list) {
         $result += n;
       }
     }

     sum = $result;
   }

   // first advanced example (with break condition)
   Boolean allLower;
   { // new scope for synthetic variables
     Boolean $result = Boolean.TRUE;
     if (!emps.isEmpty()) { // NPE if collection is null
       for(Employee e : emps) {
         if (!$result)
           break;
         $result = (emp.getSalary().compareTo(new BigDecimal(1000));
       }
     }

     allLower = $result;
   }


TESTING:

Generating various simple and complex uses of the new structure and
verifying the proper execution results; combinations to test include
fold statements with and without break conditions, tests with specific
collection types such as Lists, Sets, Maps. Can expand on the enhanced
for tests for further testing.

LIBRARY SUPPORT:

The same libraries that needs to be supported by the enhanced for loop.

REFLECTIVE APIS:

None of core reflection, javax.lang.model.*, the doclet API, and JDPA
model statements; therefore, they are unaffected (AFAIK). The tree API
in javac, does model statements, but maybe we can adapt the existing
API for enhanced for statements.

OTHER CHANGES:

No.

MIGRATION:

Look for collection iterations with nested break conditions or element
processing returning a single value.

COMPATIBILITY
BREAKING CHANGES:

All existing programs remain valid.

EXISTING PROGRAMS:

The semantics of existing class files and legal source files are
unchanged by this feature.

REFERENCES
EXISTING BUGS:

None that I know about (doing a quick search on the bug database.)

URL FOR PROTOTYPE (optional):

No prototype at this time.



More information about the coin-dev mailing list