Extermal references: Closures for the Java Programming Language (v0.5) a.k.a BGGA

Fri Feb 26 21:23:37 PST 2010

Alex asked me to include BGGA and other documents in messages on this list,
so that they may formally be considered contributions to project lambda.

Enclosed please find: Closures for the Java Programming Language (v0.5) also
known as BGGA
It may also be found at <http://www.javac.info/closures-v05.html>

*Changes in this revision: *

   - Changed the terminology slightly to emphasize the distinction between a
   closure literal and the result of evaluating it, which is a closure
   instance.
   - Corrected the with example.
   - Specified the class-file format for communicating throws type
   parameters.
   - Added grammar rules for throws type parameters and type arguments.
   - The *Future Directions* section has been moved to a separate *Open
   Issues <http://www.javac.info/issues-v05.html>* document.
   - This specification now describes only the *functional* version of the
   proposed language feature.
   - Added support for control abstractions that act as
loops<http://gafter.blogspot.com/2006/10/iterative-control-abstraction-user.html>by
binding the meaning of
   break and continue.
   - The package for system-generated function type interfaces is now
   java.lang.function.

Closures for the Java Programming Language (v0.5)

*Gilad Bracha, Neal Gafter, James Gosling, Peter von der Ahé*

Modern programming languages provide a mixture of primitives for composing
programs. Most notably Scheme, Smaltalk, Ruby, and Scala have direct
language support for parameterized delayed-execution blocks of code,
variously called *lambda*, *anonymous functions*, or *closures*. These
provide a natural way to express some kinds of abstractions that are
currently quite awkward to express in Java. For programming in the small,
anonymous functions allow one to abstract an algorithm over a piece of code;
that is, they allow one to more easily extract the common parts of two
almost-identical pieces of code. For programming in the large, anonymous
functions support APIs that express an algorithm abstracted over some
computational aspect of the algorithm. For example, they enable writing
methods that act as library-defined control
constructs<http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=889232>.

 Closure Literals

We introduce a syntactic form for constructing an anonymus function value:

*Primary:**ClosureLiteral**ClosureLiteral*:*{* *FormalParametersopt* *=>* *
BlockStatementsopt* *Expressionopt* *}*

Evaluating the closure literal expression results in a *closure instance *.
A closure instance is converted to some object type by a *closure conversion
*. In the nominal version of the specification, it is a compile-time error
if a closure literal appears in a context where it is not subject to a
closure conversion. In the functional version of the specification, if a
closure literal is not subject to a closure conversion it is converted to
the *corresponding function type* of the closure literal, which is the
function type with: identical argument types; a return type that is the type
of the final expression, if one exists, or java.lang.Unreachable if the
closure literal's body cannot complete normally, or void otherwise; and a
throws type list corresponding to the checked exception types that can be
thrown from the body of the closure literal. The conversion, in either
case,occurs entirely at compile-time.

*Example*: the following closure literal takes two integers and yields their
sum:

{int x, int y => x+y}

 A closure literal captures a block of code - the block statements and the
expression - parameterized by the closure literal's formal parameters. All
free lexical bindings - that is, lexical bindings not defined within the
closure literal - are bound at the time of evaluation of the closure literal
to their meaning in the lexical context in which the closure literal
appears. Free lexical bindings include references to variables from
enclosing scopes, and the meaning of this, break, continue, and return.
Evaluating the closure literal does not cause the statements or expression
to be evaluated, but packages them up at runtime with a representation of
the lexical context to be invoked later.

*Rationale*: One purpose for closure literals is to allow a programmer to
refactor common code into a shared utility, with the difference between the
use sites being abstracted into a closure literal by the client. The code to
be abstracted sometimes contains a break, continue, or return statement.
This need not be an obstacle to the transformation. One implication of the
specification is that a break or continue statement appearing within a
closure literal's body may transfer to a matching enclosing statement. A
return statement always returns from the nearest enclosing method or
constructor. A function may outlive the target of control transfers
appearing within it.

At runtime, if a break statement is executed that would transfer control out
of a statement that is no longer executing, or is executing in another
thread, the VM throws a new unchecked exception, UnmatchedNonlocalTransfer.
Similarly, an UnmatchedNonlocalTransfer is thrown when a continue statement
attempts to complete a loop iteration that is not executing in the current
thread. Finally, an UnmatchedNonlocalTransfer is thrown when a
returnstatement is executed if the method invocation to which the
return statement would transfer control is not on the stack of the current
thread.
Function Types

We introduce a syntactic form for a *function type*:

*Type:**FunctionType**FunctionType*:*{* *Typesopt* *=>* *Type* *Throwsopt* *
}* *Types:**Type
Type **,* *Types*

Informally, a function type describes the set of functions that accept a
given list of argument types, result in a value of the given type, and may
throw the indicated checked exception types.

*Example*: the following assigns to the local variable plus a function that
computes the sum of its two int arguments:

{int,int=>int} plus = {int x, int y => x+y};

 A function type is translated into an instantiation of a generic
system-generated interface type with a single method. The interface has a
type parameter for each argument type and return type that is a reference
type, and one additional type parameter for the exception signature. The
generic instance of the interface used to represent a function type has
covariant wildcards for the type arguments representing the return type and
exception type, and contravariant wildcards for function argument types,
except when a function type is used in the context of a declared supertype,
in which case the type arguments appear without wildcards. *[This paragraph
is a placeholder for a more precise description, including the name of the
interface, required for interoperability among compilers. We expect the
generated interfaces to be in the package java.lang.function, which will be
closed to user-defined code, and the naming scheme to be an extension of the
JVM method signature scheme.] *

*Example*. The variable declaration

{int,String=>Number throws IOException} xyzzy;

is translated into

interface Function1<R,A2,throws E> { // system-generated
    R invoke(int x1, A2 x2) throws E;
}
Function1<? extends Number,? super String,? extends IOException> xyzzy;

With the exception that the generated interface name and parameter names
would be synthetic, and the compiler, runtime system, and debuggers should
display these types using the function type syntax rather than in terms of
the translation involving generics and wildcards.

Because a function type *is* an interface type, it can be extended by a
user-defined interface and it can be implemented by a user-defined class.

 This translation scheme causes a function type

{ *T0* ... *Tn* => *Tr* throws *E0*, ... *Em* }

to be defined as a system-generated interface type with a single abstract
method

*Tr* invoke(*T0* x0, ... *Tn* xn) throws *E0*, ... *Em*;

that obeys the following subtyping relationship:

A function type { *T0* ... *Tn* => *Tr* throws *E0*, ... *Em* } is a subtype
of function type { *U0* ... *Ul* => *Ur* throws *X0*, ... *Xk* } iff all of
the following hold:

   - Either
      - *Tr* is the same as *Ur* or
      - Both are reference types and *Tr* is a subtype of *Ur*;
   - *n *is equal to *l*;
   - for every* i* in *0..n*, either *Ti* and *Ui* are the same types or
   both are reference types and* Ui* is a subtype of*Ti*; and
   - for every *j* in *0..m*, there exists an *h* in *0..k* such that
*Ej*is a subtype of
   *Xh*.

In English, these rules are

   - Function types may have covariant returns;
   - A subtype function must accept the same number of arguments as its
   supertype;
   - Function types may have contravariant argument types; and
   - The subtype function may not throw any checked exception type not
   declared in the supertype.

While the subtype rules for function types may at first glance appear
arcane, they are defined this way for very good reason: this is the
classical *arrow rule*, which has proven itself indispensible in languages
that offer support for higher-order programming. And while the rules seem
complex, function types do not add complexity to Java's type system because
function types and their subtype relations can be understood as a
straightforward application of generics and wildcards (existing constructs).
>From the programmer's perspective, function types just "do the right thing."

 Closure Conversion

A closure literal may be assigned to a variable or parameter of any
compatible interface type by the *closure conversion*:

There is a *closure conversion* from a closure literal to every interface
type that has a single method *m* such that the closure literal is
compatible with* m*. A closure literal is *compatible with* a method *m* iff
all of the following hold:

   - Either
      - The closure literal has no final expression and the method *m* has
      return type void or java.lang.Void; or
      - The closure literal has a final expression and there is an
      assignment conversion from its type to the return type of *m*; or
      - The body of the closure literal cannot complete normally.
   - *m* has the same number of arguments as the closure literal.
   - For each corresponding argument position in the argument lists**, both
   argument types are the same.
   - Every exception type that can be thrown by the body of the closure
   literal is a subtype of some exception type in the throws clause of *m*.

The closure conversion supports converting a function that yields no result
to an interface whose method returns java.lang.Void. In this case a
nullvalue is returned from the function. This is necessary to write
APIs that
support *completion
transparency<http://gafter.blogspot.com/2006/11/closures-esoterica-completion.html>
*: that is, in which an invocation of the API can return normally if and
only if the function that is passed to it can complete normally.

If the target of the closure conversion extends the marker interface
java.lang.RestrictedFunction then

   - it is a compile-time error if the function being converted contains a
   break, continue or return statement whose target is outside the closure
   literal; and
   - it is a compile-time error if the function being converted refers to a
   non-final local variable declared in an enclosing scope.

 The closure conversion to a "restricted" interface only allows functions
that obey the current restrictions on anonymous instances. The motivation
for this is to help catch inadvertent use of non-local control flow in
situations where it would be inappropriate. Examples would be when the
function is passed to another thread to run asynchronously, or stored in a
data structure to be invoked later.

*Example*: We can create a closure instance that adds two to its argument
like this:

interface IntFunction {
    int invoke(int i);
}
IntFunction plus2 = {int x => x+2};

 Alternatively, using function types:

{int=>int} plus2 = {int x => x+2};

 We can use the existing Executor framework to run a closure instance in the
background:

void sayHello(java.util.concurrent.Executor ex) {
    ex.execute({=> System.out.println("hello"); });
}

 Exception type parameters

To support exception transparency (that is, the design of control-invocation
APIs in which the compiler can infer that exceptions thrown by the closure
are thrown by the API), we add a new type of generic formal type argument to
receive a set of thrown types. *[This deserves a more formal treatment]*

*TypeParameter:*throws* TypeVariable* *TypeBoundopt**ActualTypeArgument*:*
throws* *ExceptionList* *ExceptionList*:*ReferenceType*
*ReferenceType** **|* *ExceptionList*

What follows is an example of how the proposal would be used to write an
exception-transparent version of a method that locks and then unlocks a
java.util.concurrent.Lock, before and after a user-provided block of code. In
the nominal version of the proposal:

interface Block<T,throws E> {
    T invoke() throws E;
}
public static <T,throws E extends Exception>
T withLock(Lock lock, Block<T,E> block) throws E {

    lock.lock();
    try {
        return block.invoke();
    } finally {
        lock.unlock();
    }
}

Or using the functional version of the proposal:

public static <T,throws E extends Exception>
T withLock(Lock lock, {=>T throws E} block) throws E {
    lock.lock();
    try {
        return block.invoke();

    } finally {
        lock.unlock();
    }
}

This can be used as follows:

withLock(lock, {=>
    System.out.println("hello");
});

 This uses a newly introduced form of generic type parameter. The type
parameter E is declared to be used in throws clauses. If the extends clause
is omitted, a type parameter declared with the throws keyword is considered
to extend Exception (instead of Object, which is the default for other type
parameters). This type parameter accepts multiple exception types. For
example, if you invoke it with a block that can throw IOException or
NumberFormatException the type parameter E would be given as
IOException|NumberFormatException. In those rare cases that you want to use
explicit type parameters with multiple thrown types, the keyword throws is
required in the invocation, like this:

Locks.<throws IOException|NumberFormatException>withLock(lock, {=>
    System.out.println("hello");
});

 You can think of this kind of type parameter accepting disjunction, "or"
types. That is to allow it to match the exception signature of a function
that throws any number of different checked exceptions. If the block throws
no exceptions, the type argument would be the type null (see below).

Type parameters declared this way can be used only in a throws clause or as
a type argument to a generic method, interface, or class whose type argument
was declared this way too.

Throws type parameters are indicated in class files by the presence of the
character '^' preceding the type parameter.

We introduce a meaning for the keyword null as a type name; it designates
the type of the expression null. A class literal for the type of null is
null.class. These are necessary to allow reflection, type inference, and
closure literals to work for functions that result in the value null or that
throw no checked exceptions. We also add the non-instantiable class
java.lang.Null as a placeholder, and its static member field TYPE as a
synonym for null.class.
 Definite assignment

The body of a closure literal may not assign to a final variable declared
outside the closure literal.

A closure literal does not affect the DA/DU status of any free variables it
names.

A f ree variable referenced inside a closure literal receives its initial DA
state from the DA state of the variable at the point where the closure
literal appears.
 The type Unreachable

We add the non-instantiable type java.lang.Unreachable. Values of this type
appear where statements or expressions cannot complete normally. This is
necessary to enable writing APIs that provide
transparency<http://gafter.blogspot.com/2006/11/closures-esoterica-completion.html>for
functions that do not complete normally.
Unreachable is a subtype of every obect type. The null type is a subtype of
Unreachable. At runtime, a NullPointerException is thrown when a value of
type Unreachable would appear. This occurs when a programmer converts nullto
Unreachable.

 *Example*: The following illustrates a function being assigned to a
variable of the correct type.

interface NullaryFunction<T, throws E> {
    T invoke() throws E;
}
NullaryFunction<Unreachable,null> thrower = {=> throw new AssertionError(); };

 Reachable statements

The initial statement of a closure literal is reachable.

An expression statement in which the expression is of type
Unreachablecannot complete normally.
Control invocation syntax

A new invocation statement syntax is added to make closures convenient for
control abstraction:

*ControlInvocationStatement*:for*opt Primary* (* FormalParameters *: *
ExpressionListopt* ) *Statement*
for*opt Primary* (* ExpressionListopt* ) *Statement*

 This syntax is a shorthand for the following statement:

*Primary* ( *ExpressionList*, {* FormalParametersopt** *=> *Statement* } );

(See the later section *loop abstractions* for the meaning of the optional
for keyword) This syntax makes some kinds of function-receiving APIs more
convenient to use to compose statements.

Note: There is some question of the correct order in the rewriting. On the
one hand, the function seems most natural in the last position when not
using the abbreviated syntax. On the other hand, that doesn't work well with
varargs methods. Which is best remains an open issue.

 We could use the shorthand to write our previous example this way

withLock(lock) {
    System.out.println("hello");
}

 Rationale: This is not an expression form for a very good reason: it looks
like a statement, and we expect it to be used most commonly as a statement
for the purpose of writing APIs that abstract patterns of control. If it
were an expression form, an invocation like this would require a trailing
semicolon after the close curly brace of a controlled block. Forgetting the
semicolon would probably be a common source of error. The syntax is
convenient for both synchronous (e.g. see withLock) and asynchronous (e.g.
see Executor.execute) use cases.

Another example of its use would be a an API that closes a
java.io.Closeableafter a user-supplied block of code, discarding any
exception from
Closeable.close:

class OneArgBlock<R, T, throws E> {
    R invoke(T t) throws E;
}
<R, T extends java.io.Closeable, throws E>
R with(T t, OneArgBlock<R, ? super T,E> block) throws E {
    try {

        return block.invoke(t);
    } finally {
        try { t.close(); } catch (IOException ex) {}
    }
}

 Or using the functional version:

<R, T extends java.io.Closeable, throws E>
R with(T t, {T=>R throws E} block) throws E {
    try {
        return block.invoke(t);
    } finally {
        try { t.close(); } catch (IOException ex) {}

    }
}

 We could use the shorthand with this API to close a number of streams at
the end of a block of code:

with(FileReader in : makeReader()) with(FileWriter out : makeWriter()) {
    // code using in and out
}

Loop Abstractions

The specification supports writing methods that act as control statements,
but when used to support loops the programmer should be able to specify that
the statement should capture the meaning of break and continue in a way
analogous to the built-in loop statements. For this purpose we allow
the forkeyword at the beginning of a control invocation statement (see
above), and
introduce the use of the keyword for as a modifier on method declarations:

*MethodDeclarator*:for*opt* *Identifier* ( *FormalParamterListopt* )

An overriding or implementing method must be declared with the keyword
forif and only if the method being overridden or implemented was
declared with
the keyword for.

A control invocation statement must use the for keyword if and only if the
method being invoked was declared with the keyword for.

Within the controlled statement of a control invocation statement using the
keyword for:

   - An unlabelled break statement breaks from the control invocation
   statement.
   - An unlabelled continue statement breaks from the controlled statement.

 *Example*: The following illustrates a method used to loop through the
key-value pairs in a map. The method

<K,V,throws X>
void for eachEntry(Map<K,V> map, {K,V=>void throws X} block)
        throws X {
    for (Map.Entry<K,V> entry : map.entrySet()) {
        block.invoke(entry.getKey(), entry.getValue());

    }

}

allows us to rewrite this

void test(Map<String,Integer> map) {
    for (Map.Entry<String,Integer> entry : map.entrySet()) {
        String name = entry.getKey();
        Integer value = entry.getValue();

        if ("end".equals(name)) break;
        if (name.startsWith("com.sun.")) continue;
        System.out.println(name + ":" + value);
    }

}

like this

void test(Map<String,Integer> map) {
    for eachEntry(String name, Integer value : map) {
        if ("end".equals(name)) break;
        if (name.startsWith("com.sun.")) continue;

        System.out.println(name + ":" + value);
    }

}

 Acknowledgments

The authors would like to thank the following people whose discussions and
insight have helped us craft, refine, and improve this proposal:

* C. Scott Ananian, Lars Bak, Cedric Beust, Joshua Bloch, Martin Buchholz,
Danny Coward, Erik Ernst, Rémi Forax, Christian Plesner Hansen, Kohsuke
Kawaguchi, Danny Lagrouw, Doug Lea,"crazy" Bob Lee, Martin Odersky, Tim
Peierls, Cris Perdue, John Rose, Ken Russell, Guy Steele, Mads Torgersen,
Jan Vitek, *and* Dave Yost.*

------------------------------
[image: Creative Commons
License]<http://creativecommons.org/licenses/by-sa/3.0/us/>
Closures for Java by Gilad Bracha, Neal Gafter, James Gosling, Peter von der
Ahé <http://www.javac.info/> is licensed under a Creative Commons
Attribution-Share Alike 3.0 United States
License<http://creativecommons.org/licenses/by-sa/3.0/us/>.