stack-bound vs. travelling closures proposal.
Reinier Zwitserloot
reinier at zwitserloot.com
Wed Jun 16 21:18:55 PDT 2010
From a private email conversation I've been informed this proposal was hard
to find.
I've also updated it slightly.
PROPOSAL: Restricted vs. Portable closures, with regard to Exception
Transparency.
# Example Use-Case
This use-case directs the remainder of the proposal.
In Java 6, we have:
Collections.sort(List a, Comparator c), as well as:
new TreeSet(Comparator c)
It would be nice if Collections.sort did indeed support exception
transparency, letting a user throw a checked exception from within the
comparator and handle it from outside the sort() call. However, doing the
same for TreeSet is clearly impossible without breaking backwards
compatibility: if the Comparator can throw, say, IOException, then
TreeSet's add and addAll methods, at the very least, must also declare that
they can throw this exception. However, doing so is not compatible with the
java.util.Set interface.
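To make the conflict concrete in today's Java (ThrowingSet is a made-up name,
and this snippet deliberately does NOT compile, which is exactly the problem):

// Set.add(E) declares no checked exceptions, so an override cannot add one.
interface ThrowingSet<E> extends Set<E> {
    @Override
    boolean add(E e) throws IOException; // error: add(E) in Set does not throw IOException
}

And even if it did compile, every caller holding a plain java.util.Set
reference would be oblivious to the new checked exception anyway.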
# Closure Literals
There are portable closures, which have no usage restrictions of any kind;
they are truly portable code blocks, essentially a simpler syntax for today's
anonymous inner classes. They cannot capture mutable variables, do not and
will never support long return/break/continue, and have no exception
transparency of any kind, but you can store them for later, move them to
another thread, and otherwise treat them as you would any other object.
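For instance (using the proposal's illustrative closure-literal syntax; the
exact notation is not the point), an unrestricted closure is just an object:

// Stored in a variable and handed to TreeSet, which keeps it around
// indefinitely - fine for an unrestricted closure, but it gets no exception
// transparency and cannot capture mutable locals.
Comparator<String> byLength = #(String a, String b)(a.length() - b.length());
Set<String> set = new TreeSet<String>(byLength);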
The other kind of closure is a restricted closure: the compiler will ensure
that if the closure is run at all, it is run while the enclosing stack frame
is still live, so the runtime stack matches the lexical scope. Such a closure
can capture any variable, mutable or not (threading cannot possibly be an
issue, by definition: the stack is still there, so it must be the same
thread), will support transfers (long break/continue/return) when Java 8
rolls around, and has automagical exception transparency; it just works, with
no need for a variadic type parameter.
By example:
TreeSet's constructor remains as it is now, as by default all parameters are
NOT restricted. Exception transparency for TreeSet's Comparator will not be
supported, but we've already seen that supporting it is not feasible anyway.
For Collections.sort, on the other hand, we mark it as restricted, like so:
public static <T> void sort(List<T> a, do Comparator<? super T> c) {
...
}
note the "do" keyword there. It can be any keyword, such as "restricted",
that's a detail that can be decided on later. What this does is two things:
First, the compiler will ensure that the "c" variable is not assigned to
anything (be it a field or a local variable), is not captured by a
non-restricted closure, is not captured by a method-local class, is not
returned, and is not passed as an argument to another method (with one
exception: it's okay to pass c as a closure to another method that also has a
'do' keyword for that parameter). Second, that parameter is marked (via an
extended field) as supporting a restricted closure.
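A sketch of those checks, again using the proposed 'do' keyword (sort, helper
and saved are stand-in names); every commented-out line is something the
compiler would have to reject:

static Object saved;

public static <T> void sort(List<T> a, do Comparator<? super T> c) {
    // saved = c;                      // rejected: assigned to a field
    // Comparator<? super T> c2 = c;   // rejected: assigned to a local variable
    // new Thread(#() { c.compare(null, null); }).start();
    //                                 // rejected: captured by an unrestricted closure
    helper(a, c);                      // fine: helper's parameter is also marked 'do'
}

public static <T> void helper(List<T> a, do Comparator<? super T> c) { ... }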
Then, IF a closure literal is passed directly as the argument, one gets
hassle-free exception transparency, and one can legally write this:
public void test() throws IOException {
    Collections.sort(someList, #(String a, String b) {
        if (a == null) throw new IOException();
        return a.length() - b.length();
    });
}
This works because the compiler first realizes that the closure throws an
IOException which isn't caught/declared anywhere in its own lexical scope,
but IS caught/declared in its containing scope, so the exception is marked as
"legal, but only if restricted". During the resolution phase, the compiler
figures out that this closure is being passed in the position of a restricted
parameter and compiles it without complaining about an unhandled checked
exception. The JVM already supports this: checked exceptions are checked by
javac; neither the JVM nor the class verifier cares (see any alternative JVM
language - they all let you throw checked exceptions without declaring that
you do).
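A variant of the same example (same assumed syntax), catching the exception
in the enclosing scope instead of declaring it, works for exactly the same
reason:

public void test() {
    try {
        Collections.sort(someList, #(String a, String b) {
            if (a == null) throw new IOException();
            return a.length() - b.length();
        });
    } catch (IOException e) {
        // the enclosing frame is still on the stack when the comparator runs,
        // so the exception simply propagates up to this catch block
    }
}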
There's only one caveat: The author of the Collections.sort method may
presume that any call to Comparator.compare() couldn't possibly throw, say,
an IOException, because "compare()" does not declare this. And that would be
a mistake - this entire concept is predicated on the notion that any checked
exception not explicitly declared by that closure falls "through" all
intermediate methods right back to the original lexical scope, at which
point it'll be handled. This fall-through scheme can only work if none of
the intermediate methods catch the exception.
Example:
public static <T> void sort(List<T> a, do Comparator<? super T> c) {
    try {
        c.compare(null, null);
        new FileInputStream("foo.bar");
    } catch (IOException e) {
        // I'll just presume this must have been the FileInputStream!
    }
}
would be a coding mistake - it'll fail because it'll catch any transparent
IOExceptions from the closure as well. However, this is in practice not a
problem at all, for two reasons:
(1) Any Java program that works like the above and assumes the IOException
must have come from the FileInputStream is already buggy. You're coding for
the JVM, not just for programs written in Java the language; class files
produced by jython, scalac, jruby, and just about any other alternative
language routinely throw checked exceptions "sneakily". Even within the
constraints of Java itself, sneaky exceptions can happen: if not explicitly
(via e.g. someClass.newInstance(), which can throw sneakily), then
implicitly, by mixing class files from different compile runs. Thus, this
isn't really a new caveat.
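For example, this is legal Java 6 today (Boom and demo are made-up names),
and demo()'s caller receives a checked IOException that is declared nowhere
in sight:

class Boom {
    Boom() throws IOException { throw new IOException("sneaky"); }
}

static void demo() throws InstantiationException, IllegalAccessException {
    // Class.newInstance() propagates whatever the constructor throws,
    // checked or not, without wrapping it.
    Boom.class.newInstance();
}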
(2) Existing code will NEVER be bitten by this. The author has to step in and
add a do keyword to one of the parameters, and in doing so the author assumes
the responsibility of checking that his code doesn't make this assumption.
Such a system is also backwards compatible. You can never remove a "do"
keyword from a signature, but you can add it. One could for example add the
'do' keyword to the Comparator parameter in Collections.sort.
End of Official Proposal.
# Clarifications
Just to highlight how exception transparency is completely free in this
system, at least for restricted closures: by definition the original stack is
still intact, so whatever construct (a catch block or a throws clause) made
it legal to throw a given checked exception at the lexical location where the
closure is defined is still "live" on the stack. Exceptions, by definition,
walk up the stack until they find the nearest handler. Assuming all code in
between the closure's definition and its execution leaves the exception
alone, it will always end up at a point where it is handled (or thrown
onwards, explicitly declared with a "throws" clause, which is fine).
# Restricted closures for fork/join and ParallelArray.
This proposal only adds exception transparency for restricted closures;
unrestricted closures don't get any exception transparency. But just about
every use of unrestricted closures in libraries I can think of doesn't really
need it; see for example new TreeSet(Comparator). The rare case where
exception transparency would be useful is mostly focused around ParallelArray
and fork/join: parallelized operations which freeze the caller thread until
they complete. This case isn't supported well, but it isn't hopeless either:
A. Add an 'unshackle' utility method that takes any parameter (even a
restricted one) and returns it. There's no way to write such a method with
javac (returning a restricted parameter is not legal), but as restriction is
purely a concept enforced by javac, it is possible to produce a class file
containing this method. It could be part of the standard libraries, and it
should clearly document the process of transporting exceptions.
B. Add a sneaky-throw method to the same library; a rough sketch of both
helpers follows.
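The names unshackle and sneakyThrow come from the items above; everything
else here (including the ClosureUtil class) is an assumption. unshackle
cannot be expressed in Java source at all, so only its intended shape is
shown; sneakyThrow can be written today with the usual erasure trick:

public final class ClosureUtil {
    // A: identity at runtime; its only job is to launder a restricted closure
    // into an unrestricted one. Javac would refuse to compile this body, so
    // the real class file would have to be generated directly.
    public static <T> T unshackle(do T restricted) {
        return restricted; // illegal in source, fine at the bytecode level
    }

    // B: rethrow any Throwable without declaring it.
    public static void sneakyThrow(Throwable t) {
        ClosureUtil.<RuntimeException>sneakyThrow0(t);
    }

    @SuppressWarnings("unchecked")
    private static <X extends Throwable> void sneakyThrow0(Throwable t) throws X {
        throw (X) t; // the cast to X is erased, so t escapes unchecked
    }
}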
fork/join can then be written to still support restricted closures, though
at the cost of less help from the type system:
// Intentionally stupid implementation, its only purpose is to show that the
// above will suffice to create a somewhat hacky way to make PA/forkjoin work
// with restricted closures.
public static <T> void forEachParallelized(List<T> list, do #void(T) restricted) {
    ThreadPool pool = new ThreadPool(10);
    #void(T) unrestricted = unshackle(restricted);
    for (int i = 0; i < list.size(); i++) {
        pool.get(i % 10).addToQueue(unrestricted, list.get(i));
    }
    pool.start();
    pool.join();
    for (int i = 0; i < 10; i++) {
        Throwable t = pool.getThrowableFor(i);
        if (t != null) sneakyThrow(t);
    }
}
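From the caller's side this still reads like any other restricted-closure
call (same assumed syntax as before; checkAll is a made-up name): the checked
exception thrown inside the closure on a worker thread surfaces again after
pool.join() via sneakyThrow, and can be declared or caught at the definition
site as usual:

public void checkAll(List<File> files) throws IOException {
    forEachParallelized(files, #(File f) {
        if (!f.exists()) throw new IOException("missing: " + f);
    });
}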
Another use case involves delayed execution. This use-case isn't supported
at all with this proposal. You could not do something along the lines of:
DelayedOperation<Integer> op = new DelayedOperation<Integer>(
        #Integer() (reallySlowRandomizerThatThrowsIOExceptions.nextInt(20)));
op.begin();
// Do something else that'll take a while.
try {
    System.out.println(op.get());
} catch (IOException e) {
    System.out.println("Hardware randomness source has been disconnected.");
}
--Reinier Zwitserloot