Exception transparency

Reinier Zwitserloot reinier at zwitserloot.com
Mon Jun 7 21:53:42 PDT 2010


Making the exceptions transparent by encoding them in the type system is
going to be very unwieldy, even if a "throws T" variadic type parameter
concept is introduced. I've got a much better idea: Escape detection on
closures. Any closures that are guaranteed to run such that the runtime
stack matches the lexical stack can get exception transparency completely
free, no new syntax and no new concepts required; it would "just work".

Let's take Comparator for example. Currently it is used in two contexts;
Collections/Arrays.sort, as well as for the TreeSet constructor. It would
make perfect sense for the sort variant to use the proposed variadic type
parameter concept to make any exceptions thrown as part of the comparison
transparent. Side-stepping for a moment the general shambles that occurs
when trying to integrate functional types into existing library code, this
would perhaps look something like so:

public static <T, throws E> void sort(List<T> a, Comparator<? super T, E> c)
throws E {
 ...
}

Just look at that. Generics in libraries already has a tendency to look like
gobbledygook and the above is not helping things at all. Even if we abandon
Comparator here and use a straight closure type like so:

public static <T, throws E> void sort(List<T> a, #int(T, T)(throws E) c)
throws E {
  ...
}

isn't an improvement.


But, let's stick with it, and inspect what happens to TreeSet.

Now we run into even bigger trouble; applying a (variadic or monadic,
doesn't matter) exceptions type parameter here is simply not possible. We'd
have to add "throws E" to at the very least TreeSet's add and addAll
methods, but we clearly can't; both of those methods come from the Set
interface and those don't have any thrown exceptions declared today, so,
that won't work. Even in a hypothetical world where we can break backwards
compatibility we still can't pragmatically replace Set's current add method
with: "public boolean add(T item) throws E" - because what's that E even
stand for? In e.g. a HashSet, and most other sets, that "E" makes no sense
at all. Its a unique aspect of TreeSet. We could simply not let TreeSet
implement Set but that feels completely wrong.


The most obvious solution then appears to be one of three options:

(A) don't try; leave sort and TreeSet untouched and do not add exception
transparency to sort. But that would be a shame.
(B) overload Collections.sort with the variadic type parameter based
exception transparency. Leave TreeSet untouched. But now we have two
incompatible "comparator" concepts, so this too would be a weak solution.
(C) hack it together by letting TreeSet rewrap any checked exceptions thrown
during add/addAll into a custom unchecked class, akin to
InvocationTargetException or ExecutionException). But that's the very
ugliness we're trying to avoid!

None of these feel like good solutions. Even going with the most likely one,
option B, that still leaves us with this brain twiddler:

public static <T, throws E> void sort(List<T> a, #int(T, T)(throws E) c)
throws E {
  ...
}

One of the nice things of static typing is that the signature all by itself
is usually enough documentation for a library user to know exactly what to
do. But with signatures like those, I'm not sure that'll remain true. It's
also highly redundant; creating new boilerplate is something we ought to try
and avoid.

So, here's my alternative:

The first part: Create two separate kinds of closures, similar to a later
addendum to the BGGA proposal concerning "safe" and "unsafe" closures.

There are portable closures which have no restrictions of any kind but which
are truly like portable code blocks; a simpler syntax for today's anonymous
inner classes. They cannot capture (mutable) variables, do not and will
never support long return/break/continue, have no exception transparency of
any kind, but you can store them for later, move them to another thread, and
otherwise treat them as one would any random object.

The other kind of closure is a restricted closure: The compiler will ensure
that if the closure is run at all, it is run in such a way that the runtime
stack matches lexical scope. Such a closure can capture any variable,
mutable or not (because threading cannot possibly be an issue, by definition
- the stack is still there, so it must be the same thread), when java 8
rolls around, will support transfers (long break/continue/return), and has
automagical exception transparency; it just works, no need for a variadic
type parameter.

Here's how this might work:

TreeSet's constructor remains as it is now, and by default all closures are
NOT restricted. Exception transparency for TreeSet's Comparator will not be
supported, but we've already seen that supporting it is not feasible anyway,
so that's a good result. For Collections.sort, on the other hand, we mark it
as restricted, like so:

public static <T> void sort(List<T> a, do Comparator<? super T> c) {
    ...
}

note the "do" keyword there. It can be any keyword, such as "restricted",
that's a detail that can be decided on later. What this does is two things:
First of all, the compiler will ensure that the "c" variable is not assigned
to anything (be it a field or a local variable), is not captured by a
non-restricted closure, is not captured by a method local class, is not
returned, and is not passed as variable to another method (with one
exception: It's okay to pass c as a closure to another method that also has
a 'do' keyword for that parameter). Secondly, that parameter is marked (via
an extended field) as supporting a restricted closure.


Then, IF a closure is passed directly as parameter, one gets hassle-free
exception transparency, and one can legally write this:

public void test() throws IOException {
    Collections.sort(someList, #(String a, String b) {
        if (a == null) throw new IOException();
        return a.length() - b.length();
    });
}

It'll work because the compiler first realizes that the closure throws an
IOException which isn't caught/declared anywhere in its own lexical scope,
but it IS caught/declared in its containing scope, thus the exception itself
is marked as "legal, but only if restricted". During the resolution phase,
the compiler figures out that this closure is being passed in the position
of a restricted parameter and just compiles it without complaining about not
handling a checked exception. The JVM already supports this (checked
exceptions are checked by javac, the jvm or the class verifier don't care -
see any alternative JVM language, which all let you throw checked exceptions
without declaring that you do).

There's only one caveat: The author of the Collections.sort method may
presume that any call to Comparator.compare() couldn't possibly throw, say,
an IOException, because "compare()" does not declare this. And that would be
a mistake - this entire concept is predicated on the notion that any checked
exception not explicitly declared by that closure falls "through" all
intermediate methods right back to the original lexical scope, at which
point it'll be handled. That'll work, but only if all those intermediate
methods don't touch that checked exception. So, something like this:

public static <T> void sort(List<T> a, do Comparator<? super T> c) {
    try {
        c.compare(null, null);
        new FileInputStream("foo.bar");
    } catch (IOException e) {
        //I'll just presume this must have been the FileInputStream!
    }
}

would be a coding mistake - it'll fail because it'll catch any transparent
IOExceptions from the closure as well. However, this is in practice not a
problem at all, for two reasons:

(1) Any java program that works like the above and assumes that IOException
must have come from the FileInputStream is already buggy. It's the JVM
you're coding for and not java (the language) programs; class files produced
by jython, scalac, jruby, and just about any other alternative language
routinely throw checked exceptions "sneakily". Even within the constraints
of java itself, sneaky exceptions can happen. If not explicitly (via e.g.
someClass.newInstance() which can sneaky throw), then implicitly, by mixing
class files from different compile runs. Thus, this isn't really a new
caveat.

(2) Existing code will NEVER be bit with this. The author has to step in and
add a do keyword to one of the parameters. When doing so the author assumes
the responsibility of checking that his code doesn't make this assumption.


Such a system is also backwards compatible. You can never remove a "do"
keyword from a signature, but you can add it.


Just to highlight how exception transparency is completely free in this
system, at least for restricted closures: By definition the original stack
is still intact, therefore, whatever construct (a catch or a throws clause)
made it legal to throw a certain checked exception at the lexical location
where the closure is defined is still "live" on the stack. Exceptions, by
definition, walk up the stack until they find the nearest handler. Assuming
all code in between the closure definition and its execution leave it alone,
the exception will always end up at a point where it'll be handled (or
thrown onwards but explicitly declared with a "throws" clause, which is ok).

This leaves a hole for unrestricted closures, which would then not supported
exception transparency at all, but just about every instance of unrestricted
closure usage in libraries I can think of doesn't really need exception
transparency. The rare case where that would be useful is mostly focussed
around ParallelArrays and fork/join, in that PA transports the closure block
across threads but it "emulates" the stack (in that the method call that
runs an operation on the PA doesn't return until that operation is complete,
though in the process of completing the operation, many threads will be
used). This case can be supported by offering a native method that
unshackles the escape detection from a parameter. The javadoc on this method
can explain in detail how its the job of the caller to transport any and all
exceptions, checked or not, back to the caller, and how in the future
Break/Continue/ReturnTransfer throwables too may occur and should be
transported back to the original thread.

There are a few use cases where exception transparency is desired but
there's no way, even with the above "transport the exception back to the
caller" concept, to keep exception transparency out of the type system.
However, I posit that with the above system in place these cases don't even
get close to being common enough to warrant the considerable burden of
adding the "throws E" concept to the type system. As has been pointed out by
Brian, other than "throws" and "catch", a variadic type parameter can't
really be used anywhere in java.

Therefore, by adding restricted/portable status to closures, this particular
dilemma is solved in a very simple fashion, and it'll also be quite useful
for solving the problem with whether or not to allow capturing mutable
variables, as well as be a good starting point for a future addition to
closures to let them support "long" break/continue/return.

--Reinier Zwitserloot



On Tue, Jun 8, 2010 at 2:08 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

> Summary: Patience, Grasshopper.  All of this will come in due time.
>
> > You describe a syntax for declaring these new kinds of type parameters,
> > but no syntax for the use site.  The one-type case is familiar, and you
> > make two suggestions for the no-type case, but what about the
> > two-or-more case (e.g. "IOException or SQLException")?  What about
> > disjunctive bounds (they arise in APIs once you start using these type
> > parameters).
>
> The disjunctive syntax (A|B) is the likely front-runner here.
>
> > I conjecture that a new syntax may not be needed at the declaration site
> > at all; it can be inferred from the explicit bound on the type parameter
> > instead of vice-versa, simplifying the syntax and making it more
> > familiar.  Specifically, you can treat any type parameter as a throws
> > type parameter if its bound is Throwable or a subtype.  There is no need
> > to restrict where the type parameter may be used.
>
> I suspect we might regret that.  When you try to write a class like
>   class ExceptionList<T extends Exception> implements List<T> { ... }
> you'll find that the compiler has guessed your intentions incorrectly.
>
> Similarly we could go down the route of explicitly variadic type parameters
> (identified say by T... or T*).  And the only place you could use them
> would
> be ... throws and maybe catch clauses, since those are the only places in
> the
> language where you can put a list of types.
>
> > I conjecture that a new syntax may not be needed at the use site either,
> > or at least not frequently.
>
> Likely true.  There is a balance to be struck.  On the one hand, new syntax
> means something new to learn.  On the other hand, shoehorning new semantics
> into existing syntax means likely surprises.  While it can probably be done
> with minimal new syntax, this may not be the best thing for the language or
> the community.
>
> The choice of "throws" or similar at the declaration site and nothing at
> the
> use site is probably a reasonable balance, but as always: syntax choices to
> be
> made last.
>
> > How do these new type parameters interact with wildcards?  Can they be
> > used in function types?  What are the subtyping rules?
>
> To be provided.
>
> > You suggest that inference will frequently provide these type
> > parameters, but there is no provision for that in any of the draft
> > specifications for project lambda.
>
> To be provided.
>
> > Your suggestion that a type parameter may be used in a catch blocks and
> > is treated as the erasure will be very confusing, and depending on
> > precisely what you mean it might undermine exception checking.  Can you
> > please explain what motivated you to add that?  Are you hoping that the
> > interaction with final type parameters may allow intercepting exceptions
> > even when the exception type is a formal generic parameter?
>
> The use in catch blocks is an optional element of this proposal, one that
> may
> well turn out in the end to be confusing or unworkable.  But the
> desirability
> of such a feature should be obvious.  We plan to explore it.
>
>


More information about the lambda-dev mailing list