Exception transparency

Tue Jun 8 07:24:52 PDT 2010

What I like about this idea is that it clearly distinguishes between the 
features that are useful in a "portable" way vs those which are tied to the 
execution environment in which the lambda is captured.

In order for such a system to work, it has to be reflected in the type system. 
  One way to do so would be to introduce a characteristic of functions (call 
it "purity") and state:

     pure T->U  <:  T->U

Then \lambda.{ 42 } is a pure function (of type Void->Int); \lambda.{ x } 
(where x is a nonfinal local) is not.

Restricting non-pure closures to executing in their creating stack scope is 
easy.

The hard part is reifying purity in the (nominal Java) type system. 
Otherwise, you run into problems.  The purity of a closure would need to be 
identifiable statically (preferably) or dynamically (worst case) so that we 
didn't try to "transport" a non-pure closure.

Consider a method like List.forEach().  If forEach() has to assume the worst 
(impurity), then forEach() must be sequential.  But that is terrible!  Adding 
closures and not allowing the possibility of parallel iteration?  Not OK.

If purity were reflected in the static type system, there could be overloaded 
versions of forEach() for pure and impure version, and the pure version could 
be parallel.

If purity were reflected only dynamically, then forEach() could test for "if 
lambda instanceof pure" (or whatever) and choose between serial and parallel 
iteration.

Add to that the need for migration compatibility -- whether we support 
denotable function types or not, this has to work with SAM types.  The obvious 
way to tag purity with SAM types would be with a marker interface, but then 
decorator-like methods (like immutableList()) would "wash away" the marker.

I like your idea, I really do.  I hope you can figure out a way to address 
these problems!

On 6/8/2010 12:53 AM, Reinier Zwitserloot wrote:
> Making the exceptions transparent by encoding them in the type system is
> going to be very unwieldy, even if a "throws T" variadic type parameter
> concept is introduced. I've got a much better idea: Escape detection on
> closures. Any closures that are guaranteed to run such that the runtime
> stack matches the lexical stack can get exception transparency
> completely free, no new syntax and no new concepts required; it would
> "just work".
>
> Let's take Comparator for example. Currently it is used in two contexts;
> Collections/Arrays.sort, as well as for the TreeSet constructor. It
> would make perfect sense for the sort variant to use the proposed
> variadic type parameter concept to make any exceptions thrown as part of
> the comparison transparent. Side-stepping for a moment the general
> shambles that occurs when trying to integrate functional types into
> existing library code, this would perhaps look something like so:
>
> public static <T, throws E> void sort(List<T> a, Comparator<? super T,
> E> c) throws E {
>   ...
> }
>
> Just look at that. Generics in libraries already has a tendency to look
> like gobbledygook and the above is not helping things at all. Even if we
> abandon Comparator here and use a straight closure type like so:
>
> public static <T, throws E> void sort(List<T> a, #int(T, T)(throws E) c)
> throws E {
>    ...
> }
>
> isn't an improvement.
>
>
> But, let's stick with it, and inspect what happens to TreeSet.
>
> Now we run into even bigger trouble; applying a (variadic or monadic,
> doesn't matter) exceptions type parameter here is simply not possible.
> We'd have to add "throws E" to at the very least TreeSet's add and
> addAll methods, but we clearly can't; both of those methods come from
> the Set interface and those don't have any thrown exceptions declared
> today, so, that won't work. Even in a hypothetical world where we can
> break backwards compatibility we still can't pragmatically replace Set's
> current add method with: "public boolean add(T item) throws E" - because
> what's that E even stand for? In e.g. a HashSet, and most other sets,
> that "E" makes no sense at all. Its a unique aspect of TreeSet. We could
> simply not let TreeSet implement Set but that feels completely wrong.
>
>
> The most obvious solution then appears to be one of three options:
>
> (A) don't try; leave sort and TreeSet untouched and do not add exception
> transparency to sort. But that would be a shame.
> (B) overload Collections.sort with the variadic type parameter based
> exception transparency. Leave TreeSet untouched. But now we have two
> incompatible "comparator" concepts, so this too would be a weak solution.
> (C) hack it together by letting TreeSet rewrap any checked exceptions
> thrown during add/addAll into a custom unchecked class, akin to
> InvocationTargetException or ExecutionException). But that's the very
> ugliness we're trying to avoid!
>
> None of these feel like good solutions. Even going with the most likely
> one, option B, that still leaves us with this brain twiddler:
>
> public static <T, throws E> void sort(List<T> a, #int(T, T)(throws E) c)
> throws E {
>    ...
> }
>
> One of the nice things of static typing is that the signature all by
> itself is usually enough documentation for a library user to know
> exactly what to do. But with signatures like those, I'm not sure that'll
> remain true. It's also highly redundant; creating new boilerplate is
> something we ought to try and avoid.
>
> So, here's my alternative:
>
> The first part: Create two separate kinds of closures, similar to a
> later addendum to the BGGA proposal concerning "safe" and "unsafe" closures.
>
> There are portable closures which have no restrictions of any kind but
> which are truly like portable code blocks; a simpler syntax for today's
> anonymous inner classes. They cannot capture (mutable) variables, do not
> and will never support long return/break/continue, have no exception
> transparency of any kind, but you can store them for later, move them to
> another thread, and otherwise treat them as one would any random object.
>
> The other kind of closure is a restricted closure: The compiler will
> ensure that if the closure is run at all, it is run in such a way that
> the runtime stack matches lexical scope. Such a closure can capture any
> variable, mutable or not (because threading cannot possibly be an issue,
> by definition - the stack is still there, so it must be the same
> thread), when java 8 rolls around, will support transfers (long
> break/continue/return), and has automagical exception transparency; it
> just works, no need for a variadic type parameter.
>
> Here's how this might work:
>
> TreeSet's constructor remains as it is now, and by default all closures
> are NOT restricted. Exception transparency for TreeSet's Comparator will
> not be supported, but we've already seen that supporting it is not
> feasible anyway, so that's a good result. For Collections.sort, on the
> other hand, we mark it as restricted, like so:
>
> public static <T> void sort(List<T> a, do Comparator<? super T> c) {
>      ...
> }
>
> note the "do" keyword there. It can be any keyword, such as
> "restricted", that's a detail that can be decided on later. What this
> does is two things: First of all, the compiler will ensure that the "c"
> variable is not assigned to anything (be it a field or a local
> variable), is not captured by a non-restricted closure, is not captured
> by a method local class, is not returned, and is not passed as variable
> to another method (with one exception: It's okay to pass c as a closure
> to another method that also has a 'do' keyword for that parameter).
> Secondly, that parameter is marked (via an extended field) as supporting
> a restricted closure.
>
>
> Then, IF a closure is passed directly as parameter, one gets hassle-free
> exception transparency, and one can legally write this:
>
> public void test() throws IOException {
>      Collections.sort(someList, #(String a, String b) {
>          if (a == null) throw new IOException();
>          return a.length() - b.length();
>      });
> }
>
> It'll work because the compiler first realizes that the closure throws
> an IOException which isn't caught/declared anywhere in its own lexical
> scope, but it IS caught/declared in its containing scope, thus the
> exception itself is marked as "legal, but only if restricted". During
> the resolution phase, the compiler figures out that this closure is
> being passed in the position of a restricted parameter and just compiles
> it without complaining about not handling a checked exception. The JVM
> already supports this (checked exceptions are checked by javac, the jvm
> or the class verifier don't care - see any alternative JVM language,
> which all let you throw checked exceptions without declaring that you do).
>
> There's only one caveat: The author of the Collections.sort method may
> presume that any call to Comparator.compare() couldn't possibly throw,
> say, an IOException, because "compare()" does not declare this. And that
> would be a mistake - this entire concept is predicated on the notion
> that any checked exception not explicitly declared by that closure falls
> "through" all intermediate methods right back to the original lexical
> scope, at which point it'll be handled. That'll work, but only if all
> those intermediate methods don't touch that checked exception. So,
> something like this:
>
> public static <T> void sort(List<T> a, do Comparator<? super T> c) {
>      try {
>          c.compare(null, null);
>          new FileInputStream("foo.bar");
>      } catch (IOException e) {
>          //I'll just presume this must have been the FileInputStream!
>      }
> }
>
> would be a coding mistake - it'll fail because it'll catch any
> transparent IOExceptions from the closure as well. However, this is in
> practice not a problem at all, for two reasons:
>
> (1) Any java program that works like the above and assumes that
> IOException must have come from the FileInputStream is already buggy.
> It's the JVM you're coding for and not java (the language) programs;
> class files produced by jython, scalac, jruby, and just about any other
> alternative language routinely throw checked exceptions "sneakily". Even
> within the constraints of java itself, sneaky exceptions can happen. If
> not explicitly (via e.g. someClass.newInstance() which can sneaky
> throw), then implicitly, by mixing class files from different compile
> runs. Thus, this isn't really a new caveat.
>
> (2) Existing code will NEVER be bit with this. The author has to step in
> and add a do keyword to one of the parameters. When doing so the author
> assumes the responsibility of checking that his code doesn't make this
> assumption.
>
>
> Such a system is also backwards compatible. You can never remove a "do"
> keyword from a signature, but you can add it.
>
>
> Just to highlight how exception transparency is completely free in this
> system, at least for restricted closures: By definition the original
> stack is still intact, therefore, whatever construct (a catch or a
> throws clause) made it legal to throw a certain checked exception at the
> lexical location where the closure is defined is still "live" on the
> stack. Exceptions, by definition, walk up the stack until they find the
> nearest handler. Assuming all code in between the closure definition and
> its execution leave it alone, the exception will always end up at a
> point where it'll be handled (or thrown onwards but explicitly declared
> with a "throws" clause, which is ok).
>
> This leaves a hole for unrestricted closures, which would then not
> supported exception transparency at all, but just about every instance
> of unrestricted closure usage in libraries I can think of doesn't really
> need exception transparency. The rare case where that would be useful is
> mostly focussed around ParallelArrays and fork/join, in that PA
> transports the closure block across threads but it "emulates" the stack
> (in that the method call that runs an operation on the PA doesn't return
> until that operation is complete, though in the process of completing
> the operation, many threads will be used). This case can be supported by
> offering a native method that unshackles the escape detection from a
> parameter. The javadoc on this method can explain in detail how its the
> job of the caller to transport any and all exceptions, checked or not,
> back to the caller, and how in the future Break/Continue/ReturnTransfer
> throwables too may occur and should be transported back to the original
> thread.
>
> There are a few use cases where exception transparency is desired but
> there's no way, even with the above "transport the exception back to the
> caller" concept, to keep exception transparency out of the type system.
> However, I posit that with the above system in place these cases don't
> even get close to being common enough to warrant the considerable burden
> of adding the "throws E" concept to the type system. As has been pointed
> out by Brian, other than "throws" and "catch", a variadic type parameter
> can't really be used anywhere in java.
>
> Therefore, by adding restricted/portable status to closures, this
> particular dilemma is solved in a very simple fashion, and it'll also be
> quite useful for solving the problem with whether or not to allow
> capturing mutable variables, as well as be a good starting point for a
> future addition to closures to let them support "long"
> break/continue/return.
>
> --Reinier Zwitserloot
>
>
>
> On Tue, Jun 8, 2010 at 2:08 AM, Brian Goetz <brian.goetz at oracle.com
> <mailto:brian.goetz at oracle.com>> wrote:
>
>     Summary: Patience, Grasshopper.  All of this will come in due time.
>
>      > You describe a syntax for declaring these new kinds of type
>     parameters,
>      > but no syntax for the use site.  The one-type case is familiar,
>     and you
>      > make two suggestions for the no-type case, but what about the
>      > two-or-more case (e.g. "IOException or SQLException")?  What about
>      > disjunctive bounds (they arise in APIs once you start using these
>     type
>      > parameters).
>
>     The disjunctive syntax (A|B) is the likely front-runner here.
>
>      > I conjecture that a new syntax may not be needed at the
>     declaration site
>      > at all; it can be inferred from the explicit bound on the type
>     parameter
>      > instead of vice-versa, simplifying the syntax and making it more
>      > familiar.  Specifically, you can treat any type parameter as a throws
>      > type parameter if its bound is Throwable or a subtype.  There is
>     no need
>      > to restrict where the type parameter may be used.
>
>     I suspect we might regret that.  When you try to write a class like
>        class ExceptionList<T extends Exception> implements List<T> { ... }
>     you'll find that the compiler has guessed your intentions incorrectly.
>
>     Similarly we could go down the route of explicitly variadic type
>     parameters
>     (identified say by T... or T*).  And the only place you could use
>     them would
>     be ... throws and maybe catch clauses, since those are the only
>     places in the
>     language where you can put a list of types.
>
>      > I conjecture that a new syntax may not be needed at the use site
>     either,
>      > or at least not frequently.
>
>     Likely true.  There is a balance to be struck.  On the one hand, new
>     syntax
>     means something new to learn.  On the other hand, shoehorning new
>     semantics
>     into existing syntax means likely surprises.  While it can probably
>     be done
>     with minimal new syntax, this may not be the best thing for the
>     language or
>     the community.
>
>     The choice of "throws" or similar at the declaration site and
>     nothing at the
>     use site is probably a reasonable balance, but as always: syntax
>     choices to be
>     made last.
>
>      > How do these new type parameters interact with wildcards?  Can
>     they be
>      > used in function types?  What are the subtyping rules?
>
>     To be provided.
>
>      > You suggest that inference will frequently provide these type
>      > parameters, but there is no provision for that in any of the draft
>      > specifications for project lambda.
>
>     To be provided.
>
>      > Your suggestion that a type parameter may be used in a catch
>     blocks and
>      > is treated as the erasure will be very confusing, and depending on
>      > precisely what you mean it might undermine exception checking.
>       Can you
>      > please explain what motivated you to add that?  Are you hoping
>     that the
>      > interaction with final type parameters may allow intercepting
>     exceptions
>      > even when the exception type is a formal generic parameter?
>
>     The use in catch blocks is an optional element of this proposal, one
>     that may
>     well turn out in the end to be confusing or unworkable.  But the
>     desirability
>     of such a feature should be obvious.  We plan to explore it.
>
>