Static analysis of non-local transfers: a proposal

Wed Dec 3 03:18:43 PST 2008

Hi again,

Thanks to Mark for his very helpful email on the concurrency issues.

As promised, here are some ideas on a simplified scheme for exception tracking
which becomes possible if the restrictions outlined in my previous emails
are adopted. The headline is that the use of exception type parameters then
becomes superfluous, as one can give a naturally exception-transparent system
that doesn't require them.

In this longish message, I'll describe...

(A) a way of setting this up while staying as close as possible to the  
existing
     proposal (although still relying on a distinction between function and
     interface types).
(B) an alternative system which seems simpler and cleaner overall (though is
     admittedly slightly coarser in its analysis of exceptions).
(C) some further general thoughts (some rather obvious) on the issue of
     function/interface type compatibility, expanding on a few of the points
     in previous messages.

(A).
The idea of "exception transparency", as I understand it, is that one would
like a system in which one could pass a function F with throwables E
to a method or function G that may not know about E, but which may have its
own throwables E' - and the type system would infer that the throwables of
G(F) were E union E'. In the present scheme (v0.5), the formal parameter f
of G has to be tagged as possibly raising an arbitrary exception by means of
an exception type parameter E, so as to prevent things like:

class Bad {
    { => void} x ;

    void store ({ => void f}) { x = f ; }

    void run () { x.invoke() } ;
}

Bad o = new Bad() ;

{  class MyException extends Exception ;
    o.store { => throw new MyException() ;}
}

o.run() ; // MyException thrown outside its scope!

(Note: even if exception type parameters are elided, as suggested in the
"Open Issues" document, they would presumably need to be added in for
typechecking purposes in order to exclude examples like the above.)

However, under my proposed restrictions on the use of store at function types,
the above class Bad would be excluded anyway, because it attempts to allow
the dynamic behaviour of f to be retained by the longer-lived variable x.
In other words, exactly the same restrictions on "flow of dynamic behaviour"
which prevent escaping "returns" also prevents unruly exception-raising
behaviour.

Of course, one might think these restrictions are themselves rather severe,
but here my response would be: if it's unrestricted higher-order store you
want, you can use objects, not closures. (One can even imagine exception type
parameters being added for the sake of allowing exception-transparency when
working with interface types - but whether or not one did this, they wouldn't
be needed for function types.)

To summarize so far, my suggestion would be that if F is an expression
of function type, a method application G(F) is allowed whether or not
the formal parameter of G explicitly mentions the throwables of F, and
the type system knows that the throwables of G(F) are those of G plus those
of F. The point is that if F might throw e, then the evaluation of G(F) itself
might throw e, but after G has returned, no further appearances of e  
can arise,
since the exception-raising potential of F does not leak into e.g. the fields
of the target object.

Similarly, if G is itself a higher-order function, G.invoke(F) can be allowed
whether or not the argument type of G mentions the throwables of F; again,
the throwables of G.invoke(F) are those of G plus those of F.

All this works nicely for methods/functions whose return type is not itself
a function type. If the return type is a function type (e.g. as in curried
functions), some further issues arise, and here my system is admittedly a
little crude. In the existing system with exception type variables, it is
possible to distinguish among the following three scenarios by appropriately
careful placement of the exception type variable E:

* the evaluation of G(F) might throw e, but if it doesn't, later invocation
   of the resulting closure won't throw e.
* the evaluation of G(F) won't throw e, but later invocation of the resulting
   closure might.
* either the evaluation or the later invocation might throw e.

Under (the most straightforward version of) my proposal, these distinctions
can't be made, as the uses of the formal parameter f aren't "tracked" by the
type system. Being somewhat conservative, the easiest thing is just to say
that the relevant set E union E' must be used both as the throwable set of
the expression G(F) and as the "throws" annotation in the type of the
resulting function. One can of course do better than this by a more careful
analysis of the body of G - in a way, this could be seen as using exception
type variables without admitting it ;-)

This is a little bit unpleasant - and it obviously gets worse for curried
functions with more arguments - but the question is whether the extra control
one gets from exception type variables is sufficiently useful in practice to
justify the extra machinery that they involve (as well as the cluttering of
code, if they're not elided).

(B).
Although something like the above might be a good option in the context of
BGGA, I should mention that a more radical simplication is also possible,
in which the coarseness of the above proposal is (so to speak) made  
systematic.
Here the idea is that we don't annotate each arrow appearing in a function
type with its own set of exceptions; rather, we annotate the type as a whole
with a single set at top-level:

    Type  ::= Type1 Throws_opt
    Type1 ::= StandardJavaType
            | { Types1_opt => Type1 }
    Types1 ::= Type1 | Type1 , Types1

with the proviso that for a standard Java type, the throws clause should
be absent. (Thus, the treatment of exceptions for standard Java types is
unaffected - only function types are treated in this way.) The assertion
"F has type (T throws e_1,...,e_n)" should then be interpreted as saying
that the set e_1,...,e_n includes all exceptions that may be either raised
by F immediately or squeezed out of it by some later invocation. Of course,
this means that various distinctions of the above kind are no longer possible
for curried functions - on the other hand, the gain in simplicity may well
have its own appeal in the context of an already complex language!

Further details of this idea, including a type system worked out for a
fragment of ML, can be found in an unpublished draft document available at

    http://homepages.inf.ed.ac.uk/jrl/Research/exceptions.pdf

(Only the system $\EE_1$ described in this document is relevant here.)
I currently have a PhD student developing these ideas further (although the
focus of his work is more on the denotational semantics for such systems).

(C).
Finally, a few more thoughts on function/interface type compatibility.

(1) I said in an earlier message that for the purpose of static analysis,
function and interface types could be distinguished at an early stage
and then identified. Having thought about it a bit more, it seems that
this should actually be quite easy to do. After parsing, only a rather crude
analysis (far less than full typechecking) is needed to identify which
variables and (sub)expressions are of function type, which is really all
that is needed to verify my conditions.

However, I can't resist adding that if function and interface types were
distinguished at all at *any* level, I think it would be much better if
a separate syntax for function application was provided, rather than e.g.
the ".invoke()" form which cries out to be read as a method invocation.

(2) If one wishes to guarantee "no late transfers" (in the sequential  
setting),
then as outlined in my second message, a condition on implicitly  
type-converted
expressions is needed. This condition is similar to (actually slightly more
liberal than) the one currently adopted for conversion to RestrictedFunction
interfaces. So a "hard line" approach would be to forbid all such conversions
that don't satisfy the conditions. I imagine, though, that a softer line will
seem more attractive: e.g. allow all type-conversions in the interests of
retrofitting etc., but issue a compiler warning if they don't conform to the
restrictions. That way, if the compiler is able to discover that a program
is safe against late transfers, it will share this welcome knowledge with the
programmer via the absence of such warning.

(3) That all seems very nice if exceptions are left out of the picture.
But if one wished to adopt (either version of) my simpler scheme for exception
tracking, a little more care is needed over these conversions. Specifically,
in the above examples, occurrences of the parameter f within the body of G
mustn't be convertible to interface types - or at least not without tagging
them somehow as potentially exception-raising - otherwise the whole system
becomes exception-unsafe.

Various solutions are possible here, but to help me to understand what kinds
of type-compatibility are hoped for, I'd like to ask the following question.
The current spec describes a fairly generous scheme of "closure conversion"
which applies just to closure *literals* (not e.g. to variables). This is all
very nice and fits happily with the "soft line" scheme suggested above,
plus (probably, with a little care) the simplified treatment of exceptions.
So my question is: how important is it that *in addition to* this conversion
scheme, closure types are deemed identical (for programming purposes) to
certain (synthetic) interface types, so that arbitrary expressions of closure
type are automatically of interface type? Are there use cases for which this
is important? I appreciate that for compatibility amongst compilers, one wants
to require that the types look the same at the back-end - but need they also
be the same to the programmer?

As I say, even within the present system there are probably various things one
might do. But I am really trying to argue that the advantages of  
distinguishing
function and interface types might considerably outweigh the disadvantages ;>)
Closures and objects do offer different things, and drawing a clear
distinction between them would allow one to exploit these differences to
the full!

Best wishes,
John

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.