Static analysis of non-local transfers: a proposal

Fri Nov 21 19:54:29 PST 2008

Hi John,

This is very interesting work - thanks for taking the time to  
describe your ideas!

One initial question:

If I've understood correctly, the restrictions you've described rely  
on detecting that the target of an assignment is of function type.   
However, the closures specification allows a closure literal such as  
{==> return;} to be converted to a 'compatible' interface type, and  
I'm wondering to what extent your proposal would be able to handle that?

Here's a concrete example of the kind I have in mind, which currently  
throws an UnmatchedTransfer:

class Conversion {

	interface Task {
		void exec();
	}

	static Task task;

	static void store(Task t) {
		task = t;
	}

	static void m() {
		store({==> return;});
	}

	public static void main(String[] args) {
		m();
		task.exec();
	}
}

As I see it, the only restrictions that could be applied here would  
be on the code in m(), since everything else is standard Java code.

Kind regards,

Mark Mahieu

On 20 Nov 2008, at 17:27, John Longley wrote:

>
> Dear Java closures people,
>
> I very much appreciate your interesting and enterprising work on  
> Javac.
> As a theoretically minded researcher in programming languages, I  
> have a
> suggestion to offer regarding a possible scheme for static analysis of
> programs involving return/break/continue, so as to pre-empt
> UnmatchedNonlocalTransfer exceptions at runtime.
>
> My colleague Phil Wadler at Edinburgh University recently drew my  
> attention
> to your work, suggesting that some of my own research ideas might find
> applications there. Having studied your proposal (v0.5), it does  
> indeed seem
> plausible to me that this may be the case, so this message is a  
> first attempt
> at trying to convey my thoughts in Java terminology. Of course,  
> it's highly
> possible that you have thought through this kind of thing already -  
> or that
> I've misunderstood or overlooked something important! - but in any  
> case,
> I'd be very interested in discussing the issues further.
>
> If I've got it right, my proposal has the following characteristics:
>
> * It consists simply of a compile-time checkable restriction on  
> programs of
>   the Javac language exactly as proposed in the spec for v0.5. No  
> additional
>   annotations by the programmer are required.
> * Programs obeying the restriction will never throw an
>   UnmatchedNonlocalTransfer exception at runtime.
> * The restriction doesn't rule out any standard Java programs (i.e.  
> programs
>   not involving closures).
> * The restriction doesn't seem to exclude any of the motivating  
> examples
>   of Java programs with closures that I have seen so far.
> * More tentatively, if the restrictions were adopted as part of the  
> language
>   spec, I believe this might also permit a simplification of the  
> currently
>   proposed system for tracking exceptions, yielding the same safety
>   properties but requiring fewer exception-related annotations by the
>   programmer. However, I won't go into details of this in the  
> present note.
>
> I'll describe the idea here using Java terminology. However, I'm  
> also hoping
> (soon!) to produce a draft paper presenting a toy language with  
> formal typing
> rules etc. which illustrates the essential issues, along with a  
> proof of the
> desired safety property for this language.
>
> So here goes. I'll concentrate here on the restriction insofar as  
> it affects
> the use of "return", and then briefly comment on "break" and  
> "continue".
>
> The restriction consists of two aspects:
>
> * A restriction on assignment expressions where the variable being
>   assigned to is of function type. (Note however that  
> *initializers* for
>   variables of function type are not restricted, except as  
> prescribed by
>   the ordinary rules of Java, e.g. we can't have "return" if we're not
>   within a method or constructor body.)
> * A restriction on "return" statements themselves.
>
> To describe the restrictions, the following notions are useful.
> First, we suppose that in some preprocessing phase, the compiler has
> annotated each occurrence of a method or constructor declaration  
> with a
> unique *target label*, and moreover has annotated each occurrence
> of "return" with the appropriate target label as determined by lexical
> scoping (i.e. with the target label of the closest enclosing method or
> constructor declaration).
>
> By a *dynamic element*, we shall mean either a variable of function  
> type
> or a target label. I don't think this is an especially good name  
> (other
> suggestions welcome!), but the intuition is that a dynamic element  
> can be
> "invoked" - by being applied in the case of a variable of function  
> type,
> or by being jumped to in the case of a target label - causing a  
> computation
> which might trigger (or itself be) an attempted transfer of control.
> By contrast, a variable of primitive type is not considered  
> "dynamic" since
> there is nothing to "invoke". (It might seem odd that variables  
> containing
> references to objects are *not* deemed to be dynamic - roughly,  
> this is because
> a method invocation cannot *in itself* be responsible for a non- 
> local transfer,
> except indirectly via some other dynamic element such as a field in  
> the
> target object.)
>
> Every dynamic element has a *scope*: the scope of a variable is
> defined as usual, while the scope of a target label is deemed to be
> the whole of the corresponding method or constructor declaration.
> We define a partial order relation on dynamic elements as follows:  
> y << x if
> the scope of x strictly contains that of y. (The important  
> intuition here is
> "x may outlive y".) We also write y <<= x if the scope of x  
> contains or is
> equal to that of y.
>
> We also have the usual notion of a variable appearing *free* in an
> expression e. In addition, we say a target label t appears *free*  
> in e if
> e contains a return statement annotated with t and not "bound" by a
> method/constructor declaration annotated by t within e itself.
>
> We can now give an easy-to-understand but slightly crude first cut  
> at our
> restrictions as follows (I'll improve on this below):
>
> * In an assignment expression x = e where x is of function type,
>   no dynamic element y << x appears free in e.
> * In a statement return e annotated with target label t,
>   no dynamic element y <<= t appears free in e.
>
> Thus, for example, the following four code fragments are forbidden:
>
> 1.   {=> int} x ;  // a field declaration, say
>      void m () {
>          x = {=> return 5 ;}
>      }
>      // scope of target label narrower than that of x
>
> 2.   {=> int} x ;
>      void m () {
>          {=> int} y = {=> return 5 ;} ;
>          x = y ;
>      }
>      // scope of y narrower than that of x
>
> 3.   return {=> return 5 ;}
>      // both returns will have the same target label here
>
> 4.   {=> int} x = {=> return 5 ;} ;
>      return x ;
>      // scope of x is narrower than that of target label
>
> Whilst this version of our restriction suffices to guarantee safety of
> returns, it imposes some possibly annoying restrictions on the way the
> programmer has to write things. For example, under the above rules
> one cannot write
>
>      {int => {=> int}} K = {n:int => {=> n}} ;
>      {=> int} f ;
>      {  {=> int} g = {=> 5} ;
>         f = K.invoke (g.invoke()) ;
>      }
>
> even though it is OK to write
>
>      {int => {=> int}} K = {n:int => {=> n}} ;
>      {=> int} f ;
>      {  {=> int} g = {=> 5} ;
>         int n = g.invoke() ;
>         f = K.invoke(n) ;
>      }
>
> which is not essentially different. Thus, the programmer would  
> sometimes
> have to rewrite code in an unnatural style in order to conform to the
> above rules.
>
> The point about this example is that in the assignment
>
>      f = K.invoke (g.invoke()) ;
>
> the occurrence of g is not "dangerous", because the subexpression
> g.invoke() is evaluated to a ground value (5) before the assignment is
> performed, so there is no way for the dynamic behaviour of g to find
> its way into that of f. One can take account of this via a slightly  
> more
> fine-grained version of our restrictions as follows.
>
> Let's say a syntactic subexpression e' of e is *normal* if e' does not
> appear inside a closure literal in e (i.e. underneath a lambda).
> And say a dynamic element y *critically appears in* an expression e
> if there is an occurrence y_0 of y which is not contained in any  
> normal
> subexpression e' of non-function type.
>
> Thus, in the above example, g appears in K.invoke (g.invoke()),
> though not critically. However, the appearance of g in {=> g.invoke()}
> *is* critical, because the invocation of the dynamic behaviour of g
> is delayed.
>
> With these notions in place, we may now relax our restrictions to the
> following (just by replacing "appears" by "critically appears"):
>
> * In an assignment expression x = e where x is of function type,
>   no dynamic element y << x critically appears free in e.
> * In a statement return e annotated with target label t,
>   no dynamic element y <<= t critically appears free in e.
>
> One further simple relaxation is worth mentioning. For an  
> assignment x = e,
> the appearance of some y << x isn't dangerous if y is a variable  
> and there
> is no intervening target label between y and x. So we could in  
> general define
> y <<< x if for some target label t we have y <<= t << x, and then  
> replace
> "y << x" by "y <<< x" in the restriction on assignment expressions.
>
> So that's the idea. It should now be reasonably clear how to extend  
> this
> scheme to deal with "break" and "continue", by annotating loop  
> constructs
> with target labels at appropriate points.
>
> Of course, a few adjustments might be needed to take account of all  
> aspects
> of Java (e.g. I guess some care is also needed over where to put  
> the target
> labels in try-finally statements), but I'm hopeful that in essence  
> the scheme
> should be able to do everything one wants, without sacrificing much of
> importance in terms of programming power or convenience.
>
> Thanks for reading this far! I would be interested to know what you  
> think,
> so any reactions/queries/problematic examples gratefully received.
>
> Cheers,
> John Longley
> (Lecturer, School of Informatics, University of Edinburgh, Scotland)
>
>
> [PS: Please reply to my email address, as I guess I'm not a  
> recipient of
> this mailing list.]
>
>
> -- 
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>