Project Lambda: Java Language Specification draft

Fri Jan 22 18:24:22 PST 2010

Hi Neal,

On 23/01/2010 02:36, Neal Gafter wrote:
> A couple of thoughts on this draft, inline
>
> On Fri, Jan 22, 2010 at 2:55 PM, Alex Buckley<Alex.Buckley at sun.com>  wrote:
>    
>> This document does not consider implementation. The mapping of lambda
>> expressions to objects, and of function types to class or interface
>> types, is neither designed nor specified. Even if the mapping was
>> designed here, it is unlikely ever to be specified in the JLS. Binary
>> compatibility for lambda expressions will eventually be specified in
>> terms of changes to function types only. It is a goal of this document
>> to allow the implementer freedom as to how and when lambda expressions
>> are evaluated.
>>      
> I think these are unavoidable in the JLS.  The specification must be
> precise enough that code generated by distinct Java compilers can
> interoperate.  Within the specification, we need to know if a class
> can "implement" a function type.  If so, it is probably an interface.
> If not, it probably isn't.
>
>    
>> - Lambda expressions as closures: There are effectively-final
>>   variables, but I am holding off shared variables for now. As
>>   background reading to why loop variables should not be shared, see
>>   http://blogs.msdn.com/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx.
>>      
> Actually, that is largely an error in the specification of C#'s
> foreach loop, which we're aiming to fix.
>
>    
>> * Expressions
>>
>> [15.8 Primary Expressions]
>>
>> Expression:
>>   LambdaExpression
>>
>> LambdaExpression:
>>   '#' '(' FormalParameterList_opt ')' '(' Expression_opt ')'
>>   '#' '(' FormalParameterList_opt ')' Block
>>
>>   #()()
>>   #()(5)
>>   #()(x.m())
>>   #()((foo++))
>>   #()("a"+"b")
>>   #()( {1,2,3} )   // Proposed collection literal expression from Coin
>>
>>   #(){}
>>   #(){return 5;}
>>   #(){x.m();}
>>   #(){foo++;}
>>      
> Did you mean to make these primary expressions?  I hope so.  Otherwise
> many uses would require yet another set of parens.
>
>    
>> [15.8.3 this]
>>
>> The keyword this may be used only in the body of an instance method,
>> instance initializer or constructor, or in the initializer of an
>> instance variable of a class, *or in a lambda expression*.
>>
>> The type of this in a lambda expression is the function type of the
>> lambda expression.
>>
>> /*
>> Treatment of 'this' inside a lambda expression is essentially the same
>> as 'this' inside the body of an anonymous inner class.
>> */
>>      
> That's the worst possible treatment of "this".  It has all the
> disadvantages of the inner class approach (not being transparent) with
> none of the advantages (you can't use any inherited members of the
> type to which the lambda expression is converted).  This is the first
> time I've heard anyone advocate such treatment for "this".
>    

Agreed.
Another question: what does super mean in that case?
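
For comparison, here is a minimal sketch of what 'this' denotes inside
an anonymous inner class today. The names ThisDemo and describe are
invented for illustration. Inside the anonymous body, 'this' is the
inner instance, the enclosing instance has to be spelled ThisDemo.this,
and 'super' means the anonymous class's superclass (here Object). A
function type has no such obvious superclass.

```java
class ThisDemo {
    private final String name = "outer";

    // Hypothetical helper: reports what 'this' and 'ThisDemo.this' denote
    // inside the body of an anonymous inner class.
    String describe() {
        Object inner = new Object() {
            @Override
            public String toString() {
                // 'this' is the anonymous instance, not the enclosing ThisDemo.
                boolean thisIsOuter = ((Object) this) == (Object) ThisDemo.this;
                return "this is outer: " + thisIsOuter
                        + ", enclosing name: " + ThisDemo.this.name;
            }
        };
        return inner.toString();
    }

    public static void main(String[] args) {
        System.out.println(new ThisDemo().describe());
    }
}
```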

>    
>> [15.8.6 Lambda Expressions]
>>
>> A lambda expression is used to create a new object that is a lambda
>> instance (15.8.7).
>>      
> I don't think you mean to say "new" object.  I think you mean to say
> that it results in an object that is a lambda instance.  Whether or
> not it is new is something you explicitly said you didn't want to
> specify.
>    

Yes, this is important.

>    
>> A lambda expression specifies a expression or block of code,
>>      
> That's not quite right, because it could be neither (the expression is
> optional).
>    

I think this is an error in the grammar;
#()() does not seem really useful.

>    
>> followed
>> by a (possibly empty) list of formal parameters to the expression or
>> block.
>>      
> Actually, in your syntax the parameters *precede* the body, not follow it.
>
>    
>> The body of a lambda expression is an expression or a block of code.
>>      
> Again, sometimes it is neither.
>
>    
>> If the body of a lambda expression is an expression, then the type of
>> the body is the type of the expression.
>>      
> And if it is neither?
>
>    
>> If the body of a lambda expression is a block, then either all or none
>> of the return statements in the block must have an Expression. If no
>> return statement has an Expression, then the body of the lambda
>> expression is void, i.e. has no type.
>>      
> That's inconsistent with the way methods work, and would prevent the
> following useful code
>
> Callable<String>  lazyResult = #(){ throw new UnsupportedOperationException(); }
>    

Why not use the left-hand side type, as when inferring a generic call?
I think it would avoid having to introduce 'Nothing'.
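
As a point of reference, the anonymous-class form of Neal's example is
already legal for methods: a body that cannot complete normally needs no
return statement, and the return type comes from the declared type.
A minimal sketch, with the names LazyResultDemo and throwsUnsupported
invented for illustration:

```java
import java.util.concurrent.Callable;

class LazyResultDemo {
    // No return statement at all, yet call() is legal because its body
    // cannot complete normally; the String result type comes from the
    // declared type Callable<String>.
    static Callable<String> lazyResult() {
        return new Callable<String>() {
            @Override
            public String call() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Hypothetical helper: true if invoking the result throws as expected.
    static boolean throwsUnsupported() {
        try {
            lazyResult().call();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsUnsupported());
    }
}
```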

>    
>> If all return statements have an
>> Expression, then the types of the Expressions must be
>> assignment-compatible with each other, or a compile-time error
>> occurs.
>>      
> This disallows "if (e) return "foo" else return new Object();" because
> Object is not assignment-compatible to String.
>
>    
>> The type of the body is lub(T1..Tn) where T1..Tn are the types
>> of the Expressions after boxing conversion.
>>      
> So the result type is always a reference type (or void) when the body
> is a block?
>    

Yes, it should be after a possible conversion, and
whether the conversion is done should depend on the proto-type.
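
A rough illustration with anonymous classes, where the target type
already decides whether the value 5 is boxed. IntSource is a
hypothetical primitive-returning interface and the other names are
invented too:

```java
import java.util.concurrent.Callable;

class TargetTypeDemo {
    // Hypothetical SAM type with a primitive result.
    interface IntSource { int get(); }

    // The same body, 'return 5;', is boxed or not depending on the target.
    static String resultKinds() {
        IntSource prim = new IntSource() {
            public int get() { return 5; }          // stays an int
        };
        Callable<Integer> boxed = new Callable<Integer>() {
            public Integer call() { return 5; }     // boxed to Integer
        };
        try {
            int a = prim.get();
            Object b = boxed.call();
            return a + ":" + b.getClass().getSimpleName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resultKinds());
    }
}
```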

>    
>> The type of a lambda expression is a function type #T(S1..Sm)(E1..En)
>>      
> That syntax is not defined anywhere in your specification.  What is a
> function type?
>    

This is defined in section 4.3

>    
>> where:
>>
>> - If the body of the lambda expression is void, then T is void,
>>   indicating no return type; otherwise, T is the type of the body of
>>   the lambda expression after capture conversion.
>>
>> - S1..Sm is the list, possibly empty, of types of the formal
>>   parameters of the lambda expression.
>>
>>   #()() has type #void()
>>      
> According to the specification above, this is missing some parens.
>
>    
>>   #() { if (..) return "1"; else return 2; } has type #Integer()
>>      
> It looks like an error because of the constraint "If all return
> statements have an
> Expression, then the types of the Expressions must be
> assignment-compatible with each other, or a compile-time error
> occurs."
>
>    
>> Any local variable, formal method parameter, or exception handler
>> parameter used but not declared in a lambda expression must be
>> effectively-final.
>>
>> A local variable, formal method parameter, or exception handler
>> parameter is effectively-final if it is never the target of an
>> initialization or assignment expression except where definitely
>> unassigned.
>>      
> In other words, can't capture much more than an anonymous inner class
> could.  Disappointing.
>
>    
>> It is a compile-time error to modify the value of an effectively-final
>> variable in the body of a lambda expression.
>>      
> Given the definition of effectively-final, above, this constraint is vacuous.
>
>    
>> FunctionType:
>>   '#' ResultType '(' [Type] ')' FunctionThrows_opt
>>
>> FunctionThrows:
>>   '(' 'throws' ExceptionTypeList ')'
>>
>> ExceptionTypeList:
>>   Identifier
>>   ExceptionTypeList '|' Identifier
>>
>> The notation #T(S1..Sm)(E1..En) indicates a function type with return
>> type T, formal parameter types S1, S2, ..., Sm, and checked exception
>> types E1, E2, ..., En.
>>      
> I think you're missing "throws".  Either that, or you haven't told us
> the relationship between this syntax and the syntax for function
> types.
>
>    
>> 'void' may be used in a function type to indicate that the body of a
>> lambda expression has no return value. This occurs if the body of the
>> lambda expression is a block that can either a) complete normally or
>> b) complete abruptly by reason other than being a return with value V.
>>      
> What about the case when there's no expression or block?  e.g. #()()
>
>    
>> [4.10.4 Subtyping among Function Types]
>>
>> #T(S1..Sm)(E1..En) is a direct supertype of #V(U1..Um)(F1..Fo) iff all
>>   of the following hold:
>> - T is a supertype of V.
>> - for i in 1..m: Ui is a supertype of Si.
>> - for j in 1..o: There exists a k in 1..n such that Ek is a supertype of Fj.
>>
>>   #Object(String,Integer) is a supertype of #Package(Object,Number).
>>   #Object(Object,Object) is also a supertype of #Package(Object,Number).
>>   #Object(Object[]) is a supertype of #Object[](Object).
>>      
> So "#float()" can be assigned from "#int()"?  It will be interesting
> to see how one can generate verifiable code for this without resorting
> to some reflection-like APIs.  Similarly, I'm surprised you allow
> assigning "#void(float)" to "#void(int)".  How will the generated code
> for the lambda know how to interpret the bits of the incoming value?
>    

I think this kind of conversion can be handled by method handles,
but these rules are alien to Java.
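
A minimal sketch of the idea, assuming the java.lang.invoke API in the
form it later shipped with JSR 292. The class WideningDemo and its
methods are invented for illustration. MethodHandle.asType adapts an
int-returning handle to a float-returning one by applying a widening
primitive conversion to the result:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class WideningDemo {
    // Stands in for the body of a lambda of type #int().
    static int answer() { return 42; }

    // Views the int-returning target as if it had type #float().
    static float callAsFloat() {
        try {
            MethodHandle intHandle = MethodHandles.lookup().findStatic(
                    WideningDemo.class, "answer", MethodType.methodType(int.class));
            // asType inserts the int -> float widening on the return value.
            MethodHandle floatHandle =
                    intHandle.asType(MethodType.methodType(float.class));
            return (float) floatHandle.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callAsFloat());
    }
}
```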

>    
>> Object is a direct supertype of any function type.
>>
>> A function type that is void (i.e. has no return type) and has formal
>> parameter types P1..Pn is a supertype of a function type #T(S1..Sn)
>> iff Si is a supertype of Pi (i in 1..n).
>>      
> So "#void()" can be assigned from #int()"?  It will be interesting to
> see how to generate verifiable JVM code for calling these (how much is
> left on the stack by invoking a lambda?).
>
>    
>> The above [SAM] definition deliberately does not treat multiple non-Object
>> abstract methods with compatible signatures as if they represented a
>> single abstract method. This reflects existing practice whereby if an
>> interface or abstract class has multiple such members, it is not
>> possible for a non-abstract class to implement the interface/extend
>> the abstract class simply by providing a single concrete method.
>>      
> That's not existing practice:
>
> interface A { void f(); }
> interface B { void f(); }
> interface C extends A, B {} // not SAM by definition
> public class D implements C { public void f() {} }
>    

It could be very useful if you want to retrofit C,
which only extends A, by adding extends B.
Without that rule, this kind of change will break source
compatibility.
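
To spell out the retrofit scenario, here is a small sketch based on
Neal's interfaces. The methods return String only so the result can be
observed, and the names RetrofitDemo and viaBoth are invented.
D was written when C extended only A, and it still compiles and works
after B is added:

```java
class RetrofitDemo {
    interface A { String f(); }
    interface B { String f(); }

    // Originally 'interface C extends A {}'; B is added later.
    interface C extends A, B {}

    // Written against the original C; one concrete f() satisfies both
    // A.f() and B.f() because their signatures are compatible.
    static class D implements C {
        @Override
        public String f() { return "D.f"; }
    }

    // Hypothetical helper: calls f() through both supertypes.
    static String viaBoth() {
        C c = new D();
        A a = c;
        B b = c;
        return a.f() + "," + b.f();
    }

    public static void main(String[] args) {
        System.out.println(viaBoth());
    }
}
```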

>    
>> A lambda conversion exists from a function type #T(S1..Sm)(E1..En) to
>> the descriptor of the target abstract method M of a SAM type, provided
>> that all of the following hold:
>>
>> - If T is not void, then T can be converted to the return type of M by
>>   assignment conversion.
>> - If T is void, then M is void or has return type java.lang.Void.
>> - M is not generic and has m formal parameters.
>> - For i in 1..m, the i'th formal parameter of M has type Si.
>> - For j in 1..n, the checked exception type Ej is a subtype of some
>>   exception type in the throws clause of M.
>>      
> + The constructor is accessible?
>
> Where in the caller are exceptions that were declared in the throws
> clause of the SAM's constructor checked?
>
> You need to specify the runtime behavior of this conversion (when the
> constructor is invoked is observable).
>
>    
>> The type of this in a lambda expression is the function type of the
>> lambda expression.
>>      
> Oh, really?  So the result type of a lambda expression can depend on
> the type of "this" (if "this" appears within a return statement).  And
> the type of "this" depends on the result type of the lambda
> expression.  Your specification gives no hint how this infinite
> regress is to be resolved.
>    

Is there a cool Hindley-Milner-like algorithm somewhere?

>    
>> Therefore, it is convenient for the body of a lambda expression to
>> have access to members of the SAM type. To achieve this, I am thinking
>> that 'this' in the body of the lambda expression may be cast to the
>> SAM type:
>>      
> It gets worse and worse!  I thought you were trying to avoid
> specifying implementation details, but this would force the objects to
> be the same.  I cannot imagine how you could make that happen without
> introducing some significant inefficiencies.  Consider:
>
> abstract class R { public abstract void run(); }
> #void() lambda = #() { R self = (R)this; ... }
> R r = lambda; // magic!
>
> Note that the object referenced by the variable r is of type "R", but
> it is also of the reference type "#void()" (it is visible with that
> static type as "this" inside the body of the lambda).  Therefore, one
> could convert lambdas from one "SAM" to a different compatible "SAM"
> using function types as intermediaries:
>
> abstract class R1 { public abstract void run(); }
> abstract class R2 { public abstract void invoke(); }
> #void() lambda = #() {}
> R1 r1 = lambda;
> R2 r2 = (#void()) r1; // magic!
>
> Note that the above cast must succeed, because the object r1 must
> dynamically be a subtype of "#void()" - otherwise, it would not be
> capable of being viewed as that type when seen as "this" inside the
> lambda.
>
> Also, this will encourage people to write casts (that might be incorrect).
>
>    
>> [5.3] Method Invocation Conversion
>>
>> Method invocation contexts allow the use of one of the following:
>> - a lambda conversion (5.1.14).
>>      
> Interesting.  How does type inference and overload resolution work
> when calling an overloaded method with an argument that is a lambda
> expression?  Type inference currently does not have any rules to
> handle these cases.
>
> For example, if I have a generic method
>
> <T>  T doit(#T() lambda) { return lambda!(); }
>
> And an invocation
>
> String s = doit(#()("foo"));
>
> There doesn't appear to be any way to infer that the type argument to
> the invocation is String.  More subtly, with
>
> <T extends Runnable>  void doit2(T t) {}
>
> How does one infer the type parameter T in the invocation
>
> doit2(#(){});
>
>    
>> A lambda invocation expression on a lambda expression of type
>> #T(S1..Sm)(X1..Xn) can throw an exception type E iff either:
>>
>> - some expression of the argument list can throw E, or
>> - there exists an i in 1..n such that Xi is E.
>>      
> + or the receiver expression can throw E?
>
>    

Rémi