How about a combo?

Knox, Liam Liam.Knox at morganstanley.com
Fri Mar 19 16:55:01 PDT 2010


f1 = int(int x) { return x * x; };

That feels more like Java to me.

-----Original Message-----
From: lambda-dev-bounces at openjdk.java.net [mailto:lambda-dev-bounces at openjdk.java.net] On Behalf Of Pavel Minaev
Sent: Saturday, March 20, 2010 6:13 AM
To: lambda-dev at openjdk.java.net
Subject: How about a combo?

On one hand, I really like the variant of Howard's angle-brackets proposal that uses a magic type rather than # or some other Perlesque "magic symbol". For example:

    java.lang.Function<int(int) throws E1, E2> f;

Naturally you'd omit java.lang normally, as with other types from that package.

From a syntax perspective, this would mean an alternative production for what may appear inside the angle brackets of a type name. The syntax of that production is the same as the existing MethodHeader production without MethodModifiers and TypeParameters, and with the method name and all parameter names omitted. Such a production would only be allowed as a type argument for the "magic" class java.lang.Function, which does not have a single normal class declaration (much like arrays do not); this check would, of course, have to happen after the parsing stage.

Here are a few more declarations, ranging from simple to rather convoluted:

    java.lang.Function<int()> f0;
    Function<int(int)> f1;
    Function<void(int, int) throws E1, E2> f2;
    Function<Function<void(int, int) throws E1, E2>(Function<void(int, int) throws E2, E3>, int) throws E1, E3> f3;

The obvious benefit of this syntax is that it nests relatively cleanly, at least to someone familiar with the existing syntax for generics. I would also argue that playing on the similarity to generics may be a good thing here, if we treat generics as the idea of "parametrizing a family of types G with some parameter T" - it's just that, so far, T has always been a list of types, while this extends it further. The basic concept, though, remains the same - "Function" is a family in the same way e.g. "List" is, and it can be parametrized by specifying parameter and return types.

At the same time, the syntax within the angle brackets has direct and obvious correspondence to existing method declaration syntax.

The advantage over a "magic symbol" is that it better preserves the existing "Java feel" of type references, by which I mean that, so far, all Java types always begin with either one of the few primitive type keywords, or a type name (which, by established convention, is also always capitalized). A "magic symbol" breaks this, and it is precisely this which, in my opinion, creates this feeling of "unnaturalness" about many syntaxes proposed so far that use various non-alphanumeric symbols at the beginning of a function type.

Note: all syntax above is actually in Howard's proposal (with his own alternations); I didn't add anything new there. The only difference is that he used Lambda as the magic class name. I think that Function is more descriptive, in the sense that it would make all implications more obvious to someone not familiar with this syntax (but familiar with Java syntax otherwise), but this is a minor point in any case.


Now, lambdas themselves. I like one of Neal's proposals for that, the one which does away with any special symbol to mark the start of the expression (and consequently looks a lot like C#):

    f0 = int() -> 42;
    f0 = int() { return 42; };
    f1 = int(int x) -> x * x;
    f1 = int(int x) { return x * x; };
    f2 = void(int x, int y) throws Exception {
        if (x < y) throw new Exception();
    };
    f2 = void(int x, int y) throws Exception -> foo();
    f3 = Function<void(int, int) throws E1, E2>(Function<void(int, int) throws E2, E3> f, int x) throws E1, E3 {
        return void(int y, int z) throws E1, E2 {
            f(x * y, x * z);
        };
    };

The one change I'd make to his grammar is to drop the need for -> in statement lambdas (as shown in the examples above). They are unambiguously identified by "{" following ")", anyway. With that amendment, the syntax for statement lambdas directly matches the existing syntax for method declarations, except that the method name is omitted.
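
For comparison, here is a sketch of what the f1 example costs in current Java, where the nearest equivalent is an anonymous inner class. IntToInt and SquareToday are hypothetical names used only for illustration, with IntToInt standing in for Function<int(int)>; neither is part of the proposal.

```java
// Hypothetical illustration: the f1 example written as an anonymous inner class.
final class SquareToday {

    // Hypothetical single-method interface standing in for Function<int(int)>.
    interface IntToInt {
        int apply(int x);
    }

    // Roughly what "f1 = int(int x) { return x * x; };" has to look like today.
    static IntToInt square() {
        return new IntToInt() {
            public int apply(int x) {
                return x * x;
            }
        };
    }

    public static void main(String[] args) {
        IntToInt f1 = square();
        System.out.println(f1.apply(7));
    }
}
```

The body of apply is the same block as in the statement lambda; the surrounding interface name, "new" and method name are the parts the proposed syntax would leave out.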

The reason to have -> there for expression lambdas is to be able to disambiguate, in a context-free manner, lambdas with generic return types and/or arguments from expressions. E.g.:

    A<B> (C<D> e) -> ... // expression lambda
    A<B> (C<D> e) { ... // statement lambda
    A<B> (C<D> e) ... // a bunch of < and > operator calls

So far as I can see, instead of "->", the expression could be parenthesized - this is also unambiguous. However, I believe that something like "int(int x)(x * x)" is less readable than "int(int x) -> x * x". Of course, this is subjective.

Another, lengthier option is to use "return" instead of "->", e.g.:

    f1 = int(int x) return x * x;

It is even more obvious to someone familiar with existing Java syntax, and the extra 4 chars aren't that big of an issue - Java lambdas are going to be pretty verbose anyway in the absence of parameter type inference. It also avoids any possible confusion with the widespread use of -> as a member access operator in other syntactically similar languages (C, C++, C#, PHP...).

With respect to the need to perform unbounded lookahead here to determine whether something is a lambda or not: this is, so far as I know, already a problem with generic types and casts, and can be solved in exactly the same way - backtracking in a context-free parser, or contextual parsing.
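
The lookahead involved can be illustrated with a toy classifier: scan to the ")" that closes the first "(", skip an optional throws clause, and look at the next token. This is only a sketch under simplifying assumptions. LambdaShapeClassifier, Shape and the crude tokenizer are hypothetical names, and a real parser would also have to check that whatever precedes the parenthesis is a well-formed type, which is where backtracking comes in.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch (hypothetical, not from the proposal): classify input by the
// token that follows the first balanced parenthesized group.
final class LambdaShapeClassifier {

    enum Shape { EXPRESSION_LAMBDA, STATEMENT_LAMBDA, PLAIN_EXPRESSION }

    // Crude tokenizer: "->", identifier/number runs, and single punctuation characters.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<String>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '-' && i + 1 < src.length() && src.charAt(i + 1) == '>') {
                tokens.add("->");
                i += 2;
            } else if (Character.isJavaIdentifierPart(c)) {
                int start = i;
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) {
                    i++;
                }
                tokens.add(src.substring(start, i));
            } else {
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    static Shape classify(String src) {
        List<String> tokens = tokenize(src);
        int depth = 0;
        for (int i = 0; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (t.equals("(")) {
                depth++;
            } else if (t.equals(")") && depth > 0) {
                depth--;
                if (depth == 0) {
                    return shapeAfter(tokens, i + 1);
                }
            }
        }
        return Shape.PLAIN_EXPRESSION; // no balanced parameter list found
    }

    // Looks at the token at position j, first skipping an optional throws clause.
    private static Shape shapeAfter(List<String> tokens, int j) {
        if (j < tokens.size() && tokens.get(j).equals("throws")) {
            j++;
            while (j < tokens.size()
                    && !tokens.get(j).equals("->")
                    && !tokens.get(j).equals("{")) {
                j++;
            }
        }
        String next = j < tokens.size() ? tokens.get(j) : "";
        if (next.equals("->")) {
            return Shape.EXPRESSION_LAMBDA;
        }
        if (next.equals("{")) {
            return Shape.STATEMENT_LAMBDA;
        }
        return Shape.PLAIN_EXPRESSION;
    }
}
```

The distance from the start of the input to the deciding token grows with the length of the parameter list, which is why no fixed amount of lookahead is enough.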

On type inference. I think that inference for exception specifications on lambdas would be helpful, and would not negatively affect anything else. Whether this means that the "throws" clause in lambda definitions can be dropped altogether, or whether it is still worth keeping so that one can express one's intent explicitly, is an open question.

For return type inference, I don't know if it makes much sense if parameter types are not inferred, anyway. Regardless, if desired, the syntax is fairly straightforward:

    f0 = () -> 42;



I foresee one common objection to the overall combination presented above, and that is the perceived syntactic inconsistency between function types (which require Function<>) and lambda definitions (which do not). I think that this issue is not as significant as it seems, since 1) the syntaxes are still largely consistent aside from that, and 2) consistency with existing Java concepts for both syntaxes (i.e. with type syntax for function types, and with method declaration syntax for lambda definitions) is, in fact, more important.

