How about a combo?

Fri Mar 19 14:12:48 PDT 2010

On one hand, I really like Howard's angle-brackets proposal, in the
variant where a magic type is used rather than # or some other
Perlesque "magic symbol". For example:

    java.lang.Function<int(int) throws E1, E2> f;

Normally you'd omit java.lang, of course, as with other types from that package.

From a syntax perspective, this would mean an alternative production
for what may appear inside the angle brackets of a type name. The
syntax of that production is the same as the existing MethodHeader
production without MethodModifiers and TypeParameters, and with the
method name and all parameter names omitted. Such a production would
only be allowed as a type argument for the "magic" class
java.lang.Function, which does not have a single normal class
declaration (much like arrays do not) - this check would, of course,
have to happen after the parsing stage.

Here are a few more declarations showcasing both simple and variously
convoluted cases:

    java.lang.Function<int()> f0;
    Function<int(int)> f1;
    Function<void(int, int) throws E1, E2> f2;
    Function<Function<void(int, int) throws E1, E2>(
        Function<void(int, int) throws E2, E3>, int) throws E1, E3> f3;

The obvious benefit of this syntax is that it nests relatively
cleanly, at least to someone familiar with the existing syntax for
generics. I would also argue that playing on the similarity to
generics may be a good thing here, if we think of generics as the idea
of "parametrizing a family of types G with some parameter T" - it's
just that, so far, T has always been a list of types, while this
extends it further. The basic concept, though, remains the same -
"Function" is a family in the same way that e.g. "List" is, and it can
be parametrized by specifying parameter and return types.

At the same time, the syntax within the angle brackets has direct and
obvious correspondence to existing method declaration syntax.
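
To make that correspondence a bit more concrete, here is a rough
sketch of what one member of the Function family might look like if
it were written out by hand in present-day Java. The names IntToInt,
invoke and FunctionTypeSketch are placeholders of mine and not part of
Howard's proposal, which would not need any such declaration to exist.

```java
// Hypothetical hand-written stand-in for the proposed type
//     Function<int(int) throws E>
// IntToInt and invoke are placeholder names, not part of any proposal.
interface IntToInt<E extends Exception> {
    // Same shape as the text inside the angle brackets: int(int) throws E
    int invoke(int x) throws E;
}

class FunctionTypeSketch {
    // A value of that type that squares its argument, written as an
    // anonymous class because that is all current Java offers.
    static IntToInt<RuntimeException> square() {
        return new IntToInt<RuntimeException>() {
            public int invoke(int x) {
                return x * x;
            }
        };
    }

    public static void main(String[] args) {
        IntToInt<RuntimeException> f1 = square();
        System.out.println(f1.invoke(7)); // prints 49
    }
}
```

Every distinct signature needs its own interface in this emulation,
which is exactly the boilerplate a single parametrized Function family
would remove.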

The advantage over a "magic symbol" is that it better preserves the
existing "Java feel" of type references, by which I mean that, so far,
all Java types always begin with either one of the few primitive type
keywords, or a type name (which, by established convention, is also
always capitalized). A "magic symbol" breaks this, and it is precisely
this which, in my opinion, creates the feeling of "unnaturalness"
about many of the syntaxes proposed so far that put a non-alphanumeric
symbol at the beginning of a function type.

Note: all syntax above is actually in Howard's proposal (with his own
alternations), I didn't add anything new there. The only difference is
that he used Lambda as the magic class name. I think that Function is
more descriptive, in the sense that it would make all implications more
obvious to someone not familiar with this syntax (but familiar with
Java syntax otherwise), but this is a minor point in any case.

Now, lambdas themselves. I like one of Neal's proposals for that, the
one which dispenses with any special symbol to mark the start of
the expression (and consequently looks a lot like C#):

    f0 = int() -> 42;
    f0 = int() { return 42; };
    f1 = int(int x) -> x * x;
    f1 = int(int x) { return x * x; };
    f2 = void(int x, int y) throws Exception {
        if (x < y) throw new Exception();
    };
    f2 = void(int x, int y) throws Exception -> foo();
    f3 = Function<void(int, int) throws E1, E2>(
            Function<void(int, int) throws E2, E3> f, int x) throws E1, E3 {
        return void(int y, int z) throws E1, E2 {
            f(x * y, x * z);
        };
    };

The only change I'd make to his grammar is to drop the -> for
statement lambdas (as the examples above already do). They are
unambiguously identified by "{" following ")", anyway. With that
amendment, the syntax for statement lambdas directly matches the
existing syntax for method declarations, except that the method name
is omitted.

The reason to have -> there for expression lambdas is to be able to
disambiguate, in a context-free manner, lambdas with generic return
types and/or arguments from expressions. E.g.:

    A<B> (C<D> e) -> ... // expression lambda
    A<B> (C<D> e) { ... // statement lambda
    A<B> (C<D> e) ... // a bunch of < and > operator calls

So far as I can see, instead of using "->", the expression could be
parenthesized - this is also unambiguous. However, I believe that
something like "int(int x)(x * x)" is less readable than "int(int x)
-> x * x". Of course, this is subjective.

Another, lengthier option is to use "return" instead of "->", e.g.:

    f1 = int(int x) return x * x;

It is even more obvious to someone familiar with existing Java syntax,
and the extra 4 chars aren't that big of an issue - Java lambdas are
going to be pretty verbose anyway in the absence of parameter type
inference. And it avoids any possible confusion with the widespread
use of -> as a member access operator in other syntactically similar
languages (C, C++, C#, PHP...).

With respect to the need to perform unbounded lookahead here to
determine if something is a lambda or not, this is, so far as I know,
already a problem with generic types & casts, and can be solved in
exactly the same way - backtracking in a context-free parser, or
contextual parsing.
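
As a rough illustration of the lookahead involved, here is a toy
classifier that scans to the ")" closing the first parenthesized group
and then looks at the next token. It is only a sketch under heavy
assumptions: the class and method names are made up, it works on raw
strings rather than a real token stream, and it ignores throws
clauses, comments and string literals.

```java
// Toy illustration only: decides how a snippet would be read by scanning
// to the ")" that closes the first parenthesized group and inspecting
// what follows it. Class and method names are hypothetical.
class LambdaLookaheadSketch {
    static String classify(String src) {
        int open = src.indexOf('(');
        if (open < 0) {
            return "expression"; // no parameter list at all
        }
        int depth = 0;
        int i = open;
        // Unbounded lookahead: the group may nest arbitrarily deep.
        for (; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                break; // i now points at the matching ")"
            }
        }
        if (i >= src.length()) {
            return "expression"; // unbalanced parentheses, give up
        }
        String rest = src.substring(i + 1).trim();
        if (rest.startsWith("->")) {
            return "expression lambda";
        }
        if (rest.startsWith("{")) {
            return "statement lambda";
        }
        return "expression";
    }

    public static void main(String[] args) {
        System.out.println(classify("A<B> (C<D> e) -> e"));
        System.out.println(classify("A<B> (C<D> e) { return e; }"));
        System.out.println(classify("A<B> (C<D> e) + 1"));
    }
}
```

A real parser would do the same thing on tokens, either by
backtracking when the guess turns out wrong or by carrying enough
context to know in advance that a lambda is permitted at that point.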

On type inference. I think that inference for exception specifications
on lambdas would be helpful, and would not negatively affect anything
else. Whether this means that the "throws" clause in lambda
definitions can be dropped altogether, or whether keeping it as a way
to express one's intent explicitly is still a good idea, is an open
question.

For return type inference, I don't know if it makes much sense if
parameter types are not inferred, anyway. Regardless, if desired, the
syntax is fairly straightforward:

    f0 = () -> 42;

I foresee one common objection to the overall combination presented
above, and that is the perceived syntactic inconsistency between
function types (which require Function<>) and lambda definitions
(which do not). I think that this issue is not as significant as it
seems, since 1) the syntaxes are still largely consistent aside from
that, and 2) consistency with existing Java concepts for both syntaxes
(i.e. with type syntax for function types, and with method declaration
syntax for lambda definitions) is, in fact, more important.