Method reference double-colon syntax
Brian Goetz
brian.goetz at oracle.com
Wed May 30 19:43:28 PDT 2012
> I checked. The compiler changes occurred in February but no
> justification given. I am not looking to judge the justification; just
> wondering how they justified :: that was better than #.
There were a number of reasons. No one thinks :: is perfect, but # is
worse, and there were many other alternatives explored that were
rejected for various technical reasons. (Others may disagree with the
conclusions or the reasons -- you have a right to -- but let's not dive
down that rathole.)
The prototype implementation used the following syntax for method
references: Foo#bar. This made sense mostly because the lambda syntax
used #, and secondarily (way distant second) that the Javadoc syntax for
methods was Foo#bar.
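
For readers who have not run into it, the Javadoc use of # looks roughly like the sketch below. The class and method names (JavadocHashSketch, lengthOf) are made up for illustration; only the Type#member notation inside the doc comment is the point.

```java
public class JavadocHashSketch {
    /**
     * Returns the length of the given string.
     *
     * The tags below use the Javadoc Type#member notation to point at
     * methods of other classes.
     *
     * @see String#length()
     * @see java.util.List#size()
     */
    public static int lengthOf(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("abc"));
    }
}
```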
With the change in lambda syntax to one that does not use a #, the first
no longer applies. The Javadoc justification is in actuality pretty
weak; I would hazard that only a few percent of Java developers have
used it, and if you quizzed developers on "what is the pound symbol used
for in Java", most would say "nothing". The Javadoc use is more of a
post-hoc justification than a real reason to choose it.
It would have been desirable for the lambda syntax to be similar to the
method reference syntax. Foo->bar seems really good, until you realize
it is unworkable; with the optional generic type arguments (on both
sides of the delimiter, just like with dot), you could get Foo-><T>bar
or Foo<T>->bar or, worst of all, Foo<T>-><U>bar. Oops. (Maybe with good syntax
coloring that might be readable.)
It turns out that the obvious prefix syntaxes, such as &Foo.bar, were
syntactically ambiguous with other constructs. Other obvious candidates
for infix delimiters (including : and .) also fail for various reasons.
There were many other candidates proposed, such as backtick (`Foo.bar)
or compound infix delimiters (Foo&.bar), but no one could get behind these.
That doesn't leave many credible choices.
Using # as an infix syntax is questionable. While there is room for
opinion to vary, # is more of a "naturally prefix" syntax than a
"naturally infix" syntax, whereas :: is more of a "naturally infix"
syntax. (Look at their use in other languages.) The real knock against
:: is "ugh, looks like C++", but if we ruled out any construct that was
used in any other language with which people might have had a bad
experience, we'd need to get much bigger keyboards. (And, in actuality,
using :: in method references is related to namespacing, so the C++
connection might actually be a positive hint.)
Given all this, the deciding factor turned out to be global syntactic
real estate management. There is a significant hidden cost to picking a
syntax: using a syntactic form for one feature may foreclose on using
elements of that form for other, possibly more "deserving" features. As
stewards, we have a responsibility not only to deliver language
improvements that use pleasant syntax, but also to do global syntactic
real estate management, otherwise we cripple our ability to continue to
evolve the language without being undermined by silly syntactic
roadblocks. There are darn few unsullied characters left; # is one of
the few good ones.
Using # for both lambdas and method references might have made sense; then # can
be thought of as the "delayed evaluation" operator, and the two uses
reinforce each other. That's at least a biggish payback for using our
last bit of virgin symbology. Using it for method references only is
less of a payback. Far less. Like paying for a hamburger with a Rolex.
To illustrate what I mean by "we could do better", here's an alternative
proposal that gets far more mileage out of #: structured literals.
While these plans are not in place for 8, we have already stated our
desire to add structured literals for lists, maps, sets, etc. # as a
prefix symbol, combined with delimiters, gives us a far higher
return-on-syntax as a structured literal builder. (As a bonus, # is
already associated with structured literals in a lot of languages, going
all the way back to many early assembly languages where # was the
immediate addressing mode.) For example:
    #[ 1, 2, 3 ]                            // Array, list, set
    #{ "foo" : "bar", "blah" : "wooga" }    // Map literals
    #/(\d+)$/                               // Regex
    #(a, b)                                 // Tuple
    #(a: 3, b: 4)                           // Record
    #"There are {foo.size()} foos"          // String literal
Not that we'd embrace all of these immediately (or ever), but the point
is: there's a lot of room for expansion, unlike using # for MRs only.
(We can use target typing to distinguish between array, set, and list
literals, and even JSON literals.) Much more bang for the pound.
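
The #[...] forms above are hypothetical and will not compile, so they cannot be demonstrated directly. As a rough sketch of the target-typing mechanism they would lean on, here is the same expression taking two different types depending on what it is assigned to. This assumes the java.util.function interfaces from the Java 8 library; the class name TargetTypingSketch is made up for illustration.

```java
import java.util.function.Function;
import java.util.function.ToIntFunction;

public class TargetTypingSketch {
    public static int viaFunction(String s) {
        // Target type Function<String, Integer>: the int result is boxed.
        Function<String, Integer> f = String::length;
        return f.apply(s);
    }

    public static int viaToIntFunction(String s) {
        // Same expression, different target type: the result stays an int.
        ToIntFunction<String> g = String::length;
        return g.applyAsInt(s);
    }

    public static void main(String[] args) {
        System.out.println(viaFunction("pound"));
        System.out.println(viaToIntFunction("pound"));
    }
}
```

The expression String::length has no type of its own; the compiler picks one from the assignment context, which is the property a target-typed literal would presumably need as well.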
The :: infix syntax:
    ClassName::methodName
    ClassName<T>::methodName
    ClassName::<U>genericMethodName
works acceptably well. Some people like it, and some people hate it --
just like #. There's never going to be a perfect syntax for anything
that makes everyone jump up in unison and say "yeah, that's it!" But ::
is OK, and using up :: here is far better than using up #. (And, while
this might look a little weird to C++ programmers, the overlap between
the Java and C++ developer bases at this point is small enough that I
don't think we should be too worried about that.)
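
As a concrete illustration of the three forms listed above, here is a minimal sketch. It assumes the functional interfaces in java.util.function (Function, Supplier), which this message does not mention; the class name MethodRefForms and the constant names are made up for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

public class MethodRefForms {
    // ClassName::methodName: unbound reference to an instance method.
    public static final Function<String, Integer> LENGTH = String::length;

    // ClassName<T>::methodName: type arguments on the left of the delimiter.
    public static final Function<List<String>, Integer> SIZE = List<String>::size;

    // ClassName::<U>genericMethodName: type arguments on the right.
    public static final Supplier<List<String>> EMPTY = Collections::<String>emptyList;

    public static void main(String[] args) {
        System.out.println(LENGTH.apply("lambda"));
        System.out.println(SIZE.apply(Arrays.asList("a", "b")));
        System.out.println(EMPTY.get().isEmpty());
    }
}
```

In practice the explicit type arguments are rarely needed, since inference usually fills them in; they are spelled out here only to show where they sit relative to the :: delimiter.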