list literal gotcha and suggestion
Reinier Zwitserloot
reinier at zwitserloot.com
Mon Sep 28 23:14:28 PDT 2009
If we insist on having both short-hand set and list literals, then
some people are necessarily going to be confused about the syntax,
regardless of {} being used for sets and [] being used for lists, or
vice versa, for reasons already covered: [] is lists and {} is sets in
other languages, but in java, {} is more closely associated with lists
due to array literals. There's no way out of this unless the
answer involves either eliminating set literals, or forcing the need
to mention the target type somehow: e.g. something like "Set["a", "b",
"c"]" versus "List["a", "b", "c"]" which is quite a price to pay to
eliminate the confusion. Perhaps if the literal syntax defaults to
Lists if you omit the type, this is a workable alternative.
Unfortunately, this is hard to reconcile with the parser: Is "Set[10]" to
be interpreted as: Please create a Set<Integer>, and populate it with
the number '10', or should it be interpreted as: "Set" is a variable
that points to an array, and I want the 11th item in this array?
(variables ought to start with a lowercase letter, but this is merely
convention and not something the parser can rely on). So, such a
change would probably move this proposal beyond the scope of coin.
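To make the ambiguity concrete, here is a minimal sketch in today's Java. The class and method names are made up for illustration. Nothing stops a local variable from being called "Set", so the expression Set[10] is already a valid array access:

```java
class ParseAmbiguity {
    // Returns the 11th element of an array held in a variable that is
    // legally (if unconventionally) named "Set".
    static int eleventhItem() {
        int[] Set = new int[11];
        Set[10] = 42;    // parsed as an array store, not as a set literal
        return Set[10];  // parsed as an array access
    }

    public static void main(String[] args) {
        System.out.println(eleventhItem());
    }
}
```

A "Set[10]" literal form would have to be told apart from this existing meaning.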
So, let's turn this argument on its head: Why are we trying so hard to
make set literals work? Why don't we just remove them from the
proposal? The need for them seems minor compared to lists. When the
collection size is small (below about 100), O(1) lookup performance
is irrelevant (and even if it was relevant, due to the extra
housekeeping that Sets have to do, Lists tend to actually beat Sets in
performance, even for contains(), if the list/set is small!), and yet,
if the initial list or set is being created via a literal, the list/
set will most likely remain small. If only list literals existed,
creating a set is much cleaner than what you get now:
new HashSet<>({"a", "b", "c"});
versus:
new HashSet<String>(Arrays.asList("a", "b", "c"));
This doesn't work if you reverse the scenario; you can't make lists
from set literals (as the duplicates would have already been removed
at that point). Even in this longer form of set literal, you've
eliminated the biggest problem in the status quo: A reliance on a
completely unrelated class (Arrays - what does the arrays utility
class have to do with the creation of collections from explicit
values? It SHOULD have no relation whatsoever), and extreme
wordiness - the latter partly thanks to the accepted diamond proposal.
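The asymmetry between the two directions can be sketched with the collections that exist today. The class and method names below are hypothetical, and Arrays.asList stands in for a list literal: a list can always be turned into a set, but once the values have passed through a set the duplicate is gone for good.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListVersusSet {
    // Stand-in for a list literal that contains a duplicate value.
    static List<String> original() {
        return Arrays.asList("a", "b", "a");
    }

    // List -> Set: always possible, duplicates are dropped here.
    static Set<String> toSet(List<String> list) {
        return new HashSet<String>(list);
    }

    // Set -> List: possible, but the dropped duplicate cannot come back.
    static List<String> backToList(Set<String> set) {
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        List<String> list = original();
        Set<String> set = toSet(list);
        System.out.println(list.size());            // 3
        System.out.println(set.size());             // 2
        System.out.println(backToList(set).size()); // 2
    }
}
```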
Perhaps some research needs to be done for how often "set literals"
are created now in real life java code. Search for the patterns:
new HashSet<T>(Arrays.asList(T...));
googlecollections.ImmutableSet.of(T...);
HashSet/Set<T> x = new HashSet<T>(); x.add(t); // repeated 1 or more times
If not very often, then isn't the right answer here to just leave them
out entirely, eliminating the confusion in the process?
I may have missed it, but I can't remember seeing the technical
details on this proposal. What does a list literal construct? A
mutable ArrayList, or an immutable undefined implementation of List? I
would strongly suggest these literals are immutable by default,
particularly because making them mutable is easy: new
ArrayList<>({"a", "b", "c"});.
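As a rough sketch of that behaviour with existing APIs (the class and method names are hypothetical, and Collections.unmodifiableList is only a stand-in for whatever an immutable literal would really produce): the "literal" rejects modification, while wrapping it in a new ArrayList gives a mutable copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ImmutableByDefault {
    // Assumption: an immutable list literal would behave roughly like this.
    static List<String> literal() {
        return Collections.unmodifiableList(Arrays.asList("a", "b", "c"));
    }

    // Opting in to mutability is a one-liner around the literal.
    static List<String> mutableCopy() {
        return new ArrayList<String>(literal());
    }

    // Reports whether the given list refuses to accept a new element.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("d");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsAdd(literal()));     // true
        System.out.println(rejectsAdd(mutableCopy())); // false
    }
}
```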
If only the static methods in interfaces proposal had been taken more
seriously, this could have been solved decently with a library,
especially because of the acceptance of the easier varargs invocation
proposal:
List.of("a", "b", "c");
is even better than:
["a", "b", "c"];
because it avoids the "Is it a Set or a List" issue entirely, and
doesn't require taking up valuable parser flexibility the way real
literals would.
Even in complex situations:
List<Set<String>> complicated = List.of(Set.of("a", "b", "a"), null,
Set.empty());
methodCall(List.of("a", "b", "c"));
compared to:
List<Set<String>> complicated = [{"a", "b", "a"}, null, {}];
methodCall(["a", "b", "c"]);
the interface-based static methods are probably slightly worse overall
than true literals in these slightly more complex situations, but the
difference isn't that big, and the static methods still win in
eliminating list/set confusion. Right now the static method based
solution would cause a warning that is safe to ignore, and thus
useless (try it yourself with the google collections API's "of"
methods), but as mentioned the simpler varargs invocation proposal
eliminates the unnecessary warnings, making that form a decent
alternative.
NB: I doubt it'll help at this point in time, but I'll commit to
writing up the JLS patch and a prototype compiler, delivered within a
month after greenlighting the idea. The implementation would work
along the lines of the proposal I handed in for coin to allow static
methods in interfaces without requiring any changes to the JVM. The
strategy boils down to creating a new inner class in the interface,
named "$Methods", and moving all static methods into this new class,
then requiring that these static methods are called only via their
original class name, and not on instances or subtypes (Every style
checker I know of generates a warning if you call static methods via
an instance anyway, so I don't consider this a big loss), and
translating any static method call on an interface type from
InterfaceName.methodName(params) to
InterfaceName.$Methods.methodName(params). As plenty of other languages
running on the JVM do allow it, there's a modicum of benefit for JVM
language interop if the proposal is accepted as well, by standardizing
the approach.
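A hand-written approximation of that translation, as a sketch only: "Seq" is a hypothetical interface standing in for List, the nested class is written out by hand rather than generated by a compiler, and the body of "of" is an assumption about what such a factory might do. Nested classes in interfaces are already legal, and '$' is a legal identifier character, so this compiles without any JVM change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical interface; under the proposal the source would declare
// "static <E> List<E> of(E... elements)" directly in here.
interface Seq<E> {
    E get(int index);

    // What the compiler would emit instead: a nested holder class
    // containing the interface's static methods.
    final class $Methods {
        private $Methods() {}

        @SafeVarargs // no heap pollution: the array is only read and copied
        public static <E> List<E> of(E... elements) {
            return Collections.unmodifiableList(
                    new ArrayList<E>(Arrays.asList(elements)));
        }
    }
}

class DesugarDemo {
    static List<String> build() {
        // Source form under the proposal: Seq.of("a", "b", "c")
        // Translated form, written out by hand:
        return Seq.$Methods.of("a", "b", "c");
    }

    public static void main(String[] args) {
        System.out.println(build().size()); // 3
    }
}
```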
--Reinier Zwitserloot
On 2009/29/09, at 07:34, Joshua Bloch wrote:
> Paul,
>
> On Mon, Sep 28, 2009 at 10:28 PM, Paul Benedict
> <pbenedict at apache.org>wrote:
>
>> Josh,
>>
>> I think using braces or brackets to indicate the correct type is
>> hardly
>> intuitive or easy to remember. Choosing the wrong syntax by
>> accident will
>> instantiate the wrong type, and the difference between the brace or
>> bracket
>> is pretty subtle visually.
>
>
> Usually it won't compile: you can't assign a Set to a List or vice-
> versa.
> Nick's example was carefully chosen: he invoked a constructor that
> took a
> Collection, which admits either a Set or a List.
>
>
>
>> If Java developers have to begin saying, "Which syntax do I need to
>> use for
>> a List vs. Set?", then I question the whole cost-to-benefit-ratio
>> of this
>> "small" (i.e, coin) proposal.
>
>
> Agreed, I do think the syntax we settled on is reasonably evocative,
> memorable, and consistent with other languages. Braces (AKA curly
> braces)
> are used to represent a Set in mathematical notation, and square
> brackets
> are used to index into and to declare arrays, which are list-like.
>
>
>> I can see the JDK 7 certification tests already asking this
>> question --
>> it's a good gotcha question. Not being a language expert, and
>> recognizing
>> that other languages already use what's being proposed, the syntax
>> still
>> doesn't pass my common sense meter. Do the technical justifications
>> really
>> outweigh simplicity?
>>
>
> I think it's probably the best that we can do, but I could be
> wrong. I will
> investigate other options.
>
> Josh
>