list literal gotcha and suggestion

Lawrence Kesteloot lk at teamten.com
Mon Sep 28 23:33:10 PDT 2009


In support of Reinier's point is that Python has no set literal. They
discussed adding it to Python 3, but I don't think it made it in.
Removing it fixes the ambiguity about the empty braces {} being a set
or a map.

Lawrence



On Mon, Sep 28, 2009 at 11:14 PM, Reinier Zwitserloot
<reinier at zwitserloot.com> wrote:
> If we insist on having both short-hand set and list literals, then
> some people are necessarily going to be confused about the syntax,
> regardless of {} being used for sets and [] being used for lists, or
> vice versa, for reasons already covered: [] is lists and {} is sets in
> other languages, but in java, {} is more closely associated with lists
> due to array literals. There's no answer out of this unless this
> answer involves either eliminating set literals, or forcing the need
> to mention the target type somehow: e.g. something like "Set["a", "b",
> "c"]" versus "List["a", "b", "c"]" which is quite a price to pay to
> eliminate the confusion. Perhaps if the literal syntax defaults to
> Lists if you omit the type, this is a workable alternative.
> Unfortunately, this is hard to reconcile with the parser: Is "Set[10]" to
> be interpreted as: Please create a Set<Integer>, and populate it with
> the number '10', or should it be interpreted as: "Set" is a variable
> that points to an array, and I want the 11th item in this array?
> (variables ought to start with a lowercase letter, but this is merely
> convention and not something the parser can rely on). So, such a
> change would probably move this proposal beyond the scope of coin.
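
To make the parser problem concrete, here is a minimal sketch showing that
"Set[10]" is already a legal expression in today's Java whenever "Set"
happens to be an array variable. The class and method names are made up for
illustration and are not part of any proposal.

```java
// Hypothetical example: a local variable that ignores the lowercase
// naming convention, which the compiler does not enforce.
class SetIndexAmbiguity {
    static int eleventhItem() {
        int[] Set = new int[11];   // legal: "Set" is just an identifier here
        Set[10] = 42;              // array store into index 10
        return Set[10];            // parsed as an array access, not a literal
    }

    public static void main(String[] args) {
        System.out.println(eleventhItem());
    }
}
```

Any "Type[elements]" literal syntax would have to be distinguished from this
existing meaning.
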
>
> So, let's turn this argument on its head: Why are we trying so hard to
> make set literals work? Why don't we just remove them from the
> proposal? The need for them seems minor compared to lists. When the
> collection size is small (below about 100), O(1) lookup performance
> is irrelevant (and even if it was relevant, due to the extra
> housekeeping that Sets have to do, Lists tend to actually beat Sets in
> performance, even for contains(), if the list/set is small!), and yet,
> if the initial list or set is being created via a literal, the list/
> set will most likely remain small. If only list literals existed,
> creating a set is much cleaner than what you get now:
>
> new HashSet<>({"a", "b", "c"});
>
> versus:
>
> new HashSet<String>(Arrays.asList("a", "b", "c"));
>
> This doesn't work if you reverse the scenario; you can't make lists
> from set literals (as the duplicates would have already been removed
> at that point). Even in this longer form of set literal, you've
> eliminated the biggest problem in the status quo: A reliance on a
> completely unrelated class (Arrays - what does the arrays utility
> class have to do with the creation of collections from explicit
> values? It SHOULD have no relation whatsoever), and extreme
> wordiness - partly because of the accepted diamond proposal.
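
A rough sketch of the status-quo idiom, and of why the conversion only works
in one direction. It uses only java.util classes that exist today; the class
name and the setOf helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetFromListSketch {
    // Status quo: building a set goes through the Arrays utility class.
    static Set<String> setOf(String... values) {
        return new HashSet<String>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "a"); // keeps the duplicate
        Set<String> set = new HashSet<String>(list);      // duplicate collapses
        List<String> back = new ArrayList<String>(set);   // duplicate stays lost
        System.out.println(list.size() + " " + set.size() + " " + back.size());
    }
}
```

Going from a list to a set loses nothing that the set would have kept, but a
list rebuilt from the set cannot recover the second "a".
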
>
> Perhaps some research needs to be done for how often "set literals"
> are created now in real life java code. Search for the patterns:
>
> new HashSet<T>(Arrays.asList(T...));
> googlecollections.ImmutableSet.of(T...);
> HashSet/Set<T> x = new HashSet<T>(); x.add(t); //repeated 1 or more
> times.
>
> If not very often, then isn't the right answer here to just leave them
> out entirely, eliminating the confusion in the process?
>
> I may have missed it, but I can't remember seeing the technical
> details on this proposal. What does a list literal construct? A
> mutable ArrayList, or an immutable undefined implementation of List? I
> would strongly suggest these literals are immutable by default,
> particularly because making them mutable is easy: new
> ArrayList<>({"a", "b", "c"});.
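
A small sketch of the "immutable by default, mutable by copying" idea using
existing library calls. The literal() helper is only a stand-in for whatever
a list literal would construct, which is an assumption on my part since the
proposal details are not quoted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ImmutableByDefaultSketch {
    // Stand-in for an immutable list literal (assumption, not the proposal).
    static List<String> literal(String... values) {
        return Collections.unmodifiableList(
                new ArrayList<String>(Arrays.asList(values)));
    }

    // Returns true if the list refuses modification.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixed = literal("a", "b", "c");
        List<String> mutable = new ArrayList<String>(fixed); // explicit copy
        mutable.add("d");
        System.out.println(rejectsAdd(fixed) + " " + mutable.size());
    }
}
```

The copy constructor is the only extra step needed to get a mutable list,
which is the cheap direction Reinier describes.
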
>
> If only the static methods in interfaces proposal had been taken more
> seriously, this could have been solved decently with a library,
> especially because of the acceptance of the easier varargs invocation
> proposal:
>
> List.of("a", "b", "c");
>
> is even better than:
>
> ["a", "b", "c"];
>
> because it avoids the "Is it a Set or a List" issue entirely, and
> doesn't require taking up valuable parser flexibility the way real
> literals would.
>
> Even in complex situations:
>
> List<Set<String>> complicated = List.of(Set.of("a", "b", "a"), null,
> Set.empty());
>
> methodCall(List.of("a", "b", "c"));
>
> compared to:
>
> List<Set<String>> complicated = [{"a", "b", "a"}, null, {}];
> methodCall(["a", "b", "c"]);
>
> the interface-based static methods are probably slightly worse overall
> than true literals in these slightly more complex situations, but the
> difference isn't that big, and the static methods still win in
> eliminating list/set confusion. Right now the static method based
> solution would cause a warning you can safely ignore, and is thus
> useless (try it yourself with the google collections API's "of"
> methods), but as mentioned the simpler varargs invocation proposal
> eliminates the unnecessary warnings, making that form a decent
> alternative.
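
For comparison, a library-only approximation of the factory methods can be
sketched in a plain helper class. Literals, listOf and setOf are hypothetical
names standing in for static methods on List and Set, and the sketch assumes
a compiler that supports @SafeVarargs (Java 7 or later) to silence the
generic varargs warning at call sites.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; not an existing JDK or google collections API.
final class Literals {
    private Literals() {}

    @SafeVarargs
    static <T> List<T> listOf(T... items) {
        return Collections.unmodifiableList(
                new ArrayList<T>(Arrays.asList(items)));
    }

    @SafeVarargs
    static <T> Set<T> setOf(T... items) {
        // LinkedHashSet keeps insertion order and drops duplicates.
        return Collections.unmodifiableSet(
                new LinkedHashSet<T>(Arrays.asList(items)));
    }

    public static void main(String[] args) {
        List<Set<String>> complicated =
                listOf(setOf("a", "b", "a"), null, Literals.<String>setOf());
        System.out.println(complicated);
    }
}
```

The method name at each call site says whether a list or a set is being
built, so the brace-versus-bracket question never comes up.
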
>
> NB: I doubt it'll help at this point in time, but I'll volunteer to
> write up the JLS patch and a prototype compiler, delivered within a
> month after greenlighting the idea. The implementation would work
> along the lines of the proposal I handed in for coin to allow static
> methods in interfaces without requiring any changes to the JVM. The
> strategy boils down to creating a new inner class in the interface,
> named "$Methods", and moving all static methods into this new class,
> then requiring that these static methods are called only via their
> original class name, and not on instances or subtypes (Every style
> checker I know of generates a warning if you call static methods via
> an instance anyway, so I don't consider this a big loss), and
> translating any static method call on an interface type from
> InterfaceName.methodName(params) to
> InterfaceName.$Methods.methodName(params). As plenty of other languages
> running on the JVM do allow it, there's a modicum of benefit for JVM
> language interop if the proposal is accepted as well, by standardizing
> the approach.
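
The desugaring described above can be approximated by hand in current Java.
This is only my reading of the strategy, written out manually; Shape, square
and the demo class are invented names, and a real implementation would have
the compiler generate the nested class rather than the programmer.

```java
// Hand-written approximation of the proposed translation.
interface Shape {
    double area();

    // What the compiler would generate to hold the interface's static
    // methods. A class nested in an interface is implicitly public static.
    final class $Methods {
        private $Methods() {}

        static Shape square(final double side) {
            return new Shape() {
                public double area() {
                    return side * side;
                }
            };
        }
    }
}

class StaticInterfaceMethodDemo {
    static double demo() {
        // Source form under the proposal would be: Shape.square(3.0)
        // Translated form, as written out here by hand:
        return Shape.$Methods.square(3.0).area();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Nothing here needs JVM support, since nested classes inside interfaces are
already legal.
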
>
>  --Reinier Zwitserloot
>
> On 2009/29/09, at 07:34, Joshua Bloch wrote:
>
>> Paul,
>>
>> On Mon, Sep 28, 2009 at 10:28 PM, Paul Benedict
>> <pbenedict at apache.org> wrote:
>>
>>> Josh,
>>>
>>> I think using braces or brackets to indicate the correct type is hardly
>>> intuitive or easy to remember. Choosing the wrong syntax by accident will
>>> instantiate the wrong type, and the difference between the brace or bracket
>>> is pretty subtle visually.
>>
>>
>> Usually it won't compile: you can't assign a Set to a List or vice-versa.
>> Nick's example was carefully chosen: he invoked a constructor that took a
>> Collection, which admits either a Set or a List.
>>
>>
>>
>>> If Java developers have to begin saying, "Which syntax do I need to use for
>>> a List vs. Set?", then I question the whole cost-to-benefit-ratio of this
>>> "small" (i.e, coin) proposal.
>>
>>
>> Agreed, I do think the syntax we settled on is reasonably evocative,
>> memorable, and consistent with other languages.  Braces (AKA curly braces)
>> are used to represent a Set in mathematical notation, and square brackets
>> are used to index into and to declare arrays, which are list-like.
>>
>>
>>> I can see the JDK 7 certification tests already asking this question --
>>> it's a good gotcha question. Not being a language expert, and recognizing
>>> that other languages already use what's being proposed, the syntax still
>>> doesn't pass my common sense meter. Do the technical justifications really
>>> outweigh simplicity?
>>>
>>
>> I think it's probably the best that we can do, but I could be wrong.  I will
>> investigate other options.
>>
>>             Josh
>>
>
>
>


