RFC: draft API for JEP 269 Convenience Collection Factories
Peter Levart
peter.levart at gmail.com
Fri Oct 9 12:08:24 UTC 2015
Hi,
On 10/09/2015 04:39 AM, Paul Benedict wrote:
> I don't think the statement "Creates an unmodifiable set containing X
> elements" is always true. Since sets cannot have duplicates, passing in
> X elements can give you fewer than that, based on equality. I think
> the Set docs should say "...X possible elements if unique". Wordsmith
> something better if you can, of course.
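Paul's point can be seen with the collections we already have. The following is only a sketch: it uses HashSet as a stand-in, and the class and method names (SetSizeExample, distinctCount) are made up for illustration, not part of the draft API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetSizeExample {

    // Hypothetical helper: returns how many distinct elements remain
    // after the arguments are put into a set.
    static int distinctCount(String... elements) {
        Set<String> set = new HashSet<>(Arrays.asList(elements));
        return set.size();
    }

    public static void main(String[] args) {
        // Three arguments are passed, but only two are distinct.
        System.out.println(distinctCount("a", "b", "a")); // prints 2
    }
}
```

So "containing X elements" only holds if the arguments are pairwise unequal, or if the factory rejects duplicates outright.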
The same goes for Map.of(....).
The question is whether the factories should deduplicate the elements /
keys or throw IllegalArgumentException.
In the former case, which element / entry should they keep: the one
appearing first or last in the source?
For example:
Map<String, Integer> map = Map.of("a", 1, "a", 2);
System.out.println(map);
What should the result be?
1. {a=1}
2. {a=2}
3. IllegalArgumentException
I don't have a preference, but I think it should be specified.
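To make the three options concrete, here is a minimal sketch of each policy written against existing java.util classes. Everything in it is an assumption for illustration: the class DuplicateKeyPolicies and the methods entry, firstWins, lastWins and rejectDuplicates are hypothetical names, and this is not how the draft factories are implemented.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeyPolicies {

    // Hypothetical convenience for building an immutable key/value pair.
    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Option 1: keep the value that appears first in the source.
    static <K, V> Map<K, V> firstWins(List<Map.Entry<K, V>> entries) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (!map.containsKey(e.getKey())) {
                map.put(e.getKey(), e.getValue());
            }
        }
        return map;
    }

    // Option 2: keep the value that appears last in the source.
    static <K, V> Map<K, V> lastWins(List<Map.Entry<K, V>> entries) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            map.put(e.getKey(), e.getValue()); // later entries overwrite
        }
        return map;
    }

    // Option 3: treat a duplicate key as an error.
    static <K, V> Map<K, V> rejectDuplicates(List<Map.Entry<K, V>> entries) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (map.containsKey(e.getKey())) {
                throw new IllegalArgumentException("duplicate key: " + e.getKey());
            }
            map.put(e.getKey(), e.getValue());
        }
        return map;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> source =
            Arrays.asList(entry("a", 1), entry("a", 2));

        System.out.println(firstWins(source)); // prints {a=1}
        System.out.println(lastWins(source));  // prints {a=2}
        try {
            rejectDuplicates(source);
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage()); // prints duplicate key: a
        }
    }
}
```

Whichever of these is chosen, the observable difference is exactly the one in the three outcomes listed above, which is why the specification should say which one applies.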
Regards, Peter
>
> Cheers,
> Paul
>
> On Thu, Oct 8, 2015 at 6:39 PM, Stuart Marks <stuart.marks at oracle.com>
> wrote:
>
>> Hi all,
>>
>> Please review and comment on this draft API for JEP 269, Convenience
>> Collection Factories. For this review I'd like to focus on the API, and set
>> aside implementation issues and discussion for later.
>>
>>
>> JEP:
>>
>> http://openjdk.java.net/jeps/269
>>
>> javadoc:
>>
>>
>> http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.mod/
>>
>> specdiff:
>>
>>
>> http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.specdiff/overview-summary.html
>>
>>
>> Most of the API is pretty straightforward, with fixed-arg and varargs
>> "of()" factories for List, Set, ArrayList, and HashSet; and with fixed-arg
>> "of()" factories and varargs "ofEntries()" factories for Map and HashMap.
>>
>> There are a few issues on which I'd like to solicit discussion.
>>
>> 1. Number of fixed arg overloads.
>>
>> I've somewhat arbitrarily provided up to 5 fixed-arg overloads for the
>> lists and sets, and up to 8 pairs for the fixed-arg map factories. The
>> rationale for 8 pairs is that there are 8 primitives, and various language
>> processing tools often have maps for the primitive types. (But such tools
>> also often need to handle the Void type, which exceeds the limit of 8. So
>> this might need to change if we want to follow this rationale.)
>>
>> I also note that Guava's immutable factories provide 11 fixed-arg
>> overloads for list, 5 for set, and 5 pairs for map. I'd be curious as to
>> the rationale for this, and whether it also would apply to the JDK.
>>
>> 2. Other concrete collection factories.
>>
>> I've chosen to provide factories for the concrete collections ArrayList,
>> HashSet, and HashMap, since those seem to be the most commonly used. Is
>> there a need to provide factories for other concrete collections, such as
>> LinkedHashMap?
>>
>> 3. Duplicate handling.
>>
>> My current thinking is for the Set and Map factories to throw
>> IllegalArgumentException if a duplicate element or key is detected. The
>> current draft specification is silent on this point. It needs to be
>> specified, one way or another.
>>
>> The rationale for throwing an exception is that if these factories are
>> used in a "literal like" fashion, then having a duplicate is almost
>> certainly a programming error. Consider this example:
>>
>>     Map<String,TypeUse> m = Map.ofEntries(
>>         entry("CDATA", CBuiltinLeafInfo.NORMALIZED_STRING),
>>         entry("ENTITY", CBuiltinLeafInfo.TOKEN),
>>         entry("ENTITIES", CBuiltinLeafInfo.STRING.makeCollection()),
>>         entry("ENUMERATION", CBuiltinLeafInfo.STRING.makeCollection()),
>>         entry("NMTOKEN", CBuiltinLeafInfo.TOKEN),
>>         entry("NMTOKENS", CBuiltinLeafInfo.STRING.makeCollection()),
>>         entry("ID", CBuiltinLeafInfo.ID),
>>         entry("IDREF", CBuiltinLeafInfo.IDREF),
>>         entry("IDREFS",
>>             TypeUseFactory.makeCollection(CBuiltinLeafInfo.IDREF)),
>>         entry("ENUMERATION", CBuiltinLeafInfo.TOKEN));
>>
>> (derived from [1])
>>
>> If duplicates were silently ignored, this might result in hard-to-spot
>> errors.
>>
>> There's also the matter of which value ends up being used in the case of
>> duplicate map keys, and whether this should be specified. A fairly obvious
>> policy would be "last one wins" but I'm reluctant to specify that, as it
>> starts to place unnecessary constraints on implementations. However, the
>> alternative of leaving it unspecified is also unpalatable.
>>
>> I'm aware that very few programming systems with similar constructs will
>> signal an error on duplicate elements. Python, Ruby, Groovy, Scala, and
>> Perl all seem to allow duplicates in maps or equivalent, apparently with a
>> last-wins policy. (Though sometimes it's hard to tell if the policy is
>> specified.)
>>
>> The only system I've been able to find that explicitly rejects duplicates
>> is Clojure, and this policy isn't without controversy. [2] The main
>> rationale is to prevent programming errors.
>>
>> There is a Python bug [3] where it was proposed that duplicates in a dict
>> should raise an error or warning, also in order to catch programming
>> errors. The request was rejected, not necessarily because it was a bad
>> idea, but primarily because it would be a backward incompatible change.
>>
>> The easiest thing to do would simply be to require last-wins, since
>> "everybody else is doing it" ... but that doesn't mean it's right. Since
>> we're introducing a new API here, there is no compatibility issue. Throwing
>> an exception for duplicates seems like a good way to prevent a certain
>> class of programming errors.
>>
>> What do people think?
>>
>> s'marks
>>
>> [1]
>> http://hg.openjdk.java.net/jdk8/jdk8/jaxws/file/d03dd22762db/src/share/jaxws_classes/com/sun/tools/internal/xjc/reader/dtd/TDTDReader.java#l420
>>
>> [2]
>> http://dev.clojure.org/display/design/Allow+duplicate+map+keys+and+set+elements
>>
>> [3] https://bugs.python.org/issue16385
>>
>>