Additional Collectors

Brian Goetz brian.goetz at oracle.com
Wed Apr 3 12:13:47 PDT 2013


The short answer is: we have deliberately moved away from trying to 
capture these properties statically.

Not only does having UniqueStream and ParallelStream and SortedStream 
result in a combinatorial explosion of interfaces 
(IntUniqueParallelSortedStream), but most of the raw information needed 
to get static safety does not actually exist in the static type system.
For example, it is quite common that you know that an array contains
no duplicates or is sorted, but you'd have to do some sort of "cast" to
teach the static type system that:

   Arrays.stream(array)
         .pretendIAmUnique()
         .pretendIAmSorted()
         ...

And the pretendXxx() call here is like an unsafe cast; the compiler 
cannot verify that you are doing so safely.  So we're back to dynamic 
detection, with a little extra documentation, and a monstrous API bloat 
as the cost of that extra documentation.
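
To make "dynamic detection" concrete, here is a minimal sketch. It
assumes the Spliterator characteristic flags found in java.util as it
eventually shipped in Java 8, which this thread does not itself specify;
the class and method names (DynamicProperties, reportsDistinct,
reportsSorted) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

// Hypothetical helper: asks a source at run time what it knows about itself.
final class DynamicProperties {

    // True only if the source itself reports that it has no duplicates.
    static boolean reportsDistinct(Collection<?> c) {
        return c.spliterator().hasCharacteristics(Spliterator.DISTINCT);
    }

    // True only if the source itself reports a sorted encounter order.
    static boolean reportsSorted(Collection<?> c) {
        return c.spliterator().hasCharacteristics(Spliterator.SORTED);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);

        // Same elements, three different sources, three different answers.
        System.out.println(reportsDistinct(new HashSet<>(values)));
        System.out.println(reportsSorted(new TreeSet<>(values)));
        System.out.println(reportsDistinct(new ArrayList<>(values)));
    }
}
```

A HashSet reports DISTINCT and a TreeSet reports both DISTINCT and
SORTED, but the ArrayList reports neither, even though its contents
happen to be unique and sorted.  That knowledge lives in the
programmer's head, not in any type.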

Instead, we can provide canned merge policies such as "throw", "first 
wins", and "last wins".
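
A minimal sketch of what such canned policies might look like.  It
assumes the four-argument Collectors.toMap(keyMapper, valueMapper,
mergeFunction, mapSupplier) from the API as it eventually shipped in
Java 8, which differs from the signature quoted below; the names
MergePolicies, throwing, firstWins, lastWins, and byLength are
hypothetical, not part of any proposal here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical holder for three canned merge policies.
final class MergePolicies {

    // "throw": reject any duplicate key.
    static <U> BinaryOperator<U> throwing() {
        return (a, b) -> {
            throw new IllegalStateException(
                "Duplicate key for values " + a + " and " + b);
        };
    }

    // "first wins": keep the value already in the map.
    static <U> BinaryOperator<U> firstWins() {
        return (a, b) -> a;
    }

    // "last wins": replace it with the newly encountered value.
    static <U> BinaryOperator<U> lastWins() {
        return (a, b) -> b;
    }

    // Illustration only: key each word by its length, so that words of
    // equal length collide and the merge policy decides the outcome.
    static Map<Integer, String> byLength(List<String> words,
                                         BinaryOperator<String> merge) {
        return words.stream().collect(Collectors.toMap(
            String::length, Function.identity(), merge, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc");
        System.out.println(byLength(words, firstWins()));
        System.out.println(byLength(words, lastWins()));
    }
}
```

With a sequential stream over "a", "bb", "cc", firstWins() keeps "bb"
for key 2, lastWins() keeps "cc", and throwing() fails with
IllegalStateException.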

On 4/3/2013 3:08 PM, Ali Ebrahimi wrote:
>
> Hi,
>
> On Wed, Apr 3, 2013 at 11:01 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
>     This should be handled by the two overloadings of toMap.  The
>     general one:
>
>        toMap(Function<T,U>, Supplier<M>, BinaryOperator<U>)
>
>     takes a merge function which resolves duplicates.
>
>     The default form:
>
>        toMap(Function<T,U>)
>
>     implicitly uses a merge function which throws.  The doc says:
>
>     If the input elements contains duplicates
>     (according to {@link Object#equals(Object)}), an {@code
>     IllegalStateException} is thrown when the
>     collection operation is performed.
>
>     for the basic form and documents the use of the merge function for
>     the more general form.
>
>     Is that not adequate?
>
> Yes, but this is a runtime-safe solution. My suggestion was compile-time safe.
>
> Ali Ebrahimi
>
>
>
>     On 4/3/2013 2:21 PM, Ali Ebrahimi wrote:
>
>         Hi Brian,
>         I have concerns about the toMap method: it may produce unexpected
>         and unpredictable results in user programs, and it is only
>         meaningful for unique collections (Set) and streams (those
>         resulting from Stream.distinct).
>         Consider this example:
>
>         class Entity{
>                int id;   //key field
>                String name;
>            // override equals and hashcode
>         ....
>         }
>
>         Entity foo = new Entity(1, "Foo");
>         Entity bar = new Entity(1, "Bar");
>         List<Entity> entities = list(new Entity(0, "Some"),foo,..., bar,...)
>
>         Map<Entity,String> entitymap =
>             entities.stream().collect(toMap(e -> e.name));
>
>
>         What is the result of entitymap.get(foo)? "Foo" or "Bar"?
>
>         Map<Entity,String> entitymap2 =
>             entities.parallelStream().collect(toMap(e -> e.name));
>
>
>         What is the result of entitymap2.get(foo)? "Foo" or "Bar"?
>
>         Suggestion 1: get rid of toMap.
>         Suggestion 2: maybe we should consider adding a subtype
>         UniqueStream with an additional method toMap, and changing the
>         return type of Stream.distinct and Set.stream to UniqueStream.
>
>         What do you think?
>
>         Ali Ebrahimi
>
>
>
>
>
>         On Wed, Apr 3, 2013 at 9:57 PM, Brian Goetz
>         <brian.goetz at oracle.com> wrote:
>
>              There's been some feedback on lambda-dev and from the recent
>              Lambda Hack Day on Collectors.  There were two big categories:
>
>              1.  Need more / better docs.
>
>              2.  We want some more collectors.
>
>              The first is obvious and we've been working on those.  Here
>              are some suggestions for simple additions to the Collector set.
>
>                - count() (and possibly sum, min, max)
>
>              These are straightforward analogues of the specialized stream
>              methods; they serve as a "gentle on ramp" to understanding
>              reduction.
>
>              People also expressed concern that the "toMap()" (nee mappedTo,
>              joiningWith) is not flexible enough.  As a reminder, what toMap
>              does is take a Stream<T> and a function T->U and produce a
>              Map<T,U>.  Some people call this "backwards"; they would rather
>              have something that takes a Stream<T> and a function T->K and
>              produces a Map<K,T>.  And others would rather have something
>              that takes two functions T->K and T->U and produces a Map<K,U>.
>
>              All of these are useful enough.  The question is how to fit
>              them into the API.  I think the name "toMap" is a bit of a
>              challenge, since there are several "modes" and not all of them
>              can be easily handled by overloads.  Maybe:
>
>                 toMap(T->U) // first version
>                 toMap(T->K, T->U) // third version
>
>              and leave the second version out, since the third version can
>              easily simulate the second?
>
>
>

