Possible groupingBy simplification?

Remi Forax forax at univ-mlv.fr
Wed Apr 10 10:10:25 PDT 2013


Joe,
collect(toList(), groupingBy(f));
   => how do you express the fact that you may want to cascade groupings?

collect(groupingBy(f)).toList()
   => what is the resulting type of collect(groupingBy(f))?
         Is it a super-type of Stream?
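
[Editorial sketch, not part of the original message.] To make the cascade question concrete: the two-argument form nests naturally, because the downstream collector can itself be a groupingBy. The sketch below assumes the Collectors API as it later shipped in Java 8 (java.util.stream.Collectors), whose signatures differ from the draft ones quoted in this thread. The class name, method name, and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CascadeExample {
    // Group words by first letter, then by length within each letter.
    // The inner groupingBy is passed as the downstream collector of the outer one.
    static Map<Character, Map<Integer, List<String>>> groupTwice(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(w -> w.charAt(0),
                Collectors.groupingBy(String::length, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Hypothetical sample data, for illustration only.
        List<String> words = Arrays.asList("apple", "avocado", "bean", "beet");
        System.out.println(groupTwice(words));
    }
}
```

A form such as collect(toList(), groupingBy(f)) has no obvious place to put that inner grouping, which is the concern raised above.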

Brian,
I'm fine with the proposed changes.

Rémi

On 04/10/2013 06:42 PM, Joe Bowbeer wrote:
>
> Correction: All the grouping(f) should be groupingBy(f)
>
> On Apr 10, 2013 9:37 AM, "Joe Bowbeer" <joe.bowbeer at gmail.com> wrote:
>
>     For consistency with minBy and friends, all the 'By' methods
>     should take a single argument: f. Hence grouping(f).
>
>     No-arg and one-arg forms are the easiest to use and maintain. Just
>     the additional comma, and which pair of parens contains it, is a
>     significant burden.
>
>     The most readable forms of collect that have an explicit toList()
>     would be of the form:
>
>     collect(grouping(f)).toList();
>
>     or maybe
>
>     collect(toList(), groupingBy(f));
>
>     Joe
>
>     On Apr 10, 2013 2:35 AM, "Paul Sandoz" <paul.sandoz at oracle.com> wrote:
>
>
>         On Apr 9, 2013, at 11:56 PM, Joe Bowbeer <joe.bowbeer at gmail.com> wrote:
>
>         > I like the most popular form.  In fact, I think it's the only one that I've
>         > used.
>         >
>         > The argument that users will gain by removing their most common form seems
>         > kind of far-fetched.
>         >
>
>         If each method in Collectors does just one conceptual thing we
>         can concisely express in documentation it is easier to
>         remember and therefore easier to read the code, easier to find
>         in documentation be it using the IDE or otherwise. Thus to me
>         that suggests removing conceptual variants or renaming them.
>
>         If the list variants were called say groupingByToList that
>         would ensure the "one conceptual thing":  classifies elements
>         by key, and collects elements associated with that key to a
>         list. But I suspect we might not require those methods if the
>         leap of stream.collect(toList()) can be grasped.
>
>         The same applies to toMap. I think it is easier to
>         understand/read if it does just one conceptual thing: elements
>         are keys, elements are mapped to values, conflicting keys
>         result in an exception. If that does not fit one's requirements,
>         use groupingBy.
>
>         Paul.
>
>         > In my experience, I do a ctrl-space and look for my target return type on
>         > the right-hand-side of the IDE popup, and then I try to fill in the missing
>         > information, such as parameters.  In this case, having to provide toList()
>         > would probably be a stumbling block for me, as the IDE is not as good when
>         > it comes to suggesting expressions for parameters.
>         >
>         > I sort of like the symmetry with collect(toList()) but not enough to make
>         > up for the loss.
>         >
>         >
>         >
>         > On Tue, Apr 9, 2013 at 2:16 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>         >
>         >> Paul suggested the following possible simplification for groupingBy.  It
>         >> is somewhat counterintuitive at first glance, in that it removes the most
>         >> commonly used form (!), but might make things easier to grasp in the long
>         >> run (aided by good docs.)
>         >>
>         >> Recall we currently have four forms of groupingBy:
>         >>
>         >>    // classifier only -- maps keys to list of matching elements
>         >>    Collector<T, Map<K, List<T>>>
>         >>    groupingBy(Function<? super T, ? extends K> classifier)
>         >>
>         >>    // Like above, but with explicit map ctor
>         >>    <T, K, M extends Map<K, List<T>>>
>         >>    Collector<T, M>
>         >>    groupingBy(Function<? super T, ? extends K> classifier,
>         >>               Supplier<M> mapFactory)
>         >>
>         >>    // basic cascaded form
>         >>    Collector<T, Map<K, D>>
>         >>    groupingBy(Function<? super T, ? extends K> classifier,
>         >>               Collector<T, D> downstream)
>         >>
>         >>    // cascaded form with explicit ctor
>         >>    <T, K, D, M extends Map<K, D>>
>         >>    Collector<T, M>
>         >>    groupingBy(Function<? super T, ? extends K> classifier,
>         >>               Supplier<M> mapFactory,
>         >>               Collector<T, D> downstream)
>         >>
>         >> Plus four corresponding forms for groupingByConcurrent.
>         >>
>         >> The first form is likely to be the most common, as it is the traditional
>         >> "group by".  It is equivalent to:
>         >>
>         >>  groupingBy(classifier, toList());
>         >>
>         >> The proposal is: Drop the first two forms.  Just as users can learn that
>         >> to collect elements into a list, you do:
>         >>
>         >>  collect(toList())
>         >>
>         >> people can learn that to do the simple form of groupBy, you can do:
>         >>
>         >>  collect(groupingBy(f, toList()));
>         >>
>         >> Which also reads perfectly well.
>         >>
>         >> By cutting the number of forms in half, it helps users to realize that
>         >> groupingBy does just one thing -- classifies elements by key, and collects
>         >> elements associated with that key.  Obviously the docs for groupingBy can
>         >> show examples of the simple grouping as well as more sophisticated
>         >> groupings.
>         >>
>         >>
>
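
[Editorial sketch, not part of the original thread.] The equivalence at the heart of the proposal, and Paul's description of toMap, can be checked against the Collectors API as it later shipped in Java 8. That API kept the one-argument groupingBy, so both forms are available for comparison; its signatures differ from the draft ones quoted above. The class name, method names, and sample data below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingForms {
    // One-argument form: classifier only, values collected into lists.
    static Map<Integer, List<String>> implicitList(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    // Two-argument form with an explicit toList() downstream collector.
    static Map<Integer, List<String>> explicitList(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(String::length, Collectors.toList()));
    }

    // Same classifier, different downstream: count the elements per key
    // instead of collecting them.
    static Map<Integer, Long> counts(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Two-argument toMap treats conflicting keys as an error and
    // throws IllegalStateException; returns true if that happened.
    static boolean toMapRejectsDuplicateKeys(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(String::length, w -> w));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data: "one" and "two" share the key 3.
        List<String> words = Arrays.asList("one", "two", "three", "four");
        System.out.println(implicitList(words).equals(explicitList(words)));
        System.out.println(counts(words));
        System.out.println(toMapRejectsDuplicateKeys(words));
    }
}
```

Swapping only the downstream collector (toList() versus counting()) illustrates the "one conceptual thing" argument: the classifier stays the same and the second argument decides what happens to each group.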