Design for collections upgrades

Tue Mar 8 12:29:52 PST 2011

This would be my preference. Make every operation that doesn't actually
need to iterate lazy by default, and copy into a collection at the end only
if necessary.
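
To make "lazy by default, copy at the end" concrete, here is a minimal
sketch. It is an illustration only: it uses the java.util.stream API that
later shipped in Java 8 rather than the strawman `#` syntax discussed in this
thread, and the Thing class, the `visited` log, and the method names are
hypothetical stand-ins, not anything from the experimental checkin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyFilterSketch {

    // Hypothetical stand-in for the Thing type used in the thread.
    static final class Thing {
        final String name;
        final boolean foo;
        final int weight;

        Thing(String name, boolean foo, int weight) {
            this.name = name;
            this.foo = foo;
            this.weight = weight;
        }

        boolean isFoo() { return foo; }
        int getWeight() { return weight; }
    }

    // Records every predicate call so that laziness is observable.
    static final List<String> visited = new ArrayList<>();

    // Lazy step: building the pipeline does not iterate the source.
    static Stream<Thing> lazyFooAbles(Collection<Thing> things) {
        return things.stream().filter(t -> {
            visited.add(t.name);
            return t.isFoo();
        });
    }

    // "Copy at the end": the caller picks the target collection type.
    static Collection<Thing> fooAbles(Collection<Thing> things) {
        return lazyFooAbles(things).collect(Collectors.toCollection(HashSet::new));
    }

    // Chained filter/map/max evaluated in one pass, with no intermediate collection.
    static OptionalInt maxFooWeight(Collection<Thing> things) {
        return things.stream()
                     .filter(Thing::isFoo)
                     .mapToInt(Thing::getWeight)
                     .max();
    }

    public static void main(String[] args) {
        List<Thing> things = Arrays.asList(
                new Thing("a", true, 3),
                new Thing("b", false, 9),
                new Thing("c", true, 5));

        Stream<Thing> pending = lazyFooAbles(things);
        System.out.println(visited.size());               // 0: nothing has been iterated yet

        Collection<Thing> result = pending.collect(Collectors.toCollection(HashSet::new));
        System.out.println(visited.size());               // 3: the copy step drove the iteration
        System.out.println(result.size());                // 2

        System.out.println(maxFooWeight(things).getAsInt()); // 5
    }
}
```

The only place a concrete collection type appears is the final collect call,
which plays the role of the `.to(new HashSet<>())` idea quoted below.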

-- 
Colin

On Tue, Mar 8, 2011 at 2:54 PM, Sam Pullara <sam at sampullara.com> wrote:

> Or just overload filter with a second argument that is the collection you
> would like to add to. What is nice about the lazy collections is that you
> don't need to generate intermediate collections at all and can much more
> efficiently chain them. Perhaps make them all lazy and require a .to() at
> the end if you want a Collection vs a Stream?
>
> > Collection<String>  fooAbles = things.filter(#Thing.isFoo).to(new HashSet<>());
>
>
> Sam
>
> On Mar 8, 2011, at 11:42 AM, Rémi Forax wrote:
>
> >
> >     List<String>  things = ...
> >     Collection<String>  fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
> >
> >
> > Not that pretty, because filter has to create a new collection, and
> > there is no way to do that apart from hard-coding a new ArrayList
> > somewhere.
> >
> > It's better in my opinion to have a filterTo that takes a collection
> > as argument.
> >
> > Collection<String>  fooAbles = things.filterTo(#Thing.isFoo, new HashSet<>());
> >
> >
> > Rémi
> >
> > On 03/08/2011 06:23 PM, Brian Goetz wrote:
> >> Since people are already discussing this based on an experimental
> >> checkin, let me outline the big picture plan here.
> >>
> >> The general idea is to add functional-like operations to collections --
> >> filter, map, reduce, apply.
> >>
> >> I see three sensible modes, with explicit choices of which you get.
> >>
> >> 1.  Serial / Eager.  This is the straight
> >> collections-with-functional-style mode, and some samples have already
> >> been checked in as proof of concept.  Operations on collections yield
> >> new collections, and you can chain the calls.  It values ease of use
> >> over performance (no new concepts like laziness), but the performance
> >> model is still highly predictable.  You get things like
> >>
> >>       Collection fooAbles = things.filter( #{ t ->  t.isFoo() });
> >>
> >> or, with method references:
> >>
> >>       Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
> >>
> >> You can also chain calls together, though you pay a (predictable)
> >> performance cost of intermediate collections, which for small
> >> collections is unlikely to matter:
> >>
> >>       maxFooWeight = things.filter(#Thing.isFoo)
> >>                            .map(#Thing.getWeight)
> >>                            .max();
> >>
> >> The benefit here is concision and clarity.  The cost is some
> >> performance, but maybe not so much that people freak out.  If people
> >> care, they move to the next model, which is:
> >>
> >> 2.  Serial / Lazy.  Here, the primary abstraction is Stream (name to be
> >> chosen later; Rémi used "lazy" in his example).  To transfer between
> >> "eager world" and "lazy world", you use conversion methods (toStream /
> >> toCollection).  A typical call chain probably looks like:
> >>    collection.toStream / op / op / op / {toCollection,reduce,apply}
> >>
> >> so the above example becomes
> >>
> >>       maxFooWeight = things.asStream()
> >>                            .filter(#Thing.isFoo)
> >>                            .map(#Thing.getWeight)
> >>                            .max();
> >>
> >> The return type of Collection.filter is different from the return type
> >> of Stream.filter, so the choice and performance costs are reflected in
> >> the static type system.  This avoids the cost of the intermediate
> >> collections, but is still serial.  If you care about that, you move up
> >> to the next model, which is:
> >>
> >> 3.  Parallel / Lazy.  Here, the primary abstraction is something like
> >> ParallelStream or ParallelIterable.  Let's call it ParallelFoo to avoid
> >> bikeshedding for the moment.  Now, the code looks like:
> >>
> >>       maxFooWeight = things.asParallelFoo()
> >>                            .filter(#Thing.isFoo)
> >>                            .map(#Thing.getWeight)
> >>                            .max();
> >>
> >> Again, the return type of ParallelFoo.filter is different from
> >> Stream.filter or Collection.filter, so again the choice is reflected in
> >> the static type system.  But you don't have to rewrite your code.
> >>
> >> The beauty here is twofold:
> >>
> >>   - The base model (serial/eager) is easy to understand and natural to
> >> use as a way of expressing what the programmer wants to do, and
> >> attractive enough to stand on its own -- just a little slow with big
> >> collections.
> >>   - Switching between execution models is mostly a matter of adding an
> >> explicit conversion or two in the call chain, as the models are similar
> >> enough that the rest of the code should still work (and even mean the
> >> same thing.)
> >>
> >>
> >> On 3/8/2011 8:43 AM, Rémi Forax wrote:
> >>>    On 08/03/2011 14:31, Jim Mayer wrote:
> >>>> // I can tolerate this one
> >>>>       long product(List<Integer>    list) {
> >>>>         // The identity for a product is 1L; starting from 0L always yields 0.
> >>>>         return list.map(#{x ->    (long) x}).reduce(1L, #{acc, x ->  acc * x});
> >>>>       }
> >>> I prefer this one:
> >>>
> >>>     long product(List<Integer>    list) {
> >>>         // The identity for a product is 1L; starting from 0L always yields 0.
> >>>         return list.lazy().map(#{x ->    (long) x}).reduce(1L, #{acc, x ->    acc * x});
> >>>     }
> >>>
> >>> lazy() means, don't do map directly, but wait and do map and reduce in
> >>> one iteration.
> >>>
> >>> Rémi