Design for collections upgrades

Rémi Forax forax at univ-mlv.fr
Tue Mar 8 11:42:20 PST 2011


     List<String>  things = ...
     Collection<String>  fooAbles = things.filter(#Thing.isFoo); // ooh, pretty


Not that pretty because filter have to create a new collection and
there is no way to do that apart hard coding a new ArrayList somewhere.

It's better in my opinion to have a filterTo that takes a collection
as argument.

Collection<String>  fooAbles = things.filterTo(#Thing.isFoo, new HashSet<>());


Rémi

On 03/08/2011 06:23 PM, Brian Goetz wrote:
> Since people are already discussing this based on an experimental
> checkin, let me outline the big picture plan here.
>
> The general idea is to add functional-like operations to collections --
> filter, map, reduce, apply.
>
> I see three sensible modes, with explicit choices of which you get.
>
> 1.  Serial / Eager.  This is the straight
> collections-with-functional-style mode, and some samples have already
> been checked in as proof of concept.  Operations on collections yield
> new collections, and you can chain the calls.  It values ease of use
> over performance (no new concepts like laziness), but the performance
> model is still highly predictable.  You get things like
>
>        Collection fooAbles = things.filter( #{ t ->  t.isFoo() });
>
> or, with method references:
>
>        Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>
> You can also chain calls together, though you pay a (predictable)
> performance cost of intermediate collections, which for small
> collections is unlikely to matter:
>
>        maxFooWeight = things.filter(#Thing.isFoo)
>                             .map(#Thing.getWeight)
>                             .max();
>
> The benefit here is concision and clarity.  The cost is some
> performance, but maybe not so much that people freak out.  If people
> care, they move to the next model, which is:
>
> 2.  Serial / Lazy.  Here, the primary abstraction is Stream (name to be
> chosen later, Remi used "lazy" in his example.)  To transfer between
> "eager world" and "lazy world", you use conversion methods (toStream /
> toCollection).  A typical call chain probably looks like:
>     collection.toStream / op / op / op / {toCollection,reduce,apply}
>
> so the above example becomes
>
>        maxFooWeight = things.asStream()
>                             .filter(#Thing.isFoo)
>                             .map(#Thing.getWeight)
>                             .max();
>
> The return type of Collection.filter is different from the return type
> of Stream.filter, so the choice and performance costs are reflected in
> the static type system.  This avoids the cost of the intermediate
> collections, but is still serial.  If you care about that, you move up
> to the next model, which is:
>
> 3.  Parallel / Lazy.  Here, the primary abstraction is something like
> ParallelStream or ParallelIterable.  Let's call it ParallelFoo to avoid
> bikeshedding for the moment.  Now, the code looks like:
>
>        maxFooWeight = things.asParallelFoo()
>                             .filter(#Thing.isFoo)
>                             .map(#Thing.getWeight)
>                             .max();
>
> Again, the return type of ParallelFoo.filter is different from
> Stream.filter or Collection.filter, so again the choice is reflected in
> the static type system.  But you don't have to rewrite your code.
>
> The beauty here is twofold:
>
>    - The base model (serial/eager) is easy to understand and natural to
> use as a way of expressing what the programmer wants to do, and
> attractive enough to stand on its own -- just a little slow with big
> collections.
>    - Switching between execution models is mostly a matter of adding an
> explicit conversion or two in the call chain, as the models are similar
> enough that the rest of the code should still work (and even mean the
> same thing.)
>
>
> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>     Le 08/03/2011 14:31, Jim Mayer a écrit :
>>> // I can tolerate this one
>>>        long product(List<Integer>    list) {
>>>          return list.map(#{x ->    (long) x}).reduce(0L, #{sum, x ->    sum * x});
>>>        }
>> I prefer this one:
>>
>>      long product(List<Integer>    list) {
>>          return list.lazy().map(#{x ->    (long) x}).reduce(0L, #{sum, x ->    sum * x});
>>      }
>>
>> lazy() means, don't do map directly, but wait and do map and reduce in
>> one iteration.
>>
>> Rémi
>>
>>



More information about the lambda-dev mailing list