Design for collections upgrades

"Zdeněk Troníček" tronicek at fit.cvut.cz
Thu Mar 10 02:46:06 PST 2011


To me it seems logical that filter() returns the same collection as was
the original collection. For Set you do not have any other choice either:

set.filter(predicate)

cannot switch from HashSet to TreeSet or back.

Z.
-- 
Zdenek Tronicek
FIT CTU in Prague


Rémi Forax napsal(a):
>
>      List<String>  things = ...
>      Collection<String>  fooAbles = things.filter(#Thing.isFoo); // ooh,
> pretty
>
>
> Not that pretty because filter have to create a new collection and
> there is no way to do that apart hard coding a new ArrayList somewhere.
>
> It's better in my opinion to have a filterTo that takes a collection
> as argument.
>
> Collection<String>  fooAbles = things.filterTo(#Thing.isFoo, new
> HashSet<>());
>
>
> Rémi
>
> On 03/08/2011 06:23 PM, Brian Goetz wrote:
>> Since people are already discussing this based on an experimental
>> checkin, let me outline the big picture plan here.
>>
>> The general idea is to add functional-like operations to collections --
>> filter, map, reduce, apply.
>>
>> I see three sensible modes, with explicit choices of which you get.
>>
>> 1.  Serial / Eager.  This is the straight
>> collections-with-functional-style mode, and some samples have already
>> been checked in as proof of concept.  Operations on collections yield
>> new collections, and you can chain the calls.  It values ease of use
>> over performance (no new concepts like laziness), but the performance
>> model is still highly predictable.  You get things like
>>
>>        Collection fooAbles = things.filter( #{ t ->  t.isFoo() });
>>
>> or, with method references:
>>
>>        Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>
>> You can also chain calls together, though you pay a (predictable)
>> performance cost of intermediate collections, which for small
>> collections is unlikely to matter:
>>
>>        maxFooWeight = things.filter(#Thing.isFoo)
>>                             .map(#Thing.getWeight)
>>                             .max();
>>
>> The benefit here is concision and clarity.  The cost is some
>> performance, but maybe not so much that people freak out.  If people
>> care, they move to the next model, which is:
>>
>> 2.  Serial / Lazy.  Here, the primary abstraction is Stream (name to be
>> chosen later, Remi used "lazy" in his example.)  To transfer between
>> "eager world" and "lazy world", you use conversion methods (toStream /
>> toCollection).  A typical call chain probably looks like:
>>     collection.toStream / op / op / op / {toCollection,reduce,apply}
>>
>> so the above example becomes
>>
>>        maxFooWeight = things.asStream()
>>                             .filter(#Thing.isFoo)
>>                             .map(#Thing.getWeight)
>>                             .max();
>>
>> The return type of Collection.filter is different from the return type
>> of Stream.filter, so the choice and performance costs are reflected in
>> the static type system.  This avoids the cost of the intermediate
>> collections, but is still serial.  If you care about that, you move up
>> to the next model, which is:
>>
>> 3.  Parallel / Lazy.  Here, the primary abstraction is something like
>> ParallelStream or ParallelIterable.  Let's call it ParallelFoo to avoid
>> bikeshedding for the moment.  Now, the code looks like:
>>
>>        maxFooWeight = things.asParallelFoo()
>>                             .filter(#Thing.isFoo)
>>                             .map(#Thing.getWeight)
>>                             .max();
>>
>> Again, the return type of ParallelFoo.filter is different from
>> Stream.filter or Collection.filter, so again the choice is reflected in
>> the static type system.  But you don't have to rewrite your code.
>>
>> The beauty here is twofold:
>>
>>    - The base model (serial/eager) is easy to understand and natural to
>> use as a way of expressing what the programmer wants to do, and
>> attractive enough to stand on its own -- just a little slow with big
>> collections.
>>    - Switching between execution models is mostly a matter of adding an
>> explicit conversion or two in the call chain, as the models are similar
>> enough that the rest of the code should still work (and even mean the
>> same thing.)
>>
>>
>> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>>     Le 08/03/2011 14:31, Jim Mayer a écrit :
>>>> // I can tolerate this one
>>>>        long product(List<Integer>    list) {
>>>>          return list.map(#{x ->    (long) x}).reduce(0L, #{sum, x ->
>>>>  sum * x});
>>>>        }
>>> I prefer this one:
>>>
>>>      long product(List<Integer>    list) {
>>>          return list.lazy().map(#{x ->    (long) x}).reduce(0L, #{sum,
>>> x ->    sum * x});
>>>      }
>>>
>>> lazy() means, don't do map directly, but wait and do map and reduce in
>>> one iteration.
>>>
>>> Rémi
>>>
>>>
>
>
>



More information about the lambda-dev mailing list