Design for collections upgrades
"Zdeněk Troníček"
tronicek at fit.cvut.cz
Thu Mar 10 02:46:06 PST 2011
To me it seems logical that filter() returns the same collection as was
the original collection. For Set you do not have any other choice either:
set.filter(predicate)
cannot switch from HashSet to TreeSet or back.
Z.
--
Zdenek Tronicek
FIT CTU in Prague
Rémi Forax napsal(a):
>
> List<String> things = ...
> Collection<String> fooAbles = things.filter(#Thing.isFoo); // ooh,
> pretty
>
>
> Not that pretty because filter have to create a new collection and
> there is no way to do that apart hard coding a new ArrayList somewhere.
>
> It's better in my opinion to have a filterTo that takes a collection
> as argument.
>
> Collection<String> fooAbles = things.filterTo(#Thing.isFoo, new
> HashSet<>());
>
>
> Rémi
>
> On 03/08/2011 06:23 PM, Brian Goetz wrote:
>> Since people are already discussing this based on an experimental
>> checkin, let me outline the big picture plan here.
>>
>> The general idea is to add functional-like operations to collections --
>> filter, map, reduce, apply.
>>
>> I see three sensible modes, with explicit choices of which you get.
>>
>> 1. Serial / Eager. This is the straight
>> collections-with-functional-style mode, and some samples have already
>> been checked in as proof of concept. Operations on collections yield
>> new collections, and you can chain the calls. It values ease of use
>> over performance (no new concepts like laziness), but the performance
>> model is still highly predictable. You get things like
>>
>> Collection fooAbles = things.filter( #{ t -> t.isFoo() });
>>
>> or, with method references:
>>
>> Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>
>> You can also chain calls together, though you pay a (predictable)
>> performance cost of intermediate collections, which for small
>> collections is unlikely to matter:
>>
>> maxFooWeight = things.filter(#Thing.isFoo)
>> .map(#Thing.getWeight)
>> .max();
>>
>> The benefit here is concision and clarity. The cost is some
>> performance, but maybe not so much that people freak out. If people
>> care, they move to the next model, which is:
>>
>> 2. Serial / Lazy. Here, the primary abstraction is Stream (name to be
>> chosen later, Remi used "lazy" in his example.) To transfer between
>> "eager world" and "lazy world", you use conversion methods (toStream /
>> toCollection). A typical call chain probably looks like:
>> collection.toStream / op / op / op / {toCollection,reduce,apply}
>>
>> so the above example becomes
>>
>> maxFooWeight = things.asStream()
>> .filter(#Thing.isFoo)
>> .map(#Thing.getWeight)
>> .max();
>>
>> The return type of Collection.filter is different from the return type
>> of Stream.filter, so the choice and performance costs are reflected in
>> the static type system. This avoids the cost of the intermediate
>> collections, but is still serial. If you care about that, you move up
>> to the next model, which is:
>>
>> 3. Parallel / Lazy. Here, the primary abstraction is something like
>> ParallelStream or ParallelIterable. Let's call it ParallelFoo to avoid
>> bikeshedding for the moment. Now, the code looks like:
>>
>> maxFooWeight = things.asParallelFoo()
>> .filter(#Thing.isFoo)
>> .map(#Thing.getWeight)
>> .max();
>>
>> Again, the return type of ParallelFoo.filter is different from
>> Stream.filter or Collection.filter, so again the choice is reflected in
>> the static type system. But you don't have to rewrite your code.
>>
>> The beauty here is twofold:
>>
>> - The base model (serial/eager) is easy to understand and natural to
>> use as a way of expressing what the programmer wants to do, and
>> attractive enough to stand on its own -- just a little slow with big
>> collections.
>> - Switching between execution models is mostly a matter of adding an
>> explicit conversion or two in the call chain, as the models are similar
>> enough that the rest of the code should still work (and even mean the
>> same thing.)
>>
>>
>> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>> Le 08/03/2011 14:31, Jim Mayer a écrit :
>>>> // I can tolerate this one
>>>> long product(List<Integer> list) {
>>>> return list.map(#{x -> (long) x}).reduce(0L, #{sum, x ->
>>>> sum * x});
>>>> }
>>> I prefer this one:
>>>
>>> long product(List<Integer> list) {
>>> return list.lazy().map(#{x -> (long) x}).reduce(0L, #{sum,
>>> x -> sum * x});
>>> }
>>>
>>> lazy() means, don't do map directly, but wait and do map and reduce in
>>> one iteration.
>>>
>>> Rémi
>>>
>>>
>
>
>
More information about the lambda-dev
mailing list