Design for collections upgrades
Stephen Colebourne
scolebourne at joda.org
Thu Mar 10 03:57:48 PST 2011
The point I was trying to make is that to me the method name "filter"
means filter this collection in place, not return a new copy that is
filtered. Just like Collections.sort, shuffle or swap. Its the active
tense that implies it.
I also think that Java is a mutation based language in developers
minds, and if you asked around then that in-place change is what would
be expected. Now, the FP viewpoint suggests thats a bad idea, and the
presence of immutable collections makes it hard to implement, but that
doesn't change what I perceive developers would currently expect from
that method name and similar actve tense ones.
Finally, I'd argue against using "map" as a method name in Java, given
the strong connection to the entirely different concept of the Map
interface.
Stephen
2011/3/10 "Zdeněk Troníček" <tronicek at fit.cvut.cz>:
> To me it seems logical that filter() returns the same collection as was
> the original collection. For Set you do not have any other choice either:
>
> set.filter(predicate)
>
> cannot switch from HashSet to TreeSet or back.
>
> Z.
> --
> Zdenek Tronicek
> FIT CTU in Prague
>
>
> Rémi Forax napsal(a):
>>
>> List<String> things = ...
>> Collection<String> fooAbles = things.filter(#Thing.isFoo); // ooh,
>> pretty
>>
>>
>> Not that pretty because filter have to create a new collection and
>> there is no way to do that apart hard coding a new ArrayList somewhere.
>>
>> It's better in my opinion to have a filterTo that takes a collection
>> as argument.
>>
>> Collection<String> fooAbles = things.filterTo(#Thing.isFoo, new
>> HashSet<>());
>>
>>
>> Rémi
>>
>> On 03/08/2011 06:23 PM, Brian Goetz wrote:
>>> Since people are already discussing this based on an experimental
>>> checkin, let me outline the big picture plan here.
>>>
>>> The general idea is to add functional-like operations to collections --
>>> filter, map, reduce, apply.
>>>
>>> I see three sensible modes, with explicit choices of which you get.
>>>
>>> 1. Serial / Eager. This is the straight
>>> collections-with-functional-style mode, and some samples have already
>>> been checked in as proof of concept. Operations on collections yield
>>> new collections, and you can chain the calls. It values ease of use
>>> over performance (no new concepts like laziness), but the performance
>>> model is still highly predictable. You get things like
>>>
>>> Collection fooAbles = things.filter( #{ t -> t.isFoo() });
>>>
>>> or, with method references:
>>>
>>> Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>>
>>> You can also chain calls together, though you pay a (predictable)
>>> performance cost of intermediate collections, which for small
>>> collections is unlikely to matter:
>>>
>>> maxFooWeight = things.filter(#Thing.isFoo)
>>> .map(#Thing.getWeight)
>>> .max();
>>>
>>> The benefit here is concision and clarity. The cost is some
>>> performance, but maybe not so much that people freak out. If people
>>> care, they move to the next model, which is:
>>>
>>> 2. Serial / Lazy. Here, the primary abstraction is Stream (name to be
>>> chosen later, Remi used "lazy" in his example.) To transfer between
>>> "eager world" and "lazy world", you use conversion methods (toStream /
>>> toCollection). A typical call chain probably looks like:
>>> collection.toStream / op / op / op / {toCollection,reduce,apply}
>>>
>>> so the above example becomes
>>>
>>> maxFooWeight = things.asStream()
>>> .filter(#Thing.isFoo)
>>> .map(#Thing.getWeight)
>>> .max();
>>>
>>> The return type of Collection.filter is different from the return type
>>> of Stream.filter, so the choice and performance costs are reflected in
>>> the static type system. This avoids the cost of the intermediate
>>> collections, but is still serial. If you care about that, you move up
>>> to the next model, which is:
>>>
>>> 3. Parallel / Lazy. Here, the primary abstraction is something like
>>> ParallelStream or ParallelIterable. Let's call it ParallelFoo to avoid
>>> bikeshedding for the moment. Now, the code looks like:
>>>
>>> maxFooWeight = things.asParallelFoo()
>>> .filter(#Thing.isFoo)
>>> .map(#Thing.getWeight)
>>> .max();
>>>
>>> Again, the return type of ParallelFoo.filter is different from
>>> Stream.filter or Collection.filter, so again the choice is reflected in
>>> the static type system. But you don't have to rewrite your code.
>>>
>>> The beauty here is twofold:
>>>
>>> - The base model (serial/eager) is easy to understand and natural to
>>> use as a way of expressing what the programmer wants to do, and
>>> attractive enough to stand on its own -- just a little slow with big
>>> collections.
>>> - Switching between execution models is mostly a matter of adding an
>>> explicit conversion or two in the call chain, as the models are similar
>>> enough that the rest of the code should still work (and even mean the
>>> same thing.)
>>>
>>>
>>> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>>> Le 08/03/2011 14:31, Jim Mayer a écrit :
>>>>> // I can tolerate this one
>>>>> long product(List<Integer> list) {
>>>>> return list.map(#{x -> (long) x}).reduce(0L, #{sum, x ->
>>>>> sum * x});
>>>>> }
>>>> I prefer this one:
>>>>
>>>> long product(List<Integer> list) {
>>>> return list.lazy().map(#{x -> (long) x}).reduce(0L, #{sum,
>>>> x -> sum * x});
>>>> }
>>>>
>>>> lazy() means, don't do map directly, but wait and do map and reduce in
>>>> one iteration.
>>>>
>>>> Rémi
>>>>
>>>>
>>
>>
>>
>
>
>
More information about the lambda-dev
mailing list