Design for collections upgrades
Rémi Forax
forax at univ-mlv.fr
Thu Mar 10 05:57:31 PST 2011
Yes, we need to provide methods that filter/map the contents
of a collection directly, in place.
But I don't think it's a good idea to name them filter or map.
Why not filterAll and mapAll?
Rémi
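
For illustration, here is a minimal sketch of the distinction under discussion. It is written against the Java 8 library and lambda syntax that shipped after this thread, not the #{ ... } strawman syntax used in the quoted messages. The helper names filterAll, mapAll and filterTo follow the suggestions made in this thread, but the class name and the signatures are assumptions for the example, not a proposed API. filterAll and mapAll mutate the collection they are given, while filterTo copies the matching elements into a collection chosen by the caller.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class FilterNaming {
    // Hypothetical helper: filters in place, keeping only matching elements.
    static <T> void filterAll(Collection<T> c, Predicate<? super T> keep) {
        c.removeIf(keep.negate());
    }

    // Hypothetical helper: maps in place, replacing each element of the list.
    static <T> void mapAll(List<T> list, UnaryOperator<T> f) {
        list.replaceAll(f);
    }

    // Hypothetical helper: copies matching elements into a caller-supplied
    // target, so the caller decides the result type and the source is untouched.
    static <T, C extends Collection<? super T>> C filterTo(
            Iterable<? extends T> source, Predicate<? super T> keep, C target) {
        for (T t : source) {
            if (keep.test(t)) {
                target.add(t);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> things = new ArrayList<>(List.of("foo", "bar", "food"));

        // Copying: 'things' still has three elements afterwards.
        Set<String> fooAbles =
                filterTo(things, s -> s.startsWith("foo"), new HashSet<String>());
        System.out.println(fooAbles.size() + " matches, source size " + things.size());

        // In place: 'things' itself shrinks, then its elements are replaced.
        filterAll(things, s -> s.startsWith("foo"));
        mapAll(things, String::toUpperCase);
        System.out.println(things);
    }
}
```

With filterTo the result type is whatever the caller passes in, which sidesteps the question of which collection a copying filter should allocate.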
On 10/03/2011 12:57, Stephen Colebourne wrote:
> The point I was trying to make is that to me the method name "filter"
> means filter this collection in place, not return a new copy that is
> filtered. Just like Collections.sort, shuffle or swap. It's the active
> tense that implies it.
>
> I also think that Java is a mutation-based language in developers'
> minds, and if you asked around, in-place change is what would
> be expected. Now, the FP viewpoint suggests that's a bad idea, and the
> presence of immutable collections makes it hard to implement, but that
> doesn't change what I perceive developers would currently expect from
> that method name and similar active-tense ones.
>
> Finally, I'd argue against using "map" as a method name in Java, given
> the strong connection to the entirely different concept of the Map
> interface.
>
> Stephen
>
>
> 2011/3/10 "Zdeněk Troníček"<tronicek at fit.cvut.cz>:
>> To me it seems logical that filter() returns the same type of collection
>> as the original. For Set you do not have any other choice either:
>>
>> set.filter(predicate)
>>
>> cannot switch from HashSet to TreeSet or back.
>>
>> Z.
>> --
>> Zdenek Tronicek
>> FIT CTU in Prague
>>
>>
>> Rémi Forax wrote:
>>> List<String> things = ...
>>> Collection<String> fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>>
>>>
>>> Not that pretty, because filter has to create a new collection, and
>>> there is no way to do that apart from hard-coding a new ArrayList
>>> somewhere.
>>>
>>> It's better, in my opinion, to have a filterTo that takes the target
>>> collection as an argument.
>>>
>>> Collection<String> fooAbles = things.filterTo(#Thing.isFoo, new
>>> HashSet<>());
>>>
>>>
>>> Rémi
>>>
>>> On 03/08/2011 06:23 PM, Brian Goetz wrote:
>>>> Since people are already discussing this based on an experimental
>>>> checkin, let me outline the big picture plan here.
>>>>
>>>> The general idea is to add functional-like operations to collections --
>>>> filter, map, reduce, apply.
>>>>
>>>> I see three sensible modes, with explicit choices of which you get.
>>>>
>>>> 1. Serial / Eager. This is the straight
>>>> collections-with-functional-style mode, and some samples have already
>>>> been checked in as proof of concept. Operations on collections yield
>>>> new collections, and you can chain the calls. It values ease of use
>>>> over performance (no new concepts like laziness), but the performance
>>>> model is still highly predictable. You get things like
>>>>
>>>> Collection fooAbles = things.filter( #{ t -> t.isFoo() });
>>>>
>>>> or, with method references:
>>>>
>>>> Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>>>
>>>> You can also chain calls together, though you pay a (predictable)
>>>> performance cost of intermediate collections, which for small
>>>> collections is unlikely to matter:
>>>>
>>>> maxFooWeight = things.filter(#Thing.isFoo)
>>>> .map(#Thing.getWeight)
>>>> .max();
>>>>
>>>> The benefit here is concision and clarity. The cost is some
>>>> performance, but maybe not so much that people freak out. If people
>>>> care, they move to the next model, which is:
>>>>
>>>> 2. Serial / Lazy. Here, the primary abstraction is Stream (name to be
>>>> chosen later; Rémi used "lazy" in his example). To transfer between
>>>> "eager world" and "lazy world", you use conversion methods (toStream /
>>>> toCollection). A typical call chain probably looks like:
>>>> collection.toStream / op / op / op / {toCollection,reduce,apply}
>>>>
>>>> so the above example becomes
>>>>
>>>> maxFooWeight = things.asStream()
>>>> .filter(#Thing.isFoo)
>>>> .map(#Thing.getWeight)
>>>> .max();
>>>>
>>>> The return type of Collection.filter is different from the return type
>>>> of Stream.filter, so the choice and performance costs are reflected in
>>>> the static type system. This avoids the cost of the intermediate
>>>> collections, but is still serial. If you care about that, you move up
>>>> to the next model, which is:
>>>>
>>>> 3. Parallel / Lazy. Here, the primary abstraction is something like
>>>> ParallelStream or ParallelIterable. Let's call it ParallelFoo to avoid
>>>> bikeshedding for the moment. Now, the code looks like:
>>>>
>>>> maxFooWeight = things.asParallelFoo()
>>>> .filter(#Thing.isFoo)
>>>> .map(#Thing.getWeight)
>>>> .max();
>>>>
>>>> Again, the return type of ParallelFoo.filter is different from
>>>> Stream.filter or Collection.filter, so again the choice is reflected in
>>>> the static type system. But you don't have to rewrite your code.
>>>>
>>>> The beauty here is twofold:
>>>>
>>>> - The base model (serial/eager) is easy to understand and natural to
>>>> use as a way of expressing what the programmer wants to do, and
>>>> attractive enough to stand on its own -- just a little slow with big
>>>> collections.
>>>> - Switching between execution models is mostly a matter of adding an
>>>> explicit conversion or two in the call chain, as the models are similar
>>>> enough that the rest of the code should still work (and even mean the
>>>> same thing.)
>>>>
>>>>
>>>> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>>>> On 08/03/2011 14:31, Jim Mayer wrote:
>>>>>> // I can tolerate this one
>>>>>> long product(List<Integer> list) {
>>>>>> return list.map(#{x -> (long) x}).reduce(1L, #{acc, x ->
>>>>>> acc * x});
>>>>>> }
>>>>> I prefer this one:
>>>>>
>>>>> long product(List<Integer> list) {
>>>>> return list.lazy().map(#{x -> (long) x}).reduce(1L, #{acc,
>>>>> x -> acc * x});
>>>>> }
>>>>>
>>>>> lazy() means: don't run the map right away, but wait and do the map
>>>>> and the reduce in a single iteration.
>>>>>
>>>>> Rémi
>>>>>
>>>>>
>>>
>>>
>>
>>
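
The three execution modes outlined in the quoted plan can be approximated today with the java.util.stream API that eventually shipped in Java 8. The sketch below is only an approximation under stated assumptions: Thing is a hypothetical stand-in for the type used in the thread, the eager variant is written with explicit loops because Collection never gained eager filter/map methods, and stream()/parallelStream() play the role of the asStream()/asParallelFoo() placeholders. The product method uses 1L as the reduce identity, since an identity of 0L would make every product zero.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ExecutionModes {
    // Hypothetical stand-in for the Thing type used in the thread.
    static final class Thing {
        private final boolean foo;
        private final int weight;

        Thing(boolean foo, int weight) {
            this.foo = foo;
            this.weight = weight;
        }

        boolean isFoo() { return foo; }
        int getWeight() { return weight; }
    }

    // 1. Serial / eager: every step materializes an intermediate collection.
    // Throws NoSuchElementException if nothing matches.
    static int maxFooWeightEager(List<Thing> things) {
        List<Thing> fooAbles = new ArrayList<>();
        for (Thing t : things) {
            if (t.isFoo()) {
                fooAbles.add(t);
            }
        }
        List<Integer> weights = new ArrayList<>();
        for (Thing t : fooAbles) {
            weights.add(t.getWeight());
        }
        return Collections.max(weights);
    }

    // 2. Serial / lazy: filter, map and max run in one pass, no intermediates.
    static int maxFooWeightLazy(List<Thing> things) {
        return things.stream()
                .filter(Thing::isFoo)
                .mapToInt(Thing::getWeight)
                .max()
                .getAsInt();
    }

    // 3. Parallel / lazy: same chain, different entry point.
    static int maxFooWeightParallel(List<Thing> things) {
        return things.parallelStream()
                .filter(Thing::isFoo)
                .mapToInt(Thing::getWeight)
                .max()
                .getAsInt();
    }

    // Lazy map + reduce in a single iteration; the identity for a product is 1.
    static long product(List<Integer> list) {
        return list.stream()
                .mapToLong(x -> (long) x)
                .reduce(1L, (acc, x) -> acc * x);
    }

    public static void main(String[] args) {
        List<Thing> things = List.of(
                new Thing(true, 3), new Thing(false, 9), new Thing(true, 7));
        System.out.println(maxFooWeightEager(things));    // 7
        System.out.println(maxFooWeightLazy(things));     // 7
        System.out.println(maxFooWeightParallel(things)); // 7
        System.out.println(product(List.of(2, 3, 4)));    // 24
    }
}
```

Only the first call in each chain changes between the lazy and the parallel variant, which is the "you don't have to rewrite your code" property described in the plan.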