Design for collections upgrades

Thu Mar 10 05:57:31 PST 2011

  Yes, we need to provide methods that filter/map the contents
of a collection directly.
But I don't think it's a good idea to name them filter or map.

Why not filterAll and mapAll?
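
To make the in-place reading concrete, here is a minimal sketch of what filterAll and mapAll could look like. The class name InPlaceOps and both signatures are assumptions for illustration only, not a proposed API, and the code uses java.util.function with the arrow-lambda syntax of Java 8 and later (rather than the #{...} syntax quoted below) so that it runs as written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical helper class; names and signatures are illustrative only.
final class InPlaceOps {
    private InPlaceOps() {}

    // Keeps only the elements matching the predicate, mutating the collection itself.
    static <T> void filterAll(Collection<T> c, Predicate<? super T> keep) {
        for (Iterator<T> it = c.iterator(); it.hasNext(); ) {
            if (!keep.test(it.next())) {
                it.remove(); // fails with UnsupportedOperationException if c is unmodifiable
            }
        }
    }

    // Replaces every element with the result of the function, in place.
    static <T> void mapAll(List<T> list, UnaryOperator<T> f) {
        for (ListIterator<T> it = list.listIterator(); it.hasNext(); ) {
            it.set(f.apply(it.next()));
        }
    }

    public static void main(String[] args) {
        List<String> things = new ArrayList<>(Arrays.asList("foo", "bar", "food"));
        filterAll(things, s -> s.startsWith("foo"));
        mapAll(things, String::toUpperCase);
        System.out.println(things); // [FOO, FOOD]
    }
}
```

Note that an in-place mapAll cannot change the element type, and neither method works on an unmodifiable collection; both limits come with mutating rather than copying.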

Rémi

On 10/03/2011 12:57, Stephen Colebourne wrote:
> The point I was trying to make is that to me the method name "filter"
> means filter this collection in place, not return a new copy that is
> filtered. Just like Collections.sort, shuffle or swap. It's the active
> tense that implies it.
>
> I also think that Java is a mutation-based language in developers'
> minds, and if you asked around then that in-place change is what would
> be expected. Now, the FP viewpoint suggests that's a bad idea, and the
> presence of immutable collections makes it hard to implement, but that
> doesn't change what I perceive developers would currently expect from
> that method name and similar active tense ones.
>
> Finally, I'd argue against using "map" as a method name in Java, given
> the strong connection to the entirely different concept of the Map
> interface.
>
> Stephen
>
>
> 2011/3/10 "Zdeněk Troníček"<tronicek at fit.cvut.cz>:
>> To me it seems logical that filter() returns the same kind of collection
>> as the original. For Set you do not have any other choice either:
>>
>> set.filter(predicate)
>>
>> cannot switch from HashSet to TreeSet or back.
>>
>> Z.
>> --
>> Zdenek Tronicek
>> FIT CTU in Prague
>>
>>
>> Rémi Forax wrote:
>>>       List<String> things = ...
>>>       Collection<String> fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>>
>>>
>>> Not that pretty, because filter has to create a new collection and
>>> there is no way to do that apart from hard-coding a new ArrayList somewhere.
>>>
>>> It's better in my opinion to have a filterTo that takes a collection
>>> as argument.
>>>
>>> Collection<String> fooAbles = things.filterTo(#Thing.isFoo, new HashSet<>());
>>>
>>>
>>> Rémi
>>>
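
The filterTo idea quoted above can be sketched as a static helper. This is only a rough illustration under stated assumptions: the class FilterToSketch and the exact signature are invented here, and filterTo is written as a static method because Collection has no such instance method. The code uses Java 8 lambda syntax so that it compiles as written.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical helper; the name and signature are assumptions for illustration.
final class FilterToSketch {
    private FilterToSketch() {}

    // Copies the matching elements of source into target and returns target,
    // so the caller (not the library) chooses the result's implementation.
    static <T, C extends Collection<? super T>> C filterTo(
            Iterable<? extends T> source, Predicate<? super T> keep, C target) {
        for (T element : source) {
            if (keep.test(element)) {
                target.add(element);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> things = Arrays.asList("pear", "fig", "plum", "fig");
        HashSet<String> unique = filterTo(things, s -> s.length() == 3, new HashSet<String>());
        TreeSet<String> sorted = filterTo(things, s -> s.startsWith("p"), new TreeSet<String>());
        System.out.println(unique); // [fig]
        System.out.println(sorted); // [pear, plum]
    }
}
```

Returning the target with its own static type (C) lets the call site keep working with a HashSet or TreeSet rather than a plain Collection.
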
>>> On 03/08/2011 06:23 PM, Brian Goetz wrote:
>>>> Since people are already discussing this based on an experimental
>>>> checkin, let me outline the big picture plan here.
>>>>
>>>> The general idea is to add functional-like operations to collections --
>>>> filter, map, reduce, apply.
>>>>
>>>> I see three sensible modes, with explicit choices of which you get.
>>>>
>>>> 1.  Serial / Eager.  This is the straight
>>>> collections-with-functional-style mode, and some samples have already
>>>> been checked in as proof of concept.  Operations on collections yield
>>>> new collections, and you can chain the calls.  It values ease of use
>>>> over performance (no new concepts like laziness), but the performance
>>>> model is still highly predictable.  You get things like
>>>>
>>>>         Collection fooAbles = things.filter( #{ t -> t.isFoo() });
>>>>
>>>> or, with method references:
>>>>
>>>>         Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>>>
>>>> You can also chain calls together, though you pay a (predictable)
>>>> performance cost of intermediate collections, which for small
>>>> collections is unlikely to matter:
>>>>
>>>>         maxFooWeight = things.filter(#Thing.isFoo)
>>>>                              .map(#Thing.getWeight)
>>>>                              .max();
>>>>
>>>> The benefit here is concision and clarity.  The cost is some
>>>> performance, but maybe not so much that people freak out.  If people
>>>> care, they move to the next model, which is:
>>>>
>>>> 2.  Serial / Lazy.  Here, the primary abstraction is Stream (name to be
>>>> chosen later, Remi used "lazy" in his example.)  To transfer between
>>>> "eager world" and "lazy world", you use conversion methods (toStream /
>>>> toCollection).  A typical call chain probably looks like:
>>>>      collection.toStream / op / op / op / {toCollection,reduce,apply}
>>>>
>>>> so the above example becomes
>>>>
>>>>         maxFooWeight = things.asStream()
>>>>                              .filter(#Thing.isFoo)
>>>>                              .map(#Thing.getWeight)
>>>>                              .max();
>>>>
>>>> The return type of Collection.filter is different from the return type
>>>> of Stream.filter, so the choice and performance costs are reflected in
>>>> the static type system.  This avoids the cost of the intermediate
>>>> collections, but is still serial.  If you care about that, you move up
>>>> to the next model, which is:
>>>>
>>>> 3.  Parallel / Lazy.  Here, the primary abstraction is something like
>>>> ParallelStream or ParallelIterable.  Let's call it ParallelFoo to avoid
>>>> bikeshedding for the moment.  Now, the code looks like:
>>>>
>>>>         maxFooWeight = things.asParallelFoo()
>>>>                              .filter(#Thing.isFoo)
>>>>                              .map(#Thing.getWeight)
>>>>                              .max();
>>>>
>>>> Again, the return type of ParallelFoo.filter is different from
>>>> Stream.filter or Collection.filter, so again the choice is reflected in
>>>> the static type system.  But you don't have to rewrite your code.
>>>>
>>>> The beauty here is twofold:
>>>>
>>>>     - The base model (serial/eager) is easy to understand and natural to
>>>> use as a way of expressing what the programmer wants to do, and
>>>> attractive enough to stand on its own -- just a little slow with big
>>>> collections.
>>>>     - Switching between execution models is mostly a matter of adding an
>>>> explicit conversion or two in the call chain, as the models are similar
>>>> enough that the rest of the code should still work (and even mean the
>>>> same thing.)
>>>>
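
The three modes quoted above can be compared side by side. This is a rough sketch, not the design under discussion: Thing and the eager filter/map helpers are made up for illustration, and the lazy and parallel variants lean on stream() and parallelStream() from Java 8 and later as stand-ins for Stream and ParallelFoo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Thing and the eager helpers are hypothetical; stream() and parallelStream()
// are the java.util.Collection methods available since Java 8.
final class ThreeModes {
    static final class Thing {
        private final boolean foo;
        private final int weight;
        Thing(boolean foo, int weight) { this.foo = foo; this.weight = weight; }
        boolean isFoo() { return foo; }
        int getWeight() { return weight; }
    }

    // Mode 1 helpers: every call materialises a new intermediate list.
    static <T> List<T> filter(Collection<T> in, Predicate<? super T> p) {
        List<T> out = new ArrayList<>();
        for (T t : in) { if (p.test(t)) { out.add(t); } }
        return out;
    }

    static <T, R> List<R> map(Collection<T> in, Function<? super T, ? extends R> f) {
        List<R> out = new ArrayList<>();
        for (T t : in) { out.add(f.apply(t)); }
        return out;
    }

    // All three methods assume at least one Thing is a foo; otherwise they throw.

    // 1. Serial / eager: two intermediate lists are built before max runs.
    static int eagerMax(List<Thing> things) {
        List<Thing> fooAbles = filter(things, Thing::isFoo);
        List<Integer> weights = map(fooAbles, Thing::getWeight);
        return Collections.max(weights);
    }

    // 2. Serial / lazy: nothing is traversed until the terminal max() call.
    static int lazyMax(List<Thing> things) {
        return things.stream().filter(Thing::isFoo).map(Thing::getWeight)
                     .max(Integer::compare).get();
    }

    // 3. Parallel / lazy: same chain, different entry point.
    static int parallelMax(List<Thing> things) {
        return things.parallelStream().filter(Thing::isFoo).map(Thing::getWeight)
                     .max(Integer::compare).get();
    }

    public static void main(String[] args) {
        List<Thing> things = Arrays.asList(
                new Thing(true, 3), new Thing(false, 9), new Thing(true, 7));
        System.out.println(eagerMax(things));    // 7
        System.out.println(lazyMax(things));     // 7
        System.out.println(parallelMax(things)); // 7
    }
}
```

Only the first call in each chain differs between the lazy and parallel versions, which is the "explicit conversion or two" the message describes.
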
>>>>
>>>> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>>>>      On 08/03/2011 14:31, Jim Mayer wrote:
>>>>>> // I can tolerate this one
>>>>>>         long product(List<Integer> list) {
>>>>>>           return list.map(#{x -> (long) x}).reduce(1L, #{acc, x -> acc * x});
>>>>>>         }
>>>>> I prefer this one:
>>>>>
>>>>>       long product(List<Integer> list) {
>>>>>           return list.lazy().map(#{x -> (long) x}).reduce(1L, #{acc, x -> acc * x});
>>>>>       }
>>>>>
>>>>> lazy() means: don't do the map right away, but wait and do the map and
>>>>> the reduce in one iteration.
>>>>>
>>>>> Rémi
>>>>>
>>>>>
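
One way to picture "map and reduce in one iteration" is a small lazy view that only records the mapping function and applies it inside the reduce loop. The LazyView class below is a hypothetical sketch written for this purpose, not an existing or proposed type, and it uses Java 8 lambda syntax so that it runs as written.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical lazy view: map() only records the function; reduce() makes
// the single pass over the source, applying the mapping as it goes.
final class LazyView<S, T> {
    private final Iterable<S> source;
    private final Function<? super S, ? extends T> mapper;

    private LazyView(Iterable<S> source, Function<? super S, ? extends T> mapper) {
        this.source = source;
        this.mapper = mapper;
    }

    // Plays the role of list.lazy() in the example above.
    static <S> LazyView<S, S> lazy(Iterable<S> source) {
        return new LazyView<S, S>(source, s -> s);
    }

    // No iteration happens here; the new function is composed onto the old one.
    <R> LazyView<S, R> map(Function<? super T, ? extends R> f) {
        Function<S, R> composed = s -> f.apply(mapper.apply(s));
        return new LazyView<S, R>(source, composed);
    }

    // The only loop: each element is mapped and folded in the same iteration.
    T reduce(T identity, BinaryOperator<T> op) {
        T acc = identity;
        for (S s : source) {
            acc = op.apply(acc, mapper.apply(s));
        }
        return acc;
    }

    static long product(List<Integer> list) {
        // The identity for multiplication is 1, so the fold starts from 1L.
        return lazy(list).map(x -> (long) x).reduce(1L, (acc, x) -> acc * x);
    }

    public static void main(String[] args) {
        System.out.println(product(Arrays.asList(2, 3, 7))); // 42
    }
}
```

No intermediate List<Long> is ever built: the widening to long happens element by element while the product is accumulated.
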
>>>
>>>
>>
>>