Design for collections upgrades

Brian Goetz brian.goetz at oracle.com
Thu Mar 10 11:17:58 PST 2011


These are straightforward lambda-taking version of removeAll/retainAll. 
  On the plan.

On 3/10/2011 8:57 AM, Rémi Forax wrote:
>    Yes, we need to provide methods that filter/map directly
> the content of a collections.
> But I don't think it's a good idea to name them filter or map.
>
> Why not filterAll and mapAll ?
>
> Rémi
>
> Le 10/03/2011 12:57, Stephen Colebourne a écrit :
>> The point I was trying to make is that to me the method name "filter"
>> means filter this collection in place, not return a new copy that is
>> filtered. Just like Collections.sort, shuffle or swap. Its the active
>> tense that implies it.
>>
>> I also think that Java is a mutation based language in developers
>> minds, and if you asked around then that in-place change is what would
>> be expected. Now, the FP viewpoint suggests thats a bad idea, and the
>> presence of immutable collections makes it hard to implement, but that
>> doesn't change what I perceive developers would currently expect from
>> that method name and similar actve tense ones.
>>
>> Finally, I'd argue against using "map" as a method name in Java, given
>> the strong connection to the entirely different concept of the Map
>> interface.
>>
>> Stephen
>>
>>
>> 2011/3/10 "Zdeněk Troníček"<tronicek at fit.cvut.cz>:
>>> To me it seems logical that filter() returns the same collection as was
>>> the original collection. For Set you do not have any other choice either:
>>>
>>> set.filter(predicate)
>>>
>>> cannot switch from HashSet to TreeSet or back.
>>>
>>> Z.
>>> --
>>> Zdenek Tronicek
>>> FIT CTU in Prague
>>>
>>>
>>> Rémi Forax napsal(a):
>>>>        List<String>     things = ...
>>>>        Collection<String>     fooAbles = things.filter(#Thing.isFoo); // ooh,
>>>> pretty
>>>>
>>>>
>>>> Not that pretty because filter have to create a new collection and
>>>> there is no way to do that apart hard coding a new ArrayList somewhere.
>>>>
>>>> It's better in my opinion to have a filterTo that takes a collection
>>>> as argument.
>>>>
>>>> Collection<String>     fooAbles = things.filterTo(#Thing.isFoo, new
>>>> HashSet<>());
>>>>
>>>>
>>>> Rémi
>>>>
>>>> On 03/08/2011 06:23 PM, Brian Goetz wrote:
>>>>> Since people are already discussing this based on an experimental
>>>>> checkin, let me outline the big picture plan here.
>>>>>
>>>>> The general idea is to add functional-like operations to collections --
>>>>> filter, map, reduce, apply.
>>>>>
>>>>> I see three sensible modes, with explicit choices of which you get.
>>>>>
>>>>> 1.  Serial / Eager.  This is the straight
>>>>> collections-with-functional-style mode, and some samples have already
>>>>> been checked in as proof of concept.  Operations on collections yield
>>>>> new collections, and you can chain the calls.  It values ease of use
>>>>> over performance (no new concepts like laziness), but the performance
>>>>> model is still highly predictable.  You get things like
>>>>>
>>>>>          Collection fooAbles = things.filter( #{ t ->     t.isFoo() });
>>>>>
>>>>> or, with method references:
>>>>>
>>>>>          Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>>>>>
>>>>> You can also chain calls together, though you pay a (predictable)
>>>>> performance cost of intermediate collections, which for small
>>>>> collections is unlikely to matter:
>>>>>
>>>>>          maxFooWeight = things.filter(#Thing.isFoo)
>>>>>                               .map(#Thing.getWeight)
>>>>>                               .max();
>>>>>
>>>>> The benefit here is concision and clarity.  The cost is some
>>>>> performance, but maybe not so much that people freak out.  If people
>>>>> care, they move to the next model, which is:
>>>>>
>>>>> 2.  Serial / Lazy.  Here, the primary abstraction is Stream (name to be
>>>>> chosen later, Remi used "lazy" in his example.)  To transfer between
>>>>> "eager world" and "lazy world", you use conversion methods (toStream /
>>>>> toCollection).  A typical call chain probably looks like:
>>>>>       collection.toStream / op / op / op / {toCollection,reduce,apply}
>>>>>
>>>>> so the above example becomes
>>>>>
>>>>>          maxFooWeight = things.asStream()
>>>>>                               .filter(#Thing.isFoo)
>>>>>                               .map(#Thing.getWeight)
>>>>>                               .max();
>>>>>
>>>>> The return type of Collection.filter is different from the return type
>>>>> of Stream.filter, so the choice and performance costs are reflected in
>>>>> the static type system.  This avoids the cost of the intermediate
>>>>> collections, but is still serial.  If you care about that, you move up
>>>>> to the next model, which is:
>>>>>
>>>>> 3.  Parallel / Lazy.  Here, the primary abstraction is something like
>>>>> ParallelStream or ParallelIterable.  Let's call it ParallelFoo to avoid
>>>>> bikeshedding for the moment.  Now, the code looks like:
>>>>>
>>>>>          maxFooWeight = things.asParallelFoo()
>>>>>                               .filter(#Thing.isFoo)
>>>>>                               .map(#Thing.getWeight)
>>>>>                               .max();
>>>>>
>>>>> Again, the return type of ParallelFoo.filter is different from
>>>>> Stream.filter or Collection.filter, so again the choice is reflected in
>>>>> the static type system.  But you don't have to rewrite your code.
>>>>>
>>>>> The beauty here is twofold:
>>>>>
>>>>>      - The base model (serial/eager) is easy to understand and natural to
>>>>> use as a way of expressing what the programmer wants to do, and
>>>>> attractive enough to stand on its own -- just a little slow with big
>>>>> collections.
>>>>>      - Switching between execution models is mostly a matter of adding an
>>>>> explicit conversion or two in the call chain, as the models are similar
>>>>> enough that the rest of the code should still work (and even mean the
>>>>> same thing.)
>>>>>
>>>>>
>>>>> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>>>>>>       Le 08/03/2011 14:31, Jim Mayer a écrit :
>>>>>>> // I can tolerate this one
>>>>>>>          long product(List<Integer>       list) {
>>>>>>>            return list.map(#{x ->       (long) x}).reduce(0L, #{sum, x ->
>>>>>>>    sum * x});
>>>>>>>          }
>>>>>> I prefer this one:
>>>>>>
>>>>>>        long product(List<Integer>       list) {
>>>>>>            return list.lazy().map(#{x ->       (long) x}).reduce(0L, #{sum,
>>>>>> x ->       sum * x});
>>>>>>        }
>>>>>>
>>>>>> lazy() means, don't do map directly, but wait and do map and reduce in
>>>>>> one iteration.
>>>>>>
>>>>>> Rémi
>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>>
>
>


More information about the lambda-dev mailing list