Design for collections upgrades (was: Re: lambda-dev Digest, Vol 15, Issue 20 [reduce method result type])

Tue Mar 8 10:17:03 PST 2011

Very nice. I'm curious about the exception declaration in the
streaming model. In your serial/lazy example below:

>      maxFooWeight = things.asStream()
>                           .filter(#Thing.isFoo)
>                           .map(#Thing.getWeight)
>                           .max();

max() can throw the union of the exceptions thrown by Thing.isFoo and
Thing.getWeight. What does the declaration of map() look like? It's
part of a type that is parameterized on Thing.isFoo's exceptions and
must add, in its return value, the exceptions that Thing.getWeight
throws.
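
To make the question concrete, here is a minimal sketch of one way this
could be spelled with plain generics. It is only an illustration: every
name in it (ExceptionSketch, ThrowingPredicate, ThrowingMapper, Sink,
the Stream interface, and the exception classes) is an assumption of
mine, not something from the experimental checkin. Because a type
parameter cannot express "E or X", the sketch has the caller pick one
common supertype E up front and requires every stage to stay within it,
which is weaker than a true union.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ExceptionSketch {
    // Hypothetical checked exceptions that share one supertype.
    static class ThingException extends Exception {
        ThingException(String message) { super(message); }
    }
    static class FooCheckException extends ThingException {
        FooCheckException(String message) { super(message); }
    }
    static class WeightException extends ThingException {
        WeightException(String message) { super(message); }
    }

    static class Thing {
        private final boolean foo;
        private final int weight;
        Thing(boolean foo, int weight) { this.foo = foo; this.weight = weight; }
        boolean isFoo() throws FooCheckException { return foo; }
        int getWeight() throws WeightException {
            if (weight < 0) throw new WeightException("negative weight");
            return weight;
        }
    }

    // Hypothetical function types that carry a checked exception type X.
    interface ThrowingPredicate<T, X extends Exception> { boolean test(T t) throws X; }
    interface ThrowingMapper<T, R, X extends Exception> { R apply(T t) throws X; }
    interface Sink<T, X extends Exception> { void accept(T t) throws X; }

    /** Hypothetical lazy stream; E bounds whatever any stage may throw. */
    interface Stream<T, E extends Exception> {
        /** Pushes each element through the pending stages into the sink. */
        void forEach(Sink<? super T, E> sink) throws E;

        static <T, E extends Exception> Stream<T, E> of(Iterable<T> source) {
            return sink -> { for (T t : source) sink.accept(t); };
        }

        // No "E | X" is possible here, so the predicate must throw E or a subtype.
        default Stream<T, E> filter(ThrowingPredicate<? super T, E> p) {
            return sink -> this.forEach(t -> { if (p.test(t)) sink.accept(t); });
        }

        // Same restriction for the mapper; only the element type changes.
        default <R> Stream<R, E> map(ThrowingMapper<? super T, ? extends R, E> f) {
            return sink -> this.forEach(t -> sink.accept(f.apply(t)));
        }

        /** Terminal operation: this is where the stages actually run. */
        default List<T> toList() throws E {
            List<T> out = new ArrayList<>();
            forEach(out::add);
            return out;
        }
    }

    static int maxFooWeight(List<Thing> things) throws ThingException {
        List<Integer> weights = Stream.<Thing, ThingException>of(things)
                .filter(Thing::isFoo)     // may throw FooCheckException
                .map(Thing::getWeight)    // may throw WeightException
                .toList();                // declared to throw ThingException
        return Collections.max(weights);
    }

    static int demo() {
        List<Thing> things = Arrays.asList(
                new Thing(true, 3), new Thing(false, 9), new Thing(true, 7));
        try {
            return maxFooWeight(things);
        } catch (ThingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the caller of maxFooWeight only sees ThingException, so the
precise pair (FooCheckException, WeightException) is lost. That loss is
exactly what I am asking about.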

Also, would you consider adding list comprehension syntax, which would
de-sugar to the serial/eager version:

    weights = [thing.getWeight() for Thing thing : things if thing.isFoo()]

or the lazy version:

    weights = (thing.getWeight() for Thing thing : things if thing.isFoo())
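
For illustration, the two forms might de-sugar roughly as follows. This
is a sketch under my own assumptions: ComprehensionSketch, eagerWeights,
lazyWeights and sample are hypothetical names, Thing is a stand-in
class, and the lazy form is modelled as a plain Iterable rather than
whatever Stream type is eventually chosen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class ComprehensionSketch {
    // Stand-in for the Thing type used in the examples.
    static class Thing {
        private final boolean foo;
        private final int weight;
        Thing(boolean foo, int weight) { this.foo = foo; this.weight = weight; }
        boolean isFoo() { return foo; }
        int getWeight() { return weight; }
    }

    /** Eager form: the whole result list is built before returning. */
    static List<Integer> eagerWeights(Iterable<Thing> things) {
        List<Integer> weights = new ArrayList<>();
        for (Thing thing : things) {
            if (thing.isFoo()) {
                weights.add(thing.getWeight());
            }
        }
        return weights;
    }

    /** Lazy form: isFoo and getWeight run only as the caller iterates. */
    static Iterable<Integer> lazyWeights(Iterable<Thing> things) {
        return () -> new Iterator<Integer>() {
            private final Iterator<Thing> source = things.iterator();
            private Thing pending; // next matching element, or null if none found yet

            @Override
            public boolean hasNext() {
                // Advance the source until a matching element is found.
                while (pending == null && source.hasNext()) {
                    Thing candidate = source.next();
                    if (candidate.isFoo()) {
                        pending = candidate;
                    }
                }
                return pending != null;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                Thing result = pending;
                pending = null;
                return result.getWeight();
            }
        };
    }

    static List<Thing> sample() {
        return Arrays.asList(new Thing(true, 3), new Thing(false, 9), new Thing(true, 7));
    }

    public static void main(String[] args) {
        System.out.println(eagerWeights(sample()));
        for (int weight : lazyWeights(sample())) {
            System.out.println(weight);
        }
    }
}
```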

Lawrence

On Tue, Mar 8, 2011 at 9:23 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> Since people are already discussing this based on an experimental
> checkin, let me outline the big picture plan here.
>
> The general idea is to add functional-like operations to collections --
> filter, map, reduce, apply.
>
> I see three sensible modes, with explicit choices of which you get.
>
> 1.  Serial / Eager.  This is the straight
> collections-with-functional-style mode, and some samples have already
> been checked in as proof of concept.  Operations on collections yield
> new collections, and you can chain the calls.  It values ease of use
> over performance (no new concepts like laziness), but the performance
> model is still highly predictable.  You get things like
>
>      Collection fooAbles = things.filter( #{ t -> t.isFoo() });
>
> or, with method references:
>
>      Collection fooAbles = things.filter(#Thing.isFoo); // ooh, pretty
>
> You can also chain calls together, though you pay a (predictable)
> performance cost of intermediate collections, which for small
> collections is unlikely to matter:
>
>      maxFooWeight = things.filter(#Thing.isFoo)
>                           .map(#Thing.getWeight)
>                           .max();
>
> The benefit here is concision and clarity.  The cost is some
> performance, but maybe not so much that people freak out.  If people
> care, they move to the next model, which is:
>
> 2.  Serial / Lazy.  Here, the primary abstraction is Stream (name to be
> chosen later; Rémi used "lazy" in his example).  To transfer between
> "eager world" and "lazy world", you use conversion methods (asStream /
> toCollection).  A typical call chain probably looks like:
>
>      collection.asStream / op / op / op / {toCollection,reduce,apply}
>
> so the above example becomes
>
>      maxFooWeight = things.asStream()
>                           .filter(#Thing.isFoo)
>                           .map(#Thing.getWeight)
>                           .max();
>
> The return type of Collection.filter is different from the return type
> of Stream.filter, so the choice and performance costs are reflected in
> the static type system.  This avoids the cost of the intermediate
> collections, but is still serial.  If you care about that, you move up
> to the next model, which is:
>
> 3.  Parallel / Lazy.  Here, the primary abstraction is something like
> ParallelStream or ParallelIterable.  Let's call it ParallelFoo to avoid
> bikeshedding for the moment.  Now, the code looks like:
>
>      maxFooWeight = things.asParallelFoo()
>                           .filter(#Thing.isFoo)
>                           .map(#Thing.getWeight)
>                           .max();
>
> Again, the return type of ParallelFoo.filter is different from
> Stream.filter or Collection.filter, so again the choice is reflected in
> the static type system.  But you don't have to rewrite your code.
>
> The beauty here is twofold:
>
>  - The base model (serial/eager) is easy to understand and natural to
> use as a way of expressing what the programmer wants to do, and
> attractive enough to stand on its own -- just a little slow with big
> collections.
>  - Switching between execution models is mostly a matter of adding an
> explicit conversion or two in the call chain, as the models are similar
> enough that the rest of the code should still work (and even mean the
> same thing.)
>
>
> On 3/8/2011 8:43 AM, Rémi Forax wrote:
>> On 08/03/2011 14:31, Jim Mayer wrote:
>>> // I can tolerate this one
>>>       long product(List<Integer> list) {
>>>         return list.map(#{x -> (long) x}).reduce(1L, #{acc, x -> acc * x});
>>>       }
>>
>> I prefer this one:
>>
>>     long product(List<Integer> list) {
>>         return list.lazy().map(#{x -> (long) x}).reduce(1L, #{acc, x -> acc * x});
>>     }
>>
>> lazy() means, don't do map directly, but wait and do map and reduce in
>> one iteration.
>>
>> Rémi
>>
>>
>
>