Design for collections upgrades
Craig P. Motlin
cmotlin at gmail.com
Mon Mar 14 06:25:45 PDT 2011
The problem is that the result of calling filter or transform on a lazy
collection is another lazy collection, which makes them very hard to adopt.
There's a lot of code out there that's already doing transformations like
this:
    public List<Address> doTransform(List<Person> people)
    {
        List<Address> result = new ArrayList<Address>();
        for (Person person : people)
        {
            result.add(person.getAddress());
        }
        return result;
    }
Since it's public, it might be really hard to change the return type to a
lazy collection. The same argument applies to fields. In order to use a
lambda here, the code would become transform().asList(), which will be
slower than the original code. And if it's slower, it will hurt adoption.
There is one problem with covariant return types at compile time. It's easy
for filter(): it can always be covariant. It's a little different for
transform(). When you transform an ArrayList, you should get back another
ArrayList by default. Sometimes people will want to collapse duplicates, but
you shouldn't have to create an ArrayList just to create a HashSet right
afterwards. So it makes sense to have a second form of transform() and
filter() that takes a second argument, a mutable collection to dump the
output into. HashSet.transform() should also return a HashSet by default.
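
The two-argument form could look roughly like this. It is only a sketch written
as static helpers (the class name TransformInto is invented), since the real
thing would presumably live on the collection interfaces. The caller picks the
target collection, so collapsing duplicates never needs an intermediate
ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

class TransformInto {
    // Eagerly applies fn to every element and adds the results to target.
    // Returning the target keeps the caller's static type (C).
    static <T, R, C extends Collection<R>> C transform(Iterable<T> source,
                                                       Function<? super T, ? extends R> fn,
                                                       C target) {
        for (T item : source) {
            target.add(fn.apply(item));
        }
        return target;
    }

    // Eagerly copies the elements that satisfy the predicate into target.
    static <T, C extends Collection<T>> C filter(Iterable<T> source,
                                                 Predicate<? super T> predicate,
                                                 C target) {
        for (T item : source) {
            if (predicate.test(item)) {
                target.add(item);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc");

        // Duplicate lengths collapse directly in the HashSet.
        Set<Integer> lengths = transform(words, String::length, new HashSet<Integer>());
        System.out.println(lengths.size()); // prints 2

        List<String> longer = filter(words, w -> w.length() > 1, new ArrayList<String>());
        System.out.println(longer); // prints [bb, cc]
    }
}
```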
For TreeSet, that's impossible though. If you're starting with a
TreeSet<Person> then it was constructed with some Comparator<Person>. You
can't return SortedSet<Address> from transform() because there's no
Comparator<Address>. It actually makes sense for SortedSet.transform() to
return a List. Since SortedSet extends Set, there are two difficult choices:
- Have Set and SortedSet both return HashSet
- Have Set return Set and SortedSet return List, but the static type they
  return is just Collection
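
Here is a small sketch of the TreeSet case, with stub Person and Address
classes and an invented static transform() helper. The set is ordered by a
Comparator<Person>, the transformed addresses come out in that same order, and
a List is the only standard collection that keeps it. A TreeSet<Address> with
no comparator fails at runtime because Address is not Comparable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Function;

class SortedTransform {
    // Stub classes; note that Address deliberately does not implement Comparable.
    static class Address {
        final String city;
        Address(String city) { this.city = city; }
    }

    static class Person {
        final String name;
        final Address address;
        Person(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    // Hypothetical SortedSet transform: the result keeps the source's iteration
    // order, which only makes sense as a List.
    static <T, R> List<R> transform(SortedSet<T> source,
                                    Function<? super T, ? extends R> fn) {
        List<R> result = new ArrayList<>(source.size());
        for (T item : source) {
            result.add(fn.apply(item));
        }
        return result;
    }

    // Returns false when the element cannot go into a comparator-less TreeSet.
    static boolean fitsInNaturalTreeSet(Object element) {
        try {
            new TreeSet<Object>().add(element);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Comparator<Person> byName = (a, b) -> a.name.compareTo(b.name);
        SortedSet<Person> people = new TreeSet<>(byName);
        people.add(new Person("Bob", new Address("Oslo")));
        people.add(new Person("Alice", new Address("Zurich")));

        // Addresses follow the Person ordering (Alice, then Bob), not city order.
        for (Address address : transform(people, p -> p.address)) {
            System.out.println(address.city);
        }
        System.out.println(fitsInNaturalTreeSet(new Address("Oslo"))); // prints false
    }
}
```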
On Mon, Mar 14, 2011 at 4:07 AM, Peter Levart <peter.levart at marand.si> wrote:
> On 03/10/11, Rémi Forax wrote:
> > Le 10/03/2011 13:10, Peter Levart a écrit :
> > > Let Iterable by default (via defender methods) be lazy but let
> > > Collection (also via defender methods) override the same methods and make
> > > them eager. Also provide two methods to switch between eager and lazy on
> > > the Iterable itself (and override in Collection and subtypes).
> >
> > If Collection.filter overrides Iterable.filter, they should have the
> > same semantics.
> > And as I already said, how do you implement toEager()?
>
> Now, after more discussion on the list, I'm not supporting my suggestion any
> further. But I'm not keen on a new "Stream" type either. The original fear
> that lazy evaluation of filtering/mapping/reducing/etc. operations on
> collections would feel unnatural to the casual programmer, because the
> collections library currently does not use any form of lazy evaluation, is,
> I think, at least partially unfounded.
>
> Lazy evaluation of operations on a collection can be described as having two
> subtle semantic properties:
>
> 1. The evaluation of predicates, mappers, reducers, etc. happens at the
> latest possible moment, when the resulting data is requested (iterated over,
> collected into a proper collection, etc.)
>
> 2. The original data that is the source for the operations is stored in the
> source collection (the resulting Iterable is not detached from the original
> data - it is a transformed view over the source collection). So modifying the
> source collection before or in the middle of iteration over the result has a
> consequence on the result.
>
> The 2nd property is already known to users of the collections library
> (Map.keySet(), Map.values(), ...) and therefore, if properly documented,
> should not cause any additional confusion.
>
> The 1st property is something new that everyone will have to learn since
> more and more APIs can be expected to exhibit this behaviour after lambdas
> are introduced to Java. Programmers will have to learn about pure functions
> and why they are their friends...
>
> I'm now a supporter of lazy java.lang.Iterable (and by inheritance also
> java.util.Collection). Overall it represents a simpler and more consistent
> API with no need for an additional Stream type. It does not exhibit the
> problems that eager evaluation does:
>
> - what should the eager result type be and what implementation should be
> chosen (ArrayList/HashSet/TreeSet)
> - inherent sub-optimality (think of how easy it is to write: s = s +
> "xxx"; s = s + "..."; ...instead of using StringBuilder)
>
> Peter
>
> >
> > Rémi
> >
> >
> >
>
>
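
For reference, the two properties of lazy evaluation described in the quoted
message can be seen in a few lines. The transform() helper here is a made-up
lazy view, not an existing or proposed method. Map.keySet() is the real JDK
view mentioned in the quote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LazyViewDemo {
    // Hypothetical lazy transform: a view over source, evaluated on iteration.
    static <T, R> Iterable<R> transform(Iterable<T> source,
                                        Function<? super T, ? extends R> fn) {
        return () -> {
            Iterator<T> it = source.iterator();
            return new Iterator<R>() {
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public R next() { return fn.apply(it.next()); }
            };
        };
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        List<String> source = new ArrayList<>(Arrays.asList("a", "bb"));

        Iterable<Integer> lengths = transform(source, s -> {
            calls.incrementAndGet();
            return s.length();
        });

        // Property 1: the mapper has not run yet.
        System.out.println(calls.get()); // prints 0

        // Property 2: a change to the source made before iteration shows up
        // in the result, because the view holds no data of its own.
        source.add("ccc");
        for (Integer length : lengths) {
            System.out.println(length); // prints 1, then 2, then 3
        }

        // The same view behaviour already exists in the JDK.
        Map<String, Integer> ages = new HashMap<>();
        Set<String> names = ages.keySet();
        ages.put("Ann", 30);
        System.out.println(names.contains("Ann")); // prints true
    }
}
```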
More information about the lambda-dev
mailing list