Design for collections upgrades
Rémi Forax
forax at univ-mlv.fr
Mon Mar 14 07:36:02 PDT 2011
On 03/14/2011 03:33 PM, Sam Pullara wrote:
> If we use Iterable for stream it would be the same number of objects
> since the for loop is making the same object. The only real difference
> is the lambda.
>
I don't follow you on this.
The for loop creates an Iterator, not an Iterable.
Rémi
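
To make the Iterator/Iterable distinction concrete, here is a minimal sketch in plain Java. It assumes nothing beyond java.util; the class ForEachDesugar and its CountingList helper are hypothetical names invented for this illustration. The enhanced for loop asks its Iterable for a fresh Iterator via iterator(), so the loop allocates an Iterator and leaves the Iterable itself untouched.

```java
import java.util.ArrayList;
import java.util.Iterator;

public class ForEachDesugar {
    /** Hypothetical helper: an ArrayList that counts calls to iterator(). */
    public static final class CountingList<E> extends ArrayList<E> {
        public int iteratorCalls = 0;

        @Override
        public Iterator<E> iterator() {
            iteratorCalls++;
            return super.iterator();
        }
    }

    /** Sums with the enhanced for loop; the compiler calls source.iterator() once. */
    public static int sumWithForEach(Iterable<Integer> source) {
        int sum = 0;
        for (Integer i : source) {
            sum += i;
        }
        return sum;
    }

    /** Roughly what the loop above expands to: an explicit Iterator. */
    public static int sumWithIterator(Iterable<Integer> source) {
        int sum = 0;
        for (Iterator<Integer> it = source.iterator(); it.hasNext(); ) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        CountingList<Integer> list = new CountingList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(sumWithForEach(list));   // prints 6
        System.out.println(sumWithIterator(list));  // prints 6
        System.out.println(list.iteratorCalls);     // prints 2, one Iterator per loop
    }
}
```

Both methods create exactly one Iterator per traversal, which is the per-loop object cost being discussed.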
> On Mar 14, 2011, at 7:30 AM, Rémi Forax <forax at univ-mlv.fr> wrote:
>
>> The corresponding code is:
>> return people.map(#{ p -> p.getAddress() }).to(new ArrayList<>());
>>
>> It creates one additional object (the stream) compared to the code you
>> provided, and should be a little slower because currently the lambda code
>> is not inlined into the code of map().
>> But if the list is not big (< 10 000 elements) you should not notice it.
>>
>> Rémi
>>
>> On 03/14/2011 02:25 PM, Craig P. Motlin wrote:
>>> The problem is that the result of calling filter or transform on a lazy
>>> collection is another lazy collection, which makes them very hard to adopt.
>>> There's a lot of code out there that's already doing transformations like
>>> this:
>>>
>>> public List<Address> doTransform(List<Person> people)
>>> {
>>>     List<Address> result = new ArrayList<Address>();
>>>     for (Person person : people)
>>>     {
>>>         result.add(person.getAddress());
>>>     }
>>>     return result;
>>> }
>>>
>>> Since it's public, it might be really hard to change the return type to a
>>> lazy collection. The same argument applies for fields. In order to use a
>>> lambda here, the code would become transform().asList(), which will be
>>> slower than the original code. And if it's slower, it will hurt adoption.
>>>
>>> There is one problem with covariant return types at compile time. It's easy
>>> for filter(), it can always be covariant. It's a little different for
>>> transform(). When you transform an ArrayList, you should get back another
>>> ArrayList by default. Sometimes people will want to collapse duplicates but
>>> you shouldn't have to create an ArrayList just to create a HashSet right
>>> afterwards. So it makes sense to have a second form of transform() and
>>> filter() that take a second argument, a mutable collection to dump their
>>> output into. HashSet.transform() should also return a HashSet by default.
>>> For TreeSet, that's impossible though. If you're starting with a
>>> TreeSet<Person> then it was constructed with some Comparator<Person>. You
>>> can't return SortedSet<Address> from transform() because there's no
>>> Comparator<Address>. It actually makes sense for SortedSet.transform() to
>>> return a List. Since SortedSet extends Set, there are two difficult choices.
>>>
>>> - Have Set and SortedSet both return HashSet
>>> - Have Set return Set and SortedSet return List but the static type they
>>> return is just Collection
>>>
>>>
>>> On Mon, Mar 14, 2011 at 4:07 AM, Peter Levart <peter.levart at marand.si> wrote:
>>>
>>>> On 03/10/11, Rémi Forax wrote:
>>>>> On 10/03/2011 13:10, Peter Levart wrote:
>>>>>> Let Iterable by default (via defender methods) be lazy but let
>>>>>> Collection (also via defender methods) override the same methods and
>>>>>> make them eager. Also provide two methods to switch between eager and
>>>>>> lazy on the Iterable itself (and override in Collection and subtypes).
>>>>> If Collection.filter overrides Iterable.filter, they should have the
>>>>> same semantics.
>>>>> And as I already said, how do you implement toEager()?
>>>> Now, after more discussion on the list, I'm not supporting my suggestion any
>>>> further. But I'm not keen on a new "Stream" type either. The original fear
>>>> that lazy evaluation of filtering/mapping/reducing/etc. operations on
>>>> collections would feel unnatural to the casual programmer, because the
>>>> collections library currently does not use any form of lazy evaluation, is,
>>>> I think, at least partially unfounded.
>>>>
>>>> Lazy evaluation of operations on collections can be described as having two
>>>> subtle semantic properties:
>>>>
>>>> 1. The evaluation of predicates, mappers, reducers, etc. happens at the
>>>> latest possible moment, when the resulting data is requested (iterated over,
>>>> collected into a proper collection, etc.)
>>>>
>>>> 2. The original data that is the source for the operations is stored in the
>>>> source collection (the resulting Iterable is not detached from the original
>>>> data - it is a transformed view over the source collection). So modifying
>>>> the source collection before or in the middle of iteration over the result
>>>> has a consequence on the result.
>>>>
>>>> The 2nd property is already known to users of the collections library
>>>> (Map.keySet(), Map.values(), ...) and therefore, if properly documented,
>>>> should not cause any additional confusion.
>>>>
>>>> The 1st property is something new that everyone will have to learn since
>>>> more and more APIs can be expected to exhibit this behaviour after lambdas
>>>> are introduced to Java. Programmers will have to learn about pure functions
>>>> and why they are their friends...
>>>>
>>>> I'm now a supporter of lazy java.lang.Iterable (and by inheritance also
>>>> java.util.Collection). Overall it represents a simpler and more consistent
>>>> API with no need for an additional Stream type. It does not exhibit the
>>>> problems that eager evaluation does:
>>>>
>>>> - what should the eager result type be and what implementation should be
>>>> chosen (ArrayList/HashSet/TreeSet)
>>>> - inherent sub-optimality (think of how easy it is to write: s = s +
>>>> "xxx"; s = s + "..."; ...instead of using StringBuilder)
>>>>
>>>> Peter
>>>>
>>>>> Rémi
>>>>>
>>>>>
>>>>>
>>
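
The two properties of lazy evaluation described in the thread, together with the idea of dumping results into a caller-supplied collection, can be sketched as follows. This is only an illustration under stated assumptions: LazyViews, map() and into() are hypothetical names and not part of any proposed API, and the code uses the lambda syntax that later shipped in Java 8 rather than the #{ ... } strawman syntax quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class LazyViews {
    /**
     * Hypothetical helper: returns a lazy, mapped view over source.
     * Nothing is evaluated here; the mapper runs only during iteration.
     */
    public static <T, R> Iterable<R> map(Iterable<T> source,
                                         Function<? super T, ? extends R> mapper) {
        return () -> {
            Iterator<T> it = source.iterator();
            return new Iterator<R>() {
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public R next() { return mapper.apply(it.next()); }
            };
        };
    }

    /**
     * Hypothetical helper: drains a view into a collection chosen by the
     * caller, so no intermediate ArrayList is needed to build a HashSet.
     */
    public static <R, C extends Collection<R>> C into(Iterable<R> view, C target) {
        for (R element : view) {
            target.add(element);
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("ann", "bob"));
        int[] calls = {0}; // counts how often the mapper actually runs

        Iterable<Integer> lengths = map(names, s -> { calls[0]++; return s.length(); });
        System.out.println(calls[0]); // prints 0: property 1, nothing evaluated yet

        names.add("carol"); // property 2: the view sees later changes to the source

        ArrayList<Integer> asList = into(lengths, new ArrayList<Integer>());
        System.out.println(asList);   // prints [3, 3, 5]
        System.out.println(calls[0]); // prints 3

        // Collapsing duplicates goes straight into a HashSet.
        HashSet<Integer> asSet = into(lengths, new HashSet<Integer>());
        System.out.println(asSet.size()); // prints 2
    }
}
```

Each call to into() walks the view again, so the mapper is re-applied on every traversal; that is the price of keeping the result a view rather than a detached copy.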