Design for collections upgrades
Craig P. Motlin
cmotlin at gmail.com
Sun Mar 13 14:08:29 PDT 2011
The Scala collections framework uses eager evaluation by default and uses
the .view() method to switch over to the lazy world. It seems to work well
in practice.
Eager evaluation should stay for performance reasons. It should stay as the
*default* for the reasons Brian mentioned. Everyone assumes that lazy
evaluation would usually be faster, but the Scala community has discovered
that that's usually not the case (on the JVM). Of course, it will be the
case if you're doing some short circuit evaluation, like find() the first
element that satisfies some predicate in an already filtered list. Some
performance testing is in order.
On Sun, Mar 13, 2011 at 4:15 PM, Sam Pullara <sam at sampullara.com> wrote:
> If we name List.filter() and Iterable.filter() the same thing there will be
> no way to get to the default lazy version of filter() on a List even if the
> Iterable version of filter() is lazy without either naming the methods
> differently or requiring something that changes a List into something that
> doesn't implement List but does implement Iterable or another streaming
> interface. Am I missing something?
>
> One solution might be to put those methods on Iterator and require
> .iterator() (simply another name for .asStream() really). This would be kind
> of nice in that the for() loop would now use the lazy versions by default.
>
> Higher-level question: Does anyone really think that developers should use
> the eager versions or are we putting them in there as some kind of idiot
> proof implementation?
>
> Sam
>
> On Mar 13, 2011, at 12:47 PM, Rémi Forax wrote:
>
> > asStream() is not required at all.
> >
> > The question is more, should we mix lazy and eager method in the same
> > interface or not.
> > If the answer is No. asStream() is a way to specify that you want to go
> > to the lazy world.
> > If the answer is Yes. asStream is not needed and basically all static
> > methods can goes into their
> > interface as extension method.
> >
> > Rémi
> >
> >
> > On 03/13/2011 08:01 PM, Sam Pullara wrote:
> >> For the last year or so I have been using the Google Collections
> library, now distributed as part of Guava. It has most of the APIs that we
> are talking about and I think that regular Java programmers are using them.
> They separate out Lists, Iterables, Sets, Maps, Collections and Iterators.
> Most of the code that I have written would work great with the methods in
> these classes added as extension methods to the normal APIs. Function
> literals would of course slot right in. Here are some links to their APIs:
> >>
> >>
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Iterables.html
> >>
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Iterators.html
> >>
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Lists.html
> >>
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Sets.html
> >>
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Maps.html
> >>
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Collections2.html
> >>
> >> My understanding is that they have an advantage over extension methods
> in the caller determines how to treat the target collection. For example, if
> you use:
> >>
> >> public class Iterables {
> >> public static<F,T> Iterable<T> transform(Iterable<F> fromIterable,
> Function<? super F,? extends T> function)...
> >> }
> >>
> >> You get streaming behavior whereas the equivalent in Lists is evaluated
> eagerly:
> >>
> >> public class Lists {
> >> public static<F,T> List<T> transform(List<F> fromList, Function<?
> super F,? extends T> function)
> >> }
> >>
> >> Apparently with extension methods we don't have this luxury as the type
> at the call site does not pick the extension method default but instead the
> underlying implementation does. Because of this it appears that the current
> proposals for things like .asStream() will actually end up being more
> verbose than using their Guava counterparts once function literals are
> available. As for the naming conventions, Guava went with filter/transform.
> Personally, I renamed them f/t since chaining together several in Guava gets
> a little silly with such long names since they show up next to one another.
> Here is some current code of mine:
> >>
> >> return l(t(f(s.bagService.featured(0, 30L, category),
> new P<Row<Bag, byte[]>>() {
> >> public boolean apply(@Nullable Row<Bag, byte[]>
> input) {
> >> return input != null&&
> ids.add(input.value.owner());
> >> }
> >> }), new F<Row<Bag, byte[]>, BagCodeFactory.BagCode>() {
> >> public BagCodeFactory.BagCode apply(Row<Bag, byte[]>
> input) {
> >> return new BagCodeFactory(s).getBag(page, input);
> >> }
> >> }), count);
> >>
> >> l = limit, t = transform, f = filter
> >>
> >> In JDK8 with function literals it would look like (featured() returns an
> Iterable):
> >>
> >> return l(t(f(s.bagService.featured(0, 30L, category), #{
> input -> input != null&& ids.add(input.value.owner())}),
> >> #{ input -> new BagCodeFactory(s).getBag(page,
> input)}), count);
> >>
> >> Now, using some of the suggestions on the list so far, it could look
> like:
> >>
> >> return s.bagService.featured(0, 30L,
> category).asStream().fiter(#{ input -> input != null&&
> ids.add(input.value.owner())})
> >> .transform(#{ input -> new
> BagCodeFactory(s).getBag(page, input)}).limit(count);
> >>
> >> Which isn't too bad though since featured() is already returning an
> Iterable it feels redundant to asStream() it especially since it really is
> an Iterable only and doesn't use an underlying Java collection at all. It
> concerns me that the call site can't control behavior through typing. If we
> only put the extension methods on Iterable/Iterator it would be pretty easy
> for the developer to see that the operations are lazy and there would never
> be any confusion as to the behavior you would get at the call site.
> >>
> >> Sam
> >>
> >> On Mar 13, 2011, at 4:12 AM, Steven Simpson wrote:
> >>
> >>> On 10/03/11 13:57, Rémi Forax wrote:
> >>>> Yes, we need to provide methods that filter/map directly
> >>>> the content of a collections.
> >>>> But I don't think it's a good idea to name them filter or map.
> >>>>
> >>>> Why not filterAll and mapAll ?
> >>> I'm starting to lose track of this thread, but I recall the following
> >>> points:
> >>>
> >>> * Lazy operations are desirable, as are eager ones, and in-place
> ones.
> >>> * There could be a type or family thereof for lazy collections, e.g.
> >>> Stream or Iterable.
> >>> * Collections currently don't have methods for generating new
> >>> collections, so adding them is a considerable shift. They do have
> >>> in-place methods, and new methods like 'filter' share the same
> >>> tense, suggesting that they should be in-place methods too.
> >>>
> >>> In case it hasn't yet been mentioned, I'd like to add that collections
> >>> also support /views/ in several places, e.g. subList, entrySet, etc.
> Do
> >>> we need a lazy type when we can have views? I haven't pinned it down
> in
> >>> my mind yet, but I suspect that views are slightly different to
> streams,
> >>> in that they also allow us to modify the original collection. Or, to
> >>> put it another way, they differ from mutating methods, in that we can
> >>> choose not to perform any mutation.
> >>>
> >>> When we want to delete a portion of a list, we write:
> >>>
> >>> list.subList(a, b).clear()
> >>>
> >>> ...because the sublist view allows us to modify the original list. But
> >>> if we want to extract a sublist, and leave the original untouched, we
> write:
> >>>
> >>> new ArrayList(list.subList(a, b))
> >>>
> >>> List.subList could be regarded as lazy, until we apply a mutating
> method
> >>> to it like clear(). And if we never do that, it stays lazy even while
> >>> we do an eager copy.
> >>>
> >>> Can we do the same with filter (using a noun like "selection" to be
> >>> consistent with subList)?
> >>>
> >>> list.selection(pred).clear(); // mutating original; removeAll(pred)?
> >>> new ArrayList(list.selection(pred)); // preserving original
> >>>
> >>> Views, having ordinary collection types, remain Iterable too:
> >>>
> >>> for (Element elem : list.selection(pred)) { ... }
> >>>
> >>> I suppose my point is that there is already a precedent for (mutable)
> >>> laziness in the collection framework, in the form of views.
> >>>
> >>> Cheers,
> >>>
> >>> Steven
> >>>
> >>> --
> >>>
> >>>
> >>>
> >>>
> >>
> >
> >
>
>
>
More information about the lambda-dev
mailing list