Design for collections upgrades

Pavel Minaev int19h at gmail.com
Sun Mar 13 14:18:43 PDT 2011


Note that the notion of "default" type implies that operations are named the
same. If names themselves clearly express the difference, then neither is
default, really.

I think the concern isn't so much that lazy will be faster, but rather that
semantics of eager and lazy (not just perf, but also things like when
exceptions get thrown) are different enough that it should be obvious in
code which is which.

On Sun, Mar 13, 2011 at 2:08 PM, Craig P. Motlin <cmotlin at gmail.com> wrote:

> The Scala collections framework uses eager evaluation by default and uses
> the .view() method to switch over to the lazy world. It seems to work well
> in practice.
>
> Eager evaluation should stay for performance reasons. It should stay as the
> *default* for the reasons Brian mentioned. Everyone assumes that lazy
> evaluation would usually be faster, but the Scala community has discovered
> that that's usually not the case (on the JVM). Of course, it will be the
> case if you're doing some short circuit evaluation, like find() the first
> element that satisfies some predicate in an already filtered list. Some
> performance testing is in order.
>
> On Sun, Mar 13, 2011 at 4:15 PM, Sam Pullara <sam at sampullara.com> wrote:
>
> > If we name List.filter() and Iterable.filter() the same thing there will
> be
> > no way to get to the default lazy version of filter() on a List even if
> the
> > Iterable version of filter() is lazy without either naming the methods
> > differently or requiring something that changes a List into something
> that
> > doesn't implement List but does implement Iterable or another streaming
> > interface. Am I missing something?
> >
> > One solution might be to put those methods on Iterator and require
> > .iterator() (simply another name for .asStream() really). This would be
> kind
> > of nice in that the for() loop would now use the lazy versions by
> default.
> >
> > Higher-level question: Does anyone really think that developers should
> use
> > the eager versions or are we putting them in there as some kind of idiot
> > proof implementation?
> >
> > Sam
> >
> > On Mar 13, 2011, at 12:47 PM, Rémi Forax wrote:
> >
> > > asStream() is not required at all.
> > >
> > > The question is more, should we mix lazy and eager method in the same
> > > interface or not.
> > > If the answer is No. asStream() is a way to specify that you want to go
> > > to the lazy world.
> > > If the answer is Yes. asStream is not needed and basically all static
> > > methods can goes into their
> > > interface as extension method.
> > >
> > > Rémi
> > >
> > >
> > > On 03/13/2011 08:01 PM, Sam Pullara wrote:
> > >> For the last year or so I have been using the Google Collections
> > library, now distributed as part of Guava. It has most of the APIs that
> we
> > are talking about and I think that regular Java programmers are using
> them.
> > They separate out Lists, Iterables, Sets, Maps, Collections and
> Iterators.
> > Most of the code that I have written would work great with the methods in
> > these classes added as extension methods to the normal APIs. Function
> > literals would of course slot right in. Here are some links to their
> APIs:
> > >>
> > >>
> >
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Iterables.html
> > >>
> >
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Iterators.html
> > >>
> >
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Lists.html
> > >>
> >
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Sets.html
> > >>
> >
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Maps.html
> > >>
> >
> http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Collections2.html
> > >>
> > >> My understanding is that they have an advantage over extension methods
> > in the caller determines how to treat the target collection. For example,
> if
> > you use:
> > >>
> > >> public class Iterables {
> > >> public static<F,T>  Iterable<T>  transform(Iterable<F>  fromIterable,
> > Function<? super F,? extends T>  function)...
> > >> }
> > >>
> > >> You get streaming behavior whereas the equivalent in Lists is
> evaluated
> > eagerly:
> > >>
> > >> public class Lists {
> > >> public static<F,T>  List<T>  transform(List<F>  fromList, Function<?
> > super F,? extends T>  function)
> > >> }
> > >>
> > >> Apparently with extension methods we don't have this luxury as the
> type
> > at the call site does not pick the extension method default but instead
> the
> > underlying implementation does. Because of this it appears that the
> current
> > proposals for things like .asStream() will actually end up being more
> > verbose than using their Guava counterparts once function literals are
> > available. As for the naming conventions, Guava went with
> filter/transform.
> > Personally, I renamed them f/t since chaining together several in Guava
> gets
> > a little silly with such long names since they show up next to one
> another.
> > Here is some current code of mine:
> > >>
> > >>                 return l(t(f(s.bagService.featured(0, 30L, category),
> > new P<Row<Bag, byte[]>>() {
> > >>                   public boolean apply(@Nullable Row<Bag, byte[]>
> >  input) {
> > >>                     return input != null&&
> >  ids.add(input.value.owner());
> > >>                   }
> > >>                 }), new F<Row<Bag, byte[]>, BagCodeFactory.BagCode>()
> {
> > >>                   public BagCodeFactory.BagCode apply(Row<Bag, byte[]>
> >  input) {
> > >>                     return new BagCodeFactory(s).getBag(page, input);
> > >>                   }
> > >>                 }), count);
> > >>
> > >> l = limit, t = transform, f = filter
> > >>
> > >> In JDK8 with function literals it would look like (featured() returns
> an
> > Iterable):
> > >>
> > >>                 return l(t(f(s.bagService.featured(0, 30L, category),
> #{
> > input ->  input != null&&  ids.add(input.value.owner())}),
> > >>                      #{ input ->  new BagCodeFactory(s).getBag(page,
> > input)}), count);
> > >>
> > >> Now, using some of the suggestions on the list so far, it could look
> > like:
> > >>
> > >>                 return s.bagService.featured(0, 30L,
> > category).asStream().fiter(#{ input ->  input != null&&
> >  ids.add(input.value.owner())})
> > >>                      .transform(#{ input ->  new
> > BagCodeFactory(s).getBag(page, input)}).limit(count);
> > >>
> > >> Which isn't too bad though since featured() is already returning an
> > Iterable it feels redundant to asStream() it especially since it really
> is
> > an Iterable only and doesn't use an underlying Java collection at all. It
> > concerns me that the call site can't control behavior through typing. If
> we
> > only put the extension methods on Iterable/Iterator it would be pretty
> easy
> > for the developer to see that the operations are lazy and there would
> never
> > be any confusion as to the behavior you would get at the call site.
> > >>
> > >> Sam
> > >>
> > >> On Mar 13, 2011, at 4:12 AM, Steven Simpson wrote:
> > >>
> > >>> On 10/03/11 13:57, Rémi Forax wrote:
> > >>>>  Yes, we need to provide methods that filter/map directly
> > >>>> the content of a collections.
> > >>>> But I don't think it's a good idea to name them filter or map.
> > >>>>
> > >>>> Why not filterAll and mapAll ?
> > >>> I'm starting to lose track of this thread, but I recall the following
> > >>> points:
> > >>>
> > >>>    * Lazy operations are desirable, as are eager ones, and in-place
> > ones.
> > >>>    * There could be a type or family thereof for lazy collections,
> e.g.
> > >>>      Stream or Iterable.
> > >>>    * Collections currently don't have methods for generating new
> > >>>      collections, so adding them is a considerable shift.  They do
> have
> > >>>      in-place methods, and new methods like 'filter' share the same
> > >>>      tense, suggesting that they should be in-place methods too.
> > >>>
> > >>> In case it hasn't yet been mentioned, I'd like to add that
> collections
> > >>> also support /views/ in several places, e.g. subList, entrySet, etc.
> >  Do
> > >>> we need a lazy type when we can have views?  I haven't pinned it down
> > in
> > >>> my mind yet, but I suspect that views are slightly different to
> > streams,
> > >>> in that they also allow us to modify the original collection.  Or, to
> > >>> put it another way, they differ from mutating methods, in that we can
> > >>> choose not to perform any mutation.
> > >>>
> > >>> When we want to delete a portion of a list, we write:
> > >>>
> > >>>  list.subList(a, b).clear()
> > >>>
> > >>> ...because the sublist view allows us to modify the original list.
>  But
> > >>> if we want to extract a sublist, and leave the original untouched, we
> > write:
> > >>>
> > >>>  new ArrayList(list.subList(a, b))
> > >>>
> > >>> List.subList could be regarded as lazy, until we apply a mutating
> > method
> > >>> to it like clear().  And if we never do that, it stays lazy even
> while
> > >>> we do an eager copy.
> > >>>
> > >>> Can we do the same with filter (using a noun like "selection" to be
> > >>> consistent with subList)?
> > >>>
> > >>>  list.selection(pred).clear(); // mutating original; removeAll(pred)?
> > >>>  new ArrayList(list.selection(pred)); // preserving original
> > >>>
> > >>> Views, having ordinary collection types, remain Iterable too:
> > >>>
> > >>>  for (Element elem : list.selection(pred)) { ... }
> > >>>
> > >>> I suppose my point is that there is already a precedent for (mutable)
> > >>> laziness in the collection framework, in the form of views.
> > >>>
> > >>> Cheers,
> > >>>
> > >>> Steven
> > >>>
> > >>> --
> > >>>
> > >>>
> > >>>
> > >>>
> > >>
> > >
> > >
> >
> >
> >
>
>


More information about the lambda-dev mailing list