Into

Remi Forax forax at univ-mlv.fr
Sat Dec 22 14:24:23 PST 2012


On 12/22/2012 09:21 PM, Doug Lea wrote:
>
> I'm happy to see "into" go. But I'm not all the way sold on whether
> the best way is via more abstraction (Reducers, Tabulators, etc)
> versus less. Here's the "less" version.
> I don't want to argue too hard for taking this approach,
> but figure that is worth describing:
>
> For mutative additions to collections/maps, why not
> just let people use either forEach(...add...)  or
> parallel-forEach(...add..), depending on
> whether the destination is concurrent and the source
> amenable to parallelism, and/or whether, in the case of
> Maps, they want put vs putIfAbsent vs merge. The idioms
> are easy, and clearly reflect mutative intent. (*)

You're asking people to take care of the concurrency in their code
instead of letting the pipeline take care of that.
While it should be possible (that's why there is a forEach), it should
not be the default idiom, because people will write
   ArrayList<String> list = new ArrayList<>();
   list.parallel().filter(...).forEach(list::add);
We should prefer a slow into() to a fast forEach() that only works if 
your name is Doug or Brian.
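
To make the contrast concrete, here is a minimal sketch written against
the java.util.stream API as it later shipped in Java 8, which is an
assumption on my part since it postdates this thread. The class and
method names (ForEachVsCollect, viaCollect, viaForEach) are
hypothetical. The first method lets the pipeline manage the
destination; the second is the forEach idiom, which is only correct
because the caller remembered to pick a thread-safe destination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class ForEachVsCollect {
    // Pipeline-managed: the library creates and merges the containers,
    // so the caller never touches a shared mutable list.
    static List<String> viaCollect(List<String> source) {
        return source.parallelStream()
                     .filter(s -> !s.isEmpty())
                     .collect(Collectors.toList());
    }

    // Caller-managed: correct only because the destination is synchronized.
    // With a plain ArrayList this would be a data race. Element order in
    // the result is not guaranteed.
    static List<String> viaForEach(List<String> source) {
        List<String> dest = Collections.synchronizedList(new ArrayList<>());
        source.parallelStream()
              .filter(s -> !s.isEmpty())
              .forEach(dest::add);
        return dest;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "", "b", "c");
        System.out.println(viaCollect(source));
        System.out.println(viaForEach(source));
    }
}
```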

>
> For more functional/fluent/streamy usages, you'd like
> to enforce that any into-ish method creates a fresh instance
> and that construction cannot interfere with anything else.
> So why not treat these as factories in existing categories:
>   toCollection(), toList(), toRandomAccessList()
>   toSet(), toNavigableSet();
> plus grouping versions
>   toMap(keyfn, ...), toNavigableMap(keyFn, ...);
> (with merge- and multi- variants.)
>
> Here the streams implementation gets to pick the
> underlying concrete class. This may entail sensing
> parallelism but only in the case of toList is
> constrained by orderedness.
> The streams implementation could pick the best applicable
> options and improve them or pick others over time.
> (For example, initially, all of toCollection(), toList(),
> and toRandomAccessList() could pick ArrayList.)
> The implementation can also take advantage of the fact
> that some collections (especially ArrayList) support fast
> parallel insertion upon creation but not once published.

Can't you use arrayList.addAll(arrayList2)?
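
As a sketch of the addAll idea: each chunk of a parallel source fills
its own private ArrayList, and the partial lists are then merged with
addAll. The code below assumes the three-argument Stream.collect from
the Java 8 API as it later shipped; the class and method names
(AddAllCombine, evens) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AddAllCombine {
    // Keeps the even numbers, building the result from per-chunk lists.
    static ArrayList<Integer> evens(List<Integer> source) {
        return source.parallelStream()
                     .filter(i -> i % 2 == 0)
                     .collect(ArrayList::new,      // fresh private list per chunk
                              ArrayList::add,      // accumulate within a chunk
                              ArrayList::addAll);  // merge two partial lists
    }

    public static void main(String[] args) {
        System.out.println(evens(Arrays.asList(1, 2, 3, 4, 5, 6)));
    }
}
```

No list is shared between threads while it is being filled, so no
concurrent destination is needed.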

>
> People who don't like the choices of concrete classes
> can instead use the first option of creating some concrete
> collections and  populating them.
>
> Summary: Replace into with:
>   * manual forEach-based idioms for mutative stuff
>   * opaque factory-based methods for fluent/function stuff
> And triage out for now any other non-collections-based into targets

One thing that can be done is to write into this way:
   into(Supplier<? extends T> supplier)
and use it this way:
   stream.into(ArrayList::new)

The implementation of the pipeline can check whether the Supplier is a
constructor reference to a well-known JDK collection and optimize.
But in that case, users cannot use Collections.newSetFromMap.
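
A rough sketch of a supplier-based into, written as a static helper on
top of the Java 8 API as it later shipped (Collectors.toCollection).
The helper, its signature, and the class name IntoSketch are
hypothetical and only illustrate the shape of the idea; this is not a
proposal for the actual signature. A plain helper like this accepts any
Supplier, including one built on Collections.newSetFromMap, but such a
supplier is not a constructor reference, so presumably a pipeline that
only recognizes constructor references would have nothing to optimize.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IntoSketch {
    // Hypothetical helper, not a JDK method: the supplier creates the
    // destination, the pipeline decides how to fill it.
    static <T, C extends Collection<T>> C into(Stream<T> stream, Supplier<C> supplier) {
        return stream.collect(Collectors.toCollection(supplier));
    }

    public static void main(String[] args) {
        // Constructor reference to a well-known JDK collection.
        ArrayList<String> list = into(Stream.of("b", "a", "b"), ArrayList::new);

        // Arbitrary supplier: works, but is opaque to the pipeline.
        Supplier<Set<String>> setSupplier =
            () -> Collections.newSetFromMap(new ConcurrentHashMap<>());
        Set<String> set = into(Stream.of("b", "a", "b"), setSupplier);

        System.out.println(list);
        System.out.println(set);
    }
}
```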

>
> (*) footnote: There are now a few more options for
> parallel insertions into concurrent collections.
> A while ago I added a "newKeySet" factory
> method to JDK8/V8 version of CHM, so
> there is now a JDK Concurrent Set implementation
> that people can use. Someday similarly for skip lists,
> so there will be a concurrent sorted/navigable set.
> High-performance concurrent Lists are unlikely any
> time soon, although ReadMostlyVector is better
> than nothing. (I'm still not sure if it should move
> from jsr166e.extra into JDK...).
>
> -Doug
>
>
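
For reference, a small sketch of the parallel insertion described in
the footnote, using ConcurrentHashMap.newKeySet() as the destination.
It assumes the Java 8 API as it later shipped; the class and method
names (ConcurrentSetTarget, distinctRemainders) are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class ConcurrentSetTarget {
    // Collects the distinct values of i % modulus for i in [0, n),
    // inserting from several threads into a concurrent set.
    static Set<Integer> distinctRemainders(int n, int modulus) {
        Set<Integer> dest = ConcurrentHashMap.newKeySet();
        IntStream.range(0, n)
                 .parallel()
                 .forEach(i -> dest.add(i % modulus));
        return dest;
    }

    public static void main(String[] args) {
        System.out.println(distinctRemainders(1000, 7).size());
    }
}
```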

Rémi



More information about the lambda-libs-spec-observers mailing list