New convenience methods on Stream
Donald Raab
donraab at gmail.com
Mon Jun 14 04:01:39 UTC 2021
Thank you for the response Tagir!
The intent of all three of these proposals was to explore better ways of improving interop between Streams and 3rd party collection libraries.
I agree with you on option 1. This is my least favorite option, but seemed the most obvious shortcut and should be relatively straightforward to implement. It primarily saves characters, and can save creating additional Collectors that then have to be discovered and learned by developers. We named our static utility class in Eclipse Collections Collectors2 to try and help with discovery as it should show up near Collectors in IDE lookups.
Option 3 is also straightforward to implement, and has the additional utility you identified for existing collections. As you point out it can also be optimized for JDK types like ArrayList. Unfortunately, there is no general interface for ensureCapacity, although maybe there could be if this method is deemed worthwhile.
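To make option 3 concrete, here is a minimal sketch of how an into-style operation could behave, written as a static helper because Stream has no such method today. The class name IntoSketch and the helper into are hypothetical, and the sizing check is only one possible way to apply the ensureCapacity optimization mentioned above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;

class IntoSketch {
    // Hypothetical static stand-in for the proposed Stream.into(collection).
    static <T, C extends Collection<? super T>> C into(Stream<T> stream, C target) {
        Spliterator<T> spliterator = stream.spliterator();
        // getExactSizeIfKnown() returns -1 when the stream is not SIZED.
        long size = spliterator.getExactSizeIfKnown();
        if (size > 0 && target instanceof ArrayList) {
            long needed = (long) target.size() + size;
            if (needed <= Integer.MAX_VALUE) {
                // Pre-size the backing array once instead of growing it repeatedly.
                ((ArrayList<?>) target).ensureCapacity((int) needed);
            }
        }
        // Semantically the same as stream.forEach(target::add).
        spliterator.forEachRemaining(target::add);
        return target;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("a"));
        into(Stream.of("b", "c"), names);
        System.out.println(names); // prints [a, b, c]
    }
}
```

The ArrayList check is the only type-specific branch here; without a general ensureCapacity interface, every other collection falls back to plain element-by-element adds.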
Option 2 was interesting to me because there are quite a few existing methods in 3rd party collection libraries that take varargs to create collection types. I found and listed 24 factory methods below in different frameworks where this could be used today, and there are likely many more. It did bother me having a '(T[])' cast, but I couldn’t see another alternative that would allow all these existing varargs methods to be useful. I had thought of something similar to the Stream transform idea you shared, and this works with Eclipse Collections today as we have factory methods which take Streams, but I didn’t initially see this option working anywhere else today as I couldn’t find any other library providing factory methods that take Streams. That doesn’t mean they couldn't be added eventually of course, but I was looking at what was available to leverage today.
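The concern about the cast can be illustrated with a small sketch. It assumes a static stand-in for the proposed to method and a variant that takes an array generator; the names ToSketch, to, toTyped and joinAll are all hypothetical. The point is only that toArray() returns an Object[] at runtime, so a function that really requires a String[] fails, while toArray(String[]::new) does not.

```java
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.stream.Stream;

class ToSketch {
    // Hypothetical stand-in for the proposed to(); the cast is unchecked
    // because Stream.toArray() always returns an Object[] at runtime.
    @SuppressWarnings("unchecked")
    static <T, R> R to(Stream<T> stream, Function<T[], R> function) {
        return function.apply((T[]) stream.toArray());
    }

    // Hypothetical variant: the caller supplies the array constructor,
    // so the array has the right runtime type and no cast is needed.
    static <T, R> R toTyped(Stream<T> stream, IntFunction<T[]> generator,
                            Function<T[], R> function) {
        return function.apply(stream.toArray(generator));
    }

    // A consumer with a fixed element type, standing in for something
    // like a collection constructor that accepts a String[].
    static String joinAll(String[] values) {
        return String.join(",", values);
    }

    public static void main(String[] args) {
        // Works: the array really is a String[].
        System.out.println(toTyped(Stream.of("a", "b"), String[]::new, ToSketch::joinAll));

        // Compiles, but fails when the Object[] reaches joinAll(String[]).
        try {
            to(Stream.of("a", "b"), ToSketch::joinAll);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException from the unchecked cast");
        }
    }
}
```

The generator variant costs an extra argument at every call site, which is the trade-off against the shorter cast-based signature.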
The transform idea you linked to would work as follows with Eclipse Collections:
MutableList list = stream.transform(Lists.mutable::fromStream)
MutableSet set = stream.transform(Sets.mutable::fromStream)
// compared to collect with Collectors2 today
MutableList list = stream.collect(Collectors2.toList())
MutableSet set = stream.collect(Collectors2.toSet())
ImmutableList list = stream.transform(Lists.immutable::fromStream)
ImmutableSet set = stream.transform(Sets.immutable::fromStream)
// compared to collect with Collectors2 today
ImmutableList list = stream.collect(Collectors2.toImmutableList())
ImmutableSet set = stream.collect(Collectors2.toImmutableSet())
In terms of character savings, this is less helpful than calling the method “to”, but this was never about just character savings. I think I could live with <U> U transform(Function<Stream<T>, U> function).
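For readers without the proposal at hand, the following sketch shows the shape of transform using only JDK types. Stream has no transform method today, so it is written as a static helper; TransformSketch and toArrayList are hypothetical names, and the factory stands in for what a collection library might supply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

class TransformSketch {
    // Hypothetical static stand-in for <U> U transform(Function<Stream<T>, U>).
    static <T, U> U transform(Stream<T> stream, Function<Stream<T>, U> function) {
        return function.apply(stream);
    }

    // Hypothetical library-side factory. The library decides how the stream
    // is consumed (array, forEach, ...) and can change that later.
    static <T> Function<Stream<T>, ArrayList<T>> toArrayList() {
        return s -> {
            ArrayList<T> list = new ArrayList<>();
            s.forEach(list::add);
            return list;
        };
    }

    public static void main(String[] args) {
        // Inline lambda, as in the example from the quoted message.
        List<Object> viaArray =
                transform(Stream.of("a", "b"), s -> Arrays.asList(s.toArray()));

        // Library-provided function; the caller never sees the intermediate step.
        Function<Stream<String>, ArrayList<String>> factory = toArrayList();
        ArrayList<String> viaFactory = transform(Stream.of("a", "b"), factory);

        System.out.println(viaArray);   // prints [a, b]
        System.out.println(viaFactory); // prints [a, b]
    }
}
```

Unlike the to proposal, nothing here needs an unchecked cast, because the function receives the Stream itself rather than an array.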
Having read your email a few times now and the link to the Bug in the Bug System you provided, I agree with the usefulness of the transform approach. Thank you for taking the time to describe this. I would upvote the bug if I could.
Thanks,
Don
> On Apr 30, 2021, at 9:25 PM, Tagir Valeev <amaembo at gmail.com> wrote:
>
> Hello!
>
> 1. toCollection looks too specific to be added to JDK. Essentially,
> it's a shortcut for toCollection constructor and unlike toList, it
> cannot add many optimizations there. So we basically save several
> characters and nothing more. And toCollection collector is orders of
> magnitude less used than toList. I think that it's even less used than
> joining or groupingBy, so why not providing these shortcuts first?
> It's always about where to draw a line.
>
> 2. to() sounds quite specific, as it expects that the target object is
> ready to accept an array. But why? Probably its API more suitable to
> accept a collection as an input? It's essentially 'toArrayAndThen'
> (similar to 'collectingAndThen' collector). There's a suggestion [1]
> to add transform() method to stream which is much more general, and
> can be used like `stream.transform(s -> Arrays.asList(s.toArray()))`.
> Note that collection libraries may provide ready methods that supply
> the lambdas specific to this library, and this allows to encapsulate
> the exact implementation detail, like whether we need an array or
> something else as an intermediate step, like:
>
> class MyCustomCollection<T> {
>   ...
>   static <T> Function<Stream<T>, MyCustomCollection<T>> toMyCustomCollection() {
>     return s -> createMyCustomCollectionFromStream((T[])s.toArray());
>     // in the next version we may change to for-each version if we
>     // find that it's more performant
>     // or s -> {var c = new MyCustomCollection<T>();
>     //          s.forEach(c::add); return c;}
>   }
> }
>
> And use `stream.transform(MyCustomCollection.toMyCustomCollection())`
> without caring whether it's array or something else in-between.
>
> Finally, adding a `(T[])` cast to the standard library sounds a bad
> idea. Imagine if the custom collection has a fixed element type:
>
> class MyStringCollection implements Collection<String> {
> public MyStringCollection(String[] array) {...}
> }
>
> It looks like stream.to(MyStringCollection::new) should work but in
> fact, it will throw a ClassCastException.
>
> 3. into() sounds more interesting as it's indeed useful to dump the
> stream into an existing collection. It's mostly useful if the
> collection is non-empty, as you can append into single collection from
> existing sources. Essentially, into(c) == forEach(c::add), but it's
> also possible to add optimizations for specific collections (like `if
> (isSizedStream() && c instanceof ArrayList<?> al) {
> al.ensureCapacity(...); }`), so it could be faster.
>
> With best regards,
> Tagir Valeev
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8140283
>
> On Thu, Apr 29, 2021 at 5:58 AM Donald Raab <donraab at gmail.com> wrote:
>>
>> I looked through a few libraries and found some methods where the option #2 proposal for Stream might be useful. If the JDK had constructors for ArrayList, HashSet and other collection types that take arrays, this method would work there as well.
>>
>>> default <R extends Iterable<T>> R to(Function<T[], R> function)
>>> {
>>> return function.apply((T[]) this.toArray());
>>> }
>>
>>
>> // JDK
>> Set<String> set = stream.to(Set::of);
>> List<String> list = stream.to(List::of);
>> List<String> arraysAsList = stream.to(Arrays::asList);
>>
>> // Guava
>> ArrayList<String> arrayList = stream.to(Lists::newArrayList);
>> HashSet<String> hashSet = stream.to(Sets::newHashSet);
>> Multiset<String> multiset = stream.to(ImmutableMultiset::copyOf);
>> List<String> guavaList = stream.to(ImmutableList::copyOf);
>> Set<String> guavaSet = stream.to(ImmutableSet::copyOf);
>>
>> // Apache Commons Collections
>> FluentIterable<String> fluentIterable = stream.to(FluentIterable::of);
>>
>> // Eclipse Collections
>> MutableList<String> adapter = stream.to(ArrayAdapter::adapt);
>>
>> MutableList<String> mutableList = stream.to(Lists.mutable::with);
>> MutableSet<String> mutableSet = stream.to(Sets.mutable::with);
>> MutableBag<String> mutableBag = stream.to(Bags.mutable::with);
>>
>> // Eclipse Collections - ListIterable, SetIterable and Bag all extend Iterable, not Collection
>> ListIterable<String> listIterable = stream.to(Lists.mutable::with);
>> SetIterable<String> setIterable = stream.to(Sets.mutable::with);
>> Bag<String> bag = stream.to(Bags.mutable::with);
>>
>> // Eclipse Collections - Immutable Collections do not extend Collection
>> ImmutableList<String> immutableList = stream.to(Lists.immutable::with);
>> ImmutableSet<String> immutableSet = stream.to(Sets.immutable::with);
>> ImmutableBag<String> immutableBag = stream.to(Bags.immutable::with);
>>
>> // Eclipse Collections - Stack does not extend Collection
>> StackIterable<String> stackIterable = stream.to(Stacks.mutable::with);
>> MutableStack<String> mutableStack = stream.to(Stacks.mutable::with);
>> ImmutableStack<String> immutableStack = stream.to(Stacks.immutable::with);
>>
>> // Eclipse Collections - Mutable Map and MutableBiMap are both Iterable<V> so they are valid returns
>> MutableMap<String, String> map =
>> stream.to(array -> ArrayAdapter.adapt(array)
>> .toMap(String::toLowerCase, String::toUpperCase));
>>
>> MutableBiMap<String, String> biMap =
>> stream.to(array -> ArrayAdapter.adapt(array)
>> .toBiMap(String::toLowerCase, String::toUpperCase));
>>
>> Thanks,
>> Don
>>
>>> On Apr 27, 2021, at 1:35 AM, Donald Raab <donraab at gmail.com> wrote:
>>>
>>> I realized after sending that option 2 can be made more abstract:
>>>
>>> default <R extends Iterable<T>> R to(Function<T[], R> function)
>>> {
>>> return function.apply((T[]) this.toArray());
>>> }
>>>
>>>>
>>>> 2. Pass the result of toArray directly into a function that can then return a Collection. This should work with Set.of, List.of and any 3rd party collections which take arrays.
>>>>
>>>> default <R extends Collection<T>> R to(Function<T[], R> function)
>>>> {
>>>> return function.apply((T[]) this.toArray());
>>>> }
>>>>
>>>> Usage Examples:
>>>>
>>>> Set<String> set = stream.to(Set::of);
>>>> List<String> list = stream.to(List::of);
>>>> List<String> arrayList = stream.to(Arrays::asList);
>>>>
>>>
>>