BiCollector
Peter Levart
peter.levart at gmail.com
Thu Jun 14 06:53:50 UTC 2018
Hi Tagir,
Great! I've been thinking about the same trick - just add another
function and you don't have to hard code the decision about the choice
of imperfect tuple type. But are we just talking of aestetics here? In
the absence of value types, every choice is imperfect. Ok, sometimes you
may have luck to be able to further reduce the result into a singular
type, but how often does that occur?
I'm not so concerned about the choice of tuple type but about another
aspect. The characteristics of a combined collector can only be an
intersection of characteristics of individual collectors (minus
IDENTITY_FINISH, when you provide the finisher/combiner of the results).
If one of the collectors is CONCURRENT, but the other is not, then both
will be used in non-CONCURRENT collection mode, which might be
sub-optimal. But such is the nature of "single pass". You can't travel
to the destination via two distinct paths at the same time (if you're
not doing quantum superposition :-). I was positively surprised that
such composition is even possible in such a simple way, which speaks of
the well thought out initial design of the API.
So I guess that although a very useful tool in practice,
pairing/bi-collection will remain in the domain of custom Stream
extensions. Or maybe not?
Regards, Peter
On 06/14/18 06:56, Tagir Valeev wrote:
> Hello!
>
> In my StreamEx library I created a "pairing" collector which does
> similar job, but allows user to decide how to combine the results of
> two collectors. This adds more flexibility. The signature is like this:
>
> public static <T, A1, A2, R1, R2, R> Collector<T, ?, R> pairing(
> Collector<? super T, A1, R1> c1,
> Collector<? super T, A2, R2> c2,
> BiFunction<? super R1, ? super R2, ? extends R> finisher)
>
> Having such collector, The proposed `toBoth(c1, c2)` can be
> implemented as simple as `pairing(c1, c2, Map::entry)`. OTOH if
> somebody wants to use their own Pair class, it would be `pairing(c1,
> c2, Pair::new)`.
> Sometimes you don't need a pair, but can create compound result object
> right here. E.g.:
>
> Collector<BigDecimal, ?, BigDecimal> summing =
> Collectors.reducing(BigDecimal.ZERO, BigDecimal::add);
> Collector<BigDecimal, ?, BigDecimal> averaging =
> pairing(summing, Collectors.counting(), (sum, count) ->
> sum.divide(BigDecimal.valueOf(count),
> RoundingMode.HALF_EVEN));
>
> So locking to Map.Entry is entirely unnecessary.
>
> With best regards,
> Tagir Valeev.
>
>
>
> On Thu, Jun 14, 2018 at 6:11 AM Brian Goetz <brian.goetz at oracle.com
> <mailto:brian.goetz at oracle.com>> wrote:
>
> I really like how BiCollector can be private, so all we'd have to
> expose
> is toBoth(), and the arguments of toBoth() are just ordinary
> collectors. So no new types or abstractions; just a multiplexing.
>
> The use of Map.Entry as a pair surrogate is unfortunate -- its
> definitely not an Entry -- though I understand why you did this.
> I'm not
> sure if this is fatal or not for a JDK method, but it's pretty bad*.
> (You could generalize to n-ary, returning a List or array (you can
> take
> a varargs of Collector), at the loss of sharp types for the results.)
>
>
> *Yes, I'm sure one can find precedent of this being done; this has no
> effect on whether it's bad.
>
> On 6/11/2018 8:39 AM, Peter Levart wrote:
> > Hi,
> >
> > Have you ever wanted to perform a collection of the same Stream
> into
> > two different targets using two Collectors? Say you wanted to
> collect
> > Map.Entry elements into two parallel lists, each of them containing
> > keys and values respectively. Or you wanted to collect elements
> into
> > groups by some key, but also count them at the same time? Currently
> > this is not possible to do with a single Stream. You have to create
> > two identical streams, so you end up passing Supplier<Stream> to
> other
> > methods instead of bare Stream.
> >
> > I created a little utility Collector implementation that serves the
> > purpose quite well:
> >
> > /**
> > * A {@link Collector} implementation taking two delegate
> Collector(s)
> > and producing result composed
> > * of two results produced by delegating collectors, wrapped in
> {@link
> > Map.Entry} object.
> > *
> > * @param <T> the type of elements collected
> > * @param <K> the type of 1st delegate collector collected result
> > * @param <V> tye type of 2nd delegate collector collected result
> > */
> > public class BiCollector<T, K, V> implements Collector<T,
> > Map.Entry<Object, Object>, Map.Entry<K, V>> {
> > private final Collector<T, Object, K> keyCollector;
> > private final Collector<T, Object, V> valCollector;
> >
> > @SuppressWarnings("unchecked")
> > public BiCollector(Collector<T, ?, K> keyCollector,
> Collector<T,
> > ?, V> valCollector) {
> > this.keyCollector = (Collector)
> > Objects.requireNonNull(keyCollector);
> > this.valCollector = (Collector)
> > Objects.requireNonNull(valCollector);
> > }
> >
> > @Override
> > public Supplier<Map.Entry<Object, Object>> supplier() {
> > Supplier<Object> keySupplier = keyCollector.supplier();
> > Supplier<Object> valSupplier = valCollector.supplier();
> > return () -> new
> > AbstractMap.SimpleImmutableEntry<>(keySupplier.get(),
> valSupplier.get());
> > }
> >
> > @Override
> > public BiConsumer<Map.Entry<Object, Object>, T> accumulator() {
> > BiConsumer<Object, T> keyAccumulator =
> > keyCollector.accumulator();
> > BiConsumer<Object, T> valAccumulator =
> > valCollector.accumulator();
> > return (accumulation, t) -> {
> > keyAccumulator.accept(accumulation.getKey(), t);
> > valAccumulator.accept(accumulation.getValue(), t);
> > };
> > }
> >
> > @Override
> > public BinaryOperator<Map.Entry<Object, Object>> combiner() {
> > BinaryOperator<Object> keyCombiner =
> keyCollector.combiner();
> > BinaryOperator<Object> valCombiner =
> valCollector.combiner();
> > return (accumulation1, accumulation2) -> new
> > AbstractMap.SimpleImmutableEntry<>(
> > keyCombiner.apply(accumulation1.getKey(),
> > accumulation2.getKey()),
> > valCombiner.apply(accumulation1.getValue(),
> > accumulation2.getValue())
> > );
> > }
> >
> > @Override
> > public Function<Map.Entry<Object, Object>, Map.Entry<K, V>>
> > finisher() {
> > Function<Object, K> keyFinisher = keyCollector.finisher();
> > Function<Object, V> valFinisher = valCollector.finisher();
> > return accumulation -> new
> AbstractMap.SimpleImmutableEntry<>(
> > keyFinisher.apply(accumulation.getKey()),
> > valFinisher.apply(accumulation.getValue())
> > );
> > }
> >
> > @Override
> > public Set<Characteristics> characteristics() {
> > EnumSet<Characteristics> intersection =
> > EnumSet.copyOf(keyCollector.characteristics());
> > intersection.retainAll(valCollector.characteristics());
> > return intersection;
> > }
> > }
> >
> >
> > Do you think this class is general enough to be part of standard
> > Collectors repertoire?
> >
> > For example, accessed via factory method
> Collectors.toBoth(Collector
> > coll1, Collector coll2), bi-collection could then be coded
> simply as:
> >
> > Map<String, Integer> map = ...
> >
> > Map.Entry<List<String>, List<Integer>> keys_values =
> > map.entrySet()
> > .stream()
> > .collect(
> > toBoth(
> > mapping(Map.Entry::getKey, toList()),
> > mapping(Map.Entry::getValue, toList())
> > )
> > );
> >
> >
> > Map.Entry<Map<Integer, Long>, Long> histogram_count =
> > ThreadLocalRandom
> > .current()
> > .ints(100, 0, 10)
> > .boxed()
> > .collect(
> > toBoth(
> > groupingBy(Function.identity(), counting()),
> > counting()
> > )
> > );
> >
> >
> > Regards, Peter
> >
>
More information about the core-libs-dev
mailing list