BiCollector

Remi Forax forax at univ-mlv.fr
Mon Jun 11 18:40:33 UTC 2018


Hi Peter,
You may find something on lambda-dev; I remember that we discussed BiStream there,
and as Paul said, we decided not to include it because the performance was not great and because it adds another axis to the API, making the API even harder to retrofit when we introduce specialization.

regards,
Rémi

----- Original Message -----
> From: "Peter Levart" <peter.levart at gmail.com>
> To: "Paul Sandoz" <paul.sandoz at oracle.com>
> Cc: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Sent: Monday, June 11, 2018 20:28:36
> Subject: Re: BiCollector

> Hi Paul,
> 
> Can you point me to some BiStream code (if it is available publicly)?
> 
> Thanks, Peter
> 
> On 06/11/18 19:10, Paul Sandoz wrote:
>> Hi Peter,
>>
>> I like it and can see it being useful, thanks for sharing.
>>
>> I am hesitating a little about it being in the JDK because there is the larger
>> abstraction of a BiStream, where a similar form of collection would naturally
>> fit (but perhaps without the intersection constraints for the
>> characteristics?). We experimented a few times with BiStream and got quite far
>> but decided to pull back due to the lack of value types and specialized generics.
>> So I don't know how this might turn out in the future and whether your BiCollector
>> fits nicely into such a future model.
>>
>> What are your thoughts on this?
>>
>> FWIW I would call it a “splitting” or “bisecting” collector, e.g.
>> “s.collect(bisecting(…))”
>>
>> Paul.
>>
>>
>>
>>
>>> On Jun 11, 2018, at 5:39 AM, Peter Levart <peter.levart at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Have you ever wanted to perform a collection of the same Stream into two
>>> different targets using two Collectors? Say you wanted to collect Map.Entry
>>> elements into two parallel lists, each of them containing keys and values
>>> respectively. Or you wanted to collect elements into groups by some key, but
>>> also count them at the same time? Currently this is not possible to do with a
>>> single Stream. You have to create two identical streams, so you end up passing
>>> Supplier<Stream> to other methods instead of bare Stream.
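
To make the two-pass workaround described above concrete, here is a minimal sketch. The class name TwoPassExample and the helper keysAndValues are made up for illustration; the point is only that each pass needs its own fresh stream, hence the Supplier parameter.

```java
import java.util.AbstractMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class TwoPassExample {
    // Hypothetical helper: collects keys and values in two separate passes.
    // A Stream can be consumed only once, so the caller has to pass a Supplier
    // that produces a fresh, identical stream for each pass.
    static Map.Entry<List<String>, List<Integer>> keysAndValues(
            Supplier<Stream<Map.Entry<String, Integer>>> streams) {
        List<String> keys = streams.get()
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        List<Integer> values = streams.get()
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
        return new AbstractMap.SimpleImmutableEntry<>(keys, values);
    }
}
```

A caller would write something like keysAndValues(() -> map.entrySet().stream()), which traverses the source twice.
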
>>>
>>> I created a little utility Collector implementation that serves the purpose
>>> quite well:
>>>
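>>> import java.util.AbstractMap;
>>> import java.util.EnumSet;
>>> import java.util.Map;
>>> import java.util.Objects;
>>> import java.util.Set;
>>> import java.util.function.BiConsumer;
>>> import java.util.function.BinaryOperator;
>>> import java.util.function.Function;
>>> import java.util.function.Supplier;
>>> import java.util.stream.Collector;
>>>
>>> /**
>>>   * A {@link Collector} implementation that takes two delegate collectors and
>>>   * produces a result composed of the two delegates' results, wrapped in a
>>>   * {@link Map.Entry} object.
>>>   *
>>>   * @param <T> the type of elements collected
>>>   * @param <K> the result type of the 1st delegate collector
>>>   * @param <V> the result type of the 2nd delegate collector
>>>   */
>>> public class BiCollector<T, K, V>
>>>         implements Collector<T, Map.Entry<Object, Object>, Map.Entry<K, V>> {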
>>>      private final Collector<T, Object, K> keyCollector;
>>>      private final Collector<T, Object, V> valCollector;
>>>
>>>      @SuppressWarnings({"unchecked", "rawtypes"})
>>>      public BiCollector(Collector<T, ?, K> keyCollector,
>>>                         Collector<T, ?, V> valCollector) {
>>>          this.keyCollector = (Collector) Objects.requireNonNull(keyCollector);
>>>          this.valCollector = (Collector) Objects.requireNonNull(valCollector);
>>>      }
>>>
>>>      @Override
>>>      public Supplier<Map.Entry<Object, Object>> supplier() {
>>>          Supplier<Object> keySupplier = keyCollector.supplier();
>>>          Supplier<Object> valSupplier = valCollector.supplier();
>>>          return () -> new AbstractMap.SimpleImmutableEntry<>(
>>>              keySupplier.get(), valSupplier.get());
>>>      }
>>>
>>>      @Override
>>>      public BiConsumer<Map.Entry<Object, Object>, T> accumulator() {
>>>          BiConsumer<Object, T> keyAccumulator = keyCollector.accumulator();
>>>          BiConsumer<Object, T> valAccumulator = valCollector.accumulator();
>>>          return (accumulation, t) -> {
>>>              keyAccumulator.accept(accumulation.getKey(), t);
>>>              valAccumulator.accept(accumulation.getValue(), t);
>>>          };
>>>      }
>>>
>>>      @Override
>>>      public BinaryOperator<Map.Entry<Object, Object>> combiner() {
>>>          BinaryOperator<Object> keyCombiner = keyCollector.combiner();
>>>          BinaryOperator<Object> valCombiner = valCollector.combiner();
>>>          return (accumulation1, accumulation2) -> new AbstractMap.SimpleImmutableEntry<>(
>>>              keyCombiner.apply(accumulation1.getKey(), accumulation2.getKey()),
>>>              valCombiner.apply(accumulation1.getValue(), accumulation2.getValue())
>>>          );
>>>      }
>>>
>>>      @Override
>>>      public Function<Map.Entry<Object, Object>, Map.Entry<K, V>> finisher() {
>>>          Function<Object, K> keyFinisher = keyCollector.finisher();
>>>          Function<Object, V> valFinisher = valCollector.finisher();
>>>          return accumulation -> new AbstractMap.SimpleImmutableEntry<>(
>>>              keyFinisher.apply(accumulation.getKey()),
>>>              valFinisher.apply(accumulation.getValue())
>>>          );
>>>      }
>>>
>>>      @Override
>>>      public Set<Characteristics> characteristics() {
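>>>          // Start from an empty EnumSet: EnumSet.copyOf(Collection) throws
>>>          // IllegalArgumentException for an empty set that is not an EnumSet.
>>>          EnumSet<Characteristics> intersection = EnumSet.noneOf(Characteristics.class);
>>>          intersection.addAll(keyCollector.characteristics());
>>>          intersection.retainAll(valCollector.characteristics());
>>>          return intersection;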
>>>      }
>>> }
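
One detail about intersecting the characteristics is worth spelling out: a delegate may report an empty characteristics set, and EnumSet.copyOf(Collection) rejects an empty collection that is not itself an EnumSet. The sketch below shows one way to compute the intersection that tolerates this case. The class name CharacteristicsIntersection and the helper intersect are hypothetical, and it assumes nothing about the delegates beyond the Collector interface.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;

final class CharacteristicsIntersection {
    // Hypothetical helper: returns the characteristics that both delegates share.
    // Building the set with EnumSet.noneOf(...) and addAll(...) also works when
    // the first delegate reports an empty, non-EnumSet characteristics set.
    static Set<Characteristics> intersect(Collector<?, ?, ?> a, Collector<?, ?, ?> b) {
        EnumSet<Characteristics> result = EnumSet.noneOf(Characteristics.class);
        result.addAll(a.characteristics());
        result.retainAll(b.characteristics());
        return Collections.unmodifiableSet(result);
    }
}
```

Only characteristics reported by both delegates survive, so the combined collector never claims a property (for example CONCURRENT or IDENTITY_FINISH) that one of the delegates does not have.
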
>>>
>>>
>>> Do you think this class is general enough to be part of the standard Collectors
>>> repertoire?
>>>
>>> For example, accessed via factory method Collectors.toBoth(Collector coll1,
>>> Collector coll2), bi-collection could then be coded simply as:
>>>
>>>          Map<String, Integer> map = ...
>>>
>>>          Map.Entry<List<String>, List<Integer>> keys_values =
>>>              map.entrySet()
>>>                 .stream()
>>>                 .collect(
>>>                     toBoth(
>>>                         mapping(Map.Entry::getKey, toList()),
>>>                         mapping(Map.Entry::getValue, toList())
>>>                     )
>>>                 );
>>>
>>>
>>>          Map.Entry<Map<Integer, Long>, Long> histogram_count =
>>>              ThreadLocalRandom
>>>                  .current()
>>>                  .ints(100, 0, 10)
>>>                  .boxed()
>>>                  .collect(
>>>                      toBoth(
>>>                          groupingBy(Function.identity(), counting()),
>>>                          counting()
>>>                      )
>>>                  );
>>>
>>>
>>> Regards, Peter
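
For anyone who wants to try the idea without a dedicated class, here is a rough sketch of the proposed toBoth factory built on Collector.of. The class name ToBothSketch and the helper listAndCount are made up for illustration, and the sketch reports no characteristics at all, which is the conservative choice rather than the intersection used in the class above.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ToBothSketch {
    // Hypothetical factory: feeds every element to both delegates and pairs
    // their results in a Map.Entry. A1 and A2 capture the delegates'
    // accumulation types, so no unchecked casts are needed.
    static <T, A1, A2, K, V> Collector<T, ?, Map.Entry<K, V>> toBoth(
            Collector<T, A1, K> first, Collector<T, A2, V> second) {
        Supplier<A1> s1 = first.supplier();
        Supplier<A2> s2 = second.supplier();
        BiConsumer<A1, T> a1 = first.accumulator();
        BiConsumer<A2, T> a2 = second.accumulator();
        BinaryOperator<A1> c1 = first.combiner();
        BinaryOperator<A2> c2 = second.combiner();
        Function<A1, K> f1 = first.finisher();
        Function<A2, V> f2 = second.finisher();

        Supplier<Map.Entry<A1, A2>> supplier =
            () -> new SimpleImmutableEntry<>(s1.get(), s2.get());
        BiConsumer<Map.Entry<A1, A2>, T> accumulator = (acc, t) -> {
            a1.accept(acc.getKey(), t);
            a2.accept(acc.getValue(), t);
        };
        BinaryOperator<Map.Entry<A1, A2>> combiner = (x, y) -> new SimpleImmutableEntry<>(
            c1.apply(x.getKey(), y.getKey()),
            c2.apply(x.getValue(), y.getValue()));
        Function<Map.Entry<A1, A2>, Map.Entry<K, V>> finisher =
            acc -> new SimpleImmutableEntry<>(
                f1.apply(acc.getKey()), f2.apply(acc.getValue()));

        // No characteristics are declared, so the finisher always runs.
        return Collector.of(supplier, accumulator, combiner, finisher);
    }

    // Usage: collect a list and a count of the same stream in a single pass.
    static Map.Entry<List<String>, Long> listAndCount(Stream<String> words) {
        return words.collect(toBoth(Collectors.toList(), Collectors.counting()));
    }
}
```

Declaring no characteristics gives up the IDENTITY_FINISH and CONCURRENT optimizations, but it keeps the sketch correct for any pair of delegates.
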

