BiCollector

Mon Jun 18 22:43:06 UTC 2018

Hi Peter,

Existing composing collectors tend to unpack all the functions of the input collector ahead of time, i don’t recall being too concerned about this at the time. It does allow for more robust null checking, something we were less diligent about doing.

Paul.

> On Jun 17, 2018, at 3:04 AM, Peter Levart <peter.levart at gmail.com> wrote:
> 
> Hi Tagir,
> 
> I don't know if this is important, but in your approach the particular functions of the sub-collectors are retrieved eagerly even if they are later not used. This should typically not present a problem as Collector(s) should be prepared to be used in any scenario (parallel or serial). But anyway in my approach, the sub-functions of the given collectors are retrived lazily, each time the Stream implementation needs them - the dynamics of invoking sub-collector methods to retrieve the functions is therefore unchanged when a collector is used directly to collect a Stream or as a sub-collector in the bi-collection scenario.
> 
> What do you think of that particular detail? Paul?
> 
> Regards, Peter
> 
> On 06/15/18 13:26, Tagir Valeev wrote:
>> Hello!
>> 
>> I created a preliminary webrev of my own implementation (no testcases yet):
>> http://cr.openjdk.java.net/~tvaleev/patches/pairing/webrev/ <http://cr.openjdk.java.net/%7Etvaleev/patches/pairing/webrev/>
>> If anybody wants to sponsor my implementation, I will happily log an issue and write tests.
>> 
>> The name "pairing" was invented by me, but as I'm not a native English speaker I cannot judge whether it's optimal, so better ideas are welcome.
>> Also I decided to remove accumulator types from public type variables. They do not add anything to type signature, only clutter it 
>> increasing the number of type parameters from 4 to 6. I think it was a mistake to expose the accumulator type parameter in other cascading collectors 
>> like filtering(), collectingAndThen(), groupingBy(), etc. I'm not insisting though, if you feel that conformance to existing collectors is 
>> more important than simplicity.
>> 
>> With best regards,
>> Tagir Valeev.
>> 
>> On Fri, Jun 15, 2018 at 5:05 AM Brian Goetz <brian.goetz at oracle.com <mailto:brian.goetz at oracle.com>> wrote:
>> 
>> > Well, I don't see the need to pack the two results into a Map.Entry 
>> > (or any similar) container as a drawback. 
>> 
>>  From an "integrity of the JDK APIs" perspective, it is unquestionably a 
>> drawback.  These items are not a Key and an associated Value in a Map; 
>> it's merely pretending that Map.Entry really means "Pair".  There's a 
>> reason we don't have a Pair class in the JDK (and no, let's not reopen 
>> that now); using something else as a Pair proxy that is supposed to have 
>> specific semantics is worse. (It's fine to do this in your own code, but 
>> not in the JDK. Different standards for code that has different audiences.)
>> 
>> Tagir's proposed sidestepping is nice, and it will also play nicely with 
>> records, because then you can say:
>> 
>>       record NameAndCount(String name, int count);
>> 
>>       stream.collect(pairing(collectName, collectCount, NameAndCount::new));
>> 
>> and get a more properly abstract result out.  And its more in the spirit 
>> of existing Collectors.  If you want to use Map.Entry as an 
>> _implementation detail_, that's fine.
>> 
>> I can support this form.
>> 
>> > I also don't see a larger abstraction like BiStream as a natural fit 
>> > for a similar thing. 
>> 
>> I think the BiStream connection is mostly tangential.  We tried hard to 
>> support streams of (K,V) pairs when we did streams, as Paul can attest, 
>> but it was a huge complexity-inflater and dropping this out paid an 
>> enormous simplification payoff.
>> 
>> With records, having streams of tuples will be simpler to represent, but 
>> no more performant; it will take until we get to value types and 
>> specialized generics to get the performance we want out of this.
>> 
>> 
>