RFR: JDK-8205461 Create Collector which merges results of two other collectors

Roger Riggs roger.riggs at oracle.com
Tue Aug 7 13:57:42 UTC 2018


Hi,

As an API addition, it will need a CSR as well and it should have a 
couple of
reviewers that are fully aware of Streams design.

Regards, Roger


On 8/7/18 4:08 AM, Peter Levart wrote:
> Hi Tagir,
>
> Unless you have already got a sponsor (can't remember if somebody 
> already offered you a sponsorship), I can volunteer.
>
> Regarding the code, I only have one comment about the naming of the 
> last parameter: "finisher". To avoid confusion, it would be good to 
> name it differently from the Collector's functions. "finisher" and 
> "combiner" are already taken by Collector API. You describe the 
> parameter in javadoc as "the function which merges two results into 
> the single one". So what about "merger" or "result[s]Merger" ?
>
> ...and one comment about handling of IDENTITY_FINISH: You correctly 
> remove IDENTITY_FINISH from the resulting collector's characteristics, 
> but you don't honor the IDENTITY_FINISH characteristics of the two 
> parameter collectors. It should work nevertheless, but suppose some 
> "sloppy programmed" collector doesn't bother to return the identity 
> function when it announces that it has IDENTITY_FINISH characteristic...
>
> Regards, Peter
>
>
> On 08/07/2018 07:52 AM, Tagir Valeev wrote:
>> Ping! Could you please review and sponsor this changeset?
>> I updated version tag from since 11 to since 12:
>> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r2/
>>
>> Thanks in advance!
>>
>> With best regards,
>> Tagir Valeev.
>>
>> On Thu, Jun 21, 2018 at 11:33 AM Tagir Valeev <amaembo at gmail.com> wrote:
>>
>>> Please review and sponsor:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8205461
>>> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r1/
>>>
>>> See also previous discussion thread at [1]. It seems that we did not 
>>> reach
>>> the final conclusion about the collector name, so in this review 
>>> it's still
>>> named as "pairing" (proposed by me). Other name proposals:
>>>
>>> bisecting - by Paul Sandoz (sounds like input is split by two parts 
>>> like
>>> in partitioningBy, which is not the case)
>>> tee or teeing - by Brian Goetz (IIUC from the T shape, it's 
>>> assymmetrical
>>> operation, while our collector is symmetrical)
>>> duplex(ing) - by Jacob Glickman (well, probably ok?)
>>> bifurcate - by James Laskey (or bifurcating?)
>>> replicator - by James Laskey (as in DNA)
>>> replicating - by Peter Levart
>>> fanout or fanningOut - Chris Hegarty (sounds cryptic to me, checked
>>> Wikipedia [2], does not look like suitable)
>>> distributing - by Brian Goetz (like in distributive law a(b+c) = ab+ac;
>>> but common meaning of "distributing" is different)
>>> tapping - by Kirk Pepperdine (I have no associations; Google images 
>>> shows
>>> some acupuncture procedures)
>>> split - by Kirk Pepperdine (may be confused with Spliterator)
>>> unzipping - by John Rose
>>> biMapping - by Zheka Kozlov (but he also suggests to change signature
>>> which is unnecessary)
>>> toBoth - by Peter Levart (but usually toXyz shows target container like
>>> toArray/toList/toSet, but there's not "Both" container)
>>> collectionToBothAndThen - by Zheka Kozlov (but too long)
>>> collectingToBoth - by Zheka Kozlov
>>> collectingTo - by Brian Goetz (isn't collect(collectingTo(...)) a 
>>> little
>>> bit repititive?)
>>> biCollecting - by Zheka Kozlov
>>> expanding - by Paul Sandoz (doesn't sound very descriptive to me)
>>> forking - by Victor Williams Stafusa da Silva (could be thought that
>>> something is parallelized as forking is usually something connected to
>>> parallel computations)
>>>
>>> I think that it's better to concentrate not on the "splitting" part 
>>> (each
>>> element is passed to two collectors), but on the "joining" part 
>>> (results of
>>> two collectors are combined together). So my preference now is 
>>> "merging" or
>>> "combining". If limiting the selection to the suggested options I'd 
>>> prefer
>>> "duplexing" or "bifurcating" or to stay with my original version 
>>> "pairing".
>>> Of course original Stream API author voices have more weight here, 
>>> so if
>>> you insist on particular version, I will follow.
>>>
>>> By the way looking into CollectorsTest.java I found some minor 
>>> things to
>>> cleanup:
>>> 1. `.map(mapper::apply)` and `.flatMap(mapper::apply)` can be replaced
>>> with simple `.map(mapper)` and `.flatMap(mapper)` respectively
>>> 2. In many methods redundant `throws ReflectiveOperationException` is
>>> declared while exception is never thrown
>>> 3. The `if (!Map.class.isAssignableFrom(map.getClass()))` looks 
>>> useless as
>>> `map` is already of `Map<Boolean, D>` type, so this check is always 
>>> false
>>> (we would never enter this method if it's true). Similarly `if
>>> (!List.class.isAssignableFrom(value.getClass()))`
>>> 4. Deprecated `clazz.newInstance()` is used
>>> 5. Test named `testGroupubgByWithReducing` should be renamed to
>>> `testGroupingByWithReducing`
>>>
>>> Should I fix some or all of these issues? As separate changeset or 
>>> within
>>> this one?
>>>
>>> With best regards,
>>> Tagir Valeev.
>>>
>>> [1]
>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-June/053718.html 
>>>
>>> [2] https://en.wikipedia.org/wiki/Fan-out
>>>
>



More information about the core-libs-dev mailing list