RFR: JDK-8205461 Create Collector which merges results of two other collectors

Tagir Valeev amaembo at gmail.com
Sun Aug 19 09:17:33 UTC 2018


Hello, Peter!

Yes, I would be happy, though see also Roger note.

> Regarding the code, I only have one comment about the naming of the last
parameter: "finisher"

Sounds reasonable, renamed to "merger".

> ...and one comment about handling of IDENTITY_FINISH

I think this would complicate the code unnecessarily adding unpleasant
unchecked casts and more branches. I think that returning non-identity
finisher while reporting IDENTITY_FINISH characteristic is a contract
violation. See, for example, that the existing collectingAndThen collector
relies on the finisher implementation just like I do. It simply uses
downstream.finisher().andThen(finisher)
even if finisher is identity.

With best regards,
Tagir Valeev.

On Tue, Aug 7, 2018 at 3:08 PM Peter Levart <peter.levart at gmail.com> wrote:

> Hi Tagir,
>
> Unless you have already got a sponsor (can't remember if somebody already
> offered you a sponsorship), I can volunteer.
>
> Regarding the code, I only have one comment about the naming of the last
> parameter: "finisher". To avoid confusion, it would be good to name it
> differently from the Collector's functions. "finisher" and "combiner" are
> already taken by Collector API. You describe the parameter in javadoc as
> "the function which merges two results into the single one". So what about
> "merger" or "result[s]Merger" ?
>
> ...and one comment about handling of IDENTITY_FINISH: You correctly remove
> IDENTITY_FINISH from the resulting collector's characteristics, but you
> don't honor the IDENTITY_FINISH characteristics of the two parameter
> collectors. It should work nevertheless, but suppose some "sloppy
> programmed" collector doesn't bother to return the identity function when
> it announces that it has IDENTITY_FINISH characteristic...
>
> Regards, Peter
>
>
> On 08/07/2018 07:52 AM, Tagir Valeev wrote:
>
> Ping! Could you please review and sponsor this changeset?
> I updated version tag from since 11 to since 12:http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r2/
>
> Thanks in advance!
>
> With best regards,
> Tagir Valeev.
>
> On Thu, Jun 21, 2018 at 11:33 AM Tagir Valeev <amaembo at gmail.com> <amaembo at gmail.com> wrote:
>
>
> Please review and sponsor:
> https://bugs.openjdk.java.net/browse/JDK-8205461http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r1/
>
> See also previous discussion thread at [1]. It seems that we did not reach
> the final conclusion about the collector name, so in this review it's still
> named as "pairing" (proposed by me). Other name proposals:
>
> bisecting - by Paul Sandoz (sounds like input is split by two parts like
> in partitioningBy, which is not the case)
> tee or teeing - by Brian Goetz (IIUC from the T shape, it's assymmetrical
> operation, while our collector is symmetrical)
> duplex(ing) - by Jacob Glickman (well, probably ok?)
> bifurcate - by James Laskey (or bifurcating?)
> replicator - by James Laskey (as in DNA)
> replicating - by Peter Levart
> fanout or fanningOut - Chris Hegarty (sounds cryptic to me, checked
> Wikipedia [2], does not look like suitable)
> distributing - by Brian Goetz (like in distributive law a(b+c) = ab+ac;
> but common meaning of "distributing" is different)
> tapping - by Kirk Pepperdine (I have no associations; Google images shows
> some acupuncture procedures)
> split - by Kirk Pepperdine (may be confused with Spliterator)
> unzipping - by John Rose
> biMapping - by Zheka Kozlov (but he also suggests to change signature
> which is unnecessary)
> toBoth - by Peter Levart (but usually toXyz shows target container like
> toArray/toList/toSet, but there's not "Both" container)
> collectionToBothAndThen - by Zheka Kozlov (but too long)
> collectingToBoth - by Zheka Kozlov
> collectingTo - by Brian Goetz (isn't collect(collectingTo(...)) a little
> bit repititive?)
> biCollecting - by Zheka Kozlov
> expanding - by Paul Sandoz (doesn't sound very descriptive to me)
> forking - by Victor Williams Stafusa da Silva (could be thought that
> something is parallelized as forking is usually something connected to
> parallel computations)
>
> I think that it's better to concentrate not on the "splitting" part (each
> element is passed to two collectors), but on the "joining" part (results of
> two collectors are combined together). So my preference now is "merging" or
> "combining". If limiting the selection to the suggested options I'd prefer
> "duplexing" or "bifurcating" or to stay with my original version "pairing".
> Of course original Stream API author voices have more weight here, so if
> you insist on particular version, I will follow.
>
> By the way looking into CollectorsTest.java I found some minor things to
> cleanup:
> 1. `.map(mapper::apply)` and `.flatMap(mapper::apply)` can be replaced
> with simple `.map(mapper)` and `.flatMap(mapper)` respectively
> 2. In many methods redundant `throws ReflectiveOperationException` is
> declared while exception is never thrown
> 3. The `if (!Map.class.isAssignableFrom(map.getClass()))` looks useless as
> `map` is already of `Map<Boolean, D>` type, so this check is always false
> (we would never enter this method if it's true). Similarly `if
> (!List.class.isAssignableFrom(value.getClass()))`
> 4. Deprecated `clazz.newInstance()` is used
> 5. Test named `testGroupubgByWithReducing` should be renamed to
> `testGroupingByWithReducing`
>
> Should I fix some or all of these issues? As separate changeset or within
> this one?
>
> With best regards,
> Tagir Valeev.
>
> [1]http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-June/053718.html
> [2] https://en.wikipedia.org/wiki/Fan-out
>
>
>


More information about the core-libs-dev mailing list