RFR: JDK-8205461 Create Collector which merges results of two other collectors

Tomasz Linkowski t.linkowski at gmail.com
Fri Sep 14 19:52:11 UTC 2018


Hi all,


Stuart, thanks for your response and explanations, I really appreciate it!
Good point about function composition - didn't cross my mind, but it could
be confusing indeed.

As for "merger" - good point about "biFinisher", but I'm still unconvinced
about "merger", because usually, you won't *merge* the two results (like
summing ints or concatenating strings) - instead, you'll usually *compose* them
into another object (yeah, and a moment before I said "composition" could
be confusing). But how about the name "duplexer" in the meaning of "to make
into a duplex"?

As for "duplex" not being a verb - according to some dictionaries, "duplex"
IS a verb:
- https://en.wiktionary.org/wiki/duplex#Verb
- https://www.dictionary.com/browse/duplex
- https://www.merriam-webster.com/dictionary/duplex

That said, I certainly wouldn't object against the word "composite" finding
its way into the JavaDoc.


Federico, since I proposed "collectingBothAndThen", I obviously agree with
you. To be honest, though, the major drawback of this name would be its
length (both wordwise and letterwise). On the other hand, it seems that
"collectingBothAndThen" could be much easier to understand than
"duplexing", especially for non-natives.


Regards,
Tomasz Linkowski


On Fri, Sep 14, 2018 at 9:11 PM, Federico Peralta Schaffner <
federico.peralta at gmail.com> wrote:

> Hi,
>
> I've already sent an email to this list regarding this issue, but I'm not
> sure if it finally made its way. So here I'm sending something again w.r.t
> the name...
>
> duplexing might be OK, but
>
> collectingBothAndThen
>
> is hard to beat.
>
> Arguments:
>
> - Existing collectingAndThen method is the simpler version of this guy, why
> introducing other verbs?
>
> - While being short (for all the things this operation does), it is yet an
> extremely descriptive name; this method does exactly what its name says,
> i.e. it really collects the stream using two collectors and then applies a
> function to transform both collected results into the final result
>
> - It reads naturally when used along with other existing methods of the
> Collectors API, especially when used as a downstream collector, i.e.
>
> filtering(
>     n -> n > 0,
>     collectingBothAndThen(
>         summingDouble(e -> e.getDouble()),
>         counting(),
>         (sum, count) -> sum / count));
>
>
> Just my 2 cents.
>
> Thanks,
> fps.-
>
> El 14 set. 2018 2:44 p.m., "Tomasz Linkowski" <t.linkowski at gmail.com>
> escribió:
>
>  Hi,
>
> I'd like to remind three little things to which there was no response
> (AFAIK):
>
> 1) Brian Goetz' suggestion of changing "? extends R" into "R":
>     -
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-
> August/054947.html
>
> 2) Stuart Marks' suggestion about renaming "c1" and "c2" to "downstream1"
> and "downstream2":
>     -
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-
> August/054949.html
>
> 3) my suggestion about renaming "merger" to "biFinisher" because "merger"
> means BinaryOperator everywhere else:
>     -
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-
> September/055235.html
>
>
> As to the name, I still think a name related to a well-understood concept
> (like the composite pattern [1]) would be better. Note that "composite" is
> also a verb [2], but "Collectors.compositing" looks a bit strange.
> "Collectors.composing" seems much better to me, but - as far as I
> understand - there was some concern that the users could misunderstand it
> as element-wise composition, is that right?
>
> Regards,
> Tomasz Linkowski
>
> [1] https://en.wikipedia.org/wiki/Composite_pattern
> [2] https://en.wiktionary.org/wiki/composite#Verb
>
>
>
> On Fri, Sep 14, 2018 at 11:23 AM, Peter Levart <peter.levart at gmail.com>
>
> wrote:
>
> > Hi Tagir,
> >
> > I like duplexing more than teeingAndThen. If consensus can be established
> > about the name, I think you will then want to update the CSR draft to
> > reflect new name. Then we'll kindly ask Stuart if he has any more advice
> > before submitting the CSR...
> >
> > Regards, Peter
> >
> >
> > On 09/14/2018 10:41 AM, Tagir Valeev wrote:
> >
> >> Hello, Stuart and Peter!
> >>
> >> Thank you for valuable comments. I updated the webrev:
> >> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r4/
> >>
> >> 1. I renamed "teeingAndThen" to "duplexing". Brian insisted that
> >> "-ing" suffix shall present and I agree. Hopefully it's final name.
> >> 2. I updated the spec as Stuart suggested.
> >>
> >> No changes in implementation since r3 revision. Please check.
> >>
> >> With best regards,
> >> Tagir Valeev.
> >>
> >>
> >> On Tue, Aug 21, 2018 at 12:43 PM Stuart Marks <stuart.marks at oracle.com>
> >> wrote:
> >>
> >>> Hi Tagir,
> >>>
> >>> Thanks for working on this. This looks really cool! And thanks Peter
> for
> >>> agreeing to sponsor it.
> >>>
> >>> I can help out with the CSR. My first bit of advice about the CSR
> >>> process is to
> >>> hold off until the specification is complete. :-)
> >>>
> >>> I think the intent of the API is fine, but I think some details of the
> >>> returned
> >>> collector need to be ironed out first.
> >>>
> >>> 1. The spec doesn't say what the returned collector's supplier,
> >>> accumulator,
> >>> combiner, and finisher do. On the one hand, we don't necessarily want
> to
> >>> describe the actual implementation. On the other hand, we need to
> >>> specify how
> >>> the thing actually behaves. One can certainly deduce the intended
> >>> behavior from
> >>> the description, but this really needs to be specified, and it mustn't
> >>> rely on
> >>> the reader having to derive the required behaviors. Since the actual
> >>> implementation is fairly simple, the spec might end up being rather
> >>> close to the
> >>> implementation, but that might be unavoidable.
> >>>
> >>> I'm envisioning something like this:
> >>>
> >>>    - supplier: creates a result container that contains result
> containers
> >>> obtained by calling each collector's supplier
> >>>
> >>>    - accumulator: calls each collector's accumulator with its result
> >>> container
> >>> and the input element
> >>>
> >>> ... and similar for the combiner and finisher functions.
> >>>
> >>> 2. Characteristics
> >>>
> >>>    - UNORDERED: should the returned collector be UNORDERED if *either*
> >>> of the
> >>> provided collectors is UNORDERED? (Current draft says *both*.)
> >>>
> >>>    - CONCURRENT: current draft seems correct
> >>>
> >>>    - IDENTITY_FINISH: clearly not present; perhaps this should be
> >>> specified
> >>>
> >>> 3. Parameter naming
> >>>
> >>> The collector parameters are referred to as "specified collectors" or
> >>> "supplied
> >>> collectors". Other "higher-order" collectors refer to their underlying
> >>> collectors as "downstream" collectors. I think it would be useful to
> >>> work the
> >>> "downstream" concept into this collector. This would enable the opening
> >>> summary
> >>> sentence of the doc to be something like, "Returns a collector that is
> a
> >>> composite of two downstream collectors" or some such. (But see naming
> >>> below.)
> >>>
> >>> 4. Naming
> >>>
> >>> Sigh, naming is hard, and I know there was a fair amount of discussion
> >>> in the
> >>> previous thread and earlier in this one, but it seems like there's
> still
> >>> some
> >>> dissatisfaction with the name. (And I'm not particularly thrilled with
> >>> teeingAndThen myself.) In a few minutes I've managed to come up with a
> >>> few more
> >>> names that (mostly) don't seem to have been proposed before, and so
> here
> >>> they
> >>> are for your consideration:
> >>>
> >>>    - compound
> >>>    - composite
> >>>    - conjoined
> >>>    - bonded
> >>>    - fused
> >>>    - duplex
> >>>
> >>> Discussion:
> >>>
> >>> A "composite" evokes function composition; this might be good, though
> it
> >>> might
> >>> be odd in that collectors can't be composed in the same way that
> >>> functions are.
> >>>
> >>> "Compound" might be a useful alternative. In chemistry, two substances
> >>> are
> >>> combined (or bonded, or fused) to form a single substance, which is a
> >>> compound.
> >>>
> >>> "Conjoin" seems to adequately describe the structure of the two
> >>> collectors, but
> >>> it lacks somewhat the connotation of unifying them.
> >>>
> >>> In an earlier discussion, Brian had pushed back on names related to
> >>> split/fork/merge/join since those are currently in use in streams
> >>> regarding
> >>> splitting of input elements and merging of results. In describing how
> the
> >>> current proposal differs, he said that elements are "multiplexed" to
> the
> >>> different collectors. Since we're doing this with two collectors, how
> >>> about
> >>> "duplex"? (I note that Jacob Glickman also had suggested "duplex".)
> >>>
> >>> s'marks
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 8/20/18 1:48 AM, Tagir Valeev wrote:
> >>>
> >>>> Hello!
> >>>>
> >>>> A CSR is created:
> >>>> https://bugs.openjdk.java.net/browse/JDK-8209685
> >>>> (this is my first CSR, hopefully I did it correctly)
> >>>>
> >>>> With best regards,
> >>>> Tagir Valeev.
> >>>> On Mon, Aug 20, 2018 at 2:06 PM Peter Levart <peter.levart at gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Tagir,
> >>>>>
> >>>>> I think this looks very good. It just needs a CSR. Will you file it?
> >>>>>
> >>>>> Regards, Peter
> >>>>>
> >>>>> On 08/19/2018 11:24 AM, Tagir Valeev wrote:
> >>>>>
> >>>>> Hello, Brian!
> >>>>>
> >>>>> Of the three phases, teeing is the most important and least obvious,
> so
> >>>>> I think something that includes that in the name is going to be
> >>>>> helpful.  Perhaps "teeingAndThen" is more evocative and not totally
> >>>>> unwieldy.
> >>>>>
> >>>>> Ok, sounds acceptable to me. Renamed pairing to teeingAndThen.
> >>>>>
> >>>>> By the way looking into CollectorsTest.java I found some minor things
> >>>>> to
> >>>>> cleanup:
> >>>>> 1. `.map(mapper::apply)` and `.flatMap(mapper::apply)` can be
> replaced
> >>>>> with
> >>>>> simple `.map(mapper)` and `.flatMap(mapper)` respectively
> >>>>>
> >>>>> Does IntelliJ have an inspection for eliminating such locutions?
> >>>>>
> >>>>> Sure, that's how I found them. Well, I took the liberty to fix these
> >>>>> two things.
> >>>>>
> >>>>> 2. In many methods redundant `throws ReflectiveOperationException` is
> >>>>> declared while exception is never thrown
> >>>>>
> >>>>> For test code where a significant fraction of test cases are going to
> >>>>> throw something, we often do this, since its easier to just uniformly
> >>>>> tag such methods rather than thinking about which test methods
> actually
> >>>>> throw the exception and which don't.  So I think this is harmless
> >>>>> (though cleaning it up is harmless too.)
> >>>>>
> >>>>> I'm not thinking about this, because my IDE thinks for me :-) Ok,
> I'll
> >>>>> leave them as is for now.
> >>>>>
> >>>>> You may want to optimize the EnumSet mechanics for the case where
> >>>>> neither collector has interesting characteristics.
> >>>>>
> >>>>> Added a special case when reported characteristics for either of
> >>>>> collectors are empty or IDENTITY_FINISH only.
> >>>>> I think this should be a common case.
> >>>>>
> >>>>> The updated webrev is posted here (along with Peter suggestion to
> >>>>> rename finisher to merger):
> >>>>> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r3/
> >>>>> Also copyright year is updated
> >>>>>
> >>>>> With best regards,
> >>>>> Tagir Valeev
> >>>>>
> >>>>>
> >>>>>
> >
>
>
> --
> Pozdrawiam,
> Tomasz Linkowski
>



-- 
Pozdrawiam,
Tomasz Linkowski


More information about the core-libs-dev mailing list