RFR: JDK-8205461 Create Collector which merges results of two other collectors

Tomasz Linkowski t.linkowski at gmail.com
Fri Sep 14 13:43:23 UTC 2018


 Hi,

I'd like to remind three little things to which there was no response
(AFAIK):

1) Brian Goetz' suggestion of changing "? extends R" into "R":
    -
http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-August/054947.html

2) Stuart Marks' suggestion about renaming "c1" and "c2" to "downstream1"
and "downstream2":
    -
http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-August/054949.html

3) my suggestion about renaming "merger" to "biFinisher" because "merger"
means BinaryOperator everywhere else:
    -
http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-September/055235.html


As to the name, I still think a name related to a well-understood concept
(like the composite pattern [1]) would be better. Note that "composite" is
also a verb [2], but "Collectors.compositing" looks a bit strange.
"Collectors.composing" seems much better to me, but - as far as I
understand - there was some concern that the users could misunderstand it
as element-wise composition, is that right?

Regards,
Tomasz Linkowski

[1] https://en.wikipedia.org/wiki/Composite_pattern
[2] https://en.wiktionary.org/wiki/composite#Verb



On Fri, Sep 14, 2018 at 11:23 AM, Peter Levart <peter.levart at gmail.com>
wrote:

> Hi Tagir,
>
> I like duplexing more than teeingAndThen. If consensus can be established
> about the name, I think you will then want to update the CSR draft to
> reflect new name. Then we'll kindly ask Stuart if he has any more advice
> before submitting the CSR...
>
> Regards, Peter
>
>
> On 09/14/2018 10:41 AM, Tagir Valeev wrote:
>
>> Hello, Stuart and Peter!
>>
>> Thank you for valuable comments. I updated the webrev:
>> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r4/
>>
>> 1. I renamed "teeingAndThen" to "duplexing". Brian insisted that
>> "-ing" suffix shall present and I agree. Hopefully it's final name.
>> 2. I updated the spec as Stuart suggested.
>>
>> No changes in implementation since r3 revision. Please check.
>>
>> With best regards,
>> Tagir Valeev.
>>
>>
>> On Tue, Aug 21, 2018 at 12:43 PM Stuart Marks <stuart.marks at oracle.com>
>> wrote:
>>
>>> Hi Tagir,
>>>
>>> Thanks for working on this. This looks really cool! And thanks Peter for
>>> agreeing to sponsor it.
>>>
>>> I can help out with the CSR. My first bit of advice about the CSR
>>> process is to
>>> hold off until the specification is complete. :-)
>>>
>>> I think the intent of the API is fine, but I think some details of the
>>> returned
>>> collector need to be ironed out first.
>>>
>>> 1. The spec doesn't say what the returned collector's supplier,
>>> accumulator,
>>> combiner, and finisher do. On the one hand, we don't necessarily want to
>>> describe the actual implementation. On the other hand, we need to
>>> specify how
>>> the thing actually behaves. One can certainly deduce the intended
>>> behavior from
>>> the description, but this really needs to be specified, and it mustn't
>>> rely on
>>> the reader having to derive the required behaviors. Since the actual
>>> implementation is fairly simple, the spec might end up being rather
>>> close to the
>>> implementation, but that might be unavoidable.
>>>
>>> I'm envisioning something like this:
>>>
>>>    - supplier: creates a result container that contains result containers
>>> obtained by calling each collector's supplier
>>>
>>>    - accumulator: calls each collector's accumulator with its result
>>> container
>>> and the input element
>>>
>>> ... and similar for the combiner and finisher functions.
>>>
>>> 2. Characteristics
>>>
>>>    - UNORDERED: should the returned collector be UNORDERED if *either*
>>> of the
>>> provided collectors is UNORDERED? (Current draft says *both*.)
>>>
>>>    - CONCURRENT: current draft seems correct
>>>
>>>    - IDENTITY_FINISH: clearly not present; perhaps this should be
>>> specified
>>>
>>> 3. Parameter naming
>>>
>>> The collector parameters are referred to as "specified collectors" or
>>> "supplied
>>> collectors". Other "higher-order" collectors refer to their underlying
>>> collectors as "downstream" collectors. I think it would be useful to
>>> work the
>>> "downstream" concept into this collector. This would enable the opening
>>> summary
>>> sentence of the doc to be something like, "Returns a collector that is a
>>> composite of two downstream collectors" or some such. (But see naming
>>> below.)
>>>
>>> 4. Naming
>>>
>>> Sigh, naming is hard, and I know there was a fair amount of discussion
>>> in the
>>> previous thread and earlier in this one, but it seems like there's still
>>> some
>>> dissatisfaction with the name. (And I'm not particularly thrilled with
>>> teeingAndThen myself.) In a few minutes I've managed to come up with a
>>> few more
>>> names that (mostly) don't seem to have been proposed before, and so here
>>> they
>>> are for your consideration:
>>>
>>>    - compound
>>>    - composite
>>>    - conjoined
>>>    - bonded
>>>    - fused
>>>    - duplex
>>>
>>> Discussion:
>>>
>>> A "composite" evokes function composition; this might be good, though it
>>> might
>>> be odd in that collectors can't be composed in the same way that
>>> functions are.
>>>
>>> "Compound" might be a useful alternative. In chemistry, two substances
>>> are
>>> combined (or bonded, or fused) to form a single substance, which is a
>>> compound.
>>>
>>> "Conjoin" seems to adequately describe the structure of the two
>>> collectors, but
>>> it lacks somewhat the connotation of unifying them.
>>>
>>> In an earlier discussion, Brian had pushed back on names related to
>>> split/fork/merge/join since those are currently in use in streams
>>> regarding
>>> splitting of input elements and merging of results. In describing how the
>>> current proposal differs, he said that elements are "multiplexed" to the
>>> different collectors. Since we're doing this with two collectors, how
>>> about
>>> "duplex"? (I note that Jacob Glickman also had suggested "duplex".)
>>>
>>> s'marks
>>>
>>>
>>>
>>>
>>>
>>> On 8/20/18 1:48 AM, Tagir Valeev wrote:
>>>
>>>> Hello!
>>>>
>>>> A CSR is created:
>>>> https://bugs.openjdk.java.net/browse/JDK-8209685
>>>> (this is my first CSR, hopefully I did it correctly)
>>>>
>>>> With best regards,
>>>> Tagir Valeev.
>>>> On Mon, Aug 20, 2018 at 2:06 PM Peter Levart <peter.levart at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Tagir,
>>>>>
>>>>> I think this looks very good. It just needs a CSR. Will you file it?
>>>>>
>>>>> Regards, Peter
>>>>>
>>>>> On 08/19/2018 11:24 AM, Tagir Valeev wrote:
>>>>>
>>>>> Hello, Brian!
>>>>>
>>>>> Of the three phases, teeing is the most important and least obvious, so
>>>>> I think something that includes that in the name is going to be
>>>>> helpful.  Perhaps "teeingAndThen" is more evocative and not totally
>>>>> unwieldy.
>>>>>
>>>>> Ok, sounds acceptable to me. Renamed pairing to teeingAndThen.
>>>>>
>>>>> By the way looking into CollectorsTest.java I found some minor things
>>>>> to
>>>>> cleanup:
>>>>> 1. `.map(mapper::apply)` and `.flatMap(mapper::apply)` can be replaced
>>>>> with
>>>>> simple `.map(mapper)` and `.flatMap(mapper)` respectively
>>>>>
>>>>> Does IntelliJ have an inspection for eliminating such locutions?
>>>>>
>>>>> Sure, that's how I found them. Well, I took the liberty to fix these
>>>>> two things.
>>>>>
>>>>> 2. In many methods redundant `throws ReflectiveOperationException` is
>>>>> declared while exception is never thrown
>>>>>
>>>>> For test code where a significant fraction of test cases are going to
>>>>> throw something, we often do this, since its easier to just uniformly
>>>>> tag such methods rather than thinking about which test methods actually
>>>>> throw the exception and which don't.  So I think this is harmless
>>>>> (though cleaning it up is harmless too.)
>>>>>
>>>>> I'm not thinking about this, because my IDE thinks for me :-) Ok, I'll
>>>>> leave them as is for now.
>>>>>
>>>>> You may want to optimize the EnumSet mechanics for the case where
>>>>> neither collector has interesting characteristics.
>>>>>
>>>>> Added a special case when reported characteristics for either of
>>>>> collectors are empty or IDENTITY_FINISH only.
>>>>> I think this should be a common case.
>>>>>
>>>>> The updated webrev is posted here (along with Peter suggestion to
>>>>> rename finisher to merger):
>>>>> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r3/
>>>>> Also copyright year is updated
>>>>>
>>>>> With best regards,
>>>>> Tagir Valeev
>>>>>
>>>>>
>>>>>
>


-- 
Pozdrawiam,
Tomasz Linkowski


More information about the core-libs-dev mailing list