RFR: JDK-8205461 Create Collector which merges results of two other collectors

Stuart Marks stuart.marks at oracle.com
Tue Aug 21 05:43:47 UTC 2018


Hi Tagir,

Thanks for working on this. This looks really cool! And thanks Peter for 
agreeing to sponsor it.

I can help out with the CSR. My first bit of advice about the CSR process is to 
hold off until the specification is complete. :-)

I think the intent of the API is fine, but I think some details of the returned 
collector need to be ironed out first.

1. The spec doesn't say what the returned collector's supplier, accumulator, 
combiner, and finisher do. On the one hand, we don't necessarily want to 
describe the actual implementation. On the other hand, we need to specify how 
the thing actually behaves. One can certainly deduce the intended behavior from 
the description, but this really needs to be specified, and it mustn't rely on 
the reader having to derive the required behaviors. Since the actual 
implementation is fairly simple, the spec might end up being rather close to the 
implementation, but that might be unavoidable.

I'm envisioning something like this:

  - supplier: creates a result container that contains result containers 
obtained by calling each collector's supplier

  - accumulator: calls each collector's accumulator with its result container 
and the input element

... and similar for the combiner and finisher functions.

2. Characteristics

  - UNORDERED: should the returned collector be UNORDERED if *either* of the 
provided collectors is UNORDERED? (Current draft says *both*.)

  - CONCURRENT: current draft seems correct

  - IDENTITY_FINISH: clearly not present; perhaps this should be specified

3. Parameter naming

The collector parameters are referred to as "specified collectors" or "supplied 
collectors". Other "higher-order" collectors refer to their underlying 
collectors as "downstream" collectors. I think it would be useful to work the 
"downstream" concept into this collector. This would enable the opening summary 
sentence of the doc to be something like, "Returns a collector that is a 
composite of two downstream collectors" or some such. (But see naming below.)

4. Naming

Sigh, naming is hard, and I know there was a fair amount of discussion in the 
previous thread and earlier in this one, but it seems like there's still some 
dissatisfaction with the name. (And I'm not particularly thrilled with 
teeingAndThen myself.) In a few minutes I've managed to come up with a few more 
names that (mostly) don't seem to have been proposed before, and so here they 
are for your consideration:

  - compound
  - composite
  - conjoined
  - bonded
  - fused
  - duplex

Discussion:

A "composite" evokes function composition; this might be good, though it might 
be odd in that collectors can't be composed in the same way that functions are.

"Compound" might be a useful alternative. In chemistry, two substances are 
combined (or bonded, or fused) to form a single substance, which is a compound.

"Conjoin" seems to adequately describe the structure of the two collectors, but 
it lacks somewhat the connotation of unifying them.

In an earlier discussion, Brian had pushed back on names related to 
split/fork/merge/join since those are currently in use in streams regarding 
splitting of input elements and merging of results. In describing how the 
current proposal differs, he said that elements are "multiplexed" to the 
different collectors. Since we're doing this with two collectors, how about 
"duplex"? (I note that Jacob Glickman also had suggested "duplex".)

s'marks





On 8/20/18 1:48 AM, Tagir Valeev wrote:
> Hello!
> 
> A CSR is created:
> https://bugs.openjdk.java.net/browse/JDK-8209685
> (this is my first CSR, hopefully I did it correctly)
> 
> With best regards,
> Tagir Valeev.
> On Mon, Aug 20, 2018 at 2:06 PM Peter Levart <peter.levart at gmail.com> wrote:
>>
>> Hi Tagir,
>>
>> I think this looks very good. It just needs a CSR. Will you file it?
>>
>> Regards, Peter
>>
>> On 08/19/2018 11:24 AM, Tagir Valeev wrote:
>>
>> Hello, Brian!
>>
>> Of the three phases, teeing is the most important and least obvious, so
>> I think something that includes that in the name is going to be
>> helpful.  Perhaps "teeingAndThen" is more evocative and not totally
>> unwieldy.
>>
>> Ok, sounds acceptable to me. Renamed pairing to teeingAndThen.
>>
>> By the way looking into CollectorsTest.java I found some minor things to
>> cleanup:
>> 1. `.map(mapper::apply)` and `.flatMap(mapper::apply)` can be replaced with
>> simple `.map(mapper)` and `.flatMap(mapper)` respectively
>>
>> Does IntelliJ have an inspection for eliminating such locutions?
>>
>> Sure, that's how I found them. Well, I took the liberty to fix these two things.
>>
>> 2. In many methods redundant `throws ReflectiveOperationException` is
>> declared while exception is never thrown
>>
>> For test code where a significant fraction of test cases are going to
>> throw something, we often do this, since its easier to just uniformly
>> tag such methods rather than thinking about which test methods actually
>> throw the exception and which don't.  So I think this is harmless
>> (though cleaning it up is harmless too.)
>>
>> I'm not thinking about this, because my IDE thinks for me :-) Ok, I'll
>> leave them as is for now.
>>
>> You may want to optimize the EnumSet mechanics for the case where
>> neither collector has interesting characteristics.
>>
>> Added a special case when reported characteristics for either of
>> collectors are empty or IDENTITY_FINISH only.
>> I think this should be a common case.
>>
>> The updated webrev is posted here (along with Peter suggestion to
>> rename finisher to merger):
>> http://cr.openjdk.java.net/~tvaleev/webrev/8205461/r3/
>> Also copyright year is updated
>>
>> With best regards,
>> Tagir Valeev
>>
>>


More information about the core-libs-dev mailing list