Collector

Howard Lovatt howard.lovatt at gmail.com
Tue Mar 12 20:37:23 PDT 2013


The interface I suggested isn't much more complicated than the existing
one. You can write a default for resultConverter that simply casts the
intermediate result to the result type. Therefore, if you aren't using an
intermediate result, the overhead is practically zero (a cast that will
always succeed).
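
As a rough sketch of what I mean (not a worked-out proposal): the name
IntermediateCollector, the collect helper, and the two factory methods
below are made up for illustration, and the helper is sequential only, so
the combiner is declared but never called. The first collector relies on
the default cast; the second overrides resultConverter to go from
StringBuilder to String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;

class CollectorSketch {
    /** Hypothetical three-type collector: T = element, I = intermediate, R = result. */
    interface IntermediateCollector<T, I, R> {
        Supplier<I> intermediateSupplier();
        BiConsumer<I, T> accumulator();
        BinaryOperator<I> combiner();

        // Default: assumes I and R are the same type, so the cast succeeds.
        @SuppressWarnings("unchecked")
        default R resultConverter(I intermediateResult) {
            return (R) intermediateResult;
        }
    }

    /** Sequential stand-in for stream.collect(...); illustration only. */
    static <T, I, R> R collect(Iterable<T> source, IntermediateCollector<T, I, R> c) {
        I acc = c.intermediateSupplier().get();
        for (T t : source) {
            c.accumulator().accept(acc, t);
        }
        return c.resultConverter(acc);
    }

    /** No separate intermediate type: inherits the default (cast) converter. */
    static <T> IntermediateCollector<T, List<T>, List<T>> toList() {
        return new IntermediateCollector<T, List<T>, List<T>>() {
            public Supplier<List<T>> intermediateSupplier() { return ArrayList::new; }
            public BiConsumer<List<T>, T> accumulator() { return List::add; }
            public BinaryOperator<List<T>> combiner() {
                return (a, b) -> { a.addAll(b); return a; };
            }
        };
    }

    /** StringBuilder intermediate, String result: overrides the converter. */
    static <T> IntermediateCollector<T, StringBuilder, String> toBracketedString() {
        return new IntermediateCollector<T, StringBuilder, String>() {
            public Supplier<StringBuilder> intermediateSupplier() { return StringBuilder::new; }
            public BiConsumer<StringBuilder, T> accumulator() {
                // Separator goes before every element except the first.
                return (sb, t) -> { if (sb.length() > 0) sb.append(", "); sb.append(t); };
            }
            public BinaryOperator<StringBuilder> combiner() {
                return (a, b) -> {
                    if (a.length() > 0 && b.length() > 0) a.append(", ");
                    return a.append(b);
                };
            }
            @Override
            public String resultConverter(StringBuilder sb) {
                // Brackets are added once, at the very end.
                return sb.insert(0, '[').append(']').toString();
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3);
        IntermediateCollector<Integer, List<Integer>, List<Integer>> asList = toList();
        IntermediateCollector<Integer, StringBuilder, String> asString = toBracketedString();
        System.out.println(collect(source, asList));
        System.out.println(collect(source, asString)); // prints [1, 2, 3]
    }
}
```

Collectors that don't need an intermediate type never mention
resultConverter at all, which is why I think the extra type parameter costs
them very little.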

Why not add something along these lines?

We can still work on a more general solution for the future.


On 13 March 2013 13:34, Brian Goetz <brian.goetz at oracle.com> wrote:

> Yes, this is a very desirable feature.  I spent a long time trying to get
> this to work, but was not able to come up with an acceptable API that gave
> the desired flexibility without making the API horribly complicated or
> imposing unreasonable performance costs.
>
> This shows up with averaging (intermediate state is int[2]), string concat
> (StringBuilder), etc.
>
> The basic problem is when the collector is the "downstream" of another
> collector, such as grouping.  Say, you want to do the query "average sale
> by salesman".  You'd like to reduce to int[2] and then convert to double.
> But that means you either have to build a view map (keeping the
> intermediate values around), or build a new map with the same keyset but
> whose values are a function applied to the old values.  Both are kind of
> sucky, and the latter does not parallelize well.
>
> The new experimental form for Collector may be more friendly to this goal,
> though.  And I'd like to have it -- it's a definite "nice to have."
>
>
> On 3/12/2013 10:26 PM, Howard Lovatt wrote:
>
>> Hi,
>>
>> I have found the Collector interface and its primitive siblings really
>> useful with streams. However I have wanted to add a feature where it works
>> optionally with an intermediate type and then converts to the result type.
>> IE
>>
>> interface Collector<T, I, R> {
>>    boolean isConcurrent();
>>
>>    Supplier<I> intermediateSupplier();
>>
>>    BiConsumer<I, T> accumulator();
>>
>>    BinaryOperator<I> combiner();
>>
>>    R resultConverter(I intermediateResult);
>> }
>>
>> The reasons that I have wanted an intermediate type fall into
>> two categories: for efficiency and/or for convenience of programming. For
>> example it would be great to add a collector that made a
>> string representation of a stream. EG assuming stream contains 1, 2, 3:
>>
>>    stream.collect(toString)
>>
>> gives the string
>>
>>    "[1, 2, 3]"
>>
>> Note the output is a String and it is surrounded by square brackets.
>> Assuming that the intermediate type was a StringBuilder, the result
>> converter function would be:
>>
>>    String resultConverter(final StringBuilder intermediateResult) {
>>      intermediateResult.insert(0, '[');
>>      intermediateResult.append(']');
>>      return intermediateResult.toString();
>>    }
>>
>> The above example demonstrates both efficiency (StringBuilder is more
>> efficient than String) and convenience (it is easier to add the '[]' at
>> the end).
>>
>> Is this something others have wanted?
>>
>>    -- Howard.
>>
>>
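
To make the grouping case Brian describes in the quoted message concrete,
here is a minimal sketch of "average sale by salesman" done in two steps:
reduce to int[2] per key, then build a second map with the same keys and
converted values. The Sale class and the method names are hypothetical, and
plain loops stand in for the stream operations.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AverageBySalesman {
    /** Hypothetical sale record; field names are illustrative only. */
    static final class Sale {
        final String salesman;
        final int amount;
        Sale(String salesman, int amount) {
            this.salesman = salesman;
            this.amount = amount;
        }
    }

    /** Step 1: group by salesman, keeping {sum, count} as the intermediate state. */
    static Map<String, int[]> accumulate(List<Sale> sales) {
        Map<String, int[]> acc = new HashMap<>();
        for (Sale s : sales) {
            int[] sumCount = acc.computeIfAbsent(s.salesman, k -> new int[2]);
            sumCount[0] += s.amount;
            sumCount[1]++;
        }
        return acc;
    }

    /** Step 2: a second map with the same keys, each value converted to a double. */
    static Map<String, Double> convert(Map<String, int[]> acc) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, int[]> e : acc.entrySet()) {
            int[] sumCount = e.getValue();
            result.put(e.getKey(), (double) sumCount[0] / sumCount[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
                new Sale("alice", 10), new Sale("alice", 20), new Sale("bob", 5));
        System.out.println(convert(accumulate(sales)));
    }
}
```

The second pass over the map is the extra cost under discussion: the
conversion has to be applied once per key after the grouping has finished,
rather than once at the end of a single flat reduction.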


-- 
  -- Howard.


More information about the lambda-dev mailing list