RFR: 8196106: Support nested infinite or recursive flat mapped streams [v2]
Paul Sandoz
psandoz at openjdk.org
Tue Apr 9 16:52:10 UTC 2024
On Tue, 9 Apr 2024 10:07:46 GMT, Viktor Klang <vklang at openjdk.org> wrote:
>> This PR implements a Gatherer-inspired encoding of `flatMap` and shows that it is both competitive performance-wise and improves correctness.
>>
>> Below is the performance of `Stream::flatMap` (for reference types):
>>
>> Before this PR:
>>
>> Benchmark            (size)   Mode  Cnt        Score        Error  Units
>> FlatMap.par_array        10  thrpt   12   294008,937 ± 54369,110  ops/s
>> FlatMap.par_array       100  thrpt   12    62411,229 ± 14868,119  ops/s
>> FlatMap.par_array      1000  thrpt   12     8263,821 ±   452,622  ops/s
>> FlatMap.par_iterate      10  thrpt   12    23029,978 ±  4274,449  ops/s
>> FlatMap.par_iterate     100  thrpt   12    10532,907 ±   321,694  ops/s
>> FlatMap.par_iterate    1000  thrpt   12      981,571 ±   135,270  ops/s
>> FlatMap.seq_array        10  thrpt   12  2955648,495 ± 32539,142  ops/s
>> FlatMap.seq_array       100  thrpt   12    41851,009 ±   377,546  ops/s
>> FlatMap.seq_array      1000  thrpt   12     1740,281 ±  1229,974  ops/s
>> FlatMap.seq_iterate      10  thrpt   12   321727,690 ±  5149,356  ops/s
>> FlatMap.seq_iterate     100  thrpt   12     8437,198 ±    56,635  ops/s
>> FlatMap.seq_iterate    1000  thrpt   12       76,994 ±     0,965  ops/s
>>
>>
>> After this PR:
>>
>>
>> Benchmark            (size)   Mode  Cnt        Score         Error  Units
>> FlatMap.par_array        10  thrpt   12   283350,051 ±  35567,223  ops/s
>> FlatMap.par_array       100  thrpt   12    53846,906 ±  19241,913  ops/s
>> FlatMap.par_array      1000  thrpt   12     8230,909 ±    156,362  ops/s
>> FlatMap.par_iterate      10  thrpt   12    26328,500 ±   5411,401  ops/s
>> FlatMap.par_iterate     100  thrpt   12    10470,862 ±    249,991  ops/s
>> FlatMap.par_iterate    1000  thrpt   12      986,511 ±    224,050  ops/s
>> FlatMap.seq_array        10  thrpt   12  5654826,565 ±  27317,453  ops/s
>> FlatMap.seq_array       100  thrpt   12   187929,786 ±    542,787  ops/s
>> FlatMap.seq_array      1000  thrpt   12     2385,346 ±      9,827  ops/s
>> FlatMap.seq_iterate      10  thrpt   12   812722,403 ± 160500,399  ops/s
>> FlatMap.seq_iterate     100  thrpt   12    13542,472 ±    118,769  ops/s
>> FlatMap.seq_iterate    1000  thrpt   12      157,056 ±      1,814  ops/s
>>
>>
>> For streams of size 100k, the following numbers are interesting:
>>
>> Before this PR:
>>
>> Benchmark            (size)   Mode  Cnt  Score   Error  Units
>> FlatMap.par_array    100000  thrpt   12  0,325 ± 0,004  ops/s
>> FlatMap.par_iterate  100000  thrpt   12  0,106 ± 0,008  o...
>
> Viktor Klang has updated the pull request incrementally with one additional commit since the last revision:
>
> Updating copyright year
src/java.base/share/classes/java/util/stream/DoublePipeline.java line 280:
> 278: result.sequential().allMatch(this);
> 279: else
> 280: result.sequential().forEach(sink::accept);
I think that might create a new `DoubleConsumer` instance for every input element. Alternatively, you could compute it once and cache it in a field, replacing `shorts` and using a `null` check.
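
A minimal sketch of the caching idea, using a hypothetical `FlatMapSketch` class rather than the actual `DoublePipeline` code. The names `downstream`, `cached` and `created` are assumptions for illustration only, and `created` exists just to make the single allocation observable:

```java
import java.util.function.DoubleConsumer;
import java.util.stream.DoubleStream;

// Hypothetical stand-in for the flat-mapping sink under review; not the JDK code.
final class FlatMapSketch {
    private final DoubleConsumer downstream;

    // Forwarding consumer, created lazily; null until the first element arrives.
    private DoubleConsumer cached;

    // Illustration only: counts how often the bound method reference is evaluated.
    int created;

    FlatMapSketch(DoubleConsumer downstream) {
        this.downstream = downstream;
    }

    // Called once per input element with that element's mapped stream.
    void accept(DoubleStream result) {
        DoubleConsumer c = cached;
        if (c == null) {
            // Evaluating `downstream::accept` here, instead of at every call,
            // avoids a possible per-element allocation of the bound reference.
            created++;
            c = cached = downstream::accept;
        }
        result.sequential().forEach(c);
    }
}
```

Whether the JIT would already eliminate the per-element allocation is not something this sketch settles; it only shows that the `null` check keeps the reference to a single instance.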
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18625#discussion_r1557995257