Stream.concat with varargs
Olexandr Rotan
rotanolexandr842 at gmail.com
Wed Sep 17 20:14:55 UTC 2025
So I have played around a bit and managed to come up with an
implementation based on an array of streams; you can find it here:
https://github.com/Evemose/nconcat/blob/master/src/main/java/nconcat/NConcatSpliterator.java
I have also added a small benchmark to the project. The numbers are:
Benchmark                                              (streamCount)  Mode  Cnt       Score       Error  Units
NConcatBenchmark.nConcatFindFirst                                  4  avgt   10     131.616 ±    15.474  ns/op
NConcatBenchmark.nConcatFindFirst                                  8  avgt   10     187.929 ±     6.544  ns/op
NConcatBenchmark.nConcatFindFirst                                 16  avgt   10     322.342 ±     6.940  ns/op
NConcatBenchmark.nConcatFindFirst                                 32  avgt   10     659.856 ±    85.509  ns/op
NConcatBenchmark.nConcatFindFirst                                 64  avgt   10    1214.133 ±    22.156  ns/op
NConcatBenchmark.nConcatMethod                                     4  avgt   10    1910.150 ±    25.269  ns/op
NConcatBenchmark.nConcatMethod                                     8  avgt   10    3865.364 ±   112.536  ns/op
NConcatBenchmark.nConcatMethod                                    16  avgt   10    7743.097 ±    74.655  ns/op
NConcatBenchmark.nConcatMethod                                    32  avgt   10   15840.551 ±   440.659  ns/op
NConcatBenchmark.nConcatMethod                                    64  avgt   10   32891.336 ±  1122.630  ns/op
NConcatBenchmark.nConcatToListWithFilter                           4  avgt   10    9527.120 ±   376.325  ns/op
NConcatBenchmark.nConcatToListWithFilter                           8  avgt   10   20260.027 ±   552.444  ns/op
NConcatBenchmark.nConcatToListWithFilter                          16  avgt   10   44724.856 ±  5040.069  ns/op
NConcatBenchmark.nConcatToListWithFilter                          32  avgt   10   82577.518 ±  2050.955  ns/op
NConcatBenchmark.nConcatToListWithFilter                          64  avgt   10  181460.219 ± 20809.669  ns/op
NConcatBenchmark.nconcatToList                                     4  avgt   10    9268.814 ±   712.883  ns/op
NConcatBenchmark.nconcatToList                                     8  avgt   10   18164.147 ±   786.803  ns/op
NConcatBenchmark.nconcatToList                                    16  avgt   10   35146.891 ±   966.871  ns/op
NConcatBenchmark.nconcatToList                                    32  avgt   10   68944.262 ±  5321.730  ns/op
NConcatBenchmark.nconcatToList                                    64  avgt   10  136845.984 ±  3491.562  ns/op
NConcatBenchmark.standardStreamConcat                              4  avgt   10    1951.522 ±    85.130  ns/op
NConcatBenchmark.standardStreamConcat                              8  avgt   10    3990.410 ±   190.517  ns/op
NConcatBenchmark.standardStreamConcat                             16  avgt   10    8599.869 ±   685.878  ns/op
NConcatBenchmark.standardStreamConcat                             32  avgt   10   17923.603 ±   361.874  ns/op
NConcatBenchmark.standardStreamConcat                             64  avgt   10   46797.408 ±  4458.069  ns/op
NConcatBenchmark.standardStreamConcatFindFirst                     4  avgt   10     125.192 ±     3.123  ns/op
NConcatBenchmark.standardStreamConcatFindFirst                     8  avgt   10     303.791 ±     8.670  ns/op
NConcatBenchmark.standardStreamConcatFindFirst                    16  avgt   10     907.429 ±    52.620  ns/op
NConcatBenchmark.standardStreamConcatFindFirst                    32  avgt   10    2964.749 ±   320.141  ns/op
NConcatBenchmark.standardStreamConcatFindFirst                    64  avgt   10   11749.653 ±   189.300  ns/op
NConcatBenchmark.standardStreamConcatToList                        4  avgt   10    7059.642 ±   740.735  ns/op
NConcatBenchmark.standardStreamConcatToList                        8  avgt   10   13714.980 ±   250.208  ns/op
NConcatBenchmark.standardStreamConcatToList                       16  avgt   10   27028.052 ±   565.047  ns/op
NConcatBenchmark.standardStreamConcatToList                       32  avgt   10   53537.731 ±   853.363  ns/op
NConcatBenchmark.standardStreamConcatToList                       64  avgt   10  105847.755 ±  3179.918  ns/op
NConcatBenchmark.standardStreamConcatToListWithFilter              4  avgt   10    9736.527 ±   154.817  ns/op
NConcatBenchmark.standardStreamConcatToListWithFilter              8  avgt   10   20607.061 ±   713.083  ns/op
NConcatBenchmark.standardStreamConcatToListWithFilter             16  avgt   10   41241.199 ±  1171.672  ns/op
NConcatBenchmark.standardStreamConcatToListWithFilter             32  avgt   10   83029.244 ±  1843.176  ns/op
NConcatBenchmark.standardStreamConcatToListWithFilter             64  avgt   10  182349.009 ± 11282.832  ns/op
Basically, the conclusion is the following (guilty of using AI for summarizing):
> The comprehensive benchmarks reveal that *NConcat significantly outperforms
> the standard library for processing-intensive operations* while trailing
> in simple collection scenarios. For short-circuit operations like
> findFirst(), NConcat delivers 38-90% better performance as stream count
> increases, reaching nearly 10x faster execution at 64 streams due to
> superior scaling (19ns/stream vs 184ns/stream). Full traversal operations
> like forEach consistently favor NConcat by 2-30%, with the advantage
> growing at scale. However, simple collection operations (toList())
> consistently run 22-24% faster with the standard library across all stream
> counts.
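For anyone who wants to check the findFirst figures in that summary, here is a small calculation that uses only the scores from the table above. The class and method names (BenchmarkRatios, perStream, speedup, timeSavedPercent) are made up for this example. Note that at 4 streams the two implementations are within each other's error margins (NConcat is roughly 5% slower there), so the 38-90% range applies from 8 streams upward.

```java
// Illustrative helper; the class and method names are hypothetical.
final class BenchmarkRatios {
    // Scores in ns/op copied from the findFirst rows of the JMH table (streamCount = 4, 8, 16, 32, 64).
    static final int[] STREAM_COUNTS = {4, 8, 16, 32, 64};
    static final double[] NCONCAT = {131.616, 187.929, 322.342, 659.856, 1214.133};
    static final double[] STANDARD = {125.192, 303.791, 907.429, 2964.749, 11749.653};

    /** Average cost attributed to each concatenated stream. */
    static double perStream(double nsPerOp, int streamCount) {
        return nsPerOp / streamCount;
    }

    /** How many times faster the candidate is than the baseline. */
    static double speedup(double baselineNs, double candidateNs) {
        return baselineNs / candidateNs;
    }

    /** Time saved relative to the baseline, in percent (negative means slower). */
    static double timeSavedPercent(double baselineNs, double candidateNs) {
        return 100.0 * (1.0 - candidateNs / baselineNs);
    }

    public static void main(String[] args) {
        for (int i = 0; i < STREAM_COUNTS.length; i++) {
            System.out.printf("%2d streams: %6.1f%% less time, %5.2fx, %6.1f vs %6.1f ns/stream%n",
                    STREAM_COUNTS[i],
                    timeSavedPercent(STANDARD[i], NCONCAT[i]),
                    speedup(STANDARD[i], NCONCAT[i]),
                    perStream(NCONCAT[i], STREAM_COUNTS[i]),
                    perStream(STANDARD[i], STREAM_COUNTS[i]));
        }
    }
}
```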
I have tried multiple approaches to optimizing toList when the size of all
sub-streams is known (which is clearly the reason the standard implementation
wins here), and I am sure there is still plenty of room for improvement,
especially in parallel. The takeaway, though, is that even a naive
implementation like mine could bring a significant performance improvement in
early short-circuiting and full-traversal cases that do not depend on the size
of the spliterator.
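To make the idea concrete without sending everyone to the repository, here is a minimal sketch of what an array-based concat spliterator could look like. It is not the code from the linked NConcatSpliterator; the names (FlatConcatSketch, ArrayConcatSpliterator, concat) are made up for illustration, and I am assuming the simplest possible design: the size is reported as unknown, splitting falls back to the default of AbstractSpliterator, and close() is not propagated to the inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Illustrative sketch only: all names here are hypothetical.
final class FlatConcatSketch {

    /** Walks a list of source spliterators in encounter order, one level deep. */
    static final class ArrayConcatSpliterator<T> extends Spliterators.AbstractSpliterator<T> {
        private final List<Spliterator<? extends T>> sources;
        private int index; // first source that may still hold elements

        ArrayConcatSpliterator(List<Spliterator<? extends T>> sources) {
            // Assumption: size is reported as unknown, so SIZED-based optimizations are not available.
            super(Long.MAX_VALUE, Spliterator.ORDERED);
            this.sources = sources;
        }

        @Override
        public boolean tryAdvance(Consumer<? super T> action) {
            // Skip exhausted sources; an element passes through one wrapper regardless of stream count.
            while (index < sources.size()) {
                if (sources.get(index).tryAdvance(action)) {
                    return true;
                }
                index++;
            }
            return false;
        }

        @Override
        public void forEachRemaining(Consumer<? super T> action) {
            // Bulk traversal delegates to each remaining source in turn.
            while (index < sources.size()) {
                sources.get(index++).forEachRemaining(action);
            }
        }
    }

    @SafeVarargs
    static <T> Stream<T> concat(Stream<? extends T>... streams) {
        List<Spliterator<? extends T>> sources = new ArrayList<>(streams.length);
        for (Stream<? extends T> s : streams) {
            sources.add(s.spliterator());
        }
        // Closing the input streams is left out for brevity.
        return StreamSupport.stream(new ArrayConcatSpliterator<T>(sources), false);
    }

    public static void main(String[] args) {
        // prints Optional[a]
        System.out.println(concat(Stream.<String>empty(), Stream.of("a"), Stream.of("b", "c")).findFirst());
    }
}
```

The point of the flat list is that tryAdvance goes through one level of delegation no matter how many streams are concatenated, whereas nested Stream.concat calls add a wrapper per level. Reporting an unknown size is also the kind of thing that would hurt toList, which matches the direction of the numbers above.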
Besides performance, the most significant advantage of my proposal, in my
view, is still developer experience, for both reading and writing stream
code.
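As a reminder of what the call sites look like, here is a small side-by-side sketch based on the studentIds/teacherIds/partnerIds example from my first message. The concatAll helper is hypothetical (it is not a JDK method) and is implemented with flatMap purely to keep the example short; it says nothing about how a real overload would or should be implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: ConcatCallSites and concatAll are hypothetical names.
final class ConcatCallSites {

    /** Stand-in for a varargs concat overload; flatMap-based for brevity. */
    @SafeVarargs
    static <T> Stream<T> concatAll(Stream<? extends T>... streams) {
        return Arrays.stream(streams).flatMap(s -> s);
    }

    /** What has to be written today: one nested Stream.concat per extra source. */
    static List<Long> nested(List<Long> studentIds, List<Long> teacherIds, List<Long> partnerIds) {
        return Stream.concat(
                studentIds.stream(),
                Stream.concat(
                        teacherIds.stream(),
                        partnerIds.stream()))
                .collect(Collectors.toList());
    }

    /** The same pipeline with a varargs-style call. */
    static List<Long> flat(List<Long> studentIds, List<Long> teacherIds, List<Long> partnerIds) {
        return concatAll(
                studentIds.stream(),
                teacherIds.stream(),
                partnerIds.stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(nested(List.of(1L), List.of(2L, 3L), List.of(4L)));
        System.out.println(flat(List.of(1L), List.of(2L, 3L), List.of(4L)));
    }
}
```

Adding a fourth source (for example a Stream.of with a single element) is one more argument in the flat version and one more level of nesting in the other.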
Please let me know your thoughts on the results of the prototype and possible
ways forward.
Best regards
On Wed, Sep 17, 2025 at 6:04 PM Olexandr Rotan <rotanolexandr842 at gmail.com>
wrote:
> Hello everyone! Thanks for your responses
>
> I will start off by answering Viktor
>
>> I guess a "simple" implementation of an N-ary concat could work, but it
>> would have performance implications (think a recursive use of
>> Stream.concat())
>
>
> I too find just the addition of small reduction-performing sugar methods
> rather unsatisfactory and most certainly not bringing enough value to be
> considered a valuable addition. Moreover, I have not checked it myself, but
> I would dare to guess that popular utility libraries such as Guava or
> Apache Commons already provide this sort of functionality in their utility
> classes. Though, if this method could bring some significant performance
> benefits, I think it may be a valuable candidate to consider. Though, to me
> as a user, the main value would be uniformity of the API and ease of reading
> and writing. The main reason I am writing about this in the first place is the
> unintuitive inconsistency with many other static factory methods that
> happily accept varargs.
>
> I may play around with the spliterator code you linked to, to see if
> I can generalize it to arrays of streams.
>
> Now, answering Pavel
>
>> Is it such a useful use case, though? I mean, it's no different from
>> SequenceInputStream(...) or Math.min/max for that matter. I very rarely
>> have to do Math.min(a, Math.min(b, c)) or some such.
>
>
> I certainly see your point, but I would dare to say that most applications
> rely on the streams much more than SequenceInputStream and Math classes,
> and their lookalikes. Stream.concat is primarily a way to merge a few
> data source outputs into one for later uniform processing, which, in a
> nutshell, is one of the most common tasks in data-centric applications. Of
> course, not every such use case has characteristics that incline developers
> to use Stream.concat, such as a combination of Stream.of and
> Collection.stream() sources, and even among those that do, not every case
> requires merging more than two sources. However, for mid-to-large scale
> apps, for which Java is best known, I would say it's fairly common. I went
> over our codebase and found more than 10 usages of concat, and a few of
> them followed this rather ugly pattern of nested concats.
>
>> Separately, it's not just one method. Consider that `concat` is also
>> implemented in specialized streams such as IntStream, DoubleStream, and
>> LongStream.
>
>
> This is unfortunate, but I would dare to say that once the reference
> spliterator is implemented, the others can be derived by analogy fairly
> quickly.
>
> And last but not least, answering Daniel
>
>> Not immediately obvious but you can create a Stream<Stream<T>> using
>> Stream.of and reduce that using Stream::concat to obtain a Stream<T>.
>>
>> Something along those lines:
>>
>> ```
>> var stream = Stream.of(Stream.of(1,2,3), Stream.of(4), Stream.of(5, 6,
>> 7, 8)).reduce(Stream.empty(), Stream::concat, Stream::concat);
>> ```
>
>
> This is what I meant by a "reduction-like" implementation. It is fairly
> straightforward, but just from the looks of it, one could assume that this
> solution will have performance consequences, even when using flatMap
> instead of reduce. I am not sure how often people would want to use such an
> approach on an array of streams large enough for the performance difference
> to be noticeable, though I would assume that time and resource consumption
> scale non-linearly with the length of the streams array due to the
> implementation of the concat method.
>
> Nevertheless, this is an acceptable workaround for such cases, even though
> not the most readable one. Even if this approach is accepted as sufficient
> for merging an n-sized array of streams, it would probably make
> some sense to put a note about it in the docs of the concat method. Still,
> not having a concat(Stream...) overload would remain unintuitive for
> many developers, including me.
>
> Thanks everybody for the answers again
>
> Best regards
>
> On Wed, Sep 17, 2025 at 5:15 PM Pavel Rappo <pavel.rappo at gmail.com> wrote:
>
>> > this would be a great quality of life improvement
>>
>> Is it such a useful use case, though? I mean, it's no different from
>> SequenceInputStream(...) or Math.min/max for that matter. I very
>> rarely have to do Math.min(a, Math.min(b, c)) or some such. And those
>> methods predate streams API by more than a decade.
>>
>> Separately, it's not just one method. Consider that `concat` is also
>> implemented in specialized streams such as IntStream, DoubleStream,
>> and LongStream.
>>
>> On Wed, Sep 17, 2025 at 2:58 PM Olexandr Rotan
>> <rotanolexandr842 at gmail.com> wrote:
>> >
>> > Greetings to everyone on the list.
>> >
>> > When working on some routine tasks recently, I have encountered a,
>> seemingly to me, strange decision in design of Stream.concat method,
>> specifically the fact that it accepts exactly two streams. My concrete
>> example was something along the lines of
>> >
>> > var studentIds = ...;
>> > var teacherIds = ...;
>> > var partnerIds = ...;
>> >
>> > return Stream.concat(
>> >     studentIds.stream(),
>> >     teacherIds.stream(),
>> >     partnerIds.stream() // oops, this one doesn't work
>> > )
>> >
>> > so I had to transform concat to a rather ugly
>> > Stream.concat(
>> >     studentIds.stream(),
>> >     Stream.concat(
>> >         teacherIds.stream(),
>> >         partnerIds.stream()
>> >     )
>> > )
>> >
>> > Later on I had to add a fourth stream of a single element (Stream.of), and
>> > this one became even uglier
>> >
>> > When I first wrote a third argument to concat and saw that the IDE
>> > highlighted it as an error, I was very surprised. This design seems
>> > inconsistent not only with the whole Java stdlib, but even with the
>> > Stream.of static method of the same class. Is there any particular reason
>> > why concat takes exactly two arguments?
>> >
>> > I would say that, even if just in the form of a sugar method that does a
>> > reduce on an array (varargs) of streams, this would be a great quality of
>> > life improvement, but I'm sure there may also be some room for performance
>> > improvement.
>> >
>> > Of course, there are workarounds like Stream.of + flatMap, but:
>> >
>> > 1. It gets messy when trying to concat streams of literal elements
>> > (Stream.of) and streams of collections or arrays
>> > 2. It certainly has significant performance overhead
>> > 3. It still doesn't explain the absence of a varargs overload of concat
>> >
>> > So, once again, is there any particular reason to restrict arguments
>> list to exactly two streams? If not, I would be happy to contribute
>> Stream.concat(Stream... streams) overload.
>> >
>> > Best regards
>> >
>> >
>> >
>>
>