[External] : Re: Should mapConcurrent() respect time order instead of input order?
Jige Yu
yujige at gmail.com
Mon Jun 2 15:05:54 UTC 2025
Thank you for your response and for considering my feedback on the
mapConcurrent() gatherer. I understand and respect that the final decision
rests with the JDK maintainers.
I would like to offer a couple of further points for consideration. My
perspective is that strict adherence to input order for mapConcurrent()
might not be the most common or beneficial default behavior for users. I'd
be very interested to see any research or data that suggests otherwise, as
that would certainly inform my understanding.
From my experience, a more common need is for higher throughput in
I/O-intensive operations. The ability to support use cases like race()—where
the first successfully completed operation determines the outcome—also
seems like a valuable capability that is currently infeasible due to the
ordering constraint.
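
To make the race() idea concrete, here is a minimal sketch of the semantics
I have in mind. It does not use mapConcurrent() at all; it relies on
ExecutorService.invokeAny from java.util.concurrent, which returns one
successful result and cancels the tasks that have not completed. The names
RaceSketch and race, and the sample tasks, are hypothetical and only for
illustration.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RaceSketch {
    /** Returns the result of whichever task completes successfully first. */
    static <T> T race(List<Callable<T>> tasks) {
        // One thread per candidate so that every task can start right away.
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            // invokeAny skips failed tasks, returns one successful result,
            // and cancels whatever has not completed yet.
            return pool.invokeAny(tasks);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            // Thrown only when no task succeeded.
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for candidate backends: one slow, one failing, one fast.
        List<Callable<String>> backends = List.of(
            () -> { Thread.sleep(2000); return "slow"; },
            () -> { throw new IllegalStateException("backend down"); },
            () -> { Thread.sleep(50); return "fast"; });
        System.out.println(race(backends));
    }
}
```

The failing task is ignored rather than treated as fatal, which matches the
"failed to fetch but not fatal" handling in my original example.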
As I see it, if a developer specifically requires the input order to be
preserved, this can be achieved with relative ease by applying a subsequent
sorting operation. For instance:
.gather(mapConcurrent(...))
.sorted(Comparator.comparing(Result::getInputSequenceId))
The primary challenge in these scenarios is typically the efficient fan-out
and execution of concurrent tasks, not the subsequent sorting of results.
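
As a rough illustration of the "tag, then sort" approach, the sketch below
carries each input's index through the concurrent step and sorts on it
afterwards. It is not the mapConcurrent() implementation: the fan-out is
simulated with an ExecutorCompletionService, and SequenceIdSketch, Result,
mapInCompletionOrder, restoreInputOrder, and slowEcho are all hypothetical
names (Result's inputSequenceId accessor plays the role of
getInputSequenceId above).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class SequenceIdSketch {
    /** Hypothetical carrier: the mapped value plus the index of its input. */
    record Result<R>(int inputSequenceId, R value) {}

    /** Maps inputs concurrently and returns results in completion order. */
    static <T, R> List<Result<R>> mapInCompletionOrder(
            List<T> inputs, int maxConcurrency, Function<T, R> mapper) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrency);
        try {
            CompletionService<Result<R>> done = new ExecutorCompletionService<>(pool);
            for (int i = 0; i < inputs.size(); i++) {
                int id = i;                 // tag each task with its input index
                T input = inputs.get(i);
                done.submit(() -> new Result<>(id, mapper.apply(input)));
            }
            List<Result<R>> results = new ArrayList<>();
            for (int i = 0; i < inputs.size(); i++) {
                results.add(done.take().get()); // take() yields whichever finished next
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Restores input order with a plain sort on the sequence id. */
    static <R> List<R> restoreInputOrder(List<Result<R>> results) {
        return results.stream()
            .sorted(Comparator.comparingInt(Result::inputSequenceId))
            .map(Result::value)
            .toList();
    }

    /** Simulates an I/O-bound call that takes the given number of milliseconds. */
    static String slowEcho(int millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "slept " + millis;
    }

    public static void main(String[] args) {
        List<Result<String>> done =
            mapInCompletionOrder(List.of(300, 100, 200), 3, n -> slowEcho(n));
        System.out.println(done);                    // completion order, timing-dependent
        System.out.println(restoreInputOrder(done)); // [slept 300, slept 100, slept 200]
    }
}
```

The sort costs O(n log n) on results that are already in memory, which is
small next to the I/O latency of the tasks themselves.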
Conversely, as you've noted, there isn't a straightforward way to modify
the current default ordered behavior to achieve the higher throughput or
race() semantics that an unordered approach would naturally provide.
While re-implementing the gatherer is a possibility, the existing
implementation is non-trivial, and creating a custom, robust alternative
represents a significant undertaking. My hope was that an unordered option
could be a valuable addition to the standard library, benefiting a wider
range of developers.
Thank you again for your time and consideration.
On Mon, Jun 2, 2025 at 7:48 AM Viktor Klang <viktor.klang at oracle.com> wrote:
> >Even if it by default preserves input order, when I explicitly called
> stream.unordered(), could mapConcurrent() respect that and in return
> achieve higher throughput with support for race?
>
> The Gatherer doesn't know whether the Stream is unordered or ordered. The
> operation should be semantically equivalent anyway.
>
> Cheers,
> √
>
>
> *Viktor Klang*
> Software Architect, Java Platform Group
> Oracle
> ------------------------------
> *From:* Jige Yu <yujige at gmail.com>
> *Sent:* Monday, 2 June 2025 16:29
> *To:* Viktor Klang <viktor.klang at oracle.com>; core-libs-dev at openjdk.org <
> core-libs-dev at openjdk.org>
> *Subject:* [External] : Re: Should mapConcurrent() respect time order
> instead of input order?
>
> Sorry. Forgot to copy to the mailing list.
>
> On Mon, Jun 2, 2025 at 7:27 AM Jige Yu <yujige at gmail.com> wrote:
>
> Thanks Viktor!
>
> I was thinking from my own experience that I wouldn't have automatically
> assumed that a concurrent fanout library would by default preserve input
> order.
>
> And I think wanting high throughput with real-life utilities like race
> would be more commonly useful.
>
> But I could be wrong.
>
> Regardless, mapConcurrent() can do both, no?
>
> Even if it by default preserves input order, when I explicitly called
> stream.unordered(), could mapConcurrent() respect that and in return
> achieve higher throughput with support for race?
>
>
>
> On Mon, Jun 2, 2025 at 2:33 AM Viktor Klang <viktor.klang at oracle.com>
> wrote:
>
> Hi!
>
> In a similar vein to the built-in Collectors,
> the built-in Gatherers provide solutions to common stream-related
> problems, but they also serve as "inspiration" for developers for
> what is possible to implement using Gatherers.
>
> If someone, for performance reasons, and with a use-case which does not
> require encounter-order, wants to take advantage of that combination of
> circumstances, it is definitely possible to implement your own Gatherer
> which has that behavior.
>
> Cheers,
> √
>
>
> *Viktor Klang*
> Software Architect, Java Platform Group
> Oracle
> ------------------------------
> *From:* core-libs-dev <core-libs-dev-retn at openjdk.org> on behalf of Jige
> Yu <yujige at gmail.com>
> *Sent:* Sunday, 1 June 2025 21:08
> *To:* core-libs-dev at openjdk.org <core-libs-dev at openjdk.org>
> *Subject:* Should mapConcurrent() respect time order instead of input
> order?
>
> It seems like for most people, input order isn't that important for
> concurrent work, and concurrent results being in non-deterministic order is
> often expected.
>
> If mapConcurrent() just respected output encounter order:
>
> It'd be able to achieve *higher throughput* if an early task is slow.
> For example, with concurrency=2, and if the first task takes 10 minutes to
> run, mapConcurrent() would only be able to process 2 tasks within the first
> 10 minutes; whereas with output (completion) order, the first task being
> slow doesn't block the 3rd - 100th elements from being processed and output.
>
> mapConcurrent() could be used to implement useful concurrent semantics, for
> example to *support race* semantics. Imagine I need to send a request to
> 10 candidate backends and take whichever succeeds first; I'd be able
> to do:
>
> backends.stream()
>     .gather(mapConcurrent(
>         10, // maxConcurrency: one slot per candidate backend
>         backend -> {
>           try {
>             return backend.fetchOrder();
>           } catch (RpcException e) {
>             return null; // failed to fetch but not fatal
>           }
>         }))
>     .filter(Objects::nonNull)
>     .findFirst(); // first success then cancel the rest
>
>
> Cheers,
>
>