Virtual Thread Scheduler - why ForkJoinPool?

Wed Oct 5 12:46:08 UTC 2022

On Wed, Oct 5, 2022 at 4:06 AM Alan Bateman <Alan.Bateman at oracle.com> wrote:

> On 04/10/2022 15:15, Jonas Schlecht wrote:
>
>> Hi everybody,
>>
>> I am currently writing my thesis which, for some part, also covers Virtual
>> Threads. So far, I get how everything works but I don’t understand how the
>> work-stealing nature of the ForkJoinPool is used by Virtual Threads.
>> I know the benefits of a work-stealing scheduler and how the ForkJoinPool
>> uses ForkJoinTasks. But how do Virtual Threads „fork“ tasks which can be
>> stolen by other threads? Do they even do that? As far as I understand it,
>> the same scheduling result could be achieved by using any other thread pool
>> with the number of available CPU cores as the number of threads. After all,
>> the ForkJoinPool needs ForkJoinTasks to use the work-stealing logic. Or am
>> I mistaken here?
>>
>> Could you maybe point me to some resources that explain why you decided
>> to use the ForkJoinPool and how it is used? I couldn’t find any online.
>
> For starters, think of ForkJoinPool as a "better thread pool". It has many
> advantages over a thread pool that uses a shared blocking queue for all
> tasks.
>
> Another thing is a ForkJoinPool can be created in "async mode" which is
> local FIFO scheduling. This is good for applications doing message passing
> and also good for scheduling the tasks for virtual threads.
>
> As others have pointed out, scheduling a virtual thread to execute causes
> a special task for the thread to be pushed to one of the FJP submission
> queues. A virtual thread T1 unparking virtual thread T2 will push the task
> for T2 to the submission queue of T1's carrier (worker thread). It may be
> that T2's task is executed by that worker thread or it may be that some
> other worker thread steals the task.
>
> There are a few other features of ForkJoinPool that you might want to look
> into. One is that its parallelism can be dynamically changed when workers
> are blocked - you'll see this is used to smooth over cases where virtual
> threads temporarily pin their carrier during file I/O operations. Another
> recent addition to ForkJoinPool is the ability to submit a task without
> signalling, look for "lazySubmit". This is used to reduce steals in very
> specific cases such as when a thread is unparked while parking.
>
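
(For readers following along: a minimal sketch of the "async mode"
construction mentioned above, using only the public ForkJoinPool
constructor. The class and method names are made up for illustration, and
this is not a copy of how the JDK sets up the virtual thread scheduler; it
only shows the asyncMode flag on an ordinary pool that runs tasks which are
submitted and never joined.)

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of the JDK.
class AsyncModePoolSketch {

    // Creates a pool with asyncMode = true, i.e. local FIFO scheduling
    // for forked tasks that are never joined.
    static ForkJoinPool newAsyncPool(int parallelism) {
        return new ForkJoinPool(
                parallelism,
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                null,   // no custom UncaughtExceptionHandler
                true);  // asyncMode
    }

    // Submits 'count' fire-and-forget tasks and returns how many completed.
    static int runTasks(int parallelism, int count) {
        ForkJoinPool pool = newAsyncPool(parallelism);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < count; i++) {
            // Plain Runnables: nothing here forks or joins explicitly.
            pool.execute(() -> completed.incrementAndGet());
        }
        pool.shutdown(); // already-submitted tasks still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

Which worker ends up running a given task is not fixed; idle workers may
steal from the queues of busy ones, which is the part a pool with one
shared blocking queue does not have.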

There are some disadvantages to this approach as well. At present,
Thread.yield() is counter-intuitively unfair (and thus not useful) for
cooperatively switching among busy tasks [1]. As a workaround, one can use
e.g. LockSupport.parkNanos(1) or similar, which causes the task to be sent
to an external scheduled thread pool that immediately re-queues it, at the
cost of ping-ponging between threads. Hacking [2] the Loom implementation
to use single-threaded executors avoids this problem and a few others (at
the possible cost of additional dispatch latency compared to FJP for some
workloads and CPU configurations).
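
A rough sketch of the parkNanos workaround, for illustration only. The
class name, the method names and the yieldEvery parameter are invented
here; the only real API used is LockSupport.parkNanos. The loop is a pure
arithmetic computation with a cooperative yield point every so often:

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration class, not part of the JDK or of loomania.
class CooperativeYieldSketch {

    // Stand-in for Thread.yield() at a cooperative yield point. Per the
    // description above, on a virtual thread the timed park hands the
    // task to a timer thread that re-queues it almost immediately.
    static void cooperativeYield() {
        LockSupport.parkNanos(1);
    }

    // CPU-bound loop (no allocation) that yields every 'yieldEvery'
    // iterations. Returns the sum 1 + 2 + ... + iterations.
    static long busyWork(long iterations, long yieldEvery) {
        long acc = 0;
        for (long i = 1; i <= iterations; i++) {
            acc += i;
            if (i % yieldEvery == 0) {
                cooperativeYield();
            }
        }
        return acc;
    }
}
```

On a JDK with virtual threads, each busy task would be started with
something like Thread.startVirtualThread(() -> busyWork(n, k)). The method
itself also runs on a platform thread, where parkNanos(1) is just a very
short timed park.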

[1] https://www.morling.dev/blog/loom-and-thread-fairness/ (towards the
bottom) - but note that the test in this blog post allocates fairly heavily
inside of the loop, which might obscure the effect due to contention on the
allocator; to reproduce the issue I changed the test to use a pure
arithmetic computation.
[2] https://github.com/dmlloyd/loomania - at your own risk, don't say I
didn't warn you