Cache topology aware scheduling

Wed Sep 11 17:55:58 UTC 2024

+1. The FFM code is a great example!

> On Sep 11, 2024, at 12:53 PM, Alan Bateman <alan.bateman at oracle.com> wrote:
> 
> On 10/09/2024 05:10, Danny Thomas wrote:
>> I've switched to foreign functions for the native calls, using the current CPU for external submissions, and a queuing threshold to decide when to select the least loaded pool. Significantly improved CPU utilization versus the default scheduler with a slight throughput bump:
>> 
>> https://github.com/DanielThomas/virtual-threads-cluster-aware/commit/c0e7b6141a84eb77e6848fa84014e7a98ddfc75b
>> 
>> I'll improve the benchmark to be lumpier with more submission pressure to make work stealing more of a factor, and then look at balancing with pollSubmission.
>> 
> 
> Good use of FFM.
> 
> You probably know this already: ForkJoinPool::getQueuedSubmissionCount is an O(n) scan, so it will be interesting to see how this performs as a heuristic.
> 
> Related to this, ForkJoinWorkerThread has a method that tests two queues (local and "current source") as a cheap way to check whether it could execute something immediately.  This is currently used by Exchanger and LinkedTransferQueue to influence whether to spin.  Doug Lea has been thinking about whether to expose it. Your experiments may be a case that could use it.
> 
> -Alan
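
For anyone following along, here is a minimal sketch of the threshold heuristic discussed above. It is not the code from Danny's commit. PoolSelector, queueThreshold and preferredIndex are hypothetical names, and the preferred index stands in for whatever pool maps to the submitting CPU. The only JDK call it relies on is ForkJoinPool::getQueuedSubmissionCount, which is the O(n) scan Alan mentions, so the fallback path pays that cost once per pool.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper: prefer one pool, fall back to the least loaded pool
// once the preferred pool's submission backlog exceeds a threshold.
public final class PoolSelector {
    private final List<ForkJoinPool> pools;
    private final int queueThreshold; // assumed tuning knob, not from the original code

    public PoolSelector(List<ForkJoinPool> pools, int queueThreshold) {
        if (pools.isEmpty()) {
            throw new IllegalArgumentException("at least one pool is required");
        }
        this.pools = List.copyOf(pools);
        this.queueThreshold = queueThreshold;
    }

    public ForkJoinPool select(int preferredIndex) {
        ForkJoinPool preferred = pools.get(Math.floorMod(preferredIndex, pools.size()));
        // Fast path: one queue scan of the preferred pool only.
        if (preferred.getQueuedSubmissionCount() <= queueThreshold) {
            return preferred;
        }
        // Slow path: scan every pool and take the smallest submission backlog.
        ForkJoinPool best = preferred;
        int bestCount = Integer.MAX_VALUE;
        for (ForkJoinPool pool : pools) {
            int count = pool.getQueuedSubmissionCount();
            if (count < bestCount) {
                best = pool;
                bestCount = count;
            }
        }
        return best;
    }
}
```

The counts are only a snapshot, so under heavy submission pressure the "least loaded" answer can be stale by the time the task is queued. That seems acceptable for a heuristic, but it is worth measuring.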
