Loom scheduler improvements?

Tue Jan 3 09:58:09 UTC 2023

> On 2 Jan 2023, at 07:27, robert engels <rengels at ix.netcom.com> wrote:
> 
> Hi loom devs - Happy New Year!
> 
> I have been digging into loom and testing the performance.
> 
> I think I see an area for possible improvement in the scheduler. See the following async profiler capture:
> 
> <PastedGraphic-2.png>
> 
> Notice that the most expensive operation in the system is the ForkJoin worker parking while attempting to schedule work, due to contention on the DelayedWorkQueue lock (the vthread is trying to park itself for N nanos). Ordinarily I wouldn’t be too concerned with micro benchmarks, but the primary goal of Loom is efficiency, and this seems like it should be fairly straightforward to address with a specialized lock-free structure for use by Loom carrier threads. I would expect Loom to be designed to handle vthreads with a very short runtime until park/reschedule.
> 
> As it is, I wrote a specialized lock rather than using the standard ReentrantLock, since the standard lock always ended up parking the acquiring thread too soon, only for it to be woken almost immediately and then rescheduled. It seems better to use some heuristics based on the cost of a scheduling switch and the expected park time, and then spin for N loops prior to parking.
> 
> I see the same behavior with both JDK 19 and JDK 20.
> 
> Does the above look reasonable? Any other suggestions?
> 
> 
> 
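
The spin-before-park idea described in the quoted message could look roughly like the sketch below. This is not the lock from the original post, which was not shown; the class name SpinThenParkLock and the spinLimit parameter are hypothetical, and the lock is deliberately simple: non-reentrant, unfair, and it ignores interruption.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration: spin a bounded number of times, then park.
final class SpinThenParkLock {
    private final AtomicBoolean held = new AtomicBoolean(false);
    private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();
    private final int spinLimit; // assumed tuning knob, not from the original post

    SpinThenParkLock(int spinLimit) {
        this.spinLimit = spinLimit;
    }

    void lock() {
        // Phase 1: bounded spinning, betting that the holder releases soon.
        for (int i = 0; i < spinLimit; i++) {
            if (!held.get() && held.compareAndSet(false, true)) {
                return;
            }
            Thread.onSpinWait();
        }
        // Phase 2: register as a waiter, then park until the lock is acquired.
        Thread me = Thread.currentThread();
        waiters.add(me);
        // Re-check after enqueueing so a release between the spin and the
        // enqueue is not missed. park() may return spuriously (or at once if
        // the thread is interrupted), so the CAS is retried in a loop.
        while (!held.compareAndSet(false, true)) {
            LockSupport.park(this);
        }
        waiters.remove(me);
    }

    void unlock() {
        held.set(false);
        // Wake the oldest waiter, if any; it still has to win the CAS.
        Thread next = waiters.peek();
        if (next != null) {
            LockSupport.unpark(next);
        }
    }
}
```

A sensible spinLimit depends on the cost of a park/unpark round trip relative to the expected hold time, so it would need to be measured rather than guessed.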

Your profile seems to be using ScheduledThreadPoolExecutor rather than ForkJoinPool.
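
A quick way to see why a DelayedWorkQueue frame points at ScheduledThreadPoolExecutor: that class is the work queue nested inside ScheduledThreadPoolExecutor, whereas ForkJoinPool uses its own queue type. The small check below is only an illustration; the class and method names are hypothetical.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Hypothetical helper: report the queue implementation behind a
// ScheduledThreadPoolExecutor.
final class QueueTypeCheck {
    static String queueTypeName() {
        ScheduledThreadPoolExecutor stpe = new ScheduledThreadPoolExecutor(1);
        try {
            // getQueue() exposes the executor's internal work queue.
            return stpe.getQueue().getClass().getSimpleName();
        } finally {
            stpe.shutdownNow();
        }
    }
}
```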

Also, keep in mind that virtual threads’ performance mostly comes from their ability to be plentiful, so a reasonable assumption for benchmarks is a high number of threads (at least thousands).
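
As a rough sketch of what "a high number of threads" can look like in a benchmark, the code below starts many virtual threads that each park briefly. It assumes JDK 21 or later (or JDK 19/20 with --enable-preview); the class name, method name and parameters are hypothetical, and a real measurement would add warm-up and timing, for example via JMH.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical benchmark skeleton: one virtual thread per task.
final class ManyVirtualThreads {
    static int runAll(int threadCount, Duration parkTime) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < threadCount; i++) {
                // The lambda returns a value, so it is treated as a Callable
                // and may throw InterruptedException from sleep().
                executor.submit(() -> {
                    Thread.sleep(parkTime); // timed park of the virtual thread
                    return completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }
}
```

For example, ManyVirtualThreads.runAll(10_000, Duration.ofMillis(1)) starts ten thousand virtual threads, each sleeping for one millisecond.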

— Ron