Re: Project Loom VirtualThreads hang

Arnaud Masson arnaud.masson at fr.ibm.com
Fri Jan 6 11:49:22 UTC 2023


Sorry, I meant preemptive time-sharing then. 😊
(From the user's perspective, Loom doesn't currently _look_ preemptive even if it is under the hood, since a switch can occur only at specific points in user code, unless I'm missing something.)

As I said, it would not be as bad as the Node.js example, but it is still a similar problem: a native thread pinned by a CPU-bound request.

If I have a Kubernetes pod with a CPU limit of 2 and I send it 4 CPU-bound HTTP requests concurrently, I really expect them to run concurrently, not 2 then 2; that is a rather real-world scenario.
Especially since the first 2 might be the longer ones, and I can't always know that in advance.
Sure, if you have 100% CPU usage all the time there is a problem, but that doesn't mean incoming requests should be fully stuck as soon as 100% CPU usage is hit on a pod.
(Note that 100% usage on a pod doesn't mean 100% on the underlying worker node VM.)
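
To make that scenario concrete, a minimal sketch (the class name and iteration counts are just illustrative; it assumes a current JDK and uses the documented jdk.virtualThreadScheduler.parallelism property to cap the scheduler at 2 carriers, mimicking a 2-CPU pod):

    // Run with: java -Djdk.virtualThreadScheduler.parallelism=2 CpuBoundDemo.java
    import java.time.Duration;
    import java.time.Instant;

    public class CpuBoundDemo {

        // Purely CPU-bound loop: no blocking call, hence no scheduling point for Loom.
        static long spin(long iterations) {
            long acc = 0;
            for (long i = 0; i < iterations; i++) {
                acc += Long.hashCode(i);
            }
            return acc;
        }

        public static void main(String[] args) throws InterruptedException {
            Instant start = Instant.now();
            Thread[] requests = new Thread[4];
            for (int i = 0; i < requests.length; i++) {
                int id = i;
                requests[i] = Thread.ofVirtual().start(() -> {
                    spin(2_000_000_000L);
                    System.out.printf("request %d done after %s%n",
                            id, Duration.between(start, Instant.now()));
                });
            }
            for (Thread t : requests) {
                t.join();
            }
        }
    }

Since the loop never reaches a scheduling point, the 4 "requests" typically finish as two batches of two instead of all making progress together, which is exactly my concern.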

The problem here is not raw performance but scheduling fairness.
In my opinion, for CPU-bound requests on vthreads, it would even be better to have slightly worse overall perf but maintain correct fairness.
Otherwise, we will have to worry each time about running enough carrier/native threads to cover the worst-case number of concurrent CPU-bound requests, which contrasts with the awesome simplification Loom gives us for blocking IO.
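
For completeness, the kind of sizing I mean is a sketch like this (property names as documented in JEP 444; the numbers and app.jar are placeholders, and the scheduler is an implementation detail so defaults may change). Picking these values for a worst case of concurrent CPU-bound requests is exactly the chore I would like to avoid:

    java -Djdk.virtualThreadScheduler.parallelism=8 \
         -Djdk.virtualThreadScheduler.maxPoolSize=256 \
         -jar app.jar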

From a different perspective: I think Loom's cheap stack/thread parking could also be useful for CPU-bound tasks. It could allow more concurrent CPU-bound tasks to make progress, even if it doesn't improve overall perf.
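
For example, as I understand it, Thread.yield() on a virtual thread releases the carrier, so a CPU-bound task can already opt into fairness by hand today; a sketch (the iteration count and yield interval are arbitrary):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoliteCpuTasks {
        public static void main(String[] args) {
            try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
                for (int r = 0; r < 4; r++) {
                    exec.submit(() -> {
                        long acc = 0;
                        for (long i = 0; i < 2_000_000_000L; i++) {
                            acc += Long.hashCode(i);
                            if ((i & 0xFFFFFFL) == 0) {
                                Thread.yield(); // hand-inserted scheduling point
                            }
                        }
                        return acc;
                    });
                }
            } // close() waits for the submitted tasks to finish
        }
    }

Doing this by hand works, but it is the kind of thing I would prefer the scheduler to handle.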

Thanks
Arnaud


First, virtual thread scheduling is already preemptive — the JDK decides when to preempt a virtual thread without any explicit cooperation from user code. What you’re talking about is making the decision to preempt based on some expired time-slice on the CPU. This is called time-sharing.

Second, no, this isn't like Node.js because the scheduler uses all cores. To saturate the scheduler you'd need to hog the CPU on *all* cores at the same time, i.e. reach 100% CPU utilisation. It is true that the kernel scheduler will employ time-sharing at 100% CPU while the virtual thread scheduler currently doesn't, but servers don't normally run at 100% CPU, and when they do, I don't think people are happy with how well they behave under that condition. I.e. the time-sharing offered by the OS is definitely useful for some things, but making servers work well at 100% CPU doesn't appear to be one of them, as far as we know.

Time-sharing does help when you have a small number of low-priority background processing jobs running on the server, but unlike Erlang or Go, Java easily offers that without adding time-sharing to the virtual thread scheduler.
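
For example (a sketch only, with illustrative names and pool size): background CPU-bound jobs can run on a couple of low-priority platform threads, which the OS already time-shares, while request handling stays on virtual threads.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class BackgroundJobs {

        // A small pool of low-priority *platform* threads; the OS time-shares these.
        // (Thread priority is only a hint to the OS scheduler.)
        static final ExecutorService BACKGROUND = Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, "background-job");
            t.setDaemon(true);
            t.setPriority(Thread.MIN_PRIORITY);
            return t;
        });

        public static void main(String[] args) {
            BACKGROUND.submit(BackgroundJobs::recomputeCaches);
            // ... meanwhile, request handling runs on virtual threads as usual.
        }

        static void recomputeCaches() {
            // placeholder for CPU-bound batch work
        }
    }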

So the question is, can time-sharing for virtual threads actually improve real workloads, and, if so, what should be the scheduling algorithm that would achieve that? To know the answer to this question, we’ll need to see real workloads that could potentially be helped by time-sharing. I haven’t seen one yet, but if anyone finds any, we’d certainly take a close look at that.

As for file IO, we’re currently working on employing io_uring (where available) to make that properly block the virtual thread rather than the OS thread. By the way, the current implementation will actually temporarily increase the number of OS threads used by the scheduler when doing long filesystem operations.
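
To illustrate the kind of workload in question (a sketch; the path is a placeholder): today each of these reads blocks an OS thread for its duration, with the scheduler temporarily adding threads to compensate for long operations, whereas with io_uring the virtual thread could simply park instead.

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class FileReads {
        public static void main(String[] args) {
            Path file = Path.of("/tmp/some-large-file"); // placeholder path
            try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
                for (int i = 0; i < 1_000; i++) {
                    exec.submit(() -> Files.readAllBytes(file)); // blocks an OS thread today
                }
            }
        }
    }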

— Ron
