Re: Project Loom VirtualThreads hang
Arnaud Masson
arnaud.masson at fr.ibm.com
Fri Jan 6 20:03:07 UTC 2023
Sure.
Servers with both IO-bound and CPU-bound requests are what I see in the real world.
Thanks
Arnaud
Over the years we've been working on virtual threads, we've made lots of simulations. But the features we have are not features that address artificial simulations, but only those of them that correspond to real problems our real users face in the real world. What we need to improve things aren't more simulations, but reports from real systems (or simulations that try to mimic behaviour observed in a real system).
-- Ron
On 6 Jan 2023, at 19:45, Arnaud Masson <arnaud.masson at fr.ibm.com> wrote:
I can't see how a stop-the-world effect can be avoided once all your carriers are busy with non-switchable CPU-bound tasks. Maybe I'm missing something.
It is not very different from other pinning problems (JNI...), except for the argument that 100% CPU usage should never occur and so is not a problem.
I will try to write a test to simulate this and post the results here.
Thanks
Arnaud
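A test of the kind described above might look like the following minimal sketch. It is not the test from this thread, and it reports no results. It assumes Java 21 or later (for `Thread.ofVirtual()`) and assumes the default scheduler parallelism equals the number of available processors. The class name `CarrierSaturationDemo` and the method `probeDelayMillis` are hypothetical names chosen for illustration. It starts a number of busy-looping virtual threads and measures how long a trivial virtual thread waits before it completes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class CarrierSaturationDemo {

    /**
     * Starts `spinners` CPU-bound virtual threads that never block, then
     * measures how long an empty "probe" virtual thread takes to finish.
     * Returns the probe's start-to-finish delay in milliseconds.
     */
    static long probeDelayMillis(int spinners, long spinMillis) {
        AtomicBoolean stop = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(spinners);
        try {
            for (int i = 0; i < spinners; i++) {
                Thread.ofVirtual().start(() -> {
                    started.countDown();
                    long deadline = System.nanoTime() + spinMillis * 1_000_000L;
                    // Busy loop: no blocking call, so the virtual thread
                    // never yields its carrier voluntarily.
                    while (!stop.get() && System.nanoTime() < deadline) {
                        Thread.onSpinWait();
                    }
                });
            }
            started.await(); // wait until every spinner is running

            long t0 = System.nanoTime();
            Thread probe = Thread.ofVirtual().start(() -> { });
            probe.join();
            return (System.nanoTime() - t0) / 1_000_000L;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        } finally {
            stop.set(true); // let any remaining spinners exit
        }
    }

    public static void main(String[] args) {
        // Assumption: default parallelism == available processors.
        int carriers = Runtime.getRuntime().availableProcessors();
        System.out.println("idle carriers:      "
                + probeDelayMillis(0, 2_000) + " ms");
        System.out.println("saturated carriers: "
                + probeDelayMillis(carriers, 2_000) + " ms");
    }
}
```

To compare against a larger carrier count, the same program can be run with the `jdk.virtualThreadScheduler.parallelism` system property set, for example `java -Djdk.virtualThreadScheduler.parallelism=16 CarrierSaturationDemo`. In that case the spinner count passed to `probeDelayMillis` would need to be chosen by hand, since the sketch derives it from the processor count.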
I don't think that increasing the scheduler's parallelism would help, nor do I think you'd see a "stop-the-world", but again, these hypotheses are just not actionable. There's nothing we can do to address them. When you find a problem, please report it and we'll investigate what can be done.
-- Ron
On 6 Jan 2023, at 19:11, Arnaud Masson <arnaud.masson at fr.ibm.com> wrote:
I don't think having 100% CPU usage on a pod is enough to justify a "stop-the-world" effect on Loom scheduling for the other tasks.
Also 100% is the extreme case, but there can be 75% CPU usage, meaning only 1 carrier left for all other tasks in my example.
Again not a blocker I guess, just have to increase the carrier count to mitigate, but it's good old native thread sizing again where it should not be really needed.
"Time-sharing would make those expensive tasks complete in a lot more than 10 seconds":
I understand there would be switching overhead (so it's slower), but I don't understand why it would be much slower if there are few of them like in my example.
thanks
Arnaud
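The question about how much slower time-sharing would be can be made concrete with a rough model. The sketch below assumes idealized processor sharing: all tasks stay runnable for the whole period, every runnable task gets an equal share of the cores, and switching overhead is ignored. The 4-core figure is inferred from the "75% CPU usage, 1 carrier left" remark and is an assumption, as are the class name `TimeSharingModel` and the method `completionSeconds`. It is an illustration of the arithmetic, not a description of any actual scheduler.

```java
public class TimeSharingModel {

    /**
     * Estimated wall-clock time for a task that needs `workSeconds` of CPU
     * when `runnableTasks` tasks fairly share `cores` cores.
     * Idealized: equal shares, no context-switch cost.
     */
    static double completionSeconds(double workSeconds, int runnableTasks, int cores) {
        if (runnableTasks <= 0 || cores <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        // With at most one task per core there is no slowdown; beyond that,
        // each task receives cores/runnableTasks of a core.
        double slowdown = Math.max(1.0, (double) runnableTasks / cores);
        return workSeconds * slowdown;
    }

    public static void main(String[] args) {
        int cores = 4; // assumed from the 75% / 1-carrier-left remark
        System.out.println(completionSeconds(10, 3, cores));  // few CPU-bound tasks
        System.out.println(completionSeconds(10, 40, cores)); // many CPU-bound tasks
    }
}
```

Under these assumptions, 3 ten-second tasks on 4 cores still finish in about 10 seconds, while 40 of them would each take about 100 seconds. In this model the slowdown depends on how many CPU-bound tasks are runnable at once relative to the core count.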
Unless otherwise stated above:
Compagnie IBM France
Siège Social : 17, avenue de l'Europe, 92275 Bois-Colombes Cedex
RCS Nanterre 552 118 465
Forme Sociale : S.A.S.
Capital Social : 664 069 390,60 €
SIRET : 552 118 465 03644 - Code NAF 6203Z