<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="FR" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">My scenario is not one where you have tens of thousands of CPU-bound tasks.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">The scenario I’m talking about is one where you have thousands of IO-bound tasks and just enough CPU-bound tasks (in the same scheduler) to monopolize most of the carrier threads for some time.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">By a CPU-bound task I mean here the worst case: a task that never calls yield() and never performs even a small IO operation, so that, if I understand correctly, its carrier thread cannot switch to another virtual thread until the task has fully completed.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">Example:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">A webapp with a Loom scheduler with 4 native threads.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">The server continuously receives about 100 concurrent IO-bound requests on average: that works fine.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">Then it receives a one-off batch of 4 CPU-bound requests, each doing pure computation (no yield) for 10 seconds.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">Won’t the 4 carrier threads stay stuck on the 4 CPU-bound requests, blocking progress on the other IO-bound requests for 10 seconds, and only then resume work on the IO-bound requests?<o:p></o:p></span></p>
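<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">To make the example concrete, here is a minimal sketch of how it could be tested, assuming JDK 21 or later. The class name CarrierPinningSketch and the methods burnCpu and worstIoLatencyMillis are hypothetical names of mine, and Thread.sleep only stands in for a real blocking IO call. I have not measured anything with it.<o:p></o:p></span></p>
<pre>
```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;

public class CarrierPinningSketch {

    // Spins without blocking or yielding, so this virtual thread never
    // gives its carrier thread a chance to run something else.
    static long burnCpu(Duration d) {
        long deadline = System.nanoTime() + d.toNanos();
        long acc = 0;
        while (deadline - System.nanoTime() > 0) {
            acc += System.nanoTime() % 7;
        }
        return acc;
    }

    // Starts cpuTasks busy virtual threads, then ioTasks virtual threads that
    // each sleep for ioDelay (an assumption standing in for blocking IO).
    // Returns the worst IO-task latency in milliseconds, measured from the
    // moment the task was submitted to the moment it finished.
    static long worstIoLatencyMillis(int cpuTasks, Duration cpuTime,
                                     int ioTasks, Duration ioDelay)
            throws InterruptedException {
        Thread[] threads = new Thread[cpuTasks + ioTasks];
        AtomicLong worst = new AtomicLong();
        int n = 0;
        for (int i = 0; i != cpuTasks; i++) {
            threads[n++] = Thread.startVirtualThread(() -> burnCpu(cpuTime));
        }
        for (int i = 0; i != ioTasks; i++) {
            long submitted = System.nanoTime();
            threads[n++] = Thread.startVirtualThread(() -> {
                try {
                    Thread.sleep(ioDelay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                long elapsed = (System.nanoTime() - submitted) / 1_000_000;
                worst.accumulateAndGet(elapsed, Math::max);
            });
        }
        for (Thread t : threads) {
            t.join();
        }
        return worst.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The numbers from the example: 4 CPU-bound tasks of 10 seconds
        // and 100 IO-bound tasks (the 50 ms delay is an arbitrary choice).
        long worst = worstIoLatencyMillis(
                4, Duration.ofSeconds(10), 100, Duration.ofMillis(50));
        System.out.println("worst IO-task latency: " + worst + " ms");
    }
}
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">To match the example, it could be run with -Djdk.virtualThreadScheduler.parallelism=4 so that the scheduler has 4 carrier threads. A worst latency close to the duration of the CPU-bound tasks would suggest that the IO-bound tasks were starved; a latency close to the sleep time would suggest they were not.<o:p></o:p></span></p>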
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">(So this is not very different from your time-sharing example: time-sharing among the minority of CPU-bound tasks would maintain responsiveness for the majority of IO-bound tasks.)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">Thanks<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US">Arnaud<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">Why are you concerned about a problem no one has reported yet? As I said, we are very interested in fixing problems, but we don’t know how to fix non-problems. If someone shows us a problem
— a server that misbehaves under some loads — we’ll try to address it. Even if there is a problem with scheduling, and even if time-sharing could solve it, we still don’t know what kind of time-sharing algorithm to employ until we see the actual problem, so
there’s nothing we can do about it until we know what it actually is.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">It’s not that I’m resistant to adding time-sharing. Far from it. But until we have a problem-reproducer that we can test and mark as fixed, we really can’t tell if the time-sharing that we were
to introduce today is the time-sharing that would solve the problem we are yet to see.<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">Of course, in web apps most requests are generally IO-bound (HTTP, JDBC), but I have seen CPU-bound requests in production (sometimes accidental).<o:p></o:p></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">Have you encountered a problem with virtual threads in servers with occasional CPU-bound requests?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">I don’t use Loom in production today (I guess not many people do, since it’s still in preview), so if you are asking whether I have seen a production issue, the answer is no.<o:p></o:p></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">Have you seen a problem with a server misbehaving in testing, then?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">You need to understand that I’m not teasing you. I actually want people to report problems so that we can make the product better by fixing them. But merely saying that there could be a problem
and that some sketch of a solution could address it doesn’t give us anything actionable that would help us improve the product.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">I suppose that if I want to migrate to Loom and be safe, I can increase the number of native carrier threads in the underlying pool (N &gt;&gt; core count).<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">It’s just that if there were time-sharing in Loom, I don’t see why virtual threads would not be used systematically (almost blindly, for both CPU-bound and IO-bound tasks).<o:p></o:p></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">But servers based on thread pools have *worse* fairness than virtual threads: they don’t share as well even when not at 100% CPU. I don’t understand why you think more workers would make you
safer when you’re not sure whether or not time-sharing helps server workloads at all.<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"> <o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">I’m curious, when do you think<span class="apple-converted-space"> </span>preemptive time sharing (as implemented in various OSes for decades) is useful?<o:p></o:p></span></p>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US">Time-sharing in *non*-realtime kernels is crucial for keeping a system responsive to operator interaction in the presence of a few background tasks that can saturate the CPU; without it,
the operator isn’t even able to terminate resource-hungry processes (as would happen in early Windows versions). But transaction processing in servers is different. You have tens of thousands of tasks, and if they consume a significant amount of CPU then you’re overcommitted
by orders of magnitude. Indeed, OS time-sharing is not able to make servers well-behaved at 100% CPU, although responsiveness to operator intervention is still preserved thanks to it.<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:35.4pt"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<p class="MsoNormal" style="margin-left:35.4pt">— Ron<o:p></o:p></p>
</div>
</div>
</body>
</html>