<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:998265709;
mso-list-type:hybrid;
mso-list-template-ids:-894950270 -900809868 269025305 269025307 269025295 269025305 269025307 269025295 269025305 269025307;}
@list l0:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-CA link=blue vlink=purple style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal><span style='mso-fareast-language:EN-US'>Just testing my intuition here… because reading what Ron says is often eye-opening… and changes my intuition<o:p></o:p></span></p><p class=MsoNormal><span style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><ol style='margin-top:0cm' start=1 type=1><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level1 lfo1'><span style='mso-fareast-language:EN-US'>Loom improves concurrency via Virtual Threads<o:p></o:p></span></li><ol style='margin-top:0cm' start=1 type=a><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>And consequently, potentially improves throughput<o:p></o:p></span></li></ol><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level1 lfo1'><span style='mso-fareast-language:EN-US'>A key aspect of concurrency is blocking, where blocked tasks enable resources to be applied to unblocked tasks (where Fork-Join is highly effective)<o:p></o:p></span></li><ol style='margin-top:0cm' start=1 type=a><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>Pre-Loom, resources such as Threads could be applied to unblocked tasks, but<o:p></o:p></span></li></ol></ol><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>i.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='mso-fareast-language:EN-US'>Platform Threads are heavy, expensive, etc. 
such that the number of Platform Threads puts a bound on concurrency<o:p></o:p></span></p><ol style='margin-top:0cm' start=2 type=1><ol style='margin-top:0cm' start=2 type=a><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>Post-Loom, resources such as Virtual Threads can now be applied to unblocked tasks, such that<o:p></o:p></span></li></ol></ol><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>i.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='mso-fareast-language:EN-US'>Light, cheap, etc. Virtual Threads enable a much higher bound on concurrency<o:p></o:p></span></p><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>ii.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='mso-fareast-language:EN-US'>According to Little’s Law, throughput <b><i>can</i></b> rise because the number of threads <b><i>can</i></b> rise.<o:p></o:p></span></p><ol style='margin-top:0cm' start=3 type=1><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level1 lfo1'><span style='mso-fareast-language:EN-US'>Little’s Law also says “The only requirements are that the system be stable and non-preemptive;”<o:p></o:p></span></li><ol style='margin-top:0cm' start=1 type=a><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>While the underlying O/S may be preemptive, the JVM is not, so this requirement is met.<o:p></o:p></span></li><li class=MsoListParagraph 
style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>But, Ron says, “</span>While it is true that the rate of arrival might rise without bound, if the number of threads is insufficient to meet it, then the system is no longer stable (normally that means that queues are growing without bound).”<span style='mso-fareast-language:EN-US'><o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'>Which I take to imply that increasing the number of Virtual Threads increases stability…?<span style='mso-fareast-language:EN-US'><o:p></o:p></span></li></ol></ol><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>i.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]>Even in Loom, there is an upper bound on Virtual Threads created, albeit a much higher upper bound.<span style='mso-fareast-language:EN-US'><o:p></o:p></span></p><ol style='margin-top:0cm' start=4 type=1><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level1 lfo1'><span style='mso-fareast-language:EN-US'>Where I am still confused is<o:p></o:p></span></li><ol style='margin-top:0cm' start=1 type=a><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>In Loom, I would expect that even when all our CPU Cores are at 100%, 100% throughput, the system is still stable?<o:p></o:p></span></li></ol></ol><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>i.<span style='font:7.0pt "Times New Roman"'> 
</span></span></span><![endif]><span style='mso-fareast-language:EN-US'>Or maybe I am misinterpreting what Ron said?<o:p></o:p></span></p><ol style='margin-top:0cm' start=4 type=1><ol style='margin-top:0cm' start=2 type=a><li class=MsoListParagraph style='margin-left:0cm;mso-list:l0 level2 lfo1'><span style='mso-fareast-language:EN-US'>However, latency will suffer, unless<o:p></o:p></span></li></ol></ol><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>i.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='mso-fareast-language:EN-US'>more CPU Cores are added to the overall load, via some load balancer<o:p></o:p></span></p><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>ii.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='mso-fareast-language:EN-US'>flow control, such as backpressure, is added such that queues do not grow without bound (a topic I would love to explore more)<o:p></o:p></span></p><p class=MsoListParagraph style='margin-left:108.0pt;text-indent:-108.0pt;mso-text-indent-alt:-9.0pt;mso-list:l0 level3 lfo1'><![if !supportLists]><span style='mso-fareast-language:EN-US'><span style='mso-list:Ignore'><span style='font:7.0pt "Times New Roman"'> </span>iii.<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='mso-fareast-language:EN-US'>Or, does an increase in latency mean a loss of stability?<o:p></o:p></span></p><p class=MsoNormal><span style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span 
style='mso-fareast-language:EN-US'>Cheers, Eric<o:p></o:p></span></p><p class=MsoNormal><span style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span lang=EN-US>From:</span></b><span lang=EN-US> loom-dev &lt;loom-dev-retn@openjdk.org&gt; <b>On Behalf Of </b>Ron Pressler<br><b>Sent:</b> July 13, 2022 6:30 AM<br><b>To:</b> Alex Otenko &lt;oleksandr.otenko@gmail.com&gt;<br><b>Cc:</b> Rob Bygrave &lt;robin.bygrave@gmail.com&gt;; Egor Ushakov &lt;egor.ushakov@jetbrains.com&gt;; loom-dev@openjdk.org<br><b>Subject:</b> Re: [External] : Re: jstack, profilers and other tools<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>The application of Little’s law is 100% correct. Little’s law tells us that the number of threads must *necessarily* rise if throughput is to be high. Whether or not that alone is *sufficient* might depend on the concurrency level of other resources as well. The number of threads is not the only quantity that limits the L in the formula, but L cannot be higher than the number of threads. Obviously, if the system’s level of concurrency is bounded at a very low level (say, 10), then having more than 10 threads is unhelpful, but as we’re talking about a program that uses virtual threads, we know that is not the case.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Also, Little’s law describes *stable* systems; i.e. it says that *if* the system is stable, then a certain relationship must hold. 
While it is true that the rate of arrival might rise without bound, if the number of threads is insufficient to meet it, then the system is no longer stable (normally that means that queues are growing without bound).<o:p></o:p></p><div><div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>— Ron<o:p></o:p></p></div><div><p class=MsoNormal><br><br><o:p></o:p></p><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><p class=MsoNormal>On 13 Jul 2022, at 14:00, Alex Otenko <<a href="mailto:oleksandr.otenko@gmail.com">oleksandr.otenko@gmail.com</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>This is an incorrect application of Little's Law. The law only posits that there is a connection between quantities. It doesn't specify which variables depend on which. In particular, throughput is not a free variable. <o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Throughput is something outside your control. 100k users open their laptops at 9am and login within 1 second - that's it, you have throughput of 100k ops/sec.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Then based on response time the system is able to deliver, you can tell what concurrency makes sense here. Adding threads is not going to change anything - certainly not if threads are not the bottleneck resource. 
Threads become the bottleneck when you have hardware to run them, but not the threads.<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>On Tue, 12 Jul 2022, 15:47 Ron Pressler, &lt;<a href="mailto:ron.pressler@oracle.com">ron.pressler@oracle.com</a>&gt; wrote:<o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm'><div><div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal><br><br><o:p></o:p></p><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><p class=MsoNormal>On 11 Jul 2022, at 22:13, Rob Bygrave &lt;<a href="mailto:robin.bygrave@gmail.com" target="_blank">robin.bygrave@gmail.com</a>&gt; wrote:<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><div><div><p class=MsoNormal><i>&gt; An existing application that migrates to using virtual threads doesn’t replace its platform threads with virtual threads</i><o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>What I have been confident about to date based on the testing I've done is that we can use Jetty with a Loom-based thread pool and that has worked very well. That is replacing current platform threads with virtual threads. I'm suggesting this will frequently be sub 1000 virtual threads. Ron, are you suggesting this isn't a valid use of virtual threads or am I reading too much into what you've said here?<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div></div></div></div></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>The throughput advantage of virtual threads comes from one aspect, their *number*, as explained by Little’s law. A web server employing virtual threads would not replace a pool of N platform threads with a pool of N virtual threads, as that does not increase the number of threads, which is what is required to increase throughput. 
Rather, it replaces the pool of N platform threads with an unpooled ExecutorService that spawns at least one new virtual thread for every HTTP serving task. Only that can increase the number of threads sufficiently to improve throughput.<o:p></o:p></p></div><p class=MsoNormal><br><br><o:p></o:p></p><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><div><div><div><p class=MsoNormal><o:p> </o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>&gt; <b><i>unusual</i></b> for an application that has any virtual threads to have fewer than, say, 10,000<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>In the case of HTTP server use of virtual threads, I feel the use of <b><i>unusual</i></b> is too strong. That is, when we are using virtual threads for application code handling of HTTP request/response (like Jetty + Loom), I suspect this is frequently going to operate with fewer than 1000 concurrent requests per server instance. <o:p></o:p></p></div></div></div></blockquote><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>1000 concurrent requests would likely translate to more than 10,000 virtual threads due to fanout (JEPs 425 and 428 cover this). In fact, even without fanout, every HTTP request might wish to spawn more than one thread, for example to have one thread for reading and one for writing. The number 10,000, however, is just illustrative. 
Clearly, an application with virtual threads will have some large number of threads (significantly larger than applications with just platform threads), because the ability to have a large number of threads is what virtual threads are for.<o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>The important point is that tooling needs to adapt to a high number of threads, which is why we’ve added a tool that’s designed to make sense of many threads, where jstack might not be very useful.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>— Ron<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p></div></div></blockquote></div></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></div></div></body></html>