<div dir="auto">That's the indisputable bit. The contentious part is the claim that adding more threads is going to increase throughput.<div dir="auto"><br></div><div dir="auto">Supposing that 10k threads are there, and you actually need them, you should get concurrency level 10k. Let's see what that means in practice.</div><div dir="auto"><br></div><div dir="auto">If it is a 1-CPU machine, 10k requests in flight at any given time means each of them is waiting 99.99% of the time. Put differently, out of every second each request spends 100 microseconds on the CPU and waits for something the rest of the time (or, out of a 100ms response time, 10 microseconds on the CPU - barely enough to parse a REST request). This can't be the case for the majority of workflows.</div><div dir="auto"><br></div><div dir="auto">Of course, having 10k threads for less than 1 second each doesn't mean you are getting concurrency that is unattainable with fewer threads.</div><div dir="auto"><br></div><div dir="auto">The bottom line is that by adding threads you aren't necessarily increasing concurrency.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 15 Jul 2022, 10:19 Ron Pressler, <<a href="mailto:ron.pressler@oracle.com">ron.pressler@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word;line-break:after-white-space">

The number of threads doesn’t “do” or not do anything. If requests arrive at 100K per second and each takes 500ms to process, then the number of threads you’re using *is equal to* at least 50K (assuming thread-per-request) in a stable system, that’s all.

 That is the physical meaning: the formula tells you what the quantities *are* in a stable system.
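<div>A worked instance of those figures, as a sketch: Little's law is written here as L = λW, where λ (a symbol introduced only for this illustration) is the arrival rate and W is the time each request spends in the system.</div>

```latex
\[
L = \lambda W = 100{,}000\ \tfrac{\text{requests}}{\text{s}} \times 0.5\ \text{s} = 50{,}000\ \text{requests in flight}
\]
```

<div>Under the thread-per-request assumption, each of those 50,000 in-flight requests holds at least one thread, which is where the lower bound on the thread count comes from.</div>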

<div><br>

</div>

<div>Because in a thread-per-request program, every concurrent request takes up at least one thread, while the formula does not immediately tell you how many machines are used, or what the RAM, CPU, and network bandwidth utilisation is, it does give

 you a lower bound on the total number of live threads. Conversely, the number of threads gives an upper bound on L.</div>

<div><br>

</div>

<div>As to the rest about splitting into subtasks, that increases L and reduces W by the same factor, so when applying Little’s law it’s handy to treat W as the total latency, *as if* it was processed sequentially, if we’re interested in L being the

 number of concurrent requests. More about that here: <a href="https://inside.java/2020/08/07/loom-performance/" target="_blank" rel="noreferrer">

https://inside.java/2020/08/07/loom-performance/</a></div>

<div><br>

</div>

<div>— Ron<br>

<div><br>

<blockquote type="cite">

<div>On 15 Jul 2022, at 09:37, Alex Otenko <<a href="mailto:oleksandr.otenko@gmail.com" target="_blank" rel="noreferrer">oleksandr.otenko@gmail.com</a>> wrote:</div>

<br>

<div>

<div dir="auto">You quickly jumped to a *therefore*.

<div dir="auto"><br>

</div>

<div dir="auto">Newton's second law binds force, mass and acceleration. But you can't say that you can decrease mass by increasing acceleration, if the force remains the same. That is, the statement would be arithmetically correct, but it would have

 no physical meaning.</div>

<div dir="auto"><br>

</div>

<div dir="auto">Adding threads allows you to do more work. But you can't do more work at will - the amount of work going through the system is a quantity independent of your design.</div>

<div dir="auto"><br>

</div>

<div dir="auto">Now, what you could do at will is split the work into sub-tasks. Virtual threads allow you to do this at very little cost. However, you still can't talk about an increase in concurrency due to Little's law, because - enter Amdahl - response

 time changes.</div>

<div dir="auto"><br>

</div>

<div dir="auto">Say, 100k requests get split into 10 sub-tasks each, each runnable independently. Amdahl says your response time is going down 10-fold (from the 10ms of the earlier example to 1ms). So 100k requests per second times 1ms gives concurrency 100. Concurrency got reduced. Not surprising
 at all, because now each request spends 10x less time in the system.</div>

<div dir="auto"><br>

</div>

<div dir="auto">What about subtasks? Aren't we running more of them? Does this mean concurrency increased?</div>

<div dir="auto"><br>

</div>

<div dir="auto">Yes, 100k requests beget 1m sub-tasks. We can't compare concurrency directly, because the definition of the unit of work changed: it was W, it became W/10. But let's see anyway. So we have 1m tasks per second, each finishing in 1ms - concurrency is 1000. The same
 as before splitting the work and the matching change in response time. I treat this like I would any change of units of measurement.</div>
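<div>The arithmetic in the paragraphs above can be checked with a small sketch. The class and method names are hypothetical, and the 10ms baseline latency and the perfectly parallel 10-way split are the assumptions taken from the example in this thread, not measurements.</div>

```java
// Sketch of the splitting arithmetic; names are hypothetical, figures are from the thread.
class LittlesLawSplit {
    /** Little's law in integer units: L = rate (per second) * latency (microseconds) / 1e6. */
    static long concurrency(long ratePerSecond, long latencyMicros) {
        return ratePerSecond * latencyMicros / 1_000_000L;
    }

    public static void main(String[] args) {
        long requestsPerSecond = 100_000;
        long latencyMicros = 10_000;   // assumed 10 ms per request before splitting
        int subtasksPerRequest = 10;   // assumed to run fully in parallel

        long splitLatencyMicros = latencyMicros / subtasksPerRequest;

        // Requests in flight before the split.
        long before = concurrency(requestsPerSecond, latencyMicros);
        // Requests in flight after the split: same rate, 10x shorter latency.
        long requestsAfter = concurrency(requestsPerSecond, splitLatencyMicros);
        // Sub-tasks in flight after the split: 10x the rate, 10x shorter latency.
        long subtasksAfter = concurrency(requestsPerSecond * subtasksPerRequest, splitLatencyMicros);

        System.out.println("requests in flight before split:  " + before);
        System.out.println("requests in flight after split:   " + requestsAfter);
        System.out.println("sub-tasks in flight after split:  " + subtasksAfter);
    }
}
```

<div>Counting requests, the in-flight number drops by the split factor; counting sub-tasks, it returns to the original value, which is the change-of-units point made above.</div>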

<div dir="auto"><br>

</div>

<div dir="auto"><br>

</div>

<div dir="auto">So whereas I see a lot of good in being able to spin up lots of short-lived threads, I don't see how you can claim concurrency increases, or that Little's law somehow controls throughput.</div>

<div dir="auto"><br>

</div>

<div dir="auto"><br>

</div>

<div dir="auto">Alex</div>

</div>

<br>

<div class="gmail_quote">

<div dir="ltr" class="gmail_attr">On Thu, 14 Jul 2022, 11:01 Ron Pressler, <<a href="mailto:ron.pressler@oracle.com" target="_blank" rel="noreferrer">ron.pressler@oracle.com</a>> wrote:<br>

</div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word;line-break:after-white-space">Little’s law tells us what the relationship between concurrency, throughput and latency is if the system is stable. It tells us that if latency doesn’t decrease, then concurrency rises

 with throughput (again, if the system is stable). Therefore, to support high throughput you need a high level of concurrency. Since the Java platform’s unit of concurrency is the thread, to support high throughput you need a high number of threads. There might

 be other things you also need more of, but you *at least* need a high number of threads.

<div><br>

</div>

<div>The number of threads is an *upper bound* on concurrency, because the platform cannot make concurrent progress on anything without a thread (with the caveat in the next paragraph). There might be other upper bounds, too (e.g. you need enough memory

 to concurrently store all the working data for your concurrent operations), but the number of threads *is* an upper bound, and the one virtual threads are there to remove.<br>

<div><br>

</div>

<div>Of course, as JEP 425 explains, you could abandon threads altogether and use some other construct as your unit of concurrency, but then you lose platform support. </div>

<div><br>

</div>

<div>In any event, virtual threads exist to support a high number of threads, as Little’s law requires, therefore, if you use virtual threads, you have a high number of them.</div>

<div><br>

</div>

<div>— Ron<br>

<div><br>

<blockquote type="cite">

<div>On 14 Jul 2022, at 08:12, Alex Otenko <<a href="mailto:oleksandr.otenko@gmail.com" rel="noreferrer noreferrer" target="_blank">oleksandr.otenko@gmail.com</a>> wrote:</div>

<br>

<div>

<div dir="auto">Hi Ron,

<div dir="auto"><br>

</div>

<div dir="auto">It looks like you are unconvinced. Let me try with illustrative numbers.</div>

<div dir="auto"><br>

</div>

<div dir="auto">The users opening their laptops at 9am don't know how many threads you have. So throughput remains 100k ops/sec in both setups below. Suppose, in the first setup we have a system that is stable with 1000 threads. Little's law tells

 us that the response time cannot exceed 10ms in this case. Little's law does not prescribe response time, by the way; it is merely a consequence of the statement that the system is stable: it couldn't have been stable if its response time were higher.</div>

<div dir="auto"><br>

</div>

<div dir="auto">Now, let's create one thread per request. One claim is that this increases concurrency (and I object to this point alone). Suppose this means concurrency becomes 100k. Little's law says that the response time must then be 1 second. Sorry,
 but that's hardly an improvement! In fact, for any concurrency greater than 1000 you must get a response time higher than the 10ms we got with 1000 threads. This is not what we want. Fortunately, this is not what happens either.</div>
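<div>The two setups can be written out as a sketch, using R for the arrival rate and T for the response time as in this message, and L for the concurrency level:</div>

```latex
\[
T = \frac{L}{R}: \qquad
T_{1000} \le \frac{1000}{100{,}000\ \text{s}^{-1}} = 10\ \text{ms}, \qquad
T_{100\text{k}} = \frac{100{,}000}{100{,}000\ \text{s}^{-1}} = 1\ \text{s}
\]
```

<div>The first is an upper bound because at most 1000 requests can be in flight with 1000 threads; the second is what a concurrency of exactly 100k would imply at the same arrival rate.</div>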

<div dir="auto"><br>

</div>

<div dir="auto">Really, thread count in the thread per request design has little to do with concurrency level. Concurrency level is a derived quantity. It only tells us how many requests are making progress at any given time in a system that experiences

 request arrival rate R and which is able to process them in time T. The only thing you can control through system design is response time T.</div>

<div dir="auto"><br>

</div>

<div dir="auto">There are good reasons to design a system that way, but Little's law is not one of them.</div>

</div>

<br>

<div class="gmail_quote">

<div dir="ltr" class="gmail_attr">On Wed, 13 Jul 2022, 14:29 Ron Pressler, <<a href="mailto:ron.pressler@oracle.com" rel="noreferrer noreferrer" target="_blank">ron.pressler@oracle.com</a>> wrote:<br>

</div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word;line-break:after-white-space">

<div>The application of Little’s law is 100% correct. Little’s law tells us that the number of threads must *necessarily* rise if throughput is to be high. Whether or not that alone is *sufficient* might depend on the concurrency level of other resources

 as well. The number of threads is not the only quantity that limits the L in the formula, but L cannot be higher than the number of threads. Obviously, if the system’s level of concurrency is bounded at a very low level — say, 10 — then having more than 10

 threads is unhelpful, but as we’re talking about a program that uses virtual threads, we know that is not the case.</div>

<div><br>

</div>

<div>Also, Little’s law describes *stable* systems; i.e. it says that *if* the system is stable, then a certain relationship must hold. While it is true that the rate of arrival might rise without bound, if the number of threads is insufficient to

 meet it, then the system is no longer stable (normally that means that queues are growing without bound).<br>

<div>

<div>

<div><br>

</div>

<div>— Ron</div>

<div><br>

<blockquote type="cite">

<div>On 13 Jul 2022, at 14:00, Alex Otenko <<a href="mailto:oleksandr.otenko@gmail.com" rel="noreferrer noreferrer noreferrer" target="_blank">oleksandr.otenko@gmail.com</a>> wrote:</div>

<br>

<div>

<div dir="auto">This is an incorrect application of Little's Law. The law only posits that there is a connection between quantities. It doesn't specify which variables depend on which. In particular, throughput is not a free variable.

<div dir="auto"><br>

</div>

<div dir="auto">Throughput is something outside your control. 100k users open their laptops at 9am and login within 1 second - that's it, you have throughput of 100k ops/sec.</div>

<div dir="auto"><br>

</div>

<div dir="auto">Then, based on the response time the system is able to deliver, you can tell what concurrency makes sense here. Adding threads is not going to change anything - certainly not if threads are not the bottleneck resource. Threads become the
 bottleneck only when you have the hardware to run more of them, but not the threads themselves.</div>

</div>

<br>

<div class="gmail_quote">

<div dir="ltr" class="gmail_attr">On Tue, 12 Jul 2022, 15:47 Ron Pressler, <<a href="mailto:ron.pressler@oracle.com" rel="noreferrer noreferrer noreferrer" target="_blank">ron.pressler@oracle.com</a>> wrote:<br>

</div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div style="word-wrap:break-word;line-break:after-white-space">

<div style="word-wrap:break-word;line-break:after-white-space"><br>

<div><br>

<blockquote type="cite">

<div>On 11 Jul 2022, at 22:13, Rob Bygrave <<a href="mailto:robin.bygrave@gmail.com" rel="noreferrer noreferrer noreferrer noreferrer" target="_blank">robin.bygrave@gmail.com</a>> wrote:</div>

<br>

<div>

<div dir="ltr">

<div dir="ltr">

<div></div>

<div><i>> An existing application that migrates to using virtual threads doesn’t replace its platform threads with virtual threads</i></div>

<div><br>

</div>

<div>What I have been confident about to date based on the testing I've done is that we can use Jetty with a Loom based thread pool and that has worked very well. That is replacing current platform threads with virtual threads. I'm suggesting this

 will frequently be sub 1000 virtual threads.  Ron, are you suggesting this isn't a valid use of virtual threads or am I reading too much into what you've said here?<br>

</div>

<div><br>

</div>

</div>

</div>

</div>

</blockquote>

<div><br>

</div>

<div>The throughput advantage of virtual threads comes from one aspect, their *number*, as explained by Little’s law. A web server employing virtual threads would not replace a pool of N platform threads with a pool of N virtual threads, as that does
 not increase the number of threads, which is what is required to increase throughput. Rather, it replaces the pool of N platform threads with an unpooled ExecutorService that spawns at least one new virtual thread for every HTTP serving task. Only that can increase the number
 of threads sufficiently to improve throughput.</div>
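<div>A minimal sketch of the unpooled arrangement described above, assuming JDK 21 or later (where <code>Executors.newVirtualThreadPerTaskExecutor()</code> is a final API). The class name, the <code>handleAll</code> helper and the 10ms sleep standing in for blocking I/O are hypothetical, and this is not how any particular web server wires it up.</div>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: one new virtual thread per task, no pooling.
class VirtualThreadPerTask {
    /** Submits `requests` simulated blocking tasks and returns how many completed. */
    static int handleAll(ExecutorService executor, int requests) throws Exception {
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            results.add(executor.submit(() -> {
                Thread.sleep(10);  // stand-in for a blocking call (database, downstream HTTP)
                return 1;
            }));
        }
        int completed = 0;
        for (Future<Integer> result : results) {
            completed += result.get();  // wait for each task to finish
        }
        return completed;
    }

    public static void main(String[] args) throws Exception {
        // Each submitted task gets its own virtual thread; the thread count follows the load.
        try (ExecutorService perTask = Executors.newVirtualThreadPerTaskExecutor()) {
            System.out.println("completed: " + handleAll(perTask, 10_000));
        }
    }
}
```

<div>Passing <code>Executors.newFixedThreadPool(n)</code> to the same helper instead caps the number of tasks in progress at n, which is the pooled arrangement the paragraph above contrasts this with.</div>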

<br>

<blockquote type="cite">

<div>

<div dir="ltr">

<div dir="ltr">

<div><br>

</div>

</div>

<div dir="ltr"><br>

</div>

<div dir="ltr">> <i><b>unusual</b></i> for an application that has any virtual threads to have fewer than, say, 10,000</div>

<div><br>

</div>

<div>In the case of an HTTP server using virtual threads, I feel the use of
<i><b>unusual</b></i> is too strong. That is, when we are using virtual threads for application code handling HTTP requests/responses (like Jetty + Loom), I suspect this is frequently going to operate with fewer than 1000 concurrent requests
 per server instance.  <br>

</div>

</div>

</div>

</blockquote>

<div><br>

</div>

<div>1000 concurrent requests would likely translate to more than 10,000 virtual threads due to fanout (JEPs 425 and 428 cover this). In fact, even without fanout, every HTTP request might wish to spawn more than one thread, for example to have one

 thread for reading and one for writing. The number 10,000, however, is just illustrative. Clearly, an application with virtual threads will have some large number of threads (significantly larger than applications with just platform threads), because the ability

 to have a large number of threads is what virtual threads are for.</div>

</div>

<div><br>

</div>

<div>The important point is that tooling needs to adapt to a high number of threads, which is why we’ve added a tool that’s designed to make sense of many threads, where jstack might not be very useful.</div>

<div><br>

</div>

<div>— Ron</div>

<br>

</div>

</div>

</blockquote>

</div>

</div>

</blockquote>

</div>

<br>

</div>

</div>

</div>

</div>

</blockquote>

</div>

</div>

</blockquote>

</div>

<br>

</div>

</div>

</div>

</blockquote>

</div>

</div>

</blockquote>

</div>

<br>

</div>

</div>

</blockquote></div>