[External] : Re: jstack, profilers and other tools

eric at kolotyluk.net eric at kolotyluk.net
Thu Jul 14 17:29:45 UTC 2022


Thanks for that clarification, Ron…

 

Would it be fair to say a system is stable if, when the Load Balancer predicts that the current resources will soon become unstable, it launches more instances/resources before instability sets in? That is, can appropriate feedback loops keep the underlying systems stable? Of course, keeping a system stable under load bursts could be challenging, but manageable.

 

My past experience with Akka/Scala shows that we can provision systems such that we get better CPU utilization because such systems can distribute tasks more effectively over limited Platform Threads. I am hopeful that with Loom, Virtual Threads, Structured Concurrency, etc. we can also provision systems with better resource utilization. For example, the load balancer spawns new instances at 75% utilization rather than at 50% as under conventional systems.

 

Also, my hope is that Loom will let us design and implement this with a lower cognitive load than using reactive programming techniques. Once Loom becomes more widely available, such as in Java 19, it will be interesting to see what impact this has on the Reactive Community, Scala, Akka, Kotlin and Kotlin Coroutines, etc.

 

This whole Loom Project would make a fascinating University course as a lot of important historical lessons are apparent. There has been a lot of progress from Fibers to Virtual Threads…

 

I took two Coursera <https://www.coursera.org/>  courses, one on Functional Programming in Scala, and the other on Reactive Programming in Scala. They were great courses. It would be fantastic for Coursera to have a course on Concurrent Programming in Loom.

 

Cheers, Eric

 

From: Ron Pressler <ron.pressler at oracle.com> 
Sent: July 14, 2022 3:35 AM
To: Eric Kolotyluk <eric at kolotyluk.net>
Cc: Alex Otenko <oleksandr.otenko at gmail.com>; Rob Bygrave <robin.bygrave at gmail.com>; Egor Ushakov <egor.ushakov at jetbrains.com>; loom-dev at openjdk.org
Subject: Re: [External] : Re: jstack, profilers and other tools

 

First, there is no such thing as more or less stable. Stability is binary. Either the rate at which requests are completed is equal to the rate at which they arrive (the system is stable), or it is lower (in which case requests pile up and the system is unstable). Although, I guess you could talk about how quickly requests pile up and your server starts dropping them. 

 

Second, if your system is stable, Little’s law tells you how many requests are being concurrently served. Obviously, if you’re serving L concurrent requests in a stable system, then you have sufficient resources to serve them concurrently. Every request might consume a little or a lot of some resources — CPU, memory, networking — and so those resources' availability imposes upper bounds on your concurrency. But (assuming you use threads as your units of concurrency) every concurrent request must consume at least one thread, or it won’t be able to make progress at all. So threads are also an upper bound on concurrency, and we know empirically that in a great many server systems OS threads become the most constraining upper bound on concurrency well before other resources. Virtual threads remove that particular limitation, which helps all those systems, and now the concurrency of your system is only limited by the other resources I mentioned.

 

If every request consumes 1/10 of your available CPU over its entire duration, then your CPU puts a limit of 10 on your concurrency and threads are not your bottleneck, but if you’re using virtual threads — meaning you want a much higher number of threads — then that’s not your circumstance. Clearly, when your CPU, or any other resource consumed by the requests you serve, is at 100% (for any non-instantaneous duration) then your system is not stable.
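
As a rough illustration of the arithmetic above, here is a minimal Java sketch of Little's law (L = λW) and of the bound a single resource puts on concurrency. The class and method names and the workload numbers are made up for the example; nothing here is a JDK or Loom API.

```java
public class LittlesLaw {
    /** Little's law for a stable system: L = lambda * W (mean requests in flight). */
    static double concurrency(double arrivalsPerSecond, double meanLatencySeconds) {
        return arrivalsPerSecond * meanLatencySeconds;
    }

    /** Upper bound on concurrency when every request holds a fixed share of one resource. */
    static double resourceBound(double sharePerRequest) {
        return 1.0 / sharePerRequest;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 2,000 requests/s, each taking 0.5 s on average.
        System.out.printf("requests in flight: %.0f%n", concurrency(2_000, 0.5)); // 1000
        // A request that holds 1/10 of the CPU for its whole duration caps concurrency at 10.
        System.out.printf("CPU-imposed bound: %.0f%n", resourceBound(0.1));       // 10
    }
}
```

With these assumed numbers, serving the load takes about 1,000 requests in flight, so a thread-per-request server needs at least that many threads, while a request that holds a tenth of the CPU would cap concurrency at 10 no matter how many threads exist.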

 

— Ron





On 13 Jul 2022, at 19:26, eric at kolotyluk.net wrote:

 

Just testing my intuition here… because reading what Ron says is often eye-opening… and changes my intuition

 

1. Loom improves concurrency via Virtual Threads
   a. And consequently, potentially improves throughput
2. A key aspect of concurrency is blocking, where blocked tasks enable resources to be applied to unblocked tasks (where Fork-Join is highly effective)
   a. Pre-Loom, resources such as Threads could be applied to unblocked tasks, but
      i. Platform Threads are heavy, expensive, etc. such that the number of Platform Threads puts a bound on concurrency
   b. Post-Loom, resources such as Virtual Threads can now be applied to unblocked tasks, such that
      i. Light, cheap, etc. Virtual Threads enable a much higher bound on concurrency
      ii. According to Little’s Law, throughput can rise because the number of threads can rise.
3. Little’s Law also says “The only requirements are that the system be stable and non-preemptive;”
   a. While the underlying O/S may be preemptive, the JVM is not, so this requirement is met.
   b. But, Ron says, “While it is true that the rate of arrival might rise without bound, if the number of threads is insufficient to meet it, then the system is no longer stable (normally that means that queues are growing without bound).”
   c. Which I take to imply, that increasing the number of Virtual Threads increases the stability… ?
      i. Even in Loom, there is an upper bound on Virtual Threads created, albeit a much higher upper bound.
4. Where I am still confused is
   a. In Loom, I would expect that even when all our CPU Cores are at 100%, 100% throughput, the system is still stable?
      i. Or maybe I am misinterpreting what Ron said?
   b. However, latency will suffer, unless
      i. more CPU Cores are added to the overall load, via some load balancer
      ii. flow control, such as backpressure, is added such that queues do not grow without bound (a topic I would love to explore more)
      iii. Or, does an increase in latency mean a loss of stability?
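
One simple form of the flow control mentioned in the list above is bounded admission: cap the number of requests in flight and refuse the rest, so nothing queues without bound. The sketch below is only an illustration under assumptions. It needs JDK 21 or later (virtual threads are a preview feature in JDK 19 and 20), the names BoundedAdmission and offer are hypothetical, and the latch exists purely to keep accepted requests in flight until every offer has been made, so the counts are deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class BoundedAdmission {
    /** Offers `offered` requests with at most `limit` in flight; returns {accepted, rejected}. */
    static int[] offer(int offered, int limit) {
        Semaphore permits = new Semaphore(limit);
        CountDownLatch offersDone = new CountDownLatch(1);
        int accepted = 0;
        int rejected = 0;
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < offered; i++) {
                if (!permits.tryAcquire()) {
                    rejected++;                 // shed load instead of queueing it
                    continue;
                }
                accepted++;
                executor.submit(() -> {
                    try {
                        offersDone.await();     // stand-in for the request's blocking work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        permits.release();      // free the slot for a later request
                    }
                });
            }
            offersDone.countDown();             // let the accepted requests finish
        }                                       // close() waits for submitted tasks
        return new int[] {accepted, rejected};
    }

    public static void main(String[] args) {
        int[] result = offer(1_000, 100);
        System.out.println("accepted=" + result[0] + " rejected=" + result[1]);
    }
}
```

A real server would more likely block or slow the caller, or return an error such as HTTP 503, rather than count rejections, but the principle is the same: the cap, not the number of threads, decides how much work is admitted.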

 

Cheers, Eric

 

From: loom-dev <loom-dev-retn at openjdk.org> On Behalf Of Ron Pressler
Sent: July 13, 2022 6:30 AM
To: Alex Otenko <oleksandr.otenko at gmail.com>
Cc: Rob Bygrave <robin.bygrave at gmail.com>; Egor Ushakov <egor.ushakov at jetbrains.com>; loom-dev at openjdk.org
Subject: Re: [External] : Re: jstack, profilers and other tools

 

The application of Little’s law is 100% correct. Little’s law tells us that the number of threads must *necessarily* rise if throughput is to be high. Whether or not that alone is *sufficient* might depend on the concurrency level of other resources as well. The number of threads is not the only quantity that limits the L in the formula, but L cannot be higher than the number of threads. Obviously, if the system’s level of concurrency is bounded at a very low level — say, 10 — then having more than 10 threads is unhelpful, but as we’re talking about a program that uses virtual threads, we know that is not the case.

 

Also, Little’s law describes *stable* systems; i.e. it says that *if* the system is stable, then a certain relationship must hold. While it is true that the rate of arrival might rise without bound, if the number of threads is insufficient to meet it, then the system is no longer stable (normally that means that queues are growing without bound).
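
To make the stability condition concrete, the following sketch treats a thread-per-request server as a simple fluid model: with T threads and a mean latency of W seconds, it can complete at most T / W requests per second, and any arrival rate above that piles up in a queue. This is a deliberately crude approximation with constant rates and made-up names and numbers, not a measurement of any real system.

```java
public class StabilityCheck {
    /** Highest completion rate `threads` threads can sustain if each request holds one thread for `latencySeconds`. */
    static double maxThroughput(int threads, double latencySeconds) {
        return threads / latencySeconds;
    }

    /** Stable only if requests can complete as fast as they arrive. */
    static boolean isStable(double arrivalsPerSecond, int threads, double latencySeconds) {
        return arrivalsPerSecond <= maxThroughput(threads, latencySeconds);
    }

    /** Queue length after `seconds`, starting from an empty queue, with constant rates. */
    static double backlogAfter(double seconds, double arrivalsPerSecond, int threads, double latencySeconds) {
        double surplus = arrivalsPerSecond - maxThroughput(threads, latencySeconds);
        return Math.max(0.0, surplus) * seconds;
    }

    public static void main(String[] args) {
        // Hypothetical server: 200 threads, 0.25 s per request, so at most 800 requests/s.
        System.out.println(isStable(700, 200, 0.25));          // true
        System.out.println(isStable(1_000, 200, 0.25));        // false
        System.out.println(backlogAfter(10, 1_000, 200, 0.25)); // 2000.0 requests queued after 10 s
    }
}
```

In this model the backlog grows linearly for as long as arrivals exceed capacity, which is the "queues growing without bound" case. Raising the thread count raises the ceiling only until some other resource becomes the limit.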

 

— Ron






On 13 Jul 2022, at 14:00, Alex Otenko <oleksandr.otenko at gmail.com> wrote:

 

This is an incorrect application of Little's Law. The law only posits that there is a connection between quantities. It doesn't specify which variables depend on which. In particular, throughput is not a free variable. 

 

Throughput is something outside your control. 100k users open their laptops at 9am and log in within 1 second - that's it, you have throughput of 100k ops/sec.

 

Then based on response time the system is able to deliver, you can tell what concurrency makes sense here. Adding threads is not going to change anything - certainly not if threads are not the bottleneck resource. Threads become the bottleneck when you have hardware to run them, but not the threads.

 

On Tue, 12 Jul 2022, 15:47 Ron Pressler, <ron.pressler at oracle.com> wrote:

 






On 11 Jul 2022, at 22:13, Rob Bygrave <robin.bygrave at gmail.com> wrote:

 

> An existing application that migrates to using virtual threads doesn’t replace its platform threads with virtual threads

 

What I have been confident about to date, based on the testing I've done, is that we can use Jetty with a Loom-based thread pool, and that has worked very well. That is replacing current platform threads with virtual threads. I'm suggesting this will frequently be sub-1000 virtual threads. Ron, are you suggesting this isn't a valid use of virtual threads, or am I reading too much into what you've said here?

 

 

The throughput advantage of virtual threads comes from one aspect, their *number*, as explained by Little’s law. A web server employing virtual threads would not replace a pool of N platform threads with a pool of N virtual threads, as that does not increase the number of threads, which is what is required to increase throughput. Rather, it replaces the pool of N platform threads with an unpooled ExecutorService that spawns at least one new virtual thread for every HTTP serving task. Only that can increase the number of threads sufficiently to improve throughput.
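
A minimal sketch of that unpooled, thread-per-task style, assuming JDK 21 or later (on JDK 19 and 20 virtual threads are a preview feature and need --enable-preview). Executors.newVirtualThreadPerTaskExecutor() is the JDK factory for such an executor. The handle method, the 50 ms sleep and the class name are stand-ins invented for the example, not how Jetty or any other server wires this up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerRequestThreads {
    /** Stand-in for handling one HTTP request; the sleep models blocking I/O. */
    static String handle(int requestId) throws InterruptedException {
        Thread.sleep(50);
        return "response-" + requestId + " virtual=" + Thread.currentThread().isVirtual();
    }

    /** Runs every request on its own new virtual thread; nothing is pooled. */
    static List<String> serve(int requests) throws Exception {
        List<String> responses = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                futures.add(executor.submit(() -> handle(id)));
            }
            for (Future<String> future : futures) {
                responses.add(future.get());
            }
        }
        return responses;
    }

    public static void main(String[] args) throws Exception {
        List<String> responses = serve(10_000);
        System.out.println(responses.size() + " requests served, first: " + responses.get(0));
    }
}
```

The number of threads here follows the number of in-flight requests rather than a fixed pool size N, which is the property the paragraph above relies on.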






 

 

> unusual for an application that has any virtual threads to have fewer than, say, 10,000

 

In the case of HTTP server use of virtual threads, I feel that "unusual" is too strong. That is, when we are using virtual threads for application code handling of HTTP request/response (like Jetty + Loom), I suspect this is frequently going to operate with fewer than 1000 concurrent requests per server instance.

 

1000 concurrent requests would likely translate to more than 10,000 virtual threads due to fanout (JEPs 425 and 428 cover this). In fact, even without fanout, every HTTP request might wish to spawn more than one thread, for example to have one thread for reading and one for writing. The number 10,000, however, is just illustrative. Clearly, an application with virtual threads will have some large number of threads (significantly larger than applications with just platform threads), because the ability to have a large number of threads is what virtual threads are for.
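
The fanout arithmetic can be checked with a small sketch: if every request runs on its own virtual thread and issues a number of concurrent downstream calls, each on its own virtual thread, the thread count is requests × (1 + fanout). The example assumes JDK 21 or later. The fanout of 10, the 20 ms sleep and the names are illustrative assumptions, and it uses a plain executor rather than the structured concurrency API described in JEP 428.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class FanoutCount {
    /** Serves `requests` requests, each fanning out to `fanout` subtasks; returns how many virtual threads ran. */
    static int threadsUsed(int requests, int fanout) throws Exception {
        AtomicInteger started = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<?>> handlers = new ArrayList<>();
            for (int r = 0; r < requests; r++) {
                handlers.add(executor.submit(() -> {
                    started.incrementAndGet();          // the request's own thread
                    List<Future<?>> calls = new ArrayList<>();
                    for (int c = 0; c < fanout; c++) {
                        calls.add(executor.submit(() -> {
                            started.incrementAndGet();  // one thread per outgoing call
                            Thread.sleep(20);           // stand-in for a blocking downstream call
                            return null;
                        }));
                    }
                    for (Future<?> call : calls) {
                        call.get();                     // wait for all subtasks of this request
                    }
                    return null;
                }));
            }
            for (Future<?> handler : handlers) {
                handler.get();
            }
        }
        return started.get();
    }

    public static void main(String[] args) throws Exception {
        // 1,000 requests with a fanout of 10 each: 1,000 * (1 + 10) = 11,000 virtual threads.
        System.out.println(threadsUsed(1_000, 10));
    }
}
```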

 

The important point is that tooling needs to adapt to a high number of threads, which is why we’ve added a tool that’s designed to make sense of many threads, where jstack might not be very useful.

 

— Ron

 
