[External] : Re: jstack, profilers and other tools

Ron Pressler ron.pressler at oracle.com
Tue Aug 2 21:31:29 UTC 2022


Thinking more about my previous response and your question, there’s a point that I think is worth highlighting.

We’d like to think of threads not as a resource but as an application construct; not as a piece of infrastructure, but as a piece of business logic. Why? Because that’s what Java’s basic design encourages, even though it hasn’t been practically feasible for a while. Indeed, what I always say when explaining how to adopt virtual threads is that virtual threads don’t replace platform threads — they replace tasks.

You don’t ask what happens if the application adds more tasks, because the question is meaningless. The number of tasks in your application is however many it needs, as governed by the application logic; no more, no fewer. We want the same for threads. The only question is *can* we have as many threads as that (i.e. as many threads as tasks). With virtual threads the answer is yes.
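To make “threads replace tasks” concrete, here is a minimal sketch (mine, not from the original message) using the virtual-thread-per-task executor that became standard in JDK 21; the task body and the task count of 10,000 are illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class TasksAsThreads {
    // Runs taskCount tasks, one virtual thread each, and reports how many finished.
    static int runTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        // The thread count is dictated by the application's task count,
        // not by a pool size chosen as an infrastructure parameter.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, taskCount).forEach(i ->
                executor.execute(completed::incrementAndGet));
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(10_000)); // prints 10000
    }
}
```

The point of the sketch is that nothing in it sizes a pool: the number of threads simply equals the number of tasks the application happens to have.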

— Ron

On 2 Aug 2022, at 21:56, Ron Pressler <ron.pressler at oracle.com> wrote:



On 2 Aug 2022, at 20:53, Alex Otenko <oleksandr.otenko at gmail.com> wrote:

I think you have two different meanings of thread-per-request. What most people write is a pool of threads with a queue. But you also have a different notion, where thread-per-request really creates a new thread for every request as and when it arrives. It's hard to follow which one you mean at any given point.

All of these definitions amount to the exact same result in this context, so you can use them interchangeably. We define thread-per-request to mean a server where a request exclusively consumes at least one thread for the entire time it is being processed.

We get the exact same result for servers that obtain a thread from a pool to service a request and implement a queue in software where requests wait before being assigned to a thread. Why? Because the easiest way to see what happens is still to consider the system’s boundary from the moment a thread is assigned, so that W is the time spent *in the thread* (excluding the queue), and then a mathematical theorem tells us what happens in the queue: no matter the queue’s capacity — zero or ten billion — if we have sufficient threads "inside", then the total number of requests that have to wait outside, whether they are Java objects in that queue, bytes in a network buffer, or people clicking refresh, will not grow indefinitely, and so the behaviour of our server will be stable.
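As a back-of-the-envelope illustration of the theorem (the numbers are mine, purely illustrative): Little’s law says L = λ·W, so if requests arrive at λ = 1000/s and each spends W = 0.25 s in its thread, then on average L = 250 requests are inside the system concurrently, and the server needs at least that many threads to stay stable:

```java
public class LittlesLaw {
    // Little's law: L = lambda * W, the average number of requests
    // concurrently inside whatever system boundary we delineate.
    static double concurrentRequests(double arrivalRatePerSec, double timeInSystemSec) {
        return arrivalRatePerSec * timeInSystemSec;
    }

    public static void main(String[] args) {
        double lambda = 1000.0; // arrival rate, requests per second (illustrative)
        double w = 0.25;        // seconds a request spends in its thread (illustrative)
        System.out.println(concurrentRequests(lambda, w)); // prints 250.0
    }
}
```

Note that W here is deliberately the in-thread time only, matching the boundary drawn above: the theorem then speaks about the queue forming outside that boundary.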


I understand that when you propose to draw a line on the other side of the queue you can describe what happens to requests. But because you vigorously objected to including the queue wait, I want to see how the description of the system changes if I add more threads. If there is no way to distinguish the two (one thread vs two threads), I'll be inclined to use a description where I can see the difference - for example, the description of the system where queue wait is included. I see no harm in that, only clarity.

First, whichever queue you include in the system, the queue that Little’s law applies to with respect to stability is not that one. Again, an unstable system where the formula doesn’t hold will have a queue forming *outside* of it (regardless of whether that queue is implemented digitally or not). Therefore, Little’s law always teaches us something about a queue outside whatever system we delineate. If you choose to describe a system that contains an internal queue (which, therefore, will not be the queue that defines stability), then any requests waiting in that queue — requests not represented by threads — will be counted toward L, as they’re inside the system.

Second, I think I must have repeated the sentence “no one is talking about adding threads” in this discussion at least five times. I don’t care what happens when you add more threads (beyond what’s needed, that is), because the number of threads in the servers I’m interested in is dictated by how many requests you have and what they’re doing; i.e. it is a user-facing construct, not a feature of the system’s internal design. You get one thread for the request, and then one for each outgoing I/O operation, or something along those lines. I.e. you have as many threads as you need: any fewer and you become unstable, and what would it even mean to add threads to such a system beyond what it needs? When threads are used in this way, the question of “what happens when you add more?” makes as much sense as asking “what happens when you add more strings?”
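A sketch of the shape described here: one virtual thread per request, plus one per outgoing I/O operation. The handler and the two fetch methods are hypothetical stand-ins for real blocking I/O calls, not anything from the original message:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPerRequest {
    static final ExecutorService vthreads = Executors.newVirtualThreadPerTaskExecutor();

    // Called once per incoming request, on its own virtual thread.
    static String handleRequest(String requestId) throws Exception {
        // Fan out: one additional virtual thread per outgoing I/O call.
        Future<String> a = vthreads.submit(() -> fetchFromServiceA(requestId));
        Future<String> b = vthreads.submit(() -> fetchFromServiceB(requestId));
        return a.get() + "/" + b.get(); // blocking is cheap on virtual threads
    }

    // Hypothetical stand-ins for blocking calls to downstream services.
    static String fetchFromServiceA(String id) { return "A:" + id; }
    static String fetchFromServiceB(String id) { return "B:" + id; }

    public static void main(String[] args) throws Exception {
        System.out.println(vthreads.submit(() -> handleRequest("r1")).get()); // prints A:r1/B:r1
    }
}
```

The thread count in such a server rises and falls with the request load and the I/O fan-out; it is not a tunable knob one would "add more" of.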

The question of interest to me in this context is how that kind of server can make optimal use of the hardware, and the kind of feedback we’re looking for is of the form, “I’ve written a thread-per-request server using virtual threads; here’s what went well and what didn’t.”


As for sizing - there are formulas that appear very similar to Little's law, but they make explicit which time is used, so there really is no confusion. For example, arrival_rate/departure_rate_per_thread tells you how many threads you need "without" Little's law. (In quotes, because 1/departure_rate_per_thread is really the time it takes one thread to process one request - queue wait conveniently left out.)

I don’t see any confusion about this theorem. It is very widely used in many areas of system design — both in software and in operations research — and it is the theorem I use to explain the need for virtual threads, as I find it simple and instructive. But if other formulas suit your needs better, use those.
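Alex’s sizing formula, worked through with illustrative numbers of my own: at 1000 requests/s arriving and each thread completing 4 requests/s (i.e. 0.25 s of in-thread time per request), arrival_rate/departure_rate_per_thread gives 1000 / 4 = 250 threads, the same count Little’s law yields when W is taken as the in-thread time:

```java
public class ThreadSizing {
    // threads needed = arrival_rate / departure_rate_per_thread
    // (equivalently arrival_rate * service_time, since the per-thread
    // departure rate is 1 / service_time)
    static double threadsNeeded(double arrivalRatePerSec, double departuresPerThreadPerSec) {
        return arrivalRatePerSec / departuresPerThreadPerSec;
    }

    public static void main(String[] args) {
        double arrivalRate = 1000.0; // requests per second (illustrative)
        double perThreadRate = 4.0;  // each thread finishes 4 requests/s, i.e. 0.25 s each
        System.out.println(threadsNeeded(arrivalRate, perThreadRate)); // prints 250.0
    }
}
```

The two formulations agree because 1/departure_rate_per_thread is exactly the in-thread time W, so the formula is Little’s law with the queue wait excluded by construction.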

— Ron
