Re: jstack, profilers and other tools

Alex Otenko oleksandr.otenko at gmail.com
Thu Jul 28 20:31:41 UTC 2022


Hi Ron,

The claim in the JEP is the same as in this email thread, so that is not
much help. But I no longer need help, because I have found the explanation
of how thread count, response time, request rate and thread-per-request
are connected.

Now, to what bothers me about the claims.

Little's law connects throughput to concurrency. We agreed it says nothing
about thread count. That is a disconnect between the claim about threads
and the assertion that Little's law dictates it.
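
To be precise about what the law does say (numbers mine, purely for
illustration): L = λ·W, where λ is the throughput and W the mean response
time. At λ = 1000 requests/s and W = 0.1s, L = 100 requests are in the
system on average. The law counts requests, not threads; threads enter
only through the separate thread-per-request premise that each in-flight
request holds at least one thread.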

There's also the assumption that response time remains constant, but that
is a mighty assumption: response time changes with thread count.

There's also the claim of needing more threads. That, too, is not something
that follows from thread-per-request. Essentially, thread-per-request is a
constructive proof of having an infinite number of threads; how can one
want more? And from a different angle: the number of threads needed under
thread-per-request does not depend on throughput at all.

Just consider what the request rate means. It means that for however small
a time frame and however large a request count you choose, there is a
nonzero probability of that many requests arriving in that frame.
Consequently the number of threads needed is arbitrarily large for any
throughput, which is just another way of saying that the number of threads
is effectively infinite, and there is no point trying to connect it to
Little's law. The request rate does not change the number of threads that
can exist at any given time; it only changes the probability of observing
any particular number of them in a fixed period of time.
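
For instance, under a Poisson arrival model (my modelling assumption, not
something the JEP states), the probability of seeing k arrivals in a
window of length t is e^(-λt)·(λt)^k / k!, which is strictly positive for
every k. So no finite thread count covers all cases at any rate λ; raising
λ only shifts which counts are likely.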

All of this is only criticism of the formal mathematical claims made here
and in the JEP. Nothing needs doing if no one is interested in the formal
claims being perfect.


Alex

On Tue, 26 Jul 2022, 15:41 Ron Pressler, <ron.pressler at oracle.com> wrote:

>
>
> On 26 Jul 2022, at 14:33, Alex Otenko <oleksandr.otenko at gmail.com> wrote:
>
> Hi Ron,
>
> I think I can verbalize what bothered me all along.
>
> I wish someone made a distinction between:
>
> Offered traffic - the actual term; determined by the time one thread
> spends on a request.
>
> Capacity - I don't think this is the actual term. This is the actual
> thread count. If this is at or below the offered traffic, the system is not
> stable. You can increase capacity until you get to thread-per-request,
> which probably corresponds to +oo.
>
>
> I don’t understand this sentence.
>
>
> Concurrency as used in Little's law. This is measured in the same units as
> offered traffic, but is not the same as offered traffic, because the time
> used here is the actual response time, which includes all sorts of waits.
>
>
> None of that matters. Little’s law is a mathematical theorem about some
> unit arriving at some processing centre — a customer, a request, whatever —
> and for *that* unit, the theorem relates the average latency of performing
> that operation and the average rate of arrival of those things to the
> average number of those things existing concurrently in the centre. So, we
> pick requests as the things we look at, and everything follows. The theorem
> tells us how many requests, on average, are concurrently being processed,
> and since we’re assuming thread-per-request, this tells us how many threads
> are active, because *by definition* of thread-per-request a concurrent
> request takes at least one thread.
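>
> As a minimal sketch of that premise (the port and handler here are
> illustrative, and this uses the preview API from JEP 425, so it needs
> --enable-preview on JDK 19):
>
>     import java.io.IOException;
>     import java.net.ServerSocket;
>     import java.net.Socket;
>     import java.util.concurrent.Executors;
>
>     public class ThreadPerRequestServer {
>         public static void main(String[] args) throws IOException {
>             try (ServerSocket server = new ServerSocket(8080);
>                  var executor = Executors.newVirtualThreadPerTaskExecutor()) {
>                 while (true) {
>                     Socket socket = server.accept();
>                     // one new virtual thread per incoming request
>                     executor.submit(() -> handle(socket));
>                 }
>             }
>         }
>
>         static void handle(Socket socket) {
>             try (socket) {
>                 // blocking I/O parks the virtual thread, not an OS thread
>                 socket.getOutputStream().write("hello\n".getBytes());
>             } catch (IOException e) {
>                 // illustrative only; real code would log the failure
>             }
>         }
>     }
>
> Under load, the number of live threads in this server is exactly the
> number of in-flight requests; nothing in the code "adds" threads.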
>
>
> The confusing bit then is that we can't be talking of concurrency before
> capacity exceeds offered traffic, because the system is not stable, and
> after that adding threads only decreases concurrency.
>
>
>
> No one is talking about *adding* threads. The number of threads grows
> because rising throughput *makes it grow* in a thread-per-request system.
> Also, we’re not interested in what’s happening in a system in the process
> of crashing.
>
>
>
> Then also the pragmatic angle: at which point, or for what systems, should
> I say "yeah, we can't do this without virtual threads", and at which point
> should I say "thread-per-request is the way to go"?
>
>
> As explained in JEP 425, there is absolutely no such point: Picking
> thread-per-request is the premise we’re taking as a given, not the
> conclusion. I.e. we assume thread-per-request, and the conclusion is that
> we need many threads. Virtual threads are designed to allow
> thread-per-request servers to achieve the maximum throughput allowable by
> the hardware.
>
> Why do so many people want to pick thread-per-request? Because
> thread-per-request is the model that allows representing your application’s
> unit of concurrency with the platform’s unit of concurrency, and the Java
> platform has only one such unit: the thread. I.e. it is the only model that
> the language and the platform fully support. That is why asynchronous APIs
> are essentially DSLs and do not rely on the language’s basic
> composition constructs (loops, try/catch, try-with-resources etc.), why JFR
> yields less-than-informative profiles for such programs, and why debuggers
> can’t step through the logical flow of such programs.
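>
> To illustrate the contrast (a toy example; the HTTP calls and the string
> accumulation are made up for the purpose):
>
>     import java.net.URI;
>     import java.net.http.HttpClient;
>     import java.net.http.HttpRequest;
>     import java.net.http.HttpResponse;
>     import java.util.List;
>     import java.util.concurrent.CompletableFuture;
>
>     public class StylesDemo {
>         static final HttpClient CLIENT = HttpClient.newHttpClient();
>
>         // Thread-per-request style: the logic uses the language's own
>         // constructs -- a plain loop, a blocking call, plain try/catch.
>         static String fetchAll(List<URI> uris) {
>             StringBuilder sb = new StringBuilder();
>             for (URI uri : uris) {
>                 try {
>                     HttpResponse<String> r = CLIENT.send(
>                             HttpRequest.newBuilder(uri).build(),
>                             HttpResponse.BodyHandlers.ofString());
>                     sb.append(r.body());
>                 } catch (Exception e) {
>                     sb.append("error");
>                 }
>             }
>             return sb.toString();
>         }
>
>         // Asynchronous style: the same logic re-encoded in a DSL of
>         // combinators; loops and try/catch no longer govern the I/O.
>         static CompletableFuture<String> fetchAllAsync(List<URI> uris) {
>             CompletableFuture<String> acc = CompletableFuture.completedFuture("");
>             for (URI uri : uris) {
>                 acc = acc.thenCompose(s -> CLIENT
>                         .sendAsync(HttpRequest.newBuilder(uri).build(),
>                                    HttpResponse.BodyHandlers.ofString())
>                         .thenApply(HttpResponse::body)
>                         .exceptionally(e -> "error")
>                         .thenApply(body -> s + body));
>             }
>             return acc;
>         }
>     }
>
> The two methods compute the same thing; only the first can use the
> language's control flow for its I/O.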
>
> So there is absolutely no point at which you’d say “we must do it like
> that”. But *if* you choose to do it like that then you’d need virtual
> threads if your concurrency exceeds ~1000.
>
> Thread-per-request or async are neither good nor bad; they’re just
> different aesthetic styles for writing code. But Java only fully supports
> the former, and *IF* you choose to do it that way, THEN you’ll need virtual
> threads. In other words, a person who should be interested in virtual
> threads is one who thinks it would be nice to write code in the
> thread-per-request style, but doesn’t want to give up on throughput. I
> think the JEP is clear on that.
>
>
> The answer to the first question is: "when your offered traffic is in
> thousands per CPU". Why CPU specifically? Because otherwise something else
> is the bottleneck. This means 100ms wait per 100 microseconds of on-cpu
> time. I don't know how common this is in the world, but in my practice this
> never was the case - because 100 microseconds is about as much as a REST
> endpoint takes to produce a few KB of JSON, and 100ms wait is an eternity
> in comparison. Why thousands? Because we had 200 threads per CPU and sync
> code, and were fine. Maybe it's gross, but virtual threads are not the
> killer feature in those cases. OK, I haven't seen the world, but I reckon
> the back-of-the-envelope working-out is OK.
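>
> To spell out that envelope (my own rough numbers): at 100 microseconds of
> CPU time per request, one CPU saturates at about 10,000 requests/s. If each
> request also waits 100ms, Little's law gives roughly 10,000 x 0.1 = 1000
> requests in flight, i.e. about a thousand threads per CPU before the CPU,
> not the thread count, is the bottleneck.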
>
>
> If what you’re claiming is that simple thread-per-request servers using OS
> threads are satisfactory for virtually all systems, then that has long
> since been established to not be the case. There’s just no point arguing
> over this. As I think I already told you, 100ms wait is the total of all
> waits, even if done in parallel, and it is quite common because quite a lot
> of servers do outgoing calls to scores of services. It is very common for a
> single incoming request to do 20 outgoing I/O requests if not more.
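>
> To put numbers on that (illustrative, not measured): twenty outgoing calls
> of ~5ms each already total 100ms of waiting per request. At a modest
> 10,000 requests/s, Little's law then gives 10,000 x 0.1 = 1000 requests in
> flight, i.e. on the order of a thousand live threads, more than a typical
> pool of OS threads provides.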
>
>
> The second question is then not really based on performance, rather on the
> architectural differences that thread-per-request offers. One less thing to
> tune is good. The reason this is not a performance question is that adding
> threads gets response time indistinguishably close to the minimum possible
> well before you get to +oo.
>
>
> As long as you’re talking about “adding threads” I can tell you’re not
> getting this. No one is suggesting adding threads.
>
> If you pick thread-per-request, then the number of threads grows with
> throughput, and that’s why you need virtual threads.
>
>
>
> Alex
>
> On Tue, 26 Jul 2022, 10:15 Ron Pressler, <ron.pressler at oracle.com> wrote:
>
>> Let me make this as simple as I think I can:
>>
>> 1. We are talking *only* about a server that creates a new thread for
>> every incoming request. That’s how we define “thread-per-request”. If what
>> you have in mind is a server that operates in any other way, you’re
>> misunderstanding the conversation.
>>
>> 2. While artificially increasing the number of threads in that server
>> would do nothing, whatever that system’s latency is, whatever its resource
>> utilisation is, a rising rate of requests *will* result in that server
>> having more threads that are alive concurrently (by virtue of how it
>> operates, as a rising request rate will not cause that server to reduce
>> latency); i.e. it’s the increased throughput that causes the number of
>> threads to rise, not vice-versa. Therefore, to cope with high request rates
>> that server must have the capacity for many threads.
>>
>> That is all, and that is how we know that a server using virtual threads
>> would normally have a great many of them: because virtual threads are used
>> by thread-per-request servers with high throughputs. Other things will
>> happen too, and other concurrency limits will eventually come into play,
>> but this — that the number of threads will rise — is necessarily true.
>>
>> Now we can get to what I think your actual point is. You believe that the
>> server we’re talking about must be at some kind of a disadvantage compared
>> to other kinds of servers. I understand you want me to convince you that is
>> not the case, but the only way I can do that at this point is for you
>> to actually write a server in this style, employing virtual threads, and
>> then report what problems and limitations you actually run into, not
>> hypothesise about problems you think you might run into. That will help you
>> understand how virtual threads are used, and will help us find potentially
>> missing APIs.
>>
>> — Ron
>>
>