<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-2022-jp">

<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

</head>

<body dir="ltr">

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

It's important to remember that all of this is capped by Little's Law, queueing theory, and Amdahl's Law (or rather, the Universal Scalability Law). There will always be tension between finishing work that has already started and starting to process new

 work, and the less information there is about what the work entails, the less sophistication can be applied to approach optimality.</div>
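<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
As a rough illustration of the bounds named above, the sketch below writes the three laws out as plain functions. The class name, method names, and all numbers are hypothetical and chosen only for illustration; nothing here was measured in this thread.</div>

```java
// Illustrative sketch only: names and sample numbers are assumptions, not measurements.
final class ScalingLaws {

    /** Little's Law: mean number of requests in the system L = lambda * W. */
    static double littlesLawConcurrency(double arrivalRatePerSec, double meanLatencySec) {
        return arrivalRatePerSec * meanLatencySec;
    }

    /** Amdahl's Law: speedup on n workers when a fraction p of the work can run in parallel. */
    static double amdahlSpeedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    /**
     * Universal Scalability Law: relative capacity at n workers, with a contention
     * coefficient alpha and a coherency (crosstalk) coefficient beta.
     */
    static double uslCapacity(int n, double alpha, double beta) {
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1));
    }

    public static void main(String[] args) {
        // Hypothetical load: 1000 requests/s at 50 ms mean latency.
        System.out.println("in flight: " + littlesLawConcurrency(1000.0, 0.05));
        // Hypothetical coefficients, printed for a few worker counts.
        for (int n : new int[] {1, 10, 30, 100}) {
            System.out.println("n=" + n + " capacity=" + uslCapacity(n, 0.05, 0.001));
        }
    }
}
```

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
With beta set to zero the USL formula reduces to Amdahl's Law with a serial fraction of alpha; with beta above zero, capacity eventually falls as more workers are added, which is why the USL is the stricter bound.</div>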

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

As with all scalability testing, it is important to be able to control tail latencies (i.e. to avoid optimizing for p50 at the expense of p90), which is only possible by either doing less work (load shedding, graceful degradation, or equivalent) or diverting

 work (dynamic scaling, nearest-gateway routing, or equivalent).</div>
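<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
To make "doing less work" concrete, here is a minimal load-shedding sketch: a semaphore caps the number of in-flight requests and anything beyond the cap is rejected immediately instead of queueing. The LoadShedder class and its limit are hypothetical and are not something proposed in this thread.</div>

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper for illustration: caps concurrent work and sheds the excess.
final class LoadShedder {
    private final Semaphore permits;

    LoadShedder(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Runs the task if a permit is free, otherwise returns empty without waiting.
     * The task must return a non-null value.
     */
    <T> Optional<T> tryRun(Supplier<T> task) {
        if (!permits.tryAcquire()) {
            return Optional.empty(); // shed: the caller can answer with a degraded or "busy" response
        }
        try {
            return Optional.of(task.get());
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        LoadShedder shedder = new LoadShedder(2); // assumed limit, for illustration only
        System.out.println(shedder.tryRun(() -> "handled").orElse("shed"));
    }
}
```

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Rejecting early keeps queueing delay out of the tail for the requests that are admitted; the right limit depends on the workload and has to be found by measurement.</div>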

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

At the end of the day, optimization tends to improve efficiency, but there is always going to be a bottleneck that defines the ceiling for what is currently possible. Identifying that bottleneck (remember that it can move, depending on where the most pressure

 is applied) tends to yield the best information on where to spend the most effort.</div>

<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div id="Signature" style="color: inherit;">

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Cheers,<br>

&radic;</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<b><br>

</b></div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<b>Viktor Klang</b></div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Software Architect, Java Platform Group<br>

Oracle</div>

</div>

<div id="appendonsend"></div>

<hr style="display:inline-block;width:98%" tabindex="-1">

<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> loom-dev &lt;loom-dev-retn@openjdk.org&gt; on behalf of Robert Engels &lt;robaho@icloud.com&gt;<br>

<b>Sent:</b> Monday, 24 June 2024 22:12<br>

<b>To:</b> Matthew Swift &lt;matthew.swift@gmail.com&gt;<br>

<b>Cc:</b> loom-dev@openjdk.org &lt;loom-dev@openjdk.org&gt;<br>

<b>Subject:</b> Re: Experience using virtual threads in EA 23-loom+4-102</font>

<div> </div>

</div>

<div class="" style="word-wrap:break-word; line-break:after-white-space">I still think it might be helpful to use virtual threads for all connections (simplicity!) - but when you need to perform CPU-intensive work like hashing, submit a callable/future to a "cpu only"

 executor with a capped number of platform threads and join(). It should be a trivial refactor of the code.
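<div class="">A minimal sketch of that split, assuming Java 21 or later: each connection runs on its own virtual thread, and CPU-heavy work is handed to a fixed pool of platform threads and joined. The CpuOffload class, the pool size, and the expensiveHash stand-in are hypothetical; a real server would call its actual hashing routine (e.g. PBKDF2) there.</div>

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: virtual threads per connection, capped platform pool for CPU work.
final class CpuOffload {

    // "cpu only" executor: platform threads capped at the number of available processors (an assumption).
    private static final ExecutorService CPU_ONLY =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors(), r -> {
                Thread t = new Thread(r, "cpu-only");
                t.setDaemon(true); // do not keep the JVM alive for this demo pool
                return t;
            });

    /** Submits the work to the CPU pool and blocks the calling (virtual) thread until it completes. */
    static <T> T onCpuPool(Supplier<T> work) {
        return CompletableFuture.supplyAsync(work, CPU_ONLY).join();
    }

    /** Stand-in for an expensive hash; only burns CPU deterministically. */
    static int expensiveHash(String input) {
        int h = 17;
        for (int i = 0; i < 1_000_000; i++) {
            h = 31 * h + input.hashCode();
        }
        return h;
    }

    public static void main(String[] args) throws Exception {
        // One virtual thread per connection/request; blocking in join() only parks the virtual thread.
        try (ExecutorService perConnection = Executors.newVirtualThreadPerTaskExecutor()) {
            Callable<Integer> handler = () -> onCpuPool(() -> expensiveHash("secret"));
            System.out.println("hash = " + perConnection.submit(handler).get());
        }
    }
}
```

<div class="">While a virtual thread waits in join() it is parked, so its carrier thread is free to run other virtual threads; the CPU-bound work is limited to the fixed pool.</div>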

<div class=""><br class="">

</div>

<div class="">The problem with using VT for everything is that a VT is not time-sliced, so you could quickly consume all of the carrier threads and then you make no progress on the IO (fan out) requests - which is especially bad if they are simply calling out

 to other servers (less bad if doing lots of local disk io).<br class="">

<div><br class="">

<blockquote type="cite" class="">

<div class="">On Jun 24, 2024, at 12:05 PM, Matthew Swift &lt;<a href="mailto:matthew.swift@gmail.com" class="">matthew.swift@gmail.com</a>&gt; wrote:</div>

<br class="x_Apple-interchange-newline">

<div class="">

<div dir="ltr" class="">

<div class="">Thanks Robert.</div>

<div class=""><br class="">

</div>

<div class="">The main issue we face with our application is that the client load can vary substantially over time. For example, we might experience a lot of CPU-intensive authentication traffic (e.g. PBKDF2 hashing) in the morning, but then a lot of IO-bound

 traffic at other times. It's hard to find the ideal number of worker threads: many threads work well for IO-bound traffic, as you say, but sacrifice performance when the load is more CPU-bound. On my 10 core (20 hyper threads) laptop, I observe a 15-20%

 drop in throughput when subjecting the server to 1200 concurrent CPU-bound requests, but a much smaller drop when using virtual threads:</div>

<div class=""><br class="">

</div>

<div class="">* 10 platform threads: ~260K requests/s (note: this is too few threads for more IO bound traffic)</div>

<div class="">* 40 platform threads: ~220K requests/s</div>

<div class="">* 1200 platform threads: ~220K requests/s (note: this would be the equivalent of one platform thread per request)<br class="">

</div>

<div class="">* virtual threads: 252K requests/s (note: FJ pool defaults to 20 on my laptop - I didn't try disabling hyperthreading).</div>

<div class=""><br class="">

</div>

<div class="">I find the "one size fits all" provided by virtual threads to be much easier for developers and our users alike. I don't have to worry about complex architectures involving split thread pools (one for CPU, one for IO), etc. We also have to deal

 with slow, misbehaving clients, which has meant using async IO and hard-to-debug callback hell :-) All of this goes away with virtual threads, as they will allow us to use simpler blocking network IO and a simple one-thread-per-request design that is much more

 tolerant to heterogeneous traffic patterns.<br class="">

</div>

<div class=""><br class="">

</div>

<div class="">It also opens up the possibility of future enhancements that would definitely shine with virtual threads as you suggest. For example, modern hashing algorithms, such as Argon2, can take hundreds of milliseconds of computation, which is simply

 too costly to scale horizontally in the data layer. We want to offload this to an external elastic compute service, but we could very quickly have thousands of blocked platform threads with response times this high.</div>

<div class=""><br class="">

</div>

<div class="">Cheers,</div>

<div class="">Matt</div>

<div class=""><br class="">

</div>

<div class=""><br class="">

</div>

<div class="x_gmail_quote">

<div dir="ltr" class="x_gmail_attr">On Fri, 21 Jun 2024 at 19:29, robert engels &lt;<a href="mailto:rengels@ix.netcom.com" class="">rengels@ix.netcom.com</a>&gt; wrote:<br class="">

</div>

<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">

<div class="" style="">Hi,

<div class=""><br class="">

</div>

<div class="">Just an FYI: until you get to the order of 1k, 10k, etc. concurrent clients, I would expect platform threads to outperform virtual threads by quite a bit (at best they would be the same). Modern OSes routinely handle thousands of active threads. (My

 OSX desktop with 4 true cores has nearly 5k threads running.)</div>

<div class=""><br class="">

</div>

<div class="">Also, if you can saturate your CPUs or local IO bus, adding more threads isn't going to help. Virtual threads shine when the request handler is fanning out to multiple remote services.</div>

<div class=""><br class="">

</div>

<div class="">Regards,</div>

<div class="">Robert</div>

<div class=""><br class="">

</div>

</div>

</blockquote>

</div>

</div>

</div>

</blockquote>

</div>

<br class="">

</div>

</div>

</body>

</html>