Experience using virtual threads in EA 23-loom+4-102

robert engels rengels at ix.netcom.com
Fri Jun 21 17:29:34 UTC 2024


Hi,

Just an FYI: until you get to the order of 1k, 10k, etc. concurrent clients, I would expect platform threads to outperform virtual threads by quite a bit (at best they will perform the same). Modern OSes routinely handle thousands of active threads. (My OSX desktop with 4 true cores has nearly 5k threads running.)

Also, if you are already saturating your CPUs or local IO bus, adding more threads isn’t going to help. Virtual threads shine when the request handler is fanning out to multiple remote services.
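
To illustrate the fan-out case, here is a minimal sketch, assuming JDK 21 or later. The class name, the service names, and callRemote (a stand-in that just sleeps to simulate a blocking remote call) are all hypothetical and not taken from Matthew's application:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOutSketch {

    // Hypothetical stand-in for a blocking call to a remote service.
    public static String callRemote(String service) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(50)); // simulated network latency
        return service + ":ok";
    }

    // Fans one request out to several services, one virtual thread per call,
    // and returns the replies in the same order as the input list.
    public static List<String> fanOut(List<String> services) throws Exception {
        // try-with-resources: close() waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (String service : services) {
                futures.add(executor.submit(() -> callRemote(service)));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> future : futures) {
                replies.add(future.get()); // blocks only this caller, cheaply if it is virtual
            }
            return replies;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fanOut(List.of("inventory", "pricing", "shipping")));
    }
}
```

The three simulated calls overlap, so the handler waits for roughly one call's latency rather than the sum. With only a handful of concurrent requests a fixed pool of platform threads would do the same job; the difference should only start to matter when many such handlers are blocked at once.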

Regards,
Robert


> On Jun 21, 2024, at 12:05 PM, Matthew Swift <matthew.swift at gmail.com> wrote:
> 
> Hello again,
> 
> As promised, here is my second (shorter I hope!) email sharing feedback on the recent Loom EA build (23-loom+4-102). It follows up on my previous email: https://mail.openjdk.org/pipermail/loom-dev/2024-June/006788.html
> 
> I performed some experiments using the same application described in my previous email. However, in order to properly test the improvements to Object monitors (synchronized blocks and Object.wait()) I reverted all of the thread-pinning related changes that I had made in order to support virtual threads with JDK21. Specifically, I reverted the changes converting uses of monitors to ReentrantLock.
> 
> I'm pleased to say that this EA build looks extremely promising! :-) 
> 
> ### Experiment #1: read stress test
> 
> * platform threads: 215K/s throughput, CPU 14% idle
> * virtual threads: 235K/s throughput, CPU 5% idle.
> 
> Comment: there's a slight throughput improvement, but CPU utilization is slightly higher too. Presumably this is due to the number of carrier threads being closely matched to the number of CPUs (I noticed significantly less context switching with virtual threads).
> 
> ### Experiment #2: heavily indexed write stress test, with 40 clients
> 
> * platform threads: 9300/s throughput, CPU 27% idle
> * virtual threads: 8800/s throughput, CPU 24% idle.
> 
> Comment: there is a ~5% performance degradation using virtual threads. This is better than the degradation I observed in my previous email after switching to ReentrantLock though.
> 
> ### Experiment #3: extremely heavily indexed write stress test, with 120 clients
> 
> * platform threads: 1450/s throughput
> * virtual threads: 1450/s throughput (i.e. about the same).
> 
> Comment:
> 
> This test is intended to stress the internal locking mechanisms as much as possible and expose any pinning problems.
> With JDK21 virtual threads the test would sometimes deadlock and thread dumps would show 100+ fork join carrier threads.
> This is no longer the case with the EA build. It looks really solid.
> 
> This test does expose one important difference between platform threads and virtual threads though. Let's take a look at the response times:
> 
> Platform threads:
> 
> -------------------------------------------------------------------------------
> |     Throughput    |                 Response Time                |          | 
> |    (ops/second)   |                (milliseconds)                |          | 
> |   recent  average |   recent  average    99.9%   99.99%  99.999% |  err/sec | 
> -------------------------------------------------------------------------------
> ...
> |   1442.6   1606.6 |   83.097   74.683   448.79   599.79   721.42 |      0.0 | 
> |   1480.8   1594.0 |   81.125   75.282   442.50   599.79   721.42 |      0.0 |
> 
> Virtual threads:
> 
> -------------------------------------------------------------------------------
> |     Throughput    |                 Response Time                |          | 
> |    (ops/second)   |                (milliseconds)                |          | 
> |   recent  average |   recent  average    99.9%   99.99%  99.999% |  err/sec | 
> -------------------------------------------------------------------------------
> ...
> |   1445.4   1645.3 |   81.375   72.623  3170.89  4798.28  8925.48 |      0.0 | 
> |   1442.2   1625.0 |   81.047   73.371  3154.12  4798.28  6106.91 |      0.0 | 
> 
> The outliers with virtual threads are much, much higher. Could this be due to potential starvation when rescheduling virtual threads in the fork join pool?
> 
> Cheers,
> Matt
> 
