Performance of pooling virtual threads vs. semaphores

Thu May 30 01:15:14 UTC 2024

You can see from the flame graph that the system is spending all of its time scheduling VTs and sleeping - the “work” is negligible. This is the carrier thread on my system with the most “work” (execute()) - most of the carrier threads are doing no work at all. Every carrier is spending its time going to sleep and waking back up.

> On May 29, 2024, at 7:10 PM, robert engels <rengels at ix.netcom.com> wrote:
> 
> Ignore that - the first set of metrics are in ms.
> 
> But the wall time for a given numTasks is nearly identical regardless of scenario.
> 
>> On May 29, 2024, at 7:09 PM, robert engels <rengels at ix.netcom.com> wrote:
>> 
>> But looking at the numbers some more, they don’t make sense. The total times go way down moving from 10_000 tasks to 100_000 tasks. Then they go up again (expected) moving from 100_000 to 1_000_000.
>> 
>> I don’t think the wall time should ever go down when the number of tasks goes up, so something doesn’t seem right.
>> 
>>> On May 29, 2024, at 6:03 PM, Attila Kelemen <attila.kelemen85 at gmail.com> wrote:
>>> 
>>> Yeah, just realized that and sent my email pretty much literally a second after your email :)
>>> 
>>> Anyway, while in theory 1M threads are contending for the semaphore, I don't think that should be a problem: the contention is rather theoretical, since those "contending" VTs are just sitting in a queue, and after each release only one of them should be released. Also, I think Liam's comparison is fair, because neither of the other two methods pushes back, so pushing back only in the VT version would be very unfair.
>>> 
>>> robert engels <rengels at ix.netcom.com> wrote (on Thu, May 30, 2024, 0:58):
>>> I remember that too, but in this case I don’t think it is the cause.
>>> 
>>> In the bounded/pooled thread scenario - you are only scheduling 600 threads (either platform or virtual).
>>> 
>>> In “scenario #2” all 1M virtual threads are created and are contending on a semaphore. This contention on a single resource does not occur in the other scenarios - this will lead to thrashing of the scheduler.
>>> 
>>> I suspect if it is run under a profiler it will be obvious. With 128 carrier threads, you have increased the contention over a typical machine by an order of magnitude.
>>> 
>> 
> 
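
For readers without the original benchmark at hand, the two virtual-thread scenarios discussed in the quoted messages can be sketched roughly as below. This is not Liam's actual code: the class name PoolVsSemaphore, the methods runPooled and runSemaphore, and the 1 ms sleep used as the task body are all hypothetical stand-ins. Only the limit of 600 comes from the thread. It assumes Java 21 or later.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolVsSemaphore {
    // Concurrency bound mentioned in the thread.
    static final int LIMIT = 600;

    // Hypothetical task body: a short blocking call standing in for real work.
    static void work(AtomicInteger done) {
        try {
            Thread.sleep(Duration.ofMillis(1));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        done.incrementAndGet();
    }

    // Pooled scenario: only LIMIT virtual threads ever exist;
    // the remaining tasks wait in the executor's queue as plain objects.
    static int runPooled(int numTasks) {
        AtomicInteger done = new AtomicInteger();
        try (ExecutorService pool =
                 Executors.newFixedThreadPool(LIMIT, Thread.ofVirtual().factory())) {
            for (int i = 0; i < numTasks; i++) {
                pool.submit(() -> work(done));
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }

    // Semaphore scenario: one virtual thread per task is created up front;
    // all of them park on the same semaphore until a permit is free.
    static int runSemaphore(int numTasks) {
        AtomicInteger done = new AtomicInteger();
        Semaphore permits = new Semaphore(LIMIT);
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < numTasks; i++) {
                exec.submit(() -> {
                    permits.acquireUninterruptibly();
                    try {
                        work(done);
                    } finally {
                        permits.release();
                    }
                });
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        int numTasks = 100_000; // illustrative size, not a figure from the thread
        long t0 = System.nanoTime();
        runPooled(numTasks);
        long t1 = System.nanoTime();
        runSemaphore(numTasks);
        long t2 = System.nanoTime();
        System.out.printf("pooled: %d ms, semaphore: %d ms%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
```

The structural difference is where the waiting tasks live: in the pooled version they sit in the executor's queue and never become threads until a worker picks them up, while in the semaphore version every task is already a parked virtual thread that the scheduler has to wake and run when a permit is released.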

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.png
Type: image/png
Size: 245369 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/loom-dev/attachments/20240529/7b6277df/PastedGraphic-1-0001.png>