Performance of pooling virtual threads vs. semaphores

Thu May 30 01:19:51 UTC 2024

I am guessing the JVM is optimizing away the call to fibonacci() since the results are not used, so all the tasks are doing is sleeping and waking up.

Not a good test.
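
One way to guard against that kind of dead-code elimination is to feed every result into a value that is read at the end of the run. The sketch below is only an illustration: FibonacciSink, SINK, task() and the iterative fibonacci() are hypothetical stand-ins, not the code from the benchmark under discussion. A harness such as JMH, with its Blackhole, is the more rigorous option.

```java
import java.util.concurrent.atomic.LongAdder;

final class FibonacciSink {
    // Shared accumulator. Its sum is read and printed at the end of the run,
    // so the results of fibonacci() are observably used.
    static final LongAdder SINK = new LongAdder();

    // Hypothetical stand-in for the benchmark's fibonacci(); iterative for brevity.
    static long fibonacci(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    // Task body: do the CPU work and publish the result instead of dropping it.
    static void task(int n) {
        SINK.add(fibonacci(n));
    }

    public static void main(String[] args) {
        // Vary the input so the work is not a single compile-time constant.
        for (int i = 0; i < 1_000; i++) {
            task(20 + (i % 10));
        }
        System.out.println("checksum = " + SINK.sum());
    }
}
```

In a real test, task() would be the body each thread runs, and the checksum would be printed once after all tasks finish.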

> On May 29, 2024, at 8:15 PM, robert engels <rengels at ix.netcom.com> wrote:
> 
> You can see from the flame graph that the system is spending all of its time scheduling VTs and sleeping - the “work” is negligible. This is the carrier thread on my system with the most “work” (execute()) - most of the carrier threads are doing no work at all. Every carrier is spending its time going to sleep and waking back up.
> 
> <PastedGraphic-1.png>
>> On May 29, 2024, at 7:10 PM, robert engels <rengels at ix.netcom.com> wrote:
>> 
>> Ignore that - the first set of metrics are in ms.
>> 
>> But the wall time for a given numTasks is nearly identical regardless of scenario.
>> 
>>> On May 29, 2024, at 7:09 PM, robert engels <rengels at ix.netcom.com> wrote:
>>> 
>>> But looking at the numbers some more, they don’t make sense. The total times go way down moving from 10_000 tasks to 100_000 tasks. Then they go up again (expected) moving from 100_000 to 1_000_000.
>>> 
>>> I don’t think the wall time should ever go down when the number of tasks goes up, so something doesn’t seem right.
>>> 
>>>> On May 29, 2024, at 6:03 PM, Attila Kelemen <attila.kelemen85 at gmail.com> wrote:
>>>> 
>>>> Yeah, just realized that and sent my email pretty much literally a second after your email :)
>>>> 
>>>> Anyway, yes, in theory 1M threads are contending for the semaphore, but I don't think that should be a problem: the contention is rather theoretical, since those "contending" VTs are just sitting in a queue, and after each release only one of them should be woken up. Also, I think Liam's comparison is fair, because neither of the other two methods pushes back, so pushing back only in the VT version would be very unfair.
>>>> 
>>>> robert engels <rengels at ix.netcom.com> wrote (on Thu, May 30, 2024, 0:58):
>>>> I remember that too, but in this case I don’t think it is the cause.
>>>> 
>>>> In the bounded/pooled thread scenario - you are only scheduling 600 threads (either platform or virtual).
>>>> 
>>>> In “scenario #2” all 1M virtual threads are created and are contending on a semaphore. This contention on a single resource does not occur in the other scenarios - this will lead to thrashing of the scheduler.
>>>> 
>>>> I suspect if it is run under a profiler it will be obvious. With 128 carrier threads, you have increased the contention over a typical machine by an order of magnitude.
>>>> 
>>> 
>> 
> 
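
For readers following the quoted thread, the semaphore-bounded arrangement ("scenario #2") can be sketched roughly as below. This is an assumption-laden illustration, not Liam's benchmark: SemaphoreBoundSketch, Result and run() are hypothetical names, and the 1 ms sleep merely stands in for the task body. Every task gets its own thread from the supplied executor, and a single Semaphore limits how many are inside the guarded section at once while the rest wait in the semaphore's queue.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SemaphoreBoundSketch {
    // Holder for what the run observed (hypothetical, for illustration only).
    static final class Result {
        final int completed;
        final int maxInFlight;

        Result(int completed, int maxInFlight) {
            this.completed = completed;
            this.maxInFlight = maxInFlight;
        }
    }

    // Submits numTasks tasks to a thread-per-task executor; at most `permits`
    // of them run the guarded section concurrently.
    static Result run(ExecutorService perTaskExecutor, int numTasks, int permits) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxInFlight = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < numTasks; i++) {
            perTaskExecutor.execute(() -> {
                try {
                    semaphore.acquire(); // all tasks contend on this one semaphore
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    // Record the highest concurrency seen inside the guarded section.
                    maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    Thread.sleep(1); // placeholder for the real task body
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    semaphore.release(); // lets one queued waiter proceed
                }
            });
        }

        perTaskExecutor.shutdown();
        try {
            perTaskExecutor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Result(completed.get(), maxInFlight.get());
    }
}
```

On JDK 21 or later, passing Executors.newVirtualThreadPerTaskExecutor() gives one virtual thread per task. The bounded/pooled scenario would instead submit the same task bodies, without the semaphore, to a fixed pool such as Executors.newFixedThreadPool(600), so only 600 threads are ever scheduled.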
