Loom and high performance networking
Robert Engels
robaho at icloud.com
Tue Aug 13 21:13:30 UTC 2024
I found it. It is ForkJoinPool::awaitWork() and it does appear to spin.
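
For reference, here is a rough sketch of the "spin for about the park/unpark cost, then park" idea from my earlier message quoted below. It is not how ForkJoinPool::awaitWork() is implemented. SpinThenParkQueue, spinNanos, and the single-consumer restriction are all assumptions made up for illustration, and interruption is deliberately not handled.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration only; NOT the ForkJoinPool implementation.
// Supports one consumer thread and any number of producers.
final class SpinThenParkQueue<T> {
    private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    // Assumed spin budget, ideally close to the measured park/unpark round trip.
    private final long spinNanos;
    // The consumer thread, published only while it is about to park.
    private volatile Thread waiter;

    SpinThenParkQueue(long spinNanos) {
        this.spinNanos = spinNanos;
    }

    // Producer side: enqueue, then wake the consumer if it announced itself.
    void offer(T item) {
        queue.add(item);
        Thread t = waiter;
        if (t != null) {
            LockSupport.unpark(t);
        }
    }

    // Consumer side: spin until the budget runs out, then fall back to parking.
    // Interruption is ignored in this sketch.
    T take() {
        final long deadline = System.nanoTime() + spinNanos;
        T item;
        while ((item = queue.poll()) == null) {
            if (System.nanoTime() - deadline < 0) {
                Thread.onSpinWait();          // still inside the spin budget
            } else {
                waiter = Thread.currentThread();
                item = queue.poll();          // re-check to avoid a lost wakeup
                if (item != null) {
                    waiter = null;
                    return item;
                }
                LockSupport.park(this);       // may return spuriously; loop re-checks
                waiter = null;
            }
        }
        return item;
    }
}
```

With a spinNanos of zero this degenerates to plain park/unpark, so the two behaviours can be compared on the same workload by changing only that one value.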
> On Aug 13, 2024, at 4:11 PM, Robert Engels <robaho at icloud.com> wrote:
>
> It’s been a long time since I’ve looked at the ForkJoinPool code and it appears to have become way more complex than I remember.
>
> Can someone point me to the code area where after a task completes the CarrierThread/ForkJoinThread tries to get more work? Is there any spin loop here at all?
>
> My new hypothesis is that with enough parallelism the task completes but there is no waiting work, so the carrier parks. The park/unpark is way more expensive than the time until the poller enqueues another VT as ready - so with less parallelism there is a higher chance of work being available, which limits the number of park/unpark cycles and improves the overall performance.
>
> I would think a queue like this should spin at least as long as the expected park/unpark cost (time).
>
>> On Aug 13, 2024, at 10:34 AM, robert engels <robaho at icloud.com> wrote:
>>
>> I did. It didn’t make any difference. I checked the thread dump as well and the extras were created.
>>
>> Surprised that lowering the priority didn’t help - so now I need to think about other options. It feels like, when the carriers can use all the cores, the poller is prevented from running, as if some sort of lock is being held by the carrier/VT, and so it thrashes around until it eventually gets a chance.
>>
>>> On Aug 13, 2024, at 10:26 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>>>
>>> On 13/08/2024 15:59, robert engels wrote:
>>>> Surprisingly, lowering the priority of the carrier threads did not result in the same performance gains as reducing the parallelism.
>>>>
>>> Did you do any experiments with -Djdk.readPollers=2 or -Djdk.readPollers=4 to remove contention on the kqueue from the picture?
>>>
>>> -Alan
>