Loom and high performance networking
    Robert Engels 
    robaho at icloud.com
       
    Tue Aug 13 21:24:01 UTC 2024
    
    
  
Sorry for the spam. It spins in deactivate().
> On Aug 13, 2024, at 4:21 PM, Robert Engels <robaho at icloud.com> wrote:
> 
> Actually, it seems the spin was taken out sometime after JDK 21. Does anyone know why? (Maybe it was not, but awaitWork is now far more complex, with no spins variable and little documentation.)
> 
>> On Aug 13, 2024, at 4:13 PM, Robert Engels <robaho at icloud.com> wrote:
>> 
>> I found it. It is ForkJoinPool::awaitWork() and it does appear to spin.
>> 
>>> On Aug 13, 2024, at 4:11 PM, Robert Engels <robaho at icloud.com> wrote:
>>> 
>>> It’s been a long time since I’ve looked at the ForkJoinPool code and it appears to have become way more complex than I remember.
>>> 
>>> Can someone point me to the code area where, after a task completes, the CarrierThread/ForkJoinWorkerThread tries to get more work? Is there any spin loop here at all?
>>> 
>>> My new hypothesis is that with enough parallelism the task completes but there is no work waiting, so the carrier parks. The park/unpark is far more expensive than the time until the poller enqueues another VT as ready. With less parallelism there is a higher chance of work being available, which limits the number of park/unpark cycles and improves overall performance.
>>> 
>>> I would think a queue like this should spin at least as long as the expected park/unpark cost (time).
>>> 
>>>> On Aug 13, 2024, at 10:34 AM, robert engels <robaho at icloud.com> wrote:
>>>> 
>>>> I did. It didn’t make any difference. I checked the thread dump as well and the extras were created. 
>>>> 
>>>> Surprised that lowering the priority didn’t help, so now I need to think about other options. It feels like when the carriers can use all the cores, the poller is prevented from running, as if some sort of lock is being held by the carrier/VT, so it thrashes around until it eventually gets a chance.
>>>> 
>>>>> On Aug 13, 2024, at 10:26 AM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>>>>> 
>>>>> On 13/08/2024 15:59, robert engels wrote:
>>>>>> Surprisingly, lowering the priority of the carrier threads did not result in the same performance gains as reducing the parallelism.
>>>>>> 
>>>>> Did you do any experiments with -Djdk.readPollers=2 or -Djdk.readPollers=4 to remove contention on the kqueue from the picture?
>>>>> 
>>>>> -Alan
>>> 
>> 
> 
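
To make the spin-before-park idea quoted above concrete, here is a minimal sketch. It is not ForkJoinPool code and does not reflect how awaitWork() or deactivate() are written. SpinThenParkQueue, its single-consumer restriction, and the spinNanos budget are hypothetical, made up for illustration. The assumption is that spinNanos is set to roughly the measured park/unpark round-trip cost.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

/**
 * Hypothetical single-consumer hand-off queue (illustration only).
 * The consumer spins for a bounded time before falling back to park().
 */
final class SpinThenParkQueue<T> {
    private final ConcurrentLinkedQueue<T> items = new ConcurrentLinkedQueue<>();
    // The one consumer thread that is parked or about to park, if any.
    private final AtomicReference<Thread> waiter = new AtomicReference<>();
    // Assumed spin budget, ideally close to one park/unpark round trip.
    private final long spinNanos;

    SpinThenParkQueue(long spinNanos) {
        this.spinNanos = spinNanos;
    }

    /** Producer side: enqueue, then wake the consumer if it registered as waiting. */
    void offer(T item) {
        items.add(item);
        Thread t = waiter.get();
        if (t != null) {
            LockSupport.unpark(t);
        }
    }

    /** Consumer side; must only ever be called from one thread at a time. */
    T take() {
        // Phase 1: spin for at most spinNanos, hoping work shows up quickly.
        long deadline = System.nanoTime() + spinNanos;
        T item;
        while ((item = items.poll()) == null) {
            if (System.nanoTime() - deadline >= 0) {
                break;
            }
            Thread.onSpinWait();
        }
        if (item != null) {
            return item;
        }
        // Phase 2: register as waiter, re-check the queue, then park.
        // Registering before the re-check means a concurrent offer() either
        // leaves an item we see here or sees us and calls unpark().
        waiter.set(Thread.currentThread());
        try {
            while ((item = items.poll()) == null) {
                // Interrupt handling is omitted in this sketch.
                LockSupport.park(this);
            }
            return item;
        } finally {
            waiter.set(null);
        }
    }
}
```

If the next VT typically becomes ready sooner than a park/unpark round trip, most take() calls in a structure like this would return from the spin phase and never pay the park cost. Whether that holds for the carrier threads is exactly the open question.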
    
    