Loom and high performance networking
David Holmes
david.holmes at oracle.com
Wed Aug 14 05:04:03 UTC 2024
On 14/08/2024 12:59 am, robert engels wrote:
> Surprisingly, lowering the priority of the carrier threads did not
> result in the same performance gains as reducing the parallelism.
To change priorities you also need to set -XX:ThreadPriorityPolicy=1. From the flag's definition in globals.hpp:
product(int, ThreadPriorityPolicy, 0,                                   \
        "0 : Normal.                                                    "\
        "    VM chooses priorities that are appropriate for normal      "\
        "    applications.                                              "\
        "    On Windows applications are allowed to use higher native   "\
        "    priorities. However, with ThreadPriorityPolicy=0, VM will  "\
        "    not use the highest possible native priority,              "\
        "    THREAD_PRIORITY_TIME_CRITICAL, as it may interfere with    "\
        "    system threads. On Linux thread priorities are ignored     "\
        "    because the OS does not support static priority in         "\
        "    SCHED_OTHER scheduling class which is the only choice for  "\
        "    non-root, non-realtime applications.                       "\
        "1 : Aggressive.                                                "\
        "    Java thread priorities map over to the entire range of     "\
        "    native thread priorities. Higher Java thread priorities map"\
        "    to higher native thread priorities. This policy should be  "\
        "    used with care, as sometimes it can cause performance      "\
        "    degradation in the application and/or the entire system. On"\
        "    Linux/BSD/macOS this policy requires root privilege or an  "\
        "    extended capability.")                                     \
        range(0, 1)                                                     \
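
As a rough illustration (a minimal sketch, not anything from Robert's
benchmark; the class name and timings are made up): a platform thread's
priority can be requested through the standard API, but with the default
ThreadPriorityPolicy=0 the request is ignored on Linux, and enabling
ThreadPriorityPolicy=1 on Linux/BSD/macOS needs root or an extended
capability, as the text above says. The default virtual thread scheduler
does not expose its carrier threads to application code, so the sketch
just uses an ordinary platform thread.

// Minimal sketch (illustrative only): run with and without
//   java -XX:ThreadPriorityPolicy=1 PriorityDemo
// and compare the native priority the OS reports (e.g. via top/ps).
// With the default policy 0 the priority request is ignored on Linux.
public class PriorityDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = Thread.ofPlatform()
                .name("low-priority-worker")
                .priority(Thread.MIN_PRIORITY)   // Java priority 1
                .unstarted(() -> {
                    // Busy loop standing in for CPU-bound carrier work.
                    long spins = 0;
                    while (!Thread.currentThread().isInterrupted()) {
                        spins++;
                    }
                });
        worker.start();
        Thread.sleep(2_000);
        worker.interrupt();
        worker.join();
    }
}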
David
-------
>> On Aug 13, 2024, at 7:33 AM, Alan Bateman <Alan.Bateman at oracle.com>
>> wrote:
>>
>> On 13/08/2024 12:25, Robert Engels wrote:
>>> :
>>>
>>> Using VT pollers (pollerMode=2) and default parallelism:
>>>
>>> robertengels at macmini go-wrk % wrk -H 'Host: imac' -H 'Accept:
>>> text/plain,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7'
>>> -H 'Connection: keep-alive' --latency -d 20 -c 1000 --timeout 8 -t 8
>>> http://imac:8080/plaintext
>>> Running 20s test @ http://imac:8080/plaintext
>>> 8 threads and 1000 connections
>>>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>>>     Latency    66.76ms  137.25ms    1.70s   87.06%
>>>     Req/Sec     14.38k     2.36k   21.94k   75.88%
>>>   Latency Distribution
>>>      50%    4.74ms
>>>      75%   51.68ms
>>>      90%  263.08ms
>>>      99%  664.99ms
>>> 2289858 requests in 20.02s, 310.10MB read
>>> Socket errors: connect 0, read 2135, write 7, timeout 0
>>> Requests/sec: 114360.91
>>> Transfer/sec: 15.49MB
>>>
>>> and the same 8% idle.
>>>
>>> So I am pretty sure my hypothesis is correct. I may try to build Loom /
>>> use a library to lower the priority of the carrier threads. I suspect
>>> I will see similar performance to the reduced-parallelism case.
>>
>> With -Djdk.pollerMode=2, the poller threads are virtual threads and
>> so run on the same carrier threads. You'll need to play with
>> -Djdk.readPollers=N as there isn't a good default for this on macOS.
>>
>> -Alan.
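
For anyone trying to reproduce the numbers above: the server under test is
not shown in this thread, but a virtual-thread-per-connection plaintext
server along these lines would be a reasonable stand-in (a minimal sketch;
the class name, port, and the very naive HTTP handling are assumptions, not
Robert's actual code). It could then be launched with the internal,
unsupported properties discussed above, e.g. -Djdk.pollerMode=2,
-Djdk.readPollers=N and -Djdk.virtualThreadScheduler.parallelism=N, to
compare configurations.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of a virtual-thread-per-connection "plaintext" server, roughly the
// kind of workload wrk is driving above. Illustrative only.
public class PlaintextServer {
    private static final byte[] RESPONSE =
            ("HTTP/1.1 200 OK\r\n"
           + "Content-Type: text/plain\r\n"
           + "Content-Length: 13\r\n"
           + "\r\n"
           + "Hello, World!").getBytes(StandardCharsets.US_ASCII);

    public static void main(String[] args) throws Exception {
        try (ServerSocket listener = new ServerSocket(8080);
             ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            while (true) {
                Socket socket = listener.accept();
                pool.submit(() -> handle(socket));   // one virtual thread per connection
            }
        }
    }

    private static void handle(Socket socket) {
        try (socket;
             InputStream in = socket.getInputStream();
             OutputStream out = socket.getOutputStream()) {
            byte[] buf = new byte[8192];
            // Very naive keep-alive loop: assume each read() delivers one request.
            while (in.read(buf) != -1) {
                out.write(RESPONSE);
                out.flush();
            }
        } catch (Exception e) {
            // Connection reset/broken pipe under load; ignored in this sketch.
        }
    }
}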