Loom and high performance networking
David Holmes
david.holmes at oracle.com
Wed Aug 14 05:04:03 UTC 2024
On 14/08/2024 12:59 am, robert engels wrote:
> Surprisingly, lowering the priority of the carrier threads did not
> result in the same performance gains as reducing the parallelism.
To change priorities you also need to set -XX:ThreadPriorityPolicy=1. From the flag's definition in globals.hpp:
product(int, ThreadPriorityPolicy, 0,                                   \
        "0 : Normal.                                                    "\
        "    VM chooses priorities that are appropriate for normal      "\
        "    applications.                                              "\
        "    On Windows applications are allowed to use higher native   "\
        "    priorities. However, with ThreadPriorityPolicy=0, VM will  "\
        "    not use the highest possible native priority,              "\
        "    THREAD_PRIORITY_TIME_CRITICAL, as it may interfere with    "\
        "    system threads. On Linux thread priorities are ignored     "\
        "    because the OS does not support static priority in         "\
        "    SCHED_OTHER scheduling class which is the only choice for  "\
        "    non-root, non-realtime applications.                       "\
        "1 : Aggressive.                                                "\
        "    Java thread priorities map over to the entire range of     "\
        "    native thread priorities. Higher Java thread priorities map"\
        "    to higher native thread priorities. This policy should be  "\
        "    used with care, as sometimes it can cause performance      "\
        "    degradation in the application and/or the entire system. On"\
        "    Linux/BSD/macOS this policy requires root privilege or an  "\
        "    extended capability.")                                     \
        range(0, 1)                                                     \
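
As a rough illustration (a minimal sketch, not anything from Robert's
benchmark; the class name and timings are made up): a platform thread's
priority can be requested through the standard API, but with the default
ThreadPriorityPolicy=0 the request is ignored on Linux, and enabling
ThreadPriorityPolicy=1 on Linux/BSD/macOS needs root or an extended
capability, as the text above says. The default virtual thread scheduler
does not expose its carrier threads to application code, so the sketch
just uses an ordinary platform thread.

// Minimal sketch (illustrative only): run with and without
//   java -XX:ThreadPriorityPolicy=1 PriorityDemo
// and compare the native priority the OS reports (e.g. via top/ps).
// With the default policy 0 the priority request is ignored on Linux.
public class PriorityDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = Thread.ofPlatform()
                .name("low-priority-worker")
                .priority(Thread.MIN_PRIORITY)   // Java priority 1
                .unstarted(() -> {
                    // Busy loop standing in for CPU-bound carrier work.
                    long spins = 0;
                    while (!Thread.currentThread().isInterrupted()) {
                        spins++;
                    }
                });
        worker.start();
        Thread.sleep(2_000);
        worker.interrupt();
        worker.join();
    }
}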
David
-------
>> On Aug 13, 2024, at 7:33 AM, Alan Bateman <Alan.Bateman at oracle.com>
>> wrote:
>>
>> On 13/08/2024 12:25, Robert Engels wrote:
>>> :
>>>
>>> Using VT pollers (pollerMode=2) and default parallelism:
>>>
>>> robertengels at macmini go-wrk % wrk -H 'Host: imac' -H 'Accept:
>>> text/plain,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7'
>>> -H 'Connection: keep-alive' --latency -d 20 -c 1000 --timeout 8 -t 8
>>> http://imac:8080/plaintext
>>> Running 20s test @ http://imac:8080/plaintext
>>> 8 threads and 1000 connections
>>>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>>>     Latency    66.76ms  137.25ms    1.70s   87.06%
>>>     Req/Sec     14.38k     2.36k   21.94k   75.88%
>>>   Latency Distribution
>>>      50%    4.74ms
>>>      75%   51.68ms
>>>      90%  263.08ms
>>>      99%  664.99ms
>>> 2289858 requests in 20.02s, 310.10MB read
>>> Socket errors: connect 0, read 2135, write 7, timeout 0
>>> Requests/sec: 114360.91
>>> Transfer/sec: 15.49MB
>>>
>>> and the same 8% idle.
>>>
>>> So I am pretty sure my hypothesis is correct. I may try to build Loom /
>>> use a library to lower the priority of the carrier threads. I suspect
>>> I will see similar performance to the reduced-parallelism case.
>>
>> With -Djdk.pollerMode=2, the poller threads are virtual threads and
>> so run on the same carrier threads. You'll need to play with
>> -Djdk.readPollers=N as there isn't a good default for this on macOS.
>>
>> -Alan.
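
For anyone trying to reproduce the numbers above: the server under test is
not shown in this thread, but a virtual-thread-per-connection plaintext
server along these lines would be a reasonable stand-in (a minimal sketch;
the class name, port, and the very naive HTTP handling are assumptions, not
Robert's actual code). It could then be launched with the internal,
unsupported properties discussed above, e.g. -Djdk.pollerMode=2,
-Djdk.readPollers=N and -Djdk.virtualThreadScheduler.parallelism=N, to
compare configurations.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of a virtual-thread-per-connection "plaintext" server, roughly the
// kind of workload wrk is driving above. Illustrative only.
public class PlaintextServer {
    private static final byte[] RESPONSE =
            ("HTTP/1.1 200 OK\r\n"
           + "Content-Type: text/plain\r\n"
           + "Content-Length: 13\r\n"
           + "\r\n"
           + "Hello, World!").getBytes(StandardCharsets.US_ASCII);

    public static void main(String[] args) throws Exception {
        try (ServerSocket listener = new ServerSocket(8080);
             ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            while (true) {
                Socket socket = listener.accept();
                pool.submit(() -> handle(socket));   // one virtual thread per connection
            }
        }
    }

    private static void handle(Socket socket) {
        try (socket;
             InputStream in = socket.getInputStream();
             OutputStream out = socket.getOutputStream()) {
            byte[] buf = new byte[8192];
            // Very naive keep-alive loop: assume each read() delivers one request.
            while (in.read(buf) != -1) {
                out.write(RESPONSE);
                out.flush();
            }
        } catch (Exception e) {
            // Connection reset/broken pipe under load; ignored in this sketch.
        }
    }
}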