Starting and joining a lot of threads slower on AMD systems compared to Intel systems

Thomas Stüfe thomas.stuefe at gmail.com
Sun May 16 07:02:16 UTC 2021


On Sun, May 16, 2021 at 8:06 AM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> The difference to J9 is annoying though.
>
> It just occurred to me that we have a handshake between the creator thread
> and the newborn thread. The creator waits for the newborn to be up and
> running in order to avoid inconsistent thread states. This requires the
> newborn to get at least some CPU cycles while competing with all its siblings.
>

Interestingly, we don't do this on all platforms (neither on AIX nor on
Windows). Another reason to run such tests on the same OS and hardware.
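
To make the handshake idea concrete, here is a rough Java-level analogy. It is
not HotSpot code, and the class and method names (HandshakeSketch,
startAndAwaitRunning) are made up for illustration. It only shows the shape of
the pattern: the creator blocks until the new thread has been scheduled at
least once, so every start costs at least one round trip through the scheduler.

```java
import java.util.concurrent.CountDownLatch;

public class HandshakeSketch {
    // Starts a thread and blocks the caller until the new thread has signalled
    // that it is running. Illustrative analogy only, not the JVM's own code.
    public static Thread startAndAwaitRunning(Runnable body) throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            started.countDown(); // newborn: "I am up and running"
            body.run();
        });
        t.start();
        started.await();         // creator: wait until the newborn got CPU time
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        int n = 1_000; // arbitrary count for the sketch
        long begin = System.nanoTime();
        for (int i = 0; i < n; i++) {
            startAndAwaitRunning(() -> {}).join();
        }
        long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
        System.out.println(n + " threads started with handshake in " + elapsedMs + " ms");
    }
}
```

With many threads starting at once, each newborn has to win some CPU time
before its creator can continue, which is where scheduler behavior could start
to matter.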


>
> I wonder whether J9 does this differently. May be worth looking into.
>
>
Found at least one trivial piece of low-hanging fruit:
https://github.com/openjdk/jdk/pull/4042

..Thomas


> ..Thomas
>
>
> On Sun, May 16, 2021 at 7:46 AM Aleksey Shipilev <shade at redhat.com> wrote:
>
>> In addition to Thomas Stuefe's excellent reply:
>>
>> On 5/15/21 10:15 PM, Waishon wrote:
>> > In a study project we should create 100.000 threads to get a feeling for
>> > the time it takes to create a lot of threads and why it's more efficient
>> > to use tasks instead.
>>
>> I think what you found is in the spirit of the assignment: not only did you
>> discover that the overheads are high, but also that they differ across
>> systems, OSes, etc.
>>
>> > What might be the reason why starting and joining threads is so much
>> > slower on AMD systems compared to Intel systems?
>> >
>> > (Disclaimer: This question was also posted on Stackoverflow, which
>> > referred to this mailing list:
>> > https://stackoverflow.com/questions/67550679/creating-threads-is-slower-on-amd-systems-compared-to-intel-systems
>> > )
>>
>> Well, it could be a number of things. Since thread creation goes through OS
>> syscalls, most of the time is probably spent on memory and task management
>> for the new threads. But to know more, you need to follow the advice of the
>> very first SO comment: "You should probably use a profiler and it will
>> better tell you where the time is being spent." [1]
>>
>> For example, async-profiler [2] would profile Java, JVM, and kernel code
>> together:
>>   $ java -agentpath:$asyncProfilerPath/libasyncProfiler.so=start,event=cpu,frequency=10000,file=profile.html ThreadCreator
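
The ThreadCreator source is not reproduced in this thread. As a minimal sketch
of what such a test might look like, assuming it simply starts N threads with
empty bodies and joins them, the following compares that against submitting
the same number of empty tasks to a fixed pool. The class name
ThreadCreatorSketch, both method names, and the pool size are assumptions for
illustration, not the original poster's code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadCreatorSketch {
    // Starts n threads with an empty body, joins them all, returns elapsed ms.
    public static long timeThreads(int n) throws InterruptedException {
        long begin = System.nanoTime();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {});
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
    }

    // Runs n empty tasks on a pool with one thread per available processor
    // (an assumed pool size), waits for completion, returns elapsed ms.
    public static long timeTasks(int n) throws InterruptedException {
        long begin = System.nanoTime();
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        for (int i = 0; i < n; i++) {
            pool.execute(() -> {});
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
    }

    public static void main(String[] args) throws InterruptedException {
        int n = 100_000; // thread count mentioned in the original question
        System.out.println("threads: " + timeThreads(n) + " ms");
        System.out.println("tasks:   " + timeTasks(n) + " ms");
    }
}
```

A single unwarmed run like this is only a rough indication. The profiler
output is what would show where the time actually goes.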
>>
>> --
>> Thanks,
>> -Aleksey
>>
>> [1] https://stackoverflow.com/questions/67550679/creating-threads-is-slower-on-amd-systems-compared-to-intel-systems#comment119397887_67550679
>> [2] https://github.com/jvm-profiling-tools/async-profiler
>>
>>

