Thread-Local Handshakes slowdown on single-cpu machines
David Holmes
david.holmes at oracle.com
Wed Apr 29 05:31:42 UTC 2020
On 29/04/2020 1:26 pm, Karakó Miklós wrote:
> Hi Patricio,
>
> Yes, it seems like a scheduling issue for us too. One of my colleagues
> spotted the following sentence in Robbin Ehn's Handshaking HotSpot
> presentation (https://youtu.be/VBCOfAJ409s?t=2222): ~"if the VM thread
> finds that all threads are unsafe, it keeps spinning"
> Could it be that actually the VM thread gets more CPU time and actually
> blocks JavaThread?
The "spinning" is intended to be done in a scheduler-friendly way via:
void os::naked_short_nanosleep(jlong ns) {
assert(ns > -1 && ns < NANOUNITS, "Un-interruptable sleep, short time
use only");
int64_t start = os::javaTimeNanos();
do {
if (SwitchToThread() == 0) {
// Nothing else is ready to run on this cpu, spin a little
SpinPause();
}
} while (os::javaTimeNanos() - start < ns);
}
but this won't be that scheduler-friendly because it places the VMThread
at the head of the priority queue, so it will cede up-to a timeslice to
another thread, then run another iteration of the loop and hence
alternate between a JavaThread and the VMThread until they are all
blocked at the safepoint. What is really wanted is for the VMThread to
go to the back of the scheduling queue.
David
-----
> I did not except that but I could reproduce the issue with a simple
> hello world:
>
> public class Main {
> public static void main(String[] args) {
> System.out.println("Hello world");
> }
> }
>
> A few measurements on Windows 10 x64 in VirtualBox:
>
>
> PS Z:\main> Measure-Command { .\main-1cpu.bat }
>
> Minutes : 0
> Seconds : 7
> Milliseconds : 20
> Ticks : 70203455
> TotalMinutes : 0.117005758333333
> TotalSeconds : 7.0203455
> TotalMilliseconds : 7020.3455
>
>
>
> PS Z:\main> Measure-Command { .\main-full.bat }
>
> Minutes : 0
> Seconds : 1
> Milliseconds : 575
> Ticks : 15751352
> TotalSeconds : 1.5751352
> TotalMilliseconds : 1575.1352
>
>
>
> PS Z:\main> Measure-Command { .\main-full.bat }
>
> Minutes : 0
> Seconds : 1
> Milliseconds : 558
> Ticks : 15585384
> TotalSeconds : 1.5585384
> TotalMilliseconds : 1558.5384
>
>
>
> PS Z:\main> Measure-Command { .\main-1cpu.bat }
>
> Minutes : 0
> Seconds : 23
> Milliseconds : 648
> Ticks : 236482685
> TotalSeconds : 23.6482685
> TotalMilliseconds : 23648.2685
>
>
> main-1cpu.bat:
>
> c:\windows\system32\cmd.exe /C start /affinity 1 \jdk-14\bin\java
> -Xlog:all=trace:file=app-1cpu.txt:uptime,tid,level,tags:filecount=50 Main
>
> main-full.bat:
>
> c:\windows\system32\cmd.exe /C start /affinity FF \jdk-14\bin\java
> -Xlog:all=trace:file=app-full.txt:uptime,tid,level,tags:filecount=50 Main
>
> The files are also available here:
> https://palacsint.hu/etc/jdk/jdk14-single-cpu-slowdown-reprod-poc/
> Or in a zip:
> https://palacsint.hu/etc/jdk/jdk14-single-cpu-slowdown-reprod-poc.zip
>
> You can find trace level logs of three testruns in the run1-2-3
> directories as well.
>
> Best,
> Miklos
>
>
> On 4/29/20 2:18 AM, Patricio Chilano wrote:
>> Hi Miklos,
>>
>> On 4/28/20 8:04 AM, Karakó Miklós wrote:
>>> Hello David,
>>>
>>> I've tried with OpenJDK 15 EA Build 20. Unfortunately it's slow as
>>> with OpenJDK14.
>>> A colleague of mine set CPU affinity with taskset to only one CPU on
>>> Linux and did not experience the slowdown. So the issue seems
>>> Windows-related.
>> I wouldn’t rule out this being a scheduling issue. With thread-local
>> handshakes some operations like the ones you see in your logs
>> (deoptimization, biasedlocking revocation) do not need all JavaThreads
>> to be stopped at the same time in order to be executed. If you only
>> have one CPU but many JavaThreads running, depending on the scheduler,
>> maybe it could happen that the JavaThread you need to handshake is
>> constantly being left behind in the scheduler queue and so the
>> operation takes longer to execute. When you add
>> -XX:-ThreadLocalHandshakes, those operations will be executed using
>> safepoints, which means all JavaThreads are stopped. So even if the
>> scheduler decides to give priority to other JavaThreads those will
>> block and free the CPU. Also when adding more CPU that will increase
>> the likelihood of a JavaThread getting scheduled and might explain why
>> you see it gets fixed.
>>
>> Alternatively the JavaThread that needs to be handshaked is being
>> scheduled but is not polling for the handshake. But I think in that
>> the case you should still have the same issue with
>> -XX:-ThreadLocalHandshakes because the polling mechanism is the same,
>> although it's true that the misbehaving JavaThread will have more CPU
>> time to complete whatever is doing while others are stopped when
>> safepointing.
>>
>> One thing that makes me question the scheduler theory is the fact that
>> you see such improvements when only disabling biased-locking in 13.0.2
>> because that was still using safepoints back then. Can you check the
>> logs to see in which operations you see the pauses when running with
>> default options compared to -XX:-UseBiasedLocking?
>>
>> I’m also surprised you see those long logging pauses because all the
>> other JavaThreads were already processed in the handshake so something
>> must be executing. Maybe you can switch the logging to trace instead
>> of debug. Also maybe you can add logging at Java code so you can
>> follow what is actually running? You could send it to the same app.txt
>> and flush every time you write to see it ordered in the log. You can
>> also get the uptime you see in the UL logs with
>> ManagementFactory.getRuntimeMXBean().getUptime(). That logging will
>> affect timing but might give some insights.
>>
>> Thanks,
>> Patricio
>>> Miklos
>>>
>>> On 4/28/20 1:02 AM, David Holmes wrote:
>>>> Hi Miklos,
>>>>
>>>> On 28/04/2020 7:26 am, Karakó Miklós wrote:
>>>>> Hello,
>>>>>
>>>>> We bumped into a possible Thread-Local Handshakes issue with
>>>>> multiple apps. It seems that enabled TLH slows down applications
>>>>> running on (although rare) single CPU boxes. I would be grateful if
>>>>> you could confirm that this is a known trade-off for these rare
>>>>> setups or a possible JVM bug. That would save us at least a few
>>>>> hours of debugging.
>>>>>
>>>>> TL;DR: Both tested apps stop frequently around
>>>>> "HandshakeOneThread", "HandshakeAllThreads" and "Revoked bias of
>>>>> currently-unlocked object" running with
>>>>> OpenJDK12/OpenJDK13/OpenJDK14. OpenJDK 13 with
>>>>> -XX:-ThreadLocalHandshakes fixes the issue. Adding a second CPU to
>>>>> the virtual machine fixes the issue. Enabling hyper-threading fixes
>>>>> the issue.
>>>>
>>>> Can you try with latest JDK 15 build, just to see if this may be
>>>> something already addressed?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> More details are available at StackOverflow:
>>>>> https://stackoverflow.com/questions/61375565/slow-application-frequent-jvm-hangs-with-single-cpu-setups-and-java-12
>>>>>
>>>>>
>>>>> All thoughts are welcome.
>>>>>
>>>>> Best,
>>>>> Miklos
>>>>>
>>>
>>
>
More information about the hotspot-dev
mailing list