Thread-Local Handshakes slowdown on single-cpu machines

Wed Apr 29 00:18:53 UTC 2020

Hi Miklos,

On 4/28/20 8:04 AM, Karakó Miklós wrote:
> Hello David,
>
> I've tried with OpenJDK 15 EA Build 20. Unfortunately it's as slow as 
> with OpenJDK 14.
> A colleague of mine set CPU affinity with taskset to only one CPU on 
> Linux and did not experience the slowdown. So the issue seems 
> Windows-related.
I wouldn’t rule out this being a scheduling issue. With thread-local 
handshakes, some operations like the ones you see in your logs 
(deoptimization, biased locking revocation) do not need all JavaThreads 
to be stopped at the same time in order to be executed. If you only have 
one CPU but many JavaThreads running, then depending on the scheduler 
the JavaThread you need to handshake could constantly be left behind in 
the scheduler queue, so the operation takes longer to execute. When you 
add -XX:-ThreadLocalHandshakes, those operations are executed using 
safepoints, which means all JavaThreads are stopped. So even if the 
scheduler decides to give priority to other JavaThreads, those will 
block and free the CPU. Adding more CPUs also increases the likelihood 
of a JavaThread getting scheduled, which might explain why you see it 
gets fixed.

Alternatively, the JavaThread that needs to be handshaked is being 
scheduled but is not polling for the handshake. But I think in that 
case you should still have the same issue with 
-XX:-ThreadLocalHandshakes because the polling mechanism is the same, 
although it's true that the misbehaving JavaThread will have more CPU 
time to complete whatever it is doing while the others are stopped 
during a safepoint.

One thing that makes me question the scheduler theory is the fact that 
you see such improvements in 13.0.2 when only disabling biased locking, 
because that was still using safepoints back then. Can you check the 
logs to see in which operations you see the pauses when running with 
default options compared to -XX:-UseBiasedLocking?

I’m also surprised you see those long logging pauses because all the 
other JavaThreads were already processed in the handshake, so something 
must be executing. Maybe you can switch the logging to trace instead of 
debug. Also, maybe you can add logging in your Java code so you can 
follow what is actually running? You could send it to the same app.txt 
and flush every time you write so that it appears in order in the log. 
You can also get the uptime you see in the UL logs with 
ManagementFactory.getRuntimeMXBean().getUptime(). That logging will 
affect timing but might give some insights.
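
Something along these lines might work as a starting point for the 
Java-side logging. It is only a rough sketch: the UptimeLog class, its 
method names and the "[app]" tag are made up for illustration, and I am 
assuming the UL output goes to a file called app.txt in the working 
directory. Whether the JVM's own writes and these writes interleave 
cleanly in one file is not guaranteed, so treat the ordering as 
approximate.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.lang.management.ManagementFactory;
import java.util.Locale;

// Hypothetical helper (not part of the JDK): appends application messages
// to a log file, prefixed with the JVM uptime in a "[1.234s]" style so they
// can be read next to UL lines that carry the uptime decoration.
final class UptimeLog {
    private final PrintWriter out;

    UptimeLog(String fileName) throws IOException {
        // Second argument 'true' opens the file in append mode.
        this.out = new PrintWriter(new FileWriter(fileName, true));
    }

    // Formats one line; uptimeMillis is milliseconds since JVM start.
    static String format(long uptimeMillis, String message) {
        return String.format(Locale.ROOT, "[%d.%03ds][app] %s",
                uptimeMillis / 1000, uptimeMillis % 1000, message);
    }

    // Writes one line and flushes immediately so it reaches the file
    // before the next thing happens in the application.
    synchronized void log(String message) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        out.println(format(uptime, message));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // "app.txt" is assumed to be the same file the UL output goes to.
        UptimeLog log = new UptimeLog("app.txt");
        log.log("before suspicious call");
        log.log("after suspicious call");
    }
}
```

Flushing on every write costs some performance, but without it the 
application lines could show up in the file much later than the moment 
they were logged.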

Thanks,
Patricio
> Miklos
>
> On 4/28/20 1:02 AM, David Holmes wrote:
>> Hi Miklos,
>>
>> On 28/04/2020 7:26 am, Karakó Miklós wrote:
>>> Hello,
>>>
>>> We bumped into a possible Thread-Local Handshakes issue with 
>>> multiple apps. It seems that enabling TLH slows down applications 
>>> running on single-CPU boxes (rare as they are). I would be grateful if 
>>> you could confirm that this is a known trade-off for these rare 
>>> setups or a possible JVM bug. That would save us at least a few 
>>> hours of debugging.
>>>
>>> TL;DR: Both tested apps stop frequently around "HandshakeOneThread", 
>>> "HandshakeAllThreads" and "Revoked bias of currently-unlocked 
>>> object" running with OpenJDK12/OpenJDK13/OpenJDK14. OpenJDK 13 with 
>>> -XX:-ThreadLocalHandshakes fixes the issue. Adding a second CPU to 
>>> the virtual machine fixes the issue. Enabling hyper-threading fixes 
>>> the issue.
>>
>> Can you try with latest JDK 15 build, just to see if this may be 
>> something already addressed?
>>
>> Thanks,
>> David
>>
>>> More details are available at StackOverflow: 
>>> https://stackoverflow.com/questions/61375565/slow-application-frequent-jvm-hangs-with-single-cpu-setups-and-java-12 
>>>
>>>
>>> All thoughts are welcome.
>>>
>>> Best,
>>> Miklos
>>>
>