Deadlock on OpenJDK 17

Kemper, William kemperw at amazon.com
Fri Jan 19 18:33:39 UTC 2024


Hi - thank you for sharing this. I've opened a ticket in the bug tracker for this: https://bugs.openjdk.org/browse/JDK-8324225


It would be really helpful if you could share a portable reproducer for this. It would also help to know why the VMThread thinks the Shenandoah worker thread is still running. Running an instance of your application with `-Xlog:safepoint=trace` would shed some light on that. Please test this in a non-production environment, as I'm not sure how much log volume it will produce for your application. We will also work on a reproducer by running our tests with `-XX:+SafepointALot`.
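
As a rough illustration of the diagnostic run suggested above, here is a minimal sketch that assembles a launch command from the options in the original report plus `-Xlog:safepoint=trace`. The class name `DiagnosticLaunch`, the `command` helper, and the `-jar app.jar` arguments are hypothetical. Pairing `-XX:+SafepointALot` with `-XX:+UnlockDiagnosticVMOptions` is an assumption: as far as I know it is a diagnostic flag in JDK 17 product builds.

```java
import java.util.ArrayList;
import java.util.List;

public class DiagnosticLaunch {
    // JVM options copied from the original report.
    public static final List<String> REPORTED_OPTIONS = List.of(
        "-XX:+UseShenandoahGC", "-XX:+AlwaysPreTouch", "-XX:-UseBiasedLocking",
        "-XX:+DisableExplicitGC", "-XX:+UnlockExperimentalVMOptions",
        "-XX:+UseTransparentHugePages");

    // Builds the full command line: java binary, reported options,
    // safepoint tracing, optional safepoint stress, then application arguments.
    public static List<String> command(String javaBin, boolean safepointALot,
                                       List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.addAll(REPORTED_OPTIONS);
        cmd.add("-Xlog:safepoint=trace");
        if (safepointALot) {
            // Assumption: SafepointALot must be unlocked as a diagnostic option.
            cmd.add("-XX:+UnlockDiagnosticVMOptions");
            cmd.add("-XX:+SafepointALot");
        }
        cmd.addAll(appArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // Prints the command only; pass the list to ProcessBuilder to launch it.
        System.out.println(String.join(" ", command("java", false, List.of(args))));
    }
}
```

The command is only printed here so that it can be reviewed before anything is started in a test environment.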


William


________________________________
From: shenandoah-dev <shenandoah-dev-retn at openjdk.org> on behalf of Kirill A.Korinsky <kirill at korins.ky>
Sent: Friday, January 19, 2024 9:14:49 AM
To: shenandoah-dev at openjdk.org
Subject: [EXTERNAL] Deadlock on OpenJDK 17

Greetings,

I'd like to report what looks like a deadlock inside Shenandoah.

By deadlock I mean that the JVM enters a state similar to a stop-the-world pause;
it does not respond to jstack or jmap, which fail with an error like:

  Unable to open socket file /proc/XXX/root/tmp/.java_pidYYY

It runs inside a Docker container with verbose GC logging to stdout,
which allowed me to capture the last messages before it froze:

  [2024-01-19T16:23:09.871+0000][info][gc] Trigger: Metadata GC Threshold
  [2024-01-19T16:23:09.876+0000][info][gc] GC(2) Concurrent reset 5.054ms
  [2024-01-19T16:23:09.876+0000][info][gc] GC(2) Pause Init Mark (unload classes) 0.045ms
  [2024-01-19T16:23:09.889+0000][info][gc] GC(2) Concurrent marking roots 12.535ms
  [2024-01-19T16:23:10.425+0000][info][gc] GC(2) Concurrent marking (unload classes) 535.868ms
  [2024-01-19T16:23:10.425+0000][info][gc] GC(2) Pause Final Mark (unload classes) 0.156ms
  [2024-01-19T16:23:10.492+0000][info][gc] GC(2) Concurrent thread roots 66.065ms
  [2024-01-19T16:23:10.513+0000][info][gc] GC(2) Concurrent weak references 21.074ms

The error happened on Fedora 39's java-17-openjdk-17.0.9.0.9-3.fc39.x86_64.

As evidence that it is a deadlock, I'd like to share the backtraces of all
threads as seen by gdb, which I obtained with thread apply all bt.
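
Since jstack and jmap cannot attach in this state, the native backtraces have to come from a debugger. Below is a minimal sketch of collecting them non-interactively. The class name `GdbBacktrace` and the output file `threads.txt` are hypothetical, and it assumes gdb is on the PATH and is allowed to attach to the JVM process.

```java
import java.io.File;
import java.util.List;

public class GdbBacktrace {
    // Builds: gdb -p <pid> -batch -ex "thread apply all bt"
    public static List<String> command(long pid) {
        return List.of("gdb", "-p", Long.toString(pid),
                       "-batch", "-ex", "thread apply all bt");
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java GdbBacktrace <pid>");
            return;
        }
        long pid = Long.parseLong(args[0]);
        // Attach, dump every thread's backtrace to a file, then detach.
        Process gdb = new ProcessBuilder(command(pid))
            .redirectErrorStream(true)
            .redirectOutput(new File("threads.txt"))
            .start();
        System.out.println("gdb exited with " + gdb.waitFor());
    }
}
```

Batch mode makes gdb exit after the command has run, so the process is detached again as soon as the dump is written.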

The JVM runs with these options:
  -XX:+UseShenandoahGC -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:+UseTransparentHugePages

--
wbr, Kirill