Deadlock on OpenJDK 17

Fri Jan 19 19:16:00 UTC 2024

On Fri, 19 Jan 2024 19:33:39 +0100,
Kemper, William wrote:
> 
> It would be really helpful if you are able to share a portable reproducer for
> this. It would also be helpful to know why the VMThread thinks the Shenandoah
> worker thread is still running. If you could run an instance of your
> application with `-Xlog:safepoint=trace` it would shed some light on things
> (please test this in a non-production environment, I'm not sure what sort of
> log volume this will produce for your application). We will also work on a
> reproducer by running our tests with `-XX:+SafepointALot` .

I'm afraid I can't provide a portable reproducer, because the issue happens
during redeployment of a large cluster (>200 machines in total) and may affect
a random machine. Or two.

It happens only at startup: once the application has started and run for a few
minutes, it keeps working without this issue.

Unfortunately, not every redeployment triggers it, let's say one in four.

Until I've figured out how to reproduce it, I have no idea how to trace it in
the production environment without performance degradation, and it's clear that
neither -Xlog:safepoint=trace nor -XX:+SafepointALot is an option here :(

-- 
wbr, Kirill