RFR: 8359820: Improve handshake/safepoint timeout diagnostic messages [v3]

Anton Artemov duke at openjdk.org
Fri Jul 18 10:49:52 UTC 2025


On Fri, 18 Jul 2025 09:53:33 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> * Is the receiving thread the one meant to receive the SIGILL? Then why print this at all, we have callstack and thread info already?

Yes, the receiving thread is the one meant to receive the SIGILL. I agree that the changes introduce some redundancy, but it is difficult to tell from the thread's call stack alone that it was killed by the handshake timeout mechanism. I found it by looking at the events log; see the discussion in JBS.


> * Is the receiving thread not the originally intended recipient? But how? This can only happen either if the original recipient thread blocked - which we don't do in hotspot code AFAIK, so it could only be a library method that temporarily sets a signal mask - or if there is a bug in the sending code - in which case we should fix it?

I think I already described a possible situation: if the receiver does not report the crash within 3 seconds, a fatal error is reported by the calling thread. However, some other thread may receive a SIGILL for an unrelated reason within that time interval. By then the "busy" thread has already been recorded in the "communicative" variable, and in this particular case it will not be the signal receiver. I do not know whether this situation is purely hypothetical or has ever occurred in practice.
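
To make the three cases concrete, here is a minimal sketch of the classification. It is a simplified model, not the HotSpot code or the code in this PR: `ThreadId`, `g_timed_out_target`, `record_timeout_target` and `describe_sigill` are hypothetical names, and a plain atomic stands in for whatever variable holds the thread that timed out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model only; none of these names exist in HotSpot.
using ThreadId = std::uint64_t;
constexpr ThreadId kNoThread = 0;

// Thread that the timeout handler decided to signal (kNoThread if none).
std::atomic<ThreadId> g_timed_out_target{kNoThread};

// Called by the (modelled) timeout handler just before it sends SIGILL.
void record_timeout_target(ThreadId target) {
  assert(target != kNoThread);
  g_timed_out_target.store(target);
}

// Called by the (modelled) error reporter on the thread that got SIGILL.
std::string describe_sigill(ThreadId receiver) {
  const ThreadId target = g_timed_out_target.load();
  if (target == kNoThread) {
    return "unrelated";        // no timeout pending at all
  }
  if (target == receiver) {
    return "timeout-target";   // the intended recipient crashed
  }
  return "other-thread";       // a different thread crashed while a timeout was pending
}
```

Only in the last case would the recorded thread and the signal receiver differ, which is the situation described above.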

> * Is the SIGILL completely unrelated to the safepoint? Then why print the information?

No, it is intentionally fired by the timeout handler. To quote Mr. Shipilev from the issue discussion: "The intent for SIGILL is to trigger the crash at the thread that blocks handshake/safepoint sync. E.g. a Java thread that is stuck on miscompiled loop without safepoint checks. Or some VM code that spins without VM transitions. See [JDK-8219584](https://bugs.openjdk.org/browse/JDK-8219584). This feature is remarkably useful in the field, used this dozens of times. So whatever we do, we need to keep printing the instructions block and hopefully a backtrace."

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26309#issuecomment-3089036110


More information about the hotspot-runtime-dev mailing list