RFR: JDK-8319633: runtime/posixSig/TestPosixSig.java intermittent timeouts on AIX

Thomas Stuefe stuefe at openjdk.org
Thu Nov 23 15:12:06 UTC 2023


On Thu, 23 Nov 2023 14:35:11 GMT, Joachim Kern <jkern at openjdk.org> wrote:

> Every 1-2 weeks we run into timeouts when running jtreg test runtime/posixSig/TestPosixSig.java on AIX.
> The thread stack shows that we are in line 54 of TestPosixSig.java.
> 
> The reason is the following: the test registers a new dummy signal handler for SIGILL without delegating to the previous handler in the chain. If the VM then calls a Java method marked as not-entrant, a SIGILL is raised (at least on PPC64). Because the registered handler does not handle it, the SIGILL is raised again and again, endlessly.
> One solution would be to add a delegation to the hotspot signal handler, which is the previous handler in the chain.
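
For illustration, the delegation proposed above could look roughly like the following. This is a minimal sketch, not the test's actual native code: `original_handler` is a stand-in for the hotspot handler, and all function and variable names are hypothetical. The only real API used is POSIX `sigaction`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Disposition that was active before our handler was installed. */
static struct sigaction previous_action;

/* Counters for illustration only. */
static volatile sig_atomic_t own_handler_calls = 0;
static volatile sig_atomic_t original_calls = 0;

/* Hypothetical stand-in for the handler that was installed first
 * (in the real scenario this would be the VM's handler). */
static void original_handler(int sig) {
    (void)sig;
    original_calls++;
}

/* The new handler: does its own work, then delegates to the previous
 * handler instead of swallowing the signal. */
static void chaining_handler(int sig, siginfo_t *info, void *ucontext) {
    own_handler_calls++;

    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction != NULL) {
            previous_action.sa_sigaction(sig, info, ucontext);
        }
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(sig);
    }
    /* SIG_DFL and SIG_IGN are not callable, so nothing is delegated then. */
}

/* Installs original_handler, playing the role of the earlier handler. */
int install_original_handler(int sig) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    sigemptyset(&act.sa_mask);
    act.sa_handler = original_handler;
    return sigaction(sig, &act, NULL);
}

/* Installs chaining_handler and remembers the previous disposition. */
int install_chaining_handler(int sig) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    sigemptyset(&act.sa_mask);
    act.sa_sigaction = chaining_handler;
    act.sa_flags = SA_SIGINFO;
    return sigaction(sig, &act, &previous_action);
}
```

Calling `install_original_handler(SIGUSR1)`, then `install_chaining_handler(SIGUSR1)`, then `raise(SIGUSR1)` runs both handlers once. Without the delegation step, the earlier handler would never see the signal, which is the situation the test creates for SIGILL.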

Good catch. I am surprised that this does not happen more often.

This is not an AIX issue. Please change the JBS ticket (title, os, and description) and PR title to make it a general issue on all *nixes. 

The issue is that this test wants to check that the periodic JNI checker catches modified signal handlers (if a native app uses signals, it must use the signal interposition library). This is racy - there is a time window between setting the handler and the VM noticing it; any signal we receive during that time will not be processed by the VM.
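
To make the time window concrete, here is a rough sketch of how a polling check for a replaced handler can work. It is not HotSpot's implementation; the function names are hypothetical and only POSIX `sigaction` is assumed. A checker takes a snapshot of the expected handler and later compares it with whatever is currently installed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Dummy handler, similar in spirit to what a misbehaving native
 * library might install: it does nothing and delegates nothing. */
static void dummy_handler(int sig) {
    (void)sig;
}

/* Compares the handler entry points of two dispositions. */
static int same_handler(const struct sigaction *a, const struct sigaction *b) {
    int a_info = (a->sa_flags & SA_SIGINFO) != 0;
    int b_info = (b->sa_flags & SA_SIGINFO) != 0;
    if (a_info != b_info) {
        return 0;
    }
    return a_info ? (a->sa_sigaction == b->sa_sigaction)
                  : (a->sa_handler == b->sa_handler);
}

/* Records the disposition currently installed for sig. */
int snapshot_handler(int sig, struct sigaction *out) {
    return sigaction(sig, NULL, out);
}

/* Returns 1 if the handler differs from the snapshot, 0 if it is
 * unchanged, and -1 if the query failed. A periodic checker would
 * call this once per interval. */
int handler_was_replaced(int sig, const struct sigaction *expected) {
    struct sigaction cur;
    if (sigaction(sig, NULL, &cur) != 0) {
        return -1;
    }
    return same_handler(&cur, expected) ? 0 : 1;
}

/* Replaces the handler for sig with dummy_handler. */
int install_dummy_handler(int sig) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    sigemptyset(&act.sa_mask);
    act.sa_handler = dummy_handler;
    return sigaction(sig, &act, NULL);
}
```

Any signal delivered after `install_dummy_handler` returns and before the next `handler_was_replaced` poll goes to the dummy handler only. That gap is the race described above, and a shorter polling interval narrows it but cannot close it.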

This issue highlights an inherent problem with the JNI checker. Maybe we should increase its frequency, but at 10ms it is already quite high.

My solution would have been to simply use a signal that is monitored by the JNI checker but not used in normal operation, one that is only expected to be triggered from outside. I would have replaced SIGILL with SIGQUIT, which is used to trigger a thread dump.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16797#issuecomment-1824589091

