RFR(M): 8219584: Try to dump error file by thread which causes safepoint timeout

Doerr, Martin martin.doerr at sap.com
Fri Feb 22 15:36:04 UTC 2019


Hi all,

The VM supports the diagnostic flags -XX:+SafepointTimeout and -XX:+AbortVMOnSafepointTimeout to detect safepoint synchronization timeouts and to exit with an error message.
However, we usually don't see what the thread which didn't reach the safepoint was doing.
We can get a more helpful hs_err file if we kill that thread and let it dump the hs_err file itself.

My proposal does the following:

  1.  Introduce a function for sending a signal to another thread (not for Windows).
  2.  If possible, send a SIGILL to the thread which didn't reach the safepoint.
  3.  Make SafepointALot a diagnostic flag instead of a develop flag so that it can be used together with SafepointTimeout.
  4.  Extend error reporting to make it easy to recognize if the thread was killed by another thread.
  5.  Add a jtreg test.
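
To illustrate items 1 and 2 outside of the VM, here is a minimal standalone sketch for POSIX systems. It is not the code from the webrev: all names (signal_thread, busy_worker, demo_signal_blocking_thread) are made up for this example, and the handler only records that the signal arrived, whereas the VM's handler would go on to write the hs_err file. It assumes pthread_kill is available, which is why Windows is excluded.

```cpp
#include <pthread.h>
#include <signal.h>
#include <atomic>
#include <cstring>

// Hypothetical names throughout; this is a sketch, not the webrev code.
static volatile sig_atomic_t g_signal_seen = 0;
static std::atomic<bool> g_worker_running{false};

// Runs on the thread that received the signal. The VM would start
// error reporting here; the sketch only records the delivery.
static void on_sigill(int, siginfo_t*, void*) {
  g_signal_seen = 1;
}

// Stand-in for a Java thread stuck in a loop that never reaches a safepoint.
static void* busy_worker(void*) {
  g_worker_running.store(true);
  while (!g_signal_seen) { /* spin until the signal handler has run */ }
  return nullptr;
}

// Item 1: send a signal to one specific thread. Returns true on success.
static bool signal_thread(pthread_t target, int sig) {
  return pthread_kill(target, sig) == 0;
}

// Item 2: signal the thread that is "blocking the safepoint".
// Returns true if the worker observed the signal.
bool demo_signal_blocking_thread() {
  struct sigaction sa;
  std::memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = on_sigill;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGILL, &sa, nullptr) != 0) return false;

  pthread_t worker;
  if (pthread_create(&worker, nullptr, busy_worker, nullptr) != 0) return false;
  while (!g_worker_running.load()) { /* wait until the worker is spinning */ }

  bool sent = signal_thread(worker, SIGILL);
  if (!sent) g_signal_seen = 1;  // release the worker so join cannot hang
  pthread_join(worker, nullptr);
  return sent && g_signal_seen == 1;
}
```

The handler executes on the targeted thread, so any stack it reports is the stack of the thread that was stuck, which is the point of the proposal.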

Webrev:
http://cr.openjdk.java.net/~mdoerr/8219584_kill_thread_on_safepoint_timeout/webrev.00/


The test contains a long-running loop without a safepoint, compiled by C2. With the new enhancement, it produces the following hs_err output (excerpt):
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00003be1001f5fd5, pid=15329, tid=15330
#
# Signal was sent by thread with id 15339
# Reason: "blocking a safepoint"
#
...
# J 29 c2 TestAbortVMOnSafepointTimeout.test_loop(I)I (31 bytes) @ 0x000003ff7ae6d508 [0x000003ff7ae6d3c0+0x0000000000000148]
...
---------------  T H R E A D  ---------------

Current thread (0x0000000080039000):  JavaThread "main" [_thread_in_Java, id=15330, stack(0x000003ff7e000000,0x000003ff7e100000)]

Stack: [0x000003ff7e000000,0x000003ff7e100000],  sp=0x000003ff7e0fe778,  free space=1017k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 29 c2 TestAbortVMOnSafepointTimeout.test_loop(I)I (31 bytes) @ 0x000003ff7ae6d508 [0x000003ff7ae6d3c0+0x0000000000000148]
j  TestAbortVMOnSafepointTimeout.main([Ljava/lang/String;)V+6
v  ~StubRoutines::call_stub
V  [libjvm.so+0xb0957a]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x6b2
V  [libjvm.so+0xb08614]  JavaCalls::call(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x8c
...
Event: 1.558 Thread 0x00000000808a4000 sent signal 4 to Thread 0x0000000080039000 because blocking a safepoint.


Please review.

Best regards,
Martin


