RFR(XXS): 8007312: Signal Dispatcher thread to start and register ctrl-break handler before TRACE_INITIALIZE
David Holmes
david.holmes at oracle.com
Fri Feb 1 05:40:55 PST 2013
On 1/02/2013 10:27 PM, Markus Grönlund wrote:
> David,
>
> 0171f3a0 56bdb160 56bdaf2b 00000001 3dd4dde1 kernel32!SetConsoleCtrlHandler
> 0171f3e0 074b248c 00000002 074b2420 00000000 msvcr100!signal+0x125 [f:\dd\vctools\crt_bld\self_x86\crt\src\winsig.c @ 256]
> 0171f3f4 074af2a7 00000002 074b2420 00000000 jvm!os::signal+0x3c [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os\windows\vm\os_windows.cpp @ 1860]
> 0171f44c 6a0b4748 00000002 00000002 0193de46 jvm!JVM_RegisterSignal+0x1b7 [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os\windows\vm\jvm_windows.cpp @ 68]
> WARNING: Stack unwind information not available. Following frames may be wrong.
> 0171f58c 078f860d 0171f5ec 0171f760 0000000a java_6a0b0000!Java_sun_misc_Signal_handle0+0xd
> 0171f630 074bb2ef 0171f758 0171f6a0 0171f6f0 jvm!JavaCalls::call_helper+0x3dd [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\runtime\javacalls.cpp @ 415]
> 0171f678 078f8222 078f8230 0171f758 0171f6a0 jvm!os::os_exception_wrapper+0x9f [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os_cpu\windows_x86\vm\os_windows_x86.cpp @ 113]
> 0171f694 078f80db 0171f758 0173f6d0 0171f6f0 jvm!JavaCalls::call+0x52 [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\runtime\javacalls.cpp @ 320]
> 0171f6d0 078f811b 0171f758 0173f6d4 05fa6aa8 jvm!JavaCalls::call_static+0x9b [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\runtime\javacalls.cpp @ 286]
> 0171f738 07936ccb 0171f758 0173f6cc 05fa6aa8 jvm!JavaCalls::call_static+0x2b [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\runtime\javacalls.cpp @ 292]
> 0171f770 079361bd 0173ec00 7ffde000 77d37894 jvm!call_initializeSystemClass+0x6b [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\runtime\thread.cpp @ 1022]
> 0171f974 0785232f 0171f9d4 0171f9bb 00000000 jvm!Threads::create_vm+0xcdd [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\runtime\thread.cpp @ 3497]
> 0171f9bc 011713c1 0171fa28 0171fa34 0171f9d4 jvm!JNI_CreateJavaVM+0x7f [d:\hotspot_src\hsx\24\hotspot_31_jan\src\share\vm\prims\jni.cpp @ 5134]
>
> Registration of UserHandler() actually comes in as an effect of call_initializeSystemClass() during VM startup...
Yes via Terminator:
// Setup Java signal handlers for HUP, TERM, and INT (where
available).
Terminator.setup();
> I agree this seems wrong as we are not ready at this point.
Not sure we can really do anything about it though - we'd need to split
the Java level initialization into two steps at least. Even then it may
be impossible to get rid of all cycles in the initialization dependencies.
So I guess the VM can only note that the handler can be installed before
it is ready and watch for these early invocations. Or (more complex) the
VM could simply note the request to install the signal handler and defer
the actual installation to a later point. But all that is again
disruptive to the initialization process.
Which brings us back to simply checking for not being initialized.
David
>
> /Markus
>
>
> -----Original Message-----
> From: David Holmes
> Sent: den 1 februari 2013 13:04
> To: Markus Grönlund
> Cc: hotspot-dev at openjdk.java.net
> Subject: Re: RFR(XXS): 8007312: Signal Dispatcher thread to start and register ctrl-break handler before TRACE_INITIALIZE
>
> I think java.lang.Terminator installs the handler sometime during class library initialization. This in itself seems wrong, if that is happening before the VM has initialized the signal handling subsystem.
>
> David
>
> On 1/02/2013 9:56 PM, Markus Grönlund wrote:
>> Thanks David. Inline.
>>
>> -----Original Message-----
>> From: David Holmes
>> Sent: den 1 februari 2013 12:32
>> To: Markus Grönlund
>> Cc: hotspot-dev at openjdk.java.net
>> Subject: Re: RFR(XXS): 8007312: Signal Dispatcher thread to start and
>> register ctrl-break handler before TRACE_INITIALIZE
>>
>> Hi Markus,
>>
>> I appreciate the change of approach. Can we not check if the initialization has occurred and so keep the assert where it already exists eg:
>>
>> void os::signal_notify(int signal_number) {
>> BOOL ret;
>> + if (sig_sem != NULL) {
>> Atomic::inc(&pending_signals[signal_number]);
>> ret = ::ReleaseSemaphore(sig_sem, 1, NULL);
>> assert(ret != 0, "ReleaseSemaphore() failed");
>> + }
>> }
>> [MG]
>> We could absolutely do this for Windows (a handle NULL check is easy to do). More unsure how to validate an eventual uninitialized semaphore of the assert for Solaris though where I would need more state tracking to keep track of the initialization state of the sema_t. That said, I am not even sure this is a real issue on Solaris, because I haven't been able to test with a debug build there yet. Let's take Solaris as a separate step, and resolve the Windows issue first.
>>
>> That said I'm unclear how on Solaris we get into this code. You said
>> on Windows the C runtime fault handler causes things to get directed
>> back to UserHandler. But on linux/Solaris I did not expect any VM
>> handler to be installed at this point so the default signal response
>> would occur - to kill the process. The shared code does this at the
>> end of
>> os::signal_init()
>>
>> os::signal(SIGBREAK, os::user_handler())
>>
>> So I'm not sure how we end up in UserHandler() ??
>> [MG]
>> You put your finger on the spot here - this is actually my main confusion at this point as well (and I agree with you description, I wonder how we can ever end up in UserHandler() even though we haven't called SetConsoleCtrlHandler() on it....apparently its registered:
>>
>> 04a4fdd0 07b2242c 00000002 04a4fe18 7851afbb jvm!os::signal_notify+0x4
>> [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os\windows\vm\os_windows.cpp
>> @ 1961] 04a4fddc 7851afbb 00000002 a0bf8b55 00000000
>> jvm!UserHandler+0xc
>> [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os\windows\vm\os_windows.cpp
>> @ 1845]
>> 04a4fe18 764ae3d8 00000000 aa5b288e 00000000
>> msvcr100!ctrlevent_capture+0x90
>> [f:\dd\vctools\crt_bld\self_x86\crt\src\winsig.c @ 130]<<-- get the C
>> Runtime default ctrlevent_capture to call us back (I guess this is
>> normal linked list of signal handlers passing down), but our callback
>> has not been setup up proper yet? At least not from the VM. I
>>
>> 04a4fea4 7647ed6c 00000000 04a4fef0 7793377b
>> kernel32!CtrlRoutine+0x126
>> 04a4feb0 7793377b 00000000 769482db 00000000
>> kernel32!BaseThreadInitThunk+0xe
>> 04a4fef0 7793374e 76494a03 00000000 00000000
>> ntdll!__RtlUserThreadStart+0x70
>> 04a4ff08 00000000 76494a03 00000000 00000000
>> ntdll!_RtlUserThreadStart+0x1b
>>
>> I only find a single call to SetConsoleCtrlHandler() but that does not
>> register UserHandler(), it registers consoleHandler()
>>
>> Unless...maybe the java.exe launcher is doing some signal registration here? But how would it be possible to register a callback in the jvm.dll? hmm...Very interesting, I'm going to find out.. stay tuned...
>>
>> Cheers
>> Markus
>>
>> David
>>
>> On 1/02/2013 6:20 PM, Markus Grönlund wrote:
>>> Hi David,
>>>
>>> Many thanks for your input.
>>>
>>> I do acknowledge the startup sequence is fragile, and this is quite tricky business - you raise good points.
>>>
>>> From what I can read/test/trace, we are ready to start up the Signal Dispatcher Java thread at an earlier point (technically from a VM perspective), but as you say, this is mainly based on the best (worst?) theoretical case (and a bit of hands on a few different platforms) - in addition, if we take into account what expectations current testing has in this area (which I am unsure of at this point), there is now a change introduced between JVMTI_EVENT_VM_START and JVMTI_EVENT_JVMTI_PHASE_LIVE - and who knows what kind of test issues this might be causing...could be found out, but with a lot of effort...
>>>
>>> I see that the only OS:s where the correct behavior is actually checked are Solaris and Windows (asserts).
>>>
>>> I think we need the first stage TRACE_INITIALIZE before set_init_completed() and JVMTI_EVENT_JVMTI_PHASE_LIVE (don't remember the actual rationale off the top of my head), but I can verify if it can move later...
>>>
>>> Since this is a very hard area to get right, as you say, there will always be window where this is always an issue...I was considering a more pragmatic means just to resolve this particular issue in a somewhat timely manner (even though it's not my preferred choice):
>>>
>>> If we lax up (remove) the asserts os::signal_notify() on the return from semaphore signaling (sema_post() solaris and ReleaseSemaphore() windows), this should cease being an issue (we then get the same behavior as PRODUCT builds for non-product builds (do nothing)). ( btw it's not an error to close a NULL handle on Windows, but if you do it in debug mode you get told about it).
>>>
>>> I never like removing asserts however, but this seems like a feasible (time/effort/risk) resolution to this problem.
>>>
>>> Appreciate your comments as always, let me know what you think of this.
>>>
>>> Cheers
>>> Markus
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: David Holmes
>>> Sent: den 1 februari 2013 05:47
>>> To: Markus Grönlund
>>> Cc: hotspot-dev at openjdk.java.net
>>> Subject: Re: RFR(XXS): 8007312: Signal Dispatcher thread to start and
>>> register ctrl-break handler before TRACE_INITIALIZE
>>>
>>> Hi Markus,
>>>
>>> I understand that there is a bug if ctrl-C is pressed before we have initalized the signal handler. I also realize that normally the chance of hitting that window is negligible, but with TRACE_INITIALIZE active the window becomes much larger.
>>>
>>> But as I wrote in the bug report the VM initialization sequence is very fragile and changes to it have to be made with great care. Basic tests only test the normal startup path. Problems arise where we use specific VM flags (eg tracing) with specific VM services (specific GC) and even a specific VM (fastdebug).
>>>
>>> In this case I can't convince myself it is safe for the ctrl-C handler to be setup at this earlier point in time. We are still in the initialization phase of the VM and the handler code has to make Java calls to create and start a Java signal handling thread. Are we ready to execute that Java code at this point in time - before set_init_completed()? What might go wrong? What if the Java code hits an exception?
>>>
>>> Can you test your fix by adding a sleep after os::signal_init and sending ctrl-C?
>>>
>>> Can you emulate this problem on Linux/Solaris by adding a sleep before the original os::signal_init() ? Does it crash there too, or is this a problem limited to Windows? If so the fix should be limited to Windows.
>>>
>>> Can you consider moving TRACE_INITIALIZE instead?
>>>
>>> Thanks,
>>> David
>>>
>>> On 1/02/2013 7:22 AM, Markus Grönlund wrote:
>>>> Greetings,
>>>>
>>>>
>>>>
>>>> Can I kindly ask for a few reviews and a putback sponsorship for the following change:
>>>>
>>>>
>>>>
>>>> Bugid: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007312
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~mgronlun/8007312/webrev01/
>>>>
>>>> Hotspot JPRT builds/tests : completed successfully
>>>>
>>>>
>>>>
>>>> Comment on issue/fix:
>>>>
>>>>
>>>>
>>>> (windows analysis)
>>>>
>>>>
>>>>
>>>> Pressing ctrl-c before Hotspot signal/console handler has been registered actually asserts/stops the VM (which to the user appears like a crash) on non-product builds.
>>>>
>>>>
>>>>
>>>> Before Hotspot registers its own jvm!consoleHandler with kernel32!CtrlRoutine, the C runtime default msvcr100!ctrlevent_capture is implicitly used - this calls back into jvm!UserHandler, which forwards into os::signal_notify() which uses uninitialized variables.
>>>>
>>>>
>>>>
>>>> // on Windows this creates the following issue when closing a NULL
>>>> handle to a semaphore
>>>>
>>>>
>>>>
>>>> void os::signal_notify(int signal_number) {
>>>>
>>>> BOOL ret;
>>>>
>>>>
>>>>
>>>> Atomic::inc(&pending_signals[signal_number]);
>>>>
>>>> ret = ::ReleaseSemaphore(sig_sem, 1, NULL);<<--- call
>>>> ReleaseSemaphore on global handle sig_sem which has not been
>>>> setup/created yet == is NULL (is created in os::signal_init_pd())
>>>>
>>>> assert(ret != 0, "ReleaseSemaphore() failed");<<-- assert traps
>>>> here (GetLastError() == 0xc0000008 - An invalid HANDLE was
>>>> specified)
>>>>
>>>>
>>>>
>>>> }
>>>>
>>>>
>>>>
>>>> The short initial tests I have done locally so far on moving the signal handler registration earlier as suggested in this patch makes it very hard to issue crtl-c before Hotspot signal handler has been setup.
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Markus
More information about the hotspot-dev
mailing list