RFR(XXS): 8007312: Signal Dispatcher thread to start and register ctrl-break handler before TRACE_INITIALIZE

David Holmes david.holmes at oracle.com
Fri Feb 1 04:03:54 PST 2013


I think java.lang.Terminator installs the handler sometime during class 
library initialization. This in itself seems wrong, if that is happening 
before the VM has initialized the signal handling subsystem.

David

On 1/02/2013 9:56 PM, Markus Grönlund wrote:
> Thanks David. Inline.
>
> -----Original Message-----
> From: David Holmes
> Sent: den 1 februari 2013 12:32
> To: Markus Grönlund
> Cc: hotspot-dev at openjdk.java.net
> Subject: Re: RFR(XXS): 8007312: Signal Dispatcher thread to start and register ctrl-break handler before TRACE_INITIALIZE
>
> Hi Markus,
>
> I appreciate the change of approach. Can we not check if the initialization has occurred and so keep the assert where it already exists eg:
>
> void os::signal_notify(int signal_number) {
>     BOOL ret;
> + if (sig_sem != NULL) {
>     Atomic::inc(&pending_signals[signal_number]);
>     ret = ::ReleaseSemaphore(sig_sem, 1, NULL);
>     assert(ret != 0, "ReleaseSemaphore() failed");
> + }
> }
> [MG]
> We could absolutely do this for Windows (a handle NULL check is easy to do). More unsure how to validate an eventual uninitialized semaphore of the assert for Solaris though where I would need more state tracking to keep track of the initialization state of the sema_t. That said, I am not even sure this is a real issue on Solaris, because I haven't been able to test with a debug build there yet. Let's take Solaris as a separate step, and resolve the Windows issue first.
>
> That said I'm unclear how on Solaris we get into this code. You said on Windows the C runtime fault handler causes things to get directed back to UserHandler. But on linux/Solaris I did not expect any VM handler to be installed at this point so the default signal response would occur - to kill the process. The shared code does this at the end of
> os::signal_init()
>
>    os::signal(SIGBREAK, os::user_handler())
>
> So I'm not sure how we end up in UserHandler() ??
> [MG]
> You put your finger on the spot here - this is actually my main confusion at this point as well (and I agree with you description, I wonder how we can ever end up in UserHandler() even though we haven't called SetConsoleCtrlHandler() on it....apparently its registered:
>
> 04a4fdd0 07b2242c 00000002 04a4fe18 7851afbb jvm!os::signal_notify+0x4 [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os\windows\vm\os_windows.cpp @ 1961]
> 04a4fddc 7851afbb 00000002 a0bf8b55 00000000 jvm!UserHandler+0xc [d:\hotspot_src\hsx\24\hotspot_31_jan\src\os\windows\vm\os_windows.cpp @ 1845]
> 04a4fe18 764ae3d8 00000000 aa5b288e 00000000 msvcr100!ctrlevent_capture+0x90 [f:\dd\vctools\crt_bld\self_x86\crt\src\winsig.c @ 130]<<-- get the C Runtime default ctrlevent_capture to call us back (I guess this is normal linked list of signal handlers passing down), but our callback has not been setup up proper yet? At least not from the VM. I
>
> 04a4fea4 7647ed6c 00000000 04a4fef0 7793377b kernel32!CtrlRoutine+0x126
> 04a4feb0 7793377b 00000000 769482db 00000000 kernel32!BaseThreadInitThunk+0xe
> 04a4fef0 7793374e 76494a03 00000000 00000000 ntdll!__RtlUserThreadStart+0x70
> 04a4ff08 00000000 76494a03 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b
>
> I only find a single call to SetConsoleCtrlHandler() but that does not register UserHandler(), it registers consoleHandler()
>
> Unless...maybe the java.exe launcher is doing some signal registration here? But how would it be possible to register a callback in the jvm.dll? hmm...Very interesting, I'm going to find out.. stay tuned...
>
> Cheers
> Markus
>
> David
>
> On 1/02/2013 6:20 PM, Markus Grönlund wrote:
>> Hi David,
>>
>> Many thanks for your input.
>>
>> I do acknowledge the startup sequence is fragile, and this is quite tricky business - you raise good points.
>>
>>   From what I can read/test/trace, we are ready to start up the Signal Dispatcher Java thread at an earlier point (technically from a VM perspective), but as you say, this is mainly based on the best (worst?) theoretical case (and a bit of hands on a few different platforms) - in addition, if we take into account what expectations current testing has in this area (which I am unsure of at this point), there is now a change introduced between JVMTI_EVENT_VM_START and JVMTI_EVENT_JVMTI_PHASE_LIVE - and who knows what kind of test issues this might be causing...could be found out, but with a lot of effort...
>>
>> I see that the only OS:s where the correct behavior is actually checked are Solaris and Windows (asserts).
>>
>> I think we need the first stage TRACE_INITIALIZE before set_init_completed() and JVMTI_EVENT_JVMTI_PHASE_LIVE (don't remember the actual rationale off the top of my head), but I can verify if it can move later...
>>
>> Since this is a very hard area to get right, as you say, there will always be window where this is always an issue...I was considering a more pragmatic means just to resolve this particular issue in a somewhat timely manner (even though it's not my preferred choice):
>>
>> If we lax up (remove) the asserts os::signal_notify() on the return from semaphore signaling (sema_post() solaris and ReleaseSemaphore() windows), this should cease being an issue (we then get the same behavior as PRODUCT builds for non-product builds (do nothing)). ( btw it's not an error to close a NULL handle on Windows, but if you do it in debug mode you get told about it).
>>
>> I never like removing asserts however, but this seems like a feasible (time/effort/risk) resolution to this problem.
>>
>> Appreciate your comments as always, let me know what you think of this.
>>
>> Cheers
>> Markus
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: David Holmes
>> Sent: den 1 februari 2013 05:47
>> To: Markus Grönlund
>> Cc: hotspot-dev at openjdk.java.net
>> Subject: Re: RFR(XXS): 8007312: Signal Dispatcher thread to start and
>> register ctrl-break handler before TRACE_INITIALIZE
>>
>> Hi Markus,
>>
>> I understand that there is a bug if ctrl-C is pressed before we have initalized the signal handler. I also realize that normally the chance of hitting that window is negligible, but with TRACE_INITIALIZE active the window becomes much larger.
>>
>> But as I wrote in the bug report the VM initialization sequence is very fragile and changes to it have to be made with great care. Basic tests only test the normal startup path. Problems arise where we use specific VM flags (eg tracing) with specific VM services (specific GC) and even a specific VM (fastdebug).
>>
>> In this case I can't convince myself it is safe for the ctrl-C handler to be setup at this earlier point in time. We are still in the initialization phase of the VM and the handler code has to make Java calls to create and start a Java signal handling thread. Are we ready to execute that Java code at this point in time - before set_init_completed()? What might go wrong? What if the Java code hits an exception?
>>
>> Can you test your fix by adding a sleep after os::signal_init and sending ctrl-C?
>>
>> Can you emulate this problem on Linux/Solaris by adding a sleep before the original os::signal_init() ? Does it crash there too, or is this a problem limited to Windows? If so the fix should be limited to Windows.
>>
>> Can you consider moving TRACE_INITIALIZE instead?
>>
>> Thanks,
>> David
>>
>> On 1/02/2013 7:22 AM, Markus Grönlund wrote:
>>> Greetings,
>>>
>>>
>>>
>>> Can I kindly ask for a few reviews and a putback sponsorship for the following change:
>>>
>>>
>>>
>>> Bugid: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007312
>>>
>>> Webrev: http://cr.openjdk.java.net/~mgronlun/8007312/webrev01/
>>>
>>> Hotspot JPRT builds/tests : completed successfully
>>>
>>>
>>>
>>> Comment on issue/fix:
>>>
>>>
>>>
>>> (windows analysis)
>>>
>>>
>>>
>>> Pressing ctrl-c before Hotspot signal/console handler has been registered actually asserts/stops the VM (which to the user appears like a crash) on non-product builds.
>>>
>>>
>>>
>>> Before Hotspot registers its own jvm!consoleHandler with kernel32!CtrlRoutine, the C runtime default msvcr100!ctrlevent_capture is implicitly used - this calls back into jvm!UserHandler, which forwards into os::signal_notify() which uses uninitialized variables.
>>>
>>>
>>>
>>> // on Windows this creates the following issue when closing a NULL
>>> handle to a  semaphore
>>>
>>>
>>>
>>> void os::signal_notify(int signal_number) {
>>>
>>>      BOOL ret;
>>>
>>>
>>>
>>>      Atomic::inc(&pending_signals[signal_number]);
>>>
>>>      ret = ::ReleaseSemaphore(sig_sem, 1, NULL);<<--- call
>>> ReleaseSemaphore on global handle sig_sem which has not been
>>> setup/created yet == is NULL (is created in os::signal_init_pd())
>>>
>>>      assert(ret != 0, "ReleaseSemaphore() failed");<<-- assert traps
>>> here (GetLastError() == 0xc0000008 - An invalid HANDLE was specified)
>>>
>>>
>>>
>>> }
>>>
>>>
>>>
>>> The short initial tests I have done locally so far on moving the signal handler registration earlier as suggested in this patch makes it very hard to issue crtl-c before Hotspot signal handler has been setup.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Markus


More information about the hotspot-dev mailing list