RFR(XXS): 8007312: Signal Dispatcher thread to start and register ctrl-break handler before TRACE_INITIALIZE

Markus Grönlund markus.gronlund at oracle.com
Fri Feb 1 00:20:43 PST 2013


Hi David,

Many thanks for your input.

I do acknowledge the startup sequence is fragile, and this is quite tricky business - you raise good points.

>From what I can read/test/trace, we are ready to start up the Signal Dispatcher Java thread at an earlier point (technically from a VM perspective), but as you say, this is mainly based on the best (worst?) theoretical case (and a bit of hands on a few different platforms) - in addition, if we take into account what expectations current testing has in this area (which I am unsure of at this point), there is now a change introduced between JVMTI_EVENT_VM_START and JVMTI_EVENT_JVMTI_PHASE_LIVE - and who knows what kind of test issues this might be causing...could be found out, but with a lot of effort...

I see that the only OS:s where the correct behavior is actually checked are Solaris and Windows (asserts).

I think we need the first stage TRACE_INITIALIZE before set_init_completed() and JVMTI_EVENT_JVMTI_PHASE_LIVE (don't remember the actual rationale off the top of my head), but I can verify if it can move later...

Since this is a very hard area to get right, as you say, there will always be window where this is always an issue...I was considering a more pragmatic means just to resolve this particular issue in a somewhat timely manner (even though it's not my preferred choice):

If we lax up (remove) the asserts os::signal_notify() on the return from semaphore signaling (sema_post() solaris and ReleaseSemaphore() windows), this should cease being an issue (we then get the same behavior as PRODUCT builds for non-product builds (do nothing)). ( btw it's not an error to close a NULL handle on Windows, but if you do it in debug mode you get told about it).

I never like removing asserts however, but this seems like a feasible (time/effort/risk) resolution to this problem.

Appreciate your comments as always, let me know what you think of this.

Cheers
Markus 

 



-----Original Message-----
From: David Holmes 
Sent: den 1 februari 2013 05:47
To: Markus Grönlund
Cc: hotspot-dev at openjdk.java.net
Subject: Re: RFR(XXS): 8007312: Signal Dispatcher thread to start and register ctrl-break handler before TRACE_INITIALIZE

Hi Markus,

I understand that there is a bug if ctrl-C is pressed before we have initalized the signal handler. I also realize that normally the chance of hitting that window is negligible, but with TRACE_INITIALIZE active the window becomes much larger.

But as I wrote in the bug report the VM initialization sequence is very fragile and changes to it have to be made with great care. Basic tests only test the normal startup path. Problems arise where we use specific VM flags (eg tracing) with specific VM services (specific GC) and even a specific VM (fastdebug).

In this case I can't convince myself it is safe for the ctrl-C handler to be setup at this earlier point in time. We are still in the initialization phase of the VM and the handler code has to make Java calls to create and start a Java signal handling thread. Are we ready to execute that Java code at this point in time - before set_init_completed()? What might go wrong? What if the Java code hits an exception?

Can you test your fix by adding a sleep after os::signal_init and sending ctrl-C?

Can you emulate this problem on Linux/Solaris by adding a sleep before the original os::signal_init() ? Does it crash there too, or is this a problem limited to Windows? If so the fix should be limited to Windows.

Can you consider moving TRACE_INITIALIZE instead?

Thanks,
David

On 1/02/2013 7:22 AM, Markus Grönlund wrote:
> Greetings,
>
>
>
> Can I kindly ask for a few reviews and a putback sponsorship for the following change:
>
>
>
> Bugid: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007312
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8007312/webrev01/
>
> Hotspot JPRT builds/tests : completed successfully
>
>
>
> Comment on issue/fix:
>
>
>
> (windows analysis)
>
>
>
> Pressing ctrl-c before Hotspot signal/console handler has been registered actually asserts/stops the VM (which to the user appears like a crash) on non-product builds.
>
>
>
> Before Hotspot registers its own jvm!consoleHandler with kernel32!CtrlRoutine, the C runtime default msvcr100!ctrlevent_capture is implicitly used - this calls back into jvm!UserHandler, which forwards into os::signal_notify() which uses uninitialized variables.
>
>
>
> // on Windows this creates the following issue when closing a NULL 
> handle to a  semaphore
>
>
>
> void os::signal_notify(int signal_number) {
>
>    BOOL ret;
>
>
>
>    Atomic::inc(&pending_signals[signal_number]);
>
>    ret = ::ReleaseSemaphore(sig_sem, 1, NULL);<<--- call 
> ReleaseSemaphore on global handle sig_sem which has not been 
> setup/created yet == is NULL (is created in os::signal_init_pd())
>
>    assert(ret != 0, "ReleaseSemaphore() failed");<<-- assert traps 
> here (GetLastError() == 0xc0000008 - An invalid HANDLE was specified)
>
>
>
> }
>
>
>
> The short initial tests I have done locally so far on moving the signal handler registration earlier as suggested in this patch makes it very hard to issue crtl-c before Hotspot signal handler has been setup.
>
>
>
> Thanks
>
> Markus


More information about the hotspot-dev mailing list