[OpenJDK 2D-Dev] <AWT Dev> Request for review: 8026385: [macosx] (awt) setjmp/longjmp changes the process signal mask on OS X

Staffan Larsen staffan.larsen at oracle.com
Tue May 20 13:55:00 UTC 2014


On 20 maj 2014, at 10:34, Volker Simonis <volker.simonis at gmail.com> wrote:

> Hi everybody,
> 
> I've analyzed this issue on AIX (and HPUX) and came to the conclusion
> that it is not a problem on these platforms.
> 
> Just to make sure I got everything right, I'll summarize my
> understanding of the problem here:
> 
> On MacOS, setjmp() saves the signal mask of the current thread BUT
> restores the signal mask of the whole process after the corresponding
> longjmp(). This obviously changes the signal mask of all threads which
> had a different signal mask than the initial thread who called
> setjmp(). I could easily reproduce this by loading a class which
> provokes a verification exception in the un-fixed libverify.so.
> Afterwards, the VM-thread won't listen to SIGQUIT anymore because it
> now masks this signal in the same way like the Java thread which had
> the verification exception.
> 
> On AIX/HPUX setjmp()/longjmp() is defined in the same way like on
> MacOS in that it saves and restores the signal mask (and both
> platforms provide _setjmp()/_longjmp() versions as well which do not
> manipulate the signal mask). But while it is not clear from the
> documentation, all my tests showed that longjmp() only restores the
> signal mask of the current thread, which should be OK (i.e. I couldn't
> see a blocked VM thread which doesn't react on SIQUIT anymore).
> 
> So to cut a long story short - no action seems to be required to fix
> this issue on AIX:
> 
> As a side note I wonder why we didn't use sigsetjmp()/siglongjmp()
> with a zero 'savemask' argument to fix this problem? It is
> standardized by POSIX and would avoid the usage of "#ifdef __APPLE”.

It’s not clear to me if sigsetjmp()/siglongjmp() sets the signal mask on the process or on the pthread. The problem on OS X was that the VM sets up the sigprocmask on the pthreads (using pthread_sigmask()), but longjmp() restores the mask for the process (using sigprocmask()).


/Staffan


> 
> Thank you and best regards,
> Volker
> 
> 
> On Fri, May 16, 2014 at 6:04 PM, David DeHaven <david.dehaven at oracle.com> wrote:
>> 
>> I'd thought about that, since Apple borrowed most of it's underpinnings from BSD, but had no evidence to suggest it was necessary.
>> 
>> Anyways, at least you've identified and can rectify the issue :)
>> 
>> -DrD-
>> 
>>> Wow, sometimes it really makes sense to read apparently unrelated
>>> email-threads on Friday afternoon:)
>>> 
>>> It seems that AIX/HPUX have exactly the same problem like MacOS X.
>>> From the AIX setjmp man-page:
>>> 
>>> "The setjmp and longjmp subroutines save and restore the signal mask
>>> sigmask (2), while _setjmp and _longjmp manipulate only the stack
>>> context."
>>> 
>>> From the HPUXM setjmp man-page:
>>> 
>>> "setjmp(), longjmp()         These always save and restore the signal mask.
>>> _setjmp(), _longjmp()        These never manipulate the signal mask."
>>> 
>>> I think it doesn't make sense for you to wait pushing this until I
>>> provide the corresponding AIX patches because anyway I'll have to
>>> provide a fix not only for this issue but also for the other bugs you
>>> mentioned (i.e. 8023786 and 8023720). So I'll better create a new bug
>>> for AIX.
>>> 
>>> Thank you and best regards,
>>> Volker
>>> 
>>> On Fri, May 16, 2014 at 5:38 PM, David DeHaven <david.dehaven at oracle.com> wrote:
>>>> 
>>>> Thanks!
>>>> 
>>>> -DrD-
>>>> 
>>>>> The splashscreen changes look fine to me. Approved.
>>>>> 
>>>>> --
>>>>> best regards,
>>>>> Anthony
>>>>> 
>>>>> On 5/16/2014 7:18 PM, David DeHaven wrote:
>>>>>> 
>>>>>> Could someone on AWT team approve the splashscreen changes?
>>>>>> 
>>>>>> -DrD-
>>>>>> 
>>>>>>> Approved.
>>>>>>> 
>>>>>>> -phil.
>>>>>>> 
>>>>>>> On 5/15/2014 9:31 AM, David DeHaven wrote:
>>>>>>>> Ping!
>>>>>>>> 
>>>>>>>> Does this look OK?
>>>>>>>> 
>>>>>>>> I've also filed an issue against JavaFX:
>>>>>>>> https://javafx-jira.kenai.com/browse/RT-37125
>>>>>>>> 
>>>>>>>> -DrD-
>>>>>>>> 
>>>>>>>>>>>> I tried not modifying libpng but still ended up with lingering references to longjmp in pngread.o, despite libpng having png_ptr->longjmp_fn (bug in libpng?). pngread.c calls setjmp to set a default location to jump to in case the caller doesn't call setjmp, so if we continue down this path something in libpng must be modified. The only other option is to create our own setjmp.h and order it before /usr/include/setjmp.h, which seems dubious at best.
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm curious if the libpng changes are even needed since it's only used for splashscreen, which happens very early in the launch process. Also note that we didn't originally even call png_set_longjmp_fn, so any error should have resulted in an abort() instead of a call to longjmp... it appears we could retain the functionality we have today and #undef PNG_SETJMP_SUPPORTED (pngconf.h?). That would put the onus on developers to make sure their pngs don't have errors in them, or libsplashscreen will abort()...
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> That's an interesting question and the answer might extend to the splashscreen changes too.
>>>>>>>>>>> Its platform specific code and on MAC, the thread is created using pthreads directly and that
>>>>>>>>>>> thread goes away once splashscreen is done. But its running at the same time as the VM
>>>>>>>>>>> is booting up and creating threads and setting their signal masks. So I don't think you can
>>>>>>>>>>> guarantee that it won't mess up the masks on the JRE threads if the PNG is bad. And I'm
>>>>>>>>>>> also not sure you want to remove error handling from the library either.
>>>>>>>>>>> So a HIGHLY VISIBLE DO NOT REMOVE comment might be the best you can do here.
>>>>>>>>>> I have a better idea:
>>>>>>>>>> 
>>>>>>>>>> png_default_error is the only place where png_longjmp is called. We could call png_set_error_fn to set up our own error handler (for Mac only), compile with PNG_SETJMP_SUPPORTED unset so it doesn't pull in setjmp/longjmp and our own implementation of the error handler would call _longjmp, which would jump back to where we call setjmp currently.
>>>>>>>>> Ok, I figured out what's going on. It's not quite intuitive...
>>>>>>>>> 
>>>>>>>>> png_jmpbuf is a macro defined in png.h, this calls png_set_longjmp_fn with longjmp, which is why I was seeing references to longjmp in the object file. That's what was throwing me off as it seems like it should only be getting the jmp_buf ptr stored in the png_ptr. I guess the intention was that setjmp/longjmp was optional, if you don't call setjmp then it just abort()s.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I changed splashscreen_png.c to:
>>>>>>>>> #ifdef __APPLE__
>>>>>>>>>  if (_setjmp(png_set_longjmp_fn(png_ptr, _longjmp, sizeof(jmp_buf)))) {
>>>>>>>>> #else
>>>>>>>>>  if (setjmp(png_jmpbuf(png_ptr))) {
>>>>>>>>> #endif
>>>>>>>>> 
>>>>>>>>> and it calls _longjmp instead. I verified this works by changing the macro to set png_longjmp to exit() and without the above change it does indeed exit prematurely with a bad png, with the change it reports the error but continues to load the application as would be expected.
>>>>>>>>> 
>>>>>>>>> pngread.o still has a symbol table entry for _longjmp instead of __longjmp, but it's benign since we're ultimately forcing it to use the correct function. So I've left libpng completely unchanged.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> With the change and using a bad png for splashscreen, I was able to get a stack trace once the application was running. Without the change to splashscreen_png.c, jstack was unable to connect to the process. So splashscreen absolutely can interfere with the signal handling.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Updated webrev:
>>>>>>>>> http://cr.openjdk.java.net/~ddehaven/8026385/jdk.1/
>>>>>>>>> 
>>>>>>>>> I can look into writing a regression test for this. It might not be trivial though since we're dealing with signal handlers, and if timing is a factor the test may not be reliable.
>>>>>>>>> 
>>>>>>>>> -DrD-
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> 




More information about the 2d-dev mailing list