Question about stack overflows in native code

Tue Apr 4 10:11:56 UTC 2017

On 4/04/2017 6:30 PM, Thomas Stüfe wrote:
> Hi David,
>
> On Mon, Apr 3, 2017 at 11:02 PM, David Holmes
> <david.holmes at oracle.com> wrote:
>
>     Just to follow up on what Fred responded ...
>
>     On 4/04/2017 4:42 AM, Thomas Stüfe wrote:
>
>         Hi Fred,
>
>         thanks! Some more questions inline.
>
>         On Mon, Apr 3, 2017 at 8:29 PM, Frederic Parain
>         <frederic.parain at oracle.com> wrote:
>
>             When the yellow zone is hit and the thread state is not
>             _thread_in_java (which means thread state is
>             _thread_in_native or _thread_in_vm), the yellow zone is
>             silently disabled and the thread is allowed to resume its
>             execution.
>
>
>         Disabled by whom exactly?
>
>         Normally, this would be done in the signal handler, but that
>         requires enough stack space to run. AFAIK jitted or interpreted
>         code does stack banging in order to trigger the
>         yellow-page-segfault at a point where there are enough pages
>         left on the stack to invoke the signal handler (n shadow pages
>         before), but that is not guaranteed to work with native
>         C-compiled code, no?
>
>
>     The stack banging is done to ensure the stack overflow is hit before
>     we start doing the actual operation. The sizes of the yellow and red
>     zones are supposed to be sufficient to allow the respective signal
>     processing and response to be executed.
>
>
> And the size of the shadow pages should be sufficient to invoke the
> initial signal handler, which will unprotect the yellow or red zone,
> right? So, back to my original question: if native C code does not bang
> the stack but simply runs into the yellow zone, the process will simply
> die, right?

I thought Fred already answered that. The signal handler simply disables 
the yellow zone and returns:

           } else {
             // Thread was in the vm or native code.  Return and try to finish.
             thread->disable_stack_yellow_reserved_zone();
             return 1;
           }

If it keeps going and hits the red zone then the red zone will be 
disabled, we print some error messages, and then should call 
VMError::report_and_die(). But I admit the signal handler logic is quite 
complex so I may have missed something. :)
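
To make the decision described above concrete, here is a minimal sketch of
the classification step only. It is a simplified model under stated
assumptions, not the actual HotSpot handler: the type and function names
(StackLayout, on_guard_fault, Action) are hypothetical, the reserved zone is
left out, and the stack is assumed to grow downward with the red zone at the
very bottom and the yellow zone directly above it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model for illustration; none of these names exist in HotSpot.
enum class ThreadState { InJava, InNative, InVM };

enum class Action {
    ThrowStackOverflowError,  // yellow zone hit while running Java code
    DisableYellowAndResume,   // yellow zone hit in native or VM code
    ReportAndDie,             // red zone hit, regardless of thread state
    NotInGuardZone            // fault address is outside both zones
};

struct StackLayout {
    std::uintptr_t stack_end;  // lowest address of the thread stack
    std::size_t red_size;      // red zone occupies [stack_end, stack_end + red_size)
    std::size_t yellow_size;   // yellow zone sits directly above the red zone
};

// Classify a faulting address against the guard zones and pick a response.
Action on_guard_fault(const StackLayout& s, std::uintptr_t fault_addr,
                      ThreadState state) {
    const std::uintptr_t red_top = s.stack_end + s.red_size;
    const std::uintptr_t yellow_top = red_top + s.yellow_size;

    if (fault_addr >= s.stack_end && fault_addr < red_top) {
        return Action::ReportAndDie;
    }
    if (fault_addr >= red_top && fault_addr < yellow_top) {
        return state == ThreadState::InJava
                   ? Action::ThrowStackOverflowError
                   : Action::DisableYellowAndResume;
    }
    // A fault that skipped past the guard zones is not handled by this logic.
    return Action::NotInGuardZone;
}
```

Note that this only models which branch is taken. Whether the handler gets
to run at all still depends on there being enough stack left below the
faulting address.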

>
>
>     But that assumes you simply advance into the guard zones - if your
>     native code suddenly jumped to the end of the yellow zone for
>     example, then signal processing would hit the red zone; similarly if
>     you jump to the end of the red zone then signal processing will hit
>     the OS guard page. If you jump past all guard pages you simply die.
>
>
> Thank you!
>
> See also my response to Fred. We wondered whether exporting a simple JNI
> helper function to check the stack size on behalf of the native code
> would be something helpful, for cooperative native code at least.

Perhaps. Haven't really thought about it. :)
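
Purely as an illustration of what such a check might boil down to, here is
a sketch of the arithmetic. Everything in it is an assumption:
has_stack_headroom is a hypothetical name, not an existing JNI function,
and the caller is assumed to already know the stack's lowest address and
the combined size of the guard zones (how those would be obtained is
platform and VM specific and not shown).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration; not part of JNI or HotSpot.
// Assumes a downward-growing stack whose guard zones occupy the lowest
// guard_bytes bytes starting at stack_low.
// Returns true if `needed` more bytes can be used below `sp` without
// touching any guard zone.
bool has_stack_headroom(std::uintptr_t sp, std::uintptr_t stack_low,
                        std::size_t guard_bytes, std::size_t needed) {
    const std::uintptr_t usable_low = stack_low + guard_bytes;
    if (sp <= usable_low) {
        return false;  // already inside (or below) the guard zones
    }
    return (sp - usable_low) >= needed;
}
```

Cooperative native code could call something like this before a deep
recursion or a large stack allocation and bail out with an error instead of
running into the yellow zone.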

Cheers,
David

> Kind Regards, Thomas
>
>
>     David
>
>
>         (not just a theory, we have a test case here where a stack
>         overflow in
>         native code just silently kills the process.)
>
>         I guess it may work accidentally if the C-compiled code itself
>         does some
>         form of stack banging when establishing frames, in order to
>         detect OS stack
>         overflows? Very fuzzy here. But whatever the C-compiled code
>         does, it has
>         no notion about how much space we need to invoke the signal
>         handler and
>         handle stack overflows, no?
>
>             When the red zone is hit, whatever the current thread state
>             is, the red zone is disabled and VMError::report_and_die() is
>             called, which should generate an hs_err file unless the
>             generation of the error file requires more memory than the
>             red zone provides.
>
>             Fred
>
>
>         Thanks, Thomas
>
>
>
>
>             On 04/03/2017 02:08 PM, Thomas Stüfe wrote:
>
>                 Hi,
>
>                 Today we wondered what would happen when a stack
>                 overflow occurs in native
>                 code running in a java thread (an attached thread or one
>                 created by the
>                 VM).
>
>                 In that case yellow and red pages are in place, but this
>                 would not help
>                 much, would it not, because the native code would not do
>                 any stack
>                 banging?
>
>                 So, native code would hit the yellow page, and then
>                 there would probably
>                 not be enough space left on the stack to invoke the
>                 signal handler. The
>                 result would be immediate VM death - not even an hs-err
>                 file - is that
>                 correct?
>
>                 Also, we would hit our own yellow page, not the guard
>                 page the OS may or may not have established, so - on
>                 UNIX - this would show up as "Segmentation Fault", not
>                 "Stack Overflow", right?
>
>                 Thank you,
>
>                 Thomas
>
>
>