JEP 270: Reserved Stack Areas for Critical Sections

Volker Simonis volker.simonis at gmail.com
Thu Sep 24 14:22:18 UTC 2015


On Thu, Sep 24, 2015 at 4:00 PM, Frederic Parain
<frederic.parain at oracle.com> wrote:
> Volker,
>
> The delayed StackOverflowError provides several pieces of information
> about what happened:
>   - it has a special message "Delayed StackOverflowError due
>     to ReservedStackAccess annotated method" so user can
>     know that it is a delayed StackOverflowError and not
>     an ordinary StackOverflowError
>   - the call stack provided with the delayed StackOverflowError
>     is generated at the point where the error is thrown, not
>     the point where access to the reserved area has been granted,
>     so it indicates exactly where the execution has been interrupted
>     (when C() returns in your example).
>

OK, I see. These were exactly my concerns, and having the stack trace
generated at the point where the error is thrown is good.
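
For what it's worth, here is a minimal sketch of how calling code might
tell the two cases apart. It assumes the message text quoted above is what
the JVM actually puts into the error; the class and method names
(DelayedOverflowCheck, isDelayed, recurse) are made up for illustration.

```java
class DelayedOverflowCheck {
    // Prefix of the message quoted in this thread. Whether a shipped JVM
    // uses exactly this wording is an assumption.
    static final String DELAYED_PREFIX = "Delayed StackOverflowError";

    // True if the error carries the "delayed" message described above.
    static boolean isDelayed(StackOverflowError e) {
        String msg = e.getMessage();
        return msg != null && msg.startsWith(DELAYED_PREFIX);
    }

    // Unbounded recursion, used only to provoke an ordinary StackOverflowError.
    static int recurse(int depth) {
        return recurse(depth + 1) + 1;
    }

    public static void main(String[] args) {
        try {
            recurse(0);
        } catch (StackOverflowError e) {
            System.out.println("delayed: " + isDelayed(e));
            StackTraceElement[] trace = e.getStackTrace();
            if (trace.length > 0) {
                // Top frame of the trace: the point where the error was thrown.
                System.out.println("thrown in: " + trace[0]);
            }
        }
    }
}
```

Matching on the message text is fragile, so this is only meant to show
which two pieces of information (message and stack trace) are available
to the caller.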

Thanks,
Volker

>
> Note: The warning message printed by the JVM is generated and printed
> when the reserved area is hit for the first time, thus it doesn't give
> information about where the delayed StackOverflowError is thrown.
>
> Regards,
>
> Fred
>
>
> On 24/09/2015 15:36, Volker Simonis wrote:
>>
>> Hi Frederic,
>>
>> I understand all this. My question is whether the delayed
>> StackOverflowError misleadingly pretends that the
>> execution was interrupted at a specific point of the program while the
>> program actually safely passed that point. When we have
>>
>> final void critical() {
>>    A();
>>    B();
>>    C();
>> }
>>
>> and B() triggers the StackOverflowError, a smart caller of
>> 'critical()' might assume that C() hasn't been executed (and maybe
>> handle that appropriately). But with the new reserved stack area
>> feature, C() actually has been successfully executed. Isn't this
>> something we must be aware of, or do I misunderstand something here?
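
The scenario in question can be sketched as a plain-Java simulation. This
is not how the JVM implements the feature; the flag reservedAreaUsed and
the class DelayedOverflowSimulation are hypothetical stand-ins that only
mimic the observable order of events: B() runs out of ordinary stack, the
error is deferred, C() still runs, and the error surfaces when critical()
is left.

```java
class DelayedOverflowSimulation {
    // Records which steps ran, in order.
    final StringBuilder executed = new StringBuilder();

    // Hypothetical flag standing in for "the reserved area was used".
    private boolean reservedAreaUsed = false;

    void a() { executed.append('A'); }

    void b() {
        executed.append('B');
        // Pretend B() exhausted the ordinary stack. Instead of throwing here,
        // access to the reserved area is granted and the error is deferred.
        reservedAreaUsed = true;
    }

    void c() { executed.append('C'); }

    void critical() {
        a();
        b();
        c();
        // Leaving the critical section: the deferred error is thrown now.
        if (reservedAreaUsed) {
            reservedAreaUsed = false;
            throw new StackOverflowError(
                "Delayed StackOverflowError due to ReservedStackAccess annotated method");
        }
    }

    public static void main(String[] args) {
        DelayedOverflowSimulation sim = new DelayedOverflowSimulation();
        try {
            sim.critical();
        } catch (StackOverflowError e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("executed: " + sim.executed);
    }
}
```

The caller catches the error, yet the record of executed steps contains
A, B and C, which is exactly the situation a caller would have to be
aware of.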
>>
>> Thanks,
>> Volker
>>
>> PS: and thanks for the hint about subpage_prot(). I wasn't aware
>> of that syscall until now, but it may be worth investigating it in more
>> detail.
>>
>>
>> On Thu, Sep 24, 2015 at 3:16 PM, Frederic Parain
>> <frederic.parain at oracle.com> wrote:
>>>
>>> Hi Volker,
>>>
>>> The point here is that the reserved area aims to protect some
>>> data structures from corruption, not to save the faulty thread.
>>> A thread that hits its reserved area is an abnormal situation.
>>> The goal is not to try to save the thread by giving it more stack
>>> space, because in most cases it will simply continue to consume
>>> stack space and reach the new limit anyway. The goal is to
>>> try to throw the StackOverflowError at a more appropriate
>>> point of execution in order to mitigate the damage.
>>>
>>> Throwing the StackOverflowError is the behavior expected
>>> from the JVM when a thread has consumed all its stack space.
>>> The error is thrown to notify the thread that it has
>>> reached its stack limit. The warning printed by the JVM can
>>> help to diagnose the issue because exceptions are often caught
>>> in library code (see JDK-7011862). But the warning is not
>>> sufficient because the thread cannot see it.
>>>
>>> Regards,
>>>
>>> Fred
>>>
>>>
>>> On 24/09/2015 10:15, Volker Simonis wrote:
>>>>
>>>>
>>>> I also have problems understanding the following part:
>>>>
>>>> "If the protection of the reserved zone has been removed to allow a
>>>> critical section to complete its execution, the protection must be
>>>> restored and the delayed StackOverflowError thrown as soon as the
>>>> thread exits the critical section."
>>>>
>>>> Does this mean that after the critical section completes the VM will
>>>> still throw the StackOverflowError that would have been thrown if
>>>> we did not have the extra reserved space? Isn't this misleading?
>>>> Wouldn't user code which catches this error expect that the
>>>> critical section was not executed to the end because of the
>>>> error? Why do we need to throw the delayed StackOverflowError
>>>> in that case? Wouldn't the warning which will be printed be enough
>>>> information (and semantically more correct compared to throwing an
>>>> error)?
>>>>
>>>> Regards,
>>>> Volker
>>>>
>>>> On Thu, Sep 24, 2015 at 9:05 AM, Volker Simonis
>>>> <volker.simonis at gmail.com> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> that's an interesting JEP. Thanks a lot for the detailed analysis and
>>>>> problem description.
>>>>>
>>>>> I see a little problem with the additional zone you want to introduce.
>>>>> This zone must be page-size aligned and at least one page. But on
>>>>> PowerPC it is not uncommon nowadays that Linux runs with a default
>>>>> page size of 64K. Wasting another page for the new zone would result
>>>>> in 3*64K = 192K overhead for the three zones. Not sure about other
>>>>> architectures but it seems that at least AArch64 supports a 64K
>>>>> default page size as well. Itanium even has two stacks (register and
>>>>> memory) which doubles the trouble.
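
The arithmetic above can be spelled out with a small sketch. It assumes
each zone occupies exactly one page, which is the minimum mentioned above;
the class and method names are made up for illustration.

```java
class GuardZoneOverhead {
    static final long KB = 1024;

    // Total bytes consumed by `zones` zones of one page each.
    static long overheadBytes(long pageSizeBytes, int zones) {
        return pageSizeBytes * zones;
    }

    public static void main(String[] args) {
        // Three zones with 4K pages versus three zones with 64K pages.
        System.out.println("4K pages:  " + overheadBytes(4 * KB, 3) / KB + "K");
        System.out.println("64K pages: " + overheadBytes(64 * KB, 3) / KB + "K");
    }
}
```

With 4K pages the three zones cost 12K per thread stack; with 64K pages
the same three zones cost 192K.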
>>>>>
>>>>> Taking all this into account, have you considered using a part of the
>>>>> available yellow zone as the reserved stack area for critical sections? If
>>>>> you run with a page size of 4K you could just increase the yellow zone
>>>>> by the required number of pages, otherwise you could just use it as is
>>>>> (64K should be enough for a critical section according to your analysis).
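
A rough sketch of the sizing rule behind this suggestion: round the stack
space a critical section needs up to whole pages, with a minimum of one
page. The 20K requirement used in main is purely a made-up number for
illustration, not a figure from the JEP, and the names are hypothetical.

```java
class ReservedZoneSizing {
    static final long KB = 1024;

    // Number of whole pages needed to cover requiredBytes, at least one page.
    static long pagesNeeded(long requiredBytes, long pageSizeBytes) {
        long pages = (requiredBytes + pageSizeBytes - 1) / pageSizeBytes;
        return Math.max(1, pages);
    }

    public static void main(String[] args) {
        long required = 20 * KB; // hypothetical requirement of a critical section
        System.out.println("4K pages:  " + pagesNeeded(required, 4 * KB) + " page(s)");
        System.out.println("64K pages: " + pagesNeeded(required, 64 * KB) + " page(s)");
    }
}
```

Under that assumed requirement, a 4K-page system would need five extra
pages, while on a 64K-page system a single page already covers it, so an
existing one-page zone could be used as is.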
>>>>>
>>>>> We must also somehow ensure that critical sections cannot call
>>>>> arbitrary user code, otherwise the stack usage of the critical section
>>>>> will be unbounded. Did you use some kind of static analysis tool to
>>>>> check for this?
>>>>>
>>>>> Thank you and best regards,
>>>>> Volker
>>>>>
>>>>> On Thu, Sep 24, 2015 at 2:25 AM,  <mark.reinhold at oracle.com> wrote:
>>>>>>
>>>>>>
>>>>>> New JEP Candidate: http://openjdk.java.net/jeps/270
>>>>>>
>>>>>> - Mark
>>>
>>>
>>>
>>> --
>>> Frederic Parain - Oracle
>>> Grenoble Engineering Center - France
>>> Phone: +33 4 76 18 81 17
>>> Email: Frederic.Parain at oracle.com
>
>
> --
> Frederic Parain - Oracle
> Grenoble Engineering Center - France
> Phone: +33 4 76 18 81 17
> Email: Frederic.Parain at oracle.com


More information about the hotspot-runtime-dev mailing list