JEP 270: Reserved Stack Areas for Critical Sections
Volker Simonis
volker.simonis at gmail.com
Thu Sep 24 13:36:55 UTC 2015
Hi Frederic,
I understand all this. My question is whether the delayed
StackOverflowError misleadingly suggests that execution was
interrupted at a specific point of the program when the program
actually passed that point safely. When we have
  final void critical() {
    A();
    B();
    C();
  }
and B() triggers the StackOverflowError, a smart caller of
'critical()' might assume that C() hasn't been executed (and maybe
handle that appropriately). But with the new reserved stack area
feature, C() actually has been executed successfully. Isn't this
something we must be aware of, or do I misunderstand something here?
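
A minimal sketch of that concern follows. It only simulates the
delayed error with an explicit throw at the end of critical(); in the
real feature the VM itself would raise the error when the thread leaves
the critical section. The class name DelayedOverflowDemo, the step log
and the runCaller() helper are hypothetical and exist only for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DelayedOverflowDemo {
    // Records which steps of the critical section actually ran.
    static final List<String> steps = new ArrayList<>();

    static void A() { steps.add("A"); }
    static void B() { steps.add("B"); } // assume the stack limit is hit here
    static void C() { steps.add("C"); }

    // Simulation only: the section runs to completion and the error
    // is raised afterwards, mimicking the "delayed" behaviour.
    static void critical() {
        A();
        B();
        C();
        throw new StackOverflowError("delayed until exit of critical()");
    }

    // A caller that catches the error and inspects what really ran.
    static List<String> runCaller() {
        steps.clear();
        try {
            critical();
        } catch (StackOverflowError e) {
            // A caller could wrongly assume here that C() was skipped.
        }
        return new ArrayList<>(steps);
    }

    public static void main(String[] args) {
        System.out.println("Steps executed before the error: " + runCaller());
    }
}
```

In this sketch the caller sees the error, yet the log shows that C()
has already completed, which is exactly the situation a caller would
have to be aware of.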
Thanks,
Volker
PS: and thanks for the hint about subpage_prot(). I wasn't aware
of that syscall until now, but it may be worth investigating in more
detail.
On Thu, Sep 24, 2015 at 3:16 PM, Frederic Parain
<frederic.parain at oracle.com> wrote:
> Hi Volker,
>
> The point here is that the reserved area aims to protect some
> data structures from corruption, not to save the faulty thread.
> A thread that hits its reserved area is an abnormal situation.
> The goal is not to try to save the thread by giving it more stack
> space, because in most cases it will simply continue to consume
> stack space and reach the new limit anyway. The goal is to
> try to throw the StackOverflowError at a more appropriate
> point of execution in order to mitigate the damage.
>
> Throwing the StackOverflowError is the behavior expected
> from the JVM when a thread has consumed all its stack space.
> The error is thrown to notify the thread that it has
> reached its stack limit. The warning printed by the JVM can
> help to diagnose the issue because exceptions are often caught
> in library code (see JDK-7011862). But the warning is not
> sufficient because the thread cannot see it.
>
> Regards,
>
> Fred
>
>
> On 24/09/2015 10:15, Volker Simonis wrote:
>>
>> I also have problems understanding the following part:
>>
>> "If the protection of the reserved zone has been removed to allow a
>> critical section to complete its execution, the protection must be
>> restored and the delayed StackOverflowError thrown as soon as the
>> thread exits the critical section."
>>
>> Does this mean that after the critical section completes the VM will
>> still throw the StackOverflowError that would have been thrown if
>> we did not have the extra reserved space? Isn't this misleading?
>> Wouldn't user code which catches this error expect that the
>> critical section was not executed to the end because of the
>> error? Why do we need to throw the delayed StackOverflowError
>> in that case? Wouldn't the warning which will be printed be enough
>> information (and semantically more correct compared to throwing an
>> exception)?
>>
>> Regards,
>> Volker
>>
>> On Thu, Sep 24, 2015 at 9:05 AM, Volker Simonis
>> <volker.simonis at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> that's an interesting JEP. Thanks a lot for the detailed analysis and
>>> problem description.
>>>
>>> I see a little problem with the additional zone you want to introduce.
>>> This zone must be page-size aligned and at least one page. But on
>>> PowerPC it is not uncommon nowadays that Linux runs with a default
>>> page size of 64K. Wasting another page for the new zone would result
>>> in 3*64K = 192K overhead for the three zones. Not sure about other
>>> architectures but it seems that at least AArch64 supports a 64K
>>> default page size as well. Itanium even has two stacks (register and
>>> memory) which doubles the trouble.
>>>
>>> Taking all this into account, have you considered using a part of the
>>> available yellow zone as the reserved stack area for critical sections?
>>> If you run with a page size of 4K you could just increase the yellow
>>> zone by the required number of pages, otherwise you could just use it
>>> as is (64K should be enough for a critical section according to your
>>> analysis).
>>>
>>> We must also somehow ensure that critical sections cannot call
>>> arbitrary user code, otherwise the stack usage of the critical section
>>> will be unbounded. Did you use some kind of static analysis tool to
>>> check for this?
>>>
>>> Thank you and best regards,
>>> Volker
>>>
>>> On Thu, Sep 24, 2015 at 2:25 AM, <mark.reinhold at oracle.com> wrote:
>>>>
>>>> New JEP Candidate: http://openjdk.java.net/jeps/270
>>>>
>>>> - Mark
>
>
> --
> Frederic Parain - Oracle
> Grenoble Engineering Center - France
> Phone: +33 4 76 18 81 17
> Email: Frederic.Parain at oracle.com