RFR(M): 8169373: Work around linux NPTL stack guard error.

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Nov 29 00:38:21 UTC 2016


On 11/28/16 2:08 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> I'm working on a fix for OS guard pages on stacks.  I figured there
> are VM guard pages (reserved, yellow, red) on the compiler stacks
> _and_ OS guard pages.  For Java threads, the OS guard pages are left
> out.  I think this should be done for compiler threads, too. Please
> confirm.
> Webrev:
> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.04/

src/os/aix/vm/os_aix.cpp
     L888:   pthread_attr_setguardsize(&attr, 
os::Aix::default_guard_size(thr_type));
         No check or assert on the return status of this call.
         Is one needed?

     L3044:   // guard pages, so only enable libc guard pages for 
non-Java threads.
         src/os/linux/vm/os_linux.cpp also has this comment:
         // (Remember: compiler thread is a Java thread, too!)

     L3051:   return ((thr_type == java_thread || thr_type == 
compiler_thread) ? 0 : 4*K);
         nit - please add spaces around the '*' so '4 * K'.'

src/os/aix/vm/os_aix.hpp
     No comments.

src/os/linux/vm/os_linux.cpp
     L729:   // is not implemented properly. The posix standard requires 
to add
         Typo: 'to add' -> 'adding'

     L738:   pthread_attr_setguardsize(&attr, 
os::Linux::default_guard_size(thr_type));
         No check or assert on the return status of this call.
         Is one needed?

     L2851:   // Creating guard page is very expensive. Java thread has 
HotSpot
     L2852:   // guard page, only enable glibc guard page for non-Java 
threads.
     L2853:   // (Remember: compiler thread is a java thread, too!)
         Typo: "java thread" -> "Java thread" (consistency)

         This comment block should be common to all the platforms that
         define default_guard_size(). Yes, I can see that AIX needs to
         add another paragraph, but we should make the core comment common
         if possible.

     L6090: // Java/Compiler thread:
         Thanks for making this common in os_linux.cpp.

     L6095: //    |    glibc guard page    | - guard, attached Java 
thread usually has
         Clarity: "guard," -> "guard page,"

         Typo: "Java thread" -> "JavaThread" (consistency)

     L6099: //    |  HotSpot Guard Pages   | - red and yellow pages
         The fairly recently added reserved page should be mentioned
         here also?

     L6120 // ** P1 (aka bottom) and size ( P2 = P1 - size) are the 
address and stack size returned from
         Typo: "( P2 = ..." -> "(P2 = ..."

     L6148:       fatal("Can not locate current stack attributes!");
         Typo: "Can not" -> "Cannot"

     L6175:   // stack size includes normal stack and HotSpot guard pages
         Perhaps add to the comment:
         "for the threads that have HotSpot guard pages."

src/os/posix/vm/os_posix.cpp
     L1097:   pthread_attr_getstacksize(attr, &stack_size);
     L1098:   pthread_attr_getguardsize(attr, &guard_size);
         Do these two calls need to have their result checked?

         Now I'm starting to wonder if all the uses of these
         two APIs need to be checked? Separate bug?

src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
     L540: size_t os::Posix::_compiler_thread_min_stack_allowed = 512 * K;
     L541: size_t os::Posix::_java_thread_min_stack_allowed = 512 * K;
         So prior to the fix for 8140520, src/os/aix/vm/os_aix.cpp had
         this single min_stack_allowed value:

         L3601:   os::Aix::min_stack_allowed = 
MAX2(os::Aix::min_stack_allowed,
         L3602: JavaThread::stack_guard_zone_size() +
         L3603: JavaThread::stack_shadow_zone_size() +
         L3604:                                     (4*BytesPerWord 
COMPILER2_PRESENT(+2)) * 4 * K);

         and the fix for 8140520 changed that for *NIX platforms to
         three mins in src/os/posix/vm/os_posix.cpp:

         L1108:   _java_thread_min_stack_allowed = 
MAX2(_java_thread_min_stack_allowed,
         L1109: JavaThread::stack_guard_zone_size() +
         L1110: JavaThread::stack_shadow_zone_size() +
         L1111:                                         (4 * 
BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K);

         L1150:   _compiler_thread_min_stack_allowed = 
align_size_up(_compiler_thread_min_stack_allowed, vm_page_size());

         L1161   _vm_internal_thread_min_stack_allowed = 
align_size_up(_vm_internal_thread_min_stack_allowed, vm_page_size());

         Which means that the compiler_thread no longer benefits from
         the extra space for quard and shadow pages. The thinking in
         8140520 was that the compiler_thread and vm_internal_threads
         don't need the quard and shadow pages since they don't run
         Java code (ignoring JVMCI for now).

         So I can see bumping _compiler_thread_min_stack_allowed from
         128 -> 512 as one solution for getting that extra space back.
         However, I don't understand why _java_thread_min_stack_allowed
         has changed from 128 -> 512.


         I had previously made this comment:

         > To put it another way, I'd like to see us add extra space to 
solve
         > the 64K page issue directly instead of as a side effect of the
         > red/yellow page addition.

         And Goetz replied with:

         > I don't understand.  What do you mean by 'directly'?


         So prior to the fix for 8140520, src/os/solaris/vm/os_solaris.cpp
         had a block like this:

         L4468:   // For 64kbps there will be a 64kb page size, which makes
         L4469:   // the usable default stack size quite a bit less.  
Increase the
         L4470:   // stack for 64kb (or any > than 8kb) pages, this 
increases
         L4471:   // virtual memory fragmentation (since we're not 
creating the
         L4472   // stack on a power of 2 boundary.  The real fix for this
         L4473   // should be to fix the guard page mechanism.
         L4474
         L4475   if (vm_page_size() > 8*K) {
         L4476     threadStackSizeInBytes = (threadStackSizeInBytes != 0)
         L4477        ? threadStackSizeInBytes +
         L4478          JavaThread::stack_red_zone_size() +
         L4479          JavaThread::stack_yellow_zone_size()
         L4480        : 0;
         L4481     ThreadStackSize = threadStackSizeInBytes/K;
         L4482   }

         The above is an example of what I mean by solving the 64K
         page issue directly. In the fix for 8140520, that code block
         was moved to os::Posix::set_minimum_stack_sizes() in
         src/os/posix/vm/os_posix.cpp and put in a "#ifdef SOLARIS...
         #endif // SOLARIS" block. Coleen filed a bug to determine
         whether that code can be deleted:

         JDK-8161093 Solaris for >8k pagesize adds extra guard pages
         https://bugs.openjdk.java.net/browse/JDK-8161093

         but perhaps this bug shows that the code is needed?


         OK so this is probably the longest code review comment
         I have ever written, but the summary is:

         - I understand bumping _compiler_thread_min_stack_allowed,
           but should it be solved in a different way?
         - I don't understand bumping _java_thread_min_stack_allowed

src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
     No comments.

src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
     L538: size_t os::Posix::_compiler_thread_min_stack_allowed = 384 * K;
     L539: size_t os::Posix::_java_thread_min_stack_allowed = 384 * K;

         Same monster comment as src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
         and the same summary:

         - I understand bumping _compiler_thread_min_stack_allowed,
           but should it be solved in a different way?
         - I don't understand bumping _java_thread_min_stack_allowed

src/os_cpu/linux_s390/vm/os_linux_s390.cpp
     L478: size_t os::Posix::_compiler_thread_min_stack_allowed = 128 * K;
     L479: size_t os::Posix::_java_thread_min_stack_allowed = 236 * K;
         Bumping _java_thread_min_stack_allowed but not bumping
         _compiler_thread_min_stack_allowed. I'm confused here.

src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
     No comments.

src/os_cpu/linux_x86/vm/os_linux_x86.cpp
     No comments.

src/os_cpu/linux_zero/vm/os_linux_zero.cpp
     No comments.

Sorry it took me so long to write this up...

Dan

> The change affecting the compier threads is in os_linux.cpp, default_guard_size(),
> where '|| thr_type == compiler_thread' has been added. The function
> was also moved from the os_cpu files, as it's identical on all cpus.
>
> Best regards,
>    Goetz.
>
>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Montag, 28. November 2016 00:25
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>> 'daniel.daugherty at oracle.com' <daniel.daugherty at oracle.com>; 'hotspot-
>> runtime-dev at openjdk.java.net' <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
>>
>> Hi Goetz,
>>
>> On 24/11/2016 10:15 PM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> I now edited the stuff I had proposed below:
>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.03/
>>> This includes
>>>    - the NPTL fix from webrev.02
>> Okay in principle. As discussed this only impacts non-JavaThreads so the
>> change should be minimal.
>>
>>>    - merging code on linux
>> Went a bit further than I had expected but if this truly isn't CPU
>> dependent code then great!
>>
>>>    - not adding OS guard to compiler threads.
>> Okay in principle. IIUC we will now save the OS guard page for compiler
>> thread stacks. Is that the only impact? The hotspot-compiler-dev folk
>> may want to sign off on this part.
>>
>>
>> A few minor comments:
>>
>> src/os/linux/vm/os_linux.cpp
>>
>> 2854   return ((thr_type == java_thread || thr_type ==
>> os::compiler_thread) ...
>>
>> os:: should be used for both types or none.
>>
>> 6153     pthread_attr_getguardsize(&attr, &guard_size);
>>
>> Can you at least verify a zero return code in an assert/assert_status
>> please.
>>
>> ---
>>
>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>
>> Are the changes to min_stack_allowed just fixing an existing bug?
>>
>> ---
>>
>> Thanks,
>> David
>> -----
>>
>>
>>> I think this should be pushed for this bug ID. For the other changes I'll
>>> make another bug.
>>>
>>> Best regards,
>>>    Goetz.
>>>
>>>> -----Original Message-----
>>>> From: Lindenmaier, Goetz
>>>> Sent: Wednesday, November 23, 2016 8:11 AM
>>>> To: David Holmes <david.holmes at oracle.com>;
>>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net
>>>> Subject: RE: RFR(M): 8169373: Work around linux NPTL stack guard error.
>>>>
>>>> Hi,
>>>>
>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with JVMCI. The
>>>>> ability for a CompilerThread to execute Java code (can_call_java()) is
>>>>> now a dynamic property depending on whether the current compiler is the
>>>>> JVMCI compiler.
>>>> Ah, then I should also leave space for shadow pages in the minimal stack
>> size
>>>> of comiler threads.
>>>>
>>>> Do we agree on the cleanup and on leaving out the OS guard page on
>>>> compiler threads?
>>>> Then I would edit a change comprising the NPTL workaround and these
>>>> additional changes, and split the other issue into a new bug?  I think this
>>>> will easy the reviewing.
>>>>
>>>> Best regards,
>>>>    Goetz.
>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>> Sent: Mittwoch, 23. November 2016 02:50
>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>> daniel.daugherty at oracle.com; hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
>>>>>
>>>>> On 22/11/2016 11:19 PM, Lindenmaier, Goetz wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>> Sent: Dienstag, 22. November 2016 14:01
>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David Holmes
>>>>>>> <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>> error.
>>>>>>> On 11/22/16 3:55 AM, Lindenmaier, Goetz wrote:
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> I ran into a row of issues, some errors on the platforms.
>>>>>>>>
>>>>>>>> What I meant with that comment is that
>>>>>>>> os::Linux::min_stack_allowed = MAX2(os::Linux::min_stack_allowed,
>>>>>>>>                                         JavaThread::stack_guard_zone_size() +
>>>>>>>>                                         JavaThread::stack_shadow_zone_size() +
>>>>>>>>                                         (4*BytesPerWord COMPILER2_PRESENT(+2)) *
>> 4
>>>> *
>>>>> K);
>>>>>>>> was executed, and min_stack_allowed used for all stacks. Now,
>>>> compiler
>>>>>>> and
>>>>>>>> vm minimum stack sizes are not increased by these sizes.
>>>>>>> Now I see what you mean. Thanks for clearing this up.
>>>>>>>
>>>>>>> I should have remembered that part of the change because we went
>>>> back
>>>>>>> and forth about removing the red/yellow zone pages from the other
>>>>>>> threads. In particular, we discussed the compiler thread because it
>>>>>>> is-a JavaThread. Our conclusion was that a compiler thread doesn't
>>>>>>> execute Java bytecode so we could remove the red/yellow pages...
>>>>>> Yes, it does not execute java byte code.
>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with JVMCI. The
>>>>> ability for a CompilerThread to execute Java code (can_call_java()) is
>>>>> now a dynamic property depending on whether the current compiler is the
>>>>> JVMCI compiler.
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Therefore you can remove the shadow pages.  There is no code that
>>>>>> will bang.
>>>>>> But red/yellow pages are protected right during thread startup.
>>>>>> Therefore you must have enough space for them.
>>>>>> On ppc, we try to protect three 64K pages out of the 128K compiler stack.
>>>>>> That obviously fails.
>>>>>>
>>>>>> Therefore I propose:
>>>>>>       size_t os::Posix::_java_thread_min_stack_allowed = 48 * K;   // Set
>>>>> platform dependent.
>>>>>> in  os::Posix::set_minimum_stack_sizes():
>>>>>>    _java_thread_min_stack_allowed = _java_thread_min_stack_allowed +
>>>>>>                                     JavaThread::stack_guard_zone_size() +
>>>>>>                                     JavaThread::stack_shadow_zone_size();
>>>>>>
>>>>>> (Similar for _compiler_thread_min_stack_allowed).
>>>>>>
>>>>>> The minimal stack size is made up of three components:
>>>>>>     1. Sizes of interpreter/C1/C2 frames. Depends on HotSpot
>>>>> implementation/platform/os.
>>>>>>     2. Sizes of C++ frames: depends on C++ compiler.
>>>>>> These are fixed at compile time.
>>>>>>     3. Sizes of red/yellow/reserved/shadow pages. Depends on the system
>>>>> the
>>>>>>         VM is used on.  This is not fixed at compile time. (Our ppc machines
>>>>> differ
>>>>>>        in page size.)
>>>>>> Therefore 3. should not be included in a compile time constant.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> And that decision allowed us to be exposed to the 64K page issue
>>>>>>> because the "extra" space isn't there anymore.
>>>>>>>
>>>>>>>> At least the _compiler_thread_min_stack_allowed should be increased
>>>>>>>> similarly by red/yellow zone size.  The compiler thread is a Java
>>>>>>>> thread and must have space for these zones.
>>>>>>> I'm not sure that I completely agree (yet). To me, the red/yellow
>>>>>>> pages are there for Java thread stack overflow semantics. Yes, the
>>>>>>> compiler thread needs extra space when 64K pages are used, but I
>>>>>>> would prefer that we add that space via a different calculation.
>>>>>> Yes they are. But compiler threads happen tob e a subclass of
>>>>>> Java threads and use the same run() method that puts the pages
>>>>>> There.
>>>>>> I agree that they are not needed for Compiler threads, nor for
>>>>>> CodeCacheSweeperThreads. I don't really now about
>>>>>>   JvmtiAgentThreads and ServiceThreads, but all of the get the guard
>>>> pages
>>>>>> because they are derived from JavaThread.
>>>>>>
>>>>>>> To put it another way, I'd like to see us add extra space to solve
>>>>>>> the 64K page issue directly instead of as a side effect of the
>>>>>>> red/yellow page addition.
>>>>>> I don't understand.  What do you mean by 'directly'?
>>>>>>
>>>>>>>> Also, the change added a test that is failing now.
>>>>>>> And that's a "good thing" (TM), right? :-)
>>>>>> Yes, it showed a bug and thus raised the need to fix it! :)
>>>>>>
>>>>>> Best regards,
>>>>>>    Goetz.
>>>>>>
>>>>>>
>>>>>>> Again, thanks for clarifying 8140520's role in this issue.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>     Goetz.
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>>>> Sent: Montag, 21. November 2016 17:28
>>>>>>>>> To: David Holmes <david.holmes at oracle.com>; Lindenmaier, Goetz
>>>>>>>>> <goetz.lindenmaier at sap.com>; hotspot-runtime-
>>>>> dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>> error.
>>>>>>>>> Sorry for the delayed responses to this thread. I've been on vacation...
>>>>>>>>>
>>>>>>>>> One comment/query embedded below...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/10/16 8:40 PM, David Holmes wrote:
>>>>>>>>>> Hi Goetz,
>>>>>>>>>>
>>>>>>>>>> On 11/11/2016 8:00 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> This issue is different to 6675312, see also my comment in the bug.
>>>>>>>>>>>
>>>>>>>>>>> It appears running jtreg test
>>>>> runtime/Thread/TooSmallStackSize.java,
>>>>>>>>>>> with my patch below you can reproduce it on linuxx86_64. You can
>>>>> not
>>>>>>>>>>> do that with 6675312. Also, I would assume there are systems out
>>>>> there
>>>>>>>>>>> on x86 that uses 64-K pages, did you run the tests on these? I
>>>> would
>>>>>>>>>>> assume you get hard crashes with stack overflows in the first C2
>>>>>>>>>>> compilation if there is only 64K usable CompilerThreadStack.
>>>>>>>>>>>
>>>>>>>>>>> My fix does not affect Java threads, which are the largest amount
>>>>>>>>>>> of threads used by the VM. It affects only the non-Java threads.
>>>>>>>>>>> It adds one page to these threads. The page does not require
>>>>> memory,
>>>>>>>>>>> as it's protected. The stack will only require more space if the
>>>> thread
>>>>>>>>>>> ran into a stack overflow before the fix as else the pages are not
>>>>>>>>>>> mapped.
>>>>>>>>>>> This are stack overflows that cause hard crashes, at least on ppc
>>>> the
>>>>> VM
>>>>>>>>>>> does not properly catch these stack overflows, so any setup
>>>> working
>>>>>>> now
>>>>>>>>>>> will not run into the additional space. Altogether there should be
>>>> no
>>>>>>>>>>> effect on running systems besides requiring one more entry in the
>>>>>>>>>>> page table per non-Java thread.
>>>>>>>>>>>
>>>>>>>>>>> The problem is caused by a rather recent change (8140520:
>>>> segfault
>>>>> on
>>>>>>>>>>> solaris-amd64
>>>>>>>>>>> with "-XX:VMThreadStackSize=1" option) which was pushed after
>>>>>>>>>>> feature-close. As this was a rather recent change, it must be
>>>>>>>>>>> possible to
>>>>>>>>>>> fix this follow up issue. What else is this period in the project good
>>>>>>>>>>> for if not fixing issues?
>>>>>>>>>> So I am seeing a number of factors here.
>>>>>>>>>>
>>>>>>>>>> First, 8140520, set:
>>>>>>>>>>
>>>>>>>>>> size_t os::Posix::_compiler_thread_min_stack_allowed = 128 * K;
>>>>>>>>> So I'm confused by the above comment:
>>>>>>>>>
>>>>>>>>>    > The problem is caused by a rather recent change (8140520:
>>>> segfault
>>>>>>>>>    > on solaris-amd64 with "-XX:VMThreadStackSize=1" option)
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8140520-webrev/5-jdk9-hs-
>>>>>>>>>
>>>>> open/hotspot/src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp.frames.html
>>>>>>>>> shows this change:
>>>>>>>>>
>>>>>>>>> @@ -531,19 +531,17 @@
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>> //////////////////////////////////////////////////////////////////////////////
>>>>>>>>> //
>>>>>>>>>     // thread stack
>>>>>>>>>
>>>>>>>>> -size_t os::Linux::min_stack_allowed = 128*K;
>>>>>>>>> +size_t os::Posix::_compiler_thread_min_stack_allowed = 128 * K;
>>>>>>>>> +size_t os::Posix::_java_thread_min_stack_allowed = 128 * K;
>>>>>>>>> +size_t os::Posix::_vm_internal_thread_min_stack_allowed = 128 * K;
>>>>>>>>>
>>>>>>>>> so the existing single variable of 'min_stack_allowed' was
>>>>>>>>> replaced by three variables: _compiler_thread_min_stack_allowed,
>>>>>>>>> _java_thread_min_stack_allowed, and
>>>>>>>>> _vm_internal_thread_min_stack_allowed.
>>>>>>>>>
>>>>>>>>> The old single variable and the three new variables are all
>>>>>>>>> initialized to the same value (128K) so the fix for 8140520
>>>>>>>>> did not change stack sizes for this platform. In fact, only
>>>>>>>>> one platform had a size change (Solaris X64).
>>>>>>>>>
>>>>>>>>> So I'm confused about how the fix for 8140520 caused this problem.
>>>>>>>>>
>>>>>>>>> Based on David's analysis below, it looks to me like this 64K stack
>>>>>>>>> guard page problem should also exist prior to the fix for 8140520.
>>>>>>>>>
>>>>>>>>> Goetz, can you please explain how 8140520 caused this problem?
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Second on linux PPC it is hardwired to use 2 guard pages:
>>>>>>>>>>
>>>>>>>>>>    return 2 * page_size();
>>>>>>>>>>
>>>>>>>>>> Third, you had a pagesize of 64K.
>>>>>>>>>>
>>>>>>>>>> Fourth, NPTL takes the guard space from the stack space - hence
>>>> with
>>>>> 2
>>>>>>>>>> x 64K guard, and a 128K stack it was all consumed.
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> In the proposed changes you now only use page_size() for the guard,
>>>>> so
>>>>>>>>>> that alone would have fixed the observed problem.
>>>>>>>>>>
>>>>>>>>>> But in addition you want to address the NPTL problem by adding
>>>> back
>>>>>>>>>> the guard space to the stack size requested. That alone would also
>>>>>>>>>> have fixed the observed problem. :)
>>>>>>>>>>
>>>>>>>>>> But in addition you have increased the minimum stack size:
>>>>>>>>>>
>>>>>>>>>> ! size_t os::Posix::_compiler_thread_min_stack_allowed = 192 * K;
>>>>>>>>>>
>>>>>>>>>> which again, on its own would have fixed the original problem. :)
>>>>>>>>>>
>>>>>>>>>> Did you really intend to increase the real minimum stack from 128K
>>>> to
>>>>>>>>>> 256K ? (on a 64K page system)
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Focusing simply on the shared code change to adjust the requested
>>>>>>>>>> stacksize by the amount of guard space (if any), this does not seem
>>>>>>>>>> unreasonable. As you note it is restricted to non-JavaThreads and
>>>> only
>>>>>>>>>> adds a page to reserved stack space.
>>>>>>>>>>
>>>>>>>>>> My only query now is whether the minimum stacksize detection logic
>>>>>>>>>> will correctly report the real minimum stack size (taking into account
>>>>>>>>>> the need for the guard page) ?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> So I really think this issue should be fixed.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>     Goetz.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>> Sent: Thursday, November 10, 2016 10:02 PM
>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-
>>>>>>>>> runtime-
>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>>> guard
>>>>>>> error.
>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>
>>>>>>>>>>>> As per the bug report, this issue was already known (6675312) and
>>>>> we
>>>>>>>>>>>> chose not to try and address it due to no reported issues at the
>>>>> time.
>>>>>>>>>>>> While I see that you have encountered an issue (is it real or
>>>>>>>>>>>> fabricated?) I think this change is too intrusive to be applied at this
>>>>>>>>>>>> stage of the JDK 9 release cycle, as it will change the stack
>>>>>>>>>>>> requirements of every application running on Linux.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 11/11/2016 1:58 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please review this change. I please need a sponsor:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>>>>>> stackFix/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the Linux NPTL pthread implementation the guard size
>>>>> mechanism
>>>>>>>>>>>>> is not
>>>>>>>>>>>>> implemented properly. The posix standard requires to add the
>>>>> size
>>>>>>>>>>>>> of the
>>>>>>>>>>>>> guard pages to the stack size, instead Linux takes the space out
>>>> of
>>>>>>>>>>>>> 'stacksize'.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The Posix standard
>>>>>>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>>>>>>>>>>>>    says "the implementation allocates extra memory at the
>>>>> overflow
>>>>>>>>>>>>> end of
>>>>>>>>>>>>> the stack". The linux man page
>>>>>>>>>>>>> https://linux.die.net/man/3/pthread_attr_setguardsize says "As
>>>>> at
>>>>>>>>>>>>> glibc
>>>>>>>>>>>>> 2.8, the NPTL threading implementation includes the guard area
>>>>>>> within
>>>>>>>>>>>>> the stack size allocation, rather than allocating extra space at
>>>>>>>>>>>>> the end
>>>>>>>>>>>>> of the stack, as POSIX.1 requires".
>>>>>>>>>>>>>
>>>>>>>>>>>>> I encounter this problem in
>>>>> runtime/Thread/TooSmallStackSize.java
>>>>>>>>>>>>> on ppc
>>>>>>>>>>>>> with 64K pages. _compiler_thread_min_stack_allowed is 128K
>>>> on
>>>>>>> ppc,
>>>>>>>>> and
>>>>>>>>>>>>> ppc specifies two OS guard pages. The VM crashes in pthread
>>>>>>> creation
>>>>>>>>>>>>> because there is no usable space in the thread stack after
>>>>> allocating
>>>>>>>>>>>>> the guard pages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But TooSmallStackSize.java requires that the VM comes up with
>>>>> the
>>>>>>>>>>>>> stack
>>>>>>>>>>>>> size mentioned in the error message.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This fix adapts the requested stack size on Linux by the size of
>>>> the
>>>>>>>>>>>>> guard pages to mimick proper behaviour, see change to
>>>>>>> os_linux.cpp.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The change also streamlines usage of stack_guard_page on
>>>>> linuxppc,
>>>>>>>>>>>>> linuxppcle, aixppc and linuxs390.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> To reproduce the error on linux_x86_64, apply below patch and
>>>>> call
>>>>>>> the
>>>>>>>>>>>>> VM with -XX:CompilerThreadStackSize=64.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm still exploring why I had to choose such big compiler stacks
>>>> on
>>>>>>>>>>>>> ppc
>>>>>>>>>>>>> to get -version passing, but I wanted to send the RFR now as
>>>>> people
>>>>>>>>>>>>> obviously looked at the bug I opened (Thanks David!).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Goetz.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff -r b7ae012c55c3 src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>>>>>>>
>>>>>>>>>>>>> --- a/src/os_cpu/linux_x86/vm/os_linux_x86.cpp  Mon Nov 07
>>>>>>> 12:37:28
>>>>>>>>>>>> 2016
>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>
>>>>>>>>>>>>> +++ b/src/os_cpu/linux_x86/vm/os_linux_x86.cpp  Thu Nov 10
>>>>>>>>> 16:52:17
>>>>>>>>>>>> 2016
>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>
>>>>>>>>>>>>> @@ -701,7 +701,7 @@
>>>>>>>>>>>>>
>>>>>>>>>>>>> size_t os::Linux::default_guard_size(os::ThreadType thr_type) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>      // Creating guard page is very expensive. Java thread has
>>>>> HotSpot
>>>>>>>>>>>>>      // guard page, only enable glibc guard page for non-Java
>>>>> threads.
>>>>>>>>>>>>> -  return (thr_type == java_thread ? 0 : page_size());
>>>>>>>>>>>>>
>>>>>>>>>>>>> +  return (thr_type == java_thread ? 0 : 64*K);
>>>>>>>>>>>>>
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> // Java thread:
>>>>>>>>>>>>>



More information about the hotspot-compiler-dev mailing list