RFR(M): 8169373: Work around linux NPTL stack guard error.

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Dec 2 23:46:38 UTC 2016


OK... kicked off the usual JPRT -testset hotspot pre-push job...
Also kicked off a "full rbt run". JPRT should be done in < 2 hours
and RBT should finish by the end of the weekend...

Dan


On 12/2/16 2:04 PM, Daniel D. Daugherty wrote:
> Vladimir, Thanks for the review!
>
> OK, the ball is in my court (as sponsor) and I need to figure out what
> kind of RBT testing David H wants to see on the patch before I push it...
> It's the weekend already for David so I'm guessing he's out mowing the
> lawn... :-)
>
> Dan
>
>
> On 12/2/16 11:12 AM, Vladimir Kozlov wrote:
>> I read through whole tread and you guys did good job with review :)
>>
>> I agree with changes (and keeping guard pages for compiler threads).
>>
>> Thanks,
>> Vladimir
>>
>> On 12/1/16 2:32 AM, Lindenmaier, Goetz wrote:
>>> Hi David,
>>>
>>> I fixed the comment:
>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.07/
>>>
>>> We run a lot of tests with this change:
>>> Hotspot jtreg, jck, spec, SAP applications
>>> On these platforms:
>>> Windows_x86_64, linux_x86_64, solaris_sparc, mac_x86_64,
>>> Linux_ppc64, linux_ppc64le, aix_ppc64, linux_s390
>>> I did not spot a problem there.
>>>
>>> Best regards,
>>>   Goetz.
>>>
>>>> -----Original Message-----
>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>> Sent: Mittwoch, 30. November 2016 22:51
>>>> To: daniel.daugherty at oracle.com; Lindenmaier, Goetz
>>>> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net;
>>>> 'hotspot-runtime-dev at openjdk.java.net' <hotspot-runtime-
>>>> dev at openjdk.java.net>
>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard 
>>>> error.
>>>>
>>>> On 1/12/2016 1:20 AM, Daniel D. Daugherty wrote:
>>>>>> Would you mind sponsoring this change?
>>>>>
>>>>> Yes, I will sponsor this change. However, I would like to get a
>>>>> thumbs up from David H. on the latest version and I don't see
>>>>> any review from someone on the Compiler Team.
>>>>
>>>> I'm okay with proposed changes - but also want to hear from compiler
>>>> team and I'd want to see this put through some advance testing 
>>>> before it
>>>> gets pushed (full rbt run).
>>>>
>>>> I have one minor nit in the wording of the fatal error messages 
>>>> "failed
>>>> with errno" - these methods don't touch errno so I'd prefer it if it
>>>> said error.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Vladimir! We need someone on the Compiler Team to look at these
>>>>> CompilerThread stack size changes...
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 11/30/16 12:57 AM, Lindenmaier, Goetz wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>>> Thumbs up! I don't need to see a new webrev if you choose
>>>>>>> to fix the minor comments above.
>>>>>> I anyways did a new one fixing the comments:
>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.06/ 
>>>>>>
>>>>>> Would you mind sponsoring this change?
>>>>>>
>>>>>> I took the minimum stack sizes from my experimental runs where
>>>>>> I had removed the automatic resizing to find the really needed 
>>>>>> space.
>>>>>> If I put something smaller there, I could as well put '0' ... as
>>>>>> _java_thread_min_stack_allowed =
>>>> MAX2(_java_thread_min_stack_allowed,
>>>>>>
>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>
>>>>>> JavaThread::stack_shadow_zone_size() +
>>>>>>                                          (4 * BytesPerWord
>>>>>> COMPILER2_PRESENT(+ 2)) * 4 * K);
>>>>>> will fix it.
>>>>>> This code effectively limits the usable stack size to
>>>>>>     (4 * BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K)
>>>>>> which makes the initialization of _java_thread_min_stack_allowed 
>>>>>> with
>>>>>> platform
>>>>>> values pointless.
>>>>>>
>>>>>> I'll open a new bug for the follow-up stack issue, probably 
>>>>>> tomorrow.
>>>>>> I don't feel like addressing testing all the possible error 
>>>>>> codes, as
>>>>>> they probably should be fixed in more places, and there is no issue
>>>>>> pending currently.  Maybe it should be fixed in jdk10 at some point.
>>>>>>
>>>>>> Best regards,
>>>>>>    Goetz
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>> Sent: Dienstag, 29. November 2016 20:04
>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 
>>>>>>> hotspot-compiler-
>>>>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net' 
>>>>>>> <hotspot-
>>>>>>> runtime-dev at openjdk.java.net>
>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard 
>>>>>>> error.
>>>>>>>
>>>>>>> On 11/29/16 2:30 AM, Lindenmaier, Goetz wrote:
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> see my replies inline ...
>>>>>>>> New webrev:
>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.05/ 
>>>>>>>>
>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>       L887:   // libc guard page
>>>>>>>           nit - You made other existing comments into sentences 
>>>>>>> (leading
>>>>>>>                 capital and trailing '.'), but not this new 
>>>>>>> comment.
>>>>>>> Why?
>>>>>>>
>>>>>>> src/os/aix/vm/os_aix.hpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>       L6096: //    |                        |/  1 page glibc guard.
>>>>>>>           nit - "1 page glibc guard" -> "1 glibc guard page."
>>>>>>>
>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp
>>>>>>>       L875: //    |                        |/  1 page glibc guard.
>>>>>>>           nit - "1 page glibc guard" -> "1 glibc guard page."
>>>>>>>
>>>>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
>>>>>>>       No comments.
>>>>>>>
>>>>>>>
>>>>>>> Thumbs up! I don't need to see a new webrev if you choose
>>>>>>> to fix the minor comments above.
>>>>>>>
>>>>>>> Some replies embedded below.
>>>>>>>
>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>>>> Sent: Dienstag, 29. November 2016 01:38
>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-
>>>> compiler-
>>>>>>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net'
>>>> <hotspot-
>>>>>>>>> runtime-dev at openjdk.java.net>
>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>>>>>> error.
>>>>>>>>>
>>>>>>>>> On 11/28/16 2:08 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm working on a fix for OS guard pages on stacks.  I figured 
>>>>>>>>>> there
>>>>>>>>>> are VM guard pages (reserved, yellow, red) on the compiler 
>>>>>>>>>> stacks
>>>>>>>>>> _and_ OS guard pages.  For Java threads, the OS guard pages 
>>>>>>>>>> are left
>>>>>>>>>> out.  I think this should be done for compiler threads, too. 
>>>>>>>>>> Please
>>>>>>>>>> confirm.
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>> stackFix/webrev.04/
>>>>>>>>>>
>>>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>>>        L888: pthread_attr_setguardsize(&attr,
>>>>>>>>> os::Aix::default_guard_size(thr_type));
>>>>>>>>>            No check or assert on the return status of this call.
>>>>>>>>>            Is one needed?
>>>>>>>> I implemented this as the existing code on linux which has
>>>>>>>> no check either.  I think a failure is quite theoretical.
>>>>>>>> Because of your comment below I'll leave it as-is.
>>>>>>> OK.
>>>>>>>
>>>>>>>
>>>>>>>>>        L3044:   // guard pages, so only enable libc guard 
>>>>>>>>> pages for
>>>>>>>>> non-Java threads.
>>>>>>>>>            src/os/linux/vm/os_linux.cpp also has this comment:
>>>>>>>>>            // (Remember: compiler thread is a Java thread, too!)
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>>        L3051:   return ((thr_type == java_thread || thr_type ==
>>>>>>>>> compiler_thread) ? 0 : 4*K);
>>>>>>>>>            nit - please add spaces around the '*' so '4 * K'.'
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>> src/os/aix/vm/os_aix.hpp
>>>>>>>>>        No comments.
>>>>>>>>>
>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>>        L729:   // is not implemented properly. The posix standard
>>>>>>>>> requires
>>>>>>>>> to add
>>>>>>>>>            Typo: 'to add' -> 'adding'
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>>        L738: pthread_attr_setguardsize(&attr,
>>>>>>>>> os::Linux::default_guard_size(thr_type));
>>>>>>>>>            No check or assert on the return status of this call.
>>>>>>>>>            Is one needed?
>>>>>>>> See above.
>>>>>>>>
>>>>>>>>>        L2851:   // Creating guard page is very expensive. Java
>>>>>>>>> thread has
>>>>>>>>> HotSpot
>>>>>>>>>        L2852:   // guard page, only enable glibc guard page for
>>>>>>>>> non-Java
>>>>>>>>> threads.
>>>>>>>>>        L2853:   // (Remember: compiler thread is a java 
>>>>>>>>> thread, too!)
>>>>>>>>>            Typo: "java thread" -> "Java thread" (consistency)
>>>>>>>>>
>>>>>>>>>            This comment block should be common to all the 
>>>>>>>>> platforms
>>>>>>>>> that
>>>>>>>>>            define default_guard_size(). Yes, I can see that AIX
>>>>>>>>> needs to
>>>>>>>>>            add another paragraph, but we should make the core 
>>>>>>>>> comment
>>>>>>> common
>>>>>>>>>            if possible.
>>>>>>>> I made the first three lines look alike.
>>>>>>>>
>>>>>>>>>        L6090: // Java/Compiler thread:
>>>>>>>>>            Thanks for making this common in os_linux.cpp.
>>>>>>>>>
>>>>>>>>>        L6095: //    |    glibc guard page    | - guard, 
>>>>>>>>> attached Java
>>>>>>>>> thread usually has
>>>>>>>>>            Clarity: "guard," -> "guard page,"
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>>            Typo: "Java thread" -> "JavaThread" (consistency)
>>>>>>>> I changed both to Java thread as at the other locations.
>>>>>>>>
>>>>>>>>>        L6099: //    | HotSpot Guard Pages   | - red and yellow 
>>>>>>>>> pages
>>>>>>>>>            The fairly recently added reserved page should be 
>>>>>>>>> mentioned
>>>>>>>>>            here also?
>>>>>>>> Yes. Fixed.   Also fixed it to say
>>>>>>>> JavaThread::stack_reserved_zone_base().
>>>>>>>> Also fixed comment on bsd.
>>>>>>> Thanks for also fixing BSD.
>>>>>>>
>>>>>>>
>>>>>>>>>        L6120 // ** P1 (aka bottom) and size ( P2 = P1 - size) 
>>>>>>>>> are the
>>>>>>>>> address and stack size returned from
>>>>>>>>>            Typo: "( P2 = ..." -> "(P2 = ..."
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>>        L6148: fatal("Can not locate current stack attributes!");
>>>>>>>>>            Typo: "Can not" -> "Cannot"
>>>>>>>> Fixed.
>>>>>>>>
>>>>>>>>>        L6175:   // stack size includes normal stack and HotSpot
>>>>>>>>> guard pages
>>>>>>>>>            Perhaps add to the comment:
>>>>>>>>>            "for the threads that have HotSpot guard pages."
>>>>>>>> Fixed.  I also checked my comments for "OS guard pages" and
>>>>>>>> replaced it by "glibc guard pages" which is used in several places
>>>>>>>> already, same for "VM guard page" --> "HotSpot guard page". I
>>>>>>>> think this is also more consistent.
>>>>>>> I agree!
>>>>>>>
>>>>>>>
>>>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>>>        L1097:   pthread_attr_getstacksize(attr, &stack_size);
>>>>>>>>>        L1098:   pthread_attr_getguardsize(attr, &guard_size);
>>>>>>>>>            Do these two calls need to have their result checked?
>>>>>>>>>
>>>>>>>>>            Now I'm starting to wonder if all the uses of these
>>>>>>>>>            two APIs need to be checked? Separate bug?
>>>>>>>> It would be more consistent with the specification of the methods,
>>>>>>>> On the other side it's quite unlikely that these fail if attr 
>>>>>>>> != NULL.
>>>>>>> So should we file a new bug? Or does this fall into the realm of
>>>>>>> other OS/libc code that we call and assume never fails? :-)
>>>>>>>
>>>>>>>
>>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>>>        L540: size_t 
>>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>>> 512 * K;
>>>>>>>>>        L541: size_t os::Posix::_java_thread_min_stack_allowed 
>>>>>>>>> = 512
>>>>>>>>> * K;
>>>>>>>>>            So prior to the fix for 8140520,
>>>>>>>>> src/os/aix/vm/os_aix.cpp had
>>>>>>>>>            this single min_stack_allowed value:
>>>>>>>>>
>>>>>>>>>            L3601:   os::Aix::min_stack_allowed =
>>>>>>>>> MAX2(os::Aix::min_stack_allowed,
>>>>>>>>>            L3602: JavaThread::stack_guard_zone_size() +
>>>>>>>>>            L3603: JavaThread::stack_shadow_zone_size() +
>>>>>>>>> L3604: (4*BytesPerWord
>>>>>>>>> COMPILER2_PRESENT(+2)) * 4 * K);
>>>>>>>>>
>>>>>>>>>            and the fix for 8140520 changed that for *NIX 
>>>>>>>>> platforms to
>>>>>>>>>            three mins in src/os/posix/vm/os_posix.cpp:
>>>>>>>>>
>>>>>>>>>            L1108:   _java_thread_min_stack_allowed =
>>>>>>>>> MAX2(_java_thread_min_stack_allowed,
>>>>>>>>>            L1109: JavaThread::stack_guard_zone_size() +
>>>>>>>>>            L1110: JavaThread::stack_shadow_zone_size() +
>>>>>>>>> L1111:                                         (4 *
>>>>>>>>> BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K);
>>>>>>>>>
>>>>>>>>>            L1150: _compiler_thread_min_stack_allowed =
>>>>>>>>> align_size_up(_compiler_thread_min_stack_allowed, 
>>>>>>>>> vm_page_size());
>>>>>>>>>
>>>>>>>>>            L1161 _vm_internal_thread_min_stack_allowed =
>>>>>>>>> align_size_up(_vm_internal_thread_min_stack_allowed,
>>>> vm_page_size());
>>>>>>>>>
>>>>>>>>>            Which means that the compiler_thread no longer 
>>>>>>>>> benefits
>>>>>>>>> from
>>>>>>>>>            the extra space for quard and shadow pages. The 
>>>>>>>>> thinking in
>>>>>>>>>            8140520 was that the compiler_thread and
>>>>>>>>> vm_internal_threads
>>>>>>>>>            don't need the quard and shadow pages since they 
>>>>>>>>> don't run
>>>>>>>>>            Java code (ignoring JVMCI for now).
>>>>>>>>>
>>>>>>>>>            So I can see bumping 
>>>>>>>>> _compiler_thread_min_stack_allowed
>>>>>>>>> from
>>>>>>>>>            128 -> 512 as one solution for getting that extra 
>>>>>>>>> space
>>>>>>>>> back.
>>>>>>>>>            However, I don't understand why
>>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>>            has changed from 128 -> 512.
>>>>>>>> Because it was never correct before.
>>>>>>> OK. That sounds like the new test that I included with 8140520 
>>>>>>> would
>>>>>>> have failed with JavaThread stack sizes even before the product 
>>>>>>> code
>>>>>>> changes from 8140520 were made.
>>>>>>>
>>>>>>> Since the size calculation for JavaThread stack sizes wasn't 
>>>>>>> changed
>>>>>>> for any platform in 8140520, that tends to indicate that the more
>>>>>>> limited JDK test (test/tools/launcher/TooSmallStackSize.java) 
>>>>>>> should
>>>>>>> also have failed before the fix for 8140520.
>>>>>>>
>>>>>>> Please clarify the need for the _java_thread_min_stack_allowed 
>>>>>>> change
>>>>>>> from 128 -> 512. Unless test/tools/launcher/TooSmallStackSize.java
>>>>>>> is never run in your testing, I'm having troubling seeing why the
>>>>>>> _java_thread_min_stack_allowed increase is needed.
>>>>>>>
>>>>>>>
>>>>>>>>>            I had previously made this comment:
>>>>>>>>>            > To put it another way, I'd like to see us add extra
>>>>>>>>> space to
>>>>>>>>>            > solve the 64K page issue directly instead of as a 
>>>>>>>>> side
>>>>>>>>> effect
>>>>>>>>>            > of the red/yellow page addition.
>>>>>>>>>            And Goetz replied with:
>>>>>>>>>            > I don't understand.  What do you mean by 'directly'?
>>>>>>>>>
>>>>>>>>>            So prior to the fix for 8140520,
>>>>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>>>>>            had a block like this:
>>>>>>>>>
>>>>>>>>>            L4468:   // For 64kbps there will be a 64kb page size,
>>>>>>>>> which makes
>>>>>>>>>            L4469:   // the usable default stack size quite a 
>>>>>>>>> bit less.
>>>>>>>>> Increase the
>>>>>>>>>            L4470:   // stack for 64kb (or any > than 8kb) 
>>>>>>>>> pages, this
>>>>>>>>> increases
>>>>>>>>>            L4471:   // virtual memory fragmentation (since 
>>>>>>>>> we're not
>>>>>>>>> creating the
>>>>>>>>>            L4472   // stack on a power of 2 boundary.  The 
>>>>>>>>> real fix
>>>>>>>>> for this
>>>>>>>>>            L4473   // should be to fix the guard page mechanism.
>>>>>>>>>            L4474
>>>>>>>>>            L4475   if (vm_page_size() > 8*K) {
>>>>>>>>>            L4476     threadStackSizeInBytes =
>>>>>>>>> (threadStackSizeInBytes != 0)
>>>>>>>>>            L4477        ? threadStackSizeInBytes +
>>>>>>>>>            L4478 JavaThread::stack_red_zone_size() +
>>>>>>>>>            L4479 JavaThread::stack_yellow_zone_size()
>>>>>>>>>            L4480        : 0;
>>>>>>>>>            L4481     ThreadStackSize = threadStackSizeInBytes/K;
>>>>>>>>>            L4482   }
>>>>>>>>>
>>>>>>>>>            The above is an example of what I mean by solving 
>>>>>>>>> the 64K
>>>>>>>>>            page issue directly. In the fix for 8140520, that code
>>>>>>>>> block
>>>>>>>>>            was moved to os::Posix::set_minimum_stack_sizes() in
>>>>>>>>>            src/os/posix/vm/os_posix.cpp and put in a "#ifdef
>>>>>>>>> SOLARIS...
>>>>>>>>>            #endif // SOLARIS" block. Coleen filed a bug to 
>>>>>>>>> determine
>>>>>>>>>            whether that code can be deleted:
>>>>>>>>>
>>>>>>>>>            JDK-8161093 Solaris for >8k pagesize adds extra 
>>>>>>>>> guard pages
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161093
>>>>>>>>>
>>>>>>>>>            but perhaps this bug shows that the code is needed?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>            OK so this is probably the longest code review comment
>>>>>>>>>            I have ever written, but the summary is:
>>>>>>>>>
>>>>>>>>>            - I understand bumping 
>>>>>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>>>              but should it be solved in a different way?
>>>>>>>>>            - I don't understand bumping 
>>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>> I plan to do a follow up change to fix this.  Let's leave this
>>>>>>>> discussion
>>>>>>>> to that review.  Here I just want to fix the NPTL issue and the 
>>>>>>>> basic
>>>>>>>> sizing that is needed to pass the new test on ppc/s390.
>>>>>>> Same question here about the simpler JDK version of the test.
>>>>>>>
>>>>>>> Does test/tools/launcher/TooSmallStackSize.java get run in
>>>>>>> your test environments?
>>>>>>>
>>>>>>>
>>>>>>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
>>>>>>>>>        No comments.
>>>>>>>>>
>>>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>>>        L538: size_t 
>>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>>> 384 * K;
>>>>>>>>>        L539: size_t os::Posix::_java_thread_min_stack_allowed 
>>>>>>>>> = 384
>>>>>>>>> * K;
>>>>>>>>>
>>>>>>>>>            Same monster comment as
>>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>>>            and the same summary:
>>>>>>>>>
>>>>>>>>>            - I understand bumping 
>>>>>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>>>              but should it be solved in a different way?
>>>>>>>>>            - I don't understand bumping 
>>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>>
>>>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>>>        L478: size_t 
>>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>>> 128 * K;
>>>>>>>>>        L479: size_t os::Posix::_java_thread_min_stack_allowed 
>>>>>>>>> = 236
>>>>>>>>> * K;
>>>>>>>>>            Bumping _java_thread_min_stack_allowed but not bumping
>>>>>>>>>            _compiler_thread_min_stack_allowed. I'm confused here.
>>>>>>>> The numbers are what I need to startup on the machines.  128 is 
>>>>>>>> just
>>>>>>>> fine on the machines we have.  (That's the problem of the
>>>>>>>> current setup: you have to tune this compile time constant for the
>>>>>>>> page size of the machine you are running on. But let's discuss 
>>>>>>>> this
>>>>>>>> in the follow up change.)
>>>>>>> OK about discussing this with a follow-up change. I guess I see
>>>>>>> the compile time initialization as a "minimum setting assuming the
>>>>>>> smallest page size". If we discover (at runtime) that the page
>>>>>>> size is bigger, then we adjust the minimum that we need...
>>>>>>>
>>>>>>> Again, defer to another bug. Do we have a bug ID yet?
>>>>>>>
>>>>>>>
>>>>>>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
>>>>>>>>>        No comments.
>>>>>>>>>
>>>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>>>        No comments.
>>>>>>>>>
>>>>>>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
>>>>>>>>>        No comments.
>>>>>>>>>
>>>>>>>>> Sorry it took me so long to write this up...
>>>>>>>> No matter, thanks for this thorough review!
>>>>>>> You are very welcome. Thanks for being willing to dive into such
>>>>>>> a complicated area (thread stack sizes)...
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>     Goetz.
>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>> The change affecting the compier threads is in os_linux.cpp,
>>>>>>>>> default_guard_size(),
>>>>>>>>>> where '|| thr_type == compiler_thread' has been added. The 
>>>>>>>>>> function
>>>>>>>>>> was also moved from the os_cpu files, as it's identical on 
>>>>>>>>>> all cpus.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>      Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>> Sent: Montag, 28. November 2016 00:25
>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>> 'daniel.daugherty at oracle.com' <daniel.daugherty at oracle.com>;
>>>>>>> 'hotspot-
>>>>>>>>>>> runtime-dev at openjdk.java.net' <hotspot-runtime-
>>>> dev at openjdk.java.net>
>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack 
>>>>>>>>>>> guard
>>>>>>>>>>> error.
>>>>>>>>>>>
>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>
>>>>>>>>>>> On 24/11/2016 10:15 PM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I now edited the stuff I had proposed below:
>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>>>> stackFix/webrev.03/
>>>>>>>>>>>> This includes
>>>>>>>>>>>>      - the NPTL fix from webrev.02
>>>>>>>>>>> Okay in principle. As discussed this only impacts 
>>>>>>>>>>> non-JavaThreads
>>>>>>>>>>> so the
>>>>>>>>>>> change should be minimal.
>>>>>>>>>>>
>>>>>>>>>>>>      - merging code on linux
>>>>>>>>>>> Went a bit further than I had expected but if this truly 
>>>>>>>>>>> isn't CPU
>>>>>>>>>>> dependent code then great!
>>>>>>>>>>>
>>>>>>>>>>>>      - not adding OS guard to compiler threads.
>>>>>>>>>>> Okay in principle. IIUC we will now save the OS guard page for
>>>>>>>>>>> compiler
>>>>>>>>>>> thread stacks. Is that the only impact? The 
>>>>>>>>>>> hotspot-compiler-dev
>>>>>>>>>>> folk
>>>>>>>>>>> may want to sign off on this part.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> A few minor comments:
>>>>>>>>>>>
>>>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>>>>
>>>>>>>>>>> 2854   return ((thr_type == java_thread || thr_type ==
>>>>>>>>>>> os::compiler_thread) ...
>>>>>>>>>>>
>>>>>>>>>>> os:: should be used for both types or none.
>>>>>>>>>>>
>>>>>>>>>>> 6153     pthread_attr_getguardsize(&attr, &guard_size);
>>>>>>>>>>>
>>>>>>>>>>> Can you at least verify a zero return code in an
>>>>>>>>>>> assert/assert_status
>>>>>>>>>>> please.
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>>>>>
>>>>>>>>>>> Are the changes to min_stack_allowed just fixing an existing 
>>>>>>>>>>> bug?
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I think this should be pushed for this bug ID. For the other
>>>>>>>>>>>> changes I'll
>>>>>>>>>>>> make another bug.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>      Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Lindenmaier, Goetz
>>>>>>>>>>>>> Sent: Wednesday, November 23, 2016 8:11 AM
>>>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>;
>>>>>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
>>>> dev at openjdk.java.net
>>>>>>>>>>>>> Subject: RE: RFR(M): 8169373: Work around linux NPTL stack 
>>>>>>>>>>>>> guard
>>>>>>> error.
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct 
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>> JVMCI.
>>>>>>> The
>>>>>>>>>>>>>> ability for a CompilerThread to execute Java code
>>>>>>>>>>>>>> (can_call_java()) is
>>>>>>>>>>>>>> now a dynamic property depending on whether the current
>>>> compiler
>>>>>>> is
>>>>>>>>> the
>>>>>>>>>>>>>> JVMCI compiler.
>>>>>>>>>>>>> Ah, then I should also leave space for shadow pages in the 
>>>>>>>>>>>>> minimal
>>>>>>> stack
>>>>>>>>>>> size
>>>>>>>>>>>>> of comiler threads.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do we agree on the cleanup and on leaving out the OS guard 
>>>>>>>>>>>>> page
>>>> on
>>>>>>>>>>>>> compiler threads?
>>>>>>>>>>>>> Then I would edit a change comprising the NPTL workaround and
>>>>>>>>>>>>> these
>>>>>>>>>>>>> additional changes, and split the other issue into a new bug?
>>>>>>>>>>>>> I think this
>>>>>>>>>>>>> will easy the reviewing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>      Goetz.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>>>> Sent: Mittwoch, 23. November 2016 02:50
>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
>>>> dev at openjdk.java.net
>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>>> guard
>>>>>>>>> error.
>>>>>>>>>>>>>> On 22/11/2016 11:19 PM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>> From: Daniel D. Daugherty
>>>> [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>>>>>>> Sent: Dienstag, 22. November 2016 14:01
>>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David
>>>>>>> Holmes
>>>>>>>>>>>>>>>> <david.holmes at oracle.com>; hotspot-runtime-
>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>>>>>>>>>>>>>>> guard
>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>> On 11/22/16 3:55 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I ran into a row of issues, some errors on the platforms.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What I meant with that comment is that
>>>>>>>>>>>>>>>>> os::Linux::min_stack_allowed =
>>>>>>> MAX2(os::Linux::min_stack_allowed,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> JavaThread::stack_shadow_zone_size() +
>>>>>>>>>>>>>>>>>                                           (4*BytesPerWord
>>>>>>>>> COMPILER2_PRESENT(+2)) *
>>>>>>>>>>> 4
>>>>>>>>>>>>> *
>>>>>>>>>>>>>> K);
>>>>>>>>>>>>>>>>> was executed, and min_stack_allowed used for all 
>>>>>>>>>>>>>>>>> stacks. Now,
>>>>>>>>>>>>> compiler
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> vm minimum stack sizes are not increased by these sizes.
>>>>>>>>>>>>>>>> Now I see what you mean. Thanks for clearing this up.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I should have remembered that part of the change 
>>>>>>>>>>>>>>>> because we
>>>>>>> went
>>>>>>>>>>>>> back
>>>>>>>>>>>>>>>> and forth about removing the red/yellow zone pages from 
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>>> threads. In particular, we discussed the compiler thread
>>>>>>>>>>>>>>>> because it
>>>>>>>>>>>>>>>> is-a JavaThread. Our conclusion was that a compiler thread
>>>>>>>>>>>>>>>> doesn't
>>>>>>>>>>>>>>>> execute Java bytecode so we could remove the red/yellow
>>>>>>>>>>>>>>>> pages...
>>>>>>>>>>>>>>> Yes, it does not execute java byte code.
>>>>>>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct 
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>> JVMCI.
>>>>>>> The
>>>>>>>>>>>>>> ability for a CompilerThread to execute Java code
>>>>>>>>>>>>>> (can_call_java()) is
>>>>>>>>>>>>>> now a dynamic property depending on whether the current
>>>> compiler
>>>>>>> is
>>>>>>>>> the
>>>>>>>>>>>>>> JVMCI compiler.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Therefore you can remove the shadow pages.  There is no 
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> will bang.
>>>>>>>>>>>>>>> But red/yellow pages are protected right during thread 
>>>>>>>>>>>>>>> startup.
>>>>>>>>>>>>>>> Therefore you must have enough space for them.
>>>>>>>>>>>>>>> On ppc, we try to protect three 64K pages out of the 128K
>>>>>>>>>>>>>>> compiler
>>>>>>>>> stack.
>>>>>>>>>>>>>>> That obviously fails.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Therefore I propose:
>>>>>>>>>>>>>>>         size_t os::Posix::_java_thread_min_stack_allowed 
>>>>>>>>>>>>>>> = 48
>>>>>>>>>>>>>>> * K;   //
>>>>>>> Set
>>>>>>>>>>>>>> platform dependent.
>>>>>>>>>>>>>>> in os::Posix::set_minimum_stack_sizes():
>>>>>>>>>>>>>>>      _java_thread_min_stack_allowed =
>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>> +
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JavaThread::stack_shadow_zone_size();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (Similar for _compiler_thread_min_stack_allowed).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The minimal stack size is made up of three components:
>>>>>>>>>>>>>>>       1. Sizes of interpreter/C1/C2 frames. Depends on 
>>>>>>>>>>>>>>> HotSpot
>>>>>>>>>>>>>> implementation/platform/os.
>>>>>>>>>>>>>>>       2. Sizes of C++ frames: depends on C++ compiler.
>>>>>>>>>>>>>>> These are fixed at compile time.
>>>>>>>>>>>>>>>       3. Sizes of red/yellow/reserved/shadow pages. Depends
>>>>>>>>>>>>>>> on the
>>>>>>>>> system
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>           VM is used on.  This is not fixed at compile 
>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>> (Our ppc
>>>>>>>>> machines
>>>>>>>>>>>>>> differ
>>>>>>>>>>>>>>>          in page size.)
>>>>>>>>>>>>>>> Therefore 3. should not be included in a compile time 
>>>>>>>>>>>>>>> constant.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And that decision allowed us to be exposed to the 64K page
>>>>>>>>>>>>>>>> issue
>>>>>>>>>>>>>>>> because the "extra" space isn't there anymore.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At least the _compiler_thread_min_stack_allowed should be
>>>>>>>>> increased
>>>>>>>>>>>>>>>>> similarly by red/yellow zone size.  The compiler 
>>>>>>>>>>>>>>>>> thread is
>>>>>>>>>>>>>>>>> a Java
>>>>>>>>>>>>>>>>> thread and must have space for these zones.
>>>>>>>>>>>>>>>> I'm not sure that I completely agree (yet). To me, the
>>>>>>>>>>>>>>>> red/yellow
>>>>>>>>>>>>>>>> pages are there for Java thread stack overflow semantics.
>>>>>>>>>>>>>>>> Yes, the
>>>>>>>>>>>>>>>> compiler thread needs extra space when 64K pages are used,
>>>>>>>>>>>>>>>> but I
>>>>>>>>>>>>>>>> would prefer that we add that space via a different
>>>>>>>>>>>>>>>> calculation.
>>>>>>>>>>>>>>> Yes they are. But compiler threads happen tob e a 
>>>>>>>>>>>>>>> subclass of
>>>>>>>>>>>>>>> Java threads and use the same run() method that puts the 
>>>>>>>>>>>>>>> pages
>>>>>>>>>>>>>>> There.
>>>>>>>>>>>>>>> I agree that they are not needed for Compiler threads, 
>>>>>>>>>>>>>>> nor for
>>>>>>>>>>>>>>> CodeCacheSweeperThreads. I don't really now about
>>>>>>>>>>>>>>>     JvmtiAgentThreads and ServiceThreads, but all of the 
>>>>>>>>>>>>>>> get
>>>>>>>>>>>>>>> the guard
>>>>>>>>>>>>> pages
>>>>>>>>>>>>>>> because they are derived from JavaThread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To put it another way, I'd like to see us add extra 
>>>>>>>>>>>>>>>> space to
>>>>>>>>>>>>>>>> solve
>>>>>>>>>>>>>>>> the 64K page issue directly instead of as a side effect 
>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>> red/yellow page addition.
>>>>>>>>>>>>>>> I don't understand.  What do you mean by 'directly'?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also, the change added a test that is failing now.
>>>>>>>>>>>>>>>> And that's a "good thing" (TM), right? :-)
>>>>>>>>>>>>>>> Yes, it showed a bug and thus raised the need to fix it! :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>      Goetz.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Again, thanks for clarifying 8140520's role in this issue.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>> From: Daniel D. Daugherty
>>>>>>> [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>>>>>>>>> Sent: Montag, 21. November 2016 17:28
>>>>>>>>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>; Lindenmaier,
>>>>>>> Goetz
>>>>>>>>>>>>>>>>>> <goetz.lindenmaier at sap.com>; hotspot-runtime-
>>>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL 
>>>>>>>>>>>>>>>>>> stack
>>>>>>> guard
>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>>> Sorry for the delayed responses to this thread. I've 
>>>>>>>>>>>>>>>>>> been on
>>>>>>>>> vacation...
>>>>>>>>>>>>>>>>>> One comment/query embedded below...
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 11/10/16 8:40 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 11/11/2016 8:00 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This issue is different to 6675312, see also my 
>>>>>>>>>>>>>>>>>>>> comment
>>>>>>>>>>>>>>>>>>>> in the
>>>>>>>>> bug.
>>>>>>>>>>>>>>>>>>>> It appears running jtreg test
>>>>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java,
>>>>>>>>>>>>>>>>>>>> with my patch below you can reproduce it on 
>>>>>>>>>>>>>>>>>>>> linuxx86_64.
>>>>>>>>>>>>>>>>>>>> You
>>>>>>>>> can
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>> do that with 6675312. Also, I would assume there are
>>>>>>>>>>>>>>>>>>>> systems
>>>>>>> out
>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>>>> on x86 that uses 64-K pages, did you run the tests on
>>>>>>>>>>>>>>>>>>>> these? I
>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>> assume you get hard crashes with stack overflows in 
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> first
>>>>>>> C2
>>>>>>>>>>>>>>>>>>>> compilation if there is only 64K usable
>>>>>>>>>>>>>>>>>>>> CompilerThreadStack.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> My fix does not affect Java threads, which are the 
>>>>>>>>>>>>>>>>>>>> largest
>>>>>>> amount
>>>>>>>>>>>>>>>>>>>> of threads used by the VM. It affects only the 
>>>>>>>>>>>>>>>>>>>> non-Java
>>>>>>>>>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>>>> It adds one page to these threads. The page does not
>>>>>>>>>>>>>>>>>>>> require
>>>>>>>>>>>>>> memory,
>>>>>>>>>>>>>>>>>>>> as it's protected. The stack will only require more
>>>>>>>>>>>>>>>>>>>> space if the
>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>>>>>> ran into a stack overflow before the fix as else the
>>>>>>>>>>>>>>>>>>>> pages are
>>>>>>> not
>>>>>>>>>>>>>>>>>>>> mapped.
>>>>>>>>>>>>>>>>>>>> This are stack overflows that cause hard crashes, at
>>>>>>>>>>>>>>>>>>>> least on
>>>>>>> ppc
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> VM
>>>>>>>>>>>>>>>>>>>> does not properly catch these stack overflows, so 
>>>>>>>>>>>>>>>>>>>> any setup
>>>>>>>>>>>>> working
>>>>>>>>>>>>>>>> now
>>>>>>>>>>>>>>>>>>>> will not run into the additional space. Altogether 
>>>>>>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>>>> should
>>>>>>> be
>>>>>>>>>>>>> no
>>>>>>>>>>>>>>>>>>>> effect on running systems besides requiring one more
>>>>>>>>>>>>>>>>>>>> entry in
>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> page table per non-Java thread.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The problem is caused by a rather recent change
>>>> (8140520:
>>>>>>>>>>>>> segfault
>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>> solaris-amd64
>>>>>>>>>>>>>>>>>>>> with "-XX:VMThreadStackSize=1" option) which was 
>>>>>>>>>>>>>>>>>>>> pushed
>>>>>>> after
>>>>>>>>>>>>>>>>>>>> feature-close. As this was a rather recent change, it
>>>>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>>>>> possible to
>>>>>>>>>>>>>>>>>>>> fix this follow up issue. What else is this period in
>>>>>>>>>>>>>>>>>>>> the project
>>>>>>>>> good
>>>>>>>>>>>>>>>>>>>> for if not fixing issues?
>>>>>>>>>>>>>>>>>>> So I am seeing a number of factors here.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> First, 8140520, set:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> size_t os::Posix::_compiler_thread_min_stack_allowed 
>>>>>>>>>>>>>>>>>>> = 128
>>>> *
>>>>>>> K;
>>>>>>>>>>>>>>>>>> So I'm confused by the above comment:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>      > The problem is caused by a rather recent change
>>>>>>>>>>>>>>>>>> (8140520:
>>>>>>>>>>>>> segfault
>>>>>>>>>>>>>>>>>>      > on solaris-amd64 with "-XX:VMThreadStackSize=1"
>>>>>>>>>>>>>>>>>> option)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8140520-webrev/5-jdk9- 
>>>>>>>>>>>>>>>>>>
>>>> hs-
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>> open/hotspot/src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp.frames.html
>>>>>>>>>>>>>>>>>> shows this change:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> @@ -531,19 +531,17 @@
>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>> ////////////////////////////////////////////////////////////////////////////// 
>>>>>>>
>>>>>>>
>>>>>>>>>>>>>>>>>> //
>>>>>>>>>>>>>>>>>>       // thread stack
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -size_t os::Linux::min_stack_allowed = 128*K;
>>>>>>>>>>>>>>>>>> +size_t os::Posix::_compiler_thread_min_stack_allowed 
>>>>>>>>>>>>>>>>>> = 128
>>>> *
>>>>>>> K;
>>>>>>>>>>>>>>>>>> +size_t os::Posix::_java_thread_min_stack_allowed = 
>>>>>>>>>>>>>>>>>> 128 * K;
>>>>>>>>>>>>>>>>>> +size_t 
>>>>>>>>>>>>>>>>>> os::Posix::_vm_internal_thread_min_stack_allowed =
>>>>>>>>>>>>>>>>>> 128
>>>>>>> *
>>>>>>>>> K;
>>>>>>>>>>>>>>>>>> so the existing single variable of 
>>>>>>>>>>>>>>>>>> 'min_stack_allowed' was
>>>>>>>>>>>>>>>>>> replaced by three variables:
>>>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>>>>>>>>>>>> _java_thread_min_stack_allowed, and
>>>>>>>>>>>>>>>>>> _vm_internal_thread_min_stack_allowed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The old single variable and the three new variables 
>>>>>>>>>>>>>>>>>> are all
>>>>>>>>>>>>>>>>>> initialized to the same value (128K) so the fix for 
>>>>>>>>>>>>>>>>>> 8140520
>>>>>>>>>>>>>>>>>> did not change stack sizes for this platform. In 
>>>>>>>>>>>>>>>>>> fact, only
>>>>>>>>>>>>>>>>>> one platform had a size change (Solaris X64).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So I'm confused about how the fix for 8140520 caused 
>>>>>>>>>>>>>>>>>> this
>>>>>>> problem.
>>>>>>>>>>>>>>>>>> Based on David's analysis below, it looks to me like 
>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>> 64K stack
>>>>>>>>>>>>>>>>>> guard page problem should also exist prior to the fix 
>>>>>>>>>>>>>>>>>> for
>>>>>>> 8140520.
>>>>>>>>>>>>>>>>>> Goetz, can you please explain how 8140520 caused this
>>>>>>>>>>>>>>>>>> problem?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Second on linux PPC it is hardwired to use 2 guard 
>>>>>>>>>>>>>>>>>>> pages:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>      return 2 * page_size();
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Third, you had a pagesize of 64K.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Fourth, NPTL takes the guard space from the stack 
>>>>>>>>>>>>>>>>>>> space -
>>>>>>> hence
>>>>>>>>>>>>> with
>>>>>>>>>>>>>> 2
>>>>>>>>>>>>>>>>>>> x 64K guard, and a 128K stack it was all consumed.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In the proposed changes you now only use page_size() 
>>>>>>>>>>>>>>>>>>> for
>>>> the
>>>>>>>>> guard,
>>>>>>>>>>>>>> so
>>>>>>>>>>>>>>>>>>> that alone would have fixed the observed problem.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> But in addition you want to address the NPTL problem by
>>>>>>>>>>>>>>>>>>> adding
>>>>>>>>>>>>> back
>>>>>>>>>>>>>>>>>>> the guard space to the stack size requested. That alone
>>>>>>>>>>>>>>>>>>> would
>>>>>>> also
>>>>>>>>>>>>>>>>>>> have fixed the observed problem. :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> But in addition you have increased the minimum stack 
>>>>>>>>>>>>>>>>>>> size:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ! size_t 
>>>>>>>>>>>>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>>>>>>>>>>>>> 192 *
>>>>>>> K;
>>>>>>>>>>>>>>>>>>> which again, on its own would have fixed the original
>>>>>>>>>>>>>>>>>>> problem.
>>>>>>> :)
>>>>>>>>>>>>>>>>>>> Did you really intend to increase the real minimum 
>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>> from
>>>>>>>>> 128K
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> 256K ? (on a 64K page system)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Focusing simply on the shared code change to adjust the
>>>>>>> requested
>>>>>>>>>>>>>>>>>>> stacksize by the amount of guard space (if any), this
>>>>>>>>>>>>>>>>>>> does not
>>>>>>> seem
>>>>>>>>>>>>>>>>>>> unreasonable. As you note it is restricted to
>>>>>>>>>>>>>>>>>>> non-JavaThreads
>>>>>>> and
>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>>>>> adds a page to reserved stack space.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> My only query now is whether the minimum stacksize
>>>> detection
>>>>>>>>> logic
>>>>>>>>>>>>>>>>>>> will correctly report the real minimum stack size 
>>>>>>>>>>>>>>>>>>> (taking
>>>>>>>>>>>>>>>>>>> into
>>>>>>>>> account
>>>>>>>>>>>>>>>>>>> the need for the guard page) ?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So I really think this issue should be fixed.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>>>>>>>>>>> Sent: Thursday, November 10, 2016 10:02 PM
>>>>>>>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>> hotspot-
>>>>>>>>>>>>>>>>>> runtime-
>>>>>>>>>>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL
>>>> stack
>>>>>>>>>>>>> guard
>>>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> As per the bug report, this issue was already known
>>>>>>> (6675312)
>>>>>>>>> and
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>> chose not to try and address it due to no reported
>>>>>>>>>>>>>>>>>>>>> issues at
>>>>>>> the
>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>>>> While I see that you have encountered an issue (is it
>>>>>>>>>>>>>>>>>>>>> real or
>>>>>>>>>>>>>>>>>>>>> fabricated?) I think this change is too intrusive 
>>>>>>>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>>>>>>>> applied
>>>>>>> at
>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>> stage of the JDK 9 release cycle, as it will 
>>>>>>>>>>>>>>>>>>>>> change the
>>>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>>>> requirements of every application running on Linux.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 11/11/2016 1:58 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please review this change. I please need a sponsor:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>>>>>>>>>>>>>>> stackFix/webrev.01/
>>>>>>>>>>>>>>>>>>>>>> In the Linux NPTL pthread implementation the 
>>>>>>>>>>>>>>>>>>>>>> guard size
>>>>>>>>>>>>>> mechanism
>>>>>>>>>>>>>>>>>>>>>> is not
>>>>>>>>>>>>>>>>>>>>>> implemented properly. The posix standard requires to
>>>> add
>>>>>>> the
>>>>>>>>>>>>>> size
>>>>>>>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>>>>>>>> guard pages to the stack size, instead Linux 
>>>>>>>>>>>>>>>>>>>>>> takes the
>>>>>>>>>>>>>>>>>>>>>> space
>>>>>>> out
>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>> 'stacksize'.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The Posix standard
>>>>>>>>>>>>>>>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>>>>>>>>>>>>>>>>>>>>> says "the implementation allocates extra memory
>>>>>>>>>>>>>>>>>>>>>> at the
>>>>>>>>>>>>>> overflow
>>>>>>>>>>>>>>>>>>>>>> end of
>>>>>>>>>>>>>>>>>>>>>> the stack". The linux man page
>>>>>>>>>>>>>>>>>>>>>> https://linux.die.net/man/3/pthread_attr_setguardsize 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> says
>>>>>>> "As
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>> glibc
>>>>>>>>>>>>>>>>>>>>>> 2.8, the NPTL threading implementation includes the
>>>> guard
>>>>>>>>> area
>>>>>>>>>>>>>>>> within
>>>>>>>>>>>>>>>>>>>>>> the stack size allocation, rather than allocating
>>>>>>>>>>>>>>>>>>>>>> extra space
>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>> the end
>>>>>>>>>>>>>>>>>>>>>> of the stack, as POSIX.1 requires".
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I encounter this problem in
>>>>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java
>>>>>>>>>>>>>>>>>>>>>> on ppc
>>>>>>>>>>>>>>>>>>>>>> with 64K pages. 
>>>>>>>>>>>>>>>>>>>>>> _compiler_thread_min_stack_allowed is
>>>>>>> 128K
>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>> ppc,
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> ppc specifies two OS guard pages. The VM crashes in
>>>>>>> pthread
>>>>>>>>>>>>>>>> creation
>>>>>>>>>>>>>>>>>>>>>> because there is no usable space in the thread stack
>>>>>>>>>>>>>>>>>>>>>> after
>>>>>>>>>>>>>> allocating
>>>>>>>>>>>>>>>>>>>>>> the guard pages.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> But TooSmallStackSize.java requires that the VM 
>>>>>>>>>>>>>>>>>>>>>> comes
>>>> up
>>>>>>>>> with
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>>>>> size mentioned in the error message.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> This fix adapts the requested stack size on Linux by
>>>>>>>>>>>>>>>>>>>>>> the size
>>>>>>> of
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> guard pages to mimick proper behaviour, see 
>>>>>>>>>>>>>>>>>>>>>> change to
>>>>>>>>>>>>>>>> os_linux.cpp.
>>>>>>>>>>>>>>>>>>>>>> The change also streamlines usage of 
>>>>>>>>>>>>>>>>>>>>>> stack_guard_page
>>>> on
>>>>>>>>>>>>>> linuxppc,
>>>>>>>>>>>>>>>>>>>>>> linuxppcle, aixppc and linuxs390.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> To reproduce the error on linux_x86_64, apply below
>>>> patch
>>>>>>>>> and
>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> VM with -XX:CompilerThreadStackSize=64.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I'm still exploring why I had to choose such big 
>>>>>>>>>>>>>>>>>>>>>> compiler
>>>>>>> stacks
>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>>> ppc
>>>>>>>>>>>>>>>>>>>>>> to get -version passing, but I wanted to send the 
>>>>>>>>>>>>>>>>>>>>>> RFR
>>>>>>>>>>>>>>>>>>>>>> now as
>>>>>>>>>>>>>> people
>>>>>>>>>>>>>>>>>>>>>> obviously looked at the bug I opened (Thanks 
>>>>>>>>>>>>>>>>>>>>>> David!).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> diff -r b7ae012c55c3
>>>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>>>>>>>>>>>>>>>> --- a/src/os_cpu/linux_x86/vm/os_linux_x86.cpp  Mon
>>>> Nov
>>>>>>> 07
>>>>>>>>>>>>>>>> 12:37:28
>>>>>>>>>>>>>>>>>>>>> 2016
>>>>>>>>>>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +++ b/src/os_cpu/linux_x86/vm/os_linux_x86.cpp Thu
>>>> Nov
>>>>>>> 10
>>>>>>>>>>>>>>>>>> 16:52:17
>>>>>>>>>>>>>>>>>>>>> 2016
>>>>>>>>>>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> @@ -701,7 +701,7 @@
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> size_t os::Linux::default_guard_size(os::ThreadType
>>>>>>> thr_type) {
>>>>>>>>>>>>>>>>>>>>>> // Creating guard page is very expensive. Java
>>>>>>>>>>>>>>>>>>>>>> thread
>>>>>>> has
>>>>>>>>>>>>>> HotSpot
>>>>>>>>>>>>>>>>>>>>>> // guard page, only enable glibc guard page for
>>>>>>>>>>>>>>>>>>>>>> non-Java
>>>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>>>>>> - return (thr_type == java_thread ? 0 : 
>>>>>>>>>>>>>>>>>>>>>> page_size());
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +  return (thr_type == java_thread ? 0 : 64*K);
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> // Java thread:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>
>
>



More information about the hotspot-runtime-dev mailing list