RFR(M): 8169373: Work around linux NPTL stack guard error.

David Holmes david.holmes at oracle.com
Thu Dec 1 12:44:07 UTC 2016


Hi Goetz,

On 1/12/2016 8:32 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
> I fixed the comment:
> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.07/

There are multiple occurrences to be fixed - no need for me to see new 
webrev though.

> We run a lot of tests with this change:
> Hotspot jtreg, jck, spec, SAP applications
> On these platforms:
> Windows_x86_64, linux_x86_64, solaris_sparc, mac_x86_64,
> Linux_ppc64, linux_ppc64le, aix_ppc64, linux_s390
> I did not spot a problem there.

That's great to hear - appreciated. But at this stage of the game some 
healthy paranoia is in order so I'd like to see this run through our 
internal testing too.

Thanks,
David

> Best regards,
>   Goetz.
>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Mittwoch, 30. November 2016 22:51
>> To: daniel.daugherty at oracle.com; Lindenmaier, Goetz
>> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net;
>> 'hotspot-runtime-dev at openjdk.java.net' <hotspot-runtime-
>> dev at openjdk.java.net>
>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
>>
>> On 1/12/2016 1:20 AM, Daniel D. Daugherty wrote:
>>>> Would you mind sponsoring this change?
>>>
>>> Yes, I will sponsor this change. However, I would like to get a
>>> thumbs up from David H. on the latest version and I don't see
>>> any review from someone on the Compiler Team.
>>
>> I'm okay with proposed changes - but also want to hear from compiler
>> team and I'd want to see this put through some advance testing before it
>> gets pushed (full rbt run).
>>
>> I have one minor nit in the wording of the fatal error messages "failed
>> with errno" - these methods don't touch errno so I'd prefer it if it
>> said error.
>>
>> Thanks,
>> David
>> -----
>>
>>> Vladimir! We need someone on the Compiler Team to look at these
>>> CompilerThread stack size changes...
>>>
>>> Dan
>>>
>>>
>>> On 11/30/16 12:57 AM, Lindenmaier, Goetz wrote:
>>>> Hi Dan,
>>>>
>>>>> Thumbs up! I don't need to see a new webrev if you choose
>>>>> to fix the minor comments above.
>>>> I anyways did a new one fixing the comments:
>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.06/
>>>> Would you mind sponsoring this change?
>>>>
>>>> I took the minimum stack sizes from my experimental runs where
>>>> I had removed the automatic resizing to find the really needed space.
>>>> If I put something smaller there, I could as well put '0' ... as
>>>> _java_thread_min_stack_allowed =
>> MAX2(_java_thread_min_stack_allowed,
>>>>
>>>> JavaThread::stack_guard_zone_size() +
>>>>
>>>> JavaThread::stack_shadow_zone_size() +
>>>>                                          (4 * BytesPerWord
>>>> COMPILER2_PRESENT(+ 2)) * 4 * K);
>>>> will fix it.
>>>> This code effectively limits the usable stack size to
>>>>     (4 * BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K)
>>>> which makes the initialization of _java_thread_min_stack_allowed with
>>>> platform
>>>> values pointless.
>>>>
>>>> I'll open a new bug for the follow-up stack issue, probably tomorrow.
>>>> I don't feel like addressing testing all the possible error codes, as
>>>> they probably should be fixed in more places, and there is no issue
>>>> pending currently.  Maybe it should be fixed in jdk10 at some point.
>>>>
>>>> Best regards,
>>>>    Goetz
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>> Sent: Dienstag, 29. November 2016 20:04
>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-compiler-
>>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net' <hotspot-
>>>>> runtime-dev at openjdk.java.net>
>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
>>>>>
>>>>> On 11/29/16 2:30 AM, Lindenmaier, Goetz wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> see my replies inline ...
>>>>>> New webrev:
>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.05/
>>>>> src/os/aix/vm/os_aix.cpp
>>>>>       L887:   // libc guard page
>>>>>           nit - You made other existing comments into sentences (leading
>>>>>                 capital and trailing '.'), but not this new comment.
>>>>> Why?
>>>>>
>>>>> src/os/aix/vm/os_aix.hpp
>>>>>       No comments.
>>>>>
>>>>> src/os/linux/vm/os_linux.cpp
>>>>>       L6096: //    |                        |/  1 page glibc guard.
>>>>>           nit - "1 page glibc guard" -> "1 glibc guard page."
>>>>>
>>>>> src/os/posix/vm/os_posix.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp
>>>>>       L875: //    |                        |/  1 page glibc guard.
>>>>>           nit - "1 page glibc guard" -> "1 glibc guard page."
>>>>>
>>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>       No comments.
>>>>>
>>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
>>>>>       No comments.
>>>>>
>>>>>
>>>>> Thumbs up! I don't need to see a new webrev if you choose
>>>>> to fix the minor comments above.
>>>>>
>>>>> Some replies embedded below.
>>>>>
>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>> Sent: Dienstag, 29. November 2016 01:38
>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-
>> compiler-
>>>>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net'
>> <hotspot-
>>>>>>> runtime-dev at openjdk.java.net>
>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>>>> error.
>>>>>>>
>>>>>>> On 11/28/16 2:08 AM, Lindenmaier, Goetz wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm working on a fix for OS guard pages on stacks.  I figured there
>>>>>>>> are VM guard pages (reserved, yellow, red) on the compiler stacks
>>>>>>>> _and_ OS guard pages.  For Java threads, the OS guard pages are left
>>>>>>>> out.  I think this should be done for compiler threads, too. Please
>>>>>>>> confirm.
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>> stackFix/webrev.04/
>>>>>>>>
>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>        L888:   pthread_attr_setguardsize(&attr,
>>>>>>> os::Aix::default_guard_size(thr_type));
>>>>>>>            No check or assert on the return status of this call.
>>>>>>>            Is one needed?
>>>>>> I implemented this as the existing code on linux which has
>>>>>> no check either.  I think a failure is quite theoretical.
>>>>>> Because of your comment below I'll leave it as-is.
>>>>> OK.
>>>>>
>>>>>
>>>>>>>        L3044:   // guard pages, so only enable libc guard pages for
>>>>>>> non-Java threads.
>>>>>>>            src/os/linux/vm/os_linux.cpp also has this comment:
>>>>>>>            // (Remember: compiler thread is a Java thread, too!)
>>>>>> Fixed.
>>>>>>
>>>>>>>        L3051:   return ((thr_type == java_thread || thr_type ==
>>>>>>> compiler_thread) ? 0 : 4*K);
>>>>>>>            nit - please add spaces around the '*' so '4 * K'.'
>>>>>> Fixed.
>>>>>>
>>>>>>> src/os/aix/vm/os_aix.hpp
>>>>>>>        No comments.
>>>>>>>
>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>        L729:   // is not implemented properly. The posix standard
>>>>>>> requires
>>>>>>> to add
>>>>>>>            Typo: 'to add' -> 'adding'
>>>>>> Fixed.
>>>>>>
>>>>>>>        L738:   pthread_attr_setguardsize(&attr,
>>>>>>> os::Linux::default_guard_size(thr_type));
>>>>>>>            No check or assert on the return status of this call.
>>>>>>>            Is one needed?
>>>>>> See above.
>>>>>>
>>>>>>>        L2851:   // Creating guard page is very expensive. Java
>>>>>>> thread has
>>>>>>> HotSpot
>>>>>>>        L2852:   // guard page, only enable glibc guard page for
>>>>>>> non-Java
>>>>>>> threads.
>>>>>>>        L2853:   // (Remember: compiler thread is a java thread, too!)
>>>>>>>            Typo: "java thread" -> "Java thread" (consistency)
>>>>>>>
>>>>>>>            This comment block should be common to all the platforms
>>>>>>> that
>>>>>>>            define default_guard_size(). Yes, I can see that AIX
>>>>>>> needs to
>>>>>>>            add another paragraph, but we should make the core comment
>>>>> common
>>>>>>>            if possible.
>>>>>> I made the first three lines look alike.
>>>>>>
>>>>>>>        L6090: // Java/Compiler thread:
>>>>>>>            Thanks for making this common in os_linux.cpp.
>>>>>>>
>>>>>>>        L6095: //    |    glibc guard page    | - guard, attached Java
>>>>>>> thread usually has
>>>>>>>            Clarity: "guard," -> "guard page,"
>>>>>> Fixed.
>>>>>>
>>>>>>>            Typo: "Java thread" -> "JavaThread" (consistency)
>>>>>> I changed both to Java thread as at the other locations.
>>>>>>
>>>>>>>        L6099: //    |  HotSpot Guard Pages   | - red and yellow pages
>>>>>>>            The fairly recently added reserved page should be mentioned
>>>>>>>            here also?
>>>>>> Yes. Fixed.   Also fixed it to say
>>>>>> JavaThread::stack_reserved_zone_base().
>>>>>> Also fixed comment on bsd.
>>>>> Thanks for also fixing BSD.
>>>>>
>>>>>
>>>>>>>        L6120 // ** P1 (aka bottom) and size ( P2 = P1 - size) are the
>>>>>>> address and stack size returned from
>>>>>>>            Typo: "( P2 = ..." -> "(P2 = ..."
>>>>>> Fixed.
>>>>>>
>>>>>>>        L6148:       fatal("Can not locate current stack attributes!");
>>>>>>>            Typo: "Can not" -> "Cannot"
>>>>>> Fixed.
>>>>>>
>>>>>>>        L6175:   // stack size includes normal stack and HotSpot
>>>>>>> guard pages
>>>>>>>            Perhaps add to the comment:
>>>>>>>            "for the threads that have HotSpot guard pages."
>>>>>> Fixed.  I also checked my comments for "OS guard pages" and
>>>>>> replaced it by "glibc guard pages" which is used in several places
>>>>>> already, same for "VM guard page" --> "HotSpot guard page". I
>>>>>> think this is also more consistent.
>>>>> I agree!
>>>>>
>>>>>
>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>        L1097:   pthread_attr_getstacksize(attr, &stack_size);
>>>>>>>        L1098:   pthread_attr_getguardsize(attr, &guard_size);
>>>>>>>            Do these two calls need to have their result checked?
>>>>>>>
>>>>>>>            Now I'm starting to wonder if all the uses of these
>>>>>>>            two APIs need to be checked? Separate bug?
>>>>>> It would be more consistent with the specification of the methods,
>>>>>> On the other side it's quite unlikely that these fail if attr != NULL.
>>>>> So should we file a new bug? Or does this fall into the realm of
>>>>> other OS/libc code that we call and assume never fails? :-)
>>>>>
>>>>>
>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>        L540: size_t os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>> 512 * K;
>>>>>>>        L541: size_t os::Posix::_java_thread_min_stack_allowed = 512
>>>>>>> * K;
>>>>>>>            So prior to the fix for 8140520,
>>>>>>> src/os/aix/vm/os_aix.cpp had
>>>>>>>            this single min_stack_allowed value:
>>>>>>>
>>>>>>>            L3601:   os::Aix::min_stack_allowed =
>>>>>>> MAX2(os::Aix::min_stack_allowed,
>>>>>>>            L3602: JavaThread::stack_guard_zone_size() +
>>>>>>>            L3603: JavaThread::stack_shadow_zone_size() +
>>>>>>>            L3604:                                     (4*BytesPerWord
>>>>>>> COMPILER2_PRESENT(+2)) * 4 * K);
>>>>>>>
>>>>>>>            and the fix for 8140520 changed that for *NIX platforms to
>>>>>>>            three mins in src/os/posix/vm/os_posix.cpp:
>>>>>>>
>>>>>>>            L1108:   _java_thread_min_stack_allowed =
>>>>>>> MAX2(_java_thread_min_stack_allowed,
>>>>>>>            L1109: JavaThread::stack_guard_zone_size() +
>>>>>>>            L1110: JavaThread::stack_shadow_zone_size() +
>>>>>>>            L1111:                                         (4 *
>>>>>>> BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K);
>>>>>>>
>>>>>>>            L1150:   _compiler_thread_min_stack_allowed =
>>>>>>> align_size_up(_compiler_thread_min_stack_allowed, vm_page_size());
>>>>>>>
>>>>>>>            L1161   _vm_internal_thread_min_stack_allowed =
>>>>>>> align_size_up(_vm_internal_thread_min_stack_allowed,
>> vm_page_size());
>>>>>>>
>>>>>>>            Which means that the compiler_thread no longer benefits
>>>>>>> from
>>>>>>>            the extra space for quard and shadow pages. The thinking in
>>>>>>>            8140520 was that the compiler_thread and
>>>>>>> vm_internal_threads
>>>>>>>            don't need the quard and shadow pages since they don't run
>>>>>>>            Java code (ignoring JVMCI for now).
>>>>>>>
>>>>>>>            So I can see bumping _compiler_thread_min_stack_allowed
>>>>>>> from
>>>>>>>            128 -> 512 as one solution for getting that extra space
>>>>>>> back.
>>>>>>>            However, I don't understand why
>>>>>>> _java_thread_min_stack_allowed
>>>>>>>            has changed from 128 -> 512.
>>>>>> Because it was never correct before.
>>>>> OK. That sounds like the new test that I included with 8140520 would
>>>>> have failed with JavaThread stack sizes even before the product code
>>>>> changes from 8140520 were made.
>>>>>
>>>>> Since the size calculation for JavaThread stack sizes wasn't changed
>>>>> for any platform in 8140520, that tends to indicate that the more
>>>>> limited JDK test (test/tools/launcher/TooSmallStackSize.java) should
>>>>> also have failed before the fix for 8140520.
>>>>>
>>>>> Please clarify the need for the _java_thread_min_stack_allowed change
>>>>> from 128 -> 512. Unless test/tools/launcher/TooSmallStackSize.java
>>>>> is never run in your testing, I'm having troubling seeing why the
>>>>> _java_thread_min_stack_allowed increase is needed.
>>>>>
>>>>>
>>>>>>>            I had previously made this comment:
>>>>>>>            > To put it another way, I'd like to see us add extra
>>>>>>> space to
>>>>>>>            > solve the 64K page issue directly instead of as a side
>>>>>>> effect
>>>>>>>            > of the red/yellow page addition.
>>>>>>>            And Goetz replied with:
>>>>>>>            > I don't understand.  What do you mean by 'directly'?
>>>>>>>
>>>>>>>            So prior to the fix for 8140520,
>>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>>>            had a block like this:
>>>>>>>
>>>>>>>            L4468:   // For 64kbps there will be a 64kb page size,
>>>>>>> which makes
>>>>>>>            L4469:   // the usable default stack size quite a bit less.
>>>>>>> Increase the
>>>>>>>            L4470:   // stack for 64kb (or any > than 8kb) pages, this
>>>>>>> increases
>>>>>>>            L4471:   // virtual memory fragmentation (since we're not
>>>>>>> creating the
>>>>>>>            L4472   // stack on a power of 2 boundary.  The real fix
>>>>>>> for this
>>>>>>>            L4473   // should be to fix the guard page mechanism.
>>>>>>>            L4474
>>>>>>>            L4475   if (vm_page_size() > 8*K) {
>>>>>>>            L4476     threadStackSizeInBytes =
>>>>>>> (threadStackSizeInBytes != 0)
>>>>>>>            L4477        ? threadStackSizeInBytes +
>>>>>>>            L4478          JavaThread::stack_red_zone_size() +
>>>>>>>            L4479          JavaThread::stack_yellow_zone_size()
>>>>>>>            L4480        : 0;
>>>>>>>            L4481     ThreadStackSize = threadStackSizeInBytes/K;
>>>>>>>            L4482   }
>>>>>>>
>>>>>>>            The above is an example of what I mean by solving the 64K
>>>>>>>            page issue directly. In the fix for 8140520, that code
>>>>>>> block
>>>>>>>            was moved to os::Posix::set_minimum_stack_sizes() in
>>>>>>>            src/os/posix/vm/os_posix.cpp and put in a "#ifdef
>>>>>>> SOLARIS...
>>>>>>>            #endif // SOLARIS" block. Coleen filed a bug to determine
>>>>>>>            whether that code can be deleted:
>>>>>>>
>>>>>>>            JDK-8161093 Solaris for >8k pagesize adds extra guard pages
>>>>>>>            https://bugs.openjdk.java.net/browse/JDK-8161093
>>>>>>>
>>>>>>>            but perhaps this bug shows that the code is needed?
>>>>>>>
>>>>>>>
>>>>>>>            OK so this is probably the longest code review comment
>>>>>>>            I have ever written, but the summary is:
>>>>>>>
>>>>>>>            - I understand bumping _compiler_thread_min_stack_allowed,
>>>>>>>              but should it be solved in a different way?
>>>>>>>            - I don't understand bumping _java_thread_min_stack_allowed
>>>>>> I plan to do a follow up change to fix this.  Let's leave this
>>>>>> discussion
>>>>>> to that review.  Here I just want to fix the NPTL issue and the basic
>>>>>> sizing that is needed to pass the new test on ppc/s390.
>>>>> Same question here about the simpler JDK version of the test.
>>>>>
>>>>> Does test/tools/launcher/TooSmallStackSize.java get run in
>>>>> your test environments?
>>>>>
>>>>>
>>>>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
>>>>>>>        No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>        L538: size_t os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>> 384 * K;
>>>>>>>        L539: size_t os::Posix::_java_thread_min_stack_allowed = 384
>>>>>>> * K;
>>>>>>>
>>>>>>>            Same monster comment as
>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>            and the same summary:
>>>>>>>
>>>>>>>            - I understand bumping _compiler_thread_min_stack_allowed,
>>>>>>>              but should it be solved in a different way?
>>>>>>>            - I don't understand bumping _java_thread_min_stack_allowed
>>>>>>>
>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>        L478: size_t os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>> 128 * K;
>>>>>>>        L479: size_t os::Posix::_java_thread_min_stack_allowed = 236
>>>>>>> * K;
>>>>>>>            Bumping _java_thread_min_stack_allowed but not bumping
>>>>>>>            _compiler_thread_min_stack_allowed. I'm confused here.
>>>>>> The numbers are what I need to startup on the machines.  128 is just
>>>>>> fine on the machines we have.  (That's the problem of the
>>>>>> current setup: you have to tune this compile time constant for the
>>>>>> page size of the machine you are running on. But let's discuss this
>>>>>> in the follow up change.)
>>>>> OK about discussing this with a follow-up change. I guess I see
>>>>> the compile time initialization as a "minimum setting assuming the
>>>>> smallest page size". If we discover (at runtime) that the page
>>>>> size is bigger, then we adjust the minimum that we need...
>>>>>
>>>>> Again, defer to another bug. Do we have a bug ID yet?
>>>>>
>>>>>
>>>>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
>>>>>>>        No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>        No comments.
>>>>>>>
>>>>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
>>>>>>>        No comments.
>>>>>>>
>>>>>>> Sorry it took me so long to write this up...
>>>>>> No matter, thanks for this thorough review!
>>>>> You are very welcome. Thanks for being willing to dive into such
>>>>> a complicated area (thread stack sizes)...
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>
>>>>>> Best regards,
>>>>>>     Goetz.
>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>> The change affecting the compier threads is in os_linux.cpp,
>>>>>>> default_guard_size(),
>>>>>>>> where '|| thr_type == compiler_thread' has been added. The function
>>>>>>>> was also moved from the os_cpu files, as it's identical on all cpus.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>      Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>> Sent: Montag, 28. November 2016 00:25
>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>> 'daniel.daugherty at oracle.com' <daniel.daugherty at oracle.com>;
>>>>> 'hotspot-
>>>>>>>>> runtime-dev at openjdk.java.net' <hotspot-runtime-
>> dev at openjdk.java.net>
>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>>>>>> error.
>>>>>>>>>
>>>>>>>>> Hi Goetz,
>>>>>>>>>
>>>>>>>>> On 24/11/2016 10:15 PM, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I now edited the stuff I had proposed below:
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>> stackFix/webrev.03/
>>>>>>>>>> This includes
>>>>>>>>>>      - the NPTL fix from webrev.02
>>>>>>>>> Okay in principle. As discussed this only impacts non-JavaThreads
>>>>>>>>> so the
>>>>>>>>> change should be minimal.
>>>>>>>>>
>>>>>>>>>>      - merging code on linux
>>>>>>>>> Went a bit further than I had expected but if this truly isn't CPU
>>>>>>>>> dependent code then great!
>>>>>>>>>
>>>>>>>>>>      - not adding OS guard to compiler threads.
>>>>>>>>> Okay in principle. IIUC we will now save the OS guard page for
>>>>>>>>> compiler
>>>>>>>>> thread stacks. Is that the only impact? The hotspot-compiler-dev
>>>>>>>>> folk
>>>>>>>>> may want to sign off on this part.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> A few minor comments:
>>>>>>>>>
>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>>
>>>>>>>>> 2854   return ((thr_type == java_thread || thr_type ==
>>>>>>>>> os::compiler_thread) ...
>>>>>>>>>
>>>>>>>>> os:: should be used for both types or none.
>>>>>>>>>
>>>>>>>>> 6153     pthread_attr_getguardsize(&attr, &guard_size);
>>>>>>>>>
>>>>>>>>> Can you at least verify a zero return code in an
>>>>>>>>> assert/assert_status
>>>>>>>>> please.
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>>>
>>>>>>>>> Are the changes to min_stack_allowed just fixing an existing bug?
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I think this should be pushed for this bug ID. For the other
>>>>>>>>>> changes I'll
>>>>>>>>>> make another bug.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>      Goetz.
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Lindenmaier, Goetz
>>>>>>>>>>> Sent: Wednesday, November 23, 2016 8:11 AM
>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>;
>>>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
>> dev at openjdk.java.net
>>>>>>>>>>> Subject: RE: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>> error.
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with
>>>>>>>>>>>> JVMCI.
>>>>> The
>>>>>>>>>>>> ability for a CompilerThread to execute Java code
>>>>>>>>>>>> (can_call_java()) is
>>>>>>>>>>>> now a dynamic property depending on whether the current
>> compiler
>>>>> is
>>>>>>> the
>>>>>>>>>>>> JVMCI compiler.
>>>>>>>>>>> Ah, then I should also leave space for shadow pages in the minimal
>>>>> stack
>>>>>>>>> size
>>>>>>>>>>> of comiler threads.
>>>>>>>>>>>
>>>>>>>>>>> Do we agree on the cleanup and on leaving out the OS guard page
>> on
>>>>>>>>>>> compiler threads?
>>>>>>>>>>> Then I would edit a change comprising the NPTL workaround and
>>>>>>>>>>> these
>>>>>>>>>>> additional changes, and split the other issue into a new bug?
>>>>>>>>>>> I think this
>>>>>>>>>>> will easy the reviewing.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>      Goetz.
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>> Sent: Mittwoch, 23. November 2016 02:50
>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
>> dev at openjdk.java.net
>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>> guard
>>>>>>> error.
>>>>>>>>>>>> On 22/11/2016 11:19 PM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Daniel D. Daugherty
>> [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>>>>> Sent: Dienstag, 22. November 2016 14:01
>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David
>>>>> Holmes
>>>>>>>>>>>>>> <david.holmes at oracle.com>; hotspot-runtime-
>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>>>>>>>>>>>>> guard
>>>>>>>>>>> error.
>>>>>>>>>>>>>> On 11/22/16 3:55 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I ran into a row of issues, some errors on the platforms.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What I meant with that comment is that
>>>>>>>>>>>>>>> os::Linux::min_stack_allowed =
>>>>> MAX2(os::Linux::min_stack_allowed,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JavaThread::stack_shadow_zone_size() +
>>>>>>>>>>>>>>>                                           (4*BytesPerWord
>>>>>>> COMPILER2_PRESENT(+2)) *
>>>>>>>>> 4
>>>>>>>>>>> *
>>>>>>>>>>>> K);
>>>>>>>>>>>>>>> was executed, and min_stack_allowed used for all stacks. Now,
>>>>>>>>>>> compiler
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> vm minimum stack sizes are not increased by these sizes.
>>>>>>>>>>>>>> Now I see what you mean. Thanks for clearing this up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I should have remembered that part of the change because we
>>>>> went
>>>>>>>>>>> back
>>>>>>>>>>>>>> and forth about removing the red/yellow zone pages from the
>>>>>>>>>>>>>> other
>>>>>>>>>>>>>> threads. In particular, we discussed the compiler thread
>>>>>>>>>>>>>> because it
>>>>>>>>>>>>>> is-a JavaThread. Our conclusion was that a compiler thread
>>>>>>>>>>>>>> doesn't
>>>>>>>>>>>>>> execute Java bytecode so we could remove the red/yellow
>>>>>>>>>>>>>> pages...
>>>>>>>>>>>>> Yes, it does not execute java byte code.
>>>>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with
>>>>>>>>>>>> JVMCI.
>>>>> The
>>>>>>>>>>>> ability for a CompilerThread to execute Java code
>>>>>>>>>>>> (can_call_java()) is
>>>>>>>>>>>> now a dynamic property depending on whether the current
>> compiler
>>>>> is
>>>>>>> the
>>>>>>>>>>>> JVMCI compiler.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>> -----
>>>>>>>>>>>>
>>>>>>>>>>>>> Therefore you can remove the shadow pages.  There is no code
>>>>>>>>>>>>> that
>>>>>>>>>>>>> will bang.
>>>>>>>>>>>>> But red/yellow pages are protected right during thread startup.
>>>>>>>>>>>>> Therefore you must have enough space for them.
>>>>>>>>>>>>> On ppc, we try to protect three 64K pages out of the 128K
>>>>>>>>>>>>> compiler
>>>>>>> stack.
>>>>>>>>>>>>> That obviously fails.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Therefore I propose:
>>>>>>>>>>>>>         size_t os::Posix::_java_thread_min_stack_allowed = 48
>>>>>>>>>>>>> * K;   //
>>>>> Set
>>>>>>>>>>>> platform dependent.
>>>>>>>>>>>>> in  os::Posix::set_minimum_stack_sizes():
>>>>>>>>>>>>>      _java_thread_min_stack_allowed =
>>>>> _java_thread_min_stack_allowed
>>>>>>> +
>>>>>>>>>>>>>
>>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>>>>>>>>
>>>>>>>>>>>>> JavaThread::stack_shadow_zone_size();
>>>>>>>>>>>>>
>>>>>>>>>>>>> (Similar for _compiler_thread_min_stack_allowed).
>>>>>>>>>>>>>
>>>>>>>>>>>>> The minimal stack size is made up of three components:
>>>>>>>>>>>>>       1. Sizes of interpreter/C1/C2 frames. Depends on HotSpot
>>>>>>>>>>>> implementation/platform/os.
>>>>>>>>>>>>>       2. Sizes of C++ frames: depends on C++ compiler.
>>>>>>>>>>>>> These are fixed at compile time.
>>>>>>>>>>>>>       3. Sizes of red/yellow/reserved/shadow pages. Depends
>>>>>>>>>>>>> on the
>>>>>>> system
>>>>>>>>>>>> the
>>>>>>>>>>>>>           VM is used on.  This is not fixed at compile time.
>>>>>>>>>>>>> (Our ppc
>>>>>>> machines
>>>>>>>>>>>> differ
>>>>>>>>>>>>>          in page size.)
>>>>>>>>>>>>> Therefore 3. should not be included in a compile time constant.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> And that decision allowed us to be exposed to the 64K page
>>>>>>>>>>>>>> issue
>>>>>>>>>>>>>> because the "extra" space isn't there anymore.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At least the _compiler_thread_min_stack_allowed should be
>>>>>>> increased
>>>>>>>>>>>>>>> similarly by red/yellow zone size.  The compiler thread is
>>>>>>>>>>>>>>> a Java
>>>>>>>>>>>>>>> thread and must have space for these zones.
>>>>>>>>>>>>>> I'm not sure that I completely agree (yet). To me, the
>>>>>>>>>>>>>> red/yellow
>>>>>>>>>>>>>> pages are there for Java thread stack overflow semantics.
>>>>>>>>>>>>>> Yes, the
>>>>>>>>>>>>>> compiler thread needs extra space when 64K pages are used,
>>>>>>>>>>>>>> but I
>>>>>>>>>>>>>> would prefer that we add that space via a different
>>>>>>>>>>>>>> calculation.
>>>>>>>>>>>>> Yes they are. But compiler threads happen tob e a subclass of
>>>>>>>>>>>>> Java threads and use the same run() method that puts the pages
>>>>>>>>>>>>> There.
>>>>>>>>>>>>> I agree that they are not needed for Compiler threads, nor for
>>>>>>>>>>>>> CodeCacheSweeperThreads. I don't really now about
>>>>>>>>>>>>>     JvmtiAgentThreads and ServiceThreads, but all of the get
>>>>>>>>>>>>> the guard
>>>>>>>>>>> pages
>>>>>>>>>>>>> because they are derived from JavaThread.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> To put it another way, I'd like to see us add extra space to
>>>>>>>>>>>>>> solve
>>>>>>>>>>>>>> the 64K page issue directly instead of as a side effect of the
>>>>>>>>>>>>>> red/yellow page addition.
>>>>>>>>>>>>> I don't understand.  What do you mean by 'directly'?
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, the change added a test that is failing now.
>>>>>>>>>>>>>> And that's a "good thing" (TM), right? :-)
>>>>>>>>>>>>> Yes, it showed a bug and thus raised the need to fix it! :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>      Goetz.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Again, thanks for clarifying 8140520's role in this issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>> From: Daniel D. Daugherty
>>>>> [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>>>>>>> Sent: Montag, 21. November 2016 17:28
>>>>>>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>; Lindenmaier,
>>>>> Goetz
>>>>>>>>>>>>>>>> <goetz.lindenmaier at sap.com>; hotspot-runtime-
>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>>>> guard
>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>> Sorry for the delayed responses to this thread. I've been on
>>>>>>> vacation...
>>>>>>>>>>>>>>>> One comment/query embedded below...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11/10/16 8:40 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 11/11/2016 8:00 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This issue is different to 6675312, see also my comment
>>>>>>>>>>>>>>>>>> in the
>>>>>>> bug.
>>>>>>>>>>>>>>>>>> It appears running jtreg test
>>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java,
>>>>>>>>>>>>>>>>>> with my patch below you can reproduce it on linuxx86_64.
>>>>>>>>>>>>>>>>>> You
>>>>>>> can
>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>> do that with 6675312. Also, I would assume there are
>>>>>>>>>>>>>>>>>> systems
>>>>> out
>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>> on x86 that uses 64-K pages, did you run the tests on
>>>>>>>>>>>>>>>>>> these? I
>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>> assume you get hard crashes with stack overflows in the
>>>>>>>>>>>>>>>>>> first
>>>>> C2
>>>>>>>>>>>>>>>>>> compilation if there is only 64K usable
>>>>>>>>>>>>>>>>>> CompilerThreadStack.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My fix does not affect Java threads, which are the largest
>>>>> amount
>>>>>>>>>>>>>>>>>> of threads used by the VM. It affects only the non-Java
>>>>>>>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>> It adds one page to these threads. The page does not
>>>>>>>>>>>>>>>>>> require
>>>>>>>>>>>> memory,
>>>>>>>>>>>>>>>>>> as it's protected. The stack will only require more
>>>>>>>>>>>>>>>>>> space if the
>>>>>>>>>>> thread
>>>>>>>>>>>>>>>>>> ran into a stack overflow before the fix as else the
>>>>>>>>>>>>>>>>>> pages are
>>>>> not
>>>>>>>>>>>>>>>>>> mapped.
>>>>>>>>>>>>>>>>>> This are stack overflows that cause hard crashes, at
>>>>>>>>>>>>>>>>>> least on
>>>>> ppc
>>>>>>>>>>> the
>>>>>>>>>>>> VM
>>>>>>>>>>>>>>>>>> does not properly catch these stack overflows, so any setup
>>>>>>>>>>> working
>>>>>>>>>>>>>> now
>>>>>>>>>>>>>>>>>> will not run into the additional space. Altogether there
>>>>>>>>>>>>>>>>>> should
>>>>> be
>>>>>>>>>>> no
>>>>>>>>>>>>>>>>>> effect on running systems besides requiring one more
>>>>>>>>>>>>>>>>>> entry in
>>>>>>> the
>>>>>>>>>>>>>>>>>> page table per non-Java thread.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The problem is caused by a rather recent change
>> (8140520:
>>>>>>>>>>> segfault
>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>> solaris-amd64
>>>>>>>>>>>>>>>>>> with "-XX:VMThreadStackSize=1" option) which was pushed
>>>>> after
>>>>>>>>>>>>>>>>>> feature-close. As this was a rather recent change, it
>>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>>> possible to
>>>>>>>>>>>>>>>>>> fix this follow up issue. What else is this period in
>>>>>>>>>>>>>>>>>> the project
>>>>>>> good
>>>>>>>>>>>>>>>>>> for if not fixing issues?
>>>>>>>>>>>>>>>>> So I am seeing a number of factors here.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> First, 8140520, set:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> size_t os::Posix::_compiler_thread_min_stack_allowed = 128
>> *
>>>>> K;
>>>>>>>>>>>>>>>> So I'm confused by the above comment:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>      > The problem is caused by a rather recent change
>>>>>>>>>>>>>>>> (8140520:
>>>>>>>>>>> segfault
>>>>>>>>>>>>>>>>      > on solaris-amd64 with "-XX:VMThreadStackSize=1"
>>>>>>>>>>>>>>>> option)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8140520-webrev/5-jdk9-
>> hs-
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>> open/hotspot/src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp.frames.html
>>>>>>>>>>>>>>>> shows this change:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @@ -531,19 +531,17 @@
>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>> //////////////////////////////////////////////////////////////////////////////
>>>>>
>>>>>>>>>>>>>>>> //
>>>>>>>>>>>>>>>>       // thread stack
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -size_t os::Linux::min_stack_allowed = 128*K;
>>>>>>>>>>>>>>>> +size_t os::Posix::_compiler_thread_min_stack_allowed = 128
>> *
>>>>> K;
>>>>>>>>>>>>>>>> +size_t os::Posix::_java_thread_min_stack_allowed = 128 * K;
>>>>>>>>>>>>>>>> +size_t os::Posix::_vm_internal_thread_min_stack_allowed =
>>>>>>>>>>>>>>>> 128
>>>>> *
>>>>>>> K;
>>>>>>>>>>>>>>>> so the existing single variable of 'min_stack_allowed' was
>>>>>>>>>>>>>>>> replaced by three variables:
>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>>>>>>>>>> _java_thread_min_stack_allowed, and
>>>>>>>>>>>>>>>> _vm_internal_thread_min_stack_allowed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The old single variable and the three new variables are all
>>>>>>>>>>>>>>>> initialized to the same value (128K) so the fix for 8140520
>>>>>>>>>>>>>>>> did not change stack sizes for this platform. In fact, only
>>>>>>>>>>>>>>>> one platform had a size change (Solaris X64).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So I'm confused about how the fix for 8140520 caused this
>>>>> problem.
>>>>>>>>>>>>>>>> Based on David's analysis below, it looks to me like this
>>>>>>>>>>>>>>>> 64K stack
>>>>>>>>>>>>>>>> guard page problem should also exist prior to the fix for
>>>>> 8140520.
>>>>>>>>>>>>>>>> Goetz, can you please explain how 8140520 caused this
>>>>>>>>>>>>>>>> problem?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Second on linux PPC it is hardwired to use 2 guard pages:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>      return 2 * page_size();
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Third, you had a pagesize of 64K.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fourth, NPTL takes the guard space from the stack space -
>>>>> hence
>>>>>>>>>>> with
>>>>>>>>>>>> 2
>>>>>>>>>>>>>>>>> x 64K guard, and a 128K stack it was all consumed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In the proposed changes you now only use page_size() for
>> the
>>>>>>> guard,
>>>>>>>>>>>> so
>>>>>>>>>>>>>>>>> that alone would have fixed the observed problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> But in addition you want to address the NPTL problem by
>>>>>>>>>>>>>>>>> adding
>>>>>>>>>>> back
>>>>>>>>>>>>>>>>> the guard space to the stack size requested. That alone
>>>>>>>>>>>>>>>>> would
>>>>> also
>>>>>>>>>>>>>>>>> have fixed the observed problem. :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> But in addition you have increased the minimum stack size:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ! size_t os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>>>>>>>>>>> 192 *
>>>>> K;
>>>>>>>>>>>>>>>>> which again, on its own would have fixed the original
>>>>>>>>>>>>>>>>> problem.
>>>>> :)
>>>>>>>>>>>>>>>>> Did you really intend to increase the real minimum stack
>>>>>>>>>>>>>>>>> from
>>>>>>> 128K
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> 256K ? (on a 64K page system)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Focusing simply on the shared code change to adjust the
>>>>> requested
>>>>>>>>>>>>>>>>> stacksize by the amount of guard space (if any), this
>>>>>>>>>>>>>>>>> does not
>>>>> seem
>>>>>>>>>>>>>>>>> unreasonable. As you note it is restricted to
>>>>>>>>>>>>>>>>> non-JavaThreads
>>>>> and
>>>>>>>>>>> only
>>>>>>>>>>>>>>>>> adds a page to reserved stack space.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My only query now is whether the minimum stacksize
>> detection
>>>>>>> logic
>>>>>>>>>>>>>>>>> will correctly report the real minimum stack size (taking
>>>>>>>>>>>>>>>>> into
>>>>>>> account
>>>>>>>>>>>>>>>>> the need for the guard page) ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So I really think this issue should be fixed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>>>>>>>>> Sent: Thursday, November 10, 2016 10:02 PM
>>>>>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>> hotspot-
>>>>>>>>>>>>>>>> runtime-
>>>>>>>>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL
>> stack
>>>>>>>>>>> guard
>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As per the bug report, this issue was already known
>>>>> (6675312)
>>>>>>> and
>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>> chose not to try and address it due to no reported
>>>>>>>>>>>>>>>>>>> issues at
>>>>> the
>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>> While I see that you have encountered an issue (is it
>>>>>>>>>>>>>>>>>>> real or
>>>>>>>>>>>>>>>>>>> fabricated?) I think this change is too intrusive to be
>>>>>>>>>>>>>>>>>>> applied
>>>>> at
>>>>>>> this
>>>>>>>>>>>>>>>>>>> stage of the JDK 9 release cycle, as it will change the
>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>> requirements of every application running on Linux.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 11/11/2016 1:58 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please review this change. I please need a sponsor:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>>>>>>>>>>>>> stackFix/webrev.01/
>>>>>>>>>>>>>>>>>>>> In the Linux NPTL pthread implementation the guard size
>>>>>>>>>>>> mechanism
>>>>>>>>>>>>>>>>>>>> is not
>>>>>>>>>>>>>>>>>>>> implemented properly. The posix standard requires to
>> add
>>>>> the
>>>>>>>>>>>> size
>>>>>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>>>>>> guard pages to the stack size, instead Linux takes the
>>>>>>>>>>>>>>>>>>>> space
>>>>> out
>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>> 'stacksize'.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The Posix standard
>>>>>>>>>>>>>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>>>>>>>>>>>>>>>>>>>      says "the implementation allocates extra memory
>>>>>>>>>>>>>>>>>>>> at the
>>>>>>>>>>>> overflow
>>>>>>>>>>>>>>>>>>>> end of
>>>>>>>>>>>>>>>>>>>> the stack". The linux man page
>>>>>>>>>>>>>>>>>>>> https://linux.die.net/man/3/pthread_attr_setguardsize
>>>>>>>>>>>>>>>>>>>> says
>>>>> "As
>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>> glibc
>>>>>>>>>>>>>>>>>>>> 2.8, the NPTL threading implementation includes the
>> guard
>>>>>>> area
>>>>>>>>>>>>>> within
>>>>>>>>>>>>>>>>>>>> the stack size allocation, rather than allocating
>>>>>>>>>>>>>>>>>>>> extra space
>>>>> at
>>>>>>>>>>>>>>>>>>>> the end
>>>>>>>>>>>>>>>>>>>> of the stack, as POSIX.1 requires".
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I encounter this problem in
>>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java
>>>>>>>>>>>>>>>>>>>> on ppc
>>>>>>>>>>>>>>>>>>>> with 64K pages. _compiler_thread_min_stack_allowed is
>>>>> 128K
>>>>>>>>>>> on
>>>>>>>>>>>>>> ppc,
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> ppc specifies two OS guard pages. The VM crashes in
>>>>> pthread
>>>>>>>>>>>>>> creation
>>>>>>>>>>>>>>>>>>>> because there is no usable space in the thread stack
>>>>>>>>>>>>>>>>>>>> after
>>>>>>>>>>>> allocating
>>>>>>>>>>>>>>>>>>>> the guard pages.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> But TooSmallStackSize.java requires that the VM comes
>> up
>>>>>>> with
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>>> size mentioned in the error message.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This fix adapts the requested stack size on Linux by
>>>>>>>>>>>>>>>>>>>> the size
>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> guard pages to mimick proper behaviour, see change to
>>>>>>>>>>>>>> os_linux.cpp.
>>>>>>>>>>>>>>>>>>>> The change also streamlines usage of stack_guard_page
>> on
>>>>>>>>>>>> linuxppc,
>>>>>>>>>>>>>>>>>>>> linuxppcle, aixppc and linuxs390.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> To reproduce the error on linux_x86_64, apply below
>> patch
>>>>>>> and
>>>>>>>>>>>> call
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> VM with -XX:CompilerThreadStackSize=64.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'm still exploring why I had to choose such big compiler
>>>>> stacks
>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>> ppc
>>>>>>>>>>>>>>>>>>>> to get -version passing, but I wanted to send the RFR
>>>>>>>>>>>>>>>>>>>> now as
>>>>>>>>>>>> people
>>>>>>>>>>>>>>>>>>>> obviously looked at the bug I opened (Thanks David!).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> diff -r b7ae012c55c3
>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>>>>>>>>>>>>>> --- a/src/os_cpu/linux_x86/vm/os_linux_x86.cpp  Mon
>> Nov
>>>>> 07
>>>>>>>>>>>>>> 12:37:28
>>>>>>>>>>>>>>>>>>> 2016
>>>>>>>>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +++ b/src/os_cpu/linux_x86/vm/os_linux_x86.cpp  Thu
>> Nov
>>>>> 10
>>>>>>>>>>>>>>>> 16:52:17
>>>>>>>>>>>>>>>>>>> 2016
>>>>>>>>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> @@ -701,7 +701,7 @@
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> size_t os::Linux::default_guard_size(os::ThreadType
>>>>> thr_type) {
>>>>>>>>>>>>>>>>>>>>        // Creating guard page is very expensive. Java
>>>>>>>>>>>>>>>>>>>> thread
>>>>> has
>>>>>>>>>>>> HotSpot
>>>>>>>>>>>>>>>>>>>>        // guard page, only enable glibc guard page for
>>>>>>>>>>>>>>>>>>>> non-Java
>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>>>> -  return (thr_type == java_thread ? 0 : page_size());
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +  return (thr_type == java_thread ? 0 : 64*K);
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> // Java thread:
>>>>>>>>>>>>>>>>>>>>
>>>


More information about the hotspot-compiler-dev mailing list