RFR(M): 8169373: Work around linux NPTL stack guard error.
Daniel D. Daugherty
daniel.daugherty at oracle.com
Fri Dec 2 21:04:18 UTC 2016
Vladimir, Thanks for the review!
OK, the ball is in my court (as sponsor) and I need to figure out what
kind of RBT testing David H wants to see on the patch before I push it...
It's the weekend already for David so I'm guessing he's out mowing the
lawn... :-)
Dan
On 12/2/16 11:12 AM, Vladimir Kozlov wrote:
> I read through whole tread and you guys did good job with review :)
>
> I agree with changes (and keeping guard pages for compiler threads).
>
> Thanks,
> Vladimir
>
> On 12/1/16 2:32 AM, Lindenmaier, Goetz wrote:
>> Hi David,
>>
>> I fixed the comment:
>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.07/
>>
>> We run a lot of tests with this change:
>> Hotspot jtreg, jck, spec, SAP applications
>> On these platforms:
>> Windows_x86_64, linux_x86_64, solaris_sparc, mac_x86_64,
>> Linux_ppc64, linux_ppc64le, aix_ppc64, linux_s390
>> I did not spot a problem there.
>>
>> Best regards,
>> Goetz.
>>
>>> -----Original Message-----
>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>> Sent: Mittwoch, 30. November 2016 22:51
>>> To: daniel.daugherty at oracle.com; Lindenmaier, Goetz
>>> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net;
>>> 'hotspot-runtime-dev at openjdk.java.net' <hotspot-runtime-
>>> dev at openjdk.java.net>
>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
>>>
>>> On 1/12/2016 1:20 AM, Daniel D. Daugherty wrote:
>>>>> Would you mind sponsoring this change?
>>>>
>>>> Yes, I will sponsor this change. However, I would like to get a
>>>> thumbs up from David H. on the latest version and I don't see
>>>> any review from someone on the Compiler Team.
>>>
>>> I'm okay with proposed changes - but also want to hear from compiler
>>> team and I'd want to see this put through some advance testing
>>> before it
>>> gets pushed (full rbt run).
>>>
>>> I have one minor nit in the wording of the fatal error messages "failed
>>> with errno" - these methods don't touch errno so I'd prefer it if it
>>> said error.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Vladimir! We need someone on the Compiler Team to look at these
>>>> CompilerThread stack size changes...
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 11/30/16 12:57 AM, Lindenmaier, Goetz wrote:
>>>>> Hi Dan,
>>>>>
>>>>>> Thumbs up! I don't need to see a new webrev if you choose
>>>>>> to fix the minor comments above.
>>>>> I anyways did a new one fixing the comments:
>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.06/
>>>>>
>>>>> Would you mind sponsoring this change?
>>>>>
>>>>> I took the minimum stack sizes from my experimental runs where
>>>>> I had removed the automatic resizing to find the really needed space.
>>>>> If I put something smaller there, I could as well put '0' ... as
>>>>> _java_thread_min_stack_allowed =
>>> MAX2(_java_thread_min_stack_allowed,
>>>>>
>>>>> JavaThread::stack_guard_zone_size() +
>>>>>
>>>>> JavaThread::stack_shadow_zone_size() +
>>>>> (4 * BytesPerWord
>>>>> COMPILER2_PRESENT(+ 2)) * 4 * K);
>>>>> will fix it.
>>>>> This code effectively limits the usable stack size to
>>>>> (4 * BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K)
>>>>> which makes the initialization of _java_thread_min_stack_allowed with
>>>>> platform
>>>>> values pointless.
>>>>>
>>>>> I'll open a new bug for the follow-up stack issue, probably tomorrow.
>>>>> I don't feel like addressing testing all the possible error codes, as
>>>>> they probably should be fixed in more places, and there is no issue
>>>>> pending currently. Maybe it should be fixed in jdk10 at some point.
>>>>>
>>>>> Best regards,
>>>>> Goetz
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>> Sent: Dienstag, 29. November 2016 20:04
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>> hotspot-compiler-
>>>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net'
>>>>>> <hotspot-
>>>>>> runtime-dev at openjdk.java.net>
>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>>> error.
>>>>>>
>>>>>> On 11/29/16 2:30 AM, Lindenmaier, Goetz wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> see my replies inline ...
>>>>>>> New webrev:
>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.05/
>>>>>>>
>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>> L887: // libc guard page
>>>>>> nit - You made other existing comments into sentences
>>>>>> (leading
>>>>>> capital and trailing '.'), but not this new comment.
>>>>>> Why?
>>>>>>
>>>>>> src/os/aix/vm/os_aix.hpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>> L6096: // | |/ 1 page glibc guard.
>>>>>> nit - "1 page glibc guard" -> "1 glibc guard page."
>>>>>>
>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp
>>>>>> L875: // | |/ 1 page glibc guard.
>>>>>> nit - "1 page glibc guard" -> "1 glibc guard page."
>>>>>>
>>>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>> No comments.
>>>>>>
>>>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
>>>>>> No comments.
>>>>>>
>>>>>>
>>>>>> Thumbs up! I don't need to see a new webrev if you choose
>>>>>> to fix the minor comments above.
>>>>>>
>>>>>> Some replies embedded below.
>>>>>>
>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
>>>>>>>> Sent: Dienstag, 29. November 2016 01:38
>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-
>>> compiler-
>>>>>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net'
>>> <hotspot-
>>>>>>>> runtime-dev at openjdk.java.net>
>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>>>>> error.
>>>>>>>>
>>>>>>>> On 11/28/16 2:08 AM, Lindenmaier, Goetz wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm working on a fix for OS guard pages on stacks. I figured
>>>>>>>>> there
>>>>>>>>> are VM guard pages (reserved, yellow, red) on the compiler stacks
>>>>>>>>> _and_ OS guard pages. For Java threads, the OS guard pages
>>>>>>>>> are left
>>>>>>>>> out. I think this should be done for compiler threads, too.
>>>>>>>>> Please
>>>>>>>>> confirm.
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>> stackFix/webrev.04/
>>>>>>>>>
>>>>>>>> src/os/aix/vm/os_aix.cpp
>>>>>>>> L888: pthread_attr_setguardsize(&attr,
>>>>>>>> os::Aix::default_guard_size(thr_type));
>>>>>>>> No check or assert on the return status of this call.
>>>>>>>> Is one needed?
>>>>>>> I implemented this as the existing code on linux which has
>>>>>>> no check either. I think a failure is quite theoretical.
>>>>>>> Because of your comment below I'll leave it as-is.
>>>>>> OK.
>>>>>>
>>>>>>
>>>>>>>> L3044: // guard pages, so only enable libc guard pages
>>>>>>>> for
>>>>>>>> non-Java threads.
>>>>>>>> src/os/linux/vm/os_linux.cpp also has this comment:
>>>>>>>> // (Remember: compiler thread is a Java thread, too!)
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> L3051: return ((thr_type == java_thread || thr_type ==
>>>>>>>> compiler_thread) ? 0 : 4*K);
>>>>>>>> nit - please add spaces around the '*' so '4 * K'.'
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> src/os/aix/vm/os_aix.hpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>> L729: // is not implemented properly. The posix standard
>>>>>>>> requires
>>>>>>>> to add
>>>>>>>> Typo: 'to add' -> 'adding'
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> L738: pthread_attr_setguardsize(&attr,
>>>>>>>> os::Linux::default_guard_size(thr_type));
>>>>>>>> No check or assert on the return status of this call.
>>>>>>>> Is one needed?
>>>>>>> See above.
>>>>>>>
>>>>>>>> L2851: // Creating guard page is very expensive. Java
>>>>>>>> thread has
>>>>>>>> HotSpot
>>>>>>>> L2852: // guard page, only enable glibc guard page for
>>>>>>>> non-Java
>>>>>>>> threads.
>>>>>>>> L2853: // (Remember: compiler thread is a java thread,
>>>>>>>> too!)
>>>>>>>> Typo: "java thread" -> "Java thread" (consistency)
>>>>>>>>
>>>>>>>> This comment block should be common to all the
>>>>>>>> platforms
>>>>>>>> that
>>>>>>>> define default_guard_size(). Yes, I can see that AIX
>>>>>>>> needs to
>>>>>>>> add another paragraph, but we should make the core
>>>>>>>> comment
>>>>>> common
>>>>>>>> if possible.
>>>>>>> I made the first three lines look alike.
>>>>>>>
>>>>>>>> L6090: // Java/Compiler thread:
>>>>>>>> Thanks for making this common in os_linux.cpp.
>>>>>>>>
>>>>>>>> L6095: // | glibc guard page | - guard,
>>>>>>>> attached Java
>>>>>>>> thread usually has
>>>>>>>> Clarity: "guard," -> "guard page,"
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> Typo: "Java thread" -> "JavaThread" (consistency)
>>>>>>> I changed both to Java thread as at the other locations.
>>>>>>>
>>>>>>>> L6099: // | HotSpot Guard Pages | - red and yellow
>>>>>>>> pages
>>>>>>>> The fairly recently added reserved page should be
>>>>>>>> mentioned
>>>>>>>> here also?
>>>>>>> Yes. Fixed. Also fixed it to say
>>>>>>> JavaThread::stack_reserved_zone_base().
>>>>>>> Also fixed comment on bsd.
>>>>>> Thanks for also fixing BSD.
>>>>>>
>>>>>>
>>>>>>>> L6120 // ** P1 (aka bottom) and size ( P2 = P1 - size)
>>>>>>>> are the
>>>>>>>> address and stack size returned from
>>>>>>>> Typo: "( P2 = ..." -> "(P2 = ..."
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> L6148: fatal("Can not locate current stack
>>>>>>>> attributes!");
>>>>>>>> Typo: "Can not" -> "Cannot"
>>>>>>> Fixed.
>>>>>>>
>>>>>>>> L6175: // stack size includes normal stack and HotSpot
>>>>>>>> guard pages
>>>>>>>> Perhaps add to the comment:
>>>>>>>> "for the threads that have HotSpot guard pages."
>>>>>>> Fixed. I also checked my comments for "OS guard pages" and
>>>>>>> replaced it by "glibc guard pages" which is used in several places
>>>>>>> already, same for "VM guard page" --> "HotSpot guard page". I
>>>>>>> think this is also more consistent.
>>>>>> I agree!
>>>>>>
>>>>>>
>>>>>>>> src/os/posix/vm/os_posix.cpp
>>>>>>>> L1097: pthread_attr_getstacksize(attr, &stack_size);
>>>>>>>> L1098: pthread_attr_getguardsize(attr, &guard_size);
>>>>>>>> Do these two calls need to have their result checked?
>>>>>>>>
>>>>>>>> Now I'm starting to wonder if all the uses of these
>>>>>>>> two APIs need to be checked? Separate bug?
>>>>>>> It would be more consistent with the specification of the methods,
>>>>>>> On the other side it's quite unlikely that these fail if attr !=
>>>>>>> NULL.
>>>>>> So should we file a new bug? Or does this fall into the realm of
>>>>>> other OS/libc code that we call and assume never fails? :-)
>>>>>>
>>>>>>
>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>> L540: size_t
>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>> 512 * K;
>>>>>>>> L541: size_t os::Posix::_java_thread_min_stack_allowed =
>>>>>>>> 512
>>>>>>>> * K;
>>>>>>>> So prior to the fix for 8140520,
>>>>>>>> src/os/aix/vm/os_aix.cpp had
>>>>>>>> this single min_stack_allowed value:
>>>>>>>>
>>>>>>>> L3601: os::Aix::min_stack_allowed =
>>>>>>>> MAX2(os::Aix::min_stack_allowed,
>>>>>>>> L3602: JavaThread::stack_guard_zone_size() +
>>>>>>>> L3603: JavaThread::stack_shadow_zone_size() +
>>>>>>>> L3604: (4*BytesPerWord
>>>>>>>> COMPILER2_PRESENT(+2)) * 4 * K);
>>>>>>>>
>>>>>>>> and the fix for 8140520 changed that for *NIX
>>>>>>>> platforms to
>>>>>>>> three mins in src/os/posix/vm/os_posix.cpp:
>>>>>>>>
>>>>>>>> L1108: _java_thread_min_stack_allowed =
>>>>>>>> MAX2(_java_thread_min_stack_allowed,
>>>>>>>> L1109: JavaThread::stack_guard_zone_size() +
>>>>>>>> L1110: JavaThread::stack_shadow_zone_size() +
>>>>>>>> L1111: (4 *
>>>>>>>> BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K);
>>>>>>>>
>>>>>>>> L1150: _compiler_thread_min_stack_allowed =
>>>>>>>> align_size_up(_compiler_thread_min_stack_allowed, vm_page_size());
>>>>>>>>
>>>>>>>> L1161 _vm_internal_thread_min_stack_allowed =
>>>>>>>> align_size_up(_vm_internal_thread_min_stack_allowed,
>>> vm_page_size());
>>>>>>>>
>>>>>>>> Which means that the compiler_thread no longer benefits
>>>>>>>> from
>>>>>>>> the extra space for quard and shadow pages. The
>>>>>>>> thinking in
>>>>>>>> 8140520 was that the compiler_thread and
>>>>>>>> vm_internal_threads
>>>>>>>> don't need the quard and shadow pages since they
>>>>>>>> don't run
>>>>>>>> Java code (ignoring JVMCI for now).
>>>>>>>>
>>>>>>>> So I can see bumping _compiler_thread_min_stack_allowed
>>>>>>>> from
>>>>>>>> 128 -> 512 as one solution for getting that extra space
>>>>>>>> back.
>>>>>>>> However, I don't understand why
>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>> has changed from 128 -> 512.
>>>>>>> Because it was never correct before.
>>>>>> OK. That sounds like the new test that I included with 8140520 would
>>>>>> have failed with JavaThread stack sizes even before the product code
>>>>>> changes from 8140520 were made.
>>>>>>
>>>>>> Since the size calculation for JavaThread stack sizes wasn't changed
>>>>>> for any platform in 8140520, that tends to indicate that the more
>>>>>> limited JDK test (test/tools/launcher/TooSmallStackSize.java) should
>>>>>> also have failed before the fix for 8140520.
>>>>>>
>>>>>> Please clarify the need for the _java_thread_min_stack_allowed
>>>>>> change
>>>>>> from 128 -> 512. Unless test/tools/launcher/TooSmallStackSize.java
>>>>>> is never run in your testing, I'm having troubling seeing why the
>>>>>> _java_thread_min_stack_allowed increase is needed.
>>>>>>
>>>>>>
>>>>>>>> I had previously made this comment:
>>>>>>>> > To put it another way, I'd like to see us add extra
>>>>>>>> space to
>>>>>>>> > solve the 64K page issue directly instead of as a
>>>>>>>> side
>>>>>>>> effect
>>>>>>>> > of the red/yellow page addition.
>>>>>>>> And Goetz replied with:
>>>>>>>> > I don't understand. What do you mean by 'directly'?
>>>>>>>>
>>>>>>>> So prior to the fix for 8140520,
>>>>>>>> src/os/solaris/vm/os_solaris.cpp
>>>>>>>> had a block like this:
>>>>>>>>
>>>>>>>> L4468: // For 64kbps there will be a 64kb page size,
>>>>>>>> which makes
>>>>>>>> L4469: // the usable default stack size quite a
>>>>>>>> bit less.
>>>>>>>> Increase the
>>>>>>>> L4470: // stack for 64kb (or any > than 8kb)
>>>>>>>> pages, this
>>>>>>>> increases
>>>>>>>> L4471: // virtual memory fragmentation (since
>>>>>>>> we're not
>>>>>>>> creating the
>>>>>>>> L4472 // stack on a power of 2 boundary. The real
>>>>>>>> fix
>>>>>>>> for this
>>>>>>>> L4473 // should be to fix the guard page mechanism.
>>>>>>>> L4474
>>>>>>>> L4475 if (vm_page_size() > 8*K) {
>>>>>>>> L4476 threadStackSizeInBytes =
>>>>>>>> (threadStackSizeInBytes != 0)
>>>>>>>> L4477 ? threadStackSizeInBytes +
>>>>>>>> L4478 JavaThread::stack_red_zone_size() +
>>>>>>>> L4479 JavaThread::stack_yellow_zone_size()
>>>>>>>> L4480 : 0;
>>>>>>>> L4481 ThreadStackSize = threadStackSizeInBytes/K;
>>>>>>>> L4482 }
>>>>>>>>
>>>>>>>> The above is an example of what I mean by solving
>>>>>>>> the 64K
>>>>>>>> page issue directly. In the fix for 8140520, that code
>>>>>>>> block
>>>>>>>> was moved to os::Posix::set_minimum_stack_sizes() in
>>>>>>>> src/os/posix/vm/os_posix.cpp and put in a "#ifdef
>>>>>>>> SOLARIS...
>>>>>>>> #endif // SOLARIS" block. Coleen filed a bug to
>>>>>>>> determine
>>>>>>>> whether that code can be deleted:
>>>>>>>>
>>>>>>>> JDK-8161093 Solaris for >8k pagesize adds extra
>>>>>>>> guard pages
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8161093
>>>>>>>>
>>>>>>>> but perhaps this bug shows that the code is needed?
>>>>>>>>
>>>>>>>>
>>>>>>>> OK so this is probably the longest code review comment
>>>>>>>> I have ever written, but the summary is:
>>>>>>>>
>>>>>>>> - I understand bumping
>>>>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>> but should it be solved in a different way?
>>>>>>>> - I don't understand bumping
>>>>>>>> _java_thread_min_stack_allowed
>>>>>>> I plan to do a follow up change to fix this. Let's leave this
>>>>>>> discussion
>>>>>>> to that review. Here I just want to fix the NPTL issue and the
>>>>>>> basic
>>>>>>> sizing that is needed to pass the new test on ppc/s390.
>>>>>> Same question here about the simpler JDK version of the test.
>>>>>>
>>>>>> Does test/tools/launcher/TooSmallStackSize.java get run in
>>>>>> your test environments?
>>>>>>
>>>>>>
>>>>>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>> L538: size_t
>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>> 384 * K;
>>>>>>>> L539: size_t os::Posix::_java_thread_min_stack_allowed =
>>>>>>>> 384
>>>>>>>> * K;
>>>>>>>>
>>>>>>>> Same monster comment as
>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>> and the same summary:
>>>>>>>>
>>>>>>>> - I understand bumping
>>>>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>> but should it be solved in a different way?
>>>>>>>> - I don't understand bumping
>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>
>>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>> L478: size_t
>>>>>>>> os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>> 128 * K;
>>>>>>>> L479: size_t os::Posix::_java_thread_min_stack_allowed =
>>>>>>>> 236
>>>>>>>> * K;
>>>>>>>> Bumping _java_thread_min_stack_allowed but not bumping
>>>>>>>> _compiler_thread_min_stack_allowed. I'm confused here.
>>>>>>> The numbers are what I need to startup on the machines. 128 is
>>>>>>> just
>>>>>>> fine on the machines we have. (That's the problem of the
>>>>>>> current setup: you have to tune this compile time constant for the
>>>>>>> page size of the machine you are running on. But let's discuss this
>>>>>>> in the follow up change.)
>>>>>> OK about discussing this with a follow-up change. I guess I see
>>>>>> the compile time initialization as a "minimum setting assuming the
>>>>>> smallest page size". If we discover (at runtime) that the page
>>>>>> size is bigger, then we adjust the minimum that we need...
>>>>>>
>>>>>> Again, defer to another bug. Do we have a bug ID yet?
>>>>>>
>>>>>>
>>>>>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
>>>>>>>> No comments.
>>>>>>>>
>>>>>>>> Sorry it took me so long to write this up...
>>>>>>> No matter, thanks for this thorough review!
>>>>>> You are very welcome. Thanks for being willing to dive into such
>>>>>> a complicated area (thread stack sizes)...
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Best regards,
>>>>>>> Goetz.
>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>> The change affecting the compier threads is in os_linux.cpp,
>>>>>>>> default_guard_size(),
>>>>>>>>> where '|| thr_type == compiler_thread' has been added. The
>>>>>>>>> function
>>>>>>>>> was also moved from the os_cpu files, as it's identical on all
>>>>>>>>> cpus.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Goetz.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>> Sent: Montag, 28. November 2016 00:25
>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>> 'daniel.daugherty at oracle.com' <daniel.daugherty at oracle.com>;
>>>>>> 'hotspot-
>>>>>>>>>> runtime-dev at openjdk.java.net' <hotspot-runtime-
>>> dev at openjdk.java.net>
>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
>>>>>>>>>> error.
>>>>>>>>>>
>>>>>>>>>> Hi Goetz,
>>>>>>>>>>
>>>>>>>>>> On 24/11/2016 10:15 PM, Lindenmaier, Goetz wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I now edited the stuff I had proposed below:
>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>>> stackFix/webrev.03/
>>>>>>>>>>> This includes
>>>>>>>>>>> - the NPTL fix from webrev.02
>>>>>>>>>> Okay in principle. As discussed this only impacts
>>>>>>>>>> non-JavaThreads
>>>>>>>>>> so the
>>>>>>>>>> change should be minimal.
>>>>>>>>>>
>>>>>>>>>>> - merging code on linux
>>>>>>>>>> Went a bit further than I had expected but if this truly
>>>>>>>>>> isn't CPU
>>>>>>>>>> dependent code then great!
>>>>>>>>>>
>>>>>>>>>>> - not adding OS guard to compiler threads.
>>>>>>>>>> Okay in principle. IIUC we will now save the OS guard page for
>>>>>>>>>> compiler
>>>>>>>>>> thread stacks. Is that the only impact? The hotspot-compiler-dev
>>>>>>>>>> folk
>>>>>>>>>> may want to sign off on this part.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> A few minor comments:
>>>>>>>>>>
>>>>>>>>>> src/os/linux/vm/os_linux.cpp
>>>>>>>>>>
>>>>>>>>>> 2854 return ((thr_type == java_thread || thr_type ==
>>>>>>>>>> os::compiler_thread) ...
>>>>>>>>>>
>>>>>>>>>> os:: should be used for both types or none.
>>>>>>>>>>
>>>>>>>>>> 6153 pthread_attr_getguardsize(&attr, &guard_size);
>>>>>>>>>>
>>>>>>>>>> Can you at least verify a zero return code in an
>>>>>>>>>> assert/assert_status
>>>>>>>>>> please.
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
>>>>>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
>>>>>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
>>>>>>>>>>
>>>>>>>>>> Are the changes to min_stack_allowed just fixing an existing
>>>>>>>>>> bug?
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I think this should be pushed for this bug ID. For the other
>>>>>>>>>>> changes I'll
>>>>>>>>>>> make another bug.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Goetz.
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Lindenmaier, Goetz
>>>>>>>>>>>> Sent: Wednesday, November 23, 2016 8:11 AM
>>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>;
>>>>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
>>> dev at openjdk.java.net
>>>>>>>>>>>> Subject: RE: RFR(M): 8169373: Work around linux NPTL stack
>>>>>>>>>>>> guard
>>>>>> error.
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with
>>>>>>>>>>>>> JVMCI.
>>>>>> The
>>>>>>>>>>>>> ability for a CompilerThread to execute Java code
>>>>>>>>>>>>> (can_call_java()) is
>>>>>>>>>>>>> now a dynamic property depending on whether the current
>>> compiler
>>>>>> is
>>>>>>>> the
>>>>>>>>>>>>> JVMCI compiler.
>>>>>>>>>>>> Ah, then I should also leave space for shadow pages in the
>>>>>>>>>>>> minimal
>>>>>> stack
>>>>>>>>>> size
>>>>>>>>>>>> of comiler threads.
>>>>>>>>>>>>
>>>>>>>>>>>> Do we agree on the cleanup and on leaving out the OS guard
>>>>>>>>>>>> page
>>> on
>>>>>>>>>>>> compiler threads?
>>>>>>>>>>>> Then I would edit a change comprising the NPTL workaround and
>>>>>>>>>>>> these
>>>>>>>>>>>> additional changes, and split the other issue into a new bug?
>>>>>>>>>>>> I think this
>>>>>>>>>>>> will easy the reviewing.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>>> Sent: Mittwoch, 23. November 2016 02:50
>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
>>> dev at openjdk.java.net
>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>> guard
>>>>>>>> error.
>>>>>>>>>>>>> On 22/11/2016 11:19 PM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: Daniel D. Daugherty
>>> [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>>>>>> Sent: Dienstag, 22. November 2016 14:01
>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David
>>>>>> Holmes
>>>>>>>>>>>>>>> <david.holmes at oracle.com>; hotspot-runtime-
>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
>>>>>>>>>>>>>>> guard
>>>>>>>>>>>> error.
>>>>>>>>>>>>>>> On 11/22/16 3:55 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I ran into a row of issues, some errors on the platforms.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What I meant with that comment is that
>>>>>>>>>>>>>>>> os::Linux::min_stack_allowed =
>>>>>> MAX2(os::Linux::min_stack_allowed,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> JavaThread::stack_shadow_zone_size() +
>>>>>>>>>>>>>>>> (4*BytesPerWord
>>>>>>>> COMPILER2_PRESENT(+2)) *
>>>>>>>>>> 4
>>>>>>>>>>>> *
>>>>>>>>>>>>> K);
>>>>>>>>>>>>>>>> was executed, and min_stack_allowed used for all
>>>>>>>>>>>>>>>> stacks. Now,
>>>>>>>>>>>> compiler
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> vm minimum stack sizes are not increased by these sizes.
>>>>>>>>>>>>>>> Now I see what you mean. Thanks for clearing this up.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I should have remembered that part of the change because we
>>>>>> went
>>>>>>>>>>>> back
>>>>>>>>>>>>>>> and forth about removing the red/yellow zone pages from the
>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>> threads. In particular, we discussed the compiler thread
>>>>>>>>>>>>>>> because it
>>>>>>>>>>>>>>> is-a JavaThread. Our conclusion was that a compiler thread
>>>>>>>>>>>>>>> doesn't
>>>>>>>>>>>>>>> execute Java bytecode so we could remove the red/yellow
>>>>>>>>>>>>>>> pages...
>>>>>>>>>>>>>> Yes, it does not execute java byte code.
>>>>>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with
>>>>>>>>>>>>> JVMCI.
>>>>>> The
>>>>>>>>>>>>> ability for a CompilerThread to execute Java code
>>>>>>>>>>>>> (can_call_java()) is
>>>>>>>>>>>>> now a dynamic property depending on whether the current
>>> compiler
>>>>>> is
>>>>>>>> the
>>>>>>>>>>>>> JVMCI compiler.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Therefore you can remove the shadow pages. There is no code
>>>>>>>>>>>>>> that
>>>>>>>>>>>>>> will bang.
>>>>>>>>>>>>>> But red/yellow pages are protected right during thread
>>>>>>>>>>>>>> startup.
>>>>>>>>>>>>>> Therefore you must have enough space for them.
>>>>>>>>>>>>>> On ppc, we try to protect three 64K pages out of the 128K
>>>>>>>>>>>>>> compiler
>>>>>>>> stack.
>>>>>>>>>>>>>> That obviously fails.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Therefore I propose:
>>>>>>>>>>>>>> size_t os::Posix::_java_thread_min_stack_allowed
>>>>>>>>>>>>>> = 48
>>>>>>>>>>>>>> * K; //
>>>>>> Set
>>>>>>>>>>>>> platform dependent.
>>>>>>>>>>>>>> in os::Posix::set_minimum_stack_sizes():
>>>>>>>>>>>>>> _java_thread_min_stack_allowed =
>>>>>> _java_thread_min_stack_allowed
>>>>>>>> +
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JavaThread::stack_shadow_zone_size();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Similar for _compiler_thread_min_stack_allowed).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The minimal stack size is made up of three components:
>>>>>>>>>>>>>> 1. Sizes of interpreter/C1/C2 frames. Depends on
>>>>>>>>>>>>>> HotSpot
>>>>>>>>>>>>> implementation/platform/os.
>>>>>>>>>>>>>> 2. Sizes of C++ frames: depends on C++ compiler.
>>>>>>>>>>>>>> These are fixed at compile time.
>>>>>>>>>>>>>> 3. Sizes of red/yellow/reserved/shadow pages. Depends
>>>>>>>>>>>>>> on the
>>>>>>>> system
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> VM is used on. This is not fixed at compile time.
>>>>>>>>>>>>>> (Our ppc
>>>>>>>> machines
>>>>>>>>>>>>> differ
>>>>>>>>>>>>>> in page size.)
>>>>>>>>>>>>>> Therefore 3. should not be included in a compile time
>>>>>>>>>>>>>> constant.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And that decision allowed us to be exposed to the 64K page
>>>>>>>>>>>>>>> issue
>>>>>>>>>>>>>>> because the "extra" space isn't there anymore.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> At least the _compiler_thread_min_stack_allowed should be
>>>>>>>> increased
>>>>>>>>>>>>>>>> similarly by red/yellow zone size. The compiler thread is
>>>>>>>>>>>>>>>> a Java
>>>>>>>>>>>>>>>> thread and must have space for these zones.
>>>>>>>>>>>>>>> I'm not sure that I completely agree (yet). To me, the
>>>>>>>>>>>>>>> red/yellow
>>>>>>>>>>>>>>> pages are there for Java thread stack overflow semantics.
>>>>>>>>>>>>>>> Yes, the
>>>>>>>>>>>>>>> compiler thread needs extra space when 64K pages are used,
>>>>>>>>>>>>>>> but I
>>>>>>>>>>>>>>> would prefer that we add that space via a different
>>>>>>>>>>>>>>> calculation.
>>>>>>>>>>>>>> Yes they are. But compiler threads happen tob e a
>>>>>>>>>>>>>> subclass of
>>>>>>>>>>>>>> Java threads and use the same run() method that puts the
>>>>>>>>>>>>>> pages
>>>>>>>>>>>>>> There.
>>>>>>>>>>>>>> I agree that they are not needed for Compiler threads,
>>>>>>>>>>>>>> nor for
>>>>>>>>>>>>>> CodeCacheSweeperThreads. I don't really now about
>>>>>>>>>>>>>> JvmtiAgentThreads and ServiceThreads, but all of the get
>>>>>>>>>>>>>> the guard
>>>>>>>>>>>> pages
>>>>>>>>>>>>>> because they are derived from JavaThread.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To put it another way, I'd like to see us add extra
>>>>>>>>>>>>>>> space to
>>>>>>>>>>>>>>> solve
>>>>>>>>>>>>>>> the 64K page issue directly instead of as a side effect
>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>> red/yellow page addition.
>>>>>>>>>>>>>> I don't understand. What do you mean by 'directly'?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also, the change added a test that is failing now.
>>>>>>>>>>>>>>> And that's a "good thing" (TM), right? :-)
>>>>>>>>>>>>>> Yes, it showed a bug and thus raised the need to fix it! :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Goetz.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Again, thanks for clarifying 8140520's role in this issue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>> Goetz.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>> From: Daniel D. Daugherty
>>>>>> [mailto:daniel.daugherty at oracle.com]
>>>>>>>>>>>>>>>>> Sent: Montag, 21. November 2016 17:28
>>>>>>>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>; Lindenmaier,
>>>>>> Goetz
>>>>>>>>>>>>>>>>> <goetz.lindenmaier at sap.com>; hotspot-runtime-
>>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL
>>>>>>>>>>>>>>>>> stack
>>>>>> guard
>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>> Sorry for the delayed responses to this thread. I've
>>>>>>>>>>>>>>>>> been on
>>>>>>>> vacation...
>>>>>>>>>>>>>>>>> One comment/query embedded below...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 11/10/16 8:40 PM, David Holmes wrote:
>>>>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 11/11/2016 8:00 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>> Hi David,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This issue is different to 6675312, see also my comment
>>>>>>>>>>>>>>>>>>> in the
>>>>>>>> bug.
>>>>>>>>>>>>>>>>>>> It appears running jtreg test
>>>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java,
>>>>>>>>>>>>>>>>>>> with my patch below you can reproduce it on
>>>>>>>>>>>>>>>>>>> linuxx86_64.
>>>>>>>>>>>>>>>>>>> You
>>>>>>>> can
>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> do that with 6675312. Also, I would assume there are
>>>>>>>>>>>>>>>>>>> systems
>>>>>> out
>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>>> on x86 that uses 64-K pages, did you run the tests on
>>>>>>>>>>>>>>>>>>> these? I
>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>> assume you get hard crashes with stack overflows in the
>>>>>>>>>>>>>>>>>>> first
>>>>>> C2
>>>>>>>>>>>>>>>>>>> compilation if there is only 64K usable
>>>>>>>>>>>>>>>>>>> CompilerThreadStack.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> My fix does not affect Java threads, which are the
>>>>>>>>>>>>>>>>>>> largest
>>>>>> amount
>>>>>>>>>>>>>>>>>>> of threads used by the VM. It affects only the non-Java
>>>>>>>>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>>> It adds one page to these threads. The page does not
>>>>>>>>>>>>>>>>>>> require
>>>>>>>>>>>>> memory,
>>>>>>>>>>>>>>>>>>> as it's protected. The stack will only require more
>>>>>>>>>>>>>>>>>>> space if the
>>>>>>>>>>>> thread
>>>>>>>>>>>>>>>>>>> ran into a stack overflow before the fix as else the
>>>>>>>>>>>>>>>>>>> pages are
>>>>>> not
>>>>>>>>>>>>>>>>>>> mapped.
>>>>>>>>>>>>>>>>>>> This are stack overflows that cause hard crashes, at
>>>>>>>>>>>>>>>>>>> least on
>>>>>> ppc
>>>>>>>>>>>> the
>>>>>>>>>>>>> VM
>>>>>>>>>>>>>>>>>>> does not properly catch these stack overflows, so
>>>>>>>>>>>>>>>>>>> any setup
>>>>>>>>>>>> working
>>>>>>>>>>>>>>> now
>>>>>>>>>>>>>>>>>>> will not run into the additional space. Altogether
>>>>>>>>>>>>>>>>>>> there
>>>>>>>>>>>>>>>>>>> should
>>>>>> be
>>>>>>>>>>>> no
>>>>>>>>>>>>>>>>>>> effect on running systems besides requiring one more
>>>>>>>>>>>>>>>>>>> entry in
>>>>>>>> the
>>>>>>>>>>>>>>>>>>> page table per non-Java thread.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The problem is caused by a rather recent change
>>> (8140520:
>>>>>>>>>>>> segfault
>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>> solaris-amd64
>>>>>>>>>>>>>>>>>>> with "-XX:VMThreadStackSize=1" option) which was pushed
>>>>>> after
>>>>>>>>>>>>>>>>>>> feature-close. As this was a rather recent change, it
>>>>>>>>>>>>>>>>>>> must be
>>>>>>>>>>>>>>>>>>> possible to
>>>>>>>>>>>>>>>>>>> fix this follow up issue. What else is this period in
>>>>>>>>>>>>>>>>>>> the project
>>>>>>>> good
>>>>>>>>>>>>>>>>>>> for if not fixing issues?
>>>>>>>>>>>>>>>>>> So I am seeing a number of factors here.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> First, 8140520, set:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> size_t os::Posix::_compiler_thread_min_stack_allowed
>>>>>>>>>>>>>>>>>> = 128
>>> *
>>>>>> K;
>>>>>>>>>>>>>>>>> So I'm confused by the above comment:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > The problem is caused by a rather recent change
>>>>>>>>>>>>>>>>> (8140520:
>>>>>>>>>>>> segfault
>>>>>>>>>>>>>>>>> > on solaris-amd64 with "-XX:VMThreadStackSize=1"
>>>>>>>>>>>>>>>>> option)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8140520-webrev/5-jdk9-
>>> hs-
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>> open/hotspot/src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp.frames.html
>>>>>>>>>>>>>>>>> shows this change:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @@ -531,19 +531,17 @@
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>> //////////////////////////////////////////////////////////////////////////////
>>>>>>
>>>>>>
>>>>>>>>>>>>>>>>> //
>>>>>>>>>>>>>>>>> // thread stack
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -size_t os::Linux::min_stack_allowed = 128*K;
>>>>>>>>>>>>>>>>> +size_t os::Posix::_compiler_thread_min_stack_allowed
>>>>>>>>>>>>>>>>> = 128
>>> *
>>>>>> K;
>>>>>>>>>>>>>>>>> +size_t os::Posix::_java_thread_min_stack_allowed =
>>>>>>>>>>>>>>>>> 128 * K;
>>>>>>>>>>>>>>>>> +size_t
>>>>>>>>>>>>>>>>> os::Posix::_vm_internal_thread_min_stack_allowed =
>>>>>>>>>>>>>>>>> 128
>>>>>> *
>>>>>>>> K;
>>>>>>>>>>>>>>>>> so the existing single variable of 'min_stack_allowed'
>>>>>>>>>>>>>>>>> was
>>>>>>>>>>>>>>>>> replaced by three variables:
>>>>>> _compiler_thread_min_stack_allowed,
>>>>>>>>>>>>>>>>> _java_thread_min_stack_allowed, and
>>>>>>>>>>>>>>>>> _vm_internal_thread_min_stack_allowed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The old single variable and the three new variables
>>>>>>>>>>>>>>>>> are all
>>>>>>>>>>>>>>>>> initialized to the same value (128K) so the fix for
>>>>>>>>>>>>>>>>> 8140520
>>>>>>>>>>>>>>>>> did not change stack sizes for this platform. In fact,
>>>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>>> one platform had a size change (Solaris X64).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So I'm confused about how the fix for 8140520 caused this
>>>>>> problem.
>>>>>>>>>>>>>>>>> Based on David's analysis below, it looks to me like this
>>>>>>>>>>>>>>>>> 64K stack
>>>>>>>>>>>>>>>>> guard page problem should also exist prior to the fix for
>>>>>> 8140520.
>>>>>>>>>>>>>>>>> Goetz, can you please explain how 8140520 caused this
>>>>>>>>>>>>>>>>> problem?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Second on linux PPC it is hardwired to use 2 guard
>>>>>>>>>>>>>>>>>> pages:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> return 2 * page_size();
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Third, you had a pagesize of 64K.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Fourth, NPTL takes the guard space from the stack
>>>>>>>>>>>>>>>>>> space -
>>>>>> hence
>>>>>>>>>>>> with
>>>>>>>>>>>>> 2
>>>>>>>>>>>>>>>>>> x 64K guard, and a 128K stack it was all consumed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In the proposed changes you now only use page_size() for
>>> the
>>>>>>>> guard,
>>>>>>>>>>>>> so
>>>>>>>>>>>>>>>>>> that alone would have fixed the observed problem.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But in addition you want to address the NPTL problem by
>>>>>>>>>>>>>>>>>> adding
>>>>>>>>>>>> back
>>>>>>>>>>>>>>>>>> the guard space to the stack size requested. That alone
>>>>>>>>>>>>>>>>>> would
>>>>>> also
>>>>>>>>>>>>>>>>>> have fixed the observed problem. :)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But in addition you have increased the minimum stack
>>>>>>>>>>>>>>>>>> size:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ! size_t os::Posix::_compiler_thread_min_stack_allowed =
>>>>>>>>>>>>>>>>>> 192 *
>>>>>> K;
>>>>>>>>>>>>>>>>>> which again, on its own would have fixed the original
>>>>>>>>>>>>>>>>>> problem.
>>>>>> :)
>>>>>>>>>>>>>>>>>> Did you really intend to increase the real minimum stack
>>>>>>>>>>>>>>>>>> from
>>>>>>>> 128K
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> 256K ? (on a 64K page system)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Focusing simply on the shared code change to adjust the
>>>>>> requested
>>>>>>>>>>>>>>>>>> stacksize by the amount of guard space (if any), this
>>>>>>>>>>>>>>>>>> does not
>>>>>> seem
>>>>>>>>>>>>>>>>>> unreasonable. As you note it is restricted to
>>>>>>>>>>>>>>>>>> non-JavaThreads
>>>>>> and
>>>>>>>>>>>> only
>>>>>>>>>>>>>>>>>> adds a page to reserved stack space.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My only query now is whether the minimum stacksize
>>> detection
>>>>>>>> logic
>>>>>>>>>>>>>>>>>> will correctly report the real minimum stack size
>>>>>>>>>>>>>>>>>> (taking
>>>>>>>>>>>>>>>>>> into
>>>>>>>> account
>>>>>>>>>>>>>>>>>> the need for the guard page) ?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So I really think this issue should be fixed.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>> Goetz.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
>>>>>>>>>>>>>>>>>>>> Sent: Thursday, November 10, 2016 10:02 PM
>>>>>>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>> hotspot-
>>>>>>>>>>>>>>>>> runtime-
>>>>>>>>>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL
>>> stack
>>>>>>>>>>>> guard
>>>>>>>>>>>>>>> error.
>>>>>>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> As per the bug report, this issue was already known
>>>>>> (6675312)
>>>>>>>> and
>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>> chose not to try and address it due to no reported
>>>>>>>>>>>>>>>>>>>> issues at
>>>>>> the
>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>>> While I see that you have encountered an issue (is it
>>>>>>>>>>>>>>>>>>>> real or
>>>>>>>>>>>>>>>>>>>> fabricated?) I think this change is too intrusive
>>>>>>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>>>>>>> applied
>>>>>> at
>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> stage of the JDK 9 release cycle, as it will change
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>>> requirements of every application running on Linux.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 11/11/2016 1:58 AM, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please review this change. I please need a sponsor:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
>>>>>>>>>>>>>>>>> stackFix/webrev.01/
>>>>>>>>>>>>>>>>>>>>> In the Linux NPTL pthread implementation the guard
>>>>>>>>>>>>>>>>>>>>> size
>>>>>>>>>>>>> mechanism
>>>>>>>>>>>>>>>>>>>>> is not
>>>>>>>>>>>>>>>>>>>>> implemented properly. The posix standard requires to
>>> add
>>>>>> the
>>>>>>>>>>>>> size
>>>>>>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>>>>>>> guard pages to the stack size, instead Linux takes
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> space
>>>>>> out
>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> 'stacksize'.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The Posix standard
>>>>>>>>>>>>>>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
>>>>>>>>>>>>>>>>>>>>> says "the implementation allocates extra memory
>>>>>>>>>>>>>>>>>>>>> at the
>>>>>>>>>>>>> overflow
>>>>>>>>>>>>>>>>>>>>> end of
>>>>>>>>>>>>>>>>>>>>> the stack". The linux man page
>>>>>>>>>>>>>>>>>>>>> https://linux.die.net/man/3/pthread_attr_setguardsize
>>>>>>>>>>>>>>>>>>>>> says
>>>>>> "As
>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>> glibc
>>>>>>>>>>>>>>>>>>>>> 2.8, the NPTL threading implementation includes the
>>> guard
>>>>>>>> area
>>>>>>>>>>>>>>> within
>>>>>>>>>>>>>>>>>>>>> the stack size allocation, rather than allocating
>>>>>>>>>>>>>>>>>>>>> extra space
>>>>>> at
>>>>>>>>>>>>>>>>>>>>> the end
>>>>>>>>>>>>>>>>>>>>> of the stack, as POSIX.1 requires".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I encounter this problem in
>>>>>>>>>>>>> runtime/Thread/TooSmallStackSize.java
>>>>>>>>>>>>>>>>>>>>> on ppc
>>>>>>>>>>>>>>>>>>>>> with 64K pages. _compiler_thread_min_stack_allowed is
>>>>>> 128K
>>>>>>>>>>>> on
>>>>>>>>>>>>>>> ppc,
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> ppc specifies two OS guard pages. The VM crashes in
>>>>>> pthread
>>>>>>>>>>>>>>> creation
>>>>>>>>>>>>>>>>>>>>> because there is no usable space in the thread stack
>>>>>>>>>>>>>>>>>>>>> after
>>>>>>>>>>>>> allocating
>>>>>>>>>>>>>>>>>>>>> the guard pages.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> But TooSmallStackSize.java requires that the VM comes
>>> up
>>>>>>>> with
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>>>>>> size mentioned in the error message.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> This fix adapts the requested stack size on Linux by
>>>>>>>>>>>>>>>>>>>>> the size
>>>>>> of
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> guard pages to mimick proper behaviour, see change to
>>>>>>>>>>>>>>> os_linux.cpp.
>>>>>>>>>>>>>>>>>>>>> The change also streamlines usage of stack_guard_page
>>> on
>>>>>>>>>>>>> linuxppc,
>>>>>>>>>>>>>>>>>>>>> linuxppcle, aixppc and linuxs390.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> To reproduce the error on linux_x86_64, apply below
>>> patch
>>>>>>>> and
>>>>>>>>>>>>> call
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> VM with -XX:CompilerThreadStackSize=64.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm still exploring why I had to choose such big
>>>>>>>>>>>>>>>>>>>>> compiler
>>>>>> stacks
>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>> ppc
>>>>>>>>>>>>>>>>>>>>> to get -version passing, but I wanted to send the RFR
>>>>>>>>>>>>>>>>>>>>> now as
>>>>>>>>>>>>> people
>>>>>>>>>>>>>>>>>>>>> obviously looked at the bug I opened (Thanks David!).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Goetz.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> diff -r b7ae012c55c3
>>>>>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>>>>>>>>>>>>>>>>>>> --- a/src/os_cpu/linux_x86/vm/os_linux_x86.cpp Mon
>>> Nov
>>>>>> 07
>>>>>>>>>>>>>>> 12:37:28
>>>>>>>>>>>>>>>>>>>> 2016
>>>>>>>>>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> +++ b/src/os_cpu/linux_x86/vm/os_linux_x86.cpp Thu
>>> Nov
>>>>>> 10
>>>>>>>>>>>>>>>>> 16:52:17
>>>>>>>>>>>>>>>>>>>> 2016
>>>>>>>>>>>>>>>>>>>>> +0100
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> @@ -701,7 +701,7 @@
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> size_t os::Linux::default_guard_size(os::ThreadType
>>>>>> thr_type) {
>>>>>>>>>>>>>>>>>>>>> // Creating guard page is very expensive. Java
>>>>>>>>>>>>>>>>>>>>> thread
>>>>>> has
>>>>>>>>>>>>> HotSpot
>>>>>>>>>>>>>>>>>>>>> // guard page, only enable glibc guard page for
>>>>>>>>>>>>>>>>>>>>> non-Java
>>>>>>>>>>>>> threads.
>>>>>>>>>>>>>>>>>>>>> - return (thr_type == java_thread ? 0 : page_size());
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> + return (thr_type == java_thread ? 0 : 64*K);
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> // Java thread:
>>>>>>>>>>>>>>>>>>>>>
>>>>
More information about the hotspot-compiler-dev
mailing list