RFR(M): 8169373: Work around linux NPTL stack guard error.
Lindenmaier, Goetz
goetz.lindenmaier at sap.com
Thu Dec 1 10:32:06 UTC 2016
Hi David,
I fixed the comment:
http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.07/
We run a lot of tests with this change:
Hotspot jtreg, jck, spec, SAP applications
On these platforms:
Windows_x86_64, linux_x86_64, solaris_sparc, mac_x86_64,
Linux_ppc64, linux_ppc64le, aix_ppc64, linux_s390
I did not spot a problem there.
Best regards,
Goetz.
> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Mittwoch, 30. November 2016 22:51
> To: daniel.daugherty at oracle.com; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; hotspot-compiler-dev at openjdk.java.net;
> 'hotspot-runtime-dev at openjdk.java.net' <hotspot-runtime-
> dev at openjdk.java.net>
> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
>
> On 1/12/2016 1:20 AM, Daniel D. Daugherty wrote:
> >> Would you mind sponsoring this change?
> >
> > Yes, I will sponsor this change. However, I would like to get a
> > thumbs up from David H. on the latest version and I don't see
> > any review from someone on the Compiler Team.
>
> I'm okay with proposed changes - but also want to hear from compiler
> team and I'd want to see this put through some advance testing before it
> gets pushed (full rbt run).
>
> I have one minor nit in the wording of the fatal error messages "failed
> with errno" - these methods don't touch errno so I'd prefer it if it
> said error.
>
> Thanks,
> David
> -----
>
> > Vladimir! We need someone on the Compiler Team to look at these
> > CompilerThread stack size changes...
> >
> > Dan
> >
> >
> > On 11/30/16 12:57 AM, Lindenmaier, Goetz wrote:
> >> Hi Dan,
> >>
> >>> Thumbs up! I don't need to see a new webrev if you choose
> >>> to fix the minor comments above.
> >> I anyways did a new one fixing the comments:
> >> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.06/
> >> Would you mind sponsoring this change?
> >>
> >> I took the minimum stack sizes from my experimental runs where
> >> I had removed the automatic resizing to find the really needed space.
> >> If I put something smaller there, I could as well put '0' ... as
> >> _java_thread_min_stack_allowed =
> MAX2(_java_thread_min_stack_allowed,
> >>
> >> JavaThread::stack_guard_zone_size() +
> >>
> >> JavaThread::stack_shadow_zone_size() +
> >> (4 * BytesPerWord
> >> COMPILER2_PRESENT(+ 2)) * 4 * K);
> >> will fix it.
> >> This code effectively limits the usable stack size to
> >> (4 * BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K)
> >> which makes the initialization of _java_thread_min_stack_allowed with
> >> platform
> >> values pointless.
> >>
> >> I'll open a new bug for the follow-up stack issue, probably tomorrow.
> >> I don't feel like addressing testing all the possible error codes, as
> >> they probably should be fixed in more places, and there is no issue
> >> pending currently. Maybe it should be fixed in jdk10 at some point.
> >>
> >> Best regards,
> >> Goetz
> >>
> >>
> >>> -----Original Message-----
> >>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
> >>> Sent: Dienstag, 29. November 2016 20:04
> >>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-compiler-
> >>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net' <hotspot-
> >>> runtime-dev at openjdk.java.net>
> >>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard error.
> >>>
> >>> On 11/29/16 2:30 AM, Lindenmaier, Goetz wrote:
> >>>> Hi Dan,
> >>>>
> >>>> see my replies inline ...
> >>>> New webrev:
> >>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-stackFix/webrev.05/
> >>> src/os/aix/vm/os_aix.cpp
> >>> L887: // libc guard page
> >>> nit - You made other existing comments into sentences (leading
> >>> capital and trailing '.'), but not this new comment.
> >>> Why?
> >>>
> >>> src/os/aix/vm/os_aix.hpp
> >>> No comments.
> >>>
> >>> src/os/linux/vm/os_linux.cpp
> >>> L6096: // | |/ 1 page glibc guard.
> >>> nit - "1 page glibc guard" -> "1 glibc guard page."
> >>>
> >>> src/os/posix/vm/os_posix.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp
> >>> L875: // | |/ 1 page glibc guard.
> >>> nit - "1 page glibc guard" -> "1 glibc guard page."
> >>>
> >>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
> >>> No comments.
> >>>
> >>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
> >>> No comments.
> >>>
> >>>
> >>> Thumbs up! I don't need to see a new webrev if you choose
> >>> to fix the minor comments above.
> >>>
> >>> Some replies embedded below.
> >>>
> >>>
> >>>>> -----Original Message-----
> >>>>> From: Daniel D. Daugherty [mailto:daniel.daugherty at oracle.com]
> >>>>> Sent: Dienstag, 29. November 2016 01:38
> >>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-
> compiler-
> >>>>> dev at openjdk.java.net; 'hotspot-runtime-dev at openjdk.java.net'
> <hotspot-
> >>>>> runtime-dev at openjdk.java.net>
> >>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
> >>>>> error.
> >>>>>
> >>>>> On 11/28/16 2:08 AM, Lindenmaier, Goetz wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm working on a fix for OS guard pages on stacks. I figured there
> >>>>>> are VM guard pages (reserved, yellow, red) on the compiler stacks
> >>>>>> _and_ OS guard pages. For Java threads, the OS guard pages are left
> >>>>>> out. I think this should be done for compiler threads, too. Please
> >>>>>> confirm.
> >>>>>> Webrev:
> >>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
> stackFix/webrev.04/
> >>>>>>
> >>>>> src/os/aix/vm/os_aix.cpp
> >>>>> L888: pthread_attr_setguardsize(&attr,
> >>>>> os::Aix::default_guard_size(thr_type));
> >>>>> No check or assert on the return status of this call.
> >>>>> Is one needed?
> >>>> I implemented this as the existing code on linux which has
> >>>> no check either. I think a failure is quite theoretical.
> >>>> Because of your comment below I'll leave it as-is.
> >>> OK.
> >>>
> >>>
> >>>>> L3044: // guard pages, so only enable libc guard pages for
> >>>>> non-Java threads.
> >>>>> src/os/linux/vm/os_linux.cpp also has this comment:
> >>>>> // (Remember: compiler thread is a Java thread, too!)
> >>>> Fixed.
> >>>>
> >>>>> L3051: return ((thr_type == java_thread || thr_type ==
> >>>>> compiler_thread) ? 0 : 4*K);
> >>>>> nit - please add spaces around the '*' so '4 * K'.'
> >>>> Fixed.
> >>>>
> >>>>> src/os/aix/vm/os_aix.hpp
> >>>>> No comments.
> >>>>>
> >>>>> src/os/linux/vm/os_linux.cpp
> >>>>> L729: // is not implemented properly. The posix standard
> >>>>> requires
> >>>>> to add
> >>>>> Typo: 'to add' -> 'adding'
> >>>> Fixed.
> >>>>
> >>>>> L738: pthread_attr_setguardsize(&attr,
> >>>>> os::Linux::default_guard_size(thr_type));
> >>>>> No check or assert on the return status of this call.
> >>>>> Is one needed?
> >>>> See above.
> >>>>
> >>>>> L2851: // Creating guard page is very expensive. Java
> >>>>> thread has
> >>>>> HotSpot
> >>>>> L2852: // guard page, only enable glibc guard page for
> >>>>> non-Java
> >>>>> threads.
> >>>>> L2853: // (Remember: compiler thread is a java thread, too!)
> >>>>> Typo: "java thread" -> "Java thread" (consistency)
> >>>>>
> >>>>> This comment block should be common to all the platforms
> >>>>> that
> >>>>> define default_guard_size(). Yes, I can see that AIX
> >>>>> needs to
> >>>>> add another paragraph, but we should make the core comment
> >>> common
> >>>>> if possible.
> >>>> I made the first three lines look alike.
> >>>>
> >>>>> L6090: // Java/Compiler thread:
> >>>>> Thanks for making this common in os_linux.cpp.
> >>>>>
> >>>>> L6095: // | glibc guard page | - guard, attached Java
> >>>>> thread usually has
> >>>>> Clarity: "guard," -> "guard page,"
> >>>> Fixed.
> >>>>
> >>>>> Typo: "Java thread" -> "JavaThread" (consistency)
> >>>> I changed both to Java thread as at the other locations.
> >>>>
> >>>>> L6099: // | HotSpot Guard Pages | - red and yellow pages
> >>>>> The fairly recently added reserved page should be mentioned
> >>>>> here also?
> >>>> Yes. Fixed. Also fixed it to say
> >>>> JavaThread::stack_reserved_zone_base().
> >>>> Also fixed comment on bsd.
> >>> Thanks for also fixing BSD.
> >>>
> >>>
> >>>>> L6120 // ** P1 (aka bottom) and size ( P2 = P1 - size) are the
> >>>>> address and stack size returned from
> >>>>> Typo: "( P2 = ..." -> "(P2 = ..."
> >>>> Fixed.
> >>>>
> >>>>> L6148: fatal("Can not locate current stack attributes!");
> >>>>> Typo: "Can not" -> "Cannot"
> >>>> Fixed.
> >>>>
> >>>>> L6175: // stack size includes normal stack and HotSpot
> >>>>> guard pages
> >>>>> Perhaps add to the comment:
> >>>>> "for the threads that have HotSpot guard pages."
> >>>> Fixed. I also checked my comments for "OS guard pages" and
> >>>> replaced it by "glibc guard pages" which is used in several places
> >>>> already, same for "VM guard page" --> "HotSpot guard page". I
> >>>> think this is also more consistent.
> >>> I agree!
> >>>
> >>>
> >>>>> src/os/posix/vm/os_posix.cpp
> >>>>> L1097: pthread_attr_getstacksize(attr, &stack_size);
> >>>>> L1098: pthread_attr_getguardsize(attr, &guard_size);
> >>>>> Do these two calls need to have their result checked?
> >>>>>
> >>>>> Now I'm starting to wonder if all the uses of these
> >>>>> two APIs need to be checked? Separate bug?
> >>>> It would be more consistent with the specification of the methods,
> >>>> On the other side it's quite unlikely that these fail if attr != NULL.
> >>> So should we file a new bug? Or does this fall into the realm of
> >>> other OS/libc code that we call and assume never fails? :-)
> >>>
> >>>
> >>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
> >>>>> L540: size_t os::Posix::_compiler_thread_min_stack_allowed =
> >>>>> 512 * K;
> >>>>> L541: size_t os::Posix::_java_thread_min_stack_allowed = 512
> >>>>> * K;
> >>>>> So prior to the fix for 8140520,
> >>>>> src/os/aix/vm/os_aix.cpp had
> >>>>> this single min_stack_allowed value:
> >>>>>
> >>>>> L3601: os::Aix::min_stack_allowed =
> >>>>> MAX2(os::Aix::min_stack_allowed,
> >>>>> L3602: JavaThread::stack_guard_zone_size() +
> >>>>> L3603: JavaThread::stack_shadow_zone_size() +
> >>>>> L3604: (4*BytesPerWord
> >>>>> COMPILER2_PRESENT(+2)) * 4 * K);
> >>>>>
> >>>>> and the fix for 8140520 changed that for *NIX platforms to
> >>>>> three mins in src/os/posix/vm/os_posix.cpp:
> >>>>>
> >>>>> L1108: _java_thread_min_stack_allowed =
> >>>>> MAX2(_java_thread_min_stack_allowed,
> >>>>> L1109: JavaThread::stack_guard_zone_size() +
> >>>>> L1110: JavaThread::stack_shadow_zone_size() +
> >>>>> L1111: (4 *
> >>>>> BytesPerWord COMPILER2_PRESENT(+ 2)) * 4 * K);
> >>>>>
> >>>>> L1150: _compiler_thread_min_stack_allowed =
> >>>>> align_size_up(_compiler_thread_min_stack_allowed, vm_page_size());
> >>>>>
> >>>>> L1161 _vm_internal_thread_min_stack_allowed =
> >>>>> align_size_up(_vm_internal_thread_min_stack_allowed,
> vm_page_size());
> >>>>>
> >>>>> Which means that the compiler_thread no longer benefits
> >>>>> from
> >>>>> the extra space for quard and shadow pages. The thinking in
> >>>>> 8140520 was that the compiler_thread and
> >>>>> vm_internal_threads
> >>>>> don't need the quard and shadow pages since they don't run
> >>>>> Java code (ignoring JVMCI for now).
> >>>>>
> >>>>> So I can see bumping _compiler_thread_min_stack_allowed
> >>>>> from
> >>>>> 128 -> 512 as one solution for getting that extra space
> >>>>> back.
> >>>>> However, I don't understand why
> >>>>> _java_thread_min_stack_allowed
> >>>>> has changed from 128 -> 512.
> >>>> Because it was never correct before.
> >>> OK. That sounds like the new test that I included with 8140520 would
> >>> have failed with JavaThread stack sizes even before the product code
> >>> changes from 8140520 were made.
> >>>
> >>> Since the size calculation for JavaThread stack sizes wasn't changed
> >>> for any platform in 8140520, that tends to indicate that the more
> >>> limited JDK test (test/tools/launcher/TooSmallStackSize.java) should
> >>> also have failed before the fix for 8140520.
> >>>
> >>> Please clarify the need for the _java_thread_min_stack_allowed change
> >>> from 128 -> 512. Unless test/tools/launcher/TooSmallStackSize.java
> >>> is never run in your testing, I'm having troubling seeing why the
> >>> _java_thread_min_stack_allowed increase is needed.
> >>>
> >>>
> >>>>> I had previously made this comment:
> >>>>> > To put it another way, I'd like to see us add extra
> >>>>> space to
> >>>>> > solve the 64K page issue directly instead of as a side
> >>>>> effect
> >>>>> > of the red/yellow page addition.
> >>>>> And Goetz replied with:
> >>>>> > I don't understand. What do you mean by 'directly'?
> >>>>>
> >>>>> So prior to the fix for 8140520,
> >>>>> src/os/solaris/vm/os_solaris.cpp
> >>>>> had a block like this:
> >>>>>
> >>>>> L4468: // For 64kbps there will be a 64kb page size,
> >>>>> which makes
> >>>>> L4469: // the usable default stack size quite a bit less.
> >>>>> Increase the
> >>>>> L4470: // stack for 64kb (or any > than 8kb) pages, this
> >>>>> increases
> >>>>> L4471: // virtual memory fragmentation (since we're not
> >>>>> creating the
> >>>>> L4472 // stack on a power of 2 boundary. The real fix
> >>>>> for this
> >>>>> L4473 // should be to fix the guard page mechanism.
> >>>>> L4474
> >>>>> L4475 if (vm_page_size() > 8*K) {
> >>>>> L4476 threadStackSizeInBytes =
> >>>>> (threadStackSizeInBytes != 0)
> >>>>> L4477 ? threadStackSizeInBytes +
> >>>>> L4478 JavaThread::stack_red_zone_size() +
> >>>>> L4479 JavaThread::stack_yellow_zone_size()
> >>>>> L4480 : 0;
> >>>>> L4481 ThreadStackSize = threadStackSizeInBytes/K;
> >>>>> L4482 }
> >>>>>
> >>>>> The above is an example of what I mean by solving the 64K
> >>>>> page issue directly. In the fix for 8140520, that code
> >>>>> block
> >>>>> was moved to os::Posix::set_minimum_stack_sizes() in
> >>>>> src/os/posix/vm/os_posix.cpp and put in a "#ifdef
> >>>>> SOLARIS...
> >>>>> #endif // SOLARIS" block. Coleen filed a bug to determine
> >>>>> whether that code can be deleted:
> >>>>>
> >>>>> JDK-8161093 Solaris for >8k pagesize adds extra guard pages
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8161093
> >>>>>
> >>>>> but perhaps this bug shows that the code is needed?
> >>>>>
> >>>>>
> >>>>> OK so this is probably the longest code review comment
> >>>>> I have ever written, but the summary is:
> >>>>>
> >>>>> - I understand bumping _compiler_thread_min_stack_allowed,
> >>>>> but should it be solved in a different way?
> >>>>> - I don't understand bumping _java_thread_min_stack_allowed
> >>>> I plan to do a follow up change to fix this. Let's leave this
> >>>> discussion
> >>>> to that review. Here I just want to fix the NPTL issue and the basic
> >>>> sizing that is needed to pass the new test on ppc/s390.
> >>> Same question here about the simpler JDK version of the test.
> >>>
> >>> Does test/tools/launcher/TooSmallStackSize.java get run in
> >>> your test environments?
> >>>
> >>>
> >>>>> src/os_cpu/linux_aarch64/vm/os_linux_aarch64.cpp
> >>>>> No comments.
> >>>>>
> >>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
> >>>>> L538: size_t os::Posix::_compiler_thread_min_stack_allowed =
> >>>>> 384 * K;
> >>>>> L539: size_t os::Posix::_java_thread_min_stack_allowed = 384
> >>>>> * K;
> >>>>>
> >>>>> Same monster comment as
> >>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
> >>>>> and the same summary:
> >>>>>
> >>>>> - I understand bumping _compiler_thread_min_stack_allowed,
> >>>>> but should it be solved in a different way?
> >>>>> - I don't understand bumping _java_thread_min_stack_allowed
> >>>>>
> >>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
> >>>>> L478: size_t os::Posix::_compiler_thread_min_stack_allowed =
> >>>>> 128 * K;
> >>>>> L479: size_t os::Posix::_java_thread_min_stack_allowed = 236
> >>>>> * K;
> >>>>> Bumping _java_thread_min_stack_allowed but not bumping
> >>>>> _compiler_thread_min_stack_allowed. I'm confused here.
> >>>> The numbers are what I need to startup on the machines. 128 is just
> >>>> fine on the machines we have. (That's the problem of the
> >>>> current setup: you have to tune this compile time constant for the
> >>>> page size of the machine you are running on. But let's discuss this
> >>>> in the follow up change.)
> >>> OK about discussing this with a follow-up change. I guess I see
> >>> the compile time initialization as a "minimum setting assuming the
> >>> smallest page size". If we discover (at runtime) that the page
> >>> size is bigger, then we adjust the minimum that we need...
> >>>
> >>> Again, defer to another bug. Do we have a bug ID yet?
> >>>
> >>>
> >>>>> src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp
> >>>>> No comments.
> >>>>>
> >>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
> >>>>> No comments.
> >>>>>
> >>>>> src/os_cpu/linux_zero/vm/os_linux_zero.cpp
> >>>>> No comments.
> >>>>>
> >>>>> Sorry it took me so long to write this up...
> >>>> No matter, thanks for this thorough review!
> >>> You are very welcome. Thanks for being willing to dive into such
> >>> a complicated area (thread stack sizes)...
> >>>
> >>> Dan
> >>>
> >>>
> >>>
> >>>> Best regards,
> >>>> Goetz.
> >>>>
> >>>>> Dan
> >>>>>
> >>>>>> The change affecting the compier threads is in os_linux.cpp,
> >>>>> default_guard_size(),
> >>>>>> where '|| thr_type == compiler_thread' has been added. The function
> >>>>>> was also moved from the os_cpu files, as it's identical on all cpus.
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Goetz.
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
> >>>>>>> Sent: Montag, 28. November 2016 00:25
> >>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>> 'daniel.daugherty at oracle.com' <daniel.daugherty at oracle.com>;
> >>> 'hotspot-
> >>>>>>> runtime-dev at openjdk.java.net' <hotspot-runtime-
> dev at openjdk.java.net>
> >>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack guard
> >>>>>>> error.
> >>>>>>>
> >>>>>>> Hi Goetz,
> >>>>>>>
> >>>>>>> On 24/11/2016 10:15 PM, Lindenmaier, Goetz wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I now edited the stuff I had proposed below:
> >>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
> >>> stackFix/webrev.03/
> >>>>>>>> This includes
> >>>>>>>> - the NPTL fix from webrev.02
> >>>>>>> Okay in principle. As discussed this only impacts non-JavaThreads
> >>>>>>> so the
> >>>>>>> change should be minimal.
> >>>>>>>
> >>>>>>>> - merging code on linux
> >>>>>>> Went a bit further than I had expected but if this truly isn't CPU
> >>>>>>> dependent code then great!
> >>>>>>>
> >>>>>>>> - not adding OS guard to compiler threads.
> >>>>>>> Okay in principle. IIUC we will now save the OS guard page for
> >>>>>>> compiler
> >>>>>>> thread stacks. Is that the only impact? The hotspot-compiler-dev
> >>>>>>> folk
> >>>>>>> may want to sign off on this part.
> >>>>>>>
> >>>>>>>
> >>>>>>> A few minor comments:
> >>>>>>>
> >>>>>>> src/os/linux/vm/os_linux.cpp
> >>>>>>>
> >>>>>>> 2854 return ((thr_type == java_thread || thr_type ==
> >>>>>>> os::compiler_thread) ...
> >>>>>>>
> >>>>>>> os:: should be used for both types or none.
> >>>>>>>
> >>>>>>> 6153 pthread_attr_getguardsize(&attr, &guard_size);
> >>>>>>>
> >>>>>>> Can you at least verify a zero return code in an
> >>>>>>> assert/assert_status
> >>>>>>> please.
> >>>>>>>
> >>>>>>> ---
> >>>>>>>
> >>>>>>> src/os_cpu/aix_ppc/vm/os_aix_ppc.cpp
> >>>>>>> src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp
> >>>>>>> src/os_cpu/linux_s390/vm/os_linux_s390.cpp
> >>>>>>>
> >>>>>>> Are the changes to min_stack_allowed just fixing an existing bug?
> >>>>>>>
> >>>>>>> ---
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>> -----
> >>>>>>>
> >>>>>>>
> >>>>>>>> I think this should be pushed for this bug ID. For the other
> >>>>>>>> changes I'll
> >>>>>>>> make another bug.
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>> Goetz.
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Lindenmaier, Goetz
> >>>>>>>>> Sent: Wednesday, November 23, 2016 8:11 AM
> >>>>>>>>> To: David Holmes <david.holmes at oracle.com>;
> >>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
> dev at openjdk.java.net
> >>>>>>>>> Subject: RE: RFR(M): 8169373: Work around linux NPTL stack guard
> >>> error.
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with
> >>>>>>>>>> JVMCI.
> >>> The
> >>>>>>>>>> ability for a CompilerThread to execute Java code
> >>>>>>>>>> (can_call_java()) is
> >>>>>>>>>> now a dynamic property depending on whether the current
> compiler
> >>> is
> >>>>> the
> >>>>>>>>>> JVMCI compiler.
> >>>>>>>>> Ah, then I should also leave space for shadow pages in the minimal
> >>> stack
> >>>>>>> size
> >>>>>>>>> of comiler threads.
> >>>>>>>>>
> >>>>>>>>> Do we agree on the cleanup and on leaving out the OS guard page
> on
> >>>>>>>>> compiler threads?
> >>>>>>>>> Then I would edit a change comprising the NPTL workaround and
> >>>>>>>>> these
> >>>>>>>>> additional changes, and split the other issue into a new bug?
> >>>>>>>>> I think this
> >>>>>>>>> will easy the reviewing.
> >>>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>> Goetz.
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
> >>>>>>>>>> Sent: Mittwoch, 23. November 2016 02:50
> >>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>>>>> daniel.daugherty at oracle.com; hotspot-runtime-
> dev at openjdk.java.net
> >>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
> guard
> >>>>> error.
> >>>>>>>>>> On 22/11/2016 11:19 PM, Lindenmaier, Goetz wrote:
> >>>>>>>>>>> Hi Dan,
> >>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Daniel D. Daugherty
> [mailto:daniel.daugherty at oracle.com]
> >>>>>>>>>>>> Sent: Dienstag, 22. November 2016 14:01
> >>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; David
> >>> Holmes
> >>>>>>>>>>>> <david.holmes at oracle.com>; hotspot-runtime-
> >>> dev at openjdk.java.net
> >>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
> >>>>>>>>>>>> guard
> >>>>>>>>> error.
> >>>>>>>>>>>> On 11/22/16 3:55 AM, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>> Hi Dan,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I ran into a row of issues, some errors on the platforms.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What I meant with that comment is that
> >>>>>>>>>>>>> os::Linux::min_stack_allowed =
> >>> MAX2(os::Linux::min_stack_allowed,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> JavaThread::stack_guard_zone_size() +
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> JavaThread::stack_shadow_zone_size() +
> >>>>>>>>>>>>> (4*BytesPerWord
> >>>>> COMPILER2_PRESENT(+2)) *
> >>>>>>> 4
> >>>>>>>>> *
> >>>>>>>>>> K);
> >>>>>>>>>>>>> was executed, and min_stack_allowed used for all stacks. Now,
> >>>>>>>>> compiler
> >>>>>>>>>>>> and
> >>>>>>>>>>>>> vm minimum stack sizes are not increased by these sizes.
> >>>>>>>>>>>> Now I see what you mean. Thanks for clearing this up.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I should have remembered that part of the change because we
> >>> went
> >>>>>>>>> back
> >>>>>>>>>>>> and forth about removing the red/yellow zone pages from the
> >>>>>>>>>>>> other
> >>>>>>>>>>>> threads. In particular, we discussed the compiler thread
> >>>>>>>>>>>> because it
> >>>>>>>>>>>> is-a JavaThread. Our conclusion was that a compiler thread
> >>>>>>>>>>>> doesn't
> >>>>>>>>>>>> execute Java bytecode so we could remove the red/yellow
> >>>>>>>>>>>> pages...
> >>>>>>>>>>> Yes, it does not execute java byte code.
> >>>>>>>>>> Bzzzt! Sorry Goetz and Dan but that is no longer correct with
> >>>>>>>>>> JVMCI.
> >>> The
> >>>>>>>>>> ability for a CompilerThread to execute Java code
> >>>>>>>>>> (can_call_java()) is
> >>>>>>>>>> now a dynamic property depending on whether the current
> compiler
> >>> is
> >>>>> the
> >>>>>>>>>> JVMCI compiler.
> >>>>>>>>>>
> >>>>>>>>>> David
> >>>>>>>>>> -----
> >>>>>>>>>>
> >>>>>>>>>>> Therefore you can remove the shadow pages. There is no code
> >>>>>>>>>>> that
> >>>>>>>>>>> will bang.
> >>>>>>>>>>> But red/yellow pages are protected right during thread startup.
> >>>>>>>>>>> Therefore you must have enough space for them.
> >>>>>>>>>>> On ppc, we try to protect three 64K pages out of the 128K
> >>>>>>>>>>> compiler
> >>>>> stack.
> >>>>>>>>>>> That obviously fails.
> >>>>>>>>>>>
> >>>>>>>>>>> Therefore I propose:
> >>>>>>>>>>> size_t os::Posix::_java_thread_min_stack_allowed = 48
> >>>>>>>>>>> * K; //
> >>> Set
> >>>>>>>>>> platform dependent.
> >>>>>>>>>>> in os::Posix::set_minimum_stack_sizes():
> >>>>>>>>>>> _java_thread_min_stack_allowed =
> >>> _java_thread_min_stack_allowed
> >>>>> +
> >>>>>>>>>>>
> >>>>>>>>>>> JavaThread::stack_guard_zone_size() +
> >>>>>>>>>>>
> >>>>>>>>>>> JavaThread::stack_shadow_zone_size();
> >>>>>>>>>>>
> >>>>>>>>>>> (Similar for _compiler_thread_min_stack_allowed).
> >>>>>>>>>>>
> >>>>>>>>>>> The minimal stack size is made up of three components:
> >>>>>>>>>>> 1. Sizes of interpreter/C1/C2 frames. Depends on HotSpot
> >>>>>>>>>> implementation/platform/os.
> >>>>>>>>>>> 2. Sizes of C++ frames: depends on C++ compiler.
> >>>>>>>>>>> These are fixed at compile time.
> >>>>>>>>>>> 3. Sizes of red/yellow/reserved/shadow pages. Depends
> >>>>>>>>>>> on the
> >>>>> system
> >>>>>>>>>> the
> >>>>>>>>>>> VM is used on. This is not fixed at compile time.
> >>>>>>>>>>> (Our ppc
> >>>>> machines
> >>>>>>>>>> differ
> >>>>>>>>>>> in page size.)
> >>>>>>>>>>> Therefore 3. should not be included in a compile time constant.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> And that decision allowed us to be exposed to the 64K page
> >>>>>>>>>>>> issue
> >>>>>>>>>>>> because the "extra" space isn't there anymore.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> At least the _compiler_thread_min_stack_allowed should be
> >>>>> increased
> >>>>>>>>>>>>> similarly by red/yellow zone size. The compiler thread is
> >>>>>>>>>>>>> a Java
> >>>>>>>>>>>>> thread and must have space for these zones.
> >>>>>>>>>>>> I'm not sure that I completely agree (yet). To me, the
> >>>>>>>>>>>> red/yellow
> >>>>>>>>>>>> pages are there for Java thread stack overflow semantics.
> >>>>>>>>>>>> Yes, the
> >>>>>>>>>>>> compiler thread needs extra space when 64K pages are used,
> >>>>>>>>>>>> but I
> >>>>>>>>>>>> would prefer that we add that space via a different
> >>>>>>>>>>>> calculation.
> >>>>>>>>>>> Yes they are. But compiler threads happen tob e a subclass of
> >>>>>>>>>>> Java threads and use the same run() method that puts the pages
> >>>>>>>>>>> There.
> >>>>>>>>>>> I agree that they are not needed for Compiler threads, nor for
> >>>>>>>>>>> CodeCacheSweeperThreads. I don't really now about
> >>>>>>>>>>> JvmtiAgentThreads and ServiceThreads, but all of the get
> >>>>>>>>>>> the guard
> >>>>>>>>> pages
> >>>>>>>>>>> because they are derived from JavaThread.
> >>>>>>>>>>>
> >>>>>>>>>>>> To put it another way, I'd like to see us add extra space to
> >>>>>>>>>>>> solve
> >>>>>>>>>>>> the 64K page issue directly instead of as a side effect of the
> >>>>>>>>>>>> red/yellow page addition.
> >>>>>>>>>>> I don't understand. What do you mean by 'directly'?
> >>>>>>>>>>>
> >>>>>>>>>>>>> Also, the change added a test that is failing now.
> >>>>>>>>>>>> And that's a "good thing" (TM), right? :-)
> >>>>>>>>>>> Yes, it showed a bug and thus raised the need to fix it! :)
> >>>>>>>>>>>
> >>>>>>>>>>> Best regards,
> >>>>>>>>>>> Goetz.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> Again, thanks for clarifying 8140520's role in this issue.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Dan
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>> Goetz.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>> From: Daniel D. Daugherty
> >>> [mailto:daniel.daugherty at oracle.com]
> >>>>>>>>>>>>>> Sent: Montag, 21. November 2016 17:28
> >>>>>>>>>>>>>> To: David Holmes <david.holmes at oracle.com>; Lindenmaier,
> >>> Goetz
> >>>>>>>>>>>>>> <goetz.lindenmaier at sap.com>; hotspot-runtime-
> >>>>>>>>>> dev at openjdk.java.net
> >>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL stack
> >>> guard
> >>>>>>>>>> error.
> >>>>>>>>>>>>>> Sorry for the delayed responses to this thread. I've been on
> >>>>> vacation...
> >>>>>>>>>>>>>> One comment/query embedded below...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 11/10/16 8:40 PM, David Holmes wrote:
> >>>>>>>>>>>>>>> Hi Goetz,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 11/11/2016 8:00 AM, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>>>>> Hi David,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> This issue is different to 6675312, see also my comment
> >>>>>>>>>>>>>>>> in the
> >>>>> bug.
> >>>>>>>>>>>>>>>> It appears running jtreg test
> >>>>>>>>>> runtime/Thread/TooSmallStackSize.java,
> >>>>>>>>>>>>>>>> with my patch below you can reproduce it on linuxx86_64.
> >>>>>>>>>>>>>>>> You
> >>>>> can
> >>>>>>>>>> not
> >>>>>>>>>>>>>>>> do that with 6675312. Also, I would assume there are
> >>>>>>>>>>>>>>>> systems
> >>> out
> >>>>>>>>>> there
> >>>>>>>>>>>>>>>> on x86 that uses 64-K pages, did you run the tests on
> >>>>>>>>>>>>>>>> these? I
> >>>>>>>>> would
> >>>>>>>>>>>>>>>> assume you get hard crashes with stack overflows in the
> >>>>>>>>>>>>>>>> first
> >>> C2
> >>>>>>>>>>>>>>>> compilation if there is only 64K usable
> >>>>>>>>>>>>>>>> CompilerThreadStack.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> My fix does not affect Java threads, which are the largest
> >>> amount
> >>>>>>>>>>>>>>>> of threads used by the VM. It affects only the non-Java
> >>>>>>>>>>>>>>>> threads.
> >>>>>>>>>>>>>>>> It adds one page to these threads. The page does not
> >>>>>>>>>>>>>>>> require
> >>>>>>>>>> memory,
> >>>>>>>>>>>>>>>> as it's protected. The stack will only require more
> >>>>>>>>>>>>>>>> space if the
> >>>>>>>>> thread
> >>>>>>>>>>>>>>>> ran into a stack overflow before the fix as else the
> >>>>>>>>>>>>>>>> pages are
> >>> not
> >>>>>>>>>>>>>>>> mapped.
> >>>>>>>>>>>>>>>> This are stack overflows that cause hard crashes, at
> >>>>>>>>>>>>>>>> least on
> >>> ppc
> >>>>>>>>> the
> >>>>>>>>>> VM
> >>>>>>>>>>>>>>>> does not properly catch these stack overflows, so any setup
> >>>>>>>>> working
> >>>>>>>>>>>> now
> >>>>>>>>>>>>>>>> will not run into the additional space. Altogether there
> >>>>>>>>>>>>>>>> should
> >>> be
> >>>>>>>>> no
> >>>>>>>>>>>>>>>> effect on running systems besides requiring one more
> >>>>>>>>>>>>>>>> entry in
> >>>>> the
> >>>>>>>>>>>>>>>> page table per non-Java thread.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The problem is caused by a rather recent change
> (8140520:
> >>>>>>>>> segfault
> >>>>>>>>>> on
> >>>>>>>>>>>>>>>> solaris-amd64
> >>>>>>>>>>>>>>>> with "-XX:VMThreadStackSize=1" option) which was pushed
> >>> after
> >>>>>>>>>>>>>>>> feature-close. As this was a rather recent change, it
> >>>>>>>>>>>>>>>> must be
> >>>>>>>>>>>>>>>> possible to
> >>>>>>>>>>>>>>>> fix this follow up issue. What else is this period in
> >>>>>>>>>>>>>>>> the project
> >>>>> good
> >>>>>>>>>>>>>>>> for if not fixing issues?
> >>>>>>>>>>>>>>> So I am seeing a number of factors here.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> First, 8140520, set:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> size_t os::Posix::_compiler_thread_min_stack_allowed = 128
> *
> >>> K;
> >>>>>>>>>>>>>> So I'm confused by the above comment:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> > The problem is caused by a rather recent change
> >>>>>>>>>>>>>> (8140520:
> >>>>>>>>> segfault
> >>>>>>>>>>>>>> > on solaris-amd64 with "-XX:VMThreadStackSize=1"
> >>>>>>>>>>>>>> option)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8140520-webrev/5-jdk9-
> hs-
> >>>>>>>>>>>>>>
> >>>>>>>>>>
> open/hotspot/src/os_cpu/linux_ppc/vm/os_linux_ppc.cpp.frames.html
> >>>>>>>>>>>>>> shows this change:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> @@ -531,19 +531,17 @@
> >>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>> //////////////////////////////////////////////////////////////////////////////
> >>>
> >>>>>>>>>>>>>> //
> >>>>>>>>>>>>>> // thread stack
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -size_t os::Linux::min_stack_allowed = 128*K;
> >>>>>>>>>>>>>> +size_t os::Posix::_compiler_thread_min_stack_allowed = 128
> *
> >>> K;
> >>>>>>>>>>>>>> +size_t os::Posix::_java_thread_min_stack_allowed = 128 * K;
> >>>>>>>>>>>>>> +size_t os::Posix::_vm_internal_thread_min_stack_allowed =
> >>>>>>>>>>>>>> 128
> >>> *
> >>>>> K;
> >>>>>>>>>>>>>> so the existing single variable of 'min_stack_allowed' was
> >>>>>>>>>>>>>> replaced by three variables:
> >>> _compiler_thread_min_stack_allowed,
> >>>>>>>>>>>>>> _java_thread_min_stack_allowed, and
> >>>>>>>>>>>>>> _vm_internal_thread_min_stack_allowed.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The old single variable and the three new variables are all
> >>>>>>>>>>>>>> initialized to the same value (128K) so the fix for 8140520
> >>>>>>>>>>>>>> did not change stack sizes for this platform. In fact, only
> >>>>>>>>>>>>>> one platform had a size change (Solaris X64).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> So I'm confused about how the fix for 8140520 caused this
> >>> problem.
> >>>>>>>>>>>>>> Based on David's analysis below, it looks to me like this
> >>>>>>>>>>>>>> 64K stack
> >>>>>>>>>>>>>> guard page problem should also exist prior to the fix for
> >>> 8140520.
> >>>>>>>>>>>>>> Goetz, can you please explain how 8140520 caused this
> >>>>>>>>>>>>>> problem?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Dan
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Second on linux PPC it is hardwired to use 2 guard pages:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> return 2 * page_size();
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Third, you had a pagesize of 64K.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Fourth, NPTL takes the guard space from the stack space -
> >>> hence
> >>>>>>>>> with
> >>>>>>>>>> 2
> >>>>>>>>>>>>>>> x 64K guard, and a 128K stack it was all consumed.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In the proposed changes you now only use page_size() for
> the
> >>>>> guard,
> >>>>>>>>>> so
> >>>>>>>>>>>>>>> that alone would have fixed the observed problem.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But in addition you want to address the NPTL problem by
> >>>>>>>>>>>>>>> adding
> >>>>>>>>> back
> >>>>>>>>>>>>>>> the guard space to the stack size requested. That alone
> >>>>>>>>>>>>>>> would
> >>> also
> >>>>>>>>>>>>>>> have fixed the observed problem. :)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But in addition you have increased the minimum stack size:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ! size_t os::Posix::_compiler_thread_min_stack_allowed =
> >>>>>>>>>>>>>>> 192 *
> >>> K;
> >>>>>>>>>>>>>>> which again, on its own would have fixed the original
> >>>>>>>>>>>>>>> problem.
> >>> :)
> >>>>>>>>>>>>>>> Did you really intend to increase the real minimum stack
> >>>>>>>>>>>>>>> from
> >>>>> 128K
> >>>>>>>>> to
> >>>>>>>>>>>>>>> 256K ? (on a 64K page system)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Focusing simply on the shared code change to adjust the
> >>> requested
> >>>>>>>>>>>>>>> stacksize by the amount of guard space (if any), this
> >>>>>>>>>>>>>>> does not
> >>> seem
> >>>>>>>>>>>>>>> unreasonable. As you note it is restricted to
> >>>>>>>>>>>>>>> non-JavaThreads
> >>> and
> >>>>>>>>> only
> >>>>>>>>>>>>>>> adds a page to reserved stack space.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> My only query now is whether the minimum stacksize
> detection
> >>>>> logic
> >>>>>>>>>>>>>>> will correctly report the real minimum stack size (taking
> >>>>>>>>>>>>>>> into
> >>>>> account
> >>>>>>>>>>>>>>> the need for the guard page) ?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> David
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> So I really think this issue should be fixed.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>>> Goetz.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>>>> From: David Holmes [mailto:david.holmes at oracle.com]
> >>>>>>>>>>>>>>>>> Sent: Thursday, November 10, 2016 10:02 PM
> >>>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>> hotspot-
> >>>>>>>>>>>>>> runtime-
> >>>>>>>>>>>>>>>>> dev at openjdk.java.net
> >>>>>>>>>>>>>>>>> Subject: Re: RFR(M): 8169373: Work around linux NPTL
> stack
> >>>>>>>>> guard
> >>>>>>>>>>>> error.
> >>>>>>>>>>>>>>>>> Hi Goetz,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> As per the bug report, this issue was already known
> >>> (6675312)
> >>>>> and
> >>>>>>>>>> we
> >>>>>>>>>>>>>>>>> chose not to try and address it due to no reported
> >>>>>>>>>>>>>>>>> issues at
> >>> the
> >>>>>>>>>> time.
> >>>>>>>>>>>>>>>>> While I see that you have encountered an issue (is it
> >>>>>>>>>>>>>>>>> real or
> >>>>>>>>>>>>>>>>> fabricated?) I think this change is too intrusive to be
> >>>>>>>>>>>>>>>>> applied
> >>> at
> >>>>> this
> >>>>>>>>>>>>>>>>> stage of the JDK 9 release cycle, as it will change the
> >>>>>>>>>>>>>>>>> stack
> >>>>>>>>>>>>>>>>> requirements of every application running on Linux.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>> David
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On 11/11/2016 1:58 AM, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Please review this change. I please need a sponsor:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr16/8169373-ppc-
> >>>>>>>>>>>>>> stackFix/webrev.01/
> >>>>>>>>>>>>>>>>>> In the Linux NPTL pthread implementation the guard size
> >>>>>>>>>> mechanism
> >>>>>>>>>>>>>>>>>> is not
> >>>>>>>>>>>>>>>>>> implemented properly. The posix standard requires to
> add
> >>> the
> >>>>>>>>>> size
> >>>>>>>>>>>>>>>>>> of the
> >>>>>>>>>>>>>>>>>> guard pages to the stack size, instead Linux takes the
> >>>>>>>>>>>>>>>>>> space
> >>> out
> >>>>>>>>> of
> >>>>>>>>>>>>>>>>>> 'stacksize'.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The Posix standard
> >>>>>>>>>>>>>> http://pubs.opengroup.org/onlinepubs/9699919799/
> >>>>>>>>>>>>>>>>>> says "the implementation allocates extra memory
> >>>>>>>>>>>>>>>>>> at the
> >>>>>>>>>> overflow
> >>>>>>>>>>>>>>>>>> end of
> >>>>>>>>>>>>>>>>>> the stack". The linux man page
> >>>>>>>>>>>>>>>>>> https://linux.die.net/man/3/pthread_attr_setguardsize
> >>>>>>>>>>>>>>>>>> says
> >>> "As
> >>>>>>>>>> at
> >>>>>>>>>>>>>>>>>> glibc
> >>>>>>>>>>>>>>>>>> 2.8, the NPTL threading implementation includes the
> guard
> >>>>> area
> >>>>>>>>>>>> within
> >>>>>>>>>>>>>>>>>> the stack size allocation, rather than allocating
> >>>>>>>>>>>>>>>>>> extra space
> >>> at
> >>>>>>>>>>>>>>>>>> the end
> >>>>>>>>>>>>>>>>>> of the stack, as POSIX.1 requires".
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I encounter this problem in
> >>>>>>>>>> runtime/Thread/TooSmallStackSize.java
> >>>>>>>>>>>>>>>>>> on ppc
> >>>>>>>>>>>>>>>>>> with 64K pages. _compiler_thread_min_stack_allowed is
> >>> 128K
> >>>>>>>>> on
> >>>>>>>>>>>> ppc,
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>> ppc specifies two OS guard pages. The VM crashes in
> >>> pthread
> >>>>>>>>>>>> creation
> >>>>>>>>>>>>>>>>>> because there is no usable space in the thread stack
> >>>>>>>>>>>>>>>>>> after
> >>>>>>>>>> allocating
> >>>>>>>>>>>>>>>>>> the guard pages.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> But TooSmallStackSize.java requires that the VM comes
> up
> >>>>> with
> >>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> stack
> >>>>>>>>>>>>>>>>>> size mentioned in the error message.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> This fix adapts the requested stack size on Linux by
> >>>>>>>>>>>>>>>>>> the size
> >>> of
> >>>>>>>>> the
> >>>>>>>>>>>>>>>>>> guard pages to mimick proper behaviour, see change to
> >>>>>>>>>>>> os_linux.cpp.
> >>>>>>>>>>>>>>>>>> The change also streamlines usage of stack_guard_page
> on
> >>>>>>>>>> linuxppc,
> >>>>>>>>>>>>>>>>>> linuxppcle, aixppc and linuxs390.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> To reproduce the error on linux_x86_64, apply below
> patch
> >>>>> and
> >>>>>>>>>> call
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> VM with -XX:CompilerThreadStackSize=64.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I'm still exploring why I had to choose such big compiler
> >>> stacks
> >>>>>>>>> on
> >>>>>>>>>>>>>>>>>> ppc
> >>>>>>>>>>>>>>>>>> to get -version passing, but I wanted to send the RFR
> >>>>>>>>>>>>>>>>>> now as
> >>>>>>>>>> people
> >>>>>>>>>>>>>>>>>> obviously looked at the bug I opened (Thanks David!).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Goetz.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> diff -r b7ae012c55c3
> >>>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
> >>>>>>>>>>>>>>>>>> --- a/src/os_cpu/linux_x86/vm/os_linux_x86.cpp Mon
> Nov
> >>> 07
> >>>>>>>>>>>> 12:37:28
> >>>>>>>>>>>>>>>>> 2016
> >>>>>>>>>>>>>>>>>> +0100
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> +++ b/src/os_cpu/linux_x86/vm/os_linux_x86.cpp Thu
> Nov
> >>> 10
> >>>>>>>>>>>>>> 16:52:17
> >>>>>>>>>>>>>>>>> 2016
> >>>>>>>>>>>>>>>>>> +0100
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> @@ -701,7 +701,7 @@
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> size_t os::Linux::default_guard_size(os::ThreadType
> >>> thr_type) {
> >>>>>>>>>>>>>>>>>> // Creating guard page is very expensive. Java
> >>>>>>>>>>>>>>>>>> thread
> >>> has
> >>>>>>>>>> HotSpot
> >>>>>>>>>>>>>>>>>> // guard page, only enable glibc guard page for
> >>>>>>>>>>>>>>>>>> non-Java
> >>>>>>>>>> threads.
> >>>>>>>>>>>>>>>>>> - return (thr_type == java_thread ? 0 : page_size());
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> + return (thr_type == java_thread ? 0 : 64*K);
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> // Java thread:
> >>>>>>>>>>>>>>>>>>
> >
More information about the hotspot-compiler-dev
mailing list