RFR: 8170307: Stack size option -Xss is ignored

Tue Dec 13 17:12:43 UTC 2016

On 12/12/16 9:41 PM, David Holmes wrote:
> Okay here's the updated webrev complete with nice logging:
>
> http://cr.openjdk.java.net/~dholmes/8170307/webrev.v2/

src/os/linux/vm/os_linux.cpp
     L936:   // a user-specified value known to be greater than the 
minimum needed.
         Perhaps: ... known to be at least the minimum needed.

         As enforced by this code in os::Posix::set_minimum_stack_sizes():

         _java_thread_min_stack_allowed = 
MAX2(_java_thread_min_stack_allowed,
JavaThread::stack_guard_zone_size() +
JavaThread::stack_shadow_zone_size() +
                                               (4 * BytesPerWord 
COMPILER2_PRESENT(+ 2)) * 4 * K);

         _java_thread_min_stack_allowed = 
align_size_up(_java_thread_min_stack_allowed, vm_page_size());

         size_t stack_size_in_bytes = ThreadStackSize * K;
         if (stack_size_in_bytes != 0 &&
             stack_size_in_bytes < _java_thread_min_stack_allowed) {
           // The '-Xss' and '-XX:ThreadStackSize=N' options both set
           // ThreadStackSize so we go with "Java thread stack size" instead
           // of "ThreadStackSize" to be more friendly.
           tty->print_cr("\nThe Java thread stack size specified is too 
small. "
                         "Specify at least " SIZE_FORMAT "k",
                         _java_thread_min_stack_allowed / K);
           return JNI_ERR;
         }

     L932:   // can not do anything to emulate a larger stack than what 
has been provided by
         Typo: 'can not' -> 'cannot'

     L936:   // Mamimum stack size is the easy part, get it from 
RLIMIT_STACK
         Typo: 'Mamimum' -> 'Maximum'
         nit - please add a '.' to the end.

     L1125:                          SIZE_FORMAT "K, top=" INTPTR_FORMAT 
", bottom=" INTPTR_FORMAT "\n",
         Does the logging subsystem convert the "\n" into the proper
         output for non-*NIX platforms, e.g., Windows?

     L1126:                          primordial ? "Primordial" : "User", 
max_size/K,  _initial_thread_stack_size/K,
         Please add spaces around the div operator.

         Any particular reason that "Primordial" and "User" start with 
upper case?

Thumbs up!

I don't need to see a new webrev if you decide to make the
minor edits above.

Dan

>
> The stack size will be the smaller of the rlimit stack and the 
> -Xss/ThreadStackSize value. If the rlimit stack is unlimited and 
> ThreadStackSize==0 then we clamp it at 8MB as we do on Solaris. So you 
> can now get whatever primordial thread stack size you want by using 
> ulimit and -Xss appropriately.
>
> Thanks,
> David
>
> On 3/12/2016 2:11 PM, David Holmes wrote:
>> On 3/12/2016 9:12 AM, Daniel D. Daugherty wrote:
>>> On 12/1/16 10:51 PM, David Holmes wrote:
>>>> Investigating this further the history is quite complex, especially
>>>> when we start looking at other platforms. E.g. see
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-6269555
>>>>
>>>> Solaris actually hard-wires an 8MB limit for the primordial thread.
>>>>
>>>> I'm very tempted to do the same on Linux.
>>>
>>> Vote: yes
>>
>> Excellent! Other votes?
>>
>>> This latest problem only comes up with -XX:ThreadStackSize=0 when the
>>> stack is unlimited right?
>>
>> Right.
>>
>>> When -XX:ThreadStackSize=0 is specified, is taking the smaller of
>>> 8MB or the ulimit a viable option?
>>
>> I think so.
>>
>>> Also, it looks like Hui had some things to say about not setting the
>>> red/yellow zone pages on the primordial thread when we aren't using the
>>> 'java' launcher because we don't know the environment of the code that
>>> is using the JNI invocation API...
>>
>> Yeah but those comments seem a bit confused to me. They suggest we
>> shouldn't add guard pages but in fact we do add guard pages. And to me
>> it is no different in the primordial thread than any other natively
>> attached thread ie why should the initially attached thread be treated
>> differently to any other?** I suspect if I keep researching on this I
>> will find bugs regarding such differences in behaviour (eg the fact that
>> -Xss wasn't working on the main thread).
>>
>> ** There are arguments both ways as to how natively attached threads
>> should behave. The main argument against guard page insertion is that we
>> don't know how far down the existing stack we actually are - we could be
>> past the depth where the guard page would be inserted! The main argument
>> for (which seems to have won the day) is so that we don't get arbitrary
>> differences in behaviour between threads created and attached by
>> application native code; and threads created direct from application
>> Java code.
>>
>> Anyway, simply upping the 2M limit on Linux to 8M seems a simple
>> solution - assuming it addresses the needs of the folk that ran into
>> this problem.
>>
>> Thanks,
>> David
>>
>>> Dan
>>>
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>> On 30/11/2016 6:46 PM, David Holmes wrote:
>>>>> On 30/11/2016 6:17 PM, Thomas Stüfe wrote:
>>>>>> On Wed, Nov 30, 2016 at 8:35 AM, David Holmes 
>>>>>> <david.holmes at oracle.com
>>>>>> <mailto:david.holmes at oracle.com>> wrote:
>>>>>>
>>>>>>     On 29/11/2016 10:25 PM, David Holmes wrote:
>>>>>>
>>>>>>         I just realized I overlooked the case where 
>>>>>> ThreadStackSize=0
>>>>>>         and the
>>>>>>         stack is unlimited. In that case it isn't clear where the
>>>>>> guard
>>>>>>         pages
>>>>>>         will get inserted - I do know that I don't get a 
>>>>>> stackoverflow
>>>>>>         error.
>>>>>>
>>>>>>         This needs further investigation.
>>>>>>
>>>>>>
>>>>>>     So what happens here is that the massive stack-size causes
>>>>>>     stack-bottom to be higher than stack-top! So we will set a
>>>>>>     guard-page goodness knows where, and we can consume the current
>>>>>>     stack until such time as we hit an unmapped or protected 
>>>>>> region at
>>>>>>     which point we are killed.
>>>>>>
>>>>>>     I'm not sure what to do here. My gut feel is that in such a
>>>>>> case we
>>>>>>     should not attempt to create a guard page in the initial thread.
>>>>>>     That would require using a sentinel value for the stack-size.
>>>>>> Though
>>>>>>     it also presents a problem for stack-bottom - which is 
>>>>>> implicitly
>>>>>>     zero. It may also give false positives in the 
>>>>>> is_initial_thread()
>>>>>> check!
>>>>>>
>>>>>>     Thoughts? Suggestions?
>>>>>>
>>>>>>
>>>>>> Maybe I am overlooking something, but should
>>>>>> os::capture_initial_thread() not call pthread_getattr_np() first to
>>>>>> handle the case where the VM was created on a pthread which is 
>>>>>> not the
>>>>>> primordial thread and may have a different stack size than what
>>>>>> getrlimit returns? And fall back to getrlimit only if
>>>>>> pthread_getattr_np() fails?
>>>>>
>>>>> My understanding of the problem (which likely no longer exists) is 
>>>>> that
>>>>> pthread_getattr_np didn't fail as such but returned bogus values - so
>>>>> the problem was not detectable and so we just had to not use
>>>>> pthread_getattr_np.
>>>>>
>>>>>> And then we also should handle
>>>>>> RLIM_INFINITY. For that case, I also think not setting guard pages
>>>>>> would
>>>>>> be safest.
>>>>>>
>>>>>> We also may just refuse to run in that case, because the workaround
>>>>>> for
>>>>>> the user is easy - just set the limit before process start. Note
>>>>>> that on
>>>>>> AIX, we currently refuse to run on the primordial thread because it
>>>>>> may
>>>>>> have different page sizes than pthreads and it is impossible to get
>>>>>> the
>>>>>> exact stack locations.
>>>>>
>>>>> I was wondering why the AIX set up seemed so simple in comparison :)
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>         David
>>>>>>
>>>>>>         On 29/11/2016 9:59 PM, David Holmes wrote:
>>>>>>
>>>>>>             Hi Thomas,
>>>>>>
>>>>>>             On 29/11/2016 8:39 PM, Thomas Stüfe wrote:
>>>>>>
>>>>>>                 Hi David,
>>>>>>
>>>>>>                 thanks for the good explanation. Change looks 
>>>>>> good, I
>>>>>>                 really like the
>>>>>>                 comment in capture_initial_stack().
>>>>>>
>>>>>>                 Question, with -Xss given and being smaller than
>>>>>> current
>>>>>>                 thread stack
>>>>>>                 size, guard pages may appear in the middle of the
>>>>>>                 invoking thread stack?
>>>>>>                 I always thought this is a bit dangerous. If your
>>>>>> model
>>>>>>                 is to have the
>>>>>>                 VM created from the main thread, which then goes
>>>>>> off to
>>>>>>                 do different
>>>>>>                 things, and have other threads then attach and run
>>>>>> java
>>>>>>                 code, main
>>>>>>                 thread later may crash in unrelated native code just
>>>>>>                 because it reached
>>>>>>                 the stack depth of the hava threads? Or am I
>>>>>>                 misunderstanding something?
>>>>>>
>>>>>>
>>>>>>             There is no change to the general behaviour other than
>>>>>>             allowing a
>>>>>>             primordial process thread that launches the VM, to 
>>>>>> now not
>>>>>>             have an
>>>>>>             effective stack limited at 2MB. The current logic will
>>>>>>             insert guard
>>>>>>             pages where ever -Xss states (as long as less than 2MB
>>>>>> else
>>>>>>             2MB), while
>>>>>>             with the fix the guard pages will be inserted above 2MB
>>>>>> - as
>>>>>>             dictated by
>>>>>>             -Xss.
>>>>>>
>>>>>>             David
>>>>>>             -----
>>>>>>
>>>>>>                 Thanks, Thomas
>>>>>>
>>>>>>
>>>>>>                 On Fri, Nov 25, 2016 at 11:38 AM, David Holmes
>>>>>>                 <david.holmes at oracle.com
>>>>>> <mailto:david.holmes at oracle.com>
>>>>>>                 <mailto:david.holmes at oracle.com
>>>>>> <mailto:david.holmes at oracle.com>>> wrote:
>>>>>>
>>>>>>                     Bug:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8170307
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8170307>
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8170307
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8170307>>
>>>>>>
>>>>>>                     The bug is not public unfortunately for
>>>>>>                 non-technical reasons - but
>>>>>>                     see my eval below.
>>>>>>
>>>>>>                     Background: if you load the JVM from the
>>>>>> primordial
>>>>>>                 thread of a
>>>>>>                     process (not done by the java launcher since JDK
>>>>>> 6),
>>>>>>                 there is an
>>>>>>                     artificial stack limit imposed on the initial
>>>>>> thread
>>>>>>                 (by sticking
>>>>>>                     the guard page at the limit position of the 
>>>>>> actual
>>>>>>                 stack) of the
>>>>>>                     minimum of the -Xss setting and 2M. So if you 
>>>>>> set
>>>>>>                 -Xss to > 2M it is
>>>>>>                     ignored for the main thread even if the true 
>>>>>> stack
>>>>>>                 is, say, 8M. This
>>>>>>                     limitation dates back 10-15 years and is no 
>>>>>> longer
>>>>>>                 relevant today
>>>>>>                     and should be removed (see below). I've also 
>>>>>> added
>>>>>>                 additional
>>>>>>                     explanatory notes.
>>>>>>
>>>>>>                     webrev:
>>>>>> http://cr.openjdk.java.net/~dholmes/8170307/webrev/
>>>>>> <http://cr.openjdk.java.net/~dholmes/8170307/webrev/>
>>>>>> <http://cr.openjdk.java.net/~dholmes/8170307/webrev/
>>>>>> <http://cr.openjdk.java.net/~dholmes/8170307/webrev/>>
>>>>>>
>>>>>>                     Testing was manually done by modifying the
>>>>>> launcher
>>>>>>                 to not run the
>>>>>>                     VM in a new thread, and checking the resulting
>>>>>> stack
>>>>>>                 size used.
>>>>>>
>>>>>>                     This change will only affect hosted JVMs 
>>>>>> launched
>>>>>>                 with a -Xss value
>>>>>>                     > 2M.
>>>>>>
>>>>>>                     Thanks,
>>>>>>                     David
>>>>>>                     -----
>>>>>>
>>>>>>                     Bug eval:
>>>>>>
>>>>>>                     JDK-4441425 limits the stack to 8M as a 
>>>>>> safeguard
>>>>>>                 against an
>>>>>>                     unlimited value from getrlimit in 1.3.1, but
>>>>>> further
>>>>>>                 constrained
>>>>>>                     that to 2M in 1.4.0 due to JDK-4466587.
>>>>>>
>>>>>>                     By 1.4.2 we have the basic form of the current
>>>>>>                 problematic code:
>>>>>>
>>>>>>                     #ifndef IA64
>>>>>>                       if (rlim.rlim_cur > 2 * K * K) rlim.rlim_cur =
>>>>>> 2 *
>>>>>>                 K * K;
>>>>>>                     #else
>>>>>>                       // Problem still exists RH7.2 (IA64 anyway) 
>>>>>> but
>>>>>>                 2MB is a little
>>>>>>                 small
>>>>>>                       if (rlim.rlim_cur > 4 * K * K) rlim.rlim_cur =
>>>>>> 4 *
>>>>>>                 K * K;
>>>>>>                     #endif
>>>>>>
>>>>>>                       _initial_thread_stack_size = rlim.rlim_cur &
>>>>>>                 ~(page_size() - 1);
>>>>>>
>>>>>>                       if (max_size && _initial_thread_stack_size >
>>>>>>                 max_size) {
>>>>>>                          _initial_thread_stack_size = max_size;
>>>>>>                       }
>>>>>>
>>>>>>                     This was added by JDK-4678676 to allow the
>>>>>> stack of
>>>>>>                 the main thread
>>>>>>                     to be _reduced_ below the default 2M/4M if the
>>>>>> -Xss
>>>>>>                 value was
>>>>>>                     smaller than that.** There was no intent to 
>>>>>> allow
>>>>>>                 the stack size to
>>>>>>                     follow -Xss arbitrarily due to the operational
>>>>>>                 constraints imposed
>>>>>>                     by the OS/glibc at the time when dealing with 
>>>>>> the
>>>>>>                 primordial process
>>>>>>                     thread.
>>>>>>
>>>>>>                     ** It could not actually change the actual stack
>>>>>>                 size of course, but
>>>>>>                     set the guard pages to limit use to the expected
>>>>>>                 stack size.
>>>>>>
>>>>>>                     In JDK 6, under JDK-6316197, the launcher was
>>>>>>                 changed to create the
>>>>>>                     JVM in a new thread, so that it was not 
>>>>>> limited by
>>>>>> the
>>>>>>                     idiosyncracies of the OS or thread library
>>>>>>                 primordial thread
>>>>>>                     handling. However, the stack size limitations
>>>>>>                 remained in place in
>>>>>>                     case the VM was launched from the primordial
>>>>>> thread
>>>>>>                 of a user
>>>>>>                     application via the JNI invocation API.
>>>>>>
>>>>>>                     I believe it should be safe to remove the 2M
>>>>>>                 limitation now.
>>>>>>
>>>>>>
>>>>>>
>>>