(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads

Chris Plummer chris.plummer at oracle.com
Tue Mar 21 02:35:44 UTC 2017


On 3/20/17 6:45 PM, David Holmes wrote:
> On 21/03/2017 11:25 AM, Chris Plummer wrote:
>> Hi David,
>>
>> On 3/20/17 5:59 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> <trimming>
>>>
>>> Getting closer :)
>>>
>>> Have you seen pthread_attr_setstacksize return EINVAL on very large
>>> values? Or does pthread_create just give a ENOMEM?
>> It gives EAGAIN. The log for the new test contains:
>>
>> [0.416s][warning][os,thread] Failed to start thread - pthread_create
>> failed (EAGAIN) for attributes: stacksize: 9007199254740992k, guardsize:
>> 0k, detached.
>> Got exception for stack size 9223372036854774784:
>> java.lang.OutOfMemoryError: unable to create native thread: possibly out
>> of memory or process/resource limits reached
>>
>> I assume it's also throwing OOME, which the test is catching and
>> (correctly) ignoring.
>
> Yep that is all good. I was just wondering of 
> pthread_attr_setstacksize did any sanity checking of big values and 
> returned EINVAL - but it seems not.
>
>>>
>>> On 21/03/2017 9:29 AM, Chris Plummer wrote:
>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>>>>
>>>> Here's what's changed since webrev.01:
>>>>
>>>> os_posix.cpp: In os::Posix::get_initial_stack_size(), first round 
>>>> up the
>>>> stack size to be paged aligned. This fixes issues on Mac OS X (other
>>>> platforms seem to be immune to this). Then check if the size is zero
>>>> after rounding up to the page size. Subtract the page size in this 
>>>> case
>>>> to produce the maximum stack size allowed. Surprisingly I got no
>>>> complaint from gcc for subtracting from an unsigned value that is 
>>>> known
>>>> to be 0.
>>>
>>> I'm a little surprised at that too. :)
>>>
>>>> os_linux.cpp: In os::create_thread(), I also check here to make 
>>>> sure the
>>>> size is not 0 after adding the guard page and aligning up, and 
>>>> subtract
>>>> the os page size if it is 0.
>>>
>>> What if adding the guard page and rounding up causes overflow so that
>>> we get a very small positive stack size?
>> We already know the stack size is os page aligned up before getting
>> here, so actually we could just add the guard page and not align up. You
>> would get the same result.
>
> So we can get rid of the align_up here:
>
> 727   stack_size = align_size_up(stack_size + 
> os::Linux::default_guard_size(thr_type), vm_page_size());
I think so. I'm going to try the following:

   stack_size += os::Linux::default_guard_size(thr_type);
   assert(is_size_aligned(stack_size, os::vm_page_size()), "stack_size 
not aligned");

>
> But are we assuming the guard page is always one virtual-memory page?
Yes.

size_t os::Linux::default_guard_size(os::ThreadType thr_type) {
   // Creating guard page is very expensive. Java thread has HotSpot
   // guard pages, only enable glibc guard page for non-Java threads.
   // (Remember: compiler thread is a Java thread, too!)
   return ((thr_type == java_thread || thr_type == compiler_thread) ? 0 
: page_size());
}

> Or put another way: what is the difference between page_size() and 
> vm_page_size() ?
They are the same:

   static int page_size(void) { return _page_size; }
   static void set_page_size(int val) { _page_size = val; }
   Linux::set_page_size(sysconf(_SC_PAGESIZE));

int os::vm_page_size() {
   // Seems redundant as all get out
   assert(os::Linux::page_size() != -1, "must call os::init");
   return os::Linux::page_size();
}

I'm not sure why we have both.

thanks,

Chris
>
> Thanks,
> David
> -----
>
>>>
>>>> jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is 
>>>> greater
>>>> than UINT_MAX, then I set the size to UINT_MAX. Note it will later be
>>>> rounded up to 0 in os::Posix::get_initial_stack_size(), which will
>>>> result in subtracting the os page size to get the actual maximum 
>>>> allowed
>>>> stack size.
>>>
>>> Good catch!
>>>
>>> Nit: "-Avoid"  -> "- Avoid"
>> Ok.
>>>
>>>>
>>>> TooSmallStackSize.java: added test case for unaligned stack sizes.
>>>
>>> Ok.
>>>
>>>> TestThreadStackSizes.java: New test. Creates new threads with every 
>>>> size
>>>> up to 320k in 1k increments. Then creates threads with a few other 
>>>> sizes
>>>> that can be problematic.
>>>
>>> So this test never actually fails as long as the VM doesn't crash - is
>>> that right?
>> Yes.
>>>
>>>   27  * @modules java.base/jdk.internal.misc
>>>   28  * @library /test/lib
>>>
>>> The above are not needed by this test.
>> Ok.
>>
>> thanks,
>>
>> Chris
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>> Chris
>>>>>
>>>>>
>>>>>>>
>>>>>>>>     public Thread(ThreadGroup group, Runnable target, String name,
>>>>>>>>                   long stackSize) {
>>>>>>>>         init(group, target, name, stackSize);
>>>>>>>>     }
>>>>>>>>
>>>>>>>> Fortunately we already force the stackSize to be at least
>>>>>>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>>>>>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>>>>>>> assert by creating a thread with size 257k.
>>>>>>>
>>>>>>> Note this means that OSX stack logic is broken because it will
>>>>>>> currently be silently failing due to EINVAL!
>>>>>> Yes, that is correct.
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>>> I'll get another webrev out once I've made the needed fixes. I 
>>>>>>>> also
>>>>>>>> have
>>>>>>>> a new test I'd like to add.
>>>>>>>
>>>>>>> Ok.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>>>>>>> Ok, time for a new webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The only thing that has changed since the first webrev are the
>>>>>>>>> asserts
>>>>>>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we
>>>>>>>>> discuss
>>>>>>>>> already:
>>>>>>>>>
>>>>>>>>>  - The assert should never happen due to the stack size being too
>>>>>>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>>>>>>>  - The assert should never happen due to an unaligned stack size
>>>>>>>>> because we always align it to the page size.
>>>>>>>>>  - Any assert would therefore be a VM bug and not due to user
>>>>>>>>> error.
>>>>>>>>>  - No fixing the java launcher. If the user specifies a stack
>>>>>>>>> that is
>>>>>>>>> too small, hotspot will already detect this. If the user
>>>>>>>>> specifies a
>>>>>>>>> stack size that is large enough but not page aligned, then we 
>>>>>>>>> just
>>>>>>>>> ignore any error (if the platform doth protest) and the user 
>>>>>>>>> gets a
>>>>>>>>> main thread with a stack size set to whatever the OS default is.
>>>>>>>>>
>>>>>>>>> I still need to retest (I only ran TooSmallStackSize.java), but
>>>>>>>>> figure
>>>>>>>>> getting agreement on the changes first would be best before I bog
>>>>>>>>> down
>>>>>>>>> our testing resources.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Please review the following:
>>>>>>>>>>
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> While working on 8175342 I noticed our stack size on xgene 
>>>>>>>>>> was 8mb
>>>>>>>>>> even though I was specifying -Xss72k. It turns out the following
>>>>>>>>>> code
>>>>>>>>>> was failing:
>>>>>>>>>>
>>>>>>>>>>       pthread_attr_setstacksize(&attr, stack_size);
>>>>>>>>>>
>>>>>>>>>> Although we computed a minimum stack size of 72k, so -Xss72k
>>>>>>>>>> should
>>>>>>>>>> be fine, pthreads on this platform requires the stack be at 
>>>>>>>>>> least
>>>>>>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>>>>>>> result is pthread_attr_setstacksize() had no impact on the
>>>>>>>>>> thread's
>>>>>>>>>> stack size, and we ended up with the platform default of 8mb. 
>>>>>>>>>> The
>>>>>>>>>> fix
>>>>>>>>>> is to round up the following variables to PTHREAD_STACK_MIN 
>>>>>>>>>> after
>>>>>>>>>> computing their new values:
>>>>>>>>>>
>>>>>>>>>>       _java_thread_min_stack_allowed
>>>>>>>>>>       _compiler_thread_min_stack_allowed
>>>>>>>>>>       _vm_internal_thread_min_stack_allowed
>>>>>>>>>>
>>>>>>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You
>>>>>>>>>> need to
>>>>>>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get
>>>>>>>>>> PTHREAD_STACK_MIN
>>>>>>>>>> #defined, and this needs to be done before including OS header
>>>>>>>>>> files.
>>>>>>>>>> I noticed that on solaris we were using thr_min_stack() 
>>>>>>>>>> elsewhere
>>>>>>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same with
>>>>>>>>>> this
>>>>>>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>>>>>>
>>>>>>>>>> And speaking of the existing use of thr_min_stack(), I deleted
>>>>>>>>>> it. It
>>>>>>>>>> was being applied before any adjustments to the stack sizes had
>>>>>>>>>> been
>>>>>>>>>> made (rounding and adding red, yellow, and shadow zones). This
>>>>>>>>>> mean
>>>>>>>>>> the stack ended up being larger than necessary. With the above
>>>>>>>>>> fix in
>>>>>>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>>>>>>> minimum stack sizes. If for any reason one of those stack 
>>>>>>>>>> sizes is
>>>>>>>>>> now too small, the correct fix is to adjust the initial stack
>>>>>>>>>> sizes,
>>>>>>>>>> not apply thr_min_stack() to the initial stack sizes. 
>>>>>>>>>> However, it
>>>>>>>>>> looks like no adjustment is needed. I did something close to our
>>>>>>>>>> nightly testing on all affect platforms, and no new problems
>>>>>>>>>> turned up.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>



More information about the hotspot-dev mailing list