(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
Chris Plummer
chris.plummer at oracle.com
Tue Mar 21 02:35:44 UTC 2017
On 3/20/17 6:45 PM, David Holmes wrote:
> On 21/03/2017 11:25 AM, Chris Plummer wrote:
>> Hi David,
>>
>> On 3/20/17 5:59 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> <trimming>
>>>
>>> Getting closer :)
>>>
>>> Have you seen pthread_attr_setstacksize return EINVAL on very large
>>> values? Or does pthread_create just give a ENOMEM?
>> It gives EAGAIN. The log for the new test contains:
>>
>> [0.416s][warning][os,thread] Failed to start thread - pthread_create
>> failed (EAGAIN) for attributes: stacksize: 9007199254740992k, guardsize:
>> 0k, detached.
>> Got exception for stack size 9223372036854774784:
>> java.lang.OutOfMemoryError: unable to create native thread: possibly out
>> of memory or process/resource limits reached
>>
>> I assume it's also throwing OOME, which the test is catching and
>> (correctly) ignoring.
>
> Yep that is all good. I was just wondering of
> pthread_attr_setstacksize did any sanity checking of big values and
> returned EINVAL - but it seems not.
>
>>>
>>> On 21/03/2017 9:29 AM, Chris Plummer wrote:
>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>>>>
>>>> Here's what's changed since webrev.01:
>>>>
>>>> os_posix.cpp: In os::Posix::get_initial_stack_size(), first round
>>>> up the
>>>> stack size to be paged aligned. This fixes issues on Mac OS X (other
>>>> platforms seem to be immune to this). Then check if the size is zero
>>>> after rounding up to the page size. Subtract the page size in this
>>>> case
>>>> to produce the maximum stack size allowed. Surprisingly I got no
>>>> complaint from gcc for subtracting from an unsigned value that is
>>>> known
>>>> to be 0.
>>>
>>> I'm a little surprised at that too. :)
>>>
>>>> os_linux.cpp: In os::create_thread(), I also check here to make
>>>> sure the
>>>> size is not 0 after adding the guard page and aligning up, and
>>>> subtract
>>>> the os page size if it is 0.
>>>
>>> What if adding the guard page and rounding up causes overflow so that
>>> we get a very small positive stack size?
>> We already know the stack size is os page aligned up before getting
>> here, so actually we could just add the guard page and not align up. You
>> would get the same result.
>
> So we can get rid of the align_up here:
>
> 727 stack_size = align_size_up(stack_size +
> os::Linux::default_guard_size(thr_type), vm_page_size());
I think so. I'm going to try the following:
stack_size += os::Linux::default_guard_size(thr_type);
assert(is_size_aligned(stack_size, os::vm_page_size()), "stack_size
not aligned");
>
> But are we assuming the guard page is always one virtual-memory page?
Yes.
size_t os::Linux::default_guard_size(os::ThreadType thr_type) {
// Creating guard page is very expensive. Java thread has HotSpot
// guard pages, only enable glibc guard page for non-Java threads.
// (Remember: compiler thread is a Java thread, too!)
return ((thr_type == java_thread || thr_type == compiler_thread) ? 0
: page_size());
}
> Or put another way: what is the difference between page_size() and
> vm_page_size() ?
They are the same:
static int page_size(void) { return _page_size; }
static void set_page_size(int val) { _page_size = val; }
Linux::set_page_size(sysconf(_SC_PAGESIZE));
int os::vm_page_size() {
// Seems redundant as all get out
assert(os::Linux::page_size() != -1, "must call os::init");
return os::Linux::page_size();
}
I'm not sure why we have both.
thanks,
Chris
>
> Thanks,
> David
> -----
>
>>>
>>>> jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is
>>>> greater
>>>> than UINT_MAX, then I set the size to UINT_MAX. Note it will later be
>>>> rounded up to 0 in os::Posix::get_initial_stack_size(), which will
>>>> result in subtracting the os page size to get the actual maximum
>>>> allowed
>>>> stack size.
>>>
>>> Good catch!
>>>
>>> Nit: "-Avoid" -> "- Avoid"
>> Ok.
>>>
>>>>
>>>> TooSmallStackSize.java: added test case for unaligned stack sizes.
>>>
>>> Ok.
>>>
>>>> TestThreadStackSizes.java: New test. Creates new threads with every
>>>> size
>>>> up to 320k in 1k increments. Then creates threads with a few other
>>>> sizes
>>>> that can be problematic.
>>>
>>> So this test never actually fails as long as the VM doesn't crash - is
>>> that right?
>> Yes.
>>>
>>> 27 * @modules java.base/jdk.internal.misc
>>> 28 * @library /test/lib
>>>
>>> The above are not needed by this test.
>> Ok.
>>
>> thanks,
>>
>> Chris
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>> Chris
>>>>>
>>>>>
>>>>>>>
>>>>>>>> public Thread(ThreadGroup group, Runnable target, String name,
>>>>>>>> long stackSize) {
>>>>>>>> init(group, target, name, stackSize);
>>>>>>>> }
>>>>>>>>
>>>>>>>> Fortunately we already force the stackSize to be at least
>>>>>>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>>>>>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>>>>>>> assert by creating a thread with size 257k.
>>>>>>>
>>>>>>> Note this means that OSX stack logic is broken because it will
>>>>>>> currently be silently failing due to EINVAL!
>>>>>> Yes, that is correct.
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>>> I'll get another webrev out once I've made the needed fixes. I
>>>>>>>> also
>>>>>>>> have
>>>>>>>> a new test I'd like to add.
>>>>>>>
>>>>>>> Ok.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>>>>>>> Ok, time for a new webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The only thing that has changed since the first webrev are the
>>>>>>>>> asserts
>>>>>>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we
>>>>>>>>> discuss
>>>>>>>>> already:
>>>>>>>>>
>>>>>>>>> - The assert should never happen due to the stack size being too
>>>>>>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>>>>>>> - The assert should never happen due to an unaligned stack size
>>>>>>>>> because we always align it to the page size.
>>>>>>>>> - Any assert would therefore be a VM bug and not due to user
>>>>>>>>> error.
>>>>>>>>> - No fixing the java launcher. If the user specifies a stack
>>>>>>>>> that is
>>>>>>>>> too small, hotspot will already detect this. If the user
>>>>>>>>> specifies a
>>>>>>>>> stack size that is large enough but not page aligned, then we
>>>>>>>>> just
>>>>>>>>> ignore any error (if the platform doth protest) and the user
>>>>>>>>> gets a
>>>>>>>>> main thread with a stack size set to whatever the OS default is.
>>>>>>>>>
>>>>>>>>> I still need to retest (I only ran TooSmallStackSize.java), but
>>>>>>>>> figure
>>>>>>>>> getting agreement on the changes first would be best before I bog
>>>>>>>>> down
>>>>>>>>> our testing resources.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Please review the following:
>>>>>>>>>>
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> While working on 8175342 I noticed our stack size on xgene
>>>>>>>>>> was 8mb
>>>>>>>>>> even though I was specifying -Xss72k. It turns out the following
>>>>>>>>>> code
>>>>>>>>>> was failing:
>>>>>>>>>>
>>>>>>>>>> pthread_attr_setstacksize(&attr, stack_size);
>>>>>>>>>>
>>>>>>>>>> Although we computed a minimum stack size of 72k, so -Xss72k
>>>>>>>>>> should
>>>>>>>>>> be fine, pthreads on this platform requires the stack be at
>>>>>>>>>> least
>>>>>>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>>>>>>> result is pthread_attr_setstacksize() had no impact on the
>>>>>>>>>> thread's
>>>>>>>>>> stack size, and we ended up with the platform default of 8mb.
>>>>>>>>>> The
>>>>>>>>>> fix
>>>>>>>>>> is to round up the following variables to PTHREAD_STACK_MIN
>>>>>>>>>> after
>>>>>>>>>> computing their new values:
>>>>>>>>>>
>>>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>>> _compiler_thread_min_stack_allowed
>>>>>>>>>> _vm_internal_thread_min_stack_allowed
>>>>>>>>>>
>>>>>>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You
>>>>>>>>>> need to
>>>>>>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get
>>>>>>>>>> PTHREAD_STACK_MIN
>>>>>>>>>> #defined, and this needs to be done before including OS header
>>>>>>>>>> files.
>>>>>>>>>> I noticed that on solaris we were using thr_min_stack()
>>>>>>>>>> elsewhere
>>>>>>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same with
>>>>>>>>>> this
>>>>>>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>>>>>>
>>>>>>>>>> And speaking of the existing use of thr_min_stack(), I deleted
>>>>>>>>>> it. It
>>>>>>>>>> was being applied before any adjustments to the stack sizes had
>>>>>>>>>> been
>>>>>>>>>> made (rounding and adding red, yellow, and shadow zones). This
>>>>>>>>>> mean
>>>>>>>>>> the stack ended up being larger than necessary. With the above
>>>>>>>>>> fix in
>>>>>>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>>>>>>> minimum stack sizes. If for any reason one of those stack
>>>>>>>>>> sizes is
>>>>>>>>>> now too small, the correct fix is to adjust the initial stack
>>>>>>>>>> sizes,
>>>>>>>>>> not apply thr_min_stack() to the initial stack sizes.
>>>>>>>>>> However, it
>>>>>>>>>> looks like no adjustment is needed. I did something close to our
>>>>>>>>>> nightly testing on all affect platforms, and no new problems
>>>>>>>>>> turned up.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>
More information about the hotspot-dev
mailing list