(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
David Holmes
david.holmes at oracle.com
Tue Mar 21 03:47:05 UTC 2017
On 21/03/2017 12:35 PM, Chris Plummer wrote:
> On 3/20/17 6:45 PM, David Holmes wrote:
>> On 21/03/2017 11:25 AM, Chris Plummer wrote:
>>> Hi David,
>>>
>>> On 3/20/17 5:59 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> <trimming>
>>>>
>>>> Getting closer :)
>>>>
>>>> Have you seen pthread_attr_setstacksize return EINVAL on very large
>>>> values? Or does pthread_create just give a ENOMEM?
>>> It gives EAGAIN. The log for the new test contains:
>>>
>>> [0.416s][warning][os,thread] Failed to start thread - pthread_create
>>> failed (EAGAIN) for attributes: stacksize: 9007199254740992k, guardsize:
>>> 0k, detached.
>>> Got exception for stack size 9223372036854774784:
>>> java.lang.OutOfMemoryError: unable to create native thread: possibly out
>>> of memory or process/resource limits reached
>>>
>>> I assume it's also throwing OOME, which the test is catching and
>>> (correctly) ignoring.
>>
>> Yep that is all good. I was just wondering of
>> pthread_attr_setstacksize did any sanity checking of big values and
>> returned EINVAL - but it seems not.
>>
>>>>
>>>> On 21/03/2017 9:29 AM, Chris Plummer wrote:
>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>>>>>
>>>>> Here's what's changed since webrev.01:
>>>>>
>>>>> os_posix.cpp: In os::Posix::get_initial_stack_size(), first round
>>>>> up the
>>>>> stack size to be paged aligned. This fixes issues on Mac OS X (other
>>>>> platforms seem to be immune to this). Then check if the size is zero
>>>>> after rounding up to the page size. Subtract the page size in this
>>>>> case
>>>>> to produce the maximum stack size allowed. Surprisingly I got no
>>>>> complaint from gcc for subtracting from an unsigned value that is
>>>>> known
>>>>> to be 0.
>>>>
>>>> I'm a little surprised at that too. :)
>>>>
>>>>> os_linux.cpp: In os::create_thread(), I also check here to make
>>>>> sure the
>>>>> size is not 0 after adding the guard page and aligning up, and
>>>>> subtract
>>>>> the os page size if it is 0.
>>>>
>>>> What if adding the guard page and rounding up causes overflow so that
>>>> we get a very small positive stack size?
>>> We already know the stack size is os page aligned up before getting
>>> here, so actually we could just add the guard page and not align up. You
>>> would get the same result.
>>
>> So we can get rid of the align_up here:
>>
>> 727 stack_size = align_size_up(stack_size +
>> os::Linux::default_guard_size(thr_type), vm_page_size());
> I think so. I'm going to try the following:
>
> stack_size += os::Linux::default_guard_size(thr_type);
> assert(is_size_aligned(stack_size, os::vm_page_size()), "stack_size
> not aligned");
Looks good.
>>
>> But are we assuming the guard page is always one virtual-memory page?
> Yes.
>
> size_t os::Linux::default_guard_size(os::ThreadType thr_type) {
> // Creating guard page is very expensive. Java thread has HotSpot
> // guard pages, only enable glibc guard page for non-Java threads.
> // (Remember: compiler thread is a Java thread, too!)
> return ((thr_type == java_thread || thr_type == compiler_thread) ? 0 :
> page_size());
> }
>
>> Or put another way: what is the difference between page_size() and
>> vm_page_size() ?
> They are the same:
>
> static int page_size(void) { return _page_size; }
> static void set_page_size(int val) { _page_size = val; }
> Linux::set_page_size(sysconf(_SC_PAGESIZE));
>
> int os::vm_page_size() {
> // Seems redundant as all get out
> assert(os::Linux::page_size() != -1, "must call os::init");
> return os::Linux::page_size();
> }
>
> I'm not sure why we have both.
Okay. Yeah no reason to actually define os::Linux::page_size() when we
could just have a static _page_size variable in os_linux.cpp outside any
class.
Thanks,
David
-----
> thanks,
>
> Chris
>>
>> Thanks,
>> David
>> -----
>>
>>>>
>>>>> jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is
>>>>> greater
>>>>> than UINT_MAX, then I set the size to UINT_MAX. Note it will later be
>>>>> rounded up to 0 in os::Posix::get_initial_stack_size(), which will
>>>>> result in subtracting the os page size to get the actual maximum
>>>>> allowed
>>>>> stack size.
>>>>
>>>> Good catch!
>>>>
>>>> Nit: "-Avoid" -> "- Avoid"
>>> Ok.
>>>>
>>>>>
>>>>> TooSmallStackSize.java: added test case for unaligned stack sizes.
>>>>
>>>> Ok.
>>>>
>>>>> TestThreadStackSizes.java: New test. Creates new threads with every
>>>>> size
>>>>> up to 320k in 1k increments. Then creates threads with a few other
>>>>> sizes
>>>>> that can be problematic.
>>>>
>>>> So this test never actually fails as long as the VM doesn't crash - is
>>>> that right?
>>> Yes.
>>>>
>>>> 27 * @modules java.base/jdk.internal.misc
>>>> 28 * @library /test/lib
>>>>
>>>> The above are not needed by this test.
>>> Ok.
>>>
>>> thanks,
>>>
>>> Chris
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>> public Thread(ThreadGroup group, Runnable target, String name,
>>>>>>>>> long stackSize) {
>>>>>>>>> init(group, target, name, stackSize);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Fortunately we already force the stackSize to be at least
>>>>>>>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>>>>>>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>>>>>>>> assert by creating a thread with size 257k.
>>>>>>>>
>>>>>>>> Note this means that OSX stack logic is broken because it will
>>>>>>>> currently be silently failing due to EINVAL!
>>>>>>> Yes, that is correct.
>>>>>>>
>>>>>>> Chris
>>>>>>>>
>>>>>>>>> I'll get another webrev out once I've made the needed fixes. I
>>>>>>>>> also
>>>>>>>>> have
>>>>>>>>> a new test I'd like to add.
>>>>>>>>
>>>>>>>> Ok.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>>>>>>>> Ok, time for a new webrev:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The only thing that has changed since the first webrev are the
>>>>>>>>>> asserts
>>>>>>>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we
>>>>>>>>>> discuss
>>>>>>>>>> already:
>>>>>>>>>>
>>>>>>>>>> - The assert should never happen due to the stack size being too
>>>>>>>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>>>>>>>> - The assert should never happen due to an unaligned stack size
>>>>>>>>>> because we always align it to the page size.
>>>>>>>>>> - Any assert would therefore be a VM bug and not due to user
>>>>>>>>>> error.
>>>>>>>>>> - No fixing the java launcher. If the user specifies a stack
>>>>>>>>>> that is
>>>>>>>>>> too small, hotspot will already detect this. If the user
>>>>>>>>>> specifies a
>>>>>>>>>> stack size that is large enough but not page aligned, then we
>>>>>>>>>> just
>>>>>>>>>> ignore any error (if the platform doth protest) and the user
>>>>>>>>>> gets a
>>>>>>>>>> main thread with a stack size set to whatever the OS default is.
>>>>>>>>>>
>>>>>>>>>> I still need to retest (I only ran TooSmallStackSize.java), but
>>>>>>>>>> figure
>>>>>>>>>> getting agreement on the changes first would be best before I bog
>>>>>>>>>> down
>>>>>>>>>> our testing resources.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Please review the following:
>>>>>>>>>>>
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> While working on 8175342 I noticed our stack size on xgene
>>>>>>>>>>> was 8mb
>>>>>>>>>>> even though I was specifying -Xss72k. It turns out the following
>>>>>>>>>>> code
>>>>>>>>>>> was failing:
>>>>>>>>>>>
>>>>>>>>>>> pthread_attr_setstacksize(&attr, stack_size);
>>>>>>>>>>>
>>>>>>>>>>> Although we computed a minimum stack size of 72k, so -Xss72k
>>>>>>>>>>> should
>>>>>>>>>>> be fine, pthreads on this platform requires the stack be at
>>>>>>>>>>> least
>>>>>>>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>>>>>>>> result is pthread_attr_setstacksize() had no impact on the
>>>>>>>>>>> thread's
>>>>>>>>>>> stack size, and we ended up with the platform default of 8mb.
>>>>>>>>>>> The
>>>>>>>>>>> fix
>>>>>>>>>>> is to round up the following variables to PTHREAD_STACK_MIN
>>>>>>>>>>> after
>>>>>>>>>>> computing their new values:
>>>>>>>>>>>
>>>>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>>>> _compiler_thread_min_stack_allowed
>>>>>>>>>>> _vm_internal_thread_min_stack_allowed
>>>>>>>>>>>
>>>>>>>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You
>>>>>>>>>>> need to
>>>>>>>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get
>>>>>>>>>>> PTHREAD_STACK_MIN
>>>>>>>>>>> #defined, and this needs to be done before including OS header
>>>>>>>>>>> files.
>>>>>>>>>>> I noticed that on solaris we were using thr_min_stack()
>>>>>>>>>>> elsewhere
>>>>>>>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same with
>>>>>>>>>>> this
>>>>>>>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>>>>>>>
>>>>>>>>>>> And speaking of the existing use of thr_min_stack(), I deleted
>>>>>>>>>>> it. It
>>>>>>>>>>> was being applied before any adjustments to the stack sizes had
>>>>>>>>>>> been
>>>>>>>>>>> made (rounding and adding red, yellow, and shadow zones). This
>>>>>>>>>>> mean
>>>>>>>>>>> the stack ended up being larger than necessary. With the above
>>>>>>>>>>> fix in
>>>>>>>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>>>>>>>> minimum stack sizes. If for any reason one of those stack
>>>>>>>>>>> sizes is
>>>>>>>>>>> now too small, the correct fix is to adjust the initial stack
>>>>>>>>>>> sizes,
>>>>>>>>>>> not apply thr_min_stack() to the initial stack sizes.
>>>>>>>>>>> However, it
>>>>>>>>>>> looks like no adjustment is needed. I did something close to our
>>>>>>>>>>> nightly testing on all affect platforms, and no new problems
>>>>>>>>>>> turned up.
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>
More information about the hotspot-dev
mailing list