(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
David Holmes
david.holmes at oracle.com
Tue Mar 21 01:45:42 UTC 2017
On 21/03/2017 11:25 AM, Chris Plummer wrote:
> Hi David,
>
> On 3/20/17 5:59 PM, David Holmes wrote:
>> Hi Chris,
>>
>> <trimming>
>>
>> Getting closer :)
>>
>> Have you seen pthread_attr_setstacksize return EINVAL on very large
>> values? Or does pthread_create just give a ENOMEM?
> It gives EAGAIN. The log for the new test contains:
>
> [0.416s][warning][os,thread] Failed to start thread - pthread_create
> failed (EAGAIN) for attributes: stacksize: 9007199254740992k, guardsize:
> 0k, detached.
> Got exception for stack size 9223372036854774784:
> java.lang.OutOfMemoryError: unable to create native thread: possibly out
> of memory or process/resource limits reached
>
> I assume it's also throwing OOME, which the test is catching and
> (correctly) ignoring.
Yep that is all good. I was just wondering of pthread_attr_setstacksize
did any sanity checking of big values and returned EINVAL - but it seems
not.
>>
>> On 21/03/2017 9:29 AM, Chris Plummer wrote:
>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>>>
>>> Here's what's changed since webrev.01:
>>>
>>> os_posix.cpp: In os::Posix::get_initial_stack_size(), first round up the
>>> stack size to be paged aligned. This fixes issues on Mac OS X (other
>>> platforms seem to be immune to this). Then check if the size is zero
>>> after rounding up to the page size. Subtract the page size in this case
>>> to produce the maximum stack size allowed. Surprisingly I got no
>>> complaint from gcc for subtracting from an unsigned value that is known
>>> to be 0.
>>
>> I'm a little surprised at that too. :)
>>
>>> os_linux.cpp: In os::create_thread(), I also check here to make sure the
>>> size is not 0 after adding the guard page and aligning up, and subtract
>>> the os page size if it is 0.
>>
>> What if adding the guard page and rounding up causes overflow so that
>> we get a very small positive stack size?
> We already know the stack size is os page aligned up before getting
> here, so actually we could just add the guard page and not align up. You
> would get the same result.
So we can get rid of the align_up here:
727 stack_size = align_size_up(stack_size +
os::Linux::default_guard_size(thr_type), vm_page_size());
But are we assuming the guard page is always one virtual-memory page? Or
put another way: what is the difference between page_size() and
vm_page_size() ?
Thanks,
David
-----
>>
>>> jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is greater
>>> than UINT_MAX, then I set the size to UINT_MAX. Note it will later be
>>> rounded up to 0 in os::Posix::get_initial_stack_size(), which will
>>> result in subtracting the os page size to get the actual maximum allowed
>>> stack size.
>>
>> Good catch!
>>
>> Nit: "-Avoid" -> "- Avoid"
> Ok.
>>
>>>
>>> TooSmallStackSize.java: added test case for unaligned stack sizes.
>>
>> Ok.
>>
>>> TestThreadStackSizes.java: New test. Creates new threads with every size
>>> up to 320k in 1k increments. Then creates threads with a few other sizes
>>> that can be problematic.
>>
>> So this test never actually fails as long as the VM doesn't crash - is
>> that right?
> Yes.
>>
>> 27 * @modules java.base/jdk.internal.misc
>> 28 * @library /test/lib
>>
>> The above are not needed by this test.
> Ok.
>
> thanks,
>
> Chris
>>
>> Thanks,
>> David
>>
>>> thanks,
>>>
>>> Chris
>>>> Chris
>>>>
>>>>
>>>>>>
>>>>>>> public Thread(ThreadGroup group, Runnable target, String name,
>>>>>>> long stackSize) {
>>>>>>> init(group, target, name, stackSize);
>>>>>>> }
>>>>>>>
>>>>>>> Fortunately we already force the stackSize to be at least
>>>>>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>>>>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>>>>>> assert by creating a thread with size 257k.
>>>>>>
>>>>>> Note this means that OSX stack logic is broken because it will
>>>>>> currently be silently failing due to EINVAL!
>>>>> Yes, that is correct.
>>>>>
>>>>> Chris
>>>>>>
>>>>>>> I'll get another webrev out once I've made the needed fixes. I also
>>>>>>> have
>>>>>>> a new test I'd like to add.
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>>>>>> Ok, time for a new webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The only thing that has changed since the first webrev are the
>>>>>>>> asserts
>>>>>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we
>>>>>>>> discuss
>>>>>>>> already:
>>>>>>>>
>>>>>>>> - The assert should never happen due to the stack size being too
>>>>>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>>>>>> - The assert should never happen due to an unaligned stack size
>>>>>>>> because we always align it to the page size.
>>>>>>>> - Any assert would therefore be a VM bug and not due to user
>>>>>>>> error.
>>>>>>>> - No fixing the java launcher. If the user specifies a stack
>>>>>>>> that is
>>>>>>>> too small, hotspot will already detect this. If the user
>>>>>>>> specifies a
>>>>>>>> stack size that is large enough but not page aligned, then we just
>>>>>>>> ignore any error (if the platform doth protest) and the user gets a
>>>>>>>> main thread with a stack size set to whatever the OS default is.
>>>>>>>>
>>>>>>>> I still need to retest (I only ran TooSmallStackSize.java), but
>>>>>>>> figure
>>>>>>>> getting agreement on the changes first would be best before I bog
>>>>>>>> down
>>>>>>>> our testing resources.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Please review the following:
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> While working on 8175342 I noticed our stack size on xgene was 8mb
>>>>>>>>> even though I was specifying -Xss72k. It turns out the following
>>>>>>>>> code
>>>>>>>>> was failing:
>>>>>>>>>
>>>>>>>>> pthread_attr_setstacksize(&attr, stack_size);
>>>>>>>>>
>>>>>>>>> Although we computed a minimum stack size of 72k, so -Xss72k
>>>>>>>>> should
>>>>>>>>> be fine, pthreads on this platform requires the stack be at least
>>>>>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>>>>>> result is pthread_attr_setstacksize() had no impact on the
>>>>>>>>> thread's
>>>>>>>>> stack size, and we ended up with the platform default of 8mb. The
>>>>>>>>> fix
>>>>>>>>> is to round up the following variables to PTHREAD_STACK_MIN after
>>>>>>>>> computing their new values:
>>>>>>>>>
>>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>>> _compiler_thread_min_stack_allowed
>>>>>>>>> _vm_internal_thread_min_stack_allowed
>>>>>>>>>
>>>>>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You
>>>>>>>>> need to
>>>>>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get
>>>>>>>>> PTHREAD_STACK_MIN
>>>>>>>>> #defined, and this needs to be done before including OS header
>>>>>>>>> files.
>>>>>>>>> I noticed that on solaris we were using thr_min_stack() elsewhere
>>>>>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same with
>>>>>>>>> this
>>>>>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>>>>>
>>>>>>>>> And speaking of the existing use of thr_min_stack(), I deleted
>>>>>>>>> it. It
>>>>>>>>> was being applied before any adjustments to the stack sizes had
>>>>>>>>> been
>>>>>>>>> made (rounding and adding red, yellow, and shadow zones). This
>>>>>>>>> mean
>>>>>>>>> the stack ended up being larger than necessary. With the above
>>>>>>>>> fix in
>>>>>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>>>>>> minimum stack sizes. If for any reason one of those stack sizes is
>>>>>>>>> now too small, the correct fix is to adjust the initial stack
>>>>>>>>> sizes,
>>>>>>>>> not apply thr_min_stack() to the initial stack sizes. However, it
>>>>>>>>> looks like no adjustment is needed. I did something close to our
>>>>>>>>> nightly testing on all affect platforms, and no new problems
>>>>>>>>> turned up.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>
More information about the hotspot-dev
mailing list