(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
Chris Plummer
chris.plummer at oracle.com
Sat Mar 18 06:37:59 UTC 2017
On 3/17/17 8:17 PM, Chris Plummer wrote:
> On 3/17/17 7:01 PM, David Holmes wrote:
>> On 18/03/2017 9:11 AM, Chris Plummer wrote:
>>> Looks like this will need some more work since the added asserts are
>>> triggering on mac os x (which is the only place we'd currently expect
>>> them to assert).
>>>
>>> The problem is that the code I found that rounds up to the page size is
>>> only applied to java threads created by the VM for which the java user
>>> specified no stack size. The VM and Compiler thread sizes are not
>>> rounded. The failure I saw was with
>>> runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java
>>> when is
>>> specified -XX:CompilerThreadStackSize=9007199254740991. I hit the
>>> assert
>>> with an EINVAL. The size is not aligned, but it could also be
>>> complaining because it is too big. I haven't tried aligning it yet
>>> to see.
>>>
>>> On Linux we do the following:
>>>
>>> stack_size = align_size_up(stack_size +
>>> os::Linux::default_guard_size(thr_type), vm_page_size());
>>>
>>> We don't do this on BSD. I think the same on BSD would solve this
>>> problem. I'm not sure about adding the guard size. I'll need to see if
>>> mac os x has the same pthread bug as linux does.
>>
>> At this stage I would only deal with alignment issues. IIRC the guard
>> issue only affected Linux.
> Yes, that's what I eventually concluded. I put the fix in
> os::Posix::get_initial_stack_size() in os_posix.cpp, and only did the
> page aligning, not add the guard page. That way all Posix ports are
> fixed in one place. It seems to be working.
>>
>>> BTW, did you know java users can specify the size of the a new thread's
>>> stack:
>>
>> Yes I mentioned that in another reply - wondering whether we suitably
>> check and aligned such requests.
> No we don't. Below I mentioned I was able to trigger the assert with a
> 257k stack size. I guess I wasn't clear that I did that from Java. I
> have a new test to add that will be testing this, plus the
> 9007199254740991 stack size (which fails to create the thread with an
> OOME, but that's acceptable). The fix I mention above in
> os::Posix::get_initial_stack_size() takes care of this issue also.
Rounding up triggers a new assert, this time on 32-bit x86 and arm.
I should have clarified it's 9007199254740991 "K", which is
9223372036854774784. Unfortunately on 32bit systems that is asserting
with EINVAL. I think we need to do a better job of dealing with 32-bit
size_t values:
jlong java_lang_Thread::stackSize(oop java_thread) {
if (_stackSize_offset > 0) {
return java_thread->long_field(_stackSize_offset);
} else {
return 0;
}
}
jlong size =
java_lang_Thread::stackSize(JNIHandles::resolve_non_null(jthread));
// Allocate the C++ Thread structure and create the native
thread. The
// stack size retrieved from java is signed, but the constructor
takes
// size_t (an unsigned type), so avoid passing negative values
which would
// result in really large stacks.
size_t sz = size > 0 ? (size_t) size : 0;
native_thread = new JavaThread(&thread_entry, sz);
9223372036854774784 is 0x7ffffffffffffc00 (close to 64 bit MAX_INT),
which is 0xfffffc00 when cast to a size_t on a 32-bit system (close to
32-bit MAX_UINT). Round it up to the 4k page size and you get 0, which I
guess pthread_attr_setstacksize() doesn't like. So I think more
processing of the size is needed here. Maybe in os::create_thread() we
should check for 0 after rounding up, and subtract the os page size if
it is 0. However, I think we should also avoid truncating on 32-bit to
what is basically some random number. Maybe if "size" (a jlong) is
greater than UINT_MAX, then just set "sz" (a size_t) it to UINT_MAX.
Chris
>>
>>> public Thread(ThreadGroup group, Runnable target, String name,
>>> long stackSize) {
>>> init(group, target, name, stackSize);
>>> }
>>>
>>> Fortunately we already force the stackSize to be at least
>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>> assert by creating a thread with size 257k.
>>
>> Note this means that OSX stack logic is broken because it will
>> currently be silently failing due to EINVAL!
> Yes, that is correct.
>
> Chris
>>
>>> I'll get another webrev out once I've made the needed fixes. I also
>>> have
>>> a new test I'd like to add.
>>
>> Ok.
>>
>> Thanks,
>> David
>>
>>> Chris
>>>
>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>> Ok, time for a new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot
>>>>
>>>> The only thing that has changed since the first webrev are the asserts
>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we discuss
>>>> already:
>>>>
>>>> - The assert should never happen due to the stack size being too
>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>> - The assert should never happen due to an unaligned stack size
>>>> because we always align it to the page size.
>>>> - Any assert would therefore be a VM bug and not due to user error.
>>>> - No fixing the java launcher. If the user specifies a stack that is
>>>> too small, hotspot will already detect this. If the user specifies a
>>>> stack size that is large enough but not page aligned, then we just
>>>> ignore any error (if the platform doth protest) and the user gets a
>>>> main thread with a stack size set to whatever the OS default is.
>>>>
>>>> I still need to retest (I only ran TooSmallStackSize.java), but figure
>>>> getting agreement on the changes first would be best before I bog down
>>>> our testing resources.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>> Hello,
>>>>>
>>>>> Please review the following:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot
>>>>>
>>>>>
>>>>> While working on 8175342 I noticed our stack size on xgene was 8mb
>>>>> even though I was specifying -Xss72k. It turns out the following code
>>>>> was failing:
>>>>>
>>>>> pthread_attr_setstacksize(&attr, stack_size);
>>>>>
>>>>> Although we computed a minimum stack size of 72k, so -Xss72k should
>>>>> be fine, pthreads on this platform requires the stack be at least
>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>> result is pthread_attr_setstacksize() had no impact on the thread's
>>>>> stack size, and we ended up with the platform default of 8mb. The fix
>>>>> is to round up the following variables to PTHREAD_STACK_MIN after
>>>>> computing their new values:
>>>>>
>>>>> _java_thread_min_stack_allowed
>>>>> _compiler_thread_min_stack_allowed
>>>>> _vm_internal_thread_min_stack_allowed
>>>>>
>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You need to
>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get PTHREAD_STACK_MIN
>>>>> #defined, and this needs to be done before including OS header files.
>>>>> I noticed that on solaris we were using thr_min_stack() elsewhere
>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same with this
>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>
>>>>> And speaking of the existing use of thr_min_stack(), I deleted it. It
>>>>> was being applied before any adjustments to the stack sizes had been
>>>>> made (rounding and adding red, yellow, and shadow zones). This mean
>>>>> the stack ended up being larger than necessary. With the above fix in
>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>> minimum stack sizes. If for any reason one of those stack sizes is
>>>>> now too small, the correct fix is to adjust the initial stack sizes,
>>>>> not apply thr_min_stack() to the initial stack sizes. However, it
>>>>> looks like no adjustment is needed. I did something close to our
>>>>> nightly testing on all affect platforms, and no new problems
>>>>> turned up.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>
>>>>
>>>
>
More information about the hotspot-dev
mailing list