(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Mar 21 19:51:36 UTC 2017
> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
src/os/aix/vm/os_aix.cpp
No comments.
src/os/bsd/vm/os_bsd.cpp
No comments.
src/os/linux/vm/os_linux.cpp
L729: // In that cast just subract the page size to get the
maximum possible stack size.
Typo: 'cast' -> 'case'
Typo: 'subract' -> 'subtract' (Thomas also commented on it)
src/os/posix/vm/os_posix.cpp
L263: // aligning up could have resulted in the size being 0. In
that case just subract the
Nit: 'aligning' -> 'Aligning' (since it's a sentence)
Typo: 'subract' -> 'subtract'
src/os/solaris/vm/os_solaris.cpp
No comments.
src/share/vm/prims/jvm.cpp
L2812: // -Avoid truncating on 32-bit platforms if size is
greater than UINT_MAX
Nit: needs a period at the end like L2813.
test/runtime/Thread/TooSmallStackSize.java
No comments.
test/runtime/Thread/TestThreadStackSizes.java
L26: * @summary Test user threads with various stacks sizes.
Typo?: "stacks sizes" -> "stack sizes"
L36: super(null, null, "TestThreadStackSizes"+stackSize,
stackSize);
Nit: spaces around the "+".
L46: TestThreadStackSizes testThreadStackSize = new
TestThreadStackSizes(stackSize);
Nit: extra space before '='.
So this test makes 326 createThread() calls... how long does
it take to run?
Thumbs up! I don't need to see another webrev if you choose to
fix these minor typos...
Dan
On 3/20/17 5:29 PM, Chris Plummer wrote:
> On 3/17/17 11:37 PM, Chris Plummer wrote:
>> On 3/17/17 8:17 PM, Chris Plummer wrote:
>>> On 3/17/17 7:01 PM, David Holmes wrote:
>>>> On 18/03/2017 9:11 AM, Chris Plummer wrote:
>>>>> Looks like this will need some more work since the added asserts are
>>>>> triggering on mac os x (which is the only place we'd currently expect
>>>>> them to assert).
>>>>>
>>>>> The problem is that the code I found that rounds up to the page
>>>>> size is
>>>>> only applied to java threads created by the VM for which the java
>>>>> user
>>>>> specified no stack size. The VM and Compiler thread sizes are not
>>>>> rounded. The failure I saw was with
>>>>> runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java
>>>>> when is
>>>>> specified -XX:CompilerThreadStackSize=9007199254740991. I hit the
>>>>> assert
>>>>> with an EINVAL. The size is not aligned, but it could also be
>>>>> complaining because it is too big. I haven't tried aligning it yet
>>>>> to see.
>>>>>
>>>>> On Linux we do the following:
>>>>>
>>>>> stack_size = align_size_up(stack_size +
>>>>> os::Linux::default_guard_size(thr_type), vm_page_size());
>>>>>
>>>>> We don't do this on BSD. I think the same on BSD would solve this
>>>>> problem. I'm not sure about adding the guard size. I'll need to
>>>>> see if
>>>>> mac os x has the same pthread bug as linux does.
>>>>
>>>> At this stage I would only deal with alignment issues. IIRC the
>>>> guard issue only affected Linux.
>>> Yes, that's what I eventually concluded. I put the fix in
>>> os::Posix::get_initial_stack_size() in os_posix.cpp, and only did
>>> the page aligning, not add the guard page. That way all Posix ports
>>> are fixed in one place. It seems to be working.
>>>>
>>>>> BTW, did you know java users can specify the size of the a new
>>>>> thread's
>>>>> stack:
>>>>
>>>> Yes I mentioned that in another reply - wondering whether we
>>>> suitably check and aligned such requests.
>>> No we don't. Below I mentioned I was able to trigger the assert with
>>> a 257k stack size. I guess I wasn't clear that I did that from Java.
>>> I have a new test to add that will be testing this, plus the
>>> 9007199254740991 stack size (which fails to create the thread with
>>> an OOME, but that's acceptable). The fix I mention above in
>>> os::Posix::get_initial_stack_size() takes care of this issue also.
>> Rounding up triggers a new assert, this time on 32-bit x86 and arm.
>>
>> I should have clarified it's 9007199254740991 "K", which is
>> 9223372036854774784. Unfortunately on 32bit systems that is asserting
>> with EINVAL. I think we need to do a better job of dealing with
>> 32-bit size_t values:
>>
>> jlong java_lang_Thread::stackSize(oop java_thread) {
>> if (_stackSize_offset > 0) {
>> return java_thread->long_field(_stackSize_offset);
>> } else {
>> return 0;
>> }
>> }
>>
>> jlong size =
>> java_lang_Thread::stackSize(JNIHandles::resolve_non_null(jthread));
>> // Allocate the C++ Thread structure and create the native
>> thread. The
>> // stack size retrieved from java is signed, but the
>> constructor takes
>> // size_t (an unsigned type), so avoid passing negative values
>> which would
>> // result in really large stacks.
>> size_t sz = size > 0 ? (size_t) size : 0;
>> native_thread = new JavaThread(&thread_entry, sz);
>>
>> 9223372036854774784 is 0x7ffffffffffffc00 (close to 64 bit MAX_INT),
>> which is 0xfffffc00 when cast to a size_t on a 32-bit system (close
>> to 32-bit MAX_UINT). Round it up to the 4k page size and you get 0,
>> which I guess pthread_attr_setstacksize() doesn't like. So I think
>> more processing of the size is needed here. Maybe in
>> os::create_thread() we should check for 0 after rounding up, and
>> subtract the os page size if it is 0. However, I think we should also
>> avoid truncating on 32-bit to what is basically some random number.
>> Maybe if "size" (a jlong) is greater than UINT_MAX, then just set
>> "sz" (a size_t) it to UINT_MAX.
>>
> Ok, I think I have this all worked out now. I've added fixes for
> unaligned stack sizes, 32-bit truncating of stack size, and the
> "aligning up to 0" problem. I also added a test. Here's the latest
> webrev:
>
> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>
> Here's what's changed since webrev.01:
>
> os_posix.cpp: In os::Posix::get_initial_stack_size(), first round up
> the stack size to be paged aligned. This fixes issues on Mac OS X
> (other platforms seem to be immune to this). Then check if the size is
> zero after rounding up to the page size. Subtract the page size in
> this case to produce the maximum stack size allowed. Surprisingly I
> got no complaint from gcc for subtracting from an unsigned value that
> is known to be 0.
>
> os_linux.cpp: In os::create_thread(), I also check here to make sure
> the size is not 0 after adding the guard page and aligning up, and
> subtract the os page size if it is 0.
>
> jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is
> greater than UINT_MAX, then I set the size to UINT_MAX. Note it will
> later be rounded up to 0 in os::Posix::get_initial_stack_size(), which
> will result in subtracting the os page size to get the actual maximum
> allowed stack size.
>
> TooSmallStackSize.java: added test case for unaligned stack sizes.
>
> TestThreadStackSizes.java: New test. Creates new threads with every
> size up to 320k in 1k increments. Then creates threads with a few
> other sizes that can be problematic.
>
> thanks,
>
> Chris
>> Chris
>>
>>
>>>>
>>>>> public Thread(ThreadGroup group, Runnable target, String name,
>>>>> long stackSize) {
>>>>> init(group, target, name, stackSize);
>>>>> }
>>>>>
>>>>> Fortunately we already force the stackSize to be at least
>>>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>>>> assert by creating a thread with size 257k.
>>>>
>>>> Note this means that OSX stack logic is broken because it will
>>>> currently be silently failing due to EINVAL!
>>> Yes, that is correct.
>>>
>>> Chris
>>>>
>>>>> I'll get another webrev out once I've made the needed fixes. I
>>>>> also have
>>>>> a new test I'd like to add.
>>>>
>>>> Ok.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Chris
>>>>>
>>>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>>>> Ok, time for a new webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot
>>>>>>
>>>>>>
>>>>>> The only thing that has changed since the first webrev are the
>>>>>> asserts
>>>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we
>>>>>> discuss
>>>>>> already:
>>>>>>
>>>>>> - The assert should never happen due to the stack size being too
>>>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>>>> - The assert should never happen due to an unaligned stack size
>>>>>> because we always align it to the page size.
>>>>>> - Any assert would therefore be a VM bug and not due to user error.
>>>>>> - No fixing the java launcher. If the user specifies a stack
>>>>>> that is
>>>>>> too small, hotspot will already detect this. If the user specifies a
>>>>>> stack size that is large enough but not page aligned, then we just
>>>>>> ignore any error (if the platform doth protest) and the user gets a
>>>>>> main thread with a stack size set to whatever the OS default is.
>>>>>>
>>>>>> I still need to retest (I only ran TooSmallStackSize.java), but
>>>>>> figure
>>>>>> getting agreement on the changes first would be best before I bog
>>>>>> down
>>>>>> our testing resources.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> Please review the following:
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot
>>>>>>>
>>>>>>>
>>>>>>> While working on 8175342 I noticed our stack size on xgene was 8mb
>>>>>>> even though I was specifying -Xss72k. It turns out the following
>>>>>>> code
>>>>>>> was failing:
>>>>>>>
>>>>>>> pthread_attr_setstacksize(&attr, stack_size);
>>>>>>>
>>>>>>> Although we computed a minimum stack size of 72k, so -Xss72k should
>>>>>>> be fine, pthreads on this platform requires the stack be at least
>>>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>>>> result is pthread_attr_setstacksize() had no impact on the thread's
>>>>>>> stack size, and we ended up with the platform default of 8mb.
>>>>>>> The fix
>>>>>>> is to round up the following variables to PTHREAD_STACK_MIN after
>>>>>>> computing their new values:
>>>>>>>
>>>>>>> _java_thread_min_stack_allowed
>>>>>>> _compiler_thread_min_stack_allowed
>>>>>>> _vm_internal_thread_min_stack_allowed
>>>>>>>
>>>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You
>>>>>>> need to
>>>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get
>>>>>>> PTHREAD_STACK_MIN
>>>>>>> #defined, and this needs to be done before including OS header
>>>>>>> files.
>>>>>>> I noticed that on solaris we were using thr_min_stack() elsewhere
>>>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same with this
>>>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>>>
>>>>>>> And speaking of the existing use of thr_min_stack(), I deleted
>>>>>>> it. It
>>>>>>> was being applied before any adjustments to the stack sizes had
>>>>>>> been
>>>>>>> made (rounding and adding red, yellow, and shadow zones). This mean
>>>>>>> the stack ended up being larger than necessary. With the above
>>>>>>> fix in
>>>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>>>> minimum stack sizes. If for any reason one of those stack sizes is
>>>>>>> now too small, the correct fix is to adjust the initial stack
>>>>>>> sizes,
>>>>>>> not apply thr_min_stack() to the initial stack sizes. However, it
>>>>>>> looks like no adjustment is needed. I did something close to our
>>>>>>> nightly testing on all affect platforms, and no new problems
>>>>>>> turned up.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>
>>>>>>
>>>>>
>>>
>>
>
>
More information about the hotspot-dev
mailing list