(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads

Chris Plummer chris.plummer at oracle.com
Tue Mar 21 22:08:15 UTC 2017


On 3/21/17 2:57 PM, Daniel D. Daugherty wrote:
> On 3/21/17 3:20 PM, Chris Plummer wrote:
>> On 3/21/17 12:51 PM, Daniel D. Daugherty wrote:
>>> > 
>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>>>
>>> src/os/aix/vm/os_aix.cpp
>>>     No comments.
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>>     No comments.
>>>
>>> src/os/linux/vm/os_linux.cpp
>>>     L729:   // In that cast just subract the page size to get the 
>>> maximum possible stack size.
>>>         Typo: 'cast' -> 'case'
>>>         Typo: 'subract' -> 'subtract' (Thomas also commented on it)
>>>
>>> src/os/posix/vm/os_posix.cpp
>>>     L263:   // aligning up could have resulted in the size being 0. 
>>> In that case just subract the
>>>         Nit: 'aligning' -> 'Aligning' (since it's a sentence)
>>>         Typo: 'subract' -> 'subtract'
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>>     No comments.
>>>
>>> src/share/vm/prims/jvm.cpp
>>>     L2812:       // -Avoid truncating on 32-bit platforms if size is 
>>> greater than UINT_MAX
>>>         Nit: needs a period at the end like L2813.
>>>
>>> test/runtime/Thread/TooSmallStackSize.java
>>>     No comments.
>>>
>>> test/runtime/Thread/TestThreadStackSizes.java
>>>     L26:  * @summary Test user threads with various stacks sizes.
>>>         Typo?: "stacks sizes" -> "stack sizes"
>>>
>>>     L36:         super(null, null, "TestThreadStackSizes"+stackSize, 
>>> stackSize);
>>>         Nit: spaces around the "+".
>>>
>>>     L46:             TestThreadStackSizes testThreadStackSize = new 
>>> TestThreadStackSizes(stackSize);
>>>         Nit: extra space before '='.
>>>
>>>     So this test makes 326 createThread() calls... how long does
>>>     it take to run?
>>>
>> This is from the results page on Mac OS x log for all the tests in 
>> runtime/Thread
>>
>> 1    runtime/Thread/CancellableThreadTest.java         7.033
>> 2    runtime/Thread/Fibonacci.java         8.430
>> 3    runtime/Thread/TestThreadDumpMonitorContention.java 34.322
>> 4    runtime/Thread/ThreadPriorities.java         13.064
>> 5    runtime/Thread/TooSmallStackSize.java         10.086
>>
>> And 32-bit linux-arm:
>>
>> 1    runtime/Thread/CancellableThreadTest.java         9.359
>> 2    runtime/Thread/Fibonacci.java         11.744
>> 3    runtime/Thread/TestThreadDumpMonitorContention.java 00:01:04.370
>> 4    runtime/Thread/ThreadPriorities.java         18.140
>> 5    runtime/Thread/TooSmallStackSize.java         14.919
>>
>> And windows-x64:
>>
>> 1    runtime/Thread/CancellableThreadTest.java         8.074
>> 2    runtime/Thread/Fibonacci.java         10.238
>> 3    runtime/Thread/TestThreadDumpMonitorContention.java 00:01:21.404
>> 4    runtime/Thread/ThreadPriorities.java           23.134
>> 5    runtime/Thread/TooSmallStackSize.java         24.160
>
> Maybe I'm missing it, but I don't see results for the new test
> (TestThreadStackSizes.java)...
Sorry, that was from my attempt to rerun the tests earlier today after a 
minor tweak. Unfortunately I had temporarily switched to a clean jdk 
repo, so that's what got tested. Let me run the tests on the right repo 
this time. Results in a couple of hours probably.

Chris
>
> Dan
>
>
>>
>>>
>>> Thumbs up! I don't need to see another webrev if you choose to
>>> fix these minor typos...
>> They've all been fixed.
>>
>> thanks for the review,
>>
>> Chris
>>
>>>
>>> Dan
>>>
>>>
>>> On 3/20/17 5:29 PM, Chris Plummer wrote:
>>>> On 3/17/17 11:37 PM, Chris Plummer wrote:
>>>>> On 3/17/17 8:17 PM, Chris Plummer wrote:
>>>>>> On 3/17/17 7:01 PM, David Holmes wrote:
>>>>>>> On 18/03/2017 9:11 AM, Chris Plummer wrote:
>>>>>>>> Looks like this will need some more work since the added 
>>>>>>>> asserts are
>>>>>>>> triggering on mac os x (which is the only place we'd currently 
>>>>>>>> expect
>>>>>>>> them to assert).
>>>>>>>>
>>>>>>>> The problem is that the code I found that rounds up to the page 
>>>>>>>> size is
>>>>>>>> only applied to java threads created by the VM for which the 
>>>>>>>> java user
>>>>>>>> specified no stack size. The VM and Compiler thread sizes are not
>>>>>>>> rounded. The failure I saw was with
>>>>>>>> runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java 
>>>>>>>> when is
>>>>>>>> specified -XX:CompilerThreadStackSize=9007199254740991. I hit 
>>>>>>>> the assert
>>>>>>>> with an EINVAL. The size is not aligned, but it could also be
>>>>>>>> complaining because it is too big. I haven't tried aligning it 
>>>>>>>> yet to see.
>>>>>>>>
>>>>>>>> On Linux we do the following:
>>>>>>>>
>>>>>>>>   stack_size = align_size_up(stack_size +
>>>>>>>> os::Linux::default_guard_size(thr_type), vm_page_size());
>>>>>>>>
>>>>>>>> We don't do this on BSD. I think the same on BSD would solve this
>>>>>>>> problem. I'm not sure about adding the guard size. I'll need to 
>>>>>>>> see if
>>>>>>>> mac os x has the same pthread bug as linux does.
>>>>>>>
>>>>>>> At this stage I would only deal with alignment issues. IIRC the 
>>>>>>> guard issue only affected Linux.
>>>>>> Yes, that's what I eventually concluded. I put the fix in 
>>>>>> os::Posix::get_initial_stack_size() in os_posix.cpp, and only did 
>>>>>> the page aligning, not add the guard page. That way all Posix 
>>>>>> ports are fixed in one place. It seems to be working.
>>>>>>>
>>>>>>>> BTW, did you know java users can specify the size of the a new 
>>>>>>>> thread's
>>>>>>>> stack:
>>>>>>>
>>>>>>> Yes I mentioned that in another reply - wondering whether we 
>>>>>>> suitably check and aligned such requests.
>>>>>> No we don't. Below I mentioned I was able to trigger the assert 
>>>>>> with a 257k stack size. I guess I wasn't clear that I did that 
>>>>>> from Java. I have a new test to add that will be testing this, 
>>>>>> plus the 9007199254740991 stack size (which fails to create the 
>>>>>> thread with an OOME, but that's acceptable). The fix I mention 
>>>>>> above in os::Posix::get_initial_stack_size() takes care of this 
>>>>>> issue also.
>>>>> Rounding up triggers a new assert, this time on 32-bit x86 and arm.
>>>>>
>>>>> I should have clarified it's 9007199254740991 "K", which is 
>>>>> 9223372036854774784. Unfortunately on 32bit systems that is 
>>>>> asserting with EINVAL. I think we need to do a better job of 
>>>>> dealing with 32-bit size_t values:
>>>>>
>>>>> jlong java_lang_Thread::stackSize(oop java_thread) {
>>>>>   if (_stackSize_offset > 0) {
>>>>>     return java_thread->long_field(_stackSize_offset);
>>>>>   } else {
>>>>>     return 0;
>>>>>   }
>>>>> }
>>>>>
>>>>>       jlong size =
>>>>> java_lang_Thread::stackSize(JNIHandles::resolve_non_null(jthread));
>>>>>       // Allocate the C++ Thread structure and create the native 
>>>>> thread.  The
>>>>>       // stack size retrieved from java is signed, but the 
>>>>> constructor takes
>>>>>       // size_t (an unsigned type), so avoid passing negative 
>>>>> values which would
>>>>>       // result in really large stacks.
>>>>>       size_t sz = size > 0 ? (size_t) size : 0;
>>>>>       native_thread = new JavaThread(&thread_entry, sz);
>>>>>
>>>>> 9223372036854774784 is 0x7ffffffffffffc00 (close to 64 bit 
>>>>> MAX_INT), which is 0xfffffc00 when cast to a size_t on a 32-bit 
>>>>> system (close to 32-bit MAX_UINT). Round it up to the 4k page size 
>>>>> and you get 0, which I guess pthread_attr_setstacksize() doesn't 
>>>>> like. So I think more processing of the size is needed here. Maybe 
>>>>> in os::create_thread() we should check for 0 after rounding up, 
>>>>> and subtract the os page size if it is 0. However, I think we 
>>>>> should also avoid truncating on 32-bit to what is basically some 
>>>>> random number. Maybe if "size" (a jlong) is greater than UINT_MAX, 
>>>>> then just set "sz" (a size_t) it to UINT_MAX.
>>>>>
>>>> Ok, I think I have this all worked out now. I've added fixes for 
>>>> unaligned stack sizes, 32-bit truncating of stack size, and the 
>>>> "aligning up to 0" problem. I also added a test. Here's the latest 
>>>> webrev:
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.02/webrev.hotspot
>>>>
>>>> Here's what's changed since webrev.01:
>>>>
>>>> os_posix.cpp: In os::Posix::get_initial_stack_size(), first round 
>>>> up the stack size to be paged aligned. This fixes issues on Mac OS 
>>>> X (other platforms seem to be immune to this). Then check if the 
>>>> size is zero after rounding up to the page size. Subtract the page 
>>>> size in this case to produce the maximum stack size allowed. 
>>>> Surprisingly I got no complaint from gcc for subtracting from an 
>>>> unsigned value that is known to be 0.
>>>>
>>>> os_linux.cpp: In os::create_thread(), I also check here to make 
>>>> sure the size is not 0 after adding the guard page and aligning up, 
>>>> and subtract the os page size if it is 0.
>>>>
>>>> jvm.c: In JVM_StartThread(), on 32-bit platforms if the size is 
>>>> greater than UINT_MAX, then I set the size to UINT_MAX. Note it 
>>>> will later be rounded up to 0 in 
>>>> os::Posix::get_initial_stack_size(), which will result in 
>>>> subtracting the os page size to get the actual maximum allowed 
>>>> stack size.
>>>>
>>>> TooSmallStackSize.java: added test case for unaligned stack sizes.
>>>>
>>>> TestThreadStackSizes.java: New test. Creates new threads with every 
>>>> size up to 320k in 1k increments. Then creates threads with a few 
>>>> other sizes that can be problematic.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>> Chris
>>>>>
>>>>>
>>>>>>>
>>>>>>>>     public Thread(ThreadGroup group, Runnable target, String name,
>>>>>>>>                   long stackSize) {
>>>>>>>>         init(group, target, name, stackSize);
>>>>>>>>     }
>>>>>>>>
>>>>>>>> Fortunately we already force the stackSize to be at least
>>>>>>>> _java_thread_min_stack_allowed. However, we don't do any OS page
>>>>>>>> rounding on Mac OS X as noted above, and I was able to trigger the
>>>>>>>> assert by creating a thread with size 257k.
>>>>>>>
>>>>>>> Note this means that OSX stack logic is broken because it will 
>>>>>>> currently be silently failing due to EINVAL!
>>>>>> Yes, that is correct.
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>>> I'll get another webrev out once I've made the needed fixes. I 
>>>>>>>> also have
>>>>>>>> a new test I'd like to add.
>>>>>>>
>>>>>>> Ok.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 3/16/17 9:27 PM, Chris Plummer wrote:
>>>>>>>>> Ok, time for a new webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.01/webrev.hotspot 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The only thing that has changed since the first webrev are the 
>>>>>>>>> asserts
>>>>>>>>> added to os_linux.cpp and os_bsd.cpp. And to summarize what we 
>>>>>>>>> discuss
>>>>>>>>> already:
>>>>>>>>>
>>>>>>>>>  - The assert should never happen due to the stack size being too
>>>>>>>>> small since it will be at least PTHREAD_STACK_MIN.
>>>>>>>>>  - The assert should never happen due to an unaligned stack size
>>>>>>>>> because we always align it to the page size.
>>>>>>>>>  - Any assert would therefore be a VM bug and not due to user 
>>>>>>>>> error.
>>>>>>>>>  - No fixing the java launcher. If the user specifies a stack 
>>>>>>>>> that is
>>>>>>>>> too small, hotspot will already detect this. If the user 
>>>>>>>>> specifies a
>>>>>>>>> stack size that is large enough but not page aligned, then we 
>>>>>>>>> just
>>>>>>>>> ignore any error (if the platform doth protest) and the user 
>>>>>>>>> gets a
>>>>>>>>> main thread with a stack size set to whatever the OS default is.
>>>>>>>>>
>>>>>>>>> I still need to retest (I only ran TooSmallStackSize.java), 
>>>>>>>>> but figure
>>>>>>>>> getting agreement on the changes first would be best before I 
>>>>>>>>> bog down
>>>>>>>>> our testing resources.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 3/15/17 10:03 PM, Chris Plummer wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Please review the following:
>>>>>>>>>>
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> While working on 8175342 I noticed our stack size on xgene 
>>>>>>>>>> was 8mb
>>>>>>>>>> even though I was specifying -Xss72k. It turns out the 
>>>>>>>>>> following code
>>>>>>>>>> was failing:
>>>>>>>>>>
>>>>>>>>>>       pthread_attr_setstacksize(&attr, stack_size);
>>>>>>>>>>
>>>>>>>>>> Although we computed a minimum stack size of 72k, so -Xss72k 
>>>>>>>>>> should
>>>>>>>>>> be fine, pthreads on this platform requires the stack be at 
>>>>>>>>>> least
>>>>>>>>>> 128k, so it failed the pthread_attr_setstacksize() call. The end
>>>>>>>>>> result is pthread_attr_setstacksize() had no impact on the 
>>>>>>>>>> thread's
>>>>>>>>>> stack size, and we ended up with the platform default of 8mb. 
>>>>>>>>>> The fix
>>>>>>>>>> is to round up the following variables to PTHREAD_STACK_MIN 
>>>>>>>>>> after
>>>>>>>>>> computing their new values:
>>>>>>>>>>
>>>>>>>>>>       _java_thread_min_stack_allowed
>>>>>>>>>>       _compiler_thread_min_stack_allowed
>>>>>>>>>>       _vm_internal_thread_min_stack_allowed
>>>>>>>>>>
>>>>>>>>>> For solaris, there was an issue using PTHREAD_STACK_MIN. You 
>>>>>>>>>> need to
>>>>>>>>>> #define _POSIX_C_SOURCE >= 199506L in order to get 
>>>>>>>>>> PTHREAD_STACK_MIN
>>>>>>>>>> #defined, and this needs to be done before including OS 
>>>>>>>>>> header files.
>>>>>>>>>> I noticed that on solaris we were using thr_min_stack() 
>>>>>>>>>> elsewhere
>>>>>>>>>> instead of PTHREAD_STACK_MIN, so I decided to do the same 
>>>>>>>>>> with this
>>>>>>>>>> fix. Either way is ugly (the #define or using thr_min_stack()).
>>>>>>>>>>
>>>>>>>>>> And speaking of the existing use of thr_min_stack(), I 
>>>>>>>>>> deleted it. It
>>>>>>>>>> was being applied before any adjustments to the stack sizes 
>>>>>>>>>> had been
>>>>>>>>>> made (rounding and adding red, yellow, and shadow zones). 
>>>>>>>>>> This mean
>>>>>>>>>> the stack ended up being larger than necessary. With the 
>>>>>>>>>> above fix in
>>>>>>>>>> place, we are now applying thr_min_stack() after recomputing the
>>>>>>>>>> minimum stack sizes. If for any reason one of those stack 
>>>>>>>>>> sizes is
>>>>>>>>>> now too small, the correct fix is to adjust the initial stack 
>>>>>>>>>> sizes,
>>>>>>>>>> not apply thr_min_stack() to the initial stack sizes. 
>>>>>>>>>> However, it
>>>>>>>>>> looks like no adjustment is needed. I did something close to our
>>>>>>>>>> nightly testing on all affect platforms, and no new problems 
>>>>>>>>>> turned up.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>



More information about the hotspot-dev mailing list