(RFR)(S)(10): 8176768: hotspot ignores PTHREAD_STACK_MIN when creating new threads

Chris Plummer chris.plummer at oracle.com
Fri Mar 17 00:18:11 UTC 2017


On 3/16/17 5:04 PM, David Holmes wrote:
> Hi Chris,
>
> On 17/03/2017 8:14 AM, Chris Plummer wrote:
>> On 3/16/17 3:01 PM, David Holmes wrote:
>>> On 17/03/2017 7:43 AM, Chris Plummer wrote:
>>>> On 3/16/17 2:35 PM, David Holmes wrote:
>>>>> On 17/03/2017 3:49 AM, Chris Plummer wrote:
>>>>>> On 3/16/17 2:16 AM, David Holmes wrote:
>>>>>>> On 16/03/2017 6:30 PM, Thomas St�fe wrote:
>>>>>>>> Hi Chris, David,
>>>>>>>>
>>>>>>>> the change looks good.
>>>>>>>>
>>>>>>>> I see that in the launcher we require a minimum stack size across
>>>>>>>> all
>>>>>>>> platforms ("STACK_SIZE_MINIMUM"), should we do the same fix 
>>>>>>>> (adjust
>>>>>>>> for
>>>>>>>> PTHREAD_STACK_MIN) there?
>>>>>>>>
>>>>>>>> I do not understand, why does error checking in the hotspot have
>>>>>>>> to be
>>>>>>>> consistent with the launcher? What prevents us from asserting 
>>>>>>>> in the
>>>>>>>> hotspot - or at least print a warning? Note that in the hotspot,
>>>>>>>> there
>>>>>>>> is already UL logging ("os", "thread") after pthread_create() 
>>>>>>>> in the
>>>>>>>> platform files, so the least we could do is add a warning log 
>>>>>>>> output
>>>>>>>> case ppthread_attr_setstacksize fails.
>>>>>>>
>>>>>>> Sorry I'm getting this group of bugs all muddled up.
>>>>>>>
>>>>>>> Chris: this issue does affect hotspot and the launcher 
>>>>>>> (potentially).
>>>>>>>
>>>>>>> Ideally both should be checking for failures in the pthread 
>>>>>>> calls but
>>>>>>> neither do so. Hotspot at least does so in some places but not in a
>>>>>>> lot of others.
>>>>>>>
>>>>>>> pthread_create is different in hotspot because failure can happen
>>>>>>> easily and we need to detect it and report it (via an exception and
>>>>>>> also via UL). The other pthread calls are not expected to fail 
>>>>>>> under
>>>>>>> "normal" conditions but only due to a programming error. Those 
>>>>>>> calls
>>>>>>> should at least be checked in debug builds as we already do in 
>>>>>>> places
>>>>>>> with assert_status.
>>>>>>>
>>>>>>> The launcher code doesn't do any error checking at all (but again
>>>>>>> pthread_create is a special case).
>>>>>> Are you just referring to the pthread related error checking? It
>>>>>> does do
>>>>>> other error checking.
>>>>>
>>>>> pthread error checking.
>>>>>
>>>>> So trying to think this through ...
>>>>>
>>>>> If the user specifies a too small, or unaligned-to-page-size, -Xss
>>>>> value the pthread_setstacksize() in the launcher will silently fail
>>>>> and the main thread will get the default stack of 8M. It will then
>>>>> load the VM which will then check the -Xss value, which will do its
>>>>> own validity checking.
>>>>>
>>>> Close, except there is still a potential issue if the size is bigger
>>>> than the minimum hotspot requires, but is not page size aligned.
>>>> pthread_setstacksize *could* fail in this case, and there would be no
>>>> "stack size too small" rejection from the hotspot. However,
>>>> pthread_setstacksize did not fail on the two platforms I tried 
>>>> unaligned
>>>> stack sizes on.
>>>
>>> Perhaps because that is not specified by POSIX. For POSIX we only have:
>>>
>>> [EINVAL]
>>>     The value of stacksize is less than {PTHREAD_STACK_MIN} or exceeds
>>> a system-imposed limit.
>> The man page on my linux host also adds the warning about "some hosts
>> can fail if the stack size is not a multiple of the system page size."
>> Is the man page documenting something different?
>
> Yes it's documenting that Linux doesn't follow POSIX very well a lot 
> of the time. :( However the source code seems different:
>
> http://code.metager.de/source/xref/gnu/glibc/nptl/pthreadP.h
>
> 628 static inline int
> 629 check_stacksize_attr (size_t st)
> 630 {
> 631  if (st >= PTHREAD_STACK_MIN)
> 632    return 0;
> 633
> 634  return EINVAL;
> 635 }
>
> That said, it seems that OSX will return EINVAL:
>
> https://opensource.apple.com/source/libpthread/libpthread-218.1.3/src/pthread.c.auto.html 
>
>
>     int ret = EINVAL;
>     if (attr->sig == _PTHREAD_ATTR_SIG &&
>         (stacksize % vm_page_size) == 0 &&
>         stacksize >= PTHREAD_STACK_MIN) {
>         attr->stacksize = stacksize;
>         ret = 0;
>     }
>     return ret;
Ok. Thanks for finding this. I had found the Mac OS X man page and it 
too mentioned that unaligned sizes might cause EINVAL.
>
>>>
>>> Anyway that is a check that hotspot could perform if
>>> pthread_attr_setstacksize fails. Though that then makes me wonder if
>>> we do any rounding when the stack size set on a per thread basis via
>>> the java.lang.Thread constructor?
>>>
>>> I think imposing the PTHREAD_STACK_MIN in hotspot, with an assert
>>> checking pthread_attr_setstacksize succeeded (in hotspot) would
>>> suffice here.
>> If you are certain that the assert would be a programming error and not
>> a user error, then I can see doing that. However, shouldn't we be
>> consistent in the launcher and do the same there also? We can skip
>> imposing PTHREAD_STACK_MIN since hotspot will already do this, but
>> unless the user creates another java thread there will be no hotspot
>> assert for pthread_attr_setstacksize failing.
>
> The launcher has its own policy regarding errors here - it ignores 
> them. As I said before if you pass an invalid -Xss value the launcher 
> will create the main thread with the default stack size. That is not 
> an unreasonable position to take. If you feel strongly about it you 
> could file a bug against the launcher, but I would not try to fix it 
> in this CR.
Ok. At least it's not fatal.
>
>
> Inside hotspot I think we already do various roundings on the value 
> eventually passed to pthread_attr_setstacksize, don't we? So any 
> EINVAL after that should be effectively impossible when combined with 
> the PTHREAD_STACK_MIN check.
>
Yes:

   size_t stack_size_in_bytes = ThreadStackSize * K;
...
   // Make the stack size a multiple of the page size so that
   // the yellow/red zones can be guarded.
   JavaThread::set_stack_size_at_create(round_to(stack_size_in_bytes, 
vm_page_size()));

Maybe I should add to the comment "...and also so 
pthread_attr_setstacksize won't fail".

Chris
> Thanks,
> David
>
>> Chris
>>>
>>> David
>>> -----
>>>
>>>> Chris
>>>>> That seems like quite a reasonable position for the launcher to take.
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>>
>>>>>> Chris
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> If we ever refactor this coding, could we rename the variables
>>>>>>>> holding
>>>>>>>> the base stack size requirement for java frames - in all its
>>>>>>>> incarnations in all the os_cpu files - to be renamed to something
>>>>>>>> different? It is a bit confusing to have a variable which at
>>>>>>>> different
>>>>>>>> times in VM life means different things (before and after the call
>>>>>>>> to os::Posix::set_minimum_stack_sizes()). Or, at least, rename
>>>>>>>> "set_minimum_stack_sizes" to something like
>>>>>>>> "adjust_minimum_stack_sizes"
>>>>>>>> which makes the intent clearer.
>>>>>>>>
>>>>>>>> Kind Regards, Thomas
>>>>>>>>
>>>>>>>> On Thu, Mar 16, 2017 at 7:50 AM, David Holmes
>>>>>>>> <david.holmes at oracle.com
>>>>>>>> <mailto:david.holmes at oracle.com>> wrote:
>>>>>>>>
>>>>>>>>     On 16/03/2017 4:33 PM, Chris Plummer wrote:
>>>>>>>>
>>>>>>>>         On 3/15/17 11:18 PM, David Holmes wrote:
>>>>>>>>
>>>>>>>>             On 16/03/2017 4:14 PM, Chris Plummer wrote:
>>>>>>>>
>>>>>>>>                 On 3/15/17 11:11 PM, David Holmes wrote:
>>>>>>>>
>>>>>>>>                     On 16/03/2017 3:51 PM, Chris Plummer wrote:
>>>>>>>>
>>>>>>>>                         On 3/15/17 10:23 PM, David Holmes wrote:
>>>>>>>>
>>>>>>>>                             Hi Chris,
>>>>>>>>
>>>>>>>>                             On 16/03/2017 3:03 PM, Chris Plummer
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>                                 Hello,
>>>>>>>>
>>>>>>>>                                 Please review the following:
>>>>>>>>
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8176768
>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8176768>
>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot 
>>>>>>>>
>>>>>>>>
>>>>>>>> <http://cr.openjdk.java.net/~cjplummer/8176768/webrev.00/webrev.hotspot> 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                             Change looks good.
>>>>>>>>
>>>>>>>>                                 While working on 8175342 I
>>>>>>>> noticed our
>>>>>>>>                                 stack size on xgene was 8mb
>>>>>>>>                                 even
>>>>>>>>                                 though I was specifying 
>>>>>>>> -Xss72k. It
>>>>>>>>                                 turns out the following code was
>>>>>>>>                                 failing:
>>>>>>>>
>>>>>>>> pthread_attr_setstacksize(&attr,
>>>>>>>>                                 stack_size);
>>>>>>>>
>>>>>>>>
>>>>>>>>                             So these really should be checking
>>>>>>>> return
>>>>>>>>                             values, at least in debug
>>>>>>>>                             builds. But we can leave that until we
>>>>>>>>                             refactor the thread startup
>>>>>>>>                             code into os_posix.cpp.
>>>>>>>>
>>>>>>>>                         I considered adding checks. I wasn't sure
>>>>>>>> if we
>>>>>>>>                         should abort or just
>>>>>>>>                         print a warning if it failed.
>>>>>>>>
>>>>>>>>
>>>>>>>>                     When we check pthread lib routines we use:
>>>>>>>>
>>>>>>>>                        int status = pthread_mutex_lock(_mutex);
>>>>>>>>                        assert_status(status == 0, status,
>>>>>>>> "mutex_lock");
>>>>>>>>
>>>>>>>>                     This is for things that should only fail if we
>>>>>>>> have
>>>>>>>>                     a programming
>>>>>>>>                     error.
>>>>>>>>
>>>>>>>>                 Ok, but this is in the launcher, so I'll need to
>>>>>>>> just
>>>>>>>>                 use the built-in
>>>>>>>>                 assert(). I'll add that if want.
>>>>>>>>
>>>>>>>>
>>>>>>>>             Oops! I was forgetting that. Need to be consistent 
>>>>>>>> with
>>>>>>>>             launcher error
>>>>>>>>             checking or lack thereof. And ignore refactoring
>>>>>>>> comments -
>>>>>>>>             not relevant.
>>>>>>>>
>>>>>>>>         So don't add the error check?
>>>>>>>>
>>>>>>>>
>>>>>>>>     Given there is no error checking, or assertions, in those
>>>>>>>> files I
>>>>>>>>     reluctantly have to say leave it out.
>>>>>>>>
>>>>>>>>     Thanks,
>>>>>>>>     David
>>>>>>>>     -----
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>             David
>>>>>>>>
>>>>>>>>                 Chris
>>>>>>>>
>>>>>>>>
>>>>>>>>                         What refactoring is planned?
>>>>>>>>
>>>>>>>>
>>>>>>>>                     "Planned" might be a bit strong :) I was
>>>>>>>> thinking of
>>>>>>>>                     a number of
>>>>>>>>                     os_posix related cleanups for which issues
>>>>>>>> exist,
>>>>>>>>                     but also forgot that
>>>>>>>>                     some of our general clean-up RFE's have been
>>>>>>>> closed
>>>>>>>>                     as WNF :( I may do
>>>>>>>>                     some of them after hours anyway :)
>>>>>>>>
>>>>>>>>                     David
>>>>>>>>                     -----
>>>>>>>>
>>>>>>>>                         Chris
>>>>>>>>
>>>>>>>>
>>>>>>>>                             Thanks,
>>>>>>>>                             David
>>>>>>>>                             -----
>>>>>>>>
>>>>>>>>                                 Although we computed a minimum 
>>>>>>>> stack
>>>>>>>>                                 size of 72k, so -Xss72k
>>>>>>>>                                 should be
>>>>>>>>                                 fine, pthreads on this platform
>>>>>>>> requires
>>>>>>>>                                 the stack be at least
>>>>>>>>                                 128k, so
>>>>>>>>                                 it failed the
>>>>>>>> pthread_attr_setstacksize() call. The
>>>>>>>>                                 end result is
>>>>>>>> pthread_attr_setstacksize() had no
>>>>>>>>                                 impact on the thread's stack
>>>>>>>>                                 size,
>>>>>>>>                                 and we ended up with the platform
>>>>>>>>                                 default of 8mb. The fix is to
>>>>>>>>                                 round up
>>>>>>>>                                 the following variables to
>>>>>>>>                                 PTHREAD_STACK_MIN after computing
>>>>>>>>                                 their new
>>>>>>>>                                 values:
>>>>>>>>
>>>>>>>> _java_thread_min_stack_allowed
>>>>>>>> _compiler_thread_min_stack_allowed
>>>>>>>> _vm_internal_thread_min_stack_allowed
>>>>>>>>
>>>>>>>>                                 For solaris, there was an issue
>>>>>>>> using
>>>>>>>>                                 PTHREAD_STACK_MIN. You need to
>>>>>>>>                                 #define _POSIX_C_SOURCE >=
>>>>>>>> 199506L in
>>>>>>>>                                 order to get PTHREAD_STACK_MIN
>>>>>>>>                                 #defined, and this needs to be 
>>>>>>>> done
>>>>>>>>                                 before including OS header
>>>>>>>>                                 files. I
>>>>>>>>                                 noticed that on solaris we were
>>>>>>>> using
>>>>>>>>                                 thr_min_stack() elsewhere
>>>>>>>>                                 instead
>>>>>>>>                                 of PTHREAD_STACK_MIN, so I decided
>>>>>>>> to do
>>>>>>>>                                 the same with this fix.
>>>>>>>>                                 Either
>>>>>>>>                                 way is ugly (the #define or using
>>>>>>>>                                 thr_min_stack()).
>>>>>>>>
>>>>>>>>                                 And speaking of the existing 
>>>>>>>> use of
>>>>>>>>                                 thr_min_stack(), I deleted
>>>>>>>>                                 it. It
>>>>>>>>                                 was being applied before any
>>>>>>>> adjustments
>>>>>>>>                                 to the stack sizes had been
>>>>>>>>                                 made (rounding and adding red,
>>>>>>>> yellow,
>>>>>>>>                                 and shadow zones). This mean
>>>>>>>>                                 the
>>>>>>>>                                 stack ended up being larger than
>>>>>>>>                                 necessary. With the above fix in
>>>>>>>>                                 place,
>>>>>>>>                                 we are now applying 
>>>>>>>> thr_min_stack()
>>>>>>>>                                 after recomputing the minimum
>>>>>>>>                                 stack
>>>>>>>>                                 sizes. If for any reason one of
>>>>>>>> those
>>>>>>>>                                 stack sizes is now too small,
>>>>>>>>                                 the
>>>>>>>>                                 correct fix is to adjust the 
>>>>>>>> initial
>>>>>>>>                                 stack sizes, not apply
>>>>>>>>                                 thr_min_stack() to the initial 
>>>>>>>> stack
>>>>>>>>                                 sizes. However, it looks
>>>>>>>>                                 like no
>>>>>>>>                                 adjustment is needed. I did
>>>>>>>> something
>>>>>>>>                                 close to our nightly
>>>>>>>>                                 testing on
>>>>>>>>                                 all affect platforms, and no new
>>>>>>>>                                 problems turned up.
>>>>>>>>
>>>>>>>>                                 thanks,
>>>>>>>>
>>>>>>>>                                 Chris
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>




More information about the hotspot-dev mailing list