RFR: 8225035: Thread stack size issue caused by large TLS size

David Holmes david.holmes at oracle.com
Mon Jul 1 08:22:42 UTC 2019


Hi Jiangli,

On 29/06/2019 9:42 am, Jiangli Zhou wrote:
> Hi David,
> 
> Thanks for the detailed comments! Here is the latest webrev:
> http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/. Apologizing
> for not including an incremental webrev (realized that when I almost
> done edits).

That all looks fine - thanks for making the changes.

However ... now that I see the logging output it occurred to me that 
checking for the TLS adjustment is something that should only happen 
once and we should be storing the adjustment amount in a static for 
direct use. Sorry I didn't think about this earlier. Something like:

static size_t tls_size = 0;
static bool tls_size_inited = false;

static size_t get_static_tls_area_size(const pthread_attr_t *attr) {
+  if (!tls_size_inited) {
+    tls_size_inited = true;
      if (_get_minstack_func != NULL) {
        size_t minstack_size = _get_minstack_func(attr);
        ...
        if (minstack_size > (size_t)os::vm_page_size() + 
PTHREAD_STACK_MIN) {
          tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
        }
     }
+ }
   log_info(os, thread)("Stack size adjustment for TLS is " SIZE_FORMAT,
                        tls_size);
   return tls_size;
}

Or even fold it all into get_minstack_init() ?

I'm assuming that the result of __pthread_get_minstack wont' change over 
time of course.

Thanks,
David
-----

> On Fri, Jun 28, 2019 at 8:57 AM David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Jiangli,
>>
>> This is very well written up - thanks.
>>
>> I apologize in advance that I'm about to traveling for a few days so
>> won't be able to respond further until next week.
> 
> Hope you have a relaxed and safe travel. This can wait.
> 
>>
>> On 27/06/2019 11:58 pm, Jiangli Zhou wrote:
>>> Updated webrev:
>>> http://cr.openjdk.java.net/~jiangli/8225035/webrev.03/
>>
>> Overall changes look good. I also have a concern around this code:
>>
>> +     tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
>> +     assert(tls_size > 0, "unexpected size");
>>
>> In addition to Thomas's comments re-signedness, can't the result == 0 if
>> there is no sttaic TLS in use? Or is there always some static TLS in use?
> 
> On both glibc 2.24 and 2.28, by default I see there is one page memory
> for static TLS, without explicitly defining any __thread variables in
> user code.
> 
>>
>> A few other comments/requests:
>>
>>    822 static void get_minstack_init() {
>>    823   _get_minstack_func =
>>    824         (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
>>    825 }
>>
>> Can you add a logging statement please:
>>
>> log(os, thread)("Lookup of __pthread_get_minstack %s",
>>                   _get_minstack_func == NULL ? "failed" : "succeeded");
> 
> Added log info.
> 
>>
>> --
>>
>>    884   // In the Linux NPTL pthread implementation the guard size mechanism
>>
>> Now that we have additional information on this could you update this
>> old comment to say
>>
>> // In glibc versions prior to 2.7 the guard size mechanism
> 
> Done.
> 
>>
>> --
>>
>>    897     // Adjust the stack_size by adding the on-stack TLS size if
>>    898     // AdjustStackSizeForTLS is true. The guard size is already
>>    899     // accounted in this case, please see comments in
>>    900     // get_static_tls_area_size().
>>
>> Given the extensive commentary in get_static_tls_area_size() can we be
>> more brief here and just say:
>>
>> // Adjust the stack size for on-stack TLS - see get_static_tls_area_size().
> 
> Done.
> 
>>
>> --
>>
>> 5193   get_minstack_init();
>>
>> Can you make this conditional on AdjustStackSizeForTLS please so there
>> is no affect when not using the flag - thanks.
> 
> Done.
> 
>>
>> --
>>
>> 855   }
>> 856   return tls_size;
>>
>> Can you insert a logging statement:
>>
>> log(os, thread)("Stack size adjustment for TLS is " SIZE_T_FORMAT,
>> tls_size);
> 
> Added.
> 
>>
>> ---
>>
>> test/hotspot/jtreg/runtime/TLS/T.java
>>
>> 37             // Starting a ProcessBuilder causes the process reaper
>> thread being
>>
>> s/being/to be/
> 
> Fixed.
> 
>>
>>    43             // failure mode the VM fails to create thread with
>> error message
>>
>> s/create thread/create a thread/
> 
> Fixed.
> 
>>
>>     53                 System.out.println("Unexpected Echo output: " +
>> echoOutput +
>>     54                                    ", expects: " + echoInput);
>>
>> should this be an exception so that test fails? I can't imagine how echo
>> would fail but probably better to fail the test if something unexpected
>> happens.
> 
> If no expected output is obtained from echo (due to ProcessBuilder
> failure caused by TLS issue), the test does fail and reports to the
> caller (returns false). If we throw an explicit expectation, it will
> be caught by the outer try/catch, which seems to be unnecessary.
> 
>>
>> 66         try {
>>     67             br = new BufferedReader(new
>> InputStreamReader(inputStream));
>>     68             s = br.readLine();
>>     69         } finally {
>>     70             br.close();
>>     71         }
>>
>> This could use try-with-resources for both streams.
> 
> Sounds good. Done.
> 
>>
>> ---
>>
>> exestack-tls.c
>>
>> It's simpler if no argument means no-tls and an argument means tls.
>>
>>     42     char classpath[4096];
>>     43     snprintf(classpath, sizeof classpath,
>>     44              "-Djava.class.path=%s", getenv("CLASSPATH"));
>>     45     options[0].optionString = classpath;
>>
>> Do we need to explicitly set the classpath? I'm concerned that our test
>> environment uses really, really long paths and a number of them.
>> (Probably not 4096 but still ...)
> 
> The classpath needs to be set so we know where to load the test class.
> Given that we use 4096 for the same type of usage in other existing
> test(s) (for example StackGap), it probably is okay for our test
> environments? I can increase the array size if we want to be extra
> safe ...
> 
>>
>>
>> test/hotspot/jtreg/runtime/TLS/testtls.sh
>>
>>     40 if [ "${VM_OS}" != "linux" ]
>>     41 then
>>     42   echo "Test is only valid for Linux"
>>     43   exit 0
>>     44 fi
>>
>> This should be done via "@requires os.family != Linux"
> 
> Done.
> 
> Thanks!
> 
> Best regards,
> Jiangli
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks for everyone's contribution on carving out the current workaround!
>>>
>>> Best regards,
>>> Jiangli
>>>
>>> On Thu, Jun 27, 2019 at 11:42 AM Jiangli Zhou <jianglizhou at google.com> wrote:
>>>>
>>>> Thank you Thomas and David! Glad to see that we are converging on an
>>>> acceptable approach here. I'll try to factor in all the latest inputs
>>>> from everyone and send out a new update.
>>>>
>>>> Thanks and best regards,
>>>> Jiangli
>>>>
>>>> On Thu, Jun 27, 2019 at 11:13 AM David Holmes <david.holmes at oracle.com> wrote:
>>>>>
>>>>> Trimming ....
>>>>>
>>>>> On 27/06/2019 12:35 pm, Jiangli Zhou wrote:
>>>>>> On Thu, Jun 27, 2019 at 9:23 AM Florian Weimer <fweimer at redhat.com> wrote:
>>>>>>> I think you can handle the guard size in this way:
>>>>>>>
>>>>>>>      pthread_attr_setguardsize(&attr, guard_size);
>>>>>>>
>>>>>>>      size_t stack_adjust_size = 0;
>>>>>>>      if (AdjustStackSizeForTLS) {
>>>>>>>        size_t minstack_size = get_minstack(&attr);
>>>>>>>        size_t tls_size = minstack_size - vm_page_size() - PTHREAD_STACK_MIN;
>>>>>>>        // In glibc before 2.27, tls_size still includes guard_size.
>>>>>>>        // In glibc 2.27 and later, guard_size is automatically
>>>>>>>        // added to the stack size by pthread_create.
>>>>>>>        // In both cases, the guard size is taken into account.
>>>>>>>        stack_adjust_size += tls_size;
>>>>>>>      } else {
>>>>>>>        stack_adjust_size += guard_size;
>>>>>>>      }
>>>>>>
>>>>>> Is the vm_page_size() counted for the dl_pagesize? As long as others
>>>>>> are okay with the above suggested adjustment, it looks good to me.
>>>>>> Thomas, David and others, any objection?
>>>>>
>>>>> I find the above acceptable. I've been waiting for the dust to settle.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks and best regards,
>>>>>> Jiangli
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Florian


More information about the hotspot-runtime-dev mailing list