RFR: 8225035: Thread stack size issue caused by large TLS size

Jiangli Zhou jianglizhou at google.com
Sun Jun 30 22:56:12 UTC 2019


Hi Thomas,

On Fri, Jun 28, 2019 at 11:33 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>
> Hi Jiangli,
>
> http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/src/hotspot/os/linux/os_linux.cpp.udiff.html
>
> +static void get_minstack_init() {
> +  _get_minstack_func =
> +        (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
> +  log_info(os, thread)("Lookup of __pthread_get_minstack %s",
> +                       _get_minstack_func == NULL ? "failed" : "succeeded");
> +}
>
> Not sure if logging even works at that point, the logging system may not yet be initialized. No problem from my view, just wanted to point that out.

The get_minstack_init happens during os::init_2, which is after log
configuration initialization. It should be safe to call logging API in
there. I also enabled -Xlog:os+thread=info in the new jtreg TLS test
and verified the logging output.

>
> +    // The following 'minstack_size > os::vm_page_size() + PTHREAD_STACK_MIN'
> +    // if check is done for precaution.
> +    if (minstack_size > (size_t)os::vm_page_size() + PTHREAD_STACK_MIN) {
> +      tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
> +    }
>
> If you want you could drop the "done for precaution" comment since it is pretty clear what you do.

I debated with myself when adding the comment and decided it might
help clarify the intention for the above 'if' check. Let's keep it ...

>
> ----
>
> All these are idle musings. I am fine with the version as it is now.

Great! Thanks again!

Best regards,
Jiangli

>
> Cheers, Thomas
>
>
>
> On Sat, Jun 29, 2019 at 1:43 AM Jiangli Zhou <jianglizhou at google.com> wrote:
>>
>> Hi David,
>>
>> Thanks for the detailed comments! Here is the latest webrev:
>> http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/. Apologizing
>> for not including an incremental webrev (realized that when I almost
>> done edits).
>>
>> On Fri, Jun 28, 2019 at 8:57 AM David Holmes <david.holmes at oracle.com> wrote:
>> >
>> > Hi Jiangli,
>> >
>> > This is very well written up - thanks.
>> >
>> > I apologize in advance that I'm about to traveling for a few days so
>> > won't be able to respond further until next week.
>>
>> Hope you have a relaxed and safe travel. This can wait.
>>
>> >
>> > On 27/06/2019 11:58 pm, Jiangli Zhou wrote:
>> > > Updated webrev:
>> > > http://cr.openjdk.java.net/~jiangli/8225035/webrev.03/
>> >
>> > Overall changes look good. I also have a concern around this code:
>> >
>> > +     tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
>> > +     assert(tls_size > 0, "unexpected size");
>> >
>> > In addition to Thomas's comments re-signedness, can't the result == 0 if
>> > there is no sttaic TLS in use? Or is there always some static TLS in use?
>>
>> On both glibc 2.24 and 2.28, by default I see there is one page memory
>> for static TLS, without explicitly defining any __thread variables in
>> user code.
>>
>> >
>> > A few other comments/requests:
>> >
>> >   822 static void get_minstack_init() {
>> >   823   _get_minstack_func =
>> >   824         (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
>> >   825 }
>> >
>> > Can you add a logging statement please:
>> >
>> > log(os, thread)("Lookup of __pthread_get_minstack %s",
>> >                  _get_minstack_func == NULL ? "failed" : "succeeded");
>>
>> Added log info.
>>
>> >
>> > --
>> >
>> >   884   // In the Linux NPTL pthread implementation the guard size mechanism
>> >
>> > Now that we have additional information on this could you update this
>> > old comment to say
>> >
>> > // In glibc versions prior to 2.7 the guard size mechanism
>>
>> Done.
>>
>> >
>> > --
>> >
>> >   897     // Adjust the stack_size by adding the on-stack TLS size if
>> >   898     // AdjustStackSizeForTLS is true. The guard size is already
>> >   899     // accounted in this case, please see comments in
>> >   900     // get_static_tls_area_size().
>> >
>> > Given the extensive commentary in get_static_tls_area_size() can we be
>> > more brief here and just say:
>> >
>> > // Adjust the stack size for on-stack TLS - see get_static_tls_area_size().
>>
>> Done.
>>
>> >
>> > --
>> >
>> > 5193   get_minstack_init();
>> >
>> > Can you make this conditional on AdjustStackSizeForTLS please so there
>> > is no affect when not using the flag - thanks.
>>
>> Done.
>>
>> >
>> > --
>> >
>> > 855   }
>> > 856   return tls_size;
>> >
>> > Can you insert a logging statement:
>> >
>> > log(os, thread)("Stack size adjustment for TLS is " SIZE_T_FORMAT,
>> > tls_size);
>>
>> Added.
>>
>> >
>> > ---
>> >
>> > test/hotspot/jtreg/runtime/TLS/T.java
>> >
>> > 37             // Starting a ProcessBuilder causes the process reaper
>> > thread being
>> >
>> > s/being/to be/
>>
>> Fixed.
>>
>> >
>> >   43             // failure mode the VM fails to create thread with
>> > error message
>> >
>> > s/create thread/create a thread/
>>
>> Fixed.
>>
>> >
>> >    53                 System.out.println("Unexpected Echo output: " +
>> > echoOutput +
>> >    54                                    ", expects: " + echoInput);
>> >
>> > should this be an exception so that test fails? I can't imagine how echo
>> > would fail but probably better to fail the test if something unexpected
>> > happens.
>>
>> If no expected output is obtained from echo (due to ProcessBuilder
>> failure caused by TLS issue), the test does fail and reports to the
>> caller (returns false). If we throw an explicit expectation, it will
>> be caught by the outer try/catch, which seems to be unnecessary.
>>
>> >
>> > 66         try {
>> >    67             br = new BufferedReader(new
>> > InputStreamReader(inputStream));
>> >    68             s = br.readLine();
>> >    69         } finally {
>> >    70             br.close();
>> >    71         }
>> >
>> > This could use try-with-resources for both streams.
>>
>> Sounds good. Done.
>>
>> >
>> > ---
>> >
>> > exestack-tls.c
>> >
>> > It's simpler if no argument means no-tls and an argument means tls.
>> >
>> >    42     char classpath[4096];
>> >    43     snprintf(classpath, sizeof classpath,
>> >    44              "-Djava.class.path=%s", getenv("CLASSPATH"));
>> >    45     options[0].optionString = classpath;
>> >
>> > Do we need to explicitly set the classpath? I'm concerned that our test
>> > environment uses really, really long paths and a number of them.
>> > (Probably not 4096 but still ...)
>>
>> The classpath needs to be set so we know where to load the test class.
>> Given that we use 4096 for the same type of usage in other existing
>> test(s) (for example StackGap), it probably is okay for our test
>> environments? I can increase the array size if we want to be extra
>> safe ...
>>
>> >
>> >
>> > test/hotspot/jtreg/runtime/TLS/testtls.sh
>> >
>> >    40 if [ "${VM_OS}" != "linux" ]
>> >    41 then
>> >    42   echo "Test is only valid for Linux"
>> >    43   exit 0
>> >    44 fi
>> >
>> > This should be done via "@requires os.family != Linux"
>>
>> Done.
>>
>> Thanks!
>>
>> Best regards,
>> Jiangli
>> >
>> > Thanks,
>> > David
>> > -----
>> >
>> > > Thanks for everyone's contribution on carving out the current workaround!
>> > >
>> > > Best regards,
>> > > Jiangli
>> > >
>> > > On Thu, Jun 27, 2019 at 11:42 AM Jiangli Zhou <jianglizhou at google.com> wrote:
>> > >>
>> > >> Thank you Thomas and David! Glad to see that we are converging on an
>> > >> acceptable approach here. I'll try to factor in all the latest inputs
>> > >> from everyone and send out a new update.
>> > >>
>> > >> Thanks and best regards,
>> > >> Jiangli
>> > >>
>> > >> On Thu, Jun 27, 2019 at 11:13 AM David Holmes <david.holmes at oracle.com> wrote:
>> > >>>
>> > >>> Trimming ....
>> > >>>
>> > >>> On 27/06/2019 12:35 pm, Jiangli Zhou wrote:
>> > >>>> On Thu, Jun 27, 2019 at 9:23 AM Florian Weimer <fweimer at redhat.com> wrote:
>> > >>>>> I think you can handle the guard size in this way:
>> > >>>>>
>> > >>>>>     pthread_attr_setguardsize(&attr, guard_size);
>> > >>>>>
>> > >>>>>     size_t stack_adjust_size = 0;
>> > >>>>>     if (AdjustStackSizeForTLS) {
>> > >>>>>       size_t minstack_size = get_minstack(&attr);
>> > >>>>>       size_t tls_size = minstack_size - vm_page_size() - PTHREAD_STACK_MIN;
>> > >>>>>       // In glibc before 2.27, tls_size still includes guard_size.
>> > >>>>>       // In glibc 2.27 and later, guard_size is automatically
>> > >>>>>       // added to the stack size by pthread_create.
>> > >>>>>       // In both cases, the guard size is taken into account.
>> > >>>>>       stack_adjust_size += tls_size;
>> > >>>>>     } else {
>> > >>>>>       stack_adjust_size += guard_size;
>> > >>>>>     }
>> > >>>>
>> > >>>> Is the vm_page_size() counted for the dl_pagesize? As long as others
>> > >>>> are okay with the above suggested adjustment, it looks good to me.
>> > >>>> Thomas, David and others, any objection?
>> > >>>
>> > >>> I find the above acceptable. I've been waiting for the dust to settle.
>> > >>>
>> > >>> Thanks,
>> > >>> David
>> > >>>
>> > >>>> Thanks and best regards,
>> > >>>> Jiangli
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Florian


More information about the hotspot-runtime-dev mailing list