RFR: 8225035: Thread stack size issue caused by large TLS size
Jiangli Zhou
jianglizhou at google.com
Sun Jun 30 22:56:12 UTC 2019
Hi Thomas,
On Fri, Jun 28, 2019 at 11:33 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>
> Hi Jiangli,
>
> http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/src/hotspot/os/linux/os_linux.cpp.udiff.html
>
> +static void get_minstack_init() {
> + _get_minstack_func =
> + (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
> + log_info(os, thread)("Lookup of __pthread_get_minstack %s",
> + _get_minstack_func == NULL ? "failed" : "succeeded");
> +}
>
> Not sure if logging even works at that point, the logging system may not yet be initialized. No problem from my view, just wanted to point that out.
The get_minstack_init happens during os::init_2, which is after log
configuration initialization. It should be safe to call logging API in
there. I also enabled -Xlog:os+thread=info in the new jtreg TLS test
and verified the logging output.
>
> + // The following 'minstack_size > os::vm_page_size() + PTHREAD_STACK_MIN'
> + // if check is done for precaution.
> + if (minstack_size > (size_t)os::vm_page_size() + PTHREAD_STACK_MIN) {
> + tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
> + }
>
> If you want you could drop the "done for precaution" comment since it is pretty clear what you do.
I debated with myself when adding the comment and decided it might
help clarify the intention for the above 'if' check. Let's keep it ...
>
> ----
>
> All these are idle musings. I am fine with the version as it is now.
Great! Thanks again!
Best regards,
Jiangli
>
> Cheers, Thomas
>
>
>
> On Sat, Jun 29, 2019 at 1:43 AM Jiangli Zhou <jianglizhou at google.com> wrote:
>>
>> Hi David,
>>
>> Thanks for the detailed comments! Here is the latest webrev:
>> http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/. Apologizing
>> for not including an incremental webrev (realized that when I almost
>> done edits).
>>
>> On Fri, Jun 28, 2019 at 8:57 AM David Holmes <david.holmes at oracle.com> wrote:
>> >
>> > Hi Jiangli,
>> >
>> > This is very well written up - thanks.
>> >
>> > I apologize in advance that I'm about to traveling for a few days so
>> > won't be able to respond further until next week.
>>
>> Hope you have a relaxed and safe travel. This can wait.
>>
>> >
>> > On 27/06/2019 11:58 pm, Jiangli Zhou wrote:
>> > > Updated webrev:
>> > > http://cr.openjdk.java.net/~jiangli/8225035/webrev.03/
>> >
>> > Overall changes look good. I also have a concern around this code:
>> >
>> > + tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
>> > + assert(tls_size > 0, "unexpected size");
>> >
>> > In addition to Thomas's comments re-signedness, can't the result == 0 if
>> > there is no sttaic TLS in use? Or is there always some static TLS in use?
>>
>> On both glibc 2.24 and 2.28, by default I see there is one page memory
>> for static TLS, without explicitly defining any __thread variables in
>> user code.
>>
>> >
>> > A few other comments/requests:
>> >
>> > 822 static void get_minstack_init() {
>> > 823 _get_minstack_func =
>> > 824 (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
>> > 825 }
>> >
>> > Can you add a logging statement please:
>> >
>> > log(os, thread)("Lookup of __pthread_get_minstack %s",
>> > _get_minstack_func == NULL ? "failed" : "succeeded");
>>
>> Added log info.
>>
>> >
>> > --
>> >
>> > 884 // In the Linux NPTL pthread implementation the guard size mechanism
>> >
>> > Now that we have additional information on this could you update this
>> > old comment to say
>> >
>> > // In glibc versions prior to 2.7 the guard size mechanism
>>
>> Done.
>>
>> >
>> > --
>> >
>> > 897 // Adjust the stack_size by adding the on-stack TLS size if
>> > 898 // AdjustStackSizeForTLS is true. The guard size is already
>> > 899 // accounted in this case, please see comments in
>> > 900 // get_static_tls_area_size().
>> >
>> > Given the extensive commentary in get_static_tls_area_size() can we be
>> > more brief here and just say:
>> >
>> > // Adjust the stack size for on-stack TLS - see get_static_tls_area_size().
>>
>> Done.
>>
>> >
>> > --
>> >
>> > 5193 get_minstack_init();
>> >
>> > Can you make this conditional on AdjustStackSizeForTLS please so there
>> > is no affect when not using the flag - thanks.
>>
>> Done.
>>
>> >
>> > --
>> >
>> > 855 }
>> > 856 return tls_size;
>> >
>> > Can you insert a logging statement:
>> >
>> > log(os, thread)("Stack size adjustment for TLS is " SIZE_T_FORMAT,
>> > tls_size);
>>
>> Added.
>>
>> >
>> > ---
>> >
>> > test/hotspot/jtreg/runtime/TLS/T.java
>> >
>> > 37 // Starting a ProcessBuilder causes the process reaper
>> > thread being
>> >
>> > s/being/to be/
>>
>> Fixed.
>>
>> >
>> > 43 // failure mode the VM fails to create thread with
>> > error message
>> >
>> > s/create thread/create a thread/
>>
>> Fixed.
>>
>> >
>> > 53 System.out.println("Unexpected Echo output: " +
>> > echoOutput +
>> > 54 ", expects: " + echoInput);
>> >
>> > should this be an exception so that test fails? I can't imagine how echo
>> > would fail but probably better to fail the test if something unexpected
>> > happens.
>>
>> If no expected output is obtained from echo (due to ProcessBuilder
>> failure caused by TLS issue), the test does fail and reports to the
>> caller (returns false). If we throw an explicit expectation, it will
>> be caught by the outer try/catch, which seems to be unnecessary.
>>
>> >
>> > 66 try {
>> > 67 br = new BufferedReader(new
>> > InputStreamReader(inputStream));
>> > 68 s = br.readLine();
>> > 69 } finally {
>> > 70 br.close();
>> > 71 }
>> >
>> > This could use try-with-resources for both streams.
>>
>> Sounds good. Done.
>>
>> >
>> > ---
>> >
>> > exestack-tls.c
>> >
>> > It's simpler if no argument means no-tls and an argument means tls.
>> >
>> > 42 char classpath[4096];
>> > 43 snprintf(classpath, sizeof classpath,
>> > 44 "-Djava.class.path=%s", getenv("CLASSPATH"));
>> > 45 options[0].optionString = classpath;
>> >
>> > Do we need to explicitly set the classpath? I'm concerned that our test
>> > environment uses really, really long paths and a number of them.
>> > (Probably not 4096 but still ...)
>>
>> The classpath needs to be set so we know where to load the test class.
>> Given that we use 4096 for the same type of usage in other existing
>> test(s) (for example StackGap), it probably is okay for our test
>> environments? I can increase the array size if we want to be extra
>> safe ...
>>
>> >
>> >
>> > test/hotspot/jtreg/runtime/TLS/testtls.sh
>> >
>> > 40 if [ "${VM_OS}" != "linux" ]
>> > 41 then
>> > 42 echo "Test is only valid for Linux"
>> > 43 exit 0
>> > 44 fi
>> >
>> > This should be done via "@requires os.family != Linux"
>>
>> Done.
>>
>> Thanks!
>>
>> Best regards,
>> Jiangli
>> >
>> > Thanks,
>> > David
>> > -----
>> >
>> > > Thanks for everyone's contribution on carving out the current workaround!
>> > >
>> > > Best regards,
>> > > Jiangli
>> > >
>> > > On Thu, Jun 27, 2019 at 11:42 AM Jiangli Zhou <jianglizhou at google.com> wrote:
>> > >>
>> > >> Thank you Thomas and David! Glad to see that we are converging on an
>> > >> acceptable approach here. I'll try to factor in all the latest inputs
>> > >> from everyone and send out a new update.
>> > >>
>> > >> Thanks and best regards,
>> > >> Jiangli
>> > >>
>> > >> On Thu, Jun 27, 2019 at 11:13 AM David Holmes <david.holmes at oracle.com> wrote:
>> > >>>
>> > >>> Trimming ....
>> > >>>
>> > >>> On 27/06/2019 12:35 pm, Jiangli Zhou wrote:
>> > >>>> On Thu, Jun 27, 2019 at 9:23 AM Florian Weimer <fweimer at redhat.com> wrote:
>> > >>>>> I think you can handle the guard size in this way:
>> > >>>>>
>> > >>>>> pthread_attr_setguardsize(&attr, guard_size);
>> > >>>>>
>> > >>>>> size_t stack_adjust_size = 0;
>> > >>>>> if (AdjustStackSizeForTLS) {
>> > >>>>> size_t minstack_size = get_minstack(&attr);
>> > >>>>> size_t tls_size = minstack_size - vm_page_size() - PTHREAD_STACK_MIN;
>> > >>>>> // In glibc before 2.27, tls_size still includes guard_size.
>> > >>>>> // In glibc 2.27 and later, guard_size is automatically
>> > >>>>> // added to the stack size by pthread_create.
>> > >>>>> // In both cases, the guard size is taken into account.
>> > >>>>> stack_adjust_size += tls_size;
>> > >>>>> } else {
>> > >>>>> stack_adjust_size += guard_size;
>> > >>>>> }
>> > >>>>
>> > >>>> Is the vm_page_size() counted for the dl_pagesize? As long as others
>> > >>>> are okay with the above suggested adjustment, it looks good to me.
>> > >>>> Thomas, David and others, any objection?
>> > >>>
>> > >>> I find the above acceptable. I've been waiting for the dust to settle.
>> > >>>
>> > >>> Thanks,
>> > >>> David
>> > >>>
>> > >>>> Thanks and best regards,
>> > >>>> Jiangli
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Florian
More information about the hotspot-runtime-dev
mailing list