RFR: 8225035: Thread stack size issue caused by large TLS size
Thomas Stüfe
thomas.stuefe at gmail.com
Sat Jun 29 06:33:35 UTC 2019
Hi Jiangli,
http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/src/hotspot/os/linux/os_linux.cpp.udiff.html
+static void get_minstack_init() {
+ _get_minstack_func =
+ (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
+ log_info(os, thread)("Lookup of __pthread_get_minstack %s",
+ _get_minstack_func == NULL ? "failed" :
"succeeded");
+}
Not sure if logging even works at that point, the logging system may not
yet be initialized. No problem from my view, just wanted to point that out.
+ // The following 'minstack_size > os::vm_page_size() +
PTHREAD_STACK_MIN'
+ // if check is done for precaution.
+ if (minstack_size > (size_t)os::vm_page_size() + PTHREAD_STACK_MIN) {
+ tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
+ }
If you want you could drop the "done for precaution" comment since it is
pretty clear what you do.
----
All these are idle musings. I am fine with the version as it is now.
Cheers, Thomas
On Sat, Jun 29, 2019 at 1:43 AM Jiangli Zhou <jianglizhou at google.com> wrote:
> Hi David,
>
> Thanks for the detailed comments! Here is the latest webrev:
> http://cr.openjdk.java.net/~jiangli/8225035/webrev.04/. Apologizing
> for not including an incremental webrev (realized that when I almost
> done edits).
>
> On Fri, Jun 28, 2019 at 8:57 AM David Holmes <david.holmes at oracle.com>
> wrote:
> >
> > Hi Jiangli,
> >
> > This is very well written up - thanks.
> >
> > I apologize in advance that I'm about to traveling for a few days so
> > won't be able to respond further until next week.
>
> Hope you have a relaxed and safe travel. This can wait.
>
> >
> > On 27/06/2019 11:58 pm, Jiangli Zhou wrote:
> > > Updated webrev:
> > > http://cr.openjdk.java.net/~jiangli/8225035/webrev.03/
> >
> > Overall changes look good. I also have a concern around this code:
> >
> > + tls_size = minstack_size - os::vm_page_size() - PTHREAD_STACK_MIN;
> > + assert(tls_size > 0, "unexpected size");
> >
> > In addition to Thomas's comments re-signedness, can't the result == 0 if
> > there is no sttaic TLS in use? Or is there always some static TLS in use?
>
> On both glibc 2.24 and 2.28, by default I see there is one page memory
> for static TLS, without explicitly defining any __thread variables in
> user code.
>
> >
> > A few other comments/requests:
> >
> > 822 static void get_minstack_init() {
> > 823 _get_minstack_func =
> > 824 (GetMinStack)dlsym(RTLD_DEFAULT, "__pthread_get_minstack");
> > 825 }
> >
> > Can you add a logging statement please:
> >
> > log(os, thread)("Lookup of __pthread_get_minstack %s",
> > _get_minstack_func == NULL ? "failed" : "succeeded");
>
> Added log info.
>
> >
> > --
> >
> > 884 // In the Linux NPTL pthread implementation the guard size
> mechanism
> >
> > Now that we have additional information on this could you update this
> > old comment to say
> >
> > // In glibc versions prior to 2.7 the guard size mechanism
>
> Done.
>
> >
> > --
> >
> > 897 // Adjust the stack_size by adding the on-stack TLS size if
> > 898 // AdjustStackSizeForTLS is true. The guard size is already
> > 899 // accounted in this case, please see comments in
> > 900 // get_static_tls_area_size().
> >
> > Given the extensive commentary in get_static_tls_area_size() can we be
> > more brief here and just say:
> >
> > // Adjust the stack size for on-stack TLS - see
> get_static_tls_area_size().
>
> Done.
>
> >
> > --
> >
> > 5193 get_minstack_init();
> >
> > Can you make this conditional on AdjustStackSizeForTLS please so there
> > is no affect when not using the flag - thanks.
>
> Done.
>
> >
> > --
> >
> > 855 }
> > 856 return tls_size;
> >
> > Can you insert a logging statement:
> >
> > log(os, thread)("Stack size adjustment for TLS is " SIZE_T_FORMAT,
> > tls_size);
>
> Added.
>
> >
> > ---
> >
> > test/hotspot/jtreg/runtime/TLS/T.java
> >
> > 37 // Starting a ProcessBuilder causes the process reaper
> > thread being
> >
> > s/being/to be/
>
> Fixed.
>
> >
> > 43 // failure mode the VM fails to create thread with
> > error message
> >
> > s/create thread/create a thread/
>
> Fixed.
>
> >
> > 53 System.out.println("Unexpected Echo output: " +
> > echoOutput +
> > 54 ", expects: " + echoInput);
> >
> > should this be an exception so that test fails? I can't imagine how echo
> > would fail but probably better to fail the test if something unexpected
> > happens.
>
> If no expected output is obtained from echo (due to ProcessBuilder
> failure caused by TLS issue), the test does fail and reports to the
> caller (returns false). If we throw an explicit expectation, it will
> be caught by the outer try/catch, which seems to be unnecessary.
>
> >
> > 66 try {
> > 67 br = new BufferedReader(new
> > InputStreamReader(inputStream));
> > 68 s = br.readLine();
> > 69 } finally {
> > 70 br.close();
> > 71 }
> >
> > This could use try-with-resources for both streams.
>
> Sounds good. Done.
>
> >
> > ---
> >
> > exestack-tls.c
> >
> > It's simpler if no argument means no-tls and an argument means tls.
> >
> > 42 char classpath[4096];
> > 43 snprintf(classpath, sizeof classpath,
> > 44 "-Djava.class.path=%s", getenv("CLASSPATH"));
> > 45 options[0].optionString = classpath;
> >
> > Do we need to explicitly set the classpath? I'm concerned that our test
> > environment uses really, really long paths and a number of them.
> > (Probably not 4096 but still ...)
>
> The classpath needs to be set so we know where to load the test class.
> Given that we use 4096 for the same type of usage in other existing
> test(s) (for example StackGap), it probably is okay for our test
> environments? I can increase the array size if we want to be extra
> safe ...
>
> >
> >
> > test/hotspot/jtreg/runtime/TLS/testtls.sh
> >
> > 40 if [ "${VM_OS}" != "linux" ]
> > 41 then
> > 42 echo "Test is only valid for Linux"
> > 43 exit 0
> > 44 fi
> >
> > This should be done via "@requires os.family != Linux"
>
> Done.
>
> Thanks!
>
> Best regards,
> Jiangli
> >
> > Thanks,
> > David
> > -----
> >
> > > Thanks for everyone's contribution on carving out the current
> workaround!
> > >
> > > Best regards,
> > > Jiangli
> > >
> > > On Thu, Jun 27, 2019 at 11:42 AM Jiangli Zhou <jianglizhou at google.com>
> wrote:
> > >>
> > >> Thank you Thomas and David! Glad to see that we are converging on an
> > >> acceptable approach here. I'll try to factor in all the latest inputs
> > >> from everyone and send out a new update.
> > >>
> > >> Thanks and best regards,
> > >> Jiangli
> > >>
> > >> On Thu, Jun 27, 2019 at 11:13 AM David Holmes <
> david.holmes at oracle.com> wrote:
> > >>>
> > >>> Trimming ....
> > >>>
> > >>> On 27/06/2019 12:35 pm, Jiangli Zhou wrote:
> > >>>> On Thu, Jun 27, 2019 at 9:23 AM Florian Weimer <fweimer at redhat.com>
> wrote:
> > >>>>> I think you can handle the guard size in this way:
> > >>>>>
> > >>>>> pthread_attr_setguardsize(&attr, guard_size);
> > >>>>>
> > >>>>> size_t stack_adjust_size = 0;
> > >>>>> if (AdjustStackSizeForTLS) {
> > >>>>> size_t minstack_size = get_minstack(&attr);
> > >>>>> size_t tls_size = minstack_size - vm_page_size() -
> PTHREAD_STACK_MIN;
> > >>>>> // In glibc before 2.27, tls_size still includes guard_size.
> > >>>>> // In glibc 2.27 and later, guard_size is automatically
> > >>>>> // added to the stack size by pthread_create.
> > >>>>> // In both cases, the guard size is taken into account.
> > >>>>> stack_adjust_size += tls_size;
> > >>>>> } else {
> > >>>>> stack_adjust_size += guard_size;
> > >>>>> }
> > >>>>
> > >>>> Is the vm_page_size() counted for the dl_pagesize? As long as others
> > >>>> are okay with the above suggested adjustment, it looks good to me.
> > >>>> Thomas, David and others, any objection?
> > >>>
> > >>> I find the above acceptable. I've been waiting for the dust to
> settle.
> > >>>
> > >>> Thanks,
> > >>> David
> > >>>
> > >>>> Thanks and best regards,
> > >>>> Jiangli
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Florian
>
More information about the hotspot-runtime-dev
mailing list