Thread stack size issue related to glibc TLS bug
David Holmes
david.holmes at oracle.com
Fri May 24 02:08:44 UTC 2019
Hi Jiangli,
On 24/05/2019 9:21 am, Jiangli Zhou wrote:
> Hi David (and others),
>
> There was a discussion [1] (between you, Jeremy, Martin and others)
> back in 2015 regarding a stack size issue caused by a glibc bug
> related to TLS (thread-local storage) [2]. The issue manifested as
> a StackOverflowError with the reported test in JDK-8130425 [0] when
> a large TLS size is used. A workaround was introduced with
> -Djdk.lang.processReaperUseDefaultStackSize. Based on the glibc
> discussion thread [2], Rust implemented a fix by taking the TLS size
> into account. From one of the comments in the OpenJDK discussion
> archive [1], it looks like you considered that a similar fix could be
> applied to the JVM. I talked to Jeremy about sharing his fix for this particular
> issue today. The fix appears to be a more general solution than the
> processReaperUseDefaultStackSize workaround. It has been tested/used
> for several years and seems to be stable. The link to the changeset is
> listed below. Please let me know your thoughts on taking the change in
> OpenJDK.
My thoughts haven't really changed since 2015 - and sadly neither has
there been any change in glibc in that time. Nor, to my recollection,
have there been any other reported issues with this.
If this were to be taken into hotspot then I think it has to be opt-in
via a flag so that it doesn't make sudden and unexpected differences in
the number of threads an application can create. It may also be worth
considering, from the bugzilla discussion, only adding in the TLS size
if it is greater than a certain percentage of the stack size being
requested. That would limit the impact to threads with small stacks
without forcing every thread to have to grow by the TLS size.
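As a rough illustration of the threshold idea above (a sketch only: the
function name, the opt-in parameter and the 10% figure are hypothetical
and are not taken from the proposed change or from HotSpot), the
adjustment could look something like this:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not HotSpot code. All names are assumptions.
//   requested     - stack size the caller asked for, in bytes
//   tls_size      - static TLS size that glibc carves out of the stack, in bytes
//   opt_in        - stands in for an opt-in flag; false keeps current behaviour
//   threshold_pct - compensate only when TLS exceeds this percentage of the request
inline std::size_t adjusted_stack_size(std::size_t requested,
                                       std::size_t tls_size,
                                       bool opt_in,
                                       unsigned threshold_pct) {
  if (!opt_in || requested == 0) {
    return requested;  // default: thread counts and sizes are unchanged
  }
  // Integer form of tls_size / requested > threshold_pct / 100.
  // Dividing first avoids overflow at the cost of slight rounding.
  if (tls_size <= (requested / 100) * threshold_pct) {
    return requested;  // TLS is a small fraction of the stack: leave it alone
  }
  if (tls_size > SIZE_MAX - requested) {
    return requested;  // the addition would overflow: do not adjust
  }
  return requested + tls_size;  // small stack, large TLS: grow by the TLS size
}
```

With a 64K request and 8K of TLS this grows the stack to 72K, while a 1M
request with the same TLS size is left alone.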
But I'd want to know how often this is actually needed. As Andrew Haley
said in the original discussion thread "I think we're rather looking at
abuse of TLS here.".
And I'd need to understand better what versions of glibc this would work
for (and how they relate to current distros).
Cheers,
David
> [0] JDK bug: https://bugs.openjdk.java.net/browse/JDK-8130425
> [1] OpenJDK discussion archive:
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-December/037558.html
> [2] glibc discussion archive:
> http://sourceware.org/bugzilla/show_bug.cgi?id=11787
> [3] change: http://cr.openjdk.java.net/~jiangli/tls_size/webrev/
> (contributed by Jeremy Manson)
>
> The #ifdef __GLIBC__ in the change could be removed, as os_linux.cpp
> already assumes the use of glibc.
>
> Best regards,
> Jiangli
>