Thread stack size issue related to glibc TLS bug

David Holmes david.holmes at oracle.com
Fri May 24 02:08:44 UTC 2019


Hi Jiangli,

On 24/05/2019 9:21 am, Jiangli Zhou wrote:
> Hi David (and others),
> 
> There was a discussion [1] (between you, Jeremy, Martin and others)
> back in 2015 regarding a stack size issue caused by a glibc bug
> related to TLS (thread-local storage) [2]. The issue manifested as
> a StackOverflowError with the reported test in JDK-8130425 [0] when
> a large TLS size is used. A workaround was introduced with
> -Djdk.lang.processReaperUseDefaultStackSize. Based on the glibc
> discussion thread [2], Rust implemented a fix that takes the TLS size
> into account. From one of the comments in the OpenJDK discussion
> archive [1], it looks like you considered that a similar fix could be
> applied to the JVM. I talked to Jeremy today about sharing his fix for
> this issue. The fix appears to be a more general solution than the
> processReaperUseDefaultStackSize workaround. It has been tested/used
> for several years and seems to be stable. The link to the changeset is
> listed below. Please let me know your thoughts on taking the change in
> OpenJDK.

My thoughts haven't really changed since 2015 - and sadly neither has 
there been any change in glibc in that time. Nor, to my recollection, 
have there been any other reported issues with this.

If this were to be taken into hotspot then I think it has to be opt-in 
via a flag so that it doesn't make sudden and unexpected differences in 
the number of threads an application can create. It may also be worth 
considering, from the bugzilla discussion, only adding in the TLS size 
if it is greater than a certain percentage of the stack size being 
requested. That would limit the impact to threads with small stacks 
without forcing every thread to have to grow by the TLS size.
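
A rough sketch of that heuristic, for illustration only (this is not the 
proposed change; the function name, its parameters and the threshold 
value are hypothetical, and the TLS size is assumed to be supplied by 
the caller rather than queried from glibc):

```cpp
#include <cstddef>

// Hypothetical helper, not HotSpot code: returns the stack size to request
// for a new thread. tls_size is assumed to be known to the caller; how it
// is obtained from glibc is outside the scope of this sketch.
std::size_t adjusted_stack_size(std::size_t requested,
                                std::size_t tls_size,
                                bool opt_in,
                                unsigned threshold_percent) {
  // Flag off, or no explicit size requested (0 here stands for "use the
  // default"): leave the request untouched.
  if (!opt_in || requested == 0) {
    return requested;
  }
  // Add the TLS size only when it exceeds threshold_percent of the
  // requested stack. Dividing first avoids overflow; the integer division
  // rounds the threshold down slightly.
  if (tls_size > requested / 100 * threshold_percent) {
    return requested + tls_size;
  }
  return requested;
}
```

With made-up numbers: a 128 KB request, 64 KB of TLS and a 10% threshold 
would yield a 192 KB request, while an 8 MB request with 4 KB of TLS 
would be left unchanged.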

But I'd want to know how often this is actually needed. As Andrew Haley 
said in the original discussion thread: "I think we're rather looking at 
abuse of TLS here."

And I'd need to understand better what versions of glibc this would work 
for (and how they relate to current distros).

Cheers,
David

> [0] JDK bug: https://bugs.openjdk.java.net/browse/JDK-8130425
> [1] OpenJDK discussion archive:
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-December/037558.html
> [2] glibc discussion archive:
> http://sourceware.org/bugzilla/show_bug.cgi?id=11787
> [3] change: http://cr.openjdk.java.net/~jiangli/tls_size/webrev/
> (contributed by Jeremy Manson)
> 
> The #ifdef __GLIBC__ in the change could be removed, as os_linux.cpp
> already assumes the use of glibc.
> 
> Best regards,
> Jiangli
> 