Need help to understand TLS behavior
David Holmes
david.holmes at oracle.com
Tue Dec 15 10:00:34 UTC 2015
On 15/12/2015 7:25 PM, Thomas Stüfe wrote:
> Hi Jeremy, David,
>
> I would like to understand the problem better and have some questions,
> maybe you could answer?
The "specification" for ELF-based TLS, as I understood it, is the Ulrich
Drepper document I referenced, "ELF Handling for Thread Local Storage",
for which I don't have a URL handy. But you can see basically the same
kind of information in the Solaris document:
https://docs.oracle.com/cd/E26502_01/pdf/E26507.pdf
The details are complex, as they depend on the exact model being used.
> - What is the difference between "static __thread int x" and "__thread
> int x" - one lives in the thread stacks, one does not?
If we are working with the initial executable or a statically linked
library then the storage for a static TLS variable can be allocated in a
static data segment. For a dynamic library there are two options: there
may be some spare space in the static area which may be used, or the
storage is allocated in the Thread-Control-Block - again it depends on
the model. And I guess it is up to the implementation as to how/where it
manages TCBs - they might be placed at the top of the stack, I don't know.
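As a rough illustration of the two declarations being asked about, here is a minimal sketch, assuming GCC or Clang on Linux with compiler-based __thread support. The names tls_internal, tls_external, worker and tls_copies_are_per_thread are made up for this example. It only shows that both forms give every thread its own copy; where that storage actually lives still depends on the TLS model, and this code cannot tell you that.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names, for illustration only. */
static __thread int tls_internal = 0;  /* "static __thread": internal linkage */
__thread int tls_external = 0;         /* plain "__thread": external linkage  */

/* Each thread writes its own id into both variables and reads them back. */
static void *worker(void *arg)
{
    int id = *(int *)arg;

    tls_internal = id;
    tls_external = id * 10;
    /* Report what this thread saw back through its own argument slot. */
    *(int *)arg = tls_internal + tls_external;
    return NULL;
}

/* Returns 0 if every thread saw only its own values and the calling
 * thread's copies were left untouched; non-zero otherwise. */
int tls_copies_are_per_thread(void)
{
    pthread_t threads[2];
    int slots[2] = {1, 2};
    int i;

    for (i = 0; i < 2; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &slots[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);

    if (slots[0] != 11 || slots[1] != 22)
        return 1;   /* a thread saw another thread's value */
    if (tls_internal != 0 || tls_external != 0)
        return 2;   /* the calling thread's copies were modified */
    return 0;
}
```

Built with -pthread and called from main(), tls_copies_are_per_thread() should return 0 for both declarations, which suggests the static keyword changes linkage rather than whether the variable is per-thread.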
> - What happens with existing threads if a library is loaded which uses
> this form of TLS?
See above.
> - Does this TLS live at the top of the stack? So, to find out if a third
> party library uses this form of TLS, we could check the distance of sp
> to the stack base in java_start()?
Depends on the implementation. I'm not concerned with trying to solve
any general problem here, just the problem with this specific test
(which I have not yet seen :) )
> - Would the region returned by pthread_attr_getstack() include the TLS
> region?
That's up to glibc.
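One way to probe this empirically, rather than rely on documentation, is to give a thread a caller-allocated stack and have the thread report whether the address of a __thread variable falls inside that block. The following is only a sketch under assumptions: Linux with GCC or Clang, a 4 MB stack chosen arbitrarily, and made-up names (tls_block, struct probe, probe_thread, tls_inside_supplied_stack). It uses pthread_attr_setstack() instead of querying a glibc-allocated stack, so the answer may not carry over to threads whose stacks glibc allocates itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define TLS_INTS    (32 * 1024)           /* same element count as the test's XYZ */
#define STACK_BYTES (4u * 1024u * 1024u)  /* hypothetical, generous stack size    */

static_assert(STACK_BYTES > 2 * sizeof(int) * TLS_INTS,
              "stack must comfortably exceed the TLS block");

static __thread int tls_block[TLS_INTS];

struct probe {
    uintptr_t lo, hi;   /* bounds of the caller-supplied stack memory */
    int tls_inside;     /* filled in by the probed thread */
};

static void *probe_thread(void *arg)
{
    struct probe *p = arg;
    uintptr_t addr = (uintptr_t)&tls_block[0];

    tls_block[0] = 1;   /* touch the storage so it is really in use */
    p->tls_inside = (addr >= p->lo && addr < p->hi);
    return NULL;
}

/* Returns 1 if the TLS block lay inside the supplied stack memory,
 * 0 if it lay outside, and -1 if the probe could not be set up. */
int tls_inside_supplied_stack(void)
{
    pthread_attr_t attr;
    pthread_t t;
    struct probe p;
    void *mem = NULL;
    long page = sysconf(_SC_PAGESIZE);
    int rc;

    if (page <= 0 || posix_memalign(&mem, (size_t)page, STACK_BYTES) != 0)
        return -1;
    p.lo = (uintptr_t)mem;
    p.hi = p.lo + STACK_BYTES;
    p.tls_inside = 0;

    if (pthread_attr_init(&attr) != 0) {
        free(mem);
        return -1;
    }
    rc = pthread_attr_setstack(&attr, mem, STACK_BYTES);
    if (rc == 0)
        rc = pthread_create(&t, &attr, probe_thread, &p);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        free(mem);
        return -1;
    }

    pthread_join(t, NULL);  /* the thread is finished, so the stack can go */
    free(mem);
    return p.tls_inside;
}
```

The result is deliberately not asserted here: whether it comes back as 1 or 0 is exactly the implementation detail in question, and it may differ between libc versions and TLS models.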
David
> Thanks!
>
> Kind Regards, Thomas
>
>
> On Tue, Dec 15, 2015 at 8:25 AM, David Holmes
> <david.holmes at oracle.com> wrote:
>
> On 15/12/2015 4:32 PM, Jeremy Manson wrote:
>
> David: What the spec says and what glibc does are two different
> things:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=11787
>
> We have an internal Google patch to compensate for this. Nasty
> stuff.
>
>
> Nasty isn't even the right word - this is just ludicrous! And the
> bug has just languished even though they were going to fix it years
> ago!!!!! And I also cried when I saw the part finally recognizing
> that glibc does the wrong thing by taking the guard pages from the
> requested stack size!
>
> To me this just screams don't use TLS on linux except for trivially
> small data structures, or else use static-TLS.
>
> Which brings me back to this test - make the variable static!
>
> Thanks Jeremy. I'm thoroughly depressed now.
>
> David
> -----
>
> Jeremy
>
> On Mon, Dec 14, 2015 at 8:44 PM, David Holmes
> <david.holmes at oracle.com> wrote:
>
> On 15/12/2015 6:53 AM, Martin Buchholz wrote:
>
> Thread local storage is trouble.
>
> java stack sizes should be in _addition_ to any OS overhead,
> which includes TLS.
>
>
> TLS shouldn't be coming out of the stack of the thread AFAIK - I see
> nothing about that in "ELF Handling for Thread Local Storage". That
> is why I want to know more about the test, the compilation
> environment and the execution environment.
>
> IOW, the java thread stack size should actually be available for
> stack frames.
> Hotspot should be fixed, but it's not easy.
>
>
> Do you mean that the value specified at the Java level should be
> rounded up to accommodate guard pages etc at the native level?
>
> David
> -----
>
>
> On Mon, Dec 14, 2015 at 12:47 PM, David Holmes
> <david.holmes at oracle.com> wrote:
>
> On 14/12/2015 11:06 PM, cheleswer sahu wrote:
>
>
> Hi David,
> TLS is thread local storage. In the test program it is defined using
>
> #define TLS_SIZE 32
> int __thread XYZ[TLS_SIZE * 1024];
>
>
>
> Thanks for clarifying. What test is that? I'm guessing this may be a
> Linux-only test? Which platform do you see the problem on?
>
> We don't unconditionally use compiler-based TLS as some platforms may
> not support it.
>
> That aside, that declaration should really be static, I think.
>
> David
> -----
>
>
> Regards,
> Cheleswer
> On 12/14/2015 6:29 PM, David Holmes wrote:
>
>
> What is TLS in this context?
>
> Thanks,
> David
>
> On 14/12/2015 10:34 PM, cheleswer sahu wrote:
>
>
> Hi,
>
> I am investigating an issue in which a test with TLS size set to 32K
> fails with StackOverflowError. During the investigation I found the
> code below:
>
> http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/tip/src/solaris/classes/java/lang/UNIXProcess.java
>
> ThreadFactory threadFactory = grimReaper -> {
>     // Our thread stack requirement is quite modest.
>     Thread t = new Thread(systemThreadGroup, grimReaper,
>                           "process reaper", 32768);
>
> Here the reaper thread is created with a fixed stack size of 32768,
> which causes StackOverflowError when TLS is set to around 32K.
> If I remove this fixed size and use the default, the test works fine:
>
> Thread t = new Thread(systemThreadGroup, grimReaper,
>                       "process reaper");
>
> I have run several tests with TLS sizes of 32K, 64K, 128K and more.
> Interestingly, it works well with 64K and 128K TLS sizes but not
> with 32K.
> So my questions are as follows:
>
> What is the motivation behind the fixed thread stack size?
> Would it be OK to replace the fixed stack size with the default, or is
> the stack size setting platform sensitive?
>
> How are TLS sizes interpreted internally, such that 64K and 128K
> work but 32K does not?
>
> I would really appreciate it if anyone has an opinion on this.
>
> Regards,
> Cheleswer
More information about the core-libs-dev
mailing list