RFR: 8225035: Thread stack size issue caused by large TLS size
David Holmes
david.holmes at oracle.com
Thu Jun 13 02:39:41 UTC 2019
Hi Ioi,
On 13/06/2019 11:32 am, Ioi Lam wrote:
> On 6/12/19 5:06 PM, David Holmes wrote:
>> Hi Thomas,
>>
>> > I am not sure, but is adding the TLS sized buffer to just small thread
>> > stacks the right thing to do? Why it is okay for thread stacks
>> > slightly larger than the cut-off point to live without this additional
>> > space? Will they not as well be hurt by the large TLS subtraction from
>> > their stack sizes?
>>
>> This was discussed previously:
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-May/034460.html
>>
>>
>> The premise is that the smaller the requested stack the more likely
>> unexpected TLS usage will lead to stackoverflows. If I set 32K stacks
>> and TLS uses 16K I likely have a problem at runtime. If I have 2M
>> stacks and TLS uses 16K I probably don't really care. So the proposal
>> is to use the ratio of TLS-size to requested-stack-size as the trigger
>> - configurable by the end user. That way for the expected case of
>> large stacks with small TLS, and we're not running on the edge of
>> stackoverflow, there will be no change in stack usage or general
>> behaviour. In contrast really small stacks will be expanded by the TLS
>> size to avoid stackoverflow - with some user control. In the latter
>> case if TLS can be accommodated without overflow then the user can
>> increase the ratio to avoid the extra stack usage. If TLS-size causes
>> overflow then the user has no choice if they want their application to
>> run.
>>
>> It's not perfect but nothing is.
>>
>
> Hi David,
>
> Thanks for the summary. I didn't follow this whole discussion so I
> apologize for asking dumb questions.
>
> First, I think the bug description needs to be expanded to clearly
> describe the problem and the proposed fix (and alternatives considered).
> It currently basically says -- "this was discussed before and we are
> going to fix it in a way discussed elsewhere".
I think Jiangli has more details in the CSR, but certainly the bug and
CSR can be expanded as needed.
> I am sure this has been discussed before -- is there any way for the
> users to specify both stack size (-XX:ThreadStackSize) and TLS size
> (using __thread in C code??), so they can live with their own perils
> without the VM making any guesses?
The issue, IIUC, is with TLS being "injected" by other tools which then
launch the JVM and run the application. The tools could in theory modify
the default thread stacksize based on the TLS (though the user of the
TLS may not have detailed knowledge of the actual size), but that is
only a very partial solution if the application sets their own thread
stacks sizes via the j.l.Thread creation API - which is the case with
the process reaper thread that triggered all this.
> With the proposed patch, we have added a single knob,
> -XX:AdjustWithMinStackAt, for controlling scenarios that seem to be
> affected by multiple factors. Is this a sufficient solution for all
> foreseeable scenarios? E.g., TLS used by dynamically loaded DSOs as
> described by Andrew Haley in a separate email.
This patch only tries to account for static TLS which is placed in the
stack. IIUC dynamic TLS is not placed in the stack so does not need to
be accounted for.
Note that what this patch is trying to do is perform the adjustment that
glibc "should" be doing, but isn't. But, as would be the concern with
changing this in glibc, unconditionally accounting for the static TLS
could significantly impact applications that have tuned themselves for
number of threads and thread stack sizes based on runtime conditions.
Thanks,
David
>
>
> Thanks
> - Ioi
>
>
>
>> Cheers,
>> David
>> -----
>>
>> On 13/06/2019 6:45 am, Thomas Stüfe wrote:
>>> Hi Jiangli,
>>>
>>> after learning a lot of interesting things today about TLS :-) I took
>>> a look at your change.
>>>
>>> First another question, just to be sure I understand this: From what
>>> think I understood about TLS, and from the way you wrote your test
>>> (writing an own launcher and all) I assume the problem only happens
>>> for TLS of modules which are loaded at program start, right? So
>>> Libjvm.so and dynamically loaded JNI libs can allocate as many
>>> __thread space as they like, they would not go on the thread stacks?
>>> And this is only a problem when you embed the libjvm into other
>>> launchers which use a lot of TLS?
>>>
>>> ---
>>>
>>> I do not understand why the complicated dynamic lookup of
>>> __pthread_get_minstack. What is the advantage of this two-level
>>> approach vs just dlsym'ing the pointer like we do for all other
>>> dynamically loaded functions? Also, would this not cause problems
>>> when building for other libc implementations, e.g. muslc?
>>>
>>> And why RTLD_NEXT, why not RTLD_DEFAULT? Any special reason? Since
>>> that seems to be how we normally load dynamic symbols from the libc.
>>>
>>> About the fkt pointer, in contrast to David I would prefer to see it
>>> is a pointer, not a function, so your first name which I assume was
>>> "p_.." I would have liked more.
>>>
>>> ---
>>>
>>> + size_t min_stack_size = align_up(pthread_get_minstack(&attr),
>>> vm_page_size());
>>>
>>> Why the align up? The only condition we have to meet is for the total
>>> stack size to be page aligned. On some platforms page size can be
>>> large (I think we have 64K pages on ppc) and TLS size could be
>>> smaller than that. Could you move the align up do right before the
>>> call to pthread_attr_setstacksize() (and probably remove the assert
>>> there)?
>>>
>>> ---
>>>
>>> What is the logic behind giving attr to pthread_get_minstack? If
>>> Florian is right and this:
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-May/034593.html
>>> is the implementation, that parameter seems not to be used. In that
>>> case, just to be sure, I would feed it a dummy attr.
>>>
>>> ---
>>>
>>> I am not sure, but is adding the TLS sized buffer to just small
>>> thread stacks the right thing to do? Why it is okay for thread stacks
>>> slightly larger than the cut-off point to live without this
>>> additional space? Will they not as well be hurt by the large TLS
>>> subtraction from their stack sizes?
>>>
>>> Thank you!
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>> On Tue, Jun 11, 2019 at 12:30 AM Jiangli Zhou <jianglizhou at google.com
>>> <mailto:jianglizhou at google.com>> wrote:
>>>
>>> Hi David and Florian,
>>>
>>> This the a follow up of the discussion regarding the TLS (Thread
>>> Local
>>> Storage) issue:
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-May/034459.html.
>>>
>>> The following is an updated patch with your suggestions
>>> incorporated:
>>>
>>> 1) Use __pthread_get_minstack
>>> 2) By default only adjust the thread stack size if the
>>> minstack-to-stacksize ratio is >= 10%
>>> 3) Introduce a new command-line option (Linux only),
>>> AdjustWithMinStackAt, which can be used to change the default 10%
>>> ratio for triggering stack size adjustment
>>>
>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8225498
>>> webrev: http://cr.openjdk.java.net/~jiangli/8225035/webrev.00/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8225035
>>>
>>> Best regards,
>>>
>>> Jiangli
>>>
>
More information about the hotspot-runtime-dev
mailing list