RFR: 8225035: Thread stack size issue caused by large TLS size
David Holmes
david.holmes at oracle.com
Wed Jun 12 02:25:42 UTC 2019
Hi Jiangli,
On 12/06/2019 3:51 am, Jiangli Zhou wrote:
> Hi David,
>
> On Tue, Jun 11, 2019 at 12:56 AM David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Jiangli,
>>
>> Some initial feedback ...
>>
>> On 11/06/2019 8:29 am, Jiangli Zhou wrote:
>>> Hi David and Florian,
>>>
>>> This the a follow up of the discussion regarding the TLS (Thread Local
>>> Storage) issue:
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-May/034459.html.
>>> The following is an updated patch with your suggestions incorporated:
>>>
>>> 1) Use __pthread_get_minstack
>>> 2) By default only adjust the thread stack size if the
>>> minstack-to-stacksize ratio is >= 10%
>>> 3) Introduce a new command-line option (Linux only),
>>> AdjustWithMinStackAt, which can be used to change the default 10%
>>> ratio for triggering stack size adjustment
>>
>> I don't understand why the flag range is 0.0 to 50.0. If 0.1 = 10% then
>> 50 = 500%, which means the minstack is 5x greater than the actual
>> requested stack! The flag has to be limited between 0 and 100% AFAICS
>> else the requested stack size is too small to accommodate the "min
>> stack" and the attempt to create the thread will presumably fail. ??
>
> The minstack size is added on top of the current calculated size and
> the resulting value is passed to pthread_attr_setstacksize before the
> thread creation. We will be able to create a thread with stack size
> that's large enough. I initially thought about using [0%, 100%] range,
> which is 0 to 1 as well. However, that doesn't apply to the cases
> where the minstack size is greater then the requested size. Not sure
> how rare those cases are in reality, it is fairly easy to create a
> test case with a minstack > request_size. In the added jtreg test, the
> minstack is 540k with the TLS values.
>
> I picked 50.0, but I'm not too certain about using it as the upper
> bound. We probably want to use a larger value to cover more cases.
My point is that if TLS > requested stack then without enabling this
workaround the thread creation will presumably fail - right? And it
makes no sense to me to say "use the workaround if TLS is 5x greater
than normal stack but don't if its only 4x greater" as both cases are
equivalent from the perspective of "can the thread run" (answer: no!).
I think this only needs to be a simple percentage flag: if TLS-size is >
X% of requested stack size then add TLS-size. So 0% would always enable
it and 100% would disable it until TLS-size was equal to requested stack
size. I don't think it make sense to consider the case where TLS >
requested stack size.
>>
>> 827 static void pthread_get_minstack_init() {
>> 828 // The code below uses the static symbol if available.
>> 829 // If not, it will use dlsym to search for the symbol at runtime.
>> 830 p_pthread_get_minstack = &__pthread_get_minstack;
>>
>> I don't understand this part. If at runtime the "static" symbol doesn't
>> exist, won't this generate an error from the loader and prevent the VM
>> from starting?
>
> I see Florian has provided detailed information on this question (thanks!).
>
>>
>> Style nit: p_pthread_get_minstack
>>
>> Can you drop the "p" prefix and just use _pthread_get_minstack similar
>> to other dynamically determined functions please.
>
> Done.
>
>>
>> + min_stack_size = (*p_pthread_get_minstack)(attr);
>>
>> When you have a function pointer you don't need to dereference it to
>> invoke it, you can just do:
>>
>> min_stack_size = p_pthread_get_minstack(attr);
>
> Dropped dereferencing.
>
>>
>> + if (p_pthread_get_minstack) {
>>
>> Style nit: Please avoid implicit booleans and check != NULL.
>
> Done.
>
>>
>> I don't quite see what the test is doing - why does the Java code run
>> "echo" ??
>
> The TLS issue can be easily demonstrated by starting a process using
> ProcessBuilder. It could be any valid command that can be passed to
> ProcessBuilder. "echo" is inherited from the original test case
> attached to JDK-8130425.
Sorry I'm missing something here. What has an external process got to do
with TLS stealing stack from a newly created thread ??
>> I expected to see different values of AdjustWithMinStackAt
>> being used and threads then checking whether their stacks have been
>> adjusted or not; and in some cases to get errors. ??
>
> It's a bit tricky, as we may not know the exact failure behavior
> caused by insufficient stack space in different environments...
Nevertheless I expect that a test in this area would be combining
different amounts of TLS with different values of the flag to check
whether thread's can be created and run. At the moment I don't get what
this test is actually testing - what constitutes a pass and what
constitutes a fail?
Thanks,
David
>>
>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8225498
>>
>> I'm still thinking about the right name, type and range for the flag,
>> which will impact the CSR.
>
> Thanks! Please let me know your suggestions, I'll be more than happy
> to incorporate them.
>
> Best regards,
> Jiangli
>
>>
>> Thanks,
>> David
>> -----
>>
>>> webrev: http://cr.openjdk.java.net/~jiangli/8225035/webrev.00/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8225035
>>>
>>> Best regards,
>>>
>>> Jiangli
>>>
More information about the hotspot-runtime-dev
mailing list