RFR: Avoid jtreg test timeout in aarch64 due to a stackoverflow in process reaper

Arthur Eubanks aeubanks at openjdk.java.net
Fri May 8 16:21:42 UTC 2020


On Fri, 8 May 2020 02:40:51 GMT, Jie He <github.com+10233373+jhe33 at openjdk.org> wrote:

>> I read through https://groups.google.com/forum/#!topic/thread-sanitizer/RsPcxUXBokg (nice investigation btw!) but don't
>> quite understand
>>> and found the different GLIBC behaviors between x86 and aarch64 stack
>>> allocation due to default stack size in openjdk. x86 will get the stack from
>>> glibc cached stack because it matches the threshold to allocate a stack from
>>> cached stack, but aarch64 not.
>> 
>> Do you mean that on x86 the thread's stack is larger than requested because glibc happened to have something
>> larger lying around to use? So are we currently just getting lucky with stack sizes on x86?
> 
> Yes, but not exactly. How x86 gets its stack from the cached stacks is a separate story.
> 
> When I first started investigating the SOE failure, I noticed that x86 and aarch64 behave differently. TSAN
> increases the stack size to 384K, but x86 could always get a 1M stack, while aarch64 couldn't. I thought that might
> be the reason there was no SOE on x86. In fact, as you already know, it's not the root cause: even when x86 gets a
> 384K stack by bypassing the glibc allocation, it doesn't hit an SOE in this case. Still, I had to take time to
> investigate glibc. By default, the stack size in openjdk is 1M on x86 and 2M on aarch64. I assume aarch64 consumes
> more stack than x86 in most cases. glibc sometimes allocates the stack from its cached stacks, depending on whether
> the requested stack size is larger than 1/4 of the cached stack size. Here, 384K > 1/4 * 1M on x86, but 384K is not
> > 1/4 * 2M on aarch64. Anyway, I think the TSAN issue also reduces the effective usable stack on x86, which could
> make an SOE more likely even though it doesn't happen in this case.
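
For reference, the threshold arithmetic described above can be sketched as follows. This is only an
illustration of the comparison as stated in the quoted message, not glibc's actual code; the helper name
can_reuse_cached_stack is hypothetical, and the sizes are the ones given in the message (384K requested
by TSAN, 1M and 2M openjdk defaults).

```python
KIB = 1024


def can_reuse_cached_stack(requested: int, cached: int) -> bool:
    """Hypothetical helper: True when the requested size is larger than
    1/4 of the cached stack size, as described in the message."""
    # Multiply instead of dividing to stay in integer arithmetic.
    return requested * 4 > cached


requested = 384 * KIB  # stack size after the TSAN increase
defaults = {"x86": 1024 * KIB, "aarch64": 2048 * KIB}  # openjdk defaults

for arch, cached in defaults.items():
    print(arch, can_reuse_cached_stack(requested, cached))
```

Under these assumptions the check passes for the 1M x86 stack and fails for the 2M aarch64 stack, which
matches the difference in behavior described in the message.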

LGTM

-------------

PR: https://git.openjdk.java.net/tsan/pull/8