Backporting stack guard fixes from JDK-9 (8169373+8159335+8139864)
Thomas Stüfe
thomas.stuefe at gmail.com
Wed Dec 6 16:22:34 UTC 2023
Hi Jan,
are you sure this crash is related to JDK-8169373?
As far as I remember, that bug could cause threads to fail to start
because the specified thread stack size was too small; I don't see how it
can cause crashes like the one you describe. Once the thread has been
started successfully, things should work, no?
Thanks, Thomas
On Wed, Dec 6, 2023 at 5:07 PM Jan Kratochvil (Azul) <jkratochvil at azul.com>
wrote:
> Hello,
>
> I got a crash report for an OpenJDK-8 derivative (Azul's Zulu-8 build)
> on aarch64. Details are at the end. I believe it could be fixed by the
> backports below, but the resulting patch gets too big (>10k LoC = lines
> of code).
>
> Do you have an idea how to make the backport feasible for JDK-8?
>
> JDK-8169373: Work around linux NPTL stack guard error
> 13 files changed, 165 insertions(+), 425 deletions(-)
>
> Unfortunately this patch depends on a lot of other code from JDK-9 and
> I believe these fixes are related anyway so they also need a backport:
>
> JDK-8159335: Fix problems with stack overflow handling
> 33 files changed, 183 insertions(+), 236 deletions(-)
> JDK-8139864: Improve handling of stack protection zones
> 43 files changed, 312 insertions(+), 226 deletions(-)
>
> To satisfy the dependencies of the patches above I also had to backport:
> 8078513: [linux] Clean up code relevant to LinuxThreads implementation
> 8080298: Clean up os::...::supports_variable_stack_size()
> 8037842: Failing to allocate MethodCounters and MDO causes a serious
>          performance drop
> 8059847: complement JDK-8055286 and JDK-8056964 changes
> 8059606: Enable per-method usage of CompileThresholdScaling
>          (per-method compilation thresholds)
> 8074119: [AARCH64] stage repo misses fixes from several Hotspot changes
> 8013393: Merge template interpreter files for x86 _32 and _64
> 8122937: [JEP 245] Validate JVM Command-Line Flag Arguments
> 8078556: Runtime: implement ranges (optionally constraints) for those
>          flags that have them missing
> 8048241: Introduce umbrella header os.inline.hpp and clean up includes
> ^^^ 8139864: Improve handling of stack protection zones
> ^^^ 8159335: Fix problems with stack overflow handling
> 8140520: segfault on solaris-amd64 with "-XX:VMThreadStackSize=1"
>          option
> ^^^ 8169373: Work around linux NPTL stack guard error
> 8049325: Introduce and clean up umbrella headers for the files in the
>          cpu subdirectories
> 8064611: AARCH64: Changes to HotSpot shared code
> 8160189: Fix for 8159335 breaks AArch64
> 8130858: CICompilerCount=1 when tiered is off is not allowed any more
> 8072931: JEP-JDK-8059557: Test task: test framework development
> 228 files changed, 13311 insertions(+), 8831 deletions(-)
>
> That is simply too big a patch to backport into JDK-8. There is some
> room for minor reduction of the whole patch, but it would still be
> around 10k LoC. None of the patches above have been backported to JDK-8.
>
> Unfortunately, due to time constraints I have not yet confirmed that
> these backports really fix this crash. That needs to be tested at a
> customer site, as I do not have a reproducer (and the crash is difficult
> to reproduce even on the target system).
>
> I can publish the whole patchset above but I would need to rebase it first
> from the Zulu-8 derivation to plain OpenJDK-8.
>
>
> Thanks for your opinion,
> Jan Kratochvil
>
>
> ------------------------------------------------------------------------------
>
> The crashing memory access to __resp is in TLS (thread-local storage),
> which points to unmapped memory during very early thread startup, still
> inside glibc:
>
> #1 <signal handler called>
> #2 start_thread (arg=0x7ef47cd160) at /usr/src/debug/glibc/2.23-r0/git/nptl/pthread_create.c:265
> 265 __resp = &pd->res;
> => 0x0000007f836e7f1c <+36>: str x0, [x2, x1]
>
> Program Headers:
> Type  Offset      VirtAddr            PhysAddr            FileSiz   MemSiz    Flg  Align
> LOAD  0x10e75000  0x0000007ef45ce000  0x0000000000000000  0x1ff000  0x1ff000  RW   0x1000
>         0x7ef47cd000=end
>         0x7ef47cd850=accessed memory __resp
> LOAD  0x11074000  0x0000007ef47ce000  0x0000000000000000  0x003000  0x003000       0x1000
>
> A different thread was creating the thread above:
>
> [Current thread is 13 (LWP 834)]
> #5 0x0000007f811b83a4 in os::create_thread (thread=0x7f20010420,
> thr_type=<optimized out>, stack_size=<optimized out>)
> at zulu8-arm64-dev/hotspot/src/os/linux/vm/os_linux.cpp:939
> tid = 545267700064 = 0x7ef47cd160
>
> Someone unmapped the page where the TLS of the new thread is located
> before the thread really started.
>
> Since there is always a 0x3000 gap between the RW mappings, those gaps
> should be the guard pages.
>
> Type  Offset      VirtAddr            PhysAddr            FileSiz   MemSiz    Flg  Align
> LOAD  0x10e75000  0x0000007ef45ce000  0x0000000000000000  0x1ff000  0x1ff000  RW   0x1000
>         0x7ef47cb6c0=$sp of Thread 1 (LWP 1358)
>         0x7ef47cd000=mapped area end
>         0x7ef47cd160=thread tid
>         0x7ef47cd850=accessed memory __resp
> LOAD  0x11074000  0x0000007ef47ce000  0x0000000000000000  0x003000  0x003000       0x1000
> LOAD  0x11077000  0x0000007ef47d1000  0x0000000000000000  0x1fd000  0x1fd000  RW   0x1000
>         0x7ef49cc240=$sp of Thread 37 (LWP 1349)
> LOAD  0x11274000  0x0000007ef49ce000  0x0000000000000000  0x003000  0x003000       0x1000
> LOAD  0x11277000  0x0000007ef49d1000  0x0000000000000000  0x1fd000  0x1fd000  RW   0x1000
>         0x7ef4bc8fa0=$sp of Thread 36 (LWP 1290)
> LOAD  0x11474000  0x0000007ef4bce000  0x0000000000000000  0x000000  0x003000       0x1000
>
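The address arithmetic in the quoted dump can be double-checked with a short script. This is only a minimal sketch in Python: the start addresses, MemSiz values and flags are copied from the program headers above, while the names MAPPINGS, find_mapping and holes are hypothetical helpers introduced here for illustration.

```python
# Mappings copied from the quoted program headers: (start, MemSiz, flags).
# An empty flags string marks a no-access region (the presumed guard pages).
MAPPINGS = [
    (0x7ef45ce000, 0x1ff000, "RW"),
    (0x7ef47ce000, 0x003000, ""),
    (0x7ef47d1000, 0x1fd000, "RW"),
    (0x7ef49ce000, 0x003000, ""),
    (0x7ef49d1000, 0x1fd000, "RW"),
    (0x7ef4bce000, 0x003000, ""),
]

TID = 0x7ef47cd160   # pthread descriptor address ("thread tid" in the dump)
RESP = 0x7ef47cd850  # faulting address (__resp)


def find_mapping(addr):
    """Return the (start, size, flags) tuple containing addr, or None."""
    for start, size, flags in MAPPINGS:
        if start <= addr < start + size:
            return (start, size, flags)
    return None


def holes():
    """Return (start, size) for each unmapped gap between consecutive mappings."""
    gaps = []
    for (start, size, _), (next_start, _, _) in zip(MAPPINGS, MAPPINGS[1:]):
        end = start + size
        if next_start > end:
            gaps.append((end, next_start - end))
    return gaps


for name, addr in (("tid", TID), ("__resp", RESP)):
    print(name, hex(addr), "mapped" if find_mapping(addr) else "NOT mapped")
for start, size in holes():
    print("hole at", hex(start), "size", hex(size))
```

With the numbers from the dump, the only hole between the listed mappings is the single page starting at 0x7ef47cd000, and both the thread descriptor and the __resp address fall inside it.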