<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>I believe what happens is that something has a bug and makes an
invalid call to free(), which causes glibc to call abort(). Since
there is no handler for SIGABRT, nothing generates a core dump and
the process just ends.</p>
<p>This is based on the following Stack Overflow post:<br>
<a class="moz-txt-link-freetext" href="https://stackoverflow.com/a/151568">https://stackoverflow.com/a/151568</a></p>
<pre> As for how to debug it, installing a handler for SIGABRT is probably the best way to proceed. You can set a breakpoint in your handler or deliberately trigger a core dump.</pre>
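<p>A minimal sketch of what that advice could look like in plain C, outside the JVM. The names on_sigabrt, install_abort_handler and abort_in_child are hypothetical and only illustrate the idea. Whether a core file is actually written still depends on the environment (for example the core size limit), and how such a handler would interact with HotSpot's own signal handling is not covered here.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler: runs when abort() raises SIGABRT. */
static void on_sigabrt(int sig) {
    static const char msg[] = "SIGABRT caught; restoring default action\n";
    /* write() is async-signal-safe, unlike printf(). */
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    /* A debugger breakpoint can be set on this function. */
    /* Restore the default action and re-raise, so the process still
       terminates with SIGABRT and the kernel can write a core dump. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int install_abort_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigabrt;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGABRT, &sa, NULL);
}

/* Demo helper: forks a child that installs the handler and aborts.
   Returns the signal that terminated the child, or -1 otherwise. */
int abort_in_child(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        install_abort_handler();
        abort(); /* raises SIGABRT; does not return */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

<p>With this sketch, abort_in_child() should report SIGABRT in the parent, meaning the child still dies from the signal after the handler has run.</p>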
<p>Álvaro</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 16/2/26 11:41, David Holmes wrote:<br>
</div>
<blockquote type="cite"
cite="mid:c2d957f0-c062-456a-8433-372c630a5d94@oracle.com">On
13/02/2026 7:25 pm, Álvaro Torres Cogollo wrote:
<br>
<blockquote type="cite">Hi,
<br>
<br>
In my opinion, it's fair to assume that other libraries
shouldn't call abort() if they actively don't want it to
generate a core dump. At least in the context of a Spring Boot
server, I can't think of a valid reason to call abort() from a
library and not expect a core dump.
<br>
</blockquote>
<br>
My query is: how is it calling abort but not getting a coredump?
<br>
<br>
David
<br>
<blockquote type="cite">However, I understand the concern about
handling SIGABRT signals in hosting environments. I'm also
missing a lot of context on the implications of this. Maybe it's
enough to add a flag like -XX:+CreateCoreDumpOnAbort,
-XX:+HandleAbort or -XX:+CrashOnAbort.
That could be a best-practice configuration for now in certain
contexts (Spring Boot), and we could eventually consider making
it the default behaviour.
<br>
<br>
Regards,
<br>
<br>
Álvaro
<br>
<br>
<br>
On 13/2/26 08:07, David Holmes wrote:
<br>
<blockquote type="cite">Hi,
<br>
<br>
On 13/02/2026 3:16 am, Álvaro Torres Cogollo wrote:
<br>
<blockquote type="cite">Hi again,
<br>
<br>
I just realized that I made a typo in the reproduction
repository link. This is the right one:
<br>
<br>
<a class="moz-txt-link-freetext" href="https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug">https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug</a>
<br>
<br>
Sorry about that.
<br>
<br>
Álvaro
<br>
<br>
<br>
On 12/2/26 18:04, Álvaro Torres Cogollo wrote:
<br>
<blockquote type="cite">Hi,
<br>
<br>
We've been hitting a problem in production that I think
might be a bug in hotspot's signal handling. Let me know
if this should go somewhere else.
<br>
</blockquote>
</blockquote>
<br>
This is the right place (hotspot-runtime-dev would also have
done, but it has a narrower audience).
<br>
<br>
Not sure it is a bug as such. I'm missing a piece of the
puzzle here. These other libraries are presumably calling
abort() to raise the SIGABRT but there is no coredump. Yet if
the VM calls abort() there is a coredump. I'm not seeing why
there would be different behaviour.
<br>
<br>
Catching SIGABRT in the VM then re-calling abort() may fix
your issue, but I'm not sure if it could introduce problems
for hosting environments which may already catch SIGABRT
themselves.
<br>
<br>
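<p>One way to reduce that risk, sketched here in plain C as an assumption and not as anything HotSpot currently does: query the current SIGABRT disposition first and install a handler only if nobody has claimed the signal. The names install_sigabrt_if_unclaimed and demo_sigabrt_handler are hypothetical.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical placeholder handler; real code would emit diagnostics. */
void demo_sigabrt_handler(int sig) {
    (void)sig;
}

/* Installs `handler` for SIGABRT only when the signal still has its
   default disposition. Returns 1 if installed, 0 if an existing
   handler (or SIG_IGN) was found and left untouched, -1 on error. */
int install_sigabrt_if_unclaimed(void (*handler)(int)) {
    struct sigaction current;
    if (sigaction(SIGABRT, NULL, &current) != 0) {
        return -1;
    }
    /* An SA_SIGINFO handler is stored in sa_sigaction, so check the
       flag before comparing sa_handler against SIG_DFL. */
    if ((current.sa_flags & SA_SIGINFO) || current.sa_handler != SIG_DFL) {
        return 0;
    }
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGABRT, &sa, NULL) == 0 ? 1 : -1;
}
```

<p>A hosting environment that installs its own SIGABRT handler before this function is called would keep it. A host that installs its handler afterwards would simply replace this one, so the ordering problem is only partly addressed.</p>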
Need to hear what others think about this.
<br>
<br>
Cheers,
<br>
David
<br>
-----
<br>
<br>
<blockquote type="cite">
<blockquote type="cite">The issue is that when a native
library crashes due to memory corruption (like an invalid
free() call), the JVM exits immediately without generating
any core dump or error report, even though we have
-XX:+CreateCoredumpOnCrash enabled.
<br>
<br>
Here's what we're seeing when it crashes:
<br>
munmap_chunk(): invalid pointer
<br>
<br>
Or when using tcmalloc:
<br>
src/tcmalloc.cc:333] Attempt to free invalid pointer
0xffff38000b60
<br>
<br>
We're running with:
<br>
JAVA_TOOL_OPTIONS=-XX:+CreateCoredumpOnCrash
-XX:ErrorFile=/core-dumps/hs_err_pid%p.log
<br>
<br>
But when these crashes happen, we get nothing - just the
error message above and the process dies. This makes
debugging really difficult, especially since the crashes
happen randomly in production.
<br>
<br>
After digging through the hotspot source, I noticed that
signal handlers are installed for SIGSEGV, SIGBUS, SIGFPE,
etc., but not for SIGABRT:
<br>
<br>
<a class="moz-txt-link-freetext" href="https://github.com/openjdk/jdk/blob/37dc1be67d4c15a040dc99dbc105c3269c65063d/src/hotspot/os/posix/signals_posix.cpp#L1352-L1358">https://github.com/openjdk/jdk/blob/37dc1be67d4c15a040dc99dbc105c3269c65063d/src/hotspot/os/posix/signals_posix.cpp#L1352-L1358</a>
<br>
<br>
When glibc detects the memory corruption, it calls abort()
which raises SIGABRT. Since there's no handler for it, the
JVM can't catch it and generate the diagnostics.
<br>
<br>
To demonstrate the issue, I put together a small
reproduction case:
<br>
<br>
<a class="moz-txt-link-freetext" href="https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-handling">https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-handling</a>
<br>
<br>
The repo has a Spring Boot app with three endpoints that
show the problem:
<br>
<br>
1. /crash/unsafe - Uses Java Unsafe to write to address 0
<br>
Result: SIGSEGV -> Works correctly, generates hs_err
file
<br>
<br>
2. /crash/null - JNI code that dereferences a null pointer
<br>
Result: SIGSEGV -> Works correctly, generates hs_err
file
<br>
<br>
3. /crash/free - JNI code that calls free() on a stack
variable
<br>
Result: SIGABRT -> BROKEN, just prints
"munmap_chunk(): invalid pointer" and dies
<br>
<br>
You can reproduce it with:
<br>
docker-compose up -d
<br>
curl localhost:8080/crash/free
<br>
docker-compose logs
<br>
<br>
And you'll see it just prints the error and exits, no
hs_err file gets created.
<br>
<br>
I also tested a potential fix by adding SIGABRT handling
to hotspot. With that change, scenario 3 correctly
generates an hs_err file and core dump. The patch
basically:
<br>
<br>
<a class="moz-txt-link-freetext" href="https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug/blob/main/jdk17.patch">https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug/blob/main/jdk17.patch</a>
<br>
<br>
- Adds set_signal_handler(SIGABRT) in signals_posix.cpp
<br>
- Resets SIGABRT to SIG_DFL before calling abort() in
os_posix.cpp to avoid recursive handling
<br>
<br>
After applying it, the /crash/free endpoint generates
proper diagnostics:
<br>
# SIGABRT (0x6) at pc=0x0000ffffbd177608 (sent by
kill), pid=1, tid=41
<br>
# Problematic frame:
<br>
# C [libc.so.6+0x87608]
<br>
# Core dump will be written. Default location: //core
<br>
# An error report file with more information is saved
as:
<br>
# /core-dumps/java_error1.log
<br>
<br>
I'm not sure if there's a specific reason why SIGABRT
isn't handled currently. If there is, are there any
alternative approaches to capture diagnostics when native
libraries trigger abort()? For us and probably others
dealing with native library bugs in production, having
some way to get these diagnostics would be really
valuable.
<br>
<br>
Thanks,
<br>
<br>
Álvaro
<br>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</body>
</html>