SIGABRT signals don't create core dumps
Álvaro Torres Cogollo
atorrescogollo at gmail.com
Tue Feb 17 19:07:38 UTC 2026
I've been doing some tests. Using init does indeed have an effect on the exit code:
services:
  crash-test:
    build: .
    init: true  # the setting being tested
    environment:
      - JAVA_TOOL_OPTIONS=-XX:+CreateCoredumpOnCrash -XX:ErrorFile=/core-dumps/hs_err_pid%p.log
    ports:
      - "8080:8080"
    volumes:
      - ./core-dumps:/core-dumps
With init=true, the container ends with exit code 134 (SIGABRT). With
init=false, it ends with exit code 133 (SIGTRAP).
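For reference, both codes follow the usual convention of 128 plus the signal number, so 134 is signal 6 and 133 is signal 5. A minimal sketch of that decoding, assuming a POSIX shell on Linux; signal_from_exit_code is a hypothetical helper name, not something from the PoC repository:

```shell
#!/bin/sh
# Hypothetical helper: map a shell/container exit code to a signal name.
signal_from_exit_code() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # Exit codes above 128 conventionally mean "killed by signal (code - 128)".
        # `kill -l N` prints the signal name without the SIG prefix.
        printf 'SIG%s\n' "$(kill -l "$((code - 128))")"
    else
        printf 'not a signal exit\n'
    fi
}

signal_from_exit_code 134
signal_from_exit_code 133
```

On Linux this should print SIGABRT and then SIGTRAP, matching the two cases above.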
However, I still don't get a core dump in either case.
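One thing that can be ruled out independently of the signal handling is the core-dump configuration seen from inside the container. The sketch below is only a diagnostic aid under assumptions: a Linux container with /proc mounted and a shell whose ulimit supports -c (bash, dash and busybox do). core_dump_settings is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the two settings that most commonly
# suppress core files on Linux.
core_dump_settings() {
    # Soft RLIMIT_CORE for this shell and its children.
    # A value of 0 suppresses core files that are written to a path.
    printf 'core size limit: %s\n' "$(ulimit -c)"
    # Kernel core pattern: either a path template or a "|program" pipe.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        printf 'core_pattern: %s\n' "$(cat /proc/sys/kernel/core_pattern)"
    else
        printf 'core_pattern: unavailable\n'
    fi
}

core_dump_settings
```

As far as I know, core_pattern is a kernel-wide setting shared with the host, so a pattern starting with "|" hands the dump to a program on the host rather than writing a file inside the container. Running this inside the crash-test container (for example through docker-compose exec) would show whether that is the case here.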
Additionally, I think the equivalent for Kubernetes would be setting
shareProcessNamespace=true in the podSpec. That would force me to introduce
a potential security risk just to get core dumps. In some contexts you
simply can't do that because of security policies. So even if that turned
out to be the fix, it would be very annoying if the only way to get core
dumps was to use init/shareProcessNamespace.
Álvaro
On 17/2/26 03:21, David Holmes wrote:
> Perhaps you are running into this docker issue:
>
> https://ddanilov.me/how-signals-are-handled-in-a-docker-container
>
> David
> -----
>
> On 17/02/2026 12:16 pm, David Holmes wrote:
>> On 17/02/2026 10:30 am, Álvaro Torres Cogollo wrote:
>>> >why is the call to abort() not triggering a coredump?
>>>
>>> I guess you mean that the JVM generates coredumps by calling abort()
>>> as I read in:
>>> https://github.com/openjdk/jdk/blob/jdk-25%2B36/src/hotspot/os/posix/os_posix.cpp#L2091
>>>
>>> But glibc calls abort() and apparently doesn't generate the coredump.
>>>
>>> This is beyond my knowledge of the topic. The only thing I can try
>>> to help with is that I did a quick test you can see at:
>>> https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug/commit/891969416579c7d6a8df6f3b10007a7c78f8ae61
>>>
>>> The relevant code:
>>>
>>> // src/main/kotlin/com/example/demo/controller/CrashController.kt
>>> ...
>>> @RequestMapping("/crash")
>>> class CrashController {
>>>     @GetMapping("/abort")
>>>     fun crashWithAbort(): String {
>>>         return NativeCrasher.crashWithAbort()
>>>     }
>>> ...
>>>
>>>
>>> // src/main/kotlin/com/example/demo/native/NativeCrasher.kt
>>> ...
>>> object NativeCrasher {
>>>     init {
>>>         LibraryLoader.load()
>>>     }
>>>     external fun crashWithAbort(): String
>>> ...
>>>
>>>
>>> // src/main/c/native_crasher.c
>>> ...
>>> JNIEXPORT void JNICALL
>>> Java_com_example_demo_native_NativeCrasher_crashWithAbort
>>> (JNIEnv *env, jobject obj)
>>> {
>>>     abort();
>>> }
>>> ...
>>>
>>> If I hit the endpoint with JDK25:
>>>
>>> curl localhost:8080/crash/abort
>>>
>>> I don't get any coredump from that but only this exit code:
>>>
>>> exited with code 133
>>
>> That indicates the process terminated via SIGTRAP not SIGABRT.
>>> The PoC repository uses docker and everything is pretty standard apart
>>
>> I have a suspicion that it is the Docker environment that is causing
>> the problem.
>>
>> David
>> -----
>>
>>> from the optional patched compilation of the JVM to register the
>>> SIGABRT handler:
>>> https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug/blob/891969416579c7d6a8df6f3b10007a7c78f8ae61/Dockerfile
>>>
>>> FROM amazoncorretto:25 AS amazoncorretto-25
>>> #FROM amazoncorretto-25-patched AS amazoncorretto-25 # Use this instead to use the patched JVM
>>>
>>> FROM amazoncorretto-25
>>>
>>> WORKDIR /app
>>>
>>> # Copy built JAR from builder
>>> COPY --from=builder /build/build/libs/*.jar app.jar
>>>
>>> # Expose port
>>> EXPOSE 8080
>>>
>>> # Run application
>>> ENTRYPOINT ["java", "-jar", "app.jar"]
>>>
>>>
>>> Álvaro
>>>
>>>
>>> On 16/2/26 22:33, David Holmes wrote:
>>>> On 16/02/2026 9:27 pm, Álvaro Torres Cogollo wrote:
>>>>> I believe what happens is that something has a bug and makes an
>>>>> invalid call to free(), which makes glibc call abort(). And since
>>>>> there is no handler for that, nothing generates a core dump and it
>>>>> just ends.
>>>>>
>>>>> Based on this stackoverflow post:
>>>>> https://stackoverflow.com/a/151568
>>>>>
>>>>> As for how to debug it, installing a handler for SIGABRT is
>>>>> probably the best way to proceed. You can set a breakpoint in your
>>>>> handler or deliberately trigger a core dump.
>>>>
>>>> I think you are missing my point. I get that glibc calls abort()
>>>> but that in itself should trigger a coredump. You don't have to
>>>> install a handler for SIGABRT for abort() to create a coredump.
>>>>
>>>> So my question remains: why is the call to abort() not triggering a
>>>> coredump?
>>>>
>>>> I wonder if glibc doesn't actually call abort() but just raises
>>>> SIGABRT directly? And if so why? It sounds like you can control
>>>> what glibc does for these kinds of errors so perhaps you need to be
>>>> telling glibc to do something different?
>>>>
>>>> David
>>>> ------
>>>>> Álvaro
>>>>>
>>>>>
>>>>> On 16/2/26 11:41, David Holmes wrote:
>>>>>> On 13/02/2026 7:25 pm, Álvaro Torres Cogollo wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> In my opinion, it's fair to assume that other libraries
>>>>>>> shouldn't call abort() if they actively don't want it to
>>>>>>> generate a core dump. At least in the context of a Spring Boot
>>>>>>> server, I can't think of a valid reason to call abort() from a
>>>>>>> library and not expect a core dump.
>>>>>>
>>>>>> My query is: how is it calling abort but not getting a coredump?
>>>>>>
>>>>>> David
>>>>>>> However, I understand the concern about handling SIGABRT signals
>>>>>>> in hosting environments. I'm also missing a huge context on the
>>>>>>> implications of this. Maybe it's enough to create a flag
>>>>>>> like -XX:+CreateCoreDumpOnAbort, -XX:+HandleAbort
>>>>>>> or -XX:+CrashOnAbort. That could be a best-practice
>>>>>>> configuration so far in certain contexts (Spring Boot) and
>>>>>>> eventually consider making this the default behaviour.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Álvaro
>>>>>>>
>>>>>>>
>>>>>>> On 13/2/26 08:07, David Holmes wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 13/02/2026 3:16 am, Álvaro Torres Cogollo wrote:
>>>>>>>>> Hi again,
>>>>>>>>>
>>>>>>>>> I just realized that I made a typo in the reproduction
>>>>>>>>> repository link. This is the right one:
>>>>>>>>>
>>>>>>>>> https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug
>>>>>>>>>
>>>>>>>>> Sorry about that.
>>>>>>>>>
>>>>>>>>> Álvaro
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/2/26 18:04, Álvaro Torres Cogollo wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> We've been hitting a problem in production that I think might
>>>>>>>>>> be a bug in hotspot's signal handling. Let me know if this
>>>>>>>>>> should go somewhere else.
>>>>>>>>
>>>>>>>> This is the right place (hotspot-runtime-dev would also have
>>>>>>>> done but a narrower audience).
>>>>>>>>
>>>>>>>> Not sure it is a bug as such. I'm missing a piece of the puzzle
>>>>>>>> here. These other libraries are presumably calling abort() to
>>>>>>>> raise the SIGABRT but there is no coredump. Yet if the VM calls
>>>>>>>> abort() there is a coredump. I'm not seeing why there would be
>>>>>>>> different behaviour.
>>>>>>>>
>>>>>>>> Catching SIGABRT in the VM then re-calling abort() may fix your
>>>>>>>> issue, but I'm not sure if it could introduce problems for
>>>>>>>> hosting environments which may already catch SIGABRT themselves.
>>>>>>>>
>>>>>>>> Need to hear what others think about this.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>>> The issue is that when a native library crashes due to memory
>>>>>>>>>> corruption (like an invalid free() call), the JVM exits
>>>>>>>>>> immediately without generating any core dump or error report,
>>>>>>>>>> even though we have -XX:+CreateCoredumpOnCrash enabled.
>>>>>>>>>>
>>>>>>>>>> Here's what we're seeing when it crashes:
>>>>>>>>>> munmap_chunk(): invalid pointer
>>>>>>>>>>
>>>>>>>>>> Or when using tcmalloc:
>>>>>>>>>> src/tcmalloc.cc:333] Attempt to free invalid pointer 0xffff38000b60
>>>>>>>>>>
>>>>>>>>>> We're running with:
>>>>>>>>>> JAVA_TOOL_OPTIONS=-XX:+CreateCoredumpOnCrash -XX:ErrorFile=/core-dumps/hs_err_pid%p.log
>>>>>>>>>>
>>>>>>>>>> But when these crashes happen, we get nothing - just the
>>>>>>>>>> error message above and the process dies. This makes
>>>>>>>>>> debugging really difficult, especially since the crashes
>>>>>>>>>> happen randomly in production.
>>>>>>>>>>
>>>>>>>>>> After digging through the hotspot source, I noticed that
>>>>>>>>>> signal handlers are installed for SIGSEGV, SIGBUS, SIGFPE,
>>>>>>>>>> etc., but not for SIGABRT:
>>>>>>>>>>
>>>>>>>>>> https://github.com/openjdk/jdk/blob/37dc1be67d4c15a040dc99dbc105c3269c65063d/src/hotspot/os/posix/signals_posix.cpp#L1352-L1358
>>>>>>>>>>
>>>>>>>>>> When glibc detects the memory corruption, it calls abort()
>>>>>>>>>> which raises SIGABRT. Since there's no handler for it, the
>>>>>>>>>> JVM can't catch it and generate the diagnostics.
>>>>>>>>>>
>>>>>>>>>> To demonstrate the issue, I put together a small reproduction
>>>>>>>>>> case:
>>>>>>>>>>
>>>>>>>>>> https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-handling
>>>>>>>>>>
>>>>>>>>>> The repo has a Spring Boot app with three endpoints that show
>>>>>>>>>> the problem:
>>>>>>>>>>
>>>>>>>>>> 1. /crash/unsafe - Uses Java Unsafe to write to address 0
>>>>>>>>>> Result: SIGSEGV -> Works correctly, generates hs_err file
>>>>>>>>>>
>>>>>>>>>> 2. /crash/null - JNI code that dereferences a null pointer
>>>>>>>>>> Result: SIGSEGV -> Works correctly, generates hs_err file
>>>>>>>>>>
>>>>>>>>>> 3. /crash/free - JNI code that calls free() on a stack variable
>>>>>>>>>> Result: SIGABRT -> BROKEN, just prints "munmap_chunk():
>>>>>>>>>> invalid pointer" and dies
>>>>>>>>>>
>>>>>>>>>> You can reproduce it with:
>>>>>>>>>> docker-compose up -d
>>>>>>>>>> curl localhost:8080/crash/free
>>>>>>>>>> docker-compose logs
>>>>>>>>>>
>>>>>>>>>> And you'll see it just prints the error and exits, no hs_err
>>>>>>>>>> file gets created.
>>>>>>>>>>
>>>>>>>>>> I also tested a potential fix by adding SIGABRT handling to
>>>>>>>>>> hotspot. With that change, scenario 3 correctly generates an
>>>>>>>>>> hs_err file and core dump. The patch basically:
>>>>>>>>>>
>>>>>>>>>> https://github.com/atorrescogollo/poc-jdk-sigabrt-coredump-bug/blob/main/jdk17.patch
>>>>>>>>>>
>>>>>>>>>> - Adds set_signal_handler(SIGABRT) in signals_posix.cpp
>>>>>>>>>> - Resets SIGABRT to SIG_DFL before calling abort() in
>>>>>>>>>> os_posix.cpp to avoid recursive handling
>>>>>>>>>>
>>>>>>>>>> After applying it, the /crash/free endpoint generates proper
>>>>>>>>>> diagnostics:
>>>>>>>>>> # SIGABRT (0x6) at pc=0x0000ffffbd177608 (sent by kill), pid=1, tid=41
>>>>>>>>>> # Problematic frame:
>>>>>>>>>> # C [libc.so.6+0x87608]
>>>>>>>>>> # Core dump will be written. Default location: //core
>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>> # /core-dumps/java_error1.log
>>>>>>>>>>
>>>>>>>>>> I'm not sure if there's a specific reason why SIGABRT isn't
>>>>>>>>>> handled currently. If there is, are there any alternative
>>>>>>>>>> approaches to capture diagnostics when native libraries
>>>>>>>>>> trigger abort()? For us and probably others dealing with
>>>>>>>>>> native library bugs in production, having some way to get
>>>>>>>>>> these diagnostics would be really valuable.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Álvaro
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>> ---
>>>>
>>
>