JVM crash - fatal error: Mark stack space exhausted.

Stefan Karlsson stefan.karlsson at oracle.com
Thu Jan 9 11:20:14 UTC 2025


Hi Abdulhakim Unlu,

Your mail got caught in the mailing list filter because the attached 
file was too large.

On 2024-12-27 22:02, Abdülhakim Ünlü wrote:
> Hi,
> I am not sure if this is the right place to post this issue.

It is. Thanks for reporting this!

> We are testing out ZGC in a Xmx975G JVM. It crashed with the attached 
> hs_err file.
>
>     #  Internal Error (zMarkStackAllocator.cpp:81), pid=34334, tid=38183
>     #  fatal error: Mark stack space exhausted. Use
>     -XX:ZMarkStackSpaceLimit=<size> to increase the maximum number of
>     bytes allocated for mark stacks. Current limit is 8192M.
>
>
> How can I find the right size for ZMarkStackSpaceLimit, so that jvm 
> does not crash?
>
> Can ZGC handle this case in a more graceful way ? Can ZGC 
> adjust ZMarkStackSpaceLimit dynamically ? I mean, in a production 
> environment, we cannot keep crashing JVM until we find the right 
> ZMarkStackSpaceLimit value.

I've taken a look at the provided hs_err file. There are a few 
interesting things.

1) As you show, we run out of mark stack space. This is quite 
unexpected. We have seen similar issues in degenerate test-cases, for 
example, with extremely long doubly linked lists. We have also seen 
things like this when we introduced bugs during the development of the GC.
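
To illustrate the shape of object graph meant by "extremely long doubly 
linked lists", here is a minimal sketch. It is not one of our actual 
test cases: the class name, the node layout, and the default length are 
all made up for illustration, and I have not verified that it reproduces 
the mark stack error on any particular JDK build or heap size.

```java
// Hypothetical illustration only; not an actual ZGC test case.
final class LongDoublyLinkedList {
    // A node that points both forward and backward, so every object
    // in the list is reachable from its neighbours in both directions.
    static final class Node {
        Node prev;
        Node next;
        final long id;

        Node(long id) {
            this.id = id;
        }
    }

    // Builds a list with the given number of nodes and returns its head.
    static Node build(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        Node head = new Node(0);
        Node tail = head;
        for (int i = 1; i < length; i++) {
            Node node = new Node(i);
            node.prev = tail;
            tail.next = node;
            tail = node;
        }
        return head;
    }

    // Walks the list from the head and counts the nodes.
    static int countForward(Node head) {
        int count = 0;
        for (Node node = head; node != null; node = node.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The default length is an arbitrary assumption; pass a larger
        // value as the first argument to experiment.
        int length = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        Node head = build(length);
        // Request a collection while the whole list is still reachable.
        // System.gc() is only a hint to the JVM.
        System.gc();
        System.out.println("nodes reachable: " + countForward(head));
    }
}
```

Run it with ZGC selected (for example with -XX:+UseZGC) if you want the 
list to be marked by ZGC; the list stays reachable through the local 
variable until the final count is printed.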

We've talked this through in the team and I think we could rewrite the 
code to remove the need to set this limit. I've created an RFE for this:
https://bugs.openjdk.org/browse/JDK-8347335

However, this will not solve the fact that the mark stacks are extremely 
large in your use-case. The next point *might* explain the reason why 
the mark stacks are filled up.

2) You are running with String Deduplication turned on. It turns out 
that Generational ZGC tries to deduplicate every single String that the 
GC finds to be live in the system, whereas the other GCs skip 
short-lived strings. This, together with String Deduplication's use of 
weak "handles" and the fact that ZGC treats these handles as strong 
roots in the young generation, causes otherwise short-lived objects to 
get promoted to the old generation. This fills up the old generation, 
leading to very long old generation collections. I see in your logs that 
you had a 10-minute-long major collection before this failure. Maybe 
this is related to why the mark stack space is filling up.

I've created a bug to fix the String Deduplication issue:
https://bugs.openjdk.org/browse/JDK-8347337

Do you have a great need for the String Deduplication feature? 
Otherwise, I would suggest that you turn it off when running with ZGC.
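
String Deduplication is controlled by the UseStringDeduplication flag, 
so it can be switched off on the command line with 
-XX:-UseStringDeduplication. To confirm what a running JVM actually 
uses, a small check along the following lines might help. This is a 
sketch, assuming a HotSpot JVM that exposes HotSpotDiagnosticMXBean; the 
class and method names (GcFlagCheck, flagValue) are hypothetical, and a 
flag that a given JDK build does not know about is reported as not 
available rather than treated as an error.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

// Hypothetical helper; not part of the JDK.
final class GcFlagCheck {
    // Returns the current value of a VM flag, or empty if the flag
    // (or the HotSpot diagnostic bean) is not available in this JVM.
    static Optional<String> flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (RuntimeException e) {
            // getVMOption throws IllegalArgumentException for unknown flags.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] flags = {"UseZGC", "UseStringDeduplication", "ZMarkStackSpaceLimit"};
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag).orElse("<not available>"));
        }
    }
}
```

Running this with the same options as the application shows whether 
deduplication is still enabled and which mark stack limit is in effect.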

Thanks,
StefanK

>
> thanks,
> Abdulhakim Unlu