From stefan.karlsson at oracle.com Thu Jan 9 11:20:14 2025
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 9 Jan 2025 12:20:14 +0100
Subject: JVM crash - fatal error: Mark stack space exhausted.
In-Reply-To:
References:
Message-ID: <4e61e6c6-fd21-445a-860b-82840d6d644c@oracle.com>

Hi Abdulhakim Unlu,

Your mail got caught in the mailing list filter because the attached
file was too large.

On 2024-12-27 22:02, Abdulhakim Unlu wrote:
> Hi,
> I am not sure if this is the right place to post this issue.

It is. Thanks for reporting this!

> We are testing out ZGC in a Xmx975G JVM. It crashed with the attached
> hs_err file.
>
> # Internal Error (zMarkStackAllocator.cpp:81), pid=34334, tid=38183
> # fatal error: Mark stack space exhausted. Use
> -XX:ZMarkStackSpaceLimit= to increase the maximum number of
> bytes allocated for mark stacks. Current limit is 8192M.
>
>
> How can I find the right size for ZMarkStackSpaceLimit, so that jvm
> does not crash?
>
> Can ZGC handle this case in a more graceful way? Can ZGC
> adjust ZMarkStackSpaceLimit dynamically? I mean, in a production
> environment, we cannot keep crashing JVM until we find the right
> ZMarkStackSpaceLimit value.

I've taken a look at the provided hs_err file. There are a few
interesting things.

1) As you show, we run out of mark stack space. This is quite
unexpected. We have seen similar issues in degenerate test cases, for
example, with extremely long doubly linked lists. We have also seen
things like this when we introduced bugs during the development of the
GC.

We've talked this through in the team and I think we could rewrite the
code to remove the need to set this limit. I've created an RFE for
this:
https://bugs.openjdk.org/browse/JDK-8347335

However, this will not solve the fact that the mark stacks are
extremely large in your use case. The next point *might* explain why
the mark stacks are filling up.

2) You are running with String Deduplication turned on.
It turns out that Generational ZGC tries to deduplicate every single
String that the GC finds to be live in the system, whereas the other
GCs skip short-lived strings. This, together with String
Deduplication's use of weak "handles" and the fact that ZGC treats
these handles as strong roots in the young generation, causes otherwise
short-lived objects to get promoted to the old generation. This fills
up the old generation, leading to very long old generation collections.
I see in your logs that you had a 10m long major collection before this
failure. Maybe this is related to why the mark stack space is filling
up.

I've created a bug to fix the String Deduplication issue:
https://bugs.openjdk.org/browse/JDK-8347337

Do you have a great need for the String Deduplication feature?
Otherwise, I would suggest that you turn it off when running with ZGC.

Thanks,
StefanK

>
> thanks,
> Abdulhakim Unlu
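
To check which of the flags discussed in this thread are actually in
effect in a running JVM, something like the following could be used.
This is only a minimal sketch and not part of the original report: the
class name ZgcFlagCheck and the helper readFlag are made up for
illustration, and it assumes a HotSpot JVM that exposes
com.sun.management.HotSpotDiagnosticMXBean. A flag that does not exist
in a given JDK build is reported as unavailable rather than treated as
an error.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

// Hypothetical helper, for illustration only.
public class ZgcFlagCheck {

    // Returns the current value of a HotSpot flag, or empty if this JVM
    // does not know the flag or does not expose the diagnostic MXBean.
    static Optional<String> readFlag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            // getVMOption throws IllegalArgumentException for unknown flags.
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] flags = {"UseZGC", "UseStringDeduplication", "ZMarkStackSpaceLimit"};
        for (String flag : flags) {
            System.out.println(flag + " = "
                + readFlag(flag).orElse("(not available in this JVM)"));
        }
    }
}
```

Running this with the same options as the application (for example
with -XX:+UseZGC and -XX:-UseStringDeduplication added to the command
line) should show whether String Deduplication really is off.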