RFR: 6203: Add ZGC allocation stall rule [v3]
Marcus Hirt
hirt at openjdk.org
Wed Aug 27 20:17:56 UTC 2025
On Sat, 19 Jul 2025 08:12:25 GMT, Suchita Chaturvedi <schaturvedi at openjdk.org> wrote:
>> This enhancement adds a new rule for ZGC Allocation Stall events.
>>
>> The default configuration:
>>
>> <img width="946" alt="image" src="https://github.com/user-attachments/assets/0d39ae26-fdc4-49e3-a0ed-fb7f7da8f709" />
>>
>> Here are a few screenshots for reference:
>>
>> <img width="344" alt="image" src="https://github.com/user-attachments/assets/e7efe1e2-d6b3-4a05-8ea6-1bf2e8b8c15f" />
>>
>> <img width="353" alt="image" src="https://github.com/user-attachments/assets/a3d3862f-96d2-4292-947f-4562c7f9f3d3" />
>>
>> <img width="344" alt="image" src="https://github.com/user-attachments/assets/9616eea6-bce8-4395-a846-db343fe349f2" />
>>
>> <img width="341" alt="image" src="https://github.com/user-attachments/assets/376a100e-ac05-48fe-a9c3-754923c3d79e" />
>>
>> <img width="353" alt="image" src="https://github.com/user-attachments/assets/cfa7c5f1-08f0-44b4-a49d-28504439a631" />
>>
>> Ignored
>>
>> <img width="352" alt="image" src="https://github.com/user-attachments/assets/3c00c22e-e64d-46b0-8b64-ed653ed1fc4b" />
>>
>> If we change the default configuration as below:
>>
>> <img width="367" alt="image" src="https://github.com/user-attachments/assets/8da64985-8446-44b8-bfb4-696a726b5255" />
>>
>> <img width="350" alt="image" src="https://github.com/user-attachments/assets/ad89f02e-39b4-4df4-aae7-43cec9cc9a91" />
>>
>> <img width="347" alt="image" src="https://github.com/user-attachments/assets/b3ec1081-26dd-4391-9e0f-0b4c64a7e3c1" />
>>
>> <img width="344" alt="image" src="https://github.com/user-attachments/assets/106a7c3e-aaaf-495a-8c6f-281c96371e6e" />
>
> Suchita Chaturvedi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
>
> - Resolving merge conflict
> - Updated the rule as per new metric allocation stall rate
> - 6203: Add ZGC allocation stall rule
Changes requested by hirt (Lead).
core/org.openjdk.jmc.flightrecorder.rules.jdk/src/main/java/org/openjdk/jmc/flightrecorder/rules/jdk/memory/ZGCAllocationStallRule.java line 2:
> 1: /*
> 2: * Copyright (c) 2020, 2025, Oracle and/or its affiliates. All rights reserved.
New file, so the copyright year should be 2025 only.
core/org.openjdk.jmc.flightrecorder.rules.jdk/src/main/java/org/openjdk/jmc/flightrecorder/rules/jdk/memory/ZGCAllocationStallRule.java line 139:
> 137:
> 138: //Calculate time after JVM Start
> 139: IQuantity timeAfterJVMStart = RulesToolkit.getEarliestStartTime(items)
Come to think of it - this will not work well with OldObjectSample events. They are a bit special and may be timestamped far earlier than the current recording period. I think we might want to add a RulesToolkit function that explicitly avoids the OldObjectSample events for this...
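A minimal sketch of the idea, using plain Java types as stand-ins for JMC's item collections. `EarliestStartSketch`, `Event`, and `earliestStartExcludingOldObjectSample` are hypothetical names for illustration only, not existing RulesToolkit API; the only thing taken from JFR is the `jdk.OldObjectSample` type ID.

```java
import java.util.List;
import java.util.OptionalLong;

final class EarliestStartSketch {
    // JFR type ID of the events that may be timestamped before the recording period.
    static final String OLD_OBJECT_SAMPLE = "jdk.OldObjectSample";

    // Hypothetical stand-in for a recorded event: a type ID plus a start time in epoch millis.
    static final class Event {
        final String typeId;
        final long startMillis;

        Event(String typeId, long startMillis) {
            this.typeId = typeId;
            this.startMillis = startMillis;
        }
    }

    // Returns the earliest start time among all events that are not OldObjectSample
    // events, or an empty OptionalLong if no such event exists.
    static OptionalLong earliestStartExcludingOldObjectSample(List<Event> events) {
        return events.stream()
                .filter(e -> !OLD_OBJECT_SAMPLE.equals(e.typeId))
                .mapToLong(e -> e.startMillis)
                .min();
    }
}
```

In the real rule the same filtering would presumably be done on the item collection before the earliest start time is computed, so that an old sample cannot stretch the measured period.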
core/org.openjdk.jmc.flightrecorder.rules.jdk/src/main/resources/org/openjdk/jmc/flightrecorder/rules/jdk/messages/internal/messages.properties line 756:
> 754: ZGCAllocationStall_RULE_NAME=ZGC Allocation Stall Rate
> 755: ZgcAllocationStall_TEXT_INFO=In ZGC, a type of concurrent Garbage Collection (GC) algorithm, GC threads run concurrently with application threads, resulting in minimal stop-the-world pauses. However application threads can overrun the GC threads, allocating objects faster than GC threads can reclaim memory. in such cases, the JVM temporarily stops allocating application threads. This is called "Allocation Stall".\n Allocation Stalls occurs due to the following reasons:\n 1. High Object Allocation Rate: if your application creates objects at a very high rate, it can overwhelm the GC's ability to reclaim memory quickly enough, leading to stalls\n 2. Java Heap size is not sufficient: Having more free room in the heap can give more time for GC threads to perform their GC cycle and reclaim memory to satisfy new allocations happening concurrently.\n 3. Insufficient resources available for the GC threads: number of GC threads or the CPU allocated for them is not enough to finish the GC cycle and reclaim the memory compared to allocations done by application threads.\n
> 756: ZgcAllocationStall_TEXT_WARN=\nThere are {zgcAllocationStallCount} occurrence of Allocation Stall Events. Total time spent in waiting for memory to become available is {zgcAllocationStallTotalDuration} and the maximum duration is {zgcAllocationStallLongestDuration}. The rate of allocation stall per minute is {zgcAllocationStallPerMinute}.
Perhaps simplify this - "There are {zgcAllocationStallCount} Allocation Stall events"
Also, "Stall time rate per minute is {zgcAllocationStallPerMinute}"
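To make the "stall time per minute" wording concrete, here is a rough sketch under one assumption: that the metric is the total allocation stall duration divided by the length of the observed period in minutes. The PR may compute it differently; `StallRateSketch` and `stallMillisPerMinute` are hypothetical names, not code from the change.

```java
import java.time.Duration;
import java.util.List;

final class StallRateSketch {
    // Total stall time in milliseconds per minute of observed time.
    // Assumption: rate = sum(stall durations) / observed period in minutes.
    static double stallMillisPerMinute(List<Duration> stalls, Duration observed) {
        if (observed.isZero() || observed.isNegative()) {
            throw new IllegalArgumentException("observed period must be positive");
        }
        long totalStallNanos = stalls.stream().mapToLong(Duration::toNanos).sum();
        double observedMinutes = observed.toNanos() / 60e9;
        return (totalStallNanos / 1e6) / observedMinutes;
    }
}
```

For example, stalls of 200 ms and 400 ms over a 2 minute period give 300 ms of stall time per minute, which is the kind of value the info and warning limits would be compared against.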
core/org.openjdk.jmc.flightrecorder.rules.jdk/src/main/resources/org/openjdk/jmc/flightrecorder/rules/jdk/messages/internal/messages.properties line 759:
> 757: ZgcAllocationStall_TEXT_OK=There are no occurrence of Allocation Stall Events.
> 758: ZGCAllocationStallRule_CONFIG_INFO_LIMIT=ZGC Allocation Stall Rate info limit
> 759: ZGCAllocationStallRule_CONFIG_INFO_LIMIT_LONG=The rate of ZGC Allocation Stall events needed to trigger an info notice
Suggested wording: "The ZGC allocation stall time per minute needed to trigger an info notice"
core/org.openjdk.jmc.flightrecorder.rules.jdk/src/main/resources/org/openjdk/jmc/flightrecorder/rules/jdk/messages/internal/messages.properties line 761:
> 759: ZGCAllocationStallRule_CONFIG_INFO_LIMIT_LONG=The rate of ZGC Allocation Stall events needed to trigger an info notice
> 760: ZGCAllocationStallRule_CONFIG_WARN_LIMIT=ZGC Allocation Stall Rate warning limit
> 761: ZGCAllocationStallRule_CONFIG_WARN_LIMIT_LONG=The rate of ZGC Allocation Stall events needed to trigger a warning
Suggested wording: "The ZGC allocation stall time per minute needed to trigger a warning"
core/org.openjdk.jmc.flightrecorder.rules.jdk/src/main/resources/org/openjdk/jmc/flightrecorder/rules/jdk/messages/internal/messages.properties line 762:
> 760: ZGCAllocationStallRule_CONFIG_WARN_LIMIT=ZGC Allocation Stall Rate warning limit
> 761: ZGCAllocationStallRule_CONFIG_WARN_LIMIT_LONG=The rate of ZGC Allocation Stall events needed to trigger a warning
> 762: ZGCAllocationStallRule_RATE=Allocation Stall per minute
Perhaps "Allocation stall time per minute".
core/org.openjdk.jmc.flightrecorder/src/main/resources/org/openjdk/jmc/flightrecorder/jdk/messages/internal/messages.properties line 472:
> 470: AGGR_ALL_COLLECTION_GC_COUNT_DESC=The count of GC for all garbage collection.
> 471: AGGR_ZGC_ALLOCATION_STALL_COUNT=ZGC Allocation Stall Count
> 472: AGGR_ZGC_ALLOCATION_STALL_COUNT_DESC=The count of ZGC Allocation Stall.
Suggested wording: "The number of ZGC allocation stalls"
-------------
PR Review: https://git.openjdk.org/jmc/pull/664#pullrequestreview-3161556961
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305180306
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305194969
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305159468
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305165104
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305163035
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305161765
PR Review Comment: https://git.openjdk.org/jmc/pull/664#discussion_r2305203704