Minor GCs consuming an extreme amount of user CPU time

Shane Cox Shane.Cox at ice.com
Tue Nov 19 16:45:31 UTC 2024


We are observing a transient behavior where Minor GCs consume an extreme amount of user CPU time.  Below is an excerpt from a GC log illustrating this behavior.  The first line is representative of typical behavior.  Starting with the second line, we observe progressively longer Minor GC pause times and attendant user CPU usage.  The last line shows GC behavior returning to normal.

2024-11-19T05:31:57.096-0500: 39910.586: [GC (Allocation Failure) 2024-11-19T05:31:57.096-0500: 39910.587: [ParNew: 2276397K->23593K(2530368K), 0.8039922 secs] 34156897K->32412975K(157005312K), 0.8042477 secs] [Times: user=26.09 sys=0.28, real=0.81 secs]
2024-11-19T05:32:01.942-0500: 39915.432: [GC (Allocation Failure) 2024-11-19T05:32:01.942-0500: 39915.433: [ParNew: 2272873K->22588K(2530368K), 5.9797785 secs] 34662255K->32413062K(157005312K), 5.9800455 secs] [Times: user=195.82 sys=0.16, real=5.98 secs]
2024-11-19T05:32:08.811-0500: 39922.302: [GC (Allocation Failure) 2024-11-19T05:32:08.812-0500: 39922.302: [ParNew: 2271868K->24057K(2530368K), 30.6785198 secs] 34662342K->32417812K(157005312K), 30.6788152 secs] [Times: user=1005.22 sys=0.69, real=30.68 secs]
2024-11-19T05:32:40.245-0500: 39953.736: [GC (Allocation Failure) 2024-11-19T05:32:40.245-0500: 39953.736: [ParNew: 2273337K->41786K(2530368K), 48.4038572 secs] 34667092K->32436550K(157005312K), 48.4041764 secs] [Times: user=1586.51 sys=0.85, real=48.41 secs]
2024-11-19T05:33:32.236-0500: 40005.726: [GC (Allocation Failure) 2024-11-19T05:33:32.236-0500: 40005.726: [ParNew: 2291066K->59232K(2530368K), 63.8453443 secs] 34685830K->32454926K(157005312K), 63.8456820 secs] [Times: user=2091.55 sys=1.66, real=63.85 secs]
2024-11-19T05:34:36.698-0500: 40070.189: [GC (Allocation Failure) 2024-11-19T05:34:36.698-0500: 40070.189: [ParNew: 2308512K->74883K(2530368K), 78.2503259 secs] 34704206K->32471817K(157005312K), 78.2506265 secs] [Times: user=2565.48 sys=2.21, real=78.25 secs]
2024-11-19T05:35:55.417-0500: 40148.908: [GC (Allocation Failure) 2024-11-19T05:35:55.418-0500: 40148.908: [ParNew: 2324163K->102130K(2530368K), 96.4663544 secs] 34721097K->32500405K(157005312K), 96.4668756 secs] [Times: user=3161.99 sys=2.26, real=96.46 secs]
2024-11-19T05:37:32.503-0500: 40245.993: [GC (Allocation Failure) 2024-11-19T05:37:32.503-0500: 40245.993: [ParNew: 2351410K->108861K(2530368K), 122.1160632 secs] 34749685K->32508388K(157005312K), 122.1164558 secs] [Times: user=4001.43 sys=3.68, real=122.12 secs]
2024-11-19T05:39:35.187-0500: 40368.678: [GC (Allocation Failure) 2024-11-19T05:39:35.188-0500: 40368.678: [ParNew: 2358141K->144672K(2530368K), 161.7914697 secs] 34757668K->32546599K(157005312K), 161.7916831 secs] [Times: user=5302.53 sys=4.43, real=161.79 secs]
2024-11-19T05:42:17.358-0500: 40530.849: [GC (Allocation Failure) 2024-11-19T05:42:17.359-0500: 40530.849: [ParNew: 2393952K->200667K(2530368K), 279.7220815 secs] 34795879K->32612580K(157005312K), 279.7224630 secs] [Times: user=9164.75 sys=8.37, real=279.72 secs]
2024-11-19T05:46:57.492-0500: 40810.983: [GC (Allocation Failure) 2024-11-19T05:46:57.492-0500: 40810.983: [ParNew: 2449947K->250145K(2530368K), 0.2150744 secs] 34861860K->32690960K(157005312K), 0.2152389 secs] [Times: user=5.89 sys=0.70, real=0.22 secs]


The GC log records above do not indicate that an excessive amount of GC-related work is being done.  Therefore, I'm guessing that most of the user CPU time is not spent doing productive work.  Instead, I'm guessing that the GC threads were just spinning.

The CPU usage on the host was 68-69% during this period.  That makes sense based on the GC log records.  For example, the second to last GC log record reports 9164 CPU seconds and 279 real seconds, which yields a ratio of 32.85.  Divide 32.85 by the 48 CPUs and you get a usage of 68.4% ... which is the CPU usage that we observed on the host at this time.  That being the case, it appears that all of the observed CPU usage was consumed by the GC threads.


Any idea what the GC threads might be doing that's consuming so much CPU usage?  Or suggestions for how I might determine what the GC threads are doing?


Thanks,
Shane Cox




First few lines of GC log containing configuration information:
OpenJDK 64-Bit Server VM (25.212-b04) for linux-amd64 JRE (1.8.0_212-b04), built on Apr 19 2019 23:23:17 by "root" with gcc 7.3.1 20180303 (Red Hat 7.3.1-5)
Memory: 4k page, physical 196374708k(167484808k free), swap 4194300k(3010032k free)
CommandLine flags: -XX:CMSInitiatingOccupancyFraction=75 -XX:+ExplicitGCInvokesConcurrent -XX:+HeapDumpOnOutOfMemoryError -XX:InitialHeapSize=161061273600 -XX:MaxHeapSize=161061273600 -XX:MaxNewSize=2878971904 -XX:MaxTenuringThreshold=6 -XX:NewSize=2878971904 -XX:OldPLABSize=16 -XX:OldSize=5757943808 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseConcMarkSweepGC -XX:+UseParNewGC


uname output:
Linux 4.18.0-305.45.1.el8_4.x86_64 #1 SMP Wed Apr 6 13:48:37 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

lscpu output:
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              48
On-line CPU(s) list: 0-47
Thread(s) per core:  2
Core(s) per socket:  12
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Gold 6246 CPU @ 3.30GHz
Stepping:            7
CPU MHz:             4100.050
CPU max MHz:         4200.0000
CPU min MHz:         1200.0000
BogoMIPS:            6600.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            25344K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47

________________________________

This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of Intercontinental Exchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-use/attachments/20241119/caf8acd6/attachment.htm>


More information about the hotspot-gc-use mailing list