Abnormal high sys time in G1 GC
Bernd Eckenfels
ecki at zusammenkunft.net
Mon Jan 20 11:30:55 UTC 2020
Hello,
Can you say a bit what's the actual problem? Your sys times in those GCs look large, but it's only a 4s Pause for a 160gb heap. I am sure you saw that pause time before? I think it's pretty hard to tell after the fact, but with such a large system I would lean towards problems outside of the JVM. Do you have a long term graph from the logs?
Gruss
Bernd
--
http://bernd.eckenfels.net
________________________________
Von: hotspot-gc-use <hotspot-gc-use-bounces at openjdk.java.net> im Auftrag von Roy Zhang <roy.sunny.zhang007 at gmail.com>
Gesendet: Monday, January 20, 2020 11:22:43 AM
An: hotspot-gc-use at openjdk.java.net <hotspot-gc-use at openjdk.java.net>
Betreff: Abnormal high sys time in G1 GC
Dear JVM experts,
Recently we found GC spike (long STW minor GC), and sys time is high when we GC time is high. Normally sys time is near 0 seconds and minor GC is less than 500ms.
>From http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2017-October/020630.html and https://blog.gceasy.io/2016/12/11/sys-time-greater-than-user-time/, high sys time could be caused by operation system problem/VM related problem/memory constraint/disk IO pressure/Transparent Huge Pages.
I checked them one by one, don't find any clue, could u please kindly provide suggestion? Thanks in advance!
1.operation system problem
--We have enough CPU/memory/disk (48 cpu cores + 373 RAM with 160G heap, disk is enough), and there is no error in /var/log/dmesg
2. memory constraint
-- We have enough available memory. available memory (free -m) is 263G
3. disk IO pressure
-- Not find issue from disk info from prometheus node exporter. Granularity is 15s, and I can't find counterpart of avgqu-sz & util metrics (disk IO util and saturation metrics) which is part of iostat. It could be caused by big Granularity???
4. VM related problem
-- We are using physical machine
5. Transparent Huge Pages.
It is madvise. It could be a problem, but we don't have this issue previously. It has been running for nearly 20 weeks.
cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
JDK version:
OpenJDK Runtime Environment, 1.8.0_222-b10
Java Opts:
-javaagent:/server/jmx_prometheus_javaagent-0.12.0.jar=xxxx:/server/config.yaml
-server
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=xxxx
-Dcom.sun.management.jmxremote.rmi.port=xxxx
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Xloggc:/server/xxxx.log
-XX:+PrintGCDateStamps
-XX:AutoBoxCacheMax=1000000
-XX:+UseG1GC
-XX:MaxGCPauseMillis=500
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=50
-XX:InitiatingHeapOccupancyPercent=70
-XX:+ParallelRefProcEnabled
-XX:+ExplicitGCInvokesConcurrent
-XX:+UseStringDeduplication
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Xms160g
-Xmx160g
-XX:+HeapDumpOnOutOfMemoryError
Snippet of GC log:
2020-01-20T07:27:03.166+0000: 2756.665: [GC pause (G1 Evacuation Pause) (young),6.2899024 secs]
[Parallel Time: 6255.0 ms, GC Workers: 33]
[GC Worker Start (ms): Min: 2756664.9, Avg: 2756665.5, Max: 2756666.1, Diff: 1.2]
[Ext Root Scanning (ms): Min: 0.0, Avg: 0.5, Max: 5.3, Diff: 5.3, Sum: 16.8]
[Update RS (ms): Min: 0.0, Avg: 0.8, Max: 1.1, Diff: 1.1, Sum: 25.6]
[Processed Buffers: Min: 0, Avg: 1.6, Max: 4, Diff: 4, Sum: 53]
[Scan RS (ms): Min: 142.0, Avg: 145.3, Max: 146.4, Diff: 4.4, Sum: 4794.1]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.3, Max: 3.5, Diff: 3.5, Sum: 8.8]
[Object Copy (ms): Min: 6100.1, Avg: 6101.8, Max: 6106.5, Diff: 6.4, Sum: 201358.4]
[Termination (ms): Min: 0.1, Avg: 5.2, Max: 6.7, Diff: 6.6, Sum: 172.9]
[Termination Attempts: Min: 1, Avg: 1353.0, Max: 1476, Diff: 1475, Sum: 44650]
[GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 7.0]
[GC Worker Total (ms): Min: 6253.4, Avg: 6254.1, Max: 6254.7, Diff: 1.2, Sum: 206383.7]
[GC Worker End (ms): Min: 2762919.4, Avg: 2762919.6, Max: 2762919.8, Diff: 0.4]
[Code Root Fixup: 0.6 ms]
[Code Root Purge: 0.0 ms]
[String Dedup Fixup: 0.7 ms, GC Workers: 33]
[Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
[Table Fixup (ms): Min: 0.0, Avg: 0.1, Max: 0.6, Diff: 0.6, Sum: 2.0]
[Clear CT: 4.0 ms]
[Other: 29.6 ms]
[Choose CSet: 0.1 ms]
[Ref Proc: 10.3 ms]
[Ref Enq: 0.6 ms]
[Redirty Cards: 11.3 ms]
[Humongous Register: 0.2 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 6.5 ms]
[Eden: 72576.0M(72576.0M)->0.0B(80896.0M) Survivors: 9344.0M->1024.0M Heap: 83520.0M(160.0G)->11046.9M(160.0G)]
[Times: user=27.19 sys=162.28, real=6.30 secs]
2020-01-20T06:59:23.382+0000: 1096.881: [GC pause (G1 Evacuation Pause) (young) (initial-mark),4.1248088 secs]
[Parallel Time: 4098.0 ms, GC Workers: 33]
[GC Worker Start (ms): Min: 1096882.1, Avg: 1096882.8, Max: 1096883.2, Diff: 1.2]
[Ext Root Scanning (ms): Min: 4.0, Avg: 4.8, Max: 6.1, Diff: 2.0, Sum: 159.7]
[Update RS (ms): Min: 0.0, Avg: 0.3, Max: 1.1, Diff: 1.1, Sum: 9.5]
[Processed Buffers: Min: 0, Avg: 1.3, Max: 6, Diff: 6, Sum: 43]
[Scan RS (ms): Min: 2001.2, Avg: 2012.2, Max: 2013.4, Diff: 12.2, Sum: 66401.0]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.6, Max: 10.7, Diff: 10.7, Sum: 18.5]
[Object Copy (ms): Min: 2039.3, Avg: 2049.2, Max: 2079.5, Diff: 40.2, Sum: 67623.1]
[Termination (ms): Min: 0.0, Avg: 29.6, Max: 39.7, Diff: 39.7, Sum: 978.0]
[Termination Attempts: Min: 1, Avg: 6587.0, Max: 8068, Diff: 8067, Sum: 217372]
[GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum: 7.9]
[GC Worker Total (ms): Min: 4096.3, Avg: 4096.9, Max: 4097.7, Diff: 1.4, Sum: 135197.8]
[GC Worker End (ms): Min: 1100979.5, Avg: 1100979.7, Max: 1100979.9, Diff: 0.4]
[Code Root Fixup: 0.6 ms]
[Code Root Purge: 0.2 ms]
[String Dedup Fixup: 1.0 ms, GC Workers: 33]
[Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Table Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.7, Diff: 0.7, Sum: 1.4]
[Clear CT: 3.4 ms]
[Other: 21.7 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 9.1 ms]
[Ref Enq: 0.9 ms]
[Redirty Cards: 4.3 ms]
[Humongous Register: 0.2 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 5.3 ms]
[Eden: 81184.0M(81184.0M)->0.0B(72576.0M) Survivors: 736.0M->9344.0M Heap: 83508.0M(160.0G)->10944.0M(160.0G)]
[Times: user=68.40 sys=9.11, real=4.13 secs]
Thanks,
Roy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20200120/7fb88fd6/attachment-0002.htm>
More information about the hotspot-gc-use
mailing list