Large heap size, slow concurrent marking causing frequent full GC

James Sun jamessun at fb.com
Fri Nov 17 16:08:05 UTC 2017


Hi Thomas

Thanks for the help! The video is super helpful. JDK9 is definitely something we are going to gradually adopt. Also, it is good know we didn’t miss much in terms of tuning.

Thanks

James

On 11/16/17, 11:56 PM, "Thomas Schatzl" <thomas.schatzl at oracle.com> wrote:

    Hi James,
    
    On Thu, 2017-11-16 at 20:02 +0000, James Sun wrote:
    > Dear
    >  
    > We observed frequent full GCs due to long concurrent marking phase
    > (about 30 seconds to a minute). The GC log with heap histogram during
    > full GC is attached.
    > The Java version we use is 8_144 with G1 GC.  The machines are with
    > 56 cores and a heap size around 180 – 210GB.
    >  
    > Example concurrent mark duration:
    > 2017-11-16T09:32:04.565-0800: 167543.159: [GC concurrent-mark-end,
    > 45.7020802 secs]
    > 2017-11-16T09:33:16.314-0800: 167614.908: [GC concurrent-mark-end,
    > 51.0809053 secs]
    > 2017-11-16T09:34:28.343-0800: 167686.938: [GC concurrent-mark-end,
    > 48.7335047 secs]
    >  
    >  
    > Wonder if anyone could help in terms of:
    > How in general we can make concurrent marking faster. We bumped up
    > the ConcGCThread to 20 but it didn’t help that much.
    
    There are known performance and scalability issues with JDK8(u). See
    the reason in some JavaOne 2016 talk about scaling G1 for huge heaps
    [0]. Unfortunately it seems you covered all other options already
    (bumping mark stack size to avoid overflows, increasing the number of
    heaps).
    
    > We also turned on -XX:+UnlockDiagnosticVMOptions
    > -XX:+G1SummarizeConcMark but nothing related to marking shows up.
    > General advice in tuning GC in other aspects
    >  
    
    Apart from the full gc, and the mixed gcs I am discussing below, is
    there anything else of concern?
    
    > Thanks in advance
    >  
    > James
    >  
    >  
    > Here is the JVM config we have
    >  
    > -Xss2048k
    > -XX:MaxMetaspaceSize=4G
    > -XX:+PreserveFramePointer
    > -XX:-UseBiasedLocking
    > -XX:+PrintGCApplicationConcurrentTime
    > -XX:+PrintGCApplicationStoppedTime
    > -XX:+UnlockExperimentalVMOptions
    > -XX:+UseG1GC
    > -XX:+ExplicitGCInvokesConcurrent
    > -XX:+HeapDumpOnOutOfMemoryError
    > -XX:+UseGCOverheadLimit
    > -XX:+ExitOnOutOfMemoryError
    > -agentpath:/packages/presto.presto/bin/libjvmkill.so
    > -agentpath:/packages/presto.presto/bin/libperfagent.so
    > -XX:+PrintReferenceGC
    > -XX:+PrintGCCause
    > -XX:+PrintGCDateStamps
    > -XX:+PrintGCTimeStamps
    > -XX:+PrintGCDetails
    > -XX:+PrintClassHistogramAfterFullGC
    > -XX:+PrintClassHistogramBeforeFullGC
    > -XX:PrintFLSStatistics=2
    
    This one is CMS specific. Can remove.
    
    > -XX:+PrintAdaptiveSizePolicy
    > -XX:+PrintSafepointStatistics
    > -XX:PrintSafepointStatisticsCount=1
    > -XX:+PrintJNIGCStalls
    > -XX:+UnlockDiagnosticVMOptions
    > -XX:+AlwaysPreTouch
    > -XX:+G1SummarizeRSetStats
    > -XX:G1SummarizeRSetStatsPeriod=100
    > -Dorg.eclipse.jetty.io.SelectorManager.submitKeyUpdates=true
    > -XX:-OmitStackTraceInFastThrow
    > -XX:ReservedCodeCacheSize=1G
    > -Djdk.nio.maxCachedBufferSize=30000000
    > -XX:G1MaxNewSizePercent=20
    > -XX:G1HeapRegionSize=32M
    
    At that heap size, the region size would be 32M anyway, may remove.
    
    > -Xms180G
    > -Xmx180G
    > -XX:MarkStackSize=64M
    > -XX:G1HeapWastePercent=2
    
    That is very aggressive. That causes the long mixed gcs at the end of a
     old gen space reclamation phase. Either increase that to 5 to cut off
    the "long" mixed gcs (still mostly within your 500ms pause time goal),
    or increase G1MixedGCCountTarget to something like 16 to spread out the
    work (you are observing the "added expensive regions to CSet" message).
    See the documentation [1] for more information.
    
    > -XX:ConcGCThreads=20
    > -XX:MaxGCPauseMillis=500
    > -XX:GCLockerRetryAllocationCount=5
    > -XX:MarkStackSizeMax=256M
    > -XX:G1OldCSetRegionThresholdPercent=20
    > -XX:InitiatingHeapOccupancyPercent=40
    > 
    
    Thanks,
      Thomas
    
    [0] https://www.youtube.com/watch?v=LppgqvKOUKs at 32:04; quoting from
    the video: "marking runs 50x faster". It also shows other jdk9
    improvements particularly applicable to running larger heaps.
    
    [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.oracle.com_javase_9_gctuning_garbage-2Dfirst-2Dgarbage-2Dcol&d=DwIFaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=ikRH8URaurZA7JMys57d3w&m=HGrocRYpexpC7RoZ8Uv8PUt0O4PiaGa_WAQORiZ85MM&s=wBM_SfIKY-Oly4plffuMq3TTNYBmyJg6eUjOvJK1uSU&e=
    lector-tuning.htm#GUID-D2B6ADCE-6766-4FF8-AA9D-B7F4F3D0F469
    
    



More information about the hotspot-gc-use mailing list