Large heap size, slow concurrent marking causing frequent full GC

Timur Akhmadeev timur.akhmadeev at gmail.com
Sat Nov 18 05:49:58 UTC 2017


Hi,

You should also try using huge pages. With large heaps, I think it's a
must.
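
If you want to try it, here is a rough sketch of what that could look like
on Linux. The page count is an assumption: it presumes 2 MB huge pages and
only covers the 180G heap (180 * 1024 / 2 = 92160). In practice you would
reserve somewhat more for the other JVM memory areas, so treat the numbers
as a starting point, not a tested setup:

    # option A: reserve explicit huge pages (as root), then enable them
    sysctl -w vm.nr_hugepages=92160
    -XX:+UseLargePages

    # option B: rely on transparent huge pages instead (THP must be set
    # to "madvise" or "always" on the host)
    -XX:+UseTransparentHugePages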

On Fri, 17 Nov 2017 at 21:24, James Sun <jamessun at fb.com> wrote:

> Hi Thomas
>
> Thanks for the help! The video is super helpful. JDK9 is definitely
> something we are going to gradually adopt. Also, it is good to know we didn’t
> miss much in terms of tuning.
>
> Thanks
>
> James
>
> On 11/16/17, 11:56 PM, "Thomas Schatzl" <thomas.schatzl at oracle.com> wrote:
>
>     Hi James,
>
>     On Thu, 2017-11-16 at 20:02 +0000, James Sun wrote:
>     > Dear
>     >
>     > We observed frequent full GCs due to long concurrent marking phase
>     > (about 30 seconds to a minute). The GC log with heap histogram during
>     > full GC is attached.
>     > The Java version we use is 8_144 with G1 GC. The machines have
>     > 56 cores and a heap size of around 180 – 210GB.
>     >
>     > Example concurrent mark duration:
>     > 2017-11-16T09:32:04.565-0800: 167543.159: [GC concurrent-mark-end,
>     > 45.7020802 secs]
>     > 2017-11-16T09:33:16.314-0800: 167614.908: [GC concurrent-mark-end,
>     > 51.0809053 secs]
>     > 2017-11-16T09:34:28.343-0800: 167686.938: [GC concurrent-mark-end,
>     > 48.7335047 secs]
>     >
>     >
>     > Wonder if anyone could help in terms of:
>     > How in general we can make concurrent marking faster. We bumped up
>     > ConcGCThreads to 20 but it didn’t help that much.
>
>     There are known performance and scalability issues with JDK8(u). For
>     the reasons, see the JavaOne 2016 talk about scaling G1 for huge heaps
>     [0]. Unfortunately it seems you covered all other options already
>     (bumping mark stack size to avoid overflows, increasing the number of
>     threads).
>
>     > We also turned on -XX:+UnlockDiagnosticVMOptions
>     > -XX:+G1SummarizeConcMark but nothing related to marking shows up.
>     > General advice in tuning GC in other aspects
>     >
>
>     Apart from the full gc, and the mixed gcs I am discussing below, is
>     there anything else of concern?
>
>     > Thanks in advance
>     >
>     > James
>     >
>     >
>     > Here is the JVM config we have
>     >
>     > -Xss2048k
>     > -XX:MaxMetaspaceSize=4G
>     > -XX:+PreserveFramePointer
>     > -XX:-UseBiasedLocking
>     > -XX:+PrintGCApplicationConcurrentTime
>     > -XX:+PrintGCApplicationStoppedTime
>     > -XX:+UnlockExperimentalVMOptions
>     > -XX:+UseG1GC
>     > -XX:+ExplicitGCInvokesConcurrent
>     > -XX:+HeapDumpOnOutOfMemoryError
>     > -XX:+UseGCOverheadLimit
>     > -XX:+ExitOnOutOfMemoryError
>     > -agentpath:/packages/presto.presto/bin/libjvmkill.so
>     > -agentpath:/packages/presto.presto/bin/libperfagent.so
>     > -XX:+PrintReferenceGC
>     > -XX:+PrintGCCause
>     > -XX:+PrintGCDateStamps
>     > -XX:+PrintGCTimeStamps
>     > -XX:+PrintGCDetails
>     > -XX:+PrintClassHistogramAfterFullGC
>     > -XX:+PrintClassHistogramBeforeFullGC
>     > -XX:PrintFLSStatistics=2
>
>     This one is CMS-specific; you can remove it.
>
>     > -XX:+PrintAdaptiveSizePolicy
>     > -XX:+PrintSafepointStatistics
>     > -XX:PrintSafepointStatisticsCount=1
>     > -XX:+PrintJNIGCStalls
>     > -XX:+UnlockDiagnosticVMOptions
>     > -XX:+AlwaysPreTouch
>     > -XX:+G1SummarizeRSetStats
>     > -XX:G1SummarizeRSetStatsPeriod=100
>     > -Dorg.eclipse.jetty.io.SelectorManager.submitKeyUpdates=true
>     > -XX:-OmitStackTraceInFastThrow
>     > -XX:ReservedCodeCacheSize=1G
>     > -Djdk.nio.maxCachedBufferSize=30000000
>     > -XX:G1MaxNewSizePercent=20
>     > -XX:G1HeapRegionSize=32M
>
>     At that heap size, the region size would be 32M anyway, so you may remove this.
>
>     > -Xms180G
>     > -Xmx180G
>     > -XX:MarkStackSize=64M
>     > -XX:G1HeapWastePercent=2
>
>     That is very aggressive. That causes the long mixed gcs at the end of an
>     old gen space reclamation phase. Either increase that to 5 to cut off
>     the "long" mixed gcs (still mostly within your 500ms pause time goal),
>     or increase G1MixedGCCountTarget to something like 16 to spread out the
>     work (you are observing the "added expensive regions to CSet" message).
>     See the documentation [1] for more information.
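
(Inline note: spelled out as flags, the two alternatives described above
would look roughly like this. The values are the ones from the suggestion
itself; they are untested against this workload, and the idea is to try one
or the other first, not both at once.)

    # alternative 1: cut off the expensive tail of the mixed gcs
    -XX:G1HeapWastePercent=5

    # alternative 2: keep the waste target, spread the work over more gcs
    -XX:G1MixedGCCountTarget=16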
>
>     > -XX:ConcGCThreads=20
>     > -XX:MaxGCPauseMillis=500
>     > -XX:GCLockerRetryAllocationCount=5
>     > -XX:MarkStackSizeMax=256M
>     > -XX:G1OldCSetRegionThresholdPercent=20
>     > -XX:InitiatingHeapOccupancyPercent=40
>     >
>
>     Thanks,
>       Thomas
>
>     [0] https://www.youtube.com/watch?v=LppgqvKOUKs at 32:04; quoting from
>     the video: "marking runs 50x faster". It also shows other jdk9
>     improvements particularly applicable to running larger heaps.
>
>     [1]
>     https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector-tuning.htm#GUID-D2B6ADCE-6766-4FF8-AA9D-B7F4F3D0F469
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
-- 
Regards
Timur Akhmadeev