Need help on G1 GC young gen Update RS and Scan RS pause reduction

Amit Mishra amit.mishra at redknee.com
Thu Jan 19 13:05:00 UTC 2017


Hello Friends,

I am using G1 GC on Java 1.7.0_45 in my Solaris lab environment, but I am getting very high Update RS times and, on a few occasions, high Scan RS times as well.

OS version: Generic_150401-30 i86pc i386 i86pc
Number of CPUs: 80. A day back I tried the options below, hoping extra GC threads would help, but to no avail; I still witnessed long pauses of up to 7 seconds. Right now I have removed these GC thread settings.

argv[52]: -XX:ParallelGCThreads=70
argv[53]: -XX:ConcGCThreads=60
argv[54]: -XX:G1ConcRefinementThreads=70

I have already set the -XX:G1RSetUpdatingPauseTimePercent=5 option and relaxed the pause time goal to 500 ms, but it does not seem to be helping.


Your urgent help will be highly appreciated, as I need a system where the average pause time stays around 200-300 milliseconds and the maximum pause stays around 500-600 ms.

Current GC parameters are as below:
argv[11]: -Xmx48g
argv[12]: -Xms48g
argv[13]: -XX:-EliminateLocks
argv[24]: -Xss1m
argv[25]: -Xoss1m
argv[26]: -XX:PermSize=512m
argv[27]: -XX:MaxPermSize=512m
argv[28]: -XX:ReservedCodeCacheSize=128m
argv[29]: -XX:+HeapDumpOnOutOfMemoryError
argv[30]: -XX:+AggressiveOpts
argv[35]: -XX:+UseG1GC
argv[36]: -XX:MaxGCPauseMillis=500
argv[37]: -XX:+PrintFlagsFinal
argv[38]: -XX:G1RSetUpdatingPauseTimePercent=5
argv[39]: -XX:+PrintGCTimeStamps
argv[40]: -XX:+PrintGCDetails
argv[42]: -verbose:gc
argv[43]: -XX:+UseLargePages
argv[44]: -XX:+MaxFDLimit
argv[48]: -XX:+UnlockExperimentalVMOptions
argv[49]: -XX:G1NewSizePercent=1
argv[50]: -XX:+ParallelRefProcEnabled
argv[51]: -XX:+DisableExplicitGC
argv[52]: -XX:G1MaxNewSizePercent=1
argv[53]: -XX:InitiatingHeapOccupancyPercent=40
argv[54]: -XX:+UnlockDiagnosticVMOptions
argv[55]: -XX:+G1SummarizeRSetStats
argv[56]: -XX:G1SummarizeRSetStatsPeriod=1
argv[57]: -XX:+PerfDisableSharedMem
argv[58]: -XX:+AlwaysPreTouch



Pause snapshots:

Very high Update RS but low Scan RS

2602.655: [GC pause (young), 7.8981373 secs]
   [Parallel Time: 7883.9 ms, GC Workers: 53]
      [GC Worker Start (ms): Min: 2602654.9, Avg: 2602655.3, Max: 2602655.7, Diff: 0.8]
      [Ext Root Scanning (ms): Min: 3.5, Avg: 4.5, Max: 34.2, Diff: 30.7, Sum: 240.0]
      [Update RS (ms): Min: 7796.5, Avg: 7826.2, Max: 7827.8, Diff: 31.4, Sum: 414789.9]
         [Processed Buffers: Min: 1, Avg: 3.8, Max: 12, Diff: 11, Sum: 199]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6]
      [Object Copy (ms): Min: 48.2, Avg: 48.8, Max: 49.4, Diff: 1.3, Sum: 2588.2]
      [Termination (ms): Min: 3.1, Avg: 3.5, Max: 3.9, Diff: 0.8, Sum: 183.8]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 8.4]
      [GC Worker Total (ms): Min: 7882.7, Avg: 7883.2, Max: 7883.8, Diff: 1.1, Sum: 417810.9]
      [GC Worker End (ms): Min: 2610538.4, Avg: 2610538.5, Max: 2610538.7, Diff: 0.3]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 0.9 ms]
   [Other: 13.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 10.5 ms]
      [Ref Enq: 1.0 ms]
      [Free CSet: 0.5 ms]
   [Eden: 432.0M(432.0M)->0.0B(416.0M) Survivors: 48.0M->64.0M Heap: 19.3G(48.0G)->18.9G(48.0G)]
[Times: user=3.24 sys=0.02, real=7.90 secs]

Concurrent RS processed 48786945 cards
  Of 1661300 completed buffers:
      1661300 (100.0%) by conc RS threads.
            0 (  0.0%) by mutator threads.
  Conc RS threads times(s)
         17.55    15.01    12.67    11.26    10.14     9.56     8.13     7.26     6.61     6.18     5.25     4.64     3.97     3.16     2.92     2.76     2.60     2.54     2.36     2.30     2.16
  2.12     2.00     1.92     1.83     1.82     1.75     1.69     1.64     1.64     1.55     1.50     1.45     1.48     1.41     1.36     1.32     1.29     1.26     1.22     1.15     1.18     1.12
   1.07     1.05     1.03     1.01     0.97     0.96     0.95     0.89     0.89     0.87     0.33
  Total heap region rem set sizes = 808374K.  Max = 4135K.
  Static structures = 1596K, free_lists = 0K.
    72965736 occupied cards represented.
    Max size region = 35:(O)[0xfffffd73e3000000,0xfffffd73e4000000,0xfffffd73e4000000], size = 4136K, occupied = 14130K.
    Did 252 coarsenings.
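For reference, the per-pause Update RS averages can be pulled out of -XX:+PrintGCDetails output with a small sed sketch to spot outliers like the 7.9-second pause above (the `gc.log` filename in the comment is just a placeholder):

```shell
# Extract the Avg value from an "Update RS" line produced by
# -XX:+PrintGCDetails; for the pause above this prints 7826.2.
line='      [Update RS (ms): Min: 7796.5, Avg: 7826.2, Max: 7827.8, Diff: 31.4, Sum: 414789.9]'
echo "$line" | sed -n 's/.*Avg: \([0-9.]*\),.*/\1/p'

# Over a whole log (placeholder filename):
#   grep '\[Update RS' gc.log | sed -n 's/.*Avg: \([0-9.]*\),.*/\1/p'
```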



High Update RS and high Scan RS:

301.125: [GC pause (young), 1.5623216 secs]
   [Parallel Time: 1542.0 ms, GC Workers: 53]
      [GC Worker Start (ms): Min: 301126.4, Avg: 301126.9, Max: 301127.4, Diff: 0.9]
      [Ext Root Scanning (ms): Min: 2.1, Avg: 2.8, Max: 3.9, Diff: 1.8, Sum: 146.3]
      [Update RS (ms): Min: 593.4, Avg: 647.2, Max: 723.0, Diff: 129.6, Sum: 34303.1]
        [Processed Buffers: Min: 10, Avg: 19.4, Max: 46, Diff: 36, Sum: 1026]
      [Scan RS (ms): Min: 504.4, Avg: 580.7, Max: 633.5, Diff: 129.0, Sum: 30775.8]
      [Object Copy (ms): Min: 293.4, Avg: 306.5, Max: 310.0, Diff: 16.6, Sum: 16245.9]
      [Termination (ms): Min: 0.0, Avg: 3.2, Max: 9.8, Diff: 9.7, Sum: 171.0]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.5, Diff: 0.4, Sum: 13.8]
      [GC Worker Total (ms): Min: 1540.1, Avg: 1540.7, Max: 1541.3, Diff: 1.2, Sum: 81656.0]
      [GC Worker End (ms): Min: 302667.3, Avg: 302667.6, Max: 302667.8, Diff: 0.4]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 1.0 ms]
   [Other: 19.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 12.4 ms]
      [Ref Enq: 1.1 ms]
      [Free CSet: 0.6 ms]
   [Eden: 416.0M(416.0M)->0.0B(416.0M) Survivors: 64.0M->64.0M Heap: 14.8G(48.0G)->14.7G(48.0G)]
[Times: user=12.90 sys=1.13, real=1.56 secs]

Concurrent RS processed 9611680 cards
  Of 113955 completed buffers:
       113955 (100.0%) by conc RS threads.
            0 (  0.0%) by mutator threads.
  Conc RS threads times(s)
          5.63     5.17     4.67     4.39     4.03     3.81     3.46     3.21     3.01     2.88     2.67     2.47     2.27     2.08     1.89     1.76     1.64     1.58     1.45     1.40     1.31
  1.27     1.17     1.12     1.06     1.04     0.99     0.95     0.93     0.92     0.86     0.83     0.82     0.82     0.77     0.74     0.71     0.69     0.67     0.64     0.60     0.61     0.57
   0.55     0.53     0.52     0.51     0.48     0.47     0.48     0.43     0.43     0.42     0.02
  Total heap region rem set sizes = 255924K.  Max = 3260K.
  Static structures = 1596K, free_lists = 6K.
    42009931 occupied cards represented.
    Max size region = 31:(O)[0xfffffd73df000000,0xfffffd73e0000000,0xfffffd73e0000000], size = 3261K, occupied = 11784K.
    Did 0 coarsenings.





Thanks,
Amit Mishra

From: Amit Mishra
Sent: Tuesday, January 17, 2017 18:53
To: 'yu.zhang at oracle.com' <yu.zhang at oracle.com>; Poonam Bajaj Parhar <poonam.bajaj at oracle.com>
Subject: RE: Need help on G1 GC young pause reduction

Thank you very much Poonam and Zhang for your response.

I have enabled the flags below to summarize remembered set statistics, and to keep the application unaffected in case of high disk I/O or swapping on the node.

argv[58]: -XX:+G1SummarizeRSetStats
argv[59]: -XX:G1SummarizeRSetStatsPeriod=10
argv[60]: -XX:+PerfDisableSharedMem
argv[61]: -XX:+AlwaysPreTouch


I am monitoring the situation and will raise my concern on hotspot mail list if I encounter any further pause.

I just want to know whether, apart from the three STW pauses below, there are any other STW pauses in G1 GC.

a) Young generation evacuation pause
b) Remark phase pause
c) Cleanup phase pause


Thanks,
Amit Mishra



From: yu.zhang at oracle.com
Sent: Saturday, January 14, 2017 00:57
To: Poonam Bajaj Parhar <poonam.bajaj at oracle.com>; Amit Mishra <amit.mishra at redknee.com>
Subject: Re: Need help on G1 GC young pause reduction


Poonam,

Thanks for forwarding this. As you suggested, I also recommend sending it to hotspot-gc-use at openjdk.java.net.

To add to what Poonam said,

With -XX:+G1SummarizeRSetStats you can check whether there are a lot of coarsenings, and adjust the remembered set table entries if there are.

Another suggestion is to tune the Red/Yellow/Green zone definitions for refinement. That requires getting the refinement details first.
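For the archives, a sketch of what setting the zones and the remembered set table size explicitly could look like. All values below are placeholders, not recommendations (by default HotSpot sizes the zones from the refinement thread count), and `App.jar` stands in for the actual launch command:

```shell
# Sketch only: explicit refinement zone thresholds (counts of filled
# card buffers). Green: refinement threads start working; Yellow: all
# refinement threads are active; Red: mutator threads must process
# their own buffers. G1RSetRegionEntries (experimental) enlarges the
# per-region fine-grain remembered set table, delaying coarsening.
java -XX:+UseG1GC \
     -XX:G1ConcRefinementGreenZone=1024 \
     -XX:G1ConcRefinementYellowZone=3072 \
     -XX:G1ConcRefinementRedZone=6144 \
     -XX:+UnlockExperimentalVMOptions \
     -XX:G1RSetRegionEntries=1024 \
     -jar App.jar
```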

Thanks

Jenny

On 01/13/2017 10:13 AM, Poonam Bajaj Parhar wrote:
Hello Amit,

You could try to get more information with -XX:+G1SummarizeRSetStats and see what is taking more time. I am copying Jenny, who will be able to help further.

And I suggest you send your question with GC logs to hotspot-gc-use at openjdk.java.net to get the GC experts' opinion on this.

Thanks,
Poonam
On 1/13/2017 5:02 AM, Amit Mishra wrote:
Hello Poonam,

I have updated the mentioned parameter, but strangely the young pause due to Update RS sometimes goes over 1 second, while most of the time it stays around 30-50 ms.

Please help me decrease this Update RS time.

argv[35]: -XX:+UseG1GC
argv[38]: -XX:G1RSetUpdatingPauseTimePercent=5
argv[49]: -XX:G1NewSizePercent=1
argv[50]: -XX:+ParallelRefProcEnabled
argv[51]: -XX:+DisableExplicitGC
argv[52]: -XX:G1MaxNewSizePercent=1
argv[53]: -XX:ParallelGCThreads=70
argv[54]: -XX:ConcGCThreads=60
argv[55]: -XX:G1ConcRefinementThreads=70
argv[56]: -XX:InitiatingHeapOccupancyPercent=30



4082.624: [GC pause (young), 0.0319047 secs]
   [Parallel Time: 13.9 ms, GC Workers: 70]
      [GC Worker Start (ms): Min: 4082624.8, Avg: 4082625.3, Max: 4082625.9, Diff: 1.1]
      [Ext Root Scanning (ms): Min: 1.3, Avg: 2.1, Max: 5.9, Diff: 4.6, Sum: 146.0]
      [Update RS (ms): Min: 0.0, Avg: 2.8, Max: 3.7, Diff: 3.7, Sum: 194.1]
         [Processed Buffers: Min: 0, Avg: 3.1, Max: 16, Diff: 16, Sum: 216]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.9]
      [Object Copy (ms): Min: 6.0, Avg: 6.9, Max: 7.4, Diff: 1.4, Sum: 481.1]
      [Termination (ms): Min: 0.3, Avg: 0.6, Max: 0.8, Diff: 0.4, Sum: 38.8]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.5, Sum: 17.4]
      [GC Worker Total (ms): Min: 12.0, Avg: 12.5, Max: 13.2, Diff: 1.2, Sum: 878.3]
      [GC Worker End (ms): Min: 4082637.6, Avg: 4082637.9, Max: 4082638.1, Diff: 0.5]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 1.1 ms]
   [Other: 16.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 14.0 ms]
      [Ref Enq: 1.1 ms]
      [Free CSet: 0.5 ms]
   [Eden: 464.0M(464.0M)->0.0B(464.0M) Survivors: 16.0M->16.0M Heap: 14.7G(48.0G)->14.2G(48.0G)]
[Times: user=1.04 sys=0.02, real=0.03 secs]
4083.465: [GC pause (young), 1.3382388 secs]
   [Parallel Time: 1319.7 ms, GC Workers: 70]
      [GC Worker Start (ms): Min: 4083465.7, Avg: 4083466.2, Max: 4083466.7, Diff: 1.0]
      [Ext Root Scanning (ms): Min: 1.6, Avg: 22.2, Max: 1306.9, Diff: 1305.3, Sum: 1557.0]
      [Update RS (ms): Min: 3.3, Avg: 1287.7, Max: 1308.7, Diff: 1305.4, Sum: 90139.6]
         [Processed Buffers: Min: 1, Avg: 2.8, Max: 11, Diff: 10, Sum: 198]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 1.0]
      [Object Copy (ms): Min: 6.8, Avg: 8.0, Max: 8.6, Diff: 1.8, Sum: 558.6]
      [Termination (ms): Min: 0.5, Avg: 0.7, Max: 0.9, Diff: 0.4, Sum: 46.0]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 11.2]
      [GC Worker Total (ms): Min: 1318.2, Avg: 1318.8, Max: 1319.5, Diff: 1.3, Sum: 92313.2]
      [GC Worker End (ms): Min: 4084784.8, Avg: 4084785.0, Max: 4084785.2, Diff: 0.4]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 1.3 ms]
   [Other: 17.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 14.3 ms]
      [Ref Enq: 1.3 ms]
      [Free CSet: 0.5 ms]
   [Eden: 464.0M(464.0M)->0.0B(464.0M) Survivors: 16.0M->16.0M Heap: 14.7G(48.0G)->14.2G(48.0G)]
[Times: user=1.11 sys=0.02, real=1.34 secs]
4085.195: [GC pause (young), 0.0303773 secs]
   [Parallel Time: 13.4 ms, GC Workers: 70]
      [GC Worker Start (ms): Min: 4085194.9, Avg: 4085195.4, Max: 4085195.9, Diff: 1.0]
      [Ext Root Scanning (ms): Min: 1.6, Avg: 2.3, Max: 5.4, Diff: 3.8, Sum: 161.6]
      [Update RS (ms): Min: 0.0, Avg: 2.8, Max: 3.8, Diff: 3.8, Sum: 196.4]
         [Processed Buffers: Min: 0, Avg: 3.4, Max: 15, Diff: 15, Sum: 235]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 1.0]
      [Object Copy (ms): Min: 5.4, Avg: 6.3, Max: 6.8, Diff: 1.4, Sum: 442.2]
      [Termination (ms): Min: 0.3, Avg: 0.4, Max: 0.6, Diff: 0.3, Sum: 28.6]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum: 15.1]
      [GC Worker Total (ms): Min: 11.4, Avg: 12.1, Max: 12.7, Diff: 1.2, Sum: 844.9]
      [GC Worker End (ms): Min: 4085207.2, Avg: 4085207.4, Max: 4085207.7, Diff: 0.4]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 1.2 ms]
   [Other: 15.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 12.9 ms]
      [Ref Enq: 1.2 ms]
      [Free CSet: 0.4 ms]


Regards,
Amit Mishra
From: Poonam Bajaj Parhar [mailto:poonam.bajaj at oracle.com]
Sent: Wednesday, January 11, 2017 23:59
To: Amit Mishra <amit.mishra at redknee.com>
Subject: Re: Need help on G1 GC young pause reduction

Hello Amit,

You should decrease the value of G1RSetUpdatingPauseTimePercent (default is 10) and not increase it. By decreasing the value of this parameter, you are instructing G1 to spend less time on RS update during evacuation pauses. The leftover work would be done concurrently by the refinement threads. To help speed up the concurrent RS update, increase the number of concurrent refinement threads with -XX:G1ConcRefinementThreads. You may first want to check with the help of -XX:+G1SummarizeRSetStats if the refinement threads are able to keep up with the update work. If they are not able to complete the RS update work, then increase the number of refinement threads.
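Put as a concrete flag sketch, Poonam's suggestion amounts to something like the following. The refinement thread count is illustrative only, and `App.jar` is a placeholder for the actual launch command:

```shell
# Sketch: shift remembered-set update work out of the evacuation pause
# and onto the concurrent refinement threads, and enable the stats
# needed to check whether refinement keeps up. G1SummarizeRSetStats is
# a diagnostic flag and must be unlocked first.
java -XX:+UseG1GC \
     -XX:G1RSetUpdatingPauseTimePercent=5 \
     -XX:G1ConcRefinementThreads=16 \
     -XX:+UnlockDiagnosticVMOptions \
     -XX:+G1SummarizeRSetStats \
     -XX:G1SummarizeRSetStatsPeriod=1 \
     -jar App.jar
```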

Hope this helps!
Thanks,
Poonam
On 1/11/2017 5:04 AM, Amit Mishra wrote:
Thank you very much Poonam. I have implemented that option and added ConcGC threads as well, but I am still getting Update RS times greater than 1 second, which we cannot afford on our production platform.
I have already enabled all the parameters I can think of after reading the article you mentioned, but the pause values are still beyond our expectations.
Is there anyone from your team who can help us alleviate such large pauses?

Would reducing the region size or adding any other flag help us?


15872.711: [GC pause (young), 1.4327308 secs]
   [Parallel Time: 1407.3 ms, GC Workers: 70]
      [GC Worker Start (ms): Min: 15872711.5, Avg: 15872712.1, Max: 15872712.7, Diff: 1.2]
      [Ext Root Scanning (ms): Min: 1.5, Avg: 42.1, Max: 1398.8, Diff: 1397.3, Sum: 2945.8]
      [Update RS (ms): Min: 0.0, Avg: 1224.4, Max: 1399.8, Diff: 1399.7, Sum: 85704.5]
         [Processed Buffers: Min: 0, Avg: 15.5, Max: 93, Diff: 93, Sum: 1083]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.9]
      [Object Copy (ms): Min: 0.1, Avg: 4.7, Max: 33.1, Diff: 33.0, Sum: 326.9]
      [Termination (ms): Min: 0.2, Avg: 131.3, Max: 1316.1, Diff: 1315.9, Sum: 9188.6]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.5, Sum: 15.2]
      [GC Worker Total (ms): Min: 1401.8, Avg: 1402.6, Max: 1406.7, Diff: 4.8, Sum: 98181.9]
      [GC Worker End (ms): Min: 15874114.4, Avg: 15874114.7, Max: 15874118.4, Diff: 4.0]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 1.5 ms]
   [Other: 24.0 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 16.9 ms]
      [Ref Enq: 1.6 ms]
      [Free CSet: 1.5 ms]
   [Eden: 928.0M(928.0M)->0.0B(912.0M) Survivors: 48.0M->64.0M Heap: 20.4G(48.0G)->19.5G(48.0G)]
[Times: user=9.37 sys=0.13, real=1.43 secs]


argv[35]: -XX:+UseG1GC
argv[36]: -XX:MaxGCPauseMillis=800
argv[37]: -XX:+UseCompressedOops
argv[38]: -XX:+PrintFlagsFinal
argv[39]: -XX:G1RSetUpdatingPauseTimePercent=20
argv[40]: -XX:+PrintGCTimeStamps
argv[41]: -XX:+PrintGCDetails
argv[42]: -Xloggc:/opt/redknee/product/cps/9_2_20/log/gcstats.log.17962
argv[43]: -verbose:gc
argv[44]: -XX:+UseLargePages
argv[45]: -XX:+MaxFDLimit
argv[49]: -XX:+UnlockExperimentalVMOptions
argv[50]: -XX:G1NewSizePercent=2
argv[51]: -XX:+ParallelRefProcEnabled
argv[52]: -XX:+DisableExplicitGC
argv[53]: -XX:G1MaxNewSizePercent=2
argv[54]: -XX:ParallelGCThreads=70
argv[55]: -XX:ConcGCThreads=60


Thanks,
Amit Mishra


From: Poonam Bajaj Parhar [mailto:poonam.bajaj at oracle.com]
Sent: Tuesday, January 10, 2017 03:57
To: Amit Mishra <amit.mishra at redknee.com>
Subject: Re: Need help on G1 GC young pause reduction

Hello Amit,

You can try limiting the amount of time that the GC threads should spend in updating the RS with the option -XX:G1RSetUpdatingPauseTimePercent. Its default value is 10.

Please see this article here: https://www.infoq.com/articles/tuning-tips-G1-GC

This talks in detail about how to get more diagnostic information about the RS update times and how it can be controlled.

Thanks,
Poonam
On 1/9/2017 4:27 AM, Amit Mishra wrote:
Hello Poonam,

I changed the max new gen size to 2% and parallel GC threads to 70, but I am still getting random STW pauses greater than 1 second.

This time it happened while updating the remembered set; can you help me decrease this pause by enabling any additional flag?

37171.325: [GC pause (young), 3.3011446 secs]
   [Parallel Time: 3280.9 ms, GC Workers: 70]
      [GC Worker Start (ms): Min: 37171325.3, Avg: 37171325.9, Max: 37171326.4, Diff: 1.0]
      [Ext Root Scanning (ms): Min: 1.9, Avg: 2.6, Max: 6.5, Diff: 4.5, Sum: 179.9]
      [Update RS (ms): Min: 6.0, Avg: 56.3, Max: 3267.8, Diff: 3261.8, Sum: 3939.2]
         [Processed Buffers: Min: 4, Avg: 6.5, Max: 18, Diff: 14, Sum: 455]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 1.2]
      [Object Copy (ms): Min: 0.2, Avg: 103.5, Max: 105.6, Diff: 105.4, Sum: 7245.4]
      [Termination (ms): Min: 0.0, Avg: 3108.7, Max: 3162.7, Diff: 3162.7, Sum: 217611.5]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.4, Diff: 0.4, Sum: 9.9]
      [GC Worker Total (ms): Min: 3270.0, Avg: 3271.2, Max: 3279.6, Diff: 9.7, Sum: 228987.2]
      [GC Worker End (ms): Min: 37174596.3, Avg: 37174597.1, Max: 37174605.9, Diff: 9.6]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 1.2 ms]
   [Other: 19.0 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 13.8 ms]
      [Ref Enq: 1.2 ms]
      [Free CSet: 1.6 ms]
   [Eden: 848.0M(848.0M)->0.0B(848.0M) Survivors: 128.0M->128.0M Heap: 22.0G(48.0G)->21.2G(48.0G)]
[Times: user=185.76 sys=1.63, real=3.30 secs]

Thanks,
Amit Mishra

From: Amit Mishra
Sent: Thursday, January 5, 2017 13:07
To: 'Poonam Bajaj Parhar' <poonam.bajaj at oracle.com>
Subject: RE: Need help on G1 GC young pause reduction

Thank you very much Poonam. I have re-run the test keeping the new gen size at 2%, which would be roughly 1 GB.

One more thing I want to ask: can we raise the number of parallel GC threads? The default on large machines is 8 threads plus 5/8 of the cores beyond the first 8, and we have 80 CPUs on the lab machine, so can we use 70 parallel GC threads, given that there is no risk of CMS-style fragmentation in G1 GC?
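For what it's worth, the default can be checked against the HotSpot heuristic (all of the first 8 CPUs, plus 5/8 of the remainder), which matches the 53 GC workers seen in the logs earlier in this thread:

```shell
# HotSpot default ParallelGCThreads heuristic for ncpus > 8:
# 8 + 5/8 of the remaining CPUs (integer arithmetic).
ncpus=80
threads=$(( ncpus <= 8 ? ncpus : 8 + (ncpus - 8) * 5 / 8 ))
echo "$threads"   # prints 53 on an 80-CPU machine
```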


Thanks,
Amit Mishra

From: Poonam Bajaj Parhar [mailto:poonam.bajaj at oracle.com]
Sent: Wednesday, January 4, 2017 19:21
To: Amit Mishra <amit.mishra at redknee.com>
Subject: Re: Need help on G1 GC young pause reduction

Hello Amit,

For some of the GC events, the sys time is higher than the user time:

Timestamp    User Time (secs)    Sys Time (secs)    Real Time (secs)
00:01:33        3.61                        4.55                        0.18
00:01:34        3.82                        4.16                        0.17
00:01:56        4.32                        6.64                        0.23
00:02:01        4.39                        6.25                        0.23
.....

Please check if there is something else going on at the system level when these GC events occur.

Have you tried setting -XX:G1MaxNewSizePercent to set the max percentage size for the young gen? G1NewSizePercent would set the minimum size for the young gen, not the max limit on it. Making the young gen smaller might help in reducing the evacuation pause times. Let me know if this helps.
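As a sketch, bounding the young gen on the 48G heap described in this thread would look like the following (values are illustrative, and `App.jar` stands in for the actual launch command):

```shell
# Sketch: bound the young generation between 1% and 2% of a 48G heap
# (roughly 0.5-1 GB). Both percent flags are experimental and must be
# unlocked first.
java -Xms48g -Xmx48g -XX:+UseG1GC \
     -XX:+UnlockExperimentalVMOptions \
     -XX:G1NewSizePercent=1 \
     -XX:G1MaxNewSizePercent=2 \
     -jar App.jar
```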

Thanks,
Poonam
On 1/4/2017 4:24 AM, Amit Mishra wrote:
Hello Poonam,

We are testing one of our applications with a 48G heap and G1 GC, as there were fragmentation issues with CMS.

But we are seeing young pauses greater than 1 second every day, which we cannot afford on a production system; a recent pause was as high as 5.7 seconds.

There are 64 virtual cores on the platform; I am attaching the GC log file for your reference.

Initially I tested with only the default G1 GC parameters, and then with -XX:G1NewSizePercent=1, but in both cases there are young gen pauses greater than 1 second, which is not acceptable.

I need your expert help to reduce the young gen pauses.

13531.695: [GC pause (young), 5.7629644 secs]
   [Parallel Time: 5714.4 ms, GC Workers: 53]
      [GC Worker Start (ms): Min: 13531695.5, Avg: 13531696.0, Max: 13531696.4, Diff: 0.9]
      [Ext Root Scanning (ms): Min: 2.2, Avg: 501.4, Max: 5286.9, Diff: 5284.7, Sum: 26572.0]
      [Update RS (ms): Min: 3.4, Avg: 4789.8, Max: 5289.7, Diff: 5286.3, Sum: 253861.2]
         [Processed Buffers: Min: 2, Avg: 10.4, Max: 32, Diff: 30, Sum: 552]
      [Scan RS (ms): Min: 5.0, Avg: 6.1, Max: 6.9, Diff: 1.9, Sum: 324.6]
      [Object Copy (ms): Min: 380.8, Avg: 382.7, Max: 384.4, Diff: 3.6, Sum: 20284.2]
      [Termination (ms): Min: 31.2, Avg: 33.0, Max: 34.9, Diff: 3.7, Sum: 1746.7]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.6, Sum: 17.7]
      [GC Worker Total (ms): Min: 5712.7, Avg: 5713.3, Max: 5714.1, Diff: 1.4, Sum: 302806.3]
      [GC Worker End (ms): Min: 13537409.0, Avg: 13537409.3, Max: 13537409.6, Diff: 0.6]
   [Code Root Fixup: 0.0 ms]
   [Clear CT: 7.0 ms]
   [Other: 41.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 25.7 ms]
      [Ref Enq: 1.0 ms]
      [Free CSet: 11.9 ms]
   [Eden: 20.9G(20.9G)->0.0B(128.0M) Survivors: 432.0M->352.0M Heap: 42.7G(48.0G)->21.8G(48.0G)]
[Times: user=24.40 sys=0.02, real=5.76 secs]


GC parameters are as below:
argv[11]: -Xmx48g
argv[12]: -Xms48g
argv[13]: -XX:-EliminateLocks
argv[18]: -server
argv[24]: -Xss1m
argv[25]: -Xoss1m
argv[26]: -XX:PermSize=512m
argv[27]: -XX:MaxPermSize=512m
argv[28]: -XX:ReservedCodeCacheSize=128m
argv[29]: -XX:+HeapDumpOnOutOfMemoryError
argv[30]: -XX:+AggressiveOpts
argv[31]: -Dnetworkaddress.cache.ttl=3600
argv[32]: -Dcom.sun.management.jmxremote.port=11883
argv[33]: -Dcom.sun.management.jmxremote.ssl=false
argv[34]: -Dcom.sun.management.jmxremote.authenticate=false
argv[35]: -XX:+UseG1GC
argv[36]: -XX:MaxGCPauseMillis=200
argv[37]: -XX:+UseCompressedOops
argv[38]: -XX:+PrintFlagsFinal
argv[39]: -XX:+PrintGCTimeStamps
argv[40]: -XX:+PrintGCDetails
argv[41]: -Xloggc:/opt/redknee/product/cps/9_2_20/log/gcstats.log.27925
argv[42]: -verbose:gc
argv[43]: -XX:+UseLargePages
argv[44]: -XX:+MaxFDLimit
argv[48]: -XX:+UnlockExperimentalVMOptions
argv[49]: -XX:G1NewSizePercent=1
argv[50]: -XX:+ParallelRefProcEnabled
argv[51]: -XX:+DisableExplicitGC

Thanks,
Amit Mishra




