Spikes in minor GC duration caused by some unknown JVM activity before the GC

Vitaly Davidovich vitalyd at gmail.com
Tue Dec 3 06:26:45 PST 2013


Is anything else running on this host that consumes non-trivial CPU time?

Sent from my phone
On Dec 3, 2013 7:51 AM, "Bartek Markocki" <bartosz.markocki at gmail.com>
wrote:

> Hello all,
>
> We have been running performance tests of a new release of one of our
> applications. What caught our attention were sporadic spikes in the minor GC
> times.
>
> By default the application uses ParallelGC, which did not let us distinguish
> an inflated minor GC time from something else happening inside the pause.
> We therefore switched to ParNew and enabled safepoint statistics together
> with safepoint cleanup tracing, as shown below.
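>
> For reference, these are the flags we enabled for that (repeated from the
> full settings at the end of this mail):
>
>   -XX:+UseParNewGC
>   -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
>   -XX:+TraceSafepointCleanupTime
>   -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime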
>
> Based on the above we got data showing that the duration of the
> stop-the-world pause is driven not by the minor GC itself but by something
> else.
>
> Here is the extract from the GC log file:
> 2013-12-02T02:16:29.993-0600: 45476.110: Total time for which application threads were stopped: 0.0304740 seconds
> 2013-12-02T02:16:30.772-0600: 45476.889: Application time: 0.7794150 seconds
> {Heap before GC invocations=53652 (full 13):
>  par new generation   total 629120K, used 565117K [0x538f0000, 0x7e390000, 0x7e390000)
>   eden space 559232K, 100% used [0x538f0000, 0x75b10000, 0x75b10000)
>   from space 69888K,   8% used [0x75b10000, 0x760cf500, 0x79f50000)
>   to   space 69888K,   0% used [0x79f50000, 0x79f50000, 0x7e390000)
>  tenured generation   total 1398144K, used 1077208K [0x7e390000, 0xd38f0000, 0xd38f0000)
>    the space 1398144K,  77% used [0x7e390000, 0xbff86080, 0xbff86200, 0xd38f0000)
>  compacting perm gen  total 262144K, used 97801K [0xd38f0000, 0xe38f0000, 0xf38f0000)
>    the space 262144K,  37% used [0xd38f0000, 0xd9872690, 0xd9872800, 0xe38f0000)
> No shared spaces configured.
> 2013-12-02T02:16:30.787-0600: 45476.904: [GC2013-12-02T02:16:32.776-0600: 45478.893: [ParNew
> Desired survivor size 35782656 bytes, new threshold 15 (max 15)
> - age   1:    1856072 bytes,    1856072 total
> - age   2:     170120 bytes,    2026192 total
> - age   3:     232696 bytes,    2258888 total
> - age   4:     180136 bytes,    2439024 total
> - age   5:     235120 bytes,    2674144 total
> - age   6:     242976 bytes,    2917120 total
> - age   7:     231728 bytes,    3148848 total
> - age   8:     149976 bytes,    3298824 total
> - age   9:     117904 bytes,    3416728 total
> - age  10:     126936 bytes,    3543664 total
> - age  11:     126624 bytes,    3670288 total
> - age  12:     114256 bytes,    3784544 total
> - age  13:     146760 bytes,    3931304 total
> - age  14:     163808 bytes,    4095112 total
> - age  15:     171664 bytes,    4266776 total
> : 565117K->5057K(629120K), 0.0629070 secs] 1642325K->1082393K(2027264K), 2.0523080 secs] [Times: user=0.07 sys=0.00, real=2.05 secs]
> Heap after GC invocations=53653 (full 13):
>  par new generation   total 629120K, used 5057K [0x538f0000, 0x7e390000, 0x7e390000)
>   eden space 559232K,   0% used [0x538f0000, 0x538f0000, 0x75b10000)
>   from space 69888K,   7% used [0x79f50000, 0x7a4404b8, 0x7e390000)
>   to   space 69888K,   0% used [0x75b10000, 0x75b10000, 0x79f50000)
>  tenured generation   total 1398144K, used 1077336K [0x7e390000, 0xd38f0000, 0xd38f0000)
>    the space 1398144K,  77% used [0x7e390000, 0xbffa6080, 0xbffa6200, 0xd38f0000)
>  compacting perm gen  total 262144K, used 97801K [0xd38f0000, 0xe38f0000, 0xf38f0000)
>    the space 262144K,  37% used [0xd38f0000, 0xd9872690, 0xd9872800, 0xe38f0000)
> No shared spaces configured.
> }
> 2013-12-02T02:16:32.839-0600: 45478.957: Total time for which application threads were stopped: 2.0675060 seconds
>
> Please note the difference between the start of the STW pause (45476.904)
> and the start of the ParNew collection itself (45478.893).
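>
> Gaps like this can be hunted systematically by comparing the two uptime
> stamps on each ParNew line. A minimal sketch in Python, assuming the exact
> line format shown above (the 0.2s cutoff mirrors the 200ms threshold used
> for the list further down):
>
>   import re, sys
>
>   # Reads the GC log on stdin. On "<date>: <uptime1>: [GC<date>: <uptime2>:
>   # [ParNew ..." lines, uptime2 - uptime1 is the time spent inside the
>   # safepoint before the ParNew collection actually started.
>   PARNEW = re.compile(r'(\d+\.\d+): \[GC.*?(\d+\.\d+): \[ParNew')
>
>   for line in sys.stdin:
>       m = PARNEW.search(line)
>       if m:
>           gc_start, parnew_start = map(float, m.groups())
>           gap = parnew_start - gc_start
>           if gap > 0.2:
>               print('%10.3f: %.3f secs missing before ParNew' % (gc_start, gap))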
>
> And here is the output from the safepoint statistics:
> 45476.086: [deflating idle monitors, 0.0010070 secs]
> 45476.087: [updating inline caches, 0.0000000 secs]
> 45476.087: [compilation policy safepoint handler, 0.0005410 secs]
> 45476.088: [sweeping nmethods, 0.0000020 secs]
>          vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
> 45476.078: GenCollectForAllocation          [     294          3              6    ]      [     4     1     6     1    21    ]  3
> 45476.902: [deflating idle monitors, 0.0010780 secs]
> 45476.903: [updating inline caches, 0.0000010 secs]
> 45476.903: [compilation policy safepoint handler, 0.0005200 secs]
> 45476.904: [sweeping nmethods, 0.0000020 secs]
>          vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
> 45476.891: GenCollectForAllocation          [     294          2              3    ]      [     9     3    13     1  2052    ]  2
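>
> If I read the columns right, the numbers add up: sync (13ms) + cleanup (1ms)
> + vmop (2052ms) = 2066ms, which matches the 2.0675-second stopped time
> above. So nearly the entire stop was spent inside the VM operation itself,
> not in bringing threads to the safepoint.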
>
> So, if my reading is correct, we are missing around 1.9 seconds before the
> GC started: the whole GC event took 2.0523 secs, but the ParNew copy itself
> accounts for only 0.0629 secs of that.
> Based on the above it is unclear to me where this time was spent.
>
> The test ran for 40 hours.
> Over those 40 hours we had 26 incidents where the difference exceeded 200ms.
> The above incident is the longest.
> Here they are (a sketch of a script that filters such rows follows the list):
> 3534.445: GenCollectForAllocation           [     307          0              2    ]      [     0     0     0     1  1935    ]  0
> 6403.822: GenCollectForAllocation           [     294          2              4    ]      [     0     0     0     1  1683    ]  1
> 9146.663: GenCollectForAllocation           [     292          2              3    ]      [    24     0    25     2  1773    ]  2
> 12351.684: GenCollectForAllocation          [     293          4             10    ]      [     2     0     2     1  2044    ]  2
> 15746.592: GenCollectForAllocation          [     294          1              2    ]      [     0     2     2     1  1697    ]  0
> 16574.963: GenCollectForAllocation          [     295          1              1    ]      [     0     0     0     1  1224    ]  0
> 18337.686: GenCollectForAllocation          [     293          1              2    ]      [     1     0     1     1   418    ]  0
> 19142.518: GenCollectForAllocation          [     295          0              1    ]      [     0     0     0     1  1626    ]  0
> 20563.826: GenCollectForAllocation          [     296          6              6    ]      [     7     2     9     1   233    ]  4
> 22611.752: GenCollectForAllocation          [     294          4              4    ]      [     0     0     0     1  1584    ]  1
> 26043.520: GenCollectForAllocation          [     295          2              6    ]      [     6     4    11     1  1883    ]  1
> 29584.480: GenCollectForAllocation          [     292          3              5    ]      [     0     0     0     1  1788    ]  2
> 33119.441: GenCollectForAllocation          [     293          2              4    ]      [     3     0     3     1  1853    ]  2
> 34800.660: GenCollectForAllocation          [     294          2              4    ]      [     0     0     0     1   725    ]  0
> 36444.246: GenCollectForAllocation          [     293          1              0    ]      [     0     0     4     1  1815    ]  0
> 36656.730: GenCollectForAllocation          [     294          1              3    ]      [     0     0     0     1   905    ]  0
> 39751.609: GenCollectForAllocation          [     294          2              4    ]      [     3     0     3     1  2207    ]  1
> 41836.305: GenCollectForAllocation          [     293          2              2    ]      [     1     0     1     1   286    ]  1
> 43323.840: GenCollectForAllocation          [     293          0              1    ]      [     0     0     0     1  2006    ]  0
> 45476.891: GenCollectForAllocation          [     294          2              3    ]      [     9     3    13     1  2052    ]  2
> 46288.453: GenCollectForAllocation          [     295          0              2    ]      [     0     4     5     1   211    ]  0
> 47016.430: GenCollectForAllocation          [     294          4              4    ]      [     0     0     0     1  2408    ]  0
> 48662.230: GenCollectForAllocation          [     293          1              4    ]      [     0     0     0     1   315    ]  1
> 48907.250: GenCollectForAllocation          [     296          3              6    ]      [     0     0     0     1   421    ]  0
> 50662.195: GenCollectForAllocation          [     294          3              4    ]      [     0     0     0     1  2043    ]  1
> 54128.828: GenCollectForAllocation          [     295          2              2    ]      [     1     1     3     2  2660    ]  2
> 57729.141: GenCollectForAllocation          [     298          1              4    ]      [     0     0     0     1  1926    ]  0
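>
> A minimal sketch of such a filter (Python; it assumes one unwrapped row per
> vmop as printed above, with the vmop time, in milliseconds, as the fifth
> field of the time bracket):
>
>   import re, sys
>
>   # Reads the safepoint log on stdin. The time bracket
>   # "[ spin block sync cleanup vmop ]" holds five numbers, while the thread
>   # bracket holds only three, so this pattern matches only the former.
>   ROW = re.compile(r'\d+\.\d+: GenCollectForAllocation.*'
>                    r'\[\s*\d+\s+\d+\s+\d+\s+\d+\s+(\d+)\s*\]')
>
>   for line in sys.stdin:
>       m = ROW.search(line)
>       if m and int(m.group(1)) > 200:
>           print(line.rstrip())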
>
> We are on Java 7u40, 32-bit, on RHEL 5.5.
> The JVM settings are:
> -server -Xmx2048m -Xms2048m -XX:PermSize=256m -XX:MaxPermSize=512m
> -Dsun.rmi.dgc.client.gcInterval=1800000
> -Dsun.rmi.dgc.server.gcInterval=1800000 -XX:+DisableExplicitGC
> -XX:+UseParNewGC -XX:MaxGCPauseMillis=3000
> -verbose:gc -Xloggc:... -XX:+PrintGCDetails -XX:+PrintHeapAtGC
> -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution
> -XX:+TraceSafepointCleanupTime -XX:+PrintSafepointStatistics
> -XX:PrintSafepointStatisticsCount=1 -XX:+PrintGCApplicationConcurrentTime
> -XX:+PrintGCApplicationStoppedTime
> -Djava.net.preferIPv4Stack=true
>
> The question of the day is: what else can we do to diagnose the above?
>
> Thanks in advance,
> Bartek
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>