Multi-second ParNew collections but stable CMS
Gustav Åkesson
gustav.r.akesson at gmail.com
Mon Dec 22 20:45:20 UTC 2014
Hi,
Indeed, thank you for reporting back on this. TIL about THP, so to speak...
Best Regards,
Gustav Åkesson
On 22 Dec 2014 18:52, "charlie hunt" <charlie.hunt at oracle.com> wrote:
> Thanks for reporting back, and great to hear disabling THP has solved your
> multi-second minor GC issue. :-)
>
> charlie
>
> On Dec 22, 2014, at 11:49 AM, Ashwin Jayaprakash <
> ashwin.jayaprakash at gmail.com> wrote:
>
> All, I'm happy to report that disabling THP made a big difference. We do
> not see multi-second minor GCs in our cluster anymore.
>
> Thanks for your help.
>
> On Fri, Dec 19, 2014 at 2:10 PM, charlie hunt <charlie.hunt at oracle.com>
> wrote:
>
>> Disabling transparent huge pages should help with those GC pauses where
>> you are seeing high sys time reported, and should also shorten their
>> pause times.
>>
>> Thanks for also sharing your observation of khugepaged/pages_collapsed.
>>
>> charlie
>>
>> On Dec 18, 2014, at 4:41 PM, Ashwin Jayaprakash <
>> ashwin.jayaprakash at gmail.com> wrote:
>>
>> *@Charlie/@Holger* My apologies, THP is indeed enabled. I misread the
>> "never" and thought it was already disabled. In fact, "cat
>> /sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed" showed
>> 904, and an hour later it shows 6845 on one of our machines.
>>
>> *@Jon* I dug through some of our ElasticSearch/application logs again
>> and tried to correlate them with the GC logs. The collection time does seem
>> to match the GC log's "real" time. However, the collected sizes don't seem
>> to match, which is what threw me off.
>>
>> *Item 1:*
>>
>> 2014-12-18T21:34:55.837+0000: 163793.979: [ParNew
>> Desired survivor size 56688640 bytes, new threshold 6 (max 6)
>> - age 1: 31568024 bytes, 31568024 total
>> - age 2: 1188576 bytes, 32756600 total
>> - age 3: 1830920 bytes, 34587520 total
>> - age 4: 282536 bytes, 34870056 total
>> - age 5: 316640 bytes, 35186696 total
>> - age 6: 249856 bytes, 35436552 total
>> : 931773K->49827K(996800K), 1.3622770 secs]
>> 22132844K->21256042K(31633280K)After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 1206815932
>> Max Chunk Size: 1206815932
>> Number of Blocks: 1
>> Av. Block Size: 1206815932
>> Tree Height: 1
>> After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 6189459
>> Max Chunk Size: 6188544
>> Number of Blocks: 3
>> Av. Block Size: 2063153
>> Tree Height: 2
>> , 1.3625110 secs] [Times: user=15.92 sys=1.36, real=1.36 secs]
>> 2014-12-18T21:34:57.200+0000: 163795.342: Total time for which
>> application threads were stopped: 1.3638970 seconds
>>
>>
>> [2014-12-18T21:34:57,203Z] [WARN ]
>> [elasticsearch[server00001][scheduler][T#1]]
>> [org.elasticsearch.monitor.jvm] [server00001] [gc][young][163563][20423] duration
>> [1.3s], collections [1]/[2.1s], total [1.3s]/[17.9m], memory
>> [20.7gb]->[20.2gb]/[30.1gb], all_pools {[young]
>> [543.2mb]->[1mb]/[865.3mb]}{[survivor] [44.6mb]->[48.6mb]/[108.1mb]}{[old]
>> [20.2gb]->[20.2gb]/[29.2gb]}
>>
>>
>> *Item 2:*
>>
>> 2014-12-18T20:53:35.011+0000: 161313.153: [ParNew
>> Desired survivor size 56688640 bytes, new threshold 6 (max 6)
>> - age 1: 32445776 bytes, 32445776 total
>> - age 3: 6068000 bytes, 38513776 total
>> - age 4: 1031528 bytes, 39545304 total
>> - age 5: 255896 bytes, 39801200 total
>> : 939702K->53536K(996800K), 2.9352940 secs]
>> 21501296K->20625925K(31633280K)After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 1287922158
>> Max Chunk Size: 1287922158
>> Number of Blocks: 1
>> Av. Block Size: 1287922158
>> Tree Height: 1
>> After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 6205476
>> Max Chunk Size: 6204928
>> Number of Blocks: 2
>> Av. Block Size: 3102738
>> Tree Height: 2
>> , 2.9355580 secs] [Times: user=33.82 sys=3.11, real=2.94 secs]
>> 2014-12-18T20:53:37.947+0000: 161316.089: Total time for which
>> application threads were stopped: 2.9367640 seconds
>>
>>
>> [2014-12-18T20:53:37,950Z] [WARN ]
>> [elasticsearch[server00001][scheduler][T#1]]
>> [org.elasticsearch.monitor.jvm] [server00001] [gc][young][161091][19838] duration
>> [2.9s], collections [1]/[3s], total [2.9s]/[17.3m], memory
>> [20.4gb]->[19.6gb]/[30.1gb], all_pools {[young]
>> [801.7mb]->[2.4mb]/[865.3mb]}{[survivor]
>> [52.3mb]->[52.2mb]/[108.1mb]}{[old] [19.6gb]->[19.6gb]/[29.2gb]}
>>
>>
>> *Item 3:*
>>
>> 2014-12-17T14:42:10.590+0000: 52628.731: [GCBefore GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: -966776244
>> Max Chunk Size: -966776244
>> Number of Blocks: 1
>> Av. Block Size: -966776244
>> Tree Height: 1
>> Before GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 530
>> Max Chunk Size: 268
>> Number of Blocks: 2
>> Av. Block Size: 265
>> Tree Height: 2
>> 2014-12-17T14:42:10.590+0000: 52628.731: [ParNew
>> Desired survivor size 56688640 bytes, new threshold 1 (max 6)
>> - age 1: 113315624 bytes, 113315624 total
>> : 996800K->110720K(996800K), 7.3511710 secs]
>> 5609422K->5065102K(31633280K)After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: -1009955715
>> Max Chunk Size: -1009955715
>> Number of Blocks: 1
>> Av. Block Size: -1009955715
>> Tree Height: 1
>> After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 530
>> Max Chunk Size: 268
>> Number of Blocks: 2
>> Av. Block Size: 265
>> Tree Height: 2
>> , 7.3514180 secs] [Times: user=36.50 sys=17.99, real=7.35 secs]
>> 2014-12-17T14:42:17.941+0000: 52636.083: Total time for which application
>> threads were stopped: 7.3525250 seconds
>>
>>
>> [2014-12-17T14:42:17,944Z] [WARN ]
>> [elasticsearch[prdaes04data03][scheduler][T#1]]
>> [org.elasticsearch.monitor.jvm] [prdaes04data03] [gc][young][52582][5110] duration
>> [7.3s], collections [1]/[7.6s], total [7.3s]/[4.2m], memory
>> [5.1gb]->[4.8gb]/[30.1gb], all_pools {[young]
>> [695.5mb]->[14.4mb]/[865.3mb]}{[survivor]
>> [108.1mb]->[108.1mb]/[108.1mb]}{[old] [4.3gb]->[4.7gb]/[29.2gb]}
>>
>>
>>
>>
>> On Thu, Dec 18, 2014 at 1:10 PM, charlie hunt <charlie.hunt at oracle.com>
>> wrote:
>>>
>>> The output:
>>>
>>> *cat /sys/kernel/mm/transparent_hugepage/enabled*
>>> [always] madvise never
>>>
>>>
>>> This tells me that transparent huge pages are enabled “always”.
>>>
>>> I think I would change this to “never”, even though sysctl -a may be
>>> reporting that no huge pages are currently in use. The system may be trying
>>> to coalesce pages occasionally in an attempt to make huge pages available,
>>> even though you are not currently using any.
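>>>
>>> In case it is useful, switching THP off at runtime is usually something
>>> like the following (sysfs paths can differ slightly between
>>> distributions, so treat this as a sketch to adapt):
>>>
>>> # stop the kernel from handing out transparent huge pages
>>> echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
>>> # also avoid compaction stalls on huge page allocation attempts
>>> echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
>>>
>>> To make it stick across reboots, transparent_hugepage=never can be
>>> added to the kernel boot parameters instead.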
>>>
>>> charlie
>>>
>>>
>>> On Dec 18, 2014, at 2:00 PM, Ashwin Jayaprakash <
>>> ashwin.jayaprakash at gmail.com> wrote:
>>>
>>> *@Jon*, thanks for clearing that up. Yes, that was my source of
>>> confusion. I was misreading the user time as the real time.
>>>
>>> *Jon's reply from an offline conversation:*
>>>
>>>> Are these the 7 second collections you refer to in the paragraph above?
>>>> If yes, the "user" time is the sum of the time spent by multiple GC
>>>> threads.
>>>> The real time is the GC pause time that your application experiences.
>>>> In the above case the GC pauses are .65s, 1.10s and .67s.
>>>>
>>>
>>> Something that added to my confusion was the tools we are using
>>> in-house. In addition to the GC logs, we have one tool that uses the
>>> GarbageCollectorMXBean's getCollectionTime() method. This does not seem to
>>> match the values I see in the GC logs (
>>> http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/GarbageCollectorMXBean.html#getCollectionTime%28%29
>>> ).
>>>
>>> The other tool is the ElasticSearch JVM stats logger, which uses the
>>> GarbageCollectorMXBean's LastGcInfo (
>>> https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/monitor/jvm/JvmStats.java#L194
>>> and
>>> https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/monitor/jvm/JvmMonitorService.java#L187
>>> ).
>>>
>>> Do these methods expose the total time spent by all the parallel GC
>>> threads for the ParNew pool or the "real" time? They do not seem to match
>>> the GC log times.
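>>>
>>> For reference, here is a minimal sketch of what both tools boil down to
>>> (the com.sun.management subtype is HotSpot-specific, and the class name
>>> is just for illustration):
>>>
>>> import java.lang.management.GarbageCollectorMXBean;
>>> import java.lang.management.ManagementFactory;
>>>
>>> public class GcTimes {
>>>     public static void main(String[] args) {
>>>         for (GarbageCollectorMXBean gc :
>>>                 ManagementFactory.getGarbageCollectorMXBeans()) {
>>>             // getCollectionTime() is documented as the approximate
>>>             // accumulated *elapsed* time in milliseconds, i.e. it is
>>>             // cumulative across all collections of this collector.
>>>             System.out.printf("%s: count=%d time=%dms%n",
>>>                     gc.getName(), gc.getCollectionCount(),
>>>                     gc.getCollectionTime());
>>>             // On HotSpot the bean also implements the
>>>             // com.sun.management subtype with per-collection details.
>>>             if (gc instanceof com.sun.management.GarbageCollectorMXBean) {
>>>                 com.sun.management.GcInfo info =
>>>                         ((com.sun.management.GarbageCollectorMXBean) gc)
>>>                                 .getLastGcInfo();
>>>                 if (info != null) {
>>>                     // GcInfo.getDuration() is likewise elapsed time.
>>>                     System.out.printf("  last GC: %dms%n",
>>>                             info.getDuration());
>>>                 }
>>>             }
>>>         }
>>>     }
>>> }
>>>
>>> Since the javadoc describes both values as elapsed time, I would expect
>>> them to track the "real" column rather than the summed "user" time;
>>> getCollectionTime() being cumulative and millisecond-granular may be
>>> part of why the numbers don't line up with individual log entries.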
>>>
>>> *@Gustav* We do not have any swapping on the machines. It could be the
>>> disk IO experienced by the GC log writer itself, as you've suspected. The
>>> machine has 128G of RAM.
>>>
>>> *"top" sample from a similar machine:*
>>>    PID USER   PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+ COMMAND
>>> 106856 xxx    20   0 68.9g  25g 9.9g S  72.4 21.1  2408:05 java
>>>
>>>
>>> *"free -g":*
>>>              total  used  free  shared  buffers  cached
>>> Mem:           120   119     0       0        0      95
>>> -/+ buffers/cache:    23    96
>>> Swap:            0     0     0
>>>
>>> *@Charlie* Huge pages have already been disabled:
>>>
>>> *sudo sysctl -a | grep hugepage*
>>> vm.nr_hugepages = 0
>>> vm.nr_hugepages_mempolicy = 0
>>> vm.hugepages_treat_as_movable = 0
>>> vm.nr_overcommit_hugepages = 0
>>>
>>> *cat /sys/kernel/mm/transparent_hugepage/enabled*
>>> [always] madvise never
>>>
>>>
>>> Thanks all!
>>>
>>>
>>>
>>> On Wed, Dec 17, 2014 at 5:00 PM, Jon Masamitsu <jon.masamitsu at oracle.com
>>> > wrote:
>>>>
>>>> Ashwin,
>>>>
>>>> On 12/16/2014 8:47 PM, Ashwin Jayaprakash wrote:
>>>>
>>>> Hi, we have a cluster of ElasticSearch servers running with 31G heap
>>>> and OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode).
>>>>
>>>> While our old gen seems to be very stable with about 40% usage and no
>>>> Full GCs so far, our young gen keeps growing from ~50MB to 850MB every few
>>>> seconds. These ParNew collections are taking anywhere between 1-7 seconds
>>>> and are causing some of our requests to time out. The eden space keeps
>>>> filling up and is then cleared every 30-60 seconds. There is definitely
>>>> work being done by our JVM in terms of caching/buffering objects for a few
>>>> seconds, writing to disk and then clearing the objects (typical
>>>> Lucene/ElasticSearch indexing and querying workload).
>>>>
>>>>
>>>> From your recent mail:
>>>>
>>>> Times: user=7.89 sys=0.55, real=0.65 secs]
>>>> Times: user=7.71 sys=4.59, real=1.10 secs]
>>>> Times: user=7.46 sys=0.32, real=0.67 secs]
>>>>
>>>> Are these the 7 second collections you refer to in the paragraph above?
>>>> If yes, the "user" time is the sum of the time spent by multiple GC
>>>> threads.
>>>> The real time is the GC pause time that your application experiences.
>>>> In the above case the GC pauses are .65s, 1.10s and .67s.
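>>>>
>>>> (A rough sanity check: user divided by real approximates the number of
>>>> busy GC threads, e.g. 7.89 / 0.65 is about 12, so ParNew is likely
>>>> running with roughly a dozen parallel workers here.)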
>>>>
>>>> Comment below regarding "eden space keeps filling up".
>>>>
>>>>
>>>> These long pauses are not good for our server throughput and I was
>>>> doing some reading. I got some conflicting reports on how Cassandra is
>>>> configured compared to Hadoop. There are also references
>>>> <http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/>
>>>> to this old ParNew+CMS bug
>>>> <http://bugs.java.com/view_bug.do?bug_id=6459113> which I thought
>>>> would've been addressed in the JRE version we are using. Cassandra
>>>> recommends <http://aryanet.com/blog/cassandra-garbage-collector-tuning>
>>>> a larger NewSize with just 1 for max tenuring, whereas Hadoop
>>>> recommends <http://wiki.apache.org/hadoop/PerformanceTuning> a small
>>>> NewSize.
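>>>>
>>>> For concreteness, the Cassandra-style settings would amount to flags
>>>> roughly like the following (the size is illustrative, not a
>>>> recommendation):
>>>>
>>>> -Xmn2g -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>>>>
>>>> whereas the Hadoop advice would shrink -Xmn instead.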
>>>>
>>>> Since most of our allocations seem to be quite short lived, is there
>>>> a way to avoid these "long" young gen pauses?
>>>>
>>>> Thanks in advance. Here are some details.
>>>>
>>>> * Heap settings:*
>>>> java -Xmx31000m -Xms31000m
>>>> -Xss512k -XX:MaxPermSize=512m -XX:ReservedCodeCacheSize=64m
>>>> -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly
>>>> -XX:CMSInitiatingOccupancyFraction=70
>>>> -XX:+UseParNewGC -XX:+ScavengeBeforeFullGC -XX:+PrintPromotionFailure
>>>> -XX:+PrintClassHistogramBeforeFullGC -XX:+PrintTenuringDistribution
>>>> -XX:GCLogFileSize=512m -XX:NumberOfGCLogFiles=2 -XX:+PrintGCDateStamps
>>>> -XX:+UseGCLogFileRotation -XX:+DisableExplicitGC
>>>> -XX:PrintFLSStatistics=1 -XX:+PrintGCDetails
>>>>
>>>>
>>>> * Last few lines of "kill -3 pid" output:*
>>>> Heap
>>>> par new generation total 996800K, used 865818K [0x00007fa18e800000,
>>>> 0x00007fa1d2190000, 0x00007fa1d2190000)
>>>> eden space 886080K, 94% used [0x00007fa18e800000,
>>>> 0x00007fa1c1a659e0, 0x00007fa1c4950000)
>>>> from space 110720K, 25% used [0x00007fa1cb570000,
>>>> 0x00007fa1cd091078, 0x00007fa1d2190000)
>>>> to space 110720K, 0% used [0x00007fa1c4950000,
>>>> 0x00007fa1c4950000, 0x00007fa1cb570000)
>>>> concurrent mark-sweep generation total 30636480K, used 12036523K
>>>> [0x00007fa1d2190000, 0x00007fa920000000, 0x00007fa920000000)
>>>> concurrent-mark-sweep perm gen total 128856K, used 77779K
>>>> [0x00007fa920000000, 0x00007fa927dd6000, 0x00007fa940000000)
>>>>
>>>> *Sample gc log:*
>>>> 2014-12-11T23:32:16.121+0000: 710.618: [ParNew
>>>> Desired survivor size 56688640 bytes, new threshold 6 (max 6)
>>>> - age 1: 2956312 bytes, 2956312 total
>>>> - age 2: 591800 bytes, 3548112 total
>>>> - age 3: 66216 bytes, 3614328 total
>>>> - age 4: 270752 bytes, 3885080 total
>>>> - age 5: 615472 bytes, 4500552 total
>>>> - age 6: 358440 bytes, 4858992 total
>>>> : 900635K->8173K(996800K), 0.0317340 secs]
>>>> 1352217K->463460K(31633280K)After GC:
>>>>
>>>>
>>>> In this GC, eden is at 900635k before the GC and at 8173k after. That
>>>> eden fills up is the expected behavior. Is that what you were asking
>>>> about above? If not, can you send me an example of the "fills up"
>>>> behavior.
>>>>
>>>> Jon
>>>>
>>>> Statistics for BinaryTreeDictionary:
>>>> ------------------------------------
>>>> Total Free Space: -433641480
>>>> Max Chunk Size: -433641480
>>>> Number of Blocks: 1
>>>> Av. Block Size: -433641480
>>>> Tree Height: 1
>>>> After GC:
>>>> Statistics for BinaryTreeDictionary:
>>>> ------------------------------------
>>>> Total Free Space: 1227
>>>> Max Chunk Size: 631
>>>> Number of Blocks: 3
>>>> Av. Block Size: 409
>>>> Tree Height: 3
>>>> , 0.0318920 secs] [Times: user=0.38 sys=0.01, real=0.03 secs]
>>>>
>>>>
>>>> Ashwin Jayaprakash.
>>>>
>>>>
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>>
>>>>
>>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>>>
>>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
>>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>