Multi-second ParNew collections but stable CMS
Jon Masamitsu
jon.masamitsu at oracle.com
Thu Dec 18 20:23:56 UTC 2014
On 12/18/2014 12:00 PM, Ashwin Jayaprakash wrote:
> *@Jon*, thanks for clearing that up. Yes, that was my source of
> confusion. I was misinterpreting the user time with the real time.
>
> *Jon's reply from an offline conversation:*
>
> Are these the 7 second collections you refer to in the paragraph
> above?
> If yes, the "user" time is the sum of the time spent by multiple
> GC threads.
> The real time is the GC pause time that your application experiences.
> In the above case the GC pauses are .65s, 1.10s and .67s.
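
For concreteness, the ratio of user time to real time gives a rough sense of
how many parallel GC threads were at work (a back-of-the-envelope figure, not
an exact thread count):

    user / real = 7.89 s / 0.65 s ~= 12

so ParNew here appears to be running with roughly a dozen parallel GC threads,
which is why the summed "user" time looks so much larger than the pause
("real") time the application actually sees.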
>
>
> Something that added to my confusion was the tools we are using
> in-house. In addition to the GC logs we have 1 tool that uses the
> GarbageCollectorMXBean's getCollectionTime() method. This does not
> seem to match the values I see in the GC logs
> (http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/GarbageCollectorMXBean.html#getCollectionTime%28%29).
>
> The other tool is the ElasticSearch JVM stats logger which uses
> GarbageCollectorMXBean's LastGCInfo
> (https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/monitor/jvm/JvmStats.java#L194
> and
> https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/monitor/jvm/JvmMonitorService.java#L187).
>
> Do these methods expose the total time spent by all the parallel GC
> threads for the ParNew pool or the "real" time? They do not seem to
> match the GC log times.
I haven't found the JVM code that provides the information for
getCollectionTime(), but I would expect it to match the pause times in the
GC logs. From your earlier mail:
> *Sample gc log:*
> 2014-12-11T23:32:16.121+0000: 710.618: [ParNew
> Desired survivor size 56688640 bytes, new threshold 6 (max 6)
> - age 1: 2956312 bytes, 2956312 total
> - age 2: 591800 bytes, 3548112 total
> - age 3: 66216 bytes, 3614328 total
> - age 4: 270752 bytes, 3885080 total
> - age 5: 615472 bytes, 4500552 total
> - age 6: 358440 bytes, 4858992 total
> : 900635K->8173K(996800K), 0.0317340 secs]
> 1352217K->463460K(31633280K)After GC:
The time 0.0317340 is what the GC measures and is what I would expect to be
available through the MXBeans. You're saying that does not match?
Jon
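
A quick way to check this directly is to poll both numbers from the same
MXBean and compare them against the log: getCollectionTime() is the
cumulative, approximate elapsed time across all collections of a collector,
while LastGcInfo (on the com.sun.management subtype that ElasticSearch reads)
carries the duration of the most recent collection. A minimal sketch (plain
JDK; the 5-second polling interval is arbitrary):

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcTimeProbe {
        public static void main(String[] args) throws Exception {
            while (true) {
                for (GarbageCollectorMXBean gc :
                        ManagementFactory.getGarbageCollectorMXBeans()) {
                    // Cumulative elapsed time (ms) over all collections of this
                    // collector, e.g. "ParNew" or "ConcurrentMarkSweep".
                    System.out.printf("%s: collections=%d, total=%d ms%n",
                            gc.getName(), gc.getCollectionCount(),
                            gc.getCollectionTime());

                    // Duration (ms) of the most recent collection, if any.
                    if (gc instanceof com.sun.management.GarbageCollectorMXBean) {
                        com.sun.management.GcInfo last =
                                ((com.sun.management.GarbageCollectorMXBean) gc)
                                        .getLastGcInfo();
                        if (last != null) {
                            System.out.printf("  last GC: %d ms%n",
                                    last.getDuration());
                        }
                    }
                }
                Thread.sleep(5000);
            }
        }
    }

If the deltas in the cumulative counter track the per-pause "real" times in
the log to within rounding, the mismatch is more likely in how the in-house
tool samples or diffs the counter than in the JVM itself.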
> *@Gustav* We do not have any swapping on the machines. It could be the
> disk IO experienced by the GC log writer itself, as you've suspected.
> The machine has 128G of RAM.
>
> *"top" sample from a similar machine:*
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 106856 xxx 20 0 68.9g 25g 9.9g S 72.4 21.1
> 2408:05 java
>
> *"free -g":
> *
> total used free shared buffers cached
> Mem: 120 119 0 0 0 95
> -/+ buffers/cache: 23 96
> Swap: 0 0 0
>
> *@Charlie* Hugepages have already been disabled.
>
> *sudo sysctl -a | grep hugepage*
> vm.nr_hugepages = 0
> vm.nr_hugepages_mempolicy = 0
> vm.hugepages_treat_as_movable = 0
> vm.nr_overcommit_hugepages = 0
>
> *cat /sys/kernel/mm/transparent_hugepage/enabled*
> [always] madvise never
>
>
> Thanks all!
>
>
>
> On Wed, Dec 17, 2014 at 5:00 PM, Jon Masamitsu
> <jon.masamitsu at oracle.com <mailto:jon.masamitsu at oracle.com>> wrote:
>
> Ashwin,
>
> On 12/16/2014 8:47 PM, Ashwin Jayaprakash wrote:
>> Hi, we have a cluster of ElasticSearch servers running with 31G
>> heap and OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode).
>>
>> While our old gen seems to be very stable with about 40% usage
>> and no Full GCs so far, our young gen keeps growing from ~50MB to
>> 850MB every few seconds. These ParNew collections are taking
>> anywhere between 1 and 7 seconds and are causing some of our requests
>> to time out. The eden space keeps filling up and is then cleared
>> every 30-60 seconds. There is definitely work being done by our
>> JVM in terms of caching/buffering objects for a few seconds,
>> writing to disk and then clearing the objects (a typical
>> Lucene/ElasticSearch indexing and querying workload).
>
> From your recent mail:
>
> Times: user=7.89 sys=0.55, real=0.65 secs]
> Times: user=7.71 sys=4.59, real=1.10 secs]
> Times: user=7.46 sys=0.32, real=0.67 secs]
>
> Are these the 7 second collections you refer to in the paragraph
> above?
> If yes, the "user" time is the sum of the time spent by multiple
> GC threads.
> The real time is the GC pause time that your application experiences.
> In the above case the GC pauses are .65s, 1.10s and .67s.
>
> Comment below regarding "eden space keeps filling up".
>
>>
>> These long pauses are not good for our server throughput and I
>> was doing some reading. I got some conflicting reports on how
>> Cassandra is configured compared to Hadoop. There are also
>> references
>> <http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/>
>> to this old ParNew+CMS bug
>> <http://bugs.java.com/view_bug.do?bug_id=6459113> which I thought
>> would've been addressed in the JRE version we are using.
>> Cassandra recommends
>> <http://aryanet.com/blog/cassandra-garbage-collector-tuning> a
>> larger NewSize with just 1 for max tenuring, whereas Hadoop
>> recommends <http://wiki.apache.org/hadoop/PerformanceTuning> a
>> small NewSize.
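
For reference, the Cassandra-style settings pointed to above boil down to
flags along these lines (the 2g figure is just a placeholder, not a tuned
value for this workload):

    -Xmn2g                        (explicitly sized, larger young gen)
    -XX:MaxTenuringThreshold=1    (promote survivors after a single collection)
    -XX:SurvivorRatio=8

whereas the Hadoop guidance goes the other way and keeps NewSize small, so
that each ParNew pause has less live data to scan and copy.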
>>
>> Since most of our allocations seem to be quite short lived, is
>> there a way to avoid these "long" young gen pauses?
>>
>> Thanks in advance. Here are some details.
>> *Heap settings:*
>> java -Xmx31000m -Xms31000m
>> -Xss512k -XX:MaxPermSize=512m -XX:ReservedCodeCacheSize=64m
>> -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:CMSInitiatingOccupancyFraction=70
>> -XX:+UseParNewGC -XX:+ScavengeBeforeFullGC
>> -XX:+PrintPromotionFailure
>> -XX:+PrintClassHistogramBeforeFullGC -XX:+PrintTenuringDistribution
>> -XX:GCLogFileSize=512m -XX:NumberOfGCLogFiles=2
>> -XX:+PrintGCDateStamps
>> -XX:+UseGCLogFileRotation -XX:+DisableExplicitGC
>> -XX:PrintFLSStatistics=1 -XX:+PrintGCDetails
>>
>> *Last few lines of "kill -3 pid" output:*
>> Heap
>> par new generation total 996800K, used 865818K
>> [0x00007fa18e800000, 0x00007fa1d2190000, 0x00007fa1d2190000)
>> eden space 886080K, 94% used [0x00007fa18e800000,
>> 0x00007fa1c1a659e0, 0x00007fa1c4950000)
>> from space 110720K, 25% used [0x00007fa1cb570000,
>> 0x00007fa1cd091078, 0x00007fa1d2190000)
>> to space 110720K, 0% used [0x00007fa1c4950000,
>> 0x00007fa1c4950000, 0x00007fa1cb570000)
>> concurrent mark-sweep generation total 30636480K, used 12036523K
>> [0x00007fa1d2190000, 0x00007fa920000000, 0x00007fa920000000)
>> concurrent-mark-sweep perm gen total 128856K, used 77779K
>> [0x00007fa920000000, 0x00007fa927dd6000, 0x00007fa940000000)
>>
>> *Sample gc log:*
>> 2014-12-11T23:32:16.121+0000: 710.618: [ParNew
>> Desired survivor size 56688640 bytes, new threshold 6 (max 6)
>> - age 1: 2956312 bytes, 2956312 total
>> - age 2: 591800 bytes, 3548112 total
>> - age 3: 66216 bytes, 3614328 total
>> - age 4: 270752 bytes, 3885080 total
>> - age 5: 615472 bytes, 4500552 total
>> - age 6: 358440 bytes, 4858992 total
>> : 900635K->8173K(996800K), 0.0317340 secs]
>> 1352217K->463460K(31633280K)After GC:
>
> In this GC, eden is at 900635k before the GC and at 8173k after.
> That eden fills up is the expected behavior. Is that what you were
> asking about above? If not, can you send me an example of the
> "fills up" behavior.
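
For reading that line, the ParNew record breaks down as:

    : 900635K->8173K(996800K), 0.0317340 secs]   young gen occupancy before -> after (young gen capacity), pause time
    1352217K->463460K(31633280K)                 whole heap occupancy before -> after (total heap capacity)

so the 900635K that "fills up" is the young generation draining back to 8173K
on each collection, while in this sample the whole heap drops from 1352217K to
463460K out of a 31633280K capacity.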
>
> Jon
>
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: -433641480
>> Max Chunk Size: -433641480
>> Number of Blocks: 1
>> Av. Block Size: -433641480
>> Tree Height: 1
>> After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 1227
>> Max Chunk Size: 631
>> Number of Blocks: 3
>> Av. Block Size: 409
>> Tree Height: 3
>> , 0.0318920 secs] [Times: user=0.38 sys=0.01, real=0.03 secs]
>>
>>
>> Ashwin Jayaprakash.
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use