G1GC Full GCs

Wed Jul 7 17:09:13 PDT 2010

I plotted the heap used _after_ each gc (scavenge or full) etc.,
attached; and if you stand away from the plot, tilt your head to the right and
squint at the plot, you'll see what looks, at least to me, like a slow leak.
(The tell-tale slowly-rising lower envelope of your carrier-wave,
if you will pardon a telecom term.) Leaks can of course exacerbate
fragmentation in non-moving collectors such as CMS, but also possibly
in regionalized lazily evacuating collectors such as G1.

Perhaps jmap -histo:live or +PrintClassHistogram[AfterFullGC]
will help you get to the bottom of yr leak. Once the leak is plugged
perhaps we could come back to the G1 tuning effort? (We have some
guesses as to what might be happening and the best G1 minds are
chewing on the info you provided so far, for which thanks!)

-- ramki

On 07/07/10 11:56, Todd Lipcon wrote:
> On Wed, Jul 7, 2010 at 11:28 AM, Y. S. Ramakrishna 
> <y.s.ramakrishna at oracle.com <mailto:y.s.ramakrishna at oracle.com>> wrote:
> 
> 
> 
>     On 07/07/10 08:45, Todd Lipcon wrote:
>     ...
> 
> 
>         Overnight I saw one "concurrent mode failure".
> 
>     ...
> 
>         2010-07-07T07:56:27.786-0700: 28490.203: [GC 28490.203: [ParNew
>         (promotion failed): 59008K->59008K(59008K), 0.0179250
>         secs]28490.221: [CMS2010-07-07T07:56:27.901-0700: 28490.317:
>         [CMS-concurrent-preclean: 0.556/0.947 secs] [Times:
>          user=5.76 sys=0.26, real=0.95 secs]  (concurrent mode failure):
>         6359176K->4206871K(8323072K), 17.4366220 secs]
>         6417373K->4206871K(8382080K), [CMS Perm :
>         18609K->18565K(31048K)], 17.4546890 secs] [Times: user=11.17
>         sys=0.09, real=17.45 secs]
>         I've interpreted pauses like this as being caused by
>         fragmentation, since the young gen is 64M, and the old gen here
>         has about 2G free. If there's something I'm not understanding
>         about CMS, and I can tune it more smartly to avoid these longer
>         pauses, I'm happy to try.
> 
> 
>     Yes the old gen must be fragmented. I'll look at the data you have
>     made available (for CMS). The CMS log you uploaded does not have the
>     suffix leading into the concurrent mode failure ypu display above
>     (it stops less than 2500 s into the run). If you could include
>     the entire log leading into the concurrent mode failures, it would
>     be a great help. 
> 
> 
> Just uploaded the full log from the entire 11-hour run, all the way up 
> through the 218-second GC pause which caused the server to get kicked 
> out of the cluster (since it stopped heartbeating to the master)
> 
> http://cloudera-todd.s3.amazonaws.com/cms-full-gc-log.txt.gz
> 
> 
>     Do you have large arrays in your
>     application? 
> 
> 
> The primary heap consumers in the application are:
> - RPC buffers - in this case I'm configured for 40 RPC handlers, each of 
> which is usually handling a byte[] around 2-3MB for a "put". These 
> buffers then get passed along into the memstore:
> - Memstore - this is allocated 40% of the heap, and it's made up of some 
> hundreds of separate ConcurrentSkipListMaps. The values of the map are 
> small objects which contain offsets into to the byte[]s passed in above. 
> So, typically this is about 2GB of heap, corresponding to around a 
> million of the offset containers, and maybe 100 thousand of the actual 
> byte arrays.
> 
> These memstores are always being "flushed" to disk (basically we take 
> one of the maps and write it out, then drop references to the map to let 
> GC free up memory)
> 
> - LRU block cache - this is a large 
> ConcurrentHashMap<String,CachedBlock>, where a CachedBlock is basically 
> a wrapper for a ByteBuffer. These ByteBuffers represent around 64KB 
> each. Typically this is allocated 20% of the heap, so on the order of 
> 20,000 entries in the map here.
> 
> Eviction is done by manually accounting heap usage, and when it gets too 
> high, we remove blocks from the cache.
> 
> So to answer your question simply: there shouldn't be any byte arrays 
> floating around larger than 2MB, though there are a fair number at that 
> size and a fair number at 64KB. Can I use jmap or another program to do 
> any useful analysis?
>  
> 
>     The shape of the promotion graph for CMS is somewhat
>     jagged, indicating _perhaps_ that. Yes, +PrintTenuringDistribution
>     would shed a bit more light. 
> 
> 
> I'll restart the test with this option on and collect some more logs for 
> you guys.
>  
> 
>     As regards fragmentation, it can be
>     tricky to tune against, but we can try once we understand a bit
>     more about the object sizes and demographics.
> 
>     I am sure you don't have an easily shared test case, so we
>     can reproduce both the CMS fragmentation and the G1 full gc
>     issues locally for quickest progress on this?
> 
> Well, the project itself is open source, but to really get serious load 
> going into it you need beefy machines/disks. I'm running my tests on a 
> 5-node cluster of dual quad core Nehalems, 24G RAM, 12 disks each. I can 
> try to set up a mocked workload (eg skip actual disk IO) from the same 
> codebase, but it would be a fair bit of work and I don't think I can get 
> to it this month (leaving for vacation next week)
> 
> If it's useful to look at the source, here are some pointers to the 
> relevant RAM consumers:
> 
> Cache:
> http://github.com/apache/hbase/blob/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
> 
> MemStore:
> http://github.com/apache/hbase/blob/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
> 
> Wrapper class held by memstore:
> http://github.com/apache/hbase/blob/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
> 
> The class used by RPC to receive "Put" requests:
> http://github.com/apache/hbase/blob/trunk/src/main/java/org/apache/hadoop/hbase/client/Put.java
> 
> Thanks again for all the help, it's much appreciated.
> -Todd
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cms-full-useda.gif
Type: image/gif
Size: 50693 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20100707/86cfe7f5/attachment-0001.gif