How to alert for heap fragmentation

Todd Lipcon todd at cloudera.com
Thu Oct 11 23:41:48 PDT 2012


Hi Ramki. Answers inline below:

On Thu, Oct 11, 2012 at 11:30 PM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:

>
> Todd, good question :-)
>
> @Jesper et al, do you know the answer to Todd's question? I agree that
> exposing all of these stats via suitable JMX/Mbean interfaces would be
> quite useful.... The other possibility would be to log in the manner of
> HP's gc logs (CSV format with suitable header), or jstat logs, so parsing
> cost would be minimal. Then higher level, general tools like Kafka could
> consume the log/event streams, apply suitable filters and inform/alert
> interested monitoring agents.
>
>
Parsing CSV is one possibility, but somewhat painful, because you have all
the usual issues with log rolling, compatibility between versions, etc.
Certainly better than parsing the internal dump format that
PrintFLSStatistics exposes at the moment, though :)


> @Todd & Saroj: Can you perhaps give some scenarios on how you might make
> use of information such as this (more concretely say CMS fragmentation at a
> specific JVM)? Would it be used only for "read-only" monitoring and
> alerting, or do you see this as part of an automated data-centric control
> system of sorts. The answer is kind of important, because something like
> the latter can be accomplished today via gc log parsing (however kludgey
> that might be) and something like Kafka/Zookeeper. On the other hand, I am
> not sure if the latency of that kind of thing would fit well into a more
> automated and fast-reacting data center control system or load-balancer
> where a more direct JMX/MBean like interface might work better. Or was your
> interest purely of the "development-debugging-performance-measurement"
> kind, rather than of production JVMs? Anyway, thinking out loud here...
>
>
Just to give some context, one of the main products where I work is
software which monitors large Hadoop clusters. Most of our daemons are
low-heap, but a few, notably the HBase Region Server, can have large heaps
and suffer from fragmentation. I wrote a few blog posts about this last
year (starting here:
http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
)

So, if we can monitor fragmentation, I think there would be two useful
things we could do:

1) If we notice that the heap is becoming really fragmented, we know a full
GC is imminent. HBase has the capability to shift load between servers at
runtime -- so we could simply ask the load balancer to move all load off
the fragmented server, initiate a full GC manually, and then move the
regions back.

Less gracefully, we could have the server do a clean shutdown, which would
be handled by our normal fault tolerance. This is actually better than a
lengthy GC pause, because we can detect a clean shutdown immediately
whereas the GC pause will take 30+ seconds before various
heartbeats/sessions expire.
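
The drain-then-collect sequence could be sketched roughly as below. The three
callbacks are hypothetical hooks, not real HBase or JDK APIs; in practice the
GC trigger could be something like `jcmd <pid> GC.run`, which invokes
System.gc() in the target JVM:

```python
def preemptive_compact(server, drain, trigger_full_gc, restore):
    """Hypothetical orchestration of the sequence described above:
    move load off a fragmenting region server, run a full GC while it
    is idle, then move the load back. The callbacks stand in for real
    balancer and GC-trigger integrations. Returns the ordered list of
    steps taken, for logging/visibility."""
    steps = []
    drain(server)            # ask the balancer to move all regions off
    steps.append("drain")
    trigger_full_gc(server)  # e.g. shell out to: jcmd <pid> GC.run
    steps.append("gc")
    restore(server)          # move the regions back onto the server
    steps.append("restore")
    return steps
```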

2) Our monitoring software already measures various JVM metrics and exposes
them to operators (e.g. percentage of time spent in GC, heap usage after last
GC, etc). If an operator suspects that GC is an issue, he or she can watch
this metric or even set an alert. For some use cases, a
fragmentation-induced STW GC is nearly catastrophic. An administrator
should be able to quickly look at one of these metrics and tell whether the
fragmentation is stable or if it's creeping towards an STW, in which case
they need to re-evaluate GC tuning, live set size, etc.

Hope that helps with motivation for the feature.

-Todd


> On Thu, Oct 11, 2012 at 9:11 PM, Todd Lipcon <todd at cloudera.com> wrote:
>
>> Hey Ramki,
>>
>> Do you know if there's any plan to offer the FLS statistics as a metric
>> via JMX or some other interface in the future? It would be nice to be able
>> to monitor fragmentation without having to actually log and parse the gc
>> logs.
>>
>> -Todd
>>
>>
>> On Thu, Oct 11, 2012 at 7:50 PM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:
>>
>>> In the absence of fragmentation, one would normally expect the max chunk
>>> size of the CMS generation to stabilize at some reasonable value, say
>>> after some 10's of CMS GC cycles. If it doesn't, you should try a larger
>>> heap, or otherwise reshape the heap to reduce promotion rates. In my
>>> experience, CMS seems to work best if its "duty cycle" is on the order
>>> of 1-2%, i.e. there are 50 to 100 times more scavenges during the
>>> interval when it is not running than during the interval when it is
>>> running.
>>>
>>> Have Nagios grep the GC log file (generated with -XX:PrintFLSStatistics=2)
>>> for the string "Max  Chunk Size:" and pick the numeric component of
>>> every (4n+1)th match. The max chunk size will typically cycle within a
>>> small band once it has stabilized, always returning to a high value
>>> following a CMS cycle's completion. If the upper envelope of this keeps
>>> steadily declining over some 10's of CMS GC cycles, then you are
>>> probably seeing fragmentation that will eventually lead to a promotion
>>> failure (and a lengthy stop-the-world compacting collection).
>>>
>>> You can probably calibrate a threshold for the upper envelope so that if
>>> it falls below that threshold you will
>>> be alerted by Nagios that a closer look is in order.
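
The grep-and-pick step Ramki describes could look something like this (a
minimal sketch, assuming the log contains repeated "Max   Chunk Size:" lines
from the BinaryTreeDictionary dumps and that four dumps are printed per GC
event; field order and spacing may vary across JVM versions):

```python
import re

def max_chunk_series(log_text):
    """Extract 'Max Chunk Size' values from a CMS gc log produced with
    -XX:PrintFLSStatistics=2, keeping every (4n+1)th match -- i.e. the
    first of the four dictionary dumps printed around each GC event."""
    matches = re.findall(r"Max\s+Chunk\s+Size:\s*(-?\d+)", log_text)
    return [int(m) for m in matches[0::4]]

def below_threshold(series, threshold):
    """Alert condition: the most recent max-chunk-size sample has fallen
    under a calibrated threshold."""
    return bool(series) and series[-1] < threshold
```

A Nagios check could run this over the tail of the log and exit non-zero
when `below_threshold` fires.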
>>>
>>> At least something along those lines should work. The toughest part is
>>> designing your "filter" to detect the
>>> fall in the upper envelope. You will probably want to plot the metric,
>>> then see what kind of filter will detect
>>> the condition.... Sorry this isn't much concrete help, but hopefully it
>>> gives you some ideas to work in
>>> the right direction...
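
One simple filter along these lines (a sketch only, not a tuned detector --
the window size and drop fraction would need per-workload calibration against
a plot of the metric): compare the peak of the most recent window of samples
against the peak of the preceding window, and flag a sustained drop.

```python
def envelope_declining(samples, window=10, drop_frac=0.2):
    """Detect a fall in the upper envelope of max-chunk-size samples:
    flag if the peak of the last `window` samples has dropped by more
    than `drop_frac` relative to the peak of the preceding `window`."""
    if len(samples) < 2 * window:
        return False  # not enough history to judge a trend
    older = max(samples[-2 * window:-window])
    recent = max(samples[-window:])
    return recent < (1.0 - drop_frac) * older
```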
>>>
>>> -- ramki
>>>
>>> On Thu, Oct 11, 2012 at 4:27 PM, roz dev <rozdev29 at gmail.com> wrote:
>>>
>>>> Hi All
>>>>
>>>> I am using Java 6u23 with the CMS GC. I see that sometimes the
>>>> application gets paused for a long time because of excessive heap
>>>> fragmentation.
>>>>
>>>> I have enabled the PrintFLSStatistics flag, and the following is the log:
>>>>
>>>>
>>>> 2012-10-09T15:38:44.724-0400: 52404.306: [GC Before GC:
>>>> Statistics for BinaryTreeDictionary:
>>>> ------------------------------------
>>>> Total Free Space: -668151027
>>>> Max   Chunk Size: 1976112973
>>>> Number of Blocks: 175445
>>>> Av.  Block  Size: 20672
>>>> Tree      Height: 78
>>>> Before GC:
>>>> Statistics for BinaryTreeDictionary:
>>>> ------------------------------------
>>>> Total Free Space: 10926
>>>> Max   Chunk Size: 1660
>>>> Number of Blocks: 22
>>>> Av.  Block  Size: 496
>>>> Tree      Height: 7
>>>>
>>>>
>>>> I would like to know how people track heap fragmentation and how we
>>>> can alert on this situation.
>>>>
>>>> We use Nagios and I am wondering if there is a way to parse these logs
>>>> and know the max chunk size so that we can alert for it.
>>>>
>>>> Any inputs are welcome.
>>>>
>>>> -Saroj
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera