Trace object allocation flag proposal

Wed Dec 15 20:47:25 PST 2010

Xiaobin Lu said the following on 12/16/10 04:00:
> Thanks for your feedback, David.
> 
> Let me try to clarify with some more background information.
> 
> One of the problems many applications have is the object allocation 
> problem (Duh ...). Over releases, GC becomes more and more frequent. It 
> is pretty hard for a single team to find out where things go 
> wrong. Existing tools such as jhat or jmap are very hard to use with a 
> giant heap dump file. So the flag I propose is mostly for diagnostic 
> purposes; it won't be enabled by default in production. Having said that, 
> if we make it a manageable flag, we can turn it on when GC goes wild.

I don't see this flag as solving the problem. If the heap is filled by a 
million object allocations then you will have a million trace records 
to process - as difficult as (and less informative than) processing a 
heap dump.

But just by turning on this tracing you will add so much overhead to the 
allocation (potentially) that you will likely completely change the 
allocation pattern and may allow GC to more easily keep up. Any 
mechanism has the potential to do this - even DTrace probes are not as 
unobtrusive as people make out when there are large quantities of 
events occurring. The key to success in these situations is having a 
means to identify the "interesting" events so that you only trace those.

> Another value this flag could provide is to find out why an OOM 
> sometimes occurs. For example, someone who wrote a search application 
> on top of the Lucene framework suddenly hit a crash due to OOM. The 
> stack trace points to Lucene code which is hard to instrument, so this 
> flag could provide insight into why the allocation fails. Maybe it is 
> due to a corrupted length value passed to "new byte[length];" etc.

A flag that enables you to track "unusual" allocations would seem of 
more use to detect this kind of condition.
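
As a rough illustration of what "tracing only the interesting events" could 
mean, here is a minimal sketch of the decision logic in Python. It is not 
HotSpot code, and every name in it (AllocationFilter, min_size_bytes, 
watched_classes, sample_every) is hypothetical. A real implementation would 
sit on the VM's allocation path and would have to be far cheaper than this.

```python
from dataclasses import dataclass, field


@dataclass
class AllocationFilter:
    """Decides whether a single allocation is worth a trace record.

    All fields are hypothetical knobs for illustration only.
    """
    min_size_bytes: int = 1 << 20              # always trace allocations this large
    watched_classes: frozenset = frozenset()   # always trace these class names
    sample_every: int = 0                      # trace 1 in N ordinary allocations; 0 = none
    _seen: int = field(default=0, repr=False)  # ordinary allocations seen so far

    def should_trace(self, class_name: str, size_bytes: int) -> bool:
        if size_bytes >= self.min_size_bytes:
            return True   # "unusual" by size, e.g. a corrupted array length
        if class_name in self.watched_classes:
            return True   # a type someone explicitly asked about
        if self.sample_every > 0:
            self._seen += 1
            return self._seen % self.sample_every == 0
        return False      # everything else is dropped, keeping the volume down


if __name__ == "__main__":
    f = AllocationFilter(min_size_bytes=1024, watched_classes=frozenset({"[B"}))
    for name, size in [("[C", 56), ("[B", 16), ("[C", 4096)]:
        print(name, size, f.should_trace(name, size))
```

With a filter along these lines the number of records is bounded by the 
number of unusual events rather than by the total allocation count.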

> I like your idea of doing this on a per-thread basis; however, in a 
> lot of web applications, threads come and go. It is pretty hard to 
> pinpoint which thread I want to trace object allocation on.
> 
> To answer your last question, what I am really interested in finding out 
> is what type of object allocation makes GC happen more frequently. 

Most GCs don't know what kind of garbage they are collecting, but I 
wonder if G1 might be of more assistance here?

> Randomly taking snapshots of the heap using jmap is absolutely not an 
> ideal way to do so: not only does it generate a giant file which is 
> difficult to work with, it also pauses the application and hurts 
> application throughput. 

So will excessive tracing as per the above. Honestly I think the only 
real way to diagnose such an issue is to let the system get into the 
problematic state, take it offline, take a snapshot and work from that.

I appreciate what you are trying to do, but I don't see this kind of 
tracing as a viable solution.

Cheers,
David
-----

> After I get the type of hot 
> allocation, if it happens to be an application level object type such as 
> com.Xyz.search.IndexReader, I can instrument the constructor to dump the 
> caller stack. People here also suggest it would be nice if we could 
> dump the allocation stack trace for some particular hot types.
> 
> I could propose the diff for you folks to review.
> 
> Thanks,
> 
> -Xiaobin
> 
> On Tue, Dec 14, 2010 at 11:58 PM, David Holmes
> <David.Holmes at oracle.com> wrote:
> 
>     Hi Xiaobin,
> 
>     The problem with tracing like this is that to be useful the tracing
>     must be unobtrusive and be able to handle getting called millions of
>     times (which is what will happen in those GC scenarios you
>     describe). The sheer volume of data generated as you suggest below
>     would overwhelm the app and be pretty hard to work with.
> 
>     Per-thread statistics of particular types (or of objects larger than
>     a certain size) might be more useful, with a dump able to be
>     requested on-demand.
> 
>     But I think you'd need to be able to combine this with heap dump
>     info to be useful.
> 
>     It really depends on exactly what info you want to be able to
>     deduce: simple number of objects of given types, hot allocation
>     sites, hot threads, ...
> 
>     Cheers,
>     David
> 
>     Xiaobin Lu said the following on 12/15/10 17:07:
> 
>         Hi folks,
> 
>         I would like to propose a way to trace object allocation on
>         Linux. On Solaris, we have DTrace which is pretty nice. But on
>         Linux, it is almost impossible to do so. Correct me if I am
>         wrong here.
> 
>         So I am thinking of adding a manageable VM flag; let's call it
>         TraceObjectAllocation. When enabled, it would output something like:
> 
>         thread id: 10     class name: java/lang/reflect/Method     size: 80 bytes
>         thread id: 10     class name: [Ljava/lang/Class;           size: 16 bytes
>         thread id: 10     class name: [C                           size: 56 bytes
>         thread id: 10     class name: java/lang/reflect/Method     size: 80 bytes
>         thread id: 10     class name: [Ljava/lang/Class;           size: 16 bytes
> 
>         As you could imagine, this could be very useful to keep track of
>         object allocation behavior in the app. Some smart tool can take
>         advantage of this to print a histogram (like top 10 hot
>         allocations) of object allocation. I would like to know your
>         thoughts and suggestions on this.
> 
>         I have a detailed proposal on this attached as a PDF file.
> 
>         Thanks,
> 
>         -Xiaobin
> 
> 
>
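
The "smart tool" mentioned in the original proposal, one that turns the 
trace output into a top-10 histogram, could look roughly like the Python 
sketch below. It assumes each record arrives on a single line in the 
"thread id / class name / size" layout shown in the sample output. The flag 
itself is only a proposal, so the format and the helper names here 
(RECORD, histogram, top_allocations) are assumptions for illustration.

```python
import re
from collections import Counter

# Assumed record layout, taken from the sample output in the proposal:
#   thread id: 10     class name: [C     size: 56 bytes
RECORD = re.compile(
    r"thread id:\s*(?P<tid>\d+)\s+"
    r"class name:\s*(?P<cls>\S+)\s+"
    r"size:\s*(?P<size>\d+)\s+bytes"
)


def histogram(lines):
    """Return (count_by_class, bytes_by_class) for all parsable records."""
    counts, sizes = Counter(), Counter()
    for line in lines:
        m = RECORD.search(line)
        if m is None:
            continue  # skip anything that is not a trace record
        counts[m["cls"]] += 1
        sizes[m["cls"]] += int(m["size"])
    return counts, sizes


def top_allocations(lines, n=10):
    """Top-n classes by total allocated bytes, as (class, count, bytes)."""
    counts, sizes = histogram(lines)
    return [(cls, counts[cls], total) for cls, total in sizes.most_common(n)]


SAMPLE = """\
thread id: 10     class name: java/lang/reflect/Method     size: 80 bytes
thread id: 10     class name: [Ljava/lang/Class;           size: 16 bytes
thread id: 10     class name: [C                           size: 56 bytes
thread id: 10     class name: java/lang/reflect/Method     size: 80 bytes
thread id: 10     class name: [Ljava/lang/Class;           size: 16 bytes
""".splitlines()

if __name__ == "__main__":
    for cls, count, total in top_allocations(SAMPLE):
        print(f"{total:>8} bytes  {count:>6} objects  {cls}")
```

Ranking by total bytes rather than by object count is a choice made here, 
not something the proposal specifies. Either way the tool still has to read 
every record, so the volume concern raised in the replies applies to it too.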