hprof format question

Alex Bagehot ceeaspb at gmail.com
Fri Nov 2 09:09:42 UTC 2018


Hi Simon,
If you just need a count then would a class histogram suffice?

jcmd <pid> GC.class_histogram

and/or, for older JDKs:

jmap -histo[:live] <pid>

thanks,
Alex

https://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
-histo[:live]
    Prints a histogram of the heap. For each Java class, number of objects,
    memory size in bytes, and fully qualified class names are printed. VM
    internal class names are printed with '*' prefix. If the live suboption
    is specified, only live objects are counted.
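
If the histogram is enough, its text output can be reduced to a count-per-class map with very little code. The following is only a minimal sketch: it assumes each data row has the shape "rank: #instances #bytes class-name" (newer JDKs may append a module column, which the pattern ignores), and the names HistoCounts and instanceCounts are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HistoCounts {
    // Assumed row shape: "<rank>: <#instances> <#bytes> <class name> [module]".
    // Header, separator, and "Total" lines do not match and are skipped.
    private static final Pattern ROW =
        Pattern.compile("^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+)");

    // Returns instance counts keyed by class name, in the order rows appear.
    static Map<String, Long> instanceCounts(List<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = ROW.matcher(line);
            if (m.find()) {
                counts.merge(m.group(3), Long.parseLong(m.group(1)), Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hand-written sample rows, not output captured from a real JVM.
        List<String> sample = List.of(
            " num     #instances         #bytes  class name",
            "----------------------------------------------",
            "   1:         51234        3279008  [B",
            "   2:         40110         962640  java.lang.String",
            "Total         91344        4241648");
        System.out.println(instanceCounts(sample));
    }
}
```

In practice the lines would come from the saved output of one of the commands above rather than from a hard-coded list.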

On Thu, Nov 1, 2018 at 10:09 PM Simon Roberts <
simon at dancingcloudservices.com> wrote:

> Thanks for the input Kirk, JC.
>
> Kirk, do I assume that your code decodes the entire object graph? I admit
> that mine is intended to be hyper-lean. It collects the strings, the
> classes, and then simply counts object instances. But the upside of that is
> that I can get a map of object count by class from a 20Gb heap in under two
> minutes, on a machine that only has 16G RAM. I tried to read it using jhat
> once on a 32G machine and it just ground into the dirt with all 8 cores
> running flat out (clearly in the "I'm dying" GC mode, since I'm pretty sure
> that jhat itself is single threaded!) But it's nice to know I'm not insane!
>
> JC, I'm on the road right now, but will try to dig out the relevant
> segment fwiw in the next few days. I will say that it was pretty clear that
> it was somehow bogus.
>
> Oh, while on the topic of what I learned from this exercise, I also
> believe I discovered a bug in BufferedInputStream (don't laugh!). It fails
> catastrophically after reading 2GB (I think I can guess why without even
> looking at the code :). Everything was reading properly from an unadorned
> FileInputStream, but then I needed to look at the byte sequence to work out
> what was happening with these broken parts, and everything that was
> previously working went crazy when I stuck the BIS on the front. I created
> a replacement so that I could actually look at the failing bytes in a
> debugger. Hmm, the point of that observation was to ask the off-topic
> question of how one should report a bug in a core API?
>
> Thanks again,
> Cheers,
> Simon
>
>
> On Thu, Nov 1, 2018 at 11:20 AM JC Beyler <jcbeyler at google.com> wrote:
>
>> Hi Simon,
>>
>> I briefly looked at the code that does the dump (or seems to) and the
>> code is written in a form of:
>>     out.writeByte((byte) HPROF_GC_ROOT_JNI_LOCAL);
>>     writeObjectID(oop);
>>     out.writeInt(threadIndex);
>>     out.writeInt(EMPTY_FRAME_DEPTH);
>>
>> (for the Java implementation). I quickly went through the various paths,
>> but I don't see a case where it could just stop after having written the
>> object ID; it seems that it would either throw a nice exception or
>> write those two integers afterward. The C++ implementation does the same:
>> void JNILocalsDumper::do_oop(oop* obj_p) {
>>   // ignore null handles
>>   oop o = *obj_p;
>>   if (o != NULL) {
>>     writer()->write_u1(HPROF_GC_ROOT_JNI_LOCAL);
>>     writer()->write_objectID(o);
>>     writer()->write_u4(_thread_serial_num);
>>     writer()->write_u4((u4)_frame_num);
>>   }
>> }
>>
>> I'm making a lot of assumptions: that the surrounding code is the same and
>> that the writer does not get corrupted or messed up. But it does seem sane.
>> What exactly are the few bytes at that 16th element that make you believe
>> the next two 4-byte values could not be the thread serial number and frame
>> number?
>> Jc
>>
>>
>> On Wed, Oct 31, 2018 at 11:34 AM Kirk Pepperdine <
>> kirk.pepperdine at gmail.com> wrote:
>>
>>> Hi Simon,
>>>
>>> I’ve also started a small project to try to solve the "we need to look
>>> at very large heaps" problem. My solution is to load the data into Neo4J.
>>> You can find the project on my GitHub account.
>>>
>>> So, I believe I’ve taken the same tactic in just abandoning the segment
>>> for the moment. It would be useful to sort that out but I listed it as a
>>> future…
>>>
>>> Kind regards,
>>> Kirk
>>>
>>> On Oct 31, 2018, at 4:07 AM, Simon Roberts <
>>> simon at dancingcloudservices.com> wrote:
>>>
>>> Hi all, I'm hoping this is the correct list for a question on the hprof
>>> file format (1.0.2)?
>>>
>>> I found this information:
>>> http://hg.openjdk.java.net/jdk6/jdk6/jdk/raw-file/tip/src/share/demo/jvmti/hprof/manual.html
>>>
>>> and have been working on a small project to read these files. (Yes, I
>>> know that NetBeans/VisualVM and Eclipse both have such libraries, and a
>>> number of other tools have been derived from those, but so far as I can
>>> tell, they all are fundamentally built on the notion of fully decoding
>>> everything, and creating memory representations of the entire heap. I want
>>> to pull out only certain pieces of information--specifically object counts
>>> by class--from a large, ~20Gb, dump file, and those tools just give up the
>>> ghost on my systems.)
>>>
>>> Anyway, my code reads the file pretty well so far, except that the file
>>> I want to analyze seems to contradict the specifications of the document
>>> mentioned above. Specifically, after processing about five
>>> HEAP_DUMP_SEGMENTS with around 1.5 million sub-records in each, I come
>>> across some ROOT_JNI_LOCAL records. The first 15 follow the format
>>> specified in the above document (one 8 byte "ID" and two four byte values.)
>>> But the 16th omits the two four-byte values (well, it might simply have
>>> more, but visual analysis shows that after the 8-byte ID, I have a new
>>> block tag and a believable structure). I've actually noticed that several
>>> of the record types defined in this "group" seem to diverge from the paper
>>> I mentioned.
>>>
>>> My solution is that if my parser trips, it abandons that
>>> HEAP_DUMP_SEGMENT from that point forward. It doesn't seem to matter much,
>>> since I was looking for object data, and it appears that all of that has
>>> already been handled. However, clearly this is not ideal!
>>>
>>> Is there any more detailed, newer, better, information? Or anything else
>>> I should know to pursue this tool (or indeed a simple object frequency by
>>> classname result) from an hprof 1.0.2 format file?
>>>
>>> (And yes, I'm pursuing a putative memory leak :)
>>>
>>> Thanks for any input (including "dude, this is the wrong list!")
>>> Cheers,
>>> Simon
>>>
>>>
>>>
>>> --
>>> Simon Roberts
>>> (303) 249 3613
>>>
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>
>
> --
> Simon Roberts
> (303) 249 3613
>
>
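
For reference, the ROOT_JNI_LOCAL layout discussed in the quoted thread (an object ID followed by two four-byte values, the thread serial number and the frame number) can be read with a few lines of Java. This is a rough sketch, not code from any of the tools mentioned: the class and method names are hypothetical, the tag value 0x02 and the big-endian encoding are taken from the hprof manual, idSize is assumed to have been read from the file header already, and the sample bytes are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class JniLocalRecord {
    // Sub-record tag for ROOT JNI LOCAL as listed in the hprof manual.
    static final int HPROF_GC_ROOT_JNI_LOCAL = 0x02;

    final long objectId;
    final int threadSerial;
    final int frameNumber;

    JniLocalRecord(long objectId, int threadSerial, int frameNumber) {
        this.objectId = objectId;
        this.threadSerial = threadSerial;
        this.frameNumber = frameNumber;
    }

    // Reads the body that follows the one-byte sub-record tag.
    // idSize is the identifier size from the file header (4 or 8 bytes).
    static JniLocalRecord readBody(DataInputStream in, int idSize) throws IOException {
        long id = (idSize == 8) ? in.readLong() : (in.readInt() & 0xFFFFFFFFL);
        int threadSerial = in.readInt(); // u4 thread serial number
        int frameNumber = in.readInt();  // u4 frame number in the stack trace
        return new JniLocalRecord(id, threadSerial, frameNumber);
    }

    public static void main(String[] args) throws IOException {
        // Invented sample: tag, 8-byte ID, then two 4-byte values.
        byte[] bytes = {
            0x02,
            0, 0, 0, 0, 0, 0, 0x10, 0x00,
            0, 0, 0, 1,
            (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF
        };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        int tag = in.readUnsignedByte();
        if (tag == HPROF_GC_ROOT_JNI_LOCAL) {
            JniLocalRecord r = readBody(in, 8);
            System.out.printf("id=0x%x thread=%d frame=%d%n",
                r.objectId, r.threadSerial, r.frameNumber);
        }
    }
}
```

A reader built this way consumes 16 bytes after the tag for an 8-byte ID, so if the byte that follows does not look like a valid sub-record tag, that points at the position where the parse and the file disagree.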