[RFR]8215623: Add incremental dump for jmap histo
臧琳
zanglin5 at jd.com
Tue May 14 06:46:55 UTC 2019
Dear Serguei,
Thanks for your comments.
> > - incremental[:<file_name>], enable the incremental dump of heap; dumped
> > data will be saved to <file_name>, by default "IncrementalHisto.dump"
>
> Q1: Should the <file_name> be full path or short name?
> Is there any default path? What is the path of the
> "IncrementalHisto.dump" file?
The original design doesn't have the <file_name> option, so the file is hardcoded as "IncrementalHisto.dump" and saved to the same path as "file=" specifies, or printed to whatever the output stream is if "file=" is not set.
With the new design, I suggest first parsing <file_name>: if the value contains a folder path, use the specified path; if not, use the same path as the "file=" value; and if "file=" is not set, use the output stream. (The reason I prefer the same path as "file=" is that I assume users want to keep all data files under the same folder.)
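Something like the following is what I have in mind for the path resolution (a rough sketch only; the class and method names are mine, not from the webrev):

    import java.nio.file.Path;
    import java.nio.file.Paths;

    class IncrementalPathSketch {
        static final String DEFAULT_NAME = "IncrementalHisto.dump";

        // incrementalName: value of "incremental:<file_name>", may be null or a bare name
        // dumpFile:        value of "file=<path>", may be null (output-stream case)
        static Path resolveIncrementalPath(String incrementalName, String dumpFile) {
            String name = (incrementalName == null || incrementalName.isEmpty())
                    ? DEFAULT_NAME : incrementalName;
            Path p = Paths.get(name);
            if (p.getParent() != null) {
                return p;                         // user gave a folder path, use it as-is
            }
            if (dumpFile != null && Paths.get(dumpFile).getParent() != null) {
                // same folder as the "file=" target
                return Paths.get(dumpFile).getParent().resolve(name);
            }
            return p;   // no folder from "file="; fall back to the output stream / current dir
        }
    }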
> > - chunksize=<N>, size of objects (in KB) that will be dumped in one chunk.
>
> Q2: Should it be chunk of dump, not chunk of objects?
The purpose of "chunksize" is to decide how much object info is dumped at once. For example, with "chunksize=1" on an "-Xmx1m" heap, there will be at most 1MB/1KB ≈ 1000 chunks, which means there can be up to about 1000 file writes when doing "jmap -histo".
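To make it concrete, here is a rough sketch of the triggering logic (the names are just for illustration, not the actual patch code):

    class ChunkWriterSketch {
        // Flush the cumulative histogram every time another chunksize KB
        // of object data has been scanned during the heap walk.
        static void dumpInChunks(long[] scannedObjectSizes, long chunkSizeKB) {
            long chunkBytes = chunkSizeKB * 1024L;
            long accumulated = 0;
            for (long objSize : scannedObjectSizes) {
                accumulated += objSize;
                if (accumulated >= chunkBytes) {
                    writeSnapshot();   // intermediate, cumulative histogram data
                    accumulated = 0;
                }
            }
            writeSnapshot();           // final, complete histogram
        }

        static void writeSnapshot() { /* placeholder for the actual file write */ }
    }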
> > - maxfilesize=<N>, size of the incremental data dump file (in KB), when data size
> > is larger than maxfilesize, the file is erased and latest data will be written.
> Q3: What is a relation and limitations between chunksize and maxfilesize?
> Should the maxfilesize be multiple of the chunksize?
> Q4: The sentence "the file is erased and latest data will be written"
> is not clear enough.
> Why does the whole file need to be erased?
> Should the incremental file behave like a cyclic buffer?
> If so, then only next chunk needs to be erased.
> Then the chunks need to be numbered in order, so the earliest one can be found.
The "maxfilesize" controls the file size not to be too large, so when the dumped data is larger than "maxfilesize", the file is erased and latest data are written.The reason I erase whole file is that chunk data is accumulative, so the latest data includes the previous statistical ones. And this way may make the file easy to read.
I agree that we can add ordered number in chunks, I think it more or less help user to get to know how object distributed in heap.
I think maybe it is reasonable to have the incremental file behave like gclog, when maxfilesize is reached, the file is renamed with numbered suffix, and new file is created to use. so there can be IncrementalHisto.dump.0 and IncrementalHisto.dump.1 etc for large heap.
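A rough sketch of the rotation idea (again, the class and method names here are hypothetical):

    import java.io.File;

    class RotationSketch {
        // When the current incremental file reaches maxfilesize KB, rename it
        // with a numbered suffix (like GC log rotation) and continue in a new file.
        static File rotateIfNeeded(File current, long maxFileSizeKB, int rotationIndex) {
            if (current.length() >= maxFileSizeKB * 1024L) {
                File rotated = new File(current.getPath() + "." + rotationIndex);
                if (current.renameTo(rotated)) {          // e.g. IncrementalHisto.dump.0
                    return new File(current.getPath());   // continue in a fresh file
                }
            }
            return current;                               // keep writing to the current file
        }
    }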
what do you think?
Thanks,
Lin
________________________________________
From: serguei.spitsyn at oracle.com <serguei.spitsyn at oracle.com>
Sent: Saturday, May 11, 2019 2:17:41 AM
To: 臧琳; Hohensee, Paul; JC Beyler
Cc: serviceability-dev at openjdk.java.net
Subject: Re: [RFR]8215623: Add incremental dump for jmap histo
Dear Lin,
Sorry for the late reply.
I've edited the CSR a little bit to fix some incorrect spots.
Now, a couple of spots are not clear to me.
> - incremental[:<file_name>], enable the incremental dump of heap; dumped
> data will be saved to <file_name>, by default "IncrementalHisto.dump"
Q1: Should the <file_name> be full path or short name?
Is there any default path? What is the path of the
"IncrementalHisto.dump" file?
> - chunksize=<N>, size of objects (in KB) that will be dumped in one chunk.
Q2: Should it be chunk of dump, not chunk of objects?
> - maxfilesize=<N>, size of the incremental data dump file (in KB), when data size
> is larger than maxfilesize, the file is erased and latest data will be written.
Q3: What is a relation and limitations between chunksize and maxfilesize?
Should the maxfilesize be multiple of the chunksize?
Q4: The sentence "the file is erased and latest data will be written"
is not clear enough.
Why does the whole file need to be erased?
Should the incremental file behave like a cyclic buffer?
If so, then only next chunk needs to be erased.
Then the chunks need to be numbered in order, so the earliest one
can be found.
(I do not want you to accept my suggestions right away. It is just
a discussion point.
You need to prove that your approach is good and clean enough.)
If we resolve the questions (or get into agreement) then I'll update the
CSR as needed.
Thanks,
Serguei
On 5/5/19 00:34, 臧琳 wrote:
> Dear All,
> I have updated the CSR at https://bugs.openjdk.java.net/browse/JDK-8222319
> May I ask your help to review it?
> When it is finalized, I will refine the webrev.
>
> BRs,
> Lin
>
>> Dear Serguei,
>> Thanks a lot for your reviewing.
>>
>>
>>
>>> System.err.println(" incremental dump support:");
>>> + System.err.println(" chunkcount=<N> object number counted (in Kilo) to trigger incremental dump");
>>> + System.err.println(" maxfilesize=<N> size limit of incremental dump file (in KB)");
>>>
>>>
>>> From this description it is not clear at all what the chunkcount means.
>>> Is it to define how many heap objects are dumped in one chunk?
>>> If so, would it better to name it chunksize instead where chunksize is measured in heap objects?
>>> Then would it better to use the same units to define the maxfilesize as well?
>>> (I'm not insisting on this, just asking.)
>> The original meaning of "chunkcount" is how many objects are dumped in one chunk, and "maxfilesize" is the size limit of the dump file.
>> For example, "chunkcount=1, maxfilesize=10" means that intermediate data will be written to the dump file for every 1000 objects, and
>> when the dump file is larger than 10KB, the file is erased and rewritten with the latest dumped data.
>>
>> The reason I didn't use object count to control the dump file size is that there can be humongous objects, which may make the file too large.
>> Do you think using object size instead of chunkcount is a good option? Then the two options could use the same units.
>> BRs,
>> Lin