Feedback requested: HotSpot GC logging improvements

Thu May 6 13:04:31 PDT 2010

Wonderful! Comments inline.

On 06.05.2010 21:32, Tony Printezis wrote:
> A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output.
>
> I strongly believe that maintaining two GC log formats is
> counter-productive, especially given that the current -verbosegc format
> is unhelpful in many ways (i.e., lacks a lot of helpful information).
> So, we would like to unify the two into one, with maybe
> -XX:+PrintGCDetails generating a superset of what -verbosegc would
> generate (so that a parser for the -XX:+PrintGCDetails output will also
> be able to parse the -verbosegc output). The new output will not be what
> -XX:+PrintGCDetails generates today but something that can be reliably
> parsed and it is also reasonably human-readable (so, no xml and no
> space/tab-separated formats). Additionally, we're proposing to enable
> -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and
> ignore that option, I can't believe that users will really not want a
> time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to be
> optional though.
>
> Specific questions:
>
> - Is anyone really attached to the old -verbosegc output?

Not at all if we have a chance to get something better.

> - Would anyone really hate having time stamps by default?

We had to do a lot of quirks to simulate GCDateStamps for years until 
they finally made it into Java 6. Having timestamps by default is a 
must, absolute timestamps should be at least optional. Personally I find 
the absolute timestamps more important than the once relative from the 
JVM start, but that depends on what you are doing. Gathering statistical 
data the relative ones are better, since you can do computations more 
easily, tracking problems sometimes the absolute ones are easier, 
because you quickly want to know whether the log lines match the time of 
day when you observed the problem.

> - I know that a lot of folks have their own parsers for our current GC
> log formats. Would you be happy if we provided you with a (reliable!)
> parser for the new format in Java that you can easily adapt?

Of course. Although I can imagine folks qould want to get it in 
different implementation techniques. I guess you plan to provide a 
parser written in Java? Would be great, if it could be provided in a way 
making it easy fr people to customize, so possibly Open Source with a 
nice license like Apache Software License 2.

> B. Introducing "cyclic" GC logs.
>
> This is something that a lot of folks have asked for given that they
> were concerned with the GC logs getting very large (a 1TB disk is $85
> these days, but anyway...). Given that each GC log record is of variable
> size, we cannot easily cycle through the log using the same file (I'd
> rather not have to overwrite existing records). Our current proposal is
> for the user to specify a file number N and a size target S for each
> file. For a given GC log -Xloggc:foo, HotSpot will generate
>
> foo.00000001
> foo.00000002
> foo.00000003
> etc.
>
> (we'll create a new file as soon as the size of the one we are writing
> to exceeds S, so each file will be slightly larger than S but it will be
> helpful not to split individual log records between two files)
>
> When we create a new file, if we have more than N files we'll delete the
> oldest. So, in the above example, if N == 3, when we create foo.00000004
> we'll delete foo.00000001.
>
> Note that in the above scheme, the logs are not really "cyclic" but,
> instead, we're pruning the oldest records every now and then, which has
> the same effect.

There's a lot of options here. When you doing log rotation, people who 
want to archive the logs might have regular jobs (cron and friends) 
fetching the old closed files and transferring them to another system. 
In that case it would be nice if the apparatus would not get into 
conflict by both the internal rotation and the external script operating 
on the same files. f00.00000001 might have been detected as old and 
copied to the remote host and during the same time GC decides to now 
reuse it. Of course people can increase the cycle length and so on, but 
I always found it a bit problematic if a log rotation mechanism touches 
old files long after the rotation happened. That's why I personally find 
externally organized pruning better. Of course than it's not carefree 
out of the box.

Another thing I often miss is the ability to combine size and time based 
rotation. I want to say: rotate whenever 10MB are full so that the 
chunks I need to handle do not get to big, but please also rotate at 
midnight, so that I know that I can grab the complete files of the day 
after midnight. So specifying a max size and a time pattern and the 
first criterium fulfilled already triggers rotation.

> Another (related) request has been to maybe append the GC log file name
> with the pid of the JVM that's generating it. Maybe we don't want to do
> this by default. But, would people find it helpful if we provide a new
> cmd line parameter to do that? So, for the above example and assuming
> that the JVM's pid is 1234, the GC log file(s) will be either:
>
> foo.1234
>
> or
>
> foo.1234.00000001
> foo.1234.00000002
> foo.1234.00000003
> etc.
>
> Specific questions:
>
> - Would people really hate it if HotSpot starts appending the GC log
> file name with a (zero-padded) sequence number? Maybe if N == 1 (the
> default), HotSpot will skip the sequence number and ignore S, i.e.,
> behave as it does today.
> - To the people who have been asking for cyclic GC logs: is the sequence
> number scheme above good enough?

Some time ago I asked whether it would be possible to get the %p 
substitution (replace it by the process id) that is already available 
for some files also for the GC log. I think it already exists in the JDK 
code either for the HeapDumpOnOutOfMemoryError or the hotspot error 
file. Forgot for which. The code is extremely simple.

Would foo.%p.%8N be to complex?

Great initiative! Will you start another discussion about the data 
contents of the file? It could be interesting when people describe what 
kind of information they extract out of the GC logs. Not everything is 
straightforward in the sense of it is based on individual lines. As an 
example I always calculate the total stopped time per minute (summing 
up) as a percentage of wallclock time.

Regards,

Rainer