Feedback requested: HotSpot GC logging improvements
Doug Jones
doug.jones at internet.co.nz
Thu May 6 17:45:59 PDT 2010
The biggest problem for us is that when the JVM is restarted the previous GC
log file is overwritten. So I would like to suggest the following:
1) At midnight each day the GC log is cycled by appending the old day's date
(in YYYYMMDD format) to its name. So the log currently being written to is
always the file specified on -Xloggc.
2) When the JVM is started, if a file of the name specified on -Xloggc exists
then it is renamed to the appropriate YYYYMMDD file (preferably taking the
date from the file's last-modified time, not the current date) and a fresh
GC log is started. If the YYYYMMDD file already exists (i.e. the JVM has
already been restarted that day) then the last GC log would just be appended
to that.
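To make point 2 concrete, here is a rough sketch in Java of the startup step. It is only an illustration of the behaviour being suggested, not how HotSpot does or would implement it. The class GcLogArchiver, the method archiveExisting and the "<name>.YYYYMMDD" naming are made up for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class GcLogArchiver {
    /**
     * Hypothetical helper: move an existing GC log out of the way before a
     * fresh one is started. Returns the archive path, or null if there was
     * no existing log.
     */
    public static Path archiveExisting(Path log, ZoneId zone) throws IOException {
        if (!Files.exists(log)) {
            return null; // nothing to archive
        }
        // Take the date from the file's last-modified time, not from today.
        LocalDate lastWritten =
            Files.getLastModifiedTime(log).toInstant().atZone(zone).toLocalDate();
        String suffix = lastWritten.format(DateTimeFormatter.BASIC_ISO_DATE); // YYYYMMDD
        Path archive = log.resolveSibling(log.getFileName() + "." + suffix);

        if (Files.exists(archive)) {
            // The JVM was already restarted that day: append to the archive.
            try (OutputStream out =
                     Files.newOutputStream(archive, StandardOpenOption.APPEND)) {
                Files.copy(log, out);
            }
            Files.delete(log);
        } else {
            Files.move(log, archive);
        }
        return archive;
    }
}
```

Called once at startup, before the new log is opened, this leaves at most one archive per day regardless of how many times the JVM is restarted.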
This would seem to have a number of advantages: it overcomes the problem of
the last GC log being overwritten on a restart; the GC log is automatically
kept to a manageable size (most often when we are looking at a GC problem
we want to see what is in the log over the last few hours); and it makes
it easy for each site to implement its own retention policy, e.g. deleting
old logs after NN days. I don't really see retention as being the
responsibility of the JVM. I guess it also has the advantage that if
DateStamps are not turned on then the first entry in the log for a day will
have the TimeStamp for the start of the day, so it becomes much easier to
work out the time of day at which a subsequent event in the log occurred.
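As a small worked example of that last point: if the log is cycled at midnight, the first TimeStamp in the day's file (seconds since JVM start) corresponds to roughly midnight, so the time of day of a later event is about the difference between the two stamps. The sketch below assumes exactly that, and that the JVM was not restarted in between. The class and method names are made up, and the result is only approximate because the first entry is the first GC after midnight rather than midnight itself.

```java
import java.time.LocalTime;

public class GcTimeOfDay {
    /**
     * Approximate wall-clock time of an event, assuming the day's first log
     * entry was written at about midnight. Both arguments are GC time stamps
     * in seconds since JVM start.
     */
    public static LocalTime approxTimeOfDay(double firstStampSecs, double eventStampSecs) {
        long elapsed = (long) (eventStampSecs - firstStampSecs); // whole seconds
        return LocalTime.MIDNIGHT.plusSeconds(elapsed);
    }
}
```

For instance, with a made-up first stamp of 86400.123 and an event stamp of 97200.500, the event falls about three hours after midnight.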
For the other question our vote is: leave PrintGCDetails much as it is and
deprecate verbosegc; turn on PrintGCTimeStamps by default, but we agree
PrintGCDateStamps should stay optional (I suspect just getting the GC time
stamp is the lowest-overhead system call).
Doug.
----- Original Message -----
From: "Tony Printezis" <tony.printezis at oracle.com>
To: <hotspot-gc-use at openjdk.java.net>
Sent: Friday, May 07, 2010 7:32 AM
Subject: Feedback requested: HotSpot GC logging improvements
> Hi all,
>
> We would like your input on some changes to HotSpot's GC logging that we
> have been discussing. We have been wanting to improve our GC logging for
> some time. However we haven't had the resources to spend on it. We don't
> know when we'll get to it, but we'd still like to get some feedback on
> our plans.
>
> The changes fall into two categories.
>
>
> A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails output.
>
> I strongly believe that maintaining two GC log formats is
> counter-productive, especially given that the current -verbosegc format
> is unhelpful in many ways (i.e., lacks a lot of helpful information).
> So, we would like to unify the two into one, with maybe
> -XX:+PrintGCDetails generating a superset of what -verbosegc would
> generate (so that a parser for the -XX:+PrintGCDetails output will also
> be able to parse the -verbosegc output). The new output will not be what
> -XX:+PrintGCDetails generates today but something that can be reliably
> parsed and it is also reasonably human-readable (so, no xml and no
> space/tab-separated formats). Additionally, we're proposing to enable
> -XX:+PrintGCTimeStamps by default (in fact, we'll probably deprecate and
> ignore that option, I can't believe that users will really not want a
> time stamp per GC log record). We'll leave -XX:+PrintGCDateStamps to be
> optional though.
>
> Specific questions:
>
> - Is anyone really attached to the old -verbosegc output?
> - Would anyone really hate having time stamps by default?
> - I know that a lot of folks have their own parsers for our current GC
> log formats. Would you be happy if we provided you with a (reliable!)
> parser for the new format in Java that you can easily adapt?
>
>
> B. Introducing "cyclic" GC logs.
>
> This is something that a lot of folks have asked for given that they
> were concerned with the GC logs getting very large (a 1TB disk is $85
> these days, but anyway...). Given that each GC log record is of variable
> size, we cannot easily cycle through the log using the same file (I'd
> rather not have to overwrite existing records). Our current proposal is
> for the user to specify a file number N and a size target S for each
> file. For a given GC log -Xloggc:foo, HotSpot will generate
>
> foo.00000001
> foo.00000002
> foo.00000003
> etc.
>
> (we'll create a new file as soon as the size of the one we are writing
> to exceeds S, so each file will be slightly larger than S but it will be
> helpful not to split individual log records between two files)
>
> When we create a new file, if we have more than N files we'll delete the
> oldest. So, in the above example, if N == 3, when we create foo.00000004
> we'll delete foo.00000001.
>
> Note that in the above scheme, the logs are not really "cyclic" but,
> instead, we're pruning the oldest records every now and then, which has
> the same effect.
>
> Another (related) request has been to maybe append the GC log file name
> with the pid of the JVM that's generating it. Maybe we don't want to do
> this by default. But, would people find it helpful if we provide a new
> cmd line parameter to do that? So, for the above example and assuming
> that the JVM's pid is 1234, the GC log file(s) will be either:
>
> foo.1234
>
> or
>
> foo.1234.00000001
> foo.1234.00000002
> foo.1234.00000003
> etc.
>
> Specific questions:
>
> - Would people really hate it if HotSpot starts appending the GC log
> file name with a (zero-padded) sequence number? Maybe if N == 1 (the
> default), HotSpot will skip the sequence number and ignore S, i.e.,
> behave as it does today.
> - To the people who have been asking for cyclic GC logs: is the sequence
> number scheme above good enough?
>
>
> Thanks in advance for your feedback,
>
> Tony, HotSpot GC Group
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
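For comparison, the sequence-number scheme described in section B of the quoted message can be sketched as follows. This is only one reading of that description (roll to a new file once the current one has exceeded S, then delete the oldest so that at most N remain), not HotSpot code. The class SequencedLog and its members are hypothetical names, and S is taken to be in bytes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SequencedLog {
    private final Path base;       // e.g. "foo" from -Xloggc:foo
    private final int maxFiles;    // N in the quoted proposal
    private final long targetSize; // S in the quoted proposal, in bytes (assumption)
    private int seq = 1;           // sequence number of the file being written
    private long written = 0;      // bytes written to the current file

    public SequencedLog(Path base, int maxFiles, long targetSize) {
        this.base = base;
        this.maxFiles = maxFiles;
        this.targetSize = targetSize;
    }

    /** File name for a sequence number, e.g. foo.00000001. */
    public Path file(int n) {
        return base.resolveSibling(String.format("%s.%08d", base.getFileName(), n));
    }

    /** Append one whole record; records are never split across two files. */
    public void append(String record) throws IOException {
        if (written > targetSize) {
            // The current file has exceeded S: start the next one.
            seq++;
            written = 0;
            int oldest = seq - maxFiles;
            if (oldest >= 1) {
                Files.deleteIfExists(file(oldest)); // keep at most N files
            }
        }
        byte[] bytes = (record + "\n").getBytes(StandardCharsets.UTF_8);
        Files.write(file(seq), bytes,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        written += bytes.length;
    }
}
```

With N == 3, the call that creates foo.00000004 also deletes foo.00000001, matching the example in the quoted message. Unlike the date-based scheme, nothing here protects the files from a JVM restart, since the sequence would start again at 1.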