Feedback requested: HotSpot GC logging improvements

Mon May 10 15:41:28 PDT 2010

Volker,

Volker Kleinschmidt wrote:
> Where is the advantage to using counter numbers in the log file names?
> If you take the sensible suggestion made by several others here to use
> ISO datetimestamps in the filenames, you have a natural sequence, no
> worries about name re-use, and easily automated log maintenance by
> those that want to keep these logs for a while. You could still
> implement an auto-deletion of "older" logs for those that want it.
> Each log can then easily be identified, and an optional PID 
Well, even with date stamps, if you don't have the JVM pid in the file 
name you won't know which JVM each log came from either. And, if you 
only have date stamps on the files names, you might not know whether 
you're missing a file in between the two your customer sent you (you'd 
need to look at the contents to see whether the time stamps are 
contiguous or there's a potential hole). Like the time stamps vs. date 
stamps argument, the sequence number is the minimum you'd need and some 
folks might want to also enable date stamps in addition to the seq numbers.

BTW, something that I just thought of: if we do introduce a way for 
people to use %n, %d, whatever in the GC log file names, would it also 
be helpful to have %h for "host name"?
> in the
> filename would be helpful too. But a counter? What info does that give
> you by itself, without additional context? None whatsoever. That's why
> others declared it as "not meaningful".
>
> We mainly need gc logs for post-mortem performance problem analysis,
> so the date/time stamps on the logs would be really handy to identify
> which log to look at (we often don't get to look at the log on the
> client system, hence file dates don't help us, 
Good point.
> and they often don't
> have PrintGCDateStamps enabled). However the core issue for us is
> prevention of log overwriting when -Xloggc specifies a fixed filename
> and the VM gets restarted by the service wrapper watchdog feature,
> i.e. when you really needed that GC log. So any auto-log-rolling
> mechanism is much better than none and will make me yodel with joy :^)
>   
Yodel? This is almost a good reason to drop that proposal asap. ;-) ;-) ;-)

Tony
> On Fri, May 7, 2010 at 11:15 AM, Tony Printezis
> <tony.printezis at oracle.com> wrote:
>   
>> Hi all,
>>
>> First, thank you for all the excellent feedback (which I see as mostly
>> positive to the proposals). We are glad that people still care about the
>> GC logs. Instead of replying to individual e-mails, I'll consolidate my
>> replies here.
>>
>> * "I would say that PrintGCDateStamps should be the default" (several
>> folks brought this up)
>>
>>  From the point of view of analyzing logs to just look at the GC's
>> behavior, we only need time stamps. And this is the reason why I'd like
>> to see them turned on by default (too many times we got a log without
>> time stamps for which we said "damn, if it had time stamps we'd get a
>> better idea of what was happening"). So, they are the minimum we need to
>> get a good picture of how the GC behaved. Date stamps will increase the
>> size of the log (which still seems to be an issue for some people) and
>> be helpful in fewer places (i.e., when comparing application and GC
>> events; but we generally do not do that). So, you'll have to turn them
>> on yourselves. :-)
>>
>> * "successive runs overwrite the previous log file" (several folks
>> brought this up)
>>
>> Don't you think that adding the JVM's pid to the log file name would
>> eliminate this problem?
>>
>> * "I'm not attached to the old format" (several folks mentioned this)
>>
>> Oh, good. I'll be quoting you when I'll be making a case to remove it.
>>
>> * "A serial-attached-scsi (aka SAS) disk at 10k rpm is a little bit more
>> expensive than $85/TB" (Ryan)
>>
>> Point taken, but do you really need super duper 10k rpm disks to store
>> GC log files. :-)
>>
>> * "if you're going to roll the logs then I would prefer a meaningful
>> suffix rather than just a counter."
>>
>> A counter seems like a perfectly meaningful suffix to me.
>>
>> * "wouldn't you still need a verbose output that is specific to each
>> collector in order to provide a "debug" level of detail?" (Matt)
>>
>> Very good point. The verbose output will be as unified as possible, but
>> with indeed GC-specific extensions not to lose that information.
>>
>> * "I guess you plan to provide a parser written in Java?" (Rainer)
>>
>> Java? We're HotSpot developers! We only work in C++, assembly, and awk!
>> Just kidding... Yes, indeed in Java.
>>
>> * "so possibly Open Source with a nice license like Apache Software
>> License 2" (Rainer)
>>
>> Maybe, and not up to me to decide.
>>
>> * "f00.00000001 might have been detected as old and copied to the remote
>> host and during the same time GC decides to now reuse it...  That's why
>> I personally find externally organized pruning better. Another thing I
>> often miss is the ability to combine size and time based rotation." (Rainer)
>>
>> The proposal never reuses log files. We'll never overwrite anything.
>> Instead, we'll delete the oldest files as we create new ones. If we tell
>> the users to prune the older log files themselves, I know what the first
>> bug filed against the new policy will be. :-) Regarding rotating based
>> on both size and time: most people care about size so I think that's
>> what we'll do. If you want more advanced management of the logs you'll
>> have to set N to infinity (at least we'll need a way to say "never
>> delete older files") so that HotSpot doesn't delete any files and you'll
>> be able to copy them and delete them yourself.
>>
>> But, seriously, this is excellent feedback. You guys are doing more wild
>> stuff with our logs than I had imagined. :-)
>>
>> * "Will you start another discussion about the data contents of the
>> file?" (Rainer)
>>
>> We'll do that separately, based most likely on a wiki. When we get to
>> it. No promises though!
>>
>> * "For more debug detail per collector one could use PrintHeapAtGC"
>> (Michael)
>>
>> Well, PrintHeapAtGC was supposed to be added for debugging purposes,
>> i.e., to find out what the address range of each generation is. However,
>> it has clearer information on how full each generation is which is why
>> people use it today (it's very space inefficient though...). We are
>> hoping to add that information to the standard GC log records to
>> eliminate the need for PrintHeapAtGC.
>>
>> * "In our application we prefix _every_ "cyclic" log file with the
>> config options used to start the app." (Jeff)
>>
>> Adding configuration / whatever information at the top of every log file
>> fragment is an excellent suggestion. Thanks for bringing it up.
>>
>> * "How many digits in the sequence?  Would that be configurable?" (Adam)
>>
>> 8 should be more enough (do you really see the need for more than 99m
>> log fragments)? Actually, even 6 will probably  be enough. And if we go
>> over that, we won't cycle the numbers, we'll just expand the number field.
>>
>> * "IMHO it would be great if we could use %p also for" (Johann)
>>
>> I was going to say that this would start getting over the top. But I was
>> not aware that you can do that with the fatal error log. I'll need to
>> investigate that further. So, we'll leave this (and additional custom
>> formatting in the GC log name) as a "maybe". :-) I'm not quite sure
>> whether we'd want to use the same facility for the sequence numbers
>> though, given that they'd be needed if we split the log and won't be
>> needed if we don't. For those, I just vote to just add a suffix to the
>> log file name when they are needed.
>>
>> Thanks again for all the good points,
>>
>> Tony, HotSpot GC Group
>>
>> On 5/6/2010 3:32 PM, Tony Printezis wrote:
>>     
>>> Hi all,
>>>
>>> We would like your input on some changes to HotSpot's GC logging that
>>> we have been discussing. We have been wanting to improve our GC
>>> logging for some time. However we haven't had the resources to spend
>>> on it. We don't know when we'll get to it, but we'd still like to get
>>> some feedback on our plans.
>>>
>>> The changes fall into two categories.
>>>
>>>
>>> A. Unification and improvement of -verbosegc / -XX:+PrintGCDetails
>>> output.
>>>
>>> I strongly believe that maintaining two GC log formats is
>>> counter-productive, especially given that the current -verbosegc
>>> format is unhelpful in many ways (i.e., lacks a lot of helpful
>>> information). So, we would like to unify the two into one, with maybe
>>> -XX:+PrintGCDetails generating a superset of what -verbosegc would
>>> generate (so that a parser for the -XX:+PrintGCDetails output will
>>> also be able to parse the -verbosegc output). The new output will not
>>> be what -XX:+PrintGCDetails generates today but something that can be
>>> reliably parsed and it is also reasonably human-readable (so, no xml
>>> and no space/tab-separated formats). Additionally, we're proposing to
>>> enable -XX:+PrintGCTimeStamps by default (in fact, we'll probably
>>> deprecate and ignore that option, I can't believe that users will
>>> really not want a time stamp per GC log record). We'll leave
>>> -XX:+PrintGCDateStamps to be optional though.
>>>
>>> Specific questions:
>>>
>>> - Is anyone really attached to the old -verbosegc output?
>>> - Would anyone really hate having time stamps by default?
>>> - I know that a lot of folks have their own parsers for our current GC
>>> log formats. Would you be happy if we provided you with a (reliable!)
>>> parser for the new format in Java that you can easily adapt?
>>>
>>>
>>> B. Introducing "cyclic" GC logs.
>>>
>>> This is something that a lot of folks have asked for given that they
>>> were concerned with the GC logs getting very large (a 1TB disk is $85
>>> these days, but anyway...). Given that each GC log record is of
>>> variable size, we cannot easily cycle through the log using the same
>>> file (I'd rather not have to overwrite existing records). Our current
>>> proposal is for the user to specify a file number N and a size target
>>> S for each file. For a given GC log -Xloggc:foo, HotSpot will generate
>>>
>>> foo.00000001
>>> foo.00000002
>>> foo.00000003
>>> etc.
>>>
>>> (we'll create a new file as soon as the size of the one we are writing
>>> to exceeds S, so each file will be slightly larger than S but it will
>>> be helpful not to split individual log records between two files)
>>>
>>> When we create a new file, if we have more than N files we'll delete
>>> the oldest. So, in the above example, if N == 3, when we create
>>> foo.00000004 we'll delete foo.00000001.
>>>
>>> Note that in the above scheme, the logs are not really "cyclic" but,
>>> instead, we're pruning the oldest records every now and then, which
>>> has the same effect.
>>>
>>> Another (related) request has been to maybe append the GC log file
>>> name with the pid of the JVM that's generating it. Maybe we don't want
>>> to do this by default. But, would people find it helpful if we provide
>>> a new cmd line parameter to do that? So, for the above example and
>>> assuming that the JVM's pid is 1234, the GC log file(s) will be either:
>>>
>>> foo.1234
>>>
>>> or
>>>
>>> foo.1234.00000001
>>> foo.1234.00000002
>>> foo.1234.00000003
>>> etc.
>>>
>>> Specific questions:
>>>
>>> - Would people really hate it if HotSpot starts appending the GC log
>>> file name with a (zero-padded) sequence number? Maybe if N == 1 (the
>>> default), HotSpot will skip the sequence number and ignore S, i.e.,
>>> behave as it does today.
>>> - To the people who have been asking for cyclic GC logs: is the
>>> sequence number scheme above good enough?
>>>
>>>
>>> Thanks in advance for your feedback,
>>>
>>> Tony, HotSpot GC Group
>>>
>>>
>>>       
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>     
>
>
>
>