gclog improvements

Tue Oct 14 15:15:10 UTC 2008

Tony Printezis wrote:
>> Unfortunately the gclog gets overwritten each time the JVM starts due
>> to the way the file gets opened. So in order to keep the files people
>> need to add rotation commands to their start scripts. Often this doesn't
>> happen and especially when emergency restarts were needed, the old
>> gclog is gone.
>>
>> Any plans for changing this? Or maybe even adding log file rotation to
>> the gclog?
>>   
> We don't have any immediate plans for doing this. Regarding the log file
> getting overwritten, typically customers launch the JVM from a script
> and it's easy to create a unique GC log file name from a script (append
> the start time, pid, or something like that). So, that's an easy issue
> to solve...
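
The naming scheme suggested above (append the start time and a pid) could
look roughly like the following when the launcher is itself a small Java
program. This is only a sketch: the class GcLogName, its method names and
the file name pattern are made up for illustration, and -Xloggc: is
assumed to be the HotSpot option used to select the GC log file.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds a GC log file name that differs per JVM start.
class GcLogName {
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    // prefix, start time and pid are all supplied by the caller.
    public static String uniqueName(String prefix, LocalDateTime start, long pid) {
        return prefix + "-" + STAMP.format(start) + "-" + pid + ".log";
    }

    // Assumes the HotSpot -Xloggc:<file> option.
    public static String loggcOption(String fileName) {
        return "-Xloggc:" + fileName;
    }

    public static void main(String[] args) {
        // The launcher's own pid is only a stand-in here; the pid of the
        // JVM being started is not known before it is launched.
        // ProcessHandle requires Java 9 or later.
        long pid = ProcessHandle.current().pid();
        String name = uniqueName("gc", LocalDateTime.now(), pid);
        System.out.println(loggcOption(name));
    }
}
```

The printed option would then be added to the command line of the JVM
being started, so each start writes to a fresh file instead of truncating
the previous one.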

Yes, but I now see a lot of customers running their midrange server
apps on Windows. Those are of course services started by Java service
wrappers, which are not themselves easily scriptable. In particular,
automatic restarts triggered by built-in watchdogs offer no scripting
hook.

>> Furthermore: Until now there is no event-based model for monitoring the
>> GC and memory data. All methods apart from logging are based on a pull
>> model that retrieves information at regular intervals. The most
>> important information does not arise at regular intervals, but instead
>> immediately after a GC. It would be nice to have some event-based model
>> (e.g. like in JVMTI) that allows tracking the same data as the GC
>> logging, without using the historically motivated gclog file format.
>>   
> There's a trade-off. I don't like the pull method myself, given that it
> can skip events, give you inconsistent information, etc. But, to get
> consistent information you generally have to do a STW pause, and this is
> too much of an overhead for an application in production. This is why a
> lot of the monitoring has been implemented asynchronously.
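
To make the "pull can skip events" point concrete, here is a minimal
polling sketch based on the standard java.lang.management API. The class
GcPoller and the one-second interval are assumptions for illustration
only. A delta greater than 1 means several collections happened between
two samples and only their cumulative totals are visible.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

// Hypothetical poller: samples cumulative GC counters at fixed intervals.
class GcPoller {
    private final Map<String, Long> lastCounts = new HashMap<>();

    // Collections since the previous sample; the first sample is the baseline.
    public long delta(String collector, long currentCount) {
        Long previous = lastCounts.put(collector, currentCount);
        return previous == null ? 0 : currentCount - previous;
    }

    public void sample() {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count < 0) {
                continue; // count is undefined for this collector
            }
            long d = delta(gc.getName(), count);
            if (d > 0) {
                // Only totals are available; individual collections are merged.
                System.out.println(gc.getName() + ": " + d + " collection(s), "
                        + gc.getCollectionTime() + " ms total so far");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GcPoller poller = new GcPoller();
        for (int i = 0; i < 5; i++) {
            poller.sample();
            Thread.sleep(1000); // arbitrary polling interval
        }
    }
}
```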

So there is a problem in getting a perfect solution. An approximation
could be:

- provide a buffer with the information obtained after the last
Minor/Major GC (e.g. GC type, count, total and filled sizes of the
various memory regions, durations of the last run)

- fire an asynchronous event (not STW) to allow agents to retrieve the
buffered data

That's a hybrid that would try to provide some benefits of the event
model without risking long STW pauses. Of course one would have to
think about synchronizing read/write access to the buffer, but without
an agent running there would be no contention, and with an agent
running there would be a noticeable performance impact only if there
were lots of GCs and the agent additionally did expensive operations
while retrieving the buffers.
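
The idea can be sketched in plain Java as follows. This is not an
existing JVM interface: GcEventBuffer, Snapshot and all field names are
hypothetical, and an Executor stands in for whatever mechanism the VM
would use to deliver the asynchronous event. The writer stores an
immutable snapshot with a single atomic write and hands the notification
off without waiting for the agent.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical buffer holding the data of the most recent GC.
class GcEventBuffer {
    // Immutable snapshot; the set of fields is an assumption.
    static final class Snapshot {
        final String type;
        final long count;
        final long usedBytes;
        final long totalBytes;
        final long durationMillis;

        Snapshot(String type, long count, long usedBytes, long totalBytes, long durationMillis) {
            this.type = type;
            this.count = count;
            this.usedBytes = usedBytes;
            this.totalBytes = totalBytes;
            this.durationMillis = durationMillis;
        }

        @Override
        public String toString() {
            return type + " #" + count + " " + usedBytes + "/" + totalBytes
                    + " bytes, " + durationMillis + " ms";
        }
    }

    private final AtomicReference<Snapshot> last = new AtomicReference<>();
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();
    private final Executor notifier;

    GcEventBuffer(Executor notifier) {
        this.notifier = notifier;
    }

    void addListener(Runnable listener) {
        listeners.add(listener);
    }

    // Collector side: publish the snapshot, then notify without blocking.
    // With no listeners registered this is a single atomic write.
    void publish(Snapshot snapshot) {
        last.set(snapshot);
        for (Runnable listener : listeners) {
            notifier.execute(listener);
        }
    }

    // Agent side: read whatever snapshot is current at the time of the call.
    Snapshot latest() {
        return last.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService agentThread = Executors.newSingleThreadExecutor();
        GcEventBuffer buffer = new GcEventBuffer(agentThread);
        buffer.addListener(() -> System.out.println("agent saw: " + buffer.latest()));
        buffer.publish(new Snapshot("minor", 1, 64L << 20, 256L << 20, 12));
        agentThread.shutdown();
        agentThread.awaitTermination(1, TimeUnit.SECONDS);
    }
}
```

Because only the latest snapshot is kept, a slow agent can still miss
intermediate GCs; it never delays the writer, though, which is the
trade-off described above.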

Regards,

Rainer