JEP 271: Unified GC Logging - Second pre-review

Vitaly Davidovich vitalyd at gmail.com
Tue Nov 10 19:43:46 UTC 2015


I agree with that, and it's mostly how I work with the logs nowadays (i.e. GC
is rare/unexpected, and if it occurs I'd like to see as much detail as
possible); in that case there are just one or a few GC events in there, so the
log is small anyway.  I don't see anything particularly unusual in the
workflow you describe, but it sounds more like searching for some outliers
than doing full-bore analysis.  E.g. what if you have hundreds/thousands of
to-space overflows or Full GCs or promotion failures or whatever? Are you
really going to inspect the entries leading up to each of those occurrences
manually? I suspect not; you'd more likely use some tool.  By the way, the
tool could just as well be an awk/python/perl/whatever script -- I'm not
implying that it has to be something super sophisticated -- it's just not a
human reading/parsing the data.  The human-readable bit is nice when you want
to "zoom in" on an occurrence, although a tool could be used just as well for
that (e.g. your example of a timeline-like cursor).
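
For illustration, a minimal sketch of the kind of throwaway script being
described here: scan a GC log for "interesting" events and print a few lines
of context around each hit, which is roughly the manual "zoom in" step. The
script name, the marker strings, and the context size are my own assumptions,
not anything prescribed by the JEP:

#!/usr/bin/env python3
# gc_outliers.py -- hypothetical helper (name and patterns are made up):
# find notable GC events in a log and show the lines around them, roughly
# what one would otherwise do by hand when "zooming in" on an occurrence.
import sys
from collections import deque

MARKERS = ("Pause Full", "to-space exhausted", "promotion failed")  # assumed
CONTEXT = 5  # lines of context to keep before/after each hit

def scan(path):
    before = deque(maxlen=CONTEXT)  # ring buffer of preceding lines
    trailing = 0                    # lines still to echo after a hit
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            hit = any(m in line for m in MARKERS)
            if hit:
                print("---- event at line %d ----" % lineno)
                for prev in before:
                    print(prev, end="")
                trailing = CONTEXT
            if hit or trailing > 0:
                print(line, end="")
                if not hit:
                    trailing -= 1
            before.append(line)

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else "gc.log")

Run it as "python3 gc_outliers.py gc.log". A second sketch at the bottom of
this mail shows how easily the proposed single-line gc,heap record splits
into per-space numbers.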

On Tue, Nov 10, 2015 at 2:13 PM, charlie hunt <charlie.hunt at oracle.com>
wrote:

> I may be interpreting what you are saying / suggesting incorrectly?
>
> Again, this may not be the way most folks operate when looking at GC logs
> …. When I have a huge GC log with, say, 100,000+ lines of GC data, I tend
> to take the following approach:
> a.)  I may take a look manually and do a quick search for specific things
> such as with G1 GC … looking for to-space overflows or Full GCs. These are
> very quick and easy to find. If there are a lot of them they tend to have a
> very similar root cause. If I do find some issues then I’ll stay manual and
> look at the GC events leading up to those problematic areas (and after) to
> get a sense of what’s going on. Oftentimes I do not need a tool.
> b.) If I don’t see one of the major issues, then I might go to a tool, or
> throw together a quick & dirty awk script to parse the logs. Even when I
> do see something visually within a tool, I want to go back to the GC logs
> and manually look at the GC events leading up to the one of interest, and
> at several after it, to get a good sense of what’s going on.
> * Btw, at the risk of going off topic, IMO the killer feature for a GC
> visualizer is the ability to select a given GC event and then scroll
> backward and forward from it, so I can see what was going on leading up to
> the event I’m looking at.
>
> If I’m looking at a smaller GC log, say 10,000 lines or less, I almost
> always look at it manually, and rarely load it into some kind of visualizer.
> I can usually find what I’m interested in more quickly that way, especially
> because I’ll end up looking at the GC log manually eventually anyway.
>
> charlie
>
> On Nov 10, 2015, at 11:19 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
> I think it's more a question of whether the GC log info is noise or signal.  If a
> process is GC'ing very frequently, looking at it manually doesn't strike me
> as practical -- you'll need a tool to make sense of it (i.e. each GC
> occurrence is noise).
>
> On Tue, Nov 10, 2015 at 12:16 PM, charlie hunt <charlie.hunt at oracle.com>
> wrote:
>
>>
>> On Nov 10, 2015, at 11:02 AM, Vitaly Davidovich <vitalyd at gmail.com>
>> wrote:
>>
>>> I’m sorry to say, but what you are saying just doesn’t correlate well with
>>> my experience. In my experience people rarely look at GC logging. That
>>> said, you are right, readability is not overly important to me; however,
>>> being completely unreadable wouldn’t be great either, because I also open
>>> them up and read them.
>>
>>
>> I do occasionally look at GC logs manually.  The reason is that GC is an
>> unexpected and/or rare event on some daemons.  I agree that on daemons
>> where GC is a frequent occurrence nobody is likely to look at it manually
>> unless they're examining a particular outlier.
>>
>>
>> I may not be the norm ... I almost always look at GC logs manually.
>>
>> And in those cases where I look at GC logs via tools / visualization, I
>> always end up looking at the logs to see what was happening leading up to
>> some GC event of interest.
>>
>> That said ... on the one hand I agree that many folks don't read GC logs,
>> but on the other hand, for my selfish purposes, I look at them, and I'd
>> like something that's easy to read. (Put whatever definition you'd like on
>> "easy to read").
>>
>> Don't know if that helps or complicates things for Bengt?
>>
>> Charlie
>>
>> On Tue, Nov 10, 2015 at 10:51 AM, Kirk Pepperdine <
>> kirk.pepperdine at gmail.com> wrote:
>>
>>>
>>> >>>>
>>> >>>>>> [11.247s][info   ][gc,heap     ] GC#265 38912K->0K(2048K)
>>> 4096K->4096K(6144K) 68990K->73147K(131072K) 2993K->2993K(1056768K)
>>> >>> This format may be fine for machine parsing but it is not very
>>> readable to humans. Readability is an important use case,
>>> >> Trust me on this one... no one reads these things. Even when I tell
>>> attendees of my workshop to read the GC log, they still don’t. Readability
>>> might be an important and interesting use case from your POV. From mine it
>>> really isn’t. As an FYI, you should look at HP and Azul GC logs.
>>> >
>>> > I understand that readability is not important to you. But there are
>>> many users other than you, and many of them depend on being able to read
>>> the GC logs in a reasonable way.
>>>
>>> I’m sorry to say, but what you are saying just doesn’t correlate well
>>> with my experience. In my experience people rarely look at GC logging. That
>>> said, you are right, readability is not overly important to me; however,
>>> being completely unreadable wouldn’t be great either, because I also open
>>> them up and read them.
>>>
>>> >
>>> >>
>>> >>> so we will need to at least add information about what the numbers
>>> map to. As I stated in a previous mail I would prefer these on separate
>>> lines, but if we want them on a single line I think the format needs to be
>>> something like:
>>> >>>
>>> >>> [11.247s][info   ][gc,heap     ] GC#265 Eden: 38912K->0K(2048K),
>>> Survivor: 4096K->4096K(6144K), Old: 68990K->73147K(131072K), Metaspace:
>>> 2993K->2993K(1056768K)
>>> >>>
>>> >>> To me this line is too long to read quickly. In particular if I
>>> am just trying to follow how, for example, the Eden size changes over time.
>>> >> My concern here, and why I believe they should be combined, is to
>>> minimize the frequency of logging.
>>> >
>>> > The frequency as in logging three lines instead of one line per GC?
>>> You would have to show me some performance numbers if you want to convince
>>> me that this would be a problem.
>>>
>>> This is a problem that you can’t see in the small. It is my experience
>>> tuning systems (sorry for the blatant hand-waving) that the way to choke a
>>> logging framework is to have it work very frequently. The size of the
>>> message is secondary, in that many small writes are far worse than fewer
>>> larger ones. For example, if you are logging to NAS from a system that is
>>> shy on network capacity, GC logging will degrade performance. Frequency in
>>> this situation hurts quite a bit. So, on per-GC logging, I’d agree with you
>>> 99% of the time. However, it’s that 1% of the time that is important,
>>> because those are the cases where you need the data, and it is very hard to
>>> get when collecting it puts more pressure on the system.
>>>
>>> I find it very ironic that logging, which is intended to improve the
>>> visibility of your application, is so often a bottleneck in applications
>>> simply because the way it works isn’t visible.
>>>
>>> All said, I think I’ve offered all I can in terms of my experiences on
>>> this subject so no worries, I’ll just let you get on with it.
>>>
>>> Kind regards,
>>> Kirk
>>>
>>>
>>
>
>
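
For reference, a rough sketch of how a tool (or a quick script) might split
the labeled single-line gc,heap record proposed in the quoted exchange into
per-space numbers. The regexes, field names, and output below are assumptions
based only on the example line quoted above (Eden/Survivor/Old/Metaspace as
before->after(capacity) in K), not on the final unified logging format:

#!/usr/bin/env python3
# parse_gc_heap.py -- hypothetical sketch: split the proposed single-line
# [gc,heap] record into (before, after, capacity) triples per space, in K.
# The patterns below guess at the layout of the example line in this
# thread; they are not a specification of the unified logging output.
import re
import sys

LINE_RE = re.compile(r"\[(?P<uptime>[0-9.]+)s\]\[\s*(?P<level>\w+)\s*\]"
                     r"\[\s*gc,heap\s*\]\s*GC#(?P<gc_id>\d+)\s*(?P<rest>.*)")
SPACE_RE = re.compile(r"(?P<name>\w+):\s*(?P<before>\d+)K->(?P<after>\d+)K"
                      r"\((?P<cap>\d+)K\)")

def parse(line):
    m = LINE_RE.match(line)
    if not m:
        return None
    spaces = {s.group("name"): (int(s.group("before")),
                                int(s.group("after")),
                                int(s.group("cap")))
              for s in SPACE_RE.finditer(m.group("rest"))}
    return float(m.group("uptime")), int(m.group("gc_id")), spaces

if __name__ == "__main__":
    # e.g. follow how Eden changes over time, one line per GC
    for line in sys.stdin:
        parsed = parse(line)
        if parsed:
            uptime, gc_id, spaces = parsed
            if "Eden" in spaces:
                before, after, cap = spaces["Eden"]
                print("%.3fs GC#%d Eden %dK -> %dK (cap %dK)"
                      % (uptime, gc_id, before, after, cap))

Fed the labeled example line on stdin, it prints "11.247s GC#265 Eden 38912K
-> 0K (cap 2048K)", which is the kind of "follow how Eden changes over time"
view mentioned in the quoted exchange.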