RFR: 8143157: Convert TraceVMOperation to Unified Logging

Fri Nov 20 01:56:15 UTC 2015

David,

A couple of replies below.

On 11/19/15 7:10 PM, David Holmes wrote:
> Hi Coleen,
>
> On 20/11/2015 1:27 AM, Coleen Phillimore wrote:
>> On 11/18/15 5:06 PM, David Holmes wrote:
>>> On 19/11/2015 4:41 AM, Rachel Protacio wrote:
>>>> On 11/18/2015 12:32 PM, Max Ockner wrote:
>>>>> I think the issue is that doit() happens regardless of what is being
>>>>> logged. I don't think we can reorder the logging statements relative
>>>>> to doit(). In that case, we would be forced to split into 2
>>>>> conditionals for the logging.
>>>>>
>>>>> Still I don't think it is necessary to call log_is_enabled twice, and
>>>>> I don't think it is necessary to always define the outputStream. If
>>>>> this turns out to be a problem, we could flip things around a bit:
>>>>>
>>>>>   void VM_Operation::evaluate() {
>>>>>     ResourceMark rm;
>>>>> !   bool enabled = false;
>>>>> !   if (log_is_enabled(Debug, vmoperation)) {
>>>>> !     outputStream* debugstream =
>>>>> LogHandle(vmoperation)::debug_stream();
>>>>> !     enabled = true;
>>>>> !   if (enabled) {
>>>>> !     print_on_error(debugstream);
>>>>> !     debugstream->cr();
>>>>>     }
>>>>>     doit();
>>>>> !   if (enabled) {
>>>>> !     print_on_error(debugstream);
>>>>> !     debugstream->print_cr("]");
>>>>>     }
>>>>>   }
>>>>>
>>>> I'm fine with moving the definition into a conditional (but I can't 
>>>> move
>>>> the declaration for scope reasons). It will look like this
>>>>
>>>>     outputStream* debugstream;
>>>>     if (log_is_enabled(Debug, vmoperation)) {
>>>>          debugstream = LogHandle(vmoperation)::debug_stream();
>>>
>>> You may run afoul of the static analyzer when you use debugstream in
>>> the logging clause after the doit(). It won't know that log_is_enabled
>>> has to return the same answer both times. The use of the enabled flag
>>> would address that.
>>>
>>>> To be honest, I'm not sure whether evaluating log_is_enabled twice is
>>>> worse than adding a boolean check and cluttering the code and 
>>>> making it
>>>> less readable. The argument for changing log_is_enabled() to a boolean
>>>> could be extrapolated to say that it should be a boolean for every 
>>>> check
>>>> for every tag. And that might be a good idea. But I feel like the
>>>> problem with it here is that it comes across as confusing to add
>>>> "enabled", which only actually needs to be used once. If we want it
>>>> changed, maybe we should change the macro? Or put it outside the
>>>> function, even, to make it only evaluate once?
>>>
>>> I would argue that within a function log_is_enabled should only be
>>> called once per tag/level combination. I don't see it impeding
>>> readability at all - quite the opposite. (The new logging code in the
>>> current example, looks worse to me, than the original, from a
>>> clutter/readability perspective.)
>>
>> Something like what both you and Rachel said?
>>
>>     outputStream* debugstream;
>>     if (log_is_enabled(Debug, vmoperations)) {
>>           debugstream = LogHandle(vmoperation)::debug_stream();
>>           log output
>>           enabled = true;
>>    }
>>    vm_operation
>>    if (enabled) {
>>        log output again
>>    }
>
> Yes. Given what we have heard about dynamically enabling/disabling 
> logging at runtime, using a single enabling flag in this kind of 
> situation is important I think.
>
>>>>> On 11/18/2015 4:22 AM, David Holmes wrote:
>>>>>> On 18/11/2015 5:50 AM, Rachel Protacio wrote:
>>>>>>> Please review the following small logging enhancement.
>>>>>>>
>>>>>>> Summary: The former -XX:+TraceVMOperation flag is updated to the
>>>>>>> unified
>>>>>>> logging framework and is now replaced with -Xlog:vmoperation in
>>>>>>> product
>>>>>>> mode.
>>>>>>>
>>>>>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8143157/
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8143157
>>>>>>> Testing: Passes jtreg, JPRT, RBT quick tests, and refworkload
>>>>>>> performance tests.
>>>>>>
>>>>>> Meta-question: the logging framework is safe to be called when at a
>>>>>> safepoint isn't it?
>>>> I haven't seen any problems, but I also don't know what would cause it
>>>> to be unsafe. I'd appreciate any guidance in figuring it out!
>>>
>>> The main thing would be locking of course.
>>
>> I think the only locking is the tty which the GC uses so if it's not
>> safe for safepoint use, then there's another bigger problem with the
>> framework.
>
> But can't this hook into non-tty based logging as directed by the 
> user? And can that hit any log-file rotation paths that use locking?
>
> Just want to be sure :)

I think Marcus answered this.  Yes, I assume any bugs will be fixed if 
we find them.
>
>>>>>>
>>>>>> src/share/vm/runtime/vm_operations.cpp
>>>>>>
>>>>>>   void VM_Operation::evaluate() {
>>>>>>     ResourceMark rm;
>>>>>> !   outputStream* debugstream =
>>>>>> LogHandle(vmoperation)::debug_stream();
>>>>>> !   if (log_is_enabled(Debug, vmoperation)) {
>>>>>> !     debugstream->print("[");
>>>>>> !     print_on_error(debugstream);
>>>>>> !     debugstream->cr();
>>>>>>     }
>>>>>>     doit();
>>>>>> !   if (log_is_enabled(Debug, vmoperation)) {
>>>>>> !     print_on_error(debugstream);
>>>>>> !     debugstream->print_cr("]");
>>>>>>     }
>>>>>>   }
>>>>>>
>>>>>> Why are you calling print_on_error twice?
>>>> The problem with switching VMOperation to logging is that in the 
>>>> current
>>>> system, the output goes straight to tty and so can be concatenated.
>>>> However, since the output triggered by the flag needs to be printed 
>>>> in a
>>>> log stream (which is not necessarily tty and which has bracketed
>>>> decorations), the "[<VM operation info>" portion comes out in that
>>>> stream, which could then be broken be other output, making it 
>>>> confusing
>>>> to have a closing "]" sitting around later. Coleen and I agreed 
>>>> that it
>>>> would improve output readability to be able to match the closing 
>>>> bracket
>>>> by restating the VM operation info. In other words:
>>>>
>>>>     [VM_Operation (0x00007fa3b44ce700): G1CollectFull, mode: 
>>>> safepoint,
>>>>     requested by thread 0x00007fa3ac018000
>>>>     <code from doit() and not from TraceVMOperation>
>>>>     ]
>>>>
>>>> becomes
>>>>
>>>>     [0.257s][debug  ][vmoperation] [VM_Operation (0x00007f74e589c700):
>>>>     G1CollectFull, mode: safepoint, requested by thread
>>>> 0x00007f74dc018000
>>>>     <code from doit() and not from TraceVMOperation>
>>>>     [0.272s][debug  ][vmoperation] VM_Operation (0x00007f74e589c700):
>>>>     G1CollectFull, mode: safepoint, requested by thread
>>>> 0x00007f74dc018000]
>>>>
>>>> rather than
>>>>
>>>>     [0.257s][debug  ][vmoperation] [VM_Operation (0x00007f74e589c700):
>>>>     G1CollectFull, mode: safepoint, requested by thread
>>>> 0x00007f74dc018000
>>>>     <code from doit() and not from TraceVMOperation>
>>>>     [0.272s][debug  ][vmoperation] ]
>>>>
>>>> It should be easier for users to parse this.
>>>
>>> Okay but in that case each part should have the trailing ]
>>>
>>> Or, perhaps better, given all the other decoration why not just drop
>>> the [ ] bracketing altogether? The intent to was capture all output
>>> that occurred while the VMOperation was active, but that is now much
>>> clearer in the output anyway. Though perhaps a "start" and "end"
>>> indicator would aid parsing - especially when nested VMOperations
>>> might occur.
>>
>> Either way!   The '[' on the first line and the ']' on the second log
>> line match up but we could drop both or keep them.  This doesn't look
>> like a big deal!    "start" and "end" are too noisy.
>
> Not a big deal, just trying to make it look neat and easy to match up 
> the pre- and post- operation log lines. The decorations are already 
> noisy so I don't think start/end make it any worse.

How about?

     [0.257s][debug  ][vmoperation] VM_Operation begin 
(0x00007f74e589c700):
     G1CollectFull, mode: safepoint, requested by thread
0x00007f74dc018000
     <code from doit() and not from TraceVMOperation>
     [0.272s][debug  ][vmoperation] VM_Operation end (0x00007f74e589c700):
     G1CollectFull, mode: safepoint, requested by thread
0x00007f74dc018000

>
>>>
>>>>>>
>>>>>> Why is the only logging level for this tag the "debug" level? I 
>>>>>> think
>>>>>> I may be missing part of the way UL works here - can you enable
>>>>>> logging both by explicit tag (ie vmoperation) and the level (ie
>>>>>> debug)?
>>>>>>
>>>> You enable the logging with "-Xlog:vmoperation=debug". If you leave of
>>>> the "=<level>" portion, it is by default parsed as "=info". We don't
>>>> believe this vmoperation logging needs to come out by default in the
>>>> case where someone asks for "-Xlog:all", so we put it one level below.
>>>
>>> <sigh> So does -Xlog:all signify all tags at info level or ???
>>>
>>> I think this is real conceptual problem with the UL framework. I
>>> should be able to define the info/trace/debug levels for a given tag,
>>> without having to think about how they interact with a "log all" 
>>> request.
>>
>> We picked debug level.  It seems like debug level.  Do you think it
>> should be Info level?  Is it important information?
>
> In the current TraceXXX logging scheme it is the tag, the XXX, that is 
> the primary focus. All of our tags are independent. We can add 
> "levels" to that logging by either adding in other flags (like Verbose 
> or WizardMode) or by using a numeric rather than boolean flag type.
>
> In the UL framework, as discussed in the emails with Kirk, the primary 
> focus is on the levels, and then the tags. This is an inversion of the 
> current scheme. That in itself would not be a problem when just 
> considering non-product tracing flags - particularly if, as discussed 
> in another thread, the distinction between product and non-product 
> levels is itself raised to be the first-level distinction. For each 
> flag/tag you can just consider: is this a first level of detail 
> (info), a second level of detail (debug), or a third level of detail 
> (trace) for this tag?
>
> However, when you change a non-product tag to product, this 
> level-oriented focus can become problematic: the framework wants the 
> user to be able to enable all info logging (for example), but the 
> "info" for tag XXX isn't considered anywhere as near as important as 
> the info for some other tags so "we" don't want it to show up in this 
> case, so instead of using "info" we use "debug", so that it is hidden. 
> As per the current case.
>
> So now we are not using the levels to classify the detail of a 
> particular subsystem, but rather we are using levels to provide a 
> secondary hierarchy of subsystems/tags:
>
> - level 1: all tags that have "info" logging statements
> - level 2: all "debug" logging from level 1 tags plus all tags that 
> start logging at "debug" level
> - level 3: all "trace" logging from level 1 and 2 tags, plus all 
> remaining tags that start logging at "trace" level
>
> I think this is problematic because we have to consider logging 
> holistically instead of just thinking about the subsystem being 
> logged. I also think the rules are very ad-hoc and subjective, which 
> isn't conducive to getting a clean and consistent approach to logging.

This is getting really long and I'm not seeing how to respond. Should we 
not convert the runtime tracing and and printing functions to Unifield 
Logging?  Should the framework be redone so that it satisfies all of 
your theoretical constraints?   Max wrote up a proposal how to convert 
the runtime flags to UL.  It's getting really tiresome to defend the 
framework with every flag that we try to convert.  I'm not sure what you 
would prefer we do.

Yes, there are tags and levels.  For runtime, we've chosen tags to 
differentiate all the different things we want to log.   Then we have 
levels for verbosity.  It's really not that complicated.  We wanted to 
avoid "info" because most logging in the runtime system isn't that 
interesting apart from it's subsystem.

Thanks,
Coleen
>
>>>
>>>>>> And I know I sound like a broken record but I'm concerned about the
>>>>>> overhead of the logging actions when it is not enabled ie:
>>>>>>
>>>>>> outputStream* debugstream = LogHandle(vmoperation)::debug_stream();
>>>>>>
>>>>>> always gets executed - and we evaluate is_enabled twice.
>>>>>>
>>>> See my reply to Max above.
>>>>>> ---
>>>>>>
>>>>>> test/runtime/logging/VMOperationTestMain.java
>>>>>>
>>>>>> Can you add a comment explaining what the logic is attempting to do
>>>>>> please. I'm curious how many times the loop executes before you 
>>>>>> get a
>>>>>> safepoint-based GC (and how it varies for different GC's).
>>>>>>
>>>>>> Style nit: while(  -> while (
>>>> Yes, will fix.
>>>>>>
>>>>>>> A compatability request has been accepted with regard to this 
>>>>>>> change.
>>>>>>
>>>>>> I'll record my objections again to the conversion of develop 
>>>>>> flags to
>>>>>> product. <sigh>
>>>> Thanks for voicing your concern. The justification for this tag being
>>>> product is that it could be useful to helping diagnose user 
>>>> problems, it
>>>> does not slow performance when off, and it minimally increases the 
>>>> size
>>>> of the code. If any of these prove to be false in aggregated logging
>>>> options, we can move it to develop level during code freeze.
>>>
>>> You could make that argument for every piece of logging and end up
>>> with a very simple flat logging system. :) It is all very subjective
>>> of course.
>>>
>>
>> Yes, we do intend on making this assessment and argument for each
>> logging level because we don't want a simple flat logging system. The
>> code that we are logging isn't simple nor flat.   I can cut/paste the
>> decision graph for you again.
>
> See above.
>
> Thanks,
> David
>
>> Thanks,
>> Coleen
>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Rachel
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thank you very much,
>>>>>>> Rachel
>>>>>
>>>>
>>