RFR: 8143157: Convert TraceVMOperation to Unified Logging

Fri Nov 20 04:19:43 UTC 2015

Important correction: Max's proposal does in fact list the level 
suggested for each flag. Mea culpa for not flagging this earlier, but as 
I said it is only as we have done this that these issues have coalesced 
for me.

David

On 20/11/2015 2:10 PM, David Holmes wrote:
> Inline ...
>
> On 20/11/2015 11:56 AM, Coleen Phillimore wrote:
>>
>> David,
>>
>> A couple of replies below.
>>
>>
>> On 11/19/15 7:10 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 20/11/2015 1:27 AM, Coleen Phillimore wrote:
>>>> On 11/18/15 5:06 PM, David Holmes wrote:
>>>>> On 19/11/2015 4:41 AM, Rachel Protacio wrote:
>>>>>> On 11/18/2015 12:32 PM, Max Ockner wrote:
>>>>>>> I think the issue is that doit() happens regardless of what is being
>>>>>>> logged. I don't think we can reorder the logging statements relative
>>>>>>> to doit(). In that case, we would be forced to split into 2
>>>>>>> conditionals for the logging.
>>>>>>>
>>>>>>> Still I don't think it is necessary to call log_is_enabled twice,
>>>>>>> and
>>>>>>> I don't think it is necessary to always define the outputStream. If
>>>>>>> this turns out to be a problem, we could flip things around a bit:
>>>>>>>
>>>>>>>   void VM_Operation::evaluate() {
>>>>>>>     ResourceMark rm;
>>>>>>> !   bool enabled = false;
>>>>>>> !   if (log_is_enabled(Debug, vmoperation)) {
>>>>>>> !     outputStream* debugstream =
>>>>>>> LogHandle(vmoperation)::debug_stream();
>>>>>>> !     enabled = true;
>>>>>>> !   if (enabled) {
>>>>>>> !     print_on_error(debugstream);
>>>>>>> !     debugstream->cr();
>>>>>>>     }
>>>>>>>     doit();
>>>>>>> !   if (enabled) {
>>>>>>> !     print_on_error(debugstream);
>>>>>>> !     debugstream->print_cr("]");
>>>>>>>     }
>>>>>>>   }
>>>>>>>
>>>>>> I'm fine with moving the definition into a conditional (but I can't
>>>>>> move
>>>>>> the declaration for scope reasons). It will look like this
>>>>>>
>>>>>>     outputStream* debugstream;
>>>>>>     if (log_is_enabled(Debug, vmoperation)) {
>>>>>>          debugstream = LogHandle(vmoperation)::debug_stream();
>>>>>
>>>>> You may run afoul of the static analyzer when you use debugstream in
>>>>> the logging clause after the doit(). It won't know that log_is_enabled
>>>>> has to return the same answer both times. The use of the enabled flag
>>>>> would address that.
>>>>>
>>>>>> To be honest, I'm not sure whether evaluating log_is_enabled twice is
>>>>>> worse than adding a boolean check and cluttering the code and
>>>>>> making it
>>>>>> less readable. The argument for changing log_is_enabled() to a
>>>>>> boolean
>>>>>> could be extrapolated to say that it should be a boolean for every
>>>>>> check
>>>>>> for every tag. And that might be a good idea. But I feel like the
>>>>>> problem with it here is that it comes across as confusing to add
>>>>>> "enabled", which only actually needs to be used once. If we want it
>>>>>> changed, maybe we should change the macro? Or put it outside the
>>>>>> function, even, to make it only evaluate once?
>>>>>
>>>>> I would argue that within a function log_is_enabled should only be
>>>>> called once per tag/level combination. I don't see it impeding
>>>>> readability at all - quite the opposite. (The new logging code in the
>>>>> current example, looks worse to me, than the original, from a
>>>>> clutter/readability perspective.)
>>>>
>>>> Something like what both you and Rachel said?
>>>>
>>>>     outputStream* debugstream;
>>>>     if (log_is_enabled(Debug, vmoperations)) {
>>>>           debugstream = LogHandle(vmoperation)::debug_stream();
>>>>           log output
>>>>           enabled = true;
>>>>    }
>>>>    vm_operation
>>>>    if (enabled) {
>>>>        log output again
>>>>    }
>>>
>>> Yes. Given what we have heard about dynamically enabling/disabling
>>> logging at runtime, using a single enabling flag in this kind of
>>> situation is important I think.
>>>
>>>>>>> On 11/18/2015 4:22 AM, David Holmes wrote:
>>>>>>>> On 18/11/2015 5:50 AM, Rachel Protacio wrote:
>>>>>>>>> Please review the following small logging enhancement.
>>>>>>>>>
>>>>>>>>> Summary: The former -XX:+TraceVMOperation flag is updated to the
>>>>>>>>> unified
>>>>>>>>> logging framework and is now replaced with -Xlog:vmoperation in
>>>>>>>>> product
>>>>>>>>> mode.
>>>>>>>>>
>>>>>>>>> Open webrev: http://cr.openjdk.java.net/~rprotacio/8143157/
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8143157
>>>>>>>>> Testing: Passes jtreg, JPRT, RBT quick tests, and refworkload
>>>>>>>>> performance tests.
>>>>>>>>
>>>>>>>> Meta-question: the logging framework is safe to be called when at a
>>>>>>>> safepoint isn't it?
>>>>>> I haven't seen any problems, but I also don't know what would
>>>>>> cause it
>>>>>> to be unsafe. I'd appreciate any guidance in figuring it out!
>>>>>
>>>>> The main thing would be locking of course.
>>>>
>>>> I think the only locking is the tty which the GC uses so if it's not
>>>> safe for safepoint use, then there's another bigger problem with the
>>>> framework.
>>>
>>> But can't this hook into non-tty based logging as directed by the
>>> user? And can that hit any log-file rotation paths that use locking?
>>>
>>> Just want to be sure :)
>>
>> I think Marcus answered this.  Yes, I assume any bugs will be fixed if
>> we find them.
>>>
>>>>>>>>
>>>>>>>> src/share/vm/runtime/vm_operations.cpp
>>>>>>>>
>>>>>>>>   void VM_Operation::evaluate() {
>>>>>>>>     ResourceMark rm;
>>>>>>>> !   outputStream* debugstream =
>>>>>>>> LogHandle(vmoperation)::debug_stream();
>>>>>>>> !   if (log_is_enabled(Debug, vmoperation)) {
>>>>>>>> !     debugstream->print("[");
>>>>>>>> !     print_on_error(debugstream);
>>>>>>>> !     debugstream->cr();
>>>>>>>>     }
>>>>>>>>     doit();
>>>>>>>> !   if (log_is_enabled(Debug, vmoperation)) {
>>>>>>>> !     print_on_error(debugstream);
>>>>>>>> !     debugstream->print_cr("]");
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>>
>>>>>>>> Why are you calling print_on_error twice?
>>>>>> The problem with switching VMOperation to logging is that in the
>>>>>> current
>>>>>> system, the output goes straight to tty and so can be concatenated.
>>>>>> However, since the output triggered by the flag needs to be printed
>>>>>> in a
>>>>>> log stream (which is not necessarily tty and which has bracketed
>>>>>> decorations), the "[<VM operation info>" portion comes out in that
>>>>>> stream, which could then be broken be other output, making it
>>>>>> confusing
>>>>>> to have a closing "]" sitting around later. Coleen and I agreed
>>>>>> that it
>>>>>> would improve output readability to be able to match the closing
>>>>>> bracket
>>>>>> by restating the VM operation info. In other words:
>>>>>>
>>>>>>     [VM_Operation (0x00007fa3b44ce700): G1CollectFull, mode:
>>>>>> safepoint,
>>>>>>     requested by thread 0x00007fa3ac018000
>>>>>>     <code from doit() and not from TraceVMOperation>
>>>>>>     ]
>>>>>>
>>>>>> becomes
>>>>>>
>>>>>>     [0.257s][debug  ][vmoperation] [VM_Operation
>>>>>> (0x00007f74e589c700):
>>>>>>     G1CollectFull, mode: safepoint, requested by thread
>>>>>> 0x00007f74dc018000
>>>>>>     <code from doit() and not from TraceVMOperation>
>>>>>>     [0.272s][debug  ][vmoperation] VM_Operation (0x00007f74e589c700):
>>>>>>     G1CollectFull, mode: safepoint, requested by thread
>>>>>> 0x00007f74dc018000]
>>>>>>
>>>>>> rather than
>>>>>>
>>>>>>     [0.257s][debug  ][vmoperation] [VM_Operation
>>>>>> (0x00007f74e589c700):
>>>>>>     G1CollectFull, mode: safepoint, requested by thread
>>>>>> 0x00007f74dc018000
>>>>>>     <code from doit() and not from TraceVMOperation>
>>>>>>     [0.272s][debug  ][vmoperation] ]
>>>>>>
>>>>>> It should be easier for users to parse this.
>>>>>
>>>>> Okay but in that case each part should have the trailing ]
>>>>>
>>>>> Or, perhaps better, given all the other decoration why not just drop
>>>>> the [ ] bracketing altogether? The intent to was capture all output
>>>>> that occurred while the VMOperation was active, but that is now much
>>>>> clearer in the output anyway. Though perhaps a "start" and "end"
>>>>> indicator would aid parsing - especially when nested VMOperations
>>>>> might occur.
>>>>
>>>> Either way!   The '[' on the first line and the ']' on the second log
>>>> line match up but we could drop both or keep them.  This doesn't look
>>>> like a big deal!    "start" and "end" are too noisy.
>>>
>>> Not a big deal, just trying to make it look neat and easy to match up
>>> the pre- and post- operation log lines. The decorations are already
>>> noisy so I don't think start/end make it any worse.
>>
>> How about?
>>
>>      [0.257s][debug  ][vmoperation] VM_Operation begin
>> (0x00007f74e589c700):
>>      G1CollectFull, mode: safepoint, requested by thread
>> 0x00007f74dc018000
>>      <code from doit() and not from TraceVMOperation>
>>      [0.272s][debug  ][vmoperation] VM_Operation end
>> (0x00007f74e589c700):
>>      G1CollectFull, mode: safepoint, requested by thread
>> 0x00007f74dc018000
>
> That looks fine - thanks.
>
>>>
>>>>>
>>>>>>>>
>>>>>>>> Why is the only logging level for this tag the "debug" level? I
>>>>>>>> think
>>>>>>>> I may be missing part of the way UL works here - can you enable
>>>>>>>> logging both by explicit tag (ie vmoperation) and the level (ie
>>>>>>>> debug)?
>>>>>>>>
>>>>>> You enable the logging with "-Xlog:vmoperation=debug". If you
>>>>>> leave of
>>>>>> the "=<level>" portion, it is by default parsed as "=info". We don't
>>>>>> believe this vmoperation logging needs to come out by default in the
>>>>>> case where someone asks for "-Xlog:all", so we put it one level
>>>>>> below.
>>>>>
>>>>> <sigh> So does -Xlog:all signify all tags at info level or ???
>>>>>
>>>>> I think this is real conceptual problem with the UL framework. I
>>>>> should be able to define the info/trace/debug levels for a given tag,
>>>>> without having to think about how they interact with a "log all"
>>>>> request.
>>>>
>>>> We picked debug level.  It seems like debug level.  Do you think it
>>>> should be Info level?  Is it important information?
>>>
>>> In the current TraceXXX logging scheme it is the tag, the XXX, that is
>>> the primary focus. All of our tags are independent. We can add
>>> "levels" to that logging by either adding in other flags (like Verbose
>>> or WizardMode) or by using a numeric rather than boolean flag type.
>>>
>>> In the UL framework, as discussed in the emails with Kirk, the primary
>>> focus is on the levels, and then the tags. This is an inversion of the
>>> current scheme. That in itself would not be a problem when just
>>> considering non-product tracing flags - particularly if, as discussed
>>> in another thread, the distinction between product and non-product
>>> levels is itself raised to be the first-level distinction. For each
>>> flag/tag you can just consider: is this a first level of detail
>>> (info), a second level of detail (debug), or a third level of detail
>>> (trace) for this tag?
>>>
>>> However, when you change a non-product tag to product, this
>>> level-oriented focus can become problematic: the framework wants the
>>> user to be able to enable all info logging (for example), but the
>>> "info" for tag XXX isn't considered anywhere as near as important as
>>> the info for some other tags so "we" don't want it to show up in this
>>> case, so instead of using "info" we use "debug", so that it is hidden.
>>> As per the current case.
>>>
>>> So now we are not using the levels to classify the detail of a
>>> particular subsystem, but rather we are using levels to provide a
>>> secondary hierarchy of subsystems/tags:
>>>
>>> - level 1: all tags that have "info" logging statements
>>> - level 2: all "debug" logging from level 1 tags plus all tags that
>>> start logging at "debug" level
>>> - level 3: all "trace" logging from level 1 and 2 tags, plus all
>>> remaining tags that start logging at "trace" level
>>>
>>> I think this is problematic because we have to consider logging
>>> holistically instead of just thinking about the subsystem being
>>> logged. I also think the rules are very ad-hoc and subjective, which
>>> isn't conducive to getting a clean and consistent approach to logging.
>>
>> This is getting really long and I'm not seeing how to respond. Should we
>> not convert the runtime tracing and and printing functions to Unifield
>> Logging?  Should the framework be redone so that it satisfies all of
>> your theoretical constraints?   Max wrote up a proposal how to convert
>> the runtime flags to UL.  It's getting really tiresome to defend the
>> framework with every flag that we try to convert.  I'm not sure what you
>> would prefer we do.
>
> I have not seen a proposal that touches on these specific issues ie
> selection of info versus debug; nor the decision to make non-product
> flags into product tags (I know that has been discussed elsewhere).
>
> As we have started doing the conversion so these issues have become more
> apparent. Something evidenced by the proposal to extend UL so that we do
> have a clear distinction between product and non-product tags.
>
>> Yes, there are tags and levels.  For runtime, we've chosen tags to
>> differentiate all the different things we want to log.   Then we have
>> levels for verbosity.  It's really not that complicated.  We wanted to
>> avoid "info" because most logging in the runtime system isn't that
>> interesting apart from it's subsystem.
>
> Your last sentence is the problem and the crux of the matter. If I say:
>
> -XX:+TraceVMoperation
>
> then I see the tracing of VM operations. With the current proposal if I
> say:
>
> -Xlog:vmoperation
>
> then I see Nothing! Because it has been chosen not to use the info level
> _because_ using it adds noise to -Xlog:all. From a VM operation
> perspective what we are logging is the most basic information and so
> considering only that subsystem it should be "info". But because the UL
> framework also supports level-based enabling, ie -Xlog:all, we are
> making choices about what level to use not based on the subsytem, but on
> the overall log settings. I don't think we should be doing that. I don't
> think we should care about someone who uses -Xlog:all. I think we should
> simply address each subsystem and use the available three levels of
> logging for that subsystem.
>
> Also note we only hit this issue with VMoperations because we are
> choosing to make it product mode. If this was still a non-product flag
> and we had the three proposed non-product levels, then this would
> naturally have been using "info".
>
> Thanks,
> David
>
>> Thanks,
>> Coleen
>>>
>>>>>
>>>>>>>> And I know I sound like a broken record but I'm concerned about the
>>>>>>>> overhead of the logging actions when it is not enabled ie:
>>>>>>>>
>>>>>>>> outputStream* debugstream = LogHandle(vmoperation)::debug_stream();
>>>>>>>>
>>>>>>>> always gets executed - and we evaluate is_enabled twice.
>>>>>>>>
>>>>>> See my reply to Max above.
>>>>>>>> ---
>>>>>>>>
>>>>>>>> test/runtime/logging/VMOperationTestMain.java
>>>>>>>>
>>>>>>>> Can you add a comment explaining what the logic is attempting to do
>>>>>>>> please. I'm curious how many times the loop executes before you
>>>>>>>> get a
>>>>>>>> safepoint-based GC (and how it varies for different GC's).
>>>>>>>>
>>>>>>>> Style nit: while(  -> while (
>>>>>> Yes, will fix.
>>>>>>>>
>>>>>>>>> A compatability request has been accepted with regard to this
>>>>>>>>> change.
>>>>>>>>
>>>>>>>> I'll record my objections again to the conversion of develop
>>>>>>>> flags to
>>>>>>>> product. <sigh>
>>>>>> Thanks for voicing your concern. The justification for this tag being
>>>>>> product is that it could be useful to helping diagnose user
>>>>>> problems, it
>>>>>> does not slow performance when off, and it minimally increases the
>>>>>> size
>>>>>> of the code. If any of these prove to be false in aggregated logging
>>>>>> options, we can move it to develop level during code freeze.
>>>>>
>>>>> You could make that argument for every piece of logging and end up
>>>>> with a very simple flat logging system. :) It is all very subjective
>>>>> of course.
>>>>>
>>>>
>>>> Yes, we do intend on making this assessment and argument for each
>>>> logging level because we don't want a simple flat logging system. The
>>>> code that we are logging isn't simple nor flat.   I can cut/paste the
>>>> decision graph for you again.
>>>
>>> See above.
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks,
>>>>>> Rachel
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Thank you very much,
>>>>>>>>> Rachel
>>>>>>>
>>>>>>
>>>>
>>