RFR(S): 8173335: Improve logging for j.l.ref.reference processing
sangheon
sangheon.kim at oracle.com
Thu Jun 29 07:56:04 UTC 2017
Hi Thomas,
Thank you very much for the thorough review.
On 06/26/2017 07:42 AM, Thomas Schatzl wrote:
> Hi Sangheon,
>
> thanks for all your changes, and sorry a bit for the delay...
>
> On Wed, 2017-06-14 at 00:52 -0700, sangheon wrote:
>> Hi Thomas again,
>> On 06/13/2017 02:21 PM, sangheon wrote:
>>> Hi Thomas,
>>>
>>> Thank you for reviewing this.
>>>
>>> On 06/13/2017 04:21 AM, Thomas Schatzl wrote:
>>>> Hi Sangheon,
>>>>
>>>>
>>>> On Mon, 2017-06-12 at 17:13 -0700, sangheon wrote:
>>>>> Hi Aleksey,
>>>>>
>>>>> Thanks for the review.
>>>>>
>>>>> On 06/12/2017 09:06 AM, Aleksey Shipilev wrote:
>>>>>> On 06/10/2017 01:57 AM, sangheon wrote:
>>>>>>> CR:https://bugs.openjdk.java.net/browse/JDK-8173335
>>>>>>> webrev:http://cr.openjdk.java.net/~sangheki/8173335/webrev
>>>>>>> .0
>>>> - There should be a destructor in ReferenceProcessor cleaning up
>>>> the dynamically allocated memory.
>>> Thomas and I had some discussion about this and agreed to file a
>>> separate CR for freeing issue.
>>>
>>> I noticed that there's no destructor when I wrote this, but this is
>>> how we usually implement.
>>> However as this seems incorrect, I will add a destructor for newly
>>> added class but it will not be used in this patch.
>>> It will be used in the following CR(
>>> https://bugs.openjdk.java.net/browse/JDK-8182120 ) which fixes
>>> not-freeing issue in ReferenceProcessor.
>>> FYI, ReferenceProcessor has heap allocated members of
>>> ReferencePolicy(and its friends) but it is not freed too. So
>>> instead of extending this patch, I propose to separate this freeing
>>> issue.
> That's fine, thanks.
>
>>>> - the change should move gc+ref output to something else: there
>>>> is so much additional junk printed with gc+ref=trace so that the
>>>> phase logging is drowned out with real trace information and
>>>> unusable for regular consumption.
>>> Okay, I will add it.
>>> But I asked introducing 'gc+ref+phases' before but you didn't like
>>> it. :) Probably I didn't provide much details?!
> Yes. In the example you showed me earlier with gc+ref=trace the
> examples did not contain the other gc+ref=trace output. That's why I
> thought it would be fine. :)
:)
>>>> - I would prefer if resetting the reference phase times logger
>>>> wouldn't be kind of an afterthought of printing :)
>>>>
>>>> Also it might be useful to keep the data around for somewhat
>>>> longer (not throw it away after every phase). Don't we need the
>>>> data for further analysis?
>>> I don't have strong opinion on this.
>>>
>>> I didn't consider keeping log data for further analysis. This could
>>> a minor reason for supporting keeping log data longer but I think
>>> interspersing with existing G1 log would be the main reason of
>>> keeping it.
>>>
>>>> This would also allow printing it later using different log tags
>>>> (with different formatting).
>>>>
>>>> - I like the split of phasetimes into data storage and printing.
>>>> I do not like that basically the timing data is created twice,
>>>> once for the phasetimes, once for the GCTimer (for JFR
>>>> basically). No, currently timing data is created once and used
>>>> for both phase log and GCTimer.
>>> Or am I missing something?
>>>
>>> So in summary, mostly I agree with your comments except below 2:
>>> 1. Interspersing with G1 log.
>>> 2. Keeping log data longer. (This should be done if we go with
>>> interspersing idea)
>> I started working on above 2 items. :)
>> I will update webrev when I'm ready.
>>
> Thanks a lot for considering all my comments.
>
> I think the output is much nicer now :)
Thanks!
> Some more notes:
>
> - In the current change (webrev.2) the method with using the
> "direct_print()" getter seems a bit forced only to keep the current
> structure of the code, i.e. printing within the
> ReferenceProcessor::process_references() method.
Right.
> What do you think about moving the printing outside of that method for
> all collectors, just passing a (properly initialized - that allows
> moving the reset() method into gc specific code as well)
> ReferenceProcessorPhaseTimes* that is then later used for printing,
> either directly, or deferred?
Okay, this seems better than current one.
During applying your suggestion, I tweaked a little bit, because giving
the responsibility of printing logs to the callers seems not that
natural to me. (I also prepared additional webrev for your original
suggestion [1])
> At the location where the reference processing is done we know whether
> we need to print directly or deferred. This also hides pretty specific
> information about printing (like indentation level) from the reference
> processing itself.
>
> Also that would maybe allow storing the GCTimer reference somewhere in
> the ReferenceProcessorPhaseTimes so that we only need to pass a single
> container for timing information around.
Good idea, now GCTimer is included in ReferenceProcessorPhaseTimes.
> Overall that may reduce the code quite a bit, keeps similar components
> (GCTimer and ReferenceProcessorPhaseTimes) together without
> ReferenceProcessor needing to know about both of them, and removes the
> ReferenceProcessor "global" reference to the
> ReferenceProcessorPhaseTimes, which is easier to keep track of when
> looking at the code (instead of having the GCTimer passed in and the
> ReferenceProcessorPhaseTimes as member).
>
> The collectors that print immediately probably also can get away with a
> stack-allocated local ReferenceProcessorPhaseTimes, which somewhat
> simplifies their lifecycle management.
Right.
Mostly ReferenceProcessorPhaseTimes will be stack-allocated at the time
of calling process_discovered_references() or
enqueue_discovered_references(), except for G1 young GC case.
~ReferenceProcessorPhaseTimes() will not be added in the destructor of
G1CollectedHeap as we don't have it now. This can be addressed in
separate CR if needed.
> - could you please tighten the visibility of
> ReferenceProcessorPhaseTimes methods a bit? The getters of that class
> are only ever used in the print* methods, and even some of these print*
> methods are ever called from class local methods.
>
> I think this would drastically decrease the surface of that method.
You are right.
Tried to move to 'private' as many as possible.
> - there seems to be a bug in printing per-thread per-phase worker
> times, the values seem to contain the absolute time at which the list
> has been processed, not a duration. (with -XX:+ParallelRefProcEnabled
> and gc+phases+ref=trace)
>
> [1512.286s][debug][gc,phases,ref] GC(834) Reference Processing: 2.5ms
> [1512.286s][debug][gc,phases,ref] GC(834) SoftReference: 0.3ms
> [1512.286s][debug][gc,phases,ref] GC(834) Balance queues: 0.0ms
> [1512.286s][debug][gc,phases,ref] GC(834) Phase1: 0.3ms
> [1512.286s][trace][gc,phases,ref] GC(834) Process lists
> (ms) Min: 1512283.9, Avg: 1512283.9, Max: 1512283.9, Diff: 0.0,
> Sum: 34782529.1, Workers: 23
> [1512.286s][debug][gc,phases,ref] GC(834) Phase2: 0.3ms
> [1512.286s][trace][gc,phases,ref] GC(834) Process lists
> (ms) Min: 1512284.2, Avg: 1512284.2, Max: 1512284.2, Diff: 0.0,
> Sum: 34782535.9, Workers: 23
>
> - in referenceProcessorPhaseTimes.cpp:35: the code reads
>
> if (_worker_time != NULL) {
> ...
> }
>
> with _worker_time being set to NULL just one line above (same with the
> other constructor).
>
> Not sure.
_worker_time check was remainder of previous change and resulted in
above bug you pointed.
Thanks for catching this.
Fixed.
> - in RefProcWorkerTimeTracker::~RefProcWorkerTimeTracker: how is it
> possible that _worker_time is NULL? ReferenceProcessorPhaseTimes seems
> to always allocate memory for it.
Fixed.
_worker_time can't be NULL.
> - RefProcPhaseTimesTracker takes the DiscoveredList array as parameter,
> but only ever uses it to determine how many total entries this
> DiscoveredList[] has. So it seems to me that it would be better in the
> name of information hiding if the ReferenceProcessor, which already has
> a total_count() method, would just pass this total instead of the
> entire list.
The problem is that 'before/after' count should be gathered from
constructor and destructor.
With passing a parameter, constructor could get the total but it is
impossible from destructor.
But I agree with your point that passing DiscoveredList to get the
total-count can be enhanced. So I changed to add a new method that
returns total count at ReferenceProcessor(per ReferenceType). With this
new approach, we can simplify a bit more. e.g. eliminate
ReferenceProcessorPhaseTimes::max_gc_threads() and
total_count_from_list() etc.
> This would also remove the need for the max_gc_counts() getter in
> ReferenceProcessorPhaseTimes afaics too.
[...]
> - "Ref Counts" vs. "Reference Counts" vs. something else in the output
> of the enqueue phase: I would prefer to not use abbreviations. Since we
> already mess up the logging output in a big way, we might also just go
> all the way :P
Changed to use 'Reference Counts'.
Updated webrev:
http://cr.openjdk.java.net/~sangheki/8173335/webrev.3
http://cr.openjdk.java.net/~sangheki/8173335/webrev.3_to_2/
Testing: JPRT and local test with all collectors.
[1]: with your suggestion, callers will stack allocate
ReferenceProcessorPhaseTimes and GCTimer can be included in
ReferenceProcessorPhaseTimes. i.e. applied only your suggestion
http://cr.openjdk.java.net/~sangheki/8173335/webrev.3b
http://cr.openjdk.java.net/~sangheki/8173335/webrev.3b_to_2/
Thanks,
Sangheon
> Thanks,
> Thomas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20170629/26e6f367/attachment.htm>
More information about the hotspot-gc-dev
mailing list