RFR(S): 8173335: Improve logging for j.l.ref.reference processing

Tue Jun 20 23:15:37 UTC 2017

Hi Thomas,

On 06/14/2017 12:52 AM, sangheon wrote:
> Hi Thomas again,
>
> On 06/13/2017 02:21 PM, sangheon wrote:
>> Hi Thomas,
>>
>> Thank you for reviewing this.
>>
>> On 06/13/2017 04:21 AM, Thomas Schatzl wrote:
>>> Hi Sangheon,
>>>
>>>
>>> On Mon, 2017-06-12 at 17:13 -0700, sangheon wrote:
>>>> Hi Aleksey,
>>>>
>>>> Thanks for the review.
>>>>
>>>> On 06/12/2017 09:06 AM, Aleksey Shipilev wrote:
>>>>> On 06/10/2017 01:57 AM, sangheon wrote:
>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8173335
>>>>>> webrev: http://cr.openjdk.java.net/~sangheki/8173335/webrev.0
>>> - There should be a destructor in ReferenceProcessor cleaning up the
>>> dynamically allocated memory.
>> Thomas and I had some discussion about this and agreed to file a 
>> separate CR for freeing issue.
>>
>> I noticed that there's no destructor when I wrote this, but this is 
>> how we usually implement.
>> However as this seems incorrect, I will add a destructor for newly 
>> added class but it will not be used in this patch.
>> It will be used in the following CR( 
>> https://bugs.openjdk.java.net/browse/JDK-8182120 ) which fixes 
>> not-freeing issue in ReferenceProcessor.
>> FYI, ReferenceProcessor has heap allocated members of 
>> ReferencePolicy(and its friends) but it is not freed too. So instead 
>> of extending this patch, I propose to separate this freeing issue.
>>
>>>
>>> - the change should move gc+ref output to something else: there is so
>>> much additional junk printed with gc+ref=trace so that the phase
>>> logging is drowned out with real trace information and unusable for
>>> regular consumption.
>> Okay, I will add it.
>> But I asked introducing 'gc+ref+phases' before but you didn't like 
>> it. :) Probably I didn't provide much details?!
>>
>>>
>>> Also I would prefer to have this detailed log output interspersed
>>> within the (existing) gc+phases output. Like under the "Reference
>>> Processing" and "Reference Enqueuing" sections for G1 in particular.
>> Frankly speaking, I'm not much interested now.
>> When I started investigating this CR, you mentioned about this too. 
>> (But you were okay for either way. i.e. current one or interspersing 
>> into G1 logging. :) )
>> I also tried in that way(interspersing one) and my feeling is that I 
>> don't see much benefit to have ref logs in G1 phases section. It 
>> looks better organized but it doesn't mean current log style is worse.
>> Ref. logs are printed separately for long time and other shared codes 
>> also print logs immediately.
>>
>> On the other hand, current implementation(re-use and print 
>> immediately) seems simpler to implement.
>> In addition, ReferenceProcessor::process_discovered_reflist() is 
>> repeatedly called for different type of References so re-using log 
>> printer seems natural to me. :)
>>
>>>
>>> Maybe with gc+phases+ref=debug/trace so that "everything" could be
>>> enabled using "gc+phases*=debug/trace"?
>> Yes, good idea.
>>
>>>
>>> I can see that the code throws away the previous information about
>>> reference processing after every use (the phasetimes reused). This is
>>> does not allow printing of the data at convenient times and places.
>>>
>>> I.e. I would prefer if the data were aggregated (not only from one
>>> particular phase) and later printed together.
>> If we don't intersperse with existing G1 log, do you still think 
>> printing later is needed?
>> Probably printing after Phanthom Ref. processed or different location?
>>
>>>
>>> I kind of disagree with Aleksey about need for backwards compatibility
>>> of log messages. This is such a big breaking change in the amount of
>>> information shown that existing users will want to adapt their log
>>> readers anyway.
>> True that log parsers already should be updated, but I understood 
>> Aleksey's comment something like preferring 'Ref Counts' instead of 
>> 'Counts'.
>>
>>> As mentioned, due to real trace code here, gc+ref=trace is unusable.
>> FYI, probably you tested with fastdebug because there are many 
>> debug/trace logs for debug build. It doesn't bother from product 
>> build actually.
>> But as I said, I will change current new logs' channel from 'gc+ref' 
>> to 'gc+phases+ref'.
>>
>>>
>>> We could still provide minimal backwards compatible output under
>>> gc+ref=debug if needed.
>> I'm don't see much value on this.
>> gc+phases+ref seems better.
>>
>>>
>>> - I would prefer if resetting the reference phase times logger wouldn't
>>> be kind of an afterthought of printing :)
>>>
>>> Also it might be useful to keep the data around for somewhat longer
>>> (not throw it away after every phase). Don't we need the data for
>>> further analysis?
>> I don't have strong opinion on this.
>>
>> I didn't consider keeping log data for further analysis. This could a 
>> minor reason for supporting keeping log data longer but I think 
>> interspersing with existing G1 log would be the main reason of 
>> keeping it.
>>
>>>
>>> This would also allow printing it later using different log tags (with
>>> different formatting).
>>>
>>> - I like the split of phasetimes into data storage and printing. I do
>>> not like that basically the timing data is created twice, once for the
>>> phasetimes, once for the GCTimer (for JFR basically).
>> No, currently timing data is created once and used for both phase log 
>> and GCTimer.
>> Or am I missing something?
>>
>> So in summary, mostly I agree with your comments except below 2:
>> 1. Interspersing with G1 log.
>> 2. Keeping log data longer. (This should be done if we go with 
>> interspersing idea)
> I started working on above 2 items. :)
> I will update webrev when I'm ready.
Here's updated webrevs which applied below:

1. Added destructor for ReferenceProcessorPhaseTimes.
2. Added 'gc+phases+ref' for newly added logs.
3. Interspersing reference logs into G1 young GC log.
     - Logs for other cases will be printed immediately.
4. All timing information have their own storage.
5. Total time is added.

Current reference logs will be:
1. New logs (except G1 young GC)
[1.541s][debug][gc,phases      ] GC(7) Finalize Marking 4.802ms
[1.541s][debug][gc,phases,start] GC(7) *_Reference Processing_**_<-- [1]_*
*[1.543s][debug][gc,phases,ref  ] GC(7) Reference Processing: 1.8ms**<-- [2]
**[1.543s][debug][gc,phases,ref  ] GC(7) SoftReference: 0.3ms**
**[1.543s][debug][gc,phases,ref  ] GC(7)     Balance queues: 0.0ms**
**[1.543s][debug][gc,phases,ref  ] GC(7)     Phase1: 0.3ms**
**[1.543s][trace][gc,phases,ref  ] GC(7) Process lists (ms)        Min: 
1541.3, Avg: 1541.3, Max: 1541.3, Diff:  0.0, Sum: 35450.0, Workers: 23**
**[1.543s][debug][gc,phases,ref  ] GC(7)     Phase2: 0.2ms**
**[1.543s][trace][gc,phases,ref  ] GC(7) Process lists (ms)        Min: 
1541.5, Avg: 1541.5, Max: 1541.5, Diff:  0.0, Sum: 35454.5, Workers: 23**
**[1.543s][debug][gc,phases,ref  ] GC(7)     Phase3: 0.3ms**
**[1.543s][trace][gc,phases,ref  ] GC(7) Process lists (ms)        Min: 
1541.7, Avg: 1541.8, Max: 1541.8, Diff:  0.0, Sum: 35460.5, Workers: 23**
**[1.543s][debug][gc,phases,ref  ] GC(7) Discovered: 0**
**[1.543s][debug][gc,phases,ref  ] GC(7) Cleared: 0**
**...**
**[1.543s][debug][gc,phases,ref  ] GC(7) Reference Enqueuing 0.1ms**
**[1.543s][trace][gc,phases,ref  ] GC(7)   Process lists (ms)        
Min: 1543.4, Avg: 1543.4, Max: 1543.4, Diff: 0.1, Sum: 35498.4, Workers: 
23**
**[1.543s][debug][gc,phases,ref  ] GC(7)   Ref Counts:  Soft: 0  Weak: 
0  Final: 0  Phantom: 0*
[1.544s][debug][gc,phases      ] GC(7) Reference Processing 2.445ms
[1.544s][debug][gc,phases,start] GC(7) Class Unloading
[1.544s][debug][gc,phases,start] GC(7) ClassLoaderData
[1.544s][debug][gc,phases      ] GC(7) ClassLoaderData 0.467ms

2. New logs for G1 young GC: -Xlog:gc+phases*=trace
[1.470s][info ][gc,phases     ] GC(6)   Post Evacuate Collection Set: 4.1ms
[1.470s][debug][gc,phases     ] GC(6)     Code Roots Fixup: 0.0ms
[1.470s][debug][gc,phases     ] GC(6)     Preserve CM Refs: 0.0ms
[1.470s][trace][gc,phases     ] GC(6)       Parallel Preserve CM Refs 
(ms): skipped
[1.470s][trace][gc,phases,task] GC(6)                                 - 
- - - - - - - - - - - - - - - - - - - - - -
*[1.470s][debug]_[gc,phases     ]_ GC(6) Reference Processing: 1.4ms**
**[1.470s][debug][gc,phases,ref ] GC(6) SoftReference: 0.2ms**
**[1.470s][debug][gc,phases,ref ] GC(6) Balance queues: 0.0ms**
**[1.470s][debug][gc,phases,ref ] GC(6) Phase1: 0.2ms**
**[1.470s][trace][gc,phases,ref ] GC(6) Process lists (ms)        Min: 
1463.2, Avg: 1463.2, Max: 1463.2, Diff:  0.0, Sum: 33653.2, Workers: 23**
**[1.470s][debug][gc,phases,ref ] GC(6) Phase2: 0.1ms**
**[1.470s][trace][gc,phases,ref ] GC(6) Process lists (ms)        Min: 
1463.3, Avg: 1463.3, Max: 1463.4, Diff:  0.0, Sum: 33656.8, Workers: 23**
**[1.470s][debug][gc,phases,ref ] GC(6) Phase3: 0.2ms**
**[1.470s][trace][gc,phases,ref ] GC(6) Process lists (ms)        Min: 
1463.5, Avg: 1463.5, Max: 1463.6, Diff:  0.0, Sum: 33661.6, Workers: 23**
**[1.470s][debug][gc,phases,ref ] GC(6) Discovered: 0**
**[1.470s][debug][gc,phases,ref ] GC(6) Cleared: 0*
...
[1.471s][debug][gc,phases     ] GC(6)     Clear Card Table: 0.0ms
*[1.471s][debug][gc,phases     ] GC(6)     Reference Enqueuing: 0.1ms**
**[1.471s][debug][gc,phases,ref ] GC(6)       Ref Counts:  Soft: 0  
Weak: 0  Final: 0  Phantom: 0*
[1.471s][debug][gc,phases     ] GC(6)     Merge Per-Thread State: 0.2ms

3. New logs for G1 young GC: -Xlog:gc+phases+ref=trace
[1.335s][debug]_*[gc,phases,ref]*_ GC(4)     Reference Processing: 
9.4ms_*<- This is still printed.*_
[1.335s][debug][gc,phases,ref] GC(4)       SoftReference: 7.0ms
[1.335s][debug][gc,phases,ref] GC(4)         Balance queues: 0.0ms
[1.335s][debug][gc,phases,ref] GC(4)         Phase1: 7.0ms
[1.335s][trace][gc,phases,ref] GC(4)           Process lists (ms)        
Min: 1329.0, Avg: 1329.0, Max: 1329.1, Diff: 0.1, Sum: 30568.1, Workers: 23
[1.335s][debug][gc,phases,ref] GC(4)         Phase2: 0.1ms
[1.335s][trace][gc,phases,ref] GC(4)           Process lists (ms)        
Min: 1329.1, Avg: 1329.1, Max: 1329.1, Diff: 0.0, Sum: 30569.7, Workers: 23
[1.335s][debug][gc,phases,ref] GC(4)         Phase3: 0.3ms
[1.335s][trace][gc,phases,ref] GC(4)           Process lists (ms)        
Min: 1329.4, Avg: 1329.4, Max: 1329.5, Diff: 0.1, Sum: 30576.7, Workers: 23
[1.335s][debug][gc,phases,ref] GC(4)         Discovered: 0
[1.335s][debug][gc,phases,ref] GC(4)         Cleared: 0
...
[1.335s][debug][gc,phases,ref] GC(4)     Reference Enqueuing: 0.1ms
[1.335s][debug][gc,phases,ref] GC(4)       Ref Counts: Soft: 0  Weak: 0  
Final: 0  Phantom: 0

[1]: Among implementations 'Reference Processing' GCTraceTime are 
measuring differently. Some include 
ReferenceProcessor::enqueue_discovered_references() while others don't. 
i.e. The former is something like measuring all reference process 
related work rather than process_discovered_references(). As to have its 
own total time seems right to me, I added it[2].

PS) There were some concern about exposing WorkerDataArray into 'shared' 
directory. But as there's no alternative to use, I'm hoping to share it now.

Webrev:
http://cr.openjdk.java.net/~sangheki/8173335/webrev.2 (full)
http://cr.openjdk.java.net/~sangheki/8173335/webrev.2_to_1/ (incremental)

Thanks,
Sangheon

>
> Thanks,
> Sangheon
>
>
>>
>> Let me post updated webrev, after making all decision.
>>
>> Thanks,
>> Sangheon
>>
>>
>>> Or the gctimer is
>>> passed everywhere. But that is another issue I guess.
>>>
>>> Thanks,
>>>    Thomas
>>>
>>>
>>>>> Oh, good! I had to instrument these by hand when optimizing RP
>>>>> paths.
>>>>>
>>>>> Comments after brief look:
>>>>>
>>>>>    *) So, the path with NULL executor are also not handling the
>>>>> timer? E.g. CMS:
>>>>>
>>>>>    5262   if (rp->processing_is_mt()) {
>>>>>    5263     rp->balance_all_queues();
>>>>>    5264     CMSRefProcTaskExecutor task_executor(*this);
>>>>>    5265 rp->enqueue_discovered_references(&task_executor,
>>>>> _gc_timer_cm);
>>>>>    5266   } else {
>>>>>    5267     rp->enqueue_discovered_references(NULL);
>>>>>    5268   }
>>>> Fixed to use timers for similar cases that you pointed. Thanks for
>>>> catching up this!
>>>> I started this CR as a part of MT ref. processing(JDK-8043575), so I
>>>> only added to that path. But this should be fixed.
>>>>>
>>>>>    *) I would leave "Ref Counts" line as usual for compatibility
>>>>> reasons. Changing
>>>>> it to "Counts" would force GC log parsers to handle that corner
>>>>> case too.
>>>> Changed, 'Counts -> Ref Counts'.
>>>>>
>>>>>    *) This may reuse Indents?
>>>>>
>>>>>      95       out->print("%s", "    ");
>>>> Fixed to use Indents[2].
>>>>
>>>>>
>>>>>    *) Probably makes sense to "hg mv -A" the workerDataArray files
>>>>> to preserve the
>>>>> Mercurial history -- webrev should say something like "copied from
>>>>> ...", IIRC.
>>>> Fixed.
>>>>
>>>> webrev:
>>>> http://cr.openjdk.java.net/~sangheki/8173335/webrev.1/
>>>> http://cr.openjdk.java.net/~sangheki/8173335/webrev.1_to_0
>>>>
>>>> Thanks,
>>>> Sangheon
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> -Aleksey
>>>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20170620/c5c66c52/attachment.htm>