RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3)

Tue Oct 22 20:46:45 UTC 2019

Hi Thomas,

Thanks for your review!

On 10/21/19 7:09 AM, Thomas Schatzl wrote:
> Hi,
>
>   some initial comments looking at the log output:
>
> On 13.10.19 08:16, sangheon.kim at oracle.com wrote:
>> Hi all,
>>
>> Previous patch conflicts because of JDK-8220310, I'm posting rebased 
>> one with some refactoring.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.2
>> Testing: hs-tier 1 ~ 5, with/without UseNUMA
>>
>> Here's the full patch of 8220310, 8220311 and 8220312.
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.full.2/
>>
>
>   - I did not performance impact test the additional logging yet, but 
> I do not expect issues.
>
>   - that's something from the first NUMA patch:
>
> There is this gc+heap+numa=debug log message "Request memory [address, 
> address] to be numa id (X)." for every region.
>
> First, it seems to be on the wrong level, consider a heap with 
> ten-thousands of regions. This imo clogs the log too much, and I would 
> prefer to move this information to trace level.
Moved to Trace level.

>
> Second, the full stop at the end is not necessary :)
Removed.

>
>   - the G1HRPrinter should be made NUMA aware, i.e. print expected 
> NUMA id for this region
>
>   - the casing of NUMA changes depending on message, i.e. sometimes 
> "NUMA" and other times "numa" in the log messages themselves. I would 
> recommend uniformly use "NUMA".
Changed to "NUMA".

>
> However I think that all the "NUMA id" in these messages should read 
> "node id" as at that level we do not manage the OS level NUMA ids any 
> more.
We don't manage but users may configure OS level NUMA ids (e.g. via 
numactl), so I wanted to print all logs with NUMA id.

>
>   - the "numa id" values in the various messages are formatted 
> differently in the different messages with no apparent guideline: 
> sometimes the code adds the leading zeros, sometimes not. Also the 
> separator between node id and value is sometimes ":" and once "="
>
> E.g.
>
> "NUMA id verification: preferred id (matched #): 00 (32), 01 (32), ..."
> "Region Allocated / Requested: 99% xxxx/yyyy (numa id 0: 99% ..."
>
> I am kind of undecided what is best, but probably simply leaving out 
> the leading zeros is best for the large majority of cases.
Okay, will remove leading zeros.

>
>   - just a suggestion: "Region Allocated / Requested" -> "Placement 
> Match Ratio" or so. Maybe somebody else has a better name.
"Placement match ratio" feels better but to align with below message, 
changed to lower case.

>
> Also in that message I would not print "numa id" at all to make the 
> message shorter.
>
>   - "Worker threads local object process rate" -> "Worker task 
> locality match rate" seems shorter.
Changed to "Worker task locality match ratio"

>
> Again, to make the message shorter I would prefer that "numa id" were 
> not printed at all in the details.
Tried to minimize but not zero occurrence.

>
> Not sure if that rate at this point is extremely interesting since G1 
> won't even try to improve it at this time, but you can leave it in if 
> you want.
Yeah, I know. But this is sort of logging framework for NUMA, so I would 
like to leave as is.

>
>   - I would *probably* like to have most of these messages split into 
> "recent" and "total" statistics. Maybe others think that the totals 
> are okay.
Interesting idea.
Could you expand your suggestion a bit more?
What is "recent"? Or do you mean per GC cycle?

>
>   - Again, to save space I would prefer to have the per-node details 
> in the region summaries in the same line as the original output. I.e. 
> instead of
>
> Eden regions: 28->0 (29)
>   From numa id 0: 18->0
>   From numa id 1: 10->0
>
> the following would be much shorter:
>
> Eden regions: 28->0 (29) (0: 18->0, 1: 10->0)
>
> As with higher node counts you will get lots of lines with little 
> content imho. Maybe others think differently?
I like your suggestion.

>
> Also, this would "fix" the problem that when you enabled gc+heap+numa 
> but not gc+heap, you will see these "From numa id" numbers in the log 
> without their required context. Alternatively, gc+heap+numa could 
> automatically enable gc+heap at the same level.
Yeah, I know this issue and this is why I like your suggestion! :)

>
> Comments after some superficial look at the changes themselves:
>
>   - G1Regions should be renamed as G1RegionCounts and get a single 
> line comment like: "Contains per Node id region count".
Done.

>
>   - G1NodeTimes::Stat: it would probably be useful to have a "rate()" 
> getter that recalculates the value as needed instead of the member.
I'm okay with your suggestion so I tried. :)

>
>   - G1HeapTransition::Data::~Data: the "if (soemthing != NULL)" checks 
> are unnecessary. FREE_C_HEAP_ARRAY does that already.
Done.

>
> Same in G1ParScanThreadState::G1ParscanThreadState.
Done.

>
>   - I do not understand the name "G1NodeTimes" :) What "time" is that 
> referring to?
It meant 'Phase Times' similar to G1GCPhaseTimes or 
ReferenceProcessorPhaseTimes.
G1NUMAPhaseTimes is better?
Or any suggestion for a name?

>
>   - G1NUMA::clear_statistics() seems to be unused.
Removed G1NUMA::clear_statistics().

>
>   - G1NodeTimes::print_mutator_alloc_stat_info() and 
> G1NodeTimes::copy_to_sruvivor_stat_info() look very similar. Could the 
> code be refactored a bit?
Good catch. Done.

I mostly addressed your comments except below two:
- I would *probably* like to have most of these messages split into 
"recent" and "total" statistics. Maybe others think that the totals are 
okay.
- I do not understand the name "G1NodeTimes" :) What "time" is that 
referring to?

I will post next webrev, if I get other reviews.

Thanks,
Sangheon

>
> Thanks,
>   Thomas