RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3)

Mon Oct 21 14:09:16 UTC 2019

Hi,

   some initial comments looking at the log output:

On 13.10.19 08:16, sangheon.kim at oracle.com wrote:
> Hi all,
> 
> Previous patch conflicts because of JDK-8220310, I'm posting rebased one 
> with some refactoring.
> 
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.2
> Testing: hs-tier 1 ~ 5, with/without UseNUMA
> 
> Here's the full patch of 8220310, 8220311 and 8220312.
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.full.2/
> 

   - I did not performance impact test the additional logging yet, but I 
do not expect issues.

   - that's something from the first NUMA patch:

There is this gc+heap+numa=debug log message "Request memory [address, 
address] to be numa id (X)." for every region.

First, it seems to be on the wrong level, consider a heap with 
ten-thousands of regions. This imo clogs the log too much, and I would 
prefer to move this information to trace level.

Second, the full stop at the end is not necessary :)

   - the G1HRPrinter should be made NUMA aware, i.e. print expected NUMA 
id for this region

   - the casing of NUMA changes depending on message, i.e. sometimes 
"NUMA" and other times "numa" in the log messages themselves. I would 
recommend uniformly use "NUMA".

However I think that all the "NUMA id" in these messages should read 
"node id" as at that level we do not manage the OS level NUMA ids any more.

   - the "numa id" values in the various messages are formatted 
differently in the different messages with no apparent guideline: 
sometimes the code adds the leading zeros, sometimes not. Also the 
separator between node id and value is sometimes ":" and once "="

E.g.

"NUMA id verification: preferred id (matched #): 00 (32), 01 (32), ..."
"Region Allocated / Requested: 99% xxxx/yyyy (numa id 0: 99% ..."

I am kind of undecided what is best, but probably simply leaving out the 
leading zeros is best for the large majority of cases.

   - just a suggestion: "Region Allocated / Requested" -> "Placement 
Match Ratio" or so. Maybe somebody else has a better name.

Also in that message I would not print "numa id" at all to make the 
message shorter.

   - "Worker threads local object process rate" -> "Worker task locality 
match rate" seems shorter.

Again, to make the message shorter I would prefer that "numa id" were 
not printed at all in the details.

Not sure if that rate at this point is extremely interesting since G1 
won't even try to improve it at this time, but you can leave it in if 
you want.

   - I would *probably* like to have most of these messages split into 
"recent" and "total" statistics. Maybe others think that the totals are 
okay.

   - Again, to save space I would prefer to have the per-node details in 
the region summaries in the same line as the original output. I.e. 
instead of

Eden regions: 28->0 (29)
   From numa id 0: 18->0
   From numa id 1: 10->0

the following would be much shorter:

Eden regions: 28->0 (29) (0: 18->0, 1: 10->0)

As with higher node counts you will get lots of lines with little 
content imho. Maybe others think differently?

Also, this would "fix" the problem that when you enabled gc+heap+numa 
but not gc+heap, you will see these "From numa id" numbers in the log 
without their required context. Alternatively, gc+heap+numa could 
automatically enable gc+heap at the same level.

Comments after some superficial look at the changes themselves:

   - G1Regions should be renamed as G1RegionCounts and get a single line 
comment like: "Contains per Node id region count".

   - G1NodeTimes::Stat: it would probably be useful to have a "rate()" 
getter that recalculates the value as needed instead of the member.

   - G1HeapTransition::Data::~Data: the "if (soemthing != NULL)" checks 
are unnecessary. FREE_C_HEAP_ARRAY does that already.

Same in G1ParScanThreadState::G1ParscanThreadState.

   - I do not understand the name "G1NodeTimes" :) What "time" is that 
referring to?

   - G1NUMA::clear_statistics() seems to be unused.

   - G1NodeTimes::print_mutator_alloc_stat_info() and 
G1NodeTimes::copy_to_sruvivor_stat_info() look very similar. Could the 
code be refactored a bit?

Thanks,
   Thomas