RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation for G1, Mutator (1/3)

Stefan Johansson stefan.johansson at oracle.com
Mon Sep 23 15:14:36 UTC 2019


Hi Sangheon,

Thanks for the updated webrev, I like where we are heading with this. I 
still have some comments but most of them are fairly small. The only big 
thing is that I still think we could simplify things a lot by not 
passing down node_index to the lower level commit functions and instead 
just focus on keeping a strict manual interleaving of regions/pages 
(whichever is largest). I know this wouldn't allow us to expand by a 
single region on a specific node, but since this is a fairly uncommon 
case I think we could live with that.

src/hotspot/share/gc/g1/g1Allocator.*
-------------------------------------
There are a few calls like this:
uint node_index = _g1h->mem_node_mgr()->index_of_current_thread();
I would like a private helper to wrap this, something like 
current_node_index().
---

src/hotspot/share/gc/g1/g1Arguments.cpp
---------------------------------------
  161   if (UseNUMA) {
  162     if (FLAG_IS_DEFAULT(AlwaysPreTouch)) {
  163       FLAG_SET_DEFAULT(AlwaysPreTouch, true);
  164     }
  165     if (!AlwaysPreTouch && FLAG_IS_CMDLINE(AlwaysPreTouch)) {
  166       warning("Disabling AlwaysPreTouch is incompatible with 
UseNUMA. Disabling UseNUMA.");
  167       FLAG_SET_ERGO(UseNUMA, false);
  168     }
  169   }

I'm not sure I understand why we have to enforce AlwaysPreTouch when 
NUMA is enabled. This will improve performance, but I think the NUMA 
support should work even without it. Or am I missing something?
---

src/hotspot/share/gc/g1/g1CollectedHeap.*
-----------------------------------------
As we discussed offline I'm not sure we need to pass the node_index down 
in the call to expand. Since we're not strict in giving out memory from 
the correct node during the mutator phase, I think we could do the same 
during the GC in the case we need to expand.
---

src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp
-------------------------------------------------
  162     G1NodeDistributor itr(node_index);
  163
  164     for (uint i = start_idx; i < start_idx + num_regions; i++) {
  165       // If there are many pages to touch, different node ids will 
be used.
  166       uint processed_node_index = itr.next_node_index();
  ...
  179         zero_filled = _storage.commit(idx, 1, processed_node_index);
  180         itr.next();
  181       }

Do we really need to do this? Can't we just rely on the code in 
G1NUMA::touch_memory_roundrobin() to do the right thing? Or is this just 
to allow committing a single region on the correct node?
---

src/hotspot/share/gc/g1/g1_globals.hpp
--------------------------------------
Instead of adding those flags I think we should use log-tags and guard 
the logic by using log_is_enabled(), or LogTarget::is_enabled() as you do 
in heapRegionManager.cpp. To allow these to be excluded with other 
logging turned on I suggest using specific log-tag-sets for each of them, 
something like:
G1PrintNUMAIdOfHeapRegions -> gc, numa, region (debug or trace)
G1VerifyNUMAIdOfHeapRegions -> gc, numa, region, verification (debug or 
trace)
---

src/hotspot/share/gc/g1/heapRegionManager.cpp
---------------------------------------------
  112   if (mgr->num_active_nodes() > 1) {
  113     uint valid_node_index = 
mgr->valid_node_index(requested_node_index);
  114     // Try to allocate with requested node index.
  115     hr = _free_list.remove_region_with_node_index(from_head, 
valid_node_index, NULL);
  116   }

I've seen some problems in my testing with this code: valid_node_index() 
now calls _numa->next_numa_index() when the index isn't valid. This means 
that both the lower level commit code and the allocation path will call 
this function in some cases (for example single region humongous). 
I'm not sure if this will lead to any real problems, but it makes it 
harder to understand what is going on underneath and I've seen it lead 
to imbalance over the nodes. To avoid this I think we could 
instead do:

if (mgr->num_active_nodes() > 1 &&
    mgr->is_valid_node_index(requested_node_index)) {
  hr = _free_list.remove_region_with_node_index(from_head, 
                                                requested_node_index, NULL);
}

This way we only remove with index if we have a valid index, otherwise 
we get a HeapRegion from the code below, which should be more or less 
the same as taking whatever index and removing with that:
  118   if (hr == NULL) {
  ...
  121     hr = _free_list.remove_region(from_head);
  122   }

If this is an ok approach I think valid_node_index() is no longer used 
and can be removed.
---

src/hotspot/share/gc/g1/g1NUMA.inline.hpp
-----------------------------------------
G1NUMA::next_numa_id() seems to be unused and could then be removed.
---

I'll continue looking at the patch and taking it for some spins. I might 
come back with additional feedback later on, but these are my initial 
findings.

Thanks,
Stefan


On 2019-09-21 07:19, sangheon.kim at oracle.com wrote:
> Hi Kim,
> 
> Many thanks for this thorough review!
> 
> On 9/16/19 4:12 PM, Kim Barrett wrote:
>>> On Sep 4, 2019, at 3:16 AM, sangheon.kim at oracle.com wrote:
>>>
>>> Hi all,
>>>
>>> Please review this patch, making G1 NUMA aware.
>>> This is the first part of G1 NUMA implementation.
>>>
>>> - At G1 CollectedHeap initialization time, the given heap regions 
>>> will be split and touched.
>>> - Internally node index(uint) is used instead of node id(int type) 
>>> for easier access to arrays.
>>> - Only Linux is supported.
>>> - New logs are under gc+heap+numa
>>> - Current memory touch implementation will be replaced with WorkGang 
>>> by JDK-8230411: NUMA aware work gang.
>>>
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8220310
>>> Webrev: http://cr.openjdk.java.net/~sangheki/8220310/webrev.0
>>> Testing: hs-tier 1 ~ 5 with +- UseNUMA.
>>>
>>> Thanks,
>>> Sangheon
> I had long off-line discussion with Kim, Thomas and Stefan J.
> And 2 major changes are:
> 
> 1) The memory touch thread is removed. This was a general way to implement 
> it for both Linux and Solaris. However, Solaris is not supported anymore, 
> so the memory touch thread is simply removed.
> And this made the code much simpler.
> 
> 2) G1NUMA doesn't manage the numa id of each page; we don't care about 
> individual pages anymore, only about each HeapRegion. I.e. we assume the 
> numa id of all pages between the bottom and end of a HeapRegion to be 
> the same. This also made the related code much simpler.
> 
> There are some other changes as well.
> 
> 
>> Here is an initial set of comments.  Some of these might be nonsense; I'm
>> not all that familiar with NUMA support.
>>
>> ------------------------------------------------------------------------------ 
>>
>>
>> Just in general, it's kind of disappointing that NUMA support for G1
>> apparently involves this much code, and is completely different from
>> ParallelGC NUMA support, and both are completely different from ZGC
>> NUMA support.  (And Shenandoah is probably yet another still.)
>> Oh well...
> Yes, I agree on that.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>>
>> According to off-line discussion, the intent is that a HeapRegion only
>> contains memory from one NUMA node.  That isn't obvious though, and there
>> are API and data structure choices that obscure that.  It seems like some
>> things could be simplified with that constraint in mind.  I think some of
>> that is a result of trying to leave open support for NUMA aware off-heap
>> data structures, but that isn't part of this change, and might be better
>> kept separate from NUMA aware Java heap support.
>>
>> I'll try to point out places where I think there are opportunities for
>> simplification.
>>
>> ------------------------------------------------------------------------------ 
>>
>>
>> According to off-line discussion, the touching mechanism might not be
>> necessary.  If that's true, that would be a significant 
>> simplification.  For
>> now I won't comment on the code for the touch threads, though I think
>> I found some problems there before the discussion suggesting we might
>> not need them happened.
> Thanks.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.hpp
>>    64 // Manages NUMA node related information and memory touch threads.
>> ...
>>
>> I found the comment describing G1NUMA pretty hard to parse.  Even
>> after looking at the code I'm having trouble figuring out what's being
>> said.
>>
>> This part I was able to figure out, but think it could be worded better.
>>    70 // Also provides a way to convert NUMA id and NUMA 
>> index(starting from zero and sequatial).
>>    71 // Actual NUMA ids would not be sequential nor not start from 
>> zero, so a conversion is necessary
>>    72 // between id and index.
>>
>> Something like
>>
>>    Also provides a mapping between NUMA indices (sequential, starting
>>    from zero) and NUMA ids (which may not start at zero and may not be
>>    sequential).  Indices are generally used by clients, and mapped to
>>    ids when dealing with the OS/hardware layer.
> DONE
> 
>>
>> (I *think* that's true.)
>>
>> But then, why expose node_ids at all in G1NUMA? It seems like node_ids
>> could be completely relegated to an implementation detail that clients
>> never need to deal with.
> ...
>> As evidence for not needing to expose ids to clients, there are no
>> callers of numa_id_of_index in any of the series of 3 patches.  And
>> G1NUMA::numa_ids() and G1MemoryMultNodeManager::node_ids() are only
>> used in a couple of places for printing/logging and by
>> WB_G1MemoryNodeIds.  I'm not sure what the last is for, but the
>> printing and logging code could be isolated into G1NUMA.
> Basically I totally agree with you on not exposing the numa id.
> The current patches only use the node id for logging purposes in some 
> locations, because users are only interested in the node id.
> 
> Also, Thread::lgrp_id() is public as well (renaming lgrp id, numa id 
> etc. is a different topic that we discussed before), so my feeling is 
> that the current approach is enough. :)
> 
> WB_G1MemoryNodeIds() is for jtreg testing purposes; it checks whether 
> the given memory has the node id we expect.
> 
>>
>> Also, it looks like the index to node_id mapping provided here is
>> duplicating the information provided by os::Linux::nindex_to_node.
>> Not that this necessarily helps with anything in generic code...
> Yes, I'm aware of _nindex_to_node but I think maintaining such a 
> conversion in G1NUMA is okay. Other NUMA implementations such as 
> Parallel GC rely on the node id, so completely removing the node id from 
> general code seems like a different scope.
> Unless you have a strong opinion on addressing it in this patch, I would 
> like to proceed as is.
> If that is the case, we can consider your suggestion as a separate 
> enhancement.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>>    57   virtual int* node_ids() const { static int dummy_id = 0; 
>> return &dummy_id; }
>>
>> Can the return type here be "const int*" ?  That would be better if 
>> possible.
>>
>> [This is probably moot if clients only deal with numa indexes and not
>> node_ids, as suggested earlier.]
> DONE
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>>    63   virtual uint guarantee_valid_node_index(uint node_index) const 
>> { return 0; }
>>
>> Given naming conventions elsewhere, the name of this function makes me
>> expect it to be a verification function.
> Agree.
> Is valid_node_index() better? webrev.1 changed the name.
> Or are you suggesting to remove this and have the caller do it instead? 
> e.g.
> void A(uint node_index) {
>   uint valid_node_index = node_index;
>   if (!is_valid_node_index(valid_node_index)) {
>     valid_node_index = random_numa_index();
>   ...
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.inline.hpp
>>    29 #include "gc/shared/taskqueue.inline.hpp"
>>
>> This #include seems to be just to obtain randomParkAndMiller. It would be
>> better to refactor that function into its own file(s). Or make it part of
>> some component providing various random number generators.
> Not needed because of the change below.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.inline.hpp
>>    45   int seed = randomParkAndMiller(_seed);
>>    46   Atomic::store(seed, &_seed);
>>
>> This sequence may lose updates.  I don't know if that's important.
>> And why isn't this just using os::random(), or maybe refactoring that
>> slightly to gain access to the external seed version.  I don't want this
>> change to be made:
>> http://cr.openjdk.java.net/~sangheki/8220310/webrev.0/src/hotspot/share/gc/shared/taskqueue.inline.hpp.udiff.html 
>>
> Thomas and Stefan J. also complained about this.
> Changed to get the next numa index in a round-robin manner, i.e. it 
> doesn't use random() at all.
> 
>> And do we *really* need randomness here at all?  Is there a reason not
>> to simply cycle through the nodes?
> Not really.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os/linux/os_linux.cpp
>> 3016   size_t num_pages_left = size_in_bytes / page_size;
>>
>> Is size_in_bytes required to be a multiple of page_size?  I believe it 
>> is,
>> and an assert of that would be useful.
> This code is removed. And yes, to answer your question.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os/linux/os_linux.cpp
>> 3016   size_t num_pages_left = size_in_bytes / page_size;
>> 3017   const size_t num_pages_total = num_pages_left;
>> 3018   size_t num_buffer_entries = MIN2(num_pages_left, max_pages);
>>
>> I think clearer would be
>>
>> 3017   const size_t num_pages_total = size_in_bytes / page_size;
>> 3018   size_t num_buffer_entries = MIN2(num_pages_total, max_pages);
>>
>> and move this down to the loop setup:
>> 3016   size_t num_pages_left = num_pages_total;
> This code is removed.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os/linux/os_linux.cpp
>> 3020   void** pages = (void**)alloca(sizeof(void*) * num_buffer_entries);
>> 3021   int* status = (int*)alloca(sizeof(int) * num_buffer_entries);
>>
>> I think a ResourceMark and NEW_RESOURCE_ARRAYs would be preferable to
>> using alloca() here.  alloca() has some dangers and I think is currently
>> only used in some very specialized places where it really is needed.
> This code is removed.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os/linux/os_linux.cpp
>> 3061 }
>> 3062 // Moves current thread on a specific node and it will not 
>> migrate to
>>
>> Missing blank line between functions.
> This code is removed.
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os/linux/os_linux.cpp
>> 3031       pages[i] = (void*)base;
>>
>> Unnecessary cast.
> This code is removed.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os/linux/os_linux.cpp
>> 3072     case AFFINITY_NONE:
>> 3073       // Do nothing.
>> 3074       return true;
>> 3075       break;
>>
>> Shouldn't this be calling numa_run_on_node with -1 node id?  The
>> documentation for numa_run_on_node() says "Passing -1 permits the
>> kernel to schedule on all nodes again."
> My understanding of that description is to let the kernel decide the 
> numa node, which is not actually what we want here. We want to move the 
> current thread to the preferred numa node.
> 
> Currently we don't need this API (numa_set_affinity()), so it was 
> removed in webrev.1.
> However, a following patch, which is not in the scope of this JEP, will 
> use it.
> I will add your suggestion below when we need this.
> 
>>
>> But see also next comment about numa_affinity_set.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/os.hpp
>>   401   // Available affinities.
>>   402   enum NumaAffinity {
>>   403     AFFINITY_STRONG,
>>   404     AFFINITY_NONE
>>   405   };
>>   406   static bool numa_affinity_set(Thread* thread, int numa_id, 
>> NumaAffinity affinity);
>>
>> Could there ever be other affinities?  This API doesn't seem right.
>> It seems to me it should be something like
>>
>>    static bool numa_set_affinity(Thread* thread, int numa_id);
>>    static bool numa_clear_affinity(Thread* thread);
>>
>> No need for NumaAffinity enum.
> The enum is also kind of a remnant of the removed Solaris 
> implementation, which had more types.
> But I agree with you: as there are only 2 types now, it is simpler to 
> separate the APIs without the enum.
> 
>>
>> (I think numa_set/clear_affinity is better than numa_affinity_set/clear;
>> numa is a "namespace" and we usually set/clear/get_something rather
>> than something_set/clear/get.)
>>
>> Or maybe this should just be
>>
>>    static bool numa_set_affinity(Thread* thread, int numa_id);
>>
>> with "clearing" provided by passing AnyId as the numa_id?  Still don't
>> need the NumaAffinity enum for this.
> Agree on this naming stuff.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.cpp
>>   124   guarantee(rs.size() % page_size == 0, "Reserved space size (" 
>> SIZE_FORMAT ") should be aligned "
>>   141   guarantee(size_in_bytes % page_size == 0, "The given range (" 
>> SIZE_FORMAT ") should be aligned "
>>   209     guarantee((uintptr_t)G1HeapRegionSize % page_size == 0, 
>> "G1HeapRegionSize (" SIZE_FORMAT ") and page size ("
>>   234     guarantee((uintptr_t)_page_size % _heap_region_size == 0, 
>> "G1HeapRegionSize (" SIZE_FORMAT ") and page size ("
>>
>> Use is_aligned.  I think casts shouldn't be needed.
> These lines are removed after major refactoring.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.cpp
>>    43   _numa_id_to_index_map = NEW_C_HEAP_ARRAY_RETURN_NULL(uint, 
>> _len_numa_id_to_index_map, mtGC);
>>    44   memset(_numa_id_to_index_map, 0, sizeof(uint) * 
>> _len_numa_id_to_index_map);
>>
>> Using NEW_xxx_RETURN_NULL and then ignoring the possibility of a NULL
>> result.  It would be better to use the non-_RETURN_NULL variant and
>> get a decent error message rather than the segfault from attempting to
>> write to NULL.
> Changed to use non-_RETURN_NULL version.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.cpp
>>    34 void G1NUMA::init_numa_id_to_index_map(int* numa_ids, uint 
>> num_numa_ids) {
>>
>> int* numa_ids => const int* numa_ids.
> DONE
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.cpp
>>    34 void G1NUMA::init_numa_id_to_index_map(int* numa_ids, uint 
>> num_numa_ids) {
>> ...
>>    44   memset(_numa_id_to_index_map, 0, sizeof(uint) * 
>> _len_numa_id_to_index_map);
>> Rather than this memset and then writing to all of them later, instead
>> initialize all entries to InvalidNodeIndex and then assign from numa_ids.
> DONE
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1MemoryTouchThread.cpp
>>   261   _wakeup_monitor = new Monitor(Mutex::nonleaf, "Wakeup Monitor 
>> for Touch Control", true, Monitor::_safepoint_check_never);
>>   262   _sync_monitor = new Monitor(Mutex::nonleaf, "Sync Monitor for 
>> Touch Control", true, Monitor::_safepoint_check_never);
>>
>> For singleton mutex/monitor, we generally prefer to put them in 
>> mutexLocker.
>> Probably passed as constructor arguments to G1NUMAThreadsControl.  And
>> I know there are counter-examples.
> Not necessary as memory touch thread is gone.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>>    30 extern const uint AnyNodeIndex;
>>    31 extern const uint InvalidNodeIndex;
>>
>> Why are these at namespace scope.  And they seem misplaced here.
>> Seems like they should belong to the G1NUMA class, which is
>> responsible for indexing nodes and mapping between indexes and ids.
> Changed to G1MemoryNodeManager::AnyNodeIndex.
> And I wanted to hide or minimize exposing G1NUMA in general.
> The memory node manager also has index related APIs, so letting the 
> manager have such constants seems okay.
> 
>>
>>    30 const uint AnyNodeIndex = (uint)os::AnyId;
>>    31 const uint InvalidNodeIndex = (uint)os::InvalidId;
>> And why these values?  That seems like a units mismatch.  Just use
>> UINT_MAX and UINT_MAX - 1.
> I don't have a strong opinion on this.
> But the intent is that constants with the same meaning have the same 
> value after casting, instead of randomly chosen ones.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.hpp
>>    80   // For invalid numa id, return InvalidNodeIndex. So the caller 
>> need exception handling.
>>
>> "need exception handling"?
> Thomas also commented on it, so I reworded it. But in the end the class 
> was removed.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegionManager.cpp
>>   114     uint allocated_node_index;
>>
>> This variable gets assigned via an out parameter here:
>>   116     hr = _free_list.remove_region_with_node_index(from_head, 
>> valid_node_index, &allocated_node_index);
>>
>> and by assignment here:
>>   122         allocated_node_index = 
>> mgr->node_index_of_address(hr->bottom());
>>
>> but is never used. If the lack of use is not itself a mistake, then 
>> the out
>> parameter of remove_region_with_node_index appears to be unnecessary.
> The variable will be used in patch 3 (JDK-8220312), but I slightly 
> changed remove_region_with_node_index() to allow NULL for the 
> allocated_node_index parameter.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegionSet.inline.hpp
>>   151 inline HeapRegion* 
>> FreeRegionList::remove_region_with_node_index(bool from_head,
>>
>> As written, I think this might always return NULL if 
>> requested_node_index is
>> not a valid index. I think callers ensure that it's valid, but an 
>> assert of
>> the argument would help.
> Done
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegionSet.inline.hpp
>>   175     if (cur == NULL || cur_depth >= max_search_depth) { // not 
>> found
>>
>> There's an off-by-one issue here. The max_search_depth time through the
>> while-control-expression may have found a matching region, checked via 
>> the
>> numa_index_of_address lookup.
>>
>> Either the test should be "current_numa_index != requested_numa_index" or
>> the loop should be changed.
> This code is removed.
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegionSet.inline.hpp
>>   151 inline HeapRegion* 
>> FreeRegionList::remove_region_with_node_index(bool from_head,
>>
>> I think the list manipulation could be simplified quite a bit.  I 
>> think we
>> don't need the "last" variable.  And the splicing out of cur can be 
>> unified
>> between the from_head and !from_head cases, e.g. something like this:
>>
>>    // Find the region to use, searching from _head or _tail as requested.
>>    if (from_head) {
>>      for (cur = _head;
>>           cur != NULL && cur_depth < max_search_depth;
>>           cur = cur->next(), ++cur_depth) {
>>        if (current_numa_index == numa->numa_index_of_address(cur->bottom())) {
>>          break;
>>        }
>>      }
>>    } else {
>>      for (cur = _tail;
>>           cur != NULL && cur_depth < max_search_depth;
>>           cur = cur->prev(), ++cur_depth) {
>>        if (current_numa_index == numa->numa_index_of_address(cur->bottom())) {
>>          break;
>>        }
>>      }
>>    }
>>    if (cur == NULL || cur_depth >= max_search_depth) {
>>      return NULL;                      // Didn't find a region to use
>>    }
>>    // Splice the region out of the list.
>>    HeapRegion* prev = cur->prev();
>>    HeapRegion* next = cur->next();
>>    if (prev == NULL) {
>>      _head = next;
>>    } else {
>>      prev->set_next(next);
>>    }
>>    if (next == NULL) {
>>      _tail = prev;
>>    } else {
>>      next->set_prev(prev);
>>    }
>>    cur->set_prev(NULL);
>>    cur->set_next(NULL);
>>    ...
> Nice suggestion!
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.cpp
>>   166 class DumpIterator : public StackObj {
>>
>> Polluting global namespace.  Suggest making DumpIterator and the classes
>> derived from it (G1HeapRegion{Larger,Smaller}Iterator) private nested
>> classes of G1NUMA, making the names unimportant outside of that context.
> Not needed any more.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heterogeneousHeapRegionManager.hpp
>>   122   virtual HeapRegion* allocate_free_region(HeapRegionType type);
>>   123   virtual HeapRegion* allocate_free_region(HeapRegionType type, 
>> uint node_index);
>>
>> The base class (HeapRegionManager) no longer has the one-arg 
>> function.  So
>> this one-arg function is no longer overriding the base class definition.
>> (Too bad it wasn't declared as C++11 "override".)
>>
>> I don't see any one-arg callers, so seems like it can just be removed.
> DONE
> Nice finding! :)
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1PageBasedNUMAVirtualSpace.cpp
>>
>> I dislike the new structure here, with G1PageBasedVirtualSpace now being
>> both a base class and a concrete class (which is often a problematic
>> design).  And the new derived class doesn't really do all that much.
>>
>> It seems like it would be much simpler to just have 
>> G1PageBasedVirtualSpace
>> check UseNUMA to conditionalize it's commit and uncommit functions.
>> Especially since it's commit already needs to have some knowlege of NUMA
>> because of the new numa_index argument.
> DONE
> Removed the NUMA version and reverted the related changes that allowed 
> inheritance.
> Now, G1PageBasedVirtualSpace does additional work after checking NUMA.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1PageBasedNUMAVirtualSpace.cpp
>>    52 void G1PageBasedNUMAVirtualSpace::uncommit(size_t start_page, 
>> size_t size_in_pages) {
>>
>> Do we actually need to do anything here?  We're resetting the
>> G1NUMA::_numa_id_of_pages_table (to AnyId), but it's not clear we really
>> need to do that.  When we later recommit any of those pages, that 
>> table will
>> be updated appropriately.
>>
>> Maybe rather than having G1NUMA expose quite so much detail it should
>> instead have post_commit and a pre_uncommit functions.
> I think clearing the node index may help with debugging. Unlike 
> webrev.0, the numa id/index information is located in the HeapRegion, 
> which is much simpler; i.e. HeapRegionManager::uncommit_regions() resets 
> the node index to InvalidNodeIndex.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1NUMA.hpp
>>    75   G1NUMAPageIdTable _numa_id_of_pages_table;
>>
>> I was surprised by the need for this, but I can't find anything in the 
>> NUMA
>> API that provides this information.  That seems strange, because one 
>> would
>> think that information must exist somewhere in the NUMA implementation.
> Removed.
> And this made the patch much simpler!
> 
> And here are your comments from other email.
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>>    64   // Returns node index of the given address.
>>    65   // If the node id of the address is invalid return randomly 
>> selected node index.
>>    66   virtual uint valid_index_of_address(HeapWord* o) const { 
>> return 0; }
>>
>> This function is unused.
> Removed.
> 
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1MemoryNodeManager.hpp
>>    70   virtual uint node_index_of_address(HeapWord* o) const { return 
>> 0; }
>>
>> The only caller of this function is 
>> HeapRegionManager::allocate_free_region.
>> As noted in an earlier comment, the result of that call is unused.
> Removed as you commented, but I had to revert that as a part of the 
> _numa_id_of_pages_table removal.
> Previously we retrieved the page info from _numa_id_of_pages_table, but 
> the new version reads it directly via a system call.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>>
>> As pointed out to me offline, there *are* ways to query the address ->
>> associated numa node mapping.  So I'm going back and questioning the need
>> for _numa_id_of_pages_table and the code for managing it.
> As we discussed offline, I removed this table. So webrev.1 checks the 
> bottom of the HeapRegion and then saves it in HeapRegion::_node_index. 
> However, we do need a way to check the node index of a given address, so 
> I added it back.
> 
> From my testing, checking the numa id of one page seems okay to do at 
> runtime. Previously I checked all pages of the given memory, so it was a 
> bit slow.
> 
> One thing that webrev.1 doesn't include is unifying this with the 
> existing code: the newly introduced os::numa_get_address_id() is a copy 
> from zNUMA_linux.cpp. They may be unified, but I didn't try that yet.
> 
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.1
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.1.inc (this may not 
> help much! :) )
> Testing: hs-tier 1 ~ 5 (with/without UseNUMA)
> 
> Thanks,
> Sangheon
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>>
> 

