RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3)
Stefan Johansson
stefan.johansson at oracle.com
Thu Oct 31 15:31:39 UTC 2019
Hi Sangheon,
On 2019-10-23 08:39, sangheon.kim at oracle.com wrote:
> Hi Thomas,
>
> I am posting the next webrev as Kim is waiting for it.
>
> Webrev:
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.3
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.3.inc
Here are my comments:
src/hotspot/share/gc/g1/g1CollectedHeap.hpp
---
2397 st->print(" remaining free region(s) from each node id: ");
What do you think about changing this to "... region(s) on each NUMA
node: "? I think we should be clear about the logging being for NUMA.
---
src/hotspot/share/gc/g1/g1EdenRegions.hpp
---
33 class G1EdenRegions : public G1RegionCounts {
I don’t think G1EdenRegions *is a* G1RegionCounts; rather, it should
*have* one. So instead of using inheritance here, I think G1EdenRegions
should hold a G1RegionCounts. Instead of overloading length(), I would
then suggest adding a region_count(uint node_index) to get the per-node
count. The same goes for G1SurvivorRegions.
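To make the composition shape concrete, here is a rough standalone sketch
(simplified names and members, not the actual HotSpot classes):

```cpp
#include <cstddef>
#include <vector>

class RegionCounts {               // stand-in for G1RegionCounts
  std::vector<size_t> _counts;     // one counter per NUMA node
public:
  explicit RegionCounts(unsigned num_nodes) : _counts(num_nodes, 0) {}
  void increment(unsigned node_index) { _counts[node_index]++; }
  size_t count(unsigned node_index) const { return _counts[node_index]; }
};

class EdenRegions {                // stand-in for G1EdenRegions
  unsigned _length;
  RegionCounts _counts;            // has-a, instead of is-a
public:
  explicit EdenRegions(unsigned num_nodes) : _length(0), _counts(num_nodes) {}
  void add(unsigned node_index) { _length++; _counts.increment(node_index); }
  unsigned length() const { return _length; }              // total, unchanged
  size_t region_count(unsigned node_index) const {         // new per-node getter
    return _counts.count(node_index);
  }
};
```

This keeps length() meaning "total eden length" while the per-node query
gets its own, clearly named accessor.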
---
src/hotspot/share/gc/g1/g1NUMA.cpp
---
279 bool NodeIndexCheckClosure::do_heap_region(HeapRegion* hr) {
280 uint preferred_node_index =
_numa->preferred_node_index_for_index(hr->hrm_index());
281 uint active_node_index = _numa->index_of_address(hr->bottom());
282
283 if (preferred_node_index == active_node_index) {
284 _matched[preferred_node_index]++;
285 } else if (active_node_index == G1NUMA::UnknownNodeIndex) {
286 _unknown++;
287 }
288 _total++;
289
290 return false;
291 }
As we discussed offline, I would like to know about the mismatches as
well. I think the easiest approach would be to keep the total count per
node too; that way we can see whether any regions didn't match. What do
you think about printing the info like this:
[3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification (actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0
When testing this I also realized the output is problematic in the case
where we have committed regions that have not yet been used. The manual
for get_mempolicy (the way we get the NUMA id for an address) says:
"If no page has yet been allocated for the specified address,
get_mempolicy() will allocate a page as if the thread had performed a
read (load) access to that address, and return the ID of the node where
that page was allocated."
Doing a read access seems to always place the page on NUMA node 0, so
the accounting will not be correct in this case.
One way to fix this would be to only do the accounting for regions
currently in use (!hr->is_free()), but I'm not sure that is exactly what
we want; at least not if we only do this after the GC, because then only
the survivor and old regions would be checked. We could solve that by
also doing the verification before the GC. I think this might be the way
to go, what do you think? If my proposal was hard to follow, here's a
patch:
http://cr.openjdk.java.net/~sjohanss/numa/verify-alternative/
The output from this patch would be:
[9,233s][trace][gc,heap,numa ] GC(18) GC Start: NUMA region verification (actual/expected): 0: 358/358, 1: 361/361, Unknown: 0
[9,306s][trace][gc,heap,numa ] GC(18) GC End: NUMA region verification (actual/expected): 0: 348/348, 1: 347/347, Unknown: 0
One can also see that this verification takes some time, so maybe it
would make sense to have this logging under gc+numa+verify.
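The per-node accounting I have in mind could look roughly like this (a
standalone sketch; the struct and field names are hypothetical, not the
actual closure from the webrev or my patch):

```cpp
#include <cstddef>
#include <vector>

// Per-node verification tallies: for every checked region we bump the
// total for its preferred node, and the matched count only when the
// region actually lives there, so matched[i] < total[i] exposes a mismatch.
struct NodeVerification {
  std::vector<size_t> _matched;  // regions found on their preferred node
  std::vector<size_t> _total;    // all checked regions, keyed by preferred node
  size_t _unknown;               // pages get_mempolicy could not attribute

  explicit NodeVerification(unsigned num_nodes)
    : _matched(num_nodes, 0), _total(num_nodes, 0), _unknown(0) {}

  void record(unsigned preferred_node_index,
              unsigned active_node_index,
              unsigned unknown_node_index) {
    _total[preferred_node_index]++;
    if (active_node_index == preferred_node_index) {
      _matched[preferred_node_index]++;
    } else if (active_node_index == unknown_node_index) {
      _unknown++;
    }
  }
};
```

Printing matched/total per node then gives exactly the
"actual/expected" pairs shown in the log lines above.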
---
234 uint converted_req_index = requested_node_index;
235 if(converted_req_index == AnyNodeIndex) {
236 converted_req_index = _num_active_node_ids;
237 }
238 if (converted_req_index <= _num_active_node_ids) {
239 _times->update(phase, converted_req_index, allocated_node_index);
240 }
I had to read this more than once to understand what it really does, and
I think we can simplify it a bit by just using an if-else that checks
for AnyNodeIndex and, if it matches, passes _num_active_node_ids to
update(). This should be fine since requested_node_index can never be
larger than _num_active_node_ids.
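The index selection I'm suggesting boils down to this (a hypothetical
standalone model, with the AnyNodeIndex sentinel value assumed here):

```cpp
#include <climits>

// Assumed sentinel for "no particular node requested".
static const unsigned AnyNodeIndex = UINT_MAX;

// Returns the slot passed to _times->update(): AnyNodeIndex requests go
// into the extra "any" slot at index num_active_node_ids; every other
// request is already a valid node index, so no range check is needed.
unsigned update_slot(unsigned requested_node_index,
                     unsigned num_active_node_ids) {
  return (requested_node_index == AnyNodeIndex) ? num_active_node_ids
                                                : requested_node_index;
}
```

With this shape the conversion variable and the trailing bounds check
both disappear.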
---
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
---
I would prefer that we hide all the accounting in helper functions; it
might be good to declare them inline.
85 if (_numa->is_enabled()) {
86 LogTarget(Info, gc, heap, numa) lt;
87
88 if (lt.is_enabled()) {
89 uint num_nodes = _numa->num_active_nodes();
90 // Record only if there are multiple active nodes.
91 _obj_alloc_stat = NEW_C_HEAP_ARRAY(size_t, num_nodes, mtGC);
92 memset((void*)_obj_alloc_stat, 0, sizeof(size_t) * num_nodes);
93 }
94 }
Move to something like initialize_numa_stats().
108 if (_obj_alloc_stat != NULL) {
109 uint node_index = _numa->index_of_current_thread();
110 _numa->copy_statistics(G1NodeTimes::LocalObjProcessAtCopyToSurv, node_index, _obj_alloc_stat);
111 }
This could be called flush_numa_stats().
268 if (_obj_alloc_stat != NULL) {
269 _obj_alloc_stat[node_index]++;
270 }
And this something like update_numa_stats(uint).
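Pulled together, the three helpers could look something like this (a
simplified sketch with the suggested names; the class and the flush
destination are stand-ins, not the real G1ParScanThreadState):

```cpp
#include <cstddef>
#include <cstring>

class PerThreadAllocStats {        // stand-in for the per-thread state
  size_t* _obj_alloc_stat;
  unsigned _num_nodes;

public:
  PerThreadAllocStats() : _obj_alloc_stat(nullptr), _num_nodes(0) {}
  ~PerThreadAllocStats() { delete[] _obj_alloc_stat; }

  // Allocate and zero the per-node counters (only when NUMA logging is on).
  inline void initialize_numa_stats(unsigned num_nodes) {
    _num_nodes = num_nodes;
    _obj_alloc_stat = new size_t[num_nodes];
    std::memset(_obj_alloc_stat, 0, sizeof(size_t) * num_nodes);
  }

  // Bump the counter for the node an object was copied to.
  inline void update_numa_stats(unsigned node_index) {
    if (_obj_alloc_stat != nullptr) {
      _obj_alloc_stat[node_index]++;
    }
  }

  // flush_numa_stats() would hand the counters to G1NUMA; here we just
  // expose them so the shape is visible.
  inline size_t count(unsigned node_index) const {
    return _obj_alloc_stat[node_index];
  }
};
```

The call sites in the constructor, flush path, and copy path then shrink
to one line each.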
---
src/hotspot/share/gc/g1/heapRegionSet.hpp
---
159 inline void update_length(HeapRegion* hr, bool increase);
254 inline void update_length(HeapRegion* hr, bool increase);
Is there any reason for update_length taking a bool rather than having
one function for increments and one for decrements? To me it looks like
all the uses are well defined, and separate functions would make the
code easier to read. I also think we could pass in the node index rather
than the HeapRegion, since the getter length() already takes the node
index.
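That is, something along these lines (a simplified sketch, not the real
heapRegionSet; the class name here is made up):

```cpp
#include <vector>

class NodeLengths {                // stand-in for the per-node length tracking
  std::vector<unsigned> _lengths;  // one length per NUMA node
public:
  explicit NodeLengths(unsigned num_nodes) : _lengths(num_nodes, 0) {}

  // Two explicit operations instead of update_length(HeapRegion*, bool),
  // both keyed by node index like the existing getter.
  void increment_length(unsigned node_index) { _lengths[node_index]++; }
  void decrement_length(unsigned node_index) { _lengths[node_index]--; }

  unsigned length(unsigned node_index) const { return _lengths[node_index]; }
};
```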
---
src/hotspot/share/gc/g1/g1NodeTimes.cpp
---
First, a question about the names: G1NodeTimes signals that it has to do
with timing, but currently we don't really record any timings. The same
goes for NodeStatPhases; these are not really the same kind of phases
that we have for the rest of the GC logging. What do you think about
renaming the class to G1NUMAStats and the enum to NodeDataItems?
166 void G1NodeTimes::print_phase_info(G1NodeTimes::NodeStatPhases phase) {
167 LogTarget(Info, gc, heap, numa) lt;
I think this should be on debug level, but if you don't agree leave it
as is.
---
191 void G1NodeTimes::print_mutator_alloc_stat_debug() {
192 LogTarget(Debug, gc, heap, numa) lt;
And if you agree on moving the above to debug I think this should be on
trace level.
---
This is it for now. Thanks,
Stefan
> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost finished
> without new failures.
>
> Thanks,
> Sangheon
>
>