RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3)
Stefan Johansson
stefan.johansson at oracle.com
Mon Nov 4 16:20:11 UTC 2019
Hi Sangheon,
Thanks for addressing my comments, some comments inline and additional
comments below.
On 2019-11-02 07:08, sangheon.kim at oracle.com wrote:
>> ...
>> As we discussed offline, I would like to know the mismatches as well,
>> I think the easiest approach would be to make the total count per node
>> as well and that way we can see if there were any regions that didn't
>> match. What do you think about printing the info like this:
>> [3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification
>> (actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0
> Changed as you suggested to have per node but deleted unknown as unknown
> is 'total - matched'.
I'm fine with skipping 'Unknown', but not all regions in (total -
matched) must be 'Unknown' since there is the possibility of a real
mismatch as well.
>
>>
>> When testing this I also realized this output is problematic in the
>> case where we have committed regions that have not yet been used.
>> Reading the manual for get_mempolicy (the way we get the numa id for
>> the address) say:
>> "If no page has yet been allocated for the specified address,
>> get_mempolicy() will allocate a page as if the thread had performed a
>> read (load) access to that address, and return the ID of the node
>> where that page was allocated."
> Nice catch.
>
> Shortly saying get_mempolicy() doesn't honor os::numa_make_local() call
> that we previously requested. And this problem occurs when
> AlwaysPreTouch is disabled.
>
> However, my initial implementation which uses 'numa_move_pages()'
> doesn't have this problem. So one fundamental solution would be
> replacing linux implementation.
> In current scope of 3 patches, there will be no problem if we add
> 'hr->free && !AlwaysPreTouch' condition check however
> os::numa_get_group_id_for_address() will still have such limitation.
>
> What do you think about changing the Linux implementation?
> (webrev link is added at the end)
>
I agree that we should use numa_move_pages(), but I think we might want
to change things a bit more because of this. I will add those comments
below the patch to have all code comments in one place.
>> ... >> One can also see that this verification takes some time, so maybe it
>> would make sense to have this logging under gc+numa+verify.
> I think if we avoid calling G1NUMAA:index_of_address() if 'hr->free() &&
> !AlwaysPreTouch' but count total, we will be fine.
> Basically I imported your patch but the verification is only happening
> at the beginning of GC.
Any reason for not including it at the end of GC as well? I guess that
could be good to be able to diagnose if there was an increase imbalance
due to the collection.
>> ...
>> 166 void G1NodeTimes::print_phase_info(G1NodeTimes::NodeStatPhases
>> phase) {
>> 167 LogTarget(Info, gc, heap, numa) lt;
>>
>> I think this should be on debug level, but if you don't agree leave it
>> as is.
> I feel Info seems okay, so let me leave as is.
>
>> ---
>>
>> 191 void G1NodeTimes::print_mutator_alloc_stat_debug() {
>> 192 LogTarget(Debug, gc, heap, numa) lt;
>>
>> And if you agree on moving the above to debug I think this should be
>> on trace level.
> As is, please.
I'm okay with that, I just wanted to point out that compared to the
other G1 logging on Info/Debug level these seemed to be one off.
>
> Here's the webrev:
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4
Some additional comments:
src/hotspot/share/gc/g1/g1CollectedHeap.cpp
---
2598 void G1CollectedHeap::verify_numa_regions() {
2599 LogTarget(Trace, gc, heap, numa) lt;
I realize I didn't include that in my example patch, but did you have
anything against changing the tags here to 'gc, heap, verify'?
---
src/hotspot/share/gc/g1/g1NUMA.cpp
---
281 _ls->print("%d: %u / %u, ", numa_ids[i], _matched[i], _total[i]);
This will leave a trailing ',' now when we don't have the 'Unknown'
output. I think just having a space as the separator between the nodes
could be fine to avoid this.
---
292 if (hr->is_free() && !AlwaysPreTouch) {
293 active_node_index = G1NUMA::UnknownNodeIndex;
294 } else {
295 active_node_index = _numa->index_of_address(hr->bottom());
296 }
297
298 if (preferred_node_index == active_node_index) {
299 _matched[preferred_node_index]++;
300 }
301 _total[preferred_node_index]++;
As I touched upon above, I think we still lack some important
information here. If we are to look at all committed regions, which
probably is good, we need to keep a mismatch count as well. Otherwise we
can't know if the diff between matched and total is just unused regions
or actual mismatches without relying on other logs and analysis. Or am I
missing something?
--
src/hotspot/share/gc/g1/g1SurvivorRegions.hpp
---
30 class G1RegionCounts;
Just having a forward declaration here is not enough, so please add back
the include.
---
src/hotspot/share/gc/g1/g1NUMAStats.hpp
---
99 void print_phase_info(G1NUMAStats::NodeDataItems phase);
Sorry for being so picky on the naming, but since we changed the class
and enum I think this method, the comments and the parameters should
change as well. For example:
void print_info(G1NUMAStats::NodeDataItems type);
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc.numa_move_pages
As I said above, I think we should go with this implementation of
getting the node given an address, but I have a comment:
src/hotspot/os/linux/os_linux.cpp
---
3018 if (os::Linux::numa_move_pages(0, 1, pages, NULL, &id, 0) == -1) {
3019 return -1;
3020 }
3021 if (id < 0) {
3022 return -1;
3023 }
I think we should differ between the case where the call fails (return
-1) and the id returned is negative. From reading the man_pages for
'move_pages()' (used by numa_move_pages()) it seems we can expect to
either get -EFAULT (for normal pages) or -ENOENT (for large pages) when
a region has not yet been used. I'm not completely sure how much logic
we want to add to this method, so maybe creating a RFE to look at this
in more detail later is good. But for this first version I think your
proposal is good.
One thought I just got, with this implementation is more like 'Unused',
rather then 'Unknown'. But such a notion change could also be looked at
in the RFE.
---
Thanks,
Stefan
>
> Testing: hs-tier 1 ~ 5 with / without UseNUMA.
>
> Thanks,
> Sangheon
>
>
>> ---
>>
>> This is it for now. Thanks,
>> Stefan
>>
>>
>>> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost
>>> finished without new failures.
>>>
>>> Thanks,
>>> Sangheon
>>>
>>>
>
More information about the hotspot-gc-dev
mailing list