RFR: 8298482: Implement ParallelGC NUMAStats for Linux [v3]

Stefan Johansson sjohanss at openjdk.org
Tue Jan 10 11:46:56 UTC 2023


On Thu, 5 Jan 2023 09:53:06 GMT, Nick Gasson <ngasson at openjdk.org> wrote:

>> ParallelGC has a seemingly useful option -XX:+NUMAStats that prints detailed information in GC.heap_info about which NUMA node pages in the eden space are bound to.  However as far as I can tell this only ever worked on Solaris and is not implemented on any of the systems we currently support.  This patch implements it on Linux using the move_pages system call.
>> 
>> The function os::get_page_info() and accompanying struct page_info was just a thin wrapper around the Solaris meminfo(2) syscall and was never ported to other systems so I've just removed it rather than try to emulate its interface.
>> 
>> There's also a method MutableNUMASpace::LGRPSpace::scan_pages() which attempts to find pages on the wrong NUMA node and frees them so that they have another chance to be allocated on the correct node by the first-touching thread, but I think this has always been a no-op on non-Solaris so perhaps should also be removed.  On Linux it shouldn't be necessary as you can bind pages to the desired node directly.
>> 
>> I don't know what the performance of this option was like on Solaris but on Linux the move_pages call can be quite slow: I measured about 25ms/GB on my system.  At the moment we call LGRPSpace::accumulate_statistics() twice per GC cycle: I removed the second call as it's likely to see a lot of uncommitted pages if the spaces were just resized. MutableNUMASpace::print_on() also calls accumulate_statistics() directly and since that's the only place this data is used, maybe we can drop the call from MutableNUMASpace::accumulate_statistics() as well?
>> 
>> Example output:
>> 
>> 
>>  PSYoungGen      total 4290560K, used 835628K [0x00000006aac00000, 0x0000000800000000, 0x0000000800000000)
>>   eden space 3096576K, 1% used [0x00000006aac00000,0x00000007176a9f48,0x0000000767c00000)
>>     lgrp 0 space 1761280K, 2% used [0x00000006aac00000,0x00000006acfc4980,0x0000000716400000)
>>     local/remote/unbiased/uncommitted: 1671168K/0K/0K/90112K, large/small pages: 0/440320
>>     lgrp 1 space 1335296K, 46% used [0x0000000716400000,0x000000073c2abb18,0x0000000767c00000)
>>     local/remote/unbiased/uncommitted: 1335296K/0K/0K/0K, large/small pages: 0/333824
>>   from space 1193984K, 65% used [0x00000007b7200000,0x00000007e6b9c778,0x0000000800000000)
>>   to   space 1247232K, 0% used [0x0000000767c00000,0x0000000767c00000,0x00000007b3e00000)
>> 
>> 
>> After testing this with SPECjbb for a while I noticed some pages always end up bound to the wrong node.  I think this is a regression caused by JDK-8283935 but I'll raise a separate ticket for that.
>
> Nick Gasson has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Do not update NUMAStats in MutableNUMASpace::accumulate_statistics()

Looks good, just a style comment below.

Regarding the need of calls to `accumulate_statistics()` I also agree that the call in `print_on` should be enough.

src/hotspot/share/gc/parallel/mutableNUMASpace.cpp line 896:

> 894:           space_stats()->_local_space += os::vm_page_size();
> 895:         else
> 896:           space_stats()->_remote_space += os::vm_page_size();

Please add braces for the `for`-loop and the `if-else`-statements.

-------------

Marked as reviewed by sjohanss (Reviewer).

PR: https://git.openjdk.org/jdk/pull/11635


More information about the hotspot-gc-dev mailing list