Integrated: 8298482: Implement ParallelGC NUMAStats for Linux

Nick Gasson ngasson at openjdk.org
Thu Jan 12 09:32:26 UTC 2023


On Mon, 12 Dec 2022 15:19:21 GMT, Nick Gasson <ngasson at openjdk.org> wrote:

> ParallelGC has a seemingly useful option -XX:+NUMAStats that prints detailed information in GC.heap_info about which NUMA node the pages in the eden space are bound to.  However, as far as I can tell, this only ever worked on Solaris and is not implemented on any of the systems we currently support.  This patch implements it on Linux using the move_pages system call.
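
As a rough illustration of the approach described above (this is not the code from the patch), move_pages can be used in query-only mode: when the nodes array is null, the kernel moves nothing and instead reports, for each page, the node it currently resides on or a negative errno. The names query_page_nodes, classify_pages and NumaCounts are hypothetical. The raw syscall() is used only to avoid a libnuma dependency. Counting every negative status as "uncommitted" is a simplifying assumption, and the "unbiased" category is left out.

```cpp
#include <sys/syscall.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>

// Hypothetical helper, not HotSpot code. Returns one entry per page in
// [start, start + page_count * page_size): a NUMA node id (>= 0) or a
// negative errno (for example -ENOENT for a page that is not present).
// Returns an empty vector if there is nothing to query or the call fails.
std::vector<int> query_page_nodes(void* start, size_t page_count, size_t page_size) {
  assert(page_size > 0);
  if (page_count == 0) {
    return std::vector<int>();
  }
  std::vector<void*> pages(page_count);
  for (size_t i = 0; i < page_count; i++) {
    pages[i] = static_cast<char*>(start) + i * page_size;
  }
  std::vector<int> status(page_count, -1);
  // pid 0 means the calling process; a null nodes array means "query only".
  long rc = syscall(SYS_move_pages, 0,
                    static_cast<unsigned long>(page_count),
                    pages.data(),
                    static_cast<const int*>(nullptr),
                    status.data(), 0);
  if (rc != 0) {
    status.clear();
  }
  return status;
}

// Hypothetical tally of the per-page results for one lgrp space.
struct NumaCounts {
  size_t local = 0;
  size_t remote = 0;
  size_t uncommitted = 0;
};

// Assumption: any negative status is counted as uncommitted.
NumaCounts classify_pages(const std::vector<int>& status, int expected_node) {
  NumaCounts counts;
  for (int s : status) {
    if (s < 0) {
      counts.uncommitted++;
    } else if (s == expected_node) {
      counts.local++;
    } else {
      counts.remote++;
    }
  }
  return counts;
}
```

A caller would pass the start address and page count of one lgrp space to query_page_nodes() and then the result, together with that space's node id, to classify_pages(). In sandboxed environments the system call may be rejected, in which case the sketch simply returns an empty vector.
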
> 
> The function os::get_page_info() and the accompanying struct page_info were just a thin wrapper around the Solaris meminfo(2) syscall and were never ported to other systems, so I've just removed them rather than try to emulate their interface.
> 
> There's also a method MutableNUMASpace::LGRPSpace::scan_pages() which attempts to find pages on the wrong NUMA node and frees them so that they have another chance to be allocated on the correct node by the first-touching thread, but I think this has always been a no-op on non-Solaris systems, so perhaps it should also be removed.  On Linux it shouldn't be necessary as you can bind pages to the desired node directly.
> 
> I don't know what the performance of this option was like on Solaris, but on Linux the move_pages call can be quite slow: I measured about 25ms/GB on my system.  At the moment we call LGRPSpace::accumulate_statistics() twice per GC cycle: I removed the second call as it's likely to see a lot of uncommitted pages if the spaces were just resized.  MutableNUMASpace::print_on() also calls accumulate_statistics() directly, and since that's the only place this data is used, maybe we can drop the call from MutableNUMASpace::accumulate_statistics() as well?
> 
> Example output:
> 
> 
>  PSYoungGen      total 4290560K, used 835628K [0x00000006aac00000, 0x0000000800000000, 0x0000000800000000)
>   eden space 3096576K, 1% used [0x00000006aac00000,0x00000007176a9f48,0x0000000767c00000)
>     lgrp 0 space 1761280K, 2% used [0x00000006aac00000,0x00000006acfc4980,0x0000000716400000)
>     local/remote/unbiased/uncommitted: 1671168K/0K/0K/90112K, large/small pages: 0/440320
>     lgrp 1 space 1335296K, 46% used [0x0000000716400000,0x000000073c2abb18,0x0000000767c00000)
>     local/remote/unbiased/uncommitted: 1335296K/0K/0K/0K, large/small pages: 0/333824
>   from space 1193984K, 65% used [0x00000007b7200000,0x00000007e6b9c778,0x0000000800000000)
>   to   space 1247232K, 0% used [0x0000000767c00000,0x0000000767c00000,0x00000007b3e00000)
> 
> 
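
The figures in the example output can be cross-checked with a little arithmetic. The sketch below is only an illustration: the struct and function names are hypothetical, and the 4K small page size is an assumption (it is the size for which the reported small page counts match the space sizes). It also applies the 25ms/GB figure quoted earlier to the eden size shown, which is only an extrapolation from one machine's measurement.

```cpp
#include <cstdint>

// Figures copied from the example output; sizes are in K as printed.
struct LgrpStats {
  uint64_t space_k;
  uint64_t local_k;
  uint64_t remote_k;
  uint64_t unbiased_k;
  uint64_t uncommitted_k;
  uint64_t large_pages;
  uint64_t small_pages;
};

constexpr uint64_t kSmallPageK = 4;      // assumption: 4K small pages
constexpr uint64_t kEdenK = 3096576;     // eden space size from the output

// The four categories should add up to the size of the lgrp space.
constexpr bool categories_cover_space(const LgrpStats& s) {
  return s.local_k + s.remote_k + s.unbiased_k + s.uncommitted_k == s.space_k;
}

// With no large pages, the small pages alone should cover the space.
constexpr bool small_pages_cover_space(const LgrpStats& s) {
  return s.large_pages == 0 && s.small_pages * kSmallPageK == s.space_k;
}

// Estimated time for one statistics pass at a given cost per GB.
constexpr double estimated_scan_ms(uint64_t size_k, double ms_per_gb) {
  return static_cast<double>(size_k) / (1024.0 * 1024.0) * ms_per_gb;
}

constexpr LgrpStats lgrp0{1761280, 1671168, 0, 0, 90112, 0, 440320};
constexpr LgrpStats lgrp1{1335296, 1335296, 0, 0, 0, 0, 333824};

static_assert(categories_cover_space(lgrp0), "lgrp 0 categories sum to its size");
static_assert(categories_cover_space(lgrp1), "lgrp 1 categories sum to its size");
static_assert(small_pages_cover_space(lgrp0), "lgrp 0 page count matches its size");
static_assert(small_pages_cover_space(lgrp1), "lgrp 1 page count matches its size");
static_assert(lgrp0.space_k + lgrp1.space_k == kEdenK, "lgrp spaces tile eden");
```

Under these assumptions, estimated_scan_ms(kEdenK, 25.0) comes to roughly 74 ms for one pass over an eden of this size.
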
> After testing this with SPECjbb for a while I noticed some pages always end up bound to the wrong node.  I think this is a regression caused by JDK-8283935 but I'll raise a separate ticket for that.

This pull request has now been integrated.

Changeset: 036c8084
Author:    Nick Gasson <ngasson at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/036c80844e30559bdced3587bb70b29ee38af498
Stats:     64 lines in 7 files changed: 10 ins; 30 del; 24 mod

8298482: Implement ParallelGC NUMAStats for Linux

Reviewed-by: ayang, sjohanss, tschatzl

-------------

PR: https://git.openjdk.org/jdk/pull/11635

