RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size
Erik Österlund
eosterlund at openjdk.org
Wed Jun 7 14:14:55 UTC 2023
On Wed, 7 Jun 2023 13:30:05 GMT, Stefan Karlsson <stefank at openjdk.org> wrote:
> ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this, NMT and ZGC are not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated.
>
> I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory.
>
> FWIW, given that NMT and ZGC don't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those addresses for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this bug and the associated PR.
>
> I've written a small sanity test for the NMT Java Heap values; however, it's non-trivial to write a test that efficiently provokes this. I've verified this fix by manually running an over-provisioned SPECjbb2015 run, which results in a lot of splitting of ZGC heap regions, which in turn gives us multiple virtual memory area mappings for the same physical memory.
>
> Side note: the lazy unmapping of virtual memory can cause other problems with too many virtual memory areas. The inflated NMT numbers have been a smoking gun showing us that issue. We are tracking that issue with [JDK-8308783](https://bugs.openjdk.org/browse/JDK-8308783).
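For readers following along, the double counting described above can be shown with a toy model. This is only a sketch in Python, not HotSpot code: the class ToyHeap, its methods, and the 2 MB size are hypothetical and chosen for illustration. It assumes a single physical region that is remapped to a second virtual area while the unmap of the first area is still queued.

```python
MB = 1024 * 1024


class ToyHeap:
    """Hypothetical model of committed vs. mapped memory (not ZGC's real data structures)."""

    def __init__(self):
        self.physical_committed = 0  # bytes of physical memory committed
        self.mapped_virtual = 0      # bytes of virtual memory currently mapped
        self.pending_unmaps = []     # unmaps that were requested but not yet performed

    def commit(self, size):
        self.physical_committed += size

    def map(self, size):
        self.mapped_virtual += size

    def unmap_lazily(self, size):
        # The unmap is only queued; the old mapping still exists for now.
        self.pending_unmaps.append(size)

    def process_pending_unmaps(self):
        while self.pending_unmaps:
            self.mapped_virtual -= self.pending_unmaps.pop()


heap = ToyHeap()
heap.commit(2 * MB)        # commit one physical region
heap.map(2 * MB)           # map it to a first virtual area
heap.unmap_lazily(2 * MB)  # request the unmap; it stays pending
heap.map(2 * MB)           # map the same physical region to a second virtual area

# Accounting at map time sees both mappings; accounting at commit time sees one region.
mapping_based = heap.mapped_virtual
commit_based = heap.physical_committed
print(mapping_based // MB, commit_based // MB)
```

In this model the mapping-based figure is twice the commit-based one until process_pending_unmaps() runs, which is the kind of inflation the proposal avoids by tracking at commit time.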
Looks good.
-------------
Marked as reviewed by eosterlund (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14355#pullrequestreview-1467790288