Hi Florian,

On Mon, May 8, 2023 at 11:15 AM Florian Weimer <fweimer@redhat.com> wrote:
* Thomas Stüfe:
ZGC on Linux seems to be compatible only with 2M pages. Seeing that ZGC is often used with very large heaps, is support for 1G pages planned?
Especially if one disregards uncommitting (-XX:-ZUncommit), 1G pages could be a speed boost for customers with gigantic heaps, as well as reduce the number of VMAs.
Is the number of VMAs really tied to hugepage support?
Indirectly. AFAIU, the number of VMAs is coupled to an internal granularity ZGC uses to stitch together memory from the underlying memory layer. That granularity is 2M, I assume because large pages are 2M on the architectures relevant to ZGC, and it seems hard-wired. In long-running processes I observe a dense interleaving of these mappings, so the kernel cannot fold neighboring regions and needs a separate VMA to represent each one.
I think ZGC could keep the number of VMAs down simply by processing mappings at larger granularity.
There is a Fedora discussion under way to eliminate the kernel VMA limit completely, but the kernel OOM handler isn't really compatible with that. The current heuristics do not seem to pick the most appropriate process if the kernel ends up with too much (unswappable?) memory used due to an excessive count of VMAs, so I'm not sure that we're going to change the default.
F39 proposal: Increase vm.max_map_count value (System-Wide Change proposal) < https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/...
Interesting. Thanks for the hint. I have seen us run into this limit several times in the past, and the resulting error is often confusing (e.g. when creating a thread, mprotecting the guard pages may split the containing VMA into two; if that fails, the mprotect fails with ENOMEM, which is not intuitive).
Thanks, Florian