RFR: 8369811: ZGC: Robust NUMA configuration detection

Joel Sikström jsikstro at openjdk.org
Mon Oct 20 10:32:20 UTC 2025


On Tue, 14 Oct 2025 11:41:18 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> Hello,
>> 
>> When page allocation was overhauled in [JDK-8350441](https://bugs.openjdk.org/browse/JDK-8350441), NUMA support in ZGC was also significantly overhauled. The concept of a partition was introduced as a one-to-one mapping between NUMA nodes and a subset of the Java heap. The number of partitions is ideally the same as the number of NUMA nodes the Java process is bound to use.
>> 
>> When using ZGC and binding the Java process to only a subset of the available NUMA nodes on a system with more than 2 NUMA nodes, there is a mismatch between the internal representation and the configured one. The internal representation ends up having as many partitions as there are NUMA nodes on the system, rather than as many as the Java process will actually use.
>> 
>> To solve this, we create a mapping between what we refer to as "NUMA id" and "NUMA node", where NUMA id is the internal representation, i.e., the id of a partition, and the NUMA node is the actual NUMA node memory is allocated on. The mapping is used to translate between the two when syscalls are made, so that the internal representation always works with NUMA ids and syscalls work with the actual, or desired, NUMA node.
>> 
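
(A minimal sketch of the id/node translation described above, not the code from the PR. The class `NumaIdMap` and its method names are hypothetical, and the list of bound nodes is assumed to be supplied by the caller rather than queried from the OS.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration: dense internal NUMA ids (partition indices)
// on one side, OS NUMA node numbers on the other.
class NumaIdMap {
public:
  // bound_nodes: the OS nodes the process may allocate on, e.g. {0, 2}.
  explicit NumaIdMap(const std::vector<int>& bound_nodes)
    : _id_to_node(bound_nodes) {
    int max_node = -1;
    for (int node : bound_nodes) {
      assert(node >= 0);
      if (node > max_node) max_node = node;
    }
    // Reverse table, indexed by OS node; -1 marks nodes that are not bound.
    _node_to_id.assign(static_cast<std::size_t>(max_node + 1), -1);
    for (std::size_t id = 0; id < _id_to_node.size(); id++) {
      _node_to_id[static_cast<std::size_t>(_id_to_node[id])] = static_cast<int>(id);
    }
  }

  // Number of partitions: one per bound node.
  uint32_t count() const { return static_cast<uint32_t>(_id_to_node.size()); }

  // Internal id -> OS node, used right before a node is passed to a syscall.
  int id_to_node(uint32_t id) const {
    assert(id < _id_to_node.size());
    return _id_to_node[id];
  }

  // OS node -> internal id, or -1 if the process is not bound to that node.
  int node_to_id(int node) const {
    if (node < 0 || static_cast<std::size_t>(node) >= _node_to_id.size()) return -1;
    return _node_to_id[static_cast<std::size_t>(node)];
  }

private:
  std::vector<int> _id_to_node;
  std::vector<int> _node_to_id;
};
```

With the bound nodes {0, 2} from the example below, this sketch has two ids: id 0 translates to node 0 and id 1 to node 2, while node 1 and node 3 have no id.
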
>> Before:
>> 
>> $ numactl --cpunodebind=0,2 --membind=0,2 ./jdk-25/bin/java -Xms200M -Xmx200M -XX:+AlwaysPreTouch -XX:+UseZGC -Xlog:gc+init Forever.java
>> [0.236s][info][gc,init] NUMA Support: Enabled
>> [0.237s][info][gc,init] NUMA Nodes: 4
>> 
>> $ cat /proc/$(pidof java)/numa_maps | grep java_heap
>> 40000000000 bind:0,2 file=/memfd:java_heap\040(deleted) dirty=12800 active=0 N0=12800 kernelpagesize_kB=4
>> 401f1000000 bind:0,2 file=/memfd:java_heap\040(deleted) dirty=12800 active=0 N1=12800 kernelpagesize_kB=4
>> 403e2000000 bind:0,2 file=/memfd:java_heap\040(deleted) dirty=12800 active=0 N2=12800 kernelpagesize_kB=4
>> 405d3000000 bind:0,2 file=/memfd:java_heap\040(deleted) dirty=12800 active=0 N3=12800 kernelpagesize_kB=4
>> 
>> 
>> After:
>> 
>> $ numactl --cpunodebind=0,2 --membind=0,2 ./jdk/bin/java -Xms200M -Xmx200M -XX:+AlwaysPreTouch -XX:+UseZGC -Xlog:gc+init Forever.java
>> [0.236s][info][gc,init] NUMA Support: Enabled
>> [0.237s][info][gc,init] NUMA Nodes: 2
>> 
>> $ cat /proc/$(pidof java)/numa_maps | grep java_heap
>> 40000000000 bind:0,2 file=/memfd:java_heap\040(deleted) dirty=25600 active=0 N0=25600 kernelpagesize_kB=4
>> 403e2000000 bind:0,2 file=/memfd:java_heap\040(deleted) dirty=25600 active=0 N2=25600 kernelpagesize_kB=4
>> 
>> 
>> Testing:
>> * Functional testing on a QEMU VM with...
>
> lgtm. 
> 
> It is unfortunate that `numa_get_leaf_groups` has the signature it does, requiring this `int -> uint -> int` type juggling.
> 
> Maybe we can improve that in the future, and even create a more explicit type system in our OS NUMA layer so that we are less likely to introduce similar bugs.

Thank you for the reviews! @xmas92 @kstefanj

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27792#issuecomment-3421483087