[10] RFR (S) 8175813: PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used
Gustavo Romero
gromero at linux.vnet.ibm.com
Wed Apr 12 22:51:39 UTC 2017
Hi,
Any update on it?
Thank you.
Regards,
Gustavo
On 09-03-2017 16:33, Gustavo Romero wrote:
> Hi,
>
> Could the following webrev be reviewed please?
>
> It improves the numa node detection when non-consecutive or memory-less nodes
> exist in the system.
>
> webrev: http://cr.openjdk.java.net/~gromero/8175813/v2/
> bug : https://bugs.openjdk.java.net/browse/JDK-8175813
>
> Currently, although no problem exists when the JVM detects numa nodes that are
> consecutive and have memory, for example in a numa topology like:
>
> available: 2 nodes (0-1)
> node 0 cpus: 0 8 16 24 32
> node 0 size: 65258 MB
> node 0 free: 34 MB
> node 1 cpus: 40 48 56 64 72
> node 1 size: 65320 MB
> node 1 free: 150 MB
> node distances:
> node 0 1
> 0: 10 20
> 1: 20 10,
>
> it fails to detect the numa nodes to be used in the Parallel GC in a numa
> topology like:
>
> available: 4 nodes (0-1,16-17)
> node 0 cpus: 0 8 16 24 32
> node 0 size: 130706 MB
> node 0 free: 7729 MB
> node 1 cpus: 40 48 56 64 72
> node 1 size: 0 MB
> node 1 free: 0 MB
> node 16 cpus: 80 88 96 104 112
> node 16 size: 130630 MB
> node 16 free: 5282 MB
> node 17 cpus: 120 128 136 144 152
> node 17 size: 0 MB
> node 17 free: 0 MB
> node distances:
> node 0 1 16 17
> 0: 10 20 40 40
> 1: 20 10 40 40
> 16: 40 40 10 20
> 17: 40 40 20 10,
>
> where node 16 does not follow node 1 consecutively and nodes 1 and 17 have
> no memory.
>
> If a topology like that exists, os::numa_make_local() will receive as a hint
> a local group id that does not correspond to a node available for binding (it
> will receive every node id from 0 to 17), causing a proliferation of
> "mbind: Invalid argument" messages:
>
> http://cr.openjdk.java.net/~gromero/logs/jdk10_pristine.log
>
> This change improves the detection by making the JVM numa API aware that node
> numbers may be non-consecutive between 0 and the highest node number, and that
> some nodes may be memory-less, i.e. not configured nodes in libnuma terms.
> Hence only the configured nodes will be available:
>
> http://cr.openjdk.java.net/~gromero/logs/jdk10_numa_patched.log
>
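To illustrate the difference between the two ways of enumerating nodes, here is a minimal sketch in Python. It is only a model of the idea, not the HotSpot code from the webrev: the dictionary and both function names are hypothetical, and the node sizes are copied from the second topology above. In a real process the equivalent information would come from libnuma rather than from a hard-coded table.

```python
# Hypothetical model of the second topology above: node id -> memory size in MB.
NODE_SIZE_MB = {0: 130706, 1: 0, 16: 130630, 17: 0}


def naive_nodes(node_size_mb):
    """Every id from 0 to the highest node number, whether or not it exists."""
    return list(range(max(node_size_mb) + 1))


def configured_nodes(node_size_mb):
    """Only the nodes that exist and have memory."""
    return sorted(n for n, size in node_size_mb.items() if size > 0)


if __name__ == "__main__":
    print(len(naive_nodes(NODE_SIZE_MB)))  # 18 candidate ids, 0 through 17
    print(configured_nodes(NODE_SIZE_MB))  # [0, 16]
```

With the naive enumeration, 16 of the 18 candidate ids cannot be bound (14 do not exist and 2 have no memory), which is consistent with the repeated mbind failures described above.
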
> The change has no effect on numa topologies where the problem does not occur,
> i.e. no change in the number of nodes and no change in the cpu to node map. On
> numa topologies where memory-less nodes exist (like in the last example above),
> cpus from a memory-less node won't be able to bind locally, so they are mapped
> to the closest node; otherwise they would not be associated with any node and
> MutableNUMASpace::cas_allocate() would pick a node randomly, compromising
> performance.
>
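The mapping of cpus from memory-less nodes to the closest node can be sketched as follows. Again this is a model in Python under stated assumptions, not the patch itself: the table and function names are hypothetical, the distances and cpu lists are copied from the four-node topology above, and ties are broken in favour of the lowest node id, which is an assumption of this sketch.

```python
# Data copied from the four-node topology above; all names are hypothetical.
NODE_IDS = [0, 1, 16, 17]
DISTANCE = {
    0: [10, 20, 40, 40],
    1: [20, 10, 40, 40],
    16: [40, 40, 10, 20],
    17: [40, 40, 20, 10],
}
NODE_CPUS = {
    0: [0, 8, 16, 24, 32],
    1: [40, 48, 56, 64, 72],
    16: [80, 88, 96, 104, 112],
    17: [120, 128, 136, 144, 152],
}
CONFIGURED = [0, 16]  # nodes that have memory


def closest_configured(node):
    """Return the node itself if it has memory, else the nearest node that does."""
    if node in CONFIGURED:
        return node
    row = DISTANCE[node]
    # min() keeps the first minimum, so ties go to the lowest configured id.
    return min(CONFIGURED, key=lambda c: row[NODE_IDS.index(c)])


def cpu_to_node_map():
    """Map every cpu to a node it can actually allocate from."""
    return {
        cpu: closest_configured(node)
        for node, cpus in NODE_CPUS.items()
        for cpu in cpus
    }


if __name__ == "__main__":
    mapping = cpu_to_node_map()
    print(mapping[40])   # 0: node 1 has no memory, node 0 is at distance 20
    print(mapping[120])  # 16: node 17 has no memory, node 16 is at distance 20
```

In this model the cpus of nodes 0 and 16 keep their own node, so a topology without memory-less nodes gets the same cpu to node map as before.
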
> I found no regressions on x64 for the following numa topology:
>
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 8 9 10 11
> node 0 size: 24102 MB
> node 0 free: 19806 MB
> node 1 cpus: 4 5 6 7 12 13 14 15
> node 1 size: 24190 MB
> node 1 free: 21951 MB
> node distances:
> node 0 1
> 0: 10 21
> 1: 21 10
>
> I understand that fixing the current numa detection is a prerequisite to enable
> UseNUMA by default [1] and to extend the numa-aware allocation to the G1 GC [2].
>
> Thank you.
>
>
> Best regards,
> Gustavo
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable NUMA Mode by Default When Appropriate)
> [2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC: NUMA-Aware Allocation)
>