Linux/PPC64: "mbind: Invalid argument" when -XX:+UseNUMA is used
sangheon
sangheon.kim at oracle.com
Mon Feb 6 22:23:35 UTC 2017
Hi Gustavo,
On 02/06/2017 01:50 PM, Gustavo Romero wrote:
> Hi,
>
> On Linux/PPC64 I'm getting a series of "mbind: Invalid argument" that seems
> exactly the same as reported for x64 [1]:
>
> [root@spocfire3 ~]# java -XX:+UseNUMA -version
> mbind: Invalid argument
> mbind: Invalid argument
> mbind: Invalid argument
> mbind: Invalid argument
> mbind: Invalid argument
> mbind: Invalid argument
> mbind: Invalid argument
> openjdk version "1.8.0_121"
> OpenJDK Runtime Environment (build 1.8.0_121-b13)
> OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)
>
> [root@spocfire3 ~]# uname -a
> Linux spocfire3.aus.stglabs.ibm.com 3.10.0-327.el7.ppc64le #1 SMP Thu Oct 29 17:31:13 EDT 2015 ppc64le ppc64le ppc64le GNU/Linux
>
> [root@spocfire3 ~]# lscpu
> Architecture: ppc64le
> Byte Order: Little Endian
> CPU(s): 160
> On-line CPU(s) list: 0-159
> Thread(s) per core: 8
> Core(s) per socket: 10
> Socket(s): 2
> NUMA node(s): 2
> Model: 2.0 (pvr 004d 0200)
> Model name: POWER8 (raw), altivec supported
> L1d cache: 64K
> L1i cache: 32K
> L2 cache: 512K
> L3 cache: 8192K
> NUMA node0 CPU(s): 0-79
> NUMA node8 CPU(s): 80-159
>
> Chasing it down, it looks like it comes from PSYoungGen::initialize() in
> src/share/vm/gc_implementation/parallelScavenge/psYoungGen.cpp, which calls
> initialize_work(), which in turn calls the MutableNUMASpace() constructor if
> UseNUMA is set:
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/567e410935e5/src/share/vm/gc_implementation/parallelScavenge/psYoungGen.cpp#l77
>
> MutableNUMASpace() then calls os::numa_make_local(), which in the end calls
> numa_set_bind_policy() in libnuma.so.1 [2].
>
> I've traced some values for which the mbind() syscall fails:
> http://termbin.com/ztfs (search for "Invalid argument" in the log).
>
> Assuming it's the same bug as reported in [1] and so it's not fixed on 9 and 10:
>
> - Is there any WIP or known workaround?
There has been no progress on JDK-8163796 and no workaround has been found yet.
Unfortunately, I'm not planning to fix it soon.
> - Should I append this output to the description of [1], or open a new issue
> and mark it as related to [1]?
Your problem looks the same as JDK-8163796, so adding your output to that
CR seems good.
Please add logs as well. I recommend enabling something like
"-Xlog:gc*,gc+heap*=trace".
IIRC, in my case the problem only occurred when -Xmx was small.
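
For anyone collecting these logs for the CR, the following is a minimal sketch of how the run could be scripted across several heap sizes. It is only an illustration under assumptions, not something taken from the bug report: the helper names (build_command, count_mbind_errors, run_and_count) and the heap sizes are hypothetical, -XX:+UseParallelGC is added on the assumption that the failure goes through the Parallel collector (as the PSYoungGen trace above suggests), and -Xlog requires JDK 9 or later, so on JDK 8 a flag such as -XX:+PrintGCDetails would have to be substituted.

```python
import subprocess

# The exact line printed in the report above.
MBIND_ERROR = "mbind: Invalid argument"


def build_command(xmx="64m", java="java"):
    """Assemble a java command line with NUMA enabled and GC logging on.

    Assumption: a JDK 9+ launcher named `java` is on PATH; -Xlog is not
    understood by JDK 8.
    """
    return [
        java,
        "-XX:+UseNUMA",
        "-XX:+UseParallelGC",  # assumption: the issue is in the Parallel GC path
        f"-Xmx{xmx}",
        "-Xlog:gc*,gc+heap*=trace",
        "-version",
    ]


def count_mbind_errors(output):
    """Count lines that consist solely of the mbind error message."""
    return sum(1 for line in output.splitlines() if line.strip() == MBIND_ERROR)


def run_and_count(xmx="64m", java="java"):
    """Run the JVM once and return how many mbind errors it printed."""
    result = subprocess.run(
        build_command(xmx, java), capture_output=True, text=True
    )
    # The message may land on either stream, so inspect both.
    return count_mbind_errors(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    # Hypothetical heap sizes, chosen only to compare small and large -Xmx.
    for xmx in ("64m", "1g", "8g"):
        print(f"-Xmx{xmx}: {run_and_count(xmx)} mbind error(s)")
```

Comparing the counts across heap sizes would show whether the small -Xmx observation also holds on a given machine; the full log output should still be attached to the CR.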
Thanks,
Sangheon
>
> Thank you.
>
>
> Best regards,
> Gustavo
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8163796
> [2] https://da.gd/4vXF
>
More information about the ppc-aix-port-dev
mailing list