UseNUMA membind Issue in openJDK

Gustavo Romero gromero at linux.vnet.ibm.com
Thu Jun 14 16:28:56 UTC 2018


Hi,

On 06/14/2018 09:01 AM, Swati Sharma wrote:
> +Roshan
> 
> Hi Derek,
> 
> Thanks for your testing and for finding an additional bug with UseNUMA; I appreciate your effort.
> 
> The answer to your questions:
> 
> 1) What should the JVM do if the cpu node is bound, but memory is not bound? Even with the patch, the JVM wastes memory because it sets aside part of Eden for threads that can never run on the other node.
>    - numactl -N 0 java -Xlog:gc*=info -XX:+UseParallelGC -XX:+UseNUMA -version
>    - My expectation was that it would act as if membind is set. But I'm not an expert.
>    - What do containers do under the hood? Would they ever bind cpus and NOT memory?
> If membind is not given then the JVM should use the memory on all available nodes. You are right, memory is being wasted.
> We have analyzed the code and found the root cause of this issue; the fix will take some time.
> 
> Note: My colleague Roshan has found the root cause in the existing code and is working on the fix; he will come up with a patch soon.

Thanks for the helpful comments, Derek and Swati. I agree: it's an issue, and a
separate one. The problem (even with Swati's patch applied) is that the JVM will
just look at the node mask information and won't consider cpu bindings to adapt.
I guess that originally UseNUMA was only interested in the NUMA topology in
order to find the best memory allocation for the unpinned cpus on the
machine.

I cannot tell about the container question, but I understand that if we cover
all the bound/not-bound combinations of cpu/memory the JVM should be fine in the
worst case (it might be the case that bindings are transparent to the JVM, I
don't know...).
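
As a rough illustration of the bound/not-bound combinations, here is a minimal
Java sketch of one possible policy: create lgrps only for nodes that are allowed
by both the cpu binding and the memory binding. This is not HotSpot code and not
what Swati's patch does; the class name, the method name and the "empty set
means no binding" convention are all assumptions made for the example.

```java
import java.util.Set;
import java.util.TreeSet;

public class NumaLgrpSketch {

    // Hypothetical helper, not part of HotSpot.
    // Assumption: an empty set means "no binding was requested".
    static Set<Integer> lgrpNodes(Set<Integer> allNodes,
                                  Set<Integer> cpuBoundNodes,
                                  Set<Integer> memBoundNodes) {
        Set<Integer> nodes = new TreeSet<>(allNodes);
        if (!cpuBoundNodes.isEmpty()) {
            // Threads can only run on these nodes.
            nodes.retainAll(cpuBoundNodes);
        }
        if (!memBoundNodes.isEmpty()) {
            // Memory can only be allocated from these nodes.
            nodes.retainAll(memBoundNodes);
        }
        return nodes;
    }

    public static void main(String[] args) {
        Set<Integer> all = Set.of(0, 1);   // two-node machine, as in the numactl -H output
        // No binding at all: one lgrp per node.
        System.out.println(lgrpNodes(all, Set.of(), Set.of()));
        // Like "numactl -N 0": cpu bound, memory not bound.
        System.out.println(lgrpNodes(all, Set.of(0), Set.of()));
        // Like "numactl -N 0 -m 0": both bound to node 0.
        System.out.println(lgrpNodes(all, Set.of(0), Set.of(0)));
    }
}
```

Under this assumed policy a result with a single node (or none) would mean eden
is not split at all, which matches what Derek expected for the "-N 0" case.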


> 2) What should JVM do if cpu node is bound, and numactl --localalloc specified? Even with patch, JVM wastes memory.
>    - numactl -N 0 --localalloc java -Xlog:gc*=info -XX:+UseParallelGC -XX:+UseNUMA -version
>    - My expectation was that "--localalloc" would be identical to setting membind for all of the cpu bound nodes, but I guess it's not.
> Yes, in the case of "numactl --localalloc", a thread should always use local memory. Lgrps should be created based on the given cpu node. In the current example it should create only a single lgrp.

I agree.
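
To put a number on the waste, the sketch below reuses the eden figures from
Derek's log quoted further down (524800K of eden, two lgrps). It assumes eden is
split evenly between lgrps, as that log shows; the class and method names are
made up for the example and are not HotSpot APIs.

```java
public class EdenSplitSketch {

    // Size of each lgrp space, assuming an even split of eden.
    static long perLgrpKb(long edenKb, int lgrpCount) {
        if (lgrpCount < 1) {
            throw new IllegalArgumentException("lgrpCount must be >= 1");
        }
        return edenKb / lgrpCount;
    }

    // Eden that belongs to lgrps no thread can ever run on.
    static long unreachableKb(long edenKb, int lgrpCount, int reachableLgrps) {
        if (reachableLgrps < 0 || reachableLgrps > lgrpCount) {
            throw new IllegalArgumentException("reachableLgrps out of range");
        }
        return perLgrpKb(edenKb, lgrpCount) * (lgrpCount - reachableLgrps);
    }

    public static void main(String[] args) {
        long edenKb = 524800;   // eden size from the quoted gc log
        // Two lgrps but threads pinned to one node: half of eden is unreachable.
        System.out.println(unreachableKb(edenKb, 2, 1) + "K unreachable");
        // A single lgrp, as suggested for --localalloc with one cpu node: nothing is lost.
        System.out.println(unreachableKb(edenKb, 1, 1) + "K unreachable");
    }
}
```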


> Gustavo, shall we go ahead with the current patch? The issue pointed out by Derek is not with the current patch but exists in the existing code, and we can fix it in another patch.
> Derek, can you file a separate bug for the above issues with no --membind in numactl?
> My current patch fixes the issue when the user specifies --membind with numactl, as also mentioned in the subject line (UseNUMA membind Issue in openJDK).

Yes, I'm fine with that. Derek kindly already filed a new bug. Also the other
issue (not addressed by Swati's patch) is well stated.

Thanks.


Best regards,
Gustavo

> Thanks,
> Swati Sharma
> Software Engineer - 2 @AMD
> 
> 
> On Thu, Jun 14, 2018 at 3:23 AM, White, Derek <Derek.White at cavium.com> wrote:
> 
>     See inline:
> 
>     > -----Original Message-----
>     > From: Gustavo Romero [mailto:gromero at linux.vnet.ibm.com]
>     ...
>     > Hi Derek,
>     > 
>     > On 06/12/2018 06:56 PM, White, Derek wrote:
>     > > Hi Swati, Gustavo,
>     > >
>     > > I’m not the best qualified to review the change – I just reported the issue
>     > as a JDK bug!
>     > >
>     > > I’d be happy to test a fix but I’m having trouble following the patch. Did
>     > Gustavo post a patch to your patch, or is that a full independent patch?
>     > 
>     > Yes, the idea was that you could help on testing it against JDK-8189922.
>     > Swati's initial report on this thread was accompanied with a simple way to
>     > test the issue he reported. You said it was related to bug JDK-8189922 but I
>     > can't see a simple way to test it as you reported. Besides that I assumed that
>     > you tested it on arm64, so I can't test it myself (I don't have such a
>     > hardware). Btw, if you could provide some numactl -H information I would
>     > be glad.
> 
> 
>     OK, here's a test case:
>     $ numactl -N 0 -m 0 java -Xlog:gc*=info -XX:+UseParallelGC -XX:+UseNUMA -version
> 
>     Before patch, failed output shows 1/2 of Eden being wasted for threads from node that will never allocate:
>     ...
>     [0.230s][info][gc,heap,exit ]   eden space 524800K, 4% used [0x0000000580100000,0x0000000581580260,0x00000005a0180000)
>     [0.230s][info][gc,heap,exit ]     lgrp 0 space 262400K, 8% used [0x0000000580100000,0x0000000581580260,0x0000000590140000)
>     [0.230s][info][gc,heap,exit ]     lgrp 1 space 262400K, 0% used [0x0000000590140000,0x0000000590140000,0x00000005a0180000)
>     ...
> 
>     After patch, passed output:
>     ...
>     [0.231s][info][gc,heap,exit ]   eden space 524800K, 8% used [0x0000000580100000,0x0000000582a00260,0x00000005a0180000)
>     ... (no lgrps)
> 
>     Open questions - still a bug?
>     1) What should the JVM do if the cpu node is bound, but memory is not bound? Even with the patch, the JVM wastes memory because it sets aside part of Eden for threads that can never run on the other node.
>        - numactl -N 0 java -Xlog:gc*=info -XX:+UseParallelGC -XX:+UseNUMA -version
>        - My expectation was that it would act as if membind is set. But I'm not an expert.
>        - What do containers do under the hood? Would they ever bind cpus and NOT memory?
>     2) What should JVM do if cpu node is bound, and numactl --localalloc specified? Even with patch, JVM wastes memory.
>        - numactl -N 0 --localalloc java -Xlog:gc*=info -XX:+UseParallelGC -XX:+UseNUMA -version
>        - My expectation was that "--localalloc" would be identical to setting membind for all of the cpu bound nodes, but I guess it's not.
> 
> 
> 
>     FYI - numactl -H:
>     available: 2 nodes (0-1)
>     node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
>     node 0 size: 128924 MB
>     node 0 free: 8499 MB
>     node 1 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
>     node 1 size: 129011 MB
>     node 1 free: 7964 MB
>     node distances:
>     node   0   1
>        0:  10  20
>        1:  20  10
> 
>     > I consider the patch I pointed out as the fourth version of Swati's original
>     > proposal, it evolved from the reviews so far:
>     > http://cr.openjdk.java.net/~gromero/8189922/draft/usenuma_v4.patch
>     > 
>     > 
>     > > Also, if you or Gustavo have permissions to post a webrev to
>     > http://cr.openjdk.java.net/ that would make reviewing a little easier. I’d be
>     > happy to post a webrev for you if not.
>     > 
>     > I was planning to host the webrev after your comments, but feel free to host
>     > it.
> 
>     No, you have it covered well, I'll stay out of it.
> 
>       - Derek
> 
> 


