[ping] Re: [11] RFR(M): 8189922: UseNUMA memory interleaving vs membind
Gustavo Romero
gromero at linux.vnet.ibm.com
Tue Jul 10 20:14:44 UTC 2018
Hi Swati,
As David pointed out, it's necessary to determine whether that bug qualifies as
P3 in order to get it into JDK 11 RDP1.
AFAICS, that bug was never explicitly triaged and got its current priority (P4)
by default.
Once the correct integration version is defined, I can sponsor that change
for you. I think there won't be any update releases for JDK 11 (contrary to
what happened for JDK 10), but we can look at how the distros are handling
it and so find out whether there is a possibility of getting the change into
the distros once it's pushed to JDK 12.
David, Alan,
I could not find any documentation on how to formally triage a bug. For
instance, on [1] I see Alan used markers such as "ILW =" and "MLH =", but I
don't know if these markers are only for Oracle-internal control. Do you know
how I could triage that bug? I understand its integration risk is small, but
even so I think it's necessary to bring up additional information in order to
arrive at a final bug priority.
Thanks.
Best regards,
Gustavo
[1] https://bugs.openjdk.java.net/browse/JDK-8206953
On 07/03/2018 03:06 AM, David Holmes wrote:
> Looks fine.
>
> Thanks,
> David
>
> On 3/07/2018 3:08 PM, Swati Sharma wrote:
>> Hi David,
>>
>> I have added a NULL check for _numa_bitmask_isbitset in the isbound_to_single_node() method.
>>
>> Hosted: http://cr.openjdk.java.net/~gromero/8189922/v2/
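>>
>> For anyone reviewing without the webrev open, here is a minimal
>> standalone sketch of the pattern in question (only an illustration,
>> not the actual change; the typedef name and messages are made up):
>>
>> #include <dlfcn.h>
>> #include <stdio.h>
>>
>> struct bitmask; // opaque libnuma type; only pointers are passed
>> typedef int (*isbitset_func_t)(struct bitmask*, unsigned int);
>> static isbitset_func_t _numa_bitmask_isbitset = NULL;
>>
>> int main() {
>>   // Resolve the symbol at runtime, as hotspot does for libnuma.
>>   void* handle = dlopen("libnuma.so.1", RTLD_LAZY);
>>   if (handle != NULL) {
>>     _numa_bitmask_isbitset =
>>         (isbitset_func_t) dlsym(handle, "numa_bitmask_isbitset");
>>   }
>>   // The guard in question: never call through the pointer without
>>   // a NULL check, since libnuma (or this symbol) may be missing
>>   // at runtime.
>>   if (_numa_bitmask_isbitset == NULL) {
>>     printf("numa_bitmask_isbitset unavailable\n");
>>     return 0;
>>   }
>>   printf("numa_bitmask_isbitset resolved; safe to call\n");
>>   return 0;
>> }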
>>
>> Swati
>>
>> On Mon, Jul 2, 2018 at 5:54 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Swati,
>>
>> I took a look at this, though I'm not familiar with the functional
>> operation of the NUMA APIs - I'm relying on Gustavo and Derek to
>> spot any actual usage errors there.
>>
>> In isbound_to_single_node() there is no NULL check for
>> _numa_bitmask_isbitset (a NULL check seems to be the normal pattern
>> for using all of these function pointers).
>>
>> Otherwise this seems fine.
>>
>> Thanks,
>> David
>>
>>
>> On 30/06/2018 2:46 AM, Swati Sharma wrote:
>>
>> Hi,
>>
>> Could I get a review for this change that affects the JVM when
>> there are pinned memory nodes, please?
>>
>> It's already reviewed and tested on PPC64 and on AARCH64 by Gustavo
>> and Derek; however, neither of them is a Reviewer, so I need
>> additional reviews for that change.
>>
>>
>> Thanks in advance.
>>
>> Swati
>>
>> On Tue, Jun 19, 2018 at 5:58 PM, Swati Sharma
>> <swatibits14 at gmail.com> wrote:
>>
>> Hi All,
>>
>> Here is the numa information of the system :
>> swati at java-diesel1:~$ numactl -H
>> available: 8 nodes (0-7)
>> node 0 cpus: 0 1 2 3 4 5 6 7 64 65 66 67 68 69 70 71
>> node 0 size: 64386 MB
>> node 0 free: 64134 MB
>> node 1 cpus: 8 9 10 11 12 13 14 15 72 73 74 75 76 77 78 79
>> node 1 size: 64509 MB
>> node 1 free: 64232 MB
>> node 2 cpus: 16 17 18 19 20 21 22 23 80 81 82 83 84 85 86 87
>> node 2 size: 64509 MB
>> node 2 free: 64215 MB
>> node 3 cpus: 24 25 26 27 28 29 30 31 88 89 90 91 92 93 94 95
>> node 3 size: 64509 MB
>> node 3 free: 64157 MB
>> node 4 cpus: 32 33 34 35 36 37 38 39 96 97 98 99 100 101 102 103
>> node 4 size: 64509 MB
>> node 4 free: 64336 MB
>> node 5 cpus: 40 41 42 43 44 45 46 47 104 105 106 107 108 109 110 111
>> node 5 size: 64509 MB
>> node 5 free: 64352 MB
>> node 6 cpus: 48 49 50 51 52 53 54 55 112 113 114 115 116 117 118 119
>> node 6 size: 64509 MB
>> node 6 free: 64359 MB
>> node 7 cpus: 56 57 58 59 60 61 62 63 120 121 122 123 124 125 126 127
>> node 7 size: 64508 MB
>> node 7 free: 64350 MB
>> node distances:
>> node 0 1 2 3 4 5 6 7
>> 0: 10 16 16 16 32 32 32 32
>> 1: 16 10 16 16 32 32 32 32
>> 2: 16 16 10 16 32 32 32 32
>> 3: 16 16 16 10 32 32 32 32
>> 4: 32 32 32 32 10 16 16 16
>> 5: 32 32 32 32 16 10 16 16
>> 6: 32 32 32 32 16 16 10 16
>> 7: 32 32 32 32 16 16 16 10
>>
>> Thanks,
>> Swati
>>
>> On Tue, Jun 19, 2018 at 12:00 AM, Gustavo Romero
>> <gromero at linux.vnet.ibm.com> wrote:
>>
>> Hi Swati,
>>
>> On 06/16/2018 02:52 PM, Swati Sharma wrote:
>>
>> Hi All,
>>
>> This is my first patch. I would appreciate it if anyone
>> could review the fix:
>>
>> Bug : https://bugs.openjdk.java.net/browse/JDK-8189922
>> Webrev : http://cr.openjdk.java.net/~gromero/8189922/v1
>>
>> The bug is about the JVM flag UseNUMA, which bypasses the
>> user-specified numactl --membind option and divides the whole heap
>> into lgrps according to the available NUMA nodes.
>>
>> The proposed solution is to disable UseNUMA if the process is bound
>> to a single NUMA node. In case of binding to more than one NUMA
>> node, the lgrps are created according to the bound nodes. If there
>> is no binding, the JVM divides the whole heap based on the number
>> of NUMA nodes available on the system.
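>>
>> For reviewers less familiar with the libnuma side, a minimal
>> standalone sketch of that decision logic (an illustration assuming
>> libnuma >= 2.0 and linking with -lnuma; hotspot resolves these
>> symbols via dlsym instead, and the real change lives in the webrev):
>>
>> #include <numa.h>
>> #include <stdio.h>
>>
>> int main() {
>>   if (numa_available() == -1) return 1;
>>   // With no --membind, the mask contains all allowed nodes, so
>>   // the "no binding" case falls out of the same path.
>>   struct bitmask* bound = numa_get_membind();
>>   if (numa_bitmask_weight(bound) == 1) {
>>     printf("bound to a single node: disable UseNUMA\n");
>>   } else {
>>     // Create lgrps only for the bound nodes.
>>     for (int node = 0; node <= numa_max_node(); node++) {
>>       if (numa_bitmask_isbitset(bound, node))
>>         printf("create lgrp for node %d\n", node);
>>     }
>>   }
>>   numa_bitmask_free(bound);
>>   return 0;
>> }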
>>
>> I appreciate Gustavo's help in fixing the thread allocation based
>> on NUMA distance for membind, which was a dangling issue associated
>> with the main patch.
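>>
>> The distance part can be pictured with a hypothetical helper
>> (find_closest_bound_node is illustrative, not hotspot code):
>> numa_distance() returns 10 for a local node and larger values for
>> remote ones, so taking the minimum over the bound nodes yields the
>> nearest allowed node for a thread:
>>
>> #include <numa.h>
>> #include <stdio.h>
>>
>> static int find_closest_bound_node(int from, struct bitmask* bound) {
>>   int best = -1, best_dist = 0;
>>   for (int node = 0; node <= numa_max_node(); node++) {
>>     if (!numa_bitmask_isbitset(bound, node)) continue;
>>     int d = numa_distance(from, node);
>>     if (best == -1 || d < best_dist) { best = node; best_dist = d; }
>>   }
>>   return best;
>> }
>>
>> int main() {
>>   if (numa_available() == -1) return 1;
>>   struct bitmask* bound = numa_get_membind();
>>   // E.g. with --membind=0,7 on the topology in the numactl -H
>>   // output above, a thread on node 2 would allocate on node 0
>>   // (distance 16) rather than node 7 (distance 32).
>>   printf("closest: %d\n", find_closest_bound_node(2, bound));
>>   numa_bitmask_free(bound);
>>   return 0;
>> }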
>>
>>
>> Thanks. I have no further comments on it. LGTM.
>>
>>
>> Best regards,
>> Gustavo
>>
>> PS: Please provide the numactl -H information when possible. It
>> helps to promptly grasp the actual NUMA topology in question :)
>>
>> Tested the fix by running the SPECjbb2015 composite workload on an
>> 8 NUMA node system.
>>
>> Case 1 : Single NUMA node bind
>> numactl --cpunodebind=0 --membind=0 java -Xmx24g
>> -Xms24g -Xmn22g
>> -XX:+UseNUMA
>> -Xlog:gc*=debug:file=gc.log:time,uptimemillis
>> <composite_application>
>> Before Patch: gc.log
>> eden space 22511616K(22GB), 12% used
>> lgrp 0 space 2813952K, 100% used
>> lgrp 1 space 2813952K, 0% used
>> lgrp 2 space 2813952K, 0% used
>> lgrp 3 space 2813952K, 0% used
>> lgrp 4 space 2813952K, 0% used
>> lgrp 5 space 2813952K, 0% used
>> lgrp 6 space 2813952K, 0% used
>> lgrp 7 space 2813952K, 0% used
>> After Patch : gc.log
>> eden space 46718976K(45GB), 99% used(NUMA disabled)
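>>
>> (Before the patch the eden was divided into eight lgrps of
>> 2813952K each, 8 x 2813952K = 22511616K, but with memory bound to
>> node 0 only the first lgrp was ever used.)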
>>
>> Case 2 : Multiple NUMA node bind
>> numactl --cpunodebind=0,7 --membind=0,7 java -Xms50g
>> -Xmx50g -Xmn45g
>> -XX:+UseNUMA
>> -Xlog:gc*=debug:file=gc.log:time,uptimemillis
>> <composite_application>
>> Before Patch : gc.log
>> eden space 46718976K, 6% used
>> lgrp 0 space 5838848K, 14% used
>> lgrp 1 space 5838848K, 0% used
>> lgrp 2 space 5838848K, 0% used
>> lgrp 3 space 5838848K, 0% used
>> lgrp 4 space 5838848K, 0% used
>> lgrp 5 space 5838848K, 0% used
>> lgrp 6 space 5838848K, 0% used
>> lgrp 7 space 5847040K, 35% used
>> After Patch : gc.log
>> eden space 46718976K(45GB), 99% used
>> lgrp 0 space 23359488K(23.5GB), 100% used
>> lgrp 7 space 23359488K(23.5GB), 99% used
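>>
>> (That is, the eden is now split evenly across just the two bound
>> nodes, 2 x 23359488K = 46718976K, instead of across all eight.)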
>>
>>
>> Note: The proposed solution is only for the numactl --membind
>> option. The fix for --cpunodebind and localalloc is a separate
>> bug, https://bugs.openjdk.java.net/browse/JDK-8205051, and a fix
>> is in progress for it.
>>
>> Thanks,
>> Swati Sharma
>> Software Engineer -2 at AMD
>>
>>
>>
>>
>>
>