Turn on UseNUMA by default when prudent
Eric Caspole
eric.caspole at amd.com
Wed May 30 20:31:56 UTC 2012
I put up a much simpler, still Linux-only rev at
http://cr.openjdk.java.net/~ecaspole/numa_default_2/
Simply turning UseNUMA on by default might work, but there are so many
OS/platform combinations to consider that it's more than I can try to test.
Eric
On May 30, 2012, at 4:14 PM, Jesper Wilhelmsson wrote:
> On 2012-05-30 20:41, Igor Veresov wrote:
>> Actually UseNUMA should already do what you want. Even if
>> specified on the command line it will switch itself off if there's
>> only one node present.
>
> So, will setting UseNUMA to true by default be a platform-independent
> way of solving this?
> /Jesper
>
>>
>> igor
>>
>> On May 30, 2012, at 12:27 AM, Thomas Schatzl wrote:
>>
>>> Hi all,
>>>
>>> On Tue, 2012-05-29 at 21:56 +0200, Jesper Wilhelmsson wrote:
>>>> Hi Eric,
>>>>
>>>> As long as this is based on actual data and not just a hunch, I
>>>> personally think it is a good idea. I don't know if we have any
>>>> policies about platform-specific optimizations like this, though.
>>>>
>>>> I have some comments on the code layout and there are a few typos,
>>>> but I guess this is still a draft so I won't pick on that right now.
>>>>
>>>> One thing I wonder though is in os_linux_x86.cpp:
>>>>
>>>> if (VM_Version::cpu_family() == 0x15 || VM_Version::cpu_family() == 0x10) {
>>>>
>>>> Is this the only way to identify the proper processor family? It
>>>> doesn't seem very future-proof. How often would you have to change
>>>> this code to keep it up to date with new hardware?
>>> Just a question: if this is implemented, wouldn't it be more prudent
>>> to check whether the VM process actually runs on a NUMA machine, and
>>> actually has its computing (or memory) resources distributed across
>>> several nodes, instead of checking for some arbitrary processors and
>>> processor identifiers?
>>>
>>> This would, given that the OS typically provides this information
>>> anyway, also immediately support e.g. SPARC setups. It also avoids
>>> distributing memory when the user has explicitly assigned the VM to a
>>> single node...
>>>
>>> From memory, on Solaris the above-mentioned detection works
>>> approximately as follows:
>>>
>>> - detect the total number of leaf locality groups (= nodes on Solaris)
>>>   in the system, e.g. via lgrp_nlgrps()
>>> - from the root node (retrieved via lgrp_root()), iterate over its
>>>   children and leaf lgroups via lgrp_children()
>>> - for each leaf lgroup found, check whether it contains an active CPU
>>>   for this process using lgrp_cpus(); if so, increment a counter
>>>
>>> Maybe there is a better way to do that though.
>>>
>>> On Linux, numa_get_run_node_mask() may provide the same information
>>> when called during initialization.
>>> On Windows, it seems that a combination of GetProcessAffinityMask()
>>> and GetNumaProcessorNode() may be useful.
>>> (From a cursory web search for the latter two; not sure about other
>>> OSes, but you could simply provide a dummy for those.)
>>>
>>> I'd guess that some of the needed functionality to implement this is
>>> already provided by the current Hotspot code base.
>>>
>>>
>>> Ergonomics stuff is typically handled in runtime/arguments.?pp, so it
>>> might be a better location for updating globals than some os-specific
>>> initialization code.
>>>
>>> E.g.
>>>
>>>   if (FLAG_IS_DEFAULT(UseNUMA)) {
>>>     UseNUMA = /* maybe some other conditions && */
>>>               (os::get_num_active_numa_nodes() > 1);
>>>   }
>>>
>>> in e.g. Arguments::set_ergonomics_flags() or similar.
>>>
>>> Seems a lot nicer than an explicit check for some processor family.
>>> Maybe a little more work though.
>>>
>>> Hth,
>>> Thomas
>>>
>>>
More information about the hotspot-gc-dev mailing list