Turn on UseNUMA by default when prudent
Eric Caspole
eric.caspole at amd.com
Mon Jun 11 14:57:40 UTC 2012
Hi Igor,
Would this lack of kernel NUMA support only happen on very old Linux
versions? I have been testing under a variety of CentOS and Fedora
versions, but nothing more than, say, 2 years old.
Thanks,
Eric
On Jun 8, 2012, at 7:20 PM, Igor Veresov wrote:
> Eric,
>
> I remember there were problems caused by libnuma and the dynamic linker
> that had to do with being unable to override the weak "numa_error"
> function. I think it worked at some point, but I'd verify that it's
> still the case. Basically, the bad thing that can happen is that on
> a Linux system that has libnuma.so but doesn't have NUMA support
> compiled into the kernel, you would see error messages on the console
> from libnuma.so.
>
> igor
>
> On Jun 8, 2012, at 12:49 PM, Eric Caspole wrote:
>
>> Hi everybody,
>> I made a similar change for Windows
>>
>> http://cr.openjdk.java.net/~ecaspole/numa_default_win_1/
>>
>> in addition to the linux one:
>>
>> http://cr.openjdk.java.net/~ecaspole/numa_default_3/
>>
>> Doing this is even more effective on Windows, since Windows seems
>> to have an aggressive policy of allocating process memory on the
>> "home node" where the process first ran. In my worst-case test on
>> Windows Server 2008 R2, the test was up to 3x faster with +UseNUMA
>> than with the existing default, provided the application had at
>> least as many threads as cores.
>>
>> Regards,
>> Eric
>>
>>
>>
>> On Jun 1, 2012, at 11:27 AM, Vladimir Kozlov wrote:
>>
>>> Can GC group sponsor this change? I think we also need to do the
>>> same for Solaris (the code is similar there).
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 6/1/12 4:43 AM, Jesper Wilhelmsson wrote:
>>>> This looks OK to me.
>>>> /Jesper
>>>>
>>>>
>>>> On 2012-05-31 17:05, Eric Caspole wrote:
>>>>> OK, I removed the warning, see
>>>>>
>>>>> http://cr.openjdk.java.net/~ecaspole/numa_default_3/
>>>>>
>>>>> Thanks,
>>>>> Eric
>>>>>
>>>>>
>>>>> On May 30, 2012, at 4:58 PM, Vladimir Kozlov wrote:
>>>>>
>>>>>> We issue a warning only if something is not right, which is not
>>>>>> the case here:
>>>>>>
>>>>>> + warning("Turned on UseNUMA in os::init_2");
>>>>>>
>>>>>> otherwise looks good.
>>>>>>
>>>>>> Vladimir
>>>>>>
>>>>>> Eric Caspole wrote:
>>>>>>> I put a much simpler, still Linux-only rev at
>>>>>>> http://cr.openjdk.java.net/~ecaspole/numa_default_2/
>>>>>>> Simply turning UseNUMA on by default might work, but there are so
>>>>>>> many OS/platform combinations to consider that it's more than I
>>>>>>> can try to test.
>>>>>>> Eric
>>>>>>> On May 30, 2012, at 4:14 PM, Jesper Wilhelmsson wrote:
>>>>>>>> On 2012-05-30 20:41, Igor Veresov wrote:
>>>>>>>>> Actually UseNUMA should already do what you want. Even if
>>>>>>>>> specified on
>>>>>>>>> the command line it will switch itself off if there's only
>>>>>>>>> one node present.
>>>>>>>>
>>>>>>>> So, will setting UseNUMA to true as default be a platform
>>>>>>>> independent way
>>>>>>>> of solving this?
>>>>>>>> /Jesper
>>>>>>>>
>>>>>>>>>
>>>>>>>>> igor
>>>>>>>>>
>>>>>>>>> On May 30, 2012, at 12:27 AM, Thomas Schatzl wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> On Tue, 2012-05-29 at 21:56 +0200, Jesper Wilhelmsson wrote:
>>>>>>>>>>> Hi Eric,
>>>>>>>>>>>
>>>>>>>>>>> As long as this is based on actual data and not just a hunch,
>>>>>>>>>>> I personally think it is a good idea. I don't know if we have
>>>>>>>>>>> any policies about platform-specific optimizations like this
>>>>>>>>>>> though.
>>>>>>>>>>>
>>>>>>>>>>> I have some comments on the code layout and there are a few
>>>>>>>>>>> typos, but I guess this is still a draft so I won't pick on
>>>>>>>>>>> that right now.
>>>>>>>>>>>
>>>>>>>>>>> One thing I wonder though is in os_linux_x86.cpp:
>>>>>>>>>>>
>>>>>>>>>>> if (VM_Version::cpu_family() == 0x15 ||
>>>>>>>>>>>     VM_Version::cpu_family() == 0x10) {
>>>>>>>>>>>
>>>>>>>>>>> Is this the only way to identify the proper processor family?
>>>>>>>>>>> It doesn't seem very future-proof. How often would you have to
>>>>>>>>>>> change this code to keep it up to date with new hardware?
>>>>>>>>>> Just a question: if this is implemented, wouldn't it be more
>>>>>>>>>> prudent to actually check whether the VM process runs on a NUMA
>>>>>>>>>> machine and actually has its computing (or memory) resources
>>>>>>>>>> distributed across several nodes, instead of checking for some
>>>>>>>>>> arbitrary processors and processor identifiers?
>>>>>>>>>>
>>>>>>>>>> This would, given that the OS typically provides this
>>>>>>>>>> information
>>>>>>>>>> anyway, also immediately support e.g. sparc setups. It
>>>>>>>>>> also avoids
>>>>>>>>>> distributing memory when the user explicitly assigned the
>>>>>>>>>> VM to a single
>>>>>>>>>> node...
>>>>>>>>>>
>>>>>>>>>> From memory, on Solaris the above-mentioned detection works
>>>>>>>>>> approximately as follows:
>>>>>>>>>>
>>>>>>>>>> - detect the total number of leaf locality groups (= nodes on
>>>>>>>>>>   Solaris) in the system, e.g. via lgrp_nlgrps()
>>>>>>>>>> - from the root node (retrieved via lgrp_root()), iterate over
>>>>>>>>>>   its children and leaf lgroups via lgrp_children()
>>>>>>>>>> - for each of the leaf lgroups found, check whether there is an
>>>>>>>>>>   active CPU for this process in it using lgrp_cpus(); if so,
>>>>>>>>>>   increment a counter
>>>>>>>>>>
>>>>>>>>>> Maybe there is a better way to do that though.
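
The traversal described in the steps above can be sketched roughly as follows. This is only an illustration: LocalityGroup and count_active_leaf_groups() are hypothetical names, and an in-memory tree stands in for the data that lgrp_root(), lgrp_children() and lgrp_cpus() would supply on Solaris. No liblgrp calls are made.

```cpp
#include <cassert>
#include <vector>

// Hypothetical in-memory stand-in for a locality group. On Solaris the
// hierarchy would be queried through liblgrp instead of built by hand.
struct LocalityGroup {
  int active_cpus;                      // CPUs in this group usable by the process
  std::vector<LocalityGroup> children;  // empty for a leaf group
};

// Walk the hierarchy from the given group and count the leaf groups that
// contain at least one CPU the process can run on.
static int count_active_leaf_groups(const LocalityGroup& group) {
  assert(group.active_cpus >= 0);
  if (group.children.empty()) {
    return group.active_cpus > 0 ? 1 : 0;
  }
  int count = 0;
  for (const LocalityGroup& child : group.children) {
    count += count_active_leaf_groups(child);
  }
  return count;
}
```

A result greater than 1 would indicate that the process really is spread across several nodes.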
>>>>>>>>>>
>>>>>>>>>> On Linux, numa_get_run_node_mask() may provide the same
>>>>>>>>>> information when
>>>>>>>>>> called during initialization.
>>>>>>>>>> On Windows, it seems that a combination of
>>>>>>>>>> GetProcessAffinityMask() and
>>>>>>>>>> GetNumaProcessorNode() may be useful.
>>>>>>>>>> (From a cursory web search for the latter two; not sure
>>>>>>>>>> about other
>>>>>>>>>> OSes, but you could simply provide a dummy for those)
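
One way to picture the affinity-based check is to combine a process affinity bitmask with a CPU-to-node table and count the distinct nodes. The sketch below assumes both inputs are already available as plain values; count_active_numa_nodes() and its parameters are hypothetical, and no Windows or libnuma API is called.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// affinity_mask: bit i is set if the process may run on logical CPU i
//   (the kind of mask GetProcessAffinityMask() reports).
// cpu_to_node: cpu_to_node[i] is the NUMA node of logical CPU i
//   (hypothetical input, e.g. filled in by one node lookup per CPU).
static int count_active_numa_nodes(std::uint64_t affinity_mask,
                                   const std::vector<int>& cpu_to_node) {
  assert(cpu_to_node.size() <= 64);  // a 64-bit mask covers at most 64 CPUs
  std::set<int> nodes;
  for (std::size_t cpu = 0; cpu < cpu_to_node.size(); ++cpu) {
    if ((affinity_mask >> cpu) & 1u) {
      nodes.insert(cpu_to_node[cpu]);
    }
  }
  return static_cast<int>(nodes.size());
}
```

With this shape, a process pinned to the CPUs of a single node yields 1 even on a multi-node machine, which is the case the explicit-assignment remark above is concerned with.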
>>>>>>>>>>
>>>>>>>>>> I'd guess that some of the needed functionality to
>>>>>>>>>> implement this is
>>>>>>>>>> already provided by the current Hotspot code base.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ergonomics stuff is typically handled in
>>>>>>>>>> runtime/arguments.?pp, so that might be a better place for
>>>>>>>>>> updating globals than putting this detection in some
>>>>>>>>>> os-specific initialization code.
>>>>>>>>>>
>>>>>>>>>> Eg.
>>>>>>>>>>
>>>>>>>>>> if (FLAG_IS_DEFAULT(UseNUMA)) {
>>>>>>>>>>   // [maybe some other conditions &&]
>>>>>>>>>>   UseNUMA = (os::get_num_active_numa_nodes() > 1);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> in e.g. Arguments::set_ergonomics_flags() or similar.
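
The ergonomic rule suggested above (pick a value only when the user has not chosen one) can be sketched in isolation like this. BoolFlag and set_numa_ergonomics() are hypothetical stand-ins for illustration, not HotSpot's flag machinery, and the node count is passed in instead of being queried from the OS.

```cpp
#include <cassert>

// Hypothetical stand-in for a VM flag together with its "still at default"
// state, which FLAG_IS_DEFAULT would report in the real code base.
struct BoolFlag {
  bool value;
  bool is_default;  // true when the user did not set the flag explicitly
};

// Turn NUMA support on ergonomically when more than one node is active,
// but never override an explicit command-line choice.
static void set_numa_ergonomics(BoolFlag& use_numa, int active_numa_nodes) {
  assert(active_numa_nodes >= 0);
  if (use_numa.is_default) {
    use_numa.value = active_numa_nodes > 1;
  }
}
```

Keeping the decision in one platform-independent function leaves only the node count as the OS-specific part.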
>>>>>>>>>>
>>>>>>>>>> Seems a lot nicer than an explicit check for some
>>>>>>>>>> processor family.
>>>>>>>>>> Maybe a little more work though.
>>>>>>>>>>
>>>>>>>>>> Hth,
>>>>>>>>>> Thomas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>>
>>
>
>
More information about the hotspot-gc-dev
mailing list