Review Request: UseNUMAInterleaving

David Dabbs dmdabbs at gmail.com
Mon May 16 11:21:03 PDT 2011


> -----Original Message-----
> From: hotspot-compiler-dev-bounces at openjdk.java.net [mailto:hotspot-
> compiler-dev-bounces at openjdk.java.net] On Behalf Of Deneau, Tom
> Sent: Monday, May 16, 2011 12:54 PM
> To: 'hotspot-compiler-dev at openjdk.java.net'
> Subject: Review Request: UseNUMAInterleaving
> 
> Please review this patch which adds a new flag called
> UseNUMAInterleaving.  This flag provides a subset of the functionality
> provided by UseNUMA, and its main purpose is to provide that subset on
> OSes like Windows which do not support the full UseNUMA functionality.
> In UseNUMA terminology, UseNUMAInterleaving makes all memory
> "numa_global" which is implemented as interleaved.
> 
> The situations where this shows the biggest benefits would be:
>    * Windows platforms with multiple numa nodes (e.g., 4)
> 
>    * The JVM process is run across all the nodes (not affinitized to
> one node).
> 
>    * A workload that uses the majority of the cores in the machine, so
>      that the heap is being accessed from many cores, including remote
>      ones.
> 
>    * Enough memory per node and a heap size such that the default heap
>      placement policy on Windows would end up with the heap (or
>      nursery) placed on one node.
> 
> jbb2005 and SPECPower_ssj2008 are examples of such workloads.  In our
> measurements, we have seen some cases where the performance with
> UseNUMAInterleaving was 2.7x vs. the performance without. There were
> gains of varying sizes across all systems.
> 
> As currently implemented this flag is ignored on Linux and Solaris
> since they already support the full UseNUMA flag.
> 
> The webrev is at
> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.01/
> 
> Summary of changes:
> 
>    * Other than adding the new UseNUMAInterleaving global flag, all of
>      the changes are in src/os/windows/vm/os_windows.cpp
> 
>    * Some static routines were added to set things up at init time.  These
>       * check that the required APIs (VirtualAllocExNuma,
>         GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in
>         the OS
> 
>       * build the list of numa nodes on which this process has affinity
> 
>    * Changes to os::reserve_memory
>       * There was already a routine that reserved pages one page at a
>         time (used for Individual Large Page Allocation on WS2003).
>         This was abstracted to a separate routine, called
>         allocate_pages_individually.  This gets called both for the
>         Individual Large Page Allocation thing mentioned above and for
>         UseNUMAInterleaving (for both small and large pages)
> 
>       * When used for NUMA Interleaving this just goes through the numa
>         node list in a round-robin fashion, using a different node for
>         each chunk (with 4K pages a chunk is the minimum allocation
>         granularity, 64K; with 2M pages it is one page)
> 
>       * Whether we do just a reserve or a combined reserve/commit is
>         determined by the caller of allocate_pages_individually
> 
>          * When used with large pages, we do a Reserve and Commit at
>            the same time, which is the way it always worked and the way
>            it has to work on Windows.
> 
>          * For small pages, only the reserve is done, the commit will
>            come later. (which is the way it worked for
>            non-interleaved)
> 
>    * os::commit_memory changes
>       * If UseNUMAInterleaving is true, os::commit_memory has to check
>         whether it is being asked to commit memory that might have
>         come from multiple Reserve allocations; if so, the commits
>         must also be broken up.  We don't keep any data structure to
>         track this; we just use VirtualQuery, which queries the
>         properties of a VA range and can tell us how much came from
>         one VirtualAlloc call.
> 
> I do not have a bug id for this.
> 
> -- Tom Deneau, AMD
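
The reserve and commit behaviour described in the quoted summary can be
sketched roughly as follows. This is an illustration only, not code from
the webrev: the function names plan_interleaved_chunks and split_commit
are hypothetical, addresses are plain integers, and the chunk table is
kept in a list purely for demonstration (the patch is described as using
VirtualQuery instead of keeping any such data structure).

```python
def plan_interleaved_chunks(base, size, chunk_size, nodes):
    """Assign each chunk of [base, base + size) to a node, round-robin.

    Returns a list of (chunk_base, chunk_length, node) tuples. The last
    chunk may be shorter than chunk_size.
    """
    if chunk_size <= 0 or not nodes:
        raise ValueError("need a positive chunk size and at least one node")
    plan = []
    offset = 0
    index = 0
    while offset < size:
        length = min(chunk_size, size - offset)
        # Hypothetical stand-in for one per-node reserve call.
        plan.append((base + offset, length, nodes[index % len(nodes)]))
        offset += length
        index += 1
    return plan


def split_commit(addr, size, plan):
    """Split a commit request so that no piece crosses a chunk boundary.

    Returns a list of (piece_base, piece_length) tuples covering the part
    of [addr, addr + size) that overlaps the planned chunks.
    """
    end = addr + size
    pieces = []
    for chunk_base, chunk_length, _node in plan:
        lo = max(addr, chunk_base)
        hi = min(end, chunk_base + chunk_length)
        if lo < hi:
            pieces.append((lo, hi - lo))
    return pieces


if __name__ == "__main__":
    KB = 1024
    plan = plan_interleaved_chunks(0, 256 * KB, 64 * KB, [0, 1, 2, 3])
    for chunk in plan:
        print(chunk)
    # A commit that straddles the first chunk boundary is split in two.
    print(split_commit(48 * KB, 32 * KB, plan))
```

With 4K pages the chunk size would be the 64K allocation granularity
mentioned above; with 2M pages it would be one large page.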


Could this flag help Linux systems with kernel < 2.6.19, or is that the
minimum kernel needed for any JVM NUMA support?
Unfortunately, we run CentOS 5.5 (2.6.18)

Linux node01.int 2.6.18-194.17.4.el5 #1 SMP Mon Oct 25 15:50:53 EDT 2010
x86_64 x86_64 x86_64 GNU/Linux

and so -XX:+UseNUMA does not activate (at least not according to
PrintFlagsFinal). 

From http://www.infoq.com/news/2010/01/java6u18
In the Java HotSpot VM, the NUMA-aware allocator has been implemented to
provide automatic memory placement optimisations for Java applications.
Typically, every processor in the system has a local memory that provides
low access latency and high bandwidth, and remote memory that is
considerably slower to access. The NUMA-aware allocator is implemented for
Solaris (>= 9u2) and Linux (kernel >= 2.6.19, glibc >= 2.6.1) operating
systems, and can be turned on for the Parallel Scavenger garbage collector
with the -XX:+UseNUMA flag. Parallel Scavenger remains the default for a
server-class machine and can also be turned on explicitly by specifying the
-XX:+UseParallelGC option. The impact of the change is significant: When
evaluated against the SPEC JBB 2005 benchmark on an 8 chip Opteron machine,
NUMA-aware systems gave about a 30% (for 32-bit) to 40% (for 64-bit)
increase in performance.
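
The kernel half of that requirement can be compared against a uname
release string with a few lines of Python. This is only a sketch: the
function names are made up for illustration, it assumes the release
string begins with three dot-separated numbers (as in the uname output
above), and it says nothing about the glibc requirement or about what
the JVM itself checks.

```python
import re

# Minimum kernel version quoted in the InfoQ article above.
MIN_KERNEL = (2, 6, 19)


def parse_kernel_release(release):
    """Return the leading major.minor.patch of a release string as a tuple."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def kernel_meets_minimum(release, minimum=MIN_KERNEL):
    """True if the release is at or above the minimum (tuple comparison)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # Release string taken from the uname output earlier in this message.
    print(kernel_meets_minimum("2.6.18-194.17.4.el5"))
```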



Thanks,

David





