Review Request: UseNUMAInterleaving

Thu May 26 16:37:04 PDT 2011

I have incorporated the change suggested by Paul Hohensee: the patch now uses the existing UseNUMA flag rather than introducing a new flag.  Please let me know when you think this can be checked in.

The new webrev is at
http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/

-- Tom Deneau, AMD

> -----Original Message-----
> From: Deneau, Tom
> Sent: Monday, May 16, 2011 12:54 PM
> To: 'hotspot-compiler-dev at openjdk.java.net'
> Subject: Review Request: UseNUMAInterleaving
> 
> Please review this patch which adds a new flag called
> UseNUMAInterleaving.  This flag provides a subset of the functionality
> provided by UseNUMA, and its main purpose is to provide that subset on
> OSes like Windows which do not support the full UseNUMA functionality.
> In UseNUMA terminology, UseNUMAInterleaving makes all memory
> "numa_global", which is implemented as interleaved.
> 
> The situations where this shows the biggest benefits would be:
>    * Windows platforms with multiple NUMA nodes (e.g., 4)
> 
>    * The JVM process is run across all the nodes (not affinitized to one
> node).
> 
>    * A workload that uses the majority of the cores in the machine, so
>      that the heap is being accessed from many cores, including remote
>      ones.
> 
>    * Enough memory per node and a heap size such that the default heap
>      placement policy on Windows would end up with the heap (or
>      nursery) placed on one node.
> 
> SPECjbb2005 and SPECpower_ssj2008 are examples of such workloads.  In our
> measurements, we have seen some cases where the performance with
> UseNUMAInterleaving was 2.7x the performance without it. There were
> gains of varying sizes across all systems.
> 
> As currently implemented this flag is ignored on Linux and Solaris
> since they already support the full UseNUMA flag.
> 
> The webrev is at
> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.01/
> 
> Summary of changes:
> 
>    * Other than adding the new UseNUMAInterleaving global flag, all of
>      the changes are in src/os/windows/vm/os_windows.cpp
> 
>    * Some static routines were added to set things up at init time.  These
>       * check that the required APIs (VirtualAllocExNuma,
>         GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in
>         the OS
> 
>       * build the list of NUMA nodes on which this process has affinity
> 
>    * Changes to os::reserve_memory
>       * There was already a routine that reserved pages one page at a
>         time (used for Individual Large Page Allocation on WS2003).
>         This was factored out into a separate routine, called
>         allocate_pages_individually.  This gets called both for the
>         Individual Large Page Allocation case mentioned above and for
>         UseNUMAInterleaving (for both small and large pages).
> 
>       * When used for NUMA interleaving, this just goes through the NUMA
>         node list in a round-robin fashion, using a different node for
>         each chunk (with 4K pages, the chunk is the minimum allocation
>         granularity of 64K; with 2M pages, it is one page).
> 
>       * Whether we do just a reserve or a combined reserve/commit is
>         determined by the caller of allocate_pages_individually
> 
>          * When used with large pages, we do a reserve and commit at
>            the same time, which is the way it always worked and the way
>            it has to work on Windows.
> 
>          * For small pages, only the reserve is done; the commit
>            comes later (which is the way it worked for
>            non-interleaved).
> 
>    * os::commit_memory changes
>       * If UseNUMAInterleaving is true, os::commit_memory has to check
>         whether it is being asked to commit memory that might have
>         come from multiple reserve allocations; if so, the commits
>         must also be broken up.  We don't keep any data structure to
>         keep track of this; we just use VirtualQuery, which queries the
>         properties of a VA range and can tell us how much came from
>         one VirtualAlloc call.
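
For readers who want to see the round-robin reservation described under os::reserve_memory in concrete form, here is a minimal, portable sketch. It is not the code from the webrev: the names Chunk and plan_interleaved are hypothetical, and it only computes which NUMA node each chunk would go to. In the real patch, each chunk would presumably be handed to VirtualAllocExNuma with the chosen node as the preferred node.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record describing one chunk of an interleaved reservation.
struct Chunk {
    std::size_t offset;  // byte offset from the start of the reserved range
    std::size_t size;    // chunk length in bytes
    unsigned    node;    // NUMA node chosen for this chunk
};

// Split `bytes` into chunks of at most `chunk_size` bytes and assign each
// chunk to the next node in `nodes`, wrapping around (round-robin).
// `nodes` stands in for the list of nodes on which the process has affinity.
std::vector<Chunk> plan_interleaved(std::size_t bytes, std::size_t chunk_size,
                                    const std::vector<unsigned>& nodes) {
    assert(chunk_size > 0 && !nodes.empty());  // precondition of this sketch
    std::vector<Chunk> plan;
    std::size_t next = 0;  // index into `nodes` for the next chunk
    for (std::size_t off = 0; off < bytes; off += chunk_size) {
        // The final chunk may be shorter than chunk_size.
        std::size_t len = (bytes - off < chunk_size) ? bytes - off : chunk_size;
        plan.push_back({off, len, nodes[next]});
        next = (next + 1) % nodes.size();
    }
    return plan;
}
```

With 4K pages the chunk size would be the 64K allocation granularity, and with 2M pages it would be one large page, as described above.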
> 
> I do not have a bug id for this.
> 
> -- Tom Deneau, AMD
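
The commit-splitting step described under os::commit_memory can be sketched in the same spirit. This is an illustration under stated assumptions, not the patch itself: Range, find_reservation and split_commit are hypothetical names, and find_reservation is a plain lookup standing in for the VirtualQuery call, which on Windows would report how much of the range came from one allocation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of an address range: [base, base + size).
struct Range {
    std::size_t base;
    std::size_t size;
};

// Stand-in for a VirtualQuery-style lookup: return the reservation that
// contains `addr`, or nullptr if `addr` is not inside any reservation.
static const Range* find_reservation(const std::vector<Range>& reservations,
                                     std::size_t addr) {
    for (const Range& r : reservations) {
        if (addr >= r.base && addr - r.base < r.size) return &r;
    }
    return nullptr;
}

// Break a commit request [addr, addr + bytes) into pieces so that each
// piece lies entirely within a single reservation. Returns an empty vector
// if any part of the request falls outside the known reservations.
std::vector<Range> split_commit(const std::vector<Range>& reservations,
                                std::size_t addr, std::size_t bytes) {
    std::vector<Range> pieces;
    while (bytes > 0) {
        const Range* r = find_reservation(reservations, addr);
        if (r == nullptr) return {};  // request touches unreserved memory
        std::size_t left_in_reservation = r->base + r->size - addr;
        std::size_t len = bytes < left_in_reservation ? bytes : left_in_reservation;
        pieces.push_back({addr, len});  // one commit call per piece
        addr += len;
        bytes -= len;
    }
    return pieces;
}
```

Querying at commit time, rather than recording every reservation, matches the choice described above of not keeping a separate data structure; the vector of reservations here exists only to make the sketch self-contained.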