Review Request: UseNUMAInterleaving #6
Deneau, Tom
tom.deneau at amd.com
Wed Aug 24 16:26:54 UTC 2011
I believe I have addressed ramki's comments with
http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.06/
-- Tom
> -----Original Message-----
> From: Y. S. Ramakrishna [mailto:y.s.ramakrishna at oracle.com]
> Sent: Tuesday, August 23, 2011 6:51 PM
> To: Deneau, Tom
> Cc: hotspot-gc-dev at openjdk.java.net
> Subject: Re: Review Request: UseNUMAInterleaving #4
>
> Hi Tom -- the perf improvement on windows is impressive.
>
> The changes look good. Just a few very minor nits below:
>
> globals.hpp: In the doc string field for NUMAInterleaveGranularity, you
> might state that this is a Windows only option. (although i recognize
> that this hasn't been done for some of the other windows options that
> i became aware of now as being used exclusively in windows before
> your changes, for instance: UseLargePagesIndividualAllocation
> and LargePagesIndividualAllocationInjectError).
>
> arguments.cpp: you could get rid of the empty lines 1432-1433, and move
> the
> content of 1428-1430 into the if-scope of 1422-1426.
>
> os_windows.cpp: you can probably get rid of the extra newline
> introduced at line 1967.
>
> line 3018, typo: "NUMAInterleavaing"
> also at line 3033: "thNUMANodeListHolderat"
> The comment at lines 3030-3033 would also benefit
> from a few missing punctuation marks.
>
> at lines 3040 and 3043, it might read better to place the returns
> on lines of their own.
>
> If you run with +UseNUMAInterleaving and a commit failed,
> it would seem that the error message at line 2987 would be
> confusing and incorrect. Perhaps you want to suitably modify
> it or just suppress the additional text in that case.
>
> os_solaris.cpp: 2780-2784, it might make sense to do the madvise
> global/many
> call only if the mmap_chunk() succeeds, rather than all the time as you
> are doing. May be something like:-
>
> 2780 char *res = Solaris::mmap_chunk(addr, size, MAP_PRIVATE|MAP_FIXED,
> prot);
> 2781 if (res != NULL) {
> if (UseNUMAInterleaving) {
> 2782 numa_make_global(res, size);
> }
> return true;
> 2783 }
> 2784 return false;
>
> At line 3444, would it make sense to use "size" instead of "bytes"
> (although
> size is just a copy of bytes -- i don't understand the reason for making
> the copy, so feel free to ignore if this is some recherche style issue;
> otherwise
> it might make sense to get rid of the copy and just use the formal
> parameter as is
> the case for the Linux code; although this is really not code that you
> introduced,
> but just because you happen to be touching code in the vicinity... your
> choice.)
>
> In the same vein, i'd make the Linux code similar in shape to
> the solaris code for the two hunks changed in os_linux.cpp.
>
> rest looks good.
> -- ramki
>
> On 08/23/11 12:59, Deneau, Tom wrote:
> > OK, http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.05/
> > should address the concerns listed below...
> >
> > -- Tom
> >
> >
> >> -----Original Message-----
> >> From: Igor Veresov [mailto:igor.veresov at oracle.com]
> >> Sent: Tuesday, August 23, 2011 1:53 PM
> >> To: Deneau, Tom
> >> Cc: hotspot-gc-dev at openjdk.java.net
> >> Subject: Re: Review Request: UseNUMAInterleaving #4
> >>
> >> Tom,
> >>
> >> This looks good to me, except three minor things:
> >>
> >> os_windows.cpp:
> >>
> >> - you should check for null here:
> >> 2630 ~NUMANodeListHolder() {
> >>> if (_numa_used_node_list != NULL) {
> >> 2631 FREE_C_HEAP_ARRAY(int, _numa_used_node_list);
> >>> }
> >> 2632 }
> >>
> >> - if NUMANodeListHolder::build() will be called multiple times, you'll
> >> leak memory. I guess you should check if _numa_used_node_list is NULL
> and
> >> if not free it first.
> >>
> >> - you didn't modify os::numa_get_leaf_groups() to handle the situation
> >> when the value of argument "size" is bigger than
> >> NUMANodeListHolder::get_count(). You can use MIN2 to adjust the value.
> >> See my comment in the previous mail.
> >>
> >>
> >> igor
> >>
> >> On Tuesday, August 23, 2011 at 11:23 AM, Deneau, Tom wrote:
> >>
> >>> Please review this patch which adds a new flag called
> >>> UseNUMAInterleaving. This flag provides a subset of the functionality
> >>> provided by UseNUMA. In Hotspot UseNUMA terminology,
> >>> UseNUMAInterleaved makes all memory "numa_global" which is
> implemented
> >>> as interleaved. This patch's main purpose is to provide that subset
> >>> on OSes like Windows which do not support the full UseNUMA
> >>> functionality. However, a simple implementation of
> UseNUMAInterleaving
> >> is
> >>> also provided for other OSes
> >>>
> >>> The situations where this shows the biggest benefits would be:
> >>> * Windows platforms with multiple numa nodes (eg, 4)
> >>>
> >>> * The JVM process is run across all the nodes (not affinitized to
> >>> one node).
> >>>
> >>> * A workload that has enough threads so that it uses the majority
> >>> of the cores in the machine, so that the heap is being accessed
> >>> from many cores, including remote ones.
> >>>
> >>> * Enough memory per node and a heap size such that the default heap
> >>> placement policy on windows would end up with the heap (or
> >>> nursery) placed on one node.
> >>>
> >>> jbb2005 and SPECPower_ssj2008 are examples of such workloads. In our
> >>> measurements, we have seen some cases where the performance with
> >>> UseNUMAInterleaving was 2.7x vs. the performance without. There were
> >>> gains of varying sizes across all systems.
> >>>
> >>> The webrev is at
> >>> http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.04/
> >>>
> >>> Summary of changes in webrev.04 from webrev.03:
> >>>
> >>> * As suggested by Igor Veresov, UseNUMA can imply
> >>> UseNUMAInterleaving on all platforms. This is in arguments.cpp
> >>>
> >>> * In NUMANodeListHolder in os_windows.cpp, allocates the node_list
> >>> dynamically rather than assuming a length of 64. The method
> >>> NUMANodeListHolder::get_node_list_entry checks returns -1 for
> >>> indexes that are out of bounds.
> >>>
> >>> * Several code convention cleanups suggested by Igor.
> >>>
> >>> * Merge with the new style system dll function resolutions from
> >>> "7016797: Hotspot: securely/restrictive load dlls and new API for
> >>> loading system dlls" Note: my new NUMA functions are outside the
> >> ifdefs.
> >>>
> >>> Summary of changes in webrev.03 from webrev.02:
> >>>
> >>> * As suggested by Igor Veresov, reverts to using
> >>> UseNUMAInterleaving as the enabling flag. This will make it
> >>> easier in the future when there are GCs that enable fuller
> >>> UseNUMA on Windows.
> >>>
> >>> * Adds a simple implementation of UseNUMAInterleaving on Linux and
> >>> Solaris, which just calls numa_make_global after commit_memory
> >>> and reserve_memory_special
> >>>
> >>> * Adds a flag NUMAInterleaveGranularity which allows setting the
> >>> granularity with which we move to a different node in a memory
> >>> allocation. The default is 2MB. This flag only applies to
> >>> Windows for now.
> >>>
> >>> * Several code cleanups in os_windows.cpp suggested by Igor.
> >>>
> >>>
> >>> Summary of overall changes in os_windows.cpp:
> >>>
> >>> * Some static routines were added to set things up init time. These
> >>> * check that the required APIs (VirtualAllocExNuma,
> >>> GetNumaHighestNodeNumber, GetNumaNodeProcessorMask) exist in
> >>> the OS
> >>>
> >>> * build the list of numa nodes on which this process has affinity
> >>>
> >>> * Changes to os::reserve_memory
> >>> * There was already a routine that reserved pages one page at a
> >>> time (used for Individual Large Page Allocation on WS2003).
> >>> This was abstracted to a separate routine, called
> >>> allocate_pages_individually. This gets called both for the
> >>> Individual Large Page Allocation thing mentioned above and for
> >>> UseNUMAInterleaving (for both small and large pages)
> >>>
> >>> * When used for NUMA Interleaving this just goes thru the numa
> >>> node list in a round-robin fashion, allocating chunks at the
> >>> NUMAInterleaveGranularity using a different allocation for
> >>> each chunk
> >>>
> >>> * Whether we do just a reserve or a combined reserve/commit is
> >>> determined by the caller of allocate_pages_individually
> >>>
> >>> * When used with large pages, we do a Reserve and Commit at
> >>> the same time which is the way it always worked and the way
> >>> it has to work on windows.
> >>>
> >>> * For small pages, only the reserve is done, the commit will
> >>> come later. (which is the way it worked for
> >>> non-interleaved)
> >>>
> >>> * os::commit_memory changes
> >>> * If UseNUMAIntereaving is true, os::commit_memory has to check
> >>> whether it was being asked to commit memory that might have
> >>> come from multiple Reserve allocations, if so, the commits
> >>> must also be broken up. We don't keep any data structure to
> >>> keep track of this, we just use VirtualQuery which queries the
> >>> properties of a VA range and can tell us how much came from
> >>> one VirtualAlloc call.
> >>>
> >>> I do not have a bug id for this.
> >>>
> >>> -- Tom Deneau, AMD
> >>
> >
More information about the hotspot-gc-dev
mailing list