linux os processor optimizations for OpenJDK GC performance enhancement

Volker Simonis volker.simonis at gmail.com
Tue Apr 25 08:57:58 UTC 2017


Hi Ram,

while this sounds interesting, I wonder how this plays together with
NUMA and Large page support. I understand that these are different
concepts, but in the end it all boils down to the fact that memory
access is not uniform and we have different "kinds" of memory. It
seems to me that this fact is currently not very well handled in
HotSpot and needs some general redesign. There are for example two
JEPs [1,2] about improving the NUMA support in general and in G1. One
of the problems is that NUMA support doesn't play well together with
Large/Huge page support.

I think your proposal must be evaluated in the broader context of
enhancing the VM and GC for non-uniform memory architectures.
Otherwise it would be yet another point fix which doesn't play well
together with other features like NUMA and LargePages.
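
As a rough illustration of the two features mentioned above (a minimal
sketch, not part of the proposal): assuming a HotSpot-based JVM, the
current settings of the existing -XX:+UseNUMA and -XX:+UseLargePages
flags can be read at run time through the HotSpotDiagnosticMXBean. The
class name NumaFlagCheck and the helper flagValue are made up for this
example, and the "unavailable" fallback is just a convention chosen here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class NumaFlagCheck {
    // Hypothetical helper: returns the current value of a HotSpot flag as a
    // string, or "unavailable" if the running VM does not expose that flag.
    public static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown flag (IllegalArgumentException) or no such MXBean.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        // Print the settings the VM is actually running with.
        for (String flag : new String[] {"UseNUMA", "UseLargePages"}) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

Running it once with "java -XX:+UseNUMA -XX:+UseLargePages NumaFlagCheck"
and once without the options shows which settings the VM really ended up
with. It says nothing about how the two features interact, which is the
open question here.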

Thanks,
Volker

[1] https://bugs.openjdk.java.net/browse/JDK-8046153 (JEP 163: Enable
NUMA Mode by Default When Appropriate)
[2] https://bugs.openjdk.java.net/browse/JDK-8046147 (JEP 157: G1 GC:
NUMA-Aware Allocation)

On Wed, Apr 19, 2017 at 4:04 PM, Ram Krishnan <ramkri123 at gmail.com> wrote:
> Many thanks David.
>
> Thanks,
> Ramki
>
> On Tue, Apr 18, 2017 at 11:08 PM, David Holmes <david.holmes at oracle.com>
> wrote:
>
>> On 19/04/2017 11:38 AM, Ram Krishnan wrote:
>>
>>> Hi David,
>>>
>>> Many thanks, please find attached text version of document for temporary
>>> hosting.
>>>
>>
>> Hosted at: http://cr.openjdk.java.net/~dholmes/JEP-cache-partitioning-v1.txt
>>
>> David
>>
>>
>>> Thanks,
>>> Ramki
>>>
>>> On Tue, Apr 18, 2017 at 5:42 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>>     Hi Ramki,
>>>
>>>     On 19/04/2017 8:27 AM, Ram Krishnan wrote:
>>>
>>>         Hi David,
>>>
>>>         Thanks for the clarification.
>>>
>>>         I have signed the OCA and mailed it to
>>>         oracle-ca_us(at)oracle.com. Any help to expedite processing
>>>         would be much appreciated.
>>>
>>>
>>>     Can't help with that I'm afraid. :)
>>>
>>>         We are seeing promising POC results (details in the google doc)
>>>         for this proposal -- would really appreciate your help in moving
>>>         this forward.
>>>
>>>
>>>     If you email me a text/html version of the document I can host it on
>>>     cr.openjdk.java.net temporarily. For this to become a JEP you will
>>>     need a sponsor with the necessary OpenJDK credentials.
>>>
>>>     http://cr.openjdk.java.net/~mr/jep/jep-2.0-02.html
>>>
>>>     Cheers,
>>>     David
>>>
>>>
>>>         Thanks,
>>>         Ramki
>>>
>>>         On Tue, Apr 18, 2017 at 1:55 PM, David Holmes
>>>         <david.holmes at oracle.com> wrote:
>>>
>>>             Hi Ramki,
>>>
>>>             On 19/04/2017 12:34 AM, Ram Krishnan wrote:
>>>
>>>                 Please find detailed proposal below, looking forward to
>>>                 your comments.
>>>
>>>                 "Minimize application tail latency using
>>>                 cache-partitioning-aware G1GC" --
>>>
>>>                 https://docs.google.com/document/d/1rPMG4XUiE7cUEOogW1z5tBbBZTclOWyg0arhuycXN94/edit
>>>
>>>
>>>             All contributions to OpenJDK need to be hosted on OpenJDK
>>>             infrastructure not on external systems like the above.
>>>
>>>             Also I can not see you listed as an OCA signatory. Are you an
>>>             OpenJDK contributor?
>>>
>>>             Thanks,
>>>             David
>>>             -----
>>>
>>>                 Thanks,
>>>                 Ramki
>>>
>>>                 On Thu, Apr 13, 2017 at 11:04 PM, Bernd Eckenfels
>>>                 <ecki at zusammenkunft.net> wrote:
>>>
>>>                     Maybe it would be better to concentrate the processor
>>>                     optimizations on accessors and barriers without
>>>                     introducing a completely new GC architecture. I can
>>>                     imagine that especially in the area of NUMA, TLAB,
>>>                     huge pages, cache consistency and possibly MMX
>>>                     extensions there is some potential.
>>>
>>>                     Abandoning the global STW - while it seems like a
>>>                     pretty powerful change - is I guess not a good
>>>                     starter exercise. Especially since it is not only a
>>>                     question of mutator threads.
>>>
>>>                     Gruss
>>>                     Bernd
>>>                     --
>>>                     http://bernd.eckenfels.net
>>>                     ------------------------------
>>>                     *From:* hotspot-gc-dev
>>>                     <hotspot-gc-dev-bounces at openjdk.java.net> on
>>>                     behalf of Ram Krishnan <ramkri123 at gmail.com>
>>>                     *Sent:* Friday, April 14, 2017 6:36:27 AM
>>>                     *To:* Asif Qamar; Andrew Haley;
>>>                     hotspot-gc-dev at openjdk.java.net
>>>                     *Subject:* Re: linux os processor optimizations for
>>>                     OpenJDK GC performance enhancement
>>>
>>>                     Thanks Andrew.
>>>
>>>                     >> Surely there is: a thread could have its TLAB
>>>                     >> allocated from a region local to that socket (or
>>>                     >> core), and the GC thread for that region could run
>>>                     >> on the same socket.  It only works for young gen,
>>>                     >> but that's a lot of the problem.
>>>
>>>
>>>                     A clarification -- does the TLAB allocation apply to
>>>                     tenured space also? If not, the above would work only
>>>                     for young gen cases where there is no promotion to
>>>                     tenured, right?
>>>
>>>                     Thanks,
>>>                     Ramki
>>>
>>>                     On Thu, Apr 13, 2017 at 12:55 PM, Ram Krishnan
>>>                     <ramkri123 at gmail.com> wrote:
>>>
>>>
>>>                         ---------- Forwarded message ----------
>>>                         From: Andrew Haley <aph at redhat.com>
>>>                         Date: Thu, Apr 13, 2017 at 9:52 AM
>>>                         Subject: Re: linux os processor optimizations for
>>>                         OpenJDK GC performance enhancement
>>>                         To: hotspot-gc-dev at openjdk.java.net
>>>
>>>
>>>                         On 13/04/17 16:33, Kim Barrett wrote:
>>>
>>>                             An application thread may touch memory in any
>>>                             region; there is no notion of a thread being
>>>                             "scoped" to a specific set of regions. While
>>>                             it might happen that a thread would only touch
>>>                             regions not being worked on by the collector,
>>>                             there is no a priori way to know that.
>>>
>>>
>>>
>>>                         Surely there is: a thread could have its TLAB
>>>                         allocated from a region local to that socket (or
>>>                         core), and the GC thread for that region could
>>>                         run on the same socket.  It only works for young
>>>                         gen, but that's a lot of the problem.
>>>
>>>                         Andrew.
>>>
>>>
>>>
>>>
>>>                         --
>>>                         Thanks,
>>>                         Ramki
>>>
>>>
>>>
>>>
>>>                     --
>>>                     Thanks,
>>>                     Ramki
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>         --
>>>         Thanks,
>>>         Ramki
>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Ramki
>>>
>>
>
>
> --
> Thanks,
> Ramki

