[External] : Re: New NMT Lightweight mode

Fri Dec 1 07:59:48 UTC 2023

Thank you for considering my thoughts. I am happy that we keep thinking
about improving NMT, it is an important tool.

Thanks, Thomas

On Fri, Dec 1, 2023 at 8:55 AM Afshin Zafari <afshin.zafari at oracle.com>
wrote:

> Thank you Thomas for your valuable comments and thoughts.
> Since the design is still in progress and all the change is a prototype
> yet, all your comments will be considered meanwhile.
>
> Best,
> Afshin
> ------------------------------
> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> *Sent:* 01 December 2023 08:32
> *To:* Afshin Zafari <afshin.zafari at oracle.com>; Johan Sjolen <
> johan.sjolen at oracle.com>
> *Cc:* Thomas Stuefe <tstuefe at redhat.com>; hotspot-runtime-dev at openjdk.org
> <hotspot-runtime-dev at openjdk.org>
> *Subject:* [External] : Re: New NMT Lightweight mode
>
> Some more thoughts:
>
> I wondered how your lightweight mode would track malloc sizes, and do so
> more cheaply than summary mode.
>
> You cannot use malloc headers since this would increase the footprint (by
> a large amount in corner cases) and would be unacceptable for an always-on
> "lightweight" mode.
>
> If you plan on forcing every caller of free() to hand in the size, this
> would be really ugly and cumbersome. In many cases it would not even work,
> e.g. for users of Unsafe.freeMemory().
>
> But even if you make malloc tracking work, I suspect it will not be faster
> than today's summary mode since you don't do anything different compared to
> summary mode. You increase atomic counters, lockfree - summary mode does
> the same. So you will pay the same cost as summary mode. But that cost
> absolutely dominates - see
> https://stackoverflow.com/questions/73126185/what-is-overhead-of-java-native-memory-tracking-in-summary-mode/73175766#73175766
> <https://urldefense.com/v3/__https://stackoverflow.com/questions/73126185/what-is-overhead-of-java-native-memory-tracking-in-summary-mode/73175766*73175766__;Iw!!ACWV5N9M2RV99hQ!L6l6nbwTnoGFEoPUn0t2EGQcR_9y3xazBQY5dM87Ts2pjn8T8GciutahGgpa4-bfO4B_R18i4ausP18cVgt0NPnSGWA$>):
> the examples that were costly in summary were malloc-heavy scenarios. That
> makes sense since the number of malloc calls outweighs the number of mmap
> calls by orders of magnitude.
>
> Therefore my assumption is that in malloc-heavy scenarios, the performance
> cost for lightweight is the same as for summary - a strong argument against
> enabling lightweight by default. The only performance advantage we have for
> processing mmap calls, which are rare in comparison; and that performance
> gain comes at the cost of wildly inaccurate mmap numbers.
>
> I also think rolling out a tool that reports numbers you cannot rely on is
> not good. Wrong numbers are worse than no numbers.
>
> Sorry for being so insistent about this - I really think the investment in
> a new lightweight mode, with the complexity it would bring and the new
> restrictions wrt API usage across the code base, would be better spent by
> making the existing summary mode faster. That mode was meant to be the
> lightweight mode.
>
> Cheers, Thomas
>
>
>
>
>
>
>
>
>
> On Thu, Nov 30, 2023 at 12:29 PM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
> Hi Afshin,
>
> I don't understand what the point of a lightweight nmt is if you cannot
> trust its numbers. The numbers you report would be wrong, so why do it at
> all?
>
> range A = reserve 2G
> range B = reserve 2G
> commit 1G of range A
> commit 1G of range B
> release range A -> decrease by reserve size 2G? Your numbers now report
> "committed=0" whereas in reality, we still have 1G committed.
>
> If its just about tracking memory allocation *requests*, that can be done
> a lot simpler, e.g. via logging.
>
> Cheers, Thomas
>
> On Thu, Nov 30, 2023 at 12:12 PM Afshin Zafari <afshin.zafari at oracle.com>
> wrote:
>
> Dear Thomas and Johan,
> Thanks for your explanations. Some comments:
>
>    - The Lightweight mode is a new mode and does not change functionality
>    of the summary or detail mode. Those modes can run as before.
>    - MEMFLAG argument is one of the missing data that if provided to NMT
>    API, it is possible to do some tracking.
>    - In this state of the Lightweight design the numbers that are
>    gathered are the *requests *of virtual memory
>    reserve/commit/uncommit/release operations. The actual amount of the memory
>    involved will be precise when case of overlapping regions is handled
>    properly.
>    - With this version, end-users can find which parts of JVM has the
>    most impact on total memory allocations.
>    - Becoming lightweight, enables NMT run as default.
>
>
> Cheers,
> Afshin
> ------------------------------
> *From:* hotspot-runtime-dev <hotspot-runtime-dev-retn at openjdk.org> on
> behalf of Thomas Stüfe <thomas.stuefe at gmail.com>
> *Sent:* 30 November 2023 08:51
> *To:* Johan Sjolen <johan.sjolen at oracle.com>; afshi.zafari at oracle.com <
> afshi.zafari at oracle.com>
> *Cc:* Thomas Stuefe <tstuefe at redhat.com>; hotspot-runtime-dev at openjdk.org
> <hotspot-runtime-dev at openjdk.org>
> *Subject:* Re: New NMT Lightweight mode
>
> Hi Afshin and Johan,
>
> thank you for Afshins branch; now I have a clearer picture of what you are
> getting at. See remarks below.
>
> *> The summary mode today still requires the VirtualMemoryTracker to be
> active, but this is only necessary because of MEMFLAGS not being passed
> when releasing memory, only when reserving/committing it.*
>
> I think that is a misunderstanding, the MEMFLAG issue is incidental. We
> need the virtual memory tracker for correct accounting. You cannot just
> count sizes, since mmaps don't happen in a vacuum.
>
> The most obvious example is release: you need to know the size and
> position of committed regions in the released region to correctly decrease
> commit numbers. As it is now, this proposal assumes on release that the
> whole range was committed. A released region can of course contain any mix
> of committed and uncommitted regions.
>
> The same goes for commit: E.g. in Metaspace, we uncommit and commit ranges
> that may contain a mix of committed and uncommitted regions.
>
> I still don't see how this could work with just counters.
>
> A large concern of mine is that an emasculation of the Virtual memory
> tracker would be a direct blocker against the possibility of extending NMT
> coverage beyond the borders of hotspot, since mmap usage in other areas can
> be even less curated. Extending NMT would make many people happy, and it
> has been discussed repeatedly [1]. There is work happening in this
> direction; if this proposal were to go forward, it would inhibit that
> effort.
>
> As a side effect, we would also lose the ability to determine MEMFLAG by
> pointer, so it disables recent work like
> https://bugs.openjdk.org/browse/JDK-8318636.
>
> --
>
> If this is really necessary, there are many alternatives that could
> improve performance while still tracking mmap regions.
>
> Data-structure-wise, we could track regions e.g. in a BST, or, as proposed
> a flat, cache-optimized array. I am happy to see that you already work in
> that direction.
>
> We also could do delayed bulk accounting. On mmap, journal the tracking
> info in a lock-free global list and process it in bulk later. This would
> introduce "staleness" to NMT reports and can interfere with functionality
> that expects tracking information inside NMT to be up-to-date. That is
> solveable, but let's first see if the performance gain is worth the
> complexity increase.
>
>
> *>Are you sure this is even worth optimizing? I doubt a typical jvm run
> involves more than a few dozen/hundred mmap calls. That is not exactly hot.
> Are you sure? A ZGC enabled JVM can have a lot of NMT calls, and Metaspace
> does a lot of small and contiguous allocations. We do see a performance
> degradation with just summary mode.*
>
> Is the performance degradation during normal operation or during
> artificial micros?
>
> Some time ago, I did performance measurements of NMT summary mode [2]. For
> normal operation, summary mode seemed not to matter. It only mattered to a
> small degree in a malloc-heavy benchmark, but that's because of the
> mallocs, not because of mmap. Therefore I am not convinced yet that
> optimizing the mmap path is worth the cost for this proposal. E.g. I would
> have thought that with ZGC, by the time we amass millions of regions, the
> CPU is burned elsewhere.
>
> I am not against performance improvements. I just really dislike the idea
> of limiting NMTs ability to track mmap, since I think that would go into
> the wrong direction.
>
> Cheers, Thomas
>
> [1]
> https://mail.openjdk.org/pipermail/core-libs-dev/2022-November/096197.html
> [2]
> https://stackoverflow.com/questions/73126185/what-is-overhead-of-java-native-memory-tracking-in-summary-mode/73175766#73175766
> <https://urldefense.com/v3/__https://stackoverflow.com/questions/73126185/what-is-overhead-of-java-native-memory-tracking-in-summary-mode/73175766*73175766__;Iw!!ACWV5N9M2RV99hQ!L6l6nbwTnoGFEoPUn0t2EGQcR_9y3xazBQY5dM87Ts2pjn8T8GciutahGgpa4-bfO4B_R18i4ausP18cVgt0NPnSGWA$>
>
>
>
>
> On Wed, Nov 29, 2023 at 11:24 PM Johan Sjolen <johan.sjolen at oracle.com>
> wrote:
>
> Hi Thomas,
>
> >You are talking about rewriting the NMT backend tracking virtual memory
> regions lock-free, right
> Only VirtualMemorySummary, not VirtualMemoryTracker!
>
> >Are you sure this is even worth optimizing? I doubt a typical jvm run
> involves more than a few dozen/hundred mmap calls. That is not exactly hot.
> > Then, are you sure that swapping a lock synchronization with a bunch of
> atomic variable updates brings that much performance gain? Atomic updates
> are still expensive.
> >Do we even have much contention? I would have thought that concurrently
> happening mmap calls are very rare.
> >Reducing contention can be done a lot simpler, first by swapping
> ThreadCritical for a targeted mutex as we always wanted.
>
> Unfortunately, Afshin did not list his branch in the e-mail, here's a link
> :https://github.com/afshin-zafari/jdk/commits/_nmt_light
> <https://urldefense.com/v3/__https://github.com/afshin-zafari/jdk/commits/_nmt_light__;!!ACWV5N9M2RV99hQ!L6l6nbwTnoGFEoPUn0t2EGQcR_9y3xazBQY5dM87Ts2pjn8T8GciutahGgpa4-bfO4B_R18i4ausP18cVgt0fXFD03Y$>
>
> Let's make sure that we're not talking past each other: What Afshin is
> proposing is replacing the summary mode tracking. We do have some
> preliminary performance results showing that these are effective (see
> second commit in branch).
> Basic idea is this:
> The summary mode today still requires the VirtualMemoryTracker to be
> active, but this is only necessary because of MEMFLAGS not being passed
> when releasing memory, only when reserving/committing it. If you add
> MEMFLAGs as a parameter to release_memory etc, then summary mode is
> independent of the VMT and the rest of Afshin's patch are natural
> progressions from the consequences of that.
>
> >If that is not good enough, change the underlying data structure. A
> linked list of regions is not that effective. There are data structures
> that are better suited for this, e.g. interval trees.
> >But a very simple and possibly very effective change could be to change
> the structure from a list to an array. That's a lot more cache friendly. If
> you optimise it right (e.g. similar to the "CachedNMTInformation" cache
> here:
> https://github.com/tstuefe/jdk/blob/24a9916aedecbde76f5c6d79c001133b750f431d/src/hotspot/share/nmt/memMapPrinter.cpp#L84
> <https://urldefense.com/v3/__https://github.com/tstuefe/jdk/blob/24a9916aedecbde76f5c6d79c001133b750f431d/src/hotspot/share/nmt/memMapPrinter.cpp*L84__;Iw!!ACWV5N9M2RV99hQ!L6l6nbwTnoGFEoPUn0t2EGQcR_9y3xazBQY5dM87Ts2pjn8T8GciutahGgpa4-bfO4B_R18i4ausP18cVgt0qlUPbAY$> )
> this can be orders of magnitude faster.
>
> Yes, I have a couple of branches for this work. You can find the most
> current one in https://github.com/jdksjolen/jdk/tree/nmt-memory-view
> <https://urldefense.com/v3/__https://github.com/jdksjolen/jdk/tree/nmt-memory-view__;!!ACWV5N9M2RV99hQ!L6l6nbwTnoGFEoPUn0t2EGQcR_9y3xazBQY5dM87Ts2pjn8T8GciutahGgpa4-bfO4B_R18i4ausP18cVgt0S-akxng$>
> I really like Afshin's patch because it will also help this work.
> Especially, if release_memory etc pass along the MEMFLAGS then we can use
> this information to skip scanning a lot of regions.
>
> So, I see this patch as differentiating further between summary and
> detailed mode and also helping out writing a faster virtual memory tracker.
>
> >Are you sure this is even worth optimizing? I doubt a typical jvm run
> involves more than a few dozen/hundred mmap calls. That is not exactly hot.
> Are you sure? A ZGC enabled JVM can have a lot of NMT calls, and Metaspace
> does a lot of small and contiguous allocations. We do see a performance
> degradation with just summary mode.
>
> I hope that this brings some clarity to the issue.
>
> All the best,
> Johan
> ------------------------------
> *From:* hotspot-runtime-dev <hotspot-runtime-dev-retn at openjdk.org> on
> behalf of Thomas Stuefe <tstuefe at redhat.com>
> *Sent:* Wednesday, November 29, 2023 19:20
> *To:* Afshin Zafari <afshin.zafari at oracle.com>
> *Cc:* hotspot-runtime-dev at openjdk.org <hotspot-runtime-dev at openjdk.org>
> *Subject:* Re: New NMT Lightweight mode
>
> Hi Ashin,
>
> I did not see your mail, therefore I replied in JBS.
>
> Cheers, Thomas
>
> On Wed, Nov 29, 2023 at 6:09 PM Afshin Zafari <afshin.zafari at oracle.com>
> wrote:
>
> Dear all,
> The JBS RFE https://bugs.openjdk.org/browse/JDK-8320977 is created for
> implementing a new Lightweight mode for NativeMemoryTracking (NMT) of JDK.
> Since the changes for this RFE impact a large number of files and code in
> JDK, it would be great if I get comments while prototyping it.
>
> Best,
> Afshin
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-runtime-dev/attachments/20231201/fa5bef46/attachment-0001.htm>