JVM stalls around uncommitting
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Apr 6 06:33:35 UTC 2020
Hi Per,
On Sat, Apr 4, 2020 at 8:42 PM Per Liden <per.liden at oracle.com> wrote:
> Hi Thomas,
>
> On 4/4/20 11:21 AM, Thomas Stüfe wrote:
> > Sorry, let me formulate this question in a more precise manner:
> >
> > Assuming you use the "traditional" huge pages, HugeTLBFS, we take the
> > pages from the huge page pool. The content of the huge page pool is not
> > usable for other applications. So uncommitting would only benefit other
> > applications using huge pages. But that's okay and would be useful too.
>
> It depends a bit on how you've set up the huge page pool. Normally, you
> set nr_hugepages to configure the huge page pool to have a fixed number
> of pages, with a guarantee that those pages will actually be there when
> needed. Applications explicitly using huge pages will allocate from the
> pool. Applications that uncommit such pages will return them to the pool
> for other applications (that are also explicitly using huge pages) to use.
>
>
Good to know.
> However, you can instead (or also) choose to configure
> nr_overcommit_hugepages. When the huge page pool is depleted (e.g.
> because nr_hugepages was set to 0 from the start) the kernel will try to
> allocate at most this number of huge pages from the normal page pool.
> These pages will show up as HugePages_Surp in /proc/meminfo. When
> uncommitting such pages, they will be returned to the normal page pool,
> for any other process to use (not just those explicitly using huge
> pages). Of course, you don't have the same guarantee that there are
> large pages available.
>
>
Oh this is nice. I did not know you could do this. It takes the sting out
of preallocating a huge page pool, especially on development machines.
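
For my own notes, here is a minimal sketch of what I understand an explicit
HugeTLBFS allocation to look like (just an illustration, not JVM code; error
handling kept minimal). If the static pool (nr_hugepages) is empty but
nr_overcommit_hugepages allows it, the mapping is satisfied from the normal
page pool and shows up as HugePages_Surp:

#include <sys/mman.h>
#include <cstdio>

int main() {
    const size_t sz = 2 * 1024 * 1024;  // one 2M huge page
    // Explicitly request a huge-page-backed mapping (HugeTLBFS path).
    void* p = mmap(nullptr, sz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");  // fails if no huge page can be reserved
        return 1;
    }
    // Unmapping returns the page to whichever pool it came from: the
    // static pool, or the normal page pool if it was a surplus page.
    munmap(p, sz);
    return 0;
}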
> >
> > The question to me would be whether reserving but not committing memory
> > backed by huge pages is any different from committing them right away.
> > Or, whether uncommitted pages are returned to the pool.
>
> It depends on what you mean by reserving. If you're going through
> ReservedSpace (i.e. os::reserve_memory_special() and friends), then yes,
> it's the same thing. But ZGC is not using those APIs, it has its own
> reserve/commit/uncommit infrastructure where reserve only reserves
> address space, and commit/uncommit actually allocates/deallocates pages.
>
>
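Ah, understood. So if I picture the reserve/commit/uncommit split at the
syscall level, it would be roughly something like the following - my own
simplified sketch, not ZGC's actual code, with alignment and error handling
omitted:

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int main() {
    const size_t sz = 2 * 1024 * 1024;

    // "Reserve": claim address space only, no physical pages yet.
    void* addr = mmap(nullptr, sz, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    // "Commit": map a hugetlbfs-backed file over the reserved range; this
    // is when huge pages are actually taken from the pool. (A real
    // implementation would align addr to the huge page size.)
    int fd = memfd_create("heap", MFD_HUGETLB);
    ftruncate(fd, sz);
    mmap(addr, sz, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);

    // "Uncommit": punch a hole in the backing file so the pages go back
    // to the pool, while the reserved address space stays in place.
    fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 0, sz);

    close(fd);
    return 0;
}

The punch-hole step would also match what Zoltan describes further down
about removing blocks from the backing file.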
> > I made a simple test with UseLargePages and a VM with a 100M heap, and
> > see that both the heap and the code heap are now backed by huge pages
> > as expected. I ran once with AlwaysPreTouch, once without. I do not see
> > any difference from the outside with regard to the number of used huge
> > pages. In /proc/pid/smaps the memory segments look identical in each
> > case. I may be doing this test wrong though...
>
> Maybe you weren't using ZGC? The code heap and all GCs, except ZGC, use
> ReservedSpace where large pages will be committed and "pinned" upfront,
> and no uncommit will happen.
>
>
That, and I also got confused with AIX, where huge pages are pinned by the
OS :)
> cheers,
> Per
>
>
Thank you for that extensive answer!
Cheers, Thomas
> >
> > Thanks a lot, and sorry again for hijacking this thread,
> >
> > Thomas
> >
> > p.s. without doubt using huge pages is hugely beneficial even without
> > uncommitting.
> >
> >
> >
> >
> > On Sat, Apr 4, 2020 at 10:00 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> >
> > Hi Per, Zoltan,
> >
> > sorry for getting in a question sideways, but I was curious.
> >
> > I always thought large pages are memory-pinned, so they cannot be
> > uncommitted? Or are you talking about using THPs?
> >
> > Cheers, Thomas
> >
> >
> > On Fri, Apr 3, 2020 at 9:38 AM Per Liden <per.liden at oracle.com> wrote:
> >
> > Hi Zoltan,
> >
> > On 4/3/20 1:27 AM, Zoltán Baranyi wrote:
> > > Hi Per,
> > >
> > > Thank you for confirming the issue and for recommending large pages. I
> > > re-ran my benchmarks with large pages and it gave me a 25-30% performance
> > > boost, which is a bit more than what I expected. My benchmarks run on a
> > > 600G heap with a 1.5-2GB/s allocation rate on a 40-core machine, so ZGC
> > > is busy. Since a significant part of the workload is ZGC itself, I assume
> > > - besides the higher TLB hit rate - this gain comes from managing the
> > > ZPages more effectively on large pages.
> >
> > A 25-30% improvement is indeed more than I would have expected. ZGC's
> > internal handling of ZPages is the same regardless of the underlying
> > page size, but as you say, you'll get a better TLB hit rate and the
> > mmap/fallocate syscalls become a lot less expensive.
> >
> > Another reason for the boost might be that ZGC's NUMA-awareness, until
> > recently, worked much better when using large pages. But this has now
> > been fixed, see https://bugs.openjdk.java.net/browse/JDK-8237649.
> >
> > Btw, which JDK version are you using?
> >
> > >
> > > I have a good experience overall, nice to see ZGC getting more and
> > > more mature.
> >
> > Good to hear. Thanks for the feedback!
> >
> > /Per
> >
> > >
> > > Cheers,
> > > Zoltan
> > >
> > > On Wed, Apr 1, 2020 at 9:15 AM Per Liden <per.liden at oracle.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> On 3/31/20 9:59 PM, Zoltan Baranyi wrote:
> > >>> Hi ZGC Team,
> > >>>
> > >>> I run benchmarks against our application using ZGC on heaps in the
> > >>> few-hundred-GB range. In the beginning everything goes smoothly, but
> > >>> eventually I experience very long JVM stalls, sometimes longer than
> > >>> one minute. According to the JVM log, reaching safepoints
> > >>> occasionally takes a very long time, matching the duration of the
> > >>> stalls I experience.
> > >>>
> > >>> After a few iterations, I started looking at uncommitting and learned
> > >>> that the way ZGC performs uncommitting - flushing the pages, punching
> > >>> holes, removing blocks from the backing file - can be expensive [1]
> > >>> when uncommitting tens or more than a hundred GB of memory. The
> > >>> trace-level heap logs confirmed that uncommitting blocks of this size
> > >>> takes many seconds. After disabling uncommitting, my benchmark runs
> > >>> without the huge stalls and the overall experience with ZGC is quite
> > >>> good.
> > >>>
> > >>> Since uncommitting is done asynchronously to the mutators, I expected
> > >>> it not to interfere with them. My understanding is that flushing,
> > >>> bookkeeping and uncommitting are done under a mutex [2], and
> > >>> contention on that can be the source of the stalls I see, such as
> > >>> when there is a demand to commit memory while uncommitting is taking
> > >>> place. Can you confirm whether this explanation makes sense to you?
> > >>> If so, is there a cure for this that I couldn't find? Like a time
> > >>> bound or a cap on the amount of memory that can be uncommitted in
> > >>> one go.
> > >>
> > >> Yes, uncommitting is relatively expensive. And it's also true that
> > >> there is a potential for lock contention affecting mutators. That can
> > >> be improved in various ways. Like you say, uncommitting in smaller
> > >> chunks, or possibly by releasing the lock while doing the actual
> > >> syscall.
> > >>
> > >> If you still want uncommit to happen, one thing to try is using large
> > >> pages (-XX:+UseLargePages), since committing/uncommitting large pages
> > >> is typically less expensive.
> > >>
> > >> This issue is on our radar, so we intend to improve this going
> > >> forward.
> > >>
> > >> cheers,
> > >> Per
> > >>
> > >>
> >
>