JVM stalls around uncommitting
Thomas Stüfe
thomas.stuefe at gmail.com
Sat Apr 4 09:21:33 UTC 2020
Sorry, let me formulate this question in a more precise manner:
Assuming you use the "traditional" huge pages, HugeTLBFS, we take the pages
from the huge page pool. The content of the huge page pool is not usable
for other applications. So uncommitting would only benefit other
applications using huge pages. But that's okay and would be useful too.
The question to me would be whether reserving but not committing memory
backed by huge pages is any different from committing it right away, and
whether uncommitted pages are returned to the pool.
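One way to poke at exactly that question outside the JVM is a minimal
standalone sketch (assuming a 2M huge page size and at least 16 free huge
pages in the pool): map anonymous MAP_HUGETLB memory without touching it,
then fault it in, and compare HugePages_Free/HugePages_Rsvd in
/proc/meminfo at each step. If I read the kernel's reservation semantics
correctly, HugePages_Rsvd should rise already at mmap time, while
HugePages_Free should only drop as the pages are actually touched.

/* Sketch only, not JVM code: map anonymous MAP_HUGETLB memory without
 * touching it, then fault it in, pausing so /proc/meminfo can be checked
 * at each step. Assumes a 2M huge page size and a pre-populated pool. */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    size_t sz = 16 * 2 * 1024 * 1024;  /* 16 x 2M huge pages */

    void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("mapped but not touched - check /proc/meminfo, then press Enter\n");
    getchar();

    memset(p, 1, sz);                  /* fault the pages in ("pre-touch") */
    printf("touched - check /proc/meminfo again, then press Enter\n");
    getchar();

    munmap(p, sz);
    return 0;
}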
I made a simple test with UseLargePages and a VM with a 100M heap, and I see
that both the heap and the code heap are now backed by huge pages as
expected. I ran once with AlwaysPreTouch, once without. I do not see any
difference from the outside with regard to the number of used huge pages. In
/proc/pid/smaps the memory segments look identical in each case. I may be
doing this test wrong though...
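For reference, the counters I compared can be dumped with something as
trivial as the following (a plain /proc/meminfo reader, nothing
JVM-specific; field names as in current kernels), run before and after
starting the VM with and without AlwaysPreTouch:

/* Tiny helper: print the huge page counters from /proc/meminfo so they
 * can be diffed across runs. */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) { perror("fopen"); return 1; }
    char line[256];
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "HugePages_", 10) == 0 ||
            strncmp(line, "Hugepagesize", 12) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}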
Thanks a lot, and sorry again for hijacking this thread,
Thomas
p.s. without doubt using huge pages is hugely beneficial even without
uncommitting.
On Sat, Apr 4, 2020 at 10:00 AM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:
> Hi Per, Zoltan,
>
> sorry for getting in a question sideways, but I was curious.
>
> I always thought large pages are memory-pinned, so cannot be uncommitted?
> Or are you talking about using THPs?
>
> Cheers, Thomas
>
>
> On Fri, Apr 3, 2020 at 9:38 AM Per Liden <per.liden at oracle.com> wrote:
>
>> Hi Zoltan,
>>
>> On 4/3/20 1:27 AM, Zoltán Baranyi wrote:
>> > Hi Per,
>> >
>> > Thank you for confirming the issue and for recommending large pages. I
>> > re-ran my benchmarks with large pages and it gave me a 25-30% performance
>> > boost, which is a bit more than what I expected. My benchmarks run on a
>> > 600G heap with 1.5-2GB/s allocation rate on a 40 core machine, so ZGC is
>> > busy. Since a significant part of the workload is ZGC itself, I assume -
>> > besides the higher TLB hit rate - this gain is from managing the ZPages
>> > more effectively on large pages.
>>
>> A 25-30% improvement is indeed more than I would have expected. ZGC's
>> internal handling of ZPages is the same regardless of the underlying
>> page size, but as you say, you'll get better TLB hit-rate and the
>> mmap/fallocate syscalls become a lot less expensive.
>>
>> Another reason for the boost might be that ZGC's NUMA-awareness, until
>> recently, worked much better when using large pages. But this has now
>> been fixed, see https://bugs.openjdk.java.net/browse/JDK-8237649.
>>
>> Btw, which JDK version are you using?
>>
>> >
>> > I have a good experience overall, nice to see ZGC getting more and more
>> > mature.
>>
>> Good to hear. Thanks for the feedback!
>>
>> /Per
>>
>> >
>> > Cheers,
>> > Zoltan
>> >
>> > On Wed, Apr 1, 2020 at 9:15 AM Per Liden <per.liden at oracle.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> On 3/31/20 9:59 PM, Zoltan Baranyi wrote:
>> >>> Hi ZGC Team,
>> >>>
>> >>> I run benchmarks against our application using ZGC on heaps of a few
>> >>> hundred GB. In the beginning everything goes smoothly, but eventually I
>> >>> experience very long JVM stalls, sometimes longer than one minute.
>> >>> According to the JVM log, reaching safepoints occasionally takes a very
>> >>> long time, matching the duration of the stalls I experience.
>> >>>
>> >>> After a few iterations, I started looking at uncommitting and learned
>> >>> that the way ZGC performs uncommitting - flushing the pages, punching
>> >>> holes, removing blocks from the backing file - can be expensive [1] when
>> >>> uncommitting tens or more than a hundred GB of memory. The trace-level
>> >>> heap logs confirmed that uncommitting blocks of this size takes many
>> >>> seconds. After disabling uncommitting, my benchmark runs without the huge
>> >>> stalls and the overall experience with ZGC is quite good.
>> >>>
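The hole punching mentioned above is, as far as I understand [1], essentially
an fallocate(FALLOC_FL_PUNCH_HOLE) on the file backing the heap. A
stripped-down sketch of just that step, using a memfd purely for illustration
(ZGC's real backing and bookkeeping are of course more involved):

/* Sketch only: commit and then "uncommit" memory backed by a memfd by
 * punching a hole in the backing file. This mimics the expensive step
 * discussed above; it is not ZGC's actual code. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    size_t sz = 64 * 1024 * 1024;                /* 64M region */
    int fd = memfd_create("heap-sketch", 0);     /* tmpfs-backed file */
    if (fd < 0 || ftruncate(fd, sz) != 0) { perror("memfd"); return 1; }

    void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memset(p, 0, sz);                            /* "commit": fault pages in */

    /* "uncommit": drop the backing pages without shrinking the file */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 0, sz) != 0)
        perror("fallocate");

    munmap(p, sz);
    close(fd);
    return 0;
}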
>> >>> Since uncommitting is done asynchronously to the mutators, I expected it
>> >>> not to interfere with them. My understanding is that flushing,
>> >>> bookkeeping and uncommitting are done under a mutex [2], and contention
>> >>> on that can be the source of the stalls I see, such as when there is a
>> >>> demand to commit memory while uncommitting is taking place. Can you
>> >>> confirm whether the above is an explanation that makes sense to you? If
>> >>> so, is there a cure to this that I couldn't find? Like a time bound or a
>> >>> cap on the amount of memory that can be uncommitted in one go.
>> >>
>> >> Yes, uncommitting is relatively expensive. And it's also true that there
>> >> is a potential for lock contention affecting mutators. That can be
>> >> improved in various ways. Like you say, uncommitting in smaller chunks,
>> >> or possibly by releasing the lock while doing the actual syscall.
>> >>
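The "release the lock around the syscall" idea in generic form - just the
pattern, not ZGC's code, with hypothetical stand-ins for the real free-list
bookkeeping and the actual uncommit syscall:

/* Generic pattern sketch (not ZGC's code): keep bookkeeping under the lock,
 * but drop the lock across the expensive uncommit syscall so that threads
 * wanting to commit memory are not stalled behind it for seconds. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-ins for the real bookkeeping and syscall. */
static bool pick_range(void **a, size_t *s) { *a = NULL; *s = 0; return false; }
static void uncommit_range(void *a, size_t s) { (void)a; (void)s; /* fallocate() etc. */ }
static void record_uncommitted(void *a, size_t s) { (void)a; (void)s; }

static void uncommit_step(void) {
    void  *addr;
    size_t size;

    pthread_mutex_lock(&heap_lock);
    bool found = pick_range(&addr, &size);   /* cheap bookkeeping only */
    pthread_mutex_unlock(&heap_lock);

    if (!found)
        return;

    uncommit_range(addr, size);              /* slow syscall, lock NOT held */

    pthread_mutex_lock(&heap_lock);
    record_uncommitted(addr, size);          /* publish the result */
    pthread_mutex_unlock(&heap_lock);
}

int main(void) { uncommit_step(); return 0; }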
>> >> If you still want uncommit to happen, one thing to try is using large
>> >> pages (-XX:+UseLargePages), since committing/uncommitting large pages is
>> >> typically less expensive.
>> >>
>> >> This issue is on our radar, so we intend to improve this going forward.
>> >>
>> >> cheers,
>> >> Per
>> >>
>> >>
>>
>