Trying to understand ZGC
per.liden at oracle.com
Thu Nov 29 07:40:18 UTC 2018
On 11/28/18 8:09 PM, Stefan Reich wrote:
> Hi Per!
> On Tue, 13 Nov 2018 at 20:22, Per Liden <per.liden at oracle.com
> <mailto:per.liden at oracle.com>> wrote:
> The RSS accounting on Linux isn't always telling the complete truth and
> it can even vary depending on if you're using small or large pages. ZGC
> does heap multi-mapping, which means it will map the same heap
> memory in
> three different locations in the virtual address space. When using small
> pages, Linux isn't clever enough to detect that it's the same memory
> being mapped multiple times, and so it accounts for each mapping as if
> it was new/different, inflating the RSS by 3x. This typically doesn't
> happen when using large pages (-XX:+UseLargePages).
> Thanks. I would call this an actual bug in Linux then. Counting memory
> twice is really not OK.
Yes, I would also like to call it a bug. I assume the problem is that
figuring out if a new mapping is the same as an existing one is
potentially really expensive (like traversing all mappings to see if
there's a match). When using large pages, the memory is accounted to the
hugetlbfs inode rather than the process itself, which makes it easier to
get the accounting right.
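
To make the multi-mapping idea concrete, here is a minimal Java sketch. It is only an analogy and not how ZGC itself sets up its heap (ZGC does this in native code). The class and method names are made up for illustration. It maps the same region of a temporary file three times and checks that a write through one view is visible through the others, which is the expected behaviour on Linux but is formally operating-system dependent.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MultiMapDemo {
    // Maps the same 4 KiB file region three times, writes a value through the
    // first view, and reports whether the other two views see the same value.
    static boolean writeVisibleInAllViews(long value) throws IOException {
        Path file = Files.createTempFile("multimap", ".bin");
        file.toFile().deleteOnExit();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Three independent virtual address ranges backed by the same pages.
            MappedByteBuffer a = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            MappedByteBuffer b = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            MappedByteBuffer c = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            a.putLong(0, value);
            return b.getLong(0) == value && c.getLong(0) == value;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("same memory behind all views: " + writeVisibleInAllViews(42L));
    }
}
```

There is one set of physical pages behind the three views, yet each view is a separate mapping as far as the kernel's per-process bookkeeping is concerned.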
> Hm... are large pages really problematic as suggested here?
Using -XX:+UseLargePages is typically good for both throughput and
latency (use of transparent huge pages is a different story though). The
main problem/inconvenience is that you need to reserve huge pages up
front, i.e. tie up memory in the huge page pool, so it's less flexible
in that sense.
> > When turning on GC notifications, I see (sometimes):
> > GC cause: Allocation Rate (360 ms)
> > Collector: ZGC
> > Changes: ZHeap: -16383 K, CodeHeap 'profiled nmethods': 85 K, Metaspace: 1 K
> > and more often:
> > GC cause: Proactive (147 ms)
> > Collector: ZGC
> > Changes: ZHeap: -180223 K, CodeHeap 'profiled nmethods': 1 K,
> > CodeHeap 'non-profiled nmethods': 1 K, Metaspace: 1 K,
> > CodeHeap 'non-nmethods': 12 K
> > Does this mean stop-the-world GC pauses are occurring, or is my
> > application not paused?
> This is all normal. Each ZGC cycle has three short pauses (each of them
> should be below 10ms). If you enable detailed GC logging with
> -Xlog:gc*:gc.log you'll see more details on exactly how long the pauses
> are, and a bunch of other data points.
> I still don't understand... are the GC pauses of 360/147 ms
> stop-the-world pauses or just the duration of a concurrent GC cycle?
> (I'm just printing all GarbageCollectionNotificationInfo objects I get
> from the pertinent MX beans.)
The time you see there (e.g. 360 ms) is the time for a complete GC
cycle, i.e. the sum of all pauses and all concurrent phases. This time
is dominated by the concurrent phases, and your pauses should be on the
order of a few milliseconds.
Use -Xlog:gc*:gc.log to print more detailed GC information into a log,
then you'll see all the details on what's going on.
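
For reference, a listener along the following lines produces output similar to what you quoted. This is a sketch, not your actual code: the class name and the describe() helper are made up here, and only the MX bean calls are real API. The value printed is GcInfo.getDuration(), which in the setup discussed here is the length of the whole cycle and not a pause time.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

public class GcCycleLogger {
    // Formats one notification. The duration is labeled as a cycle time so it
    // is not mistaken for a stop-the-world pause.
    static String describe(String collector, String cause, long durationMs) {
        return "GC cause: " + cause + " (" + durationMs + " ms, complete cycle)"
                + System.lineSeparator() + "Collector: " + collector;
    }

    // Registers one listener on every GC bean that can emit notifications.
    static void install() {
        NotificationListener listener = (Notification n, Object handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(n.getType())) {
                return;
            }
            GarbageCollectionNotificationInfo info =
                    GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
            System.out.println(describe(info.getGcName(), info.getGcCause(),
                    info.getGcInfo().getDuration()));
        };
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (bean instanceof NotificationEmitter) {
                ((NotificationEmitter) bean).addNotificationListener(listener, null, null);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        System.gc();          // provoke at least one collection
        Thread.sleep(500);    // give the notification time to arrive
    }
}
```

The individual pause times are not available from this notification; those are in the gc.log output mentioned above.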
> For more information on ZGC, how to tune, how to interpret logs,
> internals, etc., I'd recommend having a look at some of the slides
> and/or videos available here:
> For now I think I'll stick to G1 as it has tolerable pauses (<50ms,
> roughly, unless I call System.gc()). I do have to call System.gc()
> sometimes in order to return memory to the OS.
A patch to have ZGC (optionally) return memory to the OS exists. It has
not been upstreamed yet, but it will eventually get there. And you
will not need to do a System.gc() to make that happen (just as that is
not needed in the latest version of G1).
> I'm focusing on desktop use where my goal is <1GB total process size. I
> assume for ZGC I would need to reserve more slack than with G1 in order
> to get its full advantages?
You could be right, but it all depends on the allocation rate of your
application (which will dictate the heap headroom needed by ZGC) and the
shape of the object graph on the heap (which will dictate the amount of
memory needed by G1's remembered sets).
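
As a rough back-of-the-envelope illustration of the allocation rate point (a simplification, not a tuning formula): the application keeps allocating while a concurrent cycle runs, so the free heap needed is at least about allocation rate times cycle duration. The 360 ms below is the cycle time from your output; the 200 MB/s allocation rate is purely a made-up number, as are the class and method names.

```java
public class ZgcHeadroom {
    // Memory allocated while one concurrent GC cycle is running, in MB.
    static double headroomMb(double allocRateMbPerSec, long cycleMs) {
        return allocRateMbPerSec * cycleMs / 1000.0;
    }

    public static void main(String[] args) {
        // Hypothetical 200 MB/s allocation rate, 360 ms cycle.
        System.out.printf("headroom needed: about %.1f MB%n", headroomMb(200.0, 360));
    }
}
```

With these assumed numbers the estimate is 72 MB on top of the live set, which gives a feel for how it would fit into a 1GB budget.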
> Many greetings,