on heap overhead of ZGC
Stefan Karlsson
stefan.karlsson at oracle.com
Thu Apr 20 15:15:07 UTC 2023
Hi Alen,
On 2023-04-19 13:38, Alen Vrečko wrote:
> Hello everyone.
>
> I did my best to search the web for any prior discussion on this.
> Haven't found anything useful.
>
> I am trying to understand why there is a noticeable difference between
> the size of all objects on the heap and the heap used (after full GC).
> The heap used size can be 10%+ more than the size of all live objects.
The difference can come from two sources:
1) There is unusable memory in the regions, caused by address and size
alignment requirements. (This is probably what you see in your 4MB array
test)
2) We only calculate the `live` value when we mark through objects on
the heap. While ZGC runs a cycle it more or less ignores new objects
that the Java threads allocate during the GC cycle. This means that the
`used` value will increase, but the `live` value will not be updated
until the Mark End of the next GC cycle.
The latter also makes it a bit misleading to look at the used value right
after a GC cycle. With stop-the-world GCs, that used value is an OK
approximation of what is live on the heap. With a concurrent GC it also
includes all the memory allocated during the GC cycle. The used number is
still accurate, but some (often many) of those allocated objects were
short-lived and have already died; we just won't be able to figure that
out until the next GC cycle.
I don't know if you have seen this, but we have a table where we try to
give you a picture of how the values progressed during the GC cycle:
[6.150s][1681984168056ms][info ][gc,heap ] GC(0)                Mark Start        Mark End     Relocate Start    Relocate End         High             Low
[6.150s][1681984168056ms][info ][gc,heap ] GC(0)  Capacity:   3282M (10%)     3536M (11%)     3580M (11%)     3612M (11%)     3612M (11%)     3282M (10%)
[6.150s][1681984168056ms][info ][gc,heap ] GC(0)      Free:  28780M (90%)    28538M (89%)    28720M (90%)    29066M (91%)    29068M (91%)    28434M (89%)
[6.150s][1681984168056ms][info ][gc,heap ] GC(0)      Used:   3234M (10%)     3476M (11%)     3294M (10%)     2948M (9%)      3580M (11%)     2946M (9%)
[6.150s][1681984168056ms][info ][gc,heap ] GC(0)      Live:         -         2496M (8%)      2496M (8%)      2496M (8%)           -               -
[6.150s][1681984168056ms][info ][gc,heap ] GC(0) Allocated:         -          242M (1%)       364M (1%)       411M (1%)           -               -
[6.150s][1681984168056ms][info ][gc,heap ] GC(0)   Garbage:         -          737M (2%)       433M (1%)        39M (0%)           -               -
[6.150s][1681984168056ms][info ][gc,heap ] GC(0) Reclaimed:         -               -          304M (1%)       697M (2%)           -               -
The `garbage` at Mark End is the difference between what was `used` when
the GC cycle started and what we later found to be `live` in that used memory.
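As a worked check against the table above (a sketch; the small off-by-one
comes from each log value being rounded to whole megabytes independently):

```java
public class GarbageDiff {
    public static void main(String[] args) {
        // Values taken from the GC(0) table above, in MB
        long usedAtMarkStart = 3234; // Used at Mark Start
        long liveAtMarkEnd   = 2496; // Live at Mark End

        // Garbage at Mark End = used when the cycle started
        //                       minus what was found live in it
        long garbage = usedAtMarkStart - liveAtMarkEnd;
        System.out.println(garbage + "M"); // ~737M in the log, after rounding
    }
}
```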
>
> Let's say I want to fill 1 GiB heap with 4MiB byte[] objects.
>
> Naively I'd imagine I can store 1 GiB / 4MiB = 256 such byte[] on the
> heap.
>
> (I made a simple program that just allocates byte[], stores it in a
> list, does GC and waits (so I can do jcmd or similar), nothing else)
>
> With EpsilonGC -> 255x 4MiB byte[] allocations, after this the app
> crashes with out of memory
> With SerialGC -> 246x
> With ParallelGC -> 233x
> With G1GC -> 204x
> With ZGC -> 170x
>
> For example in the ZGC case, where I have 170x of 4MiB byte[] on the heap.
>
> GC.heap_info:
>
> ZHeap used 1022M, capacity 1024M, max capacity 1024M
> Metaspace used 407K, committed 576K, reserved 1114112K
> class space used 24K, committed 128K, reserved 1048576K
>
> GC.class_histogram:
>
> Total 15118 713971800 (~714M)
>
> In this case does it mean ZGC is wasting 1022M - 714M = 308M for doing
> its "thing"? This is like 1022/714= 43% overhead?
My guess is that the object header (typically 16 bytes) pushes the
object's size slightly beyond 4MB. ZGC allocates large objects in their
own regions. Those regions are 2MB aligned, which makes each of your ~4MB
objects `use` 6MB.
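As a rough sketch of that arithmetic (the 16-byte header and the 2MB
granule are the assumptions described above):

```java
public class LargeObjectFootprint {
    public static void main(String[] args) {
        long MB = 1024 * 1024;
        long arrayLength = 4 * MB;    // byte[4 MiB] payload
        long header = 16;             // typical array header size (assumption)
        long objectSize = header + arrayLength; // just over 4 MiB

        // Large objects get their own region, sized in 2 MB granules
        long granule = 2 * MB;
        long regionSize = ((objectSize + granule - 1) / granule) * granule;

        System.out.println(regionSize / MB + "M"); // 6M per ~4 MiB array
        System.out.println(1024 / (regionSize / MB)); // 170 arrays fit in 1 GiB
    }
}
```

At 6 MiB per object, a 1024 MiB heap holds at most 1024 / 6 = 170 such
arrays, which matches the observed count.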
You would probably see similar results with G1 when the heap region size
is increased, which happens automatically as the max heap size grows. You
can test that by explicitly running with -XX:G1HeapRegionSize=2m to force
a larger heap region size.
>
> This example might be convoluted and atypical of any production
> environment.
>
> I am seeing the difference between live set and heap used in
> production at around 12.5% for 3 servers looked at.
>
> Is there any other way to estimate the overhead apart from looking at
> the difference between the live set and heap used? Does ZGC have any
> internal statistic of the overhead?
I don't think we have a way to differentiate between the overhead caused
by (1) and (2) above.
>
> I'd prefer not to assume 12.5% is the number to use and then get
> surprised that in some case it might be 25%?
The overhead of yet-to-be collected garbage can easily be above 25%. It
all depends on the workload. We strive to keep the fragmentation below
the -XX:ZFragmentationLimit, which is set to 25% by default, but that
doesn't include the overhead of newly allocated objects (and it doesn't
include the large objects).
>
> Do you have any recommendations regarding ZGC overhead when estimating
> heap space?
Unfortunately, I can't give a general recommendation. It depends on how intricate the object graph is
(meaning how long it will take to mark through it), how many live
objects you have, the allocation rate, number of cores, etc. There's a
constant race between the GC and the allocating Java threads. If the
Java threads "win", and use up all memory before the GC can mark through
the object graph and then give back memory to the Java application, then
the Java threads will stall waiting for more memory. You need to test
with your workload and see if you've given enough heap memory to allow
ZGC to complete its cycles without causing allocation stalls.
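One practical way to run that test is with unified GC logging enabled and
then grep for stall events; a sketch (the application name is a
placeholder, and this is a config fragment, not a tuning recommendation):

```shell
# Run with ZGC and GC logging; "Allocation Stall" entries indicate that
# Java threads had to wait for the GC to free memory.
java -XX:+UseZGC -Xmx1g \
     -Xlog:gc,gc+heap=info:file=gc.log \
     YourApp

# No matches here means the heap size let ZGC keep up with allocation:
grep "Allocation Stall" gc.log
```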
Thanks,
StefanK
>
> Thank you
> Alen