on heap overhead of ZGC

Stefan Karlsson stefan.karlsson at oracle.com
Thu Apr 20 15:15:07 UTC 2023


Hi Alen,

On 2023-04-19 13:38, Alen Vrečko wrote:
> Hello everyone.
>
> I did my best to search the web for any prior discussion on this. 
> Haven't found anything useful.
>
> I am trying to understand why there is a noticeable difference between 
> the size of all objects on the heap and the heap used (after full GC). 
> The heap used size can be 10%+ more than the size of all live objects.

The difference can come from two sources:

1) There is unusable memory in the regions, caused by address and size 
alignment requirements. (This is probably what you see in your 4MB array 
test)

2) We only calculate the `live` value when we mark through objects on 
the heap. While ZGC runs a cycle it more or less ignores new objects 
that the Java threads allocate during the GC cycle. This means that the 
`used` value will increase, but the `live` value will not be updated 
until the Mark End of the next GC cycle.

The latter also makes it a bit misleading to look at the used value after 
the GC cycle. With stop-the-world GCs, that used value is an OK 
approximation of what is live on the heap. With a concurrent GC it 
includes all the memory allocated during the GC cycle. The used number is 
still true, but some (many) of those allocated objects were short-lived 
and died, and we won't be able to figure that out until the next GC cycle.

I don't know if you have seen this, but we have a table where we try to 
give you a picture of how the values progressed during the GC cycle:

[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)                 Mark Start          Mark End        Relocate Start      Relocate End           High               Low
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)  Capacity:     3282M (10%)        3536M (11%)        3580M (11%)        3612M (11%)        3612M (11%)        3282M (10%)
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)      Free:    28780M (90%)       28538M (89%)       28720M (90%)       29066M (91%)       29068M (91%)       28434M (89%)
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)      Used:     3234M (10%)        3476M (11%)        3294M (10%)        2948M (9%)         3580M (11%)        2946M (9%)
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)      Live:         -              2496M (8%)         2496M (8%)         2496M (8%)             -                  -
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0) Allocated:         -               242M (1%)          364M (1%)          411M (1%)             -                  -
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)   Garbage:         -               737M (2%)          433M (1%)           39M (0%)             -                  -
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0) Reclaimed:         -                  -               304M (1%)          697M (2%)             -                  -

The `garbage` at Mark End is the difference between what was `used` when 
the GC cycle started and what we later found to be `live` in that used memory.
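
For example, in the table above: 3234M was used at Mark Start and 2496M of 
that turned out to be live, which (rounding aside) gives the 737M reported 
as garbage at Mark End. Similarly, used at Mark End (3476M) is used at 
Mark Start (3234M) plus the 242M allocated during marking.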

>
> Let's say I want to fill 1 GiB heap with 4MiB byte[] objects.
>
> Naively I'd imagine I can store 1 GiB / 4MiB = 256 such byte[] on the 
> heap.
>
> (I made a simple program that just allocates byte[], stores it in a 
> list, does GC and waits (so I can do jcmd or similar), nothing else)
>
> With EpsilonGC -> 255x 4MiB byte[] allocations, after this the app 
> crashes with out of memory
> With SerialGC  -> 246x
> With ParallelGC -> 233x
> With G1GC -> 204x
> With ZGC -> 170x
>
> For example in the ZGC case, where I have 170x of 4MiB byte[] on the heap.
>
> GC.heap_info:
>
>  ZHeap           used 1022M, capacity 1024M, max capacity 1024M
>  Metaspace       used 407K, committed 576K, reserved 1114112K
>   class space    used 24K, committed 128K, reserved 1048576K
>
> GC.class_histogram:
>
> Total         15118      713971800 (~714M)
>
> In this case does it mean ZGC is wasting 1022M - 714M = 308M for doing 
> its "thing"? This is like 1022/714= 43% overhead?

My guess is that the object header (typically 16 bytes) pushed the object 
size slightly beyond 4MB. ZGC allocates large objects in their own 
regions. Those regions are 2MB aligned, which makes each of your ~4MB 
objects `use` 6MB.
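
As a back-of-the-envelope check (the 16-byte header and the 2MB 
granularity are the assumptions here), that lines up with the 170 arrays 
you managed to fit:

public class LargeObjectFootprint {
    public static void main(String[] args) {
        long payload = 4L << 20;   // requested byte[] length: 4 MiB
        long header  = 16;         // assumed array header size
        long granule = 2L << 20;   // assumed alignment for ZGC large objects
        long perObject = ((payload + header + granule - 1) / granule) * granule;
        System.out.println(perObject >> 20);          // prints 6 (MiB per array)
        System.out.println((1L << 30) / perObject);   // prints 170 (arrays in a 1 GiB heap)
    }
}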

You would probably see similar results with G1 when the heap region size 
is increased, which happens automatically when the max heap size is 
larger. You can test that by explicitly running with 
-XX:G1HeapRegionSize=2m, to use a larger heap region size.
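
For example, something along these lines (FillHeap stands in for whatever 
your allocation test is called):

java -XX:+UseG1GC -XX:G1HeapRegionSize=2m -Xmx1g -Xlog:gc+heap FillHeap

With 2MB regions each ~4MB humongous array should end up using 6MB, just 
like with ZGC; presumably the 204 arrays you saw with G1's defaults 
correspond to 1MB regions, where each array occupies 5 regions (5MB).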

>
> This example might be convoluted and atypical of any production 
> environment.
>
> I am seeing the difference between live set and heap used in 
> production at around 12.5% for 3 servers looked at.
>
> Is there any other way to estimate the overhead apart from looking at 
> the difference between the live set and heap used? Does ZGC have any 
> internal statistic of the overhead?

I don't think we have a way to differentiate between the overhead caused 
by (1) and (2) above.

>
> I'd prefer not to assume 12.5% is the number to use and then get 
> surprised that in some case it might be 25%?

The overhead of yet-to-be-collected garbage can easily be above 25%. It 
all depends on the workload. We strive to keep the fragmentation below 
the -XX:ZFragmentationLimit, which is set to 25% by default, but that 
doesn't include the overhead of newly allocated objects (and it doesn't 
include the large objects).

>
> Do you have any recommendations regarding ZGC overhead when estimating 
> heap space?

Unfortunately, I can't give a general recommendation. It depends on how 
intricate the object graph is 
(meaning how long it will take to mark through it), how many live 
objects you have, the allocation rate, number of cores, etc. There's a 
constant race between the GC and the allocating Java threads. If the 
Java threads "win", and use up all memory before the GC can mark through 
the object graph and then give back memory to the Java application, then 
the Java threads will stall waiting for more memory. You need to test 
with your workload and see if you've given enough heap memory to allow 
ZGC to complete its cycles without causing allocation stalls.
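
If it helps, here is a rough sketch (not a benchmark; the class name and 
the 1 ms threshold are arbitrary, and it runs until you stop it) of how 
one can spot allocation stalls from inside the application by timing 
individual allocations. ZGC also reports stalls in the GC log, so running 
your load test with -Xlog:gc is usually the easier route:

import java.util.ArrayList;
import java.util.List;

public class StallProbe {
    public static void main(String[] args) {
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; ; i++) {
            long start = System.nanoTime();
            byte[] block = new byte[1 << 20];                // 1 MiB allocation
            long micros = (System.nanoTime() - start) / 1_000;
            if (micros > 1_000) {                            // > 1 ms: possibly stalled waiting for the GC
                System.out.println("slow allocation: " + micros + " us");
            }
            if (i % 10 == 0) {
                retained.add(block);                         // keep some of it as a live set
            }
            if (retained.size() > 512) {
                retained.remove(0);                          // cap the live set at ~512 MiB
            }
        }
    }
}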

Thanks,
StefanK

>
> Thank you
> Alen


