Forwarding Tables overhead

Erik Osterlund erik.osterlund at oracle.com
Wed May 7 11:52:54 UTC 2025


Hi,

We did indeed experiment with compact forwarding data structures. However, even though their memory cost was small and constant, we never found a real program where they actually used less memory than the variable-sized, more flexible data structures we have today. That was a bit discouraging.
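
To make that tradeoff concrete, here is a back-of-the-envelope sketch of the two cost models as I understand them. All constants below (heap size, the constant overhead fraction, the per-entry cost) are made-up illustrative assumptions, not ZGC's actual parameters:

// Illustrative cost model only; every constant here is an assumption
// chosen for the sake of comparison, not an actual ZGC parameter.
public class ForwardingCostModel {
    public static void main(String[] args) {
        long heapBytes = 16L << 30;        // assumed 16GB heap
        double compactFraction = 0.03;     // assumed ~3% constant side-table overhead
        long bytesPerEntry = 16;           // assumed cost per relocated-object entry

        // A compact scheme pays a fixed fraction of the heap, always.
        long compactCost = (long) (heapBytes * compactFraction);

        // Variable per-relocation tables pay per object relocated in a cycle.
        long typicalRelocated = 2_000_000;    // quiet cycle: few objects move
        long spikyRelocated = 200_000_000;    // spike: a huge young gen is evacuated

        System.out.printf("compact (constant): %d MB%n", compactCost >> 20);
        System.out.printf("variable (typical): %d MB%n", (typicalRelocated * bytesPerEntry) >> 20);
        System.out.printf("variable (spike):   %d MB%n", (spikyRelocated * bytesPerEntry) >> 20);
    }
}

With numbers like these, the constant-cost scheme only wins when an allocation spike forces a large fraction of the heap to be relocated in a single cycle; in quiet cycles the variable tables are far cheaper.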

So far we have felt a bit awkward about adding that sort of thing when, in practice, it seems to use more memory for most programs, with the exception of a few contrived programs that squeeze exactly this pain point.

Going back to the problem domain - is the problem you are facing around average memory efficiency, or rather about accounting for temporary spikes? And by accounting I mean figuring out what heap size to select so that the OOM killer doesn’t kill you. In other words, the problem of figuring out how much *maximum* non-heap memory the JVM uses, so that the entire process, and not just the heap, fits into the container.

The reason I ask is that I think automatic heap sizing (cf. https://openjdk.org/jeps/8329758) more or less solves the accounting problem. But if there is an actual memory efficiency problem in a real application, then that would be good to know about, and there are indeed various different ways of solving that.

So I wonder what percentage of the container memory is spent on forwarding tables on average in your program, and what the largest spikes between two GC cycles are. Do you think you could get some data on that?
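
For what it's worth, a minimal sketch of how those numbers could be pulled out of the unified GC log (assuming the gc,reloc "Forwarding Usage" lines quoted below, and a hypothetical gc.log path):

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans a unified GC log for "Forwarding Usage: <n>M"
// lines and reports the count, average and peak across all GC cycles.
public class ForwardingUsageStats {
    private static final Pattern USAGE = Pattern.compile("Forwarding Usage: (\\d+)M");

    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(Path.of(args.length > 0 ? args[0] : "gc.log"));
        long sum = 0, max = 0, count = 0;
        for (String line : lines) {
            Matcher m = USAGE.matcher(line);
            if (m.find()) {
                long mb = Long.parseLong(m.group(1));
                sum += mb;
                max = Math.max(max, mb);
                count++;
            }
        }
        if (count > 0) {
            System.out.printf("cycles=%d avg=%dM peak=%dM%n", count, sum / count, max);
        }
    }
}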

/Erik

On 7 May 2025, at 10:35, Jean-Philippe Bempel <jean-philippe.bempel at datadoghq.com> wrote:

Hello ZGC team,

I would like to raise an issue we encounter on our production system
using Generational ZGC with JDK 23. We sporadically see OOM kills in a
container environment that seem correlated with spikes of high Java
heap allocation. We are running the JVM with 19GB of Java heap in a
container limited to 26GB, using these JVM flags:
-XX:+UseZGC -XX:SoftMaxHeapSize=15g -XX:ZAllocationSpikeTolerance=5
-XX:+UseLargePages -XX:+UseTransparentHugePages
Normally I would not consider allocation happening on the Java heap a
trigger for an OOM kill, except for related things like direct memory.
Investigating with a higher container memory limit and plots of the
NMT JFR events shows a spike of allocation for GC structures peaking
at more than 3GB, while it is normally around 512MB [1].
This led me to suspect the forwarding tables, so I built the
following simulator:

import java.time.Duration;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.LockSupport;

public class GCStress {
    // Synchronized list: four spike threads append concurrently, and a
    // plain ArrayList would be subject to a data race.
    static List<Object> roots = Collections.synchronizedList(new ArrayList<>());

    public static void main(String[] args) {
        System.out.println("Starting GC stress test...");
        LockSupport.parkNanos(Duration.of(10, ChronoUnit.SECONDS).toNanos());
        while (true) {
            spike();
            LockSupport.parkNanos(Duration.of(10, ChronoUnit.SECONDS).toNanos());
        }
    }

    // Drops the previous roots and allocates 2M Payload objects (plus
    // their internals) from four threads at once, producing a large
    // allocation spike between two GC cycles.
    private static void spike() {
        roots.clear();
        for (int i = 0; i < 4; i++) {
            new Thread(() -> {
                for (int j = 0; j < 500_000; j++) {
                    roots.add(new Payload());
                }
            }).start();
        }
    }

    // Ten long fields plus a 100-element list of Objects per instance.
    private static class Payload {
        long l00;
        long l01;
        long l02;
        long l03;
        long l04;
        long l05;
        long l06;
        long l07;
        long l08;
        long l09;
        List<Object> internal = new ArrayList<>();

        public Payload() {
            for (int i = 0; i < 100; i++) {
                internal.add(new Object());
            }
        }
    }
}

I am running it with the following command line on JDK 23:
java -XX:StartFlightRecording=dumponexit=true,filename=gcstress.jfr -XX:NativeMemoryTracking=summary -XX:+UseZGC "-Xlog:gc*:gc.log" -Xmx16G GCStress

looking for "Forwarding Usage" in gc log [2] gives me this:
[11,175s][info][gc,reloc    ] GC(0) Y: Forwarding Usage: 824M
[14,897s][info][gc,reloc    ] GC(1) Y: Forwarding Usage: 3229M
[19,141s][info][gc,reloc    ] GC(2) y: Forwarding Usage: 2M
[20,335s][info][gc,reloc    ] GC(3) Y: Forwarding Usage: 26M
[20,613s][info][gc,reloc    ] GC(4) y: Forwarding Usage: 235M
[22,390s][info][gc,reloc    ] GC(5) y: Forwarding Usage: 1867M
[22,390s][info][gc,reloc    ] GC(3) O: Forwarding Usage: 0M
[24,517s][info][gc,reloc    ] GC(6) Y: Forwarding Usage: 51M
[24,534s][info][gc,reloc    ] GC(6) O: Forwarding Usage: 0M
[30,694s][info][gc,reloc    ] GC(7) Y: Forwarding Usage: 337M
[37,576s][info][gc,reloc    ] GC(8) y: Forwarding Usage: 2355M
[41,164s][info][gc,reloc    ] GC(9) y: Forwarding Usage: 215M
[41,164s][info][gc,reloc    ] GC(7) O: Forwarding Usage: 0M
[45,843s][info][gc,reloc    ] GC(10) Y: Forwarding Usage: 3528M
[55,427s][info][gc,reloc    ] GC(11) y: Forwarding Usage: 2M
[59,077s][info][gc,reloc    ] GC(12) y: Forwarding Usage: 3628M
[59,077s][info][gc,reloc    ] GC(10) O: Forwarding Usage: 0M
[63,599s][info][gc,reloc    ] GC(13) Y: Forwarding Usage: 3M
[73,646s][info][gc,reloc    ] GC(14) y: Forwarding Usage: 3838M
[76,746s][info][gc,reloc    ] GC(15) y: Forwarding Usage: 3859M
[82,553s][info][gc,reloc    ] GC(16) y: Forwarding Usage: 225M
[86,039s][info][gc,reloc    ] GC(17) y: Forwarding Usage: 3093M
[86,039s][info][gc,reloc    ] GC(13) O: Forwarding Usage: 0M
[89,845s][info][gc,reloc    ] GC(18) Y: Forwarding Usage: 303M
[89,892s][info][gc,reloc    ] GC(18) O: Forwarding Usage: 0M
[91,587s][info][gc,reloc    ] GC(19) Y: Forwarding Usage: 202M
[94,597s][info][gc,reloc    ] GC(20) y: Forwarding Usage: 1078M
[102,230s][info][gc,reloc    ] GC(21) y: Forwarding Usage: 1903M
[102,230s][info][gc,reloc    ] GC(19) O: Forwarding Usage: 0M
[104,895s][info][gc,reloc    ] GC(22) Y: Forwarding Usage: 2160M
[114,142s][info][gc,reloc    ] GC(23) y: Forwarding Usage: 1806M
[118,191s][info][gc,reloc    ] GC(24) y: Forwarding Usage: 3718M
[118,191s][info][gc,reloc    ] GC(22) O: Forwarding Usage: 0M
[122,279s][info][gc,reloc    ] GC(25) y: Forwarding Usage: 1M
[126,271s][info][gc,reloc    ] GC(26) Y: Forwarding Usage: 3204M
[131,925s][info][gc,reloc    ] GC(27) y: Forwarding Usage: 684M
[133,675s][info][gc,reloc    ] GC(28) y: Forwarding Usage: 1443M
[133,676s][info][gc,reloc    ] GC(26) O: Forwarding Usage: 0M
[137,303s][info][gc,reloc    ] GC(29) Y: Forwarding Usage: 2389M
[147,488s][info][gc,reloc    ] GC(30) y: Forwarding Usage: 1M
[153,150s][info][gc,reloc    ] GC(31) y: Forwarding Usage: 3871M
[153,151s][info][gc,reloc    ] GC(29) O: Forwarding Usage: 0M
[153,585s][info][gc,reloc    ] GC(32) Y: Forwarding Usage: 308M
[159,519s][info][gc,reloc    ] GC(33) y: Forwarding Usage: 1933M
[169,010s][info][gc,reloc    ] GC(34) y: Forwarding Usage: 1740M
[169,011s][info][gc,reloc    ] GC(32) O: Forwarding Usage: 0M
[176,374s][info][gc,reloc    ] GC(35) Y: Forwarding Usage: 4071M
[190,786s][info][gc,reloc    ] GC(36) y: Forwarding Usage: 4051M
[196,478s][info][gc,reloc    ] GC(37) y: Forwarding Usage: 4050M
[196,479s][info][gc,reloc    ] GC(35) O: Forwarding Usage: 0M
[197,187s][info][gc,reloc    ] GC(38) y: Forwarding Usage: 719M
[199,733s][info][gc,reloc    ] GC(39) y: Forwarding Usage: 2318M
[202,880s][info][gc,reloc    ] GC(40) y: Forwarding Usage: 4M
[206,743s][info][gc,reloc    ] GC(41) Y: Forwarding Usage: 270M
[212,209s][info][gc,reloc    ] GC(42) y: Forwarding Usage: 2098M
[215,218s][info][gc,reloc    ] GC(43) y: Forwarding Usage: 630M
[215,218s][info][gc,reloc    ] GC(41) O: Forwarding Usage: 0M
[216,553s][info][gc,reloc    ] GC(44) Y: Forwarding Usage: 69M
[219,661s][info][gc,reloc    ] GC(45) y: Forwarding Usage: 989M
[226,256s][info][gc,reloc    ] GC(46) y: Forwarding Usage: 1641M
[226,256s][info][gc,reloc    ] GC(44) O: Forwarding Usage: 0M
[229,217s][info][gc,reloc    ] GC(47) y: Forwarding Usage: 4M
[234,397s][info][gc,reloc    ] GC(48) Y: Forwarding Usage: 3966M
[247,632s][info][gc,reloc    ] GC(49) y: Forwarding Usage: 3M
[250,257s][info][gc,reloc    ] GC(50) y: Forwarding Usage: 1725M

And looking at the NMT JFR events [3]:
jfr view native-memory-committed gcstress.jfr

Memory Type                    First Observed   Average Last Observed   Maximum
------------------------------ -------------- --------- ------------- ---------
GC                                   113.7 MB    2.6 GB        2.4 GB    4.8 GB

This confirms my hypothesis that forwarding table usage spikes when
there is a large spike of Java heap allocation before a GC cycle.
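
As a side note, a sketch of how the same peak could be extracted programmatically with the JFR consumer API; this assumes the periodic jdk.NativeMemoryUsage event (emitted since JDK 20 when NMT is enabled) with "type" and "committed" fields, the same data that `jfr view native-memory-committed` summarizes:

import java.nio.file.Path;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Sketch: tracks the peak committed "GC" native memory in a recording.
public class PeakGcNativeMemory {
    public static void main(String[] args) throws Exception {
        long peak = 0;
        try (RecordingFile rf = new RecordingFile(Path.of("gcstress.jfr"))) {
            while (rf.hasMoreEvents()) {
                RecordedEvent e = rf.readEvent();
                if (e.getEventType().getName().equals("jdk.NativeMemoryUsage")
                        && "GC".equals(e.getString("type"))) {
                    peak = Math.max(peak, e.getLong("committed"));
                }
            }
        }
        System.out.printf("peak committed GC native memory: %d MB%n", peak >> 20);
    }
}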

I saw an article [4] about a compact representation of the forwarding
table and wondered whether it has been implemented, or is planned for
the future. This issue forces us to reconsider the sizing of our
container to account for these spikes of forwarding usage. Considering
the magnitude of this overhead (almost 20% of the Java heap), do you
think this is something worth improving?

Thanks
Jean-Philippe Bempel

[1] https://ginnieandfifounet.com/jpb/zgc_forwardingtable/Screenshot_spike_GCstructs.png
[2] https://ginnieandfifounet.com/jpb/zgc_forwardingtable/zgc_high_forwarding_usage.log
[3] https://ginnieandfifounet.com/jpb/zgc_forwardingtable/gcstress.jfr
[4] https://inside.java/2020/06/25/compact-forwarding/
