Generational ZGC issue
Johannes Lichtenberger
lichtenberger.johannes at gmail.com
Thu Feb 15 16:54:50 UTC 2024
I've attached two logs, the first one without -XX:+ZGenerational, the
second one with the option set, even though I also saw that generational
ZGC is only going to be supported in GraalVM 24.1 in September... so I'm
not sure what this option does here :)
Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger <
lichtenberger.johannes at gmail.com>:
> Strange, so does it simply ignore the option? The following is the
> beginning of the output from _non_ generational ZGC:
>
> johannes at luna:~/IdeaProjects/sirix$ ./gradlew
> -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal
> :sirix-core:test --tests
> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>
> > Configure project :
> The 'sonarqube' task depends on compile tasks. This behavior is now
> deprecated and will be removed in version 5.x. To avoid implicit
> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make
> sure your project is compiled, before analysis has started.
> The 'sonar' task depends on compile tasks. This behavior is now deprecated
> and will be removed in version 5.x. To avoid implicit compilation, set
> property 'sonar.gradle.skipCompile' to 'true' and make sure your project is
> compiled, before analysis has started.
> [1,627s][info ][gc ] GC(0) Garbage Collection (Metadata GC
> Threshold) 84M(1%)->56M(0%)
>
> > Task :sirix-core:test
> [0.001s][warning][pagesize] UseLargePages disabled, no large pages
> configured and available on the system.
> [1.253s][info ][gc ] Using The Z Garbage Collector
>
> [2,930s][info ][gc ] GC(1) Garbage Collection (Warmup)
> 1616M(11%)->746M(5%)
> [4,445s][info ][gc ] GC(2) Garbage Collection (Warmup)
> 3232M(21%)->750M(5%)
> [5,751s][info ][gc ] GC(3) Garbage Collection (Warmup)
> 4644M(30%)->1356M(9%)
> [9,886s][info ][gc ] GC(4) Garbage Collection (Allocation Rate)
> 10668M(69%)->612M(4%)
> [10,406s][info ][gc ] GC(5) Garbage Collection (Allocation Rate)
> 2648M(17%)->216M(1%)
> [13,931s][info ][gc ] GC(6) Garbage Collection (Allocation Rate)
> 11164M(73%)->1562M(10%)
> [16,908s][info ][gc ] GC(7) Garbage Collection (Allocation Rate)
> 11750M(76%)->460M(3%)
> [20,690s][info ][gc ] GC(8) Garbage Collection (Allocation Rate)
> 12670M(82%)->726M(5%)
> [24,376s][info ][gc ] GC(9) Garbage Collection (Allocation Rate)
> 13422M(87%)->224M(1%)
> [28,152s][info ][gc ] GC(10) Garbage Collection (Proactive)
> 13474M(88%)->650M(4%)
> [31,526s][info ][gc ] GC(11) Garbage Collection (Allocation Rate)
> 12072M(79%)->1472M(10%)
> [34,754s][info ][gc ] GC(12) Garbage Collection (Allocation Rate)
> 13050M(85%)->330M(2%)
> [38,478s][info ][gc ] GC(13) Garbage Collection (Allocation Rate)
> 13288M(87%)->762M(5%)
> [41,936s][info ][gc ] GC(14) Garbage Collection (Proactive)
> 13294M(87%)->504M(3%)
> [45,353s][info ][gc ] GC(15) Garbage Collection (Allocation Rate)
> 12984M(85%)->268M(2%)
> [48,861s][info ][gc ] GC(16) Garbage Collection (Allocation Rate)
> 13008M(85%)->306M(2%)
> [52,133s][info ][gc ] GC(17) Garbage Collection (Proactive)
> 12042M(78%)->538M(4%)
> [55,705s][info ][gc ] GC(18) Garbage Collection (Allocation Rate)
> 12420M(81%)->1842M(12%)
> [59,000s][info ][gc ] GC(19) Garbage Collection (Allocation Rate)
> 12458M(81%)->1422M(9%)
> [64,501s][info ][gc ] Allocation Stall (Test worker) 59,673ms
> [64,742s][info ][gc ] Allocation Stall (Test worker) 240,077ms
> [65,806s][info ][gc ] GC(20) Garbage Collection (Allocation Rate)
> 13808M(90%)->6936M(45%)
> [66,476s][info ][gc ] GC(21) Garbage Collection (Allocation Stall)
> 7100M(46%)->4478M(29%)
> [69,471s][info ][gc ] GC(22) Garbage Collection (Allocation Rate)
> 10098M(66%)->5888M(38%)
> [72,252s][info ][gc ] GC(23) Garbage Collection (Allocation Rate)
> 11226M(73%)->5816M(38%)
>
> ...
>
> So even here I can see some allocation stalls.
>
> Running the same with -XX:+ZGenerational in build.gradle (probably using
> GraalVM) does something different, but I don't know what... at least
> off-heap memory is exhausted at some point due to direct byte buffer
> usage!?
>
> So I'm not sure what the difference is, though.
>
> With this:
>
> "-XX:+UseZGC",
> "-Xlog:gc*=debug:file=zgc-generational-detailed.log",
> "-XX:+ZGenerational",
> "-verbose:gc",
> "-XX:+HeapDumpOnOutOfMemoryError",
> "-XX:HeapDumpPath=heapdump.hprof",
> "-XX:MaxDirectMemorySize=2g",
>
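> As a sanity check (a sketch, not from the project), printing the
> effective VM options inside the forked test JVM would rule out Gradle
> dropping the flags on the way from build.gradle:
>
> import java.lang.management.ManagementFactory;
> import com.sun.management.HotSpotDiagnosticMXBean;
>
> // Sketch: HotSpotDiagnosticMXBean reports the values the running VM
> // actually uses (MaxDirectMemorySize is reported in bytes).
> public class PrintVmOptions {
>     public static void main(String[] args) {
>         var bean = ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
>         for (String name : new String[] {"UseZGC", "ZGenerational", "MaxDirectMemorySize"}) {
>             System.out.println(name + " = " + bean.getVMOption(name).getValue());
>         }
>     }
> }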
>
> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648)
> at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
> at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:111)
> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360)
> at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191)
> at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192)
> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176)
> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148)
> at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443)
>
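> For what it's worth, the 2g limit itself behaves as expected: a
> standalone reproducer (hypothetical, not SirixDB code) that keeps direct
> buffers reachable hits the same error, since their native memory is only
> freed once the buffers are garbage collected.
>
> import java.nio.ByteBuffer;
> import java.util.ArrayList;
> import java.util.List;
>
> // Run with: java -XX:MaxDirectMemorySize=64m DirectOom
> // Keeping the buffers reachable prevents their cleaners from running, so
> // Bits.reserveMemory eventually fails with the same message as above.
> public class DirectOom {
>     public static void main(String[] args) {
>         List<ByteBuffer> retained = new ArrayList<>();
>         while (true) {
>             retained.add(ByteBuffer.allocateDirect(1 << 20)); // 1 MiB each
>         }
>     }
> }
>
> So the question may rather be why the generational run keeps so many of
> the elastic byte buffers from parallelSerializationOfKeyValuePages
> reachable (or uncollected) at once.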
>
>
>
> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan Karlsson <
> stefan.karlsson at oracle.com>:
>
>> Hi Johannes,
>>
>> We tried to look at the log files and the JFR files, but couldn't find
>> an OutOfMemoryError in any of them. Do you think you could try to rerun
>> and capture the entire GC log from the OutOfMemoryError run?
>>
>> A few things to note:
>>
>> 1) You seem to be running the Graal compiler. Graal doesn't support
>> Generational ZGC, so you are going to run different compilers when you
>> compare Singlegen ZGC with Generational ZGC.
>>
>> 2) It's not clear to me that the provided JFR files match the provided
>> log files.
>>
>> 3) The JFR files show that -XX:+UseLargePages is used, but the gc+init
>> log shows 'Large Page Support: Disabled'; you might want to look into
>> why that is the case.
>>
>> 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It should
>> probably be named zgc-chicago.log.
>>
>> Cheers,
>> StefanK
>>
>> On 2024-02-14 17:36, Johannes Lichtenberger wrote:
>> > Hello,
>> >
>> > a test of my little DB project fails with an OutOfMemoryError using
>> > generational ZGC, but not with single-generation ZGC or G1.
>> >
>> > To be honest, I guess the allocation rate, and thus the GC pressure,
>> > when traversing a resource in SirixDB is unacceptable. The strategy is
>> > to create fine-grained nodes from JSON input and store these in a trie.
>> > First, a 3.8 GB JSON file is shredded and imported. Next, a preorder
>> > traversal visits the generated trie (with leaf pages storing 1024 nodes
>> > each, ~300_000_000 nodes in total, which are deserialized one by one).
>> > The pages are referenced in memory through PageReference::setPage.
>> > Furthermore, a Caffeine page cache caches the PageReferences (keys) and
>> > the pages (values) and sets the reference back to null once entries are
>> > evicted (PageReference.setPage(null)).
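>> > A sketch of that eviction wiring (the Page type and the cache sizing
>> > are assumptions; the real setup in SirixDB may differ):
>> >
>> > import com.github.benmanes.caffeine.cache.Cache;
>> > import com.github.benmanes.caffeine.cache.Caffeine;
>> > import com.github.benmanes.caffeine.cache.RemovalCause;
>> >
>> > // Sketch: on eviction the in-memory link is nulled so the page object
>> > // becomes unreachable and can be reclaimed by the GC.
>> > Cache<PageReference, Page> pageCache = Caffeine.newBuilder()
>> >         .maximumSize(100_000)
>> >         .removalListener((PageReference ref, Page page, RemovalCause cause) -> {
>> >             if (ref != null) {
>> >                 ref.setPage(null);
>> >             }
>> >         })
>> >         .build();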
>> >
>> > However, I think the whole strategy of having to keep nodes in memory
>> > might not be the best. Maybe it's better to use off-heap memory for the
>> > pages themselves with MemorySegments, but the pages are not of a fixed
>> > size, so it may get tricky.
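>> > A sketch of what that could look like with the foreign memory API
>> > (final in JDK 22, preview in 21; serializePage is a hypothetical
>> > serializer). Since a page's size is only known after serialization,
>> > each page could get its own confined arena:
>> >
>> > import java.lang.foreign.Arena;
>> > import java.lang.foreign.MemorySegment;
>> > import java.lang.foreign.ValueLayout;
>> >
>> > // Sketch: variable-sized pages are fine if the arena is sized after
>> > // serialization; close() frees the native memory deterministically
>> > // instead of relying on GC pressure or cleaners.
>> > try (Arena arena = Arena.ofConfined()) {
>> >     byte[] bytes = serializePage(page); // hypothetical serializer
>> >     MemorySegment segment = arena.allocate(bytes.length);
>> >     MemorySegment.copy(bytes, 0, segment, ValueLayout.JAVA_BYTE, 0, bytes.length);
>> >     // ... use the segment; closing the arena invalidates it for readers ...
>> > }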
>> >
>> > The test mentioned is this:
>> >
>> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69
>> >
>> > I can upload the JSON file somewhere for a couple of days if needed.
>> >
>> > Caused by: java.lang.OutOfMemoryError
>> >     at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
>> >     at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
>> >     at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
>> >     at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542)
>> >     at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567)
>> >     at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670)
>> >     at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
>> >     at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
>> >     at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
>> >     at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
>> >     at io.sirix.access.trx.page.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442)
>> >
>> > I've uploaded several JFR recordings and logs over here (besides the
>> > async-profiler JFR files, the zgc-detailed log is probably the most
>> > interesting):
>> >
>> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core
>> >
>> > kind regards
>> > Johannes
>>
>>