Generational ZGC issue

Johannes Lichtenberger lichtenberger.johannes at gmail.com
Thu Feb 15 20:53:01 UTC 2024


It's a laptop, I've attached some details.

Furthermore, if it seems worth digging deeper into the issue, the JSON file
is here for one week: https://www.transfernow.net/dl/20240215j9NaPTc0

You'd have to unzip into bundles/sirix-core/src/test/resources/json, remove
the @Disabled annotation and run the test
JsonShredderTest::testChicagoDescendantAxis

The test JVM parameters are specified in the parent build.gradle in the
project root folder.

The GitHub repo: https://github.com/sirixdb/sirix

[image: Screenshot from 2024-02-15 21-43-33.png]


kind regards
Johannes

Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth <peter_booth at me.com>:

> Just curious - what CPU, physical memory and OS are you using?
> Sent from my iPhone
>
> On Feb 15, 2024, at 12:23 PM, Johannes Lichtenberger <
> lichtenberger.johannes at gmail.com> wrote:
>
> 
> I guess I don't know which JDK it picks for the tests, but I guess OpenJDK
>
> Johannes Lichtenberger <lichtenberger.johannes at gmail.com> schrieb am Do.,
> 15. Feb. 2024, 17:58:
>
>> However, it's the same with: ./gradlew
>> -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 :sirix-core:test
>> --tests
>> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>>  using OpenJDK hopefully
>>
>> Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes Lichtenberger <
>> lichtenberger.johannes at gmail.com>:
>>
>>> I've attached two logs, the first one without -XX:+Generational, the
>>> second one with the option set, even though I also saw, that generational
>>> ZGC is going to be supported in GraalVM 24.1 in September... so not sure
>>> what this does :)
>>>
>>> Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger <
>>> lichtenberger.johannes at gmail.com>:
>>>
>>>> Strange, so does it simply ignore the option? The following is the
>>>> beginning of the output from _non_ generational ZGC:
>>>>
>>>> johannes at luna:~/IdeaProjects/sirix$ ./gradlew
>>>> -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal
>>>> :sirix-core:test --tests
>>>> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>>>>
>>>> > Configure project :
>>>> The 'sonarqube' task depends on compile tasks. This behavior is now
>>>> deprecated and will be removed in version 5.x. To avoid implicit
>>>> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make
>>>> sure your project is compiled, before analysis has started.
>>>> The 'sonar' task depends on compile tasks. This behavior is now
>>>> deprecated and will be removed in version 5.x. To avoid implicit
>>>> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make
>>>> sure your project is compiled, before analysis has started.
>>>> [1,627s][info   ][gc      ] GC(0) Garbage Collection (Metadata GC
>>>> Threshold) 84M(1%)->56M(0%)
>>>>
>>>> > Task :sirix-core:test
>>>> [0.001s][warning][pagesize] UseLargePages disabled, no large pages
>>>> configured and available on the system.
>>>> [1.253s][info   ][gc      ] Using The Z Garbage Collector
>>>>
>>>> [2,930s][info   ][gc      ] GC(1) Garbage Collection (Warmup)
>>>> 1616M(11%)->746M(5%)
>>>> [4,445s][info   ][gc      ] GC(2) Garbage Collection (Warmup)
>>>> 3232M(21%)->750M(5%)
>>>> [5,751s][info   ][gc      ] GC(3) Garbage Collection (Warmup)
>>>> 4644M(30%)->1356M(9%)
>>>> [9,886s][info   ][gc      ] GC(4) Garbage Collection (Allocation Rate)
>>>> 10668M(69%)->612M(4%)
>>>> [10,406s][info   ][gc      ] GC(5) Garbage Collection (Allocation Rate)
>>>> 2648M(17%)->216M(1%)
>>>> [13,931s][info   ][gc      ] GC(6) Garbage Collection (Allocation Rate)
>>>> 11164M(73%)->1562M(10%)
>>>> [16,908s][info   ][gc      ] GC(7) Garbage Collection (Allocation Rate)
>>>> 11750M(76%)->460M(3%)
>>>> [20,690s][info   ][gc      ] GC(8) Garbage Collection (Allocation Rate)
>>>> 12670M(82%)->726M(5%)
>>>> [24,376s][info   ][gc      ] GC(9) Garbage Collection (Allocation Rate)
>>>> 13422M(87%)->224M(1%)
>>>> [28,152s][info   ][gc      ] GC(10) Garbage Collection (Proactive)
>>>> 13474M(88%)->650M(4%)
>>>> [31,526s][info   ][gc      ] GC(11) Garbage Collection (Allocation
>>>> Rate) 12072M(79%)->1472M(10%)
>>>> [34,754s][info   ][gc      ] GC(12) Garbage Collection (Allocation
>>>> Rate) 13050M(85%)->330M(2%)
>>>> [38,478s][info   ][gc      ] GC(13) Garbage Collection (Allocation
>>>> Rate) 13288M(87%)->762M(5%)
>>>> [41,936s][info   ][gc      ] GC(14) Garbage Collection (Proactive)
>>>> 13294M(87%)->504M(3%)
>>>> [45,353s][info   ][gc      ] GC(15) Garbage Collection (Allocation
>>>> Rate) 12984M(85%)->268M(2%)
>>>> [48,861s][info   ][gc      ] GC(16) Garbage Collection (Allocation
>>>> Rate) 13008M(85%)->306M(2%)
>>>> [52,133s][info   ][gc      ] GC(17) Garbage Collection (Proactive)
>>>> 12042M(78%)->538M(4%)
>>>> [55,705s][info   ][gc      ] GC(18) Garbage Collection (Allocation
>>>> Rate) 12420M(81%)->1842M(12%)
>>>> [59,000s][info   ][gc      ] GC(19) Garbage Collection (Allocation
>>>> Rate) 12458M(81%)->1422M(9%)
>>>> [64,501s][info   ][gc      ] Allocation Stall (Test worker) 59,673ms
>>>> [64,742s][info   ][gc      ] Allocation Stall (Test worker) 240,077ms
>>>> [65,806s][info   ][gc      ] GC(20) Garbage Collection (Allocation
>>>> Rate) 13808M(90%)->6936M(45%)
>>>> [66,476s][info   ][gc      ] GC(21) Garbage Collection (Allocation
>>>> Stall) 7100M(46%)->4478M(29%)
>>>> [69,471s][info   ][gc      ] GC(22) Garbage Collection (Allocation
>>>> Rate) 10098M(66%)->5888M(38%)
>>>> [72,252s][info   ][gc      ] GC(23) Garbage Collection (Allocation
>>>> Rate) 11226M(73%)->5816M(38%)
>>>>
>>>> ...
>>>>
>>>> So even here I can see some allocation stalls.
>>>>
>>>> Running the Same with -XX:+ZGenerational in build.gradle probably using
>>>> GraalVM does something differnt, but I don't know what... at least off-heap
>>>> memory is exhausted at some point due to direct byte buffer usage!?
>>>>
>>>> So, I'm not sure what's the difference, though.
>>>>
>>>> With this:
>>>>
>>>> "-XX:+UseZGC",
>>>> "-Xlog:gc*=debug:file=zgc-generational-detailed.log",
>>>> "-XX:+ZGenerational",
>>>> "-verbose:gc",
>>>> "-XX:+HeapDumpOnOutOfMemoryError",
>>>> "-XX:HeapDumpPath=heapdump.hprof",
>>>> "-XX:MaxDirectMemorySize=2g",
>>>>
>>>>
>>>> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648)
>>>> 	at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
>>>> 	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:111)
>>>> 	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360)
>>>> 	at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191)
>>>> 	at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192)
>>>> 	at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176)
>>>> 	at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148)
>>>> 	at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443)
>>>>
>>>>
>>>>
>>>>
>>>> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan Karlsson <
>>>> stefan.karlsson at oracle.com>:
>>>>
>>>>> Hi Johannes,
>>>>>
>>>>> We tried to look at the log files and the jfr files, but couldn't find
>>>>> an OotOfMemoryError in any of them. Do you think you could try to
>>>>> rerun
>>>>> and capture the entire GC log from the OutOfMemoryError run?
>>>>>
>>>>> A few things to note:
>>>>>
>>>>> 1) You seem to be running the Graal compiler. Graal doesn't support
>>>>> Generational ZGC, so you are going to run different compilers when you
>>>>> compare Singlegen ZGC with Generational ZGC.
>>>>>
>>>>> 2) It's not clear to me that the provided JFR files matches the
>>>>> provided
>>>>> log files.
>>>>>
>>>>> 3) The JFR files show that -XX:+UseLargePages are used, but the
>>>>> gc+init
>>>>> logs shows 'Large Page Support: Disabled', you might want to look into
>>>>> why that is the case.
>>>>>
>>>>> 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It
>>>>> should
>>>>> probably be named zgc-chicago.log.
>>>>>
>>>>> Cheers,
>>>>> StefanK
>>>>>
>>>>> On 2024-02-14 17:36, Johannes Lichtenberger wrote:
>>>>> > Hello,
>>>>> >
>>>>> > a test of my little DB project fails using generational ZGC, but not
>>>>> > with ZGC and G1 (out of memory error).
>>>>> >
>>>>> > To be honest, I guess the allocation rate and thus GC pressure, when
>>>>> > traversing a resource in SirixDB is unacceptable. The strategy is to
>>>>> > create fine-grained nodes from JSON input and store these in a trie.
>>>>> > First, a 3,8Gb JSON file is shredded and imported. Next, a preorder
>>>>> > traversal of the generated trie traverses a trie (with leaf pages
>>>>> > storing 1024 nodes each and in total ~300_000_000 (and these are
>>>>> going
>>>>> > to be deserialized one by one). The pages are furthermore referenced
>>>>> > in memory through PageReference::setPage. Furthermore, a Caffeine
>>>>> page
>>>>> > cache caches the PageReferences (keys) and the pages (values) and
>>>>> sets
>>>>> > the reference back to null once entries are going to be evicted
>>>>> > (PageReference.setPage(null)).
>>>>> >
>>>>> > However, I think the whole strategy of having to have in-memory
>>>>> nodes
>>>>> > might not be best. Maybe it's better to use off-heap memory for the
>>>>> > pages itself with MemorySegments, but the pages are not of a fixed
>>>>> > size, thus it may get tricky.
>>>>> >
>>>>> > The test mentioned is this:
>>>>> >
>>>>> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69
>>>>> >
>>>>> > I can upload the JSON file somewhere for a couple of days if needed.
>>>>> >
>>>>> > Caused by: java.lang.OutOfMemoryError
>>>>> >     at
>>>>> >
>>>>> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
>>>>> >     at
>>>>> >
>>>>> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
>>>>> >     at
>>>>> >
>>>>> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
>>>>> >     at
>>>>> >
>>>>> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
>>>>> >     at
>>>>> > io.sirix.access.trx.page
>>>>> .NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442)
>>>>> >
>>>>> > I've uploaded several JFR recordings and logs over here (maybe
>>>>> besides
>>>>> > the async profiler JFR files the zgc-detailed log is most
>>>>> interesting):
>>>>> >
>>>>> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core
>>>>> >
>>>>> > kind regards
>>>>> > Johannes
>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/zgc-dev/attachments/20240215/8ea94226/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2024-02-15 21-43-33.png
Type: image/png
Size: 31322 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/zgc-dev/attachments/20240215/8ea94226/Screenshotfrom2024-02-1521-43-33-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2024-02-15 21-43-33.png
Type: image/png
Size: 31322 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/zgc-dev/attachments/20240215/8ea94226/Screenshotfrom2024-02-1521-43-33-0003.png>


More information about the zgc-dev mailing list