From alen.vrecko at gmail.com Wed Feb 14 08:45:42 2024 From: alen.vrecko at gmail.com (=?UTF-8?B?QWxlbiBWcmXEjWtv?=) Date: Wed, 14 Feb 2024 09:45:42 +0100 Subject: clarification regarding SoftMaxHeapSize and AlwaysPreTouch Message-ID: Hello, saw this write up https://malloc.se/blog/zgc-softmaxheapsize. I have a question regarding AlwaysPreTouch. I could have missed it in the documentation and write up. But I am wondering: Doing -XX:SoftMaxHeapSize=2G -Xmx5G -XX:+AlwaysPreTouch will this first commit 5G of native memory on start and then ZGC will ZUncommit 3G? Alternative is to only pretouch up to 2G? With -XX:SoftMaxHeapSize=2G -Xmx5G -XX:+AlwaysPreTouch -XX:-ZUncommit does this mean the heap will take 5G of native memory from the start and stay that way? Thanks Alen -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.johansson at oracle.com Wed Feb 14 09:24:29 2024 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 14 Feb 2024 10:24:29 +0100 Subject: clarification regarding SoftMaxHeapSize and AlwaysPreTouch In-Reply-To: References: Message-ID: Hi Alen, Some details below. On 2024-02-14 09:45, Alen Vre?ko wrote: > Hello, > > saw this write up https://malloc.se/blog/zgc-softmaxheapsize > . I have a question > regarding?AlwaysPreTouch. I could have missed it in the documentation > and write up. But I am wondering: > > Doing -XX:SoftMaxHeapSize=2G -Xmx5G -XX:+AlwaysPreTouch > > will this first commit 5G of native memory on start and then ZGC will > ZUncommit 3G? Alternative is to only pretouch up to 2G? > It's actually not the maximum heap size that is the deciding factor here. At startup there are heuristics to decide an initial heap size (when not specified with -Xms or -XX:InitialHeapSize) and this is the amount of memory we will commit and pretouch at startup. > With -XX:SoftMaxHeapSize=2G -Xmx5G -XX:+AlwaysPreTouch -XX:-ZUncommit > does this mean the heap will take 5G of native memory from the start and > stay that way? > No, same as above, the initial heap size will be decided using heuristics or provided options. The only thing -ZUncommit change is that no memory that gets committed will later on be uncommitted and returned to the system. Hope this clarifies things a bit, StefanJ > Thanks > Alen From lichtenberger.johannes at gmail.com Wed Feb 14 16:36:42 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Wed, 14 Feb 2024 17:36:42 +0100 Subject: Generational ZGC issue Message-ID: Hello, a test of my little DB project fails using generational ZGC, but not with ZGC and G1 (out of memory error). To be honest, I guess the allocation rate and thus GC pressure, when traversing a resource in SirixDB is unacceptable. The strategy is to create fine-grained nodes from JSON input and store these in a trie. First, a 3,8Gb JSON file is shredded and imported. Next, a preorder traversal of the generated trie traverses a trie (with leaf pages storing 1024 nodes each and in total ~300_000_000 (and these are going to be deserialized one by one). The pages are furthermore referenced in memory through PageReference::setPage. Furthermore, a Caffeine page cache caches the PageReferences (keys) and the pages (values) and sets the reference back to null once entries are going to be evicted (PageReference.setPage(null)). However, I think the whole strategy of having to have in-memory nodes might not be best. 
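(For reference, the cache wiring just described is essentially the following. This is a minimal sketch, not the actual SirixDB code; the PageReference/Page types are the project's, and the size bound is only a placeholder:

    import com.github.benmanes.caffeine.cache.Cache;
    import com.github.benmanes.caffeine.cache.Caffeine;
    import com.github.benmanes.caffeine.cache.RemovalCause;

    Cache<PageReference, Page> pageCache = Caffeine.newBuilder()
        .maximumSize(100_000) // placeholder bound
        .removalListener((PageReference ref, Page page, RemovalCause cause) -> {
          // drop the strong in-memory link so the evicted page can be collected
          if (ref != null) {
            ref.setPage(null);
          }
        })
        .build();

Every page therefore stays strongly reachable via its PageReference until the cache evicts it.)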
Maybe it's better to use off-heap memory for the pages itself with MemorySegments, but the pages are not of a fixed size, thus it may get tricky. The test mentioned is this: https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 I can upload the JSON file somewhere for a couple of days if needed. Caused by: java.lang.OutOfMemoryError at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) at io.sirix.access.trx.page.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) I've uploaded several JFR recordings and logs over here (maybe besides the async profiler JFR files the zgc-detailed log is most interesting): https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core kind regards Johannes -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Thu Feb 15 11:05:26 2024 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 15 Feb 2024 12:05:26 +0100 Subject: Generational ZGC issue In-Reply-To: References: Message-ID: <7eac3764-d836-498b-98fa-c48d4b757096@oracle.com> Hi Johannes, We tried to look at the log files and the jfr files, but couldn't find an OotOfMemoryError in any of them. Do you think you could try to rerun and capture the entire GC log from the OutOfMemoryError run? A few things to note: 1) You seem to be running the Graal compiler. Graal doesn't support Generational ZGC, so you are going to run different compilers when you compare Singlegen ZGC with Generational ZGC. 2) It's not clear to me that the provided JFR files matches the provided log files. 3) The JFR files show that -XX:+UseLargePages are used, but the gc+init logs shows 'Large Page Support: Disabled', you might want to look into why that is the case. 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It should probably be named zgc-chicago.log. Cheers, StefanK On 2024-02-14 17:36, Johannes Lichtenberger wrote: > Hello, > > a test of my little DB project fails using generational ZGC, but not > with ZGC and G1 (out of memory error). > > To be honest, I guess the allocation rate and thus GC pressure, when > traversing a resource in SirixDB is unacceptable. The strategy is to > create fine-grained nodes from JSON input and store these in a trie. > First, a 3,8Gb JSON file is shredded and imported. Next, a preorder > traversal of the generated trie traverses a trie (with leaf pages > storing 1024 nodes each and in total ~300_000_000 (and these are going > to be deserialized one by one). The pages are furthermore referenced > in memory through PageReference::setPage. 
Furthermore, a Caffeine page > cache caches the PageReferences (keys) and the pages (values) and sets > the reference back to null once entries are going to be evicted > (PageReference.setPage(null)). > > However, I think the whole strategy of having to have in-memory nodes > might not be best. Maybe it's better to use off-heap memory for the > pages itself with MemorySegments, but the pages are not of a fixed > size, thus it may get tricky. > > The test mentioned is this: > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > > I can upload the JSON file somewhere for a couple of days if needed. > > Caused by: java.lang.OutOfMemoryError > ? ? at > java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) > ? ? at > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) > ? ? at > java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) > ? ? at > java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) > ? ? at > java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) > ? ? at > java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) > ? ? at > java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > ? ? at > java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > ? ? at > java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > ? ? at > java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) > ? ? at > io.sirix.access.trx.page.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) > > I've uploaded several JFR recordings and logs over here (maybe besides > the async profiler JFR files the zgc-detailed log is most interesting): > > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core > > kind regards > Johannes From lichtenberger.johannes at gmail.com Thu Feb 15 16:54:50 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Thu, 15 Feb 2024 17:54:50 +0100 Subject: Generational ZGC issue In-Reply-To: References: <7eac3764-d836-498b-98fa-c48d4b757096@oracle.com> Message-ID: I've attached two logs, the first one without -XX:+Generational, the second one with the option set, even though I also saw, that generational ZGC is going to be supported in GraalVM 24.1 in September... so not sure what this does :) Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger < lichtenberger.johannes at gmail.com>: > Strange, so does it simply ignore the option? The following is the > beginning of the output from _non_ generational ZGC: > > johannes at luna:~/IdeaProjects/sirix$ ./gradlew > -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal > :sirix-core:test --tests > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > > > Configure project : > The 'sonarqube' task depends on compile tasks. This behavior is now > deprecated and will be removed in version 5.x. To avoid implicit > compilation, set property 'sonar.gradle.skipCompile' to 'true' and make > sure your project is compiled, before analysis has started. > The 'sonar' task depends on compile tasks. This behavior is now deprecated > and will be removed in version 5.x. 
To avoid implicit compilation, set > property 'sonar.gradle.skipCompile' to 'true' and make sure your project is > compiled, before analysis has started. > [1,627s][info ][gc ] GC(0) Garbage Collection (Metadata GC > Threshold) 84M(1%)->56M(0%) > > > Task :sirix-core:test > [0.001s][warning][pagesize] UseLargePages disabled, no large pages > configured and available on the system. > [1.253s][info ][gc ] Using The Z Garbage Collector > > [2,930s][info ][gc ] GC(1) Garbage Collection (Warmup) > 1616M(11%)->746M(5%) > [4,445s][info ][gc ] GC(2) Garbage Collection (Warmup) > 3232M(21%)->750M(5%) > [5,751s][info ][gc ] GC(3) Garbage Collection (Warmup) > 4644M(30%)->1356M(9%) > [9,886s][info ][gc ] GC(4) Garbage Collection (Allocation Rate) > 10668M(69%)->612M(4%) > [10,406s][info ][gc ] GC(5) Garbage Collection (Allocation Rate) > 2648M(17%)->216M(1%) > [13,931s][info ][gc ] GC(6) Garbage Collection (Allocation Rate) > 11164M(73%)->1562M(10%) > [16,908s][info ][gc ] GC(7) Garbage Collection (Allocation Rate) > 11750M(76%)->460M(3%) > [20,690s][info ][gc ] GC(8) Garbage Collection (Allocation Rate) > 12670M(82%)->726M(5%) > [24,376s][info ][gc ] GC(9) Garbage Collection (Allocation Rate) > 13422M(87%)->224M(1%) > [28,152s][info ][gc ] GC(10) Garbage Collection (Proactive) > 13474M(88%)->650M(4%) > [31,526s][info ][gc ] GC(11) Garbage Collection (Allocation Rate) > 12072M(79%)->1472M(10%) > [34,754s][info ][gc ] GC(12) Garbage Collection (Allocation Rate) > 13050M(85%)->330M(2%) > [38,478s][info ][gc ] GC(13) Garbage Collection (Allocation Rate) > 13288M(87%)->762M(5%) > [41,936s][info ][gc ] GC(14) Garbage Collection (Proactive) > 13294M(87%)->504M(3%) > [45,353s][info ][gc ] GC(15) Garbage Collection (Allocation Rate) > 12984M(85%)->268M(2%) > [48,861s][info ][gc ] GC(16) Garbage Collection (Allocation Rate) > 13008M(85%)->306M(2%) > [52,133s][info ][gc ] GC(17) Garbage Collection (Proactive) > 12042M(78%)->538M(4%) > [55,705s][info ][gc ] GC(18) Garbage Collection (Allocation Rate) > 12420M(81%)->1842M(12%) > [59,000s][info ][gc ] GC(19) Garbage Collection (Allocation Rate) > 12458M(81%)->1422M(9%) > [64,501s][info ][gc ] Allocation Stall (Test worker) 59,673ms > [64,742s][info ][gc ] Allocation Stall (Test worker) 240,077ms > [65,806s][info ][gc ] GC(20) Garbage Collection (Allocation Rate) > 13808M(90%)->6936M(45%) > [66,476s][info ][gc ] GC(21) Garbage Collection (Allocation Stall) > 7100M(46%)->4478M(29%) > [69,471s][info ][gc ] GC(22) Garbage Collection (Allocation Rate) > 10098M(66%)->5888M(38%) > [72,252s][info ][gc ] GC(23) Garbage Collection (Allocation Rate) > 11226M(73%)->5816M(38%) > > ... > > So even here I can see some allocation stalls. > > Running the Same with -XX:+ZGenerational in build.gradle probably using > GraalVM does something differnt, but I don't know what... at least off-heap > memory is exhausted at some point due to direct byte buffer usage!? > > So, I'm not sure what's the difference, though. 
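(Side note: a quick way to check whether the forked test JVM really enables the flag, independent of Gradle, would be something along the lines of

    java -XX:+UseZGC -XX:+ZGenerational -XX:+PrintFlagsFinal -version | grep ZGenerational

and when generational mode is actually active the GC log should show separate "Minor Collection" / "Major Collection" entries rather than the plain "Garbage Collection (...)" lines above.)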
> > With this: > > "-XX:+UseZGC", > "-Xlog:gc*=debug:file=zgc-generational-detailed.log", > "-XX:+ZGenerational", > "-verbose:gc", > "-XX:+HeapDumpOnOutOfMemoryError", > "-XX:HeapDumpPath=heapdump.hprof", > "-XX:MaxDirectMemorySize=2g", > > > Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648) > at java.base/java.nio.Bits.reserveMemory(Bits.java:178) > at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) > at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) > at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) > at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) > at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) > at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) > at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) > > > > > Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan Karlsson < > stefan.karlsson at oracle.com>: > >> Hi Johannes, >> >> We tried to look at the log files and the jfr files, but couldn't find >> an OotOfMemoryError in any of them. Do you think you could try to rerun >> and capture the entire GC log from the OutOfMemoryError run? >> >> A few things to note: >> >> 1) You seem to be running the Graal compiler. Graal doesn't support >> Generational ZGC, so you are going to run different compilers when you >> compare Singlegen ZGC with Generational ZGC. >> >> 2) It's not clear to me that the provided JFR files matches the provided >> log files. >> >> 3) The JFR files show that -XX:+UseLargePages are used, but the gc+init >> logs shows 'Large Page Support: Disabled', you might want to look into >> why that is the case. >> >> 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It should >> probably be named zgc-chicago.log. >> >> Cheers, >> StefanK >> >> On 2024-02-14 17:36, Johannes Lichtenberger wrote: >> > Hello, >> > >> > a test of my little DB project fails using generational ZGC, but not >> > with ZGC and G1 (out of memory error). >> > >> > To be honest, I guess the allocation rate and thus GC pressure, when >> > traversing a resource in SirixDB is unacceptable. The strategy is to >> > create fine-grained nodes from JSON input and store these in a trie. >> > First, a 3,8Gb JSON file is shredded and imported. Next, a preorder >> > traversal of the generated trie traverses a trie (with leaf pages >> > storing 1024 nodes each and in total ~300_000_000 (and these are going >> > to be deserialized one by one). The pages are furthermore referenced >> > in memory through PageReference::setPage. Furthermore, a Caffeine page >> > cache caches the PageReferences (keys) and the pages (values) and sets >> > the reference back to null once entries are going to be evicted >> > (PageReference.setPage(null)). >> > >> > However, I think the whole strategy of having to have in-memory nodes >> > might not be best. Maybe it's better to use off-heap memory for the >> > pages itself with MemorySegments, but the pages are not of a fixed >> > size, thus it may get tricky. >> > >> > The test mentioned is this: >> > >> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 >> > >> > I can upload the JSON file somewhere for a couple of days if needed. 
>> > >> > Caused by: java.lang.OutOfMemoryError >> > at >> > >> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) >> > at >> > >> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) >> > at >> > >> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) >> > at >> > >> java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) >> > at >> > >> java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) >> > at >> > >> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) >> > at >> > >> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) >> > at >> > >> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) >> > at >> > >> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) >> > at >> > >> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) >> > at >> > io.sirix.access.trx.page >> .NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) >> > >> > I've uploaded several JFR recordings and logs over here (maybe besides >> > the async profiler JFR files the zgc-detailed log is most interesting): >> > >> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core >> > >> > kind regards >> > Johannes >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From lichtenberger.johannes at gmail.com Thu Feb 15 16:58:28 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Thu, 15 Feb 2024 17:58:28 +0100 Subject: Generational ZGC issue In-Reply-To: References: <7eac3764-d836-498b-98fa-c48d4b757096@oracle.com> Message-ID: However, it's the same with: ./gradlew -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis using OpenJDK hopefully Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes Lichtenberger < lichtenberger.johannes at gmail.com>: > I've attached two logs, the first one without -XX:+Generational, the > second one with the option set, even though I also saw, that generational > ZGC is going to be supported in GraalVM 24.1 in September... so not sure > what this does :) > > Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger < > lichtenberger.johannes at gmail.com>: > >> Strange, so does it simply ignore the option? The following is the >> beginning of the output from _non_ generational ZGC: >> >> johannes at luna:~/IdeaProjects/sirix$ ./gradlew >> -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal >> :sirix-core:test --tests >> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis >> >> > Configure project : >> The 'sonarqube' task depends on compile tasks. This behavior is now >> deprecated and will be removed in version 5.x. To avoid implicit >> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make >> sure your project is compiled, before analysis has started. >> The 'sonar' task depends on compile tasks. This behavior is now >> deprecated and will be removed in version 5.x. To avoid implicit >> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make >> sure your project is compiled, before analysis has started. 
>> [1,627s][info ][gc ] GC(0) Garbage Collection (Metadata GC >> Threshold) 84M(1%)->56M(0%) >> >> > Task :sirix-core:test >> [0.001s][warning][pagesize] UseLargePages disabled, no large pages >> configured and available on the system. >> [1.253s][info ][gc ] Using The Z Garbage Collector >> >> [2,930s][info ][gc ] GC(1) Garbage Collection (Warmup) >> 1616M(11%)->746M(5%) >> [4,445s][info ][gc ] GC(2) Garbage Collection (Warmup) >> 3232M(21%)->750M(5%) >> [5,751s][info ][gc ] GC(3) Garbage Collection (Warmup) >> 4644M(30%)->1356M(9%) >> [9,886s][info ][gc ] GC(4) Garbage Collection (Allocation Rate) >> 10668M(69%)->612M(4%) >> [10,406s][info ][gc ] GC(5) Garbage Collection (Allocation Rate) >> 2648M(17%)->216M(1%) >> [13,931s][info ][gc ] GC(6) Garbage Collection (Allocation Rate) >> 11164M(73%)->1562M(10%) >> [16,908s][info ][gc ] GC(7) Garbage Collection (Allocation Rate) >> 11750M(76%)->460M(3%) >> [20,690s][info ][gc ] GC(8) Garbage Collection (Allocation Rate) >> 12670M(82%)->726M(5%) >> [24,376s][info ][gc ] GC(9) Garbage Collection (Allocation Rate) >> 13422M(87%)->224M(1%) >> [28,152s][info ][gc ] GC(10) Garbage Collection (Proactive) >> 13474M(88%)->650M(4%) >> [31,526s][info ][gc ] GC(11) Garbage Collection (Allocation Rate) >> 12072M(79%)->1472M(10%) >> [34,754s][info ][gc ] GC(12) Garbage Collection (Allocation Rate) >> 13050M(85%)->330M(2%) >> [38,478s][info ][gc ] GC(13) Garbage Collection (Allocation Rate) >> 13288M(87%)->762M(5%) >> [41,936s][info ][gc ] GC(14) Garbage Collection (Proactive) >> 13294M(87%)->504M(3%) >> [45,353s][info ][gc ] GC(15) Garbage Collection (Allocation Rate) >> 12984M(85%)->268M(2%) >> [48,861s][info ][gc ] GC(16) Garbage Collection (Allocation Rate) >> 13008M(85%)->306M(2%) >> [52,133s][info ][gc ] GC(17) Garbage Collection (Proactive) >> 12042M(78%)->538M(4%) >> [55,705s][info ][gc ] GC(18) Garbage Collection (Allocation Rate) >> 12420M(81%)->1842M(12%) >> [59,000s][info ][gc ] GC(19) Garbage Collection (Allocation Rate) >> 12458M(81%)->1422M(9%) >> [64,501s][info ][gc ] Allocation Stall (Test worker) 59,673ms >> [64,742s][info ][gc ] Allocation Stall (Test worker) 240,077ms >> [65,806s][info ][gc ] GC(20) Garbage Collection (Allocation Rate) >> 13808M(90%)->6936M(45%) >> [66,476s][info ][gc ] GC(21) Garbage Collection (Allocation Stall) >> 7100M(46%)->4478M(29%) >> [69,471s][info ][gc ] GC(22) Garbage Collection (Allocation Rate) >> 10098M(66%)->5888M(38%) >> [72,252s][info ][gc ] GC(23) Garbage Collection (Allocation Rate) >> 11226M(73%)->5816M(38%) >> >> ... >> >> So even here I can see some allocation stalls. >> >> Running the Same with -XX:+ZGenerational in build.gradle probably using >> GraalVM does something differnt, but I don't know what... at least off-heap >> memory is exhausted at some point due to direct byte buffer usage!? >> >> So, I'm not sure what's the difference, though. 
>> >> With this: >> >> "-XX:+UseZGC", >> "-Xlog:gc*=debug:file=zgc-generational-detailed.log", >> "-XX:+ZGenerational", >> "-verbose:gc", >> "-XX:+HeapDumpOnOutOfMemoryError", >> "-XX:HeapDumpPath=heapdump.hprof", >> "-XX:MaxDirectMemorySize=2g", >> >> >> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648) >> at java.base/java.nio.Bits.reserveMemory(Bits.java:178) >> at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) >> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) >> at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) >> at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) >> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) >> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) >> at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) >> >> >> >> >> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan Karlsson < >> stefan.karlsson at oracle.com>: >> >>> Hi Johannes, >>> >>> We tried to look at the log files and the jfr files, but couldn't find >>> an OotOfMemoryError in any of them. Do you think you could try to rerun >>> and capture the entire GC log from the OutOfMemoryError run? >>> >>> A few things to note: >>> >>> 1) You seem to be running the Graal compiler. Graal doesn't support >>> Generational ZGC, so you are going to run different compilers when you >>> compare Singlegen ZGC with Generational ZGC. >>> >>> 2) It's not clear to me that the provided JFR files matches the provided >>> log files. >>> >>> 3) The JFR files show that -XX:+UseLargePages are used, but the gc+init >>> logs shows 'Large Page Support: Disabled', you might want to look into >>> why that is the case. >>> >>> 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It should >>> probably be named zgc-chicago.log. >>> >>> Cheers, >>> StefanK >>> >>> On 2024-02-14 17:36, Johannes Lichtenberger wrote: >>> > Hello, >>> > >>> > a test of my little DB project fails using generational ZGC, but not >>> > with ZGC and G1 (out of memory error). >>> > >>> > To be honest, I guess the allocation rate and thus GC pressure, when >>> > traversing a resource in SirixDB is unacceptable. The strategy is to >>> > create fine-grained nodes from JSON input and store these in a trie. >>> > First, a 3,8Gb JSON file is shredded and imported. Next, a preorder >>> > traversal of the generated trie traverses a trie (with leaf pages >>> > storing 1024 nodes each and in total ~300_000_000 (and these are going >>> > to be deserialized one by one). The pages are furthermore referenced >>> > in memory through PageReference::setPage. Furthermore, a Caffeine page >>> > cache caches the PageReferences (keys) and the pages (values) and sets >>> > the reference back to null once entries are going to be evicted >>> > (PageReference.setPage(null)). >>> > >>> > However, I think the whole strategy of having to have in-memory nodes >>> > might not be best. Maybe it's better to use off-heap memory for the >>> > pages itself with MemorySegments, but the pages are not of a fixed >>> > size, thus it may get tricky. 
>>> > >>> > The test mentioned is this: >>> > >>> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 >>> > >>> > I can upload the JSON file somewhere for a couple of days if needed. >>> > >>> > Caused by: java.lang.OutOfMemoryError >>> > at >>> > >>> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) >>> > at >>> > >>> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) >>> > at >>> > >>> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) >>> > at >>> > >>> java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) >>> > at >>> > >>> java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) >>> > at >>> > >>> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) >>> > at >>> > >>> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) >>> > at >>> > >>> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) >>> > at >>> > >>> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) >>> > at >>> > >>> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) >>> > at >>> > io.sirix.access.trx.page >>> .NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) >>> > >>> > I've uploaded several JFR recordings and logs over here (maybe besides >>> > the async profiler JFR files the zgc-detailed log is most interesting): >>> > >>> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core >>> > >>> > kind regards >>> > Johannes >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From lichtenberger.johannes at gmail.com Thu Feb 15 17:22:32 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Thu, 15 Feb 2024 18:22:32 +0100 Subject: Generational ZGC issue In-Reply-To: References: <7eac3764-d836-498b-98fa-c48d4b757096@oracle.com> Message-ID: I guess I don't know which JDK it picks for the tests, but I guess OpenJDK Johannes Lichtenberger schrieb am Do., 15. Feb. 2024, 17:58: > However, it's the same with: ./gradlew > -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 :sirix-core:test > --tests > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > using OpenJDK hopefully > > Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes Lichtenberger < > lichtenberger.johannes at gmail.com>: > >> I've attached two logs, the first one without -XX:+Generational, the >> second one with the option set, even though I also saw, that generational >> ZGC is going to be supported in GraalVM 24.1 in September... so not sure >> what this does :) >> >> Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger < >> lichtenberger.johannes at gmail.com>: >> >>> Strange, so does it simply ignore the option? The following is the >>> beginning of the output from _non_ generational ZGC: >>> >>> johannes at luna:~/IdeaProjects/sirix$ ./gradlew >>> -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal >>> :sirix-core:test --tests >>> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis >>> >>> > Configure project : >>> The 'sonarqube' task depends on compile tasks. This behavior is now >>> deprecated and will be removed in version 5.x. 
To avoid implicit >>> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make >>> sure your project is compiled, before analysis has started. >>> The 'sonar' task depends on compile tasks. This behavior is now >>> deprecated and will be removed in version 5.x. To avoid implicit >>> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make >>> sure your project is compiled, before analysis has started. >>> [1,627s][info ][gc ] GC(0) Garbage Collection (Metadata GC >>> Threshold) 84M(1%)->56M(0%) >>> >>> > Task :sirix-core:test >>> [0.001s][warning][pagesize] UseLargePages disabled, no large pages >>> configured and available on the system. >>> [1.253s][info ][gc ] Using The Z Garbage Collector >>> >>> [2,930s][info ][gc ] GC(1) Garbage Collection (Warmup) >>> 1616M(11%)->746M(5%) >>> [4,445s][info ][gc ] GC(2) Garbage Collection (Warmup) >>> 3232M(21%)->750M(5%) >>> [5,751s][info ][gc ] GC(3) Garbage Collection (Warmup) >>> 4644M(30%)->1356M(9%) >>> [9,886s][info ][gc ] GC(4) Garbage Collection (Allocation Rate) >>> 10668M(69%)->612M(4%) >>> [10,406s][info ][gc ] GC(5) Garbage Collection (Allocation Rate) >>> 2648M(17%)->216M(1%) >>> [13,931s][info ][gc ] GC(6) Garbage Collection (Allocation Rate) >>> 11164M(73%)->1562M(10%) >>> [16,908s][info ][gc ] GC(7) Garbage Collection (Allocation Rate) >>> 11750M(76%)->460M(3%) >>> [20,690s][info ][gc ] GC(8) Garbage Collection (Allocation Rate) >>> 12670M(82%)->726M(5%) >>> [24,376s][info ][gc ] GC(9) Garbage Collection (Allocation Rate) >>> 13422M(87%)->224M(1%) >>> [28,152s][info ][gc ] GC(10) Garbage Collection (Proactive) >>> 13474M(88%)->650M(4%) >>> [31,526s][info ][gc ] GC(11) Garbage Collection (Allocation Rate) >>> 12072M(79%)->1472M(10%) >>> [34,754s][info ][gc ] GC(12) Garbage Collection (Allocation Rate) >>> 13050M(85%)->330M(2%) >>> [38,478s][info ][gc ] GC(13) Garbage Collection (Allocation Rate) >>> 13288M(87%)->762M(5%) >>> [41,936s][info ][gc ] GC(14) Garbage Collection (Proactive) >>> 13294M(87%)->504M(3%) >>> [45,353s][info ][gc ] GC(15) Garbage Collection (Allocation Rate) >>> 12984M(85%)->268M(2%) >>> [48,861s][info ][gc ] GC(16) Garbage Collection (Allocation Rate) >>> 13008M(85%)->306M(2%) >>> [52,133s][info ][gc ] GC(17) Garbage Collection (Proactive) >>> 12042M(78%)->538M(4%) >>> [55,705s][info ][gc ] GC(18) Garbage Collection (Allocation Rate) >>> 12420M(81%)->1842M(12%) >>> [59,000s][info ][gc ] GC(19) Garbage Collection (Allocation Rate) >>> 12458M(81%)->1422M(9%) >>> [64,501s][info ][gc ] Allocation Stall (Test worker) 59,673ms >>> [64,742s][info ][gc ] Allocation Stall (Test worker) 240,077ms >>> [65,806s][info ][gc ] GC(20) Garbage Collection (Allocation Rate) >>> 13808M(90%)->6936M(45%) >>> [66,476s][info ][gc ] GC(21) Garbage Collection (Allocation >>> Stall) 7100M(46%)->4478M(29%) >>> [69,471s][info ][gc ] GC(22) Garbage Collection (Allocation Rate) >>> 10098M(66%)->5888M(38%) >>> [72,252s][info ][gc ] GC(23) Garbage Collection (Allocation Rate) >>> 11226M(73%)->5816M(38%) >>> >>> ... >>> >>> So even here I can see some allocation stalls. >>> >>> Running the Same with -XX:+ZGenerational in build.gradle probably using >>> GraalVM does something differnt, but I don't know what... at least off-heap >>> memory is exhausted at some point due to direct byte buffer usage!? >>> >>> So, I'm not sure what's the difference, though. 
>>> >>> With this: >>> >>> "-XX:+UseZGC", >>> "-Xlog:gc*=debug:file=zgc-generational-detailed.log", >>> "-XX:+ZGenerational", >>> "-verbose:gc", >>> "-XX:+HeapDumpOnOutOfMemoryError", >>> "-XX:HeapDumpPath=heapdump.hprof", >>> "-XX:MaxDirectMemorySize=2g", >>> >>> >>> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648) >>> at java.base/java.nio.Bits.reserveMemory(Bits.java:178) >>> at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) >>> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) >>> at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) >>> at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) >>> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) >>> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) >>> at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) >>> >>> >>> >>> >>> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan Karlsson < >>> stefan.karlsson at oracle.com>: >>> >>>> Hi Johannes, >>>> >>>> We tried to look at the log files and the jfr files, but couldn't find >>>> an OotOfMemoryError in any of them. Do you think you could try to rerun >>>> and capture the entire GC log from the OutOfMemoryError run? >>>> >>>> A few things to note: >>>> >>>> 1) You seem to be running the Graal compiler. Graal doesn't support >>>> Generational ZGC, so you are going to run different compilers when you >>>> compare Singlegen ZGC with Generational ZGC. >>>> >>>> 2) It's not clear to me that the provided JFR files matches the >>>> provided >>>> log files. >>>> >>>> 3) The JFR files show that -XX:+UseLargePages are used, but the gc+init >>>> logs shows 'Large Page Support: Disabled', you might want to look into >>>> why that is the case. >>>> >>>> 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It should >>>> probably be named zgc-chicago.log. >>>> >>>> Cheers, >>>> StefanK >>>> >>>> On 2024-02-14 17:36, Johannes Lichtenberger wrote: >>>> > Hello, >>>> > >>>> > a test of my little DB project fails using generational ZGC, but not >>>> > with ZGC and G1 (out of memory error). >>>> > >>>> > To be honest, I guess the allocation rate and thus GC pressure, when >>>> > traversing a resource in SirixDB is unacceptable. The strategy is to >>>> > create fine-grained nodes from JSON input and store these in a trie. >>>> > First, a 3,8Gb JSON file is shredded and imported. Next, a preorder >>>> > traversal of the generated trie traverses a trie (with leaf pages >>>> > storing 1024 nodes each and in total ~300_000_000 (and these are >>>> going >>>> > to be deserialized one by one). The pages are furthermore referenced >>>> > in memory through PageReference::setPage. Furthermore, a Caffeine >>>> page >>>> > cache caches the PageReferences (keys) and the pages (values) and >>>> sets >>>> > the reference back to null once entries are going to be evicted >>>> > (PageReference.setPage(null)). >>>> > >>>> > However, I think the whole strategy of having to have in-memory nodes >>>> > might not be best. Maybe it's better to use off-heap memory for the >>>> > pages itself with MemorySegments, but the pages are not of a fixed >>>> > size, thus it may get tricky. 
>>>> > >>>> > The test mentioned is this: >>>> > >>>> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 >>>> > >>>> > I can upload the JSON file somewhere for a couple of days if needed. >>>> > >>>> > Caused by: java.lang.OutOfMemoryError >>>> > at >>>> > >>>> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) >>>> > at >>>> > >>>> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) >>>> > at >>>> > >>>> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) >>>> > at >>>> > >>>> java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) >>>> > at >>>> > >>>> java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) >>>> > at >>>> > >>>> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) >>>> > at >>>> > >>>> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) >>>> > at >>>> > >>>> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) >>>> > at >>>> > >>>> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) >>>> > at >>>> > >>>> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) >>>> > at >>>> > io.sirix.access.trx.page >>>> .NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) >>>> > >>>> > I've uploaded several JFR recordings and logs over here (maybe >>>> besides >>>> > the async profiler JFR files the zgc-detailed log is most >>>> interesting): >>>> > >>>> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core >>>> > >>>> > kind regards >>>> > Johannes >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter_booth at me.com Thu Feb 15 19:01:33 2024 From: peter_booth at me.com (Peter Booth) Date: Thu, 15 Feb 2024 14:01:33 -0500 Subject: Generational ZGC issue In-Reply-To: References: Message-ID: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> An HTML attachment was scrubbed... URL: From lichtenberger.johannes at gmail.com Thu Feb 15 20:53:01 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Thu, 15 Feb 2024 21:53:01 +0100 Subject: Generational ZGC issue In-Reply-To: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> Message-ID: It's a laptop, I've attached some details. Furthermore, if it seems worth digging deeper into the issue, the JSON file is here for one week: https://www.transfernow.net/dl/20240215j9NaPTc0 You'd have to unzip into bundles/sirix-core/src/test/resources/json, remove the @Disabled annotation and run the test JsonShredderTest::testChicagoDescendantAxis The test JVM parameters are specified in the parent build.gradle in the project root folder. The GitHub repo: https://github.com/sirixdb/sirix [image: Screenshot from 2024-02-15 21-43-33.png] kind regards Johannes Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth : > Just curious - what CPU, physical memory and OS are you using? > Sent from my iPhone > > On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger < > lichtenberger.johannes at gmail.com> wrote: > > ? > I guess I don't know which JDK it picks for the tests, but I guess OpenJDK > > Johannes Lichtenberger schrieb am Do., > 15. Feb. 
2024, 17:58: > >> However, it's the same with: ./gradlew >> -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 :sirix-core:test >> --tests >> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis >> using OpenJDK hopefully >> >> Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes Lichtenberger < >> lichtenberger.johannes at gmail.com>: >> >>> I've attached two logs, the first one without -XX:+Generational, the >>> second one with the option set, even though I also saw, that generational >>> ZGC is going to be supported in GraalVM 24.1 in September... so not sure >>> what this does :) >>> >>> Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger < >>> lichtenberger.johannes at gmail.com>: >>> >>>> Strange, so does it simply ignore the option? The following is the >>>> beginning of the output from _non_ generational ZGC: >>>> >>>> johannes at luna:~/IdeaProjects/sirix$ ./gradlew >>>> -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal >>>> :sirix-core:test --tests >>>> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis >>>> >>>> > Configure project : >>>> The 'sonarqube' task depends on compile tasks. This behavior is now >>>> deprecated and will be removed in version 5.x. To avoid implicit >>>> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make >>>> sure your project is compiled, before analysis has started. >>>> The 'sonar' task depends on compile tasks. This behavior is now >>>> deprecated and will be removed in version 5.x. To avoid implicit >>>> compilation, set property 'sonar.gradle.skipCompile' to 'true' and make >>>> sure your project is compiled, before analysis has started. >>>> [1,627s][info ][gc ] GC(0) Garbage Collection (Metadata GC >>>> Threshold) 84M(1%)->56M(0%) >>>> >>>> > Task :sirix-core:test >>>> [0.001s][warning][pagesize] UseLargePages disabled, no large pages >>>> configured and available on the system. 
>>>> [1.253s][info ][gc ] Using The Z Garbage Collector >>>> >>>> [2,930s][info ][gc ] GC(1) Garbage Collection (Warmup) >>>> 1616M(11%)->746M(5%) >>>> [4,445s][info ][gc ] GC(2) Garbage Collection (Warmup) >>>> 3232M(21%)->750M(5%) >>>> [5,751s][info ][gc ] GC(3) Garbage Collection (Warmup) >>>> 4644M(30%)->1356M(9%) >>>> [9,886s][info ][gc ] GC(4) Garbage Collection (Allocation Rate) >>>> 10668M(69%)->612M(4%) >>>> [10,406s][info ][gc ] GC(5) Garbage Collection (Allocation Rate) >>>> 2648M(17%)->216M(1%) >>>> [13,931s][info ][gc ] GC(6) Garbage Collection (Allocation Rate) >>>> 11164M(73%)->1562M(10%) >>>> [16,908s][info ][gc ] GC(7) Garbage Collection (Allocation Rate) >>>> 11750M(76%)->460M(3%) >>>> [20,690s][info ][gc ] GC(8) Garbage Collection (Allocation Rate) >>>> 12670M(82%)->726M(5%) >>>> [24,376s][info ][gc ] GC(9) Garbage Collection (Allocation Rate) >>>> 13422M(87%)->224M(1%) >>>> [28,152s][info ][gc ] GC(10) Garbage Collection (Proactive) >>>> 13474M(88%)->650M(4%) >>>> [31,526s][info ][gc ] GC(11) Garbage Collection (Allocation >>>> Rate) 12072M(79%)->1472M(10%) >>>> [34,754s][info ][gc ] GC(12) Garbage Collection (Allocation >>>> Rate) 13050M(85%)->330M(2%) >>>> [38,478s][info ][gc ] GC(13) Garbage Collection (Allocation >>>> Rate) 13288M(87%)->762M(5%) >>>> [41,936s][info ][gc ] GC(14) Garbage Collection (Proactive) >>>> 13294M(87%)->504M(3%) >>>> [45,353s][info ][gc ] GC(15) Garbage Collection (Allocation >>>> Rate) 12984M(85%)->268M(2%) >>>> [48,861s][info ][gc ] GC(16) Garbage Collection (Allocation >>>> Rate) 13008M(85%)->306M(2%) >>>> [52,133s][info ][gc ] GC(17) Garbage Collection (Proactive) >>>> 12042M(78%)->538M(4%) >>>> [55,705s][info ][gc ] GC(18) Garbage Collection (Allocation >>>> Rate) 12420M(81%)->1842M(12%) >>>> [59,000s][info ][gc ] GC(19) Garbage Collection (Allocation >>>> Rate) 12458M(81%)->1422M(9%) >>>> [64,501s][info ][gc ] Allocation Stall (Test worker) 59,673ms >>>> [64,742s][info ][gc ] Allocation Stall (Test worker) 240,077ms >>>> [65,806s][info ][gc ] GC(20) Garbage Collection (Allocation >>>> Rate) 13808M(90%)->6936M(45%) >>>> [66,476s][info ][gc ] GC(21) Garbage Collection (Allocation >>>> Stall) 7100M(46%)->4478M(29%) >>>> [69,471s][info ][gc ] GC(22) Garbage Collection (Allocation >>>> Rate) 10098M(66%)->5888M(38%) >>>> [72,252s][info ][gc ] GC(23) Garbage Collection (Allocation >>>> Rate) 11226M(73%)->5816M(38%) >>>> >>>> ... >>>> >>>> So even here I can see some allocation stalls. >>>> >>>> Running the Same with -XX:+ZGenerational in build.gradle probably using >>>> GraalVM does something differnt, but I don't know what... at least off-heap >>>> memory is exhausted at some point due to direct byte buffer usage!? >>>> >>>> So, I'm not sure what's the difference, though. 
>>>> >>>> With this: >>>> >>>> "-XX:+UseZGC", >>>> "-Xlog:gc*=debug:file=zgc-generational-detailed.log", >>>> "-XX:+ZGenerational", >>>> "-verbose:gc", >>>> "-XX:+HeapDumpOnOutOfMemoryError", >>>> "-XX:HeapDumpPath=heapdump.hprof", >>>> "-XX:MaxDirectMemorySize=2g", >>>> >>>> >>>> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648) >>>> at java.base/java.nio.Bits.reserveMemory(Bits.java:178) >>>> at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) >>>> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) >>>> at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) >>>> at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) >>>> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) >>>> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) >>>> at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) >>>> >>>> >>>> >>>> >>>> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan Karlsson < >>>> stefan.karlsson at oracle.com>: >>>> >>>>> Hi Johannes, >>>>> >>>>> We tried to look at the log files and the jfr files, but couldn't find >>>>> an OotOfMemoryError in any of them. Do you think you could try to >>>>> rerun >>>>> and capture the entire GC log from the OutOfMemoryError run? >>>>> >>>>> A few things to note: >>>>> >>>>> 1) You seem to be running the Graal compiler. Graal doesn't support >>>>> Generational ZGC, so you are going to run different compilers when you >>>>> compare Singlegen ZGC with Generational ZGC. >>>>> >>>>> 2) It's not clear to me that the provided JFR files matches the >>>>> provided >>>>> log files. >>>>> >>>>> 3) The JFR files show that -XX:+UseLargePages are used, but the >>>>> gc+init >>>>> logs shows 'Large Page Support: Disabled', you might want to look into >>>>> why that is the case. >>>>> >>>>> 4) The singlegen JFR file has a -Xlog:gc:g1-chicago.log line. It >>>>> should >>>>> probably be named zgc-chicago.log. >>>>> >>>>> Cheers, >>>>> StefanK >>>>> >>>>> On 2024-02-14 17:36, Johannes Lichtenberger wrote: >>>>> > Hello, >>>>> > >>>>> > a test of my little DB project fails using generational ZGC, but not >>>>> > with ZGC and G1 (out of memory error). >>>>> > >>>>> > To be honest, I guess the allocation rate and thus GC pressure, when >>>>> > traversing a resource in SirixDB is unacceptable. The strategy is to >>>>> > create fine-grained nodes from JSON input and store these in a trie. >>>>> > First, a 3,8Gb JSON file is shredded and imported. Next, a preorder >>>>> > traversal of the generated trie traverses a trie (with leaf pages >>>>> > storing 1024 nodes each and in total ~300_000_000 (and these are >>>>> going >>>>> > to be deserialized one by one). The pages are furthermore referenced >>>>> > in memory through PageReference::setPage. Furthermore, a Caffeine >>>>> page >>>>> > cache caches the PageReferences (keys) and the pages (values) and >>>>> sets >>>>> > the reference back to null once entries are going to be evicted >>>>> > (PageReference.setPage(null)). >>>>> > >>>>> > However, I think the whole strategy of having to have in-memory >>>>> nodes >>>>> > might not be best. Maybe it's better to use off-heap memory for the >>>>> > pages itself with MemorySegments, but the pages are not of a fixed >>>>> > size, thus it may get tricky. 
>>>>> > >>>>> > The test mentioned is this: >>>>> > >>>>> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 >>>>> > >>>>> > I can upload the JSON file somewhere for a couple of days if needed. >>>>> > >>>>> > Caused by: java.lang.OutOfMemoryError >>>>> > at >>>>> > >>>>> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) >>>>> > at >>>>> > >>>>> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) >>>>> > at >>>>> > >>>>> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) >>>>> > at >>>>> > >>>>> java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) >>>>> > at >>>>> > >>>>> java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) >>>>> > at >>>>> > >>>>> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) >>>>> > at >>>>> > >>>>> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) >>>>> > at >>>>> > >>>>> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) >>>>> > at >>>>> > >>>>> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) >>>>> > at >>>>> > >>>>> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) >>>>> > at >>>>> > io.sirix.access.trx.page >>>>> .NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) >>>>> > >>>>> > I've uploaded several JFR recordings and logs over here (maybe >>>>> besides >>>>> > the async profiler JFR files the zgc-detailed log is most >>>>> interesting): >>>>> > >>>>> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core >>>>> > >>>>> > kind regards >>>>> > Johannes >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot from 2024-02-15 21-43-33.png Type: image/png Size: 31322 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot from 2024-02-15 21-43-33.png Type: image/png Size: 31322 bytes Desc: not available URL: From stefan.johansson at oracle.com Fri Feb 16 12:43:10 2024 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 16 Feb 2024 13:43:10 +0100 Subject: Generational ZGC issue In-Reply-To: References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> Message-ID: <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> Hi Johannes, We've spent some more time looking at this and getting the json-file to reproduced it made it easy to verify our suspicions. Thanks for uploading it. There are a few things playing together here. The test is making quite heavy use of DirectByteBuffers and you limit the usage to 2G (-XX:MaxDirectMemorySize=2g). The life cycle and freeing of the native memory part of the DirectByteBuffer rely on reference processing. In generational ZGC reference processing is only done during Major collections and since the general GC preassure in this benchmark is very low (most objects die young), we do not trigger that many Major collections. Normaly this would not be a problem. To avoid throwing an out of memory error (due to hitting the direct buffer memory limit) too early the JDK triggers a System.gc(). 
This should trigger reference procesing and all buffers that are no longer in use would be freed. Since you specify the option -XX:+DisableExplicitGC all these calls to trigger GCs are ignored and no direct memory will be freed. So in our testing, just removing this flags makes the test pass. Another solution is to look at using HeapByteBuffers instead and don't have to worry about the direct memory usage. The OpenHFT lib seems to have support for this by just using elasticHeapByteBuffer(...) instead of elasticByteBuffer(). Lastly, the reason for this working with non-generational ZGC is that it does reference processing for every GC. Hope this helps, StefanJ On 2024-02-15 21:53, Johannes Lichtenberger wrote: > It's a laptop, I've attached some details. > > Furthermore, if it seems worth digging deeper into the issue, the JSON > file is here for one week: > https://www.transfernow.net/dl/20240215j9NaPTc0 > > > You'd have to unzip into bundles/sirix-core/src/test/resources/json, > remove the?@Disabled annotation and run the test > JsonShredderTest::testChicagoDescendantAxis > > The test JVM parameters are specified in the parent build.gradle in the > project root folder. > > The GitHub repo: https://github.com/sirixdb/sirix > > > Screenshot from 2024-02-15 21-43-33.png > > kind regards > Johannes > > Am Do., 15. Feb. 2024 um 20:01?Uhr schrieb Peter Booth > >: > > Just curious - what CPU, physical memory and OS are you using? > Sent from my iPhone > >> On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger >> > > wrote: >> >> ? >> I guess I don't know which JDK it picks for the tests, but I guess >> OpenJDK >> >> Johannes Lichtenberger > > schrieb am Do., 15. >> Feb. 2024, 17:58: >> >> However, it's the same with:?./gradlew >> -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 >> :sirix-core:test --tests >> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis? ?using OpenJDK hopefully >> >> Am Do., 15. Feb. 2024 um 17:54?Uhr schrieb Johannes >> Lichtenberger > >: >> >> I've attached two logs, the first one without >> -XX:+Generational, the second one with the option set, >> even though I also saw, that generational ZGC is going to >> be supported in GraalVM 24.1 in September... so not sure >> what this does :) >> >> Am Do., 15. Feb. 2024 um 17:52?Uhr schrieb Johannes >> Lichtenberger > >: >> >> Strange, so does it simply ignore the option? The >> following is the beginning of the output from _non_ >> generational ZGC: >> >> johannes at luna:~/IdeaProjects/sirix$ ./gradlew >> -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis >> >> > Configure project : >> The 'sonarqube' task depends on compile tasks. This >> behavior is now deprecated and will be removed in >> version 5.x. To avoid implicit compilation, set >> property 'sonar.gradle.skipCompile' to 'true' and make >> sure your project is compiled, before analysis has >> started. >> The 'sonar' task depends on compile tasks. This >> behavior is now deprecated and will be removed in >> version 5.x. To avoid implicit compilation, set >> property 'sonar.gradle.skipCompile' to 'true' and make >> sure your project is compiled, before analysis has >> started. >> [1,627s][info ? ][gc ? ? ?] GC(0) Garbage Collection >> (Metadata GC Threshold) 84M(1%)->56M(0%) >> >> > Task :sirix-core:test >> [0.001s][warning][pagesize] UseLargePages disabled, no >> large pages configured and available on the system. 
>> [1.253s][info ? ][gc ? ? ?] Using The Z Garbage Collector >> >> [2,930s][info ? ][gc ? ? ?] GC(1) Garbage Collection >> (Warmup) 1616M(11%)->746M(5%) >> [4,445s][info ? ][gc ? ? ?] GC(2) Garbage Collection >> (Warmup) 3232M(21%)->750M(5%) >> [5,751s][info ? ][gc ? ? ?] GC(3) Garbage Collection >> (Warmup) 4644M(30%)->1356M(9%) >> [9,886s][info ? ][gc ? ? ?] GC(4) Garbage Collection >> (Allocation Rate) 10668M(69%)->612M(4%) >> [10,406s][info ? ][gc ? ? ?] GC(5) Garbage Collection >> (Allocation Rate) 2648M(17%)->216M(1%) >> [13,931s][info ? ][gc ? ? ?] GC(6) Garbage Collection >> (Allocation Rate) 11164M(73%)->1562M(10%) >> [16,908s][info ? ][gc ? ? ?] GC(7) Garbage Collection >> (Allocation Rate) 11750M(76%)->460M(3%) >> [20,690s][info ? ][gc ? ? ?] GC(8) Garbage Collection >> (Allocation Rate) 12670M(82%)->726M(5%) >> [24,376s][info ? ][gc ? ? ?] GC(9) Garbage Collection >> (Allocation Rate) 13422M(87%)->224M(1%) >> [28,152s][info ? ][gc ? ? ?] GC(10) Garbage Collection >> (Proactive) 13474M(88%)->650M(4%) >> [31,526s][info ? ][gc ? ? ?] GC(11) Garbage Collection >> (Allocation Rate) 12072M(79%)->1472M(10%) >> [34,754s][info ? ][gc ? ? ?] GC(12) Garbage Collection >> (Allocation Rate) 13050M(85%)->330M(2%) >> [38,478s][info ? ][gc ? ? ?] GC(13) Garbage Collection >> (Allocation Rate) 13288M(87%)->762M(5%) >> [41,936s][info ? ][gc ? ? ?] GC(14) Garbage Collection >> (Proactive) 13294M(87%)->504M(3%) >> [45,353s][info ? ][gc ? ? ?] GC(15) Garbage Collection >> (Allocation Rate) 12984M(85%)->268M(2%) >> [48,861s][info ? ][gc ? ? ?] GC(16) Garbage Collection >> (Allocation Rate) 13008M(85%)->306M(2%) >> [52,133s][info ? ][gc ? ? ?] GC(17) Garbage Collection >> (Proactive) 12042M(78%)->538M(4%) >> [55,705s][info ? ][gc ? ? ?] GC(18) Garbage Collection >> (Allocation Rate) 12420M(81%)->1842M(12%) >> [59,000s][info ? ][gc ? ? ?] GC(19) Garbage Collection >> (Allocation Rate) 12458M(81%)->1422M(9%) >> [64,501s][info ? ][gc ? ? ?] Allocation Stall (Test >> worker) 59,673ms >> [64,742s][info ? ][gc ? ? ?] Allocation Stall (Test >> worker) 240,077ms >> [65,806s][info ? ][gc ? ? ?] GC(20) Garbage Collection >> (Allocation Rate) 13808M(90%)->6936M(45%) >> [66,476s][info ? ][gc ? ? ?] GC(21) Garbage Collection >> (Allocation Stall) 7100M(46%)->4478M(29%) >> [69,471s][info ? ][gc ? ? ?] GC(22) Garbage Collection >> (Allocation Rate) 10098M(66%)->5888M(38%) >> [72,252s][info ? ][gc ? ? ?] GC(23) Garbage Collection >> (Allocation Rate) 11226M(73%)->5816M(38%) >> >> ... >> >> So even here I can see some allocation stalls. >> >> Running the Same with -XX:+ZGenerational in >> build.gradle probably using GraalVM does something >> differnt, but I don't know what... at least off-heap >> memory is exhausted at some point due to direct byte >> buffer usage!? >> >> So, I'm not sure what's the difference, though. 
>> >> With this: >> >> "-XX:+UseZGC", >> "-Xlog:gc*=debug:file=zgc-generational-detailed.log", >> "-XX:+ZGenerational", >> "-verbose:gc", >> "-XX:+HeapDumpOnOutOfMemoryError", >> "-XX:HeapDumpPath=heapdump.hprof", >> "-XX:MaxDirectMemorySize=2g", >> >> >> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648) >> at java.base/java.nio.Bits.reserveMemory(Bits.java:178) >> at java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) >> at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) >> at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) >> at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) >> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) >> at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) >> at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) >> >> >> >> Am Do., 15. Feb. 2024 um 12:05?Uhr schrieb Stefan >> Karlsson > >: >> >> Hi Johannes, >> >> We tried to look at the log files and the jfr >> files, but couldn't find >> an OotOfMemoryError in any of them. Do you think >> you could try to rerun >> and capture the entire GC log from the >> OutOfMemoryError run? >> >> A few things to note: >> >> 1) You seem to be running the Graal compiler. >> Graal doesn't support >> Generational ZGC, so you are going to run >> different compilers when you >> compare Singlegen ZGC with Generational ZGC. >> >> 2) It's not clear to me that the provided JFR >> files matches the provided >> log files. >> >> 3) The JFR files show that -XX:+UseLargePages are >> used, but the gc+init >> logs shows 'Large Page Support: Disabled', you >> might want to look into >> why that is the case. >> >> 4) The singlegen JFR file has a >> -Xlog:gc:g1-chicago.log line. It should >> probably be named zgc-chicago.log. >> >> Cheers, >> StefanK >> >> On 2024-02-14 17:36, Johannes Lichtenberger wrote: >> > Hello, >> > >> > a test of my little DB project fails using >> generational ZGC, but not >> > with ZGC and G1 (out of memory error). >> > >> > To be honest, I guess the allocation rate and >> thus GC pressure, when >> > traversing a resource in SirixDB is >> unacceptable. The strategy is to >> > create fine-grained nodes from JSON input and >> store these in a trie. >> > First, a 3,8Gb JSON file is shredded and >> imported. Next, a preorder >> > traversal of the generated trie traverses a trie >> (with leaf pages >> > storing 1024 nodes each and in total >> ~300_000_000 (and these are going >> > to be deserialized one by one). The pages are >> furthermore referenced >> > in memory through PageReference::setPage. >> Furthermore, a Caffeine page >> > cache caches the PageReferences (keys) and the >> pages (values) and sets >> > the reference back to null once entries are >> going to be evicted >> > (PageReference.setPage(null)). >> > >> > However, I think the whole strategy of having to >> have in-memory nodes >> > might not be best. Maybe it's better to use >> off-heap memory for the >> > pages itself with MemorySegments, but the pages >> are not of a fixed >> > size, thus it may get tricky. 
>> > >> > The test mentioned is this: >> > >> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 >> > >> > I can upload the JSON file somewhere for a >> couple of days if needed. >> > >> > Caused by: java.lang.OutOfMemoryError >> > ? ? at >> > >> java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) >> > ? ? at >> > >> java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) >> > ? ? at >> > >> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) >> > ? ? at >> > >> java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) >> > ? ? at >> > >> java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) >> > ? ? at >> > >> java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) >> > ? ? at >> > >> java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) >> > ? ? at >> > >> java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) >> > ? ? at >> > >> java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) >> > ? ? at >> > >> java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) >> > ? ? at >> > io.sirix.access.trx.page >> .NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) >> > >> > I've uploaded several JFR recordings and logs >> over here (maybe besides >> > the async profiler JFR files the zgc-detailed >> log is most interesting): >> > >> > >> https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core >> > >> > kind regards >> > Johannes >> From lichtenberger.johannes at gmail.com Fri Feb 16 15:47:02 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Fri, 16 Feb 2024 16:47:02 +0100 Subject: Generational ZGC issue In-Reply-To: <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> Message-ID: Thanks a lot for looking into it, I've added `-XX:MaxDirectMemorySize=2g` only recently, but without it failed as well, so not sure what the default is. Will definitely check your suggestions :-) Sadly I'm currently working alone on the project in my spare time (besides professionally switched from Java/Kotlin stuff to the embedded software world) and I'm not sure if the current architecture of Sirix is limited by too much GC pressure. I'd probably have to check Cassandra at some point and look into flame graphs and stuff for their integration tests, but maybe you can give some general insights/advice... Yesterday evening I switched to other JDKs (also I want to test with Shenandoah in particular), but I think especially the better escape analysis of the GraalVM is a huge plus in the case of SirixDB (for insertion on my laptop it's ~90s vs ~60s), but I think it should be faster and currently my suspicion is that garbage is a major performance issue. Maybe the GC pressure in general is a major issue, as in the CPU Flame graph IIRC the G1 had about 20% stack frames allocated and non generational ZGC even around 40% taking all threads into account. 
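For reference, the change suggested earlier amounts to a one-line swap when creating the serialization buffer. The sketch below is illustrative only; the surrounding method and the write call are assumptions, not code from the SirixDB sources.

    import net.openhft.chronicle.bytes.Bytes;

    // Back the elastic buffer by heap memory instead of a DirectByteBuffer, so freeing it
    // no longer depends on reference processing or on the (here disabled) System.gc() calls.
    public final class HeapBytesExample {
        static Bytes<?> serializePage() {
            Bytes<?> bytes = Bytes.elasticHeapByteBuffer(); // was: Bytes.elasticByteBuffer()
            bytes.writeInt(42); // stand-in for the real page serialization
            return bytes;       // caller still releases it, Chronicle Bytes is reference-counted
        }
    }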
So in general I'm thinking about backing the KeyValueLeafPages with MemorySegments, but I think due to variable sized pages it's getting tricky, plus I currently don't have the time for changing fundamental stuff and I'm even not sure if it'll bring a performance boost, as I have to adapt neighbour relationships of the nodes often and off-heap memory access might be slightly worse performance wise. What do you think? I've attached a memory flame graph and there it seems the byte array from deserializing each page is prominent, but that might be something I can't even avoid (after decompression via Snappy or via another lib and maybe also decryption in the future). As of now G1 with GraalVM seems to perform best (a little bit better than with non generational ZGC, but I thought ZGC or maybe Shenandoah would improve the situation). But as said I may have to generate way less garbage after all in general for good performance!? All in all maybe due to most objects die young maybe also the generational GCs are not needed (that said if enough RAM is available and the Caffeine Caches are sized accordingly most objects may die old). But apparently the byte arrays holding the page data still die young (in AbstractReader::deserialize). In fact I'm not even sure why they escape, but currently I'm on my phone. Kind regards Johannes Stefan Johansson schrieb am Fr., 16. Feb. 2024, 13:43: > Hi Johannes, > > We've spent some more time looking at this and getting the json-file to > reproduced it made it easy to verify our suspicions. Thanks for > uploading it. > > There are a few things playing together here. The test is making quite > heavy use of DirectByteBuffers and you limit the usage to 2G > (-XX:MaxDirectMemorySize=2g). The life cycle and freeing of the native > memory part of the DirectByteBuffer rely on reference processing. In > generational ZGC reference processing is only done during Major > collections and since the general GC preassure in this benchmark is very > low (most objects die young), we do not trigger that many Major > collections. > > Normaly this would not be a problem. To avoid throwing an out of memory > error (due to hitting the direct buffer memory limit) too early the JDK > triggers a System.gc(). This should trigger reference procesing and all > buffers that are no longer in use would be freed. Since you specify the > option -XX:+DisableExplicitGC all these calls to trigger GCs are ignored > and no direct memory will be freed. So in our testing, just removing > this flags makes the test pass. > > Another solution is to look at using HeapByteBuffers instead and don't > have to worry about the direct memory usage. The OpenHFT lib seems to > have support for this by just using elasticHeapByteBuffer(...) instead > of elasticByteBuffer(). > > Lastly, the reason for this working with non-generational ZGC is that it > does reference processing for every GC. > > Hope this helps, > StefanJ > > > On 2024-02-15 21:53, Johannes Lichtenberger wrote: > > It's a laptop, I've attached some details. > > > > Furthermore, if it seems worth digging deeper into the issue, the JSON > > file is here for one week: > > https://www.transfernow.net/dl/20240215j9NaPTc0 > > > > > > You'd have to unzip into bundles/sirix-core/src/test/resources/json, > > remove the @Disabled annotation and run the test > > JsonShredderTest::testChicagoDescendantAxis > > > > The test JVM parameters are specified in the parent build.gradle in the > > project root folder. 
> > > > The GitHub repo: https://github.com/sirixdb/sirix > > > > > > Screenshot from 2024-02-15 21-43-33.png > > > > kind regards > > Johannes > > > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth > > >: > > > > Just curious - what CPU, physical memory and OS are you using? > > Sent from my iPhone > > > >> On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger > >> >> > wrote: > >> > >> ? > >> I guess I don't know which JDK it picks for the tests, but I guess > >> OpenJDK > >> > >> Johannes Lichtenberger >> > schrieb am Do., 15. > >> Feb. 2024, 17:58: > >> > >> However, it's the same with: ./gradlew > >> -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 > >> :sirix-core:test --tests > >> > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > using OpenJDK hopefully > >> > >> Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes > >> Lichtenberger >> >: > >> > >> I've attached two logs, the first one without > >> -XX:+Generational, the second one with the option set, > >> even though I also saw, that generational ZGC is going to > >> be supported in GraalVM 24.1 in September... so not sure > >> what this does :) > >> > >> Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes > >> Lichtenberger >> >: > >> > >> Strange, so does it simply ignore the option? The > >> following is the beginning of the output from _non_ > >> generational ZGC: > >> > >> johannes at luna:~/IdeaProjects/sirix$ ./gradlew > >> > -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal > :sirix-core:test --tests > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > >> > >> > Configure project : > >> The 'sonarqube' task depends on compile tasks. This > >> behavior is now deprecated and will be removed in > >> version 5.x. To avoid implicit compilation, set > >> property 'sonar.gradle.skipCompile' to 'true' and make > >> sure your project is compiled, before analysis has > >> started. > >> The 'sonar' task depends on compile tasks. This > >> behavior is now deprecated and will be removed in > >> version 5.x. To avoid implicit compilation, set > >> property 'sonar.gradle.skipCompile' to 'true' and make > >> sure your project is compiled, before analysis has > >> started. > >> [1,627s][info ][gc ] GC(0) Garbage Collection > >> (Metadata GC Threshold) 84M(1%)->56M(0%) > >> > >> > Task :sirix-core:test > >> [0.001s][warning][pagesize] UseLargePages disabled, no > >> large pages configured and available on the system. 
> >> [1.253s][info ][gc ] Using The Z Garbage > Collector > >> > >> [2,930s][info ][gc ] GC(1) Garbage Collection > >> (Warmup) 1616M(11%)->746M(5%) > >> [4,445s][info ][gc ] GC(2) Garbage Collection > >> (Warmup) 3232M(21%)->750M(5%) > >> [5,751s][info ][gc ] GC(3) Garbage Collection > >> (Warmup) 4644M(30%)->1356M(9%) > >> [9,886s][info ][gc ] GC(4) Garbage Collection > >> (Allocation Rate) 10668M(69%)->612M(4%) > >> [10,406s][info ][gc ] GC(5) Garbage Collection > >> (Allocation Rate) 2648M(17%)->216M(1%) > >> [13,931s][info ][gc ] GC(6) Garbage Collection > >> (Allocation Rate) 11164M(73%)->1562M(10%) > >> [16,908s][info ][gc ] GC(7) Garbage Collection > >> (Allocation Rate) 11750M(76%)->460M(3%) > >> [20,690s][info ][gc ] GC(8) Garbage Collection > >> (Allocation Rate) 12670M(82%)->726M(5%) > >> [24,376s][info ][gc ] GC(9) Garbage Collection > >> (Allocation Rate) 13422M(87%)->224M(1%) > >> [28,152s][info ][gc ] GC(10) Garbage Collection > >> (Proactive) 13474M(88%)->650M(4%) > >> [31,526s][info ][gc ] GC(11) Garbage Collection > >> (Allocation Rate) 12072M(79%)->1472M(10%) > >> [34,754s][info ][gc ] GC(12) Garbage Collection > >> (Allocation Rate) 13050M(85%)->330M(2%) > >> [38,478s][info ][gc ] GC(13) Garbage Collection > >> (Allocation Rate) 13288M(87%)->762M(5%) > >> [41,936s][info ][gc ] GC(14) Garbage Collection > >> (Proactive) 13294M(87%)->504M(3%) > >> [45,353s][info ][gc ] GC(15) Garbage Collection > >> (Allocation Rate) 12984M(85%)->268M(2%) > >> [48,861s][info ][gc ] GC(16) Garbage Collection > >> (Allocation Rate) 13008M(85%)->306M(2%) > >> [52,133s][info ][gc ] GC(17) Garbage Collection > >> (Proactive) 12042M(78%)->538M(4%) > >> [55,705s][info ][gc ] GC(18) Garbage Collection > >> (Allocation Rate) 12420M(81%)->1842M(12%) > >> [59,000s][info ][gc ] GC(19) Garbage Collection > >> (Allocation Rate) 12458M(81%)->1422M(9%) > >> [64,501s][info ][gc ] Allocation Stall (Test > >> worker) 59,673ms > >> [64,742s][info ][gc ] Allocation Stall (Test > >> worker) 240,077ms > >> [65,806s][info ][gc ] GC(20) Garbage Collection > >> (Allocation Rate) 13808M(90%)->6936M(45%) > >> [66,476s][info ][gc ] GC(21) Garbage Collection > >> (Allocation Stall) 7100M(46%)->4478M(29%) > >> [69,471s][info ][gc ] GC(22) Garbage Collection > >> (Allocation Rate) 10098M(66%)->5888M(38%) > >> [72,252s][info ][gc ] GC(23) Garbage Collection > >> (Allocation Rate) 11226M(73%)->5816M(38%) > >> > >> ... > >> > >> So even here I can see some allocation stalls. > >> > >> Running the Same with -XX:+ZGenerational in > >> build.gradle probably using GraalVM does something > >> differnt, but I don't know what... at least off-heap > >> memory is exhausted at some point due to direct byte > >> buffer usage!? > >> > >> So, I'm not sure what's the difference, though. 
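One way to watch the direct-buffer growth mentioned above while the test runs is the platform BufferPoolMXBean; the "direct" pool is exactly what -XX:MaxDirectMemorySize caps. This small standalone helper is not part of the thread or the project:

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;

    // Periodically print how much native memory the "direct" buffer pool holds.
    public class DirectPoolWatcher {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                for (BufferPoolMXBean pool :
                        ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                    if ("direct".equals(pool.getName())) {
                        System.out.printf("direct buffers: count=%d, used=%d bytes, capacity=%d bytes%n",
                                pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
                    }
                }
                Thread.sleep(1_000);
            }
        }
    }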
> >> > >> With this: > >> > >> "-XX:+UseZGC", > >> "-Xlog:gc*=debug:file=zgc-generational-detailed.log", > >> "-XX:+ZGenerational", > >> "-verbose:gc", > >> "-XX:+HeapDumpOnOutOfMemoryError", > >> "-XX:HeapDumpPath=heapdump.hprof", > >> "-XX:MaxDirectMemorySize=2g", > >> > >> > >> Caused by: java.lang.OutOfMemoryError: Cannot reserve > 60000 bytes of direct buffer memory (allocated: 2147446560, limit: > 2147483648) > >> at > java.base/java.nio.Bits.reserveMemory(Bits.java:178) > >> at > java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) > >> at > java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) > >> at > net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) > >> at > net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) > >> at > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) > >> at > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) > >> at io.sirix.access.trx.page > .NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) > >> > >> > >> > >> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan > >> Karlsson >> >: > >> > >> Hi Johannes, > >> > >> We tried to look at the log files and the jfr > >> files, but couldn't find > >> an OotOfMemoryError in any of them. Do you think > >> you could try to rerun > >> and capture the entire GC log from the > >> OutOfMemoryError run? > >> > >> A few things to note: > >> > >> 1) You seem to be running the Graal compiler. > >> Graal doesn't support > >> Generational ZGC, so you are going to run > >> different compilers when you > >> compare Singlegen ZGC with Generational ZGC. > >> > >> 2) It's not clear to me that the provided JFR > >> files matches the provided > >> log files. > >> > >> 3) The JFR files show that -XX:+UseLargePages are > >> used, but the gc+init > >> logs shows 'Large Page Support: Disabled', you > >> might want to look into > >> why that is the case. > >> > >> 4) The singlegen JFR file has a > >> -Xlog:gc:g1-chicago.log line. It should > >> probably be named zgc-chicago.log. > >> > >> Cheers, > >> StefanK > >> > >> On 2024-02-14 17:36, Johannes Lichtenberger wrote: > >> > Hello, > >> > > >> > a test of my little DB project fails using > >> generational ZGC, but not > >> > with ZGC and G1 (out of memory error). > >> > > >> > To be honest, I guess the allocation rate and > >> thus GC pressure, when > >> > traversing a resource in SirixDB is > >> unacceptable. The strategy is to > >> > create fine-grained nodes from JSON input and > >> store these in a trie. > >> > First, a 3,8Gb JSON file is shredded and > >> imported. Next, a preorder > >> > traversal of the generated trie traverses a trie > >> (with leaf pages > >> > storing 1024 nodes each and in total > >> ~300_000_000 (and these are going > >> > to be deserialized one by one). The pages are > >> furthermore referenced > >> > in memory through PageReference::setPage. > >> Furthermore, a Caffeine page > >> > cache caches the PageReferences (keys) and the > >> pages (values) and sets > >> > the reference back to null once entries are > >> going to be evicted > >> > (PageReference.setPage(null)). > >> > > >> > However, I think the whole strategy of having to > >> have in-memory nodes > >> > might not be best. Maybe it's better to use > >> off-heap memory for the > >> > pages itself with MemorySegments, but the pages > >> are not of a fixed > >> > size, thus it may get tricky. 
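For what it's worth, the MemorySegment idea mentioned just above could start from something like the following sketch. The layout (a length prefix followed by the serialized node bytes) is purely an assumption, not SirixDB's page format; variable-sized pages are handled by sizing each segment individually, and closing the Arena frees the native memory deterministically, without any reference processing.

    import java.lang.foreign.Arena;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;

    // Illustrative off-heap page holder (Java 21 FFM API); confined to the creating thread.
    final class OffHeapPage implements AutoCloseable {
        private final Arena arena = Arena.ofConfined();
        private final MemorySegment segment;

        OffHeapPage(byte[] serializedNodes) {
            segment = arena.allocate(Integer.BYTES + serializedNodes.length);
            segment.set(ValueLayout.JAVA_INT, 0, serializedNodes.length);
            MemorySegment.copy(serializedNodes, 0, segment, ValueLayout.JAVA_BYTE,
                    Integer.BYTES, serializedNodes.length);
        }

        byte[] read() {
            int length = segment.get(ValueLayout.JAVA_INT, 0);
            return segment.asSlice(Integer.BYTES, length).toArray(ValueLayout.JAVA_BYTE);
        }

        @Override
        public void close() {
            arena.close(); // releases the native memory immediately
        }
    }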
> >> > > >> > The test mentioned is this: > >> > > >> > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > < > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > > > >> > > >> > I can upload the JSON file somewhere for a > >> couple of days if needed. > >> > > >> > Caused by: java.lang.OutOfMemoryError > >> > at > >> > > >> > java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) > >> > at > >> > > >> > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) > >> > at > >> > > >> > java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) > >> > at > >> > > >> > java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) > >> > at > >> > > >> > java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) > >> > at > >> > > >> > java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) > >> > at > >> > > >> > java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > >> > at > >> > > >> > java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > >> > at > >> > > >> > java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > >> > at > >> > > >> > java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) > >> > at > >> > io.sirix.access.trx.page > >> >.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) > >> > > >> > I've uploaded several JFR recordings and logs > >> over here (maybe besides > >> > the async profiler JFR files the zgc-detailed > >> log is most interesting): > >> > > >> > > >> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core < > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core> > >> > > >> > kind regards > >> > Johannes > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot_from_2024-02-14_21-07-36.png Type: image/png Size: 197639 bytes Desc: not available URL: From stefan.johansson at oracle.com Fri Feb 16 16:37:43 2024 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 16 Feb 2024 17:37:43 +0100 Subject: Generational ZGC issue In-Reply-To: References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> Message-ID: Hi, Some comments inline. On 2024-02-16 16:47, Johannes Lichtenberger wrote: > Thanks a lot for looking into it, I've added > `-XX:MaxDirectMemorySize=2g` only recently, but without it failed as > well,? so not sure what the default is. Will definitely check your > suggestions :-) > If you don't set a limit it will be set to: Runtime.getRuntime().maxMemory() So likely a good idea to set a reasonable limit, but the smaller the limit is the more frequent we need to run reference processing to allow memory to be freed up. > Sadly I'm currently working alone on the project in my spare time > (besides professionally switched from Java/Kotlin stuff to the embedded > software world) and I'm not sure if the current architecture of Sirix is > limited by too much GC pressure. 
I'd probably have to check Cassandra at > some point and look into flame graphs and stuff for their integration > tests, but maybe you can give some general insights/advice... > > Yesterday evening I switched to other JDKs (also I want to test with > Shenandoah in particular), but I think especially the better escape > analysis of the GraalVM is a huge plus in the case of SirixDB (for > insertion on my laptop it's ~90s vs ~60s), but I think it should be > faster and currently my suspicion is that garbage is a major performance > issue. > > Maybe the GC pressure in general is a major issue, as in the CPU Flame > graph IIRC the G1 had about 20% stack frames allocated and non > generational ZGC even around 40% taking all threads into account. >

From what I/we see, the GC pressure in the given test is not high. The allocation rate is below 1GB/s and since most of it dies young the GCs are fairly cheap. In this log snippet G1 shows a GC every 5s and the pause time is below 50ms:
[296,016s][info ][gc ] GC(90) Pause Young (Normal) (G1 Evacuation Pause) 5413M->1849M(6456M) 35,577ms
[301,103s][info ][gc ] GC(91) Pause Young (Normal) (G1 Evacuation Pause) 5417M->1848M(6456M) 33,357ms
[306,041s][info ][gc ] GC(92) Pause Young (Normal) (G1 Evacuation Pause) 5416M->1848M(6456M) 32,763ms
[310,849s][info ][gc ] GC(93) Pause Young (Normal) (G1 Evacuation Pause) 5416M->1847M(6456M) 33,086ms

I also see that the heap never expands to more than ~6.5GB even though it is allowed to be 15GB, and this also suggests that the GC is not under much pressure. As I said in the previous mail, the reason Generational ZGC doesn't free up the direct memory without the System.gc() calls is that the GC pressure is not high enough to trigger any Major cycles. So I would strongly recommend you not to run with -XX:+DisableExplicitGC unless you really have to, since you are using DirectByteBuffers and they rely on System.gc() to help free memory when the limit is reached.

> So in general I'm thinking about backing the KeyValueLeafPages with > MemorySegments, but I think due to variable sized pages it's getting > tricky, plus I currently don't have the time for changing fundamental > stuff and I'm even not sure if it'll bring a performance boost, as I > have to adapt neighbour relationships of the nodes often and off-heap > memory access might be slightly worse performance wise. > > What do you think? >

I know too little about the application to be able to give advice here, but I would first start with having most memory on heap. Only large, long-lived stuff off-heap, if really needed. Looking at the test at hand, it really doesn't look like it is long-lived stuff that is placed off heap.

> I've attached a memory flame graph and there it seems the byte array > from deserializing each page is prominent, but that might be something I > can't even avoid (after decompression via Snappy or via another lib and > maybe also decryption in the future). > > As of now G1 with GraalVM seems to perform best (a little bit better > than with non generational ZGC, but I thought ZGC or maybe Shenandoah > would improve the situation). But as said I may have to generate way > less garbage after all in general for good performance!? > > All in all maybe due to most objects die young maybe also the > generational GCs are not needed (that said if enough RAM is available > and the Caffeine Caches are sized accordingly most objects may die old). > But apparently the byte arrays holding the page data still die young (in > AbstractReader::deserialize).
In fact I'm not even sure why they escape, > but currently I'm on my phone. > It's when most objects die young the Generational GC really shines, because it can handle the short lived objects without having to look at the long lived objects. So I would say Generational ZGC is a good fit here, but we need to let the System.gc() run to allow reference processing or slightly re-design and use HeapByteBuffers. Have a nice weekend, Stefan > Kind regards > Johannes > > Stefan Johansson > schrieb am Fr., 16. Feb. 2024, 13:43: > > Hi Johannes, > > We've spent some more time looking at this and getting the json-file to > reproduced it made it easy to verify our suspicions. Thanks for > uploading it. > > There are a few things playing together here. The test is making quite > heavy use of DirectByteBuffers and you limit the usage to 2G > (-XX:MaxDirectMemorySize=2g). The life cycle and freeing of the native > memory part of the DirectByteBuffer rely on reference processing. In > generational ZGC reference processing is only done during Major > collections and since the general GC preassure in this benchmark is > very > low (most objects die young), we do not trigger that many Major > collections. > > Normaly this would not be a problem. To avoid throwing an out of memory > error (due to hitting the direct buffer memory limit) too early the JDK > triggers a System.gc(). This should trigger reference procesing and all > buffers that are no longer in use would be freed. Since you specify the > option -XX:+DisableExplicitGC all these calls to trigger GCs are > ignored > and no direct memory will be freed. So in our testing, just removing > this flags makes the test pass. > > Another solution is to look at using HeapByteBuffers instead and don't > have to worry about the direct memory usage. The OpenHFT lib seems to > have support for this by just using elasticHeapByteBuffer(...) instead > of elasticByteBuffer(). > > Lastly, the reason for this working with non-generational ZGC is > that it > does reference processing for every GC. > > Hope this helps, > StefanJ > > > On 2024-02-15 21:53, Johannes Lichtenberger wrote: > > It's a laptop, I've attached some details. > > > > Furthermore, if it seems worth digging deeper into the issue, the > JSON > > file is here for one week: > > https://www.transfernow.net/dl/20240215j9NaPTc0 > > > > > > > > You'd have to unzip into bundles/sirix-core/src/test/resources/json, > > remove the?@Disabled annotation and run the test > > JsonShredderTest::testChicagoDescendantAxis > > > > The test JVM parameters are specified in the parent build.gradle > in the > > project root folder. > > > > The GitHub repo: https://github.com/sirixdb/sirix > > > > > > > > Screenshot from 2024-02-15 21-43-33.png > > > > kind regards > > Johannes > > > > Am Do., 15. Feb. 2024 um 20:01?Uhr schrieb Peter Booth > > > >>: > > > >? ? ?Just curious - what CPU, physical memory and OS are you using? > >? ? ?Sent from my iPhone > > > >>? ? ?On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger > >>? ? ? > >>? ? ? >> wrote: > >> > >>? ? ?? > >>? ? ?I guess I don't know which JDK it picks for the tests, but I > guess > >>? ? ?OpenJDK > >> > >>? ? ?Johannes Lichtenberger > >>? ? ? >> schrieb am Do., 15. > >>? ? ?Feb. 2024, 17:58: > >> > >>? ? ? ? ?However, it's the same with:?./gradlew > >>? ? ? ? ?-Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 > >>? ? ? ? ?:sirix-core:test --tests > >> > ?io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis? 
?using OpenJDK hopefully > >> > >>? ? ? ? ?Am Do., 15. Feb. 2024 um 17:54?Uhr schrieb Johannes > >>? ? ? ? ?Lichtenberger > >>? ? ? ? ? >>: > >> > >>? ? ? ? ? ? ?I've attached two logs, the first one without > >>? ? ? ? ? ? ?-XX:+Generational, the second one with the option set, > >>? ? ? ? ? ? ?even though I also saw, that generational ZGC is > going to > >>? ? ? ? ? ? ?be supported in GraalVM 24.1 in September... so not sure > >>? ? ? ? ? ? ?what this does :) > >> > >>? ? ? ? ? ? ?Am Do., 15. Feb. 2024 um 17:52?Uhr schrieb Johannes > >>? ? ? ? ? ? ?Lichtenberger > >>? ? ? ? ? ? ? >>: > >> > >>? ? ? ? ? ? ? ? ?Strange, so does it simply ignore the option? The > >>? ? ? ? ? ? ? ? ?following is the beginning of the output from _non_ > >>? ? ? ? ? ? ? ? ?generational ZGC: > >> > >>? ? ? ? ? ? ? ? ?johannes at luna:~/IdeaProjects/sirix$ ./gradlew > >> > ?-Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > >> > >>? ? ? ? ? ? ? ? ?> Configure project : > >>? ? ? ? ? ? ? ? ?The 'sonarqube' task depends on compile tasks. This > >>? ? ? ? ? ? ? ? ?behavior is now deprecated and will be removed in > >>? ? ? ? ? ? ? ? ?version 5.x. To avoid implicit compilation, set > >>? ? ? ? ? ? ? ? ?property 'sonar.gradle.skipCompile' to 'true' > and make > >>? ? ? ? ? ? ? ? ?sure your project is compiled, before analysis has > >>? ? ? ? ? ? ? ? ?started. > >>? ? ? ? ? ? ? ? ?The 'sonar' task depends on compile tasks. This > >>? ? ? ? ? ? ? ? ?behavior is now deprecated and will be removed in > >>? ? ? ? ? ? ? ? ?version 5.x. To avoid implicit compilation, set > >>? ? ? ? ? ? ? ? ?property 'sonar.gradle.skipCompile' to 'true' > and make > >>? ? ? ? ? ? ? ? ?sure your project is compiled, before analysis has > >>? ? ? ? ? ? ? ? ?started. > >>? ? ? ? ? ? ? ? ?[1,627s][info ? ][gc ? ? ?] GC(0) Garbage Collection > >>? ? ? ? ? ? ? ? ?(Metadata GC Threshold) 84M(1%)->56M(0%) > >> > >>? ? ? ? ? ? ? ? ?> Task :sirix-core:test > >>? ? ? ? ? ? ? ? ?[0.001s][warning][pagesize] UseLargePages > disabled, no > >>? ? ? ? ? ? ? ? ?large pages configured and available on the system. > >>? ? ? ? ? ? ? ? ?[1.253s][info ? ][gc ? ? ?] Using The Z Garbage > Collector > >> > >>? ? ? ? ? ? ? ? ?[2,930s][info ? ][gc ? ? ?] GC(1) Garbage Collection > >>? ? ? ? ? ? ? ? ?(Warmup) 1616M(11%)->746M(5%) > >>? ? ? ? ? ? ? ? ?[4,445s][info ? ][gc ? ? ?] GC(2) Garbage Collection > >>? ? ? ? ? ? ? ? ?(Warmup) 3232M(21%)->750M(5%) > >>? ? ? ? ? ? ? ? ?[5,751s][info ? ][gc ? ? ?] GC(3) Garbage Collection > >>? ? ? ? ? ? ? ? ?(Warmup) 4644M(30%)->1356M(9%) > >>? ? ? ? ? ? ? ? ?[9,886s][info ? ][gc ? ? ?] GC(4) Garbage Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 10668M(69%)->612M(4%) > >>? ? ? ? ? ? ? ? ?[10,406s][info ? ][gc ? ? ?] GC(5) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 2648M(17%)->216M(1%) > >>? ? ? ? ? ? ? ? ?[13,931s][info ? ][gc ? ? ?] GC(6) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 11164M(73%)->1562M(10%) > >>? ? ? ? ? ? ? ? ?[16,908s][info ? ][gc ? ? ?] GC(7) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 11750M(76%)->460M(3%) > >>? ? ? ? ? ? ? ? ?[20,690s][info ? ][gc ? ? ?] GC(8) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12670M(82%)->726M(5%) > >>? ? ? ? ? ? ? ? ?[24,376s][info ? ][gc ? ? ?] GC(9) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13422M(87%)->224M(1%) > >>? ? ? ? ? ? ? ? ?[28,152s][info ? ][gc ? ? ?] 
GC(10) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Proactive) 13474M(88%)->650M(4%) > >>? ? ? ? ? ? ? ? ?[31,526s][info ? ][gc ? ? ?] GC(11) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12072M(79%)->1472M(10%) > >>? ? ? ? ? ? ? ? ?[34,754s][info ? ][gc ? ? ?] GC(12) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13050M(85%)->330M(2%) > >>? ? ? ? ? ? ? ? ?[38,478s][info ? ][gc ? ? ?] GC(13) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13288M(87%)->762M(5%) > >>? ? ? ? ? ? ? ? ?[41,936s][info ? ][gc ? ? ?] GC(14) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Proactive) 13294M(87%)->504M(3%) > >>? ? ? ? ? ? ? ? ?[45,353s][info ? ][gc ? ? ?] GC(15) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12984M(85%)->268M(2%) > >>? ? ? ? ? ? ? ? ?[48,861s][info ? ][gc ? ? ?] GC(16) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13008M(85%)->306M(2%) > >>? ? ? ? ? ? ? ? ?[52,133s][info ? ][gc ? ? ?] GC(17) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Proactive) 12042M(78%)->538M(4%) > >>? ? ? ? ? ? ? ? ?[55,705s][info ? ][gc ? ? ?] GC(18) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12420M(81%)->1842M(12%) > >>? ? ? ? ? ? ? ? ?[59,000s][info ? ][gc ? ? ?] GC(19) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12458M(81%)->1422M(9%) > >>? ? ? ? ? ? ? ? ?[64,501s][info ? ][gc ? ? ?] Allocation Stall (Test > >>? ? ? ? ? ? ? ? ?worker) 59,673ms > >>? ? ? ? ? ? ? ? ?[64,742s][info ? ][gc ? ? ?] Allocation Stall (Test > >>? ? ? ? ? ? ? ? ?worker) 240,077ms > >>? ? ? ? ? ? ? ? ?[65,806s][info ? ][gc ? ? ?] GC(20) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13808M(90%)->6936M(45%) > >>? ? ? ? ? ? ? ? ?[66,476s][info ? ][gc ? ? ?] GC(21) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Stall) 7100M(46%)->4478M(29%) > >>? ? ? ? ? ? ? ? ?[69,471s][info ? ][gc ? ? ?] GC(22) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 10098M(66%)->5888M(38%) > >>? ? ? ? ? ? ? ? ?[72,252s][info ? ][gc ? ? ?] GC(23) Garbage > Collection > >>? ? ? ? ? ? ? ? ?(Allocation Rate) 11226M(73%)->5816M(38%) > >> > >>? ? ? ? ? ? ? ? ?... > >> > >>? ? ? ? ? ? ? ? ?So even here I can see some allocation stalls. > >> > >>? ? ? ? ? ? ? ? ?Running the Same with -XX:+ZGenerational in > >>? ? ? ? ? ? ? ? ?build.gradle probably using GraalVM does something > >>? ? ? ? ? ? ? ? ?differnt, but I don't know what... at least off-heap > >>? ? ? ? ? ? ? ? ?memory is exhausted at some point due to direct byte > >>? ? ? ? ? ? ? ? ?buffer usage!? > >> > >>? ? ? ? ? ? ? ? ?So, I'm not sure what's the difference, though. > >> > >>? ? ? ? ? ? ? ? ?With this: > >> > >>? ? ? ? ? ? ? ? ?"-XX:+UseZGC", > >> > ?"-Xlog:gc*=debug:file=zgc-generational-detailed.log", > >>? ? ? ? ? ? ? ? ?"-XX:+ZGenerational", > >>? ? ? ? ? ? ? ? ?"-verbose:gc", > >>? ? ? ? ? ? ? ? ?"-XX:+HeapDumpOnOutOfMemoryError", > >>? ? ? ? ? ? ? ? ?"-XX:HeapDumpPath=heapdump.hprof", > >>? ? ? ? ? ? ? ? ?"-XX:MaxDirectMemorySize=2g", > >> > >> > >>? ? ? ? ? ? ? ? ?Caused by: java.lang.OutOfMemoryError: Cannot > reserve 60000 bytes of direct buffer memory (allocated: 2147446560, > limit: 2147483648) > >>? ? ? ? ? ? ? ? ? ? ? at > java.base/java.nio.Bits.reserveMemory(Bits.java:178) > >>? ? ? ? ? ? ? ? ? ? ? at > java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) > >>? ? ? ? ? ? ? ? ? ? ? at > java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) > >>? ? ? ? ? ? ? ? ? ? ? 
at > net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) > >>? ? ? ? ? ? ? ? ? ? ? at > net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) > >>? ? ? ? ? ? ? ? ? ? ? at > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) > >>? ? ? ? ? ? ? ? ? ? ? at > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) > >>? ? ? ? ? ? ? ? ? ? ? at io.sirix.access.trx.page > .NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) > >> > >> > >> > >>? ? ? ? ? ? ? ? ?Am Do., 15. Feb. 2024 um 12:05?Uhr schrieb Stefan > >>? ? ? ? ? ? ? ? ?Karlsson > >>? ? ? ? ? ? ? ? ? >>: > >> > >>? ? ? ? ? ? ? ? ? ? ?Hi Johannes, > >> > >>? ? ? ? ? ? ? ? ? ? ?We tried to look at the log files and the jfr > >>? ? ? ? ? ? ? ? ? ? ?files, but couldn't find > >>? ? ? ? ? ? ? ? ? ? ?an OotOfMemoryError in any of them. Do you think > >>? ? ? ? ? ? ? ? ? ? ?you could try to rerun > >>? ? ? ? ? ? ? ? ? ? ?and capture the entire GC log from the > >>? ? ? ? ? ? ? ? ? ? ?OutOfMemoryError run? > >> > >>? ? ? ? ? ? ? ? ? ? ?A few things to note: > >> > >>? ? ? ? ? ? ? ? ? ? ?1) You seem to be running the Graal compiler. > >>? ? ? ? ? ? ? ? ? ? ?Graal doesn't support > >>? ? ? ? ? ? ? ? ? ? ?Generational ZGC, so you are going to run > >>? ? ? ? ? ? ? ? ? ? ?different compilers when you > >>? ? ? ? ? ? ? ? ? ? ?compare Singlegen ZGC with Generational ZGC. > >> > >>? ? ? ? ? ? ? ? ? ? ?2) It's not clear to me that the provided JFR > >>? ? ? ? ? ? ? ? ? ? ?files matches the provided > >>? ? ? ? ? ? ? ? ? ? ?log files. > >> > >>? ? ? ? ? ? ? ? ? ? ?3) The JFR files show that > -XX:+UseLargePages are > >>? ? ? ? ? ? ? ? ? ? ?used, but the gc+init > >>? ? ? ? ? ? ? ? ? ? ?logs shows 'Large Page Support: Disabled', you > >>? ? ? ? ? ? ? ? ? ? ?might want to look into > >>? ? ? ? ? ? ? ? ? ? ?why that is the case. > >> > >>? ? ? ? ? ? ? ? ? ? ?4) The singlegen JFR file has a > >>? ? ? ? ? ? ? ? ? ? ?-Xlog:gc:g1-chicago.log line. It should > >>? ? ? ? ? ? ? ? ? ? ?probably be named zgc-chicago.log. > >> > >>? ? ? ? ? ? ? ? ? ? ?Cheers, > >>? ? ? ? ? ? ? ? ? ? ?StefanK > >> > >>? ? ? ? ? ? ? ? ? ? ?On 2024-02-14 17:36, Johannes Lichtenberger > wrote: > >>? ? ? ? ? ? ? ? ? ? ?> Hello, > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> a test of my little DB project fails using > >>? ? ? ? ? ? ? ? ? ? ?generational ZGC, but not > >>? ? ? ? ? ? ? ? ? ? ?> with ZGC and G1 (out of memory error). > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> To be honest, I guess the allocation rate and > >>? ? ? ? ? ? ? ? ? ? ?thus GC pressure, when > >>? ? ? ? ? ? ? ? ? ? ?> traversing a resource in SirixDB is > >>? ? ? ? ? ? ? ? ? ? ?unacceptable. The strategy is to > >>? ? ? ? ? ? ? ? ? ? ?> create fine-grained nodes from JSON input and > >>? ? ? ? ? ? ? ? ? ? ?store these in a trie. > >>? ? ? ? ? ? ? ? ? ? ?> First, a 3,8Gb JSON file is shredded and > >>? ? ? ? ? ? ? ? ? ? ?imported. Next, a preorder > >>? ? ? ? ? ? ? ? ? ? ?> traversal of the generated trie traverses > a trie > >>? ? ? ? ? ? ? ? ? ? ?(with leaf pages > >>? ? ? ? ? ? ? ? ? ? ?> storing 1024 nodes each and in total > >>? ? ? ? ? ? ? ? ? ? ?~300_000_000 (and these are going > >>? ? ? ? ? ? ? ? ? ? ?> to be deserialized one by one). The pages are > >>? ? ? ? ? ? ? ? ? ? ?furthermore referenced > >>? ? ? ? ? ? ? ? ? ? ?> in memory through PageReference::setPage. > >>? ? ? ? ? ? ? ? ? ? ?Furthermore, a Caffeine page > >>? ? ? ? ? ? ? ? ? ? 
?> cache caches the PageReferences (keys) and the > >>? ? ? ? ? ? ? ? ? ? ?pages (values) and sets > >>? ? ? ? ? ? ? ? ? ? ?> the reference back to null once entries are > >>? ? ? ? ? ? ? ? ? ? ?going to be evicted > >>? ? ? ? ? ? ? ? ? ? ?> (PageReference.setPage(null)). > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> However, I think the whole strategy of > having to > >>? ? ? ? ? ? ? ? ? ? ?have in-memory nodes > >>? ? ? ? ? ? ? ? ? ? ?> might not be best. Maybe it's better to use > >>? ? ? ? ? ? ? ? ? ? ?off-heap memory for the > >>? ? ? ? ? ? ? ? ? ? ?> pages itself with MemorySegments, but the > pages > >>? ? ? ? ? ? ? ? ? ? ?are not of a fixed > >>? ? ? ? ? ? ? ? ? ? ?> size, thus it may get tricky. > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> The test mentioned is this: > >>? ? ? ? ? ? ? ? ? ? ?> > >> > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> I can upload the JSON file somewhere for a > >>? ? ? ? ? ? ? ? ? ? ?couple of days if needed. > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> Caused by: java.lang.OutOfMemoryError > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> > >> > ?java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) > >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >>? ? ? ? ? ? ? ? ? ? ?> io.sirix.access.trx.page > > >>? ? ? ? ? ? ? ? ? ? ? >.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> I've uploaded several JFR recordings and logs > >>? ? ? ? ? ? ? ? ? ? ?over here (maybe besides > >>? ? ? ? ? ? ? ? ? ? ?> the async profiler JFR files the zgc-detailed > >>? ? ? ? ? ? ? ? ? ? ?log is most interesting): > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> > >> https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core > > > >>? ? ? ? ? ? ? ? ? ? ?> > >>? ? ? ? ? ? ? ? ? ? ?> kind regards > >>? ? ? ? ? ? ? ? ? ? 
?> Johannes > >> > From lichtenberger.johannes at gmail.com Fri Feb 16 17:04:12 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Fri, 16 Feb 2024 18:04:12 +0100 Subject: Generational ZGC issue In-Reply-To: References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> Message-ID: Thanks a lot, I wasn't even aware of the fact, that DirectByteBuffers use System.gc() and I always had in mind that calling System.gc() at least in application code is bad practice (or at least we shouldn't rely on it) and I think I read somewhere a while ago, that it's recommended to even disable this, but may be completely wrong, of course. I'll change it to on-heap byte buffers tomorrow :-) I think your GC log entries were from G1, right? It seems ZGC always tries to use the full heap :-) Kind regards and thanks for sharing your insights. Have a nice weekend as well Johannes Stefan Johansson schrieb am Fr., 16. Feb. 2024, 17:38: > Hi, > > Some comments inline. > > On 2024-02-16 16:47, Johannes Lichtenberger wrote: > > Thanks a lot for looking into it, I've added > > `-XX:MaxDirectMemorySize=2g` only recently, but without it failed as > > well, so not sure what the default is. Will definitely check your > > suggestions :-) > > > If you don't set a limit it will be set to: > Runtime.getRuntime().maxMemory() > So likely a good idea to set a reasonable limit, but the smaller the > limit is the more frequent we need to run reference processing to allow > memory to be freed up. > > > Sadly I'm currently working alone on the project in my spare time > > (besides professionally switched from Java/Kotlin stuff to the embedded > > software world) and I'm not sure if the current architecture of Sirix is > > limited by too much GC pressure. I'd probably have to check Cassandra at > > some point and look into flame graphs and stuff for their integration > > tests, but maybe you can give some general insights/advice... > > > > Yesterday evening I switched to other JDKs (also I want to test with > > Shenandoah in particular), but I think especially the better escape > > analysis of the GraalVM is a huge plus in the case of SirixDB (for > > insertion on my laptop it's ~90s vs ~60s), but I think it should be > > faster and currently my suspicion is that garbage is a major performance > > issue. > > > > Maybe the GC pressure in general is a major issue, as in the CPU Flame > > graph IIRC the G1 had about 20% stack frames allocated and non > > generational ZGC even around 40% taking all threads into account. > > > > From what I/we see, the GC pressure in the given test is not high. The > allocation rate is below 1GB/s and since most of it die young the GCs > are fairly cheap. In this log snippet G1 shows a GC every 5s and the > pause time is below 50ms: > [296,016s][info ][gc ] GC(90) Pause Young (Normal) (G1 Evacuation > Pause) 5413M->1849M(6456M) 35,577ms > [301,103s][info ][gc ] GC(91) Pause Young (Normal) (G1 Evacuation > Pause) 5417M->1848M(6456M) 33,357ms > [306,041s][info ][gc ] GC(92) Pause Young (Normal) (G1 Evacuation > Pause) 5416M->1848M(6456M) 32,763ms > [310,849s][info ][gc ] GC(93) Pause Young (Normal) (G1 Evacuation > Pause) 5416M->1847M(6456M) 33,086ms > > I also see that the heap never expands to more the ~6.5GB even though it > is allow to be 15GB and this also suggest that the GC is not under much > pressure. 
As I said in the previous mail, the reason Generational ZGC > don't free up the direct memory without the System.gc() calls is that > the GC pressure is not high enough to trigger any Major cycles. So I > would strongly recommend you to not run with -XX+DisableExplicitGC > unless you really have to. Since you are using DirectByteBuffers and > they use System.gc() to help free memory when the limit is reached. > > > So in general I'm thinking about backing the KeyValueLeafPages with > > MemorySegments, but I think due to variable sized pages it's getting > > tricky, plus I currently don't have the time for changing fundamental > > stuff and I'm even not sure if it'll bring a performance boost, as I > > have to adapt neighbour relationships of the nodes often and off-heap > > memory access might be slightly worse performance wise. > > > > What do you think? > > > > I know to little about the application to be able to give advice here, > but I would first start with having most memory on heap. Only large long > lived stuff off-heap, if really needed. Looking at the test at hand, it > really doesn't look like it is long lived stuff that is placed off heap. > > > I've attached a memory flame graph and there it seems the byte array > > from deserializing each page is prominent, but that might be something I > > can't even avoid (after decompression via Snappy or via another lib and > > maybe also decryption in the future). > > > > As of now G1 with GraalVM seems to perform best (a little bit better > > than with non generational ZGC, but I thought ZGC or maybe Shenandoah > > would improve the situation). But as said I may have to generate way > > less garbage after all in general for good performance!? > > > > All in all maybe due to most objects die young maybe also the > > generational GCs are not needed (that said if enough RAM is available > > and the Caffeine Caches are sized accordingly most objects may die old). > > But apparently the byte arrays holding the page data still die young (in > > AbstractReader::deserialize). In fact I'm not even sure why they escape, > > but currently I'm on my phone. > > > > It's when most objects die young the Generational GC really shines, > because it can handle the short lived objects without having to look at > the long lived objects. So I would say Generational ZGC is a good fit > here, but we need to let the System.gc() run to allow reference > processing or slightly re-design and use HeapByteBuffers. > > Have a nice weekend, > Stefan > > > Kind regards > > Johannes > > > > Stefan Johansson > > schrieb am Fr., 16. Feb. 2024, > 13:43: > > > > Hi Johannes, > > > > We've spent some more time looking at this and getting the json-file > to > > reproduced it made it easy to verify our suspicions. Thanks for > > uploading it. > > > > There are a few things playing together here. The test is making > quite > > heavy use of DirectByteBuffers and you limit the usage to 2G > > (-XX:MaxDirectMemorySize=2g). The life cycle and freeing of the > native > > memory part of the DirectByteBuffer rely on reference processing. In > > generational ZGC reference processing is only done during Major > > collections and since the general GC preassure in this benchmark is > > very > > low (most objects die young), we do not trigger that many Major > > collections. > > > > Normaly this would not be a problem. To avoid throwing an out of > memory > > error (due to hitting the direct buffer memory limit) too early the > JDK > > triggers a System.gc(). 
This should trigger reference procesing and > all > > buffers that are no longer in use would be freed. Since you specify > the > > option -XX:+DisableExplicitGC all these calls to trigger GCs are > > ignored > > and no direct memory will be freed. So in our testing, just removing > > this flags makes the test pass. > > > > Another solution is to look at using HeapByteBuffers instead and > don't > > have to worry about the direct memory usage. The OpenHFT lib seems to > > have support for this by just using elasticHeapByteBuffer(...) > instead > > of elasticByteBuffer(). > > > > Lastly, the reason for this working with non-generational ZGC is > > that it > > does reference processing for every GC. > > > > Hope this helps, > > StefanJ > > > > > > On 2024-02-15 21:53, Johannes Lichtenberger wrote: > > > It's a laptop, I've attached some details. > > > > > > Furthermore, if it seems worth digging deeper into the issue, the > > JSON > > > file is here for one week: > > > https://www.transfernow.net/dl/20240215j9NaPTc0 > > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeDXD5_-$ > > > > > > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeDXD5_-$ > >> > > > > > > You'd have to unzip into > bundles/sirix-core/src/test/resources/json, > > > remove the @Disabled annotation and run the test > > > JsonShredderTest::testChicagoDescendantAxis > > > > > > The test JVM parameters are specified in the parent build.gradle > > in the > > > project root folder. > > > > > > The GitHub repo: https://github.com/sirixdb/sirix > > < > https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ > > > > > > < > https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ > >> > > > > > > Screenshot from 2024-02-15 21-43-33.png > > > > > > kind regards > > > Johannes > > > > > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth > > > > > >>: > > > > > > Just curious - what CPU, physical memory and OS are you using? > > > Sent from my iPhone > > > > > >> On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger > > >> > > > >> > >> wrote: > > >> > > >> ? > > >> I guess I don't know which JDK it picks for the tests, but I > > guess > > >> OpenJDK > > >> > > >> Johannes Lichtenberger > > > >> > >> schrieb am Do., 15. > > >> Feb. 2024, 17:58: > > >> > > >> However, it's the same with: ./gradlew > > >> > -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 > > >> :sirix-core:test --tests > > >> > > > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > using OpenJDK hopefully > > >> > > >> Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes > > >> Lichtenberger > > > >> > >>: > > >> > > >> I've attached two logs, the first one without > > >> -XX:+Generational, the second one with the option > set, > > >> even though I also saw, that generational ZGC is > > going to > > >> be supported in GraalVM 24.1 in September... so not > sure > > >> what this does :) > > >> > > >> Am Do., 15. Feb. 
2024 um 17:52 Uhr schrieb Johannes > > >> Lichtenberger > > > >> > >>: > > >> > > >> Strange, so does it simply ignore the option? The > > >> following is the beginning of the output from > _non_ > > >> generational ZGC: > > >> > > >> johannes at luna:~/IdeaProjects/sirix$ ./gradlew > > >> > > > -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal > :sirix-core:test --tests > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > > >> > > >> > Configure project : > > >> The 'sonarqube' task depends on compile tasks. > This > > >> behavior is now deprecated and will be removed in > > >> version 5.x. To avoid implicit compilation, set > > >> property 'sonar.gradle.skipCompile' to 'true' > > and make > > >> sure your project is compiled, before analysis > has > > >> started. > > >> The 'sonar' task depends on compile tasks. This > > >> behavior is now deprecated and will be removed in > > >> version 5.x. To avoid implicit compilation, set > > >> property 'sonar.gradle.skipCompile' to 'true' > > and make > > >> sure your project is compiled, before analysis > has > > >> started. > > >> [1,627s][info ][gc ] GC(0) Garbage > Collection > > >> (Metadata GC Threshold) 84M(1%)->56M(0%) > > >> > > >> > Task :sirix-core:test > > >> [0.001s][warning][pagesize] UseLargePages > > disabled, no > > >> large pages configured and available on the > system. > > >> [1.253s][info ][gc ] Using The Z Garbage > > Collector > > >> > > >> [2,930s][info ][gc ] GC(1) Garbage > Collection > > >> (Warmup) 1616M(11%)->746M(5%) > > >> [4,445s][info ][gc ] GC(2) Garbage > Collection > > >> (Warmup) 3232M(21%)->750M(5%) > > >> [5,751s][info ][gc ] GC(3) Garbage > Collection > > >> (Warmup) 4644M(30%)->1356M(9%) > > >> [9,886s][info ][gc ] GC(4) Garbage > Collection > > >> (Allocation Rate) 10668M(69%)->612M(4%) > > >> [10,406s][info ][gc ] GC(5) Garbage > > Collection > > >> (Allocation Rate) 2648M(17%)->216M(1%) > > >> [13,931s][info ][gc ] GC(6) Garbage > > Collection > > >> (Allocation Rate) 11164M(73%)->1562M(10%) > > >> [16,908s][info ][gc ] GC(7) Garbage > > Collection > > >> (Allocation Rate) 11750M(76%)->460M(3%) > > >> [20,690s][info ][gc ] GC(8) Garbage > > Collection > > >> (Allocation Rate) 12670M(82%)->726M(5%) > > >> [24,376s][info ][gc ] GC(9) Garbage > > Collection > > >> (Allocation Rate) 13422M(87%)->224M(1%) > > >> [28,152s][info ][gc ] GC(10) Garbage > > Collection > > >> (Proactive) 13474M(88%)->650M(4%) > > >> [31,526s][info ][gc ] GC(11) Garbage > > Collection > > >> (Allocation Rate) 12072M(79%)->1472M(10%) > > >> [34,754s][info ][gc ] GC(12) Garbage > > Collection > > >> (Allocation Rate) 13050M(85%)->330M(2%) > > >> [38,478s][info ][gc ] GC(13) Garbage > > Collection > > >> (Allocation Rate) 13288M(87%)->762M(5%) > > >> [41,936s][info ][gc ] GC(14) Garbage > > Collection > > >> (Proactive) 13294M(87%)->504M(3%) > > >> [45,353s][info ][gc ] GC(15) Garbage > > Collection > > >> (Allocation Rate) 12984M(85%)->268M(2%) > > >> [48,861s][info ][gc ] GC(16) Garbage > > Collection > > >> (Allocation Rate) 13008M(85%)->306M(2%) > > >> [52,133s][info ][gc ] GC(17) Garbage > > Collection > > >> (Proactive) 12042M(78%)->538M(4%) > > >> [55,705s][info ][gc ] GC(18) Garbage > > Collection > > >> (Allocation Rate) 12420M(81%)->1842M(12%) > > >> [59,000s][info ][gc ] GC(19) Garbage > > Collection > > >> (Allocation Rate) 12458M(81%)->1422M(9%) > > >> [64,501s][info ][gc ] Allocation Stall > (Test > > >> worker) 59,673ms > > >> [64,742s][info ][gc ] Allocation Stall > 
(Test > > >> worker) 240,077ms > > >> [65,806s][info ][gc ] GC(20) Garbage > > Collection > > >> (Allocation Rate) 13808M(90%)->6936M(45%) > > >> [66,476s][info ][gc ] GC(21) Garbage > > Collection > > >> (Allocation Stall) 7100M(46%)->4478M(29%) > > >> [69,471s][info ][gc ] GC(22) Garbage > > Collection > > >> (Allocation Rate) 10098M(66%)->5888M(38%) > > >> [72,252s][info ][gc ] GC(23) Garbage > > Collection > > >> (Allocation Rate) 11226M(73%)->5816M(38%) > > >> > > >> ... > > >> > > >> So even here I can see some allocation stalls. > > >> > > >> Running the Same with -XX:+ZGenerational in > > >> build.gradle probably using GraalVM does > something > > >> differnt, but I don't know what... at least > off-heap > > >> memory is exhausted at some point due to direct > byte > > >> buffer usage!? > > >> > > >> So, I'm not sure what's the difference, though. > > >> > > >> With this: > > >> > > >> "-XX:+UseZGC", > > >> > > "-Xlog:gc*=debug:file=zgc-generational-detailed.log", > > >> "-XX:+ZGenerational", > > >> "-verbose:gc", > > >> "-XX:+HeapDumpOnOutOfMemoryError", > > >> "-XX:HeapDumpPath=heapdump.hprof", > > >> "-XX:MaxDirectMemorySize=2g", > > >> > > >> > > >> Caused by: java.lang.OutOfMemoryError: Cannot > > reserve 60000 bytes of direct buffer memory (allocated: 2147446560, > > limit: 2147483648) > > >> at > > java.base/java.nio.Bits.reserveMemory(Bits.java:178) > > >> at > > java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) > > >> at > > java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) > > >> at > > > net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) > > >> at > > > net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) > > >> at > > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) > > >> at > > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) > > >> at io.sirix.access.trx.page > > < > https://urldefense.com/v3/__http://io.sirix.access.trx.page__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeoArpQf$ > >.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) > > >> > > >> > > >> > > >> Am Do., 15. Feb. 2024 um 12:05 Uhr schrieb Stefan > > >> Karlsson > > > >> > >>: > > >> > > >> Hi Johannes, > > >> > > >> We tried to look at the log files and the jfr > > >> files, but couldn't find > > >> an OotOfMemoryError in any of them. Do you > think > > >> you could try to rerun > > >> and capture the entire GC log from the > > >> OutOfMemoryError run? > > >> > > >> A few things to note: > > >> > > >> 1) You seem to be running the Graal compiler. > > >> Graal doesn't support > > >> Generational ZGC, so you are going to run > > >> different compilers when you > > >> compare Singlegen ZGC with Generational ZGC. > > >> > > >> 2) It's not clear to me that the provided JFR > > >> files matches the provided > > >> log files. > > >> > > >> 3) The JFR files show that > > -XX:+UseLargePages are > > >> used, but the gc+init > > >> logs shows 'Large Page Support: Disabled', > you > > >> might want to look into > > >> why that is the case. > > >> > > >> 4) The singlegen JFR file has a > > >> -Xlog:gc:g1-chicago.log line. It should > > >> probably be named zgc-chicago.log. 
> > >> > > >> Cheers, > > >> StefanK > > >> > > >> On 2024-02-14 17:36, Johannes Lichtenberger > > wrote: > > >> > Hello, > > >> > > > >> > a test of my little DB project fails using > > >> generational ZGC, but not > > >> > with ZGC and G1 (out of memory error). > > >> > > > >> > To be honest, I guess the allocation rate > and > > >> thus GC pressure, when > > >> > traversing a resource in SirixDB is > > >> unacceptable. The strategy is to > > >> > create fine-grained nodes from JSON input > and > > >> store these in a trie. > > >> > First, a 3,8Gb JSON file is shredded and > > >> imported. Next, a preorder > > >> > traversal of the generated trie traverses > > a trie > > >> (with leaf pages > > >> > storing 1024 nodes each and in total > > >> ~300_000_000 (and these are going > > >> > to be deserialized one by one). The pages > are > > >> furthermore referenced > > >> > in memory through PageReference::setPage. > > >> Furthermore, a Caffeine page > > >> > cache caches the PageReferences (keys) and > the > > >> pages (values) and sets > > >> > the reference back to null once entries are > > >> going to be evicted > > >> > (PageReference.setPage(null)). > > >> > > > >> > However, I think the whole strategy of > > having to > > >> have in-memory nodes > > >> > might not be best. Maybe it's better to use > > >> off-heap memory for the > > >> > pages itself with MemorySegments, but the > > pages > > >> are not of a fixed > > >> > size, thus it may get tricky. > > >> > > > >> > The test mentioned is this: > > >> > > > >> > > > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > < > https://urldefense.com/v3/__https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java*L69__;Iw!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFRtH2qmJ$> > < > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > < > https://urldefense.com/v3/__https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java*L69__;Iw!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFRtH2qmJ$ > >> > > >> > > > >> > I can upload the JSON file somewhere for a > > >> couple of days if needed. 
> > >> > > > >> > Caused by: java.lang.OutOfMemoryError > > >> > at > > >> > > > >> > > > java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) > > >> > at > > >> > > > >> > > > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) > > >> > at > > >> > > > >> > > > java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) > > >> > at > > >> > > > >> > > > java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) > > >> > at > > >> > > > >> > > > java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) > > >> > at > > >> > > > >> > > > java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) > > >> > at > > >> > > > >> > > > java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > > >> > at > > >> > > > >> > > > java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > > >> > at > > >> > > > >> > > > java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > > >> > at > > >> > > > >> > > > java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) > > >> > at > > >> > io.sirix.access.trx.page > > < > https://urldefense.com/v3/__http://io.sirix.access.trx.page__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeoArpQf$ > > > > >> > < > https://urldefense.com/v3/__http://io.sirix.access.trx.page__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeoArpQf$ > >>.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) > > >> > > > >> > I've uploaded several JFR recordings and > logs > > >> over here (maybe besides > > >> > the async profiler JFR files the > zgc-detailed > > >> log is most interesting): > > >> > > > >> > > > >> https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core > > < > https://urldefense.com/v3/__https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFYBlqOOx$> > https://urldefense.com/v3/__https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFYBlqOOx$ > >> > > >> > > > >> > kind regards > > >> > Johannes > > >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.johansson at oracle.com Fri Feb 16 20:54:55 2024 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 16 Feb 2024 21:54:55 +0100 Subject: Generational ZGC issue In-Reply-To: References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> Message-ID: <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com> On 2024-02-16 18:04, Johannes Lichtenberger wrote: > Thanks a lot, I wasn't even aware of the fact, that DirectByteBuffers > use System.gc() and I always had in mind that calling System.gc() at > least in application code is bad practice (or at least we shouldn't rely > on it) and I think I read somewhere a while ago, that it's recommended > to even disable this, but may be completely wrong, of course. > In most cases callling System.gc() is bad practice, in some special cases it might be needed. 
> I'll change it to on-heap byte buffers tomorrow :-) > > I think your GC log entries were from G1, right? It seems ZGC always > tries to use the full heap :-) > Yes, the snippet was G1, it was mostly to show that the pressure isn't high. You are correct that ZGC uses more of the given heap but the collections are pretty far apart and I'm certian it would function well with a smaller heap as well. Maybe in that case some Major collections would be triggered. > Kind regards and thanks for sharing your insights. > No problem. We appriciate the feedback, StefanJ > Have a nice weekend as well > Johannes > > Stefan Johansson > schrieb am Fr., 16. Feb. 2024, 17:38: > > Hi, > > Some comments inline. > > On 2024-02-16 16:47, Johannes Lichtenberger wrote: > > Thanks a lot for looking into it, I've added > > `-XX:MaxDirectMemorySize=2g` only recently, but without it failed as > > well,? so not sure what the default is. Will definitely check your > > suggestions :-) > > > If you don't set a limit it will be set to: > Runtime.getRuntime().maxMemory() > So likely a good idea to set a reasonable limit, but the smaller the > limit is the more frequent we need to run reference processing to allow > memory to be freed up. > > > Sadly I'm currently working alone on the project in my spare time > > (besides professionally switched from Java/Kotlin stuff to the > embedded > > software world) and I'm not sure if the current architecture of > Sirix is > > limited by too much GC pressure. I'd probably have to check > Cassandra at > > some point and look into flame graphs and stuff for their > integration > > tests, but maybe you can give some general insights/advice... > > > > Yesterday evening I switched to other JDKs (also I want to test with > > Shenandoah in particular), but I think especially the better escape > > analysis of the GraalVM is a huge plus in the case of SirixDB (for > > insertion on my laptop it's ~90s vs ~60s),? but I think it should be > > faster and currently my suspicion is that garbage is a major > performance > > issue. > > > > Maybe the GC pressure in general is a major issue, as in the CPU > Flame > > graph IIRC the G1 had about 20% stack frames allocated and non > > generational ZGC even around 40% taking all threads into account. > > > > ?From what I/we see, the GC pressure in the given test is not high. > The > allocation rate is below 1GB/s and since most of it die young the GCs > are fairly cheap. In this log snippet G1 shows a GC every 5s and the > pause time is below 50ms: > [296,016s][info? ?][gc? ? ? ] GC(90) Pause Young (Normal) (G1 > Evacuation > Pause) 5413M->1849M(6456M) 35,577ms > [301,103s][info? ?][gc? ? ? ] GC(91) Pause Young (Normal) (G1 > Evacuation > Pause) 5417M->1848M(6456M) 33,357ms > [306,041s][info? ?][gc? ? ? ] GC(92) Pause Young (Normal) (G1 > Evacuation > Pause) 5416M->1848M(6456M) 32,763ms > [310,849s][info? ?][gc? ? ? ] GC(93) Pause Young (Normal) (G1 > Evacuation > Pause) 5416M->1847M(6456M) 33,086ms > > I also see that the heap never expands to more the ~6.5GB even > though it > is allow to be 15GB and this also suggest that the GC is not under much > pressure. As I said in the previous mail, the reason Generational ZGC > don't free up the direct memory without the System.gc() calls is that > the GC pressure is not high enough to trigger any Major cycles. So I > would strongly recommend you to not run with -XX+DisableExplicitGC > unless you really have to. 
Since you are using DirectByteBuffers and > they use System.gc() to help free memory when the limit is reached. > > > So in general I'm thinking about backing the KeyValueLeafPages with > > MemorySegments, but I think due to variable sized pages it's getting > > tricky, plus I currently don't have the time for changing > fundamental > > stuff and I'm even not sure if it'll bring a performance boost, as I > > have to adapt neighbour relationships of the nodes often and > off-heap > > memory access might be slightly worse performance wise. > > > > What do you think? > > > > I know to little about the application to be able to give advice here, > but I would first start with having most memory on heap. Only large > long > lived stuff off-heap, if really needed. Looking at the test at hand, it > really doesn't look like it is long lived stuff that is placed off heap. > > > I've attached a memory flame graph and there it seems the byte array > > from deserializing each page is prominent, but that might be > something I > > can't even avoid (after decompression via Snappy or via another > lib and > > maybe also decryption in the future). > > > > As of now G1 with GraalVM seems to perform best (a little bit better > > than with non generational ZGC, but I thought ZGC or maybe > Shenandoah > > would improve the situation). But as said I may have to generate way > > less garbage after all in general for good performance!? > > > > All in all maybe due to most objects die young maybe also the > > generational GCs are not needed (that said if enough RAM is > available > > and the Caffeine Caches are sized accordingly most objects may > die old). > > But apparently the byte arrays holding the page data still die > young (in > > AbstractReader::deserialize). In fact I'm not even sure why they > escape, > > but currently I'm on my phone. > > > > It's when most objects die young the Generational GC really shines, > because it can handle the short lived objects without having to look at > the long lived objects. So I would say Generational ZGC is a good fit > here, but we need to let the System.gc() run to allow reference > processing or slightly re-design and use HeapByteBuffers. > > Have a nice weekend, > Stefan > > > Kind regards > > Johannes > > > > Stefan Johansson > > >> schrieb am Fr., 16. Feb. > 2024, 13:43: > > > >? ? ?Hi Johannes, > > > >? ? ?We've spent some more time looking at this and getting the > json-file to > >? ? ?reproduced it made it easy to verify our suspicions. Thanks for > >? ? ?uploading it. > > > >? ? ?There are a few things playing together here. The test is > making quite > >? ? ?heavy use of DirectByteBuffers and you limit the usage to 2G > >? ? ?(-XX:MaxDirectMemorySize=2g). The life cycle and freeing of > the native > >? ? ?memory part of the DirectByteBuffer rely on reference > processing. In > >? ? ?generational ZGC reference processing is only done during Major > >? ? ?collections and since the general GC preassure in this > benchmark is > >? ? ?very > >? ? ?low (most objects die young), we do not trigger that many Major > >? ? ?collections. > > > >? ? ?Normaly this would not be a problem. To avoid throwing an out > of memory > >? ? ?error (due to hitting the direct buffer memory limit) too > early the JDK > >? ? ?triggers a System.gc(). This should trigger reference > procesing and all > >? ? ?buffers that are no longer in use would be freed. Since you > specify the > >? ? ?option -XX:+DisableExplicitGC all these calls to trigger GCs are > >? ? ?ignored > >? ? 
?and no direct memory will be freed. So in our testing, just > removing > >? ? ?this flags makes the test pass. > > > >? ? ?Another solution is to look at using HeapByteBuffers instead > and don't > >? ? ?have to worry about the direct memory usage. The OpenHFT lib > seems to > >? ? ?have support for this by just using > elasticHeapByteBuffer(...) instead > >? ? ?of elasticByteBuffer(). > > > >? ? ?Lastly, the reason for this working with non-generational ZGC is > >? ? ?that it > >? ? ?does reference processing for every GC. > > > >? ? ?Hope this helps, > >? ? ?StefanJ > > > > > >? ? ?On 2024-02-15 21:53, Johannes Lichtenberger wrote: > >? ? ? > It's a laptop, I've attached some details. > >? ? ? > > >? ? ? > Furthermore, if it seems worth digging deeper into the > issue, the > >? ? ?JSON > >? ? ? > file is here for one week: > >? ? ? > https://www.transfernow.net/dl/20240215j9NaPTc0 > > > > ?> > >? ? ? > > > > ?>> > >? ? ? > > >? ? ? > You'd have to unzip into > bundles/sirix-core/src/test/resources/json, > >? ? ? > remove the?@Disabled annotation and run the test > >? ? ? > JsonShredderTest::testChicagoDescendantAxis > >? ? ? > > >? ? ? > The test JVM parameters are specified in the parent > build.gradle > >? ? ?in the > >? ? ? > project root folder. > >? ? ? > > >? ? ? > The GitHub repo: https://github.com/sirixdb/sirix > > > > ?> > >? ? ? > > > > ?>> > >? ? ? > > >? ? ? > Screenshot from 2024-02-15 21-43-33.png > >? ? ? > > >? ? ? > kind regards > >? ? ? > Johannes > >? ? ? > > >? ? ? > Am Do., 15. Feb. 2024 um 20:01?Uhr schrieb Peter Booth > >? ? ? > > > > >? ? ? > >>>: > >? ? ? > > >? ? ? >? ? ?Just curious - what CPU, physical memory and OS are > you using? > >? ? ? >? ? ?Sent from my iPhone > >? ? ? > > >? ? ? >>? ? ?On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger > >? ? ? >>? ? ? > >? ? ? > > >? ? ? >>? ? ? > >? ? ? >>> wrote: > >? ? ? >> > >? ? ? >>? ? ?? > >? ? ? >>? ? ?I guess I don't know which JDK it picks for the > tests, but I > >? ? ?guess > >? ? ? >>? ? ?OpenJDK > >? ? ? >> > >? ? ? >>? ? ?Johannes Lichtenberger > > >? ? ? > > >? ? ? >>? ? ? > >? ? ? >>> schrieb am Do., 15. > >? ? ? >>? ? ?Feb. 2024, 17:58: > >? ? ? >> > >? ? ? >>? ? ? ? ?However, it's the same with:?./gradlew > >? ? ? >> > ?-Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 > >? ? ? >>? ? ? ? ?:sirix-core:test --tests > >? ? ? >> > > > ?io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis? ?using OpenJDK hopefully > >? ? ? >> > >? ? ? >>? ? ? ? ?Am Do., 15. Feb. 2024 um 17:54?Uhr schrieb Johannes > >? ? ? >>? ? ? ? ?Lichtenberger > >? ? ? > > >? ? ? >>? ? ? ? ? > >? ? ? >>>: > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ?I've attached two logs, the first one without > >? ? ? >>? ? ? ? ? ? ?-XX:+Generational, the second one with the > option set, > >? ? ? >>? ? ? ? ? ? ?even though I also saw, that generational ZGC is > >? ? ?going to > >? ? ? >>? ? ? ? ? ? ?be supported in GraalVM 24.1 in September... > so not sure > >? ? ? >>? ? ? ? ? ? ?what this does :) > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ?Am Do., 15. Feb. 2024 um 17:52?Uhr schrieb > Johannes > >? ? ? >>? ? ? ? ? ? ?Lichtenberger > > >? ? ? > > >? ? ? >>? ? ? ? ? ? ? > >? ? ? >>>: > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?Strange, so does it simply ignore the > option? The > >? ? ? >>? ? ? ? ? ? ? ? ?following is the beginning of the output > from _non_ > >? ? ? >>? ? ? ? ? ? ? ? ?generational ZGC: > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?johannes at luna:~/IdeaProjects/sirix$ ./gradlew > >? ? ? 
>> > > > ?-Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?> Configure project : > >? ? ? >>? ? ? ? ? ? ? ? ?The 'sonarqube' task depends on compile > tasks. This > >? ? ? >>? ? ? ? ? ? ? ? ?behavior is now deprecated and will be > removed in > >? ? ? >>? ? ? ? ? ? ? ? ?version 5.x. To avoid implicit > compilation, set > >? ? ? >>? ? ? ? ? ? ? ? ?property 'sonar.gradle.skipCompile' to 'true' > >? ? ?and make > >? ? ? >>? ? ? ? ? ? ? ? ?sure your project is compiled, before > analysis has > >? ? ? >>? ? ? ? ? ? ? ? ?started. > >? ? ? >>? ? ? ? ? ? ? ? ?The 'sonar' task depends on compile > tasks. This > >? ? ? >>? ? ? ? ? ? ? ? ?behavior is now deprecated and will be > removed in > >? ? ? >>? ? ? ? ? ? ? ? ?version 5.x. To avoid implicit > compilation, set > >? ? ? >>? ? ? ? ? ? ? ? ?property 'sonar.gradle.skipCompile' to 'true' > >? ? ?and make > >? ? ? >>? ? ? ? ? ? ? ? ?sure your project is compiled, before > analysis has > >? ? ? >>? ? ? ? ? ? ? ? ?started. > >? ? ? >>? ? ? ? ? ? ? ? ?[1,627s][info ? ][gc ? ? ?] GC(0) Garbage > Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Metadata GC Threshold) 84M(1%)->56M(0%) > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?> Task :sirix-core:test > >? ? ? >>? ? ? ? ? ? ? ? ?[0.001s][warning][pagesize] UseLargePages > >? ? ?disabled, no > >? ? ? >>? ? ? ? ? ? ? ? ?large pages configured and available on > the system. > >? ? ? >>? ? ? ? ? ? ? ? ?[1.253s][info ? ][gc ? ? ?] Using The Z > Garbage > >? ? ?Collector > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?[2,930s][info ? ][gc ? ? ?] GC(1) Garbage > Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Warmup) 1616M(11%)->746M(5%) > >? ? ? >>? ? ? ? ? ? ? ? ?[4,445s][info ? ][gc ? ? ?] GC(2) Garbage > Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Warmup) 3232M(21%)->750M(5%) > >? ? ? >>? ? ? ? ? ? ? ? ?[5,751s][info ? ][gc ? ? ?] GC(3) Garbage > Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Warmup) 4644M(30%)->1356M(9%) > >? ? ? >>? ? ? ? ? ? ? ? ?[9,886s][info ? ][gc ? ? ?] GC(4) Garbage > Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 10668M(69%)->612M(4%) > >? ? ? >>? ? ? ? ? ? ? ? ?[10,406s][info ? ][gc ? ? ?] GC(5) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 2648M(17%)->216M(1%) > >? ? ? >>? ? ? ? ? ? ? ? ?[13,931s][info ? ][gc ? ? ?] GC(6) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 11164M(73%)->1562M(10%) > >? ? ? >>? ? ? ? ? ? ? ? ?[16,908s][info ? ][gc ? ? ?] GC(7) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 11750M(76%)->460M(3%) > >? ? ? >>? ? ? ? ? ? ? ? ?[20,690s][info ? ][gc ? ? ?] GC(8) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12670M(82%)->726M(5%) > >? ? ? >>? ? ? ? ? ? ? ? ?[24,376s][info ? ][gc ? ? ?] GC(9) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13422M(87%)->224M(1%) > >? ? ? >>? ? ? ? ? ? ? ? ?[28,152s][info ? ][gc ? ? ?] GC(10) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Proactive) 13474M(88%)->650M(4%) > >? ? ? >>? ? ? ? ? ? ? ? ?[31,526s][info ? ][gc ? ? ?] GC(11) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12072M(79%)->1472M(10%) > >? ? ? >>? ? ? ? ? ? ? ? ?[34,754s][info ? ][gc ? ? ?] GC(12) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13050M(85%)->330M(2%) > >? ? ? >>? ? ? ? ? ? ? ? ?[38,478s][info ? ][gc ? ? ?] GC(13) Garbage > >? ? 
?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13288M(87%)->762M(5%) > >? ? ? >>? ? ? ? ? ? ? ? ?[41,936s][info ? ][gc ? ? ?] GC(14) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Proactive) 13294M(87%)->504M(3%) > >? ? ? >>? ? ? ? ? ? ? ? ?[45,353s][info ? ][gc ? ? ?] GC(15) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12984M(85%)->268M(2%) > >? ? ? >>? ? ? ? ? ? ? ? ?[48,861s][info ? ][gc ? ? ?] GC(16) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13008M(85%)->306M(2%) > >? ? ? >>? ? ? ? ? ? ? ? ?[52,133s][info ? ][gc ? ? ?] GC(17) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Proactive) 12042M(78%)->538M(4%) > >? ? ? >>? ? ? ? ? ? ? ? ?[55,705s][info ? ][gc ? ? ?] GC(18) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12420M(81%)->1842M(12%) > >? ? ? >>? ? ? ? ? ? ? ? ?[59,000s][info ? ][gc ? ? ?] GC(19) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 12458M(81%)->1422M(9%) > >? ? ? >>? ? ? ? ? ? ? ? ?[64,501s][info ? ][gc ? ? ?] Allocation > Stall (Test > >? ? ? >>? ? ? ? ? ? ? ? ?worker) 59,673ms > >? ? ? >>? ? ? ? ? ? ? ? ?[64,742s][info ? ][gc ? ? ?] Allocation > Stall (Test > >? ? ? >>? ? ? ? ? ? ? ? ?worker) 240,077ms > >? ? ? >>? ? ? ? ? ? ? ? ?[65,806s][info ? ][gc ? ? ?] GC(20) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 13808M(90%)->6936M(45%) > >? ? ? >>? ? ? ? ? ? ? ? ?[66,476s][info ? ][gc ? ? ?] GC(21) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Stall) 7100M(46%)->4478M(29%) > >? ? ? >>? ? ? ? ? ? ? ? ?[69,471s][info ? ][gc ? ? ?] GC(22) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 10098M(66%)->5888M(38%) > >? ? ? >>? ? ? ? ? ? ? ? ?[72,252s][info ? ][gc ? ? ?] GC(23) Garbage > >? ? ?Collection > >? ? ? >>? ? ? ? ? ? ? ? ?(Allocation Rate) 11226M(73%)->5816M(38%) > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?... > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?So even here I can see some allocation > stalls. > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?Running the Same with -XX:+ZGenerational in > >? ? ? >>? ? ? ? ? ? ? ? ?build.gradle probably using GraalVM does > something > >? ? ? >>? ? ? ? ? ? ? ? ?differnt, but I don't know what... at > least off-heap > >? ? ? >>? ? ? ? ? ? ? ? ?memory is exhausted at some point due to > direct byte > >? ? ? >>? ? ? ? ? ? ? ? ?buffer usage!? > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?So, I'm not sure what's the difference, > though. > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?With this: > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?"-XX:+UseZGC", > >? ? ? >> > >? ? ? ?"-Xlog:gc*=debug:file=zgc-generational-detailed.log", > >? ? ? >>? ? ? ? ? ? ? ? ?"-XX:+ZGenerational", > >? ? ? >>? ? ? ? ? ? ? ? ?"-verbose:gc", > >? ? ? >>? ? ? ? ? ? ? ? ?"-XX:+HeapDumpOnOutOfMemoryError", > >? ? ? >>? ? ? ? ? ? ? ? ?"-XX:HeapDumpPath=heapdump.hprof", > >? ? ? >>? ? ? ? ? ? ? ? ?"-XX:MaxDirectMemorySize=2g", > >? ? ? >> > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?Caused by: java.lang.OutOfMemoryError: Cannot > >? ? ?reserve 60000 bytes of direct buffer memory (allocated: > 2147446560, > >? ? ?limit: 2147483648) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at > >? ? ?java.base/java.nio.Bits.reserveMemory(Bits.java:178) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at > > > ?java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at > >? ? ?java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? 
at > > > ?net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at > > > ?net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at > > > ?net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at > > > ?net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? at io.sirix.access.trx.page > > > > ?>.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) > >? ? ? >> > >? ? ? >> > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ?Am Do., 15. Feb. 2024 um 12:05?Uhr > schrieb Stefan > >? ? ? >>? ? ? ? ? ? ? ? ?Karlsson > >? ? ? > > >? ? ? >>? ? ? ? ? ? ? ? ? > >? ? ? >>>: > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?Hi Johannes, > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?We tried to look at the log files and > the jfr > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?files, but couldn't find > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?an OotOfMemoryError in any of them. > Do you think > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?you could try to rerun > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?and capture the entire GC log from the > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?OutOfMemoryError run? > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?A few things to note: > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?1) You seem to be running the Graal > compiler. > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?Graal doesn't support > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?Generational ZGC, so you are going to run > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?different compilers when you > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?compare Singlegen ZGC with > Generational ZGC. > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?2) It's not clear to me that the > provided JFR > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?files matches the provided > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?log files. > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?3) The JFR files show that > >? ? ?-XX:+UseLargePages are > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?used, but the gc+init > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?logs shows 'Large Page Support: > Disabled', you > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?might want to look into > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?why that is the case. > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?4) The singlegen JFR file has a > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?-Xlog:gc:g1-chicago.log line. It should > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?probably be named zgc-chicago.log. > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?Cheers, > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?StefanK > >? ? ? >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?On 2024-02-14 17:36, Johannes > Lichtenberger > >? ? ?wrote: > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> Hello, > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> a test of my little DB project > fails using > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?generational ZGC, but not > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> with ZGC and G1 (out of memory error). > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> To be honest, I guess the > allocation rate and > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?thus GC pressure, when > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> traversing a resource in SirixDB is > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?unacceptable. The strategy is to > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> create fine-grained nodes from JSON > input and > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?store these in a trie. > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> First, a 3,8Gb JSON file is > shredded and > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?imported. Next, a preorder > >? ? ? >>? ? ? ? ? ? ? ? ? ? 
?> traversal of the generated trie > traverses > >? ? ?a trie > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?(with leaf pages > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> storing 1024 nodes each and in total > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?~300_000_000 (and these are going > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> to be deserialized one by one). The > pages are > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?furthermore referenced > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> in memory through > PageReference::setPage. > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?Furthermore, a Caffeine page > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> cache caches the PageReferences > (keys) and the > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?pages (values) and sets > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> the reference back to null once > entries are > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?going to be evicted > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> (PageReference.setPage(null)). > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> However, I think the whole strategy of > >? ? ?having to > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?have in-memory nodes > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> might not be best. Maybe it's > better to use > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?off-heap memory for the > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> pages itself with MemorySegments, > but the > >? ? ?pages > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?are not of a fixed > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> size, thus it may get tricky. > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> The test mentioned is this: > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> I can upload the JSON file > somewhere for a > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?couple of days if needed. > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> Caused by: java.lang.OutOfMemoryError > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? 
>> > > > ?java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > > > ?java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> ? ? at > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> io.sirix.access.trx.page > > > > ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ? > > > ?>>.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> I've uploaded several JFR > recordings and logs > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?over here (maybe besides > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> the async profiler JFR files the > zgc-detailed > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?log is most interesting): > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core > > > > ?> >> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> kind regards > >? ? ? >>? ? ? ? ? ? ? ? ? ? ?> Johannes > >? ? ? >> > > > From erik.osterlund at oracle.com Fri Feb 16 21:38:50 2024 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Fri, 16 Feb 2024 21:38:50 +0000 Subject: Generational ZGC issue In-Reply-To: <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com> References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com> Message-ID: It?s worth noting that when using ZGC, calling System.gc does not invoke a classic disastrously long GC pause. Instead, a concurrent GC is triggered, which should be not that noticeable to the application. The thread calling System.gc is blocked until the GC is done, but the other threads can run freely. /Erik > On 16 Feb 2024, at 21:55, Stefan Johansson wrote: > > ? > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: >> Thanks a lot, I wasn't even aware of the fact, that DirectByteBuffers use System.gc() and I always had in mind that calling System.gc() at least in application code is bad practice (or at least we shouldn't rely on it) and I think I read somewhere a while ago, that it's recommended to even disable this, but may be completely wrong, of course. > In most cases callling System.gc() is bad practice, in some special cases it might be needed. > >> I'll change it to on-heap byte buffers tomorrow :-) >> I think your GC log entries were from G1, right? It seems ZGC always tries to use the full heap :-) > > Yes, the snippet was G1, it was mostly to show that the pressure isn't high. You are correct that ZGC uses more of the given heap but the collections are pretty far apart and I'm certian it would function well with a smaller heap as well. Maybe in that case some Major collections would be triggered. > >> Kind regards and thanks for sharing your insights. > > No problem. We appriciate the feedback, > StefanJ > >> Have a nice weekend as well >> Johannes >> Stefan Johansson > schrieb am Fr., 16. Feb. 2024, 17:38: >> Hi, >> Some comments inline. >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: >> > Thanks a lot for looking into it, I've added >> > `-XX:MaxDirectMemorySize=2g` only recently, but without it failed as >> > well, so not sure what the default is. 
Will definitely check your >> > suggestions :-) >> > >> If you don't set a limit it will be set to: >> Runtime.getRuntime().maxMemory() >> So likely a good idea to set a reasonable limit, but the smaller the >> limit is the more frequent we need to run reference processing to allow >> memory to be freed up. >> > Sadly I'm currently working alone on the project in my spare time >> > (besides professionally switched from Java/Kotlin stuff to the >> embedded >> > software world) and I'm not sure if the current architecture of >> Sirix is >> > limited by too much GC pressure. I'd probably have to check >> Cassandra at >> > some point and look into flame graphs and stuff for their >> integration >> > tests, but maybe you can give some general insights/advice... >> > >> > Yesterday evening I switched to other JDKs (also I want to test with >> > Shenandoah in particular), but I think especially the better escape >> > analysis of the GraalVM is a huge plus in the case of SirixDB (for >> > insertion on my laptop it's ~90s vs ~60s), but I think it should be >> > faster and currently my suspicion is that garbage is a major >> performance >> > issue. >> > >> > Maybe the GC pressure in general is a major issue, as in the CPU >> Flame >> > graph IIRC the G1 had about 20% stack frames allocated and non >> > generational ZGC even around 40% taking all threads into account. >> > >> From what I/we see, the GC pressure in the given test is not high. >> The >> allocation rate is below 1GB/s and since most of it die young the GCs >> are fairly cheap. In this log snippet G1 shows a GC every 5s and the >> pause time is below 50ms: >> [296,016s][info ][gc ] GC(90) Pause Young (Normal) (G1 >> Evacuation >> Pause) 5413M->1849M(6456M) 35,577ms >> [301,103s][info ][gc ] GC(91) Pause Young (Normal) (G1 >> Evacuation >> Pause) 5417M->1848M(6456M) 33,357ms >> [306,041s][info ][gc ] GC(92) Pause Young (Normal) (G1 >> Evacuation >> Pause) 5416M->1848M(6456M) 32,763ms >> [310,849s][info ][gc ] GC(93) Pause Young (Normal) (G1 >> Evacuation >> Pause) 5416M->1847M(6456M) 33,086ms >> I also see that the heap never expands to more the ~6.5GB even >> though it >> is allow to be 15GB and this also suggest that the GC is not under much >> pressure. As I said in the previous mail, the reason Generational ZGC >> don't free up the direct memory without the System.gc() calls is that >> the GC pressure is not high enough to trigger any Major cycles. So I >> would strongly recommend you to not run with -XX+DisableExplicitGC >> unless you really have to. Since you are using DirectByteBuffers and >> they use System.gc() to help free memory when the limit is reached. >> > So in general I'm thinking about backing the KeyValueLeafPages with >> > MemorySegments, but I think due to variable sized pages it's getting >> > tricky, plus I currently don't have the time for changing >> fundamental >> > stuff and I'm even not sure if it'll bring a performance boost, as I >> > have to adapt neighbour relationships of the nodes often and >> off-heap >> > memory access might be slightly worse performance wise. >> > >> > What do you think? >> > >> I know to little about the application to be able to give advice here, >> but I would first start with having most memory on heap. Only large >> long >> lived stuff off-heap, if really needed. Looking at the test at hand, it >> really doesn't look like it is long lived stuff that is placed off heap. 
>> > I've attached a memory flame graph and there it seems the byte array >> > from deserializing each page is prominent, but that might be >> something I >> > can't even avoid (after decompression via Snappy or via another >> lib and >> > maybe also decryption in the future). >> > >> > As of now G1 with GraalVM seems to perform best (a little bit better >> > than with non generational ZGC, but I thought ZGC or maybe >> Shenandoah >> > would improve the situation). But as said I may have to generate way >> > less garbage after all in general for good performance!? >> > >> > All in all maybe due to most objects die young maybe also the >> > generational GCs are not needed (that said if enough RAM is >> available >> > and the Caffeine Caches are sized accordingly most objects may >> die old). >> > But apparently the byte arrays holding the page data still die >> young (in >> > AbstractReader::deserialize). In fact I'm not even sure why they >> escape, >> > but currently I'm on my phone. >> > >> It's when most objects die young the Generational GC really shines, >> because it can handle the short lived objects without having to look at >> the long lived objects. So I would say Generational ZGC is a good fit >> here, but we need to let the System.gc() run to allow reference >> processing or slightly re-design and use HeapByteBuffers. >> Have a nice weekend, >> Stefan >> > Kind regards >> > Johannes >> > >> > Stefan Johansson > >> > > >> schrieb am Fr., 16. Feb. >> 2024, 13:43: >> > >> > Hi Johannes, >> > >> > We've spent some more time looking at this and getting the >> json-file to >> > reproduced it made it easy to verify our suspicions. Thanks for >> > uploading it. >> > >> > There are a few things playing together here. The test is >> making quite >> > heavy use of DirectByteBuffers and you limit the usage to 2G >> > (-XX:MaxDirectMemorySize=2g). The life cycle and freeing of >> the native >> > memory part of the DirectByteBuffer rely on reference >> processing. In >> > generational ZGC reference processing is only done during Major >> > collections and since the general GC preassure in this >> benchmark is >> > very >> > low (most objects die young), we do not trigger that many Major >> > collections. >> > >> > Normaly this would not be a problem. To avoid throwing an out >> of memory >> > error (due to hitting the direct buffer memory limit) too >> early the JDK >> > triggers a System.gc(). This should trigger reference >> procesing and all >> > buffers that are no longer in use would be freed. Since you >> specify the >> > option -XX:+DisableExplicitGC all these calls to trigger GCs are >> > ignored >> > and no direct memory will be freed. So in our testing, just >> removing >> > this flags makes the test pass. >> > >> > Another solution is to look at using HeapByteBuffers instead >> and don't >> > have to worry about the direct memory usage. The OpenHFT lib >> seems to >> > have support for this by just using >> elasticHeapByteBuffer(...) instead >> > of elasticByteBuffer(). >> > >> > Lastly, the reason for this working with non-generational ZGC is >> > that it >> > does reference processing for every GC. >> > >> > Hope this helps, >> > StefanJ >> > >> > >> > On 2024-02-15 21:53, Johannes Lichtenberger wrote: >> > > It's a laptop, I've attached some details. 
>> > > >> > > Furthermore, if it seems worth digging deeper into the >> issue, the >> > JSON >> > > file is here for one week: >> > > https://www.transfernow.net/dl/20240215j9NaPTc0 >> >> > > >> > > > >> > >> >> > > >> > > You'd have to unzip into >> bundles/sirix-core/src/test/resources/json, >> > > remove the @Disabled annotation and run the test >> > > JsonShredderTest::testChicagoDescendantAxis >> > > >> > > The test JVM parameters are specified in the parent >> build.gradle >> > in the >> > > project root folder. >> > > >> > > The GitHub repo: https://github.com/sirixdb/sirix >> >> > > >> > > > >> > >> >> > > >> > > Screenshot from 2024-02-15 21-43-33.png >> > > >> > > kind regards >> > > Johannes >> > > >> > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth >> > > >> > >> > >> >>>: >> > > >> > > Just curious - what CPU, physical memory and OS are >> you using? >> > > Sent from my iPhone >> > > >> > >> On Feb 15, 2024, at 12:23?PM, Johannes Lichtenberger >> > >> > >> > > > >> > >> > >> > > >>> wrote: >> > >> >> > >> ? >> > >> I guess I don't know which JDK it picks for the >> tests, but I >> > guess >> > >> OpenJDK >> > >> >> > >> Johannes Lichtenberger >> > >> > > > >> > >> > >> > > >>> schrieb am Do., 15. >> > >> Feb. 2024, 17:58: >> > >> >> > >> However, it's the same with: ./gradlew >> > >> -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 >> > >> :sirix-core:test --tests >> > >> >> > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis using OpenJDK hopefully >> > >> >> > >> Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes >> > >> Lichtenberger > >> > > > >> > >> > >> > > >>>: >> > >> >> > >> I've attached two logs, the first one without >> > >> -XX:+Generational, the second one with the >> option set, >> > >> even though I also saw, that generational ZGC is >> > going to >> > >> be supported in GraalVM 24.1 in September... >> so not sure >> > >> what this does :) >> > >> >> > >> Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb >> Johannes >> > >> Lichtenberger >> > >> > > > >> > >> > >> > > >>>: >> > >> >> > >> Strange, so does it simply ignore the >> option? The >> > >> following is the beginning of the output >> from _non_ >> > >> generational ZGC: >> > >> >> > >> johannes at luna:~/IdeaProjects/sirix$ ./gradlew >> > >> >> > -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis >> > >> >> > >> > Configure project : >> > >> The 'sonarqube' task depends on compile >> tasks. This >> > >> behavior is now deprecated and will be >> removed in >> > >> version 5.x. To avoid implicit >> compilation, set >> > >> property 'sonar.gradle.skipCompile' to 'true' >> > and make >> > >> sure your project is compiled, before >> analysis has >> > >> started. >> > >> The 'sonar' task depends on compile >> tasks. This >> > >> behavior is now deprecated and will be >> removed in >> > >> version 5.x. To avoid implicit >> compilation, set >> > >> property 'sonar.gradle.skipCompile' to 'true' >> > and make >> > >> sure your project is compiled, before >> analysis has >> > >> started. >> > >> [1,627s][info ][gc ] GC(0) Garbage >> Collection >> > >> (Metadata GC Threshold) 84M(1%)->56M(0%) >> > >> >> > >> > Task :sirix-core:test >> > >> [0.001s][warning][pagesize] UseLargePages >> > disabled, no >> > >> large pages configured and available on >> the system. 
>> > >> [1.253s][info ][gc ] Using The Z >> Garbage >> > Collector >> > >> >> > >> [2,930s][info ][gc ] GC(1) Garbage >> Collection >> > >> (Warmup) 1616M(11%)->746M(5%) >> > >> [4,445s][info ][gc ] GC(2) Garbage >> Collection >> > >> (Warmup) 3232M(21%)->750M(5%) >> > >> [5,751s][info ][gc ] GC(3) Garbage >> Collection >> > >> (Warmup) 4644M(30%)->1356M(9%) >> > >> [9,886s][info ][gc ] GC(4) Garbage >> Collection >> > >> (Allocation Rate) 10668M(69%)->612M(4%) >> > >> [10,406s][info ][gc ] GC(5) Garbage >> > Collection >> > >> (Allocation Rate) 2648M(17%)->216M(1%) >> > >> [13,931s][info ][gc ] GC(6) Garbage >> > Collection >> > >> (Allocation Rate) 11164M(73%)->1562M(10%) >> > >> [16,908s][info ][gc ] GC(7) Garbage >> > Collection >> > >> (Allocation Rate) 11750M(76%)->460M(3%) >> > >> [20,690s][info ][gc ] GC(8) Garbage >> > Collection >> > >> (Allocation Rate) 12670M(82%)->726M(5%) >> > >> [24,376s][info ][gc ] GC(9) Garbage >> > Collection >> > >> (Allocation Rate) 13422M(87%)->224M(1%) >> > >> [28,152s][info ][gc ] GC(10) Garbage >> > Collection >> > >> (Proactive) 13474M(88%)->650M(4%) >> > >> [31,526s][info ][gc ] GC(11) Garbage >> > Collection >> > >> (Allocation Rate) 12072M(79%)->1472M(10%) >> > >> [34,754s][info ][gc ] GC(12) Garbage >> > Collection >> > >> (Allocation Rate) 13050M(85%)->330M(2%) >> > >> [38,478s][info ][gc ] GC(13) Garbage >> > Collection >> > >> (Allocation Rate) 13288M(87%)->762M(5%) >> > >> [41,936s][info ][gc ] GC(14) Garbage >> > Collection >> > >> (Proactive) 13294M(87%)->504M(3%) >> > >> [45,353s][info ][gc ] GC(15) Garbage >> > Collection >> > >> (Allocation Rate) 12984M(85%)->268M(2%) >> > >> [48,861s][info ][gc ] GC(16) Garbage >> > Collection >> > >> (Allocation Rate) 13008M(85%)->306M(2%) >> > >> [52,133s][info ][gc ] GC(17) Garbage >> > Collection >> > >> (Proactive) 12042M(78%)->538M(4%) >> > >> [55,705s][info ][gc ] GC(18) Garbage >> > Collection >> > >> (Allocation Rate) 12420M(81%)->1842M(12%) >> > >> [59,000s][info ][gc ] GC(19) Garbage >> > Collection >> > >> (Allocation Rate) 12458M(81%)->1422M(9%) >> > >> [64,501s][info ][gc ] Allocation >> Stall (Test >> > >> worker) 59,673ms >> > >> [64,742s][info ][gc ] Allocation >> Stall (Test >> > >> worker) 240,077ms >> > >> [65,806s][info ][gc ] GC(20) Garbage >> > Collection >> > >> (Allocation Rate) 13808M(90%)->6936M(45%) >> > >> [66,476s][info ][gc ] GC(21) Garbage >> > Collection >> > >> (Allocation Stall) 7100M(46%)->4478M(29%) >> > >> [69,471s][info ][gc ] GC(22) Garbage >> > Collection >> > >> (Allocation Rate) 10098M(66%)->5888M(38%) >> > >> [72,252s][info ][gc ] GC(23) Garbage >> > Collection >> > >> (Allocation Rate) 11226M(73%)->5816M(38%) >> > >> >> > >> ... >> > >> >> > >> So even here I can see some allocation >> stalls. >> > >> >> > >> Running the Same with -XX:+ZGenerational in >> > >> build.gradle probably using GraalVM does >> something >> > >> differnt, but I don't know what... at >> least off-heap >> > >> memory is exhausted at some point due to >> direct byte >> > >> buffer usage!? >> > >> >> > >> So, I'm not sure what's the difference, >> though. 
>> > >> >> > >> With this: >> > >> >> > >> "-XX:+UseZGC", >> > >> >> > "-Xlog:gc*=debug:file=zgc-generational-detailed.log", >> > >> "-XX:+ZGenerational", >> > >> "-verbose:gc", >> > >> "-XX:+HeapDumpOnOutOfMemoryError", >> > >> "-XX:HeapDumpPath=heapdump.hprof", >> > >> "-XX:MaxDirectMemorySize=2g", >> > >> >> > >> >> > >> Caused by: java.lang.OutOfMemoryError: Cannot >> > reserve 60000 bytes of direct buffer memory (allocated: >> 2147446560, >> > limit: 2147483648) >> > >> at >> > java.base/java.nio.Bits.reserveMemory(Bits.java:178) >> > >> at >> > java.base/java.nio.DirectByteBuffer.(DirectByteBuffer.java:111) >> > >> at >> > java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360) >> > >> at >> > net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191) >> > >> at >> > net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192) >> > >> at >> > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176) >> > >> at >> > net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148) >> > >> at io.sirix.access.trx.page >> >> > >.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443) >> > >> >> > >> >> > >> >> > >> Am Do., 15. Feb. 2024 um 12:05 Uhr >> schrieb Stefan >> > >> Karlsson > >> > > > >> > >> > >> > > >>>: >> > >> >> > >> Hi Johannes, >> > >> >> > >> We tried to look at the log files and >> the jfr >> > >> files, but couldn't find >> > >> an OotOfMemoryError in any of them. >> Do you think >> > >> you could try to rerun >> > >> and capture the entire GC log from the >> > >> OutOfMemoryError run? >> > >> >> > >> A few things to note: >> > >> >> > >> 1) You seem to be running the Graal >> compiler. >> > >> Graal doesn't support >> > >> Generational ZGC, so you are going to run >> > >> different compilers when you >> > >> compare Singlegen ZGC with >> Generational ZGC. >> > >> >> > >> 2) It's not clear to me that the >> provided JFR >> > >> files matches the provided >> > >> log files. >> > >> >> > >> 3) The JFR files show that >> > -XX:+UseLargePages are >> > >> used, but the gc+init >> > >> logs shows 'Large Page Support: >> Disabled', you >> > >> might want to look into >> > >> why that is the case. >> > >> >> > >> 4) The singlegen JFR file has a >> > >> -Xlog:gc:g1-chicago.log line. It should >> > >> probably be named zgc-chicago.log. >> > >> >> > >> Cheers, >> > >> StefanK >> > >> >> > >> On 2024-02-14 17:36, Johannes >> Lichtenberger >> > wrote: >> > >> > Hello, >> > >> > >> > >> > a test of my little DB project >> fails using >> > >> generational ZGC, but not >> > >> > with ZGC and G1 (out of memory error). >> > >> > >> > >> > To be honest, I guess the >> allocation rate and >> > >> thus GC pressure, when >> > >> > traversing a resource in SirixDB is >> > >> unacceptable. The strategy is to >> > >> > create fine-grained nodes from JSON >> input and >> > >> store these in a trie. >> > >> > First, a 3,8Gb JSON file is >> shredded and >> > >> imported. Next, a preorder >> > >> > traversal of the generated trie >> traverses >> > a trie >> > >> (with leaf pages >> > >> > storing 1024 nodes each and in total >> > >> ~300_000_000 (and these are going >> > >> > to be deserialized one by one). The >> pages are >> > >> furthermore referenced >> > >> > in memory through >> PageReference::setPage. 
>> > >> Furthermore, a Caffeine page >> > >> > cache caches the PageReferences >> (keys) and the >> > >> pages (values) and sets >> > >> > the reference back to null once >> entries are >> > >> going to be evicted >> > >> > (PageReference.setPage(null)). >> > >> > >> > >> > However, I think the whole strategy of >> > having to >> > >> have in-memory nodes >> > >> > might not be best. Maybe it's >> better to use >> > >> off-heap memory for the >> > >> > pages itself with MemorySegments, >> but the >> > pages >> > >> are not of a fixed >> > >> > size, thus it may get tricky. >> > >> > >> > >> > The test mentioned is this: >> > >> > >> > >> >> > >> https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69 > >> >> > >> > >> > >> > I can upload the JSON file >> somewhere for a >> > >> couple of days if needed. >> > >> > >> > >> > Caused by: java.lang.OutOfMemoryError >> > >> > at >> > >> > >> > >> >> > java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) >> > >> > at >> > >> > >> > >> >> > java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) >> > >> > at >> > >> > >> > >> >> > java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233) >> > >> > at >> > >> > >> > >> >> > java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) >> > >> > at >> > >> > io.sirix.access.trx.page >> >> > > >> > >> > >> > >>.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442) >> > >> > >> > >> > I've uploaded several JFR >> recordings and logs >> > >> over here (maybe besides >> > >> > the async profiler JFR files the >> zgc-detailed >> > >> log is most interesting): >> > >> > >> > >> > >> > >> >> https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core >> >> > > >> >> > >> > >> > >> > kind regards >> > >> > Johannes >> > >> >> > From lichtenberger.johannes at gmail.com Fri Feb 16 23:36:09 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Sat, 17 Feb 2024 00:36:09 +0100 Subject: Generational ZGC issue In-Reply-To: References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com> Message-ID: I just removed "-XX+DisableExplizitGC", increased max direct memory size to 5g (-XX:MaxDirectMemorySize=5g), but also changed Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000); to use on heap ByteBuffers. However, the performance seems to be way worse. 
However, the performance seems to be way worse. I've repeated the test
several times, but with both G1 and non-generational ZGC it's ~50s for
importing the JSON file in the first case vs ~100s using generational ZGC,
using Temurin 21.0.2, with similar values for the actual traversals.

From the log on STDOUT, I can see this (meaning 0,319s and 0,440s...
pause times?):

[35,718s][info   ][gc      ] GC(9) Minor Collection (Allocation Rate) 12462M(81%)->1556M(10%) 0,319s
[40,871s][info   ][gc      ] GC(10) Minor Collection (Allocation Rate)
[41,311s][info   ][gc      ] GC(10) Minor Collection (Allocation Rate) 13088M(85%)->1432M(9%) 0,440s
[46,236s][info   ][gc      ] GC(11) Minor Collection (Allocation Rate)
[46,603s][info   ][gc      ] GC(11) Minor Collection (Allocation Rate) 12406M(81%)->1676M(11%) 0,367s
[51,445s][info   ][gc      ] GC(12) Minor Collection (Allocation Rate)
[51,846s][info   ][gc      ] GC(12) Minor Collection (Allocation Rate) 12848M(84%)->1556M(10%) 0,401s
[56,203s][info   ][gc      ] GC(13) Major Collection (Proactive)
[56,368s][info   ][gc      ] GC(13) Major Collection (Proactive) 11684M(76%)->484M(3%) 0,166s

kind regards
Johannes

On Fri, 16 Feb 2024 at 22:39, Erik Osterlund <erik.osterlund at oracle.com> wrote:

> It's worth noting that when using ZGC, calling System.gc does not invoke
> a classic disastrously long GC pause. Instead, a concurrent GC is
> triggered, which should not be that noticeable to the application. The
> thread calling System.gc is blocked until the GC is done, but the other
> threads can run freely.
>
> /Erik
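
To make that point concrete, a toy illustration (not from the project,
purely a sketch): one daemon thread keeps allocating while the main thread
calls System.gc(); run with -XX:+UseZGC, the collection triggered by the
call is concurrent, so only the calling thread waits for it to finish.

    public class ExplicitGcSketch {
        public static void main(String[] args) {
            Thread worker = new Thread(() -> {
                byte[][] chunks = new byte[64][];
                for (int i = 0; ; i = (i + 1) % chunks.length) {
                    chunks[i] = new byte[1 << 20]; // keep some allocation pressure going
                }
            });
            worker.setDaemon(true);
            worker.start();

            long start = System.nanoTime();
            // With ZGC this starts a concurrent cycle; the worker thread above
            // is not stopped for the duration, only this caller blocks until
            // the collection completes.
            System.gc();
            System.out.printf("System.gc() returned after %d ms%n",
                    (System.nanoTime() - start) / 1_000_000);
        }
    }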
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan.johansson at oracle.com Sat Feb 17 09:22:29 2024
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Sat, 17 Feb 2024 10:22:29 +0100
Subject: Generational ZGC issue
In-Reply-To:
References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com>
 <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com>
 <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com>
Message-ID:

On 2024-02-17 00:36, Johannes Lichtenberger wrote:
> I just removed "-XX:+DisableExplicitGC", increased the max direct memory
> size to 5g (-XX:MaxDirectMemorySize=5g), but also changed
> Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000) to use
> on-heap ByteBuffers.
>
Just for clarity: when using HeapByteBuffers, MaxDirectMemorySize has no
effect, since the ByteBuffers are stored on the heap. But if you keep
going with DirectByteBuffers, raising it might make sense to give some
more headroom.

> However, the performance seems to be way worse. I've repeated the test
> several times, but with both G1 and non-generational ZGC it's ~50s for
> importing the JSON file in the first case vs ~100s using generational
> ZGC, using Temurin 21.0.2, with similar values for the actual traversals.
>
Ok, sounds like using DirectByteBuffers is a performance win here. If so,
I would just continue testing with DirectByteBuffers and allow explicit
GCs to ensure they are cleaned out properly.
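
A small sketch of why allowing those explicit GCs matters on the
DirectByteBuffer path (assumptions: run with something like
-XX:MaxDirectMemorySize=64m; buffer sizes and loop counts are arbitrary):

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class DirectLimitSketch {
        public static void main(String[] args) {
            // Allocate and drop direct buffers. The native memory behind each
            // buffer is only released once its cleaner runs, which requires
            // reference processing by the GC. When the reserved amount nears
            // -XX:MaxDirectMemorySize, the JDK triggers System.gc() internally
            // to force that to happen before giving up.
            List<ByteBuffer> window = new ArrayList<>();
            for (int i = 0; i < 10_000; i++) {
                window.add(ByteBuffer.allocateDirect(1 << 20)); // 1 MiB each
                if (window.size() > 16) {
                    window.remove(0); // older buffers become unreachable garbage
                }
            }
            System.out.println("done without hitting the direct memory limit");
        }
    }

With -XX:+DisableExplicitGC added, those internally triggered System.gc()
calls become no-ops, and a loop like this can instead end in the
"Cannot reserve ... bytes of direct buffer memory" error seen earlier in
the thread.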
> From the log on STDOUT, I can see this (meaning 0,319s and 0,440s...
> pause times?)
>
No, with ZGC the time here is not the pause time, it's the time to
complete the whole GC. ZGC is a concurrent GC, meaning that most of the
GC work is done concurrently with the Java application still running.
There are still some very short pauses, all way below 1ms. You can see
them if you look at the detailed log:

[30,938s][info][gc          ] GC(3) Minor Collection (Allocation Rate)
[30,938s][info][gc,phases   ] GC(3) y: Young Generation
[30,938s][info][gc,phases   ] GC(3) y: Pause Mark Start 0,060ms
[31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark 383,563ms
[31,322s][info][gc,phases   ] GC(3) y: Pause Mark End 0,046ms
[31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark Free 0,009ms
[31,322s][info][gc,phases   ] GC(3) y: Concurrent Reset Relocation Set 0,201ms
[31,335s][info][gc,phases   ] GC(3) y: Concurrent Select Relocation Set 13,228ms
[31,335s][info][gc,phases   ] GC(3) y: Pause Relocate Start 0,019ms
[31,381s][info][gc,phases   ] GC(3) y: Concurrent Relocate 45,967ms
[31,382s][info][gc,phases   ] GC(3) y: Young Generation 9726M(63%)->518M(3%) 0,444s
[31,382s][info][gc          ] GC(3) Minor Collection (Allocation Rate) 9726M(63%)->518M(3%) 0,444s

Here I included the phase logs for a single collection of the young
generation, so you can see clearly how much time was spent in which part
of the GC, and that the three pauses are all very, very short.

Stefan
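
Aside: when eyeballing bigger logs for just those pause phases, a
throwaway filter can help (assuming a plain-text log produced with an
option like the -Xlog:gc*=debug:file=... flag quoted earlier in the
thread; file name and filter strings are illustrative):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class PauseLineFilter {
        public static void main(String[] args) throws IOException {
            Path log = Path.of(args.length > 0 ? args[0] : "zgc-detailed.log");
            try (var lines = Files.lines(log)) {
                // Keep only the gc,phases lines that report a pause,
                // e.g. "y: Pause Mark Start 0,060ms".
                lines.filter(l -> l.contains("gc,phases"))
                     .filter(l -> l.contains("Pause"))
                     .forEach(System.out::println);
            }
        }
    }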
From lichtenberger.johannes at gmail.com  Sat Feb 17 11:24:13 2024
From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger)
Date: Sat, 17 Feb 2024 12:24:13 +0100
Subject: Generational ZGC issue
In-Reply-To: 
References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com>
 <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com>
 <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com>
Message-ID: 

So, switching back to DirectByteBuffers, and removing the disabling of
explicit GCs, still crashes on my laptop (swapping + close to 32 Gb RAM
used)...

kind regards
Johannes

Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson
<stefan.johansson at oracle.com>:

> On 2024-02-17 00:36, Johannes Lichtenberger wrote:
> > I just removed "-XX:+DisableExplicitGC", increased the max direct memory
> > size to 5g (-XX:MaxDirectMemorySize=5g), but also changed
> > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000)
> > to use on-heap ByteBuffers.
>
> Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize has
> no effect since the ByteBuffers will be stored on the heap. But if you
> keep going with DirectByteBuffers, this might make sense to give some
> more head room.
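For checking how much direct memory is actually in use before picking such
a limit, the standard BufferPoolMXBean can be polled; a minimal sketch
(class name and output format chosen just for illustration, not SirixDB
code):

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;

    public class DirectMemoryProbe {
        public static void main(String[] args) {
            // The "direct" pool is what -XX:MaxDirectMemorySize limits;
            // the "mapped" pool covers memory-mapped files.
            for (BufferPoolMXBean pool :
                    ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                System.out.printf("%s: count=%d used=%d bytes capacity=%d bytes%n",
                        pool.getName(), pool.getCount(),
                        pool.getMemoryUsed(), pool.getTotalCapacity());
            }
        }
    }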
> > However, the performance seems to be way worse. I've repeated the test
> > several times, but with both G1 and non-generational ZGC it's ~50s for
> > importing the JSON file in the first case vs ~100s using generational
> > ZGC, using Temurin 21.0.2, with similar values for the actual traversals.
>
> Ok, sounds like using DirectByteBuffers is a performance win here. If so
> I would just continue testing using DirectByteBuffers and allowing
> explicit GCs to ensure they are cleaned out properly.
>
> > From the log on STDOUT, I can see this (meaning 0,319s and 0,440s...
> > pause times?)
>
> No, with ZGC the time here is not the pause time, it's the time to
> complete the whole GC. ZGC is a concurrent GC, meaning that most of the
> GC work is done concurrently with the Java application still running.
> There are still some very short pauses, all way below 1ms. You can see
> them if you look at the detailed log:
>
> [30,938s][info][gc          ] GC(3) Minor Collection (Allocation Rate)
> [30,938s][info][gc,phases   ] GC(3) y: Young Generation
> [30,938s][info][gc,phases   ] GC(3) y: Pause Mark Start 0,060ms
> [31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark 383,563ms
> [31,322s][info][gc,phases   ] GC(3) y: Pause Mark End 0,046ms
> [31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark Free 0,009ms
> [31,322s][info][gc,phases   ] GC(3) y: Concurrent Reset Relocation Set 0,201ms
> [31,335s][info][gc,phases   ] GC(3) y: Concurrent Select Relocation Set 13,228ms
> [31,335s][info][gc,phases   ] GC(3) y: Pause Relocate Start 0,019ms
> [31,381s][info][gc,phases   ] GC(3) y: Concurrent Relocate 45,967ms
> [31,382s][info][gc,phases   ] GC(3) y: Young Generation 9726M(63%)->518M(3%) 0,444s
> [31,382s][info][gc          ] GC(3) Minor Collection (Allocation Rate) 9726M(63%)->518M(3%) 0,444s
>
> Here I included the phase logs for a single GC of the young generation,
> where you can clearly see how much time was spent in which part of the
> GC, and as you can see the three pauses are all very, very short.
>
> Stefan
>
> > [35,718s][info   ][gc      ] GC(9) Minor Collection (Allocation Rate)
> > 12462M(81%)->1556M(10%) 0,319s
> > [40,871s][info   ][gc      ] GC(10) Minor Collection (Allocation Rate)
> > [41,311s][info   ][gc      ] GC(10) Minor Collection (Allocation Rate)
> > 13088M(85%)->1432M(9%) 0,440s
> > [46,236s][info   ][gc      ] GC(11) Minor Collection (Allocation Rate)
> > [46,603s][info   ][gc      ] GC(11) Minor Collection (Allocation Rate)
> > 12406M(81%)->1676M(11%) 0,367s
> > [51,445s][info   ][gc      ] GC(12) Minor Collection (Allocation Rate)
> > [51,846s][info   ][gc      ] GC(12) Minor Collection (Allocation Rate)
> > 12848M(84%)->1556M(10%) 0,401s
> > [56,203s][info   ][gc      ] GC(13) Major Collection (Proactive)
> > [56,368s][info   ][gc      ] GC(13) Major Collection (Proactive)
> > 11684M(76%)->484M(3%) 0,166s
> >
> > kind regards
> > Johannes
> >
> > Am Fr., 16. Feb. 2024 um 22:39 Uhr schrieb Erik Osterlund:
> >
> > > It's worth noting that when using ZGC, calling System.gc does not
> > > invoke a classic disastrously long GC pause. Instead, a concurrent GC
> > > is triggered, which should be not that noticeable to the application.
> > > The thread calling System.gc is blocked until the GC is done, but the
> > > other threads can run freely.
> > >
> > > /Erik
> > >
> > > On 16 Feb 2024, at 21:55, Stefan Johansson wrote:
> > >
> > > > On 2024-02-16 18:04, Johannes Lichtenberger wrote:
> > > > > Thanks a lot, I wasn't even aware of the fact that
> > > > > DirectByteBuffers use System.gc(), and I always had in mind that
> > > > > calling System.gc(), at least in application code, is bad practice
> > > > > (or at least we shouldn't rely on it), and I think I read somewhere
> > > > > a while ago that it's recommended to even disable this, but I may
> > > > > be completely wrong, of course.
> > > >
> > > > In most cases calling System.gc() is bad practice, in some special
> > > > cases it might be needed.
> > > >
> > > > > I'll change it to on-heap byte buffers tomorrow :-)
> > > > > I think your GC log entries were from G1, right? It seems ZGC
> > > > > always tries to use the full heap :-)
> > > >
> > > > Yes, the snippet was G1, it was mostly to show that the pressure
> > > > isn't high. You are correct that ZGC uses more of the given heap,
> > > > but the collections are pretty far apart and I'm certain it would
> > > > function well with a smaller heap as well. Maybe in that case some
> > > > Major collections would be triggered.
> > > >
> > > > > Kind regards and thanks for sharing your insights.
> > > >
> > > > No problem. We appreciate the feedback,
> > > > StefanJ
> > > >
> > > > > Have a nice weekend as well
> > > > > Johannes
> > > > >
> > > > > Stefan Johansson schrieb am Fr., 16. Feb. 2024, 17:38:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Some comments inline.
> > > > > >
> > > > > > On 2024-02-16 16:47, Johannes Lichtenberger wrote:
> > > > > > > Thanks a lot for looking into it, I've added
> > > > > > > `-XX:MaxDirectMemorySize=2g` only recently, but it failed
> > > > > > > without it as well, so I'm not sure what the default is. Will
> > > > > > > definitely check your suggestions :-)
> > > > > >
> > > > > > If you don't set a limit it will be set to:
> > > > > > Runtime.getRuntime().maxMemory()
> > > > > > So it's likely a good idea to set a reasonable limit, but the
> > > > > > smaller the limit is, the more frequently we need to run
> > > > > > reference processing to allow memory to be freed up.
> > > > > >
> > > > > > > Sadly I'm currently working alone on the project in my spare
> > > > > > > time (besides professionally having switched from Java/Kotlin
> > > > > > > stuff to the embedded software world) and I'm not sure if the
> > > > > > > current architecture of Sirix is limited by too much GC
> > > > > > > pressure. I'd probably have to check Cassandra at some point
> > > > > > > and look into flame graphs and stuff for their integration
> > > > > > > tests, but maybe you can give some general insights/advice...
> > > > > > >
> > > > > > > Yesterday evening I switched to other JDKs (I also want to
> > > > > > > test with Shenandoah in particular), but I think especially
> > > > > > > the better escape analysis of the GraalVM is a huge plus in
> > > > > > > the case of SirixDB (for insertion on my laptop it's ~90s vs
> > > > > > > ~60s), but I think it should be faster, and currently my
> > > > > > > suspicion is that garbage is a major performance issue.
> > > > > > >
> > > > > > > Maybe the GC pressure in general is a major issue, as in the
> > > > > > > CPU flame graph IIRC the G1 had about 20% of stack frames
> > > > > > > allocating, and non-generational ZGC even around 40%, taking
> > > > > > > all threads into account.
> > > > > >
> > > > > > From what I/we see, the GC pressure in the given test is not
> > > > > > high. The allocation rate is below 1GB/s and since most of it
> > > > > > dies young the GCs are fairly cheap. In this log snippet G1
> > > > > > shows a GC every 5s and the pause time is below 50ms:
> > > > > > [296,016s][info   ][gc      ] GC(90) Pause Young (Normal) (G1
> > > > > > Evacuation Pause) 5413M->1849M(6456M) 35,577ms
> > > > > > [301,103s][info   ][gc      ] GC(91) Pause Young (Normal) (G1
> > > > > > Evacuation Pause) 5417M->1848M(6456M) 33,357ms
> > > > > > [306,041s][info   ][gc      ] GC(92) Pause Young (Normal) (G1
> > > > > > Evacuation Pause) 5416M->1848M(6456M) 32,763ms
> > > > > > [310,849s][info   ][gc      ] GC(93) Pause Young (Normal) (G1
> > > > > > Evacuation Pause) 5416M->1847M(6456M) 33,086ms
> > > > > > I also see that the heap never expands to more than ~6.5GB even
> > > > > > though it is allowed to be 15GB, and this also suggests that the
> > > > > > GC is not under much pressure. As I said in the previous mail,
> > > > > > the reason Generational ZGC doesn't free up the direct memory
> > > > > > without the System.gc() calls is that the GC pressure is not
> > > > > > high enough to trigger any Major cycles. So I would strongly
> > > > > > recommend you to not run with -XX:+DisableExplicitGC unless you
> > > > > > really have to, since you are using DirectByteBuffers and they
> > > > > > use System.gc() to help free memory when the limit is reached.
> > > > > >
> > > > > > > So in general I'm thinking about backing the KeyValueLeafPages
> > > > > > > with MemorySegments, but I think due to variable-sized pages
> > > > > > > it's getting tricky, plus I currently don't have the time for
> > > > > > > changing fundamental stuff, and I'm not even sure if it'll
> > > > > > > bring a performance boost, as I have to adapt neighbour
> > > > > > > relationships of the nodes often and off-heap memory access
> > > > > > > might be slightly worse performance-wise.
> > > > > > >
> > > > > > > What do you think?
> > > > > >
> > > > > > I know too little about the application to be able to give
> > > > > > advice here, but I would first start with having most memory on
> > > > > > heap. Only large, long-lived stuff off-heap, if really needed.
> > > > > > Looking at the test at hand, it really doesn't look like it is
> > > > > > long-lived stuff that is placed off heap.
> > > > > >
> > > > > > > I've attached a memory flame graph and there it seems the byte
> > > > > > > array from deserializing each page is prominent, but that
> > > > > > > might be something I can't even avoid (after decompression via
> > > > > > > Snappy or via another lib, and maybe also decryption in the
> > > > > > > future).
> > > > > > >
> > > > > > > As of now G1 with GraalVM seems to perform best (a little bit
> > > > > > > better than with non-generational ZGC, but I thought ZGC or
> > > > > > > maybe Shenandoah would improve the situation). But as said, I
> > > > > > > may have to generate way less garbage in general for good
> > > > > > > performance!?
> > > > > > >
> > > > > > > All in all, maybe because most objects die young, the
> > > > > > > generational GCs are not even needed (that said, if enough RAM
> > > > > > > is available and the Caffeine caches are sized accordingly,
> > > > > > > most objects may die old). But apparently the byte arrays
> > > > > > > holding the page data still die young (in
> > > > > > > AbstractReader::deserialize). In fact I'm not even sure why
> > > > > > > they escape, but currently I'm on my phone.
> > > > > >
> > > > > > It's when most objects die young that the Generational GC really
> > > > > > shines, because it can handle the short-lived objects without
> > > > > > having to look at the long-lived objects. So I would say
> > > > > > Generational ZGC is a good fit here, but we need to let the
> > > > > > System.gc() run to allow reference processing, or slightly
> > > > > > re-design and use HeapByteBuffers.
> > > > > >
> > > > > > Have a nice weekend,
> > > > > > Stefan
> > > > > >
> > > > > > > Kind regards
> > > > > > > Johannes
> > > > > > >
> > > > > > > Stefan Johansson schrieb am Fr., 16. Feb. 2024, 13:43:
> > > > > > >
> > > > > > > > Hi Johannes,
> > > > > > > >
> > > > > > > > We've spent some more time looking at this, and getting the
> > > > > > > > json-file to reproduce it made it easy to verify our
> > > > > > > > suspicions. Thanks for uploading it.
> > > > > > > >
> > > > > > > > There are a few things playing together here. The test is
> > > > > > > > making quite heavy use of DirectByteBuffers and you limit
> > > > > > > > the usage to 2G (-XX:MaxDirectMemorySize=2g). The life cycle
> > > > > > > > and freeing of the native memory part of the DirectByteBuffer
> > > > > > > > rely on reference processing. In generational ZGC reference
> > > > > > > > processing is only done during Major collections, and since
> > > > > > > > the general GC pressure in this benchmark is very low (most
> > > > > > > > objects die young), we do not trigger that many Major
> > > > > > > > collections.
> > > > > > > >
> > > > > > > > Normally this would not be a problem. To avoid throwing an
> > > > > > > > out-of-memory error (due to hitting the direct buffer memory
> > > > > > > > limit) too early, the JDK triggers a System.gc(). This should
> > > > > > > > trigger reference processing, and all buffers that are no
> > > > > > > > longer in use would be freed. Since you specify the option
> > > > > > > > -XX:+DisableExplicitGC, all these calls to trigger GCs are
> > > > > > > > ignored and no direct memory will be freed. So in our
> > > > > > > > testing, just removing this flag makes the test pass.
> > > > > > > >
> > > > > > > > Another solution is to look at using HeapByteBuffers instead
> > > > > > > > and not have to worry about the direct memory usage at all.
> > > > > > > > The OpenHFT lib seems to have support for this by just using
> > > > > > > > elasticHeapByteBuffer(...) instead of elasticByteBuffer().
> > > > > > > >
> > > > > > > > Lastly, the reason for this working with non-generational ZGC
> > > > > > > > is that it does reference processing for every GC.
> > > > > > > >
> > > > > > > > Hope this helps,
> > > > > > > > StefanJ
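A minimal sketch of the two Chronicle Bytes variants mentioned above
(assuming the net.openhft.chronicle.bytes API already used by SirixDB; the
size and the explicit release are only illustrative):

    import net.openhft.chronicle.bytes.Bytes;

    public class ElasticBufferSketch {
        public static void main(String[] args) {
            // Off-heap: backed by a DirectByteBuffer, counted against
            // -XX:MaxDirectMemorySize and only reclaimed after reference
            // processing (or an explicit release).
            var direct = Bytes.elasticByteBuffer(60_000);
            // On-heap: backed by a heap ByteBuffer and handled entirely by
            // the garbage collector; MaxDirectMemorySize does not apply.
            var heap = Bytes.elasticHeapByteBuffer(60_000);
            try {
                direct.writeUtf8("off-heap data");
                heap.writeUtf8("on-heap data");
            } finally {
                // Releasing deterministically avoids waiting for a Major GC
                // to run reference processing.
                direct.releaseLast();
                heap.releaseLast();
            }
        }
    }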
> > > > > > > > On 2024-02-15 21:53, Johannes Lichtenberger wrote:
> > > > > > > > > It's a laptop, I've attached some details.
> > > > > > > > >
> > > > > > > > > Furthermore, if it seems worth digging deeper into the
> > > > > > > > > issue, the JSON file is here for one week:
> > > > > > > > > https://www.transfernow.net/dl/20240215j9NaPTc0
> > > > > > > > >
> > > > > > > > > You'd have to unzip into
> > > > > > > > > bundles/sirix-core/src/test/resources/json, remove the
> > > > > > > > > @Disabled annotation and run the test
> > > > > > > > > JsonShredderTest::testChicagoDescendantAxis
> > > > > > > > >
> > > > > > > > > The test JVM parameters are specified in the parent
> > > > > > > > > build.gradle in the project root folder.
> > > > > > > > >
> > > > > > > > > The GitHub repo: https://github.com/sirixdb/sirix
> > > > > > > > >
> > > > > > > > > Screenshot from 2024-02-15 21-43-33.png
> > > > > > > > >
> > > > > > > > > kind regards
> > > > > > > > > Johannes
> > > > > > > > >
> > > > > > > > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth:
> > > > > > > > > > Just curious - what CPU, physical memory and OS are you
> > > > > > > > > > using?
> > > > > > > > > > Sent from my iPhone
> > > > > > > > > >
> > > > > > > > > > On Feb 15, 2024, at 12:23 PM, Johannes Lichtenberger wrote:
> > > > > > > > > > > I guess I don't know which JDK it picks for the tests,
> > > > > > > > > > > but I guess OpenJDK
> > > > > > > > > > >
> > > > > > > > > > > Johannes Lichtenberger schrieb am Do., 15. Feb. 2024, 17:58:
> > > > > > > > > > > However, it's the same with: ./gradlew
> > > > > > > > > > > -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2
> > > > > > > > > > > :sirix-core:test --tests
> > > > > > > > > > > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
> > > > > > > > > > > using OpenJDK hopefully
> > > > > > > > > > >
> > > > > > > > > > > Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes Lichtenberger:
> > > > > > > > > > > I've attached two logs, the first one without
> > > > > > > > > > > -XX:+ZGenerational, the second one with the option set,
> > > > > > > > > > > even though I also saw that generational ZGC is going
> > > > > > > > > > > to be supported in GraalVM 24.1 in September... so not
> > > > > > > > > > > sure what this does :)
> > > > > > > > > > >
> > > > > > > > > > > Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes Lichtenberger:
> > > > > > > > > > > Strange, so does it simply ignore the option? The
> > > > > > > > > > > following is the beginning of the output from _non_
> > > > > > > > > > > generational ZGC:
> > > > > > > > > > >
> > > > > > > > > > > johannes@luna:~/IdeaProjects/sirix$ ./gradlew
> > > > > > > > > > > -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal
> > > > > > > > > > > :sirix-core:test --tests
> > > > > > > > > > > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
> > > > > > > > > > >
> > > > > > > > > > > > Configure project :
> > > > > > > > > > > The 'sonarqube' task depends on compile tasks. This
> > > > > > > > > > > behavior is now deprecated and will be removed in
> > > > > > > > > > > version 5.x. To avoid implicit compilation, set property
> > > > > > > > > > > 'sonar.gradle.skipCompile' to 'true' and make sure your
> > > > > > > > > > > project is compiled before analysis has started.
> > > > > > > > > > > The 'sonar' task depends on compile tasks. This behavior
> > > > > > > > > > > is now deprecated and will be removed in version 5.x. To
> > > > > > > > > > > avoid implicit compilation, set property
> > > > > > > > > > > 'sonar.gradle.skipCompile' to 'true' and make sure your
> > > > > > > > > > > project is compiled before analysis has started.
> > > > > > > > > > > [1,627s][info   ][gc      ] GC(0) Garbage Collection (Metadata GC Threshold) 84M(1%)->56M(0%)
> > > > > > > > > > >
> > > > > > > > > > > > Task :sirix-core:test
> > > > > > > > > > > [0.001s][warning][pagesize] UseLargePages disabled, no large pages configured and available on the system.
> > > > > > > > > > > [1.253s][info   ][gc      ] Using The Z Garbage Collector
> > > > > > > > > > > [2,930s][info   ][gc      ] GC(1) Garbage Collection (Warmup) 1616M(11%)->746M(5%)
> > > > > > > > > > > [4,445s][info   ][gc      ] GC(2) Garbage Collection (Warmup) 3232M(21%)->750M(5%)
> > > > > > > > > > > [5,751s][info   ][gc      ] GC(3) Garbage Collection (Warmup) 4644M(30%)->1356M(9%)
> > > > > > > > > > > [9,886s][info   ][gc      ] GC(4) Garbage Collection (Allocation Rate) 10668M(69%)->612M(4%)
> > > > > > > > > > > [10,406s][info   ][gc      ] GC(5) Garbage Collection (Allocation Rate) 2648M(17%)->216M(1%)
> > > > > > > > > > > [13,931s][info   ][gc      ] GC(6) Garbage Collection (Allocation Rate) 11164M(73%)->1562M(10%)
> > > > > > > > > > > [16,908s][info   ][gc      ] GC(7) Garbage Collection (Allocation Rate) 11750M(76%)->460M(3%)
> > > > > > > > > > > [20,690s][info   ][gc      ] GC(8) Garbage Collection (Allocation Rate) 12670M(82%)->726M(5%)
> > > > > > > > > > > [24,376s][info   ][gc      ] GC(9) Garbage Collection (Allocation Rate) 13422M(87%)->224M(1%)
> > > > > > > > > > > [28,152s][info   ][gc      ] GC(10) Garbage Collection (Proactive) 13474M(88%)->650M(4%)
> > > > > > > > > > > [31,526s][info   ][gc      ] GC(11) Garbage Collection (Allocation Rate) 12072M(79%)->1472M(10%)
> > > > > > > > > > > [34,754s][info   ][gc      ] GC(12) Garbage Collection (Allocation Rate) 13050M(85%)->330M(2%)
> > > > > > > > > > > [38,478s][info   ][gc      ] GC(13) Garbage Collection (Allocation Rate) 13288M(87%)->762M(5%)
> > > > > > > > > > > [41,936s][info   ][gc      ] GC(14) Garbage Collection (Proactive) 13294M(87%)->504M(3%)
> > > > > > > > > > > [45,353s][info   ][gc      ] GC(15) Garbage Collection (Allocation Rate) 12984M(85%)->268M(2%)
> > > > > > > > > > > [48,861s][info   ][gc      ] GC(16) Garbage Collection (Allocation Rate) 13008M(85%)->306M(2%)
> > > > > > > > > > > [52,133s][info   ][gc      ] GC(17) Garbage Collection (Proactive) 12042M(78%)->538M(4%)
> > > > > > > > > > > [55,705s][info   ][gc      ] GC(18) Garbage Collection (Allocation Rate) 12420M(81%)->1842M(12%)
> > > > > > > > > > > [59,000s][info   ][gc      ] GC(19) Garbage Collection (Allocation Rate) 12458M(81%)->1422M(9%)
> > > > > > > > > > > [64,501s][info   ][gc      ] Allocation Stall (Test worker) 59,673ms
> > > > > > > > > > > [64,742s][info   ][gc      ] Allocation Stall (Test worker) 240,077ms
> > > > > > > > > > > [65,806s][info   ][gc      ] GC(20) Garbage Collection (Allocation Rate) 13808M(90%)->6936M(45%)
> > > > > > > > > > > [66,476s][info   ][gc      ] GC(21) Garbage Collection (Allocation Stall) 7100M(46%)->4478M(29%)
> > > > > > > > > > > [69,471s][info   ][gc      ] GC(22) Garbage Collection (Allocation Rate) 10098M(66%)->5888M(38%)
> > > > > > > > > > > [72,252s][info   ][gc      ] GC(23) Garbage Collection (Allocation Rate) 11226M(73%)->5816M(38%)
> > > > > > > > > > >
> > > > > > > > > > > ...
> > > > > > > > > > >
> > > > > > > > > > > So even here I can see some allocation stalls.
> > > > > > > > > > >
> > > > > > > > > > > Running the same with -XX:+ZGenerational in
> > > > > > > > > > > build.gradle, probably using GraalVM, does something
> > > > > > > > > > > different, but I don't know what... at least off-heap
> > > > > > > > > > > memory is exhausted at some point due to direct byte
> > > > > > > > > > > buffer usage!?
> > > > > > > > > > >
> > > > > > > > > > > So, I'm not sure what's the difference, though.
> > > > > > > > > > >
> > > > > > > > > > > With this:
> > > > > > > > > > >
> > > > > > > > > > > "-XX:+UseZGC",
> > > > > > > > > > > "-Xlog:gc*=debug:file=zgc-generational-detailed.log",
> > > > > > > > > > > "-XX:+ZGenerational",
> > > > > > > > > > > "-verbose:gc",
> > > > > > > > > > > "-XX:+HeapDumpOnOutOfMemoryError",
> > > > > > > > > > > "-XX:HeapDumpPath=heapdump.hprof",
> > > > > > > > > > > "-XX:MaxDirectMemorySize=2g",
> > > > > > > > > > >
> > > > > > > > > > > Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648)
> > > > > > > > > > >     at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
> > > > > > > > > > >     at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:111)
> > > > > > > > > > >     at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360)
> > > > > > > > > > >     at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191)
> > > > > > > > > > >     at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192)
> > > > > > > > > > >     at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176)
> > > > > > > > > > >     at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148)
> > > > > > > > > > >     at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443)
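That failure mode can be reproduced standalone with a small sketch like the
following (not SirixDB code; run it with e.g. -XX:MaxDirectMemorySize=256m,
and add -XX:+DisableExplicitGC to see the same "Cannot reserve ... bytes of
direct buffer memory" error):

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class DirectBufferChurn {
        public static void main(String[] args) {
            // Allocate direct buffers and drop the references. The native
            // memory is only reclaimed once the dead DirectByteBuffer
            // objects have been collected and their cleaners run. When the
            // MaxDirectMemorySize limit is about to be hit, the JDK itself
            // calls System.gc() to force that; with -XX:+DisableExplicitGC
            // that call is ignored and the allocation fails instead.
            for (int i = 0; i < 10_000; i++) {
                List<ByteBuffer> batch = new ArrayList<>();
                for (int j = 0; j < 64; j++) {
                    batch.add(ByteBuffer.allocateDirect(1 << 20)); // 1 MiB each
                }
                batch.clear(); // the whole batch becomes garbage here
            }
            System.out.println("done");
        }
    }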
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefan.johansson at oracle.com  Sat Feb 17 14:54:12 2024
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Sat, 17 Feb 2024 15:54:12 +0100
Subject: Generational ZGC issue
In-Reply-To: 
References: 
Message-ID: <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com>

Ok, when you say crashes, what do you mean? Are you still seeing the same
OutOfMemoryError, or are we talking about an actual JVM crash? Or is it
the Linux OOM killer stepping in because of high memory pressure?

If this is with the new setting of 5g for direct memory, it could be that
these 3 extra gigs of memory are pushing you over the limit of what can be
handled by your laptop. Generally, if you start swapping, the performance
is out the door and you need to look at the configuration. Maybe the 2g
for direct memory is reasonable on this setup to avoid swapping. I looked
a bit at the total memory usage for the process here and it seems to be
around 20G.

Stefan

On 2024-02-17 12:24, Johannes Lichtenberger wrote:
> So, switching back to DirectByteBuffers, and removing the disabling of
> explicit GCs, still crashes on my laptop (swapping + close to 32 Gb RAM
> used)...
>
> kind regards
> Johannes
>
> Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson
> <stefan.johansson at oracle.com>:
>
> > On 2024-02-17 00:36, Johannes Lichtenberger wrote:
> > > I just removed "-XX:+DisableExplicitGC", increased the max direct
> > > memory size to 5g (-XX:MaxDirectMemorySize=5g), but also changed
> > > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000)
> > > to use on-heap ByteBuffers.
> >
> > Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize
> > has no effect since the ByteBuffers will be stored on the heap. But if
> > you keep going with DirectByteBuffers, this might make sense to give
> > some more head room.
> >
> > > However, the performance seems to be way worse. I've repeated the
> > > test several times, but with both G1 and non-generational ZGC it's
> > > ~50s for importing the JSON file in the first case vs ~100s using
> > > generational ZGC, using Temurin 21.0.2 with similar values for the
> > > actual traversals.
> >
> > Ok, sounds like using DirectByteBuffers is a performance win here. If
> > so I would just continue testing using DirectByteBuffers and allowing
> > explicit GCs to ensure they are cleaned out properly.
> >
> > > From the log on STDOUT, I can see this (meaning 0,319s and 0,440s...
> > > pause times?)
> >
> > No, with ZGC the time here is not the pause time, it's the time to
> > complete the whole GC.
>     ZGC is a concurrent GC, meaning that most of the GC work is done
>     concurrently with the Java application still running. There are still
>     some very short pauses, all way below 1ms. You can see them if you
>     look at the detailed log:
>
>     [30,938s][info][gc          ] GC(3) Minor Collection (Allocation Rate)
>     [30,938s][info][gc,phases   ] GC(3) y: Young Generation
>     [30,938s][info][gc,phases   ] GC(3) y: Pause Mark Start 0,060ms
>     [31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark 383,563ms
>     [31,322s][info][gc,phases   ] GC(3) y: Pause Mark End 0,046ms
>     [31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark Free 0,009ms
>     [31,322s][info][gc,phases   ] GC(3) y: Concurrent Reset Relocation Set 0,201ms
>     [31,335s][info][gc,phases   ] GC(3) y: Concurrent Select Relocation Set 13,228ms
>     [31,335s][info][gc,phases   ] GC(3) y: Pause Relocate Start 0,019ms
>     [31,381s][info][gc,phases   ] GC(3) y: Concurrent Relocate 45,967ms
>     [31,382s][info][gc,phases   ] GC(3) y: Young Generation 9726M(63%)->518M(3%) 0,444s
>     [31,382s][info][gc          ] GC(3) Minor Collection (Allocation Rate) 9726M(63%)->518M(3%) 0,444s
>
>     Here I included the phase logs for a single GC of the young generation,
>     where you can clearly see how much time was spent in which part of the
>     GC, and as you can see the three pauses are all very, very short.
>
>     Stefan
>
>     > [35,718s][info   ][gc      ] GC(9) Minor Collection (Allocation Rate)
>     > 12462M(81%)->1556M(10%) 0,319s
>     > [40,871s][info   ][gc      ] GC(10) Minor Collection (Allocation Rate)
>     > [41,311s][info   ][gc      ] GC(10) Minor Collection (Allocation Rate)
>     > 13088M(85%)->1432M(9%) 0,440s
>     > [46,236s][info   ][gc      ] GC(11) Minor Collection (Allocation Rate)
>     > [46,603s][info   ][gc      ] GC(11) Minor Collection (Allocation Rate)
>     > 12406M(81%)->1676M(11%) 0,367s
>     > [51,445s][info   ][gc      ] GC(12) Minor Collection (Allocation Rate)
>     > [51,846s][info   ][gc      ] GC(12) Minor Collection (Allocation Rate)
>     > 12848M(84%)->1556M(10%) 0,401s
>     > [56,203s][info   ][gc      ] GC(13) Major Collection (Proactive)
>     > [56,368s][info   ][gc      ] GC(13) Major Collection (Proactive)
>     > 11684M(76%)->484M(3%) 0,166s
>     >
>     > kind regards
>     > Johannes
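
For reference, a minimal sketch of the two Chronicle Bytes allocation modes
discussed above (off-heap elasticByteBuffer vs. on-heap elasticHeapByteBuffer).
The 60_000 initial capacity is just the value from the quoted change, not a
recommendation, and the class name is made up for the example:

import java.nio.ByteBuffer;
import net.openhft.chronicle.bytes.Bytes;

public class BufferModesSketch {
    public static void main(String[] args) {
        // Off-heap (direct) elastic buffer: the native memory behind it is
        // charged against -XX:MaxDirectMemorySize.
        Bytes<ByteBuffer> direct = Bytes.elasticByteBuffer(60_000);

        // On-heap elastic buffer: backed by a heap ByteBuffer, so
        // -XX:MaxDirectMemorySize has no effect here.
        Bytes<ByteBuffer> onHeap = Bytes.elasticHeapByteBuffer(60_000);

        direct.writeUtf8("serialized page goes here");
        onHeap.writeUtf8("serialized page goes here");

        // Chronicle Bytes is reference counted; releasing explicitly when a
        // buffer is no longer needed avoids relying solely on GC/reference
        // processing to get the native memory back.
        direct.releaseLast();
        onHeap.releaseLast();
    }
}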
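
As a rough follow-up sketch (plain JDK, nothing SirixDB-specific), this is one
way to watch the direct-buffer accounting in isolation; run it with e.g.
-XX:MaxDirectMemorySize=64m and once more with -XX:+DisableExplicitGC added to
compare (the limit and buffer sizes are arbitrary):

import java.nio.ByteBuffer;

public class DirectLimitSketch {
    public static void main(String[] args) {
        long allocatedMiB = 0;
        while (true) {
            // Each allocation is charged against -XX:MaxDirectMemorySize;
            // the buffer becomes garbage right after this iteration.
            ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20); // 1 MiB
            buffer.put(0, (byte) 1);
            allocatedMiB++;
            if (allocatedMiB % 512 == 0) {
                System.out.println("allocated " + allocatedMiB + " MiB of direct buffers so far");
            }
            // When the limit is reached, the JDK first tries reference
            // processing and a System.gc() to reclaim dead buffers, and only
            // then throws "Cannot reserve ... bytes of direct buffer memory".
            // With -XX:+DisableExplicitGC that fallback is ignored, so the
            // error can show up even though most buffers are already garbage.
        }
    }
}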
From lichtenberger.johannes at gmail.com  Sat Feb 17 15:21:03 2024
From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger)
Date: Sat, 17 Feb 2024 16:21:03 +0100
Subject: Generational ZGC issue
In-Reply-To: <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com>
References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com>
 <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com>
 <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com>
 <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com>
Message-ID: 

I'll check later on if the test doesn't fail with the 2g max, and I'll have to
check as well if it's still the OutOfMemoryError (as I'm not at home
currently). But every time, everything freezes for a couple of seconds.

In any case, isn't it strange that, when switching to on-heap ByteBuffers, the
runtimes with G1 and non-generational ZGC are very close to each other (with
G1 slightly ahead), but with generational ZGC the runtime almost exactly
doubles? I thought at some point the generational version is supposed to make
the non-generational one obsolete, and as almost every object dies young,
generational ZGC should be better, as you wrote!?

Kind regards and have a nice weekend (and kind of feel sorry for bothering
that much)
Johannes

Stefan Johansson wrote on Sat, 17 Feb 2024, 15:54:

> Ok, when you say crashes, what do you mean? Are you still seeing the
> same OutOfMemoryError, or are we talking about an actual JVM crash? Or is
> it the Linux OOM killer stepping in because of high memory pressure?
>
> If this is with the new setting of 5g for direct memory, it could be that
> these 3 extra gigs of memory are pushing you over the limit of what can
> be handled by your laptop. Generally, if you start swapping, the
> performance is out the door and you need to look at the configuration.
> Maybe the 2g for direct memory is reasonable on this setup to avoid
> swapping. I looked a bit at the total memory usage for the process here
> and it seems to be around 20G.
>
> Stefan
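
For completeness, a small standard-JDK snippet (a hypothetical helper, not
part of SirixDB) that prints how much native memory the DirectByteBuffers are
currently holding via the "direct" buffer pool MXBean; comparing that number
to the overall process footprint helps distinguish a direct-memory
OutOfMemoryError from general memory pressure and swapping:

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class DirectPoolStatsSketch {
    public static void main(String[] args) {
        // Typically two pools are reported: "direct" and "mapped".
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s pool: count=%d, used=%d MiB, capacity=%d MiB%n",
                    pool.getName(),
                    pool.getCount(),
                    pool.getMemoryUsed() >> 20,
                    pool.getTotalCapacity() >> 20);
        }
    }
}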
2024 um 10:22 Uhr schrieb Stefan Johansson > > >: > > > > > > > > On 2024-02-17 00:36, Johannes Lichtenberger wrote: > > > I just removed "-XX+DisableExplizitGC", increased max direct > > memory size > > > to 5g (-XX:MaxDirectMemorySize=5g), but also changed > > > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000); > > > to use on heap ByteBuffers. > > > > > > > Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize > > has > > no effect since the ByteBuffers will be stored on the heap. But if > you > > keep going with DirectByteBuffers, this might make sense to give some > > more head room. > > > > > However, the performance seems to be way worse. I've repeated the > > test > > > several times, but with both G1 and non generational ZGC it's > > ~50s for > > > importing the JSON file in the first case vs ~100s using > > generational > > > ZGC, using Temurin 21.0.2 with similar values for the actual > > traversals. > > > > > > > Ok, sounds like using DirectByteBuffers is a performance win here. > > If so > > I would just continue testing using DirectByteBuffers and allowing > > explicit GCs to ensure they are cleaned out properly. > > > > > From the log on STDOUT, I can see this (meaning 0,319s and > > 0,440s... > > > pause times?) > > > > > > > No, with ZGC the time here is not the pause time, it's the time to > > complete the whole GC. ZGC is a concurrent GC, meaning that most of > the > > GC work is done concurrently with the Java application still running. > > There are still a some very short pauses, all way below 1ms. You can > > see > > them if you look at the detailed log: > > > > [30,938s][info][gc ] GC(3) Minor Collection (Allocation > Rate) > > [30,938s][info][gc,phases ] GC(3) y: Young Generation > > [30,938s][info][gc,phases ] GC(3) y: Pause Mark Start 0,060ms > > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark 383,563ms > > [31,322s][info][gc,phases ] GC(3) y: Pause Mark End 0,046ms > > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark Free 0,009ms > > [31,322s][info][gc,phases ] GC(3) y: Concurrent Reset Relocation > Set > > 0,201ms > > [31,335s][info][gc,phases ] GC(3) y: Concurrent Select Relocation > Set > > 13,228ms > > [31,335s][info][gc,phases ] GC(3) y: Pause Relocate Start 0,019ms > > [31,381s][info][gc,phases ] GC(3) y: Concurrent Relocate 45,967ms > > [31,382s][info][gc,phases ] GC(3) y: Young Generation > > 9726M(63%)->518M(3%) 0,444s > > [31,382s][info][gc ] GC(3) Minor Collection (Allocation > Rate) > > 9726M(63%)->518M(3%) 0,444s > > > > Here I included the phase-logs for a single GC of the young > generation, > > where you can clearly see how much time was spent in which part of > the > > GC and as you can see the three pauses are all very very short. 
> > > > Stefan > > > > > [35,718s][info ][gc ] GC(9) Minor Collection (Allocation > > Rate) > > > 12462M(81%)->1556M(10%) 0,319s > > > [40,871s][info ][gc ] GC(10) Minor Collection (Allocation > > Rate) > > > [41,311s][info ][gc ] GC(10) Minor Collection (Allocation > > Rate) > > > 13088M(85%)->1432M(9%) 0,440s > > > [46,236s][info ][gc ] GC(11) Minor Collection (Allocation > > Rate) > > > [46,603s][info ][gc ] GC(11) Minor Collection (Allocation > > Rate) > > > 12406M(81%)->1676M(11%) 0,367s > > > [51,445s][info ][gc ] GC(12) Minor Collection (Allocation > > Rate) > > > [51,846s][info ][gc ] GC(12) Minor Collection (Allocation > > Rate) > > > 12848M(84%)->1556M(10%) 0,401s > > > [56,203s][info ][gc ] GC(13) Major Collection (Proactive) > > > [56,368s][info ][gc ] GC(13) Major Collection (Proactive) > > > 11684M(76%)->484M(3%) 0,166s > > > > > > kind regards > > > Johannes > > > > > > Am Fr., 16. Feb. 2024 um 22:39 Uhr schrieb Erik Osterlund > > > > > >>>: > > > > > > It?s worth noting that when using ZGC, calling System.gc does > not > > > invoke a classic disastrously long GC pause. Instead, a > > concurrent > > > GC is triggered, which should be not that noticeable to the > > > application. The thread calling System.gc is blocked until > > the GC is > > > done, but the other threads can run freely. > > > > > > /Erik > > > > > > > On 16 Feb 2024, at 21:55, Stefan Johansson > > > > > > > >> > > > wrote: > > > > > > > > ? > > > > > > > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: > > > >> Thanks a lot, I wasn't even aware of the fact, that > > > DirectByteBuffers use System.gc() and I always had in mind > that > > > calling System.gc() at least in application code is bad > > practice (or > > > at least we shouldn't rely on it) and I think I read > somewhere a > > > while ago, that it's recommended to even disable this, but > may be > > > completely wrong, of course. > > > > In most cases callling System.gc() is bad practice, in some > > > special cases it might be needed. > > > > > > > >> I'll change it to on-heap byte buffers tomorrow :-) > > > >> I think your GC log entries were from G1, right? It seems > ZGC > > > always tries to use the full heap :-) > > > > > > > > Yes, the snippet was G1, it was mostly to show that the > > pressure > > > isn't high. You are correct that ZGC uses more of the given > > heap but > > > the collections are pretty far apart and I'm certian it would > > > function well with a smaller heap as well. Maybe in that case > > some > > > Major collections would be triggered. > > > > > > > >> Kind regards and thanks for sharing your insights. > > > > > > > > No problem. We appriciate the feedback, > > > > StefanJ > > > > > > > >> Have a nice weekend as well > > > >> Johannes > > > >> Stefan Johansson > > > > > > > > > > > > > > >>> schrieb am Fr., 16. Feb. > > > 2024, 17:38: > > > >> Hi, > > > >> Some comments inline. > > > >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: > > > >> > Thanks a lot for looking into it, I've added > > > >> > `-XX:MaxDirectMemorySize=2g` only recently, but > > without it > > > failed as > > > >> > well, so not sure what the default is. Will > definitely > > > check your > > > >> > suggestions :-) > > > >> > > > > >> If you don't set a limit it will be set to: > > > >> Runtime.getRuntime().maxMemory() > > > >> So likely a good idea to set a reasonable limit, but > the > > > smaller the > > > >> limit is the more frequent we need to run reference > > > processing to allow > > > >> memory to be freed up. 
> > > >> > Sadly I'm currently working alone on the project in > my > > > spare time > > > >> > (besides professionally switched from Java/Kotlin > > stuff to the > > > >> embedded > > > >> > software world) and I'm not sure if the current > > > architecture of > > > >> Sirix is > > > >> > limited by too much GC pressure. I'd probably have > > to check > > > >> Cassandra at > > > >> > some point and look into flame graphs and stuff for > > their > > > >> integration > > > >> > tests, but maybe you can give some general > > insights/advice... > > > >> > > > > >> > Yesterday evening I switched to other JDKs (also I > > want to > > > test with > > > >> > Shenandoah in particular), but I think especially > the > > > better escape > > > >> > analysis of the GraalVM is a huge plus in the case > of > > > SirixDB (for > > > >> > insertion on my laptop it's ~90s vs ~60s), but I > > think it > > > should be > > > >> > faster and currently my suspicion is that garbage > > is a major > > > >> performance > > > >> > issue. > > > >> > > > > >> > Maybe the GC pressure in general is a major issue, > > as in > > > the CPU > > > >> Flame > > > >> > graph IIRC the G1 had about 20% stack frames > > allocated and non > > > >> > generational ZGC even around 40% taking all threads > > into > > > account. > > > >> > > > > >> From what I/we see, the GC pressure in the given > test is > > > not high. > > > >> The > > > >> allocation rate is below 1GB/s and since most of it > > die young > > > the GCs > > > >> are fairly cheap. In this log snippet G1 shows a GC > > every 5s > > > and the > > > >> pause time is below 50ms: > > > >> [296,016s][info ][gc ] GC(90) Pause Young > > (Normal) (G1 > > > >> Evacuation > > > >> Pause) 5413M->1849M(6456M) 35,577ms > > > >> [301,103s][info ][gc ] GC(91) Pause Young > > (Normal) (G1 > > > >> Evacuation > > > >> Pause) 5417M->1848M(6456M) 33,357ms > > > >> [306,041s][info ][gc ] GC(92) Pause Young > > (Normal) (G1 > > > >> Evacuation > > > >> Pause) 5416M->1848M(6456M) 32,763ms > > > >> [310,849s][info ][gc ] GC(93) Pause Young > > (Normal) (G1 > > > >> Evacuation > > > >> Pause) 5416M->1847M(6456M) 33,086ms > > > >> I also see that the heap never expands to more the > > ~6.5GB even > > > >> though it > > > >> is allow to be 15GB and this also suggest that the GC > > is not > > > under much > > > >> pressure. As I said in the previous mail, the reason > > > Generational ZGC > > > >> don't free up the direct memory without the > > System.gc() calls > > > is that > > > >> the GC pressure is not high enough to trigger any Major > > > cycles. So I > > > >> would strongly recommend you to not run with > > > -XX+DisableExplicitGC > > > >> unless you really have to. Since you are using > > > DirectByteBuffers and > > > >> they use System.gc() to help free memory when the > limit is > > > reached. > > > >> > So in general I'm thinking about backing the > > > KeyValueLeafPages with > > > >> > MemorySegments, but I think due to variable sized > pages > > > it's getting > > > >> > tricky, plus I currently don't have the time for > > changing > > > >> fundamental > > > >> > stuff and I'm even not sure if it'll bring a > > performance > > > boost, as I > > > >> > have to adapt neighbour relationships of the nodes > > often and > > > >> off-heap > > > >> > memory access might be slightly worse performance > wise. > > > >> > > > > >> > What do you think? 
> > > >> > > > > >> I know to little about the application to be able to > give > > > advice here, > > > >> but I would first start with having most memory on > > heap. Only > > > large > > > >> long > > > >> lived stuff off-heap, if really needed. Looking at the > > test > > > at hand, it > > > >> really doesn't look like it is long lived stuff that is > > > placed off heap. > > > >> > I've attached a memory flame graph and there it > > seems the > > > byte array > > > >> > from deserializing each page is prominent, but that > > might be > > > >> something I > > > >> > can't even avoid (after decompression via Snappy or > via > > > another > > > >> lib and > > > >> > maybe also decryption in the future). > > > >> > > > > >> > As of now G1 with GraalVM seems to perform best (a > > little > > > bit better > > > >> > than with non generational ZGC, but I thought ZGC > > or maybe > > > >> Shenandoah > > > >> > would improve the situation). But as said I may > have to > > > generate way > > > >> > less garbage after all in general for good > > performance!? > > > >> > > > > >> > All in all maybe due to most objects die young > > maybe also the > > > >> > generational GCs are not needed (that said if > > enough RAM is > > > >> available > > > >> > and the Caffeine Caches are sized accordingly most > > objects may > > > >> die old). > > > >> > But apparently the byte arrays holding the page > > data still die > > > >> young (in > > > >> > AbstractReader::deserialize). In fact I'm not even > sure > > > why they > > > >> escape, > > > >> > but currently I'm on my phone. > > > >> > > > > >> It's when most objects die young the Generational GC > > really > > > shines, > > > >> because it can handle the short lived objects without > > having > > > to look at > > > >> the long lived objects. So I would say Generational > > ZGC is a > > > good fit > > > >> here, but we need to let the System.gc() run to allow > > reference > > > >> processing or slightly re-design and use > HeapByteBuffers. > > > >> Have a nice weekend, > > > >> Stefan > > > >> > Kind regards > > > >> > Johannes > > > >> > > > > >> > Stefan Johansson > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>>> schrieb am Fr., 16. Feb. > > > >> 2024, 13:43: > > > >> > > > > >> > Hi Johannes, > > > >> > > > > >> > We've spent some more time looking at this and > > getting the > > > >> json-file to > > > >> > reproduced it made it easy to verify our > > suspicions. > > > Thanks for > > > >> > uploading it. > > > >> > > > > >> > There are a few things playing together here. > > The test is > > > >> making quite > > > >> > heavy use of DirectByteBuffers and you limit > > the usage > > > to 2G > > > >> > (-XX:MaxDirectMemorySize=2g). The life cycle and > > > freeing of > > > >> the native > > > >> > memory part of the DirectByteBuffer rely on > > reference > > > >> processing. In > > > >> > generational ZGC reference processing is only > done > > > during Major > > > >> > collections and since the general GC preassure > > in this > > > >> benchmark is > > > >> > very > > > >> > low (most objects die young), we do not trigger > > that > > > many Major > > > >> > collections. > > > >> > > > > >> > Normaly this would not be a problem. To avoid > > throwing > > > an out > > > >> of memory > > > >> > error (due to hitting the direct buffer memory > > limit) too > > > >> early the JDK > > > >> > triggers a System.gc(). 
This should trigger > > reference > > > >> procesing and all > > > >> > buffers that are no longer in use would be > freed. > > > Since you > > > >> specify the > > > >> > option -XX:+DisableExplicitGC all these calls to > > > trigger GCs are > > > >> > ignored > > > >> > and no direct memory will be freed. So in our > > testing, > > > just > > > >> removing > > > >> > this flags makes the test pass. > > > >> > > > > >> > Another solution is to look at using > > HeapByteBuffers > > > instead > > > >> and don't > > > >> > have to worry about the direct memory usage. The > > > OpenHFT lib > > > >> seems to > > > >> > have support for this by just using > > > >> elasticHeapByteBuffer(...) instead > > > >> > of elasticByteBuffer(). > > > >> > > > > >> > Lastly, the reason for this working with > > > non-generational ZGC is > > > >> > that it > > > >> > does reference processing for every GC. > > > >> > > > > >> > Hope this helps, > > > >> > StefanJ > > > >> > > > > >> > > > > >> > On 2024-02-15 21:53, Johannes Lichtenberger > wrote: > > > >> > > It's a laptop, I've attached some details. > > > >> > > > > > >> > > Furthermore, if it seems worth digging > > deeper into the > > > >> issue, the > > > >> > JSON > > > >> > > file is here for one week: > > > >> > > > > https://www.transfernow.net/dl/20240215j9NaPTc0 > > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!Oy8M_MVHordSb4P3bQK7FIsjr7OUWYsKxpJWW-5Hsq4QmU-5utLsNbRRM5kiyoHWE92dAWA7n38XpX68XvdBhd0HudjOaY-y$ > > > > > > > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!M33-mkbNfIFidOtIRYJLrdt970BIn5XjvmSvgBg0Ip6zkm5Zk7w6OG6FunFxjzDpUZju_f7FbEua8sTaS9Q3SnufNTFwuk6i$ > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!M33-mkbNfIFidOtIRYJLrdt970BIn5XjvmSvgBg0Ip6zkm5Zk7w6OG6FunFxjzDpUZju_f7FbEua8sTaS9Q3SnufNTFwuk6i$ > >> > > > >> > > > > > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybJynYpht$ > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybJynYpht$> > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybJynYpht$ > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybJynYpht$ > >>> > > > >> > > > > > > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeDXD5_-$ > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeDXD5_-$> > < > https://urldefense.com/v3/__https://www.transfernow.net/dl/20240215j9NaPTc0__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFeDXD5_-$ > < > 
> > > >> > >
> > > >> > > You'd have to unzip into bundles/sirix-core/src/test/resources/json,
> > > >> > > remove the @Disabled annotation and run the test
> > > >> > > JsonShredderTest::testChicagoDescendantAxis
> > > >> > >
> > > >> > > The test JVM parameters are specified in the parent build.gradle in
> > > >> > > the project root folder.
> > > >> > >
> > > >> > > The GitHub repo: https://github.com/sirixdb/sirix
> > > >> > >
> > > >> > > Screenshot from 2024-02-15 21-43-33.png
> > > >> > >
> > > >> > > kind regards
> > > >> > > Johannes
> > > >> > >
> > > >> > > Am Do., 15. Feb.
2024 um 20:01 Uhr schrieb > > Peter Booth > > > >> > > > > > > > > > > >> > > > >> > > > > > > > > >>> > > > >> > > > > > > > > > >> > > > >> > > > > > > > > >>>>>: > > > >> > > > > > >> > > Just curious - what CPU, physical memory > > and OS are > > > >> you using? > > > >> > > Sent from my iPhone > > > >> > > > > > >> > >> On Feb 15, 2024, at 12:23?PM, Johannes > > > Lichtenberger > > > >> > >> > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>> > > > >> > >> > > > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>>>> wrote: > > > >> > >> > > > >> > >> ? > > > >> > >> I guess I don't know which JDK it picks > > for the > > > >> tests, but I > > > >> > guess > > > >> > >> OpenJDK > > > >> > >> > > > >> > >> Johannes Lichtenberger > > > >> > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>> > > > >> > >> > > > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>>>> schrieb am Do., 15. > > > >> > >> Feb. 2024, 17:58: > > > >> > >> > > > >> > >> However, it's the same with: > ./gradlew > > > >> > >> > > > -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 > > > >> > >> :sirix-core:test --tests > > > >> > >> > > > >> > > > > > > > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis > using OpenJDK hopefully > > > >> > >> > > > >> > >> Am Do., 15. Feb. 2024 um 17:54 Uhr > > schrieb > > > Johannes > > > >> > >> Lichtenberger > > > > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>> > > > >> > >> > > > > > > > > > > > >> > > > > > >> > > > >> > > > > > > > > > > >> > > > > > >>>>>: > > > >> > >> > > > >> > >> I've attached two logs, the > > first one > > > without > > > >> > >> -XX:+Generational, the second > > one with the > > > >> option set, > > > >> > >> even though I also saw, that > > > generational ZGC is > > > >> > going to > > > >> > >> be supported in GraalVM 24.1 in > > > September... > > > >> so not sure > > > >> > >> what this does :) > > > >> > >> > > > >> > >> Am Do., 15. Feb. 2024 um 17:52 > > Uhr schrieb > > > >> Johannes > > > >> > >> Lichtenberger > > > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.osterlund at oracle.com Sat Feb 17 18:13:13 2024 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Sat, 17 Feb 2024 18:13:13 +0000 Subject: [External] : Re: Generational ZGC issue In-Reply-To: References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com> <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com> Message-ID: <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com> Could you check how much is user time vs system time? It smells like you are swapping. When swapping starts, performance goes out of the window. /Erik On 17 Feb 2024, at 16:21, Johannes Lichtenberger wrote: ? I'll check later on if the test doesn't fail with the 2g max and I'll have to check as well if it's still the OutOfMemoryError (as I'm not at home currently). But everytime everything freezes for a couple of seconds. In any case isn't it strange that with G1 and ZGC the runtime is very close to each other with G1 having the upper hand slightly, when switching to on heap ByteBuffers, but with Generational ZGC the runtime almost exactly doubles? 
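The on-heap switch mentioned above boils down to which Chronicle Bytes factory method backs the buffers. A minimal sketch of the two allocation modes, assuming the OpenHFT Bytes API (illustrative code only, not taken from SirixDB):

import net.openhft.chronicle.bytes.Bytes;
import java.nio.ByteBuffer;

public final class BytesAllocationModes {
    public static void main(String[] args) {
        // Backed by a DirectByteBuffer: native memory, counted against
        // -XX:MaxDirectMemorySize and only released after reference processing.
        Bytes<ByteBuffer> direct = Bytes.elasticByteBuffer(60_000);

        // Backed by a heap ByteBuffer: ordinary heap memory, reclaimed like any
        // other object; MaxDirectMemorySize plays no role here.
        Bytes<ByteBuffer> onHeap = Bytes.elasticHeapByteBuffer(60_000);

        direct.writeUtf8("off-heap");
        onHeap.writeUtf8("on-heap");

        direct.releaseLast(); // Chronicle Bytes are reference counted; release when done
        onHeap.releaseLast();
    }
}

Functionally the two behave the same; only the life cycle of the backing memory differs, which is presumably why the GC choice shows up so differently between the runs.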
I thought at some point the generational version should make the non generational obsolet and as almost every object dies young the generational ZGC should be better as you wrote!? Kind regards and have a nice weekend (and kind of feel sorry for bothering that much) Johannes Stefan Johansson > schrieb am Sa., 17. Feb. 2024, 15:54: Ok, when you say crashes, what do you mean. Are you still seeing the same OutOfMemoryError or are we talking about an actual JVM crash. Or is it the Linux OOM killer stepping in because of high memory pressure? If this is with the new setting of 5g for direct memory it could be that these 3 extra gigs of memory is pushing you over the limit for what can be handle by you laptop. Generally, if you start swapping, the performance is out the door and you need to look at the configuration. Maybe the 2g for direct memory is reasonable on this setup to avoid swapping. I looked a bit at the total memory usage for the process here and it seem to be around 20G. Stefan On 2024-02-17 12:24, Johannes Lichtenberger wrote: > So, switching back to DirectByteBuffers, and removing the disabling of > explicit GCs still crashes on my laptop (swapping + close to 32 Gb RAM > used)... > > kind regards > Johannes > > Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson > >>: > > > > On 2024-02-17 00:36, Johannes Lichtenberger wrote: > > I just removed "-XX+DisableExplizitGC", increased max direct > memory size > > to 5g (-XX:MaxDirectMemorySize=5g), but also changed > > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000); > > to use on heap ByteBuffers. > > > > Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize > has > no effect since the ByteBuffers will be stored on the heap. But if you > keep going with DirectByteBuffers, this might make sense to give some > more head room. > > > However, the performance seems to be way worse. I've repeated the > test > > several times, but with both G1 and non generational ZGC it's > ~50s for > > importing the JSON file in the first case vs ~100s using > generational > > ZGC, using Temurin 21.0.2 with similar values for the actual > traversals. > > > > Ok, sounds like using DirectByteBuffers is a performance win here. > If so > I would just continue testing using DirectByteBuffers and allowing > explicit GCs to ensure they are cleaned out properly. > > > From the log on STDOUT, I can see this (meaning 0,319s and > 0,440s... > > pause times?) > > > > No, with ZGC the time here is not the pause time, it's the time to > complete the whole GC. ZGC is a concurrent GC, meaning that most of the > GC work is done concurrently with the Java application still running. > There are still a some very short pauses, all way below 1ms. 
You can > see > them if you look at the detailed log: > > [30,938s][info][gc ] GC(3) Minor Collection (Allocation Rate) > [30,938s][info][gc,phases ] GC(3) y: Young Generation > [30,938s][info][gc,phases ] GC(3) y: Pause Mark Start 0,060ms > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark 383,563ms > [31,322s][info][gc,phases ] GC(3) y: Pause Mark End 0,046ms > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark Free 0,009ms > [31,322s][info][gc,phases ] GC(3) y: Concurrent Reset Relocation Set > 0,201ms > [31,335s][info][gc,phases ] GC(3) y: Concurrent Select Relocation Set > 13,228ms > [31,335s][info][gc,phases ] GC(3) y: Pause Relocate Start 0,019ms > [31,381s][info][gc,phases ] GC(3) y: Concurrent Relocate 45,967ms > [31,382s][info][gc,phases ] GC(3) y: Young Generation > 9726M(63%)->518M(3%) 0,444s > [31,382s][info][gc ] GC(3) Minor Collection (Allocation Rate) > 9726M(63%)->518M(3%) 0,444s > > Here I included the phase-logs for a single GC of the young generation, > where you can clearly see how much time was spent in which part of the > GC and as you can see the three pauses are all very very short. > > Stefan > > > [35,718s][info ][gc ] GC(9) Minor Collection (Allocation > Rate) > > 12462M(81%)->1556M(10%) 0,319s > > [40,871s][info ][gc ] GC(10) Minor Collection (Allocation > Rate) > > [41,311s][info ][gc ] GC(10) Minor Collection (Allocation > Rate) > > 13088M(85%)->1432M(9%) 0,440s > > [46,236s][info ][gc ] GC(11) Minor Collection (Allocation > Rate) > > [46,603s][info ][gc ] GC(11) Minor Collection (Allocation > Rate) > > 12406M(81%)->1676M(11%) 0,367s > > [51,445s][info ][gc ] GC(12) Minor Collection (Allocation > Rate) > > [51,846s][info ][gc ] GC(12) Minor Collection (Allocation > Rate) > > 12848M(84%)->1556M(10%) 0,401s > > [56,203s][info ][gc ] GC(13) Major Collection (Proactive) > > [56,368s][info ][gc ] GC(13) Major Collection (Proactive) > > 11684M(76%)->484M(3%) 0,166s > > > > kind regards > > Johannes > > > > Am Fr., 16. Feb. 2024 um 22:39 Uhr schrieb Erik Osterlund > > > > >>>: > > > > It?s worth noting that when using ZGC, calling System.gc does not > > invoke a classic disastrously long GC pause. Instead, a > concurrent > > GC is triggered, which should be not that noticeable to the > > application. The thread calling System.gc is blocked until > the GC is > > done, but the other threads can run freely. > > > > /Erik > > > > > On 16 Feb 2024, at 21:55, Stefan Johansson > > > > > > >>> > > wrote: > > > > > > ? > > > > > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: > > >> Thanks a lot, I wasn't even aware of the fact, that > > DirectByteBuffers use System.gc() and I always had in mind that > > calling System.gc() at least in application code is bad > practice (or > > at least we shouldn't rely on it) and I think I read somewhere a > > while ago, that it's recommended to even disable this, but may be > > completely wrong, of course. > > > In most cases callling System.gc() is bad practice, in some > > special cases it might be needed. > > > > > >> I'll change it to on-heap byte buffers tomorrow :-) > > >> I think your GC log entries were from G1, right? It seems ZGC > > always tries to use the full heap :-) > > > > > > Yes, the snippet was G1, it was mostly to show that the > pressure > > isn't high. You are correct that ZGC uses more of the given > heap but > > the collections are pretty far apart and I'm certian it would > > function well with a smaller heap as well. Maybe in that case > some > > Major collections would be triggered. 
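The cycle-versus-pause distinction can also be checked without reading logs: the JVM registers one GarbageCollectorMXBean per collector and, as far as I know, generational ZGC exposes separate beans for the concurrent minor/major cycles and for the sub-millisecond pauses. A small illustrative sketch (not from this thread):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public final class PrintGcTimes {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // For ZGC the "Cycles" beans report whole concurrent collections,
            // while the "Pauses" beans report only the stop-the-world parts.
            System.out.printf("%s: count=%d, accumulated time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}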
> > > > > >> Kind regards and thanks for sharing your insights. > > > > > > No problem. We appriciate the feedback, > > > StefanJ > > > > > >> Have a nice weekend as well > > >> Johannes > > >> Stefan Johansson > > > > > >> > > > > > > > >>>> schrieb am Fr., 16. Feb. > > 2024, 17:38: > > >> Hi, > > >> Some comments inline. > > >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: > > >> > Thanks a lot for looking into it, I've added > > >> > `-XX:MaxDirectMemorySize=2g` only recently, but > without it > > failed as > > >> > well, so not sure what the default is. Will definitely > > check your > > >> > suggestions :-) > > >> > > > >> If you don't set a limit it will be set to: > > >> Runtime.getRuntime().maxMemory() > > >> So likely a good idea to set a reasonable limit, but the > > smaller the > > >> limit is the more frequent we need to run reference > > processing to allow > > >> memory to be freed up. > > >> > Sadly I'm currently working alone on the project in my > > spare time > > >> > (besides professionally switched from Java/Kotlin > stuff to the > > >> embedded > > >> > software world) and I'm not sure if the current > > architecture of > > >> Sirix is > > >> > limited by too much GC pressure. I'd probably have > to check > > >> Cassandra at > > >> > some point and look into flame graphs and stuff for > their > > >> integration > > >> > tests, but maybe you can give some general > insights/advice... > > >> > > > >> > Yesterday evening I switched to other JDKs (also I > want to > > test with > > >> > Shenandoah in particular), but I think especially the > > better escape > > >> > analysis of the GraalVM is a huge plus in the case of > > SirixDB (for > > >> > insertion on my laptop it's ~90s vs ~60s), but I > think it > > should be > > >> > faster and currently my suspicion is that garbage > is a major > > >> performance > > >> > issue. > > >> > > > >> > Maybe the GC pressure in general is a major issue, > as in > > the CPU > > >> Flame > > >> > graph IIRC the G1 had about 20% stack frames > allocated and non > > >> > generational ZGC even around 40% taking all threads > into > > account. > > >> > > > >> From what I/we see, the GC pressure in the given test is > > not high. > > >> The > > >> allocation rate is below 1GB/s and since most of it > die young > > the GCs > > >> are fairly cheap. In this log snippet G1 shows a GC > every 5s > > and the > > >> pause time is below 50ms: > > >> [296,016s][info ][gc ] GC(90) Pause Young > (Normal) (G1 > > >> Evacuation > > >> Pause) 5413M->1849M(6456M) 35,577ms > > >> [301,103s][info ][gc ] GC(91) Pause Young > (Normal) (G1 > > >> Evacuation > > >> Pause) 5417M->1848M(6456M) 33,357ms > > >> [306,041s][info ][gc ] GC(92) Pause Young > (Normal) (G1 > > >> Evacuation > > >> Pause) 5416M->1848M(6456M) 32,763ms > > >> [310,849s][info ][gc ] GC(93) Pause Young > (Normal) (G1 > > >> Evacuation > > >> Pause) 5416M->1847M(6456M) 33,086ms > > >> I also see that the heap never expands to more the > ~6.5GB even > > >> though it > > >> is allow to be 15GB and this also suggest that the GC > is not > > under much > > >> pressure. As I said in the previous mail, the reason > > Generational ZGC > > >> don't free up the direct memory without the > System.gc() calls > > is that > > >> the GC pressure is not high enough to trigger any Major > > cycles. So I > > >> would strongly recommend you to not run with > > -XX+DisableExplicitGC > > >> unless you really have to. 
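To make that interaction concrete, here is a minimal self-contained sketch (hypothetical, not the SirixDB workload) of why DirectByteBuffers and -XX:+DisableExplicitGC don't mix: the JDK's direct-memory bookkeeping falls back on System.gc() when the MaxDirectMemorySize limit is hit, and disabling explicit GCs switches that safety valve off.

import java.nio.ByteBuffer;

// Run with a small limit, e.g. -XX:MaxDirectMemorySize=64m: this should pass,
// because the JDK-internal System.gc() lets reference processing free the old
// buffers. Adding -XX:+DisableExplicitGC typically makes it fail with
// "OutOfMemoryError: Direct buffer memory" unless a GC happens to run anyway.
public final class DirectBufferPressure {
    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            ByteBuffer buffer = ByteBuffer.allocateDirect(8 * 1024 * 1024); // 8 MiB of native memory
            buffer.putLong(0, i); // touch the buffer so the allocation is really used
            // The buffer becomes unreachable here, but its native memory is only
            // returned once reference processing has run for it.
        }
        System.out.println("done");
    }
}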
Since you are using > > DirectByteBuffers and > > >> they use System.gc() to help free memory when the limit is > > reached. > > >> > So in general I'm thinking about backing the > > KeyValueLeafPages with > > >> > MemorySegments, but I think due to variable sized pages > > it's getting > > >> > tricky, plus I currently don't have the time for > changing > > >> fundamental > > >> > stuff and I'm even not sure if it'll bring a > performance > > boost, as I > > >> > have to adapt neighbour relationships of the nodes > often and > > >> off-heap > > >> > memory access might be slightly worse performance wise. > > >> > > > >> > What do you think? > > >> > > > >> I know to little about the application to be able to give > > advice here, > > >> but I would first start with having most memory on > heap. Only > > large > > >> long > > >> lived stuff off-heap, if really needed. Looking at the > test > > at hand, it > > >> really doesn't look like it is long lived stuff that is > > placed off heap. > > >> > I've attached a memory flame graph and there it > seems the > > byte array > > >> > from deserializing each page is prominent, but that > might be > > >> something I > > >> > can't even avoid (after decompression via Snappy or via > > another > > >> lib and > > >> > maybe also decryption in the future). > > >> > > > >> > As of now G1 with GraalVM seems to perform best (a > little > > bit better > > >> > than with non generational ZGC, but I thought ZGC > or maybe > > >> Shenandoah > > >> > would improve the situation). But as said I may have to > > generate way > > >> > less garbage after all in general for good > performance!? > > >> > > > >> > All in all maybe due to most objects die young > maybe also the > > >> > generational GCs are not needed (that said if > enough RAM is > > >> available > > >> > and the Caffeine Caches are sized accordingly most > objects may > > >> die old). > > >> > But apparently the byte arrays holding the page > data still die > > >> young (in > > >> > AbstractReader::deserialize). In fact I'm not even sure > > why they > > >> escape, > > >> > but currently I'm on my phone. > > >> > > > >> It's when most objects die young the Generational GC > really > > shines, > > >> because it can handle the short lived objects without > having > > to look at > > >> the long lived objects. So I would say Generational > ZGC is a > > good fit > > >> here, but we need to let the System.gc() run to allow > reference > > >> processing or slightly re-design and use HeapByteBuffers. > > >> Have a nice weekend, > > >> Stefan > > >> > Kind regards > > >> > Johannes > > >> > > > >> > Stefan Johansson > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>>> schrieb am Fr., 16. Feb. > > >> 2024, 13:43: > > >> > > > >> > Hi Johannes, > > >> > > > >> > We've spent some more time looking at this and > getting the > > >> json-file to > > >> > reproduced it made it easy to verify our > suspicions. > > Thanks for > > >> > uploading it. > > >> > > > >> > There are a few things playing together here. > The test is > > >> making quite > > >> > heavy use of DirectByteBuffers and you limit > the usage > > to 2G > > >> > (-XX:MaxDirectMemorySize=2g). The life cycle and > > freeing of > > >> the native > > >> > memory part of the DirectByteBuffer rely on > reference > > >> processing. 
In > > >> > generational ZGC reference processing is only done > > during Major > > >> > collections and since the general GC preassure > in this > > >> benchmark is > > >> > very > > >> > low (most objects die young), we do not trigger > that > > many Major > > >> > collections. > > >> > > > >> > Normaly this would not be a problem. To avoid > throwing > > an out > > >> of memory > > >> > error (due to hitting the direct buffer memory > limit) too > > >> early the JDK > > >> > triggers a System.gc(). This should trigger > reference > > >> procesing and all > > >> > buffers that are no longer in use would be freed. > > Since you > > >> specify the > > >> > option -XX:+DisableExplicitGC all these calls to > > trigger GCs are > > >> > ignored > > >> > and no direct memory will be freed. So in our > testing, > > just > > >> removing > > >> > this flags makes the test pass. > > >> > > > >> > Another solution is to look at using > HeapByteBuffers > > instead > > >> and don't > > >> > have to worry about the direct memory usage. The > > OpenHFT lib > > >> seems to > > >> > have support for this by just using > > >> elasticHeapByteBuffer(...) instead > > >> > of elasticByteBuffer(). > > >> > > > >> > Lastly, the reason for this working with > > non-generational ZGC is > > >> > that it > > >> > does reference processing for every GC. > > >> > > > >> > Hope this helps, > > >> > StefanJ > > >> > > > >> > > > >> > On 2024-02-15 21:53, Johannes Lichtenberger wrote: > > >> > > It's a laptop, I've attached some details. > > >> > > > > >> > > Furthermore, if it seems worth digging > deeper into the > > >> issue, the > > >> > JSON > > >> > > file is here for one week: > > >> > > > https://www.transfernow.net/dl/20240215j9NaPTc0 > > > > > > > >> > > > >> > > >> > > > > > >>> > > >> > > > > > > > > > > >> > > > >> > > >> > > > > > >>>> > > >> > > > > >> > > You'd have to unzip into > > >> bundles/sirix-core/src/test/resources/json, > > >> > > remove the @Disabled annotation and run the test > > >> > > JsonShredderTest::testChicagoDescendantAxis > > >> > > > > >> > > The test JVM parameters are specified in the > parent > > >> build.gradle > > >> > in the > > >> > > project root folder. > > >> > > > > >> > > The GitHub repo: > https://github.com/sirixdb/sirix > > > > > > > >> > > > >> > > >> > > > > > >>> > > >> > > > > > > > > > >> > > > >> > > >> > > > > > >>>> > > >> > > > > >> > > Screenshot from 2024-02-15 21-43-33.png > > >> > > > > >> > > kind regards > > >> > > Johannes > > >> > > > > >> > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb > Peter Booth > > >> > > > > > >> > > > > >>> > > >> > > >> > > > > >>>> > > >> > > > > >> > > > > >>> > > >> > > >> > > > > >>>>>>: > > >> > > > > >> > > Just curious - what CPU, physical memory > and OS are > > >> you using? > > >> > > Sent from my iPhone > > >> > > > > >> > >> On Feb 15, 2024, at 12:23?PM, Johannes > > Lichtenberger > > >> > >> > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>> > > >> > >> > > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>>>> wrote: > > >> > >> > > >> > >> ? > > >> > >> I guess I don't know which JDK it picks > for the > > >> tests, but I > > >> > guess > > >> > >> OpenJDK > > >> > >> > > >> > >> Johannes Lichtenberger > > >> > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>> > > >> > >> > > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>>>> schrieb am Do., 15. > > >> > >> Feb. 
2024, 17:58: > > >> > >> > > >> > >> However, it's the same with: ./gradlew > > >> > >> > > -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 > > >> > >> :sirix-core:test --tests > > >> > >> > > >> > > > > io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis using OpenJDK hopefully > > >> > >> > > >> > >> Am Do., 15. Feb. 2024 um 17:54 Uhr > schrieb > > Johannes > > >> > >> Lichtenberger > > > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>> > > >> > >> > > > > > > >> > > >> > > > > > >>> > > >> > > > > > > >> > > >> > > > > > >>>>>>: > > >> > >> > > >> > >> I've attached two logs, the > first one > > without > > >> > >> -XX:+Generational, the second > one with the > > >> option set, > > >> > >> even though I also saw, that > > generational ZGC is > > >> > going to > > >> > >> be supported in GraalVM 24.1 in > > September... > > >> so not sure > > >> > >> what this does :) > > >> > >> > > >> > >> Am Do., 15. Feb. 2024 um 17:52 > Uhr schrieb > > >> Johannes > > >> > >> Lichtenberger > > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From lichtenberger.johannes at gmail.com Sat Feb 17 20:46:54 2024 From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger) Date: Sat, 17 Feb 2024 21:46:54 +0100 Subject: [External] : Re: Generational ZGC issue In-Reply-To: <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com> References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com> <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com> <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com> <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com> <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com> Message-ID: With setting the max direct memory to 2Gb (-XX:MaxDirectMemorySize=2g) and using DirectByteBuffers under the hood using Chronicle Bytes, the test runs on my laptop, but as said way slower than with non generational ZGC or G1 (and with G1 it's even fastest). real 4m58,836s user 0m1,849s sys 0m0,297s It caps at around 20Gb RAM usage and I think swapping somehow occurred at some point (452MB of 2GB). Before, setting MaxDIrectMemorySize to 5g (too high), the memory usage (watching `htop`) went to around 25Gb before heavy swapping occurred and the test failed at some point... kind regards Johannes Am Sa., 17. Feb. 2024 um 19:13 Uhr schrieb Erik Osterlund < erik.osterlund at oracle.com>: > Could you check how much is user time vs system time? It smells like you > are swapping. When swapping starts, performance goes out of the window. > > /Erik > > On 17 Feb 2024, at 16:21, Johannes Lichtenberger < > lichtenberger.johannes at gmail.com> wrote: > > ? > I'll check later on if the test doesn't fail with the 2g max and I'll have > to check as well if it's still the OutOfMemoryError (as I'm not at home > currently). But everytime everything freezes for a couple of seconds. > > In any case isn't it strange that with G1 and ZGC the runtime is very > close to each other with G1 having the upper hand slightly, when switching > to on heap ByteBuffers, but with Generational ZGC the runtime almost > exactly doubles? I thought at some point the generational version should > make the non generational obsolet and as almost every object dies young the > generational ZGC should be better as you wrote!? > > Kind regards and have a nice weekend (and kind of feel sorry for bothering > that much) > Johannes > > Stefan Johansson schrieb am Sa., 17. Feb. > 2024, 15:54: > >> Ok, when you say crashes, what do you mean. 
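On the user-versus-system-time question: if the Gradle daemon is in use, `time ./gradlew ...` mostly measures the client process, so the tiny user/sys numbers above probably say little about the forked test JVM doing the actual work. A hedged sketch (hypothetical helper, not part of the test suite) for measuring the gap between wall-clock time and the CPU time of the JVM itself; a long wall time with little CPU burned points at swapping or other off-CPU stalls rather than GC throughput:

import java.lang.management.ManagementFactory;

public final class CpuVsWallTime {
    public static void main(String[] args) throws Exception {
        var osBean = (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();

        long wallStart = System.nanoTime();
        long cpuStart = osBean.getProcessCpuTime(); // CPU nanoseconds used by this JVM so far

        runWorkload(); // placeholder for the shredding/traversal under test

        double wallSeconds = (System.nanoTime() - wallStart) / 1e9;
        double cpuSeconds = (osBean.getProcessCpuTime() - cpuStart) / 1e9;
        System.out.printf("wall=%.1fs cpu=%.1fs%n", wallSeconds, cpuSeconds);
    }

    private static void runWorkload() throws Exception {
        Thread.sleep(1_000); // stand-in for the real work
    }
}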
Are you still seeing the >> same OutOfMemoryError or are we talking about an actual JVM crash. Or is >> it the Linux OOM killer stepping in because of high memory pressure? >> >> If this is with the new setting of 5g for direct memory it could be that >> these 3 extra gigs of memory is pushing you over the limit for what can >> be handle by you laptop. Generally, if you start swapping, the >> performance is out the door and you need to look at the configuration. >> Maybe the 2g for direct memory is reasonable on this setup to avoid >> swapping. I looked a bit at the total memory usage for the process here >> and it seem to be around 20G. >> >> Stefan >> >> On 2024-02-17 12:24, Johannes Lichtenberger wrote: >> > So, switching back to DirectByteBuffers, and removing the disabling of >> > explicit GCs still crashes on my laptop (swapping + close to 32 Gb RAM >> > used)... >> > >> > kind regards >> > Johannes >> > >> > Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson >> > >: >> > >> > >> > >> > On 2024-02-17 00:36, Johannes Lichtenberger wrote: >> > > I just removed "-XX+DisableExplizitGC", increased max direct >> > memory size >> > > to 5g (-XX:MaxDirectMemorySize=5g), but also changed >> > > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000); >> > > to use on heap ByteBuffers. >> > > >> > >> > Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize >> > has >> > no effect since the ByteBuffers will be stored on the heap. But if >> you >> > keep going with DirectByteBuffers, this might make sense to give >> some >> > more head room. >> > >> > > However, the performance seems to be way worse. I've repeated the >> > test >> > > several times, but with both G1 and non generational ZGC it's >> > ~50s for >> > > importing the JSON file in the first case vs ~100s using >> > generational >> > > ZGC, using Temurin 21.0.2 with similar values for the actual >> > traversals. >> > > >> > >> > Ok, sounds like using DirectByteBuffers is a performance win here. >> > If so >> > I would just continue testing using DirectByteBuffers and allowing >> > explicit GCs to ensure they are cleaned out properly. >> > >> > > From the log on STDOUT, I can see this (meaning 0,319s and >> > 0,440s... >> > > pause times?) >> > > >> > >> > No, with ZGC the time here is not the pause time, it's the time to >> > complete the whole GC. ZGC is a concurrent GC, meaning that most of >> the >> > GC work is done concurrently with the Java application still >> running. >> > There are still a some very short pauses, all way below 1ms. 
You can >> > see >> > them if you look at the detailed log: >> > >> > [30,938s][info][gc ] GC(3) Minor Collection (Allocation >> Rate) >> > [30,938s][info][gc,phases ] GC(3) y: Young Generation >> > [30,938s][info][gc,phases ] GC(3) y: Pause Mark Start 0,060ms >> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark 383,563ms >> > [31,322s][info][gc,phases ] GC(3) y: Pause Mark End 0,046ms >> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark Free 0,009ms >> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Reset Relocation >> Set >> > 0,201ms >> > [31,335s][info][gc,phases ] GC(3) y: Concurrent Select Relocation >> Set >> > 13,228ms >> > [31,335s][info][gc,phases ] GC(3) y: Pause Relocate Start 0,019ms >> > [31,381s][info][gc,phases ] GC(3) y: Concurrent Relocate 45,967ms >> > [31,382s][info][gc,phases ] GC(3) y: Young Generation >> > 9726M(63%)->518M(3%) 0,444s >> > [31,382s][info][gc ] GC(3) Minor Collection (Allocation >> Rate) >> > 9726M(63%)->518M(3%) 0,444s >> > >> > Here I included the phase-logs for a single GC of the young >> generation, >> > where you can clearly see how much time was spent in which part of >> the >> > GC and as you can see the three pauses are all very very short. >> > >> > Stefan >> > >> > > [35,718s][info ][gc ] GC(9) Minor Collection (Allocation >> > Rate) >> > > 12462M(81%)->1556M(10%) 0,319s >> > > [40,871s][info ][gc ] GC(10) Minor Collection (Allocation >> > Rate) >> > > [41,311s][info ][gc ] GC(10) Minor Collection (Allocation >> > Rate) >> > > 13088M(85%)->1432M(9%) 0,440s >> > > [46,236s][info ][gc ] GC(11) Minor Collection (Allocation >> > Rate) >> > > [46,603s][info ][gc ] GC(11) Minor Collection (Allocation >> > Rate) >> > > 12406M(81%)->1676M(11%) 0,367s >> > > [51,445s][info ][gc ] GC(12) Minor Collection (Allocation >> > Rate) >> > > [51,846s][info ][gc ] GC(12) Minor Collection (Allocation >> > Rate) >> > > 12848M(84%)->1556M(10%) 0,401s >> > > [56,203s][info ][gc ] GC(13) Major Collection (Proactive) >> > > [56,368s][info ][gc ] GC(13) Major Collection (Proactive) >> > > 11684M(76%)->484M(3%) 0,166s >> > > >> > > kind regards >> > > Johannes >> > > >> > > Am Fr., 16. Feb. 2024 um 22:39 Uhr schrieb Erik Osterlund >> > > >> > > >>>: >> > > >> > > It?s worth noting that when using ZGC, calling System.gc >> does not >> > > invoke a classic disastrously long GC pause. Instead, a >> > concurrent >> > > GC is triggered, which should be not that noticeable to the >> > > application. The thread calling System.gc is blocked until >> > the GC is >> > > done, but the other threads can run freely. >> > > >> > > /Erik >> > > >> > > > On 16 Feb 2024, at 21:55, Stefan Johansson >> > > > > >> > > > >> >> > > wrote: >> > > > >> > > > ? >> > > > >> > > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: >> > > >> Thanks a lot, I wasn't even aware of the fact, that >> > > DirectByteBuffers use System.gc() and I always had in mind >> that >> > > calling System.gc() at least in application code is bad >> > practice (or >> > > at least we shouldn't rely on it) and I think I read >> somewhere a >> > > while ago, that it's recommended to even disable this, but >> may be >> > > completely wrong, of course. >> > > > In most cases callling System.gc() is bad practice, in >> some >> > > special cases it might be needed. >> > > > >> > > >> I'll change it to on-heap byte buffers tomorrow :-) >> > > >> I think your GC log entries were from G1, right? 
It >> seems ZGC >> > > always tries to use the full heap :-) >> > > > >> > > > Yes, the snippet was G1, it was mostly to show that the >> > pressure >> > > isn't high. You are correct that ZGC uses more of the given >> > heap but >> > > the collections are pretty far apart and I'm certian it would >> > > function well with a smaller heap as well. Maybe in that case >> > some >> > > Major collections would be triggered. >> > > > >> > > >> Kind regards and thanks for sharing your insights. >> > > > >> > > > No problem. We appriciate the feedback, >> > > > StefanJ >> > > > >> > > >> Have a nice weekend as well >> > > >> Johannes >> > > >> Stefan Johansson > > >> > > > > > >> > > > > >> > > > > >>> schrieb am Fr., 16. Feb. >> > > 2024, 17:38: >> > > >> Hi, >> > > >> Some comments inline. >> > > >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: >> > > >> > Thanks a lot for looking into it, I've added >> > > >> > `-XX:MaxDirectMemorySize=2g` only recently, but >> > without it >> > > failed as >> > > >> > well, so not sure what the default is. Will >> definitely >> > > check your >> > > >> > suggestions :-) >> > > >> > >> > > >> If you don't set a limit it will be set to: >> > > >> Runtime.getRuntime().maxMemory() >> > > >> So likely a good idea to set a reasonable limit, but >> the >> > > smaller the >> > > >> limit is the more frequent we need to run reference >> > > processing to allow >> > > >> memory to be freed up. >> > > >> > Sadly I'm currently working alone on the project >> in my >> > > spare time >> > > >> > (besides professionally switched from Java/Kotlin >> > stuff to the >> > > >> embedded >> > > >> > software world) and I'm not sure if the current >> > > architecture of >> > > >> Sirix is >> > > >> > limited by too much GC pressure. I'd probably have >> > to check >> > > >> Cassandra at >> > > >> > some point and look into flame graphs and stuff for >> > their >> > > >> integration >> > > >> > tests, but maybe you can give some general >> > insights/advice... >> > > >> > >> > > >> > Yesterday evening I switched to other JDKs (also I >> > want to >> > > test with >> > > >> > Shenandoah in particular), but I think especially >> the >> > > better escape >> > > >> > analysis of the GraalVM is a huge plus in the case >> of >> > > SirixDB (for >> > > >> > insertion on my laptop it's ~90s vs ~60s), but I >> > think it >> > > should be >> > > >> > faster and currently my suspicion is that garbage >> > is a major >> > > >> performance >> > > >> > issue. >> > > >> > >> > > >> > Maybe the GC pressure in general is a major issue, >> > as in >> > > the CPU >> > > >> Flame >> > > >> > graph IIRC the G1 had about 20% stack frames >> > allocated and non >> > > >> > generational ZGC even around 40% taking all threads >> > into >> > > account. >> > > >> > >> > > >> From what I/we see, the GC pressure in the given >> test is >> > > not high. >> > > >> The >> > > >> allocation rate is below 1GB/s and since most of it >> > die young >> > > the GCs >> > > >> are fairly cheap. 
In this log snippet G1 shows a GC >> > every 5s >> > > and the >> > > >> pause time is below 50ms: >> > > >> [296,016s][info ][gc ] GC(90) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5413M->1849M(6456M) 35,577ms >> > > >> [301,103s][info ][gc ] GC(91) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5417M->1848M(6456M) 33,357ms >> > > >> [306,041s][info ][gc ] GC(92) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5416M->1848M(6456M) 32,763ms >> > > >> [310,849s][info ][gc ] GC(93) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5416M->1847M(6456M) 33,086ms >> > > >> I also see that the heap never expands to more the >> > ~6.5GB even >> > > >> though it >> > > >> is allow to be 15GB and this also suggest that the GC >> > is not >> > > under much >> > > >> pressure. As I said in the previous mail, the reason >> > > Generational ZGC >> > > >> don't free up the direct memory without the >> > System.gc() calls >> > > is that >> > > >> the GC pressure is not high enough to trigger any >> Major >> > > cycles. So I >> > > >> would strongly recommend you to not run with >> > > -XX+DisableExplicitGC >> > > >> unless you really have to. Since you are using >> > > DirectByteBuffers and >> > > >> they use System.gc() to help free memory when the >> limit is >> > > reached. >> > > >> > So in general I'm thinking about backing the >> > > KeyValueLeafPages with >> > > >> > MemorySegments, but I think due to variable sized >> pages >> > > it's getting >> > > >> > tricky, plus I currently don't have the time for >> > changing >> > > >> fundamental >> > > >> > stuff and I'm even not sure if it'll bring a >> > performance >> > > boost, as I >> > > >> > have to adapt neighbour relationships of the nodes >> > often and >> > > >> off-heap >> > > >> > memory access might be slightly worse performance >> wise. >> > > >> > >> > > >> > What do you think? >> > > >> > >> > > >> I know to little about the application to be able to >> give >> > > advice here, >> > > >> but I would first start with having most memory on >> > heap. Only >> > > large >> > > >> long >> > > >> lived stuff off-heap, if really needed. Looking at the >> > test >> > > at hand, it >> > > >> really doesn't look like it is long lived stuff that >> is >> > > placed off heap. >> > > >> > I've attached a memory flame graph and there it >> > seems the >> > > byte array >> > > >> > from deserializing each page is prominent, but that >> > might be >> > > >> something I >> > > >> > can't even avoid (after decompression via Snappy >> or via >> > > another >> > > >> lib and >> > > >> > maybe also decryption in the future). >> > > >> > >> > > >> > As of now G1 with GraalVM seems to perform best (a >> > little >> > > bit better >> > > >> > than with non generational ZGC, but I thought ZGC >> > or maybe >> > > >> Shenandoah >> > > >> > would improve the situation). But as said I may >> have to >> > > generate way >> > > >> > less garbage after all in general for good >> > performance!? >> > > >> > >> > > >> > All in all maybe due to most objects die young >> > maybe also the >> > > >> > generational GCs are not needed (that said if >> > enough RAM is >> > > >> available >> > > >> > and the Caffeine Caches are sized accordingly most >> > objects may >> > > >> die old). >> > > >> > But apparently the byte arrays holding the page >> > data still die >> > > >> young (in >> > > >> > AbstractReader::deserialize). 
In fact I'm not even >> sure >> > > why they >> > > >> escape, >> > > >> > but currently I'm on my phone. >> > > >> > >> > > >> It's when most objects die young the Generational GC >> > really >> > > shines, >> > > >> because it can handle the short lived objects without >> > having >> > > to look at >> > > >> the long lived objects. So I would say Generational >> > ZGC is a >> > > good fit >> > > >> here, but we need to let the System.gc() run to allow >> > reference >> > > >> processing or slightly re-design and use >> HeapByteBuffers. >> > > >> Have a nice weekend, >> > > >> Stefan >> > > >> > Kind regards >> > > >> > Johannes >> > > >> > >> > > >> > Stefan Johansson > > >> > > > > > >> > > >> > > >> > > > > >> >> > > >> > > > >> > > > > > >> > > >> > > >> > > > > >>>> schrieb am Fr., 16. Feb. >> > > >> 2024, 13:43: >> > > >> > >> > > >> > Hi Johannes, >> > > >> > >> > > >> > We've spent some more time looking at this and >> > getting the >> > > >> json-file to >> > > >> > reproduced it made it easy to verify our >> > suspicions. >> > > Thanks for >> > > >> > uploading it. >> > > >> > >> > > >> > There are a few things playing together here. >> > The test is >> > > >> making quite >> > > >> > heavy use of DirectByteBuffers and you limit >> > the usage >> > > to 2G >> > > >> > (-XX:MaxDirectMemorySize=2g). The life cycle >> and >> > > freeing of >> > > >> the native >> > > >> > memory part of the DirectByteBuffer rely on >> > reference >> > > >> processing. In >> > > >> > generational ZGC reference processing is only >> done >> > > during Major >> > > >> > collections and since the general GC preassure >> > in this >> > > >> benchmark is >> > > >> > very >> > > >> > low (most objects die young), we do not trigger >> > that >> > > many Major >> > > >> > collections. >> > > >> > >> > > >> > Normaly this would not be a problem. To avoid >> > throwing >> > > an out >> > > >> of memory >> > > >> > error (due to hitting the direct buffer memory >> > limit) too >> > > >> early the JDK >> > > >> > triggers a System.gc(). This should trigger >> > reference >> > > >> procesing and all >> > > >> > buffers that are no longer in use would be >> freed. >> > > Since you >> > > >> specify the >> > > >> > option -XX:+DisableExplicitGC all these calls >> to >> > > trigger GCs are >> > > >> > ignored >> > > >> > and no direct memory will be freed. So in our >> > testing, >> > > just >> > > >> removing >> > > >> > this flags makes the test pass. >> > > >> > >> > > >> > Another solution is to look at using >> > HeapByteBuffers >> > > instead >> > > >> and don't >> > > >> > have to worry about the direct memory usage. >> The >> > > OpenHFT lib >> > > >> seems to >> > > >> > have support for this by just using >> > > >> elasticHeapByteBuffer(...) instead >> > > >> > of elasticByteBuffer(). >> > > >> > >> > > >> > Lastly, the reason for this working with >> > > non-generational ZGC is >> > > >> > that it >> > > >> > does reference processing for every GC. >> > > >> > >> > > >> > Hope this helps, >> > > >> > StefanJ >> > > >> > >> > > >> > >> > > >> > On 2024-02-15 21:53, Johannes Lichtenberger >> wrote: >> > > >> > > It's a laptop, I've attached some details. 
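On the earlier idea of backing the variable-sized KeyValueLeafPages with MemorySegments: a purely hypothetical sketch of what deterministic off-heap page storage could look like (invented names, not SirixDB code; the foreign memory API is still a preview feature in JDK 21). The point is that memory owned by an Arena is freed when the arena is closed, with no dependency on GC reference processing:

import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

final class OffHeapPageStore implements AutoCloseable {
    private final Arena arena = Arena.ofConfined();

    // Variable-sized pages are not a problem here: each segment is sized to its page.
    MemorySegment storePage(byte[] serializedPage) {
        MemorySegment segment = arena.allocate(serializedPage.length);
        MemorySegment.copy(MemorySegment.ofArray(serializedPage), 0, segment, 0, serializedPage.length);
        return segment;
    }

    @Override
    public void close() {
        arena.close(); // frees every segment allocated from this arena at once
    }
}

Whether this actually helps depends on how often neighbour nodes are touched, as noted above; off-heap access can be slower than plain heap objects.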
>> > > >> > >
>> > > >> > > Furthermore, if it seems worth digging deeper into the issue, the
>> > > >> > > JSON file is here for one week:
>> > > >> > > https://www.transfernow.net/dl/20240215j9NaPTc0
>> > > >> > >
>> > > >> > > You'd have to unzip into bundles/sirix-core/src/test/resources/json,
>> > > >> > > remove the @Disabled annotation and run the test
>> > > >> > > JsonShredderTest::testChicagoDescendantAxis
>> > > >> > >
>> > > >> > > The test JVM parameters are specified in the parent build.gradle in
>> > > >> > > the project root folder.
>> > > >> > >
>> > > >> > > The GitHub repo: https://github.com/sirixdb/sirix
https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!M33-mkbNfIFidOtIRYJLrdt970BIn5XjvmSvgBg0Ip6zkm5Zk7w6OG6FunFxjzDpUZju_f7FbEua8sTaS9Q3SnufNbw9BBhL$ >> >> >> > > >> >> > > >> > < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybALU2RDy$ >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybALU2RDy$> >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybALU2RDy$ >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!O5j6Ri-Ostqq68q1zm71CEhSQ4CE7DqBfHZNq7cDAhU7b7CwqrnIA-ddZFaQDbOMAkgHkFriNeIrXJdRofVuv1UybALU2RDy$ >> >>> >> > > >> > >> > > >> > < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$> >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$>> >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$> >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ >> < >> https://urldefense.com/v3/__https://github.com/sirixdb/sirix__;!!ACWV5N9M2RV99hQ!MWZDuvCBsbZSYul-HLDtF_j1IBD6osBF4cBVE_bg0yM5zCqYFwzLLp7nKN3b1hq1XVFRreqUVaXiKuXjUwGbxpjjFUUPdeUD$ >> >>>>> >> > > >> > > >> > > >> > > Screenshot from 2024-02-15 21-43-33.png >> > > >> > > >> > > >> > > kind regards >> > > >> > > Johannes >> > > >> > > >> > > >> > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb >> > Peter Booth >> > > >> > > > > > > > >> > > >> > >> >> > > >> > > >> > > >> > > >> > >>> >> > > >> > > > > > > >> > > >> > >> >> > > >> > > >> > > >> > > >> > >>>>>: >> > > >> > > >> > > >> > > Just curious - what CPU, physical memory >> > and OS are >> > > >> you using? >> > > >> > > Sent from my iPhone >> > > >> > > >> > > >> > >> On Feb 15, 2024, at 12:23?PM, Johannes >> > > Lichtenberger >> > > >> > >> > > >> > > > > > >> > > >> > > >> > > > > >> >> > > >> > > > >> > > > > > >> > > >> > > >> > > > > >>> >> > > >> > >> >> > > > >> > > > > > >> > > >> > > >> > > > > >> >> > > >> > > > >> > > > > > >> > > >> > > >> > > > > >>>>> wrote: >> > > >> > >> >> > > >> > >> ? 
>> > >> I guess I don't know which JDK it picks for the tests, but I guess
>> > >> OpenJDK
>> > >>
>> > >> Johannes Lichtenberger <lichtenberger.johannes at gmail.com> schrieb
>> > >> am Do., 15. Feb. 2024, 17:58:
>> > >>
>> > >>     However, it's the same with: ./gradlew
>> > >>     -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2
>> > >>     :sirix-core:test --tests
>> > >>     io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>> > >>     using OpenJDK hopefully
>> > >>
>> > >>     Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes
>> > >>     Lichtenberger <lichtenberger.johannes at gmail.com>:
>> > >>
>> > >>         I've attached two logs, the first one without
>> > >>         -XX:+ZGenerational, the second one with the option set,
>> > >>         even though I also saw that generational ZGC is going to
>> > >>         be supported in GraalVM 24.1 in September... so not sure
>> > >>         what this does :)
>> > >>
>> > >>         Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes
>> > >>         Lichtenberger
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From lichtenberger.johannes at gmail.com Sat Feb 17 23:01:29 2024
From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger)
Date: Sun, 18 Feb 2024 00:01:29 +0100
Subject: [External] : Re: Generational ZGC issue
In-Reply-To: 
References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com>
 <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com>
 <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com>
 <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com>
 <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com>
Message-ID: 

Reran the test with `johannes at luna:~/IdeaProjects/sirix$ time ./gradlew
:sirix-core:test --tests
io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis`
and -XX:MaxDirectMemorySize=1g

...

real    5m8,135s
user    0m1,799s
sys     0m0,374s

I've attached the log. Stuff like this may be strange, or normal?

[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001dbe00000
[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001dc200000
[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001e2600000
[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001e3e00000
[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001f2400000
[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001f9200000
[183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001fac00000

kind regards
Johannes
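To correlate runs like this with the -XX:MaxDirectMemorySize limit, the
JDK's own view of the direct buffer pool can be polled through the
standard java.lang.management API. The following is only a sketch, not
part of the SirixDB sources; the class name and output format are made
up:

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;

    public final class DirectPoolStats {
        public static void main(String[] args) {
            // Prints the "direct" and "mapped" buffer pools as the JVM sees them.
            // Calling this periodically from the test would show how close a run
            // gets to the configured MaxDirectMemorySize before and after GCs.
            for (BufferPoolMXBean pool :
                    ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                System.out.printf("%s: count=%d, used=%d bytes, capacity=%d bytes%n",
                        pool.getName(), pool.getCount(), pool.getMemoryUsed(),
                        pool.getTotalCapacity());
            }
        }
    }

If the "direct" pool sits close to the limit for long stretches, that
would support the theory that the slowdown comes from the JDK stalling
direct allocations while it waits for dead DirectByteBuffers to be
reclaimed.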

Am Sa., 17. Feb. 2024 um 21:46 Uhr schrieb Johannes Lichtenberger <
lichtenberger.johannes at gmail.com>:

> With setting the max direct memory to 2Gb (-XX:MaxDirectMemorySize=2g) and
> using DirectByteBuffers under the hood using Chronicle Bytes, the test runs
> on my laptop, but as said way slower than with non generational ZGC or G1
> (and with G1 it's even fastest).
>
> real    4m58,836s
> user    0m1,849s
> sys     0m0,297s
>
> It caps at around 20Gb RAM usage and I think swapping somehow occurred at
> some point (452MB of 2GB).
>
> Before, setting MaxDirectMemorySize to 5g (too high), the memory usage
> (watching `htop`) went to around 25Gb before heavy swapping occurred and
> the test failed at some point...
>
> kind regards
> Johannes
>
> Am Sa., 17. Feb. 2024 um 19:13 Uhr schrieb Erik Osterlund <
> erik.osterlund at oracle.com>:
>
>> Could you check how much is user time vs system time? It smells like you
>> are swapping. When swapping starts, performance goes out of the window.
>>
>> /Erik
>>
>> On 17 Feb 2024, at 16:21, Johannes Lichtenberger <
>> lichtenberger.johannes at gmail.com> wrote:
>>
>> I'll check later on if the test doesn't fail with the 2g max and I'll
>> have to check as well if it's still the OutOfMemoryError (as I'm not at
>> home currently). But every time everything freezes for a couple of seconds.
>>
>> In any case isn't it strange that with G1 and ZGC the runtime is very
>> close to each other with G1 having the upper hand slightly, when switching
>> to on heap ByteBuffers, but with Generational ZGC the runtime almost
>> exactly doubles? I thought at some point the generational version should
>> make the non generational obsolete and as almost every object dies young
>> the generational ZGC should be better as you wrote!?
>>
>> Kind regards and have a nice weekend (and kind of feel sorry for
>> bothering that much)
>> Johannes
>>
>> Stefan Johansson <stefan.johansson at oracle.com> schrieb am Sa., 17. Feb.
>> 2024, 15:54:
>>
>>> Ok, when you say crashes, what do you mean? Are you still seeing the
>>> same OutOfMemoryError or are we talking about an actual JVM crash? Or is
>>> it the Linux OOM killer stepping in because of high memory pressure?
>>>
>>> If this is with the new setting of 5g for direct memory it could be that
>>> these 3 extra gigs of memory are pushing you over the limit for what can
>>> be handled by your laptop. Generally, if you start swapping, the
>>> performance is out the door and you need to look at the configuration.
>>> Maybe the 2g for direct memory is reasonable on this setup to avoid
>>> swapping. I looked a bit at the total memory usage for the process here
>>> and it seems to be around 20G.
>>>
>>> Stefan
>>>
>>> On 2024-02-17 12:24, Johannes Lichtenberger wrote:
>>> > So, switching back to DirectByteBuffers, and removing the disabling of
>>> > explicit GCs still crashes on my laptop (swapping + close to 32 Gb RAM
>>> > used)...
>>> >
>>> > kind regards
>>> > Johannes
>>> >
>>> > Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson
>>> > <stefan.johansson at oracle.com>:
>>> >
>>> > On 2024-02-17 00:36, Johannes Lichtenberger wrote:
>>> > > I just removed "-XX:+DisableExplicitGC", increased max direct
>>> > > memory size to 5g (-XX:MaxDirectMemorySize=5g), but also changed
>>> > > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000);
>>> > > to use on heap ByteBuffers.
>>> > >
>>> >
>>> > Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize
>>> > has
>>> > no effect since the ByteBuffers will be stored on the heap.
But if >>> you >>> > keep going with DirectByteBuffers, this might make sense to give >>> some >>> > more head room. >>> > >>> > > However, the performance seems to be way worse. I've repeated >>> the >>> > test >>> > > several times, but with both G1 and non generational ZGC it's >>> > ~50s for >>> > > importing the JSON file in the first case vs ~100s using >>> > generational >>> > > ZGC, using Temurin 21.0.2 with similar values for the actual >>> > traversals. >>> > > >>> > >>> > Ok, sounds like using DirectByteBuffers is a performance win here. >>> > If so >>> > I would just continue testing using DirectByteBuffers and allowing >>> > explicit GCs to ensure they are cleaned out properly. >>> > >>> > > From the log on STDOUT, I can see this (meaning 0,319s and >>> > 0,440s... >>> > > pause times?) >>> > > >>> > >>> > No, with ZGC the time here is not the pause time, it's the time to >>> > complete the whole GC. ZGC is a concurrent GC, meaning that most >>> of the >>> > GC work is done concurrently with the Java application still >>> running. >>> > There are still a some very short pauses, all way below 1ms. You >>> can >>> > see >>> > them if you look at the detailed log: >>> > >>> > [30,938s][info][gc ] GC(3) Minor Collection (Allocation >>> Rate) >>> > [30,938s][info][gc,phases ] GC(3) y: Young Generation >>> > [30,938s][info][gc,phases ] GC(3) y: Pause Mark Start 0,060ms >>> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark 383,563ms >>> > [31,322s][info][gc,phases ] GC(3) y: Pause Mark End 0,046ms >>> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark Free 0,009ms >>> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Reset Relocation >>> Set >>> > 0,201ms >>> > [31,335s][info][gc,phases ] GC(3) y: Concurrent Select >>> Relocation Set >>> > 13,228ms >>> > [31,335s][info][gc,phases ] GC(3) y: Pause Relocate Start 0,019ms >>> > [31,381s][info][gc,phases ] GC(3) y: Concurrent Relocate 45,967ms >>> > [31,382s][info][gc,phases ] GC(3) y: Young Generation >>> > 9726M(63%)->518M(3%) 0,444s >>> > [31,382s][info][gc ] GC(3) Minor Collection (Allocation >>> Rate) >>> > 9726M(63%)->518M(3%) 0,444s >>> > >>> > Here I included the phase-logs for a single GC of the young >>> generation, >>> > where you can clearly see how much time was spent in which part of >>> the >>> > GC and as you can see the three pauses are all very very short. >>> > >>> > Stefan >>> > >>> > > [35,718s][info ][gc ] GC(9) Minor Collection (Allocation >>> > Rate) >>> > > 12462M(81%)->1556M(10%) 0,319s >>> > > [40,871s][info ][gc ] GC(10) Minor Collection (Allocation >>> > Rate) >>> > > [41,311s][info ][gc ] GC(10) Minor Collection (Allocation >>> > Rate) >>> > > 13088M(85%)->1432M(9%) 0,440s >>> > > [46,236s][info ][gc ] GC(11) Minor Collection (Allocation >>> > Rate) >>> > > [46,603s][info ][gc ] GC(11) Minor Collection (Allocation >>> > Rate) >>> > > 12406M(81%)->1676M(11%) 0,367s >>> > > [51,445s][info ][gc ] GC(12) Minor Collection (Allocation >>> > Rate) >>> > > [51,846s][info ][gc ] GC(12) Minor Collection (Allocation >>> > Rate) >>> > > 12848M(84%)->1556M(10%) 0,401s >>> > > [56,203s][info ][gc ] GC(13) Major Collection (Proactive) >>> > > [56,368s][info ][gc ] GC(13) Major Collection (Proactive) >>> > > 11684M(76%)->484M(3%) 0,166s >>> > > >>> > > kind regards >>> > > Johannes >>> > > >>> > > Am Fr., 16. Feb. 
2024 um 22:39 Uhr schrieb Erik Osterlund >>> > > >>> > >> erik.osterlund at oracle.com>>>: >>> > > >>> > > It?s worth noting that when using ZGC, calling System.gc >>> does not >>> > > invoke a classic disastrously long GC pause. Instead, a >>> > concurrent >>> > > GC is triggered, which should be not that noticeable to the >>> > > application. The thread calling System.gc is blocked until >>> > the GC is >>> > > done, but the other threads can run freely. >>> > > >>> > > /Erik >>> > > >>> > > > On 16 Feb 2024, at 21:55, Stefan Johansson >>> > > >> > >>> > >> > >> >>> > > wrote: >>> > > > >>> > > > ? >>> > > > >>> > > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: >>> > > >> Thanks a lot, I wasn't even aware of the fact, that >>> > > DirectByteBuffers use System.gc() and I always had in mind >>> that >>> > > calling System.gc() at least in application code is bad >>> > practice (or >>> > > at least we shouldn't rely on it) and I think I read >>> somewhere a >>> > > while ago, that it's recommended to even disable this, but >>> may be >>> > > completely wrong, of course. >>> > > > In most cases callling System.gc() is bad practice, in >>> some >>> > > special cases it might be needed. >>> > > > >>> > > >> I'll change it to on-heap byte buffers tomorrow :-) >>> > > >> I think your GC log entries were from G1, right? It >>> seems ZGC >>> > > always tries to use the full heap :-) >>> > > > >>> > > > Yes, the snippet was G1, it was mostly to show that the >>> > pressure >>> > > isn't high. You are correct that ZGC uses more of the given >>> > heap but >>> > > the collections are pretty far apart and I'm certian it >>> would >>> > > function well with a smaller heap as well. Maybe in that >>> case >>> > some >>> > > Major collections would be triggered. >>> > > > >>> > > >> Kind regards and thanks for sharing your insights. >>> > > > >>> > > > No problem. We appriciate the feedback, >>> > > > StefanJ >>> > > > >>> > > >> Have a nice weekend as well >>> > > >> Johannes >>> > > >> Stefan Johansson >> > >>> > > >> > > >>> > > >> > >>> > > >> > >>> schrieb am Fr., 16. Feb. >>> > > 2024, 17:38: >>> > > >> Hi, >>> > > >> Some comments inline. >>> > > >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: >>> > > >> > Thanks a lot for looking into it, I've added >>> > > >> > `-XX:MaxDirectMemorySize=2g` only recently, but >>> > without it >>> > > failed as >>> > > >> > well, so not sure what the default is. Will >>> definitely >>> > > check your >>> > > >> > suggestions :-) >>> > > >> > >>> > > >> If you don't set a limit it will be set to: >>> > > >> Runtime.getRuntime().maxMemory() >>> > > >> So likely a good idea to set a reasonable limit, but >>> the >>> > > smaller the >>> > > >> limit is the more frequent we need to run reference >>> > > processing to allow >>> > > >> memory to be freed up. >>> > > >> > Sadly I'm currently working alone on the project >>> in my >>> > > spare time >>> > > >> > (besides professionally switched from Java/Kotlin >>> > stuff to the >>> > > >> embedded >>> > > >> > software world) and I'm not sure if the current >>> > > architecture of >>> > > >> Sirix is >>> > > >> > limited by too much GC pressure. I'd probably have >>> > to check >>> > > >> Cassandra at >>> > > >> > some point and look into flame graphs and stuff >>> for >>> > their >>> > > >> integration >>> > > >> > tests, but maybe you can give some general >>> > insights/advice... 
>>> > > >> > >>> > > >> > Yesterday evening I switched to other JDKs (also I >>> > want to >>> > > test with >>> > > >> > Shenandoah in particular), but I think especially >>> the >>> > > better escape >>> > > >> > analysis of the GraalVM is a huge plus in the >>> case of >>> > > SirixDB (for >>> > > >> > insertion on my laptop it's ~90s vs ~60s), but I >>> > think it >>> > > should be >>> > > >> > faster and currently my suspicion is that garbage >>> > is a major >>> > > >> performance >>> > > >> > issue. >>> > > >> > >>> > > >> > Maybe the GC pressure in general is a major issue, >>> > as in >>> > > the CPU >>> > > >> Flame >>> > > >> > graph IIRC the G1 had about 20% stack frames >>> > allocated and non >>> > > >> > generational ZGC even around 40% taking all >>> threads >>> > into >>> > > account. >>> > > >> > >>> > > >> From what I/we see, the GC pressure in the given >>> test is >>> > > not high. >>> > > >> The >>> > > >> allocation rate is below 1GB/s and since most of it >>> > die young >>> > > the GCs >>> > > >> are fairly cheap. In this log snippet G1 shows a GC >>> > every 5s >>> > > and the >>> > > >> pause time is below 50ms: >>> > > >> [296,016s][info ][gc ] GC(90) Pause Young >>> > (Normal) (G1 >>> > > >> Evacuation >>> > > >> Pause) 5413M->1849M(6456M) 35,577ms >>> > > >> [301,103s][info ][gc ] GC(91) Pause Young >>> > (Normal) (G1 >>> > > >> Evacuation >>> > > >> Pause) 5417M->1848M(6456M) 33,357ms >>> > > >> [306,041s][info ][gc ] GC(92) Pause Young >>> > (Normal) (G1 >>> > > >> Evacuation >>> > > >> Pause) 5416M->1848M(6456M) 32,763ms >>> > > >> [310,849s][info ][gc ] GC(93) Pause Young >>> > (Normal) (G1 >>> > > >> Evacuation >>> > > >> Pause) 5416M->1847M(6456M) 33,086ms >>> > > >> I also see that the heap never expands to more the >>> > ~6.5GB even >>> > > >> though it >>> > > >> is allow to be 15GB and this also suggest that the GC >>> > is not >>> > > under much >>> > > >> pressure. As I said in the previous mail, the reason >>> > > Generational ZGC >>> > > >> don't free up the direct memory without the >>> > System.gc() calls >>> > > is that >>> > > >> the GC pressure is not high enough to trigger any >>> Major >>> > > cycles. So I >>> > > >> would strongly recommend you to not run with >>> > > -XX+DisableExplicitGC >>> > > >> unless you really have to. Since you are using >>> > > DirectByteBuffers and >>> > > >> they use System.gc() to help free memory when the >>> limit is >>> > > reached. >>> > > >> > So in general I'm thinking about backing the >>> > > KeyValueLeafPages with >>> > > >> > MemorySegments, but I think due to variable sized >>> pages >>> > > it's getting >>> > > >> > tricky, plus I currently don't have the time for >>> > changing >>> > > >> fundamental >>> > > >> > stuff and I'm even not sure if it'll bring a >>> > performance >>> > > boost, as I >>> > > >> > have to adapt neighbour relationships of the nodes >>> > often and >>> > > >> off-heap >>> > > >> > memory access might be slightly worse performance >>> wise. >>> > > >> > >>> > > >> > What do you think? >>> > > >> > >>> > > >> I know to little about the application to be able to >>> give >>> > > advice here, >>> > > >> but I would first start with having most memory on >>> > heap. Only >>> > > large >>> > > >> long >>> > > >> lived stuff off-heap, if really needed. Looking at >>> the >>> > test >>> > > at hand, it >>> > > >> really doesn't look like it is long lived stuff that >>> is >>> > > placed off heap. 
>>> > > >> > I've attached a memory flame graph and there it >>> > seems the >>> > > byte array >>> > > >> > from deserializing each page is prominent, but >>> that >>> > might be >>> > > >> something I >>> > > >> > can't even avoid (after decompression via Snappy >>> or via >>> > > another >>> > > >> lib and >>> > > >> > maybe also decryption in the future). >>> > > >> > >>> > > >> > As of now G1 with GraalVM seems to perform best (a >>> > little >>> > > bit better >>> > > >> > than with non generational ZGC, but I thought ZGC >>> > or maybe >>> > > >> Shenandoah >>> > > >> > would improve the situation). But as said I may >>> have to >>> > > generate way >>> > > >> > less garbage after all in general for good >>> > performance!? >>> > > >> > >>> > > >> > All in all maybe due to most objects die young >>> > maybe also the >>> > > >> > generational GCs are not needed (that said if >>> > enough RAM is >>> > > >> available >>> > > >> > and the Caffeine Caches are sized accordingly most >>> > objects may >>> > > >> die old). >>> > > >> > But apparently the byte arrays holding the page >>> > data still die >>> > > >> young (in >>> > > >> > AbstractReader::deserialize). In fact I'm not >>> even sure >>> > > why they >>> > > >> escape, >>> > > >> > but currently I'm on my phone. >>> > > >> > >>> > > >> It's when most objects die young the Generational GC >>> > really >>> > > shines, >>> > > >> because it can handle the short lived objects without >>> > having >>> > > to look at >>> > > >> the long lived objects. So I would say Generational >>> > ZGC is a >>> > > good fit >>> > > >> here, but we need to let the System.gc() run to allow >>> > reference >>> > > >> processing or slightly re-design and use >>> HeapByteBuffers. >>> > > >> Have a nice weekend, >>> > > >> Stefan >>> > > >> > Kind regards >>> > > >> > Johannes >>> > > >> > >>> > > >> > Stefan Johansson >> > >>> > > >> > > >>> > > >> >> > >>> > > >> > >> >>> > > >> > >> > >>> > > >> > > >>> > > >> >> > >>> > > >> > >>>> schrieb am Fr., 16. Feb. >>> > > >> 2024, 13:43: >>> > > >> > >>> > > >> > Hi Johannes, >>> > > >> > >>> > > >> > We've spent some more time looking at this and >>> > getting the >>> > > >> json-file to >>> > > >> > reproduced it made it easy to verify our >>> > suspicions. >>> > > Thanks for >>> > > >> > uploading it. >>> > > >> > >>> > > >> > There are a few things playing together here. >>> > The test is >>> > > >> making quite >>> > > >> > heavy use of DirectByteBuffers and you limit >>> > the usage >>> > > to 2G >>> > > >> > (-XX:MaxDirectMemorySize=2g). The life cycle >>> and >>> > > freeing of >>> > > >> the native >>> > > >> > memory part of the DirectByteBuffer rely on >>> > reference >>> > > >> processing. In >>> > > >> > generational ZGC reference processing is only >>> done >>> > > during Major >>> > > >> > collections and since the general GC preassure >>> > in this >>> > > >> benchmark is >>> > > >> > very >>> > > >> > low (most objects die young), we do not >>> trigger >>> > that >>> > > many Major >>> > > >> > collections. >>> > > >> > >>> > > >> > Normaly this would not be a problem. To avoid >>> > throwing >>> > > an out >>> > > >> of memory >>> > > >> > error (due to hitting the direct buffer memory >>> > limit) too >>> > > >> early the JDK >>> > > >> > triggers a System.gc(). This should trigger >>> > reference >>> > > >> procesing and all >>> > > >> > buffers that are no longer in use would be >>> freed. 
>>> > > >> > Since you specify the option -XX:+DisableExplicitGC all these
>>> > > >> > calls to trigger GCs are ignored and no direct memory will be
>>> > > >> > freed. So in our testing, just removing this flag makes the
>>> > > >> > test pass.
>>> > > >> >
>>> > > >> > Another solution is to look at using HeapByteBuffers instead
>>> > > >> > and not have to worry about the direct memory usage. The
>>> > > >> > OpenHFT lib seems to have support for this by just using
>>> > > >> > elasticHeapByteBuffer(...) instead of elasticByteBuffer().
>>> > > >> >
>>> > > >> > Lastly, the reason for this working with non-generational ZGC
>>> > > >> > is that it does reference processing for every GC.
>>> > > >> >
>>> > > >> > Hope this helps,
>>> > > >> > StefanJ
>>> > > >> >
>>> > > >> > On 2024-02-15 21:53, Johannes Lichtenberger wrote:
>>> > > >> > > It's a laptop, I've attached some details.
>>> > > >> > >
>>> > > >> > > Furthermore, if it seems worth digging deeper into the issue,
>>> > > >> > > the JSON file is here for one week:
>>> > > >> > > https://www.transfernow.net/dl/20240215j9NaPTc0
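To make the switch StefanJ suggests above concrete, here is a minimal
Chronicle Bytes sketch of the two factory methods mentioned. It is not
from the SirixDB sources; the class name, the sample strings and the
60_000 initial capacity are just placeholders:

    import net.openhft.chronicle.bytes.Bytes;

    public final class BytesBackingSketch {
        public static void main(String[] args) {
            // Off-heap backing: the native memory behind it counts against
            // -XX:MaxDirectMemorySize and is only reclaimed once reference
            // processing has run (or it is released explicitly).
            Bytes<?> offHeap = Bytes.elasticByteBuffer();
            offHeap.writeUtf8("off-heap backed");
            offHeap.releaseLast(); // deterministic release, no GC involved

            // On-heap backing: ordinary heap memory, so the direct memory
            // limit and reference processing are not involved in freeing it.
            Bytes<?> onHeap = Bytes.elasticHeapByteBuffer(60_000);
            onHeap.writeUtf8("heap backed");
            onHeap.releaseLast();
        }
    }

Explicitly releasing (or pooling) the buffers would also sidestep the
dependency on System.gc(), whichever backing is used.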
>>> > > >> > >
>>> > > >> > > You'd have to unzip into bundles/sirix-core/src/test/resources/json,
>>> > > >> > > remove the @Disabled annotation and run the test
>>> > > >> > > JsonShredderTest::testChicagoDescendantAxis
>>> > > >> > >
>>> > > >> > > The test JVM parameters are specified in the parent build.gradle
>>> > > >> > > in the project root folder.
>>> > > >> > >
>>> > > >> > > The GitHub repo: https://github.com/sirixdb/sirix
>>> > > >> > >
>>> > > >> > > Screenshot from 2024-02-15 21-43-33.png
>>> > > >> > >
>>> > > >> > > kind regards
>>> > > >> > > Johannes
>>> > > >> > >
>>> > > >> > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth
>>> > > >> > > <peter_booth at me.com>:
>>> > > >> > >
>>> > > >> > >     Just curious - what CPU, physical memory and OS are you using?
>>> > > >> > >     Sent from my iPhone
>>> > > >> > >
>>> > > >> > >> On Feb 15, 2024, at 12:23 PM, Johannes Lichtenberger
>>> > > >> > >> <lichtenberger.johannes at gmail.com> wrote:
>>> > > >> > >>
>>> > > >> > >> I guess I don't know which JDK it picks for the tests, but I
>>> > > >> > >> guess OpenJDK
>>> > > >> > >>
>>> > > >> > >> Johannes Lichtenberger <lichtenberger.johannes at gmail.com>
>>> > > >> > >> schrieb am Do., 15. Feb. 2024, 17:58:
>>> > > >> > >>
>>> > > >> > >>     However, it's the same with: ./gradlew
>>> > > >> > >>     -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2
>>> > > >> > >>     :sirix-core:test --tests
>>> > > >> > >>     io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>>> > > >> > >>     using OpenJDK hopefully
>>> > > >> > >>
>>> > > >> > >>     Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes
>>> > > >> > >>     Lichtenberger <lichtenberger.johannes at gmail.com>:
>>> > > >> > >>
>>> > > >> > >>         I've attached two logs, the first one without
>>> > > >> > >>         -XX:+ZGenerational, the second one with the option
>>> > > >> > >>         set, even though I also saw that generational ZGC is
>>> > > >> > >>         going to be supported in GraalVM 24.1 in September...
>>> > > >> > >>         so not sure what this does :)
>>> > > >> > >>
>>> > > >> > >>         Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes
>>> > > >> > >>         Lichtenberger
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zgc-generational-very-new.log.tar.gz
Type: application/gzip
Size: 985874 bytes
Desc: not available
URL: 
From lichtenberger.johannes at gmail.com Sat Feb 17 23:27:42 2024
From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger)
Date: Sun, 18 Feb 2024 00:27:42 +0100
Subject: [External] : Re: Generational ZGC issue
In-Reply-To: 
References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com>
 <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com>
 <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com>
 <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com>
 <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com>
Message-ID: 

I've also added a new IntelliJ Ultimate Async Profiler JFR recording using
Generational ZGC:
https://github.com/sirixdb/sirix/blob/main/JsonShredderTest_testChicagoDescendantAxis_2024_02_18_001558-zgc-generational-latest.jfr

Am So., 18. Feb. 2024 um 00:01 Uhr schrieb Johannes Lichtenberger <
lichtenberger.johannes at gmail.com>:

> Reran the test with `johannes at luna:~/IdeaProjects/sirix$ time ./gradlew
> :sirix-core:test --tests
> io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis`
> and -XX:MaxDirectMemorySize=1g
>
> ...
>
> real    5m8,135s
> user    0m1,799s
> sys     0m0,374s
>
> I've attached the log. Stuff like this may be strange, or normal?
>
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001dbe00000
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001dc200000
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001e2600000
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001e3e00000
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001f2400000
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001f9200000
> [183,700s][debug   ][gc,remset   ] GC(57) y: scan_forwarding failed retain safe 0x00000001fac00000
>
> kind regards
> Johannes
>
> Am Sa., 17. Feb. 2024 um 21:46 Uhr schrieb Johannes Lichtenberger <
> lichtenberger.johannes at gmail.com>:
>
>> With setting the max direct memory to 2Gb (-XX:MaxDirectMemorySize=2g)
>> and using DirectByteBuffers under the hood using Chronicle Bytes, the test
>> runs on my laptop, but as said way slower than with non generational ZGC
>> or G1 (and with G1 it's even fastest).
>>
>> real    4m58,836s
>> user    0m1,849s
>> sys     0m0,297s
>>
>> It caps at around 20Gb RAM usage and I think swapping somehow occurred
>> at some point (452MB of 2GB).
>>
>> Before, setting MaxDirectMemorySize to 5g (too high), the memory usage
>> (watching `htop`) went to around 25Gb before heavy swapping occurred and
>> the test failed at some point...
>>
>> kind regards
>> Johannes
>>
>> Am Sa., 17. Feb. 2024 um 19:13 Uhr schrieb Erik Osterlund <
>> erik.osterlund at oracle.com>:
>>
>>> Could you check how much is user time vs system time? It smells like
>>> you are swapping. When swapping starts, performance goes out of the window.
>>>
>>> /Erik
>>>
>>> On 17 Feb 2024, at 16:21, Johannes Lichtenberger <
>>> lichtenberger.johannes at gmail.com> wrote:
>>>
>>> I'll check later on if the test doesn't fail with the 2g max and I'll
>>> have to check as well if it's still the OutOfMemoryError (as I'm not at
>>> home currently). But every time everything freezes for a couple of seconds.
>>> >>> In any case isn't it strange that with G1 and ZGC the runtime is very >>> close to each other with G1 having the upper hand slightly, when switching >>> to on heap ByteBuffers, but with Generational ZGC the runtime almost >>> exactly doubles? I thought at some point the generational version should >>> make the non generational obsolet and as almost every object dies young the >>> generational ZGC should be better as you wrote!? >>> >>> Kind regards and have a nice weekend (and kind of feel sorry for >>> bothering that much) >>> Johannes >>> >>> Stefan Johansson schrieb am Sa., 17. Feb. >>> 2024, 15:54: >>> >>>> Ok, when you say crashes, what do you mean. Are you still seeing the >>>> same OutOfMemoryError or are we talking about an actual JVM crash. Or >>>> is >>>> it the Linux OOM killer stepping in because of high memory pressure? >>>> >>>> If this is with the new setting of 5g for direct memory it could be >>>> that >>>> these 3 extra gigs of memory is pushing you over the limit for what can >>>> be handle by you laptop. Generally, if you start swapping, the >>>> performance is out the door and you need to look at the configuration. >>>> Maybe the 2g for direct memory is reasonable on this setup to avoid >>>> swapping. I looked a bit at the total memory usage for the process here >>>> and it seem to be around 20G. >>>> >>>> Stefan >>>> >>>> On 2024-02-17 12:24, Johannes Lichtenberger wrote: >>>> > So, switching back to DirectByteBuffers, and removing the disabling >>>> of >>>> > explicit GCs still crashes on my laptop (swapping + close to 32 Gb >>>> RAM >>>> > used)... >>>> > >>>> > kind regards >>>> > Johannes >>>> > >>>> > Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson >>>> > >: >>>> > >>>> > >>>> > >>>> > On 2024-02-17 00:36, Johannes Lichtenberger wrote: >>>> > > I just removed "-XX+DisableExplizitGC", increased max direct >>>> > memory size >>>> > > to 5g (-XX:MaxDirectMemorySize=5g), but also changed >>>> > > Bytes::elasticByteBuffer to >>>> Bytes.elasticHeapByteBuffer(60_000); >>>> > > to use on heap ByteBuffers. >>>> > > >>>> > >>>> > Just for clarity, when using HeapByteBuffers the >>>> MaxDirectMemorySize >>>> > has >>>> > no effect since the ByteBuffers will be stored on the heap. But >>>> if you >>>> > keep going with DirectByteBuffers, this might make sense to give >>>> some >>>> > more head room. >>>> > >>>> > > However, the performance seems to be way worse. I've repeated >>>> the >>>> > test >>>> > > several times, but with both G1 and non generational ZGC it's >>>> > ~50s for >>>> > > importing the JSON file in the first case vs ~100s using >>>> > generational >>>> > > ZGC, using Temurin 21.0.2 with similar values for the actual >>>> > traversals. >>>> > > >>>> > >>>> > Ok, sounds like using DirectByteBuffers is a performance win here. >>>> > If so >>>> > I would just continue testing using DirectByteBuffers and allowing >>>> > explicit GCs to ensure they are cleaned out properly. >>>> > >>>> > > From the log on STDOUT, I can see this (meaning 0,319s and >>>> > 0,440s... >>>> > > pause times?) >>>> > > >>>> > >>>> > No, with ZGC the time here is not the pause time, it's the time to >>>> > complete the whole GC. ZGC is a concurrent GC, meaning that most >>>> of the >>>> > GC work is done concurrently with the Java application still >>>> running. >>>> > There are still a some very short pauses, all way below 1ms. 
You >>>> can >>>> > see >>>> > them if you look at the detailed log: >>>> > >>>> > [30,938s][info][gc ] GC(3) Minor Collection (Allocation >>>> Rate) >>>> > [30,938s][info][gc,phases ] GC(3) y: Young Generation >>>> > [30,938s][info][gc,phases ] GC(3) y: Pause Mark Start 0,060ms >>>> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark 383,563ms >>>> > [31,322s][info][gc,phases ] GC(3) y: Pause Mark End 0,046ms >>>> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Mark Free >>>> 0,009ms >>>> > [31,322s][info][gc,phases ] GC(3) y: Concurrent Reset >>>> Relocation Set >>>> > 0,201ms >>>> > [31,335s][info][gc,phases ] GC(3) y: Concurrent Select >>>> Relocation Set >>>> > 13,228ms >>>> > [31,335s][info][gc,phases ] GC(3) y: Pause Relocate Start >>>> 0,019ms >>>> > [31,381s][info][gc,phases ] GC(3) y: Concurrent Relocate >>>> 45,967ms >>>> > [31,382s][info][gc,phases ] GC(3) y: Young Generation >>>> > 9726M(63%)->518M(3%) 0,444s >>>> > [31,382s][info][gc ] GC(3) Minor Collection (Allocation >>>> Rate) >>>> > 9726M(63%)->518M(3%) 0,444s >>>> > >>>> > Here I included the phase-logs for a single GC of the young >>>> generation, >>>> > where you can clearly see how much time was spent in which part >>>> of the >>>> > GC and as you can see the three pauses are all very very short. >>>> > >>>> > Stefan >>>> > >>>> > > [35,718s][info ][gc ] GC(9) Minor Collection (Allocation >>>> > Rate) >>>> > > 12462M(81%)->1556M(10%) 0,319s >>>> > > [40,871s][info ][gc ] GC(10) Minor Collection >>>> (Allocation >>>> > Rate) >>>> > > [41,311s][info ][gc ] GC(10) Minor Collection >>>> (Allocation >>>> > Rate) >>>> > > 13088M(85%)->1432M(9%) 0,440s >>>> > > [46,236s][info ][gc ] GC(11) Minor Collection >>>> (Allocation >>>> > Rate) >>>> > > [46,603s][info ][gc ] GC(11) Minor Collection >>>> (Allocation >>>> > Rate) >>>> > > 12406M(81%)->1676M(11%) 0,367s >>>> > > [51,445s][info ][gc ] GC(12) Minor Collection >>>> (Allocation >>>> > Rate) >>>> > > [51,846s][info ][gc ] GC(12) Minor Collection >>>> (Allocation >>>> > Rate) >>>> > > 12848M(84%)->1556M(10%) 0,401s >>>> > > [56,203s][info ][gc ] GC(13) Major Collection >>>> (Proactive) >>>> > > [56,368s][info ][gc ] GC(13) Major Collection >>>> (Proactive) >>>> > > 11684M(76%)->484M(3%) 0,166s >>>> > > >>>> > > kind regards >>>> > > Johannes >>>> > > >>>> > > Am Fr., 16. Feb. 2024 um 22:39 Uhr schrieb Erik Osterlund >>>> > > >>>> > >>> erik.osterlund at oracle.com>>>: >>>> > > >>>> > > It?s worth noting that when using ZGC, calling System.gc >>>> does not >>>> > > invoke a classic disastrously long GC pause. Instead, a >>>> > concurrent >>>> > > GC is triggered, which should be not that noticeable to the >>>> > > application. The thread calling System.gc is blocked until >>>> > the GC is >>>> > > done, but the other threads can run freely. >>>> > > >>>> > > /Erik >>>> > > >>>> > > > On 16 Feb 2024, at 21:55, Stefan Johansson >>>> > > >>> > >>>> > >>> > >> >>>> > > wrote: >>>> > > > >>>> > > > ? >>>> > > > >>>> > > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: >>>> > > >> Thanks a lot, I wasn't even aware of the fact, that >>>> > > DirectByteBuffers use System.gc() and I always had in mind >>>> that >>>> > > calling System.gc() at least in application code is bad >>>> > practice (or >>>> > > at least we shouldn't rely on it) and I think I read >>>> somewhere a >>>> > > while ago, that it's recommended to even disable this, but >>>> may be >>>> > > completely wrong, of course. 
>>>> > > > In most cases callling System.gc() is bad practice, in >>>> some >>>> > > special cases it might be needed. >>>> > > > >>>> > > >> I'll change it to on-heap byte buffers tomorrow :-) >>>> > > >> I think your GC log entries were from G1, right? It >>>> seems ZGC >>>> > > always tries to use the full heap :-) >>>> > > > >>>> > > > Yes, the snippet was G1, it was mostly to show that the >>>> > pressure >>>> > > isn't high. You are correct that ZGC uses more of the given >>>> > heap but >>>> > > the collections are pretty far apart and I'm certian it >>>> would >>>> > > function well with a smaller heap as well. Maybe in that >>>> case >>>> > some >>>> > > Major collections would be triggered. >>>> > > > >>>> > > >> Kind regards and thanks for sharing your insights. >>>> > > > >>>> > > > No problem. We appriciate the feedback, >>>> > > > StefanJ >>>> > > > >>>> > > >> Have a nice weekend as well >>>> > > >> Johannes >>>> > > >> Stefan Johansson >>> > >>>> > > >>> > > >>>> > > >>> > >>>> > > >>> > >>> schrieb am Fr., 16. Feb. >>>> > > 2024, 17:38: >>>> > > >> Hi, >>>> > > >> Some comments inline. >>>> > > >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: >>>> > > >> > Thanks a lot for looking into it, I've added >>>> > > >> > `-XX:MaxDirectMemorySize=2g` only recently, but >>>> > without it >>>> > > failed as >>>> > > >> > well, so not sure what the default is. Will >>>> definitely >>>> > > check your >>>> > > >> > suggestions :-) >>>> > > >> > >>>> > > >> If you don't set a limit it will be set to: >>>> > > >> Runtime.getRuntime().maxMemory() >>>> > > >> So likely a good idea to set a reasonable limit, >>>> but the >>>> > > smaller the >>>> > > >> limit is the more frequent we need to run reference >>>> > > processing to allow >>>> > > >> memory to be freed up. >>>> > > >> > Sadly I'm currently working alone on the project >>>> in my >>>> > > spare time >>>> > > >> > (besides professionally switched from Java/Kotlin >>>> > stuff to the >>>> > > >> embedded >>>> > > >> > software world) and I'm not sure if the current >>>> > > architecture of >>>> > > >> Sirix is >>>> > > >> > limited by too much GC pressure. I'd probably >>>> have >>>> > to check >>>> > > >> Cassandra at >>>> > > >> > some point and look into flame graphs and stuff >>>> for >>>> > their >>>> > > >> integration >>>> > > >> > tests, but maybe you can give some general >>>> > insights/advice... >>>> > > >> > >>>> > > >> > Yesterday evening I switched to other JDKs (also >>>> I >>>> > want to >>>> > > test with >>>> > > >> > Shenandoah in particular), but I think >>>> especially the >>>> > > better escape >>>> > > >> > analysis of the GraalVM is a huge plus in the >>>> case of >>>> > > SirixDB (for >>>> > > >> > insertion on my laptop it's ~90s vs ~60s), but I >>>> > think it >>>> > > should be >>>> > > >> > faster and currently my suspicion is that garbage >>>> > is a major >>>> > > >> performance >>>> > > >> > issue. >>>> > > >> > >>>> > > >> > Maybe the GC pressure in general is a major >>>> issue, >>>> > as in >>>> > > the CPU >>>> > > >> Flame >>>> > > >> > graph IIRC the G1 had about 20% stack frames >>>> > allocated and non >>>> > > >> > generational ZGC even around 40% taking all >>>> threads >>>> > into >>>> > > account. >>>> > > >> > >>>> > > >> From what I/we see, the GC pressure in the given >>>> test is >>>> > > not high. >>>> > > >> The >>>> > > >> allocation rate is below 1GB/s and since most of it >>>> > die young >>>> > > the GCs >>>> > > >> are fairly cheap. 
In this log snippet G1 shows a GC >>>> > every 5s >>>> > > and the >>>> > > >> pause time is below 50ms: >>>> > > >> [296,016s][info ][gc ] GC(90) Pause Young >>>> > (Normal) (G1 >>>> > > >> Evacuation >>>> > > >> Pause) 5413M->1849M(6456M) 35,577ms >>>> > > >> [301,103s][info ][gc ] GC(91) Pause Young >>>> > (Normal) (G1 >>>> > > >> Evacuation >>>> > > >> Pause) 5417M->1848M(6456M) 33,357ms >>>> > > >> [306,041s][info ][gc ] GC(92) Pause Young >>>> > (Normal) (G1 >>>> > > >> Evacuation >>>> > > >> Pause) 5416M->1848M(6456M) 32,763ms >>>> > > >> [310,849s][info ][gc ] GC(93) Pause Young >>>> > (Normal) (G1 >>>> > > >> Evacuation >>>> > > >> Pause) 5416M->1847M(6456M) 33,086ms >>>> > > >> I also see that the heap never expands to more the >>>> > ~6.5GB even >>>> > > >> though it >>>> > > >> is allow to be 15GB and this also suggest that the >>>> GC >>>> > is not >>>> > > under much >>>> > > >> pressure. As I said in the previous mail, the reason >>>> > > Generational ZGC >>>> > > >> don't free up the direct memory without the >>>> > System.gc() calls >>>> > > is that >>>> > > >> the GC pressure is not high enough to trigger any >>>> Major >>>> > > cycles. So I >>>> > > >> would strongly recommend you to not run with >>>> > > -XX+DisableExplicitGC >>>> > > >> unless you really have to. Since you are using >>>> > > DirectByteBuffers and >>>> > > >> they use System.gc() to help free memory when the >>>> limit is >>>> > > reached. >>>> > > >> > So in general I'm thinking about backing the >>>> > > KeyValueLeafPages with >>>> > > >> > MemorySegments, but I think due to variable >>>> sized pages >>>> > > it's getting >>>> > > >> > tricky, plus I currently don't have the time for >>>> > changing >>>> > > >> fundamental >>>> > > >> > stuff and I'm even not sure if it'll bring a >>>> > performance >>>> > > boost, as I >>>> > > >> > have to adapt neighbour relationships of the >>>> nodes >>>> > often and >>>> > > >> off-heap >>>> > > >> > memory access might be slightly worse >>>> performance wise. >>>> > > >> > >>>> > > >> > What do you think? >>>> > > >> > >>>> > > >> I know to little about the application to be able >>>> to give >>>> > > advice here, >>>> > > >> but I would first start with having most memory on >>>> > heap. Only >>>> > > large >>>> > > >> long >>>> > > >> lived stuff off-heap, if really needed. Looking at >>>> the >>>> > test >>>> > > at hand, it >>>> > > >> really doesn't look like it is long lived stuff >>>> that is >>>> > > placed off heap. >>>> > > >> > I've attached a memory flame graph and there it >>>> > seems the >>>> > > byte array >>>> > > >> > from deserializing each page is prominent, but >>>> that >>>> > might be >>>> > > >> something I >>>> > > >> > can't even avoid (after decompression via Snappy >>>> or via >>>> > > another >>>> > > >> lib and >>>> > > >> > maybe also decryption in the future). >>>> > > >> > >>>> > > >> > As of now G1 with GraalVM seems to perform best >>>> (a >>>> > little >>>> > > bit better >>>> > > >> > than with non generational ZGC, but I thought ZGC >>>> > or maybe >>>> > > >> Shenandoah >>>> > > >> > would improve the situation). But as said I may >>>> have to >>>> > > generate way >>>> > > >> > less garbage after all in general for good >>>> > performance!? 
>>>> > > >> > >>>> > > >> > All in all maybe due to most objects die young >>>> > maybe also the >>>> > > >> > generational GCs are not needed (that said if >>>> > enough RAM is >>>> > > >> available >>>> > > >> > and the Caffeine Caches are sized accordingly >>>> most >>>> > objects may >>>> > > >> die old). >>>> > > >> > But apparently the byte arrays holding the page >>>> > data still die >>>> > > >> young (in >>>> > > >> > AbstractReader::deserialize). In fact I'm not >>>> even sure >>>> > > why they >>>> > > >> escape, >>>> > > >> > but currently I'm on my phone. >>>> > > >> > >>>> > > >> It's when most objects die young the Generational GC >>>> > really >>>> > > shines, >>>> > > >> because it can handle the short lived objects >>>> without >>>> > having >>>> > > to look at >>>> > > >> the long lived objects. So I would say Generational >>>> > ZGC is a >>>> > > good fit >>>> > > >> here, but we need to let the System.gc() run to >>>> allow >>>> > reference >>>> > > >> processing or slightly re-design and use >>>> HeapByteBuffers. >>>> > > >> Have a nice weekend, >>>> > > >> Stefan >>>> > > >> > Kind regards >>>> > > >> > Johannes >>>> > > >> > >>>> > > >> > Stefan Johansson >>> > >>>> > > >>> > > >>>> > > >> >>> > >>>> > > >>> > >> >>>> > > >> > >>> > >>>> > > >>> > > >>>> > > >> >>> > >>>> > > >>> > >>>> schrieb am Fr., 16. Feb. >>>> > > >> 2024, 13:43: >>>> > > >> > >>>> > > >> > Hi Johannes, >>>> > > >> > >>>> > > >> > We've spent some more time looking at this >>>> and >>>> > getting the >>>> > > >> json-file to >>>> > > >> > reproduced it made it easy to verify our >>>> > suspicions. >>>> > > Thanks for >>>> > > >> > uploading it. >>>> > > >> > >>>> > > >> > There are a few things playing together here. >>>> > The test is >>>> > > >> making quite >>>> > > >> > heavy use of DirectByteBuffers and you limit >>>> > the usage >>>> > > to 2G >>>> > > >> > (-XX:MaxDirectMemorySize=2g). The life cycle >>>> and >>>> > > freeing of >>>> > > >> the native >>>> > > >> > memory part of the DirectByteBuffer rely on >>>> > reference >>>> > > >> processing. In >>>> > > >> > generational ZGC reference processing is >>>> only done >>>> > > during Major >>>> > > >> > collections and since the general GC >>>> preassure >>>> > in this >>>> > > >> benchmark is >>>> > > >> > very >>>> > > >> > low (most objects die young), we do not >>>> trigger >>>> > that >>>> > > many Major >>>> > > >> > collections. >>>> > > >> > >>>> > > >> > Normaly this would not be a problem. To avoid >>>> > throwing >>>> > > an out >>>> > > >> of memory >>>> > > >> > error (due to hitting the direct buffer >>>> memory >>>> > limit) too >>>> > > >> early the JDK >>>> > > >> > triggers a System.gc(). This should trigger >>>> > reference >>>> > > >> procesing and all >>>> > > >> > buffers that are no longer in use would be >>>> freed. >>>> > > Since you >>>> > > >> specify the >>>> > > >> > option -XX:+DisableExplicitGC all these >>>> calls to >>>> > > trigger GCs are >>>> > > >> > ignored >>>> > > >> > and no direct memory will be freed. So in our >>>> > testing, >>>> > > just >>>> > > >> removing >>>> > > >> > this flags makes the test pass. >>>> > > >> > >>>> > > >> > Another solution is to look at using >>>> > HeapByteBuffers >>>> > > instead >>>> > > >> and don't >>>> > > >> > have to worry about the direct memory usage. >>>> The >>>> > > OpenHFT lib >>>> > > >> seems to >>>> > > >> > have support for this by just using >>>> > > >> elasticHeapByteBuffer(...) instead >>>> > > >> > of elasticByteBuffer(). 
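As a minimal sketch of that suggestion (assuming a recent OpenHFT Chronicle Bytes version; the class name BytesBackingSketch and the 60_000 initial capacity are only example values), the difference is a single factory call:

import net.openhft.chronicle.bytes.Bytes;

public class BytesBackingSketch {
    public static void main(String[] args) {
        // Off-heap variant: backed by a DirectByteBuffer, counted against
        // -XX:MaxDirectMemorySize and freed only via reference processing.
        Bytes<?> offHeap = Bytes.elasticByteBuffer();

        // Heap-backed variant: ordinary Java objects, reclaimed by any
        // collection, so the direct-memory limit no longer matters.
        Bytes<?> onHeap = Bytes.elasticHeapByteBuffer(60_000);

        offHeap.writeUtf8("hello");
        onHeap.writeUtf8("hello");

        // Bytes instances are reference counted; release them when done.
        offHeap.releaseLast();
        onHeap.releaseLast();
    }
}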
>>>> > Lastly, the reason for this working with non-generational ZGC is
>>>> > that it does reference processing for every GC.
>>>> >
>>>> > Hope this helps,
>>>> > StefanJ
>>>> >
>>>> > On 2024-02-15 21:53, Johannes Lichtenberger wrote:
>>>> > > It's a laptop, I've attached some details.
>>>> > >
>>>> > > Furthermore, if it seems worth digging deeper into the issue, the
>>>> > > JSON file is here for one week:
>>>> > > https://www.transfernow.net/dl/20240215j9NaPTc0
>>>> > >
>>>> > > You'd have to unzip into bundles/sirix-core/src/test/resources/json,
>>>> > > remove the @Disabled annotation and run the test
>>>> > > JsonShredderTest::testChicagoDescendantAxis
>>>> > >
>>>> > > The test JVM parameters are specified in the parent build.gradle in
>>>> > > the project root folder.
>>>> > >
>>>> > > The GitHub repo: https://github.com/sirixdb/sirix
>>>> > >
>>>> > > Screenshot from 2024-02-15 21-43-33.png
>>>> > >
>>>> > > kind regards
>>>> > > Johannes
>>>> > >
>>>> > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth:
>>>> > >
>>>> > >     Just curious - what CPU, physical memory and OS are you using?
>>>> > >     Sent from my iPhone
>>>> > >
>>>> > >     On Feb 15, 2024, at 12:23 PM, Johannes Lichtenberger wrote:
>>>> > >
>>>> > >     I guess I don't know which JDK it picks for the tests, but I
>>>> > >     guess OpenJDK
>>>> > >
>>>> > >     Johannes Lichtenberger schrieb am Do., 15. Feb. 2024, 17:58:
>>>> > >
>>>> > >         However, it's the same with: ./gradlew
>>>> > >         -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2
>>>> > >         :sirix-core:test --tests
>>>> > >         io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>>>> > >         using OpenJDK hopefully
>>>> > >
>>>> > >         Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes
>>>> > >         Lichtenberger:
>>>> > >
>>>> > >             I've attached two logs, the first one without
>>>> > >             -XX:+Generational, the second one with the option set,
>>>> > >             even though I also saw that generational ZGC is going
>>>> > >             to be supported in GraalVM 24.1 in September... so not
>>>> > >             sure what this does :)
>>>> > >
>>>> > >             Am Do., 15. Feb. 2024 um 17:52 Uhr schrieb Johannes
>>>> > >             Lichtenberger
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lichtenberger.johannes at gmail.com  Tue Feb 20 15:58:58 2024
From: lichtenberger.johannes at gmail.com (Johannes Lichtenberger)
Date: Tue, 20 Feb 2024 16:58:58 +0100
Subject: [External] : Re: Generational ZGC issue
In-Reply-To: <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com>
References: <3BF4611F-4F9F-4443-9E8E-F3BDD05A0DFB@me.com>
 <448322eb-72a8-4e64-bd68-ab42f799164a@oracle.com>
 <63b4162f-2c4a-4acc-bbc7-655e712c18e8@oracle.com>
 <0261b2f1-e5fe-434b-b3bc-453de8021223@oracle.com>
 <98209DE5-9B24-4864-8B94-FF6D1A817623@oracle.com>
Message-ID:

So, a short summary:

All in all, the problem on the one hand was that I mixed the GraalVM JIT
compiler with C2, because I specified the generational Shenandoah version,
which the Graal JIT compiler can't handle as of now (using Temurin and
DirectByteBuffers, the test finished ~7% slower than with G1). Switching
from DirectByteBuffers to on-heap ByteBuffers, generational ZGC was about
as fast using Temurin as with G1.

For the time being, I think for Sirix it's currently best to use GraalVM +
G1 or ZGC (end of September I think also generational ZGC will be available
on the GraalVM).

I'll check a native image next with profile-guided optimizations, but sadly
this means no ZGC (only G1 available!?)...

Kind regards
Johannes

Erik Osterlund schrieb am Sa., 17. Feb. 2024, 19:13:

> Could you check how much is user time vs system time? It smells like you
> are swapping. When swapping starts, performance goes out of the window.
>
> /Erik
>
> On 17 Feb 2024, at 16:21, Johannes Lichtenberger <
> lichtenberger.johannes at gmail.com> wrote:
>
> I'll check later on whether the test still fails with the 2g max, and
> I'll have to check as well whether it's still the OutOfMemoryError (as
> I'm not at home currently). But every time, everything freezes for a
> couple of seconds.
>
> In any case, isn't it strange that with G1 and ZGC the runtime is very
> close (with G1 having the upper hand slightly) when switching to on-heap
> ByteBuffers, but with generational ZGC the runtime almost exactly
> doubles? I thought at some point the generational version should make the
> non-generational one obsolete, and as almost every object dies young the
> generational ZGC should be better, as you wrote!?
>
> Kind regards and have a nice weekend (and kind of feel sorry for
> bothering that much)
> Johannes
>
> Stefan Johansson schrieb am Sa., 17. Feb. 2024, 15:54:
>
>> Ok, when you say crashes, what do you mean? Are you still seeing the
>> same OutOfMemoryError, or are we talking about an actual JVM crash? Or
>> is it the Linux OOM killer stepping in because of high memory pressure?
>>
>> If this is with the new setting of 5g for direct memory, it could be
>> that these 3 extra gigs of memory are pushing you over the limit of what
>> can be handled by your laptop. Generally, if you start swapping, the
>> performance is out the door and you need to look at the configuration.
>> Maybe the 2g for direct memory is reasonable on this setup to avoid
>> swapping. I looked a bit at the total memory usage for the process here
>> and it seems to be around 20G.
>>
>> Stefan
>>
>> On 2024-02-17 12:24, Johannes Lichtenberger wrote:
>> > So, switching back to DirectByteBuffers, and removing the disabling of
>> > explicit GCs, still crashes on my laptop (swapping + close to 32 GB
>> > RAM used)...
>> >
>> > kind regards
>> > Johannes
>> >
>> > Am Sa., 17. Feb. 2024 um 10:22 Uhr schrieb Stefan Johansson:
>> >
>> > On 2024-02-17 00:36, Johannes Lichtenberger wrote:
>> > > I just removed "-XX:+DisableExplicitGC", increased the max direct
>> > > memory size to 5g (-XX:MaxDirectMemorySize=5g), but also changed
>> > > Bytes::elasticByteBuffer to Bytes.elasticHeapByteBuffer(60_000)
>> > > to use on-heap ByteBuffers.
>> > >
>> >
>> > Just for clarity, when using HeapByteBuffers the MaxDirectMemorySize
>> > has no effect since the ByteBuffers will be stored on the heap. But if
>> > you keep going with DirectByteBuffers, this might make sense to give
>> > some more head room.
>> >
>> > > However, the performance seems to be way worse. I've repeated the
>> > > test several times, but with both G1 and non-generational ZGC it's
>> > > ~50s for importing the JSON file in the first case vs ~100s using
>> > > generational ZGC, using Temurin 21.0.2 with similar values for the
>> > > actual traversals.
>> > >
>> >
>> > Ok, sounds like using DirectByteBuffers is a performance win here. If
>> > so I would just continue testing using DirectByteBuffers and allowing
>> > explicit GCs to ensure they are cleaned out properly.
>> >
>> > > From the log on STDOUT, I can see this (meaning 0,319s and
>> > > 0,440s... pause times?)
>> > >
>> >
>> > No, with ZGC the time here is not the pause time, it's the time to
>> > complete the whole GC. ZGC is a concurrent GC, meaning that most of
>> > the GC work is done concurrently with the Java application still
>> > running. There are still some very short pauses, all way below 1ms.
>> > You can see them if you look at the detailed log:
>> >
>> > [30,938s][info][gc          ] GC(3) Minor Collection (Allocation Rate)
>> > [30,938s][info][gc,phases   ] GC(3) y: Young Generation
>> > [30,938s][info][gc,phases   ] GC(3) y: Pause Mark Start 0,060ms
>> > [31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark 383,563ms
>> > [31,322s][info][gc,phases   ] GC(3) y: Pause Mark End 0,046ms
>> > [31,322s][info][gc,phases   ] GC(3) y: Concurrent Mark Free 0,009ms
>> > [31,322s][info][gc,phases   ] GC(3) y: Concurrent Reset Relocation Set 0,201ms
>> > [31,335s][info][gc,phases   ] GC(3) y: Concurrent Select Relocation Set 13,228ms
>> > [31,335s][info][gc,phases   ] GC(3) y: Pause Relocate Start 0,019ms
>> > [31,381s][info][gc,phases   ] GC(3) y: Concurrent Relocate 45,967ms
>> > [31,382s][info][gc,phases   ] GC(3) y: Young Generation 9726M(63%)->518M(3%) 0,444s
>> > [31,382s][info][gc          ] GC(3) Minor Collection (Allocation Rate) 9726M(63%)->518M(3%) 0,444s
>> >
>> > Here I included the phase logs for a single GC of the young
>> > generation, where you can clearly see how much time was spent in which
>> > part of the GC, and as you can see the three pauses are all very,
>> > very short.
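For reference, log output of this shape can be produced with a command line along these lines (flag names as in JDK 21, where the generational mode is opt-in via -XX:+ZGenerational; the log file name and the -jar target are placeholders, since the test here is actually launched through Gradle):

java -XX:+UseZGC -XX:+ZGenerational \
     -Xlog:gc,gc+phases:file=zgc-detailed.log \
     -jar app.jar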
>> > >> > Stefan >> > >> > > [35,718s][info ][gc ] GC(9) Minor Collection (Allocation >> > Rate) >> > > 12462M(81%)->1556M(10%) 0,319s >> > > [40,871s][info ][gc ] GC(10) Minor Collection (Allocation >> > Rate) >> > > [41,311s][info ][gc ] GC(10) Minor Collection (Allocation >> > Rate) >> > > 13088M(85%)->1432M(9%) 0,440s >> > > [46,236s][info ][gc ] GC(11) Minor Collection (Allocation >> > Rate) >> > > [46,603s][info ][gc ] GC(11) Minor Collection (Allocation >> > Rate) >> > > 12406M(81%)->1676M(11%) 0,367s >> > > [51,445s][info ][gc ] GC(12) Minor Collection (Allocation >> > Rate) >> > > [51,846s][info ][gc ] GC(12) Minor Collection (Allocation >> > Rate) >> > > 12848M(84%)->1556M(10%) 0,401s >> > > [56,203s][info ][gc ] GC(13) Major Collection (Proactive) >> > > [56,368s][info ][gc ] GC(13) Major Collection (Proactive) >> > > 11684M(76%)->484M(3%) 0,166s >> > > >> > > kind regards >> > > Johannes >> > > >> > > Am Fr., 16. Feb. 2024 um 22:39 Uhr schrieb Erik Osterlund >> > > >> > > >>>: >> > > >> > > It?s worth noting that when using ZGC, calling System.gc >> does not >> > > invoke a classic disastrously long GC pause. Instead, a >> > concurrent >> > > GC is triggered, which should be not that noticeable to the >> > > application. The thread calling System.gc is blocked until >> > the GC is >> > > done, but the other threads can run freely. >> > > >> > > /Erik >> > > >> > > > On 16 Feb 2024, at 21:55, Stefan Johansson >> > > > > >> > > > >> >> > > wrote: >> > > > >> > > > ? >> > > > >> > > >> On 2024-02-16 18:04, Johannes Lichtenberger wrote: >> > > >> Thanks a lot, I wasn't even aware of the fact, that >> > > DirectByteBuffers use System.gc() and I always had in mind >> that >> > > calling System.gc() at least in application code is bad >> > practice (or >> > > at least we shouldn't rely on it) and I think I read >> somewhere a >> > > while ago, that it's recommended to even disable this, but >> may be >> > > completely wrong, of course. >> > > > In most cases callling System.gc() is bad practice, in >> some >> > > special cases it might be needed. >> > > > >> > > >> I'll change it to on-heap byte buffers tomorrow :-) >> > > >> I think your GC log entries were from G1, right? It >> seems ZGC >> > > always tries to use the full heap :-) >> > > > >> > > > Yes, the snippet was G1, it was mostly to show that the >> > pressure >> > > isn't high. You are correct that ZGC uses more of the given >> > heap but >> > > the collections are pretty far apart and I'm certian it would >> > > function well with a smaller heap as well. Maybe in that case >> > some >> > > Major collections would be triggered. >> > > > >> > > >> Kind regards and thanks for sharing your insights. >> > > > >> > > > No problem. We appriciate the feedback, >> > > > StefanJ >> > > > >> > > >> Have a nice weekend as well >> > > >> Johannes >> > > >> Stefan Johansson > > >> > > > > > >> > > > > >> > > > > >>> schrieb am Fr., 16. Feb. >> > > 2024, 17:38: >> > > >> Hi, >> > > >> Some comments inline. >> > > >> On 2024-02-16 16:47, Johannes Lichtenberger wrote: >> > > >> > Thanks a lot for looking into it, I've added >> > > >> > `-XX:MaxDirectMemorySize=2g` only recently, but >> > without it >> > > failed as >> > > >> > well, so not sure what the default is. 
Will >> definitely >> > > check your >> > > >> > suggestions :-) >> > > >> > >> > > >> If you don't set a limit it will be set to: >> > > >> Runtime.getRuntime().maxMemory() >> > > >> So likely a good idea to set a reasonable limit, but >> the >> > > smaller the >> > > >> limit is the more frequent we need to run reference >> > > processing to allow >> > > >> memory to be freed up. >> > > >> > Sadly I'm currently working alone on the project >> in my >> > > spare time >> > > >> > (besides professionally switched from Java/Kotlin >> > stuff to the >> > > >> embedded >> > > >> > software world) and I'm not sure if the current >> > > architecture of >> > > >> Sirix is >> > > >> > limited by too much GC pressure. I'd probably have >> > to check >> > > >> Cassandra at >> > > >> > some point and look into flame graphs and stuff for >> > their >> > > >> integration >> > > >> > tests, but maybe you can give some general >> > insights/advice... >> > > >> > >> > > >> > Yesterday evening I switched to other JDKs (also I >> > want to >> > > test with >> > > >> > Shenandoah in particular), but I think especially >> the >> > > better escape >> > > >> > analysis of the GraalVM is a huge plus in the case >> of >> > > SirixDB (for >> > > >> > insertion on my laptop it's ~90s vs ~60s), but I >> > think it >> > > should be >> > > >> > faster and currently my suspicion is that garbage >> > is a major >> > > >> performance >> > > >> > issue. >> > > >> > >> > > >> > Maybe the GC pressure in general is a major issue, >> > as in >> > > the CPU >> > > >> Flame >> > > >> > graph IIRC the G1 had about 20% stack frames >> > allocated and non >> > > >> > generational ZGC even around 40% taking all threads >> > into >> > > account. >> > > >> > >> > > >> From what I/we see, the GC pressure in the given >> test is >> > > not high. >> > > >> The >> > > >> allocation rate is below 1GB/s and since most of it >> > die young >> > > the GCs >> > > >> are fairly cheap. In this log snippet G1 shows a GC >> > every 5s >> > > and the >> > > >> pause time is below 50ms: >> > > >> [296,016s][info ][gc ] GC(90) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5413M->1849M(6456M) 35,577ms >> > > >> [301,103s][info ][gc ] GC(91) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5417M->1848M(6456M) 33,357ms >> > > >> [306,041s][info ][gc ] GC(92) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5416M->1848M(6456M) 32,763ms >> > > >> [310,849s][info ][gc ] GC(93) Pause Young >> > (Normal) (G1 >> > > >> Evacuation >> > > >> Pause) 5416M->1847M(6456M) 33,086ms >> > > >> I also see that the heap never expands to more the >> > ~6.5GB even >> > > >> though it >> > > >> is allow to be 15GB and this also suggest that the GC >> > is not >> > > under much >> > > >> pressure. As I said in the previous mail, the reason >> > > Generational ZGC >> > > >> don't free up the direct memory without the >> > System.gc() calls >> > > is that >> > > >> the GC pressure is not high enough to trigger any >> Major >> > > cycles. So I >> > > >> would strongly recommend you to not run with >> > > -XX+DisableExplicitGC >> > > >> unless you really have to. Since you are using >> > > DirectByteBuffers and >> > > >> they use System.gc() to help free memory when the >> limit is >> > > reached. 
>> > > >> > So in general I'm thinking about backing the >> > > KeyValueLeafPages with >> > > >> > MemorySegments, but I think due to variable sized >> pages >> > > it's getting >> > > >> > tricky, plus I currently don't have the time for >> > changing >> > > >> fundamental >> > > >> > stuff and I'm even not sure if it'll bring a >> > performance >> > > boost, as I >> > > >> > have to adapt neighbour relationships of the nodes >> > often and >> > > >> off-heap >> > > >> > memory access might be slightly worse performance >> wise. >> > > >> > >> > > >> > What do you think? >> > > >> > >> > > >> I know to little about the application to be able to >> give >> > > advice here, >> > > >> but I would first start with having most memory on >> > heap. Only >> > > large >> > > >> long >> > > >> lived stuff off-heap, if really needed. Looking at the >> > test >> > > at hand, it >> > > >> really doesn't look like it is long lived stuff that >> is >> > > placed off heap. >> > > >> > I've attached a memory flame graph and there it >> > seems the >> > > byte array >> > > >> > from deserializing each page is prominent, but that >> > might be >> > > >> something I >> > > >> > can't even avoid (after decompression via Snappy >> or via >> > > another >> > > >> lib and >> > > >> > maybe also decryption in the future). >> > > >> > >> > > >> > As of now G1 with GraalVM seems to perform best (a >> > little >> > > bit better >> > > >> > than with non generational ZGC, but I thought ZGC >> > or maybe >> > > >> Shenandoah >> > > >> > would improve the situation). But as said I may >> have to >> > > generate way >> > > >> > less garbage after all in general for good >> > performance!? >> > > >> > >> > > >> > All in all maybe due to most objects die young >> > maybe also the >> > > >> > generational GCs are not needed (that said if >> > enough RAM is >> > > >> available >> > > >> > and the Caffeine Caches are sized accordingly most >> > objects may >> > > >> die old). >> > > >> > But apparently the byte arrays holding the page >> > data still die >> > > >> young (in >> > > >> > AbstractReader::deserialize). In fact I'm not even >> sure >> > > why they >> > > >> escape, >> > > >> > but currently I'm on my phone. >> > > >> > >> > > >> It's when most objects die young the Generational GC >> > really >> > > shines, >> > > >> because it can handle the short lived objects without >> > having >> > > to look at >> > > >> the long lived objects. So I would say Generational >> > ZGC is a >> > > good fit >> > > >> here, but we need to let the System.gc() run to allow >> > reference >> > > >> processing or slightly re-design and use >> HeapByteBuffers. >> > > >> Have a nice weekend, >> > > >> Stefan >> > > >> > Kind regards >> > > >> > Johannes >> > > >> > >> > > >> > Stefan Johansson > > >> > > > > > >> > > >> > > >> > > > > >> >> > > >> > > > >> > > > > > >> > > >> > > >> > > > > >>>> schrieb am Fr., 16. Feb. >> > > >> 2024, 13:43: >> > > >> > >> > > >> > Hi Johannes, >> > > >> > >> > > >> > We've spent some more time looking at this and >> > getting the >> > > >> json-file to >> > > >> > reproduced it made it easy to verify our >> > suspicions. >> > > Thanks for >> > > >> > uploading it. >> > > >> > >> > > >> > There are a few things playing together here. >> > The test is >> > > >> making quite >> > > >> > heavy use of DirectByteBuffers and you limit >> > the usage >> > > to 2G >> > > >> > (-XX:MaxDirectMemorySize=2g). 
The life cycle >> and >> > > freeing of >> > > >> the native >> > > >> > memory part of the DirectByteBuffer rely on >> > reference >> > > >> processing. In >> > > >> > generational ZGC reference processing is only >> done >> > > during Major >> > > >> > collections and since the general GC preassure >> > in this >> > > >> benchmark is >> > > >> > very >> > > >> > low (most objects die young), we do not trigger >> > that >> > > many Major >> > > >> > collections. >> > > >> > >> > > >> > Normaly this would not be a problem. To avoid >> > throwing >> > > an out >> > > >> of memory >> > > >> > error (due to hitting the direct buffer memory >> > limit) too >> > > >> early the JDK >> > > >> > triggers a System.gc(). This should trigger >> > reference >> > > >> procesing and all >> > > >> > buffers that are no longer in use would be >> freed. >> > > Since you >> > > >> specify the >> > > >> > option -XX:+DisableExplicitGC all these calls >> to >> > > trigger GCs are >> > > >> > ignored >> > > >> > and no direct memory will be freed. So in our >> > testing, >> > > just >> > > >> removing >> > > >> > this flags makes the test pass. >> > > >> > >> > > >> > Another solution is to look at using >> > HeapByteBuffers >> > > instead >> > > >> and don't >> > > >> > have to worry about the direct memory usage. >> The >> > > OpenHFT lib >> > > >> seems to >> > > >> > have support for this by just using >> > > >> elasticHeapByteBuffer(...) instead >> > > >> > of elasticByteBuffer(). >> > > >> > >> > > >> > Lastly, the reason for this working with >> > > non-generational ZGC is >> > > >> > that it >> > > >> > does reference processing for every GC. >> > > >> > >> > > >> > Hope this helps, >> > > >> > StefanJ >> > > >> > >> > > >> > >> > > >> > On 2024-02-15 21:53, Johannes Lichtenberger >> wrote: >> > > >> > > It's a laptop, I've attached some details. 
>> > > Furthermore, if it seems worth digging deeper into the issue, the
>> > > JSON file is here for one week:
>> > > https://www.transfernow.net/dl/20240215j9NaPTc0
>> > >
>> > > You'd have to unzip into bundles/sirix-core/src/test/resources/json,
>> > > remove the @Disabled annotation and run the test
>> > > JsonShredderTest::testChicagoDescendantAxis
>> > >
>> > > The test JVM parameters are specified in the parent build.gradle in
>> > > the project root folder.
>> > >
>> > > The GitHub repo: https://github.com/sirixdb/sirix
>> > >
>> > > Screenshot from 2024-02-15 21-43-33.png
>> > >
>> > > kind regards
>> > > Johannes
>> > >
>> > > Am Do., 15. Feb. 2024 um 20:01 Uhr schrieb Peter Booth:
>> > >
>> > >     Just curious - what CPU, physical memory and OS are you using?
>> > >     Sent from my iPhone
>> > >
>> > >     On Feb 15, 2024, at 12:23 PM, Johannes Lichtenberger wrote:
>> > >
>> > >     I guess I don't know which JDK it picks for the tests, but I
>> > >     guess OpenJDK
>> > >
>> > >     Johannes Lichtenberger schrieb am Do., 15. Feb. 2024, 17:58:
>> > >
>> > >         However, it's the same with: ./gradlew
>> > >         -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2
>> > >         :sirix-core:test --tests
>> > >         io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
>> > >         using OpenJDK hopefully
>> > >
>> > >         Am Do., 15. Feb. 2024 um 17:54 Uhr schrieb Johannes
>> > >         Lichtenberger
-------------- next part --------------
An HTML attachment was scrubbed...
URL: