Generational ZGC issue
Johannes Lichtenberger
lichtenberger.johannes at gmail.com
Fri Feb 16 23:36:09 UTC 2024
I just removed "-XX:+DisableExplicitGC", increased the max direct memory size
to 5g (-XX:MaxDirectMemorySize=5g), and also changed Bytes::elasticByteBuffer
to Bytes.elasticHeapByteBuffer(60_000)
to use on-heap ByteBuffers.
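
For reference, a minimal sketch of the change (assuming the Chronicle Bytes
API that Stefan points to below; 60_000 is just the initial capacity):

    import net.openhft.chronicle.bytes.Bytes;

    // before: elastic buffer backed by a DirectByteBuffer (off-heap)
    var offHeap = Bytes.elasticByteBuffer();
    // after: elastic buffer backed by a HeapByteBuffer (on-heap)
    var onHeap = Bytes.elasticHeapByteBuffer(60_000);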
However, the performance seems to be way worse. I've repeated the test
several times: with both G1 and non-generational ZGC it's ~50s for importing
the JSON file vs ~100s using generational ZGC, using Temurin 21.0.2, with
similar values for the actual traversals.
From the log on STDOUT, I can see this (do 0,319s, 0,440s, etc. denote pause
times?):
[35,718s][info ][gc ] GC(9) Minor Collection (Allocation Rate) 12462M(81%)->1556M(10%) 0,319s
[40,871s][info ][gc ] GC(10) Minor Collection (Allocation Rate)
[41,311s][info ][gc ] GC(10) Minor Collection (Allocation Rate) 13088M(85%)->1432M(9%) 0,440s
[46,236s][info ][gc ] GC(11) Minor Collection (Allocation Rate)
[46,603s][info ][gc ] GC(11) Minor Collection (Allocation Rate) 12406M(81%)->1676M(11%) 0,367s
[51,445s][info ][gc ] GC(12) Minor Collection (Allocation Rate)
[51,846s][info ][gc ] GC(12) Minor Collection (Allocation Rate) 12848M(84%)->1556M(10%) 0,401s
[56,203s][info ][gc ] GC(13) Major Collection (Proactive)
[56,368s][info ][gc ] GC(13) Major Collection (Proactive) 11684M(76%)->484M(3%) 0,166s
kind regards
Johannes
On Fri, Feb 16, 2024 at 22:39, Erik Osterlund <erik.osterlund at oracle.com> wrote:
> It's worth noting that when using ZGC, calling System.gc does not invoke a
> classic, disastrously long GC pause. Instead, a concurrent GC is triggered,
> which should be barely noticeable to the application. The thread calling
> System.gc is blocked until the GC is done, but the other threads can run
> freely.
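>
> A minimal sketch of that behavior (hypothetical demo, not from the test;
> run with -XX:+UseZGC):
>
>     var worker = new Thread(() -> {
>         while (!Thread.currentThread().isInterrupted()) {
>             // application work keeps running during the concurrent cycle
>         }
>     });
>     worker.start();
>     System.gc(); // blocks only this thread until the concurrent GC completes
>     worker.interrupt();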
>
> /Erik
>
> > On 16 Feb 2024, at 21:55, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> >
> >
> >
> >> On 2024-02-16 18:04, Johannes Lichtenberger wrote:
> >> Thanks a lot, I wasn't even aware that DirectByteBuffers use System.gc(),
> >> and I always had in mind that calling System.gc(), at least in application
> >> code, is bad practice (or at least that we shouldn't rely on it). I think I
> >> read somewhere a while ago that it's even recommended to disable it, but I
> >> may be completely wrong, of course.
> > In most cases calling System.gc() is bad practice; in some special
> > cases it might be needed.
> >
> >> I'll change it to on-heap byte buffers tomorrow :-)
> >> I think your GC log entries were from G1, right? It seems ZGC always
> >> tries to use the full heap :-)
> >
> > Yes, the snippet was G1; it was mostly to show that the pressure isn't
> > high. You are correct that ZGC uses more of the given heap, but the
> > collections are pretty far apart and I'm certain it would function well
> > with a smaller heap as well. Maybe in that case some Major collections
> > would be triggered.
> >
> >> Kind regards and thanks for sharing your insights.
> >
> > No problem. We appreciate the feedback,
> > StefanJ
> >
> >> Have a nice weekend as well
> >> Johannes
> >> On Fri, Feb 16, 2024 at 17:38, Stefan Johansson
> >> <stefan.johansson at oracle.com> wrote:
> >> Hi,
> >> Some comments inline.
> >> On 2024-02-16 16:47, Johannes Lichtenberger wrote:
> >> > Thanks a lot for looking into it. I've added
> >> > `-XX:MaxDirectMemorySize=2g` only recently, but without it the test
> >> > failed as well, so I'm not sure what the default is. Will definitely
> >> > check your suggestions :-)
> >> >
> >> If you don't set a limit, it defaults to
> >> Runtime.getRuntime().maxMemory(), so it's likely a good idea to set a
> >> reasonable limit. But the smaller the limit is, the more frequently we
> >> need to run reference processing to allow memory to be freed up.
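> >>
> >> A hedged illustration of that default:
> >>
> >>     // with no -XX:MaxDirectMemorySize set, the direct-memory limit
> >>     // defaults to the maximum heap size:
> >>     long directLimit = Runtime.getRuntime().maxMemory(); // in bytes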
> >> > Sadly I'm currently working alone on the project in my spare time
> >> > (professionally I switched from Java/Kotlin stuff to the embedded
> >> > software world), and I'm not sure if the current architecture of Sirix
> >> > is limited by too much GC pressure. I'd probably have to check
> >> > Cassandra at some point and look into flame graphs and stuff for their
> >> > integration tests, but maybe you can give some general insights/advice...
> >> >
> >> > Yesterday evening I switched to other JDKs (I also want to test with
> >> > Shenandoah in particular), but I think especially the better escape
> >> > analysis of GraalVM is a huge plus in the case of SirixDB (for
> >> > insertion on my laptop it's ~90s vs ~60s). Still, I think it should be
> >> > faster, and currently my suspicion is that garbage is a major
> >> > performance issue.
> >> >
> >> > Maybe the GC pressure in general is a major issue: in the CPU flame
> >> > graph, IIRC, G1 had about 20% allocation-related stack frames, and
> >> > non-generational ZGC even around 40%, taking all threads into account.
> >> >
> >> From what I/we see, the GC pressure in the given test is not high. The
> >> allocation rate is below 1GB/s, and since most of it dies young the GCs
> >> are fairly cheap. In this log snippet G1 shows a GC every 5s and the
> >> pause times are below 50ms:
> >> [296,016s][info ][gc ] GC(90) Pause Young (Normal) (G1 Evacuation Pause) 5413M->1849M(6456M) 35,577ms
> >> [301,103s][info ][gc ] GC(91) Pause Young (Normal) (G1 Evacuation Pause) 5417M->1848M(6456M) 33,357ms
> >> [306,041s][info ][gc ] GC(92) Pause Young (Normal) (G1 Evacuation Pause) 5416M->1848M(6456M) 32,763ms
> >> [310,849s][info ][gc ] GC(93) Pause Young (Normal) (G1 Evacuation Pause) 5416M->1847M(6456M) 33,086ms
> >> I also see that the heap never expands to more than ~6.5GB even though
> >> it is allowed to be 15GB, and this also suggests that the GC is not under
> >> much pressure. As I said in the previous mail, the reason Generational
> >> ZGC doesn't free up the direct memory without the System.gc() calls is
> >> that the GC pressure is not high enough to trigger any Major cycles. So I
> >> would strongly recommend you not to run with -XX:+DisableExplicitGC
> >> unless you really have to, since you are using DirectByteBuffers and they
> >> use System.gc() to help free memory when the limit is reached.
> >> > So in general I'm thinking about backing the KeyValueLeafPages with
> >> > MemorySegments, but I think due to the variable-sized pages it's
> >> > getting tricky. Plus, I currently don't have the time for changing
> >> > fundamental stuff, and I'm not even sure it'll bring a performance
> >> > boost, as I often have to adapt the neighbour relationships of the
> >> > nodes, and off-heap memory access might be slightly worse
> >> > performance-wise.
> >> >
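> >> > A minimal sketch of what that could look like (hypothetical, using the
> >> > Java 21 FFM API; computeSerializedSize, page and nodeCount are made-up
> >> > placeholders):
> >> >
> >> >     import java.lang.foreign.Arena;
> >> >     import java.lang.foreign.MemorySegment;
> >> >     import java.lang.foreign.ValueLayout;
> >> >
> >> >     try (Arena arena = Arena.ofConfined()) {
> >> >         long size = computeSerializedSize(page); // pages vary in size
> >> >         MemorySegment segment = arena.allocate(size);
> >> >         segment.set(ValueLayout.JAVA_INT, 0, nodeCount); // page header
> >> >         // ... write the nodes at their respective offsets ...
> >> >     } // the off-heap memory is freed deterministically here
> >> >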
> >> > What do you think?
> >> >
> >> I know too little about the application to be able to give advice here,
> >> but I would start with having most memory on heap, and only put large,
> >> long-lived stuff off-heap, if really needed. Looking at the test at hand,
> >> it really doesn't look like the stuff placed off-heap is long-lived.
> >> > I've attached a memory flame graph, and there it seems the byte array
> >> > from deserializing each page is prominent, but that might be something
> >> > I can't even avoid (after decompression via Snappy or another lib, and
> >> > maybe also decryption in the future).
> >> >
> >> > As of now, G1 with GraalVM seems to perform best (a little bit better
> >> > than with non-generational ZGC, but I thought ZGC or maybe Shenandoah
> >> > would improve the situation). But as said, maybe I simply have to
> >> > generate way less garbage in general for good performance!?
> >> >
> >> > All in all, since most objects die young, maybe the generational GCs
> >> > are not even needed (that said, if enough RAM is available and the
> >> > Caffeine caches are sized accordingly, most objects may die old). But
> >> > apparently the byte arrays holding the page data still die young (in
> >> > AbstractReader::deserialize). In fact, I'm not even sure why they
> >> > escape, but currently I'm on my phone.
> >> >
> >> It's when most objects die young that a generational GC really shines,
> >> because it can handle the short-lived objects without having to look at
> >> the long-lived ones. So I would say Generational ZGC is a good fit here,
> >> but we need to let System.gc() run to allow reference processing, or
> >> slightly re-design and use HeapByteBuffers.
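> >>
> >> A tiny illustration of the two buffer kinds (plain JDK API, for clarity):
> >>
> >>     var onHeap = java.nio.ByteBuffer.allocate(60_000);       // HeapByteBuffer, GC-managed
> >>     var direct = java.nio.ByteBuffer.allocateDirect(60_000); // DirectByteBuffer, freed via reference processing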
> >> Have a nice weekend,
> >> Stefan
> >> > Kind regards
> >> > Johannes
> >> >
> >> > On Fri, Feb 16, 2024 at 13:43, Stefan Johansson
> >> > <stefan.johansson at oracle.com> wrote:
> >> >
> >> > Hi Johannes,
> >> >
> >> > We've spent some more time looking at this, and getting the JSON file
> >> > to reproduce it made it easy to verify our suspicions. Thanks for
> >> > uploading it.
> >> >
> >> > There are a few things playing together here. The test is making
> >> > quite heavy use of DirectByteBuffers, and you limit the usage to 2G
> >> > (-XX:MaxDirectMemorySize=2g). The life cycle and freeing of the native
> >> > memory part of a DirectByteBuffer rely on reference processing. In
> >> > generational ZGC, reference processing is only done during Major
> >> > collections, and since the general GC pressure in this benchmark is
> >> > very low (most objects die young), we do not trigger that many Major
> >> > collections.
> >> >
> >> > Normally this would not be a problem. To avoid throwing an
> >> > out-of-memory error (due to hitting the direct buffer memory limit) too
> >> > early, the JDK triggers a System.gc(). This should trigger reference
> >> > processing, and all buffers that are no longer in use would be freed.
> >> > Since you specify the option -XX:+DisableExplicitGC, all these calls to
> >> > trigger GCs are ignored and no direct memory will be freed. So in our
> >> > testing, just removing this flag makes the test pass.
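> >> >
> >> > A minimal sketch of the mechanism (hypothetical demo, not the SirixDB
> >> > test; run with a small limit such as -XX:MaxDirectMemorySize=64m;
> >> > java.nio.Bits triggers System.gc() before giving up, so adding
> >> > -XX:+DisableExplicitGC makes the loop below hit the OutOfMemoryError
> >> > much earlier):
> >> >
> >> >     import java.nio.ByteBuffer;
> >> >
> >> >     public class DirectLimitDemo {
> >> >         public static void main(String[] args) {
> >> >             for (int i = 0; i < 1_000; i++) {
> >> >                 // each buffer becomes garbage right away, but freeing
> >> >                 // its native memory still requires reference processing
> >> >                 ByteBuffer.allocateDirect(1024 * 1024);
> >> >             }
> >> >             System.out.println("done");
> >> >         }
> >> >     }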
> >> >
> >> > Another solution is to use HeapByteBuffers instead, so you don't have
> >> > to worry about the direct memory usage at all. The OpenHFT lib seems to
> >> > support this by just using elasticHeapByteBuffer(...) instead of
> >> > elasticByteBuffer().
> >> >
> >> > Lastly, the reason this works with non-generational ZGC is that it
> >> > does reference processing on every GC.
> >> >
> >> > Hope this helps,
> >> > StefanJ
> >> >
> >> >
> >> > On 2024-02-15 21:53, Johannes Lichtenberger wrote:
> >> > > It's a laptop, I've attached some details.
> >> > >
> >> > > Furthermore, if it seems worth digging deeper into the issue, the
> >> > > JSON file is here for one week:
> >> > > https://www.transfernow.net/dl/20240215j9NaPTc0
> >> > >
> >> > > You'd have to unzip it into
> >> > > bundles/sirix-core/src/test/resources/json, remove the @Disabled
> >> > > annotation, and run the test
> >> > > JsonShredderTest::testChicagoDescendantAxis
> >> > >
> >> > > The test JVM parameters are specified in the parent build.gradle in
> >> > > the project root folder.
> >> > >
> >> > > The GitHub repo: https://github.com/sirixdb/sirix
> >> > >
> >> > > Screenshot from 2024-02-15 21-43-33.png
> >> > >
> >> > > kind regards
> >> > > Johannes
> >> > >
> >> > > On Thu, Feb 15, 2024 at 20:01, Peter Booth
> >> > > <peter_booth at me.com> wrote:
> >> > >
> >> > > Just curious - what CPU, physical memory and OS are you using?
> >> > >
> >> > > Sent from my iPhone
> >> > >
> >> > >> On Feb 15, 2024, at 12:23 PM, Johannes Lichtenberger
> >> > >> <lichtenberger.johannes at gmail.com> wrote:
> >> > >>
> >> > >>
> >> > >> I don't know which JDK it picks for the tests, but I guess OpenJDK.
> >> > >>
> >> > >> On Thu, Feb 15, 2024 at 17:58, Johannes Lichtenberger
> >> > >> <lichtenberger.johannes at gmail.com> wrote:
> >> > >>
> >> > >> However, it's the same with (using OpenJDK, hopefully):
> >> > >>
> >> > >>     ./gradlew -Dorg.gradle.java.home=/home/johannes/.jdks/openjdk-21.0.2 :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
> >> > >>
> >> > >> On Thu, Feb 15, 2024 at 17:54, Johannes Lichtenberger
> >> > >> <lichtenberger.johannes at gmail.com> wrote:
> >> > >>
> >> > >> I've attached two logs, the first one without -XX:+ZGenerational,
> >> > >> the second one with the option set, even though I also saw that
> >> > >> generational ZGC is going to be supported in GraalVM 24.1 in
> >> > >> September... so I'm not sure what this does :)
> >> > >>
> >> > >> On Thu, Feb 15, 2024 at 17:52, Johannes Lichtenberger
> >> > >> <lichtenberger.johannes at gmail.com> wrote:
> >> > >>
> >> > >> Strange, so does it simply ignore the option? The following is the
> >> > >> beginning of the output from _non_-generational ZGC:
> >> > >>
> >> > >> johannes at luna:~/IdeaProjects/sirix$ ./gradlew -Dorg.gradle.java.home=/home/johannes/.sdkman/candidates/java/21.0.2-graal :sirix-core:test --tests io.sirix.service.json.shredder.JsonShredderTest.testChicagoDescendantAxis
> >> > >>
> >> > >> > Configure project :
> >> > >> The 'sonarqube' task depends on compile tasks. This behavior is now
> >> > >> deprecated and will be removed in version 5.x. To avoid implicit
> >> > >> compilation, set property 'sonar.gradle.skipCompile' to 'true' and
> >> > >> make sure your project is compiled, before analysis has started.
> >> > >> The 'sonar' task depends on compile tasks. This behavior is now
> >> > >> deprecated and will be removed in version 5.x. To avoid implicit
> >> > >> compilation, set property 'sonar.gradle.skipCompile' to 'true' and
> >> > >> make sure your project is compiled, before analysis has started.
> >> > >> [1,627s][info ][gc ] GC(0) Garbage Collection (Metadata GC Threshold) 84M(1%)->56M(0%)
> >> > >>
> >> > >> > Task :sirix-core:test
> >> > >> [0.001s][warning][pagesize] UseLargePages disabled, no large pages configured and available on the system.
> >> > >> [1.253s][info ][gc ] Using The Z Garbage Collector
> >> > >>
> >> > >> [2,930s][info ][gc ] GC(1) Garbage Collection (Warmup) 1616M(11%)->746M(5%)
> >> > >> [4,445s][info ][gc ] GC(2) Garbage Collection (Warmup) 3232M(21%)->750M(5%)
> >> > >> [5,751s][info ][gc ] GC(3) Garbage Collection (Warmup) 4644M(30%)->1356M(9%)
> >> > >> [9,886s][info ][gc ] GC(4) Garbage Collection (Allocation Rate) 10668M(69%)->612M(4%)
> >> > >> [10,406s][info ][gc ] GC(5) Garbage Collection (Allocation Rate) 2648M(17%)->216M(1%)
> >> > >> [13,931s][info ][gc ] GC(6) Garbage Collection (Allocation Rate) 11164M(73%)->1562M(10%)
> >> > >> [16,908s][info ][gc ] GC(7) Garbage Collection (Allocation Rate) 11750M(76%)->460M(3%)
> >> > >> [20,690s][info ][gc ] GC(8) Garbage Collection (Allocation Rate) 12670M(82%)->726M(5%)
> >> > >> [24,376s][info ][gc ] GC(9) Garbage Collection (Allocation Rate) 13422M(87%)->224M(1%)
> >> > >> [28,152s][info ][gc ] GC(10) Garbage Collection (Proactive) 13474M(88%)->650M(4%)
> >> > >> [31,526s][info ][gc ] GC(11) Garbage Collection (Allocation Rate) 12072M(79%)->1472M(10%)
> >> > >> [34,754s][info ][gc ] GC(12) Garbage Collection (Allocation Rate) 13050M(85%)->330M(2%)
> >> > >> [38,478s][info ][gc ] GC(13) Garbage Collection (Allocation Rate) 13288M(87%)->762M(5%)
> >> > >> [41,936s][info ][gc ] GC(14) Garbage Collection (Proactive) 13294M(87%)->504M(3%)
> >> > >> [45,353s][info ][gc ] GC(15) Garbage Collection (Allocation Rate) 12984M(85%)->268M(2%)
> >> > >> [48,861s][info ][gc ] GC(16) Garbage Collection (Allocation Rate) 13008M(85%)->306M(2%)
> >> > >> [52,133s][info ][gc ] GC(17) Garbage Collection (Proactive) 12042M(78%)->538M(4%)
> >> > >> [55,705s][info ][gc ] GC(18) Garbage Collection (Allocation Rate) 12420M(81%)->1842M(12%)
> >> > >> [59,000s][info ][gc ] GC(19) Garbage Collection (Allocation Rate) 12458M(81%)->1422M(9%)
> >> > >> [64,501s][info ][gc ] Allocation Stall (Test worker) 59,673ms
> >> > >> [64,742s][info ][gc ] Allocation Stall (Test worker) 240,077ms
> >> > >> [65,806s][info ][gc ] GC(20) Garbage Collection (Allocation Rate) 13808M(90%)->6936M(45%)
> >> > >> [66,476s][info ][gc ] GC(21) Garbage Collection (Allocation Stall) 7100M(46%)->4478M(29%)
> >> > >> [69,471s][info ][gc ] GC(22) Garbage Collection (Allocation Rate) 10098M(66%)->5888M(38%)
> >> > >> [72,252s][info ][gc ] GC(23) Garbage Collection (Allocation Rate) 11226M(73%)->5816M(38%)
> >> > >>
> >> > >> ...
> >> > >>
> >> > >> So even here I can see some allocation stalls.
> >> > >>
> >> > >> Running the same with -XX:+ZGenerational in build.gradle, probably
> >> > >> using GraalVM, does something different, but I don't know what... at
> >> > >> least off-heap memory is exhausted at some point due to direct byte
> >> > >> buffer usage!?
> >> > >>
> >> > >> So, I'm not sure what the difference is, though.
> >> > >>
> >> > >> With this:
> >> > >>
> >> > >>     "-XX:+UseZGC",
> >> > >>     "-Xlog:gc*=debug:file=zgc-generational-detailed.log",
> >> > >>     "-XX:+ZGenerational",
> >> > >>     "-verbose:gc",
> >> > >>     "-XX:+HeapDumpOnOutOfMemoryError",
> >> > >>     "-XX:HeapDumpPath=heapdump.hprof",
> >> > >>     "-XX:MaxDirectMemorySize=2g",
> >> > >>
> >> > >>
> >> > >> Caused by: java.lang.OutOfMemoryError: Cannot reserve 60000 bytes of direct buffer memory (allocated: 2147446560, limit: 2147483648)
> >> > >>     at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
> >> > >>     at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:111)
> >> > >>     at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360)
> >> > >>     at net.openhft.chronicle.bytes.internal.NativeBytesStore.elasticByteBuffer(NativeBytesStore.java:191)
> >> > >>     at net.openhft.chronicle.bytes.BytesStore.elasticByteBuffer(BytesStore.java:192)
> >> > >>     at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:176)
> >> > >>     at net.openhft.chronicle.bytes.Bytes.elasticByteBuffer(Bytes.java:148)
> >> > >>     at io.sirix.access.trx.page.NodePageTrx.lambda$parallelSerializationOfKeyValuePages$1(NodePageTrx.java:443)
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Thu, Feb 15, 2024 at 12:05, Stefan Karlsson
> >> > >> <stefan.karlsson at oracle.com> wrote:
> >> > >>
> >> > >> Hi Johannes,
> >> > >>
> >> > >> We tried to look at the log files and the JFR files, but couldn't
> >> > >> find an OutOfMemoryError in any of them. Do you think you could try
> >> > >> to rerun and capture the entire GC log from the OutOfMemoryError
> >> > >> run?
> >> > >>
> >> > >> A few things to note:
> >> > >>
> >> > >> 1) You seem to be running the Graal compiler. Graal doesn't support
> >> > >> Generational ZGC, so you are going to run different compilers when
> >> > >> you compare single-gen ZGC with Generational ZGC.
> >> > >>
> >> > >> 2) It's not clear to me that the provided JFR files match the
> >> > >> provided log files.
> >> > >>
> >> > >> 3) The JFR files show that -XX:+UseLargePages is used, but the
> >> > >> gc+init logs show 'Large Page Support: Disabled'; you might want to
> >> > >> look into why that is the case.
> >> > >>
> >> > >> 4) The single-gen JFR file has a -Xlog:gc:g1-chicago.log line. It
> >> > >> should probably be named zgc-chicago.log.
> >> > >>
> >> > >> Cheers,
> >> > >> StefanK
> >> > >>
> >> > >> On 2024-02-14 17:36, Johannes Lichtenberger wrote:
> >> > >> > Hello,
> >> > >> >
> >> > >> > a test of my little DB project fails using generational ZGC, but
> >> > >> > not with non-generational ZGC and G1 (out-of-memory error).
> >> > >> >
> >> > >> > To be honest, I guess the allocation rate, and thus the GC
> >> > >> > pressure, when traversing a resource in SirixDB is unacceptable.
> >> > >> > The strategy is to create fine-grained nodes from JSON input and
> >> > >> > store these in a trie. First, a 3.8GB JSON file is shredded and
> >> > >> > imported. Next, a preorder traversal walks the generated trie
> >> > >> > (with leaf pages storing 1024 nodes each, ~300_000_000 nodes in
> >> > >> > total, which are deserialized one by one). The pages are
> >> > >> > furthermore referenced in memory through PageReference::setPage.
> >> > >> > Furthermore, a Caffeine page cache caches the PageReferences
> >> > >> > (keys) and the pages (values) and sets the reference back to null
> >> > >> > once entries are going to be evicted (PageReference.setPage(null));
> >> > >> > see the sketch below.
> >> > >> >
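> >> > >> > A rough sketch of that cache setup (hypothetical sizing, assuming
> >> > >> > Caffeine's removal-listener API and SirixDB's PageReference/Page
> >> > >> > types mentioned above):
> >> > >> >
> >> > >> >     import com.github.benmanes.caffeine.cache.Cache;
> >> > >> >     import com.github.benmanes.caffeine.cache.Caffeine;
> >> > >> >     import com.github.benmanes.caffeine.cache.RemovalCause;
> >> > >> >
> >> > >> >     Cache<PageReference, Page> pageCache = Caffeine.newBuilder()
> >> > >> >         .maximumSize(100_000) // hypothetical sizing
> >> > >> >         .removalListener((PageReference ref, Page page, RemovalCause cause) -> {
> >> > >> >             if (ref != null) {
> >> > >> >                 ref.setPage(null); // drop the in-memory reference on eviction
> >> > >> >             }
> >> > >> >         })
> >> > >> >         .build();
> >> > >> >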
> >> > >> > However, I think the whole strategy of having to have in-memory
> >> > >> > nodes might not be best. Maybe it's better to use off-heap memory
> >> > >> > for the pages themselves with MemorySegments, but the pages are
> >> > >> > not of a fixed size, thus it may get tricky.
> >> > >> >
> >> > >> > The test mentioned is this:
> >> > >> > https://github.com/sirixdb/sirix/blob/248ab141632c94c6484a3069a056550516afb1d2/bundles/sirix-core/src/test/java/io/sirix/service/json/shredder/JsonShredderTest.java#L69
> >> > >> >
> >> > >> > I can upload the JSON file somewhere for a couple of days if
> >> > >> > needed.
> >> > >> >
> >> > >> > Caused by: java.lang.OutOfMemoryError
> >> > >> >     at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
> >> > >> >     at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
> >> > >> >     at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
> >> > >> >     at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:542)
> >> > >> >     at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567)
> >> > >> >     at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670)
> >> > >> >     at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
> >> > >> >     at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
> >> > >> >     at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
> >> > >> >     at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
> >> > >> >     at io.sirix.access.trx.page.NodePageTrx.parallelSerializationOfKeyValuePages(NodePageTrx.java:442)
> >> > >> >
> >> > >> > I've uploaded several JFR recordings and logs over here (maybe,
> >> > >> > besides the async-profiler JFR files, the zgc-detailed log is most
> >> > >> > interesting):
> >> > >> >
> >> > >> > https://github.com/sirixdb/sirix/tree/main/bundles/sirix-core
> >> > >> >
> >> > >> > kind regards
> >> > >> > Johannes
> >> > >>
> >> >
>