From alen.vrecko at gmail.com Wed Apr 19 11:38:13 2023
From: alen.vrecko at gmail.com (Alen Vrečko)
Date: Wed, 19 Apr 2023 13:38:13 +0200
Subject: on heap overhead of ZGC
Message-ID: 

Hello everyone.

I did my best to search the web for any prior discussion on this. Haven't found anything useful.

I am trying to understand why there is a noticeable difference between the size of all objects on the heap and the heap used (after full GC). The heap used size can be 10%+ more than the size of all live objects.

Let's say I want to fill a 1 GiB heap with 4 MiB byte[] objects.

Naively I'd imagine I can store 1 GiB / 4 MiB = 256 such byte[] on the heap.

(I made a simple program that just allocates byte[], stores it in a list, does GC and waits (so I can do jcmd or similar), nothing else.)

With EpsilonGC -> 255x 4MiB byte[] allocations, after this the app crashes with out of memory
With SerialGC -> 246x
With ParallelGC -> 233x
With G1GC -> 204x
With ZGC -> 170x
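For reference, a minimal sketch of that program (the class name and the fixed count are illustrative; run with e.g. java -Xmx1g -XX:+UseZGC FillHeap 170):

```
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the allocation test described above.
public class FillHeap {
    public static void main(String[] args) throws Exception {
        int count = Integer.parseInt(args[0]);
        List<byte[]> live = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            live.add(new byte[4 * 1024 * 1024]); // 4 MiB payload each
        }
        System.gc(); // let the heap settle before inspecting it
        System.out.println(live.size() + "x 4MiB byte[] held");
        Thread.sleep(Long.MAX_VALUE); // park so jcmd GC.heap_info etc. can run
    }
}
```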
For example in the ZGC case, where I have 170x of 4 MiB byte[] on the heap.

GC.heap_info:

 ZHeap           used 1022M, capacity 1024M, max capacity 1024M
 Metaspace       used 407K, committed 576K, reserved 1114112K
  class space    used 24K, committed 128K, reserved 1048576K

GC.class_histogram:

Total         15118      713971800 (~714M)

In this case does it mean ZGC is wasting 1022M - 714M = 308M for doing its "thing"? That is (1022 - 714) / 714 ≈ 43% overhead?

This example might be convoluted and atypical of any production environment.

I am seeing the difference between live set and heap used in production at around 12.5% for the 3 servers I looked at.

Is there any other way to estimate the overhead apart from looking at the difference between the live set and heap used? Does ZGC have any internal statistic of the overhead?

I'd prefer not to assume 12.5% is the number to use and then get surprised that in some case it might be 25%.

Do you have any recommendations regarding ZGC overhead when estimating heap space?

Thank you
Alen

From stefan.karlsson at oracle.com Thu Apr 20 15:15:07 2023
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 20 Apr 2023 17:15:07 +0200
Subject: on heap overhead of ZGC
In-Reply-To: 
References: 
Message-ID: <05877848-6923-379c-9ff7-e5e03b0a58e0@oracle.com>

Hi Alen,

On 2023-04-19 13:38, Alen Vrečko wrote:
> Hello everyone.
>
> I did my best to search the web for any prior discussion on this.
> Haven't found anything useful.
>
> I am trying to understand why there is a noticeable difference between
> the size of all objects on the heap and the heap used (after full GC).
> The heap used size can be 10%+ more than the size of all live objects.

The difference can come from two sources:

1) There is unusable memory in the regions, caused by address and size alignment requirements. (This is probably what you see in your 4MB array test.)

2) We only calculate the `live` value when we mark through objects on the heap. While ZGC runs a cycle it more or less ignores new objects that the Java threads allocate during the GC cycle. This means that the `used` value will increase, but the `live` value will not be updated until the Mark End of the next GC cycle.

The latter also makes it a bit misleading to look at the used value after the GC cycle. With stop-the-world GCs, that used value is an OK approximation of what is live on the heap. With a concurrent GC it includes all the memory allocated during the GC cycle. The used number is still true, but some (many) of those allocated objects were short-lived and died; we just won't be able to figure that out until the next GC cycle.

I don't know if you have seen this, but we have a table where we try to give you a picture of how the values progressed during the GC cycle:

[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)                Mark Start          Mark End      Relocate Start      Relocate End             High               Low
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)  Capacity:    3282M (10%)        3536M (11%)       3580M (11%)       3612M (11%)        3612M (11%)       3282M (10%)
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)      Free:   28780M (90%)       28538M (89%)      28720M (90%)      29066M (91%)       29068M (91%)      28434M (89%)
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)      Used:    3234M (10%)        3476M (11%)       3294M (10%)       2948M (9%)         3580M (11%)       2946M (9%)
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)      Live:         -             2496M (8%)        2496M (8%)        2496M (8%)             -                 -
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0) Allocated:         -              242M (1%)         364M (1%)         411M (1%)             -                 -
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0)   Garbage:         -              737M (2%)         433M (1%)          39M (0%)             -                 -
[6.150s][1681984168056ms][info ][gc,heap     ] GC(0) Reclaimed:         -                 -              304M (1%)         697M (2%)             -                 -

The `garbage` at Mark End is a diff between what was `used` when the GC cycle started and what we later found to be `live` in that used memory.

> Let's say I want to fill a 1 GiB heap with 4 MiB byte[] objects.
>
> Naively I'd imagine I can store 1 GiB / 4 MiB = 256 such byte[] on the
> heap.
>
> (I made a simple program that just allocates byte[], stores it in a
> list, does GC and waits (so I can do jcmd or similar), nothing else.)
>
> With EpsilonGC -> 255x 4MiB byte[] allocations, after this the app
> crashes with out of memory
> With SerialGC -> 246x
> With ParallelGC -> 233x
> With G1GC -> 204x
> With ZGC -> 170x
>
> For example in the ZGC case, where I have 170x of 4 MiB byte[] on the heap.
>
> GC.heap_info:
>
>  ZHeap           used 1022M, capacity 1024M, max capacity 1024M
>  Metaspace       used 407K, committed 576K, reserved 1114112K
>   class space    used 24K, committed 128K, reserved 1048576K
>
> GC.class_histogram:
>
> Total         15118      713971800 (~714M)
>
> In this case does it mean ZGC is wasting 1022M - 714M = 308M for doing
> its "thing"? That is (1022 - 714) / 714 ≈ 43% overhead?

My guess is that the object header (typically 16 bytes) pushed the object size slightly beyond 4MB. ZGC allocates large objects in their own region. Those regions are 2MB aligned, which makes your ~4MB objects `use` 6MB.

You would probably see similar results with G1 when the heap region size is increased, which happens when the heap max size is larger. You can test that by explicitly running with -XX:G1HeapRegionSize=2MB, to use a larger heap region size.
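As a back-of-the-envelope check (a sketch, assuming the typical 16-byte header and the 2MB region alignment mentioned above), that effect alone reproduces the 170 arrays you measured:

```
public class Estimate {
    public static void main(String[] args) {
        long heap    = 1024L * 1024 * 1024; // 1 GiB
        long payload = 4L * 1024 * 1024;    // 4 MiB array payload
        long header  = 16;                  // typical object header; not guaranteed
        long granule = 2L * 1024 * 1024;    // large objects are 2MB aligned
        // round payload + header up to the 2MB granule -> 6 MiB per array
        long perObject = ((payload + header + granule - 1) / granule) * granule;
        System.out.println(heap / perObject); // prints 170
    }
}
```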
> This example might be convoluted and atypical of any production
> environment.
>
> I am seeing the difference between live set and heap used in
> production at around 12.5% for the 3 servers I looked at.
>
> Is there any other way to estimate the overhead apart from looking at
> the difference between the live set and heap used? Does ZGC have any
> internal statistic of the overhead?

I don't think we have a way to differentiate between the overhead caused by (1) and (2) above.

> I'd prefer not to assume 12.5% is the number to use and then get
> surprised that in some case it might be 25%.

The overhead of yet-to-be-collected garbage can easily be above 25%. It all depends on the workload. We strive to keep the fragmentation below the -XX:ZFragmentationLimit, which is set to 25% by default, but that doesn't include the overhead of newly allocated objects (and it doesn't include the large objects).

> Do you have any recommendations regarding ZGC overhead when estimating
> heap space?

Unfortunately, I can't give one. It depends on how intricate the object graph is (meaning how long it will take to mark through it), how many live objects you have, the allocation rate, the number of cores, etc. There's a constant race between the GC and the allocating Java threads. If the Java threads "win", and use up all memory before the GC can mark through the object graph and then give back memory to the Java application, then the Java threads will stall waiting for more memory. You need to test with your workload and see if you've given enough heap memory to allow ZGC to complete its cycles without causing allocation stalls.

Thanks,
StefanK

> Thank you
> Alen

From ehelin at openjdk.org Fri Apr 21 08:50:48 2023
From: ehelin at openjdk.org (Erik Helin)
Date: Fri, 21 Apr 2023 08:50:48 GMT
Subject: RFR: 8306656: Fix age table logging
Message-ID: 

Hey all,

please review this patch that updates the age table logging. The logging now looks like:

[1.946s][info][gc,reloc    ] GC(22) y: Age Table:
[1.946s][info][gc,reloc    ] GC(22) y:                   Live        Small       Medium    Large
[1.946s][info][gc,reloc    ] GC(22) y: Eden          0M -> 0M    300 -> 300    0 -> 0    0 -> 0
[1.946s][info][gc,reloc    ] GC(22) y: Survivor 1    0M -> 0M      2 -> 2      0 -> 0    0 -> 0
[1.946s][info][gc,reloc    ] GC(22) y: Survivor 2    0M -> 0M      1 -> 0      0 -> 0    0 -> 0
[1.946s][info][gc,reloc    ] GC(22) y: Survivor 3    0M -> 0M      1 -> 1      0 -> 0    0 -> 0
[1.946s][info][gc,reloc    ] GC(22) y: Survivor 4    0M -> 0M      0 -> 1      0 -> 0    0 -> 0

The first column is the size of the live objects with the given age and how that has changed since the previous collection. The remaining columns are the number of pages of the given age and how those have changed from the previous collection to the current one (so the format is `<previous> -> <current>`). This makes it easy to spot a change in behavior in objects' lifetimes from one collection to another: if objects have similar lifetimes compared to the previous collection then the distribution of pages should be similar between the current collection and the previous collection.

_Note_: I would like to make the table adapt to live sizes smaller than one megabyte, but that should probably be done for the logging as a whole, not just for the age table.

I also cleaned up some minor stuff as I went along.
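To see the table on a build containing this patch, enable the tags shown in the log prefix above; something like the following should do (an illustrative invocation, adjust to your setup):

```
java -XX:+UseZGC -Xlog:gc+reloc=info ...
```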
Testing:
- [x] Tier 1-3 (macOS-aarch64, Windows-x64, Linux-x64, Linux-aarch64)
- [x] Local testing on macOS-aarch64

Thanks,
Erik

-------------

Commit messages:
 - Rework age table logging

Changes: https://git.openjdk.org/zgc/pull/19/files
Webrev: https://webrevs.openjdk.org/?repo=zgc&pr=19&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8306656
Stats: 200 lines in 10 files changed: 104 ins; 52 del; 44 mod
Patch: https://git.openjdk.org/zgc/pull/19.diff
Fetch: git fetch https://git.openjdk.org/zgc.git pull/19/head:pull/19

PR: https://git.openjdk.org/zgc/pull/19

From alen.vrecko at gmail.com Sun Apr 23 18:28:07 2023
From: alen.vrecko at gmail.com (Alen Vrečko)
Date: Sun, 23 Apr 2023 20:28:07 +0200
Subject: on heap overhead of ZGC
In-Reply-To: <05877848-6923-379c-9ff7-e5e03b0a58e0@oracle.com>
References: <05877848-6923-379c-9ff7-e5e03b0a58e0@oracle.com>
Message-ID: 

Hi Stefan.

Thank you for the explanations. Makes sense.

Alen V
From thomas.stuefe at gmail.com Tue Apr 25 07:58:48 2023
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 25 Apr 2023 09:58:48 +0200
Subject: Large number of VMAs for large ZGC heap
Message-ID: 

Hi ZGC experts,

I see a strangeness with one of our customers running JDK 17 with ZGC, THP enabled (always), and a large heap of 4.6TB. The number of VMAs exceeds 20 million. I am trying to understand whether that is normal or pathological.

Looking at maps, I see millions of adjacent VMAs that point into the heap at different offsets:

```
15fc5f600000-15fc5f800000 rw-s 24630400000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
15fc5f800000-15fc5fa00000 rw-s 2504e600000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
15fc5fa00000-15fc5fc00000 rw-s 25330000000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
15fc5fc00000-15fc5fe00000 rw-s 26324200000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
15fc5fe00000-15fc60000000 rw-s 26f03a00000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
```

The different offsets prevent these mappings from being folded.

The number of mappings surpasses what would be needed to map the heap. Almost all are 2MB mappings:

Total number of mappings: 18634289
Number of 2MB mappings:   18529201
Per color: 6211420 / 6211429 / 6211439

The total address space covered by these 2MB mappings is 38TB. Taking into account the triple-mapping, we still map about 12TB per color. That far exceeds the necessary room for a 4.6TB heap.
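For reference, the per-color counts above came from a small sketch along these lines (illustrative, not the exact script; it assumes the classic x86-64 ZGC multi-mapping where the three heap views are distinguished by address bits 44-46, matching the 0x1..., 0x2..., 0x4... prefixes visible in the excerpts):

```
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Bucket the 2MB java_heap VMAs from a /proc/<pid>/maps dump by heap view.
public class CountHeapVmas {
    public static void main(String[] args) throws Exception {
        Map<Long, Long> perView = new TreeMap<>();
        for (String line : Files.readAllLines(Paths.get(args[0]))) {
            if (!line.contains("java_heap")) continue;
            String[] parts = line.split("[- ]", 3); // "start-end perms ..."
            long start = Long.parseUnsignedLong(parts[0], 16);
            long end = Long.parseUnsignedLong(parts[1], 16);
            if (end - start != 2L * 1024 * 1024) continue; // 2MB VMAs only
            perView.merge(start >>> 44, 1L, Long::sum);    // view selector bits
        }
        System.out.println(perView); // e.g. {1=..., 2=..., 4=...}
    }
}
```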
Examining the mappings, I see that many offsets into the heap are mapped to multiple points, even discounting the triple mapping. For example, offset 105fe800000 is mapped six times per color, for a total of 18 times:

13438de00000-13438e000000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
15bf79400000-15bf79600000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
165022800000-165022a00000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
16fdad200000-16fdad400000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
17b1b9600000-17b1b9800000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
1d9860000000-1d9860200000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)

23438de00000-23438e000000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
25bf79400000-25bf79600000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
265022800000-265022a00000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
26fdad200000-26fdad400000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
27b1b9600000-27b1b9800000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
2d9860000000-2d9860200000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)

43438de00000-43438e000000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
45bf79400000-45bf79600000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
465022800000-465022a00000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
46fdad200000-46fdad400000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
47b1b9600000-47b1b9800000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
4d9860000000-4d9860200000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)

The ZGC page table contains close to a million ZGC pages and looks okay for a heap of that size:

Small:  739175
Medium:  10160
Large:   65495
        -------
        814830

My question: is such a high number of mappings for ZGC normal?

Thank you for your time,

Cheers, Thomas

From stefan.karlsson at oracle.com Tue Apr 25 15:00:08 2023
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 25 Apr 2023 17:00:08 +0200
Subject: Large number of VMAs for large ZGC heap
In-Reply-To: 
References: 
Message-ID: 

Hi Thomas,

On 2023-04-25 09:58, Thomas Stüfe wrote:
> Hi ZGC experts,
>
> I see a strangeness with one of our customers running JDK 17 with ZGC,
> THP enabled (always), and a large heap of 4.6TB.

Side-note: be careful about using THP and expecting good latencies, but if you do want to use THP with ZGC make sure to also change:

/sys/kernel/mm/transparent_hugepage/shmem_enabled

https://wiki.openjdk.org/display/zgc

> The number of VMAs exceeds 20 million. I am trying to understand whether
> that is normal or pathological.
>
> Looking at maps, I see millions of adjacent VMAs that point into the
> heap at different offsets:
>
> ```
> 15fc5f600000-15fc5f800000 rw-s 24630400000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 15fc5f800000-15fc5fa00000 rw-s 2504e600000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 15fc5fa00000-15fc5fc00000 rw-s 25330000000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 15fc5fc00000-15fc5fe00000 rw-s 26324200000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 15fc5fe00000-15fc60000000 rw-s 26f03a00000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> ```
>
> The different offsets prevent these mappings from being folded.
>
> The number of mappings surpasses what would be needed to map the heap.
> Almost all are 2MB mappings:
>
> Total number of mappings: 18634289
> Number of 2MB mappings:   18529201
> Per color: 6211420 / 6211429 / 6211439
>
> The total address space covered by these 2MB mappings is 38TB. Taking
> into account the triple-mapping, we still map about 12TB per color.
> That far exceeds the necessary room for a 4.6TB heap.

ZGC reserves a larger address space for the heap than the given max heap size. This is done to make it easier to deal with large objects. There are some hints to the address space layout here:
https://github.com/openjdk/zgc/blob/5ea960728c5616373c986ae1343b44043c0db487/src/hotspot/cpu/x86/gc/z/zGlobals_x86.cpp

> Examining the mappings, I see that many offsets into the heap are
> mapped to multiple points, even discounting the triple mapping. For
> example, offset 105fe800000 is mapped six times per color, for a total
> of 18 times:
>
> 13438de00000-13438e000000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 15bf79400000-15bf79600000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 165022800000-165022a00000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 16fdad200000-16fdad400000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 17b1b9600000-17b1b9800000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 1d9860000000-1d9860200000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
>
> 23438de00000-23438e000000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 25bf79400000-25bf79600000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 265022800000-265022a00000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 26fdad200000-26fdad400000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 27b1b9600000-27b1b9800000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 2d9860000000-2d9860200000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
>
> 43438de00000-43438e000000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 45bf79400000-45bf79600000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 465022800000-465022a00000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 46fdad200000-46fdad400000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 47b1b9600000-47b1b9800000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)
> 4d9860000000-4d9860200000 rw-s 105fe800000 00:0f 373323680   /memfd:java_heap.hugetlb (deleted)

What I think happens here is that when we detach virtual-to-physical memory mappings we don't do it immediately; instead the memory is handed over to a separate ZUnmapper thread. If that thread gets starved, typically because of an over-provisioned machine, then these mappings start to build up.
You can see the ZUnmapper code here:
https://github.com/openjdk/zgc/blob/5ea960728c5616373c986ae1343b44043c0db487/src/hotspot/share/gc/z/zUnmapper.cpp

I recently looked into this and thought that the starvation happened because of how we take the lock for every ZPage we want to unmap. I prototyped a way to bulk fetch all pages, but that didn't seem to help. AFAICT, the big problem for us was still that the ZUnmapper thread was starved out. The prototype is here:
https://github.com/stefank/jdk/tree/zgc_generational_bulk_unmapper

You can actually see this problem if you monitor the amount of committed memory in the Java heap. When this happens the reported amount of committed memory increases and can even go past the max heap size. That is a bug in how we report our virtual memory to NMT. I created a bug for that:
https://bugs.openjdk.org/browse/JDK-8306841

And a prototype:
https://github.com/stefank/jdk/tree/zgc_generational_fix_nmt_overcommit_reporting
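(From the Java side, one way to watch the committed size is to poll MemoryMXBean; a quick sketch:)

```
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Poll the heap's committed size from inside the JVM once per second.
public class WatchCommitted {
    public static void main(String[] args) throws Exception {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        while (true) {
            long committed = bean.getHeapMemoryUsage().getCommitted();
            System.out.println("heap committed: " + (committed >> 20) + "M");
            Thread.sleep(1000);
        }
    }
}
```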
> The ZGC page table contains close to a million ZGC pages and looks
> okay for a heap of that size:
>
> Small:  739175
> Medium:  10160
> Large:   65495
>         -------
>         814830
>
> My question: is such a high number of mappings for ZGC normal?

A larger number of mappings is normal, but what you have above indicates some kind of performance issue with the system.

Cheers,
StefanK

> Thank you for your time,
>
> Cheers, Thomas

From thomas.stuefe at gmail.com Tue Apr 25 15:31:34 2023
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 25 Apr 2023 17:31:34 +0200
Subject: Large number of VMAs for large ZGC heap
In-Reply-To: 
References: 
Message-ID: 

Hi Stefan,

thanks a lot for your answers. Wrt THPs, yes, it would be wise to use explicit huge pages.

Does the single ZUnmapper thread compete with all mutator threads for the page allocator?

Thanks, Thomas
From stefan.karlsson at oracle.com Wed Apr 26 07:35:21 2023
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Wed, 26 Apr 2023 09:35:21 +0200
Subject: Large number of VMAs for large ZGC heap
In-Reply-To: 
References: 
Message-ID: 

On 2023-04-25 17:31, Thomas Stüfe wrote:
> Hi Stefan,
>
> thanks a lot for your answers. Wrt THPs, yes, it would be wise to use
> explicit huge pages.
>
> Does the single ZUnmapper thread compete with all mutator threads for
> the page allocator?

In most cases the mutator threads don't compete with the ZUnmapper thread (except for CPU time). However, if we need to allocate either a medium page or a large page, and we can't grow the heap more, and there's no large enough page in the page cache, then we gather a bunch of free pages from the page cache (i.e. page cache flushing) and "steal" the physical memory and assign it to a new virtual memory range of the required size. Then we put the flushed pages onto the unmap queue and let the ZUnmapper thread deal with it.

So, the manipulation of the unmap queue uses a lock, and that lock is what the mutator and ZUnmapper threads compete for.
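Schematically, the pattern is a single-consumer work queue like the sketch below (plain Java, NOT the HotSpot code): if the consumer is starved of CPU, the queued unmap work, and the VMAs it would release, piles up.

```
import java.util.ArrayDeque;
import java.util.Deque;

// Schematic of the unmap-queue pattern described above.
class UnmapQueue {
    private final Deque<Runnable> queue = new ArrayDeque<>();

    synchronized void enqueue(Runnable unmapOp) { // mutator side
        queue.add(unmapOp);
        notifyAll();
    }

    synchronized Runnable take() throws InterruptedException {
        while (queue.isEmpty()) {
            wait();
        }
        return queue.poll();
    }

    void unmapperLoop() { // body of the single unmapper-like thread
        try {
            while (true) {
                take().run(); // the actual unmapping happens here
            }
        } catch (InterruptedException e) {
            // shutdown
        }
    }
}
```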
I first thought that contention on this lock caused the issues we were seeing in our internal tests, but for us it seemed to be much more caused by the ZUnmapper thread not getting enough run time.

If you start to see messages about "Page Cache Flushed: " in the gc logs then you know that we have run the path described above.

StefanK

> Thanks, Thomas
From ehelin at openjdk.org Wed Apr 26 16:45:23 2023
From: ehelin at openjdk.org (Erik Helin)
Date: Wed, 26 Apr 2023 16:45:23 GMT
Subject: RFR: 8306656: Generational ZGC: Fix age table logging [v2]
In-Reply-To: 
References: 
Message-ID: 
Erik Helin has updated the pull request incrementally with two additional commits since the last revision:

 - Add units
 - StefanK changes

-------------

Changes:
 - all: https://git.openjdk.org/zgc/pull/19/files
 - new: https://git.openjdk.org/zgc/pull/19/files/46d42e6a..e2bded1d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=zgc&pr=19&range=01
 - incr: https://webrevs.openjdk.org/?repo=zgc&pr=19&range=00-01

Stats: 162 lines in 10 files changed: 55 ins; 60 del; 47 mod
Patch: https://git.openjdk.org/zgc/pull/19.diff
Fetch: git fetch https://git.openjdk.org/zgc.git pull/19/head:pull/19

PR: https://git.openjdk.org/zgc/pull/19