ZGC logs
Per Liden
per.liden at oracle.com
Wed Apr 22 10:46:28 UTC 2020
Hi Kirk,
On 4/21/20 11:23 PM, Kirk Pepperdine wrote:
> Hi,
>
> I’ve been looking at GC log data trying to reconcile some of the numbers.
>
> [4.719s][info ][gc,heap ] GC(3) Mark Start Mark End Relocate Start Relocate End High Low
> [4.719s][info ][gc,heap ] GC(3) Capacity: 1032M (25%) 1176M (29%) 1176M (29%) 1176M (29%) 1176M (29%) 1032M (25%)
> [4.719s][info ][gc,heap ] GC(3) Reserve: 42M (1%) 42M (1%) 42M (1%) 42M (1%) 42M (1%) 42M (1%)
> [4.719s][info ][gc,heap ] GC(3) Free: 3064M (75%) 2954M (72%) 3818M (93%) 3704M (90%) 3842M (94%) 2920M (71%)
> [4.719s][info ][gc,heap ] GC(3) Used: 990M (24%) 1100M (27%) 236M (6%) 350M (9%) 1134M (28%) 212M (5%)
> [4.719s][info ][gc,heap ] GC(3) Live: - 10M (0%) 10M (0%) 10M (0%) - -
> [4.719s][info ][gc,heap ] GC(3) Allocated: - 208M (5%) 240M (6%) 764M (19%) - -
> [4.719s][info ][gc,heap ] GC(3) Garbage: - 979M (24%) 83M (2%) 5M (0%) - -
> [4.719s][info ][gc,heap ] GC(3) Reclaimed: - - 896M (22%) 974M (24%) - -
>
>
> If I understand what this log is telling me, there is initially 990M of data of which 10M is marked Live leaving 979M of garbage (-1 rounding error). During the concurrent mark an additional 208M of data was allocated requiring an additional 144M from committed memory. By Mark End, used increase by 110M, not 208M. That leaves 98M unaccounted for. I have some ideas but I’d feel more comfortable if someone could offer a comment on how to reconcile the live, allocated, garbage and reclaimed numbers with used. For example, I would assume that of the 350M remaining at Relocate End is a mix of live and floating garbage. That said, I cannot seem to reconcile the Used, allocated, and reclaimed in a way that yields 350M. The calculation that seems to work is used at Mark End + allocated - recovered = used @ Relocate Start. That is 1100 + (240-208)-896 = 336. However applying that logic shifted over fails. 236 + (764-240) - (974-896) = 681 != 350.
>
> So I’m seem to be missing bits that my starting at zStat hasn’t cleared up.
>
> All comments appreciated.
What's confusing, as it doesn't show in the log, is that the "allocated"
number can be inflated because some page allocations where "undone".
This can happen in situations where, for example, two Java threads
competed to allocate a new shared medium page, both threads allocate one
medium page each, but only one thread will win the race to install that
page. The thread who lost the race will then immediately free its newly
allocated page (undo the allocation). However, we only increase the
"allocated" number when pages are allocated, but we don't decrease it
when we undo an allocation.
You can confirm that it's cause by "undo" by looking at the statistics
table and look for the "Memory: Undo Page Allocation" line, which should
show non-zero numbers. The need to undo allocations is usually a rare
event, at least on most workloads.
One could argue that the "allocated" number is correct (in the sense
that these pages where in fact allocated), but I think it would be more
helpful to log the "allocated - undone" number, as the undo part is more
of an artifact of how things work internally, and not what a user
expects to see.
Here's a patch to adjust "allocated" so that it takes undone allocations
into account. Feel free to try it out and see if the numbers make more
sense.
diff --git a/src/hotspot/share/gc/z/zPageAllocator.cpp
b/src/hotspot/share/gc/z/zPageAllocator.cpp
--- a/src/hotspot/share/gc/z/zPageAllocator.cpp
+++ b/src/hotspot/share/gc/z/zPageAllocator.cpp
@@ -287,12 +287,14 @@
}
void ZPageAllocator::decrease_used(size_t size, bool reclaimed) {
+ // Only pages explicitly released with the reclaimed flag set
+ // counts as reclaimed bytes. This flag is true when a worker
+ // releases a page after relocation, and is false when we
+ // release a page to undo an allocation.
if (reclaimed) {
- // Only pages explicitly released with the reclaimed flag set
- // counts as reclaimed bytes. This flag is typically true when
- // a worker releases a page after relocation, and is typically
- // false when we release a page to undo an allocation.
_reclaimed += size;
+ } else {
+ _allocated -= size;
}
_used -= size;
if (_used < _used_low) {
cheers,
Per
>
> Kind regards,
> Kirk
>
More information about the zgc-dev
mailing list