Are good colors more common than bad colors?
Gil Tene
gil at azul.com
Thu Mar 19 15:45:04 UTC 2020
The dramatic (multi-order-of-magnitude) impact of self healing on
barrier triggering likelihood and on the overall costs of self-healing
read barriers compared to traditional non-self-healing read barriers
(such as Brooks- and Baker-style barriers, which do not self-heal and
keep triggering during GC activity) is highlighted in section 2.2 of [1].
That section explains the fundamental barrier qualities that enable
self healing (the semantic placement of the barrier between the load
of a reference value and the use of that loaded (reference) value,
giving the barrier access to the address that the reference value is
loaded from).
The introduction of self healing completely changed the effective
costs of read barrier schemes on GC’ed workloads. The change
was profound enough that prior work surveying costs of read
barrier schemes (usually showing read barrier costs to be
prohibitive) no longer applied to modern GC implementations that
employ self healing read barriers. With all modern concurrent
collectors seeming to converge on using self healing read barriers
(first Pauseless & C4, then ZGC, then the latest variant of
Shenandoah), we can probably start referring to collectors that
do not employ self healing read barriers as “legacy collectors”.
A lower bound on the probability of a mutator NOT loading a
barrier-triggering reference [a “bad color” in the terminology
used below] in a self healing loaded (reference) value barrier
(LVB) based system can be trivially approximated as:

    1 - ( RefsInLiveSet / (RefsAccessedPerSecond / GCCycleFrequency) )

where:

    RefsInLiveSet = number of references in the live set
    RefsAccessedPerSecond = number of references accessed per second by the mutator
    GCCycleFrequency = frequency (in cycles per second) of GC cycles

Since reference loads are extremely common operations in Java
execution, this number tends to have many 9s in it.
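
As a rough illustration of the approximation above, here is a minimal Python sketch. The function name and all input numbers are made up for the example (they are not measurements from any real workload); only the formula itself comes from the text.

```python
def good_load_lower_bound(refs_in_live_set, refs_accessed_per_second, gc_cycle_frequency):
    """Lower bound on the probability that a reference load does NOT
    trigger the barrier, per the approximation in the text."""
    # References the mutator loads during one GC cycle.
    refs_accessed_per_cycle = refs_accessed_per_second / gc_cycle_frequency
    # Worst case: every reference in the live set triggers the barrier once.
    return 1.0 - refs_in_live_set / refs_accessed_per_cycle


# Hypothetical inputs: 10 million live references, 10 billion reference
# loads per second, one GC cycle every 10 seconds (0.1 cycles per second).
bound = good_load_lower_bound(1e7, 1e10, 0.1)
print(f"{bound:.6f}")
```

With these hypothetical inputs the mutator performs about 1e11 reference loads per cycle, so at most 1e7 / 1e11 = 1 in 10,000 loads can trigger, giving a bound of roughly 0.9999. Note that the expression goes negative (and stops being useful) if the mutator loads fewer references per cycle than the live set contains.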
A tighter (higher) lower bound can be approximated as:

    1 - ( (RefsInLiveSet - RefsHealedByCollector) / (RefsAccessedPerSecond / GCCycleFrequency) )

where:

    RefsHealedByCollector = number of references healed by the collector (as opposed to the mutator)

Note: In C4/ZGC, this healing-by-the-collector is done in the
remap/fixup as part of the next mark phase. In Shenandoah 2.x
(2.0 with the new self healing “LRB”), it is done in a separate pass.
In workloads where a significant portion of the references in the
heap are rarely loaded by mutators (to the point where they
are often not loaded between two consecutive GC cycles), this
number has even more 9s in it...
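
To see how much collector-side healing can move the bound, here is a small Python sketch of the refined approximation. The function name and the inputs (including the assumed 90% share healed by the collector) are hypothetical and chosen only for illustration.

```python
def good_load_lower_bound_with_collector(refs_in_live_set,
                                         refs_healed_by_collector,
                                         refs_accessed_per_second,
                                         gc_cycle_frequency):
    """Refined lower bound: references healed by the collector never
    trigger the barrier in the mutator, so they are subtracted."""
    refs_accessed_per_cycle = refs_accessed_per_second / gc_cycle_frequency
    refs_left_for_mutator = refs_in_live_set - refs_healed_by_collector
    return 1.0 - refs_left_for_mutator / refs_accessed_per_cycle


# Hypothetical inputs: 10 million live references, of which the collector
# heals 9 million before the mutator touches them; 10 billion reference
# loads per second; one GC cycle every 10 seconds.
bound = good_load_lower_bound_with_collector(1e7, 9e6, 1e10, 0.1)
print(f"{bound:.7f}")
```

Under these assumptions only 1e6 references are left for the mutator to heal across about 1e11 loads, so the bound rises to roughly 0.99999, one more 9 than the same inputs give without collector healing.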
[1] C4: The Continuously Concurrent Compacting Collector http://paperhub.s3.amazonaws.com/d14661878f7811e5ee9c43de88414e86.pdf
On Mar 19, 2020, at 12:36 AM, Per Liden <per.liden at oracle.com> wrote:
Hi,
On 3/19/20 1:11 AM, raell at web.de wrote:
Dear all,
on slide 35 of the presentation about ZGC by Per Lidén and Stefan Karlsson [1] it
is stated that most object references will have the good color. I tried to analyze
the probability that a non-root reference loaded for the first time
has a good color:
A complete cycle has the three phases
1. remapping/marking
2. relocation
3. no gc action (till the next marking/remapping starts)
In phase 1 'good' means that the right marked bit is set. At the beginning of the phase
all non-root references are bad, at the end of the phase all are good. So, if a non-root
reference is selected randomly, it is good with a probability of about 50%.
In phases 2 and 3 'good' means that the remapped bit is set. Since in these phases no remapping
is done (except by the load barrier), all non-root references are bad. So, if a non-root
reference is selected randomly, it is good with a probability of 0%.
Altogether, it seems to me, that in most parts of a cycle a non-root reference will have
the bad color.
Of course, I may be missing something. Therefore, I would be interested in an argument why
most object references are expected to have a good color.
Your observations are correct. However, applications typically don't load randomly selected references. They tend to load a much smaller subset of references over and over again, and a reference will only be bad the first time it's loaded (within the same phase).
From our measurements, the chance of loading a bad pointer (i.e. the load barrier taking the slow path) is on the order of 1 in a million.
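
The effect described here can be sketched with a toy Python simulation of a single GC phase. This is not ZGC code: the function, the slot model, and the skew parameters (`hot_fraction`, `hot_weight`) are all assumptions made up for illustration. Every slot starts out bad, and a load of a bad slot takes the slow path once and heals the slot.

```python
import random


def simulate_self_healing(num_slots, num_loads, hot_fraction=0.1, hot_weight=0.9, seed=0):
    """Count slow-path hits for a mutator loading from reference slots
    that all start out 'bad' and are healed on first load."""
    rng = random.Random(seed)
    healed = [False] * num_slots
    hot_slots = max(1, int(num_slots * hot_fraction))
    slow_paths = 0
    for _ in range(num_loads):
        # Skewed access: most loads hit a small 'hot' subset of slots.
        if rng.random() < hot_weight:
            slot = rng.randrange(hot_slots)
        else:
            slot = rng.randrange(num_slots)
        if not healed[slot]:
            slow_paths += 1      # barrier takes the slow path...
            healed[slot] = True  # ...and heals the slot in place
    return slow_paths


loads = 1_000_000
slow = simulate_self_healing(num_slots=10_000, num_loads=loads)
print(f"slow-path fraction: {slow / loads:.6f}")
```

Whatever the access pattern, the slow-path count in this model can never exceed the number of slots, so the slow-path fraction shrinks as the number of loads per phase grows.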
cheers,
Per
Thank you very much!
Ralph
[1] http://cr.openjdk.java.net/~pliden/slides/ZGC-Jfokus-2018.pdf