g1: dealing with high rates of inter-region pointer writes

Wed Dec 29 11:49:53 UTC 2010

> Hi Peter -- are only the dynamic rates of reference (i.e. updates) of
> pointers
> into these regions high, or are the total number of such references into
> the region high at any time (i.e. if one were to take an instantaneous
> snapshot
> of the heap, would the # references into these regions be constant, while
> the
> set of regions referencing this region keep changing at a high rate?)

The test generates garbage by accumulating data in an immutable
(clojure) data structures that are "updated" by structural sharing. So
every update implies the creation of multiple (log_32(n)) references
in young gen to older data structures (presumably usually in old-gen
except the top of the trees that have recently been changed already).
The pointers to nodes closer to the root of the tree would tend to be
pointers to newer data, while pointers to data further down in the
tree towards the leafs should be likely to be to old data. Looking for
a moment at just the leafs, there should be a constant churn of new
nodes referring to leafs close to the items being "updated".

So yes, my expectation and my assumption is that we have lots of
writes (per second) pointing to old-gen rather than there just being
lots of static long-lived inter-region references.

The empirical observation on which I base my claim about sparse table
overflow is the sparse table overflow log entry from
OtherRegionsTable::add_reference(), along with rs card counts that
are, for some regions, appropriately huge enough to be consistent with
the region being marked in the coarse bit map.

I realized that this test is not actually an LRU cache (as I claimed
in my previous post); I was confusing my different test cases (I have
another which does do LRU where I see similar problems but have not
investigated as carefully). It is actually just an approximation of an
LRU in terms of GC pressure. I'll see about updating it to use the
real immutable LRU implementation to make it more obviously relevant
to real-world use-cases.

My original motivation was to see whether immutable data structures
might work better than a LinkedHashMap style implementation (which
didn't work very well either) by eliminating writes to old-gen.

> If, as you say, you have a region that has very low liveness but is very
> very
> popular, it must contain very few/small objects that are extremely popular,
> and moreover
> to which the referring set is highly dynamic, with its membership always
> very high,
> but also constantly changing. Does that characterize your data structure?

I believe so, but I do not have direct observations/measurements
w.r.t. individual objects and their popularity in terms of new refs.

But, given the amount of (structurally shared) objects in any given
region which predominantly contains objects from these immutable data
structures, it seems utterly plausible that the number of references
(from distinct cards) created in other regions would be pretty high.

> Do these few popular objects then remain immortal and popular forever, so
> that the
> region in which they lie would always have a collection metric that would
> make it
> ineligible as a candidate for evacuation?

I would not expect objects remaining popular forever, but I for some
reason even when I artificially cause all data to be dropped followed
by generating enough garbage for old-gen to fill slowly and eventually
trigger a concurrent mark, I am not seeing the amount of region
eviction (by partial gc) that I would expect. I don't (currently) have
any live set / region overflow data to correlate with this case
however.

I know I should be providing better information, but doing better
tests and properly post-processing log data takes time and I wanted to
get the question out there. I'll *try* to do something better, if I
have time, provided that what I'm seeing is not known expected
behavior already. I.e., is it *unexpected* that G1 has trouble dealing
with these cases, or is it simply an expected result given how
remember set scanning works? Let's ignore for a moment the question of
whether regions end up in perpetual uncollectable state (until/unless
I can provide better information on whether this is happening).

Or put another way: It seems expected that some workloads will trigger
an overflow in fine grained rs tracking. It *seems* to me that this is
then also expected to have the effect I am seeing (regions being
artificially too expensive to collect, leading to them not being
collected). My main question is whether there is some mechanism in
place that I'm not seeing, that is supposed to mitigate or work around
this problem?

Thinking back to the rs scanning costs I was seeing with a regular
LinkedHashMap LRU, I would not be surprised if that was due to the
exact same issue. Throwing a large LinkedHashMap (large in the sense
of many objects) at the heap would seem to be a pretty nice
self-contained and realistic use-case. I'll try to confirm by testing
both, but I don't think this is a problem specific to the use of
immutable data structures (which granted, does change the behavior of
the mutator in ways that would very rarely be seen in regular Java
code).

-- 
/ Peter Schuller