G1 heap growth very aggressive in spite of low heap usage

Tony Printezis tony.printezis at sun.com
Mon Mar 22 15:02:01 PDT 2010


Peter,

Peter Schuller wrote:
> I don't know what G1 does if there is too little free heap for the
> preferred number of young generation regions, but I would expect that
> the primary effect of increasing heap size, in terms of GC overhead,
> would be that of:
>
> (1) Decreasing the frequency of concurrent marks and the overhead
> associated with it.
> (2) Increasing the pay-off of partial GCs after such a mark, due to
> (presumably) a larger average free ratio in selected regions.
>   
There's another advantage to increasing the heap size. The larger the 
heap is, the larger the young gen can grow (assuming the prediction 
heuristics determine that the pause times will not go over the desired 
goal) and, as a result, the less frequently collections will occur. This 
also decreases the overall GC overhead.
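To put rough, illustrative numbers on it (these are assumed, not 
measurements from this thread): at a steady allocation rate of 200 MB/s, 
a 256 MB young gen fills up roughly every 1.3 seconds, whereas a 1 GB 
young gen fills up only about every 5 seconds. Since every young 
collection pays a fixed per-pause cost (root scanning and so on) on top 
of the copying work, fewer collections generally means lower total 
overhead.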
> But under normal circumstances, when not actually running out of heap
> space, would this policy ever be expected to be effective in any
> significant percentage of cases?
>   
Yes. See above. But I agree that it's a bit too aggressive as it is 
right now.

Tony
> While I can understand that spending too much time on GC means you
> want to cut down on GC overhead *in general*, the relation between
> the cost of non-young evacuations (directly, and indirectly through
> marking) and the cost of the young generation collections seems very
> weak to me.
>
> If there is an excessive cost to young generation collections due to
> heavy promotion into the old generation, would that not rather be an
> indication that the pause time goal and the desired GC overhead are
> simply incompatible given the workload of the application?
>
> If so, a way out might be to accept the added overhead (but perhaps
> provide diagnostic feedback).
>
> Increasing the young generation size in an attempt to increase
> efficiency to the detriment of collection pause time could be an
> option, but it would only work if the application exhibits behavior
> consistent with the generational hypothesis, so it is not a very safe
> bet. In an ideal world this would probably be a knob (which of the
> two to prefer).
>
> Ideally, maybe heap expansion would primarily be triggered by the
> high cost of non-young collections. However, I also understand that
> it is difficult, if not impossible, to measure the overhead of
> concurrent marking, even if the cost of non-young region evacuations
> could be measured.
>
> Another observation is that it may be advantageous to only expand the
> heap when an allocation request fails (resulting in a call to
> G1CollectedHeap::expand_and_allocate(), perhaps?), or at least not
> until the lack of heap space is in some way affecting the cost of
> young generation collections (or, e.g., at the start of a concurrent
> marking cycle, where we expect to start incurring costs associated
> with the heap size being too small).
>
> Thoughts?
>
>   
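As a rough sketch of the demand-driven trigger proposed above (plain
Java with hypothetical names, not the actual C++ logic in
G1CollectedHeap), the difference between the two policies might look
like this:

// Illustrative only: contrasts a ratio-driven expansion trigger with a
// demand-driven one. All names are hypothetical; this is not HotSpot code.
public class ExpansionPolicySketch {

    private long committed;  // currently committed heap bytes
    private long used;       // currently used heap bytes
    private final long max;  // the -Xmx cap

    public ExpansionPolicySketch(long initial, long max) {
        this.committed = initial;
        this.max = max;
    }

    // Ratio-driven (roughly the behavior discussed in this thread):
    // expand whenever recent GC overhead exceeds the target, even if
    // most of the committed heap is still free.
    public void afterGc(double gcTimeRatio, double targetRatio) {
        if (gcTimeRatio > targetRatio) {
            expandBy(committed / 5);
        }
    }

    // Demand-driven (the proposal above): expand only when an
    // allocation request cannot be satisfied from committed space.
    public boolean allocate(long bytes) {
        if (used + bytes > committed) {
            expandBy(Math.max(bytes, committed / 5));
        }
        if (used + bytes > committed) {
            return false;  // expansion hit the -Xmx cap
        }
        used += bytes;
        return true;
    }

    private void expandBy(long delta) {
        committed = Math.min(max, committed + delta);
    }
}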
>> I'd guess that any micro benchmark that tries to stress the GC
>> (i.e., does a lot of allocations and not much else) would cause G1 to
>> expand the heap aggressively, given that such benchmarks typically
>> spend most of their time in GC (I can't tell whether this is the case
>> for your test, as you don't have -XX:+PrintGCTimeStamps enabled to
>> show how close together the GCs are).
>>     
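For reference, a microbenchmark of that shape can be as simple as the
following (a hypothetical example, not the actual test from this
thread):

// Allocates continuously and does little else, so GC dominates the run.
import java.util.ArrayList;
import java.util.List;

public class AllocStress {
    public static void main(String[] args) {
        List<byte[]> retained = new ArrayList<byte[]>();
        for (long i = 0; i < 10000000L; i++) {
            byte[] chunk = new byte[64 * 1024];  // mostly short-lived garbage
            if (i % 1000 == 0) {
                retained.add(chunk);             // keep a small live set
                if (retained.size() > 512) {
                    retained.remove(0);
                }
            }
        }
        System.out.println("retained " + retained.size() + " chunks");
    }
}

Running it with something like

  java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC \
      -XX:+PrintGC -XX:+PrintGCTimeStamps AllocStress

prints a time stamp per collection, which shows how close together the
GCs are.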
>
> Yes, but I think this also highlights an issue with the GC time ratio
> heuristic: it specifically does not take into account the allocation
> rate of the mutator. For non-young evacuation and concurrent marking
> cost that would likely not matter, since a larger heap would always
> lead to better throughput (and greater GC efficiency). But because
> the logic is also applied based on the cost of young generation
> collections, the effects on heap size are likely to be quite
> unexpected for any application that has temporary bursts of
> allocation (which I think is very realistic, even in production
> code).
>
> On time stamps: I can provide a sample run with PrintGCTimeStamps,
> but the short answer is that the collections happened relatively
> frequently (several times per second) and the growth exhibited was
> not preceded by a concurrent mark or non-young evacuations.
>
>   
>> Setting the max heap size with -Xmx will control how much G1 will expand the
>> heap.
>>     
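For example (the sizes are illustrative and MyApp is a placeholder):

  java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Xms64m -Xmx512m MyApp

lets the heap start small but never grow beyond 512 MB, however
aggressively the sizing heuristics would otherwise expand it.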
>
> Understood. However, for general-purpose use I am very interested in
> seeing the JVM self-regulate its memory use in such a way that a
> non-developer can look at the memory use of a JVM (or, for that
> matter, the heap free/total numbers) and draw a reasonable ballpark
> conclusion about its memory demands.
>
> Currently one can fairly easily trigger extreme heap growth, to the
> point of multiple orders of magnitude. And since concurrent marking
> and non-young evacuations won't happen until heap occupancy is high,
> this effectively means that a program may end up seemingly "needing"
> orders of magnitude more memory than it actually does.
>
>   

