ParNew - how does it decide if Full GC is needed

Thu May 8 23:16:23 UTC 2014

Hi Peter --

Thanks! (and so nice to hear another familiar voice from the good ol' days
again!)

I haven't looked at the code details you provided (thanks!), but it seems as
though the check for possible promotion failure that is done at the end of
the previous scavenge could instead be done, with, as far as I can see, no
loss of generality or of information, just before starting the
current GC, and still avoid the promotion failure that the full gc at the
end of the previous gc was trying to avoid. All we then end up doing is
allowing one whole allocation epoch before doing the imminent full gc.
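
To make that concrete, here is a rough sketch of what I have in mind. It is a
simplified model, not the HotSpot code: it assumes the decision boils down to
comparing a promotion estimate against the free space left in the old gen, and
every name in it (PromotionCheck, shouldFullGc, the byte figures) is made up
for illustration.

```java
// Simplified model of the check discussed above; NOT the HotSpot source.
// All names and numbers here are hypothetical.
final class PromotionCheck {

    private PromotionCheck() {}

    /**
     * Returns true when the estimated promotion volume of the next scavenge
     * would not fit into the free space remaining in the old generation.
     */
    static boolean shouldFullGc(double estimatedPromotedBytes, long oldFreeBytes) {
        return estimatedPromotedBytes > (double) oldFreeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        double estimatedPromoted = 120.0 * mb; // assumed promotion estimate
        long oldFree = 80 * mb;                // assumed old gen head room

        // The check as evaluated at the end of scavenge N.
        boolean atEndOfPrevious = shouldFullGc(estimatedPromoted, oldFree);

        // One allocation epoch passes here. Allocation in the young gen
        // changes neither of the two inputs in this model.

        // The same check, deferred to just before scavenge N+1.
        boolean beforeCurrent = shouldFullGc(estimatedPromoted, oldFree);

        System.out.println("end of previous scavenge: " + atEndOfPrevious);
        System.out.println("before current scavenge:  " + beforeCurrent);
    }
}
```

In this model both evaluations see the same inputs, so they give the same
answer; the deferred check would differ only if something changed the old gen
free space during the epoch (for instance, allocation directly in the old
gen), which the sketch deliberately leaves out.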

I am sure I am missing something here that I will find when I get time to
read through the details of the actual code again, and use one or both of my
remaining neurons. :-)
-- ramki

On Thu, May 8, 2014 at 2:16 PM, Peter B. Kessler <Peter.B.Kessler at oracle.com
> wrote:

> The "problem", if you want to call it that, is that when the young
> generation has filled up before the next collection it is probably too
> late.  The scavenger is optimistic and thinks everything can be promoted.
>  It just goes ahead and starts a young collection.  It gets a promotion
> failure if it runs out of space in the old generation, painfully recovers
> from the promotion failure and then causes a full collection.  Instead we
> use the promotion history at the end of each young generation collection to
> decide to do a full collection preemptively.  That way we can sneak in that
> last scavenge (usually pretty fast, and usually emptying the whole eden)
> before we invoke a full collection, which doesn't handle massive amounts of
> garbage well (e.g., in the young generation).  If we were pessimistic,
> given Vitaly's heap layout, we'd do nothing but full collections.
>
> I think all the policy code (for the parallel scavenger) is in
> PSScavenge::invoke(), e.g.,
>
>     http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psScavenge.cpp
>
> starting at line 210.  The policy decision is made in
>  PSAdaptiveSizePolicy::should_full_GC
>
>     http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psAdaptiveSizePolicy.cpp
>
> starting at line 162.   Look at all those lovely fish!
>
> It looks like setting -XX:+PrintGCDetails -XX:+Verbose (a "develop" flag)
> would tell you what choices are being made (and probably produce a lot of
> other output as well :-).  In a product build -XX:+PrintGCDetails
> -XX:+PrintHeapAtGC, as has been suggested by others, should get enough
> information to figure out what's going on.
>
> I've cited the code for the parallel scavenger, because Vitaly said "this
> is the throughput/parallel collector setup".  The other collectors have
> similar policy code.
>
>                         ... peter
>
>
> On 05/08/14 13:24, Srinivas Ramakrishna wrote:
>
>> The 98% old gen occupancy triggered one of my two neurons.
>> I think there was gc policy code (don't know if it's still there) that
>> would proactively precipitate a full gc when it realized (based on
>> recent/historical promotion volume stats) that the next minor gc would not
>> be able to promote its survivors into the head room remaining in old.
>> (Don't ask me why it's better to do it now rather than the next time the
>> young gen fills up and just rely on the same check). Again I am not looking
>> at the code (as it takes some effort to get to the box where I keep a copy
>> of the hotspot/openjdk code.)
>>
>> Hopefully Jon & co. will quickly confirm or shoot down the imaginations of
>> my foggy memory!
>> -- ramki
>>
>>
>> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>>     I captured some usage and capacity stats via jstat right after that
>> full gc that started this email thread.  It showed 0% usage of survivor
>> spaces (which makes sense now that I know that a full gc empties that out
>> irrespective of tenuring threshold and object age); eden usage went down to
>> like 10%; tenured usage was very high, 98%.  Last gc cause was recorded as
>> "Allocation Failure".  So it's true that the tenured doesn't have much
>> breathing room here, but what prompted this email is I don't understand why
>> that even matters considering young gen got cleaned up quite nicely.
>>
>>
>>     On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:
>>
>>
>>         By the way, as others have noted, -XX:+PrintGCDetails at max
>> verbosity level would be your friend to get more visibility into this.
>> Include -XX:+PrintHeapAtGC for even better visibility. For good measure,
>> after the puzzling full gc happens (and hopefully before another GC
>> happens) capture jstat data re the heap (old gen), for direct allocation
>> visibility.
>>
>>         -- ramki
>>
>>
>>         On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:
>>
>>             Hi Vitaly --
>>
>>
>>             On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>>                 Hi Jon,
>>
>>                 Nope, we're not using CMS here; this is the
>> throughput/parallel collector setup.
>>
>>                 I was browsing some of the gc code in openjdk, and
>> noticed a few places where each generation attempts to decide (upfront from
>> what I can tell, i.e. before doing the collection) whether it thinks it's
>> "safe" to perform the collection (and if it's not, it punts to the next
>> generation) and also whether some amount of promoted bytes will fit.
>>
>>                 I didn't dig too much yet, but a cursory scan of that
>> code leads me to think that perhaps the defNew generation is asking the
>> next gen (i.e. tenured) whether it could handle some estimated promotion
>> amount, and given the large imbalance between Young and Tenured size,
>> tenured is reporting that things won't fit -- this then causes a full gc.
>>  Is that at all possible from what you know?
>>
>>
>>             If that were to happen, you wouldn't see the minor gc that
>> precedes the full gc in the log snippet you posted.
>>
>>             The only situation I know where a minor GC is followed
>> immediately by a major is when a minor gc didn't manage to fit an
>> allocation request in the space available. But, thinking more about that,
>> it can't be because one would expect that Eden knows the largest object it
>> can allocate, so if the request is larger than will fit in young, the
>> allocator would just go look for space in the older generation. If that
>> didn't fit, the old gen would precipitate a gc which would collect the
>> entire heap (all this should be taken with a dose of salt as I don't have
>> the code in front of me as I type, and I haven't looked at the allocation
>> policy code in ages).
>>
>>
>>                 On your first remark about compaction, just to make sure
>> I understand, you're saying that a full GC prefers to move all live objects
>> into tenured (this means taking objects out of survivor space and eden),
>> irrespective of whether their tenuring threshold has been exceeded? If that
>> compaction/migration of objects into tenured overflows tenured, then it
>> attempts to compact the young gen, with overflow into survivor space from
>> eden.  So basically, this generation knows how to perform compaction and
>> it's not just a copying collection?
>>
>>
>>             That is correct. A full gc does in fact move all survivors
>> from young gen into the old gen. This is a limitation (artificial nepotism
>> can ensue because of "too young" objects that will soon die, getting
>> artificially dragged into the old generation) that I had been lobbying to
>> fix for a while now. I think there's even an old, perhaps still open, bug
>> for this.
>>
>>
>>                 Is there a way to get the young gen to print an age table
>> of objects in its survivor space? I couldn't find one, but perhaps I'm
>> blind.
>>
>>
>>             -XX:+PrintTenuringDistribution (for ParNew/DefNew, perhaps also
>> G1?)
>>
>>
>>                 Also, as a confirmation, System.gc() always invokes a
>> full gc with the parallel collector, right? I believe so, but just wanted
>> to double check while we're on the topic.
>>
>>
>>             Right. (Not sure what happens if JNI critical section is in
>> force -- whether it's skipped or we wait for the JNI CS to exit/complete;
>> hopefully others can fill in the blanks/inaccuracies in my comments above,
>> since they are based on things that used to be a while ago in code I
>> haven't looked at recently.)
>>
>>             -- ramki
>>
>>
>>                 Thanks
>>
>>
>>                 On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>>
>>
>>                     On 05/07/2014 05:55 PM, Vitaly Davidovich wrote:
>>
>>>
>>>                     Yes, I know :) This is some cruft that needs to be
>>> cleaned up.
>>>
>>>                     So my suspicion is that full gc is triggered
>>> precisely because old gen occupancy is almost 100%, but I'd appreciate
>>> confirmation on that.  What's surprising is that even though old gen is
>>> almost full, young gen has lots of room now. In fact, this system is
>>> restarted daily so we never see another young gc before the restart.
>>>
>>>                     The other odd observation is that survivor spaces
>>> are completely empty after this full gc despite tenuring threshold not
>>> being adjusted.
>>>
>>>
>>                     The full gc algorithm used compacts everything (old
>> gen and young gen) into
>>                     the old gen unless it does not all fit.   If the old
>> gen overflows, the young gen
>>                     is compacted into itself.  Live in the young gen is
>> compacted into eden first and
>>                     then into the survivor spaces.
>>
>>>                     My intuitive thinking is that there was no real
>>> reason for the full gc to occur; whatever allocation failed in young could
>>> now succeed and whatever was tenured fit, albeit very tightly.
>>>
>>>
>>                     Still puzzling about the full GC.  Are you using CMS?
>>  If you have PrintGCDetails output,
>>                     that might help.
>>
>>                     Jon
>>
>>>
>>>
>>>                     On May 7, 2014 8:40 PM, "Bernd Eckenfels" <bernd-2014 at eckenfels.net> wrote:
>>>
>>>                         On Wed, 7 May 2014 19:34:20 -0400,
>>>                         Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>>
>>>
>>>                         > The vm args are:
>>>                         >
>>>                         > -Xms16384m -Xmx16384m -Xmn16384m
>>> -XX:NewSize=12288m
>>>                         > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10
>>>
>>>                         Hmm... you have conflicting arguments here;
>>> MaxNewSize overrides Xmn.
>>>                         You will get 16384-12288=4096 MB (4 GB) old size, which is
>>> quite low. As you can see
>>>                         in your FullGC, the steady state after FullGC has
>>> filled it nearly
>>>                         completely.
>>>
>>>                         Regards
>>>                         Bernd
>>>                         _______________________________________________
>>>                         hotspot-gc-use mailing list
>>>                         hotspot-gc-use at openjdk.java.net
>>>                         http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>