Can a CMS cycle collect finalizable objects?

Krystal Mok rednaxelafx at gmail.com
Thu Nov 24 01:52:13 PST 2011


Hi Koji and Ramki,

I had a look at the repro test case in Bug 7113118. I don't think the test
case is showing the same problem as the original one caused by
SocksSocketImpl objects. The way this test case is behaving is exactly what
the VM arguments told it to do.

I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update 29.

My .hotspotrc is at [1].

SurvivorRatio doesn't need to be set explicitly, because when CMS is in use
and MaxTenuringThreshold is 0 (or AlwaysTenure is true), the SurvivorRatio
will automatically be set to 1024 by ergonomics.
UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's
omitted from my configuration, too.
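
For reference, the shape of the configuration is roughly the command line
below. This is only a sketch, not the exact .hotspotrc from [1]; the heap
size, the initiating occupancy value and the TestCase class name are
placeholders:

    java -Xmx<heap size> \
         -XX:+UseConcMarkSweepGC \
         -XX:MaxTenuringThreshold=0 \
         -XX:CMSInitiatingOccupancyFraction=<percent> \
         -XX:+PrintGCDetails -XX:+PrintReferenceGC \
         TestCase

(Using -XX:+AlwaysTenure instead of MaxTenuringThreshold=0 has the same
effect of promoting every object that survives a minor collection.)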

With -XX:+PrintReferenceGC, the GC log shows when, and how many,
finalizable objects are discovered.
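
For anyone who wants to watch FinalReference discovery in the reference GC
log without the test case from the bug report, a minimal sketch of the kind
of load that produces it could look like the class below (my own
illustration; the class and field names are made up, and this is not the
test case attached to 7113118):

    public class FinalizableChurn {
        // Overriding finalize() is enough for the VM to register a
        // FinalReference for every instance at allocation time.
        static class Finalizable {
            private final byte[] payload = new byte[1024]; // keep some heap occupied
            @Override protected void finalize() { /* no-op */ }
        }

        public static void main(String[] args) {
            while (true) {
                // The reference is dropped immediately, but the memory can only
                // be reclaimed after the FinalReference has been discovered by a
                // collection, enqueued, and processed by the finalizer thread.
                new Finalizable();
            }
        }
    }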

The VM arguments given force all surviving objects from minor collections to
be promoted into the old generation, and none of the minor collections had
a chance to discover any ready-to-be-collected FinalReferences, so the minor
GC log lines aren't of interest in this case; they all show
"[FinalReference, 0 refs, xxx secs]".

Part of the GC log can be found at [2]. It shows two CMS collection
cycles, with dozens of minor collections in between.

* Before the first of these two CMS collections, the Java heap used
is 971914K; the CMS occupancy threshold is crossed, so a CMS
collection cycle starts;
* During the re-mark phase of the first CMS collection, 46400
FinalReferences are discovered;
* After the first CMS collection, the Java heap used is still high,
at 913771K, because the finalizable objects need another old generation
collection to be collected (either CMS or a full GC is fine; see the
sketch after this list);
* During the re-mark phase of the second CMS collection, 3000
FinalReferences are discovered; these come from objects promoted by the
minor collections in between;
* After the second CMS collection, the Java heap used drops to 61747K,
as the finalizable objects discovered during the first CMS collection are
indeed finalized and then collected during the second CMS collection.
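
To make that two-step reclamation concrete, here is a small stand-alone
sketch (mine, not part of the bug report; finalizer-thread timing is not
guaranteed, so the sleeps only make the typical outcome likely). A
PhantomReference is enqueued only after its referent has been finalized, so
it shows that the object becomes reclaimable only in a collection that runs
after finalize() has executed:

    import java.lang.ref.PhantomReference;
    import java.lang.ref.ReferenceQueue;

    public class TwoCycleFinalization {
        static class Tracked {
            @Override protected void finalize() {
                System.out.println("finalize() ran");
            }
        }

        public static void main(String[] args) throws InterruptedException {
            ReferenceQueue<Tracked> queue = new ReferenceQueue<Tracked>();
            PhantomReference<Tracked> ref =
                    new PhantomReference<Tracked>(new Tracked(), queue);

            // 1st collection: the FinalReference is discovered and enqueued,
            // but the object is still finalizer-reachable, so it survives.
            System.gc();
            Thread.sleep(200);  // give the finalizer thread time to run finalize()
            System.out.println("reclaimable after 1st GC: " + (queue.poll() == ref));

            // 2nd collection: finalize() has run, so the object can become
            // phantom reachable and its space can be reclaimed.
            System.gc();
            Thread.sleep(200);
            System.out.println("reclaimable after 2nd GC: " + (queue.poll() == ref));
        }
    }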

This behavior looks normal to me -- it's what the VM arguments were telling
the VM to do.
The reason the Java heap usage swings up and down is that the actual live
data set is very low, but CMSInitiatingOccupancyFraction is set too high,
so concurrent collections are started too late. If the initiating threshold
were set to a smaller value, say 20, the test case would behave quite
reasonably.
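
Concretely, something along these lines would start the concurrent cycles
much earlier (UseCMSInitiatingOccupancyOnly is my addition here; it stops
the VM from replacing the threshold with its own estimate after the first
cycle):

    -XX:CMSInitiatingOccupancyFraction=20 -XX:+UseCMSInitiatingOccupancyOnly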

We'd need another test case to study, because this one doesn't really repro
the problem.

Since we applied the patch to SocketAdaptor [3], we haven't seen this kind of
CMS/finalization problem in production anymore. Should we hit one of these
again, I'll try to get more information from our production site and see if I
can track down the real problem.

- Kris

[1]: https://gist.github.com/1390876#file_.hotspotrc
[2]: https://gist.github.com/1390876#file_gc.partial.log
[3]: http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html

On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:

> Hi Koji --
>
> Thanks for the test case, that should definitely help with the
> identification of the problem. I'll see if
> I can find some spare time to pursue it one of these days (but can't
> promise), so please
> do open that Oracle support ticket to get the requisite resource allocated
> for the official
> investigation.
>
> Thanks again for boiling it down to a simple test case, and I'll update if
> I identify the
> root cause...
>
> -- ramki
>
>
> On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi <knoguchi at yahoo-inc.com> wrote:
>
>> This is from an old thread from April 2011, but we're still seeing the same
>> problem with (nio) Socket instances not getting collected by CMS.
>>
>> Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118
>>
>> Thanks,
>> Koji
>>
>>
>> (From
>> http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.html)
>>
>> On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" <y.s.ramakrishna at oracle.com> wrote:
>> > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote:
>> > > Hi Ramki,
>> > >
>> > > Thanks for the detailed explanation. I was trying to run some tests
>> > > for your questions. Here are the answers to some of your questions.
>> > >
>> > >>> What are the symptoms?
>> > > java.net.SocksSocketImpl objects are not getting cleaned up after a
>> > > CMS cycle. I see a direct correlation to java.lang.ref.Finalizer
>> > > objects. Over time, this fills up the old generation and CMS goes
>> > > into a loop, occupying one complete core. But when we trigger a Full
>> > > GC, these objects are garbage collected.
>> >
>> > OK, thanks.
>> >
>> > >
>> > > You mentioned that a CMS cycle does clean up these objects provided we
>> > > enable class unloading. Are you suggesting -XX:+ClassUnloading or
>> > > -XX:+CMSClassUnloadingEnabled? I have tried the latter and didn't
>> > > succeed. Our perm gen is relatively constant; by enabling this, are we
>> > > introducing performance overhead? We have room for CPU cycles and perm
>> > > gen is relatively small, so this may be fine. It's just that we want to
>> > > see these objects GC'ed in a CMS cycle.
>> > >
>> > > Do you have any suggestion w.r.t. which flags I should be using to
>> > > trigger this?
>> >
>> > For the issue you are seeing, the -XX:+CMSClassUnloadingEnabled flag will
>> > not make a difference in the accumulation of the socket objects,
>> > because as far as I can tell there is no "projection" of these
>> > into the perm gen, especially since, as you say, there is no class
>> > loading going on (your perm gen size remains constant after
>> > start-up).
>> >
>>
>> > However, keeping class unloading enabled via this flag should
>> > hopefully not have much of an impact on your pause times given that
>> > the perm gen is small. The typical effect you will see if class
>> > unloading is enabled is that the CMS remark pause times are a bit
>> > longer (if you enable PrintGCDetails you will see messages
>> > such as "scrub string table" and "scrub symbol table", "code cache"
>> > etc.). By comparing the CMS-remark pause details and times with
>> > and without class unloading enabled, you will get a good idea
>> > of its impact. In some cases, even though you pay a small price
>> > in terms of increased CMS-remark pause times, you will make up
>> > for that in terms of faster scavenges etc., so it might well
>> > be worthwhile.
>> >
>> > In the very near future, we may end up turning that on
>> > by default for CMS because the savings from leaving it off
>> > by default are much smaller now and it can often lead to
>> > other issues if class unloading is turned off.
>> >
>> > So bottom line is: it will not affect the accumulation of
>> > your socket objects, but it's a good idea to keep class
>> > unloading by CMS enabled anyway.
>> >
>> > >
>> > >
>> > >>> What does jmap -finalizerinfo on your process show?
>> > >>> What does -XX:+PrintClassHistogram show as accumulating in the heap?
>> > >>> (Are they one specific type of Finalizer objects or all varieties?)
>> > >
>> > > jmap -histo shows the above class keeps accumulating. In fact,
>> > > -finalizerinfo doesn't show any objects on this process.
>> >
>> > OK, that shows that the objects are somehow not discovered by
>> > CMS as being eligible for finalization. Although one can imagine
>> > a one cycle delay (because of floating garbage) with CMS finding
>> > these objects to be unreachable and hence eligible for finalization,
>> > continuing accumulation of these objects over a period of time
>> > (and presumably many CMS cycles) seems strange and is almost
>> > certainly a CMS bug, especially as you find that a full STW
>> > GC does indeed reclaim them.
>> >
>> > >
>> > >
>> > >
>> > >>> Did the problem start in 6u21? Or are those the only versions
>> > >>> you tested and found that there was an issue?
>> > > We have seen this problem in 6u21. We were on 6u12 earlier and didn't
>> > > run into this problem. But I can't say this is particular to that
>> > > build, since lots of things have changed.
>> >
>> > Can you boil down this behavior into a test case that you are able
>> > to share with us?
>> > If so, please file a bug with the test case
>> > and send me the CR id and I'll take a look.
>> >
>> > Oh, and before you do that, can you please check the latest public
>> > release (6u24 or 6u25?) to see if the problem still reproduces?
>> >
>> > thanks, and sorry I could not be of more help without a bug
>> > report or a test case.
>> >
>> > -- ramki
>> >
>> > >
>> > > Thanks in anticipation,
>> > > -Bharath
>> > >