From jon at siliconcircus.com Tue Nov 1 05:58:09 2011 From: jon at siliconcircus.com (Jon Bright) Date: Tue, 01 Nov 2011 13:58:09 +0100 Subject: Identifying concurrent mode failures caused by fragmentation In-Reply-To: <4EAF5CCD.5030004@oracle.com> References: <4EAE9D6E.1060807@siliconcircus.com> <4EAF5CCD.5030004@oracle.com> Message-ID: <4EAFECE1.7040806@siliconcircus.com> Jon, Indeed, the problem appears to have gone away with today's update to u26. (We plan to migrate further, but we're fairly conservative about rolling out new versions, and we already had u26 in use elsewhere.) With regard to your (and Kris') question on incremental mode: I started out by reading the tuning guide at http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#icms and followed that up by reading various other pages and your blog (which was very helpful in terms of giving a sense of how to think about GC - thank you!). Whilst I was fairly ambivalent about incremental mode (we have at least 4 logical CPUs in each machine), we'd been using it in the past and I didn't see anything specifically mentioning that it was obsolete. Is there a better reference on this subject? I'll certainly now try a few benchmarking/test runs with incremental mode turned off and roll that out if all is well. Thanks! Jon On 01.11.2011 03:43, Jon Masamitsu wrote: > Jon, > > I haven't looked at the longer log but in general I've found the > information in the GC logs inadequate to figure out if the > problem is fragmentation. But more important, there has > been some good work in recent versions of hotspot so that > we're more successful at combating fragmentation. Try > the latest release and see if it helps (u26 should be good > enough). > > Jon > > On 10/31/11 06:06, Jon Bright wrote: >> Hi, >> >> We have an application running with a 6GB heap (complete parameters >> below). 
Mostly it has a fairly low turnover of memory use, but on >> occasion, it will come under some pressure as it reloads a large >> in-memory data set from a database. >> >> Sometimes in this situation, we'll see a concurrent mode failure. >> Here's one failure: >> >> 20021.464: [GC 20021.465: [ParNew: 13093K->3939K(76672K), 0.0569240 >> secs]20021.522: [CMS20023.747: [CMS-concurrent-mark: 11.403/29.029 >> secs] [Times: user=41.11 sys=1.03, real=29.03 secs] >> (concurrent mode failure): 3873922K->2801744K(6206272K), 30.7900180 >> secs] 3886215K->2801744K(6282944K), [CMS Perm : >> 142884K->142834K(524288K)] icms_dc=33 , 30.8473830 secs] [Times: >> user=30.26 sys=0.71, real=30.85 secs] >> Total time for which application threads were stopped: 30.8484460 seconds >> >> (I've attached a lengthier log including the previous and subsequent >> CMS collection.) >> >> Am I correct in thinking that this failure can basically only be >> caused by fragmentation? Both young and old seem to have plenty of >> space. There doesn't seem to be any sign that the tenured generation >> would run out of space before CMS completes. Fragmentation is the only >> remaining cause that occurs to me. >> >> We're running with 1.6.0_11, although this will be upgraded to >> 1.6.0_26 tomorrow. I realise our current version is ancient - I'm not >> really looking for help on the problem itself, just for advice on >> whether the log line above indicates fragmentation. 
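When scanning logs like this by hand, a quick sanity check is to pull out the old-gen `before->after(capacity)` triple and compare occupancy to capacity. A throwaway sketch of that check (class and method names here are mine, not part of any existing tool):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CmsLogParse {
    // Matches heap transitions of the form "3873922K->2801744K(6206272K)".
    static final Pattern TRANSITION =
            Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns {before, after, capacity} in KB for the first transition on
    // the line, or null if there is none. On a full concurrent-mode-failure
    // line the first match is the ParNew transition, so pass in just the
    // old-gen fragment.
    public static long[] firstTransition(String logLine) {
        Matcher m = TRANSITION.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] {
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        // Old-gen fragment of the failure line quoted above: roughly 3.7G
        // live out of a 6.2G capacity before the collection, so raw space
        // was not the problem -- consistent with suspecting fragmentation.
        long[] t = firstTransition("3873922K->2801744K(6206272K)");
        System.out.println(t[0] + "K -> " + t[1] + "K of " + t[2] + "K");
    }
}
```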
>> >> Thanks >> >> Jon Bright >> >> >> >> The parameters we have set are: >> >> -server >> -Xmx6144M >> -Xms6144M >> -XX:MaxPermSize=512m >> -XX:PermSize=512m >> -XX:+UseConcMarkSweepGC >> -XX:+CMSIncrementalMode >> -XX:+CMSIncrementalPacing >> -XX:SoftRefLRUPolicyMSPerMB=3 >> -XX:CMSIncrementalSafetyFactor=30 >> -XX:+PrintGCDetails >> -XX:+PrintGCApplicationStoppedTime >> -XX:+PrintGCApplicationConcurrentTime >> -XX:+PrintGCTimeStamps >> -Xloggc:/home/tbmx/log/gc_`date +%Y%m%d%H%M`.log >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From jon at siliconcircus.com Tue Nov 1 06:01:38 2011 From: jon at siliconcircus.com (Jon Bright) Date: Tue, 01 Nov 2011 14:01:38 +0100 Subject: Identifying concurrent mode failures caused by fragmentation In-Reply-To: References: <4EAE9D6E.1060807@siliconcircus.com> Message-ID: <4EAFEDB2.2030800@siliconcircus.com> Hi Kris, Many thanks for PrintFLSStatistics - it looks like just the sort of thing I'm after. I'm mostly reading the GC logs manually anyway, so breaking the parsing tools isn't a big deal for me. As I mentioned in my reply to Jon, we'll probably turn off incremental mode - I hadn't realised it was obsolete. Thanks for the hint. Jon On 01.11.2011 04:21, Krystal Mok wrote: > Hi Jon, > > It might be helpful to set -XX:PrintFLSStatistics to a value greater > than zero, to get the stats of FreeListSpace so that you'd know the size > of the biggest fragment. The GC log produced by -XX:+PrintGCDetails > doesn't give enough information on fragmentation. 
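Once free-list statistics are in the log, the two numbers worth watching are the total free space and the largest free chunk in the CMS generation. One crude derived indicator (my own convention, not something the JVM prints) is how far the largest chunk falls short of the total free space:

```java
public class FlsFragmentation {
    // Illustrative metric: 0.0 means the free space is one contiguous
    // chunk; values near 1.0 mean it is shattered into many small pieces,
    // so a large promotion can fail even when total free space looks ample.
    public static double fragmentation(long totalFreeWords, long maxChunkWords) {
        if (totalFreeWords == 0) {
            return 0.0;
        }
        return 1.0 - (double) maxChunkWords / (double) totalFreeWords;
    }

    public static void main(String[] args) {
        // Hypothetical readings taken from two FLS dumps.
        System.out.println(fragmentation(1_000_000, 950_000) < 0.1);  // mostly contiguous
        System.out.println(fragmentation(1_000_000, 50_000) > 0.9);   // badly fragmented
    }
}
```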
> > Here's an example of using -XX:PrintFLSStatistics=1: > https://gist.github.com/1329783 > It does make the GC log messier, and some of the GC log parsing tools > won't cope with this, but you get to know how bad the fragmentation is. > > Anyway, it looks like you're using CMS in incremental mode. This mode > should be obsolete in JDK6 already. Is there a good reason for you to be > using it? If not, I'd suggest turning it off, though, no matter if > you're upgrading your JDK or not. > > Regards, > Kris Mok > > On Mon, Oct 31, 2011 at 9:06 PM, Jon Bright > wrote: > > Hi, > > We have an application running with a 6GB heap (complete parameters > below). Mostly it has a fairly low turnover of memory use, but on > occasion, it will come under some pressure as it reloads a large > in-memory data set from a database. > > Sometimes in this situation, we'll see a concurrent mode failure. > Here's one failure: > > 20021.464: [GC 20021.465: [ParNew: 13093K->3939K(76672K), 0.0569240 > secs]20021.522: [CMS20023.747: [CMS-concurrent-mark: 11.403/29.029 > secs] [Times: user=41.11 sys=1.03, real=29.03 secs] > (concurrent mode failure): 3873922K->2801744K(6206272K), > 30.7900180 secs] 3886215K->2801744K(6282944K), [CMS Perm : > 142884K->142834K(524288K)] icms_dc=33 , 30.8473830 secs] [Times: > user=30.26 sys=0.71, real=30.85 secs] > Total time for which application threads were stopped: 30.8484460 > seconds > > (I've attached a lengthier log including the previous and subsequent > CMS collection.) > > Am I correct in thinking that this failure can basically only be > caused by fragmentation? Both young and old seem to have plenty of > space. There doesn't seem to be any sign that the tenured generation > would run out of space before CMS completes. Fragmentation is the > only remaining cause that occurs to me. > > We're running with 1.6.0_11, although this will be upgraded to > 1.6.0_26 tomorrow. 
I realise our current version is ancient - I'm > not really looking for help on the problem itself, just for advice > on whether the log line above indicates fragmentation. > > Thanks > > Jon Bright > > > > The parameters we have set are: > > -server > -Xmx6144M > -Xms6144M > -XX:MaxPermSize=512m > -XX:PermSize=512m > -XX:+UseConcMarkSweepGC > -XX:+CMSIncrementalMode > -XX:+CMSIncrementalPacing > -XX:SoftRefLRUPolicyMSPerMB=3 > -XX:__CMSIncrementalSafetyFactor=30 > -XX:+PrintGCDetails > -XX:+__PrintGCApplicationStoppedTime > -XX:+__PrintGCApplicationConcurrentTi__me > -XX:+PrintGCTimeStamps > -Xloggc:/home/tbmx/log/gc_`__date +%Y%m%d%H%M`.log > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > From jon.masamitsu at oracle.com Tue Nov 1 06:50:29 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 01 Nov 2011 06:50:29 -0700 Subject: Identifying concurrent mode failures caused by fragmentation In-Reply-To: <4EAFECE1.7040806@siliconcircus.com> References: <4EAE9D6E.1060807@siliconcircus.com> <4EAF5CCD.5030004@oracle.com> <4EAFECE1.7040806@siliconcircus.com> Message-ID: <4EAFF925.9000005@oracle.com> Jon, Incremental CMS (iCMS) was written for a specific use case - 1 or 2 hardware threads where concurrent activity by CMS would look like a STW (if only 1 hardware thread) or a high tax on the cpu cycles (2 hardware threads). It has a higher overhead and also is less efficient in terms of identifying garbage. The latter is because iCMS spreads out the concurrent work so that objects that it has identified as live earlier may actually be dead when the dead objects are swept up. It's worth testing with regular CMS instead of iCMS. BTW, for a 6g heap your young gen might be on the small side. A larger young gen allows more objects to die in the young gen and puts less pressure on the old (CMS) gen (i.e. 
fewer objects get promoted). Next time you want to play with your GC settings, try a larger young gen. Not sure if iCMS pushed you toward a smaller young gen. I personally don't have much experience with iCMS but with regular CMS, I would expect you to get better throughput with a larger young gen. As usual the devil is in the details. Jon On 11/01/11 05:58, Jon Bright wrote: > Jon, > > Indeed, the problem appears to have gone away with today's update to > u26. (We plan to migrate further, but we're fairly conservative about > rolling out new versions, and we already had u26 in use elsewhere.) > > With regard to your (and Kris') question on incremental mode: I started > out by reading the tuning guide at > > http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#icms > > and followed that up by reading various other pages and your blog (which > was very helpful in terms of giving a sense of how to think about GC - > thank you!). > > Whilst I was fairly ambivalent about incremental mode (we have at least > 4 logical CPUs in each machine), we'd been using it in the past and I > didn't see anything specifically mentioning that it was obsolete. Is > there a better reference on this subject? > > I'll certainly now try a few benchmarking/test runs with incremental > mode turned off and roll that out if all is well. > > Thanks! > > Jon > > On 01.11.2011 03:43, Jon Masamitsu wrote: >> Jon, >> >> I haven't looked at the longer log but in general I've found the >> information in the GC logs inadequate to figure out if the >> problem is fragmentation. But more important, there has >> been some good work in recent versions of hotspot so that >> we're more successful at combating fragmentation. Try >> the latest release and see if it helps (u26 should be good >> enough). >> >> Jon >> >> On 10/31/11 06:06, Jon Bright wrote: >>> Hi, >>> >>> We have an application running with a 6GB heap (complete parameters >>> below). 
Mostly it has a fairly low turnover of memory use, but on >>> occasion, it will come under some pressure as it reloads a large >>> in-memory data set from a database. >>> >>> Sometimes in this situation, we'll see a concurrent mode failure. >>> Here's one failure: >>> >>> 20021.464: [GC 20021.465: [ParNew: 13093K->3939K(76672K), 0.0569240 >>> secs]20021.522: [CMS20023.747: [CMS-concurrent-mark: 11.403/29.029 >>> secs] [Times: user=41.11 sys=1.03, real=29.03 secs] >>> (concurrent mode failure): 3873922K->2801744K(6206272K), 30.7900180 >>> secs] 3886215K->2801744K(6282944K), [CMS Perm : >>> 142884K->142834K(524288K)] icms_dc=33 , 30.8473830 secs] [Times: >>> user=30.26 sys=0.71, real=30.85 secs] >>> Total time for which application threads were stopped: 30.8484460 seconds >>> >>> (I've attached a lengthier log including the previous and subsequent >>> CMS collection.) >>> >>> Am I correct in thinking that this failure can basically only be >>> caused by fragmentation? Both young and old seem to have plenty of >>> space. There doesn't seem to be any sign that the tenured generation >>> would run out of space before CMS completes. Fragmentation is the only >>> remaining cause that occurs to me. >>> >>> We're running with 1.6.0_11, although this will be upgraded to >>> 1.6.0_26 tomorrow. I realise our current version is ancient - I'm not >>> really looking for help on the problem itself, just for advice on >>> whether the log line above indicates fragmentation. 
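For reference, Jon's two suggestions in this message (regular CMS instead of iCMS, and a larger young gen) would change the original flag list roughly as follows. The -Xmn value is purely illustrative, something to benchmark rather than a recommendation:

```
-server
-Xmx6144M
-Xms6144M
-Xmn1g                                # illustrative; benchmark before rollout
-XX:MaxPermSize=512m
-XX:PermSize=512m
-XX:+UseConcMarkSweepGC
# dropped: -XX:+CMSIncrementalMode
# dropped: -XX:+CMSIncrementalPacing
# dropped: -XX:CMSIncrementalSafetyFactor=30   (only meaningful under iCMS)
-XX:SoftRefLRUPolicyMSPerMB=3
```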
>>> >>> Thanks >>> >>> Jon Bright >>> >>> >>> >>> The parameters we have set are: >>> >>> -server >>> -Xmx6144M >>> -Xms6144M >>> -XX:MaxPermSize=512m >>> -XX:PermSize=512m >>> -XX:+UseConcMarkSweepGC >>> -XX:+CMSIncrementalMode >>> -XX:+CMSIncrementalPacing >>> -XX:SoftRefLRUPolicyMSPerMB=3 >>> -XX:CMSIncrementalSafetyFactor=30 >>> -XX:+PrintGCDetails >>> -XX:+PrintGCApplicationStoppedTime >>> -XX:+PrintGCApplicationConcurrentTime >>> -XX:+PrintGCTimeStamps >>> -Xloggc:/home/tbmx/log/gc_`date +%Y%m%d%H%M`.log >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From mchr3k at gmail.com Sat Nov 5 15:29:10 2011 From: mchr3k at gmail.com (Martin Hare Robertson) Date: Sat, 5 Nov 2011 22:29:10 +0000 Subject: Perf Impact of CMSClassUnloadingEnabled Message-ID: Hi, I recently encountered an interesting GC issue with a Tomcat application. I came up with a simple repro scenario which I posted to StackOverflow: http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected To solve this issue I have been encouraged to use -XX:+CMSClassUnloadingEnabled. I currently use the following GC configuration. -XX:+UseMembar -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly Is enabling CMSClassUnloadingEnabled likely to have a negative perf impact? If not, why is it disabled by default? Thanks Martin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111105/5d3c6043/attachment.html From jon.masamitsu at oracle.com Mon Nov 7 07:44:17 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 07 Nov 2011 07:44:17 -0800 Subject: Perf Impact of CMSClassUnloadingEnabled In-Reply-To: References: Message-ID: <4EB7FCD1.4090607@oracle.com> Doing class unloading with CMS will often increase the remark pause times and so is not on by default. On 11/5/2011 3:29 PM, Martin Hare Robertson wrote: > Hi, > > I recently encountered an interesting GC issue with a Tomcat application. I > came up with a simple repro scenario which I posted to StackOverflow: > http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected > > To solve this issue I have been encouraged to use > -XX:+CMSClassUnloadingEnabled. > I currently use the following GC configuration. > > -XX:+UseMembar > -XX:+UseConcMarkSweepGC > -XX:+UseParNewGC > -XX:CMSInitiatingOccupancyFraction=80 > -XX:+UseCMSInitiatingOccupancyOnly > > Is enabling CMSClassUnloadingEnabled likely to have a negative perf impact? > If not, why is it disabled by default? > > Thanks > > Martin > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111107/b39b7256/attachment.html From Andreas.Loew at oracle.com Mon Nov 7 10:33:17 2011 From: Andreas.Loew at oracle.com (Andreas Loew) Date: Mon, 07 Nov 2011 19:33:17 +0100 Subject: Perf Impact of CMSClassUnloadingEnabled In-Reply-To: <4EB7FCD1.4090607@oracle.com> References: <4EB7FCD1.4090607@oracle.com> Message-ID: <4EB8246D.8000202@oracle.com> Hi Jon, sorry, a follow-up question from my side: as it shouldn't be normal even for a Java EE app to constantly drop references to classloaders or individual classes that then need to be GC'ed: To what extent does your statement about increased remark pauses still apply in case the PermGen / set of loaded classes has stayed completely constant between initial mark and remark (which should be the usual case)? And wouldn't there also be a distinction between PermGen and Old Gen? Many thanks & best regards, Andreas -- Andreas Loew Senior Java Architect Oracle Advanced Customer Services Germany On 07.11.2011 16:44, Jon Masamitsu wrote: > Doing class unloading with CMS will often increase the remark pause times > and so is not on by default. > > On 11/5/2011 3:29 PM, Martin Hare Robertson wrote: >> Hi, >> >> I recently encountered an interesting GC issue with a Tomcat application. I >> came up with a simple repro scenario which I posted to StackOverflow: >> http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected >> >> To solve this issue I have been encouraged to use >> -XX:+CMSClassUnloadingEnabled. >> I currently use the following GC configuration. >> >> -XX:+UseMembar >> -XX:+UseConcMarkSweepGC >> -XX:+UseParNewGC >> -XX:CMSInitiatingOccupancyFraction=80 >> -XX:+UseCMSInitiatingOccupancyOnly >> >> Is enabling CMSClassUnloadingEnabled likely to have a negative perf impact? >> If not, why is it disabled by default? 
>> >> Thanks >> >> Martin From jon.masamitsu at oracle.com Tue Nov 8 07:58:51 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 08 Nov 2011 07:58:51 -0800 Subject: Perf Impact of CMSClassUnloadingEnabled In-Reply-To: <4EB8246D.8000202@oracle.com> References: <4EB7FCD1.4090607@oracle.com> <4EB8246D.8000202@oracle.com> Message-ID: <4EB951BB.3000600@oracle.com> Andreas, Hotspot maintains a list of classes that are loaded in the Dictionary (dictionary.hpp/cpp). This list does not keep classes alive. After marking (when we know what classes are dead), we walk the list and remove dead classes. Hotspot does not keep information that says classes have not been unloaded, so the list is always walked. Jon On 11/07/11 10:33, Andreas Loew wrote: > Hi Jon, > > sorry, a follow-up question from my side: As it shouldn't be the most > normal thing even for a Java EE app to constantly dereference > classloaders or single classes that need to be GC'ed: > > In how far does your statement about increased remark pauses still > apply in case the PermGen / set of loaded classes has stayed > completely constant between initial mark and remark (which should be > the usual case)? > > And wouldn't there a also be a distinction between PermGen and Old Gen? > > Many thanks & best regards, > > Andreas > > -- > Andreas Loew > Senior Java Architect > Oracle Advanced Customer Services Germany > > > Am 07.11.2011 16:44, schrieb Jon Masamitsu: >> Doing class unloading with CMS will often increase the remark pause >> times >> and so is not on by default. >> >> On 11/5/2011 3:29 PM, Martin Hare Robertson wrote: >>> Hi, >>> >>> I recently encountered an interesting GC issue with a Tomcat >>> application. I >>> came up with a simple repro scenario which I posted to StackOverflow: >>> http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected >>> >>> >>> To solve this issue I have been encouraged to use >>> -XX:+CMSClassUnloadingEnabled. 
>>> I currently use the following GC configuration. >>> >>> -XX:+UseMembar >>> -XX:+UseConcMarkSweepGC >>> -XX:+UseParNewGC >>> -XX:CMSInitiatingOccupancyFraction=80 >>> -XX:+UseCMSInitiatingOccupancyOnly >>> >>> Is enabling CMSClassUnloadingEnabled likely to have a negative perf >>> impact? >>> If not, why is it disabled by default? >>> >>> Thanks >>> >>> Martin From Andreas.Loew at oracle.com Tue Nov 8 08:13:58 2011 From: Andreas.Loew at oracle.com (Andreas Loew) Date: Tue, 08 Nov 2011 17:13:58 +0100 Subject: Perf Impact of CMSClassUnloadingEnabled In-Reply-To: <4EB951BB.3000600@oracle.com> References: <4EB7FCD1.4090607@oracle.com> <4EB8246D.8000202@oracle.com> <4EB951BB.3000600@oracle.com> Message-ID: <4EB95546.5000006@oracle.com> Hi Jon, many thanks for your reply :-) The behavior you mention indeed seems a little "unfortunate"... ;-) Will this change as part of the efforts to completely remove PermGen (part of the "HotRockit" initiative) following the example of JRockit? Thanks again & best regards, Andreas -- Andreas Loew Senior Java Architect Oracle Advanced Customer Services Germany Am 08.11.2011 16:58, schrieb Jon Masamitsu: > Andreas, > > Hotspot maintains a list of classes that are loaded in the > Dictionary (dictionary.hpp/cpp). This list does not keep > classes alive. After marking (when we know what classes > are dead), we walk the list and remove dead classes. > Hotspot does not keep information that says classes have > not been unloaded, so the list is always walked. > > Jon > > On 11/07/11 10:33, Andreas Loew wrote: >> Hi Jon, >> >> sorry, a follow-up question from my side: As it shouldn't be the most >> normal thing even for a Java EE app to constantly dereference >> classloaders or single classes that need to be GC'ed: >> >> In how far does your statement about increased remark pauses still >> apply in case the PermGen / set of loaded classes has stayed >> completely constant between initial mark and remark (which should be >> the usual case)? 
>> >> And wouldn't there a also be a distinction between PermGen and Old Gen? >> >> Many thanks & best regards, >> >> Andreas >> >> -- >> Andreas Loew >> Senior Java Architect >> Oracle Advanced Customer Services Germany >> >> >> Am 07.11.2011 16:44, schrieb Jon Masamitsu: >>> Doing class unloading with CMS will often increase the remark pause >>> times >>> and so is not on by default. >>> >>> On 11/5/2011 3:29 PM, Martin Hare Robertson wrote: >>>> Hi, >>>> >>>> I recently encountered an interesting GC issue with a Tomcat >>>> application. I >>>> came up with a simple repro scenario which I posted to StackOverflow: >>>> http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected >>>> >>>> >>>> To solve this issue I have been encouraged to use >>>> -XX:+CMSClassUnloadingEnabled. >>>> I currently use the following GC configuration. >>>> >>>> -XX:+UseMembar >>>> -XX:+UseConcMarkSweepGC >>>> -XX:+UseParNewGC >>>> -XX:CMSInitiatingOccupancyFraction=80 >>>> -XX:+UseCMSInitiatingOccupancyOnly >>>> >>>> Is enabling CMSClassUnloadingEnabled likely to have a negative perf >>>> impact? >>>> If not, why is it disabled by default? >>>> >>>> Thanks >>>> >>>> Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111108/2467fd8f/attachment.html From jon.masamitsu at oracle.com Tue Nov 8 09:28:42 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 08 Nov 2011 09:28:42 -0800 Subject: Perf Impact of CMSClassUnloadingEnabled In-Reply-To: <4EB95546.5000006@oracle.com> References: <4EB7FCD1.4090607@oracle.com> <4EB8246D.8000202@oracle.com> <4EB951BB.3000600@oracle.com> <4EB95546.5000006@oracle.com> Message-ID: <4EB966CA.1020604@oracle.com> On 11/08/11 08:13, Andreas Loew wrote: > Hi Jon, > > many thanks for your reply :-) > > The behavior you mention indeed seems a little "unfortunate"... 
;-) > Will this change as part of the efforts to completely remove PermGen > (part of the "HotRockit" initiative) following the example of JRockit? It will be the case after perm gen removal that we will readily know if classes have been unloaded so will be able to conditionally skip the walk of the Dictionary for purposes of purging dead classes. Jon > > Thanks again & best regards, > > Andreas > > -- > Andreas Loew > Senior Java Architect > Oracle Advanced Customer Services Germany > > > Am 08.11.2011 16:58, schrieb Jon Masamitsu: >> Andreas, >> >> Hotspot maintains a list of classes that are loaded in the >> Dictionary (dictionary.hpp/cpp). This list does not keep >> classes alive. After marking (when we know what classes >> are dead), we walk the list and remove dead classes. >> Hotspot does not keep information that says classes have >> not been unloaded, so the list is always walked. >> >> Jon >> >> On 11/07/11 10:33, Andreas Loew wrote: >>> Hi Jon, >>> >>> sorry, a follow-up question from my side: As it shouldn't be the >>> most normal thing even for a Java EE app to constantly dereference >>> classloaders or single classes that need to be GC'ed: >>> >>> In how far does your statement about increased remark pauses still >>> apply in case the PermGen / set of loaded classes has stayed >>> completely constant between initial mark and remark (which should be >>> the usual case)? >>> >>> And wouldn't there a also be a distinction between PermGen and Old Gen? >>> >>> Many thanks & best regards, >>> >>> Andreas >>> >>> -- >>> Andreas Loew >>> Senior Java Architect >>> Oracle Advanced Customer Services Germany >>> >>> >>> Am 07.11.2011 16:44, schrieb Jon Masamitsu: >>>> Doing class unloading with CMS will often increase the remark pause >>>> times >>>> and so is not on by default. >>>> >>>> On 11/5/2011 3:29 PM, Martin Hare Robertson wrote: >>>>> Hi, >>>>> >>>>> I recently encountered an interesting GC issue with a Tomcat >>>>> application. 
I >>>>> came up with a simple repro scenario which I posted to StackOverflow: >>>>> http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected >>>>> >>>>> >>>>> To solve this issue I have been encouraged to use >>>>> -XX:+CMSClassUnloadingEnabled. >>>>> I currently use the following GC configuration. >>>>> >>>>> -XX:+UseMembar >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:+UseParNewGC >>>>> -XX:CMSInitiatingOccupancyFraction=80 >>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>> >>>>> Is enabling CMSClassUnloadingEnabled likely to have a negative >>>>> perf impact? >>>>> If not, why is it disabled by default? >>>>> >>>>> Thanks >>>>> >>>>> Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111108/72ec8a46/attachment.html From ysr1729 at gmail.com Tue Nov 8 10:20:46 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 8 Nov 2011 10:20:46 -0800 Subject: Perf Impact of CMSClassUnloadingEnabled In-Reply-To: <4EB951BB.3000600@oracle.com> References: <4EB7FCD1.4090607@oracle.com> <4EB8246D.8000202@oracle.com> <4EB951BB.3000600@oracle.com> Message-ID: Right, and this walk is done single-threaded today (perhaps it could be parallelized without too much effort?). May be moot with the upcoming changes around perm gen though.... -- ramki On Tue, Nov 8, 2011 at 7:58 AM, Jon Masamitsu wrote: > Andreas, > > Hotspot maintains a list of classes that are loaded in the > Dictionary (dictionary.hpp/cpp). This list does not keep > classes alive. After marking (when we know what classes > are dead), we walk the list and remove dead classes. > Hotspot does not keep information that says classes have > not been unloaded, so the list is always walked. 
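The unconditional walk Jon describes can be sketched as follows (Java purely as illustration; the real code is C++ in HotSpot's dictionary.cpp, and these names are mine):

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Illustrative model of the system-dictionary purge described above.
public class DictionaryWalk {
    static class Klass {
        final String name;
        boolean markedLive;   // set during the CMS marking phase
        Klass(String name, boolean markedLive) {
            this.name = name;
            this.markedLive = markedLive;
        }
    }

    // After marking, every entry is visited even if nothing died -- the VM
    // keeps no "no classes were unloaded" shortcut, so the full walk runs
    // on every class-unloading remark.
    static int purgeDead(List<Klass> dictionary) {
        int removed = 0;
        for (Iterator<Klass> it = dictionary.iterator(); it.hasNext(); ) {
            if (!it.next().markedLive) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Klass> dict = new LinkedList<>();
        dict.add(new Klass("java/lang/String", true));
        dict.add(new Klass("com/example/LeakedProxy$1", false)); // hypothetical dead class
        System.out.println(purgeDead(dict)); // one dead class unlinked
        System.out.println(dict.size());
    }
}
```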
> > Jon > > On 11/07/11 10:33, Andreas Loew wrote: > > Hi Jon, > > > > sorry, a follow-up question from my side: As it shouldn't be the most > > normal thing even for a Java EE app to constantly dereference > > classloaders or single classes that need to be GC'ed: > > > > In how far does your statement about increased remark pauses still > > apply in case the PermGen / set of loaded classes has stayed > > completely constant between initial mark and remark (which should be > > the usual case)? > > > > And wouldn't there a also be a distinction between PermGen and Old Gen? > > > > Many thanks & best regards, > > > > Andreas > > > > -- > > Andreas Loew > > Senior Java Architect > > Oracle Advanced Customer Services Germany > > > > > > Am 07.11.2011 16:44, schrieb Jon Masamitsu: > >> Doing class unloading with CMS will often increase the remark pause > >> times > >> and so is not on by default. > >> > >> On 11/5/2011 3:29 PM, Martin Hare Robertson wrote: > >>> Hi, > >>> > >>> I recently encountered an interesting GC issue with a Tomcat > >>> application. I > >>> came up with a simple repro scenario which I posted to StackOverflow: > >>> > http://stackoverflow.com/questions/8017193/when-does-the-perm-gen-get-collected > >>> > >>> > >>> To solve this issue I have been encouraged to use > >>> -XX:+CMSClassUnloadingEnabled. > >>> I currently use the following GC configuration. > >>> > >>> -XX:+UseMembar > >>> -XX:+UseConcMarkSweepGC > >>> -XX:+UseParNewGC > >>> -XX:CMSInitiatingOccupancyFraction=80 > >>> -XX:+UseCMSInitiatingOccupancyOnly > >>> > >>> Is enabling CMSClassUnloadingEnabled likely to have a negative perf > >>> impact? > >>> If not, why is it disabled by default? > >>> > >>> Thanks > >>> > >>> Martin > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111108/add51560/attachment-0001.html From ysr1729 at gmail.com Fri Nov 11 14:31:21 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 11 Nov 2011 14:31:21 -0800 Subject: Class histogram output and chopping off long thin tails Message-ID: I am posting this to hotspot-gc-use, but the idea is that it also be posted to -dev (but given how the lists are arranged, I am posting directly to the one and not the other to avoid double copies to those who are in the intersection of the two lists, while covering those in the union of the two). I've noticed recently in my use of the class histogram feature that, in typical cases, I am interested in the top few types of objects and not in the long thin tail. I am not sure how typical my use or experience is, but it would appear to me (based on my limited experience of late) that if we limited the histogram output to the top "N" (for say N = 40 or so) classes by default, it would likely satisfy 80-90% of use cases. For the remaining 10% of use cases, one would provide a complete dump, or a dump with more entries than available by default. I wanted to run this suggestion by everyone and see whether this would have some traction wrt such a request. I am guessing that this may be especially useful when dealing with very large applications that may have many different types of objects in the heap and might present a very long thin (and in many cases uninteresting) tail. (There may be other ways of restricting the output, for example by cutting off output below a certain population or volume threshold, but simply displaying the top N most voluminous or populous classes would seem to be the simplest....) Comments? -- ramki -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111111/6510c3e1/attachment.html From Peter.B.Kessler at Oracle.COM Fri Nov 11 15:19:12 2011 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Fri, 11 Nov 2011 15:19:12 -0800 Subject: Class histogram output and chopping off long thin tails In-Reply-To: References: Message-ID: <4EBDAD70.6040009@Oracle.COM> This seems like a reasonable request. In fact, I thought there *was* a way not to print classes that had fewer than N bytes (or instances), but I don't see it (or any traces of it :-). The other way I've wanted to filter PrintClassHistogram is to only print objects of a particular class (or probably package). E.g., java.util.Hashtable and java.util.Hashtable$Entry, when I'm looking for a "leak" like that. Knowing me, I probably kludged those together with grep or awk. ... peter Srinivas Ramakrishna wrote: > I am posting this to hotspot-gc-use, but the idea is that it also post to -dev (but given how > the lists are arranged, I am posting directly to the one and not the other to avoid double copies > to those who are in the intersection of the two kists, while covering those in the union of the two). > > I've noticed recently in my use of the the class histogram feature, that in typical cases I am interested > in the top few types of objects and not in the long thin tail. I am not sure how typical my use or > experience is, but it would appear to me (based on my limited experience of late) that if we limited > the histogram output to the top "N" (for say N = 40 or so) classes by default, it would likely satisfy > 80-90% of use cases. For the remaining 10% of use cases, one would provide a complete dump, > or a dump with more entries than available by default. > > I wanted to run this suggestion by everyone and see whether this would have some traction > wrt such a request. 
> > I am guessing that this may be especially useful when dealing with very large applications that > may have many different types of objects in the heap and might present a very long thin (and in > many cases uninteresting) tail. (There may be other ways of restricting the output, for example > by cutting off output below a certain population or volume threshold, but simply displaying the > top N most voluminous or populous classes would seem to be the simplest....) > > Comments? > -- ramki From tony.printezis at oracle.com Mon Nov 14 07:13:55 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Mon, 14 Nov 2011 10:13:55 -0500 Subject: Class histogram output and chopping off long thin tails In-Reply-To: References: Message-ID: <4EC13033.6010901@oracle.com> Ramki, First, which version of the class histogram are you referring to? I assume it's the one we generate from within the JVM which goes to the GC log? If you were using jmap you could just pipe the output to head or similar. Is your concern mainly to keep the class histogram output reasonably compact? FWIW, and I don't know how common this scenario is, I once tracked down a leak by noticing that there were 2 instances of a particular class instead of 1 (I was replacing one instance with a newly-allocated one, but the original one ended up being queued up for finalization and held on to a lot of space). If we only dumped the top N classes I would have missed this piece of information. Maybe adding a new -XX parameter :-) to set N would be a good compromise? Tony On 11/11/2011 5:31 PM, Srinivas Ramakrishna wrote: > > I am posting this to hotspot-gc-use, but the idea is that it also post > to -dev (but given how > the lists are arranged, I am posting directly to the one and not the > other to avoid double copies > to those who are in the intersection of the two kists, while covering > those in the union of the two). 
> > I've noticed recently in my use of the the class histogram feature, > that in typical cases I am interested > in the top few types of objects and not in the long thin tail. I am > not sure how typical my use or > experience is, but it would appear to me (based on my limited > experience of late) that if we limited > the histogram output to the top "N" (for say N = 40 or so) classes by > default, it would likely satisfy > 80-90% of use cases. For the remaining 10% of use cases, one would > provide a complete dump, > or a dump with more entries than available by default. > > I wanted to run this suggestion by everyone and see whether this would > have some traction > wrt such a request. > > I am guessing that this may be especially useful when dealing with > very large applications that > may have many different types of objects in the heap and might present > a very long thin (and in > many cases uninteresting) tail. (There may be other ways of > restricting the output, for example > by cutting off output below a certain population or volume threshold, > but simply displaying the > top N most voluminous or populous classes would seem to be the > simplest....) > > Comments? > -- ramki > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111114/86c75dd7/attachment.html From stefan.karlsson at oracle.com Mon Nov 14 07:51:56 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 14 Nov 2011 16:51:56 +0100 Subject: Class histogram output and chopping off long thin tails In-Reply-To: References: Message-ID: <4EC1391C.5020505@oracle.com> Hi Ramki, On 11/11/2011 11:31 PM, Srinivas Ramakrishna wrote: > > I am posting this to hotspot-gc-use, but the idea is that it also post > to -dev (but given how > the lists are arranged, I am posting directly to the one and not the > other to avoid double copies > to those who are in the intersection of the two kists, while covering > those in the union of the two). > > I've noticed recently in my use of the the class histogram feature, > that in typical cases I am interested > in the top few types of objects and not in the long thin tail. I am > not sure how typical my use or > experience is, but it would appear to me (based on my limited > experience of late) that if we limited > the histogram output to the top "N" (for say N = 40 or so) classes by > default, it would likely satisfy > 80-90% of use cases. For the remaining 10% of use cases, one would > provide a complete dump, > or a dump with more entries than available by default. > > I wanted to run this suggestion by everyone and see whether this would > have some traction > wrt such a request. We have this feature in JRockit's class histogram infrastructure, so maybe one could see this as a convergence "project"? 
For example: $ jrcmd 14991 print_object_summary 14991: --------- Detailed Heap Statistics: --------- 33.3% 79k 1005 +79k [C 22.3% 53k 456 +53k java/lang/Class 14.2% 34k 10 +34k [B 9.7% 23k 995 +23k java/lang/String 5.4% 12k 304 +12k [Ljava/lang/Object; 2.5% 5k 76 +5k java/lang/reflect/Method 1.3% 3k 157 +3k [Ljava/lang/Class; 1.3% 2k 49 +2k [Ljava/lang/String; 1.1% 2k 4 +2k [Ljrockit/vm/FCECache$FCE; 0.9% 2k 32 +2k java/lang/reflect/Field 0.7% 1k 20 +1k [Ljava/util/HashMap$Entry; 0.6% 1k 10 +1k java/lang/Thread 0.6% 1k 62 +1k java/util/Hashtable$Entry 0.5% 1k 5 +1k [I 239kB total --- --------- End of Detailed Heap Statistics --- where the cutoff is explained by: $ jrcmd 14991 help print_object_summary ... cutoff - classes that represent less than this percentage of total live objects (measured in size) will not be displayed. Currently the percentage should be multiplied by 1000 so 1.5%% would be 1500 (int, 500) cutoffpointsto - like cutoff but for points-to information (int, 500) increaseonly - set if you only want to display the classes that increased since the last listing (bool, false) ... StefanK > > I am guessing that this may be especially useful when dealing with > very large applications that > may have many different types of objects in the heap and might present > a very long thin (and in > many cases uninteresting) tail. (There may be other ways of > restricting the output, for example > by cutting off output below a certain population or volume threshold, > but simply displaying the > top N most voluminous or populous classes would seem to be the > simplest....) > > Comments? > -- ramki > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
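JRockit's `cutoff` semantics (drop classes below a given fraction of total live size) can be approximated on a HotSpot histogram with a two-pass awk. A sketch with hypothetical sample data, using a 1.5% threshold (JRockit's `1500`, i.e. percentage × 1000):

```shell
# Hypothetical histogram rows (rank, #instances, #bytes, class name)
cat > /tmp/histo.txt <<'EOF'
   1:          1005          81412  [C
   2:           456          54720  java.lang.Class
   3:           995          23880  java.lang.String
   4:             5            120  [I
EOF

# Pass 1 sums the #bytes column; pass 2 prints rows at or above 1.5% of it
awk 'NR==FNR { total += $3; next }
     $3 >= total * 0.015' /tmp/histo.txt /tmp/histo.txt
```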
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111114/b1536712/attachment.html From ysr1729 at gmail.com Mon Nov 14 11:37:17 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Mon, 14 Nov 2011 11:37:17 -0800 Subject: Class histogram output and chopping off long thin tails In-Reply-To: <4EC13033.6010901@oracle.com> References: <4EC13033.6010901@oracle.com> Message-ID: On Mon, Nov 14, 2011 at 7:13 AM, Tony Printezis wrote: > Ramki, > > First, which version of the class histogram are you referring to? I assume > it's the one we generate from within the JVM which goes to the GC log? If > you were using jmap you could just pipe the output to head or similar. > Right -- the former. > > Is your concern mainly to keep the class histogram output reasonably > compact? FWIW, and I don't know how common this scenario is, I once tracked > down a leak by noticing that they were 2 instances of a particular class > instead of 1 (I was replacing once instance with a newly-allocated one, but > the original one ended up being queued up for finalization and held on to a > lot of space). If we only dumped the top N classes I would have missed this > piece of information. > Sure. I can imagine there are cases where the skinny tail is interesting and indeed vital. My guess (as i indicated in the email) was that perhaps the common use case was in the top part of the histogram, and the objective as you stated was compactness :-) > > Maybe adding a new -XX parameter :-) to set N would be a good compromise? > Sure. That's what i was suggesting, plus that the default be to favor compactness (because of my guesstimate on how the use-cases fell in practice, a guesstimate that could be wrong since it was based on subjective experience rather than a survey :-) thanks! 
-- ramki > > Tony > > > On 11/11/2011 5:31 PM, Srinivas Ramakrishna wrote: > > > I am posting this to hotspot-gc-use, but the idea is that it also post to > -dev (but given how > the lists are arranged, I am posting directly to the one and not the other > to avoid double copies > to those who are in the intersection of the two kists, while covering > those in the union of the two). > > I've noticed recently in my use of the the class histogram feature, that > in typical cases I am interested > in the top few types of objects and not in the long thin tail. I am not > sure how typical my use or > experience is, but it would appear to me (based on my limited experience > of late) that if we limited > the histogram output to the top "N" (for say N = 40 or so) classes by > default, it would likely satisfy > 80-90% of use cases. For the remaining 10% of use cases, one would provide > a complete dump, > or a dump with more entries than available by default. > > I wanted to run this suggestion by everyone and see whether this would > have some traction > wrt such a request. > > I am guessing that this may be especially useful when dealing with very > large applications that > may have many different types of objects in the heap and might present a > very long thin (and in > many cases uninteresting) tail. (There may be other ways of restricting > the output, for example > by cutting off output below a certain population or volume threshold, but > simply displaying the > top N most voluminous or populous classes would seem to be the > simplest....) > > Comments? > -- ramki > > > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111114/47cba37e/attachment-0001.html From ysr1729 at gmail.com Mon Nov 14 11:40:46 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Mon, 14 Nov 2011 11:40:46 -0800 Subject: Class histogram output and chopping off long thin tails In-Reply-To: <4EC1391C.5020505@oracle.com> References: <4EC1391C.5020505@oracle.com> Message-ID: Hi Stefan -- yes, +1 for that. With of course the option to have the entire listing if the user so chose. I love the +increaseonly option. From my limited experience, it would likely be a big hit (although lacking more experience in its behaviour, I am mildly concerned about short-term noise/volatility confusing the user). -- ramki On Mon, Nov 14, 2011 at 7:51 AM, Stefan Karlsson wrote: > ** > Hi Ramki, > > > On 11/11/2011 11:31 PM, Srinivas Ramakrishna wrote: > > > I am posting this to hotspot-gc-use, but the idea is that it also post to > -dev (but given how > the lists are arranged, I am posting directly to the one and not the other > to avoid double copies > to those who are in the intersection of the two kists, while covering > those in the union of the two). > > I've noticed recently in my use of the the class histogram feature, that > in typical cases I am interested > in the top few types of objects and not in the long thin tail. I am not > sure how typical my use or > experience is, but it would appear to me (based on my limited experience > of late) that if we limited > the histogram output to the top "N" (for say N = 40 or so) classes by > default, it would likely satisfy > 80-90% of use cases. For the remaining 10% of use cases, one would provide > a complete dump, > or a dump with more entries than available by default. > > I wanted to run this suggestion by everyone and see whether this would > have some traction > wrt such a request. 
> > > We have this feature in JRockit's class histogram infrastructure, so maybe > one could see this as a convergence "project"? > > For example: > $ jrcmd 14991 print_object_summary > 14991: > > --------- Detailed Heap Statistics: --------- > 33.3% 79k 1005 +79k [C > 22.3% 53k 456 +53k java/lang/Class > 14.2% 34k 10 +34k [B > 9.7% 23k 995 +23k java/lang/String > 5.4% 12k 304 +12k [Ljava/lang/Object; > 2.5% 5k 76 +5k java/lang/reflect/Method > 1.3% 3k 157 +3k [Ljava/lang/Class; > 1.3% 2k 49 +2k [Ljava/lang/String; > 1.1% 2k 4 +2k [Ljrockit/vm/FCECache$FCE; > 0.9% 2k 32 +2k java/lang/reflect/Field > 0.7% 1k 20 +1k [Ljava/util/HashMap$Entry; > 0.6% 1k 10 +1k java/lang/Thread > 0.6% 1k 62 +1k java/util/Hashtable$Entry > 0.5% 1k 5 +1k [I > 239kB total --- > > --------- End of Detailed Heap Statistics --- > > where the cut off is as explained with: > $ jrcmd 14991 help print_object_summary > ... > cutoff - classes that represent less than this > percentage of totallive objects (measured in > size) will not be displayed. > Currently the percentage should be multiplied > by 1000 so 1.5%% would be 1500 (int, 500) > cutoffpointsto - like cutoff but for points-to > information > (int, 500) > increaseonly - set if you only want to display the > classes > thatincreased since the last listing (bool, > false) > ... > > StefanK > > > I am guessing that this may be especially useful when dealing with very > large applications that > may have many different types of objects in the heap and might present a > very long thin (and in > many cases uninteresting) tail. (There may be other ways of restricting > the output, for example > by cutting off output below a certain population or volume threshold, but > simply displaying the > top N most voluminous or populous classes would seem to be the > simplest....) > > Comments? 
> -- ramki > > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111114/c5587b07/attachment.html From tony.printezis at oracle.com Mon Nov 14 14:10:17 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Mon, 14 Nov 2011 17:10:17 -0500 Subject: Class histogram output and chopping off long thin tails In-Reply-To: References: <4EC13033.6010901@oracle.com> Message-ID: <4EC191C9.6010907@oracle.com> I'll be happy if we provided a parameter to limit the histogram output. But, I would personally recommend that the default value for this is "unbounded" for the reasons I described in my previous e-mail... Tony On 11/14/2011 02:37 PM, Srinivas Ramakrishna wrote: > > > On Mon, Nov 14, 2011 at 7:13 AM, Tony Printezis > > wrote: > > Ramki, > > First, which version of the class histogram are you referring to? > I assume it's the one we generate from within the JVM which goes > to the GC log? If you were using jmap you could just pipe the > output to head or similar. > > > Right -- the former. > > > Is your concern mainly to keep the class histogram output > reasonably compact? FWIW, and I don't know how common this > scenario is, I once tracked down a leak by noticing that they were > 2 instances of a particular class instead of 1 (I was replacing > once instance with a newly-allocated one, but the original one > ended up being queued up for finalization and held on to a lot of > space). If we only dumped the top N classes I would have missed > this piece of information. > > > Sure. I can imagine there are cases where the skinny tail is > interesting and indeed vital. 
My guess (as i indicated in the email) > was that perhaps the > common use case was in the top part of the histogram, and the > objective as you stated was compactness :-) > > > Maybe adding a new -XX parameter :-) to set N would be a good > compromise? > > > Sure. That's what i was suggesting, plus that the default be to favor > compactness (because of my guesstimate on how the use-cases fell in > practice, > a guesstimate that could be wrong since it was based on subjective > experience rather than a survey :-) > > thanks! > -- ramki > > > Tony > > > On 11/11/2011 5:31 PM, Srinivas Ramakrishna wrote: >> >> I am posting this to hotspot-gc-use, but the idea is that it also >> post to -dev (but given how >> the lists are arranged, I am posting directly to the one and not >> the other to avoid double copies >> to those who are in the intersection of the two kists, while >> covering those in the union of the two). >> >> I've noticed recently in my use of the the class histogram >> feature, that in typical cases I am interested >> in the top few types of objects and not in the long thin tail. I >> am not sure how typical my use or >> experience is, but it would appear to me (based on my limited >> experience of late) that if we limited >> the histogram output to the top "N" (for say N = 40 or so) >> classes by default, it would likely satisfy >> 80-90% of use cases. For the remaining 10% of use cases, one >> would provide a complete dump, >> or a dump with more entries than available by default. >> >> I wanted to run this suggestion by everyone and see whether this >> would have some traction >> wrt such a request. >> >> I am guessing that this may be especially useful when dealing >> with very large applications that >> may have many different types of objects in the heap and might >> present a very long thin (and in >> many cases uninteresting) tail. 
(There may be other ways of >> restricting the output, for example >> by cutting off output below a certain population or volume >> threshold, but simply displaying the >> top N most voluminous or populous classes would seem to be the >> simplest....) >> >> Comments? >> -- ramki >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111114/99072f66/attachment.html From ysr1729 at gmail.com Mon Nov 14 14:22:13 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Mon, 14 Nov 2011 14:22:13 -0800 Subject: Class histogram output and chopping off long thin tails In-Reply-To: <4EC191C9.6010907@oracle.com> References: <4EC13033.6010901@oracle.com> <4EC191C9.6010907@oracle.com> Message-ID: OK, sounds good to me. thanks! -- ramki On Mon, Nov 14, 2011 at 2:10 PM, Tony Printezis wrote: > I'll be happy if we provided a parameter to limit the histogram output. > But, I would personally recommend that the default value for this is > "unbounded" for the reasons I described in my previous e-mail... > > Tony > > > On 11/14/2011 02:37 PM, Srinivas Ramakrishna wrote: > > > > On Mon, Nov 14, 2011 at 7:13 AM, Tony Printezis > wrote: > >> Ramki, >> >> First, which version of the class histogram are you referring to? I >> assume it's the one we generate from within the JVM which goes to the GC >> log? If you were using jmap you could just pipe the output to head or >> similar. >> > > Right -- the former. > > >> >> Is your concern mainly to keep the class histogram output reasonably >> compact? 
FWIW, and I don't know how common this scenario is, I once tracked >> down a leak by noticing that they were 2 instances of a particular class >> instead of 1 (I was replacing once instance with a newly-allocated one, but >> the original one ended up being queued up for finalization and held on to a >> lot of space). If we only dumped the top N classes I would have missed this >> piece of information. >> > > Sure. I can imagine there are cases where the skinny tail is interesting > and indeed vital. My guess (as i indicated in the email) was that perhaps > the > common use case was in the top part of the histogram, and the objective as > you stated was compactness :-) > > >> >> Maybe adding a new -XX parameter :-) to set N would be a good compromise? >> > > Sure. That's what i was suggesting, plus that the default be to favor > compactness (because of my guesstimate on how the use-cases fell in > practice, > a guesstimate that could be wrong since it was based on subjective > experience rather than a survey :-) > > thanks! > -- ramki > > >> >> Tony >> >> >> On 11/11/2011 5:31 PM, Srinivas Ramakrishna wrote: >> >> >> I am posting this to hotspot-gc-use, but the idea is that it also post to >> -dev (but given how >> the lists are arranged, I am posting directly to the one and not the >> other to avoid double copies >> to those who are in the intersection of the two kists, while covering >> those in the union of the two). >> >> I've noticed recently in my use of the the class histogram feature, that >> in typical cases I am interested >> in the top few types of objects and not in the long thin tail. I am not >> sure how typical my use or >> experience is, but it would appear to me (based on my limited experience >> of late) that if we limited >> the histogram output to the top "N" (for say N = 40 or so) classes by >> default, it would likely satisfy >> 80-90% of use cases. 
For the remaining 10% of use cases, one would >> provide a complete dump, >> or a dump with more entries than available by default. >> >> I wanted to run this suggestion by everyone and see whether this would >> have some traction >> wrt such a request. >> >> I am guessing that this may be especially useful when dealing with very >> large applications that >> may have many different types of objects in the heap and might present a >> very long thin (and in >> many cases uninteresting) tail. (There may be other ways of restricting >> the output, for example >> by cutting off output below a certain population or volume threshold, but >> simply displaying the >> top N most voluminous or populous classes would seem to be the >> simplest....) >> >> Comments? >> -- ramki >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111114/a6a6d5e5/attachment.html From rhelbing at icubic.de Wed Nov 16 07:02:35 2011 From: rhelbing at icubic.de (Ralf Helbing) Date: Wed, 16 Nov 2011 16:02:35 +0100 Subject: GC Parameters for low-latency Message-ID: <4EC3D08B.4060309@icubic.de> dear mailing list, we try to achieve low latencies despite using a huge heap (10G) and many logical cores (64). VM is 1.7u1. Ideally, we would let GC ergonomics decide what is best, giving only a low pause time goal (50ms). -Xss2m -Xmx10000M -XX:PermSize=256m -XX:+UseAdaptiveGCBoundary -XX:+UseAdaptiveSizePolicy -XX:+UseConcMarkSweepGC -XX:MaxGCPauseMillis=100 -XX:ParallelGCThreads=12 -XX:+BindGCTaskThreadsToCPUs -XX:+UseGCTaskAffinity -XX:+UseCompressedOops -XX:+DoEscapeAnalysis Whenever we use adaptive sizes, the VM will crash in GenCollect*, as soon as some serious allocations start. 
I already filed a bug for this (7112413). Assuming a small newsize helps maintaining a low pause time goal, I can set the newsize, too. Say I set it to 100MB, it will increase later anyway, again yielding frequent pause times in over 1s by the time the newsize is around 1G. What am I doing wrong here? From jon.masamitsu at oracle.com Wed Nov 16 07:22:07 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 16 Nov 2011 07:22:07 -0800 Subject: GC Parameters for low-latency In-Reply-To: <4EC3D08B.4060309@icubic.de> References: <4EC3D08B.4060309@icubic.de> Message-ID: <4EC3D51F.6020509@oracle.com> Do not use UseAdaptiveSizePolicy with CMS. The implementation for CMS is incomplete. Never use UseAdaptiveGCBoundary. There are known problems with that option. UseAdaptiveSizePolicy should only be used with UseParallelGC and UseParallelOldGC. On 11/16/2011 7:02 AM, Ralf Helbing wrote: > dear mailing list, > > we try to achieve low latencies despite using a huge heap (10G) and many > logical cores (64). > VM is 1.7u1. Ideally, we would let GC ergonomics decide what is best, > giving only a low pause time goal (50ms). > > -Xss2m > -Xmx10000M > -XX:PermSize=256m > -XX:+UseAdaptiveGCBoundary > -XX:+UseAdaptiveSizePolicy > -XX:+UseConcMarkSweepGC > -XX:MaxGCPauseMillis=100 > -XX:ParallelGCThreads=12 > > -XX:+BindGCTaskThreadsToCPUs > -XX:+UseGCTaskAffinity > > -XX:+UseCompressedOops > -XX:+DoEscapeAnalysis > > Whenever we use adaptive sizes, the VM will crash in GenCollect*, as > soon as some serious allocations start. I already filed a bug for this > (7112413). > > Assuming a small newsize helps maintaining a low pause time goal, I can > set the newsize, too. Say I set it to 100MB, it will increase later > anyway, again yielding frequent pause times in over 1s by the time the > newsize is around 1G. > > What am I doing wrong here? 
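Condensing Jon's advice into a command line: drop both adaptive flags and size the young generation explicitly. This is a sketch only; the `-Xmn` value and the application jar are illustrative assumptions, not tuned recommendations:

```shell
# Adaptive sizing removed per Jon's advice (-XX:+UseAdaptiveSizePolicy and
# -XX:+UseAdaptiveGCBoundary are not supported with CMS); the young gen is
# fixed with -Xmn instead. 512m is a placeholder to be found by benchmarking,
# and app.jar is a hypothetical application.
java -Xss2m -Xmx10000M -Xmn512m \
     -XX:PermSize=256m \
     -XX:+UseConcMarkSweepGC \
     -XX:ParallelGCThreads=12 \
     -XX:+UseCompressedOops \
     -jar app.jar
```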
> _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From michal at frajt.eu Wed Nov 16 15:45:49 2011 From: michal at frajt.eu (Michal Frajt) Date: Thu, 17 Nov 2011 00:45:49 +0100 Subject: Survivor space class historgram print (know your bad garbage) Message-ID: <01e201cca4b9$de88e400$9b9aac00$@frajt.eu> Hi, Is there a way to print a survivor space class histogram on every minor collection run? Tony once provided us a special jdk build containing this feature but it never got integrated as a print flag into the main hotspot version. It was very useful for understanding and identifying objects promoted to the old gen. Additionally, we were looking to get a class histogram print for the eden space, but there was no easy way to implement it. The eden space class histogram would help identify the garbage that drives minor collection runs. It could also be used to check the impact of scalar replacement. Would it be possible to reimplement both histogram prints? Thanks, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111117/c73ef862/attachment.html From ysr1729 at gmail.com Wed Nov 16 16:01:50 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 16 Nov 2011 16:01:50 -0800 Subject: Survivor space class historgram print (know your bad garbage) In-Reply-To: <01e201cca4b9$de88e400$9b9aac00$@frajt.eu> References: <01e201cca4b9$de88e400$9b9aac00$@frajt.eu> Message-ID: AFAIR, it was not integrated at that time because of the performance impact even when the feature was turned off. I believe it would be possible to refactor the code, at some cost, to get this to work without that performance impact, but that didn't get done. It might be time to revisit that code and do the requisite refactoring. Tony et al? 
-- ramki On Wed, Nov 16, 2011 at 3:45 PM, Michal Frajt wrote: > Hi,**** > > ** ** > > Is there a way to get printed a survivor space class histogram on every > minor collection run? Tony once provided us a special jdk build containing > this feature but it got never integrated as a print flag into the main > hotspot version. It was very useful for understanding and identifying > promoted objects to the old gen. Additionally we were looking to get a > class histogram print for the eden space but there was no easy way to > implement it. The eden space class histogram would help to identify garbage > invoking minor collection runs. It could be as well used to check the > impact of the scalar replacements.**** > > ** ** > > Would it be possible to reimplement both histogram prints?**** > > ** ** > > Thanks,**** > > Michal**** > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111116/77872637/attachment.html From tony.printezis at oracle.com Thu Nov 17 05:10:56 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Thu, 17 Nov 2011 08:10:56 -0500 Subject: Survivor space class historgram print (know your bad garbage) In-Reply-To: References: <01e201cca4b9$de88e400$9b9aac00$@frajt.eu> Message-ID: <4EC507E0.1080905@oracle.com> No current plans to integrate this. But this is something we could consider as part of our ongoing effort to support Mission Control. Tony On 11/16/2011 7:01 PM, Srinivas Ramakrishna wrote: > AFAR, it was not integrated at that time because of the performance > impact even when the feature was turned off. > I believe it would be possible to refactor the code, at some cost, to > get this to work without that performance > impact, but that didn't get done. 
It might be time to revisit that > code and do the requisite refactoring. Tony et al? > > -- ramki > > > On Wed, Nov 16, 2011 at 3:45 PM, Michal Frajt > wrote: > > Hi, > > Is there a way to get printed a survivor space class histogram on > every minor collection run? Tony once provided us a special jdk > build containing this feature but it got never integrated as a > print flag into the main hotspot version. It was very useful for > understanding and identifying promoted objects to the old gen. > Additionally we were looking to get a class histogram print for > the eden space but there was no easy way to implement it. The eden > space class histogram would help to identify garbage invoking > minor collection runs. It could be as well used to check the > impact of the scalar replacements. > > Would it be possible to reimplement both histogram prints? > > Thanks, > > Michal > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111117/d8942979/attachment.html From ysr1729 at gmail.com Thu Nov 17 08:53:32 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 17 Nov 2011 08:53:32 -0800 Subject: Survivor space class historgram print (know your bad garbage) In-Reply-To: <4EC507E0.1080905@oracle.com> References: <01e201cca4b9$de88e400$9b9aac00$@frajt.eu> <4EC507E0.1080905@oracle.com> Message-ID: Yes, that seems like a good place to expose such functionality. The question of refactoring to allow the stats gathering to happen at low cost when disabled of course still remains, I guess. thanks! 
-- ramki On Thu, Nov 17, 2011 at 5:10 AM, Tony Printezis wrote: > No current plans to integrate this. But this is something we could > consider as part of our ongoing effort to support Mission Control. > > Tony > > > On 11/16/2011 7:01 PM, Srinivas Ramakrishna wrote: > > AFAR, it was not integrated at that time because of the performance impact > even when the feature was turned off. > I believe it would be possible to refactor the code, at some cost, to get > this to work without that performance > impact, but that didn't get done. It might be time to revisit that code > and do the requisite refactoring. Tony et al? > > -- ramki > > > On Wed, Nov 16, 2011 at 3:45 PM, Michal Frajt wrote: > >> Hi, >> >> >> >> Is there a way to get printed a survivor space class histogram on every >> minor collection run? Tony once provided us a special jdk build containing >> this feature but it got never integrated as a print flag into the main >> hotspot version. It was very useful for understanding and identifying >> promoted objects to the old gen. Additionally we were looking to get a >> class histogram print for the eden space but there was no easy way to >> implement it. The eden space class histogram would help to identify garbage >> invoking minor collection runs. It could be as well used to check the >> impact of the scalar replacements. >> >> >> >> Would it be possible to reimplement both histogram prints? >> >> >> >> Thanks, >> >> Michal >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111117/b5b8b509/attachment.html

From knoguchi at yahoo-inc.com Tue Nov 22 13:06:49 2011
From: knoguchi at yahoo-inc.com (Koji Noguchi)
Date: Tue, 22 Nov 2011 13:06:49 -0800
Subject: Is CMS cycle can collect finalize objects
In-Reply-To: <4DB5A34A.5040108@oracle.com>
Message-ID:

This is from an old thread in 2011 April, but we're still seeing the same
problem with (nio) Socket instances not getting collected by CMS.

Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118

Thanks,
Koji

(From
http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.html)
On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" wrote:
> On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote:
> > Hi Ramki,
> >
> > Thanks for the detailed explanation. I was trying to run some tests for
> > your questions. Here are the answers to some of your questions.
> >
> >>> What are the symptoms?
> > java.net.SocksSocketImpl objects are not getting cleaned up after a CMS
> > cycle. I see the direct correlation to java.lang.ref.Finalizer objects.
> > Over time, this fills up the old generation and CMS goes into a loop
> > occupying one full core. But when we trigger Full GC, these objects are
> > garbage collected.
>
> OK, thanks.
>
> > You mentioned that a CMS cycle does clean up these objects provided we
> > enable class unloading. Are you suggesting -XX:+ClassUnloading or
> > -XX:+CMSClassUnloadingEnabled? I have tried the latter and didn't
> > succeed. Our perm gen is relatively constant; by enabling this, are we
> > introducing performance overhead? We have room for CPU cycles and perm
> > gen is relatively small, so this may be fine. Just that we want to see
> > these objects GC'ed in the CMS cycle.
> >
> > Do you have any suggestion w.r.t. which flags I should be using to
> > trigger this?
>
> For the issue you are seeing, the -XX:+CMSClassUnloadingEnabled flag will
> not make a difference in the accumulation of the socket objects
> because there is no "projection" as far as I can tell of these
> into the perm gen, especially since, as you say, there is no class
> loading going on (since your perm gen size remains constant after
> start-up).
>
> However, keeping class unloading enabled via this flag should
> hopefully not have much of an impact on your pause times given that
> the perm gen is small. The typical effect you will see if class
> unloading is enabled is that the CMS remark pause times are a bit
> longer (if you enable PrintGCDetails you will see messages
> such as "scrub string table" and "scrub symbol table", "code cache"
> etc.). By comparing the CMS-remark pause details and times with
> and without enabling class unloading you will get a good idea
> of its impact. In some cases, even though you pay a small price
> in terms of increased CMS-remark pause times, you will make up
> for that in terms of faster scavenges etc., so it might well
> be worthwhile.
>
> In the very near future, we may end up turning that on
> by default for CMS because the savings from leaving it off
> by default are much smaller now and it can often lead to
> other issues if class unloading is turned off.
>
> So the bottom line is: it will not affect the accumulation of
> your socket objects, but it's a good idea to keep class
> unloading by CMS enabled anyway.
>
> >>> What does jmap -finalizerinfo on your process show?
> >>> What does -XX:+PrintClassHistogram show as accumulating in the heap?
> >>> (Are they one specific type of Finalizer objects or all varieties?)
> >
> > jmap -histo shows the above class keeps accumulating. In fact,
> > finalizerinfo doesn't show any objects on this process.
>
> OK, that shows that the objects are somehow not discovered by
> CMS as being eligible for finalization.
Although one can imagine
> a one-cycle delay (because of floating garbage) with CMS finding
> these objects to be unreachable and hence eligible for finalization,
> continuing accumulation of these objects over a period of time
> (and presumably many CMS cycles) seems strange and almost
> definitely a CMS bug, especially as you find that a full STW
> GC does indeed reclaim them.
>
> >>> Did the problem start in 6u21? Or are those the only versions
> >>> you tested and found that there was an issue?
> > We have seen this problem in 6u21. We were on 6u12 earlier and didn't
> > run into this problem. But we can't say this is particular to a build,
> > since lots of things have changed.
>
> Can you boil down this behavior into a test case that you are able
> to share with us? If so, please file a bug with the test case
> and send me the CR id and I'll take a look.
>
> Oh, and before you do that, can you please check the latest public
> release (6u24 or 6u25?) to see if the problem still reproduces?
>
> thanks, and sorry I could not be of more help without a bug
> report or a test case.
>
> -- ramki
>
> > Thanks in anticipation,
> > -Bharath

From jon.masamitsu at oracle.com Wed Nov 23 05:31:29 2011
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Wed, 23 Nov 2011 05:31:29 -0800
Subject: Is CMS cycle can collect finalize objects
In-Reply-To:
References:
Message-ID: <4ECCF5B1.7040408@oracle.com>

Koji,

There is no engineer assigned to this CR and no progress has been
made on it as far as I can tell. I'd suggest you pursue this through
your Oracle support contacts.

Jon

On 11/22/2011 1:06 PM, Koji Noguchi wrote:
> This is from an old thread in 2011 April but we're still seeing the same
> problem with (nio) Socket instances not getting collected by CMS.
> > Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118
> >
> > Thanks,
> > Koji
> >
> > [quoted April 2011 thread trimmed; see Koji's post above]

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From rednaxelafx at gmail.com Wed Nov 23 05:43:00 2011
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Wed, 23 Nov 2011 21:43:00 +0800
Subject: Is CMS cycle can collect finalize objects
In-Reply-To: <4ECCF5B1.7040408@oracle.com>
References: <4ECCF5B1.7040408@oracle.com>
Message-ID:

Hi,

I submitted a patch recently to mitigate the specific CMS problem caused by
excessive SocksSocketImpl objects, by trying to avoid creating them in the
first place. [1] That doesn't solve the general case if there really is a
problem with CMS and finalization. Since we've hit the same problem here,
we might investigate further on CMS. I'll report back if we make progress
on it.

Regards,
Kris Mok
Software Engineer, Taobao (http://www.taobao.com)

[1]: http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html

On Wed, Nov 23, 2011 at 9:31 PM, Jon Masamitsu wrote:
> Koji,
>
> There is no engineer assigned to this CR and no progress has been
> made on it as far as I can tell. I'd suggest you pursue this through
> your Oracle support contacts.
>
> Jon
>
> On 11/22/2011 1:06 PM, Koji Noguchi wrote:
> > [Koji's message and the quoted April 2011 thread trimmed]

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111123/f74f8ced/attachment-0001.html

From knoguchi at yahoo-inc.com Wed Nov 23 14:14:56 2011
From: knoguchi at yahoo-inc.com (Koji Noguchi)
Date: Wed, 23 Nov 2011 14:14:56 -0800
Subject: Is CMS cycle can collect finalize objects
In-Reply-To:
Message-ID:

Thanks Kris and Jon.

On 11/23/11 5:43 AM, "Krystal Mok" wrote:
> [1]: http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html
>
Yes, I believe we're hitting the same issue and that nio change would solve
at least the problem we're facing.

On Wed, Nov 23, 2011 at 9:31 PM, Jon Masamitsu wrote:
> I'd suggest you pursue this through your Oracle support contacts.
>
Thanks. I'll try that.

Koji

[full copies of Kris's and Jon's messages and the quoted April 2011 thread trimmed]

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111123/a72c14a8/attachment.html

From ysr1729 at gmail.com Wed Nov 23 15:11:38 2011
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Wed, 23 Nov 2011 15:11:38 -0800
Subject: Is CMS cycle can collect finalize objects
In-Reply-To:
References:
Message-ID:

Hi Koji --

Thanks for the test case, that should definitely help with the
identification of the problem. I'll see if I can find some spare time to
pursue it one of these days (but can't promise), so please do open that
Oracle support ticket to get the requisite resource allocated for the
official investigation.

Thanks again for boiling it down to a simple test case, and I'll update if
I identify the root cause...

-- ramki

On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi wrote:
> This is from an old thread in 2011 April but we're still seeing the same
> problem with (nio) Socket instances not getting collected by CMS.
>
> [rest of the quoted thread trimmed]

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111123/150dd794/attachment-0001.html

From rednaxelafx at gmail.com Thu Nov 24 01:52:13 2011
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Thu, 24 Nov 2011 17:52:13 +0800
Subject: Is CMS cycle can collect finalize objects
In-Reply-To:
References:
Message-ID:

Hi Koji and Ramki,

I had a look at the repro test case in Bug 7113118. I don't think the test
case is showing the same problem as the original one caused by
SocksSocketImpl objects. The way this test case is behaving is exactly what
the VM arguments told it to do.

I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update
29. My .hotspotrc is at [1]. SurvivorRatio doesn't need to be set
explicitly, because when CMS is in use and MaxTenuringThreshold is 0 (or
AlwaysTenure is true), the SurvivorRatio will automatically be set to 1024
by ergonomics.
UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's
omitted from my configuration, too. By using -XX:+PrintReferenceGC, the GC
log will show when and how many finalizable objects are discovered. The VM
arguments given force all surviving objects from minor collections to be
promoted into the old generation, and none of the minor collections had a
chance to discover any ready-to-be-collected FinalReferences, so minor GC
logs aren't of interest in this case. All of the minor GC log lines show
"[FinalReference, 0 refs, xxx secs]".

A part of the GC log can be found at [2]. This log shows two CMS collection
cycles, with dozens of minor collections in between.

* Before the first of these two CMS collections, the Java heap used is
971914K, and then the CMS occupancy threshold is crossed so a CMS
collection cycle starts;
* During the re-mark phase of the first CMS collection, 46400
FinalReferences were discovered;
* After the first CMS collection, the Java heap used is still high, at
913771K, because the finalizable objects need another old generation
collection to be collected (either CMS or full GC is fine);
* During the re-mark phase of the second CMS collection, 3000
FinalReferences were discovered; these are from objects promoted by the
minor collections in between;
* After the second CMS collection, the Java heap used goes down to 61747K,
as the finalizable objects discovered during the first CMS collection are
indeed finalized and then collected during the second CMS collection.

This behavior looks normal to me -- it's what the VM arguments were telling
the VM to do. The reason that the Java heap usage was swinging up and down
is that the actual live data set was very low, but
CMSInitiatingOccupancyFraction was set too high, so concurrent collections
are started too late. If the initiating threshold were set to a smaller
value, say 20, then the test case would behave quite reasonably.
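The tuning point above can be sketched as a command line. This is only an illustration, not the configuration from the gist in [1]: the heap size and the FinalizerTest class name are placeholders, while the -XX options themselves are standard HotSpot flags named in this thread.

```shell
# Sketch of a CMS configuration that starts concurrent cycles earlier, as
# suggested above. Heap size and main class are hypothetical placeholders.
JVM_FLAGS="-XX:+UseConcMarkSweepGC \
-XX:CMSInitiatingOccupancyFraction=20 \
-XX:+UseCMSInitiatingOccupancyOnly \
-XX:+PrintGCDetails -XX:+PrintReferenceGC"

# Print the resulting command rather than running it, since the test class
# and JDK installation depend on the local setup:
echo java -Xmx1g $JVM_FLAGS FinalizerTest
```

Without -XX:+UseCMSInitiatingOccupancyOnly, HotSpot may still use its own heuristics to decide when to start a cycle, so the two flags are usually paired when a fixed threshold is wanted.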
We'd need another test case to study, because this one doesn't really repro the problem. After we applied the patch to SocketAdaptor [3], we don't see this kind of CMS/finalization problem in production anymore. Should we hit one of these again, I'll try to get more info from our production site and see if I can trace down the real problem. - Kris [1]: https://gist.github.com/1390876#file_.hotspotrc [2]: https://gist.github.com/1390876#file_gc.partial.log [3]: http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna wrote: > Hi Koji -- > > Thanks for the test case, that should definitely help with the > dentification of the problem. I'll see if > i can find some spare time to pursue it one of these days (but can't > promise), so please > do open that Oracle support ticket to get the requisite resource allocated > for the official > investigation. > > Thanks again for boiling it down to a simple test case, and i'll update if > i identify the > root cause... > > -- ramki > > > On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi wrote: > >> This is from an old thread in 2011 April but we're still seeing the same >> problem with (nio) Socket instances not getting collecting by CMS. >> >> Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 >> >> Thanks, >> Koji >> >> >> (From >> >> http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm >> l) >> On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" > >> >> wrote: >> > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: >> > > Hi Ramki, >> > > >> > > Thanks for the detailed explanation. I was trying to >> > > run some tests for your questions. Here are the answers to some of >> your >> > > questions. >> > > >> > >>> What are the symptoms? >> > > java.net.SocksSocketImpl objects are not getting cleaned up after a >> CMS >> > cycle. I see the direct >> > > correlation to java.lang.ref.Finalizer objects. 
Overtime, this fills >> up >> > > the old generation and CMS going in loop occupying complete one core. >> > > But when we trigger Full GC, these objects are garbage collected. >> > >> > OK, thanks. >> > >> > > >> > > You >> > > mentioned that CMS cycle does cleanup these objects provided we >> enable >> > > class unloading. Are you suggesting -XX:+ClassUnloading or >> > > -XX:+CMSClassUnloadingEnabled? I have tried with later and >> > > >> > > didn't >> > > succeed. Our pern gen is relatively constant, by enabling this, >> are we >> > > introducing performance overhead? We have room for CPU cycles and >> perm >> > > gen is relatively small, so this may be fine. Just that we want to see >> > > these objects should GC'ed in CMS cycle. >> > > >> > > >> > > Do you have any suggestion w.r.t. to which flags should i be using to >> > trigger this? >> > >> > For the issue you are seeing the -XX:+CMSClassUnloadingFlag will >> > not make a difference in the accumulation of the socket objects >> > because there is no "projection" as far as i can tell of these >> > into the perm gen, esepcially since as you say there is no class >> > loading going on (since your perm gen size remains constant after >> > start-up). >> > >> >> > However, keeping class unloading enabled via this flag should >> > hopefully not have much of an impact on your pause times given that >> > the perm gen is small. The typical effect you will see if class >> > unloading is enabled is that the CMS remark pause times are a bit >> > longer (if you enable PrintGCDetails you will see messages >> > such as "scrub string table" and "scrub symbol table", "code cache" >> > etc. BY comparing the CMS-remark pause details and times with >> > and without enabling class unloading you will get a good idea >> > of its impact. 
In some cases, eben though you pay a small price >> > in terms of increased CMS-remark pause times, you will make up >> > for that in terms of faster scavenges etc., so it might well >> > be worthwhile. >> > >> > In the very near future, we may end up turning that on >> > by default for CMS because the savings from leaving it off >> > by default are much smaller now and it can often lead to >> > other issues if class unloading is turned off. >> > >> > So bottom line is: it will not affect the accumulation of >> > your socket objects, but it's a good idea to keep class >> > unloading by CMS enabled anyway. >> > >> > > >> > > >> > >>> What does jmap -finalizerinfo on your process show? >> > >>> What does -XX:+PrintClassHistogram show as accumulating in the >> > heap? >> > >>> (Are they one specific type of Finalizer objects or all >> > varieties?) >> > > >> > > Jmap -histo shows the above class is keep accumulating. Infact, >> > > finalizerinfo doesn't show any objects on this process. >> > >> > OK, that shows that the objects are somehow not discovered by >> > CMS as being eligible for finalization. Although one can imagine >> > a one cycle delay (because of floating garbage) with CMS finding >> > these objects to be unreachable and hence eligible for finalization, >> > continuing accumulation of these objects over a period of time >> > (and presumably many CMS cycles) seems strange and almost >> > definitely a CMS bug especially as you find that a full STW >> > gc does indeed reclaim them. >> > >> > > >> > > >> > > >> > >>> Did the problem start in 6u21? Or are those the only versions >> > >>> you tested and found that there was an issue? >> > > We >> > > have seen this problem in 6u21. We were on 6u12 earlier and didn't >> run >> > > into this problem. But can't say this is a build particular, since >> lots >> > > of things have changed. >> > >> > Can you boil down this behavior into a test case that you are able >> > to share with us? 
>> > If so, please file a bug with the test case >> > and send me the CR id and I'll take a look. >> > >> > Oh, and before you do that, can you please check the latest public >> > release (6u24 or 6u25?) to see if the problem still reproduces? >> > >> > thanks, and sorry I could not be of more help without a bug >> > report or a test case. >> > >> > -- ramki >> > >> > > >> > > Thanks in anticipation, >> > > -Bharath >> > > >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111124/5211e23d/attachment.html From ysr1729 at gmail.com Thu Nov 24 13:45:12 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 24 Nov 2011 13:45:12 -0800 Subject: Is CMS cycle can collect finalize objects In-Reply-To: References: Message-ID: Hi Kris, thanks for running the test case and figuring that out, and saving us further investigation of the submitted test case from Koji. Hopefully you or Koji will be able to find a simple test case that illustrates the real issue. thanks! -- ramki On Thu, Nov 24, 2011 at 1:52 AM, Krystal Mok wrote: > Hi Koji and Ramki, > > I had a look at the repro test case in Bug 7113118. I don't think the test > case is showing the same problem as the original one caused by > SocksSocketImpl objects. The way this test case is behaving is exactly what > the VM arguments told it to do. > > I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update > 29. > > My .hotspotrc is at [1]. 
> > SurvivorRatio doesn't need to be set explicitly, because when CMS is in > use and MaxTenuringThreshold is 0 (or AlwaysTenure is true), the > SurvivorRatio will automatically be set to 1024 by ergonomics. > UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's > omitted from my configuration, too. > > By using -XX:+PrintReferenceGC, the gc log will show when and how many > finalizable object are discovered. > > The VM arguments given force all surviving object from minor collections > to be promoted into the old generation, and none of the minor collections > had a chance to discovery any ready-to-be-collected FinalReferences, so > minor GC logs aren't of interest in this case. All of the minor GC log > lines show "[FinalReference, 0 refs, xxx secs]". > > A part of the GC log can be found at [2]. This log shows two CMS > collections cycles, in between dozens of minor collections. > > * Before the first of these two CMS collections, the Java heap used > is 971914K, and then the CMS occupancy threshold is crossed so a CMS > collection cycle starts; > * During the re-mark phase of the first CMS collection, 46400 > FinalReferences were discovered; > * After the first CMS collection, the Java heap used is still high, > at 913771K, because the finalizable objects need another old generation > collection to be collected (either CMS or full GC is fine); > * During the re-mark phase of the second CMS collection, 3000 > FinalReferences were discovered, these are from promoted objects from the > minor collections in between; > * After the second CMS collection, the Java heap used goes down to 61747K, > as the finalizable objects discovered from the first CMS collection are > indeed finalized and then collected during the second CMS collection. > > This behavior looks normal to me -- it's what the VM arguments were > telling the VM to do. 
> The reason that the Java heap used size was swing up and down is because > the actual live data set was very low, but > the CMSInitiatingOccupancyFraction was set too high so concurrent > collections are started too late. If the initiating threshold were set to a > smaller value, say 20, then the test case would behave quite reasonably. > > We'd need another test case to study, because this one doesn't really > repro the problem. > > After we applied the patch to SocketAdaptor [3], we don't see this kind of > CMS/finalization problem in production anymore. Should we hit one of these > again, I'll try to get more info from our production site and see if I can > trace down the real problem. > > - Kris > > [1]: https://gist.github.com/1390876#file_.hotspotrc > [2]: https://gist.github.com/1390876#file_gc.partial.log > [3]: > http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html > > > On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna wrote: > >> Hi Koji -- >> >> Thanks for the test case, that should definitely help with the >> dentification of the problem. I'll see if >> i can find some spare time to pursue it one of these days (but can't >> promise), so please >> do open that Oracle support ticket to get the requisite resource >> allocated for the official >> investigation. >> >> Thanks again for boiling it down to a simple test case, and i'll update >> if i identify the >> root cause... >> >> -- ramki >> >> >> On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi wrote: >> >>> This is from an old thread in 2011 April but we're still seeing the same >>> problem with (nio) Socket instances not getting collecting by CMS. >>> >>> Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 >>> >>> Thanks, >>> Koji >>> >>> >>> (From >>> >>> http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm >>> l) >>> On 4/25/11 8:37 AM, "Y. 
Srinivas Ramakrishna" < >>> y.s.ramakrishna at oracle.com>> >>> wrote: >>> > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: >>> > > Hi Ramki, >>> > > >>> > > Thanks for the detailed explanation. I was trying to >>> > > run some tests for your questions. Here are the answers to some of >>> your >>> > > questions. >>> > > >>> > >>> What are the symptoms? >>> > > java.net.SocksSocketImpl objects are not getting cleaned up after a >>> CMS >>> > cycle. I see the direct >>> > > correlation to java.lang.ref.Finalizer objects. Overtime, this fills >>> up >>> > > the old generation and CMS going in loop occupying complete one core. >>> > > But when we trigger Full GC, these objects are garbage collected. >>> > >>> > OK, thanks. >>> > >>> > > >>> > > You >>> > > mentioned that CMS cycle does cleanup these objects provided we >>> enable >>> > > class unloading. Are you suggesting -XX:+ClassUnloading or >>> > > -XX:+CMSClassUnloadingEnabled? I have tried with later and >>> > > >>> > > didn't >>> > > succeed. Our pern gen is relatively constant, by enabling this, >>> are we >>> > > introducing performance overhead? We have room for CPU cycles and >>> perm >>> > > gen is relatively small, so this may be fine. Just that we want to >>> see >>> > > these objects should GC'ed in CMS cycle. >>> > > >>> > > >>> > > Do you have any suggestion w.r.t. to which flags should i be using to >>> > trigger this? >>> > >>> > For the issue you are seeing the -XX:+CMSClassUnloadingFlag will >>> > not make a difference in the accumulation of the socket objects >>> > because there is no "projection" as far as i can tell of these >>> > into the perm gen, esepcially since as you say there is no class >>> > loading going on (since your perm gen size remains constant after >>> > start-up). >>> > >>> >>> > However, keeping class unloading enabled via this flag should >>> > hopefully not have much of an impact on your pause times given that >>> > the perm gen is small. 
The typical effect you will see if class >>> > unloading is enabled is that the CMS remark pause times are a bit >>> > longer (if you enable PrintGCDetails you will see messages >>> > such as "scrub string table" and "scrub symbol table", "code cache" >>> > etc. BY comparing the CMS-remark pause details and times with >>> > and without enabling class unloading you will get a good idea >>> > of its impact. In some cases, eben though you pay a small price >>> > in terms of increased CMS-remark pause times, you will make up >>> > for that in terms of faster scavenges etc., so it might well >>> > be worthwhile. >>> > >>> > In the very near future, we may end up turning that on >>> > by default for CMS because the savings from leaving it off >>> > by default are much smaller now and it can often lead to >>> > other issues if class unloading is turned off. >>> > >>> > So bottom line is: it will not affect the accumulation of >>> > your socket objects, but it's a good idea to keep class >>> > unloading by CMS enabled anyway. >>> > >>> > > >>> > > >>> > >>> What does jmap -finalizerinfo on your process show? >>> > >>> What does -XX:+PrintClassHistogram show as accumulating in the >>> > heap? >>> > >>> (Are they one specific type of Finalizer objects or all >>> > varieties?) >>> > > >>> > > Jmap -histo shows the above class is keep accumulating. Infact, >>> > > finalizerinfo doesn't show any objects on this process. >>> > >>> > OK, that shows that the objects are somehow not discovered by >>> > CMS as being eligible for finalization. Although one can imagine >>> > a one cycle delay (because of floating garbage) with CMS finding >>> > these objects to be unreachable and hence eligible for finalization, >>> > continuing accumulation of these objects over a period of time >>> > (and presumably many CMS cycles) seems strange and almost >>> > definitely a CMS bug especially as you find that a full STW >>> > gc does indeed reclaim them. 
>>> > >>> > > >>> > > >>> > > >>> > >>> Did the problem start in 6u21? Or are those the only versions >>> > >>> you tested and found that there was an issue? >>> > > We >>> > > have seen this problem in 6u21. We were on 6u12 earlier and didn't >>> run >>> > > into this problem. But can't say this is a build particular, since >>> lots >>> > > of things have changed. >>> > >>> > Can you boil down this behavior into a test case that you are able >>> > to share with us? >>> > If so, please file a bug with the test case >>> > and send me the CR id and I'll take a look. >>> > >>> > Oh, and before you do that, can you please check the latest public >>> > release (6u24 or 6u25?) to see if the problem still reproduces? >>> > >>> > thanks, and sorry I could not be of more help without a bug >>> > report or a test case. >>> > >>> > -- ramki >>> > >>> > > >>> > > Thanks in anticipation, >>> > > -Bharath >>> > > >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111124/d68f9df8/attachment.html From fancyerii at gmail.com Mon Nov 28 01:25:04 2011 From: fancyerii at gmail.com (Li Li) Date: Mon, 28 Nov 2011 17:25:04 +0800 Subject: what about Azul's Zing JVM? Message-ID: hi everybody, I read an article today about Azul's Zing JVM. It is said that this jvm is pauseless. In my application, our machine is about 48GB and about 25GB memory is given to jvm(by -Xmx). But it will occasionally pause 1-2 seconds. So when I saw this, I want to know whether it's so good as they say. 
And I googled and found a related question in stackoverflow: http://stackoverflow.com/questions/4491260/explanation-of-azuls-pauseless-garbage-collector after reading, I am still confusing. Anyone would give more detail explanations about it? thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111128/328ae759/attachment.html From fweimer at bfk.de Mon Nov 28 01:33:11 2011 From: fweimer at bfk.de (Florian Weimer) Date: Mon, 28 Nov 2011 09:33:11 +0000 Subject: what about Azul's Zing JVM? In-Reply-To: (Li Li's message of "Mon, 28 Nov 2011 17:25:04 +0800") References: Message-ID: <82mxbgvbfs.fsf@mid.bfk.de> * Li Li: > after reading, I am still confusing. Anyone would give more detail > explanations about it? thanks I think you should ask on one of Azul's mailing lists. As far as I understand it, the MRI VM has a similar garbage collector, and source code has been published, so you could have a look at it and ask on the MRI mailing list (but I don't know if it is still active). -- Florian Weimer BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstra?e 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 From vitalyd at gmail.com Mon Nov 28 06:10:47 2011 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Mon, 28 Nov 2011 09:10:47 -0500 Subject: what about Azul's Zing JVM? In-Reply-To: References: Message-ID: As a gross oversimplification their GC is concurrent to mutator (java) threads but is mostly pauseless (they still pause at times but only very briefly) because they use read barriers. This means that if a mutator thread reads memory that's been relocated, they trap this condition at read time, fix up the pointer (mutator does this itself), and continue on. Last I heard this approach required azul's os support for bulk in/mapping of pagetable entries, and required a Linux patch for x86 to do the same (but it wasn't accepted into mainline kernel). 
What's interesting is whether hotspot has any plans to do something similar? On Nov 28, 2011 4:27 AM, "Li Li" wrote: > hi everybody, > I read an article today about Azul's Zing JVM. It is said that this > jvm is pauseless. > In my application, our machine is about 48GB and about 25GB memory is > given to jvm(by -Xmx). But it will occasionally pause 1-2 seconds. > So when I saw this, I want to know whether it's so good as they say. > And I googled and found a related question in stackoverflow: > http://stackoverflow.com/questions/4491260/explanation-of-azuls-pauseless-garbage-collector > after reading, I am still confusing. Anyone would give more detail > explanations about it? thanks > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111128/e968e276/attachment.html From gaberger at cisco.com Mon Nov 28 06:54:35 2011 From: gaberger at cisco.com (Gary Berger) Date: Mon, 28 Nov 2011 09:54:35 -0500 Subject: what about Azul's Zing JVM? In-Reply-To: Message-ID: Yes, Zing is a concurrent mark-and-compact GC, compaction being the somewhat harder. Originally the C4 collector ran on custom Vega processors and than leveraged hypervisor based memory mapping features (EPT) to flip pages, now they have figured out how to do it on bare Linux with a kernel module.. Would be great to get these changes in the upstream kernel. Gil gave a great talk about Zing 5 at QCON http://bit.ly/vN5xQ0 .:|:.:|:. Gary Berger | Architect, Office of the CTO, DSSG | Cisco Systems| One Penn Plaza | New York, NY 10119 | Phone: 917.288.8691 From: Vitaly Davidovich Date: Mon, 28 Nov 2011 09:10:47 -0500 To: Li Li Cc: hotspot-gc-use Subject: Re: what about Azul's Zing JVM? 
As a gross oversimplification their GC is concurrent to mutator (java) threads but is mostly pauseless (they still pause at times but only very briefly) because they use read barriers. This means that if a mutator thread reads memory that's been relocated, they trap this condition at read time, fix up the pointer (mutator does this itself), and continue on. Last I heard this approach required azul's os support for bulk in/mapping of pagetable entries, and required a Linux patch for x86 to do the same (but it wasn't accepted into mainline kernel). What's interesting is whether hotspot has any plans to do something similar? On Nov 28, 2011 4:27 AM, "Li Li" wrote: > hi everybody, > I read an article today about Azul's Zing JVM. It is said that this jvm is > pauseless. > In my application, our machine is about 48GB and about 25GB memory is > given to jvm(by -Xmx). But it will occasionally pause 1-2 seconds. > So when I saw this, I want to know whether it's so good as they say. And I > googled and found a related question in stackoverflow: > http://stackoverflow.com/questions/4491260/explanation-of-azuls-pauseless-garb > age-collector > after reading, I am still confusing. Anyone would give more detail > explanations about it? thanks > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111128/df17bcc1/attachment.html From kirk at kodewerk.com Wed Nov 23 23:28:11 2011 From: kirk at kodewerk.com (Charles K Pepperdine) Date: Thu, 24 Nov 2011 08:28:11 +0100 Subject: Is CMS cycle can collect finalize objects In-Reply-To: <4ECCF5B1.7040408@oracle.com> References: <4ECCF5B1.7040408@oracle.com> Message-ID: <55A131FE-6724-4C1A-B535-7CEA69751B26@kodewerk.com> Hi Jon, If I can solve the problem locally, what is the chance of getting it into a build? Regards, Kirk On Nov 23, 2011, at 2:31 PM, Jon Masamitsu wrote: > Koji, > > There is no engineer assigned to this CR and no progress has been > made on it as far as I can tell. I'd suggest you pursue this through > your Oracle support contacts. > > Jon > > > > On 11/22/2011 1:06 PM, Koji Noguchi wrote: >> This is from an old thread in 2011 April but we're still seeing the same >> problem with (nio) Socket instances not getting collecting by CMS. >> >> Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 >> >> Thanks, >> Koji >> >> >> (From >> http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm >> l) >> On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna"> >> wrote: >>> On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: >>>> Hi Ramki, >>>> >>>> Thanks for the detailed explanation. I was trying to >>>> run some tests for your questions. Here are the answers to some of your >>>> questions. >>>> >>>>>> What are the symptoms? >>>> java.net.SocksSocketImpl objects are not getting cleaned up after a CMS >>> cycle. I see the direct >>>> correlation to java.lang.ref.Finalizer objects. Overtime, this fills up >>>> the old generation and CMS going in loop occupying complete one core. >>>> But when we trigger Full GC, these objects are garbage collected. >>> OK, thanks. >>> >>>> You >>>> mentioned that CMS cycle does cleanup these objects provided we enable >>>> class unloading. 
Are you suggesting -XX:+ClassUnloading or >>>> -XX:+CMSClassUnloadingEnabled? I have tried with later and >>>> >>>> didn't >>>> succeed. Our pern gen is relatively constant, by enabling this, are we >>>> introducing performance overhead? We have room for CPU cycles and perm >>>> gen is relatively small, so this may be fine. Just that we want to see >>>> these objects should GC'ed in CMS cycle. >>>> >>>> >>>> Do you have any suggestion w.r.t. to which flags should i be using to >>> trigger this? >>> >>> For the issue you are seeing the -XX:+CMSClassUnloadingFlag will >>> not make a difference in the accumulation of the socket objects >>> because there is no "projection" as far as i can tell of these >>> into the perm gen, esepcially since as you say there is no class >>> loading going on (since your perm gen size remains constant after >>> start-up). >>> >>> However, keeping class unloading enabled via this flag should >>> hopefully not have much of an impact on your pause times given that >>> the perm gen is small. The typical effect you will see if class >>> unloading is enabled is that the CMS remark pause times are a bit >>> longer (if you enable PrintGCDetails you will see messages >>> such as "scrub string table" and "scrub symbol table", "code cache" >>> etc. BY comparing the CMS-remark pause details and times with >>> and without enabling class unloading you will get a good idea >>> of its impact. In some cases, eben though you pay a small price >>> in terms of increased CMS-remark pause times, you will make up >>> for that in terms of faster scavenges etc., so it might well >>> be worthwhile. >>> >>> In the very near future, we may end up turning that on >>> by default for CMS because the savings from leaving it off >>> by default are much smaller now and it can often lead to >>> other issues if class unloading is turned off. 
>>> >>> So bottom line is: it will not affect the accumulation of >>> your socket objects, but it's a good idea to keep class >>> unloading by CMS enabled anyway. >>> >>>> >>>>>> What does jmap -finalizerinfo on your process show? >>>>>> What does -XX:+PrintClassHistogram show as accumulating in the >>> heap? >>>>>> (Are they one specific type of Finalizer objects or all >>> varieties?) >>>> Jmap -histo shows the above class is keep accumulating. Infact, >>>> finalizerinfo doesn't show any objects on this process. >>> OK, that shows that the objects are somehow not discovered by >>> CMS as being eligible for finalization. Although one can imagine >>> a one cycle delay (because of floating garbage) with CMS finding >>> these objects to be unreachable and hence eligible for finalization, >>> continuing accumulation of these objects over a period of time >>> (and presumably many CMS cycles) seems strange and almost >>> definitely a CMS bug especially as you find that a full STW >>> gc does indeed reclaim them. >>> >>>> >>>> >>>>>> Did the problem start in 6u21? Or are those the only versions >>>>>> you tested and found that there was an issue? >>>> We >>>> have seen this problem in 6u21. We were on 6u12 earlier and didn't run >>>> into this problem. But can't say this is a build particular, since lots >>>> of things have changed. >>> Can you boil down this behavior into a test case that you are able >>> to share with us? >>> If so, please file a bug with the test case >>> and send me the CR id and I'll take a look. >>> >>> Oh, and before you do that, can you please check the latest public >>> release (6u24 or 6u25?) to see if the problem still reproduces? >>> >>> thanks, and sorry I could not be of more help without a bug >>> report or a test case. 
>>> >>> -- ramki >>> >>>> Thanks in anticipation, >>>> -Bharath >>>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From java at java4.info Mon Nov 28 11:18:17 2011
From: java at java4.info (Florian Binder)
Date: Mon, 28 Nov 2011 20:18:17 +0100
Subject: G1 discovers same garbage again?
Message-ID: <4ED3DE79.5050801@java4.info>

Hi everybody,

I have a java application with 20gb (large-table) memory and using the g1 garbage collector. The application calculates some ratios the whole time with 10 threads (high cpu load). This is done without producing any garbage. About two times a minute a request is sent which produces a little bit of garbage. Since we are working with realtime data we are interested in very short stop-the-world pauses. Therefore we used the CMS gc in the past, until we ran into problems with fragmentation. Therefore I am now trying the g1.

This seemed to work very well at first. The stw-pauses were, except for the cleanup pause, very short. This brings me to my first question: Is this normal, and are there any parameters to influence the cleanup process? I thought this phase should be short because by that point the counting has just finished, the role of the bitmaps is switched and the next candidate garbage regions are determined -- all things which should be very fast. So what is taking the time?

The second reason for my email is the crazy behaviour after a few hours: After the startup of the server it uses about 13.5 gb old-gen memory and generates eden garbage very slowly. Since the newly allocated memory is mostly garbage, the (young) garbage collections are very fast and g1 decides to grow the eden space.
This works 4 times, until eden space has more than about 3.5 gb memory. After this the gc is doing many more collections, and during the collections it discovers new garbage (probably the old one again). Eden memory usage jumps between 0 and 3.5gb even though I am sure the java-application is not producing more garbage than before. I assume that during a collection it runs into the old garbage and collects it again. Is this possible? Or is there an overflow since eden space uses more than 3.5 gb?

Thanks and regards,
Flo

Some useful information:
$ java -version
java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)

Startup Parameters:
-Xms20g -Xmx20g
-verbose:gc \
-XX:+UnlockExperimentalVMOptions \
-XX:+UseG1GC \
-XX:+PrintGCDetails \
-XX:+PrintGCDateStamps \
-XX:+UseLargePages \
-XX:+PrintFlagsFinal \
-XX:-TraceClassUnloading \

$ cat /proc/meminfo | grep Huge
HugePages_Total: 11264
HugePages_Free: 1015
HugePages_Rsvd: 32
Hugepagesize: 2048 kB

A few screen-shots of the jconsole memory-view:
http://java4.info/g1/1h.png
http://java4.info/g1/all.png
http://java4.info/g1/eden_1h.png
http://java4.info/g1/eden_all.png
http://java4.info/g1/oldgen_all.png

The sysout and syserr logfile with the gc logging and PrintFlagsFinal output:
http://java4.info/g1/out_err.log.gz

From tony.printezis at oracle.com Tue Nov 29 09:29:22 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Tue, 29 Nov 2011 12:29:22 -0500
Subject: G1 discovers same garbage again?
In-Reply-To: <4ED3DE79.5050801@java4.info>
References: <4ED3DE79.5050801@java4.info>
Message-ID: <4ED51672.6030105@oracle.com>

Hi Florian,

See inline.

On 11/28/2011 2:18 PM, Florian Binder wrote:
> Hi everybody,
>
> I have a java application with 20gb (large-table) memory and using the
> g1 garbage collector.

Quick clarification: I saw that you use a 20G heap from the parameters you showed below. Do you know what's your live data size?
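As an aside, the /proc/meminfo figures quoted above can be sanity-checked with a little arithmetic, assuming the standard meaning of the HugePages fields (pages in use = HugePages_Total - HugePages_Free, each Hugepagesize kB); the class name is invented:

```java
// Sanity-checking the HugePages figures quoted in this thread:
// 11264 total pages, 1015 free, 2048 kB each.
public class HugePageCheck {
    static long usedKb(long totalPages, long freePages, long pageSizeKb) {
        return (totalPages - freePages) * pageSizeKb;
    }

    public static void main(String[] args) {
        long inUseKb = usedKb(11264, 1015, 2048);
        System.out.printf("huge pages in use: %d kB (~%.1f GiB)%n",
                inUseKb, inUseKb / (1024.0 * 1024.0));
        // Roughly 20 GiB of the 22 GiB huge-page pool is in use, which is
        // consistent with the -Xms20g heap being fully backed by large pages.
    }
}
```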
> The application calculates the whole time with 10 threads some ratios > (high cpu load). This is done without producing any garbage. About two > times a minute a request is sent which produces a little bit of garbage. > Since we are working with realtime data we are interested in very short > stop-the-world pauses. Therefore we have used the CMS gc in the past > until we have got problems with fragmentation now. Since you don't produce much garbage how come you have fragmentation? Do you keep the results for all the requests you serve? > Therefore I am trying the g1. > > This seemed to work very well at first. The stw-pauses were, except the > cleanup pause, Out of curiosity: how long are the cleanup pauses? > very short. This leads me to my first question: > Is this normal and are there any parameters to influence the > cleanup-process? I don't think there's much you can do in the app to influence the cleanup duration. During this pause we do some, ahem, cleanup of our data structures, and for large heaps I have also seen the cleanup pauses take longer than I thought they would. I know this is not going to help you in the short term, but we have plans to do the cleanup work concurrently (or at least mostly-concurrently) in the future. > I thought this phase should be short because there is > just finished the counting, the role of the bitmaps is switched and the > next possible garbage regions are determined. All things, which should be > very fast. So what is taking the time? Most likely, the remembered set scrubbing phase... > The second cause for my email is the crazy behaviour after a few hours: > After the startup of the server it uses about 13.5 gb old-gen memory and > generates eden-garbage very slowly. Since the newly allocated memory is > mostly garbage the (young) garbage collections are very fast and g1 > decides to grow the eden space. This works 4 times until eden space > has more than about 3.5 gb memory. 
> After this the gc is making much more > collections and while the collections it discovers new garbage (probably > the old one again). I'm not quite sure what you mean by "it discovers new garbage". For young GCs, G1 (and our other GCs) will reclaim any young objects that it discovers to be dead (more accurately: that it does not discover to be live). > Eden memory usage jumps between 0 and 3.5gb even > though I am sure the java-application is not making more than before. Well, that's not good. :-) Can you try to explicitly set the young gen size with -Xmn3g, say, to see what happens? Tony > I > assume that it runs during a collection in the old garbage and collects > it again. Is this possible? Or is there an overflow since eden space > uses more than 3.5 gb? > > Thanks and regards, > Flo > > Some useful information: > $ java -version > java version "1.6.0_29" > Java(TM) SE Runtime Environment (build 1.6.0_29-b11) > Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode) > > Startup Parameters: > -Xms20g -Xmx20g > -verbose:gc \ > -XX:+UnlockExperimentalVMOptions \ > -XX:+UseG1GC \ > -XX:+PrintGCDetails \ > -XX:+PrintGCDateStamps \ > -XX:+UseLargePages \ > -XX:+PrintFlagsFinal \ > -XX:-TraceClassUnloading \ > > $ cat /proc/meminfo | grep Huge > HugePages_Total: 11264 > HugePages_Free: 1015 > HugePages_Rsvd: 32 > Hugepagesize: 2048 kB > > A few screen-shots of the jconsole memory-view: > http://java4.info/g1/1h.png > http://java4.info/g1/all.png > http://java4.info/g1/eden_1h.png > http://java4.info/g1/eden_all.png > http://java4.info/g1/oldgen_all.png > > The sysout and syserr logfile with the gc logging and PrintFlagsFinal output: > http://java4.info/g1/out_err.log.gz > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From knoguchi at yahoo-inc.com Tue Nov 29 14:06:42 2011 From: knoguchi at yahoo-inc.com (Koji Noguchi) Date: Tue, 29 Nov 
2011 14:06:42 -0800 Subject: Is CMS cycle can collect finalize objects In-Reply-To: Message-ID: Thanks Krystal for your update. I don't know why I'm getting a different result than yours. > * After the second CMS collection, the Java heap used goes down to 61747K, > In my case, it stays above 800MBytes... Attached is the memory footprint with CMS (-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=95 -XX:+UseConcMarkSweepGC) and without (fullgc). It was interesting to see 1. FullGC case eventually stabilizing to having each FullGC release half of the heap (500M) due to finalizers requiring two GCs. 2. CMS case still stayed above 800M, but there were a few times when the memory footprint dropped. In any case, I'm pretty sure your SocketAdaptor [3] patch would work around the CMS issue I'm facing. So this is no longer urgent to me as long as that change gets into a future java version. Thanks again for all your inputs. Koji On 11/24/11 1:45 PM, "Srinivas Ramakrishna" wrote: Hi Kris, thanks for running the test case and figuring that out, and saving us further investigation of the submitted test case from Koji. Hopefully you or Koji will be able to find a simple test case that illustrates the real issue. thanks! -- ramki On Thu, Nov 24, 2011 at 1:52 AM, Krystal Mok wrote: Hi Koji and Ramki, I had a look at the repro test case in Bug 7113118. I don't think the test case is showing the same problem as the original one caused by SocksSocketImpl objects. The way this test case is behaving is exactly what the VM arguments told it to do. I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update 29. My .hotspotrc is at [1]. SurvivorRatio doesn't need to be set explicitly, because when CMS is in use and MaxTenuringThreshold is 0 (or AlwaysTenure is true), the SurvivorRatio will automatically be set to 1024 by ergonomics. UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's omitted from my configuration, too. 
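The SurvivorRatio=1024 ergonomics described above can be put into rough numbers. A minimal sketch, assuming the commonly documented generation-sizing rule that each survivor space gets approximately youngGen / (SurvivorRatio + 2); this is an approximation for illustration, not the exact HotSpot sizing code:

```java
// Sketch of DefNew/ParNew space sizing: each survivor space is roughly
// youngGen / (SurvivorRatio + 2). With the SurvivorRatio=1024 that ergonomics
// picks when MaxTenuringThreshold=0, the survivors all but vanish, so every
// object surviving a minor GC is promoted straight into the old generation.
// The formula is an approximation for illustration only.
public class SurvivorSizing {
    static long survivorBytes(long youngGenBytes, int survivorRatio) {
        return youngGenBytes / (survivorRatio + 2L);
    }

    public static void main(String[] args) {
        long young = 64L * 1024 * 1024; // a 64 MB young gen, chosen for the example
        System.out.println("SurvivorRatio=8    -> " + survivorBytes(young, 8));    // ~6.4 MB per survivor
        System.out.println("SurvivorRatio=1024 -> " + survivorBytes(young, 1024)); // ~64 KB per survivor
    }
}
```

With the ratio forced to 1024 each survivor space shrinks to almost nothing, which is why MaxTenuringThreshold=0 effectively turns every minor collection into pure promotion, exactly the behavior the test case's VM arguments request.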
By using -XX:+PrintReferenceGC, the gc log will show when and how many finalizable objects are discovered. The VM arguments given force all surviving objects from minor collections to be promoted into the old generation, and none of the minor collections had a chance to discover any ready-to-be-collected FinalReferences, so minor GC logs aren't of interest in this case. All of the minor GC log lines show "[FinalReference, 0 refs, xxx secs]". A part of the GC log can be found at [2]. This log shows two CMS collection cycles, with dozens of minor collections in between. * Before the first of these two CMS collections, the Java heap used is 971914K, and then the CMS occupancy threshold is crossed so a CMS collection cycle starts; * During the re-mark phase of the first CMS collection, 46400 FinalReferences were discovered; * After the first CMS collection, the Java heap used is still high, at 913771K, because the finalizable objects need another old generation collection to be collected (either CMS or full GC is fine); * During the re-mark phase of the second CMS collection, 3000 FinalReferences were discovered; these are from promoted objects from the minor collections in between; * After the second CMS collection, the Java heap used goes down to 61747K, as the finalizable objects discovered during the first CMS collection are indeed finalized and then collected during the second CMS collection. This behavior looks normal to me -- it's what the VM arguments were telling the VM to do. The reason the Java heap used size was swinging up and down is that the actual live data set was very low, but the CMSInitiatingOccupancyFraction was set too high, so concurrent collections are started too late. If the initiating threshold were set to a smaller value, say 20, then the test case would behave quite reasonably. We'd need another test case to study, because this one doesn't really repro the problem. 
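The initiating-occupancy arithmetic behind that last point can be sketched in a few lines. The 1 GB old-gen capacity below is an assumed figure for illustration; the log excerpt in the thread only gives used sizes, not the capacity:

```java
// Back-of-the-envelope for UseCMSInitiatingOccupancyOnly: a concurrent cycle
// starts once old-gen occupancy crosses capacity * fraction / 100.
// The ~1 GB capacity here is an assumption for illustration, not from the log.
public class CmsTrigger {
    static long initiatingOccupancyK(long capacityK, int fraction) {
        return capacityK * fraction / 100;
    }

    public static void main(String[] args) {
        long oldGenK = 1000000;  // assumed ~1 GB old gen
        long liveK = 61747;      // live set after a clean cycle, per the quoted log
        // At 95%, nearly the whole heap fills with dead finalizable objects
        // before a cycle starts; at 20%, cycles begin soon after the live set
        // is exceeded, so the heap stays close to its true live size.
        System.out.println("trigger at 95%: " + initiatingOccupancyK(oldGenK, 95) + "K");
        System.out.println("trigger at 20%: " + initiatingOccupancyK(oldGenK, 20) + "K");
        System.out.println("live data:      " + liveK + "K");
    }
}
```

This is why lowering the fraction to 20, as suggested above, keeps the used-heap curve near the live set instead of letting it swing between ~60M and ~950M.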
After we applied the patch to SocketAdaptor [3], we don't see this kind of CMS/finalization problem in production anymore. Should we hit one of these again, I'll try to get more info from our production site and see if I can trace down the real problem. - Kris [1]: https://gist.github.com/1390876#file_.hotspotrc [2]: https://gist.github.com/1390876#file_gc.partial.log [3]: http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna wrote: Hi Koji -- Thanks for the test case, that should definitely help with the identification of the problem. I'll see if i can find some spare time to pursue it one of these days (but can't promise), so please do open that Oracle support ticket to get the requisite resource allocated for the official investigation. Thanks again for boiling it down to a simple test case, and i'll update if i identify the root cause... -- ramki On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi wrote: This is from an old thread in 2011 April but we're still seeing the same problem with (nio) Socket instances not getting collected by CMS. Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 Thanks, Koji (From http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.html) On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" > wrote: > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: > > Hi Ramki, > > > > Thanks for the detailed explanation. I was trying to > > run some tests for your questions. Here are the answers to some of your > > questions. > > > >>> What are the symptoms? > > java.net.SocksSocketImpl objects are not getting cleaned up after a CMS > cycle. I see the direct > > correlation to java.lang.ref.Finalizer objects. Over time, this fills up > > the old generation and CMS goes into a loop occupying one complete core. > > But when we trigger Full GC, these objects are garbage collected. > > OK, thanks. 
> > > > > You > > mentioned that CMS cycle does clean up these objects provided we enable > > class unloading. Are you suggesting -XX:+ClassUnloading or > > -XX:+CMSClassUnloadingEnabled? I have tried with the latter and > > didn't > > succeed. Our perm gen is relatively constant; by enabling this, are we > > introducing performance overhead? We have room for CPU cycles and perm > > gen is relatively small, so this may be fine. Just that we want to see > > these objects GC'ed in the CMS cycle. > > > > > > Do you have any suggestion w.r.t. which flags should i be using to > trigger this? > > For the issue you are seeing the -XX:+CMSClassUnloadingEnabled flag will > not make a difference in the accumulation of the socket objects > because there is no "projection" as far as i can tell of these > into the perm gen, especially since as you say there is no class > loading going on (since your perm gen size remains constant after > start-up). > > However, keeping class unloading enabled via this flag should > hopefully not have much of an impact on your pause times given that > the perm gen is small. The typical effect you will see if class > unloading is enabled is that the CMS remark pause times are a bit > longer (if you enable PrintGCDetails you will see messages > such as "scrub string table" and "scrub symbol table", "code cache", > etc.). By comparing the CMS-remark pause details and times with > and without enabling class unloading you will get a good idea > of its impact. In some cases, even though you pay a small price > in terms of increased CMS-remark pause times, you will make up > for that in terms of faster scavenges etc., so it might well > be worthwhile. > > In the very near future, we may end up turning that on > by default for CMS because the savings from leaving it off > by default are much smaller now and it can often lead to > other issues if class unloading is turned off. 
> > So bottom line is: it will not affect the accumulation of > your socket objects, but it's a good idea to keep class > unloading by CMS enabled anyway. > > > > > > >>> What does jmap -finalizerinfo on your process show? > >>> What does -XX:+PrintClassHistogram show as accumulating in the > heap? > >>> (Are they one specific type of Finalizer objects or all > varieties?) > > > > Jmap -histo shows the above class keeps accumulating. In fact, > > finalizerinfo doesn't show any objects on this process. > > OK, that shows that the objects are somehow not discovered by > CMS as being eligible for finalization. Although one can imagine > a one cycle delay (because of floating garbage) with CMS finding > these objects to be unreachable and hence eligible for finalization, > continuing accumulation of these objects over a period of time > (and presumably many CMS cycles) seems strange and almost > definitely a CMS bug, especially as you find that a full STW > gc does indeed reclaim them. > > > > > > > > >>> Did the problem start in 6u21? Or are those the only versions > >>> you tested and found that there was an issue? > > We > > have seen this problem in 6u21. We were on 6u12 earlier and didn't run > > into this problem. But we can't say this is particular to a build, since lots > > of things have changed. > > Can you boil down this behavior into a test case that you are able > to share with us? > If so, please file a bug with the test case > and send me the CR id and I'll take a look. > > Oh, and before you do that, can you please check the latest public > release (6u24 or 6u25?) to see if the problem still reproduces? > > thanks, and sorry I could not be of more help without a bug > report or a test case. 
> > -- ramki > > > > > Thanks in anticipation, > > -Bharath > > _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111129/edb6e31c/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: cmsAndFullGCTesting.png Type: application/octet-stream Size: 46276 bytes Desc: cmsAndFullGCTesting.png Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111129/edb6e31c/cmsAndFullGCTesting-0001.png From ysr1729 at gmail.com Tue Nov 29 22:16:41 2011 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 29 Nov 2011 22:16:41 -0800 Subject: Is CMS cycle can collect finalize objects In-Reply-To: References: Message-ID: Who knows, maybe this is related to the other CMS CR that Stefan just sent out a review request for. If I understand correctly then, the behaviour should be good if you turn off parallel marking in CMS, viz. -XX:-CMSConcurrentMTEnabled (or whatever the flag is called now). Are you able to check that? Adding Stefan to the cc, just in case. -- ramki On Tue, Nov 29, 2011 at 2:06 PM, Koji Noguchi wrote: > Thanks Krystal for your update. > > I don't know why I'm getting a different result than yours. > > > > * After the second CMS collection, the Java heap used goes down > to 61747K, > > > In my case, it stays above 800MBytes... > > Attached is the memory footprint with > CMS(-XX:+UseCMSInitiatingOccupancyOnly > -XX:CMSInitiatingOccupancyFraction=95 -XX:+UseConcMarkSweepGC) and > without(fullgc). > > It was interesting to see > > 1. 
FullGC case eventually stabilizing to having each FullGC releasing > half of the heap (500M) due to finalizer requiring two GCs. > 2. CMS case still stayed above 800M but there were a few times when > memory footprint dropped. > > > In any cases, I?m pretty sure your SocketAdaptor [3] patch would > workaround the CMS issue I?m facing. So this is no longer urgent to me as > long as that change gets into a future java version. > > Thanks again for all your inputs. > > Koji > > > On 11/24/11 1:45 PM, "Srinivas Ramakrishna" wrote: > > Hi Kris, thanks for running the test case and figuring that out, and > saving us further investigation of > the submitted test case from Koji. > > Hopefully you or Koji will be able to find a simple test case that > illustrates the real issue. > > thanks! > -- ramki > > > On Thu, Nov 24, 2011 at 1:52 AM, Krystal Mok > wrote: > > Hi Koji and Ramki, > > I had a look at the repro test case in Bug 7113118. I don't think the test > case is showing the same problem as the original one caused by > SocksSocketImpl objects. The way this test case is behaving is exactly what > the VM arguments told it to do. > > I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update > 29. > > My .hotspotrc is at [1]. > > SurvivorRatio doesn't need to be set explicitly, because when CMS is in > use and MaxTenuringThreshold is 0 (or AlwaysTenure is true), the > SurvivorRatio will automatically be set to 1024 by ergonomics. > UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's > omitted from my configuration, too. > > By using -XX:+PrintReferenceGC, the gc log will show when and how many > finalizable object are discovered. > > The VM arguments given force all surviving object from minor collections > to be promoted into the old generation, and none of the minor collections > had a chance to discovery any ready-to-be-collected FinalReferences, so > minor GC logs aren't of interest in this case. 
All of the minor GC log > lines show "[FinalReference, 0 refs, xxx secs]". > > A part of the GC log can be found at [2]. This log shows two CMS > collections cycles, in between dozens of minor collections. > > * Before the first of these two CMS collections, the Java heap used > is 971914K, and then the CMS occupancy threshold is crossed so a CMS > collection cycle starts; > * During the re-mark phase of the first CMS collection, 46400 > FinalReferences were discovered; > * After the first CMS collection, the Java heap used is still high, > at 913771K, because the finalizable objects need another old generation > collection to be collected (either CMS or full GC is fine); > * During the re-mark phase of the second CMS collection, 3000 > FinalReferences were discovered, these are from promoted objects from the > minor collections in between; > * After the second CMS collection, the Java heap used goes down to 61747K, > as the finalizable objects discovered from the first CMS collection are > indeed finalized and then collected during the second CMS collection. > > This behavior looks normal to me -- it's what the VM arguments were > telling the VM to do. > The reason that the Java heap used size was swing up and down is because > the actual live data set was very low, but > the CMSInitiatingOccupancyFraction was set too high so concurrent > collections are started too late. If the initiating threshold were set to a > smaller value, say 20, then the test case would behave quite reasonably. > > We'd need another test case to study, because this one doesn't really > repro the problem. > > After we applied the patch to SocketAdaptor [3], we don't see this kind of > CMS/finalization problem in production anymore. Should we hit one of these > again, I'll try to get more info from our production site and see if I can > trace down the real problem. 
> > - Kris > > [1]: https://gist.github.com/1390876#file_.hotspotrc > [2]: https://gist.github.com/1390876#file_gc.partial.log > [3]: > http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html > > > On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna > wrote: > > Hi Koji -- > > Thanks for the test case, that should definitely help with the > dentification of the problem. I'll see if > i can find some spare time to pursue it one of these days (but can't > promise), so please > do open that Oracle support ticket to get the requisite resource allocated > for the official > investigation. > > Thanks again for boiling it down to a simple test case, and i'll update if > i identify the > root cause... > > -- ramki > > > On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi > wrote: > > This is from an old thread in 2011 April but we're still seeing the same > problem with (nio) Socket instances not getting collecting by CMS. > > Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 > > Thanks, > Koji > > > (From > > http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm > l) > On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" > > > wrote: > > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: > > > Hi Ramki, > > > > > > Thanks for the detailed explanation. I was trying to > > > run some tests for your questions. Here are the answers to some of your > > > questions. > > > > > >>> What are the symptoms? > > > java.net.SocksSocketImpl objects are not getting cleaned up after a CMS > > cycle. I see the direct > > > correlation to java.lang.ref.Finalizer objects. Overtime, this fills up > > > the old generation and CMS going in loop occupying complete one core. > > > But when we trigger Full GC, these objects are garbage collected. > > > > OK, thanks. > > > > > > > > You > > > mentioned that CMS cycle does cleanup these objects provided we > enable > > > class unloading. 
Are you suggesting -XX:+ClassUnloading or > > > -XX:+CMSClassUnloadingEnabled? I have tried with later and > > > > > > didn't > > > succeed. Our pern gen is relatively constant, by enabling this, are > we > > > introducing performance overhead? We have room for CPU cycles and > perm > > > gen is relatively small, so this may be fine. Just that we want to see > > > these objects should GC'ed in CMS cycle. > > > > > > > > > Do you have any suggestion w.r.t. to which flags should i be using to > > trigger this? > > > > For the issue you are seeing the -XX:+CMSClassUnloadingFlag will > > not make a difference in the accumulation of the socket objects > > because there is no "projection" as far as i can tell of these > > into the perm gen, esepcially since as you say there is no class > > loading going on (since your perm gen size remains constant after > > start-up). > > > > > However, keeping class unloading enabled via this flag should > > hopefully not have much of an impact on your pause times given that > > the perm gen is small. The typical effect you will see if class > > unloading is enabled is that the CMS remark pause times are a bit > > longer (if you enable PrintGCDetails you will see messages > > such as "scrub string table" and "scrub symbol table", "code cache" > > etc. BY comparing the CMS-remark pause details and times with > > and without enabling class unloading you will get a good idea > > of its impact. In some cases, eben though you pay a small price > > in terms of increased CMS-remark pause times, you will make up > > for that in terms of faster scavenges etc., so it might well > > be worthwhile. > > > > In the very near future, we may end up turning that on > > by default for CMS because the savings from leaving it off > > by default are much smaller now and it can often lead to > > other issues if class unloading is turned off. 
> > > > So bottom line is: it will not affect the accumulation of > > your socket objects, but it's a good idea to keep class > > unloading by CMS enabled anyway. > > > > > > > > > > >>> What does jmap -finalizerinfo on your process show? > > >>> What does -XX:+PrintClassHistogram show as accumulating in the > > heap? > > >>> (Are they one specific type of Finalizer objects or all > > varieties?) > > > > > > Jmap -histo shows the above class is keep accumulating. Infact, > > > finalizerinfo doesn't show any objects on this process. > > > > OK, that shows that the objects are somehow not discovered by > > CMS as being eligible for finalization. Although one can imagine > > a one cycle delay (because of floating garbage) with CMS finding > > these objects to be unreachable and hence eligible for finalization, > > continuing accumulation of these objects over a period of time > > (and presumably many CMS cycles) seems strange and almost > > definitely a CMS bug especially as you find that a full STW > > gc does indeed reclaim them. > > > > > > > > > > > > > >>> Did the problem start in 6u21? Or are those the only versions > > >>> you tested and found that there was an issue? > > > We > > > have seen this problem in 6u21. We were on 6u12 earlier and didn't > run > > > into this problem. But can't say this is a build particular, since lots > > > of things have changed. > > > > Can you boil down this behavior into a test case that you are able > > to share with us? > > If so, please file a bug with the test case > > and send me the CR id and I'll take a look. > > > > Oh, and before you do that, can you please check the latest public > > release (6u24 or 6u25?) to see if the problem still reproduces? > > > > thanks, and sorry I could not be of more help without a bug > > report or a test case. 
> > > > -- ramki > > > > > > > Thanks in anticipation, > > > -Bharath > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111129/10bd293b/attachment.html From java at java4.info Wed Nov 30 00:35:45 2011 From: java at java4.info (Florian Binder) Date: Wed, 30 Nov 2011 09:35:45 +0100 Subject: G1 discovers same garbage again? In-Reply-To: <4ED51672.6030105@oracle.com> References: <4ED3DE79.5050801@java4.info> <4ED51672.6030105@oracle.com> Message-ID: <4ED5EAE1.2020102@java4.info> Hi Tony, first of all thank you for your answer. See inline. On 29.11.2011 18:29, Tony Printezis wrote: > Hi Florian, > > See inline. > > On 11/28/2011 2:18 PM, Florian Binder wrote: >> Hi everybody, >> >> I have a java application with 20gb (large-table) memory and using the >> g1 garbage collector. > > Quick clarification: I saw that you use a 20G heap from the parameters > you showed below. Do you know what's your live data size? At this time I have about 14gb of live data, but it is growing day by day. > >> The application calculates the whole time with 10 threads some ratios >> (high cpu load). This is done without producing any garbage. About two >> times a minute a request is sent which produces a little bit of garbage. >> Since we are working with realtime data we are interested in very short >> stop-the-world pauses. Therefore we have used the CMS gc in the past >> until we have got problems with fragmentation now. > > Since you don't produce much garbage how come you have fragmentation? 
> Do you keep the results for all the requests you serve? This data is held for one day; every night it is dropped and reinitialized. We have a lot of different servers with big memory and have had problems with fragmentation on a few of them. This is why I am experimenting with g1 in general. I am not sure if we had fragmentation on this one. Today I tried the g1 with another server which surely had a problem with a fragmented heap, but this one did not start with g1. I got several different exceptions (NoClassDefFound, NullPointerException or even a jvm-crash ;-)). But I think I will write you another email especially for this, because it is started with a lot of special parameters (e.g. -Xms39G -Xmx39G -XX:+UseCompressedOops -XX:ObjectAlignmentInBytes=16 -XX:+UseLargePages). > >> Therefore I am trying the g1. >> >> This seemed to work very well at first. The stw-pauses were, except the >> cleanup pause, > > Out of curiosity: how long are the cleanup pauses? I think they were about 150ms. This is acceptable for me, but in proportion to the garbage collections of 30ms it is very long, and therefore I was wondering. > >> very short. This leads me to my first question: >> Is this normal and are there any parameters to influence the >> cleanup-process? > > I don't think there's much you can do in the app to influence the > cleanup duration. During this pause we do some, ahem, cleanup of our > data structures and for large heaps I have also seen the cleanup > pauses to take longer than I thought they would take. I know this is > not going to help you in the short term but we have plans to do the > cleanup work concurrently (or at least mostly-concurrently) in the > future. Sounds good :-) > >> I thought this phase should be short because there is >> just finished the counting, the role of the bitmaps is switched and the >> next possible garbage regions are determined. All things, which should be >> very fast. So what is taking the time? 
> > Most likely, the remembered set scrubbing phase... > >> The second cause for my email is the crazy behaviour after a few hours: >> After the startup of the server it uses about 13.5 gb old-gen memory and >> generates very slowly eden-garbage. Since the new allocated memory is >> mostly garbage the (young) garbage collections are very fast and g1 >> decides to grow up the eden space. This works 4 times until eden space >> has more than about 3.5 gb memory. After this the gc is making much more >> collections and while the collections it discovers new garbage (probably >> the old one again). > > I'm not quite sure what you mean by "it discovers new garbage". For > young GCs, G1 (and our other GCs) will reclaim any young objects that > will discover to be dead (more accurately: that it will not discover > to be live). > >> Eden memory usage jumps between 0 and 3.5gb even >> though I am sure the java-application is not making more than before. > > Well, that's not good. :-) Can you try to explicitly set the young gen > size with -Xmn3g say, to see what happens? With "it discovers new garbage" I mean that during the garbage collection the eden space usage jumps up to 3gb. Then it cleans up the whole garbage (eden usage is 0) and a few seconds later the eden usage jumps up again. You can see this in the 1h eden-space snapshot: http://java4.info/g1/eden_1h.png Since the jumps are between 0 and the last max eden usage (of about 3.5gb), I assume that it discovers the same garbage it cleaned up the last time and collects it again. I am sure the application is not making more garbage than before. Have you ever heard of problems like this? After I wrote the last email, I saw that it calmed itself after a few hours. But it is nevertheless very curious and produces a lot of unnecessary pauses. Flo > > Tony > >> I >> assume that it runs during a collection in the old garbage and collects >> it again. Is this possible? 
Or is there an overflow since eden space >> uses more than 3.5 gb? >> >> Thanks and regards, >> Flo >> >> Some useful information: >> $ java -version >> java version "1.6.0_29" >> Java(TM) SE Runtime Environment (build 1.6.0_29-b11) >> Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode) >> >> Startup Parameters: >> -Xms20g -Xmx20g >> -verbose:gc \ >> -XX:+UnlockExperimentalVMOptions \ >> -XX:+UseG1GC \ >> -XX:+PrintGCDetails \ >> -XX:+PrintGCDateStamps \ >> -XX:+UseLargePages \ >> -XX:+PrintFlagsFinal \ >> -XX:-TraceClassUnloading \ >> >> $ cat /proc/meminfo | grep Huge >> HugePages_Total: 11264 >> HugePages_Free: 1015 >> HugePages_Rsvd: 32 >> Hugepagesize: 2048 kB >> >> A few screen-shots of the jconsole memory-view: >> http://java4.info/g1/1h.png >> http://java4.info/g1/all.png >> http://java4.info/g1/eden_1h.png >> http://java4.info/g1/eden_all.png >> http://java4.info/g1/oldgen_all.png >> >> The sysout end syserr logfile with the gc logging and PrintFinalFlags: >> http://java4.info/g1/out_err.log.gz >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From stefan.karlsson at oracle.com Wed Nov 30 01:02:33 2011 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 30 Nov 2011 10:02:33 +0100 Subject: Is CMS cycle can collect finalize objects In-Reply-To: References: Message-ID: <4ED5F129.7040208@oracle.com> On 11/30/2011 07:16 AM, Srinivas Ramakrishna wrote: > Who knows, may be this is related to the other CMS CR that Stefan just > sent out a review request for. If I understand correctly then, > the behaviour should be good if you turn off parallel marking in CMS, > viz -XX:-CMSConcurrentMTEnabled (or whatever > the flag is called now). Are you able to check that? If it's the same bug -XX:-CMSConcurrentMTEnabled should fix it. I instrumented and ran the small reproducer in the bug report for bug 7113118 . 
I agree with Krystal's earlier assessment of that reproducer. We actually do discover most Finalizers, but we need a second GC to clean them out. StefanK > > Adding Stefan to the cc, just in case. > -- ramki > > On Tue, Nov 29, 2011 at 2:06 PM, Koji Noguchi > wrote: > > Thanks Krystal for your update. > > I don?t know why I?m getting a different result than yours. > > > > * After the second CMS collection, the Java heap used goes down > to 61747K, > > > In my case, it stays above 800MBytes... > > Attached is the memory footprint with > CMS(-XX:+UseCMSInitiatingOccupancyOnly > -XX:CMSInitiatingOccupancyFraction=95 -XX:+UseConcMarkSweepGC) and > without(fullgc). > > It was interesting to see > > 1. FullGC case eventually stabilizing to having each FullGC > releasing half of the heap (500M) due to finalizer requiring > two GCs. > 2. CMS case still stayed above 800M but there were a few times > when memory footprint dropped. > > > In any cases, I?m pretty sure your SocketAdaptor [3] patch would > workaround the CMS issue I?m facing. So this is no longer urgent > to me as long as that change gets into a future java version. > > Thanks again for all your inputs. > > Koji > > > On 11/24/11 1:45 PM, "Srinivas Ramakrishna" > wrote: > > Hi Kris, thanks for running the test case and figuring that > out, and saving us further investigation of > the submitted test case from Koji. > > Hopefully you or Koji will be able to find a simple test case > that illustrates the real issue. > > thanks! > -- ramki > > > On Thu, Nov 24, 2011 at 1:52 AM, Krystal Mok > > wrote: > > Hi Koji and Ramki, > > I had a look at the repro test case in Bug 7113118. I > don't think the test case is showing the same problem as > the original one caused by SocksSocketImpl objects. The > way this test case is behaving is exactly what the VM > arguments told it to do. > > I ran the test case on a 64-bit Linux HotSpot Server VM, > on JDK 6 update 29. > > My .hotspotrc is at [1]. 
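StefanK's point that finalizable objects "need a second GC to clean them out" can be sketched with a toy class (a hypothetical example, not the reproducer from bug 7113118): the first collection only discovers the dead object and hands it to the finalizer thread; a later collection is what actually frees its memory.

```java
// Sketch of the two-GC life cycle of a finalizable object.
public class TwoGcFinalize {
    static volatile boolean finalized = false;

    static class Resource {
        @Override
        protected void finalize() {
            finalized = true;  // runs on the finalizer thread after GC #1 discovers us
        }
    }

    // Returns true once the finalizer has run; the memory is reclaimable
    // only by a collection that happens *after* finalization, hence "second GC".
    static boolean demo() throws InterruptedException {
        Object r = new Resource();
        r = null;  // drop the last strong reference

        // GC #1 discovers the dead Resource and enqueues its FinalReference.
        // Finalization timing is not guaranteed, so retry with a bound.
        for (int i = 0; i < 100 && !finalized; i++) {
            System.gc();
            System.runFinalization();
            Thread.sleep(10);
        }

        // GC #2 (or any later cycle) is the one that can actually reclaim
        // the object's memory, now that finalize() has completed.
        System.gc();
        return finalized;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("finalized=" + demo());
    }
}
```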
> > SurvivorRatio doesn't need to be set explicitly, because > when CMS is in use and MaxTenuringThreshold is 0 (or > AlwaysTenure is true), the SurvivorRatio will > automatically be set to 1024 by ergonomics. > UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is > in use, so it's omitted from my configuration, too. > > By using -XX:+PrintReferenceGC, the gc log will show when > and how many finalizable object are discovered. > > The VM arguments given force all surviving object from > minor collections to be promoted into the old generation, > and none of the minor collections had a chance to > discovery any ready-to-be-collected FinalReferences, so > minor GC logs aren't of interest in this case. All of the > minor GC log lines show "[FinalReference, 0 refs, xxx secs]". > > A part of the GC log can be found at [2]. This log shows > two CMS collections cycles, in between dozens of minor > collections. > > * Before the first of these two CMS collections, the Java > heap used is 971914K, and then the CMS occupancy threshold > is crossed so a CMS collection cycle starts; > * During the re-mark phase of the first CMS > collection, 46400 FinalReferences were discovered; > * After the first CMS collection, the Java heap used is > still high, at 913771K, because the finalizable objects > need another old generation collection to be collected > (either CMS or full GC is fine); > * During the re-mark phase of the second CMS collection, > 3000 FinalReferences were discovered, these are from > promoted objects from the minor collections in between; > * After the second CMS collection, the Java heap used goes > down to 61747K, as the finalizable objects discovered from > the first CMS collection are indeed finalized and then > collected during the second CMS collection. > > This behavior looks normal to me -- it's what the VM > arguments were telling the VM to do. 
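Krystal's figures above can be sanity-checked with quick arithmetic. The old-generation size here is an assumption (roughly 1000 MB, inferred from the 971914K occupancy at which the cycle started; the thread does not state it exactly):

```java
// Back-of-the-envelope check of the CMS initiating threshold.
// Assumption: ~1000 MB old generation, as suggested by the posted log.
public class OccupancyCheck {
    // KB of old-gen occupancy at which a CMS cycle is initiated.
    static long initiatingOccupancyKb(long oldGenKb, int fraction) {
        return oldGenKb * fraction / 100;
    }

    public static void main(String[] args) {
        long oldGenKb = 1000L * 1024;  // assumed old-gen capacity, in KB
        // With -XX:CMSInitiatingOccupancyFraction=95 the cycle starts at
        // ~972800K -- right where the log shows 971914K used.
        System.out.println(initiatingOccupancyKb(oldGenKb, 95) + "K");
        // A smaller setting such as 20 would start cycles at ~204800K,
        // much closer to the ~62 MB live set, so collections would begin
        // long before dead finalizable objects pile up near the top.
        System.out.println(initiatingOccupancyKb(oldGenKb, 20) + "K");
    }
}
```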
> The reason that the Java heap used size was swing up and > down is because the actual live data set was very low, but > the CMSInitiatingOccupancyFraction was set too high so > concurrent collections are started too late. If the > initiating threshold were set to a smaller value, say 20, > then the test case would behave quite reasonably. > > We'd need another test case to study, because this one > doesn't really repro the problem. > > After we applied the patch to SocketAdaptor [3], we don't > see this kind of CMS/finalization problem in production > anymore. Should we hit one of these again, I'll try to get > more info from our production site and see if I can trace > down the real problem. > > - Kris > > [1]: https://gist.github.com/1390876#file_.hotspotrc > [2]: https://gist.github.com/1390876#file_gc.partial.log > [3]: > http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html > > > On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna > > wrote: > > Hi Koji -- > > Thanks for the test case, that should definitely help > with the dentification of the problem. I'll see if > i can find some spare time to pursue it one of these > days (but can't promise), so please > do open that Oracle support ticket to get the > requisite resource allocated for the official > investigation. > > Thanks again for boiling it down to a simple test > case, and i'll update if i identify the > root cause... > > -- ramki > > > On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi > > wrote: > > This is from an old thread in 2011 April but we're > still seeing the same > problem with (nio) Socket instances not getting > collecting by CMS. > > Opened > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 > > Thanks, > Koji > > > (From > http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm > l) > On 4/25/11 8:37 AM, "Y. 
Srinivas Ramakrishna" > > > wrote: > > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: > > > Hi Ramki, > > > > > > Thanks for the detailed explanation. I was > trying to > > > run some tests for your questions. Here are the > answers to some of your > > > questions. > > > > > >>> What are the symptoms? > > > java.net.SocksSocketImpl objects are not > getting cleaned up after a CMS > > cycle. I see the direct > > > correlation to java.lang.ref.Finalizer objects. > Overtime, this fills up > > > the old generation and CMS going in loop > occupying complete one core. > > > But when we trigger Full GC, these objects are > garbage collected. > > > > OK, thanks. > > > > > > > > You > > > mentioned that CMS cycle does cleanup these > objects provided we enable > > > class unloading. Are you suggesting > -XX:+ClassUnloading or > > > -XX:+CMSClassUnloadingEnabled? I have tried > with later and > > > > > > didn't > > > succeed. Our pern gen is relatively > constant, by enabling this, are we > > > introducing performance overhead? We have > room for CPU cycles and perm > > > gen is relatively small, so this may be fine. > Just that we want to see > > > these objects should GC'ed in CMS cycle. > > > > > > > > > Do you have any suggestion w.r.t. to which > flags should i be using to > > trigger this? > > > > For the issue you are seeing the > -XX:+CMSClassUnloadingFlag will > > not make a difference in the accumulation of the > socket objects > > because there is no "projection" as far as i can > tell of these > > into the perm gen, esepcially since as you say > there is no class > > loading going on (since your perm gen size > remains constant after > > start-up). > > > > > However, keeping class unloading enabled via this > flag should > > hopefully not have much of an impact on your > pause times given that > > the perm gen is small. 
The typical effect you > will see if class > > unloading is enabled is that the CMS remark pause > times are a bit > > longer (if you enable PrintGCDetails you will see > messages > > such as "scrub string table" and "scrub symbol > table", "code cache" > > etc. BY comparing the CMS-remark pause details > and times with > > and without enabling class unloading you will get > a good idea > > of its impact. In some cases, eben though you pay > a small price > > in terms of increased CMS-remark pause times, you > will make up > > for that in terms of faster scavenges etc., so it > might well > > be worthwhile. > > > > In the very near future, we may end up turning > that on > > by default for CMS because the savings from > leaving it off > > by default are much smaller now and it can often > lead to > > other issues if class unloading is turned off. > > > > So bottom line is: it will not affect the > accumulation of > > your socket objects, but it's a good idea to keep > class > > unloading by CMS enabled anyway. > > > > > > > > > > >>> What does jmap -finalizerinfo on your process > show? > > >>> What does -XX:+PrintClassHistogram show as > accumulating in the > > heap? > > >>> (Are they one specific type of Finalizer > objects or all > > varieties?) > > > > > > Jmap -histo shows the above class is keep > accumulating. Infact, > > > finalizerinfo doesn't show any objects on this > process. > > > > OK, that shows that the objects are somehow not > discovered by > > CMS as being eligible for finalization. Although > one can imagine > > a one cycle delay (because of floating garbage) > with CMS finding > > these objects to be unreachable and hence > eligible for finalization, > > continuing accumulation of these objects over a > period of time > > (and presumably many CMS cycles) seems strange > and almost > > definitely a CMS bug especially as you find that > a full STW > > gc does indeed reclaim them. > > > > > > > > > > > > > >>> Did the problem start in 6u21? 
Or are those > the only versions > > >>> you tested and found that there was an issue? > > > We > > > have seen this problem in 6u21. We were on > 6u12 earlier and didn't run > > > into this problem. But can't say this is a > build particular, since lots > > > of things have changed. > > > > Can you boil down this behavior into a test case > that you are able > > to share with us? > > If so, please file a bug with the test case > > and send me the CR id and I'll take a look. > > > > Oh, and before you do that, can you please check > the latest public > > release (6u24 or 6u25?) to see if the problem > still reproduces? > > > > thanks, and sorry I could not be of more help > without a bug > > report or a test case. > > > > -- ramki > > > > > > > > Thanks in anticipation, > > > -Bharath > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111130/2c7360c3/attachment-0001.html From rednaxelafx at gmail.com Wed Nov 30 01:45:19 2011 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 30 Nov 2011 17:45:19 +0800 Subject: Is CMS cycle can collect finalize objects In-Reply-To: <4ED5F129.7040208@oracle.com> References: <4ED5F129.7040208@oracle.com> Message-ID: Hi Stefan and Ramki, I've been looking at 7112034 since it was sent for review. 
I'll see if I can get one of our production site machines that encountered the problem try out -XX:-CMSConcurrentMTEnabled without the SocketAdaptor patch. Will report back later. Thanks a lot for the fix! - Kris On Wed, Nov 30, 2011 at 5:02 PM, Stefan Karlsson wrote: > ** > On 11/30/2011 07:16 AM, Srinivas Ramakrishna wrote: > > Who knows, may be this is related to the other CMS CR that Stefan just > sent out a review request for. If I understand correctly then, > the behaviour should be good if you turn off parallel marking in CMS, viz > -XX:-CMSConcurrentMTEnabled (or whatever > the flag is called now). Are you able to check that? > > > If it's the same bug -XX:-CMSConcurrentMTEnabled should fix it. > > I instrumented and ran the small reproducer in the bug report for bug > 7113118 . I > agree with Krystal's earlier assessment of that reproducer. We actually do > discover most Finalizers, but we need a second GC to clean them out. > > StefanK > > > > Adding Stefan to the cc, just in case. > -- ramki > > On Tue, Nov 29, 2011 at 2:06 PM, Koji Noguchi wrote: > >> Thanks Krystal for your update. >> >> I don?t know why I?m getting a different result than yours. >> >> >> > * After the second CMS collection, the Java heap used goes down >> to 61747K, >> > >> In my case, it stays above 800MBytes... >> >> Attached is the memory footprint with >> CMS(-XX:+UseCMSInitiatingOccupancyOnly >> -XX:CMSInitiatingOccupancyFraction=95 -XX:+UseConcMarkSweepGC) and >> without(fullgc). >> >> It was interesting to see >> >> 1. FullGC case eventually stabilizing to having each FullGC releasing >> half of the heap (500M) due to finalizer requiring two GCs. >> 2. CMS case still stayed above 800M but there were a few times when >> memory footprint dropped. >> >> >> In any cases, I?m pretty sure your SocketAdaptor [3] patch would >> workaround the CMS issue I?m facing. So this is no longer urgent to me as >> long as that change gets into a future java version. 
>> >> Thanks again for all your inputs. >> >> Koji >> >> >> On 11/24/11 1:45 PM, "Srinivas Ramakrishna" wrote: >> >> Hi Kris, thanks for running the test case and figuring that out, and >> saving us further investigation of >> the submitted test case from Koji. >> >> Hopefully you or Koji will be able to find a simple test case that >> illustrates the real issue. >> >> thanks! >> -- ramki >> >> >> On Thu, Nov 24, 2011 at 1:52 AM, Krystal Mok >> wrote: >> >> Hi Koji and Ramki, >> >> I had a look at the repro test case in Bug 7113118. I don't think the >> test case is showing the same problem as the original one caused by >> SocksSocketImpl objects. The way this test case is behaving is exactly what >> the VM arguments told it to do. >> >> I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update >> 29. >> >> My .hotspotrc is at [1]. >> >> SurvivorRatio doesn't need to be set explicitly, because when CMS is in >> use and MaxTenuringThreshold is 0 (or AlwaysTenure is true), the >> SurvivorRatio will automatically be set to 1024 by ergonomics. >> UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's >> omitted from my configuration, too. >> >> By using -XX:+PrintReferenceGC, the gc log will show when and how many >> finalizable object are discovered. >> >> The VM arguments given force all surviving object from minor collections >> to be promoted into the old generation, and none of the minor collections >> had a chance to discovery any ready-to-be-collected FinalReferences, so >> minor GC logs aren't of interest in this case. All of the minor GC log >> lines show "[FinalReference, 0 refs, xxx secs]". >> >> A part of the GC log can be found at [2]. This log shows two CMS >> collections cycles, in between dozens of minor collections. 
>> >> * Before the first of these two CMS collections, the Java heap used >> is 971914K, and then the CMS occupancy threshold is crossed so a CMS >> collection cycle starts; >> * During the re-mark phase of the first CMS collection, 46400 >> FinalReferences were discovered; >> * After the first CMS collection, the Java heap used is still high, >> at 913771K, because the finalizable objects need another old generation >> collection to be collected (either CMS or full GC is fine); >> * During the re-mark phase of the second CMS collection, 3000 >> FinalReferences were discovered, these are from promoted objects from the >> minor collections in between; >> * After the second CMS collection, the Java heap used goes down >> to 61747K, as the finalizable objects discovered from the first CMS >> collection are indeed finalized and then collected during the second CMS >> collection. >> >> This behavior looks normal to me -- it's what the VM arguments were >> telling the VM to do. >> The reason that the Java heap used size was swing up and down is because >> the actual live data set was very low, but >> the CMSInitiatingOccupancyFraction was set too high so concurrent >> collections are started too late. If the initiating threshold were set to a >> smaller value, say 20, then the test case would behave quite reasonably. >> >> We'd need another test case to study, because this one doesn't really >> repro the problem. >> >> After we applied the patch to SocketAdaptor [3], we don't see this kind >> of CMS/finalization problem in production anymore. Should we hit one of >> these again, I'll try to get more info from our production site and see if >> I can trace down the real problem. 
>> >> - Kris >> >> [1]: https://gist.github.com/1390876#file_.hotspotrc >> [2]: https://gist.github.com/1390876#file_gc.partial.log >> [3]: >> http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html >> >> >> On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna >> wrote: >> >> Hi Koji -- >> >> Thanks for the test case, that should definitely help with the >> dentification of the problem. I'll see if >> i can find some spare time to pursue it one of these days (but can't >> promise), so please >> do open that Oracle support ticket to get the requisite resource >> allocated for the official >> investigation. >> >> Thanks again for boiling it down to a simple test case, and i'll update >> if i identify the >> root cause... >> >> -- ramki >> >> >> On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi >> wrote: >> >> This is from an old thread in 2011 April but we're still seeing the same >> problem with (nio) Socket instances not getting collecting by CMS. >> >> Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 >> >> Thanks, >> Koji >> >> >> (From >> >> http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm >> l) >> On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" < >> y.s.ramakrishna at oracle.com>> >> wrote: >> > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: >> > > Hi Ramki, >> > > >> > > Thanks for the detailed explanation. I was trying to >> > > run some tests for your questions. Here are the answers to some of >> your >> > > questions. >> > > >> > >>> What are the symptoms? >> > > java.net.SocksSocketImpl objects are not getting cleaned up after a >> CMS >> > cycle. I see the direct >> > > correlation to java.lang.ref.Finalizer objects. Overtime, this fills >> up >> > > the old generation and CMS going in loop occupying complete one core. >> > > But when we trigger Full GC, these objects are garbage collected. >> > >> > OK, thanks. 
>> > >> > > >> > > You >> > > mentioned that CMS cycle does cleanup these objects provided we >> enable >> > > class unloading. Are you suggesting -XX:+ClassUnloading or >> > > -XX:+CMSClassUnloadingEnabled? I have tried with later and >> > > >> > > didn't >> > > succeed. Our pern gen is relatively constant, by enabling this, >> are we >> > > introducing performance overhead? We have room for CPU cycles and >> perm >> > > gen is relatively small, so this may be fine. Just that we want to see >> > > these objects should GC'ed in CMS cycle. >> > > >> > > >> > > Do you have any suggestion w.r.t. to which flags should i be using to >> > trigger this? >> > >> > For the issue you are seeing the -XX:+CMSClassUnloadingFlag will >> > not make a difference in the accumulation of the socket objects >> > because there is no "projection" as far as i can tell of these >> > into the perm gen, esepcially since as you say there is no class >> > loading going on (since your perm gen size remains constant after >> > start-up). >> > >> >> > However, keeping class unloading enabled via this flag should >> > hopefully not have much of an impact on your pause times given that >> > the perm gen is small. The typical effect you will see if class >> > unloading is enabled is that the CMS remark pause times are a bit >> > longer (if you enable PrintGCDetails you will see messages >> > such as "scrub string table" and "scrub symbol table", "code cache" >> > etc. BY comparing the CMS-remark pause details and times with >> > and without enabling class unloading you will get a good idea >> > of its impact. In some cases, eben though you pay a small price >> > in terms of increased CMS-remark pause times, you will make up >> > for that in terms of faster scavenges etc., so it might well >> > be worthwhile. 
>> > >> > In the very near future, we may end up turning that on >> > by default for CMS because the savings from leaving it off >> > by default are much smaller now and it can often lead to >> > other issues if class unloading is turned off. >> > >> > So bottom line is: it will not affect the accumulation of >> > your socket objects, but it's a good idea to keep class >> > unloading by CMS enabled anyway. >> > >> > > >> > > >> > >>> What does jmap -finalizerinfo on your process show? >> > >>> What does -XX:+PrintClassHistogram show as accumulating in the >> > heap? >> > >>> (Are they one specific type of Finalizer objects or all >> > varieties?) >> > > >> > > Jmap -histo shows the above class is keep accumulating. Infact, >> > > finalizerinfo doesn't show any objects on this process. >> > >> > OK, that shows that the objects are somehow not discovered by >> > CMS as being eligible for finalization. Although one can imagine >> > a one cycle delay (because of floating garbage) with CMS finding >> > these objects to be unreachable and hence eligible for finalization, >> > continuing accumulation of these objects over a period of time >> > (and presumably many CMS cycles) seems strange and almost >> > definitely a CMS bug especially as you find that a full STW >> > gc does indeed reclaim them. >> > >> > > >> > > >> > > >> > >>> Did the problem start in 6u21? Or are those the only versions >> > >>> you tested and found that there was an issue? >> > > We >> > > have seen this problem in 6u21. We were on 6u12 earlier and didn't >> run >> > > into this problem. But can't say this is a build particular, since >> lots >> > > of things have changed. >> > >> > Can you boil down this behavior into a test case that you are able >> > to share with us? >> > If so, please file a bug with the test case >> > and send me the CR id and I'll take a look. >> > >> > Oh, and before you do that, can you please check the latest public >> > release (6u24 or 6u25?) 
to see if the problem still reproduces? >> > >> > thanks, and sorry I could not be of more help without a bug >> > report or a test case. >> > >> > -- ramki >> > >> > > >> > > Thanks in anticipation, >> > > -Bharath >> > > >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> >> > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111130/762ebad7/attachment-0001.html From knoguchi at yahoo-inc.com Wed Nov 30 16:03:31 2011 From: knoguchi at yahoo-inc.com (Koji Noguchi) Date: Wed, 30 Nov 2011 16:03:31 -0800 Subject: Is CMS cycle can collect finalize objects In-Reply-To: <4ED5F129.7040208@oracle.com> Message-ID: Thanks everyone! > the behaviour should be good if you turn off parallel marking in CMS, viz -XX:-CMSConcurrentMTEnabled > Jon also pinged me offline pointing out the same. And yes, this does seem to solve the issue I?m observing. Attaching the result I got from disabling parallel marking for my simple test. Red: Regular CMS with CMSConcurrentMTEnabled Green: CMS with CMSConcurrentMT disabled Blue: FullGC You can see that with CMSConcurrentMT disabled, it is successfully collecting all the stale objects on every other CMS. 
As a side note, > FullGC case eventually stabilizing to having each FullGC releasing half of the heap (500M) due to finalizer requiring two GCs. > >From the graph it doesn?t seem like CMS+ ?XX:-CMSConcurrentMTEnabled (green) is hitting this, but this is just a matter of time. It is slowly getting closer to this state. I would just need to run the test 100 times longer. So, (i) My simple test: ?XX:-CMSConcurrentMTEnabled does fix the issue. (ii) Single node test on my actual server(namenode): ?XX:-CMSConcurrentMTEnabled also seem to fix the issue, I would continue to run for couple more days to confirm. (iii) Test on production. : Haven?t done this yet but I?m optimistic on this option fixing the issue. Thanks again for everyone who helped ! It has bugged me for such a long time. I cannot wait to try this option on a real cluster soon. Koji On 11/30/11 1:02 AM, "Stefan Karlsson" wrote: On 11/30/2011 07:16 AM, Srinivas Ramakrishna wrote: Who knows, may be this is related to the other CMS CR that Stefan just sent out a review request for. If I understand correctly then, the behaviour should be good if you turn off parallel marking in CMS, viz -XX:-CMSConcurrentMTEnabled (or whatever the flag is called now). Are you able to check that? If it's the same bug -XX:-CMSConcurrentMTEnabled should fix it. I instrumented and ran the small reproducer in the bug report for bug 7113118 . I agree with Krystal's earlier assessment of that reproducer. We actually do discover most Finalizers, but we need a second GC to clean them out. StefanK Adding Stefan to the cc, just in case. -- ramki On Tue, Nov 29, 2011 at 2:06 PM, Koji Noguchi wrote: Thanks Krystal for your update. I don?t know why I?m getting a different result than yours. > * After the second CMS collection, the Java heap used goes down to 61747K, > In my case, it stays above 800MBytes... 
Attached is the memory footprint with CMS(-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=95 -XX:+UseConcMarkSweepGC) and without(fullgc). It was interesting to see 1. FullGC case eventually stabilizing to having each FullGC releasing half of the heap (500M) due to finalizer requiring two GCs. 2. CMS case still stayed above 800M but there were a few times when memory footprint dropped. 3. In any cases, I?m pretty sure your SocketAdaptor [3] patch would workaround the CMS issue I?m facing. So this is no longer urgent to me as long as that change gets into a future java version. Thanks again for all your inputs. Koji On 11/24/11 1:45 PM, "Srinivas Ramakrishna" > wrote: Hi Kris, thanks for running the test case and figuring that out, and saving us further investigation of the submitted test case from Koji. Hopefully you or Koji will be able to find a simple test case that illustrates the real issue. thanks! -- ramki On Thu, Nov 24, 2011 at 1:52 AM, Krystal Mok > wrote: Hi Koji and Ramki, I had a look at the repro test case in Bug 7113118. I don't think the test case is showing the same problem as the original one caused by SocksSocketImpl objects. The way this test case is behaving is exactly what the VM arguments told it to do. I ran the test case on a 64-bit Linux HotSpot Server VM, on JDK 6 update 29. My .hotspotrc is at [1]. SurvivorRatio doesn't need to be set explicitly, because when CMS is in use and MaxTenuringThreshold is 0 (or AlwaysTenure is true), the SurvivorRatio will automatically be set to 1024 by ergonomics. UsePSAdaptiveSurvivorSizePolicy has no effect when CMS is in use, so it's omitted from my configuration, too. By using -XX:+PrintReferenceGC, the gc log will show when and how many finalizable object are discovered. 
The VM arguments given force all surviving object from minor collections to be promoted into the old generation, and none of the minor collections had a chance to discovery any ready-to-be-collected FinalReferences, so minor GC logs aren't of interest in this case. All of the minor GC log lines show "[FinalReference, 0 refs, xxx secs]". A part of the GC log can be found at [2]. This log shows two CMS collections cycles, in between dozens of minor collections. * Before the first of these two CMS collections, the Java heap used is 971914K, and then the CMS occupancy threshold is crossed so a CMS collection cycle starts; * During the re-mark phase of the first CMS collection, 46400 FinalReferences were discovered; * After the first CMS collection, the Java heap used is still high, at 913771K, because the finalizable objects need another old generation collection to be collected (either CMS or full GC is fine); * During the re-mark phase of the second CMS collection, 3000 FinalReferences were discovered, these are from promoted objects from the minor collections in between; * After the second CMS collection, the Java heap used goes down to 61747K, as the finalizable objects discovered from the first CMS collection are indeed finalized and then collected during the second CMS collection. This behavior looks normal to me -- it's what the VM arguments were telling the VM to do. The reason that the Java heap used size was swing up and down is because the actual live data set was very low, but the CMSInitiatingOccupancyFraction was set too high so concurrent collections are started too late. If the initiating threshold were set to a smaller value, say 20, then the test case would behave quite reasonably. We'd need another test case to study, because this one doesn't really repro the problem. After we applied the patch to SocketAdaptor [3], we don't see this kind of CMS/finalization problem in production anymore. 
Should we hit one of these again, I'll try to get more info from our production site and see if I can trace down the real problem. - Kris [1]: https://gist.github.com/1390876#file_.hotspotrc [2]: https://gist.github.com/1390876#file_gc.partial.log [3]: http://mail.openjdk.java.net/pipermail/nio-dev/2011-November/001480.html On Thu, Nov 24, 2011 at 7:11 AM, Srinivas Ramakrishna > wrote: Hi Koji -- Thanks for the test case, that should definitely help with the dentification of the problem. I'll see if i can find some spare time to pursue it one of these days (but can't promise), so please do open that Oracle support ticket to get the requisite resource allocated for the official investigation. Thanks again for boiling it down to a simple test case, and i'll update if i identify the root cause... -- ramki On Tue, Nov 22, 2011 at 1:06 PM, Koji Noguchi > wrote: This is from an old thread in 2011 April but we're still seeing the same problem with (nio) Socket instances not getting collecting by CMS. Opened http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7113118 Thanks, Koji (From http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2011-April/subject.htm l) On 4/25/11 8:37 AM, "Y. Srinivas Ramakrishna" > wrote: > On 4/25/2011 9:10 AM, Bharath Mundlapudi wrote: > > Hi Ramki, > > > > Thanks for the detailed explanation. I was trying to > > run some tests for your questions. Here are the answers to some of your > > questions. > > > >>> What are the symptoms? > > java.net.SocksSocketImpl objects are not getting cleaned up after a CMS > cycle. I see the direct > > correlation to java.lang.ref.Finalizer objects. Overtime, this fills up > > the old generation and CMS going in loop occupying complete one core. > > But when we trigger Full GC, these objects are garbage collected. > > OK, thanks. > > > > > You > > mentioned that CMS cycle does cleanup these objects provided we enable > > class unloading. 
> > Are you suggesting -XX:+ClassUnloading or -XX:+CMSClassUnloadingEnabled?
> > I have tried with the latter and didn't succeed. Our perm gen is
> > relatively constant; by enabling this, are we introducing performance
> > overhead? We have room for CPU cycles and perm gen is relatively small,
> > so this may be fine. It's just that we want to see these objects get
> > GC'ed in a CMS cycle.
> >
> > Do you have any suggestion w.r.t. which flags I should be using to
> > trigger this?
>
> For the issue you are seeing, -XX:+CMSClassUnloadingEnabled will not make
> a difference in the accumulation of the socket objects, because there is
> no "projection" of these into the perm gen, as far as I can tell,
> especially since, as you say, there is no class loading going on (your
> perm gen size remains constant after start-up).
>
> However, keeping class unloading enabled via this flag should hopefully
> not have much of an impact on your pause times, given that the perm gen is
> small. The typical effect you will see if class unloading is enabled is
> that the CMS remark pause times are a bit longer (if you enable
> PrintGCDetails you will see messages such as "scrub string table", "scrub
> symbol table", "code cache", etc.). By comparing the CMS-remark pause
> details and times with and without class unloading enabled, you will get a
> good idea of its impact. In some cases, even though you pay a small price
> in terms of increased CMS-remark pause times, you will make up for that in
> terms of faster scavenges etc., so it might well be worthwhile.
>
> In the very near future, we may end up turning that on by default for
> CMS, because the savings from leaving it off by default are much smaller
> now, and turning class unloading off can often lead to other issues.
>
> So the bottom line is: it will not affect the accumulation of your socket
> objects, but it's a good idea to keep class unloading by CMS enabled
> anyway.
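The comparison Ramki suggests -- measuring CMS-remark pauses with and without
class unloading -- could be run with command lines along these lines (a
sketch only; app.jar, heap sizing, and log destinations are placeholders, not
the poster's actual configuration):

```
# Run 1: CMS with class unloading enabled; remark lines will include
# "scrub string table" / "scrub symbol table" detail.
java -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -jar app.jar

# Run 2: identical except class unloading left at its default (off for
# CMS in this era); then compare the "CMS-remark" pause times of the two logs.
java -XX:+UseConcMarkSweepGC \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -jar app.jar
```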
> >>> What does jmap -finalizerinfo on your process show?
> >>> What does -XX:+PrintClassHistogram show as accumulating in the heap?
> >>> (Are they one specific type of Finalizer objects or all varieties?)
> >
> > jmap -histo shows the above class keeps accumulating. In fact,
> > -finalizerinfo doesn't show any objects on this process.
>
> OK, that shows that the objects are somehow not discovered by CMS as
> being eligible for finalization. Although one can imagine a one-cycle
> delay (because of floating garbage) with CMS finding these objects to be
> unreachable and hence eligible for finalization, continuing accumulation
> of these objects over a period of time (and presumably many CMS cycles)
> seems strange and almost definitely a CMS bug, especially as you find
> that a full STW GC does indeed reclaim them.
>
> >>> Did the problem start in 6u21? Or are those the only versions you
> >>> tested and found that there was an issue?
> > We have seen this problem in 6u21. We were on 6u12 earlier and didn't
> > run into this problem. But we can't say this is particular to a build,
> > since lots of things have changed.
>
> Can you boil down this behavior into a test case that you are able to
> share with us? If so, please file a bug with the test case and send me
> the CR id and I'll take a look.
>
> Oh, and before you do that, can you please check the latest public
> release (6u24 or 6u25?) to see if the problem still reproduces?
>
> thanks, and sorry I could not be of more help without a bug report or a
> test case.
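The diagnostics discussed above can be gathered from a running JVM with the
standard JDK tools (the pid is a placeholder for the target process id):

```
# Histogram of heap objects, without forcing a collection -- useful for
# watching SocksSocketImpl / Finalizer counts grow across CMS cycles.
jmap -histo <pid>

# Count of objects waiting in the finalization queue.
jmap -finalizerinfo <pid>

# Note: "jmap -histo:live" forces a full GC first, which (per the report)
# would itself reclaim the accumulating objects and hide the problem.
```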
> -- ramki
>
> > Thanks in anticipation,
> > -Bharath

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmsAndWithoutConcurrentMTAndFullGC.png
Type: application/octet-stream
Size: 46041 bytes
Desc: cmsAndWithoutConcurrentMTAndFullGC.png
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20111130/011b9c65/cmsAndWithoutConcurrentMTAndFullGC-0001.png