From jeff.lloyd at algorithmics.com Fri Oct 2 08:23:39 2009
From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com)
Date: Fri, 2 Oct 2009 11:23:39 -0400
Subject: Default performance on Sun T5440
Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0AABB213@TORMAIL.algorithmics.com>

Hi,

I just ran my application on a new box from Sun and found immediately
debilitating Java performance by default.

System Configuration: Sun Microsystems sun4v T5440
Memory size: 65248 Megabytes

================================ Virtual CPUs ================================
CPU ID Frequency Implementation         Status
------ --------- ---------------------- -------
0      1164 MHz  SUNW,UltraSPARC-T2+    on-line
1      1164 MHz  SUNW,UltraSPARC-T2+    on-line
...
254    1164 MHz  SUNW,UltraSPARC-T2+    on-line
255    1164 MHz  SUNW,UltraSPARC-T2+    on-line

Four UltraSPARC T2 Plus processors per system, 256 hardware threads; 1.2 GHz.

The default Java behaviour is to use all 256 virtual processors for
garbage collection, and even for small ParNew collections the system was
being hammered. The cpu load reported by prstat was 150 during
too-frequent and long-running young generation (YG) collections. I had
to specify -XX:ParallelGCThreads=8 to bring the GC under control.

I just wanted to let you know that the default GC behaviour for Sun's
JVM on Sun's hardware produces undesirable results. I don't think most
people will know how to diagnose or fix this problem.

Jeff
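(Jeff's exact command line never appears in the thread; a minimal sketch
of the kind of invocation being discussed, with the GC-thread cap he
mentions, would look something like the following. The heap sizes and
the class name MyApp are illustrative placeholders, not taken from his
setup; the -XX flags are standard HotSpot options.)

    java -server -Xms12g -Xmx12g \
         -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:ParallelGCThreads=8 \
         -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
         MyApp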
From Jon.Masamitsu at Sun.COM Fri Oct 2 10:19:50 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Fri, 02 Oct 2009 10:19:50 -0700
Subject: Default performance on Sun T5440
In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0AABB213@TORMAIL.algorithmics.com>
Message-ID: <4AC63636.5080609@sun.com>

Jeff,

The default number of GC threads was changed to be fewer in jdk6u14
(same change in jdk7). I'm guessing you're using an older jdk? The
number of GC threads depends on the number of cores on the platform.
I think on the T5440 the number will be about 3/16 * 256 or 48.

Jon

jeff.lloyd at algorithmics.com wrote On 10/02/09 08:23,:

> The default Java behaviour is to use all 256 virtual processors for
> garbage collection, and even for small ParNew collections the system
> was being hammered. [...] I had to specify -XX:ParallelGCThreads=8 to
> bring the GC under control.

From Jon.Masamitsu at Sun.COM Fri Oct 2 10:40:03 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Fri, 02 Oct 2009 10:40:03 -0700
Subject: Default performance on Sun T5440
In-Reply-To: <4AC63636.5080609@sun.com>
Message-ID: <4AC63AF3.6050100@sun.com>

Jon Masamitsu wrote On 10/02/09 10:19,:

> The number of GC threads depends on the
> number of cores on the platform. I think on the T5440 the
  ^^^^^^^^^^^^ cores should be strands
> number will be about 3/16 * 256 or 48.

This is still correct.

From jeff.lloyd at algorithmics.com Fri Oct 2 10:56:46 2009
From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com)
Date: Fri, 2 Oct 2009 13:56:46 -0400
Subject: Default performance on Sun T5440
In-Reply-To: <4AC63AF3.6050100@sun.com>
Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0AABB3B0@TORMAIL.algorithmics.com>

Hi Jon,

Thanks for responding. However, I just want to show you a simple GC
without manually setting the number of GC cpus:

178.427: [GC 178.427: [ParNew: 471872K->52416K(471872K), 18.8885468
secs] 1054705K->756342K(12530496K), 18.8895357 secs] [Times: user=740.49
sys=15.54, real=18.89 secs]

compared to a similar GC when I set the max parallel cpus to 8:

170.531: [GC 170.532: [ParNew: 118014K->13056K(118016K), 0.1646787 secs]
863713K->771012K(12569856K), 0.1651373 secs] [Times: user=0.76 sys=0.05,
real=0.17 secs]

I apologize that it isn't totally apples-to-apples because I reduced the
YG from 512M to 128M, but look at the difference in "user" time. After
5-10 minutes of uptime my application was reporting 135 minutes of cpu
time with the default settings.

I'm not sure if this is helpful to you or the appropriate place to
provide feedback, but this is what people are going to be seeing if they
don't understand GC details.

Jeff
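(A rough reading of those two "[Times: ...]" stanzas, taking user/real
as the average number of cpus kept busy during the pause:

    default threads: 740.49 / 18.89 = ~39 cpus busy for 18.89 s
    8 GC threads:      0.76 /  0.17 = ~4.5 cpus busy for  0.17 s

Even allowing for the roughly 4x difference in young gen size, the
default setting burned on the order of 1000x more cpu time per
collection: 740.49 s vs. 0.76 s.)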
From Jon.Masamitsu at Sun.COM Fri Oct 2 11:14:11 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Fri, 02 Oct 2009 11:14:11 -0700
Subject: Default performance on Sun T5440
In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0AABB3B0@TORMAIL.algorithmics.com>
Message-ID: <4AC642F3.2020402@sun.com>

Jeff,

This might be something different. First, what jdk are you using? What's
your command line? If you are using CMS, can you try a run with
-XX:-UseParNewGC? If not using CMS, can you try -XX:+UseSerialGC
(assuming you're using jdk5 or later).

Jon
From jeff.lloyd at algorithmics.com Fri Oct 2 11:37:19 2009
From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com)
Date: Fri, 2 Oct 2009 14:37:19 -0400
Subject: Default performance on Sun T5440
In-Reply-To: <4AC642F3.2020402@sun.com>
Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0AABB40D@TORMAIL.algorithmics.com>

Hi Jon,

I'm using 1.6.0_16. I can solve it by reducing the number of parallel
threads for ParNew or by using serial new collection - that's ok.

I just wanted to raise this as a usability issue because I'm a developer
and this issue came to me from the field. I'm concerned that few people
in the field understand Java/GC well enough to fix or even recognize
such an issue. And this is the default JVM behaviour on the new
multithreaded servers (though the default there is the pure parallel
collector rather than CMS for the old generation - probably the same
problem).

So, I don't have a problem right now, but I may have more in the future
because our app is supported on many platforms, and now the default
number of GC threads won't work well for clients running the Sun JVM on
particular Sun servers. Maybe 3/16 is too high?

Jeff
From Jon.Masamitsu at Sun.COM Fri Oct 2 12:28:27 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Fri, 02 Oct 2009 12:28:27 -0700
Subject: Default performance on Sun T5440
In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0AABB40D@TORMAIL.algorithmics.com>
Message-ID: <4AC6545B.4050103@sun.com>

Jeff,

When I can get access to a T5440 I'll do some experiments. Do you have a
benchmark you can give us?

Jon
From Jon.Masamitsu at Sun.COM Fri Oct 2 13:32:36 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Fri, 02 Oct 2009 13:32:36 -0700
Subject: Default performance on Sun T5440
In-Reply-To: <4AC6545B.4050103@sun.com>
Message-ID: <4AC66364.7090000@sun.com>

Jeff,

This is what I see running one of our benchmarks with a 1g heap on a
T5440. My apologies, but I had the wrong fraction in my earlier mail:
not 3/16 but 5/16. The default number of GC threads is approximately
the number of strands * 5/16.
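(The thread never shows the exact expression; a sketch that reproduces
Jon's figure of 85 default threads on 256 strands, assuming the first 8
cpus count fully and integer arithmetic thereafter, would be:

    // Sketch of the default GC-thread sizing rule described above.
    // The 8-cpu switch point is an assumption, not confirmed in this
    // thread; with it, 256 strands -> 8 + (248 * 5) / 16 = 85 threads.
    static int defaultParallelGCThreads(int ncpus) {
        return (ncpus <= 8) ? ncpus : 8 + ((ncpus - 8) * 5) / 16;
    }

On a machine with 8 or fewer cpus this gives one GC thread per cpu.)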
UseSerialGC:

81.908: [GC 81.908: [DefNew: 279616K->34943K(314560K), 1.6078835 secs]
464333K->241833K(1013632K), 1.6080676 secs] [Times: user=1.61 sys=0.00,
real=1.61 secs]

UseParNewGC with the default number of GC threads (85):

49.001: [GC 49.001: [ParNew: 314558K->34942K(314560K), 0.1807405 secs]
601002K->350442K(1013632K), 0.1809190 secs] [Times: user=14.67 sys=0.02,
real=0.18 secs]

UseParNewGC with 8 GC threads:

51.140: [GC 51.141: [ParNew: 279616K->34943K(314560K), 0.3261322 secs]
464428K->242091K(1013632K), 0.3262756 secs] [Times: user=2.60 sys=0.00,
real=0.33 secs]

I arbitrarily took the last minor collection in each run.

Yes, between 85 GC threads and 8 GC threads we're doing lots more work
(user time) for not that much gain in GC pause (less than a factor of
two). We're addressing that problem with CR 6593758, which will be
smarter about the number of GC threads (using factors such as heap size
and the number of Java threads to figure out a better number of GC
threads).

But I don't see the egregious increase in pause times that you're
seeing: 18.8885468 secs (default number of GC threads) vs. 0.1646787
secs (with 8 GC threads). So something more is in play.

Jon
From jeff.lloyd at algorithmics.com Tue Oct 6 14:02:53 2009
From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com)
Date: Tue, 6 Oct 2009 17:02:53 -0400
Subject: Default performance on Sun T5440
In-Reply-To: <4AC66364.7090000@sun.com>
Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0AB35B74@TORMAIL.algorithmics.com>

Hi Jon,

Sorry for the delay - I took a long weekend.

Is it possible you are testing on a faster box than me?
Here is mine:

The sparcv9 processor operates at 1164 MHz, and has a sparcv9 floating
point processor.

Jeff
From Jon.Masamitsu at Sun.COM Tue Oct 6 14:27:11 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Tue, 06 Oct 2009 14:27:11 -0700
Subject: Default performance on Sun T5440
In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0AB35B74@TORMAIL.algorithmics.com>
Message-ID: <4ACBB62F.2070303@Sun.COM>

On 10/06/09 14:02, jeff.lloyd at algorithmics.com wrote:
> Is it possible you are testing on a faster box than me? Here is mine:
>
> The sparcv9 processor operates at 1164 MHz, and has a sparcv9 floating
> point processor.

Jeff,

It's a little faster. This is a Sun T5440 with four 1.4 GHz Niagara 2
(T2+) CPUs.

I'm guessing that there is something about your application that is
hitting a weak spot in the collector. If you can share the benchmark,
that would help. In the past we've seen performance problems with very
large objects, but there isn't any problem that we're aware of in the
latest jdk6 release.

Jon
From Sujit.Das at cognizant.com Sun Oct 11 17:58:14 2009
From: Sujit.Das at cognizant.com (Sujit.Das at cognizant.com)
Date: Mon, 12 Oct 2009 06:28:14 +0530
Subject: CMS GC Log
Message-ID: <19B27FD5AF2EAA49A66F787911CF519596D084@CTSINCHNSXUU.cts.com>

I am trying to understand the CMS-IM (initial mark) and CMS-RM (remark)
log messages.

1. CMS Initial Mark

7465.472: [GC [1 CMS-initial-mark: 1370720K(2739200K)]
1384764K(3058176K), 0.0995109 secs]

In the log snippet above, CMS initial mark was triggered at an old gen
occupancy of 1370720K, with a total old gen size of 2739200K. What do
1384764K and 3058176K represent?

2. CMS Remark

7475.403: [weak refs processing, 0.0024767 secs] [1 CMS-remark:
1401865K(2739200K)] 1582866K(3058176K), 0.1290017 secs]

Similarly, in the log snippet above, CMS remark was triggered at an old
gen occupancy of 1401865K, with a total old gen size of 2739200K. What
do 1582866K and 3058176K represent?

Also, in the GC output, is there any way we can determine the heap space
used before and after a CMS collection? This is printed for minor
collections, but I am not sure if it is printed for CMS collections.

Thanks,
Sujit
From Y.S.Ramakrishna at Sun.COM Mon Oct 12 08:23:51 2009
From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM)
Date: Mon, 12 Oct 2009 08:23:51 -0700
Subject: CMS GC Log
In-Reply-To: <19B27FD5AF2EAA49A66F787911CF519596D084@CTSINCHNSXUU.cts.com>
Message-ID: <4AD34A07.8050007@Sun.COM>

On 10/11/09 17:58, Sujit.Das at cognizant.com wrote:
> 1. CMS Initial Mark
>
> 7465.472: [GC [1 CMS-initial-mark: 1370720K(2739200K)]
> 1384764K(3058176K), 0.0995109 secs]
>
> In the log snippet above, CMS initial mark was triggered at an old gen
> occupancy of 1370720K, with a total old gen size of 2739200K.

Right.

> What do 1384764K and 3058176K represent?

The current total heap occupancy and the current total heap capacity
(excluding the perm gen).

> 2. CMS Remark
> [...]
> What do 1582866K and 3058176K represent?

Same as what I wrote above for CMS-IM.

> Also, in the GC output, is there any way we can determine the heap
> space used before and after a CMS collection? This is printed for
> minor collections, but I am not sure if it is printed for CMS
> collections.

The above is basically all you get. Typically the free space peaks
towards the end of the CMS sweep and reaches its minimum around or just
after the start of the sweep (i.e. around the CMS-remark phase).

-- ramki
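(A quick check of that reading against the numbers: at initial mark,
total heap occupancy minus old gen occupancy gives the young gen
occupancy, and likewise for the capacities:

    1384764K - 1370720K = 14044K  in the young gen at initial mark
    3058176K - 2739200K = 318976K of young gen capacity

The remark line decomposes the same way: 1582866K - 1401865K = 181001K
in the young gen at remark.)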
From rainer.jung at kippdata.de Tue Oct 13 15:46:36 2009
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Wed, 14 Oct 2009 00:46:36 +0200
Subject: DGC and UseParallelOldGC
Message-ID: <4AD5034C.7090703@kippdata.de>

Hi all,

if I activate the parallel stop-the-world collector using
-XX:+UseParallelOldGC and DGC kicks in: will it run with multiple
threads in parallel, or will it only run single-threaded?

I know that in the CMS case there are ExplicitGCInvokesConcurrent and
ExplicitGCInvokesConcurrentAndUnloadsClasses, which also influence DGC,
but I didn't find any info about the situation when using parallel old
GC.

TIA!

Regards,

Rainer

From Jon.Masamitsu at Sun.COM Tue Oct 13 16:35:34 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Tue, 13 Oct 2009 16:35:34 -0700
Subject: DGC and UseParallelOldGC
In-Reply-To: <4AD5034C.7090703@kippdata.de>
Message-ID: <4AD50EC6.1020504@sun.com>

Rainer Jung wrote On 10/13/09 15:46,:

> if I activate the parallel stop-the-world collector using
> -XX:+UseParallelOldGC and DGC kicks in: will it run with multiple
> threads in parallel, or will it only run single-threaded?

I'm assuming you mean that you are using the throughput collector
(UseParallelGC/UseParallelOldGC), in which case turning on
UseParallelOldGC will use multiple threads for an explicit GC
(System.gc()).

From rainer.jung at kippdata.de Tue Oct 13 18:10:08 2009
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Wed, 14 Oct 2009 03:10:08 +0200
Subject: DGC and UseParallelOldGC
In-Reply-To: <4AD50EC6.1020504@sun.com>
Message-ID: <4AD524F0.4050905@kippdata.de>

On 14.10.2009 01:35, Jon Masamitsu wrote:
> I'm assuming you mean that you are using the throughput collector
> (UseParallelGC/UseParallelOldGC), in which case turning on
> UseParallelOldGC will use multiple threads for an explicit GC
> (System.gc()).

Yes, that was the question, but also especially for the distributed GC
triggered by RMI. When using CMS you have to explicitly set flags to let
DGC also use CMS. Since UseParallelOldGC is not the default, I wasn't
sure whether you need to set something analogous to let the distributed
GC also use it.

Regards,

Rainer

From Jon.Masamitsu at Sun.COM Wed Oct 14 06:40:29 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Wed, 14 Oct 2009 06:40:29 -0700
Subject: DGC and UseParallelOldGC
In-Reply-To: <4AD524F0.4050905@kippdata.de>
Message-ID: <4AD5D4CD.6040501@Sun.COM>

On 10/13/09 18:10, Rainer Jung wrote:
> Yes, that was the question, but also especially for the distributed GC
> triggered by RMI. When using CMS you have to explicitly set flags to
> let DGC also use CMS. [...]

Yes, for the distributed GC case multiple GC threads will be used. A GC
triggered by RMI (for distributed GC purposes) is the same as an
explicit GC (no special treatment). Similarly, when you use CMS and
ExplicitGCInvokesConcurrent and/or
ExplicitGCInvokesConcurrentAndUnloadsClasses, all the explicit GCs are
affected.
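(For reference, the flags and properties under discussion, as they might
be combined on a CMS command line. The RMI DGC interval values shown are
illustrative, and MyRmiServer is a placeholder:

    java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:+ExplicitGCInvokesConcurrent \
         -Dsun.rmi.dgc.client.gcInterval=3600000 \
         -Dsun.rmi.dgc.server.gcInterval=3600000 \
         MyRmiServer

With the throughput collector, per Jon's answer above, no extra flag is
needed: the System.gc() that DGC triggers already runs with multiple GC
threads when -XX:+UseParallelOldGC is on.)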
From eagle.kessler at gmail.com Thu Oct 15 13:43:47 2009
From: eagle.kessler at gmail.com (Eagle Kessler)
Date: Thu, 15 Oct 2009 13:43:47 -0700
Subject: Excessive concurrent-abortable-preclean wall time?
Message-ID:

A bit of background: We had a service up (a 1.4GB in-memory database
cache with short-lived requests against it) running happily in 2GB of
heap (512MB young generation). That was ticking along perfectly
acceptably, promoting around 1MB of young-generation objects every
minute, which were cleaned out by periodic CMS cycles.

Seventeen days in, we hit a CMS concurrent mode failure, paused for ten
seconds to do a compacting collection, and ops decided (based on their
response-time monitors) that the service had gone down and restarted it.
When they saw the concurrent mode failure in the logs, they decided to
bump it up to 3GB of heap (1GB young generation), based on reasoning I'm
not entirely sure of, and asked us to look into why we hit the
compacting collection.

Looking at the logs, it seemed that each young generation collection
caught around 1MB of live transient data and promoted it, and those
promotions eventually caused enough fragmentation to force a compacting
collection. The old generation was definitely not growing between
collections or anything like that, but fragmentation as the cause is a
guess on my part.

We asked ops to add -XX:MaxTenuringThreshold=2 and -XX:SurvivorRatio=128,
in the hopes that the addition of the tenuring threshold would prevent
the live transient data from being promoted.
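(For reference, the survivor sizing those flags imply: with a 1GB young
generation and -XX:SurvivorRatio=128, each survivor space is about
1048576K / 130 = 8066K, or roughly 8MB. Assuming the default
-XX:TargetSurvivorRatio of 50, that is consistent with the "Desired
survivor size 4128768 bytes" lines in the logs below, since 4128768
bytes = 4032K, about half of one survivor space.)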
From eagle.kessler at gmail.com Thu Oct 15 13:43:47 2009
From: eagle.kessler at gmail.com (Eagle Kessler)
Date: Thu, 15 Oct 2009 13:43:47 -0700
Subject: Excessive concurrent-abortable-preclean wall time?
Message-ID:

A bit of background: We had a service up (1.4GB in-memory database cache
with short-lived requests against it) running happily in 2GB of heap
(512MB Young Generation). That was ticking along perfectly acceptably,
promoting around 1MB of young-generation objects every minute, which were
cleaned out by periodic CMS cycles. Seventeen days in, we hit a CMS
concurrent mode failure, paused for ten seconds to do a compacting
collection, and ops decided (based on their response-time monitors) that
the service had gone down and restarted it. When they saw the concurrent
mode failure in the logs, they decided to bump it up to 3GB of heap (1GB
Young Generation) based on I'm not entirely sure what, and asked us to
look into why we hit the compacting collection.

Looking at the logs, it seemed that each young generation caught around
1MB of live transient data and promoted it, and those promotions
eventually caused enough fragmentation to force a compacting collection.
The old generation was definitely not growing between collections or
anything like that, but fragmentation as the cause is a guess on my part.

We asked ops to add -XX:MaxTenuringThreshold=2 and -XX:SurvivorRatio=128,
in the hopes that the addition of the tenuring threshold would prevent
the live transient data from being promoted. When that change was
applied, though, we began seeing constant, and extremely
poorly-performing, CMS collections:

778.842: [GC [1 CMS-initial-mark: 1403910K(2097152K)] 1955787K(3137664K), 0.3734370 secs]
779.216: [CMS-concurrent-mark-start]
782.234: [CMS-concurrent-mark: 3.017/3.018 secs]
782.234: [CMS-concurrent-preclean-start]
782.242: [CMS-concurrent-preclean: 0.008/0.008 secs]
782.242: [CMS-concurrent-abortable-preclean-start]
856.748: [GC 856.749: [ParNew
Desired survivor size 4128768 bytes, new threshold 2 (max 2)
- age   1:    1159440 bytes,    1159440 total
- age   2:     147320 bytes,    1306760 total
: 1033744K->1301K(1040512K), 0.0085134 secs] 2437655K->1405297K(3137664K), 0.0087125 secs]
933.642: [CMS-concurrent-abortable-preclean: 12.932/151.400 secs]
933.642: [GC[YG occupancy: 517886 K (1040512 K)]933.643: [Rescan (parallel) , 0.2396928 secs]933.882: [weak refs processing, 0.0026955 secs] [1 CMS-remark: 1403995K(2097152K)] 1921882K(3137664K), 0.2425640 secs]
933.885: [CMS-concurrent-sweep-start]
934.685: [CMS-concurrent-sweep: 0.799/0.799 secs]
934.685: [CMS-concurrent-reset-start]
934.700: [CMS-concurrent-reset: 0.015/0.015 secs]
936.702: [GC [1 CMS-initial-mark: 1403961K(2097152K)] 1955265K(3137664K), 0.3780262 secs]
937.081: [CMS-concurrent-mark-start]
940.148: [CMS-concurrent-mark: 3.067/3.067 secs]
940.148: [CMS-concurrent-preclean-start]
940.155: [CMS-concurrent-preclean: 0.007/0.008 secs]
940.155: [CMS-concurrent-abortable-preclean-start]
1012.714: [GC 1012.714: [ParNew
Desired survivor size 4128768 bytes, new threshold 2 (max 2)
- age   1:    1300328 bytes,    1300328 total
- age   2:     153424 bytes,    1453752 total
: 1033749K->1436K(1040512K), 0.0187176 secs] 2437711K->1405484K(3137664K), 0.0188932 secs]
1097.952: [CMS-concurrent-abortable-preclean: 13.428/157.797 secs]
1097.954: [GC[YG occupancy: 518322 K (1040512 K)]1097.954: [Rescan (parallel) , 0.2218981 secs]1098.176: [weak refs processing, 0.0026654 secs] [1 CMS-remark: 1404048K(2097152K)] 1922370K(3137664K), 0.2247333 secs]
1098.180: [CMS-concurrent-sweep-start]
1098.939: [CMS-concurrent-sweep: 0.759/0.759 secs]
1098.939: [CMS-concurrent-reset-start]
1098.954: [CMS-concurrent-reset: 0.015/0.015 secs]

Those two cycles are representative of the entire log (attached), the
main complaints being the constant CMS cycles and the lengthy
CMS-concurrent-abortable-preclean wall time. We've since added
-XX:CMSInitiatingOccupancyFraction=85 -XX:+UseCMSInitiatingOccupancyOnly
to prevent the constant cycles, which may have fixed the lengthy preclean
time as well - we haven't hit a full GC yet, but I can share that data
when it is available. The system is a dual quad-core AMD server, running
Solaris and Java 1.5.0_14 64-bit server.

This all leads to three questions:
1) Why did we start seeing constant CMS cycles with the increased heap
   and the addition of MaxTenuringThreshold?
2) Why did CMS-concurrent-abortable-preclean begin taking so much wall
   time when CMS was running constantly?
3) Was adding CMSInitiatingOccupancyFraction and
   UseCMSInitiatingOccupancyOnly the right way to solve this problem? Is
   there a better one?
The original options were:
-server -Xms2g -Xmx2g -XX:NewSize=512m -XX:MaxNewSize=512m
-XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseConcMarkSweepGC
-XX:+UseParNewGC -XX:MaxTenuringThreshold=0 -XX:SurvivorRatio=128

The poor-performance options were:
-server -Xms3g -Xmx3g -XX:NewSize=1024m -XX:MaxNewSize=1024m
-XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseConcMarkSweepGC
-XX:+UseParNewGC -XX:MaxTenuringThreshold=2 -XX:SurvivorRatio=128

The current (working) options are:
-server -Xms3g -Xmx3g -XX:NewSize=1024m -XX:MaxNewSize=1024m
-XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseConcMarkSweepGC
-XX:+UseParNewGC -XX:MaxTenuringThreshold=2 -XX:SurvivorRatio=128
-XX:CMSInitiatingOccupancyFraction=85 -XX:+UseCMSInitiatingOccupancyOnly

--
Jacob Kessler
-------------- next part --------------
A non-text attachment was scrubbed...
Name: poorGC.log
Type: application/octet-stream
Size: 17929 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20091015/868c16c0/attachment-0001.obj
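As a back-of-the-envelope check on the survivor sizing in those options
(an editorial sketch, not from the thread; the class name is made up):
with -XX:NewSize=1024m and -XX:SurvivorRatio=128, each survivor space is
young/(ratio + 2), comfortably larger than the ~1.3-1.5MB of live
transient data the tenuring distributions above show per scavenge.

    public class SurvivorSizing {
        public static void main(String[] args) {
            long youngBytes = 1024L * 1024 * 1024;  // -XX:NewSize=1024m
            int survivorRatio = 128;                // -XX:SurvivorRatio=128
            // eden : survivor = ratio : 1, with two survivor spaces
            long survivorBytes = youngBytes / (survivorRatio + 2);
            // Prints 8259552 (~7.9MB). The JVM's aligned figure here is
            // 8257536 bytes, and the default -XX:TargetSurvivorRatio=50
            // gives exactly the "Desired survivor size 4128768 bytes"
            // seen in the log above.
            System.out.println("survivor space = " + survivorBytes + " bytes");
        }
    }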
From Y.S.Ramakrishna at Sun.COM Thu Oct 15 15:20:07 2009
From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM)
Date: Thu, 15 Oct 2009 15:20:07 -0700
Subject: Excessive concurrent-abortable-preclean wall time?
In-Reply-To:
References:
Message-ID: <4AD7A017.5070100@Sun.COM>

Hi Jacob --

On 10/15/09 13:43, Eagle Kessler wrote:
...
> Looking at the logs, it seemed that each young generation caught
> around 1MB of live transient data and promoted it, and those
> promotions eventually caused enough fragmentation to force a
> compacting collection. The old generation was definitely not growing
> between collections or anything like that, but fragmentation as the
> cause is a guess on my part.

From yr description I would tend to agree that that looks like a likely
explanation.

> We asked ops to add -XX:MaxTenuringThreshold=2 and
> -XX:SurvivorRatio=128, in the hopes that the addition of the tenuring
> threshold would prevent the live transient data from being promoted.

Of course, you sized SurvivorRatio so that all of that transient data
(and then some) fitted in the survivor spaces (as your tenuring
distribution data below showed).

> When that change was applied, though, we began seeing constant, and
> extremely poorly-performing, CMS collections:
>
> 778.842: [GC [1 CMS-initial-mark: 1403910K(2097152K)]
> 1955787K(3137664K), 0.3734370 secs]

The occupancy of the old gen is here ~67%, which is quite close to the
occupancy of the old gen following a collection (i.e. yr program's
"footprint"). For some reason CMS seems to have decided that 67% is the
right level at which to initiate a CMS cycle, so you started seeing
these back-to-back cycles. I am not sure why CMS ergonomics decided that
that was the right triggering level (clearly you have managed to run
successfully at a higher manually set initiating occupancy fraction of
85%, so perhaps the ergonomic trigger is being overly conservative). My
guess is that for some reason the CMS ergonomics is perhaps doing the
wrong thing here.

Roughly speaking, at a high level, here's how it works:
(1) it looks at how much space is free in the old gen at the end of
    each scavenge;
(2) it tracks roughly what our recent promotion rate has been;
(3) based on these two numbers, it calculates roughly how long we are
    likely to last before we run out of space in the old gen (plus
    possibly some safety factor);
(4) it compares that with how long our recent CMS cycles have lasted and
    kicks off a CMS collection at the appropriate time, or if it finds
    that the CMSInitiatingOccupancyFraction setting has been exceeded.
(I believe that number for 5u14 is by default 92%; so I must conclude
that it is not the CMSInitiatingOccupancyFraction but the ergonomic
triggering logic described above that is causing collections to kick off
too early.)
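In code form, the heuristic sketched in (1)-(4) above amounts to roughly
the following (an illustrative paraphrase only -- the names and the
safety-factor value are made up, and this is not HotSpot's actual
implementation):

    class CmsTriggerSketch {
        // Decide whether to start a CMS cycle now, per the free-space /
        // promotion-rate / cycle-duration reasoning in (1)-(4) above.
        static boolean shouldStartCmsCycle(long oldFreeBytes,
                                           double promotedBytesPerSec,
                                           double recentCycleSecs,
                                           double occupancy,
                                           double initiatingOccupancy) {
            double safetyFactor = 1.1;  // made-up value
            // (1)+(2)+(3): estimated time until the old gen fills up.
            double secsUntilFull =
                oldFreeBytes / Math.max(promotedBytesPerSec, 1.0) / safetyFactor;
            // (4): start if a cycle as long as recent ones would not
            // finish in time, or if the occupancy trigger was exceeded.
            return secsUntilFull <= recentCycleSecs
                || occupancy >= initiatingOccupancy;
        }
    }

A promotion-rate estimate that is biased high (or a cycle-duration
estimate that is biased long) makes secsUntilFull collapse, and the
collector then re-triggers at whatever occupancy the last cycle left
behind -- consistent with the back-to-back cycles at ~67% seen above.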
The workaround you used, of explicitly setting
CMSInitiatingOccupancyFraction and asking to use that and only that as
the trigger for CMS collection via -XX:+UseCMSInitiatingOccupancyOnly
(thus short-circuiting the ergonomic trigger) was indeed the right thing
to do here.

Unless someone else on this list knows the reason for this behaviour or
has other suggestions, I'd suggest the following:
(a) try the latest 5uXX (5u21, I believe), although it's unlikely its
    behaviour will be any different. (Aside: Note that 5uXX is
    approaching EOSL end of this month, so make sure to check with
    http://www.sun.com/software/javaforbusiness/support.jsp re support
    of that product going forward.)
(b) try the latest 6uXX (6u16), and latest HotSpot hs17 to see if the
    behaviour is similar. If the behaviour w/hs17 etc is good we are all
    happy. If not, report the bug below.
(c) file a bug if this behaviour is reproducible with (b) above, or with
    (a) if you have appropriate formal support.

As regards:-

> 2) Why did CMS-concurrent-abortable-preclean begin taking so much wall
> time when CMS was running constantly?

a measure of the efficacy of that phase is whether it succeeds in its
objective of placing the CMS remark pause roughly mid-way between
scavenges -- this reduces the occurrence of possibly back-to-back STW
pauses and also tends to maximally work-balance the CMS remark. The
remark pauses all seem to do a pretty good job of scheduling when the
young gen is roughly 50% full. The long abortable-preclean phase is just
a measure of the long inter-scavenge durations seen in this log. As the
following indicates:-

> 2409.982: [CMS-concurrent-abortable-preclean: 13.555/159.013 secs]

CMS precleaning runs at a duty cycle of less than 10%, so it's something
that you can basically ignore. (I am curious though by what you meant
when stating that the new settings made the long abortable preclean go
away; but the real test of the efficacy of that cycle is how it spaces
apart the STW pauses and whether it's able to keep the CMS-remark pauses
under control.)

all the best!
-- ramki

From Y.S.Ramakrishna at Sun.COM Thu Oct 15 15:28:38 2009
From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM)
Date: Thu, 15 Oct 2009 15:28:38 -0700
Subject: Excessive concurrent-abortable-preclean wall time?
In-Reply-To: <4AD7A017.5070100@Sun.COM>
References: <4AD7A017.5070100@Sun.COM>
Message-ID: <4AD7A216.3030402@Sun.COM>

>>
>> We asked ops to add -XX:MaxTenuringThreshold=2 and
>> -XX:SurvivorRatio=128, in the hopes that the addition of the tenuring
>> threshold would prevent the live transient data from being promoted.

This was also exactly the right thing to do to arrest possible
fragmentation and, in any case, to reduce pressure on the concurrent
collector.

-- ramki

> Of course, you sized SurvivorRatio so that all of that transient data
> (and then some) fitted in the survivor spaces (as your tenuring
> distribution data below showed).

From Jon.Masamitsu at Sun.COM Fri Oct 16 07:33:35 2009
From: Jon.Masamitsu at Sun.COM (Jon Masamitsu)
Date: Fri, 16 Oct 2009 07:33:35 -0700
Subject: Excessive concurrent-abortable-preclean wall time?
In-Reply-To: <4AD7A017.5070100@Sun.COM>
References: <4AD7A017.5070100@Sun.COM>
Message-ID: <4AD8843F.5050701@sun.com>

Y.S.Ramakrishna at Sun.COM wrote On 10/15/09 15:20,:

> ...
>
> Unless someone else on this list knows the reason for this
> behaviour or has other suggestions, I'd suggest the following:
> (a) try the latest 5uXX (5u21, I believe), although it's unlikely its
>     behaviour will be any different. (Aside: Note that 5uXX is
>     approaching EOSL end of this month, so make sure to check with
>     http://www.sun.com/software/javaforbusiness/support.jsp re support
>     of that product going forward.)
> (b) try the latest 6uXX (6u16), and latest HotSpot hs17 to see if the
>     behaviour is similar. If the behaviour w/hs17 etc is good we are
>     all happy. If not, report the bug below.
>

Note that the CMS default settings for jdk6 changed so you might see
some differences just from those changes.

http://java.sun.com/javase/6/docs/technotes/guides/vm/cms-6.html

From Akhmadeev at NetCracker.com Fri Oct 16 08:13:15 2009
From: Akhmadeev at NetCracker.com (Timur Akhmadeev)
Date: Fri, 16 Oct 2009 19:13:15 +0400
Subject: Interpreting G1 GC log
Message-ID: <17182849A17B3C44AE01780E7B519C82065C46E7@WISE.netcracker.com>

Hello all,

I'm testing an application which is very sensitive to GC pause times.
I've already tried CMS and found it quite capable of handling Full GC
with reasonable pauses; now I want to try G1. The first impression is
that G1 can defer Full GC, and does it better than CMS. I found it hard
to interpret what is logged by G1, so I would like to ask if someone can
help me to understand the G1 log.

I'm using jdk6u16 on OEL4u7 and the startup options are the following:

-XX:+UseLargePages -Xms2048m -Xmx2048m -XX:G1YoungGenSize=1600m
-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=25
-XX:GCPauseIntervalMillis=5000 -XX:+G1ParallelRSetUpdatingEnabled
-XX:+G1ParallelRSetScanningEnabled -Xloggc:../logs/gc.log
-XX:+PrintGCDetails

The box has 4 cores, so I assume it is using 4 threads for GC.
Here is information about the threads related to GC:

"Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x08061000 nid=0x32b6 runnable
"Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x08062400 nid=0x32b7 runnable
"Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x08063c00 nid=0x32b8 runnable
"Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x08065000 nid=0x32b9 runnable
"G1 concurrent mark GC Thread" prio=10 tid=0x080cb000 nid=0x32bb runnable
"G1 concurrent refinement GC Thread" prio=10 tid=0x080a6800 nid=0x32ba runnable
"G1 zero-fill GC Thread" prio=10 tid=0x0810e800 nid=0x32bd runnable

And here is an excerpt of one minor collection:

1900.742: [GC pause (young), 0.08446400 secs]
   [Parallel Time: 65.5 ms]
      [Update RS (Start) (ms): 1900759.4 1900754.3 1900754.6 1900754.1]
      [Update RS (ms): 2.7 0.7 0.4 0.9 Avg: 1.2, Min: 0.4, Max: 2.7]
         [Processed Buffers : 0 20 8 18 Sum: 46, Avg: 11, Min: 0, Max: 20]
      [Ext Root Scanning (ms): 8.9 4.7 5.9 5.4 Avg: 6.3, Min: 4.7, Max: 8.9]
      [Mark Stack Scanning (ms): 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0]
      [Scan-Only Scanning (ms): 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0]
         [Scan-Only Regions : 0 0 0 0 Sum: 0, Avg: 0, Min: 0, Max: 0]
      [Scan RS (ms): 0.3 3.8 3.7 3.7 Avg: 2.9, Min: 0.3, Max: 3.8]
      [Object Copy (ms): 48.4 50.6 48.8 51.8 Avg: 49.9, Min: 48.4, Max: 51.8]
      [Termination (ms): 1.7 5.5 4.7 1.7 Avg: 3.4, Min: 1.7, Max: 5.5]
      [Other: 1.9 ms]
   [Clear CT: 2.8 ms]
   [Other: 16.2 ms]
   [ 1827M->247M(2048M)]
 [Times: user=0.26 sys=0.01, real=0.09 secs]

Here're my questions:

1) GC pause (young), 0.08195600 secs
   Does it mean that the application was stopped for 0.0819 seconds? If
   so, what is represented by this line:
   [Times: user=0.26 sys=0.01, real=0.09 secs]
2) What parts of G1 are working in parallel/serially?
3) What parts of G1 are working concurrently/cause stop-the-world?

BTW, I found that large pages are not working with G1 - the JVM silently
uses the usual way of allocating OS memory. Is it a bug?

TIA,
Timur Akhmadeev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20091016/8d8f9f9c/attachment-0001.html
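On question 1), a quick cross-check of the numbers in that excerpt (the
arithmetic is editorial, not from the thread): real is the wall-clock
pause the application saw, and user is CPU time summed across the
parallel workers, so with four workers user should be roughly four times
the parallel phase's wall time -- which is just what the log shows.

    public class G1TimesCheck {
        public static void main(String[] args) {
            double parallelTimeSec = 0.0655;  // [Parallel Time: 65.5 ms]
            int gcWorkers = 4;                // one "Gang worker" per core
            // CPU time ~= workers x parallel wall time = 0.262 s,
            // matching "user=0.26". "real=0.09" is the wall-clock pause
            // itself: the 65.5 ms parallel phase plus the serial
            // [Clear CT: 2.8 ms] and [Other: 16.2 ms] parts, i.e. the
            // "GC pause (young), 0.08446400 secs" figure.
            System.out.println(gcWorkers * parallelTimeSec);
        }
    }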
From eagle.kessler at gmail.com Fri Oct 16 10:46:53 2009
From: eagle.kessler at gmail.com (Eagle Kessler)
Date: Fri, 16 Oct 2009 10:46:53 -0700
Subject: Excessive concurrent-abortable-preclean wall time?
In-Reply-To: <4AD7A216.3030402@Sun.COM>
References: <4AD7A017.5070100@Sun.COM> <4AD7A216.3030402@Sun.COM>
Message-ID:

Thank you for the explanation, and the confirmation that that was the
correct workaround and settings to use. I don't think that we'll be able
to get HotSpot 1.7 deployed any time soon, but I'll see if we can get
bumped up to 1.6 and get some data out of that.

--
Jacob Kessler

On Thu, Oct 15, 2009 at 3:28 PM, wrote:
>
>>>
>>> We asked ops to add -XX:MaxTenuringThreshold=2 and
>>> -XX:SurvivorRatio=128, in the hopes that the addition of the tenuring
>>> threshold would prevent the live transient data from being promoted.
>
> This was also exactly the right thing to do to arrest possible
> fragmentation and, in any case, to reduce pressure on the concurrent
> collector.
>
> -- ramki
>
>> Of course, you sized SurvivorRatio so that all of that transient data
>> (and then some) fitted in the survivor spaces (as your tenuring
>> distribution data below showed).
>

From Sujit.Das at cognizant.com Tue Oct 27 09:49:10 2009
From: Sujit.Das at cognizant.com (Sujit.Das at cognizant.com)
Date: Tue, 27 Oct 2009 22:19:10 +0530
Subject: Parsing GC log for Thread Stops
Message-ID: <19B27FD5AF2EAA49A66F787911CF519596D0A0@CTSINCHNSXUU.cts.com>

Hi,

We are using JDK jdk1.6.0_14 on the Solaris 10 platform. I am having
difficulty in associating a timestamp to the thread stop time. Please
refer to the log snippet below.

There are two occurrences of thread stops. The first is at 18:06 with a
thread stop time value of 0.0336887 seconds. The second thread stop
time, of value 3.4217414 seconds, has no associated timestamp. My
question - what is the associated timestamp for the second thread stop?
Is it 18:06 or 18:09?

2009-10-26T18:06:19.008-0500: 3534.051: [GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 385976134
Max Chunk Size: 385976134
Number of Blocks: 1
Av. Block Size: 385976134
Tree Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
3534.051: [ParNew
Desired survivor size 218431488 bytes, new threshold 4 (max 4)
- age   1:    4747768 bytes,    4747768 total
- age   2:    8973152 bytes,   13720920 total
- age   3:   16305200 bytes,   30026120 total
- age   4:    1050896 bytes,   31077016 total
: 4300016K->43704K(4693376K), 0.0265603 secs] 5380449K->1124137K(8789376K)After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 385976134
Max Chunk Size: 385976134
Number of Blocks: 1
Av. Block Size: 385976134
Tree Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
, 0.0272833 secs] [Times: user=0.36 sys=0.01, real=0.03 secs]
Heap after GC invocations=213 (full 2):
 par new generation   total 4693376K, used 43704K [0xfffffd7dbf600000, 0xfffffd7ef7e00000, 0xfffffd7ef7e00000)
  eden space 4266752K,   0% used [0xfffffd7dbf600000, 0xfffffd7dbf600000, 0xfffffd7ec3cc0000)
  from space 426624K,  10% used [0xfffffd7ec3cc0000, 0xfffffd7ec676e090, 0xfffffd7eddd60000)
  to   space 426624K,   0% used [0xfffffd7eddd60000, 0xfffffd7eddd60000, 0xfffffd7ef7e00000)
 concurrent mark-sweep generation total 4096000K, used 1080433K [0xfffffd7ef7e00000, 0xfffffd7ff1e00000, 0xfffffd7ff1e00000)
 concurrent-mark-sweep perm gen total 131072K, used 62140K [0xfffffd7ff1e00000, 0xfffffd7ff9e00000, 0xfffffd7ff9e00000)
}
Total time for which application threads were stopped: 0.0336887 seconds
3878 ...class name-method name...
3879 ...class name-method name...
3880 ...class name-method name...
3881 ...class name-method name...
3882 ...class name-method name...
Application time: 163.9563827 seconds
3867 ...class name-method name...
3868 ...class name-method name...
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 385976134
Max Chunk Size: 385976134
Number of Blocks: 1
Av. Block Size: 385976134
Tree Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 385976134
Max Chunk Size: 385976134
Number of Blocks: 1
Av. Block Size: 385976134
Tree Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
Total time for which application threads were stopped: 3.4217414 seconds
Application time: 0.3801325 seconds
{Heap before GC invocations=213 (full 2):
 par new generation   total 4693376K, used 1247868K [0xfffffd7dbf600000, 0xfffffd7ef7e00000, 0xfffffd7ef7e00000)
  eden space 4266752K,  28% used [0xfffffd7dbf600000, 0xfffffd7e08df1070, 0xfffffd7ec3cc0000)
  from space 426624K,  10% used [0xfffffd7ec3cc0000, 0xfffffd7ec676e090, 0xfffffd7eddd60000)
  to   space 426624K,   0% used [0xfffffd7eddd60000, 0xfffffd7eddd60000, 0xfffffd7ef7e00000)
 concurrent mark-sweep generation total 4096000K, used 1080433K [0xfffffd7ef7e00000, 0xfffffd7ff1e00000, 0xfffffd7ff1e00000)
 concurrent-mark-sweep perm gen total 131072K, used 62142K [0xfffffd7ff1e00000, 0xfffffd7ff9e00000, 0xfffffd7ff9e00000)
2009-10-26T18:09:45.836-0500: 3740.882: [Full GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
...
...

Thanks,
Sujit

This e-mail and any files transmitted with it are for the sole use of
the intended recipient(s) and may contain confidential and privileged
information. If you are not the intended recipient, please contact the
sender by reply e-mail and destroy all copies of the original message.
Any unauthorized review, use, disclosure, dissemination, forwarding,
printing or copying of this email or any action taken in reliance on
this e-mail is strictly prohibited and may be unlawful.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20091027/11ab6508/attachment.html

From Y.S.Ramakrishna at Sun.COM Tue Oct 27 11:42:24 2009
From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM)
Date: Tue, 27 Oct 2009 11:42:24 -0700
Subject: Parsing GC log for Thread Stops
In-Reply-To: <19B27FD5AF2EAA49A66F787911CF519596D0A0@CTSINCHNSXUU.cts.com>
References: <19B27FD5AF2EAA49A66F787911CF519596D0A0@CTSINCHNSXUU.cts.com>
Message-ID: <4AE73F10.5050709@Sun.COM>

You seem to have deleted some o/p, but the stop time printing strictly
follows the stop; it does not precede it in the log. (Note of course
that threads may be stopped for stuff other than GC as well, and those
reasons may not necessarily be published to the log (even though the
fact that threads were stopped, and for how long, is).)

-- ramki
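Mechanically pairing each stop with a timestamp, as Sujit is trying to
do, is easy to script. Below is a minimal editorial sketch (the class
name is made up); it simply applies Ramki's rule that the stopped-time
line is printed after the stop it reports, so each stop is known only to
have ended somewhere after the last datestamp seen above it.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class StopTimeScan {
        public static void main(String[] args) throws Exception {
            // -XX:+PrintGCDateStamps prefix, e.g. 2009-10-26T18:06:19.008-0500:
            Pattern date = Pattern.compile("(\\d{4}-\\d{2}-\\d{2}T[\\d:.+-]+):");
            Pattern stop = Pattern.compile(
                "Total time for which application threads were stopped: " +
                "([\\d.]+) seconds");
            String lastSeen = "(start of log)";
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            for (String line; (line = in.readLine()) != null; ) {
                Matcher d = date.matcher(line);
                if (d.find()) lastSeen = d.group(1);
                Matcher s = stop.matcher(line);
                if (s.find()) {
                    // E.g. the 3.4217414 s stop above ended before its
                    // print, i.e. after the 18:06 GC plus the 163.9 s of
                    // application time -- not at the 18:09:45 Full GC
                    // that is printed after it.
                    System.out.println("stopped " + s.group(1) +
                                       " s, some time after " + lastSeen);
                }
            }
            in.close();
        }
    }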
On 10/27/09 09:49, Sujit.Das at cognizant.com wrote:
> Hi,
>
> We are using JDK jdk1.6.0_14 on the Solaris 10 platform. I am having
> difficulty in associating a timestamp to the thread stop time.
>
> There are two occurrences of thread stops. The first is at 18:06 with
> a thread stop time value of 0.0336887 seconds. The second thread stop
> time, of value 3.4217414 seconds, has no associated timestamp. My
> question - what is the associated timestamp for the second thread
> stop? Is it 18:06 or 18:09?
>
> [...]