From ching at neutec.com.tw Wed Aug 8 00:50:06 2012 From: ching at neutec.com.tw (Ching Chen) Date: Wed, 8 Aug 2012 15:50:06 +0800 Subject: Java 7 update 5 GC Message-ID: Dear Sir or Madam: My Java application (a high-volume, low-latency online transaction betting system) needs to achieve the smallest possible GC pauses while keeping acceptable throughput. When I read about the Java 7 G1 collector, I thought it was exactly what I wanted (I also investigated Azul Zing but stopped partway through that study). However, I got a strange result when I ran a test on Java 7 update 5 and compared it with the same test on Java 7.0. The GC-related command-line options in my test are not complicated. I only specified a few of them and let the GC determine the rest for best performance. The GC options are: java -Xms12288m -Xmx12288m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintCommandLineFlags -XX:+UseG1GC -XX:MaxGCPauseMillis=100 \ The Java 7.0 result is as follows: *java version "1.7.0" Java(TM) SE Runtime Environment (build 1.7.0-b147) Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode) [INFO] total GC times:199 [INFO] C:/Users/Chris/Documents/My Projects/citibet-2nd/support/matchspace/GC performance/XX_UseG1GC_MaxGCPauseMillis100.txt average GC:0.067739 second [INFO] app stops:206, average seconds:0.065926 [DEBUG] [0.006688, 0.007678, 0.008537, 0.014192, 0.021206, 0.023098, 0.023628, 0.026924, 0.028749, 0.029012, 0.030648, 0.031663, 0.031704, 0.031743, 0.033121, 0.035534, 0.036437, 0.037418, 0.038571, 0.039253, 0.040569, 0.041807, 0.042169, 0.042459, 0.043335, 0.043797, 0.045504, 0.046178, 0.046337, 0.050747, 0.051332, 0.052527, 0.052614, 0.052623, 0.054945, 0.055021, 0.05538, 0.055595, 0.055836, 0.05586, 0.056101, 0.056134, 0.056167, 0.056181, 0.056243, 0.056648, 0.056803, 0.057133, 0.057656, 0.057689, 0.058021, 0.058613, 0.058964, 0.059164, 0.059424, 0.059438, 0.059456, 0.059853, 0.060354, 0.06064, 0.060972, 0.060999, 0.061141, 0.061157, 
0.061163, 0.061305, 0.061444, 0.061567, 0.061594, 0.061923, 0.06215, 0.062252, 0.063147, 0.063149, 0.063159, 0.063563, 0.063885, 0.064047, 0.064882, 0.064975, 0.065076, 0.065228, 0.065311, 0.066254, 0.066435, 0.066572, 0.067084, 0.067206, 0.067543, 0.067919, 0.067973, 0.068015, 0.068042, 0.068397, 0.068797, 0.0691, 0.069626, 0.069885, 0.070051, 0.070063, 0.070077, 0.070625, 0.071117, 0.071127, 0.071257, 0.071372, 0.071453, 0.071647, 0.071703, 0.072027, 0.072058, 0.072275, 0.072301, 0.07234, 0.072363, 0.072513, 0.072575, 0.072679, 0.07305, 0.073061, 0.073326, 0.073482, 0.073607, 0.073888, 0.073949, 0.074562, 0.074604, 0.074644, 0.075278, 0.075352, 0.07547, 0.075543, 0.075581, 0.076057, 0.076445, 0.076514, 0.076599, 0.076967, 0.076991, 0.077079, 0.077112, 0.077192, 0.077519, 0.077547, 0.077881, 0.078351, 0.078416, 0.078975, 0.079444, 0.079986, 0.080327, 0.080342, 0.080568, 0.080884, 0.081912, 0.081916, 0.082134, 0.082136, 0.082305, 0.082722, 0.082755, 0.082992, 0.083276, 0.083517, 0.083989, 0.084839, 0.084884, 0.085197, 0.08552, 0.085662, 0.08599, 0.085999, 0.086837, 0.087184, 0.087367, 0.08745, 0.087659, 0.087908, 0.088478, 0.089203, 0.089498, 0.090376, 0.09044, 0.092862, 0.093084, 0.093279, 0.093361, 0.097381, 0.098379, 0.100322, 0.100868, 0.10163, 0.103446, 0.104192, 0.105727, 0.108074, 0.108931, 0.113103, 0.152951] [INFO] Minimum=11419, Maximum=121219, Total=5204830, Count=66, Average=78861. (total elapsed nano:67824103252)* *[INFO] memory in usages after test 101 ends:4111938008, total memory:12884901888, max memory:12884901888 with total 5280000 bets* ** This test result meets G1GC specified (more GC times but smaller GC pause). 
When I compare this with the test result of Java 7 update 5, however, the result quite surprised me: java version "1.7.0_05" Java(TM) SE Runtime Environment (build 1.7.0_05-b06) Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode) [INFO] total GC times:8 [INFO] C:/Users/Chris/Documents/My Projects/citibet-2nd/support/matchspace/GC performance/XX_UseG1GC_MaxGCPauseMillis100_java7u5.txt average GC:0.730325 second [INFO] app stops:18, average seconds:0.325045 [DEBUG] [0.433048, 0.712155, 0.725535, 0.73903, 0.765341, 0.774686, 0.833398, 0.859405] [INFO] Minimum=11504, Maximum=175425, Total=5220946, Count=40, Average=130523. (total elapsed nano:43241336678) [INFO] memory in usages after test 101 ends:5626027848, total memory:12884901888, max memory:12884901888 with total 5280000 bets The throughput does increase, but the GC pause time does not meet the requirement (< 100 milliseconds): quite a big difference! Again, both runs use the same GC options. The system reports the flags as: -XX:InitialHeapSize=12884901888 -XX:MaxGCPauseMillis=100 -XX:MaxHeapSize=12884901888 -XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails -XX:+UseCompressedOops -XX:+UseG1GC Am I missing some secret when using Java 7 update 5, as far as GC is concerned? Thanks, Ching Chen -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120808/5d11a5a5/attachment.html From john.cuthbertson at oracle.com Wed Aug 8 11:18:20 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 08 Aug 2012 11:18:20 -0700 Subject: Java 7 update 5 GC In-Reply-To: References: Message-ID: <5022AD6C.1060306@oracle.com> Hi Ching, I'm not sure what's going on here. Do you have the complete GC logs available? What was the behavior with jdk7u4? 
I wonder if you are running into some expensive mixed GCs (perhaps as a result of a marking cycle being initiated by a humongous object allocation). Thanks, JohnC On 08/08/12 00:50, Ching Chen wrote: > Dear Sirs/Madams who may concern: > > My java application (high-volume, low-latency on-line transaction > betting system) needs to accomplish the least GC pause while to keep > acceptable throughput. When I read java 7 G1GC and thought this is > what I really want (I also investigated Azul Zing but stop somewhere > in the middle of my study). > > I got a strange result when I ran a test with java 7 update 5 and > compared it with java 7.0 of same test though. > > The java commands associating to GC in my test is not complicate. I > only specified several of them and let GC to determine the rest for > best performance. The GC options are: > java -Xms12288m -Xmx12288m -verbose:gc -XX:+PrintGCDetails > -XX:+PrintGCApplicationStoppedTime -XX:+PrintCommandLineFlags > -XX:+UseG1GC -XX:MaxGCPauseMillis=100 \ > > The java 7.0 result as following: > > /java version "1.7.0" > Java(TM) SE Runtime Environment (build 1.7.0-b147) > Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode) > [INFO] total GC times:199 > [INFO] C:/Users/Chris/Documents/My > Projects/citibet-2nd/support/matchspace/GC > performance/XX_UseG1GC_MaxGCPauseMillis100.txt average GC:0.067739 second > [INFO] app stops:206, average seconds:0.065926 > [DEBUG] [0.006688, 0.007678, 0.008537, 0.014192, 0.021206, 0.023098, > 0.023628, 0.026924, 0.028749, 0.029012, 0.030648, 0.031663, 0.031704, > 0.031743, 0.033121, > 0.035534, 0.036437, 0.037418, 0.038571, 0.039253, 0.040569, 0.041807, > 0.042169, 0.042459, 0.043335, 0.043797, 0.045504, 0.046178, 0.046337, > 0.050747, 0.051332, > 0.052527, 0.052614, 0.052623, 0.054945, 0.055021, 0.05538, 0.055595, > 0.055836, 0.05586, 0.056101, 0.056134, 0.056167, 0.056181, 0.056243, > 0.056648, 0.056803, > 0.057133, 0.057656, 0.057689, 0.058021, 0.058613, 0.058964, 0.059164, 
> 0.059424, 0.059438, 0.059456, 0.059853, 0.060354, 0.06064, 0.060972, > 0.060999, 0.061141, > 0.061157, 0.061163, 0.061305, 0.061444, 0.061567, 0.061594, 0.061923, > 0.06215, 0.062252, 0.063147, 0.063149, 0.063159, 0.063563, 0.063885, > 0.064047, 0.064882, > 0.064975, 0.065076, 0.065228, 0.065311, 0.066254, 0.066435, 0.066572, > 0.067084, 0.067206, 0.067543, 0.067919, 0.067973, 0.068015, 0.068042, > 0.068397, 0.068797, > 0.0691, 0.069626, 0.069885, 0.070051, 0.070063, 0.070077, 0.070625, > 0.071117, 0.071127, 0.071257, 0.071372, 0.071453, 0.071647, 0.071703, > 0.072027, 0.072058, > 0.072275, 0.072301, 0.07234, 0.072363, 0.072513, 0.072575, 0.072679, > 0.07305, 0.073061, 0.073326, 0.073482, 0.073607, 0.073888, 0.073949, > 0.074562, 0.074604, > 0.074644, 0.075278, 0.075352, 0.07547, 0.075543, 0.075581, 0.076057, > 0.076445, 0.076514, 0.076599, 0.076967, 0.076991, 0.077079, 0.077112, > 0.077192, 0.077519, > 0.077547, 0.077881, 0.078351, 0.078416, 0.078975, 0.079444, 0.079986, > 0.080327, 0.080342, 0.080568, 0.080884, 0.081912, 0.081916, 0.082134, > 0.082136, 0.082305, > 0.082722, 0.082755, 0.082992, 0.083276, 0.083517, 0.083989, 0.084839, > 0.084884, 0.085197, 0.08552, 0.085662, 0.08599, 0.085999, 0.086837, > 0.087184, 0.087367, > 0.08745, 0.087659, 0.087908, 0.088478, 0.089203, 0.089498, 0.090376, > 0.09044, 0.092862, 0.093084, 0.093279, 0.093361, 0.097381, 0.098379, > 0.100322, 0.100868, > 0.10163, 0.103446, 0.104192, 0.105727, 0.108074, 0.108931, 0.113103, > 0.152951] > [INFO] Minimum=11419, Maximum=121219, Total=5204830, Count=66, > Average=78861. (total elapsed nano:67824103252)/ > /[INFO] memory in usages after test 101 ends:4111938008, total > memory:12884901888, max memory:12884901888 with total 5280000 bets/ > // > > This test result meets G1GC specified (more GC times but smaller GC > pause). 
While comparing to the test result of java 7 up/date 5/, the > result is quite surprised me/:/ > // > /java version "1.7.0_05" > Java(TM) SE Runtime Environment (build 1.7.0_05-b06) > Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode) > [INFO] total GC times:8 > [INFO] C:/Users/Chris/Documents/My > Projects/citibet-2nd/support/matchspace/GC > performance/XX_UseG1GC_MaxGCPauseMillis100_java7u5.txt average > GC:0.730325 second > [INFO] app stops:18, average seconds:0.325045 > [DEBUG] [0.433048, 0.712155, 0.725535, 0.73903, 0.765341, 0.774686, > 0.833398, 0.859405] > [INFO] Minimum=11504, Maximum=175425, Total=5220946, Count=40, > Average=130523. (total elapsed nano:43241336678) > [INFO] memory in usages after test 101 ends:5626027848, total > memory:12884901888, max memory:12884901888 with total 5280000 bets > / > > The throughput does increase but the GC pause-time does not meet the > minimum requirement (< 100 millisecond) quite a big difference! > > Again, both runnings use same command GC options. System shows to me > like: > -XX:InitialHeapSize=12884901888 -XX:MaxGCPauseMillis=100 > -XX:MaxHeapSize=12884901888 -XX:+PrintCommandLineFlags -XX:+PrintGC > -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails > -XX:+UseCompressedOops -XX:+UseG1GC > Do I miss something secrets when using java 7 update 5 for GC specific > issues? > > > Thanks, > Ching Chen > // > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120808/f206c333/attachment.html From caoxudong818 at gmail.com Sun Aug 12 01:10:58 2012 From: caoxudong818 at gmail.com (=?GB2312?B?stzQ8bar?=) Date: Sun, 12 Aug 2012 16:10:58 +0800 Subject: Why does max heap size change? Message-ID: Hi all, I am doing some monitoring of the JVM with JMX, and have been stuck on a question about the max heap size. The javadoc of the field *max* of class *java.lang.management.HeapUsage* says, "represents the maximum amount of memory (in bytes) that can be used for memory management. Its value may be undefined. *The maximum amount of memory may change over time if defined*. " So, my question is: why does the max heap size change if I defined it with -Xmx? Any response will be appreciated. Best Regards. caoxudong -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120812/7a12079f/attachment.html From rednaxelafx at gmail.com Sun Aug 12 08:47:58 2012 From: rednaxelafx at gmail.com (Krystal Mok) Date: Sun, 12 Aug 2012 23:47:58 +0800 Subject: Why does max heap size change? In-Reply-To: References: Message-ID: Hi Xudong, I believe the class you're referring to is java.lang.management.MemoryUsage. Quoting the docs [1]: A MemoryUsage object represents a snapshot of memory usage. Instances of the MemoryUsage class are usually constructed by methods that are used to obtain memory usage information about individual memory pool of the Java virtual machine or the heap or non-heap memory of the Java virtual machine as a whole. Which means MemoryUsage is not just used to represent the usage of the Java heap as a whole. -Xmx only locks the maximum size of the Java heap, but doesn't say anything about how the spaces within the Java heap should be arranged. Let's look at an example of a MemoryUsage object representing the usage of a memory pool. 
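The same per-pool MemoryUsage snapshots can also be read programmatically through the standard java.lang.management API; a minimal sketch (the class name PoolMaxProbe is invented for illustration, not from this thread):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Prints the max/used values of every memory pool. On a collector with an
// adaptive size policy, a pool's getMax() can change between snapshots even
// though -Xmx (the whole-heap limit) is fixed.
public class PoolMaxProbe {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            // getMax() may be -1, meaning "undefined" for this pool
            System.out.printf("%-25s max=%d used=%d%n",
                    pool.getName(), u.getMax(), u.getUsed());
        }
    }
}
```

Running this twice around an allocation burst makes the changing per-pool max values visible without JConsole.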
Try running JConsole on a HotSpot Server VM with default arguments. The collector used by default would be the Parallel collector. Go to the MBean tab, find the MBean of "PS Eden Space" from java.lang -> MemoryPool, and open its Usage property. Refresh the value a few times, and see if the max field changes. In my environment, it does change over time. That's because the Parallel collector uses an adaptive size policy, which could change the maximum size of the generations adaptively. Regards, Kris [1]: http://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryUsage.html On Sun, Aug 12, 2012 at 4:10 PM, caoxudong wrote: > Hi all, > > I am doing some monitor jobs for JVM with JMX, and has been stuck on a > question about max heap size. > > The javadoc of the field *max* of class *java.lang.management.HeapUsage*says, > > "represents the maximum amount of memory (in bytes) that can be > used for memory management. Its value may be undefined. > > *The maximum amount of memory may change over time if defined*. " > > So, my question is, > > Why dose max heap size change, if I defined it with Xmx? > > Any response will be appreciated. > > Best Regards. > > caoxudong > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120812/6a63aefe/attachment.html From caoxudong818 at gmail.com Sun Aug 12 09:19:52 2012 From: caoxudong818 at gmail.com (=?GB2312?B?stzQ8bar?=) Date: Mon, 13 Aug 2012 00:19:52 +0800 Subject: Why does max heap size change? In-Reply-To: References: Message-ID: Hi Krystal, Thanks for your explanation. I understand it. Thanks a lot. Best Regards. caoxudong 2012/8/12 Krystal Mok > Hi Xudong, > > I believe the class you're referring to is > java.lang.management.MemoryUsage. 
Quoting the docs [1]: > > A MemoryUsage object represents a snapshot of memory usage. Instances of > the MemoryUsage class are usually constructed by methods that are used to > obtain memory usage information about individual memory pool of the Java > virtual machine or the heap or non-heap memory of the Java virtual machine > as a whole. > > Which means MemoryUsage is not just used to represent the usage of the > Java heap as a whole. -Xmx only locks the maximum size of the Java heap, > but doesn't say anything about how the spaces within the Java heap should > be arranged. > > Let's look at an example of a MemoryUsage object representing the usage of > a memory pool. > Try running JConsole on a HotSpot Server VM with default arguments. The > collector used by default would be the Parallel collector. > Go to the MBean tab, find the MBean of "PS Eden Space" from java.lang -> > MemoryPool, and open its Usage property. Refresh the value a few times, and > see if the max field changes. > In my environment, it does change over time. That's because the Parallel > collector uses an adaptive size policy, which could change the maximum size > of the generations adaptively. > > Regards, > Kris > > [1]: > http://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryUsage.html > > > On Sun, Aug 12, 2012 at 4:10 PM, ?????? wrote: > >> Hi all, >> >> I am doing some monitor jobs for JVM with JMX, and has been stuck on a >> question about max heap size. >> >> The javadoc of the field *max* of class *java.lang.management.HeapUsage*says, >> >> "represents the maximum amount of memory (in bytes) that can be >> used for memory management. Its value may be undefined. >> >> *The maximum amount of memory may change over time if defined*. " >> >> So, my question is, >> >> Why dose max heap size change, if I defined it with Xmx? >> >> Any response will be appreciated. >> >> Best Regards. 
>> >> caoxudong >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120813/9235b639/attachment.html From java at java4.info Tue Aug 14 02:27:47 2012 From: java at java4.info (Florian Binder) Date: Tue, 14 Aug 2012 11:27:47 +0200 Subject: CMS, PLAB-size and fragmentation Message-ID: <502A1A13.9010709@java4.info> Hi everybody, one of our servers (using CMS with ParNew) promotes a lot of very small objects to the tenured generation at each young GC (the survivor spaces are disabled since the objects would survive them anyway): 5[4]: 10703/101530/19930 5[6]: 5482/50575/10115 (Sometimes even more.) Therefore I increased CMSOldPLABMax=131072 (OK, this might be too large and 32k would be enough) and decreased CMSOldPLABMin=8, because there are always a few larger objects which always have different sizes: 11[134]: 7/8/8 11[148]: 4/944/8 11[150]: 7/9/9 11[156]: 6/24/8 11[158]: 4/16/8 11[160]: 6/8/8 11[164]: 3/8/8 11[166]: 7/64/8 11[224]: 7/8/8 My questions now are: Does this have any effect on the fragmentation of the tenured space? I assume increasing the maximum would have a positive effect because the small objects are packed more compactly. Is this right? Are there any other negative effects of changing these parameters? Thanks a lot, Flo From taras.tielkes at gmail.com Wed Aug 15 04:49:07 2012 From: taras.tielkes at gmail.com (Taras Tielkes) Date: Wed, 15 Aug 2012 13:49:07 +0200 Subject: Faster card marking: chances for Java 6 backport In-Reply-To: References: <4F91E64D.1070509@oracle.com> <4F95022A.7060103@oracle.com> Message-ID: Hi, Is the patch still scheduled to be integrated in an upcoming Java 7 update release? 
Thanks, Taras On Tue, Apr 24, 2012 at 6:12 AM, Krystal Mok wrote: > Hi Taras, > > I asked something related in an earlier thread, > http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-March/005380.html > Looks like people should put their bet on JDK7 instead of staying on JDK6... > > - Kris > > > On Tue, Apr 24, 2012 at 4:07 AM, Taras Tielkes > wrote: >> >> Hi Bengt, >> >> Thanks for the correction - you're completely right, of course. >> >> To me, the decision process for which performance improvements are >> backported to the previous release stream has never been completely >> clear. >> Given that the change in question seems quite an isolated fix, I >> though it would make sense to ask. >> >> Thanks, >> -tt >> >> On Mon, Apr 23, 2012 at 9:18 AM, Bengt Rutisson >> wrote: >> > >> > Taras, >> > >> > Maybe I'm being a bit picky here, but just to be clear. The change for >> > 7068625 is for faster card scanning - not marking. >> > >> > I agree with Jon, I don't think this will be backported to JDK6 unless >> > there is an explicit customer request to do so. >> > >> > Bengt >> > >> > On 2012-04-21 00:42, Jon Masamitsu wrote: >> >> Taras, >> >> >> >> I haven't heard any discussions about a backport. >> >> I think it's a issue that the sustaining organization would >> >> have to consider (since it's to jdk6). >> >> >> >> Jon >> >> >> >> On 4/20/2012 12:46 PM, Taras Tielkes wrote: >> >>> Hi, >> >>> >> >>> Are there plans to port RFE 7068625 to Java 6? 
>> >>> >> >>> Thanks, >> >>> -tt >> >>> _______________________________________________ >> >>> hotspot-gc-use mailing list >> >>> hotspot-gc-use at openjdk.java.net >> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> >> hotspot-gc-use mailing list >> >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > >> > _______________________________________________ >> > hotspot-gc-use mailing list >> > hotspot-gc-use at openjdk.java.net >> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > From bengt.rutisson at oracle.com Thu Aug 16 00:58:24 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 16 Aug 2012 09:58:24 +0200 Subject: Faster card marking: chances for Java 6 backport In-Reply-To: References: <4F91E64D.1070509@oracle.com> <4F95022A.7060103@oracle.com> Message-ID: <502CA820.4040400@oracle.com> Hi Taras, The patch was integrated just after the 7u4 was branched. The 7u6 and 7u8 releases will be based on the 7u4 branch. So, I think the patch will not be available in JDK 7 until 7u10. Bengt On 2012-08-15 13:49, Taras Tielkes wrote: > Hi, > > Is the patch still scheduled to be integrated in an upcoming Java 7 > update release? > > Thanks, > Taras > > On Tue, Apr 24, 2012 at 6:12 AM, Krystal Mok wrote: >> Hi Taras, >> >> I asked something related in an earlier thread, >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-March/005380.html >> Looks like people should put their bet on JDK7 instead of staying on JDK6... >> >> - Kris >> >> >> On Tue, Apr 24, 2012 at 4:07 AM, Taras Tielkes >> wrote: >>> Hi Bengt, >>> >>> Thanks for the correction - you're completely right, of course. 
>>> >>> To me, the decision process for which performance improvements are >>> backported to the previous release stream has never been completely >>> clear. >>> Given that the change in question seems quite an isolated fix, I >>> though it would make sense to ask. >>> >>> Thanks, >>> -tt >>> >>> On Mon, Apr 23, 2012 at 9:18 AM, Bengt Rutisson >>> wrote: >>>> Taras, >>>> >>>> Maybe I'm being a bit picky here, but just to be clear. The change for >>>> 7068625 is for faster card scanning - not marking. >>>> >>>> I agree with Jon, I don't think this will be backported to JDK6 unless >>>> there is an explicit customer request to do so. >>>> >>>> Bengt >>>> >>>> On 2012-04-21 00:42, Jon Masamitsu wrote: >>>>> Taras, >>>>> >>>>> I haven't heard any discussions about a backport. >>>>> I think it's a issue that the sustaining organization would >>>>> have to consider (since it's to jdk6). >>>>> >>>>> Jon >>>>> >>>>> On 4/20/2012 12:46 PM, Taras Tielkes wrote: >>>>>> Hi, >>>>>> >>>>>> Are there plans to port RFE 7068625 to Java 6? 
>>>>>> >>>>>> Thanks, >>>>>> -tt >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From haim at performize-it.com Fri Aug 17 14:14:08 2012 From: haim at performize-it.com (Haim Yadid) Date: Fri, 17 Aug 2012 23:14:08 +0200 Subject: CMS Concurrent mode failure fallback to the serial old collector? In-Reply-To: References: Message-ID: > I am analysing a GC pause problem and I have noticed that when CMS is used > and a concurrent mode failure occurs or GC is triggered manually (by > System.gc()) the STW collector used does not seem to be parallel. ( I am > aware of the ExplicitGCInvokesConcurrent flag but it will not solve > concurrent failure ). > I tried to play with -XX:ParallelGCThreads=... -XX:ParallelCMSThreads=... > but they seem have no effect (only on the ParNew GC). 
> > I am deducing it from the following GC log line > > 24.904: [Full GC (System) 24.904: [CMS: 302703K->303056K(2116864K), > 1.0847520 secs] 484492K->303056K(2423552K), [CMS Perm : > 7528K->7525K(21248K)], 1.0852780 secs] [Times: user=1.04 sys=0.02, > real=1.09 secs] > If it would have been parallel "user" would have been equal to "nThreads" > * "real". > In addition if I choose ParallelOld GC it will behave correctly. > > I really do not understand why the failover STW mechanism of CMS is not > parallel shouldn't it be finishing the work as soon as possible ? > I am not able to find anything useful on the internet. > > I think G1 behaves in the same manner BTW ( AFAIK the the fallback > collector of G1 is copied from CMS) > > Help will be appreciated. > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120817/171b62b3/attachment.html From haim at performize-it.com Fri Aug 17 14:08:38 2012 From: haim at performize-it.com (Haim Yadid) Date: Fri, 17 Aug 2012 23:08:38 +0200 Subject: CMS Concurrent mode failure fallback to the serial old collector? Message-ID: I am analysing a GC pause problem and I have noticed that when CMS is used and a concurrent mode failure occurs, or a GC is triggered manually (by System.gc()), the STW collector used does not seem to be parallel. (I am aware of the ExplicitGCInvokesConcurrent flag, but it will not help with a concurrent mode failure.) I tried to play with -XX:ParallelGCThreads=... -XX:ParallelCMSThreads=... but they seem to have no effect (they apply only to the ParNew GC). I am deducing it from the following GC log line: 24.904: [Full GC (System) 24.904: [CMS: 302703K->303056K(2116864K), 1.0847520 secs] 484492K->303056K(2423552K), [CMS Perm : 7528K->7525K(21248K)], 1.0852780 secs] [Times: user=1.04 sys=0.02, real=1.09 secs] If it were parallel, "user" would be roughly equal to "nThreads" * "real". 
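The arithmetic behind that deduction can be sketched as a tiny helper (the class and method names here are invented for illustration): for a genuinely parallel pause, (user + sys) / real should approach the number of GC threads, while the quoted log line gives a ratio of roughly 1.

```java
// Estimate effective GC parallelism from the times HotSpot prints at the
// end of a GC log line: [Times: user=... sys=..., real=... secs]
public class GcParallelism {
    static double effectiveThreads(double user, double sys, double real) {
        return (user + sys) / real; // ~1.0 means the pause ran on one core
    }

    public static void main(String[] args) {
        // Values from the Full GC line quoted above: user=1.04 sys=0.02, real=1.09
        double p = effectiveThreads(1.04, 0.02, 1.09);
        System.out.printf("effective parallelism ~ %.2f%n", p); // ~0.97: essentially serial
    }
}
```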
In addition, if I choose the ParallelOld GC, it behaves correctly. I really do not understand why the failover STW mechanism of CMS is not parallel; shouldn't it finish the work as soon as possible? I am not able to find anything useful on the internet. I think G1 behaves in the same manner, BTW (AFAIK the fallback collector of G1 is copied from CMS). Help will be appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120817/2d4654d9/attachment.html From hjohn at xs4all.nl Sat Aug 18 05:06:46 2012 From: hjohn at xs4all.nl (John Hendrikx) Date: Sat, 18 Aug 2012 14:06:46 +0200 Subject: Soft References... are they working as intended? Message-ID: <502F8556.8080701@xs4all.nl> I've come to the conclusion that SoftReferences in the current HotSpot implementation suffer from some problems. I'm running the latest Java 7, with default GC settings and a very modest heap space of 256 MB. On this heap I have on the order of 50-60 large objects that are referenced by SoftReference objects. Each object is a few megabytes in size (they are decoded JPEG images). At any given time, only 10 of these images have strong references to them, totalling no more than 50-60 MB of heap space; the other 200 MB of space is only softly referenced. It is said that SoftReferences are guaranteed to get cleared before heap space runs out, yet in certain extreme circumstances one of the following can happen: 1) 90% of the time, when under high memory pressure (many images loaded and discarded), the VM gets really slow and it seems that some threads get stuck in an infinite loop. What is actually happening is that the GC will run for long periods in a row (up to a few minutes, consuming one CPU core) before the program gets unstuck and finally notices it can clear some SoftReference objects. 
It is possible that the GC has trouble deciding which SoftReferences can be cleared because many of them had (up to a few seconds ago) strong references to them, which themselves may not have been marked as garbage yet. So it recovers, but it is taking so much time to do it that users will think the program is stuck. 2) The rest of the time it actually will throw an out-of-heap-space exception, despite there being SoftReference objects that could have been cleared. This usually happens after a long pause as well. Can anyone confirm that these problems exist, and perhaps advise a course of action? I really don't want to have to second-guess the GC about which images should be discarded, but it looks like I will have no choice but to limit this image cache manually to some reasonable value to avoid the GC getting stuck for long periods. Best regards, John Hendrikx From dhd at exnet.com Sat Aug 18 05:13:20 2012 From: dhd at exnet.com (Damon Hart-Davis) Date: Sat, 18 Aug 2012 13:13:20 +0100 Subject: Soft References... are they working as intended? In-Reply-To: <502F8556.8080701@xs4all.nl> References: <502F8556.8080701@xs4all.nl> Message-ID: <215BC73F-E02D-4E2C-9C0D-C14EA8D7667D@exnet.com> Hi, FWIW I usually combine SoftReferences with some other sort of explicit limit based on heap size to help avert this type of issue, and indeed use a number of different strategies, often involving some explicit LRU management. I can supply code snippets if that would help! B^> Rgds Damon On 18 Aug 2012, at 13:06, John Hendrikx wrote: > I've come to the conclusion that SoftReferences in the current hotspot > implementation are suffering from some problems. > > I'm running the latest Java 7, with default gc settings and a very > modest heap space of 256 MB. > > On this heap I have on the order of 50-60 large objects that are > referenced by SoftReference objects. Each object is a few megabytes in > size (they are decoded JPEG images). 
> > At any given time, only 10 of these images have strong references to > them, totalling no more than 50-60 MB of heap space, the other 200 MB of > space is only soft referenced. > > It is said that SoftReferences are guaranteed to get cleared before heap > space runs out, yet in certain extreme circumstances one of the > following can happen: > > 1) 90% of the time, when under high memory pressure (many images loaded > and discarded), the VM gets really slow and it seems that some threads > get stuck in an infinite loop. What is actually happening is that the > GC will run for long periods in a row (upto a few minutes, consuming one > CPU core) before the program gets unstuck and it finally noticed it can > clear some SoftReference objects. > > It is possible that the GC has trouble deciding which SoftReferences can > be cleared because many of them had (upto a few seconds ago) strong > references to them, which themselves may not have been marked as garbage > yet. > > So it recovers, but it is taking so much time to do it that users will > think the program is stuck. > > 2) The rest of the time it actually will throw an out of heap space > exception, despite there being SoftReference objects that could have > been cleared. This usually happens after a long pause as well. > > Can anyone confirm that these problems exists, and perhaps advice a > course of action? > > I really don't want to have to 2nd guess the GC about which images > should be discarded, but it looks like I will have no choice but to > limit this Image cache manually to some reasonable value to avoid the GC > getting stuck for long periods. 
> > Best regards, > John Hendrikx > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From Andreas.Loew at oracle.com Sat Aug 18 07:36:29 2012 From: Andreas.Loew at oracle.com (Andreas Loew) Date: Sat, 18 Aug 2012 16:36:29 +0200 Subject: Soft References... are they working as intended? In-Reply-To: <215BC73F-E02D-4E2C-9C0D-C14EA8D7667D@exnet.com> References: <502F8556.8080701@xs4all.nl> <215BC73F-E02D-4E2C-9C0D-C14EA8D7667D@exnet.com> Message-ID: <502FA86D.3080604@oracle.com> Hi John, I am a "field guy" and cannot really comment on the latest implementation details in JDK 7, but from what I know about this topic, I can certainly imagine that when, as in your example, SoftReferences start to make up a very large portion of the heap, the currently implemented mechanism will behave poorly and finally fail. Please see mainly http://www.oracle.com/technetwork/java/hotspotfaq-138619.html and especially http://jeremymanson.blogspot.co.uk/2009/07/how-hotspot-decides-to-clear_07.html which explain all the nasty details - I'm going to partly cite both these sources below: You probably know about the -XX:SoftRefLRUPolicyMSPerMB= parameter. Every SoftReference has a timestamp field that is updated when it is accessed (when it is constructed or when the get() method is called). This gives a very coarse ordering over the SoftReferences; the timestamp indicates the time of the last GC before they were accessed. Whenever a garbage collection occurs (and only then), the decision to clear a SoftReference is based on two factors: 1. how old the reference's timestamp is, and 2. how much free space there is in memory. In my experience, this coarsely means that a soft reference will survive (after the last strong reference to the object has been collected!) 
for SoftRefLRUPolicyMSPerMB milliseconds times the number of megabytes of current free space in the heap. The default is 1000 ms per MB, so if an object is only softly reachable it will stay alive for 1 s when only 1 MB of heap space is free - provided that garbage collections run frequently enough to check this condition (!!!). Also, the HotSpot Server VM uses the maximum possible heap size (as set with the -Xmx option) to calculate the current free space remaining, while the Client VM uses the current actual heap size to calculate the free space (!). "This means that the general tendency is for the Server VM to grow the heap rather than flush soft references, and -Xmx therefore has a significant effect on when soft references are garbage collected. On the other hand, the Client VM will have a greater tendency to flush soft references rather than grow the heap." And also - this is probably what affects you most severely: "One thing to notice about this is that it implies that SoftReferences will always be kept for at least one GC after their last access. Why is that? Well, for the interval, we are using the clock value of the last garbage collection, not the current one. As a result, if a SoftReference has been accessed since the last garbage collection, it will have the same timestamp as that garbage collection, and the interval will be 0. 0 <= free_heap * 1000 for any amount of free_heap, so any SoftReference accessed since the last garbage collection is guaranteed to be kept." The big hidden pitfall is that if the objects held via SoftReferences were too big to be allocated in the young generation (which, in my understanding, is true in your example), the above does not refer to the most recent minor GC, but to the most recent old-generation (i.e. full) GC (!!!). So for your case, as quoted below, please check the following conditions: * What version of the JVM are you using? * If using the server VM, do you use equal -Xms and -Xmx values? 
* Are your "decoded JPEG images" directly being allocated into the old generation (which I assume to be true)? * And finally - looking at the general frequency of the relevant type of GC in your scenario, did you access the soft-referenced objects since the last (in your scenario probably: full) GC when you see everything getting stuck or an OOME? Hope this helps & best regards, Andreas On 18.08.2012 at 14:13, Damon Hart-Davis wrote: > Hi, > > FWIW I usually combine SoftReferences with some other sort of explicit limit based on heap size to help avert this type of issue, and indeed use a number of different strategies, often involving some explicit LRU management. > > I can supply code snippets if that would help! B^> > > Rgds > > Damon > > > On 18 Aug 2012, at 13:06, John Hendrikx wrote: > >> I've come to the conclusion that SoftReferences in the current hotspot >> implementation are suffering from some problems. >> >> I'm running the latest Java 7, with default gc settings and a very >> modest heap space of 256 MB. >> >> On this heap I have on the order of 50-60 large objects that are >> referenced by SoftReference objects. Each object is a few megabytes in >> size (they are decoded JPEG images). >> >> At any given time, only 10 of these images have strong references to >> them, totalling no more than 50-60 MB of heap space, the other 200 MB of >> space is only soft referenced. >> >> It is said that SoftReferences are guaranteed to get cleared before heap >> space runs out, yet in certain extreme circumstances one of the >> following can happen: >> >> 1) 90% of the time, when under high memory pressure (many images loaded >> and discarded), the VM gets really slow and it seems that some threads >> get stuck in an infinite loop. What is actually happening is that the >> GC will run for long periods in a row (upto a few minutes, consuming one >> CPU core) before the program gets unstuck and it finally noticed it can >> clear some SoftReference objects. 
>> >> It is possible that the GC has trouble deciding which SoftReferences can >> be cleared because many of them had (upto a few seconds ago) strong >> references to them, which themselves may not have been marked as garbage >> yet. >> >> So it recovers, but it is taking so much time to do it that users will >> think the program is stuck. >> >> 2) The rest of the time it actually will throw an out of heap space >> exception, despite there being SoftReference objects that could have >> been cleared. This usually happens after a long pause as well. >> >> Can anyone confirm that these problems exists, and perhaps advice a >> course of action? >> >> I really don't want to have to 2nd guess the GC about which images >> should be discarded, but it looks like I will have no choice but to >> limit this Image cache manually to some reasonable value to avoid the GC >> getting stuck for long periods. >> >> Best regards, >> John Hendrikx -- Andreas Loew | Senior Java Architect ACS Principal Service Delivery Engineer ORACLE Germany From haim at performize-it.com Mon Aug 20 03:29:24 2012 From: haim at performize-it.com (Haim Yadid) Date: Mon, 20 Aug 2012 13:29:24 +0300 Subject: What is the logic that G1GC follows when triggering young/mixed/full GC Message-ID: I am evaluating G1GC as a candidate to solve the GC pauses of an application with a 20 GB heap. From time to time G1 issues a full GC, which leads to a long pause (at least 10 seconds). In addition, from time to time it triggers mixed collections, which then breach the soft real-time requirement I set (100 ms max pause time). Why does this happen? What are the reasons for G1 to initiate a full GC? Why are the mixed-mode collections so long? 
Attaching part of the log 2012-08-19T23:53:26.414+0000: 36846.868: [GC pause (young), 0.99116700 secs] [Parallel Time: 957.3 ms] [GC Worker Start (ms): 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846869.0 36846869.0 Avg: 36846868.9, Min: 36846868.9, Max: 36846869.0, Diff: 0.1] [Ext Root Scanning (ms): 1.7 1.5 1.9 2.2 1.8 1.7 1.7 1.6 1.5 1.7 1.7 1.4 1.9 Avg: 1.7, Min: 1.4, Max: 2.2, Diff: 0.8] [Update RS (ms): 9.8 9.8 9.9 9.3 9.6 10.0 10.4 10.0 10.7 11.4 11.0 10.1 9.6 Avg: 10.1, Min: 9.3, Max: 11.4, Diff: 2.1] [Processed Buffers : 11 13 13 15 18 10 20 11 12 13 11 12 9 Sum: 168, Avg: 12, Min: 9, Max: 20, Diff: 11] [Scan RS (ms): 67.9 67.8 67.5 67.8 67.7 67.6 67.2 67.6 67.0 66.0 66.5 67.9 67.7 Avg: 67.4, Min: 66.0, Max: 67.9, Diff: 1.9] [Object Copy (ms): 875.3 875.3 876.5 875.4 875.6 875.3 875.5 875.3 875.4 875.6 875.7 875.2 875.4 Avg: 875.5, Min: 875.2, Max: 876.5, Diff: 1.2] [Termination (ms): 1.1 1.1 0.0 1.0 1.0 1.2 0.9 1.2 1.1 1.0 0.9 1.1 1.2 Avg: 1.0, Min: 0.0, Max: 1.2, Diff: 1.2] [Termination Attempts : 1974 1909 1 1892 1665 2133 1751 2073 2045 1701 1756 2077 2158 Sum: 23135, Avg: 1779, Min: 1, Max: 2158, Diff: 2157] [GC Worker End (ms): 36847824.7 36847824.8 36847824.8 36847824.8 36847824.7 36847824.8 36847824.8 36847824.7 36847824.8 36847824.8 36847824.8 36847824.7 36847824.8 Avg: 36847824.8, Min: 36847824.7, Max: 36847824.8, Diff: 0.1] [GC Worker (ms): 955.9 955.9 955.9 955.9 955.8 955.9 955.9 955.8 955.8 955.9 955.9 955.8 955.8 Avg: 955.9, Min: 955.8, Max: 955.9, Diff: 0.2] [GC Worker Other (ms): 1.5 1.8 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.8, Diff: 0.3] [Clear CT: 1.7 ms] [Other: 32.1 ms] [Choose CSet: 0.3 ms] [Ref Proc: 0.7 ms] [Ref Enq: 0.0 ms] [Free CSet: 12.4 ms] [Eden: 3776M(3776M)->0B(3306M) Survivors: 4096K->474M Heap: 8949M(18908M)->6839M(18908M)] [Times: user=12.45 sys=0.03, real=0.99 secs] 2012-08-19T23:59:23.636+0000: 37204.090: [GC 
pause (mixed), 1.48753900 secs] [Parallel Time: 1439.9 ms] [GC Worker Start (ms): 37204093.5 37204093.5 37204093.5 37204093.5 37204093.5 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 Avg: 37204093.6, Min: 37204093.5, Max: 37204093.6, Diff: 0.1] [Ext Root Scanning (ms): 1.5 2.0 1.9 1.4 1.6 1.6 1.4 1.5 1.5 1.6 1.4 1.6 1.4 Avg: 1.6, Min: 1.4, Max: 2.0, Diff: 0.6] [Update RS (ms): 14.7 12.0 11.5 12.4 11.5 11.6 11.9 11.5 15.5 11.8 11.9 12.8 11.8 Avg: 12.4, Min: 11.5, Max: 15.5, Diff: 4.0] [Processed Buffers : 10 7 7 8 21 7 7 32 8 7 8 7 8 Sum: 137, Avg: 10, Min: 7, Max: 32, Diff: 25] [Scan RS (ms): 144.8 146.8 147.2 147.0 147.9 147.6 147.3 147.9 143.9 147.6 147.5 146.1 147.6 Avg: 146.9, Min: 143.9, Max: 147.9, Diff: 4.0] [Object Copy (ms): 1271.4 1271.8 1273.0 1271.7 1271.7 1271.6 1271.8 1271.6 1272.4 1277.3 1271.6 1273.1 1273.0 Avg: 1272.5, Min: 1271.4, Max: 1277.3, Diff: 5.8] [Termination (ms): 6.0 5.8 4.7 5.9 5.5 5.9 5.8 5.6 4.9 0.0 5.9 4.4 4.4 Avg: 5.0, Min: 0.0, Max: 6.0, Diff: 6.0] [Termination Attempts : 8002 7759 6378 7864 7158 7929 7776 7293 6256 1 7901 5835 5842 Sum: 85994, Avg: 6614, Min: 1, Max: 8002, Diff: 8001] [GC Worker End (ms): 37205531.9 37205531.9 37205531.9 37205531.9 37205532.0 37205531.9 37205531.9 37205531.9 37205532.0 37205532.0 37205531.9 37205532.0 37205532.0 Avg: 37205531.9, Min: 37205531.9, Max: 37205532.0, Diff: 0.1] [GC Worker (ms): 1438.4 1438.4 1438.3 1438.4 1438.4 1438.3 1438.3 1438.4 1438.4 1438.4 1438.3 1438.4 1438.4 Avg: 1438.4, Min: 1438.3, Max: 1438.4, Diff: 0.1] [GC Worker Other (ms): 1.6 1.6 1.6 1.6 1.6 1.6 1.7 1.6 1.7 1.7 1.7 1.9 1.7 Avg: 1.6, Min: 1.6, Max: 1.9, Diff: 0.3] [Clear CT: 2.1 ms] [Other: 45.6 ms] [Choose CSet: 2.3 ms] [Ref Proc: 0.6 ms] [Ref Enq: 0.0 ms] [Free CSet: 15.8 ms] [Eden: 3306M(3306M)->0B(3532M) Survivors: 474M->474M Heap: 13220M(20036M)->10771M(20036M)] [Times: user=18.47 sys=0.10, real=1.49 secs] 2012-08-19T23:59:31.632+0000: 37212.087: [GC pause (mixed), 
0.79556800 secs] [Parallel Time: 766.4 ms] [GC Worker Start (ms): 37212091.2 37212091.2 37212091.2 37212091.2 37212091.2 37212091.2 37212091.2 37212091.3 37212091.3 37212091.3 37212091.3 37212091.3 37212091.3 Avg: 37212091.3, Min: 37212091.2, Max: 37212091.3, Diff: 0.1] [Ext Root Scanning (ms): 1.5 2.1 1.4 1.5 1.4 2.0 1.6 1.5 2.0 1.5 1.6 1.1 1.4 Avg: 1.6, Min: 1.1, Max: 2.1, Diff: 1.0] [Update RS (ms): 15.3 14.6 15.4 18.6 15.4 14.6 15.1 15.3 14.6 15.4 15.1 15.2 15.5 Avg: 15.4, Min: 14.6, Max: 18.6, Diff: 4.0] [Processed Buffers : 40 30 31 40 34 30 31 35 30 38 32 38 30 Sum: 439, Avg: 33, Min: 30, Max: 40, Diff: 10] [Scan RS (ms): 70.4 70.4 70.4 67.0 70.4 70.4 70.3 70.4 70.4 70.3 70.3 70.5 70.1 Avg: 70.1, Min: 67.0, Max: 70.5, Diff: 3.4] [Object Copy (ms): 671.5 670.5 670.6 670.5 670.5 670.8 670.6 670.5 670.6 671.0 670.9 677.6 670.8 Avg: 671.3, Min: 670.5, Max: 677.6, Diff: 7.1] [Termination (ms): 6.2 7.3 7.2 7.2 7.1 7.1 7.2 7.1 7.2 6.6 6.9 0.0 7.0 Avg: 6.5, Min: 0.0, Max: 7.3, Diff: 7.3] [Termination Attempts : 7855 8294 8154 8366 8234 8178 8029 7990 8054 8147 7594 1 8133 Sum: 97029, Avg: 7463, Min: 1, Max: 8366, Diff: 8365] [GC Worker End (ms): 37212856.1 37212856.1 37212856.1 37212856.2 37212856.2 37212856.1 37212856.2 37212856.1 37212856.2 37212856.2 37212856.1 37212856.2 37212856.2 Avg: 37212856.2, Min: 37212856.1, Max: 37212856.2, Diff: 0.1] [GC Worker (ms): 764.9 764.9 764.9 764.9 765.0 764.9 764.9 764.9 764.9 764.9 764.8 764.9 764.9 Avg: 764.9, Min: 764.8, Max: 765.0, Diff: 0.1] [GC Worker Other (ms): 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 2.1 1.6 Avg: 1.6, Min: 1.5, Max: 2.1, Diff: 0.5] [Clear CT: 2.5 ms] [Other: 26.6 ms] [Choose CSet: 2.7 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 10.1 ms] [Eden: 1318M(3532M)->0B(3822M) Survivors: 474M->266M Heap: 13077M(20444M)->11364M(20444M)] [Times: user=10.01 sys=0.00, real=0.79 secs] 2012-08-19T23:59:47.406+0000: 37227.860: [GC pause (mixed), 2.20691100 secs] [Parallel Time: 2151.1 ms] [GC Worker Start 
(ms): 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.5 37227867.5 37227867.5 37227867.5 37227867.5 Avg: 37227867.4, Min: 37227867.4, Max: 37227867.5, Diff: 0.1] [Ext Root Scanning (ms): 1.5 1.9 1.5 1.6 2.2 1.6 1.2 1.6 2.1 1.3 1.4 1.3 1.4 Avg: 1.6, Min: 1.2, Max: 2.2, Diff: 0.9] [Update RS (ms): 19.2 17.1 19.3 19.6 17.4 20.7 17.7 18.0 19.8 17.9 17.7 17.7 21.3 Avg: 18.7, Min: 17.1, Max: 21.3, Diff: 4.1] [Processed Buffers : 9 7 8 5 9 6 11 9 7 8 8 8 7 Sum: 102, Avg: 7, Min: 5, Max: 11, Diff: 6] [Scan RS (ms): 1384.6 1394.3 1384.9 1395.4 1396.9 1380.9 1386.4 1395.6 1395.8 1384.9 1399.2 1401.8 1387.4 Avg: 1391.4, Min: 1380.9, Max: 1401.8, Diff: 20.8] [Object Copy (ms): 740.2 731.6 739.9 728.2 728.2 741.4 744.2 729.6 727.0 740.8 726.4 723.9 734.7 Avg: 733.5, Min: 723.9, Max: 744.2, Diff: 20.3] [Termination (ms): 4.0 4.6 3.9 4.8 4.8 4.9 0.0 4.8 4.7 4.6 4.9 4.8 4.8 Avg: 4.3, Min: 0.0, Max: 4.9, Diff: 4.9] [Termination Attempts : 5659 7036 5875 6888 6896 6924 1 7335 6824 7011 7219 7016 7148 Sum: 81832, Avg: 6294, Min: 1, Max: 7335, Diff: 7334] [GC Worker End (ms): 37230017.1 37230017.0 37230017.0 37230017.1 37230017.1 37230017.1 37230017.1 37230017.1 37230017.0 37230017.0 37230017.0 37230017.0 37230017.1 Avg: 37230017.1, Min: 37230017.0, Max: 37230017.1, Diff: 0.1] [GC Worker (ms): 2149.7 2149.6 2149.6 2149.7 2149.6 2149.6 2149.7 2149.6 2149.6 2149.5 2149.6 2149.6 2149.6 Avg: 2149.6, Min: 2149.5, Max: 2149.7, Diff: 0.2] [GC Worker Other (ms): 1.7 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.7 Avg: 1.6, Min: 1.6, Max: 1.7, Diff: 0.1] [Clear CT: 3.2 ms] [Other: 52.6 ms] [Choose CSet: 6.4 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 16.7 ms] [Eden: 3152M(3822M)->0B(3912M) Survivors: 266M->176M Heap: 14601M(20444M)->11309M(20444M)] [Times: user=28.03 sys=0.01, real=2.21 secs] 2012-08-19T23:59:49.615+0000: 37230.069: [Full GC 11309M->2021M(7604M), 10.4855330 secs] [Times: user=16.90 sys=0.55, real=10.48 secs] 
2012-08-20T00:00:07.434+0000: 37247.889: [GC pause (young), 0.24075000 secs] [Parallel Time: 230.3 ms] [GC Worker Start (ms): 37247889.7 37247889.7 37247889.7 37247889.7 37247889.7 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 Avg: 37247889.8, Min: 37247889.7, Max: 37247889.8, Diff: 0.1] [Ext Root Scanning (ms): 2.3 2.4 2.2 2.3 2.6 2.4 2.9 2.3 2.1 2.4 2.1 2.2 2.3 Avg: 2.3, Min: 2.1, Max: 2.9, Diff: 0.8] [Update RS (ms): 10.0 10.5 9.6 9.4 9.2 8.8 8.1 9.2 11.8 11.2 9.7 9.3 9.1 Avg: 9.7, Min: 8.1, Max: 11.8, Diff: 3.7] [Processed Buffers : 8 8 9 7 9 16 12 14 7 8 7 8 8 Sum: 121, Avg: 9, Min: 7, Max: 16, Diff: 9] [Scan RS (ms): 12.1 11.6 12.6 12.6 12.6 13.1 13.1 13.0 10.5 10.8 12.6 12.8 13.1 Avg: 12.4, Min: 10.5, Max: 13.1, Diff: 2.6] [Object Copy (ms): 204.4 204.3 204.3 204.4 204.4 204.4 204.6 204.3 204.3 204.3 204.3 204.3 204.2 Avg: 204.3, Min: 204.2, Max: 204.6, Diff: 0.4] [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0] [Termination Attempts : 3 2 3 2 3 3 2 3 1 3 1 3 2 Sum: 31, Avg: 2, Min: 1, Max: 3, Diff: 2] [GC Worker End (ms): 37248118.6 37248118.6 37248118.6 37248118.6 37248118.6 37248118.5 37248118.5 37248118.6 37248118.6 37248118.6 37248118.6 37248118.6 37248118.6 Avg: 37248118.6, Min: 37248118.5, Max: 37248118.6, Diff: 0.1] [GC Worker (ms): 228.9 228.9 228.8 228.8 228.9 228.8 228.8 228.9 228.8 228.8 228.8 228.8 228.8 Avg: 228.8, Min: 228.8, Max: 228.9, Diff: 0.1] [GC Worker Other (ms): 1.5 1.5 1.6 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.6, Diff: 0.1] [Clear CT: 0.9 ms] [Other: 9.5 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 4.2 ms] [Eden: 1520M(3912M)->0B(1330M) Survivors: 0B->190M Heap: 3945M(7604M)->2847M(7604M)] [Times: user=2.99 sys=0.00, real=0.24 secs] 2012-08-20T00:00:16.250+0000: 37256.705: [GC pause (young), 0.19628600 secs] [Parallel Time: 187.5 ms] [GC Worker Start (ms): 37256705.1 
37256705.1 37256705.1 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 Avg: 37256705.2, Min: 37256705.1, Max: 37256705.2, Diff: 0.1] [Ext Root Scanning (ms): 1.4 1.4 1.5 1.6 1.6 1.5 1.6 1.4 2.1 1.2 1.9 1.4 1.3 Avg: 1.5, Min: 1.2, Max: 2.1, Diff: 0.9] [Update RS (ms): 4.9 5.2 5.2 4.9 4.9 5.1 4.9 4.9 4.4 5.3 4.6 5.2 5.1 Avg: 5.0, Min: 4.4, Max: 5.3, Diff: 0.9] [Processed Buffers : 8 9 9 9 12 11 8 8 11 10 8 13 9 Sum: 125, Avg: 9, Min: 8, Max: 13, Diff: 5] [Scan RS (ms): 14.5 14.5 14.4 14.5 14.6 14.3 14.5 14.6 14.7 14.5 14.3 14.3 14.6 Avg: 14.5, Min: 14.3, Max: 14.7, Diff: 0.4] [Object Copy (ms): 165.0 164.9 164.8 164.8 164.8 164.9 164.8 164.8 164.5 164.7 164.9 164.8 164.6 Avg: 164.8, Min: 164.5, Max: 165.0, Diff: 0.5] [Termination (ms): 0.1 0.0 0.2 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.1 0.2 Avg: 0.1, Min: 0.0, Max: 0.2, Diff: 0.2] [Termination Attempts : 299 1 263 259 188 185 247 256 307 311 318 270 304 Sum: 3208, Avg: 246, Min: 1, Max: 318, Diff: 317] [GC Worker End (ms): 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 37256891.1 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 Avg: 37256891.2, Min: 37256891.1, Max: 37256891.2, Diff: 0.1] [GC Worker (ms): 186.1 186.0 186.1 186.1 186.1 186.0 186.0 186.0 186.0 186.0 186.0 186.0 185.9 Avg: 186.0, Min: 185.9, Max: 186.1, Diff: 0.1] [GC Worker Other (ms): 1.5 1.6 1.5 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.6, Diff: 0.1] [Clear CT: 0.8 ms] [Other: 8.0 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 4.0 ms] [Eden: 1330M(1330M)->0B(1332M) Survivors: 190M->188M Heap: 4487M(7604M)->3343M(7604M)] [Times: user=2.43 sys=0.00, real=0.20 secs] 2012-08-20T00:00:27.180+0000: 37267.635: [GC pause (young), 0.14220300 secs] [Parallel Time: 134.8 ms] [GC Worker Start (ms): 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 
37267634.8 37267634.8 Avg: 37267634.8, Min: 37267634.8, Max: 37267634.8, Diff: 0.1] [Ext Root Scanning (ms): 2.2 1.7 1.6 1.6 1.4 1.5 1.3 1.4 1.6 1.6 1.9 1.6 1.3 Avg: 1.6, Min: 1.3, Max: 2.2, Diff: 0.9] [Update RS (ms): 14.8 15.0 14.9 15.2 15.5 15.1 15.5 15.5 15.2 15.0 15.0 16.1 15.2 Avg: 15.2, Min: 14.8, Max: 16.1, Diff: 1.3] [Processed Buffers : 8 12 13 13 11 5 11 9 8 8 10 19 11 Sum: 138, Avg: 10, Min: 5, Max: 19, Diff: 14] [Scan RS (ms): 0.7 0.7 0.9 0.8 0.7 0.9 0.6 0.5 0.7 0.9 0.5 0.0 0.9 Avg: 0.7, Min: 0.0, Max: 0.9, Diff: 0.9] [Object Copy (ms): 115.5 115.6 115.6 115.3 115.3 115.4 115.5 115.4 115.5 115.3 115.4 115.2 115.4 Avg: 115.4, Min: 115.2, Max: 115.6, Diff: 0.4] [Termination (ms): 0.0 0.3 0.2 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 Avg: 0.3, Min: 0.0, Max: 0.3, Diff: 0.3] [Termination Attempts : 1 471 475 570 615 560 605 595 464 532 616 469 593 Sum: 6566, Avg: 505, Min: 1, Max: 616, Diff: 615] [GC Worker End (ms): 37267768.0 37267768.0 37267768.1 37267768.1 37267768.1 37267768.1 37267768.1 37267768.0 37267768.1 37267768.1 37267768.1 37267768.1 37267768.1 Avg: 37267768.1, Min: 37267768.0, Max: 37267768.1, Diff: 0.1] [GC Worker (ms): 133.3 133.3 133.4 133.3 133.3 133.3 133.3 133.2 133.3 133.3 133.2 133.2 133.3 Avg: 133.3, Min: 133.2, Max: 133.4, Diff: 0.1] [GC Worker Other (ms): 1.5 1.5 1.5 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.6, Diff: 0.1] [Clear CT: 0.4 ms] [Other: 7.0 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 3.6 ms] [Eden: 1332M(1332M)->0B(1482M) Survivors: 188M->38M Heap: 4675M(7604M)->3370M(7604M)] [Times: user=1.76 sys=0.00, real=0.14 secs] -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120820/a86c4623/attachment-0001.html From jon.masamitsu at oracle.com Mon Aug 20 09:16:59 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 20 Aug 2012 09:16:59 -0700 Subject: CMS Concurrent mode failure fallback to the serial old collector? In-Reply-To: References: Message-ID: <503262FB.5020900@oracle.com> On 08/17/12 14:08, Haim Yadid wrote: > I am analysing a GC pause problem and I have noticed that when CMS is > used and a concurrent mode failure occurs or GC is triggered manually > (by System.gc()) the STW collector used does not seem to be parallel. > (I am aware of the ExplicitGCInvokesConcurrent flag, but it will not > solve concurrent mode failure.) > I tried to play with -XX:ParallelGCThreads=... > -XX:ParallelCMSThreads=... but they seem to have no effect (only on the > ParNew GC). > > I am deducing it from the following GC log line: > > 24.904: [Full GC (System) 24.904: [CMS: 302703K->303056K(2116864K), > 1.0847520 secs] 484492K->303056K(2423552K), [CMS Perm : > 7528K->7525K(21248K)], 1.0852780 secs] [Times: user=1.04 sys=0.02, > real=1.09 secs] > If it had been parallel, "user" would have been equal to > "nThreads" * "real". > In addition, if I choose the parallel old GC it behaves correctly. > > I really do not understand why the failover STW mechanism of CMS is > not parallel - shouldn't it be finishing the work as soon as possible? > I am not able to find anything useful on the internet. You are correct that the concurrent mode failure does a full GC serially. The parallel old collector used for UseParallelGC/UseParallelOldGC was never ported to CMS. Because of differences between UseParallelGC and CMS, it is more work than we had expected. > > I think G1 behaves in the same manner BTW (AFAIK the fallback > collector of G1 is copied from CMS) Yes, G1 behaves the same. 
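Haim's user-vs-real deduction quoted above generalizes into a quick log sanity check: a ratio near 1 points to a single-threaded stop-the-world phase, a ratio near the worker count to a parallel one. The parser below assumes only the [Times: ...] format shown in this thread; the class name is made up:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Throwaway helper: estimate GC parallelism from a HotSpot "[Times: ...]" line.
public class TimesRatio {
    private static final Pattern TIMES =
        Pattern.compile("user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+)");

    /** Returns user/real: ~1.0 suggests a serial STW phase, >>1 a parallel one. */
    static double parallelism(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) throw new IllegalArgumentException("no [Times: ...] found");
        return Double.parseDouble(m.group(1)) / Double.parseDouble(m.group(3));
    }

    public static void main(String[] args) {
        // CMS concurrent-mode-failure full GC from this thread: user ~= real => serial.
        System.out.println(parallelism(
            "[Times: user=1.04 sys=0.02, real=1.09 secs]"));   // ~0.95
        // G1 mixed pause from the earlier log: user >> real => parallel workers.
        System.out.println(parallelism(
            "[Times: user=18.47 sys=0.10, real=1.49 secs]"));  // ~12.4
    }
}
```

Run over a whole log, this quickly separates the parallel young/mixed pauses from the serial full-GC fallbacks Jon describes.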
G1 will not use the UseParallelGC implementation for a parallel full collection, but will implement one in line with G1's design. Currently the G1 team has been focusing on better policies for achieving pause goals and avoiding full collections. Last I heard, there was at least some work to be done for class unloading (JEP 156) before the parallel full collection. Jon > > Help will be appreciated. > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120820/59be1d20/attachment.html From haim at performize-it.com Mon Aug 20 13:39:47 2012 From: haim at performize-it.com (Haim Yadid) Date: Mon, 20 Aug 2012 22:39:47 +0200 Subject: hotspot-gc-use Digest, Vol 54, Issue 8 In-Reply-To: References: Message-ID: Thanks, Jon. That's a pity, since a CMS full GC is unavoidable, and when it happens we will experience an unacceptable pause. I tried G1 as well; in theory G1GC should not have pauses longer than the soft real-time goal. In practice, however (as you can see from another question I posted), G1 does have long pauses from time to time, and in the application I am tuning it is much worse than CMS. -- Haim Yadid | Performization Expert Performize-IT | t +972-54-7777132 www.performize-it.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120820/9a6f7afe/attachment.html From bernd.eckenfels at googlemail.com Tue Aug 21 13:06:53 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Tue, 21 Aug 2012 22:06:53 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: Y. S. 
Ramakrishna wrote: >> An alternative workaround that might also work >> for you would be -XX:CMSWaitDuration=X > That should have been: -XX:CMSMaxAbortablePrecleanTime=X I have a question on this: can -XX:+CMSScavengeBeforeRemark be combined with -XX:CMSMaxAbortablePrecleanTime=ms? Because in my scenario the scavenge interval is rather large (100-200 s), I don't want to use 400,000 ms for MaxAbortablePrecleanTime. The same is true for the initial mark: when I use CMSWaitDuration, can I specify whether it should trigger a scavenge when the timeout is reached? Or would it be OK to specify 400 s? Greetings Bernd PS: in my case I know I need to resize the generations to get more frequent scavenges; however, it is hard to push that into production on that particular system. From bernd.eckenfels at googlemail.com Tue Aug 21 13:24:41 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Tue, 21 Aug 2012 22:24:41 +0200 Subject: PrintGCDate/TimeStamps Message-ID: Hello, I am wondering about some strange behaviour. Sorry to bother you with this minor observation, but I do hope the datestamp option gets more popular by mentioning it again :) Often +PrintGCTimeStamps is recommended to be able to see the interval between GC events. However, I found some places talking about +PrintGCDateStamps, which is much more convenient for some problems (for example correlating SLA violations with the STW pause times). Some discussions suggest that you can combine both in a way that does not print both timestamps: -XX:+PrintGCDateStamps -XX:-PrintGCTimeStamps However, on the Windows 64-bit JDKs 1.6.0_33 and 1.7.0_03 where I have tried this, it does not work (i.e. 
the logfiles always contain both; in fact the timestamp typically appears twice): 2012-08-21T22:08:29.989+0200: 1.292: [GC 1.292: [ParNew: 5033216K->3826K(5662336K), 0.0015120 secs] 5033216K->3826K(6710912K), 0.0015790 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] Total time for which application threads were stopped: 0.0018006 seconds 2012-08-21T22:08:30.605+0200: 1.907: [GC 1.907: [ParNew: 5037042K->5230K(5662336K), 0.0023563 secs] 5037042K->5230K(6710912K), 0.0024133 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] Total time for which application threads were stopped: 0.0026457 seconds Just some additional information: I was able to turn off the timestamps in the HotSpotDiagnostics MBean: 2012-08-21T22:19:54.858+0200: [GC [ParNew: 5039624K->5836K(5662336K), 0.0030131 secs] 5349974K->316537K(6710912K), 0.0030753 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] Greetings Bernd -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120821/3faaa333/attachment.html From rednaxelafx at gmail.com Tue Aug 21 16:52:46 2012 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 22 Aug 2012 07:52:46 +0800 Subject: PrintGCDate/TimeStamps In-Reply-To: References: Message-ID: Hi Bernd, You're probably using those two flags along with -Xloggc:filename. This flag implies -XX:+PrintGCTimeStamps, which in your case is something you're trying to get rid of. There's an easy workaround to this: put -XX:-PrintGCTimeStamps *after* -Xloggc. The way HotSpot VM's argument processing works, if a VM flag is specified multiple times, then the one that comes last is the one used. This includes flags that are set implicitly. There are different types of VM flags. "manageable" flags are ones that can be changed at runtime, via the HotSpotDiagnostic MBean. PrintGCTimeStamps is a manageable flag. 
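Kris's "last one wins" rule makes the fix a pure matter of ordering. A launch line like the following (the application jar and log path are placeholders) should yield date stamps only, because the explicit minus flag is processed after the -Xloggc option that implicitly enabled PrintGCTimeStamps:

```shell
# MyApp.jar and gc.log are placeholders for your own application and log path.
# -Xloggc implies -XX:+PrintGCTimeStamps, so the disabling flag must come
# AFTER it: in HotSpot's argument processing, the last setting of a flag wins.
java -Xloggc:gc.log \
     -XX:+PrintGCDateStamps \
     -XX:-PrintGCTimeStamps \
     -jar MyApp.jar
```

With the two -XX flags placed before -Xloggc instead, the implicit +PrintGCTimeStamps would be processed last and win, reproducing the doubled timestamps shown in the log excerpt above.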
HTH, Kris On Wed, Aug 22, 2012 at 4:24 AM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Hello, > > I wondering about some strange behaviour. Sorry to bother with this minor > observation, but I do hope the datestamp ioption gets more popular by > mention it again .) > > Often +PrintGCTimeStamps is recommended to be able to see the intervall > between GC events. However I found some places talking about > +PrintGCDateStamps which is much more convenient for some problems (for > example correlating SLA violations with the STW pause times). > > Some discusions suggest that you can combine both in a way that it does > not print both timestamps: > > -XX:+PrintGCDateStamps -XX:-PrintGCTimeStamps > > However on the win 64bit JDKs 1.6.0_33 and 1.7.0_03 I have tried this, it > does not work (i.e. the logfiles always contans both, actually the > timestamp typically 2 times): > > 2012-08-21T22:08:29.989+0200: 1.292: [GC 1.292: [ParNew: > 5033216K->3826K(5662336K), 0.0015120 secs] 5033216K->3826K(6710912K), > 0.0015790 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > Total time for which application threads were stopped: 0.0018006 seconds > 2012-08-21T22:08:30.605+0200: 1.907: [GC 1.907: [ParNew: > 5037042K->5230K(5662336K), 0.0023563 secs] 5037042K->5230K(6710912K), > 0.0024133 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > Total time for which application threads were stopped: 0.0026457 seconds > > Just some additional information: I was able to tuen off the timestamps in > the HotSpotDiagnostics MBean: > > 2012-08-21T22:19:54.858+0200: [GC [ParNew: 5039624K->5836K(5662336K), > 0.0030131 secs] 5349974K->316537K(6710912K), 0.0030753 secs] [Times: > user=0.00 sys=0.00, real=0.00 secs] > > Greetings > Bernd > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was 
scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120822/d50e5666/attachment.html From bernd.eckenfels at googlemail.com Tue Aug 21 19:37:33 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Wed, 22 Aug 2012 04:37:33 +0200 Subject: PrintGCDate/TimeStamps In-Reply-To: References: Message-ID: > You're probably using those two flags along with -Xloggc:filename. This flag > implies -XX:+PrintGCTimeStamps, which in your case is something you're > trying to get rid of. Yes, correct. I noticed it after posting the question because at first I was not able to reproduce it. Thanks for the confirmation, I was expecting something like that (but not necessarily with the Xloggc option :) Greetings Bernd From ysr1729 at gmail.com Tue Aug 21 19:41:20 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 21 Aug 2012 19:41:20 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: Hi Bernd -- I've unfortunately forgotten the full history of this exchange (at least my mailer has, and my own cache is nowadays oversubscribed and prone to evicting objects somewhat too rapidly), so I'll answer only the questions in the email:- On Tue, Aug 21, 2012 at 1:06 PM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Y. S. Ramakrishna wrote: > >> An alternative workaround that might also work > >> for you would be -XX:CMSWaitDuration=X > > That should have been: -XX:CMSMaxAbortablePrecleanTime=X > > I have a question on this, can the -XX:+CMSScavengeBeforeRemark be > combined with -XX:CMSMaxAbortablePrecleanTime=ms? > Yes, they can be combined. > > Because in my scenario the Scavenger Intervall is rather large > (100-200s), so I dont want to use 400.000ms in the > MaxAbortablePrecleanTime. > OK. > > The same is true for the initial-mark: when I use CMSWaitDuration, can > I specify if should trigger a scavenger when timeout is reached?
Or > would it be OK to specify 400s? > Initial mark is typically scheduled immediately after a scavenge, so no timeout specification should be necessary. Perhaps I misunderstood your question and maybe you can elaborate a bit more on what you want to achieve? > > Greetings > Bernd > > PS: in my case I know I need to resize the generations to get more > frequent scavenger runs, however it is hard to push that into > production on that particular system. > Yes, I am somewhat painfully aware of the limitations of tuning around CMS' various shortcomings and am looking forward to G1 being a panacea for those problems :-) -- ramki -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120821/e2200ce5/attachment.html From bernd.eckenfels at googlemail.com Tue Aug 21 21:14:35 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Wed, 22 Aug 2012 06:14:35 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: On 22.08.2012 at 04:41, Srinivas Ramakrishna wrote: > Initial mark is typically scheduled immediately after a scavenge, so no > timeout specification should be necessary. Perhaps I misunderstood your > question and maybe you can elaborate a bit more on what you want to > achieve? Well, I have a gclog which contains some STW situations > 1s (which violates my SLA). If I check the GC log file there are some initial-marks and some remarks causing the problem. For the slow initial-marks I see the pattern that the time difference to the preceding scavenger run is large. The initial marks that run sub-second all happen directly after a scavenger run.
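One quick way to hunt for such SLA-violating pauses in a log produced with -XX:+PrintGCApplicationStoppedTime is to filter the stopped-time lines. A sketch, with two sample lines from this thread inlined where you would normally feed the real gclog:

```shell
# print every safepoint pause longer than 1 s; on a line of the form
# "Total time for which application threads were stopped: N seconds"
# the duration is field 9
awk '$1 == "Total" && $9 + 0 > 1.0 { print "SLA violation:", $9, "s" }' <<'EOF'
Total time for which application threads were stopped: 0.0026457 seconds
Total time for which application threads were stopped: 11.1184690 seconds
EOF
# -> SLA violation: 11.1184690 s
```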
So here is a slow sample: 159430.703: [GC 159430.705: [ParNew: 20646923K->582368K(22649280K), 0.4311960 secs] 21710818K->1665223K(47815104K), 0.4343870 secs] [Times: user=1.92 sys=0.02, real=0.43 secs] 159607.370: [GC [1 CMS-initial-mark: 1082855K(25165824K)] 14734770K(47815104K), 11.1184690 secs] [Times: user=11.06 sys=0.03, real=11.12 secs] 159618.490: [CMS-concurrent-mark-start] 159618.930: [CMS-concurrent-mark: 0.440/0.440 secs] [Times: user=4.59 sys=0.16, real=0.44 secs] Difference 176s, 11s STW And here is the next run, which is typically fast: 166807.592: [GC 166807.594: [ParNew: 21200224K->372584K(22649280K), 0.4462060 secs] 22444233K->1629155K(47815104K), 0.4493750 secs] [Times: user=1.43 sys=0.01, real=0.45 secs] 166808.057: [GC [1 CMS-initial-mark: 1256570K(25165824K)] 1629155K(47815104K), 0.3039830 secs] [Times: user=0.31 sys=0.00, real=0.31 secs] Difference 0.4s, 0.3s STW I need to collect the actual JVM parameters, version and gclogfile and will provide them. I am actually waiting for a CMSStatistics=2 version. Greetings Bernd From ysr1729 at gmail.com Wed Aug 22 01:03:25 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 22 Aug 2012 01:03:25 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: Hi Bernd -- Yes, this has been observed (albeit in a different context) by Michal Frajt as well; see his emails from a couple of weeks ago. I am not sure why, with regular CMS, we should have this kind of unpredictable delay from a scavenge to a CMS initial mark. It must be OS/scheduling and load etc., which we cannot control (although a delay of 177 s seems excessive and must mean either that the CMS wait duration was exceeded or something like that). In any case, as you observed, the length of the CMS initial pause is related to the occupancy of the young generation.
Thus, even if it were to occur immediately after a scavenge (when Eden is nearly empty), the use of large (and fully used) survivor spaces can make the pause longer. As we have noted in earlier email, the real solution is to multi-thread the CMS initial mark pause so that the work can be done much faster. An easier, if less pleasant and less efficient, alternative is to implement CMSScavengeBeforeInitialMark, but that alone would not address the large fully-used survivor space problem I mentioned above, only the issue with scheduling the initial mark. (In that case the pause time for the scavenge would be additive, although, because the scavenge is parallel, it would likely be much faster even for a large Eden.) I'd be curious to know if you get to the bottom of the cause for the long delay between scavenge and initial mark pause. regards. -- ramki On Tue, Aug 21, 2012 at 9:14 PM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Am 22.08.2012, 04:41 Uhr, schrieb Srinivas Ramakrishna >: > > Initial mark is typically scheduled immediately after a scavenge, so no > > timeout specification should be necessary. Perhaps I misunderstood yr > > question and may be you can elaborate a bit more on what you want to > > achieve? > > Well, I have a gclog which contains some STW situations > 1s (which > violates my SLA). If I check the GCLog file there are some initial-marks > and some remarks causing the problem. For the slow initial-marks I see the > pattern that the time difference to the preceeding scavenger run is large. > For the initial marks which run sub second, they happen all directly after > a scavenger run.
> > So here is a slow samples: > > 159430.703: [GC 159430.705: [ParNew: 20646923K->582368K(22649280K), > 0.4311960 secs] > 21710818K->1665223K(47815104K), 0.4343870 secs] [Times: > user=1.92 sys=0.02, real=0.43 secs] > 159607.370: [GC [1 CMS-initial-mark: 1082855K(25165824K)] > 14734770K(47815104K), 11.1184690 secs] > [Times: user=11.06 sys=0.03, real=11.12 secs] > 159618.490: [CMS-concurrent-mark-start] > 159618.930: [CMS-concurrent-mark: 0.440/0.440 secs] [Times: user=4.59 > sys=0.16, real=0.44 secs] > > Difference 176s, 11s STW > > And here is the next run, which is typically fast: > > 166807.592: [GC 166807.594: [ParNew: 21200224K->372584K(22649280K), > 0.4462060 secs] > 22444233K->1629155K(47815104K), 0.4493750 secs] [Times: > user=1.43 sys=0.01, real=0.45 secs] > 166808.057: [GC [1 CMS-initial-mark: 1256570K(25165824K)] > 1629155K(47815104K), 0.3039830 secs] > [Times: user=0.31 sys=0.00, real=0.31 secs] > > Difference 0.4s, 0.3s STW > > I need to collect the actual jvm parameters, version and gclogfile and > will provide it. I am actually waiting for a CMSStatistics=2 version. > > > Greetings > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120822/1d608bad/attachment-0001.html From dhd at exnet.com Wed Aug 22 02:31:04 2012 From: dhd at exnet.com (Damon Hart-Davis) Date: Wed, 22 Aug 2012 10:31:04 +0100 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> Hi, Could it be paging when not all of the JVM heap is (able to be) in physical memory at the same time? 
Rgds Damon On 22 Aug 2012, at 09:03, Srinivas Ramakrishna wrote: > I am not sure why, with regular CMS, we should have this kind of upredictable delay from a scavenge to a CMS initial mark. > It must be OS/scheduling and load etc. which we cannot control (although a delay of 177 s seems excessive and must either mean > that the CMS wait duration was exceeded or something like that. From Michal.Frajt at partner.commerzbank.com Wed Aug 22 04:40:16 2012 From: Michal.Frajt at partner.commerzbank.com (Frajt, Michal) Date: Wed, 22 Aug 2012 13:40:16 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> References: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> Message-ID: <1DDCA93502632C4DA22E9EE199CE907C5E9F24AD@SE002593.cs.commerzbank.com> Hi Damon, It is not paging. The unpredictable delay from a scavenge to a CMS initial mark is the state of the current implementation. The CMSWaitDuration does not work correctly. Please find the details in the "CMSWaitDuration unstable behavior" post on the hotspot-gc-dev mailing list. We are facing the same issue with extremely long initial mark pauses. Simply waiting properly for a scavenge (in our customized OpenJDK 6) has reduced every single initial-mark pause from 1200ms to just 20ms. Regards, Michal -----Original Message----- From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Damon Hart-Davis Sent: Wednesday, 22 August 2012 11:31 To: Srinivas Ramakrishna; Bernd Eckenfels Cc: hotspot-gc-use at openjdk.java.net Subject: Re: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? Hi, Could it be paging when not all of the JVM heap is (able to be) in physical memory at the same time?
Rgds Damon On 22 Aug 2012, at 09:03, Srinivas Ramakrishna wrote: > I am not sure why, with regular CMS, we should have this kind of upredictable delay from a scavenge to a CMS initial mark. > It must be OS/scheduling and load etc. which we cannot control (although a delay of 177 s seems excessive and must either mean > that the CMS wait duration was exceeded or something like that. _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From ysr1729 at gmail.com Wed Aug 22 09:23:55 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 22 Aug 2012 09:23:55 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <1DDCA93502632C4DA22E9EE199CE907C5E9F24AD@SE002593.cs.commerzbank.com> References: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> <1DDCA93502632C4DA22E9EE199CE907C5E9F24AD@SE002593.cs.commerzbank.com> Message-ID: Ah, I see. I'll go back and review your email on this subject earlier. Sorry, got pulled off for some other stuff and missed the follow-ups. thanks. -- ramki On Wed, Aug 22, 2012 at 4:40 AM, Frajt, Michal < Michal.Frajt at partner.commerzbank.com> wrote: > Hi Damon, > > It is not paging. The unpredictable delay from a scavenge to a CMS initial > mark is the state of the current implementation. The CMSWaitDuration does > not work correctly. Please find the details in the "CMSWaitDuration > unstable behavior" post on hotspot-gc-dev mailing. > > We are facing same issue with extremely long initial mark pauses. Just > proper waiting for a scavenge (customized OpenJDK 6) has reduced every > single initial-mark pause from 1200ms to 20ms only. > > Regards, > Michal > > > -----Original Message----- > From: hotspot-gc-use-bounces at openjdk.java.net [mailto: > hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Damon Hart-Davis > Sent: Mittwoch, 22. 
August 2012 11:31 > To: Srinivas Ramakrishna; Bernd Eckenfels > Cc: hotspot-gc-use at openjdk.java.net > Subject: Re: Why abortable-preclean phase is not being aborted after YG > occupancy exceeds 50%? > > Hi, > > Could it be paging when not all of the JVM heap is (able to be) in > physical memory at the same time? > > Rgds > > Damon > > > On 22 Aug 2012, at 09:03, Srinivas Ramakrishna wrote: > > > I am not sure why, with regular CMS, we should have this kind of > upredictable delay from a scavenge to a CMS initial mark. > > It must be OS/scheduling and load etc. which we cannot control (although > a delay of 177 s seems excessive and must either mean > > that the CMS wait duration was exceeded or something like that. > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120822/b519d1e8/attachment.html From michal at frajt.eu Wed Aug 29 07:43:28 2012 From: michal at frajt.eu (Michal Frajt) Date: Wed, 29 Aug 2012 16:43:28 +0200 Subject: CMSScavengeBeforeRemark confuses CMS-remark time Message-ID: Hi, We have fixed the bug in the CMSWaitDuration handling (find more in the 'CMSWaitDuration unstable behavior' hotspot-gc-dev post; fix done in our customized OpenJDK). The CMS-initial-mark phase now always starts right after a scavenge, which makes it run 20-50 times faster. Currently we are focused on minimizing the cost of the second STW remark phase. There we have the option CMSScavengeBeforeRemark to invoke a scavenge right before the remarking. All works as expected, but the results reported in the CMS GC logs are a bit confusing.
The CMSScavengeBeforeRemark option forces a scavenge invocation from the CMS-remark phase (from within the VM thread, as the CMS-remark operation is executed in the foreground collector). The generation collector reports its time into the same GC log file. In our case the ParNew collector reports the line including the STW time (0.0266193 seconds here). 2012-08-29T07:27:02.613+0200: 9.388: [GC 9.388: [ParNew Desired survivor size 9568256 bytes, new threshold 8 (max 8) - age 1: 4626512 bytes, 4626512 total : 65631K->14657K(112384K), 0.0264694 secs] 108213K->72304K(10467072K), 0.0266193 secs] [Times: user=0.17 sys=0.01, real=0.03 secs] The total CMS-remark time (STW) is usually understood as the number of seconds reported by the CMS-remark line (0.0401657 seconds here). 9.415: [Rescan (parallel) (Survivor:10chunks) .... 0.0064098 secs]9.421: [weak refs processing, 0.0000320 secs]9.421: [class unloading, 0.0020331 secs]9.423: [scrub symbol & string tables, 0.0042518 secs] [1 CMS-remark: 57647K(10354688K)] 72304K(10467072K), 0.0401657 secs] [Times: user=0.23 sys=0.01, real=0.04 secs] But when ParNew is invoked explicitly by the CMSScavengeBeforeRemark option, the time reported for the CMS-remark phase already includes the time of the generation collector. The common interpretation is that there was one ParNew STW pause (0.0266193 sec) and one CMS-remark STW pause (0.0401657 secs). In fact the total STW time is not ParNew plus CMS-remark, but just the CMS-remark time. In our example we spent 27ms in ParNew and 14ms in the CMS-remark proper; the total STW time was 40ms, not 67ms as it is many times wrongly interpreted. The reporting of the CMS-remark phase, when used together with the CMSScavengeBeforeRemark option, does not allow easy interpretation of the application STW times. None of the CMS log file analyzer tools is able to correctly interpret the time reported for the CMS-remark phase.
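This arithmetic can be checked directly from the two log lines above: the CMS-remark figure already contains the ParNew time, so the remark-only portion is the difference. A sketch, using awk for the floating-point subtraction:

```shell
# total safepoint = 0.0401657 s (CMS-remark line, which includes the scavenge)
# scavenge        = 0.0266193 s (ParNew line)
# remark-only     = total - scavenge
awk 'BEGIN { printf "remark-only STW: %.7f s\n", 0.0401657 - 0.0266193 }'
# -> remark-only STW: 0.0135464 s
```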
As far as we know, none of the existing interpretations of the CMS logging output mentions this issue. In the same way that the fixed CMSWaitDuration delivered 20-50 times faster initial marking to us, the existing and working CMSScavengeBeforeRemark option can deliver similar results when correctly interpreted from the GC log files. Regards, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120829/13a78f67/attachment.html From rozdev29 at gmail.com Wed Aug 29 10:18:58 2012 From: rozdev29 at gmail.com (Rozdev29) Date: Wed, 29 Aug 2012 10:18:58 -0700 Subject: CMSScavengeBeforeRemark confuses CMS-remark time In-Reply-To: References: Message-ID: <5253D329-E8D3-442F-BFF3-54BDCD44ADE3@gmail.com> Hi there, Is this bug fixed for Java 6 or 7? Which version will have this fix? Thanks Saroj On Aug 29, 2012, at 7:43 AM, "Michal Frajt" wrote: > Hi, > > We have fixed the bug in the CMSWaitDuration handling (find more in the 'CMSWaitDuration unstable behavior' hotspot-gc-dev post, fixed done in our customized OpenJDK). The CMS-initial-mark phase is now always starting right after the scavenge which makes it running 20-50 faster. Currently we are focused on minimizing the cost of the second STW remark phase. There we have the option CMSScavengeBeforeRemark to invoke the scavenge right before the remarking. All works as expected but the results reported in CMS GC logs are a bit confusing. > > The CMSScavengeBeforeRemark forces scavenge invocation from the CMS-remark phase (from within the VM thread as the CMS-remark operation is executed in the foreground collector). The generation collector reports its time into the same GC log file. In our case the ParNew collector reports the line including the STW time (0.0266193 seconds here).
> > 2012-08-29T07:27:02.613+0200: 9.388: [GC 9.388: [ParNew Desired survivor size 9568256 bytes, new threshold 8 (max 8) - age 1: 4626512 bytes, 4626512 total : 65631K->14657K(112384K), 0.0264694 secs] 108213K->72304K(10467072K), 0.0266193 secs] [Times: user=0.17 sys=0.01, real=0.03 secs] > > > The total CMS-remark time (STW) is usually understood as the number of seconds reported by the CMS-remark line (0.401657 seconds here). > > 9.415: [Rescan (parallel) (Survivor:10chunks) .... 0.0064098 secs]9.421: [weak refs processing, 0.0000320 secs]9.421: [class unloading, 0.0020331 secs]9.423: [scrub symbol & string tables, 0.0042518 secs] [1 CMS-remark: 57647K(10354688K)] 72304K(10467072K), 0.0401657 secs] [Times: user=0.23 sys=0.01, real=0.04 secs] > > > > > But, in the case the ParNew got invoked explicitly by the CMSScavengeBeforeRemark option, the time reported by the CMS-remark phase does include the time of the generation collector. The common interpretation is that there was one ParNew STW (0.0266193 sec) and the CMS-remark STW (0.0401657 secs) phase. Fortunately the total STW time is not ParNew plus CMS-remark time but only the CMS-remark time in total. In our example we spent 27ms in the ParNew and 14ms in the CMS-remark. The total STW time was 40ms and not 67ms as many times wrongly interpreted. > > The reporting of the CMS-remark phase, when used together with the CMSScavengeBeforeRemark option, does not allow easy interpretation of the application STW times. None of the CMS log file analyzer tools is able to correctly interpret the time reported in the CMS-remark phase. As far as we know none of existing interpretations of the CMS logging outputs does mention this issue. Same way the fixed CMSWaitDuration was able to deliver 20-50 faster initial marking to us, the existing and working CMSScavengeBeforeRemark option is able to deliver similar results when correctly interpreted from the GC log files. 
> > Regards, > Michal > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120829/486e4496/attachment.html From Bond.Chen at lombardrisk.com Wed Aug 29 19:40:31 2012 From: Bond.Chen at lombardrisk.com (Bond Chen) Date: Thu, 30 Aug 2012 03:40:31 +0100 Subject: [HTML]my CMS incremental duty cycle can't be controlled by GC parameter settings Message-ID: <503F4320.9AAE.00F7.0@lombardrisk.com> Dear All & Sri, Our application has encountered very long GC pauses. From the GC analysis I found that CMS takes 20-30 minutes to finish, given the value icms_dc=1230 seconds, so the solution is to reduce this value to let CMS finish ASAP. After reading a doc on the Oracle website about CMS incremental mode, I wanted to run a concept demonstration test, but the result confuses me.
1)I have the set 3 CMS incremental parameters: -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=3 My expectation is: the icms_dc value in the gc log should be in range 0-3 Actual results: the icms_dc value are out of the range 2)All vm options: VM arguments: -Dprogram.name=run.sh -Xms6072M -Xmx6072M -XX:PermSize=512m -XX:MaxPermSize=512m -Xss1024k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:+CMSIncrementalMode -XX:ParallelGCThreads=6 -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=3 -XX:MaxTenuringThreshold=32 -XX:+PrintTenuringDistribution -XX:CMSInitiatingOccupancyFraction=50 -Xmn1700m -XX:+UseLargePages -XX:LargePageSizeInBytes=64k -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -Xloggc:./gc_10_20120902171712.log -Dsun.rmi.dgc.server.gcInterval=18000000 -Dsun.rmi.dgc.client.gcInterval=18000000 -verbose:gc -Djava.library.path=/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/lib/valuation-lib:/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/firmament -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -XX:+ExplicitGCInvokesConcurrent -XX:+PrintTenuringDistribution -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=2 -Djboss.cluster.number=4 -Djboss.cluster.monitor.switch=y -Djboss.messaging.groupname=MessagingPostOffice -Djboss.partition.name=UATPartition_RH -Djboss.messaging.serverpeerid=10 -Djboss.web.lb.nodeid=nodee -Djboss.home.dir=/export/home/test/server/colline/cluster/jboss -Djboss.profile.name=2011.2.0.0.3 -Djboss.cluster.node1.addr=172.20.30.8 -Djboss.cluster.node2.addr=172.20.30.11 
-Djboss.cluster.node3.addr=172.20.30.16 -Djboss.cluster.node4.addr=172.20.30.10 -Djboss.cluster.port_hacluster=7800 -Djboss.cluster.port_jbmdata=7900 -Djboss.cluster.port_jbmcontrol=7910 -Djboss.cluster.port_invalidationcache=7920 -Djboss.cluster.port_replicationcache=7930 -Djboss.messaging.consumer_count=3 -Djboss.jndi.port=1099 -Djboss.hajndi.port=1100 -Dcollateral_config=/export/home/test/server/colline/cluster/bin/collateral.properties -DIGNORE_FQN=/marketdata/FxRates,/marketdata/EODFxRates -Ddatasource.min.pool.size=5 -Ddatasource.max.pool.size=150 -Djboss.partition.udpGroup=230.1.0.4 -Dcom.sun.management.jmxremote.port=5004 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl -Djboss.platform.mbeanserver -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed 3)GC logs: -bash-3.00$ cat gc_10_20120902004701.log |grep icms icms_dc=0 , 0.1387250 secs] [Times: user=0.35 sys=0.20, real=0.14 secs] icms_dc=0 , 0.3590387 secs] [Times: user=0.90 sys=0.43, real=0.36 secs] icms_dc=15 , 0.5206302 secs] [Times: user=1.31 sys=0.46, real=0.52 secs] icms_dc=30 , 0.2350382 secs] [Times: user=0.76 sys=0.01, real=0.24 secs] icms_dc=45 , 0.4883827 secs] [Times: user=1.38 sys=0.37, real=0.49 secs] icms_dc=41 , 0.7013340 secs] [Times: user=0.92 sys=0.05, real=0.70 secs] 4)JVM version: -bash-3.00$ java -version java version "1.6.0_21" Java(TM) SE Runtime Environment (build 1.6.0_21-b06) Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode) -bash-3.00$ ./launch_bondGCParameter.sh Regards, Bond This e-mail together with any attachments (the "Message") is confidential and may contain privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this Message from your system. 
Any unauthorized copying, disclosure, distribution or use of this Message is strictly forbidden. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120830/0f71d5cb/attachment.html From ysr1729 at gmail.com Thu Aug 30 01:03:57 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 30 Aug 2012 01:03:57 -0700 Subject: [HTML]my CMS incremental duty cycle can't be controlled by GC parameter settings In-Reply-To: <503F4320.9AAE.00F7.0@lombardrisk.com> References: <503F4320.9AAE.00F7.0@lombardrisk.com> Message-ID: Bond -- On Wed, Aug 29, 2012 at 7:40 PM, Bond Chen wrote: > Dear All & Sri, > Our application have encountered very long GC pause, from the GC analysis, > I found the CMS takes 20-30 minutes to get finished by the value of > icms_dc=1230 seconds, so the solution is to reduce this value, to let CMS > finished ASAP, by reading a doc on oracle website about the CMS > incremental mode, I want to have a concept demonstration test, but the > result confusing me. > Not really. If you want the CMS cycle to finish as soon as possible you should increase the duty cycle, not decrease it. (The idea is that the duty cycle defines the percentage of "concurrent time" that the ICMS thread will be eligible to run.) -XX:CMSIncrementalDutyCycle=50 will, for example, let it run 50% of the time between two scavenges. Then you have to turn off the automatic duty-cycle control that ICMS does, if you want to maintain that duty-cycle value whenever ICMS runs:- -XX:-CMSIncrementalPacing I believe the min value sets a lower bound on the duty cycle when incremental pacing is on. It just sets a floor under which the duty cycle will never go. Yes, I know, it's kind of asymmetric, and I can't recall the thinking behind that, but it should be possible, I guess, to bound the cycle between two values if you really wanted to make that modification. Ah, now I remember...
the idea is that concurrent mode failure is bad, so escalating the duty cycle to as much as 100% should be permitted, rather than bounding it from above, losing the race, and causing a concurrent mode failure. Here's the set of relevant options:- $ java -XX:+PrintFlagsFinal -version | grep CMSIncremental uintx CMSIncrementalDutyCycle = 10 {product} uintx CMSIncrementalDutyCycleMin = 0 {product} bool CMSIncrementalMode = false {product} uintx CMSIncrementalOffset = 0 {product} bool CMSIncrementalPacing = true {product} uintx CMSIncrementalSafetyFactor = 10 {product} java version "1.7.0_05" Java(TM) SE Runtime Environment (build 1.7.0_05-b05) Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode) -- ramki > > *1)I have the set 3 CMS incremental parameters:* > -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0 > -XX:CMSIncrementalDutyCycle=3 > > My expectation is: > the icms_dc value in the gc log should be in range 0-3 > > Actual results: > the icms_dc value are out of the range > > > > > *2)All vm options:* > VM arguments: -Dprogram.name=run.sh -Xms6072M -Xmx6072M -XX:PermSize=512m > -XX:MaxPermSize=512m -Xss1024k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:+CMSIncrementalMode > -XX:ParallelGCThreads=6 -XX:CMSIncrementalDutyCycleMin=0 > -XX:CMSIncrementalDutyCycle=3 -XX:MaxTenuringThreshold=32 > -XX:+PrintTenuringDistribution -XX:CMSInitiatingOccupancyFraction=50 > -Xmn1700m -XX:+UseLargePages -XX:LargePageSizeInBytes=64k > -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails > -XX:+PrintGCApplicationStoppedTime -Xloggc:./gc_10_20120902171712.log > -Dsun.rmi.dgc.server.gcInterval=18000000 > -Dsun.rmi.dgc.client.gcInterval=18000000 -verbose:gc > -Djava.library.path=/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/lib/valuation-lib:/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/firmament > -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed >
-Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl > -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=1234 > -Dcom.sun.management.jmxremote.authenticate=false > -XX:+ExplicitGCInvokesConcurrent -XX:+PrintTenuringDistribution > -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=2 > -Djboss.cluster.number=4 -Djboss.cluster.monitor.switch=y > -Djboss.messaging.groupname=MessagingPostOffice -Djboss.partition.name=UATPartition_RH > -Djboss.messaging.serverpeerid=10 -Djboss.web.lb.nodeid=nodee > -Djboss.home.dir=/export/home/test/server/colline/cluster/jboss - > Djboss.profile.name=2011.2.0.0.3 -Djboss.cluster.node1.addr=172.20.30.8 > -Djboss.cluster.node2.addr=172.20.30.11 > -Djboss.cluster.node3.addr=172.20.30.16 > -Djboss.cluster.node4.addr=172.20.30.10 -Djboss.cluster.port_hacluster=7800 > -Djboss.cluster.port_jbmdata=7900 -Djboss.cluster.port_jbmcontrol=7910 > -Djboss.cluster.port_invalidationcache=7920 > -Djboss.cluster.port_replicationcache=7930 > -Djboss.messaging.consumer_count=3 -Djboss.jndi.port=1099 > -Djboss.hajndi.port=1100 > -Dcollateral_config=/export/home/test/server/colline/cluster/bin/collateral.properties > -DIGNORE_FQN=/marketdata/FxRates,/marketdata/EODFxRates > -Ddatasource.min.pool.size=5 -Ddatasource.max.pool.size=150 > -Djboss.partition.udpGroup=230.1.0.4 > -Dcom.sun.management.jmxremote.port=5004 > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.authenticate=false > -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl > -Djboss.platform.mbeanserver > -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed > > > > > *3)GC logs:* > -bash-3.00$ cat gc_10_20120902004701.log |grep icms > icms_dc=0 , 0.1387250 secs] [Times: user=0.35 sys=0.20, real=0.14 secs] > icms_dc=0 , 0.3590387 secs] [Times: user=0.90 sys=0.43, real=0.36 secs] > icms_dc=15 , 
0.5206302 secs] [Times: user=1.31 sys=0.46, real=0.52 secs]
> icms_dc=30 , 0.2350382 secs] [Times: user=0.76 sys=0.01, real=0.24 secs]
> icms_dc=45 , 0.4883827 secs] [Times: user=1.38 sys=0.37, real=0.49 secs]
> icms_dc=41 , 0.7013340 secs] [Times: user=0.92 sys=0.05, real=0.70 secs]
>
>
> *4) JVM version:*
> -bash-3.00$ java -version
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode)
> -bash-3.00$ ./launch_bondGCParameter.sh
>
> Regards,
> Bond
>
> This e-mail together with any attachments (the "Message") is confidential and may contain privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this Message from your system. Any unauthorized copying, disclosure, distribution or use of this Message is strictly forbidden.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120830/d558cfeb/attachment-0001.html

From Bond.Chen at lombardrisk.com Fri Aug 31 20:50:39 2012
From: Bond.Chen at lombardrisk.com (Bond Chen)
Date: Sat, 01 Sep 2012 04:50:39 +0100
Subject: Reply: Re: [HTML] my CMS incremental duty cycle can't be controlled by GC parameter settings
Message-ID: <5041F690020000F700011A5E@lde-email-smtp-ext.londoneast.lombardrisk.com>

Hi Sri,

Thanks for your help; now I know where the issue is. By default CMSIncrementalPacing is true, so the JVM automatically adjusts the duty cycle, which explains why I still see high icms_dc values even when I set the duty cycle to 3.

BTW, an official doc on the Oracle website says the default is false.

Thanks again for this.
Bond

>>> Srinivas Ramakrishna 2012-08-30
16:04 >>>

Bond --

On Wed, Aug 29, 2012 at 7:40 PM, Bond Chen wrote:
> Dear All & Sri,
> Our application has encountered very long GC pauses. From the GC analysis,
> I found that CMS takes 20-30 minutes to finish, indicated by the value of
> icms_dc=1230 seconds, so the solution is to reduce this value to let CMS
> finish ASAP. After reading a doc on the Oracle website about the CMS
> incremental mode, I wanted to run a concept-demonstration test, but the
> result confuses me.
>
Not really. If you want the CMS cycle to finish as soon as possible you should increase the duty cycle, not decrease it. (The idea is that the duty cycle defines the percentage of "concurrent time" that the ICMS thread will be eligible to run.) -XX:CMSIncrementalDutyCycle=50 will, for example, let it run 50% of the time between two scavenges.

Then, if you want to maintain that duty-cycle value whenever ICMS runs, you have to turn off the automatic duty-cycle control that ICMS does:

-XX:-CMSIncrementalPacing

I believe the min value sets a lower bound on the duty cycle when incremental pacing is on. It just sets a floor under which the duty cycle will never go. Yes, I know, it's kind of asymmetric, and I can't recall the thinking behind that, but it should be possible, I guess, to bound the cycle between two values if you really wanted to make that modification. Ah, now I remember... the idea is that concurrent mode failure is bad, so escalating the duty cycle to as much as 100% should be permitted, rather than bounding it from above, losing the race, and causing a concurrent mode failure.
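The escalation behaviour described above can be illustrated with a toy calculation. This is only a sketch of the idea, not the actual HotSpot pacing code: the class and method names and the "work unit" parameters are invented for illustration. The point it shows is that, with pacing on, the pacer picks whatever duty cycle is needed to finish the remaining marking work before the next scavenge, clamped to at most 100% and never below CMSIncrementalDutyCycleMin, which is why icms_dc can exceed the configured -XX:CMSIncrementalDutyCycle.

```java
// Toy model of ICMS duty-cycle escalation (hypothetical; not HotSpot code).
// The pacer would rather run at up to 100% duty cycle than lose the race
// to the next scavenge and suffer a concurrent mode failure.
public class IcmsPacingSketch {

    /**
     * @param workRemaining  estimated marking work left (arbitrary units)
     * @param workPerPercent work retired per 1% of duty cycle before the next scavenge
     * @param dutyCycleMin   floor, as with -XX:CMSIncrementalDutyCycleMin
     * @return duty cycle (0-100) the pacer would choose
     */
    static int estimateDutyCycle(double workRemaining, double workPerPercent, int dutyCycleMin) {
        // Duty cycle needed to finish before the next scavenge.
        int needed = (int) Math.ceil(workRemaining / workPerPercent);
        // Escalate as far as 100%, but never drop below the configured floor.
        return Math.max(dutyCycleMin, Math.min(100, needed));
    }

    public static void main(String[] args) {
        System.out.println(estimateDutyCycle(5.0, 1.0, 0));    // plenty of headroom: 5
        System.out.println(estimateDutyCycle(500.0, 1.0, 0));  // about to lose the race: 100
        System.out.println(estimateDutyCycle(0.5, 1.0, 10));   // floor applies: 10
    }
}
```

With -XX:-CMSIncrementalPacing the collector skips this kind of estimation entirely and sticks to the configured CMSIncrementalDutyCycle, which is what the advice above amounts to.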
Here's the set of relevant options:-

$ java -XX:+PrintFlagsFinal -version | grep CMSIncremental
     uintx CMSIncrementalDutyCycle             = 10        {product}
     uintx CMSIncrementalDutyCycleMin          = 0         {product}
      bool CMSIncrementalMode                  = false     {product}
     uintx CMSIncrementalOffset                = 0         {product}
      bool CMSIncrementalPacing                = true      {product}
     uintx CMSIncrementalSafetyFactor          = 10        {product}
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode)

-- ramki

>
> *1) I have set these 3 CMS incremental parameters:*
> -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0
> -XX:CMSIncrementalDutyCycle=3
>
> My expectation is:
> the icms_dc values in the gc log should be in the range 0-3
>
> Actual results:
> the icms_dc values are out of that range
>
> *2) All VM options:*
> VM arguments: -Dprogram.name=run.sh -Xms6072M -Xmx6072M -XX:PermSize=512m
> -XX:MaxPermSize=512m -Xss1024k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:+CMSIncrementalMode
> -XX:ParallelGCThreads=6 -XX:CMSIncrementalDutyCycleMin=0
> -XX:CMSIncrementalDutyCycle=3 -XX:MaxTenuringThreshold=32
> -XX:+PrintTenuringDistribution -XX:CMSInitiatingOccupancyFraction=50
> -Xmn1700m -XX:+UseLargePages -XX:LargePageSizeInBytes=64k
> -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails
> -XX:+PrintGCApplicationStoppedTime -Xloggc:./gc_10_20120902171712.log
> -Dsun.rmi.dgc.server.gcInterval=18000000
> -Dsun.rmi.dgc.client.gcInterval=18000000 -verbose:gc
> -Djava.library.path=/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/lib/valuation-lib:/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/firmament
> -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed
> -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
> -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=1234
> -Dcom.sun.management.jmxremote.authenticate=false
> -XX:+ExplicitGCInvokesConcurrent -XX:+PrintTenuringDistribution
> -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
> -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=2
> -Djboss.cluster.number=4 -Djboss.cluster.monitor.switch=y
> -Djboss.messaging.groupname=MessagingPostOffice -Djboss.partition.name=UATPartition_RH
> -Djboss.messaging.serverpeerid=10 -Djboss.web.lb.nodeid=nodee
> -Djboss.home.dir=/export/home/test/server/colline/cluster/jboss
> -Djboss.profile.name=2011.2.0.0.3 -Djboss.cluster.node1.addr=172.20.30.8
> -Djboss.cluster.node2.addr=172.20.30.11
> -Djboss.cluster.node3.addr=172.20.30.16
> -Djboss.cluster.node4.addr=172.20.30.10 -Djboss.cluster.port_hacluster=7800
> -Djboss.cluster.port_jbmdata=7900 -Djboss.cluster.port_jbmcontrol=7910
> -Djboss.cluster.port_invalidationcache=7920
> -Djboss.cluster.port_replicationcache=7930
> -Djboss.messaging.consumer_count=3 -Djboss.jndi.port=1099
> -Djboss.hajndi.port=1100
> -Dcollateral_config=/export/home/test/server/colline/cluster/bin/collateral.properties
> -DIGNORE_FQN=/marketdata/FxRates,/marketdata/EODFxRates
> -Ddatasource.min.pool.size=5 -Ddatasource.max.pool.size=150
> -Djboss.partition.udpGroup=230.1.0.4
> -Dcom.sun.management.jmxremote.port=5004
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
> -Djboss.platform.mbeanserver
> -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed
>
>
> *3) GC logs:*
> -bash-3.00$ cat gc_10_20120902004701.log | grep icms
> icms_dc=0 , 0.1387250 secs] [Times: user=0.35 sys=0.20, real=0.14 secs]
> icms_dc=0 , 0.3590387 secs] [Times: user=0.90 sys=0.43, real=0.36 secs]
> icms_dc=15 , 0.5206302 secs] [Times: user=1.31 sys=0.46, real=0.52 secs]
> icms_dc=30 , 0.2350382 secs] [Times: user=0.76 sys=0.01, real=0.24 secs]
> icms_dc=45 , 0.4883827 secs] [Times: user=1.38 sys=0.37, real=0.49 secs]
> icms_dc=41 , 0.7013340 secs] [Times: user=0.92 sys=0.05, real=0.70 secs]
>
>
> *4) JVM version:*
> -bash-3.00$ java -version
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode)
> -bash-3.00$ ./launch_bondGCParameter.sh
>
> Regards,
> Bond