From ching at neutec.com.tw Wed Aug 8 00:50:06 2012 From: ching at neutec.com.tw (Ching Chen) Date: Wed, 8 Aug 2012 15:50:06 +0800 Subject: Java 7 update 5 GC Message-ID: Dear Sir or Madam: My Java application (a high-volume, low-latency online transaction betting system) needs to achieve the smallest possible GC pauses while keeping acceptable throughput. When I read about the Java 7 G1 collector, I thought it was exactly what I wanted (I also investigated Azul Zing but stopped partway through that study). However, I got a strange result when I ran a test on Java 7 update 5 and compared it with the same test on Java 7.0. The GC-related command-line options in my test are not complicated. I only specified a few of them and let the GC determine the rest for best performance. The GC options are: java -Xms12288m -Xmx12288m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintCommandLineFlags -XX:+UseG1GC -XX:MaxGCPauseMillis=100 \ The Java 7.0 result is as follows: *java version "1.7.0" Java(TM) SE Runtime Environment (build 1.7.0-b147) Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode) [INFO] total GC times:199 [INFO] C:/Users/Chris/Documents/My Projects/citibet-2nd/support/matchspace/GC performance/XX_UseG1GC_MaxGCPauseMillis100.txt average GC:0.067739 second [INFO] app stops:206, average seconds:0.065926 [DEBUG] [0.006688, 0.007678, 0.008537, 0.014192, 0.021206, 0.023098, 0.023628, 0.026924, 0.028749, 0.029012, 0.030648, 0.031663, 0.031704, 0.031743, 0.033121, 0.035534, 0.036437, 0.037418, 0.038571, 0.039253, 0.040569, 0.041807, 0.042169, 0.042459, 0.043335, 0.043797, 0.045504, 0.046178, 0.046337, 0.050747, 0.051332, 0.052527, 0.052614, 0.052623, 0.054945, 0.055021, 0.05538, 0.055595, 0.055836, 0.05586, 0.056101, 0.056134, 0.056167, 0.056181, 0.056243, 0.056648, 0.056803, 0.057133, 0.057656, 0.057689, 0.058021, 0.058613, 0.058964, 0.059164, 0.059424, 0.059438, 0.059456, 0.059853, 0.060354, 0.06064, 0.060972, 0.060999, 0.061141, 0.061157, 
0.061163, 0.061305, 0.061444, 0.061567, 0.061594, 0.061923, 0.06215, 0.062252, 0.063147, 0.063149, 0.063159, 0.063563, 0.063885, 0.064047, 0.064882, 0.064975, 0.065076, 0.065228, 0.065311, 0.066254, 0.066435, 0.066572, 0.067084, 0.067206, 0.067543, 0.067919, 0.067973, 0.068015, 0.068042, 0.068397, 0.068797, 0.0691, 0.069626, 0.069885, 0.070051, 0.070063, 0.070077, 0.070625, 0.071117, 0.071127, 0.071257, 0.071372, 0.071453, 0.071647, 0.071703, 0.072027, 0.072058, 0.072275, 0.072301, 0.07234, 0.072363, 0.072513, 0.072575, 0.072679, 0.07305, 0.073061, 0.073326, 0.073482, 0.073607, 0.073888, 0.073949, 0.074562, 0.074604, 0.074644, 0.075278, 0.075352, 0.07547, 0.075543, 0.075581, 0.076057, 0.076445, 0.076514, 0.076599, 0.076967, 0.076991, 0.077079, 0.077112, 0.077192, 0.077519, 0.077547, 0.077881, 0.078351, 0.078416, 0.078975, 0.079444, 0.079986, 0.080327, 0.080342, 0.080568, 0.080884, 0.081912, 0.081916, 0.082134, 0.082136, 0.082305, 0.082722, 0.082755, 0.082992, 0.083276, 0.083517, 0.083989, 0.084839, 0.084884, 0.085197, 0.08552, 0.085662, 0.08599, 0.085999, 0.086837, 0.087184, 0.087367, 0.08745, 0.087659, 0.087908, 0.088478, 0.089203, 0.089498, 0.090376, 0.09044, 0.092862, 0.093084, 0.093279, 0.093361, 0.097381, 0.098379, 0.100322, 0.100868, 0.10163, 0.103446, 0.104192, 0.105727, 0.108074, 0.108931, 0.113103, 0.152951] [INFO] Minimum=11419, Maximum=121219, Total=5204830, Count=66, Average=78861. (total elapsed nano:67824103252)* *[INFO] memory in usages after test 101 ends:4111938008, total memory:12884901888, max memory:12884901888 with total 5280000 bets* ** This test result meets G1GC specified (more GC times but smaller GC pause). 
When I compare this with the test result of Java 7 update 5, however, the result quite surprised me: java version "1.7.0_05" Java(TM) SE Runtime Environment (build 1.7.0_05-b06) Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode) [INFO] total GC times:8 [INFO] C:/Users/Chris/Documents/My Projects/citibet-2nd/support/matchspace/GC performance/XX_UseG1GC_MaxGCPauseMillis100_java7u5.txt average GC:0.730325 second [INFO] app stops:18, average seconds:0.325045 [DEBUG] [0.433048, 0.712155, 0.725535, 0.73903, 0.765341, 0.774686, 0.833398, 0.859405] [INFO] Minimum=11504, Maximum=175425, Total=5220946, Count=40, Average=130523. (total elapsed nano:43241336678) [INFO] memory in usages after test 101 ends:5626027848, total memory:12884901888, max memory:12884901888 with total 5280000 bets The throughput does increase, but the GC pause time does not meet the requirement (< 100 milliseconds): quite a big difference! Again, both runs use the same GC options. The system reports the flags as: -XX:InitialHeapSize=12884901888 -XX:MaxGCPauseMillis=100 -XX:MaxHeapSize=12884901888 -XX:+PrintCommandLineFlags -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails -XX:+UseCompressedOops -XX:+UseG1GC Am I missing some secret when using Java 7 update 5, as far as GC is concerned? Thanks, Ching Chen -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120808/5d11a5a5/attachment.html From john.cuthbertson at oracle.com Wed Aug 8 11:18:20 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 08 Aug 2012 11:18:20 -0700 Subject: Java 7 update 5 GC In-Reply-To: References: Message-ID: <5022AD6C.1060306@oracle.com> Hi Ching, I'm not sure what's going on here. Do you have the complete GC logs available? What was the behavior with jdk7u4? 
I wonder if you are running into some expensive mixed GCs (perhaps as a result of a marking cycle being initiated by a humongous object allocation). Thanks, JohnC On 08/08/12 00:50, Ching Chen wrote: > Dear Sirs/Madams who may concern: > > My java application (high-volume, low-latency on-line transaction > betting system) needs to accomplish the least GC pause while to keep > acceptable throughput. When I read java 7 G1GC and thought this is > what I really want (I also investigated Azul Zing but stop somewhere > in the middle of my study). > > I got a strange result when I ran a test with java 7 update 5 and > compared it with java 7.0 of same test though. > > The java commands associating to GC in my test is not complicate. I > only specified several of them and let GC to determine the rest for > best performance. The GC options are: > java -Xms12288m -Xmx12288m -verbose:gc -XX:+PrintGCDetails > -XX:+PrintGCApplicationStoppedTime -XX:+PrintCommandLineFlags > -XX:+UseG1GC -XX:MaxGCPauseMillis=100 \ > > The java 7.0 result as following: > > /java version "1.7.0" > Java(TM) SE Runtime Environment (build 1.7.0-b147) > Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode) > [INFO] total GC times:199 > [INFO] C:/Users/Chris/Documents/My > Projects/citibet-2nd/support/matchspace/GC > performance/XX_UseG1GC_MaxGCPauseMillis100.txt average GC:0.067739 second > [INFO] app stops:206, average seconds:0.065926 > [DEBUG] [0.006688, 0.007678, 0.008537, 0.014192, 0.021206, 0.023098, > 0.023628, 0.026924, 0.028749, 0.029012, 0.030648, 0.031663, 0.031704, > 0.031743, 0.033121, > 0.035534, 0.036437, 0.037418, 0.038571, 0.039253, 0.040569, 0.041807, > 0.042169, 0.042459, 0.043335, 0.043797, 0.045504, 0.046178, 0.046337, > 0.050747, 0.051332, > 0.052527, 0.052614, 0.052623, 0.054945, 0.055021, 0.05538, 0.055595, > 0.055836, 0.05586, 0.056101, 0.056134, 0.056167, 0.056181, 0.056243, > 0.056648, 0.056803, > 0.057133, 0.057656, 0.057689, 0.058021, 0.058613, 0.058964, 0.059164, 
> 0.059424, 0.059438, 0.059456, 0.059853, 0.060354, 0.06064, 0.060972, > 0.060999, 0.061141, > 0.061157, 0.061163, 0.061305, 0.061444, 0.061567, 0.061594, 0.061923, > 0.06215, 0.062252, 0.063147, 0.063149, 0.063159, 0.063563, 0.063885, > 0.064047, 0.064882, > 0.064975, 0.065076, 0.065228, 0.065311, 0.066254, 0.066435, 0.066572, > 0.067084, 0.067206, 0.067543, 0.067919, 0.067973, 0.068015, 0.068042, > 0.068397, 0.068797, > 0.0691, 0.069626, 0.069885, 0.070051, 0.070063, 0.070077, 0.070625, > 0.071117, 0.071127, 0.071257, 0.071372, 0.071453, 0.071647, 0.071703, > 0.072027, 0.072058, > 0.072275, 0.072301, 0.07234, 0.072363, 0.072513, 0.072575, 0.072679, > 0.07305, 0.073061, 0.073326, 0.073482, 0.073607, 0.073888, 0.073949, > 0.074562, 0.074604, > 0.074644, 0.075278, 0.075352, 0.07547, 0.075543, 0.075581, 0.076057, > 0.076445, 0.076514, 0.076599, 0.076967, 0.076991, 0.077079, 0.077112, > 0.077192, 0.077519, > 0.077547, 0.077881, 0.078351, 0.078416, 0.078975, 0.079444, 0.079986, > 0.080327, 0.080342, 0.080568, 0.080884, 0.081912, 0.081916, 0.082134, > 0.082136, 0.082305, > 0.082722, 0.082755, 0.082992, 0.083276, 0.083517, 0.083989, 0.084839, > 0.084884, 0.085197, 0.08552, 0.085662, 0.08599, 0.085999, 0.086837, > 0.087184, 0.087367, > 0.08745, 0.087659, 0.087908, 0.088478, 0.089203, 0.089498, 0.090376, > 0.09044, 0.092862, 0.093084, 0.093279, 0.093361, 0.097381, 0.098379, > 0.100322, 0.100868, > 0.10163, 0.103446, 0.104192, 0.105727, 0.108074, 0.108931, 0.113103, > 0.152951] > [INFO] Minimum=11419, Maximum=121219, Total=5204830, Count=66, > Average=78861. (total elapsed nano:67824103252)/ > /[INFO] memory in usages after test 101 ends:4111938008, total > memory:12884901888, max memory:12884901888 with total 5280000 bets/ > // > > This test result meets G1GC specified (more GC times but smaller GC > pause). 
While comparing to the test result of java 7 up/date 5/, the > result is quite surprised me/:/ > // > /java version "1.7.0_05" > Java(TM) SE Runtime Environment (build 1.7.0_05-b06) > Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode) > [INFO] total GC times:8 > [INFO] C:/Users/Chris/Documents/My > Projects/citibet-2nd/support/matchspace/GC > performance/XX_UseG1GC_MaxGCPauseMillis100_java7u5.txt average > GC:0.730325 second > [INFO] app stops:18, average seconds:0.325045 > [DEBUG] [0.433048, 0.712155, 0.725535, 0.73903, 0.765341, 0.774686, > 0.833398, 0.859405] > [INFO] Minimum=11504, Maximum=175425, Total=5220946, Count=40, > Average=130523. (total elapsed nano:43241336678) > [INFO] memory in usages after test 101 ends:5626027848, total > memory:12884901888, max memory:12884901888 with total 5280000 bets > / > > The throughput does increase but the GC pause-time does not meet the > minimum requirement (< 100 millisecond) quite a big difference! > > Again, both runnings use same command GC options. System shows to me > like: > -XX:InitialHeapSize=12884901888 -XX:MaxGCPauseMillis=100 > -XX:MaxHeapSize=12884901888 -XX:+PrintCommandLineFlags -XX:+PrintGC > -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails > -XX:+UseCompressedOops -XX:+UseG1GC > Do I miss something secrets when using java 7 update 5 for GC specific > issues? > > > Thanks, > Ching Chen > // > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120808/f206c333/attachment.html From caoxudong818 at gmail.com Sun Aug 12 01:10:58 2012 From: caoxudong818 at gmail.com (=?GB2312?B?stzQ8bar?=) Date: Sun, 12 Aug 2012 16:10:58 +0800 Subject: Why does max heap size change? Message-ID: Hi all, I am doing some monitoring of the JVM with JMX, and have been stuck on a question about the max heap size. The javadoc of the field *max* of class *java.lang.management.HeapUsage* says, "represents the maximum amount of memory (in bytes) that can be used for memory management. Its value may be undefined. *The maximum amount of memory may change over time if defined*. " So, my question is: why does the max heap size change if I defined it with -Xmx? Any response will be appreciated. Best Regards. caoxudong -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120812/7a12079f/attachment.html From rednaxelafx at gmail.com Sun Aug 12 08:47:58 2012 From: rednaxelafx at gmail.com (Krystal Mok) Date: Sun, 12 Aug 2012 23:47:58 +0800 Subject: Why does max heap size change? In-Reply-To: References: Message-ID: Hi Xudong, I believe the class you're referring to is java.lang.management.MemoryUsage. Quoting the docs [1]: A MemoryUsage object represents a snapshot of memory usage. Instances of the MemoryUsage class are usually constructed by methods that are used to obtain memory usage information about individual memory pool of the Java virtual machine or the heap or non-heap memory of the Java virtual machine as a whole. Which means MemoryUsage is not just used to represent the usage of the Java heap as a whole. -Xmx only locks the maximum size of the Java heap, but doesn't say anything about how the spaces within the Java heap should be arranged. Let's look at an example of a MemoryUsage object representing the usage of a memory pool. 
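The same per-pool MemoryUsage snapshots can also be read programmatically through the standard java.lang.management API; a minimal sketch (the class name PoolMaxProbe is invented for illustration, not from this thread):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Prints the max/used values of every memory pool. On a collector with an
// adaptive size policy, a pool's getMax() can change between snapshots even
// though -Xmx (the whole-heap limit) is fixed.
public class PoolMaxProbe {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            // getMax() may be -1, meaning "undefined" for this pool
            System.out.printf("%-25s max=%d used=%d%n",
                    pool.getName(), u.getMax(), u.getUsed());
        }
    }
}
```

Running this twice around an allocation burst makes the changing per-pool max values visible without JConsole.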
Try running JConsole on a HotSpot Server VM with default arguments. The collector used by default would be the Parallel collector. Go to the MBean tab, find the MBean of "PS Eden Space" from java.lang -> MemoryPool, and open its Usage property. Refresh the value a few times, and see if the max field changes. In my environment, it does change over time. That's because the Parallel collector uses an adaptive size policy, which could change the maximum size of the generations adaptively. Regards, Kris [1]: http://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryUsage.html On Sun, Aug 12, 2012 at 4:10 PM, caoxudong wrote: > Hi all, > > I am doing some monitor jobs for JVM with JMX, and has been stuck on a > question about max heap size. > > The javadoc of the field *max* of class *java.lang.management.HeapUsage*says, > > "represents the maximum amount of memory (in bytes) that can be > used for memory management. Its value may be undefined. > > *The maximum amount of memory may change over time if defined*. " > > So, my question is, > > Why dose max heap size change, if I defined it with Xmx? > > Any response will be appreciated. > > Best Regards. > > caoxudong > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120812/6a63aefe/attachment.html From caoxudong818 at gmail.com Sun Aug 12 09:19:52 2012 From: caoxudong818 at gmail.com (=?GB2312?B?stzQ8bar?=) Date: Mon, 13 Aug 2012 00:19:52 +0800 Subject: Why does max heap size change? In-Reply-To: References: Message-ID: Hi Krystal, Thanks for your explanation. I understand it. Thanks a lot. Best Regards. caoxudong 2012/8/12 Krystal Mok > Hi Xudong, > > I believe the class you're referring to is > java.lang.management.MemoryUsage. 
Quoting the docs [1]: > > A MemoryUsage object represents a snapshot of memory usage. Instances of > the MemoryUsage class are usually constructed by methods that are used to > obtain memory usage information about individual memory pool of the Java > virtual machine or the heap or non-heap memory of the Java virtual machine > as a whole. > > Which means MemoryUsage is not just used to represent the usage of the > Java heap as a whole. -Xmx only locks the maximum size of the Java heap, > but doesn't say anything about how the spaces within the Java heap should > be arranged. > > Let's look at an example of a MemoryUsage object representing the usage of > a memory pool. > Try running JConsole on a HotSpot Server VM with default arguments. The > collector used by default would be the Parallel collector. > Go to the MBean tab, find the MBean of "PS Eden Space" from java.lang -> > MemoryPool, and open its Usage property. Refresh the value a few times, and > see if the max field changes. > In my environment, it does change over time. That's because the Parallel > collector uses an adaptive size policy, which could change the maximum size > of the generations adaptively. > > Regards, > Kris > > [1]: > http://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryUsage.html > > > On Sun, Aug 12, 2012 at 4:10 PM, ?????? wrote: > >> Hi all, >> >> I am doing some monitor jobs for JVM with JMX, and has been stuck on a >> question about max heap size. >> >> The javadoc of the field *max* of class *java.lang.management.HeapUsage*says, >> >> "represents the maximum amount of memory (in bytes) that can be >> used for memory management. Its value may be undefined. >> >> *The maximum amount of memory may change over time if defined*. " >> >> So, my question is, >> >> Why dose max heap size change, if I defined it with Xmx? >> >> Any response will be appreciated. >> >> Best Regards. 
>> >> caoxudong >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120813/9235b639/attachment.html From java at java4.info Tue Aug 14 02:27:47 2012 From: java at java4.info (Florian Binder) Date: Tue, 14 Aug 2012 11:27:47 +0200 Subject: CMS, PLAB-size and fragmentation Message-ID: <502A1A13.9010709@java4.info> Hi everybody, one of our servers (using CMS with ParNew) promotes a lot of very small objects to the tenured generation at each young GC (the survivor spaces are disabled since the objects would survive them anyway): 5[4]: 10703/101530/19930 5[6]: 5482/50575/10115 (Sometimes even more.) Therefore I increased CMSOldPLABMax=131072 (OK, this might be too large and 32k would be enough) and decreased CMSOldPLABMin=8, because there are always a few larger objects which always have different sizes: 11[134]: 7/8/8 11[148]: 4/944/8 11[150]: 7/9/9 11[156]: 6/24/8 11[158]: 4/16/8 11[160]: 6/8/8 11[164]: 3/8/8 11[166]: 7/64/8 11[224]: 7/8/8 My questions now are: Does this have any effect on the fragmentation of the tenured space? I assume increasing the maximum would have a positive effect because the small objects are packed more compactly. Is this right? Are there any other negative effects of changing these parameters? Thanks a lot, Flo From taras.tielkes at gmail.com Wed Aug 15 04:49:07 2012 From: taras.tielkes at gmail.com (Taras Tielkes) Date: Wed, 15 Aug 2012 13:49:07 +0200 Subject: Faster card marking: chances for Java 6 backport In-Reply-To: References: <4F91E64D.1070509@oracle.com> <4F95022A.7060103@oracle.com> Message-ID: Hi, Is the patch still scheduled to be integrated in an upcoming Java 7 update release? 
Thanks, Taras On Tue, Apr 24, 2012 at 6:12 AM, Krystal Mok wrote: > Hi Taras, > > I asked something related in an earlier thread, > http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-March/005380.html > Looks like people should put their bet on JDK7 instead of staying on JDK6... > > - Kris > > > On Tue, Apr 24, 2012 at 4:07 AM, Taras Tielkes > wrote: >> >> Hi Bengt, >> >> Thanks for the correction - you're completely right, of course. >> >> To me, the decision process for which performance improvements are >> backported to the previous release stream has never been completely >> clear. >> Given that the change in question seems quite an isolated fix, I >> though it would make sense to ask. >> >> Thanks, >> -tt >> >> On Mon, Apr 23, 2012 at 9:18 AM, Bengt Rutisson >> wrote: >> > >> > Taras, >> > >> > Maybe I'm being a bit picky here, but just to be clear. The change for >> > 7068625 is for faster card scanning - not marking. >> > >> > I agree with Jon, I don't think this will be backported to JDK6 unless >> > there is an explicit customer request to do so. >> > >> > Bengt >> > >> > On 2012-04-21 00:42, Jon Masamitsu wrote: >> >> Taras, >> >> >> >> I haven't heard any discussions about a backport. >> >> I think it's a issue that the sustaining organization would >> >> have to consider (since it's to jdk6). >> >> >> >> Jon >> >> >> >> On 4/20/2012 12:46 PM, Taras Tielkes wrote: >> >>> Hi, >> >>> >> >>> Are there plans to port RFE 7068625 to Java 6? 
>> >>> >> >>> Thanks, >> >>> -tt >> >>> _______________________________________________ >> >>> hotspot-gc-use mailing list >> >>> hotspot-gc-use at openjdk.java.net >> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> >> hotspot-gc-use mailing list >> >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > >> > _______________________________________________ >> > hotspot-gc-use mailing list >> > hotspot-gc-use at openjdk.java.net >> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > From bengt.rutisson at oracle.com Thu Aug 16 00:58:24 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 16 Aug 2012 09:58:24 +0200 Subject: Faster card marking: chances for Java 6 backport In-Reply-To: References: <4F91E64D.1070509@oracle.com> <4F95022A.7060103@oracle.com> Message-ID: <502CA820.4040400@oracle.com> Hi Taras, The patch was integrated just after the 7u4 was branched. The 7u6 and 7u8 releases will be based on the 7u4 branch. So, I think the patch will not be available in JDK 7 until 7u10. Bengt On 2012-08-15 13:49, Taras Tielkes wrote: > Hi, > > Is the patch still scheduled to be integrated in an upcoming Java 7 > update release? > > Thanks, > Taras > > On Tue, Apr 24, 2012 at 6:12 AM, Krystal Mok wrote: >> Hi Taras, >> >> I asked something related in an earlier thread, >> http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-March/005380.html >> Looks like people should put their bet on JDK7 instead of staying on JDK6... >> >> - Kris >> >> >> On Tue, Apr 24, 2012 at 4:07 AM, Taras Tielkes >> wrote: >>> Hi Bengt, >>> >>> Thanks for the correction - you're completely right, of course. 
>>> >>> To me, the decision process for which performance improvements are >>> backported to the previous release stream has never been completely >>> clear. >>> Given that the change in question seems quite an isolated fix, I >>> though it would make sense to ask. >>> >>> Thanks, >>> -tt >>> >>> On Mon, Apr 23, 2012 at 9:18 AM, Bengt Rutisson >>> wrote: >>>> Taras, >>>> >>>> Maybe I'm being a bit picky here, but just to be clear. The change for >>>> 7068625 is for faster card scanning - not marking. >>>> >>>> I agree with Jon, I don't think this will be backported to JDK6 unless >>>> there is an explicit customer request to do so. >>>> >>>> Bengt >>>> >>>> On 2012-04-21 00:42, Jon Masamitsu wrote: >>>>> Taras, >>>>> >>>>> I haven't heard any discussions about a backport. >>>>> I think it's a issue that the sustaining organization would >>>>> have to consider (since it's to jdk6). >>>>> >>>>> Jon >>>>> >>>>> On 4/20/2012 12:46 PM, Taras Tielkes wrote: >>>>>> Hi, >>>>>> >>>>>> Are there plans to port RFE 7068625 to Java 6? 
>>>>>> >>>>>> Thanks, >>>>>> -tt >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From haim at performize-it.com Fri Aug 17 14:14:08 2012 From: haim at performize-it.com (Haim Yadid) Date: Fri, 17 Aug 2012 23:14:08 +0200 Subject: CMS Concurrent mode failure fallback to the serial old collector? In-Reply-To: References: Message-ID: > I am analysing a GC pause problem and I have noticed that when CMS is used > and a concurrent mode failure occurs or GC is triggered manually (by > System.gc()) the STW collector used does not seem to be parallel. ( I am > aware of the ExplicitGCInvokesConcurrent flag but it will not solve > concurrent failure ). > I tried to play with -XX:ParallelGCThreads=... -XX:ParallelCMSThreads=... > but they seem have no effect (only on the ParNew GC). 
> > I am deducing it from the following GC log line > > 24.904: [Full GC (System) 24.904: [CMS: 302703K->303056K(2116864K), > 1.0847520 secs] 484492K->303056K(2423552K), [CMS Perm : > 7528K->7525K(21248K)], 1.0852780 secs] [Times: user=1.04 sys=0.02, > real=1.09 secs] > If it would have been parallel "user" would have been equal to "nThreads" > * "real". > In addition if I choose ParallelOld GC it will behave correctly. > > I really do not understand why the failover STW mechanism of CMS is not > parallel shouldn't it be finishing the work as soon as possible ? > I am not able to find anything useful on the internet. > > I think G1 behaves in the same manner BTW ( AFAIK the the fallback > collector of G1 is copied from CMS) > > Help will be appreciated. > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120817/171b62b3/attachment.html From haim at performize-it.com Fri Aug 17 14:08:38 2012 From: haim at performize-it.com (Haim Yadid) Date: Fri, 17 Aug 2012 23:08:38 +0200 Subject: CMS Concurrent mode failure fallback to the serial old collector? Message-ID: I am analysing a GC pause problem and I have noticed that when CMS is used and a concurrent mode failure occurs, or a GC is triggered manually (by System.gc()), the STW collector used does not seem to be parallel. (I am aware of the ExplicitGCInvokesConcurrent flag, but it will not help with a concurrent mode failure.) I tried to play with -XX:ParallelGCThreads=... -XX:ParallelCMSThreads=... but they seem to have no effect (they apply only to the ParNew GC). I am deducing it from the following GC log line: 24.904: [Full GC (System) 24.904: [CMS: 302703K->303056K(2116864K), 1.0847520 secs] 484492K->303056K(2423552K), [CMS Perm : 7528K->7525K(21248K)], 1.0852780 secs] [Times: user=1.04 sys=0.02, real=1.09 secs] If it were parallel, "user" would be roughly equal to "nThreads" * "real". 
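The arithmetic behind that deduction can be sketched as a tiny helper (the class and method names here are invented for illustration): for a genuinely parallel pause, (user + sys) / real should approach the number of GC threads, while the quoted log line gives a ratio of roughly 1.

```java
// Estimate effective GC parallelism from the times HotSpot prints at the
// end of a GC log line: [Times: user=... sys=..., real=... secs]
public class GcParallelism {
    static double effectiveThreads(double user, double sys, double real) {
        return (user + sys) / real; // ~1.0 means the pause ran on one core
    }

    public static void main(String[] args) {
        // Values from the Full GC line quoted above: user=1.04 sys=0.02, real=1.09
        double p = effectiveThreads(1.04, 0.02, 1.09);
        System.out.printf("effective parallelism ~ %.2f%n", p); // ~0.97: essentially serial
    }
}
```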
In addition, if I choose the ParallelOld GC, it behaves correctly. I really do not understand why the failover STW mechanism of CMS is not parallel; shouldn't it finish the work as soon as possible? I am not able to find anything useful on the internet. I think G1 behaves in the same manner, BTW (AFAIK the fallback collector of G1 is copied from CMS). Help will be appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120817/2d4654d9/attachment.html From hjohn at xs4all.nl Sat Aug 18 05:06:46 2012 From: hjohn at xs4all.nl (John Hendrikx) Date: Sat, 18 Aug 2012 14:06:46 +0200 Subject: Soft References... are they working as intended? Message-ID: <502F8556.8080701@xs4all.nl> I've come to the conclusion that SoftReferences in the current HotSpot implementation suffer from some problems. I'm running the latest Java 7, with default GC settings and a very modest heap space of 256 MB. On this heap I have on the order of 50-60 large objects that are referenced by SoftReference objects. Each object is a few megabytes in size (they are decoded JPEG images). At any given time, only 10 of these images have strong references to them, totalling no more than 50-60 MB of heap space; the other 200 MB of space is only softly referenced. It is said that SoftReferences are guaranteed to get cleared before heap space runs out, yet in certain extreme circumstances one of the following can happen: 1) 90% of the time, when under high memory pressure (many images loaded and discarded), the VM gets really slow and it seems that some threads get stuck in an infinite loop. What is actually happening is that the GC will run for long periods in a row (up to a few minutes, consuming one CPU core) before the program gets unstuck and finally notices it can clear some SoftReference objects. 
It is possible that the GC has trouble deciding which SoftReferences can be cleared because many of them had (up to a few seconds ago) strong references to them, which themselves may not have been marked as garbage yet. So it recovers, but it is taking so much time to do it that users will think the program is stuck. 2) The rest of the time it actually will throw an out-of-heap-space exception, despite there being SoftReference objects that could have been cleared. This usually happens after a long pause as well. Can anyone confirm that these problems exist, and perhaps advise a course of action? I really don't want to have to second-guess the GC about which images should be discarded, but it looks like I will have no choice but to limit this image cache manually to some reasonable value to avoid the GC getting stuck for long periods. Best regards, John Hendrikx From dhd at exnet.com Sat Aug 18 05:13:20 2012 From: dhd at exnet.com (Damon Hart-Davis) Date: Sat, 18 Aug 2012 13:13:20 +0100 Subject: Soft References... are they working as intended? In-Reply-To: <502F8556.8080701@xs4all.nl> References: <502F8556.8080701@xs4all.nl> Message-ID: <215BC73F-E02D-4E2C-9C0D-C14EA8D7667D@exnet.com> Hi, FWIW I usually combine SoftReferences with some other sort of explicit limit based on heap size to help avert this type of issue, and indeed use a number of different strategies, often involving some explicit LRU management. I can supply code snippets if that would help! B^> Rgds Damon On 18 Aug 2012, at 13:06, John Hendrikx wrote: > I've come to the conclusion that SoftReferences in the current hotspot > implementation are suffering from some problems. > > I'm running the latest Java 7, with default gc settings and a very > modest heap space of 256 MB. > > On this heap I have on the order of 50-60 large objects that are > referenced by SoftReference objects. Each object is a few megabytes in > size (they are decoded JPEG images). 
> > At any given time, only 10 of these images have strong references to > them, totalling no more than 50-60 MB of heap space, the other 200 MB of > space is only soft referenced. > > It is said that SoftReferences are guaranteed to get cleared before heap > space runs out, yet in certain extreme circumstances one of the > following can happen: > > 1) 90% of the time, when under high memory pressure (many images loaded > and discarded), the VM gets really slow and it seems that some threads > get stuck in an infinite loop. What is actually happening is that the > GC will run for long periods in a row (upto a few minutes, consuming one > CPU core) before the program gets unstuck and it finally noticed it can > clear some SoftReference objects. > > It is possible that the GC has trouble deciding which SoftReferences can > be cleared because many of them had (upto a few seconds ago) strong > references to them, which themselves may not have been marked as garbage > yet. > > So it recovers, but it is taking so much time to do it that users will > think the program is stuck. > > 2) The rest of the time it actually will throw an out of heap space > exception, despite there being SoftReference objects that could have > been cleared. This usually happens after a long pause as well. > > Can anyone confirm that these problems exists, and perhaps advice a > course of action? > > I really don't want to have to 2nd guess the GC about which images > should be discarded, but it looks like I will have no choice but to > limit this Image cache manually to some reasonable value to avoid the GC > getting stuck for long periods. 
> > Best regards, > John Hendrikx > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From Andreas.Loew at oracle.com Sat Aug 18 07:36:29 2012 From: Andreas.Loew at oracle.com (Andreas Loew) Date: Sat, 18 Aug 2012 16:36:29 +0200 Subject: Soft References... are they working as intended? In-Reply-To: <215BC73F-E02D-4E2C-9C0D-C14EA8D7667D@exnet.com> References: <502F8556.8080701@xs4all.nl> <215BC73F-E02D-4E2C-9C0D-C14EA8D7667D@exnet.com> Message-ID: <502FA86D.3080604@oracle.com> Hi John, I am a "field guy" and cannot really comment on the latest implementation details in JDK 7, but from what I know about this topic, I can certainly imagine that when, as in your example, SoftReferences start to make up a very large portion of the heap, the currently implemented mechanism will behave poorly and finally fail. Please see mainly http://www.oracle.com/technetwork/java/hotspotfaq-138619.html and especially http://jeremymanson.blogspot.co.uk/2009/07/how-hotspot-decides-to-clear_07.html which explain all the nasty details - I'm going to partly cite both these sources below: You probably know about the -XX:SoftRefLRUPolicyMSPerMB= parameter. Every SoftReference has a timestamp field that is updated when it is accessed (when it is constructed or when the get() method is called). This gives a very coarse ordering over the SoftReferences; the timestamp indicates the time of the last GC before they were accessed. Whenever a garbage collection occurs (and only then), the decision to clear a SoftReference is based on two factors: 1. how old the reference's timestamp is, and 2. how much free space there is in memory. In my experience, this coarsely means that a soft reference will survive (after the last strong reference to the object has been collected!) 
for SoftRefLRUPolicyMSPerMB milliseconds times the number of megabytes of current free space in the heap. The default is 1000 ms per MB, so if an object is only softly reachable it will stay alive for 1 s when only 1 MB of heap space is free - provided that garbage collections run frequently enough to check this condition (!!!). Also, the HotSpot Server VM uses the maximum possible heap size (as set with the -Xmx option) to calculate the current free space remaining, while the Client VM uses the current actual heap size to calculate the free space (!). "This means that the general tendency is for the Server VM to grow the heap rather than flush soft references, and -Xmx therefore has a significant effect on when soft references are garbage collected. On the other hand, the Client VM will have a greater tendency to flush soft references rather than grow the heap." And also - this is probably what affects you most severely: "One thing to notice about this is that it implies that SoftReferences will always be kept for at least one GC after their last access. Why is that? Well, for the interval, we are using the clock value of the last garbage collection, not the current one. As a result, if a SoftReference has been accessed since the last garbage collection, it will have the same timestamp as that garbage collection, and the interval will be 0. 0 <= free_heap * 1000 for any amount of free_heap, so any SoftReference accessed since the last garbage collection is guaranteed to be kept." The big hidden pitfall is that if the objects held via SoftReferences were too big to be allocated in the young generation (which, in my understanding, is true in your example), the above does not refer to the most recent minor GC, but to the most recent old-generation (i.e. full) GC (!!!). So for your case, as quoted below, please check the following conditions: * What version of the JVM are you using? * If using the server VM, do you use equal -Xms and -Xmx values? 
* Are your "decoded JPEG images" directly being allocated into the old generation (which I assume to be true)? * And finally - looking at the general frequency of the relevant type of GC in your scenario, did you access the soft-referenced objects since the last (in your scenario probably: full) GC when you see everything getting stuck or an OOME? Hope this helps & best regards, Andreas On 18.08.2012 at 14:13, Damon Hart-Davis wrote: > Hi, > > FWIW I usually combine SoftReferences with some other sort of explicit limit based on heap size to help avert this type of issue, and indeed use a number of different strategies, often involving some explicit LRU management. > > I can supply code snippets if that would help! B^> > > Rgds > > Damon > > > On 18 Aug 2012, at 13:06, John Hendrikx wrote: > >> I've come to the conclusion that SoftReferences in the current hotspot >> implementation are suffering from some problems. >> >> I'm running the latest Java 7, with default gc settings and a very >> modest heap space of 256 MB. >> >> On this heap I have on the order of 50-60 large objects that are >> referenced by SoftReference objects. Each object is a few megabytes in >> size (they are decoded JPEG images). >> >> At any given time, only 10 of these images have strong references to >> them, totalling no more than 50-60 MB of heap space, the other 200 MB of >> space is only soft referenced. >> >> It is said that SoftReferences are guaranteed to get cleared before heap >> space runs out, yet in certain extreme circumstances one of the >> following can happen: >> >> 1) 90% of the time, when under high memory pressure (many images loaded >> and discarded), the VM gets really slow and it seems that some threads >> get stuck in an infinite loop. What is actually happening is that the >> GC will run for long periods in a row (upto a few minutes, consuming one >> CPU core) before the program gets unstuck and it finally noticed it can >> clear some SoftReference objects. 
>> >> It is possible that the GC has trouble deciding which SoftReferences can >> be cleared because many of them had (upto a few seconds ago) strong >> references to them, which themselves may not have been marked as garbage >> yet. >> >> So it recovers, but it is taking so much time to do it that users will >> think the program is stuck. >> >> 2) The rest of the time it actually will throw an out of heap space >> exception, despite there being SoftReference objects that could have >> been cleared. This usually happens after a long pause as well. >> >> Can anyone confirm that these problems exists, and perhaps advice a >> course of action? >> >> I really don't want to have to 2nd guess the GC about which images >> should be discarded, but it looks like I will have no choice but to >> limit this Image cache manually to some reasonable value to avoid the GC >> getting stuck for long periods. >> >> Best regards, >> John Hendrikx -- Andreas Loew | Senior Java Architect ACS Principal Service Delivery Engineer ORACLE Germany From haim at performize-it.com Mon Aug 20 03:29:24 2012 From: haim at performize-it.com (Haim Yadid) Date: Mon, 20 Aug 2012 13:29:24 +0300 Subject: What is the logic that G1GC follows when triggering young/mixed/full GC Message-ID: I am evaluating G1GC as a candidate to solve the GC pauses of an application with a 20 GB heap. From time to time G1 issues a full GC, which leads to a long pause (at least 10 seconds). In addition, from time to time it triggers mixed collections, which then breach the soft real-time requirement I set (100 ms max pause time). Why does this happen? What are the reasons for G1 to initiate a full GC? Why are the mixed-mode collections so long? 
Attaching part of the log 2012-08-19T23:53:26.414+0000: 36846.868: [GC pause (young), 0.99116700 secs] [Parallel Time: 957.3 ms] [GC Worker Start (ms): 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846868.9 36846869.0 36846869.0 Avg: 36846868.9, Min: 36846868.9, Max: 36846869.0, Diff: 0.1] [Ext Root Scanning (ms): 1.7 1.5 1.9 2.2 1.8 1.7 1.7 1.6 1.5 1.7 1.7 1.4 1.9 Avg: 1.7, Min: 1.4, Max: 2.2, Diff: 0.8] [Update RS (ms): 9.8 9.8 9.9 9.3 9.6 10.0 10.4 10.0 10.7 11.4 11.0 10.1 9.6 Avg: 10.1, Min: 9.3, Max: 11.4, Diff: 2.1] [Processed Buffers : 11 13 13 15 18 10 20 11 12 13 11 12 9 Sum: 168, Avg: 12, Min: 9, Max: 20, Diff: 11] [Scan RS (ms): 67.9 67.8 67.5 67.8 67.7 67.6 67.2 67.6 67.0 66.0 66.5 67.9 67.7 Avg: 67.4, Min: 66.0, Max: 67.9, Diff: 1.9] [Object Copy (ms): 875.3 875.3 876.5 875.4 875.6 875.3 875.5 875.3 875.4 875.6 875.7 875.2 875.4 Avg: 875.5, Min: 875.2, Max: 876.5, Diff: 1.2] [Termination (ms): 1.1 1.1 0.0 1.0 1.0 1.2 0.9 1.2 1.1 1.0 0.9 1.1 1.2 Avg: 1.0, Min: 0.0, Max: 1.2, Diff: 1.2] [Termination Attempts : 1974 1909 1 1892 1665 2133 1751 2073 2045 1701 1756 2077 2158 Sum: 23135, Avg: 1779, Min: 1, Max: 2158, Diff: 2157] [GC Worker End (ms): 36847824.7 36847824.8 36847824.8 36847824.8 36847824.7 36847824.8 36847824.8 36847824.7 36847824.8 36847824.8 36847824.8 36847824.7 36847824.8 Avg: 36847824.8, Min: 36847824.7, Max: 36847824.8, Diff: 0.1] [GC Worker (ms): 955.9 955.9 955.9 955.9 955.8 955.9 955.9 955.8 955.8 955.9 955.9 955.8 955.8 Avg: 955.9, Min: 955.8, Max: 955.9, Diff: 0.2] [GC Worker Other (ms): 1.5 1.8 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.8, Diff: 0.3] [Clear CT: 1.7 ms] [Other: 32.1 ms] [Choose CSet: 0.3 ms] [Ref Proc: 0.7 ms] [Ref Enq: 0.0 ms] [Free CSet: 12.4 ms] [Eden: 3776M(3776M)->0B(3306M) Survivors: 4096K->474M Heap: 8949M(18908M)->6839M(18908M)] [Times: user=12.45 sys=0.03, real=0.99 secs] 2012-08-19T23:59:23.636+0000: 37204.090: [GC 
pause (mixed), 1.48753900 secs] [Parallel Time: 1439.9 ms] [GC Worker Start (ms): 37204093.5 37204093.5 37204093.5 37204093.5 37204093.5 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 37204093.6 Avg: 37204093.6, Min: 37204093.5, Max: 37204093.6, Diff: 0.1] [Ext Root Scanning (ms): 1.5 2.0 1.9 1.4 1.6 1.6 1.4 1.5 1.5 1.6 1.4 1.6 1.4 Avg: 1.6, Min: 1.4, Max: 2.0, Diff: 0.6] [Update RS (ms): 14.7 12.0 11.5 12.4 11.5 11.6 11.9 11.5 15.5 11.8 11.9 12.8 11.8 Avg: 12.4, Min: 11.5, Max: 15.5, Diff: 4.0] [Processed Buffers : 10 7 7 8 21 7 7 32 8 7 8 7 8 Sum: 137, Avg: 10, Min: 7, Max: 32, Diff: 25] [Scan RS (ms): 144.8 146.8 147.2 147.0 147.9 147.6 147.3 147.9 143.9 147.6 147.5 146.1 147.6 Avg: 146.9, Min: 143.9, Max: 147.9, Diff: 4.0] [Object Copy (ms): 1271.4 1271.8 1273.0 1271.7 1271.7 1271.6 1271.8 1271.6 1272.4 1277.3 1271.6 1273.1 1273.0 Avg: 1272.5, Min: 1271.4, Max: 1277.3, Diff: 5.8] [Termination (ms): 6.0 5.8 4.7 5.9 5.5 5.9 5.8 5.6 4.9 0.0 5.9 4.4 4.4 Avg: 5.0, Min: 0.0, Max: 6.0, Diff: 6.0] [Termination Attempts : 8002 7759 6378 7864 7158 7929 7776 7293 6256 1 7901 5835 5842 Sum: 85994, Avg: 6614, Min: 1, Max: 8002, Diff: 8001] [GC Worker End (ms): 37205531.9 37205531.9 37205531.9 37205531.9 37205532.0 37205531.9 37205531.9 37205531.9 37205532.0 37205532.0 37205531.9 37205532.0 37205532.0 Avg: 37205531.9, Min: 37205531.9, Max: 37205532.0, Diff: 0.1] [GC Worker (ms): 1438.4 1438.4 1438.3 1438.4 1438.4 1438.3 1438.3 1438.4 1438.4 1438.4 1438.3 1438.4 1438.4 Avg: 1438.4, Min: 1438.3, Max: 1438.4, Diff: 0.1] [GC Worker Other (ms): 1.6 1.6 1.6 1.6 1.6 1.6 1.7 1.6 1.7 1.7 1.7 1.9 1.7 Avg: 1.6, Min: 1.6, Max: 1.9, Diff: 0.3] [Clear CT: 2.1 ms] [Other: 45.6 ms] [Choose CSet: 2.3 ms] [Ref Proc: 0.6 ms] [Ref Enq: 0.0 ms] [Free CSet: 15.8 ms] [Eden: 3306M(3306M)->0B(3532M) Survivors: 474M->474M Heap: 13220M(20036M)->10771M(20036M)] [Times: user=18.47 sys=0.10, real=1.49 secs] 2012-08-19T23:59:31.632+0000: 37212.087: [GC pause (mixed), 
0.79556800 secs] [Parallel Time: 766.4 ms] [GC Worker Start (ms): 37212091.2 37212091.2 37212091.2 37212091.2 37212091.2 37212091.2 37212091.2 37212091.3 37212091.3 37212091.3 37212091.3 37212091.3 37212091.3 Avg: 37212091.3, Min: 37212091.2, Max: 37212091.3, Diff: 0.1] [Ext Root Scanning (ms): 1.5 2.1 1.4 1.5 1.4 2.0 1.6 1.5 2.0 1.5 1.6 1.1 1.4 Avg: 1.6, Min: 1.1, Max: 2.1, Diff: 1.0] [Update RS (ms): 15.3 14.6 15.4 18.6 15.4 14.6 15.1 15.3 14.6 15.4 15.1 15.2 15.5 Avg: 15.4, Min: 14.6, Max: 18.6, Diff: 4.0] [Processed Buffers : 40 30 31 40 34 30 31 35 30 38 32 38 30 Sum: 439, Avg: 33, Min: 30, Max: 40, Diff: 10] [Scan RS (ms): 70.4 70.4 70.4 67.0 70.4 70.4 70.3 70.4 70.4 70.3 70.3 70.5 70.1 Avg: 70.1, Min: 67.0, Max: 70.5, Diff: 3.4] [Object Copy (ms): 671.5 670.5 670.6 670.5 670.5 670.8 670.6 670.5 670.6 671.0 670.9 677.6 670.8 Avg: 671.3, Min: 670.5, Max: 677.6, Diff: 7.1] [Termination (ms): 6.2 7.3 7.2 7.2 7.1 7.1 7.2 7.1 7.2 6.6 6.9 0.0 7.0 Avg: 6.5, Min: 0.0, Max: 7.3, Diff: 7.3] [Termination Attempts : 7855 8294 8154 8366 8234 8178 8029 7990 8054 8147 7594 1 8133 Sum: 97029, Avg: 7463, Min: 1, Max: 8366, Diff: 8365] [GC Worker End (ms): 37212856.1 37212856.1 37212856.1 37212856.2 37212856.2 37212856.1 37212856.2 37212856.1 37212856.2 37212856.2 37212856.1 37212856.2 37212856.2 Avg: 37212856.2, Min: 37212856.1, Max: 37212856.2, Diff: 0.1] [GC Worker (ms): 764.9 764.9 764.9 764.9 765.0 764.9 764.9 764.9 764.9 764.9 764.8 764.9 764.9 Avg: 764.9, Min: 764.8, Max: 765.0, Diff: 0.1] [GC Worker Other (ms): 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 2.1 1.6 Avg: 1.6, Min: 1.5, Max: 2.1, Diff: 0.5] [Clear CT: 2.5 ms] [Other: 26.6 ms] [Choose CSet: 2.7 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 10.1 ms] [Eden: 1318M(3532M)->0B(3822M) Survivors: 474M->266M Heap: 13077M(20444M)->11364M(20444M)] [Times: user=10.01 sys=0.00, real=0.79 secs] 2012-08-19T23:59:47.406+0000: 37227.860: [GC pause (mixed), 2.20691100 secs] [Parallel Time: 2151.1 ms] [GC Worker Start 
(ms): 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.4 37227867.5 37227867.5 37227867.5 37227867.5 37227867.5 Avg: 37227867.4, Min: 37227867.4, Max: 37227867.5, Diff: 0.1] [Ext Root Scanning (ms): 1.5 1.9 1.5 1.6 2.2 1.6 1.2 1.6 2.1 1.3 1.4 1.3 1.4 Avg: 1.6, Min: 1.2, Max: 2.2, Diff: 0.9] [Update RS (ms): 19.2 17.1 19.3 19.6 17.4 20.7 17.7 18.0 19.8 17.9 17.7 17.7 21.3 Avg: 18.7, Min: 17.1, Max: 21.3, Diff: 4.1] [Processed Buffers : 9 7 8 5 9 6 11 9 7 8 8 8 7 Sum: 102, Avg: 7, Min: 5, Max: 11, Diff: 6] [Scan RS (ms): 1384.6 1394.3 1384.9 1395.4 1396.9 1380.9 1386.4 1395.6 1395.8 1384.9 1399.2 1401.8 1387.4 Avg: 1391.4, Min: 1380.9, Max: 1401.8, Diff: 20.8] [Object Copy (ms): 740.2 731.6 739.9 728.2 728.2 741.4 744.2 729.6 727.0 740.8 726.4 723.9 734.7 Avg: 733.5, Min: 723.9, Max: 744.2, Diff: 20.3] [Termination (ms): 4.0 4.6 3.9 4.8 4.8 4.9 0.0 4.8 4.7 4.6 4.9 4.8 4.8 Avg: 4.3, Min: 0.0, Max: 4.9, Diff: 4.9] [Termination Attempts : 5659 7036 5875 6888 6896 6924 1 7335 6824 7011 7219 7016 7148 Sum: 81832, Avg: 6294, Min: 1, Max: 7335, Diff: 7334] [GC Worker End (ms): 37230017.1 37230017.0 37230017.0 37230017.1 37230017.1 37230017.1 37230017.1 37230017.1 37230017.0 37230017.0 37230017.0 37230017.0 37230017.1 Avg: 37230017.1, Min: 37230017.0, Max: 37230017.1, Diff: 0.1] [GC Worker (ms): 2149.7 2149.6 2149.6 2149.7 2149.6 2149.6 2149.7 2149.6 2149.6 2149.5 2149.6 2149.6 2149.6 Avg: 2149.6, Min: 2149.5, Max: 2149.7, Diff: 0.2] [GC Worker Other (ms): 1.7 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.7 Avg: 1.6, Min: 1.6, Max: 1.7, Diff: 0.1] [Clear CT: 3.2 ms] [Other: 52.6 ms] [Choose CSet: 6.4 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 16.7 ms] [Eden: 3152M(3822M)->0B(3912M) Survivors: 266M->176M Heap: 14601M(20444M)->11309M(20444M)] [Times: user=28.03 sys=0.01, real=2.21 secs] 2012-08-19T23:59:49.615+0000: 37230.069: [Full GC 11309M->2021M(7604M), 10.4855330 secs] [Times: user=16.90 sys=0.55, real=10.48 secs] 
2012-08-20T00:00:07.434+0000: 37247.889: [GC pause (young), 0.24075000 secs] [Parallel Time: 230.3 ms] [GC Worker Start (ms): 37247889.7 37247889.7 37247889.7 37247889.7 37247889.7 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 37247889.8 Avg: 37247889.8, Min: 37247889.7, Max: 37247889.8, Diff: 0.1] [Ext Root Scanning (ms): 2.3 2.4 2.2 2.3 2.6 2.4 2.9 2.3 2.1 2.4 2.1 2.2 2.3 Avg: 2.3, Min: 2.1, Max: 2.9, Diff: 0.8] [Update RS (ms): 10.0 10.5 9.6 9.4 9.2 8.8 8.1 9.2 11.8 11.2 9.7 9.3 9.1 Avg: 9.7, Min: 8.1, Max: 11.8, Diff: 3.7] [Processed Buffers : 8 8 9 7 9 16 12 14 7 8 7 8 8 Sum: 121, Avg: 9, Min: 7, Max: 16, Diff: 9] [Scan RS (ms): 12.1 11.6 12.6 12.6 12.6 13.1 13.1 13.0 10.5 10.8 12.6 12.8 13.1 Avg: 12.4, Min: 10.5, Max: 13.1, Diff: 2.6] [Object Copy (ms): 204.4 204.3 204.3 204.4 204.4 204.4 204.6 204.3 204.3 204.3 204.3 204.3 204.2 Avg: 204.3, Min: 204.2, Max: 204.6, Diff: 0.4] [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0] [Termination Attempts : 3 2 3 2 3 3 2 3 1 3 1 3 2 Sum: 31, Avg: 2, Min: 1, Max: 3, Diff: 2] [GC Worker End (ms): 37248118.6 37248118.6 37248118.6 37248118.6 37248118.6 37248118.5 37248118.5 37248118.6 37248118.6 37248118.6 37248118.6 37248118.6 37248118.6 Avg: 37248118.6, Min: 37248118.5, Max: 37248118.6, Diff: 0.1] [GC Worker (ms): 228.9 228.9 228.8 228.8 228.9 228.8 228.8 228.9 228.8 228.8 228.8 228.8 228.8 Avg: 228.8, Min: 228.8, Max: 228.9, Diff: 0.1] [GC Worker Other (ms): 1.5 1.5 1.6 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.6, Diff: 0.1] [Clear CT: 0.9 ms] [Other: 9.5 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 4.2 ms] [Eden: 1520M(3912M)->0B(1330M) Survivors: 0B->190M Heap: 3945M(7604M)->2847M(7604M)] [Times: user=2.99 sys=0.00, real=0.24 secs] 2012-08-20T00:00:16.250+0000: 37256.705: [GC pause (young), 0.19628600 secs] [Parallel Time: 187.5 ms] [GC Worker Start (ms): 37256705.1 
37256705.1 37256705.1 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 37256705.2 Avg: 37256705.2, Min: 37256705.1, Max: 37256705.2, Diff: 0.1] [Ext Root Scanning (ms): 1.4 1.4 1.5 1.6 1.6 1.5 1.6 1.4 2.1 1.2 1.9 1.4 1.3 Avg: 1.5, Min: 1.2, Max: 2.1, Diff: 0.9] [Update RS (ms): 4.9 5.2 5.2 4.9 4.9 5.1 4.9 4.9 4.4 5.3 4.6 5.2 5.1 Avg: 5.0, Min: 4.4, Max: 5.3, Diff: 0.9] [Processed Buffers : 8 9 9 9 12 11 8 8 11 10 8 13 9 Sum: 125, Avg: 9, Min: 8, Max: 13, Diff: 5] [Scan RS (ms): 14.5 14.5 14.4 14.5 14.6 14.3 14.5 14.6 14.7 14.5 14.3 14.3 14.6 Avg: 14.5, Min: 14.3, Max: 14.7, Diff: 0.4] [Object Copy (ms): 165.0 164.9 164.8 164.8 164.8 164.9 164.8 164.8 164.5 164.7 164.9 164.8 164.6 Avg: 164.8, Min: 164.5, Max: 165.0, Diff: 0.5] [Termination (ms): 0.1 0.0 0.2 0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.1 0.2 Avg: 0.1, Min: 0.0, Max: 0.2, Diff: 0.2] [Termination Attempts : 299 1 263 259 188 185 247 256 307 311 318 270 304 Sum: 3208, Avg: 246, Min: 1, Max: 318, Diff: 317] [GC Worker End (ms): 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 37256891.1 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 37256891.2 Avg: 37256891.2, Min: 37256891.1, Max: 37256891.2, Diff: 0.1] [GC Worker (ms): 186.1 186.0 186.1 186.1 186.1 186.0 186.0 186.0 186.0 186.0 186.0 186.0 185.9 Avg: 186.0, Min: 185.9, Max: 186.1, Diff: 0.1] [GC Worker Other (ms): 1.5 1.6 1.5 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.6, Diff: 0.1] [Clear CT: 0.8 ms] [Other: 8.0 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 4.0 ms] [Eden: 1330M(1330M)->0B(1332M) Survivors: 190M->188M Heap: 4487M(7604M)->3343M(7604M)] [Times: user=2.43 sys=0.00, real=0.20 secs] 2012-08-20T00:00:27.180+0000: 37267.635: [GC pause (young), 0.14220300 secs] [Parallel Time: 134.8 ms] [GC Worker Start (ms): 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 37267634.8 
37267634.8 37267634.8 Avg: 37267634.8, Min: 37267634.8, Max: 37267634.8, Diff: 0.1] [Ext Root Scanning (ms): 2.2 1.7 1.6 1.6 1.4 1.5 1.3 1.4 1.6 1.6 1.9 1.6 1.3 Avg: 1.6, Min: 1.3, Max: 2.2, Diff: 0.9] [Update RS (ms): 14.8 15.0 14.9 15.2 15.5 15.1 15.5 15.5 15.2 15.0 15.0 16.1 15.2 Avg: 15.2, Min: 14.8, Max: 16.1, Diff: 1.3] [Processed Buffers : 8 12 13 13 11 5 11 9 8 8 10 19 11 Sum: 138, Avg: 10, Min: 5, Max: 19, Diff: 14] [Scan RS (ms): 0.7 0.7 0.9 0.8 0.7 0.9 0.6 0.5 0.7 0.9 0.5 0.0 0.9 Avg: 0.7, Min: 0.0, Max: 0.9, Diff: 0.9] [Object Copy (ms): 115.5 115.6 115.6 115.3 115.3 115.4 115.5 115.4 115.5 115.3 115.4 115.2 115.4 Avg: 115.4, Min: 115.2, Max: 115.6, Diff: 0.4] [Termination (ms): 0.0 0.3 0.2 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 Avg: 0.3, Min: 0.0, Max: 0.3, Diff: 0.3] [Termination Attempts : 1 471 475 570 615 560 605 595 464 532 616 469 593 Sum: 6566, Avg: 505, Min: 1, Max: 616, Diff: 615] [GC Worker End (ms): 37267768.0 37267768.0 37267768.1 37267768.1 37267768.1 37267768.1 37267768.1 37267768.0 37267768.1 37267768.1 37267768.1 37267768.1 37267768.1 Avg: 37267768.1, Min: 37267768.0, Max: 37267768.1, Diff: 0.1] [GC Worker (ms): 133.3 133.3 133.4 133.3 133.3 133.3 133.3 133.2 133.3 133.3 133.2 133.2 133.3 Avg: 133.3, Min: 133.2, Max: 133.4, Diff: 0.1] [GC Worker Other (ms): 1.5 1.5 1.5 1.5 1.5 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 Avg: 1.6, Min: 1.5, Max: 1.6, Diff: 0.1] [Clear CT: 0.4 ms] [Other: 7.0 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Free CSet: 3.6 ms] [Eden: 1332M(1332M)->0B(1482M) Survivors: 188M->38M Heap: 4675M(7604M)->3370M(7604M)] [Times: user=1.76 sys=0.00, real=0.14 secs] -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120820/a86c4623/attachment-0001.html From jon.masamitsu at oracle.com Mon Aug 20 09:16:59 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 20 Aug 2012 09:16:59 -0700 Subject: CMS Concurrent mode failure fallback to the serial old collector? In-Reply-To: References: Message-ID: <503262FB.5020900@oracle.com> On 08/17/12 14:08, Haim Yadid wrote: > I am analysing a GC pause problem and I have noticed that when CMS is > used and a concurrent mode failure occurs or GC is triggered manually > (by System.gc()) the STW collector used does not seem to be parallel. > (I am aware of the ExplicitGCInvokesConcurrent flag, but it will not > solve concurrent mode failure.) > I tried to play with -XX:ParallelGCThreads=... > -XX:ParallelCMSThreads=... but they seem to have no effect (only on the > ParNew GC). > > I am deducing it from the following GC log line: > > 24.904: [Full GC (System) 24.904: [CMS: 302703K->303056K(2116864K), > 1.0847520 secs] 484492K->303056K(2423552K), [CMS Perm : > 7528K->7525K(21248K)], 1.0852780 secs] [Times: user=1.04 sys=0.02, > real=1.09 secs] > If it had been parallel, "user" would have been equal to > "nThreads" * "real". > In addition, if I choose the parallel old GC it behaves correctly. > > I really do not understand why the failover STW mechanism of CMS is > not parallel - shouldn't it be finishing the work as soon as possible? > I am not able to find anything useful on the internet. You are correct that the concurrent mode failure does a full GC serially. The parallel old collector used for UseParallelGC/UseParallelOldGC was never ported to CMS. Because of differences between UseParallelGC and CMS, it is more work than we had expected. > > I think G1 behaves in the same manner BTW (AFAIK the fallback > collector of G1 is copied from CMS) Yes, G1 behaves the same. 
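Haim's user-vs-real deduction quoted above generalizes into a quick log sanity check: a ratio near 1 points to a single-threaded stop-the-world phase, a ratio near the worker count to a parallel one. The parser below assumes only the [Times: ...] format shown in this thread; the class name is made up:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Throwaway helper: estimate GC parallelism from a HotSpot "[Times: ...]" line.
public class TimesRatio {
    private static final Pattern TIMES =
        Pattern.compile("user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+)");

    /** Returns user/real: ~1.0 suggests a serial STW phase, >>1 a parallel one. */
    static double parallelism(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) throw new IllegalArgumentException("no [Times: ...] found");
        return Double.parseDouble(m.group(1)) / Double.parseDouble(m.group(3));
    }

    public static void main(String[] args) {
        // CMS concurrent-mode-failure full GC from this thread: user ~= real => serial.
        System.out.println(parallelism(
            "[Times: user=1.04 sys=0.02, real=1.09 secs]"));   // ~0.95
        // G1 mixed pause from the earlier log: user >> real => parallel workers.
        System.out.println(parallelism(
            "[Times: user=18.47 sys=0.10, real=1.49 secs]"));  // ~12.4
    }
}
```

Run over a whole log, this quickly separates the parallel young/mixed pauses from the serial full-GC fallbacks Jon describes.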
G1 will not use the UseParallelGC implementation for a parallel full collection, but will implement one in line with G1's design. Currently the G1 team has been focusing on better policies for achieving pause goals and avoiding full collections. Last I heard, there was at least some work to be done for class unloading (JEP 156) before the parallel full collection. Jon > > Help will be appreciated. > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120820/59be1d20/attachment.html From haim at performize-it.com Mon Aug 20 13:39:47 2012 From: haim at performize-it.com (Haim Yadid) Date: Mon, 20 Aug 2012 22:39:47 +0200 Subject: hotspot-gc-use Digest, Vol 54, Issue 8 In-Reply-To: References: Message-ID: Thanks, Jon. That's a pity, since a CMS full GC is unavoidable, and when it happens we will experience an unacceptable pause. I tried G1 as well; in theory G1GC should not have pauses longer than the soft real-time goal. In practice, however (as you can see from another question I posted), G1 does have long pauses from time to time, and in the application I am tuning it is much worse than CMS. -- Haim Yadid | Performization Expert Performize-IT | t +972-54-7777132 www.performize-it.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120820/9a6f7afe/attachment.html From bernd.eckenfels at googlemail.com Tue Aug 21 13:06:53 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Tue, 21 Aug 2012 22:06:53 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: Y. S. 
Ramakrishna wrote: >> An alternative workaround that might also work >> for you would be -XX:CMSWaitDuration=X > That should have been: -XX:CMSMaxAbortablePrecleanTime=X I have a question on this: can -XX:+CMSScavengeBeforeRemark be combined with -XX:CMSMaxAbortablePrecleanTime=ms? Because in my scenario the scavenge interval is rather large (100-200 s), I don't want to use 400,000 ms for MaxAbortablePrecleanTime. The same is true for the initial mark: when I use CMSWaitDuration, can I specify whether it should trigger a scavenge when the timeout is reached? Or would it be OK to specify 400 s? Greetings Bernd PS: in my case I know I need to resize the generations to get more frequent scavenges; however, it is hard to push that into production on that particular system. From bernd.eckenfels at googlemail.com Tue Aug 21 13:24:41 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Tue, 21 Aug 2012 22:24:41 +0200 Subject: PrintGCDate/TimeStamps Message-ID: Hello, I am wondering about some strange behaviour. Sorry to bother you with this minor observation, but I do hope the datestamp option gets more popular by mentioning it again :) Often +PrintGCTimeStamps is recommended to be able to see the interval between GC events. However, I found some places talking about +PrintGCDateStamps, which is much more convenient for some problems (for example correlating SLA violations with the STW pause times). Some discussions suggest that you can combine both in a way that does not print both timestamps: -XX:+PrintGCDateStamps -XX:-PrintGCTimeStamps However, on the Windows 64-bit JDKs 1.6.0_33 and 1.7.0_03 where I have tried this, it does not work (i.e. 
the logfiles always contain both; in fact the timestamp typically appears twice): 2012-08-21T22:08:29.989+0200: 1.292: [GC 1.292: [ParNew: 5033216K->3826K(5662336K), 0.0015120 secs] 5033216K->3826K(6710912K), 0.0015790 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] Total time for which application threads were stopped: 0.0018006 seconds 2012-08-21T22:08:30.605+0200: 1.907: [GC 1.907: [ParNew: 5037042K->5230K(5662336K), 0.0023563 secs] 5037042K->5230K(6710912K), 0.0024133 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] Total time for which application threads were stopped: 0.0026457 seconds Just some additional information: I was able to turn off the timestamps in the HotSpotDiagnostics MBean: 2012-08-21T22:19:54.858+0200: [GC [ParNew: 5039624K->5836K(5662336K), 0.0030131 secs] 5349974K->316537K(6710912K), 0.0030753 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] Greetings Bernd -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120821/3faaa333/attachment.html From rednaxelafx at gmail.com Tue Aug 21 16:52:46 2012 From: rednaxelafx at gmail.com (Krystal Mok) Date: Wed, 22 Aug 2012 07:52:46 +0800 Subject: PrintGCDate/TimeStamps In-Reply-To: References: Message-ID: Hi Bernd, You're probably using those two flags along with -Xloggc:filename. This flag implies -XX:+PrintGCTimeStamps, which in your case is something you're trying to get rid of. There's an easy workaround to this: put -XX:-PrintGCTimeStamps *after* -Xloggc. The way HotSpot VM's argument processing works, if a VM flag is specified multiple times, then the one that comes last is the one used. This includes flags that are set implicitly. There are different types of VM flags. "manageable" flags are ones that can be changed at runtime, via the HotSpotDiagnostic MBean. PrintGCTimeStamps is a manageable flag. 
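Kris's "last one wins" rule makes the fix a pure matter of ordering. A launch line like the following (the application jar and log path are placeholders) should yield date stamps only, because the explicit minus flag is processed after the -Xloggc option that implicitly enabled PrintGCTimeStamps:

```shell
# MyApp.jar and gc.log are placeholders for your own application and log path.
# -Xloggc implies -XX:+PrintGCTimeStamps, so the disabling flag must come
# AFTER it: in HotSpot's argument processing, the last setting of a flag wins.
java -Xloggc:gc.log \
     -XX:+PrintGCDateStamps \
     -XX:-PrintGCTimeStamps \
     -jar MyApp.jar
```

With the two -XX flags placed before -Xloggc instead, the implicit +PrintGCTimeStamps would be processed last and win, reproducing the doubled timestamps shown in the log excerpt above.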
HTH, Kris On Wed, Aug 22, 2012 at 4:24 AM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Hello, > > I wondering about some strange behaviour. Sorry to bother with this minor > observation, but I do hope the datestamp ioption gets more popular by > mention it again .) > > Often +PrintGCTimeStamps is recommended to be able to see the intervall > between GC events. However I found some places talking about > +PrintGCDateStamps which is much more convenient for some problems (for > example correlating SLA violations with the STW pause times). > > Some discusions suggest that you can combine both in a way that it does > not print both timestamps: > > -XX:+PrintGCDateStamps -XX:-PrintGCTimeStamps > > However on the win 64bit JDKs 1.6.0_33 and 1.7.0_03 I have tried this, it > does not work (i.e. the logfiles always contans both, actually the > timestamp typically 2 times): > > 2012-08-21T22:08:29.989+0200: 1.292: [GC 1.292: [ParNew: > 5033216K->3826K(5662336K), 0.0015120 secs] 5033216K->3826K(6710912K), > 0.0015790 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > Total time for which application threads were stopped: 0.0018006 seconds > 2012-08-21T22:08:30.605+0200: 1.907: [GC 1.907: [ParNew: > 5037042K->5230K(5662336K), 0.0023563 secs] 5037042K->5230K(6710912K), > 0.0024133 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > Total time for which application threads were stopped: 0.0026457 seconds > > Just some additional information: I was able to tuen off the timestamps in > the HotSpotDiagnostics MBean: > > 2012-08-21T22:19:54.858+0200: [GC [ParNew: 5039624K->5836K(5662336K), > 0.0030131 secs] 5349974K->316537K(6710912K), 0.0030753 secs] [Times: > user=0.00 sys=0.00, real=0.00 secs] > > Greetings > Bernd > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was 
scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120822/d50e5666/attachment.html From bernd.eckenfels at googlemail.com Tue Aug 21 19:37:33 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Wed, 22 Aug 2012 04:37:33 +0200 Subject: PrintGCDate/TimeStamps In-Reply-To: References: Message-ID: > You're probably using those two flags along with -Xloggc:filename. This flag > implies -XX:+PrintGCTimeStamps, which in your case is something you're > trying to get rid of. Yes, correct. I noticed it after posting the question because at first I was not able to reproduce it. Thanks for the confirmation, I was expecting something like that (but not necessarily with the Xloggc option :) Greetings Bernd From ysr1729 at gmail.com Tue Aug 21 19:41:20 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 21 Aug 2012 19:41:20 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: Hi Bernd -- I've unfortunately forgotten the full history of this exchange (at least my mailer has, and my own cache is nowadays oversubscribed and prone to evicting objects somewhat too rapidly), so I'll answer only the questions in the email:- On Tue, Aug 21, 2012 at 1:06 PM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Y. S. Ramakrishna wrote: > >> An alternative workaround that might also work > >> for you would be -XX:CMSWaitDuration=X > > That should have been: -XX:CMSMaxAbortablePrecleanTime=X > > I have a question on this, can the -XX:+CMSScavengeBeforeRemark be > combined with -XX:CMSMaxAbortablePrecleanTime=ms? > Yes, they can be combined. > > Because in my scenario the Scavenger Intervall is rather large > (100-200s), so I dont want to use 400.000ms in the > MaxAbortablePrecleanTime. > OK. > > The same is true for the initial-mark: when I use CMSWaitDuration, can > I specify if should trigger a scavenger when timeout is reached?
Or > would it be OK to specify 400s? > Initial mark is typically scheduled immediately after a scavenge, so no timeout specification should be necessary. Perhaps I misunderstood your question and maybe you can elaborate a bit more on what you want to achieve? > > Greetings > Bernd > > PS: in my case I know I need to resize the generations to get more > frequent scavenger runs, however it is hard to push that into > production on that particular system. > Yes, I am somewhat painfully aware of the limitations of tuning around CMS' various shortcomings and am looking forward to G1 being a panacea for those problems :-) -- ramki -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120821/e2200ce5/attachment.html From bernd.eckenfels at googlemail.com Tue Aug 21 21:14:35 2012 From: bernd.eckenfels at googlemail.com (Bernd Eckenfels) Date: Wed, 22 Aug 2012 06:14:35 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: On 22.08.2012 at 04:41, Srinivas Ramakrishna wrote: > Initial mark is typically scheduled immediately after a scavenge, so no > timeout specification should be necessary. Perhaps I misunderstood your > question and maybe you can elaborate a bit more on what you want to > achieve? Well, I have a gclog which contains some STW situations > 1s (which violates my SLA). If I check the GC log file there are some initial-marks and some remarks causing the problem. For the slow initial-marks I see the pattern that the time difference to the preceding scavenger run is large. The initial marks that run sub-second all happen directly after a scavenger run.
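One quick way to hunt for such SLA-violating pauses in a log produced with -XX:+PrintGCApplicationStoppedTime is to filter the stopped-time lines. A sketch, with two sample lines from this thread inlined where you would normally feed the real gclog:

```shell
# print every safepoint pause longer than 1 s; on a line of the form
# "Total time for which application threads were stopped: N seconds"
# the duration is field 9
awk '$1 == "Total" && $9 + 0 > 1.0 { print "SLA violation:", $9, "s" }' <<'EOF'
Total time for which application threads were stopped: 0.0026457 seconds
Total time for which application threads were stopped: 11.1184690 seconds
EOF
# -> SLA violation: 11.1184690 s
```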
So here is a slow sample: 159430.703: [GC 159430.705: [ParNew: 20646923K->582368K(22649280K), 0.4311960 secs] 21710818K->1665223K(47815104K), 0.4343870 secs] [Times: user=1.92 sys=0.02, real=0.43 secs] 159607.370: [GC [1 CMS-initial-mark: 1082855K(25165824K)] 14734770K(47815104K), 11.1184690 secs] [Times: user=11.06 sys=0.03, real=11.12 secs] 159618.490: [CMS-concurrent-mark-start] 159618.930: [CMS-concurrent-mark: 0.440/0.440 secs] [Times: user=4.59 sys=0.16, real=0.44 secs] Difference 176s, 11s STW And here is the next run, which is typically fast: 166807.592: [GC 166807.594: [ParNew: 21200224K->372584K(22649280K), 0.4462060 secs] 22444233K->1629155K(47815104K), 0.4493750 secs] [Times: user=1.43 sys=0.01, real=0.45 secs] 166808.057: [GC [1 CMS-initial-mark: 1256570K(25165824K)] 1629155K(47815104K), 0.3039830 secs] [Times: user=0.31 sys=0.00, real=0.31 secs] Difference 0.4s, 0.3s STW I need to collect the actual JVM parameters, version and gclogfile and will provide them. I am actually waiting for a CMSStatistics=2 version. Greetings Bernd From ysr1729 at gmail.com Wed Aug 22 01:03:25 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 22 Aug 2012 01:03:25 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: Hi Bernd -- Yes, this has been observed (albeit in a different context) by Michal Frajt as well; see his emails from a couple of weeks ago. I am not sure why, with regular CMS, we should have this kind of unpredictable delay from a scavenge to a CMS initial mark. It must be OS/scheduling and load etc., which we cannot control (although a delay of 177 s seems excessive and must mean either that the CMS wait duration was exceeded or something like that). In any case, as you observed, the length of the CMS initial pause is related to the occupancy of the young generation.
Thus, even if it were to occur immediately after a scavenge (when Eden is nearly empty), the use of large (and fully used) survivor spaces can make the pause longer. As we have noted in earlier email, the real solution is to multi-thread the CMS initial mark pause so that the work can be done much faster. An easier, if less pleasant and less efficient, alternative is to implement CMSScavengeBeforeInitialMark, but that alone would not address the large fully-used survivor space problem I mentioned above, only the issue with scheduling the initial mark. (In that case the pause time for the scavenge would be additive, although, because the scavenge is parallel, it would likely be much faster even for a large Eden.) I'd be curious to know if you get to the bottom of the cause for the long delay between scavenge and initial mark pause. regards. -- ramki On Tue, Aug 21, 2012 at 9:14 PM, Bernd Eckenfels < bernd.eckenfels at googlemail.com> wrote: > Am 22.08.2012, 04:41 Uhr, schrieb Srinivas Ramakrishna >: > > Initial mark is typically scheduled immediately after a scavenge, so no > > timeout specification should be necessary. Perhaps I misunderstood yr > > question and may be you can elaborate a bit more on what you want to > > achieve? > > Well, I have a gclog which contains some STW situations > 1s (which > violates my SLA). If I check the GCLog file there are some initial-marks > and some remarks causing the problem. For the slow initial-marks I see the > pattern that the time difference to the preceeding scavenger run is large. > For the initial marks which run sub second, they happen all directly after > a scavenger run.
> > So here is a slow samples: > > 159430.703: [GC 159430.705: [ParNew: 20646923K->582368K(22649280K), > 0.4311960 secs] > 21710818K->1665223K(47815104K), 0.4343870 secs] [Times: > user=1.92 sys=0.02, real=0.43 secs] > 159607.370: [GC [1 CMS-initial-mark: 1082855K(25165824K)] > 14734770K(47815104K), 11.1184690 secs] > [Times: user=11.06 sys=0.03, real=11.12 secs] > 159618.490: [CMS-concurrent-mark-start] > 159618.930: [CMS-concurrent-mark: 0.440/0.440 secs] [Times: user=4.59 > sys=0.16, real=0.44 secs] > > Difference 176s, 11s STW > > And here is the next run, which is typically fast: > > 166807.592: [GC 166807.594: [ParNew: 21200224K->372584K(22649280K), > 0.4462060 secs] > 22444233K->1629155K(47815104K), 0.4493750 secs] [Times: > user=1.43 sys=0.01, real=0.45 secs] > 166808.057: [GC [1 CMS-initial-mark: 1256570K(25165824K)] > 1629155K(47815104K), 0.3039830 secs] > [Times: user=0.31 sys=0.00, real=0.31 secs] > > Difference 0.4s, 0.3s STW > > I need to collect the actual jvm parameters, version and gclogfile and > will provide it. I am actually waiting for a CMSStatistics=2 version. > > > Greetings > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120822/1d608bad/attachment-0001.html From dhd at exnet.com Wed Aug 22 02:31:04 2012 From: dhd at exnet.com (Damon Hart-Davis) Date: Wed, 22 Aug 2012 10:31:04 +0100 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: Message-ID: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> Hi, Could it be paging when not all of the JVM heap is (able to be) in physical memory at the same time? 
Rgds Damon On 22 Aug 2012, at 09:03, Srinivas Ramakrishna wrote: > I am not sure why, with regular CMS, we should have this kind of upredictable delay from a scavenge to a CMS initial mark. > It must be OS/scheduling and load etc. which we cannot control (although a delay of 177 s seems excessive and must either mean > that the CMS wait duration was exceeded or something like that. From Michal.Frajt at partner.commerzbank.com Wed Aug 22 04:40:16 2012 From: Michal.Frajt at partner.commerzbank.com (Frajt, Michal) Date: Wed, 22 Aug 2012 13:40:16 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> References: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> Message-ID: <1DDCA93502632C4DA22E9EE199CE907C5E9F24AD@SE002593.cs.commerzbank.com> Hi Damon, It is not paging. The unpredictable delay from a scavenge to a CMS initial mark is the state of the current implementation. The CMSWaitDuration does not work correctly. Please find the details in the "CMSWaitDuration unstable behavior" post on the hotspot-gc-dev mailing list. We are facing the same issue with extremely long initial mark pauses. Simply waiting properly for a scavenge (in our customized OpenJDK 6) has reduced every single initial-mark pause from 1200ms to just 20ms. Regards, Michal -----Original Message----- From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Damon Hart-Davis Sent: Wednesday, 22 August 2012 11:31 To: Srinivas Ramakrishna; Bernd Eckenfels Cc: hotspot-gc-use at openjdk.java.net Subject: Re: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? Hi, Could it be paging when not all of the JVM heap is (able to be) in physical memory at the same time?
Rgds Damon On 22 Aug 2012, at 09:03, Srinivas Ramakrishna wrote: > I am not sure why, with regular CMS, we should have this kind of upredictable delay from a scavenge to a CMS initial mark. > It must be OS/scheduling and load etc. which we cannot control (although a delay of 177 s seems excessive and must either mean > that the CMS wait duration was exceeded or something like that. _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From ysr1729 at gmail.com Wed Aug 22 09:23:55 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 22 Aug 2012 09:23:55 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <1DDCA93502632C4DA22E9EE199CE907C5E9F24AD@SE002593.cs.commerzbank.com> References: <539E4C90-D05E-49CF-92B5-0FF8D4AA10D0@exnet.com> <1DDCA93502632C4DA22E9EE199CE907C5E9F24AD@SE002593.cs.commerzbank.com> Message-ID: Ah, I see. I'll go back and review your email on this subject earlier. Sorry, got pulled off for some other stuff and missed the follow-ups. thanks. -- ramki On Wed, Aug 22, 2012 at 4:40 AM, Frajt, Michal < Michal.Frajt at partner.commerzbank.com> wrote: > Hi Damon, > > It is not paging. The unpredictable delay from a scavenge to a CMS initial > mark is the state of the current implementation. The CMSWaitDuration does > not work correctly. Please find the details in the "CMSWaitDuration > unstable behavior" post on hotspot-gc-dev mailing. > > We are facing same issue with extremely long initial mark pauses. Just > proper waiting for a scavenge (customized OpenJDK 6) has reduced every > single initial-mark pause from 1200ms to 20ms only. > > Regards, > Michal > > > -----Original Message----- > From: hotspot-gc-use-bounces at openjdk.java.net [mailto: > hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Damon Hart-Davis > Sent: Mittwoch, 22. 
August 2012 11:31 > To: Srinivas Ramakrishna; Bernd Eckenfels > Cc: hotspot-gc-use at openjdk.java.net > Subject: Re: Why abortable-preclean phase is not being aborted after YG > occupancy exceeds 50%? > > Hi, > > Could it be paging when not all of the JVM heap is (able to be) in > physical memory at the same time? > > Rgds > > Damon > > > On 22 Aug 2012, at 09:03, Srinivas Ramakrishna wrote: > > > I am not sure why, with regular CMS, we should have this kind of > upredictable delay from a scavenge to a CMS initial mark. > > It must be OS/scheduling and load etc. which we cannot control (although > a delay of 177 s seems excessive and must either mean > > that the CMS wait duration was exceeded or something like that. > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120822/b519d1e8/attachment.html From michal at frajt.eu Wed Aug 29 07:43:28 2012 From: michal at frajt.eu (Michal Frajt) Date: Wed, 29 Aug 2012 16:43:28 +0200 Subject: CMSScavengeBeforeRemark confuses CMS-remark time Message-ID: Hi, We have fixed the bug in the CMSWaitDuration handling (find more in the 'CMSWaitDuration unstable behavior' hotspot-gc-dev post; fix done in our customized OpenJDK). The CMS-initial-mark phase now always starts right after a scavenge, which makes it run 20-50 times faster. Currently we are focused on minimizing the cost of the second STW remark phase. There we have the option CMSScavengeBeforeRemark to invoke a scavenge right before the remarking. All works as expected, but the results reported in the CMS GC logs are a bit confusing.
The CMSScavengeBeforeRemark option forces a scavenge invocation from the CMS-remark phase (from within the VM thread, as the CMS-remark operation is executed in the foreground collector). The generation collector reports its time into the same GC log file. In our case the ParNew collector reports the line including the STW time (0.0266193 seconds here). 2012-08-29T07:27:02.613+0200: 9.388: [GC 9.388: [ParNew Desired survivor size 9568256 bytes, new threshold 8 (max 8) - age 1: 4626512 bytes, 4626512 total : 65631K->14657K(112384K), 0.0264694 secs] 108213K->72304K(10467072K), 0.0266193 secs] [Times: user=0.17 sys=0.01, real=0.03 secs] The total CMS-remark time (STW) is usually understood as the number of seconds reported by the CMS-remark line (0.0401657 seconds here). 9.415: [Rescan (parallel) (Survivor:10chunks) .... 0.0064098 secs]9.421: [weak refs processing, 0.0000320 secs]9.421: [class unloading, 0.0020331 secs]9.423: [scrub symbol & string tables, 0.0042518 secs] [1 CMS-remark: 57647K(10354688K)] 72304K(10467072K), 0.0401657 secs] [Times: user=0.23 sys=0.01, real=0.04 secs] But when ParNew is invoked explicitly by the CMSScavengeBeforeRemark option, the time reported for the CMS-remark phase already includes the time of the generation collector. The common interpretation is that there was one ParNew STW pause (0.0266193 sec) and one CMS-remark STW pause (0.0401657 secs). In fact the total STW time is not ParNew plus CMS-remark, but just the CMS-remark time. In our example we spent 27ms in ParNew and 14ms in the CMS-remark proper; the total STW time was 40ms, not 67ms as it is many times wrongly interpreted. The reporting of the CMS-remark phase, when used together with the CMSScavengeBeforeRemark option, does not allow easy interpretation of the application STW times. None of the CMS log file analyzer tools is able to correctly interpret the time reported for the CMS-remark phase.
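This arithmetic can be checked directly from the two log lines above: the CMS-remark figure already contains the ParNew time, so the remark-only portion is the difference. A sketch, using awk for the floating-point subtraction:

```shell
# total safepoint = 0.0401657 s (CMS-remark line, which includes the scavenge)
# scavenge        = 0.0266193 s (ParNew line)
# remark-only     = total - scavenge
awk 'BEGIN { printf "remark-only STW: %.7f s\n", 0.0401657 - 0.0266193 }'
# -> remark-only STW: 0.0135464 s
```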
As far as we know, none of the existing interpretations of the CMS logging output mentions this issue. In the same way that the fixed CMSWaitDuration delivered 20-50 times faster initial marking to us, the existing and working CMSScavengeBeforeRemark option can deliver similar results when correctly interpreted from the GC log files. Regards, Michal -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120829/13a78f67/attachment.html From rozdev29 at gmail.com Wed Aug 29 10:18:58 2012 From: rozdev29 at gmail.com (Rozdev29) Date: Wed, 29 Aug 2012 10:18:58 -0700 Subject: CMSScavengeBeforeRemark confuses CMS-remark time In-Reply-To: References: Message-ID: <5253D329-E8D3-442F-BFF3-54BDCD44ADE3@gmail.com> Hi there, Is this bug fixed for Java 6 or 7? Which version will have this fix? Thanks Saroj On Aug 29, 2012, at 7:43 AM, "Michal Frajt" wrote: > Hi, > > We have fixed the bug in the CMSWaitDuration handling (find more in the 'CMSWaitDuration unstable behavior' hotspot-gc-dev post, fixed done in our customized OpenJDK). The CMS-initial-mark phase is now always starting right after the scavenge which makes it running 20-50 faster. Currently we are focused on minimizing the cost of the second STW remark phase. There we have the option CMSScavengeBeforeRemark to invoke the scavenge right before the remarking. All works as expected but the results reported in CMS GC logs are a bit confusing. > > The CMSScavengeBeforeRemark forces scavenge invocation from the CMS-remark phase (from within the VM thread as the CMS-remark operation is executed in the foreground collector). The generation collector reports its time into the same GC log file. In our case the ParNew collector reports the line including the STW time (0.0266193 seconds here).
> > 2012-08-29T07:27:02.613+0200: 9.388: [GC 9.388: [ParNew Desired survivor size 9568256 bytes, new threshold 8 (max 8) - age 1: 4626512 bytes, 4626512 total : 65631K->14657K(112384K), 0.0264694 secs] 108213K->72304K(10467072K), 0.0266193 secs] [Times: user=0.17 sys=0.01, real=0.03 secs] > > > The total CMS-remark time (STW) is usually understood as the number of seconds reported by the CMS-remark line (0.401657 seconds here). > > 9.415: [Rescan (parallel) (Survivor:10chunks) .... 0.0064098 secs]9.421: [weak refs processing, 0.0000320 secs]9.421: [class unloading, 0.0020331 secs]9.423: [scrub symbol & string tables, 0.0042518 secs] [1 CMS-remark: 57647K(10354688K)] 72304K(10467072K), 0.0401657 secs] [Times: user=0.23 sys=0.01, real=0.04 secs] > > > > > But, in the case the ParNew got invoked explicitly by the CMSScavengeBeforeRemark option, the time reported by the CMS-remark phase does include the time of the generation collector. The common interpretation is that there was one ParNew STW (0.0266193 sec) and the CMS-remark STW (0.0401657 secs) phase. Fortunately the total STW time is not ParNew plus CMS-remark time but only the CMS-remark time in total. In our example we spent 27ms in the ParNew and 14ms in the CMS-remark. The total STW time was 40ms and not 67ms as many times wrongly interpreted. > > The reporting of the CMS-remark phase, when used together with the CMSScavengeBeforeRemark option, does not allow easy interpretation of the application STW times. None of the CMS log file analyzer tools is able to correctly interpret the time reported in the CMS-remark phase. As far as we know none of existing interpretations of the CMS logging outputs does mention this issue. Same way the fixed CMSWaitDuration was able to deliver 20-50 faster initial marking to us, the existing and working CMSScavengeBeforeRemark option is able to deliver similar results when correctly interpreted from the GC log files. 
> > Regards, > Michal > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120829/486e4496/attachment.html From Bond.Chen at lombardrisk.com Wed Aug 29 19:40:31 2012 From: Bond.Chen at lombardrisk.com (Bond Chen) Date: Thu, 30 Aug 2012 03:40:31 +0100 Subject: [HTML]my CMS incremental duty cycle can't be controlled by GC parameter settings Message-ID: <503F4320.9AAE.00F7.0@lombardrisk.com> Dear All & Sri, Our application has encountered very long GC pauses. From the GC analysis I found that CMS takes 20-30 minutes to finish, given the value icms_dc=1230 seconds, so the solution is to reduce this value to let CMS finish ASAP. After reading a doc on the Oracle website about CMS incremental mode, I wanted to run a concept demonstration test, but the result confuses me.
1)I have the set 3 CMS incremental parameters: -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=3 My expectation is: the icms_dc value in the gc log should be in range 0-3 Actual results: the icms_dc value are out of the range 2)All vm options: VM arguments: -Dprogram.name=run.sh -Xms6072M -Xmx6072M -XX:PermSize=512m -XX:MaxPermSize=512m -Xss1024k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:+CMSIncrementalMode -XX:ParallelGCThreads=6 -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=3 -XX:MaxTenuringThreshold=32 -XX:+PrintTenuringDistribution -XX:CMSInitiatingOccupancyFraction=50 -Xmn1700m -XX:+UseLargePages -XX:LargePageSizeInBytes=64k -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -Xloggc:./gc_10_20120902171712.log -Dsun.rmi.dgc.server.gcInterval=18000000 -Dsun.rmi.dgc.client.gcInterval=18000000 -verbose:gc -Djava.library.path=/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/lib/valuation-lib:/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/firmament -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=1234 -Dcom.sun.management.jmxremote.authenticate=false -XX:+ExplicitGCInvokesConcurrent -XX:+PrintTenuringDistribution -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=2 -Djboss.cluster.number=4 -Djboss.cluster.monitor.switch=y -Djboss.messaging.groupname=MessagingPostOffice -Djboss.partition.name=UATPartition_RH -Djboss.messaging.serverpeerid=10 -Djboss.web.lb.nodeid=nodee -Djboss.home.dir=/export/home/test/server/colline/cluster/jboss -Djboss.profile.name=2011.2.0.0.3 -Djboss.cluster.node1.addr=172.20.30.8 -Djboss.cluster.node2.addr=172.20.30.11 
-Djboss.cluster.node3.addr=172.20.30.16 -Djboss.cluster.node4.addr=172.20.30.10 -Djboss.cluster.port_hacluster=7800 -Djboss.cluster.port_jbmdata=7900 -Djboss.cluster.port_jbmcontrol=7910 -Djboss.cluster.port_invalidationcache=7920 -Djboss.cluster.port_replicationcache=7930 -Djboss.messaging.consumer_count=3 -Djboss.jndi.port=1099 -Djboss.hajndi.port=1100 -Dcollateral_config=/export/home/test/server/colline/cluster/bin/collateral.properties -DIGNORE_FQN=/marketdata/FxRates,/marketdata/EODFxRates -Ddatasource.min.pool.size=5 -Ddatasource.max.pool.size=150 -Djboss.partition.udpGroup=230.1.0.4 -Dcom.sun.management.jmxremote.port=5004 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl -Djboss.platform.mbeanserver -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed 3)GC logs: -bash-3.00$ cat gc_10_20120902004701.log |grep icms icms_dc=0 , 0.1387250 secs] [Times: user=0.35 sys=0.20, real=0.14 secs] icms_dc=0 , 0.3590387 secs] [Times: user=0.90 sys=0.43, real=0.36 secs] icms_dc=15 , 0.5206302 secs] [Times: user=1.31 sys=0.46, real=0.52 secs] icms_dc=30 , 0.2350382 secs] [Times: user=0.76 sys=0.01, real=0.24 secs] icms_dc=45 , 0.4883827 secs] [Times: user=1.38 sys=0.37, real=0.49 secs] icms_dc=41 , 0.7013340 secs] [Times: user=0.92 sys=0.05, real=0.70 secs] 4)JVM version: -bash-3.00$ java -version java version "1.6.0_21" Java(TM) SE Runtime Environment (build 1.6.0_21-b06) Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode) -bash-3.00$ ./launch_bondGCParameter.sh Regards, Bond This e-mail together with any attachments (the "Message") is confidential and may contain privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this Message from your system. 
Any unauthorized copying, disclosure, distribution or use of this Message is strictly forbidden. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120830/0f71d5cb/attachment.html From ysr1729 at gmail.com Thu Aug 30 01:03:57 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 30 Aug 2012 01:03:57 -0700 Subject: [HTML]my CMS incremental duty cycle can't be controlled by GC parameter settings In-Reply-To: <503F4320.9AAE.00F7.0@lombardrisk.com> References: <503F4320.9AAE.00F7.0@lombardrisk.com> Message-ID: Bond -- On Wed, Aug 29, 2012 at 7:40 PM, Bond Chen wrote: > Dear All & Sri, > Our application have encountered very long GC pause, from the GC analysis, > I found the CMS takes 20-30 minutes to get finished by the value of > icms_dc=1230 seconds, so the solution is to reduce this value, to let CMS > finished ASAP, by reading a doc on oracle website about the CMS > incremental mode, I want to have a concept demonstration test, but the > result confusing me. > Not really. If you want the CMS cycle to finish as soon as possible you should increase the duty cycle, not decrease it. (The idea is that the duty cycle defines the percentage of "concurrent time" that the ICMS thread will be eligible to run.) -XX:CMSIncrementalDutyCycle=50 will, for example, let it run 50% of the time between two scavenges. Then you have to turn off the automatic duty-cycle control that ICMS does, if you want to maintain that duty-cycle value whenever ICMS runs:- -XX:-CMSIncrementalPacing I believe the min value sets a lower bound on the duty cycle when incremental pacing is on. It just sets a floor under which the duty cycle will never go. Yes, I know, it's kind of asymmetric, and I can't recall the thinking behind that, but it should be possible, I guess, to bound the cycle between two values if you really wanted to make that modification. Ah, now I remember...
the idea is that concurrent mode failure is bad, so escalating the duty cycle to as much as 100% should be permitted, rather than bounding it from above, losing the race, and causing a concurrent mode failure. Here's the set of relevant options:- $ java -XX:+PrintFlagsFinal -version | grep CMSIncremental uintx CMSIncrementalDutyCycle = 10 {product} uintx CMSIncrementalDutyCycleMin = 0 {product} bool CMSIncrementalMode = false {product} uintx CMSIncrementalOffset = 0 {product} bool CMSIncrementalPacing = true {product} uintx CMSIncrementalSafetyFactor = 10 {product} java version "1.7.0_05" Java(TM) SE Runtime Environment (build 1.7.0_05-b05) Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode) -- ramki > > *1)I have the set 3 CMS incremental parameters:* > -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0 > -XX:CMSIncrementalDutyCycle=3 > > My expectation is: > the icms_dc value in the gc log should be in range 0-3 > > Actual results: > the icms_dc value are out of the range > > > > > *2)All vm options:* > VM arguments: -Dprogram.name=run.sh -Xms6072M -Xmx6072M -XX:PermSize=512m > -XX:MaxPermSize=512m -Xss1024k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:+CMSIncrementalMode > -XX:ParallelGCThreads=6 -XX:CMSIncrementalDutyCycleMin=0 > -XX:CMSIncrementalDutyCycle=3 -XX:MaxTenuringThreshold=32 > -XX:+PrintTenuringDistribution -XX:CMSInitiatingOccupancyFraction=50 > -Xmn1700m -XX:+UseLargePages -XX:LargePageSizeInBytes=64k > -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails > -XX:+PrintGCApplicationStoppedTime -Xloggc:./gc_10_20120902171712.log > -Dsun.rmi.dgc.server.gcInterval=18000000 > -Dsun.rmi.dgc.client.gcInterval=18000000 -verbose:gc > -Djava.library.path=/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/lib/valuation-lib:/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/firmament > -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed >
-Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl > -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=1234 > -Dcom.sun.management.jmxremote.authenticate=false > -XX:+ExplicitGCInvokesConcurrent -XX:+PrintTenuringDistribution > -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=2 > -Djboss.cluster.number=4 -Djboss.cluster.monitor.switch=y > -Djboss.messaging.groupname=MessagingPostOffice -Djboss.partition.name=UATPartition_RH > -Djboss.messaging.serverpeerid=10 -Djboss.web.lb.nodeid=nodee > -Djboss.home.dir=/export/home/test/server/colline/cluster/jboss - > Djboss.profile.name=2011.2.0.0.3 -Djboss.cluster.node1.addr=172.20.30.8 > -Djboss.cluster.node2.addr=172.20.30.11 > -Djboss.cluster.node3.addr=172.20.30.16 > -Djboss.cluster.node4.addr=172.20.30.10 -Djboss.cluster.port_hacluster=7800 > -Djboss.cluster.port_jbmdata=7900 -Djboss.cluster.port_jbmcontrol=7910 > -Djboss.cluster.port_invalidationcache=7920 > -Djboss.cluster.port_replicationcache=7930 > -Djboss.messaging.consumer_count=3 -Djboss.jndi.port=1099 > -Djboss.hajndi.port=1100 > -Dcollateral_config=/export/home/test/server/colline/cluster/bin/collateral.properties > -DIGNORE_FQN=/marketdata/FxRates,/marketdata/EODFxRates > -Ddatasource.min.pool.size=5 -Ddatasource.max.pool.size=150 > -Djboss.partition.udpGroup=230.1.0.4 > -Dcom.sun.management.jmxremote.port=5004 > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.authenticate=false > -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl > -Djboss.platform.mbeanserver > -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed > > > > > *3)GC logs:* > -bash-3.00$ cat gc_10_20120902004701.log |grep icms > icms_dc=0 , 0.1387250 secs] [Times: user=0.35 sys=0.20, real=0.14 secs] > icms_dc=0 , 0.3590387 secs] [Times: user=0.90 sys=0.43, real=0.36 secs] > icms_dc=15 , 
0.5206302 secs] [Times: user=1.31 sys=0.46, real=0.52 secs]
> icms_dc=30 , 0.2350382 secs] [Times: user=0.76 sys=0.01, real=0.24 secs]
> icms_dc=45 , 0.4883827 secs] [Times: user=1.38 sys=0.37, real=0.49 secs]
> icms_dc=41 , 0.7013340 secs] [Times: user=0.92 sys=0.05, real=0.70 secs]
>
>
> *4) JVM version:*
> -bash-3.00$ java -version
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode)
> -bash-3.00$ ./launch_bondGCParameter.sh
>
> Regards,
> Bond
>
> This e-mail together with any attachments (the "Message") is confidential and may contain privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this Message from your system. Any unauthorized copying, disclosure, distribution or use of this Message is strictly forbidden.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120830/d558cfeb/attachment-0001.html

From Bond.Chen at lombardrisk.com Fri Aug 31 20:50:39 2012
From: Bond.Chen at lombardrisk.com (Bond Chen)
Date: Sat, 01 Sep 2012 04:50:39 +0100
Subject: Reply: Re: [HTML] my CMS incremental duty cycle can't be controlled by GC parameter settings
Message-ID: <5041F690020000F700011A5E@lde-email-smtp-ext.londoneast.lombardrisk.com>

Hi Sri,

Thanks for your help; now I know where the issue is. By default CMSIncrementalPacing is true, so the JVM automatically adjusts the duty cycle, which explains why I still see high icms_dc values even when I set the duty cycle to 3.

BTW, an official doc on the Oracle website says the default is false.

Thanks again for this.
Bond

>>> Srinivas Ramakrishna 2012-08-30
16:04 >>>

Bond --

On Wed, Aug 29, 2012 at 7:40 PM, Bond Chen wrote:
> Dear All & Sri,
> Our application has encountered very long GC pauses. From the GC analysis,
> I found that CMS takes 20-30 minutes to finish, indicated by the value of
> icms_dc=1230 seconds, so the solution is to reduce this value to let CMS
> finish ASAP. After reading a doc on the Oracle website about the CMS
> incremental mode, I wanted to run a concept-demonstration test, but the
> result confuses me.
>
Not really. If you want the CMS cycle to finish as soon as possible you should increase the duty cycle, not decrease it. (The idea is that the duty cycle defines the percentage of "concurrent time" that the ICMS thread will be eligible to run.) -XX:CMSIncrementalDutyCycle=50 will, for example, let it run 50% of the time between two scavenges.

Then, if you want to maintain that duty-cycle value whenever ICMS runs, you have to turn off the automatic duty-cycle control that ICMS does:

-XX:-CMSIncrementalPacing

I believe the min value sets a lower bound on the duty cycle when incremental pacing is on. It just sets a floor under which the duty cycle will never go. Yes, I know, it's kind of asymmetric, and I can't recall the thinking behind that, but it should be possible, I guess, to bound the cycle between two values if you really wanted to make that modification. Ah, now I remember... the idea is that concurrent mode failure is bad, so escalating the duty cycle to as much as 100% should be permitted, rather than bounding it from above, losing the race, and causing a concurrent mode failure.
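The escalation behaviour described above can be illustrated with a toy calculation. This is only a sketch of the idea, not the actual HotSpot pacing code: the class and method names and the "work unit" parameters are invented for illustration. The point it shows is that, with pacing on, the pacer picks whatever duty cycle is needed to finish the remaining marking work before the next scavenge, clamped to at most 100% and never below CMSIncrementalDutyCycleMin, which is why icms_dc can exceed the configured -XX:CMSIncrementalDutyCycle.

```java
// Toy model of ICMS duty-cycle escalation (hypothetical; not HotSpot code).
// The pacer would rather run at up to 100% duty cycle than lose the race
// to the next scavenge and suffer a concurrent mode failure.
public class IcmsPacingSketch {

    /**
     * @param workRemaining  estimated marking work left (arbitrary units)
     * @param workPerPercent work retired per 1% of duty cycle before the next scavenge
     * @param dutyCycleMin   floor, as with -XX:CMSIncrementalDutyCycleMin
     * @return duty cycle (0-100) the pacer would choose
     */
    static int estimateDutyCycle(double workRemaining, double workPerPercent, int dutyCycleMin) {
        // Duty cycle needed to finish before the next scavenge.
        int needed = (int) Math.ceil(workRemaining / workPerPercent);
        // Escalate as far as 100%, but never drop below the configured floor.
        return Math.max(dutyCycleMin, Math.min(100, needed));
    }

    public static void main(String[] args) {
        System.out.println(estimateDutyCycle(5.0, 1.0, 0));    // plenty of headroom: 5
        System.out.println(estimateDutyCycle(500.0, 1.0, 0));  // about to lose the race: 100
        System.out.println(estimateDutyCycle(0.5, 1.0, 10));   // floor applies: 10
    }
}
```

With -XX:-CMSIncrementalPacing the collector skips this kind of estimation entirely and sticks to the configured CMSIncrementalDutyCycle, which is what the advice above amounts to.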
Here's the set of relevant options:-

$ java -XX:+PrintFlagsFinal -version | grep CMSIncremental
     uintx CMSIncrementalDutyCycle             = 10        {product}
     uintx CMSIncrementalDutyCycleMin          = 0         {product}
      bool CMSIncrementalMode                  = false     {product}
     uintx CMSIncrementalOffset                = 0         {product}
      bool CMSIncrementalPacing                = true      {product}
     uintx CMSIncrementalSafetyFactor          = 10        {product}
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode)

-- ramki

>
> *1) I have set these 3 CMS incremental parameters:*
> -XX:+CMSIncrementalMode -XX:CMSIncrementalDutyCycleMin=0
> -XX:CMSIncrementalDutyCycle=3
>
> My expectation is:
> the icms_dc values in the gc log should be in the range 0-3
>
> Actual results:
> the icms_dc values are out of that range
>
> *2) All VM options:*
> VM arguments: -Dprogram.name=run.sh -Xms6072M -Xmx6072M -XX:PermSize=512m
> -XX:MaxPermSize=512m -Xss1024k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:+UseTLAB -XX:+CMSIncrementalMode
> -XX:ParallelGCThreads=6 -XX:CMSIncrementalDutyCycleMin=0
> -XX:CMSIncrementalDutyCycle=3 -XX:MaxTenuringThreshold=32
> -XX:+PrintTenuringDistribution -XX:CMSInitiatingOccupancyFraction=50
> -Xmn1700m -XX:+UseLargePages -XX:LargePageSizeInBytes=64k
> -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails
> -XX:+PrintGCApplicationStoppedTime -Xloggc:./gc_10_20120902171712.log
> -Dsun.rmi.dgc.server.gcInterval=18000000
> -Dsun.rmi.dgc.client.gcInterval=18000000 -verbose:gc
> -Djava.library.path=/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/lib/valuation-lib:/export/home/test/server/colline/cluster/jboss/server/2011.2.0.0.3/firmament
> -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed
> -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
> -Djboss.platform.mbeanserver -Dcom.sun.management.jmxremote.port=1234
> -Dcom.sun.management.jmxremote.authenticate=false
> -XX:+ExplicitGCInvokesConcurrent -XX:+PrintTenuringDistribution
> -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
> -XX:+PrintPromotionFailure -XX:PrintFLSStatistics=2
> -Djboss.cluster.number=4 -Djboss.cluster.monitor.switch=y
> -Djboss.messaging.groupname=MessagingPostOffice -Djboss.partition.name=UATPartition_RH
> -Djboss.messaging.serverpeerid=10 -Djboss.web.lb.nodeid=nodee
> -Djboss.home.dir=/export/home/test/server/colline/cluster/jboss
> -Djboss.profile.name=2011.2.0.0.3 -Djboss.cluster.node1.addr=172.20.30.8
> -Djboss.cluster.node2.addr=172.20.30.11
> -Djboss.cluster.node3.addr=172.20.30.16
> -Djboss.cluster.node4.addr=172.20.30.10 -Djboss.cluster.port_hacluster=7800
> -Djboss.cluster.port_jbmdata=7900 -Djboss.cluster.port_jbmcontrol=7910
> -Djboss.cluster.port_invalidationcache=7920
> -Djboss.cluster.port_replicationcache=7930
> -Djboss.messaging.consumer_count=3 -Djboss.jndi.port=1099
> -Djboss.hajndi.port=1100
> -Dcollateral_config=/export/home/test/server/colline/cluster/bin/collateral.properties
> -DIGNORE_FQN=/marketdata/FxRates,/marketdata/EODFxRates
> -Ddatasource.min.pool.size=5 -Ddatasource.max.pool.size=150
> -Djboss.partition.udpGroup=230.1.0.4
> -Dcom.sun.management.jmxremote.port=5004
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Djavax.management.builder.initial=org.jboss.system.server.jmx.MBeanServerBuilderImpl
> -Djboss.platform.mbeanserver
> -Djava.endorsed.dirs=/export/home/test/server/colline/cluster/jboss/lib/endorsed
>
>
> *3) GC logs:*
> -bash-3.00$ cat gc_10_20120902004701.log | grep icms
> icms_dc=0 , 0.1387250 secs] [Times: user=0.35 sys=0.20, real=0.14 secs]
> icms_dc=0 , 0.3590387 secs] [Times: user=0.90 sys=0.43, real=0.36 secs]
> icms_dc=15 , 0.5206302 secs] [Times: user=1.31 sys=0.46, real=0.52 secs]
> icms_dc=30 , 0.2350382 secs] [Times: user=0.76 sys=0.01, real=0.24 secs]
> icms_dc=45 , 0.4883827 secs] [Times: user=1.38 sys=0.37, real=0.49 secs]
> icms_dc=41 , 0.7013340 secs] [Times: user=0.92 sys=0.05, real=0.70 secs]
>
>
> *4) JVM version:*
> -bash-3.00$ java -version
> java version "1.6.0_21"
> Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
> Java HotSpot(TM) Server VM (build 17.0-b16, mixed mode)
> -bash-3.00$ ./launch_bondGCParameter.sh
>
> Regards,
> Bond