From bengt.rutisson at oracle.com Mon May 2 00:00:50 2011
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 02 May 2011 09:00:50 +0200
Subject: CMS option for Java7 ignored (was Re: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?)
In-Reply-To: <4DBB183A.6090608@oracle.com>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com> <4DBA675F.70801@xs4all.nl> <4DBB183A.6090608@oracle.com>
Message-ID: <4DBE56A2.2000207@oracle.com>

John,

>> .... I don't know how to activate CMS for Java 7, it ignores the option that I'd use
>> for Java 6.
> ...
>> Java6: -ea -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
> ...

What do you mean when you say "ignores"? Does the VM start and just ignore your option, or does the VM complain that it is an unknown option?

I am asking since the error message that you get if you use the command line that you posted is somewhat confusing. Maybe this is just a typo in the email, but your command line is missing the plus sign in front of UseConcMarkSweepGC. If you run the command line from your email you get this error message:

Unrecognized VM option 'UseConcMarkSweepGC'
Could not create the Java virtual machine.

Which, I admit, is confusing. The VM does recognize the option; what is missing is just the value for the option. If you use this command line instead you get CMS and no error message:

-Xms256M -Xmx256M -XX:+UseConcMarkSweepGC -verbose:gc

But then again, maybe this was not what you were asking...

Bengt

>>>>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC
>
> Are you saying that if you do:
>
> -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
>
> you do not get CMS? If not, what does the GC log say?
> Can you provide the following details:
>
> % java -version
>
> and also
>
> % jinfo <pid>
>
> as well as:
>
> % jinfo -flag UseConcMarkSweepGC <pid>
>
> where <pid> is your JVM process.
>
> The main issue will be dealt with in the original thread.
> Sorry for the digression.
>
> -- ramki
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From hjohn at xs4all.nl Mon May 2 01:31:50 2011
From: hjohn at xs4all.nl (John Hendrikx)
Date: Mon, 02 May 2011 10:31:50 +0200
Subject: CMS option for Java7 ignored (was Re: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?)
In-Reply-To: <4DBB183A.6090608@oracle.com>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com> <4DBA675F.70801@xs4all.nl> <4DBB183A.6090608@oracle.com>
Message-ID: <4DBE6BF6.70702@xs4all.nl>

Y. Srinivas Ramakrishna wrote:
> Are you saying that if you do:
>
> -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
>
> you do not get CMS? If not, what does the GC log say?
Sorry, I used the wrong command line there. As Bengt pointed out, the plus sign was missing and I assumed that the option had a new name.

--John

From bengt.rutisson at oracle.com Mon May 2 01:41:41 2011
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 02 May 2011 10:41:41 +0200
Subject: CMS option for Java7 ignored (was Re: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?)
In-Reply-To: <4DBE56A2.2000207@oracle.com>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com> <4DBA675F.70801@xs4all.nl> <4DBB183A.6090608@oracle.com> <4DBE56A2.2000207@oracle.com>
Message-ID: <4DBE6E45.70801@oracle.com>

John,

For some reason I didn't manage to reply to your email address - only to the hotspot-gc-use alias. Re-sending this to make sure you get it, even though I assume you are on the hotspot-gc-use alias.

Bengt

On 2011-05-02 09:00, Bengt Rutisson wrote:
> John,
>
>>> .... I don't know how to activate CMS for Java 7, it ignores the option that I'd use
>>> for Java 6.
>> ...
>>> Java6: -ea -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
>> ...
> What do you mean when you say "ignores"? Does the VM start and just
> ignore your option, or does the VM complain that it is an unknown option?
>
> I am asking since the error message that you get if you use the command
> line that you posted is somewhat confusing. Maybe this is just a typo in
> the email, but your command line is missing the plus sign in front of
> UseConcMarkSweepGC. If you run the command line from your email you get
> this error message:
>
> Unrecognized VM option 'UseConcMarkSweepGC'
> Could not create the Java virtual machine.
>
> Which, I admit, is confusing. The VM does recognize the option; what is
> missing is just the value for the option. If you use this command line
> instead you get CMS and no error message:
>
> -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC -verbose:gc
>
> But then again, maybe this was not what you were asking...
>
> Bengt
>
>
>>>>>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC
>> Are you saying that if you do:
>>
>> -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
>>
>> you do not get CMS? If not, what does the GC log say?
>> Can you provide the following details:
>>
>> % java -version
>>
>> and also
>>
>> % jinfo <pid>
>>
>> as well as:
>>
>> % jinfo -flag UseConcMarkSweepGC <pid>
>>
>> where <pid> is your JVM process.
>>
>> The main issue will be dealt with in the original thread.
>> Sorry for the digression.
>>
>> -- ramki
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From hjohn at xs4all.nl Mon May 2 01:54:29 2011
From: hjohn at xs4all.nl (John Hendrikx)
Date: Mon, 02 May 2011 10:54:29 +0200
Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?
In-Reply-To: <4DBA57BC.40604@oracle.com>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com>
Message-ID: <4DBE7145.8020900@xs4all.nl>

I've attached the four runs you asked for. I've included the full log (with some prints of my own) plus the info from -XX:+PrintGCTimeStamps -verbose:gc.

--John

Y. Srinivas Ramakrishna wrote:
> John --
>
> How about posting performance/times of each of the 4 combinations
> from the following cartesian product:-
>
> {JDK7, JDK6} X {CMS, G1}
>
> Perhaps you are conflating JDK changes with GC changes, because
> of changing both axes/dimensions at the same time?
>
> -- ramki
>
> On 4/28/2011 11:10 PM, John Hendrikx wrote:
>> I tried many -XX:MaxGCPauseMillis settings, including not setting it at
>> all, 20, 10, 5, 2.
>> The results were similar each time -- it didn't really have much of an
>> effect. In retrospect you might say that the total CPU use is what is
>> causing the problems, not necessarily the length of the pauses --
>> whether this extra CPU use is caused by the collector or because of
>> some other change in Java 7 I do not know; the program is the same. Is
>> there perhaps another collector that I could try to see if this lowers
>> CPU use? Or settings (even non-GC related) that could lower CPU use?
>>
>> Java 6's CMS I didn't need to tune. After determining that the length
>> of GC pauses was causing problems in the application, I tried turning
>> CMS on and it resolved the problems.
>>
>> What I observe is that even though with Java 7 the pauses seem (are?)
>> very short, the CPU use is a lot higher (from 65% under Java 6 to 95%
>> with 7). This could be related to other causes (perhaps threading
>> overhead, debug code in Java 7, etc) but I doubt it is in any specific
>> Java code that I wrote, as most of the heavy lifting is happening in
>> native methods. It could for example be that several ByteBuffers being
>> used are being copied under Java 7 while under 6 direct access was
>> possible.
>>
>> John.
>>
>> Jon Masamitsu wrote:
>>> John,
>>>
>>> You're telling G1 (UseG1GC) to limit pauses to 2ms
>>> (-XX:MaxGCPauseMillis=2) but seem to have tuned
>>> CMS (UseConcMarkSweepGC) toward a 20ms goal.
>>> G1 is trying to do very short collections and needs to do many
>>> of them to keep up with the allocation rate. Did you
>>> mean you are setting MaxGCPauseMillis to 20?
>>>
>>> Jon
>>>
>>> On 4/26/2011 7:59 AM, John Hendrikx wrote:
>>>
>>>> Hi list,
>>>>
>>>> I've been testing Java 1.6 performance vs Java 1.7 performance with a
>>>> timing critical application -- it's essential that garbage collection
>>>> pauses are very short. What I've found is that Java 1.6 seems to
>>>> perform significantly better than 1.7 (b137) in this respect, although
>>>> with certain settings 1.6 will also fail catastrophically. I've used
>>>> the following options:
>>>>
>>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC
>>>> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2
>>>>
>>>> The amount of garbage created is roughly 150 MB/sec. The application
>>>> demands a response time of about 20 ms and uses half a dozen threads
>>>> which deal with buffering and decoding of information.
>>>>
>>>> With the above settings, the 1.6 VM will meet this goal over a 2 minute
>>>> period >99% of the time (with an average CPU consumption of 65% per CPU
>>>> core for two cores) -- from verbosegc I gather that the pause times are
>>>> around 0.01-0.02 seconds:
>>>>
>>>> [GC 187752K->187559K(258880K), 0.0148198 secs]
>>>> [GC 192156K(258880K), 0.0008281 secs]
>>>> [GC 144561K->144372K(258880K), 0.0153497 secs]
>>>> [GC 148965K(258880K), 0.0008028 secs]
>>>> [GC 166187K->165969K(258880K), 0.0146546 secs]
>>>> [GC 187935K->187754K(258880K), 0.0150638 secs]
>>>> [GC 192344K(258880K), 0.0008422 secs]
>>>>
>>>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit.
>>>> It can also introduce OutOfMemory conditions and other catastrophic
>>>> failures (one time the GC took 10 seconds after the application had only
>>>> been running 20 seconds). How stable 1.6 will perform with the initial
>>>> settings remains to be seen; the results with more RAM worry me
>>>> somewhat.
>>>>
>>>> The 1.7 VM however performs significantly worse.
>>>> Here is some of its output (over roughly a one second period):
>>>>
>>>> [GC concurrent-mark-end, 0.0197681 sec]
>>>> [GC remark, 0.0030323 secs]
>>>> [GC concurrent-count-start]
>>>> [GC concurrent-count-end, 0.0060561]
>>>> [GC cleanup 177M->103M(256M), 0.0005319 secs]
>>>> [GC concurrent-cleanup-start]
>>>> [GC concurrent-cleanup-end, 0.0000676]
>>>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs]
>>>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs]
>>>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs]
>>>> [GC concurrent-mark-start]
>>>> [GC concurrent-mark-end, 0.0152915 sec]
>>>> [GC remark, 0.0033085 secs]
>>>> [GC concurrent-count-start]
>>>> [GC concurrent-count-end, 0.0085232]
>>>> [GC cleanup 163M->129M(256M), 0.0004847 secs]
>>>> [GC concurrent-cleanup-start]
>>>> [GC concurrent-cleanup-end, 0.0000363]
>>>>
>>>> From the above output one would not expect the performance to be worse;
>>>> however, the application fails to meet its goals 10-20% of the time.
>>>> The amount of garbage created is the same. CPU time however is hovering
>>>> around 90-95%, which is likely the cause of the poor performance. The
>>>> GC seems to take a significantly larger amount of time to do its work,
>>>> causing these stalls in my test application.
>>>>
>>>> I've experimented with memory sizes and max pause times with the 1.7 VM,
>>>> and although it seemed to be doing better with more RAM, it never comes
>>>> even close to the performance observed with the 1.6 VM.
>>>>
>>>> I'm not sure if there are other useful options I can try to see if I can
>>>> tune the 1.7 VM performance a bit better. I can provide more
>>>> information, although not any (useful) source code at this time due to
>>>> external dependencies (JNA/JNI) of this application.
>>>>
>>>> I'm wondering if I'm missing something, as it seems strange to me that
>>>> 1.7 is actually underperforming for me when in general most seem to
>>>> agree that the G1GC is a huge improvement.
>>>>
>>>> --John
>>>>
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>>
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: GC timings.txt
Url: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110502/09432b18/attachment-0001.txt
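[For reference, the four {JDK, collector} combinations Ramki asked for could be driven with command lines along these lines. This is only a sketch: the JDK install paths and the MyApp main class are placeholders, not taken from the thread, and in JDK 6 G1 was still experimental and had to be unlocked explicitly:

# JDK6 + CMS
~/jdk1.6.0_22/bin/java -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC -verbose:gc MyApp
# JDK6 + G1 (experimental in 6u22)
~/jdk1.6.0_22/bin/java -Xms256M -Xmx256M -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -verbose:gc MyApp
# JDK7 + CMS
~/jdk1.7.0/bin/java -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC -verbose:gc MyApp
# JDK7 + G1
~/jdk1.7.0/bin/java -Xms256M -Xmx256M -XX:+UseG1GC -verbose:gc MyApp
]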
From fancyerii at gmail.com Tue May 3 03:31:07 2011
From: fancyerii at gmail.com (Li Li)
Date: Tue, 3 May 2011 18:31:07 +0800
Subject: “abort preclean due to time” in Concurrent Mark & Sweep
Message-ID:

hi all

I confronted a strange case. The hotspot jvm was always doing gc and consumed many cpu resources (from 50% to 300% cpu usage). And when I turned on gc logging, I found "abort preclean due to time" in the gc logs.

So I googled and found some similar questions in
http://stackoverflow.com/questions/1834501/abort-preclean-due-to-time-in-concurrent-mark-sweep
and
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2008-October/000482.html.
And http://blogs.sun.com/jonthecollector/entry/did_you_know is suggested reading.

I read the blog post and couldn't understand it well. As it says, a CMS full gc has the following phases:

    STW initial mark
    Concurrent marking
    Concurrent precleaning
    STW remark
    Concurrent sweeping
    Concurrent reset

"Ok, so here's the punch line for all this. When we're doing the precleaning we do the sampling of the young generation top for a fixed amount of time before starting the remark. That fixed amount of time is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. The best situation is to have a minor collection happen during the sampling. When that happens the sampling is done over the entire region in the young generation from its start to its final top. If a minor collection is not done during that 5 seconds then the region below the first sample is 1 chunk and it might be the majority of the young generation. Such a chunking doesn't spread the work out evenly to the GC threads so reduces the effective parallelism." --quoted from this post.

In my opinion, Concurrent precleaning is the preparing stage for remark. It will split the young generation into chunks so remark can do its work in parallel. It expects a young gc in order to split chunks evenly. If there is no young gc before the timeout (CMSMaxAbortablePrecleanTime), it seems this gc will fail and all following phases will be skipped.

So when the system load is light (which means there will be no minor gc), precleaning will always time out and full gc will always fail. CPU is wasted.

Some suggested enlarging CMSMaxAbortablePrecleanTime. Maybe it can solve this problem. But the CMS collector, unlike other collectors that perform gc when the heap is full, will perform gc when space usage is larger than 92% (68% for older versions of hotspot) or when the jvm feels it should do it. If this value is too large, it will stop the world longer.

"Based on recent history, the concurrent collector maintains estimates of the time remaining before the tenured generation will be exhausted and of the time needed for a concurrent collection cycle. Based on these dynamic estimates, a concurrent collection cycle will be started with the aim of completing the collection cycle before the tenured generation is exhausted. These estimates are padded for safety, since the concurrent mode failure can be very costly.

A concurrent collection will also start if the occupancy of the tenured generation exceeds an initiating occupancy, a percentage of the tenured generation. The default value of this initiating occupancy threshold is approximately 92%, but the value is subject to change from release to release."
--quoted from http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms

Another solution: "There is an option CMSScavengeBeforeRemark which is off by default. If turned on, it will cause a minor collection to occur just before the remark. That's good because it will reduce the remark pause. That's bad because there is a minor collection pause followed immediately by the remark pause which looks like 1 big fat pause."

My question is: why isn't the collector smart enough to do it like this? If the system is busy, it works like before. Because it's busy, minor gc will occur and precleaning will succeed in the future. If the system is idling, it can adjust CMSMaxAbortablePrecleanTime or turn CMSScavengeBeforeRemark on.
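[A concrete sketch of the two workarounds discussed above; the heap size and the MyApp main class are illustrative placeholders only, and the values would need tuning per application:

# Workaround 1: give precleaning longer than the default 5000 ms to catch a scavenge
java -Xmx512M -XX:+UseConcMarkSweepGC -XX:CMSMaxAbortablePrecleanTime=30000 -verbose:gc MyApp
# Workaround 2: force a scavenge immediately before each remark pause
java -Xmx512M -XX:+UseConcMarkSweepGC -XX:+CMSScavengeBeforeRemark -verbose:gc MyApp
]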
From chkwok at digibites.nl Tue May 3 05:06:11 2011
From: chkwok at digibites.nl (Chi Ho Kwok)
Date: Tue, 3 May 2011 14:06:11 +0200
Subject: Re: “abort preclean due to time” in Concurrent Mark & Sweep
In-Reply-To:
References:
Message-ID:

You're looking at the wrong thing. Your heap usage is above *Heap size x CMSInitiatingOccupancyFraction%*, causing GC to be run continuously. Make that percentage higher or increase the heap size.

See -XX:CMSInitiatingOccupancyFraction and -Xmx

On Tue, May 3, 2011 at 12:31 PM, Li Li wrote:

> hi all
> I confronted a strange case. The hotspot jvm was always doing gc
> and consumed many cpu resources(from 50% to 300% cpu usage). And when
> I turned on gc information. I
> found "abort preclean due to time" in the gc logs.
> So I googled and found some similar questions in
>
> http://stackoverflow.com/questions/1834501/abort-preclean-due-to-time-in-concurrent-mark-sweep
> and
> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2008-October/000482.html.
> And http://blogs.sun.com/jonthecollector/entry/did_you_know is
> suggested to read.
> I read the blog post and can't understand well.
> As it says, CMS full gc has follwoing phases:
> STW initial mark
> Concurrent marking
> Concurrent precleaning
> STW remark
> Concurrent sweeping
> Concurrent reset
>
> "Ok, so here's the punch line for all this. When we're doing the
> precleaning we do the sampling of the young generation top for a fixed
> amount of time before starting the remark. That fixed amount of time
> is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. The
> best situation is to have a minor collection happen during the
> sampling. When that happens the sampling is done over the entire
> region in the young generation from its start to its final top. If a
> minor collection is not done during that 5 seconds then the region
> below the first sample is 1 chunk and it might be the majority of the
> young generation. Such a chunking doesn't spread the work out evenly
> to the GC threads so reduces the effective parallelism. " --quoted
> from this post.
>
> In my option, Concurrent precleaning is the preparing stage for
> remark. It will split the young generation to chunks so remark can do
> it parallelly. It expected a young gc in order
> to split chunks evenly. If there is no young gc before time
> out(CMSMaxAbortablePrecleanTime ), it seems it this gc will fail and
> all following phases will be skipped.
>
> So when the system load is light(which means there will be no
> minor gc), precleaning will always time out and full gc will always
> fail. cpu is waste.
>
> Some suggested enlarge CMSMaxAbortablePrecleanTime. Maybe it can
> solve this problem. But CMS collector,not like other collectors that
> will perform gc when full. it will
> perform gc when space usage is larger than 92%(68% for older version
> of hotspot) or jvm feel it should do it. if this value is too large,
> it will stop the world longer.
>
> "Based on recent history, the concurrent collector maintains
> estimates of the time remaining before the tenured generation will be
> exhausted and of the time needed for a concurrent collection cycle.
> Based on these dynamic estimates, a concurrent collection cycle will
> be started with the aim of completing the collection cycle before the
> tenured generation is exhausted.
> These estimates are padded for
> safety, since the concurrent mode failure can be very costly.
>
> A concurrent collection will also start if the occupancy of the
> tenured generation exceeds an initiating occupancy, a percentage of
> the tenured generation. The default value of this initiating occupancy
> threshold is approximately 92%, but the value is subject to change
> from release to release. "
> --quoted from
> http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms
>
> Another solution: "There is an option CMSScavengeBeforeRemark
> which is off by default. If turned on, it will cause a minor
> collection to occur just before the remark. That's good because it
> will reduce the remark pause. That's bad because there is a minor
> collection pause followed immediately by the remark pause which looks
> like 1 big fat pause.l "
>
> My question is that why the collector so stupid that it don't do
> it like this. If the system is busy, it works like before. Because
> it's busy, minor gc will occur and precleaning will success in the
> future. If the system is idling, it can adjust the
> CMSMaxAbortablePrecleanTime or turning CMSScavengeBeforeRemark on.
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110503/c2dfaa92/attachment.html

From fancyerii at gmail.com Tue May 3 06:02:01 2011
From: fancyerii at gmail.com (Li Li)
Date: Tue, 3 May 2011 21:02:01 +0800
Subject: Re: “abort preclean due to time” in Concurrent Mark & Sweep
In-Reply-To:
References:
Message-ID:

I forgot to say, the free heap size is larger than 30%

2011/5/3 Chi Ho Kwok:
> You're looking at the wrong thing. Your heap usage is above Heap size x
> CMSInitiatingOccupancyFraction%, causing GC to be run continuously. Make
> that percentage higher or increase the heap size.
> See -XX:CMSInitiatingOccupancyFraction and -Xmx
> On Tue, May 3, 2011 at 12:31 PM, Li Li wrote:
>>
>> hi all
>>     I confronted a strange case. The hotspot jvm was always doing gc
>> and consumed many cpu resources(from 50% to 300% cpu usage). And when
>> I turned on gc information. I
>> found "abort preclean due to time" in the gc logs.
>>     So I googled and found some similar questions in
>>
>> http://stackoverflow.com/questions/1834501/abort-preclean-due-to-time-in-concurrent-mark-sweep
>> and
>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2008-October/000482.html.
>> And http://blogs.sun.com/jonthecollector/entry/did_you_know is
>> suggested to read.
>>     I read the blog post and can't understand well.
>>     As it says, CMS full gc has follwoing phases:
>>         STW initial mark
>>         Concurrent marking
>>         Concurrent precleaning
>>         STW remark
>>         Concurrent sweeping
>>         Concurrent reset
>>
>>     "Ok, so here's the punch line for all this. When we're doing the
>> precleaning we do the sampling of the young generation top for a fixed
>> amount of time before starting the remark. That fixed amount of time
>> is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. The
>> best situation is to have a minor collection happen during the
>> sampling.
>> When that happens the sampling is done over the entire
>> region in the young generation from its start to its final top. If a
>> minor collection is not done during that 5 seconds then the region
>> below the first sample is 1 chunk and it might be the majority of the
>> young generation. Such a chunking doesn't spread the work out evenly
>> to the GC threads so reduces the effective parallelism. " --quoted
>> from this post.
>>
>>     In my option, Concurrent precleaning is the preparing stage for
>> remark. It will split the young generation to chunks so remark can do
>> it parallelly. It expected a young gc in order
>> to split chunks evenly. If there is no young gc before time
>> out(CMSMaxAbortablePrecleanTime ), it seems it this gc will fail and
>> all following phases will be skipped.
>>
>>     So when the system load is light(which means there will be no
>> minor gc), precleaning will always time out and full gc will always
>> fail. cpu is waste.
>>
>>     Some suggested enlarge CMSMaxAbortablePrecleanTime. Maybe it can
>> solve this problem. But CMS collector,not like other collectors that
>> will perform gc when full. it will
>> perform gc when space usage is larger than 92%(68% for older version
>> of hotspot) or jvm feel it should do it. if this value is too large,
>> it will stop the world longer.
>>
>>     "Based on recent history, the concurrent collector maintains
>> estimates of the time remaining before the tenured generation will be
>> exhausted and of the time needed for a concurrent collection cycle.
>> Based on these dynamic estimates, a concurrent collection cycle will
>> be started with the aim of completing the collection cycle before the
>> tenured generation is exhausted. These estimates are padded for
>> safety, since the concurrent mode failure can be very costly.
>>
>>     A concurrent collection will also start if the occupancy of the
>> tenured generation exceeds an initiating occupancy, a percentage of
>> the tenured generation. The default value of this initiating occupancy
>> threshold is approximately 92%, but the value is subject to change
>> from release to release. "
>>     --quoted from
>> http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms
>>
>>     Another solution: "There is an option CMSScavengeBeforeRemark
>> which is off by default. If turned on, it will cause a minor
>> collection to occur just before the remark. That's good because it
>> will reduce the remark pause. That's bad because there is a minor
>> collection pause followed immediately by the remark pause which looks
>> like 1 big fat pause.l "
>>
>>     My question is that why the collector so stupid that it don't do
>> it like this. If the system is busy, it works like before. Because
>> it's busy, minor gc will occur and precleaning will success in the
>> future. If the system is idling, it can adjust the
>> CMSMaxAbortablePrecleanTime or turning CMSScavengeBeforeRemark on.
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
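[A hedged sketch of how heap-occupancy claims like the one above can be checked against a live VM with the standard JDK tools; <pid> is a placeholder for the JVM process id:

# effective CMS trigger threshold on the running VM
jinfo -flag CMSInitiatingOccupancyFraction <pid>
# heap occupancy and GC activity, sampled once per second
jstat -gcutil <pid> 1000
]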
From chkwok at digibites.nl Tue May 3 06:08:55 2011
From: chkwok at digibites.nl (Chi Ho Kwok)
Date: Tue, 3 May 2011 15:08:55 +0200
Subject: Re: “abort preclean due to time” in Concurrent Mark & Sweep
In-Reply-To:
References:
Message-ID:

It's still a basic GC tuning problem; it has nothing to do with abortable preclean etc.

1. Post the full command line with all the options.
2. Use jvisualvm to monitor the heap. It's in the SDK\bin folder.
3. Make sure the average heap usage is 10% lower than CMSInitiatingOccupancyFraction.
4. If CMS is still running all the time, learn to set a proper young generation size.
5. And read http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html

On Tue, May 3, 2011 at 3:02 PM, Li Li wrote:

> I forgot to say, the free heap size is larger than 30%
>
> 2011/5/3 Chi Ho Kwok:
> > You're looking at the wrong thing. Your heap usage is above Heap size x
> > CMSInitiatingOccupancyFraction%, causing GC to be run continuously. Make
> > that percentage higher or increase the heap size.
> > See -XX:CMSInitiatingOccupancyFraction and -Xmx
> > On Tue, May 3, 2011 at 12:31 PM, Li Li wrote:
> >>
> >> hi all
> >> I confronted a strange case. The hotspot jvm was always doing gc
> >> and consumed many cpu resources(from 50% to 300% cpu usage). And when
> >> I turned on gc information. I
> >> found "abort preclean due to time" in the gc logs.
> >> So I googled and found some similar questions in
> >>
> >> http://stackoverflow.com/questions/1834501/abort-preclean-due-to-time-in-concurrent-mark-sweep
> >> and
> >> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2008-October/000482.html.
> >> And http://blogs.sun.com/jonthecollector/entry/did_you_know is
> >> suggested to read.
> >> I read the blog post and can't understand well.
> >> As it says, CMS full gc has follwoing phases:
> >> STW initial mark
> >> Concurrent marking
> >> Concurrent precleaning
> >> STW remark
> >> Concurrent sweeping
> >> Concurrent reset
> >>
> >> "Ok, so here's the punch line for all this. When we're doing the
> >> precleaning we do the sampling of the young generation top for a fixed
> >> amount of time before starting the remark. That fixed amount of time
> >> is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. The
> >> best situation is to have a minor collection happen during the
> >> sampling. When that happens the sampling is done over the entire
> >> region in the young generation from its start to its final top. If a
> >> minor collection is not done during that 5 seconds then the region
> >> below the first sample is 1 chunk and it might be the majority of the
> >> young generation. Such a chunking doesn't spread the work out evenly
> >> to the GC threads so reduces the effective parallelism. " --quoted
> >> from this post.
> >>
> >> In my option, Concurrent precleaning is the preparing stage for
> >> remark. It will split the young generation to chunks so remark can do
> >> it parallelly. It expected a young gc in order
> >> to split chunks evenly. If there is no young gc before time
> >> out(CMSMaxAbortablePrecleanTime ), it seems it this gc will fail and
> >> all following phases will be skipped.
> >>
> >> So when the system load is light(which means there will be no
> >> minor gc), precleaning will always time out and full gc will always
> >> fail. cpu is waste.
> >>
> >> Some suggested enlarge CMSMaxAbortablePrecleanTime. Maybe it can
> >> solve this problem. But CMS collector,not like other collectors that
> >> will perform gc when full. it will
> >> perform gc when space usage is larger than 92%(68% for older version
> >> of hotspot) or jvm feel it should do it. if this value is too large,
> >> it will stop the world longer.
> >> > >> "Based on recent history, the concurrent collector maintains > >> estimates of the time remaining before the tenured generation will be > >> exhausted and of the time needed for a concurrent collection cycle. > >> Based on these dynamic estimates, a concurrent collection cycle will > >> be started with the aim of completing the collection cycle before the > >> tenured generation is exhausted. These estimates are padded for > >> safety, since the concurrent mode failure can be very costly. > >> > >> A concurrent collection will also start if the occupancy of the > >> tenured generation exceeds an initiating occupancy, a percentage of > >> the tenured generation. The default value of this initiating occupancy > >> threshold is approximately 92%, but the value is subject to change > >> from release to release. " > >> --quoted from > >> > http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms > >> > >> > >> Another solution: "There is an option CMSScavengeBeforeRemark > >> which is off by default. If turned on, it will cause a minor > >> collection to occur just before the remark. That's good because it > >> will reduce the remark pause. That's bad because there is a minor > >> collection pause followed immediately by the remark pause which looks > >> like 1 big fat pause.l " > >> > >> My question is that why the collector so stupid that it don't do > >> it like this. If the system is busy, it works like before. Because > >> it's busy, minor gc will occur and precleaning will success in the > >> future. If the system is idling, it can adjust the > >> CMSMaxAbortablePrecleanTime or turning CMSScavengeBeforeRemark on. > >> _______________________________________________ > >> hotspot-gc-use mailing list > >> hotspot-gc-use at openjdk.java.net > >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110503/a5622a7d/attachment-0001.html From y.s.ramakrishna at oracle.com Tue May 3 10:17:49 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 03 May 2011 10:17:49 -0700 Subject: =?windows-1252?Q?=93abort_preclean_due_to_time=94_?= =?windows-1252?Q?in_Concurrent_Mark_=26_Sweep?= In-Reply-To: References: Message-ID: <4DC038BD.1070404@oracle.com> Hi LiLi -- On 05/03/11 03:31, Li Li wrote: > hi all > I confronted a strange case. The hotspot jvm was always doing gc > and consumed many cpu resources(from 50% to 300% cpu usage). And when > I turned on gc information. I > found "abort preclean due to time" in the gc logs. > So I googled and found some similar questions in > http://stackoverflow.com/questions/1834501/abort-preclean-due-to-time-in-concurrent-mark-sweep > and http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2008-October/000482.html. > And http://blogs.sun.com/jonthecollector/entry/did_you_know is > suggested to read. > I read the blog post and can't understand well. > As it says, CMS full gc has follwoing phases: > STW initial mark > Concurrent marking > Concurrent precleaning > STW remark > Concurrent sweeping > Concurrent reset > > "Ok, so here's the punch line for all this. When we're doing the > precleaning we do the sampling of the young generation top for a fixed > amount of time before starting the remark. That fixed amount of time > is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. 
> The best situation is to have a minor collection happen during the
> sampling. When that happens the sampling is done over the entire
> region in the young generation from its start to its final top. If a
> minor collection is not done during that 5 seconds then the region
> below the first sample is 1 chunk and it might be the majority of the
> young generation. Such a chunking doesn't spread the work out evenly
> to the GC threads so reduces the effective parallelism. " --quoted
> from this post.
>
> In my option, Concurrent precleaning is the preparing stage for
> remark. It will split the young generation to chunks so remark can do
> it parallelly. It expected a young gc in order
> to split chunks evenly. If there is no young gc before time
> out(CMSMaxAbortablePrecleanTime ), it seems it this gc will fail and
> all following phases will be skipped.

Not quite. Reread the above para. What it says is that the splitting
might be uneven if the time between scavenges is much larger than the
default timeout, because the first chunk may be much larger than the
rest, and it would be the "long pole" in the parallelization.

>
> So when the system load is light(which means there will be no
> minor gc), precleaning will always time out and full gc will always
> fail. cpu is waste.

It won't fail. It'll be less parallel (i.e. less efficient, and would
have a longer pause time, for lesser work).

>
> Some suggested enlarge CMSMaxAbortablePrecleanTime. Maybe it can
> solve this problem. But CMS collector,not like other collectors that
> will perform gc when full. it will
> perform gc when space usage is larger than 92%(68% for older version
> of hotspot) or jvm feel it should do it. if this value is too large,
> it will stop the world longer.

If you are right at the edge, you may be right. But the idea is to make
the CMSMaxAbortablePrecleanTime about twice the inter-scavenge time, and
then you will almost never have the uneven splitting. If you are getting
concurrent mode failure because of making CMSMaxAbortablePrecleanTime too
large, then you must be in a regime where your CMS trigger threshold is
much too high for comfort, and preclean or not you run a high risk of
concurrent mode failure. I don't think not setting a larger timeout will
save you there.

>
> "Based on recent history, the concurrent collector maintains
> estimates of the time remaining before the tenured generation will be
> exhausted and of the time needed for a concurrent collection cycle.
> Based on these dynamic estimates, a concurrent collection cycle will
> be started with the aim of completing the collection cycle before the
> tenured generation is exhausted. These estimates are padded for
> safety, since the concurrent mode failure can be very costly.
>
> A concurrent collection will also start if the occupancy of the
> tenured generation exceeds an initiating occupancy, a percentage of
> the tenured generation. The default value of this initiating occupancy
> threshold is approximately 92%, but the value is subject to change
> from release to release. "
> --quoted from
> http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms
>
>
> Another solution: "There is an option CMSScavengeBeforeRemark
> which is off by default. If turned on, it will cause a minor
> collection to occur just before the remark. That's good because it
> will reduce the remark pause.
That's bad because there is a minor > collection pause followed immediately by the remark pause which looks > like 1 big fat pause.l " > > My question is that why the collector so stupid that it don't do > it like this. If the system is busy, it works like before. Because > it's busy, minor gc will occur and precleaning will success in the > future. If the system is idling, it can adjust the > CMSMaxAbortablePrecleanTime or turning CMSScavengeBeforeRemark on. Absolutely. There's in fact an open RFE to do just that, but we have been frying more important fish recently and have not gotten to that RFE. I'll dig up the CR id for you shortly. -- ramki From y.s.ramakrishna at oracle.com Tue May 3 10:28:28 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 03 May 2011 10:28:28 -0700 Subject: =?windows-1252?Q?=93abort_preclean_due_to_time=94_?= =?windows-1252?Q?in_Concurrent_Mark_=26_Sweep?= In-Reply-To: <4DC038BD.1070404@oracle.com> References: <4DC038BD.1070404@oracle.com> Message-ID: <4DC03B3C.2040201@oracle.com> On 05/03/11 10:17, Y. S. Ramakrishna wrote: > > Hi LiLi -- > > On 05/03/11 03:31, Li Li wrote: ... >> My question is that why the collector so stupid that it don't do >> it like this. If the system is busy, it works like before. Because >> it's busy, minor gc will occur and precleaning will success in the >> future. If the system is idling, it can adjust the >> CMSMaxAbortablePrecleanTime or turning CMSScavengeBeforeRemark on. > > Absolutely. There's in fact an open RFE to do just that, but we have > been frying > more important fish recently and have not gotten to that RFE. I'll > dig up the CR id for you shortly. 6990419 CMS: Remaining work for 6572569: consistently skewed work distribution in (long) re-mark pauses From y.s.ramakrishna at oracle.com Tue May 3 10:34:53 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 03 May 2011 10:34:53 -0700 Subject: =?windows-1252?Q?=93abort_preclean_due_to_time=94_?= =?windows-1252?Q?in_Concurrent_Mark_=26_Sweep?= In-Reply-To: <4DC03B3C.2040201@oracle.com> References: <4DC038BD.1070404@oracle.com> <4DC03B3C.2040201@oracle.com> Message-ID: <4DC03CBD.6060700@oracle.com> BTW, if this is urgent, please feel free to file an escalation if you have a support contract. Otherwise, feel free to offer a patch for review if you have one that you'd like us to consider for integration. thanks! -- ramki On 05/03/11 10:28, Y. S. Ramakrishna wrote: > > > On 05/03/11 10:17, Y. S. Ramakrishna wrote: >> >> Hi LiLi -- >> >> On 05/03/11 03:31, Li Li wrote: > ... >>> My question is that why the collector so stupid that it don't do >>> it like this. If the system is busy, it works like before. Because >>> it's busy, minor gc will occur and precleaning will success in the >>> future. If the system is idling, it can adjust the >>> CMSMaxAbortablePrecleanTime or turning CMSScavengeBeforeRemark on. >> >> Absolutely. There's in fact an open RFE to do just that, but we have >> been frying >> more important fish recently and have not gotten to that RFE. I'll >> dig up the CR id for you shortly. 
>
> 6990419 CMS: Remaining work for 6572569: consistently skewed work
> distribution in (long) re-mark pauses
>

From aaisinzon at guidewire.com Wed May 4 11:55:45 2011
From: aaisinzon at guidewire.com (Alex Aisinzon)
Date: Wed, 4 May 2011 11:55:45 -0700
Subject: Two questions about compressed references
Message-ID:

Hi all

I had two questions about the use of compressed references:

* I understand that, with compressed references, the native memory must be located within a 32 bit space (which may be as small as 2GB on Windows). Additionally, if the heap is small enough and can function in "real" 32 bit, the heap will be placed in the 32 bit space, thereby reducing the space available for the native memory. As a result, out of native memory issues are more likely. I have heard of an option to rebase the heap above that 32 bit space so as to provide more growth to the native memory. Does someone know of that option?

* I understand how compressed references work for the heap (Java addresses are 8-byte aligned and, therefore, addresses have the low 3 bits set to 0, which allows an address of up to 35 bits to be stored in 32 bits after some shifting). I am less clear on how compressed references work for the native memory: does it use a 32 bit pointer only or does it use a 64 bit one?

Thanks in advance

Alex Aisinzon
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110504/e1b6fd61/attachment.html
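[To make the shift arithmetic in the second question concrete, here is a small Java model of compressed-reference decoding. It is a simplified sketch of the idea only, not HotSpot's actual implementation, and the values are invented:

// Simplified model of compressed-oop decoding (illustrative only).
public class CompressedOopModel {
    public static void main(String[] args) {
        long heapBase = 0L;          // non-zero when the heap cannot be placed low in the address space
        int compressed = 0x12345678; // the 32-bit reference as stored in an object field
        // Objects are 8-byte aligned, so the low 3 bits of a real address are
        // always 0; shifting a 32-bit value left by 3 therefore spans
        // 2^35 bytes (~32 GB) of heap.
        long nativeAddress = heapBase + ((compressed & 0xFFFFFFFFL) << 3);
        System.out.printf("decoded native address: 0x%x%n", nativeAddress);
    }
}
]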
From shane.cox at gmail.com Thu May 5 12:11:45 2011
From: shane.cox at gmail.com (Shane Cox)
Date: Thu, 5 May 2011 15:11:45 -0400
Subject: Periodic long minor GC pauses
In-Reply-To: <4DB79CC1.8040707@oracle.com>
References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com> <4DB71FAD.3050905@oracle.com> <4DB79CC1.8040707@oracle.com>
Message-ID:

Jon,

Thanks for the suggestion. Interestingly enough, adding -XX:+AlwaysPreTouch increased the frequency of the problem by 2x. Although this didn't solve the problem, I think it offers a clue. Touching all of the memory pages on startup will leave fewer pages on the free list, no? Could fewer pages on the free list trigger an OS behavior that interferes with an activity in the GC prologue? All that comes to mind is the linux page scanner (recycling pages or pruning the page cache) - not sure how this could interfere with the GC prologue. According to the free command, swap usage remained 0 throughout the test ... so I'm not suspecting swap at this time.

FYI, I ran the application/test on a Solaris 10 host and was unable to recreate the problem. So the problem appears to be linux-specific (RHEL).

Any additional thoughts would be appreciated.

Shane

On Wed, Apr 27, 2011 at 12:34 AM, Jon Masamitsu wrote:

> Shane,
>
> Have you tried running with -XX:+AlwaysPreTouch ? We've occasionally
> seen intermittent long pauses as the heap grows into newly committed
> pages. This flag causes pages to be touched as they are committed. I
> don't know how this fits into Ramki's observation but it might be
> worth a shot.
>
> Jon
>
> On 4/26/2011 12:40 PM, Y. S. Ramakrishna wrote:
> > Well-spotted; it's a version of the same problem as near as
> > I can tell. Please make sure to include a sizable GC log with
> > your bug report (starting from VM start-up, so we can see if
> > there is any clue in when the problem first starts during
> > the life of the VM).
> >
> > thanks.
> > -- ramki
> >
> > On 04/26/11 11:29, Shane Cox wrote:
> >> Below is an example from a Remark. Of the total 1.3 seconds of elapsed
> >> time, 1.2 seconds is found between the first two timestamps. However,
> >> I'm not savvy enough to know whether this is the same problem or simply
> >> the result of a long scavenge that occurs as part of the Remark. Is
> >> there any way to tell?
> >>
> >> 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K
> >> (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak
> >> refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420
> >> secs]9467.408: [scrub symbol & string tables, 0.0458500 secs] [1
> >> CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs]
> >> [Times: user=0.13 sys=0.01, real=1.32 secs]
> >>
> >>
> >> On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna wrote:
> >>
> >> I had a quick look and all I could find was the GC prologue
> >> code (although I didn't look all that carefully).
> >> Basically, GC is invoked, it prints this timestamp,
> >> does a bit of global book-keeping and some initialization,
> >> and then goes over each generation in the heap and
> >> says "I am going to do a collection, do whatever you need
> >> to do before I do the collection", and the generations each do a bit of
> >> book-keeping and any relevant initialization.
> >>
> >> The only thing I can see in the gc prologues other than a bit
> >> of lightweight book-keeping is some reporting code that could
> >> potentially be heavyweight. But you do not have any of those
> >> enabled in your option set, so there should not be anything
> >> obviously heavyweight going on.
> >>
> >> I'd suggest filing a bug under the category of
> >> jvm/hotspot/garbage_collector
> >> so someone in support can work with you to get this diagnosed...
> >>
> >> Three questions when you file the bug:
> >> (1) have you seen this start happening recently? (version?)
> >> (2) can you check if the longer pauses are "random" or do
> >> they always happen "during" CMS concurrent cycles or
> >> always outside of such cycles?
> >> (3) test set-up.
> >>
> >> -- ramki
> >>
> >> On 04/26/11 10:45, Y. S. Ramakrishna wrote:
> >>
> >> The pause is definitely in the beginning, before GC collection code
> >> itself runs; witness the timestamps:-
> >>
> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew:
> >> 943744K->79296K(943744K), 0.0559560 secs]
> >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31
> >> sys=0.09, real=2.45 secs]
> >>
> >> The first timestamp is 2120.686 and the next one is 2123.075, so we have
> >> about 2.389 s between those two. If you add to that the GC time of 0.056 s,
> >> you get 2.445 which is close enough to the 2.45 s reported.
> >>
> >> So we need to figure out what happens in the JVM between those two
> >> time-stamps and we can at least bound the culprit.
> >>
> >> -- ramki
> >>
> >> On 04/26/11 10:36, Shane Cox wrote:
> >>
> >> Periodically, our Java app on Linux experiences a long Minor
> >> GC pause that cannot be accounted for by the GC time in the
> >> log file. Instead, the pause is captured as "real" (wall
> >> clock) time and is observable in our application logs. An
> >> example is below. The GC completed in 56ms, but the
> >> application was paused for 2.45 seconds.
> >>
> >> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157:
> >> [ParNew: 943439K->104832K(943744K), 0.0481790 secs]
> >> 4909998K->4086751K(25060992K), 0.0485110 secs] [Times:
> >> user=0.34 sys=0.03, real=0.04 secs]
> >> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317:
> >> [ParNew: 942852K->104832K(943744K), 0.0738000 secs]
> >> 4924772K->4150899K(25060992K), 0.0740980 secs] [Times:
> >> user=0.45 sys=0.12, real=0.07 secs]
> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075:
> >> [ParNew: 943744K->79296K(943744K), 0.0559560 secs]
> >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times:
> >> user=0.31 sys=0.09, *real=2.45 secs]*
> >> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928:
> >> [ParNew: 918208K->81040K(943744K), 0.0396620 secs]
> >> 5026432K->4189265K(25060992K), 0.0400030 secs] [Times:
> >> user=0.32 sys=0.00, real=0.04 secs]
> >> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445:
> >> [ParNew: 919952K->104832K(943744K), 0.0845070 secs]
> >> 5028177K->4268050K(25060992K), 0.0848300 secs] [Times:
> >> user=0.52 sys=0.11, real=0.09 secs]
> >>
> >> Initially I suspected swapping, but according to the free
> >> command, 0 bytes of swap are in use.
> >> >free -m
> >>              total       used       free     shared    buffers     cached
> >> Mem:         32168      28118       4050          0        824      12652
> >> -/+ buffers/cache:      14641      17527
> >> Swap:         8191          0       8191
> >>
> >> Next, I read about a problem relating to mprotect() on Linux
> >> that can be worked around with -XX:+UseMembar. I tried
> >> that, but I still see the same unexplainable pauses.
> >>
> >> Any suggestions/ideas? We've upgraded to the latest JDK,
> >> but no luck.
> >>
> >> Thanks,
> >> Shane
> >>
> >> java version "1.6.0_25"
> >> Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
> >> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)
> >>
> >> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009
> >> x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k
> >> -XX:PermSize=256m -XX:MaxPermSize=256m
> >> -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC
> >> -XX:+CMSParallelRemarkEnabled
> >> -XX:CMSInitiatingOccupancyFraction=70
> >> -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails
> >> -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
> >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings
> >> -XX:+UseMembar
> >>
> >> ------------------------------------------------------------------------
> >>
> >> _______________________________________________
> >> hotspot-gc-use mailing list
> >> hotspot-gc-use at openjdk.java.net
> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> >>
> > _______________________________________________
> > hotspot-gc-use mailing list
> > hotspot-gc-use at openjdk.java.net
> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110505/20306967/attachment.html From Alexander.Livitz at on24.com Wed May 11 15:30:10 2011 From: Alexander.Livitz at on24.com (Alexander Livitz) Date: Wed, 11 May 2011 15:30:10 -0700 Subject: Flags are enabled by -XX:+AggressiveOpts Message-ID: <4C717D4720DE704383A5D24A4D46E1C681D5E97C5A@P-HQEXCHANGE.on24.com> Hello, My performance tests indicate that using -XX:+AggressiveOpts actually helps my application, but since this flag is marked as experimental I want to be careful with it. I want to know what flags are enabled by -XX:+AggressiveOpts on 1.6u25. Thanks, Alexander Livitz -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110511/893e1736/attachment.html From Alexander.Livitz at on24.com Fri May 13 11:22:08 2011 From: Alexander.Livitz at on24.com (Alexander Livitz) Date: Fri, 13 May 2011 11:22:08 -0700 Subject: Flags are enabled by -XX:+AggressiveOpts In-Reply-To: <4C717D4720DE704383A5D24A4D46E1C681D5E97C5A@P-HQEXCHANGE.on24.com> References: <4C717D4720DE704383A5D24A4D46E1C681D5E97C5A@P-HQEXCHANGE.on24.com> Message-ID: <4C717D4720DE704383A5D24A4D46E1C681D5F3E323@P-HQEXCHANGE.on24.com> Just in case anyone is curious, the flags enabled by -XX:+AggressiveOpts in JDK 1.6.0_25 are: -XX:+EliminateAutoBox -XX:AutoBoxCacheMax=20000 -XX:BiasedLockingStartupDelay=500 -XX:+DoEscapeAnalysis -XX:+OptimizeStringConcat -XX:+OptimizeFill Best, Alexander From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of Alexander Livitz Sent: Wednesday, May 11, 2011 3:30 PM To: hotspot-gc-use at openjdk.java.net Subject: Flags are enabled by -XX:+AggressiveOpts Hello, My performance tests indicate that using -XX:+AggressiveOpts actually helps my application, but since this flag is marked as experimental I want to be careful with it. I want to know what flags are enabled by -XX:+AggressiveOpts on 1.6u25. Thanks, Alexander Livitz -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20110513/28e2d302/attachment.html From y.s.ramakrishna at oracle.com Fri May 13 17:49:24 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 13 May 2011 17:49:24 -0700 Subject: Quick CMS Survey [~5 minutes] Message-ID: <4DCDD194.9020503@oracle.com> If you do not use/deploy CMS or do not wish to participate in this survey, then please ignore this question. If you do use CMS, and wish to participate in this survey, please answer Yes or No to each of the following three questions, and if you wish a fourth bonus question further below: --------------------------------------------------------------------------------- Q#1: do you use a fixed heap size (-Xmx == -Xms)? Q#2: do you explicitly enable the flag -XX:+CMSClassUnloadingEnabled ? Q#3: do you use a fixed perm gen size (-XX:PermSize == -XX:MaxPermSize)? --------------------------------------------------------------------------------- Many thanks for participating in this survey. Your answers will help us provide you with a better CMS experience. :-) --------------------------------------------------------------------------------- Bonus Question: Q#4: if you answered yes to any of the above questions, do you wish your answer were no to those questions? (Feel free to elaborate why here, if you wish to). 
--------------------------------------------------------------------------------- Please send your responses directly to me, so as to not flood everyone's mailboxes. I will post a summary of the survey results if there is sufficient interest. Thanks again for your participation, and have a good weekend. -- ramki From shane.cox at gmail.com Mon May 16 04:52:08 2011 From: shane.cox at gmail.com (Shane Cox) Date: Mon, 16 May 2011 07:52:08 -0400 Subject: Cause of Full GCs / OOME Message-ID: We have a simple app that reads log files (1 at a time), and inserts the records into a DB. Yesterday we observed two OOME's which coincide with Full GC's. These Full GC's appear to be "premature" - meaning that the heap occupancy was well below the heap size when GC was triggered. My best guess is that the Full GCs / OOME were caused by an extremely large allocation (e.g., > 1GB). Is there any other reasonable explanation for this behavior? Thanks, Shane 2011-05-15T01:04:08.942-0400: 744918.391: [GC 744918.391: [DefNew: 32914K->365K(36288K), 0.0020476 secs] 443409K->410860K(520256K), 0.0020886 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 2011-05-15T01:04:09.144-0400: 744918.593: [GC 744918.593: [DefNew: 32621K->323K(36288K), 0.0024427 secs] 443116K->410842K(520256K), 0.0024860 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 2011-05-15T01:05:08.487-0400: 744977.947: [GC 744977.954: [DefNew: 32579K->244K(36288K), 0.0028439 secs] 443098K->410763K(520256K), 0.0029446 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 2011-05-15T01:05:08.723-0400: 744978.182: [GC 744978.182: [DefNew: 32500K->636K(36288K), 0.0023190 secs] 443019K->411154K(520256K), 0.0023617 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 2011-05-15T01:05:08.906-0400: 744978.365: [GC 744978.366: [DefNew: 32892K->321K(36288K), 0.0021073 secs] 443410K->410842K(520256K), 0.0021459 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 2011-05-15T01:05:13.378-0400: 744982.839: [GC 744982.839: [DefNew: 32577K->37K(36288K), 0.0016258 secs] 443098K->410558K(520256K), 0.0016636 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 2011-05-15T01:06:06.624-0400: 745036.095: [GC 745036.097: [DefNew: 10190K->48K(36288K), 0.0026042 secs]745036.100: [Tenured: 410520K->82745K(483968K), 0.5341348 secs] 420710K->82745K(520256K), [Perm : 9007K->9007K(12288K)], 0.5369732 secs] [Times: user=0.23 sys=0.00, real=0.54 secs] 2011-05-15T01:06:07.166-0400: 745036.636: [Full GC 745036.636: [Tenured: 82745K->59599K(967936K), 0.4755627 secs] 82745K->59599K(1004288K), [Perm : 9007K->8935K(12288K)], 0.4758317 secs] [Times: user=0.15 sys=0.00, real=0.48 secs] 2011-05-15T01:06:17.461-0400: 745046.933: [GC 745046.944: [DefNew: 64512K->488K(72576K), 0.0045634 secs] 124111K->60087K(1040512K), 0.0046410 secs] [Times: user=0.00 sys=0.00, real=0.02 secs] 2011-05-15T01:07:08.690-0400: 745098.171: [GC 745098.181: [DefNew: 65000K->797K(72576K), 0.0041858 secs] 124599K->60397K(1040512K), 0.0042621 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 2011-05-15T01:07:09.694-0400: 745099.176: [GC 745099.176: [DefNew: 65309K->485K(72576K), 0.0016391 secs] 124909K->60085K(1040512K), 0.0016827 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 2011-05-15T01:08:08.858-0400: 745158.350: [GC 745158.357: [DefNew: 64997K->799K(72576K), 0.0036000 secs] 124597K->60399K(1040512K), 0.0036762 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 2011-05-15T01:08:11.512-0400: 745161.005: [GC 745161.005: [DefNew: 65311K->488K(72576K), 0.0015378 secs] 124911K->60088K(1040512K), 0.0015790 secs] [Times: user=0.00 
2011-05-15T01:04:08.942-0400: 744918.391: [GC 744918.391: [DefNew: 32914K->365K(36288K), 0.0020476 secs] 443409K->410860K(520256K), 0.0020886 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:04:09.144-0400: 744918.593: [GC 744918.593: [DefNew: 32621K->323K(36288K), 0.0024427 secs] 443116K->410842K(520256K), 0.0024860 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:05:08.487-0400: 744977.947: [GC 744977.954: [DefNew: 32579K->244K(36288K), 0.0028439 secs] 443098K->410763K(520256K), 0.0029446 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2011-05-15T01:05:08.723-0400: 744978.182: [GC 744978.182: [DefNew: 32500K->636K(36288K), 0.0023190 secs] 443019K->411154K(520256K), 0.0023617 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:05:08.906-0400: 744978.365: [GC 744978.366: [DefNew: 32892K->321K(36288K), 0.0021073 secs] 443410K->410842K(520256K), 0.0021459 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:05:13.378-0400: 744982.839: [GC 744982.839: [DefNew: 32577K->37K(36288K), 0.0016258 secs] 443098K->410558K(520256K), 0.0016636 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:06:06.624-0400: 745036.095: [GC 745036.097: [DefNew: 10190K->48K(36288K), 0.0026042 secs]745036.100: [Tenured: 410520K->82745K(483968K), 0.5341348 secs] 420710K->82745K(520256K), [Perm : 9007K->9007K(12288K)], 0.5369732 secs] [Times: user=0.23 sys=0.00, real=0.54 secs]
2011-05-15T01:06:07.166-0400: 745036.636: [Full GC 745036.636: [Tenured: 82745K->59599K(967936K), 0.4755627 secs] 82745K->59599K(1004288K), [Perm : 9007K->8935K(12288K)], 0.4758317 secs] [Times: user=0.15 sys=0.00, real=0.48 secs]
2011-05-15T01:06:17.461-0400: 745046.933: [GC 745046.944: [DefNew: 64512K->488K(72576K), 0.0045634 secs] 124111K->60087K(1040512K), 0.0046410 secs] [Times: user=0.00 sys=0.00, real=0.02 secs]
2011-05-15T01:07:08.690-0400: 745098.171: [GC 745098.181: [DefNew: 65000K->797K(72576K), 0.0041858 secs] 124599K->60397K(1040512K), 0.0042621 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2011-05-15T01:07:09.694-0400: 745099.176: [GC 745099.176: [DefNew: 65309K->485K(72576K), 0.0016391 secs] 124909K->60085K(1040512K), 0.0016827 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:08:08.858-0400: 745158.350: [GC 745158.357: [DefNew: 64997K->799K(72576K), 0.0036000 secs] 124597K->60399K(1040512K), 0.0036762 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2011-05-15T01:08:11.512-0400: 745161.005: [GC 745161.005: [DefNew: 65311K->488K(72576K), 0.0015378 secs] 124911K->60088K(1040512K), 0.0015790 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:08:34.926-0400: 745184.423: [GC 745184.433: [DefNew: 19958K->190K(72576K), 0.0020757 secs]745184.435: [Tenured: 59599K->59785K(967936K), 0.2253939 secs] 79558K->59785K(1040512K), [Perm : 9173K->9173K(12288K)], 0.2279033 secs] [Times: user=0.15 sys=0.00, real=0.24 secs]
2011-05-15T01:08:35.164-0400: 745184.661: [Full GC 745184.661: [Tenured: 59785K->58941K(967936K), 0.3941210 secs] 59785K->58941K(1037056K), [Perm : 9173K->9165K(12288K)], 0.3947753 secs] [Times: user=0.15 sys=0.00, real=0.39 secs]
2011-05-15T01:09:08.897-0400: 745218.400: [GC 745218.412: [DefNew: 51712K->346K(58176K), 0.0012630 secs] 110653K->59287K(832528K), 0.0013170 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2011-05-15T01:09:10.210-0400: 745219.714: [GC 745219.714: [DefNew: 52058K->330K(58176K), 0.0018250 secs] 110999K->59271K(832528K), 0.0018960 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:10:09.557-0400: 745279.072: [GC 745279.075: [DefNew: 52042K->653K(58176K), 0.0032786 secs] 110983K->59595K(832528K), 0.0429740 secs] [Times: user=0.00 sys=0.00, real=0.05 secs]
2011-05-15T01:10:11.557-0400: 745281.072: [GC 745281.072: [DefNew: 52365K->342K(58176K), 0.0010212 secs] 111307K->59284K(832528K), 0.0010563 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:11:08.779-0400: 745338.304: [GC 745338.316: [DefNew: 52054K->655K(58176K), 0.0028664 secs] 110996K->59596K(832528K), 0.0029585 secs] [Times: user=0.00 sys=0.00, real=0.02 secs]
2011-05-15T01:11:10.138-0400: 745339.664: [GC 745339.664: [DefNew: 52367K->342K(58176K), 0.0010634 secs] 111308K->59284K(832528K), 0.0011042 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
2011-05-15T01:12:08.585-0400: 745398.121: [GC 745398.130: [DefNew: 52054K->752K(58176K), 0.0018517 secs] 110996K->59693K(832528K), 0.0019152 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]

2011-05-15 01:06:07,738 [Reader#0 ] ERROR ntroller.ReadLogFileWorker - Exception processing file.
java.lang.OutOfMemoryError: Java heap space
        at com.cpex.icelog.core.tradingengine.reader.SocketMessageLogReader.getNextSocketMessageByteStream(SocketMessageLogReader.java:159)
        at com.cpex.icelog.core.tradingengine.reader.SocketMessageLogReader.readLogEntries(SocketMessageLogReader.java:94)
        at com.cpex.icelog.core.tradingengine.reader.SocketMessageLogReader.read(SocketMessageLogReader.java:287)
        at com.cpex.icelog.core.controller.ReadLogFileWorker.dispatchReader(ReadLogFileWorker.java:110)
        at com.cpex.icelog.core.controller.ReadLogFileWorker.delegateDoRun(ReadLogFileWorker.java:81)
        at com.cpex.icelog.core.controller.ReadLogFileWorker.doRun(ReadLogFileWorker.java:63)
        at com.cpex.icelog.core.domain.IceLogProcessingRunnable.run(IceLogProcessingRunnable.java:37)
        at edu.emory.mathcs.backport.java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:442)
        at edu.emory.mathcs.backport.java.util.concurrent.FutureTask.run(FutureTask.java:176)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:665)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:690)
        at java.lang.Thread.run(Thread.java:619)
From y.s.ramakrishna at oracle.com Mon May 16 08:41:17 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Mon, 16 May 2011 08:41:17 -0700
Subject: Cause of Full GCs / OOME
In-Reply-To:
References:
Message-ID: <4DD1459D.1020906@oracle.com>

On 5/16/2011 4:52 AM, Shane Cox wrote:
> We have a simple app that reads log files (1 at a time) and inserts the
> records into a DB. Yesterday we observed two OOMEs, which coincide with
> Full GCs. These Full GCs appear to be "premature", meaning that the heap
> occupancy was well below the heap size when the GC was triggered. My best
> guess is that the Full GCs / OOME were caused by an extremely large
> allocation (e.g., > 1GB). Is there any other reasonable explanation for
> this behavior?

I think you are right that this must be for an allocation that is too
large to fit in either of the two generations (old or young). Does that
appear reasonable for the kind of allocation that the OOME stack trace
shows at:

com.cpex.icelog.core.tradingengine.reader.SocketMessageLogReader.getNextSocketMessageByteStream(SocketMessageLogReader.java:159)

Looking at the heap in the temporal vicinity of the OOME, we find:

> 2011-05-15T01:06:06.624-0400: 745036.095: [GC 745036.097: [DefNew: 10190K->48K(36288K), 0.0026042 secs]745036.100: [Tenured: 410520K->82745K(483968K), 0.5341348 secs] 420710K->82745K(520256K), [Perm : 9007K->9007K(12288K)], 0.5369732 secs] [Times: user=0.23 sys=0.00, real=0.54 secs]
> 2011-05-15T01:06:07.166-0400: 745036.636: [Full GC 745036.636: [Tenured: 82745K->59599K(967936K), 0.4755627 secs] 82745K->59599K(1004288K), [Perm : 9007K->8935K(12288K)], 0.4758317 secs] [Times: user=0.15 sys=0.00, real=0.48 secs]

Following the second GC, there was 967936K - 59599K = 908337K of free
space in the old generation, so it seems reasonable to me that this was an
allocation request whose size exceeded that value.

The two GCs indicate the JVM escalating the collection: it first attempts
a normal full GC to try and satisfy the allocation request. When the
available space still falls short, it follows up with a collection that
clears any soft references held by the VM, and retries the allocation.
This also seems to fail.

I'd try and bound the size at the allocation site or, failing that, size
the heap sufficiently large to allow such large allocations from time to
time.
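For what it's worth, that failure mode is easy to reproduce with a toy
program. The sketch below is hypothetical (the class name and sizes are
made up for illustration): a single allocation larger than the space
either generation can offer fails even though the rest of the heap is
nearly empty.

public class BigAlloc {
    public static void main(String[] args) {
        // ~1 GB of longs can never fit in a 512 MB heap: the VM does a
        // normal full GC, then a second collection that also clears soft
        // references, retries the allocation, and finally throws
        // java.lang.OutOfMemoryError: Java heap space.
        long[] tooBig = new long[128 * 1024 * 1024];
        System.out.println(tooBig.length);
    }
}

% javac BigAlloc.java
% java -Xms512m -Xmx512m -verbose:gc BigAlloc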
-- ramki

From shane.cox at gmail.com Tue May 17 08:46:41 2011
From: shane.cox at gmail.com (Shane Cox)
Date: Tue, 17 May 2011 11:46:41 -0400
Subject: JVM Crash during GC
Message-ID:

Has anyone seen a JVM crash similar to this one? Wondering if this is a
new or existing problem. Any insights would be appreciated.
Thanks,
Shane


# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00002b0f1733cdc9, pid=14532, tid=1093286208
#
# JRE version: 6.0_18-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x3b1dc9]

Current thread (0x0000000056588800):  GCTaskThread [stack: 0x0000000000000000,0x0000000000000000] [id=14539]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), si_addr=0x0000000000000025

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x3b1dc9];;  void ParScanClosure::do_oop_work(oopDesc**, bool, bool)+0x79
V  [libjvm.so+0x5e7f03];;  ParRootScanWithBarrierTwoGensClosure::do_oop(oopDesc**)+0x13
V  [libjvm.so+0x3ab18c];;  instanceKlass::oop_oop_iterate_nv_m(oopDesc*, FilteringClosure*, MemRegion)+0x16c
V  [libjvm.so+0x297aff];;  FreeListSpace_DCTOC::walk_mem_region_with_cl_par(MemRegion, HeapWord*, HeapWord*, FilteringClosure*)+0x13f
V  [libjvm.so+0x297995];;  FreeListSpace_DCTOC::walk_mem_region_with_cl(MemRegion, HeapWord*, HeapWord*, FilteringClosure*)+0x35
V  [libjvm.so+0x66014f];;  Filtering_DCTOC::walk_mem_region(MemRegion, HeapWord*, HeapWord*)+0x5f
V  [libjvm.so+0x65fee9];;  DirtyCardToOopClosure::do_MemRegion(MemRegion)+0xf9
V  [libjvm.so+0x24153d];;  ClearNoncleanCardWrapper::do_MemRegion(MemRegion)+0xdd
V  [libjvm.so+0x23ffea];;  CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)+0x1ca
V  [libjvm.so+0x5e504b];;  CardTableModRefBS::process_stride(Space*, MemRegion, int, int, DirtyCardToOopClosure*, MemRegionClosure*, bool, signed char**, unsigned long, unsigned long)+0x13b
V  [libjvm.so+0x5e4e98];;  CardTableModRefBS::par_non_clean_card_iterate_work(Space*, MemRegion, DirtyCardToOopClosure*, MemRegionClosure*, bool, int)+0xc8
V  [libjvm.so+0x23fdfb];;  CardTableModRefBS::non_clean_card_iterate(Space*, MemRegion, DirtyCardToOopClosure*, MemRegionClosure*, bool)+0x5b
V  [libjvm.so+0x240b9a];;  CardTableRS::younger_refs_in_space_iterate(Space*, OopsInGenClosure*)+0x8a
V  [libjvm.so+0x379378];;  Generation::younger_refs_in_space_iterate(Space*, OopsInGenClosure*)+0x18
V  [libjvm.so+0x2c5c5f];;  ConcurrentMarkSweepGeneration::younger_refs_iterate(OopsInGenClosure*)+0x4f
V  [libjvm.so+0x240a8a];;  CardTableRS::younger_refs_iterate(Generation*, OopsInGenClosure*)+0x2a
V  [libjvm.so+0x36bfcd];;  GenCollectedHeap::gen_process_strong_roots(int, bool, bool, SharedHeap::ScanningOption, OopsInGenClosure*, OopsInGenClosure*)+0x9d
V  [libjvm.so+0x5e82c9];;  ParNewGenTask::work(int)+0xc9
V  [libjvm.so+0x722e0d];;  GangWorker::loop()+0xad
V  [libjvm.so+0x722d24];;  GangWorker::run()+0x24
V  [libjvm.so+0x5da2af];;  java_start(Thread*)+0x13f

VM Arguments:
jvm_args: -verbose:gc -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+UseParNewGC -Xmx4000m -Xms4000m -Xss256k -XX:PermSize=256M -XX:MaxPermSize=256M -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:+CMSPermGenSweepingEnabled -XX:+ExplicitGCInvokesConcurrent

OS:Red Hat Enterprise Linux Server release 5.3 (Tikanga)

uname:Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64

vm_info: Java HotSpot(TM) 64-Bit Server VM (16.0-b13) for linux-amd64 JRE (1.6.0_18-b07), built on Dec 17 2009 13:42:22 by "java_re" with gcc 3.2.2 (SuSE Linux)
From y.s.ramakrishna at oracle.com Tue May 17 09:03:48 2011
From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna)
Date: Tue, 17 May 2011 09:03:48 -0700
Subject: JVM Crash during GC
In-Reply-To:
References:
Message-ID: <4DD29C64.8060501@oracle.com>

Hi Shane,

That's 6u18, which is about 18 months old. Could you try the latest, 6u25,
and see if the problem reproduces? The crash is somewhat generic in that
we crash when scanning cards during a scavenge, presumably running across
a bad pointer. If you need to stick with that JVM, you can try turning off
compressed oops explicitly, and/or enable heap verification to see if it
catches anything sooner.

If the problem reproduces with the latest bits, we'd definitely be
interested in a formal bug report with a test case.
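Concretely, such an experiment might look like the sketch below. This is
only illustrative: "YourMainClass" stands in for the real application
entry point, the remaining original flags are elided for brevity, and the
Verify* flags are diagnostic options (hence the unlock flag) that slow GC
down considerably, so they are for troubleshooting runs only.

% java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
      -Xmx4000m -Xms4000m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
      -XX:-UseCompressedOops \
      -XX:+UnlockDiagnosticVMOptions -XX:+VerifyBeforeGC -XX:+VerifyAfterGC \
      YourMainClass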
-- ramki