From guanxiaohua at gmail.com Wed Sep 2 11:47:22 2009 From: guanxiaohua at gmail.com (Tony Guan) Date: Wed, 2 Sep 2009 13:47:22 -0500 Subject: capturing method entry/exit efficiently Message-ID: <2fcb552b0909021147r4a9425dds961bf020cb8ee4a4@mail.gmail.com> Dear all, One update for yesterday's mail is that I am now able to add some call_vm() code in TemplateTable::invokespecial(), to get my own code executed. This is much more economical compared with JVMTI. But I just cannot find a suitable place for monitoring the method_exit. Any idea about that? And the compiled method is still a nightmare for me. Thanks! Tony > Dear all, > > My current research project with hotspot requires me to do something > particular whenever a method (interpreted or compiled) is invoked. I > need to know the thread and the method at the invocation time. What I > am trying to do is some VM hacking based on the methods called. > Question 1: Can I use BCI to achieve this? > > I am now able to capture the method_entry/exit events by writing a > JVMTI agent, but it's not what I really need to do. By using JVMTI, > performance deteriorates a lot. And I am not sure if the compiled > method can still be captured. (Though I know java1.5 has some JVMPI > support in the compilation part, but not java1.7. Am I right?) > Question 2: I am trying to find a way to enable the > notify_method_entry/exit by partly simulating a JVMTI agent, that > means that I modify several parts of hotspot without actually using > an external JVMTI agent. Is it feasible (in terms of performance)? > > Question 3: Is there some better way to capture the method_entry/exit event? > > Thanks for diluting the question marks in my mind! 
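[Editor's note: Question 1 asks about BCI. The usual way that route is realized outside the VM is a java.lang.instrument agent, which also survives JIT compilation because the rewritten bytecode is what the compiler sees. A minimal sketch follows; the class name, the "com/example/" package filter, and the jar name are all invented for illustration, and the actual rewriting step (e.g. with a bytecode library such as ASM) is left as a comment.]

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Illustrative skeleton of a bytecode-instrumentation agent that sees
// every class as it is loaded and could insert method entry/exit hooks.
public class EntryExitAgent {

    static class EntryExitTransformer implements ClassFileTransformer {
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain pd, byte[] classBytes) {
            if (className != null && className.startsWith("com/example/")) {
                // A real agent would rewrite classBytes here (e.g. with a
                // bytecode library) to call a static hook at each method
                // entry and before each return/throw.
            }
            return null; // null means "leave the class file unchanged"
        }
    }

    // Wired up via: java -javaagent:entryexit.jar -jar yourapp.jar
    // (with "Premain-Class: EntryExitAgent" in the agent jar's manifest)
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new EntryExitTransformer());
    }
}
```

Unlike global JVMTI MethodEntry/MethodExit events, which typically force the whole VM into a slow path, this approach pays a cost only in the methods actually instrumented.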
> > Tony (Xiaohua Guan) > > > ------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > End of hotspot-gc-use Digest, Vol 21, Issue 1 > ********************************************* > -- Xiaohua Guan (Tony) Department of Computer Science and Engineering University of Nebraska-Lincoln 103A Avery Hall Lincoln, NE 68588-0115 Tel: 402-472-3884 Email: xguan at cse.unl.edu URL: http://www.cse.unl.edu/~xguan (This email is encoded as Unicode(UTF-8)) From Peter.Kessler at Sun.COM Thu Sep 3 10:47:23 2009 From: Peter.Kessler at Sun.COM (Peter B. Kessler) Date: Thu, 03 Sep 2009 10:47:23 -0700 Subject: capturing method entry/exit efficiently In-Reply-To: <2fcb552b0909021147r4a9425dds961bf020cb8ee4a4@mail.gmail.com> References: <2fcb552b0909021147r4a9425dds961bf020cb8ee4a4@mail.gmail.com> Message-ID: <4AA0012B.2090509@Sun.COM> Since hotspot-gc-use at openjdk.java.net is the HotSpot GC users mailing list, you might not be reaching the people who can help you with your project. I would suggest asking on hotspot-runtime-dev at openjdk.java.net for help with the interpreter (e.g., TemplateTable), or hotspot-compiler-dev at openjdk.java.net for help with the runtime compiler, or serviceability-dev at openjdk.java.net for help with the monitoring frameworks (e.g., JVMTI). But the garbage collectors don't have anything to do with method entry or exit. You might also want to look at bytecode rewriting, e.g., via your own classloader, to add instrumentation to method entries and exits for the methods of the classes you are interested in. I think there are tools available that make bytecode rewriting not as difficult as it sounds. ... peter Tony Guan wrote: > Dear all, > > One update for yesterday's mail is that I am now able to add some > call_vm() code in TemplateTable::invokespecial(), to get my own code > executed. 
This is much more economical compared with JVMTI. But I just > cannot find a suitable place for monitoring the method_exit. > > Any idea about that? And the compiled method is still a nightmare for me. > > Thanks! > > Tony > >> Dear all, >> >> My current research project with hotspot requires me to do something >> particular whenever a method (interpreted or compiled) is invoked. I >> need to know the thread and the method at the invocation time. What I >> am trying to do is some VM hacking based on the methods called. >> Question 1: Can I use BCI to achieve this? >> >> I am now able to capture the method_entry/exit events by writing a >> JVMTI agent, but it's not what I really need to do. By using JVMTI, >> performance deteriorates a lot. And I am not sure if the compiled >> method can still be captured. (Though I know java1.5 has some JVMPI >> support in the compilation part, but not java1.7. Am I right?) >> Question 2: I am trying to find a way to enable the >> notify_method_entry/exit by partly simulating a JVMTI agent, that >> means that I modify several parts of hotspot without actually using >> an external JVMTI agent. Is it feasible (in terms of performance)? >> >> Question 3: Is there some better way to capture the method_entry/exit event? >> >> Thanks for diluting the question marks in my mind! 
>> >> Tony (Xiaohua Guan) >> >> >> ------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> End of hotspot-gc-use Digest, Vol 21, Issue 1 >> ********************************************* >> > > > From jeff.lloyd at algorithmics.com Thu Sep 10 13:06:54 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Thu, 10 Sep 2009 16:06:54 -0400 Subject: Young generation configuration Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> Hi, I'm new to this list and I have a few questions about tuning my young generation gc. I have chosen to use the CMS garbage collector because my application is a relatively large reporting server that has a web front end and therefore needs to have minimal pauses. I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB ram. The machine is dedicated to this JVM. My steady-state was calculated as follows: - A typical number of users logged in and viewed several reports - Stopped user actions and performed a manual full GC - Look at the amount of heap used and take that number as the steady-state memory requirement In this case my heap usage was ~10GB. In order to handle variance or spikes I sized my old generation at 15-20GB. I sized my young generation at 32-42GB and used survivor ratios of 1, 2, 3 and 6. My goal is to maximize throughput and minimize pauses. I'm willing to sacrifice ram to increase speed. I have attached several of my many gc logs. The file gc_48G.txt is just using CMS without any other tuning, and the results are much worse than what I have been able to accomplish with other settings. The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and gc_57G_15Gold_42Gyoung_1sr.txt. The problem is that some of the pauses are just too long. 
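[Editor's note: Jeff's steady-state measurement (manual full GC, then read the used heap) can also be approximated from inside the JVM. A minimal sketch, keeping in mind that Runtime.gc()/System.gc() is only a request to the VM (and is ignored under -XX:+DisableExplicitGC), so the GC log remains the authoritative source:]

```java
// Rough in-process version of the steady-state measurement: request a
// full collection, then read the used heap. Treat the result as an
// estimate and cross-check it against the GC log.
public class SteadyState {
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // only a hint; the VM may not run a full collection
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.printf("steady-state estimate: %.1f MB%n",
                          usedHeapBytes() / (1024.0 * 1024.0));
    }
}
```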
Is there a way to reduce the pause times any further than I have now? Am I heading in the right direction? I ask because the default settings are so different from what I have been heading towards. The best reference I have found on what good gc logs look like comes from brief examples presented at JavaOne this year by Tony Printezis and Charlie Hunt. But I don't seem to be able to get logs that resemble their tenuring patterns. I think I have a lot of medium-lived objects instead of nice short-lived ones. Are there any good practices for apps with objects like this? Thanks, Jeff -------------------------------------------------------------------------- This email and any files transmitted with it are confidential and proprietary to Algorithmics Incorporated and its affiliates ("Algorithmics"). If received in error, use is prohibited. Please destroy, and notify sender. Sender does not waive confidentiality or privilege. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. Algorithmics does not accept liability for any errors or omissions. Any commitment intended to bind Algorithmics must be reduced to writing and signed by an authorized signatory. -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20090910/55defd8d/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: gc.zip Type: application/x-zip-compressed Size: 66850 bytes Desc: gc.zip Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20090910/55defd8d/attachment-0001.bin From tony.printezis at sun.com Fri Sep 11 08:22:17 2009 From: tony.printezis at sun.com (Tony Printezis) Date: Fri, 11 Sep 2009 11:22:17 -0400 Subject: Young generation configuration In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> Message-ID: <4AAA6B29.3030008@sun.com> Jeff, Hi. I had a very brief look at your logs. Yes, your app does seem to need to copy quite a lot (I don't think I've ever seen 1-2GB of data being copied in age 1!!!). From what I've seen from the space sizes, you're doing the right thing (i.e., you're consistent with what we talked about during the talk): you have quite large young gen and a reasonably sized old gen. But the sheer amount of surviving objects is what's getting you. How much larger can you make your young gen? I think in this case, the larger, the better. Maybe, you can also try MaxTenuringThreshold=1. This goes against our general advice, but this might decrease the amount of objects being copied during young GCs, at the expense of more frequent CMS cycles... Tony jeff.lloyd at algorithmics.com wrote: > > Hi, > > > > I'm new to this list and I have a few questions about tuning my young > generation gc. > > > > I have chosen to use the CMS garbage collector because my application > is a relatively large reporting server that has a web front end and > therefore needs to have minimal pauses. > > > > I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB ram. > > > > The machine is dedicated to this JVM. 
> > > > My steady-state was calculated as follows: > > - A typical number of users logged in and viewed several reports > > - Stopped user actions and performed a manual full GC > > - Look at the amount of heap used and take that number as the > steady-state memory requirement > > > > In this case my heap usage was ~10GB. In order to handle variance or > spikes I sized my old generation at 15-20GB. > > > > I sized my young generation at 32-42GB and used survivor ratios of 1, > 2, 3 and 6. > > > > My goal is to maximize throughput and minimize pauses. I'm willing to > sacrifice ram to increase speed. > > > > I have attached several of my many gc logs. The file gc_48G.txt is > just using CMS without any other tuning, and the results are much > worse than what I have been able to accomplish with other settings. > The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and > gc_57G_15Gold_42Gyoung_1sr.txt. > > > > The problem is that some of the pauses are just too long. > > > > Is there a way to reduce the pause time any more than I have it now? > > Am I heading in the right direction? I ask because the default > settings are so different than what I have been heading towards. > > > > The best reference I have found on what good gc logs look like come > from brief examples presented at JavaOne this year by Tony Printezis > and Charlie Hunt. But I don't seem to be able to get logs that > resemble their tenuring patterns. > > > > I think I have a lot of medium-lived objects instead of nice > short-lived ones. > > > > Are there any good practices for apps with objects like this? > > > > Thanks, > > Jeff > > > > > ------------------------------------------------------------------------ > This email and any files transmitted with it are confidential and > proprietary to Algorithmics Incorporated and its affiliates > ("Algorithmics"). If received in error, use is prohibited. Please > destroy, and notify sender. Sender does not waive confidentiality or > privilege. 
Internet communications cannot be guaranteed to be timely, > secure, error or virus-free. Algorithmics does not accept liability > for any errors or omissions. Any commitment intended to bind > Algorithmics must be reduced to writing and signed by an authorized > signatory. > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -- --------------------------------------------------------------------- | Tony Printezis, Staff Engineer | Sun Microsystems Inc. | | | MS UBUR02-311 | | e-mail: tony.printezis at sun.com | 35 Network Drive | | office: +1 781 442 0998 (x20998) | Burlington, MA 01803-2756, USA | --------------------------------------------------------------------- e-mail client: Thunderbird (Linux) From Paul.Hohensee at Sun.COM Fri Sep 11 10:22:33 2009 From: Paul.Hohensee at Sun.COM (Paul Hohensee) Date: Fri, 11 Sep 2009 13:22:33 -0400 Subject: Young generation configuration In-Reply-To: <4AAA6B29.3030008@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> Message-ID: <4AAA8759.8010904@sun.com> Another alternative mentioned in Tony and Charlie's J1 slides is the parallel collector. If, as Tony says, you can make the young gen large enough to avoid promotion, and you really do have a steady state old gen, then which old gen collector you use wouldn't matter much to pause times, given that young gen pause times seem to be your immediate problem. It may be that you just need more hardware threads to collect such a big young gen too. You might vary the number of gc threads to see how that affects collection times. If there are significant differences, then you need more hardware threads, i.e., a bigger machine. 
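[Editor's note: Paul's suggestion to vary the number of GC threads can be tried as a flag sweep along these lines. The heap sizes echo Jeff's 52G/32G configuration; the thread count, log-file name, and application jar are made up for illustration.]

```
# Rerun the same workload several times, changing only ParallelGCThreads,
# then compare the young-GC pause times across the resulting logs.
java -Xms52g -Xmx52g -Xmn32g -XX:+UseConcMarkSweepGC \
     -XX:ParallelGCThreads=4 \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintTenuringDistribution \
     -Xloggc:gc_4threads.log -jar reporting-server.jar
```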
You might also try using compressed pointers via -XX:+UseCompressedOops. That should cut down the total survivor size significantly, perhaps enough that your current hardware threads can collect significantly faster. Heap size will be limited to < 32gb, but your app will probably fit. A more efficient version of compressed pointers will be available in 6u18, btw. I notice that none of your logs shows more than age 7 stats even though the tenuring threshold is 15. It'd be nice to see if anything dies before then. Paul Tony Printezis wrote: > Jeff, > > Hi. I had a very brief look at your logs. Yes, your app does seem to > need to copy quite a lot (I don't think I've ever seen 1-2GB of data > being copied in age 1!!!). From what I've seen from the space sizes, > you're doing the right thing (i.e., you're consistent with what we > talked about during the talk): you have quite large young gen and a > reasonably sized old gen. But the sheer amount of surviving objects is > what's getting you. How much larger can you make your young gen? I think > in this case, the larger, the better. Maybe, you can also try > MaxTenuringThreshold=1. This goes against our general advice, but this > might decrease the amount of objects being copied during young GCs, at > the expense of more frequent CMS cycles... > > Tony > > jeff.lloyd at algorithmics.com wrote: > >> Hi, >> >> >> >> I'm new to this list and I have a few questions about tuning my young >> generation gc. >> >> >> >> I have chosen to use the CMS garbage collector because my application >> is a relatively large reporting server that has a web front end and >> therefore needs to have minimal pauses. >> >> >> >> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB ram. >> >> >> >> The machine is dedicated to this JVM. 
>> >> >> >> My steady-state was calculated as follows: >> >> - A typical number of users logged in and viewed several reports >> >> - Stopped user actions and performed a manual full GC >> >> - Look at the amount of heap used and take that number as the >> steady-state memory requirement >> >> >> >> In this case my heap usage was ~10GB. In order to handle variance or >> spikes I sized my old generation at 15-20GB. >> >> >> >> I sized my young generation at 32-42GB and used survivor ratios of 1, >> 2, 3 and 6. >> >> >> >> My goal is to maximize throughput and minimize pauses. I'm willing to >> sacrifice ram to increase speed. >> >> >> >> I have attached several of my many gc logs. The file gc_48G.txt is >> just using CMS without any other tuning, and the results are much >> worse than what I have been able to accomplish with other settings. >> The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and >> gc_57G_15Gold_42Gyoung_1sr.txt. >> >> >> >> The problem is that some of the pauses are just too long. >> >> >> >> Is there a way to reduce the pause time any more than I have it now? >> >> Am I heading in the right direction? I ask because the default >> settings are so different than what I have been heading towards. >> >> >> >> The best reference I have found on what good gc logs look like come >> from brief examples presented at JavaOne this year by Tony Printezis >> and Charlie Hunt. But I don't seem to be able to get logs that >> resemble their tenuring patterns. >> >> >> >> I think I have a lot of medium-lived objects instead of nice >> short-lived ones. >> >> >> >> Are there any good practices for apps with objects like this? >> >> >> >> Thanks, >> >> Jeff >> >> >> >> >> ------------------------------------------------------------------------ >> This email and any files transmitted with it are confidential and >> proprietary to Algorithmics Incorporated and its affiliates >> ("Algorithmics"). If received in error, use is prohibited. 
Please >> destroy, and notify sender. Sender does not waive confidentiality or >> privilege. Internet communications cannot be guaranteed to be timely, >> secure, error or virus-free. Algorithmics does not accept liability >> for any errors or omissions. Any commitment intended to bind >> Algorithmics must be reduced to writing and signed by an authorized >> signatory. >> ------------------------------------------------------------------------ >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > From Y.S.Ramakrishna at Sun.COM Fri Sep 11 11:13:54 2009 From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM) Date: Fri, 11 Sep 2009 11:13:54 -0700 Subject: Young generation configuration In-Reply-To: <4AAA8759.8010904@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> Message-ID: <4AAA9362.3080009@Sun.COM> Just some very general remarks ... >> jeff.lloyd at algorithmics.com wrote: ... >>> My goal is to maximize throughput and minimize pauses. I'm willing to >>> sacrifice ram to increase speed. Ah, but you may not be able to achieve a joint optimum there; on the contrary, maximal throughput is often achieved at maximal pause times. Lowering pause times to within budget currently often involves giving up some throughput. You need to define the maximum pause time you can stand and the minimum throughput you can tolerate, and solve that optimization problem. ... >>> The problem is that some of the pauses are just too long. Hmm, good, we are getting closer :-) How long is "too long"? ... >>> Is there a way to reduce the pause time any more than I have it now? yes, but you will likely give up on throughput. >>> >>> Am I heading in the right direction? 
I ask because the default >>> settings are so different than what I have been heading towards. Depending on your boundary conditions (constraints on your objective metrics, and if you can define a suitable utility or objective function) there may be multiple optimal configurations, or none at all, which will meet your constraints. >>> I think I have a lot of medium-lived objects instead of nice >>> short-lived ones. You also have some short-lived ones (maybe about 80%?), but yes you do have quite a few (~15%?) medium-lived ones. The total volume of such medium-lived objects is proportional to the transactional rate that your server is subject to, and also proportional to the longevity of those transactions (where I am using transactions loosely to mean how long it takes for the records associated with those transactions to flush their state). You mention that your application is a "reporting server". What is your estimate of the (expected/measured) lifetime of such a "reporting transaction"? Does it match the kinds of object lifetimes you are seeing here? -- ramki From jeff.lloyd at algorithmics.com Fri Sep 11 13:39:02 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Fri, 11 Sep 2009 16:39:02 -0400 Subject: Young generation configuration In-Reply-To: <4AAA6B29.3030008@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A813803@TORMAIL.algorithmics.com> Hi Tony, We do have a lot of data that we create/copy within the application. We hold big trees/graphs of data representing large portfolio structures in memory per user. Slicing and dicing the data creates similar strains. I'll try to increase the YG and play more with MTT to see if I can speed things up. The problem is that we have an interactive web interface so the pauses need to be relatively quick or the UI responsiveness suffers. 
If I set MTT to 1, then I am guessing I may need to boost my OG size because it will fill up faster. Would it make sense to increase the OG size and reduce the initiating occupancy fraction? Thanks! Jeff -----Original Message----- From: Antonios.Printezis at sun.com [mailto:Antonios.Printezis at sun.com] On Behalf Of Tony Printezis Sent: Friday, September 11, 2009 11:22 AM To: Jeff Lloyd Cc: hotspot-gc-use at openjdk.java.net Subject: Re: Young generation configuration Jeff, Hi. I had a very brief look at your logs. Yes, your app does seem to need to copy quite a lot (I don't think I've ever seen 1-2GB of data being copied in age 1!!!). From what I've seen from the space sizes, you're doing the right thing (i.e., you're consistent with what we talked about during the talk): you have quite large young gen and a reasonably sized old gen. But the sheer amount of surviving objects is what's getting you. How much larger can you make your young gen? I think in this case, the larger, the better. Maybe, you can also try MaxTenuringThreshold=1. This goes against our general advice, but this might decrease the amount of objects being copied during young GCs, at the expense of more frequent CMS cycles... Tony jeff.lloyd at algorithmics.com wrote: > > Hi, > > > > I'm new to this list and I have a few questions about tuning my young > generation gc. > > > > I have chosen to use the CMS garbage collector because my application > is a relatively large reporting server that has a web front end and > therefore needs to have minimal pauses. > > > > I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB ram. > > > > The machine is dedicated to this JVM. > > > > My steady-state was calculated as follows: > > - A typical number of users logged in and viewed several reports > > - Stopped user actions and performed a manual full GC > > - Look at the amount of heap used and take that number as the > steady-state memory requirement > > > > In this case my heap usage was ~10GB. 
In order to handle variance or > spikes I sized my old generation at 15-20GB. > > > > I sized my young generation at 32-42GB and used survivor ratios of 1, > 2, 3 and 6. > > > > My goal is to maximize throughput and minimize pauses. I'm willing to > sacrifice ram to increase speed. > > > > I have attached several of my many gc logs. The file gc_48G.txt is > just using CMS without any other tuning, and the results are much > worse than what I have been able to accomplish with other settings. > The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and > gc_57G_15Gold_42Gyoung_1sr.txt. > > > > The problem is that some of the pauses are just too long. > > > > Is there a way to reduce the pause time any more than I have it now? > > Am I heading in the right direction? I ask because the default > settings are so different than what I have been heading towards. > > > > The best reference I have found on what good gc logs look like come > from brief examples presented at JavaOne this year by Tony Printezis > and Charlie Hunt. But I don't seem to be able to get logs that > resemble their tenuring patterns. > > > > I think I have a lot of medium-lived objects instead of nice > short-lived ones. > > > > Are there any good practices for apps with objects like this? > > > > Thanks, > > Jeff > > > > > ------------------------------------------------------------------------ > This email and any files transmitted with it are confidential and > proprietary to Algorithmics Incorporated and its affiliates > ("Algorithmics"). If received in error, use is prohibited. Please > destroy, and notify sender. Sender does not waive confidentiality or > privilege. Internet communications cannot be guaranteed to be timely, > secure, error or virus-free. Algorithmics does not accept liability > for any errors or omissions. Any commitment intended to bind > Algorithmics must be reduced to writing and signed by an authorized > signatory. 
> ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -- --------------------------------------------------------------------- | Tony Printezis, Staff Engineer | Sun Microsystems Inc. | | | MS UBUR02-311 | | e-mail: tony.printezis at sun.com | 35 Network Drive | | office: +1 781 442 0998 (x20998) | Burlington, MA 01803-2756, USA | --------------------------------------------------------------------- e-mail client: Thunderbird (Linux) -------------------------------------------------------------------------- This email and any files transmitted with it are confidential and proprietary to Algorithmics Incorporated and its affiliates ("Algorithmics"). If received in error, use is prohibited. Please destroy, and notify sender. Sender does not waive confidentiality or privilege. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. Algorithmics does not accept liability for any errors or omissions. Any commitment intended to bind Algorithmics must be reduced to writing and signed by an authorized signatory. -------------------------------------------------------------------------- From jeff.lloyd at algorithmics.com Fri Sep 11 13:49:45 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Fri, 11 Sep 2009 16:49:45 -0400 Subject: Young generation configuration In-Reply-To: <4AAA8759.8010904@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> Thanks for your response Paul. I'll take another look at the parallel collector. 
That's a good point about the -XX:+UseCompressedOops. We started off with heaps bigger than 32G so I had left that option out. I'll put it back in and definitely try out 6u18 when it's available. What about the option -XX:+UseAdaptiveGCBoundary? I don't see it referenced very often. Would it be helpful in a case like mine? I'm not sure I understand your last paragraph. What is the period of time that you would be interested in seeing? Jeff -----Original Message----- From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM] Sent: Friday, September 11, 2009 1:23 PM To: Tony Printezis Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net Subject: Re: Young generation configuration Another alternative mentioned in Tony and Charlie's J1 slides is the parallel collector. If, as Tony says, you can make the young gen large enough to avoid promotion, and you really do have a steady state old gen, then which old gen collector you use wouldn't matter much to pause times, given that young gen pause times seem to be your immediate problem. It may be that you just need more hardware threads to collect such a big young gen too. You might vary the number of gc threads to see how that affects collection times. If there are significant differences, then you need more hardware threads, i.e., a bigger machine. You might also try using compressed pointers via -XX:+UseCompressedOops. That should cut down the total survivor size significantly, perhaps enough that your current hardware threads can collect significantly faster. Heap size will be limited to < 32gb, but your app will probably fit. A more efficient version of compressed pointers will be available in 6u18, btw. I notice that none of your logs shows more than age 7 stats even though the tenuring threshold is 15. It'd be nice to see if anything dies before then. Paul Tony Printezis wrote: > Jeff, > > Hi. I had a very brief look at your logs. 
Yes, your app does seem to > need to copy quite a lot (I don't think I've ever seen 1-2GB of data > being copied in age 1!!!). From what I've seen from the space sizes, > you're doing the right thing (i.e., you're consistent with what we > talked about during the talk): you have quite large young gen and a > reasonably sized old gen. But the sheer amount of surviving objects is > what's getting you. How much larger can you make your young gen? I think > in this case, the larger, the better. Maybe, you can also try > MaxTenuringThreshold=1. This goes against our general advice, but this > might decrease the amount of objects being copied during young GCs, at > the expense of more frequent CMS cycles... > > Tony > > jeff.lloyd at algorithmics.com wrote: > >> Hi, >> >> >> >> I'm new to this list and I have a few questions about tuning my young >> generation gc. >> >> >> >> I have chosen to use the CMS garbage collector because my application >> is a relatively large reporting server that has a web front end and >> therefore needs to have minimal pauses. >> >> >> >> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB ram. >> >> >> >> The machine is dedicated to this JVM. >> >> >> >> My steady-state was calculated as follows: >> >> - A typical number of users logged in and viewed several reports >> >> - Stopped user actions and performed a manual full GC >> >> - Look at the amount of heap used and take that number as the >> steady-state memory requirement >> >> >> >> In this case my heap usage was ~10GB. In order to handle variance or >> spikes I sized my old generation at 15-20GB. >> >> >> >> I sized my young generation at 32-42GB and used survivor ratios of 1, >> 2, 3 and 6. >> >> >> >> My goal is to maximize throughput and minimize pauses. I'm willing to >> sacrifice ram to increase speed. >> >> >> >> I have attached several of my many gc logs. 
The file gc_48G.txt is >> just using CMS without any other tuning, and the results are much >> worse than what I have been able to accomplish with other settings. >> The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and >> gc_57G_15Gold_42Gyoung_1sr.txt. >> >> >> >> The problem is that some of the pauses are just too long. >> >> >> >> Is there a way to reduce the pause time any more than I have it now? >> >> Am I heading in the right direction? I ask because the default >> settings are so different than what I have been heading towards. >> >> >> >> The best reference I have found on what good gc logs look like come >> from brief examples presented at JavaOne this year by Tony Printezis >> and Charlie Hunt. But I don't seem to be able to get logs that >> resemble their tenuring patterns. >> >> >> >> I think I have a lot of medium-lived objects instead of nice >> short-lived ones. >> >> >> >> Are there any good practices for apps with objects like this? >> >> >> >> Thanks, >> >> Jeff >> >> >> >> >> ------------------------------------------------------------------------ >> This email and any files transmitted with it are confidential and >> proprietary to Algorithmics Incorporated and its affiliates >> ("Algorithmics"). If received in error, use is prohibited. Please >> destroy, and notify sender. Sender does not waive confidentiality or >> privilege. Internet communications cannot be guaranteed to be timely, >> secure, error or virus-free. Algorithmics does not accept liability >> for any errors or omissions. Any commitment intended to bind >> Algorithmics must be reduced to writing and signed by an authorized >> signatory. 
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>

From tony.printezis at sun.com  Fri Sep 11 13:54:50 2009
From: tony.printezis at sun.com (Tony Printezis)
Date: Fri, 11 Sep 2009 16:54:50 -0400
Subject: Young generation configuration
In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0A813803@TORMAIL.algorithmics.com>
References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <0FCC438D62A5E643AA3F57D3417B220D0A813803@TORMAIL.algorithmics.com>
Message-ID: <4AAAB91A.9020302@sun.com>

jeff.lloyd at algorithmics.com wrote:
> Hi Tony,
>
> We do have a lot of data that we create/copy within the application. We
> hold big trees/graphs of data representing large portfolio structures in
> memory per user. Slicing and dicing the data creates similar strains.
>
> I'll try to increase the YG and play more with MTT to see if I can speed
> things up.
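Since much of this thread turns on how the young generation splits into eden and survivor spaces, a back-of-the-envelope sketch of the arithmetic may help (it assumes HotSpot's convention that -XX:SurvivorRatio=R makes eden R times the size of each survivor space, so each survivor space is young/(R+2); the sizes below are just the figures from this thread):

```python
# Rough sketch of HotSpot young-gen sizing: with -XX:SurvivorRatio=R,
# eden is R times the size of each survivor space, and the young gen
# holds eden plus two survivor spaces, so survivor = young / (R + 2).

def young_gen_layout(young_gb, survivor_ratio):
    survivor = young_gb / (survivor_ratio + 2)
    eden = young_gb - 2 * survivor
    return eden, survivor

# The survivor ratios tried in this thread, for a 32G young gen:
for r in (1, 2, 3, 6):
    eden, surv = young_gen_layout(32, r)
    print(f"SurvivorRatio={r}: eden ~{eden:.1f}G, each survivor ~{surv:.1f}G")
```

For example, SurvivorRatio=2 on a 32G young gen gives 8G per survivor space, which bounds how much can be copied per scavenge before objects are promoted prematurely.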
The problem is that we have an interactive web interface so > the pauses need to be relatively quick or the UI responsiveness suffers. > > If I set MTT to 1, then I am guessing I may need to boost my OG size > because it will fill up faster. Would it make sense to increase the OG > size and reduce the initiating occupancy fraction? > Definitely. Someone was paying attention during the talk. :-) But, concentrate first on whether the young GC times are good enough. Tony > -----Original Message----- > From: Antonios.Printezis at sun.com [mailto:Antonios.Printezis at sun.com] On > Behalf Of Tony Printezis > Sent: Friday, September 11, 2009 11:22 AM > To: Jeff Lloyd > Cc: hotspot-gc-use at openjdk.java.net > Subject: Re: Young generation configuration > > Jeff, > > Hi. I had a very brief look at your logs. Yes, your app does seem to > need to copy quite a lot (I don't think I've ever seen 1-2GB of data > being copied in age 1!!!). From what I've seen from the space sizes, > you're doing the right thing (i.e., you're consistent with what we > talked about during the talk): you have quite large young gen and a > reasonably sized old gen. But the sheer amount of surviving objects is > what's getting you. How much larger can you make your young gen? I think > > in this case, the larger, the better. Maybe, you can also try > MaxTenuringThreshold=1. This goes against our general advice, but this > might decrease the amount of objects being copied during young GCs, at > the expense of more frequent CMS cycles... > > Tony > > jeff.lloyd at algorithmics.com wrote: > >> Hi, >> >> >> >> I'm new to this list and I have a few questions about tuning my young >> generation gc. >> >> >> >> I have chosen to use the CMS garbage collector because my application >> is a relatively large reporting server that has a web front end and >> therefore needs to have minimal pauses. >> >> >> >> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB >> > ram. 
> >> >> >> The machine is dedicated to this JVM. >> >> >> >> My steady-state was calculated as follows: >> >> - A typical number of users logged in and viewed several >> > reports > >> - Stopped user actions and performed a manual full GC >> >> - Look at the amount of heap used and take that number as the >> > > >> steady-state memory requirement >> >> >> >> In this case my heap usage was ~10GB. In order to handle variance or >> spikes I sized my old generation at 15-20GB. >> >> >> >> I sized my young generation at 32-42GB and used survivor ratios of 1, >> 2, 3 and 6. >> >> >> >> My goal is to maximize throughput and minimize pauses. I'm willing to >> > > >> sacrifice ram to increase speed. >> >> >> >> I have attached several of my many gc logs. The file gc_48G.txt is >> just using CMS without any other tuning, and the results are much >> worse than what I have been able to accomplish with other settings. >> The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and >> gc_57G_15Gold_42Gyoung_1sr.txt. >> >> >> >> The problem is that some of the pauses are just too long. >> >> >> >> Is there a way to reduce the pause time any more than I have it now? >> >> Am I heading in the right direction? I ask because the default >> settings are so different than what I have been heading towards. >> >> >> >> The best reference I have found on what good gc logs look like come >> from brief examples presented at JavaOne this year by Tony Printezis >> and Charlie Hunt. But I don't seem to be able to get logs that >> resemble their tenuring patterns. >> >> >> >> I think I have a lot of medium-lived objects instead of nice >> short-lived ones. >> >> >> >> Are there any good practices for apps with objects like this? 
>> >> Thanks,
>> >> Jeff
>> >>
>> >> ------------------------------------------------------------------------
>>

--
Tony Printezis, Staff Engineer | Sun Microsystems Inc.
MS UBUR02-311, 35 Network Drive, Burlington, MA 01803-2756, USA
e-mail: tony.printezis at sun.com | office: +1 781 442 0998 (x20998)
e-mail client: Thunderbird (Linux)

From jeff.lloyd at algorithmics.com  Fri Sep 11 14:06:01 2009
From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com)
Date: Fri, 11 Sep 2009 17:06:01 -0400
Subject: Young generation configuration
In-Reply-To: <4AAA9362.3080009@Sun.COM>
References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <4AAA9362.3080009@Sun.COM>
Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A813842@TORMAIL.algorithmics.com>

Hi Ramki,

I did not know that lower pause times and higher throughput were
generally incompatible. Good to know - it makes sense too.

I'm trying to find out how long "too long" is. Bankers can be fickle.
:-) Honestly, I think "too long" constitutes a noticeable pause in GUI
interactions.

How did you measure the proportion of short-lived and medium-lived
objects?

We typically expect a "session" to be live for most of the day, with
multiple reports of seconds or minutes in duration executed within that
session. So yes, I am seeing my "steady state" continue for a long
time, with blips of activity throughout the day. We cache a lot of
results, which can lead to a general upward trend, but it doesn't seem
to be our current source of object volume.

Thanks for your help,
Jeff

-----Original Message-----
From: Y.S.Ramakrishna at Sun.COM [mailto:Y.S.Ramakrishna at Sun.COM]
Sent: Friday, September 11, 2009 2:14 PM
To: Jeff Lloyd
Cc: hotspot-gc-use at openjdk.java.net
Subject: Re: Young generation configuration

Just some very general remarks ...

>> jeff.lloyd at algorithmics.com wrote:
...
>>> My goal is to maximize throughput and minimize pauses. I'm willing to
>>> sacrifice ram to increase speed.
Ah, but you may not be able to achieve a joint optimum there; on the
contrary, maximal throughput is often achieved at maximal pause times.
Lowering pause times to within budget currently often involves giving up
some throughput. You need to define the maximum pause time you can stand
and the minimum throughput you can tolerate, and solve that optimization
problem.

...
>>> The problem is that some of the pauses are just too long.

Hmm, good, we are getting closer :-) How long is "too long"?

...
>>> Is there a way to reduce the pause time any more than I have it now?

Yes, but you will likely give up on throughput.

>>> Am I heading in the right direction? I ask because the default
>>> settings are so different than what I have been heading towards.

Depending on your boundary conditions (constraints on your objective
metrics, and whether you can define a suitable utility or objective
function) there may be multiple optimal configurations, or none at all,
which will meet your constraints.

>>> I think I have a lot of medium-lived objects instead of nice
>>> short-lived ones.

You also have some short-lived ones (maybe about 80%?), but yes, you do
have quite some (~15%?) medium-lived ones. The total volume of such
medium-lived objects is proportional to the transactional rate that your
server is subject to, and also proportional to the longevity of those
transactions (where I am using transactions loosely to mean how long it
takes for the records associated with those transactions to flush their
state). You mention that your application is a "reporting server". What
is your estimate of the (expected/measured) lifetime of such a
"reporting transaction"? Does it match the kinds of object lifetimes you
are seeing here?

-- ramki

--------------------------------------------------------------------------
This email and any files transmitted with it are confidential and
proprietary to Algorithmics Incorporated and its affiliates
("Algorithmics").
If received in error, use is prohibited. Please destroy, and notify sender. Sender does not waive confidentiality or privilege. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. Algorithmics does not accept liability for any errors or omissions. Any commitment intended to bind Algorithmics must be reduced to writing and signed by an authorized signatory. -------------------------------------------------------------------------- From Paul.Hohensee at Sun.COM Fri Sep 11 14:13:46 2009 From: Paul.Hohensee at Sun.COM (Paul Hohensee) Date: Fri, 11 Sep 2009 17:13:46 -0400 Subject: Young generation configuration In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> Message-ID: <4AAABD8A.7000900@sun.com> You can try out compressed pointers in 6u14. It just won't be quite as fast as the version that's going into 6u18. 6u14 with compressed pointers will still be quite a bit faster than without. One of the gc guys may correct me, but UseAdaptiveGCBoundary allows the vm to ergonomically move the boundary between old and young generations, effectively resizing them. I don't know if it's bit-rotted, and I seem to remember that there wasn't much benefit. But maybe we just didn't have a good use case. What I meant by the last paragraph was that with the tenuring threshold set at 15 (which is what the log says), and with only 7 young gcs in the log, we can't see at what age (or if) between 8 and 15 the survivor size goes down to something reasonable. If it doesn't, it might be worth it to us to revisit increasing the age limit for 64-bit. Paul jeff.lloyd at algorithmics.com wrote: > Thanks for your response Paul. > > I'll take another look at the parallel collector. 
> That's a good point about the -XX:+UseCompressedOops. We started off
> with heaps bigger than 32G so I had left that option out. I'll put it
> back in and definitely try out 6u18 when it's available.
>
> What about the option -XX:+UseAdaptiveGCBoundary? I don't see it
> referenced very often. Would it be helpful in a case like mine?
>
> I'm not sure I understand your last paragraph. What is the period of
> time that you would be interested in seeing?
>
> Jeff
>
> -----Original Message-----
> From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM]
> Sent: Friday, September 11, 2009 1:23 PM
> To: Tony Printezis
> Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net
> Subject: Re: Young generation configuration
>
> Another alternative mentioned in Tony and Charlie's J1 slides is the
> parallel collector. If, as Tony says, you can make the young gen large
> enough to avoid promotion, and you really do have a steady-state old
> gen, then which old gen collector you use wouldn't matter much to pause
> times, given that young gen pause times seem to be your immediate
> problem.
>
> It may be that you just need more hardware threads to collect such a
> big young gen too. You might vary the number of gc threads to see how
> that affects collection times. If there are significant differences,
> then you need more hardware threads, i.e., a bigger machine.
>
> You might also try using compressed pointers via -XX:+UseCompressedOops.
> That should cut down the total survivor size significantly, perhaps
> enough that your current hardware threads can collect significantly
> faster. Heap size will be limited to < 32gb, but your app will probably
> fit. A more efficient version of compressed pointers will be available
> in 6u18, btw.
>
> I notice that none of your logs shows more than age 7 stats even though
> the tenuring threshold is 15. It'd be nice to see if anything dies
> before then.
>
> Paul
>
> Tony Printezis wrote:
>
>> Jeff,
>>
>> Hi.
I had a very brief look at your logs. Yes, your app does seem to >> need to copy quite a lot (I don't think I've ever seen 1-2GB of data >> being copied in age 1!!!). From what I've seen from the space sizes, >> you're doing the right thing (i.e., you're consistent with what we >> talked about during the talk): you have quite large young gen and a >> reasonably sized old gen. But the sheer amount of surviving objects is >> > > >> what's getting you. How much larger can you make your young gen? I >> > think > >> in this case, the larger, the better. Maybe, you can also try >> MaxTenuringThreshold=1. This goes against our general advice, but this >> > > >> might decrease the amount of objects being copied during young GCs, at >> > > >> the expense of more frequent CMS cycles... >> >> Tony >> >> jeff.lloyd at algorithmics.com wrote: >> >> >>> Hi, >>> >>> >>> >>> I'm new to this list and I have a few questions about tuning my young >>> > > >>> generation gc. >>> >>> >>> >>> I have chosen to use the CMS garbage collector because my application >>> > > >>> is a relatively large reporting server that has a web front end and >>> therefore needs to have minimal pauses. >>> >>> >>> >>> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB >>> > ram. > >>> >>> >>> The machine is dedicated to this JVM. >>> >>> >>> >>> My steady-state was calculated as follows: >>> >>> - A typical number of users logged in and viewed several >>> > reports > >>> - Stopped user actions and performed a manual full GC >>> >>> - Look at the amount of heap used and take that number as >>> > the > >>> steady-state memory requirement >>> >>> >>> >>> In this case my heap usage was ~10GB. In order to handle variance or >>> > > >>> spikes I sized my old generation at 15-20GB. >>> >>> >>> >>> I sized my young generation at 32-42GB and used survivor ratios of 1, >>> > > >>> 2, 3 and 6. >>> >>> >>> >>> My goal is to maximize throughput and minimize pauses. 
I'm willing >>> > to > >>> sacrifice ram to increase speed. >>> >>> >>> >>> I have attached several of my many gc logs. The file gc_48G.txt is >>> just using CMS without any other tuning, and the results are much >>> worse than what I have been able to accomplish with other settings. >>> The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and >>> gc_57G_15Gold_42Gyoung_1sr.txt. >>> >>> >>> >>> The problem is that some of the pauses are just too long. >>> >>> >>> >>> Is there a way to reduce the pause time any more than I have it now? >>> >>> Am I heading in the right direction? I ask because the default >>> settings are so different than what I have been heading towards. >>> >>> >>> >>> The best reference I have found on what good gc logs look like come >>> from brief examples presented at JavaOne this year by Tony Printezis >>> and Charlie Hunt. But I don't seem to be able to get logs that >>> resemble their tenuring patterns. >>> >>> >>> >>> I think I have a lot of medium-lived objects instead of nice >>> short-lived ones. >>> >>> >>> >>> Are there any good practices for apps with objects like this? >>> >>> >>> >>> Thanks, >>> >>> Jeff >>> >>> >>> >>> >>> >>> > ------------------------------------------------------------------------ > >>> This email and any files transmitted with it are confidential and >>> proprietary to Algorithmics Incorporated and its affiliates >>> ("Algorithmics"). If received in error, use is prohibited. Please >>> destroy, and notify sender. Sender does not waive confidentiality or >>> privilege. Internet communications cannot be guaranteed to be timely, >>> > > >>> secure, error or virus-free. Algorithmics does not accept liability >>> for any errors or omissions. Any commitment intended to bind >>> Algorithmics must be reduced to writing and signed by an authorized >>> signatory. 
>>>
>>> ------------------------------------------------------------------------
>>>

From tony.printezis at sun.com  Fri Sep 11 14:17:55 2009
From: tony.printezis at sun.com (Tony Printezis)
Date: Fri, 11 Sep 2009 17:17:55 -0400
Subject: Young generation configuration
In-Reply-To: <4AAABD8A.7000900@sun.com>
References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> <4AAABD8A.7000900@sun.com>
Message-ID: <4AAABE83.90308@sun.com>

Paul Hohensee wrote:
> You can try out compressed pointers in 6u14. It just won't be quite as
> fast as the version that's going into 6u18. 6u14 with compressed
> pointers will still be quite a bit faster than without.
>
> One of the gc guys may correct me, but UseAdaptiveGCBoundary allows
> the vm to ergonomically move the boundary between old and young
> generations, effectively resizing them.
I don't know if it's bit-rotted, and I seem > to remember > that there wasn't much benefit. But maybe we just didn't have a good > use case. > Also, it's ParallelGC-only, IIRC. > What I meant by the last paragraph was that with the tenuring threshold > set at > 15 (which is what the log says), and with only 7 young gcs in the log, > we can't > see at what age (or if) between 8 and 15 the survivor size goes down to > something > reasonable. If it doesn't, it might be worth it to us to revisit > increasing the age > limit for 64-bit. > Paul, the problem in Jeff's case is that even at age 1 he copies 1GB or so. So, maybe, setting a small MTT and having more CMS cycles might be the right option for him. Tony > jeff.lloyd at algorithmics.com wrote: > >> Thanks for your response Paul. >> >> I'll take another look at the parallel collector. >> >> That's a good point about the -XX:+UseCompressedOops. We started off >> with heaps bigger than 32G so I had left that option out. I'll put it >> back in and definitely try out 6u18 when it's available. >> >> What about the option -XX:+UseAdaptiveGCBoundary? I don't see it >> referenced very often. Would it be helpful in a case like mine? >> >> I'm not sure I understand your last paragraph. What is the period of >> time that you would be interested in seeing? >> >> Jeff >> >> -----Original Message----- >> From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM] >> Sent: Friday, September 11, 2009 1:23 PM >> To: Tony Printezis >> Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net >> Subject: Re: Young generation configuration >> >> Another alternative mentioned in Tony and Charlie's J1 slides is the >> parallel >> collector. If, as Tony says, you can make the young gen large enough to >> >> avoid >> promotion, and you really do have a steady state old gen, then which old >> gen >> collector you use wouldn't matter much to pause times, given that young >> gen pause times seem to be your immediate problem. 
>> >> It may be that you just need more hardware threads to collect such a big >> >> young >> gen too. You might vary the number of gc threads to see how that >> affects >> collection times. If there's significant differences, then you need >> more >> hardware threads, i.e., a bigger machine. >> >> You might also try using compressed pointers via -XX:+UseCompressedOops. >> That should cut down the total survivor size significantly, perhaps >> enough >> to that your current hardware threads can collect significantly faster. >> >> Heap size >> will be limited to < 32gb, but you're app will probably fit. A more >> efficient >> version of compressed pointers will be available in 6u18, btw. >> >> I notice that none of your logs shows more than age 7 stats even though >> the >> tenuring threshold is 15. It'd be nice to see if anything dies before >> then. >> >> Paul >> >> Tony Printezis wrote: >> >> >>> Jeff, >>> >>> Hi. I had a very brief look at your logs. Yes, your app does seem to >>> need to copy quite a lot (I don't think I've ever seen 1-2GB of data >>> being copied in age 1!!!). From what I've seen from the space sizes, >>> you're doing the right thing (i.e., you're consistent with what we >>> talked about during the talk): you have quite large young gen and a >>> reasonably sized old gen. But the sheer amount of surviving objects is >>> >>> >> >> >>> what's getting you. How much larger can you make your young gen? I >>> >>> >> think >> >> >>> in this case, the larger, the better. Maybe, you can also try >>> MaxTenuringThreshold=1. This goes against our general advice, but this >>> >>> >> >> >>> might decrease the amount of objects being copied during young GCs, at >>> >>> >> >> >>> the expense of more frequent CMS cycles... >>> >>> Tony >>> >>> jeff.lloyd at algorithmics.com wrote: >>> >>> >>> >>>> Hi, >>>> >>>> >>>> >>>> I'm new to this list and I have a few questions about tuning my young >>>> >>>> >> >> >>>> generation gc. 
>>>> >>>> >>>> >>>> I have chosen to use the CMS garbage collector because my application >>>> >>>> >> >> >>>> is a relatively large reporting server that has a web front end and >>>> therefore needs to have minimal pauses. >>>> >>>> >>>> >>>> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB >>>> >>>> >> ram. >> >> >>>> >>>> >>>> The machine is dedicated to this JVM. >>>> >>>> >>>> >>>> My steady-state was calculated as follows: >>>> >>>> - A typical number of users logged in and viewed several >>>> >>>> >> reports >> >> >>>> - Stopped user actions and performed a manual full GC >>>> >>>> - Look at the amount of heap used and take that number as >>>> >>>> >> the >> >> >>>> steady-state memory requirement >>>> >>>> >>>> >>>> In this case my heap usage was ~10GB. In order to handle variance or >>>> >>>> >> >> >>>> spikes I sized my old generation at 15-20GB. >>>> >>>> >>>> >>>> I sized my young generation at 32-42GB and used survivor ratios of 1, >>>> >>>> >> >> >>>> 2, 3 and 6. >>>> >>>> >>>> >>>> My goal is to maximize throughput and minimize pauses. I'm willing >>>> >>>> >> to >> >> >>>> sacrifice ram to increase speed. >>>> >>>> >>>> >>>> I have attached several of my many gc logs. The file gc_48G.txt is >>>> just using CMS without any other tuning, and the results are much >>>> worse than what I have been able to accomplish with other settings. >>>> The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and >>>> gc_57G_15Gold_42Gyoung_1sr.txt. >>>> >>>> >>>> >>>> The problem is that some of the pauses are just too long. >>>> >>>> >>>> >>>> Is there a way to reduce the pause time any more than I have it now? >>>> >>>> Am I heading in the right direction? I ask because the default >>>> settings are so different than what I have been heading towards. >>>> >>>> >>>> >>>> The best reference I have found on what good gc logs look like come >>>> from brief examples presented at JavaOne this year by Tony Printezis >>>> and Charlie Hunt. 
But I don't seem to be able to get logs that >>>> resemble their tenuring patterns. >>>> >>>> >>>> >>>> I think I have a lot of medium-lived objects instead of nice >>>> short-lived ones. >>>> >>>> >>>> >>>> Are there any good practices for apps with objects like this? >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Jeff >>>> >>>> >>>> >>>> >>>> >>>> >>>> >> ------------------------------------------------------------------------ >> >> >>>> This email and any files transmitted with it are confidential and >>>> proprietary to Algorithmics Incorporated and its affiliates >>>> ("Algorithmics"). If received in error, use is prohibited. Please >>>> destroy, and notify sender. Sender does not waive confidentiality or >>>> privilege. Internet communications cannot be guaranteed to be timely, >>>> >>>> >> >> >>>> secure, error or virus-free. Algorithmics does not accept liability >>>> for any errors or omissions. Any commitment intended to bind >>>> Algorithmics must be reduced to writing and signed by an authorized >>>> signatory. >>>> >>>> >>>> >> ------------------------------------------------------------------------ >> >> ------------------------------------------------------------------------ >> >> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>>> >>> >>> >>> >> >> -------------------------------------------------------------------------- >> This email and any files transmitted with it are confidential and proprietary to Algorithmics Incorporated and its affiliates ("Algorithmics"). If received in error, use is prohibited. Please destroy, and notify sender. Sender does not waive confidentiality or privilege. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. Algorithmics does not accept liability for any errors or omissions. 
>> --------------------------------------------------------------------------
>>

--
Tony Printezis, Staff Engineer | Sun Microsystems Inc.
MS UBUR02-311, 35 Network Drive, Burlington, MA 01803-2756, USA
e-mail: tony.printezis at sun.com | office: +1 781 442 0998 (x20998)
e-mail client: Thunderbird (Linux)

From Y.S.Ramakrishna at Sun.COM  Fri Sep 11 15:19:46 2009
From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM)
Date: Fri, 11 Sep 2009 15:19:46 -0700
Subject: Young generation configuration
In-Reply-To: <0FCC438D62A5E643AA3F57D3417B220D0A813842@TORMAIL.algorithmics.com>
References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <4AAA9362.3080009@Sun.COM> <0FCC438D62A5E643AA3F57D3417B220D0A813842@TORMAIL.algorithmics.com>
Message-ID: <4AAACD02.4040109@Sun.COM>

Hi Jeff --

On 09/11/09 14:06, jeff.lloyd at algorithmics.com wrote:
> Hi Ramki,
>
> I did not know that lower pause times and higher throughput were
> generally incompatible. Good to know - it makes sense too.
>
> I'm trying to find out how long "too long" is. Bankers can be fickle.
> :-) Honestly, I think "too long" constitutes a noticeable pause in GUI
> interactions.

So, maybe around one 200 ms pause per second or so at the most? (If you
think that is not suitable, think up a suitable figure like that.)
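The arithmetic for turning a tolerable pause cadence like this into a GC overhead budget is simple; a minimal sketch (the 200 ms per second figure is just this thread's illustrative number, not a recommendation):

```python
# If the application can tolerate one pause of pause_ms every
# interval_ms of wall-clock time, the implied GC overhead budget
# (fraction of wall time spent paused) is pause_ms / interval_ms.

def gc_overhead_budget(pause_ms, interval_ms):
    return pause_ms / interval_ms

budget = gc_overhead_budget(200, 1000)  # one 200 ms pause per second
print(f"implied GC overhead budget: {budget:.0%}")
```

Comparing that fraction against the GC overhead actually observed in the logs tells you whether the current configuration is inside or outside the budget.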
That would give us the requisite pause time budget, and it implicitly
defines a GC overhead budget of 200/1000 = 20%. That is actually quite
high, but still lower than the overhead I saw in some of your logs from
a quick browse. As Tony pointed out, that's because of the excessive
copying you were doing of relatively long-lived data, which you may be
better off tenuring more quickly and letting the concurrent collector
deal with (modulo your and Tony's earlier remarks regarding the slightly
increased pressure on the concurrent collector -- probably unavoidable
if you are to meet your pause time goals; see below).

> How did you measure the proportion of short-lived and medium-lived
> objects?

Oh, I was playing somewhat fast and loose. I was taking the ratio of
(age 1 survivors):(Eden size) to get a rough read on short:(not short).
I sampled a single GC from one of your log files, but that would be the
way to figure this out (while averaging over a sufficiently large set of
samples). Of course, "long" and "short" are relative; age 1 just tells
you what survived that was allocated in the last GC epoch. If GCs happen
frequently, less data would die and more would qualify as "not short" by
that kind of loose definition (so my "long" and "short" were relative to
the given GC period).

> We typically expect a "session" to be live for most of the day, and

How much typical session data do you have? What is the rate at which
sessions get created? Does this happen perhaps mostly at the start of
the day? (In which case you would see lots of promotion activity at the
start of the day, but not so much later in the day.) Or is the session
creation rate uniform through the typical day?

> multiple reports of seconds or minutes in duration executed within that
> session. So yes, I am seeing my "steady state" continue for a long

Let's say 1 minute.
So during that 1 minute, how much data do you produce, and of that, how
much needs to be saved into the session in the form of the "result" from
that report? Looks like that result would constitute data that you want
to tenure sooner rather than later. Depending on how long the
intermediate results needed to generate the final result live (you
mentioned large trees of intermediate objects, I think, in an earlier
email), you may want to copy them in the survivor spaces, or -- if that
data is so large as to cost excessive copying time -- just promote that
too. Luckily, in typical cases, if data wants to be large, it also wants
to live long.

> time, with blips of activity throughout the day. We cache a lot of
> results, which can lead to a general upward trend, but it doesn't seem
> to be our current source of object volume.

The cached data will tenure. Best to tenure it soon, if the proportion
of cached data is large. (I am guessing that if you cache, you probably
find it saves computation later -- so it also saves allocation later;
thus I might naively expect that you will initially tenure lots of data
as your caches fill, and later in steady state tenure less as well as
perhaps allocate less.)

If I look at one random tenuring distribution sample out of your logs,
I see:

- age 1: 2151744736 bytes, 2151744736 total
- age 2:  897330448 bytes, 3049075184 total
- age 3: 1274314280 bytes, 4323389464 total
- age 4: 1351603024 bytes, 5674992488 total
- age 5: 1529394376 bytes, 7204386864 total
- age 6: 1219001160 bytes, 8423388024 total

which is very flat -- indicating that anything that survives a scavenge
appears to live on for quite a while (lots of assumptions about steady
loads and so on). Experimenting with an MTT of 1 or 2 might be useful,
cf. your previous emails with Tony et al.
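One way to check mechanically for this kind of flat tenuring distribution is to compare the survivor volume at the oldest printed age against age 1. A sketch (the 50% cutoff is an arbitrary illustrative threshold, and the single-snapshot comparison assumes a steady load, as noted above):

```python
import re

# One -XX:+PrintTenuringDistribution sample (the one quoted above).
sample = """
- age 1: 2151744736 bytes, 2151744736 total
- age 2:  897330448 bytes, 3049075184 total
- age 3: 1274314280 bytes, 4323389464 total
- age 4: 1351603024 bytes, 5674992488 total
- age 5: 1529394376 bytes, 7204386864 total
- age 6: 1219001160 bytes, 8423388024 total
"""

# Map age -> bytes surviving at that age.
ages = {int(age): int(size) for age, size in
        re.findall(r"- age\s+(\d+):\s+(\d+) bytes", sample)}

oldest = max(ages)
# Under steady load, a rough proxy for cohort survival: how much of the
# age-1 volume is still around at the oldest printed age.
retention = ages[oldest] / ages[1]
print(f"age-{oldest} volume is {retention:.0%} of age-1 volume")
if retention > 0.5:
    print("flat distribution: survivors keep surviving, so a low "
          "MaxTenuringThreshold may cut copying, at the cost of more "
          "frequent CMS cycles")
```

For the sample above this reports roughly 57%, consistent with the "very flat" reading of the distribution.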
(Yes, you will want to increase yr OG size, as you noted, but no, it will not fill up much faster, because the rate at which you promote will be nearly the same: most data that survives a single scavenge here tends to live -- above -- for at least 6 scavenges, after which it promotes anyway; you are just promoting that same data a bit sooner without wasting effort in copying it back and forth. It is true that some small amount of intermediate data will promote, but that's probably OK.) You will then want to play with the initiating occupancy fraction once you get an idea about the rate at which it's filling up versus the rate at which CMS is able to collect versus the effect on scavenges of letting the CMS gen fill up more before collecting versus the effect of doing more frequent or less frequent CMS cycles (and its effect on mutator throughput and available CPU and memory bandwidth). Yes, as Paul noted, definitely +UseCompressedOops to relieve heap pressure (reduce GC overhead) and speed up mutators by improving cache efficiency.

-- ramki

From Paul.Hohensee at Sun.COM Fri Sep 11 17:00:02 2009 From: Paul.Hohensee at Sun.COM (Paul Hohensee) Date: Fri, 11 Sep 2009 20:00:02 -0400 Subject: Young generation configuration In-Reply-To: <4AAABE83.90308@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> <4AAABD8A.7000900@sun.com> <4AAABE83.90308@sun.com> Message-ID: <4AAAE482.7070200@sun.com>

Could be, but that would lead to a lot of concurrent overhead, reducing his throughput. Such a balancing act. :)

Paul

Tony Printezis wrote:
> Paul Hohensee wrote:
>> You can try out compressed pointers in 6u14. It just won't be quite as
>> fast as the version that's going into 6u18. 6u14 with compressed pointers
>> will still be quite a bit faster than without.
>> >> One of the gc guys may correct me, but UseAdaptiveGCBoundary allows >> the vm to ergonomically move the boundary between old and young >> generations, >> effectively resizing them. I don't know if it's bit-rotted, and I >> seem to remember >> that there wasn't much benefit. But maybe we just didn't have a good >> use case. >> > Also, it's ParallelGC-only, IIRC. >> What I meant by the last paragraph was that with the tenuring >> threshold set at >> 15 (which is what the log says), and with only 7 young gcs in the >> log, we can't >> see at what age (or if) between 8 and 15 the survivor size goes down >> to something >> reasonable. If it doesn't, it might be worth it to us to revisit >> increasing the age >> limit for 64-bit. >> > Paul, the problem in Jeff's case is that even at age 1 he copies 1GB > or so. So, maybe, setting a small MTT and having more CMS cycles might > be the right option for him. > > Tony >> jeff.lloyd at algorithmics.com wrote: >> >>> Thanks for your response Paul. >>> >>> I'll take another look at the parallel collector. >>> That's a good point about the -XX:+UseCompressedOops. We started off >>> with heaps bigger than 32G so I had left that option out. I'll put it >>> back in and definitely try out 6u18 when it's available. >>> >>> What about the option -XX:+UseAdaptiveGCBoundary? I don't see it >>> referenced very often. Would it be helpful in a case like mine? >>> >>> I'm not sure I understand your last paragraph. What is the period of >>> time that you would be interested in seeing? >>> >>> Jeff >>> >>> -----Original Message----- >>> From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM] Sent: >>> Friday, September 11, 2009 1:23 PM >>> To: Tony Printezis >>> Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net >>> Subject: Re: Young generation configuration >>> >>> Another alternative mentioned in Tony and Charlie's J1 slides is the >>> parallel >>> collector. 
>>> If, as Tony says, you can make the young gen large enough to avoid
>>> promotion, and you really do have a steady state old gen, then which
>>> old gen collector you use wouldn't matter much to pause times, given
>>> that young gen pause times seem to be your immediate problem.
>>>
>>> It may be that you just need more hardware threads to collect such a
>>> big young gen too. You might vary the number of gc threads to see how
>>> that affects collection times. If there are significant differences,
>>> then you need more hardware threads, i.e., a bigger machine.
>>>
>>> You might also try using compressed pointers via -XX:+UseCompressedOops.
>>> That should cut down the total survivor size significantly, perhaps
>>> enough that your current hardware threads can collect significantly
>>> faster. Heap size will be limited to < 32gb, but your app will probably
>>> fit. A more efficient version of compressed pointers will be available
>>> in 6u18, btw.
>>>
>>> I notice that none of your logs shows more than age 7 stats even though
>>> the tenuring threshold is 15. It'd be nice to see if anything dies
>>> before then.
>>>
>>> Paul
>>>
>>> Tony Printezis wrote:
>>>> Jeff,
>>>>
>>>> Hi. I had a very brief look at your logs. Yes, your app does seem
>>>> to need to copy quite a lot (I don't think I've ever seen 1-2GB of
>>>> data being copied in age 1!!!). From what I've seen from the space
>>>> sizes, you're doing the right thing (i.e., you're consistent with
>>>> what we talked about during the talk): you have quite large young
>>>> gen and a reasonably sized old gen. But the sheer amount of
>>>> surviving objects is what's getting you. How much larger can you
>>>> make your young gen? I think in this case, the larger, the better.
>>>> Maybe, you can also try MaxTenuringThreshold=1.
This goes against our general advice, but this >>>> >>> >>>> might decrease the amount of objects being copied during young GCs, at >>>> >>> >>>> the expense of more frequent CMS cycles... >>>> >>>> Tony >>>> >>>> jeff.lloyd at algorithmics.com wrote: >>>> >>>>> Hi, >>>>> >>>>> >>>>> >>>>> I'm new to this list and I have a few questions about tuning my young >>>>> >>> >>>>> generation gc. >>>>> >>>>> >>>>> >>>>> I have chosen to use the CMS garbage collector because my application >>>>> >>> >>>>> is a relatively large reporting server that has a web front end >>>>> and therefore needs to have minimal pauses. >>>>> >>>>> >>>>> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB >>>>> >>> ram. >>> >>>>> >>>>> >>>>> The machine is dedicated to this JVM. >>>>> >>>>> >>>>> >>>>> My steady-state was calculated as follows: >>>>> >>>>> - A typical number of users logged in and viewed several >>>>> >>> reports >>> >>>>> - Stopped user actions and performed a manual full GC >>>>> >>>>> - Look at the amount of heap used and take that number as >>>>> >>> the >>>>> steady-state memory requirement >>>>> >>>>> >>>>> >>>>> In this case my heap usage was ~10GB. In order to handle variance or >>>>> >>> >>>>> spikes I sized my old generation at 15-20GB. >>>>> >>>>> >>>>> >>>>> I sized my young generation at 32-42GB and used survivor ratios of 1, >>>>> >>> >>>>> 2, 3 and 6. >>>>> >>>>> >>>>> >>>>> My goal is to maximize throughput and minimize pauses. I'm willing >>>>> >>> to >>>>> sacrifice ram to increase speed. >>>>> >>>>> >>>>> >>>>> I have attached several of my many gc logs. The file gc_48G.txt >>>>> is just using CMS without any other tuning, and the results are >>>>> much worse than what I have been able to accomplish with other >>>>> settings. The best results are in the files >>>>> gc_52G_20Gold_32Gyoung_2sr.txt and gc_57G_15Gold_42Gyoung_1sr.txt. >>>>> >>>>> >>>>> >>>>> The problem is that some of the pauses are just too long. 
>>>>> Is there a way to reduce the pause time any more than I have it now?
>>>>>
>>>>> Am I heading in the right direction? I ask because the default
>>>>> settings are so different than what I have been heading towards.
>>>>>
>>>>> The best reference I have found on what good gc logs look like comes
>>>>> from brief examples presented at JavaOne this year by Tony Printezis
>>>>> and Charlie Hunt. But I don't seem to be able to get logs that
>>>>> resemble their tenuring patterns.
>>>>>
>>>>> I think I have a lot of medium-lived objects instead of nice
>>>>> short-lived ones.
>>>>>
>>>>> Are there any good practices for apps with objects like this?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jeff
>>>>>
>>> ------------------------------------------------------------------------
>>>>> This email and any files transmitted with it are confidential and
>>>>> proprietary to Algorithmics Incorporated and its affiliates
>>>>> ("Algorithmics"). If received in error, use is prohibited. Please
>>>>> destroy, and notify sender. Sender does not waive confidentiality
>>>>> or privilege. Internet communications cannot be guaranteed to be
>>>>> timely, secure, error or virus-free. Algorithmics does not accept
>>>>> liability for any errors or omissions. Any commitment intended to
>>>>> bind Algorithmics must be reduced to writing and signed by an
>>>>> authorized signatory.
>>> ------------------------------------------------------------------------
>>>>> _______________________________________________
>>>>> hotspot-gc-use mailing list
>>>>> hotspot-gc-use at openjdk.java.net
>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From tcogan50 at gmail.com Fri Sep 11 19:54:33 2009 From: tcogan50 at gmail.com (tcogan50 at gmail.com) Date: Fri, 11 Sep 2009 22:54:33 -0400 Subject: hotspot-gc-use Digest, Vol 21, Issue 7 Message-ID: <4aab0d91.1402be0a.1873.5dfa@mx.google.com>

stop

From jeff.lloyd at algorithmics.com Mon Sep 14 13:12:15 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Mon, 14 Sep 2009 16:12:15 -0400 Subject: Young generation configuration In-Reply-To: <4AAABD8A.7000900@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> <4AAABD8A.7000900@sun.com> Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A813E38@TORMAIL.algorithmics.com>

Ah - I see what you mean about the last paragraph. I hadn't counted the number of gc's relative to the mtt yet.

For what it's worth, the pause time to collect that much YG garbage is too large for me, so I'll be decreasing the YG anyway.

Thanks again.

Jeff

-----Original Message-----
From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM]
Sent: Friday, September 11, 2009 5:14 PM
To: Jeff Lloyd
Cc: hotspot-gc-use at openjdk.java.net
Subject: Re: Young generation configuration

You can try out compressed pointers in 6u14. It just won't be quite as fast as the version that's going into 6u18. 6u14 with compressed pointers will still be quite a bit faster than without.

One of the gc guys may correct me, but UseAdaptiveGCBoundary allows the vm to ergonomically move the boundary between old and young generations, effectively resizing them. I don't know if it's bit-rotted, and I seem to remember that there wasn't much benefit. But maybe we just didn't have a good use case.

What I meant by the last paragraph was that with the tenuring threshold set at 15 (which is what the log says), and with only 7 young gcs in the log, we can't see at what age (or if) between 8 and 15 the survivor size goes down to something reasonable. If it doesn't, it might be worth it to us to revisit increasing the age limit for 64-bit.
Paul jeff.lloyd at algorithmics.com wrote: > Thanks for your response Paul. > > I'll take another look at the parallel collector. > > That's a good point about the -XX:+UseCompressedOops. We started off > with heaps bigger than 32G so I had left that option out. I'll put it > back in and definitely try out 6u18 when it's available. > > What about the option -XX:+UseAdaptiveGCBoundary? I don't see it > referenced very often. Would it be helpful in a case like mine? > > I'm not sure I understand your last paragraph. What is the period of > time that you would be interested in seeing? > > Jeff > > -----Original Message----- > From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM] > Sent: Friday, September 11, 2009 1:23 PM > To: Tony Printezis > Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net > Subject: Re: Young generation configuration > > Another alternative mentioned in Tony and Charlie's J1 slides is the > parallel > collector. If, as Tony says, you can make the young gen large enough to > > avoid > promotion, and you really do have a steady state old gen, then which old > gen > collector you use wouldn't matter much to pause times, given that young > gen pause times seem to be your immediate problem. > > It may be that you just need more hardware threads to collect such a big > > young > gen too. You might vary the number of gc threads to see how that > affects > collection times. If there's significant differences, then you need > more > hardware threads, i.e., a bigger machine. > > You might also try using compressed pointers via -XX:+UseCompressedOops. > That should cut down the total survivor size significantly, perhaps > enough > to that your current hardware threads can collect significantly faster. > > Heap size > will be limited to < 32gb, but you're app will probably fit. A more > efficient > version of compressed pointers will be available in 6u18, btw. 
> > I notice that none of your logs shows more than age 7 stats even though > the > tenuring threshold is 15. It'd be nice to see if anything dies before > then. > > Paul > > Tony Printezis wrote: > >> Jeff, >> >> Hi. I had a very brief look at your logs. Yes, your app does seem to >> need to copy quite a lot (I don't think I've ever seen 1-2GB of data >> being copied in age 1!!!). From what I've seen from the space sizes, >> you're doing the right thing (i.e., you're consistent with what we >> talked about during the talk): you have quite large young gen and a >> reasonably sized old gen. But the sheer amount of surviving objects is >> > > >> what's getting you. How much larger can you make your young gen? I >> > think > >> in this case, the larger, the better. Maybe, you can also try >> MaxTenuringThreshold=1. This goes against our general advice, but this >> > > >> might decrease the amount of objects being copied during young GCs, at >> > > >> the expense of more frequent CMS cycles... >> >> Tony >> >> jeff.lloyd at algorithmics.com wrote: >> >> >>> Hi, >>> >>> >>> >>> I'm new to this list and I have a few questions about tuning my young >>> > > >>> generation gc. >>> >>> >>> >>> I have chosen to use the CMS garbage collector because my application >>> > > >>> is a relatively large reporting server that has a web front end and >>> therefore needs to have minimal pauses. >>> >>> >>> >>> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB >>> > ram. > >>> >>> >>> The machine is dedicated to this JVM. >>> >>> >>> >>> My steady-state was calculated as follows: >>> >>> - A typical number of users logged in and viewed several >>> > reports > >>> - Stopped user actions and performed a manual full GC >>> >>> - Look at the amount of heap used and take that number as >>> > the > >>> steady-state memory requirement >>> >>> >>> >>> In this case my heap usage was ~10GB. In order to handle variance or >>> > > >>> spikes I sized my old generation at 15-20GB. 
>>> >>> >>> >>> I sized my young generation at 32-42GB and used survivor ratios of 1, >>> > > >>> 2, 3 and 6. >>> >>> >>> >>> My goal is to maximize throughput and minimize pauses. I'm willing >>> > to > >>> sacrifice ram to increase speed. >>> >>> >>> >>> I have attached several of my many gc logs. The file gc_48G.txt is >>> just using CMS without any other tuning, and the results are much >>> worse than what I have been able to accomplish with other settings. >>> The best results are in the files gc_52G_20Gold_32Gyoung_2sr.txt and >>> gc_57G_15Gold_42Gyoung_1sr.txt. >>> >>> >>> >>> The problem is that some of the pauses are just too long. >>> >>> >>> >>> Is there a way to reduce the pause time any more than I have it now? >>> >>> Am I heading in the right direction? I ask because the default >>> settings are so different than what I have been heading towards. >>> >>> >>> >>> The best reference I have found on what good gc logs look like come >>> from brief examples presented at JavaOne this year by Tony Printezis >>> and Charlie Hunt. But I don't seem to be able to get logs that >>> resemble their tenuring patterns. >>> >>> >>> >>> I think I have a lot of medium-lived objects instead of nice >>> short-lived ones. >>> >>> >>> >>> Are there any good practices for apps with objects like this? >>> >>> >>> >>> Thanks, >>> >>> Jeff >>> >>> >>> >>> >>> >>> > ------------------------------------------------------------------------ > >>> This email and any files transmitted with it are confidential and >>> proprietary to Algorithmics Incorporated and its affiliates >>> ("Algorithmics"). If received in error, use is prohibited. Please >>> destroy, and notify sender. Sender does not waive confidentiality or >>> privilege. Internet communications cannot be guaranteed to be timely, >>> > > >>> secure, error or virus-free. Algorithmics does not accept liability >>> for any errors or omissions. 
Any commitment intended to bind >>> Algorithmics must be reduced to writing and signed by an authorized >>> signatory. >>> >>> > ------------------------------------------------------------------------ > > ------------------------------------------------------------------------ > >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> >> > > > ------------------------------------------------------------------------ -- > This email and any files transmitted with it are confidential and proprietary to Algorithmics Incorporated and its affiliates ("Algorithmics"). If received in error, use is prohibited. Please destroy, and notify sender. Sender does not waive confidentiality or privilege. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. Algorithmics does not accept liability for any errors or omissions. Any commitment intended to bind Algorithmics must be reduced to writing and signed by an authorized signatory. > ------------------------------------------------------------------------ -- > -------------------------------------------------------------------------- This email and any files transmitted with it are confidential and proprietary to Algorithmics Incorporated and its affiliates ("Algorithmics"). If received in error, use is prohibited. Please destroy, and notify sender. Sender does not waive confidentiality or privilege. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. Algorithmics does not accept liability for any errors or omissions. Any commitment intended to bind Algorithmics must be reduced to writing and signed by an authorized signatory. 
-------------------------------------------------------------------------- From jeff.lloyd at algorithmics.com Mon Sep 14 14:12:04 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Mon, 14 Sep 2009 17:12:04 -0400 Subject: Young generation configuration In-Reply-To: <4AAAE482.7070200@sun.com> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <0FCC438D62A5E643AA3F57D3417B220D0A813819@TORMAIL.algorithmics.com> <4AAABD8A.7000900@sun.com> <4AAABE83.90308@sun.com> <4AAAE482.7070200@sun.com> Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A813ED4@TORMAIL.algorithmics.com> Is there somewhere I can download a balancing pole? :-) Thanks, you guys have been great help. Jeff -----Original Message----- From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM] Sent: Friday, September 11, 2009 8:00 PM To: Tony Printezis Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net Subject: Re: Young generation configuration Could be, but that would lead to a lot of concurrent overhead, reducing his throughput. Such a balancing act. :) Paul Tony Printezis wrote: > > > Paul Hohensee wrote: >> You can try out compressed pointers in 6u14. It just won't be quite as >> fast as the version that's going into 6u18. 6u14 with compressed >> pointers >> will still be quite a bit faster than without. >> >> One of the gc guys may correct me, but UseAdaptiveGCBoundary allows >> the vm to ergonomically move the boundary between old and young >> generations, >> effectively resizing them. I don't know if it's bit-rotted, and I >> seem to remember >> that there wasn't much benefit. But maybe we just didn't have a good >> use case. >> > Also, it's ParallelGC-only, IIRC. 
>> What I meant by the last paragraph was that with the tenuring >> threshold set at >> 15 (which is what the log says), and with only 7 young gcs in the >> log, we can't >> see at what age (or if) between 8 and 15 the survivor size goes down >> to something >> reasonable. If it doesn't, it might be worth it to us to revisit >> increasing the age >> limit for 64-bit. >> > Paul, the problem in Jeff's case is that even at age 1 he copies 1GB > or so. So, maybe, setting a small MTT and having more CMS cycles might > be the right option for him. > > Tony >> jeff.lloyd at algorithmics.com wrote: >> >>> Thanks for your response Paul. >>> >>> I'll take another look at the parallel collector. >>> That's a good point about the -XX:+UseCompressedOops. We started off >>> with heaps bigger than 32G so I had left that option out. I'll put it >>> back in and definitely try out 6u18 when it's available. >>> >>> What about the option -XX:+UseAdaptiveGCBoundary? I don't see it >>> referenced very often. Would it be helpful in a case like mine? >>> >>> I'm not sure I understand your last paragraph. What is the period of >>> time that you would be interested in seeing? >>> >>> Jeff >>> >>> -----Original Message----- >>> From: Paul.Hohensee at Sun.COM [mailto:Paul.Hohensee at Sun.COM] Sent: >>> Friday, September 11, 2009 1:23 PM >>> To: Tony Printezis >>> Cc: Jeff Lloyd; hotspot-gc-use at openjdk.java.net >>> Subject: Re: Young generation configuration >>> >>> Another alternative mentioned in Tony and Charlie's J1 slides is the >>> parallel >>> collector. If, as Tony says, you can make the young gen large >>> enough to >>> >>> avoid >>> promotion, and you really do have a steady state old gen, then which >>> old >>> gen >>> collector you use wouldn't matter much to pause times, given that young >>> gen pause times seem to be your immediate problem. >>> >>> It may be that you just need more hardware threads to collect such a >>> big >>> >>> young >>> gen too. 
You might vary the number of gc threads to see how that >>> affects >>> collection times. If there's significant differences, then you need >>> more >>> hardware threads, i.e., a bigger machine. >>> >>> You might also try using compressed pointers via >>> -XX:+UseCompressedOops. >>> That should cut down the total survivor size significantly, perhaps >>> enough >>> to that your current hardware threads can collect significantly faster. >>> >>> Heap size >>> will be limited to < 32gb, but you're app will probably fit. A more >>> efficient >>> version of compressed pointers will be available in 6u18, btw. >>> >>> I notice that none of your logs shows more than age 7 stats even though >>> the >>> tenuring threshold is 15. It'd be nice to see if anything dies before >>> then. >>> >>> Paul >>> >>> Tony Printezis wrote: >>> >>>> Jeff, >>>> >>>> Hi. I had a very brief look at your logs. Yes, your app does seem >>>> to need to copy quite a lot (I don't think I've ever seen 1-2GB of >>>> data being copied in age 1!!!). From what I've seen from the space >>>> sizes, you're doing the right thing (i.e., you're consistent with >>>> what we talked about during the talk): you have quite large young >>>> gen and a reasonably sized old gen. But the sheer amount of >>>> surviving objects is >>>> >>> >>>> what's getting you. How much larger can you make your young gen? I >>>> >>> think >>>> in this case, the larger, the better. Maybe, you can also try >>>> MaxTenuringThreshold=1. This goes against our general advice, but this >>>> >>> >>>> might decrease the amount of objects being copied during young GCs, at >>>> >>> >>>> the expense of more frequent CMS cycles... >>>> >>>> Tony >>>> >>>> jeff.lloyd at algorithmics.com wrote: >>>> >>>>> Hi, >>>>> >>>>> >>>>> >>>>> I'm new to this list and I have a few questions about tuning my young >>>>> >>> >>>>> generation gc. 
>>>>> >>>>> >>>>> >>>>> I have chosen to use the CMS garbage collector because my application >>>>> >>> >>>>> is a relatively large reporting server that has a web front end >>>>> and therefore needs to have minimal pauses. >>>>> >>>>> >>>>> I am using java 1.6.0_16 64-bit on redhat 5.2 intel 8x3GHz and 64GB >>>>> >>> ram. >>> >>>>> >>>>> >>>>> The machine is dedicated to this JVM. >>>>> >>>>> >>>>> >>>>> My steady-state was calculated as follows: >>>>> >>>>> - A typical number of users logged in and viewed several >>>>> >>> reports >>> >>>>> - Stopped user actions and performed a manual full GC >>>>> >>>>> - Look at the amount of heap used and take that number as >>>>> >>> the >>>>> steady-state memory requirement >>>>> >>>>> >>>>> >>>>> In this case my heap usage was ~10GB. In order to handle variance or >>>>> >>> >>>>> spikes I sized my old generation at 15-20GB. >>>>> >>>>> >>>>> >>>>> I sized my young generation at 32-42GB and used survivor ratios of 1, >>>>> >>> >>>>> 2, 3 and 6. >>>>> >>>>> >>>>> >>>>> My goal is to maximize throughput and minimize pauses. I'm willing >>>>> >>> to >>>>> sacrifice ram to increase speed. >>>>> >>>>> >>>>> >>>>> I have attached several of my many gc logs. The file gc_48G.txt >>>>> is just using CMS without any other tuning, and the results are >>>>> much worse than what I have been able to accomplish with other >>>>> settings. The best results are in the files >>>>> gc_52G_20Gold_32Gyoung_2sr.txt and gc_57G_15Gold_42Gyoung_1sr.txt. >>>>> >>>>> >>>>> >>>>> The problem is that some of the pauses are just too long. >>>>> >>>>> >>>>> >>>>> Is there a way to reduce the pause time any more than I have it now? >>>>> >>>>> Am I heading in the right direction? I ask because the default >>>>> settings are so different than what I have been heading towards. 
>>>>> >>>>> >>>>> >>>>> The best reference I have found on what good gc logs look like >>>>> come from brief examples presented at JavaOne this year by Tony >>>>> Printezis and Charlie Hunt. But I don't seem to be able to get >>>>> logs that resemble their tenuring patterns. >>>>> >>>>> >>>>> >>>>> I think I have a lot of medium-lived objects instead of nice >>>>> short-lived ones. >>>>> >>>>> >>>>> >>>>> Are there any good practices for apps with objects like this? >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Jeff >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>> ------------------------------------------------------------------------ >>> >>> >>>>> This email and any files transmitted with it are confidential and >>>>> proprietary to Algorithmics Incorporated and its affiliates >>>>> ("Algorithmics"). If received in error, use is prohibited. Please >>>>> destroy, and notify sender. Sender does not waive confidentiality >>>>> or privilege. Internet communications cannot be guaranteed to be >>>>> timely, >>>>> >>> >>>>> secure, error or virus-free. Algorithmics does not accept >>>>> liability for any errors or omissions. Any commitment intended to >>>>> bind Algorithmics must be reduced to writing and signed by an >>>>> authorized signatory. >>>>> >>>>> >>> ------------------------------------------------------------------------ >>> >>> >>> ------------------------------------------------------------------------ >>> >>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>> >>> >>> ------------------------------------------------------------------------ -- >>> >>> This email and any files transmitted with it are confidential and >>> proprietary to Algorithmics Incorporated and its affiliates >>> ("Algorithmics"). If received in error, use is prohibited. Please >>> destroy, and notify sender. 
Sender does not waive confidentiality or >>> privilege. Internet communications cannot be guaranteed to be >>> timely, secure, error or virus-free. Algorithmics does not accept >>> liability for any errors or omissions. Any commitment intended to >>> bind Algorithmics must be reduced to writing and signed by an >>> authorized signatory. >>> ------------------------------------------------------------------------ -- >>> >>> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > From jeff.lloyd at algorithmics.com Mon Sep 14 14:52:41 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Mon, 14 Sep 2009 17:52:41 -0400 Subject: Young generation configuration In-Reply-To: <4AAACD02.4040109@Sun.COM> References: <0FCC438D62A5E643AA3F57D3417B220D0A7AE2ED@TORMAIL.algorithmics.com> <4AAA6B29.3030008@sun.com> <4AAA8759.8010904@sun.com> <4AAA9362.3080009@Sun.COM> <0FCC438D62A5E643AA3F57D3417B220D0A813842@TORMAIL.algorithmics.com> <4AAACD02.4040109@Sun.COM> Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A813EFA@TORMAIL.algorithmics.com> Thanks for all the information Ramki. 
I had to lower my YG to 1G in order to reduce my typical YG GC to under one second, and under .5 sec for many gc's. I'm now playing with initiating occupancy fraction settings to avoid the CMS failures I'm getting. But it's looking so much better. Our app login and logout produces truck loads of garbage, so figuring out the initiating occupancy fraction settings is a bit tricky. Everything is definitely much clearer now. Thanks! Jeff -----Original Message----- From: Y.S.Ramakrishna at Sun.COM [mailto:Y.S.Ramakrishna at Sun.COM] Sent: Friday, September 11, 2009 6:20 PM To: Jeff Lloyd Cc: hotspot-gc-use at openjdk.java.net Subject: Re: Young generation configuration Hi Jeff -- On 09/11/09 14:06, jeff.lloyd at algorithmics.com wrote: > Hi Ramki, > > I did not know that lower pause times and higher throughput were > generally incompatible. Good to know - it makes sense too. > > I'm trying to find out how long "too long" is. Bankers can be fickle. > :-) Honestly, I think "too long" constitutes a noticeable pause in GUI > interactions. So, maybe around one 200 ms pause per second or so at the most? (If you think that is not suitable, think up a suitable figure like that.) That would give us the requisite pause time budget and implicitly define a GC overhead budget of 200/1000 = 20% (which is actually quite high, but still lower than the overhead I saw in some of your logs from a quick browse). As Tony pointed out, that's because of the excessive copying you were doing of relatively long-lived data that you may be better off tenuring more quickly and letting the concurrent collector deal with it (modulo yr & Tony's earlier remarks re the slightly increased pressure (see below) -- probably unavoidable if you are to meet yr pause time goals -- on the concurrent collector). > > How did you measure the proportion of short-lived and medium-lived > objects? Oh, I was playing somewhat fast and loose. 
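The pause-budget arithmetic in that reply (one 200 ms pause per 1000 ms period = a 20% GC overhead budget) can be sketched as a tiny helper; the function name is illustrative, not from any JVM tool:

```python
def gc_overhead(pause_ms, period_ms):
    """Fraction of wall-clock time spent in GC pauses.

    One 200 ms pause per 1000 ms period implies a 20% overhead budget.
    """
    return pause_ms / period_ms

print(gc_overhead(200, 1000))  # 0.2
```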
I was taking the ratio of (age 1 survivors): (Eden size) to get a rough read on the short:(not short). I sampled a single GC from one of yr log files, but that would be the way to figure this out (while averaging over a sufficiently large set of samples). Of course, "long" and "short" are relative, and age 1 just tells you what survived that was allocated in the last GC epoch. If GCs happen frequently, less data would die and more would qualify as "not short" by that kind of loose definition (so my "long" and "short" were relative to the given GC period). > > We typically expect a "session" be live for most of the day, and How much typical session data do you have? What is the rate at which sessions get created? Does this happen perhaps mostly at the start of the day? (In which case you would see lots of promotion activity at the start of the day, but not so much later in the day?) Or is the session creation rate uniform through the typical day? > multiple reports of seconds or minutes in duration executed within that > session. So yes, I am seeing my "steady state" continue for a long Let's say 1 minute. So during that 1 minute, how much data do you produce and of that how much needs to be saved into the session in the form of the "result" from that report? Looks like that result would constitute data that you want to tenure sooner rather than later. Depending on how long the intermediate results needed to generate the final result are needed (you mentioned large trees of intermediate objects, I think, in an earlier email), you may want to copy them in the survivor spaces, or -- if that data is so large as to cost excessive copying time -- just promote that too. Luckily, in typical cases, if data wants to be large, it also wants to live long. > time, with blips of activity throughout the day. We cache a lot of > results, which can lead to a general upward trend, but it doesn't seem > to be our current source of object volume. The cached data will tenure. 
Best to tenure it soon, if the proportion of cached data is large. (I am guessing that if you cache, you probably find it saves computation later -- so it also saves allocation later; thus I might naively expect that you will initially tenure lots of data as your caches fill, and later in steady state tenure less as well as perhaps allocate less.) If I look at one random tenuring distribution sample out of yr logs, I see:
- age 1: 2151744736 bytes, 2151744736 total
- age 2:  897330448 bytes, 3049075184 total
- age 3: 1274314280 bytes, 4323389464 total
- age 4: 1351603024 bytes, 5674992488 total
- age 5: 1529394376 bytes, 7204386864 total
- age 6: 1219001160 bytes, 8423388024 total
which is very flat -- indicating that anything that survives a scavenge appears to live on for quite a while (lots of assumptions about steady loads and so on). Experimenting with an MTT of 1 or 2 might be useful, cf yr previous emails with Tony et al. (Yes you will want to increase yr OG size, as you noted, but no it will not fill up much faster because the rate at which you promote will be nearly the same, because most data that survives a single scavenge here tends to live -- above -- for at least 6 scavenges after which it promotes anyway; you are just promoting that same data a bit sooner without wasting effort in copying it back and forth. It is true that some small amount of intermediate data will promote but that's probably OK). You will then want to play with initiating occupancy fraction once you get an idea about the rate at which it's filling up versus the rate at which CMS is able to collect versus the effect on scavenges of letting the CMS gen fill up more before collecting versus the effect of doing more frequent or less frequent CMS cycles (and its effect on mutator throughput and available CPU and memory bandwidth). Yes, as Paul noted, definitely +UseCompressedOops to relieve heap pressure (reduce GC overhead) and speed up mutators by improving cache efficiency. 
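Pulling such a per-age breakdown out of a -XX:+PrintTenuringDistribution log can be done mechanically. A sketch, using the sample lines quoted above; the "flatness" heuristic at the end is ad hoc, not anything the JVM computes:

```python
import re

# Sample lines in the format produced by -XX:+PrintTenuringDistribution,
# taken from the log excerpt quoted above.
SAMPLE_LOG = """\
- age   1: 2151744736 bytes, 2151744736 total
- age   2:  897330448 bytes, 3049075184 total
- age   3: 1274314280 bytes, 4323389464 total
- age   4: 1351603024 bytes, 5674992488 total
- age   5: 1529394376 bytes, 7204386864 total
- age   6: 1219001160 bytes, 8423388024 total
"""

AGE_LINE = re.compile(r"- age\s+(\d+):\s+(\d+) bytes")

def per_age_bytes(log_text):
    """Return [(age, bytes surviving at that age), ...]."""
    return [(int(a), int(b)) for a, b in AGE_LINE.findall(log_text)]

ages = per_age_bytes(SAMPLE_LOG)
# "Flat" here: the oldest age still holds more than half the bytes of
# age 1, i.e. whatever survives one scavenge tends to keep surviving --
# exactly the case where a low MaxTenuringThreshold can cut copying costs.
is_flat = ages[-1][1] > ages[0][1] // 2
```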
-- ramki From jeff.lloyd at algorithmics.com Thu Sep 17 07:52:48 2009 From: jeff.lloyd at algorithmics.com (jeff.lloyd at algorithmics.com) Date: Thu, 17 Sep 2009 10:52:48 -0400 Subject: GC working well now - thanks! Message-ID: <0FCC438D62A5E643AA3F57D3417B220D0A888FA3@TORMAIL.algorithmics.com> Hi, I just wanted to say thank you very much to everyone who gave me some time on this list. You've been very helpful, and I believe my problem is solved. For anyone who is interested, I took the old-school approach to using the CMS collector: The only way to reduce the gui pauses was to make the YG relatively small - in our case 1G. That kept the ParNew pauses under 1 second most of the time, and the GUI felt responsive. However I started getting CMS failures so I radically changed my OG size. Since my steady-state size is 10G, I decided to give myself a 50% buffer and leave 5G for quick tenuring of temporary objects that survived the ParNew YG GC. Then since my machine has lots of physical ram I set the initiating occupancy fraction to 50%, and the total OG size at 30G. That's probably higher than it needs to be, but at 20G I was still getting CMS failures followed by a full GC. 
Below is the full set of GC parameters I used: -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.txt -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC -Xmn1g -XX:CMSInitiatingOccupancyFraction=50 -XX:+DoEscapeAnalysis -XX:+UseCompressedOops I'm attaching the log file for anyone who may be curious to see what it looks like. When I view it in Visual GC the YG is very active and the OG has long rolling hills with room to spare at the top of the hills. Thanks again. Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20090917/cc54130a/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: gc3_31_30old_1young_50iof.zip Type: application/x-zip-compressed Size: 22253 bytes Desc: gc3_31_30old_1young_50iof.zip Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20090917/cc54130a/attachment-0001.bin From Sujit.Das at cognizant.com Tue Sep 22 12:46:51 2009 From: Sujit.Das at cognizant.com (Sujit.Das at cognizant.com) Date: Wed, 23 Sep 2009 01:16:51 +0530 Subject: Question on ParallelGCThreads Message-ID: <19B27FD5AF2EAA49A66F787911CF519596D051@CTSINCHNSXUU.cts.com> Hi All, We use CMS collector for old generation collection (option -XX:+UseConcMarkSweepGC) and parallel copying collector for young generation (option -XX:+ UseParNewGC). We use ParallelGCThreads command line option (-XX:ParallelGCThreads=) to control number of garbage collector threads. My question is: 1. Is ParallelGCThreads option applicable for only minor GC or is it applicable for old generation GC also? 2. Since CMS collector is a non-compacting collector and if application faces memory fragmentation issue then reducing # of ParallelGCThreads is an option to reduce fragmentation. Please confirm the understanding. This is based on my reading that each garbage collection thread reserves a part of the old generation for promotions and the division of the available space into these "promotion buffers" can cause a fragmentation effect. Reducing the number of garbage collector threads will reduce this fragmentation effect as will increasing the size of the old generation. Thanks, Sujit This e-mail and any files transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. Any unauthorized review, use, disclosure, dissemination, forwarding, printing or copying of this email or any action taken in reliance on this e-mail is strictly prohibited and may be unlawful. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20090923/22947409/attachment.html From Jon.Masamitsu at Sun.COM Tue Sep 22 13:56:12 2009 From: Jon.Masamitsu at Sun.COM (Jon Masamitsu) Date: Tue, 22 Sep 2009 13:56:12 -0700 Subject: Question on ParallelGCThreads In-Reply-To: <19B27FD5AF2EAA49A66F787911CF519596D051@CTSINCHNSXUU.cts.com> References: <19B27FD5AF2EAA49A66F787911CF519596D051@CTSINCHNSXUU.cts.com> Message-ID: <4AB939EC.2070206@Sun.COM> On 09/22/09 12:46, Sujit.Das at cognizant.com wrote: > Hi All, > > We use CMS collector for old generation collection (option > -XX:+UseConcMarkSweepGC) and parallel copying collector for young > generation (option -XX:+ UseParNewGC). We use ParallelGCThreads command > line option (-XX:ParallelGCThreads=) to control number > of garbage collector threads. > > My question is: > 1. Is ParallelGCThreads option applicable for only minor GC or is it > applicable for old generation GC also? Parts of the old generation collection that stop-the-world and do work with multiple GC threads also use ParallelGCThreads. This would be the initial-mark and remark phases assuming you're using a recent release (parallel initial-mark and parallel remark were not in the first release of CMS). Additionally, the concurrent marking that uses multiple GC threads (introduced in jdk6) may be affected by ParallelGCThreads. The number of GC threads used in the concurrent marking is a fraction of ParallelGCThreads. > > 2. Since CMS collector is a non-compacting collector and if application > faces memory fragmentation issue then reducing # of ParallelGCThreads is > an option to reduce fragmentation. Please confirm the understanding. > This is based on my reading that each garbage collection thread reserves > a part of the old generation for promotions and the division of the > available space into these "promotion buffers" can cause a fragmentation > effect. 
> Reducing the number of garbage collector threads will reduce this
> fragmentation effect, as will increasing the size of the old
> generation.

Yes, the promotion-local-allocation-buffers (PLABs) can fragment the old
generation, although that is not the most common cause. There might have
been an investigation of this type of fragmentation recently; I'll ask
around. Increasing the size of the old gen ameliorates the effects of
fragmentation by giving objects more time to die and allowing CMS more
time to coalesce dead space into larger blocks.

> Thanks,
> Sujit
>
> This e-mail and any files transmitted with it are for the sole use of
> the intended recipient(s) and may contain confidential and privileged
> information. If you are not the intended recipient, please contact the
> sender by reply e-mail and destroy all copies of the original message.
> Any unauthorized review, use, disclosure, dissemination, forwarding,
> printing or copying of this email or any action taken in reliance on
> this e-mail is strictly prohibited and may be unlawful.

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From Y.S.Ramakrishna at Sun.COM Mon Sep 28 16:37:26 2009
From: Y.S.Ramakrishna at Sun.COM (Y.S.Ramakrishna at Sun.COM)
Date: Mon, 28 Sep 2009 16:37:26 -0700
Subject: unexplained CMS pauses
In-Reply-To: <4AC145AE.30804@sun.com>
References: <4AC0EEAE.5010705@Sun.COM> <4AC145AE.30804@sun.com>
Message-ID: <4AC148B6.7010608@Sun.COM>

Hi Paul, that would be 100 us + minor gc time (= 30 ms) = 30.1 ms.
It does not explain the 10- to 40-fold increase in txn times observed
here (of 300 ms - 1200 ms).

-- ramki

On 09/28/09 16:24, Paul Hohensee wrote:
> Might have nothing to do with concurrent mark running, rather with
> minor collections happening during a java->native call.
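[Editor's sketch] Jon's point that concurrent marking uses only a
fraction of ParallelGCThreads can be made concrete. The formula below is
an assumption based on HotSpot sources of this era, where the default
concurrent marking thread count is roughly (ParallelGCThreads + 3) / 4
with integer division; check your JDK's sources for the exact rule.

```python
def default_conc_gc_threads(parallel_gc_threads):
    """Assumed HotSpot-era default: concurrent marking uses roughly a
    quarter of ParallelGCThreads, rounded up (exact rule may differ)."""
    return (parallel_gc_threads + 3) // 4

# e.g. with -XX:ParallelGCThreads=8, concurrent marking would use 2 threads
for n in (2, 4, 8, 16):
    print(n, "->", default_conc_gc_threads(n))
```

So lowering ParallelGCThreads also throttles concurrent marking, which
is worth keeping in mind when tuning for the fragmentation effect
discussed above.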
>
> Minor collections are stop-the-world, and if one occurs during a
> java->native call, the java thread making the call to native will
> block on return from the call to native until the minor collection is
> over. If the outliers are outliers in the time it takes to execute the
> native call, and the minor gc duration and timing are exactly right,
> you could end up spending most of the minor gc time blocked. Pause
> time would then be 100ms + minor gc time.
>
> Paul
>
> Shane Cox wrote:
>> Ramki,
>> We're running Solaris 10 on 8-core Intel Xeons. 1 Java instance per
>> box. JRE 1.6, update 14. 64-bit JVM.
>>
>> Our app is making JNI calls. Each call requires approximately the
>> same amount of work (so we expect them to perform similarly).
>> Internally, we measure how long it takes to perform these calls. +99%
>> of these calls complete in less than 100 micros. However, we have
>> outliers in the 300-1200ms range. After some research, we have found
>> that these extreme outliers coincide with the Concurrent Mark phase of
>> CMS (based on timestamps), and ONLY if there is a minor GC during that
>> phase.
>>
>> In a given day, our app will execute CMS collections 500-1000 times.
>> Of these, less than 10 will have a minor collection execute during the
>> Concurrent Mark. For each of these, our app reports a large pause in
>> the 300-1200ms range. In fact, all of the large pauses reported by
>> our app correlate with a minor GC during a Concurrent Mark, without
>> exception.
>>
>> Thanks
>>
>> On Mon, Sep 28, 2009 at 1:13 PM, wrote:
>>
>> What platform are you on (#cpu's etc.),
>> and when you say "app reports a pause of 300 ms",
>> is it that the odd transaction sees a latency
>> of 300 ms (coincident with concurrent mark),
>> whereas most transactions complete much more
>> quickly?
>>
>> I am trying to first understand how you determine
>> that the application is seeing "long pauses"
>> when a minor gc occurs during concurrent mark.
>>
>> PS: for example, if two gc pauses (say a scavenge of 30 ms
>> and an initial mark of 13 ms) occur in quick succession,
>> your application might notice a pause of 43 ms, etc.
>>
>> -- ramki
>>
>> On 09/28/09 10:07, Shane Cox wrote:
>>
>> Our application is reporting long pauses when a minor GC
>> occurs during the Concurrent Mark phase of CMS. The output
>> below is a specific example. All of the GC pauses are less
>> than 30ms (initial mark, remark, minor GC). However, our app
>> reported a 300ms pause.
>>
>> 56750.934: [GC [1 CMS-initial-mark: 702464K(1402496K)]
>> 719045K(1551616K), 0.0131859 secs]
>> 56750.947: [CMS-concurrent-mark-start]
>> 56752.133: [GC 56752.133: [ParNew: 144393K->12122K(149120K),
>> 0.0237615 secs] 846857K->719330K(1551616K), 0.0239988 secs]
>> 56752.162: [CMS-concurrent-mark: 1.188/1.215 secs]
>> 56752.162: [CMS-concurrent-preclean-start]
>> 56752.243: [CMS-concurrent-preclean: 0.070/0.081 secs]
>> 56752.243: [CMS-concurrent-abortable-preclean-start]
>> 56752.765: [CMS-concurrent-abortable-preclean: 0.143/0.522 secs]
>> 56752.766: [GC[YG occupancy: 77423 K (149120 K)]56752.766:
>> [Rescan (parallel) , 0.0065730 secs]56752.773: [weak refs
>> processing, 0.0001983 secs] [1 CMS-remark: 707208K(1402496K)]
>> 784631K(1551616K), 0.0068908 secs]
>> 56752.773: [CMS-concurrent-sweep-start]
>> 56753.209: [CMS-concurrent-sweep: 0.436/0.436 secs]
>> 56753.209: [CMS-concurrent-reset-start]
>> 56753.219: [CMS-concurrent-reset: 0.010/0.010 secs]
>>
>> We only observe this behavior when a minor GC occurs during
>> the Concurrent Mark (which is rare). Our app has reported
>> pauses of up to 1.2 seconds ... which is generally the time it
>> takes to perform a Concurrent Mark.
>>
>> Any insight/help that you could provide would be much
>> appreciated.
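[Editor's sketch] The correlation Shane describes, large outliers only
when a ParNew collection lands inside a CMS concurrent-mark window, can
be checked mechanically against the GC log. This sketch assumes log
lines in the -XX:+PrintGCDetails style shown above; the function name
and parsing are illustrative, not part of any JDK tool.

```python
import re

def minor_gcs_during_mark(log_lines):
    """Return timestamps of ParNew collections that fall inside a
    CMS-concurrent-mark window, given PrintGCDetails-style lines."""
    mark_start = None
    hits = []
    for line in log_lines:
        m = re.match(r'(\d+\.\d+): \[(.*)', line.strip())
        if not m:
            continue
        ts, rest = float(m.group(1)), m.group(2)
        if rest.startswith('CMS-concurrent-mark-start'):
            mark_start = ts          # window opens
        elif rest.startswith('CMS-concurrent-mark:'):
            mark_start = None        # window closes
        elif 'ParNew' in rest and mark_start is not None:
            hits.append(ts)          # minor GC inside the mark window
    return hits

log = """\
56750.934: [GC [1 CMS-initial-mark: 702464K(1402496K)] 719045K(1551616K), 0.0131859 secs]
56750.947: [CMS-concurrent-mark-start]
56752.133: [GC 56752.133: [ParNew: 144393K->12122K(149120K), 0.0237615 secs] 846857K->719330K(1551616K), 0.0239988 secs]
56752.162: [CMS-concurrent-mark: 1.188/1.215 secs]
""".splitlines()
print(minor_gcs_during_mark(log))  # the ParNew at 56752.133 is inside the window
```

On the sample log above this flags the ParNew at 56752.133, which falls
inside the 56750.947 - 56752.162 concurrent-mark window.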
>>
>> Thanks

From Paul.Hohensee at Sun.COM Mon Sep 28 16:39:38 2009
From: Paul.Hohensee at Sun.COM (Paul Hohensee)
Date: Mon, 28 Sep 2009 19:39:38 -0400
Subject: unexplained CMS pauses
In-Reply-To: <4AC148B6.7010608@Sun.COM>
References: <4AC0EEAE.5010705@Sun.COM> <4AC145AE.30804@sun.com> <4AC148B6.7010608@Sun.COM>
Message-ID: <4AC1493A.2030004@sun.com>

Ouch. Missed that.

Paul

Y.S.Ramakrishna at Sun.COM wrote:
> Hi Paul, that would be 100 us + minor gc time (= 30 ms) = 30.1 ms.
> It does not explain the 10- to 40-fold increase in txn times observed
> here (of 300 ms - 1200 ms).
>
> -- ramki
>
> On 09/28/09 16:24, Paul Hohensee wrote:
>> Might have nothing to do with concurrent mark running, rather with
>> minor collections happening during a java->native call.
>>
>> Minor collections are stop-the-world, and if one occurs during a
>> java->native call, the java thread making the call to native will
>> block on return from the call to native until the minor collection
>> is over. If the outliers are outliers in the time it takes to
>> execute the native call, and the minor gc duration and timing are
>> exactly right, you could end up spending most of the minor gc time
>> blocked. Pause time would then be 100ms + minor gc time.
>>
>> Paul
>>
>> Shane Cox wrote:
>>> Ramki,
>>> We're running Solaris 10 on 8-core Intel Xeons. 1 Java instance
>>> per box. JRE 1.6, update 14. 64-bit JVM.
>>>
>>> Our app is making JNI calls. Each call requires approximately the
>>> same amount of work (so we expect them to perform similarly).
>>> Internally, we measure how long it takes to perform these calls.
>>> +99% of these calls complete in less than 100 micros. However, we
>>> have outliers in the 300-1200ms range. After some research, we have
>>> found that these extreme outliers coincide with the Concurrent Mark
>>> phase of CMS (based on timestamps), and ONLY if there is a minor GC
>>> during that phase.
>>>
>>> In a given day, our app will execute CMS collections 500-1000
>>> times. Of these, less than 10 will have a minor collection execute
>>> during the Concurrent Mark. For each of these, our app reports a
>>> large pause in the 300-1200ms range. In fact, all of the large
>>> pauses reported by our app correlate with a minor GC during a
>>> Concurrent Mark, without exception.
>>>
>>> Thanks
>>>
>>> On Mon, Sep 28, 2009 at 1:13 PM, wrote:
>>>
>>> What platform are you on (#cpu's etc.),
>>> and when you say "app reports a pause of 300 ms",
>>> is it that the odd transaction sees a latency
>>> of 300 ms (coincident with concurrent mark),
>>> whereas most transactions complete much more
>>> quickly?
>>>
>>> I am trying to first understand how you determine
>>> that the application is seeing "long pauses"
>>> when a minor gc occurs during concurrent mark.
>>>
>>> PS: for example, if two gc pauses (say a scavenge of 30 ms
>>> and an initial mark of 13 ms) occur in quick succession,
>>> your application might notice a pause of 43 ms, etc.
>>>
>>> -- ramki
>>>
>>> On 09/28/09 10:07, Shane Cox wrote:
>>>
>>> Our application is reporting long pauses when a minor GC
>>> occurs during the Concurrent Mark phase of CMS. The output
>>> below is a specific example. All of the GC pauses are less
>>> than 30ms (initial mark, remark, minor GC). However, our app
>>> reported a 300ms pause.
>>>
>>> 56750.934: [GC [1 CMS-initial-mark: 702464K(1402496K)]
>>> 719045K(1551616K), 0.0131859 secs]
>>> 56750.947: [CMS-concurrent-mark-start]
>>> 56752.133: [GC 56752.133: [ParNew: 144393K->12122K(149120K),
>>> 0.0237615 secs] 846857K->719330K(1551616K), 0.0239988 secs]
>>> 56752.162: [CMS-concurrent-mark: 1.188/1.215 secs]
>>> 56752.162: [CMS-concurrent-preclean-start]
>>> 56752.243: [CMS-concurrent-preclean: 0.070/0.081 secs]
>>> 56752.243: [CMS-concurrent-abortable-preclean-start]
>>> 56752.765: [CMS-concurrent-abortable-preclean: 0.143/0.522 secs]
>>> 56752.766: [GC[YG occupancy: 77423 K (149120 K)]56752.766:
>>> [Rescan (parallel) , 0.0065730 secs]56752.773: [weak refs
>>> processing, 0.0001983 secs] [1 CMS-remark: 707208K(1402496K)]
>>> 784631K(1551616K), 0.0068908 secs]
>>> 56752.773: [CMS-concurrent-sweep-start]
>>> 56753.209: [CMS-concurrent-sweep: 0.436/0.436 secs]
>>> 56753.209: [CMS-concurrent-reset-start]
>>> 56753.219: [CMS-concurrent-reset: 0.010/0.010 secs]
>>>
>>> We only observe this behavior when a minor GC occurs during
>>> the Concurrent Mark (which is rare). Our app has reported
>>> pauses of up to 1.2 seconds ... which is generally the time it
>>> takes to perform a Concurrent Mark.
>>>
>>> Any insight/help that you could provide would be much
>>> appreciated.
>>>
>>> Thanks
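[Editor's sketch] Ramki's arithmetic rebutting the java->native theory
can be written out explicitly: even the worst case under Paul's model,
a 100-microsecond native call whose thread then blocks for an entire
~30 ms minor GC, caps out around 30.1 ms, an order of magnitude short of
the 300-1200 ms outliers. A minimal sketch (function name illustrative):

```python
def worst_case_blocked_latency_ms(native_call_us, minor_gc_ms):
    """Upper bound on observed call latency if the thread returning
    from native blocks for the full minor GC pause (Paul's model)."""
    return native_call_us / 1000.0 + minor_gc_ms

worst = worst_case_blocked_latency_ms(100, 30)   # 0.1 ms + 30 ms = 30.1 ms
print(worst, worst < 300)  # far below the 300-1200 ms outliers observed
```

This is why the thread concluded the native-call-blocking hypothesis
alone cannot account for the outliers.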