From chkwok at digibites.nl Sat Aug 1 14:54:06 2015
From: chkwok at digibites.nl (Chi Ho Kwok)
Date: Sat, 1 Aug 2015 16:54:06 +0200
Subject: Make G1 the Default GC - not a good idea for heavy calculation use cases
In-Reply-To: <55BB6127.4070309@redhat.com>
References: <55BB6127.4070309@redhat.com>
Message-ID:

The linked JEP literally states:

* The change is based on the assumption that limiting latency is often more important than maximizing throughput. If this assumption is incorrect then this change might need to be reconsidered.
* The resource usage of G1 is different from Parallel. When resource usage overhead needs to be minimized a collector other than G1 should be used, and after this change the alternate collector will have to be specified explicitly.

I'm not sure why linking a throughput benchmark is relevant to the discussion - we already know that the throughput collector is king for efficiency. G1 and CMS both aim at reducing latency, which is the most important factor when you run user-facing programs like web services, search with Solr/Lucene, desktop programs with a user interface, etc., and I agree with the assessment that latency is more important to the user than throughput. A job that takes 5 minutes can take 6 without anyone noticing, but a stuck web request / user interface is immediately annoying.

The questions should be -

*Readers of hotspot-gc-use, does the workload you run prefer limiting latency or maximizing throughput?*
*What is your current heap size, and do you plan on expanding it?*

Question 2 is relevant as the throughput collector already feels slow starting with 4 or 8GB heaps, and is useless for 32GB+ because collection times go to seconds. I think we should be prepared for larger heaps; machines with 128GB+ RAM aren't that rare anymore.

My response would be, for web services (servlet + embedded jetty specifically)

1. Latency.
Used CMS for years, but G1 is both lower overhead and has had zero issues so far with concurrent collection failures, unlike CMS. It also reduced the amount of -XX:whatever from 15 lines to just target pause time millis and initiation %.

2. 28GB per java process is standard here (staying below 32 for zero based compressed OOPS), but we might go for more next generation. G1 also increased density by 20%: I can run a higher concurrent session count on the same memory size than before - CMS needed a large safety margin (we had initiating occupancy on 74%, concurrent cycle ends with heap used between 78 and 84%).

Our workload is multi threaded - we can hit 100% load on 16 vCPU with just a single process, but keep it below 50% usually. Load is bursty - users decide when to hit the site, not us.

For Android Studio (and a bit tuning sensitive: Eclipse / IntelliJ / PyCharm)

1. Latency! Android Studio especially creates a ton of garbage; going with G1 fixed all my "input lag" due to heavy background processing. CMS was JetBrains' default configuration but failed quite hard on larger projects (concurrent failure in desktop app = sigh), and throughput GC was unworkable due to noticeable pauses, mostly right after a key press. Auto completion / analyzing the change happens at key press - generating garbage - and a program being stuck right when you expect a character to appear on screen is just a horrible user experience.

2. -Xms1G -Xmx3G. G1 does just fine, claiming more memory from the OS when needed and releasing it if the heap is mostly empty. It needs about 30% more room for objects to die than CMS/ParallelGC in Android Studio's object allocation pattern; using the same -Xms2G -Xmx2G introduced a lot of concurrent failures. Peak RAM usage has grown but average (due to better resizing behavior?) has dropped.
(The same thing applies to the other 3 IDEs I've mentioned, but they are a bit less horrible out of the box and easier to tune for; no massive garbage storms after key press in Eclipse, just high GC load during builds.)

I guess I'm an exception, using a lot of custom GC flags in my eclipse.ini / studio64.exe.vmoptions, but it improved the quality of life enormously; I knew them anyway from my web service tuning days. It would be good if heavier desktop programs ran great by default instead of "meh" or "laggy".

Kind regards,

--
Chi Ho Kwok
Digibites Technology
chkwok at digibites.nl

On 31 July 2015 at 13:51, Geoffrey De Smet wrote:

> Hi guys,
>
> I've run some benchmarks on OptaPlanner use cases with the latest OpenJDK 8
> to assess the impact of switching the default to G1:
>
> http://www.optaplanner.org/blog/2015/07/31/WhatIsTheFastestGarbageCollectorInJava8.html
>
> Short summary: G1 is consistently worse in every use case for every
> dataset...
>
> --
>
> With kind regards,
> Geoffrey De Smet
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From chris at thinktopic.com Mon Aug 3 16:30:24 2015
From: chris at thinktopic.com (Chris Nuernberger)
Date: Mon, 3 Aug 2015 10:30:24 -0600
Subject: Repro case for https://bugs.openjdk.java.net/browse/JDK-8059010
Message-ID:

Hey,

I have managed to run into and build a somewhat minimal repro case for this bug. It has been closed with a CNR tag but I think it is an important one.

The background to this is that we are doing a lot of image processing, so I have been working with opencv. I have a long-running process that goes through lots of images. Some of the process memory is used by opencv image representations, which are pretty large.
I don't want to speculate too much as to the cause, but I have a feeling that it is because opencv has a lot of memory allocated that the openjdk system cannot account for. I can run this process for a very long time but eventually the bug happens, so I pulled out the pathway and put it into a github repository:

https://github.com/thinktopic/JDK-8059010

--
*Chris Nuernberger*
*principal engineer / founder*
2336 Canyon Blvd, Suite 101 Boulder, CO, USA, 80302
c: (303) 859.5854 <303.859.5854>
thinktopic.com

From kjkoster at gmail.com Sun Aug 9 15:58:36 2015
From: kjkoster at gmail.com (Kees Jan Koster)
Date: Sun, 9 Aug 2015 17:58:36 +0200
Subject: Stack sizes and stack allocation
Message-ID: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com>

Dear All,

Not sure if this is the right channel, please feel free to redirect my question.

Assuming Java 8 on a 64-bit Linux machine with plenty of RAM: When I reduce the stack sizes for a JVM, does that impact the stack allocation algorithms of the JVM?

Conversely, if stack allocation calls for too many objects to be allocated on the stack, does the JVM throw a stack overflow error, or fall back to heap allocation?

And, how do I investigate this? How would I configure the JVM to print out details of stack allocation vs fall-back to heap?

--
Kees Jan

http://java-monitor.com/
kjkoster at kjkoster.org
+31651838192

I hate unit tests; I much prefer the illusion that there are no errors in my code.
-- Hendrik Muller
From yu.zhang at oracle.com Mon Aug 10 19:11:08 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 10 Aug 2015 12:11:08 -0700
Subject: Stack sizes and stack allocation
In-Reply-To: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com>
References: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com>
Message-ID: <55C8F74C.1070705@oracle.com>

Hi, Kees,

Please see my comments in line.

Thanks,
Jenny

On 8/9/2015 8:58 AM, Kees Jan Koster wrote:
> Dear All,
>
> Not sure if this is the right channel, please feel free to redirect my question.
>
> Assuming Java 8 on a 64-bit Linux machine with plenty RAM: When I reduce the stack sizes for a JVM, does that impact the stack allocation algorithms of the JVM?

No. You might run into a stack overflow error when the stack size is too small.

> Conversely, if stack allocation calls for too many objects to be allocated on the stack, does the JVM throw a stack overflow error, or fall back to heap allocation?

No. When escape analysis is enabled, some heap objects might be put on the stack though.

> And, how do I investigate this? How would I configure the JVM to print out details of stack allocation vs fall-back to heap?
>
> --
> Kees Jan
>
> http://java-monitor.com/
> kjkoster at kjkoster.org
> +31651838192
>
> I hate unit tests; I much prefer the illusion that there are no errors in my code.
> -- Hendrik Muller
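The distinction Jenny draws, between objects that must stay on the heap and objects the JIT may treat specially once escape analysis runs, can be sketched in Java. This is only an illustration under stated assumptions: the `EscapeDemo` and `Point` names are hypothetical, and whether HotSpot actually removes the allocation in the first method depends on the JVM build, inlining decisions, and the `-XX:+DoEscapeAnalysis` setting.

```java
final class EscapeDemo {
    /** Small value-like class; hypothetical, used only for this sketch. */
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Field that lets a Point outlive the method that created it. */
    static Point last;

    /** The Point never leaves this method, so the JIT may be able to eliminate it. */
    static long sumNonEscaping(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1); // does not escape: only its fields are read
            sum += p.x + p.y;
        }
        return sum;
    }

    /** Publishing the Point through a static field makes it escape. */
    static long sumEscaping(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            last = p; // escapes: still reachable after the method returns
            sum += p.x + p.y;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Both variants compute the same value; only the allocation behaviour may differ.
        System.out.println(sumNonEscaping(1_000_000) == sumEscaping(1_000_000));
    }
}
```

One way to investigate, again as a suggestion rather than a documented procedure, is to run each loop with and without `-XX:-DoEscapeAnalysis` and compare how often young collections appear in the GC log.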
From kjkoster at gmail.com Wed Aug 12 11:13:00 2015
From: kjkoster at gmail.com (Kees Jan Koster)
Date: Wed, 12 Aug 2015 13:13:00 +0200
Subject: Stack sizes and stack allocation
In-Reply-To: <55C8F74C.1070705@oracle.com>
References: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com> <55C8F74C.1070705@oracle.com>
Message-ID: <5D933D61-C030-4CEE-BBF1-1D49078BA857@gmail.com>

Dear Jenny,

>> Conversely, if stack allocation calls for too many objects to be allocated on the stack, does the JVM throw a stack overflow error, or fall back to heap allocation?
> No. When escape analysis is enabled, some heap objects might be put on stack though.

Just so I understand correctly: imagine escape analysis is enabled. For a certain bit of code, the escape analysis logic decides that a certain heap object can be allocated on the stack.

If the stack size is too small for that object, does the JVM throw a stack overflow error, or does the JVM allocate the object on the heap instead?

--
Kees Jan

http://java-monitor.com/
kjkoster at kjkoster.org
+31651838192

I hate unit tests; I much prefer the illusion that there are no errors in my code.
-- Hendrik Muller

From vitalyd at gmail.com Wed Aug 12 12:23:05 2015
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 12 Aug 2015 08:23:05 -0400
Subject: Stack sizes and stack allocation
In-Reply-To: <5D933D61-C030-4CEE-BBF1-1D49078BA857@gmail.com>
References: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com> <55C8F74C.1070705@oracle.com> <5D933D61-C030-4CEE-BBF1-1D49078BA857@gmail.com>
Message-ID:

It will throw a stack overflow error.
Keep in mind that escape analysis (currently) only does scalar replacement and not true stack allocation, so its stack usage is additional registers and spill slots, no different than calling a method with more arguments.

sent from my phone

On Aug 12, 2015 7:15 AM, "Kees Jan Koster" wrote:

> Dear Jenny,
>
> >> Conversely, if stack allocation calls for too many objects to be allocated on the stack, does the JVM throw a stack overflow error, or fall back to heap allocation?
> > No. When escape analysis is enabled, some heap objects might be put on stack though.
>
> Just so I understand correctly: imagine escape analysis is enabled. For a certain bit of code, the escape analysis logic decides that a certain heap object can be allocated on the stack.
>
> If the stack size is too small for that object, does the JVM throw a stack overflow error, or does the JVM allocate the object on the heap instead?
>
> --
> Kees Jan
>
> http://java-monitor.com/
> kjkoster at kjkoster.org
> +31651838192
>
> I hate unit tests; I much prefer the illusion that there are no errors in my code.
> -- Hendrik Muller

From kjkoster at gmail.com Thu Aug 13 16:23:29 2015
From: kjkoster at gmail.com (Kees Jan Koster)
Date: Thu, 13 Aug 2015 18:23:29 +0200
Subject: Stack sizes and stack allocation
In-Reply-To:
References: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com> <55C8F74C.1070705@oracle.com> <5D933D61-C030-4CEE-BBF1-1D49078BA857@gmail.com>
Message-ID:

Dear Vitaly,

> It will throw stackoverflow error.

Right, that's good. So by reducing the stack size we are not inadvertently reducing the ability for the VM to optimise.
> Keep in mind that escape analysis (currently) only does scalar replacement and not true stack allocation, so the stack usage of it is additional registers and spill slots, no different than calling a method with more arguments.

What does "scalar" mean in this context? Only atomic types? Only JVM wrapper types such as java.lang.Integer? java.lang.Strings?

How about my own types? If they have one field? Many fields?

I guess my question is: what makes a class eligible to have its instances stack-allocated by escape analysis?

--
Kees Jan

http://java-monitor.com/
kjkoster at kjkoster.org
+31651838192

Human beings make life so interesting. Do you know that in a universe so full of wonders, they have managed to invent boredom. Quite astonishing... -- Terry Pratchett

From vitalyd at gmail.com Thu Aug 13 16:53:22 2015
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 13 Aug 2015 12:53:22 -0400
Subject: Stack sizes and stack allocation
In-Reply-To:
References: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com> <55C8F74C.1070705@oracle.com> <5D933D61-C030-4CEE-BBF1-1D49078BA857@gmail.com>
Message-ID:

Scalar replacement basically takes the components of an object (i.e. its fields), places them into registers, and eliminates the "host" object itself. The canonical example is eliminating ArrayList$Itr. Your own types are eligible for this; it's not restricted to the JDK.

If you want to know more about the algorithm, I suggest you take a look at the sources ( http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/1aef080fd28d/src/share/vm/opto/escape.[h|c]pp) and/or ask the compiler guys (hotspot-compiler mailing list).

On Thu, Aug 13, 2015 at 12:23 PM, Kees Jan Koster wrote:

> Dear Vitaly,
>
> > It will throw stackoverflow error.
>
> Right, that's good.
> So by reducing the stack size we are not inadvertently reducing the ability for the VM to optimise.
>
> > Keep in mind that escape analysis (currently) only does scalar replacement and not true stack allocation, so the stack usage of it is additional registers and spill slots, no different than calling a method with more arguments.
>
> What does "scalar" mean in this context? Only atomic types? Only JVM wrapper types such as java.lang.Integer? java.lang.Strings?
>
> How about my own types? If they have one field? Many fields?
>
> I guess my question is: what makes a class eligible to have its instances stack-allocated by escape analysis?
>
> --
> Kees Jan
>
> http://java-monitor.com/
> kjkoster at kjkoster.org
> +31651838192
>
> Human beings make life so interesting. Do you know that in a universe so full of wonders, they have managed to invent boredom. Quite astonishing... -- Terry Pratchett

From kjkoster at gmail.com Sat Aug 15 20:45:19 2015
From: kjkoster at gmail.com (Kees Jan Koster)
Date: Sat, 15 Aug 2015 22:45:19 +0200
Subject: Stack sizes and stack allocation
In-Reply-To:
References: <5622CE5A-3F64-4C72-95D5-248AC8DD252F@gmail.com> <55C8F74C.1070705@oracle.com> <5D933D61-C030-4CEE-BBF1-1D49078BA857@gmail.com>
Message-ID: <82EF4FEE-438C-4F59-AA3F-7F51A3A8A05B@gmail.com>

Dear Vitaly and Jenny,

Thank you so much for taking the time to answer my questions and give me pointers on where to go next. Much appreciated.

Kees Jan

> On 13 Aug 2015, at 18:53, Vitaly Davidovich wrote:
>
> Scalar replacement basically takes the components of an object (i.e. its fields), places them into registers, and eliminates the "host" object itself. Canonical example is eliminating ArrayList$Itr. Your own types are eligible for this, it's not restricted to JDK.
>
> If you want to know more about the algorithm, I suggest you take a look at the sources (http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/1aef080fd28d/src/share/vm/opto/escape.[h|c]pp) and/or ask the compiler guys (hotspot-compiler mailing list).
>
> On Thu, Aug 13, 2015 at 12:23 PM, Kees Jan Koster wrote:
> Dear Vitaly,
>
>> It will throw stackoverflow error.
>
> Right, that's good. So by reducing the stack size we are not inadvertently reducing the ability for the VM to optimise.
>
>> Keep in mind that escape analysis (currently) only does scalar replacement and not true stack allocation, so the stack usage of it is additional registers and spill slots, no different than calling a method with more arguments.
>
> What does "scalar" mean in this context? Only atomic types? Only JVM wrapper types such as java.lang.Integer? java.lang.Strings?
>
> How about my own types? If they have one field? Many fields?
>
> I guess my question is: what makes a class eligible to have its instances stack-allocated by escape analysis?
>
> --
> Kees Jan
>
> http://java-monitor.com/
> kjkoster at kjkoster.org
> +31651838192
>
> Human beings make life so interesting. Do you know that in a universe so full of wonders, they have managed to invent boredom. Quite astonishing... -- Terry Pratchett

--
Kees Jan

http://java-monitor.com/
kjkoster at kjkoster.org
+31651838192

Change is good. Granted, it is good in retrospect, but change is good.

From dvdeepankar.reddy at gmail.com Fri Aug 21 00:27:31 2015
From: dvdeepankar.reddy at gmail.com (D vd Reddy)
Date: Thu, 20 Aug 2015 17:27:31 -0700
Subject: Question about Object Copy times
Message-ID:

Hi,

We are running G1 GC with heap size of around 140 - 150 GB, we are observing high object copy times during young gc (> 80 % of the total GC time). Is this expected or is there anything we are doing wrong.
I am not able to find any documentation on reducing high object copy times; any help would be appreciated.

CommandLine flags: -XX:+AggressiveOpts -XX:InitialHeapSize=154618822656 -XX:+ManagementServer -XX:MaxGCPauseMillis=1000 -XX:MaxHeapSize=154618822656 -XX:MaxMetaspaceSize=268435456 -XX:MetaspaceSize=268435456 -XX:ObjectAlignmentInBytes=16 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UnlockExperimentalVMOptions -XX:-UseCompressedOops -XX:+UseG1GC

Sample young GC snippet

4416.985: [GC pause (G1 Evacuation Pause) (young), 0.3180932 secs]
   [Parallel Time: 291.1 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 4416985.5, Avg: 4416985.9, Max: 4416986.2, Diff: 0.7]
      [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.4, Diff: 3.2, Sum: 38.5]
      [Update RS (ms): Min: 36.3, Avg: 39.4, Max: 40.0, Diff: 3.8, Sum: 906.0]
         [Processed Buffers: Min: 47, Avg: 80.9, Max: 124, Diff: 77, Sum: 1861]
      [Scan RS (ms): Min: 0.5, Avg: 1.0, Max: 1.1, Diff: 0.6, Sum: 22.5]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.6]
      [Object Copy (ms): Min: 247.1, Avg: 247.2, Max: 247.5, Diff: 0.4, Sum: 5686.4]
      [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.5]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, Sum: 7.6]
      [GC Worker Total (ms): Min: 289.4, Avg: 290.0, Max: 290.4, Diff: 1.0, Sum: 6669.3]
      [GC Worker End (ms): Min: 4417275.5, Avg: 4417275.8, Max: 4417276.2, Diff: 0.7]
   [Code Root Fixup: 0.4 ms]
   [Code Root Migration: 0.5 ms]
   [Clear CT: 9.3 ms]
   [Other: 16.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 3.7 ms]
      [Ref Enq: 0.1 ms]
      [Free CSet: 6.0 ms]
   [Eden: 80.3G(80.3G)->0.0B(81.7G) Survivors: 2944.0M->2176.0M Heap: 126.4G(144.0G)->45.4G(144.0G)]
 [Times: user=6.84 sys=0.01, real=0.32 secs]

Full GC Log for a period of run : https://gist.github.com/dvdreddy/5ecf9a58a3f309e8bb60

Thanks in advance
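As a rough sanity check on the sample entry in this post, the arithmetic behind the "> 80 %" observation can be written out. This is only a sketch: the constants are copied by hand from the log entry in the message, and the class and method names are made up for illustration. Object Copy comes out at roughly 78% of the whole pause and roughly 85% of the parallel phase, and user time divided by real time suggests about 21 of the 23 GC workers were busy throughout.

```java
final class GcPauseBreakdown {
    // Figures copied from the sample young-GC entry in the post.
    static final double PAUSE_MS = 318.0932;        // "0.3180932 secs"
    static final double PARALLEL_MS = 291.1;        // "Parallel Time: 291.1 ms"
    static final double OBJECT_COPY_AVG_MS = 247.2; // "Object Copy (ms): ... Avg: 247.2"
    static final double USER_SECS = 6.84;           // "user=6.84"
    static final double REAL_SECS = 0.32;           // "real=0.32 secs"
    static final int GC_WORKERS = 23;               // "GC Workers: 23"

    /** Fraction of the whole pause spent copying live objects. */
    static double objectCopyShareOfPause() {
        return OBJECT_COPY_AVG_MS / PAUSE_MS;
    }

    /** Fraction of the parallel phase spent copying live objects. */
    static double objectCopyShareOfParallel() {
        return OBJECT_COPY_AVG_MS / PARALLEL_MS;
    }

    /** CPU seconds per wall-clock second: a rough count of busy GC threads. */
    static double effectiveParallelism() {
        return USER_SECS / REAL_SECS;
    }

    public static void main(String[] args) {
        System.out.printf("Object Copy / pause:    %.1f%%%n", 100 * objectCopyShareOfPause());
        System.out.printf("Object Copy / parallel: %.1f%%%n", 100 * objectCopyShareOfParallel());
        System.out.printf("Busy GC threads: %.1f of %d%n", effectiveParallelism(), GC_WORKERS);
    }
}
```

Because the workers already look well utilised, the copy time in this entry appears to be driven mainly by how much live data survives each young collection rather than by idle threads, though that reading is an inference from a single log entry.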
From yiyeguhu at gmail.com Fri Aug 21 00:33:05 2015
From: yiyeguhu at gmail.com (Tao Mao)
Date: Thu, 20 Aug 2015 17:33:05 -0700
Subject: Question about Object Copy times
In-Reply-To:
References:
Message-ID:

Hi,

Do you mainly optimize for throughput or latency? What are the requirements?

Thanks.
Tao Mao

On Thu, Aug 20, 2015 at 5:27 PM, D vd Reddy wrote:

> Hi,
>
> We are running G1 GC with heap size of around 140 - 150 GB, we are observing high object copy times during young gc (> 80 % of the total GC time).
> Is this expected or is there anything we are doing wrong. I am not able to find any documentation of optimizing high object copy times,
> any help would be appreciated
>
> CommandLine flags: -XX:+AggressiveOpts -XX:InitialHeapSize=154618822656 -XX:+ManagementServer
> -XX:MaxGCPauseMillis=1000 -XX:MaxHeapSize=154618822656 -XX:MaxMetaspaceSize=268435456
> -XX:MetaspaceSize=268435456 -XX:ObjectAlignmentInBytes=16
> -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UnlockExperimentalVMOptions
> -XX:-UseCompressedOops -XX:+UseG1GC
>
> Sample young GC snippet
>
> 4416.985: [GC pause (G1 Evacuation Pause) (young), 0.3180932 secs]
>    [Parallel Time: 291.1 ms, GC Workers: 23]
>       [GC Worker Start (ms): Min: 4416985.5, Avg: 4416985.9, Max: 4416986.2, Diff: 0.7]
>       [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.4, Diff: 3.2, Sum: 38.5]
>       [Update RS (ms): Min: 36.3, Avg: 39.4, Max: 40.0, Diff: 3.8, Sum: 906.0]
>          [Processed Buffers: Min: 47, Avg: 80.9, Max: 124, Diff: 77, Sum: 1861]
>       [Scan RS (ms): Min: 0.5, Avg: 1.0, Max: 1.1, Diff: 0.6, Sum: 22.5]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.6]
>       [Object Copy (ms): Min: 247.1, Avg: 247.2, Max: 247.5, Diff: 0.4, Sum: 5686.4]
>       [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.5]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, Sum: 7.6]
> [GC Worker Total (ms): Min: 289.4, Avg: 290.0, Max: 290.4, Diff: >
1.0, Sum: 6669.3] > [GC Worker End (ms): Min: 4417275.5, Avg: 4417275.8, Max: 4417276.2, > Diff: 0.7] > [Code Root Fixup: 0.4 ms] > [Code Root Migration: 0.5 ms] > [Clear CT: 9.3 ms] > [Other: 16.8 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 3.7 ms] > [Ref Enq: 0.1 ms] > [Free CSet: 6.0 ms] > [Eden: 80.3G(80.3G)->0.0B(81.7G) Survivors: 2944.0M->2176.0M Heap: > 126.4G(144.0G)->45.4G(144.0G)] > [Times: user=6.84 sys=0.01, real=0.32 secs] > > Full GC Log for a period of run : > https://gist.github.com/dvdreddy/5ecf9a58a3f309e8bb60 > > > Thanks in advance > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dvdeepankar.reddy at gmail.com Fri Aug 21 01:02:18 2015 From: dvdeepankar.reddy at gmail.com (D vd Reddy) Date: Thu, 20 Aug 2015 18:02:18 -0700 Subject: Question about Object Copy times In-Reply-To: References: Message-ID: Thanks for the quick reply. We are trying to optimize for throughput. PS: Putting these logs through GCViewer put the throughput at 95.5 % which we want to improve On Thu, Aug 20, 2015 at 5:33 PM, Tao Mao wrote: > Hi, > > Do you mainly optimize for throughtput or latency? What are the > requirements? > > Thanks. > Tao Mao > > > > > On Thu, Aug 20, 2015 at 5:27 PM, D vd Reddy > wrote: > >> Hi, >> >> We are running G1 GC with heap size of around 140 - 150 GB, we are >> observing high object copy times during young gc (> 80 % of the total GC >> time). >> Is this expected or is there anything we are doing wrong. 
I am not able >> to find any documentation of optimizing high object copy times, >> any help would be appreciated >> >> >> CommandLine flags: -XX:+AggressiveOpts -XX:InitialHeapSize=154618822656 >> -XX:+ManagementServer >> -XX:MaxGCPauseMillis=1000 -XX:MaxHeapSize=154618822656 >> -XX:MaxMetaspaceSize=268435456 >> -XX:MetaspaceSize=268435456 -XX:ObjectAlignmentInBytes=16 >> -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps >> -XX:+UnlockExperimentalVMOptions >> -XX:-UseCompressedOops -XX:+UseG1GC >> >> >> Sample young GC snippet >> >> >> >> >> 4416.985: [GC pause (G1 Evacuation Pause) (young), 0.3180932 secs] >> [Parallel Time: 291.1 ms, GC Workers: 23] >> [GC Worker Start (ms): Min: 4416985.5, Avg: 4416985.9, Max: >> 4416986.2, Diff: 0.7] >> [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.4, Diff: 3.2, >> Sum: 38.5] >> [Update RS (ms): Min: 36.3, Avg: 39.4, Max: 40.0, Diff: 3.8, Sum: >> 906.0] >> [Processed Buffers: Min: 47, Avg: 80.9, Max: 124, Diff: 77, Sum: >> 1861] >> [Scan RS (ms): Min: 0.5, Avg: 1.0, Max: 1.1, Diff: 0.6, Sum: 22.5] >> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, >> Sum: 0.6] >> [Object Copy (ms): Min: 247.1, Avg: 247.2, Max: 247.5, Diff: 0.4, >> Sum: 5686.4] >> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: >> 7.5] >> [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, >> Sum: 7.6] >> [GC Worker Total (ms): Min: 289.4, Avg: 290.0, Max: 290.4, Diff: >> 1.0, Sum: 6669.3] >> [GC Worker End (ms): Min: 4417275.5, Avg: 4417275.8, Max: >> 4417276.2, Diff: 0.7] >> [Code Root Fixup: 0.4 ms] >> [Code Root Migration: 0.5 ms] >> [Clear CT: 9.3 ms] >> [Other: 16.8 ms] >> [Choose CSet: 0.0 ms] >> [Ref Proc: 3.7 ms] >> [Ref Enq: 0.1 ms] >> [Free CSet: 6.0 ms] >> [Eden: 80.3G(80.3G)->0.0B(81.7G) Survivors: 2944.0M->2176.0M Heap: >> 126.4G(144.0G)->45.4G(144.0G)] >> [Times: user=6.84 sys=0.01, real=0.32 secs] >> >> Full GC Log for a period of run : >> 
>> https://gist.github.com/dvdreddy/5ecf9a58a3f309e8bb60
>>
>> Thanks in advance

From yu.zhang at oracle.com Fri Aug 21 01:14:50 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Thu, 20 Aug 2015 18:14:50 -0700
Subject: Question about Object Copy times
In-Reply-To:
References:
Message-ID: <55D67B8A.6020209@oracle.com>

Hi,

Can you reduce MaxGCPauseMillis to ~300 or 200? 1000 may be too high, and the eden size is ~80g with a survivor size of 2.5g. This might make the old gen fill up more quickly and so cause more mixed GCs, but it is worth a try. Also, if you can add -XX:+PrintAdaptiveSizePolicy, it might tell us more.

It seems you are on an older version of the JVM. Can you move to a later one, jdk8u40 or jdk8u60?

Thanks,
Jenny

On 8/20/2015 5:27 PM, D vd Reddy wrote:
> Hi,
>
> We are running G1 GC with heap size of around 140 - 150 GB, we are observing high object copy times during young gc (> 80 % of the total GC time).
> Is this expected or is there anything we are doing wrong.
I am not > able to find any documentation of optimizing high object copy times, > any help would be appreciated > > > CommandLine flags: -XX:+AggressiveOpts > -XX:InitialHeapSize=154618822656 -XX:+ManagementServer > -XX:MaxGCPauseMillis=1000 -XX:MaxHeapSize=154618822656 > -XX:MaxMetaspaceSize=268435456 > -XX:MetaspaceSize=268435456 -XX:ObjectAlignmentInBytes=16 > -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > -XX:+UnlockExperimentalVMOptions > -XX:-UseCompressedOops -XX:+UseG1GC > > > Sample young GC snippet > > > > 4416.985: [GC pause (G1 Evacuation Pause) (young), 0.3180932 secs] > [Parallel Time: 291.1 ms, GC Workers: 23] > [GC Worker Start (ms): Min: 4416985.5, Avg: 4416985.9, Max: > 4416986.2, Diff: 0.7] > [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.4, Diff: > 3.2, Sum: 38.5] > [Update RS (ms): Min: 36.3, Avg: 39.4, Max: 40.0, Diff: 3.8, > Sum: 906.0] > [Processed Buffers: Min: 47, Avg: 80.9, Max: 124, Diff: 77, > Sum: 1861] > [Scan RS (ms): Min: 0.5, Avg: 1.0, Max: 1.1, Diff: 0.6, Sum: 22.5] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: > 0.0, Sum: 0.6] > [Object Copy (ms): Min: 247.1, Avg: 247.2, Max: 247.5, Diff: > 0.4, Sum: 5686.4] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: > 7.5] > [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, > Sum: 7.6] > [GC Worker Total (ms): Min: 289.4, Avg: 290.0, Max: 290.4, Diff: > 1.0, Sum: 6669.3] > [GC Worker End (ms): Min: 4417275.5, Avg: 4417275.8, Max: > 4417276.2, Diff: 0.7] > [Code Root Fixup: 0.4 ms] > [Code Root Migration: 0.5 ms] > [Clear CT: 9.3 ms] > [Other: 16.8 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 3.7 ms] > [Ref Enq: 0.1 ms] > [Free CSet: 6.0 ms] > [Eden: 80.3G(80.3G)->0.0B(81.7G) Survivors: 2944.0M->2176.0M Heap: > 126.4G(144.0G)->45.4G(144.0G)] > [Times: user=6.84 sys=0.01, real=0.32 secs] > > Full GC Log for a period of run : > https://gist.github.com/dvdreddy/5ecf9a58a3f309e8bb60 > > > Thanks in advance > > > > > > 
_______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From dvdeepankar.reddy at gmail.com Fri Aug 21 01:24:28 2015 From: dvdeepankar.reddy at gmail.com (D vd Reddy) Date: Thu, 20 Aug 2015 18:24:28 -0700 Subject: Question about Object Copy times In-Reply-To: <55D67B8A.6020209@oracle.com> References: <55D67B8A.6020209@oracle.com> Message-ID: Thanks, I will add the -XX:+PrintAdaptiveSizePolicy, and will try to move the jvm But about the old gen getting full, there is a contradicting point that the overall heap size is going down at the same size as the size of eden minus size of survivor, So I feel that the old size is not going up by that much. Also in the logs I saw that Mixed GC is not happening that frequently only 8 mixed compared to 380 young. Am I missing something here ? Thanks, On Thu, Aug 20, 2015 at 6:14 PM, Yu Zhang wrote: > Hi, > > can you reduce MaxGCPauseMillis to ~300 or 200, 1000 maybe too high, and > the eden size is ~80g, survivor size 2.5g. This might make the old gen > getting full quicker so more mixed gc, but worth a try. Also, if you can > add -XX:+PrintAdaptiveSizePolicy, it might tell us more. > > It seems you are on an older version of jvm. can you move to later jdk8u40 > or jdk8u60? > > Thanks, > Jenny > > On 8/20/2015 5:27 PM, D vd Reddy wrote: > > Hi, > > We are running G1 GC with heap size of around 140 - 150 GB, we are > observing high object copy times during young gc (> 80 % of the total GC > time). > Is this expected or is there anything we are doing wrong. 
I am not able > to find any documentation of optimizing high object copy times, > any help would be appreciated > > > CommandLine flags: -XX:+AggressiveOpts -XX:InitialHeapSize=154618822656 > -XX:+ManagementServer > -XX:MaxGCPauseMillis=1000 -XX:MaxHeapSize=154618822656 > -XX:MaxMetaspaceSize=268435456 > -XX:MetaspaceSize=268435456 -XX:ObjectAlignmentInBytes=16 > -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > -XX:+UnlockExperimentalVMOptions > -XX:-UseCompressedOops -XX:+UseG1GC > > > Sample young GC snippet > > > > 4416.985: [GC pause (G1 Evacuation Pause) (young), 0.3180932 secs] > [Parallel Time: 291.1 ms, GC Workers: 23] > [GC Worker Start (ms): Min: 4416985.5, Avg: 4416985.9, Max: > 4416986.2, Diff: 0.7] > [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.4, Diff: 3.2, > Sum: 38.5] > [Update RS (ms): Min: 36.3, Avg: 39.4, Max: 40.0, Diff: 3.8, Sum: > 906.0] > [Processed Buffers: Min: 47, Avg: 80.9, Max: 124, Diff: 77, Sum: > 1861] > [Scan RS (ms): Min: 0.5, Avg: 1.0, Max: 1.1, Diff: 0.6, Sum: 22.5] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.6] > [Object Copy (ms): Min: 247.1, Avg: 247.2, Max: 247.5, Diff: 0.4, > Sum: 5686.4] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.5] > [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, Sum: > 7.6] > [GC Worker Total (ms): Min: 289.4, Avg: 290.0, Max: 290.4, Diff: > 1.0, Sum: 6669.3] > [GC Worker End (ms): Min: 4417275.5, Avg: 4417275.8, Max: 4417276.2, > Diff: 0.7] > [Code Root Fixup: 0.4 ms] > [Code Root Migration: 0.5 ms] > [Clear CT: 9.3 ms] > [Other: 16.8 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 3.7 ms] > [Ref Enq: 0.1 ms] > [Free CSet: 6.0 ms] > [Eden: 80.3G(80.3G)->0.0B(81.7G) Survivors: 2944.0M->2176.0M Heap: > 126.4G(144.0G)->45.4G(144.0G)] > [Times: user=6.84 sys=0.01, real=0.32 secs] > > Full GC Log for a period of run : > https://gist.github.com/dvdreddy/5ecf9a58a3f309e8bb60 > > > Thanks in advance > > > > > > 

From dvdeepankar.reddy at gmail.com Fri Aug 21 03:21:19 2015
From: dvdeepankar.reddy at gmail.com (D vd Reddy)
Date: Thu, 20 Aug 2015 20:21:19 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com>
Message-ID:

Hi,

I have enabled -XX:+PrintTenuringDistribution and
-XX:+PrintAdaptiveSizePolicy and moved to a newer build (1.8.0_45). No
major improvements, but I am seeing new information.

Also, is the Desired Survivor Size printed by Tenuring Distribution only an
approximate value? I am seeing that it is significantly lower than the
final Survivor Size.

The new logs are here:
https://gist.github.com/dvdreddy/1cb9829e526d419d8452

Thanks

PS: I am starting a new experiment with a lower max pause, will post
results later.

From yiyeguhu at gmail.com Fri Aug 21 04:04:48 2015
From: yiyeguhu at gmail.com (Tao Mao)
Date: Thu, 20 Aug 2015 21:04:48 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com>
Message-ID:

Hi,

Is 140~150GB your memory limit? What's your ballpark live data set size?
Since you are looking to improve throughput, it may be helpful to increase
MaxHeapSize.

Thanks.
Tao Mao

From dvdeepankar.reddy at gmail.com Fri Aug 21 04:32:22 2015
From: dvdeepankar.reddy at gmail.com (D vd Reddy)
Date: Thu, 20 Aug 2015 21:32:22 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com>
Message-ID:

Machines are mostly 192 GB, so at most we can go 10 or 20 GB more. We have
old gen objects (which is what live means, right?) of around 40-ish GB;
other than that, everything is per-request churn (short-lived objects).
Does going for an even bigger heap help in this scenario?

Thanks

From dvdeepankar.reddy at gmail.com Mon Aug 24 19:05:48 2015
From: dvdeepankar.reddy at gmail.com (D vd Reddy)
Date: Mon, 24 Aug 2015 12:05:48 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com>
Message-ID:

Hi,

I ran a couple of experiments over the weekend. First I ran an experiment
with lowering the maxGCPause; it did help, lowered the throughput to 92%.

I found some machines with more memory and ran the experiment with 3
different memory sizes: 92 GB, 144 GB and 192 GB. The 92 GB one had lower
throughput (96.7 %) and the other two configs gave the same throughput
(97.xx %).

One thing we observed was that the survivor sizes are similar in all cases
(around 2 GB), and the object copy times (around 200-250 ms per worker,
number of workers ~28) are similar(~ish) in all the cases (except for some
outliers). If object copy is just copying memory, do you know what the
expected throughput of this copy is per GB/MB, or what is expected in
practice?

Thanks in advance

From yu.zhang at oracle.com Mon Aug 24 19:15:41 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 24 Aug 2015 12:15:41 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com>
Message-ID: <55DB6D5D.7060402@oracle.com>

Reddy,

Can you explain how you measure the throughput? For example, what does
'throughput of 92%' mean?

Thanks,
Jenny

On 8/24/2015 12:05 PM, D vd Reddy wrote:
> I made a couple of experiments run over the weekend, first I ran a
> experiment with lowering the maxGCPause it did help, lowered the
> throughput to 92%.

From dvdeepankar.reddy at gmail.com Mon Aug 24 19:28:54 2015
From: dvdeepankar.reddy at gmail.com (D vd Reddy)
Date: Mon, 24 Aug 2015 12:28:54 -0700
Subject: Question about Object Copy times
In-Reply-To: <55DB6D5D.7060402@oracle.com>
References: <55D67B8A.6020209@oracle.com> <55DB6D5D.7060402@oracle.com>
Message-ID:

This is the number given out by GCViewer; it is the ratio of time spent in
GC to the total time the JVM has been running.

Thanks

From yiyeguhu at gmail.com Mon Aug 24 19:54:47 2015
From: yiyeguhu at gmail.com (Tao Mao)
Date: Mon, 24 Aug 2015 12:54:47 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com> <55DB6D5D.7060402@oracle.com>
Message-ID:

You are almost correct: this throughput number is defined as the ratio of
user application time to the total run time, i.e., 100% is the "ideal".

From your reporting, ~97% is the best result so far. It's near the limit.
If you don't have a particular reason to push further, you can walk away
with your current tuning. Can you post the JVM options for your best run?

Thanks.
Tao Mao
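
The throughput definition given in the message above (application time
over total run time) can be sketched roughly as follows. This is only a
minimal illustration: the class and method names are hypothetical, the
numbers are made up, and it is not GCViewer's actual implementation.

```java
public class GcThroughput {
    /**
     * Throughput in percent: the share of total run time that was NOT
     * spent in GC pauses, i.e. application time / total run time.
     * (Hypothetical helper for illustration only.)
     */
    public static double throughputPercent(double totalRunSeconds, double gcPauseSeconds) {
        if (totalRunSeconds <= 0 || gcPauseSeconds < 0 || gcPauseSeconds > totalRunSeconds) {
            throw new IllegalArgumentException(
                "need totalRunSeconds > 0 and 0 <= gcPauseSeconds <= totalRunSeconds");
        }
        double applicationSeconds = totalRunSeconds - gcPauseSeconds;
        return 100.0 * applicationSeconds / totalRunSeconds;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 1000 s of run time, 30 s of it in GC pauses.
        System.out.println(throughputPercent(1000.0, 30.0));
    }
}
```

Under this definition, a run with no GC pauses at all would report 100%,
which is why 100% is described as the "ideal" above.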

From dvdeepankar.reddy at gmail.com Mon Aug 24 21:09:13 2015
From: dvdeepankar.reddy at gmail.com (D vd Reddy)
Date: Mon, 24 Aug 2015 14:09:13 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com> <55DB6D5D.7060402@oracle.com>
Message-ID:

Oh, sorry for the mistake in the throughput definition.

The options we were using were just setting MaxGCPause to 1000ms; here is
the exhaustive list:

"-XX:+AggressiveOpts -XX:InitialHeapSize=154618822656
-XX:+ManagementServer -XX:MaxGCPauseMillis=1000
-XX:MaxHeapSize=154618822656 -XX:MaxMetaspaceSize=268435456
-XX:MetaspaceSize=268435456 -XX:ObjectAlignmentInBytes=16
-XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
-XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions
-XX:-UseCompressedOops -XX:+UseG1GC "

The numbers we were seeing in other applications (one is HBase) were close
to 99 %. That's when we took a look in the GC logs to see if there is
anything more we can extract; the only thing that caught our eye was the
high object copy times, which we did not see in the other applications.

Thanks for the help and suggestions.

From yiyeguhu at gmail.com Mon Aug 24 22:49:10 2015
From: yiyeguhu at gmail.com (Tao Mao)
Date: Mon, 24 Aug 2015 15:49:10 -0700
Subject: Question about Object Copy times
In-Reply-To:
References: <55D67B8A.6020209@oracle.com> <55DB6D5D.7060402@oracle.com>
Message-ID:

A few things you could try:

1. Do you have a particular reason to set ObjectAlignmentInBytes=16? The
   default is 8. A larger value may have more waste in object chunks. Just
   remove it to have the default.

2. One gc log shows you have 23 gc workers. Experiment with lower and
   higher numbers depending on your machine constraints. May try 16 and 32
   if that's applicable.

3. If what you care about is solely throughput, setting MaxGCPauseMillis
   to a higher number (e.g., MaxGCPauseMillis=2000) may help, which is
   counter-intuitive though.

Try the above and good luck!

Thanks.
Tao Mao

From carlo.fernando at baml.com Fri Aug 28 14:21:01 2015
From: carlo.fernando at baml.com (Fernando, Carlo)
Date: Fri, 28 Aug 2015 14:21:01 +0000
Subject: Question regarding one off long GC pause times.
Message-ID: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com>

Hi.

I have been seeing long (> 1 sec) GC pauses in our latency-sensitive
application. I have enabled -XX:+PrintSafepointStatistics, thinking that
something else could have been causing this, but based on the output it
seems like everything is humming along nicely with GC taking only ~1 ms,
and then this > 1 sec GC happens all of a sudden. I was wondering if
anybody has observed similar behavior and has some explanation of why this
is happening.

I have attached my flags and a snippet of my GC log. Any info is
appreciated.

JAVA 7 FLAGS:
-Xms256M -Xmx256M -XX:NewSize=146M -XX:TargetSurvivorRatio=90
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=8
-XX:+DisableExplicitGC -XX:CMSInitiatingOccupancyFraction=50
-XX:SurvivorRatio=160 -XX:MaxTenuringThreshold=1 -XX:+UseStringCache
-XX:+UseCompressedStrings -XX:+OptimizeStringConcat -XX:+AggressiveOpts
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
-XX:+PrintGCDetails -XX:+PrintSafepointStatistics -XX:+PerfDisableSharedMem

SAFEPOINT:
55638.934: GenCollectForAllocation [ 174 0 0 ] [ 0 0 0 0 1 ] 0
55643.367: GenCollectForAllocation [ 174 0 1 ] [ 0 0 0 0 1 ] 0
55647.391: GenCollectForAllocation [ 174 1 0 ] [ 0 0 0 0 1 ] 0
55650.766: GenCollectForAllocation [ 174 0 0 ] [ 0 0 0 0 1538 ] 0
55654.785: GenCollectForAllocation [ 174 0 0 ] [ 0 0 0 0 1 ] 0
55661.227: GenCollectForAllocation [ 174 0 0 ] [ 0 0 0 0 1 ] 0

GC:
2015-08-28T08:29:44.486-0400: 55643.368: [GC 55643.368: [ParNew: 147851K->118K(148608K), 0.0014580 secs] 171468K->23745K(261248K), 0.0015420 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
Total time for which application threads were stopped: 0.0018720 seconds
Application time: 4.0202340 seconds
2015-08-28T08:29:48.508-0400: 55647.390: [GC 55647.391: [ParNew: 147830K->225K(148608K), 0.0011710 secs] 171457K->23852K(261248K), 0.0012660 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
Total time for which application threads were stopped: 0.0016180 seconds
Application time: 3.3738970 seconds
2015-08-28T08:29:51.883-0400: 55650.766: [GC 55652.303: [ParNew: 147937K->247K(148608K), 0.0009800 secs] 171564K->23955K(261248K), 0.0010800 secs] [Times: user=0.01 sys=0.00, real=1.54 secs]
Total time for which application threads were stopped: 1.5383140 seconds
Application time: 2.4803280 seconds
2015-08-28T08:29:55.902-0400: 55654.785: [GC 55654.785: [ParNew: 147959K->137K(148608K), 0.0009230 secs] 171667K->23853K(261248K), 0.0009900 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
Total time for which application threads were stopped: 0.0012740 seconds
Application time: 6.4405350 seconds
2015-08-28T08:30:02.344-0400: 55661.226: [GC 55661.226: [ParNew: 147849K->156K(148608K), 0.0008740 secs] 171565K->23877K(261248K), 0.0009480 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
Total time for which application threads were stopped: 0.0012230 seconds
Application time: 5.1378200 seconds

From ysr1729 at gmail.com Fri Aug 28 16:00:37 2015
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Fri, 28 Aug 2015 09:00:37 -0700
Subject: Question regarding one off long GC pause times.
In-Reply-To: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com>
References: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com>
Message-ID:

Hi Carlo --

Two possible avenues worth investigating:

(1) check if there's any kind of periodicity to such events. If that is the
case, see if there's something else that runs periodically on the same
machine (note that only the elapsed time is affected here, as if for some
reason the process was stalled for that amount of time; GC itself seems to
finish quite quickly, by the way.)

(2) There was a recent thread from Evan Jones on how, if your /tmp is not a
tmpfs partition (but rather a disk-based partition) then, in the presence
of high I/O load, there can be situations (particularly with the Linux
virtual memory manager) where the perfdata pages can be occasionally
evicted to disk.
When the JVM tries to update the perfdata counters at the end of GC, we end up taking page misses that can impact pause times. A suggested workaround was to use: -XX:-UsePerfData to stop such perfdata segment logging (you will lose access to those counters externally of course), and avoid that pause time hit.

There can be other possibilities as well, but given the symptom here, of only the elapsed time being affected, chances are that one of these (and possibly (1)) is a likely cause.

Good luck!
-- ramki

From kirtiteja at gmail.com Fri Aug 28 17:33:10 2015
From: kirtiteja at gmail.com (Kirti Teja Rao)
Date: Fri, 28 Aug 2015 10:33:10 -0700
Subject: Question regarding one off long GC pause times.
In-Reply-To:
References: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com>
Message-ID:

Hi Carlo,

In addition to what Ramki said, you may want to set vm.swappiness to 0 (Default is 60 on most linux systems).

Also, if there is high IO load, gc logging can also impact pause times (logging happens inside the pause time).

Thanks,
Teja

From carlo.fernando at baml.com Fri Aug 28 19:05:32 2015
From: carlo.fernando at baml.com (Fernando, Carlo)
Date: Fri, 28 Aug 2015 19:05:32 +0000
Subject: Question regarding one off long GC pause times.
In-Reply-To:
References: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com>
Message-ID: <204609DC9565564AA71E9B4312EA3232420AA3D2@smtp_mail.bankofamerica.com>

Thanks for the response.
For case 1: it was not periodic; however, it only happens during the busiest times of the day. This makes me think that maybe it is tied somewhat to the host's IO activity.

For case 2: I did read the thread from Evan Jones and added -XX:+PerfDisableSharedMem recently. However, the long pause times were still there.

I will go back and do an audit to see if I missed anything that runs periodically on the machine.

Thanks again for the info.

-carlo

From carlo.fernando at baml.com Fri Aug 28 19:06:39 2015
From: carlo.fernando at baml.com (Fernando, Carlo)
Date: Fri, 28 Aug 2015 19:06:39 +0000
Subject: Question regarding one off long GC pause times.
In-Reply-To:
References: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com>
Message-ID: <204609DC9565564AA71E9B4312EA3232420AA3E4@smtp_mail.bankofamerica.com>

Hi.

I will loop in our Sys admin and we will try a test with this setting.

Appreciate the response.

Thanks

-carlo
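
Ramki's point that only the elapsed time is affected can be checked mechanically across a whole log. The following is a minimal sketch, not a tool from this thread: it assumes each collection record sits on one physical line (as the JVM writes it, before mail wrapping), and the function name, the 0.1 s floor and the 10x ratio are arbitrary choices made for illustration. It flags ParNew records whose real time dwarfs user+sys, and also reports the gap between the two uptime stamps on the record.

```python
import re

# "[Times: user=... sys=..., real=... secs]" as printed with -XX:+PrintGCDetails
TIMES_RE = re.compile(r"\[Times: user=([\d.]+) sys=([\d.]+), real=([\d.]+) secs\]")
# The two JVM-uptime stamps on a ParNew record: "<outer>: [GC <inner>: [ParNew"
STAMPS_RE = re.compile(r"([\d.]+): \[GC ([\d.]+): \[ParNew")


def find_stalled_pauses(log_text, min_real=0.1, ratio=10.0):
    """Return records whose wall-clock time is far larger than their CPU time.

    min_real and ratio are illustrative thresholds, not values from the thread.
    """
    stalled = []
    for line in log_text.splitlines():
        times = TIMES_RE.search(line)
        stamps = STAMPS_RE.search(line)
        if not times or not stamps:
            continue
        user, sys_time, real = (float(g) for g in times.groups())
        cpu = user + sys_time
        outer, inner = float(stamps.group(1)), float(stamps.group(2))
        # The log rounds to 10 ms, so treat anything below that as 10 ms of CPU.
        if real >= min_real and real > ratio * max(cpu, 0.01):
            stalled.append({
                "uptime": outer,
                "cpu": cpu,
                "real": real,
                # Time between the "[GC" stamp and the "[ParNew" stamp
                "gap": round(inner - outer, 3),
            })
    return stalled
```

Run against the five records posted above, only the 55650.766 record should be flagged. Its two stamps (55650.766 and 55652.303) are 1.537 seconds apart, which suggests the stall sits before the young collection itself starts rather than inside it.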
From charlie.hunt at oracle.com Fri Aug 28 19:36:40 2015
From: charlie.hunt at oracle.com (charlie hunt)
Date: Fri, 28 Aug 2015 14:36:40 -0500
Subject: Question regarding one off long GC pause times.
In-Reply-To: <204609DC9565564AA71E9B4312EA3232420AA3E4@smtp_mail.bankofamerica.com>
References: <204609DC9565564AA71E9B4312EA3232420AA101@smtp_mail.bankofamerica.com> <204609DC9565564AA71E9B4312EA3232420AA3E4@smtp_mail.bankofamerica.com>
Message-ID: <55E0B848.501@oracle.com>

While you have your sys admin handy, also have him check that transparent huge pages are disabled.

charlie
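
The two host settings raised in this thread (vm.swappiness from Teja, transparent huge pages from Charlie) can be read without root on a typical Linux box. The sketch below is an assumption-laden illustration: the function names are made up for this example, and the sysfs path is the mainline kernel location, which some distributions relocate. It only reports the current values and changes nothing.

```python
import re
from pathlib import Path

# Standard Linux locations; assumed here, some distributions use other paths.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Extract the active mode from a line such as 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_text_or_none(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def check_host():
    """Collect findings that differ from what was suggested in the thread."""
    findings = []
    swappiness = read_text_or_none(SWAPPINESS_PATH)
    if swappiness is not None and swappiness.isdigit() and int(swappiness) > 0:
        findings.append(f"vm.swappiness is {swappiness}; 0 was suggested")
    thp = read_text_or_none(THP_PATH)
    if thp is not None and parse_thp_mode(thp) != "never":
        findings.append(f"transparent huge pages setting is '{thp}'; disabling was suggested")
    return findings


if __name__ == "__main__":
    for finding in check_host():
        print(finding)
```

An empty result means either both settings already match the suggestions or the files are not readable on this host. The "madvise" mode is reported as well, since it still allows huge pages for processes that request them.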