From brianfromoregon at gmail.com Tue May 5 03:17:08 2015 From: brianfromoregon at gmail.com (Brian Harris) Date: Mon, 4 May 2015 20:17:08 -0700 Subject: java8 metaspace issue Message-ID: Hi, I find that this code crashes in 8u40 after getting up to about 900 when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with -XX:MaxPermSize=10m it does not crash. Is that expected? It seems similar to https://bugs.openjdk.java.net/browse/JDK-8025635 Thanks, Brian // uses Guava's CacheBuilder public class Main { public static void main(String[] args) throws Exception { Cache cache = CacheBuilder.newBuilder() .softValues() .build(); for (int i = 0; i < 50_000; i++) { URL[] dummyUrls = {new URL("file:" + i + ".jar")}; URLClassLoader cl = new URLClassLoader(dummyUrls, Thread.currentThread().getContextClassLoader()); Object proxy = Proxy.newProxyInstance(cl, new Class[]{Foo.class}, new InvocationHandler() { @Override public Object invoke(Object proxy, Method method, Object[] args) throws Throwable { return null; } }); cache.put(i, proxy); System.out.println(i); } } public interface Foo { void x(); } } -------------- next part -------------- An HTML attachment was scrubbed... URL: From bengt.rutisson at oracle.com Tue May 5 09:05:11 2015 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 05 May 2015 11:05:11 +0200 Subject: java8 metaspace issue In-Reply-To: References: Message-ID: <554887C7.7080506@oracle.com> Hi Brian, On 2015-05-05 05:17, Brian Harris wrote: > Hi, > > I find that this code crashes in 8u40 after getting up to about 900 > when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with > -XX:MaxPermSize=10m it does not crash. Thanks for providing the example program. For me it does not crash but if I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it crash for you? The OutOfMemoryError can be explained by the fact that when you run with -XX:MaxPermSize=10m there is some aligning going on and in the end you actually end up with a perm gen that is 20m large. Here's what I get when I use $ java -XX:+PrintGCDetails -XX:MaxPermSize=10m Heap PSYoungGen total 150528K, used 10363K [0x0000000758c80000, 0x0000000763400000, 0x0000000800000000) eden space 129536K, 8% used [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000) from space 20992K, 0% used [0x0000000761f80000,0x0000000761f80000,0x0000000763400000) to space 20992K, 0% used [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000) ParOldGen total 342016K, used 0K [0x000000060a600000, 0x000000061f400000, 0x0000000758c80000) object space 342016K, 0% used [0x000000060a600000,0x000000060a600000,0x000000061f400000) PSPermGen total 20480K, used 3382K [0x0000000609200000, 0x000000060a600000, 0x000000060a600000) object space 20480K, 16% used [0x0000000609200000,0x000000060954dae0,0x000000060a600000) As you can see the perm gen is 20 m even though I specified 10m on the command line. If I run your program with -XX:MaxMetaspaceSize=20m it passes and does not run out of memory. There are no guarantees that you can always just replace MaxPermSize with MaxMetaspaceSize. Often it works, but sometimes you have to adjust the values. Especially at boundary cases as low as 10m. Hths, Bengt > > Is that expected? 
It seems similar to > https://bugs.openjdk.java.net/browse/JDK-8025635 > > Thanks, > Brian > > // uses Guava's CacheBuilder > public class Main { > public static void main(String[] args) throws Exception { > Cache cache = CacheBuilder.newBuilder() > .softValues() > .build(); > for (int i = 0; i < 50_000; i++) { > URL[] dummyUrls = {new URL("file:" + i + ".jar")}; > URLClassLoader cl = new URLClassLoader(dummyUrls, > Thread.currentThread().getContextClassLoader()); > Object proxy = Proxy.newProxyInstance(cl, new > Class[]{Foo.class}, new InvocationHandler() { > @Override > public Object invoke(Object proxy, Method method, > Object[] args) throws Throwable { > return null; > } > }); > cache.put(i, proxy); > System.out.println(i); > } > } > public interface Foo { > void x(); > } > } > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From brianfromoregon at gmail.com Fri May 1 21:10:05 2015 From: brianfromoregon at gmail.com (Brian Harris) Date: Fri, 1 May 2015 14:10:05 -0700 Subject: java8 metaspace issue Message-ID: Hi, I find that this code crashes in 8u40 after getting up to about 900 when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with -XX:MaxPermSize=10m it does not crash. Is that expected? It seems similar to https://bugs.openjdk.java.net/browse/JDK-8025635 Thanks, Brian // uses Guava's CacheBuilder public class Main { public static void main(String[] args) throws Exception { Cache cache = CacheBuilder.newBuilder() .softValues() .build(); for (int i = 0; i < 50_000; i++) { URL[] dummyUrls = {new URL("file:" + i + ".jar")}; URLClassLoader cl = new URLClassLoader(dummyUrls, Thread.currentThread().getContextClassLoader()); Object proxy = Proxy.newProxyInstance(cl, new Class[]{Foo.class}, new InvocationHandler() { @Override public Object invoke(Object proxy, Method method, Object[] args) throws Throwable { return null; } }); cache.put(i, proxy); System.out.println(i); } } public interface Foo { void x(); } } -------------- next part -------------- An HTML attachment was scrubbed... URL: From simone.bordet at gmail.com Fri May 8 15:11:00 2015 From: simone.bordet at gmail.com (Simone Bordet) Date: Fri, 8 May 2015 17:11:00 +0200 Subject: G1, Remembered Sets & Refinement Message-ID: Hi, I would like to ask some clarification about remembered sets (RS), buffers and refinement in G1. My understanding is that G1 installs a write barrier to record old to young pointers. Let's assume that A is the object in old generation, and B is the object in young generation. I understand that when the barrier triggers, the card correspondent to the place where the A resides is marked. This card is then enqueued into a queue (the dirty card queue). I understand that until the number of entries in the dirty card queue does not enter the yellow zone, then nothing is done. When the yellow zone is entered, a refinement thread is started to poll items out of the dirty card queue and update the RS for the young region. I understand that, when a young GC happens, the refinement threads are stopped (if running), and "Update RS" phase takes care of processing the dirty card queue. Provided my understanding is correct, what is the meaning of the word "buffer" in this scenario ? PrintGCDetails prints out a "Processed Buffers" subphase for "Update RS", but what is a "buffer" ? 
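(A rough illustration of the "buffer" idea may help here. The sketch below is invented for this thread, not the actual HotSpot code, and every name in it is made up: each mutator thread collects the indexes of cards it has just dirtied in a small thread-local array, and when that array fills up it is handed over to a global set of buffers that the concurrent refinement threads, or the "Update RS" phase of the next pause, drain. "Processed Buffers" is, roughly, a count of those hand-overs drained during the pause.)

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Simplified model only -- not the real HotSpot implementation.
public class DirtyCardSketch {

    static final int CARD_SHIFT = 9;                    // one card covers 512 bytes
    static final byte[] CARD_TABLE = new byte[1 << 20]; // one byte per card
    static final int BUFFER_SIZE = 256;                 // dirty cards per buffer

    // Full per-thread buffers are published here; refinement threads (or the
    // "Update RS" phase of the next pause) take them out and scan the cards.
    static final Queue<int[]> DIRTY_CARD_QUEUE_SET = new ConcurrentLinkedQueue<>();

    static final class Buffer {
        final int[] cards = new int[BUFFER_SIZE];
        int pos;
    }
    static final ThreadLocal<Buffer> LOCAL = ThreadLocal.withInitial(Buffer::new);

    // Post-write barrier for "a.field = b": mark the card covering the updated
    // field and enqueue it, but only if it was not already dirty (a dirty card
    // is already sitting in some buffer and will be looked at anyway).
    static void postWriteBarrier(long fieldAddress) {
        int card = (int) ((fieldAddress >>> CARD_SHIFT) & (CARD_TABLE.length - 1));
        if (CARD_TABLE[card] == 0) {
            CARD_TABLE[card] = 1;
            Buffer buf = LOCAL.get();
            buf.cards[buf.pos++] = card;
            if (buf.pos == BUFFER_SIZE) {               // buffer full: publish it
                DIRTY_CARD_QUEUE_SET.add(buf.cards.clone());
                buf.pos = 0;
            }
        }
    }
}

In the real VM the hand-off goes through the DirtyCardQueueSet mentioned in the replies below, but the shape of the mechanism is the same.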
Another question: given that the write barrier knows exactly A's oop, what is the reason for card marking, rather than just recording the oop ? Last question: would setting the yellow zone to zero always reduce the "Update RS" to (almost) zero ? What problem could this setting possibly generate ? Thanks ! -- Simone Bordet http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz From yiyeguhu at gmail.com Fri May 8 22:33:44 2015 From: yiyeguhu at gmail.com (Tao Mao) Date: Fri, 8 May 2015 15:33:44 -0700 Subject: How to find classes FinalReference references to? Message-ID: Hi, I find one of our applications using G1GC takes a comparably long time to do [Ref Proc]. Tried ParallelRefProcEnabled, with no noticeable improvement. Turning on PrintReferenceGC further finds all of processed references are FinalReference. We want to know which code is using a finalizer at this point. Our own code base does not use finalize() at all. So, I suspect finalize() is being used by some external Java libraries but they are not easy to find. Is there any way to find and profile classes/objects FinalReference references to? (I guess I'm still able to hack into OpenJDK code but I'd rather not do that :) Thanks. Tao -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Fri May 8 22:43:05 2015 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 8 May 2015 18:43:05 -0400 Subject: How to find classes FinalReference references to? In-Reply-To: References: Message-ID: Hi Tao, Have you tried taking a heap dump (jmap) and then using a heap viewer/analyzer, such as MAT ? You should be able to find instances of FinalReference, and what they point at. HTH On Fri, May 8, 2015 at 6:33 PM, Tao Mao wrote: > Hi, > > I find one of our applications using G1GC takes a comparably long time to > do [Ref Proc]. Tried ParallelRefProcEnabled, with no noticeable > improvement. Turning on PrintReferenceGC further finds all of processed > references are FinalReference. We want to know which code is using a > finalizer at this point. Our own code base does not use finalize() at all. > So, I suspect finalize() is being used by some external Java libraries but > they are not easy to find. > > Is there any way to find and profile classes/objects FinalReference > references to? > > (I guess I'm still able to hack into OpenJDK code but I'd rather not do > that :) > > Thanks. > Tao > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Mon May 11 18:46:58 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Mon, 11 May 2015 11:46:58 -0700 Subject: G1, Remembered Sets & Refinement In-Reply-To: References: Message-ID: <5550F922.8040703@oracle.com> Simon, I will try to answer your questions. Thanks, Jenny On 5/8/2015 8:11 AM, Simone Bordet wrote: > Hi, > > I would like to ask some clarification about remembered sets (RS), > buffers and refinement in G1. > > My understanding is that G1 installs a write barrier to record old to > young pointers. > Let's assume that A is the object in old generation, and B is the > object in young generation. Yes. 
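(On the FinalReference question above: besides attaching jmap from the outside, the same dump can be taken from inside the process through the standard HotSpotDiagnosticMXBean. A minimal sketch, with the output file name chosen arbitrarily:)

import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// Dump only live objects, then open the .hprof in MAT (or VisualVM) and list the
// instances of java.lang.ref.Finalizer -- the concrete FinalReference subclass --
// to see which classes their referent fields point to, i.e. which classes still
// declare finalize().
public class HeapDumper {
    public static void main(String[] args) throws Exception {
        HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        diag.dumpHeap("finalizers.hprof", true);        // true = live objects only
    }
}

From the command line, jmap -dump:live,format=b,file=finalizers.hprof <pid> produces the same kind of dump.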
> > I understand that when the barrier triggers, the card correspondent to > the place where the A resides is marked. > This card is then enqueued into a queue (the dirty card queue). yes > > I understand that until the number of entries in the dirty card queue > does not enter the yellow zone, then nothing is done. > When the yellow zone is entered, a refinement thread is started to > poll items out of the dirty card queue and update the RS for the young > region. yes. The number of refinement threads activated is decided by G1ConcRefinementThresholdStep. > > I understand that, when a young GC happens, the refinement threads are > stopped (if running), and "Update RS" phase takes care of processing > the dirty card queue. yes. > > Provided my understanding is correct, what is the meaning of the word > "buffer" in this scenario ? > PrintGCDetails prints out a "Processed Buffers" subphase for "Update > RS", but what is a "buffer" ? as you mentioned, the buffer is a set of the dirty card queues(DirtyCardQueueSet). The dirty cards are processed by concurrent refinement threads or at STW phase( update RS). > > Another question: given that the write barrier knows exactly A's oop, > what is the reason for card marking, rather than just recording the > oop ? I hope others can chime in on this. My guess it is related to memory footprint and performance. > > Last question: would setting the yellow zone to zero always reduce the > "Update RS" to (almost) zero ? What problem could this setting > possibly generate ? Probably not. It will push more work to concurrent refinement threads, but still could leave some work for STW phase. > > Thanks ! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From simone.bordet at gmail.com Mon May 11 19:15:18 2015 From: simone.bordet at gmail.com (Simone Bordet) Date: Mon, 11 May 2015 21:15:18 +0200 Subject: G1, Remembered Sets & Refinement In-Reply-To: <5550F922.8040703@oracle.com> References: <5550F922.8040703@oracle.com> Message-ID: Jenny, On Mon, May 11, 2015 at 8:46 PM, Yu Zhang wrote: > Simon, > > I will try to answer your questions. Thank you for your answers ! -- Simone Bordet http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz From thomas.schatzl at oracle.com Tue May 12 08:06:48 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 May 2015 10:06:48 +0200 Subject: G1, Remembered Sets & Refinement In-Reply-To: <5550F922.8040703@oracle.com> References: <5550F922.8040703@oracle.com> Message-ID: <1431418008.3356.6.camel@oracle.com> Hi, On Mon, 2015-05-11 at 11:46 -0700, Yu Zhang wrote: > Simon, > > I will try to answer your questions. > Thanks, > Jenny > On 5/8/2015 8:11 AM, Simone Bordet wrote: > > > Hi, > > > > I would like to ask some clarification about remembered sets (RS), > > buffers and refinement in G1. > > > > My understanding is that G1 installs a write barrier to record old to > > young pointers. Actually all inter-region pointers, where old->young are special cases. > > Let's assume that A is the object in old generation, and B is the > > object in young generation. > Yes. > > > > I understand that when the barrier triggers, the card correspondent to > > the place where the A resides is marked. > > This card is then enqueued into a queue (the dirty card queue). 
> yes > > > > I understand that until the number of entries in the dirty card queue > > does not enter the yellow zone, then nothing is done. > > When the yellow zone is entered, a refinement thread is started to > > poll items out of the dirty card queue and update the RS for the young > > region. > yes. The number of refinement threads activated is decided by > G1ConcRefinementThresholdStep. Actually refinement starts at the green threshold, until all refinement threads are running at the yellow one. At the red threshold, mutator threads start helping. > > I understand that, when a young GC happens, the refinement threads are > > stopped (if running), and "Update RS" phase takes care of processing > > the dirty card queue. > yes. > > > > Provided my understanding is correct, what is the meaning of the word > > "buffer" in this scenario ? > > PrintGCDetails prints out a "Processed Buffers" subphase for "Update > > RS", but what is a "buffer" ? > as you mentioned, the buffer is a set of the dirty card > queues(DirtyCardQueueSet). The dirty cards are processed by > concurrent refinement threads or at STW phase( update RS). > > > > Another question: given that the write barrier knows exactly A's oop, > > what is the reason for card marking, rather than just recording the > > oop ? > I hope others can chime in on this. My guess it is related to memory > footprint and performance. - memory usage: a card covers a set of references which are often changed together. - performance: while a card is marked, that card is not re-enqueued again. This kind of duplicate detection is simple using cards (or any range of memory backed by an array), while hard otherwise. > > Last question: would setting the yellow zone to zero always reduce the > > "Update RS" to (almost) zero ? What problem could this setting > > possibly generate ? > Probably not. It will push more work to concurrent refinement > threads, but still could leave some work for STW phase. - there will always be some work remaining to be done during the stw pause - this will effectively disable duplicate detection leading to much higher cpu-usage as cards are more frequently re-processed. Often it is actually advantageous to keep the cards longer in the queue. Thanks, Thomas From thomas.schatzl at oracle.com Tue May 12 09:05:28 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 May 2015 11:05:28 +0200 Subject: G1, Remembered Sets & Refinement In-Reply-To: <1431418008.3356.6.camel@oracle.com> References: <5550F922.8040703@oracle.com> <1431418008.3356.6.camel@oracle.com> Message-ID: <1431421528.3356.28.camel@oracle.com> Hi all, On Tue, 2015-05-12 at 10:06 +0200, Thomas Schatzl wrote: > Hi, > > On Mon, 2015-05-11 at 11:46 -0700, Yu Zhang wrote: > > Simon, > > > > I will try to answer your questions. > > Thanks, > > Jenny > > On 5/8/2015 8:11 AM, Simone Bordet wrote: > > > > > Hi, > > > > > > I would like to ask some clarification about remembered sets (RS), > > > buffers and refinement in G1. > > > > > > My understanding is that G1 installs a write barrier to record old to > > > young pointers. > > Actually all inter-region pointers, where old->young are special cases. just to clear up any misunderstandings: the barrier is installed for all writes (except ones that the compiler can prove to be uninteresting as below, I think only guaranteed NULL-writes). 
Summarizing that, G1 is interested in any old->whatever references that - are non-NULL - cross a region - originate from old (in that order btw) For refinement, the corresponding card also needs to be non-dirty to be enqueued, otherwise it knows that the card is already somewhere in some queue and will eventually be looked at anyway. Thanks, Thomas From brianfromoregon at gmail.com Thu May 14 23:35:04 2015 From: brianfromoregon at gmail.com (Brian Harris) Date: Thu, 14 May 2015 16:35:04 -0700 Subject: java8 metaspace issue In-Reply-To: <554887C7.7080506@oracle.com> References: <554887C7.7080506@oracle.com> Message-ID: Yes I should have said OOME instead of 'crash'. Indeed when setting -XX:MaxMetaspaceSize=20m the program does not throw OOME. Appears to be a boundary case jvm bug that this will throw OOME when -XX:MaxMetaspaceSize=10m after going through the loop 890 times. Otherwise, how else can the OOME be explained given we're using soft refs? On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson wrote: > > Hi Brian, > > On 2015-05-05 05:17, Brian Harris wrote: > > Hi, > > I find that this code crashes in 8u40 after getting up to about 900 when > run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with > -XX:MaxPermSize=10m it does not crash. > > > Thanks for providing the example program. For me it does not crash but if > I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it > crash for you? > > The OutOfMemoryError can be explained by the fact that when you run with > -XX:MaxPermSize=10m there is some aligning going on and in the end you > actually end up with a perm gen that is 20m large. Here's what I get when I > use > > $ java -XX:+PrintGCDetails -XX:MaxPermSize=10m > > Heap > PSYoungGen total 150528K, used 10363K [0x0000000758c80000, > 0x0000000763400000, 0x0000000800000000) > eden space 129536K, 8% used > [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000) > from space 20992K, 0% used > [0x0000000761f80000,0x0000000761f80000,0x0000000763400000) > to space 20992K, 0% used > [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000) > ParOldGen total 342016K, used 0K [0x000000060a600000, > 0x000000061f400000, 0x0000000758c80000) > object space 342016K, 0% used > [0x000000060a600000,0x000000060a600000,0x000000061f400000) > PSPermGen total 20480K, used 3382K [0x0000000609200000, > 0x000000060a600000, 0x000000060a600000) > object space 20480K, 16% used > [0x0000000609200000,0x000000060954dae0,0x000000060a600000) > > As you can see the perm gen is 20 m even though I specified 10m on the > command line. > > If I run your program with -XX:MaxMetaspaceSize=20m it passes and does not > run out of memory. > > > There are no guarantees that you can always just replace MaxPermSize with > MaxMetaspaceSize. Often it works, but sometimes you have to adjust the > values. Especially at boundary cases as low as 10m. > > Hths, > Bengt > > > > Is that expected? 
It seems similar to > https://bugs.openjdk.java.net/browse/JDK-8025635 > > Thanks, > Brian > > // uses Guava's CacheBuilder > public class Main { > > public static void main(String[] args) throws Exception { > Cache cache = CacheBuilder.newBuilder() > .softValues() > .build(); > > for (int i = 0; i < 50_000; i++) { > URL[] dummyUrls = {new URL("file:" + i + ".jar")}; > URLClassLoader cl = new URLClassLoader(dummyUrls, > Thread.currentThread().getContextClassLoader()); > Object proxy = Proxy.newProxyInstance(cl, new > Class[]{Foo.class}, new InvocationHandler() { > @Override > public Object invoke(Object proxy, Method method, Object[] > args) throws Throwable { > return null; > } > }); > cache.put(i, proxy); > System.out.println(i); > } > } > > public interface Foo { > void x(); > } > } > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecki at zusammenkunft.net Fri May 15 00:41:29 2015 From: ecki at zusammenkunft.net (Bernd) Date: Fri, 15 May 2015 02:41:29 +0200 Subject: java8 metaspace issue In-Reply-To: References: <554887C7.7080506@oracle.com> Message-ID: You can try to turn on tracing of inlining and compilation. The JITed code can need space in the meta generation and can be caused by repeating code till it becomes hot enough. And 10mb is really small, anyway. Gruss Bernd Am 15.05.2015 02:35 schrieb "Brian Harris" : > Yes I should have said OOME instead of 'crash'. Indeed when setting > -XX:MaxMetaspaceSize=20m the program does not throw OOME. > > Appears to be a boundary case jvm bug that this will throw OOME when -XX:MaxMetaspaceSize=10m > after going through the loop 890 times. Otherwise, how else can the OOME be > explained given we're using soft refs? > > On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson > wrote: > >> >> Hi Brian, >> >> On 2015-05-05 05:17, Brian Harris wrote: >> >> Hi, >> >> I find that this code crashes in 8u40 after getting up to about 900 >> when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with >> -XX:MaxPermSize=10m it does not crash. >> >> >> Thanks for providing the example program. For me it does not crash but if >> I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it >> crash for you? >> >> The OutOfMemoryError can be explained by the fact that when you run with >> -XX:MaxPermSize=10m there is some aligning going on and in the end you >> actually end up with a perm gen that is 20m large. 
Here's what I get when I >> use >> >> $ java -XX:+PrintGCDetails -XX:MaxPermSize=10m >> >> Heap >> PSYoungGen total 150528K, used 10363K [0x0000000758c80000, >> 0x0000000763400000, 0x0000000800000000) >> eden space 129536K, 8% used >> [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000) >> from space 20992K, 0% used >> [0x0000000761f80000,0x0000000761f80000,0x0000000763400000) >> to space 20992K, 0% used >> [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000) >> ParOldGen total 342016K, used 0K [0x000000060a600000, >> 0x000000061f400000, 0x0000000758c80000) >> object space 342016K, 0% used >> [0x000000060a600000,0x000000060a600000,0x000000061f400000) >> PSPermGen total 20480K, used 3382K [0x0000000609200000, >> 0x000000060a600000, 0x000000060a600000) >> object space 20480K, 16% used >> [0x0000000609200000,0x000000060954dae0,0x000000060a600000) >> >> As you can see the perm gen is 20 m even though I specified 10m on the >> command line. >> >> If I run your program with -XX:MaxMetaspaceSize=20m it passes and does >> not run out of memory. >> >> >> There are no guarantees that you can always just replace MaxPermSize with >> MaxMetaspaceSize. Often it works, but sometimes you have to adjust the >> values. Especially at boundary cases as low as 10m. >> >> Hths, >> Bengt >> >> >> >> Is that expected? It seems similar to >> https://bugs.openjdk.java.net/browse/JDK-8025635 >> >> Thanks, >> Brian >> >> // uses Guava's CacheBuilder >> public class Main { >> >> public static void main(String[] args) throws Exception { >> Cache cache = CacheBuilder.newBuilder() >> .softValues() >> .build(); >> >> for (int i = 0; i < 50_000; i++) { >> URL[] dummyUrls = {new URL("file:" + i + ".jar")}; >> URLClassLoader cl = new URLClassLoader(dummyUrls, >> Thread.currentThread().getContextClassLoader()); >> Object proxy = Proxy.newProxyInstance(cl, new >> Class[]{Foo.class}, new InvocationHandler() { >> @Override >> public Object invoke(Object proxy, Method method, >> Object[] args) throws Throwable { >> return null; >> } >> }); >> cache.put(i, proxy); >> System.out.println(i); >> } >> } >> >> public interface Foo { >> void x(); >> } >> } >> >> >> _______________________________________________ >> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brianfromoregon at gmail.com Fri May 15 15:40:07 2015 From: brianfromoregon at gmail.com (Brian Harris) Date: Fri, 15 May 2015 08:40:07 -0700 Subject: java8 metaspace issue In-Reply-To: References: <554887C7.7080506@oracle.com> Message-ID: This toy example was meant to be a reproduction of real OOME we're getting in JVM where MaxMetaspaceSize is not set and heap dumps suggest uncleared soft references to objects in metaspace as being the cause. If you're right and this really only happens when metaspace is capped super low, then it's a false reproduction. But if there's a deeper problem and the metaspace cap simply reveals in the toy example the same underlying issue we're hitting in prod, I'd hope that could be investigated. On Thu, May 14, 2015 at 5:41 PM, Bernd wrote: > You can try to turn on tracing of inlining and compilation. 
The JITed code > can need space in the meta generation and can be caused by repeating code > till it becomes hot enough. And 10mb is really small, anyway. > > Gruss > Bernd > Am 15.05.2015 02:35 schrieb "Brian Harris" : > >> Yes I should have said OOME instead of 'crash'. Indeed when setting >> -XX:MaxMetaspaceSize=20m the program does not throw OOME. >> >> Appears to be a boundary case jvm bug that this will throw OOME when -XX:MaxMetaspaceSize=10m >> after going through the loop 890 times. Otherwise, how else can the OOME be >> explained given we're using soft refs? >> >> On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson > > wrote: >> >>> >>> Hi Brian, >>> >>> On 2015-05-05 05:17, Brian Harris wrote: >>> >>> Hi, >>> >>> I find that this code crashes in 8u40 after getting up to about 900 >>> when run with -XX:MaxMetaspaceSize=10m. When run in 7u60 with >>> -XX:MaxPermSize=10m it does not crash. >>> >>> >>> Thanks for providing the example program. For me it does not crash but >>> if I run with -XX:MaxMetaspaceSize=10m I get an OutOfMemoryError. Does it >>> crash for you? >>> >>> The OutOfMemoryError can be explained by the fact that when you run with >>> -XX:MaxPermSize=10m there is some aligning going on and in the end you >>> actually end up with a perm gen that is 20m large. Here's what I get when I >>> use >>> >>> $ java -XX:+PrintGCDetails -XX:MaxPermSize=10m >>> >>> Heap >>> PSYoungGen total 150528K, used 10363K [0x0000000758c80000, >>> 0x0000000763400000, 0x0000000800000000) >>> eden space 129536K, 8% used >>> [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000) >>> from space 20992K, 0% used >>> [0x0000000761f80000,0x0000000761f80000,0x0000000763400000) >>> to space 20992K, 0% used >>> [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000) >>> ParOldGen total 342016K, used 0K [0x000000060a600000, >>> 0x000000061f400000, 0x0000000758c80000) >>> object space 342016K, 0% used >>> [0x000000060a600000,0x000000060a600000,0x000000061f400000) >>> PSPermGen total 20480K, used 3382K [0x0000000609200000, >>> 0x000000060a600000, 0x000000060a600000) >>> object space 20480K, 16% used >>> [0x0000000609200000,0x000000060954dae0,0x000000060a600000) >>> >>> As you can see the perm gen is 20 m even though I specified 10m on the >>> command line. >>> >>> If I run your program with -XX:MaxMetaspaceSize=20m it passes and does >>> not run out of memory. >>> >>> >>> There are no guarantees that you can always just replace MaxPermSize >>> with MaxMetaspaceSize. Often it works, but sometimes you have to adjust the >>> values. Especially at boundary cases as low as 10m. >>> >>> Hths, >>> Bengt >>> >>> >>> >>> Is that expected? 
It seems similar to >>> https://bugs.openjdk.java.net/browse/JDK-8025635 >>> >>> Thanks, >>> Brian >>> >>> // uses Guava's CacheBuilder >>> public class Main { >>> >>> public static void main(String[] args) throws Exception { >>> Cache cache = CacheBuilder.newBuilder() >>> .softValues() >>> .build(); >>> >>> for (int i = 0; i < 50_000; i++) { >>> URL[] dummyUrls = {new URL("file:" + i + ".jar")}; >>> URLClassLoader cl = new URLClassLoader(dummyUrls, >>> Thread.currentThread().getContextClassLoader()); >>> Object proxy = Proxy.newProxyInstance(cl, new >>> Class[]{Foo.class}, new InvocationHandler() { >>> @Override >>> public Object invoke(Object proxy, Method method, >>> Object[] args) throws Throwable { >>> return null; >>> } >>> }); >>> cache.put(i, proxy); >>> System.out.println(i); >>> } >>> } >>> >>> public interface Foo { >>> void x(); >>> } >>> } >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Fri May 15 18:20:47 2015 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 15 May 2015 11:20:47 -0700 Subject: java8 metaspace issue In-Reply-To: References: <554887C7.7080506@oracle.com> Message-ID: <555638FF.6010705@oracle.com> Brian, In your first mail you mention 7u60 and 8u40. With 7u60 you must be setting MaxPermSize. What value are you using and you are not seeing any problems. A sample GC log with -XX:+PrintGCDetails and -XX:+PrintHeapAtGC would help us understand the 7u60 behavior. With 8u40 you are not setting MaxMetaspaceSize yet you are seeing an OOME with Metaspace as the cause? A GC log would be most helpful there also Jon On 5/15/2015 8:40 AM, Brian Harris wrote: > This toy example was meant to be a reproduction of real OOME we're > getting in JVM where MaxMetaspaceSize is not set and heap dumps > suggest uncleared soft references to objects in metaspace as being the > cause. > If you're right and this really only happens when metaspace is capped > super low, then it's a false reproduction. But if there's a deeper > problem and the metaspace cap simply reveals in the toy example the > same underlying issue we're hitting in prod, I'd hope that could be > investigated. > > On Thu, May 14, 2015 at 5:41 PM, Bernd > wrote: > > You can try to turn on tracing of inlining and compilation. The > JITed code can need space in the meta generation and can be caused > by repeating code till it becomes hot enough. And 10mb is really > small, anyway. > > Gruss > Bernd > > Am 15.05.2015 02:35 schrieb "Brian Harris" > >: > > Yes I should have said OOME instead of 'crash'. Indeed when > setting -XX:MaxMetaspaceSize=20m the program does not throw OOME. > > Appears to be a boundary case jvm bug that this will throw > OOME when -XX:MaxMetaspaceSize=10m after going through the > loop 890 times. Otherwise, how else can the OOME be explained > given we're using soft refs? 
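(One way to test the soft-reference hypothesis directly in the toy program is to watch the Metaspace pool while the cached values are dropped explicitly. The sketch below is only an illustration; it assumes the Main class from earlier in the thread and uses Guava's Cache.invalidateAll() to force the point at which the proxies become unreachable:)

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Probe to drop into Main: print how much Metaspace is in use, e.g. every few
// hundred iterations of the loop, and again after cache.invalidateAll() plus a
// System.gc(). If usage only falls once the soft values are gone, the cached
// proxies really are what keeps their classes (and class loaders) loaded.
public class MetaspaceProbe {
    public static void print(String label) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains("Metaspace")) {
                System.out.println(label + " " + pool.getName() + ": "
                        + pool.getUsage().getUsed() / 1024 + " KB used");
            }
        }
    }
}

// Intended use inside Main.main():
//     MetaspaceProbe.print("i=" + i);      // inside the loop, every so often
//     cache.invalidateAll();               // after the loop
//     System.gc();                         // a hint only; unloading needs a full GC
//     MetaspaceProbe.print("after invalidateAll");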
> > On Tue, May 5, 2015 at 2:05 AM, Bengt Rutisson > > > wrote: > > > Hi Brian, > > On 2015-05-05 05:17, Brian Harris wrote: >> Hi, >> >> I find that this code crashes in 8u40 after getting up to >> about 900 when run with -XX:MaxMetaspaceSize=10m. When >> run in 7u60 with -XX:MaxPermSize=10m it does not crash. > > Thanks for providing the example program. For me it does > not crash but if I run with -XX:MaxMetaspaceSize=10m I get > an OutOfMemoryError. Does it crash for you? > > The OutOfMemoryError can be explained by the fact that > when you run with -XX:MaxPermSize=10m there is some > aligning going on and in the end you actually end up with > a perm gen that is 20m large. Here's what I get when I use > > $ java -XX:+PrintGCDetails -XX:MaxPermSize=10m > > Heap > PSYoungGen total 150528K, used 10363K > [0x0000000758c80000, 0x0000000763400000, 0x0000000800000000) > eden space 129536K, 8% used > [0x0000000758c80000,0x000000075969ed58,0x0000000760b00000) > from space 20992K, 0% used > [0x0000000761f80000,0x0000000761f80000,0x0000000763400000) > to space 20992K, 0% used > [0x0000000760b00000,0x0000000760b00000,0x0000000761f80000) > ParOldGen total 342016K, used 0K > [0x000000060a600000, 0x000000061f400000, 0x0000000758c80000) > object space 342016K, 0% used > [0x000000060a600000,0x000000060a600000,0x000000061f400000) > PSPermGen total 20480K, used 3382K > [0x0000000609200000, 0x000000060a600000, 0x000000060a600000) > object space 20480K, 16% used > [0x0000000609200000,0x000000060954dae0,0x000000060a600000) > > As you can see the perm gen is 20 m even though I > specified 10m on the command line. > > If I run your program with -XX:MaxMetaspaceSize=20m it > passes and does not run out of memory. > > > There are no guarantees that you can always just replace > MaxPermSize with MaxMetaspaceSize. Often it works, but > sometimes you have to adjust the values. Especially at > boundary cases as low as 10m. > > Hths, > Bengt > > >> >> Is that expected? 
It seems similar to >> https://bugs.openjdk.java.net/browse/JDK-8025635 >> >> Thanks, >> Brian >> >> // uses Guava's CacheBuilder >> public class Main { >> public static void main(String[] args) throws Exception { >> Cache cache = >> CacheBuilder.newBuilder() >> .softValues() >> .build(); >> for (int i = 0; i < 50_000; i++) { >> URL[] dummyUrls = {new URL("file:" + i + >> ".jar")}; >> URLClassLoader cl = new >> URLClassLoader(dummyUrls, >> Thread.currentThread().getContextClassLoader()); >> Object proxy = Proxy.newProxyInstance(cl, new >> Class[]{Foo.class}, new InvocationHandler() { >> @Override >> public Object invoke(Object proxy, Method >> method, Object[] args) throws Throwable { >> return null; >> } >> }); >> cache.put(i, proxy); >> System.out.println(i); >> } >> } >> public interface Foo { >> void x(); >> } >> } >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From joyxiong at yahoo.com Wed May 20 17:35:24 2015 From: joyxiong at yahoo.com (Joy Xiong) Date: Wed, 20 May 2015 17:35:24 +0000 (UTC) Subject: Long Reference Processing Time Message-ID: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com> Hi All, I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the application:-XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC? /export/apps/jdk/JDK-1_8_0_5/bin/java With G1, I observe long time processing references. The long reference processing time has two types:?1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time contributing to processing soft references?2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS. Detailed log record is shown as below:1. Processing soft reference takes long time. However, we only have 0 soft reference2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young)Desired survivor size 201326592 bytes, new threshold 15 (max 15)- age ? 1: ? ?6197672 bytes, ? ?6197672 total- age ? 2: ? ? 553864 bytes, ? ?6751536 total- age ? 3: ? ? 321216 bytes, ? ?7072752 total- age ? 4: ? ? 563120 bytes, ? ?7635872 total- age ? 5: ? ? 261920 bytes, ? ?7897792 total- age ? 6: ? ? 265768 bytes, ? ?8163560 total- age ? 7: ? ? 319856 bytes, ? ?8483416 total- age ? 8: ? ? 132328 bytes, ? ?8615744 total- age ? 9: ? ? 153768 bytes, ? ?8769512 total- age ?10: ? ? 194256 bytes, ? ?8963768 total- age ?11: ? ? ?64600 bytes, ? 
?9028368 total- age ?12: ? ? 160208 bytes, ? ?9188576 total- age ?13: ? ? ?69376 bytes, ? ?9257952 total- age ?14: ? ? 151832 bytes, ? ?9409784 total- age ?15: ? ? 186920 bytes, ? ?9596704 total?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms]?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms]?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms]30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs]? ?[Parallel Time: 14.4 ms, GC Workers: 22]? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5]? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1]? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8]? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181]? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0]? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3]? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3]? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0]? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7]? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2]? ?[Code Root Fixup: 0.2 ms]? ?[Code Root Migration: 0.1 ms]? ?[Clear CT: 1.0 ms]? ?[Other: 914.8 ms]? ? ? [Choose CSet: 0.0 ms]? ? ? [Ref Proc: 910.3 ms]? ? ? [Ref Enq: 0.5 ms]? ? ? [Free CSet: 1.7 ms]? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)]?[Times: user=0.29 sys=0.00, real=0.93 secs] 2. Processing weak reference takes long time93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs]? ?[Parallel Time: 33.1 ms, GC Workers: 22]? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7]? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9]? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2]? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3]? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173]? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6]? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0]? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4]? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9]? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3]? ? ? 
[GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5]? ?[Code Root Fixup: 0.2 ms]? ?[Code Root Migration: 0.1 ms]? ?[Clear CT: 0.5 ms]? ?[Other: 11.7 ms]? ? ? [Choose CSet: 0.0 ms]? ? ? [Ref Proc: 7.8 ms]? ? ? [Ref Enq: 0.5 ms]? ? ? [Free CSet: 0.8 ms]? ?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)]?[Times: user=0.66 sys=0.00, real=0.04 secs]2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start]2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped: 0.0652331 seconds2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs]2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start]2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs]2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs]?[Times: user=15.10 sys=0.19, real=1.08 secs] Appreciate your help,-Joy -------------- next part -------------- An HTML attachment was scrubbed... URL: From poonam.bajaj at oracle.com Wed May 20 18:17:58 2015 From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar) Date: Wed, 20 May 2015 11:17:58 -0700 Subject: Long Reference Processing Time In-Reply-To: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com> References: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com> Message-ID: <555CCFD6.50400@oracle.com> Hello Joy, Could you try running with the latest JDK8 update release (8u45). Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since 8u5. Please test with the latest 8u and let us know the results. Thanks, Poonam On 5/20/2015 10:35 AM, Joy Xiong wrote: > * > * > Hi All, > > I recently moved our application from CMS to G1 due to heap > fragmentation. Here are the JVM tunable used for the application: > -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g > -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC > /export/apps/jdk/JDK-1_8_0_5/bin/java > > With G1, I observe long time processing references. The long reference > processing time has two types: > 1) Occur in Young GC phase. The processing time does not make sense to > me, as the majority time is spent on processing soft reference, whose > number is 0. Is there some hidden time contributing to processing soft > references? > 2) Occur in the remark phase during the concurrent phase. Our > application has a large number of weak references, but I don't quite > understand why the processing time is much larger with G1 than with CMS. > > Detailed log record is shown as below: > *1. Processing soft reference takes long time*. 
However, we only have > 0 soft reference > 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation > Pause) (young) > Desired survivor size 201326592 bytes, new threshold 15 (max 15) > - age 1: 6197672 bytes, 6197672 total > - age 2: 553864 bytes, 6751536 total > - age 3: 321216 bytes, 7072752 total > - age 4: 563120 bytes, 7635872 total > - age 5: 261920 bytes, 7897792 total > - age 6: 265768 bytes, 8163560 total > - age 7: 319856 bytes, 8483416 total > - age 8: 132328 bytes, 8615744 total > - age 9: 153768 bytes, 8769512 total > - age 10: 194256 bytes, 8963768 total > - age 11: 64600 bytes, 9028368 total > - age 12: 160208 bytes, 9188576 total > - age 13: 69376 bytes, 9257952 total > - age 14: 151832 bytes, 9409784 total > - age 15: 186920 bytes, 9596704 total > 30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: > 17.67 ms, target pause time: 40.00 ms] > 30271.429: [G1Ergonomics (CSet Construction) add young regions to > CSet, eden: 380 regions, survivors: 2 regions, predicted young region > time: 5.51 ms] > 30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, > eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted > pause time: 27.83 ms, target pause time: 40.00 ms] > 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: > [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 > refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 > secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs] > [Parallel Time: 14.4 ms, GC Workers: 22] > [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: > 30271429.9, Diff: 0.5] > [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: > 6.4, Sum: 120.1] > [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] > [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, > Sum: 1181] > [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: > 0.0, Sum: 0.1] > [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: > 55.3] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: > 7.3] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, > Sum: 2.0] > [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: > 0.6, Sum: 300.7] > [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: > 30271443.5, Diff: 0.2] > [Code Root Fixup: 0.2 ms] > [Code Root Migration: 0.1 ms] > [Clear CT: 1.0 ms] > [Other: 914.8 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 910.3 ms] > [Ref Enq: 0.5 ms] > [Free CSet: 1.7 ms] > [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: > 4588.4M(5120.0M)->1551.4M(5120.0M)] > [Times: user=0.29 sys=0.00, real=0.93 secs] > > *2. 
Processing weak reference takes long time* > 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: > [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 > refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 > secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] > [Parallel Time: 33.1 ms, GC Workers: 22] > [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: > 93967013.6, Diff: 0.7] > [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, > Sum: 122.9] > [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, > Sum: 25.2] > [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] > [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: > 1173] > [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.1] > [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: > 463.0] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] > [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: > 4.9] > [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, > Sum: 709.3] > [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: > 93967045.8, Diff: 0.5] > [Code Root Fixup: 0.2 ms] > [Code Root Migration: 0.1 ms] > [Clear CT: 0.5 ms] > [Other: 11.7 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 7.8 ms] > [Ref Enq: 0.5 ms] > [Free CSet: 0.8 ms] > [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M > Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] > [Times: user=0.66 sys=0.00, real=0.04 secs] > 2015-05-16T13:21:33.478+0000: 93967.057: [GC > concurrent-root-region-scan-start] > 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which > application threads were stopped: 0.0652331 seconds > 2015-05-16T13:21:33.487+0000: 93967.066: [GC > concurrent-root-region-scan-end, 0.0082516 secs] > 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] > 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, > 0.4016735 secs] > 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC > ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: > [*WeakReference, 1430199 refs, 0.7913479 secs*]93968.281: > [FinalReference, 367 refs, 0.0036350 secs]93968.285: > [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak > Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] > [Times: user=15.10 sys=0.19, real=1.08 secs] > > Appreciate your help, > -Joy > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Wed May 20 18:25:40 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Wed, 20 May 2015 11:25:40 -0700 Subject: Long Reference Processing Time In-Reply-To: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com> References: <884245963.3603617.1432143324577.JavaMail.yahoo@mail.yahoo.com> Message-ID: <555CD1A4.5010507@oracle.com> Joy, For the 1st one, there is a bug https://bugs.openjdk.java.net/browse/JDK-8076462 You can try to reduce the ParallelGCThreads. There is not much work anyway. Thanks, Jenny On 5/20/2015 10:35 AM, Joy Xiong wrote: > * > * > Hi All, > > I recently moved our application from CMS to G1 due to heap > fragmentation. 
Here are the JVM tunable used for the application: > -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g > -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC > /export/apps/jdk/JDK-1_8_0_5/bin/java > > With G1, I observe long time processing references. The long reference > processing time has two types: > 1) Occur in Young GC phase. The processing time does not make sense to > me, as the majority time is spent on processing soft reference, whose > number is 0. Is there some hidden time contributing to processing soft > references? > 2) Occur in the remark phase during the concurrent phase. Our > application has a large number of weak references, but I don't quite > understand why the processing time is much larger with G1 than with CMS. > > Detailed log record is shown as below: > *1. Processing soft reference takes long time*. However, we only have > 0 soft reference > 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation > Pause) (young) > Desired survivor size 201326592 bytes, new threshold 15 (max 15) > - age 1: 6197672 bytes, 6197672 total > - age 2: 553864 bytes, 6751536 total > - age 3: 321216 bytes, 7072752 total > - age 4: 563120 bytes, 7635872 total > - age 5: 261920 bytes, 7897792 total > - age 6: 265768 bytes, 8163560 total > - age 7: 319856 bytes, 8483416 total > - age 8: 132328 bytes, 8615744 total > - age 9: 153768 bytes, 8769512 total > - age 10: 194256 bytes, 8963768 total > - age 11: 64600 bytes, 9028368 total > - age 12: 160208 bytes, 9188576 total > - age 13: 69376 bytes, 9257952 total > - age 14: 151832 bytes, 9409784 total > - age 15: 186920 bytes, 9596704 total > 30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: > 17.67 ms, target pause time: 40.00 ms] > 30271.429: [G1Ergonomics (CSet Construction) add young regions to > CSet, eden: 380 regions, survivors: 2 regions, predicted young region > time: 5.51 ms] > 30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, > eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted > pause time: 27.83 ms, target pause time: 40.00 ms] > 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: > [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 > refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 > secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs] > [Parallel Time: 14.4 ms, GC Workers: 22] > [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: > 30271429.9, Diff: 0.5] > [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: > 6.4, Sum: 120.1] > [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] > [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, > Sum: 1181] > [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: > 0.0, Sum: 0.1] > [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: > 55.3] > [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: > 7.3] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, > Sum: 2.0] > [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: > 0.6, Sum: 300.7] > [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: > 30271443.5, Diff: 0.2] > [Code Root Fixup: 0.2 ms] > [Code Root Migration: 0.1 ms] > [Clear CT: 1.0 ms] > [Other: 914.8 ms] > [Choose CSet: 0.0 ms] > 
[Ref Proc: 910.3 ms] > [Ref Enq: 0.5 ms] > [Free CSet: 1.7 ms] > [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: > 4588.4M(5120.0M)->1551.4M(5120.0M)] > [Times: user=0.29 sys=0.00, real=0.93 secs] > > *2. Processing weak reference takes long time* > 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: > [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 > refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 > secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] > [Parallel Time: 33.1 ms, GC Workers: 22] > [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: > 93967013.6, Diff: 0.7] > [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, > Sum: 122.9] > [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, > Sum: 25.2] > [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] > [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: > 1173] > [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.1] > [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: > 463.0] > [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] > [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: > 4.9] > [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, > Sum: 709.3] > [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: > 93967045.8, Diff: 0.5] > [Code Root Fixup: 0.2 ms] > [Code Root Migration: 0.1 ms] > [Clear CT: 0.5 ms] > [Other: 11.7 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 7.8 ms] > [Ref Enq: 0.5 ms] > [Free CSet: 0.8 ms] > [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M > Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] > [Times: user=0.66 sys=0.00, real=0.04 secs] > 2015-05-16T13:21:33.478+0000: 93967.057: [GC > concurrent-root-region-scan-start] > 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which > application threads were stopped: 0.0652331 seconds > 2015-05-16T13:21:33.487+0000: 93967.066: [GC > concurrent-root-region-scan-end, 0.0082516 secs] > 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] > 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, > 0.4016735 secs] > 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC > ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: > [*WeakReference, 1430199 refs, 0.7913479 secs*]93968.281: > [FinalReference, 367 refs, 0.0036350 secs]93968.285: > [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak > Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] > [Times: user=15.10 sys=0.19, real=1.08 secs] > > Appreciate your help, > -Joy > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From joyxiong at yahoo.com Wed May 20 20:17:05 2015 From: joyxiong at yahoo.com (Joy Xiong) Date: Wed, 20 May 2015 20:17:05 +0000 (UTC) Subject: Long Reference Processing Time In-Reply-To: <555CCFD6.50400@oracle.com> References: <555CCFD6.50400@oracle.com> Message-ID: <723782990.3775475.1432153025601.JavaMail.yahoo@mail.yahoo.com> Yu and Poonam, Thank you for your quick response.?In terms of JDK version, we have 8u40 available, so want to check with you how 8u40 differs from 8u45. 
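(For the experiment Jenny suggests, a possible starting point is the same command line as above with a smaller parallel GC thread count -- the value 8 below is only an example to compare against the current 22 -- and with reference-processing logging left on; the trailing ... stands for the application's own options and main class:)

java -server -Xms5g -Xmx5g -XX:+UseG1GC -XX:MaxGCPauseMillis=40 \
     -XX:+ParallelRefProcEnabled -XX:ParallelGCThreads=8 \
     -XX:G1HeapRegionSize=8M -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m \
     -XX:+PrintGCDetails -XX:+PrintReferenceGC ...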
thanks,-Joy On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar wrote: Hello Joy, Could you try running with the latest JDK8 update release (8u45). Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since 8u5. Please test with the latest 8u and let us know the results. Thanks, Poonam On 5/20/2015 10:35 AM, Joy Xiong wrote: Hi All, I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the application: -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC? /export/apps/jdk/JDK-1_8_0_5/bin/java With G1, I observe long time processing references. The long reference processing time has two types:? 1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time contributing to processing soft references? 2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS. Detailed log record is shown as below: 1. Processing soft reference takes long time. However, we only have 0 soft reference 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 201326592 bytes, new threshold 15 (max 15) - age ? 1: ? ?6197672 bytes, ? ?6197672 total - age ? 2: ? ? 553864 bytes, ? ?6751536 total - age ? 3: ? ? 321216 bytes, ? ?7072752 total - age ? 4: ? ? 563120 bytes, ? ?7635872 total - age ? 5: ? ? 261920 bytes, ? ?7897792 total - age ? 6: ? ? 265768 bytes, ? ?8163560 total - age ? 7: ? ? 319856 bytes, ? ?8483416 total - age ? 8: ? ? 132328 bytes, ? ?8615744 total - age ? 9: ? ? 153768 bytes, ? ?8769512 total - age ?10: ? ? 194256 bytes, ? ?8963768 total - age ?11: ? ? ?64600 bytes, ? ?9028368 total - age ?12: ? ? 160208 bytes, ? ?9188576 total - age ?13: ? ? ?69376 bytes, ? ?9257952 total - age ?14: ? ? 151832 bytes, ? ?9409784 total - age ?15: ? ? 186920 bytes, ? ?9596704 total ?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms] ?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms] ?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms] 30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs] ? ?[Parallel Time: 14.4 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5] ? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? 
[Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0] ? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7] ? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 1.0 ms] ? ?[Other: 914.8 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 910.3 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 1.7 ms] ? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)] ?[Times: user=0.29 sys=0.00, real=0.93 secs] 2. Processing weak reference takes long time 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] ? ?[Parallel Time: 33.1 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7] ? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9] ? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9] ? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3] ? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 0.5 ms] ? ?[Other: 11.7 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 7.8 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 0.8 ms] ? 
?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] ?[Times: user=0.66 sys=0.00, real=0.04 secs] 2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start] 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped: 0.0652331 seconds 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs] 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs] 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] ?[Times: user=15.10 sys=0.19, real=1.08 secs] Appreciate your help, -Joy _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From poonam.bajaj at oracle.com Wed May 20 21:26:43 2015 From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar) Date: Wed, 20 May 2015 14:26:43 -0700 Subject: Long Reference Processing Time In-Reply-To: <723782990.3775475.1432153025601.JavaMail.yahoo@mail.yahoo.com> References: <555CCFD6.50400@oracle.com> <723782990.3775475.1432153025601.JavaMail.yahoo@mail.yahoo.com> Message-ID: <555CFC13.10900@oracle.com> Hello Joy, 8u40 is the latest update release that contains new enhancements and bug fixes, and 8u45 is the latest security release that includes security fixes on top of 8u40. So, for your test run I think you can try with 8u40. regards, Poonam On 5/20/2015 1:17 PM, Joy Xiong wrote: > Yu and Poonam, > > Thank you for your quick response. > In terms of JDK version, we have 8u40 available, so want to check with > you how 8u40 differs from 8u45. > > thanks, > -Joy > > > > On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar > wrote: > > > Hello Joy, > > Could you try running with the latest JDK8 update release (8u45). > Looks like you are trying out G1 with 8u5. There have been many > improvements/fixes in G1GC since 8u5. Please test with the latest 8u > and let us know the results. > > Thanks, > Poonam > > On 5/20/2015 10:35 AM, Joy Xiong wrote: >> * >> * >> Hi All, >> >> I recently moved our application from CMS to G1 due to heap >> fragmentation. Here are the JVM tunable used for the application: >> -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled >> -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g >> -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC >> /export/apps/jdk/JDK-1_8_0_5/bin/java >> >> With G1, I observe long time processing references. The long >> reference processing time has two types: >> 1) Occur in Young GC phase. The processing time does not make sense >> to me, as the majority time is spent on processing soft reference, >> whose number is 0. Is there some hidden time contributing to >> processing soft references? >> 2) Occur in the remark phase during the concurrent phase. Our >> application has a large number of weak references, but I don't quite >> understand why the processing time is much larger with G1 than with CMS. 
>> >> Detailed log record is shown as below: >> *1. Processing soft reference takes long time*. However, we only have >> 0 soft reference >> 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation >> Pause) (young) >> Desired survivor size 201326592 bytes, new threshold 15 (max 15) >> - age 1: 6197672 bytes, 6197672 total >> - age 2: 553864 bytes, 6751536 total >> - age 3: 321216 bytes, 7072752 total >> - age 4: 563120 bytes, 7635872 total >> - age 5: 261920 bytes, 7897792 total >> - age 6: 265768 bytes, 8163560 total >> - age 7: 319856 bytes, 8483416 total >> - age 8: 132328 bytes, 8615744 total >> - age 9: 153768 bytes, 8769512 total >> - age 10: 194256 bytes, 8963768 total >> - age 11: 64600 bytes, 9028368 total >> - age 12: 160208 bytes, 9188576 total >> - age 13: 69376 bytes, 9257952 total >> - age 14: 151832 bytes, 9409784 total >> - age 15: 186920 bytes, 9596704 total >> 30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, >> _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: >> 17.67 ms, target pause time: 40.00 ms] >> 30271.429: [G1Ergonomics (CSet Construction) add young regions to >> CSet, eden: 380 regions, survivors: 2 regions, predicted young region >> time: 5.51 ms] >> 30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, >> eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted >> pause time: 27.83 ms, target pause time: 40.00 ms] >> 30271.445: [*SoftReference, 0 refs, 0.9021283 secs*]30272.347: >> [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 >> refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, >> 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], >> 0.9305765 secs] >> [Parallel Time: 14.4 ms, GC Workers: 22] >> [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: >> 30271429.9, Diff: 0.5] >> [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: >> 6.4, Sum: 120.1] >> [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: >> 80.8] >> [Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, >> Sum: 1181] >> [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] >> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: >> 0.0, Sum: 0.1] >> [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, >> Sum: 55.3] >> [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, >> Sum: 7.3] >> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, >> Sum: 2.0] >> [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: >> 0.6, Sum: 300.7] >> [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: >> 30271443.5, Diff: 0.2] >> [Code Root Fixup: 0.2 ms] >> [Code Root Migration: 0.1 ms] >> [Clear CT: 1.0 ms] >> [Other: 914.8 ms] >> [Choose CSet: 0.0 ms] >> [Ref Proc: 910.3 ms] >> [Ref Enq: 0.5 ms] >> [Free CSet: 1.7 ms] >> [Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M >> Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)] >> [Times: user=0.29 sys=0.00, real=0.93 secs] >> >> *2. 
Processing weak reference takes long time* >> 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: >> [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 >> refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 >> secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] >> [Parallel Time: 33.1 ms, GC Workers: 22] >> [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: >> 93967013.6, Diff: 0.7] >> [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: >> 10.4, Sum: 122.9] >> [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: >> 9.2, Sum: 25.2] >> [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: >> 67.3] >> [Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, >> Sum: 1173] >> [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] >> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: >> 0.0, Sum: 0.1] >> [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, >> Sum: 463.0] >> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, >> Sum: 1.4] >> [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, >> Sum: 4.9] >> [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: >> 0.9, Sum: 709.3] >> [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: >> 93967045.8, Diff: 0.5] >> [Code Root Fixup: 0.2 ms] >> [Code Root Migration: 0.1 ms] >> [Clear CT: 0.5 ms] >> [Other: 11.7 ms] >> [Choose CSet: 0.0 ms] >> [Ref Proc: 7.8 ms] >> [Ref Enq: 0.5 ms] >> [Free CSet: 0.8 ms] >> [Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M >> Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] >> [Times: user=0.66 sys=0.00, real=0.04 secs] >> 2015-05-16T13:21:33.478+0000: 93967.057: [GC >> concurrent-root-region-scan-start] >> 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which >> application threads were stopped: 0.0652331 seconds >> 2015-05-16T13:21:33.487+0000: 93967.066: [GC >> concurrent-root-region-scan-end, 0.0082516 secs] >> 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] >> 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, >> 0.4016735 secs] >> 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC >> ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 >> secs]93967.490: [*WeakReference, 1430199 refs, 0.7913479 >> secs*]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: >> [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak >> Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] >> [Times: user=15.10 sys=0.19, real=1.08 secs] >> >> Appreciate your help, >> -Joy >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joyxiong at yahoo.com Wed May 20 22:12:08 2015 From: joyxiong at yahoo.com (Joy Xiong) Date: Wed, 20 May 2015 22:12:08 +0000 (UTC) Subject: Long Reference Processing Time In-Reply-To: <555CFC13.10900@oracle.com> References: <555CFC13.10900@oracle.com> Message-ID: <1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com> Thank you, Poonam. Also is there a way to get more info on weak references, such as the reference name? Our application does not use weak references, so it's likely that the weak references come from the underneath library, and I'd like to know which library is using lots of weak references. 
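(A relatively quick first check, assuming a JDK tool may be attached to the production process: a class histogram such as

    jmap -histo <pid> | grep -i -E 'reference|threadlocal'

lists instance counts per concrete class, and WeakReference subclasses appear under their own names, for example java.lang.ThreadLocal$ThreadLocalMap$Entry. The pid placeholder and the grep filter here are only illustrative, and jmap -histo still pauses the target process briefly.)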
thanks,-Joy On Wednesday, May 20, 2015 2:26 PM, Poonam Bajaj Parhar wrote: Hello Joy, 8u40 is the latest update release that contains new enhancements and bug fixes, and 8u45 is the latest security release that includes security fixes on top of 8u40. So, for your test run I think you can try with 8u40. regards, Poonam On 5/20/2015 1:17 PM, Joy Xiong wrote: Yu and Poonam, Thank you for your quick response.? In terms of JDK version, we have 8u40 available, so want to check with you how 8u40 differs from 8u45. thanks, -Joy On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar wrote: Hello Joy, Could you try running with the latest JDK8 update release (8u45). Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since 8u5. Please test with the latest 8u and let us know the results. Thanks, Poonam On 5/20/2015 10:35 AM, Joy Xiong wrote: Hi All, I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the application: -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC? /export/apps/jdk/JDK-1_8_0_5/bin/java With G1, I observe long time processing references. The long reference processing time has two types:? 1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time contributing to processing soft references? 2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS. Detailed log record is shown as below: 1. Processing soft reference takes long time. However, we only have 0 soft reference 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 201326592 bytes, new threshold 15 (max 15) - age ? 1: ? ?6197672 bytes, ? ?6197672 total - age ? 2: ? ? 553864 bytes, ? ?6751536 total - age ? 3: ? ? 321216 bytes, ? ?7072752 total - age ? 4: ? ? 563120 bytes, ? ?7635872 total - age ? 5: ? ? 261920 bytes, ? ?7897792 total - age ? 6: ? ? 265768 bytes, ? ?8163560 total - age ? 7: ? ? 319856 bytes, ? ?8483416 total - age ? 8: ? ? 132328 bytes, ? ?8615744 total - age ? 9: ? ? 153768 bytes, ? ?8769512 total - age ?10: ? ? 194256 bytes, ? ?8963768 total - age ?11: ? ? ?64600 bytes, ? ?9028368 total - age ?12: ? ? 160208 bytes, ? ?9188576 total - age ?13: ? ? ?69376 bytes, ? ?9257952 total - age ?14: ? ? 151832 bytes, ? ?9409784 total - age ?15: ? ? 186920 bytes, ? ?9596704 total ?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms] ?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms] ?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms] 30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs] ? 
?[Parallel Time: 14.4 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5] ? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0] ? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7] ? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 1.0 ms] ? ?[Other: 914.8 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 910.3 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 1.7 ms] ? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap: 4588.4M(5120.0M)->1551.4M(5120.0M)] ?[Times: user=0.29 sys=0.00, real=0.93 secs] 2. Processing weak reference takes long time 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] ? ?[Parallel Time: 33.1 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7] ? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9] ? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9] ? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3] ? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 0.5 ms] ? ?[Other: 11.7 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 7.8 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 0.8 ms] ? 
?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] ?[Times: user=0.66 sys=0.00, real=0.04 secs] 2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start] 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped: 0.0652331 seconds 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs] 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs] 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] ?[Times: user=15.10 sys=0.19, real=1.08 secs] Appreciate your help, -Joy _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Wed May 20 22:35:21 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Wed, 20 May 2015 15:35:21 -0700 Subject: Long Reference Processing Time In-Reply-To: <1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com> References: <555CFC13.10900@oracle.com> <1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com> Message-ID: <555D0C29.8000301@oracle.com> can you dump the heap and examine it with eclipse mat or some similar tools? Thanks, Jenny On 5/20/2015 3:12 PM, Joy Xiong wrote: > Thank you, Poonam. > > Also is there a way to get more info on weak references, such as the > reference name? Our application does not use weak references, so it's > likely that the weak references come from the underneath library, and > I'd like to know which library is using lots of weak references. > > thanks, > -Joy > > > > On Wednesday, May 20, 2015 2:26 PM, Poonam Bajaj Parhar > wrote: > > > Hello Joy, > > 8u40 is the latest update release that contains new enhancements and > bug fixes, and 8u45 is the latest security release that includes > security fixes on top of 8u40. > > So, for your test run I think you can try with 8u40. > > regards, > Poonam > > On 5/20/2015 1:17 PM, Joy Xiong wrote: >> Yu and Poonam, >> >> Thank you for your quick response. >> In terms of JDK version, we have 8u40 available, so want to check >> with you how 8u40 differs from 8u45. >> >> thanks, >> -Joy >> >> >> >> On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar >> wrote: >> >> >> Hello Joy, >> >> Could you try running with the latest JDK8 update release (8u45). >> Looks like you are trying out G1 with 8u5. There have been many >> improvements/fixes in G1GC since 8u5. Please test with the latest 8u >> and let us know the results. >> >> Thanks, >> Poonam >> >> On 5/20/2015 10:35 AM, Joy Xiong wrote: >>> * >>> * >>> Hi All, >>> >>> I recently moved our application from CMS to G1 due to heap >>> fragmentation. 
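For reference, a sketch of what taking such a dump typically looks like with the standard jmap tool (the output path is only an example); note that the live option triggers a full GC first, which is part of why a dump can be intrusive on a busy production node:

    jmap -dump:live,format=b,file=/tmp/app-heap.hprof <pid>

The resulting .hprof file can then be loaded into Eclipse MAT, which can list the WeakReference instances, show what they refer to, and show what holds on to them.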
>>> [...]
> _______________________________________________
> hotspot-gc-use
mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From joyxiong at yahoo.com Wed May 20 22:41:27 2015 From: joyxiong at yahoo.com (Joy Xiong) Date: Wed, 20 May 2015 22:41:27 +0000 (UTC) Subject: Long Reference Processing Time In-Reply-To: <555D0C29.8000301@oracle.com> References: <555D0C29.8000301@oracle.com> Message-ID: <913945159.3879136.1432161687674.JavaMail.yahoo@mail.yahoo.com> Is there other ways for this? It's a prod environment and it would be too intrusive for a heap dump... -Joy On Wednesday, May 20, 2015 3:35 PM, Yu Zhang wrote: can you dump the heap and examine it with eclipse mat or some similar tools? Thanks, Jenny On 5/20/2015 3:12 PM, Joy Xiong wrote: Thank you, Poonam. Also is there a way to get more info on weak references, such as the reference name? Our application does not use weak references, so it's likely that the weak references come from the underneath library, and I'd like to know which library is using lots of weak references. thanks, -Joy On Wednesday, May 20, 2015 2:26 PM, Poonam Bajaj Parhar wrote: Hello Joy, 8u40 is the latest update release that contains new enhancements and bug fixes, and 8u45 is the latest security release that includes security fixes on top of 8u40. So, for your test run I think you can try with 8u40. regards, Poonam On 5/20/2015 1:17 PM, Joy Xiong wrote: Yu and Poonam, Thank you for your quick response.? In terms of JDK version, we have 8u40 available, so want to check with you how 8u40 differs from 8u45. thanks, -Joy On Wednesday, May 20, 2015 11:18 AM, Poonam Bajaj Parhar wrote: Hello Joy, Could you try running with the latest JDK8 update release (8u45). Looks like you are trying out G1 with 8u5. There have been many improvements/fixes in G1GC since 8u5. Please test with the latest 8u and let us know the results. Thanks, Poonam On 5/20/2015 10:35 AM, Joy Xiong wrote: Hi All, I recently moved our application from CMS to G1 due to heap fragmentation. Here are the JVM tunable used for the application: -XX:MaxGCPauseMillis=40 -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8M -XX:ParallelGCThreads=22 -server -Xms5g -Xmx5g -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:+UseG1GC? /export/apps/jdk/JDK-1_8_0_5/bin/java With G1, I observe long time processing references. The long reference processing time has two types:? 1) Occur in Young GC phase. The processing time does not make sense to me, as the majority time is spent on processing soft reference, whose number is 0. Is there some hidden time contributing to processing soft references? 2) Occur in the remark phase during the concurrent phase. Our application has a large number of weak references, but I don't quite understand why the processing time is much larger with G1 than with CMS. Detailed log record is shown as below: 1. Processing soft reference takes long time. However, we only have 0 soft reference 2015-05-15T19:39:57.849+0000: 30271.428: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 201326592 bytes, new threshold 15 (max 15) - age ? 1: ? ?6197672 bytes, ? ?6197672 total - age ? 2: ? ? 553864 bytes, ? ?6751536 total - age ? 3: ? ? 321216 bytes, ? ?7072752 total - age ? 4: ? ? 563120 bytes, ? ?7635872 total - age ? 5: ? ? 261920 bytes, ? ?7897792 total - age ? 6: ? ? 265768 bytes, ? ?8163560 total - age ? 7: ? ? 319856 bytes, ? ?8483416 total - age ? 8: ? ? 132328 bytes, ? ?8615744 total - age ? 9: ? ? 
153768 bytes, ? ?8769512 total - age ?10: ? ? 194256 bytes, ? ?8963768 total - age ?11: ? ? ?64600 bytes, ? ?9028368 total - age ?12: ? ? 160208 bytes, ? ?9188576 total - age ?13: ? ? ?69376 bytes, ? ?9257952 total - age ?14: ? ? 151832 bytes, ? ?9409784 total - age ?15: ? ? 186920 bytes, ? ?9596704 total ?30271.429: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 13708, predicted base time: 22.33 ms, remaining time: 17.67 ms, target pause time: 40.00 ms] ?30271.429: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 380 regions, survivors: 2 regions, predicted young region time: 5.51 ms] ?30271.429: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 380 regions, survivors: 2 regions, old: 0 regions, predicted pause time: 27.83 ms, target pause time: 40.00 ms] 30271.445: [SoftReference, 0 refs, 0.9021283 secs]30272.347: [WeakReference, 5 refs, 0.0031983 secs]30272.350: [FinalReference, 2 refs, 0.0019730 secs]30272.352: [PhantomReference, 102 refs, 0.0019032 secs]30272.354: [JNI Weak Reference, 0.0000124 secs], 0.9305765 secs] ? ?[Parallel Time: 14.4 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 30271429.4, Avg: 30271429.7, Max: 30271429.9, Diff: 0.5] ? ? ? [Ext Root Scanning (ms): Min: 4.3, Avg: 5.5, Max: 10.8, Diff: 6.4, Sum: 120.1] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.7, Max: 8.7, Diff: 8.7, Sum: 80.8] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.7, Max: 109, Diff: 109, Sum: 1181] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.6, Max: 4.9, Diff: 4.9, Sum: 35.0] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 0.1, Avg: 2.5, Max: 3.1, Diff: 3.1, Sum: 55.3] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 7.3] ? ? ? [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 2.0] ? ? ? [GC Worker Total (ms): Min: 13.4, Avg: 13.7, Max: 14.0, Diff: 0.6, Sum: 300.7] ? ? ? [GC Worker End (ms): Min: 30271443.3, Avg: 30271443.3, Max: 30271443.5, Diff: 0.2] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 1.0 ms] ? ?[Other: 914.8 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 910.3 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 1.7 ms] ? ?[Eden: 3040.0M(3040.0M)->0.0B(240.0M) Survivors: 16.0M->16.0M Heap:4588.4M(5120.0M)->1551.4M(5120.0M)] ?[Times: user=0.29 sys=0.00, real=0.93 secs] 2. Processing weak reference takes long time 93967.047: [SoftReference, 0 refs, 0.0032025 secs]93967.051: [WeakReference, 1 refs, 0.0012743 secs]93967.052: [FinalReference, 2 refs, 0.0010594 secs]93967.053: [PhantomReference, 97 refs, 0.0009133 secs]93967.054: [JNI Weak Reference, 0.0000160 secs], 0.0455414 secs] ? ?[Parallel Time: 33.1 ms, GC Workers: 22] ? ? ? [GC Worker Start (ms): Min: 93967012.9, Avg: 93967013.3, Max: 93967013.6, Diff: 0.7] ? ? ? [Ext Root Scanning (ms): Min: 4.7, Avg: 5.6, Max: 15.1, Diff: 10.4, Sum: 122.9] ? ? ? [Code Root Marking (ms): Min: 0.0, Avg: 1.1, Max: 9.2, Diff: 9.2, Sum: 25.2] ? ? ? [Update RS (ms): Min: 0.0, Avg: 3.1, Max: 4.2, Diff: 4.2, Sum: 67.3] ? ? ? ? ?[Processed Buffers: Min: 0, Avg: 53.3, Max: 125, Diff: 125, Sum: 1173] ? ? ? [Scan RS (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.4, Sum: 24.6] ? ? ? [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] ? ? ? [Object Copy (ms): Min: 16.8, Avg: 21.0, Max: 21.8, Diff: 5.0, Sum: 463.0] ? ? ? [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4] ? ? ? 
[GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.5, Sum: 4.9] ? ? ? [GC Worker Total (ms): Min: 31.8, Avg: 32.2, Max: 32.7, Diff: 0.9, Sum: 709.3] ? ? ? [GC Worker End (ms): Min: 93967045.3, Avg: 93967045.5, Max: 93967045.8, Diff: 0.5] ? ?[Code Root Fixup: 0.2 ms] ? ?[Code Root Migration: 0.1 ms] ? ?[Clear CT: 0.5 ms] ? ?[Other: 11.7 ms] ? ? ? [Choose CSet: 0.0 ms] ? ? ? [Ref Proc: 7.8 ms] ? ? ? [Ref Enq: 0.5 ms] ? ? ? [Free CSet: 0.8 ms] ? ?[Eden: 1696.0M(1696.0M)->0.0B(1544.0M) Survivors: 32.0M->32.0M Heap: 4021.0M(5120.0M)->2321.6M(5120.0M)] ?[Times: user=0.66 sys=0.00, real=0.04 secs] 2015-05-16T13:21:33.478+0000: 93967.057: [GC concurrent-root-region-scan-start] 2015-05-16T13:21:33.479+0000: 93967.058: Total time for which application threads were stopped: 0.0652331 seconds 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-root-region-scan-end, 0.0082516 secs] 2015-05-16T13:21:33.487+0000: 93967.066: [GC concurrent-mark-start] 2015-05-16T13:21:33.888+0000: 93967.467: [GC concurrent-mark-end, 0.4016735 secs] 2015-05-16T13:21:33.905+0000: 93967.484: [GC remark 93967.486: [GC ref-proc93967.486: [SoftReference, 725 refs, 0.0043522 secs]93967.490: [WeakReference, 1430199 refs, 0.7913479 secs]93968.281: [FinalReference, 367 refs, 0.0036350 secs]93968.285: [PhantomReference, 221 refs, 0.0031875 secs]93968.288: [JNI Weak Reference, 0.0001281 secs], 1.0652076 secs], 1.0832167 secs] ?[Times: user=15.10 sys=0.19, real=1.08 secs] Appreciate your help, -Joy _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From simone.bordet at gmail.com Wed May 20 22:49:35 2015 From: simone.bordet at gmail.com (Simone Bordet) Date: Thu, 21 May 2015 00:49:35 +0200 Subject: Long Reference Processing Time In-Reply-To: <555D0C29.8000301@oracle.com> References: <555CFC13.10900@oracle.com> <1791195295.3883071.1432159928570.JavaMail.yahoo@mail.yahoo.com> <555D0C29.8000301@oracle.com> Message-ID: Hi, On Thu, May 21, 2015 at 12:35 AM, Yu Zhang wrote: > can you dump the heap and examine it with eclipse mat or some similar tools? In our case this was not helpful, but your case may be different. > On 5/20/2015 3:12 PM, Joy Xiong wrote: > Also is there a way to get more info on weak references, such as the > reference name? Our application does not use weak references, so it's likely > that the weak references come from the underneath library, and I'd like to > know which library is using lots of weak references. We ended up with this "dirty" trick: https://github.com/jetty-project/weakref-allocation Basically, we modified WeakReference to keep track of allocations and expose them via JMX. Turned out that for our case, ThreadLocal and RMI usage were the most important allocators of WeakReferences. Hope it helps. -- Simone Bordet http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. 
Victoria Livschitz From simone.bordet at gmail.com Wed May 20 22:52:49 2015 From: simone.bordet at gmail.com (Simone Bordet) Date: Thu, 21 May 2015 00:52:49 +0200 Subject: Long Reference Processing Time In-Reply-To: <913945159.3879136.1432161687674.JavaMail.yahoo@mail.yahoo.com> References: <555D0C29.8000301@oracle.com> <913945159.3879136.1432161687674.JavaMail.yahoo@mail.yahoo.com> Message-ID: Hi, On Thu, May 21, 2015 at 12:41 AM, Joy Xiong wrote: > Is there other ways for this? It's a prod environment and it would be too > intrusive for a heap dump... We used our solution in production too, by enabling it for few minutes to collect data (via JMX) and then disabling it until the next restart (also via JMX), where it was removed. Required 2 restarts: one to add the instrumentation, and one to remove it. -- Simone Bordet http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz From joyxiong at yahoo.com Thu May 21 18:32:43 2015 From: joyxiong at yahoo.com (Joy Xiong) Date: Thu, 21 May 2015 18:32:43 +0000 (UTC) Subject: Long Reference Processing Time In-Reply-To: References: Message-ID: <1091555704.4676086.1432233163367.JavaMail.yahoo@mail.yahoo.com> Thank you Simone. On Wednesday, May 20, 2015 3:52 PM, Simone Bordet wrote: Hi, On Thu, May 21, 2015 at 12:41 AM, Joy Xiong wrote: > Is there other ways for this? It's a prod environment and it would be too > intrusive for a heap dump... We used our solution in production too, by enabling it for few minutes to collect data (via JMX) and then disabling it until the next restart (also via JMX), where it was removed. Required 2 restarts: one to add the instrumentation, and one to remove it. -- Simone Bordet http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless.? Victoria Livschitz -------------- next part -------------- An HTML attachment was scrubbed... URL: From jk at codearte.io Mon May 25 08:56:08 2015 From: jk at codearte.io (Jakub Kubrynski) Date: Mon, 25 May 2015 10:56:08 +0200 Subject: G1 young STW time in MBean Message-ID: Hi, is there any possibility to get STW time for G1 young collection through MBean? The duration reported here is IMHO total GC time (concurrent + stw). -- Best regards, Jakub Kubrynski -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Mon May 25 15:37:51 2015 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 25 May 2015 17:37:51 +0200 Subject: G1 young STW time in MBean In-Reply-To: References: Message-ID: <556341CF.9000705@oracle.com> Hi Jakub, The times reported in the MBeans are total GC times. There is currently no way to get STW times through the MBeans. Best regards, /Jesper Jakub Kubrynski skrev den 25/5/15 10:56: > Hi, > > is there any possibility to get STW time for G1 young collection through MBean? > The duration reported here is IMHO total GC time (concurrent + stw). 
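For context, a minimal sketch of the query under discussion, using the standard GarbageCollectorMXBean API (the G1 bean names in the comment are what the JDK typically registers, stated here as an assumption rather than a guarantee):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcMBeanDump {
    public static void main(String[] args) {
        // With -XX:+UseG1GC the collectors are typically named
        // "G1 Young Generation" and "G1 Old Generation".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Cumulative values since JVM start.
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}

As noted above in this thread, these reported times are total GC times, so for G1 they cannot be read as pure stop-the-world pause figures.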
> > -- > Best regards, > Jakub Kubrynski > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From jk at codearte.io Mon May 25 15:43:14 2015 From: jk at codearte.io (Jakub Kubrynski) Date: Mon, 25 May 2015 17:43:14 +0200 Subject: G1 young STW time in MBean In-Reply-To: <556341CF.9000705@oracle.com> References: <556341CF.9000705@oracle.com> Message-ID: Do you know if there is any development planned in this area? Also there is no avgPauseTime, promoted and survived mbean information for G1. The only available substitution I see for now is to get safepoint timings. Best, Jakub Kubrynski 25 maj 2015 17:37 "Jesper Wilhelmsson" napisa?(a): > Hi Jakub, > > The times reported in the MBeans are total GC times. There is currently no > way to get STW times through the MBeans. > > Best regards, > /Jesper > > > Jakub Kubrynski skrev den 25/5/15 10:56: > >> Hi, >> >> is there any possibility to get STW time for G1 young collection through >> MBean? >> The duration reported here is IMHO total GC time (concurrent + stw). >> >> -- >> Best regards, >> Jakub Kubrynski >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Mon May 25 16:28:04 2015 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 25 May 2015 18:28:04 +0200 Subject: G1 young STW time in MBean In-Reply-To: References: <556341CF.9000705@oracle.com> Message-ID: <55634D94.1080301@oracle.com> As far as I know there hasn't been much development done to get G1 to play well with the MBeans. And I don't think there is anything planned in this area. Maybe the serviceability team has other plans for the MBeans. /Jesper Jakub Kubrynski skrev den 25/5/15 17:43: > Do you know if there is any development planned in this area? Also there is no > avgPauseTime, promoted and survived mbean information for G1. The only available > substitution I see for now is to get safepoint timings. > > Best, > Jakub Kubrynski > > 25 maj 2015 17:37 "Jesper Wilhelmsson" > napisa?(a): > > Hi Jakub, > > The times reported in the MBeans are total GC times. There is currently no > way to get STW times through the MBeans. > > Best regards, > /Jesper > > > Jakub Kubrynski skrev den 25/5/15 10:56: > > Hi, > > is there any possibility to get STW time for G1 young collection through > MBean? > The duration reported here is IMHO total GC time (concurrent + stw). > > -- > Best regards, > Jakub Kubrynski > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From jk at codearte.io Mon May 25 20:42:52 2015 From: jk at codearte.io (Jakub Kubrynski) Date: Mon, 25 May 2015 22:42:52 +0200 Subject: G1 young STW time in MBean In-Reply-To: <55634D94.1080301@oracle.com> References: <556341CF.9000705@oracle.com> <55634D94.1080301@oracle.com> Message-ID: Maybe we could propose some JSR about that? Using G1 without proper monitoring is like living on the edge :) Cheers, Jakub 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson : > As far as I know there hasn't been much development done to get G1 to play > well with the MBeans. 
And I don't think there is anything planned in this > area. Maybe the serviceability team has other plans for the MBeans. > /Jesper > > > Jakub Kubrynski skrev den 25/5/15 17:43: > >> Do you know if there is any development planned in this area? Also there >> is no >> avgPauseTime, promoted and survived mbean information for G1. The only >> available >> substitution I see for now is to get safepoint timings. >> >> Best, >> Jakub Kubrynski >> >> 25 maj 2015 17:37 "Jesper Wilhelmsson" > > napisa?(a): >> >> Hi Jakub, >> >> The times reported in the MBeans are total GC times. There is >> currently no >> way to get STW times through the MBeans. >> >> Best regards, >> /Jesper >> >> >> Jakub Kubrynski skrev den 25/5/15 10:56: >> >> Hi, >> >> is there any possibility to get STW time for G1 young collection >> through >> MBean? >> The duration reported here is IMHO total GC time (concurrent + >> stw). >> >> -- >> Best regards, >> Jakub Kubrynski >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net > hotspot-gc-use at openjdk.java.net> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> -- Best regards, Jakub Kubrynski -------------- next part -------------- An HTML attachment was scrubbed... URL: From jesper.wilhelmsson at oracle.com Mon May 25 20:49:12 2015 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Mon, 25 May 2015 22:49:12 +0200 Subject: G1 young STW time in MBean In-Reply-To: References: <556341CF.9000705@oracle.com> <55634D94.1080301@oracle.com> Message-ID: <55638AC8.4080401@oracle.com> Including the serviceability team. There might be plans for G1 monitoring in the future but I don't know too much about what and when. I know there are commercial products that does a pretty good job at G1 monitoring but I don't know if that is an option for you. Best, /Jesper Jakub Kubrynski skrev den 25/5/15 22:42: > Maybe we could propose some JSR about that? Using G1 without proper monitoring > is like living on the edge :) > > Cheers, > Jakub > > 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson >: > > As far as I know there hasn't been much development done to get G1 to play > well with the MBeans. And I don't think there is anything planned in this > area. Maybe the serviceability team has other plans for the MBeans. > /Jesper > > > Jakub Kubrynski skrev den 25/5/15 17:43: > > Do you know if there is any development planned in this area? Also there > is no > avgPauseTime, promoted and survived mbean information for G1. The only > available > substitution I see for now is to get safepoint timings. > > Best, > Jakub Kubrynski > > 25 maj 2015 17:37 "Jesper Wilhelmsson" > >> napisa?(a): > > Hi Jakub, > > The times reported in the MBeans are total GC times. There is > currently no > way to get STW times through the MBeans. > > Best regards, > /Jesper > > > Jakub Kubrynski skrev den 25/5/15 10:56: > > Hi, > > is there any possibility to get STW time for G1 young > collection through > MBean? > The duration reported here is IMHO total GC time (concurrent + > stw). 
> > -- > Best regards, > Jakub Kubrynski > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -- > Best regards, > Jakub Kubrynski From jk at codearte.io Tue May 26 06:22:25 2015 From: jk at codearte.io (Jakub Kubrynski) Date: Tue, 26 May 2015 08:22:25 +0200 Subject: G1 young STW time in MBean In-Reply-To: References: <556341CF.9000705@oracle.com> <55634D94.1080301@oracle.com> <55638AC8.4080401@oracle.com> Message-ID: Jesper already pointed me about JMC. The reason we're not using it is that it cannot be integrated with our production monitoring. It's more problem solving tool than continuous APM. So the question is if Oracle is going to implement some MBeans for G1, and if not maybe we could propose JSR about it? Cheers, Jakub 26 maj 2015 08:17 "Staffan Larsen" napisa?(a): > Try out Java Flight Recorder - it has a lot more data in it. > > /Staffan > > > On 25 maj 2015, at 22:49, Jesper Wilhelmsson < > jesper.wilhelmsson at oracle.com> wrote: > > > > Including the serviceability team. > > > > There might be plans for G1 monitoring in the future but I don't know > too much about what and when. I know there are commercial products that > does a pretty good job at G1 monitoring but I don't know if that is an > option for you. > > > > Best, > > /Jesper > > > > > > Jakub Kubrynski skrev den 25/5/15 22:42: > >> Maybe we could propose some JSR about that? Using G1 without proper > monitoring > >> is like living on the edge :) > >> > >> Cheers, > >> Jakub > >> > >> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson < > jesper.wilhelmsson at oracle.com > >> >: > >> > >> As far as I know there hasn't been much development done to get G1 > to play > >> well with the MBeans. And I don't think there is anything planned in > this > >> area. Maybe the serviceability team has other plans for the MBeans. > >> /Jesper > >> > >> > >> Jakub Kubrynski skrev den 25/5/15 17:43: > >> > >> Do you know if there is any development planned in this area? > Also there > >> is no > >> avgPauseTime, promoted and survived mbean information for G1. > The only > >> available > >> substitution I see for now is to get safepoint timings. > >> > >> Best, > >> Jakub Kubrynski > >> > >> 25 maj 2015 17:37 "Jesper Wilhelmsson" < > jesper.wilhelmsson at oracle.com > >> > >> >> >> napisa?(a): > >> > >> Hi Jakub, > >> > >> The times reported in the MBeans are total GC times. There > is > >> currently no > >> way to get STW times through the MBeans. > >> > >> Best regards, > >> /Jesper > >> > >> > >> Jakub Kubrynski skrev den 25/5/15 10:56: > >> > >> Hi, > >> > >> is there any possibility to get STW time for G1 young > >> collection through > >> MBean? > >> The duration reported here is IMHO total GC time > (concurrent + > >> stw). > >> > >> -- > >> Best regards, > >> Jakub Kubrynski > >> > >> > >> _______________________________________________ > >> hotspot-gc-use mailing list > >> hotspot-gc-use at openjdk.java.net hotspot-gc-use at openjdk.java.net> > >> >> > > >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > >> > >> > >> > >> > >> -- > >> Best regards, > >> Jakub Kubrynski > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From staffan.larsen at oracle.com Tue May 26 06:17:39 2015 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 26 May 2015 08:17:39 +0200 Subject: G1 young STW time in MBean In-Reply-To: <55638AC8.4080401@oracle.com> References: <556341CF.9000705@oracle.com> <55634D94.1080301@oracle.com> <55638AC8.4080401@oracle.com> Message-ID: Try out Java Flight Recorder - it has a lot more data in it. /Staffan > On 25 maj 2015, at 22:49, Jesper Wilhelmsson wrote: > > Including the serviceability team. > > There might be plans for G1 monitoring in the future but I don't know too much about what and when. I know there are commercial products that does a pretty good job at G1 monitoring but I don't know if that is an option for you. > > Best, > /Jesper > > > Jakub Kubrynski skrev den 25/5/15 22:42: >> Maybe we could propose some JSR about that? Using G1 without proper monitoring >> is like living on the edge :) >> >> Cheers, >> Jakub >> >> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson > >: >> >> As far as I know there hasn't been much development done to get G1 to play >> well with the MBeans. And I don't think there is anything planned in this >> area. Maybe the serviceability team has other plans for the MBeans. >> /Jesper >> >> >> Jakub Kubrynski skrev den 25/5/15 17:43: >> >> Do you know if there is any development planned in this area? Also there >> is no >> avgPauseTime, promoted and survived mbean information for G1. The only >> available >> substitution I see for now is to get safepoint timings. >> >> Best, >> Jakub Kubrynski >> >> 25 maj 2015 17:37 "Jesper Wilhelmsson" > >> > >> napisa?(a): >> >> Hi Jakub, >> >> The times reported in the MBeans are total GC times. There is >> currently no >> way to get STW times through the MBeans. >> >> Best regards, >> /Jesper >> >> >> Jakub Kubrynski skrev den 25/5/15 10:56: >> >> Hi, >> >> is there any possibility to get STW time for G1 young >> collection through >> MBean? >> The duration reported here is IMHO total GC time (concurrent + >> stw). >> >> -- >> Best regards, >> Jakub Kubrynski >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> > > >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> >> -- >> Best regards, >> Jakub Kubrynski From staffan.larsen at oracle.com Tue May 26 06:26:31 2015 From: staffan.larsen at oracle.com (Staffan Larsen) Date: Tue, 26 May 2015 08:26:31 +0200 Subject: G1 young STW time in MBean In-Reply-To: References: <556341CF.9000705@oracle.com> <55634D94.1080301@oracle.com> <55638AC8.4080401@oracle.com> Message-ID: > On 26 maj 2015, at 08:22, Jakub Kubrynski wrote: > > Jesper already pointed me about JMC. The reason we're not using it is that it cannot be integrated with our production monitoring. It's more problem solving tool than continuous APM. So the question is if Oracle is going to implement some MBeans for G1, and if not maybe we could propose JSR about it? > I don?t think there are any open issues or plans for this, so contributions are always welcome! /Staffan > Cheers, > Jakub > > 26 maj 2015 08:17 "Staffan Larsen" > napisa?(a): > Try out Java Flight Recorder - it has a lot more data in it. > > /Staffan > > > On 25 maj 2015, at 22:49, Jesper Wilhelmsson > wrote: > > > > Including the serviceability team. > > > > There might be plans for G1 monitoring in the future but I don't know too much about what and when. 
I know there are commercial products that does a pretty good job at G1 monitoring but I don't know if that is an option for you. > > > > Best, > > /Jesper > > > > > > Jakub Kubrynski skrev den 25/5/15 22:42: > >> Maybe we could propose some JSR about that? Using G1 without proper monitoring > >> is like living on the edge :) > >> > >> Cheers, > >> Jakub > >> > >> 2015-05-25 18:28 GMT+02:00 Jesper Wilhelmsson > >> >>: > >> > >> As far as I know there hasn't been much development done to get G1 to play > >> well with the MBeans. And I don't think there is anything planned in this > >> area. Maybe the serviceability team has other plans for the MBeans. > >> /Jesper > >> > >> > >> Jakub Kubrynski skrev den 25/5/15 17:43: > >> > >> Do you know if there is any development planned in this area? Also there > >> is no > >> avgPauseTime, promoted and survived mbean information for G1. The only > >> available > >> substitution I see for now is to get safepoint timings. > >> > >> Best, > >> Jakub Kubrynski > >> > >> 25 maj 2015 17:37 "Jesper Wilhelmsson" > >> > > >> > >> >>> napisa?(a): > >> > >> Hi Jakub, > >> > >> The times reported in the MBeans are total GC times. There is > >> currently no > >> way to get STW times through the MBeans. > >> > >> Best regards, > >> /Jesper > >> > >> > >> Jakub Kubrynski skrev den 25/5/15 10:56: > >> > >> Hi, > >> > >> is there any possibility to get STW time for G1 young > >> collection through > >> MBean? > >> The duration reported here is IMHO total GC time (concurrent + > >> stw). > >> > >> -- > >> Best regards, > >> Jakub Kubrynski > >> > >> > >> _______________________________________________ > >> hotspot-gc-use mailing list > >> hotspot-gc-use at openjdk.java.net > > >> > >> >> > >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > >> > >> > >> > >> > >> -- > >> Best regards, > >> Jakub Kubrynski > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason.goetz at gmail.com Thu May 28 17:58:27 2015 From: jason.goetz at gmail.com (Jason Goetz) Date: Thu, 28 May 2015 10:58:27 -0700 Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC Message-ID: We're consistently seeing a situation where threads take a few seconds to stop for a routine GC. For 20 straight minutes the GC will run right away (it runs about every second). But then, during a 20-minute period, the threads will take longer to stop for GC. See the GC output below. 
2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908 seconds 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106 seconds 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370 secs] 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], 0.0709271 secs] [Times: user=0.73 sys=0.00, real=0.08 secs] 2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032 seconds 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), 0.0451098 secs] [Times: user=0.61 sys=0.00, real=0.05 secs] 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950 seconds Turning on safepoint logging reveals that these stopping threads times are taken up by safepoint ?sync? time. Taking thread dumps every second around these pauses fail to show anything of note happening during this time, but it?s my understanding that native code won?t necessarily show up in thread dumps anyway given that they exit before the JVM reaches a safepoint. Enabling PrintJNIGCStalls fails to show any logging around the 3 second pause seen above. I highly suspected JNI but was surprise that I didn?t see any logging about JNI Weak References after turning that option on. Any ideas for what I can try next? We?re using JDK 7u80. Here are the rest of my JVM settings: DisableExplicitGC true FlightRecorder true GCLogFileSize 52428800 ManagementServer true MinHeapSize 25769803776 MaxHeapSize 25769803776 MaxPermSize 536870912 NumberOfGCLogFiles 10 PrintAdaptiveSizePolicy true PrintGC true PrintGCApplicationStoppedTime true PrintGCCause true PrintGCDateStamps true PrintGCDetails true PrintGCTimeStamps true PrintSafepointStatistics true PrintSafepointStatisticsCount 1 PrintTenuringDistribution true ReservedCodeCacheSize 268435456 SafepointTimeout true SafepointTimeoutDelay 4000 ThreadStackSize 4096 UnlockCommercialFeatures true UseBiasedLocking false UseGCLogFileRotation false UseG1GC true PrintJNIGCStalls true -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 28 18:17:02 2015 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 28 May 2015 14:17:02 -0400 Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC In-Reply-To: References: Message-ID: Jason, How many java threads are active when these stalls happen? How many CPUs are available to the jvm? How much physical memory on the machine? Is your jvm sole occupant of the machine or do you have noisy neighbors? You mentioned JNI - do you have a lot of JNI calls around these times? Do you allocate and/or write to large arrays/memory regions? Is there something different/interesting about these 20 min periods (e.g. workload increases, same time of day, more disk activity, any paging/swap activity, etc). sent from my phone On May 28, 2015 1:58 PM, "Jason Goetz" wrote: > We're consistently seeing a situation where threads take a few seconds to > stop for a routine GC. For 20 straight minutes the GC will run right away > (it runs about every second). But then, during a 20-minute period, the > threads will take longer to stop for GC. 
See the GC output below. > > 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application > threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908 > seconds > 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application > threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106 > seconds > 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, > 28.4067370 secs] > 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark > 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], > 0.0709271 secs] > [Times: user=0.73 sys=0.00, real=0.08 secs] > *2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application > threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032 > seconds* > 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), > 0.0451098 secs] > [Times: user=0.61 sys=0.00, real=0.05 secs] > 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application > threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950 > seconds > > Turning on safepoint logging reveals that these stopping threads times are > taken up by safepoint ?sync? time. Taking thread dumps every second around > these pauses fail to show anything of note happening during this time, but > it?s my understanding that native code won?t necessarily show up in thread > dumps anyway given that they exit before the JVM reaches a safepoint. > > Enabling PrintJNIGCStalls fails to show any logging around the 3 second > pause seen above. I highly suspected JNI but was surprise that I didn?t see > any logging about JNI Weak References after turning that option on. Any > ideas for what I can try next? We?re using JDK 7u80. Here are the rest of > my JVM settings: > > DisableExplicitGC true > FlightRecorder true > GCLogFileSize 52428800 > ManagementServer true > MinHeapSize 25769803776 > MaxHeapSize 25769803776 > MaxPermSize 536870912 > NumberOfGCLogFiles 10 > PrintAdaptiveSizePolicy true > PrintGC true > PrintGCApplicationStoppedTime true > PrintGCCause true > PrintGCDateStamps true > PrintGCDetails true > PrintGCTimeStamps true > PrintSafepointStatistics true > PrintSafepointStatisticsCount 1 > PrintTenuringDistribution true > ReservedCodeCacheSize 268435456 > SafepointTimeout true > SafepointTimeoutDelay 4000 > ThreadStackSize 4096 > UnlockCommercialFeatures true > UseBiasedLocking false > UseGCLogFileRotation false > UseG1GC true > PrintJNIGCStalls true > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From poonam.bajaj at oracle.com Thu May 28 19:37:24 2015 From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar) Date: Thu, 28 May 2015 12:37:24 -0700 Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC In-Reply-To: References: Message-ID: <55676E74.8080301@oracle.com> Hello Jason, On 5/28/2015 10:58 AM, Jason Goetz wrote: > We're consistently seeing a situation where threads take a few seconds > to stop for a routine GC. For 20 straight minutes the GC will run > right away (it runs about every second). But then, during a 20-minute > period, the threads will take longer to stop for GC. See the GC output > below. 
>
> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908 seconds
> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106 seconds
> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370 secs]
> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], 0.0709271 secs]
> [Times: user=0.73 sys=0.00, real=0.08 secs]
> *2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032 seconds*
> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), 0.0451098 secs]
> [Times: user=0.61 sys=0.00, real=0.05 secs]
> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950 seconds
>
> Turning on safepoint logging reveals that these stopping threads times are taken up by safepoint 'sync' time. Taking thread dumps every second around these pauses fail to show anything of note happening during this time, but it's my understanding that native code won't necessarily show up in thread dumps anyway given that they exit before the JVM reaches a safepoint.

Could you please share the output collected with the PrintSafepointStatistics option. That may tell us which VM operation is having trouble stopping the threads before starting the actual work.

> Enabling PrintJNIGCStalls fails to show any logging around the 3 second pause seen above.

The PrintJNIGCStalls option logs the events where a GC invocation request is made by one of the application threads but the GC cannot be invoked at that time because one or more application threads are running inside a JNI critical section. So the GC is stalled until the threads come out of the JNI critical section, and as they exit it the GC request is honored. If this option didn't print anything, that means the application didn't encounter any such situation.

Thanks,
Poonam

> I highly suspected JNI but was surprise that I didn't see any logging about JNI Weak References after turning that option on. Any ideas for what I can try next? We're using JDK 7u80. Here are the rest of my JVM settings:
>
> DisableExplicitGC true
> FlightRecorder true
> GCLogFileSize 52428800
> ManagementServer true
> MinHeapSize 25769803776
> MaxHeapSize 25769803776
> MaxPermSize 536870912
> NumberOfGCLogFiles 10
> PrintAdaptiveSizePolicy true
> PrintGC true
> PrintGCApplicationStoppedTime true
> PrintGCCause true
> PrintGCDateStamps true
> PrintGCDetails true
> PrintGCTimeStamps true
> PrintSafepointStatistics true
> PrintSafepointStatisticsCount 1
> PrintTenuringDistribution true
> ReservedCodeCacheSize 268435456
> SafepointTimeout true
> SafepointTimeoutDelay 4000
> ThreadStackSize 4096
> UnlockCommercialFeatures true
> UseBiasedLocking false
> UseGCLogFileRotation false
> UseG1GC true
> PrintJNIGCStalls true
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
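For anyone reproducing this, the per-safepoint breakdown Poonam asks for comes from flags that are already present in Jason's option list; roughly, a command line along these lines on JDK 7/8 should produce it (the trailing "..." stands for the application's remaining options and main class, which are not shown here):

$ java -XX:+PrintGCApplicationStoppedTime -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=4000 ...

With PrintSafepointStatisticsCount=1 a record is printed for every safepoint. In those records the second bracket group is [spin block sync cleanup vmop], where "sync" is the time spent waiting for all Java threads to reach the safepoint (the number that turns out to be elevated in the excerpts later in the thread), and SafepointTimeout together with SafepointTimeoutDelay should additionally report the threads that still had not reached the safepoint after the given number of milliseconds.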
URL: From jason.goetz at gmail.com Fri May 29 16:51:44 2015 From: jason.goetz at gmail.com (Jason Goetz) Date: Fri, 29 May 2015 09:51:44 -0700 Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC In-Reply-To: References: Message-ID: Oops, I did not intend to remove this from the list. Re-added. I?ll take a look at how many RUNNABLE threads are actually blocked in native code. I?ll also look at VSphere to see if I can see anything unusual around resource contention. I?ve grepped the safepoint logs for GenCollectForAllocation, which, as I mentioned before, happen about every second, but only show this long sync times during the mysterious 20-minute period. I?ve taken an excerpt from one of these 20-minute pauses. You can see that for most GC the only time is in vmop and sync time is 0, but during these pauses the ?sync? time takes up the majority of the time. [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count 116546.586: GenCollectForAllocation [ 146 5 7 ] [ 0 0 0 1 50 ] 0 116546.891: GenCollectForAllocation [ 146 2 7 ] [ 0 0 0 2 50 ] 0 116547.969: GenCollectForAllocation [ 145 0 2 ] [ 0 0 0 2 290 ] 0 116549.500: GenCollectForAllocation [ 145 0 0 ] [ 0 0 0 2 67 ] 0 116550.836: GenCollectForAllocation [ 142 0 1 ] [ 0 0 0 1 82 ] 0 116553.398: GenCollectForAllocation [ 142 0 2 ] [ 0 0 0 2 76 ] 0 116555.109: GenCollectForAllocation [ 142 0 0 ] [ 0 0 0 2 84 ] 0 116557.328: GenCollectForAllocation [ 142 0 0 ] [ 0 0 0 2 64 ] 0 116561.992: GenCollectForAllocation [ 143 2 1 ] [ 0 0 523 2 76 ] 1 116567.367: GenCollectForAllocation [ 143 1 0 ] [ 1 0 39 2 104 ] 0 116572.438: GenCollectForAllocation [ 143 4 3 ] [ 0 0 0 2 85 ] 0 116575.977: GenCollectForAllocation [ 144 76 1 ] [ 24 0154039 9 181 ] 0 116731.336: GenCollectForAllocation [ 353 41 5 ] [ 5 0 5 1 101 ] 0 116732.328: GenCollectForAllocation [ 354 5 16 ] [ 2080 0 2115 1 0 ] 1 116736.430: GenCollectForAllocation [ 354 5 9 ] [ 0 0 0 1 81 ] 2 116736.891: GenCollectForAllocation [ 354 0 4 ] [ 0 0 0 4 88 ] 0 116737.305: GenCollectForAllocation [ 354 2 9 ] [ 0 0 0 1 80 ] 0 116737.664: GenCollectForAllocation [ 354 1 8 ] [ 0 0 0 2 65 ] 0 116738.055: GenCollectForAllocation [ 355 1 8 ] [ 0 0 0 1 106 ] 0 116738.797: GenCollectForAllocation [ 354 0 5 ] [ 0 0 2116 2 125 ] 0 116741.523: GenCollectForAllocation [ 353 1 0 ] [ 5 0 502 1 195 ] 0 116743.219: GenCollectForAllocation [ 352 1 5 ] [ 0 0 0 1 0 ] 0 116743.719: GenCollectForAllocation [ 352 1 7 ] [ 0 0 0 1 67 ] 0 116744.266: GenCollectForAllocation [ 352 271 0 ] [ 28 0764563 4 0 ] 0 117509.914: GenCollectForAllocation [ 347 1 2 ] [ 0 0 0 2 166 ] 0 117510.609: GenCollectForAllocation [ 456 84 9 ] [ 8 0 8 2 103 ] 1 117511.305: GenCollectForAllocation [ 479 0 6 ] [ 0 0 0 7 199 ] 0 117512.086: GenCollectForAllocation [ 480 0 2 ] [ 0 0 0 2 192 ] 0 117829.000: GenCollectForAllocation [ 569 0 3 ] [ 0 0 0 2 0 ] 0 117829.000: GenCollectForAllocation [ 569 2 5 ] [ 0 0 0 0 128 ] 0 117829.523: GenCollectForAllocation [ 569 0 6 ] [ 0 0 0 2 84 ] 0 117830.039: GenCollectForAllocation [ 571 0 5 ] [ 0 0 0 2 0 ] 0 117830.781: GenCollectForAllocation [ 571 0 6 ] [ 0 0 0 6 72 ] 0 117831.461: GenCollectForAllocation [ 571 0 4 ] [ 0 0 0 1 0 ] 0 117831.469: GenCollectForAllocation [ 571 0 3 ] [ 0 0 0 0 113 ] 0 From: Vitaly Davidovich Date: Thursday, May 28, 2015 at 4:20 PM To: Jason Goetz Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC Jason, Not sure if you meant to reply just to me, but you did :) So I suspect 
the RUNNABLE you list is what jstack gives you, which is slightly a lie since it'll show some threads blocked in native code as RUNNABLE. The fact that you're on a VM is biasing me towards looking at that angle. If there's a spike in runnable (from kernel scheduler standpoint) threads and/or contention for resources, and it's driven by hypervisor, I wonder if there're any artifacts in that. I don't have much experience running servers on VMs (only bare metal), so hard to say. You may want to reply to the list again and see if anyone else has more insight into this type of setup. Also, Poonam asked for safepoint statistics for the vm ops that were requested -- do you have those? On Thu, May 28, 2015 at 4:20 PM, Jason Goetz wrote: > I?m happy to answer whatever I can. Thanks for taking the time to help. It?s > running on a VM, not bare metal. The exact OS is Windows Server 2008. The > database is running on another machine. There is a very large Lucene index on > the same machine as the application and commits to this index are frequent and > often contended. > > From the thread dumps I took during these pauses (there are several that > happen around minor GCs during these 20-minute periods) I can see the > following stats: > > Dump 1: > Threads: 147 > RUNNABLE: 42 > WAITING: 30 > TIMED_WAITING: 75 > BLOCKED: 0 > > Dump 2: > Threads: 259 > RUNNABLE: 143 > WAITING: 47 > TIMED_WAITING: 62 > BLOCKED: 7 > > The only reason I believe the thread count is higher than usual on the second > dump is that the dump follows a very long pause (69 seconds, all spent in sync > time stopping threads for a safepoint) so I think there were several web > requests that gathered up during this pause and needed to be served. > > As far as Unsafe operations, the only thing I see in thread dumps when I grep > for Unsafe is Unsafe.park operations in threads that are TIMED_WAITING. > > As far as memory allocation, I do have some good profiling of that from the > flight recordings that are taken and have a listing of allocations by thread. > I haven?t been able to see any abnormal allocations happening during the time > of the pauses, and the total amount of memory being allocated is no different > during these pauses. In fact, the amount of memory getting allocated (inside > and outside TLABs) is less during these pauses as I imagine the time that > threads are waiting for a safepoint are taking time away from running code > that allocates memory. > > From: Vitaly Davidovich > Date: Thursday, May 28, 2015 at 12:06 PM > To: Jason Goetz > > Subject: Re: JVM taking a few seconds to reach a safepoint for routine young > gen GC > > Thanks Jason. Is this bare metal Windows or virtualized? Of the 140-200 > active, how many are runnable at the time of the stalls? > > Do you (or any used libs that you know of) use Unsafe for big memcpy style > operations? > > When these spikes occur, how many runnable procs are there on the machine? Is > there scheduling contention perhaps (with Tomcat?)? > > As for JNI, typically, java threads in JNI won't stall threads from sync'ing > on a safepoint. > > Sorry for the spanish inquisition, but may help us figure this out or at least > get a lead. > > On Thu, May 28, 2015 at 2:45 PM, Jason Goetz wrote: >> Vitaly, >> >> We?ve seen 140-200 active threads during the time of the stalls but that?s no >> different than any other time period. There are 12 CPUs available on the JVM >> and there is 24G in the heap, 64G on the machine. 
This is the only JVM >> running on the machine, which runs on a Windows server, and Tomcat is the >> only application of note other than a few monitoring tools (Zabbix, HP Open >> View, VMWare Tools), which I haven?t had the option of turning off). >> >> I?m not sure that JNI is running. We don?t explicitly have any JNI calls >> running, but I?m not sure about whether any of the 3rd-party libraries we use >> have JNI code that I?m unaware of. I haven?t been able to figure out how to >> identify if JNI calls are even running. We have taken several Java Flight >> Recordings around these every-20-minute pauses, but haven?t seen any patterns >> or unusual spikes in disk I/O, thread contention, or any thread activity. >> There is no swapping at all either. >> >> Any other information that I could provide in order to give a clearer picture >> of the system? >> >> Thanks, >> Jason >> >> From: Vitaly Davidovich >> Date: Thursday, May 28, 2015 at 11:17 AM >> To: Jason Goetz >> Cc: hotspot-gc-use >> Subject: Re: JVM taking a few seconds to reach a safepoint for routine young >> gen GC >> >> >> Jason, >> >> How many java threads are active when these stalls happen? How many CPUs are >> available to the jvm? How much physical memory on the machine? Is your jvm >> sole occupant of the machine or do you have noisy neighbors? You mentioned >> JNI - do you have a lot of JNI calls around these times? Do you allocate >> and/or write to large arrays/memory regions? Is there something >> different/interesting about these 20 min periods (e.g. workload increases, >> same time of day, more disk activity, any paging/swap activity, etc). >> >> sent from my phone >> >> On May 28, 2015 1:58 PM, "Jason Goetz" wrote: >>> We're consistently seeing a situation where threads take a few seconds to >>> stop for a routine GC. For 20 straight minutes the GC will run right away >>> (it runs about every second). But then, during a 20-minute period, the >>> threads will take longer to stop for GC. See the GC output below. >>> >>> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application >>> threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908 >>> seconds >>> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application >>> threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106 >>> seconds >>> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370 >>> secs] >>> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark >>> 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], >>> 0.0709271 secs] >>> [Times: user=0.73 sys=0.00, real=0.08 secs] >>> 2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application >>> threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032 >>> seconds >>> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), >>> 0.0451098 secs] >>> [Times: user=0.61 sys=0.00, real=0.05 secs] >>> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application >>> threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950 >>> seconds >>> >>> Turning on safepoint logging reveals that these stopping threads times are >>> taken up by safepoint ?sync? time. Taking thread dumps every second around >>> these pauses fail to show anything of note happening during this time, but >>> it?s my understanding that native code won?t necessarily show up in thread >>> dumps anyway given that they exit before the JVM reaches a safepoint. 
>>> >>> Enabling PrintJNIGCStalls fails to show any logging around the 3 second >>> pause seen above. I highly suspected JNI but was surprise that I didn?t see >>> any logging about JNI Weak References after turning that option on. Any >>> ideas for what I can try next? We?re using JDK 7u80. Here are the rest of my >>> JVM settings: >>> >>> DisableExplicitGC true >>> FlightRecorder true >>> GCLogFileSize 52428800 >>> ManagementServer true >>> MinHeapSize 25769803776 >>> MaxHeapSize 25769803776 >>> MaxPermSize 536870912 >>> NumberOfGCLogFiles 10 >>> PrintAdaptiveSizePolicy true >>> PrintGC true >>> PrintGCApplicationStoppedTime true >>> PrintGCCause true >>> PrintGCDateStamps true >>> PrintGCDetails true >>> PrintGCTimeStamps true >>> PrintSafepointStatistics true >>> PrintSafepointStatisticsCount 1 >>> PrintTenuringDistribution true >>> ReservedCodeCacheSize 268435456 >>> SafepointTimeout true >>> SafepointTimeoutDelay 4000 >>> ThreadStackSize 4096 >>> UnlockCommercialFeatures true >>> UseBiasedLocking false >>> UseGCLogFileRotation false >>> UseG1GC true >>> PrintJNIGCStalls true >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysr1729 at gmail.com Sat May 30 02:14:10 2015 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 29 May 2015 19:14:10 -0700 Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC In-Reply-To: References: Message-ID: Hi Jason -- You mentioned a lucene indexer on the same box. Can you check for correlation between the indexing activity, paging behavior and the incidence of the long safe points? -- ramki ysr1729 > On May 29, 2015, at 09:51, Jason Goetz wrote: > > Oops, I did not intend to remove this from the list. Re-added. > > I?ll take a look at how many RUNNABLE threads are actually blocked in native code. I?ll also look at VSphere to see if I can see anything unusual around resource contention. > > I?ve grepped the safepoint logs for GenCollectForAllocation, which, as I mentioned before, happen about every second, but only show this long sync times during the mysterious 20-minute period. I?ve taken an excerpt from one of these 20-minute pauses. You can see that for most GC the only time is in vmop and sync time is 0, but during these pauses the ?sync? time takes up the majority of the time. 
> > [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > > 116546.586: GenCollectForAllocation [ 146 5 7 ] [ 0 0 0 1 50 ] 0 > 116546.891: GenCollectForAllocation [ 146 2 7 ] [ 0 0 0 2 50 ] 0 > 116547.969: GenCollectForAllocation [ 145 0 2 ] [ 0 0 0 2 290 ] 0 > 116549.500: GenCollectForAllocation [ 145 0 0 ] [ 0 0 0 2 67 ] 0 > 116550.836: GenCollectForAllocation [ 142 0 1 ] [ 0 0 0 1 82 ] 0 > 116553.398: GenCollectForAllocation [ 142 0 2 ] [ 0 0 0 2 76 ] 0 > 116555.109: GenCollectForAllocation [ 142 0 0 ] [ 0 0 0 2 84 ] 0 > 116557.328: GenCollectForAllocation [ 142 0 0 ] [ 0 0 0 2 64 ] 0 > 116561.992: GenCollectForAllocation [ 143 2 1 ] [ 0 0 523 2 76 ] 1 > 116567.367: GenCollectForAllocation [ 143 1 0 ] [ 1 0 39 2 104 ] 0 > 116572.438: GenCollectForAllocation [ 143 4 3 ] [ 0 0 0 2 85 ] 0 > 116575.977: GenCollectForAllocation [ 144 76 1 ] [ 24 0154039 9 181 ] 0 > 116731.336: GenCollectForAllocation [ 353 41 5 ] [ 5 0 5 1 101 ] 0 > 116732.328: GenCollectForAllocation [ 354 5 16 ] [ 2080 0 2115 1 0 ] 1 > 116736.430: GenCollectForAllocation [ 354 5 9 ] [ 0 0 0 1 81 ] 2 > 116736.891: GenCollectForAllocation [ 354 0 4 ] [ 0 0 0 4 88 ] 0 > 116737.305: GenCollectForAllocation [ 354 2 9 ] [ 0 0 0 1 80 ] 0 > 116737.664: GenCollectForAllocation [ 354 1 8 ] [ 0 0 0 2 65 ] 0 > 116738.055: GenCollectForAllocation [ 355 1 8 ] [ 0 0 0 1 106 ] 0 > 116738.797: GenCollectForAllocation [ 354 0 5 ] [ 0 0 2116 2 125 ] 0 > 116741.523: GenCollectForAllocation [ 353 1 0 ] [ 5 0 502 1 195 ] 0 > 116743.219: GenCollectForAllocation [ 352 1 5 ] [ 0 0 0 1 0 ] 0 > 116743.719: GenCollectForAllocation [ 352 1 7 ] [ 0 0 0 1 67 ] 0 > 116744.266: GenCollectForAllocation [ 352 271 0 ] [ 28 0764563 4 0 ] 0 > 117509.914: GenCollectForAllocation [ 347 1 2 ] [ 0 0 0 2 166 ] 0 > 117510.609: GenCollectForAllocation [ 456 84 9 ] [ 8 0 8 2 103 ] 1 > 117511.305: GenCollectForAllocation [ 479 0 6 ] [ 0 0 0 7 199 ] 0 > 117512.086: GenCollectForAllocation [ 480 0 2 ] [ 0 0 0 2 192 ] 0 > 117829.000: GenCollectForAllocation [ 569 0 3 ] [ 0 0 0 2 0 ] 0 > 117829.000: GenCollectForAllocation [ 569 2 5 ] [ 0 0 0 0 128 ] 0 > 117829.523: GenCollectForAllocation [ 569 0 6 ] [ 0 0 0 2 84 ] 0 > 117830.039: GenCollectForAllocation [ 571 0 5 ] [ 0 0 0 2 0 ] 0 > 117830.781: GenCollectForAllocation [ 571 0 6 ] [ 0 0 0 6 72 ] 0 > 117831.461: GenCollectForAllocation [ 571 0 4 ] [ 0 0 0 1 0 ] 0 > 117831.469: GenCollectForAllocation [ 571 0 3 ] [ 0 0 0 0 113 ] 0 > > From: Vitaly Davidovich > Date: Thursday, May 28, 2015 at 4:20 PM > To: Jason Goetz > Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC > > Jason, > > Not sure if you meant to reply just to me, but you did :) > > So I suspect the RUNNABLE you list is what jstack gives you, which is slightly a lie since it'll show some threads blocked in native code as RUNNABLE. > > The fact that you're on a VM is biasing me towards looking at that angle. If there's a spike in runnable (from kernel scheduler standpoint) threads and/or contention for resources, and it's driven by hypervisor, I wonder if there're any artifacts in that. I don't have much experience running servers on VMs (only bare metal), so hard to say. You may want to reply to the list again and see if anyone else has more insight into this type of setup. > > Also, Poonam asked for safepoint statistics for the vm ops that were requested -- do you have those? > >> On Thu, May 28, 2015 at 4:20 PM, Jason Goetz wrote: >> I?m happy to answer whatever I can. 
Thanks for taking the time to help. It?s running on a VM, not bare metal. The exact OS is Windows Server 2008. The database is running on another machine. There is a very large Lucene index on the same machine as the application and commits to this index are frequent and often contended. >> >> From the thread dumps I took during these pauses (there are several that happen around minor GCs during these 20-minute periods) I can see the following stats: >> >> Dump 1: >> Threads: 147 >> RUNNABLE: 42 >> WAITING: 30 >> TIMED_WAITING: 75 >> BLOCKED: 0 >> >> Dump 2: >> Threads: 259 >> RUNNABLE: 143 >> WAITING: 47 >> TIMED_WAITING: 62 >> BLOCKED: 7 >> >> The only reason I believe the thread count is higher than usual on the second dump is that the dump follows a very long pause (69 seconds, all spent in sync time stopping threads for a safepoint) so I think there were several web requests that gathered up during this pause and needed to be served. >> >> As far as Unsafe operations, the only thing I see in thread dumps when I grep for Unsafe is Unsafe.park operations in threads that are TIMED_WAITING. >> >> As far as memory allocation, I do have some good profiling of that from the flight recordings that are taken and have a listing of allocations by thread. I haven?t been able to see any abnormal allocations happening during the time of the pauses, and the total amount of memory being allocated is no different during these pauses. In fact, the amount of memory getting allocated (inside and outside TLABs) is less during these pauses as I imagine the time that threads are waiting for a safepoint are taking time away from running code that allocates memory. >> >> From: Vitaly Davidovich >> Date: Thursday, May 28, 2015 at 12:06 PM >> To: Jason Goetz >> >> Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC >> >> Thanks Jason. Is this bare metal Windows or virtualized? Of the 140-200 active, how many are runnable at the time of the stalls? >> >> Do you (or any used libs that you know of) use Unsafe for big memcpy style operations? >> >> When these spikes occur, how many runnable procs are there on the machine? Is there scheduling contention perhaps (with Tomcat?)? >> >> As for JNI, typically, java threads in JNI won't stall threads from sync'ing on a safepoint. >> >> Sorry for the spanish inquisition, but may help us figure this out or at least get a lead. >> >>> On Thu, May 28, 2015 at 2:45 PM, Jason Goetz wrote: >>> Vitaly, >>> >>> We?ve seen 140-200 active threads during the time of the stalls but that?s no different than any other time period. There are 12 CPUs available on the JVM and there is 24G in the heap, 64G on the machine. This is the only JVM running on the machine, which runs on a Windows server, and Tomcat is the only application of note other than a few monitoring tools (Zabbix, HP Open View, VMWare Tools), which I haven?t had the option of turning off). >>> >>> I?m not sure that JNI is running. We don?t explicitly have any JNI calls running, but I?m not sure about whether any of the 3rd-party libraries we use have JNI code that I?m unaware of. I haven?t been able to figure out how to identify if JNI calls are even running. We have taken several Java Flight Recordings around these every-20-minute pauses, but haven?t seen any patterns or unusual spikes in disk I/O, thread contention, or any thread activity. There is no swapping at all either. >>> >>> Any other information that I could provide in order to give a clearer picture of the system? 
>>> >>> Thanks, >>> Jason >>> >>> From: Vitaly Davidovich >>> Date: Thursday, May 28, 2015 at 11:17 AM >>> To: Jason Goetz >>> Cc: hotspot-gc-use >>> Subject: Re: JVM taking a few seconds to reach a safepoint for routine young gen GC >>> >>> Jason, >>> >>> How many java threads are active when these stalls happen? How many CPUs are available to the jvm? How much physical memory on the machine? Is your jvm sole occupant of the machine or do you have noisy neighbors? You mentioned JNI - do you have a lot of JNI calls around these times? Do you allocate and/or write to large arrays/memory regions? Is there something different/interesting about these 20 min periods (e.g. workload increases, same time of day, more disk activity, any paging/swap activity, etc). >>> >>> sent from my phone >>> >>>> On May 28, 2015 1:58 PM, "Jason Goetz" wrote: >>>> We're consistently seeing a situation where threads take a few seconds to stop for a routine GC. For 20 straight minutes the GC will run right away (it runs about every second). But then, during a 20-minute period, the threads will take longer to stop for GC. See the GC output below. >>>> >>>> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which application threads were stopped: 0.1121233 seconds, Stopping threads took: 0.0000908 seconds >>>> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which application threads were stopped: 0.0019384 seconds, Stopping threads took: 0.0001106 seconds >>>> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, 28.4067370 secs] >>>> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], 0.0709271 secs] >>>> [Times: user=0.73 sys=0.00, real=0.08 secs] >>>> 2015-05-28T12:15:09.864-0500: 54815.466: Total time for which application threads were stopped: 3.2916224 seconds, Stopping threads took: 3.2188032 seconds >>>> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), 0.0451098 secs] >>>> [Times: user=0.61 sys=0.00, real=0.05 secs] >>>> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which application threads were stopped: 0.0459803 seconds, Stopping threads took: 0.0001950 seconds >>>> >>>> Turning on safepoint logging reveals that these stopping threads times are taken up by safepoint ?sync? time. Taking thread dumps every second around these pauses fail to show anything of note happening during this time, but it?s my understanding that native code won?t necessarily show up in thread dumps anyway given that they exit before the JVM reaches a safepoint. >>>> >>>> Enabling PrintJNIGCStalls fails to show any logging around the 3 second pause seen above. I highly suspected JNI but was surprise that I didn?t see any logging about JNI Weak References after turning that option on. Any ideas for what I can try next? We?re using JDK 7u80. 
Here are the rest of
>>>> my JVM settings:
>>>>
>>>> DisableExplicitGC true
>>>> FlightRecorder true
>>>> GCLogFileSize 52428800
>>>> ManagementServer true
>>>> MinHeapSize 25769803776
>>>> MaxHeapSize 25769803776
>>>> MaxPermSize 536870912
>>>> NumberOfGCLogFiles 10
>>>> PrintAdaptiveSizePolicy true
>>>> PrintGC true
>>>> PrintGCApplicationStoppedTime true
>>>> PrintGCCause true
>>>> PrintGCDateStamps true
>>>> PrintGCDetails true
>>>> PrintGCTimeStamps true
>>>> PrintSafepointStatistics true
>>>> PrintSafepointStatisticsCount 1
>>>> PrintTenuringDistribution true
>>>> ReservedCodeCacheSize 268435456
>>>> SafepointTimeout true
>>>> SafepointTimeoutDelay 4000
>>>> ThreadStackSize 4096
>>>> UnlockCommercialFeatures true
>>>> UseBiasedLocking false
>>>> UseGCLogFileRotation false
>>>> UseG1GC true
>>>> PrintJNIGCStalls true
>>>>
>>>> _______________________________________________
>>>> hotspot-gc-use mailing list
>>>> hotspot-gc-use at openjdk.java.net
>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gustav.r.akesson at gmail.com Sat May 30 17:05:08 2015
From: gustav.r.akesson at gmail.com (Gustav Åkesson)
Date: Sat, 30 May 2015 19:05:08 +0200
Subject: JVM taking a few seconds to reach a safepoint for routine young gen GC
In-Reply-To: References: Message-ID:

Hi,

I thought that threads (blocked) in native were not a problem for safepointing? Meaning that these threads could continue to execute/block during an attempt to reach a safepoint, but are denied return to Java-land until safepointing is over.

Best Regards,
Gustav Åkesson

On 30 May 2015 04:14, "Srinivas Ramakrishna" wrote:
> Hi Jason --
>
> You mentioned a lucene indexer on the same box. Can you check for correlation between the indexing activity, paging behavior and the incidence of the long safe points?
>
> -- ramki
>
> ysr1729
>
> On May 29, 2015, at 09:51, Jason Goetz wrote:
>
> Oops, I did not intend to remove this from the list. Re-added.
>
> I'll take a look at how many RUNNABLE threads are actually blocked in native code. I'll also look at VSphere to see if I can see anything unusual around resource contention.
>
> I've grepped the safepoint logs for GenCollectForAllocation, which, as I mentioned before, happen about every second, but only show this long sync times during the mysterious 20-minute period. I've taken an excerpt from one of these 20-minute pauses. You can see that for most GC the only time is in vmop and sync time is 0, but during these pauses the 'sync' time takes up the majority of the time.
> > [threads: total initially_running wait_to_block] [time: spin block sync > cleanup vmop] page_trap_count > > 116546.586: GenCollectForAllocation [ 146 5 > 7 ] [ 0 0 0 1 50 ] 0 > 116546.891: GenCollectForAllocation [ 146 2 > 7 ] [ 0 0 0 2 50 ] 0 > 116547.969: GenCollectForAllocation [ 145 0 > 2 ] [ 0 0 0 2 290 ] 0 > 116549.500: GenCollectForAllocation [ 145 0 > 0 ] [ 0 0 0 2 67 ] 0 > 116550.836: GenCollectForAllocation [ 142 0 > 1 ] [ 0 0 0 1 82 ] 0 > 116553.398: GenCollectForAllocation [ 142 0 > 2 ] [ 0 0 0 2 76 ] 0 > 116555.109: GenCollectForAllocation [ 142 0 > 0 ] [ 0 0 0 2 84 ] 0 > 116557.328: GenCollectForAllocation [ 142 0 > 0 ] [ 0 0 0 2 64 ] 0 > 116561.992: GenCollectForAllocation [ 143 2 > 1 ] [ 0 0 523 2 76 ] 1 > 116567.367: GenCollectForAllocation [ 143 1 > 0 ] [ 1 0 39 2 104 ] 0 > 116572.438: GenCollectForAllocation [ 143 4 > 3 ] [ 0 0 0 2 85 ] 0 > 116575.977: GenCollectForAllocation [ 144 76 > 1 ] [ 24 0154039 9 181 ] 0 > 116731.336: GenCollectForAllocation [ 353 41 > 5 ] [ 5 0 5 1 101 ] 0 > 116732.328: GenCollectForAllocation [ 354 5 > 16 ] [ 2080 0 2115 1 0 ] 1 > 116736.430: GenCollectForAllocation [ 354 5 > 9 ] [ 0 0 0 1 81 ] 2 > 116736.891: GenCollectForAllocation [ 354 0 > 4 ] [ 0 0 0 4 88 ] 0 > 116737.305: GenCollectForAllocation [ 354 2 > 9 ] [ 0 0 0 1 80 ] 0 > 116737.664: GenCollectForAllocation [ 354 1 > 8 ] [ 0 0 0 2 65 ] 0 > 116738.055: GenCollectForAllocation [ 355 1 > 8 ] [ 0 0 0 1 106 ] 0 > 116738.797: GenCollectForAllocation [ 354 0 > 5 ] [ 0 0 2116 2 125 ] 0 > 116741.523: GenCollectForAllocation [ 353 1 > 0 ] [ 5 0 502 1 195 ] 0 > 116743.219: GenCollectForAllocation [ 352 1 > 5 ] [ 0 0 0 1 0 ] 0 > 116743.719: GenCollectForAllocation [ 352 1 > 7 ] [ 0 0 0 1 67 ] 0 > 116744.266: GenCollectForAllocation [ 352 271 > 0 ] [ 28 0764563 4 0 ] 0 > 117509.914: GenCollectForAllocation [ 347 1 > 2 ] [ 0 0 0 2 166 ] 0 > 117510.609: GenCollectForAllocation [ 456 84 > 9 ] [ 8 0 8 2 103 ] 1 > 117511.305: GenCollectForAllocation [ 479 0 > 6 ] [ 0 0 0 7 199 ] 0 > 117512.086: GenCollectForAllocation [ 480 0 > 2 ] [ 0 0 0 2 192 ] 0 > 117829.000: GenCollectForAllocation [ 569 0 > 3 ] [ 0 0 0 2 0 ] 0 > 117829.000: GenCollectForAllocation [ 569 2 > 5 ] [ 0 0 0 0 128 ] 0 > 117829.523: GenCollectForAllocation [ 569 0 > 6 ] [ 0 0 0 2 84 ] 0 > 117830.039: GenCollectForAllocation [ 571 0 > 5 ] [ 0 0 0 2 0 ] 0 > 117830.781: GenCollectForAllocation [ 571 0 > 6 ] [ 0 0 0 6 72 ] 0 > 117831.461: GenCollectForAllocation [ 571 0 > 4 ] [ 0 0 0 1 0 ] 0 > 117831.469: GenCollectForAllocation [ 571 0 > 3 ] [ 0 0 0 0 113 ] 0 > > From: Vitaly Davidovich > Date: Thursday, May 28, 2015 at 4:20 PM > To: Jason Goetz > Subject: Re: JVM taking a few seconds to reach a safepoint for routine > young gen GC > > Jason, > > Not sure if you meant to reply just to me, but you did :) > > So I suspect the RUNNABLE you list is what jstack gives you, which is > slightly a lie since it'll show some threads blocked in native code as > RUNNABLE. > > The fact that you're on a VM is biasing me towards looking at that angle. > If there's a spike in runnable (from kernel scheduler standpoint) threads > and/or contention for resources, and it's driven by hypervisor, I wonder if > there're any artifacts in that. I don't have much experience running > servers on VMs (only bare metal), so hard to say. You may want to reply to > the list again and see if anyone else has more insight into this type of > setup. > > Also, Poonam asked for safepoint statistics for the vm ops that were > requested -- do you have those? 
> > On Thu, May 28, 2015 at 4:20 PM, Jason Goetz > wrote: > >> I?m happy to answer whatever I can. Thanks for taking the time to help. >> It?s running on a VM, not bare metal. The exact OS is Windows Server 2008. >> The database is running on another machine. There is a very large Lucene >> index on the same machine as the application and commits to this index are >> frequent and often contended. >> >> From the thread dumps I took during these pauses (there are several that >> happen around minor GCs during these 20-minute periods) I can see the >> following stats: >> >> Dump 1: >> Threads: 147 >> RUNNABLE: 42 >> WAITING: 30 >> TIMED_WAITING: 75 >> BLOCKED: 0 >> >> Dump 2: >> Threads: 259 >> RUNNABLE: 143 >> WAITING: 47 >> TIMED_WAITING: 62 >> BLOCKED: 7 >> >> The only reason I believe the thread count is higher than usual on the >> second dump is that the dump follows a very long pause (69 seconds, all >> spent in sync time stopping threads for a safepoint) so I think there were >> several web requests that gathered up during this pause and needed to be >> served. >> >> As far as Unsafe operations, the only thing I see in thread dumps when I >> grep for Unsafe is Unsafe.park operations in threads that are >> TIMED_WAITING. >> >> As far as memory allocation, I do have some good profiling of that from >> the flight recordings that are taken and have a listing of allocations by >> thread. I haven?t been able to see any abnormal allocations happening >> during the time of the pauses, and the total amount of memory being >> allocated is no different during these pauses. In fact, the amount of >> memory getting allocated (inside and outside TLABs) is less during these >> pauses as I imagine the time that threads are waiting for a safepoint are >> taking time away from running code that allocates memory. >> >> From: Vitaly Davidovich >> Date: Thursday, May 28, 2015 at 12:06 PM >> To: Jason Goetz >> >> Subject: Re: JVM taking a few seconds to reach a safepoint for routine >> young gen GC >> >> Thanks Jason. Is this bare metal Windows or virtualized? Of the 140-200 >> active, how many are runnable at the time of the stalls? >> >> Do you (or any used libs that you know of) use Unsafe for big memcpy >> style operations? >> >> When these spikes occur, how many runnable procs are there on the >> machine? Is there scheduling contention perhaps (with Tomcat?)? >> >> As for JNI, typically, java threads in JNI won't stall threads from >> sync'ing on a safepoint. >> >> Sorry for the spanish inquisition, but may help us figure this out or at >> least get a lead. >> >> On Thu, May 28, 2015 at 2:45 PM, Jason Goetz >> wrote: >> >>> Vitaly, >>> >>> We?ve seen 140-200 active threads during the time of the stalls but >>> that?s no different than any other time period. There are 12 CPUs available >>> on the JVM and there is 24G in the heap, 64G on the machine. This is the >>> only JVM running on the machine, which runs on a Windows server, and Tomcat >>> is the only application of note other than a few monitoring tools (Zabbix, >>> HP Open View, VMWare Tools), which I haven?t had the option of turning off). >>> >>> I?m not sure that JNI is running. We don?t explicitly have any JNI calls >>> running, but I?m not sure about whether any of the 3rd-party libraries we >>> use have JNI code that I?m unaware of. I haven?t been able to figure out >>> how to identify if JNI calls are even running. 
We have taken several Java >>> Flight Recordings around these every-20-minute pauses, but haven?t seen any >>> patterns or unusual spikes in disk I/O, thread contention, or any thread >>> activity. There is no swapping at all either. >>> >>> Any other information that I could provide in order to give a clearer >>> picture of the system? >>> >>> Thanks, >>> Jason >>> >>> From: Vitaly Davidovich >>> Date: Thursday, May 28, 2015 at 11:17 AM >>> To: Jason Goetz >>> Cc: hotspot-gc-use >>> Subject: Re: JVM taking a few seconds to reach a safepoint for routine >>> young gen GC >>> >>> Jason, >>> >>> How many java threads are active when these stalls happen? How many CPUs >>> are available to the jvm? How much physical memory on the machine? Is your >>> jvm sole occupant of the machine or do you have noisy neighbors? You >>> mentioned JNI - do you have a lot of JNI calls around these times? Do you >>> allocate and/or write to large arrays/memory regions? Is there something >>> different/interesting about these 20 min periods (e.g. workload increases, >>> same time of day, more disk activity, any paging/swap activity, etc). >>> >>> sent from my phone >>> On May 28, 2015 1:58 PM, "Jason Goetz" wrote: >>> >>>> We're consistently seeing a situation where threads take a few seconds >>>> to stop for a routine GC. For 20 straight minutes the GC will run right >>>> away (it runs about every second). But then, during a 20-minute period, the >>>> threads will take longer to stop for GC. See the GC output below. >>>> >>>> 2015-05-28T12:14:51.205-0500: 54796.811: Total time for which >>>> application threads were stopped: 0.1121233 seconds, Stopping threads took: >>>> 0.0000908 seconds >>>> 2015-05-28T12:15:00.331-0500: 54805.930: Total time for which >>>> application threads were stopped: 0.0019384 seconds, Stopping threads took: >>>> 0.0001106 seconds >>>> 2015-05-28T12:15:06.572-0500: 54812.174: [GC concurrent-mark-end, >>>> 28.4067370 secs] >>>> 2015-05-28T12:15:09.786-0500: 54815.395: [GC remark >>>> 2015-05-28T12:15:09.786-0500: 54815.396: [GC ref-proc, 0.0103603 secs], >>>> 0.0709271 secs] >>>> [Times: user=0.73 sys=0.00, real=0.08 secs] >>>> *2015-05-28T12:15:09.864-0500: 54815.466: Total time for which >>>> application threads were stopped: 3.2916224 seconds, Stopping threads took: >>>> 3.2188032 seconds* >>>> 2015-05-28T12:15:09.864-0500: 54815.467: [GC cleanup 20G->20G(30G), >>>> 0.0451098 secs] >>>> [Times: user=0.61 sys=0.00, real=0.05 secs] >>>> 2015-05-28T12:15:09.910-0500: 54815.512: Total time for which >>>> application threads were stopped: 0.0459803 seconds, Stopping threads took: >>>> 0.0001950 seconds >>>> >>>> Turning on safepoint logging reveals that these stopping threads times >>>> are taken up by safepoint ?sync? time. Taking thread dumps every second >>>> around these pauses fail to show anything of note happening during this >>>> time, but it?s my understanding that native code won?t necessarily show up >>>> in thread dumps anyway given that they exit before the JVM reaches a >>>> safepoint. >>>> >>>> Enabling PrintJNIGCStalls fails to show any logging around the 3 second >>>> pause seen above. I highly suspected JNI but was surprise that I didn?t see >>>> any logging about JNI Weak References after turning that option on. Any >>>> ideas for what I can try next? We?re using JDK 7u80. 
Here are the rest of >>>> my JVM settings: >>>> >>>> DisableExplicitGC true >>>> FlightRecorder true >>>> GCLogFileSize 52428800 >>>> ManagementServer true >>>> MinHeapSize 25769803776 >>>> MaxHeapSize 25769803776 >>>> MaxPermSize 536870912 >>>> NumberOfGCLogFiles 10 >>>> PrintAdaptiveSizePolicy true >>>> PrintGC true >>>> PrintGCApplicationStoppedTime true >>>> PrintGCCause true >>>> PrintGCDateStamps true >>>> PrintGCDetails true >>>> PrintGCTimeStamps true >>>> PrintSafepointStatistics true >>>> PrintSafepointStatisticsCount 1 >>>> PrintTenuringDistribution true >>>> ReservedCodeCacheSize 268435456 >>>> SafepointTimeout true >>>> SafepointTimeoutDelay 4000 >>>> ThreadStackSize 4096 >>>> UnlockCommercialFeatures true >>>> UseBiasedLocking false >>>> UseGCLogFileRotation false >>>> UseG1GC true >>>> PrintJNIGCStalls true >>>> >>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: