From rainer.jung at kippdata.de Mon Nov 9 20:23:25 2015
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Mon, 9 Nov 2015 21:23:25 +0100
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
Message-ID: <564100BD.1070002@kippdata.de>

Hi,

after upgrading from 1.7.0_76 to 1.7.0_80 we experience long pauses directly after a GC log rotation.

The pause duration varies due to application and load but is in the range of 6 seconds to 60 seconds. There is no GC involved, i.e. no GC output is written related to these pauses.

Example:

Previous file ends with:

2015-11-09T01:28:36.832+0100: 38461,486: Application time: 5,2840810 seconds
{Heap before GC invocations=7366 (full 8):
 par new generation total 458752K, used 442678K [0xfffffffe00400000, 0xfffffffe20400000, 0xfffffffe20400000)
  eden space 393216K, 100% used [0xfffffffe00400000, 0xfffffffe18400000, 0xfffffffe18400000)
  from space 65536K, 75% used [0xfffffffe18400000, 0xfffffffe1b44d998, 0xfffffffe1c400000)
  to space 65536K, 0% used [0xfffffffe1c400000, 0xfffffffe1c400000, 0xfffffffe20400000)
 concurrent mark-sweep generation total 3670016K, used 1887085K [0xfffffffe20400000, 0xffffffff00400000, 0xffffffff00400000)
 concurrent-mark-sweep perm gen total 524288K, used 453862K [0xffffffff00400000, 0xffffffff20400000, 0xffffffff20400000)
2015-11-09T01:28:36.839+0100: 38461,493: [GC2015-11-09T01:28:36.840+0100: 38461,493: [ParNew
Desired survivor size 33554432 bytes, new threshold 16 (max 31)
- age 1: 2964800 bytes, 2964800 total
- age 2: 2628048 bytes, 5592848 total
- age 3: 1415792 bytes, 7008640 total
- age 4: 1354008 bytes, 8362648 total
- age 5: 1132056 bytes, 9494704 total
- age 6: 1334072 bytes, 10828776 total
- age 7: 1407336 bytes, 12236112 total
- age 8: 3321304 bytes, 15557416 total
- age 9: 1531064 bytes, 17088480 total
- age 10: 2453024 bytes, 19541504 total
- age 11: 2797616 bytes, 22339120 total
- age 12: 1698584 bytes, 24037704 total
- age 13: 1870064 bytes, 25907768 total
- age 14: 2211528 bytes, 28119296 total
- age 15: 3626888 bytes, 31746184 total
: 442678K->37742K(458752K), 0,0802687 secs] 2329763K->1924827K(4128768K), 0,0812120 secs] [Times: user=0,90 sys=0,03, real=0,08 secs]
Heap after GC invocations=7367 (full 8):
 par new generation total 458752K, used 37742K [0xfffffffe00400000, 0xfffffffe20400000, 0xfffffffe20400000)
  eden space 393216K, 0% used [0xfffffffe00400000, 0xfffffffe00400000, 0xfffffffe18400000)
  from space 65536K, 57% used [0xfffffffe1c400000, 0xfffffffe1e8db9a0, 0xfffffffe20400000)
  to space 65536K, 0% used [0xfffffffe18400000, 0xfffffffe18400000, 0xfffffffe1c400000)
 concurrent mark-sweep generation total 3670016K, used 1887085K [0xfffffffe20400000, 0xffffffff00400000, 0xffffffff00400000)
 concurrent-mark-sweep perm gen total 524288K, used 453862K [0xffffffff00400000, 0xffffffff20400000, 0xffffffff20400000)
}
....
2015-11-09T01:28:36.921+0100: 38461,575: Total time for which application threads were stopped: 0,0888232 seconds, Stopping threads took: 0,0005420 seconds
2015-11-09T01:28:59.821+0100: 38484,474: Application time: 0,0002954 seconds
2015-11-09T01:28:59.823+0100: 38484,477: Total time for which application threads were stopped: 0,0026081 seconds, Stopping threads took: 0,0004146 seconds
2015-11-09T01:28:59.824+0100: 38484,477: Application time: 0,0003073 seconds
2015-11-09T01:28:59.826+0100: 38484,480: Total time for which application threads were stopped: 0,0025411 seconds, Stopping threads took: 0,0004064 seconds
2015-11-09T01:28:59.827+0100: 38484,480: Application time: 0,0002885 seconds
2015-11-09 01:28:59 GC log file has reached the maximum size. Saved as ./application/logs-a/mkb_gc.log.2

This output looks normal. Last timestamp is 2015-11-09T01:28:59.827

Now the next file begins:

2015-11-09 01:28:59 GC log file created ./application/logs-a/mkb_gc.log.3
Java HotSpot(TM) 64-Bit Server VM (24.80-b11) for solaris-sparc JRE (1.7.0_80-b15), built on Apr 10 2015 18:47:18 by "" with Sun Studio 12u1
Memory: 8k page, physical 133693440k(14956840k free)
CommandLine flags: -XX:AllocateInstancePrefetchLines=2 -XX:AllocatePrefetchInstr=1 -XX:AllocatePrefetchLines=6 -XX:AllocatePrefetchStyle=3 -XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses -XX:GCLogFileSize=10485760 -XX:InitialHeapSize=4294967296 -XX:MaxHeapSize=4294967296 -XX:MaxNewSize=536870912 -XX:MaxPermSize=536870912 -XX:MaxTenuringThreshold=31 -XX:NewSize=536870912 -XX:NumberOfGCLogFiles=10 -XX:OldPLABSize=16 -XX:ParallelGCThreads=16 -XX:PermSize=536870912 -XX:+PrintGC -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:SurvivorRatio=6 -XX:-UseAdaptiveSizePolicy -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseGCLogFileRotation -XX:+UseParNewGC
2015-11-09T01:29:55.640+0100: 38540,292: Total time for which application threads were stopped: 55,8119519 seconds, Stopping threads took: 0,0003857 seconds
2015-11-09T01:29:55.648+0100: 38540,299: Application time: 0,0076173 seconds

Note the 55.8 seconds pause directly after printing the flags and the consistent timestamp jump from 01:28:59 to 01:29:55. There's no GC output, although verbose GC is active and works. For some other reason there is a very long safepoint. Note also that the time is not due to waiting until the safepoint is reached; at least the log claims that reaching the safepoint only took 0.008 seconds. Also, at that time of day the servers are not very busy.

Does anyone have an idea what happens here? Anything that rings a bell between 1.7.0_76 and 1.7.0_80? Why should there be a long safepoint directly after GC log rotation opened a new file?

I searched the bug parade, but didn't find a good hit. There was also nothing in the change for JDK-7164841 that seemed immediately responsible for a long pause.

Unfortunately this happens on a production system and the first thing was to roll back to the old Java version. Not sure how well this will be reproducible on a test system (will check tomorrow).
Thanks for any hint,

Rainer

From rainer.jung at kippdata.de Tue Nov 10 09:13:35 2015
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Tue, 10 Nov 2015 10:13:35 +0100
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
In-Reply-To: <564100BD.1070002@kippdata.de>
References: <564100BD.1070002@kippdata.de>
Message-ID: <5641B53F.2000709@kippdata.de>

Addition: the longest pause that was experienced was more than 2400 seconds ...

And: platform is Solaris Sparc (T4). But we don't know whether it is platform dependent.

It also happens on test systems, so I'll write a script that calls pstack when the problem is detected, to find out what the threads are doing or where they are hanging.

Regards,

Rainer

Am 09.11.2015 um 21:23 schrieb Rainer Jung:
> [...]
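On a test system, rotation itself can be provoked much more often by shrinking the rotated file size, which makes the pause easy to catch with such a pstack script. The following command line is only a sketch: the heap and file sizes are arbitrary, the main class is a placeholder for any allocation-heavy test program, and the flags are the standard JDK 7 GC-logging options already visible in the flag dump above.

  java -Xms512m -Xmx512m \
       -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
       -XX:+PrintGCApplicationStoppedTime \
       -Xloggc:./gc.log \
       -XX:+UseGCLogFileRotation \
       -XX:NumberOfGCLogFiles=10 \
       -XX:GCLogFileSize=64k \
       SomeAllocatingTestApp

Every "GC log file created" line in the new file is then followed by the VM version and flag dump; if the problem reproduces, an unusually large "Total time for which application threads were stopped" entry should appear right after it, and a pstack of the process taken during that window should show where the VM thread is spending the time.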
From rainer.jung at kippdata.de Tue Nov 10 12:47:34 2015
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Tue, 10 Nov 2015 13:47:34 +0100
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
In-Reply-To: <5641B53F.2000709@kippdata.de>
References: <564100BD.1070002@kippdata.de> <5641B53F.2000709@kippdata.de>
Message-ID: <5641E766.9050306@kippdata.de>

The pause is due to the call "(void) check_addr0(st)" in os::print_memory_info().

The call reads "/proc/self/map". In our case it has for instance 1400 entries, and each read takes about 40 ms.

The same function check_addr0() is also used in os::run_periodic_checks(). Not sure why it is also done directly after each GC log rotation.

Regards,

Rainer

Am 10.11.2015 um 10:13 schrieb Rainer Jung:
> [...]
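The access pattern described here can be reproduced outside the JVM with a few lines of C. The following is only an illustrative sketch, not the HotSpot source: it walks /proc/self/map the same way, one read() per prmap_t entry, so a process with ~1400 mappings pays ~1400 system calls; timing it with ptime against the map file of a large process makes the cost visible. It assumes the Solaris proc(4) interface from <procfs.h>.

  /* Solaris only: read /proc/self/map one prmap_t entry per read().
   * Illustrates the per-entry access pattern described above. */
  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <procfs.h>

  int main(void)
  {
      prmap_t entry;
      int entries = 0;
      int fd = open("/proc/self/map", O_RDONLY);

      if (fd < 0) {
          perror("open /proc/self/map");
          return 1;
      }
      /* One system call per mapping; with many mappings and a slow
       * per-read cost this is where the time goes. */
      while (read(fd, &entry, sizeof(entry)) == (ssize_t)sizeof(entry)) {
          if (entry.pr_vaddr == 0) {
              printf("mapping at address 0, size %luK\n",
                     (unsigned long)(entry.pr_size / 1024));
          }
          entries++;
      }
      close(fd);
      printf("%d map entries read one by one\n", entries);
      return 0;
  }

Pointing the same loop at /proc/<pid>/map of the running Java process, as Rainer did with the extracted check_addr0() code, reproduces the multi-second runtime.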
From rainer.jung at kippdata.de Tue Nov 10 13:24:02 2015
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Tue, 10 Nov 2015 14:24:02 +0100
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
In-Reply-To: <5641E766.9050306@kippdata.de>
References: <564100BD.1070002@kippdata.de> <5641B53F.2000709@kippdata.de> <5641E766.9050306@kippdata.de>
Message-ID: <5641EFF2.4020300@kippdata.de>

Am 10.11.2015 um 13:47 schrieb Rainer Jung:
> The pause is due to the call "(void) check_addr0(st)" in os::print_memory_info().
>
> The call reads "/proc/self/map". In our case it has for instance 1400 entries, and each read takes about 40 ms.
>
> The same function check_addr0() is also used in os::run_periodic_checks(). Not sure why it is also done directly after each GC log rotation.

Note that the Solaris command pmap reads the same file in one go using the pread() call instead of reading lots of entries one by one. Therefore pmap executes very quickly, but check_addr0() does not.

If I copy the check_addr0() code into a separate executable and run it from there, reading the map file of the Java process, I can reproduce the long runtime. So it seems this is unfortunately an example of bad (very inefficient) programming.

The only thing I could not yet analyze is what the difference from 1.7.0_76 is, where the problem doesn't happen. Work in progress.

It looks like we should open a bug?

Regards,

Rainer

Am 10.11.2015 um 10:13 schrieb Rainer Jung:
> [...]
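For comparison, here is a sketch of the bulk approach: the same walk over /proc/self/map, but fetching many prmap_t entries per system call. pmap goes further and, as Rainer notes, reads the whole file in one go with pread(); even the simple batching below already collapses ~1400 reads into a handful. Again this is an illustration under the proc(4) assumptions above, not the pmap or HotSpot source.

  /* Solaris only: read /proc/self/map in large batches instead of
   * one prmap_t per system call. */
  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <procfs.h>

  #define BATCH 256                 /* entries fetched per read() */

  int main(void)
  {
      prmap_t batch[BATCH];
      ssize_t n;
      int i, entries = 0, syscalls = 0;
      int fd = open("/proc/self/map", O_RDONLY);

      if (fd < 0) {
          perror("open /proc/self/map");
          return 1;
      }
      while ((n = read(fd, batch, sizeof(batch))) > 0) {
          syscalls++;
          for (i = 0; i < (int)(n / sizeof(prmap_t)); i++) {
              if (batch[i].pr_vaddr == 0) {
                  printf("mapping at address 0, size %luK\n",
                         (unsigned long)(batch[i].pr_size / 1024));
              }
          }
          entries += (int)(n / sizeof(prmap_t));
      }
      close(fd);
      printf("%d entries in %d read() calls\n", entries, syscalls);
      return 0;
  }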
From ceeaspb at gmail.com Tue Nov 10 13:31:49 2015
From: ceeaspb at gmail.com (Alex Bagehot)
Date: Tue, 10 Nov 2015 13:31:49 +0000
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
In-Reply-To: <5641E766.9050306@kippdata.de>
References: <564100BD.1070002@kippdata.de> <5641B53F.2000709@kippdata.de> <5641E766.9050306@kippdata.de>
Message-ID:

http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6478739 relates (& states it is only for solaris)

On Tue, Nov 10, 2015 at 12:47 PM, Rainer Jung wrote:
> [...]
From rainer.jung at kippdata.de Tue Nov 10 14:53:09 2015
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Tue, 10 Nov 2015 15:53:09 +0100
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
In-Reply-To:
References: <564100BD.1070002@kippdata.de> <5641B53F.2000709@kippdata.de> <5641E766.9050306@kippdata.de>
Message-ID: <564204D5.8080406@kippdata.de>

Am 10.11.2015 um 14:31 schrieb Alex Bagehot:
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6478739 relates (& states it is only for solaris)

Thanks for the explanation. I now found why the problem happens on 80 but not on 76.

Both versions contain the same very inefficient code. But in 76 there was a bug that closed the map file after reading the first entry, because the close() was done inside the read loop instead of after the read loop. That bug was fixed in 80, so that now actually the whole file is read and not only the first item. Only then does the performance bug exhibit itself.

The responsible commit is

http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/diff/e50eb3195734/src/os/solaris/vm/os_solaris.cpp

But the real bug is reading the map file item by item instead of using pread() to read it all at once like pmap does.

I expect the problem to be also in 1.8.0 and 1.9.0 (to be checked).

What's the right list to tell about this hotspot issue?

Regards,

Rainer

> On Tue, Nov 10, 2015 at 12:47 PM, Rainer Jung wrote:
>> [...]
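The effect of the 7u76 bug described here is easy to see in a toy version of the same loop. The sketch below is purely illustrative and is not the code from the linked changeset: closing the descriptor inside the loop body makes the next read() fail, so the loop ends after the first entry, which is why the slow per-entry scan stayed hidden before 7u80.

  /* Solaris only, illustration of the control-flow difference:
   * close() inside the read loop stops after one entry (7u76-like),
   * close() after the loop walks all entries (7u80-like). */
  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <procfs.h>

  static int count_entries(int close_inside_loop)
  {
      prmap_t entry;
      int entries = 0;
      int fd = open("/proc/self/map", O_RDONLY);

      if (fd < 0)
          return -1;
      while (read(fd, &entry, sizeof(entry)) == (ssize_t)sizeof(entry)) {
          entries++;
          if (close_inside_loop)
              close(fd);   /* next read() fails, loop ends after 1 entry */
      }
      if (!close_inside_loop)
          close(fd);
      return entries;
  }

  int main(void)
  {
      printf("close inside loop: %d entries\n", count_entries(1));
      printf("close after loop : %d entries\n", count_entries(0));
      return 0;
  }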
From rainer.jung at kippdata.de Tue Nov 10 15:21:19 2015
From: rainer.jung at kippdata.de (Rainer Jung)
Date: Tue, 10 Nov 2015 16:21:19 +0100
Subject: Long safepoint pause directly after GC log file rotation in 1.7.0_80
In-Reply-To: <564204D5.8080406@kippdata.de>
References: <564100BD.1070002@kippdata.de> <5641B53F.2000709@kippdata.de> <5641E766.9050306@kippdata.de> <564204D5.8080406@kippdata.de>
Message-ID: <56420B6F.5090908@kippdata.de>

Am 10.11.2015 um 15:53 schrieb Rainer Jung:
> [...]
> I expect the problem to be also in 1.8.0 and 1.9.0 (to be checked).

Yes, the same problem is in current 1.8.0 and 1.9.0. For 1.8.0 it was introduced in 1.8.0_20.

> What's the right list to tell about this hotspot issue?
>
> Regards,
>
> Rainer
>
> [...]
Saved as >>>>> ./application/logs-a/mkb_gc.log.2 >>>>> >>>>> This output looks normal. Last timestamp is 2015-11-09T01:28:59.827 >>>>> >>>>> Now the next file begins: >>>>> >>>>> 2015-11-09 01:28:59 GC log file created >>>>> ./application/logs-a/mkb_gc.log.3 >>>>> Java HotSpot(TM) 64-Bit Server VM (24.80-b11) for solaris-sparc JRE >>>>> (1.7.0_80-b15), built on Apr 10 2015 18:47:18 by "" with Sun Studio >>>>> 12u1 >>>>> Memory: 8k page, physical 133693440k(14956840k free) >>>>> CommandLine flags: -XX:AllocateInstancePrefetchLines=2 >>>>> -XX:AllocatePrefetchInstr=1 -XX:AllocatePrefetchLines=6 >>>>> -XX:AllocatePrefetchStyle=3 -XX:+CMSClassUnloadingEnabled >>>>> -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses >>>>> -XX:GCLogFileSize=10485760 -XX:InitialHeapSize=4294967296 >>>>> -XX:MaxHeapSize=4294967296 -XX:MaxNewSize=536870912 >>>>> -XX:MaxPermSize=536870912 -XX:MaxTenuringThreshold=31 >>>>> -XX:NewSize=536870912 -XX:NumberOfGCLogFiles=10 -XX:OldPLABSize=16 >>>>> -XX:ParallelGCThreads=16 -XX:PermSize=536870912 -XX:+PrintGC >>>>> -XX:+PrintGCApplicationConcurrentTime >>>>> -XX:+PrintGCApplicationStoppedTime >>>>> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps >>>>> -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:SurvivorRatio=6 >>>>> -XX:-UseAdaptiveSizePolicy -XX:+UseCompressedOops >>>>> -XX:+UseConcMarkSweepGC -XX:+UseGCLogFileRotation -XX:+UseParNewGC >>>>> 2015-11-09T01:29:55.640+0100: 38540,292: Total time for which >>>>> application threads were stopped: 55,8119519 seconds, Stopping threads >>>>> took: 0,0003857 seconds >>>>> 2015-11-09T01:29:55.648+0100: 38540,299: Application time: 0,0076173 >>>>> seconds >>>>> >>>>> Note the 55.8 seconds pause directly after printing the flags and the >>>>> consistent timestamp jump from 01:28:59 to 01:29:55. There's no GC >>>>> output, although verbose GC is active and works. For some other reason >>>>> there is a very long safepoint. Note also, that the time is not due to >>>>> waiting until the safepoint is reached. At least the log claims that >>>>> reaching the safepoint only took 0.008 seconds. Also at that timeof >>>>> day >>>>> the servers are not very busy. >>>>> >>>>> Is there any idea, what happens here? Anything that rings a bell >>>>> between >>>>> 1.7.0_76 and 1.7.0_80? Why should there be a long safepoint directly >>>>> after GC rotation opened a new file? >>>>> >>>>> I searched the bug parade, but didn't find a good hit. There was also >>>>> nothing in the change for JDK-7164841 that seemed immediately >>>>> responsible for a long pause. >>>>> >>>>> Unfortunately this happens on a production system and the first thing >>>>> was to roll back to the old Java version.Not sure, how good this >>>>> will be >>>>> reproducible on a test system (will check tomorrow). >>>>> >>>>> Thanks for any hint, >>>>> >>>>> Rainer From jun.zhuang at hobsons.com Tue Nov 10 14:35:50 2015 From: jun.zhuang at hobsons.com (Jun Zhuang) Date: Tue, 10 Nov 2015 14:35:50 +0000 Subject: Seeking answer to a GC pattern In-Reply-To: <56411FC4.1030205@oracle.com> References: <56411FC4.1030205@oracle.com> Message-ID: Hi Yu, Yes I have disabled the GC ergonomics after experimenting with many java startup parameter combinations. For all the GC algorithms I have tried, the GC behavior was more or less the same but some were definitely worse than others. Main problems I have are: * The saw-tooth pattern response time as measured by my load testing tool with or without -XX:+AlwaysTenure. 
Right before a full collection, the response time could reach a few seconds depending on the size of the young gen. The bigger the young gen the higher the highs of the response time. * Objects for this application seem to have a long life * Using the parameters I had in the previous email seems to give me the best performance so far. A few examples of the gc log: -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:MaxTenuringThreshold=10 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC 31265.959: [GC 31265.959: [ParNew Desired survivor size 53673984 bytes, new threshold 10 (max 10) - age 1: 1695168 bytes, 1695168 total - age 2: 825984 bytes, 2521152 total - age 3: 823424 bytes, 3344576 total - age 4: 770776 bytes, 4115352 total - age 5: 822064 bytes, 4937416 total - age 6: 816984 bytes, 5754400 total - age 7: 777064 bytes, 6531464 total - age 8: 850344 bytes, 7381808 total - age 9: 836016 bytes, 8217824 total - age 10: 810968 bytes, 9028792 total : 853704K->10926K(943744K), 1.4887200 secs] 1602653K->760749K(4089472K), 1.4889740 secs] [Times: user=5.46 sys=0.04, real=1.49 secs] 31365.452: [GC 31365.453: [ParNew Desired survivor size 53673984 bytes, new threshold 10 (max 10) - age 1: 1925120 bytes, 1925120 total - age 2: 818208 bytes, 2743328 total - age 3: 815128 bytes, 3558456 total - age 4: 819992 bytes, 4378448 total - age 5: 770648 bytes, 5149096 total - age 6: 822064 bytes, 5971160 total - age 7: 816984 bytes, 6788144 total - age 8: 777032 bytes, 7565176 total - age 9: 850344 bytes, 8415520 total - age 10: 836016 bytes, 9251536 total : 849838K->15550K(943744K), 1.4532880 secs] 1599661K->766188K(4089472K), 1.4535550 secs] [Times: user=5.38 sys=0.02, real=1.45 secs] ~~~~~~~~~~~~~~~~~ -server -XX:+UseCompressedOops -XX:+TieredCompilation -XX:ReservedCodeCacheSize=256m -XX:+UseCodeCacheFlushing -XX:+PrintTenuringDistribution -Xms2048m -Xmx4096m -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=126 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:+AlwaysTenure 2417.702: [GC [PSYoungGen: 129024K->0K(130048K)] 2078501K->1954674K(2096128K), 0.1765730 secs] [Times: user=0.64 sys=0.01, real=0.18 secs] 2421.355: [GC [PSYoungGen: 129007K->0K(130048K)] 2083681K->1957745K(2096128K), 0.1654380 secs] [Times: user=0.60 sys=0.01, real=0.17 secs] 2426.403: [GC [PSYoungGen: 129009K->0K(130048K)] 2086754K->1961487K(2096128K), 0.1711790 secs] [Times: user=0.62 sys=0.00, real=0.17 secs] 2426.575: [Full GC [PSYoungGen: 0K->0K(130048K)] [ParOldGen: 1961487K->353838K(1966080K)] 1961487K->353838K(2096128K) [PSPermGen: 103092K->100055K(203712K)], 1.1573820 secs] [Times: user=2.27 sys=0.14, real=1.16 secs] Thanks, Jun From: Yu Zhang [mailto:yu.zhang at oracle.com] Sent: Monday, November 09, 2015 5:36 PM To: Jun Zhuang ; hotspot-gc-use at openjdk.java.net Subject: Re: Seeking answer to a GC pattern Jun, Sorry for the late response. It seems you are disabling the gc ergonomic. Can you explain why? Do you need very low pause time? If you have a gc log, that would be helpful as well. Thanks, Jenny On 10/26/2015 12:33 PM, Jun Zhuang wrote: Hi, When running performance testing for a java web service running on JBOSS, I observed a clear saw-tooth pattern in CPU utilization that closely follows the GC cycles. 
see below: [cid:image001.jpg at 01D11B95.E81C2250] [cid:image002.jpg at 01D11B95.E81C2250] Java startup parameters used: -XX:+TieredCompilation -XX:+PrintTenuringDistribution -Xms2048m -Xmx4096m -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=126 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:+AlwaysTenure With this set of parameters, the young GC pause time ranged from 0.02 to 0.25 secs. When I used 256m for the young gen, the young GC pause time ranged from 0.02 to 0.5 secs. My understanding is that the young GC pause time normally stays fairly stable, I have spent quite some time researching but have yet to find an answer to this behavior. I wonder if people in this distribution list can help me out? Other related info * Server Specs: VM with 4 CPUs and 8 Gb mem * Test setup: * # of Vusers: 100 * Ramp up: 10 mins * Pacing: 5 - 7 secs * I tried with all other available GC algorithms, tenuring thresholds, various sizes of the generations, but the AlwaysTenure parameter seems to work the best so far. [cid:image003.jpg at 01D11B95.E81C2250] [cid:image004.jpg at 01D11B95.E81C2250] Any input will be highly appreciated. Sincerely yours, Jun Zhuang Sr. Performance QA Engineer | Hobsons 513-746-2288 (work) 513-227-7643 (mobile) _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 7083 bytes Desc: image001.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.jpg Type: image/jpeg Size: 11773 bytes Desc: image002.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.jpg Type: image/jpeg Size: 33498 bytes Desc: image003.jpg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.jpg Type: image/jpeg Size: 19704 bytes Desc: image004.jpg URL: From yu.zhang at oracle.com Mon Nov 9 22:35:48 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Mon, 9 Nov 2015 14:35:48 -0800 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: <56411FC4.1030205@oracle.com> Jun, Sorry for the late response. It seems you are disabling the gc ergonomic. Can you explain why? Do you need very low pause time? If you have a gc log, that would be helpful as well. Thanks, Jenny On 10/26/2015 12:33 PM, Jun Zhuang wrote: > > Hi, > > When running performance testing for a java web service running on > JBOSS, I observed a clear saw-tooth pattern in CPU utilization that > closely follows the GC cycles. see below: > > Java startup parameters used: > > -XX:+TieredCompilation -XX:+PrintTenuringDistribution -Xms2048m > -Xmx4096m -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m > -XX:SurvivorRatio=126 -XX:-UseAdaptiveSizePolicy > -XX:+DisableExplicitGC -XX:+AlwaysTenure > > With this set of parameters, the young GC pause time ranged from 0.02 > to 0.25 secs. When I used 256m for the young gen, the young GC pause > time ranged from 0.02 to 0.5 secs. My understanding is that the young > GC pause time normally stays fairly stable, I have spent quite some > time researching but have yet to find an answer to this behavior. I > wonder if people in this distribution list can help me out? 
> > *_Other related info_* > > * Server Specs: VM with 4 CPUs and 8 Gb mem > > * Test setup: > > ?# of Vusers: 100 > > ?Ramp up: 10 mins > > ?Pacing: 5 ? 7 secs > > * I tried with all other available GC algorithms, tenuring thresholds, > various sizes of the generations, but the AlwaysTenure parameter seems > to work the best so far. > > Any input will be highly appreciated. > > Sincerely yours, > > *//* > > */Jun Zhuang/**//* > > /Sr. Performance QA Engineer | Hobsons/// > > /513-746-2288 (work)/// > > /513-227-7643 (mobile)/// > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 7083 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 11773 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 33498 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 19704 bytes Desc: not available URL: From thomas.schatzl at oracle.com Sat Nov 14 14:58:53 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Sat, 14 Nov 2015 15:58:53 +0100 Subject: Seeking answer to a GC pattern In-Reply-To: References: <56411FC4.1030205@oracle.com> Message-ID: <1447513133.2373.18.camel@oracle.com> Hi, > On 10/26/2015 12:33 PM, Jun Zhuang wrote: > Hi, > > When running performance testing for a java web service > running on JBOSS, I observed a clear saw-tooth pattern in CPU > utilization that closely follows the GC cycles. see below: > one issue that could cause this is nepotism (http://www.memorymanagement.org/glossary/n.html) of the objects that continously get promoted to the old gen, that is cyclically cleaned up by a kind of full gc. I.e. over time more and more actually dead objects get promoted to the old gen, but due to how generational gc works, they keep more and more objects in young gen alive. That would also explain why the problem is the same for any gc. The only real solution that you can do is make the application code null out references. Or make sure by proper young gen sizing that such pointers are not created, i.e. no objects that might keep alive many others in the young gen get promoted. That latter is a very brittle "solution" though. It may also just be the application though. One hint whether this is the fault of nepotism is take heap snapshots (e.g. jmap -histo:live should do fine), and look if the amount of live data stays roughly the same when taken at different times of that increase in heap memory (or at least does not increase similarly to the graphs). If it is not nepotism, using heap dumps and to some degree the histogram you may get information on what is keeping stuff alive. Then you know whether it the problem are really dead objects keeping stuff alive in young gen or not. Thanks, Thomas From jun.zhuang at hobsons.com Fri Nov 20 19:46:32 2015 From: jun.zhuang at hobsons.com (Jun Zhuang) Date: Fri, 20 Nov 2015 19:46:32 +0000 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: Hi Srinivas, Thanks for your suggestion. 
I ran a test with the following parameters:

-server -XX:+UseCompressedOops -XX:+TieredCompilation -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2

But the -XX:MaxTenuringThreshold=2 setting does not seem to help anything. I am still seeing a similar GC pattern as with +AlwaysTenure; in fact the young GC time is higher with MTT=2 (getting to 0.5 secs vs. 0.25 with AlwaysTenure).

Unless anyone else can provide another theory, I am convinced that nepotism is at play here. Changing the Java startup parameters can only get me so far; dev will have to look at the code and see what can be done at the code level.

Thanks,
Jun

From: Srinivas Ramakrishna [mailto:ysr1729 at gmail.com]
Sent: Thursday, November 19, 2015 8:09 PM
To: Jun Zhuang
Cc: hotspot-gc-use at openjdk.java.net
Subject: Re: Seeking answer to a GC pattern

Use -XX:MaxTenuringThreshold=2 and you might see better behavior than +AlwaysTenure (which is almost always a very bad choice). That will at least reduce some of the nepotism issues from +AlwaysTenure that Thomas mentions. MTT > 2 is unlikely to help at your current frequency of minor collections since the mortality after age 1 is fairly low (from your tenuring distribution). Worth a quick test.

-- ramki
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From csulyj at gmail.com Sun Nov 22 00:00:53 2015
From: csulyj at gmail.com (yuanjun Li)
Date: Sun, 22 Nov 2015 08:00:53 +0800
Subject: frequent major gc but not free heap?
Message-ID:

After running for several hours, my HTTP server began doing frequent major GCs, but no heap was freed. Several major GCs later, a promotion failure and a concurrent mode failure occurred, and then the heap was freed.
My gc log is below : {Heap before GC invocations=7172 (full 720): par new generation total 737280K, used 667492K [0x000000076b800000, 0x000000079d800000, 0x000000079d800000) eden space 655360K, 100% used [0x000000076b800000, 0x0000000793800000, 0x0000000793800000) from space 81920K, 14% used [0x0000000793800000, 0x00000007943d91d0, 0x0000000798800000) to space 81920K, 0% used [0x0000000798800000, 0x0000000798800000, 0x000000079d800000) concurrent mark-sweep generation total 1482752K, used 1479471K [0x000000079d800000, 0x00000007f8000000, 0x00000007f8000000) concurrent-mark-sweep perm gen total 131072K, used 58091K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)2015-11-19T21:50:02.692+0800: 113963.532: [GC2015-11-19T21:50:02.692+0800: 113963.532: [ParNew (promotion failed)Desired survivor size 41943040 bytes, new threshold 15 (max 15)- age 1: 3826144 bytes, 3826144 total- age 2: 305696 bytes, 4131840 total- age 3: 181416 bytes, 4313256 total- age 4: 940632 bytes, 5253888 total- age 5: 88368 bytes, 5342256 total- age 6: 159840 bytes, 5502096 total- age 7: 733856 bytes, 6235952 total- age 8: 64712 bytes, 6300664 total- age 9: 314304 bytes, 6614968 total- age 10: 587160 bytes, 7202128 total- age 11: 38728 bytes, 7240856 total- age 12: 221160 bytes, 7462016 total- age 13: 648376 bytes, 8110392 total- age 14: 33296 bytes, 8143688 total- age 15: 380768 bytes, 8524456 total: 667492K->665908K(737280K), 0.7665810 secs]2015-11-19T21:50:03.459+0800: 113964.299: [CMS2015-11-19T21:50:05.161+0800: 113966.001: [CMS-concurrent-mark: 3.579/4.747 secs] [Times: user=13.41 sys=0.35, rea l=4.75 secs] (concurrent mode failure): 1479910K->44010K(1482752K), 4.7267420 secs] 2146964K->44010K(2220032K), [CMS Perm : 58091K->57795K(131072K)], 5.4939440 secs] [Times: user=9.07 sys=0.13, real=5.49 secs] Heap after GC invocations=7173 (full 721): par new generation total 737280K, used 0K [0x000000076b800000, 0x000000079d800000, 0x000000079d800000) eden space 655360K, 0% used [0x000000076b800000, 0x000000076b800000, 0x0000000793800000) from space 81920K, 0% used [0x0000000798800000, 0x0000000798800000, 0x000000079d800000) to space 81920K, 0% used [0x0000000793800000, 0x0000000793800000, 0x0000000798800000) concurrent mark-sweep generation total 1482752K, used 44010K [0x000000079d800000, 0x00000007f8000000, 0x00000007f8000000) concurrent-mark-sweep perm gen total 131072K, used 57795K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)} It seems the CMS GC doesn't make any sense. Could you please explain to me ? This is my gc config: -server -Xms2248m -Xmx2248m -Xmn800m -XX:PermSize=128m -XX:MaxPermSize=128m -XX:MaxTenuringThreshold=15 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:gc.log -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+UseFastAccessorMethods -------------- next part -------------- An HTML attachment was scrubbed... URL: From ecki at zusammenkunft.net Mon Nov 23 22:41:25 2015 From: ecki at zusammenkunft.net (Bernd) Date: Mon, 23 Nov 2015 22:41:25 +0000 Subject: frequent major gc but not free heap? In-Reply-To: References: Message-ID: Hello, Can you provide a link to a full log file? >From your description it sounds like the memory was used, so a full GC cant free anything. 
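One quick way to tell whether the old generation is really holding live data (rather than garbage that simply has not been collected yet) is to look at its occupancy right after a stop-the-world collection. A rough, hypothetical self-monitoring sketch using the standard MemoryPoolMXBean API (the pool-name matching and the 60 second interval are just assumptions for illustration, not part of the configuration above):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class OldGenOccupancy {
    public static void main(String[] args) throws Exception {
        while (true) {
            // Request a full collection first; this is only meaningful if the
            // application does not run with -XX:+DisableExplicitGC.
            System.gc();
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                // The pool is called "CMS Old Gen" with the collector used here;
                // other collectors name it "PS Old Gen", "G1 Old Gen", "Tenured Gen", etc.
                if (pool.getName().contains("Old") || pool.getName().contains("Tenured")) {
                    MemoryUsage u = pool.getUsage();
                    System.out.printf("%s: %d MB used of %d MB after GC%n",
                            pool.getName(), u.getUsed() >> 20, u.getMax() >> 20);
                }
            }
            Thread.sleep(60_000);
        }
    }
}

If the number stays close to the capacity even after the forced collection, the data is genuinely live and a heap dump is the next step. If it collapses, as it eventually did in the log above (1479910K->44010K), the old generation was mostly garbage and the question becomes why CMS did not finish its concurrent cycle before the allocation failure.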
This can either be, that it just take some more time until it was unused (requested/task ended or transaction timed out) or the additional tries triggered some soft references or classloaders to be freed. Hard to say without knowing your applications and having details. I would suggest you generate a heap dump next time the heap becomes full (if this is a slow event) or you try to correlate the memory problems with actual jobs or reqest types (if it happens quickly based on some workload) Bernd yuanjun Li schrieb am Mo., 23. Nov. 2015 23:33: > After running several hours, My http server begin frequenly major gc, but > no heap was freed. > > several times major gc later, promotion failed and concurrent mode failure occured, > then heap was freed. My gc log is below : > > {Heap before GC invocations=7172 (full 720): > par new generation total 737280K, used 667492K [0x000000076b800000, 0x000000079d800000, 0x000000079d800000) > eden space 655360K, 100% used [0x000000076b800000, 0x0000000793800000, 0x0000000793800000) > from space 81920K, 14% used [0x0000000793800000, 0x00000007943d91d0, 0x0000000798800000) > to space 81920K, 0% used [0x0000000798800000, 0x0000000798800000, 0x000000079d800000) > concurrent mark-sweep generation total 1482752K, used 1479471K [0x000000079d800000, 0x00000007f8000000, 0x00000007f8000000) > concurrent-mark-sweep perm gen total 131072K, used 58091K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)2015-11-19T21:50:02.692+0800: 113963.532: [GC2015-11-19T21:50:02.692+0800: 113963.532: [ParNew (promotion failed)Desired survivor size 41943040 bytes, new threshold 15 (max 15)- age 1: 3826144 bytes, 3826144 total- age 2: 305696 bytes, 4131840 total- age 3: 181416 bytes, 4313256 total- age 4: 940632 bytes, 5253888 total- age 5: 88368 bytes, 5342256 total- age 6: 159840 bytes, 5502096 total- age 7: 733856 bytes, 6235952 total- age 8: 64712 bytes, 6300664 total- age 9: 314304 bytes, 6614968 total- age 10: 587160 bytes, 7202128 total- age 11: 38728 bytes, 7240856 total- age 12: 221160 bytes, 7462016 total- age 13: 648376 bytes, 8110392 total- age 14: 33296 bytes, 8143688 total- age 15: 380768 bytes, 8524456 total: 667492K->665908K(737280K), 0.7665810 secs]2015-11-19T21:50:03.459+0800: 113964.299: [CMS2015-11-19T21:50:05.161+0800: 113966.001: [CMS-concurrent-mark: 3.579/4.747 secs] [Times: user=13.41 sys=0.35, rea > l=4.75 secs] > (concurrent mode failure): 1479910K->44010K(1482752K), 4.7267420 secs] 2146964K->44010K(2220032K), [CMS Perm : 58091K->57795K(131072K)], 5.4939440 secs] [Times: user=9.07 sys=0.13, real=5.49 secs] Heap after GC invocations=7173 (full 721): > par new generation total 737280K, used 0K [0x000000076b800000, 0x000000079d800000, 0x000000079d800000) > eden space 655360K, 0% used [0x000000076b800000, 0x000000076b800000, 0x0000000793800000) > from space 81920K, 0% used [0x0000000798800000, 0x0000000798800000, 0x000000079d800000) > to space 81920K, 0% used [0x0000000793800000, 0x0000000793800000, 0x0000000798800000) > concurrent mark-sweep generation total 1482752K, used 44010K [0x000000079d800000, 0x00000007f8000000, 0x00000007f8000000) > concurrent-mark-sweep perm gen total 131072K, used 57795K [0x00000007f8000000, 0x0000000800000000, 0x0000000800000000)} > > It seems the CMS GC doesn't make any sense. Could you please explain to > me ? 
> > This is my gc config: > > -server -Xms2248m -Xmx2248m -Xmn800m -XX:PermSize=128m -XX:MaxPermSize=128m -XX:MaxTenuringThreshold=15 -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:gc.log -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+UseFastAccessorMethods > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Denny.Kettwig at werum.com Tue Nov 24 11:13:12 2015 From: Denny.Kettwig at werum.com (Denny Kettwig) Date: Tue, 24 Nov 2015 11:13:12 +0000 Subject: GCInterval in Java 8 Message-ID: <6175F8C4FE407D4F830EDA25C27A431701433E5194@Werum1790.werum.net> Hello, I have quick question regarding these two parameters: -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 We have set these parameters in the past to force a Full GC once every hour, however since we switched to Java 8 the parameter no longer has any effect. Has something changed in past? I can't find any source in the net mentioning a change in this area. Regards, Denny Kettwig Software Engineer [cid:image001.jpg at 01D126B1.7CCEF5A0] Werum IT Solutions GmbH Wulf-Werum-Str. 3, 21337 L?neburg, Germany T +49 4131 8900-983 F +49 4131 8900-20 denny.kettwig at werum.com www.werum.com Gesch?ftsf?hrer / Managing Directors: R?diger Schlierenk?mper, Richard Nagorny, Hans-Peter Subel RG L?neburg / Court of Jurisdiction: L?neburg, Germany Handelsregisternummer / Commercial Register: HRB 204984 USt.-IdNr. / VAT No.: DE 116 083 850 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 9408 bytes Desc: image001.jpg URL: From molendag at gmail.com Tue Nov 24 17:17:44 2015 From: molendag at gmail.com (Grzegorz Molenda) Date: Tue, 24 Nov 2015 18:17:44 +0100 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: Just a few tips: Check OS stats for paging / swapping activity at both VM'and hypervisor levels. Make sure the OS doesn't use transparent huge pages. If the above two don't help, try enabling -XX:+PrintGCTaskTimeStamps to diagnose, which part of GC collecion takes the most time. Note values aren't reported in time units, but in ticks. Subtract one from the other reported per task . Compare between tasks per signle collection and check stats from a few collections in row, to get the idea, where it does degradate. Thanks, Grzegorz 2015-11-20 20:46 GMT+01:00 Jun Zhuang : > Hi Srinivas, > > > > Thanks for your suggestion. I ran test with following parameters: > > > > -server -XX:+UseCompressedOops -XX:+TieredCompilation > -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing > -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m > -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 > -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2 > > > > But the -XX:MaxTenuringThreshold=2 setting does not seem to help anything. > I am still seeing similar GC pattern as with the +AlwaysTenure, actually > the young GC time is higher with MTT=2 (getting to 0.5 secs vs. 0.25 with > AlwaysTenure). 
> > > > Unless anyone else can provide another theory, I am convinced that > nepotism is at play here. Changing the java startup parameters can only get > me so far, dev will have to look at the code and see what can be done on > the code level. > > > > Thanks, > > Jun > > > > *From:* Srinivas Ramakrishna [mailto:ysr1729 at gmail.com] > *Sent:* Thursday, November 19, 2015 8:09 PM > *To:* Jun Zhuang > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: Seeking answer to a GC pattern > > > > Use -XX:MaxTenuringThreshold=2 and you might see better behavior that > +AlwaysTenure (which is almost always a very bad choice). That will at > least reduce some of the nepotism issues from +AlwaysTenure that Thomas > mentions. MTT > 2 is unlikely to help at your current frequency of minor > collections since the mortality after age 1 is fairly low (from your > tenuring distribution). Worth a quick test. > > > > -- ramki > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainer.jung at kippdata.de Tue Nov 24 19:07:43 2015 From: rainer.jung at kippdata.de (Rainer Jung) Date: Tue, 24 Nov 2015 20:07:43 +0100 Subject: GCInterval in Java 8 In-Reply-To: <6175F8C4FE407D4F830EDA25C27A431701433E5194@Werum1790.werum.net> References: <6175F8C4FE407D4F830EDA25C27A431701433E5194@Werum1790.werum.net> Message-ID: <5654B57F.4030209@kippdata.de> Am 24.11.2015 um 12:13 schrieb Denny Kettwig: > Hello, > > I have quick question regarding these two parameters: > > -Dsun.rmi.dgc.client.gcInterval=3600000 > > -Dsun.rmi.dgc.server.gcInterval=3600000 > > We have set these parameters in the past to force a Full GC once every > hour, however since we switched to Java 8 the parameter no longer has > any effect. Has something changed in past? I can?t find any source in > the net mentioning a change in this area. Originally the params resulted in a distributed GC exactly every 3600 seconds apart from each other. I vaguely remember that at some point in time it changed to running it if no other GC of Tenured had run since at least an hour and only then DGC kicked in. So as long as there are other reasons for normal GC of Tenured/OldGen at least once an hour you would no longer observe a DGC. Other sources might confirm this claim and know, in which version that was introduced. Could it be the reason for your observation, or do you have not other GCs of Tenured/OldGen as well? How do you check for GC events? Regards, Rainer From ysr1729 at gmail.com Tue Nov 24 20:13:42 2015 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 24 Nov 2015 12:13:42 -0800 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: What Grzegorz & Thomas said. Also you might take a heap dump before and after a full gc (-XX:+PrintClassHistogramBefore/AfterFullGC) to see the types that are reclaimed in the old gen. Might give you an idea as to the types of objects that got promoted and then later died, and hence whether avoidable nepotism is or is not a factor (and thence what you might null out to reduce such nepotism). Also, I guess what I meant was MTT=1. However, given that going from MTT=10 to MTT=2 didn't make any appreciable difference, MTT=2 to MTT=1 will not either. 
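To make the "null out" advice concrete, here is a minimal, hypothetical sketch of the nepotism pattern and its usual fix (illustrative code only, not taken from the application under discussion): a node that has already been promoted keeps its younger successor reachable for minor collections even after it has been removed from the structure, unless the link is cleared on removal. java.util.LinkedList clears item/next/prev in its unlink() for exactly this reason.

// Hypothetical unbounded FIFO queue, illustrating nepotism and the fix.
final class SimpleQueue {
    private static final class Node {
        Object payload;
        Node next;
    }

    private Node head, tail;

    void add(Object payload) {
        Node n = new Node();
        n.payload = payload;
        if (tail == null) {
            head = tail = n;
        } else {
            tail.next = n;
            tail = n;
        }
    }

    Object poll() {
        if (head == null) {
            return null;
        }
        Node n = head;
        head = n.next;
        if (head == null) {
            tail = null;
        }
        Object result = n.payload;
        // The fix: without these two lines, a dead node that was already
        // promoted to the old generation still references its (young)
        // successor and payload, so minor collections keep copying and
        // eventually promoting objects that are in fact unreachable.
        n.next = null;
        n.payload = null;
        return result;
    }
}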
You might also consider increasing yr young gen size, but that will likely also increase your pause times since objects tend to either die quickly or survive for a long time, and increasing the young gen size will still not age objects sufficiently to cause an increase in mortality. How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up? Does the amount that is promoted stay constant? That might imply that something is interfering with parallelization of copying. Typically that means that there is a long skinny data structure, such as a singly linked list that is being copied, although why that list would become longer (in terms of longer times) is not clear. Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure? (Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?) -- ramki On Tue, Nov 24, 2015 at 9:17 AM, Grzegorz Molenda wrote: > Just a few tips: > > Check OS stats for paging / swapping activity at both VM'and hypervisor > levels. > > Make sure the OS doesn't use transparent huge pages. > > If the above two don't help, try enabling -XX:+PrintGCTaskTimeStamps to > diagnose, which part of GC collecion takes the most time. Note values > aren't reported in time units, but in ticks. Subtract one from the other > reported per task . Compare between tasks per signle collection and check > stats from a few collections in row, to get the idea, where it does > degradate. > > Thanks, > > Grzegorz > > > > 2015-11-20 20:46 GMT+01:00 Jun Zhuang : > >> Hi Srinivas, >> >> >> >> Thanks for your suggestion. I ran test with following parameters: >> >> >> >> -server -XX:+UseCompressedOops -XX:+TieredCompilation >> -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing >> -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m >> -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 >> -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2 >> >> >> >> But the -XX:MaxTenuringThreshold=2 setting does not seem to help >> anything. I am still seeing similar GC pattern as with the +AlwaysTenure, >> actually the young GC time is higher with MTT=2 (getting to 0.5 secs vs. >> 0.25 with AlwaysTenure). >> >> >> >> Unless anyone else can provide another theory, I am convinced that >> nepotism is at play here. Changing the java startup parameters can only get >> me so far, dev will have to look at the code and see what can be done on >> the code level. >> >> >> >> Thanks, >> >> Jun >> >> >> >> *From:* Srinivas Ramakrishna [mailto:ysr1729 at gmail.com] >> *Sent:* Thursday, November 19, 2015 8:09 PM >> *To:* Jun Zhuang >> *Cc:* hotspot-gc-use at openjdk.java.net >> *Subject:* Re: Seeking answer to a GC pattern >> >> >> >> Use -XX:MaxTenuringThreshold=2 and you might see better behavior that >> +AlwaysTenure (which is almost always a very bad choice). That will at >> least reduce some of the nepotism issues from +AlwaysTenure that Thomas >> mentions. MTT > 2 is unlikely to help at your current frequency of minor >> collections since the mortality after age 1 is fairly low (from your >> tenuring distribution). Worth a quick test. 
>> >> >> >> -- ramki >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jun.zhuang at hobsons.com Tue Nov 24 21:21:45 2015 From: jun.zhuang at hobsons.com (Jun Zhuang) Date: Tue, 24 Nov 2015 21:21:45 +0000 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: Hi Srinivas, Appreciate your input. Following are answers to your questions. I?ll try your other advices. - How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up? 4 CPUs. Here is the time for one of the young GCs with +AlwaysTenure: Times: user=1.45 sys=0.00, real=0.40 secs. The sys time is always close to 0, user time is more than 3x real time and increases with real time accordingly. - Does the amount that is promoted stay constant? With +AlwaysTenure, looks like the promoted amount was fairly constant @ about 125K. Following shows the first 3 young GCs right after a full collection and last 3 right before the next one. 328706.505: [GC [PSYoungGen: 129018K->0K(130048K)] 286309K->160838K(2096896K), 0.0152110 secs] [Times: user=0.04 sys=0.01, real=0.01 secs] 328711.687: [GC [PSYoungGen: 129024K->0K(130048K)] 289862K->165092K(2096896K), 0.0199390 secs] [Times: user=0.06 sys=0.00, real=0.02 secs] 328716.875: [GC [PSYoungGen: 129024K->0K(130048K)] 294116K->168626K(2096896K), 0.0247520 secs] [Times: user=0.07 sys=0.00, real=0.02 secs] ? 331103.140: [GC [PSYoungGen: 129024K->0K(130048K)] 2082788K->1957116K(2096896K), 0.2220360 secs] [Times: user=0.78 sys=0.00, real=0.23 secs] 331108.118: [GC [PSYoungGen: 129024K->0K(130048K)] 2086140K->1960268K(2096896K), 0.2170640 secs] [Times: user=0.79 sys=0.01, real=0.22 secs] 331113.074: [GC [PSYoungGen: 129024K->0K(130048K)] 2089292K->1963948K(2096896K), 0.2132430 secs] [Times: user=0.79 sys=0.00, real=0.21 secs] - Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure? Yes. It always happens with or without AlwaysTenure. - (Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?) For one of my tests with 128m young gen and +AlwaysTenure, the average # of young GCs before a full collection was a little over 500. Sincerely, Jun From: Srinivas Ramakrishna [mailto:ysr1729 at gmail.com] Sent: Tuesday, November 24, 2015 3:14 PM To: Grzegorz Molenda Cc: Jun Zhuang ; hotspot-gc-use at openjdk.java.net Subject: Re: Seeking answer to a GC pattern What Grzegorz & Thomas said. Also you might take a heap dump before and after a full gc (-XX:+PrintClassHistogramBefore/AfterFullGC) to see the types that are reclaimed in the old gen. Might give you an idea as to the types of objects that got promoted and then later died, and hence whether avoidable nepotism is or is not a factor (and thence what you might null out to reduce such nepotism). Also, I guess what I meant was MTT=1. However, given that going from MTT=10 to MTT=2 didn't make any appreciable difference, MTT=2 to MTT=1 will not either. 
You might also consider increasing yr young gen size, but that will likely also increase your pause times since objects tend to either die quickly or survive for a long time, and increasing the young gen size will still not age objects sufficiently to cause an increase in mortality. How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up? Does the amount that is promoted stay constant? That might imply that something is interfering with parallelization of copying. Typically that means that there is a long skinny data structure, such as a singly linked list that is being copied, although why that list would become longer (in terms of longer times) is not clear. Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure? (Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?) -- ramki On Tue, Nov 24, 2015 at 9:17 AM, Grzegorz Molenda > wrote: Just a few tips: Check OS stats for paging / swapping activity at both VM'and hypervisor levels. Make sure the OS doesn't use transparent huge pages. If the above two don't help, try enabling -XX:+PrintGCTaskTimeStamps to diagnose, which part of GC collecion takes the most time. Note values aren't reported in time units, but in ticks. Subtract one from the other reported per task . Compare between tasks per signle collection and check stats from a few collections in row, to get the idea, where it does degradate. Thanks, Grzegorz 2015-11-20 20:46 GMT+01:00 Jun Zhuang >: Hi Srinivas, Thanks for your suggestion. I ran test with following parameters: -server -XX:+UseCompressedOops -XX:+TieredCompilation -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2 But the -XX:MaxTenuringThreshold=2 setting does not seem to help anything. I am still seeing similar GC pattern as with the +AlwaysTenure, actually the young GC time is higher with MTT=2 (getting to 0.5 secs vs. 0.25 with AlwaysTenure). Unless anyone else can provide another theory, I am convinced that nepotism is at play here. Changing the java startup parameters can only get me so far, dev will have to look at the code and see what can be done on the code level. Thanks, Jun From: Srinivas Ramakrishna [mailto:ysr1729 at gmail.com] Sent: Thursday, November 19, 2015 8:09 PM To: Jun Zhuang > Cc: hotspot-gc-use at openjdk.java.net Subject: Re: Seeking answer to a GC pattern Use -XX:MaxTenuringThreshold=2 and you might see better behavior that +AlwaysTenure (which is almost always a very bad choice). That will at least reduce some of the nepotism issues from +AlwaysTenure that Thomas mentions. MTT > 2 is unlikely to help at your current frequency of minor collections since the mortality after age 1 is fairly low (from your tenuring distribution). Worth a quick test. -- ramki _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Wed Nov 25 18:58:07 2015 From: Peter.B.Kessler at Oracle.COM (Peter B. 
Kessler) Date: Wed, 25 Nov 2015 10:58:07 -0800 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: <565604BF.9020303@Oracle.COM> On 11/24/15 01:21 PM, Jun Zhuang wrote: > Hi Srinivas, > > Appreciate your input. Following are answers to your questions. I?ll try your other advices. > > -How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up? > > 4 CPUs. > > Here is the time for one of the young GCs with +AlwaysTenure: Times: user=1.45 sys=0.00, real=0.40 secs. The sys time is always close to 0, user time is more than 3x real time and increases with real time accordingly. > > -Does the amount that is promoted stay constant? > > With +AlwaysTenure, looks like the promoted amount was fairly constant @ about 125K. Following shows the first 3 young GCs right after a full collection and last 3 right before the next one. > > 328706.505: [GC [PSYoungGen: 129018K->0K(130048K)] 286309K->160838K(2096896K), 0.0152110 secs] [Times: user=0.04 sys=0.01, real=0.01 secs] > > 328711.687: [GC [PSYoungGen: 129024K->0K(130048K)] 289862K->165092K(2096896K), 0.0199390 secs] [Times: user=0.06 sys=0.00, real=0.02 secs] > > 328716.875: [GC [PSYoungGen: 129024K->0K(130048K)] 294116K->168626K(2096896K), 0.0247520 secs] [Times: user=0.07 sys=0.00, real=0.02 secs] > > ? > > 331103.140: [GC [PSYoungGen: 129024K->0K(130048K)] 2082788K->1957116K(2096896K), 0.2220360 secs] [Times: user=0.78 sys=0.00, real=0.23 secs] > > 331108.118: [GC [PSYoungGen: 129024K->0K(130048K)] 2086140K->1960268K(2096896K), 0.2170640 secs] [Times: user=0.79 sys=0.01, real=0.22 secs] > > 331113.074: [GC [PSYoungGen: 129024K->0K(130048K)] 2089292K->1963948K(2096896K), 0.2132430 secs] [Times: user=0.79 sys=0.00, real=0.21 secs] > > -Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure? > > Yes. It always happens with or without AlwaysTenure. > > -(Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?) > > For one of my tests with 128m young gen and +AlwaysTenure, the average # of young GCs before a full collection was a little over 500. > > Sincerely, > > Jun Looking at the increase in your heap size after each young generation collection seems to show that you are promoting 3MB~4MB at each young generation collection. E.g., 165092K - 160838K = 4254K. With your 1920MB old generation that would let you have 500 young generation collections between full collections, as you say. If you were promoting only 125KB at each young generation collection your 1920MB old generation could absorb promotions from 15000 young generation collections. What is confusing is that times for the young generation collections increases proportionally with the size of the old generation. Your sawtooth pattern. Usually the time for a young generation collection is proportional to the amount of space that is promoted, which seems to be constant in your case. That implies some cost proportional to the size of the old generation: but what? It does not seem to take you longer to allocate through the space in your young generation when the old generation is empty than when it is full (~5 seconds) so it does not seem like you are doing more work when the old generation is full: e.g., dirtying cards for data that has piled up in the old generation, which would cause more work for the collector. 
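For reference, the same arithmetic can be applied to every young collection in the log with a small throwaway parser; this is only a sketch that assumes the -XX:+PrintGC line format shown above and takes the log file name on the command line (run as "java PromotionPerCollection gc.log"). For the second sample line above it reports about 4254K promoted, matching the estimate from consecutive heap occupancies.

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PromotionPerCollection {
    // Matches e.g. "[PSYoungGen: 129024K->0K(130048K)] 289862K->165092K(2096896K)"
    private static final Pattern GC_LINE = Pattern.compile(
            "\\[PSYoungGen: (\\d+)K->(\\d+)K\\(\\d+K\\)\\] (\\d+)K->(\\d+)K\\(\\d+K\\)");

    public static void main(String[] args) throws Exception {
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = GC_LINE.matcher(line);
                if (!m.find()) {
                    continue;
                }
                long youngFreed = Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
                long heapFreed  = Long.parseLong(m.group(3)) - Long.parseLong(m.group(4));
                // Whatever left the young generation but did not leave the heap
                // must have been promoted to the old generation.
                System.out.println("promoted ~" + (youngFreed - heapFreed) + "K");
            }
        }
    }
}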
With a 10x increase in time like that, one would think it would be easy to identify with a profiler, or detailed timers for phases, if there are any in the code. To Ramki: The parallelization seems to hold at somewhat over 3. ... peter > *From:*Srinivas Ramakrishna [mailto:ysr1729 at gmail.com] > *Sent:* Tuesday, November 24, 2015 3:14 PM > *To:* Grzegorz Molenda > *Cc:* Jun Zhuang ; hotspot-gc-use at openjdk.java.net > *Subject:* Re: Seeking answer to a GC pattern > > What Grzegorz & Thomas said. > > Also you might take a heap dump before and after a full gc (-XX:+PrintClassHistogramBefore/AfterFullGC) to see the types that are reclaimed in the old gen. Might give you an idea as to the types of objects that got promoted and then later died, and hence whether avoidable nepotism is or is not a factor (and thence what you might null out to reduce such nepotism). > > Also, I guess what I meant was MTT=1. However, given that going from MTT=10 to MTT=2 didn't make any appreciable difference, MTT=2 to MTT=1 will not either. > > You might also consider increasing yr young gen size, but that will likely also increase your pause times since objects tend to either die quickly or survive for a long time, and increasing the young gen size will still not age objects sufficiently to cause an increase in mortality. > > How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up? Does the amount that is promoted stay constant? That might imply that something is interfering with parallelization of copying. Typically that means that there is a long skinny data structure, such as a singly linked list that is being copied, although why that list would become longer (in terms of longer times) is not clear. Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure? (Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?) > > -- ramki > > On Tue, Nov 24, 2015 at 9:17 AM, Grzegorz Molenda > wrote: > > Just a few tips: > > Check OS stats for paging / swapping activity at both VM'and hypervisor levels. > > Make sure the OS doesn't use transparent huge pages. > > If the above two don't help, try enabling -XX:+PrintGCTaskTimeStamps to diagnose, which part of GC collecion takes the most time. Note values aren't reported in time units, but in ticks. Subtract one from the other reported per task . Compare between tasks per signle collection and check stats from a few collections in row, to get the idea, where it does degradate. > > > Thanks, > > > Grzegorz > > 2015-11-20 20:46 GMT+01:00 Jun Zhuang >: > > Hi Srinivas, > > Thanks for your suggestion. I ran test with following parameters: > > -server -XX:+UseCompressedOops -XX:+TieredCompilation -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2 > > But the -XX:MaxTenuringThreshold=2 setting does not seem to help anything. I am still seeing similar GC pattern as with the +AlwaysTenure, actually the young GC time is higher with MTT=2 (getting to 0.5 secs vs. 0.25 with AlwaysTenure). > > Unless anyone else can provide another theory, I am convinced that nepotism is at play here. 
Changing the java startup parameters can only get me so far, dev will have to look at the code and see what can be done on the code level. > > Thanks, > > Jun > > *From:*Srinivas Ramakrishna [mailto:ysr1729 at gmail.com ] > *Sent:* Thursday, November 19, 2015 8:09 PM > *To:* Jun Zhuang > > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: Seeking answer to a GC pattern > > Use -XX:MaxTenuringThreshold=2 and you might see better behavior that +AlwaysTenure (which is almost always a very bad choice). That will at least reduce some of the nepotism issues from +AlwaysTenure that Thomas mentions. MTT > 2 is unlikely to help at your current frequency of minor collections since the mortality after age 1 is fairly low (from your tenuring distribution). Worth a quick test. > > -- ramki > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From ysr1729 at gmail.com Wed Nov 25 20:54:05 2015 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 25 Nov 2015 12:54:05 -0800 Subject: Seeking answer to a GC pattern In-Reply-To: <565604BF.9020303@Oracle.COM> References: <565604BF.9020303@Oracle.COM> Message-ID: Yes, sounds like some card scanning or BOT walking pathology perhaps? Are you on the latest 8u66 or better for these numbers? Did you also try 9? If you have a test case, it might make sense to file a ticket, so someone might take a closer look. -- ramki On Wed, Nov 25, 2015 at 10:58 AM, Peter B. Kessler < Peter.B.Kessler at oracle.com> wrote: > On 11/24/15 01:21 PM, Jun Zhuang wrote: > >> Hi Srinivas, >> >> Appreciate your input. Following are answers to your questions. I?ll try >> your other advices. >> >> -How many CPU's (and GC threads) do you have? Does the ratio of "real" to >> "usr+sys" increase as "real" ramps up? >> >> 4 CPUs. >> >> Here is the time for one of the young GCs with +AlwaysTenure: Times: >> user=1.45 sys=0.00, real=0.40 secs. The sys time is always close to 0, user >> time is more than 3x real time and increases with real time accordingly. >> >> -Does the amount that is promoted stay constant? >> >> With +AlwaysTenure, looks like the promoted amount was fairly constant @ >> about 125K. Following shows the first 3 young GCs right after a full >> collection and last 3 right before the next one. >> >> 328706.505: [GC [PSYoungGen: 129018K->0K(130048K)] >> 286309K->160838K(2096896K), 0.0152110 secs] [Times: user=0.04 sys=0.01, >> real=0.01 secs] >> >> 328711.687: [GC [PSYoungGen: 129024K->0K(130048K)] >> 289862K->165092K(2096896K), 0.0199390 secs] [Times: user=0.06 sys=0.00, >> real=0.02 secs] >> >> 328716.875: [GC [PSYoungGen: 129024K->0K(130048K)] >> 294116K->168626K(2096896K), 0.0247520 secs] [Times: user=0.07 sys=0.00, >> real=0.02 secs] >> >> ? 
>> >> 331103.140: [GC [PSYoungGen: 129024K->0K(130048K)] >> 2082788K->1957116K(2096896K), 0.2220360 secs] [Times: user=0.78 sys=0.00, >> real=0.23 secs] >> >> 331108.118: [GC [PSYoungGen: 129024K->0K(130048K)] >> 2086140K->1960268K(2096896K), 0.2170640 secs] [Times: user=0.79 sys=0.01, >> real=0.22 secs] >> >> 331113.074: [GC [PSYoungGen: 129024K->0K(130048K)] >> 2089292K->1963948K(2096896K), 0.2132430 secs] [Times: user=0.79 sys=0.00, >> real=0.21 secs] >> >> -Does the sawtooth of minor gc times happen even with MTT=1 or >> AlwaysTenure? >> >> Yes. It always happens with or without AlwaysTenure. >> >> -(Hint: How many young collections do you see between the major >> collections when you see the sawtooth in young collection times? How does >> it compare with the highest age of object that is kept in the survivor >> spaces?) >> >> For one of my tests with 128m young gen and +AlwaysTenure, the average # >> of young GCs before a full collection was a little over 500. >> >> Sincerely, >> >> Jun >> > > Looking at the increase in your heap size after each young generation > collection seems to show that you are promoting 3MB~4MB at each young > generation collection. E.g., 165092K - 160838K = 4254K. With your 1920MB > old generation that would let you have 500 young generation collections > between full collections, as you say. If you were promoting only 125KB at > each young generation collection your 1920MB old generation could absorb > promotions from 15000 young generation collections. > > What is confusing is that times for the young generation collections > increases proportionally with the size of the old generation. Your > sawtooth pattern. Usually the time for a young generation collection is > proportional to the amount of space that is promoted, which seems to be > constant in your case. That implies some cost proportional to the size of > the old generation: but what? It does not seem to take you longer to > allocate through the space in your young generation when the old generation > is empty than when it is full (~5 seconds) so it does not seem like you are > doing more work when the old generation is full: e.g., dirtying cards for > data that has piled up in the old generation, which would cause more work > for the collector. > > With a 10x increase in time like that, one would think it would be easy to > identify with a profiler, or detailed timers for phases, if there are any > in the code. > > To Ramki: The parallelization seems to hold at somewhat over 3. > > ... peter > > *From:*Srinivas Ramakrishna [mailto:ysr1729 at gmail.com] >> *Sent:* Tuesday, November 24, 2015 3:14 PM >> *To:* Grzegorz Molenda >> *Cc:* Jun Zhuang ; >> hotspot-gc-use at openjdk.java.net >> *Subject:* Re: Seeking answer to a GC pattern >> >> What Grzegorz & Thomas said. >> >> Also you might take a heap dump before and after a full gc >> (-XX:+PrintClassHistogramBefore/AfterFullGC) to see the types that are >> reclaimed in the old gen. Might give you an idea as to the types of objects >> that got promoted and then later died, and hence whether avoidable nepotism >> is or is not a factor (and thence what you might null out to reduce such >> nepotism). >> >> Also, I guess what I meant was MTT=1. However, given that going from >> MTT=10 to MTT=2 didn't make any appreciable difference, MTT=2 to MTT=1 will >> not either. 
>> >> You might also consider increasing yr young gen size, but that will >> likely also increase your pause times since objects tend to either die >> quickly or survive for a long time, and increasing the young gen size will >> still not age objects sufficiently to cause an increase in mortality. >> >> How many CPU's (and GC threads) do you have? Does the ratio of "real" to >> "usr+sys" increase as "real" ramps up? Does the amount that is promoted >> stay constant? That might imply that something is interfering with >> parallelization of copying. Typically that means that there is a long >> skinny data structure, such as a singly linked list that is being copied, >> although why that list would become longer (in terms of longer times) is >> not clear. Does the sawtooth of minor gc times happen even with MTT=1 or >> AlwaysTenure? (Hint: How many young collections do you see between the >> major collections when you see the sawtooth in young collection times? How >> does it compare with the highest age of object that is kept in the survivor >> spaces?) >> >> -- ramki >> >> On Tue, Nov 24, 2015 at 9:17 AM, Grzegorz Molenda > > wrote: >> >> Just a few tips: >> >> Check OS stats for paging / swapping activity at both VM'and >> hypervisor levels. >> >> Make sure the OS doesn't use transparent huge pages. >> >> If the above two don't help, try enabling -XX:+PrintGCTaskTimeStamps >> to diagnose, which part of GC collecion takes the most time. Note values >> aren't reported in time units, but in ticks. Subtract one from the other >> reported per task . Compare between tasks per signle collection and check >> stats from a few collections in row, to get the idea, where it does >> degradate. >> >> >> Thanks, >> >> >> Grzegorz >> >> 2015-11-20 20:46 GMT+01:00 Jun Zhuang > >: >> >> Hi Srinivas, >> >> Thanks for your suggestion. I ran test with following parameters: >> >> -server -XX:+UseCompressedOops -XX:+TieredCompilation >> -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing >> -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m >> -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 >> -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2 >> >> But the -XX:MaxTenuringThreshold=2 setting does not seem to help >> anything. I am still seeing similar GC pattern as with the +AlwaysTenure, >> actually the young GC time is higher with MTT=2 (getting to 0.5 secs vs. >> 0.25 with AlwaysTenure). >> >> Unless anyone else can provide another theory, I am convinced >> that nepotism is at play here. Changing the java startup parameters can >> only get me so far, dev will have to look at the code and see what can be >> done on the code level. >> >> Thanks, >> >> Jun >> >> *From:*Srinivas Ramakrishna [mailto:ysr1729 at gmail.com > ysr1729 at gmail.com>] >> *Sent:* Thursday, November 19, 2015 8:09 PM >> *To:* Jun Zhuang > jun.zhuang at hobsons.com>> >> *Cc:* hotspot-gc-use at openjdk.java.net > hotspot-gc-use at openjdk.java.net> >> *Subject:* Re: Seeking answer to a GC pattern >> >> Use -XX:MaxTenuringThreshold=2 and you might see better behavior >> that +AlwaysTenure (which is almost always a very bad choice). That will at >> least reduce some of the nepotism issues from +AlwaysTenure that Thomas >> mentions. MTT > 2 is unlikely to help at your current frequency of minor >> collections since the mortality after age 1 is fairly low (from your >> tenuring distribution). Worth a quick test. 
>> >> -- ramki >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net > hotspot-gc-use at openjdk.java.net> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysr1729 at gmail.com Fri Nov 20 01:08:50 2015 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 20 Nov 2015 01:08:50 -0000 Subject: Seeking answer to a GC pattern In-Reply-To: References: Message-ID: Use -XX:MaxTenuringThreshold=2 and you might see better behavior that +AlwaysTenure (which is almost always a very bad choice). That will at least reduce some of the nepotism issues from +AlwaysTenure that Thomas mentions. MTT > 2 is unlikely to help at your current frequency of minor collections since the mortality after age 1 is fairly low (from your tenuring distribution). Worth a quick test. -- ramki On Mon, Oct 26, 2015 at 12:33 PM, Jun Zhuang wrote: > Hi, > > > > When running performance testing for a java web service running on JBOSS, > I observed a clear saw-tooth pattern in CPU utilization that closely > follows the GC cycles. see below: > > > > > > Java startup parameters used: > > > > -XX:+TieredCompilation -XX:+PrintTenuringDistribution -Xms2048m -Xmx4096m > -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m > -XX:SurvivorRatio=126 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC > -XX:+AlwaysTenure > > > > With this set of parameters, the young GC pause time ranged from 0.02 to > 0.25 secs. When I used 256m for the young gen, the young GC pause time > ranged from 0.02 to 0.5 secs. My understanding is that the young GC pause > time normally stays fairly stable, I have spent quite some time researching > but have yet to find an answer to this behavior. I wonder if people in this > distribution list can help me out? > > > > *Other related info* > > > > * Server Specs: VM with 4 CPUs and 8 Gb mem > > * Test setup: > > ? # of Vusers: 100 > > ? Ramp up: 10 mins > > ? Pacing: 5 ? 7 secs > > * I tried with all other available GC algorithms, tenuring thresholds, > various sizes of the generations, but the AlwaysTenure parameter seems to > work the best so far. > > > > > > Any input will be highly appreciated. > > > > Sincerely yours, > > > > *Jun Zhuang* > > *Sr. Performance QA Engineer | Hobsons* > > *513-746-2288 <513-746-2288> (work)* > > *513-227-7643 <513-227-7643> (mobile)* > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image010.jpg Type: image/jpeg Size: 11773 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image011.jpg Type: image/jpeg Size: 33498 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image012.jpg Type: image/jpeg Size: 19704 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image002.jpg Type: image/jpeg Size: 7083 bytes Desc: not available URL:
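As a closing illustration of the tenuring-distribution argument above: the per-age mortality Srinivas refers to can be estimated from the -XX:+PrintTenuringDistribution output quoted earlier in this thread by comparing the bytes recorded at age N in one collection with the bytes at age N+1 in the next. A rough throwaway sketch, assuming one "- age N: ... bytes" entry per line as in the raw log and taking the log file name on the command line (run as "java TenuringSurvival gc.log"):

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TenuringSurvival {
    private static final Pattern AGE = Pattern.compile("- age\\s+(\\d+):\\s+(\\d+) bytes");
    private static final Pattern NEW_GC = Pattern.compile("Desired survivor size");

    public static void main(String[] args) throws Exception {
        // One map per young collection: age -> bytes surviving at that age.
        List<Map<Integer, Long>> gcs = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            Map<Integer, Long> current = null;
            while ((line = in.readLine()) != null) {
                if (NEW_GC.matcher(line).find()) {
                    current = new HashMap<>();
                    gcs.add(current);
                } else if (current != null) {
                    Matcher m = AGE.matcher(line);
                    if (m.find()) {
                        current.put(Integer.parseInt(m.group(1)), Long.parseLong(m.group(2)));
                    }
                }
            }
        }
        // Bytes at age N in collection i that are still present at age N+1 in
        // collection i+1 give a crude survival (1 - mortality) estimate per age.
        for (int i = 0; i + 1 < gcs.size(); i++) {
            for (Map.Entry<Integer, Long> e : gcs.get(i).entrySet()) {
                Long next = gcs.get(i + 1).get(e.getKey() + 1);
                if (next != null && e.getValue() > 0) {
                    System.out.printf("GC %d: age %d -> %d survival ~%.0f%%%n",
                            i, e.getKey(), e.getKey() + 1, 100.0 * next / e.getValue());
                }
            }
        }
    }
}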