From java at elyograg.org Thu Jan 7 16:15:25 2016 From: java at elyograg.org (Shawn Heisey) Date: Thu, 7 Jan 2016 09:15:25 -0700 Subject: Odd OS-level memory reporting for Java Message-ID: <568E8F1D.9060501@elyograg.org> I have seen something very odd with memory reporting in both Windows and Linux. This may not be the correct list for this question, but I am already subscribed to a very large number of mailing lists and would prefer to not add another one for a one-off question. I apologize if this is the wrong list. Take a look at these two screenshots: https://www.dropbox.com/s/64en3sar4cr1ytj/linux-solr-mem-high-shr.png?dl=0 https://www.dropbox.com/s/w4bnrb66r16lpx1/Resource%20Monitor.png?dl=0 The first one is a Linux server that I am running, the second is a Windows server that someone else is running. Both machines are running Solr. The Windows machine is running two copies of Solr, and the Solr processes are at the top of both lists. The Linux machine has an 8GB heap for its copy of Solr, and I believe that each of the copies of Solr on the Windows machine have the heap set to 14GB. In both cases, the shared memory reported is very high, with the resident (or working) memory *far* higher than the configured heap size. In the case of the Linux machine, I can illustrate that there is definitely a reporting bug. If you take the 48GB currently allocated to the disk cache, and add the *reported* 22GB resident size of the Solr process, you get 70GB ... but the machine only has 64GB total. On the Linux machine, Solr is accessing over 100GB of data via MMap (see the VIRT memory size of 121GB). This is the data that is in the disk cache. I was told that the indexes are even larger on the Windows machine. This is the Java version on the Linux machine. I am working to learn what the version is on the Windows machine. java version "1.7.0_72" Java(TM) SE Runtime Environment (build 1.7.0_72-b14) Java HotSpot(TM) 64-Bit Server VM (build 24.72-b04, mixed mode) Here are the packages providing this Java version (CentOS 6.6). I used the source RPM on the city-fan.org website and the official tarball from Oracle: java-1.7.0-oracle-jdbc-1.7.0.72-1.0.cf.x86_64 java-1.7.0-oracle-devel-1.7.0.72-1.0.cf.x86_64 java-1.7.0-oracle-1.7.0.72-1.0.cf.x86_64 Is this memory reporting problem a bug in Java or a bug in both operating systems? I do not have easy access to any other OS platforms. Thanks, Shawn From yu.zhang at oracle.com Thu Jan 7 16:29:12 2016 From: yu.zhang at oracle.com (Yu Zhang) Date: Thu, 7 Jan 2016 08:29:12 -0800 Subject: Odd OS-level memory reporting for Java In-Reply-To: <568E8F1D.9060501@elyograg.org> References: <568E8F1D.9060501@elyograg.org> Message-ID: <568E9258.2060102@oracle.com> Shawn, What is the parameters for java? Did you use AlwaysPreTouch, give -Xms -Xmx? I am not sure about the Linux output. But the Windows one 15,861,867(KB) is reasonable for 14g heap, I think. Thanks, Jenny On 1/7/2016 8:15 AM, Shawn Heisey wrote: > I have seen something very odd with memory reporting in both Windows and > Linux. > > This may not be the correct list for this question, but I am already > subscribed to a very large number of mailing lists and would prefer to > not add another one for a one-off question. I apologize if this is the > wrong list. 
> > Take a look at these two screenshots: > > https://www.dropbox.com/s/64en3sar4cr1ytj/linux-solr-mem-high-shr.png?dl=0 > https://www.dropbox.com/s/w4bnrb66r16lpx1/Resource%20Monitor.png?dl=0 > > The first one is a Linux server that I am running, the second is a > Windows server that someone else is running. Both machines are running > Solr. The Windows machine is running two copies of Solr, and the Solr > processes are at the top of both lists. The Linux machine has an 8GB > heap for its copy of Solr, and I believe that each of the copies of Solr > on the Windows machine have the heap set to 14GB. > > In both cases, the shared memory reported is very high, with the > resident (or working) memory *far* higher than the configured heap size. > > In the case of the Linux machine, I can illustrate that there is > definitely a reporting bug. If you take the 48GB currently allocated to > the disk cache, and add the *reported* 22GB resident size of the Solr > process, you get 70GB ... but the machine only has 64GB total. > > On the Linux machine, Solr is accessing over 100GB of data via MMap (see > the VIRT memory size of 121GB). This is the data that is in the disk > cache. I was told that the indexes are even larger on the Windows machine. > > This is the Java version on the Linux machine. I am working to learn > what the version is on the Windows machine. > > java version "1.7.0_72" > Java(TM) SE Runtime Environment (build 1.7.0_72-b14) > Java HotSpot(TM) 64-Bit Server VM (build 24.72-b04, mixed mode) > > Here are the packages providing this Java version (CentOS 6.6). I used > the source RPM on the city-fan.org website and the official tarball from > Oracle: > > java-1.7.0-oracle-jdbc-1.7.0.72-1.0.cf.x86_64 > java-1.7.0-oracle-devel-1.7.0.72-1.0.cf.x86_64 > java-1.7.0-oracle-1.7.0.72-1.0.cf.x86_64 > > Is this memory reporting problem a bug in Java or a bug in both > operating systems? I do not have easy access to any other OS platforms. > > Thanks, > Shawn > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From java at elyograg.org Thu Jan 7 17:08:34 2016 From: java at elyograg.org (Shawn Heisey) Date: Thu, 7 Jan 2016 10:08:34 -0700 Subject: Odd OS-level memory reporting for Java In-Reply-To: <568E9258.2060102@oracle.com> References: <568E8F1D.9060501@elyograg.org> <568E9258.2060102@oracle.com> Message-ID: <568E9B92.7050809@elyograg.org> On 1/7/2016 9:29 AM, Yu Zhang wrote: > What is the parameters for java? Did you use AlwaysPreTouch, give -Xms > -Xmx? > > I am not sure about the Linux output. But the Windows one > 15,861,867(KB) is reasonable for 14g heap, I think. This is the full commandline for the Linux machine. It is starting jetty, which then runs Solr4.9.1. 
These parameters are included by an init script that I wrote: /usr/bin/java -Xms4096M -Xmx8192M -Dlog4j.configuration=file:etc/log4j.properties -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages -XX:+AggressiveOpts -verbose:gc -Xloggc:logs/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8686 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dsolr.solr.home=/index/solr4 -Djetty.port=8981 -DSTOP.PORT=8078 -DSTOP.KEY=redacted -jar /opt/solr4/start.jar If "Commit" is the proper number to look at, then Windows may not have the same issue Linux does. I was under the impression that "Working Set" was what I should be looking at, especially since the other Java processes and the SQL Server process are showing a smaller number there than under Commit. I do not know what those other java processes are. The Windows machine *is* showing the same shared memory inflation that I see on the Linux machine. I have complete control over the Linux machine, so I can definitely answer questions about that one. I understand Linux a lot more, and to be honest I don't really care about Windows. I'm trying to help this user with what they perceive as memory problems, although I suspect that everything is working just like it should. The following info is not completely confirmed, but based on the user's posting history on the solr-user mailing list, I believe that the Windows machine is using the arguments set by the included 5.x start script, which will mean a list like this that I obtained from running Solr 5.4.0 on my Windows 7 desktop. Xms and Xmx are only 512m on my machine, this will definitely not be the case for the machine that produced the screenshot: -DSTOP.KEY=solrrocks -DSTOP.PORT=7983 -Djava.io.tmpdir=C:\Users\sheisey\Downloads\solr-5.4.0\server\tmp -Djetty.home=C:\Users\sheisey\Downloads\solr-5.4.0\server -Djetty.port=8983 -Dlog4j.configuration=file:C:\Users\sheisey\Downloads\solr-5.4.0\server\resources\log4j.properties -Dsolr.install.dir=C:\Users\sheisey\Downloads\solr-5.4.0 -Dsolr.solr.home=C:\Users\sheisey\Downloads\solr-5.4.0\server\solr -Duser.timezone=UTC -XX:+CMSParallelRemarkEnabled -XX:+CMSScavengeBeforeRemark -XX:+ParallelRefProcEnabled -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:ConcGCThreads=4 -XX:MaxTenuringThreshold=8 -XX:NewRatio=3 -XX:ParallelGCThreads=4 -XX:PretenureSizeThreshold=64m -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -Xloggc:C:\Users\sheisey\Downloads\solr-5.4.0\server\logs/solr_gc.log -Xms512m -Xmx512m -Xss256k -verbose:gc It is probably worth mentioning that I have another Linux machine running Java 8u66 and Solr 5.3 with CMS GC (the same settings that I just listed for my Win7 desktop machine) is *not* showing the same SHR and RES inflation: https://www.dropbox.com/s/i49s2uyfetwo3xq/solr-mem-prod-8g-heap.png?dl=0 Thanks, Shawn From yu.zhang at oracle.com Thu Jan 7 17:18:15 2016 From: yu.zhang at oracle.com (Yu Zhang) Date: Thu, 7 Jan 2016 09:18:15 -0800 Subject: Odd OS-level memory reporting for Java In-Reply-To: <568E9B92.7050809@elyograg.org> References: <568E8F1D.9060501@elyograg.org> 
<568E9258.2060102@oracle.com> <568E9B92.7050809@elyograg.org> Message-ID: <568E9DD7.30208@oracle.com> With 8u66, can you switch to g1gc and see if it has the same issue? Trying to understand if it is G1GC vs CMS, or different jdk version Thanks, Jenny On 1/7/2016 9:08 AM, Shawn Heisey wrote: > It is probably worth mentioning that I have another Linux machine > running Java 8u66 and Solr 5.3 with CMS GC (the same settings that I > just listed for my Win7 desktop machine) is*not* showing the same SHR > and RES inflation: -------------- next part -------------- An HTML attachment was scrubbed... URL: From dawid.weiss at gmail.com Thu Jan 7 17:25:34 2016 From: dawid.weiss at gmail.com (Dawid Weiss) Date: Thu, 7 Jan 2016 18:25:34 +0100 Subject: Odd OS-level memory reporting for Java In-Reply-To: <568E8F1D.9060501@elyograg.org> References: <568E8F1D.9060501@elyograg.org> Message-ID: > If you take the 48GB currently allocated to > the disk cache, and add the *reported* 22GB resident size of the Solr > process, you get 70GB ... but the machine only has 64GB total. Isn't this a side effect of memory-mapped index files? Dawid From vitalyd at gmail.com Thu Jan 7 17:28:09 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 7 Jan 2016 12:28:09 -0500 Subject: Odd OS-level memory reporting for Java In-Reply-To: <568E8F1D.9060501@elyograg.org> References: <568E8F1D.9060501@elyograg.org> Message-ID: > > In the case of the Linux machine, I can illustrate that there is > definitely a reporting bug. If you take the 48GB currently allocated to > the disk cache, and add the *reported* 22GB resident size of the Solr > process, you get 70GB ... but the machine only has 64GB total. I'm not sure you can add RES and cached together. RES includes file-backed pages (i.e. it's not just anon pages), but those same pages may also be in the page cache and reported in cached. On Thu, Jan 7, 2016 at 11:15 AM, Shawn Heisey wrote: > I have seen something very odd with memory reporting in both Windows and > Linux. > > This may not be the correct list for this question, but I am already > subscribed to a very large number of mailing lists and would prefer to > not add another one for a one-off question. I apologize if this is the > wrong list. > > Take a look at these two screenshots: > > https://www.dropbox.com/s/64en3sar4cr1ytj/linux-solr-mem-high-shr.png?dl=0 > https://www.dropbox.com/s/w4bnrb66r16lpx1/Resource%20Monitor.png?dl=0 > > The first one is a Linux server that I am running, the second is a > Windows server that someone else is running. Both machines are running > Solr. The Windows machine is running two copies of Solr, and the Solr > processes are at the top of both lists. The Linux machine has an 8GB > heap for its copy of Solr, and I believe that each of the copies of Solr > on the Windows machine have the heap set to 14GB. > > In both cases, the shared memory reported is very high, with the > resident (or working) memory *far* higher than the configured heap size. > > In the case of the Linux machine, I can illustrate that there is > definitely a reporting bug. If you take the 48GB currently allocated to > the disk cache, and add the *reported* 22GB resident size of the Solr > process, you get 70GB ... but the machine only has 64GB total. > > On the Linux machine, Solr is accessing over 100GB of data via MMap (see > the VIRT memory size of 121GB). This is the data that is in the disk > cache. I was told that the indexes are even larger on the Windows machine. 
> > This is the Java version on the Linux machine. I am working to learn > what the version is on the Windows machine. > > java version "1.7.0_72" > Java(TM) SE Runtime Environment (build 1.7.0_72-b14) > Java HotSpot(TM) 64-Bit Server VM (build 24.72-b04, mixed mode) > > Here are the packages providing this Java version (CentOS 6.6). I used > the source RPM on the city-fan.org website and the official tarball from > Oracle: > > java-1.7.0-oracle-jdbc-1.7.0.72-1.0.cf.x86_64 > java-1.7.0-oracle-devel-1.7.0.72-1.0.cf.x86_64 > java-1.7.0-oracle-1.7.0.72-1.0.cf.x86_64 > > Is this memory reporting problem a bug in Java or a bug in both > operating systems? I do not have easy access to any other OS platforms. > > Thanks, > Shawn > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From todd at cloudera.com Thu Jan 7 18:24:22 2016 From: todd at cloudera.com (Todd Lipcon) Date: Thu, 7 Jan 2016 10:24:22 -0800 Subject: Odd OS-level memory reporting for Java In-Reply-To: <568E9DD7.30208@oracle.com> References: <568E8F1D.9060501@elyograg.org> <568E9258.2060102@oracle.com> <568E9B92.7050809@elyograg.org> <568E9DD7.30208@oracle.com> Message-ID: Note that linux counts mapped files as RSS if they're paged in. Maybe try something like summing up the various different memory types from /proc//smaps ? eg: $ cat /proc/$(pgrep java)/smaps | perl -mData::Dumper -n -e 'if (/(.+):\s+(\d+) kB/) { $sum{$1} += $2; } END { print Data::Dumper::Dumper(\%sum); }' $VAR1 = { 'Rss' => 475148, 'Shared_Dirty' => 20, 'Locked' => 0, 'MMUPageSize' => 2128, 'Shared_Clean' => 9660, 'Size' => 2860292, 'Private_Dirty' => 459248, 'KernelPageSize' => 2128, 'Swap' => 0, 'Referenced' => 472196, 'Anonymous' => 459240, 'Pss' => 465681, 'AnonHugePages' => 444416, 'Private_Clean' => 6220 }; On Thu, Jan 7, 2016 at 9:18 AM, Yu Zhang wrote: > With 8u66, can you switch to g1gc and see if it has the same issue? Trying > to understand if it is G1GC vs CMS, or different jdk version > > Thanks, > Jenny > > On 1/7/2016 9:08 AM, Shawn Heisey wrote: > > It is probably worth mentioning that I have another Linux machine > running Java 8u66 and Solr 5.3 with CMS GC (the same settings that I > just listed for my Win7 desktop machine) is **not** showing the same SHR > and RES inflation: > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... URL: From amit.mishra at redknee.com Thu Jan 21 12:30:19 2016 From: amit.mishra at redknee.com (Amit Mishra) Date: Thu, 21 Jan 2016 12:30:19 +0000 Subject: regarding long young generation pause when using G1 GC In-Reply-To: References: Message-ID: Hello team, Please help me with G1 GC long young generation pause.. At one of our Customer deployment we were using CMS with 48G heap size and facing Concurrent mode failure every month. To solve this we started load test using G1 GC on lab but we are seeing intermittent long GC pause as long as 148 seconds, Could you please look into logs and confirm what best we can do to ascertain root cause and possible tuning to fix this. Thanks for help in advance. 
We are using default GC parameters:(Attaching full GC file as well) JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow" JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics" JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks" JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC" JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500" JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops" JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal" JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks" JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$" JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}" JVM_ARGS=$JVM_ARGS" -verbose:gc" JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC" JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts" JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError" JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m" JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m" JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m" JVM_ARGS="${JVM_ARGS} -Xmx28672m" JVM_ARGS="${JVM_ARGS} -Xms28672m Long GC pause. 5020.853: [GC pause (young), 148.0343389 secs] [Parallel Time: 142474.3 ms, GC Workers: 13] [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: 5022714.4, Diff: 18.7] [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, Diff: 6299.4, Sum: 33157.1] [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: 40273.3, Sum: 247222.7] [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, Sum: 350] [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: 44149.8, Sum: 1203016.2] [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: 19779.4, Sum: 120456.9] [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, Diff: 16514.7, Sum: 247949.3] [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: 18.0, Sum: 58.4] [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: 142461.1, Diff: 19.8, Sum: 1851860.7] [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: 5165162.0, Diff: 6.4] [Code Root Fixup: 2.2 ms] [Clear CT: 11.2 ms] [Other: 5546.7 ms] [Choose CSet: 0.0 ms] [Ref Proc: 47.4 ms] [Ref Enq: 0.9 ms] [Free CSet: 3642.1 ms] [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: 24.7G(28.0G)->11.5G(28.0G)] [Times: user=184.95 sys=34.49, real=148.07 secs] Thanks, Amit -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: gcstats.log.15989 Type: application/octet-stream Size: 1111579 bytes Desc: gcstats.log.15989 URL: From yu.zhang at oracle.com Thu Jan 21 18:12:40 2016 From: yu.zhang at oracle.com (Yu Zhang) Date: Thu, 21 Jan 2016 10:12:40 -0800 Subject: regarding long young generation pause when using G1 GC In-Reply-To: References: Message-ID: <56A11F98.3000602@oracle.com> Amit, Which jdk version are you using? You give MaxGCPauseMillis=500, but it seems g1 is adjusting eden size too big, up to 13g. I am curious why. Can you add a flag -XX:+PrintAdaptiveSizePolicy? Meanwhile, to reduce young gc pause, can you try to a smaller number for MaxGCPauseMillis, or give a smaller eden size? Another thing I noticed, whenever you have long gc pause, the system cpu is high. Maybe the system is busy doing something else? Thanks, Jenny On 1/21/2016 4:30 AM, Amit Mishra wrote: > > Hello team, > > Please help me with G1 GC long young generation pause.. > > At one of our Customer deployment we were using CMS with 48G heap size > and facing Concurrent mode failure every month. 
> > To solve this we started load test using G1 GC on lab but we are > seeing intermittent long GC pause as long as 148 seconds, > > Could you please look into logs and confirm what best we can do to > ascertain root cause and possible tuning to fix this. > > Thanks for help in advance. > > We are using default GC parameters:(Attaching full GC file as well) > > JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow" > > JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics" > > JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks" > > JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC" > > JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500" > > JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops" > > JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal" > > JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks" > > JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" > > export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$" > > JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}" > > JVM_ARGS=$JVM_ARGS" -verbose:gc" > > JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC" > > JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts" > > JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError" > > JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m" > > JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m" > > JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m" > > JVM_ARGS="${JVM_ARGS} -Xmx28672m" > > JVM_ARGS="${JVM_ARGS} -Xms28672m > > Long GC pause. > > 5020.853: [GC pause (young), 148.0343389 secs] > > [Parallel Time: 142474.3 ms, GC Workers: 13] > > [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: > 5022714.4, Diff: 18.7] > > [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, > Diff: 6299.4, Sum: 33157.1] > > [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: > 40273.3, Sum: 247222.7] > > [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, > Sum: 350] > > [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: > 44149.8, Sum: 1203016.2] > > [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: > 19779.4, Sum: 120456.9] > > [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, > Diff: 16514.7, Sum: 247949.3] > > [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: > 18.0, Sum: 58.4] > > [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: > 142461.1, Diff: 19.8, Sum: 1851860.7] > > [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: > 5165162.0, Diff: 6.4] > > [Code Root Fixup: 2.2 ms] > > [Clear CT: 11.2 ms] > > [Other: 5546.7 ms] > > [Choose CSet: 0.0 ms] > > [Ref Proc: 47.4 ms] > > [Ref Enq: 0.9 ms] > > [Free CSet: 3642.1 ms] > > [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: > 24.7G(28.0G)->11.5G(28.0G)] > > [Times: user=184.95 sys=34.49, real=148.07 secs] > > > Thanks, > > Amit > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlie.hunt at oracle.com Thu Jan 21 19:36:02 2016 From: charlie.hunt at oracle.com (charlie hunt) Date: Thu, 21 Jan 2016 13:36:02 -0600 Subject: regarding long young generation pause when using G1 GC In-Reply-To: <56A11F98.3000602@oracle.com> References: <56A11F98.3000602@oracle.com> Message-ID: Hi Amit, To add to Jenny?s comments ? In the absence of looking at the entire GC log and looking only at the entry included in email, this line really catches my eye: [Times: user=184.95 sys=34.49, real=148.07 secs] - This is pretty high sys time. 
If this is Linux, check that transparent huge pages are disabled. If that's disabled, then you may be paging to virtual memory. In short, need to chase down the root of that high sys time.
- Comparing user=184.95 to real=148.07 you do not see much parallelism, ~ 1.24x. That is not much parallelism for a multi-threaded young GC.

hths,

charlie

> On Jan 21, 2016, at 12:12 PM, Yu Zhang wrote:
>
> Amit,
>
> Which jdk version are you using?
>
> You give MaxGCPauseMillis=500, but it seems g1 is adjusting eden size too big, up to 13g. I am curious why. Can you add a flag -XX:+PrintAdaptiveSizePolicy? Meanwhile, to reduce young gc pause, can you try to a smaller number for MaxGCPauseMillis, or give a smaller eden size?
>
> Another thing I noticed, whenever you have long gc pause, the system cpu is high. Maybe the system is busy doing something else?
> Thanks,
> Jenny
> On 1/21/2016 4:30 AM, Amit Mishra wrote:
>> Hello team,
>>
>> Please help me with G1 GC long young generation pause..
>>
>> At one of our Customer deployment we were using CMS with 48G heap size and facing Concurrent mode failure every month.
>>
>> To solve this we started load test using G1 GC on lab but we are seeing intermittent long GC pause as long as 148 seconds,
>>
>> Could you please look into logs and confirm what best we can do to ascertain root cause and possible tuning to fix this.
>>
>> Thanks for help in advance.
>> We are using default GC parameters:(Attaching full GC file as well)
>>
>> JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow"
>> JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics"
>> JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks"
>> JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC"
>> JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500"
>> JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops"
>> JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal"
>> JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks"
>> JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages"
>> export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$"
>> JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}"
>> JVM_ARGS=$JVM_ARGS" -verbose:gc"
>> JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC"
>> JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts"
>> JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError"
>> JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m"
>> JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m"
>> JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m"
>> JVM_ARGS="${JVM_ARGS} -Xmx28672m"
>> JVM_ARGS="${JVM_ARGS} -Xms28672m
>>
>> Long GC pause.
>>
>> 5020.853: [GC pause (young), 148.0343389 secs]
>> [Parallel Time: 142474.3 ms, GC Workers: 13]
>> [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: 5022714.4, Diff: 18.7]
>> [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, Diff: 6299.4, Sum: 33157.1]
>> [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: 40273.3, Sum: 247222.7]
>> [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, Sum: 350]
>> [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: 44149.8, Sum: 1203016.2]
>> [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: 19779.4, Sum: 120456.9]
>> [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, Diff: 16514.7, Sum: 247949.3]
>> [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: 18.0, Sum: 58.4]
>> [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: 142461.1, Diff: 19.8, Sum: 1851860.7]
>> [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: 5165162.0, Diff: 6.4]
>> [Code Root Fixup: 2.2 ms]
>> [Clear CT: 11.2 ms]
>> [Other: 5546.7 ms]
>> [Choose CSet: 0.0 ms]
>> [Ref Proc: 47.4 ms]
>> [Ref Enq: 0.9 ms]
>> [Free CSet: 3642.1 ms]
>> [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: 24.7G(28.0G)->11.5G(28.0G)]
>> [Times: user=184.95 sys=34.49, real=148.07 secs]
>>
>> Thanks,
>> Amit
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlie.hunt at oracle.com Thu Jan 21 20:00:20 2016
From: charlie.hunt at oracle.com (charlie hunt)
Date: Thu, 21 Jan 2016 14:00:20 -0600
Subject: regarding long young generation pause when using G1 GC
In-Reply-To:
References: <56A11F98.3000602@oracle.com>
Message-ID: <8A03CEF9-F203-403C-87EA-74351637B63F@oracle.com>

Being you are on Solaris, transparent huge pages are not an issue. Solaris is smart enough to automatically figure out how to make large pages work, transparently. You do not need to remove -XX:+UseLargePages.

Also, since you are using Solaris, use vmstat to monitor the application and watch the 'free' and 'sr' columns. You should see '0' in the 'sr' column most of the time while the app is running. In short, the 'sr' column is the page scan rate. Hence, as memory gets low (as indicated in the 'free' column), the page scanner will start looking for available pages it can reclaim. So, if you are starting to see non-zero entries in the 'sr' column that tend to be increasing, and the values in the 'free' column getting smaller and smaller, that is a pretty good sign that the system is paging to virtual memory, or approaching that state. And, obviously, once that paging starts to happen you will see the 'pi' and 'po' columns show activity as well (they are the page in and page out columns).

On setting an eden size ... while you can set an eden size, doing so will disable the adaptive sizing G1 will do in an effort to meet the pause time goal. IMO, I don't think the 13 GB young gen is the root of the long pause. Need to find the reason / cause for the high sys time and low parallelism.

hths,

charlie

PS: It would likely help to move to a JDK 8 JVM since there were many improvements made in G1 between JDK 7 and JDK 8.
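As a rough illustration of the vmstat approach described above (an editorial sketch, not something from the original message), the one-liner below samples vmstat every 5 seconds and flags intervals where the page scanner is active. It assumes the usual Solaris 10 column layout, with 'free' in the 5th field and 'sr' in the 12th, so check the header on the actual box before relying on it:

# Print the two vmstat header lines, then report free memory (KB) whenever
# the page-scan rate 'sr' is non-zero, i.e. the page scanner has woken up.
vmstat 5 | awk 'NR <= 2 { print; next } $12 + 0 > 0 { print "page scanner active: free=" $5 " KB, sr=" $12 }'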
> On Jan 21, 2016, at 1:46 PM, Amit Mishra wrote: > > Thanks Charlie & Jenny?s for your expert comments. > > We are using jdk : jdk1.7.0_45 as our applications have never been tested on Java 1.8 > > Yes I dig further and found system is running low on memory so might be OS occasionally resort to swapping, I will further ask Customer to increase physical memory on node to ensure swap is not getting used at any point of time. > > I am not sure about Eden size as a best practice we haven?t set any explicit value for new size or Eden, is it fine to set explicit eden size ? > > I will ask team to run next test with MaxGCPauseMillis=200 . > > We are on solaris 5.10 platform and you can see that large pages parameter is enabled, shall we need to disable it for better performance. > > JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" > > > You can also refer to rest of JVM parameters application is using to confirm whether we need to disable any of them or need to enable few more for better G1 GC performance. > > argv[11]: -Xms28672m > argv[12]: -Xmx28672m > argv[13]: -Xss1m > argv[14]: -Xoss1m > argv[15]: -XX:PermSize=256m > argv[16]: -XX:MaxPermSize=256m > argv[17]: -XX:ReservedCodeCacheSize=128m > argv[18]: -XX:+HeapDumpOnOutOfMemoryError > argv[19]: -XX:+AggressiveOpts > argv[20]: -XX:+DisableExplicitGC > argv[22]: -Dcom.sun.management.jmxremote.ssl=false > argv[23]: -Dcom.sun.management.jmxremote.authenticate=false > argv[25]: -XX:+PrintGCTimeStamps > argv[26]: -XX:+PrintGCDetails > argv[28]: -verbose:gc > argv[29]: -Dsun.rmi.dgc.server.gcInterval=86400000 > argv[30]: -Dsun.rmi.dgc.client.gcInterval=86400000 > argv[31]: -XX:+UseLargePages > argv[32]: -XX:+MaxFDLimit > argv[37]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB > argv[38]: -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton > argv[39]: -XX:-EliminateLocks > argv[40]: -XX:-OmitStackTraceInFastThrow > argv[41]: -XX:-UseSSE42Intrinsics > argv[42]: -XX:-ReduceInitialCardMarks > argv[43]: -XX:+UseG1GC > argv[44]: -XX:MaxGCPauseMillis=500 > argv[45]: -XX:+UseCompressedOops > argv[46]: -XX:+PrintFlagsFinal > argv[47]: com.redknee.framework.core.platform.Core > > > Thanks, > Amit > > From: charlie hunt [mailto:charlie.hunt at oracle.com] > Sent: Friday, January 22, 2016 01:06 > To: Amit Mishra > Cc: hotspot-gc-use at openjdk.java.net > Subject: Re: regarding long young generation pause when using G1 GC > > Hi Amit, > > To add to Jenny?s comments ? > > In the absence of looking at the entire GC log and looking only at the entry included in email, this line really catches my eye: > [Times: user=184.95 sys=34.49, real=148.07 secs] > > - This is pretty high sys time. If this is Linux, check that transparent huge pages are disabled. If that?s disabled, then you may be paging to virtual memory. In short, need to chase down the root of that high sys time. > - Comparing user=184.95 to real=148.07 you do not much parallelism, ~ 1.24x. That is not much parallelism for a multi-threaded young GC. > > hths, > > charlie > > On Jan 21, 2016, at 12:12 PM, Yu Zhang > wrote: > > Amit, > > Which jdk version are you using? > > You give MaxGCPauseMillis=500, but it seems g1 is adjusting eden size too big, up to 13g. I am curious why. Can you add a flag -XX:+PrintAdaptiveSizePolicy? Meanwhile, to reduce young gc pause, can you try to a smaller number for MaxGCPauseMillis, or give a smaller eden size? > > Another thing I noticed, whenever you have long gc pause, the system cpu is high. Maybe the system is busy doing something else? 
> > Thanks, > Jenny > On 1/21/2016 4:30 AM, Amit Mishra wrote: > Hello team, > > Please help me with G1 GC long young generation pause.. > > At one of our Customer deployment we were using CMS with 48G heap size and facing Concurrent mode failure every month. > > To solve this we started load test using G1 GC on lab but we are seeing intermittent long GC pause as long as 148 seconds, > > Could you please look into logs and confirm what best we can do to ascertain root cause and possible tuning to fix this. > > Thanks for help in advance. > We are using default GC parameters:(Attaching full GC file as well) > > JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow" > JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics" > JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks" > JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC" > JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500" > JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops" > JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal" > JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks" > JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" > export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$" > JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}" > JVM_ARGS=$JVM_ARGS" -verbose:gc" > JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC" > JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts" > JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError" > JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m" > JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m" > JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m" > JVM_ARGS="${JVM_ARGS} -Xmx28672m" > JVM_ARGS="${JVM_ARGS} -Xms28672m > > Long GC pause. > > 5020.853: [GC pause (young), 148.0343389 secs] > [Parallel Time: 142474.3 ms, GC Workers: 13] > [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: 5022714.4, Diff: 18.7] > [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, Diff: 6299.4, Sum: 33157.1] > [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: 40273.3, Sum: 247222.7] > [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, Sum: 350] > [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: 44149.8, Sum: 1203016.2] > [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: 19779.4, Sum: 120456.9] > [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, Diff: 16514.7, Sum: 247949.3] > [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: 18.0, Sum: 58.4] > [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: 142461.1, Diff: 19.8, Sum: 1851860.7] > [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: 5165162.0, Diff: 6.4] > [Code Root Fixup: 2.2 ms] > [Clear CT: 11.2 ms] > [Other: 5546.7 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 47.4 ms] > [Ref Enq: 0.9 ms] > [Free CSet: 3642.1 ms] > [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: 24.7G(28.0G)->11.5G(28.0G)] > [Times: user=184.95 sys=34.49, real=148.07 secs] > > > Thanks, > Amit > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From poonam.bajaj at oracle.com Thu Jan 21 20:59:31 2016 From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar) Date: Thu, 21 Jan 2016 12:59:31 -0800 Subject: regarding long young generation pause when using G1 GC In-Reply-To: <8A03CEF9-F203-403C-87EA-74351637B63F@oracle.com> References: <56A11F98.3000602@oracle.com> <8A03CEF9-F203-403C-87EA-74351637B63F@oracle.com> Message-ID: <56A146B3.5060805@oracle.com> Hello, On 1/21/2016 12:00 PM, charlie hunt wrote: > Being you are on Solaris, the transparent huge pages is not an issue. > Solaris is smart enough automatically figure out how to make large > pages work, transparently. You do not need to remove -XX:+UseLargePages. > > Also since you are using Solaris, use vmstat to monitor the > application and watch the 'free' and 'sr' columns. You should see '0' > in the 'sr' column most of the time while the app is running. In > short, the 'sr' column is the page scan rate. Hence, as memory gets > low, (as indicated in the 'free' column), the page scanner will start > looking for available pages it can reclaim. So, if you are starting to > see non-zero entries in the 'sr' column that tend to be increasing, > and the values in the 'free' column getting smaller and smaller, that > is a pretty good sign that the system is paging to virtual memory, or > approaching that state. And, obviously once that paging starts to > happen you will see the 'pi' and 'po' columns will show activity as > well (they are the page in and page out columns). > > On setting an eden size ... while you can set an eden size, doing so > will disable the adaptive sizing G1 will do in an effort to meet the > pause time goal. IMO, I don't think the 13 GB young gen is the root of > the long pause. Need to find the reason / cause for the high sys time > and low parallelism. > But it may be worthwhile to try and understand why Eden got sized at 13GB. If we look at the other GC entries, the eden size is comparatively very small: [Eden: 1384.0M(1376.0M)->0.0B(1248.0M) Survivors: 56.0M->184.0M Heap: 1431.6M(28.0G)->247.6M(28.0G)] [Times: user=3.55 sys=0.26, real=0.33 secs] As Jenny suggested, PrintAdaptiveSizePolicy may shed some light on the sizing decisions. Thanks, Poonam > hths, > > charlie > > PS: It would likely help to move to a JDK 8 JVM since there were many > improvements made in G1 between JDK 7 and JDK 8. > >> On Jan 21, 2016, at 1:46 PM, Amit Mishra > > wrote: >> >> Thanks Charlie & Jenny's for your expert comments. >> We are using jdk : jdk1.7.0_45 as our applications have never been >> tested on Java 1.8 >> Yes I dig further and found system is running low on memory so might >> be OS occasionally resort to swapping, I will further ask Customer to >> increase physical memory on node to ensure swap is not getting used >> at any point of time. >> I am not sure about Eden size as a best practice we haven't set any >> explicit value for new size or Eden, is it fine to set explicit eden >> size ? >> I will ask team to run next test with MaxGCPauseMillis=200 . >> We are on solaris 5.10 platform and you can see that large pages >> parameter is enabled, shall we need to disable it for better performance. >> JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" >> You can also refer to rest of JVM parameters application is using to >> confirm whether we need to disable any of them or need to enable few >> more for better G1 GC performance. 
>> argv[11]: -Xms28672m >> argv[12]: -Xmx28672m >> argv[13]: -Xss1m >> argv[14]: -Xoss1m >> argv[15]: -XX:PermSize=256m >> argv[16]: -XX:MaxPermSize=256m >> argv[17]: -XX:ReservedCodeCacheSize=128m >> argv[18]: -XX:+HeapDumpOnOutOfMemoryError >> argv[19]: -XX:+AggressiveOpts >> argv[20]: -XX:+DisableExplicitGC >> argv[22]: -Dcom.sun.management.jmxremote.ssl=false >> argv[23]: -Dcom.sun.management.jmxremote.authenticate=false >> argv[25]: -XX:+PrintGCTimeStamps >> argv[26]: -XX:+PrintGCDetails >> argv[28]: -verbose:gc >> argv[29]: -Dsun.rmi.dgc.server.gcInterval=86400000 >> argv[30]: -Dsun.rmi.dgc.client.gcInterval=86400000 >> argv[31]: -XX:+UseLargePages >> argv[32]: -XX:+MaxFDLimit >> argv[37]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB >> argv[38]: -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton >> argv[39]: -XX:-EliminateLocks >> argv[40]: -XX:-OmitStackTraceInFastThrow >> argv[41]: -XX:-UseSSE42Intrinsics >> argv[42]: -XX:-ReduceInitialCardMarks >> argv[43]: -XX:+UseG1GC >> argv[44]: -XX:MaxGCPauseMillis=500 >> argv[45]: -XX:+UseCompressedOops >> argv[46]: -XX:+PrintFlagsFinal >> argv[47]: com.redknee.framework.core.platform.Core >> Thanks, >> Amit >> *From:*charlie hunt [mailto:charlie.hunt at oracle.com] >> *Sent:*Friday, January 22, 2016 01:06 >> *To:*Amit Mishra > > >> *Cc:*hotspot-gc-use at openjdk.java.net >> >> *Subject:*Re: regarding long young generation pause when using G1 GC >> Hi Amit, >> To add to Jenny's comments ... >> In the absence of looking at the entire GC log and looking only at >> the entry included in email, this line really catches my eye: >> [Times: user=184.95 sys=34.49, real=148.07 secs] >> - This is pretty high sys time. If this is Linux, check that >> transparent huge pages are disabled. If that's disabled, then you may >> be paging to virtual memory. In short, need to chase down the root of >> that high sys time. >> - Comparing user=184.95 to real=148.07 you do not much parallelism, ~ >> 1.24x. That is not much parallelism for a multi-threaded young GC. >> hths, >> charlie >> >> On Jan 21, 2016, at 12:12 PM, Yu Zhang > > wrote: >> Amit, >> >> Which jdk version are you using? >> >> You give MaxGCPauseMillis=500, but it seems g1 is adjusting eden >> size too big, up to 13g. I am curious why. Can you add a flag >> -XX:+PrintAdaptiveSizePolicy? Meanwhile, to reduce young gc >> pause, can you try to a smaller number for MaxGCPauseMillis, or >> give a smaller eden size? >> >> Another thing I noticed, whenever you have long gc pause, the >> system cpu is high. Maybe the system is busy doing something else? >> >> Thanks, >> >> Jenny >> >> On 1/21/2016 4:30 AM, Amit Mishra wrote: >> >> Helloteam, >> Please help me with G1 GC long young generation pause.. >> At one of our Customer deployment we were using CMS with 48G >> heap size and facing Concurrent mode failure every month. >> To solve this we started load test using G1 GC on lab but we >> are seeing intermittent long GC pause as long as 148 seconds, >> Could you please look into logs and confirm what best we can >> do to ascertain root cause and possible tuning to fix this. >> Thanks for help in advance. 
>> We are using default GC parameters:(Attaching full GC file as >> well) >> JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow" >> JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics" >> JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks" >> JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC" >> JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500" >> JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops" >> JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal" >> JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks" >> JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" >> export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$" >> JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}" >> JVM_ARGS=$JVM_ARGS" -verbose:gc" >> JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC" >> JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts" >> JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError" >> JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m" >> JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m" >> JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m" >> JVM_ARGS="${JVM_ARGS} -Xmx28672m" >> JVM_ARGS="${JVM_ARGS} -Xms28672m >> Long GC pause. >> 5020.853: [GC pause (young), 148.0343389 secs] >> [Parallel Time: 142474.3 ms, GC Workers: 13] >> [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: >> 5022714.4, Diff: 18.7] >> [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: >> 6442.5, Diff: 6299.4, Sum: 33157.1] >> [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, >> Diff: 40273.3, Sum: 247222.7] >> [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, >> Sum: 350] >> [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, >> Diff: 44149.8, Sum: 1203016.2] >> [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, >> Diff: 19779.4, Sum: 120456.9] >> [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, >> Diff: 16514.7, Sum: 247949.3] >> [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: >> 18.0, Sum: 58.4] >> [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: >> 142461.1, Diff: 19.8, Sum: 1851860.7] >> [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: >> 5165162.0, Diff: 6.4] >> [Code Root Fixup: 2.2 ms] >> [Clear CT: 11.2 ms] >> [Other: 5546.7 ms] >> [Choose CSet: 0.0 ms] >> [Ref Proc: 47.4 ms] >> [Ref Enq: 0.9 ms] >> [Free CSet: 3642.1 ms] >> [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M >> Heap: 24.7G(28.0G)->11.5G(28.0G)] >> [Times: user=184.95 sys=34.49, real=148.07 secs] >> >> Thanks, >> Amit >> >> >> >> _______________________________________________ >> >> hotspot-gc-use mailing list >> >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlie.hunt at oracle.com Thu Jan 21 21:38:53 2016 From: charlie.hunt at oracle.com (charlie hunt) Date: Thu, 21 Jan 2016 15:38:53 -0600 Subject: regarding long young generation pause when using G1 GC In-Reply-To: <56A146B3.5060805@oracle.com> References: <56A11F98.3000602@oracle.com> <8A03CEF9-F203-403C-87EA-74351637B63F@oracle.com> <56A146B3.5060805@oracle.com> Message-ID: > On Jan 21, 2016, at 2:59 PM, Poonam Bajaj Parhar wrote: > > Hello, > > On 1/21/2016 12:00 PM, charlie hunt wrote: >> Being you are on Solaris, the transparent huge pages is not an issue. Solaris is smart enough automatically figure out how to make large pages work, transparently. You do not need to remove -XX:+UseLargePages. >> >> Also since you are using Solaris, use vmstat to monitor the application and watch the ?free? and ?sr? columns. You should see ?0? in the ?sr? column most of the time while the app is running. In short, the ?sr? column is the page scan rate. Hence, as memory gets low, (as indicated in the ?free? column), the page scanner will start looking for available pages it can reclaim. So, if you are starting to see non-zero entries in the ?sr? column that tend to be increasing, and the values in the ?free? column getting smaller and smaller, that is a pretty good sign that the system is paging to virtual memory, or approaching that state. And, obviously once that paging starts to happen you will see the ?pi? and ?po? columns will show activity as well (they are the page in and page out columns). >> >> On setting an eden size ? while you can set an eden size, doing so will disable the adaptive sizing G1 will do in an effort to meet the pause time goal. IMO, I don?t think the 13 GB young gen is the root of the long pause. Need to find the reason / cause for the high sys time and low parallelism. >> > But it may be worthwhile to try and understand why Eden got sized at 13GB. If we look at the other GC entries, the eden size is comparatively very small: > > [Eden: 1384.0M(1376.0M)->0.0B(1248.0M) Survivors: 56.0M->184.0M Heap: 1431.6M(28.0G)->247.6M(28.0G)] > [Times: user=3.55 sys=0.26, real=0.33 secs] > > As Jenny suggested, PrintAdaptiveSizePolicy may shed some light on the sizing decisions. Agreed. :) You (and Jenny) are right. It is a good practice to enable +PrintAdaptiveSizePolicy to see what sizing decisions are being made. charlie > > Thanks, > Poonam > > >> hths, >> >> charlie >> >> PS: It would likely help to move to a JDK 8 JVM since there were many improvements made in G1 between JDK 7 and JDK 8. >> >>> On Jan 21, 2016, at 1:46 PM, Amit Mishra > wrote: >>> >>> Thanks Charlie & Jenny?s for your expert comments. >>> >>> We are using jdk : jdk1.7.0_45 as our applications have never been tested on Java 1.8 >>> >>> Yes I dig further and found system is running low on memory so might be OS occasionally resort to swapping, I will further ask Customer to increase physical memory on node to ensure swap is not getting used at any point of time. >>> >>> I am not sure about Eden size as a best practice we haven?t set any explicit value for new size or Eden, is it fine to set explicit eden size ? >>> >>> I will ask team to run next test with MaxGCPauseMillis=200 . >>> >>> We are on solaris 5.10 platform and you can see that large pages parameter is enabled, shall we need to disable it for better performance. 
>>> >>> JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" >>> >>> >>> You can also refer to rest of JVM parameters application is using to confirm whether we need to disable any of them or need to enable few more for better G1 GC performance. >>> >>> argv[11]: -Xms28672m >>> argv[12]: -Xmx28672m >>> argv[13]: -Xss1m >>> argv[14]: -Xoss1m >>> argv[15]: -XX:PermSize=256m >>> argv[16]: -XX:MaxPermSize=256m >>> argv[17]: -XX:ReservedCodeCacheSize=128m >>> argv[18]: -XX:+HeapDumpOnOutOfMemoryError >>> argv[19]: -XX:+AggressiveOpts >>> argv[20]: -XX:+DisableExplicitGC >>> argv[22]: -Dcom.sun.management.jmxremote.ssl=false >>> argv[23]: -Dcom.sun.management.jmxremote.authenticate=false >>> argv[25]: -XX:+PrintGCTimeStamps >>> argv[26]: -XX:+PrintGCDetails >>> argv[28]: -verbose:gc >>> argv[29]: -Dsun.rmi.dgc.server.gcInterval=86400000 >>> argv[30]: -Dsun.rmi.dgc.client.gcInterval=86400000 >>> argv[31]: -XX:+UseLargePages >>> argv[32]: -XX:+MaxFDLimit >>> argv[37]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB >>> argv[38]: -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton >>> argv[39]: -XX:-EliminateLocks >>> argv[40]: -XX:-OmitStackTraceInFastThrow >>> argv[41]: -XX:-UseSSE42Intrinsics >>> argv[42]: -XX:-ReduceInitialCardMarks >>> argv[43]: -XX:+UseG1GC >>> argv[44]: -XX:MaxGCPauseMillis=500 >>> argv[45]: -XX:+UseCompressedOops >>> argv[46]: -XX:+PrintFlagsFinal >>> argv[47]: com.redknee.framework.core.platform.Core >>> >>> >>> Thanks, >>> Amit >>> >>> From: charlie hunt [mailto:charlie.hunt at oracle.com ] >>> Sent: Friday, January 22, 2016 01:06 >>> To: Amit Mishra > >>> Cc: hotspot-gc-use at openjdk.java.net >>> Subject: Re: regarding long young generation pause when using G1 GC >>> >>> Hi Amit, >>> >>> To add to Jenny?s comments ? >>> >>> In the absence of looking at the entire GC log and looking only at the entry included in email, this line really catches my eye: >>> [Times: user=184.95 sys=34.49, real=148.07 secs] >>> >>> - This is pretty high sys time. If this is Linux, check that transparent huge pages are disabled. If that?s disabled, then you may be paging to virtual memory. In short, need to chase down the root of that high sys time. >>> - Comparing user=184.95 to real=148.07 you do not much parallelism, ~ 1.24x. That is not much parallelism for a multi-threaded young GC. >>> >>> hths, >>> >>> charlie >>> >>> On Jan 21, 2016, at 12:12 PM, Yu Zhang > wrote: >>> >>> Amit, >>> >>> Which jdk version are you using? >>> >>> You give MaxGCPauseMillis=500, but it seems g1 is adjusting eden size too big, up to 13g. I am curious why. Can you add a flag -XX:+PrintAdaptiveSizePolicy? Meanwhile, to reduce young gc pause, can you try to a smaller number for MaxGCPauseMillis, or give a smaller eden size? >>> >>> Another thing I noticed, whenever you have long gc pause, the system cpu is high. Maybe the system is busy doing something else? >>> >>> Thanks, >>> Jenny >>> On 1/21/2016 4:30 AM, Amit Mishra wrote: >>> Hello team, >>> >>> Please help me with G1 GC long young generation pause.. >>> >>> At one of our Customer deployment we were using CMS with 48G heap size and facing Concurrent mode failure every month. >>> >>> To solve this we started load test using G1 GC on lab but we are seeing intermittent long GC pause as long as 148 seconds, >>> >>> Could you please look into logs and confirm what best we can do to ascertain root cause and possible tuning to fix this. >>> >>> Thanks for help in advance. 
>>> We are using default GC parameters:(Attaching full GC file as well) >>> >>> JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow" >>> JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics" >>> JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks" >>> JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC" >>> JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500" >>> JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops" >>> JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal" >>> JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks" >>> JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" >>> export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$" >>> JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}" >>> JVM_ARGS=$JVM_ARGS" -verbose:gc" >>> JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC" >>> JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts" >>> JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError" >>> JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m" >>> JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m" >>> JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m" >>> JVM_ARGS="${JVM_ARGS} -Xmx28672m" >>> JVM_ARGS="${JVM_ARGS} -Xms28672m >>> >>> Long GC pause. >>> >>> 5020.853: [GC pause (young), 148.0343389 secs] >>> [Parallel Time: 142474.3 ms, GC Workers: 13] >>> [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: 5022714.4, Diff: 18.7] >>> [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, Diff: 6299.4, Sum: 33157.1] >>> [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: 40273.3, Sum: 247222.7] >>> [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, Sum: 350] >>> [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: 44149.8, Sum: 1203016.2] >>> [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: 19779.4, Sum: 120456.9] >>> [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, Diff: 16514.7, Sum: 247949.3] >>> [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: 18.0, Sum: 58.4] >>> [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: 142461.1, Diff: 19.8, Sum: 1851860.7] >>> [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: 5165162.0, Diff: 6.4] >>> [Code Root Fixup: 2.2 ms] >>> [Clear CT: 11.2 ms] >>> [Other: 5546.7 ms] >>> [Choose CSet: 0.0 ms] >>> [Ref Proc: 47.4 ms] >>> [Ref Enq: 0.9 ms] >>> [Free CSet: 3642.1 ms] >>> [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: 24.7G(28.0G)->11.5G(28.0G)] >>> [Times: user=184.95 sys=34.49, real=148.07 secs] >>> >>> >>> Thanks, >>> Amit >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amit.mishra at redknee.com Thu Jan 21 19:46:21 2016 From: amit.mishra at redknee.com (Amit Mishra) Date: Thu, 21 Jan 2016 19:46:21 +0000 Subject: regarding long young generation pause when using G1 GC In-Reply-To: References: <56A11F98.3000602@oracle.com> Message-ID: Thanks Charlie & Jenny's for your expert comments. We are using jdk : jdk1.7.0_45 as our applications have never been tested on Java 1.8 Yes I dig further and found system is running low on memory so might be OS occasionally resort to swapping, I will further ask Customer to increase physical memory on node to ensure swap is not getting used at any point of time. I am not sure about Eden size as a best practice we haven't set any explicit value for new size or Eden, is it fine to set explicit eden size ? I will ask team to run next test with MaxGCPauseMillis=200 . We are on solaris 5.10 platform and you can see that large pages parameter is enabled, shall we need to disable it for better performance. JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages" You can also refer to rest of JVM parameters application is using to confirm whether we need to disable any of them or need to enable few more for better G1 GC performance. argv[11]: -Xms28672m argv[12]: -Xmx28672m argv[13]: -Xss1m argv[14]: -Xoss1m argv[15]: -XX:PermSize=256m argv[16]: -XX:MaxPermSize=256m argv[17]: -XX:ReservedCodeCacheSize=128m argv[18]: -XX:+HeapDumpOnOutOfMemoryError argv[19]: -XX:+AggressiveOpts argv[20]: -XX:+DisableExplicitGC argv[22]: -Dcom.sun.management.jmxremote.ssl=false argv[23]: -Dcom.sun.management.jmxremote.authenticate=false argv[25]: -XX:+PrintGCTimeStamps argv[26]: -XX:+PrintGCDetails argv[28]: -verbose:gc argv[29]: -Dsun.rmi.dgc.server.gcInterval=86400000 argv[30]: -Dsun.rmi.dgc.client.gcInterval=86400000 argv[31]: -XX:+UseLargePages argv[32]: -XX:+MaxFDLimit argv[37]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB argv[38]: -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton argv[39]: -XX:-EliminateLocks argv[40]: -XX:-OmitStackTraceInFastThrow argv[41]: -XX:-UseSSE42Intrinsics argv[42]: -XX:-ReduceInitialCardMarks argv[43]: -XX:+UseG1GC argv[44]: -XX:MaxGCPauseMillis=500 argv[45]: -XX:+UseCompressedOops argv[46]: -XX:+PrintFlagsFinal argv[47]: com.redknee.framework.core.platform.Core Thanks, Amit From: charlie hunt [mailto:charlie.hunt at oracle.com] Sent: Friday, January 22, 2016 01:06 To: Amit Mishra Cc: hotspot-gc-use at openjdk.java.net Subject: Re: regarding long young generation pause when using G1 GC Hi Amit, To add to Jenny's comments ... In the absence of looking at the entire GC log and looking only at the entry included in email, this line really catches my eye: [Times: user=184.95 sys=34.49, real=148.07 secs] - This is pretty high sys time. If this is Linux, check that transparent huge pages are disabled. If that's disabled, then you may be paging to virtual memory. In short, need to chase down the root of that high sys time. - Comparing user=184.95 to real=148.07 you do not much parallelism, ~ 1.24x. That is not much parallelism for a multi-threaded young GC. hths, charlie On Jan 21, 2016, at 12:12 PM, Yu Zhang > wrote: Amit, Which jdk version are you using? You give MaxGCPauseMillis=500, but it seems g1 is adjusting eden size too big, up to 13g. I am curious why. Can you add a flag -XX:+PrintAdaptiveSizePolicy? Meanwhile, to reduce young gc pause, can you try to a smaller number for MaxGCPauseMillis, or give a smaller eden size? Another thing I noticed, whenever you have long gc pause, the system cpu is high. 
On 1/21/2016 4:30 AM, Amit Mishra wrote:

Hello team,

Please help me with a long G1 GC young generation pause.

At one of our customer deployments we were using CMS with a 48G heap size and facing a concurrent mode failure every month. To solve this we started a load test using G1 GC in the lab, but we are seeing intermittent long GC pauses of up to 148 seconds.

Could you please look into the logs and confirm what best we can do to ascertain the root cause, and any possible tuning to fix this? Thanks for the help in advance.

We are using default GC parameters (attaching the full GC log file as well):

JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow"
JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics"
JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks"
JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC"
JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500"
JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops"
JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal"
JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks"
JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages"
export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$"
JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}"
JVM_ARGS=$JVM_ARGS" -verbose:gc"
JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC"
JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts"
JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError"
JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m"
JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m"
JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m"
JVM_ARGS="${JVM_ARGS} -Xmx28672m"
JVM_ARGS="${JVM_ARGS} -Xms28672m"

Long GC pause:

5020.853: [GC pause (young), 148.0343389 secs]
   [Parallel Time: 142474.3 ms, GC Workers: 13]
      [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: 5022714.4, Diff: 18.7]
      [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, Diff: 6299.4, Sum: 33157.1]
      [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: 40273.3, Sum: 247222.7]
         [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, Sum: 350]
      [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: 44149.8, Sum: 1203016.2]
      [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: 19779.4, Sum: 120456.9]
      [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, Diff: 16514.7, Sum: 247949.3]
      [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: 18.0, Sum: 58.4]
      [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: 142461.1, Diff: 19.8, Sum: 1851860.7]
      [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: 5165162.0, Diff: 6.4]
   [Code Root Fixup: 2.2 ms]
   [Clear CT: 11.2 ms]
   [Other: 5546.7 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 47.4 ms]
      [Ref Enq: 0.9 ms]
      [Free CSet: 3642.1 ms]
   [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: 24.7G(28.0G)->11.5G(28.0G)]
   [Times: user=184.95 sys=34.49, real=148.07 secs]
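(Aside, for anyone digging through the attached log: assuming every pause line in gcstats.log has the same shape as the sample above, the longest young pauses can be pulled out with something like the one-liner below; the file name is whatever -Xloggc was pointed at.)

awk '/GC pause \(young\)/ {print $(NF-1), "secs at", $1}' gcstats.log | sort -rn | head -5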
Thanks,
Amit

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From amit.mishra at redknee.com  Fri Jan 22 12:06:26 2016
From: amit.mishra at redknee.com (Amit Mishra)
Date: Fri, 22 Jan 2016 12:06:26 +0000
Subject: regarding long young generation pause when using G1 GC
In-Reply-To: <8A03CEF9-F203-403C-87EA-74351637B63F@oracle.com>
References: <56A11F98.3000602@oracle.com> <8A03CEF9-F203-403C-87EA-74351637B63F@oracle.com>
Message-ID: 

Thanks Charlie, I will work further on the paging/swapping issue and will re-run the G1 GC test once that is fixed.

Regards,
Amit Mishra

From: charlie hunt [mailto:charlie.hunt at oracle.com]
Sent: Friday, January 22, 2016 01:30
To: Amit Mishra
Cc: hotspot-gc-use at openjdk.java.net
Subject: Re: regarding long young generation pause when using G1 GC

Since you are on Solaris, transparent huge pages are not an issue. Solaris is smart enough to figure out automatically how to make large pages work, transparently. You do not need to remove -XX:+UseLargePages.

Also, since you are using Solaris, use vmstat to monitor the application and watch the "free" and "sr" columns. You should see "0" in the "sr" column most of the time while the app is running. In short, the "sr" column is the page scan rate. Hence, as memory gets low (as indicated in the "free" column), the page scanner will start looking for available pages it can reclaim. So, if you are starting to see non-zero entries in the "sr" column that tend to be increasing, and the values in the "free" column getting smaller and smaller, that is a pretty good sign that the system is paging to virtual memory, or approaching that state. And, obviously, once that paging starts to happen you will see activity in the "pi" and "po" columns as well (they are the page-in and page-out columns).
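(For illustration, that check might look like the following; the column layout is the usual Solaris 10 vmstat one, and the numbers are made up, not taken from this system.)

$ vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 9123056 2563480 12 40  0  0  0  0   0  0  0  0  0  420 1800  600 15  5 80
 0 0 0 9014200  182040 15 60  8 24 30  0 310  1  0  0  0  510 2100  750 20 12 68

In the first sample "sr" is 0 and "free" is comfortable; in the second, "free" has dropped sharply and "sr" has gone non-zero while "pi"/"po" show activity, which is exactly the pattern described above.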
On setting an Eden size: while you can set an Eden size, doing so will disable the adaptive sizing G1 does in an effort to meet the pause time goal. IMO, I don't think the 13 GB young gen is the root of the long pause. We need to find the reason / cause for the high sys time and low parallelism.

hths,

charlie

PS: It would likely help to move to a JDK 8 JVM, since there were many improvements made in G1 between JDK 7 and JDK 8.

On Jan 21, 2016, at 1:46 PM, Amit Mishra wrote:

Thanks Charlie & Jenny for your expert comments.

We are using JDK jdk1.7.0_45, as our applications have never been tested on Java 1.8.

Yes, I dug further and found that the system is running low on memory, so the OS might occasionally be resorting to swapping. I will ask the customer to increase the physical memory on the node to ensure swap is not used at any point in time.

I am not sure about the Eden size; as a best practice we have not set any explicit value for the new size or Eden. Is it fine to set an explicit Eden size?

I will ask the team to run the next test with MaxGCPauseMillis=200.

We are on the Solaris 5.10 platform and you can see that the large pages parameter is enabled; do we need to disable it for better performance?

JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages"

You can also refer to the rest of the JVM parameters the application is using, to confirm whether we need to disable any of them or enable a few more for better G1 GC performance.

argv[11]: -Xms28672m
argv[12]: -Xmx28672m
argv[13]: -Xss1m
argv[14]: -Xoss1m
argv[15]: -XX:PermSize=256m
argv[16]: -XX:MaxPermSize=256m
argv[17]: -XX:ReservedCodeCacheSize=128m
argv[18]: -XX:+HeapDumpOnOutOfMemoryError
argv[19]: -XX:+AggressiveOpts
argv[20]: -XX:+DisableExplicitGC
argv[22]: -Dcom.sun.management.jmxremote.ssl=false
argv[23]: -Dcom.sun.management.jmxremote.authenticate=false
argv[25]: -XX:+PrintGCTimeStamps
argv[26]: -XX:+PrintGCDetails
argv[28]: -verbose:gc
argv[29]: -Dsun.rmi.dgc.server.gcInterval=86400000
argv[30]: -Dsun.rmi.dgc.client.gcInterval=86400000
argv[31]: -XX:+UseLargePages
argv[32]: -XX:+MaxFDLimit
argv[37]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB
argv[38]: -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton
argv[39]: -XX:-EliminateLocks
argv[40]: -XX:-OmitStackTraceInFastThrow
argv[41]: -XX:-UseSSE42Intrinsics
argv[42]: -XX:-ReduceInitialCardMarks
argv[43]: -XX:+UseG1GC
argv[44]: -XX:MaxGCPauseMillis=500
argv[45]: -XX:+UseCompressedOops
argv[46]: -XX:+PrintFlagsFinal
argv[47]: com.redknee.framework.core.platform.Core

Thanks,
Amit

From: charlie hunt [mailto:charlie.hunt at oracle.com]
Sent: Friday, January 22, 2016 01:06
To: Amit Mishra
Cc: hotspot-gc-use at openjdk.java.net
Subject: Re: regarding long young generation pause when using G1 GC

Hi Amit,

To add to Jenny's comments ...

In the absence of looking at the entire GC log, and looking only at the entry included in the email, this line really catches my eye:

[Times: user=184.95 sys=34.49, real=148.07 secs]

- This is pretty high sys time. If this is Linux, check that transparent huge pages are disabled. If that's disabled, then you may be paging to virtual memory. In short, we need to chase down the root of that high sys time.
- Comparing user=184.95 to real=148.07, you do not see much parallelism, ~1.24x. That is not much parallelism for a multi-threaded young GC.

hths,

charlie

On Jan 21, 2016, at 12:12 PM, Yu Zhang wrote:

Amit,

Which jdk version are you using?

You give MaxGCPauseMillis=500, but it seems G1 is adjusting the Eden size too big, up to 13g. I am curious why. Can you add the flag -XX:+PrintAdaptiveSizePolicy?

Meanwhile, to reduce the young gc pause, can you try a smaller number for MaxGCPauseMillis, or give a smaller Eden size?

Another thing I noticed: whenever you have a long gc pause, the system cpu is high. Maybe the system is busy doing something else?

Thanks,
Jenny

On 1/21/2016 4:30 AM, Amit Mishra wrote:

Hello team,

Please help me with a long G1 GC young generation pause.

At one of our customer deployments we were using CMS with a 48G heap size and facing a concurrent mode failure every month. To solve this we started a load test using G1 GC in the lab, but we are seeing intermittent long GC pauses of up to 148 seconds.

Could you please look into the logs and confirm what best we can do to ascertain the root cause, and any possible tuning to fix this? Thanks for the help in advance.
We are using default GC parameters (attaching the full GC log file as well):

JVM_ARGS="${JVM_ARGS} -XX:-OmitStackTraceInFastThrow"
JVM_ARGS="${JVM_ARGS} -XX:-UseSSE42Intrinsics"
JVM_ARGS="${JVM_ARGS} -XX:-ReduceInitialCardMarks"
JVM_ARGS="${JVM_ARGS} -XX:+UseG1GC"
JVM_ARGS="${JVM_ARGS} -XX:MaxGCPauseMillis=500"
JVM_ARGS="${JVM_ARGS} -XX:+UseCompressedOops"
JVM_ARGS="${JVM_ARGS} -XX:+PrintFlagsFinal"
JVM_ARGS="${JVM_ARGS} -XX:-EliminateLocks"
JVM_ARGS="${JVM_ARGS} -XX:+UseLargePages"
export GC_LOG="${PROJECT_HOME}/log/gcstats.log.$$"
JVM_ARGS=$JVM_ARGS" -Xloggc:${GC_LOG}"
JVM_ARGS=$JVM_ARGS" -verbose:gc"
JVM_ARGS="${JVM_ARGS} -XX:+DisableExplicitGC"
JVM_ARGS=$JVM_ARGS" -XX:+AggressiveOpts"
JVM_ARGS="${JVM_ARGS} -XX:+HeapDumpOnOutOfMemoryError"
JVM_ARGS="${JVM_ARGS} -XX:ReservedCodeCacheSize=128m"
JVM_ARGS="${JVM_ARGS} -XX:PermSize=256m"
JVM_ARGS="${JVM_ARGS} -XX:MaxPermSize=256m"
JVM_ARGS="${JVM_ARGS} -Xmx28672m"
JVM_ARGS="${JVM_ARGS} -Xms28672m"

Long GC pause:

5020.853: [GC pause (young), 148.0343389 secs]
   [Parallel Time: 142474.3 ms, GC Workers: 13]
      [GC Worker Start (ms): Min: 5022695.7, Avg: 5022705.6, Max: 5022714.4, Diff: 18.7]
      [Ext Root Scanning (ms): Min: 143.0, Avg: 2550.5, Max: 6442.5, Diff: 6299.4, Sum: 33157.1]
      [Update RS (ms): Min: 5704.6, Avg: 19017.1, Max: 45977.9, Diff: 40273.3, Sum: 247222.7]
         [Processed Buffers: Min: 0, Avg: 26.9, Max: 291, Diff: 291, Sum: 350]
      [Scan RS (ms): Min: 70538.2, Avg: 92539.7, Max: 114688.0, Diff: 44149.8, Sum: 1203016.2]
      [Object Copy (ms): Min: 894.7, Avg: 9265.9, Max: 20674.1, Diff: 19779.4, Sum: 120456.9]
      [Termination (ms): Min: 10169.3, Avg: 19073.0, Max: 26684.1, Diff: 16514.7, Sum: 247949.3]
      [GC Worker Other (ms): Min: 0.1, Avg: 4.5, Max: 18.2, Diff: 18.0, Sum: 58.4]
      [GC Worker Total (ms): Min: 142441.2, Avg: 142450.8, Max: 142461.1, Diff: 19.8, Sum: 1851860.7]
      [GC Worker End (ms): Min: 5165155.6, Avg: 5165156.4, Max: 5165162.0, Diff: 6.4]
   [Code Root Fixup: 2.2 ms]
   [Clear CT: 11.2 ms]
   [Other: 5546.7 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 47.4 ms]
      [Ref Enq: 0.9 ms]
      [Free CSet: 3642.1 ms]
   [Eden: 13.2G(13.2G)->0.0B(1416.0M) Survivors: 16.0M->16.0M Heap: 24.7G(28.0G)->11.5G(28.0G)]
   [Times: user=184.95 sys=34.49, real=148.07 secs]

Thanks,
Amit

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 