From koops at hpe.com  Tue Aug 23 04:01:20 2016
From: koops at hpe.com (Kuravinakop, Sunil)
Date: Tue, 23 Aug 2016 04:01:20 +0000
Subject: GC : Logs : Usage
Message-ID:

Hello,

I see code like

log_info(gc, marking)("Concurrent Mark (%.3fs)",
                      TimeHelper::counter_to_seconds(mark_start));

in concurrentMarkThread.cpp. I am able to use "-Xlog:gc=info" and get logs for "log_info(gc)". However, how do I use the sub-tags like "marking"?

I am sorry, if I have posted this in the wrong mailing list.

regards,
--Sunil

From yu.zhang at oracle.com  Tue Aug 23 06:11:54 2016
From: yu.zhang at oracle.com (yu.zhang at oracle.com)
Date: Mon, 22 Aug 2016 23:11:54 -0700
Subject: GC : Logs : Usage
In-Reply-To:
References:
Message-ID:

You can try gc+marking=info

Thanks
Jenny

On 08/22/2016 09:01 PM, Kuravinakop, Sunil wrote:
> Hello,
>
> I see code like
>
> log_info(gc, marking)("Concurrent Mark (%.3fs)",
> TimeHelper::counter_to_seconds(mark_start));
>
> in concurrentMarkThread.cpp. I am able to use "-Xlog:gc=info" and get
> logs for "log_info(gc)". However, how do I use the sub-tags like
> "marking"?
>
> I am sorry, if I have posted this in the wrong mailing list.
>
> regards,
>
> --Sunil
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From vitalyd at gmail.com  Wed Aug 24 18:43:02 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 24 Aug 2016 14:43:02 -0400
Subject: Odd G1GC behavior on 8u91
Message-ID:

Hi guys,

Hoping someone could shed some light on G1 behavior (as seen from the gc log) that I'm having a hard time understanding. The root problem is G1 enters a Full GC that takes many tens of seconds, and need some advice on what could be causing it.

First, some basic info:
Java HotSpot(TM) 64-Bit Server VM (25.91-b14) for linux-amd64 JRE (1.8.0_91-b14), built on Apr 1 2016 00:57:21 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 264115728k(108464820k free), swap 0k(0k free)
CommandLine flags: -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=
-XX:InitialCodeCacheSize=104857600 -XX:InitialHeapSize=103079215104
-XX:InitialTenuringThreshold=2 -XX:InitiatingHeapOccupancyPercent=75
-XX:+ManagementServer -XX:MaxGCPauseMillis=300
-XX:MaxHeapSize=103079215104 -XX:MaxNewSize=32212254720
-XX:MaxTenuringThreshold=2 -XX:NewSize=32212254720
-XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy
-XX:+PrintCommandLineFlags -XX:+PrintCompilation
-XX:PrintFLSStatistics=1 -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintPromotionFailure -XX:+PrintReferenceGC
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
-XX:+PrintTenuringDistribution -XX:ReservedCodeCacheSize=104857600
-XX:SurvivorRatio=9 -XX:-UseAdaptiveSizePolicy -XX:+UseG1GC

Swap is disabled. THP is disabled.
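
As a quick check on the byte-valued flags listed above, the following Python sketch converts them to binary units. The flag names and values are copied from the CommandLine flags line; the helper name `human` is made up for illustration and is not part of any JVM tooling.

```python
# Byte-valued flags copied from the "CommandLine flags" line of the post.
flags = {
    "InitialHeapSize": 103079215104,
    "MaxHeapSize": 103079215104,
    "NewSize": 32212254720,
    "MaxNewSize": 32212254720,
    "InitialCodeCacheSize": 104857600,
    "ReservedCodeCacheSize": 104857600,
}

GIB = 1024 ** 3
MIB = 1024 ** 2


def human(n_bytes):
    """Render a byte count in GiB when it is at least 1 GiB, else in MiB."""
    if n_bytes >= GIB:
        return f"{n_bytes / GIB:g} GiB"
    return f"{n_bytes / MIB:g} MiB"


sizes = {name: human(value) for name, value in flags.items()}
for name, text in sizes.items():
    print(f"{name:>22} = {text}")
```

The initial and maximum values match in each pair, so the heap works out to a fixed 96 GiB and the young generation to a fixed 30 GiB, with a 100 MiB code cache.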

First issue I have a question about:

2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
 17776.029: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 0, predicted base time: 14.07 ms, remaining time: 285.93 ms, target pause time: 300.00 ms]
 17776.029: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 0 regions, survivors: 0 regions, predicted young region time: 0.00 ms]
 17776.029: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 0 regions, survivors: 0 regions, old: 0 regions, predicted pause time: 14.07 ms, target pause time: 300.00 ms]
2016-08-24T15:29:12.305+0000: 17776.033: [SoftReference, 0 refs, 0.0012417 secs]2016-08-24T15:29:12.307+0000: 17776.034: [WeakReference, 0 refs, 0.0007101 secs]2016-08-24T15:29:12.307+0000: 17776.035: [FinalReference, 0 refs, 0.0007027 secs]2016-08-24T15:29:12.308+0000: 17776.035: [PhantomReference, 0 refs, 0 refs, 0.0013585 secs]2016-08-24T15:29:12.309+0000: 17776.037: [JNI Weak Reference, 0.0000118 secs], 0.0089758 secs]
   [Parallel Time: 3.1 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 17776029.2, Avg: 17776029.3, Max: 17776029.4, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 0.8, Avg: 1.1, Max: 2.8, Diff: 1.9, Sum: 24.2]
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff: 1, Sum: 2]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.2]
      [Termination (ms): Min: 0.0, Avg: 1.6, Max: 1.8, Diff: 1.8, Sum: 37.9]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 2.7, Avg: 2.8, Max: 2.9, Diff: 0.2, Sum: 63.8]
      [GC Worker End (ms): Min: 17776032.0, Avg: 17776032.0, Max: 17776032.1, Diff: 0.0]
   [Code Root Fixup: 0.2 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 5.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 4.4 ms]
      [Ref Enq: 0.3 ms]
      [Redirty Cards: 0.3 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.0 ms]
   *[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)]*
 [Times: user=0.08 sys=0.00, real=0.01 secs]
2016-08-24T15:29:12.311+0000: 17776.038: Total time for which application threads were stopped: 0.0103002 seconds, Stopping threads took: 0.0000566 seconds
 17776.039: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 32 bytes]
 17776.039: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432 bytes, attempted expansion amount: 33554432 bytes]
 17776.039: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
2016-08-24T15:29:12.312+0000: 17776.039: [Full GC (Allocation Failure) 2016-08-24T15:29:40.727+0000: 17804.454: [SoftReference, 5504 refs, 0.0012432 secs]2016-08-24T15:29:40.728+0000: 17804.456: [WeakReference, 1964 refs, 0.0003012 secs]2016-08-24T15:29:40.728+0000: 17804.456: [FinalReference, 3270 refs, 0.0033290 secs]2016-08-24T15:29:40.732+0000: 17804.459: [PhantomReference, 0 refs, 75 refs, 0.0000257 secs]2016-08-24T15:29:40.732+0000: 17804.459: [JNI Weak Reference, 0.0000172 secs] 95G->38G(96G), 95.5305034 secs]
   [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->38.9G(96.0G)], [Metaspace: 104180K->103365K(106496K)]
 *[Times: user=157.02 sys=0.28, real=95.54 secs]*

So here we have a lengthy full GC pause that collects quite a bit of old gen (expected). Right before this is a young evac pause.

Why is the heap sizing (bolded) reported after the evac pause showing empty Eden+Survivor?

Why is ergonomic info reporting 0 regions selected (i.e. what's evacuated then)?
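
To make the numbers in the bolded summary line easier to compare, here is a rough Python sketch that parses that line and derives the occupancy outside the young generation and the free heap. The regular expression is an assumption fitted to the line exactly as it appears in this log, and `parse_summary` is a hypothetical helper, not an existing tool.

```python
import re

GIB = 1024 ** 3
UNIT = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": GIB}
SIZE = r"([\d.]+)([BKMG])"

# Fitted to the summary line printed by this 8u91 log; other JDKs may differ.
SUMMARY = re.compile(
    rf"Eden: {SIZE}\({SIZE}\)->{SIZE}\({SIZE}\) "
    rf"Survivors: {SIZE}->{SIZE} "
    rf"Heap: {SIZE}\({SIZE}\)->{SIZE}\({SIZE}\)"
)

KEYS = [
    "eden_before", "eden_cap_before", "eden_after", "eden_cap_after",
    "surv_before", "surv_after",
    "heap_before", "heap_cap_before", "heap_after", "heap_cap_after",
]


def parse_summary(line):
    """Return the sizes from one heap summary line, converted to bytes."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not a G1 heap summary line")
    g = m.groups()
    # Groups come in (number, unit) pairs, in the same order as KEYS.
    values = [float(g[i]) * UNIT[g[i + 1]] for i in range(0, len(g), 2)]
    return dict(zip(KEYS, values))


young_pause = ("[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B "
               "Heap: 95.2G(96.0G)->95.2G(96.0G)]")
s = parse_summary(young_pause)
non_young = s["heap_before"] - s["eden_before"] - s["surv_before"]
free = s["heap_cap_before"] - s["heap_before"]
print(f"non-young occupancy before the pause: {non_young / GIB:.1f} GiB")
print(f"free heap before the pause: {free / GIB:.1f} GiB")
```

Read this way, all 95.2G in use before the young pause sits outside Eden and Survivor space, and roughly 0.8G of the 96G heap is free, far less than the 30G Eden capacity shown in parentheses.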

Right before the Full GC, ergonomics report a failure to expand the heap due to an allocation request of 32 bytes. Is this implying that a mutator tried to allocate 32 bytes but couldn't? How do I reconcile that with Eden+Survivor occupancy reported right above that?

Young gen is sized to 30GB, total heap is 96GB. Allocation rate of the application is roughly 1GB/s. Am I correct in assuming that allocation is outpacing concurrent marking, based on the above? What tunable(s) would you advise to tweak to get G1 to keep up with the allocation rate? I'm ok taking some throughput hit to mitigate 90s+ pauses.

Let me know if any additional info is needed (I have the full GC log, and can attach that if desired).

Thanks

From eric.caspole at oracle.com  Wed Aug 24 21:20:03 2016
From: eric.caspole at oracle.com (Eric Caspole)
Date: Wed, 24 Aug 2016 17:20:03 -0400
Subject: Odd G1GC behavior on 8u91
In-Reply-To:
References:
Message-ID: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com>

I have not used G1 in JDK 8 that much but the two trouble spots to me are:

-XX:InitiatingHeapOccupancyPercent=75
-XX:MaxTenuringThreshold=2

So this will tenure very quickly, filling up old gen and start the marking relatively late at 75%. This looks like it is pretty likely to end up in a STW full GC.
Since you do have a huge amount of garbage getting collected in the full gc maybe try letting more of it die off in young gen with higher tenuring threshold and also start marking earlier than 75%.

good luck,
Eric

From vitalyd at gmail.com  Wed Aug 24 21:32:09 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 24 Aug 2016 17:32:09 -0400
Subject: Odd G1GC behavior on 8u91
In-Reply-To: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com>
References: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com>
Message-ID:

Hi Eric,

Thanks for the reply. Yeah, we're experimenting with reducing the initiating heap occupancy to 55%, hopefully letting the concurrent marking complete before the allocation and promotion rate results in a Full GC. In particular, I'm not sure how to interpret the gc log snippets I posted showing a young evac with an empty Young gen, and then followed by a Full GC due to heap expansion failure for a 32 byte allocation. I'm likely missing something in my interpretation.
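
For a sense of scale, the sketch below works out what the 75% and 55% initiating occupancy settings mean in GiB on this heap. It assumes, as the JDK 8 flag documentation describes, that the percentage is taken against the whole heap; how 8u91 accounts for occupancy internally is not verified here, and the function names are made up for illustration.

```python
HEAP_GIB = 96.0    # -XX:MaxHeapSize=103079215104
YOUNG_GIB = 30.0   # -XX:NewSize and -XX:MaxNewSize = 32212254720


def marking_threshold_gib(ihop_percent, heap_gib=HEAP_GIB):
    """Occupancy that would request a marking cycle, assuming the
    percentage applies to the whole heap capacity."""
    return heap_gib * ihop_percent / 100.0


def report(ihop_percent):
    threshold = marking_threshold_gib(ihop_percent)
    return {
        "ihop": ihop_percent,
        "threshold_gib": threshold,
        # Heap left between the threshold and a completely full heap.
        "headroom_gib": HEAP_GIB - threshold,
        # True when the threshold is above what remains outside a full
        # 30 GiB young generation.
        "threshold_above_non_young_space": threshold > HEAP_GIB - YOUNG_GIB,
    }


for ihop in (75, 55):
    r = report(ihop)
    print(f"IHOP {r['ihop']}%: threshold {r['threshold_gib']:.1f} GiB, "
          f"headroom {r['headroom_gib']:.1f} GiB, "
          f"above non-young space: {r['threshold_above_non_young_space']}")
```

Under these assumptions the 75% threshold (72 GiB) is higher than the 66 GiB left over when the 30 GiB young generation is fully in use, while 55% (52.8 GiB) is not. Whether that matters in practice depends on the full log, so treat it as something to check rather than a diagnosis.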

I think you're also on point about the max tenuring threshold being too low - need to see how that value was arrived at.

Thanks

From yu.zhang at oracle.com  Wed Aug 24 22:11:35 2016
From: yu.zhang at oracle.com (Jenny Zhang)
Date: Wed, 24 Aug 2016 15:11:35 -0700
Subject: Odd G1GC behavior on 8u91
In-Reply-To:
References: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com>
Message-ID:

Hi, Vitaly,

Charlie and I were just talking about your post.

The parameters you used
-XX:MaxNewSize=32212254720 -XX:NewSize=32212254720
(Plus -XX:-UseAdaptiveSizePolicy )
basically fixed the young gen size. Which is reflected in the gc log entries.

As Eric points out, -XX:MaxTenuringThreshold=2 might promote objects to old gen more quickly. But it does not explain the 0 Eden used. Maybe all your objects are humongous objects and allocated to old gen directly?

Do you mind posting the entire gc log? If it is too big, can I copy it from somewhere?

BTW, -XX:PrintFLSStatistics=1 is for CMS, not G1.

Thanks
Jenny
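
As a rough aid for the humongous-object question, this sketch derives a humongous threshold from the log. It assumes the 33554432-byte expansion request in the log corresponds to exactly one G1 region (32 MiB is also the largest region size G1 picks by default), and that an object larger than half a region is treated as humongous; the handling of an object of exactly half a region is an assumption here. `is_humongous` is a hypothetical helper.

```python
MIB = 1024 ** 2

# "requested expansion amount: 33554432 bytes" from the log, assumed here
# to be the size of a single G1 region.
REGION_BYTES = 33554432

# Objects larger than half a region are treated as humongous in this sketch.
HUMONGOUS_THRESHOLD = REGION_BYTES // 2


def is_humongous(object_bytes, region_bytes=REGION_BYTES):
    """Return True when an object would exceed half of one region."""
    return object_bytes > region_bytes // 2


print(f"region size: {REGION_BYTES // MIB} MiB")
print(f"humongous threshold: {HUMONGOUS_THRESHOLD // MIB} MiB")
print(f"32-byte request humongous? {is_humongous(32)}")
```

On these assumptions the threshold is 16 MiB, so the 32-byte request that failed just before the Full GC is not itself humongous; whether larger humongous allocations filled the old generation earlier is something only the full log can show.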
URL:
From yu.zhang at oracle.com Wed Aug 24 22:18:41 2016
From: yu.zhang at oracle.com (Jenny Zhang)
Date: Wed, 24 Aug 2016 15:18:41 -0700
Subject: Odd G1GC behavior on 8u91
In-Reply-To:
References:
Message-ID: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com>

More comments about the questions

Thanks
Jenny

On 8/24/2016 11:43 AM, Vitaly Davidovich wrote:
> Right before the Full GC, ergonomics report a failure to expand the
> heap due to an allocation request of 32 bytes. Is this implying that
> a mutator tried to allocate 32 bytes but couldn't? How do I reconcile
> that with Eden+Survivor occupancy reported right above that?
Yes, it means the mutator tried to allocate 32 bytes but could not get them.
The heap won't expand, as it has already reached the max heap size.

Do you see any humongous object allocations?
>
> Young gen is sized to 30GB, total heap is 96GB. Allocation rate of
> the application is roughly 1GB/s. Am I correct in assuming that
> allocation is outpacing concurrent marking, based on the above? What
> tunable(s) would you advise to tweak to get G1 to keep up with the
> allocation rate? I'm ok taking some throughput hit to mitigate 90s+
> pauses.
>
The entire log might give a better picture, especially whether the marking
cycle is triggered and how well the mixed GC cleans up the heap.

From ecki at zusammenkunft.net Wed Aug 24 22:26:01 2016
From: ecki at zusammenkunft.net (ecki at zusammenkunft.net)
Date: Thu, 25 Aug 2016 00:26:01 +0200
Subject: AW: Odd G1GC behavior on 8u91
In-Reply-To:
References: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com>
Message-ID: <57be1f01.85c11c0a.cb5e1.e1cd@mx.google.com>

Hello,

Did you try to use G1 with no constraints, just setting the heap size and a
not-too-aggressive pause target? That at least should avoid the allocation
failures.

At least it would help tuning to know what pathologies this would trigger
with your app and a 100G heap.
Regards,
Bernd
--
http://bernd.eckenfels.net
>From Win 10 Mobile

From: Vitaly Davidovich
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From poonam.bajaj at oracle.com Wed Aug 24 22:29:17 2016
From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar)
Date: Wed, 24 Aug 2016 15:29:17 -0700
Subject: Odd G1GC behavior on 8u91
In-Reply-To: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com>
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com>
Message-ID:

Also, do you see entries like "[G1Ergonomics (Mixed GCs) do not start mixed
GCs, reason:" in the GC logs? They mean that the mixed GCs are not happening
for some reason. What is the reason listed with these log entries?

Thanks,
Poonam

On 8/24/2016 3:18 PM, Jenny Zhang wrote:
> More comments about the questions
>
> Thanks
> Jenny
>
> On 8/24/2016 11:43 AM, Vitaly Davidovich wrote:
>> Right before the Full GC, ergonomics report a failure to expand the
>> heap due to an allocation request of 32 bytes. Is this implying that
>> a mutator tried to allocate 32 bytes but couldn't? How do I reconcile
>> that with Eden+Survivor occupancy reported right above that?
> Yes, it means the mutator tried to allocate 32 bytes but could not get them.
> The heap won't expand, as it has already reached the max heap size.
>
> Do you see any humongous object allocations?
>>
>> Young gen is sized to 30GB, total heap is 96GB. Allocation rate of
>> the application is roughly 1GB/s. Am I correct in assuming that
>> allocation is outpacing concurrent marking, based on the above? What
>> tunable(s) would you advise to tweak to get G1 to keep up with the
>> allocation rate? I'm ok taking some throughput hit to mitigate 90s+
>> pauses.
>>
> The entire log might give a better picture, especially whether the marking
> cycle is triggered and how well the mixed GC cleans up the heap.
> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Wed Aug 24 22:36:13 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 24 Aug 2016 18:36:13 -0400 Subject: Odd G1GC behavior on 8u91 In-Reply-To: References: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com> Message-ID: Hi Jenny, Very happy that you and Charlie got wind of this thread -- could use your expertise :). I will email you the log directly (it's a bit verbose with all the safepoint + gc logging) as I believe the mailing list software will strip it. To answer/comment on your email ... I believe the fixing of young gen size (and turning off adaptive sizing) was done intentionally. The developers reported that letting G1 manage this ergonomically caused problems, although that may be because the max pause time goal is too aggressive (300ms for such a large heap). This is something we're also looking at revisiting, but trying to get a handle on the other issues first. As for humongous objects, I don't see any trace of them in the log. We actually saw some other poor G1 behavior with some older GC settings whereby the "Finalize Marking" phase was taking hundreds of seconds (same total heap size, but with a 15GB young), and those gc logs did indicate very humongous object allocations. I can certainly try sharing that log with you as well, but I think that's likely a different issue (it's possible it's related to the G1 worker threads marking through large arrays fully, but I'm not sure). I don't see them here though, and just puzzled at the log output thus far. PrintFLSStatistics=1 is likely a leftover from when this app was using CMS. 
As you may have surmised, I'm trying to assist someone else with troubleshooting the G1 problems, so I don't have the full history of all the decisions behind the flags and their values. I can, however, try to find that out if you find it pertinent. Thanks On Wed, Aug 24, 2016 at 6:11 PM, Jenny Zhang wrote: > Hi, Vitaly, > > Charlie and I were just talking about your post. > > The parameters you used > > -XX:MaxNewSize=32212254720 -XX:NewSize=32212254720(Plus > -XX:-UseAdaptiveSizePolicy ) basically fixed the young gen size. Which is > reflected in the gc log entries. > > As Eric points out, the -XX:MaxTenuringThreshold=2 might promotes objects > quicker to old gen. But it does not explain the 0 Eden used. Maybe all your > objects are humongous objects and allocated to old gen directly? > > Do you mind post the entire gc log? If it is too big, can I copy it from > some where? > > BTW, -XX:PrintFLSStatistics=1 is for CMS not G1 > > Thanks > Jenny > > On 8/24/2016 2:32 PM, Vitaly Davidovich wrote: > > Hi Eric, > > Thanks for the reply. Yeah, we're experimenting with reducing the > initiating heap occupancy to 55%, hopefully letting the concurrent marking > complete before the allocation and promotion rate results in a Full GC. In > particular, I'm not sure how to interpret the gc log snippets I posted > showing a young evac with an empty Young gen, and then followed by a Full > GC due to heap expansion failure for a 32 byte allocation. I'm likely > missing something in my interpretation. > > I think you're also on point about the max tenuring threshold being too > low - need to see how that value was arrived at. > > Thanks > > On Wed, Aug 24, 2016 at 5:20 PM, Eric Caspole > wrote: > >> I have not used G1 in JDK 8 that much but the two trouble spots to me are: >> >> -XX:InitiatingHeapOccupancyPercent=75 >> -XX:MaxTenuringThreshold=2 >> >> So this will tenure very quickly, filling up old gen and start the >> marking relatively late at 75%. 
This looks like it is pretty likely to end >> up in a STW full GC. >> Since you do have a huge amount of garbage getting collected in the full >> gc maybe try letting more of it die off in young gen with higher tenuring >> threshold and also start marking earlier than 75%. >> >> good luck, >> Eric >> >> >> >> >> On 08/24/2016 02:43 PM, Vitaly Davidovich wrote: >> >>> Hi guys, >>> >>> Hoping someone could shed some light on G1 behavior (as seen from the gc >>> log) that I'm having a hard time understanding. The root problem is G1 >>> enters a Full GC that takes many tens of seconds, and need some advice >>> on what could be causing it. >>> >>> First, some basic info: >>> Java HotSpot(TM) 64-Bit Server VM (25.91-b14) for linux-amd64 JRE >>> (1.8.0_91-b14), built on Apr 1 2016 00:57:21 by "java_re" with gcc >>> 4.3.0 20080428 (Red Hat 4.3.0-8) >>> Memory: 4k page, physical 264115728k(108464820k free), swap 0k(0k free) >>> CommandLine flags: -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 >>> -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath= >>> -XX:InitialCodeCacheSize=104857600 -XX:InitialHeapSize=103079215104 >>> -XX:InitialTenuringThreshold=2 -XX:InitiatingHeapOccupancyPercent=75 >>> -XX:+ManagementServer -XX:MaxGCPauseMillis=300 >>> -XX:MaxHeapSize=103079215104 -XX:MaxNewSize=32212254720 >>> -XX:MaxTenuringThreshold=2 -XX:NewSize=32212254720 >>> -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy >>> -XX:+PrintCommandLineFlags -XX:+PrintCompilation >>> -XX:PrintFLSStatistics=1 -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime >>> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps >>> -XX:+PrintPromotionFailure -XX:+PrintReferenceGC >>> -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 >>> -XX:+PrintTenuringDistribution -XX:ReservedCodeCacheSize=104857600 >>> -XX:SurvivorRatio=9 -XX:-UseAdaptiveSizePolicy -XX:+UseG1GC >>> >>> Swap is disabled. THP is disabled. 
>>> >>> First issue I have a question about: >>> >>> 2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) >>> (young) >>> Desired survivor size 1795162112 bytes, new threshold 2 (max 2) >>> 17776.029: [G1Ergonomics (CSet Construction) start choosing CSet, >>> _pending_cards: 0, predicted base time: 14.07 ms, remaining time: 285.93 >>> ms, target pause time: 300.00 ms] >>> 17776.029: [G1Ergonomics (CSet Construction) add young regions to CSet, >>> eden: 0 regions, survivors: 0 regions, predicted young region time: 0.00 >>> ms] >>> 17776.029: [G1Ergonomics (CSet Construction) finish choosing CSet, >>> eden: 0 regions, survivors: 0 regions, old: 0 regions, predicted pause >>> time: 14.07 ms, target pause time: 300.00 ms] >>> 2016-08-24T15:29:12.305+0000: 17776.033: [SoftReference, 0 refs, >>> 0.0012417 secs]2016-08-24T15:29:12.307+0000: 17776.034: [WeakReference, >>> 0 refs, 0.0007101 secs]2016-08-24T15:29:12.307+0000: 17776.035: >>> [FinalReference, 0 refs, 0.0007027 secs]2016-08-24T15:29:12.308+0000: >>> 17776.035: [PhantomReference, 0 refs, 0 refs, 0.0013585 >>> secs]2016-08-24T15:29:12.309+0000: 17776.037: [JNI Weak Reference, >>> 0.0000118 secs], 0.0089758 secs] >>> [Parallel Time: 3.1 ms, GC Workers: 23] >>> [GC Worker Start (ms): Min: 17776029.2, Avg: 17776029.3, Max: >>> 17776029.4, Diff: 0.2] >>> [Ext Root Scanning (ms): Min: 0.8, Avg: 1.1, Max: 2.8, Diff: 1.9, >>> Sum: 24.2] >>> [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] >>> [Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff: 1, Sum: 2] >>> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] >>> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, >>> Sum: 0.0] >>> [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: >>> 1.2] >>> [Termination (ms): Min: 0.0, Avg: 1.6, Max: 1.8, Diff: 1.8, Sum: >>> 37.9] >>> [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: >>> 23] >>> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, 
Max: 0.0, Diff: 0.0, >>> Sum: 0.2] >>> [GC Worker Total (ms): Min: 2.7, Avg: 2.8, Max: 2.9, Diff: 0.2, >>> Sum: 63.8] >>> [GC Worker End (ms): Min: 17776032.0, Avg: 17776032.0, Max: >>> 17776032.1, Diff: 0.0] >>> [Code Root Fixup: 0.2 ms] >>> [Code Root Purge: 0.0 ms] >>> [Clear CT: 0.4 ms] >>> [Other: 5.3 ms] >>> [Choose CSet: 0.0 ms] >>> [Ref Proc: 4.4 ms] >>> [Ref Enq: 0.3 ms] >>> [Redirty Cards: 0.3 ms] >>> [Humongous Register: 0.1 ms] >>> [Humongous Reclaim: 0.1 ms] >>> [Free CSet: 0.0 ms] >>> *[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: >>> 95.2G(96.0G)->95.2G(96.0G)]* >>> [Times: user=0.08 sys=0.00, real=0.01 secs] >>> 2016-08-24T15:29:12.311+0000: 17776.038: Total time for which >>> application threads were stopped: 0.0103002 seconds, Stopping threads >>> took: 0.0000566 seconds >>> 17776.039: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> allocation request failed, allocation request: 32 bytes] >>> 17776.039: [G1Ergonomics (Heap Sizing) expand the heap, requested >>> expansion amount: 33554432 bytes, attempted expansion amount: 33554432 >>> bytes] >>> 17776.039: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: >>> heap already fully expanded] >>> 2016-08-24T15:29:12.312+0000: 17776.039: [Full GC (Allocation Failure) >>> 2016-08-24T15:29:40.727+0000: 17804.454: [SoftReference, 5504 refs, >>> 0.0012432 secs]2016-08-24T15:29:40.728+0000: 17804.456: [WeakReference, >>> 1964 refs, 0.0003012 secs]2016-08-24T15:29:40.728+0000: 17804.456: >>> [FinalReference, 3270 refs, 0.0033290 secs]2016-08-24T15:29:40.732+0000: >>> 17804.459: [PhantomReference, 0 refs, 75 refs, 0.0000257 >>> secs]2016-08-24T15:29:40.732+0000: 17804.459: [JNI Weak Reference, >>> 0.0000172 secs] 95G->38G(96G), 95.5305034 secs] >>> [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: >>> 95.2G(96.0G)->38.9G(96.0G)], [Metaspace: 104180K->103365K(106496K)] >>> * [Times: user=157.02 sys=0.28, real=95.54 secs] * >>> * >>> * >>> So here we have a 
lengthy full GC pause that collects quite a bit of old >>> gen (expected). Right before this is a young evac pause. >>> >>> Why is the heap sizing (bolded) reported after the evac pause showing >>> empty Eden+Survivor? >>> >>> Why is ergonomic info reporting 0 regions selected (i.e. what's >>> evacuated then)? >>> >>> Right before the Full GC, ergonomics report a failure to expand the heap >>> due to an allocation request of 32 bytes. Is this implying that a >>> mutator tried to allocate 32 bytes but couldn't? How do I reconcile that >>> with Eden+Survivor occupancy reported right above that? >>> >>> Young gen is sized to 30GB, total heap is 96GB. Allocation rate of the >>> application is roughly 1GB/s. Am I correct in assuming that allocation >>> is outpacing concurrent marking, based on the above? What tunable(s) >>> would you advise to tweak to get G1 to keep up with the allocation rate? >>> I'm ok taking some throughput hit to mitigate 90s+ pauses. >>> >>> Let me know if any additional info is needed (I have the full GC log, >>> and can attach that if desired). >>> >>> Thanks >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From vitalyd at gmail.com Wed Aug 24 22:50:47 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 24 Aug 2016 18:50:47 -0400
Subject: Odd G1GC behavior on 8u91
In-Reply-To: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com>
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com>
Message-ID:

On Wed, Aug 24, 2016 at 6:18 PM, Jenny Zhang wrote:

> More comments about the questions
>
> Thanks
> Jenny
>
> On 8/24/2016 11:43 AM, Vitaly Davidovich wrote:
>
>> Right before the Full GC, ergonomics report a failure to expand the heap
>> due to an allocation request of 32 bytes. Is this implying that a mutator
>> tried to allocate 32 bytes but couldn't? How do I reconcile that with
>> Eden+Survivor occupancy reported right above that?
>>
> Yes, it means the mutator tried to allocate 32 bytes but could not get them.
> The heap won't expand, as it has already reached the max heap size.
>
> Do you see any humongous object allocations?

As mentioned in my previous email, I don't see any humongous allocations
recorded. The Humongous Register/Reclaim output does show non-zero timing, so
I'm not sure if the log is simply missing them or something else is going on.

>
>> Young gen is sized to 30GB, total heap is 96GB. Allocation rate of the
>> application is roughly 1GB/s. Am I correct in assuming that allocation is
>> outpacing concurrent marking, based on the above? What tunable(s) would you
>> advise to tweak to get G1 to keep up with the allocation rate? I'm ok
>> taking some throughput hit to mitigate 90s+ pauses.
>>
> The entire log might give a better picture, especially whether the marking
> cycle is triggered and how well the mixed GC cleans up the heap.
>

Are there particular gc lines you're looking for? I can grep for them quickly
and provide that for you.
-------------- next part --------------
An HTML attachment was scrubbed...
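[Editorial sketch] The kind of quick grep offered above could look roughly
like the following. This is only a minimal sketch, not anything posted in the
thread: the function name count_markers and the MARKERS table are made up for
illustration. The search strings are taken from log excerpts quoted in this
thread, except the "(mixed)" pause pattern, which is an assumption about how
a mixed pause would be labeled in a JDK 8 G1 log.

```python
import re
from collections import Counter

# Hypothetical marker table. Patterns are copied from log excerpts quoted in
# this thread, except "mixed_pause", which is an assumed JDK 8 G1 label.
MARKERS = {
    "full_gc": re.compile(r"\[Full GC"),
    "to_space_exhausted": re.compile(r"to-space exhausted"),
    "no_mixed": re.compile(r"do not start mixed GCs"),
    "cycle_request": re.compile(r"request concurrent cycle initiation"),
    "mixed_pause": re.compile(r"\[GC pause .*\(mixed\)"),  # assumption
}


def count_markers(lines):
    """Count how many lines contain each marker of interest."""
    counts = Counter()
    for line in lines:
        for name, pattern in MARKERS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (file name is a placeholder): python scan_gc_log.py gc.log
    with open(sys.argv[1], errors="replace") as log:
        for name, n in sorted(count_markers(log).items()):
            print(f"{name}: {n}")
```

A log with many "to_space_exhausted" hits and no "mixed_pause" hits would be
worth a closer look, but the counts alone do not explain the cause.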
URL:
From vitalyd at gmail.com Wed Aug 24 22:53:15 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 24 Aug 2016 18:53:15 -0400
Subject: Odd G1GC behavior on 8u91
In-Reply-To: <57be1f01.85c11c0a.cb5e1.e1cd@mx.google.com>
References: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com> <57be1f01.85c11c0a.cb5e1.e1cd@mx.google.com>
Message-ID:

Hi Bernd,

No, haven't tried that yet -- for now, I'm trying to understand the current
G1 behavior from the logs, and then hopefully figure out what needs tweaking
once I've understood the reason(s).

Thanks

On Wed, Aug 24, 2016 at 6:26 PM, wrote:

> Hello,
>
> Did you try to use G1 with no constraints, just setting the heap size and a
> not-too-aggressive pause target? That at least should avoid the allocation
> failures.
>
> At least it would help tuning to know what pathologies this would trigger
> with your app and a 100G heap.
>
> Regards,
> Bernd
> --
> http://bernd.eckenfels.net
> From Win 10 Mobile
>
> *From: *Vitaly Davidovich
> *Sent: *Thursday, 25 August 2016 00:02
> *To: *Eric Caspole
> *Cc: *hotspot-gc-use
> *Subject: *Re: Odd G1GC behavior on 8u91
>
> Hi Eric,
>
> Thanks for the reply. Yeah, we're experimenting with reducing the
> initiating heap occupancy to 55%, hopefully letting the concurrent marking
> complete before the allocation and promotion rate results in a Full GC. In
> particular, I'm not sure how to interpret the gc log snippets I posted
> showing a young evac with an empty Young gen, and then followed by a Full
> GC due to heap expansion failure for a 32 byte allocation. I'm likely
> missing something in my interpretation.
>
> I think you're also on point about the max tenuring threshold being too
> low - need to see how that value was arrived at.
> > > > Thanks > > > > On Wed, Aug 24, 2016 at 5:20 PM, Eric Caspole > wrote: > > I have not used G1 in JDK 8 that much but the two trouble spots to me are: > > -XX:InitiatingHeapOccupancyPercent=75 > -XX:MaxTenuringThreshold=2 > > So this will tenure very quickly, filling up old gen and start the marking > relatively late at 75%. This looks like it is pretty likely to end up in a > STW full GC. > Since you do have a huge amount of garbage getting collected in the full > gc maybe try letting more of it die off in young gen with higher tenuring > threshold and also start marking earlier than 75%. > > good luck, > Eric > > > > > > On 08/24/2016 02:43 PM, Vitaly Davidovich wrote: > > Hi guys, > > Hoping someone could shed some light on G1 behavior (as seen from the gc > log) that I'm having a hard time understanding. The root problem is G1 > enters a Full GC that takes many tens of seconds, and need some advice > on what could be causing it. > > First, some basic info: > Java HotSpot(TM) 64-Bit Server VM (25.91-b14) for linux-amd64 JRE > (1.8.0_91-b14), built on Apr 1 2016 00:57:21 by "java_re" with gcc > 4.3.0 20080428 (Red Hat 4.3.0-8) > Memory: 4k page, physical 264115728k(108464820k free), swap 0k(0k free) > CommandLine flags: -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 > -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath= > -XX:InitialCodeCacheSize=104857600 -XX:InitialHeapSize=103079215104 > -XX:InitialTenuringThreshold=2 -XX:InitiatingHeapOccupancyPercent=75 > -XX:+ManagementServer -XX:MaxGCPauseMillis=300 > -XX:MaxHeapSize=103079215104 -XX:MaxNewSize=32212254720 > -XX:MaxTenuringThreshold=2 -XX:NewSize=32212254720 > -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy > -XX:+PrintCommandLineFlags -XX:+PrintCompilation > -XX:PrintFLSStatistics=1 -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime > -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > -XX:+PrintPromotionFailure -XX:+PrintReferenceGC > -XX:+PrintSafepointStatistics 
-XX:PrintSafepointStatisticsCount=1 > -XX:+PrintTenuringDistribution -XX:ReservedCodeCacheSize=104857600 > -XX:SurvivorRatio=9 -XX:-UseAdaptiveSizePolicy -XX:+UseG1GC > > Swap is disabled. THP is disabled. > > First issue I have a question about: > > 2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 1795162112 bytes, new threshold 2 (max 2) > 17776.029: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 0, predicted base time: 14.07 ms, remaining time: 285.93 > ms, target pause time: 300.00 ms] > 17776.029: [G1Ergonomics (CSet Construction) add young regions to CSet, > eden: 0 regions, survivors: 0 regions, predicted young region time: 0.00 > ms] > 17776.029: [G1Ergonomics (CSet Construction) finish choosing CSet, > eden: 0 regions, survivors: 0 regions, old: 0 regions, predicted pause > time: 14.07 ms, target pause time: 300.00 ms] > 2016-08-24T15:29:12.305+0000: 17776.033: [SoftReference, 0 refs, > 0.0012417 secs]2016-08-24T15:29:12.307+0000: 17776.034: [WeakReference, > 0 refs, 0.0007101 secs]2016-08-24T15:29:12.307+0000: 17776.035: > [FinalReference, 0 refs, 0.0007027 secs]2016-08-24T15:29:12.308+0000: > 17776.035: [PhantomReference, 0 refs, 0 refs, 0.0013585 > secs]2016-08-24T15:29:12.309+0000: 17776.037: [JNI Weak Reference, > 0.0000118 secs], 0.0089758 secs] > [Parallel Time: 3.1 ms, GC Workers: 23] > [GC Worker Start (ms): Min: 17776029.2, Avg: 17776029.3, Max: > 17776029.4, Diff: 0.2] > [Ext Root Scanning (ms): Min: 0.8, Avg: 1.1, Max: 2.8, Diff: 1.9, > Sum: 24.2] > [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] > [Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff: 1, Sum: 2] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.0] > [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.2] > [Termination (ms): Min: 0.0, Avg: 1.6, Max: 1.8, Diff: 1.8, Sum: 
> 37.9] > [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23] > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, > Sum: 0.2] > [GC Worker Total (ms): Min: 2.7, Avg: 2.8, Max: 2.9, Diff: 0.2, > Sum: 63.8] > [GC Worker End (ms): Min: 17776032.0, Avg: 17776032.0, Max: > 17776032.1, Diff: 0.0] > [Code Root Fixup: 0.2 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.4 ms] > [Other: 5.3 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 4.4 ms] > [Ref Enq: 0.3 ms] > [Redirty Cards: 0.3 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.0 ms] > > *[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: > 95.2G(96.0G)->95.2G(96.0G)]* > [Times: user=0.08 sys=0.00, real=0.01 secs] > 2016-08-24T15:29:12.311+0000: 17776.038: Total time for which > application threads were stopped: 0.0103002 seconds, Stopping threads > took: 0.0000566 seconds > 17776.039: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: > allocation request failed, allocation request: 32 bytes] > 17776.039: [G1Ergonomics (Heap Sizing) expand the heap, requested > expansion amount: 33554432 bytes, attempted expansion amount: 33554432 > bytes] > 17776.039: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: > heap already fully expanded] > 2016-08-24T15:29:12.312+0000: 17776.039: [Full GC (Allocation Failure) > 2016-08-24T15:29:40.727+0000: 17804.454: [SoftReference, 5504 refs, > 0.0012432 secs]2016-08-24T15:29:40.728+0000: 17804.456: [WeakReference, > 1964 refs, 0.0003012 secs]2016-08-24T15:29:40.728+0000: 17804.456: > [FinalReference, 3270 refs, 0.0033290 secs]2016-08-24T15:29:40.732+0000: > 17804.459: [PhantomReference, 0 refs, 75 refs, 0.0000257 > secs]2016-08-24T15:29:40.732+0000: 17804.459: [JNI Weak Reference, > 0.0000172 secs] 95G->38G(96G), 95.5305034 secs] > [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: > 95.2G(96.0G)->38.9G(96.0G)], [Metaspace: 104180K->103365K(106496K)] > * [Times: user=157.02 sys=0.28, real=95.54 
secs] * > * > * > So here we have a lengthy full GC pause that collects quite a bit of old > gen (expected). Right before this is a young evac pause. > > Why is the heap sizing (bolded) reported after the evac pause showing > empty Eden+Survivor? > > Why is ergonomic info reporting 0 regions selected (i.e. what's > evacuated then)? > > Right before the Full GC, ergonomics report a failure to expand the heap > due to an allocation request of 32 bytes. Is this implying that a > mutator tried to allocate 32 bytes but couldn't? How do I reconcile that > with Eden+Survivor occupancy reported right above that? > > Young gen is sized to 30GB, total heap is 96GB. Allocation rate of the > application is roughly 1GB/s. Am I correct in assuming that allocation > is outpacing concurrent marking, based on the above? What tunable(s) > would you advise to tweak to get G1 to keep up with the allocation rate? > I'm ok taking some throughput hit to mitigate 90s+ pauses. > > Let me know if any additional info is needed (I have the full GC log, > and can attach that if desired). > > Thanks > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
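[Editorial sketch] The flag values quoted in this message can be turned into
sizes with a little arithmetic. The following is only a sketch under stated
assumptions: the helper names are made up, and dividing the capacity by 100
before multiplying by the percentage is an assumption about rounding. It
happens to reproduce the 77309411325-byte threshold shown in a log line
quoted in a later reply. The sketch says nothing about which occupancy figure
G1 compares against that threshold.

```python
GIB = 1024 ** 3

# Values copied from the command-line flags quoted in this thread.
HEAP_BYTES = 103079215104   # -XX:MaxHeapSize / -XX:InitialHeapSize
YOUNG_BYTES = 32212254720   # -XX:NewSize / -XX:MaxNewSize
IHOP_PERCENT = 75           # -XX:InitiatingHeapOccupancyPercent


def ihop_threshold_bytes(capacity, percent):
    # Assumed rounding: integer-divide by 100 first, then multiply.
    return (capacity // 100) * percent


def summarize():
    """Return the sizes implied by the flags, mostly in GiB."""
    threshold = ihop_threshold_bytes(HEAP_BYTES, IHOP_PERCENT)
    return {
        "heap_gib": HEAP_BYTES / GIB,
        "young_gib": YOUNG_BYTES / GIB,
        "non_young_gib": (HEAP_BYTES - YOUNG_BYTES) / GIB,
        "threshold_bytes": threshold,
        "threshold_gib": round(threshold / GIB, 2),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

With these inputs the fixed young generation is 30 GiB of a 96 GiB heap,
which leaves 66 GiB outside it, while the 75% threshold works out to about
72 GiB.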
URL: From vitalyd at gmail.com Wed Aug 24 22:55:59 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 24 Aug 2016 18:55:59 -0400 Subject: Odd G1GC behavior on 8u91 In-Reply-To: References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> Message-ID: On Wed, Aug 24, 2016 at 6:29 PM, Poonam Bajaj Parhar < poonam.bajaj at oracle.com> wrote: > Also, do you see entries like "*[G1Ergonomics (Mixed GCs) do not start > mixed GCs, reason:" *in the GC logs which mean that the mixed GCs are not > happening due to some reason. What is the reason listed with these log > entries? > Hi Poonam, Yes, I do see a few those, but only very early in the process lifetime, and nowhere near the Full GCs. 2016-08-24T10:33:04.733+0000: 8.460: [SoftReference, 0 refs, 0.0010108 secs]2016-08-24T10:33:04.734+0000: 8.461: [WeakReference, 383 refs, 0.0006608 secs]2016-08-24T10:33:04.735+0000: 8.462: [FinalReference, 4533 refs, 0.0020491 secs]2016-08-24T10:33:04.737+0000: 8.464: [PhantomReference, 0 refs, 15 refs, 0.0011945 secs]2016-08-24T10:33:04.738+0000: 8.465: [JNI Weak Reference, 0.0000360 secs] 8.467: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start] 2016-08-24T10:35:22.846+0000: 146.574: [SoftReference, 0 refs, 0.0011450 secs]2016-08-24T10:35:22.847+0000: 146.575: [WeakReference, 440 refs, 0.0006071 secs]2016-08-24T10:35:22.848+0000: 146.575: [FinalReference, 7100 refs, 0.0018074 secs]2016-08-24T10:35:22.850+0000: 146.577: [PhantomReference, 0 refs, 76 refs, 0.0013148 secs]2016-08-24T10:35:22.851+0000: 146.579: [JNI Weak Reference, 0.0000443 secs] 146.584: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start] 2016-08-24T10:35:56.507+0000: 180.234: [SoftReference, 0 refs, 0.0010184 secs]2016-08-24T10:35:56.508+0000: 180.235: [WeakReference, 138 refs, 0.0006883 secs]2016-08-24T10:35:56.508+0000: 180.236: [FinalReference, 3682 refs, 0.0023152 secs]2016-08-24T10:35:56.511+0000: 180.238: 
[PhantomReference, 0 refs, 45 refs, 0.0012558 secs]2016-08-24T10:35:56.512+0000: 180.239: [JNI Weak Reference, 0.0000197 secs] 180.247: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start] 2016-08-24T10:37:33.387+0000: 277.114: [SoftReference, 0 refs, 0.0010965 secs]2016-08-24T10:37:33.388+0000: 277.115: [WeakReference, 5 refs, 0.0006378 secs]2016-08-24T10:37:33.388+0000: 277.116: [FinalReference, 3440 refs, 0.0028640 secs]2016-08-24T10:37:33.391+0000: 277.119: [PhantomReference, 0 refs, 0 refs, 0.0011392 secs]2016-08-24T10:37:33.392+0000: 277.120: [JNI Weak Reference, 0.0000148 secs] 277.130: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: candidate old regions not available] Does that tell you anything? > > Thanks, > Poonam > > On 8/24/2016 3:18 PM, Jenny Zhang wrote: > > More comments about the questions > > Thanks > Jenny > > On 8/24/2016 11:43 AM, Vitaly Davidovich wrote: > > Right before the Full GC, ergonomics report a failure to expand the heap > due to an allocation request of 32 bytes. Is this implying that a mutator > tried to allocate 32 bytes but couldn't? How do I reconcile that with > Eden+Survivor occupancy reported right above that? > > Yes, it means the mutator tries to allocate 32byte but can not get it. > Heap won't expand as it already reaches max heap. > > Do you see any humongous objects allocatoin? > > > Young gen is sized to 30GB, total heap is 96GB. Allocation rate of the > application is roughly 1GB/s. Am I correct in assuming that allocation is > outpacing concurrent marking, based on the above? What tunable(s) would you > advise to tweak to get G1 to keep up with the allocation rate? I'm ok > taking some throughput hit to mitigate 90s+ pauses. > > The entire log might give a better picture. Especially if the marking > cycle is triggered, how well the mixed gc cleans up the heap. 
> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Thu Aug 25 01:18:10 2016 From: yu.zhang at oracle.com (yu.zhang at oracle.com) Date: Wed, 24 Aug 2016 18:18:10 -0700 Subject: Odd G1GC behavior on 8u91 In-Reply-To: References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> Message-ID: <27125e40-05fc-3b0a-51ed-dcc12b78c420@oracle.com> Vitaly, Thanks for the gc logs. Before reaching full gc and the entry at 2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) (young) ... [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)] There are several young gcs with 'to-space exhausted'. 
For example:

2016-08-24T15:29:03.936+0000: 17767.663: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
 17767.663: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 185708, predicted base time: 46.71 ms, remaining time: 253.29 ms, target pause time: 300.00 ms]
 17767.663: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 159 regions, survivors: 0 regions, predicted young region time: 173.05 ms]
 17767.663: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 159 regions, survivors: 0 regions, old: 0 regions, predicted pause time: 219.76 ms, target pause time: 300.00 ms]
 17767.664: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 16340088 bytes]
 17767.664: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 16340088 bytes, attempted expansion amount: 33554432 bytes]
 17767.664: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
...
 (to-space exhausted), 8.0293588 secs]
   [Parallel Time: 6920.2 ms, GC Workers: 23]
...
   [Other: 1106.9 ms]
      [Evacuation Failure: 1095.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 4.4 ms]
      [Ref Enq: 0.4 ms]
      [Redirty Cards: 4.8 ms]
      [Humongous Register: 0.2 ms]
      [Humongous Reclaim: 0.2 ms]
      [Free CSet: 0.3 ms]
   [Eden: 5088.0M(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)]
 [Times: user=68.75 sys=1.66, real=8.03 secs]

Please note the high "Evacuation Failure" time. Some of those young GCs
reclaim some heap; others (like this one) do not reclaim any.
(to-space exhausted) means G1 cannot find enough old regions when evacuating
from the young regions.

There are only 4 marking cycles, 3 of which are at the beginning due to
metadata space. Mixed GC is not triggered at that time since the heap is not
full enough. There is only 1 marking cycle requested due to heap occupancy,
but it is too late.
The heap is very full already. No mixed gc found in the log. "request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 88080384000 bytes, allocation request: 0 bytes, threshold: 77309411325 bytes (75.00 %), source: end of GC]" To avoid this, as others suggested, you can try 1. set a reasonable pause time goal and let g1 decide the young gen size 2. if you have to set the young gen size fixed, reduce the NewSize, increase max tenure threshold to let objects die in young gen. 3. Reduce InitiatingHeapOccupancyPercent The goal is to trigger more mixed gc and avoid to-space exhausted. The message after 'to-space exhausted' might be confusing. I need to discuss this with dev team. For example, at time stamp 2016-08-24T15:28:05.905+0000: 17709.633: [GC pause (G1 Evacuation Pause) (young) ... (to-space exhausted), 2.6149566 secs] ... [Eden: 28.4G(28.4G)->0.0B(28.4G) Survivors: 1664.0M->1664.0M Heap: 93.5G(96.0G)->73.9G(96.0G)] the eden used after gc might not be true. I will do some investigation and get back to you. Thanks Jenny On 08/24/2016 03:50 PM, Vitaly Davidovich wrote: > > > On Wed, Aug 24, 2016 at 6:18 PM, Jenny Zhang > wrote: > > More comments about the questions > > Thanks > Jenny > > On 8/24/2016 11:43 AM, Vitaly Davidovich wrote: > > Right before the Full GC, ergonomics report a failure to > expand the heap due to an allocation request of 32 bytes. Is > this implying that a mutator tried to allocate 32 bytes but > couldn't? How do I reconcile that with Eden+Survivor occupancy > reported right above that? > > Yes, it means the mutator tries to allocate 32byte but can not get > it. Heap won't expand as it already reaches max heap. > > Do you see any humongous objects allocatoin? > > As mentioned in my previous email, I don't see any humongous > allocations recorded. The Humongous Register/Reclaim output does show > non-zero timing, so not sure if the log is simply missing them or > something else is going on. 
> > > > Young gen is sized to 30GB, total heap is 96GB. Allocation > rate of the application is roughly 1GB/s. Am I correct in > assuming that allocation is outpacing concurrent marking, > based on the above? What tunable(s) would you advise to tweak > to get G1 to keep up with the allocation rate? I'm ok taking > some throughput hit to mitigate 90s+ pauses. > > The entire log might give a better picture. Especially if the > marking cycle is triggered, how well the mixed gc cleans up the heap. > > Are there particular gc lines you're looking for? I can grep for them > quickly and provide that for you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu Aug 25 11:15:11 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 25 Aug 2016 07:15:11 -0400 Subject: Odd G1GC behavior on 8u91 In-Reply-To: <27125e40-05fc-3b0a-51ed-dcc12b78c420@oracle.com> References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <27125e40-05fc-3b0a-51ed-dcc12b78c420@oracle.com> Message-ID: Hi Jenny, Thanks very much for your analysis. A few comments inline below. On Wednesday, August 24, 2016, yu.zhang at oracle.com > wrote: > Vitaly, > > Thanks for the gc logs. > > Before reaching full gc and the entry at > > 2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) > (young) > > ... > > [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: > 95.2G(96.0G)->95.2G(96.0G)] > There are several young gcs with 'to-space exhausted'. 
> For example
>
> 2016-08-24T15:29:03.936+0000: 17767.663: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
> 17767.663: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 185708, predicted base time: 46.71 ms, remaining time: 253.29 ms, target pause time: 300.00 ms]
> 17767.663: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 159 regions, survivors: 0 regions, predicted young region time: 173.05 ms]
> 17767.663: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 159 regions, survivors: 0 regions, old: 0 regions, predicted pause time: 219.76 ms, target pause time: 300.00 ms]
> 17767.664: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 16340088 bytes]
> 17767.664: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 16340088 bytes, attempted expansion amount: 33554432 bytes]
> 17767.664: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
> ...
> (to-space exhausted), 8.0293588 secs]
> [Parallel Time: 6920.2 ms, GC Workers: 23]
> ...
> [Other: 1106.9 ms]
> [Evacuation Failure: 1095.6 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 4.4 ms]
> [Ref Enq: 0.4 ms]
> [Redirty Cards: 4.8 ms]
> [Humongous Register: 0.2 ms]
> [Humongous Reclaim: 0.2 ms]
> [Free CSet: 0.3 ms]
> [Eden: 5088.0M(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)]
> [Times: user=68.75 sys=1.66, real=8.03 secs]
>
> Please note the high "Evacuation Failure" time. Some of those young gcs
> reclaim some space. Others (like this one) do not reclaim any heap.
>
> (to-space exhausted) means G1 cannot find enough old regions when
> evacuating from the young regions.

The above log also says that Eden occupancy went from ~5G to 0. Given the to-space exhaustion and Survivors being 0, that implies it was all garbage. If that's the case, what exactly did it fail to evacuate? If it failed to evacuate those objects, where did they go, given that Eden + Survivors are 0? Also, Survivors was apparently 0 to start with, so it's unclear to me what exactly happened here.

> There are only 4 marking cycles, 3 of which occur at the beginning because
> of metadata space. Mixed gc is not triggered at that time since the heap is
> not full enough.
>
> There is only 1 marking cycle requested due to heap occupancy, but it is
> too late. The heap is very full already. No mixed gc is found in the log.
> "request concurrent cycle initiation, reason: occupancy higher than
> threshold, occupancy: 88080384000 bytes, allocation request: 0 bytes,
> threshold: 77309411325 bytes (75.00 %), source: end of GC]"

Any ideas why the concurrent cycle is initiated when occupancy is already well above the 75% threshold? Why wouldn't it trigger very close to the threshold? It's "late" by about 10G here.

> To avoid this, as others suggested, you can try:
> 1. Set a reasonable pause time goal and let g1 decide the young gen size.

Is there a rule of thumb for picking reasonable pause time goals based on heap size/alloc rate/etc?

> 2. If you have to fix the young gen size, reduce the NewSize and increase
> the max tenuring threshold to let objects die in the young gen.
> 3. Reduce InitiatingHeapOccupancyPercent.
> The goal is to trigger more mixed gcs and avoid to-space exhausted.

Right, we're trying with a lower IHOP value (55%).

> The message after 'to-space exhausted' might be confusing. I need to
> discuss this with the dev team. For example, at time stamp
> 2016-08-24T15:28:05.905+0000: 17709.633: [GC pause (G1 Evacuation Pause) (young)
> ...
> (to-space exhausted), 2.6149566 secs]
> ...
> [Eden: 28.4G(28.4G)->0.0B(28.4G) Survivors: 1664.0M->1664.0M Heap: 93.5G(96.0G)->73.9G(96.0G)]
>
> the eden used after gc might not be true. I will do some investigation and
> get back to you.

Thanks. Yeah, it's confusing. I'm still not sure I understand the log snippet I pasted in my initial email of the young evac immediately preceding the Full GC - it showed 0 regions in the CSet, so nothing was evacuated, but it also showed Eden occupancy of 0. It's then unclear why the Full GC triggers immediately afterwards due to a 32 byte alloc request.

Do you think that may be a bogus log as well?

> Thanks
> Jenny

Thanks again Jenny, this is very helpful.
--
Sent from my phone

From vitalyd at gmail.com Thu Aug 25 13:43:08 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 25 Aug 2016 09:43:08 -0400
Subject: Odd G1GC behavior on 8u91
In-Reply-To:
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <27125e40-05fc-3b0a-51ed-dcc12b78c420@oracle.com>
Message-ID:

So, tried to run with IHOP=55 today, and things don't look better - lengthy Object Copy times, and lengthy Finalize Marking phases. E.g.:

2016-08-25T11:23:43.191+0000: 3509.335: [GC pause (GCLocker Initiated GC) (young)
Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
- age 1: 1793677792 bytes, 1793677792 total
- age 2: 641809160 bytes, 2435486952 total
 3509.335: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 152524, predicted base time: 45.68 ms, remaining time: 254.32 ms, target pause time: 300.00 ms]
 3509.335: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 872 regions, survivors: 89 regions, predicted young region time: 20434.47 ms]
 3509.335: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 872 regions, survivors: 89 regions, old: 0 regions, predicted pause time: 20480.15 ms, target pause time: 300.00 ms]
2016-08-25T11:24:09.459+0000: 3535.603: [SoftReference, 0 refs, 0.0011022 secs]2016-08-25T11:24:09.460+0000: 3535.604: [WeakReference, 1 refs, 0.0006259 secs]2016-08-25T11:24:09.461+0000: 3535.605: [FinalReference, 2131 refs, 0.0008182 secs]2016-08-25T11:24:09.462+0000: 3535.606: [PhantomReference, 0 refs, 0 refs, 0.0013352 secs]2016-08-25T11:24:09.463+0000: 3535.607: [JNI Weak Reference, 0.0000145 secs]
 3535.621: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 47.64 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 26.2863292 secs]
   [Parallel Time: 26267.1 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 3509335.4, Avg: 3509335.5, Max: 3509335.7, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 0.9, Avg: 1.2, Max: 2.6, Diff: 1.7, Sum: 26.5]
      [Update RS (ms): Min: 22.7, Avg: 23.7, Max: 24.4, Diff: 1.7, Sum: 546.2]
         [Processed Buffers: Min: 22, Avg: 34.3, Max: 54, Diff: 32, Sum: 788]
      [Scan RS (ms): Min: 58.0, Avg: 59.5, Max: 60.3, Diff: 2.3, Sum: 1368.7]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [Object Copy (ms): Min: 22450.6, Avg: 22615.0, Max: 22768.2, Diff: 317.5, Sum: 520143.9]
      [Termination (ms): Min: 3413.9, Avg: 3567.2, Max: 3732.4, Diff: 318.5, Sum: 82045.7]
         [Termination Attempts: Min: 601032, Avg: 630093.2, Max: 639709, Diff: 38677, Sum: 14492144]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 3.3]
      [GC Worker Total (ms): Min: 26266.6, Avg: 26266.8, Max: 26266.9, Diff: 0.3, Sum: 604136.0]
      [GC Worker End (ms): Min: 3535602.2, Avg: 3535602.3, Max: 3535602.4, Diff: 0.2]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 5.3 ms]
   [Other: 13.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 4.3 ms]
      [Ref Enq: 0.2 ms]
      [Redirty Cards: 4.5 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 3.4 ms]
   [Eden: 27.2G(27.2G)->0.0B(27.2G) Survivors: 2848.0M->2880.0M Heap: 83.0G(96.0G)->56.5G(96.0G)]
 [Times: user=603.54 sys=0.02, real=26.29 secs]
2016-08-25T11:24:09.477+0000: 3535.621: Total time for which application threads were stopped: 26.2879077 seconds, Stopping threads took: 0.0002478 seconds
2016-08-25T11:24:10.100+0000: 3536.244: [GC concurrent-mark-end, 43.5578776 secs]
2016-08-25T11:24:10.103+0000: 3536.247: [GC remark 2016-08-25T11:24:10.103+0000: 3536.247: [Finalize Marking, 85.4564628 secs] 2016-08-25T11:25:35.559+0000: 3621.703: [GC ref-proc2016-08-25T11:25:35.559+0000: 3621.703: [SoftReference, 0 refs, 0.0011470 secs]2016-08-25T11:25:35.561+0000: 3621.705: [WeakReference, 445 refs, 0.0008184 secs]2016-08-25T11:25:35.561+0000: 3621.705: [FinalReference, 206 refs, 0.0008055 secs]2016-08-25T11:25:35.562+0000: 3621.706: [PhantomReference, 0 refs, 48 refs, 0.0015741 secs]2016-08-25T11:25:35.564+0000: 3621.708: [JNI Weak Reference, 0.0000350 secs], 0.0047317 secs] 2016-08-25T11:25:35.564+0000: 3621.708: [Unloading, 0.0163349 secs], 85.5620546 secs]
 [Times: user=1373.18 sys=588.01, real=85.57 secs]

Jenny, I'll send you the full log offline.

Thanks

From poonam.bajaj at oracle.com Thu Aug 25 16:28:49 2016
From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar)
Date: Thu, 25 Aug 2016 09:28:49 -0700
Subject: Odd G1GC behavior on 8u91
In-Reply-To:
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com>
Message-ID: <4e4fda11-ea57-b1cb-84ef-958032b51c3a@oracle.com>

Hello Vitaly,

On 8/24/2016 3:55 PM, Vitaly Davidovich wrote:
> On Wed, Aug 24, 2016 at 6:29 PM, Poonam Bajaj Parhar wrote:
>> Also, do you see entries like "[G1Ergonomics (Mixed GCs) do not
>> start mixed GCs, reason:" in the GC logs, which mean that the
>> mixed GCs are not happening due to some reason? What is the reason
>> listed with these log entries?
> Hi Poonam,
>
> Yes, I do see a few of those, but only very early in the process
> lifetime, and nowhere near the Full GCs.
>
> 2016-08-24T10:33:04.733+0000: 8.460: [SoftReference, 0 refs, 0.0010108 secs]2016-08-24T10:33:04.734+0000: 8.461: [WeakReference, 383 refs, 0.0006608 secs]2016-08-24T10:33:04.735+0000: 8.462: [FinalReference, 4533 refs, 0.0020491 secs]2016-08-24T10:33:04.737+0000: 8.464: [PhantomReference, 0 refs, 15 refs, 0.0011945 secs]2016-08-24T10:33:04.738+0000: 8.465: [JNI Weak Reference, 0.0000360 secs] 8.467: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start]
> 2016-08-24T10:35:22.846+0000: 146.574: [SoftReference, 0 refs, 0.0011450 secs]2016-08-24T10:35:22.847+0000: 146.575: [WeakReference, 440 refs, 0.0006071 secs]2016-08-24T10:35:22.848+0000: 146.575: [FinalReference, 7100 refs, 0.0018074 secs]2016-08-24T10:35:22.850+0000: 146.577: [PhantomReference, 0 refs, 76 refs, 0.0013148 secs]2016-08-24T10:35:22.851+0000: 146.579: [JNI Weak Reference, 0.0000443 secs] 146.584: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start]
> 2016-08-24T10:35:56.507+0000: 180.234: [SoftReference, 0 refs, 0.0010184 secs]2016-08-24T10:35:56.508+0000: 180.235: [WeakReference, 138 refs, 0.0006883 secs]2016-08-24T10:35:56.508+0000: 180.236: [FinalReference, 3682 refs, 0.0023152 secs]2016-08-24T10:35:56.511+0000: 180.238: [PhantomReference, 0 refs, 45 refs, 0.0012558 secs]2016-08-24T10:35:56.512+0000: 180.239: [JNI Weak Reference, 0.0000197 secs] 180.247: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start]

The above entries should be okay.

> 2016-08-24T10:37:33.387+0000: 277.114: [SoftReference, 0 refs, 0.0010965 secs]2016-08-24T10:37:33.388+0000: 277.115: [WeakReference, 5 refs, 0.0006378 secs]2016-08-24T10:37:33.388+0000: 277.116: [FinalReference, 3440 refs, 0.0028640 secs]2016-08-24T10:37:33.391+0000: 277.119: [PhantomReference, 0 refs, 0 refs, 0.0011392 secs]2016-08-24T10:37:33.392+0000: 277.120: [JNI Weak Reference, 0.0000148 secs] 277.130: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: candidate old regions not available]

If these appear only during startup, I wouldn't worry about them either.

Do you see mixed GCs happening later during the run? If yes, then it's just that the mixed GCs are not quite enough to keep pace with the allocations/promotions into the old regions.

To increase the number of old regions included in the cset, you could try increasing the value of G1MixedGCLiveThresholdPercent.

Thanks,
Poonam

From vitalyd at gmail.com Thu Aug 25 16:35:17 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 25 Aug 2016 12:35:17 -0400
Subject: Odd G1GC behavior on 8u91
In-Reply-To: <4e4fda11-ea57-b1cb-84ef-958032b51c3a@oracle.com>
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <4e4fda11-ea57-b1cb-84ef-958032b51c3a@oracle.com>
Message-ID:

Hi Poonam,

On Thu, Aug 25, 2016 at 12:28 PM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
> Hello Vitaly,
>
> On 8/24/2016 3:55 PM, Vitaly Davidovich wrote:
>> On Wed, Aug 24, 2016 at 6:29 PM, Poonam Bajaj Parhar <poonam.bajaj at oracle.com> wrote:
>>> Also, do you see entries like "[G1Ergonomics (Mixed GCs) do not start
>>> mixed GCs, reason:" in the GC logs, which mean that the mixed GCs are
>>> not happening due to some reason? What is the reason listed with these
>>> log entries?
>>
>> Hi Poonam,
>>
>> Yes, I do see a few of those, but only very early in the process
>> lifetime, and nowhere near the Full GCs.
>> 2016-08-24T10:33:04.733+0000: 8.460: [SoftReference, 0 refs, 0.0010108 secs]2016-08-24T10:33:04.734+0000: 8.461: [WeakReference, 383 refs, 0.0006608 secs]2016-08-24T10:33:04.735+0000: 8.462: [FinalReference, 4533 refs, 0.0020491 secs]2016-08-24T10:33:04.737+0000: 8.464: [PhantomReference, 0 refs, 15 refs, 0.0011945 secs]2016-08-24T10:33:04.738+0000: 8.465: [JNI Weak Reference, 0.0000360 secs] 8.467: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start]
>> 2016-08-24T10:35:22.846+0000: 146.574: [SoftReference, 0 refs, 0.0011450 secs]2016-08-24T10:35:22.847+0000: 146.575: [WeakReference, 440 refs, 0.0006071 secs]2016-08-24T10:35:22.848+0000: 146.575: [FinalReference, 7100 refs, 0.0018074 secs]2016-08-24T10:35:22.850+0000: 146.577: [PhantomReference, 0 refs, 76 refs, 0.0013148 secs]2016-08-24T10:35:22.851+0000: 146.579: [JNI Weak Reference, 0.0000443 secs] 146.584: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start]
>> 2016-08-24T10:35:56.507+0000: 180.234: [SoftReference, 0 refs, 0.0010184 secs]2016-08-24T10:35:56.508+0000: 180.235: [WeakReference, 138 refs, 0.0006883 secs]2016-08-24T10:35:56.508+0000: 180.236: [FinalReference, 3682 refs, 0.0023152 secs]2016-08-24T10:35:56.511+0000: 180.238: [PhantomReference, 0 refs, 45 refs, 0.0012558 secs]2016-08-24T10:35:56.512+0000: 180.239: [JNI Weak Reference, 0.0000197 secs] 180.247: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: concurrent cycle is about to start]
>
> The above entries should be okay.
>
>> 2016-08-24T10:37:33.387+0000: 277.114: [SoftReference, 0 refs, 0.0010965 secs]2016-08-24T10:37:33.388+0000: 277.115: [WeakReference, 5 refs, 0.0006378 secs]2016-08-24T10:37:33.388+0000: 277.116: [FinalReference, 3440 refs, 0.0028640 secs]2016-08-24T10:37:33.391+0000: 277.119: [PhantomReference, 0 refs, 0 refs, 0.0011392 secs]2016-08-24T10:37:33.392+0000: 277.120: [JNI Weak Reference, 0.0000148 secs] 277.130: [G1Ergonomics (Mixed GCs) do not start mixed GCs, reason: candidate old regions not available]
>
> If these appear only during startup, I wouldn't worry about them either.
>
> Do you see mixed GCs happening later during the run? If yes, then it's
> just that the mixed GCs are not quite enough to keep pace with the
> allocations/promotions into the old regions.
>
> To increase the number of old regions included in the cset, you could
> try increasing the value of G1MixedGCLiveThresholdPercent.

So as I mentioned in my earlier email today, we tried using IHOP=55 (instead of 75). There are very long Object Copy and Finalize Marking times now, although the heap cleanup is pretty good. I didn't see any Full GCs with that setting, but the very long Full GC pauses are just replaced by extremely long Finalize Marking times (and fairly long Object Copy times).

Thanks

From poonam.bajaj at oracle.com Thu Aug 25 16:54:59 2016
From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar)
Date: Thu, 25 Aug 2016 09:54:59 -0700
Subject: Odd G1GC behavior on 8u91
In-Reply-To:
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <4e4fda11-ea57-b1cb-84ef-958032b51c3a@oracle.com>
Message-ID: <320c40d3-3cb0-9970-d7f6-ca29e7d9b75c@oracle.com>

Hello Vitaly,

On 8/25/2016 9:35 AM, Vitaly Davidovich wrote:
> Hi Poonam,
>
> On Thu, Aug 25, 2016 at 12:28 PM, Poonam Bajaj Parhar wrote:
>> Hello Vitaly,
>>
>> On 8/24/2016 3:55 PM, Vitaly Davidovich wrote:
>>> On Wed, Aug 24, 2016 at 6:29 PM, Poonam Bajaj Parhar wrote:
>>>> Also, do you see entries like "[G1Ergonomics (Mixed GCs) do
>>>> not start mixed GCs, reason:" in the GC logs, which mean that
>>>> the mixed GCs are not happening due to some reason? What is
>>>> the reason listed with these log entries?
>> >> Hi Poonam, >> >> Yes, I do see a few those, but only very early in the process >> lifetime, and nowhere near the Full GCs. >> >> 2016-08-24T10:33:04.733+0000: 8.460: [SoftReference, 0 refs, >> 0.0010108 secs]2016-08-24T10:33:04.734+0000: 8.461: >> [WeakReference, 383 refs, 0.0006608 >> secs]2016-08-24T10:33:04.735+0000: 8.462: [FinalReference, 4533 >> refs, 0.0020491 secs]2016-08-24T10:33:04.737+0000: 8.464: >> [PhantomReference, 0 refs, 15 refs, 0.0011945 >> secs]2016-08-24T10:33:04.738+0000: 8.465: [JNI Weak Reference, >> 0.0000360 secs] 8.467: [G1Ergonomics (Mixed GCs) do not start >> mixed GCs, reason: concurrent cycle is about to start] >> 2016-08-24T10:35:22.846+0000: 146.574: [SoftReference, 0 refs, >> 0.0011450 secs]2016-08-24T10:35:22.847+0000: 146.575: >> [WeakReference, 440 refs, 0.0006071 >> secs]2016-08-24T10:35:22.848+0000: 146.575: [FinalReference, 7100 >> refs, 0.0018074 secs]2016-08-24T10:35:22.850+0000: 146.577: >> [PhantomReference, 0 refs, 76 refs, 0.0013148 >> secs]2016-08-24T10:35:22.851+0000: 146.579: [JNI Weak Reference, >> 0.0000443 secs] 146.584: [G1Ergonomics (Mixed GCs) do not start >> mixed GCs, reason: concurrent cycle is about to start] >> 2016-08-24T10:35:56.507+0000: 180.234: [SoftReference, 0 refs, >> 0.0010184 secs]2016-08-24T10:35:56.508+0000: 180.235: >> [WeakReference, 138 refs, 0.0006883 >> secs]2016-08-24T10:35:56.508+0000: 180.236: [FinalReference, 3682 >> refs, 0.0023152 secs]2016-08-24T10:35:56.511+0000: 180.238: >> [PhantomReference, 0 refs, 45 refs, 0.0012558 >> secs]2016-08-24T10:35:56.512+0000: 180.239: [JNI Weak Reference, >> 0.0000197 secs] 180.247: [G1Ergonomics (Mixed GCs) do not start >> mixed GCs, reason: concurrent cycle is about to start] > > The above entries should be okay. 
> >> 2016-08-24T10:37:33.387+0000: 277.114: [SoftReference, 0 refs, >> 0.0010965 secs]2016-08-24T10:37:33.388+0000: 277.115: >> [WeakReference, 5 refs, 0.0006378 >> secs]2016-08-24T10:37:33.388+0000: 277.116: [FinalReference, 3440 >> refs, 0.0028640 secs]2016-08-24T10:37:33.391+0000: 277.119: >> [PhantomReference, 0 refs, 0 refs, 0.0011392 >> secs]2016-08-24T10:37:33.392+0000: 277.120: [JNI Weak Reference, >> 0.0000148 secs] 277.130: [G1Ergonomics (Mixed GCs) do not start >> mixed GCs, reason: candidate old regions not available] >> > If these appear only during the startup, I won't worry about these > too. > > Do you see mixed GCs happening later during the run? If yes, then > it's just that the mixed GCs are not quite enough to keep pace > with the allocations/promotions into the old regions. > > To increase the number of old regions included into the cset, you > could try increasing the value of /G1MixedGCLiveThresholdPercent. > / > > So as I mentioned in my earlier email today, we tried using IHOP=55 > (instead of 75). There are very long Object Copy and Finalize Marking > times now, although the heap cleanup is pretty good. I didn't see any > Full GCs with that setting, but the very long Full GC pauses are just > replaced by extremely long Finalize Marking times (and fairly long > Object Copy times). Yes, I was just reading that particular log that you had sent. The object copying times are very high. / [Object Copy (ms): Min: 22450.6, Avg: 22615.0, Max: 22768.2, Diff: 317.5, Sum: 520143.9]/ Could you try increasing the value of MaxTenuringThreshold or remove this option altogether. That would help in letting the objects die in the young regions itself and would avoid copying them to the old regions. Thanks, Poonam > > Thanks > > / > /Thanks, > Poonam > >> Does that tell you anything? 
>> >> >> Thanks, >> Poonam >> >> On 8/24/2016 3:18 PM, Jenny Zhang wrote: >>> More comments about the questions >>> >>> Thanks >>> Jenny >>> >>> On 8/24/2016 11:43 AM, Vitaly Davidovich wrote: >>>> Right before the Full GC, ergonomics report a failure to >>>> expand the heap due to an allocation request of 32 bytes. >>>> Is this implying that a mutator tried to allocate 32 bytes >>>> but couldn't? How do I reconcile that with Eden+Survivor >>>> occupancy reported right above that? >>> Yes, it means the mutator tries to allocate 32byte but can >>> not get it. Heap won't expand as it already reaches max heap. >>> >>> Do you see any humongous objects allocatoin? >>>> >>>> Young gen is sized to 30GB, total heap is 96GB. Allocation >>>> rate of the application is roughly 1GB/s. Am I correct in >>>> assuming that allocation is outpacing concurrent marking, >>>> based on the above? What tunable(s) would you advise to >>>> tweak to get G1 to keep up with the allocation rate? I'm ok >>>> taking some throughput hit to mitigate 90s+ pauses. >>>> >>> The entire log might give a better picture. Especially if >>> the marking cycle is triggered, how well the mixed gc cleans >>> up the heap. >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
From monica.beckwith at gmail.com Thu Aug 25 17:01:22 2016
From: monica.beckwith at gmail.com (monica beckwith)
Date: Fri, 26 Aug 2016 01:01:22 +0800
Subject: Odd G1GC behavior on 8u91

Hi Vitaly,

Would you be able to post all the GC events from 'initial-mark' leading up-to the 0.0B Eden event? (Please include the PrintAdaptiveSizePolicy information as well).

Regards,
Monica

From yu.zhang at oracle.com Thu Aug 25 17:17:41 2016
From: yu.zhang at oracle.com (yu.zhang at oracle.com)
Date: Thu, 25 Aug 2016 10:17:41 -0700
Subject: Odd G1GC behavior on 8u91
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <27125e40-05fc-3b0a-51ed-dcc12b78c420@oracle.com>

Vitaly,

Stefan.Karlsson points me to the right code. The current implementation is when evacuation failure happens, the region that failed to evacuate is turned into old region. I still think it is confusing.

In your logs, there is actually no young regions(all are converted to old). But we are trying to do a young gc. So 0 young regions in CSet.

We are still discussing this. But would like to give you an update. I will take a look at your 2nd log.

Thanks
Jenny

On 08/25/2016 04:15 AM, Vitaly Davidovich wrote:
>
> The message after 'to-space exhausted' might be confusing. I need
> to discuss this with dev team.
For example, at time stamp > 2016-08-24T15:28:05.905+0000: 17709.633: [GC pause (G1 Evacuation > Pause) (young) > ... > (to-space exhausted), 2.6149566 secs] > ... > > [Eden: 28.4G(28.4G)->0.0B(28.4G) Survivors: 1664.0M->1664.0M Heap: > 93.5G(96.0G)->73.9G(96.0G)] > > the eden used after gc might not be true. I will do some > investigation and get back to you. > > Thanks. Yeah it's confusing. I'm still not sure I understand the log > snippet I pasted in my initial email of the young evac immediately > preceding the Full GC - it showed 0 regions in the CSet, so nothing > was evacuated, but it also showed Eden occupancy of 0. It's then > unclear why the Full GC triggers immediately after due to a 32 byte > alloc request. > > Do you think that may be a bogus log as well? -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu Aug 25 17:19:22 2016 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 25 Aug 2016 13:19:22 -0400 Subject: Odd G1GC behavior on 8u91 In-Reply-To: <320c40d3-3cb0-9970-d7f6-ca29e7d9b75c@oracle.com> References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <4e4fda11-ea57-b1cb-84ef-958032b51c3a@oracle.com> <320c40d3-3cb0-9970-d7f6-ca29e7d9b75c@oracle.com> Message-ID: On Thu, Aug 25, 2016 at 12:54 PM, Poonam Bajaj Parhar < poonam.bajaj at oracle.com> wrote: > Hello Vitaly, > > On 8/25/2016 9:35 AM, Vitaly Davidovich wrote: > > Hi Poonam, > > On Thu, Aug 25, 2016 at 12:28 PM, Poonam Bajaj Parhar < > poonam.bajaj at oracle.com> wrote: > >> Hello Vitaly, >> >> On 8/24/2016 3:55 PM, Vitaly Davidovich wrote: >> >> >> >> On Wed, Aug 24, 2016 at 6:29 PM, Poonam Bajaj Parhar < >> poonam.bajaj at oracle.com> wrote: >> >>> Also, do you see entries like "*[G1Ergonomics (Mixed GCs) do not start >>> mixed GCs, reason:" *in the GC logs which mean that the mixed GCs are >>> not happening due to some reason. What is the reason listed with these log >>> entries? 
>>> >> Hi Poonam, >> >> Yes, I do see a few those, but only very early in the process lifetime, >> and nowhere near the Full GCs. >> >> 2016-08-24T10:33:04.733+0000: 8.460: [SoftReference, 0 refs, 0.0010108 >> secs]2016-08-24T10:33:04.734+0000: 8.461: [WeakReference, 383 refs, >> 0.0006608 secs]2016-08-24T10:33:04.735+0000: 8.462: [FinalReference, >> 4533 refs, 0.0020491 secs]2016-08-24T10:33:04.737+0000: 8.464: >> [PhantomReference, 0 refs, 15 refs, 0.0011945 secs]2016-08-24T10:33:04.738+0000: >> 8.465: [JNI Weak Reference, 0.0000360 secs] 8.467: [G1Ergonomics (Mixed >> GCs) do not start mixed GCs, reason: concurrent cycle is about to start] >> 2016-08-24T10:35:22.846+0000: 146.574: [SoftReference, 0 refs, 0.0011450 >> secs]2016-08-24T10:35:22.847+0000: 146.575: [WeakReference, 440 refs, >> 0.0006071 secs]2016-08-24T10:35:22.848+0000: 146.575: [FinalReference, >> 7100 refs, 0.0018074 secs]2016-08-24T10:35:22.850+0000: 146.577: >> [PhantomReference, 0 refs, 76 refs, 0.0013148 secs]2016-08-24T10:35:22.851+0000: >> 146.579: [JNI Weak Reference, 0.0000443 secs] 146.584: [G1Ergonomics (Mixed >> GCs) do not start mixed GCs, reason: concurrent cycle is about to start] >> 2016-08-24T10:35:56.507+0000: 180.234: [SoftReference, 0 refs, 0.0010184 >> secs]2016-08-24T10:35:56.508+0000: 180.235: [WeakReference, 138 refs, >> 0.0006883 secs]2016-08-24T10:35:56.508+0000: 180.236: [FinalReference, >> 3682 refs, 0.0023152 secs]2016-08-24T10:35:56.511+0000: 180.238: >> [PhantomReference, 0 refs, 45 refs, 0.0012558 secs]2016-08-24T10:35:56.512+0000: >> 180.239: [JNI Weak Reference, 0.0000197 secs] 180.247: [G1Ergonomics (Mixed >> GCs) do not start mixed GCs, reason: concurrent cycle is about to start] >> >> >> The above entries should be okay. 
>> >> 2016-08-24T10:37:33.387+0000: 277.114: [SoftReference, 0 refs, 0.0010965 >> secs]2016-08-24T10:37:33.388+0000: 277.115: [WeakReference, 5 refs, >> 0.0006378 secs]2016-08-24T10:37:33.388+0000: 277.116: [FinalReference, >> 3440 refs, 0.0028640 secs]2016-08-24T10:37:33.391+0000: 277.119: >> [PhantomReference, 0 refs, 0 refs, 0.0011392 secs]2016-08-24T10:37:33.392+0000: >> 277.120: [JNI Weak Reference, 0.0000148 secs] 277.130: [G1Ergonomics (Mixed >> GCs) do not start mixed GCs, reason: candidate old regions not available] >> >> If these appear only during the startup, I won't worry about these too. >> >> Do you see mixed GCs happening later during the run? If yes, then it's >> just that the mixed GCs are not quite enough to keep pace with the >> allocations/promotions into the old regions. >> >> To increase the number of old regions included into the cset, you could >> try increasing the value of >> *G1MixedGCLiveThresholdPercent. * >> > So as I mentioned in my earlier email today, we tried using IHOP=55 > (instead of 75). There are very long Object Copy and Finalize Marking > times now, although the heap cleanup is pretty good. I didn't see any Full > GCs with that setting, but the very long Full GC pauses are just replaced > by extremely long Finalize Marking times (and fairly long Object Copy > times). > > > Yes, I was just reading that particular log that you had sent. The object > copying times are very high. > > * [Object Copy (ms): Min: 22450.6, Avg: 22615.0, Max: 22768.2, Diff: > 317.5, Sum: 520143.9]* > > Could you try increasing the value of MaxTenuringThreshold or remove this > option altogether. That would help in letting the objects die in the young > regions itself and would avoid copying them to the old regions. > Yes, will try that -- the rationale makes sense. I'm not entirely sure if the survivors would actually die off or not, which would depend on their lifetime and how it relates to GC frequency, but it's worth a shot. 
Thanks > > Thanks, > Poonam > > > > Thanks > >> >> Thanks, >> Poonam >> >> Does that tell you anything? >> >> >>> >>> Thanks, >>> Poonam >>> >>> On 8/24/2016 3:18 PM, Jenny Zhang wrote: >>> >>> More comments about the questions >>> >>> Thanks >>> Jenny >>> >>> On 8/24/2016 11:43 AM, Vitaly Davidovich wrote: >>> >>> Right before the Full GC, ergonomics report a failure to expand the heap >>> due to an allocation request of 32 bytes. Is this implying that a mutator >>> tried to allocate 32 bytes but couldn't? How do I reconcile that with >>> Eden+Survivor occupancy reported right above that? >>> >>> Yes, it means the mutator tries to allocate 32byte but can not get it. >>> Heap won't expand as it already reaches max heap. >>> >>> Do you see any humongous objects allocatoin? >>> >>> >>> Young gen is sized to 30GB, total heap is 96GB. Allocation rate of the >>> application is roughly 1GB/s. Am I correct in assuming that allocation is >>> outpacing concurrent marking, based on the above? What tunable(s) would you >>> advise to tweak to get G1 to keep up with the allocation rate? I'm ok >>> taking some throughput hit to mitigate 90s+ pauses. >>> >>> The entire log might give a better picture. Especially if the marking >>> cycle is triggered, how well the mixed gc cleans up the heap. >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
From vitalyd at gmail.com Thu Aug 25 17:23:32 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 25 Aug 2016 13:23:32 -0400
Subject: Odd G1GC behavior on 8u91
References: <22ab1c99-07b0-b481-7c31-8a529a41b992@oracle.com> <4e4fda11-ea57-b1cb-84ef-958032b51c3a@oracle.com> <320c40d3-3cb0-9970-d7f6-ca29e7d9b75c@oracle.com>

Forgot to mention that besides the long Object Copy time, the extremely long Finalize Marking duration is also quite puzzling. It may be related to https://bugs.openjdk.java.net/browse/JDK-8057003, but I'm not sure yet.

From thomas.schatzl at oracle.com Mon Aug 29 08:34:18 2016
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 29 Aug 2016 10:34:18 +0200
Subject: Odd G1GC behavior on 8u91
References: <0537de2b-4554-5f26-8762-6704aad395ca@oracle.com>
Message-ID: <1472459658.4623.32.camel@oracle.com>

Hi Vitaly,

  just some random comments, trying to answer the questions in a single thread:

On Wed, 2016-08-24 at 14:43 -0400, Vitaly Davidovich wrote:
> Hi guys,
>
> Hoping someone could shed some light on G1 behavior (as seen from the
> gc log) that I'm having a hard time understanding. The root problem
> is G1 enters a Full GC that takes many tens of seconds, and need some
> advice on what could be causing it.
>
> [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)]
> [Times: user=0.08 sys=0.00, real=0.01 secs]

As mentioned by Jenny, this odd looking log line is because a preceding evacuation failure used up all space. I.e. evacuation failure turns regions that contain objects that could not be copied into old gen regions.

Since after these evacuation failures the heap is full, any allocation in eden fails because there is not enough space, although it intended to use 30G - as you told it to in your options.

The log output is indeed confusing, and actually this (very short) GC is superfluous.
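
As a rough illustration of the pattern described above (a young pause that starts with an empty eden while the heap is already almost full), the following Java sketch checks a single JDK 8 G1 heap summary line. The class name, the method names and the 0.95 fullness ratio are assumptions made for this example; they are not part of HotSpot or of any tool mentioned in this thread, and the regular expression only covers the summary format quoted here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HotSpot: flags G1 young pauses that begin
// with an empty eden while the heap is already close to its capacity.
final class HeapSummaryCheck {

    // Captures eden-before, heap-before and heap capacity from a JDK 8 line such as
    // [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)]
    private static final Pattern SUMMARY = Pattern.compile(
            "\\[Eden: ([0-9.]+)([BKMG])\\(.*?Heap: ([0-9.]+)([BKMG])\\(([0-9.]+)([BKMG])\\)->");

    // Converts a value with a B/K/M/G suffix into bytes.
    static double toBytes(String value, String unit) {
        double v = Double.parseDouble(value);
        switch (unit) {
            case "K": return v * 1024.0;
            case "M": return v * 1024.0 * 1024.0;
            case "G": return v * 1024.0 * 1024.0 * 1024.0;
            default:  return v; // plain bytes ("B")
        }
    }

    // True when eden was empty before the pause and the heap occupancy before
    // the pause was at least fullRatio of the heap capacity.
    // fullRatio is an assumed threshold chosen by the caller.
    static boolean isEmptyYoungPauseOnFullHeap(String line, double fullRatio) {
        Matcher m = SUMMARY.matcher(line);
        if (!m.find()) {
            return false; // not a heap summary line in the expected format
        }
        double edenBefore = toBytes(m.group(1), m.group(2));
        double heapBefore = toBytes(m.group(3), m.group(4));
        double heapCapacity = toBytes(m.group(5), m.group(6));
        return edenBefore == 0.0 && heapBefore >= fullRatio * heapCapacity;
    }

    public static void main(String[] args) {
        // The summary line from the pause that immediately preceded the Full GC.
        String line = "[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B "
                + "Heap: 95.2G(96.0G)->95.2G(96.0G)]";
        System.out.println(isEmptyYoungPauseOnFullHeap(line, 0.95));
    }
}
```

Running such a check over every pause in a log would only point at candidate lines; whether a flagged pause really follows an evacuation failure still has to be confirmed from the surrounding "to-space exhausted" entries.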

On Wed, 2016-08-24 at 18:36 -0400, Vitaly Davidovich wrote:
> Hi Jenny,
>
> Very happy that you and Charlie got wind of this thread -- could use
> your expertise :). I will email you the log directly (it's a bit
> verbose with all the safepoint + gc logging) as I believe the mailing
> list software will strip it. To answer/comment on your email ...
>
> I believe the fixing of young gen size (and turning off adaptive
> sizing) was done intentionally. The developers reported that letting
> G1 manage this ergonomically caused problems, although that may be
> because the max pause time goal is too aggressive (300ms for such a
> large heap). This is something we're also looking at revisiting, but
> trying to get a handle on the other issues first.

Please do. Note that specifying min/max young overrides the use of the pause time goal (to size young gen).

> As for humongous objects, I don't see any trace of them in the log.
> We actually saw some other poor G1 behavior with some older GC
> settings whereby the "Finalize Marking" phase was taking hundreds of
> seconds (same total heap size, but with a 15GB young), and those gc
> logs did indicate very humongous object allocations. I can certainly
> try sharing that log with you as well, but I think that's likely a
> different issue (it's possible it's related to the G1 worker threads
> marking through large arrays fully, but I'm not sure).

Sounds like a combination of JDK-8057003 and JDK-8159422.

Thanks,
  Thomas

From jun.zhuang at hobsons.com Mon Aug 29 13:29:08 2016
From: jun.zhuang at hobsons.com (Jun Zhuang)
Date: Mon, 29 Aug 2016 13:29:08 +0000
Subject: Java string literal pool

Hi,

I was reading about the Java8 Metaspace the other day, then got into the topic of string pool. I got conflicting information regarding a couple of things, I wonder if I can get a definitive answer from the experts?

#1: Is the string literal/constant pool a heap area that holds actual string literal objects or a place that holds references to string literal objects?

Reference: http://stackoverflow.com/questions/11700320/is-string-literal-pool-a-collection-of-references-to-the-string-object-or-a-col

#2: Is the following statement correct?

String str = new String("Cat");

In above statement, either 1 or 2 string will be created. If there is already a string literal "Cat" in the pool, then only one string "str" will be created in the pool. If there is no string literal "Cat" in the pool, then it will be first created in the pool and then in the heap space, so total 2 string objects will be created.

From http://www.journaldev.com/797/what-is-java-string-pool#comment-36152

Appreciate your answers,
Jun

Jun Zhuang
Sr. Performance QA Engineer | Hobsons
T: +1 513 746 2288 | jun.zhuang at hobsons.com
50 E-Business Way, Suite 300 | Cincinnati, OH 45241 | USA

From rajasekhar.velamuri at gmail.com Sat Aug 27 06:44:06 2016
From: rajasekhar.velamuri at gmail.com (Rajasekhar Srinivasa Seshasaye Velamuri)
Date: Sat, 27 Aug 2016 12:14:06 +0530
Subject: Regarding garbage collector classes !

Hi

Are there any classes for garbage collector? I mean it would be great if you could specify them (please do not get me wrong!).

Best regards
Velamuri Rajasekhar Seshasaye

From ecki at zusammenkunft.net Mon Aug 29 18:18:13 2016
From: ecki at zusammenkunft.net (Bernd Eckenfels)
Date: Mon, 29 Aug 2016 20:18:13 +0200
Subject: Regarding garbage collector classes !
In-Reply-To:
References:
Message-ID: <20160829201813.00007fc4.ecki@zusammenkunft.net>

Hello,

not sure I understand your question correctly, but generally speaking GC
is implemented by the Java Runtime in native code. So there are no Java
classes responsible for this, and consequently you cannot specify them.

There are some Java classes (JMX Beans) specific to the various GCs for
monitoring, and some classes from the Java Class Library are related to
GC (like Weak/Soft/PhantomReferences and the Finalizer).

Regards
Bernd

On Sat, 27 Aug 2016 12:14:06 +0530,
Rajasekhar Srinivasa Seshasaye Velamuri wrote:

> Hi
>
> Are there any classes for garbage collector ? I mean it would be
> great if you could specify them (please do not get me wrong!).
>
> Best regards
> Velamuri Rajasekhar Seshasaye

From rednaxelafx at gmail.com Mon Aug 29 23:28:19 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Mon, 29 Aug 2016 16:28:19 -0700
Subject: Java string literal pool
In-Reply-To:
References:
Message-ID:

Hi Jun,

Comments below inline:

On Mon, Aug 29, 2016 at 6:29 AM, Jun Zhuang wrote:

> Hi,
>
> I was reading about the Java8 Metaspace the other day, then got into the
> topic of string pool. I got conflicting information regarding a couple of
> things, I wonder if I can get a definitive answer from the experts?
>
> #1: Is the string literal/constant pool a heap area that holds actual
> string literal objects or a place that holds references to string literal
> objects?
>
> Reference:
> http://stackoverflow.com/questions/11700320/is-string-literal-pool-a-collection-of-references-to-the-string-object-or-a-col

What you're referring to is called the "StringTable" in HotSpot JVM [1].
You can think of it as being semantically equivalent to a HashMap in
Java, which holds references to the java.lang.String objects being
interned, instead of storing the actual string contents.

> #2: Is the following statement correct?
>
> String str = new String("Cat");
>
> In above statement, either 1 or 2 string will be created. If there is
> already a string literal "Cat" in the pool, then only one string "str"
> will be created in the pool. If there is no string literal "Cat" in the
> pool, then it will be first created in the pool and then in the heap
> space, so total 2 string objects will be created.
>
> From http://www.journaldev.com/797/what-is-java-string-pool#comment-36152

For this statement, yes, either 1 or 2 java.lang.String instances are
going to be created. But I wouldn't frame it that way, since it's
potentially confusing.

The correct way to frame this is to make a clear distinction between
one-time actions and normal runtime actions.

The "Cat" expression is a reference to a compile-time constant of type
java.lang.String. At runtime, there will be a one-time resolution for
such a reference. That resolution probes the StringTable to see if a
matching String instance is already referenced by the StringTable. If
so, it returns that reference; otherwise it creates a java.lang.String
instance from the Symbol representing "Cat", interns the reference into
the StringTable, and returns that reference.

The "new String(...)" expression, on the other hand, is a "new"
expression. Semantically it should always create a new String instance
every time.

- Kris

[1]: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/312e113bc3ed/src/share/vm/classfile/symbolTable.hpp#l255

From jun.zhuang at hobsons.com Tue Aug 30 13:38:13 2016
From: jun.zhuang at hobsons.com (Jun Zhuang)
Date: Tue, 30 Aug 2016 13:38:13 +0000
Subject: Java string literal pool
In-Reply-To:
References:
Message-ID:

Hi Kris,

Thanks for the response. Regarding the 2nd question, I want to make sure
I understand your answer. The handling of new String("Cat") can be
broken into two steps:

Step #1: Handling the "Cat" expression

    If ( Does not already exist )
        Create an instance on the heap
        Add a reference to the StringTable for the above instance and
        return the reference
    else
        Return the reference from the StringTable

Question: I assume the reference returned is consumed by the String()
constructor, is it using the existing object to create a new one?

Step #2: Handling the new String()

    Action: Create an instance and return reference

Either way, TWO instances of String objects representing "Cat" will end
up on the heap, right?

If I have another statement String str2 = new String("Cat"); a third
such instance will be created?

Thanks much,
Jun

From bkgood at gmail.com Tue Aug 30 09:40:48 2016
From: bkgood at gmail.com (William Good)
Date: Tue, 30 Aug 2016 11:40:48 +0200
Subject: High termination times pre-concurrent cycle in G1
Message-ID:

I've been experiencing an issue in a production application using G1
for quite some time over a handful of 1.8.0 builds. The application is
relatively simple: it spends about 60s reading some parameters from
files on disk, and then starts serving web requests which merge some
input with those parameters, perform some computation and return a
result. We're aiming to keep max total request time (as seen by remote
hosts) below 100 ms but from previous experience with parnew and cms
(and g1 on previous projects, for that matter), I didn't anticipate
this being a problem.

The symptoms are an ever-increasing time spent in evacuation pauses,
and high parallel worker termination times stick out. With the
recommended set of G1 settings (max heap size and pause time target),
they increase sharply until I start seeing 500ms+ pause times and have
to kill the JVM.

I found some time ago that first forcing a bunch of full GCs with
System.gc() at the phase (load -> serve) change and then forcing
frequent concurrent cycles with -XX:InitiatingHeapOccupancyPercent=1
seems to mitigate the problem. I'd prefer to have to do neither, as
the former makes redeployments very slow and the latter adds a couple
of neighboring 40ms pauses for remark and cleanup that aren't good for
request time targets.

I'm attaching a log file that details a short run, with the phase
change at about 60s from start. After a few evacuation pauses, one
lasts 160ms with nearly 100-120ms spent in parallel workers' 'termination'.
After this, a concurrent cycle runs and everything goes back to normal.
java params are at the top of the file.

Generally this happens over a much longer period of time (and
especially if I haven't given the low
-XX:InitiatingHeapOccupancyPercent value) and over many different
builds of 1.8.0. This was b101. It's running alone on a fairly hefty
dual-socket Xeon box with 128GB of RAM on CentOS 7.

I'd be more than happy to hear any ideas on what's going on here and
how it could be fixed.

Best,
William
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ooc_gc.log.gz
Type: application/x-gzip
Size: 79123 bytes
Desc: not available
URL:

From vitalyd at gmail.com Tue Aug 30 23:08:25 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 30 Aug 2016 19:08:25 -0400
Subject: High termination times pre-concurrent cycle in G1
In-Reply-To:
References:
Message-ID:

William,

Have you tried running with a lower number (than the current 18) of
parallel workers?

From rednaxelafx at gmail.com Wed Aug 31 00:14:57 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 30 Aug 2016 17:14:57 -0700
Subject: Java string literal pool
In-Reply-To:
References:
Message-ID:

Comments inline below:

On Tue, Aug 30, 2016 at 6:38 AM, Jun Zhuang wrote:

> Hi Kris,
>
> Thanks for the response. Regarding the 2nd question, I want to make sure
> I understand your answer. The handling of new String("Cat") can be
> broken into two steps:
>
> Step #1: Handling the "Cat" expression
>
>     If ( Does not already exist )
>         Create an instance on the heap
>         Add a reference to the StringTable for the above instance and
>         return the reference
>     else
>         Return the reference from the StringTable

Yes, that is correct.

> Question: I assume the reference returned is consumed by the String()
> constructor, is it using the existing object to create a new one?

In this specific example, yes, the reference is directly passed to the
constructor call as the argument. It's the same as passing any reference
type argument to any method.

> Step #2: Handling the new String()
>
>     Action: Create an instance and return reference

Yes.

> Either way, TWO instance of String object representing Cat will end up
> on the heap, right?

That is correct. The one created by the "new" expression is required to
have a different identity (meaning being a different object) than the
one from the "Cat" constant expression.
Thus:
  "Cat" == "Cat"
will always be true, and
  "Cat" == new String("Cat")
will always be false.

> If I have another statement String str2 = new String("Cat"); a third
> such instance will be created?

That is correct. The "Cat" part will resolve to a reference to the same
object as any earlier mentions of the interned "Cat" String instance.

- Kris

> Thanks much,
>
> Jun
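
The identity rules described in the reply above can be checked with a
short Java program. This is only a minimal sketch and not part of the
original thread: the class name StringIdentityDemo and its method names
are made up for illustration, and it relies solely on the == operator
plus the standard String.equals() and String.intern() methods.

```java
public class StringIdentityDemo {

    // Two occurrences of the literal "Cat" resolve to the same interned instance.
    static boolean literalsShareIdentity() {
        String a = "Cat";
        String b = "Cat";
        return a == b;
    }

    // new String("Cat") is a different object from the literal, with equal contents.
    static boolean newStringIsDistinctFromLiteral() {
        String literal = "Cat";
        String copy = new String("Cat");
        return literal != copy && literal.equals(copy);
    }

    // Every evaluation of a "new" expression creates another instance.
    static boolean eachNewStringIsDistinct() {
        String first = new String("Cat");
        String second = new String("Cat");
        return first != second;
    }

    // intern() maps any equal string back to the instance referenced by the pool.
    static boolean internReturnsPooledInstance() {
        String copy = new String("Cat");
        return copy.intern() == "Cat";
    }

    public static void main(String[] args) {
        System.out.println("literal == literal:          " + literalsShareIdentity());
        System.out.println("literal != new String:       " + newStringIsDistinctFromLiteral());
        System.out.println("new String != new String:    " + eachNewStringIsDistinct());
        System.out.println("new String.intern() == lit.: " + internReturnsPooledInstance());
    }
}
```

Each method returns true, which matches the counts discussed in the
thread: the literal contributes one interned instance, created once when
it is first resolved, and every "new String(...)" adds one more.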

From yu.zhang at oracle.com Wed Aug 31 06:18:54 2016
From: yu.zhang at oracle.com (yu.zhang at oracle.com)
Date: Tue, 30 Aug 2016 23:18:54 -0700
Subject: High termination times pre-concurrent cycle in G1
In-Reply-To:
References:
Message-ID: <3227fd9b-3f11-3e78-9768-1545b4153219@oracle.com>

It seems that after marking (clean up), the termination time drops.
Maybe that is why you need a very low IHOP, so that you get more
marking cycles.

The work distribution seems fine, but system time is high. Maybe there
is some lock contention.

I would agree with trying a lower number of GC threads,
-XX:ParallelGCThreads=

Jenny

On 08/30/2016 04:08 PM, Vitaly Davidovich wrote:
> William,
>
> Have you tried running with a lower number (than the current 18) of
> parallel workers?