Discrepancy in ParNew pause times between 1.6 and 1.7 CMS
Alexey Ragozin
alexey.ragozin at gmail.com
Mon Nov 26 21:58:43 PST 2012
Hi Brian,
I have found similar regression testing young GC performance.
After investigation I have come to conclusion that culprit is
-XX:+BlockOffsetArrayUseUnallocatedBlock VM option (requires
-XX:+UnlockDiagnosticVMOptions).
This option is true is 1.6 and false in 1.7. Commit comment means some
concurrency issues with that flag.
But you should not blame JVM for discrepancy in your test. Problem is
withing your test which is using only small fraction of 50GiB.
BlockOffsetArrayUseUnallocatedBlock allows JVM exclude not-yet-used
memory from collection. So in 1.6 your benchmark effectively measures
heapsize of 1-4 GiB.
IMHO you probably adjust your test to use more realistic memory
pattern (I expect 1.7 be a bit slower, though, at least that was an
outcome from my testing).
If you interested in minimizing young GC time at a look at
http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html
Regards,
Alexey
> We're in the process of doing some performance testing with 1.7, and we're seeing much longer ParNew times in 1.7 than we saw in 1.6. We could use some help isolating the cause.
>
> Running an identical performance test of an hour's length, with 1.6u37 we're seeing only 0.009% of pauses over 100ms and just 0.09% over 10ms, but with 1.7u9, we're seeing 64% of pauses over 100ms.
>
> Here is typical comparison of a ParNew between the two. It was run with -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintCommandLineFlags -XX:PrintFLSStatistics=1 -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime.
>
> 1.6u37
>
> 2012-11-25T16:51:13.145-0600: 1498.096: [GC Before GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2132560492
> Max Chunk Size: 2132560492
> Number of Blocks: 1
> Av. Block Size: 2132560492
> Tree Height: 1
> Before GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 512
> Max Chunk Size: 512
> Number of Blocks: 1
> Av. Block Size: 512
> Tree Height: 1
> 1498.096: [ParNew
> Desired survivor size 107347968 bytes, new threshold 4 (max 4)
> - age 1: 614064 bytes, 614064 total
> - age 2: 39344 bytes, 653408 total
> - age 3: 322432 bytes, 975840 total
> - age 4: 1208 bytes, 977048 total
> : 1679253K->1224K(1887488K), 0.0015070 secs] 1792056K->114027K(52219136K)After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2132560412
> Max Chunk Size: 2132560412
> Number of Blocks: 1
> Av. Block Size: 2132560412
> Tree Height: 1
> After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 512
> Max Chunk Size: 512
> Number of Blocks: 1
> Av. Block Size: 512
> Tree Height: 1
> , 0.0016110 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0022330 seconds
>
> 1.7u9:
>
> 2012-11-25T19:10:40.869-0600: 1498.458: [GC Before GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2136954070
> Max Chunk Size: 2136954070
> Number of Blocks: 1
> Av. Block Size: 2136954070
> Tree Height: 1
> Before GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 0
> Max Chunk Size: 0
> Number of Blocks: 0
> Tree Height: 0
> 1498.458: [ParNew
> Desired survivor size 107347968 bytes, new threshold 6 (max 6)
> - age 1: 1122960 bytes, 1122960 total
> - age 2: 1560 bytes, 1124520 total
> - age 3: 2232 bytes, 1126752 total
> - age 4: 324760 bytes, 1451512 total
> - age 5: 2232 bytes, 1453744 total
> - age 6: 728 bytes, 1454472 total
> : 1680121K->2393K(1887488K), 0.1132030 secs] 1759235K->81509K(52219136K)After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 2136954070
> Max Chunk Size: 2136954070
> Number of Blocks: 1
> Av. Block Size: 2136954070
> Tree Height: 1
> After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 0
> Max Chunk Size: 0
> Number of Blocks: 0
> Tree Height: 0
> , 0.1133210 secs] [Times: user=1.45 sys=0.00, real=0.11 secs]
> Total time for which application threads were stopped: 0.1142450 seconds
>
> Notice the 2ms application stop time vs 114ms. We can point you all at full GC logs
>
> For what it's worth we repeated this test on 1.7 with our standard set of 1.6 arguments that we've had success with in the past:
>
> -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled -XX:+CMSScavengeBeforeRemark -XX:RefDiscoveryPolicy=1 -XX:ParallelCMSThreads=3 -XX:CMSMaxAbortablePrecleanTime=3600000 -XX:CMSInitiatingOccupancyFraction=83 -XX:+UseParNewGC
>
> as well as these minimal CMS args:
>
> -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>
> The results were more or less the same.
>
> Are there some additional JVM args that we can enable to shed some light on the big discrepancy in pause times?
>
> Thanks.
More information about the hotspot-gc-use
mailing list