Seeking answer to a GC pattern

Wed Nov 25 18:58:07 UTC 2015

On 11/24/15 01:21 PM, Jun Zhuang wrote:
> Hi Srinivas,
>
> Appreciate your input. Following are answers to your questions. I’ll try your other advices.
>
> -How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up?
>
> 4 CPUs.
>
> Here is the time for one of the young GCs with +AlwaysTenure: Times: user=1.45 sys=0.00, real=0.40 secs. The sys time is always close to 0, user time is more than 3x real time and increases with real time accordingly.
>
> -Does the amount that is promoted stay constant?
>
> With +AlwaysTenure, looks like the promoted amount was fairly constant @ about 125K. Following shows the first 3 young GCs right after a full collection and last 3 right before the next one.
>
> 328706.505: [GC [PSYoungGen: 129018K->0K(130048K)] 286309K->160838K(2096896K), 0.0152110 secs] [Times: user=0.04 sys=0.01, real=0.01 secs]
>
> 328711.687: [GC [PSYoungGen: 129024K->0K(130048K)] 289862K->165092K(2096896K), 0.0199390 secs] [Times: user=0.06 sys=0.00, real=0.02 secs]
>
> 328716.875: [GC [PSYoungGen: 129024K->0K(130048K)] 294116K->168626K(2096896K), 0.0247520 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
>
> …
>
> 331103.140: [GC [PSYoungGen: 129024K->0K(130048K)] 2082788K->1957116K(2096896K), 0.2220360 secs] [Times: user=0.78 sys=0.00, real=0.23 secs]
>
> 331108.118: [GC [PSYoungGen: 129024K->0K(130048K)] 2086140K->1960268K(2096896K), 0.2170640 secs] [Times: user=0.79 sys=0.01, real=0.22 secs]
>
> 331113.074: [GC [PSYoungGen: 129024K->0K(130048K)] 2089292K->1963948K(2096896K), 0.2132430 secs] [Times: user=0.79 sys=0.00, real=0.21 secs]
>
> -Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure?
>
> Yes. It always happens with or without AlwaysTenure.
>
> -(Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?)
>
> For one of my tests with 128m young gen and +AlwaysTenure, the average # of young GCs before a full collection was a little over 500.
>
> Sincerely,
>
> Jun

Looking at the increase in your heap size after each young generation collection seems to show that you are promoting 3MB~4MB at each young generation collection.  E.g., 165092K - 160838K = 4254K.  With your 1920MB old generation that would let you have 500 young generation collections between full collections, as you say.  If you were promoting only 125KB at each young generation collection your 1920MB old generation could absorb promotions from 15000 young generation collections.

What is confusing is that times for the young generation collections increases proportionally with the size of the old generation.  Your sawtooth pattern.  Usually the time for a young generation collection is proportional to the amount of space that is promoted, which seems to be constant in your case.  That implies some cost proportional to the size of the old generation: but what?  It does not seem to take you longer to allocate through the space in your young generation when the old generation is empty than when it is full (~5 seconds) so it does not seem like you are doing more work when the old generation is full: e.g., dirtying cards for data that has piled up in the old generation, which would cause more work for the collector.

With a 10x increase in time like that, one would think it would be easy to identify with a profiler, or detailed timers for phases, if there are any in the code.

To Ramki: The parallelization seems to hold at somewhat over 3.

			... peter

> *From:*Srinivas Ramakrishna [mailto:ysr1729 at gmail.com]
> *Sent:* Tuesday, November 24, 2015 3:14 PM
> *To:* Grzegorz Molenda <molendag at gmail.com>
> *Cc:* Jun Zhuang <jun.zhuang at hobsons.com>; hotspot-gc-use at openjdk.java.net
> *Subject:* Re: Seeking answer to a GC pattern
>
> What Grzegorz & Thomas said.
>
> Also you might take a heap dump before and after a full gc (-XX:+PrintClassHistogramBefore/AfterFullGC) to see the types that are reclaimed in the old gen. Might give you an idea as to the types of objects that got promoted and then later died, and hence whether avoidable nepotism is or is not a factor (and thence what you might null out to reduce such nepotism).
>
> Also, I guess what I meant was MTT=1. However, given that going from MTT=10 to MTT=2 didn't make any appreciable difference, MTT=2 to MTT=1 will not either.
>
> You might also consider increasing yr young gen size, but that will likely also increase your pause times since objects tend to either die quickly or survive for a long time, and increasing the young gen size will still not age objects sufficiently to cause an increase in mortality.
>
> How many CPU's (and GC threads) do you have? Does the ratio of "real" to "usr+sys" increase as "real" ramps up? Does the amount that is promoted stay constant? That might imply that something is interfering with parallelization of copying. Typically that means that there is a long skinny data structure, such as a singly linked list that is being copied, although why that list would become longer (in terms of longer times) is not clear. Does the sawtooth of minor gc times happen even with MTT=1 or AlwaysTenure? (Hint: How many young collections do you see between the major collections when you see the sawtooth in young collection times? How does it compare with the highest age of object that is kept in the survivor spaces?)
>
> -- ramki
>
> On Tue, Nov 24, 2015 at 9:17 AM, Grzegorz Molenda <molendag at gmail.com <mailto:molendag at gmail.com>> wrote:
>
>     Just a few tips:
>
>     Check OS stats for paging / swapping activity at both VM'and hypervisor levels.
>
>     Make sure the OS doesn't use transparent huge pages.
>
>     If the above two don't help, try enabling -XX:+PrintGCTaskTimeStamps to diagnose, which part of GC collecion takes the most time. Note values aren't reported in time units, but in ticks. Subtract one from the other reported per task . Compare between tasks per signle collection and check stats from a few collections in row, to get the idea, where it does degradate.
>
>
>     Thanks,
>
>
>     Grzegorz
>
>     2015-11-20 20:46 GMT+01:00 Jun Zhuang <jun.zhuang at hobsons.com <mailto:jun.zhuang at hobsons.com>>:
>
>         Hi Srinivas,
>
>         Thanks for your suggestion. I ran test with following parameters:
>
>         -server -XX:+UseCompressedOops -XX:+TieredCompilation -XX:ReservedCodeCacheSize=64m -XX:+UseCodeCacheFlushing -XX:+PrintTenuringDistribution -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:SurvivorRatio=6 -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:MaxTenuringThreshold=2
>
>         But the -XX:MaxTenuringThreshold=2 setting does not seem to help anything. I am still seeing similar GC pattern as with the +AlwaysTenure, actually the young GC time is higher with MTT=2 (getting to 0.5 secs vs. 0.25 with AlwaysTenure).
>
>         Unless anyone else can provide another theory, I am convinced that nepotism is at play here. Changing the java startup parameters can only get me so far, dev will have to look at the code and see what can be done on the code level.
>
>         Thanks,
>
>         Jun
>
>         *From:*Srinivas Ramakrishna [mailto:ysr1729 at gmail.com <mailto:ysr1729 at gmail.com>]
>         *Sent:* Thursday, November 19, 2015 8:09 PM
>         *To:* Jun Zhuang <jun.zhuang at hobsons.com <mailto:jun.zhuang at hobsons.com>>
>         *Cc:* hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>         *Subject:* Re: Seeking answer to a GC pattern
>
>         Use -XX:MaxTenuringThreshold=2 and you might see better behavior that +AlwaysTenure (which is almost always a very bad choice). That will at least reduce some of the nepotism issues from +AlwaysTenure that Thomas mentions. MTT > 2 is unlikely to help at your current frequency of minor collections since the mortality after age 1 is fairly low (from your tenuring distribution). Worth a quick test.
>
>         -- ramki
>
>         _______________________________________________
>         hotspot-gc-use mailing list
>         hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>         http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>