Promotion failures: indication of CMS fragmentation?
Chi Ho Kwok
chkwok at digibites.nl
Tue Feb 21 10:16:47 PST 2012
Hi Taras,
I think you may want to look into sizing the new and especially the
survivor spaces differently. We run something similar to what you
described, high-volume request processing with large dataset loading, and
what we saw early on was that the survivor spaces were completely
overloaded, causing premature promotions.
We've configured our VM with the following goals/guidelines:
- old space is for semi-permanent data, living for at least 30s, on
average ~10 minutes
- new space contains only temporary and just-loaded data
- surviving objects from new should never reach old in a single GC, so the
survivor spaces must never be 100% full
With jstat -gcutil `pidof java` 2000, we see things like:
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 70.20   0.00  19.65  57.60  59.90  124808  29474.299  2498  191.110  29665.409
 70.20   0.00  92.89  57.60  59.90  124808  29474.299  2498  191.110  29665.409
 70.20   0.00  93.47  57.60  59.90  124808  29474.299  2498  191.110  29665.409
  0.00  65.69  78.07  58.09  59.90  124809  29474.526  2498  191.110  29665.636
 84.97   0.00  48.19  58.57  59.90  124810  29474.774  2498  191.110  29665.884
 84.97   0.00  81.30  58.57  59.90  124810  29474.774  2498  191.110  29665.884
  0.00  62.64  27.22  59.12  59.90  124811  29474.992  2498  191.110  29666.102
  0.00  62.64  54.47  59.12  59.90  124811  29474.992  2498  191.110  29666.102
 75.68   0.00   6.80  59.53  59.90  124812  29475.228  2498  191.110  29666.338
 75.68   0.00  23.38  59.53  59.90  124812  29475.228  2498  191.110  29666.338
 75.68   0.00  27.72  59.53  59.90  124812  29475.228  2498  191.110  29666.338
If you follow the lines, you can see Eden filling up until a young GC hits
on line 4: surviving objects are copied into S1, the previous survivors in
S0 are promoted, adding 0.49% to Old. On line 5 another GC happens, this
time Eden->S0 and S1->Old, and so on. No objects are ever promoted straight
from Eden to Old, unless there's a huge peak of requests.
This is with a 32GB heap, -Xmn1200M, SurvivorRatio=2 (600MB Eden, 300MB S0,
300MB S1) and MaxTenuringThreshold=1 (whatever is still alive in S0/S1 on
the second GC is copied to old; don't wait, web requests are quite bursty).
With about one collection every 2-5 seconds, objects promoted to Old must
have lived for at least 4-10 seconds; as that's longer than an average
request (50ms-1s), none of the temporary data ever makes it into Old, which
is much more expensive to collect. It works even with a higher-than-default
CMSInitiatingOccupancyFraction=76, to optimize for the space available for
the large data cache we have.
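For reference, that setup corresponds to a command line roughly like the
one below (a sketch rather than our exact flags; -XX:+UseCMSInitiatingOccupancyOnly
is my addition here, it makes CMS honor the 76% threshold on every cycle
instead of treating it only as an initial hint):

  java -Xmx32g -Xmn1200m \
       -XX:+UseConcMarkSweepGC \
       -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=1 \
       -XX:CMSInitiatingOccupancyFraction=76 \
       -XX:+UseCMSInitiatingOccupancyOnly \
       ...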
With your config of 400MB total new, i.e. roughly 320MB Eden, 40MB S0 and
40MB S1 (SurvivorRatio 8), and no tenuring threshold, I think loads of new
objects get copied from Eden to Old directly, causing trouble for the CMS.
You can use jstat to get live stats and tweak until it doesn't happen. If
you can't make changes on live that easily, then doubling the new size as
you planned is indeed worth trying, with 400MB Eden, 200MB S0, 200MB S1 and
MaxTenuringThreshold=1. It's probably overkill, but it should solve the
problem if it is caused by premature promotion.
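In flag form that suggestion would be something like this (a sketch,
assuming the young generation is sized with -Xmn and the rest of your CMS
settings stay as they are):

  java ... -Xmn800m -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=1 ...

i.e. 800MB of new space split into 400MB Eden plus two 200MB survivor
spaces.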
Chi Ho Kwok
On Tue, Feb 21, 2012 at 5:55 PM, Taras Tielkes <taras.tielkes at gmail.com> wrote:
> Hi,
>
> We've removed the "-XX:+CMSScavengeBeforeRemark" setting from 50% of
> our production nodes.
> After running for a few weeks, it seems that there's no impact from
> removing this option. That's good, since it seems we can remove it from
> the other nodes as well, simplifying our overall JVM configuration ;-)
>
> However, we're still seeing promotion failures on all nodes, once
> every day or so.
>
> There's still the "Magic 1026": this accounts for ~60% of the
> promotion failures that we're seeing (single ParNew thread, 1026
> failure size):
> --------------------
> 2012-02-06T09:13:51.806+0100: 328095.085: [GC 328095.086: [ParNew:
> 359895K->29357K(368640K), 0.0429070 secs]
> 3471021K->3143476K(5201920K), 0.0434950 secs] [Times: user=0.32
> sys=0.00, real=0.04 secs]
> 2012-02-06T09:13:55.922+0100: 328099.201: [GC 328099.201: [ParNew:
> 357037K->31817K(368640K), 0.0429130 secs]
> 3471156K->3148946K(5201920K), 0.0434930 secs] [Times: user=0.31
> sys=0.00, real=0.04 secs]
> 2012-02-06T09:13:59.044+0100: 328102.324: [GC 328102.324: [ParNew
> (promotion failure size = 1026) (promotion failed):
> 359497K->368640K(368640K), 0.2226790 secs]328102.547: [CMS:
> 3125609K->451515K(4833280K), 5.6225880 secs]
> 3476626K->451515K(5201920K), [CMS Perm : 124373K->124353K(262144K)], 5.8459380 secs]
> [Times: user=6.20 sys=0.01, real=5.85 secs]
> 2012-02-06T09:14:05.243+0100: 328108.522: [GC 328108.523: [ParNew:
> 327680K->40960K(368640K), 0.0319160 secs] 779195K->497658K(5201920K),
> 0.0325360 secs] [Times: user=0.21 sys=0.01, real=0.03 secs]
> 2012-02-06T09:14:07.836+0100: 328111.116: [GC 328111.116: [ParNew:
> 368640K->32785K(368640K), 0.0744670 secs] 825338K->520234K(5201920K),
> 0.0750390 secs] [Times: user=0.40 sys=0.02, real=0.08 secs]
> --------------------
> Given the 1026-word size, I'm wondering if I should be hunting for an
> overuse of BufferedInputStream/BufferedOutputStream, since both have
> 8192 as a default buffer size (1026 words is 8208 bytes, which matches
> an 8192-byte buffer array plus its object header).
>
> The second group of promotion failures look like this (multiple ParNew
> threads, small failure sizes):
> --------------------
> 2012-02-06T09:50:15.773+0100: 328756.964: [GC 328756.964: [ParNew:
> 356116K->29934K(368640K), 0.0461100 secs]
> 3203863K->2880162K(5201920K), 0.0468870 secs] [Times: user=0.34
> sys=0.01, real=0.05 secs]
> 2012-02-06T09:50:19.153+0100: 328760.344: [GC 328760.344: [ParNew:
> 357614K->30359K(368640K), 0.0454680 secs]
> 3207842K->2882892K(5201920K), 0.0462280 secs] [Times: user=0.33
> sys=0.01, real=0.05 secs]
> 2012-02-06T09:50:22.658+0100: 328763.849: [GC 328763.849: [ParNew (1:
> promotion failure size = 25) (4: promotion failure size = 25) (6:
> promotion failure size = 25) (7: promotion failure size = 144)
> (promotion failed): 358039K->358358K(368640K), 0.2148680 secs]328764.064: [CMS:
> 2854709K->446750K(4833280K), 5.8368270 secs]
> 3210572K->446750K(5201920K), [CMS Perm : 124670K->124644K(262144K)],
> 6.0525230 secs] [Times: user=6.32 sys=0.00, real=6.05 secs]
> 2012-02-06T09:50:29.896+0100: 328771.086: [GC 328771.087: [ParNew:
> 327680K->22569K(368640K), 0.0227080 secs] 774430K->469319K(5201920K),
> 0.0235020 secs] [Times: user=0.16 sys=0.00, real=0.02 secs]
> 2012-02-06T09:50:31.076+0100: 328772.266: [GC 328772.267: [ParNew:
> 350249K->22264K(368640K), 0.0235480 secs] 796999K->469014K(5201920K),
> 0.0243000 secs] [Times: user=0.18 sys=0.01, real=0.02 secs]
> --------------------
>
> We're going to try to double the new size on a single node, to see the
> effects of that.
>
> Beyond this experiment, is there any additional data I can collect to
> better understand the nature of the promotion failures?
> Am I at the point where I need to start collecting free list statistics?
>
> Thanks,
> Taras