Promotion failures: indication of CMS fragmentation?

Chi Ho Kwok chkwok at digibites.nl
Sun Apr 15 10:11:34 PDT 2012


Hi Taras,

Hmm, it looks like it failed even though there's plenty of space available: 2.4G
used, 1.8G free out of the 4.2G CMS old gen. Or am I reading the log line below
wrong? (age histogram snipped)

[GC 3296267.500: [ParNew (1: promotion failure size = 16)  (2: promotion
failure size = 56)  (4: promotion failure size = 342)  (5: promotion
failure size = 1026)  (6: promotion failure size = 278)  (promotion
failed): 568902K->568678K(682688K), 0.3130580 secs]3296267.813: *[CMS:
2510091K->573489K(4423680K), 7.7481330 secs]* 3078184K->573489K(5106368K),
[CMS Perm : 144002K->143970K(262144K)], 8.0619690 secs] [Times: user=8.35
sys=0.01, real=8.06 secs]


Normally I'd say fragmentation, but how can it not find a spot for a 16-byte
chunk? I'm completely out of ideas now - anyone else?


Here's a brute force "solution": what the app "needs" is 600M of live data
in the old gen, that's what left usually after collection. Increase "safety
margin" by adding memory to the old gen pool if possible by increasing
total heap size, and set initial occupancy ratio to a silly low number like
45%. Hopefully, it will survive until the next software/jvm/kernel patch
that requires a restart of the service or machine.
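
In flag form, that brute-force route would look something like the sketch
below (the 8g total heap is just an illustration - size it to whatever the
machine can actually spare; the young gen and occupancy-only settings are the
ones Taras already runs with):

-----
-Xms8g -Xmx8g -Xmn800m -XX:SurvivorRatio=4
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=45
-----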


I've seen something similar in our logs as well, with 19% / 2.9GB of the old
gen still free; my guess is that CMS simply needs a few GB of headroom to play
with... Nowadays we run with a larger safety margin: we doubled the heap on
that machine to 32GB, and I haven't seen any concurrent mode or promotion
failures since then (Jan 2010).

128265.354: [GC 128265.355: [ParNew (promotion failed):
589631K->589631K(589824K), 0.3582340 secs]128265.713: *[CMS:
12965822K->10393148K(15990784K), 20.9654520 secs]*
13462337K->10393148K(16580608K), [CMS Perm : 20604K->16846K(34456K)],
21.3239890 secs] [Times: user=22.06 sys=0.09, real=21.32 secs]



Regards,

Chi Ho


On Sun, Apr 15, 2012 at 6:41 PM, Taras Tielkes <taras.tielkes at gmail.com> wrote:

> Hi Chi,
>
> I've sent you a decent chunk of the gc.log file off-list (hopefully
> not too large).
>
> For completeness, we're running with the following options (ignoring
> the diagnostic ones):
> -----
> -server
> -Xms5g
> -Xmx5g
> -Xmn800m
> -XX:PermSize=256m
> -XX:MaxPermSize=256m
> -XX:SurvivorRatio=4
> -XX:+UseConcMarkSweepGC
> -XX:+UseParNewGC
> -XX:+DisableExplicitGC
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+CMSClassUnloadingEnabled
> -XX:CMSInitiatingOccupancyFraction=68
> -----
> Platform is Java 6u29 running on Linux 2.6 x64.
> Hardware is 2xquad Xeons, but pretty old ones (pre-Nehalem, no QPI).
>
> The gc logs will (typically) show big peaks at the start and end of
> the working day - this is the nature of the domain our application
> targets.
>
> I would expect the live set to be below 1G (usually even below 600M).
> However, we can experience temporary bursts of higher-volume,
> longer-living object allocation.
>
> We'll set up a jstat log for this machine. I do have historical jstat
> logs for one of the other machines, but that one is still running with
> a smaller new gen, and smaller survivor spaces. If there's any other
> relevant data that I can collect, let me know.
>
> Kind regards,
> Taras
>
> On Sun, Apr 15, 2012 at 6:15 PM, Chi Ho Kwok <chkwok at digibites.nl> wrote:
> > Hi Taras,
> >
> > Sure thing. Just the previous CMS should be enough; it doesn't matter if
> > there are 10 or 1000 ParNews between that and the failure.
> >
> > As for the jstat failure, it looks like it looks in
> > /tmp/hsperfdata_[username] for the pid by default; maybe something
> > like -J-Djava.io.tmpdir=[path, like /app/client/program/tomcat/temp]
> > can help. And from what I've seen, running jstat as the same user as
> > the process, or as root, is required. Historical data is nice to
> > have, but even just staring at it for 15 minutes should give you a
> > hint about the old gen usage.
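> >
> > For example, something like this (an untested sketch - adjust the
> > tmpdir path and the user to the actual setup):
> >
> > sudo -u tomcat jstat -gcutil -J-Djava.io.tmpdir=/app/client/program/tomcat/temp \
> >     `pidof java` 2000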
> >
> > If the collection starts at 68%, takes a while, and the heap fills to
> > 80%+ before it's done even when it's not busy, it's probably wise to
> > lower the initiating occupancy fraction or to increase the CMS thread
> > count so it completes faster. We run with -XX:ParallelCMSThreads=3 on
> > an 8-hardware-thread server; the default (2) was too slow for us, as
> > we run with 76% and it still takes 15s on average for CMS to scan and
> > clean the old gen (while the old gen grows to up to 80% full). Much
> > longer than that can mean a promotion failure during request spikes.
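> >
> > For your setup that would mean something along these lines (a sketch
> > only - the 60 is just an example of "lower than the current 68", tune
> > it against what jstat shows):
> >
> > -XX:+UseCMSInitiatingOccupancyOnly
> > -XX:CMSInitiatingOccupancyFraction=60
> > -XX:ParallelCMSThreads=3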
> >
> >
> > Chi Ho Kwok
> >
> >
> > On Sun, Apr 15, 2012 at 5:08 PM, Taras Tielkes <taras.tielkes at gmail.com>
> > wrote:
> >>
> >> Hi Chi,
> >>
> >> Is it o.k. if I send this off-list to you directly? If so, how much
> >> more do you need? Just enough to cover the previous CMS?
> >> We're running with  -XX:CMSInitiatingOccupancyFraction=68 and
> >> -XX:+UseCMSInitiatingOccupancyOnly, by the way.
> >>
> >> I do have shell access, however, on that particular machine we're
> >> experiencing the "process id not found" issue with jstat.
> >> I think this can be worked around by fiddling with temp directory
> >> options, but we haven't tried that yet.
> >> Regarding the jstat output, I assume this would be most valuable to
> >> have for the exact moment when the promotion failure happens, correct?
> >> If so, we can try to set up jstat to run in the background
> >> continuously, to have more diagnostic data in the future.
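> >>
> >> For example, something along these lines could capture a continuous,
> >> timestamped log (just a sketch - the log path is made up, and the
> >> command has to be restarted whenever the JVM is):
> >>
> >> nohup jstat -gcutil -t `pidof java` 5000 \
> >>     >> /var/log/app/jstat-gcutil.log 2>&1 &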
> >>
> >> Kind regards,
> >> Taras
> >>
> >> On Sun, Apr 15, 2012 at 2:48 PM, Chi Ho Kwok <chkwok at digibites.nl> wrote:
> >> > Hi Taras,
> >> >
> >> > Can you send me a larger chunk of the log? I'm interested in seeing
> >> > when the last CMS was run and what it freed. Maybe it's kicking in
> >> > too late: the full GC triggered by the promotion failure only found
> >> > 600M of live data, the rest was garbage. If that's the cause,
> >> > lowering -XX:CMSInitiatingOccupancyFraction can help.
> >> >
> >> > Also, do you have shell access to that machine? If so, try running
> >> > jstat,
> >> > you can see the usage of all generations live as it happens.
> >> >
> >> >
> >> > Chi Ho Kwok
> >> >
> >> > On Sun, Apr 15, 2012 at 2:34 PM, Taras Tielkes <taras.tielkes at gmail.com> wrote:
> >> >>
> >> >> Hi Chi, Srinivas,
> >> >>
> >> >> Optimizing the cost of ParNew (by lowering MTT) would be nice, but for
> >> >> now my priority is still to minimize the promotion failures.
> >> >>
> >> >> For example, on the machine running CMS with the "larger" young gen
> >> >> and survivor spaces (-Xmx5g -Xmn800 -XX:SurvivorRatio=4), I've just
> >> >> seen a promotion failure again. Below is a snippet of gc.log showing
> >> >> this.
> >> >> To put this into perspective, this is the first promotion failure on
> >> >> that machine in a couple of weeks. Still, zero failures would beat a
> >> >> single failure, since the clients connecting to this application will
> >> >> only wait a few seconds before timing out and terminating the
> >> >> connection. In addition, the promotion failures are occurring at peak
> >> >> usage moments.
> >> >>
> >> >> Apart from trying to eliminate the promotion failure pauses, my main
> >> >> goal is to learn how to understand the root cause in a case like this.
> >> >> Any suggestions for things to try or read up on are appreciated.
> >> >>
> >> >> Kind regards,
> >> >> Taras
> >> >> ------------------------------------------------
> >> >> 2012-04-13T17:44:27.777+0200: 3296255.045: [GC 3296255.046: [ParNew
> >> >> Desired survivor size 69894144 bytes, new threshold 15 (max 15)
> >> >> - age   1:    3684448 bytes,    3684448 total
> >> >> - age   2:     824984 bytes,    4509432 total
> >> >> - age   3:     885120 bytes,    5394552 total
> >> >> - age   4:     756568 bytes,    6151120 total
> >> >> - age   5:     696880 bytes,    6848000 total
> >> >> - age   6:     890688 bytes,    7738688 total
> >> >> - age   7:    2631184 bytes,   10369872 total
> >> >> - age   8:     719976 bytes,   11089848 total
> >> >> - age   9:     724944 bytes,   11814792 total
> >> >> - age  10:     750360 bytes,   12565152 total
> >> >> - age  11:     934944 bytes,   13500096 total
> >> >> - age  12:     521080 bytes,   14021176 total
> >> >> - age  13:     543392 bytes,   14564568 total
> >> >> - age  14:     906616 bytes,   15471184 total
> >> >> - age  15:     504008 bytes,   15975192 total
> >> >> : 568932K->22625K(682688K), 0.0410180 secs]
> >> >> 3077079K->2531413K(5106368K), 0.0416940 secs] [Times: user=0.30
> >> >> sys=0.01, real=0.05 secs]
> >> >> 2012-04-13T17:44:33.893+0200: 3296261.162: [GC 3296261.162: [ParNew
> >> >> Desired survivor size 69894144 bytes, new threshold 15 (max 15)
> >> >> - age   1:    2975896 bytes,    2975896 total
> >> >> - age   2:     742592 bytes,    3718488 total
> >> >> - age   3:     812864 bytes,    4531352 total
> >> >> - age   4:     873488 bytes,    5404840 total
> >> >> - age   5:     746128 bytes,    6150968 total
> >> >> - age   6:     685192 bytes,    6836160 total
> >> >> - age   7:     888376 bytes,    7724536 total
> >> >> - age   8:    2621688 bytes,   10346224 total
> >> >> - age   9:     715608 bytes,   11061832 total
> >> >> - age  10:     723336 bytes,   11785168 total
> >> >> - age  11:     749856 bytes,   12535024 total
> >> >> - age  12:     914632 bytes,   13449656 total
> >> >> - age  13:     520944 bytes,   13970600 total
> >> >> - age  14:     543224 bytes,   14513824 total
> >> >> - age  15:     906040 bytes,   15419864 total
> >> >> : 568801K->22726K(682688K), 0.0447800 secs]
> >> >> 3077589K->2532008K(5106368K), 0.0454710 secs] [Times: user=0.33
> >> >> sys=0.00, real=0.05 secs]
> >> >> 2012-04-13T17:44:40.231+0200: 3296267.499: [GC 3296267.500: [ParNew
> >> >> (1: promotion failure size = 16)  (2: promotion failure size = 56)
> >> >> (4: promotion failure size = 342)  (5: promotion failure size = 1026)
> >> >> (6: promotion failure size = 278)  (promotion failed)
> >> >> Desired survivor size 69894144 bytes, new threshold 15 (max 15)
> >> >> - age   1:    2436840 bytes,    2436840 total
> >> >> - age   2:    1625136 bytes,    4061976 total
> >> >> - age   3:     691664 bytes,    4753640 total
> >> >> - age   4:     799992 bytes,    5553632 total
> >> >> - age   5:     858344 bytes,    6411976 total
> >> >> - age   6:     730200 bytes,    7142176 total
> >> >> - age   7:     680072 bytes,    7822248 total
> >> >> - age   8:     885960 bytes,    8708208 total
> >> >> - age   9:    2618544 bytes,   11326752 total
> >> >> - age  10:     709168 bytes,   12035920 total
> >> >> - age  11:     714576 bytes,   12750496 total
> >> >> - age  12:     734976 bytes,   13485472 total
> >> >> - age  13:     905048 bytes,   14390520 total
> >> >> - age  14:     520320 bytes,   14910840 total
> >> >> - age  15:     543056 bytes,   15453896 total
> >> >> : 568902K->568678K(682688K), 0.3130580 secs]3296267.813: [CMS:
> >> >> 2510091K->573489K(4423680K), 7.7481330 secs]
> >> >> 3078184K->573489K(5106368K), [CMS Perm : 144002K->143970K(262144K)],
> >> >> 8.0619690 secs] [Times: user=8.35 sys=0.01, real=8.06 secs]
> >> >> 2012-04-13T17:44:51.337+0200: 3296278.606: [GC 3296278.606: [ParNew
> >> >> Desired survivor size 69894144 bytes, new threshold 15 (max 15)
> >> >> - age   1:   33717528 bytes,   33717528 total
> >> >> : 546176K->43054K(682688K), 0.0515990 secs]
> >> >> 1119665K->616543K(5106368K), 0.0523550 secs] [Times: user=0.34
> >> >> sys=0.00, real=0.05 secs]
> >> >> ------------------------------------------------
> >> >>
> >> >> On Tue, Mar 20, 2012 at 10:12 PM, Srinivas Ramakrishna
> >> >> <ysr1729 at gmail.com> wrote:
> >> >> > As Chi Ho noted, about 3-4 MB of data does get promoted per
> >> >> > scavenge, after having sloshed around in your survivor spaces some
> >> >> > 15 times. I'd venture that whatever winnowing of young objects was
> >> >> > to occur has in fact occurred already within the first 3-4
> >> >> > scavenges that an object has survived, after which the drop-off in
> >> >> > population is less sharp. So I'd suggest lowering the MTT to about
> >> >> > 3, while leaving the survivor ratio intact. That should reduce your
> >> >> > copying costs and bring down your scavenge pauses further, while
> >> >> > not adversely affecting your promotion rates (and concomitantly the
> >> >> > fragmentation).
> >> >> >
> >> >> > One thing that was a bit puzzling about the stats below was that
> >> >> > you'd expect the volume of age X in scavenge N to be no less than
> >> >> > the volume of age X+1 in scavenge N+1, but occasionally that
> >> >> > natural invariant does not appear to hold, which is quite puzzling
> >> >> > -- indicating perhaps that either ages or populations are not being
> >> >> > correctly tracked.
> >> >> >
> >> >> > I don't know if anyone else has noticed that in their tenuring
> >> >> > distributions as well....
> >> >> >
> >> >> > -- ramki
> >> >> >
> >> >> > On Tue, Mar 20, 2012 at 9:36 AM, Taras Tielkes
> >> >> > <taras.tielkes at gmail.com>
> >> >> > wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> I've collected -XX:+PrintTenuringDistribution data from a node in
> >> >> >> our production environment, running -Xmx5g -Xmn400m
> >> >> >> -XX:SurvivorRatio=8. On another production node, we've configured
> >> >> >> a larger new gen and larger survivor spaces (-Xmx5g -Xmn800m
> >> >> >> -XX:SurvivorRatio=4).
> >> >> >> This node has -XX:+PrintTenuringDistribution logging as well.
> >> >> >>
> >> >> >> The node running the larger new gen and survivor spaces has not run
> >> >> >> into a promotion failure yet, while the ones still running the old
> >> >> >> config have hit a few. The promotion failures are typically
> >> >> >> experienced at high-load periods, which makes sense, as allocation
> >> >> >> and promotion will experience a spike in those periods as well.
> >> >> >>
> >> >> >> The inherent nature of the application implies relatively long
> >> >> >> sessions (up to a few hours), retaining a fair amount of state for
> >> >> >> up to an hour. I believe this is the main reason for the relatively
> >> >> >> high promotion rate we're experiencing.
> >> >> >>
> >> >> >>
> >> >> >> Here's a fragment of gc log from one of the nodes running the older
> >> >> >> (smaller) new gen, including a promotion failure:
> >> >> >> -------------------------
> >> >> >> 2012-03-15T18:32:17.785+0100: 796604.225: [GC 796604.225: [ParNew
> >> >> >> Desired survivor size 20971520 bytes, new threshold 8 (max 15)
> >> >> >> - age   1:    2927728 bytes,    2927728 total
> >> >> >> - age   2:    2428512 bytes,    5356240 total
> >> >> >> - age   3:    2696376 bytes,    8052616 total
> >> >> >> - age   4:    2623576 bytes,   10676192 total
> >> >> >> - age   5:    3365576 bytes,   14041768 total
> >> >> >> - age   6:    2792272 bytes,   16834040 total
> >> >> >> - age   7:    2233008 bytes,   19067048 total
> >> >> >> - age   8:    2263824 bytes,   21330872 total
> >> >> >> : 358709K->29362K(368640K), 0.0461460 secs]
> >> >> >> 3479492K->3151874K(5201920K), 0.0467320 secs] [Times: user=0.34
> >> >> >> sys=0.01, real=0.05 secs]
> >> >> >> 2012-03-15T18:32:21.546+0100: 796607.986: [GC 796607.986: [ParNew
> >> >> >> (0: promotion failure size = 25)  (1: promotion failure size = 25)
> >> >> >> (2: promotion failure size = 25)  (3: promotion failure size = 25)
> >> >> >> (4: promotion failure size = 25)  (5: promotion failure size = 25)
> >> >> >> (6: promotion failure size = 341)  (7: promotion failure size = 25)
> >> >> >> (promotion failed)
> >> >> >> Desired survivor size 20971520 bytes, new threshold 8 (max 15)
> >> >> >> - age   1:    3708208 bytes,    3708208 total
> >> >> >> - age   2:    2174384 bytes,    5882592 total
> >> >> >> - age   3:    2383256 bytes,    8265848 total
> >> >> >> - age   4:    2689912 bytes,   10955760 total
> >> >> >> - age   5:    2621832 bytes,   13577592 total
> >> >> >> - age   6:    3360440 bytes,   16938032 total
> >> >> >> - age   7:    2784136 bytes,   19722168 total
> >> >> >> - age   8:    2220232 bytes,   21942400 total
> >> >> >> : 357042K->356456K(368640K), 0.2734100 secs]796608.259: [CMS:
> >> >> >> 3124189K->516640K(4833280K), 6.8127070 secs]
> >> >> >> 3479554K->516640K(5201920K), [CMS Perm :
> 142423K->142284K(262144K)],
> >> >> >> 7.0867850 secs] [Times: user=7.32 sys=0.07, real=7.09 secs]
> >> >> >> 2012-03-15T18:32:30.279+0100: 796616.719: [GC 796616.720: [ParNew
> >> >> >> Desired survivor size 20971520 bytes, new threshold 1 (max 15)
> >> >> >> - age   1:   29721456 bytes,   29721456 total
> >> >> >> : 327680K->40960K(368640K), 0.0403130 secs]
> >> >> >> 844320K->557862K(5201920K), 0.0409070 secs] [Times: user=0.27
> >> >> >> sys=0.01, real=0.04 secs]
> >> >> >> 2012-03-15T18:32:32.701+0100: 796619.141: [GC 796619.141: [ParNew
> >> >> >> Desired survivor size 20971520 bytes, new threshold 15 (max 15)
> >> >> >> - age   1:   10310176 bytes,   10310176 total
> >> >> >> -------------------------
> >> >> >>
> >> >> >> For contrast, here's a gc log fragment from the single node running
> >> >> >> the larger new gen and larger survivor spaces
> >> >> >> (the fragment is from the same point in time, with the nodes
> >> >> >> experiencing equal load):
> >> >> >> -------------------------
> >> >> >> 2012-03-15T18:32:12.067+0100: 797119.336: [GC 797119.336: [ParNew
> >> >> >> Desired survivor size 69894144 bytes, new threshold 15 (max 15)
> >> >> >> - age   1:    5611536 bytes,    5611536 total
> >> >> >> - age   2:    3731888 bytes,    9343424 total
> >> >> >> - age   3:    3450672 bytes,   12794096 total
> >> >> >> - age   4:    3314744 bytes,   16108840 total
> >> >> >> - age   5:    3459888 bytes,   19568728 total
> >> >> >> - age   6:    3334712 bytes,   22903440 total
> >> >> >> - age   7:    3671960 bytes,   26575400 total
> >> >> >> - age   8:    3841608 bytes,   30417008 total
> >> >> >> - age   9:    2035392 bytes,   32452400 total
> >> >> >> - age  10:    1975056 bytes,   34427456 total
> >> >> >> - age  11:    2021344 bytes,   36448800 total
> >> >> >> - age  12:    1520752 bytes,   37969552 total
> >> >> >> - age  13:    1494176 bytes,   39463728 total
> >> >> >> - age  14:    2355136 bytes,   41818864 total
> >> >> >> - age  15:    1279000 bytes,   43097864 total
> >> >> >> : 603473K->61640K(682688K), 0.0756570 secs]
> >> >> >> 3373284K->2832383K(5106368K), 0.0762090 secs] [Times: user=0.56
> >> >> >> sys=0.00, real=0.08 secs]
> >> >> >> 2012-03-15T18:32:18.200+0100: 797125.468: [GC 797125.469: [ParNew
> >> >> >> Desired survivor size 69894144 bytes, new threshold 15 (max 15)
> >> >> >> - age   1:    6101320 bytes,    6101320 total
> >> >> >> - age   2:    4446776 bytes,   10548096 total
> >> >> >> - age   3:    3701384 bytes,   14249480 total
> >> >> >> - age   4:    3438488 bytes,   17687968 total
> >> >> >> - age   5:    3295360 bytes,   20983328 total
> >> >> >> - age   6:    3403320 bytes,   24386648 total
> >> >> >> - age   7:    3323368 bytes,   27710016 total
> >> >> >> - age   8:    3665760 bytes,   31375776 total
> >> >> >> - age   9:    2427904 bytes,   33803680 total
> >> >> >> - age  10:    1418656 bytes,   35222336 total
> >> >> >> - age  11:    1955192 bytes,   37177528 total
> >> >> >> - age  12:    2006064 bytes,   39183592 total
> >> >> >> - age  13:    1520768 bytes,   40704360 total
> >> >> >> - age  14:    1493728 bytes,   42198088 total
> >> >> >> - age  15:    2354376 bytes,   44552464 total
> >> >> >> : 607816K->62650K(682688K), 0.0779270 secs]
> >> >> >> 3378559K->2834643K(5106368K), 0.0784690 secs] [Times: user=0.58
> >> >> >> sys=0.00, real=0.08 secs]
> >> >> >> -------------------------
> >> >> >>
> >> >> >> Questions:
> >> >> >>
> >> >> >> 1) From the tenuring distributions, it seems that the application
> >> >> >> benefits from a larger new gen and survivor spaces.
> >> >> >> The next thing we'll try is to run with -Xmn1g -XX:SurvivorRatio=2,
> >> >> >> and see if the ParNew times are still acceptable.
> >> >> >> Does this seem a sensible approach in this context?
> >> >> >> Are there other variables beyond ParNew times that limit scaling the
> >> >> >> new gen to a large size?
> >> >> >>
> >> >> >> 2) Given the object age demographics inherent to our application,
> >> >> >> we cannot expect to see the majority of data get collected in the
> >> >> >> new gen.
> >> >> >>
> >> >> >> Our approach to fighting the promotion failures consists of three
> >> >> >> aspects:
> >> >> >> a) Lower the overall allocation rate of our application (by
> >> >> >> improving wasteful hotspots), to decrease the overall ParNew
> >> >> >> collection frequency.
> >> >> >> b) Configure the new gen and survivor spaces as large as possible,
> >> >> >> keeping an eye on ParNew times and the overall new/tenured ratio.
> >> >> >> c) Try to refactor the data structures that form the bulk of the
> >> >> >> promoted data, to retain only the strictly required subgraphs.
> >> >> >>
> >> >> >> Is there anything else I can try or measure, in order to better
> >> >> >> understand the problem?
> >> >> >>
> >> >> >> Thanks in advance,
> >> >> >> Taras
> >> >> >>
> >> >> >>
> >> >> >> On Wed, Feb 22, 2012 at 10:51 AM, Taras Tielkes
> >> >> >> <taras.tielkes at gmail.com> wrote:
> >> >> >>> (this time properly responding to the list alias)
> >> >> >>> Hi Srinivas,
> >> >> >>>
> >> >> >>> We're running 1.6.0 u29 on Linux x64. My understanding is that
> >> >> >>> CompressedOops is enabled by default since u23.
> >> >> >>>
> >> >> >>> At least this page seems to support that:
> >> >> >>> http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html
> >> >> >>>
> >> >> >>> Regarding the other remarks (also from Todd and Chi), I'll
> >> >> >>> comment later. The first thing on my list is to collect
> >> >> >>> PrintTenuringDistribution data now.
> >> >> >>>
> >> >> >>> Kind regards,
> >> >> >>> Taras
> >> >> >>>
> >> >> >>>> On Wed, Feb 22, 2012 at 12:40 AM, Srinivas Ramakrishna
> >> >> >>>> <ysr1729 at gmail.com> wrote:
> >> >> >>>>> I agree that premature promotions are almost always the first
> >> >> >>>>> and most important thing to fix when running into fragmentation
> >> >> >>>>> or overload issues with CMS. However, I can also imagine
> >> >> >>>>> long-lived objects with a highly non-stationary size
> >> >> >>>>> distribution which can also cause problems for CMS despite best
> >> >> >>>>> efforts to tune against premature promotion.
> >> >> >>>>>
> >> >> >>>>> I didn't think Taras was running with MTT=0, although MTT > 0
> >> >> >>>>> is no recipe for avoiding premature promotion with bursty loads
> >> >> >>>>> that can overflow the survivor spaces -- as you say, large
> >> >> >>>>> survivor spaces with a low TargetSurvivorRatio, so as to leave
> >> >> >>>>> plenty of space to absorb/accommodate spiking/bursty loads, is
> >> >> >>>>> definitely a "best practice" for CMS (and possibly for other
> >> >> >>>>> concurrent collectors as well).
> >> >> >>>>>
> >> >> >>>>> One thing Taras can do to see if premature promotion might be
> >> >> >>>>> an issue is to look at the tenuring threshold in his case. A
> >> >> >>>>> rough proxy (if PrintTenuringDistribution is not enabled) is to
> >> >> >>>>> look at the promotion volume per scavenge. It may be possible,
> >> >> >>>>> if premature promotion is a cause, to see some kind of
> >> >> >>>>> medium-term correlation between high promotion volume and
> >> >> >>>>> eventual promotion failure despite frequent CMS collections.
> >> >> >>>>>
> >> >> >>>>> One other point which may or may not be relevant. I see that
> >> >> >>>>> Taras is not using CompressedOops... Using that alone would
> >> >> >>>>> greatly decrease memory pressure and provide more breathing
> >> >> >>>>> room to CMS, which is also almost always a good idea.
> >> >> >>>>> -- ramki
> >> >> >>>>>
> >> >> >>>>> On Tue, Feb 21, 2012 at 10:16 AM, Chi Ho Kwok
> >> >> >>>>> <chkwok at digibites.nl>
> >> >> >>>>> wrote:
> >> >> >>>>>>
> >> >> >>>>>> Hi Taras,
> >> >> >>>>>>
> >> >> >>>>>> I think you may want to look into sizing the new gen, and
> >> >> >>>>>> especially the survivor spaces, differently. We run something
> >> >> >>>>>> similar to what you described - high-volume request processing
> >> >> >>>>>> with large dataset loading - and what we've seen at the start
> >> >> >>>>>> is that the survivor spaces are completely overloaded, causing
> >> >> >>>>>> premature promotions.
> >> >> >>>>>>
> >> >> >>>>>> We've configured our VM with the following goals/guidelines:
> >> >> >>>>>>
> >> >> >>>>>> - old space is for semi-permanent data, living for at least
> >> >> >>>>>>   30s, on average ~10 minutes
> >> >> >>>>>> - new space contains only temporary and just-loaded data
> >> >> >>>>>> - surviving objects from new should never reach old in 1 gc,
> >> >> >>>>>>   so the survivor space may never be 100% full
> >> >> >>>>>>
> >> >> >>>>>> With jstat -gcutil `pidof java` 2000, we see things like:
> >> >> >>>>>>
> >> >> >>>>>>   S0     S1     E      O      P     YGC     YGCT    FGC    FGCT       GCT
> >> >> >>>>>>  70.20   0.00  19.65  57.60  59.90 124808 29474.299  2498  191.110 29665.409
> >> >> >>>>>>  70.20   0.00  92.89  57.60  59.90 124808 29474.299  2498  191.110 29665.409
> >> >> >>>>>>  70.20   0.00  93.47  57.60  59.90 124808 29474.299  2498  191.110 29665.409
> >> >> >>>>>>   0.00  65.69  78.07  58.09  59.90 124809 29474.526  2498  191.110 29665.636
> >> >> >>>>>>  84.97   0.00  48.19  58.57  59.90 124810 29474.774  2498  191.110 29665.884
> >> >> >>>>>>  84.97   0.00  81.30  58.57  59.90 124810 29474.774  2498  191.110 29665.884
> >> >> >>>>>>   0.00  62.64  27.22  59.12  59.90 124811 29474.992  2498  191.110 29666.102
> >> >> >>>>>>   0.00  62.64  54.47  59.12  59.90 124811 29474.992  2498  191.110 29666.102
> >> >> >>>>>>  75.68   0.00   6.80  59.53  59.90 124812 29475.228  2498  191.110 29666.338
> >> >> >>>>>>  75.68   0.00  23.38  59.53  59.90 124812 29475.228  2498  191.110 29666.338
> >> >> >>>>>>  75.68   0.00  27.72  59.53  59.90 124812 29475.228  2498  191.110 29666.338
> >> >> >>>>>>
> >> >> >>>>>> If you follow the lines, you can see Eden fill up to 100% on
> >> >> >>>>>> line 4; surviving objects are copied into S1, S0 is collected,
> >> >> >>>>>> and 0.49% is added to Old. On line 5, another GC happened,
> >> >> >>>>>> with Eden->S0, S1->Old, etc. No object is ever transferred
> >> >> >>>>>> from Eden to Old, unless there's a huge peak of requests.
> >> >> >>>>>>
> >> >> >>>>>> This is with a 32GB heap, -Xmn1200M, SurvivorRatio 2 (600MB
> >> >> >>>>>> Eden, 300MB S0, 300MB S1), and MaxTenuringThreshold 1
> >> >> >>>>>> (whatever is still alive in S0/S1 on the second GC is copied
> >> >> >>>>>> to old - don't wait, web requests are quite bursty). With
> >> >> >>>>>> about 1 collection every 2-5 seconds, objects promoted to Old
> >> >> >>>>>> must live for at least 4-10 seconds; as that's longer than an
> >> >> >>>>>> average request (50ms-1s), none of the temporary data ever
> >> >> >>>>>> makes it into Old, which is much more expensive to collect. It
> >> >> >>>>>> works even with a higher-than-default
> >> >> >>>>>> CMSInitiatingOccupancyFraction=76, to optimize for the space
> >> >> >>>>>> available for the large data cache we have.
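> >> >> >>>>>>
> >> >> >>>>>> In flag form that is roughly the following (reconstructed from
> >> >> >>>>>> the prose above, not pasted from our startup script):
> >> >> >>>>>>
> >> >> >>>>>> -Xmx32g -Xmn1200m -XX:SurvivorRatio=2
> >> >> >>>>>> -XX:MaxTenuringThreshold=1
> >> >> >>>>>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> >> >> >>>>>> -XX:CMSInitiatingOccupancyFraction=76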
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>> With your config of 400MB total new, with 350MB Eden, 25MB S0,
> >> >> >>>>>> 25MB S1 (SurvivorRatio 8) and no tenuring threshold, I think
> >> >> >>>>>> loads of new objects get copied from Eden to Old directly,
> >> >> >>>>>> causing trouble for the CMS. You can use jstat to get live
> >> >> >>>>>> stats and tweak until it doesn't happen. If you can't make
> >> >> >>>>>> changes on live that easily, try doubling the new size indeed,
> >> >> >>>>>> with a 400MB Eden, 200MB S0, 200MB S1 and MaxTenuringThreshold
> >> >> >>>>>> 1 setting. It's probably overkill, but it should solve the
> >> >> >>>>>> problem if it is caused by premature promotion.
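> >> >> >>>>>>
> >> >> >>>>>> In flag form that suggestion would be something like the
> >> >> >>>>>> following (a sketch only - keep an eye on the ParNew pause
> >> >> >>>>>> times after the change):
> >> >> >>>>>>
> >> >> >>>>>> -Xmn800m -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=1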
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>> Chi Ho Kwok
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>> On Tue, Feb 21, 2012 at 5:55 PM, Taras Tielkes
> >> >> >>>>>> <taras.tielkes at gmail.com>
> >> >> >>>>>> wrote:
> >> >> >>>>>>>
> >> >> >>>>>>> Hi,
> >> >> >>>>>>>
> >> >> >>>>>>> We've removed the "-XX:+CMSScavengeBeforeRemark" setting from
> >> >> >>>>>>> 50% of our production nodes. After running for a few weeks,
> >> >> >>>>>>> it seems that there's no impact from removing this option.
> >> >> >>>>>>> Which is good, since it seems we can remove it from the other
> >> >> >>>>>>> nodes as well, simplifying our overall JVM configuration ;-)
> >> >> >>>>>>>
> >> >> >>>>>>> However, we're still seeing promotion failures on all nodes,
> >> >> >>>>>>> once every day or so.
> >> >> >>>>>>>
> >> >> >>>>>>> There's still the "Magic 1026": this accounts for ~60% of the
> >> >> >>>>>>> promotion failures that we're seeing (single ParNew thread,
> >> >> >>>>>>> 1026 failure size):
> >> >> >>>>>>> --------------------
> >> >> >>>>>>> 2012-02-06T09:13:51.806+0100: 328095.085: [GC 328095.086:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 359895K->29357K(368640K), 0.0429070 secs]
> >> >> >>>>>>> 3471021K->3143476K(5201920K), 0.0434950 secs] [Times:
> user=0.32
> >> >> >>>>>>> sys=0.00, real=0.04 secs]
> >> >> >>>>>>> 2012-02-06T09:13:55.922+0100: 328099.201: [GC 328099.201:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 357037K->31817K(368640K), 0.0429130 secs]
> >> >> >>>>>>> 3471156K->3148946K(5201920K), 0.0434930 secs] [Times:
> user=0.31
> >> >> >>>>>>> sys=0.00, real=0.04 secs]
> >> >> >>>>>>> 2012-02-06T09:13:59.044+0100: 328102.324: [GC 328102.324:
> >> >> >>>>>>> [ParNew (promotion failure size = 1026)  (promotion failed):
> >> >> >>>>>>> 359497K->368640K(368640K), 0.2226790 secs]328102.547: [CMS:
> >> >> >>>>>>> 3125609K->451515K(4833280K), 5.6225880 secs]
> >> >> >>>>>>> 3476626K->451515K(5201920K), [CMS Perm :
> >> >> >>>>>>> 124373K->124353K(262144K)], 5.8459380 secs]
> >> >> >>>>>>> [Times: user=6.20 sys=0.01, real=5.85 secs]
> >> >> >>>>>>> 2012-02-06T09:14:05.243+0100: 328108.522: [GC 328108.523:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 327680K->40960K(368640K), 0.0319160 secs]
> >> >> >>>>>>> 779195K->497658K(5201920K),
> >> >> >>>>>>> 0.0325360 secs] [Times: user=0.21 sys=0.01, real=0.03 secs]
> >> >> >>>>>>> 2012-02-06T09:14:07.836+0100: 328111.116: [GC 328111.116:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 368640K->32785K(368640K), 0.0744670 secs]
> >> >> >>>>>>> 825338K->520234K(5201920K),
> >> >> >>>>>>> 0.0750390 secs] [Times: user=0.40 sys=0.02, real=0.08 secs]
> >> >> >>>>>>> --------------------
> >> >> >>>>>>> Given the 1026-word size, I'm wondering if I should be
> >> >> >>>>>>> hunting for an overuse of
> >> >> >>>>>>> BufferedInputStream/BufferedOutputStream, since both have
> >> >> >>>>>>> 8192 as the default buffer size.
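> >> >> >>>>>>>
> >> >> >>>>>>> The arithmetic does roughly fit, by the way: 1026 heap words
> >> >> >>>>>>> * 8 bytes = 8208 bytes, i.e. an 8192-byte byte[] plus its
> >> >> >>>>>>> 16-byte array header with CompressedOops. As a minimal
> >> >> >>>>>>> illustration of where such arrays come from (the file names
> >> >> >>>>>>> below are placeholders, not actual code from our app):
> >> >> >>>>>>>
> >> >> >>>>>>> // java.io: both allocate a byte[8192] when no size is passed
> >> >> >>>>>>> InputStream in = new BufferedInputStream(new FileInputStream("req.bin"));
> >> >> >>>>>>> OutputStream out = new BufferedOutputStream(new FileOutputStream("resp.bin"));
> >> >> >>>>>>> // an explicit size avoids the default 8 KB buffer
> >> >> >>>>>>> InputStream small = new BufferedInputStream(new FileInputStream("req.bin"), 2048);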
> >> >> >>>>>>>
> >> >> >>>>>>> The second group of promotion failures looks like this
> >> >> >>>>>>> (multiple ParNew threads, small failure sizes):
> >> >> >>>>>>> --------------------
> >> >> >>>>>>> 2012-02-06T09:50:15.773+0100: 328756.964: [GC 328756.964:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 356116K->29934K(368640K), 0.0461100 secs]
> >> >> >>>>>>> 3203863K->2880162K(5201920K), 0.0468870 secs] [Times:
> user=0.34
> >> >> >>>>>>> sys=0.01, real=0.05 secs]
> >> >> >>>>>>> 2012-02-06T09:50:19.153+0100: 328760.344: [GC 328760.344:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 357614K->30359K(368640K), 0.0454680 secs]
> >> >> >>>>>>> 3207842K->2882892K(5201920K), 0.0462280 secs] [Times:
> user=0.33
> >> >> >>>>>>> sys=0.01, real=0.05 secs]
> >> >> >>>>>>> 2012-02-06T09:50:22.658+0100: 328763.849: [GC 328763.849:
> >> >> >>>>>>> [ParNew (1: promotion failure size = 25)  (4: promotion
> >> >> >>>>>>> failure size = 25)  (6: promotion failure size = 25)
> >> >> >>>>>>> (7: promotion failure size = 144)  (promotion failed):
> >> >> >>>>>>> 358039K->358358K(368640K), 0.2148680 secs]328764.064: [CMS:
> >> >> >>>>>>> 2854709K->446750K(4833280K), 5.8368270 secs]
> >> >> >>>>>>> 3210572K->446750K(5201920K), [CMS Perm :
> >> >> >>>>>>> 124670K->124644K(262144K)], 6.0525230 secs]
> >> >> >>>>>>> [Times: user=6.32 sys=0.00, real=6.05 secs]
> >> >> >>>>>>> 2012-02-06T09:50:29.896+0100: 328771.086: [GC 328771.087:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 327680K->22569K(368640K), 0.0227080 secs]
> >> >> >>>>>>> 774430K->469319K(5201920K),
> >> >> >>>>>>> 0.0235020 secs] [Times: user=0.16 sys=0.00, real=0.02 secs]
> >> >> >>>>>>> 2012-02-06T09:50:31.076+0100: 328772.266: [GC 328772.267:
> >> >> >>>>>>> [ParNew:
> >> >> >>>>>>> 350249K->22264K(368640K), 0.0235480 secs]
> >> >> >>>>>>> 796999K->469014K(5201920K),
> >> >> >>>>>>> 0.0243000 secs] [Times: user=0.18 sys=0.01, real=0.02 secs]
> >> >> >>>>>>> --------------------
> >> >> >>>>>>>
> >> >> >>>>>>> We're going to try to double the new size on a single node,
> >> >> >>>>>>> to see the effects of that.
> >> >> >>>>>>>
> >> >> >>>>>>> Beyond this experiment, is there any additional data I can
> >> >> >>>>>>> collect to better understand the nature of the promotion
> >> >> >>>>>>> failures?
> >> >> >>>>>>> Am I facing having to collect free list statistics at this point?
> >> >> >>>>>>>
> >> >> >>>>>>> Thanks,
> >> >> >>>>>>> Taras
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>
> >> >
> >> >
> >
> >
>

