Problem with ParNew GC

pravah wordpress pravah.wordpress at gmail.com
Fri Feb 10 04:51:13 PST 2012


Hi Ramki,
We have done some more analysis using jstat.
Following is the output of jstat.

  S0     S1     E       O      P     YGC    YGCT   FGC   FGCT     GCT   LGCC              GCC
  0.00  25.58   52.20  30.75  19.14  227  51.356    0   0.000  51.356  unknown GCCause   No GC
  0.00  25.58   58.94  30.75  19.14  227  51.356    0   0.000  51.356  unknown GCCause   No GC
  0.00  25.58   67.09  30.75  19.14  227  51.356    0   0.000  51.356  unknown GCCause   No GC
  0.00  25.58   80.68  30.75  19.14  227  51.356    0   0.000  51.356  unknown GCCause   No GC
  0.00  25.58   88.15  30.75  19.14  227  51.356    0   0.000  51.356  unknown GCCause   No GC
  6.61  25.58  100.00  30.75  19.14  228  51.356    0   0.000  51.356  unknown GCCause   Allocation Failure
  8.93   0.00   10.75  30.78  19.14  228  51.567    0   0.000  51.567  unknown GCCause   No GC
  8.93   0.00   18.57  30.78  19.14  228  51.567    0   0.000  51.567  unknown GCCause   No GC
  8.93   0.00   39.04  30.78  19.14  228  51.567    0   0.000  51.567  unknown GCCause   No GC
  8.93   0.00   45.56  30.78  19.14  228  51.567    0   0.000  51.567  unknown GCCause   No GC


As can be seen, a lot of young-generation (YG) collections are happening.
I have 2 options:
1. Let some objects go into the survivor space, where YG collections will
collect them.
2. Let them flow to the old generation (by setting
-XX:MaxTenuringThreshold=0) and let CMS collect them.

Which one is faster, YG or CMS?
The YG is 2 GB and the old generation is 3 GB (5 GB heap in total);
currently the default survivor ratio is used.
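
For context, assuming the usual HotSpot default SurvivorRatio of 8, the
2 GB young generation works out to

  eden          = 2048 MB * 8/10 ~= 1638 MB
  each survivor = 2048 MB * 1/10 ~=  205 MB

which matches the ParNew capacity of 1887488K (eden plus one survivor
space) seen in our GC logs. With -XX:MaxTenuringThreshold=0, any object
surviving even a single minor collection would be promoted straight to
the old generation for CMS to deal with.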

Thanks in advance
Pravah.


On Sun, Feb 5, 2012 at 10:27 PM, pravah wordpress <
pravah.wordpress at gmail.com> wrote:

> Hi Ramki,
> Thank you for your reply. I checked the given bug, but it does not seem
> to be the root cause, as no promotion failures are happening in our case.
> I think we will try different values for the new-generation size and see
> if it improves the performance.
> I shall post the results of our experiments.
> Thanks again.
> Pravah
>
>
>
> On Fri, Feb 3, 2012 at 11:55 PM, Srinivas Ramakrishna <ysr1729 at gmail.com>wrote:
>
>> Hi Pravah --
>>
>> There was a bug fix for taking care of a fragmentation problem a while
>> back:
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6999988
>>
>> It may or may not be your problem (after all, you do not get promotion
>> failures), but it appears to me that it can result in very poor
>> performance of the "local free list caching", and I can see it resulting
>> in increased promotion times. I can't tell off the top of my head whether
>> there's a quick way of making that identification from your 6u20 JVM,
>> but perhaps someone may have an idea.
>>
>> As to whether allocating more memory to the JVM will help: if my
>> conjecture as to the root cause is correct (and I will be the first to
>> admit it might not be), then no, allocating more heap will not help,
>> because the bug prevents the efficient use of existing memory rather
>> than causing existing memory to be used up quickly, so the extra heap
>> will not help.
>>
>> But as you can see I am doing a lot of hand-waving here, so you should
>> probably follow up with Sun/Oracle JVM support.
>>
>> -- ramki
>>
>>
>> On Fri, Feb 3, 2012 at 9:44 AM, pravah wordpress <
>> pravah.wordpress at gmail.com> wrote:
>>
>>> Hi,
>>> It is not possible to check with 1.6.0_29, as this is happening in
>>> production.
>>> Today we checked the iostat and sar output and found that there are no
>>> I/O waits and no swapping. Will allocating more memory to the JVM (total
>>> heap + YG space) solve this problem of slow response? Is it normal to
>>> have 2 ParNew collections per second (each consistently taking > 0.1
>>> seconds)?
>>>
>>> Thanks in advance.
>>> Pravah
>>>
>>>
>>>
>>> On Fri, Feb 3, 2012 at 11:58 AM, Srinivas Ramakrishna <ysr1729 at gmail.com
>>> > wrote:
>>>
>>>> Is it possible for you to check if this happens with 1.6.0_29 as well?
>>>>
>>>> -- ramki
>>>>
>>>> On Thu, Feb 2, 2012 at 9:30 AM, pravah wordpress <
>>>> pravah.wordpress at gmail.com> wrote:
>>>>
>>>>> Hi Team,
>>>>> For the last couple of weeks, while monitoring production logs, we have
>>>>> been seeing the following 2 scenarios:
>>>>> 2012-01-30T09:45:57.783+0530: 211563.940: [GC 211563.941: [ParNew:
>>>>> 1695683K->20039K(1887488K), 0.1443767 secs] 2704741K->1029575K(5033216K),
>>>>> 0.1456574 secs] [Times: user=0.42 sys=0.02, real=0.15 secs]
>>>>> 2012-01-30T09:46:35.195+0530: 211601.351: [GC 211601.352: [ParNew:
>>>>> 1697863K->21131K(1887488K), 0.1623961 secs] 2707399K->1031401K(5033216K),
>>>>> 0.1636535 secs] [Times: user=0.41 sys=0.09, real=0.16 secs]
>>>>> 2012-01-30T09:47:15.555+0530: 211641.711: [GC 211641.712: [ParNew:
>>>>> 1698955K->19118K(1887488K), 0.1422631 secs] 2709225K->1030329K(5033216K),
>>>>> 0.1435373 secs] [Times: user=0.42 sys=0.01, real=0.14 secs]
>>>>> 2012-01-30T09:47:55.656+0530: 211681.811: [GC 211681.812: [ParNew:
>>>>> 1696942K->21437K(1887488K), 0.1340463 secs] 2708153K->1033449K(5033216K),
>>>>> 0.1352694 secs] [Times: user=0.40 sys=0.01, real=0.14 secs]
>>>>> 2012-01-30T09:48:29.406+0530: 211715.562: [GC 211715.564: [ParNew:
>>>>> 1699261K->18356K(1887488K), 0.1362798 secs] 2711273K->1031807K(5033216K),
>>>>> 0.1387961 secs] [Times: user=0.40 sys=0.03, real=0.14 secs]
>>>>> 2012-01-30T09:48:57.017+0530: 211743.172: [GC 211743.173: [ParNew:
>>>>> 1696035K->22375K(1887488K), 12.3948960 secs] 2709487K->1036955K(5033216K),
>>>>> 12.3962135 secs] [Times: *user=15.73 sys=0.60, real=12.40 secs*]
>>>>> 2012-01-30T09:49:41.237+0530: 211787.392: [GC 211787.392: [ParNew:
>>>>> 1698873K->21698K(1887488K), 0.1251361 secs] 2713453K->1037007K(5033216K),
>>>>> 0.1262755 secs] [Times: user=0.39 sys=0.00, real=0.13 secs]
>>>>> 2012-01-30T09:50:24.348+0530: 211830.503: [GC 211830.504: [ParNew:
>>>>> 1699522K->21849K(1887488K), 0.1370179 secs] 2714831K->1037980K(5033216K),
>>>>> 0.1382479 secs] [Times: user=0.40 sys=0.01, real=0.14 secs]
>>>>> 2012-01-30T09:50:58.516+0530: 211864.670: [GC 211864.671: [ParNew:
>>>>> 1699673K->25281K(1887488K), 0.1334443 secs] 2715804K->1042397K(5033216K),
>>>>> 0.1347060 secs] [Times: user=0.43 sys=0.00, real=0.14 secs]
>>>>> 2012-01-30T09:51:33.830+0530: 211899.985: [GC 211899.985: [ParNew:
>>>>> 1695938K->28123K(1887488K), 0.1449732 secs] 2713054K->1046170K(5033216K),
>>>>> 0.1462090 secs] [Times: user=0.44 sys=0.01, real=0.15 secs]
>>>>>
>>>>> This is happening at least once a day, and the real time is around 10
>>>>> seconds.
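>>>>>
>>>>> (For what it's worth: in the normal pauses above, user/real is roughly
>>>>> 0.42/0.15 ~= 2.8, i.e. the 4 ParNew threads run in parallel, while in
>>>>> the 12.4 s pause user/real is only 15.73/12.40 ~= 1.3, which suggests
>>>>> the GC threads were making little parallel progress during that
>>>>> collection, possibly stalled on something outside the JVM, though that
>>>>> is only a guess.)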
>>>>>
>>>>> Another observation is as follows
>>>>>
>>>>> 2012-01-30T14:53:11.645+0530: 229997.686: [GC 229997.687: [ParNew:
>>>>> 1686605K->39956K(1887488K), 0.1504394 secs] 3258191K->1612228K(5033216K),
>>>>> 0.1516840 secs] [Times: user=0.47 sys=0.02, real=0.15 secs]
>>>>> 2012-01-30T14:53:45.006+0530: 230031.047: [GC 230031.048: [ParNew:
>>>>> 1717780K->30950K(1887488K), 0.1693496 secs] 3290052K->1604025K(5033216K),
>>>>> 0.1705586 secs] [Times: user=0.52 sys=0.04, real=0.17 secs]
>>>>> 2012-01-30T14:53:45.208+0530: 230031.249: [GC [1 CMS-initial-mark:
>>>>> 1573075K(3145728K)] 1608943K(5033216K), 0.4815794 secs] [Times: user=0.13
>>>>> sys=0.35, real=0.48 secs]
>>>>> 2012-01-30T14:53:45.692+0530: 230031.733: [CMS-concurrent-mark-start]
>>>>> 2012-01-30T14:53:52.114+0530: 230038.155: [CMS-concurrent-mark:
>>>>> 6.283/6.422 secs] [Times: user=11.70 sys=2.40, *real=6.42 secs*]
>>>>> 2012-01-30T14:53:52.115+0530: 230038.155:
>>>>> [CMS-concurrent-preclean-start]
>>>>> 2012-01-30T14:53:52.470+0530: 230038.510: [CMS-concurrent-preclean:
>>>>> 0.346/0.355 secs] [Times: user=0.45 sys=0.16, real=0.36 secs]
>>>>> 2012-01-30T14:53:52.470+0530: 230038.511:
>>>>> [CMS-concurrent-abortable-preclean-start]
>>>>>  CMS: abort preclean due to time 2012-01-30T14:53:57.500+0530:
>>>>> 230043.541: [CMS-concurrent-abortable-preclean: 4.906/5.030 secs] [Times:
>>>>> user=8.73 sys=0.62, *real=5.03 secs*]
>>>>> 2012-01-30T14:53:57.532+0530: 230043.573: [GC[YG occupancy: 839183 K
>>>>> (1887488 K)]230043.573: [Rescan (parallel) , 1.8934193 secs]230045.467:
>>>>> [weak refs processing, 0.0612851 secs] [1 CMS-remark: 1573075K(3145728K)]
>>>>> 2412258K(5033216K), 1.9618014 secs] [Times: user=5.58 sys=0.10, real=1.96
>>>>> secs]
>>>>> 2012-01-30T14:53:59.497+0530: 230045.538: [CMS-concurrent-sweep-start]
>>>>> 2012-01-30T14:54:08.616+0530: 230054.656: [CMS-concurrent-sweep:
>>>>> 8.635/9.118 secs] [Times: user=17.84 sys=2.36, *real=9.12 secs*]
>>>>> 2012-01-30T14:54:08.617+0530: 230054.657: [CMS-concurrent-reset-start]
>>>>> 2012-01-30T14:54:08.656+0530: 230054.696: [CMS-concurrent-reset:
>>>>> 0.039/0.039 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
>>>>> 2012-01-30T14:54:12.410+0530: 230058.451: [GC 230058.452: [ParNew:
>>>>> 1708774K->25197K(1887488K), 0.1436152 secs] 2456494K->773832K(5033216K),
>>>>> 0.1448754 secs] [Times: user=0.48 sys=0.01, real=0.15 secs]
>>>>> 2012-01-30T14:54:38.243+0530: 230084.284: [GC 230084.284: [ParNew:
>>>>> 1703021K->62457K(1887488K), 0.2070478 secs] 2451656K->811902K(5033216K),
>>>>> 0.2083137 secs] [Times: user=0.67 sys=0.01, real=0.21 secs]
>>>>>
>>>>>
>>>>> Here the CMS concurrent mark and sweep, and also the remark, are taking
>>>>> considerable time, many times above 15 seconds in total. This also
>>>>> happens once a day.
>>>>> Why is this happening?
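>>>>>
>>>>> (Adding up the phases above: concurrent mark 6.42 s, preclean 0.36 s,
>>>>> abortable preclean 5.03 s, sweep 9.12 s and reset 0.04 s come to about
>>>>> 21 s of concurrent work, but only the initial mark (0.48 s) and the
>>>>> remark (1.96 s) actually stop the application, roughly 2.4 s of total
>>>>> pause time per cycle.)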
>>>>>
>>>>> Also, no full GC is observed, and ParNew is happening two times per
>>>>> second during peak load. Is this a cause for worry / slow performance?
>>>>> This is an OLTP app and no batch jobs are running.
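>>>>>
>>>>> (As a rough check: eden is about 1.6 GB, so two ParNew collections per
>>>>> second would imply an allocation rate on the order of 3 GB/s at peak,
>>>>> versus about 1.6 GB / 35 s, i.e. roughly 47 MB/s, in the quieter log
>>>>> excerpts above.)
>>>>>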
>>>>> Following is the configuration:
>>>>> Java HotSpot(TM) 64-Bit Server VM (16.3-b01) for solaris-sparc JRE
>>>>> (1.6.0_20-b02)
>>>>>
>>>>> -XX:+DisableExplicitGC
>>>>> -XX:-DoEscapeAnalysis
>>>>> -XX:InitialHeapSize=5368709120
>>>>> -XX:+LogVMOutput
>>>>> -XX:MaxHeapSize=5368709120
>>>>> -XX:MaxNewSize=2147483648
>>>>> -XX:MaxPermSize=536870912
>>>>> -XX:MaxTenuringThreshold=4
>>>>> -XX:NewRatio=3
>>>>> -XX:NewSize=2147483648
>>>>> -XX:ParallelGCThreads=4
>>>>> -XX:PermSize=536870912
>>>>> -XX:+PrintCommandLineFlags
>>>>> -XX:+PrintGC
>>>>> -XX:+PrintGCDateStamps
>>>>> -XX:+PrintGCDetails
>>>>> -XX:+PrintGCTimeStamps
>>>>> -XX:ThreadStackSize=256
>>>>> -XX:+TraceClassUnloading
>>>>> -XX:+UnlockDiagnosticVMOptions
>>>>> -XX:+UseConcMarkSweepGC
>>>>> -XX:+UseParNewGC
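>>>>>
>>>>> (Note: since NewSize and MaxNewSize are both set to 2 GiB explicitly,
>>>>> the young generation is fixed at 2 GiB; as far as I know the explicit
>>>>> sizes take precedence, so the NewRatio=3 setting should have no effect
>>>>> here.)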
>>>>>
>>>>> Thanks in advance,
>>>>> Pravah.
>>>>>
>>>>
>>>
>>
>