Perplexing GC Time Growth
Jason Vasquez
jason at mugfu.com
Wed Jan 2 18:42:28 UTC 2008
I've attached a representative garbage log. To my eye, I don't see
anything specific that seems to indicate an event, but hopefully more
experienced eyes will see something different :)
Thanks,
-jason
-------------- next part --------------
A non-text attachment was scrubbed...
Name: garbage.log.gz
Type: application/x-gzip
Size: 33347 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20080102/ed0257a8/garbage.log.gz>
-------------- next part --------------
On Jan 1, 2008, at 15:46, Y Srinivas Ramakrishna wrote:
>
> It's probably a combination of card-scanning times and allocation
> slow-down (but probably more of the former).
>
> We've had some internal instrumentation of card-scanning times in
> the JVM which unfortunately has not made it into the JVM code proper
> because the instrumentation is not lightweight enough to be enabled
> in production. Perhaps a spin on a test system with the card-scanning
> times explicitly called out might shed light.
>
> Basically what happens with CMS is that allocation is from free
> lists, and lacking something like big bag of pages (BBOP) allocation,
> this has traditionally tended to scatter the allocated objects over a
> large number of pages. This increases card-scanning times, although
> one would normally expect that this would eventually stabilize.
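To make the scattering effect concrete: HotSpot keeps a card table with one byte per 512-byte region of the old generation, and a scavenge has to examine every dirty card for possible old-to-young references. The toy model below is not HotSpot code, just an illustration under the assumption of a 512-byte card size; it shows why the same number of old-gen writes costs more to scan when the written objects are spread over many addresses (as free-list allocation tends to do) than when they are packed together.

  // Toy illustration (not HotSpot code): scavenge work grows roughly with
  // the number of distinct dirty cards, so scattering the same number of
  // writes over many addresses dirties many more cards.
  import java.util.HashSet;
  import java.util.Set;

  public class CardScatterDemo {
      static final int CARD_SIZE = 512;   // bytes covered per card (HotSpot's card size)

      // Count how many distinct cards a set of written old-gen addresses dirties.
      static int dirtyCards(long[] writeAddresses) {
          Set<Long> cards = new HashSet<Long>();
          for (long addr : writeAddresses) {
              cards.add(addr / CARD_SIZE);
          }
          return cards.size();
      }

      public static void main(String[] args) {
          int writes = 10000;
          long[] packed = new long[writes];
          long[] scattered = new long[writes];
          for (int i = 0; i < writes; i++) {
              packed[i]    = i * 16L;      // objects laid out contiguously
              scattered[i] = i * 4096L;    // objects spread across free-list chunks
          }
          System.out.println("dirty cards, packed:    " + dirtyCards(packed));
          System.out.println("dirty cards, scattered: " + dirtyCards(scattered));
      }
  }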
>
> Do the scavenge times increase suddenly after a specific event or do
> they just creep up slowly after each scavenge? The complete GC log
> would be useful to look at to answer that question.
>
> -- ramki
>
>
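Since the complete log is attached above, one way to answer that question directly is to pull the per-scavenge pause out of each ParNew line and look at it against the uptime stamp. A minimal sketch, assuming the -Xloggc/-XX:+PrintGCDetails line format shown in the quoted snippets further down (the class name and tab-separated output are just illustrative):

  // Extract "uptime-seconds <tab> ParNew-pause-ms" pairs from a -Xloggc file
  // so the trend (gradual creep vs. sudden jump) can be plotted or eyeballed.
  import java.io.BufferedReader;
  import java.io.FileReader;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class ParNewPauses {
      // Matches lines like:
      // 487.135: [GC 487.135: [ParNew: 112726K->7290K(118016K), 0.0218110 secs] ...
      private static final Pattern LINE = Pattern.compile(
          "^(\\d+\\.\\d+): \\[GC .*?\\[ParNew: .*?, (\\d+\\.\\d+) secs\\]");

      public static void main(String[] args) throws Exception {
          BufferedReader in = new BufferedReader(new FileReader(args[0]));
          String line;
          while ((line = in.readLine()) != null) {
              Matcher m = LINE.matcher(line);
              if (m.find()) {
                  System.out.println(m.group(1) + "\t"
                          + Double.parseDouble(m.group(2)) * 1000.0);
              }
          }
          in.close();
      }
  }

Plotting the two columns (uptime in seconds, pause in milliseconds) should make it obvious whether the pauses step up after some event or creep roughly linearly.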
> ----- Original Message -----
> From: Jason Vasquez <jason at mugfu.com>
> Date: Monday, December 31, 2007 10:21 am
> Subject: Perplexing GC Time Growth
> To: hotspot-gc-use at openjdk.java.net
>
>
>> Hi all,
>>
>> I'm having a perplexing problem -- the garbage collector appears to
>> be functioning well, with a nice object/garbage lifecycle, yet minor
>> GC times increase over the life of the process inexplicably. We are
>> working with telephony hardware with this application, so keeping GC
>> pauses very low is quite important. (Keeping well below 100 ms would
>> be ideal.)
>>
>> Here is the current configuration we are using:
>>
>> -server \
>> -Xloggc:garbage.log \
>> -XX:+PrintGCDetails \
>> -Dsun.rmi.dgc.server.gcInterval=3600000 \
>> -Dsun.rmi.dgc.client.gcInterval=3600000 \
>> -XX:ParallelGCThreads=8 \
>> -XX:+UseParNewGC \
>> -XX:+UseConcMarkSweepGC \
>> -XX:+PrintGCTimeStamps \
>> -XX:-TraceClassUnloading \
>> -XX:+AggressiveOpts \
>> -Xmx512M \
>> -Xms512M \
>> -Xmn128M \
>> -XX:MaxTenuringThreshold=6 \
>> -XX:+ExplicitGCInvokesConcurrent
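If more detail is wanted in the same log, two long-standing HotSpot options that should also be available on this release (worth verifying on 1.6.0_03) are:

-XX:+PrintTenuringDistribution \
-XX:+PrintGCApplicationStoppedTime \

The first prints the survivor-space age histogram at each scavenge, which shows whether MaxTenuringThreshold=6 is actually filtering objects as intended; the second reports how long application threads were stopped, separating GC pause time from any other stop-the-world work.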
>>
>> A large number of our bigger objects size-wise live for approximately
>> 4-5 minutes, thus the larger young generation and tenuring threshold.
>> This seems to be successful in filtering most objects before they
>> reach the tenured gen. (8-core x86 server, running 1.6.0_03-b05 on
>> 32-bit Linux, kernel rev 2.6.18)
>>
>> Here is a representative snippet of our garbage log:
>>
>> 487.135: [GC 487.135: [ParNew: 112726K->7290K(118016K), 0.0218110 secs] 134494K->29058K(511232K), 0.0220520 secs]
>> 557.294: [GC 557.294: [ParNew: 112250K->7976K(118016K), 0.0204220 secs] 134018K->29744K(511232K), 0.0206690 secs]
>> 607.025: [GC 607.025: [ParNew: 112936K->7831K(118016K), 0.0231230 secs] 134704K->30003K(511232K), 0.0233670 secs]
>> 672.522: [GC 672.522: [ParNew: 112791K->7361K(118016K), 0.0253620 secs] 134963K->29533K(511232K), 0.0256080 secs]
>> ...
>> 4006.635: [GC 4006.635: [ParNew: 112983K->7386K(118016K), 0.0385960 secs] 141969K->36608K(511232K), 0.0388460 secs]
>> 4083.066: [GC 4083.066: [ParNew: 112346K->8439K(118016K), 0.0365940 secs] 141568K->37661K(511232K), 0.0368340 secs]
>> 4158.457: [GC 4158.457: [ParNew: 113399K->7152K(118016K), 0.0360540 secs] 142621K->36374K(511232K), 0.0362990 secs]
>> 4228.312: [GC 4228.313: [ParNew: 112112K->8738K(118016K), 0.0362510 secs] 141334K->38083K(511232K), 0.0365050 secs]
>> 4293.800: [GC 4293.800: [ParNew: 113698K->8294K(118016K), 0.0368700 secs] 143043K->37917K(511232K), 0.0371160 secs]
>> ...
>> 10489.555: [GC 10489.556: [ParNew: 112701K->7770K(118016K), 0.0639770 secs] 151966K->47156K(511232K), 0.0642210 secs]
>> 10562.544: [GC 10562.544: [ParNew: 112730K->9267K(118016K), 0.0625900 secs] 152116K->48772K(511232K), 0.0628470 secs]
>> 10622.558: [GC 10622.558: [ParNew: 114227K->8361K(118016K), 0.0675730 secs] 153732K->48381K(511232K), 0.0678220 secs]
>> 10678.842: [GC 10678.842: [ParNew: 113056K->7214K(118016K), 0.0669330 secs] 153076K->47234K(511232K), 0.0671800 secs]
>> ...
>> 177939.062: [GC 177939.062: [ParNew: 112608K->8620K(118016K), 0.7681440 secs] 466132K->362144K(511232K), 0.7684030 secs]
>> 178005.483: [GC 178005.483: [ParNew: 113449K->7731K(118016K), 0.7677300 secs] 466973K->361893K(511232K), 0.7679890 secs]
>> 178069.658: [GC 178069.658: [ParNew: 112670K->6814K(118016K), 0.7700020 secs] 466832K->360976K(511232K), 0.7702590 secs]
>> 178133.513: [GC 178133.513: [ParNew: 111717K->7920K(118016K), 0.7702920 secs] 465879K->362082K(511232K), 0.7705560 secs]
>>
>> As you can see, the GC times continue to increase over time, on the
>> order of about 10-20 ms per hour. CMS runs are spaced very far apart;
>> in fact, since most objects die before reaching the tenured
>> generation, the CMS is triggered more by RMI DGC runs than by heap
>> growth. (We were getting serial GCs, apparently due to RMI DGC,
>> before adding -XX:+ExplicitGCInvokesConcurrent.)
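As a rough check of that rate against the quoted entries alone: the ParNew pause grows from about 0.022 s at 487 s of uptime to about 0.770 s at 177939 s, i.e. roughly 748 ms over about 49.3 hours, which works out to about 15 ms per hour and is consistent with the 10-20 ms/hour estimate.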
>>
>> Here's some representative output from `jstat -gcutil -t -h10 <pid> 2s`:
>>
>> Timestamp     S0     S1      E     O      P   YGC   YGCT  FGC   FGCT    GCT
>>   11067.6  55.74   0.00  89.32  9.73  59.84   168  7.471    8  0.280  7.751
>>   11069.6  55.74   0.00  93.65  9.73  59.84   168  7.471    8  0.280  7.751
>>   11071.6  55.74   0.00  99.34  9.73  59.84   168  7.471    8  0.280  7.751
>>   11073.5   0.00  62.22   2.89  9.76  59.84   169  7.537    8  0.280  7.816
>>   11075.6   0.00  62.22   4.80  9.76  59.84   169  7.537    8  0.280  7.816
>>
>> Survivor spaces continue to sit at about 50-65% occupancy, which
>> seems fairly good to my eye. Eden fills approximately every 70
>> seconds, triggering minor GCs.
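Those observations hang together arithmetically as well: with -Xmn128M and the reported ParNew capacity of 118016K, eden is a bit over 100 MB (the capacity minus one ~13 MB survivor space), so filling it every ~70 seconds corresponds to an allocation rate on the order of 1.5 MB/s, and it matches the 50-75 second spacing of the scavenges in the quoted log.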
>>
>>
>> Any ideas? This is becoming quite frustrating for us -- our
>> application uptime is pretty horrible with the too-frequent
>> scheduled restarts we are being forced to run.
>>
>> Thanks for any assistance you might be able to offer,
>> -Jason Vasquez