CMS Sudden long pauses with large number of weak references
Greg Bowyer
gbowyer at fastmail.co.uk
Fri Jan 20 10:48:32 PST 2012
I didnt know this was in a forthcoming release so I had already compiled
JDK7 from tip with the change (since its so small and simple)
I still saw some pauses, although I think that might be down to CMS
starting to late, so I have lowered the Occupancy Fraction and started
testing.
So far with the fix for 7112034
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034>,
ParallelRefProcessing and an Occupancy Fraction of 70 I am seeing fairly
good results with the worst pause time so far of ~200 ms (this is far
better than multi-second pauses I had before)
I will let a few more tests run and then try jdk7u4 and let you know if
this helps me
thank you for confirming my thoughts on this.
-- Greg
On 19/01/12 19:40, Krystal Mok wrote:
> Hi Greg,
>
> This might just be another case of "7112034: Parallel CMS fails to
> properly mark reference objects" [1], which is only fixed recently,
> and isn't delivered in any FCS releases yet.
>
> Could you try a recent JDK7u4 preview and see if the problem goes
> away? Or, try -XX:-CMSConcurrentMTEnabled, which should workaround
> this particular bug, but you'll get longer CMS collection cycles.
>
> - Kris
>
> [1]: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7112034
> [2]: http://jdk7.java.net/download.html
>
> On Fri, Jan 20, 2012 at 5:37 AM, Greg Bowyer <gbowyer at fastmail.co.uk
> <mailto:gbowyer at fastmail.co.uk>> wrote:
>
> Hi all
>
> I have an application that is solr/lucene running on JDK 1.6 / 1.7
> with
> CMS, 9Gb heap. The only unusual options here are DisableExplictGC and
> UseCompressedOops (which I do not think are the issue).
>
> Generally the application has a heap usage of ~4G with a 2-3G float of
> continual garbage on top.
>
> The application creates several large heap objects, some of these are
> sizeable arrays (~300mb), some of these are objects that have large
> retained graphs.
>
> One of these is a WeakHashMap that in lucene stores IndexReaders
> (as the
> key) -> "Fields", this mapping is used to avoid disk access, there are
> not typically a large number of readers open at any time (at worst
> this
> could be say 25).
>
> Generally we see the GC behavior being fairly solid across JVMS,
> however
> we get ever increasing amounts of concurrent-mode-failues with a
> marked
> spike of weak references, this only seems to appear when new indexes
> load on lucene, which is when the references are changed and when new
> versions of these caches are created.
>
> ---- %< ----
> 4747.399: [GC 4747.399: [ParNew4747.492: [SoftReference, 0 refs,
> 0.0000120 secs]4747.492: [WeakReference, 5 refs, 0.0000050
> secs]4747.492: [FinalReference, 0 refs, 0.0000040 secs]4747.492:
> [PhantomReference, 0 refs, 0.0000030 secs]4747.492: [JNI Weak
> Reference,
> 0.0000020 secs]: 133242K->11776K(153344K), 0.0925850 secs]
> 7199972K->7187397K(9420160K), 0.0926840 secs] [Times: user=0.72
> sys=0.00, real=0.09 secs]
> Total time for which application threads were stopped: 0.0931920
> seconds
> 4747.530: [GC [1 CMS-initial-mark: 7175620K(9266816K)]
> 7300153K(9420160K), 0.0044900 secs] [Times: user=0.00 sys=0.00,
> real=0.00 secs]
> Total time for which application threads were stopped: 0.0398160
> seconds
> 4747.535: [CMS-concurrent-mark-start]
> 4747.537: [GC 4747.537: [ParNew4747.568: [SoftReference, 0 refs,
> 0.0000120 secs]4747.568: [WeakReference, 6 refs, 0.0000060
> secs]4747.568: [FinalReference, 0 refs, 0.0000040 secs]4747.568:
> [PhantomReference, 0 refs, 0.0000040 secs]4747.568: [JNI Weak
> Reference,
> 0.0000020 secs] (promotion failed): 127253K->127965K(153344K),
> 0.0315280
> secs]4747.569: [CMS4755.725: [CMS-concurrent-mark: 8.157/8.190 secs]
> [Times: user=10.81 sys=3.49, real=8.19 secs]
> (concurrent mode failure)4757.436: [SoftReference, 0 refs, 0.0000050
> secs]4757.436: [WeakReference, 6360 refs, 0.0016500 secs]4757.437:
> [FinalReference, 462 refs, 0.0088580 secs]4757.446: [PhantomReference,
> 114 refs, 0.0000220 secs]4757.446: [JNI Weak Reference, 0.0000070
> secs]:
> 7180161K->3603572K(9266816K), 17.1070460 secs]
> 7302874K->3603572K(9420160K), [CMS Perm : 47059K->47044K(78380K)],
> 17.1387610 secs] [Times: user=19.75 sys=3.49, real=17.14 secs]
> Total time for which application threads were stopped: 17.1393510
> seconds
> Total time for which application threads were stopped: 0.0245300
> seconds
> ---- >% ----
>
> ? Is this due to the defects fixed here
> http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1391adc6681,
> would -XX:+ParallelRefProcEnabled help here ?
>
> Also is there anything that could explain the large number of suddenly
> processed and traced weak-references
>
> Many thanks
> -- Greg
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20120120/283a2161/attachment-0001.html
More information about the hotspot-gc-use
mailing list