Number of CMS collections and the JMM GC counters
Y. Srinivas Ramakrishna
y.s.ramakrishna at oracle.com
Wed Jul 27 16:42:56 UTC 2011
Cross-posting to serviceability-dev list. Please include both lists
in your responses to this thread.
-- ramki
On 7/27/2011 9:12 AM, Krystal Mok wrote:
> Hi all,
>
> I've been looking at a strange inconsistency of full GC count recorded by
> jvmstat and JMM counters. I'd like to know which ones of the following
> behaviors are by design, which ones are bugs, and which ones are just my
> misunderstanding. I apologize for making a short story long...
>
> =====================================================
>
> The counters involved:
>
> * A jvmstat counter named "sun.gc.collector.1.invocations" keeps track of
> the number of pauses occured as a result of a major collection. It is used
> by utilities such as jstat as the source of "FGC" (full collection count),
> and the old gen collection count in Visual GC. It's updated by an
> TraceCollectorStats object.
> * A JMM counter, GCMemoryManager::_num_collections, keeps track of the
> number of collections that have ended. This counter is used as HotSpot's
> implementation of the JMX GarbageCollectorMXBean.getCollectionCount(). It's
> updated by either a TraceMemoryManagerStats object or a
> TraceCMSMemoryManagerStats object.
>
> To show the situation, I've made a screenshot of a VisualVM and a JConsole
> running side by side, both are monitoring the VisualVM's GC stats:
> http://dl.iteye.com/upload/attachment/524811/913cb0e1-7add-3ac0-a718-24ca705cad22.png
> (I'll upload the screenshot to somewhere else if anybody can't see it)
> The VisualVM instance is running on JDK6u26, with ParNew+CMS.
> In the screenshot, Visual GC reports that the old gen collection count is
> 20, while JConsole reports 10.
>
> I see that there was this bug:
> 6580448: CMS: Full GC collection count mismatch between
> GarbageCollectorMXBean and jvmstat (VisualGC)
> I don't think the current implementation has a bug in the sense that the two
> counters don't report the same number.
>
> This behavior seems reasonable, but the naming of the value in these tools
> are confusing: both tools say "collections", but apparently the number in
> Visual GC means "number of pauses" where as the number in JConsole means
> "number of collection cycles". It'd be great if the difference could be
> documented somewhere, if that's the intended behavior.
>
> And then the buggy behavior. Code demo posted on gist:
> https://gist.github.com/1106263
> Starting from JDK6u23, when using CMS without ExplicitGCInvokesConcurrent,
> System.gc() (or Runtime.getRuntime().gc(), or MemoryMXBean.gc() via JMX)
> would make the JMM GC counter increment by 2 per invocation, while the
> jvmstat counter is only incremented by 1. I believe the latter is correct
> and the former needs some fixing.
>
> =====================================================
>
> My understanding of the behavior shown above:
>
> 1. The concurrent GC part:
>
> There are 2 pauses in a CMS concurrent GC cycle, one in the initial mark
> phase, and one in the final remark phase.
> To trigger a concurrent GC cycle, the CMS thread wakes up periodically to
> see if it shouldConcurrentCollect(), and trigger a cycle when the predicate
> returned true, or goes back to sleep if the predicate returned false. The
> whole concurrent GC cycle doesn't go through
> GenCollectedHeap::do_collection().
>
> The jvmstat counter for old gen pauses is updated in
> CMSCollector::do_CMS_operation(CMS_op_type op), which exactly covers both
> pause phases.
>
> The JMM counter, however, is updated in the concurrent sweep phase,
> CMSCollector::sweep(bool asynch), if there was no concurrent mode failure;
> or it is updated in CMSCollector::do_compaction_work(bool
> clear_all_soft_refs) in case of a bailout due to concurrent mode failure
> (advertised as so in the code comments). So that's an increment by 1 per
> concurrent GC cycle, which does reflect the "number of collection cycles",
> fair enough.
>
> So far so good.
>
> 2. The System.gc() part:
>
> Without ExplicitGCInvokesConcurrent set, System.gc() does a stop-the-world
> full GC, which consists of only one pause, so "number of pauses" would equal
> "number of collections" in this case.
> It will go into GenCollectedHeap::do_collection(); both the jvmstat and the
> JMM GC counter gets incremented by 1 here,
>
> TraceCollectorStats tcs(_gens[i]->counters());
> TraceMemoryManagerStats tmms(_gens[i]->kind());
>
> But, drilling down into:
> _gens[i]->collect(full, do_clear_all_soft_refs, size, is_tlab);
>
> That'll eventually go into:
> CMSCollector::acquire_control_and_collect(bool full, bool
> clear_all_soft_refs)
>
> System.gc() is user requested so that'll go further into mark-sweep-compact:
> CMSCollector::do_compaction_work(bool clear_all_soft_refs)
> And here, it increments the JMM GC counter again (remember it was in the
> concurrent GC path too, to handle bailouts), even though this is still in
> the same collection. This leads to the "buggy behavior" mentioned earlier.
>
> The JMM GC counter wasn't added to CMS until this fix got in:
> 6581734: CMS Old Gen's collection usage is zero after GC which is incorrect
>
> The code added to CMSCollector::do_compaction_work() works fine in the
> concurrent GC path, but interacts badly with the original logic in
> GenCollectedHeap::do_collection().
>
> =====================================================
>
> I thought all concurrent mode failures/interrupts come from
> GenCollectedHeap::do_collection(). If that's the case, then it seems
> unnecessary to update the JMM GC counter in
> CMSCollector::do_compaction_work(), simply removing it should fix the
> problem.
>
> With that, I'd purpose the following (XS) change: (diff against HS20)
>
> diff -r f0f676c5a2c6
> src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp
> ---
> a/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp
> Tue Mar 15 19:30:16 2011 -0700
> +++
> b/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp
> Thu Jul 28 00:02:41 2011 +0800
> @@ -2022,9 +2022,6 @@
>
> _intra_sweep_estimate.padded_average());
> }
>
> - {
> - TraceCMSMemoryManagerStats();
> - }
> GenMarkSweep::invoke_at_safepoint(_cmsGen->level(),
> ref_processor(), clear_all_soft_refs);
> #ifdef ASSERT
>
> The same goes for the changes in:
> 7036199: Adding a notification to the implementation of
> GarbageCollectorMXBeans
>
> =====================================================
>
> P.S. Is there an "official" name for the counters that I referred to as
> "jvmstat counters" above? Is it just jvmstat, or PerfData or HSPERFDATA?
>
More information about the hotspot-gc-dev
mailing list