A guarantee failure with CMSClassUnloadingMaxInterval

Srinivas Ramakrishna ysr1729 at gmail.com
Thu May 16 05:16:14 UTC 2013


I can't recall the history behind the flag for class unloading every N concurrent collections, but your analysis appears spot on to me. I can review the code tomorrow morning and confirm, since it's been a while since I looked at this code...

-- Ramki


ysr1729

On May 15, 2013, at 14:20, Hiroshi Yamauchi <yamauchi at google.com> wrote:

> Hi,
> 
> I'm getting the following JVM crash (a guarantee failure) with flag
> -XX:CMSClassUnloadingMaxInterval=N (where N > 0) somewhat
> intermittently. I have a theory of why it's crashing. It'd be great if
> someone can indicate whether I'm looking at this right.
> 
> Here goes:
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (dictionary.cpp:267), pid=30153, tid=1325398848
> #  guarantee(!is_alive->do_object_b(k_def_class_loader)) failed:
> defining loader should not be live if klass is not
> #
> # JRE version: 7.0_17-b02
> # Java VM: Java HotSpot(TM) Server VM (23.7-b01 mixed mode linux-x86 )
> # Failed to write core dump. Core dumps have been disabled. To enable
> core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> #
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x4eeb9400):  VMThread [stack: 0x4ef7f000,0x4f000000] [id=30176]
> 
> Stack: [0x4ef7f000,0x4f000000],  sp=0x4effe7c0,  free space=509k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x720449]  VMError::report_and_die()+0x199
> V  [libjvm.so+0x2e8829]  report_vm_error(char const*, int, char
> const*, char const*)+0x49
> V  [libjvm.so+0x33b08f]  Dictionary::do_unloading(BoolObjectClosure*)+0x8ef
> V  [libjvm.so+0x6aeec5]  SystemDictionary::do_unloading(BoolObjectClosure*)+0x25
> V  [libjvm.so+0x39f741]  GenMarkSweep::mark_sweep_phase1(int, bool)+0xd1
> V  [libjvm.so+0x39fce5]  GenMarkSweep::invoke_at_safepoint(int,
> ReferenceProcessor*, bool)+0x105
> V  [libjvm.so+0x2d3b62]  CMSCollector::do_compaction_work(bool)+0x142
> V  [libjvm.so+0x2d4d4e]
> CMSCollector::acquire_control_and_collect(bool, bool)+0x19e
> V  [libjvm.so+0x2d5226]  ConcurrentMarkSweepGeneration::collect(bool,
> bool, unsigned int, bool)+0xe6
> V  [libjvm.so+0x39e805]  GenCollectedHeap::do_collection(bool, bool,
> unsigned int, bool, int)+0x4a5
> V  [libjvm.so+0x28f4c7]
> GenCollectorPolicy::satisfy_failed_allocation(unsigned int, bool)+0xe7
> V  [libjvm.so+0x7215f4]  VM_GenCollectForAllocation::doit()+0x74
> V  [libjvm.so+0x72ae61]  VM_Operation::evaluate()+0x41
> V  [libjvm.so+0x729748]  VMThread::evaluate_operation(VM_Operation*)+0x78
> V  [libjvm.so+0x729cd8]  VMThread::loop()+0x1e8
> V  [libjvm.so+0x72a375]  VMThread::run()+0x85
> V  [libjvm.so+0x5df1f1]  java_start(Thread*)+0x111
> C  [libpthread.so.0+0x6d4c]  start_thread+0xcc
> 
> VM_Operation (0x4cd8d6b8): GenCollectForAllocation, mode: safepoint,
> requested by thread 0x4cc38c00
> 
> 
> Background: When used together with CMSClassUnloadingEnabled, this
> flag should cause class unloading to run less frequently, namely once
> every N+1 concurrent collections, as opposed to at every concurrent
> collection.
> 
> Here's what I thought:
> 
> Here's the failing guarantee (in Dictionary::do_unloading() in
> dictionary.cpp, around line 262 in
> http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/f8075a623349/src/share/vm/classfile/dictionary.cpp):
> 
>         // The loader in this entry is alive. If the klass is dead,
>         // the loader must be an initiating loader (rather than the
>         // defining loader). Remove this entry.
>         if (!is_alive->do_object_b(e)) {
>           guarantee(!is_alive->do_object_b(k_def_class_loader),
>                     "defining loader should not be live if klass is not");
>           // If we get here, the class_loader must not be the defining
>           // loader, it must be an initiating one.
>           assert(k_def_class_loader != class_loader,
>                  "cannot have live defining loader and unreachable klass");
> 
> It seems that at the time of the crash, e is not alive but
> k_def_class_loader is alive (hence the guarantee failure), and the
> debugger indicates that k_def_class_loader == class_loader (which
> would mean that the following assert would also fail). I'm not sure
> what this really means (I'm not a class (un)loading expert), but it
> appears that a class (un)loading invariant was broken by
> CMSClassUnloadingMaxInterval.
> 
> Here's a theory of what I think might be happening:
> 
> In the CMS initialization sequence, the root scanning option is set
> statically (once per JVM invocation) to either SO_AllClasses or
> SO_SystemClasses based on the flag value of CMSClassUnloadingEnabled
> (at line 810 in
> http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/f8075a623349/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp):
> 
> // Choose what strong roots should be scanned depending on verification options
> // and perm gen collection mode.
> if (!CMSClassUnloadingEnabled) {
>   // If class unloading is disabled we want to include all classes
>   // into the root set.
>   add_root_scanning_option(SharedHeap::SO_AllClasses);
> } else {
>   add_root_scanning_option(SharedHeap::SO_SystemClasses);
> }
> 
> This code appears to assume that this choice solely depends on the
> value of flag CMSClassUnloadingEnabled.
> 
> Then, at the beginning of each concurrent collection, the following
> code decides whether to perform class unloading (at line 3223 in
> http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/f8075a623349/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp):
> 
> bool CMSCollector::update_should_unload_classes() {
>   _should_unload_classes = false;
>   // Condition 1 above
>   if (_full_gc_requested && ExplicitGCInvokesConcurrentAndUnloadsClasses) {
>     _should_unload_classes = true;
>   } else if (CMSClassUnloadingEnabled) { // Condition 2.a above
>     // Disjuncts 2.b.(i,ii,iii) above
>     _should_unload_classes = (concurrent_cycles_since_last_unload() >=
>                               CMSClassUnloadingMaxInterval)
>                            || _permGen->should_concurrent_collect()
>                            || _cmsGen->is_too_full();
>   }
>   return _should_unload_classes;
> }
> 
> If CMSClassUnloadingMaxInterval is 0 (which is the default case), then
> if CMSClassUnloadingEnabled is true,
> _should_unload_classes will always be true (that is, class unloading
> will be performed in every concurrent collection run) as
> concurrent_cycles_since_last_unload() would always return 0.
> 
> But if CMSClassUnloadingMaxInterval > 0, _should_unload_classes won't
> always be true (and we won't perform class unloading in every
> concurrent collection run) even if CMSClassUnloadingEnabled is true.
> 
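
To make the interval arithmetic above concrete, here is a minimal standalone sketch. It is not HotSpot code: UnloadModel and simulate are hypothetical names, only the first disjunct (the cycle counter) is modelled, and the assumption that the counter resets after an unloading cycle and increments otherwise is mine rather than something quoted in this thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the collector's bookkeeping; not HotSpot code.
struct UnloadModel {
  unsigned max_interval;             // role of CMSClassUnloadingMaxInterval
  unsigned cycles_since_unload = 0;  // role of concurrent_cycles_since_last_unload()

  explicit UnloadModel(unsigned n) : max_interval(n) {}

  // Decide for one concurrent cycle, then update the counter.
  // Assumption: the counter resets after an unloading cycle and
  // increments otherwise.
  bool next_cycle() {
    assert(cycles_since_unload <= max_interval);  // counter never overshoots
    bool unload = cycles_since_unload >= max_interval;
    cycles_since_unload = unload ? 0 : cycles_since_unload + 1;
    return unload;
  }
};

// Returns, per cycle, whether classes would be unloaded.
std::vector<bool> simulate(unsigned max_interval, unsigned cycles) {
  UnloadModel m(max_interval);
  std::vector<bool> decisions;
  for (unsigned i = 0; i < cycles; ++i) decisions.push_back(m.next_cycle());
  return decisions;
}
```

Under these assumptions, simulate(0, 3) unloads on every cycle, while simulate(2, 6) gives skip, skip, unload, skip, skip, unload, which is the "every N+1 collections" pattern described in the background.
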
> Is it possible it should pick SO_AllClasses or SO_SystemClasses based
> on the result of the update_should_unload_classes() call at every
> concurrent collection, as opposed to solely based on the value of the
> CMSClassUnloadingEnabled flag once at initialization?
> 
> As an experiment, if I insert this code:
> 
> remove_root_scanning_option(SharedHeap::SO_SystemClasses |
>                                             SharedHeap::SO_AllClasses);
> if (should_unload_classes()) {
>   add_root_scanning_option(SharedHeap::SO_SystemClasses);
> } else {
>   add_root_scanning_option(SharedHeap::SO_AllClasses);
> }
> 
> in CMSCollector::setup_cms_unloading_and_verification_state(), around
> line 3254 in http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/f8075a623349/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp,
> 
> then the crash no longer seems to recur. I don't know whether this
> does the right thing or is just masking the bug.
> 
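
The experiment above amounts to re-deciding the class-root option on every cycle instead of once at startup. A minimal standalone sketch of that idea follows. RootOptions, choose_class_roots and the bit values are hypothetical stand-ins chosen for illustration, not the SharedHeap API, and the sketch says nothing about whether the change is the right fix.

```cpp
#include <cassert>

// Illustrative bit values; assumed for this sketch only.
enum ScanningOption : unsigned {
  SO_None = 0x0,
  SO_AllClasses = 0x1,
  SO_SystemClasses = 0x2
};

// Hypothetical holder for the root scanning option bits.
struct RootOptions {
  unsigned bits = SO_None;
  void add(unsigned o) { bits |= o; }
  void remove(unsigned o) { bits &= ~o; }
  bool has(unsigned o) const { return (bits & o) == o; }
};

// Clear both class-root bits, then set exactly one of them based on
// this cycle's unloading decision.
void choose_class_roots(RootOptions& opts, bool should_unload_classes) {
  opts.remove(SO_SystemClasses | SO_AllClasses);
  opts.add(should_unload_classes ? SO_SystemClasses : SO_AllClasses);
  assert(opts.has(SO_SystemClasses) != opts.has(SO_AllClasses));
}
```

Clearing both bits first is what lets the choice flip between cycles; adding one option without removing the other would leave both set after the first change of decision.
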
> I could reproduce this crash with JDK 7u17. I could not reproduce it
> with JDK 8, probably because the failing guarantee line was removed in
> JDK 8 (around line 168 in
> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/1ae0472ff3a0/src/share/vm/classfile/dictionary.cpp.)
> I'm not sure what the removal of the guarantee meant as it was part of
> what looks like a perm gen removal change.
> 
> Anyhow, it'd be nice to be able to use CMSClassUnloadingMaxInterval.
> 
> Thanks,
> Hiroshi



More information about the hotspot-gc-dev mailing list