A guarantee failure with CMSClassUnloadingMaxInterval

Thu May 16 22:03:05 UTC 2013

Hiroshi,

Thanks.

Let me see if I understand the problem you've hit by explaining
your mail back to you.  Please correct me if I've jumped the
wrong way.

Consider the situation where CMS does class unloading not at every
collection but at an interval set by CMSClassUnloadingMaxInterval.
Let's say that class unloading is not done at collection N and is
done at N+1.

Because the decision to do marking uses only CMSClassUnloadingEnabled
and ignores CMSClassUnloadingMaxInterval, at collection N classloaders
and classes can die (i.e., not marked as live).

The decision to purge the system dictionary does take
CMSClassUnloadingMaxInterval into account so a purge
is not done at N.  This leaves dead (classloader, classes)
in the system dictionary.  I think this is where the damage
is done.  When the system dictionary is purged at N+1,
the purge encounters a dead object and odd things
happen.

I don't necessarily see why the guarantee() fails but with
dead objects in the dictionary, it certainly is possible.
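
To make the N / N+1 scenario concrete, here is a toy C++ model of it.
It is only a sketch and not HotSpot code: ToyCollector, ToyEntry and
the marking_honors_interval switch are names made up for illustration,
and the interval bookkeeping is an assumption about how the counter
behaves.  The switch exists only to contrast the two decisions; it is
not a proposed fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; every name here is hypothetical, none is HotSpot's.
struct ToyEntry {
  bool reachable;  // would ordinary tracing find this klass live?
  bool marked;     // result of the most recent marking pass
};

class ToyCollector {
 public:
  ToyCollector(bool unloading_enabled, unsigned max_interval,
               bool marking_honors_interval)
      : unloading_enabled_(unloading_enabled),
        max_interval_(max_interval),
        marking_honors_interval_(marking_honors_interval),
        cycles_since_unload_(0) {}

  void add_class(bool reachable) {
    dictionary_.push_back(ToyEntry{reachable, false});
  }

  std::size_t dictionary_size() const { return dictionary_.size(); }

  // Runs one collection and returns how many unmarked (dead) entries
  // are still sitting in the dictionary afterwards.
  std::size_t collect() {
    const bool purge = should_purge();
    const bool roots = dictionary_is_root();
    for (ToyEntry& e : dictionary_) {
      e.marked = roots || e.reachable;
    }
    if (purge) {
      dictionary_.erase(
          std::remove_if(dictionary_.begin(), dictionary_.end(),
                         [](const ToyEntry& e) { return !e.marked; }),
          dictionary_.end());
      cycles_since_unload_ = 0;
    } else {
      ++cycles_since_unload_;
    }
    const std::size_t dead = static_cast<std::size_t>(
        std::count_if(dictionary_.begin(), dictionary_.end(),
                      [](const ToyEntry& e) { return !e.marked; }));
    assert(!purge || dead == 0);  // a purge removes every unmarked entry
    return dead;
  }

 private:
  // Purge decision: honors both the enabled flag and the interval.
  bool should_purge() const {
    return unloading_enabled_ && cycles_since_unload_ >= max_interval_;
  }

  // Marking decision: by default looks at the enabled flag only, which
  // is the mismatch described above.
  bool dictionary_is_root() const {
    return marking_honors_interval_ ? !should_purge() : !unloading_enabled_;
  }

  bool unloading_enabled_;
  unsigned max_interval_;
  bool marking_honors_interval_;
  unsigned cycles_since_unload_;
  std::vector<ToyEntry> dictionary_;
};
```

With unloading enabled, an interval of 1 and marking_honors_interval
set to false, the first collect() leaves one unmarked entry in the
dictionary, which is the state I think does the damage.  The toy
cannot show the real hazard, namely that the space behind that entry
may be reclaimed before the purge at N+1 looks at it.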

Coleen,

The guarantee failure is

>           // The loader in this entry is alive. If the klass is dead,
>           // the loader must be an initiating loader (rather than the
>           // defining loader). Remove this entry.
>           if (!is_alive->do_object_b(e)) {
>             guarantee(!is_alive->do_object_b(k_def_class_loader),
>                       "defining loader should not be live if klass is not");
>             // If we get here, the class_loader must not be the defining
>             // loader, it must be an initiating one.
>             assert(k_def_class_loader != class_loader,
>                    "cannot have live defining loader and unreachable klass");

If we're doing marking correctly, the defining class
loader should be dead and the guarantee() should
only fail if k_def_class_loader is pointing to garbage,
right?

I'm trying to differentiate between the case where the
marking is wrong and the case where the skipping of
a purge could cause the guarantee() to fail.

Thanks.

Jon

On 5/16/13 1:27 PM, Hiroshi Yamauchi wrote:
> Hi Jon,
>
>> Do you ever see any other type of crash in the VM when you set
>> CMSClassUnloadingMaxInterval?  Or is it only the guarantee()
>> that fails?
> So far, it's only the guarantee() that fails.
>
>> I noticed that the stack trace goes through the stop-the-world
>> mark-sweep-compact collector.  Is that always the case?
> There was another case where the class unloading during the remark
> phase crashed at the same guarantee() as in:
>
> Stack: [0x2a087000,0x2a108000],  sp=0x2a106a00,  free space=510k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x90657d]  VMError::report_and_die()+0x1ad
> V  [libjvm.so+0x393aef]  report_vm_error(char const*, int, char const*, char const*)+0x4f
> V  [libjvm.so+0x3ef9f6]  Dictionary::do_unloading(BoolObjectClosure*)+0x8f6
> V  [libjvm.so+0x864faf]  SystemDictionary::do_unloading(BoolObjectClosure*)+0x2f
> V  [libjvm.so+0x372ba1]  CMSCollector::refProcessingWork(bool, bool)+0x681
> V  [libjvm.so+0x374bd3]  CMSCollector::checkpointRootsFinalWork(bool, bool, bool)+0x123
> V  [libjvm.so+0x3751df]  CMSCollector::checkpointRootsFinal(bool, bool, bool)+0x1bf
> V  [libjvm.so+0x3757c7]  CMSCollector::do_CMS_operation(CMSCollector::CMS_op_type)+0x357
> V  [libjvm.so+0x903846]  VM_CMS_Final_Remark::doit()+0x66
> V  [libjvm.so+0x910528]  VM_Operation::evaluate()+0x48
> V  [libjvm.so+0x90f0e3]  VMThread::loop()+0x233
> V  [libjvm.so+0x90f8d8]  VMThread::run()+0x98
> V  [libjvm.so+0x741c05]  java_start(Thread*)+0x215
> C  [libpthread.so.0+0x609e]  start_thread+0xce
>
> VM_Operation (0x2a9fedb4): CMS_Final_Remark, mode: safepoint, requested by thread 0x316f6400
>
> But I haven't been able to reproduce this case so far.
>
> Thanks.