RFR (S) JDK-8007725 - Klass::restore_unshareable_info() triggers assert(k->java_mirror() == NULL)

Ioi Lam ioi.lam at oracle.com
Thu Mar 14 12:04:32 PDT 2013


On 03/14/2013 10:08 AM, Karen Kinnear wrote:
>>> 2) method.cpp 804, klass.cpp 489
>>> Can you possibly explain why you would not see a problem in DumpSharedSpaces?
>>> For remove_unshareable_info I am confused when you would clean up a number of the fields
>>> but leave the class_loader_data?
>> In DumpSharedSpaces, we come to remove_unsharable_info only after all classes are successfully loaded, so no Method or Klass would be partially initialized. However, at run-time, a Klass might be partially initialized. A Method may be not-initialized (because Method::link() of a previous method in the method array has failed due to OOM).
> Thank you for the explanation. Could you add this as a comment in the code please?
>
> A question - for DumpSharedSpaces, could you actually run into this problem and not have a successful load?
> I get it that for the usual DumpSharedSpaces logic, you are running this as clean up prior to dump and not
> as error recovery. I wonder if passing in a flag to distinguish those cases would make more sense?

For DumpSharedSpaces, you won't see any partially initialized InstanceKlasses because they won't be entered into the system dictionary.

I thought about adding an extra flag but that would mean changing a lot of code. I just needed two checks for !DumpSharedSpaces so I decided to keep it simple.

>>> 3. klass.cpp remove_unshareable_info
>>> I'm still trying to figure out exactly how you determined the fields to be cleared.
>>> I get the set that matches restore_unshareable_info.
>>> If the other fields got filled in, then another thread got its hands on this instanceKlass, via
>>> a sibling or subklass - and you don't want to be cleaning this up at all, so I'm wondering
>>> if that would be an assertion instead?
>> If the class has been entered into the system dictionary (see my comments below) then Klass::remove_unshareable_info and Klass::restore_unshareable_info will never be called again for this class.
>>
>> If the class has not been entered into the system dictionary, then Klass::restore_unshareable_info is called only by a single thread at a time (due to the locking inside SystemDictionary::resolve_instance_class_or_null). I.e., a subclass or subinterface should never see a partially initialized shared class. Therefore, If loading has failed, it is thread-safe to call Klass::remove_unshareable_info.
>>
>> I will change the condition of when Klass::remove_unshareable_info is called. See below.'
> Thank you.
> So - where do the other fields get set that you are freeing? e.g. subklass, siblings, etc. - if the class did not
> get entered into the system dictionary?

Klass::_subklass and Klass::_next_sibling probably won't be set, but 
clearing them shouldn't hurt -- this way the code looks cleaner, as 
remove_unsharable_info is also called during CDS dumping.

>
>>> 4. Could you possibly remove lines 1162-1166 - I don't think those are pertinent
>>> post-nopermgen.
>> You mean in systemDictionary.cpp? I will remove the comments.
>>
>>> 5. systemDictionary.cpp
>>>
>>> I'm wondering if you want to move the cleanup request to a different place.
>>>
>>> If the load was successful, but the check_constraints on the initiating loader fails - I think you want to
>>> leave the klass loaded by the defining loader (null) and leave the information intact.
>>> (We allow a defineclass to succeed and future lookups to find the class/defining loader even if
>>> the initiating loader fails due to loader constraints or later checkpackageaccess for instance)
>>>
>>> I don't quite understand where the exception is happening that you are cleaning up for.
>>> Where did you see an OOM? Where else could they happen here? What other
>>> exceptions could happen? I think this has to be handled at a tighter level of granularity so
>>> we know exactly when this clean up is needed.
>> In testing, I saw OOM inside Klass::restore_unshareable_info() => java_lang_Class::create_mirror().
>>
>> But InstanceKlass::restore_unshareable_info() actually makes a lot of allocations, so it could fail at a number of different places.
> Thank you. So one of the locations that could cause OOM problems would be restore_unshareable_info. Thanks.
> And there is the possibility of getting a variety of exceptions from the find_or_define path.
>> The main reason for the clean up is this:
>>
>> In SystemDictionary::resolve_instance_class_or_null, if the target class is not in the system dictionary, then load_instance_class() will be called, which eventually will unconditionally call into InstanceKlass::restore_unshareable_info().
> (I think that is conditional on find_shared_class, and the resolve_super for superclasses/superinterfaces succeeding - those have CHECKs so if they fail, we don't get here)
>> So the clean up should be done whenever we have failed to put the shared class into the dictionary.
> I agree.
>> I am thinking of leaving the clean up at the same place, but do this instead:
>>
>>     ....
>>     } // load_instance_class loop
>>
>>     if (HAS_PENDING_EXCEPTION) {
>>       // An OOM could have happened at various places inside load_instance_class, and
>>       // the shared klass may not have been returned to us. Find it and make
>>       // sure it's cleaned up.
>>       if (class_loader.is_null()) {
>>         instanceKlassHandle ik(THREAD, find_shared_class(name));
>>         if (ik.not_null()) {
>>           MutexLocker mu(SystemDictionary_lock, THREAD);
>>           Klass* kk = find_class(name, loader_data);
>>           if (kk == NULL) {
>>             clean_up_shared_class(ik, class_loader, THREAD);
>>           }
>>         }
>>       }
>>     }
>>
>>     if (load_instance_added == true) {
>>     ....
> I like the approach of checking if this is in the system dictionary, that is a great idea.
> It needs this minor modification.
> The find_class that you are doing is the name/initiating loader pair.
> If the define_class worked for the name/defining loader (null for this case) that class
> may have been added to the systemdictionary for lookup, and in that case you do not
> want to do this cleanup.
> So the find class would be the defining loader - which would be the ik->class_loader_data()
> rather than the loader_data you were called with, which is the initiating loader.

I just want to understand a little more ...

 From the code, it seems the local variables "class_loader" and 
"loader_data" won't be changed after these lines 
inSystemDictionary::resolve_instance_class_or_null():

   class_loader = Handle(THREAD, 
java_lang_ClassLoader::non_reflection_class_loader(class_loader()));
   ClassLoaderData *loader_data = register_loader(class_loader, CHECK_NULL);

(or, is this true? Will the oop contained in the class_loader handle be 
modified subsequently??)

and later in SystemDictionary::load_instance_class(), which is the only 
place that calls load_shared_class():

   if (class_loader.is_null()) {
     // Search the shared system dictionary for classes preloaded into the
     // shared spaces.
     instanceKlassHandle k;
     {
       PerfTraceTime vmtimer(ClassLoader::perf_shared_classload_time());
       k = load_shared_class(class_name, class_loader, THREAD);
     }

     if (k.is_null()) {
       // Use VM class loader
       PerfTraceTime vmtimer(ClassLoader::perf_sys_classload_time());
       k = ClassLoader::load_classfile(class_name, CHECK_(nh));
     }

     // find_or_define_instance_class may return a different InstanceKlass
     if (!k.is_null()) {
       k = find_or_define_instance_class(class_name, class_loader, k, 
CHECK_(nh));
     }
     return k;

so load_shared_class() is called only if class_loader.is_null().

And later ...

         if (!HAS_PENDING_EXCEPTION) {
           { // Grabbing the Compile_lock prevents systemDictionary updates
             // during compilations.
             MutexLocker mu(Compile_lock, THREAD);
             update_dictionary(d_index, d_hash, p_index, p_hash,
                             k, class_loader, THREAD);
           }

so the shared klass would be always entered into the system dictionary 
with the NULL class loader.

Would this means, after the update_dictionary() call finishes, for a 
shared classes,

       ik->class_loader() == class_loader
           == ClassLoaderData::the_null_class_loader_data()->class_loader()

I can change the code to the way you like, but I need to add a check to 
make sureik->class_loader_data()has been initialized (just in case, in 
the future, we might get an exception before 
Klass::restore_unshareable_info() is even called):

     if (HAS_PENDING_EXCEPTION) {
       // An OOM could have happened at various places inside load_instance_class, and
       // the shared klass may not have been returned to us. Find it and make
       // sure it's cleaned up.
       if (class_loader.is_null()) {
         instanceKlassHandle ik(THREAD, find_shared_class(name));
         if (ik.not_null()) {
           if (ik->class_loader_data() == NULL) {
             // We didn't go as far asKlass::restore_unshareable_info(),
             // so nothing to clean up
           } else {
              MutexLocker mu(SystemDictionary_lock, THREAD);
             Klass* kk = find_class(name, ik->class_loader_data());
             if (kk == NULL) {
               clean_up_shared_class(ik, class_loader, THREAD);
           }
         }
       }
     }

What do you think?

Thanks
- Ioi
  

>>> So I'm assuming you are doing a cleanup of the unshareable info because:
>>> find_shared_class succeeded, resolve_super_or_fail on superclasses/interfaces succeeded and
>>> restore_unshareable_info succeeded.
>>> Can we assume load_shared_class succeeded or do we need a check there in case notify_class_loaded
>>> failed?
>>>
>>> For the bootloader today, LOAD_INSTANCE is not done in parallel, so we are not going to run
>>> into the situation in which another thread is loading the same class in the LOAD_INSTANCE sense.
>>> That said, once we add the instanceKlass to the systemdictionary - any other thread could
>>> find it and use it - so we don't want to clean it up - it may be in use.
>>>
>>> In load_instance_class if load_shared_class succeeds, you call find_or_define_instance_class, which
>>> calls define_class.
>>> Inside find_or_define_instance_class and define_instance_class you can have failures
>>> before you add to the systemdictionary - in which case you want to do this cleanup. If the failures are
>>> after that point - this thread may not get the information back due to a pending exception, but other
>>> threads may use the information - so you don't want to do the cleanup. So the key is - if update_dictionary
>>> succeeded - don't do the cleanup. If it did not, you want the cleanup.
>>>
>>> Coleen is planning to put in code to cleanup the instanceKlass - and this should go the same place I
>>> would assume, so you probably want to discuss this closely with her.
> Note that Coleen will also need cleanup for the resolve_from_stream path - which you don't need,
> around the find_or_define call.
>> OK, I will do that.
>>
>> Thanks for the detailed review! I really appreciate that.
>>
>> - Ioi
> thanks for jumping in on this complex code,
> Karen
>>> thanks,
>>> Karen
>>>
>>> On Mar 14, 2013, at 1:49 AM, Ioi Lam wrote:
>>>
>>>> Hi, could someone review this? This fixes a crash related to OOM handling.
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 03/11/2013 12:47 PM, Ioi Lam wrote:
>>>>> Please review:
>>>>> http://cr.openjdk.java.net/~iklam/8007725/null_mirror_001/
>>>>>
>>>>> Bug: NPG: Klass::restore_unshareable_info() triggers
>>>>>     assert(k->java_mirror() == NULL)
>>>>>
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007725
>>>>> or https://jbs.oracle.com/bugs/browse/JDK-8007725
>>>>>
>>>>>     (The bugs.sun.com page may be unavailable for a couple of days due
>>>>>      to recent change of confidentiality status.)
>>>>>
>>>>> Summary of fix:
>>>>>
>>>>>     The assert is at:
>>>>>
>>>>>     --------
>>>>>     # Internal Error (src/share/vm/classfile/javaClasses.cpp:527)
>>>>>     # assert(k->java_mirror() == NULL) failed: should only assign mirror once
>>>>>     java_lang_Class::create_mirror(KlassHandle)
>>>>>     Klass::restore_unshareable_info()
>>>>>     InstanceKlass::restore_unshareable_info()
>>>>>     instanceKlassHandle SystemDictionary::load_shared_class(instanceKlassHandle,Handle)
>>>>>     --------
>>>>>
>>>>>     This happens if during class loading, an OOM happened after
>>>>>     Klass::restore_unshareable_info() has called
>>>>>     java_lang_Class::create_mirror(), but before the class loading is completed.
>>>>>
>>>>>     At this point, the class is not in the system dictionary. However the
>>>>>     Klass structure is partially initialized.
>>>>>
>>>>>     If the app tries to load the same class again, the assert happens.
>>>>>
>>>>>     The fix is to undo all work performed by
>>>>>     InstanceKlass::restore_unshareable_info().
>>>>>
>>>>> Tests executed:
>>>>>
>>>>>     * JPRT
>>>>>     * JCK (vm, lang, api/signaturetest)
>>>>>     * UTE (vm.quick.testlist, )
>>>>>
>>>>> Thanks,
>>>>> Ioi



More information about the hotspot-runtime-dev mailing list