RFR (S) JDK-8007725 - Klass::restore_unshareable_info() triggers assert(k->java_mirror() == NULL)
Ioi Lam
ioi.lam at oracle.com
Thu Mar 14 12:04:32 PDT 2013
On 03/14/2013 10:08 AM, Karen Kinnear wrote:
>>> 2) method.cpp 804, klass.cpp 489
>>> Can you possibly explain why you would not see a problem in DumpSharedSpaces?
>>> For remove_unshareable_info I am confused when you would clean up a number of the fields
>>> but leave the class_loader_data?
>> In DumpSharedSpaces, we come to remove_unsharable_info only after all classes are successfully loaded, so no Method or Klass would be partially initialized. However, at run-time, a Klass might be partially initialized. A Method may be not-initialized (because Method::link() of a previous method in the method array has failed due to OOM).
> Thank you for the explanation. Could you add this as a comment in the code please?
>
> A question - for DumpSharedSpaces, could you actually run into this problem and not have a successful load?
> I get it that for the usual DumpSharedSpaces logic, you are running this as clean up prior to dump and not
> as error recovery. I wonder if passing in a flag to distinguish those cases would make more sense?
For DumpSharedSpaces, you won't see any partially initialized InstanceKlasses because they won't be entered into the system dictionary.
I thought about adding an extra flag but that would mean changing a lot of code. I just needed two checks for !DumpSharedSpaces so I decided to keep it simple.
>>> 3. klass.cpp remove_unshareable_info
>>> I'm still trying to figure out exactly how you determined the fields to be cleared.
>>> I get the set that matches restore_unshareable_info.
>>> If the other fields got filled in, then another thread got its hands on this instanceKlass, via
>>> a sibling or subklass - and you don't want to be cleaning this up at all, so I'm wondering
>>> if that would be an assertion instead?
>> If the class has been entered into the system dictionary (see my comments below) then Klass::remove_unshareable_info and Klass::restore_unshareable_info will never be called again for this class.
>>
>> If the class has not been entered into the system dictionary, then Klass::restore_unshareable_info is called only by a single thread at a time (due to the locking inside SystemDictionary::resolve_instance_class_or_null). I.e., a subclass or subinterface should never see a partially initialized shared class. Therefore, If loading has failed, it is thread-safe to call Klass::remove_unshareable_info.
>>
>> I will change the condition of when Klass::remove_unshareable_info is called. See below.'
> Thank you.
> So - where do the other fields get set that you are freeing? e.g. subklass, siblings, etc. - if the class did not
> get entered into the system dictionary?
Klass::_subklass and Klass::_next_sibling probably won't be set, but
clearing them shouldn't hurt -- this way the code looks cleaner, as
remove_unsharable_info is also called during CDS dumping.
>
>>> 4. Could you possibly remove lines 1162-1166 - I don't think those are pertinent
>>> post-nopermgen.
>> You mean in systemDictionary.cpp? I will remove the comments.
>>
>>> 5. systemDictionary.cpp
>>>
>>> I'm wondering if you want to move the cleanup request to a different place.
>>>
>>> If the load was successful, but the check_constraints on the initiating loader fails - I think you want to
>>> leave the klass loaded by the defining loader (null) and leave the information intact.
>>> (We allow a defineclass to succeed and future lookups to find the class/defining loader even if
>>> the initiating loader fails due to loader constraints or later checkpackageaccess for instance)
>>>
>>> I don't quite understand where the exception is happening that you are cleaning up for.
>>> Where did you see an OOM? Where else could they happen here? What other
>>> exceptions could happen? I think this has to be handled at a tighter level of granularity so
>>> we know exactly when this clean up is needed.
>> In testing, I saw OOM inside Klass::restore_unshareable_info() => java_lang_Class::create_mirror().
>>
>> But InstanceKlass::restore_unshareable_info() actually makes a lot of allocations, so it could fail at a number of different places.
> Thank you. So one of the locations that could cause OOM problems would be restore_unshareable_info. Thanks.
> And there is the possibility of getting a variety of exceptions from the find_or_define path.
>> The main reason for the clean up is this:
>>
>> In SystemDictionary::resolve_instance_class_or_null, if the target class is not in the system dictionary, then load_instance_class() will be called, which eventually will unconditionally call into InstanceKlass::restore_unshareable_info().
> (I think that is conditional on find_shared_class, and the resolve_super for superclasses/superinterfaces succeeding - those have CHECKs so if they fail, we don't get here)
>> So the clean up should be done whenever we have failed to put the shared class into the dictionary.
> I agree.
>> I am thinking of leaving the clean up at the same place, but do this instead:
>>
>> ....
>> } // load_instance_class loop
>>
>> if (HAS_PENDING_EXCEPTION) {
>> // An OOM could have happened at various places inside load_instance_class, and
>> // the shared klass may not have been returned to us. Find it and make
>> // sure it's cleaned up.
>> if (class_loader.is_null()) {
>> instanceKlassHandle ik(THREAD, find_shared_class(name));
>> if (ik.not_null()) {
>> MutexLocker mu(SystemDictionary_lock, THREAD);
>> Klass* kk = find_class(name, loader_data);
>> if (kk == NULL) {
>> clean_up_shared_class(ik, class_loader, THREAD);
>> }
>> }
>> }
>> }
>>
>> if (load_instance_added == true) {
>> ....
> I like the approach of checking if this is in the system dictionary, that is a great idea.
> It needs this minor modification.
> The find_class that you are doing is the name/initiating loader pair.
> If the define_class worked for the name/defining loader (null for this case) that class
> may have been added to the systemdictionary for lookup, and in that case you do not
> want to do this cleanup.
> So the find class would be the defining loader - which would be the ik->class_loader_data()
> rather than the loader_data you were called with, which is the initiating loader.
I just want to understand a little more ...
From the code, it seems the local variables "class_loader" and
"loader_data" won't be changed after these lines
inSystemDictionary::resolve_instance_class_or_null():
class_loader = Handle(THREAD,
java_lang_ClassLoader::non_reflection_class_loader(class_loader()));
ClassLoaderData *loader_data = register_loader(class_loader, CHECK_NULL);
(or, is this true? Will the oop contained in the class_loader handle be
modified subsequently??)
and later in SystemDictionary::load_instance_class(), which is the only
place that calls load_shared_class():
if (class_loader.is_null()) {
// Search the shared system dictionary for classes preloaded into the
// shared spaces.
instanceKlassHandle k;
{
PerfTraceTime vmtimer(ClassLoader::perf_shared_classload_time());
k = load_shared_class(class_name, class_loader, THREAD);
}
if (k.is_null()) {
// Use VM class loader
PerfTraceTime vmtimer(ClassLoader::perf_sys_classload_time());
k = ClassLoader::load_classfile(class_name, CHECK_(nh));
}
// find_or_define_instance_class may return a different InstanceKlass
if (!k.is_null()) {
k = find_or_define_instance_class(class_name, class_loader, k,
CHECK_(nh));
}
return k;
so load_shared_class() is called only if class_loader.is_null().
And later ...
if (!HAS_PENDING_EXCEPTION) {
{ // Grabbing the Compile_lock prevents systemDictionary updates
// during compilations.
MutexLocker mu(Compile_lock, THREAD);
update_dictionary(d_index, d_hash, p_index, p_hash,
k, class_loader, THREAD);
}
so the shared klass would be always entered into the system dictionary
with the NULL class loader.
Would this means, after the update_dictionary() call finishes, for a
shared classes,
ik->class_loader() == class_loader
== ClassLoaderData::the_null_class_loader_data()->class_loader()
I can change the code to the way you like, but I need to add a check to
make sureik->class_loader_data()has been initialized (just in case, in
the future, we might get an exception before
Klass::restore_unshareable_info() is even called):
if (HAS_PENDING_EXCEPTION) {
// An OOM could have happened at various places inside load_instance_class, and
// the shared klass may not have been returned to us. Find it and make
// sure it's cleaned up.
if (class_loader.is_null()) {
instanceKlassHandle ik(THREAD, find_shared_class(name));
if (ik.not_null()) {
if (ik->class_loader_data() == NULL) {
// We didn't go as far asKlass::restore_unshareable_info(),
// so nothing to clean up
} else {
MutexLocker mu(SystemDictionary_lock, THREAD);
Klass* kk = find_class(name, ik->class_loader_data());
if (kk == NULL) {
clean_up_shared_class(ik, class_loader, THREAD);
}
}
}
}
What do you think?
Thanks
- Ioi
>>> So I'm assuming you are doing a cleanup of the unshareable info because:
>>> find_shared_class succeeded, resolve_super_or_fail on superclasses/interfaces succeeded and
>>> restore_unshareable_info succeeded.
>>> Can we assume load_shared_class succeeded or do we need a check there in case notify_class_loaded
>>> failed?
>>>
>>> For the bootloader today, LOAD_INSTANCE is not done in parallel, so we are not going to run
>>> into the situation in which another thread is loading the same class in the LOAD_INSTANCE sense.
>>> That said, once we add the instanceKlass to the systemdictionary - any other thread could
>>> find it and use it - so we don't want to clean it up - it may be in use.
>>>
>>> In load_instance_class if load_shared_class succeeds, you call find_or_define_instance_class, which
>>> calls define_class.
>>> Inside find_or_define_instance_class and define_instance_class you can have failures
>>> before you add to the systemdictionary - in which case you want to do this cleanup. If the failures are
>>> after that point - this thread may not get the information back due to a pending exception, but other
>>> threads may use the information - so you don't want to do the cleanup. So the key is - if update_dictionary
>>> succeeded - don't do the cleanup. If it did not, you want the cleanup.
>>>
>>> Coleen is planning to put in code to cleanup the instanceKlass - and this should go the same place I
>>> would assume, so you probably want to discuss this closely with her.
> Note that Coleen will also need cleanup for the resolve_from_stream path - which you don't need,
> around the find_or_define call.
>> OK, I will do that.
>>
>> Thanks for the detailed review! I really appreciate that.
>>
>> - Ioi
> thanks for jumping in on this complex code,
> Karen
>>> thanks,
>>> Karen
>>>
>>> On Mar 14, 2013, at 1:49 AM, Ioi Lam wrote:
>>>
>>>> Hi, could someone review this? This fixes a crash related to OOM handling.
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 03/11/2013 12:47 PM, Ioi Lam wrote:
>>>>> Please review:
>>>>> http://cr.openjdk.java.net/~iklam/8007725/null_mirror_001/
>>>>>
>>>>> Bug: NPG: Klass::restore_unshareable_info() triggers
>>>>> assert(k->java_mirror() == NULL)
>>>>>
>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007725
>>>>> or https://jbs.oracle.com/bugs/browse/JDK-8007725
>>>>>
>>>>> (The bugs.sun.com page may be unavailable for a couple of days due
>>>>> to recent change of confidentiality status.)
>>>>>
>>>>> Summary of fix:
>>>>>
>>>>> The assert is at:
>>>>>
>>>>> --------
>>>>> # Internal Error (src/share/vm/classfile/javaClasses.cpp:527)
>>>>> # assert(k->java_mirror() == NULL) failed: should only assign mirror once
>>>>> java_lang_Class::create_mirror(KlassHandle)
>>>>> Klass::restore_unshareable_info()
>>>>> InstanceKlass::restore_unshareable_info()
>>>>> instanceKlassHandle SystemDictionary::load_shared_class(instanceKlassHandle,Handle)
>>>>> --------
>>>>>
>>>>> This happens if during class loading, an OOM happened after
>>>>> Klass::restore_unshareable_info() has called
>>>>> java_lang_Class::create_mirror(), but before the class loading is completed.
>>>>>
>>>>> At this point, the class is not in the system dictionary. However the
>>>>> Klass structure is partially initialized.
>>>>>
>>>>> If the app tries to load the same class again, the assert happens.
>>>>>
>>>>> The fix is to undo all work performed by
>>>>> InstanceKlass::restore_unshareable_info().
>>>>>
>>>>> Tests executed:
>>>>>
>>>>> * JPRT
>>>>> * JCK (vm, lang, api/signaturetest)
>>>>> * UTE (vm.quick.testlist, )
>>>>>
>>>>> Thanks,
>>>>> Ioi
More information about the hotspot-runtime-dev
mailing list