RFR (S) JDK-8007725 - Klass::restore_unshareable_info() triggers assert(k->java_mirror() == NULL)
Ioi Lam
ioi.lam at oracle.com
Thu Mar 14 09:00:22 PDT 2013
On 03/14/2013 08:24 AM, Karen Kinnear wrote:
> Ioi,
>
> Thank you for fixing this. And thank you so much for asking me to review this - this logic is very tricky.
>
> I had a couple of questions/comments please:
>
> 1) klass.cpp line 490, systemDictionary.cpp line 808
> I believe that there are other possible PENDING_EXCEPTIONS in addition to OOM we can get here -
> could you possibly modify the comments to say something like "An exception, such as OOM, ..."
> (e.g. I believe we could get here due to a loader constraint exception)
OK.
> 2) method.cpp 804, klass.cpp 489
> Can you possibly explain why you would not see a problem in DumpSharedSpaces?
> For remove_unshareable_info I am confused when you would clean up a number of the fields
> but leave the class_loader_data?
In DumpSharedSpaces, we come to remove_unsharable_info only after all
classes are successfully loaded, so no Method or Klass would be
partially initialized. However, at run-time, a Klass might be partially
initialized. A Method may be not-initialized (because Method::link() of
a previous method in the method array has failed due to OOM).
In Klass::remove_unshareable_info, the class_loader_data field is NULLed
at the end of the function:
void Klass::remove_unshareable_info() {
if (!DumpSharedSpaces) {
// Clean up after OOM during class loading
if (class_loader_data() != NULL) {
class_loader_data()->remove_class(this);
}
}
set_subklass(NULL);
set_next_sibling(NULL);
// Clear the java mirror
set_java_mirror(NULL);
set_next_link(NULL);
// Null out class_loader_data because we don't share that yet.
set_class_loader_data(NULL);
}
> 3. klass.cpp remove_unshareable_info
> I'm still trying to figure out exactly how you determined the fields to be cleared.
> I get the set that matches restore_unshareable_info.
> If the other fields got filled in, then another thread got its hands on this instanceKlass, via
> a sibling or subklass - and you don't want to be cleaning this up at all, so I'm wondering
> if that would be an assertion instead?
If the class has been entered into the system dictionary (see my
comments below) then Klass::remove_unshareable_info and
Klass::restore_unshareable_info will never be called again for this class.
If the class has not been entered into the system dictionary, then
Klass::restore_unshareable_info is called only by a single thread at a
time (due to the locking inside
SystemDictionary::resolve_instance_class_or_null). I.e., a subclass or
subinterface should never see a partially initialized shared class.
Therefore, If loading has failed, it is thread-safe to call
Klass::remove_unshareable_info.
I will change the condition of when Klass::remove_unshareable_info is
called. See below.
> 4. Could you possibly remove lines 1162-1166 - I don't think those are pertinent
> post-nopermgen.
You mean in systemDictionary.cpp? I will remove the comments.
> 5. systemDictionary.cpp
>
> I'm wondering if you want to move the cleanup request to a different place.
>
> If the load was successful, but the check_constraints on the initiating loader fails - I think you want to
> leave the klass loaded by the defining loader (null) and leave the information intact.
> (We allow a defineclass to succeed and future lookups to find the class/defining loader even if
> the initiating loader fails due to loader constraints or later checkpackageaccess for instance)
>
> I don't quite understand where the exception is happening that you are cleaning up for.
> Where did you see an OOM? Where else could they happen here? What other
> exceptions could happen? I think this has to be handled at a tighter level of granularity so
> we know exactly when this clean up is needed.
In testing, I saw OOM inside Klass::restore_unshareable_info() =>
java_lang_Class::create_mirror().
But InstanceKlass::restore_unshareable_info() actually makes a lot of
allocations, so it could fail at a number of different places.
The main reason for the clean up is this:
In SystemDictionary::resolve_instance_class_or_null, if the target class
is not in the system dictionary, then load_instance_class() will be
called, which eventually will unconditionally call into
InstanceKlass::restore_unshareable_info().
So the clean up should be done whenever we have failed to put the shared
class into the dictionary.
I am thinking of leaving the clean up at the same place, but do this
instead:
....
} // load_instance_class loop
if (HAS_PENDING_EXCEPTION) {
// An OOM could have happened at various places inside
load_instance_class, and
// the shared klass may not have been returned to us. Find it and
make
// sure it's cleaned up.
if (class_loader.is_null()) {
instanceKlassHandle ik(THREAD, find_shared_class(name));
if (ik.not_null()) {
MutexLocker mu(SystemDictionary_lock, THREAD);
Klass* kk = find_class(name, loader_data);
if (kk == NULL) {
clean_up_shared_class(ik, class_loader, THREAD);
}
}
}
}
if (load_instance_added == true) {
....
> So I'm assuming you are doing a cleanup of the unshareable info because:
> find_shared_class succeeded, resolve_super_or_fail on superclasses/interfaces succeeded and
> restore_unshareable_info succeeded.
> Can we assume load_shared_class succeeded or do we need a check there in case notify_class_loaded
> failed?
>
> For the bootloader today, LOAD_INSTANCE is not done in parallel, so we are not going to run
> into the situation in which another thread is loading the same class in the LOAD_INSTANCE sense.
> That said, once we add the instanceKlass to the systemdictionary - any other thread could
> find it and use it - so we don't want to clean it up - it may be in use.
>
> In load_instance_class if load_shared_class succeeds, you call find_or_define_instance_class, which
> calls define_class.
> Inside find_or_define_instance_class and define_instance_class you can have failures
> before you add to the systemdictionary - in which case you want to do this cleanup. If the failures are
> after that point - this thread may not get the information back due to a pending exception, but other
> threads may use the information - so you don't want to do the cleanup. So the key is - if update_dictionary
> succeeded - don't do the cleanup. If it did not, you want the cleanup.
>
> Coleen is planning to put in code to cleanup the instanceKlass - and this should go the same place I
> would assume, so you probably want to discuss this closely with her.
OK, I will do that.
Thanks for the detailed review! I really appreciate that.
- Ioi
> thanks,
> Karen
>
> On Mar 14, 2013, at 1:49 AM, Ioi Lam wrote:
>
>> Hi, could someone review this? This fixes a crash related to OOM handling.
>>
>> Thanks
>> - Ioi
>>
>> On 03/11/2013 12:47 PM, Ioi Lam wrote:
>>> Please review:
>>> http://cr.openjdk.java.net/~iklam/8007725/null_mirror_001/
>>>
>>> Bug: NPG: Klass::restore_unshareable_info() triggers
>>> assert(k->java_mirror() == NULL)
>>>
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007725
>>> or https://jbs.oracle.com/bugs/browse/JDK-8007725
>>>
>>> (The bugs.sun.com page may be unavailable for a couple of days due
>>> to recent change of confidentiality status.)
>>>
>>> Summary of fix:
>>>
>>> The assert is at:
>>>
>>> --------
>>> # Internal Error (src/share/vm/classfile/javaClasses.cpp:527)
>>> # assert(k->java_mirror() == NULL) failed: should only assign mirror once
>>> java_lang_Class::create_mirror(KlassHandle)
>>> Klass::restore_unshareable_info()
>>> InstanceKlass::restore_unshareable_info()
>>> instanceKlassHandle SystemDictionary::load_shared_class(instanceKlassHandle,Handle)
>>> --------
>>>
>>> This happens if during class loading, an OOM happened after
>>> Klass::restore_unshareable_info() has called
>>> java_lang_Class::create_mirror(), but before the class loading is completed.
>>>
>>> At this point, the class is not in the system dictionary. However the
>>> Klass structure is partially initialized.
>>>
>>> If the app tries to load the same class again, the assert happens.
>>>
>>> The fix is to undo all work performed by
>>> InstanceKlass::restore_unshareable_info().
>>>
>>> Tests executed:
>>>
>>> * JPRT
>>> * JCK (vm, lang, api/signaturetest)
>>> * UTE (vm.quick.testlist, )
>>>
>>> Thanks,
>>> Ioi
More information about the hotspot-runtime-dev
mailing list