RFR (S) JDK-8007725 - Klass::restore_unshareable_info() triggers assert(k->java_mirror() == NULL)
Karen Kinnear
karen.kinnear at oracle.com
Thu Mar 14 10:08:02 PDT 2013
Ioi,
On Mar 14, 2013, at 12:00 PM, Ioi Lam wrote:
> On 03/14/2013 08:24 AM, Karen Kinnear wrote:
>> Ioi,
>>
>> Thank you for fixing this. And thank you so much for asking me to review this - this logic is very tricky.
>>
>> I had a couple of questions/comments please:
>>
>> 1) klass.cpp line 490, systemDictionary.cpp line 808
>> I believe that there are other possible PENDING_EXCEPTIONS in addition to OOM we can get here -
>> could you possibly modify the comments to say something like "An exception, such as OOM, ..."
>> (e.g. I believe we could get here due to a loader constraint exception)
> OK.
>> 2) method.cpp 804, klass.cpp 489
>> Can you possibly explain why you would not see a problem in DumpSharedSpaces?
>> For remove_unshareable_info I am confused when you would clean up a number of the fields
>> but leave the class_loader_data?
> In DumpSharedSpaces, we come to remove_unsharable_info only after all classes are successfully loaded, so no Method or Klass would be partially initialized. However, at run-time, a Klass might be partially initialized. A Method may be not-initialized (because Method::link() of a previous method in the method array has failed due to OOM).
Thank you for the explanation. Could you add this as a comment in the code please?
A question - for DumpSharedSpaces, could you actually run into this problem and not have a successful load?
I get it that for the usual DumpSharedSpaces logic, you are running this as clean up prior to dump and not
as error recovery. I wonder if passing in a flag to distinguish those cases would make more sense?
>
> I
>
>
>> 3. klass.cpp remove_unshareable_info
>> I'm still trying to figure out exactly how you determined the fields to be cleared.
>> I get the set that matches restore_unshareable_info.
>> If the other fields got filled in, then another thread got its hands on this instanceKlass, via
>> a sibling or subklass - and you don't want to be cleaning this up at all, so I'm wondering
>> if that would be an assertion instead?
>
> If the class has been entered into the system dictionary (see my comments below) then Klass::remove_unshareable_info and Klass::restore_unshareable_info will never be called again for this class.
>
> If the class has not been entered into the system dictionary, then Klass::restore_unshareable_info is called only by a single thread at a time (due to the locking inside SystemDictionary::resolve_instance_class_or_null). I.e., a subclass or subinterface should never see a partially initialized shared class. Therefore, If loading has failed, it is thread-safe to call Klass::remove_unshareable_info.
>
> I will change the condition of when Klass::remove_unshareable_info is called. See below.'
Thank you.
So - where do the other fields get set that you are freeing? e.g. subklass, siblings, etc. - if the class did not
get entered into the system dictionary?
>> 4. Could you possibly remove lines 1162-1166 - I don't think those are pertinent
>> post-nopermgen.
> You mean in systemDictionary.cpp? I will remove the comments.
>
>> 5. systemDictionary.cpp
>>
>> I'm wondering if you want to move the cleanup request to a different place.
>>
>> If the load was successful, but the check_constraints on the initiating loader fails - I think you want to
>> leave the klass loaded by the defining loader (null) and leave the information intact.
>> (We allow a defineclass to succeed and future lookups to find the class/defining loader even if
>> the initiating loader fails due to loader constraints or later checkpackageaccess for instance)
>>
>> I don't quite understand where the exception is happening that you are cleaning up for.
>> Where did you see an OOM? Where else could they happen here? What other
>> exceptions could happen? I think this has to be handled at a tighter level of granularity so
>> we know exactly when this clean up is needed.
> In testing, I saw OOM inside Klass::restore_unshareable_info() => java_lang_Class::create_mirror().
>
> But InstanceKlass::restore_unshareable_info() actually makes a lot of allocations, so it could fail at a number of different places.
Thank you. So one of the locations that could cause OOM problems would be restore_unshareable_info. Thanks.
And there is the possibility of getting a variety of exceptions from the find_or_define path.
>
> The main reason for the clean up is this:
>
> In SystemDictionary::resolve_instance_class_or_null, if the target class is not in the system dictionary, then load_instance_class() will be called, which eventually will unconditionally call into InstanceKlass::restore_unshareable_info().
(I think that is conditional on find_shared_class, and the resolve_super for superclasses/superinterfaces succeeding - those have
CHECKs so if they fail, we don't get here)
>
> So the clean up should be done whenever we have failed to put the shared class into the dictionary.
I agree.
>
> I am thinking of leaving the clean up at the same place, but do this instead:
>
> ....
> } // load_instance_class loop
>
> if (HAS_PENDING_EXCEPTION) {
> // An OOM could have happened at various places inside load_instance_class, and
> // the shared klass may not have been returned to us. Find it and make
> // sure it's cleaned up.
> if (class_loader.is_null()) {
> instanceKlassHandle ik(THREAD, find_shared_class(name));
> if (ik.not_null()) {
> MutexLocker mu(SystemDictionary_lock, THREAD);
> Klass* kk = find_class(name, loader_data);
> if (kk == NULL) {
> clean_up_shared_class(ik, class_loader, THREAD);
> }
> }
> }
> }
>
> if (load_instance_added == true) {
> ....
I like the approach of checking if this is in the system dictionary, that is a great idea.
It needs this minor modification.
The find_class that you are doing is the name/initiating loader pair.
If the define_class worked for the name/defining loader (null for this case) that class
may have been added to the systemdictionary for lookup, and in that case you do not
want to do this cleanup.
So the find class would be the defining loader - which would be the ik->class_loader_data()
rather than the loader_data you were called with, which is the initiating loader.
>
>
>> So I'm assuming you are doing a cleanup of the unshareable info because:
>> find_shared_class succeeded, resolve_super_or_fail on superclasses/interfaces succeeded and
>> restore_unshareable_info succeeded.
>> Can we assume load_shared_class succeeded or do we need a check there in case notify_class_loaded
>> failed?
>>
>> For the bootloader today, LOAD_INSTANCE is not done in parallel, so we are not going to run
>> into the situation in which another thread is loading the same class in the LOAD_INSTANCE sense.
>> That said, once we add the instanceKlass to the systemdictionary - any other thread could
>> find it and use it - so we don't want to clean it up - it may be in use.
>>
>> In load_instance_class if load_shared_class succeeds, you call find_or_define_instance_class, which
>> calls define_class.
>> Inside find_or_define_instance_class and define_instance_class you can have failures
>> before you add to the systemdictionary - in which case you want to do this cleanup. If the failures are
>> after that point - this thread may not get the information back due to a pending exception, but other
>> threads may use the information - so you don't want to do the cleanup. So the key is - if update_dictionary
>> succeeded - don't do the cleanup. If it did not, you want the cleanup.
>>
>> Coleen is planning to put in code to cleanup the instanceKlass - and this should go the same place I
>> would assume, so you probably want to discuss this closely with her.
Note that Coleen will also need cleanup for the resolve_from_stream path - which you don't need,
around the find_or_define call.
>
> OK, I will do that.
>
> Thanks for the detailed review! I really appreciate that.
>
> - Ioi
thanks for jumping in on this complex code,
Karen
>
>> thanks,
>> Karen
>>
>> On Mar 14, 2013, at 1:49 AM, Ioi Lam wrote:
>>
>>> Hi, could someone review this? This fixes a crash related to OOM handling.
>>>
>>> Thanks
>>> - Ioi
>>>
>>> On 03/11/2013 12:47 PM, Ioi Lam wrote:
>>>> Please review:
>>>> http://cr.openjdk.java.net/~iklam/8007725/null_mirror_001/
>>>>
>>>> Bug: NPG: Klass::restore_unshareable_info() triggers
>>>> assert(k->java_mirror() == NULL)
>>>>
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007725
>>>> or https://jbs.oracle.com/bugs/browse/JDK-8007725
>>>>
>>>> (The bugs.sun.com page may be unavailable for a couple of days due
>>>> to recent change of confidentiality status.)
>>>>
>>>> Summary of fix:
>>>>
>>>> The assert is at:
>>>>
>>>> --------
>>>> # Internal Error (src/share/vm/classfile/javaClasses.cpp:527)
>>>> # assert(k->java_mirror() == NULL) failed: should only assign mirror once
>>>> java_lang_Class::create_mirror(KlassHandle)
>>>> Klass::restore_unshareable_info()
>>>> InstanceKlass::restore_unshareable_info()
>>>> instanceKlassHandle SystemDictionary::load_shared_class(instanceKlassHandle,Handle)
>>>> --------
>>>>
>>>> This happens if during class loading, an OOM happened after
>>>> Klass::restore_unshareable_info() has called
>>>> java_lang_Class::create_mirror(), but before the class loading is completed.
>>>>
>>>> At this point, the class is not in the system dictionary. However the
>>>> Klass structure is partially initialized.
>>>>
>>>> If the app tries to load the same class again, the assert happens.
>>>>
>>>> The fix is to undo all work performed by
>>>> InstanceKlass::restore_unshareable_info().
>>>>
>>>> Tests executed:
>>>>
>>>> * JPRT
>>>> * JCK (vm, lang, api/signaturetest)
>>>> * UTE (vm.quick.testlist, )
>>>>
>>>> Thanks,
>>>> Ioi
>
More information about the hotspot-runtime-dev
mailing list