RFR (S) 8197844: JVMTI GetLoadedClasses should use the Access API

Tue Mar 20 21:49:21 UTC 2018

Kim and I had some offline discussions about this and he recommended 
that the ClassLoaderData::ensure_loader_alive function return the oop to 
handleize while the closure runs on that ClassLoaderData. This is that 
change.

Tested with jvmti, runtime and jfr tests that use 
ClassLoaderData::*classes/modules/package iterations.

http://cr.openjdk.java.net/~coleenp/8197844.03/webrev/index.html

thanks,
Coleen

On 3/19/18 7:38 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 3/19/18 7:32 PM, Roman Kennke wrote:
>> The change looks good to me.
>>
>> I have also verified that Shenandoah does not call into any of the
>> iterators that might resurrect CLDs by mistake.
>
> That's good!  Thank you for verifying this.  These iterators should 
> now be only runtime iterators.
> thanks,
> Coleen
>
>>
>> Thanks, Roman
>>
>>
>>> After reading my own reply and Erik's, I want to reconsider my change
>>> for this bug.
>>>
>>> There are many places that iterate over ClassLoaderDataGraph, and I've
>>> found another that is buggy in the same way that JVMTI GetLoadedClasses
>>> should have a G1 barrier.  I've also spent more time trying to find if
>>> other uses are buggy, with dissatisfying inconclusion.
>>>
>>> The safest thing to do is mark loaders that we are iterating over as
>>> alive.  This alive barrier doesn't affect oops_do, cld_do or
>>> do_unloading, which are the cases that would be walked by the GC.   It
>>> also can't include verify, which I've removed from the original change.
>>>
>>> I can't assert that the thread isn't a GC thread because
>>> VM_GC_HeapInspection builds a KlassInfoTable as a GC thread.
>>>
>>> I think a klass walk of the ClassLoaderDataGraph when done for 
>>> profiling
>>> and assorted runtime code, must necessarily keep the classes alive
>>> during the interval between GCs.   The class can be unloaded in a
>>> further GC.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8197844.02/webrev
>>>
>>> This is the same patch with the ensure_loader_alive removed from
>>> ClassLoaderDataGraph::verify.
>>>
>>> thanks,
>>> Coleen
>>>
>>> On 3/16/18 11:29 AM, Erik Österlund wrote:
>>>> Hi Coleen,
>>>>
>>>> I for one would like to see your version of the fix, plus some
>>>> suitable assert to catch bad use of these APIs.
>>>> The motivation is that it seems unfortunate to expose APIs for getting
>>>> Metadata that when retrieved is unstable and unsafe to use unless you
>>>> have deep knowledge about GC races and class unloading, and hope all
>>>> callsites know that they can't just use the things they retrieved
>>>> because they are dangerously explosive. The only problem with the more
>>>> conservative API is if the GC itself calls into the code and
>>>> accidentally keeps everything alive preventing class unloading. But I
>>>> think a suitable assert could catch that.
>>>>
>>>> I'm not 100% sure what the assert condition should be. But I'm
>>>> thinking something along the lines of:
>>>> Thread::current()->is_Java_thread() ||
>>>> Thread::current()->is_VM_thread() &&
>>>> (!SafepointSynchronize::is_at_safepoint() || !HeapLock->is_locked())
>>>> Other suggestions for a good condition are welcome - it is just an
>>>> example that may or may not work.
>>>>
>>>> Thanks,
>>>> /Erik
>>>>
>>>> On 2018-03-16 16:01, coleen.phillimore at oracle.com wrote:
>>>>> Hi Roman, thank you for your comments.
>>>>>
>>>>> On 3/15/18 9:46 AM, Roman Kennke wrote:
>>>>>> Am 14.03.2018 um 18:37 schrieb coleen.phillimore at oracle.com:
>>>>>>> Summary: Make sure the holder of a class loader is accessed during
>>>>>>> iteration of CLDG
>>>>>>>
>>>>>>> This is where we should have put the GC barrier.  This can be 
>>>>>>> cleaned
>>>>>>> somewhat when we have a WeakHandle holder in the 
>>>>>>> ClassLoaderData, then
>>>>>>> the code in ensure_loader_alive() becomes _holder.resolve();
>>>>>>>
>>>>>>> Tested with tier1-5.
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/8197844.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8197844
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>> I am not sure at all that this wouldn't have adverse effects. 
>>>>>> Suppose a
>>>>>> GC is iterating all CLDs and marks all of them alive, we'd probably
>>>>>> never unload it? I probably wouldn't do the keep-alive stuff 
>>>>>> wholesale
>>>>>> in the iterator methods.
>>>>>>
>>>>>> I've made (and later withdraw) a (IMO) more straightforward patch to
>>>>>> address the same:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~rkennke/8199612/webrev.00/
>>>>>>
>>>>>> It has the advantage that it doesn't to a bogus load, just to keep
>>>>>> something alive. It loads the mirror and at the same time 
>>>>>> communicates
>>>>>> the GC to keep it alive. Maybe better?
>>>>> So when I made this change, I was being conservative. Anytime we
>>>>> iterate over the CLDG, and take out oops to put somewhere else, we
>>>>> need to mark that the CLD needs to stay alive for that GC, until the
>>>>> somewhere else is walked.  We also have to worry about these classes
>>>>> being unloaded if we are still holding metadata after walking the 
>>>>> CLDG.
>>>>>
>>>>> I've gone through all the CLDG::classes_do, loaded_classes_do,
>>>>> dictionary_all_entries_do, methods_do, modules_do and packages_do at
>>>>> least 100 times.  (also oops_do variants and cld_do). All these
>>>>> calls do something with metadata, and it appears that these places
>>>>> are unaffected by GC or class unloading, except the instance in
>>>>> jvmtiGetLoadedClasses that Poonam found.
>>>>>
>>>>> Also if the closure or function passed to the CLDG iterator functions
>>>>> stop for a safepoint and unload, the walk will be broken. I didn't
>>>>> find any instances of this in the code on my 100 traversals.  My
>>>>> change is trying to make it so I don't have to do this analysis 
>>>>> anymore.
>>>>>
>>>>> Poonams, Erik's change and your change puts the responsibility on the
>>>>> caller of ClassLoaderDataGraph to keep the class alive that it is
>>>>> processing through the walk.   I think I can accept that as a fix,
>>>>> and withdraw mine, since we seem to have no instances of bugs in
>>>>> other walks.  If it turns out to be something that becomes buggy, I
>>>>> will suggest my fix again as an alternative.
>>>>>
>>>>> This fix doesn't affect the walks done during GC.  As of this
>>>>> morning, there are no KlassClosures in GC anymore.   Your concern is
>>>>> well placed though because it could help us introduce bugs which
>>>>> could negatively affect class unloading.
>>>>>
>>>>> So on the balance, there's risk in both versions of the fix, so I'll
>>>>> withdraw mine.  I think the best fix is yours:
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/8199612/webrev.00/src/hotspot/share/prims/jvmtiGetLoadedClasses.cpp.udiff.html 
>>>>>
>>>>>
>>>>> with Erik's function java_mirror_phantom() that does the 
>>>>> RootAccess load.
>>>>>
>>>>> http://cr.openjdk.java.net/~eosterlund/8197844/webrev.00/src/hotspot/share/oops/klass.cpp.udiff.html 
>>>>>
>>>>>
>>>>> but not have it mark the class_loader, only do below.
>>>>>
>>>>> +oop Klass::java_mirror_phantom() {
>>>>> + // Loading the klass_holder as a phantom oop ref keeps the class 
>>>>> alive.
>>>>> + // After the holder has been kept alive, it is safe to return the
>>>>> mirror.
>>>>> + oop mirror =
>>>>> RootAccess<ON_PHANTOM_OOP_REF>::oop_load(k->java_mirror_handle().ptr_raw()); 
>>>>>
>>>>> + return mirror;
>>>>> +}
>>>>>
>>>>> I didn't like that Klass was a friend of ClassLoaderData and could
>>>>> get to the class_loader field (in case I move it).
>>>>>
>>>>> Your change here is covered by Kim's work to accessorize jni.
>>>>>
>>>>> http://cr.openjdk.java.net/~rkennke/8199612/webrev.00/src/hotspot/share/runtime/jniHandles.cpp.udiff.html 
>>>>>
>>>>>
>>>>> So I withdraw and I'll unassign myself and you and/or Erik can take
>>>>> this bug back again.
>>>>>
>>>>> Thanks!
>>>>> Coleen
>>>>>
>>>>>
>>>>>> Roman
>>>>>>
>>
>