RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Nov 21 14:12:00 UTC 2019


Please review a new version of this change that keeps the nmethod from 
being unloaded, after it is added to the deferred event queue:

http://cr.openjdk.java.net/~coleenp/2019/8173361.03/webrev/index.html

Ran the test that failed 100 times without failure, tier1 on Oracle 
supported platforms, and tier2-3 including jvmti and jdi tests locally.

See bug for more details about the crash.

https://bugs.openjdk.java.net/browse/JDK-8173361

Thanks,
Coleen

On 11/18/19 10:09 PM, coleen.phillimore at oracle.com wrote:
>
> Hi Serguei,
>
> Sorry for not sending an update.  I talked to Erik and am working on a 
> version that keeps the nmethod from being unloaded while it's in the 
> deferred event queue, with a version that the GC people will like, and 
> I like.  I'm testing it out now.
>
> Thanks!
> Coleen
>
>
> On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>> Sorry for the latency, I had to investigate it a little bit.
>> I still have some doubt your fix is right thing to do.
>>
>>
>> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Coleen,
>>>>
>>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Hi, I've been working on answers to these questions, so I'll start 
>>>>> with this one.
>>>>>
>>>>> The nmethodLocker keeps the nmethod from being reclaimed 
>>>>> (made_zombie or memory released) by the sweeper, but the nmethod 
>>>>> could be unloaded.  Unloading the nmethod clears the Method* 
>>>>> _method field.
>>>>
>>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>>
>>>>> The post_compiled_method_load event needs the _method field to 
>>>>> look at things like inlining and ScopeDesc fields.   If the 
>>>>> nmethod is unloaded, some of the oops are dead.  There are 
>>>>> "holder" oops that correspond to the metadata in the nmethod.  If 
>>>>> these oops are dead, causing the nmethod to get unloaded, then the 
>>>>> metadata may not be valid.
>>>>>
>>>>> So my change 02 looks for a NULL nmethod._method field to tell 
>>>>> whether we can post information about the nmethod.
>>>>>
>>>>> There's code in nmethod.cpp like:
>>>>>
>>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>>   if (_jmethod_id == NULL) {
>>>>>     // Cache the jmethod_id since it can no longer be looked up 
>>>>> once the
>>>>>     // method itself has been marked for unloading.
>>>>>     _jmethod_id = method()->jmethod_id();
>>>>>   }
>>>>>   return _jmethod_id;
>>>>> }
>>>>>
>>>>> Which was added when post_method_load and unload were turned into 
>>>>> deferred events.
>>>>
>>>> Could we cache the jmethodID in the 
>>>> JvmtiDeferredEvent::compiled_method_load_event
>>>> similarly as we do in the 
>>>> JvmtiDeferredEvent::compiled_method_unload_event?
>>>> This would help to get rid of the dependency on the nmethod::_method.
>>>> Do we depend on any other nmethod fields?
>>>
>>> Yes, there are other nmethod metadata that we rely on to print 
>>> inline information, and this function 
>>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses 
>>> the ScopeDesc data in the nmethod.
>>
>> One possible approach is to prepare and cache all this information
>> in the nmethod::post_compiled_method_load_event() before the
>> JvmtiDeferredEvent::compiled_method_load_event() is called.
>> The event parameters are:
>> typedef struct {
>>      const void* start_address;
>>      jlocation location;
>> } jvmtiAddrLocationMap;
>> CompiledMethodLoad(jvmtiEnv *jvmti_env,
>>              jmethodID method,
>>              jint code_size,
>>              const void* code_addr,
>>              jint map_length,
>>              const jvmtiAddrLocationMap* map,
>>              const void* compile_info)
>> Some of these addresses above could be not accessible when an event 
>> is posted.
>> Not sure yet if it is Okay.
>> The question is if this kind of refactoring is worth and right thing 
>> to do.
>>
>>>
>>> We do cache the jmethodID but that's not good enough.  See my last 
>>> comment in the bug report.  The jmethodID can point to an unloaded 
>>> method.
>>
>> This looks like it is done a little bit late.
>> It'd better to do it before the event is deferred (see above).
>>
>>> I tried a version of keeping the nmethod alive, but the GC folks 
>>> will hate it.  And it doesn't work and I hate it.
>>
>> From serviceability point of view this is the best and most 
>> consistent approach.
>> I seems to me, it was initially designed this way.
>> The downside is it adds some extra complexity to the GC.
>>
>>> My version 01 is the best, with the caveat that maybe it should 
>>> check for _method == NULL instead of nmethod->is_alive(). I have to 
>>> talk to Erik to see if there's a race with concurrent class unloading.
>>>
>>> Any application that depends on a compiled method loading event on a 
>>> class that could be unloaded is a buggy application.  Applications 
>>> should not rely on when the JIT compiler decides to compile a 
>>> method!  This happens to us for a stress test.  Most applications 
>>> will get most of their compiled method loading events as they 
>>> normally do.
>>
>> It is not an application that relies on the compiled method loading 
>> event.
>> It is about profiling tools to be able to get correct information 
>> about what is going on with compilations.
>> My concern is that if we skip such compiled method load events then 
>> profilers have no way
>> to find out there many unneeded compilations that are thrown away 
>> without any real use.
>> Also, it is not clear what happens with the subsequent compiled 
>> method unload events.
>> Are they going to be skipped as well or they can appear and confuse 
>> profilers?
>>
>>
>> Thanks,
>> Serguei
>>>
>>> Thanks,
>>> Coleen
>>>
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>> I put more debugging in the bug to show this crash was from an 
>>>>> unloaded nmethod.
>>>>>
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> I have some questions.
>>>>>>
>>>>>> Both the compiler method load and unload are posted as deferred 
>>>>>> events.
>>>>>> Both events keep the nmethod alive until the ServiceThread 
>>>>>> processes the event.
>>>>>>
>>>>>> The implementation is:
>>>>>>
>>>>>> JvmtiDeferredEvent 
>>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>>   . . .
>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>   // this deferred event.
>>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>>   return event;
>>>>>> }
>>>>>>
>>>>>> JvmtiDeferredEvent 
>>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, 
>>>>>> jmethodID id, const void* code) {
>>>>>>   . . .
>>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>>   // this deferred event. This will keep the memory for the
>>>>>>   // generated code from being reused too early. We pass
>>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>>   // made into a zombie can be locked.
>>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>>   return event;
>>>>>> }
>>>>>>
>>>>>> void JvmtiDeferredEvent::post() {
>>>>>> assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>>          "Service thread must post enqueued events");
>>>>>>   switch(_type) {
>>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>       break;
>>>>>>     }
>>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>>       // done with the deferred event so unlock the nmethod
>>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>>       break;
>>>>>>     }
>>>>>>     . . .
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Then I wonder how is it possible for the nmethod to be not alive 
>>>>>> here?:
>>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>>> . . .
>>>>>> 2173 // It's not safe to look at metadata for unloaded methods.
>>>>>> 2174 if (!nm->is_alive()) {
>>>>>> 2175 return;
>>>>>> 2176 }
>>>>>> At least, it lokks like something else is broken.
>>>>>> Do I miss something important here?
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>>
>>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: Don't post information which uses metadata from 
>>>>>>> unloaded nmethods
>>>>>>>
>>>>>>> Tested tier1-3 and 100 times with test that failed (reproduced 
>>>>>>> failure without the fix).
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191121/ec3e7d81/attachment.html>


More information about the serviceability-dev mailing list