RFR (S) 8173361: various crashes in JvmtiExport::post_compiled_method_load

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Tue Nov 19 03:09:25 UTC 2019


Hi Serguei,

Sorry for not sending an update.  I talked to Erik and am working on a 
version that keeps the nmethod from being unloaded while it's in the 
deferred event queue, with a version that the GC people will like, and I 
like.  I'm testing it out now.

Thanks!
Coleen


On 11/18/19 10:03 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Sorry for the latency, I had to investigate it a little bit.
> I still have some doubt your fix is right thing to do.
>
>
> On 11/16/19 04:55, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 11/15/19 11:17 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Coleen,
>>>
>>> On 11/15/19 2:12 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi, I've been working on answers to these questions, so I'll start 
>>>> with this one.
>>>>
>>>> The nmethodLocker keeps the nmethod from being reclaimed 
>>>> (made_zombie or memory released) by the sweeper, but the nmethod 
>>>> could be unloaded.  Unloading the nmethod clears the Method* 
>>>> _method field.
>>>
>>> Yes, I see it is done in the nmethod::make_unloaded().
>>>
>>>> The post_compiled_method_load event needs the _method field to look 
>>>> at things like inlining and ScopeDesc fields.   If the nmethod is 
>>>> unloaded, some of the oops are dead.  There are "holder" oops that 
>>>> correspond to the metadata in the nmethod.  If these oops are dead, 
>>>> causing the nmethod to get unloaded, then the metadata may not be 
>>>> valid.
>>>>
>>>> So my change 02 looks for a NULL nmethod._method field to tell 
>>>> whether we can post information about the nmethod.
>>>>
>>>> There's code in nmethod.cpp like:
>>>>
>>>> jmethodID nmethod::get_and_cache_jmethod_id() {
>>>>   if (_jmethod_id == NULL) {
>>>>     // Cache the jmethod_id since it can no longer be looked up 
>>>> once the
>>>>     // method itself has been marked for unloading.
>>>>     _jmethod_id = method()->jmethod_id();
>>>>   }
>>>>   return _jmethod_id;
>>>> }
>>>>
>>>> Which was added when post_method_load and unload were turned into 
>>>> deferred events.
>>>
>>> Could we cache the jmethodID in the 
>>> JvmtiDeferredEvent::compiled_method_load_event
>>> similarly as we do in the 
>>> JvmtiDeferredEvent::compiled_method_unload_event?
>>> This would help to get rid of the dependency on the nmethod::_method.
>>> Do we depend on any other nmethod fields?
>>
>> Yes, there are other nmethod metadata that we rely on to print inline 
>> information, and this function 
>> JvmtiCodeBlobEvents::build_jvmti_addr_location_map because it uses 
>> the ScopeDesc data in the nmethod.
>
> One possible approach is to prepare and cache all this information
> in the nmethod::post_compiled_method_load_event() before the
> JvmtiDeferredEvent::compiled_method_load_event() is called.
> The event parameters are:
> typedef struct {
>      const void* start_address;
>      jlocation location;
> } jvmtiAddrLocationMap;
> CompiledMethodLoad(jvmtiEnv *jvmti_env,
>              jmethodID method,
>              jint code_size,
>              const void* code_addr,
>              jint map_length,
>              const jvmtiAddrLocationMap* map,
>              const void* compile_info)
> Some of these addresses above could be not accessible when an event is 
> posted.
> Not sure yet if it is Okay.
> The question is if this kind of refactoring is worth and right thing 
> to do.
>
>>
>> We do cache the jmethodID but that's not good enough.  See my last 
>> comment in the bug report.  The jmethodID can point to an unloaded 
>> method.
>
> This looks like it is done a little bit late.
> It'd better to do it before the event is deferred (see above).
>
>> I tried a version of keeping the nmethod alive, but the GC folks will 
>> hate it.  And it doesn't work and I hate it.
>
> From serviceability point of view this is the best and most consistent 
> approach.
> I seems to me, it was initially designed this way.
> The downside is it adds some extra complexity to the GC.
>
>> My version 01 is the best, with the caveat that maybe it should check 
>> for _method == NULL instead of nmethod->is_alive().  I have to talk 
>> to Erik to see if there's a race with concurrent class unloading.
>>
>> Any application that depends on a compiled method loading event on a 
>> class that could be unloaded is a buggy application. Applications 
>> should not rely on when the JIT compiler decides to compile a 
>> method!  This happens to us for a stress test.  Most applications 
>> will get most of their compiled method loading events as they 
>> normally do.
>
> It is not an application that relies on the compiled method loading event.
> It is about profiling tools to be able to get correct information 
> about what is going on with compilations.
> My concern is that if we skip such compiled method load events then 
> profilers have no way
> to find out there many unneeded compilations that are thrown away 
> without any real use.
> Also, it is not clear what happens with the subsequent compiled method 
> unload events.
> Are they going to be skipped as well or they can appear and confuse 
> profilers?
>
>
> Thanks,
> Serguei
>>
>> Thanks,
>> Coleen
>>
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> I put more debugging in the bug to show this crash was from an 
>>>> unloaded nmethod.
>>>>
>>>> Coleen
>>>>
>>>>
>>>> On 11/15/19 4:45 PM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> I have some questions.
>>>>>
>>>>> Both the compiler method load and unload are posted as deferred 
>>>>> events.
>>>>> Both events keep the nmethod alive until the ServiceThread 
>>>>> processes the event.
>>>>>
>>>>> The implementation is:
>>>>>
>>>>> JvmtiDeferredEvent 
>>>>> JvmtiDeferredEvent::compiled_method_load_event(nmethod* nm) {
>>>>>   . . .
>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>   // this deferred event.
>>>>>   nmethodLocker::lock_nmethod(nm);
>>>>>   return event;
>>>>> }
>>>>>
>>>>> JvmtiDeferredEvent 
>>>>> JvmtiDeferredEvent::compiled_method_unload_event(nmethod* nm, 
>>>>> jmethodID id, const void* code) {
>>>>>   . . .
>>>>>   // Keep the nmethod alive until the ServiceThread can process
>>>>>   // this deferred event. This will keep the memory for the
>>>>>   // generated code from being reused too early. We pass
>>>>>   // zombie_ok == true here so that our nmethod that was just
>>>>>   // made into a zombie can be locked.
>>>>>   nmethodLocker::lock_nmethod(nm, true /* zombie_ok */);
>>>>>   return event;
>>>>> }
>>>>>
>>>>> void JvmtiDeferredEvent::post() {
>>>>> assert(ServiceThread::is_service_thread(Thread::current()),
>>>>>          "Service thread must post enqueued events");
>>>>>   switch(_type) {
>>>>>     case TYPE_COMPILED_METHOD_LOAD: {
>>>>>       nmethod* nm = _event_data.compiled_method_load;
>>>>>       JvmtiExport::post_compiled_method_load(nm);
>>>>>       // done with the deferred event so unlock the nmethod
>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>       break;
>>>>>     }
>>>>>     case TYPE_COMPILED_METHOD_UNLOAD: {
>>>>>       nmethod* nm = _event_data.compiled_method_unload.nm;
>>>>>       JvmtiExport::post_compiled_method_unload(
>>>>>         _event_data.compiled_method_unload.method_id,
>>>>>         _event_data.compiled_method_unload.code_begin);
>>>>>       // done with the deferred event so unlock the nmethod
>>>>>       nmethodLocker::unlock_nmethod(nm);
>>>>>       break;
>>>>>     }
>>>>>     . . .
>>>>>   }
>>>>> }
>>>>>
>>>>> Then I wonder how is it possible for the nmethod to be not alive 
>>>>> here?:
>>>>> 2168 void JvmtiExport::post_compiled_method_load(nmethod *nm) {
>>>>> . . .
>>>>> 2173 // It's not safe to look at metadata for unloaded methods.
>>>>> 2174 if (!nm->is_alive()) {
>>>>> 2175 return;
>>>>> 2176 }
>>>>> At least, it lokks like something else is broken.
>>>>> Do I miss something important here?
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/14/19 5:15 PM, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: Don't post information which uses metadata from unloaded 
>>>>>> nmethods
>>>>>>
>>>>>> Tested tier1-3 and 100 times with test that failed (reproduced 
>>>>>> failure without the fix).
>>>>>>
>>>>>> open webrev at 
>>>>>> http://cr.openjdk.java.net/~coleenp/2019/8173361.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8173361
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20191118/eb4df9cf/attachment-0001.html>


More information about the serviceability-dev mailing list