Request for review, 6766644: Redefinition of compiled method fails with assertion "Can not load classes with the Compiler thread"
Keith McGuigan
keith.mcguigan at oracle.com
Mon Jan 31 15:56:02 PST 2011
Here's a webrev with the changes I said I'd make, for re-review: http://cr.openjdk.java.net/~kamg/6766644/webrev.03/
(There was no need for a NULL check for those 'new QueueNode()'
statements, as the allocation code called by CHeapObj already does
out-of-memory checking.)
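
For readers not familiar with that allocation path, here is a minimal,
self-contained sketch of the pattern being relied on (simplified names,
not the actual HotSpot sources): an overloaded operator new whose
allocator aborts the process on out-of-memory, so 'new' never returns
NULL and call sites need no check.

  #include <cstdio>
  #include <cstdlib>

  class CHeapObjSketch {
   public:
    void* operator new(size_t size) {
      void* p = ::malloc(size);   // stands in for AllocateHeap()
      if (p == NULL) {
        fprintf(stderr, "out of C heap memory\n");
        abort();                  // stands in for vm_exit_out_of_memory()
      }
      return p;                   // never NULL when we get here
    }
    void operator delete(void* p) { ::free(p); }
  };

  struct QueueNodeSketch : public CHeapObjSketch {
    int payload;
  };

  int main() {
    QueueNodeSketch* node = new QueueNodeSketch();  // no NULL check needed
    node->payload = 42;
    delete node;
    return 0;
  }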
--
- Keith
On Jan 30, 2011, at 6:58 PM, David Holmes wrote:
> Hi Keith,
>
> I've been waiting for the dust to settle a little on this before
> commenting. I don't know the semantics of the events enough to
> comment on whether this deferred event processing can impact things,
> but I can comment on the general approach.
>
> This is a pretty major change in the event architecture and I have a
> few comments and concerns - the main concern being "deadlock" as
> it's not obvious exactly what locks can be held when the
> service_lock will be acquired, nor is it obvious that other locks
> won't be acquired while the service-lock is held. Further, using one
> thread to wait for and process different kinds of events may
> introduce new interactions that weren't previously possible. How are
> you going to test for potential bad interactions between the low-
> memory detection and compile-event postings? I doubt we have any
> existing tests that try to exercise both bits of functionality.
>
> I'm also a little concerned that changing the LowMemoryDetector
> (LMD) thread in to the Service thread is somewhat disruptive when
> doing stack analysis on running apps or core dumps or crash logs,
> because people have been used to seeing the LMD thread for many
> years and now it has changed its identity.
>
> Looking in jvmtiImpl.cpp
>
> I'd prefer to see the locking in all the JvmtiDeferredEventQueue
> methods rather than having the call-sites do the locking, as it is
> less error-prone. But I see that the main processing loop in the
> service thread makes this impossible due to the dual use of the
> "service lock".
>
> Where we have:
>
> 961 void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent& event) {
> ...
> 967 QueueNode* node = new QueueNode(event);
>
> and
>
> 1007 void JvmtiDeferredEventQueue::add_pending_event(
> 1008     const JvmtiDeferredEvent& event) {
> 1009
> 1010   QueueNode* node = new QueueNode(event);
>
> Will an allocation failure here terminate the VM? I suspect it will
> but I don't think it should. Can we not add a link field to the
> event objects rather than have to create Nodes to hold them?
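>
> (To make that suggestion concrete, a sketch with invented names, not
> the webrev code: if the event carries its own link field, enqueuing
> allocates nothing and so cannot fail with out-of-memory.)
>
>   #include <cstddef>
>
>   struct DeferredEventSketch {
>     int kind;                        // whatever payload the event needs
>     DeferredEventSketch* next;       // intrusive link replaces QueueNode
>     explicit DeferredEventSketch(int k) : kind(k), next(NULL) {}
>   };
>
>   class DeferredEventListSketch {
>    public:
>     DeferredEventListSketch() : _head(NULL), _tail(NULL) {}
>     void enqueue(DeferredEventSketch* e) {   // note: no 'new' here at all
>       e->next = NULL;
>       if (_tail == NULL) { _head = _tail = e; }
>       else               { _tail->next = e; _tail = e; }
>     }
>    private:
>     DeferredEventSketch* _head;
>     DeferredEventSketch* _tail;
>   };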
>
> Minor nit but the canonical pattern typically used for a cas-loop is
> a do-while, eg instead of:
>
> 1012   while (true) {
> 1013     node->set_next((QueueNode*)_pending_list);
> 1014     QueueNode* old_value = (QueueNode*)Atomic::cmpxchg_ptr(
> 1015         (void*)node, (volatile void*)&_pending_list, node->next());
> 1016     if (old_value == node->next()) {
> 1017       break;
> 1018     }
> 1019   }
>
> use:
>
> do {
>   node->set_next((QueueNode*)_pending_list);
> } while ((QueueNode*)Atomic::cmpxchg_ptr(
>              (void*)node, (volatile void*)&_pending_list, node->next())
>          != node->next());
>
> and similarly in process_pending_events. Though in
> process_pending_events you can just use Atomic::xchg rather than a
> cmpxchg loop.
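>
> (For illustration, the two patterns expressed with std::atomic and
> invented names rather than the HotSpot Atomic class: a do-while CAS
> push for add_pending_event, and a single unconditional exchange that
> detaches the whole pending list for process_pending_events.)
>
>   #include <atomic>
>
>   struct NodeSketch {
>     NodeSketch* next;
>   };
>
>   static std::atomic<NodeSketch*> pending_list(nullptr);
>
>   // add_pending_event: lock-free push with a do-while CAS loop
>   void push_pending(NodeSketch* node) {
>     NodeSketch* old_head;
>     do {
>       old_head = pending_list.load();
>       node->next = old_head;
>     } while (!pending_list.compare_exchange_weak(old_head, node));
>   }
>
>   // process_pending_events: grab everything with one swap, no loop needed
>   NodeSketch* take_all_pending() {
>     return pending_list.exchange(nullptr);
>   }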
>
>
> 1076 void JvmtiDeferredEventQueue::wait_for_empty_queue() {
> 1077   MutexLockerEx ml(Service_lock, Mutex::_no_safepoint_check_flag);
> 1078   while (has_events()) {
> 1079     Service_lock->notify_all();
> 1080     Service_lock->wait(Mutex::_no_safepoint_check_flag);
> 1081   }
> 1082 }
>
> What is the notify_all for?
>
>
> 1084 void JvmtiDeferredEventQueue::notify_empty_queue() {
> 1085 assert(Service_lock->owned_by_self(), "Must own Service_lock");
> 1086 Service_lock->notify_all();
> 1087 }
>
> This looks suspicious - notification should always occur in
> conjunction with a change of state, which should be atomic with
> respect to the notification. Looking at the usage in the service-
> thread wait-loop I don't understand who is being notified of what.
> Whoever emptied the queue should be doing the notification, i.e. in
> dequeue() if _queue_head == NULL, i.e. at line 997.
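>
> (A sketch of that pairing, with std::mutex/std::condition_variable
> standing in for the Service_lock and invented names: the thread that
> actually empties the queue does the notify, under the same lock that
> protects the queue.)
>
>   #include <condition_variable>
>   #include <deque>
>   #include <mutex>
>
>   static std::mutex              queue_lock;
>   static std::condition_variable queue_empty_cv;
>   static std::deque<int>         pending_events;
>
>   // Caller guarantees the queue is non-empty (as the service thread does).
>   int dequeue_event() {
>     std::unique_lock<std::mutex> guard(queue_lock);
>     int event = pending_events.front();
>     pending_events.pop_front();
>     if (pending_events.empty()) {
>       queue_empty_cv.notify_all();   // state change and notify are paired
>     }
>     return event;
>   }
>
>   void wait_for_empty_queue() {
>     std::unique_lock<std::mutex> guard(queue_lock);
>     queue_empty_cv.wait(guard, [] { return pending_events.empty(); });
>   }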
>
> In dequeue, this:
>
> 988   if (_queue_head == NULL) {
> 989     // Just in case this happens in product; it shouldn't be let's not crash
> 990     return JvmtiDeferredEvent();
> 991   }
>
> doesn't look right. Are we returning a stack allocated instance
> here? (There's also a typo in the comment.)
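>
> (On the stack-allocation question: 'return JvmtiDeferredEvent();'
> returns a temporary by value, i.e. the caller gets a copy, which is
> safe as long as the type is copyable. A tiny illustration with an
> invented value type; returning a reference to a local is what would
> dangle.)
>
>   struct EventSketch {
>     int kind;
>     EventSketch() : kind(0) {}        // default-constructed "empty" event
>   };
>
>   EventSketch make_empty_event() {
>     return EventSketch();             // copy of a temporary: fine
>   }
>
>   // Not safe, by contrast:
>   // const EventSketch& broken() { EventSketch e; return e; }  // dangling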
>
>
> ---
>
> In jvmtiExport.cpp, I don't understand why here:
>
> 832 void JvmtiExport::post_dynamic_code_generated(const char *name, const void *code_begin, const void *code_end) {
> ...
> 1841
> 1842 JvmtiDeferredEventQueue::wait_for_empty_queue();
> 1843
>
> we have to wait for an empty queue, particularly as the queue may
> become non-empty at any moment. Normally such a wait must be atomic
> with respect to an action that requires the queue to be empty, but
> that doesn't seem to be the case here.
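>
> (For reference, the usual shape of such a wait, sketched with invented
> names and std::mutex/std::condition_variable: the wait and the action
> that depends on the queue being empty happen under the same lock, so
> the emptiness cannot be invalidated in between.)
>
>   #include <condition_variable>
>   #include <deque>
>   #include <mutex>
>
>   static std::mutex              g_lock;
>   static std::condition_variable g_cv;
>   static std::deque<int>         g_events;
>
>   void act_when_queue_is_empty() {
>     std::unique_lock<std::mutex> guard(g_lock);
>     g_cv.wait(guard, [] { return g_events.empty(); });
>     // Still holding the lock here, so the queue is provably empty
>     // while we perform whatever required it to be empty.
>   }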
>
> ---
>
> In serviceThread.cpp, are we sure that it is okay to remain
> _thread_blocked while executing this:
>
> 91     ThreadBlockInVM tbivm(jt);
> 92
> 93     MutexLockerEx ml(Service_lock, Mutex::_no_safepoint_check_flag);
> 94     while (!(sensors_changed = LowMemoryDetector::has_pending_requests()) &&
> 95            !(has_jvmti_events = JvmtiDeferredEventQueue::has_events())) {
> 96       // wait until one of the sensors has pending requests, or there is a
> 97       // pending JVMTI event to post
> 98       JvmtiDeferredEventQueue::notify_empty_queue();
> 99       Service_lock->wait(Mutex::_no_safepoint_check_flag);
> 100     }
> 101
> 102     if (has_jvmti_events) {
> 103       jvmti_event = JvmtiDeferredEventQueue::dequeue();
> 104     }
>
> It may be fine but it seems a little strange.
>
> ---
>
> There's no thread.hpp in the webrev but presumably you can now
> delete the LowMemoryDetectorThread class.
>
> ---
>
> Cheers,
> David
>
>
>
> Keith McGuigan said the following on 01/29/11 03:27:
>> Ok, here's my next attempt: http://cr.openjdk.java.net/~kamg/6766644/webrev.02
>> This enqueues the compiled-method-unload events so that the
>> posting of load/unload will be in order. Other changes include
>> locking the nmethod until after the compiled-method-load event is
>> posted, and an alternate mechanism for enqueuing the
>> compiled-method-unload events that are generated at safepoint.
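>>
>> (A minimal sketch of the ordering idea, with invented names and plain
>> C++ in place of the real JvmtiDeferredEventQueue: load and unload
>> events go through one FIFO queue, so the service thread necessarily
>> posts them in the order they were generated.)
>>
>>   #include <cstdio>
>>   #include <deque>
>>
>>   enum EventKind { METHOD_LOAD, METHOD_UNLOAD };
>>
>>   struct CompiledMethodEventSketch {
>>     EventKind kind;
>>     int       method_id;             // stands in for the nmethod
>>   };
>>
>>   static std::deque<CompiledMethodEventSketch> deferred_events;
>>
>>   void enqueue(EventKind kind, int method_id) {
>>     CompiledMethodEventSketch e = { kind, method_id };
>>     deferred_events.push_back(e);    // one queue means one posting order
>>   }
>>
>>   void drain_and_post() {            // what the service thread would do
>>     while (!deferred_events.empty()) {
>>       CompiledMethodEventSketch e = deferred_events.front();
>>       deferred_events.pop_front();
>>       printf("post %s for method %d\n",
>>              e.kind == METHOD_LOAD ? "CompiledMethodLoad"
>>                                    : "CompiledMethodUnload",
>>              e.method_id);
>>     }
>>   }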
>> --
>> - Keith
>> On Jan 26, 2011, at 5:07 PM, Daniel D. Daugherty wrote:
>>> On 1/26/2011 2:52 PM, Tom Rodriguez wrote:
>>>> On Jan 26, 2011, at 1:39 PM, Daniel D. Daugherty wrote:
>>>>
>>>>> It also looks like there must be order between the load and
>>>>> unload events:
>>>>>
>>>>> CompiledMethodLoad for foo
>>>>> CompiledMethodUnload for foo
>>>>> CompiledMethodLoad for foo (again)
>>>>>
>>>>
>>>> Do you mean we can't have multiple versions of compiled code for
>>>> the same method? I don't think that's true or should be
>>>> required. nmethod freeing is very lazy and there's no guarantee
>>>> that we will have completed unloading of an nmethod before we've
>>>> created a new variation of it. It would also be completely ok
>>>> for a JVM to have multiple versions of the compiled code for a
>>>> method. Obviously the load and unload for a particular nmethod
>>>> must be properly ordered.
>>>>
>>>
>>> That last sentence is what I meant; load and unload for a specific
>>> compiled version of foo (nmethod) must be in order.
>>>
>>> Dan
>>>
>>>
>>>
>>>>
>>>>> which is going to mean coordination between the mechanisms for
>>>>> posting
>>>>> of both CompiledMethodLoad and CompiledMethodUnload events.
>>>>>