Request for review, 6766644: Redefinition of compiled method fails with assertion "Can not load classes with the Compiler thread"
David Holmes
David.Holmes at oracle.com
Mon Jan 31 18:20:10 PST 2011
Hi Keith,
I've looked at the new webrev - thanks.
Keith McGuigan said the following on 02/01/11 00:32:
> On Jan 30, 2011, at 6:58 PM, David Holmes wrote:
>
> Since Service_lock is a 'special' lock (see mutex.hpp), no locks can be
> acquired when holding the service-lock. I believe the locking code
> asserts this and I was careful to not acquire new locks when holding the
> Service_lock. So deadlock should not be an issue but let me know if you
> see somewhere I screwed this up.
Okay that makes me feel somewhat better.
> I'm trying to come up with some tests for this now. This risk here
> ought to be low as the behavior of low-memory detection will be
> completely unchanged unless JVMTI compiled-method events are enabled.
> When they are enabled, the low-memory detection actions may be delayed
> by the JVMTI event native code. I don't suspect that this is a problem
> but I'm investigating to to make sure.
My concern is that we may introduce a previously impossible cycle
between these various threads. Event processing can be delayed by
low-memory processing and vice-versa, and I didn't check through to see
if it was impossible to somehow get a cycle between the two.
>> I'm also a little concerned that changing the LowMemoryDetector (LMD)
>> thread in to the Service thread is somewhat disruptive when doing
>> stack analysis on running apps or core dumps or crash logs, because
>> people have been used to seeing the LMD thread for many years and now
>> it has changed its identity.
>
> The thread "is_hidden_from_external_view", (which I know doesn't hide it
> completely), so it's visibility should be pretty limited.
But it is completely visible in the contexts I listed: stack dumps for
live processes and core files. Good compromise keeping the name for
JDK6. May want to give a heads-up to all JDK developers though. :)
>> Looking in jvmtiImpl.cpp
>>
>> I'd prefer to see the locking in all the JvmtiDeferredEventQueue
>> methods rather than having the call-sites do the locking, as it is
>> less error-prone. But I see that the main processing loop in the
>> service thread makes this impossible due to the dual use of the
>> "service lock".
>
> It is possible to massage the service thread loop to make that work. I
> did have it coded up that way at one point but decided to change it back
> and put the onus on the caller to acquire the lock instead. I think
> this is better since it makes it more obvious in the calling code that
> the lock is being acquired. Putting the lock acquisition inside the
> calls instead would sort of "hide" the lock and I think it's safer to
> make it obvious.
That "hiding" is called encapsulation and is generally considered a
"good thing" :) "client-side locking" is generally considered error
prone. That said, inside the VM where deadlocks are a big concern, more
visibility of locks isn't necessarily a bad thing.
>> Where we have:
>>
>> 961 void JvmtiDeferredEventQueue::enqueue(const JvmtiDeferredEvent&
>> event) {
>> ...
>> 967 QueueNode* node = new QueueNode(event);
>>
>> and
>>
>> 1007 void JvmtiDeferredEventQueue::add_pending_event(
>> 1008 const JvmtiDeferredEvent& event) {
>> 1009
>> 1010 QueueNode* node = new QueueNode(event);
>>
>> Will an allocation failure here terminate the VM? I suspect it will
>> but I don't think it should. Can we not add a link field to the event
>> objects rather than have to create Nodes to hold them?
>
> I can add a NULL check to those spots. There's no need for us to segv
> if we can't allocate memory for this. We do need a C-heap allocated
> object since we're passing these events from thread-to-thread. Adding
> the link in the event object itself wouldn't help - it would just mean
> we have to allocate the event object in C-heap instead.
As per your other email NULL check is pointless as "new" will never
return null but will instead call vm_exit_out_of_memory to abort the VM
- which is exactly my point. I don't believe that a failure to allocate
in this code should be considered a fatal error for the VM.
I don't understand why we would have to allocate the event objects in
C-heap, all we want is a link field so that they can form their own
queue without the need for wrapper Nodes.
447 class JvmtiDeferredEvent VALUE_OBJ_CLASS_SPEC {
448 friend class JvmtiDeferredEventQueue;
449 private:
JvmtiDeferredEvent _next; // for chaining events together
public:
JvmtiDeferredEvent next() { return _next; }
void set_next(JvmtiDeferredEvent e) {
assert(_next == NULL, "overwriting linked event!");
_next = e;
}
Further this would allow you to dequeue all the events at once instead
of the service thread going through pulling them off one at a time.
>> In dequeue, this:
>>
>> 988 if (_queue_head == NULL) {
>> 989 // Just in case this happens in product; it shouldn't be let's
>> not crash
>> 990 return JvmtiDeferredEvent();
>> 991 }
>>
>> doesn't look right. Are we returning a stack allocated instance here?
>> (There's also a typo in the comment.)
>
> Yes, we're returning a stack-allocated instance, which returns to the
> caller as a temp and is copied into a local variable there. I'll fix
> the comment.
It isn't "legal" to return a stack-allocated object from a method. I
expect we'll get away with it but still ...
Couldn't you just allocate a static instance to use in this case?
>> In jvmtiExport.cpp, I don't understand why here:
>>
>> 832 void JvmtiExport::post_dynamic_code_generated(const char *name,
>> const void *code_begin, const void *code_end) {
>> ...
>> 1841
>> 1842 JvmtiDeferredEventQueue::wait_for_empty_queue();
>> 1843
>>
>> we have to wait for an empty queue, particularly as the queue may
>> become non-empty at any moment. Normally such a wait must be atomic
>> with respect to an action that requires the queue to be empty, but
>> that doesn't seem to be the case here.
>
<snip>
> that's perfectly fine. It's more of a 'flush_queue' call and I suppose
> I could call it that explicitly (and maybe add a 'flush marker' in the
> queue so that this won't wait for events that were generated while this
> was waiting). Actually if I do that I can get rid of the 'notify' in
> dequeue and instead use a special flush event. Hmmm... that might work
> better.
Ok so in general I like the form of the new architecture but I'm a bit
unclear on the details. In general non-service threads will only call:
- enqueue() to add an event
- add_pending_event() to add an event in the special case of a safepoint
being active
- flush_queue() to wait for earlier events to get posted
The service thread will:
- loop in wait() until there is an event (enqueue will do the notifyAll)
- dequeue() to pull off that event
Now it seems that flush_queue is also called by the service thread, but
I can't see where from?
I'm a little concerned about all the locking/unlocking that seems to
happen; and the fact that there are potentially a lot of spurious
notifications. Though this is mitigated if we only expect a couple of
threads to be involved, there are ways to restructure things so that
event processing can be batched, including a single notifyAll once all
flush events are complete.
Cheers,
David
More information about the serviceability-dev
mailing list