synchronization of JVMTI phase notifications [Fwd: Data visibility between threads in Hotspot]

Wed Feb 18 20:43:37 PST 2009

Hiroshi Yamauchi wrote:
> Thanks for chipping in.
>
> On Fri, Feb 13, 2009 at 9:39 PM, Swamy Venkataramanappa
> <Swamy.Venkataramanappa at sun.com> wrote:
>   
>> jvmti spec says no events should be generated after the VM_DEATH.
>>
>> http://java.sun.com/javase/6/docs/platform/jvmti/jvmti.html#VMDeath
>>
>> It says: "No event will occur of after this event".
>>
>> I suspect we are posting vm death event too soon.   I think we should
>> post vm death after vm thread is destroyed or to the place where all java
>> threads including daemon thread has stopped running.
>>     
>
> That may be a good fix for the issue of events sent after the VM death.
>   
> Since I couldn't suppress my bug with a SetEventCallbacks call that
> nulls out the callback function table or a SetEventNotificationMode
> call to disable all the events, I think there still is some race
> condition/visibility issue there, independent of the above fix.
>   
Yes I agree that there is still some race condition/visibility issue.  
We do need your fix for that. 
My suggestion is just to avoid using mutex locks in event posting.

Thanks,
-Swamy
>> -Swamy
>>
>> Hiroshi Yamauchi wrote:
>>     
>>> Dan,
>>>
>>> On Fri, Feb 13, 2009 at 11:39 AM, Daniel D. Daugherty
>>> <Daniel.Daugherty at sun.com> wrote:
>>>
>>>
>>>       
>>>> I'm leaning toward making the two fields volatile and adding the
>>>> appropriate OrderAccess::XXX calls. I vaguely remember an existing
>>>> bug for events posting late. I'm going to see if I can find it.
>>>>
>>>>         
>>> Making the fields volatile and adding the OrderAccess:xxx calls isn't
>>> enough because we could see the dead phase change while a loop that
>>> calls the list of callbacks is executing (eg the for loop in
>>> JvmtiExport::post_compiled_method_load). We need to guard at least the
>>> loop and the preceding EVT_TRIG_TRACE in a mutex. That's my take.
>>>
>>> Hiroshi
>>>
>>>
>>>
>>>       
>>>> Hiroshi Yamauchi wrote:
>>>>
>>>>         
>>>>> Hi Dan,
>>>>>
>>>>> On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
>>>>> <Daniel.Daugherty at sun.com <mailto:Daniel.Daugherty at sun.com>> wrote:
>>>>>
>>>>>   I'll chime in on parts of this thread below.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>   Tim Bell wrote:
>>>>>
>>>>>       I don't know the precise answer to this question, so I am
>>>>>       forwarding it to the Serviceability list to get a wider audience.
>>>>>
>>>>>       -------- Original Message --------
>>>>>       Subject:     Data visibility between threads in Hotspot
>>>>>       Date:     Tue, 10 Feb 2009 11:14:24 -0800
>>>>>
>>>>>       Hi Tim,
>>>>>
>>>>>       I have a Hotspot question. Chuck Rasbold pointed me to you.
>>>>>       Feel free to
>>>>>       forward this message to other experts at Sun, if needed.
>>>>>
>>>>>       If one Hotspot engineer wants to guarantee the immediate
>>>>>       visibility of a
>>>>>       write to a static variable to reads in other threads (assuming
>>>>>       the reads
>>>>>       and the writes are properly synchronized via a mutex), what
>>>>>       does s/he do?
>>>>>
>>>>>       Does the use of MutexLocker guarantee the visibility? It
>>>>>       probably does not.
>>>>>
>>>>>
>>>>>   I believe that MutexLocker does guarantee the visibility:
>>>>>
>>>>>   src/share/vm/runtime/mutexLocker.hpp:
>>>>>
>>>>>    133  // See orderAccess.hpp.  We assume throughout the VM that
>>>>>   MutexLocker's
>>>>>    134  // and friends constructors do a fence, a lock and an
>>>>>   acquire *in that
>>>>>    135  // order*.  And that their destructors do a release and
>>>>>   unlock, in *that*
>>>>>    136  // order.  If their implementations change such that these
>>>>>   assumptions
>>>>>    137  // are violated, a whole lot of code will break.
>>>>>
>>>>>
>>>>> OK. That's handy.
>>>>>
>>>>>
>>>>>
>>>>>   See src/share/vm/runtime/orderAccess.hpp for the gory details.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>       There are several volatile variables in Hotpot, but it may not
>>>>>       really
>>>>>       work because C++ compilers do not guarantee the visibility or
>>>>>       limit the
>>>>>       instruction reordering.
>>>>>
>>>>>   See the "C++ Volatility" section in
>>>>>   src/share/vm/runtime/orderAccess.hpp.
>>>>>
>>>>>
>>>>> It's an interesting header file to read.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>       For example, there is the static variable JvmtiEnvBase:_phase
>>>>>       which
>>>>>       indicates the JVMTI phase in which the JVM is currently in.
>>>>>       But AFAICT,
>>>>>       there is no synchronization used by the callers of get_phase() and
>>>>>       set_phase() and apparently they can be called by multiple
>>>>>       threads. Also,
>>>>>       the JvmtiEventEnabled::_enabled_bits which is a jlong variable
>>>>>       that
>>>>>       indicates which events are enabled for a single jvmtiEnv. The
>>>>>       writes and
>>>>>       reads to it are not synchronized, either. These are race
>>>>>       conditions,
>>>>>       which implies that some JVMTI event can go off in the dead
>>>>>       phase when no
>>>>>       events are supposed to go off. I'm actually looking into a bug
>>>>>       in a
>>>>>       JVMTI agent that I suspect is due to this.
>>>>>
>>>>>
>>>>>   It certainly looks like JvmtiEnvBase::_phase should minimally be a
>>>>>   volatile; it may also require some sort of synchronization to force
>>>>>   the updated values to be visible...
>>>>>
>>>>>   JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
>>>>>   to have to mull on both of these and talk to some other folks.
>>>>>
>>>>>
>>>>> The bug that I ran across was a memory corruption crash because a
>>>>> CompiledMethodLoad event happened to be sent (due to this race
>>>>> condition)
>>>>> after Agent_OnUnload call where I had already deallocated the agent's
>>>>> data
>>>>> objects. This happens very rarely (once in every more than 100k runs of
>>>>> a
>>>>> more-or-less javac invocation) but often enough to fix.
>>>>>
>>>>> I have a patch attached, which I used to suppress the bug by avoiding
>>>>> the
>>>>> events to be sent during the dead phase by adding synchronization.
>>>>> Basically
>>>>> I make the _phase variable volatile and I use a mutex to guard event
>>>>> callbacks in the all the post_xxx functions and the phase transition to
>>>>> the
>>>>> dead phase in post_vm_death(). Probably we don't need the volatile as
>>>>> long
>>>>> as the mutex guarantees the visibility. The other phase transitions
>>>>> (onload->primodial->start->live) may not need the same degree of
>>>>> synchronization because events that can be sent in a earlier phase can
>>>>> be
>>>>> sent in the next phase as well.
>>>>>
>>>>> The _enabled_bits and the callback function table will probably need a
>>>>> similar synchronization as well. But the attached patch does not
>>>>> implement
>>>>> it and only ensures that no events are sent during the dead phase. I'd
>>>>> like
>>>>> to get the fix into openjdk in one form or another.
>>>>>
>>>>> Thanks,
>>>>> Hiroshi
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>   Dan
>>>>>
>>>>>
>>>>>
>>>>>       Thanks,
>>>>>       Hiroshi
>>>>>
>>>>>
>>>>>
>>>>>           
>>