synchronization of JVMTI phase notifications [Fwd: Data visibility between threads in Hotspot]

Daniel D. Daugherty Daniel.Daugherty at Sun.COM
Wed Feb 18 09:14:34 PST 2009


We can't hold a mutex across an event posting. The event posting
code calls out of the VM to the user agent and holding a VM mutex
across that boundary is not allowed.

I'll also take a look at Swamy's comment about us posting the
VM_DEATH event too soon. These VM shutdown cases are certainly
interesting...

Dan


Paul Hohensee wrote:
> If that's true, then you do indeed need a mutex.  The OrderAccess 
> methods only guarantee
> visibility ordering.  They don't guarantee exclusive access to shared 
> variables.
>
> Paul
>
> Hiroshi Yamauchi wrote:
>> Dan,
>>
>> On Fri, Feb 13, 2009 at 11:39 AM, Daniel D. Daugherty
>> <Daniel.Daugherty at sun.com> wrote:
>>
>>  
>>> I'm leaning toward making the two fields volatile and adding the
>>> appropriate OrderAccess::XXX calls. I vaguely remember an existing
>>> bug for events posting late. I'm going to see if I can find it.
>>>     
>>
>> Making the fields volatile and adding the OrderAccess:xxx calls isn't
>> enough because we could see the dead phase change while a loop that
>> calls the list of callbacks is executing (eg the for loop in
>> JvmtiExport::post_compiled_method_load). We need to guard at least the
>> loop and the preceding EVT_TRIG_TRACE in a mutex. That's my take.
>>
>> Hiroshi
>>
>>
>>  
>>> Hiroshi Yamauchi wrote:
>>>    
>>>> Hi Dan,
>>>>
>>>> On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
>>>> <Daniel.Daugherty at sun.com <mailto:Daniel.Daugherty at sun.com>> wrote:
>>>>
>>>>    I'll chime in on parts of this thread below.
>>>>
>>>>
>>>>
>>>>
>>>>    Tim Bell wrote:
>>>>
>>>>        I don't know the precise answer to this question, so I am
>>>>        forwarding it to the Serviceability list to get a wider 
>>>> audience.
>>>>
>>>>        -------- Original Message --------
>>>>        Subject:     Data visibility between threads in Hotspot
>>>>        Date:     Tue, 10 Feb 2009 11:14:24 -0800
>>>>
>>>>        Hi Tim,
>>>>
>>>>        I have a Hotspot question. Chuck Rasbold pointed me to you.
>>>>        Feel free to
>>>>        forward this message to other experts at Sun, if needed.
>>>>
>>>>        If one Hotspot engineer wants to guarantee the immediate
>>>>        visibility of a
>>>>        write to a static variable to reads in other threads (assuming
>>>>        the reads
>>>>        and the writes are properly synchronized via a mutex), what
>>>>        does s/he do?
>>>>
>>>>        Does the use of MutexLocker guarantee the visibility? It
>>>>        probably does not.
>>>>
>>>>
>>>>    I believe that MutexLocker does guarantee the visibility:
>>>>
>>>>    src/share/vm/runtime/mutexLocker.hpp:
>>>>
>>>>     133  // See orderAccess.hpp.  We assume throughout the VM that
>>>>    MutexLocker's
>>>>     134  // and friends constructors do a fence, a lock and an
>>>>    acquire *in that
>>>>     135  // order*.  And that their destructors do a release and
>>>>    unlock, in *that*
>>>>     136  // order.  If their implementations change such that these
>>>>    assumptions
>>>>     137  // are violated, a whole lot of code will break.
>>>>
>>>>
>>>> OK. That's handy.
>>>>
>>>>
>>>>
>>>>    See src/share/vm/runtime/orderAccess.hpp for the gory details.
>>>>
>>>>
>>>>
>>>>
>>>>        There are several volatile variables in Hotpot, but it may not
>>>>        really
>>>>        work because C++ compilers do not guarantee the visibility or
>>>>        limit the
>>>>        instruction reordering.
>>>>
>>>>    See the "C++ Volatility" section in
>>>>    src/share/vm/runtime/orderAccess.hpp.
>>>>
>>>>
>>>> It's an interesting header file to read.
>>>>
>>>>
>>>>
>>>>
>>>>        For example, there is the static variable JvmtiEnvBase:_phase
>>>>        which
>>>>        indicates the JVMTI phase in which the JVM is currently in.
>>>>        But AFAICT,
>>>>        there is no synchronization used by the callers of 
>>>> get_phase() and
>>>>        set_phase() and apparently they can be called by multiple
>>>>        threads. Also,
>>>>        the JvmtiEventEnabled::_enabled_bits which is a jlong variable
>>>>        that
>>>>        indicates which events are enabled for a single jvmtiEnv. The
>>>>        writes and
>>>>        reads to it are not synchronized, either. These are race
>>>>        conditions,
>>>>        which implies that some JVMTI event can go off in the dead
>>>>        phase when no
>>>>        events are supposed to go off. I'm actually looking into a bug
>>>>        in a
>>>>        JVMTI agent that I suspect is due to this.
>>>>
>>>>
>>>>    It certainly looks like JvmtiEnvBase::_phase should minimally be a
>>>>    volatile; it may also require some sort of synchronization to force
>>>>    the updated values to be visible...
>>>>
>>>>    JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
>>>>    to have to mull on both of these and talk to some other folks.
>>>>
>>>>
>>>> The bug that I ran across was a memory corruption crash because a
>>>> CompiledMethodLoad event happened to be sent (due to this race 
>>>> condition)
>>>> after Agent_OnUnload call where I had already deallocated the 
>>>> agent's data
>>>> objects. This happens very rarely (once in every more than 100k 
>>>> runs of a
>>>> more-or-less javac invocation) but often enough to fix.
>>>>
>>>> I have a patch attached, which I used to suppress the bug by 
>>>> avoiding the
>>>> events to be sent during the dead phase by adding synchronization. 
>>>> Basically
>>>> I make the _phase variable volatile and I use a mutex to guard event
>>>> callbacks in the all the post_xxx functions and the phase 
>>>> transition to the
>>>> dead phase in post_vm_death(). Probably we don't need the volatile 
>>>> as long
>>>> as the mutex guarantees the visibility. The other phase transitions
>>>> (onload->primodial->start->live) may not need the same degree of
>>>> synchronization because events that can be sent in a earlier phase 
>>>> can be
>>>> sent in the next phase as well.
>>>>
>>>> The _enabled_bits and the callback function table will probably need a
>>>> similar synchronization as well. But the attached patch does not 
>>>> implement
>>>> it and only ensures that no events are sent during the dead phase. 
>>>> I'd like
>>>> to get the fix into openjdk in one form or another.
>>>>
>>>> Thanks,
>>>> Hiroshi
>>>>
>>>>
>>>>
>>>>
>>>>    Dan
>>>>
>>>>
>>>>
>>>>        Thanks,
>>>>        Hiroshi
>>>>
>>>>
>>>>       



More information about the serviceability-dev mailing list