synchronization of JVMTI phase notifications [Fwd: Data visibility between threads in Hotspot]

Fri Feb 13 15:40:00 PST 2009

Dan,

On Fri, Feb 13, 2009 at 11:39 AM, Daniel D. Daugherty
<Daniel.Daugherty at sun.com> wrote:

> I'm leaning toward making the two fields volatile and adding the
> appropriate OrderAccess::XXX calls. I vaguely remember an existing
> bug for events posting late. I'm going to see if I can find it.

Making the fields volatile and adding the OrderAccess:xxx calls isn't
enough because we could see the dead phase change while a loop that
calls the list of callbacks is executing (eg the for loop in
JvmtiExport::post_compiled_method_load). We need to guard at least the
loop and the preceding EVT_TRIG_TRACE in a mutex. That's my take.

Hiroshi

> Hiroshi Yamauchi wrote:
>>
>> Hi Dan,
>>
>> On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
>> <Daniel.Daugherty at sun.com <mailto:Daniel.Daugherty at sun.com>> wrote:
>>
>>    I'll chime in on parts of this thread below.
>>
>>
>>
>>
>>    Tim Bell wrote:
>>
>>        I don't know the precise answer to this question, so I am
>>        forwarding it to the Serviceability list to get a wider audience.
>>
>>        -------- Original Message --------
>>        Subject:     Data visibility between threads in Hotspot
>>        Date:     Tue, 10 Feb 2009 11:14:24 -0800
>>
>>        Hi Tim,
>>
>>        I have a Hotspot question. Chuck Rasbold pointed me to you.
>>        Feel free to
>>        forward this message to other experts at Sun, if needed.
>>
>>        If one Hotspot engineer wants to guarantee the immediate
>>        visibility of a
>>        write to a static variable to reads in other threads (assuming
>>        the reads
>>        and the writes are properly synchronized via a mutex), what
>>        does s/he do?
>>
>>        Does the use of MutexLocker guarantee the visibility? It
>>        probably does not.
>>
>>
>>    I believe that MutexLocker does guarantee the visibility:
>>
>>    src/share/vm/runtime/mutexLocker.hpp:
>>
>>     133  // See orderAccess.hpp.  We assume throughout the VM that
>>    MutexLocker's
>>     134  // and friends constructors do a fence, a lock and an
>>    acquire *in that
>>     135  // order*.  And that their destructors do a release and
>>    unlock, in *that*
>>     136  // order.  If their implementations change such that these
>>    assumptions
>>     137  // are violated, a whole lot of code will break.
>>
>>
>> OK. That's handy.
>>
>>
>>
>>    See src/share/vm/runtime/orderAccess.hpp for the gory details.
>>
>>
>>
>>
>>        There are several volatile variables in Hotpot, but it may not
>>        really
>>        work because C++ compilers do not guarantee the visibility or
>>        limit the
>>        instruction reordering.
>>
>>    See the "C++ Volatility" section in
>>    src/share/vm/runtime/orderAccess.hpp.
>>
>>
>> It's an interesting header file to read.
>>
>>
>>
>>
>>        For example, there is the static variable JvmtiEnvBase:_phase
>>        which
>>        indicates the JVMTI phase in which the JVM is currently in.
>>        But AFAICT,
>>        there is no synchronization used by the callers of get_phase() and
>>        set_phase() and apparently they can be called by multiple
>>        threads. Also,
>>        the JvmtiEventEnabled::_enabled_bits which is a jlong variable
>>        that
>>        indicates which events are enabled for a single jvmtiEnv. The
>>        writes and
>>        reads to it are not synchronized, either. These are race
>>        conditions,
>>        which implies that some JVMTI event can go off in the dead
>>        phase when no
>>        events are supposed to go off. I'm actually looking into a bug
>>        in a
>>        JVMTI agent that I suspect is due to this.
>>
>>
>>    It certainly looks like JvmtiEnvBase::_phase should minimally be a
>>    volatile; it may also require some sort of synchronization to force
>>    the updated values to be visible...
>>
>>    JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
>>    to have to mull on both of these and talk to some other folks.
>>
>>
>> The bug that I ran across was a memory corruption crash because a
>> CompiledMethodLoad event happened to be sent (due to this race condition)
>> after Agent_OnUnload call where I had already deallocated the agent's data
>> objects. This happens very rarely (once in every more than 100k runs of a
>> more-or-less javac invocation) but often enough to fix.
>>
>> I have a patch attached, which I used to suppress the bug by avoiding the
>> events to be sent during the dead phase by adding synchronization. Basically
>> I make the _phase variable volatile and I use a mutex to guard event
>> callbacks in the all the post_xxx functions and the phase transition to the
>> dead phase in post_vm_death(). Probably we don't need the volatile as long
>> as the mutex guarantees the visibility. The other phase transitions
>> (onload->primodial->start->live) may not need the same degree of
>> synchronization because events that can be sent in a earlier phase can be
>> sent in the next phase as well.
>>
>> The _enabled_bits and the callback function table will probably need a
>> similar synchronization as well. But the attached patch does not implement
>> it and only ensures that no events are sent during the dead phase. I'd like
>> to get the fix into openjdk in one form or another.
>>
>> Thanks,
>> Hiroshi
>>
>>
>>
>>
>>    Dan
>>
>>
>>
>>        Thanks,
>>        Hiroshi
>>
>>
>