synchronization of JVMTI phase notifications [Fwd: Data visibility between threads in Hotspot]

Wed Feb 18 12:20:06 PST 2009

Thanks for chipping in.

On Fri, Feb 13, 2009 at 9:39 PM, Swamy Venkataramanappa
<Swamy.Venkataramanappa at sun.com> wrote:
> jvmti spec says no events should be generated after the VM_DEATH.
>
> http://java.sun.com/javase/6/docs/platform/jvmti/jvmti.html#VMDeath
>
> It says: "No event will occur of after this event".
>
> I suspect we are posting vm death event too soon.   I think we should
> post vm death after vm thread is destroyed or to the place where all java
> threads including daemon thread has stopped running.

That may be a good fix for the issue of events sent after the VM death.

Since I couldn't suppress my bug with a SetEventCallbacks call that
nulls out the callback function table or a SetEventNotificationMode
call to disable all the events, I think there still is some race
condition/visibility issue there, independent of the above fix.

>
> -Swamy
>
> Hiroshi Yamauchi wrote:
>>
>> Dan,
>>
>> On Fri, Feb 13, 2009 at 11:39 AM, Daniel D. Daugherty
>> <Daniel.Daugherty at sun.com> wrote:
>>
>>
>>>
>>> I'm leaning toward making the two fields volatile and adding the
>>> appropriate OrderAccess::XXX calls. I vaguely remember an existing
>>> bug for events posting late. I'm going to see if I can find it.
>>>
>>
>> Making the fields volatile and adding the OrderAccess:xxx calls isn't
>> enough because we could see the dead phase change while a loop that
>> calls the list of callbacks is executing (eg the for loop in
>> JvmtiExport::post_compiled_method_load). We need to guard at least the
>> loop and the preceding EVT_TRIG_TRACE in a mutex. That's my take.
>>
>> Hiroshi
>>
>>
>>
>>>
>>> Hiroshi Yamauchi wrote:
>>>
>>>>
>>>> Hi Dan,
>>>>
>>>> On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
>>>> <Daniel.Daugherty at sun.com <mailto:Daniel.Daugherty at sun.com>> wrote:
>>>>
>>>>   I'll chime in on parts of this thread below.
>>>>
>>>>
>>>>
>>>>
>>>>   Tim Bell wrote:
>>>>
>>>>       I don't know the precise answer to this question, so I am
>>>>       forwarding it to the Serviceability list to get a wider audience.
>>>>
>>>>       -------- Original Message --------
>>>>       Subject:     Data visibility between threads in Hotspot
>>>>       Date:     Tue, 10 Feb 2009 11:14:24 -0800
>>>>
>>>>       Hi Tim,
>>>>
>>>>       I have a Hotspot question. Chuck Rasbold pointed me to you.
>>>>       Feel free to
>>>>       forward this message to other experts at Sun, if needed.
>>>>
>>>>       If one Hotspot engineer wants to guarantee the immediate
>>>>       visibility of a
>>>>       write to a static variable to reads in other threads (assuming
>>>>       the reads
>>>>       and the writes are properly synchronized via a mutex), what
>>>>       does s/he do?
>>>>
>>>>       Does the use of MutexLocker guarantee the visibility? It
>>>>       probably does not.
>>>>
>>>>
>>>>   I believe that MutexLocker does guarantee the visibility:
>>>>
>>>>   src/share/vm/runtime/mutexLocker.hpp:
>>>>
>>>>    133  // See orderAccess.hpp.  We assume throughout the VM that
>>>>   MutexLocker's
>>>>    134  // and friends constructors do a fence, a lock and an
>>>>   acquire *in that
>>>>    135  // order*.  And that their destructors do a release and
>>>>   unlock, in *that*
>>>>    136  // order.  If their implementations change such that these
>>>>   assumptions
>>>>    137  // are violated, a whole lot of code will break.
>>>>
>>>>
>>>> OK. That's handy.
>>>>
>>>>
>>>>
>>>>   See src/share/vm/runtime/orderAccess.hpp for the gory details.
>>>>
>>>>
>>>>
>>>>
>>>>       There are several volatile variables in Hotpot, but it may not
>>>>       really
>>>>       work because C++ compilers do not guarantee the visibility or
>>>>       limit the
>>>>       instruction reordering.
>>>>
>>>>   See the "C++ Volatility" section in
>>>>   src/share/vm/runtime/orderAccess.hpp.
>>>>
>>>>
>>>> It's an interesting header file to read.
>>>>
>>>>
>>>>
>>>>
>>>>       For example, there is the static variable JvmtiEnvBase:_phase
>>>>       which
>>>>       indicates the JVMTI phase in which the JVM is currently in.
>>>>       But AFAICT,
>>>>       there is no synchronization used by the callers of get_phase() and
>>>>       set_phase() and apparently they can be called by multiple
>>>>       threads. Also,
>>>>       the JvmtiEventEnabled::_enabled_bits which is a jlong variable
>>>>       that
>>>>       indicates which events are enabled for a single jvmtiEnv. The
>>>>       writes and
>>>>       reads to it are not synchronized, either. These are race
>>>>       conditions,
>>>>       which implies that some JVMTI event can go off in the dead
>>>>       phase when no
>>>>       events are supposed to go off. I'm actually looking into a bug
>>>>       in a
>>>>       JVMTI agent that I suspect is due to this.
>>>>
>>>>
>>>>   It certainly looks like JvmtiEnvBase::_phase should minimally be a
>>>>   volatile; it may also require some sort of synchronization to force
>>>>   the updated values to be visible...
>>>>
>>>>   JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
>>>>   to have to mull on both of these and talk to some other folks.
>>>>
>>>>
>>>> The bug that I ran across was a memory corruption crash because a
>>>> CompiledMethodLoad event happened to be sent (due to this race
>>>> condition)
>>>> after Agent_OnUnload call where I had already deallocated the agent's
>>>> data
>>>> objects. This happens very rarely (once in every more than 100k runs of
>>>> a
>>>> more-or-less javac invocation) but often enough to fix.
>>>>
>>>> I have a patch attached, which I used to suppress the bug by avoiding
>>>> the
>>>> events to be sent during the dead phase by adding synchronization.
>>>> Basically
>>>> I make the _phase variable volatile and I use a mutex to guard event
>>>> callbacks in the all the post_xxx functions and the phase transition to
>>>> the
>>>> dead phase in post_vm_death(). Probably we don't need the volatile as
>>>> long
>>>> as the mutex guarantees the visibility. The other phase transitions
>>>> (onload->primodial->start->live) may not need the same degree of
>>>> synchronization because events that can be sent in a earlier phase can
>>>> be
>>>> sent in the next phase as well.
>>>>
>>>> The _enabled_bits and the callback function table will probably need a
>>>> similar synchronization as well. But the attached patch does not
>>>> implement
>>>> it and only ensures that no events are sent during the dead phase. I'd
>>>> like
>>>> to get the fix into openjdk in one form or another.
>>>>
>>>> Thanks,
>>>> Hiroshi
>>>>
>>>>
>>>>
>>>>
>>>>   Dan
>>>>
>>>>
>>>>
>>>>       Thanks,
>>>>       Hiroshi
>>>>
>>>>
>>>>
>
>