RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Aug 31 16:58:04 UTC 2015
Hi David,
Replies embedded below...
On 8/30/15 7:30 PM, David Holmes wrote:
> Hi Dan,
>
> On 29/08/2015 2:27 AM, Daniel D. Daugherty wrote:
>> On 8/28/15 8:45 AM, Tom Benson wrote:
>>> Hi,
>>> One more pair of eyes on this. 8^)
>>
>> Hi Tom!
>>
>> Thanks for reviewing and welcome to the party...
>>
>>
>>>
>>> On 8/27/2015 8:16 PM, Kim Barrett wrote:
>>>> On Aug 27, 2015, at 5:42 PM, Daniel D. Daugherty
>>>> <daniel.daugherty at oracle.com> wrote:
>>>> Sorry for starting another e-mail thread fork in an already
>>>> complicated review...
>>>> OK, that was fascinating. No, really, I mean it.
>>>>
>>>> It made me realize that we've been arguing and talking past each other
>>>> in part because we're really dealing with two distinct though closely
>>>> related bugs here.
>>>>
>>>> I've been primarily thinking about the case where we're calling
>>>> vm_abort / os::abort, where we presently delete the PerfData
>>>> memory even though there can be arbitrary other threads running. This
>>>> was the case in JDK-8129978, which is how I got involved here in the
>>>> first place. In that bug we were in vm_exit_during_initialization and
>>>> had called perfMemory_exit when some thread attempted to inflate a
>>>> monitor (which is not one of the conflicting cases discussed by Dan).
>>>>
>>>> The problem Dan has been looking at, JDK-8049304, is about a "normal"
>>>> VM shutdown. In this case, the problem is that we believe it is safe
>>>> to delete the PerfData, because we've safepointed, and yet some thread
>>>> unexpectedly runs and attempts to touch the deleted data anyway.
>>>>
>>>> I think Dan's proposed fix (mostly) avoids the specific instance of
>>>> JDK-8129978, but doesn't solve the more general problem of abnormal
>>>> exit deleting the PerfData while some running thread is touching some
>>>> non-monitor-related part of that data. My proposal to leave it to the
>>>> OS to deal with memory cleanup on process exit would deal with this
>>>> case.
>>>>
>>>> I think Dan's proposed fix (mostly) avoids problems like JDK-8049304.
>>>> And the approach I've been talking about doesn't help at all for this
>>>> case. But I wonder if Dan's proposed fix can be improved. A "futile
>>>> wakeup" case doesn't seem to me like one which requires super-high
>>>> performance. Would it be ok, in the two problematic cases that Dan
>>>> identified, to use some kind of atomic / locking protocol with the
>>>> cleanup? Or is the comment for the counter increment in EnterI (and
>>>> only there) correct that it's important to avoid a lock or atomics
>>>> here (and presumably in ReenterI too)?
>>>>
>>>
>>> I notice that EnterI/ReenterI both end with OrderAccess::fence(). Can
>>> the potential update of _sync_FutileWakeups be delayed until that
>>> point, to take advantage of the fence to make the sync hole even
>>> smaller?
>>
>> Not easily with EnterI() since there is one optional optimization
>> between the OM_PERFDATA_OP(FutileWakeups, inc()) call and the
>> OrderAccess::fence() call; delaying the increment past it would
>> result in lost FutileWakeups increments.
>>
>> Not easily in ReenterI() either: the OM_PERFDATA_OP(FutileWakeups, inc())
>> call is at the bottom of the for-loop, while the OrderAccess::fence()
>> call at the end of the function is outside the loop. This would also
>> result in lost FutileWakeups increments.
>>
>> So in ReenterI() the OM_PERFDATA_OP(FutileWakeups, inc()) call
>> immediately follows an OrderAccess::fence() call. Doesn't that make
>> that increment as "safe" as it can be without having a real lock?
>>
>>
>>> You've got a release() (and a short nap!) with the store in
>>> PerfDataManager::destroy() to try to close the window somewhat.
>>
>> Yes, I modeled that after:
>>
>> src/share/vm/runtime/perfMemory.cpp:
>>
>> 83 void PerfMemory::initialize() {
>> <snip>
>> 156 OrderAccess::release_store(&_initialized, 1);
>> 157 }
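The initialize-then-publish idiom that snippet uses can be sketched with C++11 atomics (hypothetical names, not HotSpot code): the release store publishes all earlier initialization writes, and it is fine if readers observe the 0 -> 1 flag transition lazily.

```cpp
#include <atomic>
#include <thread>

int g_perf_value = 0;               // stands in for the PerfData memory
std::atomic<int> g_initialized{0};  // models PerfMemory::_initialized

void initialize() {
  g_perf_value = 42;  // ...all the initialization work...
  // ~ OrderAccess::release_store(&_initialized, 1): no initialization
  // write above can be reordered past this store.
  g_initialized.store(1, std::memory_order_release);
}

int read_when_ready() {
  // The acquire load pairs with the release store: once the flag
  // reads 1, the initialization writes are guaranteed to be visible.
  while (g_initialized.load(std::memory_order_acquire) == 0) { /* spin */ }
  return g_perf_value;
}
```

A reader thread spinning in read_when_ready() always observes the fully initialized value once the flag flips.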
>>
>>
>>> But I think rather than the release_store() you used, you want a
>>> store, followed by a release(). release_store() puts a fence before
>>> the store to ensure earlier updates are seen before the current one,
>>> no?
>>
>> Yup, and I see I got my reasoning wrong. The code I modeled
>> is right because you want to flush all the inits and it's OK
>> if the _initialized transition from '0' -> '1' is lazily seen.
>>
>> For my shutdown use, we are transitioning from '1' -> '0' and
>> we need that to be seen proactively so:
>
> Nit: OrderAccess "barriers" enforce ordering constraints but don't in
> general provide any guarantees about visibility - ie they are not
> necessarily "flushes". So while it may be true on some platforms by
> virtue of the underlying barrier mechanism, in general they don't
> change when a write becomes visible and so there is nothing
> "proactive" about them.
My apologies for being sloppy with my wording.
What I'm trying to do is put up a barrier so that once the
deleting thread has changed the flag from '1' -> '0' another
thread trying to read the flag won't see the '1' when it
should see the '0'. In other words, I don't want the other
thread's read of the flag to float above the setting of
the flag to '0'.
>
>> OrderAccess::release_store(&_has_PerfData, 0);
>> OrderAccess::storeload();
>
> I agree with Tom that the release is unnecessary - though harmless.
I tried to switch the code to:
OrderAccess::store(&_has_PerfData, 0)
OrderAccess::storeload();
but that wouldn't compile on Solaris X64:
Error: static OrderAccess::store(volatile int*, int) is not accessible
from static PerfDataManager::destroy().
But then it occurred to me that I was trying too hard
to use OrderAccess functions...
> The real ordering constraint here is that we preserve:
>
> _has_PerfData = 0;
> <do actual deallocation>
I think I can switch to a straight "_has_PerfData = 0;" here and
not use either OrderAccess::release_store() or OrderAccess::store().
I also reread OrderAccess.hpp and convinced myself that
an "OrderAccess::fence()" call is the only way to be sure here.
I'm pretty sure that achieves 'storeload|storestore' barrier
semantics... Yes, I have a headache again... :-)
> for which a storeload|storestore barrier after the write seems most
> appropriate. Though with the insertion of the sleep after the write
> there won't be any reordering anyway so explicit barriers seem redundant.
I didn't think that os::naked_short_sleep() would do anything
to keep another thread's read from floating above the
"_has_PerfData = 0;". What did I miss?
My current plan:
- switch from "OrderAccess::release_store(&_has_PerfData, 0);"
to "_has_PerfData = 0;"
- keep "OrderAccess::fence();"
- retest and send out for another round of review
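The resulting shutdown-side sequence then amounts to a plain store followed by a full fence before the deallocation. A minimal sketch, again with C++11 atomics and hypothetical names in place of the real PerfDataManager code:

```cpp
#include <atomic>

std::atomic<int> g_shutdown_flag{1};  // models PerfDataManager::_has_PerfData
int* g_counters = new int[16]();      // models the PerfData counters

void destroy_sketch() {
  g_shutdown_flag.store(0, std::memory_order_relaxed);  // plain "_has_PerfData = 0;"
  std::atomic_thread_fence(std::memory_order_seq_cst);  // ~ OrderAccess::fence()
  // Only after the fence do we free, preserving the ordering
  //   _has_PerfData = 0;
  //   <do actual deallocation>
  delete[] g_counters;
  g_counters = nullptr;
}
```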
Dan
>
> Cheers,
> David
> -----
>
>> which is modeled after _owner field transitions from non-zero
>> -> NULL in ObjectMonitor.cpp
>>
>>
>>>
>>> Also, I think the comment above that release_store() could be
>>> clarified. It is fine as is if you're familiar with this bug report
>>> and discussion, but... I think it should explicitly say there is
>>> still a very small window for the lack of true synchronization to
>>> cause a failure. And perhaps that the release_store() (or
>>> store/release()) is not half of an acquire/release pair.
>>
>> Here's the existing comment:
>>
>> 286 // Clear the flag before we free the PerfData counters. Thus begins
>> 287 // the race between this thread and another thread that has just
>> 288 // queried PerfDataManager::has_PerfData() and gotten back 'true'.
>> 289 // The hope is that the other thread will finish its PerfData
>> 290 // manipulation before we free the memory. The two alternatives
>> 291 // are 1) leak the PerfData memory or 2) do some form of ordered
>> 292 // access before every PerfData operation.
>>
>> I think it pretty clearly states that there is still a race here.
>> And I think that option 2 covers that we're not doing completely
>> safe ordered access. I'm not sure how to make this comment more
>> clear, but if you have specific suggestions...
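The race that comment describes is a classic check-then-act window. A minimal illustration, loosely modeled on the OM_PERFDATA_OP guard (hypothetical names, C++11 atomics, not the real macro):

```cpp
#include <atomic>

std::atomic<int>  g_perfdata_flag{1};  // ~ PerfDataManager::_has_PerfData
std::atomic<long> g_inc_counter{0};    // one PerfData counter

// ~ OM_PERFDATA_OP(FutileWakeups, inc()): check the flag, then touch
// the counter. If the deleting thread clears the flag and frees the
// counters between the load and the increment, the increment touches
// freed memory -- that window is what the comment is hedging about.
void guarded_inc() {
  if (g_perfdata_flag.load(std::memory_order_acquire)) {
    g_inc_counter.fetch_add(1, std::memory_order_relaxed);  // <-- window here
  }
}
```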
>>
>> Dan
>>
>>
>>>
>>> Tom
>>