RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Oct 20 14:29:15 UTC 2015


Carsten,

It's been a long time! Thanks for the quick review...

Replies embedded below...

On 10/19/15, 10:09 PM, Carsten Varming wrote:
> Dear Dan,
>
> I thought acquire/release semantics mandated that the release should 
> be paired with an acquire.

Yes, src/share/vm/runtime/orderAccess.hpp is pretty clear about
that. I modeled this fix after some other ObjectMonitor code
that doesn't (yet) follow that mandate.


> In this case, the first line in verify_objmon_isinpool should be 
> changed from "PaddedEnd<ObjectMonitor> * block = 
> (PaddedEnd<ObjectMonitor> *)gBlockList" to use a load_acquire. 
> Actually, it seems like there are a lot of reads of gBlockList that 
> should be replaced with a load_acquire in synchronizer.cpp.

Yes, that would be a good change and cleanup. The really annoying
part is that the current fix seems to have resolved the bug...

Current numbers: 700K iterations of 4 fastdebug instances in parallel
in 4.5 days. I typically saw one failure per day in a three day run...


>
> Otherwise the change looks great.

Thanks!

Dan


>
> Carsten
> (not a reviewer).
>
> On Mon, Oct 19, 2015 at 8:02 PM, Daniel D. Daugherty 
> <daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>> wrote:
>
>     Greetings,
>
>     I have a fix for a long standing race between the lock-free
>     ObjectMonitor
>     verification code and the normal (locked) ObjectMonitor block
>     allocation
>     code path. For this fix, I would like at least a Runtime team reviewer
>     and a Serviceability team reviewer. Thanks!
>
>     JDK-8047212
>     runtime/ParallelClassLoading/bootstrap/random/inner-complex
>                
>     assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed:
>                 monitor is invalid
>     https://bugs.openjdk.java.net/browse/JDK-8047212
>
>     Webrev URL:
>     http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/
>     <http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/0-jdk9-hs-rt/>
>
>     Testing: Aurora Adhoc RT-SVC nightly batch
>              4 inner-complex fastdebug parallel runs for 4+ days and
>                600K iterations without seeing this failure; the experiment
>                is still running; final results to be reported at the end
>                of the review cycle
>              JPRT -testset hotspot
>
>     This fix:
>
>     - makes ObjectMonitor::gBlockList volatile
>     - uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to
>       make sure the new block updates _happen before_ gBlockList is
>       changed to refer to the new block
>     - add SA support for a "static pointer volatile" field like:
>
>         static ObjectMonitor * volatile gBlockList;
>
>     See the following link for a nice description of what "volatile"
>     means in the different positions on a variable/parameter decl line:
>
>     http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword
>
>     Thanks, in advance, for any comments, questions or suggestions.
>
>     Dan
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20151020/d7311531/attachment.html>


More information about the serviceability-dev mailing list