CMS parallel initial mark; vm thread sneaking in Thread::try_lock()

Thu Jun 13 20:46:50 UTC 2013

On Thu, Jun 13, 2013 at 4:45 AM, Thomas Schatzl
<thomas.schatzl at oracle.com>wrote:

> Hi,
>
> On Thu, 2013-06-13 at 20:33 +1000, David Holmes wrote:
> > Thomas,
> >
> > lock-sneaking is something very specific to the VMThread to avoid
> > deadlocks at safepoints. I really don't think we want to see it promoted
> > into the API for mutexes/monitors.
> >
> > I hadn't been paying this thread too much attention but now I'm starting
> > to think I need to.
>
>   the situation is (and I should have asked the runtime team more
> directly, and cc'ed the thread earlier, apologies) that in CMS there is
> the need to frequently record object boundaries into a buffer for
> performance reasons.
>
> There is a contribution that changes this sampling from being done
> concurrently to being done every time after a tlab refill. The problem
> is that this operation involves a few reads that must be atomic; so the
> change guards that operation with a critical section. It is also not
> critical that the operation is executed always, it may be skipped some
> times.
>
> So the change manually implements a try_lock()/unlock() idiom using a
> CAS and a global variable.
>
> The suggestion that tripped this thread has been to use existing VM
> code, in particular monitors, as they seem to have the requested
> functionality. There does not seem to be need for rolling own code here.
>
> The code in that critical section is very short, and straight line code;
> the critical section boiler plate code is larger than that...
>
> Then there has been the concern about that try_lock() may be sneaked by
> the vm thread; and some suggestions to put these concerns to rest.
>
> See
>
> http://cr.openjdk.java.net/~hiroshi/webrevs/edenchunks/webrev.00/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp.udiff.html,
> in the sample_eden_chunk() method.
>
> So maybe could you confirm the statements made so far:
> - Monitor::try_lock()/::unlock() is equivalent to what the change does
> in this case.
>
> (I had a look at the unlock() code, and it's a lot more complicated than
> the equivalent code in the change. It boils down to the same behavior
> however?)
>
> - that new sampling code is supposed to be only every executed while the
> mutator is running. Sneaking can only occur during safepoints; so it's
> use here would be "safe" in that there will be no sneaking at all. There
> is no case when the suggested assert(!
> SafepointSynchronize::is_at_safepoint()) would fire?
>
> Other than that some other ideas to avoid execution of the sneaking code
> were thrown into the room. The assert should be fine, but it would be
> better to avoid the possibility of sneaking directly. The latest
> suggestion involved adding a no-sneaking-variant of try_lock()
> ("try_lock_no_sneak()") that is called by the original one to keep exact
> semantics (except for the additional call).
> Nobody intends to change anything in that code without your input :)
> everyone is aware that this is tricky code with possible unintended side
> effects.
>
> The relevant thread starts here, most of it having been cc'ed to
> runtime-dev:
> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-June/007510.html
>
> Thanks for looking at this issue,
>   Thomas
>
>
>
Thomas, thanks for summarizing the discussion so far.

I agree that if try_lock()/unlock() does what the manual/custom
synchronization does (and as efficiently), it should suffice. No need to
roll one's own.

I also agree that adding the
assert(!SafepointSynchronize::is_at_safepoint()) (or
assert(!Thread->current()->is_VM_thread())) may be good enough and if
sneaking can be turned off for a case like this, it'd be safer.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20130613/3dda6eb4/attachment.htm>