CMS parallel initial mark; vm thread sneaking in Thread::try_lock()

Fri Jun 14 00:17:51 PDT 2013

Hi Thomas,

On 13/06/2013 9:45 PM, Thomas Schatzl wrote:
> On Thu, 2013-06-13 at 20:33 +1000, David Holmes wrote:
>> Thomas,
>>
>> lock-sneaking is something very specific to the VMThread to avoid
>> deadlocks at safepoints. I really don't think we want to see it promoted
>> into the API for mutexes/monitors.
>>
>> I hadn't been paying this thread too much attention but now I'm starting
>> to think I need to.
>
>    the situation is (and I should have asked the runtime team more
> directly, and cc'ed the thread earlier, apologies) that in CMS there is
> the need to frequently record object boundaries into a buffer for
> performance reasons.
>
> There is a contribution that changes this sampling from being done
> concurrently to being done every time after a tlab refill. The problem
> is that this operation involves a few reads that must be atomic; so the
> change guards that operation with a critical section. It is also not
> critical that the operation is executed always, it may be skipped some
> times.
>
> So the change manually implements a try_lock()/unlock() idiom using a
> CAS and a global variable.

This is a simple and efficient approach if you are happy to skip 
sampling when others are doing it. A couple of things about the way you 
implemented this though:

a) the variable should be a jint. Using a jbyte is unlikely to save 
space because it will need to be correctly aligned on some platforms. 
Conversely atomic ops on unaligned subwords, if allowed, can take a 
performance hit. And in the worst case attempting a CAS on an unaligned 
jbyte could cause a crash! Also it should be marked volatile.

b) there is no need to use CAS to restore the variable to zero. But you 
would need to use a store_fence, as done in the sync code.

> The suggestion that tripped this thread has been to use existing VM
> code, in particular monitors, as they seem to have the requested
> functionality. There does not seem to be need for rolling own code here.

True, try_lock and unlock are functionally equivalent but may be less 
performant due to the added support for blocking that you don't need.

> The code in that critical section is very short, and straight line code;
> the critical section boiler plate code is larger than that...
>
> Then there has been the concern about that try_lock() may be sneaked by
> the vm thread; and some suggestions to put these concerns to rest.
>
> See
> http://cr.openjdk.java.net/~hiroshi/webrevs/edenchunks/webrev.00/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp.udiff.html ,
> in the sample_eden_chunk() method.
>
> So maybe could you confirm the statements made so far:
> - Monitor::try_lock()/::unlock() is equivalent to what the change does
> in this case.
>
> (I had a look at the unlock() code, and it's a lot more complicated than
> the equivalent code in the change. It boils down to the same behavior
> however?)

Yes.

> - that new sampling code is supposed to be only every executed while the
> mutator is running. Sneaking can only occur during safepoints; so it's
> use here would be "safe" in that there will be no sneaking at all. There
> is no case when the suggested assert(!
> SafepointSynchronize::is_at_safepoint()) would fire?

That seems reasonable. But I think you are misunderstanding what 
sneaking does. It isn't dangerous, you don't get multiple threads inside 
a critical region or anything strange like that - as long as the other 
threads using it will stop at safepoints of course! With 
monitors/mutexes a point is reached where the lock has been claimed but 
the new owner then blocks at a safepoint. So externally it looks like 
the lock is owned but the owner has not yet entered the critical region. 
If the VMThread needed that same lock there would be a deadlock. So the 
VMThread is allowed to "sneak" the lock, effectively taking ownership of 
it as-if the real owner blocked just before it could take ownership. The 
real owner can't wake up and enter the critical section until the 
safepoint is over and the VMThread will release the lock before it ends 
the safepoint. Note that any lock needed by the VMThread at a safepoint 
should not be used to guard critical sections in which other safepoint 
checks occur, else you would again get a deadlock. (And these locks are 
not reentrant.)

Cheers,
David
------

> Other than that some other ideas to avoid execution of the sneaking code
> were thrown into the room. The assert should be fine, but it would be
> better to avoid the possibility of sneaking directly. The latest
> suggestion involved adding a no-sneaking-variant of try_lock()
> ("try_lock_no_sneak()") that is called by the original one to keep exact
> semantics (except for the additional call).
> Nobody intends to change anything in that code without your input :)
> everyone is aware that this is tricky code with possible unintended side
> effects.
>
> The relevant thread starts here, most of it having been cc'ed to
> runtime-dev:
> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-June/007510.html
>
> Thanks for looking at this issue,
>    Thomas
>
>