RFR: 8237143: Eliminate DirtyCardQ_cbl_mon

Kim Barrett kim.barrett at oracle.com
Fri Jan 31 22:25:34 UTC 2020

> On Jan 23, 2020, at 3:10 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>> On Jan 22, 2020, at 11:12 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>> On 16.01.20 09:51, Kim Barrett wrote:
>>> Please review this change to eliminate the DirtyCardQ_cbl_mon.  This
>>> is one of the two remaining super-special "access" ranked mutexes.
>>> (The other is the Shared_DirtyCardQ_lock, whose elimination is covered
>>> by JDK-8221360.)
>>> There are three main parts to this change.
>>> (1) Replace the under-a-lock FIFO queue in G1DirtyCardQueueSet with a
>>> lock-free FIFO queue.
>>> (2) Replace the use of a HotSpot monitor for signaling activation of
>>> concurrent refinement threads with a semaphore-based solution.
>>> (3) Handle pausing of buffer refinement in the middle of a buffer in
>>> order to handle a pending safepoint request.  This can no longer just
>>> push the partially processed buffer back onto the queue, due to ABA
>>> problems now that the buffer is lock-free.
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8237143
>>> Webrev:
>>> https://cr.openjdk.java.net/~kbarrett/8237143/open.00/
>>> Testing:
>>> mach5 tier1-5
>>> Normal performance testing showed no significant change.
>>> specjbb2015 on a very big machine showed a 3.5% average critical-jOPS
>>> improvement, though not statistically significant; removing contention
>>> for that lock by many hardware threads may be a little bit noticeable.
>> initial comments only, and so far only about comments :( The code itself looks good to me, but I want to look over it again.
> After some offline discussion with Thomas, I’m doing some restructuring that
> makes it probably not very efficient for anyone else to do a careful review of
> the open.00 version.

Here's a new webrev:


mach5 tier1-5
Performance testing showed no significant change.

I didn't bother providing an incremental webrev, because the changes
to g1DirtyCardQueue.[ch]pp are pretty substantial.  Those are the only
files changed, except for the suggested move of the comment for
G1ConcurrentRefineThread::maybe_deactivate and some related comment
improvements nearby.

Most of this round of changes are refactoring within G1DirtyCardQueueSet,
mainly adding internal helper classes for the FIFO queue and for the paused
buffers, each with their own (commented) APIs.  I think that has addressed a
lot of Thomas's comments about the comments, and I hope has made the code
easier to understand.

I've also improved the mechanism for handling "paused" buffers, simplifying
it by making better use of some invariants.

> On Jan 22, 2020, at 11:12 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> // The key idea to make this work is that pop (get_completed_buffer)
> // never returns an element of the queue if it is the only accessible
> // element,

> If I understand this correctly, maybe "if there is only one buffer in the FIFO" is easier to understand than "only accessible element". (or define "accessible element”).

I specifically don't want to say it that way because we could have a
situation like

(1) Start with a queue having exactly one element.

(2) Thread1 starts a push by updating tail, but has not yet linked the old
tail to the new.

(3) Thread2 performs a push.

The buffer pushed by Thread2 is "in the queue" by some reasonable
definition, so the queue contains two buffers.  But that buffer is not yet
accessible, because Thread1 hasn't completed its push.  The alternative is
to (in the description) somehow divorce a completed push from the notion of
the number of buffers in the queue, which seems worse to me.  I expanded the
discussion a bit though, including what is meant by "accessible".

> The code seems to unnecessarily use the NULL_buffer constant. Maybe use it here too. Overall I am not sure about the usefulness of using NULL_buffer in the code. The NULL value in Hotspot code is generally accepted as a special value, and the name "NULL_buffer" does not seem to add any information.

The point of NULL_buffer was to avoid casts of NULL in Atomic operations,
and I then used it consistently.  But I've changed to using such casts,
since it turned out there weren't that many and we can get rid of those
uniformly here and elsewhere when we have C++11 nullptr and nullptr_t.

More information about the hotspot-gc-dev mailing list