[jdk18] Integrated: 8273383: vmTestbase/vm/gc/containers/Combination05/TestDescription.java crashes verifying length of DCQS

Kim Barrett kbarrett at openjdk.java.net
Wed Jan 19 04:40:19 UTC 2022


On Sun, 16 Jan 2022 18:20:07 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this fix of an ABA problem in G1DirtyCardQueueSet's use of
> NonblockingQueue. 
> 
> Consider queue q contains one entry, `X`.
> 
> Thread1 begins a push/append operation. WOLG, assume a push of `Y`. It
> atomically exchanges `q._tail` with `Y`, retaining the old value of `q._tail`,
> which is `X`. Pause Thread1 here.
> 
> Thread2 pops `X` from the queue. It successfully changes `X._next` from
> end-marker to null and `q._head` from `X` to null. Thread2 fails to change
> `q._tail` from `X` to null, as Thread1 already changed it to `Y`.
> 
> Various threads add entries to the queue after `Y`, and various threads try to
> pop entries and find the queue "empty" because `q._head` will remain null
> until Thread1 proceeds.
> 
> In parallel with entries being added, Thread2 processes `X` and releases it
> for reuse. Some thread reallocates it, fills it, and pushes it onto the
> queue. We now have a chain from `Y` => `X` in `q._tail` with `q._head` still
> null.
> 
> If Thread1 resumes at that point, it finds `X._next` == end-marker,
> successfully changes it to `Y`, and "completes". We now have `q._head` == null
> and `q._tail` == `X`, with `X` => `Y` => `X`. The next push/append will
> observe `q._tail` of `X` having a non-end-marker `_next` and change both
> `q._head` and `q._tail`, dropping the list containing `X`, `Y`, &etc on the
> floor.
> 
> There may be other similarly bad scenarios, depending on exactly what is going
> on when Thread1 resumes.
> 
> All of these are fixed by performing the push operation in a critical section,
> so that `X` cannot be reused until the push that is observing it
> completes. That is, in `DirtyCardQueueSet::enqueue_completed_buffer`, put
> `_completed.push(*cbn);` in a GlobalCounter critical section.
> 
> Testing:
> mach5 tier1
> 
> Running the failing test with -XX:G1UpdateBufferSize=3 was failing at a rate
> of about 1/1K runs of the test.  With this change, there were no failures in
> more than 60K runs.

This pull request has now been integrated.

Changeset: 69cfa9cb
Author:    Kim Barrett <kbarrett at openjdk.org>
URL:       https://git.openjdk.java.net/jdk18/commit/69cfa9cb36ab2b5490c231c30306f682665faab4
Stats:     8 lines in 1 file changed: 6 ins; 0 del; 2 mod

8273383: vmTestbase/vm/gc/containers/Combination05/TestDescription.java crashes verifying length of DCQS

Reviewed-by: tschatzl, sjohanss

-------------

PR: https://git.openjdk.java.net/jdk18/pull/104



More information about the hotspot-gc-dev mailing list