G1 question: concurrent cleaning of dirty cards

Thu Jun 27 08:27:39 UTC 2013

Hi Igor,

We have seen crashes while testing our HotSpot 23-based SAPJVM with G1.
However, there's no evidence that these crashes are caused by this problem.
We basically found it by reading code.

Best regards,
Martin

From: Igor Veresov [mailto:iggy.veresov at gmail.com]
Sent: Thursday, June 27, 2013 08:15
To: Doerr, Martin
Cc: John Cuthbertson; hotspot-gc-dev at openjdk.java.net; Braun, Matthias
Subject: Re: G1 question: concurrent cleaning of dirty cards

Oh, re-read your letter, yup, there seems to be a problem. Have you observed that in practice?

igor

On Jun 26, 2013, at 9:27 PM, Igor Veresov <iggy.veresov at gmail.com> wrote:

The cards that are stored in the buffers are not available for concurrent processing as soon as they are enqueued. Instead, they are passed to the processing threads when the buffer fills up. Passing the buffer involves signaling a condition (literally something like pthread_cond_signal()), which certainly includes a write barrier. That guarantees that the cards in the buffer, the contents of the card table, and the contents of the object are "in sync".
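
The hand-off described above can be sketched roughly as follows. This is a minimal illustration in standard C++, not HotSpot's dirty card queue code: the names CardIndex, CompletedBuffers and LocalCardQueue are hypothetical, and a std::mutex with a std::condition_variable stands in for the pthread-style signaling mentioned in the message.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

using CardIndex = std::size_t;  // hypothetical: index of a card in the card table

// Shared set of full buffers. Locking and unlocking the mutex is what orders
// the producer's earlier writes before the consumer's later reads.
class CompletedBuffers {
 public:
  void push(std::vector<CardIndex> buf) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      full_.push_back(std::move(buf));
    }
    cv_.notify_one();  // wake one processing thread
  }

  // Blocks until a full buffer is available.
  std::vector<CardIndex> pop() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !full_.empty(); });
    std::vector<CardIndex> buf = std::move(full_.front());
    full_.pop_front();
    return buf;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> lock(mu_);
    return full_.size();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<std::vector<CardIndex>> full_;
};

// Per-thread queue: cards stay private until the buffer fills up.
class LocalCardQueue {
 public:
  LocalCardQueue(CompletedBuffers& shared, std::size_t capacity)
      : shared_(shared), capacity_(capacity) {
    assert(capacity > 0);
    buf_.reserve(capacity_);
  }

  void enqueue(CardIndex card) {
    buf_.push_back(card);
    if (buf_.size() == capacity_) {
      std::vector<CardIndex> full;
      full.swap(buf_);                // buf_ is now empty
      buf_.reserve(capacity_);
      shared_.push(std::move(full));  // hand off the whole buffer at once
    }
  }

  std::size_t pending() const { return buf_.size(); }

 private:
  CompletedBuffers& shared_;
  std::size_t capacity_;
  std::vector<CardIndex> buf_;
};
```

With a capacity of 4, enqueueing ten cards hands off two full buffers and leaves two cards pending in the local buffer, invisible to the processing threads until the buffer fills.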

The only place in the generated code where there has to be a store-store barrier (for non-TSO architectures) is between the actual field store and the dirtying of the card.
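
As a rough sketch of where that barrier sits, assuming a simplified post barrier written in standard C++ rather than generated code: the function name post_barrier_store and the card values kClean/kDirty are hypothetical, and a release fence is used because it is at least as strong as a store-store barrier.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical card values; the real encoding may differ.
constexpr std::uint8_t kClean = 0;
constexpr std::uint8_t kDirty = 1;

// Stores new_value into field and dirties the covering card if needed.
// Returns true if the caller should enqueue the card, false if the card
// was already seen as dirty.
bool post_barrier_store(std::atomic<void*>& field, void* new_value,
                        std::atomic<std::uint8_t>& card) {
  field.store(new_value, std::memory_order_relaxed);  // the actual field store

  if (card.load(std::memory_order_relaxed) == kDirty) {
    return false;  // already dirty: no second enqueue
  }

  // Keep the field store ordered before the card store. A release fence
  // orders earlier stores before later stores; it does NOT order the
  // field store before the card load above (no StoreLoad ordering).
  std::atomic_thread_fence(std::memory_order_release);
  card.store(kDirty, std::memory_order_relaxed);
  return true;
}
```

The first store through a clean card reports that the card needs enqueueing; a second store through the same, now dirty, card does not.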

Does this make sense?

igor

On May 23, 2013, at 6:12 AM, "Doerr, Martin" <martin.doerr at sap.com> wrote:

Hi John,

Thank you very much for your comments. Your last line explains exactly what we are concerned about.
Does anybody plan to prevent this situation?
I don't want to propose adding StoreLoad barriers to all G1 post barriers because I'd expect an undesirable performance impact.
Would it be feasible to rescan all cards which have been dirtied (at least once) during the next stop-the-world phase?
Maybe somebody has a better idea.

Kind regards,
Martin

From: John Cuthbertson [mailto:john.cuthbertson at oracle.com]
Sent: Thursday, May 23, 2013 02:29
To: Doerr, Martin
Cc: hotspot-gc-dev at openjdk.java.net; Mikael Gerdin; Braun, Matthias
Subject: Re: G1 question: concurrent cleaning of dirty cards

Hi Martin,

An enqueued card lets the refinement threads know that the oops spanned by that card need to be walked, but we're only interested in the latest contents of the fields in those oops. IOW the oop in (3') doesn't need to be the oop stored in (1). If there's a subsequent store (3) to the same location then we want the load at (3') to see the latest contents. For example, suppose we have:

x.f = a;
x.f = b;

If the application thread sees the card spanning x.f is dirty at the second store then we won't enqueue the card after the second store. As long as the refinement thread sees 'b' when the card is 'refined' then we're OK since we no longer need to add an entry into the RSet for the region containing a - we do need an entry in the RSet for the region containing b.

If the application thread sees the card as clean at the second store before the refinement thread loads x.f we have just needlessly enqueued the card again.

Only if the application thread sees the card as dirty but the refinement thread reads 'a' could there be a problem: we would have a missing RSet entry for 'b'.
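
The refinement side of these cases can be sketched as below. This is only an illustration in standard C++ under simplifying assumptions (one field per card, hypothetical names refine_card, kClean and kDirty), not the code in G1RemSet. The sequentially consistent fence plays the role of the StoreLoad barrier between cleaning the card and loading the field.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical card values; the real encoding may differ.
constexpr std::uint8_t kClean = 0;
constexpr std::uint8_t kDirty = 1;

// Refines one card that covers a single field. If the card was dirty, cleans
// it, reads the field's current contents into seen and returns true.
// Returns false (seen untouched) if the card was not dirty.
bool refine_card(std::atomic<std::uint8_t>& card,
                 const std::atomic<void*>& field, void*& seen) {
  if (card.load(std::memory_order_relaxed) != kDirty) {
    return false;  // nothing to do
  }
  card.store(kClean, std::memory_order_relaxed);  // clean the card first

  // StoreLoad: the card store above must be ordered before the field
  // load below, so a later mutator store finds the card clean again.
  std::atomic_thread_fence(std::memory_order_seq_cst);

  seen = field.load(std::memory_order_relaxed);   // latest contents wanted
  return true;
}
```

Whatever value lands in seen is the one that would get a remembered-set entry; the problematic case is the one where seen is still 'a' although the mutator has already stored 'b' and skipped the enqueue.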

JohnC

On 5/17/2013 1:29 AM, Doerr, Martin wrote:

Hi all,

We have a question about the interaction between G1 post barriers and the refinement thread's concurrent dirty card cleaning.
The case in which the G1 post barrier sees a clean card is obviously not problematic, because it will add an entry to a dirty card queue.
However, in the case in which the Java thread (mutator thread) sees the card already dirtied, it won't enqueue the card again. This is safe as long as its stored oop (1) is seen and processed (3') by the parallel refinement thread after it has cleaned the card (1'):

Java Thread (mutator)              Refinement Thread (G1RemSet::concurrentRefineOneCard_impl calls oops_on_card_seq_iterate_careful)

(1)  store(oop)
(StoreLoad required here?)
(2)  load(card==dirty)

                                   (1') store(card==clean)
                                   (2') StoreLoad barrier
                                   (3') load(oop)

So the refinement thread seems to rely on getting the oop which was written BEFORE the (2) load(card==dirty) was observed.
We wonder how this ordering is guaranteed. There are no StoreLoad barriers in the Java thread's path. (StoreLoad ordering needs explicit barriers even on TSO platforms.)
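
The concern in the diagram can be checked with a small toy model. It enumerates every interleaving of the two mutator steps and the two refinement steps, and reports whether the bad outcome (mutator sees the card dirty, refinement reads the old oop) is reachable. All names here are hypothetical, and modeling a missing StoreLoad barrier by swapping the mutator's two steps is a simplification of what a store buffer does.

```cpp
#include <array>

enum Op { StoreOop, LoadCard, CleanCard, LoadOop };

struct Result {
  bool saw_dirty;  // mutator's load (2) observed a dirty card
  bool read_new;   // refinement's load (3') observed the new oop
};

// Executes one total order of the four steps on a tiny shared state.
Result run(const std::array<Op, 4>& order) {
  bool oop_is_new = false;  // field initially holds the old oop
  bool card_dirty = true;   // card is already dirty from an earlier store
  Result r{false, false};
  for (Op op : order) {
    switch (op) {
      case StoreOop:  oop_is_new = true;        break;  // (1)
      case LoadCard:  r.saw_dirty = card_dirty; break;  // (2)
      case CleanCard: card_dirty = false;       break;  // (1')
      case LoadOop:   r.read_new = oop_is_new;  break;  // (3')
    }
  }
  return r;
}

// Tries all interleavings that keep each thread's own order and returns
// true if some interleaving loses the update.
bool lost_update_possible(const std::array<Op, 2>& mutator,
                          const std::array<Op, 2>& refiner) {
  for (int i = 0; i < 4; ++i) {
    for (int j = i + 1; j < 4; ++j) {  // slots i < j belong to the mutator
      std::array<Op, 4> order{};
      int next_refiner = 0;
      for (int slot = 0; slot < 4; ++slot) {
        if (slot == i)      order[slot] = mutator[0];
        else if (slot == j) order[slot] = mutator[1];
        else                order[slot] = refiner[next_refiner++];
      }
      Result res = run(order);
      if (res.saw_dirty && !res.read_new) return true;
    }
  }
  return false;
}
```

With the mutator's steps in program order (StoreOop, LoadCard) the model finds no losing interleaving. With the steps swapped (LoadCard, StoreOop), which is what StoreLoad reordering permits, the interleaving LoadCard, CleanCard, LoadOop, StoreOop loses the update.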

Kind regards,
Martin
