G1 question: concurrent cleaning of dirty cards

Igor Veresov iggy.veresov at gmail.com
Sat Jun 29 01:00:42 UTC 2013


Oh, yes, that wouldn't work as it is with the safepoint scheme. On the other hand, the mutators may still do the processing part, but only by picking a buffer from the already snapshotted queue, not the current one. The original plan was to use mutator processing as a throttling mechanism. Maybe it's still useful? I don't know.
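
Something like this, roughly. This is a hypothetical sketch with made-up names, not the real queue code:

  #include <deque>
  #include <mutex>

  // Hypothetical types; the point is only which queue the mutator touches.
  struct CardBuffer { /* cards waiting to be refined */ };

  std::deque<CardBuffer*> snapshot_queue;   // frozen at the last safepoint
  std::mutex              snapshot_lock;

  void refine(CardBuffer* buf) { (void)buf; /* scan cards, update rem sets */ }

  // A mutator asked to help only takes work from the frozen snapshot,
  // never from the queue still being appended to. That keeps the
  // throttling effect without racing against in-flight stores.
  bool mutator_help_refine() {
    CardBuffer* buf = nullptr;
    {
      std::lock_guard<std::mutex> g(snapshot_lock);
      if (snapshot_queue.empty()) return false;   // nothing snapshotted yet
      buf = snapshot_queue.front();
      snapshot_queue.pop_front();
    }
    refine(buf);
    return true;
  }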

igor 

On Jun 28, 2013, at 4:06 PM, John Cuthbertson <john.cuthbertson at oracle.com> wrote:

> Hi Igor.
> 
> You misunderstood me. I meant that if we use safepoints to refine cards, all of the code that currently supports refinement by mutators can be removed. That's all.
> 
> JohnC
> 
> On 6/28/2013 4:02 PM, Igor Veresov wrote:
>> The mutator processing doesn't solve it. The card clearing event is still asynchronous with respect to possible mutations in other threads. While one mutator thread is processing buffers and clearing cards, another can sneak in and do a store to the same object that will go unnoticed. So I'm afraid it's either a store-load barrier, or we need to stop all mutator threads to prevent this race, or worse.
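>> 
>> To make the window concrete, here is a schematic of the two sides in C++ atomics. It's illustrative, not the actual barrier code:
>> 
>>   #include <atomic>
>> 
>>   const int DIRTY = 1, CLEAN = 0;          // values illustrative only
>>   std::atomic<int>   card(DIRTY);          // one card guarding one word
>>   std::atomic<void*> heap_word(nullptr);
>> 
>>   // Refiner: clean the card, then re-examine the memory it covers.
>>   void refiner_clean_and_rescan() {
>>     card.store(CLEAN, std::memory_order_relaxed);        // (1) clear
>>     std::atomic_thread_fence(std::memory_order_seq_cst); // StoreLoad
>>     void* v = heap_word.load(std::memory_order_relaxed); // (2) rescan
>>     (void)v;  // ... record any cross-region reference ...
>>   }
>> 
>>   // Mutator post barrier: do the store, then check the card.
>>   void mutator_store(void* ref) {
>>     heap_word.store(ref, std::memory_order_relaxed);     // (A) store
>>     std::atomic_thread_fence(std::memory_order_seq_cst); // StoreLoad
>>     if (card.load(std::memory_order_relaxed) == CLEAN) { // (B) clean?
>>       card.store(DIRTY, std::memory_order_relaxed);      // (C) redirty
>>       // ... enqueue the card for refinement ...
>>     }
>>   }
>> 
>> Without the two fences, (B) can read a stale "dirty" while (2) misses (A): the mutator skips the enqueue and the store is never refined. With them, at least one side must see the other's write.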
>> 
>> igor
>> 
>> 
>> On Jun 28, 2013, at 1:53 PM, John Cuthbertson <john.cuthbertson at oracle.com> wrote:
>> 
>>> Hi Igor,
>>> 
>>> Yeah, G1 has that facility right now. In fact you added it. :) When the number of completed buffers is below the green zone upper limit, none of the refinement threads are refining buffers. That is, the green zone upper limit is the number of buffers that we expect to be able to process during the GC without going over some percentage of the pause time (I think the default is 10%). When the number of buffers grows above the green zone upper limit, the refinement threads start processing the buffers in a stepped manner.
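>>> 
>>> Roughly like this, with made-up constants (the real thresholds come from the policy):
>>> 
>>>   // Sketch of the stepped activation; names and numbers are invented.
>>>   const int GREEN_ZONE = 16;  // buffers the GC pause is expected to absorb
>>>   const int STEP       = 8;   // extra backlog that wakes one more thread
>>> 
>>>   // Thread t starts refining once the backlog exceeds its own threshold,
>>>   // so the refinement threads come on line one by one as backlog grows.
>>>   bool thread_should_refine(int thread_id, int completed_buffers) {
>>>     return completed_buffers > GREEN_ZONE + thread_id * STEP;
>>>   }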
>>> 
>>> So during the safepoint we would process N - green-zone-upper-limit completed buffers. In fact we could have a watcher task that monitors the number of completed buffers and triggers a safepoint when the number of completed buffers becomes sufficiently high - say above the yellow-zone upper limit.
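>>> 
>>> The watcher could look something like this; the names are hypothetical, and the VM operation it schedules is imagined, not an existing one:
>>> 
>>>   #include <atomic>
>>> 
>>>   const int GREEN_ZONE  = 16;   // residue left for the GC pause
>>>   const int YELLOW_ZONE = 64;   // backlog that should force a safepoint
>>> 
>>>   std::atomic<int> completed_buffers(0);  // bumped on enqueue, dropped on refine
>>> 
>>>   void schedule_refine_cards_safepoint(int buffers_to_process);  // imagined
>>> 
>>>   // Periodic task: once the backlog crosses the yellow zone, request a
>>>   // safepoint that processes (count - GREEN_ZONE) buffers.
>>>   void watcher_tick() {
>>>     int n = completed_buffers.load();
>>>     if (n > YELLOW_ZONE) {
>>>       schedule_refine_cards_safepoint(n - GREEN_ZONE);
>>>     }
>>>   }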
>>> 
>>> That does away with the whole notion of concurrent refinement, but it would remove a lot of the nasty, complicated code that gets executed by the mutators or refinement threads.
>>> 
>>> My main concern is that we would potentially be increasing the number and duration of non-GC safepoints, which cause issues with latency-sensitive apps. For those workloads that only care about 90% of the transactions, this approach would probably be fine.
>>> 
>>> We would need to evaluate the performance of each approach. 
>>> 
>>> The card cache delays the processing of cards that have been dirtied multiple times, so it does act kind of like a buffer, reducing the potential for this issue.
>>> 
>>> JohnC
>>> 
>>> On 6/28/2013 12:47 PM, Igor Veresov wrote:
>>>> The impact on the next collection, however, can be bounded. Say, if you make it have a safepoint to reap the buffers when the number of buffers reaches n, that alone would put a cap on the potential pause incurred during the collection. The card cache currently has the same effect, sort of, right?
>>>> 
>>>> igor
>>>> 
>>>> On Jun 28, 2013, at 12:26 PM, John Cuthbertson <john.cuthbertson at oracle.com> wrote:
>>>> 
>>>>> Hi Igor,
>>>>> 
>>>>> On 6/28/2013 9:47 AM, Igor Veresov wrote:
>>>>>> 
>>>>>> On Jun 28, 2013, at 7:08 AM, "Doerr, Martin" <martin.doerr at sap.com> wrote:
>>>>>> 
>>>>>>> Hi Igor,
>>>>>>>  
>>>>>>> we didn’t find an easy and feasible way to ensure the ordering, either.
>>>>>>> Grabbing the buffers and cleaning the cards at safepoints might be the best solution.
>>>>>> 
>>>>>> Would anybody from the G1 team like to think about that?
>>>>> 
>>>>> I've been thinking about this issue on and off for the last few weeks when I get the time. I mentioned it to Vladimir a couple of times to get his input.
>>>>> 
>>>>>>>  
>>>>>>> Maybe removing the barrier that flushes the store to the cardtable makes the problem more likely to occur.
>>>>>>> I guess the purpose of the barrier was exactly to avoid this problem
>>>>>>> (which would work perfectly if the post barriers had StoreLoad barriers, too).
>>>>>>>  
>>>>>> 
>>>>>> Yeah, but like you noted, that would have a horrific effect on performance. So it's probably best to bunch the work up, to at least eliminate the need for extra work when, say, you're looping and storing to a limited working set (G1 uses the cardtable basically for that purpose). The safepoint approach will likely require more memory for buffers, and the load will be spiky; if the collection were to happen right after we grabbed the buffers, the collector would have to process all of them, which is not going to work well for predictability. But nothing better comes to mind at this point.
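>>>>>> 
>>>>>> A simplified filtering barrier shows the batching effect (card size as in HotSpot, everything else illustrative; assume cards start clean):
>>>>>> 
>>>>>>   #include <cstdint>
>>>>>>   #include <cstddef>
>>>>>> 
>>>>>>   const int     CARD_SHIFT = 9;         // 512-byte cards, as in HotSpot
>>>>>>   const size_t  NUM_CARDS  = 1 << 20;
>>>>>>   const uint8_t CLEAN = 0, DIRTY = 1;   // values illustrative only
>>>>>>   uint8_t card_table[NUM_CARDS];        // zero-initialized, i.e. clean
>>>>>> 
>>>>>>   void enqueue_for_refinement(uint8_t* card);   // assumed to exist
>>>>>> 
>>>>>>   void post_barrier(void* field_addr) {
>>>>>>     size_t idx = ((uintptr_t)field_addr >> CARD_SHIFT) % NUM_CARDS;
>>>>>>     if (card_table[idx] != DIRTY) {     // first store into this card?
>>>>>>       card_table[idx] = DIRTY;
>>>>>>       enqueue_for_refinement(&card_table[idx]);  // paid once per card
>>>>>>     }
>>>>>>     // A loop storing into the same few cards takes only the cheap
>>>>>>     // check afterwards: the card table batches the work per 512 bytes.
>>>>>>   }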
>>>>>> Btw, there are already periodic safepoints to do biased locking revocations, so maybe it would make sense to piggyback on that.
>>>>> 
>>>>> Piggybacking on all the other safepoint operations might work if they happen frequently enough, but I don't know if that's the case. And as you say, even then there will be times when we haven't had a safepoint for a while and will have a ton of buffers to process at the start of the pause.
>>>>> 
>>>>> It might be worth adding a suitable memory barrier to the G1 post write barrier and evaluating the throughput hit.
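>>>>> 
>>>>> A standalone microbenchmark along these lines (not HotSpot code) would give a first feel for the cost of a full fence per store before wiring one into the barrier:
>>>>> 
>>>>>   #include <atomic>
>>>>>   #include <chrono>
>>>>>   #include <cstdio>
>>>>> 
>>>>>   volatile long sink[64];  // small working set, like a hot store loop
>>>>> 
>>>>>   template <bool FENCE>
>>>>>   long run_ns(long iters) {
>>>>>     auto t0 = std::chrono::steady_clock::now();
>>>>>     for (long i = 0; i < iters; i++) {
>>>>>       sink[i & 63] = i;                  // the "store"
>>>>>       if (FENCE)
>>>>>         std::atomic_thread_fence(std::memory_order_seq_cst);
>>>>>     }
>>>>>     auto t1 = std::chrono::steady_clock::now();
>>>>>     return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
>>>>>   }
>>>>> 
>>>>>   int main() {
>>>>>     const long iters = 100000000L;
>>>>>     std::printf("no fence: %ld ns\n", run_ns<false>(iters));
>>>>>     std::printf("fenced:   %ld ns\n", run_ns<true>(iters));
>>>>>     return 0;
>>>>>   }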
>>>>> 
>>>>> JohnC
>>>> 
>>> 
>> 
> 
