RFR (XS): 7114303: G1: assert(_g1->mark_in_progress()) failed: shouldn't be here otherwise

Wed Nov 23 22:15:01 UTC 2011

John, looks good. Ship it.

Tony

On 11/23/2011 02:37 PM, John Cuthbertson wrote:
> Hi Bengt, Tony,
>
> Thanks for the review. Good point about 
> record_satb_drain_processed_buffers().
>
> The maximum number of buffers that will be processed  by this code is 
> the # of  filled SATB buffers + # of Java threads + 1. Each Java 
> thread has an embedded SATB pointer queue. In some threads, however, 
> these could be empty. Likewise with the shared SATB pointer queue 
> (used by non-Java threads)
>
> Obtaining the number of completed SATB buffers that the drain closure 
> is applied to is fairly straightforward but do we count empty buffers?
>
> I'm not sure how important the number of buffers processed is 
> currently - given that it's currently serial code. It would be more 
> important if the code were parallel as then we could  see the 
> distribution.  I guess the number of buffers processed could tell us 
> if marking is  not keeping up with the amount of mutation but we get 
> that indirectly from the time.
>
> Anwyay I've removed the dead code. New webrev here: 
> http://cr.openjdk.java.net/~johnc/7114303/webrev.1/
>
> Thanks,
>
> JohnC
>
>
> On 11/23/11 02:26, Tony Printezis wrote:
>> Bengt,
>>
>> That methods looks indeed not to be called, which means that 
>> _last_satb_drain_processed_buffers is never set (I can't find any 
>> assignments to it) and the reported SATB processed buffers
>>
>>       print_stats(2, "Processed Buffers", 
>> _last_satb_drain_processed_buffers);
>>
>> will probably always be 0. So, if we think that this output is 
>> important, we should make sure we actually 
>> record_satb_drain_processed_buffers() it with the appropriate number. 
>> If we don't, we should also remove _last_satb_drain_processed_buffers 
>> and related code.
>>
>> Tony
>>
>> On 11/23/2011 02:44 AM, Bengt Rutisson wrote:
>>>
>>> Hi John,
>>>
>>> Looks good to me too.
>>>
>>> It seems like the method "void 
>>> record_satb_drain_processed_buffers(int processed_buffers)" in 
>>> g1CollectorPolicy.hpp is dead code. I found it since it contains the 
>>> same assert that your CR reported.
>>>
>>> Any chance of removing that method as well as part of your change?
>>>
>>> Bengt
>>>
>>> On 2011-11-22 21:45, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers review this small change? The 
>>>> webrev can be found at: 
>>>> http://cr.openjdk.java.net/~johnc/7114303/webrev.0/
>>>>
>>>> Summary:
>>>> There is a race between the concurrent mark thread updating the 
>>>> mark in progress flag and it being read by the VM thread during an 
>>>> evacuation pause. This is because, at the time the concurrent mark 
>>>> thread updates the flag, it is not part of the suspendible thread 
>>>> set and so is not blocked. The solution is to have the concurrent 
>>>> mark thread join the STS just before updating the flag.
>>>>
>>>> The race was confirmed by inserting a small delay immediately 
>>>> before calling G1CollectorPolicy::record_satb_drain_time(). The 
>>>> delay increased failure rate to 1/5 for a 20ms delay and 4/5 for a 
>>>> 50ms delay.
>>>>
>>>> Testing: a few thousand iterations of the failing test case with a 
>>>> JVM that included both fix and delay.
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>
>