RFR (XS): 7114303: G1: assert(_g1->mark_in_progress()) failed: shouldn't be here otherwise

Wed Nov 23 19:37:58 UTC 2011

Hi Bengt, Tony,

Thanks for the review. Good point about 
record_satb_drain_processed_buffers().

The maximum number of buffers that will be processed  by this code is 
the # of  filled SATB buffers + # of Java threads + 1. Each Java thread 
has an embedded SATB pointer queue. In some threads, however, these 
could be empty. Likewise with the shared SATB pointer queue (used by 
non-Java threads)

Obtaining the number of completed SATB buffers that the drain closure is 
applied to is fairly straightforward but do we count empty buffers?

I'm not sure how important the number of buffers processed is currently 
- given that it's currently serial code. It would be more important if 
the code were parallel as then we could  see the distribution.  I guess 
the number of buffers processed could tell us if marking is  not keeping 
up with the amount of mutation but we get that indirectly from the time.

Anwyay I've removed the dead code. New webrev here: 
http://cr.openjdk.java.net/~johnc/7114303/webrev.1/

Thanks,

JohnC

On 11/23/11 02:26, Tony Printezis wrote:
> Bengt,
>
> That methods looks indeed not to be called, which means that 
> _last_satb_drain_processed_buffers is never set (I can't find any 
> assignments to it) and the reported SATB processed buffers
>
>       print_stats(2, "Processed Buffers", 
> _last_satb_drain_processed_buffers);
>
> will probably always be 0. So, if we think that this output is 
> important, we should make sure we actually 
> record_satb_drain_processed_buffers() it with the appropriate number. 
> If we don't, we should also remove _last_satb_drain_processed_buffers 
> and related code.
>
> Tony
>
> On 11/23/2011 02:44 AM, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> Looks good to me too.
>>
>> It seems like the method "void 
>> record_satb_drain_processed_buffers(int processed_buffers)" in 
>> g1CollectorPolicy.hpp is dead code. I found it since it contains the 
>> same assert that your CR reported.
>>
>> Any chance of removing that method as well as part of your change?
>>
>> Bengt
>>
>> On 2011-11-22 21:45, John Cuthbertson wrote:
>>> Hi Everyone,
>>>
>>> Can I have a couple of volunteers review this small change? The 
>>> webrev can be found at: 
>>> http://cr.openjdk.java.net/~johnc/7114303/webrev.0/
>>>
>>> Summary:
>>> There is a race between the concurrent mark thread updating the mark 
>>> in progress flag and it being read by the VM thread during an 
>>> evacuation pause. This is because, at the time the concurrent mark 
>>> thread updates the flag, it is not part of the suspendible thread 
>>> set and so is not blocked. The solution is to have the concurrent 
>>> mark thread join the STS just before updating the flag.
>>>
>>> The race was confirmed by inserting a small delay immediately before 
>>> calling G1CollectorPolicy::record_satb_drain_time(). The delay 
>>> increased failure rate to 1/5 for a 20ms delay and 4/5 for a 50ms 
>>> delay.
>>>
>>> Testing: a few thousand iterations of the failing test case with a 
>>> JVM that included both fix and delay.
>>>
>>> Thanks,
>>>
>>> JohnC
>>