RFR (XS): 7114303: G1: assert(_g1->mark_in_progress()) failed: shouldn't be here otherwise

Wed Nov 23 22:53:24 UTC 2011

John,

I agree with Tony. Looks good. Ship it!

Bengt

On 2011-11-23 23:15, Tony Printezis wrote:
> John, looks good. Ship it.
>
> Tony
>
> On 11/23/2011 02:37 PM, John Cuthbertson wrote:
>> Hi Bengt, Tony,
>>
>> Thanks for the review. Good point about 
>> record_satb_drain_processed_buffers().
>>
>> The maximum number of buffers that will be processed  by this code is 
>> the # of  filled SATB buffers + # of Java threads + 1. Each Java 
>> thread has an embedded SATB pointer queue. In some threads, however, 
>> these could be empty. Likewise with the shared SATB pointer queue 
>> (used by non-Java threads)
>>
>> Obtaining the number of completed SATB buffers that the drain closure 
>> is applied to is fairly straightforward but do we count empty buffers?
>>
>> I'm not sure how important the number of buffers processed is 
>> currently - given that it's currently serial code. It would be more 
>> important if the code were parallel as then we could  see the 
>> distribution.  I guess the number of buffers processed could tell us 
>> if marking is  not keeping up with the amount of mutation but we get 
>> that indirectly from the time.
>>
>> Anwyay I've removed the dead code. New webrev here: 
>> http://cr.openjdk.java.net/~johnc/7114303/webrev.1/
>>
>> Thanks,
>>
>> JohnC
>>
>>
>> On 11/23/11 02:26, Tony Printezis wrote:
>>> Bengt,
>>>
>>> That methods looks indeed not to be called, which means that 
>>> _last_satb_drain_processed_buffers is never set (I can't find any 
>>> assignments to it) and the reported SATB processed buffers
>>>
>>>       print_stats(2, "Processed Buffers", 
>>> _last_satb_drain_processed_buffers);
>>>
>>> will probably always be 0. So, if we think that this output is 
>>> important, we should make sure we actually 
>>> record_satb_drain_processed_buffers() it with the appropriate 
>>> number. If we don't, we should also remove 
>>> _last_satb_drain_processed_buffers and related code.
>>>
>>> Tony
>>>
>>> On 11/23/2011 02:44 AM, Bengt Rutisson wrote:
>>>>
>>>> Hi John,
>>>>
>>>> Looks good to me too.
>>>>
>>>> It seems like the method "void 
>>>> record_satb_drain_processed_buffers(int processed_buffers)" in 
>>>> g1CollectorPolicy.hpp is dead code. I found it since it contains 
>>>> the same assert that your CR reported.
>>>>
>>>> Any chance of removing that method as well as part of your change?
>>>>
>>>> Bengt
>>>>
>>>> On 2011-11-22 21:45, John Cuthbertson wrote:
>>>>> Hi Everyone,
>>>>>
>>>>> Can I have a couple of volunteers review this small change? The 
>>>>> webrev can be found at: 
>>>>> http://cr.openjdk.java.net/~johnc/7114303/webrev.0/
>>>>>
>>>>> Summary:
>>>>> There is a race between the concurrent mark thread updating the 
>>>>> mark in progress flag and it being read by the VM thread during an 
>>>>> evacuation pause. This is because, at the time the concurrent mark 
>>>>> thread updates the flag, it is not part of the suspendible thread 
>>>>> set and so is not blocked. The solution is to have the concurrent 
>>>>> mark thread join the STS just before updating the flag.
>>>>>
>>>>> The race was confirmed by inserting a small delay immediately 
>>>>> before calling G1CollectorPolicy::record_satb_drain_time(). The 
>>>>> delay increased failure rate to 1/5 for a 20ms delay and 4/5 for a 
>>>>> 50ms delay.
>>>>>
>>>>> Testing: a few thousand iterations of the failing test case with a 
>>>>> JVM that included both fix and delay.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> JohnC
>>>>
>>