CRR (XS): 7098085: G1: partially-young GCs not initiated under certain circumstances

Y. S. Ramakrishna y.s.ramakrishna at oracle.com
Thu Oct 6 17:29:29 UTC 2011



On 10/06/11 09:46, Tony Printezis wrote:
> Ramki,
> 
> I have to eat humble pie. I got two (infrequent failures) during testing 

OK, i'll share the pie :-) I thought (but didn't check) that
we clear both marking bit maps at the end of the full gc,
and the yielded/aborted marking threads will not create any
new marks. But perhaps they are not checking for abortion when
they emerge from an arbitrary yield? (You can tell that I have
not looked sufficiently at the G1 abort mechanism yet; i'll try
and look at the sources for that when i get some time.)

> overnight that I'm 99.9% sure are caused by the fact that I stopped 
> clearing the bitmap in the CM::abort() method. As I said, I've backed 
> out of those changes in concurrentMark.cpp, here's the latest webrev 
> (same as before but without the changes to the CM::abort() method):
> 
> http://cr.openjdk.java.net/~tonyp/7098085/webrev.1/

reviewed!

> 
> I'll carry on testing this for the next day or two.
> 
> Tony

thanks.
-- ramki

> 
> Tony Printezis wrote:
>> Ramki,
>>
>> Thanks for taking a look at it! See inline.
>>
>> On 10/5/2011 7:48 PM, Ramki Ramakrishna wrote:
>>> Hi Tony --
>>>
>>> The changes look good.
>>>
>>> On a side note, was the removal of the bit map clearing a different 
>>> bug fix,
>>> or is it somehow related to the issue described in the CR?
>>> I can see that it is unnecessary and may be even wasteful,
>>> but it seemed to me to be unrelated to the issue described in the CR,
>>> unless I am missing something here.
>>
>> Agreed. I was looking at the abort() method to see what it did and I 
>> noticed that it was also clearing the bitmap. It's indeed wasteful 
>> and, in fact, it does not achieve what the comment says it should 
>> achieve. abort() is only called at the start of a Full GC but, in 
>> order to do the Full GC safepoint, the concurrent marking workers 
>> should have noticed that a safepoint was initiated and should have all 
>> yielded (otherwise the safepoint would have to wait for them). And, 
>> the concurrent marking thread will actually clear the next bitmap as 
>> the last thing it does before waiting for the next cycle to start. 
>> But, yeah, point taken: I'll back out of the concurrentMark.cpp 
>> changes and do them as a separate CR.
>>
>>> As a related (but otherwise orthogonal) matter, it appears as though 
>>> the idiom:-
>>>
>>>    _sts.join();
>>>    do_something();
>>>    do_something_else();
>>>   _sts_leave();
>>>
>>> might gain from the usual constructor/destructor idiom used
>>> elsewhere for code that wants to synchronize with other
>>> parts of the system (in this case safepoints). For example:
>>>
>>>   {
>>>      SynchronizeWithSafepoints ss();     // or appropriate other name 
>>> [1]
>>>      do_something();
>>>      do_something_else();
>>>   }
>>
>> Yes, I do like this idea very much.
>>
>>> [1] you will have to invent a suitable other name, perhaps 
>>> DesynchronizeWithSafepoints
>>> for when you do the reverse, i.e. perform an action outside of the 
>>> _sts, as happens
>>> when you are doing those synch barriers amongst the concurrent marking
>>> threads in case of overflow and restart.
>>
>> I can't think of a good name off-hand but I'll think about it for a 
>> bit... (suggestions are welcome!)
>>
>>> The benefit is that the extra scopes help clearly demarcate
>>> the code that wants to be in the "critical section", if you
>>> will, wrt safepoints that will want to synchronize wrt these
>>> actions.
>>>
>>> (As far as I can tell all invocations of join/leave or stsJoin/stsLeave
>>> are paired and can become block-structured.)
>>
>> They should definitely be paired (like lock/unlock pairs). Otherwise, 
>> we're in trouble. :-)
>>
>>> The c'tor/d'tor idiom change is of course orthogonal to yr changes
>>> and would likely want to be a separate CR, if you agree that this
>>> may be another clean-up worth doing.
>>
>> Absolutely. So, I take an action item to open two hs23 CRs: one to 
>> avoid clearing the marking bitmap and one for the ctor/dtor idiom to 
>> join / leave the STS and vice versa.
>>
>> Tony
>>
>>> Otherwise looks good.
>>> -- ramki
>>>
>>> On 10/5/2011 2:21 PM, Tony Printezis wrote:
>>>> Hi,
>>>>
>>>> I'd like to get a couple of people to have a look at the following 
>>>> webrev:
>>>>
>>>> http://cr.openjdk.java.net/~tonyp/7098085/webrev.0/
>>>>
>>>> The code changes are very minor, most of the added lines are in fact 
>>>> detailed comments on the changes.
>>>>
>>>> I'm going to do testing on two machines overnight, as this is quite 
>>>> tricky code (even though it doesn't look it). But, I thought I'd 
>>>> open the changes for code review early as the change is small and 
>>>> some early feedback, if possible, would be greatly appreciated.
>>>>
>>>> Tony
>>>>
>>>> PS Correct url? Check. Text actually applies to the CR in the 
>>>> subject? Check.
>>>>



More information about the hotspot-gc-dev mailing list