CRR (XS): 7098085: G1: partially-young GCs not initiated under certain circumstances

Tony Printezis tony.printezis at oracle.com
Thu Oct 6 16:46:30 UTC 2011


Ramki,

I have to eat humble pie. I got two (infrequent failures) during testing 
overnight that I'm 99.9% sure are caused by the fact that I stopped 
clearing the bitmap in the CM::abort() method. As I said, I've backed 
out of those changes in concurrentMark.cpp, here's the latest webrev 
(same as before but without the changes to the CM::abort() method):

http://cr.openjdk.java.net/~tonyp/7098085/webrev.1/

I'll carry on testing this for the next day or two.

Tony

Tony Printezis wrote:
> Ramki,
>
> Thanks for taking a look at it! See inline.
>
> On 10/5/2011 7:48 PM, Ramki Ramakrishna wrote:
>> Hi Tony --
>>
>> The changes look good.
>>
>> On a side note, was the removal of the bit map clearing a different 
>> bug fix,
>> or is it somehow related to the issue described in the CR?
>> I can see that it is unnecessary and may be even wasteful,
>> but it seemed to me to be unrelated to the issue described in the CR,
>> unless I am missing something here.
>
> Agreed. I was looking at the abort() method to see what it did and I 
> noticed that it was also clearing the bitmap. It's indeed wasteful 
> and, in fact, it does not achieve what the comment says it should 
> achieve. abort() is only called at the start of a Full GC but, in 
> order to do the Full GC safepoint, the concurrent marking workers 
> should have noticed that a safepoint was initiated and should have all 
> yielded (otherwise the safepoint would have to wait for them). And, 
> the concurrent marking thread will actually clear the next bitmap as 
> the last thing it does before waiting for the next cycle to start. 
> But, yeah, point taken: I'll back out of the concurrentMark.cpp 
> changes and do them as a separate CR.
>
>> As a related (but otherwise orthogonal) matter, it appears as though 
>> the idiom:-
>>
>>    _sts.join();
>>    do_something();
>>    do_something_else();
>>   _sts_leave();
>>
>> might gain from the usual constructor/destructor idiom used
>> elsewhere for code that wants to synchronize with other
>> parts of the system (in this case safepoints). For example:
>>
>>   {
>>      SynchronizeWithSafepoints ss();     // or appropriate other name 
>> [1]
>>      do_something();
>>      do_something_else();
>>   }
>
> Yes, I do like this idea very much.
>
>> [1] you will have to invent a suitable other name, perhaps 
>> DesynchronizeWithSafepoints
>> for when you do the reverse, i.e. perform an action outside of the 
>> _sts, as happens
>> when you are doing those synch barriers amongst the concurrent marking
>> threads in case of overflow and restart.
>
> I can't think of a good name off-hand but I'll think about it for a 
> bit... (suggestions are welcome!)
>
>> The benefit is that the extra scopes help clearly demarcate
>> the code that wants to be in the "critical section", if you
>> will, wrt safepoints that will want to synchronize wrt these
>> actions.
>>
>> (As far as I can tell all invocations of join/leave or stsJoin/stsLeave
>> are paired and can become block-structured.)
>
> They should definitely be paired (like lock/unlock pairs). Otherwise, 
> we're in trouble. :-)
>
>> The c'tor/d'tor idiom change is of course orthogonal to yr changes
>> and would likely want to be a separate CR, if you agree that this
>> may be another clean-up worth doing.
>
> Absolutely. So, I take an action item to open two hs23 CRs: one to 
> avoid clearing the marking bitmap and one for the ctor/dtor idiom to 
> join / leave the STS and vice versa.
>
> Tony
>
>> Otherwise looks good.
>> -- ramki
>>
>> On 10/5/2011 2:21 PM, Tony Printezis wrote:
>>> Hi,
>>>
>>> I'd like to get a couple of people to have a look at the following 
>>> webrev:
>>>
>>> http://cr.openjdk.java.net/~tonyp/7098085/webrev.0/
>>>
>>> The code changes are very minor, most of the added lines are in fact 
>>> detailed comments on the changes.
>>>
>>> I'm going to do testing on two machines overnight, as this is quite 
>>> tricky code (even though it doesn't look it). But, I thought I'd 
>>> open the changes for code review early as the change is small and 
>>> some early feedback, if possible, would be greatly appreciated.
>>>
>>> Tony
>>>
>>> PS Correct url? Check. Text actually applies to the CR in the 
>>> subject? Check.
>>>



More information about the hotspot-gc-dev mailing list