Request for review (S): 7097516 G1: assert(0<= from_card && from_card<HeapRegion::CardsPerRegion) failed: Must be in range.

Tony Printezis tony.printezis at oracle.com
Mon Oct 24 16:50:19 UTC 2011


Bengt,

Bengt Rutisson wrote:
>
>>
>> But we don't, we update the fields of a space when we extend it so 
>> that it's consistent. However, if we have an area with a single 
>> object, which is reflected on the block offset table, and not have a 
>> single space spanning said area, that's what's going to break the 
>> space abstraction...
>
> I don't fully understand how the BOT and the heap regions interact. 
> But my initial reaction to your statement is: "maybe HeapRegion should 
> not inherit Space".  If inheriting from Space limits our flexibility I 
> guess we should at least evaluate other approaches.

Sure, we can definitely collapse some of the HeapRegion class hierarchy 
and I'm in two minds about in inheriting from Space. On one hand we 
don't need a fair amount of what's already in Space (and subclasses). On 
the other hand, we get quite a few things "for free" (in particular: 
object iteration in the SA; I really appreciated not having to rewrite 
that code in the SA-style).

However, this is a bit orthogonal to the point of this discussion. 
Conceptually, I still believe that having an object that spans and is 
attributed to many spaces is awkward (BOT issues or not). It's much 
cleaner for an object to belong exactly to one space / region.

> Here is the current hierarchy for HeapRegion:
>
> CHeapObj -> Space -> CompactibleSpace -> ContiguousSpace -> 
> G1OffsetTableContigSpace -> HeapRegion
>
> I personally don't really like long inheritance chains, so just 
> looking at this makes me suspicious. Reading a few comments gives me 
> the feeling that we are already breaking several abstractions:
>
> Space: "Invariant: bottom() and end() are on page_size boundaries". 
> Not true for HeapRegion if we use large pages.
>
> CompactibleSpace: "A space that supports compaction operations". G1 
> does not do compaction.

Actually, it does. The Full GC code, which is shared with some of the 
other GCs, uses this. See g1MarkSweep.cpp.

> G1OffsetTableContigSpace: "So I copied that code, which led to an 
> alternate G1 version of OffsetTableContigSpace.". Hm...
>
> Seems to me that if we feel that it would be necessary to break 
> another abstraction it might be a good idea to re-think the 
> inheritance graph. 

I'm definitely not going to defend the G1 BOT code and how it interacts 
with the HeapRegions. In fact, we have a CR open to remove the 
G1-specific BOT code and use the version that all other GCs use.

> Maybe we are trying to get HeapRegion to do too much? Can it use 
> aggregation instead?

Maybe. But, as I said, I think the shape of the HeapRegion hierarchy is 
orthogonal to whether we should extend a heap region to cover a 
humongous object or not.

Tony

> Bengt
>
>
>>
>>> and having this unintuitive code that was the cause of this bug?
>>
>> The cause of this bug was that the RSet code relies on HeapRegions 
>> where it shouldn't. Because RSet use HeapRegions differently than the 
>> rest of the code, we should not pollute the rest of the code with 
>> extra checks.
>>
>>>> Always checking whether a region is humongous or not? Absolutely.
>>>
>>> I looked for places where we used the capacity() function, and found 
>>> only places where we do explicit checks for humongous regions.
>>>
>>>> This is something that's worked out nicely in G1 so far.
>>>
>>> Could you be more specific? I'm just asking since we had a bug 
>>> because of this implementation.
>>>
>>>> The only aspect of this I'd change would be for the region 
>>>> iterators to skip continues humongous regions, given that we filter 
>>>> them out most of the time anyway.
>>>>
>>>>> Could you point me to some of the usages in G1 where we rely on 
>>>>> capacity(), used(), etc. to report the "extended" range of Start 
>>>>> Humonguos regions?
>>>>
>>>> Look at when we add a region to one of the heap region sets. We do 
>>>> check whether it's humongous or not but that's to calculate the 
>>>> number of regions it spans. In fact, maybe we'd be better off just 
>>>> calculating that as capacity() / RegionSize and skip the test. In 
>>>> fact, this is a nice example of how this approach simplifies the code.
>>>
>>> Are you talking about, for example: 
>>> HeapRegionSetBase::calculate_region_num, and that all call paths to 
>>> it does something like:
>>>   if (!hr->isHumongous()) {
>>>     _calc_region_num         += 1;
>>>   } else {
>>>     _calc_region_num         += calculate_region_num(hr);
>>>   }
>>>
>>> I don't agree that that would simplify the code. Yes, maybe shrink 
>>> the number of lines but not lower the complexity, IMHO.
>>
>> If you feel strongly about this, by all means open a CR for it. 
>> Personally, I think the code is fine the way it is.
>>
>> Tony
>>
>>> StefanK
>>>
>>>>
>>>> An additional point: when we set up the block offset table for a 
>>>> humongous region, all the entries on it have to point to the bottom 
>>>> of the starts humongous region (as the rest do not have an object 
>>>> start), so it's again natural for that whole area to be covered by 
>>>> that region wrt to used and capacity.
>>>>
>>>> Tony
>>>>
>>>>> thanks,
>>>>> StefanK
>>>>>
>>>>>>
>>>>>> Note, however, that the only time we really need to "original" 
>>>>>> version of the above methods is in the current RSet code given 
>>>>>> that each RSet "chunk" corresponds to exactly one region. 
>>>>>> Everything else in the code relies on the fact that the above 
>>>>>> methods on SH regions are the "extended" versions. So, we save 
>>>>>> ourselves a lot of trouble but not having to check whether a 
>>>>>> region is SH or not before querying it.
>>>>>>
>>>>>> I'd also like to point out that I'd like us to re-architect the 
>>>>>> RSets so that they do not rely on regions but, instead, on (fixed 
>>>>>> size) memory ranges (could be equal to the region size, could be 
>>>>>> smaller, could be larger). That way, we'll eliminate the need for 
>>>>>> the "raw" version of is_in_reserved() and deal with SH and 
>>>>>> non-humongous regions uniformly.
>>>>>>
>>>>>> Tony
>>>>>>
>>>>>> On 10/20/2011 04:21 AM, Stefan Karlsson wrote:
>>>>>>> Hi Bengt,
>>>>>>>
>>>>>>> On 10/19/2011 11:45 AM, Bengt Rutisson wrote:
>>>>>>>>
>>>>>>>> Hi again,
>>>>>>>>
>>>>>>>> Updated webrev based on comments from Tony:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~brutisso/7097516/webrev.02/
>>>>>>>
>>>>>>> Looks good to me.
>>>>>>>
>>>>>>> But, I find it really odd that Space::is_in_reserved(...) looks 
>>>>>>> at the expanded size when working on Start Continues regions, 
>>>>>>> and HeapRegion::is_in_reserved_raw looks at the "original" 
>>>>>>> region size. To me, it would be more natural if 
>>>>>>> Space::is_in_reserved(...) actually did what one would expect, 
>>>>>>> look at the original heap region size, and then have a somthing 
>>>>>>> like HeapRegion::is_in_reserved_expanded to expand the size for 
>>>>>>> Start Continues regions.
>>>>>>>
>>>>>>> StefanK
>>>>>>>
>>>>>>>>
>>>>>>>> Bengt
>>>>>>>>
>>>>>>>> On 2011-10-19 09:14, Bengt Rutisson wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Could I have a couple of reviews of this change?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~brutisso/7097516/webrev.01/
>>>>>>>>>
>>>>>>>>> Some background:
>>>>>>>>>
>>>>>>>>> The humongous start regions in G1 gets their _end marker set 
>>>>>>>>> to the end of the last humongous continues region for the 
>>>>>>>>> humongous object. This means that the start region overlaps 
>>>>>>>>> all of the continues regions. Code that handles regions need 
>>>>>>>>> to be aware of this special case.
>>>>>>>>>
>>>>>>>>> However, PerRegionTable::add_reference_work() had a subtle 
>>>>>>>>> issue with this special case. Here is what happened:
>>>>>>>>>
>>>>>>>>> A thread, TA, looks up the PerRegionTable, PRT,  for a 
>>>>>>>>> particular heap region. This heap region happens to be a 
>>>>>>>>> humongous continues region. Let's call it HRA.
>>>>>>>>>
>>>>>>>>> Another thread, TB, wants to look up the PerRegionTable for 
>>>>>>>>> the humongous start region of that same humongous object. 
>>>>>>>>> Let's call that heap region HRB. If we are unlucky TB is the 
>>>>>>>>> first thread that wants the PerRegionTable for HRB and thus 
>>>>>>>>> has to allocate it. Now, there is a limit to how many 
>>>>>>>>> PerRegionTables we allow in the system. If this limit has been 
>>>>>>>>> reached TB will find a suitable existing PerRegionTable and 
>>>>>>>>> coarsen that to use a bitmap for the remset instead of the 
>>>>>>>>> PerRegionTable.
>>>>>>>>>
>>>>>>>>> This means that TB can actually "steal" the PRT from 
>>>>>>>>> underneath TA.
>>>>>>>>>
>>>>>>>>> TA is aware that this can happen. So, before it tries to do 
>>>>>>>>> any work with the information in the PRT it tries to verify 
>>>>>>>>> that it hasn't been "stolen" (actually coarsened). It does 
>>>>>>>>> this by storing the heap region for the PRT in a local 
>>>>>>>>> variable (called loc_hr) and then check that the reference 
>>>>>>>>> that it is about to handle still belong to the heap region.
>>>>>>>>>
>>>>>>>>> Basically:
>>>>>>>>>
>>>>>>>>>     HeapRegion* loc_hr = hr();
>>>>>>>>>     if (loc_hr->is_in_reserved(from)) {
>>>>>>>>>        ...
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> Normally this is safe. If TA gets HRB in loc_hr the test will 
>>>>>>>>> fail. However, in the case where HRB is the humongous start 
>>>>>>>>> region the test will not fail. Instead we will pass the test 
>>>>>>>>> and execute the code inside the if. There we try to calculate 
>>>>>>>>> a card index based on the bottom of the heap region and the 
>>>>>>>>> from address. Now this card index will be larger than the 
>>>>>>>>> number of cards we have per region, since from actually is not 
>>>>>>>>> in the region.
>>>>>>>>>
>>>>>>>>> The fix is to introduce a new method called 
>>>>>>>>> HeapRegion::is_in_reserved_raw() and use this in the test. The 
>>>>>>>>> _raw method will not return false even for humongous start 
>>>>>>>>> objects in the example above.
>>>>>>>>>
>>>>>>>>> I picked the name is_in_reserved_raw() to be consistent with 
>>>>>>>>> the heap_region_containing()/heap_region_containing_raw() 
>>>>>>>>> methods. I am not particularly fond of the name, so if there 
>>>>>>>>> are better suggestions I am happy to change it.
>>>>>>>>>
>>>>>>>>> Testing
>>>>>>>>> I was able to reproduce the issue on bur398-216 after 40-50 
>>>>>>>>> iterations. With my fix it has now ran 540 iteration without 
>>>>>>>>> failing.
>>>>>>>>>
>>>>>>>>> Other uses of is_in_reserved()
>>>>>>>>> Tony and I went through the G1 code and looked for other uses 
>>>>>>>>> of HeapRegion::is_in_reserved(). As far as we can tell the 
>>>>>>>>> other uses all avoid the issue that 
>>>>>>>>> PerRegionTable::add_reference_work() has. Most of them because 
>>>>>>>>> they are used during evacuation where humongous objects are 
>>>>>>>>> not present at all.
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~brutisso/7097516/webrev.01/
>>>>>>>>>
>>>>>>>>> CR:
>>>>>>>>> 7097516 G1: assert(0<= from_card && 
>>>>>>>>> from_card<HeapRegion::CardsPerRegion) failed: Must be in range.
>>>>>>>>> http://monaco.us.oracle.com/detail.jsf?cr=7097516
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Bengt
>>>>>>>>
>>>>>>>
>>>>>
>>>
>



More information about the hotspot-gc-dev mailing list