CRR (S): 7099849: G1: include heap region information in hs_err files

Mon Oct 31 21:16:11 UTC 2011

Bengt Rutisson wrote:
> On 2011-10-31 18:40, Vladimir Kozlov wrote:
>> I agree with Tony, don't rely on print() definition in AllocatedObj 
>> since it is only defined in non product VM (because in product version 
>> AllocatedObj should not have virtual table and have size 0). 
> 
> I missed the fact that the AllocatedObj::print() is only defined in 
> debug builds. That of course makes it impossible for CollectedHeap to 
> rely on it.
> 
>> Consider AllocatedObj::print() as backup for classes which does not 
>> define they own print() methods when you debugging VM.
> 
> I see. But why is it in that case not declared virtual?

I have no idea, it was done 13 years ago. May be to reduce virtual table size 
and to have subclasses to redefine only virtual methods *_on(*st).

Vladimir

> 
> Bengt
> 
>>
>> Regards,
>> Vladimir
>>
>> Tony Printezis wrote:
>>> Bengt,
>>>
>>>>
>>>> In collectedHeap.hpp you now have:
>>>>
>>>> virtual void print() const {
>>>>     print_on(tty);
>>>>   }
>>>>
>>>> But CollectedHeap inherits AllocatedObj (through CHeapObj). And 
>>>> AllocatedObj defines print() as:
>>>>
>>>> void AllocatedObj::print() const       { print_on(tty); }
>>>>
>>>> So, I think you can just remove the print() method in CollectedHeap. 
>>>> This will also get rid of the shadowing of AllocatedObj::print() by 
>>>> CollectedHeap::print(). That looks kind of fishy to me.
>>>
>>> I agree. And I'm really not sure why the print methods were 
>>> introduced on AllocatedObj. Having said that, I think it's helpful to 
>>> define:
>>>
>>> virtual void print() const {
>>>     print_on(tty);
>>>   }
>>>
>>> on CollectedHeap to show explicitly what it does. If I remove it I 
>>> feel as if it might confuse the reader who'd never think of looking 
>>> in the AllocatedObj class for the print definition (I know I 
>>> wouldn't....). Also, if someone does the obvious thing and removes 
>>> the print methods from AllocatedObj :-) they'd have to re-introduce 
>>> it on CollectedHeap. I'm inclined to leave it in to keep 
>>> CollectedHeap independent of AllocatedObj (at least in that respect).
>>>
>>>> Also, a nit pick, but the whitespace changes at lines 414 and 417 in 
>>>> universe.hpp seem kind of strange to me.
>>>
>>> The idea was to get better alignment but I think I added extra 
>>> whitespace in line 414 for some reason. Does this look better?
>>>
>>>  414   static bool verify_in_progress() { return _verify_in_progress; }
>>>  415   static void verify(bool allow_dirty = true, bool silent = false,
>>>  416                      VerifyOption option = VerifyOption_Default );
>>>  417   static int  verify_count()       { return _verify_count; }
>>>
>>>
>>> Or you'd prefer:
>>>
>>> static bool verify_in_progress() {
>>>   return _verify_in_progress;
>>> }
>>> static void verify(bool allow_dirty = true, bool silent = false,
>>>                    VerifyOption option = VerifyOption_Default );
>>> static int verify_count() {
>>>   return _verify_count;
>>> }
>>>
>>>
>>>> Finally, just for the record, we have tests that run without setting 
>>>> -Xms or -Xmx. If such a test ends up on a large machine we can end 
>>>> up with many more than 1000-2000 regions. I have seen ten times as 
>>>> many regions.
>>>
>>> I could add a max of how many per-region entries we'll write to the 
>>> hs_err file and basically skip the rest (by adding an appropriate 
>>> warning in the log of course). Having said that, this might make this 
>>> output useless given that the entries we'll skip might be the ones 
>>> that we'd need. So, I'm inclined not to do that.
>>>
>>>> It might be unlikely that customers run without limiting the heap 
>>>> size on this large machines, but we will run into hs_err files with 
>>>> 10s of thousands of heap regions in our own testing. Personally I 
>>>> think the information is useful enough for this to be acceptable. 
>>>
>>> Agreed.
>>>
>>>> One alternative would of course be to log this in a separate 
>>>> hs_err_g1 file...
>>>
>>> I really don't think we need to go there! We'll generate the same 
>>> amount of output but now split over two files! And the customers will 
>>> have to remember to also keep around and send us the hs_err_g1 files 
>>> too. :-)
>>>
>>> Tony
>>>
>>>> Bengt
>>>>
>>>>
>>>> On 2011-10-12 18:49, Tony Printezis wrote:
>>>>> Hi all,
>>>>>
>>>>> (I'm also copying the runtime alias, as this change will concern 
>>>>> them too)
>>>>>
>>>>> I'd like to get a couple of reviews for this change:
>>>>>
>>>>> http://cr.openjdk.java.net/~tonyp/7099849/webrev.0/
>>>>>
>>>>> Some background: when trying to track down issues in G1 it is often 
>>>>> very helpful to know what type of regions the heap has and/or get 
>>>>> information on a particular region (whether it's humongous, whether 
>>>>> it's young, how full it is, etc.). We thought it'd be a good idea 
>>>>> to include the per-region information in the hs_err file to always 
>>>>> have it available after a crash.
>>>>>
>>>>> I don't think the changes in the webrev are too controversial. The 
>>>>> reason I wanted to give a heads up to both groups was to point out 
>>>>> that this change will increase the size of the hs_err files when G1 
>>>>> is used. It's common to have 1,000 to 2,000 regions in the heap, if 
>>>>> larger heaps are used (much fewer if smaller heaps are used), which 
>>>>> means that the hs_err file will have this many extra lines. Does 
>>>>> anyone have any concerns about this?
>>>>>
>>>>> I attached an example hs_err file obtained from a workspace with 
>>>>> this change applied.
>>>>>
>>>>> BTW, I also cleaned up a bit the way the print() methods on the 
>>>>> heap are defined (I pushed the default behavior to the superclass, 
>>>>> where possible).
>>>>>
>>>>> Tony
>>>>>
>>>>
>