CRR (S): 7099849: G1: include heap region information in hs_err files

Bengt Rutisson bengt.rutisson at oracle.com
Tue Nov 1 13:23:08 UTC 2011


Tony,

On 2011-10-31 18:24, Tony Printezis wrote:
> Bengt,
>
> Thanks for looking at this. Inline.
>
> On 10/31/2011 5:55 AM, Bengt Rutisson wrote:
>>
>> Hi Tony,
>>
>> Thanks for fixing this! I think it will be great to have this 
>> information in the hs_err file.
>>
>> It is also good that you clean up the print() and print_on() 
>> structure. But I think you can take it one step further.
>>
>> In collectedHeap.hpp you now have:
>>
>> virtual void print() const {
>>     print_on(tty);
>>   }
>>
>> But CollectedHeap inherits AllocatedObj (through CHeapObj). And 
>> AllocatedObj defines print() as:
>>
>> void AllocatedObj::print() const       { print_on(tty); }
>>
>> So, I think you can just remove the print() method in CollectedHeap. 
>> This will also get rid of the shadowing of AllocatedObj::print() by 
>> CollectedHeap::print(). That looks kind of fishy to me.
>
> I agree. And I'm really not sure why the print methods were introduced 
> on AllocatedObj. Having said that, I think it's helpful to define:
>
> virtual void print() const {
>     print_on(tty);
>   }
>
> on CollectedHeap to show explicitly what it does. If I remove it I 
> feel as if it might confuse the reader who'd never think of looking in 
> the AllocatedObj class for the print definition (I know I 
> wouldn't....). Also, if someone does the obvious thing and removes the 
> print methods from AllocatedObj :-) they'd have to re-introduce it on 
> CollectedHeap. I'm inclined to leave it in to keep CollectedHeap 
> independent of AllocatedObj (at least in that respect).

Discussed this with Vladimir in previous email...

>> Also, a nit pick, but the whitespace changes at lines 414 and 417 in 
>> universe.hpp seem kind of strange to me.
>
> The idea was to get better alignment but I think I added extra 
> whitespace in line 414 for some reason. Does this look better?
>
>  414   static bool verify_in_progress() { return _verify_in_progress; }
>  415   static void verify(bool allow_dirty = true, bool silent = false,
>  416                      VerifyOption option = VerifyOption_Default );
>  417   static int  verify_count()       { return _verify_count; }
>
>
> Or you'd prefer:
>
> static bool verify_in_progress() {
>   return _verify_in_progress;
> }
> static void verify(bool allow_dirty = true, bool silent = false,
>                    VerifyOption option = VerifyOption_Default );
> static int verify_count() {
>   return _verify_count;
> }

I prefer the latter, but the former is also ok. It is better than what 
is there now.
>> Finally, just for the record, we have tests that run without setting 
>> -Xms or -Xmx. If such a test ends up on a large machine we can end up 
>> with many more than 1000-2000 regions. I have seen ten times as many 
>> regions.
>
> I could add a max of how many per-region entries we'll write to the 
> hs_err file and basically skip the rest (by adding an appropriate 
> warning in the log of course). Having said that, this might make this 
> output useless given that the entries we'll skip might be the ones 
> that we'd need. So, I'm inclined not to do that.

I agree. A maximum will likely reduce the benefit of this fix.

>> It might be unlikely that customers run without limiting the heap 
>> size on this large machines, but we will run into hs_err files with 
>> 10s of thousands of heap regions in our own testing. Personally I 
>> think the information is useful enough for this to be acceptable. 
>
> Agreed.
>
>> One alternative would of course be to log this in a separate 
>> hs_err_g1 file...
>
> I really don't think we need to go there! We'll generate the same 
> amount of output but now split over two files! And the customers will 
> have to remember to also keep around and send us the hs_err_g1 files 
> too. :-)

Yes. Agreed.

Ship it!

Bengt
>
> Tony
>
>> Bengt
>>
>>
>> On 2011-10-12 18:49, Tony Printezis wrote:
>>> Hi all,
>>>
>>> (I'm also copying the runtime alias, as this change will concern 
>>> them too)
>>>
>>> I'd like to get a couple of reviews for this change:
>>>
>>> http://cr.openjdk.java.net/~tonyp/7099849/webrev.0/
>>>
>>> Some background: when trying to track down issues in G1 it is often 
>>> very helpful to know what type of regions the heap has and/or get 
>>> information on a particular region (whether it's humongous, whether 
>>> it's young, how full it is, etc.). We thought it'd be a good idea to 
>>> include the per-region information in the hs_err file to always have 
>>> it available after a crash.
>>>
>>> I don't think the changes in the webrev are too controversial. The 
>>> reason I wanted to give a heads up to both groups was to point out 
>>> that this change will increase the size of the hs_err files when G1 
>>> is used. It's common to have 1,000 to 2,000 regions in the heap, if 
>>> larger heaps are used (much fewer if smaller heaps are used), which 
>>> means that the hs_err file will have this many extra lines. Does 
>>> anyone have any concerns about this?
>>>
>>> I attached an example hs_err file obtained from a workspace with 
>>> this change applied.
>>>
>>> BTW, I also cleaned up a bit the way the print() methods on the heap 
>>> are defined (I pushed the default behavior to the superclass, where 
>>> possible).
>>>
>>> Tony
>>>
>>




More information about the hotspot-gc-dev mailing list