RFR 8026324: hs_err improvement: Add summary section to hs_err file

Coleen Phillimore coleen.phillimore at oracle.com
Tue Jul 21 18:52:58 UTC 2015


Thank you for the code review, Dmitry.  New webrev (see below comments).

http://cr.openjdk.java.net/~coleenp/8026324.03/

On 7/20/15 4:26 PM, Dmitry Dmitriev wrote:
> Hello Coleen,
>
> Few comments from me:
>
> src/os/linux/vm/os_linux.cpp
>    In parse_os_info function it is possible that file will not be 
> closed(if 'ptr==NULL' on line 2095). I think that "fclose(fp);" can be 
> added after 2110 line. Also, in this case lines 2100-2101 and 
> 2108-2109 are seems reduntant, because file will be closed by added 
> "fclose(fd);" and function will terminate.

This is a good find.  I am missing an fclose and I'd restructured the 
code so that the early returns are no longer necessary.  I also don't 
print anything if the Ubuntu format isn't expected, which wasn't good.  
See os_linux.cpp in next webrev.

If I don't find the quotes in DISTRIB_DESCRIPTION=, I'll print what's 
after the equal sign (editing out the newline again).

>
>    In os::get_summary_cpu_info function it is possible that file will 
> not be closed(if 'ptr' always == 'end' on line 2238). I think that 
> "fclose(fp);" can be added after 2248 line.

Thank you for finding that also.  Fixed.

>
> src/share/vm/runtime/os.cpp
>    In os::print_summary_info function you print physical_memory in 
> gigabytes. But on systems with less than 1 Gb of memory it will print 
> 0. Probably will be better to print it in megabytes?

I was wondering if there was such thing as these systems with < 1G 
memory.   But if a system has less than 1M memory, they're getting a 
zero.  I made this change but I have nowhere to test it.

Thanks,
Coleen

>
> Thank you,
> Dmitry
>
> On 17.07.2015 22:16, Coleen Phillimore wrote:
>>
>> Hi David,  Thank you for taking the time to look and make comments on 
>> all of this.   I have a new webrev at:
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8026324.02/
>>
>> more below.
>>
>> On 7/17/15 3:14 AM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> Looks good.
>>>
>>> A few minor comments ...
>>>
>>> On 17/07/2015 5:20 AM, Coleen Phillimore wrote:
>>>> 8026333: hs_err improvement: Print GC Strategy
>>>> 8026336: hs_err improvement: Print compilation mode, server, client or
>>>> tiered
>>>> Summary: Added command line, summary cpu and os information to summary
>>>> section.  Moved time of crash and duration in summary section. Add GC
>>>> strategy and compiler setting (tiered) to enhanced version string in
>>>> error report.  Moved the stack trace sooner in hs_err file.
>>>>
>>>> See new summary format in bug report and some samples.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8026324.01/
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8026324
>>>>
>>>> Tested with RBT quick tests (tests that run on all platforms). 
>>>> Verified
>>>> on crash logs on all platforms available inside Oracle either with
>>>> crashing test or C++ program with same new code (or both), and various
>>>> scenarios of missing cpuinfo/distribution files.
>>>
>>> src/os/aix/vm/os_aix.cpp
>>>
>>> + // This looks good
>>> + void os::get_summary_cpu_info(char* buf, size_t buflen) {
>>> +   // This looks good
>>>
>>> It doesn't look _that_ good :)
>>
>> Oh, yeah, the second one should say it looks "awesome".  I took out 
>> one of the comments.   The AIX guys can contradict me if they want.
>>>
>>> ---
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>>
>>> 1709   size_t size = sizeof(mhz);
>>> 1710   int mib[] = { CTL_HW, HW_CPU_FREQ };
>>> 1711   size = sizeof(mhz);
>>>
>>> Line #1711 is redundant
>>>
>>> 1713       mhz = 0;
>>>
>>> Wouldn't a default of 1 be safer incase anyone parses this and does 
>>> arithmetic?
>>
>> Hm, maybe I shouldn't report it if it's not available instead.
>>
>>>
>>> BTW took me a while to realize these "if" bodies were setting 
>>> defaults for when sysctl failed.
>>>
>>> 1721     strncpy(model, cpu_arch, sizeof(model));
>>>
>>> where did cpu_arch come from??
>>
>> Above...  linux has a global cpu_arch too but different...
>>>
>>> ---
>>>
>>> src/os/linux/vm/os_linux.cpp
>>>
>>> 2066     if (file == NULL) break;
>>> 2068     if (_print_ascii_file(file, st)) return;
>>>
>>> Can you put the break/return on a new line please.
>>>
>>> More generally for some reason today I have an aversion to "if (x) 
>>> foo();" being on a single line :) But control flow in particular I 
>>> prefer to see on its own line.
>>
>> I was trying to make this look small.  Ok.  I don't really like one 
>> line if statements either anymore.
>>
>>>
>>> 2082     char buf[257];
>>> j
>>> Where does the 257 line length come from? Is this  a defined limit 
>>> for the release file?
>>>
>>> 2218       char buf[257];
>>>
>>> Where does the 257 line length come from? The flags line on my linux 
>>> box is 410 characters.
>>
>> I found 257 in another place and thought it seemed good enough for 
>> the cpuinfo line, because it's not looking for the 'flags' line. I 
>> wonder if there's an advantage to a power of two for a local stack 
>> buffer?  I could go to 512 but I'm trying not to use stack space.
>>
>>>
>>> 2247   ARM32_ONLY("arm32") PPC_ONLY("PPC64") 
>>> AARCH64_ONLY("AARCH64"), length);
>>>
>>> May I suggest "ARM" for ARM32 and it should be "AArch64".
>>
>> Okay.
>>>
>>> ---
>>>
>>>  src/os/posix/vm/os_posix.cpp
>>>
>>>  241   // could use uname as in above also
>>>  242   return (gethostname(buf, buflen) != -1);
>>>
>>> I would suggest using uname given that Linux Glibc implements 
>>> gethostname using uname. gethostname also has unclear semantics 
>>> regarding nul-termination if the name is truncated.
>>
>> Okay, I'll use uname.  I suppose both make an OS call (wasn't sure if 
>> gethostname did).
>>>
>>> ---
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>>
>>> 1978     char tmp[256];
>>> 1979     if (fgets(tmp, 256, fp)) {
>>>
>>> 256 this time? :) And the second 256 should sizeof(tmp).
>>
>> Yes, I don't need a lot of characters.   256 should do.   I'll change 
>> the second to sizeof.
>>>
>>> Can you add a comment that you are getting only the first line, please.
>>
>> ok.
>>
>>>
>>> 2028 #ifdef AMD64
>>> 2029   snprintf(buf, buflen, "x86 64 bit %d MHz", clock);
>>> 2030 #elif  (defined __sparc) && (defined _LP64)
>>> 2031   snprintf(buf, buflen, "Sparcv9 64 bit %d MHz", clock);
>>> 2032 #else
>>> 2033 #error "Invalid solaris configuration"
>>>
>>> I think "ifdef AMD64 else // must be sparc" will do here.
>>
>> Ok.
>>>
>>> Aside: it is truly horrible how little useful processor info you can 
>>> get from Solaris. It still refers to i386 and i387!
>>>
>>
>> Yes, you think it would give me something better after going through 
>> the trouble of calling a syscall to get it.  You get better from 
>> /proc/cpuinfo on linux but I don't know how to find this string on 
>> solaris.
>>
>>> ---
>>>
>>> src/os/windows/vm/os_windows.cpp
>>>
>>> 1620   if (GetComputerNameEx(ComputerNameDnsHostname, buffer, &size)) {
>>>
>>> Is it worth turning the above into a call to os::get_host_name to 
>>> avoid the duplicate call?
>>
>> Yeah, that's nicer.
>>>
>>> ---
>>>
>>> src/share/vm/runtime/arguments.cpp
>>>
>>> I'm not quite sure what gets printed for print_summary_on, or in 
>>> what format. Can you given an example please showing where each 
>>> piece comes from (jvm_flags vs jvm_args vs java_command).
>>
>> The distinction between jvm_flags and jvm_args is not obvious. The 
>> jvm_flags come from .hotspotrc (or settings file) and the jvm_args is 
>> what you'd actually pass on the command line or environment 
>> variable.   I was going to add the flags to the command line but 
>> they're specified oddly without XX, which I guess you have discussed 
>> on another thread.   I changed this summary printing to add a line 
>> for the contents of the settings file:
>>
>> ---------------  S U M M A R Y ------------
>>
>> Settings File: +UseSharedSpaces +PrintCompressedOopsMode
>> Command Line: -Xint Kaboom
>>
>> Host: mymachine, Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz, 4 cores, 
>> 7G, Ubuntu 14.04.1 LTS
>> Time: Fri Jul 17 15:04:33 2015 EDT elapsed time: 0 seconds (0d 0h 0m 0s)
>>
>>
>> And I added comments.
>>
>> This might need some adjustment for bug 8061999 if they're not by 
>> default added to the command line arguments.  I have to check that 
>> for that change.
>>> ---
>>>
>>> src/share/vm/utilities/vmError.cpp
>>>
>>> 691   STEP(250, "(printing register info)")
>>> 699   STEP(260, "(printing registers, top of stack, instructions 
>>> near pc)")
>>>
>>> What's the difference between registers and register info ??
>>
>> The first call decodes the register values if possible, ie. calls 
>> print_location() on each value.   The second prints the registers as 
>> part of the context (which also includes top of stack and 
>> instructions near the pc.   The second printing of the registers is 
>> redundant, and os::print_context() is only called from the vmError 
>> handler.   I couldn't decide to remove it though, because if 
>> print_location() fails in step 250, you wouldn't have the registers, 
>> so you get them again in step 260.
>>
>>>
>>> 564   STEP(140, "(printing VM options)" )
>>> 816   STEP(380, "(printing VM options)" )
>>>
>>> One of those should be changed.
>> Fixed.
>>
>> Thank you for noticing and suggesting all the things to clean up.
>>
>> Coleen
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Thanks,
>>>> Coleen
>>
>



More information about the hotspot-dev mailing list