RFR 8026324: hs_err improvement: Add summary section to hs_err file
Coleen Phillimore
coleen.phillimore at oracle.com
Tue Jul 21 18:52:58 UTC 2015
Thank you for the code review, Dmitry. New webrev (see below comments).
http://cr.openjdk.java.net/~coleenp/8026324.03/
On 7/20/15 4:26 PM, Dmitry Dmitriev wrote:
> Hello Coleen,
>
> Few comments from me:
>
> src/os/linux/vm/os_linux.cpp
> In parse_os_info function it is possible that file will not be
> closed(if 'ptr==NULL' on line 2095). I think that "fclose(fp);" can be
> added after 2110 line. Also, in this case lines 2100-2101 and
> 2108-2109 are seems reduntant, because file will be closed by added
> "fclose(fd);" and function will terminate.
This is a good find. I am missing an fclose and I'd restructured the
code so that the early returns are no longer necessary. I also don't
print anything if the Ubuntu format isn't expected, which wasn't good.
See os_linux.cpp in next webrev.
If I don't find the quotes in DISTRIB_DESCRIPTION=, I'll print what's
after the equal sign (editing out the newline again).
>
> In os::get_summary_cpu_info function it is possible that file will
> not be closed(if 'ptr' always == 'end' on line 2238). I think that
> "fclose(fp);" can be added after 2248 line.
Thank you for finding that also. Fixed.
>
> src/share/vm/runtime/os.cpp
> In os::print_summary_info function you print physical_memory in
> gigabytes. But on systems with less than 1 Gb of memory it will print
> 0. Probably will be better to print it in megabytes?
I was wondering if there was such thing as these systems with < 1G
memory. But if a system has less than 1M memory, they're getting a
zero. I made this change but I have nowhere to test it.
Thanks,
Coleen
>
> Thank you,
> Dmitry
>
> On 17.07.2015 22:16, Coleen Phillimore wrote:
>>
>> Hi David, Thank you for taking the time to look and make comments on
>> all of this. I have a new webrev at:
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8026324.02/
>>
>> more below.
>>
>> On 7/17/15 3:14 AM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> Looks good.
>>>
>>> A few minor comments ...
>>>
>>> On 17/07/2015 5:20 AM, Coleen Phillimore wrote:
>>>> 8026333: hs_err improvement: Print GC Strategy
>>>> 8026336: hs_err improvement: Print compilation mode, server, client or
>>>> tiered
>>>> Summary: Added command line, summary cpu and os information to summary
>>>> section. Moved time of crash and duration in summary section. Add GC
>>>> strategy and compiler setting (tiered) to enhanced version string in
>>>> error report. Moved the stack trace sooner in hs_err file.
>>>>
>>>> See new summary format in bug report and some samples.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8026324.01/
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8026324
>>>>
>>>> Tested with RBT quick tests (tests that run on all platforms).
>>>> Verified
>>>> on crash logs on all platforms available inside Oracle either with
>>>> crashing test or C++ program with same new code (or both), and various
>>>> scenarios of missing cpuinfo/distribution files.
>>>
>>> src/os/aix/vm/os_aix.cpp
>>>
>>> + // This looks good
>>> + void os::get_summary_cpu_info(char* buf, size_t buflen) {
>>> + // This looks good
>>>
>>> It doesn't look _that_ good :)
>>
>> Oh, yeah, the second one should say it looks "awesome". I took out
>> one of the comments. The AIX guys can contradict me if they want.
>>>
>>> ---
>>>
>>> src/os/bsd/vm/os_bsd.cpp
>>>
>>> 1709 size_t size = sizeof(mhz);
>>> 1710 int mib[] = { CTL_HW, HW_CPU_FREQ };
>>> 1711 size = sizeof(mhz);
>>>
>>> Line #1711 is redundant
>>>
>>> 1713 mhz = 0;
>>>
>>> Wouldn't a default of 1 be safer incase anyone parses this and does
>>> arithmetic?
>>
>> Hm, maybe I shouldn't report it if it's not available instead.
>>
>>>
>>> BTW took me a while to realize these "if" bodies were setting
>>> defaults for when sysctl failed.
>>>
>>> 1721 strncpy(model, cpu_arch, sizeof(model));
>>>
>>> where did cpu_arch come from??
>>
>> Above... linux has a global cpu_arch too but different...
>>>
>>> ---
>>>
>>> src/os/linux/vm/os_linux.cpp
>>>
>>> 2066 if (file == NULL) break;
>>> 2068 if (_print_ascii_file(file, st)) return;
>>>
>>> Can you put the break/return on a new line please.
>>>
>>> More generally for some reason today I have an aversion to "if (x)
>>> foo();" being on a single line :) But control flow in particular I
>>> prefer to see on its own line.
>>
>> I was trying to make this look small. Ok. I don't really like one
>> line if statements either anymore.
>>
>>>
>>> 2082 char buf[257];
>>> j
>>> Where does the 257 line length come from? Is this a defined limit
>>> for the release file?
>>>
>>> 2218 char buf[257];
>>>
>>> Where does the 257 line length come from? The flags line on my linux
>>> box is 410 characters.
>>
>> I found 257 in another place and thought it seemed good enough for
>> the cpuinfo line, because it's not looking for the 'flags' line. I
>> wonder if there's an advantage to a power of two for a local stack
>> buffer? I could go to 512 but I'm trying not to use stack space.
>>
>>>
>>> 2247 ARM32_ONLY("arm32") PPC_ONLY("PPC64")
>>> AARCH64_ONLY("AARCH64"), length);
>>>
>>> May I suggest "ARM" for ARM32 and it should be "AArch64".
>>
>> Okay.
>>>
>>> ---
>>>
>>> src/os/posix/vm/os_posix.cpp
>>>
>>> 241 // could use uname as in above also
>>> 242 return (gethostname(buf, buflen) != -1);
>>>
>>> I would suggest using uname given that Linux Glibc implements
>>> gethostname using uname. gethostname also has unclear semantics
>>> regarding nul-termination if the name is truncated.
>>
>> Okay, I'll use uname. I suppose both make an OS call (wasn't sure if
>> gethostname did).
>>>
>>> ---
>>>
>>> src/os/solaris/vm/os_solaris.cpp
>>>
>>> 1978 char tmp[256];
>>> 1979 if (fgets(tmp, 256, fp)) {
>>>
>>> 256 this time? :) And the second 256 should sizeof(tmp).
>>
>> Yes, I don't need a lot of characters. 256 should do. I'll change
>> the second to sizeof.
>>>
>>> Can you add a comment that you are getting only the first line, please.
>>
>> ok.
>>
>>>
>>> 2028 #ifdef AMD64
>>> 2029 snprintf(buf, buflen, "x86 64 bit %d MHz", clock);
>>> 2030 #elif (defined __sparc) && (defined _LP64)
>>> 2031 snprintf(buf, buflen, "Sparcv9 64 bit %d MHz", clock);
>>> 2032 #else
>>> 2033 #error "Invalid solaris configuration"
>>>
>>> I think "ifdef AMD64 else // must be sparc" will do here.
>>
>> Ok.
>>>
>>> Aside: it is truly horrible how little useful processor info you can
>>> get from Solaris. It still refers to i386 and i387!
>>>
>>
>> Yes, you think it would give me something better after going through
>> the trouble of calling a syscall to get it. You get better from
>> /proc/cpuinfo on linux but I don't know how to find this string on
>> solaris.
>>
>>> ---
>>>
>>> src/os/windows/vm/os_windows.cpp
>>>
>>> 1620 if (GetComputerNameEx(ComputerNameDnsHostname, buffer, &size)) {
>>>
>>> Is it worth turning the above into a call to os::get_host_name to
>>> avoid the duplicate call?
>>
>> Yeah, that's nicer.
>>>
>>> ---
>>>
>>> src/share/vm/runtime/arguments.cpp
>>>
>>> I'm not quite sure what gets printed for print_summary_on, or in
>>> what format. Can you given an example please showing where each
>>> piece comes from (jvm_flags vs jvm_args vs java_command).
>>
>> The distinction between jvm_flags and jvm_args is not obvious. The
>> jvm_flags come from .hotspotrc (or settings file) and the jvm_args is
>> what you'd actually pass on the command line or environment
>> variable. I was going to add the flags to the command line but
>> they're specified oddly without XX, which I guess you have discussed
>> on another thread. I changed this summary printing to add a line
>> for the contents of the settings file:
>>
>> --------------- S U M M A R Y ------------
>>
>> Settings File: +UseSharedSpaces +PrintCompressedOopsMode
>> Command Line: -Xint Kaboom
>>
>> Host: mymachine, Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz, 4 cores,
>> 7G, Ubuntu 14.04.1 LTS
>> Time: Fri Jul 17 15:04:33 2015 EDT elapsed time: 0 seconds (0d 0h 0m 0s)
>>
>>
>> And I added comments.
>>
>> This might need some adjustment for bug 8061999 if they're not by
>> default added to the command line arguments. I have to check that
>> for that change.
>>> ---
>>>
>>> src/share/vm/utilities/vmError.cpp
>>>
>>> 691 STEP(250, "(printing register info)")
>>> 699 STEP(260, "(printing registers, top of stack, instructions
>>> near pc)")
>>>
>>> What's the difference between registers and register info ??
>>
>> The first call decodes the register values if possible, ie. calls
>> print_location() on each value. The second prints the registers as
>> part of the context (which also includes top of stack and
>> instructions near the pc. The second printing of the registers is
>> redundant, and os::print_context() is only called from the vmError
>> handler. I couldn't decide to remove it though, because if
>> print_location() fails in step 250, you wouldn't have the registers,
>> so you get them again in step 260.
>>
>>>
>>> 564 STEP(140, "(printing VM options)" )
>>> 816 STEP(380, "(printing VM options)" )
>>>
>>> One of those should be changed.
>> Fixed.
>>
>> Thank you for noticing and suggesting all the things to clean up.
>>
>> Coleen
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Thanks,
>>>> Coleen
>>
>
More information about the hotspot-dev
mailing list