RFR: 8221535: add steal tick related information to hs_error file [linux]
David Holmes
david.holmes at oracle.com
Mon Apr 1 05:00:51 UTC 2019
Hi Matthias,
On 29/03/2019 11:37 pm, Baesken, Matthias wrote:
> Hello, new webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.3/
>
> changed the function to bool os::Linux::get_tick_information(cpu_ticks*
> out, int which_logical_cpu = -1)
Okay.
> added memset at beginning of get_tick_information
Okay
> removed the remaining vm_printf from
> src/hotspot/os/aix/os_perf_aix.cpp (at first I did not notice it
> was there as well ).
Okay
> [ kept the struct naming CPUPerfTicks ( I find plenty of such struct
> name styles in hotspot codebase ). ]
Agreed - no consistency and you're moving existing code so this is fine
IMHO.
>>
>
>>In case we are sure that we are always on 2.6.11+ kernels, then indeed I can remove the special handling.
>
>>I was sure that we are on 2.6+ but only 95% sure that we are on 2.6.11+ .
>
>>
>
> I kept the has_steal_ticks for now; in case someone can confirm
> 2.6.11+ for jdk13 I would remove it .
I can't comment on that.
One nit in os_linux.cpp:
+ if((fh = fopen("/proc/stat", "r")) == NULL) {
Need space after if
Otherwise this all seems fine to me.
Thanks,
David
> Best regards, Matthias
>
> *From:*Baesken, Matthias
> *Sent:* Freitag, 29. März 2019 11:02
> *To:* 'Thomas Stüfe' <thomas.stuefe at gmail.com>
> *Cc:* David Holmes <david.holmes at oracle.com>; hotspot-dev at openjdk.java.net
> *Subject:* RE: RFR: 8221535: add steal tick related information to
> hs_error file [linux]
>
> * I would completely remove this section:
> * +#ifdef DEBUG_LINUX_PROC_STAT
> * + vm_fprintf(stderr, "[stat] read "
> *
>
> Hi Thomas,
>
> There are no vm_printf any more in
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.2/src/hotspot/os/linux/os_linux.cpp.frames.html
>
> they have been removed already.
>
> * Every item past the "idle" column in /proc/stat has a (admittedly
> old) kernel version dependency.
> * Why do you think it is important to communicate the existence of
> "steal",
>
> In case we are sure that we are always on 2.6.11+ kernels, then
> indeed I can remove the special handling.
>
> I was sure that we are on 2.6+ but only 95% sure that we are on
> 2.6.11+ .
>
>>BTW, this has nothing to do with your change, but this whole desig seems inefficient:
>
>>In os_perf_linux.cpp - which is used e.g. from JFR - we call "os::Linux::get_tick_information" for every expected cpu, and so we parse /proc/stat n times for one reading.
>
> Sure this could/should be addressed ; but in another change .
>
> Regards, Matthias
>
> *From:*Thomas Stüfe <thomas.stuefe at gmail.com
> <mailto:thomas.stuefe at gmail.com>>
> *Sent:* Freitag, 29. März 2019 09:44
> *To:* Baesken, Matthias <matthias.baesken at sap.com
> <mailto:matthias.baesken at sap.com>>
> *Cc:* David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>; hotspot-dev at openjdk.java.net
> <mailto:hotspot-dev at openjdk.java.net>
> *Subject:* Re: RFR: 8221535: add steal tick related information to
> hs_error file [linux]
>
> Hi Matthias,
>
> as usual, thanks for your patience and perseverance. Unfortunately you
> touched a piece of coding which was not very pretty (IMHO) in the first
> place, so there are a number of further remarks:
>
> - Return type: please use a simple bool value, not OSReturn. Keeps this
> in line with all other OS functions. Plus I do not see us returning
> anything else than OS_OK and OS_ERR anyway, and callers seem not to care
> either.
>
> - Please use lowercase-underscore naming of the return struct, not camel
> case. Again, like all other os functions around. I suggest a simple
> "cpu_ticks".
>
> - Please change parameter order to get_tick_information and make the
> default of "which_logical_cpu" -1. And add a small comment that -1 means
> "all".
>
> // which_logical_cpu=-1 returns accumulated ticks for all cpus.
>
> bool os::Linux::get_tick_information(cpu_ticks* out, int
> which_logical_cpu = -1) {
>
> - Please initialize the output structure to zero at the begin of the
> function. Just a simple memset is sufficient.
>
> - I would rename "expected_tickinfo_count" to "required_tickinfo_count".
> That clearly states that we have a number of values which are mandatory
> but are prepared to handle optional values too.
>
> - I would completely remove this section:
>
> +#ifdef DEBUG_LINUX_PROC_STAT
>
> + vm_fprintf(stderr, "[stat] read "
>
> + UINT64_FORMAT " " UINT64_FORMAT " " UINT64_FORMAT " "
> UINT64_FORMAT " "
>
> + UINT64_FORMAT " " UINT64_FORMAT " " UINT64_FORMAT " "
> UINT64_FORMAT " " UINT64_FORMAT " \n",
>
> + userTicks, niceTicks, systemTicks, idleTicks,
>
> + iowTicks, irqTicks, sirqTicks, stealTicks, guestNiceTicks);
>
> +#endif
>
> vm_fprintf is nowhere defined and will not build when activated. Seems
> to be a relict. We call it only from here and from os_perf_aix.cpp.
> Please remove it from os_aix.cpp too.
>
> - Backward compatibility:
>
> I do not see the value in "has_steal_ticks". See manpage for /proc:
> http://man7.org/linux/man-pages/man5/proc.5.html
>
> Every item past the "idle" column in /proc/stat has a (admittedly old)
> kernel version dependency. Why do you think it is important to
> communicate the existence of "steal", but not of the items before
> (introduced before Linux 2.6.11)? Seems arbitrary.
>
> I mean, lets do this either right or not at all.
>
> If done right, I would prefer my suggested solution which somehow maps
> an INVALID value onto the output value. If you dislike my prior proposal
> (value_t) you could just define a special value to mean INVALID. E.g.
> #define INVALID_VALUE ((uint64_t)(0xFFFFFFFFFFFFFFFFUL)) or similar.
>
> Alternatively, we could decide to just not care. For values we do not
> find we could report zero (code does that now already). You may report
> "0" steals and not know if thats true or if the kernel is just too old
> to tell you, but again, that is the case for other values too.
>
> ------
>
> BTW, this has nothing to do with your change, but this whole desig seems
> inefficient:
>
> In os_perf_linux.cpp - which is used e.g. from JFR - we call
> "os::Linux::get_tick_information" for every expected cpu, and so we
> parse /proc/stat n times for one reading.
>
> We could instead just have a function like
> os::Linux::get_tick_information just return all information in one go.
> Would be way easier.
>
> Cheers, Thomas
>
> On Fri, Mar 29, 2019 at 8:56 AM Baesken, Matthias
> <matthias.baesken at sap.com <mailto:matthias.baesken at sap.com>> wrote:
>
> Hi David , thanks for your comments .
>
> New webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.2/
>
> moved os::CPUPerfTicks to os::Linux::CPUPerfTicks
> removed DEBUG_LINUX_PROC_STAT from os_linux.cpp
> addressed the minor comments
>
>
> Best regards, Matthias
>
>
> > -----Original Message-----
> > From: David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>>
> > Sent: Freitag, 29. März 2019 06:07
> > To: Baesken, Matthias <matthias.baesken at sap.com
> <mailto:matthias.baesken at sap.com>>; Thomas Stüfe
> > <thomas.stuefe at gmail.com <mailto:thomas.stuefe at gmail.com>>
> > Cc: hotspot-dev at openjdk.java.net
> <mailto:hotspot-dev at openjdk.java.net>
> > Subject: Re: RFR: 8221535: add steal tick related information to
> hs_error file
> > [linux]
> >
> > Hi Matthias,
> >
> > This generally seems okay.
> >
> > On 28/03/2019 2:50 am, Baesken, Matthias wrote:
> > > Hello, here is a second webrev :
> > >
> > > http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.1/
> > >
> > > I followed Davids idea to use the os_perf coding .
> > > For reusing I moved os::Linux::get_tick_information
> (former name
> > get_total_ticks , but it is now not only total but also steal
> ) into os_linux .
> > > The CPUPerfTicks struct has been enhanced by steal and
> a boolean flag
> > has_steal_ticks .
> > >
> > > 28 struct CPUPerfTicks {
> > > 29 uint64_t used;
> > > 30 uint64_t usedKernel;
> > > 31 uint64_t total;
> > > 32 uint64_t steal;
> > > 33 bool has_steal_ticks;
> > > 34 };
> >
> > Okay, but one minor nit: why put struct CPUPerfTicks in the os::
> > namespace with the code that uses it in the os::Linux namespace?
> >
> > A few other minor comments:
> >
> > src/hotspot/os/linux/os_linux.cpp
> >
> > + if (-1 == which_logical_cpu) {
> >
> > hotspot style puts the variable first in a comparison against a
> constant.
> >
> > + #ifdef DEBUG_LINUX_PROC_STAT
> >
> > This should either just be DEBUG code or else deleted. The vm_printfs
> > would ideally be logging statements if that is possible.
> >
> > + if (n > expected_tickinfo_count+3) {
> >
> > Spaces around + please.
> >
> > + st->print_cr("Steal ticks from vm start: %llu", (long long
> > unsigned) steal_ticks_difference);
> >
> > %llu should be UINT64_FORMAT.
> > The cast should not be needed.
> >
> > Thanks,
> > David
> > -----
> >
> > >
> > > I found the "big API" idea a bit too much for just 2
> use cases .
> > >
> > >
> > > Best regards , Matthias
> > >
>
More information about the hotspot-dev
mailing list