RFR: 8221535: add steal tick related information to hs_error file [linux]

David Holmes david.holmes at oracle.com
Mon Apr 1 05:00:51 UTC 2019


Hi Matthias,

On 29/03/2019 11:37 pm, Baesken, Matthias wrote:
> Hello, new webrev :
> 
> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.3/
> 
> changed the function to bool os::Linux::get_tick_information(cpu_ticks* 
> out, int which_logical_cpu = -1)

Okay.

> added memset  at beginning of get_tick_information

Okay

> removed the remaining vm_printf    from 
> src/hotspot/os/aix/os_perf_aix.cpp      (at first I did not notice it 
>   was  there as well ).

Okay

> [ kept the struct  naming CPUPerfTicks    ( I find plenty of such struct 
> name  styles  in hotspot codebase  ).   ]

Agreed - no consistency and you're moving existing code so this is fine 
IMHO.

>>
> 
>>In case  we  are sure that we are always  on  2.6.11+   kernels, then indeed I can remove the  special handling.
> 
>>I was sure  that  we are on 2.6+  but only 95%  sure that we are on  2.6.11+   .
> 
>>
> 
> I kept the  has_steal_ticks   for now; in case someone  can  confirm 
> 2.6.11+  for jdk13  I would remove it .

I can't comment on that.

One nit in os_linux.cpp:

+   if((fh = fopen("/proc/stat", "r")) == NULL) {

Need space after if

Otherwise this all seems fine to me.

Thanks,
David

> Best regards, Matthias
> 
> *From:*Baesken, Matthias
> *Sent:* Freitag, 29. März 2019 11:02
> *To:* 'Thomas Stüfe' <thomas.stuefe at gmail.com>
> *Cc:* David Holmes <david.holmes at oracle.com>; hotspot-dev at openjdk.java.net
> *Subject:* RE: RFR: 8221535: add steal tick related information to 
> hs_error file [linux]
> 
>   * I would completely remove this section:
>   * +#ifdef DEBUG_LINUX_PROC_STAT
>   * +  vm_fprintf(stderr, "[stat] read "
> *
> 
> Hi Thomas,
> 
> There are no  vm_printf   any more in
> 
> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.2/src/hotspot/os/linux/os_linux.cpp.frames.html
> 
> they have been removed  already.
> 
>   * Every item past the "idle" column in /proc/stat has a (admittedly
>     old) kernel version dependency.
>   * Why do you think it is important to communicate the existence of
>     "steal",
> 
> In case  we  are sure that we are always  on  2.6.11+   kernels, then 
> indeed I can remove the  special handling.
> 
> I was sure  that  we are on 2.6+  but only 95%  sure that we are on  
> 2.6.11+   .
> 
>>BTW, this has nothing to do with your change, but this whole desig seems inefficient:
> 
>>In os_perf_linux.cpp - which is used e.g. from JFR - we call "os::Linux::get_tick_information" for every expected cpu, and so we parse /proc/stat n times for one reading.
> 
> Sure this could/should  be addressed ;  but in another change .
> 
> Regards, Matthias
> 
> *From:*Thomas Stüfe <thomas.stuefe at gmail.com 
> <mailto:thomas.stuefe at gmail.com>>
> *Sent:* Freitag, 29. März 2019 09:44
> *To:* Baesken, Matthias <matthias.baesken at sap.com 
> <mailto:matthias.baesken at sap.com>>
> *Cc:* David Holmes <david.holmes at oracle.com 
> <mailto:david.holmes at oracle.com>>; hotspot-dev at openjdk.java.net 
> <mailto:hotspot-dev at openjdk.java.net>
> *Subject:* Re: RFR: 8221535: add steal tick related information to 
> hs_error file [linux]
> 
> Hi Matthias,
> 
> as usual, thanks for your patience and perseverance. Unfortunately you 
> touched a piece of coding which was not very pretty (IMHO) in the first 
> place, so there are a number of further remarks:
> 
> - Return type: please use a simple bool value, not OSReturn. Keeps this 
> in line with all other OS functions. Plus I do not see us returning 
> anything else than OS_OK and OS_ERR anyway, and callers seem not to care 
> either.
> 
> - Please use lowercase-underscore naming of the return struct, not camel 
> case. Again, like all other os functions around. I suggest a simple 
> "cpu_ticks".
> 
> - Please change parameter order to get_tick_information and make the 
> default of "which_logical_cpu" -1. And add a small comment that -1 means 
> "all".
> 
> // which_logical_cpu=-1 returns accumulated ticks for all cpus.
> 
> bool os::Linux::get_tick_information(cpu_ticks* out, int 
> which_logical_cpu = -1) {
> 
> - Please initialize the output structure to zero at the begin of the 
> function. Just a simple memset is sufficient.
> 
> - I would rename "expected_tickinfo_count" to "required_tickinfo_count". 
> That clearly states that we have a number of values which are mandatory 
> but are prepared to handle optional values too.
> 
> - I would completely remove this section:
> 
> +#ifdef DEBUG_LINUX_PROC_STAT
> 
> +  vm_fprintf(stderr, "[stat] read "
> 
> +          UINT64_FORMAT " " UINT64_FORMAT " " UINT64_FORMAT " " 
> UINT64_FORMAT " "
> 
> +          UINT64_FORMAT " " UINT64_FORMAT " " UINT64_FORMAT " " 
> UINT64_FORMAT " " UINT64_FORMAT " \n",
> 
> +          userTicks, niceTicks, systemTicks, idleTicks,
> 
> +          iowTicks, irqTicks, sirqTicks, stealTicks, guestNiceTicks);
> 
> +#endif
> 
> vm_fprintf is nowhere defined and will not build when activated. Seems 
> to be a relict. We call it only from here and from os_perf_aix.cpp. 
> Please remove it from os_aix.cpp too.
> 
> - Backward compatibility:
> 
> I do not see the value in "has_steal_ticks". See manpage for /proc: 
> http://man7.org/linux/man-pages/man5/proc.5.html
> 
> Every item past the "idle" column in /proc/stat has a (admittedly old) 
> kernel version dependency. Why do you think it is important to 
> communicate the existence of "steal", but not of the items before 
> (introduced before Linux 2.6.11)? Seems arbitrary.
> 
> I mean, lets do this either right or not at all.
> 
> If done right, I would prefer my suggested solution which somehow maps 
> an INVALID value onto the output value. If you dislike my prior proposal 
> (value_t) you could just define a special value to mean INVALID. E.g. 
> #define INVALID_VALUE ((uint64_t)(0xFFFFFFFFFFFFFFFFUL)) or similar.
> 
> Alternatively, we could decide to just not care. For values we do not 
> find we could report zero (code does that now already). You may report 
> "0" steals and not know if thats true or if the kernel is just too old 
> to tell you, but again, that is the case for other values too.
> 
> ------
> 
> BTW, this has nothing to do with your change, but this whole desig seems 
> inefficient:
> 
> In os_perf_linux.cpp - which is used e.g. from JFR - we call 
> "os::Linux::get_tick_information" for every expected cpu, and so we 
> parse /proc/stat n times for one reading.
> 
> We could instead just have a function like 
> os::Linux::get_tick_information just return all information in one go. 
> Would be way easier.
> 
> Cheers, Thomas
> 
> On Fri, Mar 29, 2019 at 8:56 AM Baesken, Matthias 
> <matthias.baesken at sap.com <mailto:matthias.baesken at sap.com>> wrote:
> 
>     Hi David ,   thanks  for  your comments .
> 
>     New webrev :
> 
>     http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.2/
> 
>     moved  os::CPUPerfTicks    to   os::Linux::CPUPerfTicks
>     removed DEBUG_LINUX_PROC_STAT  from os_linux.cpp
>     addressed the minor  comments
> 
> 
>     Best regards, Matthias
> 
> 
>      > -----Original Message-----
>      > From: David Holmes <david.holmes at oracle.com
>     <mailto:david.holmes at oracle.com>>
>      > Sent: Freitag, 29. März 2019 06:07
>      > To: Baesken, Matthias <matthias.baesken at sap.com
>     <mailto:matthias.baesken at sap.com>>; Thomas Stüfe
>      > <thomas.stuefe at gmail.com <mailto:thomas.stuefe at gmail.com>>
>      > Cc: hotspot-dev at openjdk.java.net
>     <mailto:hotspot-dev at openjdk.java.net>
>      > Subject: Re: RFR: 8221535: add steal tick related information to
>     hs_error file
>      > [linux]
>      >
>      > Hi Matthias,
>      >
>      > This generally seems okay.
>      >
>      > On 28/03/2019 2:50 am, Baesken, Matthias wrote:
>      > > Hello, here is a second webrev :
>      > >
>      > > http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.1/
>      > >
>      > > I  followed  Davids idea  to  use  the  os_perf coding .
>      > > For reusing  I  moved  os::Linux::get_tick_information   
>     (former name
>      > get_total_ticks  , but it is now not only total  but also steal 
>     )   into  os_linux .
>      > > The CPUPerfTicks   struct has been  enhanced   by    steal and 
>     a boolean flag
>      > has_steal_ticks  .
>      > >
>      > >    28 struct CPUPerfTicks {
>      > >    29   uint64_t used;
>      > >    30   uint64_t usedKernel;
>      > >    31   uint64_t total;
>      > >    32   uint64_t steal;
>      > >    33   bool     has_steal_ticks;
>      > >    34 };
>      >
>      > Okay, but one minor nit: why put struct CPUPerfTicks in the os::
>      > namespace with the code that uses it in the os::Linux namespace?
>      >
>      > A few other minor comments:
>      >
>      > src/hotspot/os/linux/os_linux.cpp
>      >
>      > +   if (-1 == which_logical_cpu) {
>      >
>      > hotspot style puts the variable first in a comparison against a
>     constant.
>      >
>      > + #ifdef DEBUG_LINUX_PROC_STAT
>      >
>      > This should either just be DEBUG code or else deleted. The vm_printfs
>      > would ideally be logging statements if that is possible.
>      >
>      > +   if (n > expected_tickinfo_count+3) {
>      >
>      > Spaces around + please.
>      >
>      > +       st->print_cr("Steal ticks from vm start: %llu", (long long
>      > unsigned) steal_ticks_difference);
>      >
>      > %llu should be UINT64_FORMAT.
>      > The cast should not be needed.
>      >
>      > Thanks,
>      > David
>      > -----
>      >
>      > >
>      > > I found  the  "big API"   idea  a bit too much  for  just 2 
>     use cases  .
>      > >
>      > >
>      > > Best regards , Matthias
>      > >
> 


More information about the hotspot-dev mailing list