RFR: 8221535: add steal tick related information to hs_error file [linux]
David Holmes
david.holmes at oracle.com
Fri Mar 29 05:06:44 UTC 2019
Hi Matthias,
This generally seems okay.
On 28/03/2019 2:50 am, Baesken, Matthias wrote:
> Hello, here is a second webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.1/
>
> I followed Davids idea to use the os_perf coding .
> For reusing I moved os::Linux::get_tick_information (former name get_total_ticks , but it is now not only total but also steal ) into os_linux .
> The CPUPerfTicks struct has been enhanced by steal and a boolean flag has_steal_ticks .
>
> 28 struct CPUPerfTicks {
> 29 uint64_t used;
> 30 uint64_t usedKernel;
> 31 uint64_t total;
> 32 uint64_t steal;
> 33 bool has_steal_ticks;
> 34 };
Okay, but one minor nit: why put struct CPUPerfTicks in the os::
namespace with the code that uses it in the os::Linux namespace?
A few other minor comments:
src/hotspot/os/linux/os_linux.cpp
+ if (-1 == which_logical_cpu) {
hotspot style puts the variable first in a comparison against a constant.
+ #ifdef DEBUG_LINUX_PROC_STAT
This should either just be DEBUG code or else deleted. The vm_printfs
would ideally be logging statements if that is possible.
+ if (n > expected_tickinfo_count+3) {
Spaces around + please.
+ st->print_cr("Steal ticks from vm start: %llu", (long long
unsigned) steal_ticks_difference);
%llu should be UINT64_FORMAT.
The cast should not be needed.
Thanks,
David
-----
>
> I found the "big API" idea a bit too much for just 2 use cases .
>
>
> Best regards , Matthias
>
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Mittwoch, 27. März 2019 11:22
>> To: Baesken, Matthias <matthias.baesken at sap.com>; Thomas Stüfe
>> <thomas.stuefe at gmail.com>
>> Cc: hotspot-dev at openjdk.java.net
>> Subject: Re: RFR: 8221535: add steal tick related information to hs_error file
>> [linux]
>>
>> On 27/03/2019 8:04 pm, Baesken, Matthias wrote:
>>> Hi Thomas, interesting idea . David – would you be fine with what
>>> Thomas proposes ?
>>
>> Isn't adding the missing fields to the existing os_perf code basically
>> the same thing? I don't see any point in duplicating existing code like
>> this.
>>
>> David
>> -----
>>
>>> Best regards, Matthias
>>>
>>> *From:*Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Mittwoch, 27. März 2019 10:28
>>> *To:* Baesken, Matthias <matthias.baesken at sap.com>
>>> *Cc:* hotspot-dev at openjdk.java.net
>>> *Subject:* Re: RFR: 8221535: add steal tick related information to
>>> hs_error file [linux]
>>>
>>> Hi Matthias,
>>>
>>> this is a lot of coding for just one counter...
>>>
>>> Also os_perf_linux.cpp does something very similar. However, that code
>>> does not get the steal ticks you want, does not seem to care about
>>> kernel differences and could also be improved.
>>>
>>> Proposal: Why do you not tweak your new API to be a general purpose API
>>> to read all ticks from /proc/stat. That would be way more useful. Later,
>>> if we feel like it, could change os_perf_linux.cpp to use your function.
>>>
>>> We did something similar for the statistical history in the SapMachine, see
>>>
>>>
>> https://github.com/SAP/SapMachine/blob/sapmachine/src/hotspot/os/linu
>> x/stathist_linux.cpp
>>>
>>> . So I would suggest to have a function reading proc stat and returning
>>> all values in a struct. That struct should have provisions for "value
>>> not found" since which values you get depends in kernel version.
>>>
>>> --------
>>>
>>> Example:
>>>
>>> struct value_t { bool valid; uint64_t v; }
>>>
>>> struct cpu_values_t {
>>> value_t user;
>>> value_t nice;
>>> value_t system;
>>> value_t idle;
>>> value_t iowait;
>>> value_t steal;
>>> value_t guest;
>>> value_t guest_nice;
>>> };
>>>
>>> void parse_proc_stat_cpu_line(const char* line, cpu_values_t* out) {
>>> // Note: existence of some of these values depends on kernel version
>>> out->user.valid = out->nice.valid = out->system.valid =
>>> out->idle.valid = out->iowait.valid = out->steal.valid =
>>> out->guest.valid = out->guest_nice.valid = false;
>>> int user, nice, system, idle, iowait, irq, softirq, steal, guest,
>>> guest_nice;
>>> int num = ::sscanf(line, "cpu %d %d %d %d %d %d %d %d %d %d",
>>> &user, &nice, &system, &idle, &iowait,
>>> &irq, &softirq, &steal, &guest, &guest_nice);
>>> if (num >= 4) {
>>> out->user.v = user; out->user.valid = true;
>>> out->nice.v = nice; out->nice.valid = true;
>>> out->system.v = system; out->system.valid = true;
>>> out->idle.v = idle; out->idle.valid = true;
>>> if (num >= 5) { // iowait (5) (since Linux 2.5.41)
>>> out->iowait = iowait; out->iowait.valid = true;
>>> if (num >= 8) { // steal (8) (since Linux 2.6.11)
>>> out->steal = steal; out->steal.valid = true;
>>> if (num >= 9) { // guest (9) (since Linux 2.6.24)
>>> out->guest = guest; out->guest.valid = true;
>>> if (num >= 10) { // guest (9) (since Linux 2.6.33)
>>> out->guest_nice = guest_nice; out->guest_nice.valid = true;
>>> }
>>> }
>>> }
>>> }
>>>
>>> Cheers Thomas
>>>
>>> On Wed, Mar 27, 2019 at 9:42 AM Baesken, Matthias
>>> <matthias.baesken at sap.com <mailto:matthias.baesken at sap.com>>
>> wrote:
>>>
>>> Hello, please review the following change .
>>>
>>> Recent Linux kernels provide in the /proc filesystem so called
>>> "steal" information, see the the proc man page :
>>>
>>> ..
>>> "there is an eighth column, steal - stolen time, which is the time
>>> spent in other operating systems when running in a virtualized
>>> environment"
>>>
>>> The info was useful in our internal VM, when looking into
>>> performance issues (especially in virtualized environments), so it
>>> should be added to the hs_err file .
>>>
>>> Bug/webrev :
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8221535
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8221535.0/
>>>
>>>
>>> Best Regards, Matthias
>>>
More information about the hotspot-dev
mailing list