jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values
Jaroslav Bachorik
jaroslav.bachorik at oracle.com
Wed Oct 16 09:16:15 PDT 2013
On 15.10.2013 08:49, David Holmes wrote:
> Hi Jaroslav,
>
> os_bsd.cpp / os_linux.cpp:
>
> If you don't have a monotonic clock you leave timer_frequency set to 0!
> (So you need to test on a system without a monotonic clock, or else
> force it to act as-if not present.)
>
> That aside I don't trust clock_getres to give values that actually allow
> the timer frequency to be determined. As per the comments in os_linux.cpp:
>
> // It's fixed in newer kernels, however clock_getres() still returns
> // 1/HZ. We check if clock_getres() works, but will ignore its reported
> // resolution for now. Hopefully as people move to new kernels, this
> // won't be a problem.
>
> we don't know what kernels provide real values here and which provide
> dummy ones.
>
> On BSD you haven't modified os::elapsed_counter.
>
> Looking at the linux changes I don't think the logic is correct even if
> clock_getres is accurate. In the existing code we have:
>
> elapsed_counter -> elapsed time in microseconds
> elapsed_frequency -> 1000 * 1000 (ie micros per second)
> elapsed_time -> elapsed_counter*0.000001 -> time in seconds
>
> Now we have:
>
> elapsed_counter -> elapsed time in nanoseconds
> elapsed_frequency -> 1x10^9 / whatever clock_getres says
> elapsed_time -> counter/frequency -> ???
>
> So elapsed_time not, in general, going to give the elapsed time in
> seconds. And elapsed_time is not dependent on the "frequency" at all
> because elapsed_counter is not reporting ticks but an actual elapsed
> "time" in nanoseconds.
>
>
> Also note that we constants for:
>
> NANOSECS_PER_SEC
> NANOSECS_PER_MILLISEC
>
> to aid with time conversions.
>
> The linux webrev contains unrelated UseLargePages changes!
Sorry for the mess with UseLargePages changes :/
I've fixed the problems with the frequency (using a fixed number as
before) and I kept the changes to minimum.
I was hesitating about changing the elapsed_counter precision from
microseconds to nanoseconds but since solaris and windows versions
already use nanosecond ticks for elapsed_counter I think the change is safe.
The update webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>
>
> David
> -----
>
>
> On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:
>> On 10.10.2013 13:15, Staffan Larsen wrote:
>>>
>>> On 10 okt 2013, at 13:02, Jaroslav Bachorik
>>> <jaroslav.bachorik at oracle.com> wrote:
>>>
>>>> On 10.10.2013 05:44, David Holmes wrote:
>>>>> On 10/10/2013 4:12 AM, Staffan Larsen wrote:
>>>>>>
>>>>>> On 9 okt 2013, at 16:19, Jaroslav Bachorik
>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>
>>>>>>> On 9.10.2013 16:10, Staffan Larsen wrote:
>>>>>>>> There is now an awful amount of different timestamps in the
>>>>>>>> Management class. Can they be consolidated somehow? At least
>>>>>>>> _begin_vm_creation_time and the new _begin_vm_creation_ns.
>>>>>>>>
>>>>>>>> This discussion also implies that the "elapsed time" we print in
>>>>>>>> the
>>>>>>>> hs_err file is also wrong. It uses os::elapsedTime() which uses
>>>>>>>> os::elapsed_counter().
>>>>>>>>
>>>>>>>> And I guess the same thing for the VM.uptime Diagnostic Command
>>>>>>>> (class VMUptimeDCmd) which also relies on os::elapsed_counter().
>>>>>>>
>>>>>>> Also the reported GC pauses duration might be wrong since it uses
>>>>>>> Management::timestamp().
>>>>>>>
>>>>>>> On the first sight the change looks rather trivial. But, honestly,
>>>>>>> I'm not sure which other parts could for whatever reason break once
>>>>>>> the time-of-day timestamp is replaced with a monotonic equivalent.
>>>>>>> One would think that it shouldn't matter but one never knows ...
>>>>>>>
>>>>>>> Staffan, do you think this kind of change is suitable for the
>>>>>>> current
>>>>>>> phase of JDK release cycle? I think I could improve the patch in few
>>>>>>> days and then it should probably be able to pass the review before
>>>>>>> ZBB. But, it's only P3 ...
>>>>>>
>>>>>> I think it is a bit late in the release cycle to clean this up in the
>>>>>> way it should be cleaned up. I think we should wait until the first 8
>>>>>> update release and do a more thorough job than we have time for right
>>>>>> now.
>>>>>
>>>>> I second that. The elapsed_counter/elpased_timer APIs and
>>>>> implementations are a tangled mess. But part of the problem has been
>>>>> that people want/expect monotonic time-of-day based timestamps (yes a
>>>>> contradiction - though some people make sure TOD does not get modified
>>>>> on their production systems). The use of timestamps in logging has
>>>>> to be
>>>>> examined carefully - mainly GC logging. I recall a "simple" attempted
>>>>> change in the past that resulted in trying to compare a nanoTime based
>>>>> timestamp with the TOD. :(
>>>>
>>>> Actually, if I'm reading the sources right for Solaris and Win the
>>>> monotonic clock source is used to provide elapsed_counter() value. It
>>>> falls back to TOD when the monotonic clock source is not available.
>>>> For Linux/BSD the TOD is used directly.
>>>>
>>>> This makes me wonder if changing the linux/bsd implementation to
>>>> follow the same logic would be really that disruptive.
>>>
>>> Good point. I would like a world where elapsed_counter is monotonic
>>> (where possible). Still a bit scary this late in the release, but an
>>> interesting experiment.
>>
>> The change is rather simple and tests ok. All the means to get a
>> monotonic timestamp are already there and proved to work. The core tests
>> in JPRT went fine.
>>
>> The updated webrev is at
>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02
>>
>> -JB-
>>
>>>
>>> /Staffan
>>>
>>>
>>>
>>>
>>>>
>>>> -JB-
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> /Staffan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>>>
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 9 okt 2013, at 13:26, Jaroslav Bachorik
>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>
>>>>>>>>> On 8.10.2013 23:46, David Holmes wrote:
>>>>>>>>>> On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>> On 8.10.2013 09:34, David Holmes wrote:
>>>>>>>>>>>> Jaroslav,
>>>>>>>>>>>>
>>>>>>>>>>>> On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> currently the JVM uptime reported by the RuntimeMXBean is
>>>>>>>>>>>>> based on
>>>>>>>>>>>>> System.currentTimeMillis() which makes it susceptible to
>>>>>>>>>>>>> changes of the
>>>>>>>>>>>>> OS time (eg. changing timezone, NTP synchronization etc.). The
>>>>>>>>>>>>> uptime
>>>>>>>>>>>>> should not depend on the system time and should be calculated
>>>>>>>>>>>>> using a
>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is already the way to get the actual JVM uptime in
>>>>>>>>>>>>> ticks.
>>>>>>>>>>>>> It is
>>>>>>>>>>>>> accessible as Management::timestamp() and the ticks are
>>>>>>>>>>>>> convertible to
>>>>>>>>>>>>> milliseconds using Management::ticks_to_ms(ts_ticks) thus
>>>>>>>>>>>>> making it
>>>>>>>>>>>>> very
>>>>>>>>>>>>> easy to switch to the monotonic clock based uptime.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe I'm missing something but TiumeStamp updates using
>>>>>>>>>>>> os::elapsed_counter() which on Linux uses gettimeofday so is
>>>>>>>>>>>> not a
>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>
>>>>>>>>>>> Hm, yes. I wasn't aware of this linux/bsd specific.
>>>>>>>>>>>
>>>>>>>>>>> Is there any reason why a non monotonic clock source is used for
>>>>>>>>>>> timestamping except of the historical one? os::javaTimeNanos()
>>>>>>>>>>> uses
>>>>>>>>>>> montonic clock when available - why can't be the same used for
>>>>>>>>>>> os::elapsed_counter() especially when a counter based on
>>>>>>>>>>> "gettimeofday"
>>>>>>>>>>> is not really a counter?
>>>>>>>>>>
>>>>>>>>>> It is all historical. These elapsed_counters and elapsed_timers
>>>>>>>>>> make me
>>>>>>>>>> cringe. But changing it has a lot of potential consequences
>>>>>>>>>> because of
>>>>>>>>>> the way these are used in logging etc. Certainly not something
>>>>>>>>>> to be
>>>>>>>>>> contemplated at this stage of JDK 8.
>>>>>>>>>>
>>>>>>>>>> Perhaps a simpler fix here is to expose a startUpTimeNanos that
>>>>>>>>>> can then
>>>>>>>>>> be used for the uptime.
>>>>>>>>>
>>>>>>>>> My attempt at this is at
>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
>>>>>>>>> I am using os::javaTimeNanos() to get the monotonic ticks where
>>>>>>>>> possible.
>>>>>>>>>
>>>>>>>>> The JDK part stays the same as for webrev.00
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> -JB-
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>> -----
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> The patch consists of the hotspot and jdk parts.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the hotspot a new constant needs to be introduced in
>>>>>>>>>>>>> src/share/vm/services/jmm.h and the actual logic to obtain the
>>>>>>>>>>>>> uptime in
>>>>>>>>>>>>> milliseconds is added in src/share/vm/services/management.cpp.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the jdk the changes comprise of adding the necessary JNI
>>>>>>>>>>>>> bridging
>>>>>>>>>>>>> methods in order to get the new uptime, introducing the same
>>>>>>>>>>>>> constant
>>>>>>>>>>>>> that is used in hotspot and changes to mapfile-vers files in
>>>>>>>>>>>>> order to
>>>>>>>>>>>>> properly build the native library.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-6523160
>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> -JB-
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
More information about the serviceability-dev
mailing list