jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values

Jaroslav Bachorik jaroslav.bachorik at oracle.com
Thu Oct 17 05:09:40 PDT 2013


On 17.10.2013 13:07, David Holmes wrote:
> On 17/10/2013 8:10 PM, Jaroslav Bachorik wrote:
>> Hi David,
>>
>> On 17.10.2013 04:26, David Holmes wrote:
>>> Hi Jaroslav,
>>>
>>> Minor nit: os::elapsed_time should really be defined in terms of the
>>> other functions ie:
>>>
>>> return ((double) os::elapsed_counter()) / os::elapsed_frequency();
>>
>> Ok. I've changed it. It better communicates the way the elapsedTime is
>> calculated anyway.
>>
>>>
>>> I also prefer the cast above as it is very clear that we will be doing a
>>> floating-point division.
>>>
>>> Aside: AFAICS os::elapsed_time() is never actually used ??
>>
>> Actually, it is os::elapsedTime() and this one is used quite a lot. The
>> "elapsed_time()" form is used only in bytecodeHistogram.hpp,
>> parNewGeneration.hpp and g1CollectedHeap.hpp, where it is also declared.
>
> AFAICS they all define their own elapsed_time() functions they don't use
> os::elapsed_time().
>
>>>
>>> I agree that it appears that changing the frequency should be okay.
>>
>> Thanks for the feedback.
>>
>> Updated webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>
> That should be .04 version :)

Yep :( copy-paste ...
http://cr.openjdk.java.net/~jbachorik/6523160/webrev.04

>
> Looks okay.
>
> Thanks,
> David
>
>> -JB-
>>
>>>
>>> Thanks,
>>> David
>>>
>>> On 17/10/2013 2:16 AM, Jaroslav Bachorik wrote:
>>>> On 15.10.2013 08:49, David Holmes wrote:
>>>>> Hi Jaroslav,
>>>>>
>>>>> os_bsd.cpp / os_linux.cpp:
>>>>>
>>>>> If you don't have a monotonic clock you leave timer_frequency set
>>>>> to 0!
>>>>> (So you need to test on a system without a monotonic clock, or else
>>>>> force it to act as-if not present.)
>>>>>
>>>>> That aside I don't trust clock_getres to give values that actually
>>>>> allow
>>>>> the timer frequency to be determined. As per the comments in
>>>>> os_linux.cpp:
>>>>>
>>>>> // It's fixed in newer kernels, however clock_getres() still returns
>>>>> // 1/HZ. We check if clock_getres() works, but will ignore its
>>>>> reported
>>>>> // resolution for now. Hopefully as people move to new kernels, this
>>>>> // won't be a problem.
>>>>>
>>>>> we don't know what kernels provide real values here and which provide
>>>>> dummy ones.
>>>>>
>>>>> On BSD you haven't modified os::elapsed_counter.
>>>>>
>>>>> Looking at the linux changes I don't think the logic is correct
>>>>> even if
>>>>> clock_getres is accurate. In the existing code we have:
>>>>>
>>>>> elapsed_counter -> elapsed time in microseconds
>>>>> elapsed_frequency -> 1000 * 1000 (ie micros per second)
>>>>> elapsed_time -> elapsed_counter*0.000001 -> time in seconds
>>>>>
>>>>> Now we have:
>>>>>
>>>>> elapsed_counter -> elapsed time in nanoseconds
>>>>> elapsed_frequency -> 1x10^9 / whatever clock_getres says
>>>>> elapsed_time -> counter/frequency -> ???
>>>>>
>>>>> So elapsed_time not, in general, going to give the elapsed time in
>>>>> seconds. And elapsed_time is not dependent on the "frequency" at all
>>>>> because elapsed_counter is not reporting ticks but an actual elapsed
>>>>> "time" in nanoseconds.
>>>>>
>>>>>
>>>>> Also note that we constants for:
>>>>>
>>>>> NANOSECS_PER_SEC
>>>>> NANOSECS_PER_MILLISEC
>>>>>
>>>>> to aid with time conversions.
>>>>>
>>>>> The linux webrev contains unrelated UseLargePages changes!
>>>>
>>>> Sorry for the mess with UseLargePages changes :/
>>>>
>>>> I've fixed the problems with the frequency (using a fixed number as
>>>> before) and I kept the changes to minimum.
>>>>
>>>> I was hesitating about changing the elapsed_counter precision from
>>>> microseconds to nanoseconds but since solaris and windows versions
>>>> already use nanosecond ticks for elapsed_counter I think the change is
>>>> safe.
>>>>
>>>> The update webrev:
>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>>>>
>>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>
>>>>> On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:
>>>>>> On 10.10.2013 13:15, Staffan Larsen wrote:
>>>>>>>
>>>>>>> On 10 okt 2013, at 13:02, Jaroslav Bachorik
>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>
>>>>>>>> On 10.10.2013 05:44, David Holmes wrote:
>>>>>>>>> On 10/10/2013 4:12 AM, Staffan Larsen wrote:
>>>>>>>>>>
>>>>>>>>>> On 9 okt 2013, at 16:19, Jaroslav Bachorik
>>>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 9.10.2013 16:10, Staffan Larsen wrote:
>>>>>>>>>>>> There is now an awful amount of different timestamps in the
>>>>>>>>>>>> Management class. Can they be consolidated somehow? At least
>>>>>>>>>>>> _begin_vm_creation_time and the new _begin_vm_creation_ns.
>>>>>>>>>>>>
>>>>>>>>>>>> This discussion also implies that the "elapsed time" we
>>>>>>>>>>>> print in
>>>>>>>>>>>> the
>>>>>>>>>>>> hs_err file is also wrong. It uses os::elapsedTime() which uses
>>>>>>>>>>>> os::elapsed_counter().
>>>>>>>>>>>>
>>>>>>>>>>>> And I guess the same thing for the VM.uptime Diagnostic Command
>>>>>>>>>>>> (class VMUptimeDCmd) which also relies on
>>>>>>>>>>>> os::elapsed_counter().
>>>>>>>>>>>
>>>>>>>>>>> Also the reported GC pauses duration might be wrong since it
>>>>>>>>>>> uses
>>>>>>>>>>> Management::timestamp().
>>>>>>>>>>>
>>>>>>>>>>> On the first sight the change looks rather trivial. But,
>>>>>>>>>>> honestly,
>>>>>>>>>>> I'm not sure which other parts could for whatever reason break
>>>>>>>>>>> once
>>>>>>>>>>> the time-of-day timestamp is replaced with a monotonic
>>>>>>>>>>> equivalent.
>>>>>>>>>>> One would think that it shouldn't matter but one never knows ...
>>>>>>>>>>>
>>>>>>>>>>> Staffan, do you think this kind of change is suitable for the
>>>>>>>>>>> current
>>>>>>>>>>> phase of JDK release cycle? I think I could improve the patch in
>>>>>>>>>>> few
>>>>>>>>>>> days and then it should probably be able to pass the review
>>>>>>>>>>> before
>>>>>>>>>>> ZBB. But, it's only P3  ...
>>>>>>>>>>
>>>>>>>>>> I think it is a bit late in the release cycle to clean this up in
>>>>>>>>>> the
>>>>>>>>>> way it should be cleaned up. I think we should wait until the
>>>>>>>>>> first 8
>>>>>>>>>> update release and do a more thorough job than we have time for
>>>>>>>>>> right
>>>>>>>>>> now.
>>>>>>>>>
>>>>>>>>> I second that. The elapsed_counter/elpased_timer APIs and
>>>>>>>>> implementations are a tangled mess. But part of the problem has
>>>>>>>>> been
>>>>>>>>> that people want/expect monotonic time-of-day based timestamps
>>>>>>>>> (yes a
>>>>>>>>> contradiction - though some people make sure TOD does not get
>>>>>>>>> modified
>>>>>>>>> on their production systems). The use of timestamps in logging has
>>>>>>>>> to be
>>>>>>>>> examined carefully - mainly GC logging. I recall a "simple"
>>>>>>>>> attempted
>>>>>>>>> change in the past that resulted in trying to compare a nanoTime
>>>>>>>>> based
>>>>>>>>> timestamp with the TOD. :(
>>>>>>>>
>>>>>>>> Actually, if I'm reading the sources right for Solaris and Win the
>>>>>>>> monotonic clock source is used to provide elapsed_counter()
>>>>>>>> value. It
>>>>>>>> falls back to TOD when the monotonic clock source is not available.
>>>>>>>> For Linux/BSD the TOD is used directly.
>>>>>>>>
>>>>>>>> This makes me wonder if changing the linux/bsd implementation to
>>>>>>>> follow the same logic would be really that disruptive.
>>>>>>>
>>>>>>> Good point. I would like a world where elapsed_counter is monotonic
>>>>>>> (where possible). Still a bit scary this late in the release, but an
>>>>>>> interesting experiment.
>>>>>>
>>>>>> The change is rather simple and tests ok. All the means to get a
>>>>>> monotonic timestamp are already there and proved to work. The core
>>>>>> tests
>>>>>> in JPRT went fine.
>>>>>>
>>>>>> The updated webrev is at
>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>>
>>>>>>> /Staffan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -JB-
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 9 okt 2013, at 13:26, Jaroslav Bachorik
>>>>>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 8.10.2013 23:46, David Holmes wrote:
>>>>>>>>>>>>>> On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>> On 8.10.2013 09:34, David Holmes wrote:
>>>>>>>>>>>>>>>> Jaroslav,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> currently the JVM uptime reported by the RuntimeMXBean is
>>>>>>>>>>>>>>>>> based on
>>>>>>>>>>>>>>>>> System.currentTimeMillis() which makes it susceptible to
>>>>>>>>>>>>>>>>> changes of the
>>>>>>>>>>>>>>>>> OS time (eg. changing timezone, NTP synchronization etc.).
>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>> uptime
>>>>>>>>>>>>>>>>> should not depend on the system time and should be
>>>>>>>>>>>>>>>>> calculated
>>>>>>>>>>>>>>>>> using a
>>>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There is already the way to get the actual JVM uptime in
>>>>>>>>>>>>>>>>> ticks.
>>>>>>>>>>>>>>>>> It is
>>>>>>>>>>>>>>>>> accessible as Management::timestamp() and the ticks are
>>>>>>>>>>>>>>>>> convertible to
>>>>>>>>>>>>>>>>> milliseconds using Management::ticks_to_ms(ts_ticks) thus
>>>>>>>>>>>>>>>>> making it
>>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>>>> easy to switch to the monotonic clock based uptime.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maybe I'm missing something but TiumeStamp updates using
>>>>>>>>>>>>>>>> os::elapsed_counter() which on Linux uses gettimeofday
>>>>>>>>>>>>>>>> so is
>>>>>>>>>>>>>>>> not a
>>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hm, yes. I wasn't aware of this linux/bsd specific.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there any reason why a non monotonic clock source is used
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>> timestamping except of the historical one?
>>>>>>>>>>>>>>> os::javaTimeNanos()
>>>>>>>>>>>>>>> uses
>>>>>>>>>>>>>>> montonic clock when available - why can't be the same used
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>> os::elapsed_counter() especially when a counter based on
>>>>>>>>>>>>>>> "gettimeofday"
>>>>>>>>>>>>>>> is not really a counter?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is all historical. These elapsed_counters and
>>>>>>>>>>>>>> elapsed_timers
>>>>>>>>>>>>>> make me
>>>>>>>>>>>>>> cringe. But changing it has a lot of potential consequences
>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>> the way these are used in logging etc. Certainly not
>>>>>>>>>>>>>> something
>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>> contemplated at this stage of JDK 8.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Perhaps a simpler fix here is to expose a startUpTimeNanos
>>>>>>>>>>>>>> that
>>>>>>>>>>>>>> can then
>>>>>>>>>>>>>> be used for the uptime.
>>>>>>>>>>>>>
>>>>>>>>>>>>> My attempt at this is at
>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am using os::javaTimeNanos() to get the monotonic ticks
>>>>>>>>>>>>> where
>>>>>>>>>>>>> possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The JDK part stays the same as for webrev.00
>>>>>>>>>>>>>
>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The patch consists of the hotspot and jdk parts.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For the hotspot a new constant needs to be introduced in
>>>>>>>>>>>>>>>>> src/share/vm/services/jmm.h and the actual logic to obtain
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> uptime in
>>>>>>>>>>>>>>>>> milliseconds is added in
>>>>>>>>>>>>>>>>> src/share/vm/services/management.cpp.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For the jdk the changes comprise of adding the necessary
>>>>>>>>>>>>>>>>> JNI
>>>>>>>>>>>>>>>>> bridging
>>>>>>>>>>>>>>>>> methods in order to get the new uptime, introducing the
>>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>> constant
>>>>>>>>>>>>>>>>> that is used in hotspot and changes to mapfile-vers
>>>>>>>>>>>>>>>>> files in
>>>>>>>>>>>>>>>>> order to
>>>>>>>>>>>>>>>>> properly build the native library.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Issue:   https://bugs.openjdk.java.net/browse/JDK-6523160
>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>



More information about the hotspot-dev mailing list