jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values

Jaroslav Bachorik jaroslav.bachorik at oracle.com
Thu Oct 17 03:10:39 PDT 2013


Hi David,

On 17.10.2013 04:26, David Holmes wrote:
> Hi Jaroslav,
>
> Minor nit: os::elapsed_time should really be defined in terms of the
> other functions ie:
>
> return ((double) os::elapsed_counter()) / os::elapsed_frequency();

Ok. I've changed it. It better communicates the way the elapsedTime is 
calculated anyway.

>
> I also prefer the cast above as it is very clear that we will be doing a
> floating-point division.
>
> Aside: AFAICS os::elapsed_time() is never actually used ??

Actually, it is os::elapsedTime() and this one is used quite a lot. The 
"elapsed_time()" form is used only in bytecodeHistogram.hpp, 
parNewGeneration.hpp and g1CollectedHeap.hpp, where it is also declared.

>
> I agree that it appears that changing the frequency should be okay.

Thanks for the feedback.

Updated webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03

-JB-

>
> Thanks,
> David
>
> On 17/10/2013 2:16 AM, Jaroslav Bachorik wrote:
>> On 15.10.2013 08:49, David Holmes wrote:
>>> Hi Jaroslav,
>>>
>>> os_bsd.cpp / os_linux.cpp:
>>>
>>> If you don't have a monotonic clock you leave timer_frequency set to 0!
>>> (So you need to test on a system without a monotonic clock, or else
>>> force it to act as-if not present.)
>>>
>>> That aside I don't trust clock_getres to give values that actually allow
>>> the timer frequency to be determined. As per the comments in
>>> os_linux.cpp:
>>>
>>> // It's fixed in newer kernels, however clock_getres() still returns
>>> // 1/HZ. We check if clock_getres() works, but will ignore its reported
>>> // resolution for now. Hopefully as people move to new kernels, this
>>> // won't be a problem.
>>>
>>> we don't know what kernels provide real values here and which provide
>>> dummy ones.
>>>
>>> On BSD you haven't modified os::elapsed_counter.
>>>
>>> Looking at the linux changes I don't think the logic is correct even if
>>> clock_getres is accurate. In the existing code we have:
>>>
>>> elapsed_counter -> elapsed time in microseconds
>>> elapsed_frequency -> 1000 * 1000 (ie micros per second)
>>> elapsed_time -> elapsed_counter*0.000001 -> time in seconds
>>>
>>> Now we have:
>>>
>>> elapsed_counter -> elapsed time in nanoseconds
>>> elapsed_frequency -> 1x10^9 / whatever clock_getres says
>>> elapsed_time -> counter/frequency -> ???
>>>
>>> So elapsed_time not, in general, going to give the elapsed time in
>>> seconds. And elapsed_time is not dependent on the "frequency" at all
>>> because elapsed_counter is not reporting ticks but an actual elapsed
>>> "time" in nanoseconds.
>>>
>>>
>>> Also note that we constants for:
>>>
>>> NANOSECS_PER_SEC
>>> NANOSECS_PER_MILLISEC
>>>
>>> to aid with time conversions.
>>>
>>> The linux webrev contains unrelated UseLargePages changes!
>>
>> Sorry for the mess with UseLargePages changes :/
>>
>> I've fixed the problems with the frequency (using a fixed number as
>> before) and I kept the changes to minimum.
>>
>> I was hesitating about changing the elapsed_counter precision from
>> microseconds to nanoseconds but since solaris and windows versions
>> already use nanosecond ticks for elapsed_counter I think the change is
>> safe.
>>
>> The update webrev:
>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>>
>>>
>>>
>>> David
>>> -----
>>>
>>>
>>> On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:
>>>> On 10.10.2013 13:15, Staffan Larsen wrote:
>>>>>
>>>>> On 10 okt 2013, at 13:02, Jaroslav Bachorik
>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>
>>>>>> On 10.10.2013 05:44, David Holmes wrote:
>>>>>>> On 10/10/2013 4:12 AM, Staffan Larsen wrote:
>>>>>>>>
>>>>>>>> On 9 okt 2013, at 16:19, Jaroslav Bachorik
>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>
>>>>>>>>> On 9.10.2013 16:10, Staffan Larsen wrote:
>>>>>>>>>> There is now an awful amount of different timestamps in the
>>>>>>>>>> Management class. Can they be consolidated somehow? At least
>>>>>>>>>> _begin_vm_creation_time and the new _begin_vm_creation_ns.
>>>>>>>>>>
>>>>>>>>>> This discussion also implies that the "elapsed time" we print in
>>>>>>>>>> the
>>>>>>>>>> hs_err file is also wrong. It uses os::elapsedTime() which uses
>>>>>>>>>> os::elapsed_counter().
>>>>>>>>>>
>>>>>>>>>> And I guess the same thing for the VM.uptime Diagnostic Command
>>>>>>>>>> (class VMUptimeDCmd) which also relies on os::elapsed_counter().
>>>>>>>>>
>>>>>>>>> Also the reported GC pauses duration might be wrong since it uses
>>>>>>>>> Management::timestamp().
>>>>>>>>>
>>>>>>>>> On the first sight the change looks rather trivial. But, honestly,
>>>>>>>>> I'm not sure which other parts could for whatever reason break
>>>>>>>>> once
>>>>>>>>> the time-of-day timestamp is replaced with a monotonic equivalent.
>>>>>>>>> One would think that it shouldn't matter but one never knows ...
>>>>>>>>>
>>>>>>>>> Staffan, do you think this kind of change is suitable for the
>>>>>>>>> current
>>>>>>>>> phase of JDK release cycle? I think I could improve the patch in
>>>>>>>>> few
>>>>>>>>> days and then it should probably be able to pass the review before
>>>>>>>>> ZBB. But, it's only P3  ...
>>>>>>>>
>>>>>>>> I think it is a bit late in the release cycle to clean this up in
>>>>>>>> the
>>>>>>>> way it should be cleaned up. I think we should wait until the
>>>>>>>> first 8
>>>>>>>> update release and do a more thorough job than we have time for
>>>>>>>> right
>>>>>>>> now.
>>>>>>>
>>>>>>> I second that. The elapsed_counter/elpased_timer APIs and
>>>>>>> implementations are a tangled mess. But part of the problem has been
>>>>>>> that people want/expect monotonic time-of-day based timestamps
>>>>>>> (yes a
>>>>>>> contradiction - though some people make sure TOD does not get
>>>>>>> modified
>>>>>>> on their production systems). The use of timestamps in logging has
>>>>>>> to be
>>>>>>> examined carefully - mainly GC logging. I recall a "simple"
>>>>>>> attempted
>>>>>>> change in the past that resulted in trying to compare a nanoTime
>>>>>>> based
>>>>>>> timestamp with the TOD. :(
>>>>>>
>>>>>> Actually, if I'm reading the sources right for Solaris and Win the
>>>>>> monotonic clock source is used to provide elapsed_counter() value. It
>>>>>> falls back to TOD when the monotonic clock source is not available.
>>>>>> For Linux/BSD the TOD is used directly.
>>>>>>
>>>>>> This makes me wonder if changing the linux/bsd implementation to
>>>>>> follow the same logic would be really that disruptive.
>>>>>
>>>>> Good point. I would like a world where elapsed_counter is monotonic
>>>>> (where possible). Still a bit scary this late in the release, but an
>>>>> interesting experiment.
>>>>
>>>> The change is rather simple and tests ok. All the means to get a
>>>> monotonic timestamp are already there and proved to work. The core
>>>> tests
>>>> in JPRT went fine.
>>>>
>>>> The updated webrev is at
>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> /Staffan
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> -JB-
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9 okt 2013, at 13:26, Jaroslav Bachorik
>>>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 8.10.2013 23:46, David Holmes wrote:
>>>>>>>>>>>> On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>> On 8.10.2013 09:34, David Holmes wrote:
>>>>>>>>>>>>>> Jaroslav,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> currently the JVM uptime reported by the RuntimeMXBean is
>>>>>>>>>>>>>>> based on
>>>>>>>>>>>>>>> System.currentTimeMillis() which makes it susceptible to
>>>>>>>>>>>>>>> changes of the
>>>>>>>>>>>>>>> OS time (eg. changing timezone, NTP synchronization etc.).
>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>> uptime
>>>>>>>>>>>>>>> should not depend on the system time and should be
>>>>>>>>>>>>>>> calculated
>>>>>>>>>>>>>>> using a
>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There is already the way to get the actual JVM uptime in
>>>>>>>>>>>>>>> ticks.
>>>>>>>>>>>>>>> It is
>>>>>>>>>>>>>>> accessible as Management::timestamp() and the ticks are
>>>>>>>>>>>>>>> convertible to
>>>>>>>>>>>>>>> milliseconds using Management::ticks_to_ms(ts_ticks) thus
>>>>>>>>>>>>>>> making it
>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>> easy to switch to the monotonic clock based uptime.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe I'm missing something but TiumeStamp updates using
>>>>>>>>>>>>>> os::elapsed_counter() which on Linux uses gettimeofday so is
>>>>>>>>>>>>>> not a
>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hm, yes. I wasn't aware of this linux/bsd specific.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there any reason why a non monotonic clock source is used
>>>>>>>>>>>>> for
>>>>>>>>>>>>> timestamping except of the historical one? os::javaTimeNanos()
>>>>>>>>>>>>> uses
>>>>>>>>>>>>> montonic clock when available - why can't be the same used for
>>>>>>>>>>>>> os::elapsed_counter() especially when a counter based on
>>>>>>>>>>>>> "gettimeofday"
>>>>>>>>>>>>> is not really a counter?
>>>>>>>>>>>>
>>>>>>>>>>>> It is all historical. These elapsed_counters and elapsed_timers
>>>>>>>>>>>> make me
>>>>>>>>>>>> cringe. But changing it has a lot of potential consequences
>>>>>>>>>>>> because of
>>>>>>>>>>>> the way these are used in logging etc. Certainly not something
>>>>>>>>>>>> to be
>>>>>>>>>>>> contemplated at this stage of JDK 8.
>>>>>>>>>>>>
>>>>>>>>>>>> Perhaps a simpler fix here is to expose a startUpTimeNanos that
>>>>>>>>>>>> can then
>>>>>>>>>>>> be used for the uptime.
>>>>>>>>>>>
>>>>>>>>>>> My attempt at this is at
>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
>>>>>>>>>>> I am using os::javaTimeNanos() to get the monotonic ticks where
>>>>>>>>>>> possible.
>>>>>>>>>>>
>>>>>>>>>>> The JDK part stays the same as for webrev.00
>>>>>>>>>>>
>>>>>>>>>>> -JB-
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The patch consists of the hotspot and jdk parts.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For the hotspot a new constant needs to be introduced in
>>>>>>>>>>>>>>> src/share/vm/services/jmm.h and the actual logic to obtain
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> uptime in
>>>>>>>>>>>>>>> milliseconds is added in
>>>>>>>>>>>>>>> src/share/vm/services/management.cpp.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For the jdk the changes comprise of adding the necessary JNI
>>>>>>>>>>>>>>> bridging
>>>>>>>>>>>>>>> methods in order to get the new uptime, introducing the same
>>>>>>>>>>>>>>> constant
>>>>>>>>>>>>>>> that is used in hotspot and changes to mapfile-vers files in
>>>>>>>>>>>>>>> order to
>>>>>>>>>>>>>>> properly build the native library.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Issue:   https://bugs.openjdk.java.net/browse/JDK-6523160
>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>



More information about the serviceability-dev mailing list