jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values

David Holmes david.holmes at oracle.com
Wed Oct 16 19:26:40 PDT 2013


Hi Jaroslav,

Minor nit: os::elapsed_time should really be defined in terms of the 
other functions ie:

return ((double) os::elapsed_counter()) / os::elapsed_frequency();

I also prefer the cast above as it is very clear that we will be doing a 
floating-point division.

Aside: AFAICS os::elapsed_time() is never actually used ??

I agree that it appears that changing the frequency should be okay.

Thanks,
David

On 17/10/2013 2:16 AM, Jaroslav Bachorik wrote:
> On 15.10.2013 08:49, David Holmes wrote:
>> Hi Jaroslav,
>>
>> os_bsd.cpp / os_linux.cpp:
>>
>> If you don't have a monotonic clock you leave timer_frequency set to 0!
>> (So you need to test on a system without a monotonic clock, or else
>> force it to act as-if not present.)
>>
>> That aside I don't trust clock_getres to give values that actually allow
>> the timer frequency to be determined. As per the comments in
>> os_linux.cpp:
>>
>> // It's fixed in newer kernels, however clock_getres() still returns
>> // 1/HZ. We check if clock_getres() works, but will ignore its reported
>> // resolution for now. Hopefully as people move to new kernels, this
>> // won't be a problem.
>>
>> we don't know what kernels provide real values here and which provide
>> dummy ones.
>>
>> On BSD you haven't modified os::elapsed_counter.
>>
>> Looking at the linux changes I don't think the logic is correct even if
>> clock_getres is accurate. In the existing code we have:
>>
>> elapsed_counter -> elapsed time in microseconds
>> elapsed_frequency -> 1000 * 1000 (ie micros per second)
>> elapsed_time -> elapsed_counter*0.000001 -> time in seconds
>>
>> Now we have:
>>
>> elapsed_counter -> elapsed time in nanoseconds
>> elapsed_frequency -> 1x10^9 / whatever clock_getres says
>> elapsed_time -> counter/frequency -> ???
>>
>> So elapsed_time not, in general, going to give the elapsed time in
>> seconds. And elapsed_time is not dependent on the "frequency" at all
>> because elapsed_counter is not reporting ticks but an actual elapsed
>> "time" in nanoseconds.
>>
>>
>> Also note that we constants for:
>>
>> NANOSECS_PER_SEC
>> NANOSECS_PER_MILLISEC
>>
>> to aid with time conversions.
>>
>> The linux webrev contains unrelated UseLargePages changes!
>
> Sorry for the mess with UseLargePages changes :/
>
> I've fixed the problems with the frequency (using a fixed number as
> before) and I kept the changes to minimum.
>
> I was hesitating about changing the elapsed_counter precision from
> microseconds to nanoseconds but since solaris and windows versions
> already use nanosecond ticks for elapsed_counter I think the change is
> safe.
>
> The update webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>
>>
>>
>> David
>> -----
>>
>>
>> On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:
>>> On 10.10.2013 13:15, Staffan Larsen wrote:
>>>>
>>>> On 10 okt 2013, at 13:02, Jaroslav Bachorik
>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>
>>>>> On 10.10.2013 05:44, David Holmes wrote:
>>>>>> On 10/10/2013 4:12 AM, Staffan Larsen wrote:
>>>>>>>
>>>>>>> On 9 okt 2013, at 16:19, Jaroslav Bachorik
>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>
>>>>>>>> On 9.10.2013 16:10, Staffan Larsen wrote:
>>>>>>>>> There is now an awful amount of different timestamps in the
>>>>>>>>> Management class. Can they be consolidated somehow? At least
>>>>>>>>> _begin_vm_creation_time and the new _begin_vm_creation_ns.
>>>>>>>>>
>>>>>>>>> This discussion also implies that the "elapsed time" we print in
>>>>>>>>> the
>>>>>>>>> hs_err file is also wrong. It uses os::elapsedTime() which uses
>>>>>>>>> os::elapsed_counter().
>>>>>>>>>
>>>>>>>>> And I guess the same thing for the VM.uptime Diagnostic Command
>>>>>>>>> (class VMUptimeDCmd) which also relies on os::elapsed_counter().
>>>>>>>>
>>>>>>>> Also the reported GC pauses duration might be wrong since it uses
>>>>>>>> Management::timestamp().
>>>>>>>>
>>>>>>>> On the first sight the change looks rather trivial. But, honestly,
>>>>>>>> I'm not sure which other parts could for whatever reason break once
>>>>>>>> the time-of-day timestamp is replaced with a monotonic equivalent.
>>>>>>>> One would think that it shouldn't matter but one never knows ...
>>>>>>>>
>>>>>>>> Staffan, do you think this kind of change is suitable for the
>>>>>>>> current
>>>>>>>> phase of JDK release cycle? I think I could improve the patch in
>>>>>>>> few
>>>>>>>> days and then it should probably be able to pass the review before
>>>>>>>> ZBB. But, it's only P3  ...
>>>>>>>
>>>>>>> I think it is a bit late in the release cycle to clean this up in
>>>>>>> the
>>>>>>> way it should be cleaned up. I think we should wait until the
>>>>>>> first 8
>>>>>>> update release and do a more thorough job than we have time for
>>>>>>> right
>>>>>>> now.
>>>>>>
>>>>>> I second that. The elapsed_counter/elpased_timer APIs and
>>>>>> implementations are a tangled mess. But part of the problem has been
>>>>>> that people want/expect monotonic time-of-day based timestamps (yes a
>>>>>> contradiction - though some people make sure TOD does not get
>>>>>> modified
>>>>>> on their production systems). The use of timestamps in logging has
>>>>>> to be
>>>>>> examined carefully - mainly GC logging. I recall a "simple" attempted
>>>>>> change in the past that resulted in trying to compare a nanoTime
>>>>>> based
>>>>>> timestamp with the TOD. :(
>>>>>
>>>>> Actually, if I'm reading the sources right for Solaris and Win the
>>>>> monotonic clock source is used to provide elapsed_counter() value. It
>>>>> falls back to TOD when the monotonic clock source is not available.
>>>>> For Linux/BSD the TOD is used directly.
>>>>>
>>>>> This makes me wonder if changing the linux/bsd implementation to
>>>>> follow the same logic would be really that disruptive.
>>>>
>>>> Good point. I would like a world where elapsed_counter is monotonic
>>>> (where possible). Still a bit scary this late in the release, but an
>>>> interesting experiment.
>>>
>>> The change is rather simple and tests ok. All the means to get a
>>> monotonic timestamp are already there and proved to work. The core tests
>>> in JPRT went fine.
>>>
>>> The updated webrev is at
>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02
>>>
>>> -JB-
>>>
>>>>
>>>> /Staffan
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> -JB-
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> /Staffan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>>
>>>>>>>>>
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9 okt 2013, at 13:26, Jaroslav Bachorik
>>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>> On 8.10.2013 23:46, David Holmes wrote:
>>>>>>>>>>> On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>> On 8.10.2013 09:34, David Holmes wrote:
>>>>>>>>>>>>> Jaroslav,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> currently the JVM uptime reported by the RuntimeMXBean is
>>>>>>>>>>>>>> based on
>>>>>>>>>>>>>> System.currentTimeMillis() which makes it susceptible to
>>>>>>>>>>>>>> changes of the
>>>>>>>>>>>>>> OS time (eg. changing timezone, NTP synchronization etc.).
>>>>>>>>>>>>>> The
>>>>>>>>>>>>>> uptime
>>>>>>>>>>>>>> should not depend on the system time and should be calculated
>>>>>>>>>>>>>> using a
>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is already the way to get the actual JVM uptime in
>>>>>>>>>>>>>> ticks.
>>>>>>>>>>>>>> It is
>>>>>>>>>>>>>> accessible as Management::timestamp() and the ticks are
>>>>>>>>>>>>>> convertible to
>>>>>>>>>>>>>> milliseconds using Management::ticks_to_ms(ts_ticks) thus
>>>>>>>>>>>>>> making it
>>>>>>>>>>>>>> very
>>>>>>>>>>>>>> easy to switch to the monotonic clock based uptime.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe I'm missing something but TiumeStamp updates using
>>>>>>>>>>>>> os::elapsed_counter() which on Linux uses gettimeofday so is
>>>>>>>>>>>>> not a
>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>
>>>>>>>>>>>> Hm, yes. I wasn't aware of this linux/bsd specific.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there any reason why a non monotonic clock source is used
>>>>>>>>>>>> for
>>>>>>>>>>>> timestamping except of the historical one? os::javaTimeNanos()
>>>>>>>>>>>> uses
>>>>>>>>>>>> montonic clock when available - why can't be the same used for
>>>>>>>>>>>> os::elapsed_counter() especially when a counter based on
>>>>>>>>>>>> "gettimeofday"
>>>>>>>>>>>> is not really a counter?
>>>>>>>>>>>
>>>>>>>>>>> It is all historical. These elapsed_counters and elapsed_timers
>>>>>>>>>>> make me
>>>>>>>>>>> cringe. But changing it has a lot of potential consequences
>>>>>>>>>>> because of
>>>>>>>>>>> the way these are used in logging etc. Certainly not something
>>>>>>>>>>> to be
>>>>>>>>>>> contemplated at this stage of JDK 8.
>>>>>>>>>>>
>>>>>>>>>>> Perhaps a simpler fix here is to expose a startUpTimeNanos that
>>>>>>>>>>> can then
>>>>>>>>>>> be used for the uptime.
>>>>>>>>>>
>>>>>>>>>> My attempt at this is at
>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
>>>>>>>>>> I am using os::javaTimeNanos() to get the monotonic ticks where
>>>>>>>>>> possible.
>>>>>>>>>>
>>>>>>>>>> The JDK part stays the same as for webrev.00
>>>>>>>>>>
>>>>>>>>>> -JB-
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> -JB-
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The patch consists of the hotspot and jdk parts.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For the hotspot a new constant needs to be introduced in
>>>>>>>>>>>>>> src/share/vm/services/jmm.h and the actual logic to obtain
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> uptime in
>>>>>>>>>>>>>> milliseconds is added in
>>>>>>>>>>>>>> src/share/vm/services/management.cpp.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For the jdk the changes comprise of adding the necessary JNI
>>>>>>>>>>>>>> bridging
>>>>>>>>>>>>>> methods in order to get the new uptime, introducing the same
>>>>>>>>>>>>>> constant
>>>>>>>>>>>>>> that is used in hotspot and changes to mapfile-vers files in
>>>>>>>>>>>>>> order to
>>>>>>>>>>>>>> properly build the native library.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Issue:   https://bugs.openjdk.java.net/browse/JDK-6523160
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>


More information about the hotspot-dev mailing list