jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values
David Holmes
david.holmes at oracle.com
Thu Oct 17 04:07:36 PDT 2013
On 17/10/2013 8:10 PM, Jaroslav Bachorik wrote:
> Hi David,
>
> On 17.10.2013 04:26, David Holmes wrote:
>> Hi Jaroslav,
>>
>> Minor nit: os::elapsed_time should really be defined in terms of the
>> other functions ie:
>>
>> return ((double) os::elapsed_counter()) / os::elapsed_frequency();
>
> Ok. I've changed it. It better communicates the way the elapsedTime is
> calculated anyway.
>
>>
>> I also prefer the cast above as it is very clear that we will be doing a
>> floating-point division.
>>
>> Aside: AFAICS os::elapsed_time() is never actually used ??
>
> Actually, it is os::elapsedTime() and this one is used quite a lot. The
> "elapsed_time()" form is used only in bytecodeHistogram.hpp,
> parNewGeneration.hpp and g1CollectedHeap.hpp, where it is also declared.
AFAICS they all define their own elapsed_time() functions they don't use
os::elapsed_time().
>>
>> I agree that it appears that changing the frequency should be okay.
>
> Thanks for the feedback.
>
> Updated webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
That should be .04 version :)
Looks okay.
Thanks,
David
> -JB-
>
>>
>> Thanks,
>> David
>>
>> On 17/10/2013 2:16 AM, Jaroslav Bachorik wrote:
>>> On 15.10.2013 08:49, David Holmes wrote:
>>>> Hi Jaroslav,
>>>>
>>>> os_bsd.cpp / os_linux.cpp:
>>>>
>>>> If you don't have a monotonic clock you leave timer_frequency set to 0!
>>>> (So you need to test on a system without a monotonic clock, or else
>>>> force it to act as-if not present.)
>>>>
>>>> That aside I don't trust clock_getres to give values that actually
>>>> allow
>>>> the timer frequency to be determined. As per the comments in
>>>> os_linux.cpp:
>>>>
>>>> // It's fixed in newer kernels, however clock_getres() still returns
>>>> // 1/HZ. We check if clock_getres() works, but will ignore its reported
>>>> // resolution for now. Hopefully as people move to new kernels, this
>>>> // won't be a problem.
>>>>
>>>> we don't know what kernels provide real values here and which provide
>>>> dummy ones.
>>>>
>>>> On BSD you haven't modified os::elapsed_counter.
>>>>
>>>> Looking at the linux changes I don't think the logic is correct even if
>>>> clock_getres is accurate. In the existing code we have:
>>>>
>>>> elapsed_counter -> elapsed time in microseconds
>>>> elapsed_frequency -> 1000 * 1000 (ie micros per second)
>>>> elapsed_time -> elapsed_counter*0.000001 -> time in seconds
>>>>
>>>> Now we have:
>>>>
>>>> elapsed_counter -> elapsed time in nanoseconds
>>>> elapsed_frequency -> 1x10^9 / whatever clock_getres says
>>>> elapsed_time -> counter/frequency -> ???
>>>>
>>>> So elapsed_time not, in general, going to give the elapsed time in
>>>> seconds. And elapsed_time is not dependent on the "frequency" at all
>>>> because elapsed_counter is not reporting ticks but an actual elapsed
>>>> "time" in nanoseconds.
>>>>
>>>>
>>>> Also note that we constants for:
>>>>
>>>> NANOSECS_PER_SEC
>>>> NANOSECS_PER_MILLISEC
>>>>
>>>> to aid with time conversions.
>>>>
>>>> The linux webrev contains unrelated UseLargePages changes!
>>>
>>> Sorry for the mess with UseLargePages changes :/
>>>
>>> I've fixed the problems with the frequency (using a fixed number as
>>> before) and I kept the changes to minimum.
>>>
>>> I was hesitating about changing the elapsed_counter precision from
>>> microseconds to nanoseconds but since solaris and windows versions
>>> already use nanosecond ticks for elapsed_counter I think the change is
>>> safe.
>>>
>>> The update webrev:
>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>>>
>>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>
>>>> On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:
>>>>> On 10.10.2013 13:15, Staffan Larsen wrote:
>>>>>>
>>>>>> On 10 okt 2013, at 13:02, Jaroslav Bachorik
>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>
>>>>>>> On 10.10.2013 05:44, David Holmes wrote:
>>>>>>>> On 10/10/2013 4:12 AM, Staffan Larsen wrote:
>>>>>>>>>
>>>>>>>>> On 9 okt 2013, at 16:19, Jaroslav Bachorik
>>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>>
>>>>>>>>>> On 9.10.2013 16:10, Staffan Larsen wrote:
>>>>>>>>>>> There is now an awful amount of different timestamps in the
>>>>>>>>>>> Management class. Can they be consolidated somehow? At least
>>>>>>>>>>> _begin_vm_creation_time and the new _begin_vm_creation_ns.
>>>>>>>>>>>
>>>>>>>>>>> This discussion also implies that the "elapsed time" we print in
>>>>>>>>>>> the
>>>>>>>>>>> hs_err file is also wrong. It uses os::elapsedTime() which uses
>>>>>>>>>>> os::elapsed_counter().
>>>>>>>>>>>
>>>>>>>>>>> And I guess the same thing for the VM.uptime Diagnostic Command
>>>>>>>>>>> (class VMUptimeDCmd) which also relies on os::elapsed_counter().
>>>>>>>>>>
>>>>>>>>>> Also the reported GC pauses duration might be wrong since it uses
>>>>>>>>>> Management::timestamp().
>>>>>>>>>>
>>>>>>>>>> On the first sight the change looks rather trivial. But,
>>>>>>>>>> honestly,
>>>>>>>>>> I'm not sure which other parts could for whatever reason break
>>>>>>>>>> once
>>>>>>>>>> the time-of-day timestamp is replaced with a monotonic
>>>>>>>>>> equivalent.
>>>>>>>>>> One would think that it shouldn't matter but one never knows ...
>>>>>>>>>>
>>>>>>>>>> Staffan, do you think this kind of change is suitable for the
>>>>>>>>>> current
>>>>>>>>>> phase of JDK release cycle? I think I could improve the patch in
>>>>>>>>>> few
>>>>>>>>>> days and then it should probably be able to pass the review
>>>>>>>>>> before
>>>>>>>>>> ZBB. But, it's only P3 ...
>>>>>>>>>
>>>>>>>>> I think it is a bit late in the release cycle to clean this up in
>>>>>>>>> the
>>>>>>>>> way it should be cleaned up. I think we should wait until the
>>>>>>>>> first 8
>>>>>>>>> update release and do a more thorough job than we have time for
>>>>>>>>> right
>>>>>>>>> now.
>>>>>>>>
>>>>>>>> I second that. The elapsed_counter/elpased_timer APIs and
>>>>>>>> implementations are a tangled mess. But part of the problem has
>>>>>>>> been
>>>>>>>> that people want/expect monotonic time-of-day based timestamps
>>>>>>>> (yes a
>>>>>>>> contradiction - though some people make sure TOD does not get
>>>>>>>> modified
>>>>>>>> on their production systems). The use of timestamps in logging has
>>>>>>>> to be
>>>>>>>> examined carefully - mainly GC logging. I recall a "simple"
>>>>>>>> attempted
>>>>>>>> change in the past that resulted in trying to compare a nanoTime
>>>>>>>> based
>>>>>>>> timestamp with the TOD. :(
>>>>>>>
>>>>>>> Actually, if I'm reading the sources right for Solaris and Win the
>>>>>>> monotonic clock source is used to provide elapsed_counter()
>>>>>>> value. It
>>>>>>> falls back to TOD when the monotonic clock source is not available.
>>>>>>> For Linux/BSD the TOD is used directly.
>>>>>>>
>>>>>>> This makes me wonder if changing the linux/bsd implementation to
>>>>>>> follow the same logic would be really that disruptive.
>>>>>>
>>>>>> Good point. I would like a world where elapsed_counter is monotonic
>>>>>> (where possible). Still a bit scary this late in the release, but an
>>>>>> interesting experiment.
>>>>>
>>>>> The change is rather simple and tests ok. All the means to get a
>>>>> monotonic timestamp are already there and proved to work. The core
>>>>> tests
>>>>> in JPRT went fine.
>>>>>
>>>>> The updated webrev is at
>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> /Staffan
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -JB-
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>> /Staffan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -JB-
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> /Staffan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9 okt 2013, at 13:26, Jaroslav Bachorik
>>>>>>>>>>> <jaroslav.bachorik at oracle.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 8.10.2013 23:46, David Holmes wrote:
>>>>>>>>>>>>> On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>> On 8.10.2013 09:34, David Holmes wrote:
>>>>>>>>>>>>>>> Jaroslav,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> currently the JVM uptime reported by the RuntimeMXBean is
>>>>>>>>>>>>>>>> based on
>>>>>>>>>>>>>>>> System.currentTimeMillis() which makes it susceptible to
>>>>>>>>>>>>>>>> changes of the
>>>>>>>>>>>>>>>> OS time (eg. changing timezone, NTP synchronization etc.).
>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>> uptime
>>>>>>>>>>>>>>>> should not depend on the system time and should be
>>>>>>>>>>>>>>>> calculated
>>>>>>>>>>>>>>>> using a
>>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There is already the way to get the actual JVM uptime in
>>>>>>>>>>>>>>>> ticks.
>>>>>>>>>>>>>>>> It is
>>>>>>>>>>>>>>>> accessible as Management::timestamp() and the ticks are
>>>>>>>>>>>>>>>> convertible to
>>>>>>>>>>>>>>>> milliseconds using Management::ticks_to_ms(ts_ticks) thus
>>>>>>>>>>>>>>>> making it
>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>>> easy to switch to the monotonic clock based uptime.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe I'm missing something but TiumeStamp updates using
>>>>>>>>>>>>>>> os::elapsed_counter() which on Linux uses gettimeofday so is
>>>>>>>>>>>>>>> not a
>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hm, yes. I wasn't aware of this linux/bsd specific.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is there any reason why a non monotonic clock source is used
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> timestamping except of the historical one?
>>>>>>>>>>>>>> os::javaTimeNanos()
>>>>>>>>>>>>>> uses
>>>>>>>>>>>>>> montonic clock when available - why can't be the same used
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>> os::elapsed_counter() especially when a counter based on
>>>>>>>>>>>>>> "gettimeofday"
>>>>>>>>>>>>>> is not really a counter?
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is all historical. These elapsed_counters and
>>>>>>>>>>>>> elapsed_timers
>>>>>>>>>>>>> make me
>>>>>>>>>>>>> cringe. But changing it has a lot of potential consequences
>>>>>>>>>>>>> because of
>>>>>>>>>>>>> the way these are used in logging etc. Certainly not something
>>>>>>>>>>>>> to be
>>>>>>>>>>>>> contemplated at this stage of JDK 8.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Perhaps a simpler fix here is to expose a startUpTimeNanos
>>>>>>>>>>>>> that
>>>>>>>>>>>>> can then
>>>>>>>>>>>>> be used for the uptime.
>>>>>>>>>>>>
>>>>>>>>>>>> My attempt at this is at
>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
>>>>>>>>>>>> I am using os::javaTimeNanos() to get the monotonic ticks where
>>>>>>>>>>>> possible.
>>>>>>>>>>>>
>>>>>>>>>>>> The JDK part stays the same as for webrev.00
>>>>>>>>>>>>
>>>>>>>>>>>> -JB-
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The patch consists of the hotspot and jdk parts.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the hotspot a new constant needs to be introduced in
>>>>>>>>>>>>>>>> src/share/vm/services/jmm.h and the actual logic to obtain
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> uptime in
>>>>>>>>>>>>>>>> milliseconds is added in
>>>>>>>>>>>>>>>> src/share/vm/services/management.cpp.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the jdk the changes comprise of adding the necessary
>>>>>>>>>>>>>>>> JNI
>>>>>>>>>>>>>>>> bridging
>>>>>>>>>>>>>>>> methods in order to get the new uptime, introducing the
>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>> constant
>>>>>>>>>>>>>>>> that is used in hotspot and changes to mapfile-vers
>>>>>>>>>>>>>>>> files in
>>>>>>>>>>>>>>>> order to
>>>>>>>>>>>>>>>> properly build the native library.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-6523160
>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>
More information about the hotspot-dev
mailing list