RFR: 8040140 System.nanoTime() is slow and non-monotonic on OS X

David Holmes david.holmes at oracle.com
Tue Apr 22 10:14:13 UTC 2014


Karen,

On 19/04/2014 12:43 AM, Karen Kinnear wrote:
> Staffan,
>
> thank you for your patience.
>
> I am happy with the fix to Mac OS X in terms of adding the new code and
> the guarantee about
> time not moving backward.
>
> The cleanup in the Solaris code also looks good.
>
> What I would request please is to split this back into two parts - I
> would like further discussion of the flag
> in a separate thread, to allow the OS X changes to go back right away.
>
> And for the comment in the Solaris code - the one thing we do know is
> that Solaris does not actually
> guarantee that time will not go backward.

The documentation for gethrtime certainly DOES guarantee that. From the 
manpage:

NOTES
      Although the units  of  hi-res  time  are  always  the  same
      (nanoseconds),  the actual resolution is hardware dependent.
      Hi-res time is guaranteed to be monotonic (it won't go back-
      ward, it won't periodically wrap) and linear (it won't occa-
      sionally speed up or slow down for adjustment, like the time
      of  day  can),  but not necessarily unique: two sufficiently
      proximate calls may return the same value.

All the issue you cite are acknowledged bugs in the OS.

David
-----

They currently have issues
> running on virtual machines unless
> bound to a specific cpu, and in  the past had issues with non-invariant
> TSC, which tells us that they do not have
> logic with specific guarantees handling changes across cores. So if
> there is a way to phrase this - it is not
> about Solaris having had bugs in the past, it is about Solaris not
> actually guaranteeing time not moving
> backward so we have to do it.
>
> thank you,
> Karen
>
> On Apr 15, 2014, at 8:27 AM, Staffan Larsen wrote:
>
>>
>> On 15 apr 2014, at 14:00, Karen Kinnear <karen.kinnear at oracle.com
>> <mailto:karen.kinnear at oracle.com>> wrote:
>>
>>> Staffan,
>>>
>>> Have you heard back from the Solaris folks yet on how to tell that we
>>> are on a virtual machine, and
>>> where they tell us  gethhrtime can move backward?
>>
>> I have not. It would be interesting to know this.
>>
>>>
>>> Is there any thought that in cases where we know there is a problem
>>> we might not want to
>>> follow a script's flag recommendation?
>>
>> The default with my change is to guard against time moving backwards.
>> If you want the “scary” behavior you have to say
>> -XX:+UnlockExperimentalFeatures -XX:+AssumeMonotonicOSTimers”.
>>
>>>
>>> I have been in too many situations in which scripts for starting java
>>> applications are inherited on
>>> systems that are not appropriate, by folks who didn't write them and
>>> don't know the history
>>> of the flag settings.
>>>
>>> I am concerned about subtle unexpected behaviors here.
>>>
>>> Since I haven't had time (yet - sorry) to walk through all the
>>> callers of getTimeNanos - could
>>> you possibly let me know if there is a java call that ultimately uses
>>> this and if so, the impact
>>> on java code of time moving backward?
>>
>> Well.. the most obvious case is System.nanoTime(). It must not move
>> backwards. But with the current code (before my change) it can do so
>> on OS X (!).
>>
>>>
>>> Same for callers in the vm please? I believe Ramki fixed at least one
>>> of the garbage collectors
>>> that had problems if time moved backward. Are the vm internal callers
>>> ok if time moves backwards?
>>
>> Again, There should be no moving backwards unless you run with
>> experimental features.
>>
>> Thanks,
>> /Staffan
>>
>>>
>>> Or I can do that research when I get the time - and I can get back to
>>> you on that tomorrow - sorry,
>>> another day of back-to-back meetings.
>>>
>>> thanks,
>>> Karen
>>>
>>> On Apr 15, 2014, at 7:42 AM, Staffan Larsen wrote:
>>>
>>>>
>>>> On 15 apr 2014, at 11:38, David Holmes <david.holmes at oracle.com
>>>> <mailto:david.holmes at oracle.com>> wrote:
>>>>
>>>>> On 15/04/2014 7:20 PM, Staffan Larsen wrote:
>>>>>>
>>>>>> On 15 apr 2014, at 09:14, David Holmes <david.holmes at oracle.com
>>>>>> <mailto:david.holmes at oracle.com>> wrote:
>>>>>>
>>>>>>> Hi Staffan,
>>>>>>>
>>>>>>> Generally looks okay.
>>>>>>>
>>>>>>> os_bsd.cpp still shows the old URL for Dave Dice's article
>>>>>>
>>>>>> I had forgotten to save the file. :-(
>>>>>>
>>>>>>> os_solaris.cpp:
>>>>>>>
>>>>>>> In the Solaris changes there is a lot of old code with inaccurate
>>>>>>> comments, but I suppose cleaning that up (oldgetTimeNanos()) is
>>>>>>> out of scope. You only added the check for
>>>>>>> AssumeMonotonicOSTimers in the supports_cx8 path, but the other
>>>>>>> path is now dead code.
>>>>>>
>>>>>> Since supports_cx8() returns true (because SUPPORTS_NATIVE_CX8 is
>>>>>> defined) on both solaris sparc and solaris x86, it looks like
>>>>>> oldgetTimeNanos() is really dead code. I can remove it as part of
>>>>>> this change if that’s ok.
>>>>>
>>>>> I'm in favour of removing dead code. :)
>>>>
>>>> Gone!
>>>>
>>>>>>> globals.hpp
>>>>>>>
>>>>>>> Do we need to document this only affects OSX and Solaris? (Though
>>>>>>> implicitly this acts as-if true on Linux and Windows in the
>>>>>>> common case.)
>>>>>>
>>>>>> I will update the description to:
>>>>>>
>>>>>>   experimental(bool, AssumeMonotonicOSTimers, false,
>>>>>>                        \
>>>>>>           "Assume that the OS system timers are monotonic "
>>>>>>                 \
>>>>>>           "(Solaris and OS X)")
>>>>>>                                             \
>>>>>
>>>>> Sounds good. Will you close 6864866 as a duplicate of this one?
>>>>
>>>> I will include 6864866 in the hg commit message closing both with
>>>> the same fix.
>>>>
>>>>>
>>>>>>> os.hpp:
>>>>>>>
>>>>>>>  #ifdef TARGET_OS_FAMILY_bsd
>>>>>>>  # include "jvm_bsd.h"
>>>>>>>  # include <setjmp.h>
>>>>>>> + # include <mach/mach_time.h>
>>>>>>>  #endif
>>>>>>>
>>>>>>> I think this include needs to be in a OSX/Apple specific conditional.
>>>>>>
>>>>>> Changed it to:
>>>>>>
>>>>>> #ifdef TARGET_OS_FAMILY_bsd
>>>>>> # include "jvm_bsd.h"
>>>>>> # include <setjmp.h>
>>>>>> # ifdef __APPLE__
>>>>>> #  include <mach/mach_time.h>
>>>>>> # endif
>>>>>
>>>>> Ok. I wonder if someone could test this on a non-Apple BSD system? ;-)
>>>>
>>>> Updated review here: http://cr.openjdk.java.net/~sla/8040140/webrev.02/
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> /Staffan
>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> We should really fix the non-monotonic-clock path in the Linux
>>>>>>> and Windows implementations too ... but 32-bit is problematic <sigh>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>> On 15/04/2014 4:00 PM, Staffan Larsen wrote:
>>>>>>>> Here is an updated webrev with changes to the comments in
>>>>>>>> os_bsd.cpp and
>>>>>>>> os_solaris.cpp.
>>>>>>>>  - obs -> obsv
>>>>>>>>  - fixed URL to blog entry
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~sla/8040140/webrev.01/
>>>>>>>>
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>> On 15 apr 2014, at 07:52, Staffan Larsen
>>>>>>>> <staffan.larsen at oracle.com <mailto:staffan.larsen at oracle.com>
>>>>>>>> <mailto:staffan.larsen at oracle.com>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 14 apr 2014, at 21:08, Aleksey Shipilev
>>>>>>>>> <aleksey.shipilev at oracle.com
>>>>>>>>> <mailto:aleksey.shipilev at oracle.com>
>>>>>>>>> <mailto:aleksey.shipilev at oracle.com>> wrote:
>>>>>>>>>
>>>>>>>>>> On 04/14/2014 06:55 PM, Staffan Larsen wrote:
>>>>>>>>>>> mach_absolute_time() is essentially a direct call to RDTSC,
>>>>>>>>>>> but with
>>>>>>>>>>> conversion factor to offset for any system sleeps and frequency
>>>>>>>>>>> changes. The call returns something that can be converted to
>>>>>>>>>>> nanoseconds using information from mach_timebase_info(). Calls to
>>>>>>>>>>> mach_absolute_time() do not enter the kernel and are very
>>>>>>>>>>> fast. The
>>>>>>>>>>> resulting time has nanosecond precision and as good accuracy
>>>>>>>>>>> as one
>>>>>>>>>>> can get.
>>>>>>>>>>
>>>>>>>>>> Some numbers would be good on the public list :) I know the
>>>>>>>>>> numbers
>>>>>>>>>> already, but others on this list don’t.
>>>>>>>>>
>>>>>>>>> I posted the numbers in the bug, but forgot to say so here...
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Since the value from RDTSC can be subject to drifting between
>>>>>>>>>>> CPUs,
>>>>>>>>>>> we implement safeguards for this to make sure we never return
>>>>>>>>>>> a lower
>>>>>>>>>>> value than the previous values. This adds some overhead to
>>>>>>>>>>> nanoTime()
>>>>>>>>>>> but guards us against possible bugs in the OS. For users who are
>>>>>>>>>>> willing to trust the OS and need the fastest possible calls to
>>>>>>>>>>> System.nanoTime(), we add a flag to disable this safeguard:
>>>>>>>>>>> -XX:+AssumeMonotonicOSTimers.
>>>>>>>>>>
>>>>>>>>>> I now wonder if this safeguard can produce a stream of exactly
>>>>>>>>>> the same
>>>>>>>>>> timestamps if local clock is lagging behind. But considering the
>>>>>>>>>> alternative of answering the retrograde time, and the
>>>>>>>>>> observation the
>>>>>>>>>> current Mac OS X mach_absolute_time() *appears* monotonic,
>>>>>>>>>> having this
>>>>>>>>>> safeguard seems OK.
>>>>>>>>>>
>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~sla/8040140/webrev.00/
>>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8040140
>>>>>>>>>>
>>>>>>>>>> This looks good to me.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>> And, since this question will inevitably pop up, do we plan to
>>>>>>>>>> bring it
>>>>>>>>>> into 8uX? I think many Mac users will be happy about that.
>>>>>>>>>
>>>>>>>>> I would like to do so, but I would also like to have it sit and
>>>>>>>>> bake
>>>>>>>>> for a while in 9 before that. I think the 8u20 train has left the
>>>>>>>>> station, but perhaps 8u40?
>>>>>>>>>
>>>>>>>>> /Staffan
>>>>
>>>
>>
>


More information about the hotspot-runtime-dev mailing list