os::javaTimeSystemUTC to call nanosecond precision OS API, so Clock.systemUTC() can give nanosecond precision UTC

David Holmes david.holmes at oracle.com
Fri Apr 10 23:53:48 UTC 2020


Update:

On 11/04/2020 9:45 am, David Holmes wrote:
> Hi Mark,
> 
> Thanks for the very detailed proposal and write up!
> 
> It's a holiday weekend so I can't dig into this right now but we tried 
> using a high-precision clock source for systemUTC() in the past but it 
> didn't work because systemUTC() and currentTimeMillis() have to use the 
> same time base, and currentTimeMillis() has to use gettimeofday(). I 
> thought this cross-dependency was documented somewhere but can't find it 
> right now. If gettimeofday and clock_gettime(CLOCK_REALTIME) actually 
> have the same time characteristics wrt. wall-clock time then changing 
> both as suggested may indeed work.

Found this from last time things were discussed in detail:

https://bugs.openjdk.java.net/browse/JDK-8185891?focusedCommentId=14107380&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14107380

so it does seem like a switch from gettimeofday to 
clock_gettime(CLOCK_REALTIME) may be viable.

Cheers,
David

> Note however that we are still not 
> yet in a position to use clock_gettime unconditionally at runtime - its 
> use is predicated on os::supports_monotonic_clock() (which despite the 
> name also indicates clock_gettime exists).
> 
> Regarding Windows please see:
> 
> https://bugs.openjdk.java.net/browse/JDK-8180466
> 
> More next week.
> 
> Cheers,
> David
> 
> On 11/04/2020 3:03 am, Mark Kralj-Taylor wrote:
>> I'd like to help Java Clock.systemUTC() expose nanosecond precision
>> UTC realtime (wall-time) clock, where OS and CPU allow.
>>
>> Please let me know if this would be acceptable for OpenJDK, and what
>> next steps I can take to progress this towards submitting a patch.
>> - I have some time at the moment. I've signed the OCA, but don't have
>> login to the OpenJDK bugtracker, if you want me to submit a proper hg
>> patch there.
>> - This mail includes the patch as `hg diff` outputs together with JMH
>> and test output from it.
>>
>> The patch here improves the existing API java.time.Instant.now() i.e.
>> Clock.systemUTC().instant() on Linux to return Linux
>> clock_gettime(CLOCK_REALTIME) which can be nanosecond precision (and
>> very low cost, see JMH results) when Linux has a suitable clocksource
>> (tsc).
>>
>> The nice thing about this is that it doesn't add a new JDK API, nor
>> change semantics of the existing Clock.systemUTC().instant() API which
>> already supports nanosecond granularity.
>>
>> Clock.systemUTC().instant() is backed by Hotspot
>> os::javaTimeSystemUTC, which can be updated to call higher precision
>> OS real-time clock API when available (and has similar performance),
>> rather than the current code that multiplies microseconds from
>> gettimeofday() by 1000 to get nanoseconds.
>>
>> In Linux this means calling the Posix clock_gettime(CLOCK_REALTIME)
>> API instead of the older gettimeofday() whose output is limited to
>> microsecond precision. On a suitable CPU Linux can back
>> clock_gettime(CLOCK_REALTIME) with a nanosecond precision CPU TSC
>> clocksource. Whereas the output of gettimeofday() is limited to
>> microsecond precision (even when Linux is using a higher precision
>> clocksource).
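
The precision difference described above can be observed from Java without touching HotSpot. The following is a minimal sketch, not part of the patch: the class name `ClockPrecisionProbe` and its `sample` method are hypothetical. It counts how many `Instant.now()` readings carry digits below the millisecond and below the microsecond. With a gettimeofday()-backed implementation the second count would be expected to stay at zero.

```java
import java.time.Instant;

// Hypothetical helper for illustration only (not part of the proposed patch).
final class ClockPrecisionProbe {

    /** Returns {subMillisecondCount, subMicrosecondCount} over the given number of samples. */
    static int[] sample(int samples) {
        int subMilli = 0;
        int subMicro = 0;
        for (int i = 0; i < samples; i++) {
            int nanos = Instant.now().getNano();
            if (nanos % 1_000_000 != 0) {
                subMilli++; // digits below the millisecond are present
            }
            if (nanos % 1_000 != 0) {
                subMicro++; // digits below the microsecond are present
            }
        }
        return new int[] {subMilli, subMicro};
    }

    public static void main(String[] args) {
        int[] counts = sample(1000);
        System.out.println("better than millisecond precision: " + counts[0] + "/1000");
        System.out.println("better than microsecond precision: " + counts[1] + "/1000");
    }
}
```

The counts depend on the JDK build, the OS and the active clocksource, so no particular output is implied here.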
>>
>> A similar change could be made for other OS that offer a performant
>> API to get hi-precision wall-time.
>> - OSX: Unfortunately on my Mac I found that
>> clock_gettime(CLOCK_REALTIME) is limited to microseconds precision
>> (source shows it calling gettimeofday() and multiplying by 1000 - see
>> note on recent Linux change to discourage that). An alternate approach
>> that did get nanosecond precision wall-time was too slow to be of
>> interest (>900ns!): Calling OSX host_get_clock_service CALENDAR_CLOCK
>> in JVM init, then clock_get_time(clockPortFromPrev,...).
>> - Windows: GetSystemTimePreciseAsFileTime looks promising. See
>> https://docs.microsoft.com/en-gb/windows/win32/api/sysinfoapi/nf-sysinfoapi-getsystemtimepreciseasfiletime
>> Unfortunately I don't have access to a Windows machine to try this
>> on, but if you are interested in the Linux patch I could buy Windows
>> for a laptop we have to see how it fares in JMH.
>>
>> Details, patch and JMH / test outputs follow below.
>> Thanks,
>> Mark
>>
>> -----------------
>>
>> # RATIONALE: Why does Java need nanosecond precision real-time clock?
>>
>> Like many others working on lower-latency Java systems running on
>> Linux I have been using JNI to call clock_gettime(CLOCK_REALTIME, ...)
>> to do this for years. Hi-precision (nanosecond) UTC timestamps have
>> been essential in understanding and improving end-to-end latency and
>> performance of multi-process systems. Including nanosecond wall-time
>> after-receive and before-send timestamps in messages sent over the
>> wire (or shared memory, or in stored events) gives tremendous
>> transparency on latency in multi-process and multi-host event
>> processing and workflows.
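
As a rough illustration of the timestamping use case described above, here is a sketch under stated assumptions: the class `WallClockStamps` and its methods are hypothetical names, and the message transport is left out. A sender stamps a message with wall-clock nanoseconds since the epoch, and a receiver subtracts that from its own reading.

```java
import java.time.Instant;

// Hypothetical helper for illustration only.
final class WallClockStamps {

    /** Converts an Instant to nanoseconds since the UTC epoch, failing on overflow. */
    static long toEpochNanos(Instant t) {
        return Math.addExact(Math.multiplyExact(t.getEpochSecond(), 1_000_000_000L), t.getNano());
    }

    /**
     * Latency between a before-send stamp and an after-receive stamp.
     * The result can be negative if the two clocks are offset or one was stepped.
     */
    static long latencyNanos(long sendEpochNanos, long receiveEpochNanos) {
        return Math.subtractExact(receiveEpochNanos, sendEpochNanos);
    }

    public static void main(String[] args) {
        long sent = toEpochNanos(Instant.now());     // would travel inside the message
        long received = toEpochNanos(Instant.now()); // taken by the receiving process
        System.out.println("latency ns: " + latencyNanos(sent, received));
    }
}
```

The usefulness of the result is bounded by the clock precision on both sides and by how well the hosts are synchronised.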
>>
>> Java has evolved to remove more and more common cases for JNI. Getting
>> a nanosecond real-time clock timestamp feels a good candidate for that
>> process, especially because it looks like hi-precision real-time UTC
>> clock APIs are common in the minimum OS versions Java is built for
>> (even if some OS still give microsecond precision wall-time behind an
>> API that returns nanoseconds - as OSX 10.15 did for me - built with
>> XCode 10.1 as per JDK docs).
>>
>> Timestamps from a monotonic clock can only be compared within the same
>> host, or for Java's System.nanoTime() within the same Java process.
>> This is very useful, but doesn't help with multi-process, multi-host,
>> multi-language use-cases.
>>
>> Realtime (wall-time) UTC clock timestamps are useful because they can
>> be compared between different processes, possibly running on a
>> different host, as well as within a process. Within the same
>> data-centre latency between processes is typically well under a
>> millisecond for UDP / multicast between hosts. Latency can be
>> sub-microsecond for shared-memory between processes on the same host.
>>
>> A monotonic clock is often favoured for timing latency, because
>> real-time clocks on different hosts are subject to slewing and
>> stepping. Clock-sync technologies drastically reduce impact of these
>> on a host and also time offsets between different hosts (especially in
>> the same data centre). For high event rate systems you can look at
>> latency as percentiles over time-windows. With many time-windows you
>> get an idea of clock sync quality because you see the distribution
>> move when you look at a time-series of percentiles for time-windows if
>> different hosts have different clock stepping time synchronisation.
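
The percentile-per-time-window view mentioned above could be computed along these lines. This is a sketch only: the class `WindowedPercentiles`, the nearest-rank percentile definition and the parallel-array input format are assumptions made for illustration, not something taken from the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
final class WindowedPercentiles {

    /** Nearest-rank percentile of the values, for p in (0, 100]. */
    static long percentile(long[] values, double p) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Buckets each latency by the window its timestamp falls in, then takes the percentile per window. */
    static TreeMap<Long, Long> byWindow(long[] epochNanos, long[] latencyNanos, long windowNanos, double p) {
        TreeMap<Long, List<Long>> buckets = new TreeMap<>();
        for (int i = 0; i < epochNanos.length; i++) {
            long windowStart = Math.floorDiv(epochNanos[i], windowNanos) * windowNanos;
            buckets.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(latencyNanos[i]);
        }
        TreeMap<Long, Long> result = new TreeMap<>();
        for (Map.Entry<Long, List<Long>> e : buckets.entrySet()) {
            long[] v = e.getValue().stream().mapToLong(Long::longValue).toArray();
            result.put(e.getKey(), percentile(v, p));
        }
        return result;
    }

    public static void main(String[] args) {
        long[] stamps = {0L, 10L, 20L, 1000L, 1010L};
        long[] latencies = {5L, 7L, 9L, 100L, 200L};
        System.out.println(byWindow(stamps, latencies, 1000L, 50.0));
    }
}
```

A shift in the per-window percentiles that affects only one host pairing is the kind of signal that would point at clock stepping rather than at real latency.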
>>
>> -----------------
>>
>> # IS THIS A SAFE CHANGE for the JDK?
>>
>> It makes no change to semantics of Clock.systemUTC(), whose JavaDoc
>> says: "This clock is based on the best available system clock. This
>> may use System.currentTimeMillis(), or a higher resolution clock if
>> one is available.".
>>
>> Because there is no new JDK API, nor change in API semantics, this
>> enhancement can be made on an OS by OS basis.
>>
>> -----------------
>>
>> # IS THIS A SAFE CHANGE for Linux?
>>
>> The "clock_gettime(3) - Linux man page" says: "All implementations
>> support the system-wide realtime clock, which is identified by
>> CLOCK_REALTIME. Its time represents seconds and nanoseconds since the
>> Epoch.". See: https://linux.die.net/man/3/clock_gettime
>>
>> FYI Some older Linux ports used to implement
>> clock_gettime(CLOCK_REALTIME,) by calling gettimeofday() and
>> multiplying by 1000. This would be fine with the patch proposed, but
>> means that resolution would be limited to microseconds by the OS. A
>> glibc change 6 months ago has discouraged that in ports: "Use
>> clock_gettime to implement gettimeofday"
>> https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=5e46749c64d51f50f8511ed99c1266d7c13e182b
>>
>> -----------------
>>
>> # Related OpenJDK BugTrackers
>>
>> Comments on these enhancements requests suggest some interest in
>> calling Linux clock_gettime(CLOCK_REALTIME) instead of gettimeofday().
>> Although the enhancement requests themselves are not the same as this
>> mail, which asks to enhance an existing API, within its existing
>> semantics.
>> - JDK-6709908
>> - JDK-8185891
>>
>> Since Java 9 Clock.systemUTC() has up to microsecond resolution on Linux
>> - JDK-8068730 Increase the precision of the implementation of
>> java.time.Clock.systemUTC()
>>
>> -----------------
>>
>> # POSSIBLE FOLLOW ON (that is NOT part of the patch or proposal discussed here)
>>
>> A possible follow-on would be to add a JDK API to give more direct
>> (lower cost) access to a hi-precision timestamp as nanoseconds since
>> the UTC epoch in a long. For example System.currentTimeNanos(). This would
>> be different to the proposal above in that its range would be limited
>> to year 2262 for a signed long, which is more than 200 years in the
>> future, but less than Instant.MAX. A JMH benchmark with
>> currentTimeNanos() as an intrinsic (following the style of
>> currentTimeMillis) demonstrated that this is attractive, with similar
>> cost to System.currentTimeMillis()/nanoTime(), so better than
>> Instant.now() or JNI that I used. But initially I'd rather focus on
>> improving Java without adding any new APIs, or discussing if ~200
>> years is enough of a range.
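
The year-2262 limit follows from the range of a 64-bit signed long holding nanoseconds since the epoch. A small sketch to check the arithmetic (the class `EpochNanosRange` and its methods are hypothetical and only illustrate the range, not any proposed API):

```java
import java.time.Instant;

// Hypothetical helper for illustration only.
final class EpochNanosRange {

    /** Converts nanoseconds since the UTC epoch back to an Instant. */
    static Instant fromEpochNanos(long epochNanos) {
        return Instant.ofEpochSecond(
                Math.floorDiv(epochNanos, 1_000_000_000L),
                Math.floorMod(epochNanos, 1_000_000_000L));
    }

    /** True if the Instant can be expressed as epoch nanoseconds in a long without overflow. */
    static boolean fitsInLong(Instant t) {
        try {
            Math.addExact(Math.multiplyExact(t.getEpochSecond(), 1_000_000_000L), t.getNano());
            return true;
        } catch (ArithmeticException overflow) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromEpochNanos(Long.MAX_VALUE)); // 2262-04-11T23:47:16.854775807Z
        System.out.println(fitsInLong(Instant.MAX));        // false
    }
}
```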
>>
>> -----------------
>>
>> # PATCHES to Hotspot, tests and output of those tests
>>
>> ## PATCH to hotspot/os/linux/os_linux.cpp
>>
>> This patch changes both os::javaTimeMillis and os::javaTimeSystemUTC
>> to call clock_gettime(CLOCK_REALTIME, ..) instead of gettimeofday().
>> This keeps the 2 methods obviously consistent, and avoids someone
>> reading the code having questions on the consistency of different
>> Linux APIs. Results from JMH benchmark (see below) show this is ok.
>>
>> This seems consistent with direction being taken by Linux, based on
>> change: "Use clock_gettime to implement gettimeofday" at
>> https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=5e46749c64d51f50f8511ed99c1266d7c13e182b 
>>
>>
>> ```
>> diff --git a/src/hotspot/os/linux/os_linux.cpp
>> b/src/hotspot/os/linux/os_linux.cpp
>> --- a/src/hotspot/os/linux/os_linux.cpp
>> +++ b/src/hotspot/os/linux/os_linux.cpp
>> @@ -88,6 +88,7 @@
>> # include <errno.h>
>> # include <dlfcn.h>
>> # include <stdio.h>
>> +# include <time.h>
>> # include <unistd.h>
>> # include <sys/resource.h>
>> # include <pthread.h>
>> @@ -1374,18 +1375,18 @@
>> }
>> jlong os::javaTimeMillis() {
>> - timeval time;
>> - int status = gettimeofday(&time, NULL);
>> + timespec time;
>> + int status = clock_gettime(CLOCK_REALTIME, &time);
>> assert(status != -1, "linux error");
>> - return jlong(time.tv_sec) * 1000 + jlong(time.tv_usec / 1000);
>> + return jlong(time.tv_sec) * 1000 + jlong(time.tv_nsec / 1000000);
>> }
>> void os::javaTimeSystemUTC(jlong &seconds, jlong &nanos) {
>> - timeval time;
>> - int status = gettimeofday(&time, NULL);
>> + timespec time;
>> + int status = clock_gettime(CLOCK_REALTIME, &time);
>> assert(status != -1, "linux error");
>> seconds = jlong(time.tv_sec);
>> - nanos = jlong(time.tv_usec) * 1000;
>> + nanos = jlong(time.tv_nsec);
>> }
>> void os::Linux::fast_thread_clock_init() {
>> ```
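
To make the arithmetic in the diff above easier to check, here is the same conversion modelled in plain Java. This is a sketch only: the class `TimeConversions` and its method names are hypothetical, and the (seconds, microseconds) and (seconds, nanoseconds) pairs stand in for the C `timeval` and `timespec` structs. For the same instant, the millisecond result is unchanged, while the nanosecond result keeps the three digits that the old path could only fill with zeros.

```java
// Hypothetical model of the conversions in the patch, for illustration only.
final class TimeConversions {

    // Old path: gettimeofday() yields (seconds, microseconds).
    static long millisFromTimeval(long sec, long usec) {
        return sec * 1000 + usec / 1000;
    }

    static long nanosFromTimeval(long usec) {
        return usec * 1000; // always a multiple of 1000
    }

    // New path: clock_gettime(CLOCK_REALTIME) yields (seconds, nanoseconds).
    static long millisFromTimespec(long sec, long nsec) {
        return sec * 1000 + nsec / 1_000_000;
    }

    static long nanosFromTimespec(long nsec) {
        return nsec; // passed through unchanged
    }

    public static void main(String[] args) {
        long sec = 1585955652L;
        long nsec = 276472680L;
        long usec = nsec / 1000; // what a microsecond-limited API would report
        System.out.println(millisFromTimeval(sec, usec) == millisFromTimespec(sec, nsec)); // true
        System.out.println(nanosFromTimeval(usec));   // 276472000
        System.out.println(nanosFromTimespec(nsec));  // 276472680
    }
}
```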
>>
>> ## PATCH to test/micro JMH benchmark to assess impact of change on
>> Instant.now(), and show its performance relative to
>> System.currentTimeMillis()/nanoTime().
>>
>> Where would the best place be to add an Instant.now() JMH benchmark?
>> - While I'd guess the convention is to follow the package of the API
>> being benchmarked, I like that putting benchmarks of system
>> timestamping APIs in one class makes it easy and natural to spot them all
>> and compare them. Could the benchmark be added here, then the class
>> renamed to SystemTime.java to focus on benchmarking those features?
>>
>> ```
>> diff --git a/test/micro/org/openjdk/bench/java/lang/Systems.java
>> b/test/micro/org/openjdk/bench/java/lang/Systems.java
>> --- a/test/micro/org/openjdk/bench/java/lang/Systems.java
>> +++ b/test/micro/org/openjdk/bench/java/lang/Systems.java
>> @@ -28,6 +28,7 @@
>> import org.openjdk.jmh.annotations.OutputTimeUnit;
>> import java.util.concurrent.TimeUnit;
>> +import java.time.Instant;
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> @@ -43,4 +44,9 @@
>> return System.nanoTime();
>> }
>> + @Benchmark
>> + public long instant_now_asEpocNanos() {
>> + Instant now = Instant.now();
>> + return now.getEpochSecond() * 1_000_000_000L + now.getNano();
>> + }
>> ```
>>
>> ### RESULTS from JDK Micro benchmark:
>> Run on Linux with clocksource=tsc on a 3GHz Intel i5 CPU (see details 
>> below).
>>
>> `make test TEST="micro:java.lang.Systems"`
>>
>> WITH change to os_linux.cpp:
>> ```
>> Benchmark Mode Cnt Score Error Units
>> Systems.currentTimeMillis avgt 25 19.190 ± 0.166 ns/op
>> Systems.instant_now_asEpocNanos avgt 25 29.809 ± 0.191 ns/op
>> Systems.nanoTime avgt 25 18.534 ± 0.024 ns/op
>> ```
>>
>> WITHOUT change to os_linux.cpp: (but with the added JMH benchmark for
>> purpose of comparison)
>> ```
>> Benchmark Mode Cnt Score Error Units
>> Systems.currentTimeMillis avgt 25 19.013 ± 0.033 ns/op
>> Systems.instant_now_asEpocNanos avgt 25 30.459 ± 0.017 ns/op
>> Systems.nanoTime avgt 25 18.971 ± 0.053 ns/op
>> ```
>>
>> ### Platform details:
>> System: Linux (Ubuntu) running on Mac Mini 2018 3 GHz Intel i5
>> ```
>> $ uname -a
>> Linux MacMiniLinux 5.3.0-42-generic #34-Ubuntu SMP Fri Feb 28 05:49:40
>> UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>>
>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>> tsc
>>
>> $ ldd --version
>> ldd (Ubuntu GLIBC 2.30-0ubuntu2.1) 2.30
>>
>> $ lscpu | egrep '(Model name|CPU.s.:)'
>> CPU(s): 6
>> Model name: Intel(R) Core(TM) i5-8500B CPU @ 3.00GHz
>> NUMA node0 CPU(s): 0-5
>>
>> $ lscpu | grep Flags | sed 's@ @\n@g' | grep -i tsc
>> tsc
>> rdtscp
>> constant_tsc
>> nonstop_tsc
>> tsc_deadline_timer
>> tsc_adjust
>> ```
>>
>> ## PATCH to JDK test code - that logs realtime clock precision:
>>
>> ```
>> diff --git a/test/jdk/java/time/test/java/time/TestClock_System.java
>> b/test/jdk/java/time/test/java/time/TestClock_System.java
>> --- a/test/jdk/java/time/test/java/time/TestClock_System.java
>> +++ b/test/jdk/java/time/test/java/time/TestClock_System.java
>> @@ -177,7 +177,8 @@
>> + formatTime("\n\thighest1", highest1));
>> }
>> - int count=0;
>> + int count_betterThanMillisPrecision=0;
>> + int count_betterThanMicrosPrecision=0;
>> // let's preheat the system a bit:
>> int lastNanos = 0;
>> for (int i = 0; i < 1000 ; i++) {
>> @@ -191,7 +192,10 @@
>> lastNanos = nanos;
>> if ((nanos % 1000000) > 0) {
>> - count++; // we have micro seconds
>> + count_betterThanMillisPrecision++; // we have microseconds
>> + }
>> + if ((nanos % 1000) > 0) {
>> + count_betterThanMicrosPrecision++; // we have nanoseconds
>> }
>> if ((sysnan % 1000000) > 0) {
>> throw new RuntimeException("Expected only millisecconds "
>> @@ -200,10 +204,12 @@
>> }
>> }
>> System.out.println("\nNumber of time stamps which had better than"
>> - + " millisecond precision: "+count+"/"+1000);
>> + + " millisecond precision: "+count_betterThanMillisPrecision+"/"+1000);
>> + System.out.println("\nNumber of time stamps which had better than"
>> + + " microsecond precision: "+count_betterThanMicrosPrecision+"/"+1000);
>> System.out.println(formatTime("\nsystemUTC ", system1));
>> System.out.println(formatTime("highestResolutionUTC ", highest1));
>> - if (count == 0) {
>> + if (count_betterThanMillisPrecision == 0) {
>> System.err.println("Something is strange: no microsecond "
>> + "precision with highestResolutionUTC?");
>> throw new RuntimeException("Micro second preccision not reached");
>> ```
>>
>> ### OUTPUT of JDK Test logs observed clock precision
>>
>> Extract of the test log when the test patch was run on the same host as
>> the JMH benchmark above. Shows nanosecond precision
>> (`999/1000 better than microsecond`).
>>
>> `make test 
>> TEST="jtreg:test/jdk/java/time/test/java/time/TestClock_System.java"
>> JTREG="VERBOSE=all"`
>>
>> ```
>> Number of time stamps which had better than millisecond precision: 
>> 1000/1000
>>
>> Number of time stamps which had better than microsecond precision: 
>> 999/1000
>>
>> systemUTC : 2020-04-03T23:14:12.276Z - seconds: 1585955652, nanos: 
>> 276000000
>> highestResolutionUTC : 2020-04-03T23:14:12.276472680Z - seconds:
>> 1585955652, nanos: 276472680
>> ```
>>


More information about the hotspot-runtime-dev mailing list