os::javaTimeSystemUTC to call nanosecond precision OS API, so Clock.systemUTC() can give nanosecond precision UTC

Mark Kralj-Taylor kralj.mark at gmail.com
Fri Apr 10 17:03:41 UTC 2020


I'd like to help Java Clock.systemUTC() expose nanosecond precision
UTC realtime (wall-time) clock, where OS and CPU allow.

Please let me know if this would be acceptable for OpenJDK, and what
next steps I can take to progress this towards submitting a patch.
- I have some time at the moment. I've signed the OCA, but don't have
login to the OpenJDK bugtracker, if you want me to submit a proper hg
patch there.
- This mail includes the patch as `hg diff` outputs together with JMH
and test output from it.

The patch here improves the existing API java.time.Instant.now() i.e.
Clock.systemUTC().instant() on Linux to return Linux
clock_gettime(CLOCK_REALTIME) which can be nanosecond precision (and
very low cost, see JMH results) when Linux has a suitable clocksource
(tsc).

The nice thing about this is that it doesn't add a new JDK API, nor
change semantics of the existing Clock.systemUTC().instant() API which
already supports nanosecond granularity.

Clock.systemUTC().instant() is backed by Hotspot
os::javaTimeSystemUTC, which can be updated to call higher precision
OS real-time clock API when available (and has similar performance),
rather than the current code that multiplies microseconds from
gettimeofday() by 1000 to get nanoseconds.

In Linux this means calling the Posix clock_gettime(CLOCK_REALTIME)
API instead of the older gettimeofday() whose output is limited to
microsecond precision. On a suitable CPU Linux can back
clock_gettime(CLOCK_REALTIME) with a nanosecond precision CPU TSC
clocksource. Whereas the output of gettimeofday() is limited to
microsecond precision (even when Linux is using a higher precision
clocksource).
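The difference is observable from plain Java. As a minimal sketch (the class and method names here are illustrative only, not part of the patch), the following samples a Clock and counts how many timestamps carry digits below a given unit. With gettimeofday() behind os::javaTimeSystemUTC the "finer than microsecond" count would be expected to stay at 0, because the nanosecond field is always a multiple of 1000.

```java
import java.time.Clock;

final class ClockPrecisionProbe {

    /**
     * Counts samples whose nanosecond-of-second field is not a whole
     * multiple of unitNanos, i.e. samples that show finer precision
     * than that unit.
     */
    static int countFinerThan(Clock clock, int samples, int unitNanos) {
        int finer = 0;
        for (int i = 0; i < samples; i++) {
            if (clock.instant().getNano() % unitNanos != 0) {
                finer++;
            }
        }
        return finer;
    }

    public static void main(String[] args) {
        Clock utc = Clock.systemUTC();
        int samples = 1000;
        System.out.println("finer than millisecond: "
                + countFinerThan(utc, samples, 1_000_000) + "/" + samples);
        System.out.println("finer than microsecond: "
                + countFinerThan(utc, samples, 1_000) + "/" + samples);
    }
}
```

The counts depend on the JDK build, OS and clocksource, so no expected output is given here.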

A similar change could be made for other OS that offer a performant
API to get hi-precision wall-time.
- OSX: Unfortunately on my Mac I found that
clock_gettime(CLOCK_REALTIME) is limited to microseconds precision
(source shows it calling gettimeofday() and multiplying by 1000 - see
note on recent Linux change to discourage that). An alternate approach
that did get nanosecond precision wall-time was too slow to be of
interest (>900ns!): Calling OSX host_get_clock_service CALENDAR_CLOCK
in JVM init, then clock_get_time(clockPortFromPrev,...).
- Windows: GetSystemTimePreciseAsFileTime looks promising. See
https://docs.microsoft.com/en-gb/windows/win32/api/sysinfoapi/nf-sysinfoapi-getsystemtimepreciseasfiletime
. Unfortunately I don't have access to a Windows machine to try this
on, but if you are interested in the Linux patch I could buy Windows
for a laptop we have to see how it fares in JMH.

Details, patch and JMH / test outputs follow below.
Thanks,
Mark

-----------------

# RATIONALE: Why does Java need a nanosecond precision real-time clock?

Like many others working on lower-latency Java systems running on
Linux I have been using JNI to call clock_gettime(CLOCK_REALTIME, ...)
to do this for years. Hi-precision (nanosecond) UTC timestamps have
been essential in understanding and improving end-to-end latency and
performance of multi-process systems. Including nanosecond wall-time
after-receive and before-send timestamps in messages sent over the
wire (or shared memory, or in stored events) gives tremendous
transparency on latency in multi-process and multi-host event
processing and workflows.

Java has evolved to remove more and more common cases for JNI. Getting
a nanosecond real-time clock timestamp feels a good candidate for that
process, especially because it looks like hi-precision real-time UTC
clock APIs are common in the minimum OS versions Java is built for
(even if some OS still give microsecond precision wall-time behind an
API that returns nanoseconds - as OSX 10.15 did for me - built with
XCode 10.1 as per JDK docs).

Timestamps from a monotonic clock can only be compared within the same
host, or for Java's System.nanoTime() within the same Java process.
This is very useful, but doesn't help with multi-process, multi-host,
multi-language use-cases.

Realtime (wall-time) UTC clock timestamps are useful because they can
be compared between different processes, possibly running on a
different host, as well as within a process. Within the same
data-centre latency between processes is typically well under a
millisecond for UDP / multicast between hosts. Latency can be
sub-microsecond for shared-memory between processes on the same host.
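To make the cross-process comparison concrete, here is a small sketch (the class and method names are hypothetical) that turns an Instant into nanoseconds since the epoch, so that a before-send stamp taken in one process can be subtracted from an after-receive stamp taken in another. The result is only as trustworthy as the clock sync between the two hosts.

```java
import java.time.Instant;

final class WallClockStamps {

    /** Nanoseconds since 1970-01-01T00:00:00Z; throws ArithmeticException outside the range of a long. */
    static long toEpochNanos(Instant t) {
        return Math.addExact(
                Math.multiplyExact(t.getEpochSecond(), 1_000_000_000L),
                (long) t.getNano());
    }

    /** One-way latency between a sender's and a receiver's wall-clock stamps. */
    static long oneWayLatencyNanos(long beforeSendEpochNanos, long afterReceiveEpochNanos) {
        return afterReceiveEpochNanos - beforeSendEpochNanos;
    }

    public static void main(String[] args) {
        long beforeSend = toEpochNanos(Instant.now());
        // ... the message would travel over the wire or shared memory here ...
        long afterReceive = toEpochNanos(Instant.now());
        System.out.println("latency ns: " + oneWayLatencyNanos(beforeSend, afterReceive));
    }
}
```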

A monotonic clock is often favoured for timing latency, because
real-time clocks on different hosts are subject to slewing and
stepping. Clock-sync technologies drastically reduce the impact of these
on a host, and also the time offsets between different hosts (especially in
the same data centre). For high event rate systems you can look at
latency as percentiles over time-windows. A time-series of these
per-window percentiles also gives an idea of clock sync quality: if
different hosts step their clocks at different times, you see the
latency distribution shift from one time-window to the next.
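A sketch of the windowed-percentile idea, under stated assumptions: each sample is a pair of {receive time in epoch nanos, latency in nanos}, windows are fixed-width, and the percentile is computed by the simple nearest-rank method. All names are illustrative; a production system would more likely use a histogram library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WindowedLatency {

    /** Nearest-rank percentile of a non-empty, ascending-sorted array; p in (0, 100]. */
    static long percentile(long[] sortedAscending, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedAscending.length);
        return sortedAscending[Math.max(rank, 1) - 1];
    }

    /**
     * Buckets samples into fixed windows by receive time and returns the
     * requested latency percentile per window, keyed by window start.
     * Each sample is {receiveEpochNanos, latencyNanos}.
     */
    static TreeMap<Long, Long> percentileByWindow(long[][] samples, long windowNanos, double p) {
        TreeMap<Long, List<Long>> buckets = new TreeMap<>();
        for (long[] s : samples) {
            long windowStart = Math.floorDiv(s[0], windowNanos) * windowNanos;
            buckets.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(s[1]);
        }
        TreeMap<Long, Long> result = new TreeMap<>();
        for (Map.Entry<Long, List<Long>> e : buckets.entrySet()) {
            long[] sorted = e.getValue().stream().mapToLong(Long::longValue).sorted().toArray();
            result.put(e.getKey(), percentile(sorted, p));
        }
        return result;
    }

    public static void main(String[] args) {
        long[][] samples = {
            {0, 10}, {1, 20}, {2, 30},   // first window
            {1000, 110}, {1001, 120}     // second window: distribution has shifted
        };
        System.out.println(percentileByWindow(samples, 1000, 50.0));
    }
}
```

A sustained jump in a percentile between adjacent windows, with no matching change in load, is the kind of shift that would suggest a clock step on one of the hosts.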

-----------------

# IS THIS A SAFE CHANGE for the JDK?

It makes no change to semantics of Clock.systemUTC(), whose JavaDoc
says: "This clock is based on the best available system clock. This
may use System.currentTimeMillis(), or a higher resolution clock if
one is available.".

Because there is no new JDK API, nor change in API semantics, this
enhancement can be made on an OS by OS basis.

-----------------

# IS THIS A SAFE CHANGE for Linux?

The "clock_gettime(3) - Linux man page" says: "All implementations
support the system-wide realtime clock, which is identified by
CLOCK_REALTIME. Its time represents seconds and nanoseconds since the
Epoch.". See: https://linux.die.net/man/3/clock_gettime

FYI Some older Linux ports used to implement
clock_gettime(CLOCK_REALTIME,) by calling gettimeofday() and
multiplying by 1000. This would be fine with the patch proposed, but
means that resolution would be limited to microseconds by the OS. A
glibc change 6 months ago has discouraged that in ports: "Use
clock_gettime to implement gettimeofday"
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=5e46749c64d51f50f8511ed99c1266d7c13e182b
.

-----------------

# Related OpenJDK BugTrackers

Comments on these enhancements requests suggest some interest in
calling Linux clock_gettime(CLOCK_REALTIME) instead of gettimeofday().
Although the enhancement requests themselves are not the same as this
mail, which asks to enhance an existing API, within its existing
semantics.
- JDK-6709908
- JDK-8185891

Since Java 9 Clock.systemUTC() has up to microsecond resolution on Linux
- JDK-8068730 Increase the precision of the implementation of
java.time.Clock.systemUTC()

-----------------

# POSSIBLE FOLLOW ON (that is NOT part of the patch or proposal discussed here)

A possible follow-on would be to add a JDK API to give more direct
(lower cost) access to a hi-precision timestamp as nanoseconds since
the UTC epoch in a long. For example System.currentTimeNanos(). This would
be different to the proposal above in that its range would be limited
to year 2262 for a signed long, which is more than 200 years in the
future, but less than Instant.MAX. A JMH benchmark with
systemTimeNanos() as an intrinsic (following the style of
currentTimeMillis) demonstrated that this is attractive, with similar
cost to System.currentTimeMillis()/nanoTime(), so better than
Instant.now() or the JNI that I used. But initially I'd rather focus on
improving Java without adding any new APIs, or discussing if ~200
years is enough of a range.
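The 2262 limit can be checked with existing java.time APIs. This sketch (class name is illustrative) converts the largest value of a signed 64-bit long, read as nanoseconds since the epoch, back into an Instant:

```java
import java.time.Instant;
import java.time.ZoneOffset;

final class EpochNanosRange {

    /** The latest instant representable as nanoseconds since the epoch in a signed long. */
    static Instant maxSignedLongNanos() {
        return Instant.ofEpochSecond(0, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        Instant limit = maxSignedLongNanos();
        System.out.println(limit);  // 2262-04-11T23:47:16.854775807Z
        System.out.println(limit.atZone(ZoneOffset.UTC).getYear());  // 2262
        System.out.println(limit.isBefore(Instant.MAX));  // true
    }
}
```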

-----------------

# PATCHES to Hotspot, tests and output of those tests

## PATCH to hotspot/os/linux/os_linux.cpp

This patch changes both os::javaTimeMillis and os::javaTimeSystemUTC
to call clock_gettime(CLOCK_REALTIME, ..) instead of gettimeofday().
This keeps the 2 methods obviously consistent, and avoids someone
reading the code having questions on the consistency of different
Linux APIs. Results from JMH benchmark (see below) show this is ok.

This seems consistent with direction being taken by Linux, based on
change: "Use clock_gettime to implement gettimeofday" at
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=5e46749c64d51f50f8511ed99c1266d7c13e182b

```
diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp
--- a/src/hotspot/os/linux/os_linux.cpp
+++ b/src/hotspot/os/linux/os_linux.cpp
@@ -88,6 +88,7 @@
# include <errno.h>
# include <dlfcn.h>
# include <stdio.h>
+# include <time.h>
# include <unistd.h>
# include <sys/resource.h>
# include <pthread.h>
@@ -1374,18 +1375,18 @@
}
jlong os::javaTimeMillis() {
- timeval time;
- int status = gettimeofday(&time, NULL);
+ timespec time;
+ int status = clock_gettime(CLOCK_REALTIME, &time);
assert(status != -1, "linux error");
- return jlong(time.tv_sec) * 1000 + jlong(time.tv_usec / 1000);
+ return jlong(time.tv_sec) * 1000 + jlong(time.tv_nsec / 1000000);
}
void os::javaTimeSystemUTC(jlong &seconds, jlong &nanos) {
- timeval time;
- int status = gettimeofday(&time, NULL);
+ timespec time;
+ int status = clock_gettime(CLOCK_REALTIME, &time);
assert(status != -1, "linux error");
seconds = jlong(time.tv_sec);
- nanos = jlong(time.tv_usec) * 1000;
+ nanos = jlong(time.tv_nsec);
}
void os::Linux::fast_thread_clock_init() {
```

## PATCH to test/micro JMH benchmark to assess impact of change on Instant.now(), and show its performance relative to System.currentTimeMillis()/nanoTime()

Where would the best place be to add an Instant.now() JMH benchmark?
- I'd guess the convention is to follow the package of the API
being benchmarked, but I like that putting benchmarks of system
timestamping APIs in one class makes it easy and natural to spot them all
and compare them. Could the benchmark be added here, then the class
renamed to SystemTime.java to focus on benchmarking those features?

```
diff --git a/test/micro/org/openjdk/bench/java/lang/Systems.java b/test/micro/org/openjdk/bench/java/lang/Systems.java
--- a/test/micro/org/openjdk/bench/java/lang/Systems.java
+++ b/test/micro/org/openjdk/bench/java/lang/Systems.java
@@ -28,6 +28,7 @@
import org.openjdk.jmh.annotations.OutputTimeUnit;
import java.util.concurrent.TimeUnit;
+import java.time.Instant;
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@@ -43,4 +44,9 @@
return System.nanoTime();
}
+ @Benchmark
+ public long instant_now_asEpocNanos() {
+ Instant now = Instant.now();
+ return now.getEpochSecond() * 1_000_000_000L + now.getNano();
+ }
```

### RESULTS from JDK Micro benchmark:
Run on Linux with clocksource=tsc on a 3GHz Intel i5 CPU (see details below).

`make test TEST="micro:java.lang.Systems"`

WITH change to os_linux.cpp:
```
Benchmark                        Mode  Cnt   Score   Error  Units
Systems.currentTimeMillis        avgt   25  19.190 ± 0.166  ns/op
Systems.instant_now_asEpocNanos  avgt   25  29.809 ± 0.191  ns/op
Systems.nanoTime                 avgt   25  18.534 ± 0.024  ns/op
```

WITHOUT change to os_linux.cpp: (but with the added JMH benchmark for
purpose of comparison)
```
Benchmark                        Mode  Cnt   Score   Error  Units
Systems.currentTimeMillis        avgt   25  19.013 ± 0.033  ns/op
Systems.instant_now_asEpocNanos  avgt   25  30.459 ± 0.017  ns/op
Systems.nanoTime                 avgt   25  18.971 ± 0.053  ns/op
```

### Platform details:
System: Linux (Ubuntu) running on Mac Mini 2018 3 GHz Intel i5
```
$ uname -a
Linux MacMiniLinux 5.3.0-42-generic #34-Ubuntu SMP Fri Feb 28 05:49:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

$ ldd --version
ldd (Ubuntu GLIBC 2.30-0ubuntu2.1) 2.30

$ lscpu | egrep '(Model name|CPU.s.:)'
CPU(s): 6
Model name: Intel(R) Core(TM) i5-8500B CPU @ 3.00GHz
NUMA node0 CPU(s): 0-5

$ lscpu | grep Flags | sed 's@ @\n@g' | grep -i tsc
tsc
rdtscp
constant_tsc
nonstop_tsc
tsc_deadline_timer
tsc_adjust
```

## PATCH to JDK test code - that logs realtime clock precision:

```
diff --git a/test/jdk/java/time/test/java/time/TestClock_System.java b/test/jdk/java/time/test/java/time/TestClock_System.java
--- a/test/jdk/java/time/test/java/time/TestClock_System.java
+++ b/test/jdk/java/time/test/java/time/TestClock_System.java
@@ -177,7 +177,8 @@
+ formatTime("\n\thighest1", highest1));
}
- int count=0;
+ int count_betterThanMillisPrecision=0;
+ int count_betterThanMicrosPrecision=0;
// let's preheat the system a bit:
int lastNanos = 0;
for (int i = 0; i < 1000 ; i++) {
@@ -191,7 +192,10 @@
lastNanos = nanos;
if ((nanos % 1000000) > 0) {
- count++; // we have micro seconds
+ count_betterThanMillisPrecision++; // we have microseconds
+ }
+ if ((nanos % 1000) > 0) {
+ count_betterThanMicrosPrecision++; // we have nanoseconds
}
if ((sysnan % 1000000) > 0) {
throw new RuntimeException("Expected only millisecconds "
@@ -200,10 +204,12 @@
}
}
System.out.println("\nNumber of time stamps which had better than"
- + " millisecond precision: "+count+"/"+1000);
+ + " millisecond precision: "+count_betterThanMillisPrecision+"/"+1000);
+ System.out.println("\nNumber of time stamps which had better than"
+ + " microsecond precision: "+count_betterThanMicrosPrecision+"/"+1000);
System.out.println(formatTime("\nsystemUTC ", system1));
System.out.println(formatTime("highestResolutionUTC ", highest1));
- if (count == 0) {
+ if (count_betterThanMillisPrecision == 0) {
System.err.println("Something is strange: no microsecond "
+ "precision with highestResolutionUTC?");
throw new RuntimeException("Micro second preccision not reached");
```

### OUTPUT of JDK Test logs observed clock precision

Extract of the test log when the test patch was run on the same host as
the JMH benchmark above. It shows nanosecond precision
(`999/1000 better than microsecond`).

`make test TEST="jtreg:test/jdk/java/time/test/java/time/TestClock_System.java" JTREG="VERBOSE=all"`

```
Number of time stamps which had better than millisecond precision: 1000/1000

Number of time stamps which had better than microsecond precision: 999/1000

systemUTC : 2020-04-03T23:14:12.276Z - seconds: 1585955652, nanos: 276000000
highestResolutionUTC : 2020-04-03T23:14:12.276472680Z - seconds: 1585955652, nanos: 276472680
```

