Timing Differences Between JFR and GC Logs

Sat Aug 15 02:56:58 UTC 2020

Hi Ana,

Simple answer is that JFR insists on trying to use rdtsc if it thinks 
the machine has a safe/stable tsc value to read. There is some risk in 
doing this as some machines lie about their TSC capabilities. That is 
one reason why the FastUnorderedElapsedCounterSource flag exists.

Trying to correlate/synchronise a raw clock using rdtsc and the OS 
reported time is a very difficult problem and not one we are trying to 
solve. The JFR folk (cc'd) are the best ones to discuss the details of 
JFR timestamps, but I don't think anyone should be trying to correlate 
the JFR event information with the UL event information.

That is probably not a very satisfactory answer :)

All the best for your internship!

Cheers,
David Holmes

On 15/08/2020 8:11 am, Ana Marsh wrote:
> Hi everyone,
> 
> My name is Ana Marsh and I am interning at Microsoft on the Java Engineering Team (under Martijn Verburg) and this summer I have been diving into JFR in the JVM and analyzing how it gathers information on garbage collection events. I have noticed over the past two months of digging that the start time of a garbage collection event returned by JFR varies widely from the start time of the same event given with Unified Logging. I decided to explore this further by looking at the specifics of OpenJDK code and found some interesting points to highlight.
> 
> 
>    1.  Logging decides to log the time of events as it prints and not necessarily as the events happen. That being said, it appears that JFR collects the time directly at the beginning of an event making it appear as a better marker of the start time of an event.
> 
> 
>    1.  Summary:
> Garbage collection timing is measured in Ticks which holds values from two different clocks when using an x86 machine. These clocks are called "os"  (ElapsedCounterSource) and "rdtsc" (FastUnorderedElapsedCounterSource) and one value of Ticks will hold a relevant time for each clock. When a GC event begins (specifically Parallel GC because that is what I have been looking at) it registers the start in Ticks by calling now(). This is what now() looks like:
> jdk/src/hotspot/share/utilities/ticks.cpp
> CompositeElapsedCounterSource::Type CompositeElapsedCounterSource::now() {
>    CompositeTime ct;
>    ct.val1 = ElapsedCounterSource::now();
> #if defined(X86) && !defined(ZERO)
>    static bool initialized = false;
>    static bool valid_rdtsc = false;
>    if (!initialized) {
>      valid_rdtsc = Rdtsc::initialize();
>      initialized = true;
>    }
>    if (valid_rdtsc) {
>      ct.val2 = Rdtsc::elapsed_counter();
>    }
> #endif
>    return ct;
> }
> 
> ElapsedCounterSource::Type ElapsedCounterSource::now() {
>    return os::elapsed_counter();
> }
> 
> For JFR files, the start time is set to the RDTSC value while in GC logs it uses the OS clock as it prints out the log. This is obviously a further point of concern for timing differences, so I wanted to see how much of a contrast there was in the times these clocks reported.
> 
> Experimentation with Results:
> To find the differences between the two I added this to the code:
> jdk/src/hotspot/share/utilities/ticks.cpp
> CompositeElapsedCounterSource::Type CompositeElapsedCounterSource::now() {
>    CompositeTime ct;
>    ct.val1 = ElapsedCounterSource::now();
> #if defined(X86) && !defined(ZERO)
>    static bool initialized = false;
>    static bool valid_rdtsc = false;
>    if (!initialized) {
>      valid_rdtsc = Rdtsc::initialize();
>      initialized = true;
>    }
>    if (valid_rdtsc) {
>      ct.val2 = Rdtsc::elapsed_counter();
>    }
> #endif
>    std::cout << "I got ticks!!\n";
>    std::cout << "OS time:      ";
>    std::cout << ElapsedCounterSource::seconds(ct.val1) << " seconds\n";
>    std::cout<< "RDTSC time:   ";
>    std::cout << FastUnorderedElapsedCounterSource::seconds(ct.val2) <<  " seconds\n\n";
>    return ct;
> }
> 
> I ran this with a test application that forces garbage collection events and I also had a JFR file and GC log being recorded during the run as well. For reference I am running on a Windows 10 machine but using WSL with Ubuntu.
> 
> Here are a few outputs from the beginning of the application:
> I got ticks!!
> OS time:      2.0739 seconds
> RDTSC time:   1.16455 seconds
> 
> I got ticks!!
> OS time:      2.07512 seconds
> RDTSC time:   1.16591 seconds
> 
> I got ticks!!
> OS time:      2.07549 seconds
> RDTSC time:   1.16631 seconds
> 
> Here are a few outputs from ~40 seconds into the application:
> I got ticks!!
> OS time:      43.0537 seconds
> RDTSC time:   46.717 seconds
> 
> I got ticks!!
> OS time:      43.0562 seconds
> RDTSC time:   46.7198 seconds
> 
> I got ticks!!
> OS time:      43.0562 seconds
> RDTSC time:   46.7198 seconds
> 
> Here are a few outputs from ~200 seconds into the application:
> I got ticks!!
> OS time:      191.654 seconds
> RDTSC time:   211.898 seconds
> 
> I got ticks!!
> OS time:      191.654 seconds
> RDTSC time:   211.898 seconds
> 
> I got ticks!!
> OS time:      191.654 seconds
> RDTSC time:   211.898 seconds
> 
> 
> Observations on Results:
> Both ElapsedCounterSource (OS) and FastUnorderedElapsedCounterSource (RDTSC) have their own way of determining seconds so there could be some issues in there as well. Additionally in this example the clocks are not being set instantaneously causing some numbers to be off, however at the beginning of my test the OS clock ends up reporting a later time even though it is called first. Over time though RDTSC seems to begin reporting later times. Overall these clocks get far very off, in degrees of seconds which seems big in the clock world.
> 
> Any insight or thoughts on this matter would be helpful. Thank you!
> 
> Best,
> Ana Marsh
>