RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2]

Kim Barrett kbarrett at openjdk.java.net
Mon Jun 7 04:59:01 UTC 2021


On Fri, 4 Jun 2021 05:24:15 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:

>> I ran the JVM on a Ryzen 3 3300X and got the following `jdk.CPUTimeStampCounter` event.
>> 
>> 
>> jdk.CPUTimeStampCounter {
>>   startTime = 10:41:14.993
>>   fastTimeEnabled = false
>>   fastTimeAutoEnabled = true
>>   osFrequency = 1000000000
>>   fastTimeFrequency = 1000000000
>> }
>> 
>> 
>> I confirmed that the 3300X supports invariant TSC (so `fastTimeAutoEnabled` is set to `true`); however, it does not seem to be used (`fastTimeEnabled` is `false`).
>> 
>> The frequency comes from the CPUID brand string (e.g. "Intel(R) Core(TM) i3-8145U CPU @ 2.10GHz"). However, the brand string on AMD processors (Ryzen at least) does not include it ("AMD Ryzen 3 3300X 4-Core Processor").
>> Fortunately, rdtsc_x86.cpp can estimate the frequency, bogomips style. We should fall back to that estimate when we cannot get the frequency, even if invariant TSC is supported.
>> 
>> After this change, I got the following `jdk.CPUTimeStampCounter` event. The base clock of the Ryzen 3 3300X is 3.8GHz, so `fastTimeFrequency` looks good.
>> 
>> 
>> jdk.CPUTimeStampCounter {
>>   startTime = 10:33:52.884
>>   fastTimeEnabled = true
>>   fastTimeAutoEnabled = true
>>   osFrequency = 10000000 Hz
>>   fastTimeFrequency = 3792929124 Hz
>> }
>> 
>> 
>> This problem is not limited to JFR. I confirmed that the `Rdtsc` class is used in ticks.cpp, which relates to GC code at least.
>
> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix comments

I think JFR is the only VM subsystem that currently uses the "fast" time
that is based on TSC, with a fallback to OS time facilities if "fast" time
is not enabled.  (There has been discussion (under JDK-8211240) about
eliminating that distinction and just always using OS time facilities, but
it hasn't received much attention.)  GC (and maybe other places?) uses the
dual time mechanism because we want reliable time but also send JFR events.
So some of the major VM clients for time information are currently
paying some cost for having both implementations.

Currently the TSC frequency is always obtained from the CPUID brand string,
with the bogomips style estimate in initialize_frequency never being used.
Rdtsc::is_supported() is true iff VM_Version_Ext::supports_tscinv_ext().
And initialize_frequency() uses the brand string if supports_tscinv_ext().
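
To make that control flow concrete, here is a minimal standalone sketch.
It is not the HotSpot code: frequency_from_brand_string and
tsc_frequency_as_described are hypothetical names, and the parsing is
only an assumption about how a trailing "@ 2.10GHz" might be read.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper (not HotSpot): extract a nominal frequency in Hz
// from a CPUID brand string such as "... CPU @ 2.10GHz".
// Returns 0 when the string carries no frequency.
inline uint64_t frequency_from_brand_string(const std::string& brand) {
  const std::size_t at = brand.rfind('@');
  if (at == std::string::npos) return 0;      // e.g. the Ryzen brand string
  const char* begin = brand.c_str() + at + 1;
  char* end = nullptr;
  const double value = std::strtod(begin, &end);  // skips leading spaces
  if (end == begin || value <= 0.0) return 0;
  const std::string unit(end);
  double scale = 0.0;
  if (unit.compare(0, 3, "GHz") == 0) {
    scale = 1e9;
  } else if (unit.compare(0, 3, "MHz") == 0) {
    scale = 1e6;
  } else {
    return 0;
  }
  return static_cast<uint64_t>(value * scale + 0.5);
}

// Hypothetical model of the behaviour described above: "fast" time is
// only supported with invariant TSC, and on that path the brand string
// is the only source consulted, so the estimate is never reached.
inline uint64_t tsc_frequency_as_described(bool invariant_tsc,
                                           const std::string& brand,
                                           uint64_t estimate_hz) {
  if (!invariant_tsc) return 0;  // "fast" time not supported at all
  (void)estimate_hz;             // never consulted on this path
  return frequency_from_brand_string(brand);
}
```

With the two brand strings quoted above, the Intel string yields
2100000000 and the Ryzen string yields 0, which matches the reported
case of invariant TSC being present without a usable frequency.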

I think the current implementation of the bogomips calculation can
intermittently produce catastrophically wrong results.  Descheduling at the
wrong place(s) in the loop can badly mess things up.  I thought there was a
bug for this, but can't find one.
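
A rough simulation of that failure mode, under stated assumptions: the
estimate is taken to be delta-TSC divided by delta-OS-time over a short
interval, and the thread is assumed to be descheduled between the final
OS clock read and the final TSC read.  Sample, estimate_hz and simulate
are hypothetical names; no real TSC is read here.

```cpp
#include <cassert>
#include <cstdint>

// One calibration sample: TSC and OS clock readings taken at the start
// and at the end of a short measurement interval.
struct Sample {
  uint64_t tsc_start;
  uint64_t tsc_end;
  uint64_t os_start_ns;
  uint64_t os_end_ns;
};

// Bogomips-style estimate: TSC ticks per second of OS time.
inline uint64_t estimate_hz(const Sample& s) {
  assert(s.os_end_ns > s.os_start_ns);
  const double ticks = static_cast<double>(s.tsc_end - s.tsc_start);
  const double elapsed_ns = static_cast<double>(s.os_end_ns - s.os_start_ns);
  return static_cast<uint64_t>(ticks * 1e9 / elapsed_ns + 0.5);
}

// Simulated sample for a TSC ticking at true_hz.  gap_ns models time the
// thread spends descheduled after reading the OS clock at the end of the
// interval but before reading the TSC, so only the TSC sees the gap.
inline Sample simulate(uint64_t true_hz, uint64_t interval_ns,
                       uint64_t gap_ns) {
  Sample s;
  s.tsc_start = 0;
  s.os_start_ns = 0;
  s.os_end_ns = interval_ns;
  s.tsc_end = true_hz * (interval_ns + gap_ns) / 1000000000ULL;
  return s;
}
```

For a 3.8GHz TSC and a 1ms interval, simulate(3800000000, 1000000, 0)
gives an estimate of 3800000000, while a 10ms gap gives 41800000000,
i.e. eleven times too high from a single badly placed preemption.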

Also, there are things like the Intel erratum referenced here:
http://lkml.iu.edu/hypermail/linux/kernel/1511.1/01048.html
that make things even more fun.

I think that detecting a "good" TSC and its properties (like frequency) is
pretty hard, and we should not try to duplicate the OS detection or
second-guess it.  I also think that using the "fast" time when the TSC is
not "good" is a mistake, but I have so far not convinced the JFR folks.

So I'm not in favor of this change.  I think we should be moving away from
direct TSC access rather than trying to use it in more cases.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4350

