RFR: 8268228: TSC is not used for CPUTimeStampCounter on AMD processor [v2]
Kim Barrett
kbarrett at openjdk.java.net
Tue Jun 8 03:39:13 UTC 2021
On Mon, 7 Jun 2021 04:55:47 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:
>> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix comments
>
> I think JFR is the only VM subsystem that currently uses the "fast" time
> that is based on TSC, with a fallback to OS time facilities if "fast" time
> is not enabled. (There has been discussion (under JDK-8211240) about
> eliminating that distinction and just always using OS time facilities, but
> it hasn't received much attention.) GC (and maybe other places?) uses the
> dual time mechanism because we want reliable time but also send JFR events.
> So some some of the major VM clients for time information are currently
> paying some cost for having both implementations.
>
> Currently the TSC frequency is always obtained from the CPUID brand string,
> with the bogomips style estimate in initialize_frequency never being used.
> Rdtsc::is_supported() is true iff VM_Version_Ext::supports_tscinv_ext().
> And initialize_frequency() uses the brand string if supports_tscinv_ext().
>
> I think the current implementation of the bogomips calculation can
> intermittently produce catastrophically wrong results. Descheduling at the
> wrong place(s) in the loop can badly mess things up. I thought there was a
> bug for this, but can't find one.
>
> Also, there are things like the Intel erratum referenced here:
> http://lkml.iu.edu/hypermail/linux/kernel/1511.1/01048.html
> that make things even more fun.
>
> I think that detecting a "good" TSC and it's properties (like frequency) is
> pretty hard, and we should not try to duplicate the OS detection or second
> guess it. I also think that using the "fast" time when the TSC is not
> "good" is a mistake, but I have so far not convinced the JFR folks.
>
> So I'm not in favor of this change. I think we should be moving away from
> direct TSC access rather than trying to use it in more cases.
> @kimbarrett Did you mean we cannot detect "good" TSC from invariant TSC flag? If so, I have to withdraw this PR. And also TSC support for intel processor should be removed (it will happen in JDK-8211240?) ASAP.
According to the previously referenced erratum, the invariant TSC flag is
not sufficient; one may also need to check the kernel version or some such.
So there may be a JDK bug to be addressed there, though maybe it's aged out
with support for older kernels.
> If TSC support will remain, it should be detected from CPUID with EAX = 16H, however it is not available on AMD processor as I said before, so we need to bogomips style calculation.
The current bogomips style estimator has the potential for catastrophic
failure even if all the requirements for "safe" direct use of TSC are met.
If those requirements are not met then such failures become more likely and
solutions much harder. My point was that the bogomips estimator is
currently unused unless one explicitly opts-in to potentially bogus data.
-------------
PR: https://git.openjdk.java.net/jdk/pull/4350
More information about the hotspot-dev
mailing list