[jdk16] RFR: 8259765: ZGC: Handle incorrect processor id reported by the operating system [v2]

Per Liden pliden at openjdk.java.net
Thu Jan 21 09:16:47 UTC 2021


On Thu, 21 Jan 2021 08:37:26 GMT, Per Liden <pliden at openjdk.org> wrote:

>> It seems there have been e-mails sent that didn't show up here, so I'm answering on GitHub to hopefully re-attach the discussion to this PR.
>> 
>> From the mailing list:
>>>> Glibc's tst-getcpu.c (which I assume is the test you are referring
>>>> to?) fails in their environment, so it seems like the affinity mask
>>>> isn't reliable either.
>>> 
>>> What's the nature of the failure?  If it's due to a non-changing
>>> affinity mask, then using sched_getaffinity data would still be okay.
>> 
>> Glibc's tst-getcpu fails with some version of "getcpu results X should be Y".
>> 
>> There seems to be a disconnect between CPU masks/affinity and what sched_getcpu() returns.
>> 
>> Example (container with 1 CPU):
>> 
>> 1. sysconf(_SC_NPROCESSORS_CONF) returns 1
>> 2. sysconf(_SC_NPROCESSORS_ONLN) returns 1
>> 3. sched_getaffinity() returns the mask 00000001
>> 4. sched_setaffinity(00000001) returns success, but then sched_getcpu() returns 7(!). Should have returned 0.
>> 
>> Another example (container with 2 CPUs):
>> 
>> 1. sysconf(_SC_NPROCESSORS_CONF) returns 2
>> 2. sysconf(_SC_NPROCESSORS_ONLN) returns 2
>> 3. sched_getaffinity() returns the mask 00000011
>> 4. sched_setaffinity(00000001) returns success, but then sched_getcpu() returns 2(!). Should have returned 0.
>> 5. sched_setaffinity(00000010) returns success, but then sched_getcpu() also returns 2(!). Should have returned 1.
>> 
>> It looks like CPUs are virtualized on some level, but not in sched_getcpu(). I'm guessing sched_getcpu() is returning the CPU id of the physical CPU, and not the virtual CPU, or something. So in the last example, maybe both virtual CPUs were scheduled on the same physical CPU.
>
>> Does sched_getaffinity actually change the affinity mask?
> 
> (assuming you meant sched_setaffinity here...)
> 
> You seem to be right. sched_setaffinity() returns success, but a following call to sched_getaffinity() shows it had no effect.
> 
>> I wonder if it just reports a 2**N - 1 unconditionally, with N being the
>> number of configured vCPUs for the container.  It probably does that so
>> that the population count of the affinity mask matches the vCPU count.
>> Likewise for the CPU entries under /sys (currently ignored by glibc
>> because of a parser bug) and /proc/stat (the fallback actually used by
>> glibc).  There is no virtualization of CPU IDs whatsoever, it looks like
>> it's all done to communicate the vCPU count, without taking into account
>> how badly this interacts with sched_getcpu.
> 
> Yep, that's what it looks like.

> So it isn't that sysconf(_SC_NPROCESSORS_CONF) returns too low a number, as stated in the PR, but rather that after calling sched_setaffinity, sched_getcpu is broken?

It wasn't my intention to claim that sysconf() _is_ the problem here, only that it _might_ be. The reason I mentioned it is because of how Docker behaves: if you give a Docker container 2 CPUs, sysconf() will still return the number of CPUs available on the host system, e.g. 8, and sched_getcpu() will in that case return numbers in the 0-7 range. Of course, this was just an observation; Docker and OpenVZ could do things differently here.

> Either way won't that breakage also potentially affect the NUMA code as well?

We should be good, because libnuma will report that NUMA is not available, so we automatically disable UseNUMA if it's set.

-------------

PR: https://git.openjdk.java.net/jdk16/pull/124



More information about the hotspot-gc-dev mailing list