ZGC heap size and RSS counters
shade at redhat.com
Mon Dec 11 10:12:56 UTC 2017
On 12/11/2017 10:59 AM, Per Liden wrote:
> On 2017-12-11 10:36, Aleksey Shipilev wrote:
>> On 12/11/2017 10:14 AM, Per Liden wrote:
>>> On 2017-12-11 09:55, Aleksey Shipilev wrote:
>>>> Hi there,
>>>> I'm trying ZGC on trivial workloads, and I have a question about footprint. The workload runs with
>>>> -Xms16g -Xmx16g, but the RSS figures are at least 3x larger:
>>>> VmPeak: 18256721392 kB
>>>> VmSize: 18256721392 kB
>>>> VmLck: 0 kB
>>>> VmPin: 0 kB
>>>> VmHWM: 50729036 kB
>>>> VmRSS: 50729036 kB
>>>> RssAnon: 369700 kB
>>>> RssFile: 27688 kB
>>>> RssShmem: 50331648 kB
>>>> Is this because ZGC maps the same physical space with multiple virtual mappings? Or is it a bug?
>>> The kernel's RSS accounting is flaky at best, and varies depending on whether you're using small or
>>> large pages (and it can also vary depending on which kernel version you're using).
>>> On Linux/x86_64, we map the heap in three different locations. When using small pages, you'll
>>> typically see that the same physical page will incorrectly be accounted for three times instead of
>>> once. On the other hand, when using large pages, you'll typically see a different behavior, as it's
>>> accounted to the hugetlbfs inode and not the process.
>>> In summary, it's not a bug in ZGC, but more a limitation in Linux's accounting.
>> Understood, that's what I thought. Do you think that is a problem in light of the pervasive use of
>> containers that allocate/limit resources based on RSS?
> If RSS limits are used in a container, then I'd argue that the kernel had better get the accounting
> right, otherwise those limits are fairly useless, wouldn't you say? In the kernel's defense, it is
> gradually getting better in this area.
I agree that it's the kernel's job to account for this properly. But I am also concerned about the
practicalities of real deployments on current kernels :( Shenandoah is also about to do
double-mapping for related reasons, and it would eventually run into the same trouble. I was wondering
if you have observed problems with ZGC running in containers that shed more light on this concern.
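
For reference, the numbers quoted above are consistent with the triple-mapping explanation. The sketch below is only a sanity check on the figures in this thread, not anything ZGC or the kernel provides: it parses the /proc status lines and divides RssShmem by the configured heap size. It assumes the 16 GiB heap is fully backed and that RssShmem is entirely heap; the helper names (parse_status_kb, mapping_factor) are made up for illustration.

```python
# Sanity check on the /proc/<pid>/status figures quoted in this thread.
# Assumptions (not from the kernel or ZGC docs): the 16 GiB heap is fully
# backed, and RssShmem consists entirely of heap pages.

STATUS_SAMPLE = """\
VmHWM: 50729036 kB
VmRSS: 50729036 kB
RssAnon: 369700 kB
RssFile: 27688 kB
RssShmem: 50331648 kB
"""


def parse_status_kb(text):
    """Return {field: value_in_kB} for lines of the form 'Name: 123 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


def mapping_factor(rss_shmem_kb, heap_gib):
    """How many times the heap appears to be counted in RssShmem."""
    heap_kb = heap_gib * 1024 * 1024
    return rss_shmem_kb / heap_kb


fields = parse_status_kb(STATUS_SAMPLE)

# RssShmem relative to a 16 GiB heap.
factor = mapping_factor(fields["RssShmem"], 16)

# VmRSS is the sum of the three Rss* counters.
rss_sum_kb = fields["RssAnon"] + fields["RssFile"] + fields["RssShmem"]

# Rough footprint estimate if each heap page were counted once instead of
# three times (the factor of three is taken from the explanation above).
deduplicated_kb = fields["RssAnon"] + fields["RssFile"] + fields["RssShmem"] // 3

print("mapping factor:", factor)
print("VmRSS matches sum:", rss_sum_kb == fields["VmRSS"])
print("estimated footprint (kB):", deduplicated_kb)
```

So RssShmem is exactly three times the 16 GiB heap, and counting the heap once would put the footprint at roughly 16.4 GiB. An RSS-based container limit sized for the real footprint would still see the larger figure on kernels that account this way.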