RFR: JDK-8320859: gtest high malloc footprint caused by BufferNodeAllocator stress test
Kim Barrett
kbarrett at openjdk.org
Tue Nov 28 12:26:03 UTC 2023
On Tue, 28 Nov 2023 11:02:24 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> `BufferNodeAllocatorTest.stress_free_list_allocator_vm` is too expensive. On my box, this test accumulates ~1.5 GB of malloc footprint and raises the libc memory retention from 50 MB to 800 MB.
>
> It is quite alone in its hunger. The rest of the tests together accumulate just ~50 MB of libc retention.
>
> The test stress-tests the BufferNodeAllocator. Four "mutator threads" race four "gc threads". Mutator threads allocate buffers, GC threads release them. No processing is done on the buffers; this seems to be purely a test of the allocator and its freelist mechanism. The memory footprint of this test depends on the number of retained free buffers in the allocator. The allocator bulk-releases free buffers once a free-count threshold is reached. On my box, the number of free items in the unmodified stock VM is 100k..200k.
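
For illustration only (not part of the patch): a minimal sketch of a free list with threshold-triggered bulk release, as described in the paragraph above. `ToyFreeListAllocator` and its members are hypothetical names, and the mutex is a simplification; the sketch makes no claim about how the real allocator synchronizes or when exactly it drains.

```cpp
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

// Toy model only; names and locking scheme are assumptions for illustration.
class ToyFreeListAllocator {
 public:
  ToyFreeListAllocator(std::size_t buffer_bytes, std::size_t release_threshold)
      : buffer_bytes_(buffer_bytes), release_threshold_(release_threshold) {}

  // Return any still-cached buffers to malloc on destruction.
  ~ToyFreeListAllocator() { drain_locked(); }

  ToyFreeListAllocator(const ToyFreeListAllocator&) = delete;
  ToyFreeListAllocator& operator=(const ToyFreeListAllocator&) = delete;

  // "Mutator" side: reuse a cached buffer if one exists, else call malloc.
  void* allocate() {
    std::lock_guard<std::mutex> guard(lock_);
    if (!free_list_.empty()) {
      void* p = free_list_.back();
      free_list_.pop_back();
      return p;
    }
    return std::malloc(buffer_bytes_);
  }

  // "GC" side: cache the buffer, then bulk-release everything once the
  // number of cached buffers exceeds the threshold.
  void release(void* p) {
    std::lock_guard<std::mutex> guard(lock_);
    free_list_.push_back(p);
    if (free_list_.size() > release_threshold_) {
      drain_locked();
    }
  }

  // Number of buffers currently retained on the free list.
  std::size_t free_count() {
    std::lock_guard<std::mutex> guard(lock_);
    return free_list_.size();
  }

 private:
  // Caller must hold lock_ (or be the destructor).
  void drain_locked() {
    for (void* p : free_list_) {
      std::free(p);
    }
    free_list_.clear();
  }

  const std::size_t buffer_bytes_;
  const std::size_t release_threshold_;
  std::mutex lock_;
  std::vector<void*> free_list_;
};
```

In this toy model the retained footprint is bounded by roughly `release_threshold * buffer_bytes`, which is why a free count in the 100k..200k range looks surprising next to a small threshold.
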
>
> It is not clear whether the fact that so many free list items exist indicates a problem with the allocator itself. It looks like the free list should be drained if there are more than 10 items in this list.
>
> In any case, each buffer's capacity is 1024 * sizeof(void*)+header, so 8KB+x, and with 100..200k of those things it explains the NMT-reported malloc footprint of ~1..2GB. An easy fix is to reduce the size of this buffer; since they are not processed, their size should not matter.
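
The estimate above can be checked with a few lines of arithmetic. This is a rough sketch: the 8-byte pointer size assumes a 64-bit box, and the 16-byte header is a made-up stand-in for the "x" in "8KB+x", not a value taken from the HotSpot sources.

```cpp
#include <cstdint>

// Assumed values for illustration only.
constexpr std::uint64_t kPointerBytes = 8;   // sizeof(void*) on a 64-bit platform
constexpr std::uint64_t kHeaderBytes = 16;   // hypothetical per-buffer header ("x")

// Bytes occupied by one buffer holding `slots` pointer-sized entries.
constexpr std::uint64_t buffer_bytes(std::uint64_t slots) {
  return slots * kPointerBytes + kHeaderBytes;
}

// Bytes retained by `free_buffers` cached buffers of `slots` entries each.
constexpr std::uint64_t footprint_bytes(std::uint64_t free_buffers,
                                        std::uint64_t slots) {
  return free_buffers * buffer_bytes(slots);
}

// 1024 slots -> 8192 + 16 = 8208 bytes per buffer.
static_assert(buffer_bytes(1024) == 8208, "per-buffer size");
// 100k free buffers -> about 0.82 GB; 200k -> about 1.64 GB.
static_assert(footprint_bytes(100000, 1024) == 820800000ULL, "100k buffers");
static_assert(footprint_bytes(200000, 1024) == 1641600000ULL, "200k buffers");
```

Under these assumptions, 100k..200k retained buffers come to roughly 0.8 to 1.6 GB, in line with the ~1..2GB reported by NMT.
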
>
> ---------
>
> The patch decreases the size per buffer node to cache line size. I chose cache line size out of the vague feeling that I don't want to cause false sharing and thereby degrade the test.
>
> Interestingly enough, reducing the buffer size greatly increases the number of free items since it makes allocation cheaper. This also happens in release builds, so this is not us zapping stuff; my unproven assumption is that this is the libc just being slower when allocating 8K vs 64-byte blocks. But with smaller buffers, we spend more time in freelist management and less time in the libc, which is good for a stress test.
>
> This patch also increases the number of "processor" threads vs "mutator" threads. Since allocation seems to have a speed edge over deallocation, this reduces the number of free items somewhat.
>
> This reduces the malloc spike seen from this test from 1.5-2GB to ~160MB (release) and 3xxMB (debug). We could reduce it a lot more if we reduced the buffer size to its minimum (1 slot), but that would risk false sharing and may degrade test performance.
Looks good.
-------------
Marked as reviewed by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16845#pullrequestreview-1752759826