RFR: 8347126: gc/stress/TestStressG1Uncommit.java gets OOM-killed
Axel Boldt-Christmas
aboldtch at openjdk.org
Wed Jan 8 08:21:34 UTC 2025
On Wed, 8 Jan 2025 07:41:22 GMT, Guoxiong Li <gli at openjdk.org> wrote:
>> One of my testing nodes caught the OOM kill for the VM carrying the test. The default configuration turns the test VM, launched by the driver, into a memory hog. On a 48-core / 64G machine, the test configured itself to take 13 workers, each allocating 1G. This ballooned the heap size to 13G -- about 25% of host memory -- which is well beyond the usual footprint for a single test VM (~2G). Naturally, this has a high chance of being OOM-killed under high test parallelism.
>>
>> I think the solution is to cut down the heap size we run with and tune the number of workers a bit more finely. Looking around at sibling tests, 1G seems to be a common heap size for these tests.
>
> test/hotspot/jtreg/gc/stress/TestStressG1Uncommit.java line 84:
>
>> 82:
>> 83: // Figure out suitable number of workers (~1 per 100M).
>> 84: int allocationChunk = (int) Math.ceil((double) allocationSize * 100 / M);
>
> If we want to use one worker per 100M, should the equation be `allocationSize / (100 * M)`? Did I miss anything?
Yes. It looked weird that he got 10 workers. This test should now always result in `min(8, num_procs)` workers, especially since it uses `executeLimitedTestJava` and therefore will not propagate flags. So I am not sure how we could ever run this test with a different heap without editing the test file.
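To make the off-by-a-factor error concrete, here is a minimal standalone sketch (not the actual test code; the 1G `allocationSize` is an assumed example value) comparing the quoted expression with the correction suggested in the review:

```java
// Hypothetical sketch of the worker-count calculation discussed above.
// The intent per the comment is ~1 worker per 100M of allocation.
public class WorkerCountSketch {
    static final long M = 1024 * 1024;

    public static void main(String[] args) {
        long allocationSize = 1024 * M; // assumption: a 1G allocation budget

        // Expression quoted in the review: effectively allocationSize * 100 / M,
        // which multiplies by 100 instead of dividing by 100M.
        int buggy = (int) Math.ceil((double) allocationSize * 100 / M);

        // Suggested correction: allocationSize / (100 * M),
        // i.e. one worker per 100M, rounded up.
        int corrected = (int) Math.ceil((double) allocationSize / (100 * M));

        System.out.println("buggy = " + buggy);         // 102400
        System.out.println("corrected = " + corrected); // 11 (ceil of 10.24)
    }
}
```

With a 1G budget the quoted expression yields a wildly large chunk count, while the corrected form gives the intended order of magnitude; the test would then presumably clamp that further to `min(8, num_procs)`.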
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22954#discussion_r1906708622
More information about the hotspot-gc-dev
mailing list