RFR: 8265332: gtest/LargePageGtests.java OOMEs on -XX:+UseSHM cases

Aleksey Shipilev shade at openjdk.java.net
Wed Apr 21 10:44:14 UTC 2021


On Fri, 16 Apr 2021 10:06:43 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> It looks like some `+UseSHM` test cases added by [JDK-8213269](https://bugs.openjdk.java.net/browse/JDK-8213269) reliably blow up the VM log reader with OOME. There are lots of `OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory.` in the log, if you increase the test heap size. AFAIU, many of those messages are expected from the new test cases.
> 
> I believe ultimately this test produces a virtually unbounded number of warning messages, which would eventually blow out the Java heap in test infra parsers. This is a reliable tier1 failure on my TR 3970X, probably because it has enough cores to run 30 threads concurrently for 15 seconds all spewing warning messages.
> 
> #### Try 1
> 
> The first attempt recognizes that `ConcurrentTestRunner` runs a time-bound number of iterations, which means the faster the machine is, the more warning messages are printed. The way out is to make `ConcurrentTestRunner` cap the number of iterations, so that the VM output length is more predictable.
> 
> Test times before:
> 
> 
> # default
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15003 ms)
> 
> # -XX:+UseLargePages
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (16121 ms)
> 
> # -XX:+UseLargePages -XX:LargePageSizeInBytes=1G
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15006 ms)
> 
> # -XX:+UseLargePages -XX:+UseSHM
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15030 ms)
> 
> 
> Test times after:
> 
> 
> # default
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15003 ms)
> 
> # -XX:+UseLargePages
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (16071 ms)
> 
> # -XX:+UseLargePages -XX:LargePageSizeInBytes=1G
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15006 ms)
> 
> # -XX:+UseLargePages -XX:+UseSHM
> [       OK ] os_linux.reserve_memory_special_concurrent_vm (1190 ms)
> 
> 
> The major difference is that the last mode gets capped by `maxIteration`. This fixes the test failure, as the `-XX:+UseSHM` case would otherwise produce lots of warnings on my machine.
> 
> #### Try 2
> 
> The second attempt runs the tests with `-XX:-PrintWarnings` to avoid warning log overload.
> 
> Additional testing:
>  - [x] `os_linux` gtest
>  - [x] `gtest/LargePageGtests.java` used to fail, now passes

All right! How about we run these tests with `-XX:-PrintWarnings` then? See the new, much simpler commit.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3542

