RFR: 8265332: gtest/LargePageGtests.java OOMEs on -XX:+UseSHM cases [v2]

Thomas Stuefe stuefe at openjdk.java.net
Wed Apr 21 11:49:46 UTC 2021


On Wed, 21 Apr 2021 10:44:12 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> It looks like some `+UseSHM` test cases added by [JDK-8213269](https://bugs.openjdk.java.net/browse/JDK-8213269) reliably blow up the VM log reader with OOME. There are lots of `OpenJDK 64-Bit Server VM warning: Failed to reserve shared memory.` in the log, if you increase the test heap size. AFAIU, many of those messages are expected from the new test cases.
>> 
>> I believe ultimately this test produces a virtually unbounded number of warning messages, which would eventually blow out the Java heap in test infra parsers. This is a reliable tier1 failure on my TR 3970X, probably because it has enough cores to run 30 threads concurrently for 15 seconds all spewing warning messages.
>> 
>> #### Try 1
>> 
>> The first attempt recognizes that `ConcurrentTestRunner` runs a time-bound number of iterations, which means the faster the machine is, the more warning messages get printed. The way out is to make `ConcurrentTestRunner` cap the number of iterations, so that the VM output length is more predictable.
>> 
>> Test times before:
>> 
>> 
>> # default
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15003 ms)
>> 
>> # -XX:+UseLargePages
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (16121 ms)
>> 
>> # -XX:+UseLargePages -XX:LargePageSizeInBytes=1G
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15006 ms)
>> 
>> # -XX:+UseLargePages -XX:+UseSHM
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15030 ms)
>> 
>> 
>> Test times after:
>> 
>> 
>> # default
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15003 ms)
>> 
>> # -XX:+UseLargePages
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (16071 ms)
>> 
>> # -XX:+UseLargePages -XX:LargePageSizeInBytes=1G
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (15006 ms)
>> 
>> # -XX:+UseLargePages -XX:+UseSHM
>> [       OK ] os_linux.reserve_memory_special_concurrent_vm (1190 ms)
>> 
>> 
>> The major difference is that the last mode gets capped by `maxIteration`. This fixes the test failure, as the `-XX:+UseSHM` case would otherwise produce lots of warnings on my machine.
>> 
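
For illustration only, the iteration cap described under Try 1 could look roughly like the sketch below. This is not the HotSpot `ConcurrentTestRunner` code; `run_bounded`, `budget`, and `max_iterations` are hypothetical names chosen for the example. The loop stops at whichever limit is hit first, the time budget or the iteration cap, so a fast machine cannot produce an unbounded number of iterations (and warnings).

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper, NOT the HotSpot ConcurrentTestRunner: calls `body`
// repeatedly until either the time budget expires or `max_iterations`
// calls have been made, whichever comes first.
// Returns the number of times `body` was called.
inline std::size_t run_bounded(const std::function<void()>& body,
                               std::chrono::milliseconds budget,
                               std::size_t max_iterations) {
  const auto deadline = std::chrono::steady_clock::now() + budget;
  std::size_t iterations = 0;
  while (iterations < max_iterations &&
         std::chrono::steady_clock::now() < deadline) {
    body();
    ++iterations;
  }
  return iterations;
}
```

With a cheap `body` and a generous budget, the cap is what ends the loop, which matches the shortened `-XX:+UseSHM` run time reported above.
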
>> #### Try 2
>> 
>> The second attempt runs the tests with `-XX:-PrintWarnings` to avoid warning log overload.
>> 
>> Additional testing:
>>  - [x] `os_linux` gtest
>>  - [x] `gtest/LargePageGtests.java` used to fail, now passes
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - Just run with -XX:-PrintWarnings
>  - Merge branch 'master' into JDK-8265332-largepages-oome
>  - 8265332: gtest/LargePageGtests.java OOMEs on -XX:+UseSHM cases

Good and trivial. Thanks!

-------------

Marked as reviewed by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3542

