RFR: JDK-8293313: NMT: Rework MallocLimit [v2]

Thomas Stuefe stuefe at openjdk.org
Tue Dec 6 07:07:33 UTC 2022


> Rework NMT `MallocLimit` to make it more useful for native OOM analysis. Also remove `MallocMaxTestWords`, which had been mostly redundant with NMT (for background, see JDK-8293292).
> 
> 
> ### Background
> 
> Some months ago we had [JDK-8291919](https://bugs.openjdk.org/browse/JDK-8291919), a compiler regression that caused compiler arenas to grow boundlessly. The process was usually OOM-killed before we could take a look, which made the problem difficult to analyze.
> 
> To improve analysis options, we added NMT *malloc limits* with [JDK-8291878](https://bugs.openjdk.org/browse/JDK-8291878). Malloc limits let us set one or more limits on malloc load. Surpassing a limit causes the VM to stop with a fatal error in the allocation function, giving us an hs-err file and a core right at the point that (probably) causes the leak. This makes error analysis a lot simpler, and is also valuable for regression-testing footprint usage.
> 
> Some people asked for a way to mimic a native OOM (causing os::malloc to return NULL) instead of ending with a fatal error. This would allow us to test resilience in the face of native OOMs, among other things.
> 
> ### Patch
> 
> - expands the `MallocLimit` option syntax, allowing for an "oom mode" that mimics a native OOM:
> 
>     Global form:
>       -XX:MallocLimit=<size>[:<mode>]
>     Category-specific form:
>       -XX:MallocLimit=<category>:<size>[:<mode>][,<category>:<size>[:<mode>] ...]
>     Examples:
>       -XX:MallocLimit=3g
>       -XX:MallocLimit=3g:oom
>       -XX:MallocLimit=compiler:200m:oom
> 
> - moves parsing of `-XX:MallocLimit` out of arguments.cpp into `mallocLimit.hpp/cpp`, and rewrites it.
> 
> - changes when limits are checked. Previously, a limit was checked *after* the allocation that crossed the threshold. Now it is checked before the allocation is made.
> 
> - removes `MallocMaxTestWords`, replacing its uses with `MallocLimit`
> 
> - adds new gtests and new jtreg tests
> 
> - removes a bunch of jtreg tests which are now better served by the new gtests.
> 
> - gives us more precise error messages upon reaching a limit. For example:
> 
> before:
> 
>     #  fatal error: MallocLimit: category "Compiler" reached limit (size: 20997680, limit: 20971520)
> 
> now:
> 
>     #  fatal error: MallocLimit: reached category "Compiler" limit (triggering allocation size: 1920B, allocated so far: 20505K, limit: 20480K)
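
The option grammar and the check-before-allocate behaviour described above can be sketched roughly as follows. This is a standalone illustration, not the code in `mallocLimit.hpp/cpp`: the names `Limit`, `Mode`, `parse_size`, `parse_malloc_limit` and `would_exceed` are hypothetical, the mode keyword `fatal` for the default behaviour is an assumption (the description above only names `oom`), and the strict "greater than" comparison is likewise assumed.

```cpp
#include <cctype>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for this sketch; not HotSpot's own.
enum class Mode { Fatal, Oom };

struct Limit {
  std::string category;  // empty string means the global limit
  uint64_t size;         // limit in bytes
  Mode mode;             // what to do when the limit is hit
};

// Parses a size such as "3g", "200m", "512k" or plain bytes.
// No overflow checking in this sketch.
static std::optional<uint64_t> parse_size(const std::string& s) {
  if (s.empty() || !std::isdigit(static_cast<unsigned char>(s[0]))) return std::nullopt;
  size_t pos = 0;
  uint64_t value = 0;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    value = value * 10 + static_cast<uint64_t>(s[pos] - '0');
    ++pos;
  }
  uint64_t mult = 1;
  if (pos < s.size()) {
    switch (std::tolower(static_cast<unsigned char>(s[pos]))) {
      case 'k': mult = 1024ull; break;
      case 'm': mult = 1024ull * 1024; break;
      case 'g': mult = 1024ull * 1024 * 1024; break;
      default: return std::nullopt;
    }
    ++pos;
  }
  if (pos != s.size()) return std::nullopt;  // trailing garbage
  return value * mult;
}

static std::vector<std::string> split(const std::string& s, char sep) {
  std::vector<std::string> out;
  std::string part;
  std::istringstream in(s);
  while (std::getline(in, part, sep)) out.push_back(part);
  return out;
}

// Parses the value of -XX:MallocLimit=... in either the global or the
// category-specific form. Returns nullopt on a syntax error.
static std::optional<std::vector<Limit>> parse_malloc_limit(const std::string& value) {
  std::vector<Limit> limits;
  for (const std::string& entry : split(value, ',')) {
    const std::vector<std::string> f = split(entry, ':');
    if (f.empty()) return std::nullopt;
    Limit l{"", 0, Mode::Fatal};
    size_t i = 0;
    // A leading field that is not a size is taken to be a category name.
    if (!parse_size(f[0])) {
      if (f[0].empty()) return std::nullopt;
      l.category = f[0];
      i = 1;
    }
    if (i >= f.size()) return std::nullopt;
    const std::optional<uint64_t> sz = parse_size(f[i]);
    if (!sz) return std::nullopt;
    l.size = *sz;
    ++i;
    if (i < f.size()) {  // optional mode
      if (f[i] == "oom") l.mode = Mode::Oom;
      else if (f[i] == "fatal") l.mode = Mode::Fatal;  // assumed keyword
      else return std::nullopt;
      ++i;
    }
    if (i != f.size()) return std::nullopt;
    limits.push_back(l);
  }
  if (limits.empty()) return std::nullopt;
  // The global form stands alone; it is not mixed with category entries.
  for (const Limit& l : limits) {
    if (l.category.empty() && limits.size() > 1) return std::nullopt;
  }
  return limits;
}

// Check made *before* the allocation: would granting `request` bytes push
// the running total past `limit`? Written to avoid unsigned overflow.
static bool would_exceed(uint64_t allocated, uint64_t request, uint64_t limit) {
  return request > limit || allocated > limit - request;
}
```

In this sketch, a caller would evaluate `would_exceed` first and then either abort (fatal mode) or return a null pointer (oom mode) without performing the allocation. Comparing `allocated` against `limit - request` rather than adding `allocated + request` keeps the check correct even for very large requests.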

Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:

 - Merge branch 'master' into JDK-8293313-NMT-fake-oom
 - MallocLimit

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/11371/files
  - new: https://git.openjdk.org/jdk/pull/11371/files/a3622a36..7184b1b0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=11371&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11371&range=00-01

  Stats: 81643 lines in 1434 files changed: 37632 ins; 35425 del; 8586 mod
  Patch: https://git.openjdk.org/jdk/pull/11371.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11371/head:pull/11371

PR: https://git.openjdk.org/jdk/pull/11371