RFR: JDK-8277822: Remove debug-only heap overrun checks in os::malloc and friends

Thomas Stuefe stuefe at openjdk.java.net
Tue Nov 30 09:55:43 UTC 2021


After integrating C heap overrun checks into NMT (https://bugs.openjdk.java.net/browse/JDK-8275320), we can now remove debug-only guarding memory for os::malloc() and friends.

This will:
- reduce complexity of `os::malloc`, `os::free`, `os::realloc` (especially that one since realloc is tricky)
- saves *48 bytes per malloc* in debug builds. In my experiences this can sum up to a quite substantial overhead compared to release builds, especially since malloc allocations are often fine grained.
- reduce behavioral differences between debug and release builds wrt memory layout and allocation paths.

Currently, we have always-on heap overrun checks in debug VMs. In order to preserve this feature after the patch, NMT is switched always on per default in debug builds (but can be switched off if needed, as opposed to the baked-in debug-only guards today).

We use NMT level summary here, which is very cheap since we don't track stacks, we just count some numbers in NMT categories. Memory overhead will still be a lot less than what we paid before with the quite generous debug-only guards.

Finally, this would also be a good coverage test for NMT.

For more information, please refer to the umbrella JBS (https://bugs.openjdk.java.net/browse/JDK-8275301).

-----------

Memory stomp reports are better now since NMT buffer overrun reporting (see JDK-8275320) is better. 

Before:


[0.051s][warning][malloc,free] ## nof_mallocs = 11948, nof_frees = 2017
[0.051s][warning][malloc,free] ## memory stomp:
[0.051s][warning][malloc,free] GuardedMemory(0x00007fb3115ea560) base_addr=0x00007fb30c1df630 tag=0x0000000000000000 user_size=4096 user_data=0x00007fb30c1df650
[0.051s][warning][malloc,free]   Header guard @0x00007fb30c1df630 is OK
[0.051s][warning][malloc,free]   Trailer guard @0x00007fb30c1e0650 is BROKEN
[0.051s][warning][malloc,free]   User data appears to be in use
# To suppress the following error report, specify this argument


Old reporting was susceptible to C-heap corruption: a sufficiently corrupted C-heap caused the libc to abort before reporting was completed:


thomas at starfish:/shared/projects/openjdk/jdk-jdk/output-fastdebug$ ./images/jdk/bin/java -XX:+UseNewCode
[0.047s][warning][malloc,free] ## nof_mallocs = 11948, nof_frees = 2017
[0.047s][warning][malloc,free] ## memory stomp:
java: malloc.c:2379: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Aborted (core dumped)


The new reports, done by NMT, print out a hex dump of the corrupted area, before calling into libc. So this printout is something we always get, even if C-heap is corrupted beyond help:


thomas at starfish:/shared/projects/openjdk/jdk-jdk/output-fastdebug$ ./images/jdk/bin/java -XX:+UseNewCode
NMT Block at 0x00007f33c01a8e70, corruption at: 0x00007f33c01a9e80: 
0x00007f33c01a8df0:   00 00 00 00 00 00 00 00 75 00 00 00 00 00 00 00
0x00007f33c01a8e00:   50 00 00 00 00 00 00 00 00 00 00 00 01 00 9e e9
0x00007f33c01a8e10:   a8 b2 46 c6 33 7f 00 00 d0 7a 11 c0 33 7f 00 00
0x00007f33c01a8e20:   f0 4e 11 c0 33 7f 00 00 50 50 52 c6 33 7f 00 00
0x00007f33c01a8e30:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8e40:   00 00 00 00 00 00 00 00 a0 ae 52 c6 33 7f 00 00
0x00007f33c01a8e50:   ff b7 f7 c5 33 7f 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8e60:   e8 8e 00 00 00 00 00 00 25 10 00 00 00 00 00 00
0x00007f33c01a8e70:   00 10 00 00 00 00 00 00 00 00 00 00 0f 00 9e e9
0x00007f33c01a8e80:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8e90:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8ea0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8eb0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8ec0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8ed0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a8ee0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
...
0x00007f33c01a9e00:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e10:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e20:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e30:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e40:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e50:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e60:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e70:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9e80:   00 00 00 00 00 00 00 00 81 01 00 00 00 00 00 00
0x00007f33c01a9e90:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9ea0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9eb0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9ec0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9ed0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9ee0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00007f33c01a9ef0:   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
# To suppress the following error report, specify this argument


At first glance it seems that "nof_mallocs" etc are missing from this output. This is true, I removed them. They are not needed since the hs-err file contains an NMT summary report, which has more and finer grained information than that.

-----------


Tests: 
- GHAs (the windows failures are related to 8241403: "JavaThread::get_thread_name() should be ThreadSMR-aware")
- we tested this patch for several days in SAP nightlies
- manual tests x64 and x86 Linux

-------------

Commit messages:
 - JDK-8277822-Remove-debug-only-heap-overrun-checks-in-os-malloc-and-friends

Changes: https://git.openjdk.java.net/jdk/pull/6554/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6554&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8277822
  Stats: 188 lines in 6 files changed: 36 ins; 103 del; 49 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6554.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6554/head:pull/6554

PR: https://git.openjdk.java.net/jdk/pull/6554


More information about the hotspot-runtime-dev mailing list