RFR: JDK-8275320: NMT should perform buffer overrun checks [v6]
Volker Simonis
simonis at openjdk.java.net
Fri Nov 19 17:35:38 UTC 2021
On Fri, 19 Nov 2021 14:29:17 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> This is part of a number of RFE I plan to improve and simplify C-heap overflow checking in hotspot.
>>
>> This proposal adds NMT buffer overflow checking:
>>
>> - it gives us C-heap overflow checking in release builds
>> - the costs are neglectable: if NMT is off, we won't pay anything; if NMT is on, the added work is minuscule since we have to do malloc header management anyway.
>> - NMT needs intact headers anyway. Faced with buffer overwrites today, it would maybe crash or maybe account wrongly, but it's a bit of a lottery really. Better to go the extra step and do a real check.
>> - it could be a preparation for future code removal, if we wanted to do that (see details in umbrella RFE https://bugs.openjdk.java.net/browse/JDK-8275301). That way, net complexity would come down even with this patch.
>>
>> For more details, please see the JBS issue.
>>
>> ----
>>
>> Patch notes:
>>
>> - The malloc header is changed such that it contains a 16-bit canary directly preceding the user payload of the allocation. The new malloc header does not use bitfields anymore but normal types. For more details, see the comment in mallocTracker.hpp.
>> - On 64-bit, we don't enlarge the malloc header. It remains 16 bytes in length. So no additional memory cost (apart from the 1-byte-footer, see below). Space for the canary is instead obtained by reducing the size of the bucket index bit field to 16 bits. That bit field is used to store the bucket slot index of the malloc site table in NMT detail mode. With 40 bits it was over-dimensioned, and even 16-bits arguably still are: malloc site table width is 512.
>> - On 32-bit, I had to enlarge the header from 8 bytes to 16 bytes to make room for a canary. But strictly speaking 8 bytes were not enough anyway: the header size has to be large enough to satisfy malloc(3) alignment, and that would be 16 bytes. I believe it never led to an error since we don't store 128bit data in malloc'd memory in the hotspot anywhere.
>>
>> - I added a footer canary trailing the user allocation to catch tail buffer overruns. To keep matters simple (alignment) I made it a single byte only. That is enough to catch most overrun scenarios.
>>
>> - I brushed up error reporting. When NMT detects corruption, it will now print out a hex dump of the corrupted area to tty before asserting.
>>
>> - I added a bunch of gtests to test various heap overwrite scenarios. I also had to extend the gtest macros a bit because I wanted these tests of course to run in release builds too, but we did not have a death test macro for release builds yet (there are possibilities for code simplification here too, but that's for another RFE).
>>
>> - I renamed `nmt_header_size` to `nmt_overhead` since that size includes header and footer now.
>>
>> - I made the assert for malloc site table width a compile time STATIC_ASSERT.
>>
>> --------------
>>
>> Example output a buffer overrun would provide:
>>
>>
>> Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)
>> NMT Block at 0x00005600f86136b0, corruption at: 0x00005600f86136c1:
>> 0x00005600f86136a8: 21 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
>> 0x00005600f86136b8: 00 00 00 00 0f 00 1f fa 00 61 00 00 00 00 00 00
>> 0x00005600f86136c8: 41 39 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f86136d8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f86136e8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f86136f8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f8613708: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f8613718: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f8613728: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 0x00005600f8613738: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> assert failed: fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)#
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (mallocTracker.cpp:203), pid=10805, tid=10805
>> # fatal error: Block at 0x00005600f86136b0: footer canary broken at 0x00005600f86136c1 (buffer overflow?)
>> #
>>
>> -------
>>
>> Tests:
>> - manual tests with Linux x64, x86, minimal build
>> - GHAs all clean
>> - SAP nightlies ran for 4 weeks now without problems
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:
>
> - Volker Feedback 2
> - Fix Zhengyu Problem in os::realloc
> - Extend gtests
> - extend footer to 2 bytes
> - Feedback Volker
> - Let NMT do overflow detection
Looks good now.
src/hotspot/share/services/mallocTracker.hpp line 435:
> 433: }
> 434:
> 435: static inline void record_new_arena(MEMFLAGS flags) {
Yes, I also wondered why we need these versions so it's good that you could remove them!
-------------
Marked as reviewed by simonis (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/5952
More information about the hotspot-dev
mailing list