RFR: JDK-8304421: Introduce malloc size feedback [v7]

Justin King jcking at openjdk.org
Mon Mar 20 15:16:51 UTC 2023


On Fri, 17 Mar 2023 21:39:37 GMT, Justin King <jcking at openjdk.org> wrote:

>> Memory allocated by `malloc()` is frequently larger than the size actually requested. Some `malloc()` implementations allow querying this information. This can be useful as an optimization for some use cases, such as an arena allocator, as it allows using the entire memory block.
>> 
>> This change updates `os::malloc()` by appending an additional argument that can request the actual usable size. On platforms that support it, the actual usable size of the allocation returned by `malloc()` will be filled in. Callers should then use `os::free_sized`, as with NMT enabled it can assert that the size is correct. This is also a precursor to eventually supporting `free_sized` from C23, which is an optimization usable by some `malloc()` implementations to make `free()` quicker.
>> 
>> This change also upgrades Chunk to use this facility.
>> 
>> **Observations**
>> 
>> NMT could use this same facility to keep track of the actual allocated size, instead of the requested size it has today, making it more accurate. Doing so is out of scope for this change.
>
> Justin King has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Undo more refactoring
>   
>   Signed-off-by: Justin King <jcking at google.com>

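#include <malloc.h>   // malloc_usable_size()
#include <cstddef>    // size_t
#include <cstdlib>    // malloc(), free()
#include <iostream>

constexpr size_t kMaxSize = 1024 * 1024;

int main()
{
    // Request each power-of-2 size and report how much memory malloc() really handed back.
    for (size_t s = 2; s < kMaxSize; s *= 2) {
        void* p = malloc(s);
        if (p == nullptr) {
            return 1;  // allocation failed; nothing to measure
        }
        size_t u = malloc_usable_size(p);
        std::cout << "requested_size: " << s << " actual_size: " << u << " wasted_size: " << u - s << std::endl;
        free(p);
    }
}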


On FreeBSD and macOS there is 0 wasted space for power-of-2 sizes once you are past something like 32 bytes. This is not the case on Linux with glibc, where it's possible to get lots of wasted space (4 KB) for requests over 65 KB, and this is configurable. I am not sure about Windows. This is why, instead of trying to guess the best fit as the chunk sizes do, it's better to just ask the implementation. The implementation's behavior is free to change at any point and from version to version. glibc will in fact coalesce free blocks together, so sometimes there will be less wasted space and sometimes more. Try removing the `free` call and you will get slightly different results.

People are also free to replace their system malloc implementation with a custom one, and some do. `tcmalloc` operates this way. So even guessing at appropriate sizes for most platforms won't work.
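
To make "just ask the implementation" concrete, here is a minimal sketch of an arena-style chunk that sizes itself from whatever the allocator reports. It is not the code in the PR: `malloc_with_usable_size`, `Chunk`, `chunk_init`, `chunk_alloc` and `chunk_destroy` are names made up for illustration, and the platform checks assume `malloc_usable_size` on Linux and FreeBSD and `malloc_size` on macOS. Everywhere else it falls back to the requested size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

#if defined(__APPLE__)
#include <malloc/malloc.h>  // malloc_size()
#elif defined(__FreeBSD__)
#include <malloc_np.h>      // malloc_usable_size()
#elif defined(__linux__)
#include <malloc.h>         // malloc_usable_size()
#endif

// Hypothetical helper: allocate at least `requested` bytes and report how
// many bytes of the returned block are usable according to the allocator.
inline void* malloc_with_usable_size(std::size_t requested, std::size_t* usable) {
    assert(usable != nullptr);
    void* p = std::malloc(requested);
    if (p == nullptr) {
        *usable = 0;
        return nullptr;
    }
#if defined(__APPLE__)
    *usable = malloc_size(p);
#elif defined(__FreeBSD__) || defined(__linux__)
    *usable = malloc_usable_size(p);
#else
    *usable = requested;  // no query available: assume exactly what was asked for
#endif
    return p;
}

// Hypothetical bump-pointer chunk whose capacity is the usable size,
// not the requested size.
struct Chunk {
    char* base;
    std::size_t capacity;
    std::size_t used;
};

inline bool chunk_init(Chunk* c, std::size_t requested) {
    c->base = static_cast<char*>(malloc_with_usable_size(requested, &c->capacity));
    c->used = 0;
    return c->base != nullptr;
}

inline void* chunk_alloc(Chunk* c, std::size_t n) {
    constexpr std::size_t kAlign = alignof(std::max_align_t);
    if (n > c->capacity) {
        return nullptr;  // also guards the rounding below against overflow
    }
    std::size_t rounded = (n + kAlign - 1) & ~(kAlign - 1);
    if (rounded > c->capacity - c->used) {
        return nullptr;  // chunk exhausted
    }
    void* p = c->base + c->used;
    c->used += rounded;
    return p;
}

inline void chunk_destroy(Chunk* c) {
    std::free(c->base);
    c->base = nullptr;
    c->capacity = 0;
    c->used = 0;
}
```

With this shape, a chunk requested at some size simply ends up with whatever capacity the allocator in use reports, so there is no table of "good" sizes to maintain per platform. One caveat: the glibc man page describes `malloc_usable_size` as intended mainly for diagnostics, so treat writing into the extra bytes as something to check against the allocators you actually ship with.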

-------------

PR: https://git.openjdk.org/jdk/pull/13081


More information about the hotspot-runtime-dev mailing list