RFR: JDK-8304421: Introduce malloc size feedback [v7]
Thomas Stuefe
stuefe at openjdk.org
Mon Mar 20 16:42:42 UTC 2023
On Fri, 17 Mar 2023 21:39:37 GMT, Justin King <jcking at openjdk.org> wrote:
>> Memory allocated by `malloc()` is frequently larger than the size actually requested. Some `malloc()` implementations allow querying this information. This can be useful as an optimization for some use cases, such as an arena allocator, as it allows using the entire memory block.
>>
>> This change updates `os::malloc()` by appending an additional argument that can request the actual usable size. On platforms that support it, the actual usable size of the allocation returned by `malloc()` will be filled in. Callers should then use `os::free_sized`, as with NMT enabled it can assert that the size is correct. This is also a precursor to eventually supporting `free_sized` from C23, which is an optimization usable by some `malloc()` implementations to make `free()` quicker.
>>
>> This change also upgrades Chunk to use this facility.
>>
>> **Observations**
>>
>> NMT could use this same facility to keep track of the actual allocated size, instead of the requested size it has today, making it more accurate. Doing so is out of scope for this change.
>
> Justin King has updated the pull request incrementally with one additional commit since the last revision:
>
> Undo more refactoring
>
> Signed-off-by: Justin King <jcking at google.com>
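For readers following along, the size feedback described in the quoted proposal can be sketched roughly as below. This is a minimal illustration, not the code in the PR: `malloc_with_feedback` is a hypothetical name, and the sketch assumes glibc's `malloc_usable_size()`; on other platforms it simply reports the requested size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_usable_size() on glibc
#endif

// Hypothetical helper (not the HotSpot os::malloc signature): allocate
// `requested` bytes and report how many bytes the allocator says are usable.
void* malloc_with_feedback(std::size_t requested, std::size_t* usable) {
  assert(usable != nullptr);
  void* p = std::malloc(requested);
  if (p == nullptr) {
    *usable = 0;
    return nullptr;
  }
#if defined(__GLIBC__)
  // The allocator may have rounded the block up; ask for the real size.
  *usable = malloc_usable_size(p);
#else
  // Assumption: no query facility available, so fall back to the request.
  *usable = requested;
#endif
  return p;
}
```

A caller such as an arena would treat `*usable` as the chunk capacity, which is exactly where the risk listed below comes from: everything then depends on the libc reporting a size that is actually safe to write to.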
Your proposal has costs:
- the inherent risk of writing over malloc-allocated boundaries ("Any implementation returning an invalid size from malloc_usable_size is a broken implementation and has no business existing or being supported" sounds good and all but customers with crashing systems won't care if it's us screwing up or the libc maintainers).
- the loss of precision of NMT-provided overwrite checks
- increased complexity and reviewer churn. There are a ton of things to do and little reviewer time to go around.
Do the benefits outweigh these costs? Maybe, but we don't know that since there are no numbers. What we do know is that most small-grained allocations - the ones that could benefit most from your change - are C++ objects with fixed sizes. That leaves the small raw buffers in things like stringStream, and arena chunks.
But for arena chunks in particular, it makes sense to discuss alternatives. E.g. we could just re-use the metaspace allocator for hs arenas. Metaspace is a much better arena allocator than hs arenas, which makes sense since Metaspace is the spiritual successor.
A variant of that would be to just roll our own allocator for arenas. Allocate chunks via os::reserve_memory/commit_memory, and take care of committing/uncommitting ourselves. That would be trivial to implement.
Both examples would solve not only the overhead-per-chunk problem but also the retention problem in glibc and other libc implementations (e.g. AIX, Solaris), which I see as the larger problem.
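To make the reserve/commit variant concrete, here is a rough POSIX-only sketch. It uses `mmap`/`mprotect` directly rather than the HotSpot `os::` layer, and the `Region` type and the `region_*` functions are hypothetical names invented for illustration. Error handling and thread safety are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical bookkeeping for one reserved address range.
struct Region {
  char* base;
  std::size_t reserved;   // bytes of address space reserved
  std::size_t committed;  // bytes currently readable/writable, from base
};

std::size_t page_size() {
  return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

std::size_t align_up(std::size_t n, std::size_t a) {
  return (n + a - 1) / a * a;
}

// Reserve address space only; nothing is accessible yet.
bool region_reserve(Region* r, std::size_t bytes) {
  std::size_t len = align_up(bytes, page_size());
  void* p = mmap(nullptr, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  r->base = static_cast<char*>(p);
  r->reserved = len;
  r->committed = 0;
  return true;
}

// Commit more memory at the end of the committed part.
bool region_expand(Region* r, std::size_t bytes) {
  std::size_t len = align_up(bytes, page_size());
  if (len > r->reserved - r->committed) return false;
  if (mprotect(r->base + r->committed, len, PROT_READ | PROT_WRITE) != 0) {
    return false;
  }
  r->committed += len;
  return true;
}

// Uncommit everything by mapping fresh inaccessible pages over the range,
// which lets the kernel drop the old pages. The range stays reserved.
bool region_uncommit_all(Region* r) {
  if (r->committed == 0) return true;
  void* p = mmap(r->base, r->committed, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  if (p == MAP_FAILED) return false;
  r->committed = 0;
  return true;
}

// Give the address range back to the OS.
void region_release(Region* r) {
  munmap(r->base, r->reserved);
  r->base = nullptr;
  r->reserved = 0;
  r->committed = 0;
}
```

With something like this, chunks are carved out of the committed part at page granularity, so there is no per-chunk malloc header, and memory handed back through `region_uncommit_all` does not depend on the libc's retention policy.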
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13081#issuecomment-1476576364
More information about the hotspot-runtime-dev mailing list