RFR: JDK-8304421: Introduce malloc size feedback [v3]

Thomas Stuefe stuefe at openjdk.org
Fri Mar 17 19:03:52 UTC 2023


On Fri, 17 Mar 2023 17:36:46 GMT, Justin King <jcking at openjdk.org> wrote:

>> Memory allocated by `malloc()` is frequently larger than the size actually requested. Some `malloc()` implementations allow querying this information. This can be useful as an optimization for some use cases, such as an arena allocator, as it allows using the entire memory block.
>> 
>> This change updates `os::malloc()` by appending an additional argument that can request the actual usable size. On platforms that support it, the actual usable size of the allocation returned by `malloc()` will be filled in. Callers should then use `os::free_sized`, as with NMT enabled it can assert that the size is correct. This also is a precursor to eventually supporting `free_sized` from C23, which is an optimization usabled by some `malloc()` implementations to make `free()` quicker.
>> 
>> This change also upgrades and refactors Chunk to use this facility.
>> 
>> **Observations**
>> 
>> NMT could use this same facility to keep track of the actual allocated size, instead of the requested size it has today, making it more accurate. Doing so is out of scope for this change.
>
> Justin King has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Handle size overflow
>   
>   Signed-off-by: Justin King <jcking at google.com>

Hi,

some initial thoughts.

I find using excess memory in this manner quite scary. It depends on the implementation correctly not using what it reports. Eg glibc malloc_usable_size comes with a warning label of "this is bad practice". Practically speaking we now treat new paths, which are UB, and may trigger paths in libcs that are not well tested. The gains are probably not large enough to justify this risk.

We could probably just resize arena chunks correctly to page sizes or power of two sizes; much simpler that way.

There could be some merit in collecting this number and reporting it. But probably not for the end user. Since the real cost of malloc comprises many parts, and overhead of outstanding allocations is just a part of that. Retention size and size of side structures are often more significant. So we would report just an arbitrary part of a larger, unknown whole. One more number to confuse folks, but ultimately not that helpful. Note that we already report libc overhead as a whole.

There could be a use case for hotspot devs to highlight particularly "unhappy" allocation sizes. I fear that the number of actionable call sites would be small, though, since the vast majority of allocations come from new and depend on object sizes. That leaves us with buffer allocations, which are usually already power two-sized.

Lunping refactoring together with functional changes does not help. Makes it harder to spot the functional changes. Refactoring can be done in separate RFEs if necessary. Since refactoring is often a matter of opinion, it's also fairer to discuss them separately from functional changes.

More thoughts:

- It is unspecified what the effect should be on NMT allocation guards. Would the guard be added to the user-requested size or to the actual size? The former would render this improvement pointless; the latter makes me personally unhappy since we lose our ability to catch off by one overwrites. This needs to be specified, and we need regression tests for this.
- Speaking of it, there are no tests.
- The duplication of code in NMTPreInit is just plain bad, as is the duplication of os::free in os.cpp. That said, I don't get the point of free_sized. Why is this needed?
- Platform-dependent changes should go into their respective platform folders
- Does muslc support malloc_usable_size? If not, pls make your code conditional for glibc (see e.g. the trim coding in the linux branch)

Cheers, Thomas

-------------

PR: https://git.openjdk.org/jdk/pull/13081


More information about the hotspot-runtime-dev mailing list