RFR: 8301404: Factor out os::malloc with os::realloc common code, so that we only have 1 code path [v2]
Johan Sjölen
jsjolen at openjdk.org
Mon Mar 31 11:39:21 UTC 2025
On Fri, 28 Mar 2025 14:33:55 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:
>> This is the 2nd time I am proposing this (controversial?) change, but this time I do have performance numbers, which indicate no change in speed (using NMTBenchmark from #23786):
>>
>> proposed:
>>
>> time:72,642,827[ns]
>> [samples:807,804] [NMT headers:382,064]
>> [malloc#:588,703] [realloc#:12,462] [free#:206,639]
>> memory requested:57,274,288 bytes, allocated:69,004,800 bytes
>> malloc overhead:4,853,360 bytes [8.47%], NMT headers overhead:6,877,152 bytes [12.01%]
>> existing code:
>>
>> time:73,085,446[ns]
>> [samples:807,804] [NMT headers:382,064]
>> [malloc#:588,703] [realloc#:12,462] [free#:206,639]
>> memory requested:57,274,288 bytes, allocated:69,004,800 bytes
>> malloc overhead:4,853,360 bytes [8.47%], NMT headers overhead:6,877,152 bytes [12.01%]
>> Note: the NMTBenchmark reports realloc(nullptr) as mallocs(), which is why both versions show the same count for mallocs/reallocs.
>>
>> The performance is virtually the same where I sampled each test 30 times and took the best (the shortest).
>>
>> This proposed change factors out the common code and simplifies both os::malloc and os::realloc. We were able to reduce malloc from 44 lines down to 8 (saving of 36 lines) and realloc from 84 to 55 (29 lines).
>>
>> To me the most important part here is that we reduce the number of times that NMT has to interact with the native allocation code.
>
> Gerard Ziemski has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision:
>
> - Merge remote-tracking branch 'upstream/master' into JDK-8301404
> - work
> - work
> - work
> - work
> - work:
> - work
> - work
> - work
> - work
> - ... and 5 more: https://git.openjdk.org/jdk/compare/d9f39f91...01470b34
Hi Gerard!
Does the split into `pre_` and `post_` alloc really help with legibility when reading the code? To me, it doesn't.
What I mean by that is: If I read `os::malloc` for the first time, then `os::pre_alloc` and `os::post_alloc` just looks like magic to me. They don't actually say what they do. If I want to understand `os::malloc` I have to go into both `pre_` and `post_` alloc and read the code anyway. Then, I also have the issue that I need to remember exactly what the arguments were for those two functions, because the behavior changes depending on what's passed in.
We essentially have *two* functions for both pre and post, actually. Here're our pre calls:
```c++
680: size_t outer_size = os::pre_alloc(&rc, nullptr, size, true, mem_tag);
712: size_t outer_size = os::pre_alloc(&rc, memblock, size, false, mem_tag);
And here are our post calls:
```c++
693: return post_alloc(outer_ptr, size, 0, mem_tag, stack);
754: rc = post_alloc(outer_ptr, size, old_size, mem_tag, stack);
If we were to partially apply these, we have:
```c++
680: size_t outer_size = os::pre_alloc_nullptr_true(&rc, mem_tag);
712: size_t outer_size = os::pre_alloc_memblock_false(&rc, mem_tag);
And we have:
```c++
693: return post_alloc_0(outer_ptr, size, mem_tag, stack);
754: rc = post_alloc_old_size(outer_ptr, size, mem_tag, stack);
This, in combination with the non-obvious names, makes this harder to understand for me. It also doesn't actually save any lines of code.
Cheers,
Johan
-------------
PR Review: https://git.openjdk.org/jdk/pull/24189#pullrequestreview-2729012535
More information about the hotspot-runtime-dev
mailing list