RFR: JDK-8302820: Remove costs for NMTPreInit when NMT is off [v2]

Justin King jcking at openjdk.org
Wed Mar 1 20:10:08 UTC 2023


On Sat, 25 Feb 2023 07:22:25 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> NMTPreInit has been brought into question lately (see [JDK-8299196](https://bugs.openjdk.org/browse/JDK-8299196) and [JDK-8301811](https://bugs.openjdk.org/browse/JDK-8301811)). The points of contention were costs paid when NMT is off, and complexity.
>> 
>> I believe NMTPreInit is vital (for reasons why see discussion under [8299196](https://bugs.openjdk.org/browse/JDK-8299196) ), and removing it would be a severe mistake. So let's address the cost problem.
>> 
>> NMTPreInit, in its current form, incurs post-init costs for the table lookup needed to identify pre-init allocations. Granted, this cost is already pretty low since the load factor of that table is small. But we can avoid that lookup completely by allocating pre-init blocks without malloc headers.
>> 
>> That has two advantages:
>> - costs for NMTPreInit for NMT=off are practically nil now: all that remains is querying NMT tracking level to see if we are pre- or post-init, and we need to do that anyway to see if NMT is switched on. That cost is not going away unless we get rid of NMT itself.
>> - We can delete the lookup table if NMT is off, since we no longer need it, and regain 63352 bytes of memory.
>> 
>> -----
>> 
>> I have done my best to come up with a good compromise between complexity, startup speed, and memory consumption. With a bit more complexity, penalties to startup speed could be reduced even further (e.g. by shepherding preallocation headers into their arena).
>> 
>> But I'm between a rock and a hard place here: more complexity increases the chance of "it's too complex, let's just remove it", which is a tiny bit stressful tbh. And the one point I feel strongly about is that getting rid of NMTPreInit would be a grave mistake. I also don't think this code needs more optimization.
>> 
>> (Please note that I am going on vacation and won't be able to react promptly to reviews.)
>> 
>> ---
>> 
>> Tests: 
>> - Manually tested linux x64 and x86 (gtests with all NMT permutations; runtime/NMT)
>> - GHAs ongoing
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   feedback johan

Any of the three works for me, as long as we are not leaking. If we are intentionally leaking, I will have to follow up with code that ignores those leaks in LSan builds; otherwise most tests will fail.
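
For illustration only, here is a minimal sketch of what ignoring an intentional leak can look like with the public LeakSanitizer interface. It is not the HotSpot code. The helper name `allocate_intentionally_leaked` and the `SKETCH_HAS_LSAN` macro are hypothetical, and the sketch assumes LSan is active whenever AddressSanitizer is enabled at compile time.

```cpp
#include <cstdlib>

// Detect AddressSanitizer at compile time (GCC macro, then Clang feature test).
// Assumption: LSan runs as part of ASan in this build.
#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_LSAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_LSAN 1
#  endif
#endif

#ifdef SKETCH_HAS_LSAN
#  include <sanitizer/lsan_interface.h>
#endif

// Hypothetical helper: allocates a block that is deliberately never freed.
inline void* allocate_intentionally_leaked(std::size_t size) {
  void* p = std::malloc(size);
#ifdef SKETCH_HAS_LSAN
  if (p != nullptr) {
    // Tell LSan to leave this block out of its leak report.
    __lsan_ignore_object(p);
  }
#endif
  return p;
}
```

In a build without the sanitizer the helper reduces to a plain `malloc`, so the annotation costs nothing there.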

As for the cost of leaked allocations, it really depends. It's almost never just the size of the requested allocation; it depends entirely on where the malloc implementation placed the allocations. They could all be next to each other, which is ideal, or they could be fragmented.
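
A rough back-of-the-envelope model of that point, as a sketch under stated assumptions: the header size, alignment, and page size below are hypothetical round numbers, not the parameters of any particular malloc, and "fragmented" is modeled as each leaked block keeping its own page alive.

```cpp
#include <cstddef>

// Hypothetical allocator parameters, chosen only for illustration.
constexpr std::size_t kHeaderBytes = 16;    // per-block bookkeeping
constexpr std::size_t kAlignment   = 16;    // block size granularity
constexpr std::size_t kPageBytes   = 4096;  // assumed page size

// Round value up to the next multiple of unit.
constexpr std::size_t round_up(std::size_t value, std::size_t unit) {
  return (value + unit - 1) / unit * unit;
}

// Bytes one block occupies in this simplified model.
constexpr std::size_t modeled_footprint(std::size_t requested) {
  return round_up(requested + kHeaderBytes, kAlignment);
}

// Best case: all leaked blocks sit next to each other.
constexpr std::size_t pinned_if_adjacent(std::size_t blocks, std::size_t requested) {
  return round_up(blocks * modeled_footprint(requested), kPageBytes);
}

// Modeled worst case: every leaked block keeps a separate page alive.
constexpr std::size_t pinned_if_scattered(std::size_t blocks) {
  return blocks * kPageBytes;
}
```

In this model, 100 leaked 24-byte blocks pin 8192 bytes when adjacent and 409600 bytes when scattered, even though only 2400 bytes were requested.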

-------------

PR: https://git.openjdk.org/jdk/pull/12642

