RFR: 8333658: NMT: Use an allocator with 4-byte pointers to save memory in NativeCallStackStorage [v26]

Johan Sjölen jsjolen at openjdk.org
Sun Jun 23 12:44:13 UTC 2024


On Sun, 23 Jun 2024 10:33:44 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

>> Hi,
>> 
>> This PR introduces a new allocator, `IndexedFreelistAllocator`. It uses a `GrowableArray` in order to have 4-byte "pointers" to its elements and also works as a freelist as unused slots form a linked list. This allocator cannot shrink its used memory. I'm always open for better names.
>> 
>> We then use this allocator in order to store the `NativeCallStackStorage` hash table. This saves `4 * 4099 = ~16KiB` of memory as each table element is now only 4 bytes. For each stored stack it also saves 4 bytes. The fact that the allocator cannot shrink is fine, we do not ever remove stacks from the storage.
>> 
>> The main point of this is to introduce a pointer-generic experimential API, so I also implemented CHeap and Arena allocators. It's currently placed in NMT, but we might want to move it into utilities. It uses a bit of template-magic, but my IDE (Emacs+clangd) was actually able to figure things out when the types didn't line up correctly etc, so it's not an enemy to IDE help.
>> 
>> It sounded expensive to have the GrowableArray continously realloc its underlying data array, so I did a basic test where we allocate 1 000 000 stacks and push them into NativeCallStackStorage backed by different allocators. This is available in the PR.
>> 
>> The results are as follows on linux-x64-slowdebug:
>> 
>> 
>> Generate stacks... Done
>> Time taken with GrowableArray: 8341.240945
>> Time taken with CHeap: 12189.031318
>> Time taken with Arena: 8800.703092
>> Time taken with GrowableArray again: 8295.508829
>> 
>> 
>> And on linux-x64:
>> 
>> 
>> Time taken with GrowableArray: 8378.018135
>> Time taken with CHeap: 12437.347868
>> Time taken with Arena: 8758.064717
>> Time taken with GrowableArray again: 8391.076291
>> 
>> 
>> Obviously, this is a very basic benchmark, but it seems like we're faster than CHeap and Arena for this case.
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Any entry must be trivial

Hi,

I've changed the data which is stored in `HOA` to be trivial and have added `static_assert`s to the methods. We **cannot** add the `static_assert` to the top of the `HOA` class because there's a mutual dependency on the data type being stored and the `HOA`. It's very important that the data we put into `HOA` is trivial as we do not call any destructors or copy constructors/operators on the code. **Any such "improvements" may be undesirable**, it's good to have limited containers and data which is simpler to reason about.

Here's a minimal example show casing the problem of why `static_assert` has to live in the methods and not in the class definition:

```c++
#include <type_traits>
#include <unordered_map>

template<typename E>
struct A {
    // SEE ME: Uncomment this line to see compilation errors re:
    // "incomplete types"
    // static_assert(std::is_trivial<E>::value);
    union alignas(E) Foo {
        char e[sizeof(E)];
    };
    using I = int;

    Foo* foo;
    void put(E& e) {
        static_assert(std::is_trivial<E>::value);
        }
};

struct C {
    struct B;
    using AB = A<B>;
    using I = typename AB::I;

    struct B {
        I i;
        // SEE ME! Uncomment this line to make B non-trivial
        // std::unordered_map<int, int> s;
    };

    AB a;

    void do_it() {
        B b;
        a.put(b);
    }
};

int main() {
    C c;
    c.do_it();
    return 0;
}
``` 

I suggest that you paste this in godbolt.org (remember to switch to C++) or compile locally, following the `SEE ME` comments to understand what's going on.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18979#issuecomment-2184974034


More information about the hotspot-dev mailing list