RFR: 8354954: Typed static memory for late initialization of static class members in Hotspot

Quan Anh Mai qamai at openjdk.org
Thu Apr 17 14:50:23 UTC 2025


On Wed, 16 Apr 2025 12:36:45 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

> Hi,
> 
> This PR introduces a `StableValue<T>` which is sized and aligned identically to a `T`, with the difference that a `StableValue<T>` needs to be explicitly instantiated.
> 
> Dynamic static initalization in C++ leads to unpredictable bugs as there is no defined order in which objects will be initialized, to the degree that 'static initialization fiasco' is a term used. In the code I've worked on in Hotspot we resolve this by having an initialization function, and instead of having static members of `T` we have `T*` instead and use `malloc` in order to gain the memory for the objects.
> 
> This is fine, however when working with certain code you may want to allocate before NMT is initialized. This leads to you having to use raw malloc. This is probably something you find out after having used `os::malloc` and already had a strange crash when building your code.
> 
> That's why I'd like to have `StableValue<T>`. It let's you avoid the whole `malloc` thing, and we overload `->` to make it behave as if it is actually a `T`. We add in a simple checker in debug mode that checks whether the memory has been initialized before using it.
> 
> In the code I've switched two members to be of `StableValue` instead. One is the malloc case above, the second (MemBaseline) is one where I got a bug while developing. The bug occurred because I  changed the initializer of `MemBaseline` without knowing that it was dynamic-static-allocated, and the exact change I made caused weird crashes (because of initialization order issues).
> 
> This solution is quite practical to me, but I wanted to know what others think.

Do you mean my implementation won't work for non-trivial types, can you point out why?

Of course, my concerns are theoretical and may not express themselves in simple programs or even in all programs. But UB is UB and you cannot know when it will bite you. As a result, having a non-UB implementation would be preferrable for new code.

Your implementation has a lot of UB, I believe a conformant implementation should look like this:

    class StaticArea {
      union {
        T _data;
      };
      DEBUG_ONLY(bool _present);

      StaticArea() DEBUG_ONLY(: _present(false)) {}
      ~StaticArea() {
        // we cannot be sure _present is true, just let it leak
        // _data.~T();
      }

      T* as() {
        assert(_present, "");
        return &_data;
      }

      template <class... Ts>
      void init(Ts&... args) {
        assert(!_present, "StaticArea already initialized");
        new (&_data) T(args...);
      }
    };

P.S: I saw where your confusion comes from now :) You need an anonymous union member, you also need a constructor and a destructor for `S`, like this:

    struct S {
        union {
            F _f;
        }

        S() {}
        ~S() {}
    }

The reason is that an anonymous union acts as if the members of the union are the members of the enclosing class, which means that the class constructor will construct the members of the union and the class destructor will destroy them. If you are insisted on named union you will need to provide the constructor and destructor to the union type:

    struct S {
        union U {
            F _f;

            U() {}
            ~U() {}
        };
        U u;
    }

https://godbolt.org/z/xdTKTjWE7

You can access return the pointer to `_t` directly without having to `reinterpret_cast` from `this`. Also, a union must have the size and alignment to contains a `T` so you don't need `alignas`, either.

src/hotspot/share/utilities/staticArea.hpp line 42:

> 40: #ifdef ASSERT
> 41:   StaticArea() {
> 42:     uint32_t* d = reinterpret_cast<uint32_t*>(_mem);

Firstly, this is UB because you are doing pointer arithmetic on an `int*` when the underlying memory is not actually an array of `int`. Secondly, I'm not sure if `d[i] = death_pattern`  is correct, I believe it should be `::new(&_mem[i * 4]) int(death_pattern)` so that an `int` object is properly created.

src/hotspot/share/utilities/staticArea.hpp line 51:

> 49:     uint32_t* d = reinterpret_cast<uint32_t*>(_mem);
> 50:     for (size_t i = 0; i < sizeof(T) / sizeof(uint32_t); i++) {
> 51:       if (d[i] != death_pattern) return false;

This load is only valid if `d[i]` does contain an `uint32_t`, which is `death_pattern`. However, this is contradictory as in order to check for the death pattern, you need to know that the thing here is the death pattern. Trying to read an `uint32_t` from a `T` is UB.

src/hotspot/share/utilities/staticArea.hpp line 58:

> 56: 
> 57:   T* as() {
> 58:     return reinterpret_cast<T*>(this);

I don't think this is legal. Normally the placement new returns a pointer and you have to use that pointer to access the newly initialized object. Reinterpreting the pointer passed into the placement new needs `std::launder` to be conformant.

src/hotspot/share/utilities/staticArea.hpp line 62:

> 60:   T* operator->() {
> 61:     assert(!is_death_pattern(), "Potential access to uninitialized memory");
> 62:     return as();

This is pretty contradictory as you need the content here to be an `uint32_t` for the `is_death_pattern` call to be valid, but you need it to be a `T` so that the following operation can be executed. I can easily see the compiler getting upset with this aliasing.

-------------

PR Review: https://git.openjdk.org/jdk/pull/24689#pullrequestreview-2775141866
PR Comment: https://git.openjdk.org/jdk/pull/24689#issuecomment-2812030333
PR Comment: https://git.openjdk.org/jdk/pull/24689#issuecomment-2812341701
PR Comment: https://git.openjdk.org/jdk/pull/24689#issuecomment-2812430323
PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2048584429
PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2048588507
PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2048592103
PR Review Comment: https://git.openjdk.org/jdk/pull/24689#discussion_r2048595671


More information about the hotspot-dev mailing list