RFR: 8351500: G1: NUMA migrations cause crashes in region allocation
Thomas Schatzl
tschatzl at openjdk.org
Wed Mar 12 11:27:52 UTC 2025
On Tue, 11 Mar 2025 14:01:11 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> For details, please see JBS issue.
>
> _Please note that this bug only shows symptoms in JDK 21 and JDK 17! Due to code shuffling done as part of G1 region-local pinning work, the error does not show in JDKs 22 and later._
>
> I originally planned to fix this just for JDK 21 and 17 (see https://github.com/openjdk/jdk21u-dev/pull/1460). However, I would rather have it fixed in the mainline, even though it is symptom-free. It is a lingering issue that may surface later if the code is ever changed. Plus, this prevents the fix from being accidentally overwritten in JDK 21 if we backport.
>
> ----
>
> The fix is simple: we fix (hah) the NUMA association for the full duration of a heap allocation in G1. That way, regardless of the OS scheduler moving the thread to a different NUMA node, we always use the same `G1AllocRegion` object, and changes in the control flow that rely on that won't break on NUMA.
>
> This has the disadvantage of allocating memory from a node we are potentially moving away from. However, I argue that this is exceedingly rare, and if it happens, the OS will cope by eventually migrating the memory to the correct node.
>
> ---
>
> Testing:
>
> Testing is difficult. See remark in JBS issue.
>
> I tested a modified version of this patch on JDK 21, where the error does cause crashes. I tested with an additional patch mimicking tons of NUMA node migrations. As I wrote in JBS, I plan to contribute that "FakeNUMA" mode eventually, but lack the time to polish it up for now. I hope the fix is simple and uncontested enough to go in quickly, since I would like to fix JDK 21 soon via backporting this patch.
src/hotspot/share/gc/g1/g1Allocator.inline.hpp line 55:
> 53: size_t desired_word_size,
> 54: size_t* actual_word_size,
> 55: uint node_index) {
(I errorneously commented this on the JDK 21 change, but it is the same here; since this is a pre-existing issue we might want to fix all this in a separate CR though)
I would strongly prefer if node_index were not the last argument - it is an input argument, and should be next to the other input arguments. Ie. right now the types of arguments are "input, input, output, input" (min_word_size, desired_word_size, actual_word_size, node_index) which is imo not good style.
Maybe this should also be fixed in survivor_attempt_allocation and other similar places (this is a pre-existing issue; I do remember we just tacked the parameter on when we added NUMA support).
Also the comment should be updated to something like Attempt allocation in the current alloc region on the given node or so.
Another option would be to explicitly group the allocation type parameters into a struct, e.g.
struct AllocParams {
size_t desired_word_size;
uint node_index;
}
to make it explicit that we want to enforce this closeness. not required though, as for apart from this method the number of arguments isn't that bad.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/23984#discussion_r1991270482
More information about the hotspot-gc-dev
mailing list