RFR: 8301403: Eliminate memory allocations in JVMFlag::printFlags during signal handling [v5]
Thomas Stuefe
stuefe at openjdk.org
Sat Jul 20 06:50:32 UTC 2024
On Fri, 19 Jul 2024 20:18:47 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:
>> Allocating memory while handling an error should be avoided, however `JVMFlag::printFlags()` is used by crash log and it allocates memory to print flags sorted (using qsort).
>>
>> We avoid memory allocation by using a simple in place algorithm that uses JVMFlag 1 bit of unused data from its private `Flags` enum data member. It is O(n^2) algorithm, compared to O(n*log(n)) for qsort, however, it's called while handling an error log, so the speed here is not paramount. Also, I measured the real impact on a simple test case and I actually observed performance improvement of about x2.5 faster (2,885,973ns in place ordered printing vs 7,389,456ns for qsort). This is because we ended up having to qsort all flags when only a fraction of them actually end up being printed.
>>
>> Running MACH5 tests...
>
> Gerard Ziemski has updated the pull request incrementally with one additional commit since the last revision:
>
> can not use new
Hi @gerard-ziemski
>
> I don't like the idea of using script, as it would make things more complex for IDE based compilations (doable, but extra steps). Is there any way we can use `constexpr` and other compile time c++ features to help here?
Yeah, maybe its not possible. I wrecked my brain but did not come up with anything that would work via macros or templates. Small arrays one can compile-time-sort with macro tricks in C, but that does not scale to 1xxx entries.
>
> But, first I'd like to hear which API you recommend we use for memory allocation using stack.
Why, just a fixed-size array. The total number of JVMFlags is known at compile time. And while I look at the code, I see we don't even have 1500 entries but seem to have less: 1247.
So, I would:
constexpr int num_flags = NUM_JVMFlagsEnum;
// might be even safer: constexpr int num_flags = sizeof(flagTable) / sizeof(flagTable[0]);
constexpr int num_words = BitMap::calc_size_in_words(num_flags);
bm_word_t test_data[num_words]; // should be about 156 bytes
BitMapView test_map(test_data, num_words);
test_map.clear_large_range_of_words(); // Since I think the constructor does not initialize the array
/// .... off you go
You could even make sure that the code does not bitrot - e.g. if due to some crazy change we suddenly have ten times as many flags - with a STATIC_ASSERT:
STATIC_ASSERT(sizeof(test_data) < 512); // if we ever use more than this on the stack, lets revisit the code
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20202#issuecomment-2240963546
More information about the hotspot-runtime-dev
mailing list