RFR: 8293972: runtime/NMT/NMTInitializationTest.java#default_long-off failed with "Suspiciously long bucket chains in lookup table." [v4]
Thomas Stuefe
stuefe at openjdk.org
Thu Jul 20 06:33:44 UTC 2023
On Wed, 19 Jul 2023 20:43:02 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:
>> We initialize `NMT` earlier in the VM startup sequence, so that `NMTPreInit` does not have to deal with a variable number of VM arguments.
>>
>> Since the `NMTPreInit` lookup table is of a fixed size (it only needs to hold about 56 items) we do not need to worry about it getting out of balance.
>>
>> Since the time window that `NMTPreInit` is needed is now much shorter, we do not "leak" as much memory. We also do not need to worry about `realloc`.
>>
>> As a result, we do not need to test is as extensively as we used to and are being able to simplify `NMTPreInit` quite a bit, which allows us to remove a good chunk of code.
>>
>> In addition, we improve the hash function, by using Mersenne prime (2^x - 1) i.e. 127, as the table size, which when combined with "modulo" operation, provides a uniform bit distribution.
>>
>> To help with testing this change to make sure we improve the hash function, I collected `NMTPreInitAllocation` data on 3 platforms: macOS, Linux and Windows, using 100 runs of `test/hotspot/jtreg/runtime/NMT/NMTInitializationTest.java` and computed `stdev` as well as average `max chain`:
>>
>>
>> mac OS data:
>> BEFORE total: max chain: 676, empties: 210892, ranges: 676, std_devs: 112.771
>> AFTER total: max chain: 622, empties: 230762, ranges: 622, std_devs: 111.094
>> BEFORE avg: max chain: 5.061, empties: 18.695, ranges: 5.061, std_devs: 0.010
>> AFTER avg: max chain: 4.181, empties: 16.970, ranges: 4.161, std_devs: 0.009
>>
>> Linux data:
>> BEFORE total: max chain: 523, empties: 193174, ranges: 523, std_devs: 103.243
>> AFTER total: max chain: 432, empties: 175347, ranges: 430, std_devs: 90.670
>> BEFORE avg: max chain: 6.540, empties: 20.402, ranges: 6.540, std_devs: 0.011
>> AFTER avg: max chain: 6.017, empties: 22.324, ranges: 6.017, std_devs: 0.011
>>
>> Windows data:
>> BEFORE total: max chain: 578, empties: 201224, ranges: 578, std_devs: 108.988
>> AFTER total: max chain: 491, empties: 218653, ranges: 491, std_devs: 104.879
>> BEFORE avg: max chain: 5.602, empties: 19.504, ranges: 5.602, std_devs: 0.011
>> AFTER avg: max chain: 4.759, empties: 21.193, ranges: 4.759, std_devs: 0.010
>>
>>
>> Note: the lower the number, the better.
>> Note: the `stdedev` and `max chain` are the most important numbers here.
>>
>> We also, mark output with PID to help identify which test process it is coming from as the attached log was very confusing to read.
>>
>> Tested manually with `jtreg -nr -va -jdk:./build/macosx-x86_64-server-fastdebug/images/j...
>
> Gerard Ziemski has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
>
> - init NMT earlier to decrease the reliance on NMTPreInit lookup table, which lets us simplify the code
> - Merge remote-tracking branch 'upstream/master' into JDK-8293972
> - do not clip bits to 32, we need full 64 bits
> - ^ is XOR, not pow
> - remove tabs
> - mark output with PID to help identify where it comes from
> - use Mersenne prime and mod for uniform bit distribution of hash
Sorry, I appreciate your work, but I'm not a big fan. I had similar thoughts when building NMTPreinit, but decided to implement realloc fully because it is just a lot safer and flexible.
You bank on no reallocs ever happening in early initialization code. The amount of code that could run may surprise you. I know it surprised me. And those can be code paths very rarely taken. Realloc is such a basic function. Automatic C++ objects may realloc, for instance. I think UL has a ton of those, as has the Compiler. Or, you may crash and execute the crash handler - are we sure no realloc ever runs then?
Your patch introduces a new dependency: now we will never be able to run code beforehand that uses realloc. And we will not be able to precede NMT argument parsing with another argument. That makes the code brittle.
To be sure this works, we'd need at least add a guarantee - not an assert - at os::realloc(). Because the code paths taken can vary wildly, and you can never be sure we test everything.
Bottomline, you trade in potential crashes (guarantees) and more inflexibility for a complexity reduction in code that makes no real problem. I'd rather just dumb down the test.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14607#issuecomment-1643333917
More information about the hotspot-runtime-dev
mailing list