RFR: 8293972: runtime/NMT/NMTInitializationTest.java#default_long-off failed with "Suspiciously long bucket chains in lookup table." [v4]
Thomas Stuefe
stuefe at openjdk.org
Thu Jul 20 09:53:43 UTC 2023
On Wed, 19 Jul 2023 20:43:02 GMT, Gerard Ziemski <gziemski at openjdk.org> wrote:
>> We initialize `NMT` earlier in the VM startup sequence, so that `NMTPreInit` does not have to deal with a variable number of VM arguments.
>>
>> Since the `NMTPreInit` lookup table is of a fixed size (it only needs to hold about 56 items) we do not need to worry about it getting out of balance.
>>
>> Since the time window that `NMTPreInit` is needed is now much shorter, we do not "leak" as much memory. We also do not need to worry about `realloc`.
>>
>> As a result, we do not need to test is as extensively as we used to and are being able to simplify `NMTPreInit` quite a bit, which allows us to remove a good chunk of code.
>>
>> In addition, we improve the hash function, by using Mersenne prime (2^x - 1) i.e. 127, as the table size, which when combined with "modulo" operation, provides a uniform bit distribution.
>>
>> To help with testing this change to make sure we improve the hash function, I collected `NMTPreInitAllocation` data on 3 platforms: macOS, Linux and Windows, using 100 runs of `test/hotspot/jtreg/runtime/NMT/NMTInitializationTest.java` and computed `stdev` as well as average `max chain`:
>>
>>
>> mac OS data:
>> BEFORE total: max chain: 676, empties: 210892, ranges: 676, std_devs: 112.771
>> AFTER total: max chain: 622, empties: 230762, ranges: 622, std_devs: 111.094
>> BEFORE avg: max chain: 5.061, empties: 18.695, ranges: 5.061, std_devs: 0.010
>> AFTER avg: max chain: 4.181, empties: 16.970, ranges: 4.161, std_devs: 0.009
>>
>> Linux data:
>> BEFORE total: max chain: 523, empties: 193174, ranges: 523, std_devs: 103.243
>> AFTER total: max chain: 432, empties: 175347, ranges: 430, std_devs: 90.670
>> BEFORE avg: max chain: 6.540, empties: 20.402, ranges: 6.540, std_devs: 0.011
>> AFTER avg: max chain: 6.017, empties: 22.324, ranges: 6.017, std_devs: 0.011
>>
>> Windows data:
>> BEFORE total: max chain: 578, empties: 201224, ranges: 578, std_devs: 108.988
>> AFTER total: max chain: 491, empties: 218653, ranges: 491, std_devs: 104.879
>> BEFORE avg: max chain: 5.602, empties: 19.504, ranges: 5.602, std_devs: 0.011
>> AFTER avg: max chain: 4.759, empties: 21.193, ranges: 4.759, std_devs: 0.010
>>
>>
>> Note: the lower the number, the better.
>> Note: the `stdedev` and `max chain` are the most important numbers here.
>>
>> We also, mark output with PID to help identify which test process it is coming from as the attached log was very confusing to read.
>>
>> Tested manually with `jtreg -nr -va -jdk:./build/macosx-x86_64-server-fastdebug/images/j...
>
> Gerard Ziemski has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision:
>
> - init NMT earlier to decrease the reliance on NMTPreInit lookup table, which lets us simplify the code
> - Merge remote-tracking branch 'upstream/master' into JDK-8293972
> - do not clip bits to 32, we need full 64 bits
> - ^ is XOR, not pow
> - remove tabs
> - mark output with PID to help identify where it comes from
> - use Mersenne prime and mod for uniform bit distribution of hash
One more note of why I hesitate to remove "realloc"-support:
realloc is often used for buffers that grow on demand. These growth points are difficult to predict since they depend on dynamic buffer content. Therefore this code is very difficult to fully cover with tests.
We have seen these problems with stringStream back when it was still using RA as backing store. All would go well in tests, but when the customer turned on logging in a particular way, and we logged a particular sequence of log strings, the reallocation would hit just right at the wrong spot - in this case, we would reallocate in a leaf function under a temporary ResourceMark - and then later crash in the caller when the RM was removed.
For our case, it means if someone creates a global C++ object that uses e.g. a stringStream (now uses C-Heap) or a GrowableHeapArray, and uses that before CJVM, it may trigger reallocs before CJVM at a customer, but not in our tests.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14607#issuecomment-1643619636
More information about the hotspot-runtime-dev
mailing list