RFR: JDK-8261334: NMT: tuning statistic shows incorrect hash distribution [v2]

Zhengyu Gu zgu at openjdk.java.net
Mon Feb 8 17:50:40 UTC 2021


On Mon, 8 Feb 2021 16:37:00 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> This is a trivial patch.
>> 
>> Tuning statistics for the malloc site hash map in NMT use a MallocSiteWalker to walk the malloc sites.
>> 
>> There is a bug in the report code which causes hash distribution statistics displayed to be (sometimes widely) off:
>> 
>> Hash distribution:
>>   1    entry: 179
>>   2  entries:  79
>>   3  entries:  66
>>   4  entries:  72
>>   5  entries:  98
>>   6  entries:  75
>>   7  entries:  55
>>   8  entries:  43
>>   9  entries:  22
>>  10 entries:   16
>>  11 entries:    5
>>  12 entries:    6
>>                   (sum 716)
>> 
>> This is the bucket chain length histogram. Note that the sum of all values is 716 which exceeds the table width of 511, the total number of buckets.
>> 
>> The problem is caused by a bug in the walker code where the bucket index is calculated by manually mod'ing `NativeCallStack::hash()` with table size:
>> 
>> https://github.com/openjdk/jdk/blob/d0a8f2f737cdfc1ae742d47f2dd4f2bbc03f4398/src/hotspot/share/services/memTracker.cpp#L263
>> 
>> which is wrong since the hash is defined as signed int. So it yields incorrect index values if the hash code is <0, compared with the regular hashcode-to-index calculation done in the table itself, which translates the hash into an unsigned value before mod'ing:
>> 
>> https://github.com/openjdk/jdk/blob/d0a8f2f737cdfc1ae742d47f2dd4f2bbc03f4398/src/hotspot/share/services/mallocSiteTable.hpp#L243
>> 
>> This is an old bug introduced with JDK-8046598 in 2014. Note that it causes the statistics to look better than it actually is, since it reports a long chain as multiple short chains.
>> 
>> --------------
>> 
>> The patch is really minimal, just adding the necessary cast at the right place. Patch is small for easy backporting. Hash code calculation will be touched up as part of https://bugs.openjdk.java.net/browse/JDK-8261302, so I'd like to keep the patch minimal. I also added tracing code to print tuning info as part of the final NMT report in debug VMs.
>> 
>> This is the correct output after the patch. Sum of buckets is 511 as expected. Note that the statistics are way less flattering now, since there are really almost none 1-length chains as the broken statistic claims:
>> 
>> Hash distribution:
>> empty bucket:  1
>>   1    entry:  8
>>   2  entries: 29
>>   3  entries: 52
>>   4  entries: 58
>>   5  entries: 93
>>   6  entries: 76
>>   7  entries: 71
>>   8  entries: 52
>>   9  entries: 37
>>  10 entries:  16
>>  11 entries:  12
>>  12 entries:   6
>>                   (sum 511)
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove tuning tracing from final report

Marked as reviewed by zgu (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/2458


More information about the hotspot-runtime-dev mailing list