RFR: 8298992: runtime/NMT/SummarySanityCheck.java failed with "Total committed (MMMMMM) did not match the summarized committed (NNNNNN)"
Gerard Ziemski
gziemski at openjdk.org
Wed Aug 23 15:03:28 UTC 2023
On Wed, 23 Aug 2023 08:24:10 GMT, Afshin Zafari <azafari at openjdk.org> wrote:
>> src/hotspot/share/services/mallocTracker.hpp line 205:
>>
>>> 203: }
>>> 204: } while(s->_all_mallocs.size() != total_size && ++loop_counter < loop_limit);
>>> 205: assert(s->_all_mallocs.size() == total_size, "Total != sum of parts");
>>
>> Do we agree then that the assert on line 205 is not needed?
>
> The issue here was that while the malloc measurements are being copied in the loop, new allocations can happen and change the items already copied. This results in a mismatch between Total and the sum of the items.
> The `ThreadCritical` in the code was supposed to block other threads' allocations during the copy. But it did not work as expected, since `ThreadCritical` is used only in a few _deallocations_ in the code.
> Therefore the while loop is written here to make sure that the copied malloc items are consistent, i.e. $Total = \sum_i item_i$.
>
> After Gerard's comment, the while-loop is now bounded to a fixed number of iterations (`loop_limit = 100`) rather than being an infinite loop.
>
> So if the items are still not consistent after `loop_limit` iterations, it is better to raise the error here than to let the mismatch propagate up to the reports.
>
> It is expected that replacing `ThreadCritical` with a mutex for NMT will resolve the issue, and the while-loop will no longer be needed.
Yes, I see the issue with the loop counter reaching its limit now.
Are we going to replace `ThreadCritical` with an NMT mutex here, or is that going to be a separate follow-up issue?
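
For reference, the bounded retry pattern under discussion can be sketched roughly as below. This is not the HotSpot code: `Counters`, `Snapshot`, `copy_consistent` and `kCategories` are hypothetical names made up for illustration, and only `loop_counter`, `loop_limit` and the "Total == sum of parts" check mirror the quoted snippet.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the per-category malloc counters (not NMT's types).
constexpr int kCategories = 4;

struct Counters {
  std::array<std::atomic<std::size_t>, kCategories> by_category{};
  std::atomic<std::size_t> total{0};
};

struct Snapshot {
  std::array<std::size_t, kCategories> by_category{};
  std::size_t total = 0;
  bool consistent = false;  // true when total == sum of by_category
};

// Copy the live counters, retrying at most loop_limit times until the copied
// total matches the sum of the copied parts. Other threads may update the
// counters while the copy is in progress, which is what causes a mismatch.
inline Snapshot copy_consistent(const Counters& live, int loop_limit = 100) {
  Snapshot s;
  int loop_counter = 0;
  std::size_t sum = 0;
  do {
    sum = 0;
    for (int i = 0; i < kCategories; i++) {
      s.by_category[i] = live.by_category[i].load(std::memory_order_relaxed);
      sum += s.by_category[i];
    }
    s.total = live.total.load(std::memory_order_relaxed);
  } while (s.total != sum && ++loop_counter < loop_limit);
  // The caller decides what to do on failure; the quoted code asserts here.
  s.consistent = (s.total == sum);
  return s;
}
```

In this sketch the caller would check `consistent` (or assert on it) once the loop gives up. If writers and the copier shared one mutex instead, a single pass under the lock would presumably be enough and the retry loop could go away.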
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/15306#discussion_r1303155094