RFR: 8298992: runtime/NMT/SummarySanityCheck.java failed with "Total committed (MMMMMM) did not match the summarized committed (NNNNNN)"

Gerard Ziemski gziemski at openjdk.org
Wed Aug 23 15:03:28 UTC 2023


On Wed, 23 Aug 2023 08:24:10 GMT, Afshin Zafari <azafari at openjdk.org> wrote:

>> src/hotspot/share/services/mallocTracker.hpp line 205:
>> 
>>> 203:       }
>>> 204:     } while(s->_all_mallocs.size() != total_size && ++loop_counter < loop_limit);
>>> 205:     assert(s->_all_mallocs.size() == total_size, "Total != sum of parts");
>> 
>> Do we agree then that the assert on line 205 is not needed?
>
> The issue here was that new allocations can happen while the loop is copying the malloc measures, which changes the items being copied. This results in a mismatch between the Total and the sum of the items.
> The `ThreadCritical` in the code was supposed to block other threads' allocations while copying. But it did not work as expected, since the `ThreadCritical` is used in a few _deallocations_ in the code.
> Therefore the while loop is written here to make sure that the malloc items that are copied are consistent, i.e. $Total = \sum_i item_i$.
> 
> After Gerard's comment, the while-loop is bounded to a fixed number of iterations (`loop_limit = 100`) rather than being an infinite loop.
> 
> So if the items are still not consistent after `loop_limit` iterations, it is better to raise the error here than to let the mismatch propagate up to the reports.
> 
> It is expected that replacing `ThreadCritical` with a mutex for NMT will resolve the issue, so that the while-loop is no longer needed.

Yes, I see the issue with the loop counter reaching its limit now.

Are we going to replace `ThreadCritical` with an NMT mutex here, or is that going to be a separate follow-up issue?
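
To make the discussion concrete, here is a minimal sketch of the bounded-retry copy described in the quoted comment. It is not the HotSpot code: `Counters`, `Snapshot`, `copy_consistent` and `kCategories` are hypothetical names chosen for illustration, and plain `std::atomic` counters stand in for the NMT malloc measures. Only `loop_limit = 100` and the "Total == sum of parts" check come from the thread. Instead of asserting, the sketch reports the outcome in a flag so the caller decides what to do on a mismatch.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-category malloc counters (not HotSpot types).
constexpr std::size_t kCategories = 4;

struct Counters {
  std::array<std::atomic<std::size_t>, kCategories> per_category{};
  std::atomic<std::size_t> total{0};

  // Two separate updates: a concurrent reader can observe one without the other.
  void record_malloc(std::size_t category, std::size_t bytes) {
    assert(category < kCategories);
    per_category[category].fetch_add(bytes, std::memory_order_relaxed);
    total.fetch_add(bytes, std::memory_order_relaxed);
  }
};

struct Snapshot {
  std::array<std::size_t, kCategories> per_category{};
  std::size_t total = 0;
  bool consistent = false;  // true if total == sum of per_category
};

// Copy the counters, retrying at most loop_limit times until the copied
// total matches the sum of the copied parts.
inline Snapshot copy_consistent(const Counters& c, int loop_limit = 100) {
  Snapshot s;
  int loop_counter = 0;
  do {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < kCategories; ++i) {
      s.per_category[i] = c.per_category[i].load(std::memory_order_relaxed);
      sum += s.per_category[i];
    }
    s.total = c.total.load(std::memory_order_relaxed);
    s.consistent = (s.total == sum);
  } while (!s.consistent && ++loop_counter < loop_limit);
  return s;
}
```

A caller would check `snapshot.consistent` after `copy_consistent(counters)` and raise the error at that point if it is false. A matching total only shows that the copy is self-consistent; under heavy concurrent allocation the loop can still run out of retries, which is why a single mutex around both the updates and the copy would make the loop unnecessary.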

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15306#discussion_r1303155094

