RFR: 8298992: runtime/NMT/SummarySanityCheck.java failed with "Total commi…tted (MMMMMM) did not match the summarized committed (NNNNNN)
Gerard Ziemski
gziemski at openjdk.org
Wed Aug 23 15:29:22 UTC 2023
On Wed, 16 Aug 2023 12:18:35 GMT, Afshin Zafari <azafari at openjdk.org> wrote:
> During exhaustive tests, it is observed that during taking snapshot of NMT metrics it is possible that new allocations happen concurrently, although a `ThreadCritical` is used during copying current metrics to the snapshot.
> A loop is surrounding the copying and checks whether the copied and original are the same.
The comment in `void copy_to(MallocMemorySnapshot* s)` says:
// Need to make sure that mtChunks don't get deallocated while the
// copy is going on, because their size is adjusted using this
// buffer in make_adjustment().
but I don't see `ThreadCritical` used anywhere else, except `copy_to()` itself, so what does that lock synchronizes with exactly? Does it do anything at all right now?
I see atomic operations used to protect the individual memory counters, but I don't see anywhere where we protect the total size + size of the chunks. If anyone of them changes, then copying:
s->_all_mallocs = _all_mallocs;
loop (index) { s->_malloc[index] = _malloc[index]; }
might at the end not to agree on:
`s._all_mallocs.size() == s._malloc[nmt_type1] + s._malloc[nmt_type2] + ...`
Afshin proposed a simple fix, which even though not ideal, will get job done.
An ideal fix would make sure that any operation involving changing `_all_mallocs` and `_malloc[nmt_type]` are both synchronized, so that we can use that same lock (whatever it is) in `copy_to()`
Did I get this right?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15306#issuecomment-1690169040
More information about the hotspot-runtime-dev
mailing list