RFR: 8298992: runtime/NMT/SummarySanityCheck.java failed with "Total commi…tted (MMMMMM) did not match the summarized committed (NNNNNN)

Afshin Zafari azafari at openjdk.org
Fri Aug 25 08:31:08 UTC 2023


On Wed, 16 Aug 2023 12:18:35 GMT, Afshin Zafari <azafari at openjdk.org> wrote:

> During exhaustive tests, it is observed that during taking snapshot of NMT metrics it is possible that new allocations happen concurrently, although a `ThreadCritical` is used during copying current metrics to the snapshot.
> A loop is surrounding the copying and checks whether the copied and original are the same.

Some facts about these NMT metrics:
- The total and the sum of the parts in NMT reports are never checked for consistency, except by the JTREG test named in the issue title. That test tolerates a difference of up to 8K/8M because values are rounded to a scale of 1K/1M.
- Concurrent updates to the metrics are handled with atomic operations. While a snapshot is taken, however, nothing guards against concurrent updates, for either malloc or virtual memory allocations.
- In the malloc case, a Total is also held in the snapshot. This metric is used to detect inconsistency between the Total and the sum of the copied parts.
- Using ThreadCritical to keep the metrics consistent is not cheap (based on the current NMT implementation and the comments in the ThreadCritical class definition).

The small change proposed in this PR aims for a middle ground between fully consistent metrics and never-consistent ones.
There is another issue ([8304824](https://bugs.openjdk.org/browse/JDK-8304824)) for replacing ThreadCritical with a mutex. Once that issue is fixed, the while-loop added here would be removed.
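
To illustrate the idea of the while-loop, here is a minimal standalone sketch. It is not the HotSpot code from the PR: `Counters`, `Snapshot`, `take_snapshot`, `kCategories` and the `max_attempts` bound are all hypothetical names and choices made for this example. It only shows the general pattern of copying atomically updated counters and retrying while the copied Total disagrees with the sum of the copied parts.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for per-category malloc counters; not HotSpot code.
constexpr std::size_t kCategories = 4;

struct Counters {
  std::array<std::atomic<std::size_t>, kCategories> by_category{};
  std::atomic<std::size_t> total{0};

  // Each counter is updated atomically, but the two updates together are
  // not one atomic step, so a reader can observe them half-applied.
  void record_malloc(std::size_t category, std::size_t bytes) {
    by_category[category].fetch_add(bytes, std::memory_order_relaxed);
    total.fetch_add(bytes, std::memory_order_relaxed);
  }
};

struct Snapshot {
  std::array<std::size_t, kCategories> by_category{};
  std::size_t total = 0;
};

// Copy the counters and retry while the copied total differs from the sum
// of the copied parts. The retry bound is an assumption of this sketch.
// Returns true if a consistent copy was obtained.
bool take_snapshot(const Counters& live, Snapshot& out, int max_attempts = 100) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < kCategories; ++i) {
      out.by_category[i] = live.by_category[i].load(std::memory_order_relaxed);
      sum += out.by_category[i];
    }
    out.total = live.total.load(std::memory_order_relaxed);
    if (sum == out.total) {
      return true;  // copied parts agree with the copied total
    }
  }
  return false;  // still inconsistent after max_attempts copies
}
```

A check like this can only detect a mismatch between the Total and the parts; it does not make the copy itself atomic, which is why it sits between the fully consistent and never-consistent options.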

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15306#issuecomment-1692972046

