RFR: 8365306: Provide OS Process Size and Libc statistic metrics to JFR

Thomas Stuefe stuefe at openjdk.org
Sat Aug 16 04:21:09 UTC 2025


This provides the following new metrics:
- `ProcessSize` event (new, periodic) 
  - vsize (for analyzing address-space fragmentation issues)
  - RSS including subtypes (subtypes are useful for excluding atypical issues, e.g. kernel problems that cause large file buffer bloat)
  - peak RSS 
  - process swap (if we swap we cannot trust the RSS values, plus it indicates bad sizing)
  - pte size (to quickly see if we run with a super-large working set but an unsuitably small page size)
- `LibcStatistics` (new, periodic)
  - outstanding malloc size (important counterpoint to whatever NMT tries to tell me, which alone is often misleading)
  - retained malloc size (super-important for the same reason)
  - number of libc trims HotSpot executed (needed to gauge the usefulness of the retain counter, and to see if a customer employs native heap auto trimming (`-XX:TrimNativeHeapInterval`))
- `NativeHeapTrim` (new, event-driven) (for both manual and automatic trims)
  - RSS before and RSS after
  - RSS recovered by this trim
  - whether it was an automatic or manual trim
  - duration
- `JavaThreadStatistic`
  - OS thread counter (new field) (useful to understand the behavior of third-party code in our process if threads are created that bypass the JVM, as some custom launchers do)
  - non-Java thread counter (new field) (needed to interpret the OS thread counter)
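
For readers unfamiliar with where the `ProcessSize` numbers can come from on Linux, here is a minimal sketch that parses the kB-valued fields of `/proc/self/status`. It is an illustration only, not the code in this PR: the class and method names (`ProcStatusSketch`, `parseKbFields`) are hypothetical, and the field mapping in the comments (`VmSize`, `VmRSS`, `RssAnon`/`RssFile`/`RssShmem`, `VmHWM`, `VmSwap`, `VmPTE`) is my assumption about which kernel counters correspond to the metrics listed above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public class ProcStatusSketch {

    // Parses lines of the form "Key:   <number> kB" and returns the values in bytes.
    // Lines without a "kB" unit (Name, Threads, ...) are skipped.
    static Map<String, Long> parseKbFields(String statusText) {
        Map<String, Long> bytesByKey = new LinkedHashMap<>();
        for (String line : statusText.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String[] parts = line.substring(colon + 1).trim().split("\\s+");
            if (parts.length == 2 && parts[1].equals("kB")) {
                try {
                    bytesByKey.put(line.substring(0, colon), Long.parseLong(parts[0]) * 1024L);
                } catch (NumberFormatException e) {
                    // Not a plain number; ignore this line.
                }
            }
        }
        return bytesByKey;
    }

    public static void main(String[] args) throws Exception {
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("No /proc/self/status on this platform");
            return;
        }
        Map<String, Long> fields = parseKbFields(Files.readString(status));
        // Assumed mapping: VmSize = vsize, VmRSS = RSS, RssAnon/RssFile/RssShmem = RSS subtypes,
        // VmHWM = peak RSS, VmSwap = process swap, VmPTE = page table size.
        for (String key : new String[] {"VmSize", "VmRSS", "RssAnon", "RssFile",
                                        "RssShmem", "VmHWM", "VmSwap", "VmPTE"}) {
            System.out.println(key + " = " + fields.get(key) + " bytes");
        }
    }
}
```

Fields that the running kernel does not report simply print as `null`, which mirrors why such metrics have to be treated as optional per platform.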

Notes:
- We already have a `ResidentSetSize` event, and the new `ProcessSize` event is a superset of it. I don't know how such overlap is normally handled. I'd prefer to remove the old event, but JMC has a hard-coded chart for RSS, so I kept it unless someone tells me to remove it.

- Obviously, the libc events are very platform-specific. Still, I argue that these metrics are highly useful. We want people to use JFR and JMC, and that includes developers dealing with performance problems that require platform-specific knowledge to understand. See my comment in the JBS issue.

I provided implementations, as far as possible, for Linux, macOS and Windows.

Testing:
- ran the new tests manually and as part of GHAs

-------------

Commit messages:
 - copyrights
 - Windows
 - MacOS tests
 - fix mac
 - wip
 - mac implementation
 - typo
 - Add nonjava thread count; add test for thread statistics
 - tests
 - Add test to workflow
 - ... and 11 more: https://git.openjdk.org/jdk/compare/dbae90c9...39e282ae

Changes: https://git.openjdk.org/jdk/pull/26756/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26756&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8365306
  Stats: 778 lines in 28 files changed: 643 ins; 27 del; 108 mod
  Patch: https://git.openjdk.org/jdk/pull/26756.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26756/head:pull/26756

PR: https://git.openjdk.org/jdk/pull/26756