RFR: 8365306: Provide OS Process Size and Libc statistic metrics to JFR
Erik Gahlin
egahlin at openjdk.org
Tue Aug 19 06:45:39 UTC 2025
On Wed, 13 Aug 2025 09:42:57 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> This provides the following new metrics:
> - `ProcessSize` event (new, periodic)
>   - vsize (for analyzing address-space fragmentation issues)
>   - RSS including subtypes (subtypes are useful for excluding atypical issues, e.g. kernel problems that cause large file buffer bloat)
>   - peak RSS
>   - process swap (if we swap we cannot trust the RSS values, plus it indicates bad sizing)
>   - pte size (to quickly see if we run with a super-large working set but an unsuitably small page size)
> - `LibcStatistics` (new, periodic)
>   - outstanding malloc size (important counterpoint to whatever NMT tries to tell me, which alone is often misleading)
>   - retained malloc size (super-important for the same reason)
>   - number of libc trims HotSpot executed (needed to gauge the usefulness of the retain counter, and to see if a customer employs native heap auto trimming (`-XX:TrimNativeHeapInterval`))
> - `NativeHeapTrim` (new, event-driven) (for both manual and automatic trims)
>   - RSS before and RSS after
>   - RSS recovered by this trim
>   - whether it was an automatic or manual trim
>   - duration
> - `JavaThreadStatistic`
>   - OS thread counter (new field) (useful to understand the behavior of third-party code in our process if threads are created that bypass the JVM, e.g. by some custom launchers)
>   - non-Java thread counter (new field) (needed to interpret the OS thread counter)
>
> Notes:
> - We already have a `ResidentSetSize` event, and the new `ProcessSize` event is a superset of it. I don't know how such cases are usually handled. I'd prefer to throw the old event out, but JMC has a hard-coded chart for RSS, so I kept it in unless someone tells me to remove it.
>
> - Obviously, the libc events are very platform-specific. Still, I argue that these metrics are highly useful. We want people to use JFR and JMC; people include developers that are dealing with performance problems that require platform-specific knowledge to understand. See my comment in the JBS issue.
>
> I provided implementations, as far as possible, for Linux, macOS and Windows.
>
> Testing:
> - ran the new tests manually and as part of GHAs
What is the problem with adding RSS metrics to the existing ResidentSetSize event?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26756#issuecomment-3199438538