RFR: 8203359: Container level resources events

Erik Gahlin egahlin at openjdk.java.net
Thu Mar 25 23:31:26 UTC 2021


On Wed, 24 Mar 2021 18:39:06 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

>> With this change it becomes possible to surface various cgroup level metrics (available via `jdk.internal.platform.Metrics`) as JFR events.
>> 
>> Only a subset of the metrics exposed by `jdk.internal.platform.Metrics` is turned into JFR events to start with.
>> * CPU related metrics
>> * Memory related metrics
>> * I/O related metrics
>> 
>> For each of those subsystems, configuration data will be emitted as well. The initial proposal is to emit the configuration data events at least once per chunk and the metrics values at 30-second intervals.
>> With these values the emitted events seem to contain useful information without increasing overhead (the metrics values are read from the `/proc` filesystem, so that should not be done too frequently).
>
> @jbachorik Would it make sense for `ContainerConfigurationEvent` to include the underlying cgroup version info (v1 or legacy vs. v2 or unified)? `Metrics.getProvider()` should give that info.

Does each getter call result in parsing /proc, or are things aggregated over several calls or hooks?

Do you have any data on how expensive the invocations are?

You could, for example, try to measure it by temporarily making the events durational and fetching the values between begin() and end(), and perhaps show a 'jfr print --events Container* recording.jfr' printout.
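
Something along these lines might work as a rough sketch of that measurement. The event class, its name, and the `probe` helper are hypothetical and are not the events from the PR; the getter is passed in as a `LongSupplier` so that a real metrics getter can be substituted for the stand-in used in `main`.

```java
import java.util.function.LongSupplier;

import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

public class ContainerProbe {

    // Hypothetical durational event, used only to time a single getter call.
    @Name("example.ContainerMemoryProbe")
    @Label("Container Memory Probe")
    static class MemoryProbeEvent extends Event {
        @Label("Memory Usage")
        long memoryUsage;
    }

    // Calls the getter between begin() and end(), so the recorded event
    // duration approximates the cost of that one invocation.
    static long probe(LongSupplier getter) {
        MemoryProbeEvent event = new MemoryProbeEvent();
        event.begin();
        event.memoryUsage = getter.getAsLong();
        event.end();
        event.commit(); // nothing is written unless a recording is running
        return event.memoryUsage;
    }

    public static void main(String[] args) {
        // Stand-in getter (assumption); replace with the metrics getter under test.
        long value = probe(() -> Runtime.getRuntime().totalMemory());
        System.out.println("value read: " + value);
    }
}
```

Running this with a recording enabled (for example with -XX:StartFlightRecording) and printing the resulting file with 'jfr print' should show the duration of each probe event.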

If possible, it would be interesting to get some idea of the startup cost as well.

If the overhead is not too high, I think it would be nice to skip the "flag" in the .jfcs, and always record the events in a container environment.

I know there is a way to test JFR using Docker; maybe @mseledts could provide more information? Some sanity tests would be good to have.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3126

