RFR: 7903722: JMH: Add xctrace-based perfnorm profiler for macOS [v2]
Filipp Zhinkin
fzhinkin at openjdk.org
Sat Sep 21 20:56:21 UTC 2024
> Implementation of a perfnorm-like profiler for macOS based on the `xctrace` command-line tool bundled with Xcode.
>
> While the profiler is tested and seems to work well, I consider it a preliminary version and am open to discussion on what it should measure and how.
>
> Currently, the profiler supports only PMU counter sampling using the `CPU Counters` instrument provided by the Instruments app / xctrace.
> Unfortunately, the `CPU Counters` instrument has no default settings, unlike the `Time Profiler` and `CPU Profiler` instruments used by the recently merged `xctraceasm` profiler.
> To use `CPU Counters`, a user has to create a template in the Instruments UI, select PMU events, save the template, and then supply it to `xctracenorm` as an argument.
>
> This workflow not only prevents using the profiler without preliminary manual configuration, but also becomes tedious when measuring multiple events, because xctrace, unlike perf_events, does not support event multiplexing.
>
> Thankfully, command-line-based configuration and default parameters can be emulated by building a custom Instruments package that imports data from `CPU Counters` and also supplies all required parameters.
> As you might guess, there's no way to get information about supported PMU events directly from xctrace, but it can be fetched from KPEP database files stored in `/usr/share/kpep`.
> `xctracenorm` relies on that data to validate events specified by a user, if any, and also to print a help message that gives some insights into what could be sampled.
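
A minimal sketch of the validation step described above, assuming the supported event names have already been read from a KPEP database file. `EventValidator`, `unknownEvents`, and the event names are hypothetical placeholders and are not taken from the patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not JMH code: checks user-requested events
// against the set of events a PMU is known to support.
final class EventValidator {
    private EventValidator() {}

    /** Returns the requested events that are absent from the supported set, in request order. */
    static List<String> unknownEvents(List<String> requested, Set<String> supported) {
        List<String> unknown = new ArrayList<>();
        for (String event : requested) {
            if (!supported.contains(event)) {
                unknown.add(event);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Placeholder names; a real list would come from the KPEP database.
        Set<String> supported = Set.of("CYCLES", "INSTRUCTIONS", "BRANCHES");
        List<String> requested = List.of("CYCLES", "NO_SUCH_EVENT");

        List<String> unknown = unknownEvents(requested, supported);
        if (!unknown.isEmpty()) {
            System.err.println("Unsupported events: " + unknown);
        }
    }
}
```

Reporting every unknown event at once, rather than failing on the first one, lets the same result feed a help message that lists what can be sampled.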
>
> To sum up, a few things were implemented to make the `xctracenorm` profiler work:
> - CPU model detection using `sysctl`;
> - KPEP file parsing to extract information about the PMU and all supported events;
> - selected performance events validation;
> - Instruments package building (generate XML, call a builder tool), packages are cached in `~/Library/Caches/org.openjdk.jmh`;
> - xctrace execution, resulting samples extraction, and aggregation;
> - sample postprocessing to calculate some additional metrics, like CPI and branch misprediction ratio.
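
The postprocessing step in the last item can be sketched as follows, assuming counter totals have already been aggregated over the measurement. `DerivedMetrics` and its method names are hypothetical, and the numbers in `main` are made up for illustration, not measured:

```java
// Hypothetical helper, not JMH code: derives ratio metrics
// from aggregated PMU counter totals.
final class DerivedMetrics {
    private DerivedMetrics() {}

    /** Cycles per instruction; NaN when no instructions were counted. */
    static double cpi(long cycles, long instructions) {
        return instructions == 0 ? Double.NaN : (double) cycles / instructions;
    }

    /** Fraction of branches that were mispredicted; NaN when no branches were counted. */
    static double branchMispredictionRatio(long mispredicted, long branches) {
        return branches == 0 ? Double.NaN : (double) mispredicted / branches;
    }

    public static void main(String[] args) {
        // Made-up totals, for illustration only.
        long cycles = 3_000;
        long instructions = 6_000;
        long branches = 1_000;
        long mispredicted = 25;

        System.out.printf("CPI = %.3f%n", cpi(cycles, instructions));
        System.out.printf("branch misprediction ratio = %.3f%n",
                branchMispredictionRatio(mispredicted, branches));
    }
}
```

Returning NaN for an empty denominator is one possible choice; it keeps a missing counter from being reported as a zero ratio.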
>
> Currently, if a user does not specify any additional options, `xctracenorm` samples the instructions, cycles, branches, and mispredicted branches events.
> These were selected as events that should be supported on all hardware that macOS runs on; only 4 events were selected for the same reason.
>
> Profiling results look like this on an M2-based MacBook:
>
> java -jar ./benchmarks.jar -prof xctracenorm -f 1 JMHSamp...
Filipp Zhinkin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:
- Merge branch 'master' into xctracenorm-prof
- 7903722: Use default template instead of generating a package
- 7903722: Add extra tests
- 7903722: Scan all possible KPEP file locations
- 7903722: Serialize xctrace tests execution
- 7903722: simplified code, added missing docs, supported branch events
- 7903722: Improve events preprocessing
- 7903722: Refactor KPEP database loading
- 7903722: compute AS Arm64 instructions density metrics
- 7903722: check if all listed events could be sampled simultaneously
- ... and 6 more: https://git.openjdk.org/jmh/compare/9bfcf924...5a112315
-------------
Changes:
- all: https://git.openjdk.org/jmh/pull/131/files
- new: https://git.openjdk.org/jmh/pull/131/files/ecba1544..5a112315
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jmh&pr=131&range=01
- incr: https://webrevs.openjdk.org/?repo=jmh&pr=131&range=00-01
Stats: 8596 lines in 16 files changed: 3494 ins; 5056 del; 46 mod
Patch: https://git.openjdk.org/jmh/pull/131.diff
Fetch: git fetch https://git.openjdk.org/jmh.git pull/131/head:pull/131
PR: https://git.openjdk.org/jmh/pull/131