RFR: 7903722: JMH: Add xctrace-based perfnorm profiler for macOS [v2]

Aleksey Shipilev shade at openjdk.org
Mon Sep 23 15:50:50 UTC 2024


On Sat, 21 Sep 2024 20:56:21 GMT, Filipp Zhinkin <fzhinkin at openjdk.org> wrote:

>> Implementation of a perfnorm-alike profiler for macOS based on `xctrace` command line tool bundled with Xcode.
>> 
>> While the profiler is tested and seems to be working well, I consider it rather a preliminary version and open to a discussion on what and how it should measure.
>> 
>> Currently, the profiler only supports PMU counters sampling using `CPU Counters` instrument provided by the Instruments app / xctrace.
>> Unfortunately, `CPU Counters` instrument has no default settings, unlike `Time Profiler` and `CPU Profiler` instruments used by the recently merged `xctraceasm` profiler.
>> To use `CPU Counters`, a user has to create a template in the Instruments UI, select PMU events, save the template and then supply to `xctracenorm` as an argument.
>> 
>> This workflow not only prevents use of the profiler without preliminary manual configuration, but also tends to be annoying when it comes to measuring multiple events, as xctrace, unlike perf_events, does not support events multiplexing.
>> 
>> Thankfully, command-line-based configuration and default parameters could be emulated by building a custom Instruments package that imports data from `CPU Counters` and also supplies all required parameters.
>> As you can guess, there's no way to get information about supported PMU events directly from xctrace, but it could be fetched from KPEP database files, stored in `/usr/share/kpep`.
>> `xctracenorm` relies on that data to validate events specified by a user, if any, and also to print a help message that gives some insights into what could be sampled.
>> 
>> To sum up, there are a few things that were implemented to make the `xctracenorm` profiler work:
>> - CPU model detection using `sysctl`;
>> - KPEP file parsing to extract information about the PMU and all supported events;
>> - selected performance events validation;
>> - Instruments package building (generate XML, call a builder tool), packages are cached in `~/Library/Caches/org.openjdk.jmh`;
>> - xctrace execution, resulting samples extraction, and aggregation;
>> - samples postprocessing to calculate some additional metrics, like CPI and branch misprediction ratio.
>> 
>> Currently, if a user didn't specify any additional options, `xctracenorm` will sample instructions, cycles, branches and mispredicted branches events. 
>> These were selected as events that should be supported in all hardware macOS runs on; only 4 events were selected for the same reason.
>> 
>> Profiling results look like this on M2-based MacBook:
>> 
>> j...
>
> Filipp Zhinkin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:
> 
>  - Merge branch 'master' into xctracenorm-prof
>  - 7903722: Use default template instead of generating a package
>  - 7903722: Add extra tests
>  - 7903722: Scan all possible KPEP file locations
>  - 7903722: Serialize xctrace tests execution
>  - 7903722: simplified code, added missing docs, supported branch events
>  - 7903722: Improve events preprocessing
>  - 7903722: Refactor KPEP database loading
>  - 7903722: compute AS Arm64 instructions density metrics
>  - 7903722: check if all listed events could be sampled simultaneously
>  - ... and 6 more: https://git.openjdk.org/jmh/compare/07565879...5a112315

Nice!

I tried this on my Mac, and it looks like the profiler requires super-user privileges: I only succeeded when running `sudo java ...`. Some JMH profilers supply sudo automatically; should this one do the same?
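
To illustrate what "supplying sudo automatically" could mean, here is a minimal sketch. The class and method names are hypothetical and not taken from JMH or from this PR; the only real pieces are `sudo` itself and its `-n` (non-interactive) flag, which makes sudo fail instead of prompting for a password.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JMH: prefixes a profiler command with sudo.
final class SudoCommand {
    // Returns a new list: "sudo -n" followed by the original command.
    // "-n" keeps sudo non-interactive, so a missing credential fails fast
    // instead of blocking the benchmark run on a password prompt.
    static List<String> withSudo(List<String> command) {
        List<String> result = new ArrayList<>(command.size() + 2);
        result.add("sudo");
        result.add("-n");
        result.addAll(command);
        return result;
    }

    public static void main(String[] args) {
        // The xctrace arguments here are placeholders for whatever the profiler runs.
        System.out.println(withSudo(Arrays.asList("xctrace", "record")));
    }
}
```

The input list is left untouched, so the same base command could still be tried without sudo first.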

I also have a comment about units:


Benchmark                                                                 Mode  Cnt   Score   Error                               Units
JMHSample_35_Profilers.Atomic.test                                        avgt    5   4.055 ± 0.185                               ns/op
JMHSample_35_Profilers.Atomic.test:BRANCH_MISPRED_NONSPEC                 avgt       ≈ 10⁻⁵                                        #/op
JMHSample_35_Profilers.Atomic.test:Branch miss ratio                      avgt       ≈ 10⁻⁶          BRANCH_MISPRED_NONSPEC/INST_BRANCH
JMHSample_35_Profilers.Atomic.test:CORE_ACTIVE_CYCLE                      avgt       10.541                                        #/op
JMHSample_35_Profilers.Atomic.test:CPI                                    avgt        0.351                  CORE_ACTIVE_CYCLE/INST_ALL
JMHSample_35_Profilers.Atomic.test:INST_ALL                               avgt       30.031                                        #/op
JMHSample_35_Profilers.Atomic.test:INST_BRANCH                            avgt        3.850                                        #/op
JMHSample_35_Profilers.Atomic.test:INST_BRANCH density (of instructions)  avgt        0.128                        INST_BRANCH/INST_ALL
JMHSample_35_Profilers.Atomic.test:IPC                                    avgt        2.849                  INST_ALL/CORE_ACTIVE_CYCLE


Long unit names like `CORE_ACTIVE_CYCLE/INST_ALL` are not great: they look too long. See how Linux perfnorm reports them, e.g. `clks/insns`.
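
To make the suggestion concrete, here is a rough sketch of computing the derived metrics from the raw per-op counters and labeling them with shorter units. The class name, the method names, and the short labels other than `clks/insns` are placeholders of mine, not what the PR does; the input numbers are the ones from the output above.

```java
import java.util.Locale;

// Hypothetical sketch, not the PR's code: derived metrics with short unit labels.
final class DerivedMetrics {
    // Plain ratio; returns NaN when the denominator counter was not sampled.
    static double ratio(double numerator, double denominator) {
        return denominator == 0 ? Double.NaN : numerator / denominator;
    }

    // Formats one result line: metric name, value with 3 decimals, unit label.
    static String line(String name, double value, String unit) {
        return String.format(Locale.ROOT, "%-20s %8.3f  %s", name, value, unit);
    }

    public static void main(String[] args) {
        double cycles = 10.541;   // CORE_ACTIVE_CYCLE, #/op
        double insts = 30.031;    // INST_ALL, #/op
        double branches = 3.850;  // INST_BRANCH, #/op

        System.out.println(line("CPI", ratio(cycles, insts), "clks/insns"));
        System.out.println(line("IPC", ratio(insts, cycles), "insns/clks"));
        // "branches/insns" is an assumed label, chosen only to match the style above.
        System.out.println(line("Branch density", ratio(branches, insts), "branches/insns"));
    }
}
```

With these inputs the ratios agree with the CPI, IPC and branch density values in the table, so only the unit column would change.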

Other code nits:

jmh-core-it/src/test/java/org/openjdk/jmh/it/profilers/XCTraceNormProfilerTest.java line 171:

> 169: 
> 170:         RunResult result;
> 171:         File templateFile = FileUtils.extractFromResource("/XCTraceNormTestTemplate.xml");

Is there any meaningful difference between this template file and the one we ship in `jmh-core/src/main/resources/default.instruments.template.xml`? Can we use that one instead?

jmh-core/src/main/java/org/openjdk/jmh/profile/XCTraceNormProfiler.java line 40:

> 38: 
> 39: /**
> 40:  * macOS permnorm profiler based on xctrace utility shipped with Xcode Instruments.

`permnorm` -> `perfnorm`

jmh-core/src/main/java/org/openjdk/jmh/profile/XCTraceNormProfiler.java line 153:

> 151:                         "\" was not found in the trace results."));
> 152:         if (tableDesc.getPmcEvents().isEmpty() && tableDesc.getTriggerType() == XCTraceTableHandler.TriggerType.TIME) {
> 153:             throw new IllegalStateException("Results does not contain any events.");

`ProfilerException`?
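
A sketch of what I mean, under assumptions: the `ProfilerException` class below is a local stand-in so the snippet compiles on its own (the real one lives in `org.openjdk.jmh.profile`), and `EventCheck.requireEvents` is a hypothetical method, not code from this PR. Whether the surrounding method can actually declare a checked exception is not visible from this excerpt.

```java
import java.util.Collections;
import java.util.List;

// Stand-in for org.openjdk.jmh.profile.ProfilerException, defined here only
// so that the sketch is self-contained.
class ProfilerException extends Exception {
    ProfilerException(String message) {
        super(message);
    }
}

// Hypothetical validation step mirroring the condition in the quoted code.
final class EventCheck {
    static void requireEvents(List<String> pmcEvents, boolean timeTriggered)
            throws ProfilerException {
        // Time-triggered sampling with no PMC events leaves nothing to report.
        if (pmcEvents.isEmpty() && timeTriggered) {
            throw new ProfilerException("Results do not contain any events.");
        }
    }

    public static void main(String[] args) {
        try {
            requireEvents(Collections.<String>emptyList(), true);
        } catch (ProfilerException e) {
            System.out.println("Profiler failure: " + e.getMessage());
        }
    }
}
```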

jmh-core/src/main/java/org/openjdk/jmh/profile/XCTraceSupport.java line 82:

> 80: 
> 81:     /**
> 82:      * Returns absolute path to xctrace executable or throws ProfilerException if it does not exist..

`..` -> `.`?

-------------

PR Review: https://git.openjdk.org/jmh/pull/131#pullrequestreview-2322271469
PR Review Comment: https://git.openjdk.org/jmh/pull/131#discussion_r1771476566
PR Review Comment: https://git.openjdk.org/jmh/pull/131#discussion_r1771459667
PR Review Comment: https://git.openjdk.org/jmh/pull/131#discussion_r1771465251
PR Review Comment: https://git.openjdk.org/jmh/pull/131#discussion_r1771472098

