RFR: 7903740: JMH: Perf event validation not working with skid options [v6]

Galder Zamarreño galder at openjdk.org
Fri Oct 11 05:28:22 UTC 2024


On Tue, 24 Sep 2024 18:38:32 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision:
>> 
>>  - Add -q to avoid additional perf record messages
>>  - Pipe through perf report --stats instead
>>  - Merge branch 'master' into topic.validate-perf-event-without-modifier
>>  - Check all events with perf report in a single command
>>  - Remove previous approach
>>  - Use trim to remove any empty spaces or carriage returns
>>  - Do not make it quiet otherwise there's no output
>>  - Try perf report from specific file
>>  - Remove old approaches
>>  - Explicitly define an output file for perf record validation
>>  - ... and 5 more: https://git.openjdk.org/jmh/compare/ac84fa21...069447e1
>
> That is to say, I am willing to accept the simple patch like:
> 
> 
> diff --git a/jmh-core/src/main/java/org/openjdk/jmh/profile/LinuxPerfAsmProfiler.java b/jmh-core/src/main/java/org/openjdk/jmh/profile/LinuxPerfAsmProfiler.java
> index c948532f..69d74a86 100644
> --- a/jmh-core/src/main/java/org/openjdk/jmh/profile/LinuxPerfAsmProfiler.java
> +++ b/jmh-core/src/main/java/org/openjdk/jmh/profile/LinuxPerfAsmProfiler.java
> @@ -46,7 +46,7 @@ public class LinuxPerfAsmProfiler extends AbstractPerfAsmProfiler {
>      public LinuxPerfAsmProfiler(String initLine) throws ProfilerException {
>          super(initLine, "cycles");
>  
> -        String[] senseCmd = { PerfSupport.PERF_EXEC, "stat", "--event", Utils.join(requestedEventNames, ","), "--log-fd", "2", "echo", "1" };
> +        String[] senseCmd = { PerfSupport.PERF_EXEC, "record", "-q", "--event", Utils.join(requestedEventNames, ","), "-o", "-", "echo", "1"};
>  
>          Collection<String> failMsg = Utils.tryWith(senseCmd);
>          if (!failMsg.isEmpty()) {
> 
> 
> ...and everything else needs _a very hard justification and a lot of testing_ to make sure it does not break in the cases it is supposed to work. The fact the PR code fails on my first actual try does not inspire confidence, to be honest.
> 
> Remember: the capabilities check in constructor is an opportunistic check, it can pass by accident, and users would discover their profiler does not really work later when looking at the results. The check should _NOT_ fail by accident, when the profiler would actually work, if not for a misbehaving capability check.

@shipilev The CI on GitHub Actions fails with that patch:


Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 18.797 sec <<< FAILURE! - in org.openjdk.jmh.it.profilers.LinuxPerfAsmProfilerTest
test(org.openjdk.jmh.it.profilers.LinuxPerfAsmProfilerTest)  Time elapsed: 18.66 sec  <<< ERROR!
java.lang.IllegalStateException: Profile does not contain the required frame: PrintAssembly processed: 7 total address lines.
Perf output processed (skipped 6.249 seconds):
 Column 1: cycles (0 events)

WARNING: No hottest code region above the threshold (10.00%) for disassembly.
Use "hotThreshold" profiler option to lower the filter threshold.

....[Hottest Regions]...............................................................................
....................................................................................................
          <totals>

....[Hottest Methods (after inlining)]..............................................................
....................................................................................................
          <totals>

....[Distribution by Source]........................................................................
....................................................................................................
          <totals>

WARNING: The perf event count is suspiciously low (0). The performance data might be
inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
With some profilers on Mac OS X, System Integrity Protection (SIP) may prevent profiling.
In such case, temporarily disabling SIP with 'csrutil disable' might help.

	at org.openjdk.jmh.it.profilers.LinuxPerfAsmProfilerTest.test(LinuxPerfAsmProfilerTest.java:61)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)


Do you have any suggestions on how to address this? This is what I've been battling with throughout this patch: finding a check that lets the test pass both on GitHub Actions CI and locally. As you can see, it is not easy to find something that works in all environments.
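
For reference, here is a minimal standalone sketch of the kind of opportunistic check being discussed. It is not the JMH code: `SenseCheck`, `senseCommand` and `tryRun` are hypothetical names, and `tryRun` only approximates what `Utils.tryWith` is used for in the quoted patch. The `perf record` arguments are the ones from that patch. Note that a zero exit code only tells us that perf accepted the event names; as the CI log above shows, the resulting profile can still contain 0 events.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of JMH.
final class SenseCheck {

    // Builds the probe command from the quoted patch:
    //   perf record -q --event <e1,e2,...> -o - echo 1
    static List<String> senseCommand(String perfExec, List<String> events) {
        return new ArrayList<>(Arrays.asList(
                perfExec, "record", "-q",
                "--event", String.join(",", events),
                "-o", "-", "echo", "1"));
    }

    // Runs the command and returns failure messages; an empty list means
    // the command started and exited with code 0. This assumes Java 9+
    // for ProcessBuilder.Redirect.DISCARD.
    static List<String> tryRun(List<String> cmd) {
        List<String> failures = new ArrayList<>();
        try {
            Process p = new ProcessBuilder(cmd)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            int code = p.waitFor();
            if (code != 0) {
                failures.add("exit code " + code);
            }
        } catch (IOException e) {
            // Typically: the executable is missing or not runnable.
            failures.add(e.toString());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            failures.add("interrupted while waiting for the probe");
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> cmd = senseCommand("perf", Arrays.asList("cycles"));
        System.out.println("Probe:    " + String.join(" ", cmd));
        System.out.println("Failures: " + tryRun(cmd));
    }
}
```

A probe like this can only answer "does perf accept these events here?"; it cannot tell whether samples will actually be collected during the benchmark run, which is the part that differs between CI and my local machine.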

-------------

PR Comment: https://git.openjdk.org/jmh/pull/132#issuecomment-2406577953

