Isolating setup of StackWalker test call stack in JMH?
Sergey Kuksenko
sergey.kuksenko at oracle.com
Wed Aug 16 04:52:00 UTC 2023
Invoking nanoTime twice per operation, you will get large nanoTime overhead.
https://shipilev.net/blog/2014/nanotrusting-nanotime/
Not sure that this overhead is smaller than the building up of the call stack (at some degree).
By the way, have you looked how Mode.SampleTime works. In this case, JMH measures the time only of SOME operations (not all of them). It's too expensive to measure the time of each operation.
May I ask you - why do you need to move stack construction into "setup"?
________________________________________
From: jmh-jdk-microbenchmarks-dev <jmh-jdk-microbenchmarks-dev-retn at openjdk.org> on behalf of Brent Christian <brent.christian at oracle.com>
Sent: Tuesday, August 15, 2023 4:53 PM
To: jmh-dev at openjdk.org; jmh-jdk-microbenchmarks-dev at openjdk.org
Subject: Isolating setup of StackWalker test call stack in JMH?
Greetings, JMH users and enthusiasts.
The JDK microbenchmarks include benchmarks to measure the performance of java.lang.StackWalker operations. A 'depth' @Param is used to test various call stack depths. See:
https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/lang/StackWalkBench.java
The TestStack nested class (L69) builds up 'n' method calls, and then run()s the SW method under test once n=depth calls are on the call stack. The issue is that this building up of the call stack is "setup" code, but is executed during the benchmark, influencing the scoring. In particular, it skews the StackWalker.getCallerClass() results, which should perform the same regardless of stack depth. E.g.:
Benchmark (depth) Mode Cnt Score Error Units
StackWalkBench.getCallerClass 4 avgt 15 445.778 ± 8.920 ns/op
StackWalkBench.getCallerClass 100 avgt 15 496.285 ± 6.631 ns/op
StackWalkBench.getCallerClass 1000 avgt 15 1563.743 ± 7.601 ns/op
Because of the unique needs of setting up a StackWalker call at an artificial stack depth, the setup doesn't seem like something that could be moved into a @Setup method. I looked through the JMH samples, and the only promising thing I found was @AuxCounter. Maybe I could add an AuxCounter to separately report *just* the duration spent in the StackWalker call? (Not ideal, but perhaps it's all there is.)
So I gave that a try. The changes are here:
https://github.com/bchristi-git/jdk/compare/master...bchristi-git:jdk:swBench
However the results are not what I expected:
Benchmark (depth) Mode Cnt Score Error Units
StackWalkBench.getCallerClass 4 avgt 10 983.959 ± 10.895 ns/op
StackWalkBench.getCallerClass:benchNanoDur 4 avgt 10 1073151.585 ± 123551.526 ns/op
StackWalkBench.getCallerClass 100 avgt 10 1063.878 ± 16.790 ns/op
StackWalkBench.getCallerClass:benchNanoDur 100 avgt 10 1085369.870 ± 87001.613 ns/op
StackWalkBench.getCallerClass 1000 avgt 10 2143.571 ± 17.697 ns/op
StackWalkBench.getCallerClass:benchNanoDur 1000 avgt 10 868953.256 ± 237607.736 ns/op
The reported benchNanoDur score is much greater than I anticipated.
I think don't have a full grasp of how AuxCounters are calculated or meant to be used. I'd appreciate any hints on what I might be doing wrong. :)
(Or, perhaps, other ideas on how JMH could separate out the call-stack-building-setup for this benchmark, if such a thing is possible.)
Much appreciated,
-Brent
More information about the jmh-dev
mailing list