No assembly output for perfasm
Kristoffer Sjögren
stoffe at gmail.com
Mon Oct 26 21:20:11 UTC 2015
No worries, thanks for tip and the fix :-)
On Mon, Oct 26, 2015 at 6:54 PM, Aleksey Shipilev
<aleksey.shipilev at oracle.com> wrote:
> Hi Kristoffer,
>
> Sorry for this nuisance, we should have printed a message about that.
> Should be fixed now:
> https://bugs.openjdk.java.net/browse/CODETOOLS-7901539
>
> Thanks,
> -Aleksey
>
> P.S. Please note that if the code region falls below a certain threshold
> in the profile, there is probably no actual bottleneck there, and you
> need to look into the profile much more carefully.
>
> On 10/26/2015 01:00 AM, Kristoffer Sjögren wrote:
>> I figured it out. The hotThreshold was too high for the test to
>> generate the detailed assembly output.
>>
>> Took me a while to figure out how to provide the options, but here it is.
>>
>> java -jar target/benchmarks.jar Decode.legacy -prof perfasm:hotThreshold=0.01
>>
>> Cheers,
>> -Kristoffer
>>
>> On Sun, Oct 25, 2015 at 9:51 PM, Kristoffer Sjögren <stoffe at gmail.com> wrote:
>>> Hi
>>>
>>> For some reason the assembly output is not showing for my tests when
>>> running perfasm [1] with JMH 1.11.1 and perf version 3.16.0 on Linux
>>> Debian Jessie.
>>>
>>> I know PrintAssembly works because I can generate a log which is
>>> perfectly parsable by JitWatch. I can also see the assembly output
>>> from JMH by adding the -v EXTRA flag.
>>>
>>> $ java -jar target/benchmarks.jar Decode.legacy
>>> -jvmArgs="-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading
>>> -XX:PrintAssemblyOptions=intel -XX:+LogCompilation -XX:+PrintAssembly
>>> -XX:LogFile=hotspot.log"
>>>
>>> Any ideas why jmh won't show me the assembly?
>>>
>>> Cheers,
>>> -Kristoffer
>>>
>>> [1] Output
>>>
>>> $ java -jar target/benchmarks.jar Decode.legacy -wi 20 -i 20 -r 1 -f 1
>>> -prof perfasm
>>> # JMH 1.11 (released 44 days ago)
>>> # VM version: JDK 1.8.0_45, VM 25.45-b02
>>> # VM invoker: /home/stoffe/Downloads/jdk1.8.0_45/jre/bin/java
>>> # VM options: <none>
>>> # Warmup: 20 iterations, 1 s each
>>> # Measurement: 20 iterations, 1 s each
>>> # Timeout: 10 min per iteration
>>> # Threads: 1 thread, will synchronize iterations
>>> # Benchmark mode: Throughput, ops/time
>>> # Benchmark: deephacks.Decode.legacy
>>>
>>> # Run progress: 0.00% complete, ETA 00:00:40
>>> # Fork: 1 of 1
>>> # Preparing profilers: LinuxPerfAsmProfiler
>>> # Profilers consume stdout and stderr from target VM, use -v EXTRA to
>>> copy to console
>>> # Warmup Iteration 1: 0.513 ops/s
>>> # Warmup Iteration 2: 75.762 ops/s
>>> # Warmup Iteration 3: 166.888 ops/s
>>> # Warmup Iteration 4: 368.548 ops/s
>>> # Warmup Iteration 5: 607.144 ops/s
>>> # Warmup Iteration 6: 190.322 ops/s
>>> # Warmup Iteration 7: 1159.049 ops/s
>>> # Warmup Iteration 8: 1220.974 ops/s
>>> # Warmup Iteration 9: 2412.146 ops/s
>>> # Warmup Iteration 10: 4171.080 ops/s
>>> # Warmup Iteration 11: 4590.409 ops/s
>>> # Warmup Iteration 12: 8118.074 ops/s
>>> # Warmup Iteration 13: 7864.901 ops/s
>>> # Warmup Iteration 14: 7912.240 ops/s
>>> # Warmup Iteration 15: 8886.517 ops/s
>>> # Warmup Iteration 16: 8362.501 ops/s
>>> # Warmup Iteration 17: 7436.626 ops/s
>>> # Warmup Iteration 18: 7584.446 ops/s
>>> # Warmup Iteration 19: 9531.862 ops/s
>>> # Warmup Iteration 20: 9701.975 ops/s
>>> Iteration 1: 7798.191 ops/s
>>> Iteration 2: 8768.801 ops/s
>>> Iteration 3: 9651.449 ops/s
>>> Iteration 4: 8383.285 ops/s
>>> Iteration 5: 10428.639 ops/s
>>> Iteration 6: 8958.372 ops/s
>>> Iteration 7: 8548.640 ops/s
>>> Iteration 8: 9676.202 ops/s
>>> Iteration 9: 8443.678 ops/s
>>> Iteration 10: 8489.997 ops/s
>>> Iteration 11: 9493.343 ops/s
>>> Iteration 12: 9008.206 ops/s
>>> Iteration 13: 8520.892 ops/s
>>> Iteration 14: 9240.624 ops/s
>>> Iteration 15: 9874.695 ops/s
>>> Iteration 16: 8955.088 ops/s
>>> Iteration 17: 9308.933 ops/s
>>> Iteration 18: 8513.425 ops/s
>>> Iteration 19: 9385.654 ops/s
>>> Iteration 20: 8237.902 ops/s
>>> # Processing profiler results: LinuxPerfAsmProfiler
>>>
>>>
>>> Result "legacy":
>>> 8984.301 ±(99.9%) 561.016 ops/s [Average]
>>> (min, avg, max) = (7798.191, 8984.301, 10428.639), stdev = 646.066
>>> CI (99.9%): [8423.285, 9545.316] (assumes normal distribution)
>>>
>>> Secondary result "·asm":
>>> PrintAssembly processed: 417977 total address lines.
>>> Perf output processed (skipped 24.372 seconds):
>>> Column 1: cycles (28022 events)
>>> Column 2: instructions (27858 events)
>>>
>>> ....[Hottest Regions]...............................................................................
>>> 5.41% 8.44% [0x7f1c1530b1e6:0x7f1c1530b62e] in
>>> deephacks.Bin1Decoder::decodeStream
>>> 4.35% 7.29% [0x7f1c281b8760:0x7f1c281b88d8] in inflate_fast (libzip.so)
>>> 3.65% 7.04% [0x7f1c15338732:0x7f1c153389d6] in deephacks.Base64::decode
>>> 2.07% 3.55% [0x7f1c1530b00c:0x7f1c1530b110] in
>>> deephacks.Bin1Decoder::decodeStream
>>> 1.80% 2.00% [0x7f1c1545ab86:0x7f1c1545af2e] in
>>> deephacks.legacy.LegacyDecoder::decodeToBin1
>>> 1.74% 2.44% [0x7f1c15459ab5:0x7f1c15459f1c] in
>>> deephacks.legacy.LegacyDecoder::decodeToBin1
>>> 1.68% 2.49% [0x7f1c15366907:0x7f1c15366b94] in deephacks.Crumbs::getCrumb
>>> 1.32% 1.69% [0x7f1c15314820:0x7f1c15314983] in java.util.HashMap::resize
>>> 1.23% 1.47% [0x7f1c281be31d:0x7f1c281be4e8] in adler32 (libzip.so)
>>> 1.01% 0.71% [0x7f1c29d2ee60:0x7f1c29d2ee69] in SpinPause (libjvm.so)
>>> 0.96% 1.40% [0x7f1c15389360:0x7f1c15389731] in java.math.BigInteger::or
>>> 0.91% 0.36% [0x7f1c281b9100:0x7f1c281b915a] in inflateEnd (libzip.so)
>>> 0.86% 0.29% [0x7f1c15361a60:0x7f1c15361b2c] in
>>> org.joda.time.field.PreciseDateTimeField::set
>>> 0.85% 0.49% [0x7f1c2a584aa0:0x7f1c2a584bfa] in [unknown] (libc-2.19.so)
>>> 0.78% 1.43% [0x7f1c1530b9a9:0x7f1c1530bbf6] in
>>> deephacks.Bin1Decoder::decodeStream
>>> 0.75% 0.14% [0x7f1c2a588010:0x7f1c2a5880cd] in malloc (libc-2.19.so)
>>> 0.71% 0.78% [0x7f1c1530bbe0:0x7f1c1530bdea] in
>>> deephacks.Bin1Decoder::decodeStream
>>> 0.71% 0.18% [0x7f1c2a588980:0x7f1c2a588a81] in __libc_calloc
>>> (libc-2.19.so)
>>> 0.69% 0.31% [0x7f1c15051dbe:0x7f1c15051e9d] in
>>> StubRoutines::jshort_disjoint_arraycopy
>>> 0.69% 0.84% [0x7f1c15412022:0x7f1c1541230a] in
>>> deephacks.legacy.LegacyDecoder::serializedCrumbsFromHash
>>> 67.78% 56.66% <...other 1877 warm regions...>
>>> ....................................................................................................
>>> 99.91% 99.99% <totals>
>>>
>>> ....[Hottest Methods (after
>>> inlining)]..............................................................
>>> 9.39% 14.53% deephacks.Bin1Decoder::decodeStream
>>> 7.39% 6.82% deephacks.legacy.LegacyDecoder::decodeToBin1
>>> 5.34% 8.38% inflate_fast (libzip.so)
>>> 5.09% 3.35% [unknown] (libc-2.19.so)
>>> 3.96% 7.43% deephacks.Base64::decode
>>> 3.61% 3.58% org.joda.time.format.DateTimeParserBucket::computeMillis
>>> 3.37% 4.07% deephacks.Crumbs::getCrumb
>>> 2.20% 2.25% deephacks.legacy.LegacyDecoder::serializedCrumbsFromHash
>>> 1.77% 2.10% deephacks.crumbs.FrequencyCountingCrumb::deserializeFromBin1
>>> 1.77% 1.20% inflate (libzip.so)
>>> 1.53% 1.95% java.util.HashMap::resize
>>> 1.51% 1.56% adler32 (libzip.so)
>>> 1.37% 1.34% java.lang.ref.Finalizer$FinalizerThread::run
>>> 1.35% 1.14% java.util.AbstractCollection::addAll
>>> 1.33% 1.45% deephacks.Crumbs::cacheCrumb
>>> 1.26% 1.12% deephacks.Crumbs::getAllCrumbs
>>> 1.17% 1.12% org.joda.time.format.DateTimeFormatter::parseDateTime
>>> 1.15% 1.66% java.math.BigInteger::or
>>> 1.10% 0.85% deephacks.crumbs.TrackingCrumb::deserializeFromBin1
>>> 1.02% 0.73% deephacks.Bin1Helper::getStringFromHash
>>> 43.25% 32.71% <...other 587 warm methods...>
>>> ....................................................................................................
>>> 99.91% 99.36% <totals>
>>>
>>> ....[Distribution by
>>> Area]..........................................................................
>>> 63.39% 68.97% <generated code>
>>> 13.38% 10.99% <native code in (libjvm.so)>
>>> 10.57% 12.05% <native code in (libzip.so)>
>>> 7.25% 4.33% <native code in (libc-2.19.so)>
>>> 4.64% 3.14% <native code in ([kernel.kallsyms])>
>>> 0.19% 0.19% <native code in (libpthread-2.19.so)>
>>> 0.15% 0.10% <native code in (perf-5881.map)>
>>> 0.12% 0.06% <native code in (jbd2.ko)>
>>> 0.09% 0.05% <native code in (hsdis-amd64.so)>
>>> 0.08% 0.08% <native code in (ext4.ko)>
>>> 0.01% 0.00% <native code in (uhci-hcd.ko)>
>>> 0.01% 0.01% <native code in (hid.ko)>
>>> 0.01% <native code in (nouveau.ko)>
>>> 0.01% <native code in (usbhid.ko)>
>>> 0.00% <native code in (ehci-hcd.ko)>
>>> 0.00% <native code in (libjava.so)>
>>> 0.00% <native code in (libnet.so)>
>>> 0.00% <native code in (sky2.ko)>
>>> ....................................................................................................
>>> 99.91% 99.99% <totals>
>>>
>>>
>>>
>>> # Run complete. Total time: 00:00:59
>>>
>>> Benchmark Mode Cnt Score Error Units
>>> Decode.legacy thrpt 20 8984.301 ± 561.016 ops/s
>>> Decode.legacy:·asm thrpt NaN ---
>
>
More information about the jmh-dev
mailing list