The mix of native and java code in "-prof perfasm"
Sergey Bylokhov
sergey.bylokhov at oracle.com
Thu Nov 12 16:31:07 UTC 2015
Hello Everybody.
Probably this was discussed already, but I fails to find an answer to my question.
I have a benchmark which spend most of the time in the native library.
If I run it with "-prof perfasm" but without "-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly" I got correct result [1].
But if I add -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly then information about the native code is missing [2]:
Is it a bug or I missed some additional options? Thanks.
The benchmark:
import org.openjdk.jmh.annotations.*;
import java.awt.*;
import java.awt.image.BufferedImage;
import java.util.concurrent.TimeUnit;
@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@Warmup(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 3, time = 2, timeUnit = TimeUnit.SECONDS)
public class SoftwareBlitLoopsItself {
@Param({"2"})
public int dsttype;
@Param({"5000"})
public int size;
@Param({"SrcOver"})
public String comp;
private BufferedImage dst;
private Graphics2D g;
@Setup
public void setup() {
dst = new BufferedImage(size, size, dsttype);
g = dst.createGraphics();
if (comp.equals("src")) {
g.setComposite(AlphaComposite.Src);
} else {
g.setComposite(AlphaComposite.SrcOver);
}
}
@Benchmark
public void drawItself() {
g.drawImage(dst, 0, 0, null);
}
}
================================================
================================================
================================================
[1] Correct results:
PrintAssembly processed: 160704 total address lines.
Perf output processed (skipped 5,178 seconds):
Column 1: cycles (6356 events)
Column 2: instructions (6506 events)
Hottest code regions (>10,00% "cycles" events):
....[Hottest Region 1]..............................................................................
[0x7f912876c920:0x7f912876c936] in IntArgbToIntArgbSrcOverMaskBlit (libawt.so)
<no assembly is recorded, native region>
....................................................................................................
58,02% 55,81% <total for region 1>
....[Hottest Region 2]..............................................................................
[0x7f912876ca2f:0x7f912876ca61] in IntArgbToIntArgbSrcOverMaskBlit (libawt.so)
<no assembly is recorded, native region>
....................................................................................................
38,51% 40,30% <total for region 2>
....[Hottest Regions]...............................................................................
58,02% 55,81% [0x7f912876c920:0x7f912876c936] in IntArgbToIntArgbSrcOverMaskBlit (libawt.so)
38,51% 40,30% [0x7f912876ca2f:0x7f912876ca61] in IntArgbToIntArgbSrcOverMaskBlit (libawt.so)
0,52% 0,49% [0xffffffff8105b83a:0xffffffff8105b83a] in native_write_msr_safe ([kernel.kallsyms])
0,11% 0,05% [0xffffffff817b43f0:0xffffffff817b43f0] in mutex_unlock ([kernel.kallsyms])
0,09% 0,05% [0xffffffff810d31c1:0xffffffff810d31f8] in __srcu_read_lock; __srcu_read_unlock ([kernel.kallsyms])
0,09% 0,03% [0xffffffff817b5f7c:0xffffffff817b5f94] in _raw_spin_lock_irqsave ([kernel.kallsyms])
0,08% [0xffffffff813bceb2:0xffffffff813bceb2] in ioread32 ([kernel.kallsyms])
0,08% 0,05% [0xffffffff817b6610:0xffffffff817b661c] in system_call; system_call_after_swapgs ([kernel.kallsyms])
0,06% [0xffffffff811ee96c:0xffffffff811ee96c] in __sb_start_write ([kernel.kallsyms])
0,05% [0xffffffff817b60fe:0xffffffff817b60fe] in _raw_read_lock ([kernel.kallsyms])
0,05% 0,09% [0x7f91618c9519:0x7f917cca5000] in sun.misc.Unsafe::compareAndSwapInt
0,05% 0,02% [0x7f917c876e20:0x7f917c876e3f] in __pthread_enable_asynccancel (libpthread-2.19.so)
0,05% [0x7f917c876e97:0x7f917c876ea2] in __pthread_disable_asynccancel (libpthread-2.19.so)
0,03% 0,06% [0xffffffff81091150:0xffffffff81091181] in __task_pid_nr_ns ([kernel.kallsyms])
0,03% [0xffffffff811ec244:0xffffffff811ec244] in vfs_write ([kernel.kallsyms])
0,03% 0,03% [0xffffffff811f49a0:0xffffffff811f49b1] in pipe_write ([kernel.kallsyms])
0,03% 0,06% [0xffffffff81209441:0xffffffff8120947a] in __fget ([kernel.kallsyms])
0,03% 0,02% [0xffffffff813ccfbe:0xffffffff813ccfc4] in __percpu_counter_add ([kernel.kallsyms])
0,03% [0xffffffffc055190a:0xffffffffc055191b] in nvif_notify ([kernel.kallsyms])
0,03% [0x7f916186c060:0x7f916186c269] in <unresolved>
2,00% 2,94% <...other 228 warm regions...>
....................................................................................................
99,98% 99,98% <totals>
....[Hottest Methods (after inlining)]..............................................................
96,54% 96,11% IntArgbToIntArgbSrcOverMaskBlit (libawt.so)
0,74% 0,95% sun.misc.Unsafe::compareAndSwapInt
0,52% 0,49% native_write_msr_safe ([kernel.kallsyms])
0,11% 0,05% mutex_unlock ([kernel.kallsyms])
0,09% 0,05% __srcu_read_lock; __srcu_read_unlock ([kernel.kallsyms])
0,09% 0,03% _raw_spin_lock_irqsave ([kernel.kallsyms])
0,08% 0,06% pipe_write ([kernel.kallsyms])
0,08% ioread32 ([kernel.kallsyms])
0,08% 0,05% system_call; system_call_after_swapgs ([kernel.kallsyms])
0,06% __sb_start_write ([kernel.kallsyms])
0,06% 0,17% vfprintf (libc-2.19.so)
0,05% nvif_notify ([kernel.kallsyms])
0,05% __pthread_disable_asynccancel (libpthread-2.19.so)
0,05% _raw_read_lock ([kernel.kallsyms])
0,05% 0,06% _IO_fflush (libc-2.19.so)
0,05% 0,02% do_get_write_access ([kernel.kallsyms])
0,05% 0,02% __pthread_enable_asynccancel (libpthread-2.19.so)
0,05% 0,06% __task_pid_nr_ns ([kernel.kallsyms])
0,03% 0,02% fsnotify ([kernel.kallsyms])
0,03% 0,03% __percpu_counter_add ([kernel.kallsyms])
1,13% 0,69% <...other 64 warm methods...>
....................................................................................................
99,98% 98,85% <totals>
....[Distribution by Area]..........................................................................
96,55% 96,11% <native code in (libawt.so)>
2,17% 2,06% <native code in ([kernel.kallsyms])>
0,87% 1,00% <generated code>
0,27% 0,77% <native code in (libc-2.19.so)>
0,13% 0,05% <native code in (libpthread-2.19.so)>
....................................................................................................
99,98% 99,98% <totals>
# Run complete. Total time: 00:00:16
================================================
================================================
================================================
Incorrect results [2]:
# Processing profiler results: LinuxPerfAsmProfiler
Result "drawItself":
35,329 ops/s
Secondary result "·asm":
PrintAssembly processed: 159135 total address lines.
Perf output processed (skipped 5,124 seconds):
Column 1: cycles (6485 events)
Column 2: instructions (6600 events)
Hottest code regions (>10,00% "cycles" events):
....[Hottest Region 1]..............................................................................
[0x7:0x7fda31b84000] in <unresolved>
<region is too big to display, has 287783 lines, but threshold is 1000>
....................................................................................................
57,52% 55,12% <total for region 1>
....[Hottest Region 2]..............................................................................
[0x7:0x7fda31b84000] in <unresolved>
<region is too big to display, has 287783 lines, but threshold is 1000>
....................................................................................................
38,61% 41,17% <total for region 2>
....[Hottest Regions]...............................................................................
57,52% 55,12% [0x7:0x7fda31b84000] in <unresolved>
38,61% 41,17% [0x7:0x7fda31b84000] in <unresolved>
0,39% 0,36% [0xffffffff8105b83a:0xffffffff8105b83a] in native_write_msr_safe ([kernel.kallsyms])
0,14% 0,02% [0xffffffff817b6610:0xffffffff817b6636] in system_call; system_call_after_swapgs ([kernel.kallsyms])
0,11% 0,06% [0xffffffff817b43e0:0xffffffff817b43f0] in mutex_unlock ([kernel.kallsyms])
0,08% 0,11% [0xffffffff810d31cd:0xffffffff810d31ec] in __srcu_read_lock; __srcu_read_unlock ([kernel.kallsyms])
0,08% [0xffffffff813bceb2:0xffffffff813bceb2] in ioread32 ([kernel.kallsyms])
0,08% 0,06% [0x7:0x7fda31b84000] in sun.reflect.ClassFileAssembler::cpi
0,06% 0,03% [0xffffffff817b5f7c:0xffffffff817b5f94] in _raw_spin_lock_irqsave ([kernel.kallsyms])
0,06% [0x7:0x7fda31b84000] in sun.reflect.ClassFileAssembler::cpi
0,05% 0,03% [0xffffffff811ee8e8:0xffffffff811ee8f6] in __sb_end_write ([kernel.kallsyms])
0,05% 0,02% [0xffffffff81218357:0xffffffff81218357] in __mark_inode_dirty ([kernel.kallsyms])
0,05% [0xffffffff812b6ef3:0xffffffff812b6f00] in jbd2_journal_stop ([kernel.kallsyms])
0,05% [0x7fda1587d130:0x7fda1587d19a] in <stub: method entry point (kind = zerolocals)>
0,03% 0,03% [0xffffffff811ee96c:0xffffffff811ee96c] in __sb_start_write ([kernel.kallsyms])
0,03% [0xffffffff812c1884:0xffffffff812c18a0] in jbd2_journal_put_journal_head ([kernel.kallsyms])
0,03% 0,03% [0xffffffff813ccfa0:0xffffffff813ccfbe] in __percpu_counter_add ([kernel.kallsyms])
0,03% [0xffffffff817b60fe:0xffffffff817b60fe] in _raw_read_lock ([kernel.kallsyms])
0,03% [0x7:0x7fda31b84000] in sun.reflect.ClassFileAssembler::cpi
0,03% [0x7:0x7fda31b84000] in sun.reflect.ClassFileAssembler::cpi
2,50% 2,97% <...other 256 warm regions...>
....................................................................................................
99,98% 100,00% <totals>
....[Hottest Methods (after inlining)]..............................................................
96,16% 96,29% <unresolved>
1,70% 1,77% sun.reflect.ClassFileAssembler::cpi
0,39% 0,36% native_write_msr_safe ([kernel.kallsyms])
0,14% 0,02% system_call; system_call_after_swapgs ([kernel.kallsyms])
0,11% 0,06% mutex_unlock ([kernel.kallsyms])
0,08% 0,11% __srcu_read_lock; __srcu_read_unlock ([kernel.kallsyms])
0,08% ioread32 ([kernel.kallsyms])
0,06% 0,03% _raw_spin_lock_irqsave ([kernel.kallsyms])
0,06% <stub: method entry point (kind = zerolocals)>
0,05% 0,03% __percpu_counter_add ([kernel.kallsyms])
0,05% 0,03% __sb_end_write ([kernel.kallsyms])
0,05% 0,02% __mark_inode_dirty ([kernel.kallsyms])
0,05% jbd2_journal_stop ([kernel.kallsyms])
0,03% __block_commit_write.isra.20 ([kernel.kallsyms])
0,03% 0,09% pipe_write ([kernel.kallsyms])
0,03% _raw_read_lock ([kernel.kallsyms])
0,03% _raw_spin_lock ([kernel.kallsyms])
0,03% 0,06% __sb_start_write ([kernel.kallsyms])
0,03% ext4_mark_iloc_dirty ([kernel.kallsyms])
0,03% mutex_optimistic_spin ([kernel.kallsyms])
0,82% 0,32% <...other 52 warm methods...>
....................................................................................................
99,98% 99,18% <totals>
....[Distribution by Area]..........................................................................
98,00% 98,08% <generated code>
1,97% 1,92% <native code in ([kernel.kallsyms])>
0,02% <native code in (perf-21675.map)>
....................................................................................................
99,98% 100,00% <totals>
More information about the jmh-dev
mailing list