DTrace asm profiler for Mac OS X
Vsevolod Tolstopyatov
qwwdfsad at gmail.com
Mon Jan 22 08:58:03 UTC 2018
I do.
Small checklist:
1) Have you installed hsdis properly? (instruction:
http://psy-lob-saw.blogspot.ru/2013/01/java-print-assembly.html)
2) Do you see any generated code if you run JMH-benchmark in verbose mode
with dtraceasm? (option "-v EXTRA")
--
Best regards,
Tolstopyatov Vsevolod
On Mon, Jan 22, 2018 at 3:32 AM, Henri Tremblay <henri.tremblay at gmail.com>
wrote:
> Do you have Java code matching memory addresses?
>
> Because I know get
>
> ERROR: No address lines detected in assembly capture, make sure your JDK
> is PrintAssembly-enabled:
> https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>
> Perf output processed (skipped 2.585 seconds):
> Column 1: sampled_pc (10804 events)
>
> Hottest code regions (>10.00% "sampled_pc" events):
>
> ....[Hottest Region 1]............................
> ..................................................
> 0x110543f36, [unknown] (584 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 22.08% <total for region 1>
>
> ....[Hottest Region 2]............................
> ..................................................
> 0x11055a27b, [unknown] (369 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 21.54% <total for region 2>
>
> ....[Hottest Region 3]............................
> ..................................................
> 0x110569b7f, [unknown] (650 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 12.44% <total for region 3>
>
> ....[Hottest Region 4]............................
> ..................................................
> 0x11055a140, [unknown] (64 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 12.10% <total for region 4>
>
> ....[Hottest Region 5]............................
> ..................................................
> 0x110543e00, [unknown] (52 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 12.01% <total for region 5>
>
> ....[Hottest Regions]....................................................
> ...........................
> 22.08% 0x110543f36 [unknown] (584 bytes)
> 21.54% 0x11055a27b [unknown] (369 bytes)
> 12.44% 0x110569b7f [unknown] (650 bytes)
> 12.10% 0x11055a140 [unknown] (64 bytes)
> 12.01% 0x110543e00 [unknown] (52 bytes)
> 6.30% 0x11055a540 [unknown] (217 bytes)
> 5.89% 0x11055a666 [unknown] (361 bytes)
> 3.46% 0x11055a805 [unknown] (85 bytes)
> 1.99% 0x11053da80 [unknown] (71 bytes)
> 1.49% 0x1105441d1 [unknown] (72 bytes)
> 0.22% 0x110543ea3 [unknown] (7 bytes)
> 0.12% 0x110569f09 [unknown] (16 bytes)
> 0.11% 0x11055a1d8 [unknown] (2 bytes)
> 0.08% 0x11055a63c [unknown] (9 bytes)
> 0.06% 0x11055a9ee [unknown] (5 bytes)
> 0.01% libjvm.dylib Assembler::locate_operand(unsigned
> char*, Assembler::WhichOperand) (0 bytes)
> 0.01% libjvm.dylib LIR_OpVisitState::append(LIR_OprDesc*&,
> LIR_OpVisitState::OprMode) (0 bytes)
> 0.01% libjvm.dylib GraphBuilder::push_scope(ciMethod*,
> BlockBegin*) (0 bytes)
> 0.01% libjvm.dylib LIR_Assembler::reg2stack(LIR_OprDesc*,
> LIR_OprDesc*, BasicType, bool) (0 bytes)
> 0.01% libjvm.dylib NullCheckEliminator::visit(Instruction**)
> (0 bytes)
> 0.05% <...other 5 warm regions...>
> ............................................................
> ........................................
> 99.99% <totals>
>
> ....[Hottest Methods (after inlining)]....................
> ..........................................
> 22.08% 0x110543f36 [unknown]
> 21.54% 0x11055a27b [unknown]
> 12.44% 0x110569b7f [unknown]
> 12.10% 0x11055a140 [unknown]
> 12.01% 0x110543e00 [unknown]
> 6.30% 0x11055a540 [unknown]
> 5.89% 0x11055a666 [unknown]
> 3.46% 0x11055a805 [unknown]
> 1.99% 0x11053da80 [unknown]
> 1.49% 0x1105441d1 [unknown]
> 0.22% 0x110543ea3 [unknown]
> 0.12% 0x110569f09 [unknown]
> 0.11% 0x11055a1d8 [unknown]
> 0.08% 0x11055a63c [unknown]
> 0.06% 0x11055a9ee [unknown]
> 0.01% libjvm.dylib Assembler::locate_operand(unsigned
> char*, Assembler::WhichOperand)
> 0.01% libjvm.dylib NullCheckEliminator::visit(
> Instruction**)
> 0.01% libjvm.dylib LIR_OpVisitState::append(LIR_OprDesc*&,
> LIR_OpVisitState::OprMode)
> 0.01% libjvm.dylib GrowableArray<Metadata*>::append(Metadata*
> const&)
> 0.01% 0x1102dd079 [unknown]
> 0.05% <...other 5 warm methods...>
> ............................................................
> ........................................
> 99.99% <totals>
>
> ....[Distribution by Source].......................
> .................................................
> 22.08% 0x110543f36
> 21.54% 0x11055a27b
> 12.44% 0x110569b7f
> 12.10% 0x11055a140
> 12.01% 0x110543e00
> 6.30% 0x11055a540
> 5.89% 0x11055a666
> 3.46% 0x11055a805
> 1.99% 0x11053da80
> 1.49% 0x1105441d1
> 0.22% 0x110543ea3
> 0.12% 0x110569f09
> 0.11% 0x11055a1d8
> 0.08% 0x11055a63c
> 0.06% libjvm.dylib
> 0.06% 0x11055a9ee
> 0.01% 0x1102dd079
> 0.01% libsystem_kernel.dylib
> 0.01% libsystem_c.dylib
> ............................................................
> ........................................
> 99.99% <totals>
>
> On 20 January 2018 at 14:55, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
> wrote:
>
>> >Then, it still not found lines. I thought JMH was adding the correct
>> PrintAssembly flags when forking. But the issue seems to be similar with
>> the warning I had for perfasm.
>> Yes, that's your PrintAssembly-related issues.
>>
>> Just ran ConcurrentMapThroughput with dtraceasm on latest JCTools,
>> everything seems fine
>>
>> --
>> Best regards,
>> Tolstopyatov Vsevolod
>>
>> On Wed, Jan 17, 2018 at 3:58 PM, Henri Tremblay <henri.tremblay at gmail.com
>> > wrote:
>>
>>> Well found! It seems much better.
>>>
>>> Then, it still not found lines. I thought JMH was adding the correct
>>> PrintAssembly flags when forking. But the issue seems to be similar with
>>> the warning I had for perfasm.
>>>
>>> ERROR: No address lines detected in assembly capture, make sure your JDK
>>> is PrintAssembly-enabled:
>>> https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>>>
>>> Perf output processed (skipped 2.531 seconds):
>>> Column 1: sampled_pc (11033 events)
>>>
>>> Hottest code regions (>10.00% "sampled_pc" events):
>>>
>>> ....[Hottest Region 1]............................
>>> ..................................................
>>> [unknown], [unknown] (231 bytes)
>>>
>>> <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>> 22.54% <total for region 1>
>>>
>>> ....[Hottest Region 2]............................
>>> ..................................................
>>> [unknown], [unknown] (134 bytes)
>>>
>>> <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>> 11.31% <total for region 2>
>>>
>>> ....[Hottest Region 3]............................
>>> ..................................................
>>> [unknown], [unknown] (41 bytes)
>>>
>>> <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>> 10.39% <total for region 3>
>>>
>>> ....[Hottest Region 4]............................
>>> ..................................................
>>> [unknown], [unknown] (417 bytes)
>>>
>>> <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>> 10.27% <total for region 4>
>>>
>>> ....[Hottest Regions]......................
>>> .........................................................
>>> 22.54% [unknown] [unknown] (231 bytes)
>>> 11.31% [unknown] [unknown] (134 bytes)
>>> 10.39% [unknown] [unknown] (41 bytes)
>>> 10.27% [unknown] [unknown] (417 bytes)
>>> 8.45% [unknown] [unknown] (396 bytes)
>>> 5.71% [unknown] [unknown] (43 bytes)
>>> 5.66% [unknown] [unknown] (55 bytes)
>>> 5.45% [unknown] [unknown] (84 bytes)
>>> 4.17% [unknown] [unknown] (209 bytes)
>>> 2.47% [unknown] [unknown] (71 bytes)
>>> 2.12% [unknown] [unknown] (28 bytes)
>>> 2.04% [unknown] [unknown] (28 bytes)
>>> 1.63% [unknown] [unknown] (170 bytes)
>>> 0.95% [unknown] [unknown] (86 bytes)
>>> 0.89% [unknown] [unknown] (5 bytes)
>>> 0.83% [unknown] [unknown] (127 bytes)
>>> 0.81% [unknown] [unknown] (0 bytes)
>>> 0.55% [unknown] [unknown] (46 bytes)
>>> 0.44% [unknown] [unknown] (71 bytes)
>>> 0.41% [unknown] [unknown] (257 bytes)
>>> 2.91% <...other 73 warm regions...>
>>> ............................................................
>>> ........................................
>>> 99.99% <totals>
>>>
>>> ....[Hottest Methods (after inlining)]....................
>>> ..........................................
>>> 99.99% [unknown] [unknown]
>>> ............................................................
>>> ........................................
>>> 99.99% <totals>
>>>
>>> ....[Distribution by Source].......................
>>> .................................................
>>> 99.99% [unknown]
>>> ............................................................
>>> ........................................
>>> 99.99% <totals>
>>>
>>>
>>>
>>> # Run complete. Total time: 00:00:18
>>>
>>> Benchmark (implementation)
>>> (readRatio) (tableSize) Mode Cnt Score Error Units
>>> ConcurrentMapThroughput.randomGetPutRemove NonBlockingHashMap
>>> 50 100000 thrpt 6 22470630.498 ± 3730613.043 ops/s
>>> ConcurrentMapThroughput.randomGetPutRemove:·asm NonBlockingHashMap
>>> 50 100000 thrpt NaN ---
>>> ConcurrentMapThroughput.randomGetPutRemove ConcurrentHashMap
>>> 50 100000 thrpt 6 23411000.106 ± 1960758.197 ops/s
>>> ConcurrentMapThroughput.randomGetPutRemove:·asm ConcurrentHashMap
>>> 50 100000 thrpt NaN ---
>>>
>>> On 17 January 2018 at 07:20, Aleksey Shipilev <shade at redhat.com> wrote:
>>>
>>>> On 01/14/2018 04:24 PM, Vsevolod Tolstopyatov wrote:
>>>> > I have limited access to different versions of Mac OS X, but it seems
>>>> that in some minor updates
>>>> > DTrace works with SIP enabled.
>>>> > So as solution I'd suggest to check SIP status on profiler start (via
>>>> "csrutil status") and print
>>>> > warning if it's enabled or just clarify it in javadoc. It's up to
>>>> Alexey to decide what approach is
>>>> > preferable in JMH
>>>>
>>>> So, perf* profiler print warning messages when then number of samples
>>>> is suspiciously low. I think
>>>> dtraceasm profiler should do the same, and clearly say what the user is
>>>> supposed to do:
>>>>
>>>> WARNING: The perf event count is suspiciously low (" + sum + "). The
>>>> performance data might be
>>>> inaccurate or misleading. Try to do the profiling again, or tune up
>>>> the sampling frequency.
>>>> With some profilers on Mac OS X, System Integrity Protection (SIP) may
>>>> prevent profiling.
>>>> In such case, temporarily disabling SIP with 'csrutil disable' might
>>>> help.
>>>>
>>>> I'll add this to the patch myself.
>>>>
>>>> I can push this without my own testing, hoping that external people
>>>> validated this. OpenJDK rules
>>>> require the patch to be hosted on OpenJDK infra to get accepted, so,
>>>> Vsevolod, can you please post
>>>> the recent version of the patch inline here?
>>>>
>>>> Thanks
>>>> -Aleksey
>>>>
>>>>
>>>>
>>>
>>
>
More information about the jmh-dev
mailing list