DTrace asm profiler for Mac OS X
Henri Tremblay
henri.tremblay at gmail.com
Mon Jan 22 00:32:48 UTC 2018
Do you have Java code matching memory addresses?
Because I know get
ERROR: No address lines detected in assembly capture, make sure your JDK is
PrintAssembly-enabled:
https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
Perf output processed (skipped 2.585 seconds):
Column 1: sampled_pc (10804 events)
Hottest code regions (>10.00% "sampled_pc" events):
....[Hottest Region
1]..............................................................................
0x110543f36, [unknown] (584 bytes)
<no assembly is recorded, native region>
....................................................................................................
22.08% <total for region 1>
....[Hottest Region
2]..............................................................................
0x11055a27b, [unknown] (369 bytes)
<no assembly is recorded, native region>
....................................................................................................
21.54% <total for region 2>
....[Hottest Region
3]..............................................................................
0x110569b7f, [unknown] (650 bytes)
<no assembly is recorded, native region>
....................................................................................................
12.44% <total for region 3>
....[Hottest Region
4]..............................................................................
0x11055a140, [unknown] (64 bytes)
<no assembly is recorded, native region>
....................................................................................................
12.10% <total for region 4>
....[Hottest Region
5]..............................................................................
0x110543e00, [unknown] (52 bytes)
<no assembly is recorded, native region>
....................................................................................................
12.01% <total for region 5>
....[Hottest
Regions]...............................................................................
22.08% 0x110543f36 [unknown] (584 bytes)
21.54% 0x11055a27b [unknown] (369 bytes)
12.44% 0x110569b7f [unknown] (650 bytes)
12.10% 0x11055a140 [unknown] (64 bytes)
12.01% 0x110543e00 [unknown] (52 bytes)
6.30% 0x11055a540 [unknown] (217 bytes)
5.89% 0x11055a666 [unknown] (361 bytes)
3.46% 0x11055a805 [unknown] (85 bytes)
1.99% 0x11053da80 [unknown] (71 bytes)
1.49% 0x1105441d1 [unknown] (72 bytes)
0.22% 0x110543ea3 [unknown] (7 bytes)
0.12% 0x110569f09 [unknown] (16 bytes)
0.11% 0x11055a1d8 [unknown] (2 bytes)
0.08% 0x11055a63c [unknown] (9 bytes)
0.06% 0x11055a9ee [unknown] (5 bytes)
0.01% libjvm.dylib Assembler::locate_operand(unsigned char*,
Assembler::WhichOperand) (0 bytes)
0.01% libjvm.dylib LIR_OpVisitState::append(LIR_OprDesc*&,
LIR_OpVisitState::OprMode) (0 bytes)
0.01% libjvm.dylib GraphBuilder::push_scope(ciMethod*,
BlockBegin*) (0 bytes)
0.01% libjvm.dylib LIR_Assembler::reg2stack(LIR_OprDesc*,
LIR_OprDesc*, BasicType, bool) (0 bytes)
0.01% libjvm.dylib NullCheckEliminator::visit(Instruction**)
(0 bytes)
0.05% <...other 5 warm regions...>
....................................................................................................
99.99% <totals>
....[Hottest Methods (after
inlining)]..............................................................
22.08% 0x110543f36 [unknown]
21.54% 0x11055a27b [unknown]
12.44% 0x110569b7f [unknown]
12.10% 0x11055a140 [unknown]
12.01% 0x110543e00 [unknown]
6.30% 0x11055a540 [unknown]
5.89% 0x11055a666 [unknown]
3.46% 0x11055a805 [unknown]
1.99% 0x11053da80 [unknown]
1.49% 0x1105441d1 [unknown]
0.22% 0x110543ea3 [unknown]
0.12% 0x110569f09 [unknown]
0.11% 0x11055a1d8 [unknown]
0.08% 0x11055a63c [unknown]
0.06% 0x11055a9ee [unknown]
0.01% libjvm.dylib Assembler::locate_operand(unsigned char*,
Assembler::WhichOperand)
0.01% libjvm.dylib NullCheckEliminator::visit(Instruction**)
0.01% libjvm.dylib LIR_OpVisitState::append(LIR_OprDesc*&,
LIR_OpVisitState::OprMode)
0.01% libjvm.dylib
GrowableArray<Metadata*>::append(Metadata* const&)
0.01% 0x1102dd079 [unknown]
0.05% <...other 5 warm methods...>
....................................................................................................
99.99% <totals>
....[Distribution by
Source]........................................................................
22.08% 0x110543f36
21.54% 0x11055a27b
12.44% 0x110569b7f
12.10% 0x11055a140
12.01% 0x110543e00
6.30% 0x11055a540
5.89% 0x11055a666
3.46% 0x11055a805
1.99% 0x11053da80
1.49% 0x1105441d1
0.22% 0x110543ea3
0.12% 0x110569f09
0.11% 0x11055a1d8
0.08% 0x11055a63c
0.06% libjvm.dylib
0.06% 0x11055a9ee
0.01% 0x1102dd079
0.01% libsystem_kernel.dylib
0.01% libsystem_c.dylib
....................................................................................................
99.99% <totals>
On 20 January 2018 at 14:55, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
wrote:
> >Then, it still not found lines. I thought JMH was adding the correct
> PrintAssembly flags when forking. But the issue seems to be similar with
> the warning I had for perfasm.
> Yes, that's your PrintAssembly-related issues.
>
> Just ran ConcurrentMapThroughput with dtraceasm on latest JCTools,
> everything seems fine
>
> --
> Best regards,
> Tolstopyatov Vsevolod
>
> On Wed, Jan 17, 2018 at 3:58 PM, Henri Tremblay <henri.tremblay at gmail.com>
> wrote:
>
>> Well found! It seems much better.
>>
>> Then, it still not found lines. I thought JMH was adding the correct
>> PrintAssembly flags when forking. But the issue seems to be similar with
>> the warning I had for perfasm.
>>
>> ERROR: No address lines detected in assembly capture, make sure your JDK
>> is PrintAssembly-enabled:
>> https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>>
>> Perf output processed (skipped 2.531 seconds):
>> Column 1: sampled_pc (11033 events)
>>
>> Hottest code regions (>10.00% "sampled_pc" events):
>>
>> ....[Hottest Region 1]............................
>> ..................................................
>> [unknown], [unknown] (231 bytes)
>>
>> <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>> 22.54% <total for region 1>
>>
>> ....[Hottest Region 2]............................
>> ..................................................
>> [unknown], [unknown] (134 bytes)
>>
>> <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>> 11.31% <total for region 2>
>>
>> ....[Hottest Region 3]............................
>> ..................................................
>> [unknown], [unknown] (41 bytes)
>>
>> <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>> 10.39% <total for region 3>
>>
>> ....[Hottest Region 4]............................
>> ..................................................
>> [unknown], [unknown] (417 bytes)
>>
>> <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>> 10.27% <total for region 4>
>>
>> ....[Hottest Regions]....................................................
>> ...........................
>> 22.54% [unknown] [unknown] (231 bytes)
>> 11.31% [unknown] [unknown] (134 bytes)
>> 10.39% [unknown] [unknown] (41 bytes)
>> 10.27% [unknown] [unknown] (417 bytes)
>> 8.45% [unknown] [unknown] (396 bytes)
>> 5.71% [unknown] [unknown] (43 bytes)
>> 5.66% [unknown] [unknown] (55 bytes)
>> 5.45% [unknown] [unknown] (84 bytes)
>> 4.17% [unknown] [unknown] (209 bytes)
>> 2.47% [unknown] [unknown] (71 bytes)
>> 2.12% [unknown] [unknown] (28 bytes)
>> 2.04% [unknown] [unknown] (28 bytes)
>> 1.63% [unknown] [unknown] (170 bytes)
>> 0.95% [unknown] [unknown] (86 bytes)
>> 0.89% [unknown] [unknown] (5 bytes)
>> 0.83% [unknown] [unknown] (127 bytes)
>> 0.81% [unknown] [unknown] (0 bytes)
>> 0.55% [unknown] [unknown] (46 bytes)
>> 0.44% [unknown] [unknown] (71 bytes)
>> 0.41% [unknown] [unknown] (257 bytes)
>> 2.91% <...other 73 warm regions...>
>> ............................................................
>> ........................................
>> 99.99% <totals>
>>
>> ....[Hottest Methods (after inlining)]....................
>> ..........................................
>> 99.99% [unknown] [unknown]
>> ............................................................
>> ........................................
>> 99.99% <totals>
>>
>> ....[Distribution by Source].......................
>> .................................................
>> 99.99% [unknown]
>> ............................................................
>> ........................................
>> 99.99% <totals>
>>
>>
>>
>> # Run complete. Total time: 00:00:18
>>
>> Benchmark (implementation)
>> (readRatio) (tableSize) Mode Cnt Score Error Units
>> ConcurrentMapThroughput.randomGetPutRemove NonBlockingHashMap
>> 50 100000 thrpt 6 22470630.498 ± 3730613.043 ops/s
>> ConcurrentMapThroughput.randomGetPutRemove:·asm NonBlockingHashMap
>> 50 100000 thrpt NaN ---
>> ConcurrentMapThroughput.randomGetPutRemove ConcurrentHashMap
>> 50 100000 thrpt 6 23411000.106 ± 1960758.197 ops/s
>> ConcurrentMapThroughput.randomGetPutRemove:·asm ConcurrentHashMap
>> 50 100000 thrpt NaN ---
>>
>> On 17 January 2018 at 07:20, Aleksey Shipilev <shade at redhat.com> wrote:
>>
>>> On 01/14/2018 04:24 PM, Vsevolod Tolstopyatov wrote:
>>> > I have limited access to different versions of Mac OS X, but it seems
>>> that in some minor updates
>>> > DTrace works with SIP enabled.
>>> > So as solution I'd suggest to check SIP status on profiler start (via
>>> "csrutil status") and print
>>> > warning if it's enabled or just clarify it in javadoc. It's up to
>>> Alexey to decide what approach is
>>> > preferable in JMH
>>>
>>> So, perf* profiler print warning messages when then number of samples is
>>> suspiciously low. I think
>>> dtraceasm profiler should do the same, and clearly say what the user is
>>> supposed to do:
>>>
>>> WARNING: The perf event count is suspiciously low (" + sum + "). The
>>> performance data might be
>>> inaccurate or misleading. Try to do the profiling again, or tune up the
>>> sampling frequency.
>>> With some profilers on Mac OS X, System Integrity Protection (SIP) may
>>> prevent profiling.
>>> In such case, temporarily disabling SIP with 'csrutil disable' might
>>> help.
>>>
>>> I'll add this to the patch myself.
>>>
>>> I can push this without my own testing, hoping that external people
>>> validated this. OpenJDK rules
>>> require the patch to be hosted on OpenJDK infra to get accepted, so,
>>> Vsevolod, can you please post
>>> the recent version of the patch inline here?
>>>
>>> Thanks
>>> -Aleksey
>>>
>>>
>>>
>>
>
More information about the jmh-dev
mailing list