DTrace asm profiler for Mac OS X

Vsevolod Tolstopyatov qwwdfsad at gmail.com
Mon Jan 22 08:58:03 UTC 2018


I do.

Small checklist:
1) Have you installed hsdis properly? (instruction:
http://psy-lob-saw.blogspot.ru/2013/01/java-print-assembly.html)
2) Do you see any generated code if you run JMH-benchmark in verbose mode
with dtraceasm? (option "-v EXTRA")



--
Best regards,
Tolstopyatov Vsevolod

On Mon, Jan 22, 2018 at 3:32 AM, Henri Tremblay <henri.tremblay at gmail.com>
wrote:

> Do you have Java code matching memory addresses?
>
> Because I know get
>
> ERROR: No address lines detected in assembly capture, make sure your JDK
> is PrintAssembly-enabled:
>     https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>
> Perf output processed (skipped 2.585 seconds):
>  Column 1: sampled_pc (10804 events)
>
> Hottest code regions (>10.00% "sampled_pc" events):
>
> ....[Hottest Region 1]............................
> ..................................................
>  0x110543f36, [unknown] (584 bytes)
>
>  <no assembly is recorded, native region>
> ............................................................
> ........................................
>  22.08%  <total for region 1>
>
> ....[Hottest Region 2]............................
> ..................................................
>  0x11055a27b, [unknown] (369 bytes)
>
>  <no assembly is recorded, native region>
> ............................................................
> ........................................
>  21.54%  <total for region 2>
>
> ....[Hottest Region 3]............................
> ..................................................
>  0x110569b7f, [unknown] (650 bytes)
>
>  <no assembly is recorded, native region>
> ............................................................
> ........................................
>  12.44%  <total for region 3>
>
> ....[Hottest Region 4]............................
> ..................................................
>  0x11055a140, [unknown] (64 bytes)
>
>  <no assembly is recorded, native region>
> ............................................................
> ........................................
>  12.10%  <total for region 4>
>
> ....[Hottest Region 5]............................
> ..................................................
>  0x110543e00, [unknown] (52 bytes)
>
>  <no assembly is recorded, native region>
> ............................................................
> ........................................
>  12.01%  <total for region 5>
>
> ....[Hottest Regions]....................................................
> ...........................
>  22.08%              0x110543f36  [unknown] (584 bytes)
>  21.54%              0x11055a27b  [unknown] (369 bytes)
>  12.44%              0x110569b7f  [unknown] (650 bytes)
>  12.10%              0x11055a140  [unknown] (64 bytes)
>  12.01%              0x110543e00  [unknown] (52 bytes)
>   6.30%              0x11055a540  [unknown] (217 bytes)
>   5.89%              0x11055a666  [unknown] (361 bytes)
>   3.46%              0x11055a805  [unknown] (85 bytes)
>   1.99%              0x11053da80  [unknown] (71 bytes)
>   1.49%              0x1105441d1  [unknown] (72 bytes)
>   0.22%              0x110543ea3  [unknown] (7 bytes)
>   0.12%              0x110569f09  [unknown] (16 bytes)
>   0.11%              0x11055a1d8  [unknown] (2 bytes)
>   0.08%              0x11055a63c  [unknown] (9 bytes)
>   0.06%              0x11055a9ee  [unknown] (5 bytes)
>   0.01%             libjvm.dylib  Assembler::locate_operand(unsigned
> char*, Assembler::WhichOperand) (0 bytes)
>   0.01%             libjvm.dylib  LIR_OpVisitState::append(LIR_OprDesc*&,
> LIR_OpVisitState::OprMode) (0 bytes)
>   0.01%             libjvm.dylib  GraphBuilder::push_scope(ciMethod*,
> BlockBegin*) (0 bytes)
>   0.01%             libjvm.dylib  LIR_Assembler::reg2stack(LIR_OprDesc*,
> LIR_OprDesc*, BasicType, bool) (0 bytes)
>   0.01%             libjvm.dylib  NullCheckEliminator::visit(Instruction**)
> (0 bytes)
>   0.05%  <...other 5 warm regions...>
> ............................................................
> ........................................
>  99.99%  <totals>
>
> ....[Hottest Methods (after inlining)]....................
> ..........................................
>  22.08%              0x110543f36  [unknown]
>  21.54%              0x11055a27b  [unknown]
>  12.44%              0x110569b7f  [unknown]
>  12.10%              0x11055a140  [unknown]
>  12.01%              0x110543e00  [unknown]
>   6.30%              0x11055a540  [unknown]
>   5.89%              0x11055a666  [unknown]
>   3.46%              0x11055a805  [unknown]
>   1.99%              0x11053da80  [unknown]
>   1.49%              0x1105441d1  [unknown]
>   0.22%              0x110543ea3  [unknown]
>   0.12%              0x110569f09  [unknown]
>   0.11%              0x11055a1d8  [unknown]
>   0.08%              0x11055a63c  [unknown]
>   0.06%              0x11055a9ee  [unknown]
>   0.01%             libjvm.dylib  Assembler::locate_operand(unsigned
> char*, Assembler::WhichOperand)
>   0.01%             libjvm.dylib  NullCheckEliminator::visit(
> Instruction**)
>   0.01%             libjvm.dylib  LIR_OpVisitState::append(LIR_OprDesc*&,
> LIR_OpVisitState::OprMode)
>   0.01%             libjvm.dylib  GrowableArray<Metadata*>::append(Metadata*
> const&)
>   0.01%              0x1102dd079  [unknown]
>   0.05%  <...other 5 warm methods...>
> ............................................................
> ........................................
>  99.99%  <totals>
>
> ....[Distribution by Source].......................
> .................................................
>  22.08%              0x110543f36
>  21.54%              0x11055a27b
>  12.44%              0x110569b7f
>  12.10%              0x11055a140
>  12.01%              0x110543e00
>   6.30%              0x11055a540
>   5.89%              0x11055a666
>   3.46%              0x11055a805
>   1.99%              0x11053da80
>   1.49%              0x1105441d1
>   0.22%              0x110543ea3
>   0.12%              0x110569f09
>   0.11%              0x11055a1d8
>   0.08%              0x11055a63c
>   0.06%             libjvm.dylib
>   0.06%              0x11055a9ee
>   0.01%              0x1102dd079
>   0.01%   libsystem_kernel.dylib
>   0.01%        libsystem_c.dylib
> ............................................................
> ........................................
>  99.99%  <totals>
>
> On 20 January 2018 at 14:55, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
> wrote:
>
>> >Then, it still not found lines. I thought JMH was adding the correct
>> PrintAssembly flags when forking. But the issue seems to be similar with
>> the warning I had for perfasm.
>> Yes, that's your PrintAssembly-related issues.
>>
>> Just ran ConcurrentMapThroughput with dtraceasm on latest JCTools,
>> everything seems fine
>>
>> --
>> Best regards,
>> Tolstopyatov Vsevolod
>>
>> On Wed, Jan 17, 2018 at 3:58 PM, Henri Tremblay <henri.tremblay at gmail.com
>> > wrote:
>>
>>> Well found! It seems much better.
>>>
>>> Then, it still not found lines. I thought JMH was adding the correct
>>> PrintAssembly flags when forking. But the issue seems to be similar with
>>> the warning I had for perfasm.
>>>
>>> ERROR: No address lines detected in assembly capture, make sure your JDK
>>> is PrintAssembly-enabled:
>>>     https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>>>
>>> Perf output processed (skipped 2.531 seconds):
>>>  Column 1: sampled_pc (11033 events)
>>>
>>> Hottest code regions (>10.00% "sampled_pc" events):
>>>
>>> ....[Hottest Region 1]............................
>>> ..................................................
>>> [unknown], [unknown] (231 bytes)
>>>
>>>  <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>>  22.54%  <total for region 1>
>>>
>>> ....[Hottest Region 2]............................
>>> ..................................................
>>> [unknown], [unknown] (134 bytes)
>>>
>>>  <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>>  11.31%  <total for region 2>
>>>
>>> ....[Hottest Region 3]............................
>>> ..................................................
>>> [unknown], [unknown] (41 bytes)
>>>
>>>  <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>>  10.39%  <total for region 3>
>>>
>>> ....[Hottest Region 4]............................
>>> ..................................................
>>> [unknown], [unknown] (417 bytes)
>>>
>>>  <no assembly is recorded, native region>
>>> ............................................................
>>> ........................................
>>>  10.27%  <total for region 4>
>>>
>>> ....[Hottest Regions]......................
>>> .........................................................
>>>  22.54%  [unknown]  [unknown] (231 bytes)
>>>  11.31%  [unknown]  [unknown] (134 bytes)
>>>  10.39%  [unknown]  [unknown] (41 bytes)
>>>  10.27%  [unknown]  [unknown] (417 bytes)
>>>   8.45%  [unknown]  [unknown] (396 bytes)
>>>   5.71%  [unknown]  [unknown] (43 bytes)
>>>   5.66%  [unknown]  [unknown] (55 bytes)
>>>   5.45%  [unknown]  [unknown] (84 bytes)
>>>   4.17%  [unknown]  [unknown] (209 bytes)
>>>   2.47%  [unknown]  [unknown] (71 bytes)
>>>   2.12%  [unknown]  [unknown] (28 bytes)
>>>   2.04%  [unknown]  [unknown] (28 bytes)
>>>   1.63%  [unknown]  [unknown] (170 bytes)
>>>   0.95%  [unknown]  [unknown] (86 bytes)
>>>   0.89%  [unknown]  [unknown] (5 bytes)
>>>   0.83%  [unknown]  [unknown] (127 bytes)
>>>   0.81%  [unknown]  [unknown] (0 bytes)
>>>   0.55%  [unknown]  [unknown] (46 bytes)
>>>   0.44%  [unknown]  [unknown] (71 bytes)
>>>   0.41%  [unknown]  [unknown] (257 bytes)
>>>   2.91%  <...other 73 warm regions...>
>>> ............................................................
>>> ........................................
>>>  99.99%  <totals>
>>>
>>> ....[Hottest Methods (after inlining)]....................
>>> ..........................................
>>>  99.99%  [unknown]  [unknown]
>>> ............................................................
>>> ........................................
>>>  99.99%  <totals>
>>>
>>> ....[Distribution by Source].......................
>>> .................................................
>>>  99.99%  [unknown]
>>> ............................................................
>>> ........................................
>>>  99.99%  <totals>
>>>
>>>
>>>
>>> # Run complete. Total time: 00:00:18
>>>
>>> Benchmark                                          (implementation)
>>> (readRatio)  (tableSize)   Mode  Cnt         Score         Error  Units
>>> ConcurrentMapThroughput.randomGetPutRemove       NonBlockingHashMap
>>>        50       100000  thrpt    6  22470630.498 ± 3730613.043  ops/s
>>> ConcurrentMapThroughput.randomGetPutRemove:·asm  NonBlockingHashMap
>>>        50       100000  thrpt                NaN                  ---
>>> ConcurrentMapThroughput.randomGetPutRemove        ConcurrentHashMap
>>>        50       100000  thrpt    6  23411000.106 ± 1960758.197  ops/s
>>> ConcurrentMapThroughput.randomGetPutRemove:·asm   ConcurrentHashMap
>>>        50       100000  thrpt                NaN                  ---
>>>
>>> On 17 January 2018 at 07:20, Aleksey Shipilev <shade at redhat.com> wrote:
>>>
>>>> On 01/14/2018 04:24 PM, Vsevolod Tolstopyatov wrote:
>>>> > I have limited access to different versions of Mac OS X, but it seems
>>>> that in some minor updates
>>>> > DTrace works with SIP enabled.
>>>> > So as solution I'd suggest to check SIP status on profiler start (via
>>>> "csrutil status") and print
>>>> > warning if it's enabled or just clarify it in javadoc. It's up to
>>>> Alexey to decide what approach is
>>>> > preferable in JMH
>>>>
>>>> So, perf* profiler print warning messages when then number of samples
>>>> is suspiciously low. I think
>>>> dtraceasm profiler should do the same, and clearly say what the user is
>>>> supposed to do:
>>>>
>>>>  WARNING: The perf event count is suspiciously low (" + sum + "). The
>>>> performance data might be
>>>>  inaccurate or misleading. Try to do the profiling again, or tune up
>>>> the sampling frequency.
>>>>  With some profilers on Mac OS X, System Integrity Protection (SIP) may
>>>> prevent profiling.
>>>>  In such case, temporarily disabling SIP with 'csrutil disable' might
>>>> help.
>>>>
>>>> I'll add this to the patch myself.
>>>>
>>>> I can push this without my own testing, hoping that external people
>>>> validated this. OpenJDK rules
>>>> require the patch to be hosted on OpenJDK infra to get accepted, so,
>>>> Vsevolod, can you please post
>>>> the recent version of the patch inline here?
>>>>
>>>> Thanks
>>>> -Aleksey
>>>>
>>>>
>>>>
>>>
>>
>


More information about the jmh-dev mailing list