DTrace asm profiler for Mac OS X

Henri Tremblay henri.tremblay at gmail.com
Mon Jan 22 00:32:48 UTC 2018


Do you have Java code matching memory addresses?

Because I know get

ERROR: No address lines detected in assembly capture, make sure your JDK is
PrintAssembly-enabled:
    https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly

Perf output processed (skipped 2.585 seconds):
 Column 1: sampled_pc (10804 events)

Hottest code regions (>10.00% "sampled_pc" events):

....[Hottest Region
1]..............................................................................
 0x110543f36, [unknown] (584 bytes)

 <no assembly is recorded, native region>
....................................................................................................
 22.08%  <total for region 1>

....[Hottest Region
2]..............................................................................
 0x11055a27b, [unknown] (369 bytes)

 <no assembly is recorded, native region>
....................................................................................................
 21.54%  <total for region 2>

....[Hottest Region
3]..............................................................................
 0x110569b7f, [unknown] (650 bytes)

 <no assembly is recorded, native region>
....................................................................................................
 12.44%  <total for region 3>

....[Hottest Region
4]..............................................................................
 0x11055a140, [unknown] (64 bytes)

 <no assembly is recorded, native region>
....................................................................................................
 12.10%  <total for region 4>

....[Hottest Region
5]..............................................................................
 0x110543e00, [unknown] (52 bytes)

 <no assembly is recorded, native region>
....................................................................................................
 12.01%  <total for region 5>

....[Hottest
Regions]...............................................................................
 22.08%              0x110543f36  [unknown] (584 bytes)
 21.54%              0x11055a27b  [unknown] (369 bytes)
 12.44%              0x110569b7f  [unknown] (650 bytes)
 12.10%              0x11055a140  [unknown] (64 bytes)
 12.01%              0x110543e00  [unknown] (52 bytes)
  6.30%              0x11055a540  [unknown] (217 bytes)
  5.89%              0x11055a666  [unknown] (361 bytes)
  3.46%              0x11055a805  [unknown] (85 bytes)
  1.99%              0x11053da80  [unknown] (71 bytes)
  1.49%              0x1105441d1  [unknown] (72 bytes)
  0.22%              0x110543ea3  [unknown] (7 bytes)
  0.12%              0x110569f09  [unknown] (16 bytes)
  0.11%              0x11055a1d8  [unknown] (2 bytes)
  0.08%              0x11055a63c  [unknown] (9 bytes)
  0.06%              0x11055a9ee  [unknown] (5 bytes)
  0.01%             libjvm.dylib  Assembler::locate_operand(unsigned char*,
Assembler::WhichOperand) (0 bytes)
  0.01%             libjvm.dylib  LIR_OpVisitState::append(LIR_OprDesc*&,
LIR_OpVisitState::OprMode) (0 bytes)
  0.01%             libjvm.dylib  GraphBuilder::push_scope(ciMethod*,
BlockBegin*) (0 bytes)
  0.01%             libjvm.dylib  LIR_Assembler::reg2stack(LIR_OprDesc*,
LIR_OprDesc*, BasicType, bool) (0 bytes)
  0.01%             libjvm.dylib  NullCheckEliminator::visit(Instruction**)
(0 bytes)
  0.05%  <...other 5 warm regions...>
....................................................................................................
 99.99%  <totals>

....[Hottest Methods (after
inlining)]..............................................................
 22.08%              0x110543f36  [unknown]
 21.54%              0x11055a27b  [unknown]
 12.44%              0x110569b7f  [unknown]
 12.10%              0x11055a140  [unknown]
 12.01%              0x110543e00  [unknown]
  6.30%              0x11055a540  [unknown]
  5.89%              0x11055a666  [unknown]
  3.46%              0x11055a805  [unknown]
  1.99%              0x11053da80  [unknown]
  1.49%              0x1105441d1  [unknown]
  0.22%              0x110543ea3  [unknown]
  0.12%              0x110569f09  [unknown]
  0.11%              0x11055a1d8  [unknown]
  0.08%              0x11055a63c  [unknown]
  0.06%              0x11055a9ee  [unknown]
  0.01%             libjvm.dylib  Assembler::locate_operand(unsigned char*,
Assembler::WhichOperand)
  0.01%             libjvm.dylib  NullCheckEliminator::visit(Instruction**)
  0.01%             libjvm.dylib  LIR_OpVisitState::append(LIR_OprDesc*&,
LIR_OpVisitState::OprMode)
  0.01%             libjvm.dylib
GrowableArray<Metadata*>::append(Metadata* const&)
  0.01%              0x1102dd079  [unknown]
  0.05%  <...other 5 warm methods...>
....................................................................................................
 99.99%  <totals>

....[Distribution by
Source]........................................................................
 22.08%              0x110543f36
 21.54%              0x11055a27b
 12.44%              0x110569b7f
 12.10%              0x11055a140
 12.01%              0x110543e00
  6.30%              0x11055a540
  5.89%              0x11055a666
  3.46%              0x11055a805
  1.99%              0x11053da80
  1.49%              0x1105441d1
  0.22%              0x110543ea3
  0.12%              0x110569f09
  0.11%              0x11055a1d8
  0.08%              0x11055a63c
  0.06%             libjvm.dylib
  0.06%              0x11055a9ee
  0.01%              0x1102dd079
  0.01%   libsystem_kernel.dylib
  0.01%        libsystem_c.dylib
....................................................................................................
 99.99%  <totals>

On 20 January 2018 at 14:55, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
wrote:

> >Then, it still not found lines. I thought JMH was adding the correct
> PrintAssembly flags when forking. But the issue seems to be similar with
> the warning I had for perfasm.
> Yes, that's your PrintAssembly-related issues.
>
> Just ran ConcurrentMapThroughput with dtraceasm on latest JCTools,
> everything seems fine
>
> --
> Best regards,
> Tolstopyatov Vsevolod
>
> On Wed, Jan 17, 2018 at 3:58 PM, Henri Tremblay <henri.tremblay at gmail.com>
> wrote:
>
>> Well found! It seems much better.
>>
>> Then, it still not found lines. I thought JMH was adding the correct
>> PrintAssembly flags when forking. But the issue seems to be similar with
>> the warning I had for perfasm.
>>
>> ERROR: No address lines detected in assembly capture, make sure your JDK
>> is PrintAssembly-enabled:
>>     https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>>
>> Perf output processed (skipped 2.531 seconds):
>>  Column 1: sampled_pc (11033 events)
>>
>> Hottest code regions (>10.00% "sampled_pc" events):
>>
>> ....[Hottest Region 1]............................
>> ..................................................
>> [unknown], [unknown] (231 bytes)
>>
>>  <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>>  22.54%  <total for region 1>
>>
>> ....[Hottest Region 2]............................
>> ..................................................
>> [unknown], [unknown] (134 bytes)
>>
>>  <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>>  11.31%  <total for region 2>
>>
>> ....[Hottest Region 3]............................
>> ..................................................
>> [unknown], [unknown] (41 bytes)
>>
>>  <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>>  10.39%  <total for region 3>
>>
>> ....[Hottest Region 4]............................
>> ..................................................
>> [unknown], [unknown] (417 bytes)
>>
>>  <no assembly is recorded, native region>
>> ............................................................
>> ........................................
>>  10.27%  <total for region 4>
>>
>> ....[Hottest Regions]....................................................
>> ...........................
>>  22.54%  [unknown]  [unknown] (231 bytes)
>>  11.31%  [unknown]  [unknown] (134 bytes)
>>  10.39%  [unknown]  [unknown] (41 bytes)
>>  10.27%  [unknown]  [unknown] (417 bytes)
>>   8.45%  [unknown]  [unknown] (396 bytes)
>>   5.71%  [unknown]  [unknown] (43 bytes)
>>   5.66%  [unknown]  [unknown] (55 bytes)
>>   5.45%  [unknown]  [unknown] (84 bytes)
>>   4.17%  [unknown]  [unknown] (209 bytes)
>>   2.47%  [unknown]  [unknown] (71 bytes)
>>   2.12%  [unknown]  [unknown] (28 bytes)
>>   2.04%  [unknown]  [unknown] (28 bytes)
>>   1.63%  [unknown]  [unknown] (170 bytes)
>>   0.95%  [unknown]  [unknown] (86 bytes)
>>   0.89%  [unknown]  [unknown] (5 bytes)
>>   0.83%  [unknown]  [unknown] (127 bytes)
>>   0.81%  [unknown]  [unknown] (0 bytes)
>>   0.55%  [unknown]  [unknown] (46 bytes)
>>   0.44%  [unknown]  [unknown] (71 bytes)
>>   0.41%  [unknown]  [unknown] (257 bytes)
>>   2.91%  <...other 73 warm regions...>
>> ............................................................
>> ........................................
>>  99.99%  <totals>
>>
>> ....[Hottest Methods (after inlining)]....................
>> ..........................................
>>  99.99%  [unknown]  [unknown]
>> ............................................................
>> ........................................
>>  99.99%  <totals>
>>
>> ....[Distribution by Source].......................
>> .................................................
>>  99.99%  [unknown]
>> ............................................................
>> ........................................
>>  99.99%  <totals>
>>
>>
>>
>> # Run complete. Total time: 00:00:18
>>
>> Benchmark                                          (implementation)
>> (readRatio)  (tableSize)   Mode  Cnt         Score         Error  Units
>> ConcurrentMapThroughput.randomGetPutRemove       NonBlockingHashMap
>>      50       100000  thrpt    6  22470630.498 ± 3730613.043  ops/s
>> ConcurrentMapThroughput.randomGetPutRemove:·asm  NonBlockingHashMap
>>      50       100000  thrpt                NaN                  ---
>> ConcurrentMapThroughput.randomGetPutRemove        ConcurrentHashMap
>>      50       100000  thrpt    6  23411000.106 ± 1960758.197  ops/s
>> ConcurrentMapThroughput.randomGetPutRemove:·asm   ConcurrentHashMap
>>      50       100000  thrpt                NaN                  ---
>>
>> On 17 January 2018 at 07:20, Aleksey Shipilev <shade at redhat.com> wrote:
>>
>>> On 01/14/2018 04:24 PM, Vsevolod Tolstopyatov wrote:
>>> > I have limited access to different versions of Mac OS X, but it seems
>>> that in some minor updates
>>> > DTrace works with SIP enabled.
>>> > So as solution I'd suggest to check SIP status on profiler start (via
>>> "csrutil status") and print
>>> > warning if it's enabled or just clarify it in javadoc. It's up to
>>> Alexey to decide what approach is
>>> > preferable in JMH
>>>
>>> So, perf* profiler print warning messages when then number of samples is
>>> suspiciously low. I think
>>> dtraceasm profiler should do the same, and clearly say what the user is
>>> supposed to do:
>>>
>>>  WARNING: The perf event count is suspiciously low (" + sum + "). The
>>> performance data might be
>>>  inaccurate or misleading. Try to do the profiling again, or tune up the
>>> sampling frequency.
>>>  With some profilers on Mac OS X, System Integrity Protection (SIP) may
>>> prevent profiling.
>>>  In such case, temporarily disabling SIP with 'csrutil disable' might
>>> help.
>>>
>>> I'll add this to the patch myself.
>>>
>>> I can push this without my own testing, hoping that external people
>>> validated this. OpenJDK rules
>>> require the patch to be hosted on OpenJDK infra to get accepted, so,
>>> Vsevolod, can you please post
>>> the recent version of the patch inline here?
>>>
>>> Thanks
>>> -Aleksey
>>>
>>>
>>>
>>
>


More information about the jmh-dev mailing list