DTrace asm profiler for Mac OS X
Vsevolod Tolstopyatov
qwwdfsad at gmail.com
Sun Jan 14 15:24:09 UTC 2018
Hi, it took me a while to reproduce your problem.
The cause is the Mac OS X version (everything after El Capitan) combined
with System Integrity Protection (SIP).
DTrace usually works as intended, but on newer OS versions it requires
additional privileges. In that case, running DTrace manually should print
something like "dtrace cannot control executables signed with
restricted entitlements" [1].
The only possible solution is to disable SIP [2].
I have limited access to different versions of Mac OS X, but it seems that
DTrace works with SIP enabled in some minor updates.
So as a solution, I'd suggest checking the SIP status on profiler start (via
"csrutil status") and printing a warning if it's enabled, or just documenting
this in the Javadoc. It's up to Aleksey to decide which approach is preferable
in JMH.
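
A minimal sketch of the "csrutil status" check suggested above. This is not
the actual JMH code: the class name SipCheck and both method names are made up
for illustration, and the parsing assumes that the output contains the text
"status: enabled" when SIP is on. Custom SIP configurations may print
something different, so treat a "false" here as "not known to be enabled".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of JMH.
class SipCheck {

    // Assumption: "csrutil status" prints a line containing
    // "status: enabled" when SIP is switched on.
    static boolean isSipEnabled(String csrutilOutput) {
        return csrutilOutput != null
                && csrutilOutput.toLowerCase(Locale.ROOT).contains("status: enabled");
    }

    // Runs "csrutil status" and returns everything it printed
    // (stderr is merged into stdout).
    static String readCsrutilStatus() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("csrutil", "status")
                .redirectErrorStream(true)
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        if (!os.contains("mac")) {
            System.out.println("Not Mac OS X, skipping SIP check.");
            return;
        }
        try {
            if (isSipEnabled(readCsrutilStatus())) {
                System.err.println("WARNING: SIP appears to be enabled; "
                        + "DTrace may not be able to sample the forked JVM.");
            }
        } catch (IOException e) {
            System.err.println("Could not run csrutil: " + e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and give up on the check.
            Thread.currentThread().interrupt();
        }
    }
}
```

A profiler could call something like this once at start-up and only warn,
without failing, since DTrace seems to work with SIP enabled on some versions.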
[1] https://news.ycombinator.com/item?id=10790127
[2] http://osxdaily.com/2015/10/05/disable-rootless-system-integrity-protection-mac-os-x/
--
Best regards,
Tolstopyatov Vsevolod
On Thu, Dec 28, 2017 at 10:35 PM, Henri Tremblay <henri.tremblay at gmail.com>
wrote:
> I am far far far from being an expert here so I'm pretty sure you will
> throw some stupid mistake in my face but here it goes.
>
> You can use
> https://github.com/JCTools/JCTools/tree/master/jctools-benchmarks.
>
> I did on Linux:
> java -jar target/microbenchmarks.jar -f 1 --prof=perfasm
> org.jctools.maps.nhbm_test.jmh.ConcurrentMapThroughput
>
> And got (yes, with an error on PrintAssembly):
>
> ERROR: No address lines detected in assembly capture, make sure your JDK
> is PrintAssembly-enabled:
> https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
>
> Perf output processed (skipped 2.844 seconds):
> Column 1: cycles (12218 events)
> Column 2: instructions (12169 events)
>
> Hottest code regions (>10.00% "cycles" events):
>
> ....[Hottest Region 1]............................
> ..................................................
> perf-52432.map, [unknown] (177 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 19.81% 11.78% <total for region 1>
>
> ....[Hottest Region 2]............................
> ..................................................
> perf-52432.map, [unknown] (381 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 15.03% 12.21% <total for region 2>
>
> ....[Hottest Region 3]............................
> ..................................................
> perf-52432.map, [unknown] (138 bytes)
>
> <no assembly is recorded, native region>
> ............................................................
> ........................................
> 10.38% 6.35% <total for region 3>
>
> ....[Hottest Regions]....................................................
> ...........................
> 19.81% 11.78% perf-52432.map [unknown] (177 bytes)
> 15.03% 12.21% perf-52432.map [unknown] (381 bytes)
> 10.38% 6.35% perf-52432.map [unknown] (138 bytes)
> 9.82% 37.09% perf-52432.map [unknown] (447 bytes)
> 8.22% 2.47% perf-52432.map [unknown] (72 bytes)
> 7.89% 1.69% perf-52432.map [unknown] (28 bytes)
> 7.65% 1.69% perf-52432.map [unknown] (33 bytes)
> 5.20% 2.94% perf-52432.map [unknown] (173 bytes)
> 1.98% 1.59% perf-52432.map [unknown] (287 bytes)
> 1.85% 4.54% perf-52432.map [unknown] (59 bytes)
> 1.81% 4.48% perf-52432.map [unknown] (55 bytes)
> 1.51% 0.96% perf-52432.map [unknown] (116 bytes)
> 1.47% 1.83% perf-52432.map [unknown] (71 bytes)
> 1.26% 1.25% kernel [unknown] (2 bytes)
> 1.15% 0.53% perf-52432.map [unknown] (95 bytes)
> 0.89% 0.40% perf-52432.map [unknown] (75 bytes)
> 0.56% 0.05% kernel [unknown] (0 bytes)
> 0.53% 2.34% perf-52432.map [unknown] (92 bytes)
> 0.45% 1.16% perf-52432.map [unknown] (8 bytes)
> 0.44% 2.47% perf-52432.map [unknown] (8 bytes)
> 2.11% 2.14% <...other 199 warm regions...>
> ............................................................
> ........................................
> 100.00% 99.99% <totals>
>
> ....[Hottest Methods (after inlining)]....................
> ..........................................
> 96.95% 97.51% perf-52432.map [unknown]
> 2.76% 2.10% kernel [unknown]
> 0.03% 0.07% libjvm.so fileStream::write
> 0.02% 0.01% libc-2.12.so __strlen_sse42
> 0.02% libc-2.12.so _IO_file_xsputn@@GLIBC_2.2.5
> 0.02% libc-2.12.so __printf_fp
> 0.01% libjvm.so CompileBroker::set_last_compile
> 0.01% libjvm.so CodeCache::allocate
> 0.01% libpthread-2.12.so pthread_mutex_unlock
> 0.01% libjvm.so os::set_priority
> 0.01% libjvm.so DebugInformationRecorder::find_sharable_decode_offset
> 0.01% libpthread-2.12.so pthread_cond_wait@@GLIBC_2.3.2
> 0.01% libjvm.so CompileBroker::invoke_compiler_on_method
> 0.01% libjvm.so ciEnv::get_klass_by_index_impl
> 0.01% 0.01% libjvm.so PhiResolverState::reset
> 0.01% libjvm.so CompilerOracle::should_exclude
> 0.01% libjvm.so CompilerOracle::has_option_string
> 0.01% libjvm.so LinearScan::compute_local_live_sets
> 0.01% libjvm.so OptoRuntime::new_instance_C
> 0.01% libjvm.so ChunkPool::allocate
> 0.10% 0.02% <...other 12 warm methods...>
> ............................................................
> ........................................
> 100.00% 99.71% <totals>
>
> ....[Distribution by Source].......................
> .................................................
> 96.95% 97.51% perf-52432.map
> 2.76% 2.10% kernel
> 0.22% 0.31% libjvm.so
> 0.05% 0.06% libc-2.12.so
> 0.02% libpthread-2.12.so
> ............................................................
> ........................................
> 100.00% 99.99% <totals>
>
> But on OSX when I do
>
> java -jar target/microbenchmarks.jar -f 1 --prof=dtraceasm
> org.jctools.maps.nhbm_test.jmh.ConcurrentMapThroughput
>
> I get:
>
> PrintAssembly processed: 193901 total address lines.
> Perf output processed (skipped 6.097 seconds):
> Column 1: sampled_pc (0 events)
>
> WARNING: No hottest code region above the threshold (10.00%) for
> disassembly.
> Use "hotThreshold" profiler option to lower the filter threshold.
>
> ....[Hottest Regions]....................................................
> ...........................
> ............................................................
> ........................................
> <totals>
>
> ....[Hottest Methods (after inlining)]....................
> ..........................................
> ............................................................
> ........................................
> <totals>
>
> ....[Distribution by Source].......................
> .................................................
> ............................................................
> ........................................
> <totals>
>
> WARNING: The perf event count is suspiciously low (0). The performance
> data might be
> inaccurate or misleading. Try to do the profiling again, or tune up the
> sampling frequency.
>
> Which seems pretty empty.
>
> Henri
>
> On 27 December 2017 at 09:56, Henri Tremblay <henri.tremblay at gmail.com>
> wrote:
>
>> No. One was Linux (perf), the other was OSX (dtrace). Let me put the
>> benchmark out.
>>
>> On 26 December 2017 at 14:19, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
>> wrote:
>>
>>> Hi, could you share your benchmark?
>>> I've just re-applied my patch over a clean repo and
>>> ran JMHSample_37_CacheAccess with the dtrace profiler; everything works as
>>> expected, so maybe your hottest region lies in kernel code.
>>>
>>> >With perf, I would get some content. With dtrace, nothing.
>>> Are you running both on Linux?
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Tolstopyatov Vsevolod
>>>
>>> On Wed, Dec 13, 2017 at 7:25 PM, Henri Tremblay <
>>> henri.tremblay at gmail.com> wrote:
>>>
>>>> A bit late but my only problem right now is that I don't get any hot
>>>> section. Which is weird.
>>>>
>>>> With perf, I would get some content. With dtrace, nothing.
>>>>
>>>> However, I am not an expert in using either. So maybe some javac or java
>>>> arguments are required to get nice results. Is that the case?
>>>>
>>>> Thanks,
>>>> Henri
>>>>
>>>> On 23 November 2017 at 13:04, Aleksey Shipilev <shade at redhat.com>
>>>> wrote:
>>>>
>>>>> On 11/23/2017 09:09 AM, Vsevolod Tolstopyatov wrote:
>>>>> > Hello,
>>>>> >
>>>>> > Any news about this patch? Is it going into jmh?
>>>>>
>>>>> It will. Just let me figure out some Mac testing.
>>>>>
>>>>> -Aleksey
>>>>>
>>>>>
>>>>
>>>
>>
>