DTrace asm profiler for Mac OS X
Roman Leventov
leventov.ru at gmail.com
Tue Jan 16 15:50:50 UTC 2018
Vsevolod, thanks for this contribution; it works like a charm.
On 14 January 2018 at 16:24, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
wrote:
> Hi, it took me a while to reproduce your problem.
>
> The problem lies in the Mac OS X version (everything after El Capitan) and
> System Integrity Protection (SIP).
> Usually DTrace works as intended, but on newer OS versions it requires
> additional privileges. In such cases, if you run DTrace manually, you should
> see something like "dtrace cannot control executables signed with
> restricted entitlements" [1].
> The only possible solution is to disable SIP [2].
>
> I have limited access to different versions of Mac OS X, but it seems that
> in some minor updates DTrace works with SIP enabled.
> So, as a solution, I'd suggest checking the SIP status on profiler start (via
> "csrutil status") and printing a warning if it's enabled, or just clarifying
> this in the javadoc. It's up to Aleksey to decide which approach is
> preferable in JMH.
>
> [1] https://news.ycombinator.com/item?id=10790127
> [2] http://osxdaily.com/2015/10/05/disable-rootless-system-integrity-protection-mac-os-x/
>
> --
> Best regards,
> Tolstopyatov Vsevolod
>
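The SIP check suggested in the message above (run "csrutil status" on
profiler start and warn if SIP is enabled) could look roughly like the
sketch below. It is only an illustration, not code from the patch: the
class name SipCheck and its methods are hypothetical, and it assumes that
"csrutil status" prints a line containing "status: enabled" when SIP is on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of JMH.
final class SipCheck {

    // Returns true if the given "csrutil status" output reports SIP as enabled.
    // Assumption: the output contains "status: enabled" in that case.
    static boolean isSipEnabled(String csrutilOutput) {
        return csrutilOutput != null
                && csrutilOutput.toLowerCase(Locale.ROOT).contains("status: enabled");
    }

    // Runs "csrutil status" and returns its combined stdout/stderr.
    // Throws IOException if the command cannot be started (e.g. not on Mac OS X).
    static String readCsrutilStatus() throws IOException, InterruptedException {
        Process process = new ProcessBuilder("csrutil", "status")
                .redirectErrorStream(true)
                .start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();
        return output.toString();
    }

    public static void main(String[] args) {
        try {
            if (isSipEnabled(readCsrutilStatus())) {
                System.err.println("WARNING: System Integrity Protection is enabled; "
                        + "DTrace may collect no samples.");
            }
        } catch (IOException e) {
            System.err.println("Could not run \"csrutil status\": " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Keeping the parsing separate from the process call makes the check easy to
test without a Mac; whether to warn or to fail in the profiler itself is
left open here, as in the message above.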
> On Thu, Dec 28, 2017 at 10:35 PM, Henri Tremblay <henri.tremblay at gmail.com>
> wrote:
>
> > I am far far far from being an expert here so I'm pretty sure you will
> > throw some stupid mistake in my face but here it goes.
> >
> > You can use https://github.com/JCTools/JCTools/tree/master/jctools-benchmarks.
> >
> > I did on Linux:
> > java -jar target/microbenchmarks.jar -f 1 --prof=perfasm org.jctools.maps.nhbm_test.jmh.ConcurrentMapThroughput
> >
> > And got (yes, with an error on PrintAssembly):
> >
> > ERROR: No address lines detected in assembly capture, make sure your JDK is PrintAssembly-enabled:
> > https://wiki.openjdk.java.net/display/HotSpot/PrintAssembly
> >
> > Perf output processed (skipped 2.844 seconds):
> > Column 1: cycles (12218 events)
> > Column 2: instructions (12169 events)
> >
> > Hottest code regions (>10.00% "cycles" events):
> >
> > ....[Hottest Region 1]..............................................................................
> > perf-52432.map, [unknown] (177 bytes)
> >
> > <no assembly is recorded, native region>
> > ....................................................................................................
> > 19.81% 11.78% <total for region 1>
> >
> > ....[Hottest Region 2]..............................................................................
> > perf-52432.map, [unknown] (381 bytes)
> >
> > <no assembly is recorded, native region>
> > ....................................................................................................
> > 15.03% 12.21% <total for region 2>
> >
> > ....[Hottest Region 3]..............................................................................
> > perf-52432.map, [unknown] (138 bytes)
> >
> > <no assembly is recorded, native region>
> > ....................................................................................................
> > 10.38% 6.35% <total for region 3>
> >
> > ....[Hottest Regions]...............................................................................
> > 19.81% 11.78% perf-52432.map [unknown] (177 bytes)
> > 15.03% 12.21% perf-52432.map [unknown] (381 bytes)
> > 10.38% 6.35% perf-52432.map [unknown] (138 bytes)
> > 9.82% 37.09% perf-52432.map [unknown] (447 bytes)
> > 8.22% 2.47% perf-52432.map [unknown] (72 bytes)
> > 7.89% 1.69% perf-52432.map [unknown] (28 bytes)
> > 7.65% 1.69% perf-52432.map [unknown] (33 bytes)
> > 5.20% 2.94% perf-52432.map [unknown] (173 bytes)
> > 1.98% 1.59% perf-52432.map [unknown] (287 bytes)
> > 1.85% 4.54% perf-52432.map [unknown] (59 bytes)
> > 1.81% 4.48% perf-52432.map [unknown] (55 bytes)
> > 1.51% 0.96% perf-52432.map [unknown] (116 bytes)
> > 1.47% 1.83% perf-52432.map [unknown] (71 bytes)
> > 1.26% 1.25% kernel [unknown] (2 bytes)
> > 1.15% 0.53% perf-52432.map [unknown] (95 bytes)
> > 0.89% 0.40% perf-52432.map [unknown] (75 bytes)
> > 0.56% 0.05% kernel [unknown] (0 bytes)
> > 0.53% 2.34% perf-52432.map [unknown] (92 bytes)
> > 0.45% 1.16% perf-52432.map [unknown] (8 bytes)
> > 0.44% 2.47% perf-52432.map [unknown] (8 bytes)
> > 2.11% 2.14% <...other 199 warm regions...>
> > ....................................................................................................
> > 100.00% 99.99% <totals>
> >
> > ....[Hottest Methods (after inlining)]..............................................................
> > 96.95% 97.51% perf-52432.map [unknown]
> > 2.76% 2.10% kernel [unknown]
> > 0.03% 0.07% libjvm.so fileStream::write
> > 0.02% 0.01% libc-2.12.so __strlen_sse42
> > 0.02% libc-2.12.so _IO_file_xsputn@@GLIBC_2.2.5
> > 0.02% libc-2.12.so __printf_fp
> > 0.01% libjvm.so CompileBroker::set_last_compile
> > 0.01% libjvm.so CodeCache::allocate
> > 0.01% libpthread-2.12.so pthread_mutex_unlock
> > 0.01% libjvm.so os::set_priority
> > 0.01% libjvm.so DebugInformationRecorder::find_sharable_decode_offset
> > 0.01% libpthread-2.12.so pthread_cond_wait@@GLIBC_2.3.2
> > 0.01% libjvm.so CompileBroker::invoke_compiler_on_method
> > 0.01% libjvm.so ciEnv::get_klass_by_index_impl
> > 0.01% 0.01% libjvm.so PhiResolverState::reset
> > 0.01% libjvm.so CompilerOracle::should_exclude
> > 0.01% libjvm.so CompilerOracle::has_option_string
> > 0.01% libjvm.so LinearScan::compute_local_live_sets
> > 0.01% libjvm.so OptoRuntime::new_instance_C
> > 0.01% libjvm.so ChunkPool::allocate
> > 0.10% 0.02% <...other 12 warm methods...>
> > ....................................................................................................
> > 100.00% 99.71% <totals>
> >
> > ....[Distribution by Source]........................................................................
> > 96.95% 97.51% perf-52432.map
> > 2.76% 2.10% kernel
> > 0.22% 0.31% libjvm.so
> > 0.05% 0.06% libc-2.12.so
> > 0.02% libpthread-2.12.so
> > ....................................................................................................
> > 100.00% 99.99% <totals>
> >
> > But on OSX when I do
> >
> > java -jar target/microbenchmarks.jar -f 1 --prof=dtraceasm org.jctools.maps.nhbm_test.jmh.ConcurrentMapThroughput
> >
> > I get:
> >
> > PrintAssembly processed: 193901 total address lines.
> > Perf output processed (skipped 6.097 seconds):
> > Column 1: sampled_pc (0 events)
> >
> > WARNING: No hottest code region above the threshold (10.00%) for disassembly.
> > Use "hotThreshold" profiler option to lower the filter threshold.
> >
> > ....[Hottest Regions]...............................................................................
> > ....................................................................................................
> > <totals>
> >
> > ....[Hottest Methods (after inlining)]..............................................................
> > ....................................................................................................
> > <totals>
> >
> > ....[Distribution by Source]........................................................................
> > ....................................................................................................
> > <totals>
> >
> > WARNING: The perf event count is suspiciously low (0). The performance data might be
> > inaccurate or misleading. Try to do the profiling again, or tune up the sampling frequency.
> >
> > Which seems pretty empty.
> >
> > Henri
> >
> > On 27 December 2017 at 09:56, Henri Tremblay <henri.tremblay at gmail.com>
> > wrote:
> >
> >> No. One was Linux (perf), the other was OSX (dtrace). Let me put the
> >> benchmark out.
> >>
> >> On 26 December 2017 at 14:19, Vsevolod Tolstopyatov <qwwdfsad at gmail.com>
> >> wrote:
> >>
> >>> Hi, could you share your benchmark?
> >>> I've just re-applied my patch over a clean repo and
> >>> ran JMHSample_37_CacheAccess with the dtrace profiler; everything works as
> >>> expected, so maybe your hottest region lies in kernel code.
> >>>
> >>> >With perf, I would get some content. With dtrace, nothing.
> >>> Are you running both on Linux?
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>> Tolstopyatov Vsevolod
> >>>
> >>> On Wed, Dec 13, 2017 at 7:25 PM, Henri Tremblay
> >>> <henri.tremblay at gmail.com> wrote:
> >>>
> >>>> A bit late, but my only problem right now is that I don't get any hot
> >>>> section, which is weird.
> >>>>
> >>>> With perf, I would get some content. With dtrace, nothing.
> >>>>
> >>>> However, I am not an expert in using either. So maybe some javac or java
> >>>> arguments are required to get nice results. Is that the case?
> >>>>
> >>>> Thanks,
> >>>> Henri
> >>>>
> >>>> On 23 November 2017 at 13:04, Aleksey Shipilev <shade at redhat.com>
> >>>> wrote:
> >>>>
> >>>>> On 11/23/2017 09:09 AM, Vsevolod Tolstopyatov wrote:
> >>>>> > Hello,
> >>>>> >
> >>>>> > Any news about this patch? Is it going into jmh?
> >>>>>
> >>>>> It will. Just let me figure out some Mac testing.
> >>>>>
> >>>>> -Aleksey
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>