Low-Overhead Heap Profiling
Vladimir Voskresensky
vladimir.voskresensky at oracle.com
Tue Jun 23 05:31:54 UTC 2015
Hello Jeremy,
If this is sampling, not tracing, then how is it different from the
low-overhead memory profiling provided by JFR [1]?
JFR samples on each new TLAB allocation. It provides a very good
picture, and I haven't seen overhead of more than 2%.
Btw, JFR also avoids the false positives that instrumentation-based
approaches report when the JIT was able to eliminate the heap allocation.
Thanks,
Vladimir.
[1] http://hirt.se/blog/?p=381
On 22.06.2015 11:48, Jeremy Manson wrote:
> Hey folks,
>
> (cc'ing Aleksey and John, because I mentioned this to them at the
> JVMLS last year, but I never followed up.)
>
> We have a patch at Google I've always wanted to contribute to OpenJDK,
> but I always figured it would be unwanted. I've recently been
> thinking that might not be as true, though. I thought I would ask if
> there is any interest / if I should write a JEP / if I should just
> forget it.
>
> The basic problem is that there is no low-overhead supported way to
> figure out where allocation hotspots are. That is, sets of stack
> traces where lots of allocation / large allocations took place.
>
> What I had originally done (this was in ~2007) was use bytecode
> rewriting to instrument allocation sites. The instrumentation would
> call a Java method, and the method would capture a stack trace. To
> reduce overhead, there was a per-thread counter that only took a stack
> trace once every N bytes allocated, where N is a randomly chosen point
> in a probability distribution that is centered around ~512K.
>
> This was *way* too slow, and it didn't pick up allocations through
> JNI, so I instrumented allocations at the VM level, and the overhead
> went away. The sampling is always turned on in our internal VMs, and
> a user can just query an interface for a list of sampled stack
> traces. The allocated stack traces are held with weak refs, so you
> only get live samples.
>
> The additional overhead for allocations amounts to a subtraction, and
> an occasional stack trace, which is usually a very, very small amount
> of our CPU (although I had to do some jiggering in JDK8 to fix some
> performance regressions).
>
> There really isn't another good way to do this with low overhead. I
> was wondering how the group would feel about our contributing it?
>
> Thoughts?
>
> Jeremy
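
For readers following the thread, the byte-countdown sampling described in the quoted message can be sketched roughly as below. This is only an illustration in plain Java, not the Google patch or any HotSpot code: the class AllocationSampler, the methods recordAllocation, nextInterval and liveSamples, and the choice of an exponential distribution with a 512 KiB mean are all assumptions (the message only says the interval is random and centered around ~512K). The fast path is one subtraction per allocation; a stack trace is captured only when the per-thread counter runs out, and sampled objects are held through weak references so that only live samples are reported.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of byte-countdown allocation sampling.
final class AllocationSampler {
    // Assumed mean sampling interval (~512K bytes, as in the message).
    static final long MEAN_INTERVAL_BYTES = 512L * 1024;

    // One sampled allocation: weak reference to the object plus its stack.
    static final class Sample {
        final WeakReference<Object> ref;
        final long sizeBytes;
        final StackTraceElement[] stack;

        Sample(Object obj, long sizeBytes, StackTraceElement[] stack) {
            this.ref = new WeakReference<>(obj);
            this.sizeBytes = sizeBytes;
            this.stack = stack;
        }
    }

    // Per-thread countdown of bytes remaining until the next sample.
    private static final ThreadLocal<long[]> BYTES_UNTIL_SAMPLE =
            ThreadLocal.withInitial(() -> new long[] { nextInterval() });

    private static final List<Sample> SAMPLES = new ArrayList<>();

    // Draws the next interval from an exponential distribution
    // (an assumption; the source does not name the distribution).
    static long nextInterval() {
        double u = ThreadLocalRandom.current().nextDouble(); // in [0, 1)
        return (long) (-Math.log(1.0 - u) * MEAN_INTERVAL_BYTES) + 1;
    }

    // Called for every allocation; the common case is one subtraction.
    static void recordAllocation(Object obj, long sizeBytes) {
        long[] counter = BYTES_UNTIL_SAMPLE.get();
        counter[0] -= sizeBytes;
        if (counter[0] > 0) {
            return; // fast path: no sample taken
        }
        counter[0] = nextInterval();
        // Slow path: capture the stack of the allocating thread.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        synchronized (SAMPLES) {
            SAMPLES.add(new Sample(obj, sizeBytes, stack));
        }
    }

    // Returns samples whose objects have not been garbage collected.
    static List<Sample> liveSamples() {
        synchronized (SAMPLES) {
            SAMPLES.removeIf(s -> s.ref.get() == null);
            return new ArrayList<>(SAMPLES);
        }
    }
}
```

In a real VM the hook would sit in the allocation path itself rather than being called explicitly, which is what lets it see JNI allocations and keeps the overhead low.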
More information about the serviceability-dev
mailing list