New heap allocation event proposal and PoC
Jaroslav Bachorík
jaroslav.bachorik at datadoghq.com
Tue Oct 6 13:16:32 UTC 2020
Hi Erik,
so I took a stab at using the event settings to control the
rate-limiting sampler, and it turns out that we now have generic
support for rate-limited event types (woohoo!).
I am still struggling with the webrev from my forked GitHub repo
branch :/ Are you sure the side-by-side diff view
(https://github.com/openjdk/jdk/compare/master...jbachorik:allocation_sampling)
cannot be used as a starting point? Once I am sure this is the right
direction (and the code looks sane) I will turn the custom branch into
a PR, and then we should get a proper webrev generated automatically.
Cheers,
-JB-
On Fri, Oct 2, 2020 at 7:13 PM Jaroslav Bachorík
<jaroslav.bachorik at datadoghq.com> wrote:
>
> Hi Erik,
>
> sorry, no webrev for now. I tried generating the webrev from my
> fork/branch, but the Skara-provided `git webrev` generates an empty one
> :(
>
> On Thu, Oct 1, 2020 at 11:43 AM Jaroslav Bachorík
> <jaroslav.bachorik at datadoghq.com> wrote:
> >
> > Hi Erik,
> >
> > see my replies below
> >
> > On Wed, Sep 30, 2020 at 7:31 PM Erik Gahlin <erik.gahlin at oracle.com> wrote:
> > >
> > > Hello Jaroslav,
> > >
> > > Nice to see progress on this!
> > >
> > > The adaptive sampling approach is probably the most viable as it allows easy configuration while maintaining an upper bound on the overhead. It can also be useful for other events in the future, for example exception or method profiling.
> > >
> > > If we find that x events per minute cause too much overhead in the default configuration, we can reduce it to x / 5 events, with little impact on clients that consume the data. It will have a lower resolution, but the algorithm for a client to produce a report, let's say a top-ten list of the most intensive allocation sites, will be the same.
> > >
> > > I suggest that we create an option called rate, so users can specify something like this in a .jfc file:
> > >
> > > <event name="ObjectAllocationSample">
> > >   <setting name="enabled">true</setting>
> > >   <setting name="stackTrace">true</setting>
> > >   <setting name="rate">100</setting>
> > > </event>
> >
> > Yes. This is one of the possibilities. If we just use the smallest
> > rate defined amongst the active recordings, it should be fine.
>
> I started playing with this idea and it seems that this will, in the
> end, require modifying the public API, if I am not missing anything. I
> tried adding a new setting field called `rateLimit`, but the value set
> in the *.jfc file is not 'automagically' propagated to the
> `JfrEventSetting` class. I started tracking down what needs to be done
> for the value defined in the JFC to make it all the way down to
> `JfrEventSetting`, and it really seems like I would need a new public
> annotation, `@RateLimit`, similar to e.g. `@Cutoff`. Is this so, or is
> there a less involved way for the native events only?
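> For the sake of discussion, here is a minimal sketch of what such an
> annotation might look like, modeled on the existing `@Cutoff`
> annotation. The name `@RateLimit`, the setting name and the default
> value are all just assumptions:
>
> ```java
> import java.lang.annotation.*;
> import jdk.jfr.MetadataDefinition;
>
> // Hypothetical sketch only - not an existing JDK annotation.
> @MetadataDefinition
> @Target({ ElementType.TYPE })
> @Inherited
> @Retention(RetentionPolicy.RUNTIME)
> public @interface RateLimit {
>     // Name of the setting as it would appear in a .jfc file.
>     String NAME = "rate";
>
>     // Default rate; the unit (events per window or per second) is still an open question.
>     String value() default "0";
> }
> ```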
>
> There is one other thing I would like to understand better - does
> `JfrEventSetting` defining only static methods mean that there are
> really no 'per-recording' settings available to query? Are the
> recording-specific settings merged following the strategy defined in
> the `combine(Set<String> values)` method of `JdkSettingControl`
> subclasses?
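> To make the second question concrete: am I right that the merging works
> roughly like the sketch below? It is purely illustrative, uses the
> public `SettingControl` API rather than the internal
> `JdkSettingControl`, and picks the smallest requested rate as suggested
> earlier in this thread (other merge strategies are of course possible):
>
> ```java
> import java.util.Set;
> import jdk.jfr.SettingControl;
>
> // Purely illustrative - not the actual JFR implementation.
> public final class RateSetting extends SettingControl {
>     private String value = "0";
>
>     @Override
>     public String combine(Set<String> values) {
>         // Pick the smallest rate requested by any active recording.
>         long min = Long.MAX_VALUE;
>         for (String v : values) {
>             try {
>                 min = Math.min(min, Long.parseLong(v.trim()));
>             } catch (NumberFormatException e) {
>                 // ignore malformed values
>             }
>         }
>         return min == Long.MAX_VALUE ? "0" : Long.toString(min);
>     }
>
>     @Override
>     public void setValue(String value) {
>         this.value = value;
>     }
>
>     @Override
>     public String getValue() {
>         return value;
>     }
> }
> ```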
>
> Cheers,
>
> -JB-
>
>
> >
> > >
> > > It could be that the rate should include the window size. It might be tricky to come up with a syntax, but perhaps "1000 / 10 s"? If multiple recordings are running at the same time, the smallest window size will be chosen, while not exceeding the highest rate. Or perhaps skip the window and go with 100 Hz?
> >
> > I would prefer having just one knob for the rate, and then we can do
> > dynamic adjustment of the window size and the number of events per
> > window, such that we don't have too few events in a window (the error
> > will be bigger for the statistical parameters we are calculating) and
> > the window size is not too large (the sampler would become less
> > adaptive).
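> > Just to illustrate the kind of adjustment I have in mind - a rough
> > sketch, where the constants, names and the derivation are purely made
> > up for the example and are not the PoC values:
> >
> > ```java
> > // Rough sketch of deriving the window size from a single rate knob.
> > final class WindowSizing {
> >     static final long MIN_EVENTS_PER_WINDOW = 10;    // keep the statistics meaningful
> >     static final long MIN_WINDOW_MILLIS = 10;        // don't spin the window too fast
> >     static final long MAX_WINDOW_MILLIS = 1_000;     // keep the sampler adaptive
> >
> >     static long windowMillis(double targetEventsPerSecond) {
> >         // A window long enough to expect MIN_EVENTS_PER_WINDOW events ...
> >         double ms = MIN_EVENTS_PER_WINDOW / targetEventsPerSecond * 1_000.0;
> >         // ... clamped between the lower and upper bounds.
> >         return Math.max(MIN_WINDOW_MILLIS,
> >                         Math.min(MAX_WINDOW_MILLIS, (long) Math.ceil(ms)));
> >     }
> > }
> > ```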
> >
> > >
> > > The option will only apply to events that support rate control, for now ObjectAllocationSample.
> > >
> > > We want this event on by default, and I am leaning towards having the TLAB events disabled in the profile .jfc as well, but with a control attribute in the .jfc, so users can enable them for troubleshooting in the JMC recording wizard.
> >
> > +1
> >
> > >
> > > Before looking at the implementation, could you produce a webrev on cr.openjdk.java.net?
> >
> > Looking at how to create a webrev for the git repo :) Until then you can
> > see the side-by-side diff at
> > https://github.com/openjdk/jdk/compare/master...jbachorik:allocation_sampling
> >
> > Cheers,
> >
> > -JB-
> >
> > >
> > > Thanks
> > > Erik
> > >
> > > > On 30 Sep 2020, at 15:17, Jaroslav Bachorík <jaroslav.bachorik at datadoghq.com> wrote:
> > > >
> > > > Hello all,
> > > >
> > > > I would like to present our (Datadog) proposal for a new heap
> > > > allocation event. The proposal is based on the write-up created by
> > > > Marcus Hirt and the follow-up discussion
> > > > (https://docs.google.com/document/d/191QzZIEPgOi-KGs82Sh9B6_dVtXudUonhdNrRgtt444/edit).
> > > >
> > > > == Introduction
> > > > Let me cite the rationale for the new heap allocation event from the
> > > > aforementioned document.
> > > > ```
> > > > In JFR there are two allocation profiling events today - one for
> > > > allocations inside of thread local area buffers (TLABs) and one for
> > > > allocations outside. The events are quite useful for both allocation
> > > > profiling and for TLAB tuning. They are, however, quite hard to reason
> > > > about in terms of data production rate, and in certain edge cases both
> > > > the overhead and the data volume can be quite high. In always-on
> > > > production time profiling, arguably the most important domain for JFR,
> > > > these are quite serious drawbacks.
> > > > ```
> > > >
> > > >
> > > > == Detailed description
> > > > This proposal and the (fully functional) PoC are based on the idea of
> > > > 'subsampling', described in more detail in the linked document. The
> > > > subsampling is performed by a rate-limiting, adaptive sampler which
> > > > keeps the event emission rate (more or less) constant while providing
> > > > a fairly accurate statistical picture of the heap allocations
> > > > happening in the system.
> > > >
> > > > The PoC is built upon the existing inside/outside TLAB hooks used by
> > > > JFR. These hooks are used to obtain the 'raw' allocation samples -
> > > > basically each time a TLAB gets retired or an outside-of-TLAB
> > > > allocation happens. These 'raw' samples are then pushed through the
> > > > adaptive sampler to generate the heap allocation events at the desired
> > > > rate.
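> > > > For illustration, the core idea of the rate limiting can be sketched
> > > > as a simple per-window budget. This is a simplification - the actual
> > > > PoC is HotSpot C++ code and adapts the budget based on the observed
> > > > raw-sample rate - but it shows the skeleton:
> > > >
> > > > ```java
> > > > import java.util.concurrent.atomic.AtomicLong;
> > > >
> > > > // Simplified illustration of a windowed, rate-limiting sampler.
> > > > final class WindowedSampler {
> > > >     private final long windowNanos;
> > > >     private final long budgetPerWindow;
> > > >     private final AtomicLong windowStart = new AtomicLong(System.nanoTime());
> > > >     private final AtomicLong taken = new AtomicLong();
> > > >
> > > >     WindowedSampler(long windowMillis, long budgetPerWindow) {
> > > >         this.windowNanos = windowMillis * 1_000_000L;
> > > >         this.budgetPerWindow = budgetPerWindow;
> > > >     }
> > > >
> > > >     // Called for every 'raw' sample; returns true if an event should be emitted.
> > > >     boolean sample() {
> > > >         long now = System.nanoTime();
> > > >         long start = windowStart.get();
> > > >         if (now - start >= windowNanos && windowStart.compareAndSet(start, now)) {
> > > >             taken.set(0); // a new window starts, reset the budget
> > > >         }
> > > >         return taken.incrementAndGet() <= budgetPerWindow;
> > > >     }
> > > > }
> > > > ```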
> > > >
> > > > === PoC Sources and Binaries
> > > > - Source: https://github.com/jbachorik/jdk/tree/allocation_sampling
> > > > - Binaries: https://github.com/jbachorik/jdk/actions/runs/276358906
> > > >
> > > > === PoC Performance
> > > > The initial performance assessment was done using the open-source
> > > > Renaissance benchmark suite (https://renaissance.dev/), primarily the
> > > > 'akka-uct' benchmark. The benchmarks are run on a dedicated EC2
> > > > c5.metal instance with nothing else running concurrently, and the
> > > > benchmark app is given a 40GB heap. The performance is described as
> > > > the cumulative amount of CPU time (kernel + user) reported by the
> > > > 'time' command.
> > > > The results show a CPU time overhead in the range of 1% for the avg,
> > > > p95 and p99 values, measured for the akka-uct, scrabble and
> > > > future-genetic benchmark applications.
> > > >
> > > > === PoC Limitations
> > > > The current implementation has the following limitations:
> > > > - the target rate is not configurable (currently set to 5k events per minute)
> > > > - the adaptive sampler implementation introduces a potential
> > > > contention point over a mutex (although that contention point should
> > > > not be hit more often than once per 10ms)
> > > > - the code structure may not be optimal
> > > >
> > > >
> > > > Thank you for your attention; I am looking forward to your comments and remarks!
> > > >
> > > > -JB-
> > >