JEP 331: Low-Overhead Heap Profiling

Thu Apr 5 22:04:36 UTC 2018

Serguei, I added your comments to the text and changed the name of the
section to "JVMTI agent".

Thanks!
Jc

On Thu, Apr 5, 2018 at 2:07 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc and Aleksey,
>
> Please, find a couple of comments below.
>
>
> On 4/5/18 09:05, JC Beyler wrote:
> > Hi Aleksey,
> >
> > I inlined my answers :)
> >
> > On Thu, Apr 5, 2018 at 4:09 AM Aleksey Shipilev <shade at redhat.com>
> wrote:
> >
> >> On 04/05/2018 12:23 AM, JC Beyler wrote:
> >>> On Wed, Apr 4, 2018 at 3:41 AM Aleksey Shipilev <shade at redhat.com
> >> <mailto:shade at redhat.com>> wrote:
> >>> *) It would be nice to mention the implementation details in the JEP
> >>>     itself, i.e. where are the points it injects into GCs to sample? I
> >>>     assume it has to inject into CollectedHeap::allocate_tlab, and it has
> >>>     to cap the max TLAB sizes to get into allocation slowpath often
> >>>     enough?
> >>> My understanding was that a JEP was the idea and specification and that
> >>> more technical information like that was out of scope for the JEP
> >>> (implementations can change, etc.)
> >> Well, yes. But if you have the implementation ideas, it is better to
> >> demonstrate them along with the idea. Discussing implementation
> >> approaches serves several purposes: a) it empirically proves the idea is
> >> implementable; b) it highlights tricky design decisions the
> >> implementation has to face, which aids the understanding of the scope;
> >> c) it prevents handwaving against existing approaches :)
> >>
> > Fair enough, I've added an implementation design/state with a link to the
> > current webrev at the end of the JEP issue
> > <https://bugs.openjdk.java.net/browse/JDK-8171119>. Let me know if that is
> > what you had in mind.
> >
> >
> >>> It actually does not cap the max TLAB sizes, it changes the end pointer
> >>> to force paths into "thinking" the TLAB is full; then in the slow path it
> >>> samples and fixes the pointers up for the next sample.
> >> Ooof. So this has implications for JEP scope, and thus should be
> >> mentioned.
> >>
> > Perhaps, again I saw the JEP as a different level of abstraction and this
> > is more in the details of implementation. However, I have added a full
> > explanation of this into the JEP at the end, let me know if it makes sense.
> > For convenience, let me copy-paste what I wrote there:
> >
> > "
> > 2) The TLAB structure is augmented with a new allocation_end pointer and a
> > current_end pointer. If the sampling is disabled, the two pointers are
> > always equal and the code performs as before. If the sampling is enabled,
> > the current_end is modified to be where the next sample point is requested.
> > Then, any fast path will "think" the TLAB is full at that point and go down
> > the slow path, which is explained in (3)
> > "
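
To make the two-pointer idea in the quoted paragraph concrete, here is a
minimal toy model in Java. It is only a sketch under stated assumptions, not
the HotSpot code: the class TlabModel, its methods, and the fixed sampling
interval are all hypothetical names and choices made for illustration (the
thread does not say how the next sample point is picked). It shows the fast
path comparing against current_end, and the slow path telling "sample point
reached" apart from "buffer really full" by checking allocation_end.

```java
// Toy model of the TLAB end-pointer trick described above; not HotSpot code.
// All names here (TlabModel, allocate, samplingInterval) are hypothetical.
final class TlabModel {
    private final long allocationEnd;    // real end of the buffer
    private long currentEnd;             // end seen by the fast path
    private long top = 0;                // next free offset
    private final long samplingInterval; // bytes between samples; 0 = sampling off
    private int samples = 0;
    private int slowPathEntries = 0;

    TlabModel(long size, long samplingInterval) {
        this.allocationEnd = size;
        this.samplingInterval = samplingInterval;
        this.currentEnd = nextEnd();
    }

    // With sampling off, currentEnd always equals allocationEnd.
    // Assumption: a fixed interval; a real sampler may pick points differently.
    private long nextEnd() {
        if (samplingInterval <= 0) {
            return allocationEnd;
        }
        return Math.min(allocationEnd, top + samplingInterval);
    }

    /** Returns the offset of the new object, or -1 if the buffer is truly full. */
    long allocate(long bytes) {
        if (top + bytes <= currentEnd) { // fast path: simple bump of top
            long result = top;
            top += bytes;
            return result;
        }
        return allocateSlow(bytes);
    }

    private long allocateSlow(long bytes) {
        slowPathEntries++;
        if (top + bytes > allocationEnd) {
            return -1; // genuinely full: a real VM would retire the TLAB here
        }
        // The fast path only "thought" the buffer was full: take a sample,
        // allocate, and move currentEnd to the next sample point.
        long result = top;
        top += bytes;
        samples++;
        currentEnd = nextEnd();
        return result;
    }

    int samples() { return samples; }
    int slowPathEntries() { return slowPathEntries; }

    public static void main(String[] args) {
        TlabModel tlab = new TlabModel(1024, 256);
        for (int i = 0; i < 16; i++) {
            tlab.allocate(64);
        }
        System.out.println("samples=" + tlab.samples()
            + " slowPathEntries=" + tlab.slowPathEntries());
    }
}
```

With the interval set to 0 the model never enters the slow path until the
buffer is actually exhausted, which mirrors the "performs as before" claim.
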
> >
> >
> >>
> >>>      *) Since JC apparently has the prototype, it would be easier to put
> >>>      it somewhere, and link it into the JEP itself. Webrevs are
> >>>      interesting, but they get outdated pretty quickly, so maybe putting
> >>>      the whole thing in JDK Sandbox [1] is the way to go.
> >>>
> >>> I've been keeping the webrevs up to date so there should be no real
> >>> problem. From what I read, you have to be a Committer for the JDK
> >>> Sandbox, no? So I'm not sure that would make sense there?
> >>
> >> Right, you have to be a Committer. Link the webrev to the JEP then?
> >>
> > I added the link and a few paragraphs on the implementation.
> >
> >
> >>
> >>>      *) Motivation says: "The downsides [of JFR] are that a) it is tied
> >>>      to a particular allocation implementation (TLABs), and misses
> >>>      allocations that don’t meet that pattern; b) it doesn’t allow the
> >>>      user to customize the sampling rate; c) it only logs allocations, so
> >>>      you cannot distinguish between live and dead objects."
> >>>
> >>>      ...but then the JEP apparently describes sampling the allocations
> >>>      via TLABs? So, the real difference is (b), allowing the sampling
> >>>      rate to be customized, or am I missing something?
> >>>
> >>> There are various differences between the JFR TLAB events and this
> >>> system. First, the JFR system provides a buffered event system, meaning
> >>> you can miss samples if the event buffer filled up and threw out a
> >>> sampling event before a reader got to it. Second, you don't get a
> >>> callback at the allocation spot, so you have no means to take an action
> >>> at that sampling point, which means you have no way of knowing when an
> >>> object is effectively dead using the JFR events. Hopefully that makes
> >>> sense?
> >>
> >> This paragraph should be in JEP text then?
> >>
> > Done, I revamped and edited the section into this now:
> >
> >
> > "There are multiple alternatives to the system presented in this JEP. The
> > introduction presented two already: The Java Flight Recorder
> > <http://openjdk.java.net/jeps/328> system provides an interesting
> > alternative but is not perfect due to it not allowing the sampling size to
> > be set and not providing a callback.
> >
> > The JFR system does use the TLAB creation as a means to track memory
> > allocation but, instead of a callback, JFR events use a buffer system that
> > can lead to missing some sampled allocations. Finally, the JFR event system
> > does not provide a means to track objects that have been garbage collected,
> > which means it is not possible currently to have a system provide
> > information about live and garbage collected objects using the JFR event
> > system."
>
> I'd like to highlight an important difference from JFR.
> The JEP adds a new feature to JVMTI, which is an important
> API/framework for various development and monitoring tools.
> Now, a JVMTI agent can use a low overhead heap profiling API along with
> the rest of JVMTI functionality.
> It provides great flexibility to the tools.
> For instance, it is up to the agent to decide if a stack trace needs to
> be collected at each event point.
>
>
> > Let me know if that works for you.
> >
> >
> >
> >>>      *) Goals say: "Can give information about both live and dead Java
> >>>      objects."
> >>>      ...but there is no discussion of what information it gives about
> >>>      dead Java objects, or how. I am struggling to imagine how allocation
> >>>      sampling would give this. Is the goal too broad, or is some API not
> >>>      described in the JEP?
> >>>
> >>> Originally the JEP provided a means for the user to get that information
> >>> directly. Now, because the sampling callback provides an oop, the user in
> >>> the agent can add a weak reference and use that to determine liveness.
> >
> >> Ooof! I guess that technically works. Please mention it.
> >>
> > I did already add it here:
> >
> >
> > "E) What the Java agent can do
>
> I'd suggest replacing "Java agent" with "JVMTI agent" for accuracy.
> The term "Java agent" is used for JPLIS agents that are based on the
> java.lang.instrument API:
>
> https://docs.oracle.com/javase/8/docs/technotes/guides/instrumentation/index.html
>
>
> Thanks,
> Serguei
>
> > The user of the callback can then pick up a stack trace at the moment of
> > the callback using the JVMTI GetStackTrace method for example. The oop
> > obtained by the callback can also be wrapped into a JNI weak reference to
> > help determine when the object has been garbage collected. The idea behind
> > that is to provide data on what objects were sampled and are still
> > considered live or garbage collected, which can be a good means to
> > understand the job's behavior.
> >
> > The sampling rate will provide a different sampling precision but also can
> > be a means to mitigate overhead due to the profiling. Using a sampling rate
> > of 512k and the sampling solution, the overhead should be low enough that a
> > user could reasonably leave the system on by default."
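
The weak-reference bookkeeping described in the quoted section can be
sketched as follows. A real JVMTI agent would be native code that receives
the sampled object in its callback and wraps it in a JNI weak global
reference; the version below is only an analogy in plain Java, using
java.lang.ref.WeakReference as a stand-in. The class SampledObjects and its
methods (onSample, liveCount, collectedCount) are hypothetical names chosen
for illustration and are not part of JVMTI or the JEP.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Java analogy of an agent's liveness bookkeeping; not JVMTI or JNI code.
// All names here are hypothetical.
final class SampledObjects {
    private final List<WeakReference<Object>> sampled = new ArrayList<>();

    // Stand-in for the sampling callback: remember the object weakly so that
    // tracking it does not keep it alive.
    void onSample(Object obj) {
        sampled.add(new WeakReference<>(obj));
    }

    int totalSampled() {
        return sampled.size();
    }

    // A sampled object counts as live while its weak reference is not cleared.
    int liveCount() {
        int live = 0;
        for (WeakReference<Object> ref : sampled) {
            if (ref.get() != null) {
                live++;
            }
        }
        return live;
    }

    int collectedCount() {
        return sampled.size() - liveCount();
    }

    public static void main(String[] args) {
        SampledObjects tracker = new SampledObjects();
        byte[] kept = new byte[1024];
        tracker.onSample(kept);           // stays strongly reachable below
        tracker.onSample(new byte[1024]); // no strong reference remains
        System.gc();                      // only a hint; collection is not guaranteed
        System.out.println("sampled=" + tracker.totalSampled()
            + " live=" + tracker.liveCount()
            + " collected=" + tracker.collectedCount());
        System.out.println("kept length=" + kept.length); // keeps 'kept' in use
    }
}
```

Because garbage collection timing is not guaranteed, the only firm property
of this sketch is that strongly reachable objects are always reported as
live; the collected count may lag behind reality.
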
> >
> >
> > ------
> >
> >
> > I also have the proof of concept in tests here
> > <
> http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.10/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCTest.java.html
> >
> > and
> > the native implementation is here
> > <
> http://cr.openjdk.java.net/~jcbeyler/8171119/heap_event.10/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.html
> >
> > .
> >
> > Let me know if my additions to the JEP are what you were looking for and
> > whether there is anything else you think I should add information about!
> >
> > Thanks for reviewing it!
> > Jc
>
>