JEP 331: Low-Overhead Heap Profiling

Wed Apr 4 22:23:47 UTC 2018

On Wed, Apr 4, 2018 at 3:41 AM Aleksey Shipilev <shade at redhat.com> wrote:

> On 03/29/2018 07:12 PM, mark.reinhold at oracle.com wrote:
> > New JEP Candidate: http://openjdk.java.net/jeps/331
>
> Interesting JEP!
>
> *) It would be nice to mention the implementation details in the JEP
> itself, i.e. where are the
> points it injects into GCs to sample? I assume it has to inject into
> CollectedHeap::allocate_tlab,
> and it has to cap the max TLAB sizes to get into allocation slowpath often
> enough?
>

My understanding is that a JEP covers the idea and the specification, and
that implementation details like these are out of scope for the JEP
(implementations can change, etc.).

It actually does not cap the max TLAB sizes; it changes the end pointer to
force the allocation paths into "thinking" the TLAB is full. Then, in the
slowpath, it samples and fixes the pointers up for the next sample.
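
To make the end-pointer trick concrete, here is a minimal toy model of it.
This is only a sketch under assumptions: ToyTlab, its fields, and the
fixed sample_interval are hypothetical names for illustration and do not
mirror the actual HotSpot classes or the prototype's sampling policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a thread-local allocation buffer (TLAB). All names are
// hypothetical; offsets stand in for real addresses.
struct ToyTlab {
    std::size_t top = 0;          // next free byte offset
    std::size_t real_end;         // true capacity of the buffer
    std::size_t end = 0;          // end seen by the fast path (may be lowered)
    std::size_t sample_interval;  // bytes between samples (assumed fixed here)
    std::vector<std::size_t> sampled_offsets;  // offsets of sampled allocations

    ToyTlab(std::size_t capacity, std::size_t interval)
        : real_end(capacity), sample_interval(interval) {
        assert(interval > 0);
        set_sample_end();
    }

    // Lower `end` so the fast path fails once the next sample point is hit.
    void set_sample_end() {
        std::size_t next = top + sample_interval;
        end = next < real_end ? next : real_end;
    }

    // Fast path: plain bump-pointer allocation against the (possibly
    // lowered) end pointer.
    bool fast_allocate(std::size_t size, std::size_t* out) {
        if (top + size > end) return false;  // buffer "looks" full
        *out = top;
        top += size;
        return true;
    }

    // Slow path: if the buffer really has room, the "full" signal was the
    // sampling trigger. Allocate, record a sample, and re-arm the end pointer.
    bool slow_allocate(std::size_t size, std::size_t* out) {
        if (top + size > real_end) return false;  // genuinely full
        *out = top;
        top += size;
        sampled_offsets.push_back(*out);
        set_sample_end();
        return true;
    }

    bool allocate(std::size_t size, std::size_t* out) {
        return fast_allocate(size, out) || slow_allocate(size, out);
    }
};
```

The point of the design is that the fast path stays a single comparison
against `end`; all sampling work lives in the slow path, which is only
entered when the lowered end pointer is crossed.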

>
> *) Since JC apparently has the prototype, it would be easier to put it
> somewhere, and link it into
> the JEP itself. Webrevs are interesting, but they get outdated pretty
> quickly, so maybe putting the
> whole thing in JDK Sandbox [1] is the way to go.
>

I've been keeping the webrevs up to date so there should be no real
problem. From what I read, you have to be a committer for the JDK Sandbox,
no? So I'm not sure that would make sense there?

>
> Otherwise it leads to speculation, which raises the questions like below:
>
> *) Motivation says: "The downsides [of JFR] are that a) it is tied to a
> particular allocation
> implementation (TLABs), and misses allocations that don’t meet that
> pattern; b) it doesn’t allow the
> user to customize the sampling rate; c) it only logs allocations, so you
> cannot distinguish between
> live and dead objects."
>
>  ...but then JEP apparently describes sampling the allocations via TLABs?
> So, the real difference is
> (b), allowing to customize the sampling rate, or do I miss something?
>

There are various differences between the JFR TLAB events and this system.
First, the JFR system provides a buffered event system, meaning you can miss
samples if the event buffer is full and throws out a sampling event before a
reader gets to it. Second, you don't get a callback at the allocation spot,
so you have no means of taking an action at that sampling point, which
means you have no way of knowing when an object is effectively dead using
the JFR events. Hopefully that makes sense?

>
> *) Goals say: "Can give information about both live and dead Java objects."
>
>  ...but there is not discussion what/how does it give information about
> the dead Java objects. I am
> struggling to imagine how allocation sampling would give this. Is the goal
> too broad, or some API is
> not described in the JEP?
>

Originally the JEP provided a means for the user to get that information
directly. Now, because the sampling callback provides an oop, the user in
the agent can add a weak reference and use that to determine liveness.
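
As a rough illustration of the weak-reference idea, here is a standalone
C++ analogy. It is not the JEP's API: in a real agent the reference would
presumably be a JNI weak global reference on the sampled object, whereas
this sketch uses std::weak_ptr, and SampledObject, LivenessTracker, and
on_sample are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for a sampled Java object (hypothetical).
struct SampledObject {
    std::size_t size;
};

class LivenessTracker {
public:
    // Called from the (hypothetical) sampling callback: remember the
    // object through a weak reference so tracking does not keep it alive.
    void on_sample(const std::shared_ptr<SampledObject>& obj) {
        assert(obj != nullptr);
        samples_.push_back(obj);
    }

    // Number of sampled objects that are still reachable.
    std::size_t live_count() const {
        std::size_t n = 0;
        for (const auto& w : samples_) {
            if (!w.expired()) ++n;
        }
        return n;
    }

    // Number of sampled objects that have since been reclaimed.
    std::size_t dead_count() const {
        return samples_.size() - live_count();
    }

private:
    std::vector<std::weak_ptr<SampledObject>> samples_;
};
```

Because the tracker only holds weak references, it can report both live
and dead samples without changing when the objects are reclaimed.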

Thanks,
Jc