In memory or streaming JFR event implementation?
Erik Gahlin
erik.gahlin at oracle.com
Thu Aug 2 02:06:30 UTC 2018
Hi Derek,
We have been thinking of adding streaming capabilities to JFR for some time.
In JDK 9, we added something to the file format that we call a flush point. When a parser reaches a flush point, it can be sure that all relevant constants, i.e. stack traces, classes, methods, etc., have been written down, so they can be resolved in the events. We have, however, not yet implemented the flushing mechanism, but it should not be hard to do.
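As a rough sketch of how a consumer could make use of flush points, a reader might tail an unfinished chunk file and only hand the parser data up to the most recent flush point. The names below (FlushPointTailer, ChunkParser and its methods) are made up for illustration; they are not actual JFR classes.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: tail an unfinished chunk file and let a parser consume
// data up to the most recent flush point, where all referenced constants (stack traces,
// classes, methods) are guaranteed to be written. ChunkParser and its methods are made up.
final class FlushPointTailer {
    interface ChunkParser {
        int lastFlushPoint(ByteBuffer newData);          // offset of the last flush point, 0 if none
        void parseUpTo(ByteBuffer newData, int offset);  // events before a flush point are resolvable
    }

    static void tail(Path chunkFile, ChunkParser parser) throws IOException, InterruptedException {
        long consumed = 0;
        try (FileChannel ch = FileChannel.open(chunkFile, StandardOpenOption.READ)) {
            while (!Thread.currentThread().isInterrupted()) {
                long available = ch.size() - consumed;
                if (available > 0) {
                    ByteBuffer buf = ByteBuffer.allocate((int) available);
                    ch.read(buf, consumed);
                    buf.flip();
                    int safe = parser.lastFlushPoint(buf);
                    parser.parseUpTo(buf, safe);
                    consumed += safe;                    // bytes after the flush point are re-read later
                }
                Thread.sleep(1_000);                     // simple polling; a real reader would be smarter
            }
        }
    }
}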
We are thinking of adding a Java API for reading the data in the disk repository before a chunk is finished. It could perhaps look something like this:
new EventPublisher()
    .subscribe(event -> System.out.println(event));

new EventPublisher("jdk.GarbageCollection")
    .maxAge(Duration.ofSeconds(200))
    .subscribe(event -> System.out.println(event));

new EventPublisher()
    .flushInterval(Duration.ofSeconds(2))
    .subscribe(event -> System.out.println(event));
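As a usage sketch of the same idea, and assuming the callback receives something like a jdk.jfr.consumer.RecordedEvent, a subscriber could filter on event fields directly, e.g. only print long GC pauses:

new EventPublisher("jdk.GarbageCollection")
    .subscribe(event -> {
        // Assumption: the event exposes getDuration() the way RecordedEvent does today.
        if (event.getDuration().toMillis() > 100) {
            System.out.println(event);
        }
    });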
The layout of an event looks like this:
struct Event {
    int eventSize;
    long eventTypeId;
    long startTime;
    long duration;
    long threadId;
    <user defined fields>
};
If we know a user is only interested in certain events, we can read just the eventTypeId and skip parsing the rest of the data.
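As a sketch of that skipping idea, based on the simplified layout above (the real chunk format encodes integers in a compressed form, so this is illustrative rather than a working parser, and it assumes eventSize counts the whole event record):

import java.nio.ByteBuffer;

final class EventSkipper {
    // If the event at the buffer's position has the wanted type id, rewind so the caller
    // can parse it in full and return true; otherwise jump past the event without parsing
    // startTime, duration, threadId or the user defined fields.
    static boolean positionOnWanted(ByteBuffer chunk, long wantedTypeId) {
        int start = chunk.position();
        int eventSize = chunk.getInt();      // int eventSize
        long eventTypeId = chunk.getLong();  // long eventTypeId
        if (eventTypeId == wantedTypeId) {
            chunk.position(start);
            return true;
        }
        chunk.position(start + eventSize);
        return false;
    }
}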
Our thinking is to make the mechanism disk-based. It makes it easier to handle flow control and we believe the overhead is negligible compared to an in-memory solution. If this turns out to be incorrect, we could always add in-memory support later.
In your implementation, how do you consume data today? Is there a Java API, or do you use some other mechanism?
Thanks
Erik
> On 1 Aug 2018, at 15:35, Derek Thomson <dthomson at google.com> wrote:
>
> Hi all,
>
> After talking with some people with Oracle at JVMLS I was pointed at this
> list.
>
> We (at Google) basically have our own internal implementation of JFR event
> in-memory writing (on the JDK side) and reading (in our own Java code). We
> use the events in interesting ways, aggregated across large-scale jobs and
> services that have many, many processes.
>
> It's a pretty simplistic implementation to be honest - we had this for our
> own definitions of "GC events" for years (which weren't a great fit for CMS
> let alone G1) and I just decided to move to the JFR definitions (as
> described in the XML), rather than invent all our own event types for G1.
>
> Now that this is open source we'd like to move to this standard
> implementation, and contribute where we can of course. One barrier for us
> will be that currently it's file only, and I'd like to start talking about
> the possibility of an in-memory or streaming API. Any thoughts on that so
> far?
>
> Thanks,
> Derek.