RFC: JFR Recording Context

Erik Gahlin erik.gahlin at oracle.com
Tue Jun 15 14:30:17 UTC 2021


Hello Ludovic,

To better understand the rationale for this API, let me first go through what can and can’t be done today. You probably know this, but it's good to have everybody on the same page.

Using today’s API, it’s possible to create your own user-defined event and have a field where a context ID can be stored. It’s also possible to define you own annotation, i.e. Context, and user-defined setting, i.e. ContextSetting, for turning context tracing on and off.

What is not possible, is to add a field to other events, defined in the JVM/JDK or in a third-party library, and populate it with some value/context.

What you can do, is to create an event, i.e. TracingContext, with an ID field and call begin() when the context starts and end() when it ends, and then when parsing the recording, see what other events happened, in the same thread, during that event.

What would be the problem with such a solution?

Thanks
Erik
________________________________
From: hotspot-jfr-dev <hotspot-jfr-dev-retn at openjdk.java.net> on behalf of Ludovic Henry <ludovic.henry at datadoghq.com>
Sent: Friday, June 11, 2021 10:09 AM
To: hotspot-jfr-dev at openjdk.java.net <hotspot-jfr-dev at openjdk.java.net>
Subject: RFC: JFR Recording Context

Hello,

I want to gather your feedback on the following proposal to add a concept
of Context to JFR Events. This context is initialized by the application
and is captured and serialized in relevant JFR events.

That is an important feature for us at Datadog to enable multiple
scenarios. The main one is the ability to correlate an Event to a specific
Trace (it marks the execution of a specific request/transaction across many
services). With this correlation, we are able to extract Events which are
relevant to a specific Trace, allowing us to give more specific information
to our customers.

This context would have the following properties:
- it is enabled similarly to stacktraces, and would only be captured on
specified Events (with a `context=“true”` attribute in the metadata.xml or
a `@Context` annotation on the Event class).
- the context is captured when committing the Event
- the context consist conceptually in a `Map<String,String>`
- the entries in the context are defined by the application
- there is a maximum number of entries in the context
 - that setting can be modified through configuration but still in a
predetermined range (for example, default of maximum 64 context entries,
with a configurable maximum value of 512)
- once initialized, the entries are immutable (the value can’t be changed)
and keys can’t be removed without overriding the context.
- contexts are stackable, a new context doesn’t override an older one but
complement it, and closing a context doesn’t close any previous ones.
- the relevant context is propagated through tasks, threads, and other
async operations in the Class Libraries, similarly to ScopeLocal or
ThreadLocal
 - for external libraries, a capture/restore mechanism is available
- it is serialized efficiently in the JFR recording
 - it is assumed that the keys and values are repeated across many events,
similarly to stacktraces
 - the immutability of keys and values greatly helps caching and
compression (with `StringPool` and a `JfrContextRepository`, for example)
- the context is accessible from the VM without having to switch to Java
 - performance would otherwise be poor, and it would limit where and when a
native Event can be committed.

To achieve these properties, I propose the following public Java API:
```
class RecordingContextEntry {
 public static RecordingContextEntry forName(String name);
 public static RecordingContextEntry inheritableForName(String name);
 public boolean isBound();
}
class RecordingContext implements AutoCloseable {
 // snapshot + run
 public static class Snapshot {}
 public static Snapshot snapshot();
 public static <R> R callWithSnapshot(Callable<R> op, Snapshot s);
 public static void runWithSnapshot(Runnable op, Snapshot s);
 // initialize
 public static class Builder {
  // implicitly initialized with current context’s inheritable entries
  // build and set current context
  public Builder where(RecordingContextEntry key, String value);
  public RecordingContext build();
 }
 public static Builder builder();
 // close
 public void close();
}
```

Let's take the example of a sample Trace class that uses this
RecordingContext class to associate a trace_id to events:
```
class Trace implements AutoCloseable {
 static RecordingContextEntry contextTraceId =
      RecordingContextEntry.inheritableForName(“trace_id”);

 RecordingContext context;

 public Trace() {
  context =
    RecordingContext.builder()
      .where(contextTraceId, this.getTrace().getId())
      .build();
 }

 public void close() {
  context.close();
 }
}
```

Let’s also take a hypothetical async-heavy library which uses a custom
threadpool:
```
class CustomThreadPool {
 public void run(Runnable r) {
  this.workQueue.push(r);
 }
 // background threads then pop from this.workQueue
 // and execute the Runnable.
}
```

To propagate the context in this threadpool’s threads, the library or any
tracing framework would instrument the code with:
```
class RunableWithSnapshot implements Runnable {
 final private Runnable runnable;
 final private RecordingContext.Snapshot snapshot;
 RunableWithSnapshot(Runnable runnable) {
  this.runnable = runnable;
  this.snapshot = RecordingContext.snapshot();
 }
 public void run() {
  RecordingContext.runWithSnapshot(runnable, snapshot);
 }
}
class CustomThreadPool {
 public void run(Runnable r) {
  r = new RunnableWithSnapshot(r);
  this.workQueue.push(r);
 }
 // there are then background threads which pop from
 // this.workQueue and execute the Runnable.
}
```

The main challenges I have faced when implementing this feature [1] are:
- most of the changes are in Hotspot
- it’s tempting to rely on the ScopeLocal proposal but it doesn’t provide
the required API to integrate to existing TWR-like APIs
(try-with-resources).
- it changes the serialized format of an Event since it adds a long field
for the index of the context in the JfrContextRepository (serialized in
each chunk).

Thank you, and looking for your feedback.
Ludovic

[1] Beware, it’s a very early stage proof of concept, it doesn’t fully
capture the context, and the implementation is out of date with this API
proposal.
https://github.com/luhenry/jdk/compare/openjdk:master...jfr-context


More information about the hotspot-jfr-dev mailing list