RFC: JFR Recording Context
Ludovic Henry
ludovic.henry at datadoghq.com
Fri Jun 11 08:09:22 UTC 2021
Hello,
I want to gather your feedback on the following proposal to add a concept
of Context to JFR Events. This context is initialized by the application
and is captured and serialized in relevant JFR events.
This is an important feature for us at Datadog, as it enables multiple
scenarios. The main one is the ability to correlate an Event with a specific
Trace (which marks the execution of a specific request/transaction across many
services). With this correlation, we can extract the Events that are
relevant to a specific Trace, allowing us to give more specific information
to our customers.
This context would have the following properties:
- it is enabled similarly to stacktraces, and would only be captured on
specified Events (with a `context="true"` attribute in the metadata.xml or
a `@Context` annotation on the Event class).
- the context is captured when committing the Event
- the context consists conceptually of a `Map<String,String>`
- the entries in the context are defined by the application
- there is a maximum number of entries in the context
  - that setting can be modified through configuration, but only within a
    predetermined range (for example, a default maximum of 64 context
    entries, with a configurable upper bound of 512)
- once initialized, the entries are immutable (the value can’t be changed)
and keys can’t be removed without overriding the context.
- contexts are stackable: a new context doesn’t override an older one but
complements it, and closing a context doesn’t close any previous ones.
- the relevant context is propagated through tasks, threads, and other
async operations in the Class Libraries, similarly to ScopeLocal or
ThreadLocal
  - for external libraries, a capture/restore mechanism is available
- it is serialized efficiently in the JFR recording
  - it is assumed that the keys and values are repeated across many events,
    similarly to stacktraces
  - the immutability of keys and values greatly helps caching and
    compression (with `StringPool` and a `JfrContextRepository`, for example)
- the context is accessible from the VM without having to switch to Java
  - performance would otherwise be poor, and it would limit where and when a
    native Event can be committed.
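To make the stacking and immutability properties above more concrete, here is a minimal sketch in plain Java. It is a toy model, not the proposed implementation: `ToyContext`, `open`, and `current` are hypothetical names, the per-thread stack is an assumption, and letting an inner context shadow a key of an outer one is only one possible reading of "complements".

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model of a stackable, immutable context.
// Nothing here is part of JFR or of the proposed API.
final class ToyContext implements AutoCloseable {
    // Default limit taken from the example figure in the list above.
    static final int MAX_ENTRIES = 64;

    // Per-thread stack of immutable, already-merged views (assumption).
    private static final ThreadLocal<Deque<Map<String, String>>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    private final Map<String, String> merged;

    private ToyContext(Map<String, String> merged) {
        this.merged = merged;
    }

    // Opens a new context that complements the current one.
    static ToyContext open(Map<String, String> entries) {
        Map<String, String> merged = new LinkedHashMap<>(current());
        merged.putAll(entries);
        if (merged.size() > MAX_ENTRIES) {
            throw new IllegalStateException("too many context entries");
        }
        ToyContext ctx = new ToyContext(Collections.unmodifiableMap(merged));
        STACK.get().push(ctx.merged);
        return ctx;
    }

    // The view an event would capture when it is committed.
    static Map<String, String> current() {
        Map<String, String> top = STACK.get().peek();
        return top == null ? Collections.emptyMap() : top;
    }

    // Closing only removes this context; older ones stay in place.
    @Override
    public void close() {
        Deque<Map<String, String>> stack = STACK.get();
        if (stack.peek() != merged) {
            throw new IllegalStateException("contexts must be closed in reverse order");
        }
        stack.pop();
    }
}
```

Used with try-with-resources, an inner `ToyContext.open(...)` adds its entries on top of the outer ones, and leaving the inner block restores exactly the outer view.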
To achieve these properties, I propose the following public Java API:
```
class RecordingContextEntry {
    public static RecordingContextEntry forName(String name);
    public static RecordingContextEntry inheritableForName(String name);
    public boolean isBound();
}

class RecordingContext implements AutoCloseable {
    // snapshot + run
    public static class Snapshot {}
    public static Snapshot snapshot();
    public static <R> R callWithSnapshot(Callable<R> op, Snapshot s);
    public static void runWithSnapshot(Runnable op, Snapshot s);

    // initialize
    public static class Builder {
        // implicitly initialized with current context’s inheritable entries
        public Builder where(RecordingContextEntry key, String value);
        // build and set current context
        public RecordingContext build();
    }
    public static Builder builder();

    // close
    public void close();
}
```
Let's take the example of a sample Trace class that uses this
RecordingContext class to associate a trace_id to events:
```
class Trace implements AutoCloseable {
    static RecordingContextEntry contextTraceId =
        RecordingContextEntry.inheritableForName("trace_id");

    RecordingContext context;

    public Trace() {
        context =
            RecordingContext.builder()
                .where(contextTraceId, this.getTrace().getId())
                .build();
    }

    @Override
    public void close() {
        context.close();
    }
}
```
Let’s also take a hypothetical async-heavy library which uses a custom
threadpool:
```
class CustomThreadPool {
    public void run(Runnable r) {
        this.workQueue.push(r);
    }
    // background threads then pop from this.workQueue
    // and execute the Runnable.
}
```
To propagate the context in this threadpool’s threads, the library or any
tracing framework would instrument the code with:
```
class RunnableWithSnapshot implements Runnable {
    private final Runnable runnable;
    private final RecordingContext.Snapshot snapshot;

    // The snapshot is captured on the submitting thread.
    RunnableWithSnapshot(Runnable runnable) {
        this.runnable = runnable;
        this.snapshot = RecordingContext.snapshot();
    }

    // The snapshot is restored on the worker thread while the task runs.
    @Override
    public void run() {
        RecordingContext.runWithSnapshot(runnable, snapshot);
    }
}

class CustomThreadPool {
    public void run(Runnable r) {
        r = new RunnableWithSnapshot(r);
        this.workQueue.push(r);
    }
    // there are then background threads which pop from
    // this.workQueue and execute the Runnable.
}
```
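The capture/restore pattern above can be tried out without the proposed API. The following is a minimal sketch in which `ToySnapshotContext` is a hypothetical stand-in for `RecordingContext`: it assumes the context is a plain immutable `Map<String, String>` held in a `ThreadLocal`, and that a snapshot is simply a reference to that map.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the snapshot part of RecordingContext.
final class ToySnapshotContext {
    private static final ThreadLocal<Map<String, String>> CURRENT =
            ThreadLocal.withInitial(Map::of);

    static void set(Map<String, String> entries) {
        CURRENT.set(Map.copyOf(entries));
    }

    static Map<String, String> current() {
        return CURRENT.get();
    }

    // Capture: the map is immutable, so sharing the reference is safe.
    static Map<String, String> snapshot() {
        return CURRENT.get();
    }

    // Restore: install the snapshot for the duration of op, then put
    // back whatever the executing thread had before.
    static void runWithSnapshot(Runnable op, Map<String, String> snapshot) {
        Map<String, String> previous = CURRENT.get();
        CURRENT.set(snapshot);
        try {
            op.run();
        } finally {
            CURRENT.set(previous);
        }
    }

    // Must be called on the submitting thread, like RunnableWithSnapshot.
    static Runnable wrap(Runnable op) {
        Map<String, String> snapshot = snapshot();
        return () -> runWithSnapshot(op, snapshot);
    }
}

final class ToyPoolDemo {
    // Runs a task on a fresh worker thread and returns the context it saw.
    static Map<String, String> observeOnWorker(boolean propagate) {
        AtomicReference<Map<String, String>> seen = new AtomicReference<>();
        Runnable task = () -> seen.set(ToySnapshotContext.current());
        Thread worker = new Thread(propagate ? ToySnapshotContext.wrap(task) : task);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```

In this model a wrapped task observes the submitter's entries on the worker thread, while an unwrapped task observes an empty context, which is the gap the capture/restore mechanism is meant to close for external libraries.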
The main challenges I have faced when implementing this feature [1] are:
- most of the changes are in HotSpot
- it’s tempting to rely on the ScopeLocal proposal, but it doesn’t provide
the API required to integrate with existing TWR-like (try-with-resources)
APIs.
- it changes the serialized format of an Event since it adds a long field
for the index of the context in the JfrContextRepository (serialized in
each chunk).
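The idea of storing a single long index per event, rather than the full set of entries, can be illustrated with a small interning table. This is a sketch under assumptions: `ToyContextRepository` is a hypothetical name that only echoes `JfrContextRepository`, the real repository lives in the VM, and reserving id 0 for "no context" is a choice made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a per-chunk context repository.
final class ToyContextRepository {
    private final Map<Map<String, String>, Long> ids = new HashMap<>();
    private long nextId = 1; // 0 is reserved for "no context" (assumption)

    // Returns the index an event would store in its long field.
    long idFor(Map<String, String> context) {
        if (context.isEmpty()) {
            return 0L;
        }
        Long existing = ids.get(context);
        if (existing != null) {
            return existing; // identical contexts share one entry
        }
        long id = nextId++;
        ids.put(Map.copyOf(context), id); // defensive immutable copy as key
        return id;
    }

    // Number of distinct contexts that would be written for the chunk.
    int size() {
        return ids.size();
    }
}
```

Because keys and values are immutable, two events committed under the same context map to the same index, so the entries themselves are written once per chunk, much like stacktraces.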
Thank you, I look forward to your feedback.
Ludovic
[1] Beware: it’s a very early-stage proof of concept. It doesn’t fully
capture the context, and the implementation is out of date with this API
proposal.
https://github.com/luhenry/jdk/compare/openjdk:master...jfr-context