[External] : Re: JFR: Scrubbing sensitive information from events
Erik Gahlin
erik.gahlin at oracle.com
Mon Aug 25 21:22:01 UTC 2025
Sorry for the late reply.
AFAIK we don't have a native implementation of regular expressions, and writing one is error-prone. Glob patterns are easier and will probably work for most use cases. We could try caching the redaction result, so we don't need to repeat the same matching every time the event is emitted.
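As a rough illustration of the glob-plus-caching idea, here is a minimal standalone sketch in standard C++. It is not HotSpot code: the names glob_match and RedactionCache are hypothetical, the matcher only supports '*' and '?', matching is ASCII case-insensitive, and the cache is not thread-safe.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Case-insensitive glob match: '*' matches any run of characters
// (including none), '?' matches exactly one character.
// Iterative, backtracking to the most recent '*' on a mismatch.
inline bool glob_match(const std::string& pattern, const std::string& text) {
  auto lower = [](char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  };
  std::size_t p = 0, t = 0;
  std::size_t star = std::string::npos;  // index of the last '*' seen
  std::size_t mark = 0;                  // text index where that '*' started
  while (t < text.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;
      mark = t;
    } else if (p < pattern.size() &&
               (pattern[p] == '?' || lower(pattern[p]) == lower(text[t]))) {
      ++p;
      ++t;
    } else if (star != std::string::npos) {
      p = star + 1;  // let the last '*' absorb one more character
      t = ++mark;
    } else {
      return false;
    }
  }
  while (p < pattern.size() && pattern[p] == '*') ++p;
  return p == pattern.size();
}

// Remembers the verdict per key so repeated events skip the matching.
// Hypothetical helper; a real implementation would need locking.
class RedactionCache {
 public:
  explicit RedactionCache(std::vector<std::string> patterns)
      : _patterns(std::move(patterns)) {}

  bool is_sensitive_key(const std::string& key) {
    auto it = _cache.find(key);
    if (it != _cache.end()) return it->second;
    bool sensitive = false;
    for (const auto& pattern : _patterns) {
      if (glob_match(pattern, key)) {
        sensitive = true;
        break;
      }
    }
    _cache.emplace(key, sensitive);
    return sensitive;
  }

  std::size_t cached_entries() const { return _cache.size(); }

 private:
  std::vector<std::string> _patterns;
  std::unordered_map<std::string, bool> _cache;
};
```

With the patterns "*password*" and "*secret*", a key such as javax.net.ssl.keyStorePassword is matched once and served from the cache afterwards.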
I added this feature to our roadmap, and I will do some prototyping in the near term. It would be great if you could later provide a default set of patterns.
Erik
________________________________
From: Erwan Viollet <erwan.viollet at gmail.com>
Sent: Friday, June 13, 2025 2:51 PM
To: Erik Gahlin <erik.gahlin at oracle.com>
Cc: hotspot-jfr-dev at openjdk.org <hotspot-jfr-dev at openjdk.org>
Subject: Re: [External] : Re: JFR: Scrubbing sensitive information from events
I like the simple and pragmatic approach, especially if we can have this as the default.
Writing custom scrubbing logic for these specific events is much more straightforward than building a generic scrubbing framework.
To address the risk of missing sensitive patterns that are specific to user environments, I'd suggest adding a customization option:
# Allow users to extend or override default patterns
-XX:FlightRecorderOptions:scrub-sensitive-patterns="password,secret,key,token,credential,auth,pwd,passwd,api[_-]?key,custompattern"
We had to do the same for the Datadog Agent process scrubbing <https://docs.datadoghq.com/infrastructure/process/?tab=linuxwindows#process-arguments-scrubbing>, where users can specify `custom_sensitive_words` to handle their specific use cases.
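To make the customization idea concrete, here is a minimal standalone C++ sketch of how a comma-separated option value could extend a default word list. The function names (parse_patterns, effective_words, is_sensitive_key) are hypothetical, and entries are treated as plain case-insensitive substrings, so a regex-style entry such as api[_-]?key would be taken literally here.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case a copy of s (ASCII only).
inline std::string to_lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Split a comma-separated option value into lower-cased, non-empty words.
inline std::vector<std::string> parse_patterns(const std::string& option_value) {
  std::vector<std::string> words;
  std::size_t start = 0;
  while (start <= option_value.size()) {
    std::size_t end = option_value.find(',', start);
    if (end == std::string::npos) end = option_value.size();
    std::string word = to_lower(option_value.substr(start, end - start));
    if (!word.empty()) words.push_back(word);
    start = end + 1;
  }
  return words;
}

// Defaults first, then any user-supplied words not already present.
inline std::vector<std::string> effective_words(
    const std::vector<std::string>& defaults, const std::string& user_option) {
  std::vector<std::string> words = defaults;
  for (const auto& w : parse_patterns(user_option)) {
    if (std::find(words.begin(), words.end(), w) == words.end()) {
      words.push_back(w);
    }
  }
  return words;
}

// A key is sensitive if it contains any word, ignoring ASCII case.
// The words are assumed to be lower-case already.
inline bool is_sensitive_key(const std::vector<std::string>& words,
                             const std::string& key) {
  const std::string k = to_lower(key);
  for (const auto& w : words) {
    if (k.find(w) != std::string::npos) return true;
  }
  return false;
}
```

Substring matching is deliberately broad: a short word like "key" would also flag unrelated properties, which is one reason the default list needs to be chosen with care.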
The remaining questions:
1. Scope limitations: Are there any known use cases that would be impacted by limiting scrubbing to these four event types? We believe this covers the vast majority of sensitive data exposure, but want to ensure we're not missing critical scenarios.
2. Default pattern selection: We should align on what a reasonable default would look like. We can use our JFR data to do this.
Regards,
Erwan
On Thu, Jun 12, 2025 at 19:24, Erik Gahlin <erik.gahlin at oracle.com> wrote:
Thanks for the file.
I worry that processing the file in the JVM or creating an intuitive Java API for post-processing it will be hard. The context/event determines what needs to be redacted. If scrubbing is only necessary for these four events, hardcoding the sensitive tokens and logic into the JVM might be a viable approach.
Users would specify:
$ java -XX:FlightRecorderOptions:scrub-sensitive=true
or it might be enabled by default and users would need to opt-out.
Anyway, if enabled, a jfrScrub.cpp class would do the job. Something like this:
// jdk.InitialEnvironmentVariable
EventInitialEnvironmentVariable event(UNTIMED);
event.set_starttime(time_stamp);
event.set_endtime(time_stamp);
event.set_key(key);
if (JfrScrub::is_sensitive_key(key)) {
  event.set_value("[REDACTED]");
} else {
  event.set_value(value);
}
event.commit();

// jdk.InitialSystemProperty
EventInitialSystemProperty event(UNTIMED);
event.set_key(p->key());
if (JfrScrub::is_sensitive_key(p->key())) {
  event.set_value("[REDACTED]");
} else {
  event.set_value(p->value());
}
event.set_starttime(time_stamp);
event.set_endtime(time_stamp);
event.commit();

// jdk.SystemProcess
EventSystemProcess event(UNTIMED);
event.set_pid(pid_buf);
event.set_commandLine(JfrScrub::command_line(info));
event.set_starttime(start_time);
event.set_endtime(end_time);
event.commit();

// jdk.JVMInformation
EventJVMInformation event;
event.set_jvmName(VM_Version::vm_name());
event.set_jvmVersion(VM_Version::internal_vm_info_string());
event.set_javaArguments(JfrScrub::command_line(Arguments::java_command()));
event.set_jvmArguments(Arguments::jvm_args());
event.set_jvmFlags(Arguments::jvm_flags());
event.set_jvmStartTime(Management::vm_init_done_time());
event.set_pid(os::current_process_id());
event.commit();
It's a bit ugly and not as flexible, but perhaps that's something we need to tolerate. Or will it be useless because new passwords/keys will be added all the time, or because they will match false positives, and more advanced logic is needed? Perhaps it will give users the false(?) impression that they don't need to worry about sensitive data?
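The JfrScrub::command_line helper above is not spelled out in this thread. One possible shape, written as a standalone sketch in standard C++ rather than HotSpot code, is shown below. The function names and the word list are assumptions. The sketch only handles whitespace-separated name=value arguments, so quoted arguments containing spaces and forms like "--password value" are not covered, and runs of whitespace are collapsed to a single space.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// True if text contains any of the (lower-case) words, ignoring ASCII case.
inline bool contains_sensitive_word(const std::string& text,
                                    const std::vector<std::string>& words) {
  std::string lower(text);
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  for (const auto& w : words) {
    if (lower.find(w) != std::string::npos) return true;
  }
  return false;
}

// Redacts the value of every "name=value" argument whose name part
// contains a sensitive word. Other arguments are copied unchanged.
inline std::string scrub_command_line(const std::string& command_line,
                                      const std::vector<std::string>& words) {
  std::istringstream in(command_line);
  std::string token;
  std::string out;
  while (in >> token) {
    const std::size_t eq = token.find('=');
    if (eq != std::string::npos &&
        contains_sensitive_word(token.substr(0, eq), words)) {
      token = token.substr(0, eq + 1) + "[REDACTED]";
    }
    if (!out.empty()) out += ' ';
    out += token;
  }
  return out;
}
```

Only the name part before '=' is inspected, so a value that merely happens to contain a sensitive word is left untouched. The same per-argument check could be applied element by element to array-valued fields.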
Thanks
Erik
________________________________
From: Erwan Viollet <erwan.viollet at gmail.com>
Sent: Thursday, June 12, 2025 5:07 PM
To: Erik Gahlin <erik.gahlin at oracle.com>
Cc: hotspot-jfr-dev at openjdk.org <hotspot-jfr-dev at openjdk.org>
Subject: [External] : Re: JFR: Scrubbing sensitive information from events
Hello,
Here is an example of the types of events we are concerned about:
Recording
│
├── Event (e.g. jdk.InitialSystemProperty)
│ ├── eventType: "jdk.InitialSystemProperty"
│ ├── startTime
│ ├── duration
│ ├── fields:
│ │ ├── key: "javax.net.ssl.keyStorePassword"
│ │ ├── value: "supersecret"
│ │ └── ...
│ └── ...
│
├── Event (e.g. jdk.JVMInformation)
│ ├── eventType: "jdk.JVMInformation"
│ ├── jvmArguments: [ "-Xmx4G", "-Djavax.net.ssl.keyStorePassword=supersecret", ... ]
│ └── ...
│
└── ...
The rules are slightly challenging as they need to account for key/value pairs, arrays and simple fields (like the commandLine field).
Here <https://gist.github.com/r1viollet/812ed70c6410c4f62640fd792570d36c> is a scrub file example. I'm happy to consider ways to simplify this proposal. Keeping a set of sample JFR files would also help us build test cases.
Regards,
Erwan
On Tue, Jun 3, 2025 at 11:50, Erik Gahlin <erik.gahlin at oracle.com> wrote:
We have discussed it, but we don't understand all the details. We are also unsure how to best expose it to the end user. Let's say there was a command line option -XX:FlightRecorder:scrub-file=<file>.
What would you fill that file with? I want examples that work on real data to understand how expressive the filters must be.
Thanks
Erik
________________________________
From: hotspot-jfr-dev <hotspot-jfr-dev-retn at openjdk.org> on behalf of Erwan Viollet <erwan.viollet at gmail.com>
Sent: Monday, June 2, 2025 3:30 PM
To: hotspot-jfr-dev at openjdk.org <hotspot-jfr-dev at openjdk.org>
Subject: JFR: Scrubbing sensitive information from events
Hello,
I am currently looking into how to remove sensitive information from JFR events. The main events that typically contain sensitive information are jdk.SystemProcess, jdk.InitialSystemProperty and jdk.JVMInformation. Passwords from command lines can typically be found in these events.
Dropping these events altogether is not ideal, as we need them to make relevant performance recommendations to users (e.g. suggesting JVM or system setting adjustments).
Dropping them or scrubbing them on the backend side (after the fact) requires decompressing and re-writing these events, which is wasteful in terms of both compute and storage. The approach is not perfect, as we still end up intaking and temporarily storing sensitive information.
Ideally, we would like to be able to scrub or redact only the sensitive fields within these events (for example, using a simple regex or pattern-based rule), rather than dropping the whole event. We also want to avoid handling this only after the event has already been written to the JFR file, as that does not fully mitigate the risk of exposing sensitive data.
At present, it appears there is no public API or supported mechanism to intercept or scrub JFR events in-process, before they are persisted. What would you think of an API accepting custom scrubbing patterns so that sensitive data never leaves the JVM in an unredacted state?
Are there any plans or discussions in this area? I am fairly new to the JFR world, so it is likely that I missed previous discussions around this.
Thank you, Best regards,
Erwan Viollet,
Profiling team, Datadog