Re: RFR: 8352251: Implement Cooperative JFR Sampling [v15]
Greetings,
This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
Implementations in this change set are provided and have been tested on the following platforms:
- windows-x64 - windows-x64-debug - linux-x64 - linux-x64-debug - macosx-x64 - macosx-x64-debug - linux-aarch64 - linux-aarch64-debug - macosx-aarch64 - macosx-aarch64-debug
Testing: tier1-6, jdk_jfr, stress testing.
Platform porters note: Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
- src/hotspot/cpu/x86/frame_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.hpp - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp - src/hotspot/cpu/x86/macroAssembler_x86.cpp - src/hotspot/cpu/x86/macroAssembler_x86.hpp - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp - src/hotspot/cpu/x86/templateTable_x86.cpp - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
Thanks Markus
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: Configuration and test for jdk.SafepointLatency event ------------- Changes: - all: https://git.openjdk.org/jdk/pull/24296/files - new: https://git.openjdk.org/jdk/pull/24296/files/c65bc8b6..b0f9c5e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=24296&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24296&range=13-14 Stats: 82 lines in 4 files changed: 79 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/24296.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24296/head:pull/24296 PR: https://git.openjdk.org/jdk/pull/24296
On Tue, 29 Apr 2025 16:47:17 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Greetings,
This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
Implementations in this change set are provided and have been tested on the following platforms:
- windows-x64 - windows-x64-debug - linux-x64 - linux-x64-debug - macosx-x64 - macosx-x64-debug - linux-aarch64 - linux-aarch64-debug - macosx-aarch64 - macosx-aarch64-debug
Testing: tier1-6, jdk_jfr, stress testing.
Platform porters note: Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
- src/hotspot/cpu/x86/frame_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.hpp - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp - src/hotspot/cpu/x86/macroAssembler_x86.cpp - src/hotspot/cpu/x86/macroAssembler_x86.hpp - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp - src/hotspot/cpu/x86/templateTable_x86.cpp - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
Thanks Markus
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
Configuration and test for jdk.SafepointLatency event
The issue is that the CPU context can be retrieved here after the safepoint poll has been tested. That is causing a race, because a sample would be taken for an fp that is about to pop, breaking the invariant of the sampling mechanism. It is only for some sensitive interpreter positions that we need to inspect the correct fp (the sender's fp), to avoid this race. On x64, we signal that by preemptively moving rbp, first to update the CPU context and then by explicitly setting the sender_java_fp field in the LJF. With your suggestion, we would always prioritize the sender fp (because it is always available), which is unnecessary and incorrect (biased), except for where we are about to pop an interpreter frame (but we can't decide when that is the case). For testing, you will need to run some longer stress tests to see the effect of a racy sampling attempt. To provoke taking more samples, you can decrease the sampling interval of JFR by setting the following in default.jfc and / or profile.jfc: `diff --git a/src/jdk.jfr/share/conf/jfr/profile.jfc b/src/jdk.jfr/share/conf/jfr/profile.jfc index 4c9f4b4f8ec..75f8d75c580 100644 --- a/src/jdk.jfr/share/conf/jfr/profile.jfc +++ b/src/jdk.jfr/share/conf/jfr/profile.jfc @@ -198,12 +198,12 @@ <event name="jdk.ExecutionSample"> <setting name="enabled" control="method-sampling-enabled">true</setting> - <setting name="period" control="method-sampling-java-interval">10 ms</setting> + <setting name="period" control="method-sampling-java-interval">1 ms</setting> </event> <event name="jdk.NativeMethodSample"> <setting name="enabled" control="method-sampling-enabled">true</setting> - <setting name="period" control="method-sampling-native-interval">20 ms</setting> + <setting name="period" control="method-sampling-native-interval">1 ms</setting> </event> <event name="jdk.SafepointLatency">` Try running some longer stress test or benchmark, passing: `-XX:StartFlightRecording:settings=profile.jfc` ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2842656922
On Wed, 30 Apr 2025 17:02:12 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
Configuration and test for jdk.SafepointLatency event
The issue is that the CPU context can be retrieved here after the safepoint poll has been tested. That is causing a race, because a sample would be taken for an fp that is about to pop, breaking the invariant of the sampling mechanism.
It is only for some sensitive interpreter positions that we need to inspect the correct fp (the sender's fp), to avoid this race.
On x64, we signal that by preemptively moving rbp, first to update the CPU context and then by explicitly setting the sender_java_fp field in the LJF.
With your suggestion, we would always prioritize the sender fp (because it is always available), which is unnecessary and incorrect (biased), except for where we are about to pop an interpreter frame (but we can't decide when that is the case).
For testing, you will need to run some longer stress tests to see the effect of a racy sampling attempt.
To provoke taking more samples, you can decrease the sampling interval of JFR by setting the following in default.jfc and / or profile.jfc:
`diff --git a/src/jdk.jfr/share/conf/jfr/profile.jfc b/src/jdk.jfr/share/conf/jfr/profile.jfc index 4c9f4b4f8ec..75f8d75c580 100644 --- a/src/jdk.jfr/share/conf/jfr/profile.jfc +++ b/src/jdk.jfr/share/conf/jfr/profile.jfc @@ -198,12 +198,12 @@
<event name="jdk.ExecutionSample"> <setting name="enabled" control="method-sampling-enabled">true</setting> - <setting name="period" control="method-sampling-java-interval">10 ms</setting> + <setting name="period" control="method-sampling-java-interval">1 ms</setting> </event>
<event name="jdk.NativeMethodSample"> <setting name="enabled" control="method-sampling-enabled">true</setting> - <setting name="period" control="method-sampling-native-interval">20 ms</setting> + <setting name="period" control="method-sampling-native-interval">1 ms</setting> </event>
<event name="jdk.SafepointLatency">`
Try running some longer stress test or benchmark, passing:
`-XX:StartFlightRecording:settings=profile.jfc`
Hi @mgronlun , Is it possible to get head stream changes in this PR, if there is no objection from other architecture? It would be good to have changes from https://github.com/openjdk/jdk/pull/23660 to implement the build-fix for s390x. Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2849857901
On Wed, 30 Apr 2025 17:02:12 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
Configuration and test for jdk.SafepointLatency event
The issue is that the CPU context can be retrieved here after the safepoint poll has been tested. That is causing a race, because a sample would be taken for an fp that is about to pop, breaking the invariant of the sampling mechanism.
It is only for some sensitive interpreter positions that we need to inspect the correct fp (the sender's fp), to avoid this race.
On x64, we signal that by preemptively moving rbp, first to update the CPU context and then by explicitly setting the sender_java_fp field in the LJF.
With your suggestion, we would always prioritize the sender fp (because it is always available), which is unnecessary and incorrect (biased), except for where we are about to pop an interpreter frame (but we can't decide when that is the case).
For testing, you will need to run some longer stress tests to see the effect of a racy sampling attempt.
To provoke taking more samples, you can decrease the sampling interval of JFR by setting the following in default.jfc and / or profile.jfc:
`diff --git a/src/jdk.jfr/share/conf/jfr/profile.jfc b/src/jdk.jfr/share/conf/jfr/profile.jfc index 4c9f4b4f8ec..75f8d75c580 100644 --- a/src/jdk.jfr/share/conf/jfr/profile.jfc +++ b/src/jdk.jfr/share/conf/jfr/profile.jfc @@ -198,12 +198,12 @@
<event name="jdk.ExecutionSample"> <setting name="enabled" control="method-sampling-enabled">true</setting> - <setting name="period" control="method-sampling-java-interval">10 ms</setting> + <setting name="period" control="method-sampling-java-interval">1 ms</setting> </event>
<event name="jdk.NativeMethodSample"> <setting name="enabled" control="method-sampling-enabled">true</setting> - <setting name="period" control="method-sampling-native-interval">20 ms</setting> + <setting name="period" control="method-sampling-native-interval">1 ms</setting> </event>
<event name="jdk.SafepointLatency">`
Try running some longer stress test or benchmark, passing:
`-XX:StartFlightRecording:settings=profile.jfc`
Hi @mgronlun , Is it possible to get head stream changes in this PR, if there is no objection from other architecture? It would be good to have changes from #23660 to implement the build-fix for s390x. Thanks
HI Amit, now the PR has been merged with master and should contain your changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2850319187
On Tue, 29 Apr 2025 16:47:17 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Greetings,
This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
Implementations in this change set are provided and have been tested on the following platforms:
- windows-x64 - windows-x64-debug - linux-x64 - linux-x64-debug - macosx-x64 - macosx-x64-debug - linux-aarch64 - linux-aarch64-debug - macosx-aarch64 - macosx-aarch64-debug
Testing: tier1-6, jdk_jfr, stress testing.
Platform porters note: Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
- src/hotspot/cpu/x86/frame_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.hpp - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp - src/hotspot/cpu/x86/macroAssembler_x86.cpp - src/hotspot/cpu/x86/macroAssembler_x86.hpp - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp - src/hotspot/cpu/x86/templateTable_x86.cpp - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
Thanks Markus
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
Configuration and test for jdk.SafepointLatency event
Thanks for your explanation! I'll try some longer stress tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2843518409
On Tue, 29 Apr 2025 16:47:17 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Greetings,
This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
Implementations in this change set are provided and have been tested on the following platforms:
- windows-x64 - windows-x64-debug - linux-x64 - linux-x64-debug - macosx-x64 - macosx-x64-debug - linux-aarch64 - linux-aarch64-debug - macosx-aarch64 - macosx-aarch64-debug
Testing: tier1-6, jdk_jfr, stress testing.
Platform porters note: Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
- src/hotspot/cpu/x86/frame_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.hpp - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp - src/hotspot/cpu/x86/macroAssembler_x86.cpp - src/hotspot/cpu/x86/macroAssembler_x86.hpp - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp - src/hotspot/cpu/x86/templateTable_x86.cpp - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
Thanks Markus
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
Configuration and test for jdk.SafepointLatency event
I think I will just do the merge locally and create the Patch. Once all other relativization PR merges, I will pass the conflict-free patch to you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2849864644
On Tue, 29 Apr 2025 16:47:17 GMT, Markus Grönlund <mgronlun@openjdk.org> wrote:
Greetings,
This is the implementation of JEP [JDK-8350338 Cooperative JFR Sampling](https://bugs.openjdk.org/browse/JDK-8350338).
Implementations in this change set are provided and have been tested on the following platforms:
- windows-x64 - windows-x64-debug - linux-x64 - linux-x64-debug - macosx-x64 - macosx-x64-debug - linux-aarch64 - linux-aarch64-debug - macosx-aarch64 - macosx-aarch64-debug
Testing: tier1-6, jdk_jfr, stress testing.
Platform porters note: Some platform-specific code needs to be provided, mainly in the interpreter. Take a look at the following files for changes:
- src/hotspot/cpu/x86/frame_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.cpp - src/hotspot/cpu/x86/interp_masm_x86.hpp - src/hotspot/cpu/x86/javaFrameAnchor_x86.hpp - src/hotspot/cpu/x86/macroAssembler_x86.cpp - src/hotspot/cpu/x86/macroAssembler_x86.hpp - src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp - src/hotspot/cpu/x86/templateTable_x86.cpp - src/hotspot/os_cpu/linux_x86/javaThread_linux_x86.hpp
Thanks Markus
Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
Configuration and test for jdk.SafepointLatency event
Is there the possibility of adding a `bias` flag to the ExecutionSample events to record when an event has a clear safepoint bias? This could mark all samples where the sampler falls back on the safepoint frame. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2850225694
On Mon, 5 May 2025 08:09:10 GMT, Johannes Bechberger <jbechberger@openjdk.org> wrote:
Is there the possibility of adding a `bias` flag to the ExecutionSample events to record when an event has a clear safepoint bias? This could mark all samples where the sampler falls back on the safepoint frame.
The next phase, called "Sampling Accuracy," will address how to monitor and improve on safepoint biases and failures. It might involve separate events for failure reason tracking, so I prefer to delay the decision until then. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24296#issuecomment-2851136287
participants (4)
-
Amit Kumar
-
Johannes Bechberger
-
Markus Grönlund
-
Martin Doerr