Re: RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v10]
This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with - ... different heap sizes - ... different GCs - ... different samplers (the standard JFR and the new CPU Time Sampler and both) - ... different JFR recording durations - ... different chunk-sizes
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision: Don't mention throttling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25302/files - new: https://git.openjdk.org/jdk/pull/25302/files/d071acc9..cfbc25a0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25302&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/25302.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25302/head:pull/25302 PR: https://git.openjdk.org/jdk/pull/25302
On Mon, 26 May 2025 12:29:13 GMT, Johannes Bechberger <jbechberger@openjdk.org> wrote:
This is the code for the [JEP 509: CPU Time based profiling for JFR](https://openjdk.org/jeps/509).
Currently tested using [this test suite](https://github.com/parttimenerd/basic-profiler-tests). This runs profiles the [Renaissance](https://renaissance.dev/) benchmark with - ... different heap sizes - ... different GCs - ... different samplers (the standard JFR and the new CPU Time Sampler and both) - ... different JFR recording durations - ... different chunk-sizes
Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
Don't mention throttling
Having four views for the CPU-Time events seems excessive, especially since it's an experimental event. Is it possible to reduce this to two views: the hot-methods table and a "form" view where all the aggregates are displayed? Labels should use headline-style capitalization, capitalize the first and last words, and all nouns, pronouns, adjectives, verbs and adverbs. Do not include ending punctuation. https://docs.oracle.com/en/java/javase/24/docs/api/jdk.jfr/jdk/jfr/Label.htm... Also, put (Experimental) at the end of the title. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2909886888
On Mon, 26 May 2025 14:11:05 GMT, Erik Gahlin <egahlin@openjdk.org> wrote:
Also, put (Experimental) at the end of the title.
This is added automatically. If I add "(Experimental)" to the title, then I get "X (Experimental) (Experimental)"
"form" view where all the aggregates are displayed?
I'm unsure how to implement this using the SQL version that is used for the views ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2909896289
On Mon, 26 May 2025 14:14:26 GMT, Johannes Bechberger <jbechberger@openjdk.org> wrote:
This is added automatically. If I add "(Experimental)" to the title, then I get "X (Experimental) (Experimental)"
Sweet.
I'm unsure how to implement this using the SQL version that is used for the views
I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2909912821
On Mon, 26 May 2025 14:20:41 GMT, Erik Gahlin <egahlin@openjdk.org> wrote:
I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events.
I have a Mac, so I could not try it with an actual recording, but something like this: [application.cpu-time-statistics] label = "CPU Time Samples Statistics" form = "COLUMN 'Successful Samples', 'Failed Samples', 'Total Samples', 'Lost Samples' SELECT COUNT(T.failed), COUNT(F.failed), Count(A.failed), SUM(L.lostSamples) FROM CPUTimeSample AS T, CPUTimeSample AS F, CPUTimeSample AS A, CPUTimeSampleLoss AS L WHERE T.failed = 'true' AND F.failed = 'false'" I removed biased, because I wonder If we should have such a field? There can be many types of biases, and the implementation may change in the future. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2910137003
On Mon, 26 May 2025 15:42:57 GMT, Erik Gahlin <egahlin@openjdk.org> wrote:
I removed biased, because I wonder If we should have such a field? There can be many types of biases, and the implementation may change in the future.
This signifies that the sample has a known bias. It is important for the user to distinguish between biased and unbiased samples. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2910142902
On Mon, 26 May 2025 15:42:57 GMT, Erik Gahlin <egahlin@openjdk.org> wrote:
I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events.
I have a Mac, so I could not try it with an actual recording, but something like this:
``` [application.cpu-time-statistics] label = "CPU Time Samples Statistics" form = "COLUMN 'Successful Samples', 'Failed Samples', 'Total Samples', 'Lost Samples' SELECT COUNT(S.startTime), COUNT(F.startTime), Count(A.startTime), SUM(L.lostSamples) FROM CPUTimeSample AS S, CPUTimeSample AS F, CPUTimeSample AS A, CPUTimeSampleLoss AS L WHERE S.failed = 'false' AND F.failed = 'true'" ```
I implemented this now and removed the other statistics related views. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2917155602
On Mon, 26 May 2025 15:42:57 GMT, Erik Gahlin <egahlin@openjdk.org> wrote:
This is added automatically. If I add "(Experimental)" to the title, then I get "X (Experimental) (Experimental)"
Sweet.
I'm unsure how to implement this using the SQL version that is used for the views
I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events.
I will see if I can create an example with some other events that show the syntax, and then you can fill in the CPU-Time events.
I have a Mac, so I could not try it with an actual recording, but something like this:
[application.cpu-time-statistics] label = "CPU Time Samples Statistics" form = "COLUMN 'Successful Samples', 'Failed Samples', 'Total Samples', 'Lost Samples' SELECT COUNT(S.startTime), COUNT(F.startTime), Count(A.startTime), SUM(L.lostSamples) FROM CPUTimeSample AS S, CPUTimeSample AS F, CPUTimeSample AS A, CPUTimeSampleLoss AS L WHERE S.failed = 'false' AND F.failed = 'true'"
I removed biased, because I wonder If we should have such a field? There can be many types of biases, and the implementation may change in the future.
Hold on, shouldn't this really be "Lost"? @egahlin and @mgronlun need to chime in here.
Lost might be better. I wonder if `<Field type="Thread" name="eventThread" label="Thread" />` is needed, instead of thread = true? ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2934959447
On Tue, 3 Jun 2025 12:15:06 GMT, Erik Gahlin <egahlin@openjdk.org> wrote:
I wonder if <Field type="Thread" name="eventThread" label="Thread" /> is needed, instead of thread = true?
We had these discussions before on the old PR and then decided to end up with eventThread (as the other events do to), ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2934963523
On Tue, 3 Jun 2025 12:16:32 GMT, Johannes Bechberger <jbechberger@openjdk.org> wrote:
Hold on, shouldn't this really be "Lost"? @egahlin and @mgronlun need to chime in here.
Lost might be better.
I wonder if `<Field type="Thread" name="eventThread" label="Thread" />` is needed, instead of thread = true?
I wonder if <Field type="Thread" name="eventThread" label="Thread" /> is needed, instead of thread = true?
We had these discussions before on the old PR and then decided to end up with eventThread (as the other events do to),
@parttimenerd I would really like to see some kind of design description for this which explains what the threading model is, how the signals are used, and how all the pieces interact. Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2938512151
On Wed, 4 Jun 2025 04:50:56 GMT, David Holmes <dholmes@openjdk.org> wrote:
I wonder if <Field type="Thread" name="eventThread" label="Thread" /> is needed, instead of thread = true?
We had these discussions before on the old PR and then decided to end up with eventThread (as the other events do to),
@parttimenerd I would really like to see some kind of design description for this which explains what the threading model is, how the signals are used, and how all the pieces interact. Thanks
@dholmes-ora I attempt a first version here: The design consists of four main parts: - setup code: This sets up the signal handlers for every new thread and deletes them afterwards - the per-thread signal handlers: They check first that the current thread is valid, increment that they are currently active and check that they shouldn't stop (because the profiler is disabled). Now they acquire the thread-local enqueue lock for the current thread's request queue and push the sampling requests in (see https://openjdk.org/jeps/518 + the current period). It triggers/arms a safepoint. If the current thread is in native, they trigger (set a flag) the asynchronous stackwalking. This prevents long native periods of overflowing the request queue. Finally, the enqueue lock is released. - the safepoint handler: In the safepoint handler, we check if the thread-local queue is not empty. If so, we acquire a dequeue lock and process all entries of the queue, thereby creating JFR events. We also untrigger the async-stack-walking request for the thread. We then release the lock. - the sampler thread: Its task is to regularly update the timers if needed (configuration changes) and to walk the thread list to find any task that wants to be asynchronously stack-walked. For every of these threads, the dequeue lock is acquired (skipping if already set to enqueue) and the queue is processed as at the safepoint. Then the lock is released. On shutdown: Whenever the sampler is shut down, we first set the `_stop_signals` flag to prevent new signal handlers from entering the request creation code (and thereby accessing data structures that we already deallocated), we disable the timers for all threads and then wait till no signal handler is engaged anymore. It is important to note that there is only one thread-local lock used, but it has three states: - enqueue - dequeue - unlocked This prevents these phases from overlapping. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25302#issuecomment-2938677600
participants (3)
-
David Holmes
-
Erik Gahlin
-
Johannes Bechberger