From fandreuzzi at openjdk.org Sat Nov 1 11:48:04 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Sat, 1 Nov 2025 11:48:04 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication
In-Reply-To:
References:
Message-ID: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>

On Thu, 30 Oct 2025 06:01:48 GMT, Erik Gahlin wrote:

> It would be good if you could provide some ballpark figures on the number of events in a worst-case scenario, so we can determine what GC level is appropriate.

I wrote a simple pathological test with multiple threads interning random strings, [this](https://github.com/user-attachments/files/23282465/out-parallel.txt) is the worst I've seen: 100 deduplication rounds between `3.698s` and `3.729s`.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3476289368

From ysuenaga at openjdk.org Sun Nov 2 06:33:01 2025
From: ysuenaga at openjdk.org (Yasumasa Suenaga)
Date: Sun, 2 Nov 2025 06:33:01 GMT
Subject: RFR: 8370260: Test jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java timed out
In-Reply-To:
References:
Message-ID:

On Mon, 27 Oct 2025 00:51:06 GMT, Yasumasa Suenaga wrote:

> The test failure was reported at jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java due to minidump timeout.
> `TestEmergencyDumpAtOOM` does not need minidump, so we can add `-XX:-CreateCoredumpOnCrash` to the test process.
>
> This change works on Windows 11 Pro 25H2 and Fedora 42.

PING: Can I get a Reviewer? This is a test bug. Thanks!
-------------

PR Comment: https://git.openjdk.org/jdk/pull/27993#issuecomment-3477496897

From egahlin at openjdk.org Mon Nov 3 10:22:06 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 3 Nov 2025 10:22:06 GMT
Subject: RFR: 8370260: Test jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java timed out
In-Reply-To:
References:
Message-ID:

On Mon, 27 Oct 2025 00:51:06 GMT, Yasumasa Suenaga wrote:

> The test failure was reported at jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java due to minidump timeout.
> `TestEmergencyDumpAtOOM` does not need minidump, so we can add `-XX:-CreateCoredumpOnCrash` to the test process.
>
> This change works on Windows 11 Pro 25H2 and Fedora 42.

Marked as reviewed by egahlin (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/27993#pullrequestreview-3410338404

From ysuenaga at openjdk.org Mon Nov 3 14:29:29 2025
From: ysuenaga at openjdk.org (Yasumasa Suenaga)
Date: Mon, 3 Nov 2025 14:29:29 GMT
Subject: Integrated: 8370260: Test jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java timed out
In-Reply-To:
References:
Message-ID:

On Mon, 27 Oct 2025 00:51:06 GMT, Yasumasa Suenaga wrote:

> The test failure was reported at jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java due to minidump timeout.
> `TestEmergencyDumpAtOOM` does not need minidump, so we can add `-XX:-CreateCoredumpOnCrash` to the test process.
>
> This change works on Windows 11 Pro 25H2 and Fedora 42.

This pull request has now been integrated.
Changeset: 20ff33cb
Author: Yasumasa Suenaga
URL: https://git.openjdk.org/jdk/commit/20ff33cbdf393818b63bb8989e1def0b2d470c4b
Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod

8370260: Test jdk/jfr/event/oldobject/TestEmergencyDumpAtOOM.java timed out

Reviewed-by: syan, egahlin

-------------

PR: https://git.openjdk.org/jdk/pull/27993

From egahlin at openjdk.org Mon Nov 3 17:02:33 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 3 Nov 2025 17:02:33 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication
In-Reply-To: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
Message-ID:

On Sat, 1 Nov 2025 11:45:27 GMT, Francesco Andreuzzi wrote:

> > It would be good if you could provide some ballpark figures on the number of events in a worst-case scenario, so we can determine what GC level is appropriate.
>
> I wrote a simple pathological test with multiple threads interning random strings, [this](https://github.com/user-attachments/files/23282465/out-parallel.txt) is the worst I've seen: 100 deduplication rounds between `3.698s` and `3.729s`.

Thanks for investigating this. It doesn't sound that bad, the event could probably be enabled by default.

The elapsed fields, are they the total since the JVM started or from the last round?

We typically try to avoid using "Bytes" in field names, since that information is already available in the content type. Perhaps something else could be used, newSize?
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3481567424

From sgehwolf at openjdk.org Tue Nov 4 07:09:47 2025
From: sgehwolf at openjdk.org (Severin Gehwolf)
Date: Tue, 4 Nov 2025 07:09:47 GMT
Subject: RFR: 8365606: Container code should not be using jlong/julong [v2]
In-Reply-To:
References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com>
 <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com>
 <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com>
Message-ID:

On Mon, 27 Oct 2025 09:17:02 GMT, Andrew Haley wrote:

> > it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation.
>
> This. A function that returns its value as a side effect on a reference parameter is (at best) a code smell.

Thanks for the comments. So what's the consensus then? As far as API surface is concerned I've modelled it after [JDK-8357086](https://bugs.openjdk.org/browse/JDK-8357086). It [introduces](https://github.com/openjdk/jdk/commit/d5d94db12a6d82a6fe9da18b5f8ce3733a6ee7e7) the side-effect/code smell issue. Do we want to re-open this discussion or proceed with this here? It's not clear to me.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3469549344

From egahlin at openjdk.org Tue Nov 4 16:37:51 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Tue, 4 Nov 2025 16:37:51 GMT
Subject: RFR: 8370884: JFR: Overflow in aggregators
Message-ID:

Could I have a review of a change that fixes overflow issues for the jfr query aggregators?

Testing: jdk/jdk/jfr

Thanks
Erik

-------------

Commit messages:
 - Initial

Changes: https://git.openjdk.org/jdk/pull/28135/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28135&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8370884
Stats: 58 lines in 1 file changed: 46 ins; 0 del; 12 mod
Patch: https://git.openjdk.org/jdk/pull/28135.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28135/head:pull/28135

PR: https://git.openjdk.org/jdk/pull/28135

From fandreuzzi at openjdk.org Tue Nov 4 21:27:22 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Tue, 4 Nov 2025 21:27:22 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication [v4]
In-Reply-To:
References:
Message-ID:

> In this PR I introduce a new JFR event: `jdk.StringDeduplicationStatistics`
>
> The new event is emitted every time a deduplication cycle happens.
>
> Passes tier1 and tier2 (fastdebug).

Francesco Andreuzzi has updated the pull request incrementally with two additional commits since the last revision:

 - enable
 - bytes to size

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/28015/files
 - new: https://git.openjdk.org/jdk/pull/28015/files/c3c9a8db..e8644c68

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=02-03

Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod
Patch: https://git.openjdk.org/jdk/pull/28015.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015

PR: https://git.openjdk.org/jdk/pull/28015

From fandreuzzi at openjdk.org Tue Nov 4 21:34:45 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Tue, 4 Nov 2025 21:34:45 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication
In-Reply-To:
References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
Message-ID:

On Mon, 3 Nov 2025 16:59:55 GMT, Erik Gahlin wrote:

> We typically try to avoid using "Bytes" in field
names, since that information is already available in the content type. Perhaps something else could be used, newSize?

Sure: 586f413571c7c0354e9663888c81113065d991bf

> It doesn't sound that bad, the event could probably be enabled by default.

I re-enabled the event in `default.jfc`: e8644c683a4290f0ae112b7c63a8a1fc1c85b27e

> The elapsed fields, are they the total since the JVM started or from the last round?

All fields in `EventStringDeduplicationStatistics` contain the diff since the last round:

jdk.StringDeduplicationStatistics {
  startTime = 21:31:19.604 (2025-11-04)
  duration = 0.000020 ms
  inspected = 8424
  known = 2898
  shared = 1247
  newStrings = 4279
  newSize = 255.0 kB
  replaced = 0
  deleted = 0
  deduplicated = 3331
  deduplicatedSize = 102.4 kB
  skippedDead = 6
  skippedIncomplete = 0
  skippedShared = 0
  activeElapsed = 1.85 ms
  processElapsed = 1.85 ms
  idleElapsed = 191 ms
  resizeTableElapsed = 0 s
  cleanupTableElapsed = 0 s
}

jdk.StringDeduplicationStatistics {
  startTime = 21:31:19.604 (2025-11-04)
  duration = 0.000030 ms
  inspected = 1
  known = 0
  shared = 0
  newStrings = 1
  newSize = 24 bytes
  replaced = 0
  deleted = 0
  deduplicated = 0
  deduplicatedSize = 0 bytes
  skippedDead = 0
  skippedIncomplete = 0
  skippedShared = 0
  activeElapsed = 0.00124 ms
  processElapsed = 0.00100 ms
  idleElapsed = 0.000780 ms
  resizeTableElapsed = 0 s
  cleanupTableElapsed = 0 s
}

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3488086974

From fandreuzzi at openjdk.org Wed Nov 5 09:09:05 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Wed, 5 Nov 2025 09:09:05 GMT
Subject: RFR: 8370884: JFR: Overflow in aggregators
In-Reply-To:
References:
Message-ID:

On Tue, 4 Nov 2025 16:23:53 GMT, Erik Gahlin wrote:

> Could I have a review of a change that fixes overflow issues for the jfr query aggregators?
>
> Testing: jdk/jdk/jfr
>
> Thanks
> Erik

src/jdk.jfr/share/classes/jdk/jfr/internal/query/Function.java line 383:

> 381:             seconds = Math.addExact(s, nanosToAdd / NANOS_PER_SECOND);
> 382:             nanos = nanos + nanosToAdd % NANOS_PER_SECOND;
> 383:             hasValue = true;;

Suggestion:

            hasValue = true;

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28135#discussion_r2493592341

From egahlin at openjdk.org Wed Nov 5 13:59:07 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Wed, 5 Nov 2025 13:59:07 GMT
Subject: RFR: 8370884: JFR: Overflow in aggregators [v2]
In-Reply-To:
References:
Message-ID:

> Could I have a review of a change that fixes overflow issues for the jfr query aggregators?
>
> Testing: jdk/jdk/jfr
>
> Thanks
> Erik

Erik Gahlin has updated the pull request incrementally with one additional commit since the last revision:

  Minor fixes

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/28135/files
 - new: https://git.openjdk.org/jdk/pull/28135/files/c7f8dd73..2bdd6968

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28135&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28135&range=00-01

Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/28135.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28135/head:pull/28135

PR: https://git.openjdk.org/jdk/pull/28135

From egahlin at openjdk.org Wed Nov 5 14:18:17 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Wed, 5 Nov 2025 14:18:17 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication
In-Reply-To:
References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
Message-ID:

On Tue, 4 Nov 2025 21:32:19 GMT, Francesco Andreuzzi wrote:

> > The elapsed fields, are they the total since the JVM started or from the last round?
>
> All fields in `EventStringDeduplicationStatistics` contain the diff since the last round:
>

Since the event has a duration, I wonder if the event should be called StringDeduplication, similar to Compilation or GarbageCollection? As I understand it, the event represents a round of deduplication. All other events called statistics are instantaneous events.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3491455070

From mgronlun at openjdk.org Wed Nov 5 18:29:25 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Wed, 5 Nov 2025 18:29:25 GMT
Subject: RFR: 8370884: JFR: Overflow in aggregators [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 5 Nov 2025 13:59:07 GMT, Erik Gahlin wrote:

>> Could I have a review of a change that fixes overflow issues for the jfr query aggregators?
>>
>> Testing: jdk/jdk/jfr
>>
>> Thanks
>> Erik
>
> Erik Gahlin has updated the pull request incrementally with one additional commit since the last revision:
>
>   Minor fixes

Marked as reviewed by mgronlun (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/28135#pullrequestreview-3423711065

From fandreuzzi at openjdk.org Thu Nov 6 01:25:00 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Thu, 6 Nov 2025 01:25:00 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication [v5]
In-Reply-To:
References:
Message-ID:

> In this PR I introduce a new JFR event: `jdk.StringDeduplicationStatistics`
>
> The new event is emitted every time a deduplication cycle happens.
>
> Passes tier1 and tier2 (fastdebug).

Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision:

  no start

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/28015/files
 - new: https://git.openjdk.org/jdk/pull/28015/files/e8644c68..3793befd

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=03-04

Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/28015.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015

PR: https://git.openjdk.org/jdk/pull/28015

From fandreuzzi at openjdk.org Thu Nov 6 01:59:41 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Thu, 6 Nov 2025 01:59:41 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication [v6]
In-Reply-To:
References:
Message-ID:

> In this PR I introduce a new JFR event: `jdk.StringDeduplication`
>
> The new event is emitted every time a deduplication cycle happens.
>
> Passes tier1 and tier2 (fastdebug).

Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision:

  rename.
start/end time

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/28015/files
 - new: https://git.openjdk.org/jdk/pull/28015/files/3793befd..090c02bc

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=04-05

Stats: 12 lines in 6 files changed: 5 ins; 0 del; 7 mod
Patch: https://git.openjdk.org/jdk/pull/28015.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015

PR: https://git.openjdk.org/jdk/pull/28015

From fandreuzzi at openjdk.org Thu Nov 6 01:59:42 2025
From: fandreuzzi at openjdk.org (Francesco Andreuzzi)
Date: Thu, 6 Nov 2025 01:59:42 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication
In-Reply-To:
References: <1OO6CrVzIrUtVeqvYA5rwGSuKsrybfUJUSN0B3AS8FM=.3edb6e0a-621c-455f-8191-7eb76d669243@github.com>
Message-ID: <9Ne0a7oeIyXjdA_VFbfbX52u7WRgTonCp_jI906V6DQ=.8e10796c-6160-4e08-9595-1b35491dcda0@github.com>

On Wed, 5 Nov 2025 14:15:07 GMT, Erik Gahlin wrote:

> > > The elapsed fields, are they the total since the JVM started or from the last round?
> >
> >
> > All fields in `EventStringDeduplicationStatistics` contain the diff since the last round:
>
> Since the event has a duration, I wonder if the event should be called StringDeduplication, similar to Compilation or GarbageCollection? As I understand it, the event represents a round of deduplications. All other events called statistics are instantaneous events.

Yeah this makes sense, thanks.
See 090c02bce5ba79cff378bd48de0fc0849f532250

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3494458707

From egahlin at openjdk.org Thu Nov 6 13:43:22 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Thu, 6 Nov 2025 13:43:22 GMT
Subject: Integrated: 8370884: JFR: Overflow in aggregators
In-Reply-To:
References:
Message-ID:

On Tue, 4 Nov 2025 16:23:53 GMT, Erik Gahlin wrote:

> Could I have a review of a change that fixes overflow issues for the jfr query aggregators?
>
> Testing: jdk/jdk/jfr
>
> Thanks
> Erik

This pull request has now been integrated.

Changeset: df414e0d
Author: Erik Gahlin
URL: https://git.openjdk.org/jdk/commit/df414e0d19c1ed68f151d84dbb481a9dd6c65539
Stats: 58 lines in 1 file changed: 46 ins; 0 del; 12 mod

8370884: JFR: Overflow in aggregators

Reviewed-by: mgronlun

-------------

PR: https://git.openjdk.org/jdk/pull/28135

From egahlin at openjdk.org Fri Nov 7 09:10:41 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Fri, 7 Nov 2025 09:10:41 GMT
Subject: RFR: 8365972: JFR: ThreadDump and ClassLoaderStatistics events may cause back to back rotations
Message-ID:

Could I have a review of a PR that changes how `jdk.ThreadDump` and `jdk.ClassLoaderStatistics` events are emitted?

This change is a short-term solution for users who currently suffer from excessive event data, both in terms of count and size, being emitted at the beginning of a chunk. This high volume of events uses up so much chunk space that it immediately triggers a rotation. That rotation can quickly lead to another, causing back-to-back rotations. As a result, the default maximum size of 250 MB may only cover less than 30 seconds of data. This can happen with the thread dump event if there are many threads (1000) with a large number of frames (250). It can also occur with applications that have hundreds of thousands of class loaders.

The fix is to emit those two events when a new recording starts and then only at the end of every chunk.
That way, each chunk contains the event at least once, and users can always see the delta from the beginning to the end of a recording. Emitting events at the end of a chunk doesn't cause back-to-back rotations, since they are emitted after the max chunk threshold has been triggered.

(The long-term plan is to change all events with the period `everyChunk` in default.jfc to work this way, but it is too intrusive to backport, and we need more time to figure out how it should work. One idea is to redefine `everyChunk` to work in this new way. Another idea is to introduce a new representation called `rotation` for this new behavior, and keep `everyChunk` as is. A third idea is to introduce `beginRecording` and allow multiple values in a setting so you would have `beginRecording,endChunk`)

-------------

Commit messages:
 - Add test
 - Initial

Changes: https://git.openjdk.org/jdk/pull/28153/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28153&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8365972
Stats: 145 lines in 4 files changed: 140 ins; 0 del; 5 mod
Patch: https://git.openjdk.org/jdk/pull/28153.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28153/head:pull/28153

PR: https://git.openjdk.org/jdk/pull/28153

From mgronlun at openjdk.org Fri Nov 7 09:49:03 2025
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Fri, 7 Nov 2025 09:49:03 GMT
Subject: RFR: 8365972: JFR: ThreadDump and ClassLoaderStatistics events may cause back to back rotations
In-Reply-To:
References:
Message-ID:

On Wed, 5 Nov 2025 16:35:57 GMT, Erik Gahlin wrote:

> Could I have a review of a PR that changes how `jdk.ThreadDump` and `jdk.ClassLoaderStatistics` events are emitted?
>
> This change is a short-term solution for users who currently suffer from excessive event data, both in terms of count and size, being emitted at the beginning of a chunk. This high volume of events uses up so much chunk space that it immediately triggers a rotation.
That rotation can quickly lead to another, causing back-to-back rotations. As a result, the default maximum size of 250 MB may only cover less than 30 seconds of data. This can happen with the thread dump event if there are many threads (1000) with a large number of frames (250). It can also occur with applications that have hundreds of thousands of class loaders.
>
> The fix is to emit those two events when a new recording starts and then only at the end of every chunk. That way, each chunk contains the event at least once, and users can always see the delta from the beginning to the end of a recording. Emitting events at the end of a chunk doesn't cause back-to-back rotations, since they are emitted after the max chunk threshold has been triggered.
>
> (The long-term plan is to change all events with the period `everyChunk` in default.jfc to work this way, but it is too intrusive to backport, and we need more time to figure out how it should work. One idea is to redefine `everyChunk` to work in this new way. Another idea is to introduce a new representation called `rotation` for this new behavior, and keep `everyChunk` as is. A third idea is to introduce `beginRecording` and allow multiple values in a setting so you would have `beginRecording,endChunk`)

Marked as reviewed by mgronlun (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/28153#pullrequestreview-3432635321

From mbaesken at openjdk.org Fri Nov 7 13:31:38 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Fri, 7 Nov 2025 13:31:38 GMT
Subject: RFR: 8371473: Problem list TestEmergencyDumpAtOOM.java on ppc64 platforms related to JDK-8371014
Message-ID: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>

TestEmergencyDumpAtOOM.java fails on ppc64 platforms because of issues with the emergency writing of a jfr file in crash cases.
-------------

Commit messages:
 - JDK-8371473

Changes: https://git.openjdk.org/jdk/pull/28193/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28193&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8371473
Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/28193.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28193/head:pull/28193

PR: https://git.openjdk.org/jdk/pull/28193

From mdoerr at openjdk.org Fri Nov 7 13:31:39 2025
From: mdoerr at openjdk.org (Martin Doerr)
Date: Fri, 7 Nov 2025 13:31:39 GMT
Subject: RFR: 8371473: Problem list TestEmergencyDumpAtOOM.java on ppc64 platforms related to JDK-8371014
In-Reply-To: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
References: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
Message-ID: <5XwAGRIxub7YGkqVdQSgvL1MYJrMBJUkZX1MrVLPKSE=.01ca67af-01bc-49a3-9071-c0d751ece3a1@github.com>

On Fri, 7 Nov 2025 13:20:50 GMT, Matthias Baesken wrote:

> TestEmergencyDumpAtOOM.java fails on ppc64 platforms because of issues with the emergency writing of a jfr file in crash cases.

LGTM. Thanks!

-------------

Marked as reviewed by mdoerr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/28193#pullrequestreview-3433891731

From phubner at openjdk.org Fri Nov 7 13:44:04 2025
From: phubner at openjdk.org (Paul Hübner)
Date: Fri, 7 Nov 2025 13:44:04 GMT
Subject: RFR: 8371473: Problem list TestEmergencyDumpAtOOM.java on ppc64 platforms related to JDK-8371014
In-Reply-To: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
References: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
Message-ID: <3TYPtkTGhIPkVqVHpcMb1bdL7RbsLQoUUZGdtCKjJUQ=.25ce32cd-c1d6-4a60-bc05-7479e29d0e81@github.com>

On Fri, 7 Nov 2025 13:20:50 GMT, Matthias Baesken wrote:

> TestEmergencyDumpAtOOM.java fails on ppc64 platforms because of issues with the emergency writing of a jfr file in crash cases.

Marked as reviewed by phubner (Author).

-------------

PR Review: https://git.openjdk.org/jdk/pull/28193#pullrequestreview-3434009407

From mbaesken at openjdk.org Mon Nov 10 08:01:21 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Mon, 10 Nov 2025 08:01:21 GMT
Subject: RFR: 8371473: Problem list TestEmergencyDumpAtOOM.java on ppc64 platforms related to JDK-8371014
In-Reply-To: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
References: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
Message-ID:

On Fri, 7 Nov 2025 13:20:50 GMT, Matthias Baesken wrote:

> TestEmergencyDumpAtOOM.java fails on ppc64 platforms because of issues with the emergency writing of a jfr file in crash cases.

Thanks for the reviews!
-------------

PR Comment: https://git.openjdk.org/jdk/pull/28193#issuecomment-3509980756

From mbaesken at openjdk.org Mon Nov 10 08:01:22 2025
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Mon, 10 Nov 2025 08:01:22 GMT
Subject: Integrated: 8371473: Problem list TestEmergencyDumpAtOOM.java on ppc64 platforms related to JDK-8371014
In-Reply-To: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
References: <7y9Im02ui6uGUTDorUzOCwY4trO6TVAOed9VftrhxhU=.dd34b10c-5389-4c81-a538-1106405b68ee@github.com>
Message-ID:

On Fri, 7 Nov 2025 13:20:50 GMT, Matthias Baesken wrote:

> TestEmergencyDumpAtOOM.java fails on ppc64 platforms because of issues with the emergency writing of a jfr file in crash cases.

This pull request has now been integrated.

Changeset: 79fee607
Author: Matthias Baesken
URL: https://git.openjdk.org/jdk/commit/79fee607fd77320cd5deb8e424582e2f6c2b31a2
Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod

8371473: Problem list TestEmergencyDumpAtOOM.java on ppc64 platforms related to JDK-8371014

Reviewed-by: mdoerr, phubner

-------------

PR: https://git.openjdk.org/jdk/pull/28193

From egahlin at openjdk.org Mon Nov 10 10:25:42 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 10 Nov 2025 10:25:42 GMT
Subject: Integrated: 8365972: JFR: ThreadDump and ClassLoaderStatistics events may cause back to back rotations
In-Reply-To:
References:
Message-ID:

On Wed, 5 Nov 2025 16:35:57 GMT, Erik Gahlin wrote:

> Could I have a review of a PR that changes how `jdk.ThreadDump` and `jdk.ClassLoaderStatistics` events are emitted?
>
> This change is a short-term solution for users who currently suffer from excessive event data, both in terms of count and size, being emitted at the beginning of a chunk. This high volume of events uses up so much chunk space that it immediately triggers a rotation. That rotation can quickly lead to another, causing back-to-back rotations.
As a result, the default maximum size of 250 MB may cover less than 30 seconds of data. This can happen with the thread dump event if there are many threads (1000) with a large number of frames (250). It can also occur with applications that have hundreds of thousands of class loaders.
>
> The fix is to emit those two events when a new recording starts and then only at the end of every chunk. That way, each chunk contains the event at least once, and users can always see the delta from the beginning to the end of a recording. Emitting events at the end of a chunk doesn't cause back-to-back rotations, since they are emitted after the max chunk threshold has been triggered.
>
> (The long-term plan is to change all events with the period `everyChunk` in default.jfc to work this way, but it is too intrusive to backport, and we need more time to figure out how it should work. One idea is to redefine `everyChunk` to work in this new way. Another idea is to introduce a new representation called `rotation` for this new behavior, and keep `everyChunk` as is. A third idea is to introduce `beginRecording` and allow multiple values in a setting so you would have `beginRecording,endChunk`)
>
> Testing: jdk/jdk/jfr + 500 * TestBackToBackSensitive.java

This pull request has now been integrated.

Changeset: 681dab72
Author: Erik Gahlin
URL: https://git.openjdk.org/jdk/commit/681dab7205190176b842bd42914b1cb9fe752e44
Stats: 145 lines in 4 files changed: 140 ins; 0 del; 5 mod

8365972: JFR: ThreadDump and ClassLoaderStatistics events may cause back to back rotations

Reviewed-by: mgronlun

-------------

PR: https://git.openjdk.org/jdk/pull/28153

From egahlin at openjdk.org Mon Nov 10 10:36:04 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 10 Nov 2025 10:36:04 GMT
Subject: RFR: 8037914: Add JFR event for string deduplication [v6]
In-Reply-To:
References:
Message-ID:

On Thu, 6 Nov 2025 01:59:41 GMT, Francesco Andreuzzi wrote:

>> In this PR I introduce a new JFR event: `jdk.StringDeduplication`
>>
>> The new event is emitted every time a deduplication cycle happens.
>>
>> Passes tier1 and tier2 (fastdebug).
>
> Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision:
>
>   rename. start/end time

This looks better, but I think activeElapsed can be removed since we now have it in duration. I'm not sure idle should be included, unless it is believed to be important.

  activeElapsed = 0.00124 ms
  processElapsed = 0.00100 ms
  idleElapsed = 0.000780 ms
  resizeTableElapsed = 0 s
  cleanupTableElapsed = 0 s

An argument can be made that the phases should be separate events, similar to CompilerPhase and GCPausePhase, where you have a name for each phase (String Processing, Table Resize and Table Cleanup), but it may be over-engineering if we don't believe these phases will change in the future?

The suffix "Elapsed" is not something we have used for describing a timespan.
I wonder if the fields should be:

  processing
  tableResize
  tableCleanup

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3510740806

From krk at openjdk.org Mon Nov 10 12:35:34 2025
From: krk at openjdk.org (Kerem Kat)
Date: Mon, 10 Nov 2025 12:35:34 GMT
Subject: RFR: 8369949: Increase Xmx of TestWaste.java so the EdgeQueue is larger, to fix crash
Message-ID:

SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS.

We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`.

Here `Xmx` is set to `2g`. More details in the issue.

-------------

Commit messages:
 - 8369949: Increase Xmx of TestWaste.java so the EdgeQueue is larger, to fix crash

Changes: https://git.openjdk.org/jdk/pull/28215/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28215&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8369949
Stats: 2 lines in 2 files changed: 0 ins; 1 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/28215.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/28215/head:pull/28215

PR: https://git.openjdk.org/jdk/pull/28215

From egahlin at openjdk.org Mon Nov 10 12:57:07 2025
From: egahlin at openjdk.org (Erik Gahlin)
Date: Mon, 10 Nov 2025 12:57:07 GMT
Subject: RFR: 8369949: Increase Xmx of TestWaste.java so the EdgeQueue is larger, to fix crash
In-Reply-To:
References:
Message-ID:

On Mon, 10 Nov 2025 12:28:18 GMT, Kerem Kat wrote:

> SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS.
>
> We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`.
>
> Here `Xmx` is set to `2g`. More details in the issue.

The DFS has a maximum depth of 4 000.
If that is too high, it should be reduced so it works out of the box.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28215#issuecomment-3511468599

From sgehwolf at openjdk.org Mon Nov 10 13:10:09 2025
From: sgehwolf at openjdk.org (Severin Gehwolf)
Date: Mon, 10 Nov 2025 13:10:09 GMT
Subject: RFR: 8365606: Container code should not be using jlong/julong [v2]
In-Reply-To:
References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com>
 <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com>
Message-ID:

On Mon, 27 Oct 2025 14:13:57 GMT, Thomas Fitzsimmons wrote:

>> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>>
>> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor
>> - Fix print_container_info output
>> - whitespace clean-ups and other small fixes
>> - Fix log format in container macro and scanf format
>> - Fix duplicate include in osContainer_linux
>> - 8365606: Container code should not be using jlong/julong
>
> src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 627:
>
>> 625:  *
>> 626:  * If quotas have not been specified, return the
>> 627:  * number of active processors in the system.
>
> This paragraph uses the "return" language that you adjusted in the next paragraph. It should probably also refer to the reference argument instead.

Thanks, fixed.

> src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 629:
>
>> 627:  * number of active processors in the system.
>> 628:  *
>> 629:  * If quotas have been specified, the resulting number
>
> Tiny nit, but "the resulting number" => "the number", since you say "the result reference" on the next line.

Fixed.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510514731 PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510515207 From sgehwolf at openjdk.org Mon Nov 10 13:13:05 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:13:05 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 09:50:33 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 638: > >> 636: bool CgroupSubsystem::active_processor_count(int& value) { >> 637: int cpu_count; >> 638: int result = -1; > > Why not get rid of result and use `value` throughout like you did in the cached case? It's useful to do assertions on the value retrieved by `CgroupUtil::processor_count()` before the actual result is being changed. Using `value` has the issue of not knowing what the reference default value was set to. 
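For readers following the `result` versus `value` point above, here is a minimal sketch of the pattern being described: compute into a local, assert on it, and only then write to the caller's reference. It is not the HotSpot code. `limited_cpu_count` is a hypothetical stand-in for `CgroupUtil::processor_count()`, the parameter lists are invented for illustration, and the round-up of quota over period is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical stand-in for CgroupUtil::processor_count(); the real
// signature and limit calculation are not shown in this thread.
// Returns false on error and leaves 'result' untouched in that case.
static bool limited_cpu_count(int host_cpus, int quota, int period, int& result) {
  if (host_cpus <= 0 || period <= 0) {
    return false;
  }
  int count = host_cpus;
  if (quota > 0) {
    int limit = (quota + period - 1) / period;  // assumed: round up to whole CPUs
    if (limit < count) {
      count = limit;
    }
  }
  result = count;
  return true;
}

// Sketch of the caller: the local 'result' starts from a known value, so it
// can be checked before the caller-supplied 'value' is modified.
static bool active_processor_count(int host_cpus, int quota, int period, int& value) {
  int result = -1;
  if (!limited_cpu_count(host_cpus, quota, period, result)) {
    return false;  // 'value' keeps whatever the caller passed in
  }
  assert(result > 0 && "processor count must be positive");
  value = result;
  return true;
}
```

With a local, the assertion does not depend on whatever the caller happened to store in `value` beforehand, which appears to be the concern raised in the reply.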
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510522176 From sgehwolf at openjdk.org Mon Nov 10 13:19:07 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:19:07 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 14:19:19 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 653: > >> 651: cpu_count = os::Linux::active_processor_count(); >> 652: if (!CgroupUtil::processor_count(contrl->controller(), cpu_count, result)) { >> 653: return false; > > `value` will be returned unchanged from its passed-in value here. I wonder if it would be safer to explicitly set it to `0` when returning `false`. Also, could `value` be given an unsigned type, like `uint64_t`? The general contract in those functions is that the result reference is unchanged when `false` is being returned. So this is intentional. > Also, could value be given an unsigned type, like uint64_t I've tried to keep the `int` based processor_count API as is. Not sure if we need an unsigned type here. 
We could if that's the consensus, but then it would make the patch even larger. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510540986 From sgehwolf at openjdk.org Mon Nov 10 13:24:04 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:24:04 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 14:29:11 GMT, Thomas Fitzsimmons wrote: > I think you should quote `value_unlimited` here to hint that it is a constant defined elsewhere. OK. > Can the limit ever be 0, and if not, should there be a new assert for `> 0` like for `cpu_count`? The limit could theoretically be `0`. I'd try to avoid an overzealous assert here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510555699 From sgehwolf at openjdk.org Mon Nov 10 13:30:18 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:30:18 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 11:22:01 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 80: > >> 78: return false; \ >> 79: } \ >> 80: log_trace(os, container)(log_string " is: " UINT64_FORMAT, retval); \ > > Here and in other places: don't use raw UINT64_FORMAT; use `PHYS_MEM_TYPE_FORMAT` instead. This is intentional since the processor_count API doesn't use `physical_memory_size_type` (as it doesn't make sense in this context). See, for example, `CgroupV2CpuController::cpu_period()`. The common denominator is `uint64_t`. This is a bit awkward, but I don't know a better way to deal with this. The reading functions are shared, most of the API is used for memory value reading (but not exclusively, exceptions are `pid`, `cpu`). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510577587 From sgehwolf at openjdk.org Mon Nov 10 13:34:21 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:34:21 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 12:03:26 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 90: > >> 88: if (!is_ok) { \ >> 89: log_trace(os, container)(log_string " failed: -2"); \ >> 90: return false; \ > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? We don't need the `-2` here. This was an attempt to stay backwards compatible, but I guess we can change the testing code as well (at least the tests that rely on those trace logs). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510588965 From sgehwolf at openjdk.org Mon Nov 10 13:42:15 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:42:15 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 12:04:51 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 93: > >> 91: } \ >> 92: if (retval == value_unlimited) { \ >> 93: log_trace(os, container)(log_string " is: -1"); \ > > Same here, could perhaps do `log_trace(os, container)(log_string " is: unlimited")`instead. OK. This will likely need some test adjustment, but I'll do that instead of hard-coding those numbers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510609021 From sgehwolf at openjdk.org Mon Nov 10 13:42:18 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:42:18 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Fri, 24 Oct 2025 11:23:05 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 105: > >> 103: is_ok = controller->read_string(filename, retval, buf_size); \ >> 104: if (!is_ok) { \ >> 105: log_trace(os, container)(log_string " failed: -2"); \ > > Why this change? Did the constant value change? Motivation was getting rid of the OSCONTAINER_ERROR constant. The only place where a negative number was still in use. I've just dropped the `: -2` suffix now. It's not very useful (other than in tests). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510615109 From sgehwolf at openjdk.org Mon Nov 10 13:50:26 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:50:26 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <5Ossha9mznuIOp64P8MfLZaLaubRFuaVH1jGQEu6Hb0=.82d5744d-3d56-4ac6-8b19-c9664717069f@github.com> On Mon, 27 Oct 2025 19:28:08 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 167: > >> 165: /* memory_and_swap_limit_in_bytes >> 166: * >> 167: * Determine the memory and swap limit metric. Returns a positive limit value or > > "Returns" language should probably be updated here too. Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510646920 From sgehwolf at openjdk.org Mon Nov 10 13:50:29 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:50:29 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Tue, 28 Oct 2025 09:26:09 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 465: > >> 463: // negative value as a large unsiged int >> 464: if (!reader()->read_number("/cpu.cfs_quota_us", quota)) { >> 465: log_trace(os, container)("CPU Quota failed: -2"); > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? I've dropped `: -2` suffix now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510649413 From sgehwolf at openjdk.org Mon Nov 10 13:55:39 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 13:55:39 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 19:48:58 GMT, Thomas Fitzsimmons wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 470: > >> 468: // cast to int since the read value might be negative >> 469: // and we want to avoid logging -1 as a large unsigned value. >> 470: int quota_int = static_cast<int>(quota); > > It seems like quota is either a positive number or disabled. I wonder if `result` can be treated as a `uint64_t`, and this log message special-cased to detect `-1` read from `/cpu.cfs_quota_us` as disabled. I guess the calling code would need another way to differentiate "disabled" from other values... maybe with `0`? Just a thought to maybe simplify the type logic here. Likewise for `period` and `shares`. Yes, there is opportunity to change the API. This patch was done to do a 1-to-1 translation of the previous version as much as possible. So I've refrained from doing this in this patch as well. It kept the size of the patch a bit more manageable. Happy to file a follow-up RFE to do this in a separate patch. Thoughts? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510666794 From krk at openjdk.org Mon Nov 10 14:05:08 2025 From: krk at openjdk.org (Kerem Kat) Date: Mon, 10 Nov 2025 14:05:08 GMT Subject: RFR: 8369949: Increase Xmx of TestWaste.java so the EdgeQueue is larger, to fix crash [v2] In-Reply-To: References: Message-ID: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> > SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS.
> > We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`. > > Here `Xmx` is set to `2g`. More details in the issue. Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: Set max_dfs_depth to 3200 instead ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28215/files - new: https://git.openjdk.org/jdk/pull/28215/files/739b0589..df40515d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28215&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28215&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/28215.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28215/head:pull/28215 PR: https://git.openjdk.org/jdk/pull/28215 From krk at openjdk.org Mon Nov 10 14:05:09 2025 From: krk at openjdk.org (Kerem Kat) Date: Mon, 10 Nov 2025 14:05:09 GMT Subject: RFR: 8369949: Increase Xmx of TestWaste.java so the EdgeQueue is larger, to fix crash In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 12:28:18 GMT, Kerem Kat wrote: > SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS. > > We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`. > > Here `Xmx` is set to `2g`. More details in the issue. In my tests on linux x86_64, the test crashes with `DFSClosure::max_dfs_depth = 3255` and passes with `3254`. 
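To illustrate why the depth limit matters here, the following is a rough sketch of a traversal that runs breadth-first while a bounded queue has room and falls back to a depth-limited recursive DFS once it is full. It is not the JFR leak profiler code. `Graph`, `Traversal`, `queue_capacity` and `deepest_dfs` are hypothetical names, and `max_dfs_depth` only borrows the name of the HotSpot constant.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical graph type: adjacency lists indexed by node id.
using Graph = std::vector<std::vector<int>>;

struct Traversal {
  const Graph& graph;
  std::size_t queue_capacity;  // illustrative stand-in for the EdgeQueue size
  std::size_t max_dfs_depth;   // illustrative stand-in for DFSClosure::max_dfs_depth
  std::vector<bool> visited;
  std::size_t deepest_dfs;     // deepest recursion level actually reached

  Traversal(const Graph& g, std::size_t capacity, std::size_t depth_limit)
    : graph(g), queue_capacity(capacity), max_dfs_depth(depth_limit),
      visited(g.size(), false), deepest_dfs(0) {}

  // Recursive fallback. Every level costs one native stack frame, so the
  // depth limit is what bounds stack usage.
  void dfs(int node, std::size_t depth) {
    if (depth > max_dfs_depth || visited[node]) {
      return;
    }
    visited[node] = true;
    if (depth > deepest_dfs) {
      deepest_dfs = depth;
    }
    for (int next : graph[node]) {
      dfs(next, depth + 1);
    }
  }

  // Breadth-first while the queue has room, depth-first once it is full.
  void run(int root) {
    assert(root >= 0 && static_cast<std::size_t>(root) < graph.size());
    std::queue<int> pending;
    visited[root] = true;
    pending.push(root);  // the root is always accepted
    while (!pending.empty()) {
      int node = pending.front();
      pending.pop();
      for (int next : graph[node]) {
        if (visited[next]) {
          continue;
        }
        if (pending.size() < queue_capacity) {
          visited[next] = true;
          pending.push(next);
        } else {
          dfs(next, 1);
        }
      }
    }
  }
};
```

In this sketch a small queue pushes more of the graph into the recursive path, so either a larger queue (more heap) or a lower depth limit keeps the recursion within the thread's stack. Those are the two options discussed in this thread.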
------------- PR Comment: https://git.openjdk.org/jdk/pull/28215#issuecomment-3511838170 From sgehwolf at openjdk.org Mon Nov 10 14:09:54 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:09:54 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 11:32:34 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 61: > >> 59: * true if the result reference got updated >> 60: * false if there was an error >> 61: */ > > We set result to `-1` and return true on a no share setup here, but return `false` and don't on cgroup v1. The comment is contradicting. Good catch. Fixed the cgroup v1 code to match the old behaviour (set `-1` in the result reference and return `true` if we read the default value). I think this fixes the issue. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510712320 From sgehwolf at openjdk.org Mon Nov 10 14:13:56 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:13:56 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 11:43:34 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 178: > >> 176: bool is_ok = reader()->read_numerical_key_value("/cpu.stat", "usage_usec", value); >> 177: if (!is_ok) { >> 178: log_trace(os, container)("CPU Usage failed: -2"); > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? Thanks. I've removed the `-2`. > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 237: > >> 235: if (!reader()->read_number_handle_max("/memory.swap.max", swap_limit_val)) { >> 236: // Some container tests rely on this trace logging to happen. >> 237: log_trace(os, container)("Swap Limit failed: -2"); > > Do we need to keep the `-2` here? Or could we perhaps change to a better message? I've removed the `-2`. I.e. `Swap Limit failed` is the log message. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510722882 PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510726448 From egahlin at openjdk.org Mon Nov 10 14:16:02 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 10 Nov 2025 14:16:02 GMT Subject: RFR: 8369949: Increase Xmx of TestWaste.java so the EdgeQueue is larger, to fix crash [v2] In-Reply-To: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> References: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> Message-ID: On Mon, 10 Nov 2025 14:05:08 GMT, Kerem Kat wrote: >> SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS. >> >> We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`. >> >> Here `Xmx` is set to `2g`. More details in the issue. > > Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: > > Set max_dfs_depth to 3200 instead Marked as reviewed by egahlin (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28215#pullrequestreview-3443475472 From fitzsim at openjdk.org Mon Nov 10 14:27:50 2025 From: fitzsim at openjdk.org (Thomas Fitzsimmons) Date: Mon, 10 Nov 2025 14:27:50 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <4ECXYwsoJz7nYkDcPFo6R2y7L56KufBRH7ox-7_Proo=.adddc389-8462-4872-8bb8-00bbfa10e2ef@github.com> On Mon, 10 Nov 2025 13:53:11 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 470: >> >>> 468: // cast to int since the read value might be negative >>> 469: // and we want to avoid logging -1 as a large unsigned value. >>> 470: int quota_int = static_cast<int>(quota); >> >> It seems like quota is either a positive number or disabled. I wonder if `result` can be treated as a `uint64_t`, and this log message special-cased to detect `-1` read from `/cpu.cfs_quota_us` as disabled. I guess the calling code would need another way to differentiate "disabled" from other values... maybe with `0`? Just a thought to maybe simplify the type logic here. Likewise for `period` and `shares`. > > Yes, there is opportunity to change the API. This patch was done to do a 1-to-1 translation of the previous version as much as possible. So I've refrained from doing this in this patch as well. It kept the size of the patch a bit more manageable. Happy to file a follow-up RFE to do this in a separate patch. Thoughts? Maybe the API as-is is clearer, because it matches the actual `/proc` values. Having thought about it more, it probably doesn't make sense to change the API just to make the implementation's type handling cleaner, so I'd say don't bother with the follow-up RFE.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510770596 From sgehwolf at openjdk.org Mon Nov 10 14:41:47 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:41:47 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 13:49:03 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 379: > >> 377: * Calculate the maximum number of tasks available to the process. Set the >> 378: * value in the passed in 'value' reference. The value might be -1 when >> 379: * there is no limit. > > How can we get `-1`? Or do you mean `(uint64_t)-1`? This was meant to say `value_unlimited` if there is `max` in the `pids.max` interface file. Updated the comment and changed the code handling for `VM.info` to handle `value_unlimited`. 
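As a small illustration of the `max` handling mentioned above, this sketch maps the literal `max` from an interface file such as `pids.max` to an unsigned sentinel instead of `-1`. It is not the HotSpot reader code. `parse_limit` is a hypothetical helper, and defining `value_unlimited` as the largest `uint64_t` is an assumption of this sketch, since the actual definition is not shown in this thread.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>

// Assumed sentinel for "no limit"; the real definition is not part of this thread.
static const std::uint64_t value_unlimited = std::numeric_limits<std::uint64_t>::max();

// Hypothetical helper: parses the already-trimmed content of a cgroup v2
// limit file. Returns false on malformed input and leaves 'result'
// untouched, following the contract described in this review.
static bool parse_limit(const char* text, std::uint64_t& result) {
  if (text == nullptr) {
    return false;
  }
  if (std::strcmp(text, "max") == 0) {
    result = value_unlimited;
    return true;
  }
  // strtoull would accept whitespace and signs, so require a leading digit.
  if (*text < '0' || *text > '9') {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  unsigned long long parsed = std::strtoull(text, &end, 10);
  if (errno != 0 || *end != '\0') {
    return false;
  }
  result = static_cast<std::uint64_t>(parsed);
  return true;
}
```

A caller that prints the value can then compare against `value_unlimited` and emit `unlimited` rather than a very large number.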
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510823182 From sgehwolf at openjdk.org Mon Nov 10 14:55:14 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:55:14 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 13:09:36 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/os_linux.cpp line 220: > >> 218: if (OSContainer::is_containerized() && OSContainer::available_memory_in_bytes(avail_mem)) { >> 219: log_trace(os)("available container memory: " PHYS_MEM_TYPE_FORMAT, avail_mem); >> 220: value = avail_mem; > > Should be able to pass in `value` directly instead of using `avail_mem`. Sure, thanks! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510872717 From sgehwolf at openjdk.org Mon Nov 10 14:59:35 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 14:59:35 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 13:11:49 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/os_linux.cpp line 261: > >> 259: if (OSContainer::is_containerized() && OSContainer::available_memory_in_bytes(free_mem)) { >> 260: log_trace(os)("free container memory: " PHYS_MEM_TYPE_FORMAT, free_mem); >> 261: value = free_mem; > > Should be able to pass in `value` directly instead of using `free_mem`. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510884240 From sgehwolf at openjdk.org Mon Nov 10 15:21:13 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 15:21:13 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 14:09:48 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/os_linux.cpp line 4863: > >> 4861: if (OSContainer::is_containerized() && OSContainer::active_processor_count(active_cpus)) { >> 4862: log_trace(os)("active_processor_count: determined by OSContainer: %d", >> 4863: active_cpus); > > When running containerized, we would now always fetch the os cpu count at least once. > > `CgroupSubsystem::active_processor_count`, which this calls down to has the cache to actively avoid getting the cpu count too frequently, and only gets the number of cpus with `os::Linux::active_processor_count` when the cache expires. > > I don't know if this is still an issue today, but since it's there I still think we should avoid getting the cpus if unnecessary. Thanks, fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510972683 From sgehwolf at openjdk.org Mon Nov 10 15:24:21 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 15:24:21 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 12:21:10 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/share/runtime/os.cpp line 2215: > >> 2213: } >> 2214: value = mem_usage; >> 2215: return true; > > Can we collapse this and just set the `value` reference directly instead in the container functions? Something like: > > ```c++ > if (OSContainer::is_containerized()) { > return OSContainer::memory_usage_in_bytes(mem_usage); > } > ``` Sure. Thanks!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2510986032 From shade at openjdk.org Mon Nov 10 17:01:22 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Nov 2025 17:01:22 GMT Subject: RFR: 8369949: Fix TestWaste.java stack overflow [v2] In-Reply-To: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> References: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> Message-ID: On Mon, 10 Nov 2025 14:05:08 GMT, Kerem Kat wrote: >> SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS. >> >> We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`. >> >> Here `Xmx` is set to `2g`. More details in the issue. > > Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: > > Set max_dfs_depth to 3200 instead Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28215#pullrequestreview-3444243547 From sgehwolf at openjdk.org Mon Nov 10 17:15:49 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:49 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v3] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. 
> > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 21 commits: - Extract OSContainer::available_swap_in_bytes() - Simplify os::used_memory() - Fix os::active_processor_count() - os::free_memory => use 'value' directly - os::available_memory() => use 'value' directly - Fix pids_max printing in VM.info - Better logging for -1 (cpu_shares) - Fix cg v1 cpu_shares to match old behaviour - More comment fixes. - Drop -1 (unlimited) and -2 (failed) constants Will likely need corresponding test changes - ... and 11 more: https://git.openjdk.org/jdk/compare/d5803aa7...08f1c185 ------------- Changes: https://git.openjdk.org/jdk/pull/27743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=02 Stats: 1307 lines in 16 files changed: 514 ins; 106 del; 687 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From sgehwolf at openjdk.org Mon Nov 10 17:15:50 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:50 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: <4ECXYwsoJz7nYkDcPFo6R2y7L56KufBRH7ox-7_Proo=.adddc389-8462-4872-8bb8-00bbfa10e2ef@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <4ECXYwsoJz7nYkDcPFo6R2y7L56KufBRH7ox-7_Proo=.adddc389-8462-4872-8bb8-00bbfa10e2ef@github.com> Message-ID: On Mon, 10 Nov 2025 14:24:48 GMT, Thomas Fitzsimmons wrote: >> Yes, there is opportunity to change the API. This patch was done to do a 1-to-1 translation of the previous version as much as possible. So I've refrained from doing this in this patch as well. It kept the size of the patch a bit more manageable. Happy to file a follow-up RFE to do this in a separate patch. Thoughts? 
> > Maybe the API as-is is clearer, because it matches the actual `/proc` values. Having thought about it more, it probably doesn't make sense to change the API just to make the implementation's type handling cleaner, so I'd say don't bother with the follow-up RFE. OK. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511346360 From sgehwolf at openjdk.org Mon Nov 10 17:15:52 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:52 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v3] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Wed, 22 Oct 2025 13:44:56 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 21 commits: >> >> - Extract OSContainer::available_swap_in_bytes() >> - Simplify os::used_memory() >> - Fix os::active_processor_count() >> - os::free_memory => use 'value' directly >> - os::available_memory() => use 'value' directly >> - Fix pids_max printing in VM.info >> - Better logging for -1 (cpu_shares) >> - Fix cg v1 cpu_shares to match old behaviour >> - More comment fixes. >> - Drop -1 (unlimited) and -2 (failed) constants >> >> Will likely need corresponding test changes >> - ... and 11 more: https://git.openjdk.org/jdk/compare/d5803aa7...08f1c185 > > src/hotspot/os/linux/os_linux.cpp line 348: > >> 346: return true; >> 347: } >> 348: } > > This whole function is getting a bit too long in my opinion. > Maybe everything inside the `if OSContainer::is_containerized() {}` could be moved into a new function `OSContainer::available_swap_in_bytes`, similar to the already existing `OSContainer::available_memory_in_bytes`. That way, we could abstract away all the `OSContainer` calls. 
> > The only consequence would be that the `log_trace` wouldn't work any more. I couldn't find any test that depends on the exact output, so it could perhaps be split up instead. I've moved this to a `OSContainer::available_swap_in_bytes` function. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511339099 From sgehwolf at openjdk.org Mon Nov 10 17:15:53 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:15:53 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v3] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <0sWmQikqXAA1s_F26YTx5TkMBXJoww2FWxkBxdJKZfg=.915e4d84-b8c0-48c4-961c-0ffb3f1c396a@github.com> On Mon, 10 Nov 2025 17:09:54 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/os_linux.cpp line 348: >> >>> 346: return true; >>> 347: } >>> 348: } >> >> This whole function is getting a bit too long in my opinion. >> Maybe everything inside the `if OSContainer::is_containerized() {}` could be moved into a new function `OSContainer::available_swap_in_bytes`, similar to the already existing `OSContainer::available_memory_in_bytes`. That way, we could abstract away all the `OSContainer` calls. >> >> The only consequence would be that the `log_trace` wouldn't work any more. I couldn't find any test that depends on the exact output, so it could perhaps be split up instead. > > I've moved this to a `OSContainer::available_swap_in_bytes` function. 
Example trace log (if it fails) is: [0.672s][trace][os,container] OSContainer::available_swap_in_bytes: container_swap_limit=unlimited container_mem_limit=1073741824, host_free_swap: 8589844480 [0.672s][trace][os,container] os::free_swap_space: containerized value unavailable returning host value: 8589844480 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511340779 From sgehwolf at openjdk.org Mon Nov 10 17:22:34 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:22:34 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value).
The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: One more comment fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/27743/files - new: https://git.openjdk.org/jdk/pull/27743/files/08f1c185..46df71e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From sgehwolf at openjdk.org Mon Nov 10 17:22:39 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:22:39 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com>
<0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: On Mon, 27 Oct 2025 11:36:39 GMT, Casper Norrbin wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor >> - Fix print_container_info output >> - whitespace clean-ups and other small fixes >> - Fix log format in container macro and scanf format >> - Fix duplicate include in osContainer_linux >> - 8365606: Container code should not be using jlong/julong > > src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp line 115: > >> 113: * true if the result reference has been set >> 114: * false on error >> 115: */ > > The beginning part of the comment isn't updated to mention the `result` reference, unlike the other comments. Should be fixed in https://github.com/openjdk/jdk/pull/27743/commits/46df71e19458b1682d6b8a28ef5b3e9a8932be9e ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2511363227 From sgehwolf at openjdk.org Mon Nov 10 17:28:21 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 10 Nov 2025 17:28:21 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Mon, 10 Nov 2025 17:22:34 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) 
error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > One more comment fix I've resolved the conflicts now and incorporated reviewers' feedback. Thanks for the reviews!
It doesn't solve the larger issue of reference passing for the result value, though :-/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3513007587 From duke at openjdk.org Mon Nov 10 17:30:34 2025 From: duke at openjdk.org (duke) Date: Mon, 10 Nov 2025 17:30:34 GMT Subject: RFR: 8369949: Fix TestWaste.java stack overflow [v2] In-Reply-To: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> References: <96H8k8oQadVm84cNmAQHXmbh-8FuPEKCwuPqz2VWQMI=.d77b9e23-8f89-4ea8-9d68-6ff3eb18dfe2@github.com> Message-ID: On Mon, 10 Nov 2025 14:05:08 GMT, Kerem Kat wrote: >> SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS. >> >> We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`. >> >> Here `Xmx` is set to `2g`. More details in the issue. > > Kerem Kat has updated the pull request incrementally with one additional commit since the last revision: > > Set max_dfs_depth to 3200 instead @krk Your change (at version df40515d5d2665574e7803cc4820586dc50febac) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28215#issuecomment-3513016540 From krk at openjdk.org Mon Nov 10 17:41:34 2025 From: krk at openjdk.org (Kerem Kat) Date: Mon, 10 Nov 2025 17:41:34 GMT Subject: Integrated: 8369949: Fix TestWaste.java stack overflow In-Reply-To: References: Message-ID: On Mon, 10 Nov 2025 12:28:18 GMT, Kerem Kat wrote: > SIGSEGV is caused by a stack overflow in the VM Thread that is traversing the object graph via `BFSClosure::process`. When the `EdgeQueue` is full, `BFSClosure` falls back to DFS. > > We could increase `Xmx` of the test, or work on finding a better heap percentage for `edge_queue_memory_reservation`. > > Here `Xmx` is set to `2g`. More details in the issue. 
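The failure mode described in the quoted text (a breadth-first walk that falls back to recursion when its queue is full) can be sketched as follows. This is a toy model under stated assumptions, not the JFR leak profiler code: `Traversal`, `max_queue`, and the adjacency-list `Graph` are hypothetical, and only the name `max_dfs_depth` is borrowed from the commit title.

```c++
#include <cstddef>
#include <deque>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // adjacency lists

// Hypothetical traversal: BFS with a bounded queue that falls back to
// a depth-limited recursive DFS when the queue is full.
struct Traversal {
  const Graph& graph;
  std::size_t max_queue;      // stands in for the edge queue capacity
  std::size_t max_dfs_depth;  // bounds native stack usage in the fallback
  std::vector<bool> visited;
  std::size_t dfs_cutoffs = 0;

  Traversal(const Graph& g, std::size_t mq, std::size_t md)
      : graph(g), max_queue(mq), max_dfs_depth(md), visited(g.size(), false) {}

  void dfs(int node, std::size_t depth) {
    if (depth >= max_dfs_depth) {
      ++dfs_cutoffs;  // stop descending instead of risking a stack overflow
      return;
    }
    for (int next : graph[node]) {
      if (!visited[next]) {
        visited[next] = true;
        dfs(next, depth + 1);
      }
    }
  }

  void run(int root) {
    std::deque<int> queue;
    visited[root] = true;
    queue.push_back(root);
    while (!queue.empty()) {
      int node = queue.front();
      queue.pop_front();
      for (int next : graph[node]) {
        if (visited[next]) continue;
        visited[next] = true;
        if (queue.size() < max_queue) {
          queue.push_back(next);  // normal breadth-first path
        } else {
          dfs(next, 1);           // queue full: depth-limited fallback
        }
      }
    }
  }
};
```

The trade-off in this sketch is that a depth cutoff leaves part of a deep chain unexplored, in exchange for a hard bound on recursion depth.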
This pull request has now been integrated. Changeset: 1327aa60 Author: Kerem Kat Committer: Cesar Soares Lucas URL: https://git.openjdk.org/jdk/commit/1327aa60907555d7e2d8d131bf4cb20a34660ff2 Stats: 2 lines in 2 files changed: 0 ins; 1 del; 1 mod 8369949: Fix TestWaste.java stack overflow Reviewed-by: egahlin, shade ------------- PR: https://git.openjdk.org/jdk/pull/28215 From duke at openjdk.org Mon Nov 10 17:56:54 2025 From: duke at openjdk.org (duke) Date: Mon, 10 Nov 2025 17:56:54 GMT Subject: Withdrawn: 8366232: JFR startup messages are shown with -Xlog:jfr+startup=warning In-Reply-To: References: Message-ID: On Wed, 27 Aug 2025 12:07:54 GMT, Johannes Bechberger wrote: > Only print the JFR startup messages when no or a < warning log level is set explicitly by the user. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/26957 From krk at openjdk.org Mon Nov 10 19:00:21 2025 From: krk at openjdk.org (Kerem Kat) Date: Mon, 10 Nov 2025 19:00:21 GMT Subject: RFR: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled Message-ID: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> ### Before File size: 0.84 MB Scrubbed size: 0.82 MB Waste: 2.73% ### After File size: 0.80 MB Scrubbed size: 0.78 MB Waste: 3.40% File size decreases when `ThreadStart` and `ThreadEnd` events are disabled. 
------------- Commit messages: - 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled Changes: https://git.openjdk.org/jdk/pull/28222/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28222&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8369692 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28222.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28222/head:pull/28222 PR: https://git.openjdk.org/jdk/pull/28222 From mgronlun at openjdk.org Mon Nov 10 19:42:58 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 10 Nov 2025 19:42:58 GMT Subject: RFR: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled In-Reply-To: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> References: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> Message-ID: On Mon, 10 Nov 2025 18:52:31 GMT, Kerem Kat wrote: > ### Before > File size: 0.84 MB > Scrubbed size: 0.82 MB > Waste: 2.73% > > ### After > File size: 0.80 MB > Scrubbed size: 0.78 MB > Waste: 3.40% > > File size decreases when `ThreadStart` and `ThreadEnd` events are disabled. This is not correct, because now thread information will be missing for other events. Solving this properly would involve introducing a lazy write scheme for thread metadata on first use, but this can have other impacts, such as the need to check for every event.
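A lazy "write on first use" scheme of the kind mentioned above might look roughly like this. It is a sketch under stated assumptions, not JFR code: `Recording`, `ThreadState`, and `write_event` are hypothetical names, and the string log merely stands in for the recording stream.

```c++
#include <string>
#include <vector>

// Hypothetical recording stream, modelled as an ordered log.
struct Recording {
  std::vector<std::string> log;
};

// Hypothetical per-thread state tracking whether metadata was written.
struct ThreadState {
  int id;
  bool metadata_written = false;
  explicit ThreadState(int thread_id) : id(thread_id) {}
};

// Every event that references a thread pays for one extra branch:
// this is the per-event check the message above refers to.
void write_event(Recording& rec, ThreadState& t,
                 const std::string& name, bool has_thread) {
  if (has_thread && !t.metadata_written) {
    // The thread checkpoint is emitted before the first event that
    // needs it, so the event never refers to unknown thread metadata.
    rec.log.push_back("checkpoint:thread-" + std::to_string(t.id));
    t.metadata_written = true;
  }
  rec.log.push_back("event:" + name);
}
```

In this sketch the checkpoint always precedes the first event that references the thread, at the cost of a branch on every write.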
------------- PR Comment: https://git.openjdk.org/jdk/pull/28222#issuecomment-3513586854 From cnorrbin at openjdk.org Tue Nov 11 13:24:03 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Tue, 11 Nov 2025 13:24:03 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Mon, 10 Nov 2025 17:22:34 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g.
`os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision: > > One more comment fix Looks good to me! ------------- Marked as reviewed by cnorrbin (Committer). PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3448050849 From sgehwolf at openjdk.org Tue Nov 11 14:32:57 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 11 Nov 2025 14:32:57 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v5] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations.
For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - Add space in trace log - Merge branch 'master' into jdk-8365606-jlong-julong-refactor - One more comment fix - Extract OSContainer::available_swap_in_bytes() - Simplify os::used_memory() - Fix os::active_processor_count() - os::free_memory => use 'value' directly - os::available_memory() => use 'value' directly - Fix pids_max printing in VM.info - Better logging for -1 (cpu_shares) - ...
and 14 more: https://git.openjdk.org/jdk/compare/29100320...0958b10f ------------- Changes: https://git.openjdk.org/jdk/pull/27743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=04 Stats: 1308 lines in 16 files changed: 514 ins; 106 del; 688 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From sgehwolf at openjdk.org Tue Nov 11 14:43:10 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 11 Nov 2025 14:43:10 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v4] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Tue, 11 Nov 2025 13:21:38 GMT, Casper Norrbin wrote: > Looks good to me! Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3517235877 From krk at openjdk.org Tue Nov 11 16:53:03 2025 From: krk at openjdk.org (Kerem Kat) Date: Tue, 11 Nov 2025 16:53:03 GMT Subject: RFR: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled In-Reply-To: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> References: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> Message-ID: On Mon, 10 Nov 2025 18:52:31 GMT, Kerem Kat wrote: > ### Before > File size: 0.84 MB > Scrubbed size: 0.82 MB > Waste: 2.73% > > ### After > File size: 0.80 MB > Scrubbed size: 0.78 MB > Waste: 3.40% > > File size decreases when `ThreadStart` and `ThreadEnd` events are disabled. Instead of checking for every event, we could set a flag unconditionally and check on thread start/end. 
What if we add a `volatile bool _has_emitted_events` to `JfrThreadLocal` and: * Set the flag when any event with `T::hasThread` is written (unconditional store) * At thread start/exit: write the checkpoint if `EventThreadStart/End::is_enabled() || has_emitted_events()` * Clear the flag after writing checkpoint What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/28222#issuecomment-3517837507 From mgronlun at openjdk.org Tue Nov 11 17:47:08 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 11 Nov 2025 17:47:08 GMT Subject: RFR: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled In-Reply-To: References: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> Message-ID: On Tue, 11 Nov 2025 16:50:01 GMT, Kerem Kat wrote: > Instead of checking for every event, we could set a flag unconditionally and check on thread start/end. > > What if we add a `volatile bool _has_emitted_events` to `JfrThreadLocal` and: > > * Set the flag when any event with `T::hasThread` is written (unconditional store) > > * At thread start/exit: write the checkpoint if `EventThreadStart/End::is_enabled() || has_emitted_events()` > > * Clear the flag after writing checkpoint > > > What do you think? The checkpoint must be in place before a `T::hasThread` event is written.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28222#issuecomment-3518045541 From mgronlun at openjdk.org Wed Nov 12 10:10:19 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 12 Nov 2025 10:10:19 GMT Subject: RFR: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled In-Reply-To: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> References: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> Message-ID: On Mon, 10 Nov 2025 18:52:31 GMT, Kerem Kat wrote: > ### Before > File size: 0.84 MB > Scrubbed size: 0.82 MB > Waste: 2.73% > > ### After > File size: 0.80 MB > Scrubbed size: 0.78 MB > Waste: 3.40% > > File size decreases when `ThreadStart` and `ThreadEnd` events are disabled. [JDK-8364258](https://bugs.openjdk.org/browse/JDK-8364258) introduced a normalization scheme also for threads, to ensure thread metadata is only written once per chunk. This significantly reduced the number of checkpoints for a thread. I am therefore skeptical about pursuing this PR, because the overall win probably does not warrant the added complexity.
------------- PR Comment: https://git.openjdk.org/jdk/pull/28222#issuecomment-3521118780 From krk at openjdk.org Wed Nov 12 12:46:40 2025 From: krk at openjdk.org (Kerem Kat) Date: Wed, 12 Nov 2025 12:46:40 GMT Subject: Withdrawn: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled In-Reply-To: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> References: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> Message-ID: <_Uj57BwUp7Etk_UoBZO8CNeE8a1t0mFkUmyL9iOryBk=.9249da3c-d6ab-4017-83f4-70de3063f78c@github.com> On Mon, 10 Nov 2025 18:52:31 GMT, Kerem Kat wrote: > ### Before > File size: 0.84 MB > Scrubbed size: 0.82 MB > Waste: 2.73% > > ### After > File size: 0.80 MB > Scrubbed size: 0.78 MB > Waste: 3.40% > > File size decreases when `ThreadStart` and `ThreadEnd` events are disabled. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/28222 From krk at openjdk.org Wed Nov 12 12:46:39 2025 From: krk at openjdk.org (Kerem Kat) Date: Wed, 12 Nov 2025 12:46:39 GMT Subject: RFR: 8369692: JFR: Don't record thread metadata in case jdk.ThreadStart is disabled In-Reply-To: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> References: <5AGp9IPro-OVu6fh-9xvyjaIexSAvRAbkyPjWNMffWY=.1d232d1e-845d-40bf-ad00-5264966ec2f0@github.com> Message-ID: On Mon, 10 Nov 2025 18:52:31 GMT, Kerem Kat wrote: > ### Before > File size: 0.84 MB > Scrubbed size: 0.82 MB > Waste: 2.73% > > ### After > File size: 0.80 MB > Scrubbed size: 0.78 MB > Waste: 3.40% > > File size decreases when `ThreadStart` and `ThreadEnd` events are disabled. 
Thanks, closing in favor of JDK-8364258 ------------- PR Comment: https://git.openjdk.org/jdk/pull/28222#issuecomment-3521745857 From stuefe at openjdk.org Fri Nov 14 11:42:18 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 11:42:18 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v5] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: <3PgMYWEYdQEVJr2qVQ8vkiaIsBrG-qtcF63NPMS69Gk=.458b045e-91f0-4a31-9b9d-20c608ce28fe@github.com> On Tue, 11 Nov 2025 14:32:57 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. 
>> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: > > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - Better logging for -1 (cpu_shares) > - ... and 14 more: https://git.openjdk.org/jdk/compare/29100320...0958b10f New version looks good. ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3464545040 From stuefe at openjdk.org Fri Nov 14 11:42:20 2025 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Nov 2025 11:42:20 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> Message-ID: <5zLSizgJLZ-WPMqfgD2ox8fB76jYB6XJk1kUxh5BdXE=.e3ae7506-cbfb-49c1-9c76-622f7078218e@github.com> On Mon, 10 Nov 2025 13:27:59 GMT, Severin Gehwolf wrote: >> src/hotspot/os/linux/cgroupSubsystem_linux.hpp line 80: >> >>> 78: return false; \ >>> 79: } \ >>> 80: log_trace(os, container)(log_string " is: " UINT64_FORMAT, retval); \ >> >> Here and in other places: don't use raw UINT64_FORMAT; use `PHYS_MEM_TYPE_FORMAT` instead. > > This is intentional since the processor_count API doesn't use `physical_memory_size_type` (as it doesn't make sense in this context). See, for example, `CgroupV2CpuController::cpu_period()`. The common denominator is `uint64_t`. This is a bit awkward, but I don't know a better way to deal with this. The reading functions are shared, most of the API is used for memory value reading (but not exclusively, exceptions are `pid`, `cpu`). Okay ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2527193808 From sgehwolf at openjdk.org Fri Nov 14 13:18:53 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 14 Nov 2025 13:18:53 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v5] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Tue, 11 Nov 2025 14:32:57 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. 
The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? 
> > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: > > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - Better logging for -1 (cpu_shares) > - ... and 14 more: https://git.openjdk.org/jdk/compare/29100320...0958b10f Thanks for the reviews! If there are no objections to move forward with this I'll integrate Monday. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3532695447 From sgehwolf at openjdk.org Fri Nov 14 15:11:02 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Fri, 14 Nov 2025 15:11:02 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v6] In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. 
For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. > > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: - Merge branch 'master' into jdk-8365606-jlong-julong-refactor - Add space in trace log - Merge branch 'master' into jdk-8365606-jlong-julong-refactor - One more comment fix - Extract OSContainer::available_swap_in_bytes() - Simplify os::used_memory() - Fix os::active_processor_count() - os::free_memory => use 'value' directly - os::available_memory() => use 'value' directly - Fix pids_max printing in VM.info - ... 
and 15 more: https://git.openjdk.org/jdk/compare/5d65c23c...9a5f3eb5 ------------- Changes: https://git.openjdk.org/jdk/pull/27743/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27743&range=05 Stats: 1308 lines in 16 files changed: 514 ins; 106 del; 688 mod Patch: https://git.openjdk.org/jdk/pull/27743.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/27743/head:pull/27743 PR: https://git.openjdk.org/jdk/pull/27743 From cnorrbin at openjdk.org Fri Nov 14 15:17:39 2025 From: cnorrbin at openjdk.org (Casper Norrbin) Date: Fri, 14 Nov 2025 15:17:39 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v6] In-Reply-To: <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> Message-ID: On Fri, 14 Nov 2025 15:11:02 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. 
>> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: > > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - ... and 15 more: https://git.openjdk.org/jdk/compare/5d65c23c...9a5f3eb5 Marked as reviewed by cnorrbin (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3465404828 From fandreuzzi at openjdk.org Mon Nov 17 13:11:32 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 13:11:32 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). Francesco Andreuzzi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: - remove elapsed. remove idle - Merge branch 'master' into JDK-8037914 - rename. start/end time - no start - enable - bytes to size - disable - revert - one event - trailing - ... and 5 more: https://git.openjdk.org/jdk/compare/84b50801...fc47a64e ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/090c02bc..fc47a64e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=05-06 Stats: 242359 lines in 1920 files changed: 155343 ins; 52311 del; 34705 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From fandreuzzi at openjdk.org Mon Nov 17 13:11:32 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 13:11:32 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v6] In-Reply-To: References: Message-ID: On Thu, 6 Nov 2025 01:59:41 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication 
cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > rename. start/end time Makes sense, thanks. I applied all your feedback in fc47a64e39712024c55542439bd775497a6d70ed. > An argument can be made that the phases should be separate events, similar to CompilerPhase and GCPausePhase, where you have a name for each phase (String Processing, Table Resize and Table Cleanup), but it may be over-engineering if we don't believe these phases will change in the future? Yeah I think there's little chance for changes there, I'd keep it as it is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3541742817 From egahlin at openjdk.org Mon Nov 17 14:30:10 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 17 Nov 2025 14:30:10 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 13:11:32 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: > > - remove elapsed. remove idle > - Merge branch 'master' into JDK-8037914 > - rename. start/end time > - no start > - enable > - bytes to size > - disable > - revert > - one event > - trailing > - ... and 5 more: https://git.openjdk.org/jdk/compare/5fa84676...fc47a64e From a JFR perspective, this looks good. Ideally, the values of the event should be sanity-checked, but I understand this might be tricky to do in a reliable manner. Hunting down false positives would just be a waste of time.
The copyright year of the test should be 2025. It would be good if someone on the GC team could take a look at the GC-related code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3542131222 From fandreuzzi at openjdk.org Mon Nov 17 14:45:29 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 14:45:29 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: fix year ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28015/files - new: https://git.openjdk.org/jdk/pull/28015/files/fc47a64e..40829ead Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28015&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28015.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28015/head:pull/28015 PR: https://git.openjdk.org/jdk/pull/28015 From fandreuzzi at openjdk.org Mon Nov 17 14:45:30 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Mon, 17 Nov 2025 14:45:30 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:27:24 GMT, Erik Gahlin wrote: > It would be good if someone on the GC team could take a look at the GC-related code. @albertnetymk could you have a look? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3542207837 From sgehwolf at openjdk.org Mon Nov 17 17:34:04 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Mon, 17 Nov 2025 17:34:04 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> Message-ID: On Mon, 27 Oct 2025 09:17:02 GMT, Andrew Haley wrote: >>> > I agree. Its not pretty, but consistent with what we did elsewhere. Nobody wants to do that discussion again. >>> >>> Sorry, I was unaware of any previous discussion. I was suggesting a less impactful way to make the change, taking advantage of the recent adoption of C++17, which allows for cleaner code. But I won't stand in the way of consensus. >> >> FWIW, I'd be interested in seeing a small example of what that would look like with C++17. There were a lot of discussion about the style, but it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > >> it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > > This. A function that returns its value as a side effect on a reference parameter is (at best) a code smell. @theRealAph @stefank OK to integrate this? 
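The quoted discussion asks what a C++17 style might look like, but the thread as archived here does not show the suggested alternative. Purely as an assumption, one possibility is returning `std::optional` instead of a `bool` plus an out-reference. The sketch below uses hypothetical names (`Reading`, `memory_usage_in_bytes`, `usage_or`) and a simulated read; it is not the HotSpot API and not the style the thread settled on.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical result of reading a cgroup interface file (simulated here).
struct Reading {
  bool ok;          // false if the file could not be read or parsed
  uint64_t number;  // parsed value when ok is true
};

// Assumed C++17 variant: an empty optional signals the error case, so the
// value cannot be confused with an error code or left uninitialized.
inline std::optional<uint64_t> memory_usage_in_bytes(const Reading& r) {
  if (!r.ok) {
    return std::nullopt;
  }
  return r.number;
}

// Caller side: fall back to a host-level value when the container value is
// unavailable.
inline uint64_t usage_or(const Reading& r, uint64_t fallback) {
  return memory_usage_in_bytes(r).value_or(fallback);
}
```

Both this form and the `bool`-returning form keep the error case out of the value's own range, which is the type-conflation concern raised in the quoted replies.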
------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3543074389 From ayang at openjdk.org Mon Nov 17 18:55:08 2025 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 17 Nov 2025 18:55:08 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:45:29 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > fix year Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/28015#pullrequestreview-3474115864 From egahlin at openjdk.org Mon Nov 17 18:55:09 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 17 Nov 2025 18:55:09 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: <_StovLFzWCTY3tLarfVFee0vcsPLgcYjrzL0Xq-7n2A=.304d6035-496f-43d8-9a8e-06f159c78e8c@github.com> On Mon, 17 Nov 2025 14:45:29 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). > > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > fix year Marked as reviewed by egahlin (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28015#pullrequestreview-3474120613 From stefank at openjdk.org Mon Nov 17 20:03:15 2025 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 17 Nov 2025 20:03:15 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v2] In-Reply-To: <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <0IQ106BTnoNfWulWJ30t9uWy5OH2EF4Y0kC_jZlgU6g=.84583e9b-1d06-440c-8c34-670ebfc7940f@github.com> <9uVKpiWCXvxcxhyg6V1seeSxyLm14lHEdOL_I07uVQs=.de1b0ed8-d006-49d5-a982-27556759415e@github.com> Message-ID: On Mon, 27 Oct 2025 08:33:22 GMT, Stefan Karlsson wrote: >>> >>> I agree. Its not pretty, but consistent with what we did elsewhere. Nobody wants to do that discussion again. >> >> Sorry, I was unaware of any previous discussion. I was suggesting a less impactful way to make the change, taking advantage of the recent adoption of C++17, which allows for cleaner code. But I won't stand in the way of consensus. > >> > I agree. Its not pretty, but consistent with what we did elsewhere. Nobody wants to do that discussion again. >> >> Sorry, I was unaware of any previous discussion. I was suggesting a less impactful way to make the change, taking advantage of the recent adoption of C++17, which allows for cleaner code. But I won't stand in the way of consensus. > > FWIW, I'd be interested in seeing a small example of what that would look like with C++17. There were a lot of discussion about the style, but it wasn't because we wanted to figure out the color of the bike shed but rather how to write safer code that makes it less likely to accidentally introduce bugs because of type conflation. > @stefank OK to integrate this? Yes. I took a quick glance at the changes and it looks like the previous style that was made for the other os:: memory APIs. 
I'm deferring the responsibility to do a full Review to Casper and Thomas. ------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3543621156 From fandreuzzi at openjdk.org Tue Nov 18 09:23:38 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 18 Nov 2025 09:23:38 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v7] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:27:24 GMT, Erik Gahlin wrote: >> Francesco Andreuzzi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: >> >> - remove elapsed. remove idle >> - Merge branch 'master' into JDK-8037914 >> - rename. start/end time >> - no start >> - enable >> - bytes to size >> - disable >> - revert >> - one event >> - trailing >> - ... and 5 more: https://git.openjdk.org/jdk/compare/c7ce9f21...fc47a64e > > From a JFR perspective, this looks good. Ideally, the values of the event should be sanity-checked, but I understand this might be tricky to do in a reliable manner. Hunting down false positives would just be a waste of time. The copyright year of the test should be 2025. > > It would be good if someone on the GC team could take a look at the GC-related code. Thanks for the review @egahlin and @albertnetymk. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3546398933 From duke at openjdk.org Tue Nov 18 09:23:39 2025 From: duke at openjdk.org (duke) Date: Tue, 18 Nov 2025 09:23:39 GMT Subject: RFR: 8037914: Add JFR event for string deduplication [v8] In-Reply-To: References: Message-ID: On Mon, 17 Nov 2025 14:45:29 GMT, Francesco Andreuzzi wrote: >> In this PR I introduce a new JFR event: `jdk.StringDeduplication` >> >> The new event is emitted every time a deduplication cycle happens. >> >> Passes tier1 and tier2 (fastdebug). 
> > Francesco Andreuzzi has updated the pull request incrementally with one additional commit since the last revision: > > fix year @fandreuz Your change (at version 40829ead2e80bab673d4852914eabfdee72dc7ce) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28015#issuecomment-3546402253 From sgehwolf at openjdk.org Tue Nov 18 09:42:26 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 18 Nov 2025 09:42:26 GMT Subject: RFR: 8365606: Container code should not be using jlong/julong [v6] In-Reply-To: <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> <7K-YvWpUSK96IGcBjhrcsYRqLz-xsdq_FrzSvOi4d68=.15a535d5-dca0-4aef-9714-91747a6b4fad@github.com> Message-ID: On Fri, 14 Nov 2025 15:11:02 GMT, Severin Gehwolf wrote: >> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. >> >> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. >> >> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). 
The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. >> >> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code. >> >> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned up as this patch was already getting large. >> >> Testing (looking good): >> - [x] GHA >> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. >> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. >> >> Thoughts? Opinions? > > Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: > > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - Add space in trace log > - Merge branch 'master' into jdk-8365606-jlong-julong-refactor > - One more comment fix > - Extract OSContainer::available_swap_in_bytes() > - Simplify os::used_memory() > - Fix os::active_processor_count() > - os::free_memory => use 'value' directly > - os::available_memory() => use 'value' directly > - Fix pids_max printing in VM.info > - ... and 15 more: https://git.openjdk.org/jdk/compare/5d65c23c...9a5f3eb5 OK. Here it goes.
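A minimal sketch of the calling convention described in the quoted PR text above (a `bool` result plus a value passed back by reference, with a separate sentinel for "no limit"). This is not code from the patch: `memory_size_t`, `kUnlimited`, `read_memory_limit` and `describe_limit` are hypothetical names chosen for illustration, while the PR itself speaks of `physical_memory_size_type` and `value_unlimited`. The `max` token is what cgroup v2 interface files such as `memory.max` contain when no limit is set.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical stand-ins for illustration; these are not the HotSpot declarations.
using memory_size_t = std::uint64_t;
constexpr memory_size_t kUnlimited = std::numeric_limits<memory_size_t>::max();

// Interprets the text of a cgroup-style interface file: "max" or a decimal number.
// Returns false on error and leaves 'value' untouched, so a failure is never
// encoded as a magic number in the result.
bool read_memory_limit(const std::string& raw, memory_size_t& value) {
  if (raw == "max") {
    value = kUnlimited;  // "no limit" is a distinct sentinel, not an error
    return true;
  }
  if (raw.empty()) {
    return false;
  }
  memory_size_t parsed = 0;
  for (char c : raw) {
    if (c < '0' || c > '9') {
      return false;  // not an unsigned decimal number
    }
    const memory_size_t digit = static_cast<memory_size_t>(c - '0');
    if (parsed > (kUnlimited - digit) / 10) {
      return false;  // would overflow uint64_t
    }
    parsed = parsed * 10 + digit;
  }
  value = parsed;
  return true;
}

// Caller side: the bool answers "did it work?", the reference carries the value.
std::string describe_limit(const std::string& raw) {
  memory_size_t limit = 0;
  if (!read_memory_limit(raw, limit)) {
    return "error";
  }
  assert(limit == kUnlimited || raw != "max");
  if (limit == kUnlimited) {
    return "unlimited";
  }
  return std::to_string(limit) + " bytes";
}
```

The point of the split is that error, "unlimited" and a real number can no longer be confused with one another, which is what a single signed return value made possible.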
------------- PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3546491920 From sgehwolf at openjdk.org Tue Nov 18 09:46:45 2025 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 18 Nov 2025 09:46:45 GMT Subject: Integrated: 8365606: Container code should not be using jlong/julong In-Reply-To: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> References: <-8aFRr9Hv0gxOufHCTreBgrkFSatpHjQytEVDQ-v8mY=.7ab7d7b7-09a0-4ae4-b084-e8bf285491bb@github.com> Message-ID: On Fri, 10 Oct 2025 13:09:48 GMT, Severin Gehwolf wrote: > Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values. > > It gets rid of the threefold use of the return value: 1) error, 2) unlimited values, 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned. > > All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed-in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for. > > All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code.
> > While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large. > > Testing (looking good): > - [x] GHA > - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below. > - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command. > > Thoughts? Opinions? This pull request has now been integrated. Changeset: 72ebca8a Author: Severin Gehwolf URL: https://git.openjdk.org/jdk/commit/72ebca8a0b19fac8a9483e5a3a98b454176fc342 Stats: 1308 lines in 16 files changed: 514 ins; 106 del; 688 mod 8365606: Container code should not be using jlong/julong Reviewed-by: stuefe, cnorrbin, fitzsim ------------- PR: https://git.openjdk.org/jdk/pull/27743 From fandreuzzi at openjdk.org Tue Nov 18 09:46:47 2025 From: fandreuzzi at openjdk.org (Francesco Andreuzzi) Date: Tue, 18 Nov 2025 09:46:47 GMT Subject: Integrated: 8037914: Add JFR event for string deduplication In-Reply-To: References: Message-ID: On Tue, 28 Oct 2025 10:09:58 GMT, Francesco Andreuzzi wrote: > In this PR I introduce a new JFR event: `jdk.StringDeduplication` > > The new event is emitted every time a deduplication cycle happens. > > Passes tier1 and tier2 (fastdebug). This pull request has now been integrated. 
Changeset: 3a2845f3 Author: Francesco Andreuzzi Committer: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/3a2845f334a59670d54699919073f0e908c038c4 Stats: 261 lines in 9 files changed: 244 ins; 10 del; 7 mod 8037914: Add JFR event for string deduplication Reviewed-by: ayang, egahlin ------------- PR: https://git.openjdk.org/jdk/pull/28015 From mgronlun at openjdk.org Tue Nov 18 13:43:15 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 18 Nov 2025 13:43:15 GMT Subject: RFR: 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 Message-ID: Greetings, Please see a description in the JIRA ticket about this problem related to PreserveFramePointer on arm64. Summary: The third argument passes the sender_sp as the frame FP - which is valid for most situations where unextended_sp() + cb->frame_size() is used (a compiled frame's real_fp() is usually equivalent to the sender SP). But this is incorrect when PreserveFramePointer is set. To fix this, a real frame pointer must be passed to the constructor. Testing: jdk_jfr, stress testing Thanks Markus ------------- Commit messages: - PreserveFramePointer test - 8371368 Changes: https://git.openjdk.org/jdk/pull/28373/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28373&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8371368 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28373.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28373/head:pull/28373 PR: https://git.openjdk.org/jdk/pull/28373 From mgronlun at openjdk.org Tue Nov 18 14:46:25 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 18 Nov 2025 14:46:25 GMT Subject: RFR: 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 [v2] In-Reply-To: References: Message-ID: > Greetings, > > Please see a description in the JIRA ticket about this problem related to PreserveFramePointer on arm64. 
> > Summary: > The third argument passes the sender_sp as the frame FP - which is valid for most situations where unextended_sp() + cb->frame_size() is used (a compiled frame's real_fp() is usually equivalent to the sender SP). But this is incorrect when PreserveFramePointer is set. To fix this, a real frame pointer must be passed to the constructor. > > Testing: jdk_jfr, stress testing > > Thanks > Markus Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: ppc and s390 do not have frame::sender_sp_offset defined ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28373/files - new: https://git.openjdk.org/jdk/pull/28373/files/697644a1..a2434eb8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28373&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28373&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28373.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28373/head:pull/28373 PR: https://git.openjdk.org/jdk/pull/28373 From shade at openjdk.org Wed Nov 19 11:36:12 2025 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 19 Nov 2025 11:36:12 GMT Subject: RFR: 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 [v2] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 14:46:25 GMT, Markus Gr?nlund wrote: >> Greetings, >> >> Please see a description in the JIRA ticket about this problem related to PreserveFramePointer on arm64. >> >> Summary: >> The third argument passes the sender_sp as the frame FP - which is valid for most situations where unextended_sp() + cb->frame_size() is used (a compiled frame's real_fp() is usually equivalent to the sender SP). But this is incorrect when PreserveFramePointer is set. To fix this, a real frame pointer must be passed to the constructor. 
>> >> Testing: jdk_jfr, stress testing >> >> Thanks >> Markus > > Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: > > ppc and s390 do not have frame::sender_sp_offset defined src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 220: > 218: if (is_valid(pc_desc)) { > 219: intptr_t* const synthetic_sp = sender_sp - sampled_nm->frame_size(); > 220: top_frame = frame(synthetic_sp, synthetic_sp, sender_sp - 2, pc_desc->real_pc(sampled_nm), sampled_nm); Hold on. I am looking at relevant constructor: inline frame::frame(intptr_t* sp) : frame(sp, sp, *(intptr_t**)(sp - frame::sender_sp_offset), pauth_strip_verifiable(*(address*)(sp - 1))) {} ...and: inline intptr_t* frame::fp(const intptr_t* sp) { assert(sp != nullptr, "invariant"); return reinterpret_cast(sp[-2]); } So `sender_sp - 2` (which I think is `sp - frame::sender_sp_offset`?) is the _location_ for the FP, not the FP itself? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28373#discussion_r2541567391 From mgronlun at openjdk.org Wed Nov 19 14:02:39 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 19 Nov 2025 14:02:39 GMT Subject: RFR: 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 [v2] In-Reply-To: References: Message-ID: On Wed, 19 Nov 2025 11:16:51 GMT, Aleksey Shipilev wrote: >> Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: >> >> ppc and s390 do not have frame::sender_sp_offset defined > > src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 220: > >> 218: if (is_valid(pc_desc)) { >> 219: intptr_t* const synthetic_sp = sender_sp - sampled_nm->frame_size(); >> 220: top_frame = frame(synthetic_sp, synthetic_sp, sender_sp - 2, pc_desc->real_pc(sampled_nm), sampled_nm); > > Hold on. 
I am looking at the relevant constructor: > > > inline frame::frame(intptr_t* sp) > : frame(sp, sp, > *(intptr_t**)(sp - frame::sender_sp_offset), > pauth_strip_verifiable(*(address*)(sp - 1))) {} > > > ...and: > > > inline intptr_t* frame::fp(const intptr_t* sp) { > assert(sp != nullptr, "invariant"); > return reinterpret_cast<intptr_t*>(sp[-2]); > } > > > So `sender_sp - 2` (which I think is `sp - frame::sender_sp_offset`?) is the _location_ for the FP, not the FP itself? `sender_sp - 2` is the calculated synthetic fp, just like synthetic sp is the calculated sp (from sender sp - cb->frame_size()) for the frame that we are reconstructing. We are stackwalking backwards, if that helps with the conceptual model. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28373#discussion_r2542128620 From duke at openjdk.org Fri Nov 21 22:35:47 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Fri, 21 Nov 2025 22:35:47 GMT Subject: RFR: 8370715: JFR: Races are possible when dumping recordings Message-ID: #### Summary This PR changes the JFR snapshot dumping code so that multiple JFR recordings (potentially from different processes) racing to write to the same dump destination won't mix their data. #### Problem The dump destination file is created and/or wiped when a recording is started, but not wiped again before actually copying over the chunks during a dump. So in the window of time between creating/wiping and dumping chunks, another recording could write to the same dump destination and have its chunks added to the snapshot. This can happen with either a single JVM or multiple JVMs that are racing. #### Proposed fix This PR ensures that any data previously written to the dump destination is wiped before the new recording's data is written. File locking is also done while chunks are being written.
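The "wipe again under a lock, then write" idea from the proposed fix can be sketched as follows. The PR changes JFR's Java code; what follows is only a POSIX analogue in C++ under stated assumptions, not the patch itself. `overwrite_locked` and `read_all` are hypothetical helpers, and `flock` gives an advisory lock that only helps if every writer takes it.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: replaces the content of 'path' with 'data' while holding
// an exclusive advisory lock, so bytes left behind by an earlier writer are
// discarded before the new data goes in.
bool overwrite_locked(const std::string& path, const std::string& data) {
  // Deliberately no O_TRUNC: truncating at open time would happen before the lock.
  int fd = open(path.c_str(), O_WRONLY | O_CREAT, 0644);
  if (fd < 0) {
    return false;
  }
  bool ok = flock(fd, LOCK_EX) == 0   // block until no other writer holds the lock
         && ftruncate(fd, 0) == 0;    // wipe stale content only once the lock is held
  std::size_t written = 0;
  while (ok && written < data.size()) {
    const ssize_t n = write(fd, data.data() + written, data.size() - written);
    if (n <= 0) {
      ok = false;
    } else {
      written += static_cast<std::size_t>(n);
    }
  }
  close(fd);  // closing the descriptor also releases the flock
  return ok;
}

// Hypothetical helper used to inspect the result.
std::string read_all(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  std::ostringstream out;
  out << in.rdbuf();
  return out.str();
}
```

Truncating at the moment of writing, rather than only when the destination is first created, is what closes the window in which another writer's data could be mixed into the file.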
Testing: - new test `jdk/jdk/jfr/api/recording/dump/TestDumpOverwrite.java` - Tier 1 ------------- Commit messages: - small changes to transferChunks and add test Changes: https://git.openjdk.org/jdk/pull/28460/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28460&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8370715 Stats: 87 lines in 2 files changed: 86 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28460/head:pull/28460 PR: https://git.openjdk.org/jdk/pull/28460 From duke at openjdk.org Fri Nov 21 22:41:06 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Fri, 21 Nov 2025 22:41:06 GMT Subject: RFR: 8370715: JFR: Races are possible when dumping recordings [v2] In-Reply-To: References: Message-ID: > #### Summary > This PR changes the JFR snapshot dumping code so that multiple JFR recordings (potentially from different processes) racing to write to the same dump destination won't mix their data. > > #### Problem > The dump destination file is created and/or wiped when a recording is started, but not wiped again before actually copying over the chunks during a dump. So in the window of time between creating/wiping and dumping chunks, another recording could write to the same dump destination and have its chunks added to the snapshot. This can happen with either a single JVM or multiple JVMs that are racing. > > #### Proposed fix > This PR ensures that any data previously written to the dump destination is wiped before the new recording's data is written. File locking is also done while chunks are being written.
> > Testing: > - new test `jdk/jdk/jfr/api/recording/dump/TestDumpOverwrite.java` > - Tier 1 Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: small cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28460/files - new: https://git.openjdk.org/jdk/pull/28460/files/48ec9a7f..b2ef365d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28460&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28460&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28460/head:pull/28460 PR: https://git.openjdk.org/jdk/pull/28460 From duke at openjdk.org Sat Nov 22 02:35:16 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Sat, 22 Nov 2025 02:35:16 GMT Subject: RFR: 8370715: JFR: Races are possible when dumping recordings [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 22:41:06 GMT, Robert Toyonaga wrote: >> #### Summary >> This PR changes the JFR snapshot dumping code so that multiple JFR recordings (potentially from different processes) racing to write the same dump destination won't mix their data. >> >> #### Problem >> The dump destination file is created and/or wiped when a recording is started, but not wiped again before actually copying over the chunks during a dump. So in the window of time between creating/wiping and dumping chunks, another recording could write to the same dump destination and have it's chunks added to the snapshot. This can happen with either a single JVM or multiple JVMs that are racing. >> >> #### Proposed fix >> This PR ensures that any data previously written to the dump destination is wiped before the new recording's data is written. File locking is also done while chunks are being written. 
>> >> Testing: >> - new test `jdk/jdk/jfr/api/recording/dump/TestDumpOverwrite.java` >> - Tier 1 > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > small cleanup The gate test `com/sun/crypto/provider/Cipher/HPKE/KAT9180.java` is failing due to: `TEST RESULT: Failed. Execution failed: main threw exception: java.io.IOException: Cannot find the artifact rfc9180-test-vectors` This failure doesn't seem to be related to the changes in this PR. It looks like this test is failing on other recent PRs as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28460#issuecomment-3565362784 From egahlin at openjdk.org Mon Nov 24 21:29:58 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 24 Nov 2025 21:29:58 GMT Subject: RFR: 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 [v2] In-Reply-To: References: Message-ID: On Tue, 18 Nov 2025 14:46:25 GMT, Markus Gr?nlund wrote: >> Greetings, >> >> Please see a description in the JIRA ticket about this problem related to PreserveFramePointer on arm64. >> >> Summary: >> The third argument passes the sender_sp as the frame FP - which is valid for most situations where unextended_sp() + cb->frame_size() is used (a compiled frame's real_fp() is usually equivalent to the sender SP). But this is incorrect when PreserveFramePointer is set. To fix this, a real frame pointer must be passed to the constructor. >> >> Testing: jdk_jfr, stress testing >> >> Thanks >> Markus > > Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: > > ppc and s390 do not have frame::sender_sp_offset defined Marked as reviewed by egahlin (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/28373#pullrequestreview-3502299120 From egahlin at openjdk.org Mon Nov 24 21:31:11 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 24 Nov 2025 21:31:11 GMT Subject: RFR: 8372441: JFR: Improve logging of TestBackToBackSensitive Message-ID: Could I get a review of a PR that saves the dump file if a test fails? It also changes the order in which events are printed to make it easier to see in the log when chunk rotations happen. Testing: 1000 * jdk/jdk/jfr/event/runtime/TestBackToBackSensitive Thanks Erik ------------- Commit messages: - Initial Changes: https://git.openjdk.org/jdk/pull/28481/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28481&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372441 Stats: 18 lines in 1 file changed: 9 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/28481.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28481/head:pull/28481 PR: https://git.openjdk.org/jdk/pull/28481 From mgronlun at openjdk.org Tue Nov 25 09:06:42 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 25 Nov 2025 09:06:42 GMT Subject: RFR: 8372441: JFR: Improve logging of TestBackToBackSensitive In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 21:20:32 GMT, Erik Gahlin wrote: > Could I get a review of a PR that saves the dump file if a test fails? It also changes the order in which events are printed to make it easier to see in the log when chunk rotations happen. > > Testing: 1000 * jdk/jdk/jfr/event/runtime/TestBackToBackSensitive > > Thanks > Erik Marked as reviewed by mgronlun (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/28481#pullrequestreview-3504020339 From mgronlun at openjdk.org Tue Nov 25 09:12:25 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 25 Nov 2025 09:12:25 GMT Subject: Integrated: 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 In-Reply-To: References: Message-ID: <88Wp-Nr4BiaGMszYupcp1N1tOCZ3zpqFnNl7tnM716I=.55f04a06-e2b6-4c5c-b9bd-896218f5cc64@github.com> On Tue, 18 Nov 2025 13:36:01 GMT, Markus Grönlund wrote: > Greetings, > > Please see a description in the JIRA ticket about this problem related to PreserveFramePointer on arm64. > > Summary: > The third argument passes the sender_sp as the frame FP - which is valid for most situations where unextended_sp() + cb->frame_size() is used (a compiled frame's real_fp() is usually equivalent to the sender SP). But this is incorrect when PreserveFramePointer is set. To fix this, a real frame pointer must be passed to the constructor. > > Testing: jdk_jfr, stress testing > > Thanks > Markus This pull request has now been integrated. Changeset: 42f33335 Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/42f333352408e03389fb37ea8ad8537a4a271b6a Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod 8371368: SIGSEGV in JfrVframeStream::next_vframe() on arm64 Reviewed-by: egahlin ------------- PR: https://git.openjdk.org/jdk/pull/28373 From egahlin at openjdk.org Tue Nov 25 18:51:24 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 25 Nov 2025 18:51:24 GMT Subject: Integrated: 8372441: JFR: Improve logging of TestBackToBackSensitive In-Reply-To: References: Message-ID: On Mon, 24 Nov 2025 21:20:32 GMT, Erik Gahlin wrote: > Could I get a review of a PR that saves the dump file if a test fails? It also changes the order in which events are printed to make it easier to see in the log when chunk rotations happen.
> > Testing: 1000 * jdk/jdk/jfr/event/runtom/TestBackToBackSensitive > > Thanks > Erik This pull request has now been integrated. Changeset: c0abecdd Author: Erik Gahlin URL: https://git.openjdk.org/jdk/commit/c0abecdd1ffe59314bc17aeec0684cdda33a222d Stats: 18 lines in 1 file changed: 9 ins; 6 del; 3 mod 8372441: JFR: Improve logging of TestBackToBackSensitive Reviewed-by: mgronlun ------------- PR: https://git.openjdk.org/jdk/pull/28481 From coleenp at openjdk.org Wed Nov 26 13:04:51 2025 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 26 Nov 2025 13:04:51 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 12:10:55 GMT, Markus Gr?nlund wrote: > Greetings, > > this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. > > To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. > > Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. > > Testing: jdk_jfr, manual AOT verification, stress testing > > Thanks > Markus Why did you write a different concurrent hash table? Why wasn't the existing one good enough? You say it's simpler but this adds many lines of code that someone has to read. src/hotspot/share/jfr/support/jfrSymbolTable.hpp line 68: > 66: > 67: void on_link(const Symbols::Entry* entry); > 68: void on_unlink(const Symbols::Entry* entry); What is Symbols ? Is it a JFR thing, so should it be JfrSymbols ? 
------------- PR Review: https://git.openjdk.org/jdk/pull/28505#pullrequestreview-3510858860 PR Review Comment: https://git.openjdk.org/jdk/pull/28505#discussion_r2564923817 From mgronlun at openjdk.org Wed Nov 26 13:49:50 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 26 Nov 2025 13:49:50 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 13:01:30 GMT, Coleen Phillimore wrote: >> Greetings, >> >> this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. >> >> To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. >> >> Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. >> >> Testing: jdk_jfr, manual AOT verification, stress testing >> >> Thanks >> Markus > > src/hotspot/share/jfr/support/jfrSymbolTable.hpp line 68: > >> 66: >> 67: void on_link(const Symbols::Entry* entry); >> 68: void on_unlink(const Symbols::Entry* entry); > > What is Symbols? Is it a JFR thing, so should it be JfrSymbols? It's a more effective version of a concurrent hash table because we employ invariants that apply only to JFR: no concurrency state (hazard pointers, global counters, etc.) is needed to track concurrency; only CAS insertions are made into the mapped primary bucket, and searches are stable searches in the epoch-current list. Symbols are mainly the regular runtime Symbols, but we also have an extension for tracking `const char*` C strings. A JFR symbol is a constant pool construct, mapped onto an epoch-relative id. Since `Symbols` here is a typedef inside a JFR class, its scope is already JFR-specific.
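To make "only CAS insertions into the bucket, stable searches" concrete, here is a minimal insert-only table sketch. It is not the JFR implementation and does not model epochs: `Entry` and `CasBucketTable` are hypothetical names, and the assumption that entries are never unlinked while readers are active stands in for the epoch invariant mentioned above. Under that assumption a search needs no hazard pointers, because every node it can reach stays valid.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical node type, for illustration only.
struct Entry {
  std::string key;
  Entry* next;
};

// Insert-only hash table: new entries are pushed onto the bucket head with CAS.
class CasBucketTable {
 public:
  CasBucketTable() {
    for (auto& bucket : _buckets) {
      bucket.store(nullptr, std::memory_order_relaxed);
    }
  }

  ~CasBucketTable() {
    // Teardown is assumed to be single-threaded.
    for (auto& bucket : _buckets) {
      Entry* e = bucket.load(std::memory_order_relaxed);
      while (e != nullptr) {
        Entry* next = e->next;
        delete e;
        e = next;
      }
    }
  }

  // Returns the entry for 'key', inserting it if no thread has done so yet.
  const Entry* put(const std::string& key) {
    std::atomic<Entry*>& head = _buckets[std::hash<std::string>{}(key) % kBuckets];
    Entry* observed = head.load(std::memory_order_acquire);
    Entry* fresh = nullptr;
    for (;;) {
      if (const Entry* found = find(observed, key)) {
        delete fresh;  // another thread won the race with the same key
        return found;
      }
      if (fresh == nullptr) {
        fresh = new Entry{key, nullptr};
      }
      fresh->next = observed;
      if (head.compare_exchange_weak(observed, fresh,
                                     std::memory_order_acq_rel,
                                     std::memory_order_acquire)) {
        return fresh;
      }
      // CAS failed: 'observed' now holds the current head, so search again.
    }
  }

 private:
  static constexpr std::size_t kBuckets = 64;

  // Stable search: nodes are never removed, so plain traversal is safe.
  static const Entry* find(const Entry* e, const std::string& key) {
    for (; e != nullptr; e = e->next) {
      if (e->key == key) {
        return e;
      }
    }
    return nullptr;
  }

  std::atomic<Entry*> _buckets[kBuckets];
};
```

A general-purpose concurrent table has to cope with removal racing against lookup; giving that up is what keeps this version short.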
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/28505#discussion_r2565077778 From mgronlun at openjdk.org Wed Nov 26 14:27:38 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 26 Nov 2025 14:27:38 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading [v2] In-Reply-To: References: Message-ID: > Greetings, > > this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. > > To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. > > Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. > > Testing: jdk_jfr, manual AOT verification, stress testing > > Thanks > Markus Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: JFR_ONLY ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28505/files - new: https://git.openjdk.org/jdk/pull/28505/files/2c6e4fb6..350203ca Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28505.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28505/head:pull/28505 PR: https://git.openjdk.org/jdk/pull/28505 From mgronlun at openjdk.org Wed Nov 26 15:10:25 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 26 Nov 2025 15:10:25 GMT Subject: RFR: 8372586: Crashes on ppc64(le) after JDK-8371368 Message-ID: Greetings, Follow-up adjustment to JDK-8371368. 
Testing: aarch64 with -XX:+PreserveFramePointer Thanks Markus ------------- Commit messages: - 8372586 Changes: https://git.openjdk.org/jdk/pull/28509/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28509&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372586 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28509.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28509/head:pull/28509 PR: https://git.openjdk.org/jdk/pull/28509 From mgronlun at openjdk.org Wed Nov 26 15:23:11 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 26 Nov 2025 15:23:11 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading [v3] In-Reply-To: References: Message-ID: > Greetings, > > this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. > > To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. > > Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. 
> > Testing: jdk_jfr, manual AOT verification, stress testing > > Thanks > Markus Markus Gr?nlund has updated the pull request incrementally with one additional commit since the last revision: include apa ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28505/files - new: https://git.openjdk.org/jdk/pull/28505/files/350203ca..af55c2a7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=01-02 Stats: 4 lines in 2 files changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28505.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28505/head:pull/28505 PR: https://git.openjdk.org/jdk/pull/28505 From egahlin at openjdk.org Wed Nov 26 17:51:04 2025 From: egahlin at openjdk.org (Erik Gahlin) Date: Wed, 26 Nov 2025 17:51:04 GMT Subject: RFR: 8370715: JFR: Races are possible when dumping recordings [v2] In-Reply-To: References: Message-ID: On Fri, 21 Nov 2025 22:41:06 GMT, Robert Toyonaga wrote: >> #### Summary >> This PR changes the JFR snapshot dumping code so that multiple JFR recordings (potentially from different processes) racing to write the same dump destination won't mix their data. >> >> #### Problem >> The dump destination file is created and/or wiped when a recording is started, but not wiped again before actually copying over the chunks during a dump. So in the window of time between creating/wiping and dumping chunks, another recording could write to the same dump destination and have it's chunks added to the snapshot. This can happen with either a single JVM or multiple JVMs that are racing. >> >> #### Proposed fix >> This PR ensures that any data previously written to the dump destination is wiped before the new recording's data is written. File locking is also done while chunks are being written. 
>> >> Testing: >> - new test `jdk/jdk/jfr/api/recording/dump/TestDumpOverwrite.java` >> - Tier 1 > > Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: > > small cleanup The year is mentioned twice in the copyright header. Also, can you add @requires vm.flagless ------------- PR Comment: https://git.openjdk.org/jdk/pull/28460#issuecomment-3581149031 From duke at openjdk.org Wed Nov 26 17:51:02 2025 From: duke at openjdk.org (Robert Toyonaga) Date: Wed, 26 Nov 2025 17:51:02 GMT Subject: RFR: 8370715: JFR: Races are possible when dumping recordings [v3] In-Reply-To: References: Message-ID: > #### Summary > This PR changes the JFR snapshot dumping code so that multiple JFR recordings (potentially from different processes) racing to write the same dump destination won't mix their data. > > #### Problem > The dump destination file is created and/or wiped when a recording is started, but not wiped again before actually copying over the chunks during a dump. So in the window of time between creating/wiping and dumping chunks, another recording could write to the same dump destination and have it's chunks added to the snapshot. This can happen with either a single JVM or multiple JVMs that are racing. > > #### Proposed fix > This PR ensures that any data previously written to the dump destination is wiped before the new recording's data is written. File locking is also done while chunks are being written. 
> > Testing: > - new test `jdk/jdk/jfr/api/recording/dump/TestDumpOverwrite.java` > - Tier 1 Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision: vm.flagless and fix copyright header ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28460/files - new: https://git.openjdk.org/jdk/pull/28460/files/b2ef365d..f3f7da42 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28460&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28460&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/28460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28460/head:pull/28460 PR: https://git.openjdk.org/jdk/pull/28460 From mgronlun at openjdk.org Wed Nov 26 19:25:34 2025 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Wed, 26 Nov 2025 19:25:34 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading [v4] In-Reply-To: References: Message-ID: > Greetings, > > this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. > > To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. > > Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. 
> > Testing: jdk_jfr, manual AOT verification, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: adjustments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28505/files - new: https://git.openjdk.org/jdk/pull/28505/files/af55c2a7..d312adba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=02-03 Stats: 50 lines in 4 files changed: 22 ins; 3 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/28505.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28505/head:pull/28505 PR: https://git.openjdk.org/jdk/pull/28505 From mbaesken at openjdk.org Thu Nov 27 08:40:50 2025 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 27 Nov 2025 08:40:50 GMT Subject: RFR: 8372586: Crashes on ppc64(le) after JDK-8371368 In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 15:03:36 GMT, Markus Grönlund wrote: > Greetings, > > Follow-up adjustment to JDK-8371368. > > Testing: aarch64 with -XX:+PreserveFramePointer > > Thanks > Markus LGTM. Maybe there is a better way to do it, e.g. without the AARCH platform abstraction directly in shared code. But it fixes our crashes/asserts, that's good! With your patch, the errors on ppc64(le) are gone. ------------- Marked as reviewed by mbaesken (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28509#pullrequestreview-3514053677 PR Comment: https://git.openjdk.org/jdk/pull/28509#issuecomment-3584729676 From mgronlun at openjdk.org Thu Nov 27 09:02:51 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 27 Nov 2025 09:02:51 GMT Subject: RFR: 8372586: Crashes on ppc64(le) after JDK-8371368 In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 08:38:19 GMT, Matthias Baesken wrote: > LGTM. Maybe there is a better way to do it, e.g. without the AARCH platform abstraction directly in shared code.
But it fixes our crashes/asserts, that's good! Good news, thanks for verifying. Apologies for the inconvenience. Will go with this exception to the AARCH64 platform then. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28509#issuecomment-3584815993 From mgronlun at openjdk.org Thu Nov 27 09:12:00 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 27 Nov 2025 09:12:00 GMT Subject: RFR: 8372586: Crashes on ppc64(le) after JDK-8371368 In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 15:03:36 GMT, Markus Grönlund wrote: > Greetings, > > Follow-up adjustment to JDK-8371368. > > Testing: aarch64 with -XX:+PreserveFramePointer > > Thanks > Markus Proceeding with putback to restore pipelines back to normal. ------------- PR Comment: https://git.openjdk.org/jdk/pull/28509#issuecomment-3584841223 From mgronlun at openjdk.org Thu Nov 27 09:12:01 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 27 Nov 2025 09:12:01 GMT Subject: Integrated: 8372586: Crashes on ppc64(le) after JDK-8371368 In-Reply-To: References: Message-ID: On Wed, 26 Nov 2025 15:03:36 GMT, Markus Grönlund wrote: > Greetings, > > Follow-up adjustment to JDK-8371368. > > Testing: aarch64 with -XX:+PreserveFramePointer > > Thanks > Markus This pull request has now been integrated.
Changeset: 141aebca Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/141aebca38bc683cbff8a2dfe0cb98d3f0186a8c Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod 8372586: Crashes on ppc64(le) after JDK-8371368 Reviewed-by: mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/28509 From mgronlun at openjdk.org Thu Nov 27 10:34:26 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 27 Nov 2025 10:34:26 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading [v5] In-Reply-To: References: Message-ID: > Greetings, > > this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. > > To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. > > Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths.
> > Testing: jdk_jfr, manual AOT verification, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: remove ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28505/files - new: https://git.openjdk.org/jdk/pull/28505/files/d312adba..2e8809b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=03-04 Stats: 2 lines in 2 files changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28505.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28505/head:pull/28505 PR: https://git.openjdk.org/jdk/pull/28505 From krk at openjdk.org Thu Nov 27 16:14:57 2025 From: krk at openjdk.org (Kerem Kat) Date: Thu, 27 Nov 2025 16:14:57 GMT Subject: RFR: 8372587: Put jdk/jfr/jvm/TestWaste.java into the ProblemList Message-ID: It still fails in the Oracle environment as described in [JDK-8371630](https://bugs.openjdk.org/browse/JDK-8371630). ------------- Commit messages: - 8372587: Put jdk/jfr/jvm/TestWaste.java into the ProblemList Changes: https://git.openjdk.org/jdk/pull/28539/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=28539&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8372587 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/28539.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28539/head:pull/28539 PR: https://git.openjdk.org/jdk/pull/28539 From krk at openjdk.org Thu Nov 27 19:14:54 2025 From: krk at openjdk.org (Kerem Kat) Date: Thu, 27 Nov 2025 19:14:54 GMT Subject: RFR: 8372587: Put jdk/jfr/jvm/TestWaste.java into the ProblemList In-Reply-To: References: Message-ID: On Thu, 27 Nov 2025 16:04:56 GMT, Kerem Kat wrote: > It still fails in the Oracle environment as described in [JDK-8371630](https://bugs.openjdk.org/browse/JDK-8371630).
The only failure, which is on macOS, seems unrelated: TEST: gc/shenandoah/compiler/TestClone.java#generational-verify ... # Internal Error (/Users/runner/work/jdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp:1356), pid=18315, tid=16899 # Error: Remembered set violation at init-update-references; object not properly registered ------------- PR Comment: https://git.openjdk.org/jdk/pull/28539#issuecomment-3587069306 From mgronlun at openjdk.org Thu Nov 27 19:21:33 2025 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 27 Nov 2025 19:21:33 GMT Subject: RFR: 8365400: Enhance JFR to emit file and module metadata for class loading [v6] In-Reply-To: References: Message-ID: > Greetings, > > this enhancement adds a "source" field, label "Location" to the jdk.ClassDefine event. > > To enable this functionality, JFR needs a concurrent symbol table. We can build a simpler version of a concurrent hash table, taking advantage of the JFR epoch system. This will be useful also for planned future enhancements. > > Extensions are made to AOT to consistently report identical canonical paths for classes as non-AOT code paths. > > Testing: jdk_jfr, manual AOT verification, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: jfr views for class source ------------- Changes: - all: https://git.openjdk.org/jdk/pull/28505/files - new: https://git.openjdk.org/jdk/pull/28505/files/2e8809b0..c0e1124e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28505&range=04-05 Stats: 26 lines in 3 files changed: 21 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/28505.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/28505/head:pull/28505 PR: https://git.openjdk.org/jdk/pull/28505