From rkennke at openjdk.org Mon Jun 3 19:18:05 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 3 Jun 2024 19:18:05 GMT Subject: [master] Withdrawn: Lilliput master rebased on jdk-23+24 In-Reply-To: <6fOg6fb6yI_IFjOXZiRdiHngcNfbfn23U7qZNEtwhfw=.45d1a860-b2a2-4097-96ec-003f085751b0@github.com> References: <6fOg6fb6yI_IFjOXZiRdiHngcNfbfn23U7qZNEtwhfw=.45d1a860-b2a2-4097-96ec-003f085751b0@github.com> Message-ID: On Thu, 30 May 2024 08:37:38 GMT, Axel Boldt-Christmas wrote: > The patch queue was squashed from https://github.com/xmas92/lilliput/compare/lilliput_master_rebased...xmas92:lilliput:lilliput_master_rebased_pre_squash > to https://github.com/xmas92/lilliput/compare/lilliput_master_rebased_pre_squash...xmas92:lilliput:lilliput_master_rebased > > Testing > * Tier 1-3 with `+UseCompactObjectHeaders` > * Pre-existing issues due to changed default CDS archive names > * `tools/jlink/plugins/CDSPluginTest.java` > * `runtime/cds/appcds/dynamicArchive/TestAutoCreateSharedArchiveNoDefaultArchive.java` > * Tier 1-3 with `-UseCompactObjectHeaders` This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/lilliput/pull/180 From aboldtch at openjdk.org Mon Jun 3 19:18:05 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 3 Jun 2024 19:18:05 GMT Subject: [master] RFR: Lilliput master rebased on jdk-23+24 [v2] In-Reply-To: <6fOg6fb6yI_IFjOXZiRdiHngcNfbfn23U7qZNEtwhfw=.45d1a860-b2a2-4097-96ec-003f085751b0@github.com> References: <6fOg6fb6yI_IFjOXZiRdiHngcNfbfn23U7qZNEtwhfw=.45d1a860-b2a2-4097-96ec-003f085751b0@github.com> Message-ID: > The patch queue was squashed from https://github.com/xmas92/lilliput/compare/lilliput_master_rebased...xmas92:lilliput:lilliput_master_rebased_pre_squash > to https://github.com/xmas92/lilliput/compare/lilliput_master_rebased_pre_squash...xmas92:lilliput:lilliput_master_rebased > > Testing > * Tier 1-3 with `+UseCompactObjectHeaders` > * Pre-existing issues due to changed default CDS archive names > * `tools/jlink/plugins/CDSPluginTest.java` > * `runtime/cds/appcds/dynamicArchive/TestAutoCreateSharedArchiveNoDefaultArchive.java` > * Tier 1-3 with `-UseCompactObjectHeaders` Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/lilliput/pull/180/files - new: https://git.openjdk.org/lilliput/pull/180/files/76deeaa4..76deeaa4 Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=180&range=01 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=180&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/lilliput/pull/180.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/180/head:pull/180 PR: https://git.openjdk.org/lilliput/pull/180 From aboldtch at openjdk.org Tue Jun 4 06:03:17 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 4 Jun 2024 06:03:17 GMT Subject: [master] RFR: OMWorld: Decouple deflation and table sizing [v3] In-Reply-To: References: Message-ID: > The change reverts all changes to deflation and moves the resizing of the OMWorld ConcurrentHashTable to the service thread. Using a similar logic to how we resize the Symbol- and StringTables. > > The option to shrink the table is taken out and can be reintroduced at a later date as an enhancement. To do it correctly the interactions with deflation needs to be figured out. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains three commits: - Merge remote-tracking branch 'upstream_lilliput/master' into lilliput-decouple-deflation - Merge remote-tracking branch 'upstream_lilliput/master' into lilliput-decouple-deflation - Decouple deflation and table sizing ------------- Changes: https://git.openjdk.org/lilliput/pull/175/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=175&range=02 Stats: 205 lines in 7 files changed: 65 ins; 91 del; 49 mod Patch: https://git.openjdk.org/lilliput/pull/175.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/175/head:pull/175 PR: https://git.openjdk.org/lilliput/pull/175 From coleenp at openjdk.org Tue Jun 4 14:17:47 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 4 Jun 2024 14:17:47 GMT Subject: [master] RFR: OMWorld: Decouple deflation and table sizing [v3] In-Reply-To: References: Message-ID: On Tue, 4 Jun 2024 06:03:17 GMT, Axel Boldt-Christmas wrote: >> The change reverts all changes to deflation and moves the resizing of the OMWorld ConcurrentHashTable to the service thread. Using a similar logic to how we resize the Symbol- and StringTables. >> >> The option to shrink the table is taken out and can be reintroduced at a later date as an enhancement. To do it correctly the interactions with deflation needs to be figured out. > > Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - Merge remote-tracking branch 'upstream_lilliput/master' into lilliput-decouple-deflation > - Merge remote-tracking branch 'upstream_lilliput/master' into lilliput-decouple-deflation > - Decouple deflation and table sizing The interaction with the OM deflation thread isn't straightforward so it seems better to have this in the ServiceThread for now. Thanks for explaining this @xmas92 ------------- Marked as reviewed by coleenp (Committer). 
PR Review: https://git.openjdk.org/lilliput/pull/175#pullrequestreview-2096553858 From stuefe at openjdk.org Tue Jun 4 16:31:46 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jun 2024 16:31:46 GMT Subject: [master] RFR: Prepare for smaller-than-22-bit class pointers [v2] In-Reply-To: References: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> Message-ID: <074MBqmMA5UQ-oiYh9LUM6OvLCnODleaOpTEA-YKBCU=.83674c25-4806-40c1-a34d-99e05cfcaf74@github.com> On Tue, 30 Apr 2024 09:00:48 GMT, Thomas Stuefe wrote: >> This PR prepares using arbitrary klass pointer sizes (e.g. 16). It cleans up a few places and corrects comments. >> >> The changes in detail: >> >> - exposes a new function `CompressedKlassPointers::max_encoding_range_size()` that returns the maximum possible size of the encoding range given the current nKlass geometry (e.g. 16 bit klass pointers with a max. shift of 10 bits can encode 64MB of class space). >> >> - In Metaspace::ergo_initialize(), where we ergo-adjust the CompressedClassSpaceSize, the maximum possible encoding range size flows into this adjustment now. We also print clearer warnings in case the user specifies CCS size explicitly, and we override that decision. >> >> - removed any hard-wiredness of a "max class space size/encoding range size of 4GB" since with smaller geometries that does not hold true anymore. Instead, we now use `CompressedKlassPointers::max_encoding_range_size()`. >> >> - made the requirements on klass_alignment_in_bytes clearer when setting up class space >> >> - removed remnant code (TinyClassPointerShift) left over from development >> >> >> Note: One still unsolved problem (unsolved in Lilliput as well as upstream) is to correctly limit the compressed class space size in the presence of CDS. Upstream "solves" this by capping CCS size at 3GB, which leaves 1GB for CDS archives. There is no solution if CDS would ever exceed this limit, and it's a waste of space for CCS. 
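The encoding-range figure quoted in the PR description (16-bit klass pointers with a maximum shift of 10 bits covering 64MB) can be checked with a minimal sketch. The helper name `encoding_range_bytes` is hypothetical; it is not the HotSpot implementation of `CompressedKlassPointers::max_encoding_range_size()`, only the arithmetic the description implies, assuming the range is simply 2^(bits + shift) bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not HotSpot code): size in bytes of the address range
// reachable by a narrow Klass id of `bits` bits that is left-shifted by `shift`.
constexpr uint64_t encoding_range_bytes(unsigned bits, unsigned shift) {
  return uint64_t{1} << (bits + shift);
}

constexpr uint64_t MB = uint64_t{1} << 20;
constexpr uint64_t GB = uint64_t{1} << 30;

// The example from the PR description: 16 bits, shift 10 -> 64MB.
static_assert(encoding_range_bytes(16, 10) == 64 * MB, "16-bit nKlass, shift 10");
// The 22-bit Lilliput geometry with a 10-bit shift covers the classic 4GB.
static_assert(encoding_range_bytes(22, 10) == 4 * GB, "22-bit nKlass, shift 10");
// An unshifted 32-bit nKlass covers the same 4GB.
static_assert(encoding_range_bytes(32, 0) == 4 * GB, "32-bit nKlass, no shift");
```

Under this assumption, shrinking the nKlass without raising the shift shrinks the range that class space and CDS archives have to share, which is the constraint the note above describes.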
>> >> In Lilliput, if we limit the class pointer size such that we drastically reduce the klass encoding range size, we need to be better at splitting that klass encoding range between CDS and class space. For example, we could map CDS and then use the remaining space completely for class space. But that would require more serious reshuffling for initialization code, and CDS setup is horrendously complex. >> >> For this patch, if one wants to reduce class pointer size, one may have to disable CDS to run. >> >> Tested: Mac m1, fastdebug, with 32, 22 and 16 bit class pointers. GHAs in process. > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into preparation-for-arbitrary-cp-sizes > - Fix behavior when running with a very small MaxMetaspaceSize > - start keep open bot ------------- PR Comment: https://git.openjdk.org/lilliput/pull/172#issuecomment-2147726536 From coleen.phillimore at oracle.com Tue Jun 4 20:52:49 2024 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 4 Jun 2024 16:52:49 -0400 Subject: OM World and Lilliput planning Message-ID: Hi, This is what I wrote up after an internal discussion. I am about to file some RFEs/CSRs (or maybe will next week). Let me know what you think. Thanks, Coleen What we call OM World is saving the ObjectMonitor in a ConcurrentHashTable rather than in the markWord of the Java Object. Lilliput absolutely requires this since for Lilliput the Klass pointer is also in the markWord and to get to the Klass pointer for a locked object, the code would have to go to the displaced header in unboundedly racy situations. Without Lilliput, this is also helpful in that it frees up markWord bits for concurrent GCs or Valhalla to use. 
Because of this, and because of the high level of testing this type of change requires, we'd like to push this change to mainline ahead of the Lilliput work. OM World is built on top of Lightweight locking as Lightweight locking is required (doesn't save the stack location in the markWord as does Legacy locking). To reduce the maintenance burden and potential tricky interactions between new features and Legacy locking, we'd like to deprecate Legacy locking in JDK 24. Deprecating Legacy locking then makes the flag LockingMode not make any sense, as one of three enumerations will be missing. Also, to introduce OM World on top of Lightweight locking, it would be good to have that on a diagnostic flag in case of customer performance issues. It doesn't make sense to have a new locking mode for OM World, since it shares 80% code with Lightweight locking. Therefore I (with input from Axel and Stefan) propose the following for JDK 24: 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR 2. Deprecate LockingMode=LM_LEGACY 3. Deprecate the flag LockingMode. It's a new flag, legacy code won't miss it. 4. When OM World is ready to integrate, introduce a new diagnostic flag UseObjectMonitorTable - Start default off - Make it default on midway through JDK 24 if no problems. JDK 25: 1. Obsolete Legacy locking mode (removes the code - TBD) 2. Obsolete LockingMode flag 3. We can hold onto UseObjectMonitorTable for a while (off turns off Lilliput UseCompactObjectHeaders). From coleenp at openjdk.org Tue Jun 4 22:42:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 4 Jun 2024 22:42:38 GMT Subject: [master] RFR: Lilliput om world Message-ID: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. 
Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. tier1 aarch64 in progress. ------------- Commit messages: - Add UseObjectMonitorTable to disable the table to allow performance work. Changes: https://git.openjdk.org/lilliput/pull/181/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=00 Stats: 725 lines in 21 files changed: 342 ins; 72 del; 311 mod Patch: https://git.openjdk.org/lilliput/pull/181.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/181/head:pull/181 PR: https://git.openjdk.org/lilliput/pull/181 From aboldtch at openjdk.org Tue Jun 4 22:42:38 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 4 Jun 2024 22:42:38 GMT Subject: [master] RFR: Lilliput om world In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: <6EyQ7nMp6MODXsrfcOVmlGCc3TKUk2PdIdZQ5N3goLg=.62a3b969-f4b9-4be7-ac4b-675e202ca333@github.com> On Fri, 31 May 2024 20:46:25 GMT, Coleen Phillimore wrote: > Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. > > Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). > > Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. > > tier1 aarch64 in progress. I like this change and the restructuring. src/hotspot/share/runtime/synchronizer.cpp line 427: > 425: // Recursive lock successful. 
> 426: current->inc_held_monitor_count(); > 427: return true; This needs a CacheSetter Suggestion: CacheSetter cache_setter(current, lock); // Recursive lock successful. 
current->inc_held_monitor_count(); return true; ------------- PR Review: https://git.openjdk.org/lilliput/pull/181#pullrequestreview-2093593729 PR Review Comment: https://git.openjdk.org/lilliput/pull/181#discussion_r1624279436 From coleenp at openjdk.org Tue Jun 4 22:42:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 4 Jun 2024 22:42:38 GMT Subject: [master] RFR: Lilliput om world In-Reply-To: <6EyQ7nMp6MODXsrfcOVmlGCc3TKUk2PdIdZQ5N3goLg=.62a3b969-f4b9-4be7-ac4b-675e202ca333@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> <6EyQ7nMp6MODXsrfcOVmlGCc3TKUk2PdIdZQ5N3goLg=.62a3b969-f4b9-4be7-ac4b-675e202ca333@github.com> Message-ID: On Mon, 3 Jun 2024 11:45:12 GMT, Axel Boldt-Christmas wrote: >> Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. >> >> Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). >> >> Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. >> >> tier1 aarch64 in progress. > > src/hotspot/share/runtime/synchronizer.cpp line 427: > >> 425: // Recursive lock successful. >> 426: current->inc_held_monitor_count(); >> 427: return true; > > This needs a CacheSetter > > Suggestion: > > CacheSetter cache_setter(current, lock); > > // Recursive lock successful. > current->inc_held_monitor_count(); > return true; Ok, yes, I missed that the fast lock stack case set the monitor. I was trying to keep the CacheSetter contained in lightweightSynchronizer.cpp. I'll do some more refactoring. Thanks for pointing this out. 
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/181#discussion_r1624300664 From coleenp at openjdk.org Tue Jun 4 22:42:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 4 Jun 2024 22:42:38 GMT Subject: [master] RFR: Lilliput om world In-Reply-To: References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> <6EyQ7nMp6MODXsrfcOVmlGCc3TKUk2PdIdZQ5N3goLg=.62a3b969-f4b9-4be7-ac4b-675e202ca333@github.com> Message-ID: <_f0xTu1X7l7zl69g6hSTopcauiiO5Bh-NoU1AXn8NmE=.57e300dc-c68f-49ef-8669-ceaaa4859ead@github.com> On Mon, 3 Jun 2024 12:03:07 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/synchronizer.cpp line 427: >> >>> 425: // Recursive lock successful. >>> 426: current->inc_held_monitor_count(); >>> 427: return true; >> >> This needs a CacheSetter >> >> Suggestion: >> >> CacheSetter cache_setter(current, lock); >> >> // Recursive lock successful. >> current->inc_held_monitor_count(); >> return true; > > Ok, yes, I missed that the fast lock stack case set the monitor. I was trying to keep the CacheSetter contained in lightweightSynchronizer.cpp. I'll do some more refactoring. Thanks for pointing this out. Oh now I see. If we do a recursive lock, we need to clear_object_monitor_cache for some reason I haven't figured out yet. 
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/181#discussion_r1624371840 From aboldtch at openjdk.org Tue Jun 4 22:42:38 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 4 Jun 2024 22:42:38 GMT Subject: [master] RFR: Lilliput om world In-Reply-To: <_f0xTu1X7l7zl69g6hSTopcauiiO5Bh-NoU1AXn8NmE=.57e300dc-c68f-49ef-8669-ceaaa4859ead@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> <6EyQ7nMp6MODXsrfcOVmlGCc3TKUk2PdIdZQ5N3goLg=.62a3b969-f4b9-4be7-ac4b-675e202ca333@github.com> <_f0xTu1X7l7zl69g6hSTopcauiiO5Bh-NoU1AXn8NmE=.57e300dc-c68f-49ef-8669-ceaaa4859ead@github.com> Message-ID: On Mon, 3 Jun 2024 12:27:54 GMT, Coleen Phillimore wrote: >> Ok, yes, I missed that the fast lock stack case set the monitor. I was trying to keep the CacheSetter contained in lightweightSynchronizer.cpp. I'll do some more refactoring. Thanks for pointing this out. > > Oh now I see. If we do a recursive lock, we need to clear_object_monitor_cache for some reason I haven't figured out yet. Yeah I am not 100% happy with the CacheSetter, it at least needs better documentation. The issue is that an uninitialised value will (more than likely) look like an ObjectMonitor* to the cache lookup code. So it will break if this thread observes that it is inflated (for example by calling wait), where it will then crash when doing a cache lookup in exit. It aims to uphold the simple invariant that after ObjectSynchronizer::enter the BasicLock contains either the ObjectMonitor locked on or nullptr. 
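The invariant described above (after enter, the lock's cache slot holds either the monitor that was locked or nullptr, never stale stack garbage) is the kind of guarantee a scope guard can enforce on every return path. The following is a minimal sketch of that idea only. `Monitor`, `LockSlot`, `CacheGuard` and `enter_sketch` are hypothetical stand-ins invented for illustration; they are not the HotSpot `ObjectMonitor`, `BasicLock` or `CacheSetter` types, and the real code may differ in detail.

```cpp
#include <cassert>

struct Monitor {};            // stand-in for an inflated monitor
struct LockSlot {             // stand-in for an on-stack lock record
  Monitor* cached;            // deliberately left uninitialized, like raw stack memory
};

// Scope guard: whatever path leaves the enclosing function, the slot ends up
// holding either the monitor recorded via set_monitor() or nullptr.
class CacheGuard {
  LockSlot* _slot;
  Monitor*  _monitor = nullptr;
 public:
  explicit CacheGuard(LockSlot* slot) : _slot(slot) {}
  ~CacheGuard() { _slot->cached = _monitor; }
  void set_monitor(Monitor* m) { _monitor = m; }
};

// Hypothetical enter path with an early-return fast path and a slow path.
bool enter_sketch(LockSlot* slot, Monitor* inflated, bool recursive_fast_path) {
  CacheGuard guard(slot);
  if (recursive_fast_path) {
    return true;              // early return: the guard still clears the slot
  }
  guard.set_monitor(inflated);
  return true;                // slow path: the guard publishes the monitor
}
```

With the guard declared first, a newly added early return cannot forget to initialize the slot, which is the failure mode the review comment points out for the recursive-lock path.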
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/181#discussion_r1624509500 From aboldtch at openjdk.org Wed Jun 5 07:08:20 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 5 Jun 2024 07:08:20 GMT Subject: [master] Integrated: OMWorld: Decouple deflation and table sizing In-Reply-To: References: Message-ID: On Thu, 23 May 2024 06:32:28 GMT, Axel Boldt-Christmas wrote: > The change reverts all changes to deflation and moves the resizing of the OMWorld ConcurrentHashTable to the service thread. Using a similar logic to how we resize the Symbol- and StringTables. > > The option to shrink the table is taken out and can be reintroduced at a later date as an enhancement. To do it correctly the interactions with deflation needs to be figured out. This pull request has now been integrated. Changeset: 47ae25c5 Author: Axel Boldt-Christmas URL: https://git.openjdk.org/lilliput/commit/47ae25c5c8b19f9a35709c698f9b9e6938326a75 Stats: 205 lines in 7 files changed: 65 ins; 91 del; 49 mod OMWorld: Decouple deflation and table sizing Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/lilliput/pull/175 From stefan.karlsson at oracle.com Wed Jun 5 07:19:52 2024 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 5 Jun 2024 09:19:52 +0200 Subject: OM World and Lilliput planning In-Reply-To: References: Message-ID: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com> Hi Coleen, Thanks for moving the "OM World" towards completion. I have one comment below: On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote: > > Hi, This is what I wrote up after an internal discussion. I am about > to file some RFEs/CSRs (or maybe will next week). Let me know what > you think. > > Thanks, > Coleen > > What we call OM World is saving the ObjectMonitor in a > ConcurrentHashTable rather than in the markWord of the Java Object. 
> Lilliput absolutely requires this since for Lilliput the Klass pointer > is also in the markWord and to get to the Klass pointer for a locked > object, the code would have to go to the displaced header in > unboundedly racy situations. > > Without Lilliput, this is also helpful in that it frees up markWord > bits for concurrent GCs or Valhalla to use. Because of this, and > because of the high level of testing this type of change requires, > we'd like to push this change to mainline ahead of the Lilliput work. > > OM World is built on top of Lightweight locking as Lightweight locking > is required (doesn't save the stack location in the markWord as does > Legacy locking). To reduce the maintenance burden and potential > tricky interactions between new features and Legacy locking, we'd like > to deprecate Legacy locking in JDK 24. > > Deprecating Legacy locking then makes the flag LockingMode not make > any sense, as one of three enumerations will be missing. Also, to > introduce OM World on top of Lightweight locking, it would be good to > have that on a diagnostic flag in case of customer performance > issues. It doesn't make sense to have a new locking mode for OM > World, since it shares 80% code with Lightweight locking. > > Therefore I (with input from Axel and Stefan) propose the following > for JDK 24: > > 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR > 2. Deprecate LockingMode=LM_LEGACY > 3. Deprecate the flag LockingMode. It's a new flag, legacy code won't > miss it. > 4. When OM World is ready to integrate, introduce a new diagnostic > flag UseObjectMonitorTable > - Start default off > - Make it default on midway through JDK 24 if no problems. What is the benefit of starting with this turned off and then a few weeks later making it default? I think we'll get better functional test coverage if it is enabled by default. 
We had a very similar situation when Lightweight locking was turned off by default and many bugs weren't found until it was turned on by default. Thanks, StefanK > > JDK 25: > > 1. Obsolete Legacy locking mode (removes the code - TBD) > 2. Obsolete LockingMode flag > 3. We can hold onto UseObjectMonitorTable for a while (off turns off > Lilliput UseCompactObjectHeaders). > From coleen.phillimore at oracle.com Tue Jun 4 20:31:17 2024 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 4 Jun 2024 16:31:17 -0400 Subject: [External] : Solving the Klass hyperalignment problem In-Reply-To: References: Message-ID: <0c15f5ca-6d7b-4439-a120-a5a173b45201@oracle.com> Hi Thomas, I'm sad and disappointed that the indirect klass pointer approach performed poorly. Maybe someday we'll have hardware where loading from memory is not so noticeably slower or algorithms where we don't load the klass in tight loops. Until that day, your uneven cache line approach sounds like the most feasible approach, despite the quite complicated instructions to encode and decode the klass pointers. Luckily we don't encode and decode klass pointers in too many places. I do have a patch for allocating klasses without instances but I won't get to it for a few weeks. Thank you for the great writeup. Coleen On 5/23/24 11:20 AM, Thomas Stüfe wrote: > Hi all, > > I would like help deciding on the best mitigation strategy for > Lilliput's Klass hyperalignment problem. Since it has wide effects > (e.g. a possible removal of class space), I'd like to base the next > steps on consensus. > > (a more readable version of this, in markdown, is here: > https://gist.github.com/tstuefe/6d8c4a40689c34b12f79442a8469504e > ). > > 1. Background > > We store class information in Klass, and resolving Klass from oop is a > hot path. One example is GC: During GCs, we churn through tons of > objects and need to get at least object size (layout helper) and > Oopmap from Klass frequently. 
Therefore, the way we resolve a nKlass > matters for performance. > > Today (non-Lilliput), we go from Object to Klass (with compressed > class pointers) this way: We pluck the nKlass from the word adjacent > to the MW. We then calculate Klass* from nKlass by - typically - just > adding the encoding base as immediate. We may or may not omit that > add, and we may or may not shift the nKlass, but the most typical case > (CDS enabled) is just the addition. > > Today's decoding does not need a single memory access, it can happen > in registers only. > > In Lilliput, the nKlass lives in the MW (which allows us to load > nKlass with the same load that loads the MW). Therefore, nKlass needs > to shrink. The problem with the classic 32-bit nKlass is that its > value range is not used effectively. Klass structures tend to be > large, on average 500-700 bytes [1], and that means a lot of values in > that 32-bit range are "blind" - point into the middle of a class - and > are hence wasted. Ideally, one would want to use one nKlass value per > class. > > In Lilliput, we reduced nKlass to 22-bit. We do this by placing Klass > structures only on 1KB-aligned addresses. Therefore, the lower 10 bits > are 0 and can be shifted out. 1KB was chosen as a middle-ground that > allows us to use both the nKlass value range and the Klass encoding > range (4GB) effectively. > > > 2. The Problem > > By keeping Klass structures 1KB-aligned, we march in lockstep with > respect to CPU caches. With a cache line size of 64 bytes (6 bits) and > an alignment of 1KB (10 bits), we lose 4 bits of entropy. Therefore, > loads from a Klass structure have a high chance of evicting earlier > loads from other Klass structures. Depending on saturation and number > of cache ways, we may only use a 16th of the caches. > > We see this effect clearly, especially in GC pause times. The bad > cache behavior is clearly noticeable and needs to be solved. > > > 3. The solutions > > 3.1. 
Short-term mitigation: Increasing the nKlass size to 26 bits > > A simple short-term mitigation for Lilliput Milestone 1 is just > increasing the nKlass size. By reducing the nKlass size to 22 bits, we > freed up 10 bits, but to date, we only use 6 of them. For now, we have > 4 spare bits in the header. We could increase the nKlass to 26 bits > and work with a shift of 6 bits instead of 10 bits. That would require > no additional work. Klass would be aligned to just 64 bytes, so the > cache performance problems would disappear. > > Note, however, that we earmarked those 4 spare bits for Valhalla's > use. Therefore, reverting to a 26-bit nKlass can only be an > intermediate step. And, obviously, it won't do for 32-bit headers. > > > 3.2 Use a pointer indirection table > > This idea resurfaces every couple of years. The idea is to replace the > class space with an indirection pointer table. In this approach, a > nKlass would be an index into a global pointer table, and that pointer > table contains the real Klass* pointers. > > The enticing part of this approach is that we could throw away the > class space and a bunch of supporting code. Klass structures could > live in normal Metaspace like all the other data. However, we would > need some new code to maintain the global Klass* table, recycle empty > pointer slots after class unloading, etc. > > Decoding a nKlass would mean: > - load the nKlass from the object > - load Klass* from the indirection table at the index nKlass points to > > The approach would solve the cache problem described above > since removing any alignment requirement from Klass structures allows > us to place them wherever we like (e.g., in standard Metaspace), and > their start addresses would not march in lockstep. > > However, it introduces a new cache problem since we now have a new > load in the hot decoding path. And the Klasspointer table can only be > improved so much: only 8 uncompressed pointers fit into a cache line. 
> From a certain number of classes, subsequent table accesses will have > little spatial locality. > > 3.3 Place Klass on alternating cache lines > > Originally brought up by John Rose [2] when we did the first iteration > of 22-bit class pointers in Lilliput. The idea is to alter the > locations of Klass structures by cache line size. There is nothing > that forces us to use a power-of-two stride. We can use any uneven > multiple of cache lines that we like. For example, 11 cache lines (704 > bytes) would mean that Klass structures would come to be located on > different cache lines. > > With a non-pow2 stride, decoding becomes a bit more complex. We cannot > use shift, we need to do integer multiplication: > - multiply nKlass with 704 > - add base. > > But all of this can still happen in registers only. No memory load is > needed. > > 4. The prototypes > > I wanted to measure the performance impacts of all approaches. So I > compared four JVMs: > > - A) (the unmitigated case) a Lilliput JVM as it is today: 22-bit > nKlass, 10-bit shift > - B) a Lilliput JVM that uses a 26-bit nKlass with a 6-bit shift > - C) a Lilliput JVM that uses a 22-bit nKlass and a Klass pointer > indirection table [3] > - D) a Lilliput JVM that uses a 22-bit nKlass and a non-pow2 alignment > of Klass of 704 [4] > > It turned out that Coleen also wrote a prototype with a Klass pointer > indirection table [5], but that is identical to mine (C) in all > relevant points. The only difference is that Coleen based it on the > mainline JVM; mine is based on Lilliput. But I repeated all my tests > with Coleen's prototype. > > 5. The Tests > > I did both SpecJBB2015 and a custom-written Microbenchmark [6]. > > The microbenchmark was designed to stress Object-to-Klass > dereferencing during GC. It fills the heap with many objects of > randomly chosen classes. It keeps those objects alive in an array. It > then executes several Full GCs and sums up all GC pause times. 
Walking > these objects forces the GCs to de-reference many different nKlass values. > > The Microbenchmark was run on a Ryzen Zen 2, the SpecJBB on an older > i7-4770, and Coleen's prototype I also tested on a Raspberry 4. The > tests were isolated to 8 cores (well, apart from the Raspberry), and I > tried to minimize scheduler interference. The microbenchmark results > were pretty stable, but the SpecJBB2015 results fluctuated - despite > my attempts at stabilizing them. > > I repeated the Microbenchmark for a number of classes (512..16384) and > three different collectors (Serial, Parallel, G1). > > 6. The Results > > The microbenchmark shows clear and stable results. SpecJBB fluctuated > more. > > 6.1 Microbenchmark, G1GC > > See graph [7]. > > (A) - the unmitigated 22-bit version showed overall the worst > performance, with a 41% increase over the best performer (B, the > 26-bit version) at 16k classes. > (B) - best performance > (C) - the klass pointer table seemed overall the worst of the three > mitigation prototypes. For 4k..8k classes, even worse than the > unmitigated case (A). Maxes out at +36% over the best performer (B) at > 16k classes. > (D) - second best performance, for certain class ranges even best. > Maxes out at +11% at 16k classes. > > I wondered why (D) could be better than (B). My assumption is that > with (D), we go out of our way to choose different cache lines. With > (B), the cache line chosen is "random" and may be subject to > allocation pattern artifacts of the underlying allocator. > > 6.2 Microbenchmark, ParallelGC > > See graphs [8]. > > Differences are less pronounced, but the results are similar. (B) > better than (D) better than (C). > > 6.3 Microbenchmark, SerialGC > > See graphs [9]. > > Again, the same result. Here the deltas are most pronounced; cache > inefficiency, measured via GC pauses, is the most apparent. 
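The cache effect behind these pause-time differences can be illustrated with a small model. This is a sketch under loudly stated assumptions: a cache with 64-byte lines and 64 sets (a made-up but plausible L1 geometry), set selection by simple address bits, and only the first cache line of each Klass considered. `distinct_sets_touched` is a hypothetical helper, not part of any prototype mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Assumed cache geometry, for illustration only.
constexpr uint64_t kLineSize = 64;   // bytes per cache line
constexpr uint64_t kNumSets  = 64;   // number of cache sets

// Counts how many different cache sets are hit by the start addresses of
// `count` structures laid out back to back at a fixed stride.
std::size_t distinct_sets_touched(uint64_t stride_bytes, uint64_t count) {
  assert(stride_bytes % kLineSize == 0);   // the model assumes line-aligned starts
  std::set<uint64_t> sets;
  for (uint64_t i = 0; i < count; i++) {
    uint64_t addr = i * stride_bytes;
    sets.insert((addr / kLineSize) % kNumSets);
  }
  return sets.size();
}
```

In this model a 1024-byte stride reaches only 4 of the 64 sets (one sixteenth, matching the 4 lost bits of entropy), while strides of 64 bytes (the 26-bit variant) and 704 bytes (11 cache lines, the staggered variant) each reach all 64, because 11 and 64 share no common factor.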
> > 6.4 Coleen's prototype > > I repeated the measurements with Coleen's prototype (with G1), > comparing it against the same JVM with klass table switched off. No > surprises, similar behavior to (C) vs (B). See > [10]. I also did a run with perf to measure L1 misses, and we see up > to 32% more L1 cache misses with the klass pointer table [11]. > > 6.5 SpecJBB2015 > > SpecJBB results were quite volatile despite my stabilization efforts. > The deltas between maxJOps and critJOps did not rise above random > noise. The GC pause times showed (Percentage numbers, compared with > (A)==100%): >
> | Run | 1       | 2      | 3      |
> |-----|---------|--------|--------|
> | B   | 108.68% | 95.88% | 97.01% |
> | C   | 104.26% | 92.15% | 90.25% |
> | D   | 96.20%  | 94.31% | 86.44% |
> > (all with G1GC) > > Again, (D) seems to perform best. I am unsure what the problem is with > (B) here (the 26-bit class pointer approach) since it seems to perform > worst in all cases. I will look into that. > > 7. Side considerations > > 7.1 Running out of class space/addressable classes? > > This issue does not affect the number of addressable classes much. > That one is limited by the size of nKlass. With a 22-bit nKlass that > is 4 million. We can only work beyond that with the concept of near- vs > far classes suggested by John Rose. In any case, that is not the focus > here. > > The class-space-based approaches (A), (B), and (D) also have a soft > limit in that the number of Klass we can store is limited by the size > of the encoding range (4GB). However, that is a rather soft limit > because no hard technical reason prevents us from having a larger > encoding range. The 4G limitation exists only because we optimize the > addition of the base immediate by using e.g. 16-bit moves on some > platforms, so the nKlass must not extend beyond bit 31. We may be able > to do that differently. Note that 4GB is also really large for Klass data. 
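The limits in 7.1 and the decode variants compared in sections 3 and 4 can be put side by side in a short sketch. All names below are hypothetical and chosen for illustration; they do not come from any of the prototypes. The sketch assumes a 22-bit nKlass, a 4GB encoding range, and addresses modelled as plain 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr unsigned kNarrowKlassBits = 22;
constexpr uint64_t kMaxKlasses      = uint64_t{1} << kNarrowKlassBits;
constexpr uint64_t kEncodingRange   = uint64_t{4} << 30;   // 4GB

// (A)/(B): power-of-two alignment, decode is a shift plus an add, registers only.
constexpr uint64_t decode_shift(uint64_t base, uint32_t nk, unsigned shift) {
  return base + (uint64_t{nk} << shift);
}

// (D): non-power-of-two stride, decode is a multiply plus an add, registers only.
constexpr uint64_t decode_stride(uint64_t base, uint32_t nk, uint64_t stride) {
  return base + uint64_t{nk} * stride;
}

// (C): indirection table, decode needs one extra memory load.
inline uint64_t decode_table(const std::vector<uint64_t>& table, uint32_t nk) {
  assert(nk < table.size());
  return table[nk];
}

// A 22-bit nKlass addresses about 4 million classes.
static_assert(kMaxKlasses == 4194304, "2^22 narrow Klass values");
// With a 1KB stride those values span the 4GB encoding range exactly.
static_assert(kMaxKlasses * 1024 == kEncodingRange, "22 bits, 1KB alignment");
// With a 704-byte stride they span less than 4GB, so the range limit is not hit first.
static_assert(kMaxKlasses * 704 < kEncodingRange, "22 bits, 704-byte stride");
// The 26-bit variant with 64-byte alignment also spans exactly 4GB.
static_assert((uint64_t{1} << 26) * 64 == kEncodingRange, "26 bits, 64-byte alignment");
```

For example, `decode_stride(base, 3, 704)` yields `base + 2112`, whereas `decode_shift(base, 3, 10)` yields `base + 3072`; both stay in registers, which is the property the table variant gives up.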
>
> 7.2 Reducing the number of Klass structures addressable via nKlass
>
> Coleen had a great idea: never-instantiated classes (e.g., Lambda
> Forms) don't need an nKlass at all. They, therefore, don't have to live
> within the Klass encoding range. That would be a great improvement
> since these classes are typically generated, and their number is
> unpredictable. By removing this kind of class from the equation, the
> question of the number of addressable classes becomes a lot more relaxed.
>
> 8. Conclusions
>
> For the moment, I prefer (D) (the uneven-cache-lines approach). It
> shows the overall best performance, in parts even outperforming the
> 26-bit approach.
>
> The klass pointer table approach (C) would be nice since we could
> eliminate class space and a lot of the code that goes with it. That
> would reduce complexity. But the additional load in hot decoding paths
> hurts. There is also the vague fear of not being future-proof. I am
> apprehensive about sacrificing Klass resolving performance since Klass
> lookup seems to be something we will always do.
>
> That said, all input (especially from you, Coleen!) is surely welcome.
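
To make the "additional load in hot decoding paths" concrete, here is a minimal sketch of the two decoding styles. It is an illustration under stated assumptions, not code from any of the prototypes: `Klass`, `decode_by_arithmetic` and `decode_by_table` are hypothetical names, and the table is a plain `std::vector`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Klass { int id; };  // stand-in for the real Klass structure

// Class-space approaches (A), (B), (D): the nKlass is a scaled offset from a
// base address, so decoding is pure arithmetic with no memory access.
inline Klass* decode_by_arithmetic(std::uintptr_t base, unsigned shift, std::uint32_t nk) {
  return reinterpret_cast<Klass*>(base + (static_cast<std::uintptr_t>(nk) << shift));
}

// Table approach (C): the nKlass is an index, so decoding has to load the
// Klass pointer from the table first. This is the extra load, and it can
// miss the cache before the Klass itself is even touched.
inline Klass* decode_by_table(const std::vector<Klass*>& table, std::uint32_t nk) {
  return table[nk];
}
```

The trade-off sketched here matches the conclusion above: the table removes the need for a contiguous class space, but it puts a dependent memory access in front of every Klass lookup.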
>
> # Materials
>
> All test results, tests, etc. can be found here: [12]
>
>
> Thanks, Thomas
>
>
> - [1] [Allocation Histogram for Klass- and Non-Klass Allocations](https://raw.githubusercontent.com/tstuefe/metaspace-statistics/ab3625e041d42243039f37983969ac8b770a9f4a/Histogram.svg)
> - [2] https://github.com/openjdk/lilliput/pull/13#issuecomment-988456995
> - [3] [Klasstable prototype](https://github.com/tstuefe/lilliput/tree/lilliput-with-Klass-indirection-table)
> - [4] [Uneven alignment prototype](https://github.com/tstuefe/lilliput/tree/lilliput-with-staggered-Klass-alignment)
> - [5] [Klass table prototype, Coleen](https://github.com/openjdk/jdk/pull/19272)
> - [6] [The Microbenchmark](https://github.com/tstuefe/test-hyperaligning-lilliput/blob/master/microbenchmark/the-test/src/main/java/de/stuefe/repros/metaspace/InterleaveKlassRefsInHeap.java)
> - [7] [Graph: G1 microbench, absolute GC pauses](https://raw.githubusercontent.com/tstuefe/test-hyperaligning-lilliput/e55025622ec3574b0252ff82d860e63603a9f7df/microbenchmark/archived/results-2024-05-14T15-36-27-CEST/G1GC-abs-pauses.svg)
> - [8] [Graph: ParallelGC microbench, absolute GC pauses](https://raw.githubusercontent.com/tstuefe/test-hyperaligning-lilliput/e55025622ec3574b0252ff82d860e63603a9f7df/microbenchmark/archived/results-2024-05-14T15-36-27-CEST/ParallelGC-abs-pauses.svg)
> - [9] [Graph: SerialGC microbench, absolute GC pauses](https://raw.githubusercontent.com/tstuefe/test-hyperaligning-lilliput/e55025622ec3574b0252ff82d860e63603a9f7df/microbenchmark/archived/results-2024-05-14T15-36-27-CEST/SerialGC-abs-pauses.svg)
> - [10] [Graph: G1 microbench, absolute GC pauses, alternate klasstable prototype](https://raw.githubusercontent.com/tstuefe/test-hyperaligning-lilliput/6fb9d64fff930e16154c4b3fab9a24169b021de8/microbenchmark/archived/coleen-kptable-results-2024-05-21T11-14-31-CEST/g1gc-pauses-absolute.svg)
> - [11] [Graph: G1 microbench, L1 misses, alternate klasstable prototype](https://raw.githubusercontent.com/tstuefe/test-hyperaligning-lilliput/6fb9d64fff930e16154c4b3fab9a24169b021de8/microbenchmark/archived/coleen-kptable-results-2024-05-21T11-14-31-CEST/g1gc-l1misses-absolute.svg)
> - [12] https://github.com/tstuefe/test-hyperaligning-lilliput/tree/master

From aboldtch at openjdk.org Wed Jun 5 14:35:29 2024
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Wed, 5 Jun 2024 14:35:29 GMT
Subject: [master] RFR: Lilliput om world
In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com>
References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com>
Message-ID: <7FOTHrgSG-CRy-qEJ9NtEP04OKvGiXLSTOYjPiH7nsw=.09e5537b-fc79-4932-8407-a3c931f89b60@github.com>

On Fri, 31 May 2024 20:46:25 GMT, Coleen Phillimore wrote:

> Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline.
>
> Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path).
>
> Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java.
>
> tier1 aarch64 in progress.

I did a more thorough review after the rebase. There are a couple of things I found. Because some changes are in other files I created a review patch below. I opened a PR to your fork if you want to merge it in and edit it.
https://github.com/coleenp/lilliput/pull/1

-------------

PR Comment: https://git.openjdk.org/lilliput/pull/181#issuecomment-2150194284

From coleen.phillimore at oracle.com Wed Jun 12 13:28:28 2024
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 12 Jun 2024 09:28:28 -0400
Subject: OM World and Lilliput planning
In-Reply-To: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com>
References: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com>
Message-ID: <284e7bda-2e37-4452-9e7c-3c484181fd21@oracle.com>

On 6/5/24 3:19 AM, Stefan Karlsson wrote:
> Hi Coleen,
>
> Thanks for moving the "OM World" towards completion. I have one
> comment below:
>
> On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote:
>>
>> Hi, This is what I wrote up after an internal discussion. I am about
>> to file some RFEs/CSRs (or maybe will next week). Let me know what
>> you think.
>>
>> Thanks,
>> Coleen
>>
>> What we call OM World is saving the ObjectMonitor in a
>> ConcurrentHashTable rather than in the markWord of the Java Object.
>> Lilliput absolutely requires this, since for Lilliput the Klass
>> pointer is also in the markWord, and to get to the Klass pointer for a
>> locked object, the code would have to go to the displaced header in
>> unboundedly racy situations.
>>
>> Without Lilliput, this is also helpful in that it frees up markWord
>> bits for concurrent GCs or Valhalla to use. Because of this, and
>> because of the high level of testing this type of change requires,
>> we'd like to push this change to mainline ahead of the Lilliput work.
>>
>> OM World is built on top of Lightweight locking, which it
>> requires (Lightweight locking doesn't save the stack location in the
>> markWord as Legacy locking does). To reduce the maintenance burden and
>> potentially tricky interactions between new features and Legacy
>> locking, we'd like to deprecate Legacy locking in JDK 24.
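
The idea of keeping the ObjectMonitor in a side table instead of in the markWord can be sketched as follows. This is a deliberately simplified illustration, not the Lilliput implementation: `Object`, `ObjectMonitor` and `MonitorTable` are hypothetical stand-ins, a mutex-guarded `std::unordered_map` takes the place of HotSpot's ConcurrentHashTable, and the sketch ignores the fact that a GC may move objects (so keying by raw address is an assumption that only holds in this toy setting).

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

struct Object {};                            // stand-in for a Java object
struct ObjectMonitor { int owner_id = 0; };  // stand-in, not HotSpot's class

// Hypothetical side table: object address -> monitor. The object header
// itself stores nothing about the monitor, which leaves its bits free.
class MonitorTable {
 public:
  // Returns the monitor for obj, creating it on first use ("inflation").
  ObjectMonitor* get_or_create(const Object* obj) {
    std::lock_guard<std::mutex> guard(_lock);
    std::unique_ptr<ObjectMonitor>& slot = _map[obj];
    if (!slot) {
      slot = std::make_unique<ObjectMonitor>();
    }
    return slot.get();
  }

  // Returns the monitor for obj, or nullptr if it was never inflated.
  ObjectMonitor* lookup(const Object* obj) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _map.find(obj);
    return it == _map.end() ? nullptr : it->second.get();
  }

 private:
  std::mutex _lock;
  std::unordered_map<const Object*, std::unique_ptr<ObjectMonitor>> _map;
};
```

Because the header never gets overwritten by a monitor pointer in this scheme, anything else stored there (such as the Klass pointer under Lilliput) stays readable while the object is locked.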
>>
>> Deprecating Legacy locking then makes the flag LockingMode not make
>> any sense, as one of its three enumeration values will be missing. Also,
>> to introduce OM World on top of Lightweight locking, it would be good to
>> have that on a diagnostic flag in case of customer performance
>> issues. It doesn't make sense to have a new locking mode for OM
>> World, since it shares 80% of its code with Lightweight locking.
>>
>> Therefore I (with input from Axel and Stefan) propose the following
>> for JDK 24:
>>
>> 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR
>> 2. Deprecate LockingMode=LM_LEGACY
>> 3. Deprecate the flag LockingMode. It's a new flag, legacy code
>> won't miss it.
>> 4. When OM World is ready to integrate, introduce a new diagnostic
>> flag UseObjectMonitorTable
>>       - Start default off
>>       - Make it default on midway through JDK 24 if no problems.
>
> What is the benefit of starting with this turned off and then a few
> weeks later making it default? I think we'll get better functional
> test coverage if it is enabled by default. We had a very similar
> situation when Lightweight locking was turned off by default and many
> bugs weren't found until it was turned on by default.

The reason to start with -XX:-UseObjectMonitorTable (off) is to get test coverage on that path too, and to have time to work out any performance issues that we might have before turning it on.

Also, later, there is a step 5: remove UseObjectMonitorTable. It's a diagnostic flag and super-temporary, so removing it won't affect applications.

Thanks,
Coleen

>
> Thanks,
> StefanK
>
>>
>> JDK 25:
>>
>> 1. Obsolete Legacy locking mode (removes the code - TBD)
>> 2. Obsolete LockingMode flag
>> 3. We can hold onto UseObjectMonitorTable for a while (off turns off
>> Lilliput UseCompactObjectHeaders).
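
The flag dependency in the last item (switching the table off also switches compact object headers off) could be expressed roughly like this. It is only a sketch of the stated rule: `LockingFlags` and `reconcile` are hypothetical names, and this is not HotSpot's argument-processing code.

```cpp
#include <cassert>

// Hypothetical flag bundle; the member names mirror the flags in this
// thread, but the struct itself is an assumption made for illustration.
struct LockingFlags {
  bool UseObjectMonitorTable;
  bool UseCompactObjectHeaders;
};

// Compact object headers need the monitor table, so turning the table off
// forces compact headers off as well. All other combinations are unchanged.
inline LockingFlags reconcile(LockingFlags flags) {
  if (!flags.UseObjectMonitorTable) {
    flags.UseCompactObjectHeaders = false;
  }
  return flags;
}
```

Note that this resolves the conflict in one direction only; resolving it the other way (compact headers on forces the table on) is equally possible and is a policy choice rather than a technical necessity.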
>> > From coleenp at openjdk.org Wed Jun 12 13:41:14 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jun 2024 13:41:14 GMT Subject: [master] RFR: Lilliput om world [v2] In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: > Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. > > Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). > > Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. > > tier1 aarch64 in progress. Coleen Phillimore has updated the pull request incrementally with seven additional commits since the last revision: - Merge pull request #1 from xmas92/lilliput-om-world Review Comments - Remove OM_OFFSET_NO_MONITOR_VALUE_TAG changes - Remove OMUseC2Cache - aarch64: Make C2 unchanged when !UseObjectMonitorTable - x86: Make C2 unchanged when !UseObjectMonitorTable - Fix OM_OFFSET_NO_MONITOR_VALUE_TAG - Review Comments ------------- Changes: - all: https://git.openjdk.org/lilliput/pull/181/files - new: https://git.openjdk.org/lilliput/pull/181/files/64e54ea4..3e74a698 Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=01 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=00-01 Stats: 275 lines in 13 files changed: 96 ins; 76 del; 103 mod Patch: https://git.openjdk.org/lilliput/pull/181.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/181/head:pull/181 PR: https://git.openjdk.org/lilliput/pull/181 From rkennke at openjdk.org Wed Jun 12 14:47:36 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 12 Jun 2024 14:47:36 GMT Subject: [master] RFR: Prepare for smaller-than-22-bit class pointers [v2] In-Reply-To: 
References: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com>
Message-ID:

On Tue, 30 Apr 2024 09:00:48 GMT, Thomas Stuefe wrote:

>> This PR prepares for using arbitrary klass pointer sizes (e.g. 16 bits). It cleans up a few places and corrects comments.
>>
>> The changes in detail:
>>
>> - exposes a new function `CompressedKlassPointers::max_encoding_range_size()` that returns the maximum possible size of the encoding range given the current nKlass geometry (e.g. 16 bit klass pointers with a max. shift of 10 bits can encode 64MB of class space).
>>
>> - In Metaspace::ergo_initialize(), where we ergo-adjust the CompressedClassSpaceSize, the maximum possible encoding range size flows into this adjustment now. We also print clearer warnings in case the user specifies CCS size explicitly, and we override that decision.
>>
>> - removed any hard-wiredness of a "max class space size/encoding range size of 4GB" since with smaller geometries that does not hold true anymore. Instead, we now use `CompressedKlassPointers::max_encoding_range_size()`.
>>
>> - made the requirements on klass_alignment_in_bytes clearer when setting up class space
>>
>> - removed remnant code (TinyClassPointerShift) left over from development
>>
>>
>> Note: One still unsolved problem, unsolved in Lilliput as well as upstream, is to correctly limit the compressed class space size in the presence of CDS. Upstream "solves" this by capping CCS size at 3GB, which leaves 1GB for CDS archives. There is no solution if CDS would ever exceed this limit, and it's a waste of space for CCS.
>>
>> In Lilliput, if we limit the class pointer size such that we drastically reduce the klass encoding range size, we need to be better at splitting that klass encoding range between CDS and class space. For example, we could map CDS and then use the remaining space completely for class space.
But that would require more serious reshuffling for initialization code, and CDS setup is horrendously complex. >> >> For this patch, if one wants to reduce class pointer size, one may have to disable CDS to run. >> >> Tested: Mac m1, fastdebug, with 32, 22 and 16 bit class pointers. GHAs in process. > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into preparation-for-arbitrary-cp-sizes > - Fix behavior when running with a very small MaxMetaspaceSize > - start I used this change to implement 19-bit-wide class-pointers in my 4-byte-header prototype. Those are the changes on top of this PR that were needed to make that happen: https://github.com/rkennke/lilliput/commit/2f2ffeadfb566b7ab0eea2aa065140011214b90d ------------- PR Comment: https://git.openjdk.org/lilliput/pull/172#issuecomment-2163217005 From rkennke at openjdk.org Wed Jun 12 14:53:39 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 12 Jun 2024 14:53:39 GMT Subject: [master] RFR: Prepare for smaller-than-22-bit class pointers [v2] In-Reply-To: References: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> Message-ID: On Tue, 30 Apr 2024 09:00:48 GMT, Thomas Stuefe wrote: >> This PR prepares using arbitrary klass pointer sizes (e.g. 16). It cleans up a few places and corrects comments. >> >> The changes in detail: >> >> - exposes a new function `CompressedKlassPointers::max_encoding_range_size()` that returns the maximum possible size of the encoding range given the current nKlass geometry (e.g. 16 bit klass pointers with a max. shift of 10 bits can encode 64MB of class space). 
>>
>> - In Metaspace::ergo_initialize(), where we ergo-adjust the CompressedClassSpaceSize, the maximum possible encoding range size flows into this adjustment now. We also print clearer warnings in case the user specifies CCS size explicitly, and we override that decision.
>>
>> - removed any hard-wiredness of a "max class space size/encoding range size of 4GB" since with smaller geometries that does not hold true anymore. Instead, we now use `CompressedKlassPointers::max_encoding_range_size()`.
>>
>> - made the requirements on klass_alignment_in_bytes clearer when setting up class space
>>
>> - removed remnant code (TinyClassPointerShift) left over from development
>>
>>
>> Note: One still unsolved problem, unsolved in Lilliput as well as upstream, is to correctly limit the compressed class space size in the presence of CDS. Upstream "solves" this by capping CCS size at 3GB, which leaves 1GB for CDS archives. There is no solution if CDS would ever exceed this limit, and it's a waste of space for CCS.
>>
>> In Lilliput, if we limit the class pointer size such that we drastically reduce the klass encoding range size, we need to be better at splitting that klass encoding range between CDS and class space. For example, we could map CDS and then use the remaining space completely for class space. But that would require more serious reshuffling for initialization code, and CDS setup is horrendously complex.
>>
>> For this patch, if one wants to reduce class pointer size, one may have to disable CDS to run.
>>
>> Tested: Mac M1, fastdebug, with 32, 22 and 16 bit class pointers. GHAs in progress.
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into preparation-for-arbitrary-cp-sizes > - Fix behavior when running with a very small MaxMetaspaceSize > - start The changes look good to me. ------------- Marked as reviewed by rkennke (Lead). PR Review: https://git.openjdk.org/lilliput/pull/172#pullrequestreview-2113292612 From rkennke at openjdk.org Wed Jun 12 14:57:38 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 12 Jun 2024 14:57:38 GMT Subject: [master] RFR: OMWorld: reenable all platforms In-Reply-To: References: Message-ID: On Thu, 23 May 2024 06:26:20 GMT, Axel Boldt-Christmas wrote: > This reenables all platforms to use OMWorld, and by extension UseCompactObjectHeaders. > > This change simply calls the runtime if a lock is inflated, until port support for OMWorld cache lookup is added. > > ARM (32-bit) required no changes as it already always called the runtime when a monitor is inflated. Hi Axel, this mostly looks good, I have just one question. src/hotspot/share/runtime/lightweightSynchronizer.cpp line 1045: > 1043: const markWord mark = obj->mark(); > 1044: > 1045: if (mark.is_unlocked()) { Was that not needed before? Or why is this added now? ------------- PR Review: https://git.openjdk.org/lilliput/pull/174#pullrequestreview-2113300917 PR Review Comment: https://git.openjdk.org/lilliput/pull/174#discussion_r1636631525 From coleenp at openjdk.org Wed Jun 12 15:53:38 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jun 2024 15:53:38 GMT Subject: [master] RFR: Lilliput om world [v3] In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: > Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. 
> > Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). > > Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. > > tier1 aarch64 in progress. Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge branch 'master' into lilliput-om-world - Merge pull request #1 from xmas92/lilliput-om-world Review Comments - Remove OM_OFFSET_NO_MONITOR_VALUE_TAG changes - Remove OMUseC2Cache - aarch64: Make C2 unchanged when !UseObjectMonitorTable - x86: Make C2 unchanged when !UseObjectMonitorTable - Fix OM_OFFSET_NO_MONITOR_VALUE_TAG - Review Comments - Add UseObjectMonitorTable to disable the table to allow performance work. ------------- Changes: https://git.openjdk.org/lilliput/pull/181/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=02 Stats: 870 lines in 29 files changed: 418 ins; 128 del; 324 mod Patch: https://git.openjdk.org/lilliput/pull/181.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/181/head:pull/181 PR: https://git.openjdk.org/lilliput/pull/181 From coleenp at openjdk.org Wed Jun 12 18:57:30 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jun 2024 18:57:30 GMT Subject: [master] RFR: OMWorld: reenable all platforms In-Reply-To: References: Message-ID: On Wed, 12 Jun 2024 14:53:54 GMT, Roman Kennke wrote: >> This reenables all platforms to use OMWorld, and by extension UseCompactObjectHeaders. >> >> This change simply calls the runtime if a lock is inflated, until port support for OMWorld cache lookup is added. >> >> ARM (32-bit) required no changes as it already always called the runtime when a monitor is inflated. 
>
> src/hotspot/share/runtime/lightweightSynchronizer.cpp line 1045:
>
>> 1043: const markWord mark = obj->mark();
>> 1044:
>> 1045: if (mark.is_unlocked()) {
>
> Was that not needed before? Or why is this added now?

Also wondering this. We come here if the object is locked via the slow path, but can it become unlocked only on the platforms that don't check the ObjectMonitor and go to the slow path?

-------------

PR Review Comment: https://git.openjdk.org/lilliput/pull/174#discussion_r1636943499

From coleenp at openjdk.org Wed Jun 12 19:08:42 2024
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Wed, 12 Jun 2024 19:08:42 GMT
Subject: [master] RFR: OMWorld: reenable all platforms
In-Reply-To:
References:
Message-ID: <8uZlZnJwKjVyXr4aJIQwhSv1J65SjU2MjuDpOqGD0CU=.d1ddf7df-3cf4-4ac1-94bb-b232f1078041@github.com>

On Wed, 12 Jun 2024 14:54:21 GMT, Roman Kennke wrote:

>> This reenables all platforms to use OMWorld, and by extension UseCompactObjectHeaders.
>>
>> This change simply calls the runtime if a lock is inflated, until port support for OMWorld cache lookup is added.
>>
>> ARM (32-bit) required no changes as it already always called the runtime when a monitor is inflated.
>
> Hi Axel, this mostly looks good, I have just one question.

The changes for this PR are also in this PR: https://github.com/openjdk/lilliput/pull/181/ @rkennke can you have a look at this? Thanks.
-------------

PR Comment: https://git.openjdk.org/lilliput/pull/174#issuecomment-2163720423

From aboldtch at openjdk.org Wed Jun 12 19:17:35 2024
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Wed, 12 Jun 2024 19:17:35 GMT
Subject: [master] RFR: OMWorld: reenable all platforms
In-Reply-To:
References:
Message-ID: <5GgIwXXLWaLGBZjzEE4S91UT0hVRNmYxM9p3u1lsQVE=.5c5eed69-6ec3-4e0e-925a-52890c22e86e@github.com>

On Wed, 12 Jun 2024 18:54:47 GMT, Coleen Phillimore wrote:

>> src/hotspot/share/runtime/lightweightSynchronizer.cpp line 1045:
>>
>>> 1043: const markWord mark = obj->mark();
>>> 1044:
>>> 1045: if (mark.is_unlocked()) {
>>
>> Was that not needed before? Or why is this added now?
>
> Also wondering this. We come here if the object is locked via the slow path, but can it become unlocked only on the platforms that don't check the ObjectMonitor and go to the slow path?

While it is possible that someone has unlocked or deflated the monitor by the time we get here, this was really intended as a band-aid for `x86_32`, which now skips fast locking due to the lack of a thread register in some paths. I think the whole `quick_enter` should be reevaluated for the case where we get here from C2. @coleenp has been running some experiments where it is removed completely. I can see a solution where C1 and C2 have two different entry points, where C1 does the `quick_enter` (and only fast_locking on `x86_32`) while the C2 entry point skips it.
As it only gets here in a true slow path (except for `arm32`) ------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/174#discussion_r1636963527 From coleenp at openjdk.org Wed Jun 12 19:24:52 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jun 2024 19:24:52 GMT Subject: [master] RFR: Lilliput om world [v4] In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: <_cCzflL1KpQ2hKpIYP9f8BKW9x1pQB_N_F-PMsYTcx0=.8326c70e-7c8f-427c-a27c-43e92380ba46@github.com> > Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. > > Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). > > Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. > > tier1 aarch64 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix command line processing. 
When COH is on, turn on ObjectTable (ignoring what one said about UseObjectMonitorTable) ------------- Changes: - all: https://git.openjdk.org/lilliput/pull/181/files - new: https://git.openjdk.org/lilliput/pull/181/files/0111020e..43ef424c Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=03 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=02-03 Stats: 11 lines in 1 file changed: 8 ins; 1 del; 2 mod Patch: https://git.openjdk.org/lilliput/pull/181.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/181/head:pull/181 PR: https://git.openjdk.org/lilliput/pull/181 From coleenp at openjdk.org Wed Jun 12 20:23:55 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jun 2024 20:23:55 GMT Subject: [master] RFR: Lilliput om world [v5] In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: > Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. > > Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). > > Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. > > tier1 aarch64 in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Allow for hardcoding UseCompactObjectHeaders to on. 
------------- Changes: - all: https://git.openjdk.org/lilliput/pull/181/files - new: https://git.openjdk.org/lilliput/pull/181/files/43ef424c..4f90107d Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=04 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=181&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/lilliput/pull/181.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/181/head:pull/181 PR: https://git.openjdk.org/lilliput/pull/181 From aboldtch at openjdk.org Thu Jun 13 12:14:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 13 Jun 2024 12:14:29 GMT Subject: [master] RFR: Lilliput om world [v5] In-Reply-To: References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: On Wed, 12 Jun 2024 20:23:55 GMT, Coleen Phillimore wrote: >> Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. >> >> Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). >> >> Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. >> >> tier1 aarch64 in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Allow for hardcoding UseCompactObjectHeaders to on. lgtm. ------------- Marked as reviewed by aboldtch (Committer). 
PR Review: https://git.openjdk.org/lilliput/pull/181#pullrequestreview-2115608315 From coleenp at openjdk.org Thu Jun 13 12:36:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 13 Jun 2024 12:36:34 GMT Subject: [master] RFR: Lilliput om world [v5] In-Reply-To: References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: On Wed, 12 Jun 2024 20:23:55 GMT, Coleen Phillimore wrote: >> Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. >> >> Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). >> >> Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. >> >> tier1 aarch64 in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Allow for hardcoding UseCompactObjectHeaders to on. Thank you Axel. ------------- PR Comment: https://git.openjdk.org/lilliput/pull/181#issuecomment-2165528482 From coleenp at openjdk.org Thu Jun 13 12:36:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 13 Jun 2024 12:36:34 GMT Subject: [master] Integrated: Lilliput om world In-Reply-To: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> References: <7iRNMVcxpiDLgFgPvTORaHxTbG-39pJ7MhC0e1wKXWo=.ad4e5d9c-f4a5-4699-ab49-2646baa84cd6@github.com> Message-ID: On Fri, 31 May 2024 20:46:25 GMT, Coleen Phillimore wrote: > Added a diagnostic option UseObjectMonitorTable to maintain performance of LM_LIGHTWEIGHT locking which is now defaulted to on in mainline. > > Incorporated Axel's patch to support other platforms with UseObjectMonitorTable (branch to slow path). > > Tested tier 1-4 on x86. Fails both versions of this test, but not locally: runtime/cds/TestDefaultArchiveLoading.java. 
> > tier1 aarch64 in progress. This pull request has now been integrated. Changeset: 90c4946f Author: Coleen Phillimore URL: https://git.openjdk.org/lilliput/commit/90c4946fb05ac4aa01fac78f24cdf4070679d74e Stats: 877 lines in 29 files changed: 425 ins; 128 del; 324 mod Lilliput om world Reviewed-by: aboldtch ------------- PR: https://git.openjdk.org/lilliput/pull/181 From aboldtch at openjdk.org Thu Jun 13 12:40:36 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 13 Jun 2024 12:40:36 GMT Subject: [master] RFR: OMWorld: reenable all platforms In-Reply-To: References: Message-ID: On Thu, 23 May 2024 06:26:20 GMT, Axel Boldt-Christmas wrote: > This reenables all platforms to use OMWorld, and by extension UseCompactObjectHeaders. > > This change simply calls the runtime if a lock is inflated, until port support for OMWorld cache lookup is added. > > ARM (32-bit) required no changes as it already always called the runtime when a monitor is inflated. Closing this as it was merged as a part of #181 ------------- PR Comment: https://git.openjdk.org/lilliput/pull/174#issuecomment-2165537812 From aboldtch at openjdk.org Thu Jun 13 12:40:36 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 13 Jun 2024 12:40:36 GMT Subject: [master] Withdrawn: OMWorld: reenable all platforms In-Reply-To: References: Message-ID: On Thu, 23 May 2024 06:26:20 GMT, Axel Boldt-Christmas wrote: > This reenables all platforms to use OMWorld, and by extension UseCompactObjectHeaders. > > This change simply calls the runtime if a lock is inflated, until port support for OMWorld cache lookup is added. > > ARM (32-bit) required no changes as it already always called the runtime when a monitor is inflated. This pull request has been closed without being integrated. 
-------------

PR: https://git.openjdk.org/lilliput/pull/174

From coleen.phillimore at oracle.com Mon Jun 17 13:21:17 2024
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 17 Jun 2024 09:21:17 -0400
Subject: OM World and Lilliput planning
In-Reply-To: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com>
References: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com>
Message-ID: <0f8686af-53a6-4baf-8af3-12d9b3836e68@oracle.com>

Hi Roman, Thomas and Aleksey,

I filed an RFE for the first part of this, https://bugs.openjdk.org/browse/JDK-8334299, and will be working on a CSR. I feel like we're backtracking a bit from what we did in JDK 22/23, but this seems better to me. Comments?

Thanks,
Coleen

On 6/5/24 3:19 AM, Stefan Karlsson wrote:
> Hi Coleen,
>
> Thanks for moving the "OM World" towards completion. I have one
> comment below:
>
> On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote:
>>
>> Hi, This is what I wrote up after an internal discussion. I am about
>> to file some RFEs/CSRs (or maybe will next week). Let me know what
>> you think.
>>
>> Thanks,
>> Coleen
>>
>> What we call OM World is saving the ObjectMonitor in a
>> ConcurrentHashTable rather than in the markWord of the Java Object.
>> Lilliput absolutely requires this, since for Lilliput the Klass
>> pointer is also in the markWord, and to get to the Klass pointer for a
>> locked object, the code would have to go to the displaced header in
>> unboundedly racy situations.
>>
>> Without Lilliput, this is also helpful in that it frees up markWord
>> bits for concurrent GCs or Valhalla to use. Because of this, and
>> because of the high level of testing this type of change requires,
>> we'd like to push this change to mainline ahead of the Lilliput work.
>>
>> OM World is built on top of Lightweight locking, which it
>> requires (Lightweight locking doesn't save the stack location in the
>> markWord as Legacy locking does).
To reduce the maintenance burden and >> potential tricky interactions between new features and Legacy >> locking, we'd like to deprecate Legacy locking in JDK 24. >> >> Deprecating Legacy locking then makes the flag LockingMode not make >> any sense, as one of three enumerations will be missing. Also, to >> introduce OM World on top of Lightweight locking, it would be good to >> have that on a diagnostic flag in case of customer performance >> issues.? It doesn't make sense to have a new locking mode for OM >> World, since it shares 80% code with Lightweight locking. >> >> Therefore I (with input from Axel and Stefan) propose the following >> for JDK 24: >> >> 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR >> 2. Deprecate LockingMode=LM_LEGACY >> 3. Deprecate the flag LockingMode.? It's a new flag, legacy code >> won't miss it. >> 4. When OM World is ready to integrate, introduce a new diagnostic >> flag UseObjectMonitorTable >> ????? - Start default off >> ????? - Make it default on midway through JDK 24 if no problems. > > What is the benefit of starting with this turned off and then a few > weeks later making it default? I think we'll get better functional > test coverage if it is enabled by default. We had a very similar > situation when Lightweight locking was turned off by default and many > bugs weren't found until it was turned on by default. > > Thanks, > StefanK > >> >> JDK 25: >> >> 1. Obsolete Legacy locking mode (removes the code - TBD) >> 2. Obsolete LockingMode flag >> 3. We can hold onto UseObjectMonitorTable for a while (off turns off >> Lilliput UseCompactObjectHeaders). 
>> > From coleenp at openjdk.org Mon Jun 17 15:54:43 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 17 Jun 2024 15:54:43 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT Message-ID: The test compiler/uncommontrap/TestDeoptOOM.java was failing with -Xcomp because it was looking for the object monitor in the basicLock which is only the case for -XX:+UseObjectMonitorTable. Testing tier1 locally with table on and off. ------------- Commit messages: - Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT. Changes: https://git.openjdk.org/lilliput/pull/182/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=182&range=00 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/lilliput/pull/182.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/182/head:pull/182 PR: https://git.openjdk.org/lilliput/pull/182 From coleenp at openjdk.org Mon Jun 17 17:13:39 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 17 Jun 2024 17:13:39 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v2] In-Reply-To: References: Message-ID: > The test compiler/uncommontrap/TestDeoptOOM.java was failing with -Xcomp because it was looking for the object monitor in the basicLock which is only the case for -XX:+UseObjectMonitorTable. > > Testing tier1 locally with table on and off. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Missed one in OSR. 
------------- Changes: - all: https://git.openjdk.org/lilliput/pull/182/files - new: https://git.openjdk.org/lilliput/pull/182/files/065c1db1..d6146342 Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=182&range=01 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=182&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/lilliput/pull/182.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/182/head:pull/182 PR: https://git.openjdk.org/lilliput/pull/182 From coleenp at openjdk.org Mon Jun 17 22:02:54 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 17 Jun 2024 22:02:54 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v3] In-Reply-To: References: Message-ID: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> > The test compiler/uncommontrap/TestDeoptOOM.java was failing with -Xcomp because it was looking for the object monitor in the basicLock which is only the case for -XX:+UseObjectMonitorTable. > > Testing tier1 locally with table on and off. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: And another place where we clear_object_monitor_cache if not using the OM table. 
------------- Changes: - all: https://git.openjdk.org/lilliput/pull/182/files - new: https://git.openjdk.org/lilliput/pull/182/files/d6146342..bc38a820 Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=182&range=02 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=182&range=01-02 Stats: 8 lines in 1 file changed: 0 ins; 6 del; 2 mod Patch: https://git.openjdk.org/lilliput/pull/182.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/182/head:pull/182 PR: https://git.openjdk.org/lilliput/pull/182 From thomas.stuefe at gmail.com Tue Jun 18 09:26:44 2024 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 18 Jun 2024 11:26:44 +0200 Subject: OM World and Lilliput planning In-Reply-To: <0f8686af-53a6-4baf-8af3-12d9b3836e68@oracle.com> References: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com> <0f8686af-53a6-4baf-8af3-12d9b3836e68@oracle.com> Message-ID: Hi Coleen, obsoleting the legacy mode feels a bit risky for an LTS. I am, of course, happy if the code can be removed. "and reintroduce (revert) the UseHeavyMonitors option as a develop flag, rather than a diagnostic flag" Why not diagnostic? Cheers, Thomas On Mon, Jun 17, 2024 at 3:21 PM wrote: > > Hi Roman, Thomas and Aleksey, > > I filed an RFE for the first part of this. > https://bugs.openjdk.org/browse/JDK-8334299 and will be working on a > CSR. I feel like we're backtracking a bit from what we did in JDK 22/23 > but this seems better to me. Comments? > > Thanks, > Coleen > > On 6/5/24 3:19 AM, Stefan Karlsson wrote: > > Hi Coleen, > > > > Thanks for moving the "OM World" towards completion. I have one > > comment below: > > > > On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote: > >> > >> Hi, This is what I wrote up after an internal discussion. I am about > >> to file some RFEs/CSRs (or maybe will next week). Let me know what > >> you think. 
> >> > >> Thanks, > >> Coleen > >> > >> What we call OM World is saving the ObjectMonitor in a > >> ConcurrentHashTable rather than in the markWord of the Java Object. > >> Lilliput absolutely requires this since for Lilliput the Klass > >> pointer is also in the markWord and to get to the Klass pointer for a > >> locked object, the code would have to go to the displaced header in > >> unboundedly racy situations. > >> > >> Without Lilliput, this is also helpful in that it frees up markWord > >> bits for concurrent GCs or Valhalla to use. Because of this, and > >> because of the high level of testing this type of change requires, > >> we'd like to push this change to mainline ahead of the Lilliput work. > >> > >> OM World is built on top of Lightweight locking as Lightweight > >> locking is required (doesn't save the stack location in the markWord > >> as does Legacy locking). To reduce the maintenance burden and > >> potential tricky interactions between new features and Legacy > >> locking, we'd like to deprecate Legacy locking in JDK 24. > >> > >> Deprecating Legacy locking then makes the flag LockingMode not make > >> any sense, as one of three enumerations will be missing. Also, to > >> introduce OM World on top of Lightweight locking, it would be good to > >> have that on a diagnostic flag in case of customer performance > >> issues. It doesn't make sense to have a new locking mode for OM > >> World, since it shares 80% code with Lightweight locking. > >> > >> Therefore I (with input from Axel and Stefan) propose the following > >> for JDK 24: > >> > >> 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR > >> 2. Deprecate LockingMode=LM_LEGACY > >> 3. Deprecate the flag LockingMode. It's a new flag, legacy code > >> won't miss it. > >> 4. When OM World is ready to integrate, introduce a new diagnostic > >> flag UseObjectMonitorTable > >> - Start default off > >> - Make it default on midway through JDK 24 if no problems. 
> > > > What is the benefit of starting with this turned off and then a few > > weeks later making it default? I think we'll get better functional > > test coverage if it is enabled by default. We had a very similar > > situation when Lightweight locking was turned off by default and many > > bugs weren't found until it was turned on by default. > > > > Thanks, > > StefanK > > > >> > >> JDK 25: > >> > >> 1. Obsolete Legacy locking mode (removes the code - TBD) > >> 2. Obsolete LockingMode flag > >> 3. We can hold onto UseObjectMonitorTable for a while (off turns off > >> Lilliput UseCompactObjectHeaders). > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuefe at openjdk.org Tue Jun 18 09:36:06 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Jun 2024 09:36:06 GMT Subject: [master] RFR: Prepare for smaller-than-22-bit class pointers [v3] In-Reply-To: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> References: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> Message-ID: > This PR prepares using arbitrary klass pointer sizes (e.g. 16). It cleans up a few places and corrects comments. > > The changes in detail: > > - exposes a new function `CompressedKlassPointers::max_encoding_range_size()` that returns the maximum possible size of the encoding range given the current nKlass geometry (e.g. 16 bit klass pointers with a max. shift of 10 bits can encode 64MB of class space). > > - In Metaspace::ergo_initialize(), where we ergo-adjust the CompressedClassSpaceSize, the maximum possible encoding range size flows into this adjustment now. We also print clearer warnings in case the user specifies CCS size explicitly, and we override that decision. > > - removed any hard-wiredness of a "max class space size/encoding range size of 4GB" since with smaller geometries that does not hold true anymore. 
Instead, we now use `CompressedKlassPointers::max_encoding_range_size()`. > > - made the requirements on klass_alignment_in_bytes clearer when setting up class space > > - removed remnant code (TinyClassPointerShift) left over from development > > > Note: One still unsolved problem (unsolved in Lilliput as well as upstream) is to correctly limit the compressed class space size in the presence of CDS. Upstream "solves" this by capping CCS size at 3GB, which leaves 1GB for CDS archives. There is no solution if CDS would ever exceed this limit, and it's a waste of space for CCS. > > In Lilliput, if we limit the class pointer size such that we drastically reduce the klass encoding range size, we need to be better at splitting that klass encoding range between CDS and class space. For example, we could map CDS and then use the remaining space completely for class space. But that would require more serious reshuffling for initialization code, and CDS setup is horrendously complex. > > For this patch, if one wants to reduce class pointer size, one may have to disable CDS to run. > > Tested: Mac m1, fastdebug, with 32, 22 and 16 bit class pointers. GHAs in process. Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - Fix behavior when running with a very small MaxMetaspaceSize - start ------------- Changes: https://git.openjdk.org/lilliput/pull/172/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=172&range=02 Stats: 91 lines in 4 files changed: 46 ins; 26 del; 19 mod Patch: https://git.openjdk.org/lilliput/pull/172.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/172/head:pull/172 PR: https://git.openjdk.org/lilliput/pull/172 From rkennke at amazon.de Tue Jun 18 09:43:51 2024 From: rkennke at amazon.de (Kennke, Roman) Date: Tue, 18 Jun 2024 09:43:51 +0000 Subject: OM World and Lilliput planning In-Reply-To: References: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com> <0f8686af-53a6-4baf-8af3-12d9b3836e68@oracle.com> Message-ID: The flag was never meant (by me) to be used by end-users. It's been good to have it available for testing and development, and it's served that purpose well. I'm happy to let it go. I'm a bit undecided about keeping the legacy code in 25. On the one hand, it feels safer to have the legacy as fallback for customers, in case any troubles arise. While LW locking and also recursive LW locking will be fairly well tested by then (at least by us), the OMWorld stuff will probably still have a trail of problems then. OTOH, keeping the legacy code around makes maintaining the locking code messy and more difficult, and legacy might even have started to bitrot by then (or not work at all, e.g. if Loom starts to rely on LW/OMWorld stuff). I have similar feelings towards UseHeavyMonitors: on one hand it's nice to be able to turn off the LW locking altogether, but experience shows that this has never really worked that well. For a long time, it would only turn off stack-locking in some paths (e.g. interpreter) but keep it in other (e.g. runtime), so it never really did what people thought it would do. And if we don't carefully maintain and test this, it will bitrot, again. 
This is true whether or not we do it diagnostic or develop, though. I'd probably vote for diagnostic and then write some jtreg tests that verify that it does what it says, or remove it altogether. I also kinda agree with Stefan Karlsson about making OMWorld turned on by default as soon as possible (maybe after giving it some short bake-time off-by-default, to make sure the old stuff still works as expected). Cheers, Roman > On Jun 18, 2024, at 11:26 AM, Thomas Stüfe wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > Hi Coleen, > > obsoleting the legacy mode feels a bit risky for an LTS. I am, of course, happy if the code can be removed. > > "and reintroduce (revert) the UseHeavyMonitors option as a develop flag, rather than a diagnostic flag" > > Why not diagnostic? > > Cheers, Thomas > > > On Mon, Jun 17, 2024 at 3:21 PM wrote: > > Hi Roman, Thomas and Aleksey, > > I filed an RFE for the first part of this. > https://bugs.openjdk.org/browse/JDK-8334299 and will be working on a > CSR. I feel like we're backtracking a bit from what we did in JDK 22/23 > but this seems better to me. Comments? > > Thanks, > Coleen > > On 6/5/24 3:19 AM, Stefan Karlsson wrote: > > Hi Coleen, > > > > Thanks for moving the "OM World" towards completion. I have one > > comment below: > > > > On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote: > >> > >> Hi, This is what I wrote up after an internal discussion. I am about > >> to file some RFEs/CSRs (or maybe will next week). Let me know what > >> you think. > >> > >> Thanks, > >> Coleen > >> > >> What we call OM World is saving the ObjectMonitor in a > >> ConcurrentHashTable rather than in the markWord of the Java Object. 
> >> Lilliput absolutely requires this since for Lilliput the Klass > >> pointer is also in the markWord and to get to the Klass pointer for a > >> locked object, the code would have to go to the displaced header in > >> unboundedly racy situations. > >> > >> Without Lilliput, this is also helpful in that it frees up markWord > >> bits for concurrent GCs or Valhalla to use. Because of this, and > >> because of the high level of testing this type of change requires, > >> we'd like to push this change to mainline ahead of the Lilliput work. > >> > >> OM World is built on top of Lightweight locking as Lightweight > >> locking is required (doesn't save the stack location in the markWord > >> as does Legacy locking). To reduce the maintenance burden and > >> potential tricky interactions between new features and Legacy > >> locking, we'd like to deprecate Legacy locking in JDK 24. > >> > >> Deprecating Legacy locking then makes the flag LockingMode not make > >> any sense, as one of three enumerations will be missing. Also, to > >> introduce OM World on top of Lightweight locking, it would be good to > >> have that on a diagnostic flag in case of customer performance > >> issues. It doesn't make sense to have a new locking mode for OM > >> World, since it shares 80% code with Lightweight locking. > >> > >> Therefore I (with input from Axel and Stefan) propose the following > >> for JDK 24: > >> > >> 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR > >> 2. Deprecate LockingMode=LM_LEGACY > >> 3. Deprecate the flag LockingMode. It's a new flag, legacy code > >> won't miss it. > >> 4. When OM World is ready to integrate, introduce a new diagnostic > >> flag UseObjectMonitorTable > >> - Start default off > >> - Make it default on midway through JDK 24 if no problems. > > > > What is the benefit of starting with this turned off and then a few > > weeks later making it default? 
I think we'll get better functional > > test coverage if it is enabled by default. We had a very similar > > situation when Lightweight locking was turned off by default and many > > bugs weren't found until it was turned on by default. > > > > Thanks, > > StefanK > > > >> > >> JDK 25: > >> > >> 1. Obsolete Legacy locking mode (removes the code - TBD) > >> 2. Obsolete LockingMode flag > >> 3. We can hold onto UseObjectMonitorTable for a while (off turns off > >> Lilliput UseCompactObjectHeaders). > >> > > > Amazon Web Services Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B Sitz: Berlin Ust-ID: DE 365 538 597 From thomas.stuefe at gmail.com Tue Jun 18 12:23:30 2024 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 18 Jun 2024 14:23:30 +0200 Subject: Far classes Message-ID: Hi, Roman reminded me that we don't have a fallback plan yet for running over the number of classes representable with an nKlass (be it 22 bits or smaller). Therefore I would like to air my brain a bit and discuss the current ideas. We want to have "near" and "far" classes. The first N classes, N depending on the bit size of a nKlass, are "near" classes. Klass* for near-class objects are derived directly from the nKlass in their MW. Classes loaded after that would be "far" classes. Where to store Klass* for far-class objects? Probably as part of the Object. But for completeness's sake let's look at alternatives first, even if they sound stupid: 1) An Object-to-Klass* hashmap. We would pay at least 16 bytes per entry, plus some overhead, and lookup gets expensive once the map degenerates. Growing and rehashing would be non-trivial. A GC would have to remove/reinsert mapping entries for every object it moves. 
2) A "shadow-heap" - a mostly uncommitted address range mirroring the heap, containing Klass* at positions where the MW of far-class objects are located in the real heap. Lookup would be fast. The GC would have to update those shadow locations when moving objects. Uncommitting unused shadow heap pages after evacuation is non-trivial. Process vsize can become a problem for very large heaps. Footprint rises steeply with a rising number of far-class objects due to page granularity. Worst case, we double the heap size footprint. Both sound appealing, and would get unfeasible if the number of far-class objects rises. So: 3) Store Klass* in the object: We dedicate one bit in the nKlass for "is-far-class". For far classes, we store the Klass* at the end of the object. Then we encode the offset of the Klass* slot in the remaining nKlass bits. That depends on max. object size. How large does an object get? I found no limit in specs. However, the size of an object depends on its members, and we have an utf-8 CP-entry per member, and the number of CP entries is limited to 2^16. So, an object cannot have more than 65535 members (a bit less, actually). Therefore, I think it cannot be larger than 64k heap words. To encode this, we need 16 bits, and the additional "is-far-class" bit. So, with this technique, we could reduce the nKlass size to 17 bits. The cost would be +8 bytes per far-class object. Only if we store a raw Klass* instead of some form of nKlass. Storing raw Klass* would mean the Klass does not have to live in the class space, and we can stop worrying about class space size. Storing a trailing 32-bit nKlass would mean we have a chance of just filling the alignment gap before the next object, and not pay for size increase at all. We could even get down to 16 bits for the MW-stored nKlass, if we agree on aligning the Klass* slot trailing the object to 16 bytes. In that case, we can encode the Klass* slot offset with 15 bits and have the "is-far-class" as the 16th bit. 
Then, we could extract the nKlass from the MW with a 16-bit move. This would cost us: On average, another four bytes of overhead per far-class object, and a halved value range for near class IDs. Did I make any thinking errors? Overlook something? Does any of this make sense? Thank you for reading, and Cheers, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.stuefe at gmail.com Tue Jun 18 12:26:07 2024 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 18 Jun 2024 14:26:07 +0200 Subject: Far classes In-Reply-To: References: Message-ID: Should have had a final read before sending out the mail. s/Both sound appealing/Both sound unappealing On Tue, Jun 18, 2024 at 2:23 PM Thomas Stüfe wrote: > Hi, > > Roman reminded me that we don't have a fallback plan yet for running over > the number of classes representable with an nKlass (be it 22 bits or > smaller). Therefore I would like to air my brain a bit and discuss the > current ideas. > > We want to have "near" and "far" classes. The first N classes, N depending > on the bit size of a nKlass, are "near" classes. Klass* for near-class > objects are derived directly from the nKlass in their MW. > > Classes loaded after that would be "far" classes. Where to store Klass* > for far-class objects? Probably as part of the Object. But for > completeness's sake let's look at alternatives first, even if they sound > stupid: > > 1) An Object-to-Klass* hashmap. We would pay at least 16 bytes per entry, > plus some overhead, and lookup gets expensive once the map degenerates. > Growing and rehashing would be non-trivial. A GC would have to > remove/reinsert mapping entries for every object it moves. > > 2) A "shadow-heap" - a mostly uncommitted address range mirroring the > heap, containing Klass* at positions where the MW of far-class objects are > located in the real heap. Lookup would be fast. 
The GC would have to update > those shadow locations when moving objects. Uncommitting unused shadow heap > pages after evacuation is non-trivial. Process vsize can become a problem > for very large heaps. Footprint rises steeply with a rising number of > far-class objects due to page granularity. Worst case, we double the heap > size footprint. > > Both sound appealing, and would get unfeasible if the number of far-class > objects rises. > > So: 3) Store Klass* in the object: > > We dedicate one bit in the nKlass for "is-far-class". For far classes, we > store the Klass* at the end of the object. Then we encode the offset of the > Klass* slot in the remaining nKlass bits. > > That depends on max. object size. How large does an object get? I found no > limit in specs. However, the size of an object depends on its members, and > we have an utf-8 CP-entry per member, and the number of CP entries is > limited to 2^16. So, an object cannot have more than 65535 members (a bit > less, actually). Therefore, I think it cannot be larger than 64k heap words. > > To encode this, we need 16 bits, and the additional "is-far-class" bit. > So, with this technique, we could reduce the nKlass size to 17 bits. The > cost would be +8 bytes per far-class object. Only if we store a raw Klass* > instead of some form of nKlass. Storing raw Klass* would mean the Klass > does not have to live in the class space, and we can stop worrying about > class space size. Storing a trailing 32-bit nKlass would mean we have a > chance of just filling the alignment gap before the next object, and not > pay for size increase at all. > > We could even get down to 16 bits for the MW-stored nKlass, if we agree on > aligning the Klass* slot trailing the object to 16 bytes. In that case, we > can encode the Klass* slot offset with 15 bits and have the "is-far-class" > as the 16th bit. Then, we could extract the nKlass from the MW with a > 16-bit move. 
This would cost us: On average, another four bytes of overhead > per far-class object, and a halved value range for near class IDs. > > Did I make any thinking errors? Overlook something? Does any of this make > sense? > > Thank you for reading, and Cheers, > > Thomas > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From coleen.phillimore at oracle.com Tue Jun 18 15:59:37 2024 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 18 Jun 2024 11:59:37 -0400 Subject: [External] : Re: OM World and Lilliput planning In-Reply-To: References: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com> <0f8686af-53a6-4baf-8af3-12d9b3836e68@oracle.com> Message-ID: <94a12db2-0258-4859-ba77-caf3cd02c276@oracle.com> Hi Roman and Thomas, Thank you for your answers. On 6/18/24 5:43 AM, Kennke, Roman wrote: > The flag was never meant (by me) to be used by end-users. It's been good to have it available for testing and development, and it's served that purpose well. I'm happy to let it go. Great. > > I'm a bit undecided about keeping the legacy code in 25. On the one hand, it feels safer to have the legacy as fallback for customers, in case any troubles arise. While LW locking and also recursive LW locking will be fairly well tested by then (at least by us), the OMWorld stuff will probably still have a trail of problems then. OTOH, keeping the legacy code around makes maintaining the locking code messy and more difficult, and legacy might even have started to bitrot by then (or not work at all, e.g. if Loom starts to rely on LW/OMWorld stuff). If some customer needs legacy locking going forward, it's because they have older code that suffers some large performance loss with lightweight locking, and more with the ObjectMonitor table (OM world). It might be severe enough to give them the switch to LM_LEGACY. We haven't gotten that sort of feedback but it doesn't mean it won't exist. 
For this reason, I think we should keep the Legacy code in JDK 25, since it's an LTS and such a customer may only upgrade to LTS releases, and then remove it in 26. Maintaining it won't be great in JDK 25, but the only backports we'd do from releases going forward would be bug fixes, so hopefully that code wouldn't bit rot badly there. It does complicate the code, a lot, especially with the UseObjectMonitorTable option. If Legacy is on, loom will pin synchronized code. > > I have similar feelings towards UseHeavyMonitors: on one hand it's nice to be able to turn off the LW locking altogether, but experience shows that this has never really worked that well. For a long time, it would only turn off stack-locking in some paths (e.g. interpreter) but keep it in other (e.g. runtime), so it never really did what people thought it would do. And if we don't carefully maintain and test this, it will bitrot, again. This is true whether or not we do it diagnostic or develop, though. I'd probably vote for diagnostic and then write some jtreg tests that verify that it does what it says, or remove it altogether. It's a good idea to add some jtreg tests. There are 4 of them now, and I ran tier1 with it turned on, and had one expected failure. You and Thomas voted diagnostic, so I'll make it that, so it's available in production if we need to do some testing with it. Both diagnostic and develop flags can be removed anytime, so we could plan to remove it also in JDK 26. That would be nice. > > I also kinda agree with Stefan Karlsson about making OMWorld turned on by default as soon as possible (maybe after giving it some short bake-time off-by-default, to make sure the old stuff still works as expected). The rationale for keeping OMWorld off by default behind the UseObjectMonitorTable option is that I ran some performance tests, and there are results that might not be acceptable, and we need time to sort them out or find some way of mitigating the regressions. 
Our old friend, DaCapo xalan is a lot worse. Also DaCapo spring-large, Renaissance-ScalaKmeans (really bad). Apart from these the results aren't significantly worse, but performance testing always eats up a lot of time, which is why we'd want OMWorld off to start. Thanks, Coleen > > Cheers, > Roman > > >> On Jun 18, 2024, at 11:26 AM, Thomas Stüfe wrote: >> >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. >> >> Hi Coleen, >> >> obsoleting the legacy mode feels a bit risky for an LTS. I am, of course, happy if the code can be removed. >> >> "and reintroduce (revert) the UseHeavyMonitors option as a develop flag, rather than a diagnostic flag" >> >> Why not diagnostic? >> >> Cheers, Thomas >> >> >> On Mon, Jun 17, 2024 at 3:21 PM wrote: >> >> Hi Roman, Thomas and Aleksey, >> >> I filed an RFE for the first part of this. >> https://bugs.openjdk.org/browse/JDK-8334299 and will be working on a >> CSR. I feel like we're backtracking a bit from what we did in JDK 22/23 >> but this seems better to me. Comments? >> >> Thanks, >> Coleen >> >> On 6/5/24 3:19 AM, Stefan Karlsson wrote: >>> Hi Coleen, >>> >>> Thanks for moving the "OM World" towards completion. I have one >>> comment below: >>> >>> On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote: >>>> Hi, This is what I wrote up after an internal discussion. I am about >>>> to file some RFEs/CSRs (or maybe will next week). Let me know what >>>> you think. >>>> >>>> Thanks, >>>> Coleen >>>> >>>> What we call OM World is saving the ObjectMonitor in a >>>> ConcurrentHashTable rather than in the markWord of the Java Object. >>>> Lilliput absolutely requires this since for Lilliput the Klass >>>> pointer is also in the markWord and to get to the Klass pointer for a >>>> locked object, the code would have to go to the displaced header in >>>> unboundedly racy situations. 
>>>> >>>> Without Lilliput, this is also helpful in that it frees up markWord >>>> bits for concurrent GCs or Valhalla to use. Because of this, and >>>> because of the high level of testing this type of change requires, >>>> we'd like to push this change to mainline ahead of the Lilliput work. >>>> >>>> OM World is built on top of Lightweight locking as Lightweight >>>> locking is required (doesn't save the stack location in the markWord >>>> as does Legacy locking). To reduce the maintenance burden and >>>> potential tricky interactions between new features and Legacy >>>> locking, we'd like to deprecate Legacy locking in JDK 24. >>>> >>>> Deprecating Legacy locking then makes the flag LockingMode not make >>>> any sense, as one of three enumerations will be missing. Also, to >>>> introduce OM World on top of Lightweight locking, it would be good to >>>> have that on a diagnostic flag in case of customer performance >>>> issues. It doesn't make sense to have a new locking mode for OM >>>> World, since it shares 80% code with Lightweight locking. >>>> >>>> Therefore I (with input from Axel and Stefan) propose the following >>>> for JDK 24: >>>> >>>> 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR >>>> 2. Deprecate LockingMode=LM_LEGACY >>>> 3. Deprecate the flag LockingMode. It's a new flag, legacy code >>>> won't miss it. >>>> 4. When OM World is ready to integrate, introduce a new diagnostic >>>> flag UseObjectMonitorTable >>>> - Start default off >>>> - Make it default on midway through JDK 24 if no problems. >>> What is the benefit of starting with this turned off and then a few >>> weeks later making it default? I think we'll get better functional >>> test coverage if it is enabled by default. We had a very similar >>> situation when Lightweight locking was turned off by default and many >>> bugs weren't found until it was turned on by default. >>> >>> Thanks, >>> StefanK >>> >>>> JDK 25: >>>> >>>> 1. 
Obsolete Legacy locking mode (removes the code - TBD) >>>> 2. Obsolete LockingMode flag >>>> 3. We can hold onto UseObjectMonitorTable for a while (off turns off >>>> Lilliput UseCompactObjectHeaders). >>>> > > > > Amazon Web Services Development Center Germany GmbH > Krausenstr. 38 > 10117 Berlin > Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss > Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B > Sitz: Berlin > Ust-ID: DE 365 538 597 From thomas.schatzl at oracle.com Tue Jun 18 16:25:24 2024 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 18 Jun 2024 18:25:24 +0200 Subject: Far classes In-Reply-To: References: Message-ID: <35d72b48-2eeb-4220-be11-837a9c19d209@oracle.com> On 18.06.24 14:23, Thomas Stüfe wrote: > Hi, > [...] > > So: 3) Store Klass* in the object: > > We dedicate one bit in the nKlass for "is-far-class". For far classes, > we store the Klass* at the end of the object. Then we encode the offset > of the Klass* slot in the remaining nKlass bits. > > That depends on max. object size. How large does an object get? I found > no limit in specs. However, the size of an object depends on its > members, and we have an utf-8 CP-entry per member, and the number of CP > entries is limited to 2^16. So, an object cannot have more than 65535 > members (a bit less, actually). Therefore, I think it cannot be larger > than 64k heap words. A child class of a class with 64k members can have 64k members again afair. Not sure if there is a limit on the inheritance level. I remember some tests constructing such huge objects for testing some GC algorithms. So j.l.O. instances can be of "arbitrary" size afaik. 
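To make the size concern concrete, here is a minimal sketch of the "far class" encoding from option 3. It assumes a hypothetical 22-bit nKlass field whose top bit is the is-far-class flag; the constants and function names are illustrative only and do not come from the Lilliput sources.

```c++
#include <cstdint>
#include <optional>

// Assumption for illustration: an nKlass field of 22 bits. The top bit marks
// a "far class"; the remaining 21 bits hold the offset (in heap words) of the
// Klass* slot stored at the end of the object.
constexpr unsigned kNKlassBits = 22;
constexpr uint32_t kFarBit     = 1u << (kNKlassBits - 1);
constexpr uint32_t kOffsetMask = kFarBit - 1;

// Returns the encoded nKlass, or std::nullopt when the offset does not fit
// into the remaining bits (the object is too large for this encoding).
constexpr std::optional<uint32_t> encode_far(uint64_t klass_slot_offset_words) {
  if (klass_slot_offset_words > kOffsetMask) {
    return std::nullopt;
  }
  return kFarBit | static_cast<uint32_t>(klass_slot_offset_words);
}

constexpr bool is_far(uint32_t nklass) {
  return (nklass & kFarBit) != 0;
}

constexpr uint32_t far_offset_words(uint32_t nklass) {
  return nklass & kOffsetMask;
}
```

Under these assumptions a 64k-word object still fits, but the encoding fails as soon as an inheritance chain pushes the Klass* slot past 2^21 - 1 words, which is the case the reply above worries about.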
Hth, Thomas From coleen.phillimore at oracle.com Tue Jun 18 21:49:05 2024 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 18 Jun 2024 17:49:05 -0400 Subject: [External] : Re: OM World and Lilliput planning In-Reply-To: <94a12db2-0258-4859-ba77-caf3cd02c276@oracle.com> References: <45b10ea1-41ed-463c-862f-fd44175d8e49@oracle.com> <0f8686af-53a6-4baf-8af3-12d9b3836e68@oracle.com> <94a12db2-0258-4859-ba77-caf3cd02c276@oracle.com> Message-ID: <1893faee-ac42-413f-9870-69e2ba6551c5@oracle.com> Now I need a reviewer. https://bugs.openjdk.org/browse/JDK-8334496 I don't have to mention UseObjectMonitorTable and UseHeavyMonitors because they're going to be diagnostic flags. thanks, Coleen On 6/18/24 11:59 AM, coleen.phillimore at oracle.com wrote: > > Hi Roman and Thomas, Thank you for your answers. > > On 6/18/24 5:43 AM, Kennke, Roman wrote: >> The flag was never meant (by me) to be used by end-users. It's been >> good to have it available for testing and development, and it's >> served that purpose well. I'm happy to let it go. > > Great. >> >> I'm a bit undecided about keeping the legacy code in 25. On the one >> hand, it feels safer to have the legacy as fallback for customers, in >> case any troubles arise. While LW locking and also recursive LW >> locking will be fairly well tested by then (at least by us), the >> OMWorld stuff will probably still have a trail of problems then. >> OTOH, keeping the legacy code around makes maintaining the locking >> code messy and more difficult, and legacy might even have started to >> bitrot by then (or not work at all, e.g. if Loom starts to rely on >> LW/OMWorld stuff). > > If some customer needs legacy locking going forward, it's because > they have older code that suffers some large performance loss with > lightweight locking, and more with the ObjectMonitor table (OM > world). It might be severe enough to give them the switch to > LM_LEGACY. 
We haven't gotten that sort of feedback but it doesn't > mean it won't exist. For this reason, I think we should keep the > Legacy code in JDK 25, since it's an LTS and such a customer may only > upgrade to LTS releases, and then remove it in 26. Maintaining it > won't be great in JDK 25, but the only backports we'd do from releases > going forward would be bug fixes, so hopefully that code wouldn't bit > rot badly there. It does complicate the code, a lot, especially with > the UseObjectMonitorTable option. > > If Legacy is on, loom will pin synchronized code. > >> >> I have similar feelings towards UseHeavyMonitors: on one hand it's >> nice to be able to turn off the LW locking altogether, but experience >> shows that this has never really worked that well. For a long time, >> it would only turn off stack-locking in some paths (e.g. interpreter) >> but keep it in other (e.g. runtime), so it never really did what >> people thought it would do. And if we don't carefully maintain and >> test this, it will bitrot, again. This is true whether or not we do >> it diagnostic or develop, though. I'd probably vote for diagnostic >> and then write some jtreg tests that verify that it does what it >> says, or remove it altogether. > > It's a good idea to add some jtreg tests. There are 4 of them now, > and I ran tier1 with it turned on, and had one expected failure. You > and Thomas voted diagnostic, so I'll make it that, so it's available > in production if we need to do some testing with it. Both diagnostic > and develop flags can be removed anytime, so we could plan to remove > it also in JDK 26. That would be nice. >> >> I also kinda agree with Stefan Karlsson about making OMWorld turned >> on by default as soon as possible (maybe after giving it some short >> bake-time off-by-default, to make sure the old stuff still works as >> expected). 
> > The rationale for turning OMWorld off by UseObjectMonitorTable option > is because I ran some performance tests, and there are results that > might not be acceptable, and we need time to sort them out or find > some way of mitigating the regressions. Our old friend, DaCapo xalan > is a lot worse. Also Dacapo spring-large, Renaissance-ScalaKmeans > (really bad). Apart from these the results aren't significantly worse, > but performance testing always eats up a lot of time, which is why > we'd want OMWorld off to start. > > Thanks, > Coleen > >> >> Cheers, >> Roman >> >> >>> On Jun 18, 2024, at 11:26 AM, Thomas Stüfe >>> wrote: >>> >>> CAUTION: This email originated from outside of the organization. Do >>> not click links or open attachments unless you can confirm the >>> sender and know the content is safe. >>> >>> Hi Coleen, >>> >>> obsoleting the legacy mode feels a bit risky for an LTS. I am, of >>> course, happy if the code can be removed. >>> >>> "and reintroduce (revert) the UseHeavyMonitors option as a develop >>> flag, rather than a diagnostic flag" >>> >>> Why not diagnostic? >>> >>> Cheers, Thomas >>> >>> >>> On Mon, Jun 17, 2024 at 3:21 PM wrote: >>> >>> Hi Roman, Thomas and Aleksey, >>> >>> I filed an RFE for the first part of this. >>> https://bugs.openjdk.org/browse/JDK-8334299 and will be working on a >>> CSR. I feel like we're backtracking a bit from what we did in JDK >>> 22/23 >>> but this seems better to me. Comments? >>> >>> Thanks, >>> Coleen >>> >>> On 6/5/24 3:19 AM, Stefan Karlsson wrote: >>>> Hi Coleen, >>>> >>>> Thanks for moving the "OM World" towards completion. I have one >>>> comment below: >>>> >>>> On 2024-06-04 22:52, coleen.phillimore at oracle.com wrote: >>>>> Hi, This is what I wrote up after an internal discussion. I am about >>>>> to file some RFEs/CSRs (or maybe will next week). Let me know what >>>>> you think. 
>>>>> >>>>> Thanks, >>>>> Coleen >>>>> >>>>> What we call OM World is saving the ObjectMonitor in a >>>>> ConcurrentHashTable rather than in the markWord of the Java Object. >>>>> Lilliput absolutely requires this since for Lilliput the Klass >>>>> pointer is also in the markWord and to get to the Klass pointer for a >>>>> locked object, the code would have to go to the displaced header in >>>>> unboundedly racy situations. >>>>> >>>>> Without Lilliput, this is also helpful in that it frees up markWord >>>>> bits for concurrent GCs or Valhalla to use. Because of this, and >>>>> because of the high level of testing this type of change requires, >>>>> we'd like to push this change to mainline ahead of the Lilliput work. >>>>> >>>>> OM World is built on top of Lightweight locking as Lightweight >>>>> locking is required (doesn't save the stack location in the markWord >>>>> as does Legacy locking). To reduce the maintenance burden and >>>>> potential tricky interactions between new features and Legacy >>>>> locking, we'd like to deprecate Legacy locking in JDK 24. >>>>> >>>>> Deprecating Legacy locking then makes the flag LockingMode not make >>>>> any sense, as one of three enumerations will be missing. Also, to >>>>> introduce OM World on top of Lightweight locking, it would be good to >>>>> have that on a diagnostic flag in case of customer performance >>>>> issues. It doesn't make sense to have a new locking mode for OM >>>>> World, since it shares 80% code with Lightweight locking. >>>>> >>>>> Therefore I (with input from Axel and Stefan) propose the following >>>>> for JDK 24: >>>>> >>>>> 1. Reintroduce the flag UseHeavyMonitors for LockingMode=LM_MONITOR >>>>> 2. Deprecate LockingMode=LM_LEGACY >>>>> 3. Deprecate the flag LockingMode. It's a new flag, legacy code >>>>> won't miss it. >>>>> 4. When OM World is ready to integrate, introduce a new diagnostic >>>>> flag UseObjectMonitorTable >>>>> - Start default off >>>>> 
- Make it default on midway through JDK 24 if no problems. >>>> What is the benefit of starting with this turned off and then a few >>>> weeks later making it default? I think we'll get better functional >>>> test coverage if it is enabled by default. We had a very similar >>>> situation when Lightweight locking was turned off by default and many >>>> bugs weren't found until it was turned on by default. >>>> >>>> Thanks, >>>> StefanK >>>> >>>>> JDK 25: >>>>> >>>>> 1. Obsolete Legacy locking mode (removes the code - TBD) >>>>> 2. Obsolete LockingMode flag >>>>> 3. We can hold onto UseObjectMonitorTable for a while (off turns off >>>>> Lilliput UseCompactObjectHeaders). >>>>> >> >> >> >> Amazon Web Services Development Center Germany GmbH >> Krausenstr. 38 >> 10117 Berlin >> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss >> Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B >> Sitz: Berlin >> Ust-ID: DE 365 538 597 > From aboldtch at openjdk.org Wed Jun 19 06:23:26 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Wed, 19 Jun 2024 06:23:26 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v3] In-Reply-To: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> References: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> Message-ID: On Mon, 17 Jun 2024 22:02:54 GMT, Coleen Phillimore wrote: >> The test compiler/uncommontrap/TestDeoptOOM.java was failing with -Xcomp because it was looking for the object monitor in the basicLock which is only the case for -XX:+UseObjectMonitorTable. >> >> Testing tier1 locally with table on and off. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > And another place where we clear_object_monitor_cache if not using the OM table. 
I think we should try to keep it invariant that only with `LM_LEGACY` do you call `*displaced_header` member functions on BasicLocks.

src/hotspot/share/runtime/basicLock.cpp line 34:

> 32: void BasicLock::print_on(outputStream* st, oop owner) const {
> 33:   st->print("monitor");
> 34:   if (UseObjectMonitorTable) {

This was a pre-existing bug. It should be:
```c++
  if (UseObjectMonitorTable) {
    ObjectMonitor* mon = object_monitor_cache();
    if (mon != nullptr) {
      mon->print_on(st);
    }
  } else if (LockingMode == LM_LEGACY) {
```
I guess no testing ever printed a BasicLock with LM_MONITOR.

src/hotspot/share/runtime/basicLock.inline.hpp line 38:

> 36:   assert(!UseObjectMonitorTable, "must be");
> 37:   Atomic::store(&_metadata, header.value());
> 38: }

These should stay exclusive to `LM_LEGACY`.

src/hotspot/share/runtime/deoptimization.cpp line 1657:

> 1655:               mon_info->lock()->set_bad_metadata_deopt();
> 1656:             }
> 1657: #endif

Along the same lines keep `set_displaced_header` exclusive to `LM_LEGACY`
```c++
            if (LockingMode == LM_LEGACY) {
              mon_info->lock()->set_displaced_header(markWord::unused_mark());
            } else if (UseObjectMonitorTable) {
              mon_info->lock()->clear_object_monitor_cache();
            }
#ifdef ASSERT
            else {
              assert(LockingMode == LM_MONITOR || !UseObjectMonitorTable, "must be");
              mon_info->lock()->set_bad_metadata_deopt();
            }
#endif
```

-------------

Changes requested by aboldtch (Committer).
PR Review: https://git.openjdk.org/lilliput/pull/182#pullrequestreview-2127071351 PR Review Comment: https://git.openjdk.org/lilliput/pull/182#discussion_r1645469948 PR Review Comment: https://git.openjdk.org/lilliput/pull/182#discussion_r1645425556 PR Review Comment: https://git.openjdk.org/lilliput/pull/182#discussion_r1645429078 From stuefe at openjdk.org Wed Jun 19 08:22:37 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 19 Jun 2024 08:22:37 GMT Subject: [master] RFR: Prepare for smaller-than-22-bit class pointers [v2] In-Reply-To: References: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> Message-ID: On Wed, 12 Jun 2024 14:45:16 GMT, Roman Kennke wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. > > I used this change to implement 19-bit-wide class-pointers in my 4-byte-header prototype. Those are the changes on top of this PR that were needed to make that happen: > https://github.com/rkennke/lilliput/commit/2f2ffeadfb566b7ab0eea2aa065140011214b90d Thanks @rkennke ! ------------- PR Comment: https://git.openjdk.org/lilliput/pull/172#issuecomment-2178055283 From stuefe at openjdk.org Wed Jun 19 08:22:37 2024 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 19 Jun 2024 08:22:37 GMT Subject: [master] Integrated: Prepare for smaller-than-22-bit class pointers In-Reply-To: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> References: <7yND0YMWK_w99G29XAIZAGWvzM_hi5K3Mr8fBx2ZOeY=.da05e2bc-77ff-4fee-9830-d2f6c04f737c@github.com> Message-ID: On Sat, 27 Apr 2024 07:43:22 GMT, Thomas Stuefe wrote: > This PR prepares using arbitrary klass pointer sizes (e.g. 16). It cleans up a few places and corrects comments. 
> > The changes in detail: > > - exposes a new function `CompressedKlassPointers::max_encoding_range_size()` that returns the maximum possible size of the encoding range given the current nKlass geometry (e.g. 16 bit klass pointers with a max. shift of 10 bits can encode 64MB of class space). > > - In Metaspace::ergo_initialize(), where we ergo-adjust the CompressedClassSpaceSize, the maximum possible encoding range size flows into this adjustment now. We also print clearer warnings in case the user specifies CCS size explicitly, and we override that decision. > > - removed any hard-wiredness of a "max class space size/encoding range size of 4GB" since with smaller geometries that does not hold true anymore. Instead, we now use `CompressedKlassPointers::max_encoding_range_size()`. > > - made the requirements on klass_alignment_in_bytes clearer when setting up class space > > - removed remnant code (TinyClassPointerShift) left over from development > > > Note: One still unsolved problem (unsolved in Lilliput as well as upstream) is to correctly limit the compressed class space size in the presence of CDS. Upstream "solves" this by capping CCS size at 3GB, which leaves 1GB for CDS archives. There is no solution if CDS would ever exceed this limit, and it's a waste of space for CCS. > > In Lilliput, if we limit the class pointer size such that we drastically reduce the klass encoding range size, we need to be better at splitting that klass encoding range between CDS and class space. For example, we could map CDS and then use the remaining space completely for class space. But that would require more serious reshuffling for initialization code, and CDS setup is horrendously complex. > > For this patch, if one wants to reduce class pointer size, one may have to disable CDS to run. > > Tested: Mac m1, fastdebug, with 32, 22 and 16 bit class pointers. GHAs in process. This pull request has now been integrated. 
Changeset: 3875ec2b Author: Thomas Stuefe URL: https://git.openjdk.org/lilliput/commit/3875ec2b51fc3c74451ad11287e76070cdbfc31d Stats: 91 lines in 4 files changed: 46 ins; 26 del; 19 mod Prepare for smaller-than-22-bit class pointers Reviewed-by: rkennke ------------- PR: https://git.openjdk.org/lilliput/pull/172 From aboldtch at openjdk.org Thu Jun 20 12:02:53 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 20 Jun 2024 12:02:53 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups Message-ID: Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). @coleenp I incorporated #182 into this. ------------- Commit messages: - Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT - More Misc Cleanup - Misc Cleanups - Remove inflation counters - Cleanup cache hit rate counters - Cleanup MacroAssembler - Inline Flags Changes: https://git.openjdk.org/lilliput/pull/183/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=183&range=00 Stats: 237 lines in 29 files changed: 7 ins; 155 del; 75 mod Patch: https://git.openjdk.org/lilliput/pull/183.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/183/head:pull/183 PR: https://git.openjdk.org/lilliput/pull/183 From aboldtch at openjdk.org Thu Jun 20 12:10:27 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 20 Jun 2024 12:10:27 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups In-Reply-To: References: Message-ID: On Thu, 20 Jun 2024 11:58:16 GMT, Axel Boldt-Christmas wrote: > Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). > > @coleenp I incorporated #182 into this. I will do a rebase on JDK-24 next week after getting these in. Already did one earlier this week. But I felt like it was better to get this OMWorld cleanups in. 
(I believe @coleenp is about done with the initial pre-review of OMWorld.) But one note on the next rebase: as parallel full GC has been changed to work more like G1, it will also require adaptation to use the Alt Full GC forwarding. E.g. from my earlier rebase ae4715b7d7ce6297d939b4eba5525581765a7b95 ------------- PR Comment: https://git.openjdk.org/lilliput/pull/183#issuecomment-2180511993 From coleenp at openjdk.org Thu Jun 20 12:21:31 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 12:21:31 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v3] In-Reply-To: References: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> Message-ID: On Wed, 19 Jun 2024 05:31:11 GMT, Axel Boldt-Christmas wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> And another place where we clear_object_monitor_cache if not using the OM table. > > src/hotspot/share/runtime/basicLock.inline.hpp line 38: > >> 36: assert(!UseObjectMonitorTable, "must be"); >> 37: Atomic::store(&_metadata, header.value()); >> 38: } > > These should stay exclusive to `LM_LEGACY`. So I was unsure of this. It's only LM_LEGACY that needs the displaced header? Lightweight locking doesn't use it even with the table? 
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/182#discussion_r1647484012 From coleenp at openjdk.org Thu Jun 20 12:33:27 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 12:33:27 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups In-Reply-To: References: Message-ID: <2cd_KddM2dU6cukIVzWUuvq7A7pI-fepg7hAy49qlbM=.9f0a75f8-cead-4ade-8765-1d675767b02a@github.com> On Thu, 20 Jun 2024 11:58:16 GMT, Axel Boldt-Christmas wrote: > Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). > > @coleenp I incorporated #182 into this. The cleanups look really good. Can you fix the spelling of consistency? src/hotspot/cpu/x86/sharedRuntime_x86.cpp line 69: > 67: __ testptr(result, markWord::monitor_value); > 68: __ jcc(Assembler::notZero, slowCase); > 69: } Did I miss this one? src/hotspot/share/runtime/globals.hpp line 1980: > 1978: "monitors rather than the first word of the object.") \ > 1979: \ > 1980: product(int, LightweightFastLockingSpins, 13, DIAGNOSTIC, \ Good. I like the new name. src/hotspot/share/runtime/objectMonitor.cpp line 299: > 297: } > 298: > 299: #define assert_mark_word_concistency() \ While you're here. The spelling is consistency. src/hotspot/share/utilities/vmError.cpp line 595: > 593: LockingMode == LM_MONITOR ? ", lm_monitors" : > 594: LockingMode == LM_LEGACY ? ", lm_legacy" : > 595: LockingMode == LM_LIGHTWEIGHT ? ", lm_lightweight" : "", Without this, I guess the only way to know which locking mode is on is via. the command line in the hs_err file. It didn't really belong here, but maybe we want to find somewhere else for it. Not with this change. It's good to clean this up for this change. ------------- Marked as reviewed by coleenp (Committer). 
PR Review: https://git.openjdk.org/lilliput/pull/183#pullrequestreview-2130286349 PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647498292 PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647497161 PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647495754 PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647494230 From aboldtch at openjdk.org Thu Jun 20 12:35:26 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 20 Jun 2024 12:35:26 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v3] In-Reply-To: References: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> Message-ID: On Thu, 20 Jun 2024 12:18:46 GMT, Coleen Phillimore wrote: >> src/hotspot/share/runtime/basicLock.inline.hpp line 38: >> >>> 36: assert(!UseObjectMonitorTable, "must be"); >>> 37: Atomic::store(&_metadata, header.value()); >>> 38: } >> >> These should stay exclusive to `LM_LEGACY`. > > So I was unsure of this. It's only LM_LEGACY that needs the displaced header? Lightweight locking doesn't use it even without the table? Correct, one of the main features with `LM_LIGHTWEIGHT` is that it does not need to displace the header when fast locking. It only displaces it when doing inflated locking, but then it is displaced in the ObjectMonitor*, not on the thread stack. (And with the table we also avoid this second displacement) `LM_LEGACY` uses this stack slot for two things. If it is 0 it means that it was recursively fast locked. Otherwise if it is the address which the markWord points to it is the displaced header which should be restored when unlocking. And when doing inflated locking a non-zero value is written so it does not look like a recursive fast lock. 
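The three meanings of the `LM_LEGACY` stack slot described above can be modelled with a small single-threaded sketch. This is an illustration only, not HotSpot code: the types `Slot` and `Object`, the function names, and the marker value are all hypothetical, and the real implementation uses atomic compare-and-swap on the mark word.

```c++
#include <cstdint>

struct Slot   { uintptr_t displaced; };  // stand-in for the BasicLock stack slot
struct Object { uintptr_t mark; };       // stand-in for the object's mark word

enum class UnlockAction { RecursiveExit, RestoreHeader, InflatedExit };

// Hypothetical non-zero placeholder written for inflated locks, so the slot
// is not mistaken for a recursive fast lock.
constexpr uintptr_t kInflatedMarker = 3;

// First fast lock: save the header in the slot, point the mark word at the slot.
void legacy_fast_lock(Object& obj, Slot& slot) {
  slot.displaced = obj.mark;
  obj.mark = reinterpret_cast<uintptr_t>(&slot);
}

// Recursive fast lock: the slot holds 0.
void legacy_recursive_lock(Slot& slot) { slot.displaced = 0; }

// Inflated lock: the slot holds some non-zero value.
void legacy_inflated_lock(Slot& slot) { slot.displaced = kInflatedMarker; }

UnlockAction legacy_unlock(Object& obj, Slot& slot) {
  if (slot.displaced == 0) {
    return UnlockAction::RecursiveExit;      // nothing to restore
  }
  if (obj.mark == reinterpret_cast<uintptr_t>(&slot)) {
    obj.mark = slot.displaced;               // restore the displaced header
    return UnlockAction::RestoreHeader;
  }
  return UnlockAction::InflatedExit;         // the monitor handles the exit
}
```

With `LM_LIGHTWEIGHT` none of this bookkeeping lives in the stack slot, which is why the `*displaced_header` accessors can stay exclusive to `LM_LEGACY`.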
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/182#discussion_r1647501570 From aboldtch at openjdk.org Thu Jun 20 12:45:28 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 20 Jun 2024 12:45:28 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups In-Reply-To: <2cd_KddM2dU6cukIVzWUuvq7A7pI-fepg7hAy49qlbM=.9f0a75f8-cead-4ade-8765-1d675767b02a@github.com> References: <2cd_KddM2dU6cukIVzWUuvq7A7pI-fepg7hAy49qlbM=.9f0a75f8-cead-4ade-8765-1d675767b02a@github.com> Message-ID: On Thu, 20 Jun 2024 12:26:56 GMT, Coleen Phillimore wrote: >> Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). >> >> @coleenp I incorporated #182 into this. > > src/hotspot/share/utilities/vmError.cpp line 595: > >> 593: LockingMode == LM_MONITOR ? ", lm_monitors" : >> 594: LockingMode == LM_LEGACY ? ", lm_legacy" : >> 595: LockingMode == LM_LIGHTWEIGHT ? ", lm_lightweight" : "", > > Without this, I guess the only way to know which locking mode is on is via. the command line in the hs_err file. It didn't really belong here, but maybe we want to find somewhere else for it. Not with this change. It's good to clean this up for this change. We print flags at the end which are non-default. But we do not know if someone changed them in a build. Should probably fix this somehow. Also printing all flag values, not just non-default ones would help. 
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647515283 From coleenp at openjdk.org Thu Jun 20 12:47:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 12:47:34 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v3] In-Reply-To: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> References: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> Message-ID: On Mon, 17 Jun 2024 22:02:54 GMT, Coleen Phillimore wrote: >> The test compiler/uncommontrap/TestDeoptOOM.java was failing with -Xcomp because it was looking for the object monitor in the basicLock which is only the case for -XX:+UseObjectMonitorTable. >> >> Testing tier1 locally with table on and off. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > And another place where we clear_object_monitor_cache if not using the OM table. Closing this PR. ------------- PR Comment: https://git.openjdk.org/lilliput/pull/182#issuecomment-2180579770 From coleenp at openjdk.org Thu Jun 20 12:47:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 12:47:34 GMT Subject: [master] RFR: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT [v3] In-Reply-To: References: <6m4OpybeASEkuH2dy77LesZFrdu-RDiJOLF5tpaje7o=.b346535c-4e5a-4556-8b0e-b6f82d8f4910@github.com> Message-ID: On Thu, 20 Jun 2024 12:32:47 GMT, Axel Boldt-Christmas wrote: >> So I was unsure of this. It's only LM_LEGACY that needs the displaced header? Lightweight locking doesn't use it even without the table? > > Correct, one of the main features with `LM_LIGHTWEIGHT` is that it does not need to displace the header when fast locking. 
It only displaces it when doing inflated locking, but then it is displaced in the ObjectMonitor*, not on the thread stack. (And with the table we also avoid this second displacement) > > `LM_LEGACY` uses this stack slot for two things. If it is 0 it means that it was recursively fast locked. Otherwise if it is the address which the markWord points to it is the displaced header which should be restored when unlocking. And when doing inflated locking a non-zero value is written so it does not look like a recursive fast lock. I got my displaced headers mixed up between ObjectMonitor and BasicLock. Thanks for the explanation, I don't really know the legacy locking code that well. ------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/182#discussion_r1647516639 From coleenp at openjdk.org Thu Jun 20 12:47:34 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 12:47:34 GMT Subject: [master] Withdrawn: Fix BasicLock to test for UseObjectMonitorTable rather than LM_LIGHTWEIGHT In-Reply-To: References: Message-ID: On Mon, 17 Jun 2024 15:50:41 GMT, Coleen Phillimore wrote: > The test compiler/uncommontrap/TestDeoptOOM.java was failing with -Xcomp because it was looking for the object monitor in the basicLock which is only the case for -XX:+UseObjectMonitorTable. > > Testing tier1 locally with table on and off. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/lilliput/pull/182 From aboldtch at openjdk.org Thu Jun 20 12:49:37 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 20 Jun 2024 12:49:37 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups [v2] In-Reply-To: References: Message-ID: > Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). > > @coleenp I incorporated #182 into this. 
Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: consistency spelling ------------- Changes: - all: https://git.openjdk.org/lilliput/pull/183/files - new: https://git.openjdk.org/lilliput/pull/183/files/734e763a..173f64de Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=183&range=01 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=183&range=00-01 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.org/lilliput/pull/183.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/183/head:pull/183 PR: https://git.openjdk.org/lilliput/pull/183 From aboldtch at openjdk.org Thu Jun 20 12:49:38 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 20 Jun 2024 12:49:38 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups [v2] In-Reply-To: <2cd_KddM2dU6cukIVzWUuvq7A7pI-fepg7hAy49qlbM=.9f0a75f8-cead-4ade-8765-1d675767b02a@github.com> References: <2cd_KddM2dU6cukIVzWUuvq7A7pI-fepg7hAy49qlbM=.9f0a75f8-cead-4ade-8765-1d675767b02a@github.com> Message-ID: On Thu, 20 Jun 2024 12:30:09 GMT, Coleen Phillimore wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: >> >> consistency spelling > > src/hotspot/cpu/x86/sharedRuntime_x86.cpp line 69: > >> 67: __ testptr(result, markWord::monitor_value); >> 68: __ jcc(Assembler::notZero, slowCase); >> 69: } > > Did I miss this one? We did the C2 one. This is the native wrapper one. We both overlooked it. 
------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647519364 From coleenp at openjdk.org Thu Jun 20 13:52:20 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 13:52:20 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups [v2] In-Reply-To: References: <2cd_KddM2dU6cukIVzWUuvq7A7pI-fepg7hAy49qlbM=.9f0a75f8-cead-4ade-8765-1d675767b02a@github.com> Message-ID: On Thu, 20 Jun 2024 12:46:19 GMT, Axel Boldt-Christmas wrote: >> src/hotspot/cpu/x86/sharedRuntime_x86.cpp line 69: >> >>> 67: __ testptr(result, markWord::monitor_value); >>> 68: __ jcc(Assembler::notZero, slowCase); >>> 69: } >> >> Did I miss this one? > > We did the C2 one. This is the native wrapper one. We both overlooked it. I see. We missed an optimization here. ------------- PR Review Comment: https://git.openjdk.org/lilliput/pull/183#discussion_r1647613510 From coleenp at openjdk.org Thu Jun 20 16:34:32 2024 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 20 Jun 2024 16:34:32 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jun 2024 12:49:37 GMT, Axel Boldt-Christmas wrote: >> Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). >> >> @coleenp I incorporated #182 into this. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > consistency spelling Marked as reviewed by coleenp (Committer). Marked as reviewed by coleenp (Committer). 
------------- PR Review: https://git.openjdk.org/lilliput/pull/183#pullrequestreview-2130890459 PR Review: https://git.openjdk.org/lilliput/pull/183#pullrequestreview-2130891450 From aboldtch at openjdk.org Mon Jun 24 06:08:28 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 24 Jun 2024 06:08:28 GMT Subject: [master] RFR: OMWorld: Removal of flags, general cleanups [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jun 2024 12:49:37 GMT, Axel Boldt-Christmas wrote: >> Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). >> >> @coleenp I incorporated #182 into this. > > Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision: > > consistency spelling Thanks for the review. ------------- PR Comment: https://git.openjdk.org/lilliput/pull/183#issuecomment-2185680292 From aboldtch at openjdk.org Mon Jun 24 06:08:29 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 24 Jun 2024 06:08:29 GMT Subject: [master] Integrated: OMWorld: Removal of flags, general cleanups In-Reply-To: References: Message-ID: <48kVaR9sCpIgHt8CQNN2SDhXISY4QFzF6AtIAqZ8XC4=.bf950bae-1839-46a9-9271-f372b9738582@github.com> On Thu, 20 Jun 2024 11:58:16 GMT, Axel Boldt-Christmas wrote: > Cleanups in preparation for opening up a PR to mainline for [JDK-8315884](https://bugs.openjdk.org/browse/JDK-8315884). > > @coleenp I incorporated #182 into this. This pull request has now been integrated. 
Changeset: 13442745 Author: Axel Boldt-Christmas URL: https://git.openjdk.org/lilliput/commit/13442745686ee10d9c25c2b11bb1ea4f7014c487 Stats: 242 lines in 29 files changed: 7 ins; 155 del; 80 mod OMWorld: Removal of flags, general cleanups Reviewed-by: coleenp ------------- PR: https://git.openjdk.org/lilliput/pull/183 From aboldtch at openjdk.org Tue Jun 25 08:33:33 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 25 Jun 2024 08:33:33 GMT Subject: [master] RFR: Lilliput rebase jdk-24+3 Message-ID: Patch queue (#184) rebased on jdk-24+3. Rebase conflicts / fixes (so far): * Rename `HeapRegion` -> `G1HeapRegion` * Added SlidingForwarding to parallel full gc (See 3788dcbd02a5fb879ae91e3e5aa1aba8568362d5) * Disabled [JDK-8320448](https://bugs.openjdk.org/browse/JDK-8320448), will require adapting for CompactObjectHeaders (Created [JDK-8334971](https://bugs.openjdk.org/browse/JDK-8334971)) * Because UseObjectMonitorTable tags all allocations as `mtObjectMonitor` [JDK-8330849](https://bugs.openjdk.org/browse/JDK-8330849) will require some modification. ------------- Commit messages: - Tiny Class-Pointers - 8305895: Implementation: JEP 450: Compact Object Headers (Experimental) - 8305896: Alternative full GC forwarding - 8305898: Alternative self-forwarding mechanism - 8315884: New Object to ObjectMonitor mapping - Lilliput conf changes for jcheck - Merge branch 'lilliput_v6' into lilliput_rebase_target - Tiny Class-Pointers - 8305895: Implementation: JEP 450: Compact Object Headers (Experimental) - 8305896: Alternative full GC forwarding - ... 
and 392 more: https://git.openjdk.org/lilliput/compare/13442745...6fe42912 Changes: https://git.openjdk.org/lilliput/pull/185/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=185&range=00 Stats: 84723 lines in 1918 files changed: 54544 ins; 20945 del; 9234 mod Patch: https://git.openjdk.org/lilliput/pull/185.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/185/head:pull/185 PR: https://git.openjdk.org/lilliput/pull/185 From aboldtch at openjdk.org Tue Jun 25 14:28:03 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 25 Jun 2024 14:28:03 GMT Subject: [master] RFR: Lilliput rebase jdk-24+3 [v2] In-Reply-To: References: Message-ID: > Patch queue (#184) rebased on jdk-24+3. > > Rebase conflicts / fixes (so far): > * Rename `HeapRegion` -> `G1HeapRegion` > * Added SlidingForwarding to parallel full gc (See 3788dcbd02a5fb879ae91e3e5aa1aba8568362d5) > * Disabled [JDK-8320448](https://bugs.openjdk.org/browse/JDK-8320448), will require adapting for CompactObjectHeaders (Created [JDK-8334971](https://bugs.openjdk.org/browse/JDK-8334971)) > * Because UseObjectMonitorTable tags all allocations as `mtObjectMonitor` [JDK-8330849](https://bugs.openjdk.org/browse/JDK-8330849) will require some modification. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/lilliput/pull/185/files - new: https://git.openjdk.org/lilliput/pull/185/files/6fe42912..6fe42912 Webrevs: - full: https://webrevs.openjdk.org/?repo=lilliput&pr=185&range=01 - incr: https://webrevs.openjdk.org/?repo=lilliput&pr=185&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/lilliput/pull/185.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/185/head:pull/185 PR: https://git.openjdk.org/lilliput/pull/185 From rkennke at openjdk.org Tue Jun 25 14:28:04 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 25 Jun 2024 14:28:04 GMT Subject: [master] Withdrawn: Lilliput rebase jdk-24+3 In-Reply-To: References: Message-ID: On Mon, 24 Jun 2024 08:13:01 GMT, Axel Boldt-Christmas wrote: > Patch queue (#184) rebased on jdk-24+3. > > Rebase conflicts / fixes (so far): > * Rename `HeapRegion` -> `G1HeapRegion` > * Added SlidingForwarding to parallel full gc (See 3788dcbd02a5fb879ae91e3e5aa1aba8568362d5) > * Disabled [JDK-8320448](https://bugs.openjdk.org/browse/JDK-8320448), will require adapting for CompactObjectHeaders (Created [JDK-8334971](https://bugs.openjdk.org/browse/JDK-8334971)) > * Because UseObjectMonitorTable tags all allocations as `mtObjectMonitor` [JDK-8330849](https://bugs.openjdk.org/browse/JDK-8330849) will require some modification. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/lilliput/pull/185 From john.r.rose at oracle.com Wed Jun 26 08:54:22 2024 From: john.r.rose at oracle.com (John Rose) Date: Wed, 26 Jun 2024 01:54:22 -0700 Subject: Far classes In-Reply-To: References: Message-ID: <201E239C-6634-46EF-B679-1F02513CA47D@oracle.com> On 18 Jun 2024, at 5:23, Thomas Stüfe wrote: > We dedicate one bit in the nKlass for "is-far-class". For far classes, we > store the Klass* at the end of the object. Then we encode the offset of the > Klass* slot in the remaining nKlass bits.
You could also use a joint encoding [1] on more than one bit, so as to encode more near classes in the same number of bits. What's the trade-off? The bits other than the joint encoding would encode the offset, so the offsets would be shorter. In fact, you don't need long offsets at all; there's no sense in tying the max offset to the max number of near classes, which is what the naive selector bit does: 2^15 near classes AND 2^15 max offset, if you have 16 bits and burn one bit for the far class indicator. Instead, use (say) 6 joint bits out of 16 total, and then you get 2^16-2^10 near classes, and a maximum far class offset of 2^10. [1] https://cr.openjdk.org/~jrose/jvm/joint-bit-encodings.html > That depends on max. object size. How large does an object get? I found no > limit in specs. However, the size of an object depends on its members, and > we have an utf-8 CP-entry per member, and the number of CP entries is > limited to 2^16. So, an object cannot have more than 65535 members (a bit > less, actually). Therefore, I think it cannot be larger than 64k heap words. Objects can get pathologically large because there is no limit to the depth of the superclass chain, and each superclass can contribute tens of thousands of fields. But this should not be understood as a constraint on the size of the nKlass field, or the number of near classes. > … > We could even get down to 16 bits for the MW-stored nKlass, if we agree on > aligning the Klass* slot trailing the object to 16 bytes. In that case, we > can encode the Klass* slot offset with 15 bits and have the "is-far-class" > as the 16th bit. Then, we could extract the nKlass from the MW with a > 16-bit move. This would cost us: On average, another four bytes of overhead > per far-class object, and a halved value range for near class IDs. You are getting closer here to a better design: The key move is to constrain where the far-class Klass* can occur in the object layout.
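A minimal sketch (not from the thread) of the capacity arithmetic above, comparing the naive selector bit with a joint encoding. The names `Capacity`, `selector_bit` and `joint` are hypothetical; only the numbers 2^15, 2^16-2^10 and 2^10 come from the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: how many values each scheme leaves for each use.
struct Capacity {
  uint32_t near_classes;  // number of distinct near class IDs
  uint32_t far_offsets;   // number of distinct far-class slot offsets
};

// Naive scheme: one bit selects near/far, the remaining bits serve both uses,
// so the offset range is tied to the near class range.
constexpr Capacity selector_bit(unsigned total_bits) {
  return { 1u << (total_bits - 1), 1u << (total_bits - 1) };
}

// Joint scheme: only the pattern "top joint_bits are all zero" selects the
// far mode; every other bit pattern is available as a near class ID.
constexpr Capacity joint(unsigned total_bits, unsigned joint_bits) {
  return { (1u << total_bits) - (1u << (total_bits - joint_bits)),
           1u << (total_bits - joint_bits) };
}

// The figures quoted in the message, for a 16-bit nKlass.
static_assert(selector_bit(16).near_classes == (1u << 15), "2^15 near classes");
static_assert(selector_bit(16).far_offsets == (1u << 15), "2^15 max offset");
static_assert(joint(16, 6).near_classes == (1u << 16) - (1u << 10), "2^16-2^10");
static_assert(joint(16, 6).far_offsets == (1u << 10), "2^10 max offset");
```

With 6 joint bits the near class space is almost double that of the selector-bit scheme (64512 versus 32768 IDs), at the price of a much smaller offset range.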
As long as there are enough bits in the header (minus the far-class selector bit or bits), as long as those bits can distinguish all the possible locations of the far class (Klass*) field in the object layout, you are good. So the problem boils down to what is the best way to constrain the location of the Klass* field. Obviously it is aligned word-wise, so it's not just any char offset. More importantly, we can simply demand that it is less than some fixed constant, such as 2^10 words (taking the above example again, the one with 16 nKlass bits and 6 joint encoded far-class selector bits). Can we meet this demand? Yes. The key is to allocate a far class pointer in any class whose layout is large enough to overflow the offset limit. This is done even if the class itself does not need a far class slot. The slot is wasted in that case, but it is just one word out of 2^10, so the max waste is 0.1%. Jumbo classes are super-rare, anyway. That way, if a subclass of the jumbo class ever needs a far class word, there's a spot prepared for it, within the maximum offset. If the class is jumbo and final, there is no need to allocate a far class slot for subclasses. But if it is jumbo and non-final then it will require a far class slot EVEN IF it is lucky enough to acquire a near class ID. The far class slot is for the subclasses that are not so lucky and cannot get a near class ID. They will need that far class slot, and they won't be able to allocate it for themselves. BTW, if the class is abstract there is no need to allocate a near class ID: Only concrete classes need near class IDs. But abstract jumbo classes WILL need far class slots, again for their subclasses that are unlucky, and cannot get a near class ID. For testing make the max offset of the far class word very small, like 10. That way many classes will be burdened with the extra field, and you will get a stress test of the mechanism.
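A rough sketch (not from the thread) of the slot reservation rule described above. The type `ClassShape`, its fields and the function `must_reserve_far_slot` are all hypothetical, not HotSpot identifiers, and "jumbo" is assumed to mean that the instance size in words has reached the maximum far class offset.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of a class at layout time.
struct ClassShape {
  size_t size_words;      // instance size in heap words, inherited fields included
  bool   is_final;        // no subclasses can exist
  bool   is_abstract;     // no direct instances can exist
  bool   has_near_id;     // the class obtained a near class ID
  bool   super_has_slot;  // some superclass already reserved a far class slot
};

// Returns true if this class must reserve a Klass* slot in its own layout.
// Where exactly the slot is placed (below max_offset_words) is a layout
// detail that this sketch leaves out.
bool must_reserve_far_slot(const ClassShape& c, size_t max_offset_words) {
  if (c.super_has_slot) {
    return false;  // the slot is inherited, nothing to reserve
  }
  // A concrete class without a near class ID needs the slot for itself.
  bool needs_for_itself = !c.is_abstract && !c.has_near_id;
  // A jumbo, non-final class reserves the slot for unlucky subclasses,
  // even if it holds a near class ID or is abstract.
  bool jumbo = c.size_words >= max_offset_words;
  bool needs_for_subclasses = jumbo && !c.is_final;
  return needs_for_itself || needs_for_subclasses;
}
```

Passing a very small `max_offset_words` (such as 10) turns most non-final classes into "jumbo" ones, which corresponds to the stress mode suggested in the message.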
Don't just assume that there are enough jumbo classes in the world to test this contraption without a stress mode. The trick of preallocating a far class slot even before you need it allows you to constrain the offset of the far class slot. The other independent trick of using a joint encoding (of the far class selector pattern) allows you to have very small far class offsets, and therefore use almost all of the encoding power of the nKlass in the header to represent near classes, which is as it should be. Continuing the above concrete example, if the 16-bit nKlass has all zero bits in the top 6 bits, that selects the far class mode, while one or more non-zero bits in the top 6 would select the near class, and all 16 bits would encode the ID of that near class.

    Klass* get_klass(uint16_t nKlass) {
      if ((nKlass & (-1<<10)) == 0) {
        // top 6 bits are zero: far class, nKlass is the word offset of the Klass* slot
        return ((Klass**)this)[nKlass];
      } else {
        // near class: the remaining 2^16-2^10 values index the near class table
        return NEAR_CLASSES[nKlass - (1<<10)];
      }
    }

From aph-open at littlepinkcloud.com Wed Jun 26 09:54:06 2024 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Wed, 26 Jun 2024 10:54:06 +0100 Subject: Far classes In-Reply-To: <201E239C-6634-46EF-B679-1F02513CA47D@oracle.com> References: <201E239C-6634-46EF-B679-1F02513CA47D@oracle.com> Message-ID: On 6/26/24 09:54, John Rose wrote: > You could also use a joint encoding [1] on more than one bit, > so as to encode more near classes in the same number of > bits. [1] https://cr.openjdk.org/~jrose/jvm/joint-bit-encodings.html Joint Bit Encodings is a kind of Arithmetic Code, isn't it? We don't have to think in terms of bits in the encoded class at all, just to say that if an encoded value is less than N, it's an offset to the real Klass*. Otherwise, subtract N from the encoded value and use that. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From john.r.rose at oracle.com Wed Jun 26 22:32:49 2024 From: john.r.rose at oracle.com (John Rose) Date: Wed, 26 Jun 2024 15:32:49 -0700 Subject: Far classes In-Reply-To: References: <201E239C-6634-46EF-B679-1F02513CA47D@oracle.com> Message-ID: On 26 Jun 2024, at 2:54, Andrew Haley wrote: > On 6/26/24 09:54, John Rose wrote: >> You could also use a joint encoding [1] on more than one bit, >> so as to encode more near classes in the same number of >> bits. > > [1] https://cr.openjdk.org/~jrose/jvm/joint-bit-encodings.html > > Joint Bit Encodings is a kind of Arithmetic Code, isn't it? We don't > have to think in terms of bits in the encoded class at all, just to > say that if an encoded value is less than N, it's an offset to the > real Klass*. Otherwise, subtract N from the encoded value and use > that. Joint bits are often easier to understand than arithmetic limits, but yes. The concrete example I gave intentionally maps not only to joint bits but also to a range check. And such a range check limit doesn't need to be a specially formatted constant; it can be any parameter. Probably the ISAs we care about don't care whether it is a bit check (on a nice mask) or a range check (on a nice limit). Also, for those keeping score, the code I gave for get_klass can sometimes be optimized as a flow-free conditional move to get the array base (this or &NEAR_CLASSES[-(1<<10)]) and then a slick uniform index operation that works for both cases. From rkennke at openjdk.org Thu Jun 27 11:25:51 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 27 Jun 2024 11:25:51 GMT Subject: [master] RFR: 8335251: [Lilliput] Fix TestRecursiveMonitorChurn failure Message-ID: The test TestRecursiveMonitorChurn currently fails with Lilliput or UseObjectMonitorTable, because the monitor table is also allocated with mtObjectMonitor tag, and the threshold in the test is too low.
The fix is to increase the threshold so that it covers the table, but not so much that we'd get false positives. 100,000 seems to hit that spot nicely. (The memory usage with the table is about 70,000; the failure case is over the 1,000,000 mark.) ------------- Commit messages: - 8335251: [Lilliput] Fix TestRecursiveMonitorChurn failure Changes: https://git.openjdk.org/lilliput/pull/186/files Webrev: https://webrevs.openjdk.org/?repo=lilliput&pr=186&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335251 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/lilliput/pull/186.diff Fetch: git fetch https://git.openjdk.org/lilliput.git pull/186/head:pull/186 PR: https://git.openjdk.org/lilliput/pull/186 From aboldtch at openjdk.org Thu Jun 27 12:17:27 2024 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 27 Jun 2024 12:17:27 GMT Subject: [master] RFR: 8335251: [Lilliput] Fix TestRecursiveMonitorChurn failure In-Reply-To: References: Message-ID: On Thu, 27 Jun 2024 11:21:13 GMT, Roman Kennke wrote: > The test TestRecursiveMonitorChurn currently fails with Lilliput or UseObjectMonitorTable, because the monitor table is also allocated with mtObjectMonitor tag, and the threshold in the test is too low. > > The fix is to increase the threshold so that it covers the table, but not so much that we'd get false positives. 100,000 seems to hit that spot nicely. (The memory usage with the table is about 70,000; the failure case is over the 1,000,000 mark.) Marked as reviewed by aboldtch (Committer). Looks good. I was thinking of maybe rewriting the test as follows for the mainline PR.
5c914d0db185475006e2931b34f1fc7a2d180f67 ------------- PR Review: https://git.openjdk.org/lilliput/pull/186#pullrequestreview-2145193092 PR Comment: https://git.openjdk.org/lilliput/pull/186#issuecomment-2194529808 From amitkumar at openjdk.org Fri Jun 28 07:51:44 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 28 Jun 2024 07:51:44 GMT Subject: [lilliput-jdk21u:lilliput] RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking Message-ID: Hi all, This pull request contains a backport of commit [47df1459](https://github.com/openjdk/jdk/commit/47df14590c003ccb1607ec0edfe999fcf2aebd86) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Amit Kumar on 10 Apr 2024 and was reviewed by Lutz Schmidt and Martin Doerr. Thanks! ------------- Commit messages: - Backport 47df14590c003ccb1607ec0edfe999fcf2aebd86 Changes: https://git.openjdk.org/lilliput-jdk21u/pull/33/files Webrev: https://webrevs.openjdk.org/?repo=lilliput-jdk21u&pr=33&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310513 Stats: 79 lines in 1 file changed: 32 ins; 14 del; 33 mod Patch: https://git.openjdk.org/lilliput-jdk21u/pull/33.diff Fetch: git fetch https://git.openjdk.org/lilliput-jdk21u.git pull/33/head:pull/33 PR: https://git.openjdk.org/lilliput-jdk21u/pull/33 From amitkumar at openjdk.org Fri Jun 28 07:57:39 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 28 Jun 2024 07:57:39 GMT Subject: [lilliput-jdk21u:lilliput] RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 28 Jun 2024 07:46:07 GMT, Amit Kumar wrote: > Hi all, > > This pull request contains a backport of commit [47df1459](https://github.com/openjdk/jdk/commit/47df14590c003ccb1607ec0edfe999fcf2aebd86) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Amit Kumar on 10 Apr 2024 and was reviewed by Lutz Schmidt and Martin Doerr. 
> > Thanks! Almost clean backport, I got conflicts in copyright headers only. ------------- PR Comment: https://git.openjdk.org/lilliput-jdk21u/pull/33#issuecomment-2196348900 From rkennke at openjdk.org Fri Jun 28 09:18:45 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 28 Jun 2024 09:18:45 GMT Subject: [lilliput-jdk21u:lilliput] RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 28 Jun 2024 07:46:07 GMT, Amit Kumar wrote: > Hi all, > > This pull request contains a backport of commit [47df1459](https://github.com/openjdk/jdk/commit/47df14590c003ccb1607ec0edfe999fcf2aebd86) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Amit Kumar on 10 Apr 2024 and was reviewed by Lutz Schmidt and Martin Doerr. > > Thanks! I don't think that this change is Lilliput-specific. It looks like you already backported it to 21u, from where it will get picked-up to lilliput-jdk21u the next time we merge from upstream (which is overdue). *Also* I don't think Lilliput currently works on s390. ------------- PR Comment: https://git.openjdk.org/lilliput-jdk21u/pull/33#issuecomment-2196476774 From amitkumar at openjdk.org Fri Jun 28 09:24:46 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 28 Jun 2024 09:24:46 GMT Subject: [lilliput-jdk21u:lilliput] RFR: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 28 Jun 2024 09:15:28 GMT, Roman Kennke wrote: >It looks like you already backported it to 21u, from where it will get picked-up to lilliput-jdk21u the next time we merge from upstream (which is overdue). Thanks for info :) I'll close it now. 
------------- PR Comment: https://git.openjdk.org/lilliput-jdk21u/pull/33#issuecomment-2196487465 From amitkumar at openjdk.org Fri Jun 28 09:24:47 2024 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 28 Jun 2024 09:24:47 GMT Subject: [lilliput-jdk21u:lilliput] Withdrawn: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking In-Reply-To: References: Message-ID: On Fri, 28 Jun 2024 07:46:07 GMT, Amit Kumar wrote: > Hi all, > > This pull request contains a backport of commit [47df1459](https://github.com/openjdk/jdk/commit/47df14590c003ccb1607ec0edfe999fcf2aebd86) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Amit Kumar on 10 Apr 2024 and was reviewed by Lutz Schmidt and Martin Doerr. > > Thanks! This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/lilliput-jdk21u/pull/33 From rkennke at openjdk.org Fri Jun 28 09:31:46 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 28 Jun 2024 09:31:46 GMT Subject: [master] RFR: 8335251: [Lilliput] Fix TestRecursiveMonitorChurn failure In-Reply-To: References: Message-ID: <0SRuRqA3FD70l9eNAZ0JxGQM7HkSoj1JTtR3rsIoSYo=.bb506810-abcd-4f52-833a-5858c1dc0e6d@github.com> On Thu, 27 Jun 2024 12:15:00 GMT, Axel Boldt-Christmas wrote: > Looks good. Thanks! > I was thinking of maybe rewriting the test as follows for the mainline PR. [5c914d0](https://github.com/openjdk/lilliput/commit/5c914d0db185475006e2931b34f1fc7a2d180f67) That looks even better and more reliable. Want to put it in Lilliput repo instead of my PR? ------------- PR Comment: https://git.openjdk.org/lilliput/pull/186#issuecomment-2196499445 From rkennke at openjdk.org Fri Jun 28 11:44:58 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 28 Jun 2024 11:44:58 GMT Subject: [lilliput-jdk21u:lilliput] RFR: Merge jdk21u:jdk-21.0.4+6 Message-ID: Merging jdk-21.0.4+6 from upstream jdk21u. 
------------- Commit messages: - Merge tag 'jdk-21.0.4+6' into merge-jdk-21.0.4+6 - 8334441: Mark tests in jdk_security_infra group as manual - 8320692: Null icon returned for .exe without custom icon - 8330275: Crash in XMark::follow_array - 8329862: libjli GetApplicationHome cleanups and enhance jli tracing - 8323635: Test gc/g1/TestHumongousAllocConcurrentStart.java fails with -XX:TieredStopAtLevel=3 - 8295111: dpkg appears to have problems resolving symbolically linked native libraries - 8331031: unify os::dont_yield and os::naked_yield across Posix platforms - 8329223: Parallel: Parallel GC resizes heap even if -Xms = -Xmx - 8330464: hserr generic events - add entry for the before_exit calls - ... and 298 more: https://git.openjdk.org/lilliput-jdk21u/compare/20dbbeaf...25ec8454 The webrevs contain the adjustments done while merging with regards to each parent branch: - lilliput: https://webrevs.openjdk.org/?repo=lilliput-jdk21u&pr=34&range=00.0 - jdk21u:jdk-21.0.4+6: https://webrevs.openjdk.org/?repo=lilliput-jdk21u&pr=34&range=00.1 Changes: https://git.openjdk.org/lilliput-jdk21u/pull/34/files Stats: 40020 lines in 1195 files changed: 21985 ins; 10206 del; 7829 mod Patch: https://git.openjdk.org/lilliput-jdk21u/pull/34.diff Fetch: git fetch https://git.openjdk.org/lilliput-jdk21u.git pull/34/head:pull/34 PR: https://git.openjdk.org/lilliput-jdk21u/pull/34