From ayang at openjdk.org Tue May 2 08:01:19 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Tue, 2 May 2023 08:01:19 GMT
Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326
In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
Message-ID:

On Wed, 26 Apr 2023 09:20:46 GMT, Thomas Schatzl wrote:

> Hi all,
>
> please review this refactoring of collection set candidate set handling.
>
> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection.
>
> These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)).
>
> This patch only uses candidates from marking at this time.
>
> Also moves gc efficiency out of HeapRegion and associates it with the list element, as it's not used otherwise.
>
> In detail:
> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time.
>
> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point).
>
> * there are several additional helper sets/lists
>   * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies.
>   * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list.
>
> All these sets implement C++ iterators for simpler use in various places.
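
To illustrate the last point of the quoted description: a candidate list that exposes `begin()`/`end()` can be walked with a range-based `for`. The following is a standalone sketch, not the code from the patch. `CandidateInfo`, `CandidateList`, `total_efficiency` and the stand-in `HeapRegion` struct are hypothetical names, and `std::vector` is used only for brevity (HotSpot uses its own containers).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HotSpot's HeapRegion; only an index is kept here.
struct HeapRegion { unsigned index; };

// A candidate pairs a region with its gc efficiency, so the efficiency
// does not need to be stored in the region itself.
struct CandidateInfo {
  HeapRegion* region;
  double gc_efficiency;
};

// Minimal list exposing C++ iterators (assumed shape, not the real class).
class CandidateList {
  std::vector<CandidateInfo> _candidates;

public:
  using const_iterator = std::vector<CandidateInfo>::const_iterator;

  void append(HeapRegion* r, double gc_efficiency) {
    assert(r != nullptr && "region must not be null");
    _candidates.push_back({r, gc_efficiency});
  }

  size_t length() const { return _candidates.size(); }

  const_iterator begin() const { return _candidates.begin(); }
  const_iterator end() const { return _candidates.end(); }
};

// Callers can iterate without knowing the underlying storage.
inline double total_efficiency(const CandidateList& list) {
  double sum = 0.0;
  for (const CandidateInfo& c : list) {
    sum += c.gc_efficiency;
  }
  return sum;
}
```

Exposing iterators this way keeps call sites independent of whether the list is backed by an array, a linked list, or a merged view over several sources.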
>
> Testing:
> - this patch only: tier1-3, gha
> - with JDK-8140326 tier1-7 (or 8?)
>
> Thanks,
> Thomas

src/hotspot/share/gc/g1/g1CollectionSet.cpp line 328:

> 326: assert(_optional_old_regions.length() == 0, "must be");
> 327:
> 328: if (collector_state()->in_mixed_phase()) {

Why check the same condition again (it is first checked at L322)?

src/hotspot/share/gc/g1/g1CollectionSet.cpp line 329:

> 327:
> 328: if (collector_state()->in_mixed_phase()) {
> 329: time_remaining_ms = _policy->select_candidates_from_marking(&candidates()->marking_regions(),

`time_remaining_ms` seems unused after the assignment.

src/hotspot/share/gc/g1/g1CollectionSetCandidates.cpp line 47:

> 45: }
> 46:
> 47: void G1CollectionCandidateList::append_unsorted(HeapRegion* r) {

Some methods in this file seem never used.

src/hotspot/share/gc/shared/ptrQueue.hpp line 43:

> 41: class BufferNode;
> 42: class PtrQueueSet;
> 43: class PtrQueue : public CHeapObj {

Why is this required?

(Seems to work fine without it when I tried it.)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182204221
PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182204738
PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182212123
PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182203296

From tschatzl at openjdk.org Tue May 2 12:04:18 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 2 May 2023 12:04:18 GMT
Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326
In-Reply-To:
References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
Message-ID:

On Tue, 2 May 2023 07:49:42 GMT, Albert Mingkun Yang wrote:

>> Hi all,
>>
>> please review this refactoring of collection set candidate set handling.
>>
>> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection.
>>
>> These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)).
>>
>> This patch only uses candidates from marking at this time.
>>
>> Also moves gc efficiency out of HeapRegion and associates it with the list element, as it's not used otherwise.
>>
>> In detail:
>> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time.
>>
>> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point).
>>
>> * there are several additional helper sets/lists
>>   * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies.
>>   * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list.
>>
>> All these sets implement C++ iterators for simpler use in various places.
>>
>> Testing:
>> - this patch only: tier1-3, gha
>> - with JDK-8140326 tier1-7 (or 8?)
>>
>> Thanks,
>> Thomas
>
> src/hotspot/share/gc/g1/g1CollectionSet.cpp line 328:
>
>> 326: assert(_optional_old_regions.length() == 0, "must be");
>> 327:
>> 328: if (collector_state()->in_mixed_phase()) {
>
> Why check the same condition again (it is first checked at L322)?

In https://bugs.openjdk.org/browse/JDK-8140326 the first condition will change to something like "are there collection set candidates" and retained regions will be added later. Will remove.

> src/hotspot/share/gc/g1/g1CollectionSet.cpp line 329:
>
>> 327:
>> 328: if (collector_state()->in_mixed_phase()) {
>> 329: time_remaining_ms = _policy->select_candidates_from_marking(&candidates()->marking_regions(),
>
> `time_remaining_ms` seems unused after the assignment.

Same reason as above. Later changes will need/use this. Removed.

> src/hotspot/share/gc/g1/g1CollectionSetCandidates.cpp line 47:
>
>> 45: }
>> 46:
>> 47: void G1CollectionCandidateList::append_unsorted(HeapRegion* r) {
>
> Some methods in this file seem never used.

They are used in https://bugs.openjdk.org/browse/JDK-8140326. I will look through and remove unused ones.

> src/hotspot/share/gc/shared/ptrQueue.hpp line 43:
>
>> 41: class BufferNode;
>> 42: class PtrQueueSet;
>> 43: class PtrQueue : public CHeapObj {
>
> Why is this required?
>
> (Seems to work fine without it when I tried it.)

Required for https://bugs.openjdk.org/browse/JDK-8140326. Will remove.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182448685
PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182449132
PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182451617
PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182452932

From shade at openjdk.org Tue May 2 12:07:29 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 2 May 2023 12:07:29 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v15]
In-Reply-To:
References:
Message-ID:

On Fri, 28 Apr 2023 19:34:53 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored.
Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>>
>> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31: The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Some more changes by @shipilev

I think we are very close.
Another round of review:

src/hotspot/share/gc/shared/gcForwarding.hpp line 39:

> 37:
> 38: public:
> 39: static void initialize(MemRegion heap, size_t region_size_words_shift);

Suggestion:

  static void initialize(MemRegion heap, size_t region_size_words);

src/hotspot/share/gc/shared/genCollectedHeap.cpp line 136:

> 134: GCInitLogger::print();
> 135:
> 136: GCForwarding::initialize(_reserved, SpaceAlignment);

The second argument is not "shift" anymore, right? So this should be the actual reserved space size?

src/hotspot/share/gc/shared/slidingForwarding.cpp line 131:

> 129: assert(val < TABLE_SIZE, "must fit in table: val: " UINT64_FORMAT ", table-size: " UINTX_FORMAT ", table-size-bits: %d",
> 130: val, TABLE_SIZE, log2i_exact(TABLE_SIZE));
> 131: return static_cast(val);

Want to cast first, and _then_ assert, maybe?

src/hotspot/share/gc/shared/slidingForwarding.hpp line 68:

> 66: * ^------------------------------------------- alternate region select
> 67: * ^----------------------------------------- in-region offset
> 68: * ^----------------------- compressed class pointer (not handled, but also *not touched* by this code)

I think we can invert these:

 *  64                      32                      0
 *  [........................|OOOOOOOOOOOOOOO|A|F|TT]
 *                                                ^--- normal lock bits, would record "object is forwarded"
 *                                              ^----- fallback bit (explained below)
 *                                            ^------- alternate region select
 *                            ^----------------------- in-region offset
 *   ^------------------------------------------------ protected area, *not touched* by this code, useful for
 *                                                     compressed class pointer with compact object headers

src/hotspot/share/gc/shared/slidingForwarding.hpp line 93:

> 91: class SlidingForwarding : public CHeapObj {
> 92: private:
> 93: static const uintptr_t MARK_LOWER_HALF_MASK = 0xffffffff;

This is just `right_n_bits(32)`?
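
For reference, the layout discussed in this review (two tag bits, a fallback bit, an alternate-region bit and a 28-bit word offset in the low 32 bits, with the upper 32 bits left alone) could be sketched roughly as follows. This is a standalone approximation and not the code under review: every constant and function name below is hypothetical, and the header is modelled as a plain `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants mirroring the bit assignment in the PR description.
constexpr uint64_t FORWARDED_TAG   = 0x3;                // bits 0..1 == '11'
constexpr uint64_t FALLBACK_BIT    = uint64_t(1) << 2;   // bit 2
constexpr uint64_t ALT_REGION_BIT  = uint64_t(1) << 3;   // bit 3
constexpr int      OFFSET_SHIFT    = 4;                  // offset starts at bit 4
constexpr int      OFFSET_BITS     = 32 - OFFSET_SHIFT;  // 28 offset bits
constexpr uint64_t LOWER_HALF_MASK = 0xffffffffULL;      // low 32 bits of the header

// Write a forwarding into the low 32 bits; the upper 32 bits (where a
// compressed class pointer could live) are preserved unchanged.
inline uint64_t encode_forwarding(uint64_t header, bool alt_region, uint64_t offset_words) {
  assert(offset_words < (uint64_t(1) << OFFSET_BITS) && "offset must fit in 28 bits");
  uint64_t low = (offset_words << OFFSET_SHIFT)
               | (alt_region ? ALT_REGION_BIT : 0)
               | FORWARDED_TAG;
  return (header & ~LOWER_HALF_MASK) | low;
}

inline bool is_forwarded(uint64_t header)    { return (header & 0x3) == FORWARDED_TAG; }
inline bool is_fallback(uint64_t header)     { return (header & FALLBACK_BIT) != 0; }
inline bool uses_alt_region(uint64_t header) { return (header & ALT_REGION_BIT) != 0; }

// Offset in heap words from the base address of the selected target region.
inline uint64_t decode_offset_words(uint64_t header) {
  return (header & LOWER_HALF_MASK) >> OFFSET_SHIFT;
}
```

Decoding the forwardee would then add `decode_offset_words(header)` to the base address that the per-region table stores for the selected target region (0 or 1); that table lookup is left out here.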

-------------

PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1408928430
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182441736
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182440390
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182452646
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182434367
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182438481

From tschatzl at openjdk.org Tue May 2 12:15:36 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 2 May 2023 12:15:36 GMT
Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v2]
In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
Message-ID:

> Hi all,
>
> please review this refactoring of collection set candidate set handling.
>
> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection.
>
> These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)).
>
> This patch only uses candidates from marking at this time.
>
> Also moves gc efficiency out of HeapRegion and associates it with the list element, as it's not used otherwise.
>
> In detail:
> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time.
>
> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point).
>
> * there are several additional helper sets/lists
>   * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies.
>   * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list.
>
> All these sets implement C++ iterators for simpler use in various places.
>
> Testing:
> - this patch only: tier1-3, gha
> - with JDK-8140326 tier1-7 (or 8?)
>
> Thanks,
> Thomas

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

  ayang review - remove unused methods

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13666/files
  - new: https://git.openjdk.org/jdk/pull/13666/files/e58864e1..ee76b9ed

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=00-01

  Stats: 29 lines in 5 files changed: 0 ins; 23 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/13666.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666

PR: https://git.openjdk.org/jdk/pull/13666

From rkennke at openjdk.org Tue May 2 12:50:22 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 12:50:22 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v15]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 11:47:15 GMT, Aleksey Shipilev wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Some more changes by @shipilev
>
> src/hotspot/share/gc/shared/genCollectedHeap.cpp line 136:
>
>> 134: GCInitLogger::print();
>> 135:
>> 136: GCForwarding::initialize(_reserved, SpaceAlignment);
>
> The second argument is not "shift" anymore, right? So this should be the actual reserved space size?

I think SpaceAlignment is correct.
We want to pass a region-size there, and the (default) region size for Serial should be the space alignment, because that is what eden, survivors and old-space will be aligned at. Unfortunately, Serial GC doesn't generally slide from top to bottom: it starts to slide old into old, then young into old until old is full, then slides the rest into young. Even worse, the survivor spaces are swapped with every GC cycle, so we really don't know that sliding goes top -> bottom. Using 'virtual' regions that align at SpaceAlignment solves the problem, though.
(One exception is when the whole heap fits into our 2^28 words range, in which case we can treat the whole heap as a single region)
That said, I see a bug in the line: GCForwarding::initialize() takes region size *in words* but SpaceAlignment is *in bytes*. I'm fixing that by passing the space alignment in words instead.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182500382

From rkennke at openjdk.org Tue May 2 13:00:29 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 13:00:29 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To:
References:
Message-ID:

> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>
> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
> > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. 
Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>
> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>
> Testing:
> - [x] hotspot_gc -UseAltGCForwarding
> - [x] hotspot_gc +UseAltGCForwarding
> - [x] tier1 -UseAltGCForwarding
> - [x] tier1 +UseAltGCForwarding
> - [x] tier2 -UseAltGCForwarding
> - [x] tier2 +UseAltGCForwarding

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Address @shipilev's review

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13582/files
  - new: https://git.openjdk.org/jdk/pull/13582/files/74d4ad1f..5892ad5d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=15
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=14-15

  Stats: 15 lines in 4 files changed: 2 ins; 0 del; 13 mod
  Patch: https://git.openjdk.org/jdk/pull/13582.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582

PR: https://git.openjdk.org/jdk/pull/13582

From tschatzl at openjdk.org Tue May 2 13:41:28 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 2 May 2023 13:41:28 GMT
Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v3]
In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
Message-ID:

> Hi all,
>
> please review this refactoring of collection set candidate set handling.
>
> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection.
>
> These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)).
>
> This patch only uses candidates from marking at this time.
>
> Also moves gc efficiency out of HeapRegion and associates it with the list element, as it's not used otherwise.
>
> In detail:
> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time.
>
> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point).
>
> * there are several additional helper sets/lists
>   * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies.
>   * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list.
>
> All these sets implement C++ iterators for simpler use in various places.
>
> Testing:
> - this patch only: tier1-3, gha
> - with JDK-8140326 tier1-7 (or 8?)
>
> Thanks,
> Thomas

Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits:

 - Merge branch 'master' into 8306541-refactor-cset-candidates
 - ayang review - remove unused methods
 - Whitespace fixes
 - typo
 - More cleanup
 - Cleanup
 - Cleanup
 - Refactor collection set candidates

   Improve the interface to collection set candidates and prepare for having collection set candidates at any time.

   Preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained, i.e. evacuation failed regions).

   This patch only uses candidates from marking at this time.

   Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise.

   * the collection set candidates set is not temporarily allocated any more, but the candidate set object must be available all the time.

   * G1CollectionSetCandidates is the main class, representing the current candidates.
   Contains the "from marking" candidate list only (at this point).

   * there are several additional helper sets/lists
     * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies.
     * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list.

   All these sets implement C++ iterators for simpler use in various places.

   Everything else are changes to use these helper sets/lists throughout.

   Some additional FIXME for log messages to remove are in there. Please ignore.

-------------

Changes: https://git.openjdk.org/jdk/pull/13666/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=02
  Stats: 1085 lines in 26 files changed: 622 ins; 217 del; 246 mod
  Patch: https://git.openjdk.org/jdk/pull/13666.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666

PR: https://git.openjdk.org/jdk/pull/13666

From shade at openjdk.org Tue May 2 14:23:27 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 2 May 2023 14:23:27 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v15]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 12:47:26 GMT, Roman Kennke wrote:

>> src/hotspot/share/gc/shared/genCollectedHeap.cpp line 136:
>>
>>> 134: GCInitLogger::print();
>>> 135:
>>> 136: GCForwarding::initialize(_reserved, SpaceAlignment);
>>
>> The second argument is not "shift" anymore, right? So this should be the actual reserved space size?
>
> I think SpaceAlignment is correct. We want to pass a region-size there, and the (default) region size for Serial should be the space alignment, because that is what eden, survivors and old-space will be aligned at. Unfortunately, Serial GC doesn't generally slide from top to bottom: it starts to slide old into old, then young into old until old is full, then slides the rest into young.
Even worse, the survivor spaces are swapped with every GC cycle, so we really don't know that sliding goes top -> bottom. Using 'virtual' regions that align at SpaceAlignment solves the problem, though.
> (One exception is when the whole heap fits into our 2^28 words range, in which case we can treat the whole heap as a single region)
> That said, I see a bug in the line: GCForwarding::initialize() takes region size *in words* but SpaceAlignment is *in bytes*. I'm fixing that by passing the space alignment in words instead.

Ah, that is _region size_, okay. `SpaceAlignment` seems okay then.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182618027

From shade at openjdk.org Tue May 2 14:51:29 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 2 May 2023 14:51:29 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To:
References:
Message-ID: <5K91Rc15LdUc1SOEllnetA3-IS5T_pDYSkEXFIR8M64=.4ba97ce6-1033-490e-a4a6-911a9a870109@github.com>

On Tue, 2 May 2023 13:00:29 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions.
This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. 
Once compact object headers become the default and only implementation, the flag and old implementation could be removed. [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
> Address @shipilev's review

Some more...

src/hotspot/share/gc/shared/slidingForwarding.cpp line 40:

> 38: SlidingForwarding::SlidingForwarding(MemRegion heap, size_t region_size_words)
> 39: : _heap_start(heap.start()),
> 40: _num_regions(((heap.end() - heap.start()) / region_size_words) + 1),

This one overestimates the number of regions by 1, if heap is covered by regions exactly, right? Seems innocuous, though.
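
To make the off-by-one concrete, here is a small standalone comparison of the `/ region_size_words + 1` estimate from the line above with a plain ceiling division. It is only an illustration: both helper names are hypothetical, sizes are given as plain word counts, and nothing here claims to be what the PR ends up using.

```cpp
#include <cassert>
#include <cstddef>

// Estimate as written in the reviewed constructor: always adds one region.
inline size_t num_regions_plus_one(size_t heap_words, size_t region_size_words) {
  assert(region_size_words > 0 && "region size must be non-zero");
  return heap_words / region_size_words + 1;
}

// Ceiling division: the exact number of regions needed to cover the heap.
inline size_t num_regions_ceil(size_t heap_words, size_t region_size_words) {
  assert(region_size_words > 0 && "region size must be non-zero");
  return (heap_words + region_size_words - 1) / region_size_words;
}
```

For a heap of 1024 words and 256-word regions the first form gives 5 and the second gives 4; for 1000 words both give 4. The extra entry only costs one unused table slot, which matches the "innocuous" assessment.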
src/hotspot/share/gc/shared/slidingForwarding.hpp line 112: > 110: > 111: // How many bits we use for the compressed pointer > 112: static const int NUM_COMPRESSED_BITS = 32 - OFFSET_BITS_SHIFT; Suggestion: // How many bits we use for the offset static const int NUM_OFFSET_BITS = 32 - OFFSET_BITS_SHIFT; src/hotspot/share/gc/shared/slidingForwarding.hpp line 165: > 163: }; > 164: > 165: static const size_t TABLE_SIZE = 128; Any reason why we do `128` here? I think we can take a bit larger table here, given that: a) the footprint would be eaten by chaining anyway; b) we delete the table after use anyway. 1K entries would take about 32K native memory, if I calculate it right. ------------- PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1409233470 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182631375 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182651630 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182660172 From eosterlund at openjdk.org Tue May 2 15:15:26 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 2 May 2023 15:15:26 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v16] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 13:00:29 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. 
>> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Address @shipilev's review src/hotspot/share/gc/shared/preservedMarks.cpp line 52: > 50: if (GCForwarding::is_forwarded(obj)) { > 51: elem->set_oop(GCForwarding::forwardee(obj)); > 52: } Is PreservedMarks still useful after moving the spacious forwarding/mark information out from the markWord?
I can see that we need it while transitioning to using your new code, but that's about it right? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182691096 From stuefe at openjdk.org Tue May 2 15:28:37 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 2 May 2023 15:28:37 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v16] In-Reply-To: References: Message-ID: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com> On Tue, 2 May 2023 13:00:29 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. 
>> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag.
I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Address @shipilev's review Hi Roman, Small general concern, the last-last-ditch-GC fallback table may be impractical cost-wise. How large is that expected to grow? You pay 24+x (~48 on glibc with internal overhead) bytes per forwarded oop. Very easy first-step mitigation: Let the table house the first n (1000-10000) nodes as an inline member array. Allocate nodes from there, only allocate spilloffs from C-heap. Allocation would be a lot faster and cheaper memory wise, and its just some lines of code. Cheers, Thomas src/hotspot/share/gc/shared/slidingForwarding.cpp line 35: > 33: > 34: // We cannot use 0, because that may already be a valid base address in zero-based heaps. > 35: // 0x1 is safe because heap base addresses must be aligned by much larger alginemnt typo src/hotspot/share/gc/shared/slidingForwarding.cpp line 44: > 42: _region_size_words_shift(log2i_exact(region_size_words)), > 43: _bases_table(nullptr), > 44: _fallback_table(nullptr) { Assert for sane values for region_size? At least >= word size? 
src/hotspot/share/gc/shared/slidingForwarding.cpp line 81: > 79: _bases_table = nullptr; > 80: > 81: if (_fallback_table != nullptr) { null check not needed src/hotspot/share/gc/shared/slidingForwarding.cpp line 137: > 135: void FallbackTable::forward_to(HeapWord* from, HeapWord* to) { > 136: size_t idx = home_index(from); > 137: if (_table[idx]._from != nullptr) { Here you need to do a contains check, right? Because, as you wrote in your answer to Aleksey, forwardings can be rewritten: https://github.com/openjdk/jdk/pull/13582/files#r1180126262 src/hotspot/share/gc/shared/slidingForwarding.hpp line 121: > 119: size_t _region_size_words; > 120: size_t _region_size_words_shift; > 121: HeapWord** _bases_table; Small nit. For clarity, I would prefer if we had a real structure here, e.g.: struct region_forwarding { HeapWord* dest; HeapWord* alt_dest; }; region_forwarding* _table; src/hotspot/share/gc/shared/slidingForwarding.hpp line 168: > 166: FallbackTableEntry _table[TABLE_SIZE]; > 167: > 168: static size_t home_index(HeapWord* from); Nitpicking, but I'd prefer an int or unsigned as return val here. src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 56: > 54: // Primary is free > 55: _bases_table[base_idx] = to_region_base; > 56: } else if (region_contains(_bases_table[base_idx], to)) { Stupid question, could this not just be `else if ( _bases_table[base_idx] == to_region_base)` ? Same below? src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 87: > 85: > 86: HeapWord* SlidingForwarding::decode_forwarding(HeapWord* from, uintptr_t encoded) const { > 87: assert((encoded & markWord::marked_value) == markWord::marked_value, "must be marked as forwarded"); Assert for !FALLBACK too? 
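To make the bit layout from the PR description and the two assertions discussed here concrete, the following is a minimal standalone sketch. It is not the code from `slidingForwarding.inline.hpp`: every name is hypothetical, addresses are modelled as plain word indices, and the two-entry `bases` array stands in for the per-region slots of the bases table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants following the layout in the PR description.
constexpr uint32_t FORWARDED_BITS = 0x3;      // bits 0-1: '11' means forwarded
constexpr uint32_t FALLBACK_BIT   = 1u << 2;  // bit 2: look up in the fallback table
constexpr uint32_t ALT_REGION_BIT = 1u << 3;  // bit 3: target region 0 or 1
constexpr int      OFFSET_SHIFT   = 4;        // bits 4..31: offset in heap words
constexpr uint32_t MAX_OFFSET     = (1u << (32 - OFFSET_SHIFT)) - 1;

// Encode a forwarding into the lower 32 bits of a header.
inline uint32_t encode(bool alt_region, uint32_t offset_words) {
  assert(offset_words <= MAX_OFFSET && "offset must fit into 28 bits");
  return (offset_words << OFFSET_SHIFT) |
         (alt_region ? ALT_REGION_BIT : 0u) |
         FORWARDED_BITS;
}

// Decode against the two possible target-region bases of the source region.
inline size_t decode(uint32_t encoded, const size_t bases[2]) {
  assert((encoded & FORWARDED_BITS) == FORWARDED_BITS && "must be marked as forwarded");
  assert((encoded & FALLBACK_BIT) == 0 && "fallback forwardings are decoded elsewhere");
  size_t alt = (encoded & ALT_REGION_BIT) != 0 ? 1 : 0;
  size_t offset = encoded >> OFFSET_SHIFT;
  return bases[alt] + offset;
}
```

With `bases = {1000, 5000}`, `decode(encode(false, 7), bases)` gives 1007 and `decode(encode(true, 42), bases)` gives 5042. The second assertion in `decode` corresponds to the "!FALLBACK" check asked about above.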
src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 95: > 93: size_t base_idx = from_idx + alt_region; > 94: > 95: HeapWord* decoded = _bases_table[base_idx] + offset; Maybe assert that table slot != UNUSED_BASE first src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 115: > 113: uintptr_t encoded = encode_forwarding(from_hw, to_hw); > 114: markWord new_header = markWord((from_header.value() & ~MARK_LOWER_HALF_MASK) | encoded); > 115: from->set_mark(new_header); What happens if the header is displaced into an OM? Should we not update the displaced header instead? src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 125: > 123: assert(_bases_table != nullptr, "call begin() before asking for forwarding"); > 124: > 125: markWord header = from->mark(); Could this header be displaced? test/hotspot/gtest/gc/shared/test_slidingForwarding.cpp line 40: > 38: return ((uintptr_t(1) << 2) /* fallback */ | 3 /* forwarded */); > 39: } > 40: Could you add a test that forwarding works for displaced Oop+OM ? ------------- Changes requested by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1405661292 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182596471 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182631488 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182633369 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182691084 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182612217 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182688315 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182622823 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182636381 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182640645 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182644415 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182647104 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182706683 From stuefe at openjdk.org Tue May 2 15:28:41 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 2 May 2023 15:28:41 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v10] In-Reply-To: <1rB2fkk813I4tm8B-G2ArcAjSJxWzyYIgf8yBWBVGwc=.8dc9ecb5-24c0-4a86-bf29-4cf6408a1b1b@github.com> References: <1rB2fkk813I4tm8B-G2ArcAjSJxWzyYIgf8yBWBVGwc=.8dc9ecb5-24c0-4a86-bf29-4cf6408a1b1b@github.com> Message-ID: <2F7cnbE_2v4qCk1LoBFRI2S9Ky2bcwgPMxHC0Cf0lHU=.0973cc70-d5df-409c-87b0-f21562f1010d@github.com> On Fri, 28 Apr 2023 07:52:53 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix minimal build test/hotspot/gtest/gc/shared/test_slidingForwarding.cpp line 35: > 33: // Test simple forwarding within the same region. > 34: TEST_VM(SlidingForwarding, simple) { > 35: HeapWord heap[16]; Please initialize array for release build gtests ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1180220812 From rkennke at openjdk.org Tue May 2 15:33:24 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 2 May 2023 15:33:24 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v16] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 15:11:59 GMT, Erik Österlund wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Address @shipilev's review > > src/hotspot/share/gc/shared/preservedMarks.cpp line 52: > >> 50: if (GCForwarding::is_forwarded(obj)) { >> 51: elem->set_oop(GCForwarding::forwardee(obj)); >> 52: } > > Is PreservedMarks still useful after moving the spacious forwarding/mark information out from the markWord? I can see that we need it while transitioning to using your new code, but that's about it right? It is still useful. This PR implements a compression that allows using only the lowest 32 bits of the mark-word for the forwarding pointer, but it still essentially uses the mark-word to store that information. That means that it overwrites i-hash-code and lock-bits just the same as the normal implementation, and thus must preserve this information. I *also* prototyped a hash-table-based forwarding which no longer uses the mark-word to store forwarding. However, I found that to be 1.
significantly slower and 2. significantly larger. That was a trade-off that I did not want to make at this point, when we 'only' want 64-bit-headers, simply because it's not yet necessary. It *will* become necessary to make that trade-off, or come up with a better overall approach (e.g. use scissor-GC like Parallel GC does, or come up with a better fwd-table like in that paper that you sent me: https://dl.acm.org/doi/abs/10.1145/3546918.3546928) but this needs to be researched. So yeah, the sliding forwarding algorithm is an interim solution but I think it is worth to have it at this point in time. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182713491 From tschatzl at openjdk.org Tue May 2 15:53:17 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 2 May 2023 15:53:17 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v5] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that removes the pinned tag from `HeapRegion`. > > So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. > > With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". > > The (current) pinned flag is surprisingly little used, only for policy decisions. > > The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). 
> > Testing: tier1-3, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: - Merge branch 'master' into 8306836-remove-pinned-tag - remove is_young_gc_movable in full gc code - cplummer review - ayang review - Fix hsdb - compilation fixes - Initial implementation ------------- Changes: https://git.openjdk.org/jdk/pull/13643/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13643&range=04 Stats: 69 lines in 20 files changed: 12 ins; 30 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/13643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13643/head:pull/13643 PR: https://git.openjdk.org/jdk/pull/13643 From tschatzl at openjdk.org Tue May 2 16:47:06 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 2 May 2023 16:47:06 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6] In-Reply-To: References: Message-ID: > Hi all, > > please review this change that removes the pinned tag from `HeapRegion`. > > So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. > > With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". > > The (current) pinned flag is surprisingly little used, only for policy decisions. > > The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). 
> > Testing: tier1-3, gha > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Remove is_young_gc_movable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13643/files - new: https://git.openjdk.org/jdk/pull/13643/files/3577054b..3516e982 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13643&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13643&range=04-05 Stats: 17 lines in 6 files changed: 1 ins; 9 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/13643.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13643/head:pull/13643 PR: https://git.openjdk.org/jdk/pull/13643 From tschatzl at openjdk.org Tue May 2 16:47:36 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 2 May 2023 16:47:36 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v5] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 15:53:17 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). 
>> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge branch 'master' into 8306836-remove-pinned-tag > - remove is_young_gc_movable in full gc code > - cplummer review > - ayang review > - Fix hsdb > - compilation fixes > - Initial implementation I removed the `young_gc_is_movable()` predicate; it is probably the wrong time to introduce more abstract concepts like this in this change. Moved off the refactoring of the `G1CollectionSetChooser::should_add()` and its caller to sometime else too - it's not relevant to this change either. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13643#issuecomment-1531806113 From rkennke at openjdk.org Tue May 2 16:51:13 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 2 May 2023 16:51:13 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v17] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. 
This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes.
Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: More @shipilev's review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/5892ad5d..494ec9ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=15-16 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Tue May 2 16:54:24 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 2 May 2023 16:54:24 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v16] In-Reply-To: 
<9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
References: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
Message-ID:

On Tue, 2 May 2023 15:25:10 GMT, Thomas Stuefe wrote:

> Hi Roman,
>
> Small general concern, the last-last-ditch-GC fallback table may be impractical cost-wise. How large is that expected to grow? You pay 24+x (~48 on glibc with internal overhead) bytes per forwarded oop.
>
> Very easy first-step mitigation: Let the table house the first n (1000-10000) nodes as an inline member array. Allocate nodes from there, only allocate spilloffs from C-heap. Allocation would be a lot faster and cheaper memory-wise, and it's just some lines of code.
>

I did some experiments with the only jtreg test that seems to exercise the G1 serial compaction (and thus the fallback-table) (the test is: gc/stress/TestMultiThreadStressRSet.java). With fallback-table size 128 I'd typically end up with several dozen excess nodes, sometimes more than the base table size. Up to a table size of 512 this reduces significantly, but there are still typically one to several dozen extra nodes. When I switched to a table size of 1024, the extra node count dropped to below one dozen in most cases. I'll leave the table size at this value until we find a good reason to extend it, ok?
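
The inline-array mitigation suggested in the quoted text could look roughly like the following. This is only a minimal sketch under stated assumptions, not code from the PR: `FallbackNode`, `FallbackNodePool` and the `InlineCapacity` parameter are hypothetical names chosen for illustration, and the real node layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node type: one entry of a chained fallback hash table.
struct FallbackNode {
  const void* from = nullptr;
  const void* to = nullptr;
  FallbackNode* next = nullptr;
};

// Hands out the first InlineCapacity nodes from a member array and
// only goes to the C heap for spill-over nodes.
template <size_t InlineCapacity>
class FallbackNodePool {
  FallbackNode _inline_nodes[InlineCapacity];
  size_t _inline_used = 0;
  std::vector<std::unique_ptr<FallbackNode>> _spilled;

public:
  FallbackNode* allocate() {
    if (_inline_used < InlineCapacity) {
      // Cheap path: bump allocation inside the pool object itself.
      return &_inline_nodes[_inline_used++];
    }
    // Slow path: one heap allocation per spill-over node.
    _spilled.push_back(std::make_unique<FallbackNode>());
    return _spilled.back().get();
  }

  size_t inline_used() const { return _inline_used; }
  size_t spilled() const { return _spilled.size(); }
};
```

With a capacity in the range mentioned above (for example `FallbackNodePool<1024>`), the first allocations never touch the heap allocator, so the per-node malloc overhead is only paid for nodes beyond that capacity.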
-------------

PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1531813673

From rkennke at openjdk.org Tue May 2 17:37:28 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 17:37:28 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
References: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
Message-ID:

On Tue, 2 May 2023 14:16:16 GMT, Thomas Stuefe wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Address @shipilev's review
>
> src/hotspot/share/gc/shared/slidingForwarding.hpp line 121:
>
>> 119: size_t _region_size_words;
>> 120: size_t _region_size_words_shift;
>> 121: HeapWord** _bases_table;
>
> Small nit. For clarity, I would prefer if we had a real structure here, e.g.:
>
> struct region_forwarding { HeapWord* dest; HeapWord* alt_dest; };
> region_forwarding* _table;

Ok, I am changing it. It looks like it's introducing a branch on the decoding-path though. I am not sure if a C++ compiler would optimise it to branch-free code. It's probably a very minor concern.

> src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 115:
>
>> 113: uintptr_t encoded = encode_forwarding(from_hw, to_hw);
>> 114: markWord new_header = markWord((from_header.value() & ~MARK_LOWER_HALF_MASK) | encoded);
>> 115: from->set_mark(new_header);
>
> What happens if the header is displaced into an OM? Should we not update the displaced header instead?

When the header is displaced, it will be recorded in the preserved-marks table. Then we over-write the mark-word with the forwarding. At the end of the GC, we will restore the original mark from the preserved-marks table. This is the same mechanism that is already used in normal uncompressed forwarding.
> src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 125:
>
>> 123: assert(_bases_table != nullptr, "call begin() before asking for forwarding");
>> 124:
>> 125: markWord header = from->mark();
>
> Could this header be displaced?

No. See above. We actually check for that in decode_forwarding():

assert((encoded & markWord::marked_value) == markWord::marked_value, "must be marked as forwarded");

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182827422
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182846767
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182847913

From shade at openjdk.org Tue May 2 17:37:28 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 2 May 2023 17:37:28 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To:
References: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
Message-ID:

On Tue, 2 May 2023 17:12:12 GMT, Roman Kennke wrote:

>> src/hotspot/share/gc/shared/slidingForwarding.hpp line 121:
>>
>>> 119: size_t _region_size_words;
>>> 120: size_t _region_size_words_shift;
>>> 121: HeapWord** _bases_table;
>>
>> Small nit. For clarity, I would prefer if we had a real structure here, e.g.:
>>
>> struct region_forwarding { HeapWord* dest; HeapWord* alt_dest; };
>> region_forwarding* _table;
>
> Ok, I am changing it. It looks like it's introducing a branch on the decoding-path though. I am not sure if a C++ compiler would optimise it to a branch-free code, though. It's probably a very minor concern.

No wait, let's keep it as `HeapWord*` array. The fact that alternate selection is just a math addition matters a bit for decoding performance. I think it does not complicate the code all that much to warrant extra abstraction here.
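
To illustrate why a flat `HeapWord*` array keeps decoding branch-free, here is a minimal sketch of the encoding from the PR description (bits 0..1 forwarded, bit 2 fallback, bit 3 target selector, bits 4..31 word offset). It is an illustration under assumptions, not the actual HotSpot code: `HeapWord` is modeled as a plain 64-bit word, and `SlidingForwardingSketch`, its constructor parameters and the constant names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: one heap word is modeled as a 64-bit integer.
using HeapWord = uint64_t;

// Bit layout as given in the PR description.
constexpr uint32_t FORWARDED_BITS   = 0x3;      // bits 0..1: '11' = forwarded
constexpr uint32_t FALLBACK_BIT     = 1u << 2;  // bit 2: use the fallback table
constexpr int      ALT_REGION_SHIFT = 3;        // bit 3: target region 0 or 1
constexpr int      OFFSET_SHIFT     = 4;        // bits 4..31: offset in heap words

class SlidingForwardingSketch {
  HeapWord* _heap_start;
  size_t _region_size_words_shift;
  // Flat table, two slots per source region: [2*r] primary, [2*r + 1] alternate.
  std::vector<HeapWord*> _bases_table;

  size_t region_index(const HeapWord* p) const {
    return static_cast<size_t>(p - _heap_start) >> _region_size_words_shift;
  }

public:
  SlidingForwardingSketch(HeapWord* heap_start, size_t num_regions,
                          size_t region_size_words_shift)
    : _heap_start(heap_start),
      _region_size_words_shift(region_size_words_shift),
      _bases_table(2 * num_regions, nullptr) {}

  // Returns the low 32 bits to store in the header of 'from'.
  uint32_t encode(const HeapWord* from, HeapWord* to) {
    size_t slot = 2 * region_index(from);
    HeapWord* to_region_base =
        _heap_start + (region_index(to) << _region_size_words_shift);
    uint32_t alt;
    if (_bases_table[slot] == nullptr || _bases_table[slot] == to_region_base) {
      _bases_table[slot] = to_region_base;      // primary slot free or matching
      alt = 0;
    } else if (_bases_table[slot + 1] == nullptr ||
               _bases_table[slot + 1] == to_region_base) {
      _bases_table[slot + 1] = to_region_base;  // alternate slot free or matching
      alt = 1;
    } else {
      // Third target region: the caller would record this in a fallback table.
      return FALLBACK_BIT | FORWARDED_BITS;
    }
    uint32_t offset = static_cast<uint32_t>(to - to_region_base);
    assert(offset < (1u << 28));                // must fit into bits 4..31
    return (offset << OFFSET_SHIFT) | (alt << ALT_REGION_SHIFT) | FORWARDED_BITS;
  }

  HeapWord* decode(const HeapWord* from, uint32_t encoded) const {
    assert((encoded & FORWARDED_BITS) == FORWARDED_BITS);
    assert((encoded & FALLBACK_BIT) == 0);
    size_t alt = (encoded >> ALT_REGION_SHIFT) & 1;
    size_t offset = encoded >> OFFSET_SHIFT;
    // Selecting primary or alternate is an index addition, not a branch.
    return _bases_table[2 * region_index(from) + alt] + offset;
  }
};
```

In this sketch the selector bit is simply added to the table index in `decode`, which is the "math addition" property mentioned above; a two-field struct would need either a branch or the same index arithmetic on its members.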
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182832355

From shade at openjdk.org Tue May 2 17:37:31 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 2 May 2023 17:37:31 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
References: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
Message-ID:

On Tue, 2 May 2023 14:23:36 GMT, Thomas Stuefe wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Address @shipilev's review
>
> src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 56:
>
>> 54: // Primary is free
>> 55: _bases_table[base_idx] = to_region_base;
>> 56: } else if (region_contains(_bases_table[base_idx], to)) {
>
> Stupid question, could this not just be `else if ( _bases_table[base_idx] == to_region_base)` ? Same below?

(kicks himself a little). Yes. Yes, it can. We would not need the `region_contains` method then at all, I think.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182826854

From rkennke at openjdk.org Tue May 2 17:37:32 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 17:37:32 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To:
References: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
Message-ID:

On Tue, 2 May 2023 17:11:35 GMT, Aleksey Shipilev wrote:

>> src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 56:
>>
>>> 54: // Primary is free
>>> 55: _bases_table[base_idx] = to_region_base;
>>> 56: } else if (region_contains(_bases_table[base_idx], to)) {
>>
>> Stupid question, could this not just be `else if ( _bases_table[base_idx] == to_region_base)` ? Same below?
>
> (kicks himself a little). Yes. Yes, it can. We would not need `region_contains` method then at all, I think.

Indeed! Well spotted!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182827529

From rkennke at openjdk.org Tue May 2 17:46:37 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 17:46:37 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v16]
In-Reply-To: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
References: <9OvQlCjXDbQ_AzJyCc-xYnS-zwv5ib2SxsfQgKINFWM=.86d9e284-1096-4510-a801-dfbb5a8b2880@github.com>
Message-ID: <30hPs9U3Wt5ITn5XHdjVPuVcbqK4YWq1Xxfw2LznDYo=.0194177b-9472-41e1-bd2a-056eb80104ff@github.com>

On Tue, 2 May 2023 15:11:59 GMT, Thomas Stuefe wrote:

>> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Address @shipilev's review
>
> src/hotspot/share/gc/shared/slidingForwarding.cpp line 137:
>
>> 135: void FallbackTable::forward_to(HeapWord* from, HeapWord* to) {
>> 136: size_t idx = home_index(from);
>> 137: if (_table[idx]._from != nullptr) {
>
> Here you need to do a contains check, right? Because, as you wrote in your answer to Aleksey, forwardings can be rewritten: https://github.com/openjdk/jdk/pull/13582/files#r1180126262

I don't think that ever happens (I think we'd only ever re-forward from normal forwarding to fallback-forwarding once), but I am adding that check for extra sanity.

> test/hotspot/gtest/gc/shared/test_slidingForwarding.cpp line 40:
>
>> 38: return ((uintptr_t(1) << 2) /* fallback */ | 3 /* forwarded */);
>> 39: }
>> 40:
>
> Could you add a test that forwarding works for displaced Oop+OM ?

Uhhh, that would involve the OM and preserved-marks subsystems. The saving and restoring of 'interesting mark-words' is done outside of the GCForwarding subsystem and is not the responsibility here. I'd rather not test for that.
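
On the contains check in `FallbackTable::forward_to` discussed above: a minimal sketch of a chained hash table whose `forward_to` overwrites an existing mapping instead of adding a second entry for the same source address. This is not the PR's implementation. `FallbackTableSketch`, its bucket layout and the hash in `home_index` are assumptions for illustration, and `HeapWord` is again modeled as a 64-bit word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: one heap word is modeled as a 64-bit integer.
using HeapWord = uint64_t;

class FallbackTableSketch {
  struct Entry {
    const HeapWord* from;
    HeapWord* to;
  };
  // One chain of entries per home bucket.
  std::vector<std::vector<Entry>> _buckets;

  // Hypothetical hash: drop the alignment bits, then reduce to the table size.
  size_t home_index(const HeapWord* from) const {
    return (reinterpret_cast<uintptr_t>(from) >> 3) % _buckets.size();
  }

public:
  explicit FallbackTableSketch(size_t table_size) : _buckets(table_size) {
    assert(table_size > 0);
  }

  void forward_to(const HeapWord* from, HeapWord* to) {
    std::vector<Entry>& chain = _buckets[home_index(from)];
    for (Entry& e : chain) {
      if (e.from == from) {
        // Contains check: re-forwarding updates the existing entry.
        e.to = to;
        return;
      }
    }
    chain.push_back(Entry{from, to});
  }

  // Returns nullptr when 'from' has no fallback forwarding.
  HeapWord* forwardee(const HeapWord* from) const {
    for (const Entry& e : _buckets[home_index(from)]) {
      if (e.from == from) {
        return e.to;
      }
    }
    return nullptr;
  }
};
```

Without the contains check, a second `forward_to` for the same object would leave two entries in the chain, and which one a lookup finds would depend on traversal order.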
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182855383
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182856956

From rkennke at openjdk.org Tue May 2 18:06:30 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 18:06:30 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v18]
In-Reply-To:
References:
Message-ID: <06loJyeqlW5aON-IGrWJzY6DQBLkC3kyuxxeCMxq3xI=.da8bbc4c-0cd7-4c7f-9bb2-0b087ce70d11@github.com>

> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>
> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>
> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>
> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start addresses of the possible target regions for each source region.
>
> With this, forwarding information would be encoded like this:
> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
> - Bit 2: Used for 'fallback'-forwarding (see below)
> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
> - Bits 4..31: The number of heap words from the target base address
>
> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1: there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>
> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32-bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>
> Testing:
> - [x] hotspot_gc -UseAltGCForwarding
> - [x] hotspot_gc +UseAltGCForwarding
> - [x] tier1 -UseAltGCForwarding
> - [x] tier1 +UseAltGCForwarding
> - [x] tier2 -UseAltGCForwarding
> - [x] tier2 +UseAltGCForwarding

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Address @tstuefe's review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13582/files
  - new: https://git.openjdk.org/jdk/pull/13582/files/494ec9ad..84181db6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=17
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=16-17

  Stats: 38 lines in 3 files changed: 8 ins; 6 del; 24 mod
  Patch: https://git.openjdk.org/jdk/pull/13582.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582

PR: https://git.openjdk.org/jdk/pull/13582

From rkennke at openjdk.org Tue May 2 18:21:18 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Tue, 2 May 2023 18:21:18 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v19]
In-Reply-To:
References:
Message-ID:

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

  Initialize 'heap' elements in test case

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13582/files
  - new: https://git.openjdk.org/jdk/pull/13582/files/84181db6..8366454e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=18
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=17-18

  Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/13582.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582

PR: https://git.openjdk.org/jdk/pull/13582

From sspitsyn at openjdk.org Tue May 2 19:02:22 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 2 May 2023 19:02:22 GMT
Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 16:47:06 GMT, Thomas Schatzl wrote:

>> Hi all,
>>
>> please review this change that removes the pinned tag from `HeapRegion`.
>>
>> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions.
>>
>> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed".
>>
>> The (current) pinned flag is surprisingly little used, only for policy decisions.
>> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). >> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove is_young_gc_movable This looks good in general. I can't judge on the GC side decision about this removal and all updated comments but it looks consistent. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13643#pullrequestreview-1409712756 From ayang at openjdk.org Tue May 2 22:08:18 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 2 May 2023 22:08:18 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 16:47:06 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). 
>> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove is_young_gc_movable Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13643#pullrequestreview-1409951167 From ysr at openjdk.org Wed May 3 00:32:23 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 3 May 2023 00:32:23 GMT Subject: RFR: 8305062: Refactor CardTable::resize_covered_region [v3] In-Reply-To: References: Message-ID: On Tue, 18 Apr 2023 09:21:54 GMT, Albert Mingkun Yang wrote: >> Simple refactoring to make logic around cardtable cover-region more concrete, since #generations and gen-boundary is fixed for Serial/Parallel. >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review Left a comment re `guard_region`. src/hotspot/share/gc/shared/cardTable.hpp line 63: > 61: > 62: // The last card is a guard card; never committed. > 63: MemRegion _guard_region; @albertnetymk : It looks like, following this refactor, you have stopped using `guard_region` for its previous role. I'd either put some of those checks back in, or just delete this now otherwise obsolete field. It is possible, however, that I am missing something here. Thanks! ------------- PR Review: https://git.openjdk.org/jdk/pull/13206#pullrequestreview-1410041342 PR Review Comment: https://git.openjdk.org/jdk/pull/13206#discussion_r1183156270 From ysr at openjdk.org Wed May 3 06:46:25 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Wed, 3 May 2023 06:46:25 GMT Subject: RFR: 8305062: Refactor CardTable::resize_covered_region [v3] In-Reply-To: References: Message-ID: <0A_VTfQsSSTI5BGWdFlDWFfwVugfN8MhwYOo_b2astU=.35e7969c-9c98-4565-be8c-af8a4ca7b5a4@github.com> On Tue, 18 Apr 2023 09:21:54 GMT, Albert Mingkun Yang wrote: >> Simple refactoring to make logic around cardtable cover-region more concrete, since #generations and gen-boundary is fixed for Serial/Parallel. >> >> Test: tier1-6 > > Albert Mingkun Yang has updated the pull request incrementally with one additional commit since the last revision: > > review src/hotspot/share/gc/shared/cardTable.hpp line 63: > 61: > 62: // The last card is a guard card; never committed. > 63: MemRegion _guard_region; Doh, scratch that comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13206#discussion_r1183291917 From iwalulya at openjdk.org Wed May 3 08:20:17 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 3 May 2023 08:20:17 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v3] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Tue, 2 May 2023 13:41:28 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. 
>> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: > > - Merge branch 'master' into 8306541-refactor-cset-candidates > - ayang review - remove unused methods > - Whitespace fixes > - typo > - More cleanup > - Cleanup > - Cleanup > - Refactor collection set candidates > > Improve the interface to collection set candidates and prepare for having collection set > candidates at any time. Preparations to allow for multiple sources for these candidates > (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch > only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's > not used otherwise. > > * the collection set candidates set is not temporarily allocated any more, but the candidate > set object must be available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains > the "from marking" candidate list only (at this point). 
> > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not > necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. > Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Everything else are changes to use these helper sets/lists throughout. > > Some additional FIXME for log messages to remove are in there. Please ignore. src/hotspot/share/gc/g1/g1CollectionSet.hpp line 155: > 153: // When doing mixed collections we can add old regions to the collection set, which > 154: // will be collected only if there is enough time. We call these optional regions. > 155: // This member records the current number of regions that are of that type that Comment needs to be revised src/hotspot/share/gc/g1/g1CollectionSetCandidates.cpp line 50: > 48: guarantee((uint)_candidates.length() >= other->length(), "must be"); > 49: > 50: if ((other->length() == 0) || (_candidates.length() == 0)) { `guarantee((uint)_candidates.length() >= other->length(), "must be");` implies that the second part of the predicate is not necessary i.e `|| (_candidates.length() == 0)` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1183278338 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1183285839 From ayang at openjdk.org Wed May 3 09:54:19 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 3 May 2023 09:54:19 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v3] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: <33tIj1LuZJo-0_EbMmYXzw5SgePPVqmhY66M49yQgeA=.d48c62d4-9fa0-4889-810b-d7b0ad30a70b@github.com> On Tue, 2 May 2023 13:41:28 GMT, Thomas 
Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. >> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains eight commits: > > - Merge branch 'master' into 8306541-refactor-cset-candidates > - ayang review - remove unused methods > - Whitespace fixes > - typo > - More cleanup > - Cleanup > - Cleanup > - Refactor collection set candidates > > Improve the interface to collection set candidates and prepare for having collection set > candidates at any time. Preparations to allow for multiple sources for these candidates > (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch > only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's > not used otherwise. > > * the collection set candidates set is not temporarily allocated any more, but the candidate > set object must be available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains > the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not > necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. > Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Everything else are changes to use these helper sets/lists throughout. > > Some additional FIXME for log messages to remove are in there. Please ignore. src/hotspot/share/gc/g1/heapRegion.inline.hpp line 344: > 342: } > 343: > 344: inline bool HeapRegion::in_collection_set_candidates() const { The impl is identical to `is_collection_set_candidate`. Maybe one is enough? src/hotspot/share/gc/shared/ptrQueue.hpp line 202: > 200: // In particular, the individual queues allocate buffers from this shared > 201: // set, and return completed buffers to the set. 
> 202: class PtrQueueSet : public CHeapObj { This doesn't seem required in this PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182609579 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1182610148 From duke at openjdk.org Wed May 3 10:11:22 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 3 May 2023 10:11:22 GMT Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code Message-ID: Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. Added output: Serial [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses Parallel [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms [6.313s][info ][gc,phases,start] GC(12) Summary Phase G1 Full [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms [4.442s][info ][gc,phases,start ] GC(24) Phase 2: 
Prepare compaction ------------- Commit messages: - 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code Changes: https://git.openjdk.org/jdk/pull/13772/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13772&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307346 Stats: 12 lines in 3 files changed: 9 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13772.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13772/head:pull/13772 PR: https://git.openjdk.org/jdk/pull/13772 From tschatzl at openjdk.org Wed May 3 10:19:20 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 10:19:20 GMT Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code In-Reply-To: References: Message-ID: On Wed, 3 May 2023 10:04:14 GMT, olivergillespie wrote: > Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. 
> > Added output: > > Serial > > [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms > [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms > ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count > ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms > [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms > [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses > > Parallel > > [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms > [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms > ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count > ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms > [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms > [6.313s][info ][gc,phases,start] GC(12) Summary Phase > > G1 Full > > [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms > [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms > ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count > ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms > [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms > [4.442s][info ][gc,phases,start ] GC(24) Phase 2: Prepare compaction Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13772#pullrequestreview-1410586664 From shade at openjdk.org Wed May 3 10:19:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 3 May 2023 10:19:23 GMT Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code In-Reply-To: References: Message-ID: <6IdRUilFngVdJMReyIYaKHl-3j3JQWPWvKqdiq81h54=.3505b32b-7120-4c04-add3-0dceceb1ec90@github.com> On Wed, 3 May 2023 10:04:14 GMT, olivergillespie wrote: > Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. 
This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. > > Added output: > > Serial > > [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms > [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms > ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count > ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms > [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms > [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses > > Parallel > > [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms > [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms > ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count > ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms > [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms > [6.313s][info ][gc,phases,start] GC(12) Summary Phase > > G1 Full > > [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms > [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms > ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count > ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms > [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms > [4.442s][info ][gc,phases,start ] GC(24) Phase 2: Prepare compaction Some nits. src/hotspot/share/gc/parallel/psParallelCompact.cpp line 2071: > 2069: > 2070: { > 2071: GCTraceTime(Debug, gc, phases) debug("Report Object Count", &_gc_timer); Nit: in this file, the holder variables are called `tm`, not `debug`. src/hotspot/share/gc/serial/genMarkSweep.cpp line 214: > 212: > 213: { > 214: GCTraceTime(Debug, gc, phases) debug("Report Object Count", gc_timer()); Nit: in this file, the holder variables are called `tm_m`, not `debug`. 
-------------

PR Review: https://git.openjdk.org/jdk/pull/13772#pullrequestreview-1410586067
PR Review Comment: https://git.openjdk.org/jdk/pull/13772#discussion_r1183496549
PR Review Comment: https://git.openjdk.org/jdk/pull/13772#discussion_r1183496759

From ayang at openjdk.org Wed May 3 10:25:14 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Wed, 3 May 2023 10:25:14 GMT
Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 10:04:14 GMT, olivergillespie wrote:

> single threaded STW full heap scan

`HeapInspection::populate_table` can use multiple threads. Could `report_object_count_after_gc` invoke the parallel version?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13772#issuecomment-1532783089

From duke at openjdk.org Wed May 3 10:31:15 2023
From: duke at openjdk.org (olivergillespie)
Date: Wed, 3 May 2023 10:31:15 GMT
Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 10:22:52 GMT, Albert Mingkun Yang wrote:

> Could report_object_count_after_gc invoke the parallel version?

Yes, I was just thinking the same thing! I think it could, I will follow up to implement that change, thanks for the suggestion.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13772#issuecomment-1532788952

From shade at openjdk.org Wed May 3 10:34:15 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 3 May 2023 10:34:15 GMT
Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code
In-Reply-To:
References:
Message-ID:

On Wed, 3 May 2023 10:27:59 GMT, olivergillespie wrote:

> > Could report_object_count_after_gc invoke the parallel version?
>
> Yes, I was just thinking the same thing! I think it could, I will follow up to implement that change, thanks for the suggestion.
Filed: https://bugs.openjdk.org/browse/JDK-8307348 ------------- PR Comment: https://git.openjdk.org/jdk/pull/13772#issuecomment-1532793618 From tschatzl at openjdk.org Wed May 3 10:34:19 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 10:34:19 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v3] In-Reply-To: <33tIj1LuZJo-0_EbMmYXzw5SgePPVqmhY66M49yQgeA=.d48c62d4-9fa0-4889-810b-d7b0ad30a70b@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <33tIj1LuZJo-0_EbMmYXzw5SgePPVqmhY66M49yQgeA=.d48c62d4-9fa0-4889-810b-d7b0ad30a70b@github.com> Message-ID: On Tue, 2 May 2023 14:14:21 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: >> >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang review - remove unused methods >> - Whitespace fixes >> - typo >> - More cleanup >> - Cleanup >> - Cleanup >> - Refactor collection set candidates >> >> Improve the interface to collection set candidates and prepare for having collection set >> candidates at any time. Preparations to allow for multiple sources for these candidates >> (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch >> only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's >> not used otherwise. >> >> * the collection set candidates set is not temporarily allocated any more, but the candidate >> set object must be available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains >> the "from marking" candidate list only (at this point). 
>> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not >> necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. >> Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Everything else are changes to use these helper sets/lists throughout. >> >> Some additional FIXME for log messages to remove are in there. Please ignore. > > src/hotspot/share/gc/g1/heapRegion.inline.hpp line 344: > >> 342: } >> 343: >> 344: inline bool HeapRegion::in_collection_set_candidates() const { > > The impl is identical to `is_collection_set_candidate`. Maybe one is enough? I inlined a few helpers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1183512571 From duke at openjdk.org Wed May 3 10:40:24 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 3 May 2023 10:40:24 GMT Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code [v2] In-Reply-To: References: Message-ID: > Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. 
> > Added output: > > Serial > > [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms > [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms > ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count > ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms > [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms > [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses > > Parallel > > [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms > [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms > ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count > ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms > [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms > [6.313s][info ][gc,phases,start] GC(12) Summary Phase > > G1 Full > > [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms > [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms > ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count > ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms > [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms > [4.442s][info ][gc,phases,start ] GC(24) Phase 2: Prepare compaction olivergillespie has updated the pull request incrementally with one additional commit since the last revision: Use correct holder var names ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13772/files - new: https://git.openjdk.org/jdk/pull/13772/files/3ef4f4cc..ce7227c5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13772&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13772&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13772.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13772/head:pull/13772 PR: https://git.openjdk.org/jdk/pull/13772 From shade at openjdk.org Wed May 3 10:41:16 2023 From: shade at openjdk.org 
(Aleksey Shipilev) Date: Wed, 3 May 2023 10:41:16 GMT Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 10:40:24 GMT, olivergillespie wrote: >> Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. >> >> Added output: >> >> Serial >> >> [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms >> [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms >> ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count >> ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms >> [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms >> [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses >> >> Parallel >> >> [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms >> [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms >> ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count >> ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms >> [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms >> [6.313s][info ][gc,phases,start] GC(12) Summary Phase >> >> G1 Full >> >> [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms >> [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms >> ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count >> ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms >> [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms >> [4.442s][info ][gc,phases,start ] GC(24) Phase 2: Prepare compaction > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Use correct holder 
var names This looks fine to me. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13772#pullrequestreview-1410628693 From rkennke at openjdk.org Wed May 3 10:54:43 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 10:54:43 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v20] In-Reply-To: References: Message-ID: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. 
> - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
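The bit layout listed in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: all names (`encode_forwarding`, `decode_forwarding`, `resolve_forwardee`, the `k...` constants) are hypothetical, and target region bases are modelled as plain heap-word indices instead of real addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants mirroring the layout in the description.
constexpr uint32_t kForwardedMask   = 0x3;             // bits 0..1: '11' marks a forwarded object
constexpr uint32_t kFallbackBit     = 1u << 2;         // bit 2: forwardee lives in the fallback table
constexpr uint32_t kRegionSelectBit = 1u << 3;         // bit 3: selects target region 0 or 1
constexpr int      kOffsetShift     = 4;               // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffset       = (1u << 28) - 1;  // largest offset that fits in 28 bits

struct Decoded {
  bool     fallback;      // true if bit 2 is set
  unsigned target_index;  // 0 or 1, taken from bit 3
  uint32_t word_offset;   // heap words from the target region base
};

// Pack a target selector and a word offset into the low 32 header bits.
inline uint32_t encode_forwarding(unsigned target_index, uint32_t word_offset) {
  assert(target_index < 2);
  assert(word_offset <= kMaxOffset);
  return (word_offset << kOffsetShift)
       | (target_index ? kRegionSelectBit : 0u)
       | kForwardedMask;
}

// Mark an object whose forwardee must be looked up in the fallback table.
inline uint32_t encode_fallback() {
  return kFallbackBit | kForwardedMask;
}

inline bool is_forwarded(uint32_t header_low) {
  return (header_low & kForwardedMask) == kForwardedMask;
}

inline Decoded decode_forwarding(uint32_t header_low) {
  assert(is_forwarded(header_low));
  return Decoded{ (header_low & kFallbackBit) != 0,
                  (header_low & kRegionSelectBit) ? 1u : 0u,
                  header_low >> kOffsetShift };
}

// Resolve a non-fallback forwarding against the two possible target
// region bases recorded for the object's source region.
inline size_t resolve_forwardee(const size_t (&target_bases)[2], uint32_t header_low) {
  Decoded d = decode_forwarding(header_low);
  assert(!d.fallback);
  return target_bases[d.target_index] + d.word_offset;
}
```

In this sketch a caller would first test `is_forwarded`, then check the fallback bit, and only use `resolve_forwardee` when the bit is clear; the per-source-region pair of bases corresponds to one entry of the N-element table mentioned in the description.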
> > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: More Thomas' comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/8366454e..b623db55 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=18-19 Stats: 25 lines in 3 files changed: 7 ins; 5 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From stuefe at openjdk.org Wed May 3 11:11:22 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 3 May 2023 11:11:22 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v20] In-Reply-To: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> References: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> Message-ID: <9CI4EFHvG-HhtLlu8Kf8x6zgmiKqi7au0zn6iHoVrYw=.1a5248d5-5967-428e-bfeb-76f37f8658ad@github.com> On Wed, 3 May 2023 10:54:43 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
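For readers following the fallback-table part of the description, here is a minimal sketch of such a table, assuming head entries are stored inline in the bucket array and overflow entries are chained behind them (the arrangement that comes up in the review comments below). The names `FallbackEntry`, `FallbackTable`, `put` and `get` are hypothetical, the hash function is arbitrary, and plain `new`/`delete` stands in for whatever allocation the actual patch uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One mapping from an object's old address to its forwardee.
struct FallbackEntry {
  uintptr_t      from = 0;        // 0 means "slot unused" in this sketch
  uintptr_t      to   = 0;
  FallbackEntry* next = nullptr;  // overflow chain for colliding keys
};

class FallbackTable {
  std::vector<FallbackEntry> _buckets;  // head entries live inline in the array

  size_t index_for(uintptr_t from) const {
    return (from >> 3) % _buckets.size();  // arbitrary illustrative hash
  }

public:
  explicit FallbackTable(size_t num_buckets) : _buckets(num_buckets) {
    assert(num_buckets > 0);
  }
  FallbackTable(const FallbackTable&) = delete;
  FallbackTable& operator=(const FallbackTable&) = delete;

  ~FallbackTable() {
    // Only chained entries were heap-allocated; heads belong to the vector.
    for (FallbackEntry& head : _buckets) {
      FallbackEntry* e = head.next;
      while (e != nullptr) {
        FallbackEntry* next = e->next;
        delete e;
        e = next;
      }
    }
  }

  void put(uintptr_t from, uintptr_t to) {
    assert(from != 0);
    FallbackEntry* head = &_buckets[index_for(from)];
    if (head->from == 0) {  // empty bucket: fill the inline head
      head->from = from;
      head->to = to;
      return;
    }
    for (FallbackEntry* e = head; e != nullptr; e = e->next) {
      if (e->from == from) {  // existing key: update in place
        e->to = to;
        return;
      }
    }
    // The head is an array slot and cannot be replaced by a new node,
    // so the new entry is linked in directly after it.
    FallbackEntry* n = new FallbackEntry();
    n->from = from;
    n->to = to;
    n->next = head->next;
    head->next = n;
  }

  // Returns the forwardee, or 0 if 'from' was never recorded.
  uintptr_t get(uintptr_t from) const {
    for (const FallbackEntry* e = &_buckets[index_for(from)]; e != nullptr; e = e->next) {
      if (e->from == from) {
        return e->to;
      }
    }
    return 0;
  }
};
```

Because G1 serial compaction is single-threaded, as the description notes, a table like this would not need any synchronization.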
>> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > More Thomas' comments src/hotspot/share/gc/shared/slidingForwarding.cpp line 153: > 151: // Set from and to in new or found entry. > 152: entry->_from = from; > 153: entry->_to = to; Why so complicated? Proposal: while (entry != nullptr && entry->_from != from) { entry = entry->_next; } if (entry == nullptr) { FallbackTableEntry* new_entry = NEW_C_HEAP_OBJ(FallbackTableEntry, mtGC); new_entry->next = head; new_entry->_from = from; head = entry = new_entry; } entry->_to = to; ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183545518 From rkennke at openjdk.org Wed May 3 11:16:25 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 11:16:25 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v20] In-Reply-To: <9CI4EFHvG-HhtLlu8Kf8x6zgmiKqi7au0zn6iHoVrYw=.1a5248d5-5967-428e-bfeb-76f37f8658ad@github.com> References: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> <9CI4EFHvG-HhtLlu8Kf8x6zgmiKqi7au0zn6iHoVrYw=.1a5248d5-5967-428e-bfeb-76f37f8658ad@github.com> Message-ID: On Wed, 3 May 2023 11:08:04 GMT, Thomas Stuefe wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> More Thomas' comments > > src/hotspot/share/gc/shared/slidingForwarding.cpp line 153: > >> 151: // Set from and to in new or found entry. >> 152: entry->_from = from; >> 153: entry->_to = to; > > Why so complicated? 
Proposal:
>
> while (entry != nullptr && entry->_from != from) {
>   entry = entry->_next;
> }
> if (entry == nullptr) {
>   FallbackTableEntry* new_entry = NEW_C_HEAP_OBJ(FallbackTableEntry, mtGC);
>   new_entry->next = head;
>   new_entry->_from = from;
>   head = entry = new_entry;
> }
> entry->_to = to;

Uhm, so this would not change the actual head

> Why so complicated? Proposal:
>
> ```
> while (entry != nullptr && entry->_from != from) {
>   entry = entry->_next;
> }
> if (entry == nullptr) {
>   FallbackTableEntry* new_entry = NEW_C_HEAP_OBJ(FallbackTableEntry, mtGC);
>   new_entry->next = head;
>   new_entry->_from = from;
>   head = entry = new_entry;
> }
> entry->_to = to;
> ```

Remember that head points into the array. We cannot actually prepend the new entry, we can only insert it as the first linked entry after head. If I see it correctly, it would not actually change the head-entry (the stuff in the array) except for its _to field. Also, the new_entry would not get linked anywhere. Or what am I missing?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183550376

From duke at openjdk.org Wed May 3 11:18:29 2023
From: duke at openjdk.org (olivergillespie)
Date: Wed, 3 May 2023 11:18:29 GMT
Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection
Message-ID:

ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event.

The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy:

Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms )
After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms )

Question 1: Should this be the default behaviour for populate_table (use the number of active workers as the parallelism, if nothing else is specified)?

Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate?

-------------

Commit messages:
 - 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection

Changes: https://git.openjdk.org/jdk/pull/13774/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8307348
  Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/13774.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774

PR: https://git.openjdk.org/jdk/pull/13774

From stuefe at openjdk.org Wed May 3 11:20:23 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Wed, 3 May 2023 11:20:23 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v20]
In-Reply-To:
References: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> <9CI4EFHvG-HhtLlu8Kf8x6zgmiKqi7au0zn6iHoVrYw=.1a5248d5-5967-428e-bfeb-76f37f8658ad@github.com>
Message-ID: <5fyq8JC1XDQJffYmRtITjnFebGWAuYN8doLD_DoiPN0=.8809283a-4cbc-4e47-9598-aeb8a335c8eb@github.com>

On Wed, 3 May 2023 11:13:47 GMT, Roman Kennke wrote:

>> src/hotspot/share/gc/shared/slidingForwarding.cpp line 153:
>>
>>> 151: // Set from and to in new or found entry.
>>> 152: entry->_from = from;
>>> 153: entry->_to = to;
>>
>> Why so complicated?
Proposal:
>>
>> while (entry != nullptr && entry->_from != from) {
>>   entry = entry->_next;
>> }
>> if (entry == nullptr) {
>>   FallbackTableEntry* new_entry = NEW_C_HEAP_OBJ(FallbackTableEntry, mtGC);
>>   new_entry->next = head;
>>   new_entry->_from = from;
>>   head = entry = new_entry;
>> }
>> entry->_to = to;
>
> Uhm, so this would not change the actual head
>
>> Why so complicated? Proposal:
>>
>> ```
>> while (entry != nullptr && entry->_from != from) {
>>   entry = entry->_next;
>> }
>> if (entry == nullptr) {
>>   FallbackTableEntry* new_entry = NEW_C_HEAP_OBJ(FallbackTableEntry, mtGC);
>>   new_entry->next = head;
>>   new_entry->_from = from;
>>   head = entry = new_entry;
>> }
>> entry->_to = to;
>> ```
>
> Remember that head points into the array. We cannot actually prepend the new entry, we can only insert it as the first linked entry after head. If I see it correctly, it would not actually change the head-entry (the stuff in the array) except for its _to field. Also, the new_entry would not get linked anywhere. Or what am I missing?

Ah, sorry, I just realized you inlined the head elements into the table. Okay, never mind then.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183553784

From duke at openjdk.org Wed May 3 11:23:50 2023
From: duke at openjdk.org (olivergillespie)
Date: Wed, 3 May 2023 11:23:50 GMT
Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v2]
In-Reply-To:
References:
Message-ID:

> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event.
> > The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: > > > Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) > After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) > > > Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? > > Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? olivergillespie has updated the pull request incrementally with one additional commit since the last revision: Fix compile error ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13774/files - new: https://git.openjdk.org/jdk/pull/13774/files/b8b30b5e..88eb1ede Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13774.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774 PR: https://git.openjdk.org/jdk/pull/13774 From stuefe at openjdk.org Wed May 3 11:24:28 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 3 May 2023 11:24:28 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v20] In-Reply-To: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> References: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com> Message-ID: On Wed, 3 May 2023 10:54:43 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32-bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
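Editorial sketch: the bit layout described in the quoted PR text (bits 0..1 forwarded marker, bit 2 fallback, bit 3 target-region selector, bits 4..31 word offset) might be packed and unpacked roughly as below. This is not the code from the pull request; every name here (`TargetBases`, `encode_forwarding`, `decode_forwarding`, the `k...` constants) is hypothetical, and `HeapWord` is only a stand-in for one heap word.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for one heap word; an assumption, not the HotSpot type.
using HeapWord = std::uintptr_t;

// Layout taken from the description; constant names are hypothetical.
constexpr std::uint32_t kForwardedBits   = 0x3;            // bits 0..1: '11' = forwarded
constexpr std::uint32_t kFallbackBit     = 1u << 2;        // bit 2: use the fallback table
constexpr std::uint32_t kTargetSelectBit = 1u << 3;        // bit 3: target region 0 or 1
constexpr unsigned      kOffsetShift     = 4;              // bits 4..31: offset in heap words
constexpr std::uint32_t kMaxWordOffset   = (1u << 28) - 1; // 28 offset bits available

// Per-source-region table entry: start addresses of the two possible targets.
struct TargetBases {
  HeapWord* base[2];
};

// Pack a (target index, word offset) pair into the low 32 header bits.
inline std::uint32_t encode_forwarding(unsigned target_index, std::uint32_t word_offset) {
  assert(target_index < 2);
  assert(word_offset <= kMaxWordOffset);
  return (word_offset << kOffsetShift)
       | (target_index == 1 ? kTargetSelectBit : 0u)
       | kForwardedBits;
}

inline bool is_forwarded(std::uint32_t low_bits) {
  return (low_bits & kForwardedBits) == kForwardedBits;
}

inline bool uses_fallback(std::uint32_t low_bits) {
  return (low_bits & kFallbackBit) != 0;
}

// Recover the forwardee address from the encoded bits and the table entry.
inline HeapWord* decode_forwarding(std::uint32_t low_bits, const TargetBases& bases) {
  assert(is_forwarded(low_bits));
  assert(!uses_fallback(low_bits));  // fallback entries are resolved via the hash table instead
  const unsigned target_index = (low_bits & kTargetSelectBit) != 0 ? 1u : 0u;
  const std::uint32_t word_offset = low_bits >> kOffsetShift;
  return bases.base[target_index] + word_offset;
}
```

With 28 offset bits the sketch can address up to 2^28 heap words from a target base, which is presumably why the description needs equal-sized regions and a per-region table rather than a raw pointer.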
>> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > More Thomas' comments Okay, good so far. src/hotspot/share/gc/shared/slidingForwarding.cpp line 147: > 145: new_entry->_next = head->_next; > 146: new_entry->_from = head->_from; > 147: new_entry->_to = head->_to; You could probably just use assignment here, which does memberwise copy. `*new_entry = *head;` ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1410687480 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183556810 From tschatzl at openjdk.org Wed May 3 11:27:37 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 11:27:37 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v4] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. 
> > Also moves gc efficiency out of HeapRegion and associates it with the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang, iwalulya review fix inlining in g1CollectionSet.inline.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13666/files - new: https://git.openjdk.org/jdk/pull/13666/files/30a157ed..cdc63375 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=02-03 Stats: 30 lines in 8 files changed: 3 ins; 10 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From duke at openjdk.org Wed May 3 12:01:13 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 3 May 2023 12:01:13 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v3] In-Reply-To: References: Message-ID: 
<_4E0C256mo0CLZnMrKJq0JoCn7tsFppEGyGt-SsjH9A=.eaa625c9-f745-4341-87d0-97127820bb21@github.com> olivergillespie has updated the pull request incrementally with one additional commit since the last revision: Fix compile error ``` === Output from failing command(s) repeated here === * For target hotspot_variant-server_libjvm_objs_gcTrace.o: /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp: In member function 'void GCTracer::report_object_count_after_gc(BoolObjectClosure*)': /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:114:48: error: invalid use of incomplete type 'class CollectedHeap' 114 | WorkerThreads* workers = Universe::heap()->safepoint_workers(); | ^~ In file included from /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:35: /home/runner/work/jdk/jdk/src/hotspot/share/memory/universe.hpp:42:7: note: forward declaration of 'class CollectedHeap' 42 | class CollectedHeap; | ^~~~~~~~~~~~~ * All command lines available in /home/runner/work/jdk/jdk/build/linux-x64/make-support/failure-logs. 
=== End of repeated output === ``` ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13774/files - new: https://git.openjdk.org/jdk/pull/13774/files/88eb1ede..22b9b6d5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13774.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774 PR: https://git.openjdk.org/jdk/pull/13774 From shade at openjdk.org Wed May 3 12:06:15 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 3 May 2023 12:06:15 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v3] In-Reply-To: <_4E0C256mo0CLZnMrKJq0JoCn7tsFppEGyGt-SsjH9A=.eaa625c9-f745-4341-87d0-97127820bb21@github.com> References: <_4E0C256mo0CLZnMrKJq0JoCn7tsFppEGyGt-SsjH9A=.eaa625c9-f745-4341-87d0-97127820bb21@github.com> Message-ID: On Wed, 3 May 2023 12:01:13 GMT, olivergillespie wrote: 
> > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix compile error > > ``` > === Output from failing command(s) repeated here === > * For target hotspot_variant-server_libjvm_objs_gcTrace.o: > /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp: In member function 'void GCTracer::report_object_count_after_gc(BoolObjectClosure*)': > /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:114:48: error: invalid use of incomplete type 'class CollectedHeap' > 114 | WorkerThreads* workers = Universe::heap()->safepoint_workers(); > | ^~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:35: > /home/runner/work/jdk/jdk/src/hotspot/share/memory/universe.hpp:42:7: note: forward declaration of 'class CollectedHeap' > 42 | class CollectedHeap; > | ^~~~~~~~~~~~~ > > * All command lines available in /home/runner/work/jdk/jdk/build/linux-x64/make-support/failure-logs. > === End of repeated output === > ``` Why not just `hi.populate_table(&cit, is_alive_cl, ParallelGCThreads);`, and let the `populate_table` deal with the rest? I think we have a convention that `ParallelGCThreads` is roughly the proxy for the number of GC threads at paused operation. (It is weird that `HeapInspection::populate_table` uses `safepoint_workers` -- maybe that's for additional isolation from the GC threads -- let's not proliferate it here. `populate_table` also caps the worker count at `max_workers`, which answers one of your questions) ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1532905321 From duke at openjdk.org Wed May 3 12:16:13 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 3 May 2023 12:16:13 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: 
olivergillespie has updated the pull request incrementally with one additional commit since the last revision: Use ParallelGCThreads instead of active_workers ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13774/files - new: https://git.openjdk.org/jdk/pull/13774/files/22b9b6d5..711cb643 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=02-03 Stats: 9 lines in 1 file changed: 1 ins; 7 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13774.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774 PR: https://git.openjdk.org/jdk/pull/13774 From duke at openjdk.org Wed May 3 12:16:14 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 3 May 2023 12:16:14 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v3] In-Reply-To: References: <_4E0C256mo0CLZnMrKJq0JoCn7tsFppEGyGt-SsjH9A=.eaa625c9-f745-4341-87d0-97127820bb21@github.com> Message-ID: <7ZZt-QYarRp7IHjf8vYEvn8Asp2cmZ8KFwoQLwWKLy8=.bd45eca2-ed82-4cb8-b10c-c1c97b5fab14@github.com> On Wed, 3 May 2023 12:03:44 GMT, Aleksey Shipilev wrote: > Why not just `hi.populate_table(&cit, is_alive_cl, ParallelGCThreads);`, and 
let the `populate_table` deal with the rest? I think we have a convention that `ParallelGCThreads` is roughly the proxy for the number of GC threads at paused operation. > > (It is weird that `HeapInspection::populate_table` uses `safepoint_workers` -- maybe that's for additional isolation from the GC threads -- let's not proliferate it here. `populate_table` also caps the worker count at `max_workers`, which answers one of your questions) Thanks, that's fine by me, whatever is most idiomatic. Updated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1532917696 From rkennke at openjdk.org Wed May 3 12:21:44 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 12:21:44 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v21] In-Reply-To: References: Message-ID: <9ne0qVqlv8GjcVAZ76BIuMLqypEmpAhS-W_cHi_FRfE=.8491768a-6c51-4af9-a07f-d99863d634a5@github.com> 
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Refactor GCForwarding into SlidingForwarding ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/b623db55..568e5ea3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=20 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=19-20 Stats: 343 lines in 20 files changed: 87 ins; 184 del; 72 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Wed May 3 12:34:29 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 12:34:29 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v22] 
In-Reply-To: References: Message-ID: Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: - Place 'public' correctly - Use member assignment, instead of explicitly copying the struct - Set UseAltGCForwarding flag in test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/568e5ea3..7691eb81 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=21 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=20-21 Stats: 8 lines in 3 files changed: 4 ins; 3 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From ayang at openjdk.org Wed May 3 12:56:16 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 3 May 2023 12:56:16 GMT Subject: RFR: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code [v2] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 10:40:24 GMT, olivergillespie wrote: >> Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. 
>> >> Added output: >> >> Serial >> >> [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms >> [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms >> ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count >> ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms >> [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms >> [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses >> >> Parallel >> >> [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms >> [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms >> ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count >> ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms >> [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms >> [6.313s][info ][gc,phases,start] GC(12) Summary Phase >> >> G1 Full >> >> [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms >> [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms >> ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count >> ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms >> [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms >> [4.442s][info ][gc,phases,start ] GC(24) Phase 2: Prepare compaction > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Use correct holder var names Marked as reviewed by ayang (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13772#pullrequestreview-1410829908 From tschatzl at openjdk.org Wed May 3 13:53:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 13:53:27 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v3] In-Reply-To: References: Message-ID: On Wed, 26 Apr 2023 17:28:49 GMT, Chris Plummer wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> cplummer review > > SA changes look good. Thanks @plummercj @sspitsyn @albertnetymk for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/13643#issuecomment-1533064129 From tschatzl at openjdk.org Wed May 3 13:53:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 13:53:28 GMT Subject: Integrated: 8306836: Remove pinned tag for G1 heap regions In-Reply-To: References: Message-ID: <4wdBNSgTzWoVKhbSXY8vlBwj_3eE2pyB3knxVGWKDHk=.0225c1ae-6a26-4170-b2ea-1e85ea6e6a64@github.com> On Tue, 25 Apr 2023 13:49:05 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that removes the pinned tag from `HeapRegion`. > > So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. > > With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". > > The (current) pinned flag is surprisingly little used, only for policy decisions. 
> > The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). > > Testing: tier1-3, gha > > Thanks, > Thomas This pull request has now been integrated. Changeset: fc76687c Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/fc76687c2fac39fcbf706c419bfa170b8efa5747 Stats: 62 lines in 18 files changed: 5 ins; 31 del; 26 mod 8306836: Remove pinned tag for G1 heap regions Reviewed-by: ayang, cjplummer, sspitsyn ------------- PR: https://git.openjdk.org/jdk/pull/13643 From rkennke at openjdk.org Wed May 3 14:10:33 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 14:10:33 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v23] In-Reply-To: References: Message-ID: 
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Bunch of fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/7691eb81..f30039a0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=22 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=21-22 Stats: 13 lines in 3 files changed: 0 ins; 10 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From shade at openjdk.org Wed May 3 14:34:22 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 3 May 2023 14:34:22 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v23] In-Reply-To: References: Message-ID: <1JFhieDv0YPe9ntcx6S2IFzkMdj6NGQtuoWPcG0KXUU=.eb4cafe0-4086-49d2-9a6d-720aa9b2fe69@github.com> On Wed, 3 May 2023 14:10:33
GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. 
Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
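A rough sketch of the header encoding described in the quoted proposal (bits 0 and 1 as the forwarded marker, bit 2 for the fallback table, bit 3 as target-region selector, bits 4..31 as the word offset) might look like the following. This is only an illustration under assumptions: the names `encode_forwarding`, `decode_forwarding`, `TargetBases` and all constants are hypothetical and are not taken from the actual patch, and the fallback path is only detected here, not implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants for this sketch; the real patch may name or lay these out differently.
constexpr uint32_t kForwardedMask = 0x3u;           // bits 0-1: '11' marks a forwarded object
constexpr uint32_t kFallbackBit   = 1u << 2;        // bit 2: forwardee lives in the fallback table
constexpr uint32_t kRegionSelBit  = 1u << 3;        // bit 3: selects target region 0 or 1
constexpr int      kOffsetShift   = 4;              // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffset     = (1u << 28) - 1; // largest offset that fits in 28 bits

// Per-source-region table entry: start addresses of the two possible target regions.
struct TargetBases {
  uintptr_t base[2];
};

// Pack target selector and word offset into the low 32 bits of the header.
inline uint32_t encode_forwarding(unsigned target_idx, uint32_t offset_words) {
  assert(target_idx <= 1 && offset_words <= kMaxOffset);
  return (offset_words << kOffsetShift) |
         (target_idx != 0 ? kRegionSelBit : 0u) |
         kForwardedMask;
}

inline bool is_forwarded(uint32_t low_bits) {
  return (low_bits & kForwardedMask) == kForwardedMask;
}

inline bool is_fallback(uint32_t low_bits) {
  return (low_bits & kFallbackBit) != 0;
}

// Recover the forwardee address from the encoded bits and the source region's table entry.
inline uintptr_t decode_forwarding(uint32_t low_bits, const TargetBases& bases,
                                   size_t word_size = sizeof(void*)) {
  unsigned idx = (low_bits & kRegionSelBit) != 0 ? 1u : 0u;
  uintptr_t offset_words = low_bits >> kOffsetShift;
  return bases.base[idx] + offset_words * word_size;
}
```

With 28 offset bits and 8-byte heap words, such an encoding could address up to 2 GB within a target region, which is one way to read the remark that 32-bit headers would need a different scheme.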
>> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Bunch of fixes

Performance: the worst case I can come up with is Serial Full GC that moves the entire heap full of smallest objects, like this:

    public class Retain {
        static final int RETAINED = Integer.getInteger("retained", 10_000_000);
        static final int GCS = Integer.getInteger("gcs", 100);
        static Object[] OBJECTS = new Object[RETAINED];

        public static void main(String... args) {
            for (int t = 0; t < GCS; t++) {
                for (int c = 0; c < RETAINED; c++) {
                    OBJECTS[c] = new Object();
                }
                System.gc();
            }
        }
    }

On my `c6n.8xlarge` instance, with `java -Xmx1g -Xlog:gc -XX:+UseSerialGC Retain.java`, I see:

    baseline:                  364 +- 5 ms
    patched, -AltGCForwarding: 385 +- 3 ms [+6%]
    patched, +AltGCForwarding: 445 +- 5 ms [+22%]

There are regressions even with `-AltGCForwarding`, and judging from the profiles and the point experiments, those are caused by the `AltGCForwarding` flag checks for every `forward_to` and `forwardee`, split evenly between these two paths. But given the very targeted workload above running back-to-back Full GCs intentionally, this regression looks okay. (I think the only way to dodge it would be to template the bunch of GC code and dispatch to it once per GC phase, rather than per oop, which would be very intrusive and serve no practical need, IMO.)

The regression with `+AltGCForwarding` looks impressive in comparison: it is "only" worth three flag checks or so. The code I am seeing in profiles is already quite polished, so we would unlikely squeeze more from it without investing much more time. I don't think any of this would show up at larger benchmarks running in usual (young, mixed) GC modes.
Indeed, I ran a few point experiments, and there seems to be no visible change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1533133187 From tschatzl at openjdk.org Wed May 3 15:35:20 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 15:35:20 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associates it with the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list.
> > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - Merge branch 'master' into 8306541-refactor-cset-candidates - ayang, iwalulya review fix inlining in g1CollectionSet.inline.hpp - Merge branch 'master' into 8306541-refactor-cset-candidates - ayang review - remove unused methods - Whitespace fixes - typo - More cleanup - Cleanup - Cleanup - Refactor collection set candidates Improve the interface to collection set candidates and prepare for having collection set candidates at any time. Preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch only uses candidates from marking at this time. Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. * the collection set candidates set is not temporarily allocated any more, but the candidate set object must be available all the time. * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). * there are several additional helper sets/lists * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. All these sets implement C++ iterators for simpler use in various places. Everything else are changes to use these helper sets/lists throughout. Some additional FIXME for log messages to remove are in there. Please ignore. 
------------- Changes: https://git.openjdk.org/jdk/pull/13666/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=04 Stats: 1082 lines in 25 files changed: 617 ins; 219 del; 246 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From stuefe at openjdk.org Wed May 3 16:02:23 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 3 May 2023 16:02:23 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v23] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 14:10:33 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. 
>> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. 
I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Bunch of fixes We may squeeze out a bit more performance for the +AltGCForwarding case, eliminate one ref hop, by inlining the bases table into the SlidingForwarding object. Let the table follow the object, and maybe reduce some member sizes (either one of _num_regions, _region_size_word_shift can be 32bit, for instance). At least if we have very few regions that would let the whole table live on the same cache line. Simplest way to do that would be to add a fixed sized array to the object and use it as backing memory if bases table size is <= that array size, otherwise dynamically allocate it. Maybe not worth the work, up to you of course. 
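To illustrate the suggestion above (a fixed-size array inside the object used as backing memory when the bases table is small, with dynamic allocation otherwise), here is a minimal standalone sketch. It is an assumption-laden illustration, not the patch's code: the class name `BasesTable`, the inline capacity of 8 entries, and the use of `std::unique_ptr` (which HotSpot itself would not use) are all hypothetical choices made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical small-buffer bases table: two target base addresses per source region.
class BasesTable {
  static constexpr size_t kInlineEntries = 8;   // assumed inline capacity (4 regions)

  uintptr_t _inline_storage[kInlineEntries];    // lives inside the object itself
  std::unique_ptr<uintptr_t[]> _heap_storage;   // only allocated for larger tables
  uintptr_t* _bases;                            // points at whichever storage is in use
  size_t _num_entries;

public:
  explicit BasesTable(size_t num_regions)
    : _inline_storage{}, _heap_storage(), _bases(nullptr),
      _num_entries(num_regions * 2) {
    if (_num_entries <= kInlineEntries) {
      // Small table: entries share memory (and likely a cache line) with the object.
      _bases = _inline_storage;
    } else {
      // Large table: fall back to a zero-initialized heap allocation.
      _heap_storage.reset(new uintptr_t[_num_entries]());
      _bases = _heap_storage.get();
    }
  }

  // Copying would leave _bases pointing into another object's inline storage.
  BasesTable(const BasesTable&) = delete;
  BasesTable& operator=(const BasesTable&) = delete;

  bool uses_inline_storage() const { return _bases == _inline_storage; }

  // Access the base address of target 0 or 1 for the given source region.
  uintptr_t& base(size_t region_idx, unsigned target) {
    return _bases[region_idx * 2 + target];
  }
};
```

Whether this actually removes a dependent load depends on how the compiler and the surrounding code treat `_bases`; the sketch only shows the storage layout idea, and any performance benefit would need to be measured.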
------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1533303946 From stuefe at openjdk.org Wed May 3 16:11:23 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 3 May 2023 16:11:23 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v23] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 14:10:33 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
>> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Bunch of fixes Changes requested by stuefe (Reviewer). src/hotspot/share/gc/shared/slidingForwarding.cpp line 162: > 160: FallbackTableEntry* head = &_table[idx]; > 161: FallbackTableEntry* entry = head; > 162: // Search existing entry in chain starting at idx. You don't use the head node. You should use the head node before creating a new node. ------------- PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1411249288 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183900419 From rkennke at openjdk.org Wed May 3 17:47:24 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 17:47:24 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v24] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
> > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. 
> - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Flatten SlidingForwarding and use heads of FallbackTable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/f30039a0..fe0915e2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=23 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=22-23 Stats: 117 lines in 3 files changed: 19 ins; 43 del; 55 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Wed May 3 19:19:44 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 3 May 2023 19:19:44 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: References: Message-ID: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. 
> > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. 
Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix type narrowing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/fe0915e2..5ee17597 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=24 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=23-24 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From tschatzl at openjdk.org Wed May 3 21:42:24 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 21:42:24 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> Message-ID: <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> On Wed, 3 May 2023 19:19:44 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. 
>> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. 
>> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix type narrowing Changes requested by tschatzl (Reviewer). 
src/hotspot/share/gc/shared/gc_globals.hpp line 699: > 697: \ > 698: product(bool, UseAltGCForwarding, false, EXPERIMENTAL, \ > 699: "Use alternative GC forwarding that preserves object headers") \ I would strongly prefer if this were not a product flag at this time, but a develop flag. It potentially decreases performance of serial gc full gcs by a significant amount with no upside at all (not that worried about g1 or other concurrent gcs). Can you give me reasons why an end user would ever consciously enable this flag? Using a develop flag is only a minor annoyance for development - we already do that for other features like evacuation failure injection in G1. For end users this would result in (guaranteed) zero performance impact. Only when adding compressed object headers with Lilliput this should be changed to a product flag. I do not know your schedule for upstreaming Lilliput, but if it would miss JDK 21, people would suffer from this for the entire lifetime of JDK 21.... which is an LTS release. (Fwiw I would suggest the same for a non-LTS release, it seems to be worse in this situation though). src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 43: > 41: > 42: uint SlidingForwarding::region_index_containing(HeapWord* addr) { > 43: uint index = static_cast<uint>(pointer_delta(addr, _heap_start) >> _region_size_words_shift); I believe it is possible to bias the array pointer to avoid that subtraction of the `_heap_start` like we do e.g. for the card table. See also `G1BiasedArray` or so for a kind of ready-made class implementing this. Not sure it will help a lot, but at least remove the subtraction and the load of the `_heap_start` value. 
src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 59: > 57: // Primary is free > 58: _bases_table[base_idx] = to_region_base; > 59: } else if (_bases_table[base_idx] == to_region_base) { This probably won't help at all with performance, but I would kind of put the checks for the common cases where the table values are set (particularly the first one) first (I may be wrong about whether this is possible). The `UNUSED_BASE` values in the tables will be encountered exactly once... ------------- PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1411879100 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184309687 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184310440 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184313798 From tschatzl at openjdk.org Wed May 3 22:08:24 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 22:08:24 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> Message-ID: On Wed, 3 May 2023 21:30:31 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix type narrowing > > src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 43: > >> 41: >> 42: uint SlidingForwarding::region_index_containing(HeapWord* addr) { >> 43: uint index = static_cast<uint>(pointer_delta(addr, _heap_start) >> _region_size_words_shift); > > I believe it is possible to bias the array pointer to avoid that subtraction of the `_heap_start` like we do e.g. for the card table. 
See also `G1BiasedArray` or so for a kind of ready-made class implementing this. > > Not sure it will help a lot, but at least remove the subtraction and the load of the `_heap_start` value. Maybe possible ;) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184338484 From tschatzl at openjdk.org Wed May 3 22:08:25 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 3 May 2023 22:08:25 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> Message-ID: On Wed, 3 May 2023 19:19:44 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. 
Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). 
>> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix type narrowing src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 82: > 80: > 81: uintptr_t encoded = (offset << OFFSET_BITS_SHIFT) | > 82: (alt_region << ALT_REGION_SHIFT) | While I understand that a `bool` is typically encoded as either `0` or `1` (not sure if it's actually specified somewhere) it would likely make the code cleaner to use a real integer of some type here to me. Also, the shift could be inlined in the assignments above. Like setting `alt_region` to either `0 (<< ALT_REGION_SHIFT)` or `1 << ALT_REGION_SHIFT` directly in the code. This is obviously a nano-optimization that probably won't show up anywhere... 
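For readers following the encoding discussion, here is a minimal sketch of the bit layout from the PR description, using an integer (not a `bool`) for the region selector as suggested above. The constant values and the helper names (`encode_forwarding`, `decode_offset_words`, and so on) are assumptions made for illustration; only `OFFSET_BITS_SHIFT` and `ALT_REGION_SHIFT` appear by name in the quoted patch, and their values here are inferred from the description rather than taken from the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bit positions follow the PR description; the concrete values and the
// helper names are assumptions for this sketch, not the patch itself.
constexpr uintptr_t MARK_FORWARDED    = 0x3;  // bits 0..1: '11' means forwarded
constexpr int       FALLBACK_SHIFT    = 2;    // bit 2: look up in the fallback table
constexpr int       ALT_REGION_SHIFT  = 3;    // bit 3: selects target region 0 or 1
constexpr int       OFFSET_BITS_SHIFT = 4;    // bits 4..31: offset in heap words
constexpr uintptr_t OFFSET_MASK       = (uintptr_t(1) << 28) - 1;

// alt_region is an integer that is either 0 or 1, so the shift below does
// not rely on how a bool converts to an integer.
inline uintptr_t encode_forwarding(size_t offset_words, uintptr_t alt_region) {
  assert(offset_words <= OFFSET_MASK);
  assert(alt_region <= 1);
  return (uintptr_t(offset_words) << OFFSET_BITS_SHIFT) |
         (alt_region << ALT_REGION_SHIFT) |
         MARK_FORWARDED;
}

inline bool is_forwarded(uintptr_t encoded) {
  return (encoded & MARK_FORWARDED) == MARK_FORWARDED;
}

inline bool is_fallback(uintptr_t encoded) {
  return ((encoded >> FALLBACK_SHIFT) & 1) != 0;
}

inline uintptr_t decode_alt_region(uintptr_t encoded) {
  return (encoded >> ALT_REGION_SHIFT) & 1;
}

inline size_t decode_offset_words(uintptr_t encoded) {
  return static_cast<size_t>((encoded >> OFFSET_BITS_SHIFT) & OFFSET_MASK);
}
```

Keeping the shift in the encoder, rather than pre-shifting the selector where it is assigned, keeps every bit position visible in one expression.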
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184337013 From kbarrett at openjdk.org Thu May 4 05:35:26 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 May 2023 05:35:26 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v6] In-Reply-To: References: Message-ID: On Tue, 2 May 2023 16:47:06 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). >> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Remove is_young_gc_movable Sorry to be late to the review. I noticed a problem in a comment. 
------------- PR Review: https://git.openjdk.org/jdk/pull/13643#pullrequestreview-1407061087 From kbarrett at openjdk.org Thu May 4 05:35:29 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 4 May 2023 05:35:29 GMT Subject: RFR: 8306836: Remove pinned tag for G1 heap regions [v4] In-Reply-To: References: Message-ID: <6sav8G_h5tJF6Chc-hLzW2k_7WtHPc6uk5Fr7zmuGSM=.bcece9ab-0f31-4f84-8dcf-05e530cac9df@github.com> On Thu, 27 Apr 2023 12:31:24 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that removes the pinned tag from `HeapRegion`. >> >> So that "pinned" tag for G1 heap regions indicates that the region should not move during (young) gc. This applies to now removed archive regions and humongous objects/regions. >> >> With "real" g1 region pinning to deal with gclocker in g1 once and for all upcoming we need a refcount, a single bit is not sufficient anymore. Further there will be a naming conflict as this kind of "pinning" is different to g1 region pinning "pinning". The former indicates "contents can not be moved, but can be reclaimed", while the latter means "contents can not be moved and not reclaimed". >> >> The (current) pinned flag is surprisingly little used, only for policy decisions. >> >> The suggestion this change implements is to remove the "pinned" tag as it is, and reserve it for future g1 region pinning (that needs to store the pinning attribute differently as a refcount anyway). >> >> Testing: tier1-3, gha >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > remove is_young_gc_movable in full gc code src/hotspot/share/gc/g1/g1CollectionSetChooser.hpp line 57: > 55: // Determine whether to add the given region to the collection set candidates or > 56: // not. Currently, we skip regions that we will never move during young gc, and > 57: // regions which liveness is below the occupancy threshold. 
s/liveness is below/liveness is over/ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13643#discussion_r1181174243 From rkennke at openjdk.org Thu May 4 06:01:25 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 06:01:25 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> Message-ID: On Wed, 3 May 2023 21:29:20 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix type narrowing > > src/hotspot/share/gc/shared/gc_globals.hpp line 699: > >> 697: \ >> 698: product(bool, UseAltGCForwarding, false, EXPERIMENTAL, \ >> 699: "Use alternative GC forwarding that preserves object headers") \ > > I would strongly prefer if this were not a product flag at this time, but a develop flag. > > It potentially decreases performance of serial gc full gcs by a significant amount with no upside at all (not that worried about g1 or other concurrent gcs). Can you give me reasons why an end user would ever consciously enable this flag? > > Using a develop flag is only a minor annoyance for development - we already do that for other features like evacuation failure injection in G1. For end users this would result in (guaranteed) zero performance impact. > > Only when adding compressed object headers with Lilliput this should be changed to a product flag. > > I do not know your schedule for upstreaming Lilliput, but if it would miss JDK 21, people would suffer from this for the entire lifetime of JDK 21.... which is an LTS release. (Fwiw I would suggest the same for a non-LTS release, it seems to be worse in this situation though). 
Ok that is reasonable, I will do that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184571039 From rkennke at openjdk.org Thu May 4 06:01:27 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 06:01:27 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> Message-ID: On Wed, 3 May 2023 22:05:43 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 43: >> >>> 41: >>> 42: uint SlidingForwarding::region_index_containing(HeapWord* addr) { >>> 43: uint index = static_cast<uint>(pointer_delta(addr, _heap_start) >> _region_size_words_shift); >> >> I believe it is possible to bias the array pointer to avoid that subtraction of the `_heap_start` like we do e.g. for the card table. See also `G1BiasedArray` or so for a kind of ready-made class implementing this. >> >> Not sure it will help a lot, but at least remove the subtraction and the load of the `_heap_start` value. > > Maybe possible ;) I don't think so. The biasing in G1 GC (and Shenandoah GC) uses an array to look up per-region stuff (like cset property) without first calculating the actual region index. Instead, it allows to simply shift an address and use that biased index to address the biased array. Here we don't have an array, we only want the index of the region that contains the address. 
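To make the point about the index computation concrete, a minimal standalone sketch follows. It assumes a stand-in `HeapWord` type and a hypothetical `RegionIndexer` struct; plain pointer subtraction takes the place of HotSpot's `pointer_delta`. The value being computed is the region index itself, so there is no table base pointer here into which the `heap_start` subtraction could be folded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for HotSpot's HeapWord: one machine word of heap memory.
using HeapWord = uintptr_t;

// Hypothetical holder for the two values the quoted code reads.
struct RegionIndexer {
  HeapWord* heap_start;              // lowest address of the heap
  unsigned  region_size_words_shift; // log2 of the region size in words

  // Index of the equal-sized region that contains addr.
  unsigned region_index_containing(HeapWord* addr) const {
    assert(addr >= heap_start);
    // Distance from the heap start in words, then divide by the region size.
    size_t delta_words = static_cast<size_t>(addr - heap_start);
    return static_cast<unsigned>(delta_words >> region_size_words_shift);
  }
};
```

With 16-word regions (shift of 4), words 0..15 map to region 0, words 16..31 to region 1, and so on.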
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184571191 From rkennke at openjdk.org Thu May 4 06:05:25 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 06:05:25 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> Message-ID: On Wed, 3 May 2023 21:35:03 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix type narrowing > > src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 59: > >> 57: // Primary is free >> 58: _bases_table[base_idx] = to_region_base; >> 59: } else if (_bases_table[base_idx] == to_region_base) { > > This probably won't help at all with performance, but I would kind of put the checks for the common cases where the table values are set (particularly the first one) first (I may be wrong about whether this is possible). > The `UNUSED_BASE` values in the tables will be encountered exactly once... I believe we can safely swap the UNUSED with the primary check. I'll do that. 
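The reordering agreed on above can be sketched as follows. This is an illustration under stated assumptions, not the patch: `BasesTable`, `select_target`, `NEEDS_FALLBACK`, and the use of `nullptr` for `UNUSED_BASE` are all hypothetical, and the sketch assumes a real target base never equals the sentinel. Checking the "already set and matching" cases first is safe under that assumption, because an unused slot can never compare equal to a real base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using HeapWord = uintptr_t;  // stand-in for HotSpot's HeapWord

// Sentinel for an empty slot. The name follows the review comment; using
// nullptr as its value is an assumption of this sketch.
static HeapWord* const UNUSED_BASE = nullptr;

constexpr size_t NUM_TARGET_REGIONS = 2;   // two possible targets per source region
constexpr int    NEEDS_FALLBACK     = -1;  // hypothetical "use fallback table" result

struct BasesTable {
  std::vector<HeapWord*> bases;

  explicit BasesTable(size_t num_regions)
    : bases(num_regions * NUM_TARGET_REGIONS, UNUSED_BASE) {}

  // Returns 0 or 1 for the slot holding to_region_base, claiming a free
  // slot if needed, or NEEDS_FALLBACK when both slots hold other bases.
  int select_target(size_t from_region_idx, HeapWord* to_region_base) {
    assert(to_region_base != UNUSED_BASE);
    size_t base_idx = from_region_idx * NUM_TARGET_REGIONS;
    // Common cases first: the slot is already set and matches.
    if (bases[base_idx] == to_region_base)     return 0;
    if (bases[base_idx + 1] == to_region_base) return 1;
    // Rare cases: each slot is claimed at most once per source region.
    if (bases[base_idx] == UNUSED_BASE) {
      bases[base_idx] = to_region_base;
      return 0;
    }
    if (bases[base_idx + 1] == UNUSED_BASE) {
      bases[base_idx + 1] = to_region_base;
      return 1;
    }
    return NEEDS_FALLBACK;
  }
};
```

The returned slot number is what would go into the region-select bit of the encoded forwarding; a third distinct target for the same source region is the situation the PR description handles with the fallback table.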
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184573418 From rkennke at openjdk.org Thu May 4 06:09:26 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 06:09:26 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> Message-ID: <3DQM2ay8VGdLVhxa2iCqOOS3KX3AvXyoq_w3t228Sm0=.9385d002-3124-40ac-bddf-5340015dfed5@github.com> On Wed, 3 May 2023 22:04:08 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix type narrowing > > src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 82: > >> 80: >> 81: uintptr_t encoded = (offset << OFFSET_BITS_SHIFT) | >> 82: (alt_region << ALT_REGION_SHIFT) | > > While I understand that a `bool` is typically encoded as either `0` or `1` (not sure if it's actually specified somewhere) it would likely make the code cleaner to use a real integer of some type here to me. > > Also, the shift could be inlined in the assignments above. > Like setting `alt_region` to either `0 (<< ALT_REGION_SHIFT)` or `1 << ALT_REGION_SHIFT` directly in the code. > This is obviously a nano-optimization that probably won't show up anywhere... Oh yes, I'll change it to an integral value. I don't see how moving the shift to the assignment would help, and I'd prefer to keep it in the place where we encode the value, I think that is more readable/less confusing. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184575641 From rkennke at openjdk.org Thu May 4 06:30:01 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 06:30:01 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v26] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
> > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Switch back to size_t for some fields - Address @tschatzl's review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/5ee17597..2762f1b1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=25 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=24-25 Stats: 40 lines in 4 files changed: 4 ins; 4 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Thu May 4 07:04:19 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 07:04:19 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v27] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. 
Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. 
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>
> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix release build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/2762f1b1..0cc732ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=26 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=25-26 Stats: 4 lines in 2 files changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From tschatzl at openjdk.org Thu May 4 07:56:17 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 4 May 2023 07:56:17 GMT Subject: RFR: 8307421: Fix comment in g1CollectionSetChooser.hpp after JDK-8306836 Message-ID: Hi all, please review this trivial comment fix @kimbarrett noticed while reviewing the [JDK-8306836](https://bugs.openjdk.org/browse/JDK-8306836) change after having it pushed. 
Testing: local compilation ------------- Commit messages: - fix comment, kbarrett finding Changes: https://git.openjdk.org/jdk/pull/13793/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13793&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307421 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13793.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13793/head:pull/13793 PR: https://git.openjdk.org/jdk/pull/13793 From duke at openjdk.org Thu May 4 09:22:26 2023 From: duke at openjdk.org (olivergillespie) Date: Thu, 4 May 2023 09:22:26 GMT Subject: Integrated: 8307346 - Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code In-Reply-To: References: Message-ID: On Wed, 3 May 2023 10:04:14 GMT, olivergillespie wrote: > Add logging for the time taken to collect/report ObjectCount(AfterGC) event to Serial, Parallel and G1 full collectors, for parity with the G1 concurrent collector. This event can be *very* expensive to collect (single threaded STW full heap scan), so it's very important to show it clearly to the user. 
> > Added output: > > Serial > > [7.976s][debug][gc,phases ] GC(7) Trigger cleanups 0.002ms > [7.977s][debug][gc,phases ] GC(7) Class Unloading 0.556ms > ++ [7.977s][debug][gc,phases,start] GC(7) Report Object Count > ++ [8.529s][debug][gc,phases ] GC(7) Report Object Count 552.065ms > [8.529s][info ][gc,phases ] GC(7) Phase 1: Mark live objects 1066.882ms > [8.529s][info ][gc,phases,start] GC(7) Phase 2: Compute new object addresses > > Parallel > > [5.786s][debug][gc,phases ] GC(12) Trigger cleanups 0.002ms > [5.786s][debug][gc,phases ] GC(12) Class Unloading 0.556ms > ++ [5.786s][debug][gc,phases,start] GC(12) Report Object Count > ++ [6.313s][debug][gc,phases ] GC(12) Report Object Count 526.307ms > [6.313s][info ][gc,phases ] GC(12) Marking Phase 889.900ms > [6.313s][info ][gc,phases,start] GC(12) Summary Phase > > G1 Full > > [3.922s][debug][gc,phases ] GC(24) Trigger cleanups 0.002ms > [3.922s][debug][gc,phases ] GC(24) Phase 1: Class Unloading and Cleanup 0.159ms > ++ [3.922s][debug][gc,phases,start ] GC(24) Report Object Count > ++ [4.442s][debug][gc,phases ] GC(24) Report Object Count 519.292ms > [4.442s][info ][gc,phases ] GC(24) Phase 1: Mark live objects 653.209ms > [4.442s][info ][gc,phases,start ] GC(24) Phase 2: Prepare compaction This pull request has now been integrated. 
Changeset: 3f1927a7
Author: Oli Gillespie
Committer: Aleksey Shipilev
URL: https://git.openjdk.org/jdk/commit/3f1927a7f3a2914402a25335c47a5a8bdd5511a6
Stats: 12 lines in 3 files changed: 9 ins; 0 del; 3 mod

8307346: Add missing gc+phases logging for ObjectCount(AfterGC) JFR event collection code

Reviewed-by: tschatzl, shade, ayang

-------------

PR: https://git.openjdk.org/jdk/pull/13772

From eosterlund at openjdk.org Thu May 4 09:37:28 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Thu, 4 May 2023 09:37:28 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v12]
In-Reply-To:
References: <7wh9oQNWTga_roKBanu2246zZ53TTL3HyoeAdnVQYpk=.f69973bf-821c-47dc-a8df-f4b47af1c9a6@github.com> <3H0IufrunNdAbYOGOTTp9oLepzSV7so-xIBrWUXoFkU=.980fe927-a804-48be-9c5a-aa7de4b8142e@github.com>
Message-ID:

On Fri, 28 Apr 2023 17:52:33 GMT, Erik Österlund wrote:

>> It seems to be used in a couple of places already:
>>
>> grep -R ff51afd7ed558ccd src
>> src/jdk.jfr/share/classes/jdk/jfr/internal/EventWriterKey.java: z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
>> src/java.base/share/classes/java/util/SplittableRandom.java: z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL; // MurmurHash3 mix constants
>> src/java.base/share/classes/java/util/concurrent/ThreadLocalRandom.java: z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
>> src/java.base/share/classes/jdk/internal/util/random/RandomSupport.java: z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
>> src/hotspot/share/gc/shared/slidingForwarding.cpp: val *= 0xff51afd7ed558ccdULL;
>> src/hotspot/share/gc/shenandoah/shenandoahEvacOOMHandler.cpp: key *= UINT64_C(0xff51afd7ed558ccd);
>
> Sounds good then.

If you wouldn't mind, I think this one is even better: https://github.com/iwanowww/jdk/blob/ssc.cascading/src/hotspot/share/oops/klass.cpp?plain=1#L309
@iwanowww is using it for faster type checking. Our very own @rose00 is behind this one, so we can definitely use it. It performs very well (~1ns per hash).
The hash algo passes BigCrush (as a CBPRNG) and SMhasher (with the right loop to combine the input blocks). It's basically a fantastic hash function, that we are free to use.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184778234

From rkennke at openjdk.org Thu May 4 10:53:26 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 10:53:26 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v12]
In-Reply-To:
References: <7wh9oQNWTga_roKBanu2246zZ53TTL3HyoeAdnVQYpk=.f69973bf-821c-47dc-a8df-f4b47af1c9a6@github.com> <3H0IufrunNdAbYOGOTTp9oLepzSV7so-xIBrWUXoFkU=.980fe927-a804-48be-9c5a-aa7de4b8142e@github.com>
Message-ID: <4-Ztc15xMXxNX5tiyMT-9YokDi65JlIuVX1TFuC__-s=.64094113-c745-4376-8a88-ab75d5cf0beb@github.com>

On Thu, 4 May 2023 09:34:27 GMT, Erik Österlund wrote:

>> Sounds good then.
>
> If you wouldn't mind, I think this one is even better: https://github.com/iwanowww/jdk/blob/ssc.cascading/src/hotspot/share/oops/klass.cpp?plain=1#L309
> @iwanowww is using it for faster type checking. Our very own @rose00 is behind this one, so we can definitely use it. It performs very well (~1ns per hash). The hash algo passes BigCrush (as a CBPRNG) and SMhasher (with the right loop to combine the input blocks). It's basically a fantastic hash function, that we are free to use.

Ok that is great stuff! Maybe it'd be useful to move it to a central place with this PR (alternative full GC fwding), because we're going to need it for other purposes, too (other GC tables, i-hash, faster type checking, maybe more?)
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184854210

From eosterlund at openjdk.org Thu May 4 11:00:29 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Thu, 4 May 2023 11:00:29 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v12]
In-Reply-To: <4-Ztc15xMXxNX5tiyMT-9YokDi65JlIuVX1TFuC__-s=.64094113-c745-4376-8a88-ab75d5cf0beb@github.com>
References: <7wh9oQNWTga_roKBanu2246zZ53TTL3HyoeAdnVQYpk=.f69973bf-821c-47dc-a8df-f4b47af1c9a6@github.com> <3H0IufrunNdAbYOGOTTp9oLepzSV7so-xIBrWUXoFkU=.980fe927-a804-48be-9c5a-aa7de4b8142e@github.com> <4-Ztc15xMXxNX5tiyMT-9YokDi65JlIuVX1TFuC__-s=.64094113-c745-4376-8a88-ab75d5cf0beb@github.com>
Message-ID: <0DF0LOv43PBl8SDn7spkqCULAqEEwOjjgBZS4ewHF7U=.6d38770e-1849-42db-9274-2adaae264a80@github.com>

On Thu, 4 May 2023 10:50:17 GMT, Roman Kennke wrote:

>> If you wouldn't mind, I think this one is even better: https://github.com/iwanowww/jdk/blob/ssc.cascading/src/hotspot/share/oops/klass.cpp?plain=1#L309
>> @iwanowww is using it for faster type checking. Our very own @rose00 is behind this one, so we can definitely use it. It performs very well (~1ns per hash). The hash algo passes BigCrush (as a CBPRNG) and SMhasher (with the right loop to combine the input blocks). It's basically a fantastic hash function, that we are free to use.
>
> Ok that is great stuff! Maybe it'd be useful to move it to a central place with this PR (alternative full GC fwding), because we're going to need it for other purposes, too (other GC tables, i-hash, faster type checking, maybe more?)
>
> Oh and that code is using __int128 type, how/where do I get that outside of GCC?

Yes - great idea. Maybe somewhere in utilities. We might swap to it with ZGC as well when things settle down there.
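
On the quoted `__int128` question: one common way to get a 64x64 -> 128-bit product without compiler-specific types is to split both operands into 32-bit halves. The sketch below shows that technique only; the names `U128` and `mul_64x64` are made up for illustration, and this is not necessarily how the PR solved it.

```cpp
#include <cstdint>

// Hypothetical result type: high and low 64 bits of a 128-bit product.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// Portable 64x64 -> 128-bit unsigned multiplication using only 64-bit arithmetic.
inline U128 mul_64x64(uint64_t a, uint64_t b) {
  const uint64_t mask32 = 0xFFFFFFFFULL;
  const uint64_t a_lo = a & mask32, a_hi = a >> 32;
  const uint64_t b_lo = b & mask32, b_hi = b >> 32;

  // Four 32x32 -> 64-bit partial products; none of them can overflow.
  const uint64_t p0 = a_lo * b_lo;
  const uint64_t p1 = a_lo * b_hi;
  const uint64_t p2 = a_hi * b_lo;
  const uint64_t p3 = a_hi * b_hi;

  // Sum of everything that lands in bits 32..63, including carries.
  // At most 3 * (2^32 - 1), so it fits into 64 bits.
  const uint64_t mid = (p0 >> 32) + (p1 & mask32) + (p2 & mask32);

  U128 r;
  r.lo = (mid << 32) | (p0 & mask32);
  r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
  return r;
}
```

Compilers that do provide a 128-bit integer type or a wide-multiply intrinsic will usually generate better code from those, so a real implementation would probably select between the two at compile time.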
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184860529 From rkennke at openjdk.org Thu May 4 11:27:35 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 11:27:35 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v28] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above)
> - Bits 4..31: The number of heap words from the target base address
>
> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1: there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
> > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Use @rose00's fast-hash impl instead of murmur ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/0cc732ed..ad9fb171 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=27 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=26-27 Stats: 106 lines in 2 files changed: 93 ins; 12 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Thu May 4 11:27:36 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 11:27:36 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v12] In-Reply-To: <0DF0LOv43PBl8SDn7spkqCULAqEEwOjjgBZS4ewHF7U=.6d38770e-1849-42db-9274-2adaae264a80@github.com> References: <7wh9oQNWTga_roKBanu2246zZ53TTL3HyoeAdnVQYpk=.f69973bf-821c-47dc-a8df-f4b47af1c9a6@github.com> <3H0IufrunNdAbYOGOTTp9oLepzSV7so-xIBrWUXoFkU=.980fe927-a804-48be-9c5a-aa7de4b8142e@github.com> <4-Ztc15xMXxNX5tiyMT-9YokDi65JlIuVX1TFuC__-s=.64094113-c745-4376-8a88-ab75d5cf0beb@github.com> <0DF0LOv43PBl8SDn7spkqCULAqEEwOjjgBZ S4ewHF7U=.6d38770e-1849-42db-9274-2adaae264a80@github.com> Message-ID: On Thu, 4 May 
2023 10:57:07 GMT, Erik Österlund wrote:

>> Ok that is great stuff! Maybe it'd be useful to move it to a central place with this PR (alternative full GC fwding), because we're going to need it for other purposes, too (other GC tables, i-hash, faster type checking, maybe more?)
>>
>> Oh and that code is using __int128 type, how/where do I get that outside of GCC?
>
> Yes - great idea. Maybe somewhere in utilities. We might swap to it with ZGC as well when things settle down there.

Ok, I pushed a change that uses @rose00's better hashing. I added/changed the 128-bit multiplication to (hopefully) make it portable. Let's see what GHA has to say about this ;-)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184882541

From rkennke at openjdk.org Thu May 4 11:40:14 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 11:40:14 GMT
Subject: RFR: 8307395: Add missing STS to Shenandoah
Message-ID:

Testing in project Lilliput has revealed that Shenandoah GC is lacking one STS. This causes a reliable crash (with Lilliput) when running TestGCBasherWithShenandoah.java with -XX:+UseHeavyMonitors because it touches an already deflated monitor.
Testing (all in Lilliput where it caused the troubles, but applies to upstream as well): - [x] TestGCBasherWithShenandoah.java +UseHeavyMonitors - [x] hotspot_gc_shenandoah +UseHeavyMonitors ------------- Commit messages: - 8307395: Add missing STS to Shenandoah Changes: https://git.openjdk.org/jdk/pull/13799/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13799&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307395 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13799.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13799/head:pull/13799 PR: https://git.openjdk.org/jdk/pull/13799 From rkennke at openjdk.org Thu May 4 11:56:22 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 11:56:22 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v29] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. 
>
> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>
> With this, forwarding information would be encoded like this:
> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
> - Bit 2: Used for 'fallback'-forwarding (see below)
> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
> - Bits 4..31: The number of heap words from the target base address
>
> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1: there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed.
Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Add usual header include guards ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/ad9fb171..0f3604aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=28 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=27-28 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From shade at openjdk.org Thu May 4 12:24:15 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 4 May 2023 12:24:15 GMT Subject: RFR: 8307395: Add missing STS to Shenandoah In-Reply-To: References: Message-ID: <9NYMLTyK8H2Ruo-8pZ8UtOR0m_tjv7rXucsYlYIhFUs=.3495923c-6f52-42e1-90b9-9b8930f111d3@github.com> On Thu, 4 May 2023 
11:34:15 GMT, Roman Kennke wrote: > Testing in project Lilliput has revealed that Shenandoah GC is lacking one STS. This causes a reliable crash (with Lilliput) when running TestGCBasherWithShenandoah.java with -XX:+UseHeavyMonitors because it touches an already deflated monitor. > > Testing (all in Lilliput where it caused the troubles, but applies to upstream as well): > - [x] TestGCBasherWithShenandoah.java +UseHeavyMonitors > - [x] hotspot_gc_shenandoah +UseHeavyMonitors Looks fine, provided testing is clean. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13799#pullrequestreview-1412969497 From shade at openjdk.org Thu May 4 13:28:16 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 4 May 2023 13:28:16 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 12:16:13 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? 
> > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Use ParallelGCThreads instead of active_workers I looked if there might be a better option, like passing the `WorkerThreads*` from callers to actually figure out the number of active workers from the GC itself, but all of this cuts rather deep. I would say that should be done in a separate PR. `ParallelGCThreads` should work well meanwhile. @albertnetymk, @tschatzl might have an opinion here. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1413089971 From shade at openjdk.org Thu May 4 13:29:14 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 4 May 2023 13:29:14 GMT Subject: RFR: 8307421: Fix comment in g1CollectionSetChooser.hpp after JDK-8306836 In-Reply-To: References: Message-ID: On Thu, 4 May 2023 07:49:43 GMT, Thomas Schatzl wrote: > Hi all, > > please review this trivial comment fix @kimbarrett noticed while reviewing the [JDK-8306836](https://bugs.openjdk.org/browse/JDK-8306836) change after having it pushed. > > Testing: local compilation Looks fine. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13793#pullrequestreview-1413092518 From rkennke at openjdk.org Thu May 4 13:56:07 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 13:56:07 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v30] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. 
Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. 
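
The fallback table mentioned above could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: it reads "open hash-table" as open addressing with linear probing, it uses the MurmurHash3 64-bit finalizer (whose first constant, 0xff51afd7ed558ccd, is the one discussed elsewhere in this thread) as the hash, and the class and method names are hypothetical. It is deliberately single-threaded, matching the statement that G1 serial compaction is single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fallback forwarding table: maps an object's old address to its
// new address when the forwarding does not fit the two-target-region encoding.
class FallbackTableSketch {
  struct Entry {
    uintptr_t from = 0;  // 0 marks an empty slot (no object lives at address 0)
    uintptr_t to   = 0;
  };

  std::vector<Entry> _entries;
  size_t _mask;
  size_t _count = 0;

  // MurmurHash3 64-bit finalizer ("fmix64").
  static uint64_t mix(uint64_t h) {
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;
    return h;
  }

public:
  // capacity must be a power of two and larger than the number of entries stored.
  explicit FallbackTableSketch(size_t capacity)
    : _entries(capacity), _mask(capacity - 1) {
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
  }

  void forward_to(uintptr_t from, uintptr_t to) {
    assert(from != 0);
    size_t i = static_cast<size_t>(mix(from)) & _mask;
    while (_entries[i].from != 0 && _entries[i].from != from) {
      i = (i + 1) & _mask;  // linear probing
    }
    if (_entries[i].from == 0) {
      // Always keep one slot empty so that probing terminates.
      assert(_count + 1 < _entries.size());
      _entries[i].from = from;
      _count++;
    }
    _entries[i].to = to;
  }

  // Returns the recorded forwardee, or 0 if 'from' has no fallback entry.
  uintptr_t forwardee(uintptr_t from) const {
    size_t i = static_cast<size_t>(mix(from)) & _mask;
    while (_entries[i].from != 0) {
      if (_entries[i].from == from) {
        return _entries[i].to;
      }
      i = (i + 1) & _mask;
    }
    return 0;
  }
};
```

A caller would consult such a table only when Bit 2 is set in the forwarding bits, so the common path never pays for the hash lookup.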
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>
> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>
> Testing:
> - [x] hotspot_gc -UseAltGCForwarding
> - [x] hotspot_gc +UseAltGCForwarding
> - [x] tier1 -UseAltGCForwarding
> - [x] tier1 +UseAltGCForwarding
> - [x] tier2 -UseAltGCForwarding
> - [x] tier2 +UseAltGCForwarding

Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:

Clamp home index. Duh.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/0f3604aa..c3b9ae9e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=29 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=28-29 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From stuefe at openjdk.org Thu May 4 13:56:09 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 4 May 2023 13:56:09 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v30] In-Reply-To: References: Message-ID: <7Sc3jztrzRq8rXZh4l3V4LdyBnR33qvKh8OZCTBQZbY=.9120353e-0db0-4135-81ea-30732ced0fb0@github.com> On Thu, 4 May 2023 13:51:53 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. 
>>
>> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31 The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed.
>> Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Clamp home index. Duh.

LGTM. I ok this now; my remaining comments are suggestions - up to you to take them or not.

I removed some of my obsolete comments to clear the space.

Tests are missing. A simple way would be to run a selection of our standard GC tests with +AltGCForwarding. This is especially important if you follow Thomas' advice and make AltGCForwarding a develop switch.

The only other thing that occurred to me is that you could probably change initialization: don't require caller to specify it but calculate it yourself such that the 28 bit offset is maximally used. That would save some memory since the bases table can be smaller. Again, up to you.
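For readers following the bit layout in the quoted proposal, here is a minimal C++ sketch of how such a 32-bit forwarding value could be packed and unpacked. It is not the code from the PR: the constant and function names (`FORWARDED_BITS`, `encode_forwarding`, `decode_forwarding`, `encode_fallback`) are hypothetical, offsets are plain word counts rather than `HeapWord*` differences, and the table lookup that turns the region selector into a base address is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants mirroring the bit layout described in the proposal.
constexpr uint32_t FORWARDED_BITS   = 0x3;       // bits 0..1: '11' marks a forwarded object
constexpr uint32_t FALLBACK_BIT     = 1u << 2;   // bit 2: forwardee lives in the fallback table
constexpr uint32_t ALT_REGION_BIT   = 1u << 3;   // bit 3: selects target region 0 or 1
constexpr int      OFFSET_SHIFT     = 4;         // bits 4..31: offset in heap words
constexpr int      OFFSET_BITS      = 28;
constexpr size_t   MAX_OFFSET_WORDS = size_t(1) << OFFSET_BITS;

// Pack the target-region selector (0 or 1) and the word offset from that
// region's base address into the low 32 bits of a header.
inline uint32_t encode_forwarding(unsigned region_select, size_t offset_words) {
  assert(region_select <= 1);
  assert(offset_words < MAX_OFFSET_WORDS);
  return (static_cast<uint32_t>(offset_words) << OFFSET_SHIFT)
       | (region_select != 0 ? ALT_REGION_BIT : 0u)
       | FORWARDED_BITS;
}

// Encoding used when the forwardee has to be looked up in the fallback table.
inline uint32_t encode_fallback() {
  return FALLBACK_BIT | FORWARDED_BITS;
}

struct Decoded {
  bool     fallback;       // true: consult the fallback table instead
  unsigned region_select;  // which of the two recorded target regions
  size_t   offset_words;   // heap words from the target region's base
};

inline Decoded decode_forwarding(uint32_t encoded) {
  assert((encoded & FORWARDED_BITS) == FORWARDED_BITS);
  return Decoded{ (encoded & FALLBACK_BIT) != 0,
                  (encoded & ALT_REGION_BIT) != 0 ? 1u : 0u,
                  static_cast<size_t>(encoded >> OFFSET_SHIFT) };
}
```

With this layout the offset field can hold up to 2^28 heap words, so, assuming 8-byte heap words, a target region of up to 2 GiB is addressable from its base. That appears to be the bound that the suggestion about maximally using the 28-bit offset refers to.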
src/hotspot/share/gc/shared/slidingForwarding.cpp line 127:

> 125: size_t FallbackTable::home_index(HeapWord* from) {
> 126: uint64_t val = reinterpret_cast<uint64_t>(from);
> 127: uint64_t hash = FastHash::get_hash64(val, 0xAAAAAAAAAAAAAAAA);

Use UCONST64(0xAAA..AA) ?

-------------

Marked as reviewed by stuefe (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1413025879
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1185044658

From stuefe at openjdk.org Thu May 4 13:56:19 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Thu, 4 May 2023 13:56:19 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v29]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 11:56:22 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>>
>> We also build and maintain a table that has N elements, where N is the number of regions.
>> Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31 The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add usual header include guards

src/hotspot/share/gc/shared/slidingForwarding.cpp line 35:

> 33: // We cannot use 0, because that may already be a valid base address in zero-based heaps.
> 34: // 0x1 is safe because heap base addresses must be aligned by much larger alignment
> 35: HeapWord* const SlidingForwarding::UNUSED_BASE = reinterpret_cast<HeapWord*>(0x1);

I am trying to understand under which circumstances a zero heap location would be okay. This is *uncompressed* oops, right? If that were 0, you could just hardcode constexpr 0 in the header.

src/hotspot/share/gc/shared/slidingForwarding.cpp line 111:

> 109: _table[i]._from = nullptr;
> 110: _table[i]._to = nullptr;
> 111: }

It would be enough to set _from to nullptr.

src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 60:

> 58: } else if (_bases_table[base_idx] == UNUSED_BASE) {
> 59: // Primary is free
> 60: _bases_table[base_idx] = to_region_base;

Since the else branch is probably much more common, would it make sense to swap the conditions? Same below.
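As context for the `UNUSED_BASE` and `_bases_table` comments above, the following is a small, self-contained C++ sketch of a two-slot-per-region bases table with a sentinel for free slots. It is only an illustration under stated assumptions: the class `BasesTable`, the method `select_slot`, and the use of `uintptr_t` in place of `HeapWord*` are hypothetical and do not reproduce the PR's code. The sketch tests the "base already recorded" case first, on the guess that it is the frequent path; which ordering is actually faster is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a region base address.
using Addr = uintptr_t;

// Sentinel for a free slot. 0 is avoided because it could be a valid base
// address in a zero-based heap; 0x1 can never be a suitably aligned base.
constexpr Addr UNUSED_BASE = 0x1;

class BasesTable {
  std::vector<Addr> _bases;  // two slots (primary, alternate) per source region

public:
  explicit BasesTable(size_t num_regions)
    : _bases(2 * num_regions, UNUSED_BASE) {}

  // Returns the slot (0 or 1) that holds to_region_base for from_region,
  // claiming a free slot if necessary. Returns -1 when both slots are
  // already taken by other bases, i.e. a fallback mechanism would be needed.
  int select_slot(size_t from_region, Addr to_region_base) {
    size_t primary = 2 * from_region;
    if (_bases[primary] == to_region_base) {
      return 0;                          // already recorded in the primary slot
    }
    if (_bases[primary] == UNUSED_BASE) {
      _bases[primary] = to_region_base;  // primary is free, claim it
      return 0;
    }
    size_t alternate = primary + 1;
    if (_bases[alternate] == to_region_base) {
      return 1;                          // already recorded in the alternate slot
    }
    if (_bases[alternate] == UNUSED_BASE) {
      _bases[alternate] = to_region_base;  // alternate is free, claim it
      return 1;
    }
    return -1;                           // neither slot matches or is free
  }

  // Base address stored in the given slot of from_region.
  Addr base(size_t from_region, int slot) const {
    return _bases[2 * from_region + static_cast<size_t>(slot)];
  }
};
```

The slot index returned here corresponds to the region-selector bit of the encoding, and the `-1` case corresponds to the situation in which the proposal falls back to its hash table.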
src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 87:

> 85: assert(to == decode_forwarding(from, encoded), "must be reversible");
> 86: return encoded;
> 87: }

Since encoding should produce a 32-bit value, why not return a 32-bit value? Same below, for decoding. Or, at least assert that returned value has no higher bits set.

src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 90:

> 88:
> 89: HeapWord* SlidingForwarding::decode_forwarding(HeapWord* from, uintptr_t encoded) {
> 90: assert((encoded & markWord::marked_value) == markWord::marked_value, "must be marked as forwarded");

s/marked_value/lock_mask ?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184986345
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184989465
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184972889
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184976288
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1184980683

From shade at openjdk.org Thu May 4 14:05:30 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 4 May 2023 14:05:30 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v30]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 13:56:07 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>>
>> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31 The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Clamp home index. Duh.

Another round...
-------------

PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1409678649

From shade at openjdk.org Thu May 4 14:05:36 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 4 May 2023 14:05:36 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v20]
In-Reply-To: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com>
References: <3RFs-ck6jkNa-9dxLYQi00uC-f5K995W8d29V5swQpM=.f3bf90e8-1c09-436a-b414-760b76dac9ed@github.com>
Message-ID:

On Wed, 3 May 2023 10:54:43 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>>
>> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31 The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag.
>> I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   More Thomas' comments

src/hotspot/share/gc/g1/g1FullGCOopClosures.inline.hpp line 35:

> 33: #include "gc/g1/g1FullGCMarker.inline.hpp"
> 34: #include "gc/g1/heapRegionRemSet.inline.hpp"
> 35: #include "gc/shared/gcForwarding.inline.hpp"

Copyright years need to be updated.

src/hotspot/share/gc/serial/markSweep.inline.hpp line 33:

> 31: #include "classfile/javaClasses.inline.hpp"
> 32: #include "gc/shared/continuationGCSupport.inline.hpp"
> 33: #include "gc/shared/gcForwarding.inline.hpp"

Copyright years need to be updated.

src/hotspot/share/gc/shared/gc_globals.hpp line 698:

> 696: constraint(GCCardSizeInBytesConstraintFunc,AtParse) \
> 697: \
> 698: product(bool, UseAltGCForwarding, false, EXPERIMENTAL, \

See if copyright years need to be updated.

src/hotspot/share/gc/shared/preservedMarks.cpp line 26:

> 24:
> 25: #include "precompiled.hpp"
> 26: #include "gc/shared/gcForwarding.inline.hpp"

Copyright years need to be updated.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183540583
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183540894
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183542696
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1183542195

From shade at openjdk.org Thu May 4 14:05:39 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 4 May 2023 14:05:39 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v29]
In-Reply-To:
References:
Message-ID:

On Thu, 4 May 2023 11:56:22 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>>
>> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31 The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag.
>> I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Add usual header include guards

src/hotspot/share/gc/shared/gc_globals.hpp line 698:

> 696: constraint(GCCardSizeInBytesConstraintFunc,AtParse) \
> 697: \
> 698: develop(bool, UseAltGCForwarding, false, \

I don't think you can opt in to `true` in release bits, if this flag is `develop` when the rest of Lilliput arrives. In release bits, all checks involving this flag would fold to `false`. Maybe that's the intent here, as it keeps the release performance at the baseline level, but this makes performance overhead estimations for this PR a bit hard :) [I'll hack the flag back to experimental for tests]

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1185043173

From shade at openjdk.org Thu May 4 14:05:41 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 4 May 2023 14:05:41 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v19]
In-Reply-To:
References:
Message-ID:

On Tue, 2 May 2023 18:21:18 GMT, Roman Kennke wrote:

>> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location.
>> Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>>
>> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>>
>> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>>
>> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region.
>>
>> With this, forwarding information would be encoded like this:
>> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
>> - Bit 2: Used for 'fallback'-forwarding (see below)
>> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
>> - Bits 4..31 The number of heap words from the target base address
>>
>> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table.
>> When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Initialize 'heap' elements in test case

src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 43:

> 41: bool SlidingForwarding::region_contains(HeapWord* region_base, HeapWord* addr) const {
> 42: return (region_base <= addr) && (addr < (region_base + _region_size_words));
> 43: }

Now unused!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1182911404

From rkennke at openjdk.org Thu May 4 14:34:27 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Thu, 4 May 2023 14:34:27 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v30]
In-Reply-To: <7Sc3jztrzRq8rXZh4l3V4LdyBnR33qvKh8OZCTBQZbY=.9120353e-0db0-4135-81ea-30732ced0fb0@github.com>
References: <7Sc3jztrzRq8rXZh4l3V4LdyBnR33qvKh8OZCTBQZbY=.9120353e-0db0-4135-81ea-30732ced0fb0@github.com>
Message-ID:

On Thu, 4 May 2023 13:50:36 GMT, Thomas Stuefe wrote:

> LGTM. I ok this now; my remaining comments are suggestions - up to you to take them or not.

Thanks!

> Tests are missing. A simple way would be to run a selection of our standard GC tests with +AltGCForwarding. This is especially important if you follow Thomas' advice and make AltGCForwarding a develop switch.

I run hotspot_gc with UseAltGCForwarding turned on. Not sure if there is an easy way to make this a test task. I could perhaps add a few run configurations to tests that are useful. For example, gc/stress/TestMultiThreadStressRSet.java tended to exercise both the sliding-forwarding and the fallback-forwarding. But would the develop-only switch not complicate this? Because it means we could only run such tests in debug builds.
> The only other thing that occurred to me is that you could probably change initialization: don't require caller to specify it but calculate it yourself such that the 28 bit offset is maximally used. That would save some memory since the bases table can be smaller. Again, up to you. Yeah maybe. SpaceAlignment should be set by all GCs to a reasonable region-size, we could probably just pick that up. OTOH, we need a little bit of cooperation from the GC here: The whole sliding-forwarding algo relies on the fact that GC workers divide up their work based on their regions, and are essentially single-threaded within their work queues. I'm a bit worried about touching this stuff at this point, and causing another round of reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1534889536 From duke at openjdk.org Thu May 4 14:55:16 2023 From: duke at openjdk.org (olivergillespie) Date: Thu, 4 May 2023 14:55:16 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 12:16:13 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate?
> > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Use ParallelGCThreads instead of active_workers Passed tier2 and tier3 tests for me locally on Linux x86_64 and aarch64, with G1, Serial and Parallel, with just a few (14) tier2 failures in `java/net/httpclient/` which I believe are due to a bad configuration in my local setup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1534924010 From tschatzl at openjdk.org Thu May 4 15:12:18 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 4 May 2023 15:12:18 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 13:25:34 GMT, Aleksey Shipilev wrote: > I looked if there might be a better option, like passing the `WorkerThreads*` from callers to actually figure out the number of active workers from the GC itself, but all of this cuts rather deep. I would say that should be done in a separate PR. `ParallelGCThreads` should work well meanwhile. > > @albertnetymk, @tschatzl might have an opinion here. As you suggested, the typical way is passing the `WorkerThreads*` along instead of passing a thread number and the code selecting the `WorkerThreads*` by itself. Actually I'm not sure why safepoint_workers are used at all. The changes to do that do not seem to be that significant to me actually, so I would prefer that. The current change is an improvement already though.
------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1534953027 From ayang at openjdk.org Thu May 4 15:41:14 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 4 May 2023 15:41:14 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 15:09:52 GMT, Thomas Schatzl wrote: > Actually I'm not sure why safepoint_workers are used at all. I believe it's semantically incorrect to use `safepoint_workers` here. Maybe `HeapInspection` should live in `gc` folder. > The changes to do that does not seem to be that significant to me actually, so I would prefer that. How does that affect another caller, `VM_GC_HeapInspection::doit`? Unclear to me how one can get gc-workers in that context. Could one introduce another API in `class CollectedHeap`, sth like `virtual WorkerThreads* gc_workers() { return nullptr; }`, next to `safepoint_workers`? ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1535000662 From rkennke at openjdk.org Thu May 4 16:35:29 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 16:35:29 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v29] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 12:53:23 GMT, Thomas Stuefe wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Add usual header include guards > > src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 60: > >> 58: } else if (_bases_table[base_idx] == UNUSED_BASE) { >> 59: // Primary is free >> 60: _bases_table[base_idx] = to_region_base; > > Since the else branch is probably much more common, would it make sense to swap the conditions? Same below. I just swapped that around to what it is now: I think the UNUSED_BASE would be taken exactly once per region, then every other call would find a 'good' target. 
Which means the if-branch would be much more common and else only taken rarely. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1185251911 From rkennke at openjdk.org Thu May 4 16:41:28 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 16:41:28 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v29] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 12:56:11 GMT, Thomas Stuefe wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Add usual header include guards > > src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 87: > >> 85: assert(to == decode_forwarding(from, encoded), "must be reversible"); >> 86: return encoded; >> 87: } > > Since encoding should produce a 32-bit value, why not return a 32-bit value? Same below, for decoding. Or, at least assert that returned value has no higher bits set. I'm adding asserts. The encoded value will be or-ed into the mark-word which is 64bit anyway, I don't see much value to use 32bit here and then require up-casting and stuff like that. Type mess tends to make compilers unhappy ;-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1185257704 From rkennke at openjdk.org Thu May 4 16:46:37 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 16:46:37 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v29] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 13:04:33 GMT, Thomas Stuefe wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Add usual header include guards > > src/hotspot/share/gc/shared/slidingForwarding.cpp line 35: > >> 33: // We cannot use 0, because that may already be a valid base address in zero-based heaps. 
>> 34: // 0x1 is safe because heap base addresses must be aligned by much larger alignment >> 35: HeapWord* const SlidingForwarding::UNUSED_BASE = reinterpret_cast<HeapWord*>(0x1); > > I try to understand under which circumstances a zero heap location would be okay. This is *uncompressed* oops, right? > > If that were 0, you could just hardcode constexpr 0 in the header. When running with compressed oops, the heap may be allocated at the zero page, iirc. Yes, this would still be using sliding-forwarding. (*Actually* this may be an optimization opportunity: When JVM runs with compressed oops, we could simply use compressed oops in the forwarding ptr and avoid the whole sliding-forwarding stuff. Maybe as a follow-up?) > src/hotspot/share/gc/shared/slidingForwarding.cpp line 111: > >> 109: _table[i]._from = nullptr; >> 110: _table[i]._to = nullptr; >> 111: } > > It would be enough to set _from to nullptr. Yes but I like this to be clean. ;-) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1185262678 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1185263534 From rkennke at openjdk.org Thu May 4 16:56:34 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 16:56:34 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v31] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
> > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. 
> - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
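[Editorial illustration, not part of the original message or of the patch.] The bit layout described above (bits 0..1 = '11', bit 2 = fallback, bit 3 = target-region selector, bits 4..31 = offset in heap words) can be sketched as a small stand-alone model. Everything here is an assumption made for illustration: the type `SlidingForwardingModel`, its members, the constants, and the use of plain word indices instead of `HeapWord*` are hypothetical and do not reproduce the code in slidingForwarding.inline.hpp. The fallback hash table itself is not modelled; the sketch only signals when it would be needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: heap addresses are plain word indices, not HeapWord*.
using HeapWordIndex = uint64_t;

constexpr uint64_t FORWARDED_BITS      = 0x3;                      // bits 0..1 = '11'
constexpr uint64_t FALLBACK_BIT        = uint64_t(1) << 2;         // bit 2
constexpr int      REGION_SELECT_SHIFT = 3;                        // bit 3
constexpr int      OFFSET_SHIFT        = 4;                        // bits 4..31
constexpr uint64_t OFFSET_MASK         = (uint64_t(1) << 28) - 1;  // 28 offset bits
constexpr HeapWordIndex UNUSED_BASE    = ~uint64_t(0);             // illustrative sentinel

struct SlidingForwardingModel {
  uint64_t region_size_words;
  // Two possible target-region bases per source region.
  std::vector<HeapWordIndex> bases;

  SlidingForwardingModel(size_t num_regions, uint64_t region_words)
    : region_size_words(region_words), bases(2 * num_regions, UNUSED_BASE) {}

  // Encode the forwarding of object 'from' to address 'to' into the low 32 bits.
  uint64_t encode(HeapWordIndex from, HeapWordIndex to) {
    size_t region = static_cast<size_t>(from / region_size_words);
    HeapWordIndex to_base = (to / region_size_words) * region_size_words;
    for (uint64_t alt = 0; alt < 2; alt++) {
      HeapWordIndex& slot = bases[2 * region + alt];
      if (slot == UNUSED_BASE) {
        slot = to_base;  // first use of this slot: claim it for this target region
      }
      if (slot == to_base) {
        uint64_t offset = to - to_base;
        assert(offset <= OFFSET_MASK && "offset must fit into 28 bits");
        return (offset << OFFSET_SHIFT) | (alt << REGION_SELECT_SHIFT) | FORWARDED_BITS;
      }
    }
    // A third target region: the real change would consult a fallback table here.
    return FALLBACK_BIT | FORWARDED_BITS;
  }

  // Decode a non-fallback forwarding back into the target address.
  HeapWordIndex decode(HeapWordIndex from, uint64_t encoded) const {
    assert((encoded & FORWARDED_BITS) == FORWARDED_BITS && "must be forwarded");
    assert((encoded & FALLBACK_BIT) == 0 && "fallback entries are not modelled");
    size_t region = static_cast<size_t>(from / region_size_words);
    uint64_t alt = (encoded >> REGION_SELECT_SHIFT) & 1;
    uint64_t offset = (encoded >> OFFSET_SHIFT) & OFFSET_MASK;
    return bases[2 * region + alt] + offset;
  }
};
```

In this model, the first object forwarded out of a source region claims slot 0 of that region's table entry, and an object that lands in a different target region claims slot 1. A third distinct target region sets bit 2, which corresponds to the situation the description attributes to G1 serial compaction. No locking is shown, matching the argument above that each worker handles disjoint regions.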
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: - @shipilev comments - @tstuefe review - Add some test configs that use +UseAltGCForwarding ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/c3b9ae9e..f85e4913 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=30 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=29-30 Stats: 74 lines in 10 files changed: 66 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From shade at openjdk.org Thu May 4 17:14:17 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 4 May 2023 17:14:17 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: <-d5noeLDfyCH_8GK-qGC-j8NyaOlR6AyB2i09GSPBMI=.283b6e7e-5460-49b3-b1c2-27df12a85357@github.com> On Thu, 4 May 2023 15:38:54 GMT, Albert Mingkun Yang wrote: > How does that affect another caller, `VM_GC_HeapInspection::doit`? Unclear to me how one can get gc-workers in that context. I think `safepoint_workers` is "fine" to use from `VM_GC_HeapInspection`. That VMOp does not know the status of GC threads when executing the VM op. The GC threads might be currently parked at STS and not available. `safepoint_workers` are supposed to be separate from normal (concurrent) GC workers then. So, the contract for `safepoint_workers` is relatively safe for doing this the object count heap walk too. I agree it would be cleaner to use the GC workers, though. 
The problem that I see there is that `report_object_count_after_gc` is called from rather generic places like `GenMarkSweep::mark_sweep_phase1`, where we don't know how many workers do we have (or even which exact GC we are running). That would probably require lifting `workers()` straight to `CollectedHeap`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1535116587 From rkennke at openjdk.org Thu May 4 17:23:57 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 4 May 2023 17:23:57 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. 
> > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. 
I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/f85e4913..0ccd5a0a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=31 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=30-31 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From tschatzl at openjdk.org Fri May 5 06:39:29 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 May 2023 06:39:29 GMT Subject: RFR: 8307421: Fix comment in g1CollectionSetChooser.hpp after JDK-8306836 In-Reply-To: References: Message-ID: On Thu, 4 May 2023 13:26:42 GMT, Aleksey Shipilev wrote: >> Hi all, >> >> please review this trivial comment fix @kimbarrett noticed while reviewing the [JDK-8306836](https://bugs.openjdk.org/browse/JDK-8306836) change after having it pushed. >> >> Testing: local compilation > > Looks fine. 
Thanks @shipilev for your review ------------- PR Comment: https://git.openjdk.org/jdk/pull/13793#issuecomment-1535779651 From tschatzl at openjdk.org Fri May 5 06:39:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 May 2023 06:39:32 GMT Subject: Integrated: 8307421: Fix comment in g1CollectionSetChooser.hpp after JDK-8306836 In-Reply-To: References: Message-ID: On Thu, 4 May 2023 07:49:43 GMT, Thomas Schatzl wrote: > Hi all, > > please review this trivial comment fix @kimbarrett noticed while reviewing the [JDK-8306836](https://bugs.openjdk.org/browse/JDK-8306836) change after having it pushed. > > Testing: local compilation This pull request has now been integrated. Changeset: 302bc2fd Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/302bc2fd7fdfc02314e22ecc34ba2c78ef5ca9a1 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8307421: Fix comment in g1CollectionSetChooser.hpp after JDK-8306836 Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/13793 From tschatzl at openjdk.org Fri May 5 06:52:19 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 May 2023 06:52:19 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: <-d5noeLDfyCH_8GK-qGC-j8NyaOlR6AyB2i09GSPBMI=.283b6e7e-5460-49b3-b1c2-27df12a85357@github.com> References: <-d5noeLDfyCH_8GK-qGC-j8NyaOlR6AyB2i09GSPBMI=.283b6e7e-5460-49b3-b1c2-27df12a85357@github.com> Message-ID: On Thu, 4 May 2023 17:10:58 GMT, Aleksey Shipilev wrote: > > How does that affect another caller, `VM_GC_HeapInspection::doit`? Unclear to me how one can get gc-workers in that context. > > I think `safepoint_workers` is "fine" to use from `VM_GC_HeapInspection`. That VMOp does not know the status of GC threads when executing the VM op. The GC threads might be currently parked at STS and not available. `safepoint_workers` are supposed to be separate from normal (concurrent) GC workers then. I agree. 
> > So, the contract for `safepoint_workers` is relatively safe for doing this the object count heap walk too. I agree it would be cleaner to use the GC workers, though. The problem that I see there is that `report_object_count_after_gc` is called from rather generic places like `GenMarkSweep::mark_sweep_phase1`, where we don't know how many workers do we have (or even which exact GC we are running). That would probably require lifting `workers()` straight to `CollectedHeap`. Can you elaborate about this? From what I can see, the callers of `report_object_count_after_gc` other than heap inspection always have GC workers available. As for `GenMarkSweep::mark_sweep_phase1` specifically - this is serial gc. There are no workers for it, and it's not supposed to use multiple threads. One would just pass `nullptr` here I assume. Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1535795378 From rkennke at openjdk.org Fri May 5 08:40:30 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 08:40:30 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v25] In-Reply-To: <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> References: <6UvcVsPw9XNkD2wlN6kky_EK8bKyBlqdMt4h9_IONN4=.95d8c0c3-9a31-4edb-bdb5-9d5cd33d7c99@github.com> <7DNp1iiy51WJSWUuywlldghKwLETxFip4jlePA3ybrk=.b0f1243a-fa7b-4446-a6c3-3d7d2080ee3e@github.com> Message-ID: <17SrJiXbiYOKvJUgBACh1QVKgJ4y0NmS4LE0MMU2omc=.2c7d4020-807b-4f48-bc4b-f4418fec7846@github.com> On Wed, 3 May 2023 21:39:19 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix type narrowing > > Changes requested by tschatzl (Reviewer). @tschatzl are you good with this PR now? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1535921703 From shade at openjdk.org Fri May 5 08:44:19 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 08:44:19 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: <-d5noeLDfyCH_8GK-qGC-j8NyaOlR6AyB2i09GSPBMI=.283b6e7e-5460-49b3-b1c2-27df12a85357@github.com> Message-ID: On Fri, 5 May 2023 06:49:15 GMT, Thomas Schatzl wrote: > Can you elaborate about this? From what I can see, the callers of `report_object_count_after_gc` other than heap inspection always have GC workers available. As for `GenMarkSweep::mark_sweep_phase1` specifically - this is serial gc. There are no workers for it, and it's not supposed to use multiple threads. One would just pass `nullptr` here I assume. Oh, I did not realize the implicit assumption about `GenMarkSweep` == Serial. I agree we can pass `nullptr` there. I wonder if there are other assumptions like that. Let's try and do that change then, @olivergillespie! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1535926184 From shade at openjdk.org Fri May 5 09:36:17 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 09:36:17 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: On Wed, 3 May 2023 12:16:13 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. 
>> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Use ParallelGCThreads instead of active_workers [changing my review back to "Request Changes"] ------------- Changes requested by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1414462380 From shade at openjdk.org Fri May 5 10:51:29 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 10:51:29 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: <-PrjkCfvvA4FqJh17986ncJIdPZSquKmx18tu7C1V4Y=.7230967e-e67f-43ca-b9b6-58aeb6552acf@github.com> On Thu, 4 May 2023 17:23:57 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. 
>> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. 
Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix asserts I am okay with it. `develop` = `false` protects us from exposing this code path in release bits. I see no regressions even in my targeted tests with the feature turned off. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1414565112 From ayang at openjdk.org Fri May 5 11:16:21 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 5 May 2023 11:16:21 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: <-d5noeLDfyCH_8GK-qGC-j8NyaOlR6AyB2i09GSPBMI=.283b6e7e-5460-49b3-b1c2-27df12a85357@github.com> Message-ID: <4UEoc768wNqfhUKyXmwLkb2kB_hGU5NxIs7ke-sw_c0=.129972c9-cd29-472c-b314-b61bf5ec12c1@github.com> On Fri, 5 May 2023 06:49:15 GMT, Thomas Schatzl wrote: > I think safepoint_workers is "fine" to use from VM_GC_HeapInspection. I see; the name contains "GC", but it doesn't really correspond to a gc-safepoint (young/full gc pause). (To make matters worse, that vm-op can potentially perform a full-gc...) Then, `populate_table` will get diff workers, depending whether it's called inside a gc-safepoint or not. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1536098617 From duke at openjdk.org Fri May 5 11:30:23 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 11:30:23 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v5] In-Reply-To: References: Message-ID: > ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. 
> > The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: > > > Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) > After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) > > > Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? > > Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? olivergillespie has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Rework Pass workers through from caller. Move some of the specific handling to the VM Op for heap inspection. - Merge remote-tracking branch 'origin/master' into 8307348 - Use ParallelGCThreads instead of active_workers - Fix compile error ``` === Output from failing command(s) repeated here === * For target hotspot_variant-server_libjvm_objs_gcTrace.o: /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp: In member function 'void GCTracer::report_object_count_after_gc(BoolObjectClosure*)': /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:114:48: error: invalid use of incomplete type 'class CollectedHeap' 114 | WorkerThreads* workers = Universe::heap()->safepoint_workers(); | ^~ In file included from /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:35: /home/runner/work/jdk/jdk/src/hotspot/share/memory/universe.hpp:42:7: note: forward declaration of 'class CollectedHeap' 42 | class CollectedHeap; | ^~~~~~~~~~~~~ * All command lines available in /home/runner/work/jdk/jdk/build/linux-x64/make-support/failure-logs. 
=== End of repeated output === ``` - Fix compile error - 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13774/files - new: https://git.openjdk.org/jdk/pull/13774/files/711cb643..3d00b05f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=03-04 Stats: 8729 lines in 231 files changed: 6893 ins; 964 del; 872 mod Patch: https://git.openjdk.org/jdk/pull/13774.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774 PR: https://git.openjdk.org/jdk/pull/13774 From duke at openjdk.org Fri May 5 11:32:21 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 11:32:21 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v4] In-Reply-To: References: Message-ID: <7MbGDWxakFU3nF_RWE6vH7bPjXGbNnIaFb3SOs4ssUA=.82e02c1b-2eeb-406c-baee-451b4f2f7234@github.com> On Wed, 3 May 2023 12:16:13 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? 
> > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Use ParallelGCThreads instead of active_workers Thanks for the comments. Please check the updated version, it updates to have each caller pass the appropriate workers. The VMOp now grabs (and activates, as necessary) the safepoint workers instead of delegating that work, and populate can simply use the workers->active_workers as passed in (or run serially, as appropriate). This is now a slightly more substantial change - not only are we parallelizing, but we're moving the GC-based ObjectCountAfterGC away from safepoint workers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1536120709 From ayang at openjdk.org Fri May 5 11:38:18 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 5 May 2023 11:38:18 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v5] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 11:30:23 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? 
> > olivergillespie has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Rework > > Pass workers through from caller. Move some of the specific handling > to the VM Op for heap inspection. > - Merge remote-tracking branch 'origin/master' into 8307348 > - Use ParallelGCThreads instead of active_workers > - Fix compile error > > ``` > === Output from failing command(s) repeated here === > * For target hotspot_variant-server_libjvm_objs_gcTrace.o: > /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp: In member function 'void GCTracer::report_object_count_after_gc(BoolObjectClosure*)': > /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:114:48: error: invalid use of incomplete type 'class CollectedHeap' > 114 | WorkerThreads* workers = Universe::heap()->safepoint_workers(); > | ^~ > In file included from /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:35: > /home/runner/work/jdk/jdk/src/hotspot/share/memory/universe.hpp:42:7: note: forward declaration of 'class CollectedHeap' > 42 | class CollectedHeap; > | ^~~~~~~~~~~~~ > > * All command lines available in /home/runner/work/jdk/jdk/build/linux-x64/make-support/failure-logs. > === End of repeated output === > ``` > - Fix compile error > - 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection

src/hotspot/share/gc/shared/gcVMOperations.cpp line 174:

> 172: // Can't run with more threads than provided by the WorkerThreads.
> 173: const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers());
> 174: WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num);

`WithActiveWorkers` is a stack object, so it must be in the same scope as the code that uses the workers.
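A minimal, self-contained sketch of the scoping point above (not code from the PR). `FakeWorkers` and `ScopedActiveWorkers` are hypothetical stand-ins for `WorkerThreads` and `WithActiveWorkers`, and the sketch assumes the guard restores the previous active-worker count when it is destroyed. If the guard lives in an inner block, its setting is already undone by the time the work runs.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a worker pool; not HotSpot's WorkerThreads.
struct FakeWorkers {
  unsigned max_workers;
  unsigned active_workers;
};

// Hypothetical RAII guard: sets the active worker count (capped at the
// pool maximum) on construction and restores the old count on destruction.
class ScopedActiveWorkers {
  FakeWorkers* _workers;
  unsigned _saved;
public:
  ScopedActiveWorkers(FakeWorkers* w, unsigned requested)
    : _workers(w), _saved(w->active_workers) {
    w->active_workers = std::min(requested, w->max_workers);
  }
  ~ScopedActiveWorkers() { _workers->active_workers = _saved; }
  ScopedActiveWorkers(const ScopedActiveWorkers&) = delete;
  ScopedActiveWorkers& operator=(const ScopedActiveWorkers&) = delete;
};

// Guard in an inner block: it is destroyed before the "work" (here, just
// reading the active count) happens, so the work sees the old setting.
unsigned work_outside_guard_scope(FakeWorkers* w, unsigned requested) {
  {
    ScopedActiveWorkers guard(w, requested);
  } // guard destroyed here, previous count restored
  return w->active_workers;
}

// Guard in the same scope as the work: the work sees the requested setting.
unsigned work_inside_guard_scope(FakeWorkers* w, unsigned requested) {
  ScopedActiveWorkers guard(w, requested);
  return w->active_workers; // evaluated while the guard is still alive
}
```

With a pool of 8 workers of which 2 are active, requesting 16 leaves the first function working with 2 workers, while the second works with 8 and restores 2 afterwards.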
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1185989473 From duke at openjdk.org Fri May 5 11:49:21 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 11:49:21 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v5] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 11:35:40 GMT, Albert Mingkun Yang wrote: >> olivergillespie has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Rework >> >> Pass workers through from caller. Move some of the specific handling >> to the VM Op for heap inspection. >> - Merge remote-tracking branch 'origin/master' into 8307348 >> - Use ParallelGCThreads instead of active_workers >> - Fix compile error >> >> ``` >> === Output from failing command(s) repeated here === >> * For target hotspot_variant-server_libjvm_objs_gcTrace.o: >> /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp: In member function 'void GCTracer::report_object_count_after_gc(BoolObjectClosure*)': >> /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:114:48: error: invalid use of incomplete type 'class CollectedHeap' >> 114 | WorkerThreads* workers = Universe::heap()->safepoint_workers(); >> | ^~ >> In file included from /home/runner/work/jdk/jdk/src/hotspot/share/gc/shared/gcTrace.cpp:35: >> /home/runner/work/jdk/jdk/src/hotspot/share/memory/universe.hpp:42:7: note: forward declaration of 'class CollectedHeap' >> 42 | class CollectedHeap; >> | ^~~~~~~~~~~~~ >> >> * All command lines available in /home/runner/work/jdk/jdk/build/linux-x64/make-support/failure-logs. 
>> === End of repeated output === >> ``` >> - Fix compile error >> - 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection > > src/hotspot/share/gc/shared/gcVMOperations.cpp line 174: > >> 172: // Can't run with more threads than provided by the WorkerThreads. >> 173: const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers()); >> 174: WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num); > > `WithActiveWorkers` is a stack object, so it must be in the same scope as the code that uses the workers.

Oh, good catch. So can I do something like:

    HeapInspection inspect;
    if (workers != nullptr) {
      // The GC provided a WorkerThreads to be used during a safepoint.
      // Can't run with more threads than provided by the WorkerThreads.
      const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers());
      WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num);
      inspect.heap_inspection(_out, workers);
    } else {
      inspect.heap_inspection(_out, nullptr);
    }

? (I'm not familiar with the exact workings of StackObj, sorry)

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1185998092 From duke at openjdk.org Fri May 5 11:55:14 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 11:55:14 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v6] In-Reply-To: References: Message-ID: > ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event.
> > The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: > > > Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) > After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) > > > Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? > > Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? olivergillespie has updated the pull request incrementally with one additional commit since the last revision: Fix imports ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13774/files - new: https://git.openjdk.org/jdk/pull/13774/files/3d00b05f..b6da2f73 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=04-05 Stats: 2 lines in 2 files changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13774.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774 PR: https://git.openjdk.org/jdk/pull/13774 From shade at openjdk.org Fri May 5 12:18:20 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 12:18:20 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v5] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 11:46:28 GMT, olivergillespie wrote: >> src/hotspot/share/gc/shared/gcVMOperations.cpp line 174: >> >>> 172: // Can't run with more threads than provided by the WorkerThreads. >>> 173: const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers()); >>> 174: WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num); >> >> `WithActiveWorkers` is an stack-obj, so it must be in the same scope as using workers. > > Oh, good catch. 
So can I do something like: > > > HeapInspection inspect; > if (workers != nullptr) { > // The GC provided a WorkerThreads to be used during a safepoint. > // Can't run with more threads than provided by the WorkerThreads. > const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers()); > WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num); > inspect.heap_inspection(_out, workers); > } else { > inspect.heap_inspection(_out, nullptr); > } > > > ? > > (I'm not familiar with the exact workings of StackObj, sorry) Yes, do that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1186020045 From shade at openjdk.org Fri May 5 12:18:17 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 12:18:17 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v6] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 11:55:14 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix imports This looks better to me. 
Some comments:

src/hotspot/share/memory/heapInspection.cpp line 569:

> 567: uintx HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, WorkerThreads* workers) {
> 568: // Try parallel first.
> 569: ResourceMark rm;

I think this `ResourceMark` should remain in both blocks, so that on a failed exit from the parallel block we roll back resource allocations and start over with the serial walk.

------------- PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1414674820 PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1186021364 From ayang at openjdk.org Fri May 5 12:45:21 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 5 May 2023 12:45:21 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v6] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 11:55:14 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix imports Marked as reviewed by ayang (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1414714981 From duke at openjdk.org Fri May 5 13:05:27 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 13:05:27 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: > ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. > > The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: > > > Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) > After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) > > > Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? > > Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? olivergillespie has updated the pull request incrementally with one additional commit since the last revision: Fix use of WithActiveWorkers My scope was incorrect, thanks @albertnetymk Also fix ResourceMark usage, thanks Aleksey. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/13774/files - new: https://git.openjdk.org/jdk/pull/13774/files/b6da2f73..ac1d234e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13774&range=05-06 Stats: 14 lines in 2 files changed: 6 ins; 4 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13774.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13774/head:pull/13774 PR: https://git.openjdk.org/jdk/pull/13774 From duke at openjdk.org Fri May 5 13:05:30 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 13:05:30 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v6] In-Reply-To: References: Message-ID: <3V_FrlCAl0y7Rem7VpGFuaBGWrcn2IFwQE1eP4jdcUg=.be889aba-abb6-498e-afac-71259c7cc93b@github.com> On Fri, 5 May 2023 12:14:35 GMT, Aleksey Shipilev wrote: >> olivergillespie has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix imports > > src/hotspot/share/memory/heapInspection.cpp line 569: > >> 567: uintx HeapInspection::populate_table(KlassInfoTable* cit, BoolObjectClosure *filter, WorkerThreads* workers) { >> 568: // Try parallel first. >> 569: ResourceMark rm; > > I think this `ResourceMark` should remain at both blocks. So that on failed exit from the parallel block, we rollback resource allocations and start over with serial walk. Ah okay, I didn't understand it before, my bad. Updated, thanks. 
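A minimal sketch of why a mark per block matters (not code from the PR). `BumpArena` and `ScopedMark` are hypothetical stand-ins for a resource area and `ResourceMark`, assuming a mark releases everything allocated after it once it goes out of scope. With the parallel attempt wrapped in its own mark, a failed attempt leaves nothing behind for the serial fallback.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator standing in for a resource area.
struct BumpArena {
  std::size_t used = 0;
  void allocate(std::size_t bytes) { used += bytes; }
};

// Hypothetical mark: remembers the arena level on construction and rolls
// the arena back to that level on destruction.
class ScopedMark {
  BumpArena& _arena;
  std::size_t _saved;
public:
  explicit ScopedMark(BumpArena& a) : _arena(a), _saved(a.used) {}
  ~ScopedMark() { _arena.used = _saved; }
  ScopedMark(const ScopedMark&) = delete;
  ScopedMark& operator=(const ScopedMark&) = delete;
};

// Returns true if the (simulated) parallel attempt succeeded. On failure,
// used_at_serial_start records how much of the arena was still in use when
// the serial walk began.
bool populate(BumpArena& arena, bool parallel_ok, std::size_t& used_at_serial_start) {
  {
    ScopedMark mark(arena);  // mark for the parallel attempt
    arena.allocate(1024);    // scratch data of the parallel attempt
    if (parallel_ok) {
      return true;
    }
  } // failed attempt: its scratch data is rolled back here
  ScopedMark mark(arena);    // separate mark for the serial walk
  used_at_serial_start = arena.used;
  arena.allocate(256);       // scratch data of the serial walk
  return false;
}
```

When the simulated parallel attempt fails, the serial walk starts from an arena level of 0 instead of inheriting the 1024 bytes of scratch data.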
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1186064507 From shade at openjdk.org Fri May 5 15:44:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 5 May 2023 15:44:23 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 13:05:27 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix use of WithActiveWorkers > > My scope was incorrect, thanks @albertnetymk > Also fix ResourceMark usage, thanks Aleksey. This looks okay to me. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1415031641 From tschatzl at openjdk.org Fri May 5 15:58:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 May 2023 15:58:27 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 13:05:27 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix use of WithActiveWorkers > > My scope was incorrect, thanks @albertnetymk > Also fix ResourceMark usage, thanks Aleksey. See the suggestion. src/hotspot/share/gc/shared/gcVMOperations.cpp line 175: > 173: // Can't run with more threads than provided by the WorkerThreads. 
> 174: const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers()); > 175: WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num); I would *almost* rely on the workgang's 'active_workers()' here because all collectors set this proportional to heap size (people tend to start 128M VMs on hundreds-of-thread machines... ). The problem is that one heap inspection VM operation. Maybe put the `WithActiveWorkers` there and use `workers->active_workers()` here? This is just a suggestion. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1415050462 PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1186256267 From rkennke at openjdk.org Fri May 5 15:59:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 15:59:23 GMT Subject: Integrated: 8307395: Add missing STS to Shenandoah In-Reply-To: References: Message-ID: On Thu, 4 May 2023 11:34:15 GMT, Roman Kennke wrote: > Testing in project Lilliput has revealed that Shenandoah GC is lacking one STS. This causes a reliable crash (with Lilliput) when running TestGCBasherWithShenandoah.java with -XX:+UseHeavyMonitors because it touches an already deflated monitor. > > Testing (all in Lilliput where it caused the troubles, but applies to upstream as well): > - [x] TestGCBasherWithShenandoah.java +UseHeavyMonitors > - [x] hotspot_gc_shenandoah +UseHeavyMonitors This pull request has now been integrated. 
Changeset: 3968ab5d Author: Roman Kennke URL: https://git.openjdk.org/jdk/commit/3968ab5db5443ce93c9a19ebbc5464f7d91782fc Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8307395: Add missing STS to Shenandoah Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/13799 From tschatzl at openjdk.org Fri May 5 16:16:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 May 2023 16:16:43 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 17:23:57 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. 
>> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31: The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1: there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag.
I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32-bit compact object headers, because then the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix asserts

I would like to request some cleanups before pushing, but please wait before changing this. I have been working on aggressively strength-reducing operations and using biased arrays (yes, you can, not sure why not?) (performance also seems better if you separate the `_bases_table` into two logical ones), reducing the overhead of +UseAltGCForwarding over baseline from a bit more than 10% to ~7.8% on an Ice Lake machine. The same code also shows improvements on a Zen3 machine (from ~9% to 5.3%, but the result is noisy and the machine isn't "clean"), both on the benchmark you gave.

Let me clean up the code a bit and provide it to you on Monday so that you can test too if you are interested. I will then also look in detail at the forwarding table.

src/hotspot/share/gc/shared/slidingForwarding.cpp line 57:

> 55: _region_size_words = round_up_power_of_2(heap.word_size());
> 56: _region_size_words_shift = log2i_exact(_region_size_words);
> 57: }

Suggestion:

    _heap_start = heap.start();
    if (UseSerialGC && heap.word_size() <= (1 << NUM_OFFSET_BITS)) {
      // In this case we can treat the whole heap as a single region and
      // make the encoding very simple.
      _num_regions = 1;
      _region_size_words = round_up_power_of_2(heap.word_size());
    } else {
      _num_regions = align_up(pointer_delta(heap.end(), heap.start()), region_size_words) / region_size_words;
      _region_size_words = region_size_words;
    }
    _region_size_words_shift = log2i_exact(region_size_words);

I.e. extract out the common code.

src/hotspot/share/gc/shared/slidingForwarding.hpp line 117:

> 115: static HeapWord* _heap_start;
> 116: static size_t _num_regions;
> 117: static size_t _region_size_words;

I would prefer that the code, instead of having separate `_heap_start` and `_region_size_words`, would just take a copy of the `MemRegion` passed to it. This would allow making the asserts stronger by using the `contains` method. Both variables only seem to be used in asserts anyway.

src/hotspot/share/gc/shared/slidingForwarding.inline.hpp line 77:

> 75:
> 76: size_t offset = pointer_delta(to, to_region_base);
> 77: assert(offset < _region_size_words, "Offset should be within the region. from: " PTR_FORMAT

I think for this check, `_region_size_words` should be the size passed to the initialization, not potentially something larger. (See the `SlidingForwarding::initialize()` method for the serial gc special case).

------------- Changes requested by tschatzl (Reviewer).
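For reference, a minimal sketch of the 32-bit layout from the PR description (bits 0..1 forwarded tag, bit 2 fallback, bit 3 target-region selector, bits 4..31 offset in heap words). This is not the code in `slidingForwarding.inline.hpp`: all names below are hypothetical, and target bases are modelled as plain word indexes rather than `HeapWord*`.

```cpp
#include <cassert>
#include <cstdint>

using std::uint32_t;
using std::uint64_t;

// Layout taken from the PR description; constant names are hypothetical.
constexpr uint32_t FORWARDED_TAG       = 0x3;      // bits 0..1 set to '11'
constexpr uint32_t FALLBACK_BIT        = 1u << 2;  // bit 2: look up in fallback table
constexpr uint32_t REGION_SELECT_SHIFT = 3;        // bit 3: target region 0 or 1
constexpr uint32_t OFFSET_SHIFT        = 4;        // bits 4..31: offset in heap words
constexpr uint32_t MAX_OFFSET          = (1u << 28) - 1;

// Packs the target-region selector and the word offset into 32 bits.
uint32_t encode_forwarding(uint32_t region_select, uint32_t offset_in_words) {
  assert(region_select <= 1);
  assert(offset_in_words <= MAX_OFFSET);
  return (offset_in_words << OFFSET_SHIFT)
       | (region_select << REGION_SELECT_SHIFT)
       | FORWARDED_TAG;
}

bool is_forwarded(uint32_t bits)        { return (bits & FORWARDED_TAG) == FORWARDED_TAG; }
bool uses_fallback(uint32_t bits)       { return (bits & FALLBACK_BIT) != 0; }
uint32_t selected_region(uint32_t bits) { return (bits >> REGION_SELECT_SHIFT) & 1u; }
uint32_t offset_words(uint32_t bits)    { return bits >> OFFSET_SHIFT; }

// Decodes against the two possible target bases of the source region,
// i.e. one entry of the per-region table described in the PR.
uint64_t decode_target_word_index(uint32_t bits, const uint64_t target_bases[2]) {
  assert(is_forwarded(bits) && !uses_fallback(bits));
  return target_bases[selected_region(bits)] + offset_words(bits);
}
```

The 28 offset bits are what bound the region size in heap words, which is why the offset assert discussed above needs to compare against the region size actually used for the encoding.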
PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1415034095 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1186245432 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1186251889 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1186248985 From duke at openjdk.org Fri May 5 16:21:17 2023 From: duke at openjdk.org (olivergillespie) Date: Fri, 5 May 2023 16:21:17 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 15:54:47 GMT, Thomas Schatzl wrote: >> olivergillespie has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix use of WithActiveWorkers >> >> My scope was incorrect, thanks @albertnetymk >> Also fix ResourceMark usage, thanks Aleksey. > > src/hotspot/share/gc/shared/gcVMOperations.cpp line 175: > >> 173: // Can't run with more threads than provided by the WorkerThreads. >> 174: const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers()); >> 175: WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num); > > I would *almost* rely on the workgang's 'active_workers()' here because all collectors set this proportional to heap size (people tend to start 128M VMs on hundreds-of-thread machines... ). The problem is that one heap inspection VM operation. Maybe put the `WithActiveWorkers` there and use `workers->active_workers()` here? > > This is just a suggestion. Sorry, I don't think I understand the suggestion. Are you suggesting to use WithActiveWorkers only for the heap inspection VM op, and workers->active_workers() everywhere else (for the collector use-case)? > The problem is that one heap inspection VM operation. Maybe put the WithActiveWorkers there This area *is* that one heap inspection VM operation, isn't it? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1186278692 From tschatzl at openjdk.org Fri May 5 16:22:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 5 May 2023 16:22:27 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 17:23:57 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
>> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix asserts > I have been working aggressively strength-reducing operations and use biased arrays (yes, you can, not sure why not?) (performance seems better if you separate the `_bases_table` into two logical ones too), reducing the overhead from baseline to +UseAltGCForwarding from a bit more than 10% to ~7.8% you gave on an Ice Lake machine. Same code also shows improvements on some Zen3 machine (from ~9% to 5.3% - but the result is noisy and the machine isn't "clean"), both on that benchmark you gave. tada > > Let me clean up the code a bit and provide to you on Monday so that you can test too if you are interested. > Actually, current code is at https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:alt-fullgc-forwarding?expand=1 if you want to play with it; the [strength reduction](https://github.com/openjdk/jdk/commit/10e1f43a736b28e18dc951cdd23309a71dc83694) commit does strength reduction and biasing, keeping current code intact, the other two split the `bases_table` into two separate tables and clean up a bit. Hopefully I wasn't just dreaming or something when testing it (fwiw, I only kind of guarantee that it runs that `Retain` test. Not really tested otherwise).
But it tended to crash if I didn't get things right for that test. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1536486650 From rkennke at openjdk.org Fri May 5 16:45:46 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 5 May 2023 16:45:46 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 16:19:01 GMT, Thomas Schatzl wrote: > > I have been working aggressively strength-reducing operations and use biased arrays (yes, you can, not sure why not?) (performance seems better if you separate the `_bases_table` into two logical ones too), reducing the overhead from baseline to +UseAltGCForwarding from a bit more than 10% to ~7.8% you gave on an Ice Lake machine. Same code also shows improvements on some Zen3 machine (from ~9% to 5.3% - but the result is noisy and the machine isn't "clean"), both on that benchmark you gave. tada > > Let me clean up the code a bit and provide to you on Monday so that you can test too if you are interested. > > Actually, current code is at https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:alt-fullgc-forwarding?expand=1 if you want to play with it; the [strength reduction](https://github.com/openjdk/jdk/commit/10e1f43a736b28e18dc951cdd23309a71dc83694) commit does strength reduction and biasing, keeping current code intact, the other two split the `bases_table` into two separate tables and clean up a bit. > > Hopefully I wasn't just dreaming or something when testing it (fwiw, I only kind of guarantee that it runs that `Retain` test. Not really tested otherwise). But it tended to crash if I didn't get things right for that test. Nice! Thank you for doing that! I'll give it a test as soon as I get to it (perhaps only on Monday). Would you be ok if I integrate it into the main PR once my testing is good? Or do you have more things you want to do on that branch?
------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1536510796 From dholmes at openjdk.org Mon May 8 05:13:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 8 May 2023 05:13:33 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v29] In-Reply-To: References: Message-ID: <2GJMfozzvuoZLoQjcqIHSA9ml-_2SllOV6ZJPavK1nc=.6b1d9955-a8f7-49f9-9044-2dbfca2c12d7@github.com> On Thu, 4 May 2023 13:46:18 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Add usual header include guards > > src/hotspot/share/gc/shared/gc_globals.hpp line 698: > >> 696: constraint(GCCardSizeInBytesConstraintFunc,AtParse) \ >> 697: \ >> 698: develop(bool, UseAltGCForwarding, false, \ > > I don't think you can opt-in into `true` in release bits, if this flag is `develop` when the rest of Lilliput arrives. In release bits, all checks involving this flag would fold with `false`. Maybe that's the intent here, as it keeps the release performance at the baseline level, but this makes performance overhead estimations for this PR a bit hard :) [I'll hack the flag back to experimental for tests] If you need this for compact object headers then shouldn't this flag be experimental? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1187028649 From rkennke at openjdk.org Mon May 8 05:44:25 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 05:44:25 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v29] In-Reply-To: <2GJMfozzvuoZLoQjcqIHSA9ml-_2SllOV6ZJPavK1nc=.6b1d9955-a8f7-49f9-9044-2dbfca2c12d7@github.com> References: <2GJMfozzvuoZLoQjcqIHSA9ml-_2SllOV6ZJPavK1nc=.6b1d9955-a8f7-49f9-9044-2dbfca2c12d7@github.com> Message-ID: <_uVyIUcDEzKU9Cf4ajJF8zN_-wR2EMQmkkDY4eoRiks=.354aeecd-00a1-4e72-b62f-3c60c8db0793@github.com> On Mon, 8 May 2023 05:10:11 GMT, David Holmes wrote: >> src/hotspot/share/gc/shared/gc_globals.hpp line 698: >> >>> 696: constraint(GCCardSizeInBytesConstraintFunc,AtParse) \ >>> 697: \ >>> 698: develop(bool, UseAltGCForwarding, false, \ >> >> I don't think you can opt-in into `true` in release bits, if this flag is `develop` when the rest of Lilliput arrives. In release bits, all checks involving this flag would fold with `false`. Maybe that's the intent here, as it keeps the release performance at the baseline level, but this makes performance overhead estimations for this PR a bit hard :) [I'll hack the flag back to experimental for tests] > > If you need this for compact object headers then shouldn't this flag be experimental? @tschatzl requested it to be develop *for this PR* and make it product/experimental in the compact-headers PR, because if the compact-headers PR doesn't make it into 21, then we'd have the slight performance hit of checking the flag (in a loop) in release builds for no benefit. See discussions above. I thought that is reasonable. 
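For readers less familiar with the flag kinds discussed here, the following is a minimal, self-contained sketch of why a `develop` flag costs nothing in release builds. It is not HotSpot's actual flag machinery: `SKETCH_PRODUCT`, `UseAltForwardingSketch`, `ForwardingKind` and `select_forwarding` are hypothetical names made up for this illustration. The only assumption taken from the thread is that checks on a `develop` flag fold to a constant in release bits.

```cpp
#include <cassert>

// SKETCH_PRODUCT is a hypothetical stand-in for a release-build macro.
#ifdef SKETCH_PRODUCT
// Release build: the flag is a compile-time constant, so every branch
// that tests it can be folded away by the compiler.
constexpr bool UseAltForwardingSketch = false;
#else
// Debug build: the flag is an ordinary variable that may be changed at startup.
bool UseAltForwardingSketch = false;
#endif

// Hypothetical selector between the two forwarding implementations.
enum class ForwardingKind { Legacy, Sliding };

inline ForwardingKind select_forwarding() {
  // In a release build this reduces to "return ForwardingKind::Legacy".
  return UseAltForwardingSketch ? ForwardingKind::Sliding : ForwardingKind::Legacy;
}
```

With the default value, `select_forwarding()` returns `ForwardingKind::Legacy` in both configurations; only the non-release configuration can ever take the other branch, which is the trade-off being weighed in this exchange.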
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1187043439 From tschatzl at openjdk.org Mon May 8 08:21:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 8 May 2023 08:21:27 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: <1293l8AOaxKsgKLWwQ0Z5AVT84opFPyIlEK3A24L5kU=.0d475159-2d97-4cec-8e7e-71fba84d1f98@github.com> On Fri, 5 May 2023 16:18:28 GMT, olivergillespie wrote: >> src/hotspot/share/gc/shared/gcVMOperations.cpp line 175: >> >>> 173: // Can't run with more threads than provided by the WorkerThreads. >>> 174: const uint capped_parallel_thread_num = MIN2(_parallel_thread_num, workers->max_workers()); >>> 175: WithActiveWorkers with_active_workers(workers, capped_parallel_thread_num); >> >> I would *almost* rely on the workgang's 'active_workers()' here because all collectors set this proportional to heap size (people tend to start 128M VMs on hundreds-of-thread machines... ). The problem is that one heap inspection VM operation. Maybe put the `WithActiveWorkers` there and use `workers->active_workers()` here? >> >> This is just a suggestion. > > Sorry, I don't think I understand the suggestion. Are you suggesting to use WithActiveWorkers only for the heap inspection VM op, and workers->active_workers() everywhere else (for the collector use-case)? > >> The problem is that one heap inspection VM operation. Maybe put the WithActiveWorkers there > > This area *is* that one heap inspection VM operation, isn't it? You are right, I misread the code. Ignore my comment :) Ship it. 
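The capping pattern in the quoted snippet (limit the requested thread count to what the worker pool can provide, then restore the previous setting when the scope ends) can be sketched in isolation as below. This is a simplified model, not the HotSpot classes: `WorkerPool` and `ScopedActiveWorkers` are hypothetical stand-ins for `WorkerThreads` and `WithActiveWorkers`, and only the worker counts are modeled.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a worker pool; it tracks counts only, no threads.
class WorkerPool {
  unsigned _max_workers;
  unsigned _active_workers;
public:
  WorkerPool(unsigned max_workers, unsigned active_workers)
    : _max_workers(max_workers), _active_workers(active_workers) {}
  unsigned max_workers() const { return _max_workers; }
  unsigned active_workers() const { return _active_workers; }
  void set_active_workers(unsigned n) {
    assert(n >= 1 && n <= _max_workers);
    _active_workers = n;
  }
};

// RAII guard: applies a capped worker count for the lifetime of the scope
// and restores the previous count on destruction.
class ScopedActiveWorkers {
  WorkerPool* _pool;
  unsigned _prev;
public:
  ScopedActiveWorkers(WorkerPool* pool, unsigned requested)
    : _pool(pool), _prev(pool->active_workers()) {
    // Cannot run with more threads than the pool provides; at least one is needed.
    unsigned capped = std::min(std::max(requested, 1u), pool->max_workers());
    _pool->set_active_workers(capped);
  }
  ~ScopedActiveWorkers() { _pool->set_active_workers(_prev); }
  ScopedActiveWorkers(const ScopedActiveWorkers&) = delete;
  ScopedActiveWorkers& operator=(const ScopedActiveWorkers&) = delete;
};
```

The guard has to live in the same scope as the parallel work it is meant to cover; a guard that is destroyed before the work starts has no effect.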
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1187164921 From ayang at openjdk.org Mon May 8 08:25:30 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 May 2023 08:25:30 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> On Wed, 3 May 2023 15:35:20 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. >> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. 
>> >> All these sets implement C++ iterators for simpler use in various places. >> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - Merge branch 'master' into 8306541-refactor-cset-candidates > - ayang, iwalulya review > > fix inlining in g1CollectionSet.inline.hpp > - Merge branch 'master' into 8306541-refactor-cset-candidates > - ayang review - remove unused methods > - Whitespace fixes > - typo > - More cleanup > - Cleanup > - Cleanup > - Refactor collection set candidates > > Improve the interface to collection set candidates and prepare for having collection set > candidates at any time. Preparations to allow for multiple sources for these candidates > (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch > only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's > not used otherwise. > > * the collection set candidates set is not temporarily allocated any more, but the candidate > set object must be available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains > the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not > necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. > Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Everything else are changes to use these helper sets/lists throughout. 
> > Some additional FIXME for log messages to remove are in there. Please ignore. src/hotspot/share/gc/g1/g1CollectionSetCandidates.cpp line 229: > 227: verify(); > 228: > 229: _marking_regions.merge(candidate_infos, num_infos); Could we avoid `merge` in the name? It suggests there's existing data there already. Maybe "populate_marking_candidates" or sth. src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 46: > 44: class G1CollectionSetRegionList { > 45: GrowableArray _regions; > 46: size_t _reclaimable_bytes; I don't see the necessity of `G1CollectionSetRegionList::_reclaimable_bytes`. Seems to me, one can calculate it on the fly in the for-loop of `G1CollectionSetCandidates::remove`. src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 55: > 53: // Remove the given list of HeapRegion* from this list. Assumes that the given > 54: // list is a prefix of this list. > 55: void remove(G1CollectionSetRegionList* list); Maybe `remove_prefix`? src/hotspot/share/gc/g1/g1CollectionSetChooser.cpp line 198: > 196: if (should_add(r) && !G1CollectedHeap::heap()->is_old_gc_alloc_region(r)) { > 197: add_region(r); > 198: } else if (r->is_old() && !r->is_collection_set_candidate()) { Why the additional predicate? (IOW, what regions will be misplaced without the new predicate?) src/hotspot/share/gc/g1/g1CollectionSetChooser.cpp line 256: > 254: candidates->merge_candidates_from_marking(_result.array(), > 255: _num_regions_added - num_pruned, > 256: _reclaimable_bytes_added - pruned_wasted_bytes); Could `prune` modify `_result` and fields in-place? Requiring caller to do `_num_regions_added - num_pruned` seems an unnecessary overhead. src/hotspot/share/gc/g1/heapRegion.inline.hpp line 301: > 299: if (is_old_or_humongous() && !is_collection_set_candidate()) { > 300: set_top_at_mark_start(top()); > 301: } Unclear why these checks are required. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1186746076 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1186754322 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1186745526 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1186747757 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1186747085 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1186748274 From ayang at openjdk.org Mon May 8 08:31:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 May 2023 08:31:41 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: On Thu, 4 May 2023 17:23:57 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. 
Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). 
>> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix asserts src/hotspot/share/gc/shared/slidingForwarding.hpp line 51: > 49: * maximum of two regions. This is an intuitive property: when we slide the compact region full of data, it can > 50: * only span two adjacent regions. This property allows us to use the off-side table to record the addresses of > 51: * two target regions. The table table holds N*2 entries for N logical regions. For each region, it gives the base "table table" src/hotspot/share/gc/shared/slidingForwarding.hpp line 77: > 75: * 4. Compute the mark word from "offset" and "alternate", write it out > 76: * > 77: * Similarily, looking up the target address, given an original object address generally works as follows: Typo - "Similarly" src/hotspot/share/gc/shared/slidingForwarding.hpp line 159: > 157: * is sufficient because G1 serial compaction is single-threaded. > 158: */ > 159: class FallbackTable : public CHeapObj{ Could this class be placed inside `SlidingForwarding` for better encapsulation? 
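As a reading aid for the bit layout given in the PR description (bits 0 and 1 set to '11', bit 2 for fallback, bit 3 selecting one of two target regions, bits 4..31 holding the word offset), here is a minimal sketch of encoding and decoding. It is not the code under review: all names are hypothetical, heap words are modeled as `uintptr_t`, and the table layout of two consecutive base addresses per source region is an assumption based on the "N*2 entries" wording in the quoted comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants mirroring the bit layout from the PR description.
constexpr uint32_t kForwardedMask  = 0x3;             // bits 0..1: '11' marks a forwarded object
constexpr uint32_t kFallbackBit    = 1u << 2;         // bit 2: forwardee lives in the fallback table
constexpr uint32_t kAltRegionBit   = 1u << 3;         // bit 3: selects target region 0 or 1
constexpr unsigned kOffsetShift    = 4;               // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffsetWords = (1u << 28) - 1;  // largest offset that fits in 28 bits

// Build the low 32 bits of the header for a regular (non-fallback) forwarding.
inline uint32_t encode_forwarding(unsigned slot, uint32_t offset) {
  assert(slot < 2);
  assert(offset <= kMaxOffsetWords);
  return (offset << kOffsetShift) | (slot ? kAltRegionBit : 0u) | kForwardedMask;
}

inline bool is_forwarded(uint32_t m)      { return (m & kForwardedMask) == kForwardedMask; }
inline bool uses_fallback(uint32_t m)     { return (m & kFallbackBit) != 0; }
inline unsigned target_slot(uint32_t m)   { return (m & kAltRegionBit) ? 1u : 0u; }
inline uint32_t offset_words(uint32_t m)  { return m >> kOffsetShift; }

// Resolve the forwardee: pick one of the two target bases recorded for the
// source region, then add the word offset stored in the header bits.
inline uintptr_t* decode_forwardee(uint32_t m, uintptr_t* const* bases, size_t src_region) {
  assert(is_forwarded(m) && !uses_fallback(m));
  return bases[2 * src_region + target_slot(m)] + offset_words(m);
}
```

For example, forwarding to the second target region at word offset 5 yields `(5 << 4) | 0x8 | 0x3`, that is `0x5B`. An object whose header has bit 2 set would bypass `decode_forwardee` and be looked up in the fallback hash table instead, which this sketch does not model.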
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1186754832 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1186754795 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1186754767 From shade at openjdk.org Mon May 8 09:00:30 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 May 2023 09:00:30 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 13:05:27 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix use of WithActiveWorkers > > My scope was incorrect, thanks @albertnetymk > Also fix ResourceMark usage, thanks Aleksey. Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13774#pullrequestreview-1416411060 From rkennke at openjdk.org Mon May 8 09:58:53 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 09:58:53 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v33] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
> > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Fix typos - Make FallbackTable an inner class of SlidingForwarding ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/0ccd5a0a..3a8d07b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=32 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=31-32 Stats: 78 lines in 2 files changed: 35 ins; 35 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From tschatzl at openjdk.org Mon May 8 10:15:32 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 8 May 2023 10:15:32 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v33] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 09:58:53 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
>> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Fix typos > - Make FallbackTable an inner class of SlidingForwarding The https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:alt-fullgc-forwarding?expand=1 branch now contains the promised cleanup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538120604 From rkennke at openjdk.org Mon May 8 11:03:53 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 11:03:53 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v34] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. 
>
> We also build and maintain a table that has N elements, where N is the number of regions. Each entry holds two addresses, which are the start addresses of the possible target regions for each source region.
>
> With this, forwarding information would be encoded like this:
> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
> - Bit 2: Used for 'fallback'-forwarding (see below)
> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
> - Bits 4..31: The number of heap words from the target base address
>
> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, where there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed.
Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 96 additional commits since the last revision: - Merge remote-tracking branch 'tschatzl/alt-fullgc-forwarding' into JDK-8305896 - Cleanup; remove commented out code, renames, "alt_region" -> "alternate" to match documentation, limit _region_word_size to heap size in the optimization case for better checking - cleanup - separate region bases - strength reduction - Fix asserts - @shipilev comments - @tstuefe review - Add some test configs that use +UseAltGCForwarding - Clamp home index. Duh. - ... 
and 86 more: https://git.openjdk.org/jdk/compare/5a30446c...c1ce8a76 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/3a8d07b0..c1ce8a76 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=33 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=32-33 Stats: 24377 lines in 543 files changed: 18238 ins; 2530 del; 3609 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Mon May 8 11:03:53 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 11:03:53 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v33] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 10:12:41 GMT, Thomas Schatzl wrote: > The https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:alt-fullgc-forwarding?expand=1 branch now contains the promised cleanup. Thanks, Thomas! This looks useful. I've merged your branch into this PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538174145 From ayang at openjdk.org Mon May 8 12:30:49 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 May 2023 12:30:49 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v34] In-Reply-To: References: Message-ID: <4VUsEsRHm-L4bTnkObCeq4PD9IBSqDqiUMSn_g3Dk0o=.b960b3f2-bf32-4f24-a654-eac3f2a29a08@github.com> On Mon, 8 May 2023 11:03:53 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. 
Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. 
>> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 96 additional commits since the last revision: > > - Merge remote-tracking branch 'tschatzl/alt-fullgc-forwarding' into JDK-8305896 > - Cleanup; remove commented out code, renames, "alt_region" -> "alternate" to match documentation, limit _region_word_size to heap size in the optimization case for better checking > - cleanup > - separate region bases > - strength reduction > - Fix asserts > - @shipilev comments > - @tstuefe review > - Add some test configs that use +UseAltGCForwarding > - Clamp home index. Duh. > - ... and 86 more: https://git.openjdk.org/jdk/compare/580e1066...c1ce8a76 src/hotspot/share/gc/shared/slidingForwarding.hpp line 59: > 57: * > 58: * 64 32 0 > 59: * [........................|OOOOOOOOOOOOOOO|A|F|TT] The number of zero should be 28 == 32 - 4 (A + F + TT), right? (I counted 15 there.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1187387702 From rkennke at openjdk.org Mon May 8 12:41:40 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 12:41:40 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. 
For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions.
>
> We also build and maintain a table that has N elements, where N is the number of regions. Each entry holds two addresses, which are the start addresses of the possible target regions for each source region.
>
> With this, forwarding information would be encoded like this:
> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded.
> - Bit 2: Used for 'fallback'-forwarding (see below)
> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above)
> - Bits 4..31: The number of heap words from the target base address
>
> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, where there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e.
-XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix ASCII art ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/c1ce8a76..c2df2689 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=34 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=33-34 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Mon May 8 12:41:40 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 12:41:40 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v34] In-Reply-To: <4VUsEsRHm-L4bTnkObCeq4PD9IBSqDqiUMSn_g3Dk0o=.b960b3f2-bf32-4f24-a654-eac3f2a29a08@github.com> References: <4VUsEsRHm-L4bTnkObCeq4PD9IBSqDqiUMSn_g3Dk0o=.b960b3f2-bf32-4f24-a654-eac3f2a29a08@github.com> Message-ID: On Mon, 8 May 2023 12:27:54 GMT, Albert Mingkun Yang wrote: > The number of zero should be 28 == 32 - 4 (A + F + TT), right? (I counted 15 there.) Indeed. The number of .s was also too small. I think it was meant to be schematic and not precise. I changed it to be precise. 
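
For readers following the bit-count discussion, here is a minimal sketch of the layout as described in the PR text: two tag bits, one fallback bit, one alternate-region bit, and 32 - 4 = 28 offset bits. All constant and function names are hypothetical and are not taken from `slidingForwarding.hpp`. Keeping the upper 32 header bits untouched is an assumption drawn from the stated goal of fitting the forwarding into the lowest 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; the layout follows the PR description:
// [upper 32 bits untouched | 28 offset bits | A | F | TT]
constexpr int      kOffsetShift  = 4;                  // TT (2) + F (1) + A (1)
constexpr int      kOffsetBits   = 32 - kOffsetShift;  // 28
constexpr uint64_t kOffsetMask   = (uint64_t{1} << kOffsetBits) - 1;
constexpr uint64_t kTagForwarded = 0b11;               // bits 0 and 1
constexpr uint64_t kFallbackBit  = uint64_t{1} << 2;   // bit 2
constexpr uint64_t kAlternateBit = uint64_t{1} << 3;   // bit 3
constexpr uint64_t kLow32        = 0xFFFFFFFFu;

static_assert(kOffsetBits == 28, "offset field is 32 - 4 bits wide");

// Overwrite only the low 32 bits of the header with the forwarding.
// Assumption: the upper 32 bits (e.g. class information) stay intact.
inline uint64_t encode_forwarding(uint64_t header, bool alternate,
                                  uint64_t offset_words) {
  assert(offset_words <= kOffsetMask && "offset must fit into 28 bits");
  uint64_t low = (offset_words << kOffsetShift)
               | (alternate ? kAlternateBit : 0)
               | kTagForwarded;
  return (header & ~kLow32) | low;
}

inline bool is_forwarded(uint64_t header) {
  return (header & kTagForwarded) == kTagForwarded;
}

inline bool uses_fallback(uint64_t header) {
  return (header & kFallbackBit) != 0;
}

inline bool uses_alternate(uint64_t header) {
  return (header & kAlternateBit) != 0;
}

inline uint64_t offset_words(uint64_t header) {
  return (header >> kOffsetShift) & kOffsetMask;
}
```

With 28 offset bits, one target region in this sketch can address up to 2^28 heap words from its base.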
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1187394425 From ayang at openjdk.org Mon May 8 12:46:45 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 May 2023 12:46:45 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: References: Message-ID: <0hIenx25CuYDfGCuxO-MB8ZRO8I3KoYRuFBqLJ-3dr4=.e21615a8-86e3-40e1-993a-34ece37d4ff0@github.com> On Mon, 8 May 2023 12:41:40 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. 
>> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
>> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix ASCII art > it can forward objects from one region to a maximum of two regions Those two regions must be adjacent, right? Something like: `_biased_bases[1][from_reg_idx] - _biased_bases[0][from_reg_idx] == _region_size_words`. If that's the case, I don't understand why the to-address is broken into two parts `[offset][alternate region select]`. Is it possible to double the size of the to-region so that "a maximum of two regions" becomes a single, large to-region? (I think this means `_biased_bases` can be shrunk to a 1D array.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538297593 From tschatzl at openjdk.org Mon May 8 12:48:41 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 8 May 2023 12:48:41 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v6] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. 
>
> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection.
>
> These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)).
>
> This patch only uses candidates from marking at this time.
>
> It also moves GC efficiency out of HeapRegion and associates it with the list element, as it is not used otherwise.
>
> In detail:
> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time.
>
> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point).
>
> * there are several additional helper sets/lists
>   * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain GC efficiencies.
>   * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their GC efficiency. Building block for the actual collection set candidates list.
>
> All these sets implement C++ iterators for simpler use in various places.
>
> Testing:
> - this patch only: tier1-3, gha
> - with JDK-8140326 tier1-7 (or 8?)
> > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review, add/clarify comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13666/files - new: https://git.openjdk.org/jdk/pull/13666/files/4a013283..5fe73ea2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=04-05 Stats: 6 lines in 2 files changed: 3 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From rkennke at openjdk.org Mon May 8 12:51:11 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 12:51:11 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: <0hIenx25CuYDfGCuxO-MB8ZRO8I3KoYRuFBqLJ-3dr4=.e21615a8-86e3-40e1-993a-34ece37d4ff0@github.com> References: <0hIenx25CuYDfGCuxO-MB8ZRO8I3KoYRuFBqLJ-3dr4=.e21615a8-86e3-40e1-993a-34ece37d4ff0@github.com> Message-ID: On Mon, 8 May 2023 12:43:10 GMT, Albert Mingkun Yang wrote: > > it can forward objects from one region to a maximum of two regions > > Those two regions must be adjacent, right? Something like: `_biased_bases[1][from_reg_idx] - _biased_bases[0][from_reg_idx] == _region_size_words`. > > If that's the case, I don't understand why the to-address is broken into two parts `[offset][alternate region select]`. Is it possible to double the size of the to-region so that "a maximum of two regions" becomes a single, large to-region? (I think this means `_biased_bases` can be shrunk to a 1D array.) No, the target regions don't have to be adjacent. Parallel full GCs divide up the work across worker threads, assigning from-regions to each worker, and each worker would then claim to-regions as they go and work with that. To-regions of different workers can interleave and are often not adjacent. 
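
To illustrate why the table keeps two independent base addresses per source region rather than one doubled region, here is a rough sketch under simplifying assumptions: heap addresses are plain word indices, regions are equal-sized, and the table stores unbiased bases. `ForwardingTable`, `Encoded` and `kUnused` are hypothetical names, not the PR's `_biased_bases` implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the per-source-region table of two target bases.
struct ForwardingTable {
  static constexpr size_t kUnused = SIZE_MAX;

  struct Encoded {
    bool   alternate;  // which of the two target bases to use
    size_t offset;     // words from that base
  };

  size_t region_words;
  std::vector<std::array<size_t, 2>> bases;  // bases[src_region][0 or 1]

  ForwardingTable(size_t num_regions, size_t words_per_region)
      : region_words(words_per_region),
        bases(num_regions, {kUnused, kUnused}) {}

  // Record a forwarding from from_word to to_word, claiming a base slot
  // for the target region on first use.
  Encoded encode(size_t from_word, size_t to_word) {
    std::array<size_t, 2>& slots = bases[from_word / region_words];
    size_t to_base = (to_word / region_words) * region_words;
    if (slots[0] == kUnused) slots[0] = to_base;
    if (slots[0] == to_base) return {false, to_word - to_base};
    if (slots[1] == kUnused) slots[1] = to_base;
    assert(slots[1] == to_base && "at most two target regions per source");
    return {true, to_word - to_base};
  }

  // Rebuild the target address from the selector bit and the offset.
  size_t decode(size_t from_word, Encoded e) const {
    return bases[from_word / region_words][e.alternate ? 1 : 0] + e.offset;
  }
};
```

In this sketch a source region may forward into, say, regions 1 and 4. Nothing requires the two bases to differ by exactly one region size, which is the case a single enlarged to-region could not express.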
------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538302131 From ayang at openjdk.org Mon May 8 13:08:43 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 May 2023 13:08:43 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: References: <0hIenx25CuYDfGCuxO-MB8ZRO8I3KoYRuFBqLJ-3dr4=.e21615a8-86e3-40e1-993a-34ece37d4ff0@github.com> Message-ID: On Mon, 8 May 2023 12:47:15 GMT, Roman Kennke wrote: > Parallel full GCs divide up ... Why is Parallel relevant here? The description mentions only "the full-GC modes of Serial, Shenandoah and G1 GCs are", so I assumed this feature doesn't affect/depend on Parallel. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538327902 From rkennke at openjdk.org Mon May 8 13:25:45 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 13:25:45 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: References: <0hIenx25CuYDfGCuxO-MB8ZRO8I3KoYRuFBqLJ-3dr4=.e21615a8-86e3-40e1-993a-34ece37d4ff0@github.com> Message-ID: <_LJq_Ls1hcPudH2uB30cf0JDYMfBaRZtv5EQ8Zi68yY=.0d5810ba-2694-4a4a-b5b5-d7a30103625f@github.com> On Mon, 8 May 2023 13:05:14 GMT, Albert Mingkun Yang wrote: > > Parallel full GCs divide up ... > > Why is Parallel relevant here? > > The description mentions only "the full-GC modes of Serial, Shenandoah and G1 GCs are", so I assumed this feature doesn't affect/depend on Parallel. Sorry I wasn't clear enough. I meant the full-GCs of G1 and Shenandoah, which use multiple threads to do their work. Parallel GC is indeed not relevant. Also, Serial GC doesn't really compact serially, because it divides the heap into old, young-eden, young-s1 and young-s2 spaces. And btw, even if what you say were true, we could not do this. 
If we were to collapse each adjacent two regions into one, we would still end up with the same conclusion that for each source region, objects would be forwarded to one or two potential target regions. Let's say we divide the heap into 10 equal regions, then region 0 would logically only compact into the bottom of region 0. But region 1 might compact to region 0 or 1. Depending on how full region 0 becomes, region 2 might compact into region 0 and 1, or region 1 and 2. And so on. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538354393 From ayang at openjdk.org Mon May 8 13:45:37 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 8 May 2023 13:45:37 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: <_LJq_Ls1hcPudH2uB30cf0JDYMfBaRZtv5EQ8Zi68yY=.0d5810ba-2694-4a4a-b5b5-d7a30103625f@github.com> References: <0hIenx25CuYDfGCuxO-MB8ZRO8I3KoYRuFBqLJ-3dr4=.e21615a8-86e3-40e1-993a-34ece37d4ff0@github.com> <_LJq_Ls1hcPudH2uB30cf0JDYMfBaRZtv5EQ8Zi68yY=.0d5810ba-2694-4a4a-b5b5-d7a30103625f@github.com> Message-ID: On Mon, 8 May 2023 13:22:18 GMT, Roman Kennke wrote: > I meant the full-GCs of G1 and Shenandoah, which use multiple threads to do their work. I see; regions in the compaction-queue might not be contiguous in heap-space, so to-regions will not necessarily be adjacent. Thank you for the clarification. > And btw, even if what you say were true, we could not do this. I meant keep the current from-region size and double only to-region size. (But my original assumption was wrong, so this doesn't work.) 
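
The region example above can be checked with a small simulation. This is only a sketch under simplifying assumptions: compaction is single-threaded, and each region's live data is treated as one divisible run of words, whereas a real collector moves whole objects and may skip to the next region when an object does not fit. `target_regions` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// For each source region, return the {first, last} target region index that
// its live words land in when everything slides toward the heap start.
std::vector<std::pair<size_t, size_t>> target_regions(
    const std::vector<size_t>& live_words, size_t region_words) {
  std::vector<std::pair<size_t, size_t>> result;
  size_t compact_top = 0;  // next free word in the compacted heap
  for (size_t live : live_words) {
    assert(live <= region_words && "a region cannot hold more than its size");
    size_t first = compact_top / region_words;
    size_t last = (live == 0) ? first
                              : (compact_top + live - 1) / region_words;
    // live <= region_words, so the run crosses at most one region boundary.
    assert(last - first <= 1);
    result.emplace_back(first, last);
    compact_top += live;
  }
  return result;
}
```

For live sizes of 60, 80, 100 and 30 words with 100-word regions, region 0 stays within region 0, region 1 lands in regions 0 and 1, region 2 lands in regions 1 and 2, and region 3 lands in region 2 only. No source region ever needs more than two target bases.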
------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1538382862 From duke at openjdk.org Mon May 8 16:39:30 2023 From: duke at openjdk.org (olivergillespie) Date: Mon, 8 May 2023 16:39:30 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: <1293l8AOaxKsgKLWwQ0Z5AVT84opFPyIlEK3A24L5kU=.0d475159-2d97-4cec-8e7e-71fba84d1f98@github.com> References: <1293l8AOaxKsgKLWwQ0Z5AVT84opFPyIlEK3A24L5kU=.0d475159-2d97-4cec-8e7e-71fba84d1f98@github.com> Message-ID: <7bUH9GAzdbVe8upTFZ9c4RoG4sgvQ4v0W1MgfDAfZQk=.8164e5a8-8f7d-4e03-8578-56179e4e0557@github.com> On Mon, 8 May 2023 08:18:46 GMT, Thomas Schatzl wrote: >> Sorry, I don't think I understand the suggestion. Are you suggesting to use WithActiveWorkers only for the heap inspection VM op, and workers->active_workers() everywhere else (for the collector use-case)? >> >>> The problem is that one heap inspection VM operation. Maybe put the WithActiveWorkers there >> >> This area *is* that one heap inspection VM operation, isn't it? > > You are right, I misread the code. Ignore my comment :) Ship it. Cool, thanks for reviewing :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13774#discussion_r1187650940 From rkennke at openjdk.org Mon May 8 18:40:05 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 8 May 2023 18:40:05 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v36] In-Reply-To: References: Message-ID: <9lFPFZ9xr6UI731U-695uYuQoM5BtfoTCG45SH7sXI4=.51e9d5c9-4cb8-4a92-a40e-e23ff6e66f90@github.com> > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>
> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
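
[Editorial illustration, not part of the PR description.] The bit layout listed in the description (bits 0-1 set to '11', bit 2 for the fallback, bit 3 as target selector, bits 4..31 as a word offset) can be sketched as follows. This is a minimal sketch and not the code in slidingForwarding.cpp: every name is hypothetical, and target bases are modelled as plain word indices rather than `HeapWord*`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants; only the bit positions come from the PR description.
constexpr std::uint32_t kForwardedBits = 0x3;             // bits 0 and 1: '11' marks "forwarded"
constexpr std::uint32_t kFallbackBit   = 1u << 2;         // bit 2: look up the fallback table
constexpr std::uint32_t kTargetBit     = 1u << 3;         // bit 3: selects target region 0 or 1
constexpr int           kOffsetShift   = 4;               // bits 4..31: offset in heap words
constexpr std::uint32_t kMaxOffset     = (1u << 28) - 1;  // largest offset that fits in 28 bits

// Packs the target selector and the word offset into the low 32 header bits.
std::uint32_t encode_forwarding(bool second_target, std::uint32_t offset_words) {
  assert(offset_words <= kMaxOffset);
  return (offset_words << kOffsetShift) | (second_target ? kTargetBit : 0u) | kForwardedBits;
}

bool is_forwarded(std::uint32_t low_bits) {
  return (low_bits & kForwardedBits) == kForwardedBits;
}

bool uses_fallback(std::uint32_t low_bits) {
  return (low_bits & kFallbackBit) != 0;
}

// Resolves the forwardee as a word index; bases holds the two candidate
// target base addresses (as word indices) of the object's source region.
std::uint64_t decode_target_word(std::uint32_t low_bits, const std::uint64_t bases[2]) {
  assert(is_forwarded(low_bits) && !uses_fallback(low_bits));
  const unsigned selector = (low_bits & kTargetBit) ? 1u : 0u;
  return bases[selector] + (low_bits >> kOffsetShift);
}
```

For example, forwarding an object to word 5 of the second target region gives `(5 << 4) | 8 | 3 = 91`; decoding that against bases `{1000, 2000}` yields word 2005. The 28-bit offset is presumably also what bounds the usable region size, although the description does not spell that out.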
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 99 commits: - Merge remote-tracking branch 'origin/JDK-8305896' into JDK-8305896 - Fix ASCII art - Merge branch 'master' into JDK-8305896 - Merge remote-tracking branch 'tschatzl/alt-fullgc-forwarding' into JDK-8305896 - Cleanup; remove commented out code, renames, "alt_region" -> "alternate" to match documentation, limit _region_word_size to heap size in the optimization case for better checking - cleanup - separate region bases - strength reduction - Fix asserts - @shipilev comments - ... and 89 more: https://git.openjdk.org/jdk/compare/14df5c13...c0147ca3 ------------- Changes: https://git.openjdk.org/jdk/pull/13582/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=35 Stats: 879 lines in 24 files changed: 841 ins; 0 del; 38 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From shade at openjdk.org Mon May 8 18:45:37 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 May 2023 18:45:37 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v35] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 12:41:40 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. 
Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. 
>>
>> All the table accesses can be done unsynchronized because:
>> - Serial GC is single-threaded anyway
>> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
>> - G1 serial compaction is single-threaded
>>
>> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>>
>> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>>
>> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>>
>> Testing:
>> - [x] hotspot_gc -UseAltGCForwarding
>> - [x] hotspot_gc +UseAltGCForwarding
>> - [x] tier1 -UseAltGCForwarding
>> - [x] tier1 +UseAltGCForwarding
>> - [x] tier2 -UseAltGCForwarding
>> - [x] tier2 +UseAltGCForwarding
>
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix ASCII art

Marked as reviewed by shade (Reviewer).

src/hotspot/share/gc/shared/slidingForwarding.cpp line 60:

> 58:     _region_size_words = heap.word_size();
> 59:     _region_size_bytes_shift = log2i_exact(round_up_power_of_2(_region_size_words)) + LogHeapWordSize;
> 60:   } else {

Indenting:

Suggestion:

  } else {

src/hotspot/share/gc/shared/slidingForwarding.cpp line 86:

> 84:   _bases_table = NEW_C_HEAP_ARRAY(HeapWord*, max, mtGC);
> 85:   _biased_bases[0] = _bases_table - _heap_start_region_bias;
> 86:   _biased_bases[1] = _bases_table + _num_regions - _heap_start_region_bias;

Suggestion:

  HeapWord* biased_start = _bases_table - _heap_start_region_bias;
  _biased_bases[0] = biased_start;
  _biased_bases[1] = biased_start + _num_regions;

-------------

PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1417252142
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1187728610
PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1187757619

From duke at openjdk.org Mon May 8 23:26:33 2023
From: duke at openjdk.org (duke)
Date: Mon, 8 May 2023 23:26:33 GMT
Subject: Withdrawn: JDK-8303184: ZGC incompatible with ASan
In-Reply-To:
References:
Message-ID:

On Mon, 13 Mar 2023 16:37:41 GMT, Justin King wrote:

> Update ZGC to work with ASan and fix missing LSan root region registration for ZGC.
>
> Currently all ZGC tests will fail on x86 with ASan enabled, as it is unable to reserve the address regions necessary due to overlap with ASan. x86 does not appear to have the address layout detection logic of the other architectures. Other alternatives are to port the address layout detection logic to x86 (I was not comfortable doing this) or to just disable ZGC when building Hotspot with ASan.

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/13000

From gli at openjdk.org Tue May 9 03:31:29 2023
From: gli at openjdk.org (Guoxiong Li)
Date: Tue, 9 May 2023 03:31:29 GMT
Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize
Message-ID:

Hi all,

When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set on the command line, the `max_young_size/MaxNewSize` is currently set to the value of `NewSize`. Considering the documentation of `NewSize` (shown below), someone may set `NewSize` to a very small value and expect the JVM to adjust the value dynamically. Then, when `MaxHeapSize` is equal to `InitialHeapSize` (set by the user or by ergonomics), `MaxNewSize` is set to the value of `NewSize`, which is unexpectedly small.

  product(size_t, NewSize, ScaleForWordSize(1*M),                  \
          "Initial new generation size (in bytes)")                \
          constraint(NewSizeConstraintFunc,AfterErgo)              \

This patch fixes the issue by setting `MaxNewSize` to `NewSize` only when `NewSize` is larger than the original `max_young_size/MaxNewSize`.

The title of JDK-8047998 may need to be adjusted.

Thanks for the review.

Best Regards,
-- Guoxiong

-------------

Commit messages:
 - JDK-8047998

Changes: https://git.openjdk.org/jdk/pull/13876/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13876&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8047998
  Stats: 8 lines in 1 file changed: 5 ins; 2 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/13876.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13876/head:pull/13876

PR: https://git.openjdk.org/jdk/pull/13876

From tschatzl at openjdk.org Tue May 9 07:58:22 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 9 May 2023 07:58:22 GMT
Subject: RFR: 8307518: Remove G1 workaround in jstat about zero sized generation sized
Message-ID:

Hi all,

please review removal of some workaround in g1 memory usage monitoring that made sure that there were no 0-sized generations in the output.
After [JDK-8307428](https://bugs.openjdk.org/browse/JDK-8307428) this is not necessary any more. Testing: tier1-5 Thanks, Thomas ------------- Commit messages: - remove obsolete comment - additional changes - initial version Changes: https://git.openjdk.org/jdk/pull/13880/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13880&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307518 Stats: 38 lines in 2 files changed: 0 ins; 23 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/13880.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13880/head:pull/13880 PR: https://git.openjdk.org/jdk/pull/13880 From tschatzl at openjdk.org Tue May 9 08:28:35 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 May 2023 08:28:35 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> Message-ID: On Sat, 6 May 2023 20:56:54 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: >> >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang, iwalulya review >> >> fix inlining in g1CollectionSet.inline.hpp >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang review - remove unused methods >> - Whitespace fixes >> - typo >> - More cleanup >> - Cleanup >> - Cleanup >> - Refactor collection set candidates >> >> Improve the interface to collection set candidates and prepare for having collection set >> candidates at any time. Preparations to allow for multiple sources for these candidates >> (from the marking, as now, and from retained, i.e. 
evacuation failed regions). This patch >> only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's >> not used otherwise. >> >> * the collection set candidates set is not temporarily allocated any more, but the candidate >> set object must be available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains >> the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not >> necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. >> Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Everything else are changes to use these helper sets/lists throughout. >> >> Some additional FIXME for log messages to remove are in there. Please ignore. > > src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 55: > >> 53: // Remove the given list of HeapRegion* from this list. Assumes that the given >> 54: // list is a prefix of this list. >> 55: void remove(G1CollectionSetRegionList* list); > > Maybe `remove_prefix`? I improved the documentation. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188302214 From tschatzl at openjdk.org Tue May 9 08:35:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 May 2023 08:35:27 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> Message-ID: <3_2_QVtTLTcfRqRNQ8Uukse-bY1pEHiH3iO36fZcPkE=.475af958-bd26-4d0b-9284-c08e4b32b64d@github.com> On Sat, 6 May 2023 21:22:27 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: >> >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang, iwalulya review >> >> fix inlining in g1CollectionSet.inline.hpp >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang review - remove unused methods >> - Whitespace fixes >> - typo >> - More cleanup >> - Cleanup >> - Cleanup >> - Refactor collection set candidates >> >> Improve the interface to collection set candidates and prepare for having collection set >> candidates at any time. Preparations to allow for multiple sources for these candidates >> (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch >> only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's >> not used otherwise. >> >> * the collection set candidates set is not temporarily allocated any more, but the candidate >> set object must be available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. 
Contains >> the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not >> necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. >> Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Everything else are changes to use these helper sets/lists throughout. >> >> Some additional FIXME for log messages to remove are in there. Please ignore. > > src/hotspot/share/gc/g1/g1CollectionSetChooser.cpp line 198: > >> 196: if (should_add(r) && !G1CollectedHeap::heap()->is_old_gc_alloc_region(r)) { >> 197: add_region(r); >> 198: } else if (r->is_old() && !r->is_collection_set_candidate()) { > > Why the additional predicate? (IOW, what regions will be misplaced without the new predicate?) That is a change that is necessary later - when pinned/evacuation failure regions are part of the candidates, they show up here. Will remove for now. Apologies. > src/hotspot/share/gc/g1/heapRegion.inline.hpp line 301: > >> 299: if (is_old_or_humongous() && !is_collection_set_candidate()) { >> 300: set_top_at_mark_start(top()); >> 301: } > > Unclear why these checks are required. Same as above, some change necessary for later. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188310565 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188311135 From tschatzl at openjdk.org Tue May 9 08:47:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 May 2023 08:47:28 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> Message-ID: On Sat, 6 May 2023 22:38:36 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: >> >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang, iwalulya review >> >> fix inlining in g1CollectionSet.inline.hpp >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang review - remove unused methods >> - Whitespace fixes >> - typo >> - More cleanup >> - Cleanup >> - Cleanup >> - Refactor collection set candidates >> >> Improve the interface to collection set candidates and prepare for having collection set >> candidates at any time. Preparations to allow for multiple sources for these candidates >> (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch >> only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's >> not used otherwise. >> >> * the collection set candidates set is not temporarily allocated any more, but the candidate >> set object must be available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. 
Contains >> the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not >> necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. >> Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Everything else are changes to use these helper sets/lists throughout. >> >> Some additional FIXME for log messages to remove are in there. Please ignore. > > src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 46: > >> 44: class G1CollectionSetRegionList { >> 45: GrowableArray _regions; >> 46: size_t _reclaimable_bytes; > > I don't see the necessity of `G1CollectionSetRegionList::_reclaimable_bytes`. Seems to me, one can calculate it on the fly in the for-loop of `G1CollectionSetCandidates::remove`. In `G1CollectionSetRegionList::remove` you would need to iterate over all elements that are being removed, which is not the case for now. The other reason is that `reclaimable_bytes` depends on known live bytes in that region. While currently we exclude regions that may change their contents (e.g. current allocation region) from the collection set, I prefer to be absolutely sure that the values that we are working on do not change and the calculations keep being consistent, i.e. snapshotting the (sum of) reclaimable bytes (one could also snapshot the individual values, but I do not see a gain here). There does not seem to be any other advantage removing this than not having this additional member (i.e. some simplifications this allows). 
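
[Editorial illustration, not part of the thread.] Thomas's argument for keeping `_reclaimable_bytes` is that the sum is snapshotted while regions are appended, so removing a prefix needs no second pass and later changes to a region cannot make the total inconsistent. A minimal sketch of that idea, using hypothetical stand-ins (`Region`, `RegionList`, `remove_prefix`) and `std::vector` in place of the HotSpot types:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HeapRegion; only the field needed here.
struct Region {
  std::size_t reclaimable_bytes;
};

// Hypothetical stand-in for a region list that caches the sum of reclaimable bytes.
class RegionList {
  std::vector<const Region*> _regions;
  std::size_t _reclaimable_bytes = 0;  // snapshot, accumulated at append time

 public:
  void append(const Region* r) {
    _regions.push_back(r);
    _reclaimable_bytes += r->reclaimable_bytes;  // value as of now, never re-read
  }

  std::size_t length() const { return _regions.size(); }
  std::size_t reclaimable_bytes() const { return _reclaimable_bytes; }

  // Removes other from the front of this list; other must be a prefix of it.
  void remove_prefix(const RegionList& other) {
    assert(other.length() <= length());
    for (std::size_t i = 0; i < other.length(); ++i) {
      assert(_regions[i] == other._regions[i]);  // debug-only prefix check
    }
    _regions.erase(_regions.begin(),
                   _regions.begin() + static_cast<std::ptrdiff_t>(other.length()));
    _reclaimable_bytes -= other._reclaimable_bytes;  // no loop over other needed
  }
};
```

Without the cached member, a caller that wants the total of the removed prefix has to walk it again and re-read each region, which is the extra iteration (and the consistency risk) the reply refers to.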
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188323973 From rkennke at openjdk.org Tue May 9 09:27:53 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 09:27:53 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v37] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above)
> - Bits 4..31 The number of heap words from the target base address
>
> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table.
>
> All the table accesses can be done unsynchronized because:
> - Serial GC is single-threaded anyway
> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions.
> - G1 serial compaction is single-threaded
>
> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers).
>
> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately).
>
> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header.
>
> Testing:
> - [x] hotspot_gc -UseAltGCForwarding
> - [x] hotspot_gc +UseAltGCForwarding
> - [x] tier1 -UseAltGCForwarding
> - [x] tier1 +UseAltGCForwarding
> - [x] tier2 -UseAltGCForwarding
> - [x] tier2 +UseAltGCForwarding

Roman Kennke has updated the pull request incrementally with two additional commits since the last revision:

 - Update src/hotspot/share/gc/shared/slidingForwarding.cpp

   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/share/gc/shared/slidingForwarding.cpp

   Co-authored-by: Aleksey Shipilëv

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13582/files
  - new: https://git.openjdk.org/jdk/pull/13582/files/c0147ca3..840d2da7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=36
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=35-36

  Stats: 4 lines in 1 file changed: 1 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/13582.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582

PR: https://git.openjdk.org/jdk/pull/13582

From tschatzl at openjdk.org Tue May 9 09:32:27 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Tue, 9 May 2023 09:32:27 GMT
Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5]
In-Reply-To: <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com>
References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com>
 <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com>
Message-ID:
<9S5rAKOPAChao0HYKt8mrkc0t6cREPbAR_tkeMY_9_8=.72242f26-39b9-46f0-97f7-2ac1e8153258@github.com> On Sat, 6 May 2023 22:38:36 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: >> >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang, iwalulya review >> >> fix inlining in g1CollectionSet.inline.hpp >> - Merge branch 'master' into 8306541-refactor-cset-candidates >> - ayang review - remove unused methods >> - Whitespace fixes >> - typo >> - More cleanup >> - Cleanup >> - Cleanup >> - Refactor collection set candidates >> >> Improve the interface to collection set candidates and prepare for having collection set >> candidates at any time. Preparations to allow for multiple sources for these candidates >> (from the marking, as now, and from retained, i.e. evacuation failed regions). This patch >> only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's >> not used otherwise. >> >> * the collection set candidates set is not temporarily allocated any more, but the candidate >> set object must be available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains >> the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not >> necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. >> Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Everything else are changes to use these helper sets/lists throughout. 
>> >> Some additional FIXME for log messages to remove are in there. Please ignore. > > src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 46: > >> 44: class G1CollectionSetRegionList { >> 45: GrowableArray _regions; >> 46: size_t _reclaimable_bytes; > > I don't see the necessity of `G1CollectionSetRegionList::_reclaimable_bytes`. Seems to me, one can calculate it on the fly in the for-loop of `G1CollectionSetCandidates::remove`. (After deleting the other message) There is a use in `G1CollectionSetRegionList::remove` where not having this value would add a loop over the `other` list. If you insist, I can change that. > src/hotspot/share/gc/g1/g1CollectionSetChooser.cpp line 256: > >> 254: candidates->merge_candidates_from_marking(_result.array(), >> 255: _num_regions_added - num_pruned, >> 256: _reclaimable_bytes_added - pruned_wasted_bytes); > > Could `prune` modify `_result` and fields in-place? Requiring caller to do `_num_regions_added - num_pruned` seems an unnecessary overhead. Okay, changed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188379382 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188379944 From ayang at openjdk.org Tue May 9 09:41:25 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 May 2023 09:41:25 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: <9S5rAKOPAChao0HYKt8mrkc0t6cREPbAR_tkeMY_9_8=.72242f26-39b9-46f0-97f7-2ac1e8153258@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> <9S5rAKOPAChao0HYKt8mrkc0t6cREPbAR_tkeMY_9_8=.72242f26-39b9-46f0-97f7-2ac1e8153258@github.com> Message-ID: On Tue, 9 May 2023 09:29:47 GMT, Thomas Schatzl wrote: > would add a loop over the other list I don't get it. 
void G1CollectionSetRegionList::remove(G1CollectionSetRegionList* other) { #ifdef ASSERT // Check that the given list is a prefix of this list. int i = 0; for (HeapRegion* r : *other) { assert(_regions.at(i) == r, "must be in order, but element %d is not", i); i++; } #endif if (other->length() == 0) { return; } _regions.remove_till(other->length()); _reclaimable_bytes -= other->reclaimable_bytes(); } If one removes `_reclaimable_bytes`, the last statement will go away. Why do you need an additional loop? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188389570 From tschatzl at openjdk.org Tue May 9 09:49:22 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 May 2023 09:49:22 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v5] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <8D_JbSrCaMKG01KAMF8dSy9uBef4-su54lDDLFzib5g=.dfa316d2-ddb2-491c-990b-dc2250a24550@github.com> <9S5rAKOPAChao0HYKt8mrkc0t6cREPbAR_tkeMY_9_8=.72242f26-39b9-46f0-97f7-2ac1e8153258@github.com> Message-ID: On Tue, 9 May 2023 09:38:07 GMT, Albert Mingkun Yang wrote: >> (After deleting the other message) >> There is a use in `G1CollectionSetRegionList::remove` where not having this value would add a loop over the `other` list. If you think it is really important, I can change that. >> I simply do not think it hurts, and avoids the additional iteration (as `reclaimable_bytes` is calculated during appending to that list, which is done iteratively already). > >> would add a loop over the other list > > I don't get it. > > > void G1CollectionSetRegionList::remove(G1CollectionSetRegionList* other) { > #ifdef ASSERT > // Check that the given list is a prefix of this list. 
> int i = 0; > for (HeapRegion* r : *other) { > assert(_regions.at(i) == r, "must be in order, but element %d is not", i); > i++; > } > #endif > > if (other->length() == 0) { > return; > } > _regions.remove_till(other->length()); > _reclaimable_bytes -= other->reclaimable_bytes(); > } > > > If one removes `_reclaimable_bytes`, the last statement will go away. Why do you need an additional loop? I stand corrected :) Removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188399199 From rkennke at openjdk.org Tue May 9 10:28:21 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 10:28:21 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v38] In-Reply-To: References: Message-ID: <0FqSomGDy121kJ9qWGl8sv2A6eBYdyQm4gIJ2GRcghc=.d5d6d5fd-395f-4cc7-ac43-d821eb9f1b45@github.com> > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. 
> > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. 
Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/840d2da7..0d001db2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=37 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=36-37 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From tschatzl at openjdk.org Tue May 9 10:37:20 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 May 2023 10:37:20 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v8] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: 
<4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associates it with the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiencies. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) 
> > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang, iwalulya remove() -> remove_prefix() ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13666/files - new: https://git.openjdk.org/jdk/pull/13666/files/a9ba667e..39ea889e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=06-07 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From rkennke at openjdk.org Tue May 9 10:52:36 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 10:52:36 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v33] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 10:12:41 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Fix typos >> - Make FallbackTable an inner class of SlidingForwarding > > The https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:alt-fullgc-forwarding?expand=1 branch now contains the promised cleanup. @tschatzl or anybody else: any concerns remaining with this PR? If not, I would integrate it later today (if the GHA are green). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1539946381 From tschatzl at openjdk.org Tue May 9 11:10:27 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 9 May 2023 11:10:27 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v9] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: <0Fk9dSGEQXjClsT_GUnAAFOWUQ44cn2VWGsgsni1DK4=.665fc10a-a6d6-4fca-b19e-fd9305a5c1c9@github.com> > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. 
> > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: iwalulya review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13666/files - new: https://git.openjdk.org/jdk/pull/13666/files/39ea889e..fe718701 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=07-08 Stats: 4 lines in 1 file changed: 0 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From ayang at openjdk.org Tue May 9 12:32:35 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 May 2023 12:32:35 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v38] In-Reply-To: <0FqSomGDy121kJ9qWGl8sv2A6eBYdyQm4gIJ2GRcghc=.d5d6d5fd-395f-4cc7-ac43-d821eb9f1b45@github.com> References: <0FqSomGDy121kJ9qWGl8sv2A6eBYdyQm4gIJ2GRcghc=.d5d6d5fd-395f-4cc7-ac43-d821eb9f1b45@github.com> Message-ID: On Tue, 9 May 2023 10:28:21 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. 
>> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. 
Instead, I intend to programmatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32-bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix build Marked as reviewed by ayang (Reviewer). 
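The bit layout quoted in the PR description above (bits 0 and 1 set to '11', bit 2 for fallback, bit 3 selecting one of two target regions, bits 4..31 holding a heap-word offset) can be sketched as follows. This is only an illustration under stated assumptions: every name here (`TargetBases`, `encode_forwarding`, `decode_forwarding`, `is_fallback`, the `k...` constants) is hypothetical and not taken from slidingForwarding.cpp, and `HeapWord` is modelled as a plain `uintptr_t` rather than the HotSpot type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; not the code from slidingForwarding.cpp.
using HeapWord = uintptr_t;                     // stand-in for one heap word

constexpr uint32_t kForwardedMark = 0x3;        // bits 0..1: '11' means "forwarded"
constexpr uint32_t kFallbackBit   = 1u << 2;    // bit 2: look up in fallback table
constexpr uint32_t kRegionSelBit  = 1u << 3;    // bit 3: target region 0 or 1
constexpr unsigned kOffsetShift   = 4;          // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffset     = (1u << 28) - 1;

// Per-source-region table entry: start addresses of the two possible targets.
struct TargetBases {
  HeapWord* base[2];
};

// Pack a (target index, word offset) pair into the low 32 bits of a header.
inline uint32_t encode_forwarding(unsigned target_index, uint32_t word_offset) {
  assert(target_index < 2);
  assert(word_offset <= kMaxOffset);
  return (word_offset << kOffsetShift)
       | (target_index != 0 ? kRegionSelBit : 0u)
       | kForwardedMark;
}

inline bool is_fallback(uint32_t encoded) {
  return (encoded & kFallbackBit) != 0;
}

// Recover the forwardee address from the encoded bits and the table entry.
inline HeapWord* decode_forwarding(const TargetBases& bases, uint32_t encoded) {
  assert((encoded & kForwardedMark) == kForwardedMark);
  assert(!is_fallback(encoded));                // fallback path not modelled here
  unsigned target_index = (encoded & kRegionSelBit) != 0 ? 1 : 0;
  uint32_t word_offset  = encoded >> kOffsetShift;
  return bases.base[target_index] + word_offset;
}
```

With 28 offset bits, such an encoding can address at most 2^28 heap words relative to a target base, which is why the description needs equal-sized regions and a separate fallback table for G1's serial compaction.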
------------- PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1418513204 From eosterlund at openjdk.org Tue May 9 12:39:30 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 9 May 2023 12:39:30 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v38] In-Reply-To: <0FqSomGDy121kJ9qWGl8sv2A6eBYdyQm4gIJ2GRcghc=.d5d6d5fd-395f-4cc7-ac43-d821eb9f1b45@github.com> References: <0FqSomGDy121kJ9qWGl8sv2A6eBYdyQm4gIJ2GRcghc=.d5d6d5fd-395f-4cc7-ac43-d821eb9f1b45@github.com> Message-ID: On Tue, 9 May 2023 10:28:21 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. 
>> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. 
I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix build The heap alignment guarantee blows up in my local testing. How did you test this? src/hotspot/share/gc/shared/slidingForwarding.cpp line 68: > 66: _region_mask = ~((uintptr_t(1) << _region_size_bytes_shift) - 1); > 67: > 68: guarantee((_heap_start_region_bias << _region_size_bytes_shift) == (uintptr_t)_heap_start, "must be aligned"); This guarantee seems to blow up on Serial when the heap size isn't a power of two. ------------- Changes requested by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1418523391 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1188527103 From rkennke at openjdk.org Tue May 9 13:20:10 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 13:20:10 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v39] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). > > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. 
> > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/0d001db2..158e8b28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=38 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=37-38 Stats: 21 lines in 2 files changed: 15 ins; 2 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Tue May 9 13:20:13 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 13:20:13 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v38] In-Reply-To: References: <0FqSomGDy121kJ9qWGl8sv2A6eBYdyQm4gIJ2GRcghc=.d5d6d5fd-395f-4cc7-ac43-d821eb9f1b45@github.com> Message-ID: <_1zDYe4DUVzBN7GKjklSKgS1_re6uUAxx1RvN1bHumA=.521cc44b-9ced-4540-8d42-913607c906d7@github.com> On Tue, 9 May 2023 12:33:58 GMT, Erik Österlund wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix build > > src/hotspot/share/gc/shared/slidingForwarding.cpp line 68: > >> 66: _region_mask = ~((uintptr_t(1) << _region_size_bytes_shift) - 1); >> 67: >> 68: guarantee((_heap_start_region_bias << _region_size_bytes_shift) == (uintptr_t)_heap_start, "must be aligned"); > > This guarantee seems to blow up on Serial when the heap size isn't a power of two. Apparently this happens when the heap is not aligned properly. 
I integrated @tschatzl's fix and added a test config to TestSystemGCWithSerial.java which checks the situation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1188580688 From ayang at openjdk.org Tue May 9 13:43:30 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 9 May 2023 13:43:30 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v9] In-Reply-To: <0Fk9dSGEQXjClsT_GUnAAFOWUQ44cn2VWGsgsni1DK4=.665fc10a-a6d6-4fca-b19e-fd9305a5c1c9@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <0Fk9dSGEQXjClsT_GUnAAFOWUQ44cn2VWGsgsni1DK4=.665fc10a-a6d6-4fca-b19e-fd9305a5c1c9@github.com> Message-ID: On Tue, 9 May 2023 11:10:27 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. >> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). 
Also does not contain gc efficiencies. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > iwalulya review src/hotspot/share/gc/g1/g1CollectionSetCandidates.cpp line 274: > 272: verify_helper(&_marking_regions, from_marking, reclaimable_bytes, verify_map); > 273: > 274: assert(length() >= marking_regions_length(), "must be"); Don't get what the intention is here, given `uint length() const { return marking_regions_length(); }`. src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 43: > 41: > 42: // A set of HeapRegion*. > 43: class G1CollectionSetRegionList { Now that this is just a region-list, maybe drop the "CollectionSet" part? src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 65: > 63: G1CollectionSetRegionListIterator end() const { return _regions.end(); } > 64: > 65: void verify() PRODUCT_RETURN; Seems unimplemented. src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 181: > 179: uint _last_marking_candidates_length; > 180: > 181: size_t _reclaimable_bytes; Where is this used other than in the assert? src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 214: > 212: > 213: bool is_empty() const; > 214: bool has_no_more_marking_candidates() const; Maybe the positive variant, sth like `has_marking_candidates`?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188615037 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188572380 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188571518 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188596562 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1188606695 From eosterlund at openjdk.org Tue May 9 16:17:59 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 9 May 2023 16:17:59 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v39] In-Reply-To: References: Message-ID: <--YtVZQ3mzi7VmHg2BAufLPcCof-22LduGi74dTjNHc=.8cdb3c01-bd7b-4afd-849c-76be3e20f15c@github.com> On Tue, 9 May 2023 13:20:10 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. 
>> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. 
Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). >> >> The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). >> >> I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. >> >> Testing: >> - [x] hotspot_gc -UseAltGCForwarding >> - [x] hotspot_gc +UseAltGCForwarding >> - [x] tier1 -UseAltGCForwarding >> - [x] tier1 +UseAltGCForwarding >> - [x] tier2 -UseAltGCForwarding >> - [x] tier2 +UseAltGCForwarding > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem src/hotspot/share/gc/shared/slidingForwarding.cpp line 166: > 164: // Set from and to in new or found entry. > 165: entry->_from = from; > 166: entry->_to = to; Does _from and _to change between lookups? In other words, if we first create a FallbackTableEntry and then next time find this existing FallbackTableEntry from the first lookup, shouldn't _from and _to be the same? The fact that setting _from and _to is done outside of the if, makes it look like they can and should mutate between lookups. But surely it will be the same values, or have I missed anything? 
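To make the question above concrete, here is a minimal sketch of a find-or-create fallback table in which the `from`/`to` pair is written only when an entry is created. All names (`FallbackEntry`, `FallbackTable`, `forward_to`, `forwardee`) are hypothetical and not taken from slidingForwarding.cpp. The sketch assumes that an object is entered into the table at most once with a given target, so finding an existing entry can assert the stored values instead of overwriting them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a fallback forwarding table: chained hash buckets
// keyed by the source address. This is an illustration, not the HotSpot code.
struct FallbackEntry {
  FallbackEntry* next;
  const void* from;
  const void* to;
};

class FallbackTable {
  std::vector<FallbackEntry*> _buckets;

  // Drop the low alignment bits before reducing to a bucket index.
  size_t index_for(const void* from) const {
    return (reinterpret_cast<uintptr_t>(from) >> 3) % _buckets.size();
  }

public:
  explicit FallbackTable(size_t num_buckets) : _buckets(num_buckets, nullptr) {
    assert(num_buckets > 0);
  }
  FallbackTable(const FallbackTable&) = delete;
  FallbackTable& operator=(const FallbackTable&) = delete;

  ~FallbackTable() {
    for (FallbackEntry* head : _buckets) {
      while (head != nullptr) {
        FallbackEntry* next = head->next;
        delete head;
        head = next;
      }
    }
  }

  // Record from -> to. The pair is written only when the entry is created.
  void forward_to(const void* from, const void* to) {
    size_t idx = index_for(from);
    for (FallbackEntry* e = _buckets[idx]; e != nullptr; e = e->next) {
      if (e->from == from) {
        // Assumption of this sketch: a repeated insert carries the same target.
        assert(e->to == to && "entry must not change between lookups");
        return;
      }
    }
    _buckets[idx] = new FallbackEntry{_buckets[idx], from, to};
  }

  // Returns the recorded target, or nullptr if 'from' was never forwarded here.
  const void* forwardee(const void* from) const {
    for (FallbackEntry* e = _buckets[index_for(from)]; e != nullptr; e = e->next) {
      if (e->from == from) {
        return e->to;
      }
    }
    return nullptr;
  }
};
```

If re-forwarding to a different target is a legitimate case, the assert in `forward_to` would have to become an assignment, which is the behaviour the quoted code has today.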
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1188839952 From rkennke at openjdk.org Tue May 9 16:29:34 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 16:29:34 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v39] In-Reply-To: <--YtVZQ3mzi7VmHg2BAufLPcCof-22LduGi74dTjNHc=.8cdb3c01-bd7b-4afd-849c-76be3e20f15c@github.com> References: <--YtVZQ3mzi7VmHg2BAufLPcCof-22LduGi74dTjNHc=.8cdb3c01-bd7b-4afd-849c-76be3e20f15c@github.com> Message-ID: On Tue, 9 May 2023 16:15:08 GMT, Erik Österlund wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem > > src/hotspot/share/gc/shared/slidingForwarding.cpp line 166: > >> 164: // Set from and to in new or found entry. >> 165: entry->_from = from; >> 166: entry->_to = to; > > Does _from and _to change between lookups? In other words, if we first create a FallbackTableEntry and then next time find this existing FallbackTableEntry from the first lookup, shouldn't _from and _to be the same? The fact that setting _from and _to is done outside of the if, makes it look like they can and should mutate between lookups. But surely it will be the same values, or have I missed anything? Objects should be forwarded only once. The exception is G1 serial compaction, which re-forwards objects. However, I believe they should initially be forwarded 'normally' (by normal sliding compaction), and then only re-forwarded once, at which point they should land in the fallback-hashtable. I need to check it, but I believe we may be able to simplify this code and assume (and assert) that objects have not yet been forwarded in the fallback-table. Good find!
(BTW, the gtests that I added for this code take advantage of re-forwarding, so those would require a re-write) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1188853376 From rkennke at openjdk.org Tue May 9 18:45:53 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 18:45:53 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v40] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new (experimental) flag -XX:[+|-]UseAltGCForwarding. This flag is not really intended to be used by end-users. Instead, I intend to programatically enable it with compact object headers once they arrive (i.e. -XX:+UseCompactObjectHeaders would turn on -XX:+UseAltGCForwarding), and the flag is also useful for testing purposes. Once compact object headers become the default and only implementation, the flag and old implementation could be removed. Also, [JDK-8305898](https://bugs.openjdk.org/browse/JDK-8305898) would also use the same flag to enable an alternative self-forwarding approach (also in support of compact object headers). > > The change also adds a utility class GCForwarding which calls the old or new implementation based on the flag. I think it would also be used for the self-forwarding change to be proposed soon (and separately). 
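The bit layout listed above (bits 0 and 1 for the forwarded tag, bit 2 for fallback, bit 3 for the target-region selector, bits 4..31 for the word offset) can be sketched as plain bit arithmetic. This is only an illustration of the description: the function and constant names are hypothetical, the per-region table is reduced to a two-element array of base addresses, and the word size is passed in rather than taken from the VM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants following the layout described in the text.
constexpr uint32_t kForwardedBits   = 0x3;            // bits 0-1 set to '11'
constexpr uint32_t kFallbackBit     = 1u << 2;        // bit 2: look up in fallback table
constexpr uint32_t kRegionSelectBit = 1u << 3;        // bit 3: target region 0 or 1
constexpr int      kOffsetShift     = 4;              // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffsetWords  = (1u << 28) - 1; // 28 bits of offset

// Encode "target region <index>, <offset_words> words from its base".
inline uint32_t encode_forwarding(unsigned target_index, uint32_t offset_words) {
  assert(target_index < 2);
  assert(offset_words <= kMaxOffsetWords);
  return (offset_words << kOffsetShift)
       | (target_index != 0 ? kRegionSelectBit : 0u)
       | kForwardedBits;
}

// Tag an encoded value as "resolve through the fallback table instead".
inline uint32_t mark_fallback(uint32_t low) { return low | kFallbackBit; }

inline bool is_forwarded(uint32_t low)  { return (low & kForwardedBits) == kForwardedBits; }
inline bool uses_fallback(uint32_t low) { return (low & kFallbackBit) != 0; }

// Decode against the two candidate target bases recorded for the source region.
inline uintptr_t decode_forwarding(uint32_t low, const uintptr_t bases[2], size_t word_size) {
  assert(is_forwarded(low) && !uses_fallback(low));
  unsigned target_index = (low & kRegionSelectBit) != 0 ? 1u : 0u;
  uintptr_t offset_words = low >> kOffsetShift;
  return bases[target_index] + offset_words * word_size;
}
```

With 28 offset bits and 8-byte heap words, a single target region in this sketch can span up to 2 GB, which is why the table only needs the two base addresses per source region.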
> > I also experimented with a different forwarding approach that would use per-region hashtables, but shelved it for now, because performance was significantly worse than the sliding forwarding encoding. It will become useful later when we want to do 32bit compact object headers, because then, the sliding encoding will not be sufficient to hold forwarding pointers in the header. > > Testing: > - [x] hotspot_gc -UseAltGCForwarding > - [x] hotspot_gc +UseAltGCForwarding > - [x] tier1 -UseAltGCForwarding > - [x] tier1 +UseAltGCForwarding > - [x] tier2 -UseAltGCForwarding > - [x] tier2 +UseAltGCForwarding Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix gtest: Align fake-heaps, avoid re-forwardings ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/158e8b28..69c78eba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=39 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=38-39 Stats: 43 lines in 2 files changed: 6 ins; 6 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From shade at openjdk.org Tue May 9 19:24:32 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 9 May 2023 19:24:32 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v4] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 18:42:36 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. 
I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: > > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - Use forwardee() in forward_to_atomic() method > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - Replace uses of decode_pointer() with forwardee() > - 8305898: Alternative self-forwarding mechanism Looks okay at the first glance, comments: src/hotspot/share/oops/oop.inline.hpp line 271: > 269: void oopDesc::forward_to(oop p) { > 270: markWord m = markWord::encode_pointer_as_mark(p); > 271: assert(forwardee(m) == p, "encoding must be reversable"); Suggestion: assert(forwardee(m) == p, "encoding must be reversible"); src/hotspot/share/oops/oop.inline.hpp line 278: > 276: if (UseAltGCForwarding) { > 277: markWord m = mark(); > 278: // If mark is displaced, we need to preserve real header during GC. Suggestion: // If mark is displaced, we need to preserve the real header during GC. src/hotspot/share/oops/oop.inline.hpp line 304: > 302: > 303: oop oopDesc::forward_to_self_atomic(markWord compare, atomic_memory_order order) { > 304: if (UseAltGCForwarding) { Do you want to assert in `oopDesc::forward_to` and `oopDesc::forward_to_atomic` that they are not called with self-forwarding arguments? src/hotspot/share/oops/oop.inline.hpp line 306: > 304: if (UseAltGCForwarding) { > 305: markWord m = compare; > 306: // If mark is displaced, we need to preserve real header during GC. 
Suggestion: // If mark is displaced, we need to preserve the real header during GC. src/hotspot/share/oops/oop.inline.hpp line 322: > 320: } > 321: } else { > 322: return forward_to_atomic(oop(this), compare, order); Suggestion: return forward_to_atomic(cast_to_oop(this), compare, order); src/hotspot/share/oops/oop.inline.hpp line 329: > 327: assert(header.is_marked(), "only decode when actually forwarded"); > 328: if (header.self_forwarded()) { > 329: assert(UseAltGCForwarding, "Only use self-fwd bits when using alt GC forwarding"); This assert looks excessive, as `self_forwarded` asserts the same? src/hotspot/share/oops/oop.inline.hpp line 332: > 330: return cast_to_oop(this); > 331: } else { > 332: return cast_to_oop(header.decode_pointer()); I think this path misses the original assert: assert(is_forwarded(), "only decode when actually forwarded"); ------------- PR Review: https://git.openjdk.org/jdk/pull/13779#pullrequestreview-1419298697 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189037701 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189030583 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189040413 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189034503 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189038639 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189041662 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189035731 From rkennke at openjdk.org Tue May 9 20:07:37 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 20:07:37 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v5] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. 
This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request incrementally with four additional commits since the last revision: - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/15a8626b..a559e8d6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Tue May 9 20:07:41 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 20:07:41 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v4] In-Reply-To: References: Message-ID: On Tue, 9 May 2023 19:14:26 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains ten commits: >> >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Use forwardee() in forward_to_atomic() method >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Replace uses of decode_pointer() with forwardee() >> - 8305898: Alternative self-forwarding mechanism > > src/hotspot/share/oops/oop.inline.hpp line 332: > >> 330: return cast_to_oop(this); >> 331: } else { >> 332: return cast_to_oop(header.decode_pointer()); > > I think this path misses the original assert: > > > assert(is_forwarded(), "only decode when actually forwarded"); No, not really. This method exists to support racy access on the mark word. The equivalent of is_forwarded() here is header.is_marked(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189077714 From rkennke at openjdk.org Tue May 9 20:25:44 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 20:25:44 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v6] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 16 commits: - Fix assert - Merge branch 'JDK-8305896' into JDK-8305898 - @shipilev suggestions - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - ... and 6 more: https://git.openjdk.org/jdk/compare/69c78eba...915c20bc ------------- Changes: https://git.openjdk.org/jdk/pull/13779/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=05 Stats: 86 lines in 8 files changed: 70 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Tue May 9 20:54:40 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 9 May 2023 20:54:40 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v7] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored.
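As a rough illustration of the proposal above, the sketch below models a header word in which self-forwarding is recorded by setting one low bit instead of overwriting the whole word with a pointer to the object itself. It assumes that "bit #3" means the third-lowest bit (index 2, directly above the two lock bits); the `Mark` struct and its helper names are hypothetical stand-ins and do not reproduce `markWord` or `oopDesc`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header layout: two lock bits, then the self-forwarded bit.
constexpr uint64_t kLockMask    = 0x3;               // bits 0-1
constexpr uint64_t kMarkedValue = 0x3;               // '11' means forwarded
constexpr uint64_t kSelfFwdBit  = uint64_t(1) << 2;  // assumed position of "bit #3"

struct Mark {
  uint64_t bits;

  bool is_marked() const      { return (bits & kLockMask) == kMarkedValue; }
  bool self_forwarded() const { return (bits & kSelfFwdBit) != 0; }

  // Flag the object as self-forwarded without touching the bits above bit 2.
  Mark set_self_forwarded() const { return Mark{bits | kSelfFwdBit | kMarkedValue}; }

  // Everything above the three low bits, e.g. class information in a compact header.
  uint64_t upper_bits() const { return bits >> 3; }
};

// Legacy-style forwarding: the header is replaced by the (aligned) target address
// plus the marked tag. Only valid for 8-byte-aligned targets in this sketch.
inline Mark encode_pointer_as_mark(uintptr_t target) {
  assert((target & 0x7) == 0);
  return Mark{static_cast<uint64_t>(target) | kMarkedValue};
}

// Resolve the forwardee from a header value read once (no second read of the object).
inline uintptr_t forwardee(Mark m, uintptr_t self) {
  assert(m.is_marked());
  if (m.self_forwarded()) {
    return self;  // upper bits still hold the original header contents
  }
  return static_cast<uintptr_t>(m.bits & ~kLockMask);
}
```

The point of the design is visible in `upper_bits()`: after `set_self_forwarded()` it returns the same value as before, whereas `encode_pointer_as_mark()` discards the original header entirely.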
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix asserts (again) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/915c20bc..6d39d575 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From shade at openjdk.org Wed May 10 08:47:16 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 10 May 2023 08:47:16 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v4] In-Reply-To: References: Message-ID: On Tue, 9 May 2023 20:01:39 GMT, Roman Kennke wrote: >> src/hotspot/share/oops/oop.inline.hpp line 332: >> >>> 330: return cast_to_oop(this); >>> 331: } else { >>> 332: return cast_to_oop(header.decode_pointer()); >> >> I think this path misses the original assert: >> >> >> assert(is_forwarded(), "only decode when actually forwarded"); > > No, not really. This method exists to support racy access on the mark word. The equivalent of is_forwarded() here is header.is_marked(). Ah yes, nevermind then. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189560757 From duke at openjdk.org Wed May 10 08:52:17 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 10 May 2023 08:52:17 GMT Subject: RFR: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection [v7] In-Reply-To: References: Message-ID: On Fri, 5 May 2023 13:05:27 GMT, olivergillespie wrote: >> ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. 
After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. >> >> The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: >> >> >> Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) >> After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) >> >> >> Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? >> >> Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? > > olivergillespie has updated the pull request incrementally with one additional commit since the last revision: > > Fix use of WithActiveWorkers > > My scope was incorrect, thanks @albertnetymk > Also fix ResourceMark usage, thanks Aleksey. Re-ran tiers 2/3/4 on the refactor, Linux x86_64 and aarch64, all good. Marking for integration. Thanks all for comments and reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13774#issuecomment-1541602530 From duke at openjdk.org Wed May 10 08:57:25 2023 From: duke at openjdk.org (olivergillespie) Date: Wed, 10 May 2023 08:57:25 GMT Subject: Integrated: 8307348 - Parallelize heap walk for ObjectCount(AfterGC) JFR event collection In-Reply-To: References: Message-ID: <_o7ytZJBXk6X4gX6KIfeAvq61PEIdjN9mkHbPJcfzmQ=.ddeecdf6-6d60-4432-b518-67cb19c13784@github.com> On Wed, 3 May 2023 11:10:31 GMT, olivergillespie wrote: > ObjectCount(AfterGC) event does a full single-threaded heap scan at a safepoint. After https://bugs.openjdk.org/browse/JDK-8215624, it is trivial to use the parallel version of the heap scan, reducing the time spent at the safepoint, and thus reducing the overhead of this event. 
> > The performance improvement is obvious, but just for confirmation, on my 16-core host, at around 1GB occupancy: > > > Before: 770ms ( [3.059s][debug][gc,phases ] GC(13) Report Object Count 770.317ms ) > After: 92ms ( [2.335s][debug][gc,phases ] GC(13) Report Object Count 91.742ms ) > > > Question 1: Should this be the default behaviour for populate_table (use the number active workers as the parallelism, if nothing else specified)? > > Question 2: Is active_workers the correct value to use here? Or is max_workers more appropriate? This pull request has now been integrated. Changeset: 540c706b Author: Oli Gillespie Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/540c706bbcbb809ae1304aac4f2a16a5e83cb458 Stats: 42 lines in 9 files changed: 10 ins; 11 del; 21 mod 8307348: Parallelize heap walk for ObjectCount(AfterGC) JFR event collection Reviewed-by: shade, ayang, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/13774 From shade at openjdk.org Wed May 10 09:02:39 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 10 May 2023 09:02:39 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v7] In-Reply-To: References: Message-ID: On Tue, 9 May 2023 20:54:40 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. 
> > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix asserts (again) More comments src/hotspot/share/oops/oop.inline.hpp line 270: > 268: // Used by scavengers > 269: void oopDesc::forward_to(oop p) { > 270: assert(p != cast_to_oop(this) || !UseAltGCForwarding, "Must not be called with self-forwarding"); Now that I had my morning coffee, I do have a question about the contract here. Can we accidentally call `oop->forward_to(compaction_point)` when `oop == compaction_point` from the compaction code? I guess that would be innocuous for the thing we want to protect against: recording the _promotion failure_, rather than the self-forwarding itself. In other words, the fact that object is self-forwarded might not exactly mean it failed the promotion, might just be a lucky coincidence? If so, maybe this whole thing should be `oopDesc::forward_failed()` or some such, and then let the code decide how to record it, either with self-forwarding address (legacy) or with this new bit. 
src/hotspot/share/oops/oop.inline.hpp line 286: > 284: } > 285: m = m.set_self_forwarded(); > 286: assert(forwardee(m) == cast_to_oop(this), "encoding must be reversable"); Suggestion: assert(forwardee(m) == cast_to_oop(this), "encoding must be reversible"); src/hotspot/share/oops/oop.inline.hpp line 315: > 313: } > 314: m = m.set_self_forwarded(); > 315: assert(forwardee(m) == cast_to_oop(this), "encoding must be reversable"); Suggestion: assert(forwardee(m) == cast_to_oop(this), "encoding must be reversible"); src/hotspot/share/oops/oop.inline.hpp line 315: > 313: } > 314: m = m.set_self_forwarded(); > 315: assert(forwardee(m) == cast_to_oop(this), "encoding must be reversable"); Suggestion: assert(forwardee(m) == cast_to_oop(this), "encoding must be reversible"); ------------- PR Review: https://git.openjdk.org/jdk/pull/13779#pullrequestreview-1420088907 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189580136 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189562063 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189562326 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189562559 From ayang at openjdk.org Wed May 10 09:55:32 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 May 2023 09:55:32 GMT Subject: RFR: 8307808: G1: Remove partial object-count report after gc Message-ID: Simple removing object-count event in the case of incomplete-marking. 
Test: hotspot_gc ------------- Commit messages: - g1-remark-obj-count Changes: https://git.openjdk.org/jdk/pull/13897/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13897&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307808 Stats: 47 lines in 2 files changed: 17 ins; 30 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13897/head:pull/13897 PR: https://git.openjdk.org/jdk/pull/13897 From tschatzl at openjdk.org Wed May 10 10:12:16 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 May 2023 10:12:16 GMT Subject: RFR: 8307808: G1: Remove partial object-count report after gc In-Reply-To: References: Message-ID: On Wed, 10 May 2023 09:48:17 GMT, Albert Mingkun Yang wrote: > Simple removing object-count event in the case of incomplete-marking. > > Test: hotspot_gc Lgtm. Feel free to consider doing the suggested optimization in a separate cr. src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1225: > 1223: bool do_object_b(oop obj) { > 1224: return obj != nullptr && > 1225: (!_g1h->is_in_reserved(obj) || !_g1h->is_obj_dead(obj)); I think the first clause can be removed (or asserted); the (parallel-)objectiterator for g1 always returns `true` for that as it only ever iterates over the heap regions. Also the `!= nullptr` is imo removable for the same reason. We are walking the heap linearly, so I do not see how that value could be null. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13897#pullrequestreview-1420265517 PR Review Comment: https://git.openjdk.org/jdk/pull/13897#discussion_r1189672056 From ayang at openjdk.org Wed May 10 10:26:27 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 May 2023 10:26:27 GMT Subject: RFR: 8307808: G1: Remove partial object-count report after gc In-Reply-To: References: Message-ID: On Wed, 10 May 2023 10:01:33 GMT, Thomas Schatzl wrote: >> Simple removing object-count event in the case of incomplete-marking. >> >> Test: hotspot_gc > > src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1225: > >> 1223: bool do_object_b(oop obj) { >> 1224: return obj != nullptr && >> 1225: (!_g1h->is_in_reserved(obj) || !_g1h->is_obj_dead(obj)); > > I think the first clause can be removed (or asserted); the (parallel-)objectiterator for g1 always returns `true` for that as it only ever iterates over the heap regions. > Also the `!= nullptr` is imo removable for the same reason. We are walking the heap linearly, so I do not see how that value could be null. Think so; will do it in another PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13897#discussion_r1189699599 From rkennke at openjdk.org Wed May 10 10:28:26 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 10:28:26 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v8] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. 
That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/6d39d575..40c1b0be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=06-07 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Wed May 10 10:28:43 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 10:28:43 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v7] In-Reply-To: References: Message-ID: On Wed, 10 May 2023 08:58:54 GMT, Aleksey Shipilev wrote: > Now that I had my morning coffee, I do have a question about the contract here. Can we accidentally call `oop->forward_to(compaction_point)` when `oop == compaction_point` from the compaction code? No, that doesn't seem to happen. In this case, the object doesn't get forwarded at all. If it would happen, it could and should be ignored, because it would result in extra stuff to be executed. > I guess that would be innocuous for the thing we want to protect against: recording the _promotion failure_, rather than the self-forwarding itself. In other words, the fact that object is self-forwarded might not exactly mean it failed the promotion, might just be a lucky coincidence? No, we want to protect against self-forwarding, because that would irrecoverably destroy the Klass* with compact headers.
> If so, maybe this whole thing should be `oopDesc::forward_failed()` or some such, and then let the code decide how to record it, either with self-forwarding address (legacy) or with this new bit. Yes, I guess I could do that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1189702665 From ayang at openjdk.org Wed May 10 10:33:26 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 May 2023 10:33:26 GMT Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize In-Reply-To: References: Message-ID: On Tue, 9 May 2023 03:22:52 GMT, Guoxiong Li wrote: > Hi all, > > When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set in command line, > currently, the `max_young_size/MaxNewSize` will be set to the value of `NewSize`. > > Considering the document of the `NewSize` (shown below), someone may set the `NewSize` > to a very small value and expect the JVM to adjust the value dynamically. > Then when the `MaxHeapSize` is equal to `InitialHeapSize` (set by user or ergonomics), > the `MaxNewSize` is set to the value of `NewSize`, which is small unexpectedly. > > > product(size_t, NewSize, ScaleForWordSize(1*M), \ > "Initial new generation size (in bytes)") \ > constraint(NewSizeConstraintFunc,AfterErgo) \ > > > This patch fixes the issue by setting the `MaxNewSize` to `NewSize` only when the `NewSize` > is larger than the original `max_young_size/MaxNewSize`. > > The title of JDK-8047998 may need to adjusted. > > Thanks for the review. > > Best Regards, > -- Guoxiong Can't say which behavior is "more" unexpected -- `NewSize` means the initial young-size at start-up. Assuming one prints the young-size at start-up, the developer might be surprised if the printed value is diff from `NewSize`. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1541891869 From tschatzl at openjdk.org Wed May 10 10:36:49 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 May 2023 10:36:49 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v10] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. 
> > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with four additional commits since the last revision: - remove _reclaimable_bytes - make reclaimable-bytes debug only - ayang review (1) - iwalulya review, naming compare fn ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13666/files - new: https://git.openjdk.org/jdk/pull/13666/files/fe718701..c477239b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=08-09 Stats: 96 lines in 5 files changed: 4 ins; 70 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From tschatzl at openjdk.org Wed May 10 10:44:46 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 May 2023 10:44:46 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. 
> > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Removed assert that is useless for now ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13666/files - new: https://git.openjdk.org/jdk/pull/13666/files/c477239b..13b6b3c6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From kbarrett at openjdk.org Wed May 10 11:06:17 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 10 May 2023 11:06:17 GMT Subject: RFR: 8307518: Remove G1 workaround in jstat about zero sized generation sized In-Reply-To: References: Message-ID: On Tue, 9 May 2023 07:50:54 GMT, Thomas Schatzl wrote: > Hi all, > > please review removal of some workaround in g1 memory usage monitoring that made sure that there were no 0-sized generations in the output. 
After [JDK-8307428](https://bugs.openjdk.org/browse/JDK-8307428) this is not necessary any more. > > Testing: tier1-5 > > Thanks, > Thomas Looks good. Just some weird pre-existing formatting that could be dealt with now or later. src/hotspot/share/gc/g1/g1MonitoringSupport.cpp line 141: > 139: "space", 0 /* ordinal */, > 140: g1h->max_capacity() /* max_capacity */, > 141: _old_gen_committed /* init_capacity */); [pre-existing] The indentation of all these HSpaceCounters constructor calls is pretty unusual. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13880#pullrequestreview-1420374919 PR Review Comment: https://git.openjdk.org/jdk/pull/13880#discussion_r1189741928 From gli at openjdk.org Wed May 10 11:33:49 2023 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 10 May 2023 11:33:49 GMT Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize In-Reply-To: References: Message-ID: <3T9AcYAO1tnQk6u194q1QzNv9oSJwbpnKe9u9OVxuRs=.83d94754-cab6-4c39-bca5-bd2dd6e51c53@github.com> On Wed, 10 May 2023 10:30:48 GMT, Albert Mingkun Yang wrote: > Can't say which behavior is "more" unexpected -- `NewSize` means the initial young-size at start-up. Assuming one prints the young-size at start-up, the developer might be surprised if the printed value is diff from `NewSize`. Another solution: only change the document of the corresponding flags to remind the user and don't change the code. The draft patch: diff --git a/src/hotspot/share/gc/shared/gc_globals.hpp b/src/hotspot/share/gc/shared/gc_globals.hpp index 1bf74a81706..3ffe8ddeb74 100644 --- a/src/hotspot/share/gc/shared/gc_globals.hpp +++ b/src/hotspot/share/gc/shared/gc_globals.hpp @@ -604,7 +604,12 @@ \ product(size_t, MaxNewSize, max_uintx, \ "Maximum new generation size (in bytes), max_uintx means set " \ - "ergonomically") \ + "ergonomically." 
\ + "If the InitialHeapSize is equal to MaxHeapSize " \ + "(either of them may be set at command line or " \ + "be set ergonomically), and the NewSize is set at command line, " \ + "MaxNewSize would be set to the value of NewSize, " \ + "even the MaxNewSize is also set at command line.") \ range(0, max_uintx) \ \ product_pd(size_t, HeapBaseMinAddress, \ What do you think about it? ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1542026524 From tschatzl at openjdk.org Wed May 10 11:39:53 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 10 May 2023 11:39:53 GMT Subject: RFR: 8307518: Remove G1 workaround in jstat about zero sized generation sized [v2] In-Reply-To: References: Message-ID: > Hi all, > > please review removal of some workaround in g1 memory usage monitoring that made sure that there were no 0-sized generations in the output. After [JDK-8307428](https://bugs.openjdk.org/browse/JDK-8307428) this is not necessary any more. > > Testing: tier1-5 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: minor fixes to indentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13880/files - new: https://git.openjdk.org/jdk/pull/13880/files/6af87816..fba26bf6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13880&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13880&range=00-01 Stats: 30 lines in 1 file changed: 10 ins; 0 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/13880.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13880/head:pull/13880 PR: https://git.openjdk.org/jdk/pull/13880 From ayang at openjdk.org Wed May 10 12:09:24 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 May 2023 12:09:24 GMT Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize In-Reply-To: References: Message-ID: On Tue, 9 May 2023 03:22:52 GMT, Guoxiong Li wrote: > Hi all, > > 
When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set in command line, > currently, the `max_young_size/MaxNewSize` will be set to the value of `NewSize`. > > Considering the document of the `NewSize` (shown below), someone may set the `NewSize` > to a very small value and expect the JVM to adjust the value dynamically. > Then when the `MaxHeapSize` is equal to `InitialHeapSize` (set by user or ergonomics), > the `MaxNewSize` is set to the value of `NewSize`, which is small unexpectedly. > > > product(size_t, NewSize, ScaleForWordSize(1*M), \ > "Initial new generation size (in bytes)") \ > constraint(NewSizeConstraintFunc,AfterErgo) \ > > > This patch fixes the issue by setting the `MaxNewSize` to `NewSize` only when the `NewSize` > is larger than the original `max_young_size/MaxNewSize`. > > The title of JDK-8047998 may need to adjusted. > > Thanks for the review. > > Best Regards, > -- Guoxiong Could you provide a concrete example (specific JVM flags and values) to illustrate the problem (unexpected behavior due to inadequate/misleading/confusing doc)? I don't get what the problem is, even after reading the original ticket description and your revised solution. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1542088591 From ayang at openjdk.org Wed May 10 12:13:25 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 May 2023 12:13:25 GMT Subject: RFR: 8307518: Remove G1 workaround in jstat about zero sized generation sized [v2] In-Reply-To: References: Message-ID: On Wed, 10 May 2023 11:39:53 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review removal of some workaround in g1 memory usage monitoring that made sure that there were no 0-sized generations in the output. After [JDK-8307428](https://bugs.openjdk.org/browse/JDK-8307428) this is not necessary any more. 
>> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > minor fixes to indentation Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13880#pullrequestreview-1420494318 From rkennke at openjdk.org Wed May 10 12:18:27 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 12:18:27 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v9] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. 
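A minimal sketch of the idea in the description above, not the HotSpot implementation. The `Header` type, the helper names, the bit position (third-lowest bit, mask `0b100`) and the "class information in the upper 32 bits" layout are all assumptions made for illustration. It contrasts the legacy scheme, which overwrites the whole header with the object's own address, against setting a single flag bit and leaving the rest of the header intact.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit object header; the layout is assumed for illustration only.
using Header = std::uint64_t;

// Assumed position of the former biased-locking bit (third-lowest bit).
constexpr Header kSelfForwardedBit = Header{1} << 2;
// Low two bits set to '11' mark a header as forwarded (assumed, as in the legacy scheme).
constexpr Header kForwardedTag = 0b11;

// Legacy scheme: the header is replaced by the object's own address plus the tag,
// so whatever was stored in the header before (including class bits) is lost.
constexpr Header legacy_self_forward(std::uint64_t self_addr) {
  return self_addr | kForwardedTag;
}

// Alternative scheme: only one flag bit changes; every other bit is preserved.
constexpr Header mark_self_forwarded(Header h)  { return h | kSelfForwardedBit; }
constexpr bool   is_self_forwarded(Header h)    { return (h & kSelfForwardedBit) != 0; }
constexpr Header clear_self_forwarded(Header h) { return h & ~kSelfForwardedBit; }

// Assumption: class information lives in the upper 32 bits of the header.
constexpr std::uint32_t class_bits(Header h) {
  return static_cast<std::uint32_t>(h >> 32);
}

// Returns true if the class bits survive marking the header as self-forwarded.
inline bool flag_preserves_class_bits(Header h) {
  Header marked = mark_self_forwarded(h);
  assert(is_self_forwarded(marked));
  return class_bits(marked) == class_bits(h);
}
```

With a header such as `0xCAFEBABE00000001`, the flag-based variant keeps `0xCAFEBABE` readable in the upper bits, whereas the legacy variant replaces it with bits of the object address.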
Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 - Rename self-forwarded -> forward-failed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/40c1b0be..39c33727 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=07-08 Stats: 22 lines in 6 files changed: 0 ins; 0 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From gli at openjdk.org Wed May 10 12:35:24 2023 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 10 May 2023 12:35:24 GMT Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize In-Reply-To: References: Message-ID: On Wed, 10 May 2023 12:06:29 GMT, Albert Mingkun Yang wrote: > Could you provide a concrete example (specific JVM flags and values) to illustrate the problem (unexpected behavior due to > inadequate/misleading/confusing doc)? I don't get what the problem is, even after reading the original ticket description and > your revised solution. The possible flags are `-XX:MaxHeapSize=256M -XX:NewSize=1M -XX:MaxNewSize=80M`. The user wants the JVM to initialize the new generation as 1M and then expands to 80M gradually. If the physical memory is very larger, the `InitialHeapSize` will be set to `MaxHeapSize` (which is `256M` according to the previous flags). The related code in method `Arguments::set_heap_size` is shown below. 
// method Arguments::set_heap_size julong reasonable_initial = (julong)((phys_mem * InitialRAMPercentage) / 100); reasonable_initial = limit_heap_by_allocatable_memory(reasonable_initial); reasonable_initial = MAX3(reasonable_initial, reasonable_minimum, (julong)MinHeapSize); reasonable_initial = MIN2(reasonable_initial, (julong)MaxHeapSize); // <-- here Then in method `GenArguments::initialize_size_info`, the `MaxNewSize` is set to `NewSize` (which is `1M` according to the previous flags). The related code in method `GenArguments::initialize_size_info` is shown below. // method GenArguments::initialize_size_info if (MaxHeapSize == InitialHeapSize) { // The maximum and initial heap sizes are the same so the generation's // initial size must be the same as it maximum size. Use NewSize as the // size if set on command line. max_young_size = FLAG_IS_CMDLINE(NewSize) ? NewSize : max_young_size; // <-- here initial_young_size = max_young_size; // Also update the minimum size if min == initial == max. if (MaxHeapSize == MinHeapSize) { MinNewSize = max_young_size; } } As you can see, the `MaxNewSize` and `NewSize` are always `1M` and the new generation never expands, which is not the user's intention and may lead to unexpected result. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1542121917 From gli at openjdk.org Wed May 10 12:38:29 2023 From: gli at openjdk.org (Guoxiong Li) Date: Wed, 10 May 2023 12:38:29 GMT Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize In-Reply-To: References: Message-ID: On Wed, 10 May 2023 12:28:01 GMT, Guoxiong Li wrote: > The possible flags are -XX:MaxHeapSize=256M -XX:NewSize=1M -XX:MaxNewSize=80M. The `1M` may be very extremely small. `10M` is better. But it doesn't prevent the understanding of the issue. 
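A minimal, standalone sketch of the difference being discussed, assuming `MaxHeapSize == InitialHeapSize` already holds. The function names are hypothetical and the real flag machinery (`FLAG_IS_CMDLINE`, ergonomics) is reduced to a plain `bool`. The first function mirrors the quoted line from `GenArguments::initialize_size_info`; the second follows the PR description, which takes `NewSize` as the maximum only when it is larger than the original `max_young_size`.

```cpp
#include <cstddef>

constexpr std::size_t M = 1024 * 1024;

// Mirrors the quoted code: a command-line NewSize always replaces max_young_size.
constexpr std::size_t current_max_young(bool new_size_on_cmdline,
                                        std::size_t new_size,
                                        std::size_t max_young_size) {
  return new_size_on_cmdline ? new_size : max_young_size;
}

// Behaviour described in the PR text (sketch only): NewSize replaces
// max_young_size only when it is larger than the original value.
constexpr std::size_t proposed_max_young(bool new_size_on_cmdline,
                                         std::size_t new_size,
                                         std::size_t max_young_size) {
  return (new_size_on_cmdline && new_size > max_young_size) ? new_size
                                                            : max_young_size;
}
```

For `-XX:NewSize=1M -XX:MaxNewSize=80M`, the first function yields a 1M maximum (the young generation can never grow), while the second keeps the 80M maximum.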
------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1542135202 From iwalulya at openjdk.org Wed May 10 12:48:25 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 10 May 2023 12:48:25 GMT Subject: RFR: 8307808: G1: Remove partial object-count report after gc In-Reply-To: References: Message-ID: On Wed, 10 May 2023 09:48:17 GMT, Albert Mingkun Yang wrote: > Simple removing object-count event in the case of incomplete-marking. > > Test: hotspot_gc Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13897#pullrequestreview-1420560826 From ayang at openjdk.org Wed May 10 13:47:13 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 10 May 2023 13:47:13 GMT Subject: RFR: 8047998: -XX:InitialHeapSize is unnecessarily set to MaxHeapSize In-Reply-To: References: Message-ID: <5yV-MFOOJ08VQ8XbT56FVt_cF_6xVxUqNklXOoyIh1Q=.057902ae-6e47-418a-8700-3a0cefa5a8ad@github.com> On Tue, 9 May 2023 03:22:52 GMT, Guoxiong Li wrote: > Hi all, > > When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set in command line, > currently, the `max_young_size/MaxNewSize` will be set to the value of `NewSize`. > > Considering the document of the `NewSize` (shown below), someone may set the `NewSize` > to a very small value and expect the JVM to adjust the value dynamically. > Then when the `MaxHeapSize` is equal to `InitialHeapSize` (set by user or ergonomics), > the `MaxNewSize` is set to the value of `NewSize`, which is small unexpectedly. > > > product(size_t, NewSize, ScaleForWordSize(1*M), \ > "Initial new generation size (in bytes)") \ > constraint(NewSizeConstraintFunc,AfterErgo) \ > > > This patch fixes the issue by setting the `MaxNewSize` to `NewSize` only when the `NewSize` > is larger than the original `max_young_size/MaxNewSize`. > > The title of JDK-8047998 may need to adjusted. > > Thanks for the review. 
> > Best Regards, > -- Guoxiong Thank you for breaking down the problem. It's important to note that `-XX:InitialRAMPercentage=1.5625` is also included implicitly. I can't say I'm entirely surprised by the following: `-XX:MaxHeapSize=256M -XX:NewSize=1M -XX:MaxNewSize=80M` + `-XX:InitialRAMPercentage=1.5625` on a large RAM system leads to a constant young-size. (Of course, one could argue that the default value of `InitialRAMPercentage` is surprising. Would `0` be any better? After all, its counterpart, `InitialHeapSize`, has `0` as the default value.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1542242310 From rkennke at openjdk.org Wed May 10 13:51:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 13:51:23 GMT Subject: RFR: 8307816: Add missing STS to ZGC Message-ID: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com> Testing in project Lilliput has revealed that ZGC is lacking one STS. Without it, ZGC could reach to already-deflated monitor when trying to fetch a displaced header, in order to get to an object's Klass* (e.g. to get its size). 
Testing: - [x] hotspot_gc ------------- Commit messages: - 8307816: Add missing STS to ZGC Changes: https://git.openjdk.org/jdk/pull/13904/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13904&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307816 Stats: 6 lines in 1 file changed: 6 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13904.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13904/head:pull/13904 PR: https://git.openjdk.org/jdk/pull/13904 From eosterlund at openjdk.org Wed May 10 18:15:14 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Wed, 10 May 2023 18:15:14 GMT Subject: RFR: 8307816: Add missing STS to ZGC In-Reply-To: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com> References: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com> Message-ID: On Wed, 10 May 2023 13:43:36 GMT, Roman Kennke wrote: > Testing in project Lilliput has revealed that ZGC is lacking one STS. Without it, ZGC could reach to already-deflated monitor when trying to fetch a displaced header, in order to get to an object's Klass* (e.g. to get its size). > > Testing: > - [x] hotspot_gc So far single generation ZGC has not had any need for the STS joiner, so we have barely used it at all. I think only some class unloading code uses it. With generational ZGC we have added such mechanisms all over the place to allow the generations to synchronize w.r.t. each other in safepoints. So I think in generational ZGC there are no changes required in this regard. However, in single generation ZGC, the change you have made will cause the JVM to hang if a load barrier executed in a safepoint hits a relocation stall due to memory shortage. That code assumes the concurrent GC thread will make progress with compaction from within the safepoint.
In generational ZGC we went through the trouble of adding an STS like mechanism that is aware of relocation stalls, and allows them to be solved from within safepoints, on the mutator's behalf. You would need something like that for single generation ZGC as well. Other than that, there are probably other places requiring STS that reads the klass pointer, such as during marking and probably reference processing, etc. Perhaps since this is going to be experimental and mostly aiming towards hitting production in the future, a pragmatic solution is to support only generational ZGC, which shouldn't really require any changes like this. What do you think? ------------- Changes requested by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13904#pullrequestreview-1421160770 From rkennke at openjdk.org Wed May 10 18:46:17 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 18:46:17 GMT Subject: RFR: 8307816: Add missing STS to ZGC In-Reply-To: References: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com> Message-ID: On Wed, 10 May 2023 18:12:52 GMT, Erik Österlund wrote: > So far single generation ZGC has not had any need for the STS joiner, so we have barely used it at all. I think only some class unloading code uses it. With generational ZGC we have added such mechanisms all over the place to allow the generations to synchronize w.r.t. each other in safepoints. So I think in generational ZGC there are no changes required in this regard. However, in single generation ZGC, the change you have made will cause the JVM to hang if a load barrier executed in a safepoint hits a relocation stall due to memory shortage. That code assumes the concurrent GC thread will make progress with compaction from within the safepoint.
In generational ZGC we went through the trouble of adding an STS like mechanism that is aware of relocation stalls, and allows them to be solved from within safepoints, on the mutator's behalf. You would need something like that for single generation ZGC as well. Other than that, there are probably other places requiring STS that reads the klass pointer, such as during marking and probably reference processing, etc. Perhaps since this is going to be experimental and mostly aiming towards hitting production in the future, a pragmatic solution is to support only generational ZGC, which shouldn't really require any changes like this. What do you think? Ok, if generational ZGC comes with all the relevant synchronizations, then I will wait for that, and withdraw this PR. Thanks for the explanations. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13904#issuecomment-1542632281 From rkennke at openjdk.org Wed May 10 20:30:04 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 20:30:04 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v10] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. 
Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8305896' into JDK-8305898 - Align fake-heap without GCC warnings (duh) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/39c33727..02297920 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=08-09 Stats: 8 lines in 2 files changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Wed May 10 20:31:27 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 10 May 2023 20:31:27 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v41] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. 
This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... 
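The encoding described above can be sketched roughly as follows. This is not the code from the PR: the `SlidingTable` and `TargetBases` types and the use of heap-word indices instead of real addresses are assumptions made to keep the example self-contained. Only the bit layout (bits 0..1 set to '11', bit 2 for fallback, bit 3 for region select, bits 4..31 for the word offset) is taken from the description, and the table is assumed to be filled in already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit layout taken from the description; all names are hypothetical.
constexpr std::uint32_t kForwardedBits = 0b11;     // bits 0..1: object is forwarded
constexpr std::uint32_t kFallbackBit   = 1u << 2;  // bit 2: look up in the fallback table
constexpr std::uint32_t kAltRegionBit  = 1u << 3;  // bit 3: selects target region 0 or 1
constexpr unsigned      kOffsetShift   = 4;        // bits 4..31: offset in heap words

// One table entry per source region: the two possible target region starts,
// expressed as heap-word indices into a toy heap.
struct TargetBases {
  std::size_t base[2];
};

struct SlidingTable {
  std::size_t region_words = 0;      // equal region size in heap words; assumed <= 2^28
  std::vector<TargetBases> entries;  // N entries, one per source region

  // Encode the forwarding of the object at from_word to to_word.
  std::uint32_t encode(std::size_t from_word, std::size_t to_word) const {
    const TargetBases& e = entries[from_word / region_words];
    for (std::uint32_t alt = 0; alt < 2; ++alt) {
      if (to_word >= e.base[alt] && to_word - e.base[alt] < region_words) {
        std::uint32_t offset = static_cast<std::uint32_t>(to_word - e.base[alt]);
        return (offset << kOffsetShift) | (alt ? kAltRegionBit : 0u) | kForwardedBits;
      }
    }
    // Neither target region matches: the caller has to use the fallback table.
    return kFallbackBit | kForwardedBits;
  }

  // Decode a non-fallback forwarding back into a heap-word index.
  std::size_t decode(std::size_t from_word, std::uint32_t bits) const {
    assert((bits & kForwardedBits) == kForwardedBits);
    assert((bits & kFallbackBit) == 0);
    const TargetBases& e = entries[from_word / region_words];
    unsigned alt = (bits & kAltRegionBit) ? 1u : 0u;
    return e.base[alt] + (bits >> kOffsetShift);
  }
};
```

With 1024-word regions and a source region whose targets start at words 0 and 1024, an object sliding to word 1030 would be encoded with the region-select bit set and an offset of 6.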
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Align fake-heap without GCC warnings (duh) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/69c78eba..b0deb2b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=40 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=39-40 Stats: 8 lines in 2 files changed: 4 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From eosterlund at openjdk.org Thu May 11 08:04:52 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 11 May 2023 08:04:52 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v10] In-Reply-To: References: Message-ID: On Wed, 10 May 2023 20:30:04 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'JDK-8305896' into JDK-8305898 > - Align fake-heap without GCC warnings (duh) Changes requested by eosterlund (Reviewer).
src/hotspot/share/oops/oop.inline.hpp line 276: > 274: } > 275: > 276: void oopDesc::forward_failed() { It is a bit confusing that oopDesc::forward_failed is a setter, while markWord::forward_failed is a getter. ------------- PR Review: https://git.openjdk.org/jdk/pull/13779#pullrequestreview-1421977259 PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1190781770 From rkennke at openjdk.org Thu May 11 08:48:45 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 11 May 2023 08:48:45 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v10] In-Reply-To: References: Message-ID: On Thu, 11 May 2023 08:00:53 GMT, Erik Österlund wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Align fake-heap without GCC warnings (duh) > > src/hotspot/share/oops/oop.inline.hpp line 276: > >> 274: } >> 275: >> 276: void oopDesc::forward_failed() { > > It is a bit confusing that oopDesc::forward_failed is a setter, while markWord::forward_failed is a getter. Yeah. It's even more confusing that we now have the notion of forward-failed, which aims to hide the implementation detail of self-forwarding, but forwardee() still exposes it. And probably has to, because that is how the forwarding logic of GCs currently work, and I'm not sure it is useful to change that. I need to mull over this a bit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1190843704 From shade at openjdk.org Thu May 11 09:50:49 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 May 2023 09:50:49 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v7] In-Reply-To: References: Message-ID: On Thu, 11 May 2023 09:46:56 GMT, Aleksey Shipilev wrote: >>> Now that I had my morning coffee, I do have a question about the contract here.
Can we accidentally call `oop->forward_to(compaction_point)` when `oop == compaction_point` from the compaction code? >> >> No, that doesn't seem to happen. In this case, the object doesn't get forwarded at all. If it would happen, it could and should be ignored, because it would result in extra stuff to be executed. >> >>> I guess that would be innocuous for the thing we want to protect against: recording the _promotion failure_, rather than the self-forwarding itself. In other words, the fact that object is self-forwarded might not exactly mean it failed the promotion, might just be a lucky coincidence? >> >> No, we want to protect against self-forwarding, because that would irrecoverably destroy the Klass* with compact headers. >> >>> If so, maybe this whole thing should be `oopDesc::forward_failed()` or some such, and then let the code decide how to record it, either with self-forwarding address (legacy) or with this new bit. >> >> Yes, I guess I could do that. > > Yeah, perhaps due to the self-forwarding contract with `forwardee`, this is not significantly cleaner. The encapsulation does not achieve much if we have the gaping hole from the other side of this abstraction. So the original `forward_to_self` is already good. Sorry for pushing in the wrong direction :) My only left-over concern is that the assert might still fail when self-forwarding for non-promotion-failure reasons, but that might as well indicate a performance problem in GC code that should avoid self-forwardings on the common path to begin with. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1190920035 From shade at openjdk.org Thu May 11 09:50:48 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 May 2023 09:50:48 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v7] In-Reply-To: References: Message-ID: On Wed, 10 May 2023 10:26:39 GMT, Roman Kennke wrote: >> src/hotspot/share/oops/oop.inline.hpp line 270: >> >>> 268: // Used by scavengers >>> 269: void oopDesc::forward_to(oop p) { >>> 270: assert(p != cast_to_oop(this) || !UseAltGCForwarding, "Must not be called with self-forwarding"); >> >> Now that I had my morning coffee, I do have a question about the contract here. Can we accidentally call `oop->forward_to(compaction_point)` when `oop == compaction_point` from the compaction code? I guess that would be innocuous for the thing we want to protect against: recording the _promotion failure_, rather than the self-forwarding itself. In other words, the fact that object is self-forwarded might not exactly mean it failed the promotion, might just be a lucky coincidence? >> >> If so, maybe this whole thing should be `oopDesc::forward_failed()` or some such, and then let the code decide how to record it, either with self-forwarding address (legacy) or with this new bit. > >> Now that I had my morning coffee, I do have a question about the contract here. Can we accidentally call `oop->forward_to(compaction_point)` when `oop == compaction_point` from the compaction code? > > No, that doesn't seem to happen. In this case, the object doesn't get forwarded at all. If it would happen, it could and should be ignored, because it would result in extra stuff to be executed. > >> I guess that would be innocuous for the thing we want to protect against: recording the _promotion failure_, rather than the self-forwarding itself. 
In other words, the fact that object is self-forwarded might not exactly mean it failed the promotion, might just be a lucky coincidence? > > No, we want to protect against self-forwarding, because that would irrecoverably destroy the Klass* with compact headers. > >> If so, maybe this whole thing should be `oopDesc::forward_failed()` or some such, and then let the code decide how to record it, either with self-forwarding address (legacy) or with this new bit. > > Yes, I guess I could do that. Yeah, perhaps due to the self-forwarding contract with `forwardee`, this is not significantly cleaner. The encapsulation does not achieve much if we have the gaping hole from the other side of this abstraction. So the original `forward_to_self` is already good. Sorry for pushing in the wrong direction :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1190918917 From iwalulya at openjdk.org Thu May 11 11:12:45 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 11 May 2023 11:12:45 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Wed, 10 May 2023 10:44:46 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. 
>> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. >> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Removed assert that is useless for now Lgtm! ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13666#pullrequestreview-1422332605 From shade at openjdk.org Thu May 11 12:01:10 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 May 2023 12:01:10 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v41] In-Reply-To: References: Message-ID: On Wed, 10 May 2023 20:31:27 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. 
Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. 
>> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The c... > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Align fake-heap without GCC warnings (duh) A few additional minor things: src/hotspot/share/gc/shared/slidingForwarding.cpp line 152: > 150: // Search existing entry in chain starting at idx. > 151: for (FallbackTableEntry* entry = head; entry != nullptr; entry = entry->_next) { > 152: assert(entry->_from != from,"Don't re-forward entries into the fallback-table"); Suggestion: assert(entry->_from != from, "Don't re-forward entries into the fallback-table"); src/hotspot/share/gc/shared/slidingForwarding.hpp line 62: > 60: * ^----- normal lock bits, would record "object is forwarded" > 61: * ^------ fallback bit (explained below) > 62: * ^------- alternate region select Suggestion: * [................................OOOOOOOOOOOOOOOOOOOOOOOOOOOOAFTT] * ^----- normal lock bits, would record "object is forwarded" * ^------- fallback bit (explained below) * ^-------- alternate region select test/hotspot/gtest/gc/shared/test_preservedMarks.cpp line 61: > 59: #ifndef PRODUCT > 60: FlagSetting fs(UseAltGCForwarding, false); > 61: #endif So, would this test fail in release JDK, but when `UseAltGCForwarding` is `true`? If so, maybe do: #ifndef PRODUCT FlagSetting fs(UseAltGCForwarding, false); #else // Should not run this test with alt GC forwarding if (UseAltGCForwarding) return; #endif test/hotspot/jtreg/gc/stress/systemgc/TestSystemGCWithG1.java line 44: > 42: * @requires vm.gc.G1 > 43: * @requires vm.debug > 44: * @requires os.maxMemory > 8g Why `>8g`? The test uses `-Xmx512m`. 
test/hotspot/jtreg/gc/stress/systemgc/TestSystemGCWithSerial.java line 65: > 63: * @library / > 64: * @requires vm.gc.Serial > 65: * @requires vm.debug This likely requires another requires: * @requires (vm.bits == "64") & (os.maxMemory >= 6G) ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1422199824 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1191051839 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1191057460 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1190936600 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1190929583 PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1190922619 From shade at openjdk.org Thu May 11 12:01:11 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 May 2023 12:01:11 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v41] In-Reply-To: References: Message-ID: On Thu, 11 May 2023 09:55:32 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Align fake-heap without GCC warnings (duh) > > test/hotspot/jtreg/gc/stress/systemgc/TestSystemGCWithG1.java line 44: > >> 42: * @requires vm.gc.G1 >> 43: * @requires vm.debug >> 44: * @requires os.maxMemory > 8g > > Why `>8g`? The test uses `-Xmx512m`. You'd probably need `(vm.bits == "64")` check too, because you can have a 32-bit system with lots of memory. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1190930271 From rkennke at openjdk.org Thu May 11 12:19:58 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 11 May 2023 12:19:58 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v11] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: - wqRevert "Rename self-forwarded -> forward-failed" This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. - Merge branch 'JDK-8305896' into JDK-8305898 - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipil?v - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipil?v - Rename self-forwarded -> forward-failed - Fix asserts (again) - Fix assert - Merge branch 'JDK-8305896' into JDK-8305898 - @shipilev suggestions - ... 
and 13 more: https://git.openjdk.org/jdk/compare/b0deb2b3...866771c3 ------------- Changes: https://git.openjdk.org/jdk/pull/13779/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=10 Stats: 86 lines in 8 files changed: 70 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Thu May 11 12:29:59 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 11 May 2023 12:29:59 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v12] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: - Merge branch 'JDK-8305896' into JDK-8305898 - wqRevert "Rename self-forwarded -> forward-failed" This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. 
- Merge branch 'JDK-8305896' into JDK-8305898 - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipil?v - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipil?v - Rename self-forwarded -> forward-failed - Fix asserts (again) - Fix assert - Merge branch 'JDK-8305896' into JDK-8305898 - ... and 14 more: https://git.openjdk.org/jdk/compare/3271b29b...95341f0a ------------- Changes: https://git.openjdk.org/jdk/pull/13779/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=11 Stats: 86 lines in 8 files changed: 70 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Thu May 11 12:30:08 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 11 May 2023 12:30:08 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v42] In-Reply-To: References: Message-ID: <9P7Qf2b3x_d9lGYPl0VNxtWxSLj3rREZMK5JOrqVqog=.057c5855-3d1c-413a-9d0a-291aef59e3ce@github.com> > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. 
For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... 
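The fallback table mentioned above is described only as "a simple open hash-table". A rough sketch of such a table, with per-bucket chains in the spirit of the `FallbackTableEntry` loop quoted in the review, might look like the following. The class, its hash function and the use of 0 as the "not found" value are assumptions for illustration and are not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the fallback table: a fixed number of buckets,
// each holding a singly linked chain of (from, to) address pairs.
class FallbackTable {
  struct Entry {
    std::uintptr_t from = 0;
    std::uintptr_t to = 0;
    std::unique_ptr<Entry> next;
  };
  std::vector<std::unique_ptr<Entry>> buckets_;

  // Placeholder hash: drop the low alignment bits, then reduce to a bucket.
  std::size_t index_of(std::uintptr_t from) const {
    return static_cast<std::size_t>(from >> 3) % buckets_.size();
  }

public:
  // num_buckets must be greater than zero.
  explicit FallbackTable(std::size_t num_buckets) : buckets_(num_buckets) {}

  // Record that the object at 'from' is forwarded to 'to'.
  void forward_to(std::uintptr_t from, std::uintptr_t to) {
    std::unique_ptr<Entry>& head = buckets_[index_of(from)];
    // An object must not be entered twice.
    for (const Entry* e = head.get(); e != nullptr; e = e->next.get()) {
      assert(e->from != from && "Don't re-forward entries into the fallback-table");
    }
    auto entry = std::make_unique<Entry>();
    entry->from = from;
    entry->to = to;
    entry->next = std::move(head);  // push at the front of the chain
    head = std::move(entry);
  }

  // Look up the forwardee; returns 0 if 'from' was never recorded.
  std::uintptr_t forwardee(std::uintptr_t from) const {
    for (const Entry* e = buckets_[index_of(from)].get(); e != nullptr; e = e->next.get()) {
      if (e->from == from) {
        return e->to;
      }
    }
    return 0;
  }
};
```

No locking is needed in this sketch for the same reason given in the description: the table is only used from the single-threaded G1 serial compaction.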
Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Some more @shipilev comments - Update src/hotspot/share/gc/shared/slidingForwarding.hpp Co-authored-by: Aleksey Shipilëv ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/b0deb2b3..3271b29b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=41 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=40-41 Stats: 20 lines in 5 files changed: 16 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From tschatzl at openjdk.org Thu May 11 14:05:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 11 May 2023 14:05:55 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v9] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <0Fk9dSGEQXjClsT_GUnAAFOWUQ44cn2VWGsgsni1DK4=.665fc10a-a6d6-4fca-b19e-fd9305a5c1c9@github.com> Message-ID: On Tue, 9 May 2023 13:08:10 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> iwalulya review > > src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 43: > >> 41: >> 42: // A set of HeapRegion*. >> 43: class G1CollectionSetRegionList { > > Now that this is just a region-list, maybe drop the "CollectionSet" part? I would like to keep the name as is and avoid generalizations that are unnecessary at this time. If there is additional use for it, we can always factor it out.
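For readers following the naming discussion, the PR description pairs each region with its gc efficiency in the candidate list and gives the helper lists C++ iterators. A minimal sketch of that shape is shown below. `Region`, `Candidate` and `CandidateList` are hypothetical stand-ins rather than the actual G1 classes, and `append_unsorted` here takes the efficiency as an explicit argument purely for the sake of the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HeapRegion.
struct Region {
  unsigned index;
};

// A candidate is a region plus its gc efficiency; the efficiency is stored
// with the list element instead of inside the region.
struct Candidate {
  Region* region;
  double gc_efficiency;
};

class CandidateList {
  std::vector<Candidate> candidates_;

public:
  // Append without keeping any order.
  void append_unsorted(Region* r, double gc_efficiency) {
    candidates_.push_back(Candidate{r, gc_efficiency});
  }

  // Order the candidates from most to least efficient.
  void sort_by_efficiency() {
    std::sort(candidates_.begin(), candidates_.end(),
              [](const Candidate& a, const Candidate& b) {
                return a.gc_efficiency > b.gc_efficiency;
              });
  }

  std::size_t length() const { return candidates_.size(); }

  // Iterator support so callers can use range-based for loops.
  std::vector<Candidate>::const_iterator begin() const { return candidates_.begin(); }
  std::vector<Candidate>::const_iterator end() const { return candidates_.end(); }
};
```

A caller can then write `for (const Candidate& c : list) { ... }` instead of indexing into the list by hand.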
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191220241 From shade at openjdk.org Thu May 11 14:45:47 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 May 2023 14:45:47 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v12] In-Reply-To: References: Message-ID: On Thu, 11 May 2023 12:29:59 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 24 commits: > > - Merge branch 'JDK-8305896' into JDK-8305898 > - wqRevert "Rename self-forwarded -> forward-failed" > > This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 > - Update src/hotspot/share/oops/oop.inline.hpp > > Co-authored-by: Aleksey Shipil?v > - Update src/hotspot/share/oops/oop.inline.hpp > > Co-authored-by: Aleksey Shipil?v > - Rename self-forwarded -> forward-failed > - Fix asserts (again) > - Fix assert > - Merge branch 'JDK-8305896' into JDK-8305898 > - ... and 14 more: https://git.openjdk.org/jdk/compare/3271b29b...95341f0a I am okay with it, provided it passes `tier1..3`, and at least `tier1` with different GCs. 
------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13779#pullrequestreview-1422780943 From shade at openjdk.org Thu May 11 14:50:06 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 11 May 2023 14:50:06 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v42] In-Reply-To: <9P7Qf2b3x_d9lGYPl0VNxtWxSLj3rREZMK5JOrqVqog=.057c5855-3d1c-413a-9d0a-291aef59e3ce@github.com> References: <9P7Qf2b3x_d9lGYPl0VNxtWxSLj3rREZMK5JOrqVqog=.057c5855-3d1c-413a-9d0a-291aef59e3ce@github.com> Message-ID: On Thu, 11 May 2023 12:30:08 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. 
>> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31: The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, where there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker processes a disjoint set of regions. >> - G1 serial compaction is single-threaded >> >> The c... > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Some more @shipilev comments > - Update src/hotspot/share/gc/shared/slidingForwarding.hpp > > Co-authored-by: Aleksey Shipilëv The updates look fine, thanks. ------------- Marked as reviewed by shade (Reviewer).
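The bit layout in the quoted description can be sketched as follows. This is an illustration under stated assumptions, not the code in `slidingForwarding.hpp`: the class name, the `HeapWord` typedef, the rule of claiming target slots on first use, and the `std::unordered_map` that stands in for the open hash table are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

typedef std::uintptr_t HeapWord;  // hypothetical stand-in for one heap word

const std::uint32_t kForwardedTag   = 0x3;      // bits 0..1 set to '11'
const std::uint32_t kFallbackBit    = 1u << 2;  // bit 2
const std::uint32_t kRegionSelShift = 3;        // bit 3 selects target 0 or 1
const std::uint32_t kOffsetShift    = 4;        // bits 4..31 hold the word offset

class SlidingForwardingSketch {
  struct Targets { HeapWord* base[2] = {nullptr, nullptr}; };

  HeapWord*   _heap_start;
  std::size_t _region_words;
  std::vector<Targets> _table;                         // one entry per source region
  std::unordered_map<HeapWord*, HeapWord*> _fallback;  // simplified fallback table

  std::size_t region_index(HeapWord* p) const {
    return static_cast<std::size_t>(p - _heap_start) / _region_words;
  }

public:
  SlidingForwardingSketch(HeapWord* heap_start, std::size_t num_regions,
                          std::size_t region_words)
    : _heap_start(heap_start), _region_words(region_words), _table(num_regions) {}

  // Returns the low 32 header bits describing the forwarding from -> to.
  std::uint32_t encode(HeapWord* from, HeapWord* to) {
    HeapWord* target_base = _heap_start + region_index(to) * _region_words;
    Targets& t = _table[region_index(from)];
    for (std::uint32_t sel = 0; sel < 2; sel++) {
      if (t.base[sel] == nullptr) t.base[sel] = target_base;  // claim a free slot
      if (t.base[sel] == target_base) {
        std::uint32_t offset = static_cast<std::uint32_t>(to - target_base);
        assert(offset < (1u << 28) && "offset must fit in bits 4..31");
        return kForwardedTag | (sel << kRegionSelShift) | (offset << kOffsetShift);
      }
    }
    _fallback[from] = to;  // both slots already name other target regions
    return kForwardedTag | kFallbackBit;
  }

  // Recovers the forwardee from the source address and the encoded bits.
  HeapWord* decode(HeapWord* from, std::uint32_t bits) const {
    assert((bits & kForwardedTag) == kForwardedTag && "object is not forwarded");
    if (bits & kFallbackBit) return _fallback.at(from);
    std::uint32_t sel = (bits >> kRegionSelShift) & 1u;
    return _table[region_index(from)].base[sel] + (bits >> kOffsetShift);
  }
};
```

The sketch takes the fallback path only when a source region forwards into a third target region, which per the description should not occur during ordinary sliding compaction.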
PR Review: https://git.openjdk.org/jdk/pull/13582#pullrequestreview-1422790198 From ayang at openjdk.org Thu May 11 14:55:51 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 11 May 2023 14:55:51 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Wed, 10 May 2023 10:44:46 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. >> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). >> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. 
>> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > Removed assert that is useless for now src/hotspot/share/gc/g1/g1CollectionSet.hpp line 152: > 150: uint _survivor_region_length; > 151: > 152: G1CollectionSetRegionList _initial_old_regions; Why is the whole list saved in the field? I'd expect initial-old-regions to be a transient list used to move regions from candidate list to cset (live only inside `G1CollectionSet::finalize_old_part`). `_initial_old_regions` and `_optional_old_regions` share some similarity in name, but semantically, it's closer to eden/survivor regions, so something like `uint _initial_old_region_length;`. src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 173: > 171: > 172: // The number of regions from the last merge of candidates from the marking. > 173: uint _last_marking_candidates_length; Looking at how it is used, I wonder if we can record `min_old_cset_length`, which is what is actually needed. src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 65: > 63: G1EvacFailureRegions* evac_failure_regions) > 64: : _g1h(g1h), > 65: _collection_set(collection_set), Why is this needed? src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 697: > 695: G1EvacFailureRegions* evac_failure_regions) : > 696: _g1h(g1h), > 697: _collection_set(collection_set), Can't find where this field is used.
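As background for these comments, the two helper lists named in the quoted PR description (a plain region list, and a candidate list that pairs each region with its gc efficiency) could look roughly like the sketch below. All type and method names are hypothetical except `append_unsorted`, which appears in the patch; even for that one the signature shown is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HotSpot's HeapRegion; only an index is modelled.
struct HeapRegion { unsigned index; };

// Plain list of region pointers with no efficiency attached.
class RegionListSketch {
  std::vector<HeapRegion*> _regions;
public:
  typedef std::vector<HeapRegion*>::const_iterator iterator;
  void append(HeapRegion* r) { _regions.push_back(r); }
  std::size_t length() const { return _regions.size(); }
  iterator begin() const { return _regions.begin(); }
  iterator end() const { return _regions.end(); }
};

// A candidate keeps the gc efficiency next to the region instead of inside it.
struct CandidateSketch { HeapRegion* region; double gc_efficiency; };

class CandidateListSketch {
  std::vector<CandidateSketch> _candidates;
public:
  typedef std::vector<CandidateSketch>::const_iterator iterator;

  void append_unsorted(HeapRegion* r, double gc_efficiency) {
    assert(r != nullptr && "candidate must refer to a region");
    _candidates.push_back(CandidateSketch{r, gc_efficiency});
  }

  // Most efficient candidates first.
  void sort_by_efficiency() {
    std::stable_sort(_candidates.begin(), _candidates.end(),
                     [](const CandidateSketch& a, const CandidateSketch& b) {
                       return a.gc_efficiency > b.gc_efficiency;
                     });
  }

  // Copies the regions of the first n candidates into a plain region list.
  RegionListSketch first_regions(std::size_t n) const {
    RegionListSketch out;
    for (iterator it = begin(); it != end() && out.length() < n; ++it) {
      out.append(it->region);
    }
    return out;
  }

  iterator begin() const { return _candidates.begin(); }
  iterator end() const { return _candidates.end(); }
};
```

Exposing `begin()`/`end()` is what lets callers walk either list with an ordinary iterator loop, which is the convenience the PR description mentions.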
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191266212 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191258730 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191292905 PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191293906 From ayang at openjdk.org Thu May 11 14:55:55 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 11 May 2023 14:55:55 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v9] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> <0Fk9dSGEQXjClsT_GUnAAFOWUQ44cn2VWGsgsni1DK4=.665fc10a-a6d6-4fca-b19e-fd9305a5c1c9@github.com> Message-ID: On Thu, 11 May 2023 14:02:39 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 43: >> >>> 41: >>> 42: // A set of HeapRegion*. >>> 43: class G1CollectionSetRegionList { >> >> Now that this is just a region-list, maybe drop the "CollectionSet" part? > > I would like to keep the name as is and avoid generalizations that are unnecessary at this time. If there is additional use for it, we can always factor it out. It's mostly to avoid confusion. The two even have the same length... 
G1CollectionCandidateList G1CollectionSetRegionList ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191304108 From tschatzl at openjdk.org Thu May 11 15:14:51 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 11 May 2023 15:14:51 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Thu, 11 May 2023 14:46:24 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed assert that is useless for now > > src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 65: > >> 63: G1EvacFailureRegions* evac_failure_regions) >> 64: : _g1h(g1h), >> 65: _collection_set(collection_set), > > Why is this needed? Going to remove. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191332666 From tschatzl at openjdk.org Thu May 11 15:23:00 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 11 May 2023 15:23:00 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: <4hhuk7N-39HHOz24CEHf6jDSINh_3Ys-9-lZYOEMexk=.bb2ffec2-25a0-48b1-8d00-9bfcfbf7ff15@github.com> On Thu, 11 May 2023 14:46:55 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed assert that is useless for now > > src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 697: > >> 695: G1EvacFailureRegions* evac_failure_regions) : >> 696: _g1h(g1h), >> 697: _collection_set(collection_set), > > Can't find where this field is used. `G1ParScanThreadStateSet::state_for_worker()`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191344064 From tschatzl at openjdk.org Thu May 11 15:29:54 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 11 May 2023 15:29:54 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Thu, 11 May 2023 14:29:26 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed assert that is useless for now > > src/hotspot/share/gc/g1/g1CollectionSetCandidates.hpp line 173: > >> 171: >> 172: // The number of regions from the last merge of candidates from the marking. >> 173: uint _last_marking_candidates_length; > > Looking at how it is used, I wonder if we can record `min_old_cset_length`, which is what is actually needed. I would like to defer this suggestion (which is good) as this is an improvement of the existing code which I would like to follow here for this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1191352822 From dcubed at openjdk.org Fri May 12 00:19:03 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 12 May 2023 00:19:03 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 Message-ID: A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. 
------------- Commit messages: - 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 Changes: https://git.openjdk.org/jdk/pull/13946/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13946&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307966 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13946.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13946/head:pull/13946 PR: https://git.openjdk.org/jdk/pull/13946 From naoto at openjdk.org Fri May 12 00:19:03 2023 From: naoto at openjdk.org (Naoto Sato) Date: Fri, 12 May 2023 00:19:03 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:05:21 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. Marked as reviewed by naoto (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13946#pullrequestreview-1423602222 From lmesnik at openjdk.org Fri May 12 00:19:03 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 12 May 2023 00:19:03 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:05:21 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. Might be this failure is platform agnostic and should be generic all. We just don't run this combination on macosx so often. 
------------- PR Review: https://git.openjdk.org/jdk/pull/13946#pullrequestreview-1423603714 From cjplummer at openjdk.org Fri May 12 00:19:03 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 12 May 2023 00:19:03 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:05:21 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. Maybe linux-all would be better. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13946#issuecomment-1544916181 From dcubed at openjdk.org Fri May 12 00:19:04 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 12 May 2023 00:19:04 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:09:12 GMT, Chris Plummer wrote: >> A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. > > Maybe linux-all would be better. @plummercj - Thanks for the review. 'linux-all' would make it apply to linux versions that we don't have. I've only seen this issue on the three platforms that I'm targeting. @naotoj - Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13946#issuecomment-1544921538 From dcubed at openjdk.org Fri May 12 00:23:45 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 12 May 2023 00:23:45 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:15:25 GMT, Leonid Mesnik wrote: >> A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. > > Might be this failure is platform agnostic and should be generic all. We just don't run this combination on macosx so often. @lmesnik - Thanks for the review.
Folks, I'm only going with the sightings that I've seen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13946#issuecomment-1544925144 From lmesnik at openjdk.org Fri May 12 00:28:43 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 12 May 2023 00:28:43 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:05:21 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13946#pullrequestreview-1423608862 From dcubed at openjdk.org Fri May 12 00:56:57 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 12 May 2023 00:56:57 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 [v2] In-Reply-To: References: Message-ID: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. Daniel D. Daugherty has updated the pull request incrementally with one additional commit since the last revision: plummercj CR - use linux-all instead ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13946/files - new: https://git.openjdk.org/jdk/pull/13946/files/94bad01c..35501c98 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13946&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13946&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13946.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13946/head:pull/13946 PR: https://git.openjdk.org/jdk/pull/13946 From dcubed at openjdk.org Fri May 12 00:56:58 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Fri, 12 May 2023 00:56:58 GMT Subject: RFR: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: <9aTRQwkRWPQHzvHr0MjRE8HLFbjJwb3h1oIZ2mjM0bI=.fe0f5336-ecf6-41ae-80bd-ed1dc60c2a68@github.com> On Fri, 12 May 2023 00:05:21 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. I've switched to using 'linux-all' instead of enumeration of the platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13946#issuecomment-1544945743 From dcubed at openjdk.org Fri May 12 00:56:59 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 12 May 2023 00:56:59 GMT Subject: Integrated: 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 In-Reply-To: References: Message-ID: On Fri, 12 May 2023 00:05:21 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64. This pull request has now been integrated. Changeset: 9a7b4431 Author: Daniel D. Daugherty URL: https://git.openjdk.org/jdk/commit/9a7b4431ecde03f37d9f1c1b06dab6ef8d60a94c Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8307966: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java on linux-x64 Reviewed-by: naoto, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/13946 From jiefu at openjdk.org Fri May 12 01:29:58 2023 From: jiefu at openjdk.org (Jie Fu) Date: Fri, 12 May 2023 01:29:58 GMT Subject: RFR: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp Message-ID: This fixes the build broken with `--with-jvm-features=-jfr`. Thanks. 
------------- Commit messages: - 8307969: [zgc] Missing includes in gc/z/zTracer.cpp Changes: https://git.openjdk.org/jdk/pull/13948/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13948&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307969 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13948/head:pull/13948 PR: https://git.openjdk.org/jdk/pull/13948 From stefank at openjdk.org Fri May 12 04:59:44 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 May 2023 04:59:44 GMT Subject: RFR: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp In-Reply-To: References: Message-ID: On Fri, 12 May 2023 01:23:00 GMT, Jie Fu wrote: > This fixes the build broken with `--with-jvm-features=-jfr`. > Thanks. Thanks for fixing. Our informal convention is to not include the .hpp file when you also include the .inline.hpp, so I'd like that change before approving this PR. ------------- Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13948#pullrequestreview-1423772290 From jiefu at openjdk.org Fri May 12 06:15:45 2023 From: jiefu at openjdk.org (Jie Fu) Date: Fri, 12 May 2023 06:15:45 GMT Subject: RFR: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp [v2] In-Reply-To: References: Message-ID: <0k1GwmB36QMKpAXLh8ppc_itutMe2jbpngd-FiwXHiQ=.7ab2064b-2e13-4633-85db-0751c8218331@github.com> On Fri, 12 May 2023 04:56:40 GMT, Stefan Karlsson wrote: > Our informal convention is to not include the .hpp file when you also include the .inline.hpp, so I'd like that change before approving this PR. Good suggestion. Updated. Thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13948#issuecomment-1545224088 From jiefu at openjdk.org Fri May 12 06:15:44 2023 From: jiefu at openjdk.org (Jie Fu) Date: Fri, 12 May 2023 06:15:44 GMT Subject: RFR: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp [v2] In-Reply-To: References: Message-ID: <5NO_2Jmrl0DLS78l3DrLMRXFwajCh2IiCa9Ocxe9F-Y=.d5929f01-43d4-43fd-9d75-660abd1c33fc@github.com> > This fixes the build broken with `--with-jvm-features=-jfr`. > Thanks. Jie Fu has updated the pull request incrementally with one additional commit since the last revision: Address review comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13948/files - new: https://git.openjdk.org/jdk/pull/13948/files/149633f3..1a5411f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13948&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13948&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13948/head:pull/13948 PR: https://git.openjdk.org/jdk/pull/13948 From stefank at openjdk.org Fri May 12 06:33:54 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 12 May 2023 06:33:54 GMT Subject: RFR: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp [v2] In-Reply-To: <5NO_2Jmrl0DLS78l3DrLMRXFwajCh2IiCa9Ocxe9F-Y=.d5929f01-43d4-43fd-9d75-660abd1c33fc@github.com> References: <5NO_2Jmrl0DLS78l3DrLMRXFwajCh2IiCa9Ocxe9F-Y=.d5929f01-43d4-43fd-9d75-660abd1c33fc@github.com> Message-ID: On Fri, 12 May 2023 06:15:44 GMT, Jie Fu wrote: >> This fixes the build broken with `--with-jvm-features=-jfr`. >> Thanks. > > Jie Fu has updated the pull request incrementally with one additional commit since the last revision: > > Address review comment Looks good. Feel free to integrate. ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13948#pullrequestreview-1423853050 From jiefu at openjdk.org Fri May 12 06:33:55 2023 From: jiefu at openjdk.org (Jie Fu) Date: Fri, 12 May 2023 06:33:55 GMT Subject: RFR: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp [v2] In-Reply-To: References: <5NO_2Jmrl0DLS78l3DrLMRXFwajCh2IiCa9Ocxe9F-Y=.d5929f01-43d4-43fd-9d75-660abd1c33fc@github.com> Message-ID: On Fri, 12 May 2023 06:28:50 GMT, Stefan Karlsson wrote: >> Jie Fu has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comment > > Looks good. Feel free to integrate. Thanks @stefank . ------------- PR Comment: https://git.openjdk.org/jdk/pull/13948#issuecomment-1545241704 From jiefu at openjdk.org Fri May 12 06:33:56 2023 From: jiefu at openjdk.org (Jie Fu) Date: Fri, 12 May 2023 06:33:56 GMT Subject: Integrated: 8307969: [zgc] Missing includes in gc/z/zTracer.cpp In-Reply-To: References: Message-ID: On Fri, 12 May 2023 01:23:00 GMT, Jie Fu wrote: > This fixes the build broken with `--with-jvm-features=-jfr`. > Thanks. This pull request has now been integrated. Changeset: ccb4dd61 Author: Jie Fu URL: https://git.openjdk.org/jdk/commit/ccb4dd614483c11903dfde3e249c5ea8c8b04070 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8307969: [zgc] Missing includes in gc/z/zTracer.cpp Reviewed-by: stefank ------------- PR: https://git.openjdk.org/jdk/pull/13948 From tschatzl at openjdk.org Fri May 12 07:26:57 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 07:26:57 GMT Subject: RFR: 8307518: Remove G1 workaround in jstat about zero sized generation sizes [v2] In-Reply-To: References: Message-ID: On Wed, 10 May 2023 11:39:53 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review removal of some workaround in g1 memory usage monitoring that made sure that there were no 0-sized generations in the output. 
After [JDK-8307428](https://bugs.openjdk.org/browse/JDK-8307428) this is not necessary any more. >> >> Testing: tier1-5 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > minor fixes to indentation Thanks for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/13880#issuecomment-1545295661 From tschatzl at openjdk.org Fri May 12 07:26:59 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 07:26:59 GMT Subject: Integrated: 8307518: Remove G1 workaround in jstat about zero sized generation sizes In-Reply-To: References: Message-ID: On Tue, 9 May 2023 07:50:54 GMT, Thomas Schatzl wrote: > Hi all, > > please review removal of some workaround in g1 memory usage monitoring that made sure that there were no 0-sized generations in the output. After [JDK-8307428](https://bugs.openjdk.org/browse/JDK-8307428) this is not necessary any more. > > Testing: tier1-5 > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 1ce1611e Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/1ce1611ead1e3eccd9a6b82857740e27e37f05f7 Stats: 52 lines in 2 files changed: 4 ins; 17 del; 31 mod 8307518: Remove G1 workaround in jstat about zero sized generation sizes Reviewed-by: kbarrett, ayang ------------- PR: https://git.openjdk.org/jdk/pull/13880 From tschatzl at openjdk.org Fri May 12 07:46:52 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 07:46:52 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Thu, 11 May 2023 14:33:44 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed assert that is useless for now > > src/hotspot/share/gc/g1/g1CollectionSet.hpp line 152: > >> 150: uint _survivor_region_length; >> 151: >> 152: G1CollectionSetRegionList _initial_old_regions; > > Why is the whole list saved in the field? I'd expect initial-old-regions to be a transient list used to move regions from candidate list to cset (live only inside `G1CollectionSet::finalize_old_part`). > > `_initial_old_regions` and `_optional_old_regions` share some similarity in name, but semantically, it's closer to eden/survivor regions, so something like `uint _initial_old_region_length;`. I do not have too strong an opinion either way, so I'll change it (back).
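The change agreed on here, keeping only a count and letting the list of initial old regions live inside the finalize step, might look like the following sketch. `CollectionSetSketch` and the `max_initial` cap are hypothetical simplifications: as the head of this thread shows, the real selection goes through the G1 policy with a time budget rather than a fixed count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for HotSpot's HeapRegion; only an index is modelled.
struct HeapRegion { unsigned index; };

class CollectionSetSketch {
  std::vector<HeapRegion*> _regions;         // regions actually in the collection set
  unsigned _initial_old_region_length = 0;   // only the count outlives finalize

public:
  // Assumption: candidates arrive already ordered, and at most max_initial of
  // them are taken.
  void finalize_old_part(const std::vector<HeapRegion*>& candidates,
                         unsigned max_initial) {
    std::vector<HeapRegion*> initial_old_regions;  // transient, local to this call
    for (HeapRegion* r : candidates) {
      if (initial_old_regions.size() == max_initial) break;
      assert(r != nullptr && "candidate list must not contain null entries");
      initial_old_regions.push_back(r);
    }
    // Move the selection into the collection set and remember how many there were.
    _regions.insert(_regions.end(),
                    initial_old_regions.begin(), initial_old_regions.end());
    _initial_old_region_length = static_cast<unsigned>(initial_old_regions.size());
  }

  unsigned initial_old_region_length() const { return _initial_old_region_length; }
  std::size_t length() const { return _regions.size(); }
};
```

Keeping just the length mirrors how the quoted header already tracks survivor regions with `_survivor_region_length`.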
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13666#discussion_r1192025147 From tschatzl at openjdk.org Fri May 12 08:07:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 08:07:43 GMT Subject: RFR: 8307816: Add missing STS to ZGC In-Reply-To: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com> References: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com> Message-ID: On Wed, 10 May 2023 13:43:36 GMT, Roman Kennke wrote: > Testing in project Lilliput has revealed that ZGC is lacking one STS. Without it, ZGC could reach an already-deflated monitor when trying to fetch a displaced header, in order to get to an object's Klass* (e.g. to get its size). > > Testing: > - [x] hotspot_gc Please also close out the CR manually then as GenZGC has been integrated. Thanks. I already added a comment to the CR noting that this is likely only a problem with regular ZGC+Lilliput and for now the consensus is that we are not intending to support that combination. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13904#issuecomment-1545343176 PR Comment: https://git.openjdk.org/jdk/pull/13904#issuecomment-1545344921 From tschatzl at openjdk.org Fri May 12 11:09:00 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 11:09:00 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v12] In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection.
> > These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. > > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: - G1CollectionCandidateList -> G1CollectionCandidateRegionList attempt - ayang review, make initial_old_regions an integer - ayang review - Merge branch 'master' into 8306541-refactor-cset-candidates - Removed assert that is useless for now - remove _reclaimable_bytes - make reclaimable-bytes debug only - ayang review (1) - iwalulya review, naming compare fn - iwalulya review - ... 
and 13 more: https://git.openjdk.org/jdk/compare/3b430b9f...eb797c18 ------------- Changes: https://git.openjdk.org/jdk/pull/13666/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13666&range=11 Stats: 1051 lines in 25 files changed: 559 ins; 251 del; 241 mod Patch: https://git.openjdk.org/jdk/pull/13666.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13666/head:pull/13666 PR: https://git.openjdk.org/jdk/pull/13666 From ayang at openjdk.org Fri May 12 13:14:51 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 May 2023 13:14:51 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v12] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: <3IMPcdtg5oU6kc9MuDgh7AhAm9yBh6LjuYmoun3Ua9w=.eaeb0164-ec8b-4f70-ab60-314c0067826f@github.com> On Fri, 12 May 2023 11:09:00 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this refactoring of collection set candidate set handling. >> >> The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. >> >> These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). >> >> This patch only uses candidates from marking at this time. >> >> Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. >> >> In detail: >> * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. >> >> * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). 
>> >> * there are several additional helper sets/lists >> * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. >> * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. >> >> All these sets implement C++ iterators for simpler use in various places. >> >> Testing: >> - this patch only: tier1-3, gha >> - with JDK-8140326 tier1-7 (or 8?) >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: > > - G1CollectionCandidateList -> G1CollectionCandidateRegionList attempt > - ayang review, make initial_old_regions an integer > - ayang review > - Merge branch 'master' into 8306541-refactor-cset-candidates > - Removed assert that is useless for now > - remove _reclaimable_bytes > - make reclaimable-bytes debug only > - ayang review (1) > - iwalulya review, naming compare fn > - iwalulya review > - ... and 13 more: https://git.openjdk.org/jdk/compare/3b430b9f...eb797c18 Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13666#pullrequestreview-1424486427 From ayang at openjdk.org Fri May 12 13:16:52 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 May 2023 13:16:52 GMT Subject: RFR: 8307808: G1: Remove partial object-count report after gc In-Reply-To: References: Message-ID: On Wed, 10 May 2023 09:48:17 GMT, Albert Mingkun Yang wrote: > Simple removing object-count event in the case of incomplete-marking. > > Test: hotspot_gc Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13897#issuecomment-1545726138 From ayang at openjdk.org Fri May 12 13:16:52 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 12 May 2023 13:16:52 GMT Subject: Integrated: 8307808: G1: Remove partial object-count report after gc In-Reply-To: References: Message-ID: On Wed, 10 May 2023 09:48:17 GMT, Albert Mingkun Yang wrote: > Simple removing object-count event in the case of incomplete-marking. > > Test: hotspot_gc This pull request has now been integrated. Changeset: f7bbbc65 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/f7bbbc6590d93a5566ae0ea1f44476ec0e55f59e Stats: 47 lines in 2 files changed: 17 ins; 30 del; 0 mod 8307808: G1: Remove partial object-count report after gc Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/13897 From tschatzl at openjdk.org Fri May 12 15:11:05 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 15:11:05 GMT Subject: RFR: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 [v11] In-Reply-To: References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Thu, 11 May 2023 11:09:50 GMT, Ivan Walulya wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed assert that is useless for now > > Lgtm! 
Thanks @walulyai @albertnetymk for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/13666#issuecomment-1545891348 From tschatzl at openjdk.org Fri May 12 15:11:06 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 12 May 2023 15:11:06 GMT Subject: Integrated: 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 In-Reply-To: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> References: <4oheKwC7DqtsyjvQCNR2XDazOT7xkoGrLBrwbVp-wS8=.433fb484-259f-49eb-9bd7-ca31220cf808@github.com> Message-ID: On Wed, 26 Apr 2023 09:20:46 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring of collection set candidate set handling. > > The idea is to improve the interface to collection set candidates and prepare for having collection set candidates available at any time to evacuate them at any young collection. > > These preparations to allow for multiple sources for these candidates (from the marking, as now, and from retained regions, i.e. evacuation failed regions as per [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)). > > This patch only uses candidates from marking at this time. > > Also moves gc efficiency out of HeapRegion and associate it to the list element as it's not used otherwise. > > In detail: > * the collection set candidates set is not temporarily allocated any more, but the candidate collection set object is available all the time. > > * G1CollectionSetCandidates is the main class, representing the current candidates. Contains the "from marking" candidate list only (at this point). > > * there are several additional helper sets/lists > * G1CollectionSetRegionList: list of HeapRegion*, typically sorted by efficiency (but not necessarily). Also does not contain gc efficiences. > * G1CollectionCandidateList: list of candidates, i.e. HeapRegion* with their gc efficiency. Building block for the actual collection set candidates list. 
> > All these sets implement C++ iterators for simpler use in various places. > > Testing: > - this patch only: tier1-3, gha > - with JDK-8140326 tier1-7 (or 8?) > > Thanks, > Thomas This pull request has now been integrated. Changeset: e512a206 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/e512a20679ee03ae6d3c2219e4ad10c92e362e14 Stats: 1051 lines in 25 files changed: 559 ins; 251 del; 241 mod 8306541: Refactor collection set candidate handling to prepare for JDK-8140326 Reviewed-by: iwalulya, ayang ------------- PR: https://git.openjdk.org/jdk/pull/13666 From rkennke at openjdk.org Fri May 12 17:05:33 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 12 May 2023 17:05:33 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v43] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. 
Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
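Editorial sketch, not part of the PR: the 32-bit forwarding layout described in the quoted text above (bits 0 and 1 as the forwarded tag, bit 2 for the fallback table, bit 3 as the target-region selector, bits 4..31 as the offset in heap words) could be packed and unpacked roughly as follows. Every name here (`TargetBases`, `encode_forwarding`, `decode_forwarding`, the constants) is hypothetical and is not the API of `slidingForwarding.hpp`. The sketch also assumes that region bases are multiples of the region size and that address 0 is never a region base.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical constants for the layout described in the PR text.
constexpr uint32_t kForwardedTag   = 0b11;            // bits 0-1: object is forwarded
constexpr uint32_t kFallbackBit    = 1u << 2;         // bit 2: consult the fallback table
constexpr uint32_t kRegionSelector = 1u << 3;         // bit 3: target region 0 or 1
constexpr unsigned kOffsetShift    = 4;               // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffset      = (1u << 28) - 1;  // largest offset that fits in 28 bits
constexpr size_t   kHeapWordSize   = sizeof(uintptr_t);

// One table entry per source region: the start addresses of the (at most two)
// regions its objects slide into. 0 marks an unused slot (assumption).
struct TargetBases {
  uintptr_t base[2];
};

// Packs 'target' into 32 bits. Returns false when the target lies in a third
// region or the offset does not fit; a caller would then need the fallback table.
inline bool encode_forwarding(TargetBases& entry, uintptr_t region_bytes,
                              uintptr_t target, uint32_t& encoded) {
  const uintptr_t target_base = target - (target % region_bytes);
  const uintptr_t words = (target - target_base) / kHeapWordSize;
  if (words > kMaxOffset) {
    return false;
  }
  for (unsigned i = 0; i < 2; i++) {
    if (entry.base[i] == 0) {
      entry.base[i] = target_base;  // claim a free slot for this target region
    }
    if (entry.base[i] == target_base) {
      encoded = kForwardedTag
              | (i == 1 ? kRegionSelector : 0u)
              | (static_cast<uint32_t>(words) << kOffsetShift);
      return true;
    }
  }
  return false;
}

inline bool is_forwarded(uint32_t encoded) {
  return (encoded & kForwardedTag) == kForwardedTag;
}

inline bool uses_fallback(uint32_t encoded) {
  return (encoded & kFallbackBit) != 0;
}

// Reverses encode_forwarding: pick the base via bit 3, then add the word offset.
inline uintptr_t decode_forwarding(const TargetBases& entry, uint32_t encoded) {
  const unsigned i = (encoded & kRegionSelector) != 0 ? 1 : 0;
  const uintptr_t words = encoded >> kOffsetShift;
  return entry.base[i] + words * kHeapWordSize;
}
```

With two slots per source region, a third distinct target region makes `encode_forwarding` return false. That corresponds to the case the PR text handles with bit 2 and the fallback hash table during G1 serial compaction.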
The pull request now contains 109 commits: - Merge remote-tracking branch 'origin/JDK-8305896' into JDK-8305896 - Some more @shipilev comments - Update src/hotspot/share/gc/shared/slidingForwarding.hpp Co-authored-by: Aleksey Shipilëv - Align fake-heap without GCC warnings (duh) - Merge branch 'master' into JDK-8305896 - Fix gtest: Align fake-heaps, avoid re-forwardings - @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem - Fix build - Update src/hotspot/share/gc/shared/slidingForwarding.cpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/gc/shared/slidingForwarding.cpp Co-authored-by: Aleksey Shipilëv - ... and 99 more: https://git.openjdk.org/jdk/compare/6ebea897...f1ad3421 ------------- Changes: https://git.openjdk.org/jdk/pull/13582/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=42 Stats: 912 lines in 24 files changed: 875 ins; 0 del; 37 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Fri May 12 17:14:06 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 12 May 2023 17:14:06 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v13] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored.
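Editorial sketch, not part of the PR: the self-forwarding idea in the quoted description above can be illustrated on a plain integer standing in for the header word. The sketch assumes that "bit #3" is counted from one and sits directly above the two lock bits. This is consistent with the `lock_bits = 2` and `self_forwarded_bits = 1` constants quoted in a review comment later in this thread, but it remains an assumption. All function and constant names are hypothetical and are not the `markWord` API.

```cpp
#include <cstdint>

// Hypothetical layout: two lock bits at the bottom, one self-forwarded bit above them.
constexpr unsigned  kLockBits         = 2;
constexpr uintptr_t kSelfForwardedBit = uintptr_t(1) << kLockBits;  // third bit, mask 0b100
constexpr unsigned  kUpperShift       = kLockBits + 1;

// Marks the object as "looked at, but evacuation failed" without overwriting
// the rest of the header, unlike storing a pointer-to-self would.
inline uintptr_t set_self_forwarded(uintptr_t mark) {
  return mark | kSelfForwardedBit;
}

inline bool is_self_forwarded(uintptr_t mark) {
  return (mark & kSelfForwardedBit) != 0;
}

// Restores the header once the failed evacuation has been handled.
inline uintptr_t clear_self_forwarded(uintptr_t mark) {
  return mark & ~kSelfForwardedBit;
}

// Everything above the flag (where class information would live with
// compact object headers) is untouched by the three functions above.
inline uintptr_t upper_bits(uintptr_t mark) {
  return mark >> kUpperShift;
}
```

Because only one bit changes, a heap walker could still read the upper bits while the flag is set, which is the property the PR description relies on.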
Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 25 commits: - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - wqRevert "Rename self-forwarded -> forward-failed" This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. - Merge branch 'JDK-8305896' into JDK-8305898 - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Update src/hotspot/share/oops/oop.inline.hpp Co-authored-by: Aleksey Shipilëv - Rename self-forwarded -> forward-failed - Fix asserts (again) - Fix assert - ... and 15 more: https://git.openjdk.org/jdk/compare/f1ad3421...880d564a ------------- Changes: https://git.openjdk.org/jdk/pull/13779/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=12 Stats: 86 lines in 8 files changed: 70 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Sat May 13 22:07:41 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Sat, 13 May 2023 22:07:41 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v14] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'.
That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix tests on 32bit builds ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/880d564a..d35cfb47 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=12-13 Stats: 14 lines in 2 files changed: 11 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From stefank at openjdk.org Mon May 15 08:03:44 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 15 May 2023 08:03:44 GMT Subject: RFR: 8307997: gtest:ZIndexDistributorTest fails on PPC64 Message-ID: ZindexDistributorTest was written with the assumption that `ZCacheLineSize == 64`, which isn't the case on PPC. I've updated the test to handle this case. Tested by temporarily changing ZCacheLineSize to 128. I also added two more cases just to disambiguate the counts in one of the test. 
------------- Commit messages: - 8307997: gtest:ZIndexDistributorTest fails on PPC64 Changes: https://git.openjdk.org/jdk/pull/13977/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13977&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307997 Stats: 14 lines in 1 file changed: 9 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/13977.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13977/head:pull/13977 PR: https://git.openjdk.org/jdk/pull/13977 From lkorinth at openjdk.org Mon May 15 09:27:05 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 15 May 2023 09:27:05 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: > Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle > > Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1) > > Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files. > > Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. 
See: https://bugs.openjdk.org/browse/CODETOOLS-7903467 Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: rerun tests ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13929/files - new: https://git.openjdk.org/jdk/pull/13929/files/fc847613..7bda00db Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13929&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13929&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13929.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13929/head:pull/13929 PR: https://git.openjdk.org/jdk/pull/13929 From ayang at openjdk.org Mon May 15 13:08:44 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 15 May 2023 13:08:44 GMT Subject: RFR: 8308098: G1: Remove redundant checks in G1ObjectCountIsAliveClosure Message-ID: Simple removing unnecessary checks. Test: hotspot_gc ------------- Commit messages: - g1-cm-is-live Changes: https://git.openjdk.org/jdk/pull/13986/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13986&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308098 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/13986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13986/head:pull/13986 PR: https://git.openjdk.org/jdk/pull/13986 From mdoerr at openjdk.org Mon May 15 13:10:45 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 15 May 2023 13:10:45 GMT Subject: RFR: 8307997: gtest:ZIndexDistributorTest fails on PPC64 In-Reply-To: References: Message-ID: On Mon, 15 May 2023 07:57:34 GMT, Stefan Karlsson wrote: > ZindexDistributorTest was written with the assumption that `ZCacheLineSize == 64`, which isn't the case on PPC. I've updated the test to handle this case. > > Tested by temporarily changing ZCacheLineSize to 128. I also added two more cases just to disambiguate the counts in one of the test. LGTM. 
Thanks for fixing it so quickly! ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13977#pullrequestreview-1426521670 From ayang at openjdk.org Mon May 15 13:26:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 15 May 2023 13:26:57 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v14] In-Reply-To: References: Message-ID: On Sat, 13 May 2023 22:07:41 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Fix tests on 32bit builds src/hotspot/share/oops/markWord.hpp line 107: > 105: static const int age_bits = 4; > 106: static const int lock_bits = 2; > 107: static const int self_forwarded_bits = 1; This warrants some update to the doc above, right? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1193832971 From stefank at openjdk.org Mon May 15 13:55:55 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 15 May 2023 13:55:55 GMT Subject: RFR: 8307997: gtest:ZIndexDistributorTest fails on PPC64 In-Reply-To: References: Message-ID: On Mon, 15 May 2023 07:57:34 GMT, Stefan Karlsson wrote: > ZindexDistributorTest was written with the assumption that `ZCacheLineSize == 64`, which isn't the case on PPC. 
I've updated the test to handle this case. > > Tested by temporarily changing ZCacheLineSize to 128. I also added two more cases just to disambiguate the counts in one of the test. Thanks for reviewing! Given that this fixes a tier1 failure for PPC and that this is a small change limited to the test, I'll go ahead integrate it now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13977#issuecomment-1547897354 From stefank at openjdk.org Mon May 15 13:55:55 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 15 May 2023 13:55:55 GMT Subject: Integrated: 8307997: gtest:ZIndexDistributorTest fails on PPC64 In-Reply-To: References: Message-ID: On Mon, 15 May 2023 07:57:34 GMT, Stefan Karlsson wrote: > ZindexDistributorTest was written with the assumption that `ZCacheLineSize == 64`, which isn't the case on PPC. I've updated the test to handle this case. > > Tested by temporarily changing ZCacheLineSize to 128. I also added two more cases just to disambiguate the counts in one of the test. This pull request has now been integrated. Changeset: 97b2ca3d Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/97b2ca3de76046c6f52d3649d8787feea7b9ac83 Stats: 14 lines in 1 file changed: 9 ins; 0 del; 5 mod 8307997: gtest:ZIndexDistributorTest fails on PPC64 Reviewed-by: mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/13977 From eosterlund at openjdk.org Mon May 15 14:33:47 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 15 May 2023 14:33:47 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating Message-ID: The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs.
Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation. I have run the test 200 times, and it's no longer failing with generational ZGC. ------------- Commit messages: - 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating Changes: https://git.openjdk.org/jdk/pull/13989/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13989&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308043 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13989.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13989/head:pull/13989 PR: https://git.openjdk.org/jdk/pull/13989 From stefank at openjdk.org Mon May 15 14:41:46 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 15 May 2023 14:41:46 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating In-Reply-To: References: Message-ID: On Mon, 15 May 2023 14:26:42 GMT, Erik Österlund wrote: > The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout.
As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation. I have run the test 200 times, and it's no longer failing with generational ZGC. Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13989#pullrequestreview-1426723002 From ayang at openjdk.org Mon May 15 14:50:47 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 15 May 2023 14:50:47 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating In-Reply-To: References: Message-ID: On Mon, 15 May 2023 14:26:42 GMT, Erik Österlund wrote: > The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation.
I have run the test 200 times, and it's no longer failing with generational ZGC. Marked as reviewed by ayang (Reviewer). test/hotspot/jtreg/gc/cslocker/TestCSLocker.java line 54: > 52: // check timeout to success deadlocking > 53: while(System.currentTimeMillis() < startTime + timeout) { > 54: System.out.println("sleeping..."); I think some comments (one cannot run any gc-triggering code, e.g. println) here would be nice. It's super tempting to add some innocent debug-prints before suspending the current thread, while extending/fixing this test case in the future. ------------- PR Review: https://git.openjdk.org/jdk/pull/13989#pullrequestreview-1426741151 PR Review Comment: https://git.openjdk.org/jdk/pull/13989#discussion_r1193953862 From tschatzl at openjdk.org Mon May 15 15:42:45 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 15 May 2023 15:42:45 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating In-Reply-To: References: Message-ID: On Mon, 15 May 2023 14:26:42 GMT, Erik Österlund wrote: > The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation.
I have run the test 200 times, and it's no longer failing with generational ZGC. Marked as reviewed by tschatzl (Reviewer). test/hotspot/jtreg/gc/cslocker/TestCSLocker.java line 53: > 51: > 52: // check timeout to success deadlocking > 53: while(System.currentTimeMillis() < startTime + timeout) { In addition to @albertnetymk's comment, please also fix the whitespace after the `while` and the bracket. ------------- PR Review: https://git.openjdk.org/jdk/pull/13989#pullrequestreview-1426849604 PR Review Comment: https://git.openjdk.org/jdk/pull/13989#discussion_r1194021093 From tschatzl at openjdk.org Mon May 15 15:52:44 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 15 May 2023 15:52:44 GMT Subject: RFR: 8308098: G1: Remove redundant checks in G1ObjectCountIsAliveClosure In-Reply-To: References: Message-ID: On Mon, 15 May 2023 13:01:00 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary checks. > > Test: hotspot_gc Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13986#pullrequestreview-1426869066 From aboldtch at openjdk.org Mon May 15 18:30:45 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 15 May 2023 18:30:45 GMT Subject: RFR: 8308097: Generational ZGC: Update constructor syntax In-Reply-To: References: Message-ID: <5d-KdZXUb2p4EgW-I_FAM0L6-9dBoeo9TjDyVsFgDlY=.bbdaccaf-705c-4d86-903f-bcc253b433d1@github.com> On Mon, 15 May 2023 13:11:42 GMT, Stefan Karlsson wrote: > ZGC's current constructor syntax works well with some editors, but not all. There is a wish to move over from the current syntax: > > > ZClass:ZClass() : > ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > to the following syntax: > > ZClass:ZClass() > : ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > I propose that make this change. lgtm. ------------- Marked as reviewed by aboldtch (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/13987#pullrequestreview-1427134462 From shade at openjdk.org Mon May 15 18:36:54 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 May 2023 18:36:54 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop Message-ID: See the bug for more details. Additional testing: - [x] Linux x86_64 fastdebug, `tier1` - [x] Linux x86_64 fastdebug, `tier2` - [x] Linux x86_64 fastdebug, `tier3` ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/13982/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308088 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13982.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13982/head:pull/13982 PR: https://git.openjdk.org/jdk/pull/13982 From zgu at openjdk.org Mon May 15 19:36:43 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 15 May 2023 19:36:43 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop In-Reply-To: References: Message-ID: On Mon, 15 May 2023 10:25:15 GMT, Aleksey Shipilev wrote: > See the bug for more details. > > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1` > - [x] Linux x86_64 fastdebug, `tier2` > - [x] Linux x86_64 fastdebug, `tier3` LGTM ------------- Marked as reviewed by zgu (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1427249603 From zgu at openjdk.org Mon May 15 20:07:46 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Mon, 15 May 2023 20:07:46 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop In-Reply-To: References: Message-ID: On Mon, 15 May 2023 10:25:15 GMT, Aleksey Shipilev wrote: > See the bug for more details. 
> > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1` > - [x] Linux x86_64 fastdebug, `tier2` > - [x] Linux x86_64 fastdebug, `tier3` Wait, you are calling `k->is_klass()` here, you do need `acquire` here. ------------- Changes requested by zgu (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1427289394 From shade at openjdk.org Mon May 15 20:15:56 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 May 2023 20:15:56 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v2] In-Reply-To: References: Message-ID: > See the bug for more details. > > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1` > - [x] Linux x86_64 fastdebug, `tier2` > - [x] Linux x86_64 fastdebug, `tier3` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Avoid touching klass racily (even for v-call) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13982/files - new: https://git.openjdk.org/jdk/pull/13982/files/b3b41441..983ee9f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13982.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13982/head:pull/13982 PR: https://git.openjdk.org/jdk/pull/13982 From shade at openjdk.org Mon May 15 20:17:43 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 15 May 2023 20:17:43 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v2] In-Reply-To: References: Message-ID: <-I7Ys2yumyCEvmbFmOqBSiK70hHoSTTLTc7WuhKryAQ=.c981cee1-d89a-4eb2-88fd-2fdea31c6df7@github.com> On Mon, 15 May 2023 20:04:50 GMT, Zhengyu Gu wrote: > Wait, you are calling `k->is_klass()` here, you do need `acquire` here. Oh. `is_klass` is v-call, okay. Good catch! 
Actually, let's just avoid checking for `is_klass` here, instead of making this method carry additional memory order semantics. This would make the method lighter (= more performance for debug builds), and would prevent `assert(is_oop(...))` hiding memory ordering bugs in debug mode accidentally. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13982#issuecomment-1548513639 From zgu at openjdk.org Tue May 16 00:40:45 2023 From: zgu at openjdk.org (Zhengyu Gu) Date: Tue, 16 May 2023 00:40:45 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v2] In-Reply-To: References: Message-ID: On Mon, 15 May 2023 20:15:56 GMT, Aleksey Shipilev wrote: >> See the bug for more details. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1` >> - [x] Linux x86_64 fastdebug, `tier2` >> - [x] Linux x86_64 fastdebug, `tier3` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Avoid touching klass racily (even for v-call) LGTM ------------- Marked as reviewed by zgu (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1427534327 From dholmes at openjdk.org Tue May 16 05:10:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 16 May 2023 05:10:48 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: On Mon, 15 May 2023 09:27:05 GMT, Leo Korinth wrote: >> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle >> >> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1) >> >> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files. >> >> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. 
See: https://bugs.openjdk.org/browse/CODETOOLS-7903467 > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > rerun tests Changes are fine in principle. I haven't tried to verify the details of each test case. I've made a number of comments below about reformatting the `@test` segments to the normal multi-line format. In the PR I found these very hard to read (I didn't even realize jtreg would process them as a single line like that!). I did discover afterwards that these look much better when the file is viewed wide-screen so I will leave it for GC reviewers to decide what they prefer. Thanks. test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/ArrayJuggle.README line 1: > 1: Copyright (c) 2002, 2018, Oracle and/or its affiliates. All rights reserved. The README needs some updating with your changes test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle1.java line 30: > 28: > 29: /* @test @key stress randomness @library /vmTestbase /test/lib @run main/othervm -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle1 */ > 30: /* @test @key stress randomness @library /vmTestbase /test/lib @run main/othervm -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle1 -tg */ These should be laid out in the normal multi-line format - it is too hard to mentally parse otherwise.
test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle2.java line 32: > 30: /* @test @key stress randomness @library /vmTestbase /test/lib @run main/othervm -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle2 */ > 31: /* @test @key stress randomness @library /vmTestbase /test/lib @run main/othervm -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle2 -tg */ > 32: Again please use multi-line format test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle3.java line 28: > 26: */ > 27: > 28: // Run in Juggle3Quic.java @test id=1 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp byteArr -ms low Is this meant to be a comment? I think you are telling me this case gets run in another file, but it is very hard to read. test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle3.java line 60: > 58: /* @test id=31 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp hashed(objectArr) -ms low */ > 59: /* @test id=32 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp hashed(objectArr) -ms medium */ > 60: /* @test id=33 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp hashed(objectArr) -ms high */ Please use normal multi-line format. 
test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle3Quick.java line 32: > 30: /* @test id=22 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp doubleArr -ms low */ > 31: /* @test id=29 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp hashed(doubleArr) -ms medium */ > 32: /* @test id=34 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp random(arrays) -ms high */ Please use normal multi-line format ------------- PR Review: https://git.openjdk.org/jdk/pull/13929#pullrequestreview-1427706546 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1194606832 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1194601316 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1194601548 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1194602532 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1194602865 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1194603147 From dholmes at openjdk.org Tue May 16 07:43:44 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 16 May 2023 07:43:44 GMT Subject: RFR: 8308097: Generational ZGC: Update constructor syntax In-Reply-To: References: Message-ID: On Mon, 15 May 2023 13:11:42 GMT, Stefan Karlsson wrote: > ZGC's current constructor syntax works well with some editors, but not all. There is a wish to move over from the current syntax: > > > ZClass:ZClass() : > ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > to the following syntax: > > ZClass:ZClass() > : ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > I propose that make this change. 
Is this something that should be in the style guide? Presumably we have constructors all through the hotspot code that may not conform to this style. ------------- PR Review: https://git.openjdk.org/jdk/pull/13987#pullrequestreview-1427925321 From tschatzl at openjdk.org Tue May 16 07:46:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 16 May 2023 07:46:50 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v2] In-Reply-To: References: Message-ID: On Mon, 15 May 2023 20:15:56 GMT, Aleksey Shipilev wrote: >> See the bug for more details. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1` >> - [x] Linux x86_64 fastdebug, `tier2` >> - [x] Linux x86_64 fastdebug, `tier3` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Avoid touching klass racily (even for v-call) Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1427932500 From ayang at openjdk.org Tue May 16 08:13:48 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 16 May 2023 08:13:48 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v2] In-Reply-To: References: Message-ID: On Mon, 15 May 2023 20:15:56 GMT, Aleksey Shipilev wrote: >> See the bug for more details. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1` >> - [x] Linux x86_64 fastdebug, `tier2` >> - [x] Linux x86_64 fastdebug, `tier3` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Avoid touching klass racily (even for v-call) Marked as reviewed by ayang (Reviewer). src/hotspot/share/gc/shared/collectedHeap.cpp line 230: > 228: } > 229: > 230: Klass* k = object->klass_raw(); Introducing a local-var doesn't seem needed. 
------------- PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1427983254 PR Review Comment: https://git.openjdk.org/jdk/pull/13982#discussion_r1194783083 From stefank at openjdk.org Tue May 16 09:45:56 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 09:45:56 GMT Subject: RFR: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication Message-ID: The tests assumes that some strings have not been deduplicated and checks for that. Unfortunately, we can have concurrently triggered GCs that invalidate those checks. I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. ------------- Commit messages: - 8299075: TestStringDeduplicationInterned.java fails because extra deduplication Changes: https://git.openjdk.org/jdk/pull/14005/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14005&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8299075 Stats: 21 lines in 1 file changed: 9 ins; 5 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14005.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14005/head:pull/14005 PR: https://git.openjdk.org/jdk/pull/14005 From shade at openjdk.org Tue May 16 10:23:09 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 May 2023 10:23:09 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v3] In-Reply-To: References: Message-ID: > See the bug for more details. 
> > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1` > - [x] Linux x86_64 fastdebug, `tier2` > - [x] Linux x86_64 fastdebug, `tier3` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Touchup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13982/files - new: https://git.openjdk.org/jdk/pull/13982/files/983ee9f8..961b90e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13982.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13982/head:pull/13982 PR: https://git.openjdk.org/jdk/pull/13982 From eosterlund at openjdk.org Tue May 16 11:51:39 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 16 May 2023 11:51:39 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating [v2] In-Reply-To: References: Message-ID: > The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation. 
I have run the test 200 times, and it's no longer failing with generational ZGC. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Review feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13989/files - new: https://git.openjdk.org/jdk/pull/13989/files/c4124a0f..c410aed9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13989&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13989&range=00-01 Stats: 7 lines in 2 files changed: 4 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13989.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13989/head:pull/13989 PR: https://git.openjdk.org/jdk/pull/13989 From eosterlund at openjdk.org Tue May 16 11:51:41 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 16 May 2023 11:51:41 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating [v2] In-Reply-To: References: Message-ID: On Mon, 15 May 2023 14:39:25 GMT, Stefan Karlsson wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Review feedback > > Marked as reviewed by stefank (Reviewer). Thanks for the reviews @stefank @tschatzl and @albertnetymk! I updated with a comment and whitespace fix as suggested, and removed an old problem listing for this test.
------------- PR Comment: https://git.openjdk.org/jdk/pull/13989#issuecomment-1549508641 From lkorinth at openjdk.org Tue May 16 12:12:45 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 16 May 2023 12:12:45 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 04:50:21 GMT, David Holmes wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> rerun tests > > test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle3.java line 28: > >> 26: */ >> 27: >> 28: // Run in Juggle3Quic.java @test id=1 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp byteArr -ms low > > Is this meant to be a comment? I think you are telling me this case gets run in another file, but it is very hard to read. Yes, it is a comment, they show the quickgroup. Unfortunately I can not run those tests directly from a group and I needed to create Juggle3Quic.java (https://bugs.openjdk.org/browse/CODETOOLS-7903467). If you prefer I will remove those comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1195067562 From stuefe at openjdk.org Tue May 16 12:29:47 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 16 May 2023 12:29:47 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v3] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 10:23:09 GMT, Aleksey Shipilev wrote: >> See the bug for more details. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1` >> - [x] Linux x86_64 fastdebug, `tier2` >> - [x] Linux x86_64 fastdebug, `tier3` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Touchup This can be improved: - klass_raw() will, eventually, end up doing some asserts. We don't want that. 
So narrow Klass extraction needs to be done separately. - then, we could weed out obvious bogus Klass values (not aligned or null) - then, Metaspace::contains checks class space and non-class metaspace. The latter would be a false positive, and its also more expensive, since it walks the (usually short) list of metaspace regions. class space otoh is just one region. A correct check for Klass would be to check if its in CDS or in class space. Proposal sketch: https://github.com/openjdk/jdk/commit/b8868dac39e43c8bea6ccd34d8e5b186506415fe ------------- Changes requested by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1428461065 From eosterlund at openjdk.org Tue May 16 12:32:35 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 16 May 2023 12:32:35 GMT Subject: RFR: 8308181: Generational ZGC: Remove CLDG_lock from old gen root scanning Message-ID: We already removed the CLDG_lock from young gen root scanning, after the CLDG was made concurrently walkable with [JDK-8307106](https://bugs.openjdk.org/browse/JDK-8307106). We should remove it from the old generation root scanning code as well. 
------------- Commit messages: - Remove CLDG lock from old root scanning Changes: https://git.openjdk.org/jdk/pull/14011/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14011&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308181 Stats: 14 lines in 2 files changed: 0 ins; 12 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14011/head:pull/14011 PR: https://git.openjdk.org/jdk/pull/14011 From lkorinth at openjdk.org Tue May 16 12:49:46 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 16 May 2023 12:49:46 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 04:58:58 GMT, David Holmes wrote: >> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: >> >> rerun tests > > test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/ArrayJuggle.README line 1: > >> 1: Copyright (c) 2002, 2018, Oracle and/or its affiliates. All rights reserved. > > The README needs some updating with your changes Yes, nice catch! It was not up to date to begin with. Although the description of some parts are still correct --- for me --- the README adds little benefit, and I would prefer removing the file. Another option is to keep lines up to and including line 51, and remove the rest. These kind of files just tend to bit rot. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1195112067 From coleenp at openjdk.org Tue May 16 12:58:10 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 May 2023 12:58:10 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v43] In-Reply-To: References: Message-ID: On Fri, 12 May 2023 17:05:33 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. 
Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. 
When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The c... > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 109 commits: > > - Merge remote-tracking branch 'origin/JDK-8305896' into JDK-8305896 > - Some more @shipilev comments > - Update src/hotspot/share/gc/shared/slidingForwarding.hpp > > Co-authored-by: Aleksey Shipilëv > - Align fake-heap without GCC warnings (duh) > - Merge branch 'master' into JDK-8305896 > - Fix gtest: Align fake-heaps, avoid re-forwardings > - @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem > - Fix build > - Update src/hotspot/share/gc/shared/slidingForwarding.cpp > > Co-authored-by: Aleksey Shipilëv > - Update src/hotspot/share/gc/shared/slidingForwarding.cpp > > Co-authored-by: Aleksey Shipilëv > - ... and 99 more: https://git.openjdk.org/jdk/compare/6ebea897...f1ad3421 src/hotspot/share/utilities/fastHash.hpp line 30: > 28: #include "memory/allStatic.hpp" > 29: > 30: class FastHash : public AllStatic { Where did this hash come from?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1195122010 From lkorinth at openjdk.org Tue May 16 12:58:44 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 16 May 2023 12:58:44 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: <-T7A8poz95z4YYpa_uXra3nq9Zi5sESvOolx18vbiHc=.1b1e01ad-7025-4f1a-a83b-e88bf79ea6f1@github.com> On Mon, 15 May 2023 09:27:05 GMT, Leo Korinth wrote: >> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle >> >> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1) >> >> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files. >> >> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467 > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > rerun tests Regarding multi line format, I think there is a strong value of being able to "scan" line after line seeing what permutations are tested. I recently fixed [JDK-8306435](https://bugs.openjdk.org/browse/JDK-8306435) --- such bugs would be much harder to make if we can easily "scan" the permutations, and was one reason why I choose to reorganise the tests. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13929#issuecomment-1549608456 From stefank at openjdk.org Tue May 16 13:01:46 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 13:01:46 GMT Subject: RFR: 8308097: Generational ZGC: Update constructor syntax In-Reply-To: References: Message-ID: <0Iuyzf9eaCTSG876F3j4aeKUItMZgoT8y1BM_byAiSc=.8cb57667-19d9-4798-8d95-60d3c6735d05@github.com> On Tue, 16 May 2023 07:40:41 GMT, David Holmes wrote: > Is this something that should be in the style guide? Presumably we have constructors all through the hotspot code that may not conform to this style. ZGC has a much stricter code style than the rest of HotSpot. Previous attempts to drive style guide questions have been quite disappointing (at least to me), so we've carved out our own corner in HotSpot where we try to keep the code uniform. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13987#issuecomment-1549614258 From stuefe at openjdk.org Tue May 16 13:31:51 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 16 May 2023 13:31:51 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v3] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 10:23:09 GMT, Aleksey Shipilev wrote: >> See the bug for more details. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1` >> - [x] Linux x86_64 fastdebug, `tier2` >> - [x] Linux x86_64 fastdebug, `tier3` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Touchup include memory/metaspace.hpp is missing. Otherwise LGTM > This can be improved: > > * klass_raw() will, eventually, end up doing some asserts. We don't want that. So narrow Klass extraction needs to be done separately. > > * then, we could weed out obvious bogus Klass values (not aligned or null) > > * then, Metaspace::contains checks class space and non-class metaspace. 
The latter would be a false positive, and its also more expensive, since it walks the (usually short) list of metaspace regions. class space otoh is just one region. A correct check for Klass would be to check if its in CDS or in class space. > > > Proposal sketch: [b8868da](https://github.com/openjdk/jdk/commit/b8868dac39e43c8bea6ccd34d8e5b186506415fe) Update: looks like I was mistaken about the first point. The rest is probably not worth optimizing. Never mind then, this looks good as it is. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13982#pullrequestreview-1428588598 PR Comment: https://git.openjdk.org/jdk/pull/13982#issuecomment-1549677508 From ayang at openjdk.org Tue May 16 13:34:48 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 16 May 2023 13:34:48 GMT Subject: RFR: 8308181: Generational ZGC: Remove CLDG_lock from old gen root scanning In-Reply-To: References: Message-ID: On Tue, 16 May 2023 12:25:13 GMT, Erik Österlund wrote: > We already removed the CLDG_lock from young gen root scanning, after the CLDG was made concurrently walkable with [JDK-8307106](https://bugs.openjdk.org/browse/JDK-8307106). We should remove it from the old generation root scanning code as well. Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14011#pullrequestreview-1428594532 From lkorinth at openjdk.org Tue May 16 13:36:46 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 16 May 2023 13:36:46 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 12:47:03 GMT, Leo Korinth wrote: >> test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/ArrayJuggle.README line 1: >> >>> 1: Copyright (c) 2002, 2018, Oracle and/or its affiliates. All rights reserved. >> >> The README needs some updating with your changes > > Yes, nice catch! It was not up to date to begin with.
Although the description of some parts are still correct --- for me --- the README adds little benefit, and I would prefer removing the file. Another option is to keep lines up to and including line 51, and remove the rest. These kind of files just tend to bit rot. (of course also remove -- These tests run forever at the current time [8/14/97] --) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1195175867 From eosterlund at openjdk.org Tue May 16 13:42:07 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Tue, 16 May 2023 13:42:07 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating [v3] In-Reply-To: References: Message-ID: > The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation. I have run the test 200 times, and it's no longer failing with generational ZGC. 
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: White space fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13989/files - new: https://git.openjdk.org/jdk/pull/13989/files/c410aed9..cd0b0f6b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13989&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13989&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13989.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13989/head:pull/13989 PR: https://git.openjdk.org/jdk/pull/13989 From rkennke at openjdk.org Tue May 16 14:18:16 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 16 May 2023 14:18:16 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v43] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 12:54:50 GMT, Coleen Phillimore wrote: >> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 109 commits: >> >> - Merge remote-tracking branch 'origin/JDK-8305896' into JDK-8305896 >> - Some more @shipilev comments >> - Update src/hotspot/share/gc/shared/slidingForwarding.hpp >> >> Co-authored-by: Aleksey Shipilëv >> - Align fake-heap without GCC warnings (duh) >> - Merge branch 'master' into JDK-8305896 >> - Fix gtest: Align fake-heaps, avoid re-forwardings >> - @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem >> - Fix build >> - Update src/hotspot/share/gc/shared/slidingForwarding.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - Update src/hotspot/share/gc/shared/slidingForwarding.cpp >> >> Co-authored-by: Aleksey Shipilëv >> - ... and 99 more: https://git.openjdk.org/jdk/compare/6ebea897...f1ad3421 > > src/hotspot/share/utilities/fastHash.hpp line 30: > >> 28: #include "memory/allStatic.hpp" >> 29: >> 30: class FastHash : public AllStatic { > > Where did this hash come from?
>From here: https://github.com/openjdk/jdk/pull/13582#discussion_r1184778234 apparently by @rose00 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1195236715 From stefank at openjdk.org Tue May 16 14:49:51 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 14:49:51 GMT Subject: RFR: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating [v3] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 13:42:07 GMT, Erik ?sterlund wrote: >> The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout. As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation. I have run the test 200 times, and it's no longer failing with generational ZGC. > > Erik ?sterlund has updated the pull request incrementally with one additional commit since the last revision: > > White space fix Marked as reviewed by stefank (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13989#pullrequestreview-1428768327 From kbarrett at openjdk.org Tue May 16 15:05:48 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 16 May 2023 15:05:48 GMT Subject: RFR: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication In-Reply-To: References: Message-ID: On Tue, 16 May 2023 09:39:04 GMT, Stefan Karlsson wrote: > The tests assumes that some strings have not been deduplicated and checks for that. Unfortunately, we can have concurrently triggered GCs that invalidate those checks. I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. Looks good. One comment nit. test/hotspot/jtreg/gc/stringdedup/TestStringDeduplicationTools.java line 409: > 407: > 408: private static void checkNotDeduplicated(Object value1, Object value2) { > 409: // Note that the following check is invalid since a concurrent GC I think the word "concurrent" can be dropped - it seems like any GC could trip over the problem. It's just that the STW GCs are unlikely to occur and trip this because the various forceDeduplication calls force GCs at those points, making automatically triggered GCs elsewhere unlikely. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14005#pullrequestreview-1428805632 PR Review Comment: https://git.openjdk.org/jdk/pull/14005#discussion_r1195308564 From stefank at openjdk.org Tue May 16 15:15:51 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 15:15:51 GMT Subject: RFR: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication In-Reply-To: References: Message-ID: <_kcnJXS5qOmNfh1Depd3BzpWeOfjiZ0GitfR0XN_epg=.5c8ade53-e625-4f43-a094-ee61e8ad1381@github.com> On Tue, 16 May 2023 09:39:04 GMT, Stefan Karlsson wrote: > The tests assumes that some strings have not been deduplicated and checks for that. 
Unfortunately, we can have concurrently triggered GCs that invalidate those checks. I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. Thanks for the review! I'd like to get this pushed ASAP since it intermittently fails in tier2. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14005#issuecomment-1549866382 From stefank at openjdk.org Tue May 16 15:15:49 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 15:15:49 GMT Subject: RFR: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication [v2] In-Reply-To: References: Message-ID: > The tests assumes that some strings have not been deduplicated and checks for that. Unfortunately, we can have concurrently triggered GCs that invalidate those checks. I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Review kim ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14005/files - new: https://git.openjdk.org/jdk/pull/14005/files/ea47391b..8e28179d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14005&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14005&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14005.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14005/head:pull/14005 PR: https://git.openjdk.org/jdk/pull/14005 From tschatzl at openjdk.org Tue May 16 15:24:46 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 16 May 2023 15:24:46 GMT Subject: RFR: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication [v2] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 15:15:49 GMT, Stefan Karlsson wrote: >> The tests assumes that some strings have not been deduplicated and checks 
for that. Unfortunately, we can have concurrently triggered GCs that invalidate those checks. I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Review kim Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14005#pullrequestreview-1428848601 From stefank at openjdk.org Tue May 16 16:07:56 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 16:07:56 GMT Subject: RFR: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication [v2] In-Reply-To: References: Message-ID: <4x5Iinas6i4xvO1k2a8DDrhX47NniWjz-xwG5GPnJdU=.7031c065-cc1d-4a4a-b98a-3f5fcce90b89@github.com> On Tue, 16 May 2023 15:15:49 GMT, Stefan Karlsson wrote: >> The tests assumes that some strings have not been deduplicated and checks for that. Unfortunately, we can have concurrently triggered GCs that invalidate those checks. I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Review kim I'm going to integrate this now. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14005#issuecomment-1549955801 From stefank at openjdk.org Tue May 16 16:07:58 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 16:07:58 GMT Subject: Integrated: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication In-Reply-To: References: Message-ID: On Tue, 16 May 2023 09:39:04 GMT, Stefan Karlsson wrote: > The tests assumes that some strings have not been deduplicated and checks for that. Unfortunately, we can have concurrently triggered GCs that invalidate those checks. 
I kept the call sites to show the intention of the test, but then I added a comment explaining why those checks are invalid. This pull request has now been integrated. Changeset: 682359cb Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/682359cb4871d779425a9468e8a307169b3651d6 Stats: 21 lines in 1 file changed: 9 ins; 5 del; 7 mod 8299075: TestStringDeduplicationInterned.java fails because extra deduplication Reviewed-by: kbarrett, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/14005 From stefank at openjdk.org Tue May 16 16:16:57 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 16:16:57 GMT Subject: Integrated: 8308097: Generational ZGC: Update constructor syntax In-Reply-To: References: Message-ID: On Mon, 15 May 2023 13:11:42 GMT, Stefan Karlsson wrote: > ZGC's current constructor syntax works well with some editors, but not all. There is a wish to move over from the current syntax: > > > ZClass:ZClass() : > ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > to the following syntax: > > ZClass:ZClass() > : ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > I propose that make this change. This pull request has now been integrated. 
Changeset: 60ab1358 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/60ab1358da662977e94759eccb95d75a389fd256 Stats: 622 lines in 95 files changed: 8 ins; 17 del; 597 mod 8308097: Generational ZGC: Update constructor syntax Reviewed-by: eosterlund, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/13987 From stefank at openjdk.org Tue May 16 16:16:56 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 16 May 2023 16:16:56 GMT Subject: RFR: 8308097: Generational ZGC: Update constructor syntax In-Reply-To: References: Message-ID: <6NzlU708OIvOXlxL-NyKCrs1ehbnkwPgyqazeyDEavE=.8c3fa3f5-c0b1-417f-82f9-27e8ba63f178@github.com> On Mon, 15 May 2023 13:11:42 GMT, Stefan Karlsson wrote: > ZGC's current constructor syntax works well with some editors, but not all. There is a wish to move over from the current syntax: > > > ZClass:ZClass() : > ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > to the following syntax: > > ZClass:ZClass() > : ZSuper(), > _member0, > _member1 { > // Code > doit(); > } > > > I propose that make this change. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13987#issuecomment-1549971807 From coleenp at openjdk.org Tue May 16 16:57:06 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 16 May 2023 16:57:06 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: On Sat, 6 May 2023 22:44:26 GMT, Albert Mingkun Yang wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix asserts > > src/hotspot/share/gc/shared/slidingForwarding.hpp line 159: > >> 157: * is sufficient because G1 serial compaction is single-threaded. >> 158: */ >> 159: class FallbackTable : public CHeapObj{ > > Could this class be placed inside `SlidingForwarding` for better encapsulation? Why do you write your own hashtable when there's one in utilities/resourceHash.hpp ? 
there's a put_when_absent() function that is similar than this insert function and faster than put_if_absent(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1195447013 From rkennke at openjdk.org Tue May 16 17:08:09 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 16 May 2023 17:08:09 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v32] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 16:54:00 GMT, Coleen Phillimore wrote: >> src/hotspot/share/gc/shared/slidingForwarding.hpp line 159: >> >>> 157: * is sufficient because G1 serial compaction is single-threaded. >>> 158: */ >>> 159: class FallbackTable : public CHeapObj{ >> >> Could this class be placed inside `SlidingForwarding` for better encapsulation? > > Why do you write your own hashtable when there's one in utilities/resourceHash.hpp ? there's a put_when_absent() function that is similar than this insert function and faster than put_if_absent(). Uh, I've not been aware of it and somehow haven't found it when I searched for something like it. resourceHash.hpp perhaps wasn't a name that struck me as a generic hashtable impl. :-) Let me check if it's feasible to just use the one in resourceHash.hpp ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13582#discussion_r1195460418 From shade at openjdk.org Tue May 16 18:52:46 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 16 May 2023 18:52:46 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v4] In-Reply-To: References: Message-ID: > See the bug for more details. 
> > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1` > - [x] Linux x86_64 fastdebug, `tier2` > - [x] Linux x86_64 fastdebug, `tier3` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Add include ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13982/files - new: https://git.openjdk.org/jdk/pull/13982/files/961b90e1..f1b47733 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13982&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13982.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13982/head:pull/13982 PR: https://git.openjdk.org/jdk/pull/13982 From lmesnik at openjdk.org Tue May 16 21:27:45 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 16 May 2023 21:27:45 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2] In-Reply-To: References: Message-ID: <8egq9N1X4QN6n6f27SskDFCrFTq4RPGVxO707v_hdJc=.37359c30-b2cc-4a4a-8dae-b5e3589b1c21@github.com> On Mon, 15 May 2023 09:27:05 GMT, Leo Korinth wrote: >> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle >> >> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1) >> >> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files. >> >> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467 > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > rerun tests Thanks for this clean up. There are few comments about names. 
test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle3.java line 29: > 27: > 28: // Run in Juggle3Quic.java @test id=1 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp byteArr -ms low > 29: /* @test id=2 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp byteArr -ms medium */ It would be much better to have a meaningful id like 'gc_byteArr_ms_medium'. So we can easier identify failures and easily add/remove rearrange testcases. ------------- PR Review: https://git.openjdk.org/jdk/pull/13929#pullrequestreview-1429455832 PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1195700322 From kbarrett at openjdk.org Wed May 17 03:42:53 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 17 May 2023 03:42:53 GMT Subject: RFR: 8308097: Generational ZGC: Update constructor syntax In-Reply-To: <0Iuyzf9eaCTSG876F3j4aeKUItMZgoT8y1BM_byAiSc=.8cb57667-19d9-4798-8d95-60d3c6735d05@github.com> References: <0Iuyzf9eaCTSG876F3j4aeKUItMZgoT8y1BM_byAiSc=.8cb57667-19d9-4798-8d95-60d3c6735d05@github.com> Message-ID: <1epjlmaYpeMbp7IPH3dfKrfJM6zV5r9cbqadykN37C0=.8bc96e97-abe0-4a65-a46e-8ddb3e1d1c58@github.com> On Tue, 16 May 2023 12:59:11 GMT, Stefan Karlsson wrote: > > Is this something that should be in the style guide? Presumably we have constructors all through the hotspot code that may not conform to this style. > > ZGC has a much stricter code style than the rest of HotSpot. Previous attempts to drive style guide questions have been quite disappointing (at least to me), so we've carved out our own corner in HotSpot where we try to keep the code uniform. As @stefank said, ZGC has a stricter style, making choices in some areas where there doesn't seem to be universal (or even widespread) agreement and usage. 
Constructor style is one of those areas - we're all over the place here, with some styles seeming quite bad to some people. That said, while I personally didn't care for the prior ZGC constructor style (it requires non-default emacs configuration), this new one is what I would prefer if we were going to mandate something. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13987#issuecomment-1550647114 From shade at openjdk.org Wed May 17 09:22:56 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 May 2023 09:22:56 GMT Subject: RFR: 8308088: Improve class check in CollectedHeap::is_oop [v4] In-Reply-To: References: Message-ID: On Tue, 16 May 2023 18:52:46 GMT, Aleksey Shipilev wrote: >> See the bug for more details. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug, `tier1` >> - [x] Linux x86_64 fastdebug, `tier2` >> - [x] Linux x86_64 fastdebug, `tier3` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Add include Thanks for reviews! I am integrating now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13982#issuecomment-1551043531 From shade at openjdk.org Wed May 17 09:22:58 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 17 May 2023 09:22:58 GMT Subject: Integrated: 8308088: Improve class check in CollectedHeap::is_oop In-Reply-To: References: Message-ID: On Mon, 15 May 2023 10:25:15 GMT, Aleksey Shipilev wrote: > See the bug for more details. > > Additional testing: > - [x] Linux x86_64 fastdebug, `tier1` > - [x] Linux x86_64 fastdebug, `tier2` > - [x] Linux x86_64 fastdebug, `tier3` This pull request has now been integrated. 
Changeset: b300e73a
Author: Aleksey Shipilev
URL: https://git.openjdk.org/jdk/commit/b300e73a4acb5c64f68a355e0ad70d3862084ff4
Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod

8308088: Improve class check in CollectedHeap::is_oop

Reviewed-by: zgu, tschatzl, ayang, stuefe

-------------

PR: https://git.openjdk.org/jdk/pull/13982

From aboldtch at openjdk.org Wed May 17 11:08:46 2023
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Wed, 17 May 2023 11:08:46 GMT
Subject: RFR: 8308181: Generational ZGC: Remove CLDG_lock from old gen root scanning
In-Reply-To:
References:
Message-ID: <9YT6ceLsYvdiacsGz6ihgBjnG8jYHUaGtxUOubZRWOo=.39225c80-9c5d-489d-a813-7d37e1d9a57b@github.com>

On Tue, 16 May 2023 12:25:13 GMT, Erik Österlund wrote:

> We already removed the CLDG_lock from young gen root scanning, after the CLDG was made concurrently walkable with [JDK-8307106](https://bugs.openjdk.org/browse/JDK-8307106). We should remove it from the old generation root scanning code as well.

Marked as reviewed by aboldtch (Committer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/14011#pullrequestreview-1430389312

From eosterlund at openjdk.org Wed May 17 11:50:55 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Wed, 17 May 2023 11:50:55 GMT
Subject: Integrated: 8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating
In-Reply-To:
References:
Message-ID: <3qypESEVkZeuxdpkvTT3pJ7zFiCEZX-0x0Ci03XKnd4=.c74974d5-979d-490f-99f9-71225aa03ca1@github.com>

On Mon, 15 May 2023 14:26:42 GMT, Erik Österlund wrote:

> The TestCSLocker.java test spawns a thread that grabs the GC locker, and then wait for the first thread to run some java code and then get signal back to release the GC locker. All of this while another thread is allocating garbage and triggering GCs. Naturally, if the thread that is to signal the release of the GC locker requires GC in order to make progress, we will end up with a deadlock that leads to a timeout.
As it turns out, that does indeed happen. A println statement is performed, which in its internal implementation performs an allocation, which requires GC. I think any GC can spuriously fail here, but it seems more likely with generational ZGC for whatever reason. While it seems really shady to wait with the GC locker held while a Java thread executing Java code is supposed to make progress, in general, I think the test can be fixed by removing the println statement causing the allocation. I have run the test 200 times, and it's no longer failing with generational ZGC.

This pull request has now been integrated.

Changeset: 285c833f
Author: Erik Österlund
URL: https://git.openjdk.org/jdk/commit/285c833ffacdaabe7c4955cbbafb3bc459d26784
Stats: 8 lines in 2 files changed: 4 ins; 3 del; 1 mod

8308043: Deadlock in TestCSLocker.java due to blocking GC while allocating

Reviewed-by: stefank, ayang, tschatzl

-------------

PR: https://git.openjdk.org/jdk/pull/13989

From shade at openjdk.org Wed May 17 11:59:41 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 17 May 2023 11:59:41 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification
Message-ID:

In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.

Example on M1:

Benchmark             (size)  Mode  Cnt       Score      Error  Units

# Before
MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op

# After
MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op  ;  5% better
MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op  ;  7% better
MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op  ; 15% better
MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op  ; 11% better
MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op  ; 15% better
MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op  ; 22% better
MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op  ; 17% better
MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op  ;  8% better
MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op  ; 18% better
MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op  ;  6% better
MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op  ; 13% better

Additional testing:
 - [x] Ad-hoc micro-benchmarks
 - [x] Linux x86_64 fastdebug `serviceability/jvmti`
 - [x] Linux x86_64 fastdebug `jdk/jfr`
 - [x] Linux x86_64 fastdebug `tier1 tier2 tier3`
 - [x] Linux AArch64 fastdebug `tier1 tier2 tier3`

-------------

Commit messages:
 - Hide more stuff
 - Touchups
 - Branch
 - Fix

Changes: https://git.openjdk.org/jdk/pull/14019/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14019&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8308231
Stats: 29 lines in 1 file changed: 20 ins; 2 del; 7 mod
Patch: https://git.openjdk.org/jdk/pull/14019.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14019/head:pull/14019

PR: https://git.openjdk.org/jdk/pull/14019

From lkorinth at openjdk.org Wed May 17 15:06:05 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Wed, 17 May 2023 15:06:05 GMT
Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v3]
In-Reply-To:
References:
Message-ID:

> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle
>
> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1)
>
> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files.
>
> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround.
See: https://bugs.openjdk.org/browse/CODETOOLS-7903467

Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:

  remove comments, add descriptive ids, remove bad README

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/13929/files
 - new: https://git.openjdk.org/jdk/pull/13929/files/7bda00db..5f9ab708

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13929&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13929&range=01-02

Stats: 135 lines in 3 files changed: 0 ins; 101 del; 34 mod
Patch: https://git.openjdk.org/jdk/pull/13929.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13929/head:pull/13929

PR: https://git.openjdk.org/jdk/pull/13929

From lkorinth at openjdk.org Wed May 17 15:06:10 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Wed, 17 May 2023 15:06:10 GMT
Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 15 May 2023 09:27:05 GMT, Leo Korinth wrote:

>> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle
>>
>> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1)
>>
>> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files.
>>
>> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
>
>   rerun tests

I removed the test cases that were commented out. I added descriptive ids to the test cases (they are not used now, but they could be used in the future to create a quick test group), and I removed the README, which I thought was of little help and which had not been updated for a long time.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/13929#issuecomment-1551555423

From lmesnik at openjdk.org Wed May 17 15:51:25 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 17 May 2023 15:51:25 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification
In-Reply-To:
References:
Message-ID:

On Tue, 16 May 2023 19:36:54 GMT, Aleksey Shipilev wrote:

> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>
> Example on M1:
>
> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>
> # Before
> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>
> # After
> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op  ;  5% better
> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op  ;  7% better
> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op  ; 15% better
> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op  ; 11% better
> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op  ; 15% better
> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op  ; 22% better
> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op  ; 17% better
> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op  ;  8% better
> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op  ; 18% better
> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op  ;  6% better
> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op  ; 13% better
>
> Additional testing:
>  - [x] Ad-hoc micro-benchmarks
>  - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>  - [x] Linux x86_64 fastdebug `jdk/jfr`
>  - [x] Linux x86_64 fastdebug `tier1 tier2 tier3`
>  - [x] Linux AArch64 ...

src/hotspot/share/gc/shared/memAllocator.cpp line 75:

> 73: // - (optionally) the enabled JVMTI event that wants to capture all allocations;
> 74:
> 75: bool should_notify_allocation_no_jvmti_vmobjalloc() {

As I understand this check is specific for SampledObjectAlloc event. Might be better to name it like: should_notify_jvmti_sampled_object_alloc? Does it make sense to check should_post_sampled_object_alloc also here? It is usually disabled.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1196728090

From rkennke at openjdk.org Wed May 17 16:12:10 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 17 May 2023 16:12:10 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v44]
In-Reply-To:
References:
Message-ID:

> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large.
>
> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
> > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... 
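The bit layout described in the quoted text can be sketched in a few lines of C++. This is only an illustration of the encoding, not the code from the PR: the class name `SlidingForwardingSketch`, the constants, the `HeapWord` alias and the choice to return `nullptr` for the fallback case are all assumptions made for this example, and the real `slidingForwarding.hpp` may differ in every detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative stand-in for one heap word; not HotSpot's HeapWord type.
using HeapWord = uintptr_t;

// Bit layout as listed in the PR description (constant names are hypothetical).
constexpr uint32_t FORWARDED_BITS = 0x3;      // bits 0..1 == 0b11: object is forwarded
constexpr uint32_t FALLBACK_BIT   = 1u << 2;  // bit 2: forwardee lives in the fallback table
constexpr uint32_t REGION_SELECT  = 1u << 3;  // bit 3: target region 0 or 1
constexpr unsigned OFFSET_SHIFT   = 4;        // bits 4..31: offset in heap words

class SlidingForwardingSketch {
public:
  SlidingForwardingSketch(HeapWord* heap_start, size_t num_regions, size_t region_words)
    : _heap_start(heap_start),
      _region_words(region_words),
      _bases(num_regions * 2, nullptr) {
    // The offset must fit into the 28 bits above OFFSET_SHIFT.
    assert(region_words <= (size_t(1) << 28));
  }

  // Encodes "from is forwarded to to" into a 32-bit value.
  // If both target slots of the source region are already bound to other
  // regions, only the fallback marker is returned (hypothetical behavior).
  uint32_t encode(HeapWord* from, HeapWord* to) {
    size_t from_region = region_index(from);
    HeapWord* to_base = region_base(to);
    for (uint32_t slot = 0; slot < 2; slot++) {
      HeapWord*& entry = _bases[from_region * 2 + slot];
      if (entry == nullptr) {
        entry = to_base;  // first forwarding into this target region claims the slot
      }
      if (entry == to_base) {
        uint32_t offset = static_cast<uint32_t>(to - to_base);
        return (offset << OFFSET_SHIFT) | (slot ? REGION_SELECT : 0u) | FORWARDED_BITS;
      }
    }
    return FALLBACK_BIT | FORWARDED_BITS;
  }

  // Decodes the forwardee of `from`. Returns nullptr when the fallback bit is
  // set, where a real implementation would consult the fallback table.
  HeapWord* decode(HeapWord* from, uint32_t encoded) const {
    assert((encoded & FORWARDED_BITS) == FORWARDED_BITS);
    if (encoded & FALLBACK_BIT) {
      return nullptr;
    }
    size_t slot = (encoded & REGION_SELECT) ? 1 : 0;
    size_t offset = encoded >> OFFSET_SHIFT;
    return _bases[region_index(from) * 2 + slot] + offset;
  }

private:
  size_t region_index(HeapWord* addr) const {
    return static_cast<size_t>(addr - _heap_start) / _region_words;
  }
  HeapWord* region_base(HeapWord* addr) const {
    return _heap_start + region_index(addr) * _region_words;
  }

  HeapWord* _heap_start;
  size_t _region_words;
  std::vector<HeapWord*> _bases;  // two target base addresses per source region
};
```

In this sketch the per-region table holds the two target base addresses, so the header only needs one selector bit plus a word offset. That is what allows the forwarding to fit into 32 bits while the upper header bits stay untouched.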
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Replace homegrown FallbackTable with a ResourceHashtable based impl ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/f1ad3421..3f7cc376 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=43 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=42-43 Stats: 79 lines in 2 files changed: 2 ins; 59 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Wed May 17 21:25:29 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 17 May 2023 21:25:29 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v45] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. 
This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 111 commits:

 - Merge branch 'master' into JDK-8305896
 - Replace homegrown FallbackTable with a ResourceHashtable based impl
 - Merge remote-tracking branch 'origin/JDK-8305896' into JDK-8305896
 - Some more @shipilev comments
 - Update src/hotspot/share/gc/shared/slidingForwarding.hpp
   Co-authored-by: Aleksey Shipilëv
 - Align fake-heap without GCC warnings (duh)
 - Merge branch 'master' into JDK-8305896
 - Fix gtest: Align fake-heaps, avoid re-forwardings
 - @tschatzl's latest fix, cleanup and a test that checks unaligned heap problem
 - Fix build
 - ... and 101 more: https://git.openjdk.org/jdk/compare/902585be...bff747fa

-------------

Changes: https://git.openjdk.org/jdk/pull/13582/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=44
Stats: 855 lines in 24 files changed: 818 ins; 0 del; 37 mod
Patch: https://git.openjdk.org/jdk/pull/13582.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582

PR: https://git.openjdk.org/jdk/pull/13582

From rkennke at openjdk.org Wed May 17 21:32:02 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 17 May 2023 21:32:02 GMT
Subject: RFR: 8305898: Alternative self-forwarding mechanism [v15]
In-Reply-To:
References:
Message-ID: <-VzsGc5hmzkgN9MekiGBRjSmettllFG5aiWcRBf9Wps=.11c85a49-4749-401e-94ea-1c7864954f3a@github.com>

> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'.
That preserves the crucial class information in the upper bits of the header until the full header gets restored.

Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 27 commits:

 - Merge branch 'JDK-8305896' into JDK-8305898
 - Fix tests on 32bit builds
 - Merge branch 'JDK-8305896' into JDK-8305898
 - Merge branch 'JDK-8305896' into JDK-8305898
 - wqRevert "Rename self-forwarded -> forward-failed"
   This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130.
 - Merge branch 'JDK-8305896' into JDK-8305898
 - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898
 - Update src/hotspot/share/oops/oop.inline.hpp
   Co-authored-by: Aleksey Shipilëv
 - Update src/hotspot/share/oops/oop.inline.hpp
   Co-authored-by: Aleksey Shipilëv
 - Rename self-forwarded -> forward-failed
 - ... and 17 more: https://git.openjdk.org/jdk/compare/bff747fa...9e934ba7

-------------

Changes: https://git.openjdk.org/jdk/pull/13779/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=14
Stats: 97 lines in 8 files changed: 81 ins; 2 del; 14 mod
Patch: https://git.openjdk.org/jdk/pull/13779.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779

PR: https://git.openjdk.org/jdk/pull/13779

From rkennke at openjdk.org Wed May 17 21:37:02 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 17 May 2023 21:37:02 GMT
Subject: RFR: 8305898: Alternative self-forwarding mechanism [v16]
In-Reply-To:
References:
Message-ID:

> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing.
I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Update comment about mark-word layout ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/9e934ba7..4895ad86 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=14-15 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From rkennke at openjdk.org Thu May 18 20:48:43 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 18 May 2023 20:48:43 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v46] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. 
For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... 
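As an illustration of the bit layout listed in the quoted description above, here is a minimal C++ sketch. It is not the code from the pull request: the names (`encode`, `decode`, `TargetBases`) are made up for this example, addresses are modelled as plain heap-word indices, and the fallback table is left out (only the flag bit is checked).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the 32-bit forwarding encoding; not HotSpot code.
namespace sketch {

constexpr uint32_t FORWARDED_BITS = 0b11;     // bits 0..1: object is forwarded
constexpr uint32_t FALLBACK_BIT   = 1u << 2;  // bit 2: forwardee lives in the fallback table
constexpr uint32_t REGION_SELECT  = 1u << 3;  // bit 3: target region slot 0 or 1
constexpr unsigned OFFSET_SHIFT   = 4;        // bits 4..31: offset in heap words
constexpr uint32_t MAX_OFFSET     = (1u << 28) - 1;

// One table entry per source region: the bases of its two possible
// target regions, expressed here as heap-word indices.
struct TargetBases {
  size_t base[2];
};

// Pack (target slot, offset from that target's base) into the low 32 bits.
inline uint32_t encode(unsigned slot, uint32_t offset_words) {
  assert(slot < 2 && offset_words <= MAX_OFFSET);
  return (offset_words << OFFSET_SHIFT) | (slot ? REGION_SELECT : 0u) | FORWARDED_BITS;
}

inline bool is_forwarded(uint32_t low) { return (low & 0b11) == FORWARDED_BITS; }
inline bool is_fallback(uint32_t low)  { return (low & FALLBACK_BIT) != 0; }

// Recover the forwardee's heap-word index for an object in `source_region`.
inline size_t decode(const std::vector<TargetBases>& table,
                     size_t source_region, uint32_t low) {
  assert(is_forwarded(low) && !is_fallback(low));
  unsigned slot   = (low & REGION_SELECT) ? 1 : 0;
  uint32_t offset = low >> OFFSET_SHIFT;
  return table[source_region].base[slot] + offset;
}

}  // namespace sketch
```

With 28 offset bits, one target region can span at most 2^28 heap words in this sketch; how the real patch sizes regions against that limit is not shown in the quoted text.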
Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Remove G1-only assert for fallback forwarding, and comment with explanation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/bff747fa..6bbb8e01 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=45 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=44-45 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Thu May 18 20:49:57 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 18 May 2023 20:49:57 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v17] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. 
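The self-forwarding proposal quoted above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the mark word is modelled as a plain `uint64_t`, "bit #3" is assumed to be counted from one (mask `0b100`, directly above the two lock bits), and the function names are hypothetical rather than taken from `markWord.hpp` or `oop.inline.hpp`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a self-forwarded flag in the mark word; not HotSpot code.
namespace sketch {

// Assumption: "bit #3" in the mail is the third bit counted from one.
constexpr uint64_t SELF_FORWARDED_BIT = uint64_t(1) << 2;

// Flag the object as self-forwarded without touching any other header bit.
inline uint64_t set_self_forwarded(uint64_t mark) {
  return mark | SELF_FORWARDED_BIT;
}

inline bool is_self_forwarded(uint64_t mark) {
  return (mark & SELF_FORWARDED_BIT) != 0;
}

// Undo the flag once the failed evacuation has been handled.
inline uint64_t clear_self_forwarded(uint64_t mark) {
  return mark & ~SELF_FORWARDED_BIT;
}

// Stand-in for the class information that compact headers keep in the upper bits.
inline uint64_t upper_bits(uint64_t mark) {
  return mark >> 32;
}

}  // namespace sketch
```

The point of the design is visible in `upper_bits`: unlike storing a pointer to self, setting a single low bit leaves the upper half of the header readable while the flag is set.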
Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8305896' into JDK-8305898 - Remove G1-only assert for fallback forwarding, and comment with explanation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13779/files - new: https://git.openjdk.org/jdk/pull/13779/files/4895ad86..3519da72 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=15-16 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From dholmes at openjdk.org Fri May 19 06:20:58 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 19 May 2023 06:20:58 GMT Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v3] In-Reply-To: References: Message-ID: On Wed, 17 May 2023 15:06:05 GMT, Leo Korinth wrote: >> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle >> >> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1) >> >> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files. >> >> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467 > > Leo Korinth has updated the pull request incrementally with one additional commit since the last revision: > > remove comments, add descriptive ids, remove bad README Nothing further from me. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/13929#pullrequestreview-1433838540

From shade at openjdk.org Fri May 19 07:30:48 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 19 May 2023 07:30:48 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To:
References:
Message-ID: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>

> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>
> Example on M1:
>
>
> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>
> # Before
> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>
> # After
> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op ; 5% better
> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op ; 7% better
> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op ; 15% better
> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op ; 11% better
> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op ; 15% better
> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op ; 22% better
> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op ; 17% better
> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op ; 8% better
> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op ; 18% better
> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op ; 6% better
> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op ; 13% better
>
>
> Additional testing:
> - [x] Ad-hoc micro-benchmarks
> - [x] Linux x86_64 fastdebug `serviceability/jvmti`
> - [x] Linux x86_64 fastdebug `jdk/jfr`
> - [x] Linux x86_64 fastdebug `tier1 tier2 tier3`
> - [x] Linux AArch64 ...

Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:

 - Touch up comment
 - Merge branch 'master' into JDK-8308231-memalloc-check-faster
 - Hide more stuff
 - Touchups
 - Branch
 - Fix

-------------

Changes:
 - all: https://git.openjdk.org/jdk/pull/14019/files
 - new: https://git.openjdk.org/jdk/pull/14019/files/d3709f2d..d189a358

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14019&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14019&range=00-01

Stats: 13856 lines in 437 files changed: 8752 ins; 2388 del; 2716 mod
Patch: https://git.openjdk.org/jdk/pull/14019.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14019/head:pull/14019

PR: https://git.openjdk.org/jdk/pull/14019

From shade at openjdk.org Fri May 19 07:33:53 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Fri, 19 May 2023 07:33:53 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To:
References:
Message-ID: <39sDYy4eEH_FZzyFZtEVraZkHiNj8h9GcVWr2lNxZEg=.108b5e43-c8ff-4f9c-966d-a8452ad8b272@github.com>

On Wed, 17 May 2023 15:45:59 GMT, Leonid Mesnik wrote:

>> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>>
>> - Touch up comment
>> - Merge branch 'master' into JDK-8308231-memalloc-check-faster
>> - Hide more stuff
>> - Touchups
>> - Branch
>> - Fix
>
> src/hotspot/share/gc/shared/memAllocator.cpp line 75:
>
>> 73: // - (optionally) the enabled JVMTI event that wants to capture all allocations;
>> 74:
>> 75: bool should_notify_allocation_no_jvmti_vmobjalloc() {
>
> As I understand this check is specific for SampledObjectAlloc event. Might be better to name it like: should_notify_jvmti_sampled_object_alloc?
>
> Does it make sense to check should_post_sampled_object_alloc also here? It is usually disabled.

No, this check verifies everything _but_ `JvmtiExport::should_post_vm_object_alloc()`, see the full method below. So `MemAllocator::Allocation::notify_allocation_jvmti_sampler` can actually be called for two reasons: a sampling event is required, or VMObjectAlloc is required. The `should_notify_allocation_no_jvmti_vmobjalloc` helper disambiguates the case where we don't need to proceed with sampling event gathering.

Honestly, I can just revert this hunk, but I think it is cleaner to expose the helper:

@@ -187,9 +206,8 @@ void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
     return;
   }

-  if (!_allocated_outside_tlab && _allocated_tlab_size == 0 && !_tlab_end_reset_for_sample) {
-    // Sample if it's a non-TLAB allocation, or a TLAB allocation that either refills the TLAB
-    // or expands it due to taking a sampler induced slow path.
+  if (!should_notify_allocation_no_jvmti_vmobjalloc()) {
+    // Called here only for JVMTI VMObjectAlloc event
     return;
   }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1198639901

From lkorinth at openjdk.org Fri May 19 08:41:51 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Fri, 19 May 2023 08:41:51 GMT
Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v2]
In-Reply-To: <8egq9N1X4QN6n6f27SskDFCrFTq4RPGVxO707v_hdJc=.37359c30-b2cc-4a4a-8dae-b5e3589b1c21@github.com>
References: <8egq9N1X4QN6n6f27SskDFCrFTq4RPGVxO707v_hdJc=.37359c30-b2cc-4a4a-8dae-b5e3589b1c21@github.com>
Message-ID:

On Tue, 16 May 2023 21:21:58 GMT, Leonid Mesnik wrote:

>> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
>>
>> rerun tests
>
> test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle/Juggle3.java line 29:
>
>> 27:
>> 28: // Run in Juggle3Quic.java @test id=1 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp byteArr -ms low
>> 29: /* @test id=2 @key stress randomness @library /vmTestbase /test/lib @run main/othervm -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc=debug:gc.log gc.ArrayJuggle.Juggle3 -gp byteArr -ms medium */
>
> It would be much better to have a meaningful id like 'gc_byteArr_ms_medium'. So we can easier identify failures and easily add/remove rearrange testcases.

I added IDs with names, although a bit shorter.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13929#discussion_r1198702689

From lmesnik at openjdk.org Fri May 19 14:10:53 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Fri, 19 May 2023 14:10:53 GMT
Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 17 May 2023 15:06:05 GMT, Leo Korinth wrote:

>> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle
>>
>> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1)
>>
>> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files.
>>
>> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
>
> remove comments, add descriptive ids, remove bad README

Marked as reviewed by lmesnik (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/13929#pullrequestreview-1434500291

From lkorinth at openjdk.org Fri May 19 14:39:52 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Fri, 19 May 2023 14:39:52 GMT
Subject: RFR: 8307804: Reorganize ArrayJuggle test cases [v3]
In-Reply-To:
References:
Message-ID: <0bFyVUpgeW9OZzQ4HiJUUVA5SzMLLaNDtnuZM22z2FI=.221183d4-2eb0-4ad0-a229-bb2ac63bab45@github.com>

On Wed, 17 May 2023 15:06:05 GMT, Leo Korinth wrote:

>> Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle
>>
>> Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1)
>>
>> Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files.
>>
>> Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467
>
> Leo Korinth has updated the pull request incrementally with one additional commit since the last revision:
>
> remove comments, add descriptive ids, remove bad README

Thanks David and Leonid! I will integrate after the weekend.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13929#issuecomment-1554689942

From lmesnik at openjdk.org Sat May 20 21:41:51 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Sat, 20 May 2023 21:41:51 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>
References: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>
Message-ID:

On Fri, 19 May 2023 07:30:48 GMT, Aleksey Shipilev wrote:

>> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>>
>> Example on M1:
>>
>>
>> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>>
>> # Before
>> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
>> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
>> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
>> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
>> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
>> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
>> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
>> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
>> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
>> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
>> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>>
>> # After
>> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op ; 5% better
>> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op ; 7% better
>> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op ; 15% better
>> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op ; 11% better
>> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op ; 15% better
>> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op ; 22% better
>> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op ; 17% better
>> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op ; 8% better
>> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op ; 18% better
>> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op ; 6% better
>> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op ; 13% better
>>
>>
>> Additional testing:
>> - [x] Ad-hoc micro-benchmarks
>> - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>> - [x] Linux x86_64 fastdebug `jdk/jfr`
>> - [x] Linux x86_64 fastdebug `t...
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
> - Touch up comment
> - Merge branch 'master' into JDK-8308231-memalloc-check-faster
> - Hide more stuff
> - Touchups
> - Branch
> - Fix

Still need review from gc team.

-------------

Marked as reviewed by lmesnik (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14019#pullrequestreview-1435449105 From aboldtch at openjdk.org Mon May 22 07:44:54 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 22 May 2023 07:44:54 GMT Subject: RFR: 8308500: ZStatSubPhase::register_start should not call register_gc_phase_start if ZAbort::should_abort() Message-ID: `ZStatSubPhase::register_start` should not call `register_gc_phase_start` if `ZAbort::should_abort()` is true. This will cause an unbalanced push and pop behaviour of the phase stack as `ZStatSubPhase::register_end` stops popping (and sending events) after the aborting has started. This will create an issue if more subsequent sub-phases are added in-between two abort points as the phase stack may overflow. Simply avoid pushing new phases when aborting has started solves this issue. ------------- Commit messages: - 8308500: ZStatSubPhase::register_start should not call register_gc_phase_start if ZAbort::should_abort() Changes: https://git.openjdk.org/jdk/pull/14075/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14075&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308500 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14075.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14075/head:pull/14075 PR: https://git.openjdk.org/jdk/pull/14075 From lkorinth at openjdk.org Mon May 22 08:21:04 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 22 May 2023 08:21:04 GMT Subject: Integrated: 8307804: Reorganize ArrayJuggle test cases In-Reply-To: References: Message-ID: On Thu, 11 May 2023 11:44:14 GMT, Leo Korinth wrote: > Move all ArrayJuggle test cases to the same directory: test/hotspot/jtreg/vmTestbase/gc/ArrayJuggle > > Rename Juggle01 to Juggle3 (so it will not be confused with Juggle1) > > Remove all directories and files used to launch the tests, instead use multiple `@test id=xx` "annotations" in the four kept test files. 
> > Create a new test file Juggle3Quick.java that will act as a quick group of tests. Unfortunately `#id` selectors can not be used in test groups so this is a workaround. See: https://bugs.openjdk.org/browse/CODETOOLS-7903467 This pull request has now been integrated. Changeset: b5887979 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/b58879790083b704da94ea1476fcadb0e65b0805 Stats: 2660 lines in 81 files changed: 157 ins; 2493 del; 10 mod 8307804: Reorganize ArrayJuggle test cases Reviewed-by: dholmes, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/13929 From eosterlund at openjdk.org Mon May 22 10:23:58 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Mon, 22 May 2023 10:23:58 GMT Subject: RFR: 8308181: Generational ZGC: Remove CLDG_lock from old gen root scanning In-Reply-To: <9YT6ceLsYvdiacsGz6ihgBjnG8jYHUaGtxUOubZRWOo=.39225c80-9c5d-489d-a813-7d37e1d9a57b@github.com> References: <9YT6ceLsYvdiacsGz6ihgBjnG8jYHUaGtxUOubZRWOo=.39225c80-9c5d-489d-a813-7d37e1d9a57b@github.com> Message-ID: <6gJRcNXl0a97gjEmrNGmC-khlES-SMf5sjED_WAwvFE=.4c1edbf6-0117-4f57-becf-c623a14dcc33@github.com> On Wed, 17 May 2023 11:06:14 GMT, Axel Boldt-Christmas wrote: >> We already removed the CLDG_lock from young gen root scanning, after the CLDG was made concurrently walkable with [JDK-8307106](https://bugs.openjdk.org/browse/JDK-8307106). We should remove it from the old generation root scanning code as well. > > Marked as reviewed by aboldtch (Committer). Thanks for the reviews @xmas92 and @albertnetymk! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14011#issuecomment-1556955271

From eosterlund at openjdk.org Mon May 22 10:23:59 2023
From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=)
Date: Mon, 22 May 2023 10:23:59 GMT
Subject: Integrated: 8308181: Generational ZGC: Remove CLDG_lock from old gen root scanning
In-Reply-To:
References:
Message-ID:

On Tue, 16 May 2023 12:25:13 GMT, Erik Österlund wrote:

> We already removed the CLDG_lock from young gen root scanning, after the CLDG was made concurrently walkable with [JDK-8307106](https://bugs.openjdk.org/browse/JDK-8307106). We should remove it from the old generation root scanning code as well.

This pull request has now been integrated.

Changeset: 8011ba74
Author: Erik Österlund
URL: https://git.openjdk.org/jdk/commit/8011ba74a20c069e094a878ab4a1843036521272
Stats: 14 lines in 2 files changed: 0 ins; 12 del; 2 mod

8308181: Generational ZGC: Remove CLDG_lock from old gen root scanning

Reviewed-by: ayang, aboldtch

-------------

PR: https://git.openjdk.org/jdk/pull/14011

From stefank at openjdk.org Mon May 22 10:45:50 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 22 May 2023 10:45:50 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To: <39sDYy4eEH_FZzyFZtEVraZkHiNj8h9GcVWr2lNxZEg=.108b5e43-c8ff-4f9c-966d-a8452ad8b272@github.com>
References: <39sDYy4eEH_FZzyFZtEVraZkHiNj8h9GcVWr2lNxZEg=.108b5e43-c8ff-4f9c-966d-a8452ad8b272@github.com>
Message-ID:

On Fri, 19 May 2023 07:31:02 GMT, Aleksey Shipilev wrote:

>> src/hotspot/share/gc/shared/memAllocator.cpp line 75:
>>
>>> 73: // - (optionally) the enabled JVMTI event that wants to capture all allocations;
>>> 74:
>>> 75: bool should_notify_allocation_no_jvmti_vmobjalloc() {
>>
>> As I understand this check is specific for SampledObjectAlloc event. Might be better to name it like: should_notify_jvmti_sampled_object_alloc?
>> >> Does it make sense to check should_post_sampled_object_alloc also here? It is usually disabled. > > No, this check verifies everything, _but_ `JvmtiExport::should_post_vm_object_alloc()`, see the full method below. So in `MemAllocator::Allocation::notify_allocation_jvmti_sampler` can actually be called for two reasons: sampling event is required, or VMObjectAlloc is required. The `should_notify_allocation_no_jvmti_vmobjalloc` disambiguates the case where we don't need to proceed with sampling event gathering. > > Honestly, I can just revert this hunk, but I think it is cleaner to expose the helper: > > > @@ -187,9 +206,8 @@ void MemAllocator::Allocation::notify_allocation_jvmti_sampler() { > return; > } > > - if (!_allocated_outside_tlab && _allocated_tlab_size == 0 && !_tlab_end_reset_for_sample) { > - // Sample if it's a non-TLAB allocation, or a TLAB allocation that either refills the TLAB > - // or expands it due to taking a sampler induced slow path. > + if (!should_notify_allocation_no_jvmti_vmobjalloc()) { > + // Called here only for JVMTI VMObjectAlloc event > return; > } FWIW, I found the name `should_notify_allocation_no_jvmti_vmobjalloc` confusing when I read this patch. Maybe a better name would be `allocated_in_slow_path()` or negate it to `!allocated_in_fast_path()`? 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1200334550

From stefank at openjdk.org Mon May 22 10:53:57 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 22 May 2023 10:53:57 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>
References: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>
Message-ID: <6OmfBmoX4m0QF0qSXPqhJePZeP4Y0_Pwpv9vQ-0no6k=.3b8afc21-db07-4523-978c-2479076c1be8@github.com>

On Fri, 19 May 2023 07:30:48 GMT, Aleksey Shipilev wrote:

>> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>>
>> Example on M1:
>>
>>
>> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>>
>> # Before
>> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
>> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
>> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
>> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
>> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
>> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
>> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
>> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
>> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
>> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
>> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>>
>> # After
>> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op ; 5% better
>> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op ; 7% better
>> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op ; 15% better
>> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op ; 11% better
>> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op ; 15% better
>> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op ; 22% better
>> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op ; 17% better
>> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op ; 8% better
>> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op ; 18% better
>> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op ; 6% better
>> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op ; 13% better
>>
>>
>> Additional testing:
>> - [x] Ad-hoc micro-benchmarks
>> - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>> - [x] Linux x86_64 fastdebug `jdk/jfr`
>> - [x] Linux x86_64 fastdebug `t...
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
> - Touch up comment
> - Merge branch 'master' into JDK-8308231-memalloc-check-faster
> - Hide more stuff
> - Touchups
> - Branch
> - Fix

Changes requested by stefank (Reviewer).

src/hotspot/share/gc/shared/memAllocator.cpp line 106:

> 104: if (should_notify_allocation()) {
> 105: notify_allocation(_thread);
> 106: }

`notify_allocation` used to call `notify_allocation_low_memory_detector` and `notify_allocation_dtrace_sampler` without performing any "should notify" filtering. Is this change in behavior intentional?

src/hotspot/share/gc/shared/memAllocator.cpp line 210:

> 208:
> 209: if (!should_notify_allocation_no_jvmti_vmobjalloc()) {
> 210: // Called here only for JVMTI VMObjectAlloc event

This comment sounds like we are executing the return statement because of `JVMTI VMObjectAlloc event`, but isn't it the opposite?
The reason why we are in this function and *not* executing the return statement is because the VMObjectAlloc event is turned on. Would it make more sense to move the comment to after this if block? ------------- PR Review: https://git.openjdk.org/jdk/pull/14019#pullrequestreview-1436341555 PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1200336211 PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1200341940 From stefank at openjdk.org Mon May 22 10:55:48 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 22 May 2023 10:55:48 GMT Subject: RFR: 8308500: ZStatSubPhase::register_start should not call register_gc_phase_start if ZAbort::should_abort() In-Reply-To: References: Message-ID: On Mon, 22 May 2023 07:37:19 GMT, Axel Boldt-Christmas wrote: > `ZStatSubPhase::register_start` should not call `register_gc_phase_start` if `ZAbort::should_abort()` is true. This will cause an unbalanced push and pop behaviour of the phase stack as `ZStatSubPhase::register_end` stops popping (and sending events) after the aborting has started. This will create an issue if more subsequent sub-phases are added in-between two abort points as the phase stack may overflow. > > Simply avoid pushing new phases when aborting has started solves this issue. Changes requested by stefank (Reviewer). src/hotspot/share/gc/z/zStat.cpp line 811: > 809: > 810: void ZStatSubPhase::register_start(ConcurrentGCTimer* timer, const Ticks& start) const { > 811: if (timer != nullptr && !ZAbort::should_abort()) { `register_end` also skips the logging part if we should abort. Should we do the same for the `register_start`? 
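The push/pop imbalance described in the PR text can be reproduced with a small toy model. This is only a sketch under stated assumptions: `ToyPhaseStack`, `toy_register_start`, `toy_register_end` and the global abort flag are hypothetical stand-ins, not the real `ZStatSubPhase`, `ConcurrentGCTimer` or `ZAbort` code, and the `timer != nullptr` check and all logging are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ZAbort::should_abort().
static bool g_abort_requested = false;
static bool toy_should_abort() { return g_abort_requested; }

// Hypothetical stand-in for the timer's phase stack.
class ToyPhaseStack {
 public:
  void push(int phase_id) { _phases.push_back(phase_id); }
  void pop() {
    assert(!_phases.empty());
    _phases.pop_back();
  }
  std::size_t depth() const { return _phases.size(); }

 private:
  std::vector<int> _phases;
};

// With skip_push_on_abort == false this models the old behaviour
// (always push). With true it models the proposed fix.
static void toy_register_start(ToyPhaseStack& stack, int phase_id,
                               bool skip_push_on_abort) {
  if (skip_push_on_abort && toy_should_abort()) {
    return;  // do not push once aborting has started
  }
  stack.push(phase_id);
}

// The end side stops popping once aborting has started, as the PR
// description states for register_end.
static void toy_register_end(ToyPhaseStack& stack) {
  if (toy_should_abort()) {
    return;
  }
  stack.pop();
}

// Runs `count` sub-phases while the abort flag is set and reports how
// many entries are left on the stack afterwards.
static std::size_t depth_after_aborted_subphases(int count,
                                                 bool skip_push_on_abort) {
  ToyPhaseStack stack;
  g_abort_requested = true;
  for (int i = 0; i < count; i++) {
    toy_register_start(stack, i, skip_push_on_abort);
    toy_register_end(stack);
  }
  g_abort_requested = false;
  return stack.depth();
}
```

In this model every sub-phase started after the abort leaves one entry behind when the start side is unguarded, which is how a fixed-size stack would eventually overflow. Guarding the push keeps the depth at zero.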
------------- PR Review: https://git.openjdk.org/jdk/pull/14075#pullrequestreview-1436354071 PR Review Comment: https://git.openjdk.org/jdk/pull/14075#discussion_r1200344003 From aboldtch at openjdk.org Mon May 22 11:09:53 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 22 May 2023 11:09:53 GMT Subject: RFR: 8308500: ZStatSubPhase::register_start should not call register_gc_phase_start if ZAbort::should_abort() In-Reply-To: References: Message-ID: On Mon, 22 May 2023 10:52:44 GMT, Stefan Karlsson wrote: >> `ZStatSubPhase::register_start` should not call `register_gc_phase_start` if `ZAbort::should_abort()` is true. This will cause an unbalanced push and pop behaviour of the phase stack as `ZStatSubPhase::register_end` stops popping (and sending events) after the aborting has started. This will create an issue if more subsequent sub-phases are added in-between two abort points as the phase stack may overflow. >> >> Simply avoid pushing new phases when aborting has started solves this issue. > > src/hotspot/share/gc/z/zStat.cpp line 811: > >> 809: >> 810: void ZStatSubPhase::register_start(ConcurrentGCTimer* timer, const Ticks& start) const { >> 811: if (timer != nullptr && !ZAbort::should_abort()) { > > `register_end` also skips the logging part if we should abort. Should we do the same for the `register_start`? My initial implementation had that. (just an early return) But thought it might be interesting to see if there ever is an issue with the abort code to have the phase starts in the log. The end logging is less interesting as we are aborting. This is how single generation also does it. Given that we only log sub phases on debug, maybe this is less useful as we will miss it in the normal gc* logs. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14075#discussion_r1200359600 From stefank at openjdk.org Mon May 22 11:16:51 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 22 May 2023 11:16:51 GMT Subject: RFR: 8308500: ZStatSubPhase::register_start should not call register_gc_phase_start if ZAbort::should_abort() In-Reply-To: References: Message-ID: On Mon, 22 May 2023 07:37:19 GMT, Axel Boldt-Christmas wrote: > `ZStatSubPhase::register_start` should not call `register_gc_phase_start` if `ZAbort::should_abort()` is true. This will cause an unbalanced push and pop behaviour of the phase stack as `ZStatSubPhase::register_end` stops popping (and sending events) after the aborting has started. This will create an issue if more subsequent sub-phases are added in-between two abort points as the phase stack may overflow. > > Simply avoid pushing new phases when aborting has started solves this issue. OK. Then let's go with your current proposal. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14075#pullrequestreview-1436395352 From lkorinth at openjdk.org Mon May 22 11:53:16 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 22 May 2023 11:53:16 GMT Subject: RFR: 8308506: Reduce testing time by removing combinations tested Message-ID: Juggle3.java and Juggle3Quick.java tests many combinations of array juggling on different kinds of arrays. It tests: - Primitive arrays (three for each kind of primitive) - Object arrays - Random arrays - A version with hashing However when testing using primitive arrays, there is --- from a gc perspective --- no extra logic tested by going through all combinations. By removing these, we can save time and put our resources on testing other stuff. I will keep three versions of primitive array tests (low, medium and high) with different kinds of arrays. Also, the random array test will still test all primitive versions of arrays. 
-------------

Commit messages:
 - 8308506: Reduce testing time by removing combinations tested

Changes: https://git.openjdk.org/jdk/pull/14078/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14078&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8308506
  Stats: 21 lines in 2 files changed: 0 ins; 21 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/14078.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14078/head:pull/14078

PR: https://git.openjdk.org/jdk/pull/14078

From ayang at openjdk.org Mon May 22 11:53:53 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Mon, 22 May 2023 11:53:53 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>
References: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com>
Message-ID:

On Fri, 19 May 2023 07:30:48 GMT, Aleksey Shipilev wrote:

>> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>>
>> Example on M1:
>>
>>
>> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>>
>> # Before
>> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
>> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
>> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
>> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
>> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
>> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
>> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
>> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
>> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
>> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
>> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>>
>> # After
>> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op ; 5% better
>> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op ; 7% better
>> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op ; 15% better
>> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op ; 11% better
>> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op ; 15% better
>> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op ; 22% better
>> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op ; 17% better
>> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op ; 8% better
>> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op ; 18% better
>> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op ; 6% better
>> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op ; 13% better
>>
>>
>> Additional testing:
>>  - [x] Ad-hoc micro-benchmarks
>>  - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>>  - [x] Linux x86_64 fastdebug `jdk/jfr`
>>  - [x] Linux x86_64 fastdebug `t...
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
>  - Touch up comment
>  - Merge branch 'master' into JDK-8308231-memalloc-check-faster
>  - Hide more stuff
>  - Touchups
>  - Branch
>  - Fix

low-mem-detector and jfr are interested only in real-allocation (increasing mem usage), while dtrace will capture inside-tlab allocations also.

I think your change alters the behavior of dtrace, from capturing all-allocations to only jvmti-filtered ones. It can be surprising that the result of dtrace depends on some jvmti flags.

I reorganized the notify-methods and broke them into three groups:

1. capturing all-allocations -- dtrace and jvmti-vm-obj-alloc
2.
capturing real-allocation -- low-mem-detector and jfr
3. capturing real or pseudo alloc -- jvmti-sampler

The most important part looks like:

    ~Allocation() {
      if (!check_out_of_memory()) {
        verify_after();

        notify_allocation_dtrace_sampler(_thread);
        // support for JVMTI VMObjectAlloc event (no-op if not enabled)
        JvmtiExport::vm_object_alloc_event_collector(obj());

        const bool is_real_allocation = _allocated_tlab_size != 0 || _allocated_outside_tlab;

        if (is_real_allocation) {
          notify_allocation_low_memory_detector();
          notify_allocation_jfr_sampler();
        }

        if (is_real_allocation || _tlab_end_reset_for_sample) {
          notify_allocation_jvmti_sampler();
        }
      }
    }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14019#issuecomment-1557079621

From eosterlund at openjdk.org Mon May 22 13:06:52 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Mon, 22 May 2023 13:06:52 GMT
Subject: RFR: 8308500: ZStatSubPhase::register_start should not call register_gc_phase_start if ZAbort::should_abort()
In-Reply-To:
References:
Message-ID:

On Mon, 22 May 2023 07:37:19 GMT, Axel Boldt-Christmas wrote:

> `ZStatSubPhase::register_start` should not call `register_gc_phase_start` if `ZAbort::should_abort()` is true. This will cause an unbalanced push and pop behaviour of the phase stack as `ZStatSubPhase::register_end` stops popping (and sending events) after the aborting has started. This will create an issue if more subsequent sub-phases are added in-between two abort points as the phase stack may overflow.
>
> Simply avoid pushing new phases when aborting has started solves this issue.

Looks good.

-------------

Marked as reviewed by eosterlund (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14075#pullrequestreview-1436588783 From rkennke at openjdk.org Mon May 22 14:37:18 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 22 May 2023 14:37:18 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v47] In-Reply-To: References: Message-ID: > Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. > > I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. > > It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. > > We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. > > With this, forwarding information would be encoded like this: > - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. > - Bit 2: Used for 'fallback'-forwarding (see below) > - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) > - Bits 4..31 The number of heap words from the target base address > > This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. > > All the table accesses can be done unsynchronized because: > - Serial GC is single-threaded anyway > - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. > - G1 serial compaction is single-threaded > > The change introduces a new ... Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Specialize full-GC loops to get UseAltGCForwarding flag check out of hot paths ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13582/files - new: https://git.openjdk.org/jdk/pull/13582/files/6bbb8e01..6bbd2952 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=46 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13582&range=45-46 Stats: 567 lines in 28 files changed: 354 ins; 101 del; 112 mod Patch: https://git.openjdk.org/jdk/pull/13582.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13582/head:pull/13582 PR: https://git.openjdk.org/jdk/pull/13582 From rkennke at openjdk.org Mon May 22 14:41:09 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 22 May 2023 14:41:09 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v33] In-Reply-To: References: Message-ID: On Mon, 8 May 2023 10:12:41 GMT, Thomas Schatzl wrote: >> Roman Kennke has updated the pull request 
incrementally with two additional commits since the last revision:
>>
>>  - Fix typos
>>  - Make FallbackTable an inner class of SlidingForwarding
>
> The https://github.com/openjdk/jdk/compare/master...tschatzl:jdk:alt-fullgc-forwarding?expand=1 branch now contains the promised cleanup.

@tschatzl @fisk @coleenp @shipilev I've pushed a change that specializes all affected full-GC loops to get the flag-check out of the hot loops. This should get performance in the -UseAltGCForwarding paths back to normal, and show minimal effect (if any) in the +UseAltGCForwarding path. It is a little uglier though. If you prefer the clean and slightly slower version, let me know and I would back it out. Consider that once (after a few release cycles) the Lilliput stuff is stable and default, all that specialized stuff will disappear. I guess the alternative would be to not put it under a flag to begin with.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1557338343

From shade at openjdk.org Mon May 22 17:57:55 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 22 May 2023 17:57:55 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2]
In-Reply-To: <6OmfBmoX4m0QF0qSXPqhJePZeP4Y0_Pwpv9vQ-0no6k=.3b8afc21-db07-4523-978c-2479076c1be8@github.com>
References: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com> <6OmfBmoX4m0QF0qSXPqhJePZeP4Y0_Pwpv9vQ-0no6k=.3b8afc21-db07-4523-978c-2479076c1be8@github.com>
Message-ID:

On Mon, 22 May 2023 10:44:52 GMT, Stefan Karlsson wrote:

>> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains six additional commits since the last revision:
>>
>>  - Touch up comment
>>  - Merge branch 'master' into JDK-8308231-memalloc-check-faster
>>  - Hide more stuff
>>  - Touchups
>>  - Branch
>>  - Fix
>
> src/hotspot/share/gc/shared/memAllocator.cpp line 106:
>
>> 104:     if (should_notify_allocation()) {
>> 105:       notify_allocation(_thread);
>> 106:     }
>
> `notify_allocation` used to call `notify_allocation_low_memory_detector` and `notify_allocation_dtrace_sampler` without performing any "should notify" filtering. Is this change in behavior intentional?

Not really, but I wonder if we actually need to do these for inside-TLAB allocs. I think I'll go the way Albert leans, and see what happens here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1200845220

From shade at openjdk.org Mon May 22 19:19:01 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Mon, 22 May 2023 19:19:01 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v3]
In-Reply-To:
References:
Message-ID:

> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>
> Example on M1:
>
>
> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>
> # Before
> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>
> # After
> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op ; 5% better
> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op ; 7% better
> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op ; 15% better
> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op ; 11% better
> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op ; 15% better
> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op ; 22% better
> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op ; 17% better
> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op ; 8% better
> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op ; 18% better
> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op ; 6% better
> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op ; 13% better
>
>
> Additional testing:
>  - [x] Ad-hoc micro-benchmarks
>  - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>  - [x] Linux x86_64 fastdebug `jdk/jfr`
>  - [x] Linux x86_64 fastdebug `tier1 tier2 tier3`
>  - [x] Linux AArch64 ...

Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains eight additional commits since the last revision: - Reshuffle and simplify - Merge branch 'master' into JDK-8308231-memalloc-check-faster - Touch up comment - Merge branch 'master' into JDK-8308231-memalloc-check-faster - Hide more stuff - Touchups - Branch - Fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14019/files - new: https://git.openjdk.org/jdk/pull/14019/files/d189a358..e1b35e55 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14019&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14019&range=01-02 Stats: 9627 lines in 400 files changed: 3944 ins; 4522 del; 1161 mod Patch: https://git.openjdk.org/jdk/pull/14019.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14019/head:pull/14019 PR: https://git.openjdk.org/jdk/pull/14019 From shade at openjdk.org Mon May 22 19:19:01 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 22 May 2023 19:19:01 GMT Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v2] In-Reply-To: References: <4rRRNc7dCnGAxxXoA6u8DQ6QyMEI42ny4JehKHqX8kU=.f18754fc-224a-4c80-bb0d-c7074f7f899a@github.com> Message-ID: <_SHO7pHSdnon55MY3p2Diw1eEyFpjJWW9XEUImfsSLM=.57af7c16-9a8b-4824-b82a-36dd413efe27@github.com> On Mon, 22 May 2023 11:51:20 GMT, Albert Mingkun Yang wrote: > I think your change alters the behavior of dtrace, from capturing all-allocations to only jvmti-filtered ones. It can be surprising that the result of dtrace depends on some jvmti flags. Right. I just committed another variant, inspired by your suggestion. It passes testing for me, but I would be able to do full performance tests only tomorrow. Tell me if you want any changes. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14019#issuecomment-1557784119 From rkennke at openjdk.org Mon May 22 19:58:13 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 22 May 2023 19:58:13 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v47] In-Reply-To: References: Message-ID: On Mon, 22 May 2023 14:37:18 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The c... > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Specialize full-GC loops to get UseAltGCForwarding flag check out of hot paths Using the same Retain.java program that Aleksey posted earlier, I now get the following numbers: Baseline: 286.9ms -AltGCForwarding: 286.3ms (-0.2%) +AltGCForwarding: 309.1ms (+7.7%) ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1557869772 From ayang at openjdk.org Mon May 22 21:08:54 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 22 May 2023 21:08:54 GMT Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v3] In-Reply-To: References: Message-ID: On Mon, 22 May 2023 19:19:01 GMT, Aleksey Shipilev wrote: >> In multi-array allocations benchmarks, there is a hot path through the native VM allocation code, which calls lots of notification methods, even when we would return immediately, because the allocation was satisfied from existing TLAB. 
Not calling these helper methods from `MemAllocator::Allocation` constructor/destructor looks like an incremental win for the benchmarks.
>>
>> Example on M1:
>>
>>
>> Benchmark             (size)  Mode  Cnt       Score      Error  Units
>>
>> # Before
>> MultiArrayAlloc.full       1  avgt   15      74,053 ±    0,869  ns/op
>> MultiArrayAlloc.full       2  avgt   15      87,800 ±    0,931  ns/op
>> MultiArrayAlloc.full       4  avgt   15     124,814 ±    0,615  ns/op
>> MultiArrayAlloc.full       8  avgt   15     188,562 ±    0,785  ns/op
>> MultiArrayAlloc.full      16  avgt   15     313,007 ±    1,108  ns/op
>> MultiArrayAlloc.full      32  avgt   15     640,276 ±    4,560  ns/op
>> MultiArrayAlloc.full      64  avgt   15    1395,220 ±    5,860  ns/op
>> MultiArrayAlloc.full     128  avgt   15    3417,848 ±   11,345  ns/op
>> MultiArrayAlloc.full     256  avgt   15    9955,360 ±  102,057  ns/op
>> MultiArrayAlloc.full     512  avgt   15   27738,002 ±  244,940  ns/op
>> MultiArrayAlloc.full    1024  avgt   15  147507,008 ± 1434,085  ns/op
>>
>> # After
>> MultiArrayAlloc.full       1  avgt   15      70,434 ±    0,373  ns/op ; 5% better
>> MultiArrayAlloc.full       2  avgt   15      82,394 ±    0,137  ns/op ; 7% better
>> MultiArrayAlloc.full       4  avgt   15     108,542 ±    0,129  ns/op ; 15% better
>> MultiArrayAlloc.full       8  avgt   15     170,697 ±    4,480  ns/op ; 11% better
>> MultiArrayAlloc.full      16  avgt   15     272,902 ±    0,877  ns/op ; 15% better
>> MultiArrayAlloc.full      32  avgt   15     524,486 ±    1,447  ns/op ; 22% better
>> MultiArrayAlloc.full      64  avgt   15    1088,932 ±    2,739  ns/op ; 17% better
>> MultiArrayAlloc.full     128  avgt   15    3151,144 ±   14,621  ns/op ; 8% better
>> MultiArrayAlloc.full     256  avgt   15    8455,293 ±   12,656  ns/op ; 18% better
>> MultiArrayAlloc.full     512  avgt   15   26060,055 ±  116,524  ns/op ; 6% better
>> MultiArrayAlloc.full    1024  avgt   15  130824,480 ±  831,703  ns/op ; 13% better
>>
>>
>> Additional testing:
>>  - [x] Ad-hoc micro-benchmarks
>>  - [x] Linux x86_64 fastdebug `serviceability/jvmti`
>>  - [x] Linux x86_64 fastdebug `jdk/jfr`
>>  - [x] Linux x86_64 fastdebug `t...
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Reshuffle and simplify > - Merge branch 'master' into JDK-8308231-memalloc-check-faster > - Touch up comment > - Merge branch 'master' into JDK-8308231-memalloc-check-faster > - Hide more stuff > - Touchups > - Branch > - Fix src/hotspot/share/gc/shared/memAllocator.cpp line 98: > 96: > 97: if ((is_real_allocation || _tlab_end_reset_for_sample) && > 98: JvmtiExport::should_post_sampled_object_alloc()) { The same check is done inside the caller already. On this note, I think all checks using info outside `class Allocation` should not be present on this level. (Ofc, this is very subjective.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1201102058 From tschatzl at openjdk.org Tue May 23 07:25:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 May 2023 07:25:50 GMT Subject: RFR: 8308506: Reduce testing time by removing combinations tested In-Reply-To: References: Message-ID: <30tPLt7xh9EqlKWENLq1BduniR2IbUk1v5vY0poezWI=.4bb95a67-e950-4618-b166-63997d1d295b@github.com> On Mon, 22 May 2023 11:44:54 GMT, Leo Korinth wrote: > Juggle3.java and Juggle3Quick.java tests many combinations of array juggling on different kinds of arrays. It tests: > > - Primitive arrays (three for each kind of primitive) > - Object arrays > - Random arrays > - A version with hashing > > However when testing using primitive arrays, there is --- from a gc perspective --- no extra logic tested by going through all combinations. By removing these, we can save time and put our resources on testing other stuff. > > I will keep three versions of primitive array tests (low, medium and high) with different kinds of arrays. > > Also, the random array test will still test all primitive versions of arrays. Lgtm. 
Note that for `Juggle3`, the change keeps

boolean-medium
float-high
double-low
double-high

i.e. three levels run _four_ times, unlike what the description suggests.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14078#pullrequestreview-1438838257

From shade at openjdk.org Tue May 23 08:46:01 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 23 May 2023 08:46:01 GMT
Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v3]
In-Reply-To:
References:
Message-ID: <1bcIu6lVHZiinYvnIU39_UilHWEkwRVEoPD1bIFri8w=.d3ea1458-ac12-41b3-a043-b887bc7c494b@github.com>

On Mon, 22 May 2023 21:06:01 GMT, Albert Mingkun Yang wrote:

>> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>>
>>  - Reshuffle and simplify
>>  - Merge branch 'master' into JDK-8308231-memalloc-check-faster
>>  - Touch up comment
>>  - Merge branch 'master' into JDK-8308231-memalloc-check-faster
>>  - Hide more stuff
>>  - Touchups
>>  - Branch
>>  - Fix
>
> src/hotspot/share/gc/shared/memAllocator.cpp line 98:
>
>> 96:
>> 97:   if ((is_real_allocation || _tlab_end_reset_for_sample) &&
>> 98:       JvmtiExport::should_post_sampled_object_alloc()) {
>
> The same check is done inside the caller already.
>
> On this note, I think all checks using info outside `class Allocation` should not be present on this level. (Ofc, this is very subjective.)

This is fast-path code, and it is extremely sensitive to the overheads of the actual calls. So I made a point to check the pre-conditions before calling them. It is much less sensitive to the overheads for the methods that are already protected by `is_real_allocation`, but for others the conditions are pulled to the fast path.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1201818427 From tschatzl at openjdk.org Tue May 23 08:51:18 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 May 2023 08:51:18 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME In-Reply-To: References: Message-ID: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> On Mon, 22 May 2023 11:44:24 GMT, Ivan Walulya wrote: > Please review this change which fixes the thread starvation problem during allocation for G1. > > The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. > > Starvation with an active GCLocker happens as below: > > 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. > 2. GCLocker induced GC executes and frees some memory. > 3. Thread A does not get any of that memory, but other threads also waiting for memory. > 4. Goto 1 until the gclocker retry count has been reached. > > In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. 
This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. > > Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. > > Testing: Tier 1-7 Initial pass/comments. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 906: > 904: bool G1CollectedHeap::upgrade_to_full_collection() { > 905: GCCauseSetter compaction(this, GCCause::_g1_compaction_pause); > 906: // Reset any allocated but yet claimed allocation requests. Suggestion: // Reset any allocated but yet unclaimed allocation requests. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 961: > 959: assert_at_safepoint_on_vm_thread(); > 960: > 961: bool success = handle_allocation_requests(false /* expect_null_mutator_alloc_region*/); Maybe it would be good to rename `success` to something more specific to avoid confusion with `gc_succeeded`. Something like `alloc_succeeded` (compared to succeeding the gc and succeeding the entire method). src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 964: > 962: > 963: if (success) { > 964: return success; Suggestion: return success; src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 980: > 978: > 979: if (success) { > 980: return success; Suggestion: return success; src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 985: > 983: // Attempt to satisfy allocation requests after full-gc also failed. We reset the allocation requests > 984: // then execute a maximal compaction full-gc before retrying the allocations > 985: Superfluous newline (compared to other occurrences of similar comments) src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1006: > 1004: for (StalledAllocReq* alloc_req; iter.next(&alloc_req);) { > 1005: _satisfied_allocations.insert_last(alloc_req); > 1006: alloc_req->set_state(StalledAllocReq::AllocationState::Failed); Maybe that's just me but I prefer code changing the request first and then inserting it. I.e. 
process then pass on to something else instead of the other way around. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1042: > 1040: > 1041: const uint active_numa_nodes = G1NUMA::numa()->num_active_nodes(); > 1042: bool *expect_null_alloc_regions = (bool *)alloca(active_numa_nodes * sizeof(bool)); Suggestion: bool *expect_null_alloc_regions = (bool*)alloca(active_numa_nodes * sizeof(bool)); src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1047: > 1045: } > 1046: > 1047: while(true) { Suggestion: while (true) { src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1058: > 1056: alloc_req->node_index(), > 1057: expect_null_alloc_regions[alloc_req->node_index()] > 1058: ); Suggestion: expect_null_alloc_regions[alloc_req->node_index()]); src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1067: > 1065: expect_null_alloc_regions[alloc_req->node_index()] = false; > 1066: > 1067: // Allocation succeeded, update the state and result of the allocation request Suggestion: // Allocation succeeded, update the state and result of the allocation request. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1071: > 1069: > 1070: if (is_humongous(alloc_req->size())) { > 1071: // Calculate payload size and initialize the humongous object with a fillerArray Suggestion: // Calculate payload size and initialize the humongous object with a fillerArray. It is nowhere explained why this is needed. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1089: > 1087: } > 1088: > 1089: // Move the allocation request from stalled to satisfied list Suggestion: // Move the allocation request from stalled to satisfied list. 
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1112: > 1110: } > 1111: > 1112: // Attempting to expand the heap sufficiently Suggestion: // Attempts to expand the heap sufficiently * the formatting of this comment paragraph is very "jagged" * this looks like a comment that should go into the .hpp file src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 172: > 170: friend class G1CheckRegionAttrTableClosure; > 171: > 172: class StalledAllocReq : public DoublyLinkedListNode { Please add a short comment about it; maybe it would even be useful to describe the lifecycle (state transitions) of the allocation request here. src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 174: > 172: class StalledAllocReq : public DoublyLinkedListNode { > 173: public: > 174: enum class AllocationState { Suggestion: enum class AllocationState { src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 179: > 177: Pending, > 178: }; > 179: StalledAllocReq(size_t size, uint numa_node) : Suggestion: StalledAllocReq(size_t size, uint numa_node) : src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 190: > 188: size_t size() { > 189: return _size; > 190: } These trivial getters should probably be written in a single line. src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1110: > 1108: if (hr->is_humongous()) { > 1109: oop obj = cast_to_oop(hr->humongous_start_region()->bottom()); > 1110: if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed Suggestion: if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed. src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1163: > 1161: if (hr->is_starts_humongous()) { > 1162: oop obj = cast_to_oop(hr->bottom()); > 1163: if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed Same as above. 
src/hotspot/share/gc/g1/g1VMOperations.cpp line 132: > 130: // Any allocation requests that were handled during a previous GC safepoint but have not been observed > 131: // by the requesting mutator thread should be reset to pending. This makes it easier for the current GC to > 132: // treat the unclaimed memory as garbage. Suggestion: // by the requesting mutator thread should be reset to pending. This makes it easier for the current GC to // treat the unclaimed memory as garbage. It also simplifies the initial allocation in the safepoint next. This might cause additional gcs. What would happen if `handle_allocation_requests` just skipped already satisfied allocations (as successful) and only if that fails reset all requests (i.e. around line 148)? src/hotspot/share/gc/g1/g1VMOperations.cpp line 139: > 137: > 138: if (has_pending_allocations) { > 139: bool success = g1h->handle_allocation_requests(false /* expect_null_mutator_alloc_region*/); Same comment about the `success` name as elsewhere. src/hotspot/share/gc/g1/g1VMOperations.cpp line 139: > 137: > 138: if (has_pending_allocations) { > 139: bool success = g1h->handle_allocation_requests(false /* expect_null_mutator_alloc_region*/); Suggestion: bool gc_succeeded = false; // If any allocation has been requested, try to do that first. bool has_pending_allocations = !g1h->_stalled_allocations.is_empty(); if (has_pending_allocations) { bool success = g1h->handle_allocation_requests(false /* expect_null_mutator_alloc_region*/); src/hotspot/share/gc/g1/g1VMOperations.cpp line 143: > 141: if (success) { > 142: return; > 143: } Suggestion: if (success) { return; } src/hotspot/share/gc/shared/collectedHeap.inline.hpp line 50: > 48: > 49: size_t CollectedHeap::filler_array_hdr_size() { > 50: return align_object_offset(arrayOopDesc::header_size(T_INT)); // align to Long Suggestion: return align_object_offset(arrayOopDesc::header_size(T_INT)); // Align to INT. 
(pre-existing) src/hotspot/share/gc/shared/collectedHeap.inline.hpp line 54: > 52: > 53: size_t CollectedHeap::filler_array_min_size() { > 54: return align_object_size(filler_array_hdr_size()); // align to MinObjAlignment Suggestion: return align_object_size(filler_array_hdr_size()); // Align to MinObjAlignment. (pre-existing) src/hotspot/share/utilities/doublyLinkedList.hpp line 2: > 1: /* > 2: * Copyright (c) 2015, 2020, Oracle and/or its affiliates. All rights reserved. Suggestion: * Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved. (This is a new file, isn't it? An alternative would be to use `2015, 2023,` here.) test/hotspot/jtreg/gc/TestAllocHumongousFragment.java line 175: > 173: * @library /test/lib > 174: * > 175: * @run main/othervm -Xlog:gc -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -Xmx1g -Xms1g Is this change intentional? ------------- Changes requested by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14077#pullrequestreview-1438928751 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201740872 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201745207 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201741950 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201742801 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201746341 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201751478 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201757214 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201756543 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201759971 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201760757 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201761316 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201764617 PR 
Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201766512 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201769172 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201769417 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201769892 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201770800 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201776160 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201791853 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201813308 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201803997 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201807696 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201803304 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201821648 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201822329 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201822647 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201827147 From tschatzl at openjdk.org Tue May 23 08:51:18 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 May 2023 08:51:18 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME In-Reply-To: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> References: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> Message-ID: <_LqFx5-ni_pIpcdkRY41C0J-hmoOXp7bZCIXGzVoxAM=.4094388b-6d9f-4833-ac52-03d23d767147@github.com> On Tue, 23 May 2023 08:26:50 GMT, Thomas Schatzl wrote: >> Please review this change which fixes the thread starvation problem during allocation for G1. 
>> >> The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. >> >> Starvation with an active GCLocker happens as below: >> >> 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. >> 2. GCLocker induced GC executes and frees some memory. >> 3. Thread A does not get any of that memory, but other threads also waiting for memory. >> 4. Goto 1 until the gclocker retry count has been reached. >> >> In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. >> >> Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. >> >> Testing: Tier 1-7 > > src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1110: > >> 1108: if (hr->is_humongous()) { >> 1109: oop obj = cast_to_oop(hr->humongous_start_region()->bottom()); >> 1110: if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed > > Suggestion: > > if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed. 
Just an idea: I think it would be nicer and more understandable if the code explicitly checked if the object is part of the stalled allocation requests instead of relying on it being a filler object. It is a bit ugly that we use filler objects to identify objects in construction here, given that "nobody should have a reference to it" (I see the need). It is somewhat okay since it's GC. Another option would be having dedicated "object-in-construction" objects, which is more work. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201791489 From lkorinth at openjdk.org Tue May 23 09:43:02 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Tue, 23 May 2023 09:43:02 GMT Subject: RFR: 8308506: Reduce testing time by removing combinations tested In-Reply-To: <30tPLt7xh9EqlKWENLq1BduniR2IbUk1v5vY0poezWI=.4bb95a67-e950-4618-b166-63997d1d295b@github.com> References: <30tPLt7xh9EqlKWENLq1BduniR2IbUk1v5vY0poezWI=.4bb95a67-e950-4618-b166-63997d1d295b@github.com> Message-ID: On Tue, 23 May 2023 07:22:48 GMT, Thomas Schatzl wrote: > Lgtm. > > Note that for `Juggle3`, the changes keeps > > boolean-medium float-high double-low double-high > > i.e. three levels run _four_ times unlike the description suggests. The Juggle3 tests are the union of Juggle3.java and Juggle3Quick.java; it is an unfortunate arrangement because jtreg does not allow creating test groups from test IDs. If you look at both files you will see that I am keeping: byte-low, boolean-medium, float-high and all three *hashed* double.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14078#issuecomment-1558889165 From iwalulya at openjdk.org Tue May 23 09:43:04 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 23 May 2023 09:43:04 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME In-Reply-To: <_LqFx5-ni_pIpcdkRY41C0J-hmoOXp7bZCIXGzVoxAM=.4094388b-6d9f-4833-ac52-03d23d767147@github.com> References: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> <_LqFx5-ni_pIpcdkRY41C0J-hmoOXp7bZCIXGzVoxAM=.4094388b-6d9f-4833-ac52-03d23d767147@github.com> Message-ID: On Tue, 23 May 2023 08:30:34 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1110: >> >>> 1108: if (hr->is_humongous()) { >>> 1109: oop obj = cast_to_oop(hr->humongous_start_region()->bottom()); >>> 1110: if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed >> >> Suggestion: >> >> if (G1CollectedHeap::is_obj_filler(obj)) { // Object allocated, but not well-formed. > > Just an idea: I think it would be nicer and more understandable if the code explicitly checked if the object is part of the stalled allocation requests instead of relying on it being a filler object. > It is a bit ugly that we use filler objects to identify objects in construction here, given that "nobody should have a reference to it" (I see the need). It is somewhat okay since it's GC. > Another option would be having dedicated "object-in-construction" objects, which is more work. In the worst case, searching the satisfied allocation requests list might have performance implications. However, that is speculation for now, I haven't tried it. 
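For reference, the explicit check discussed above could look roughly like the following sketch. It assumes a simplified, hypothetical `AllocRequest` record and a plain `std::vector` rather than the `StalledAllocReq` and `DoublyLinkedList` types from the patch. The lookup is a linear scan, so its cost grows with the number of satisfied but unclaimed requests, which is the worst-case concern mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified request record (not the patch's StalledAllocReq).
struct AllocRequest {
  std::uintptr_t result;   // Start address of the allocation, 0 if unsatisfied.
  std::size_t size_bytes;  // Requested size.
};

// Returns true if addr is the start of an allocation that was handed out
// during a safepoint but not yet claimed by its requester.
// Linear in the number of satisfied requests.
inline bool is_unclaimed_allocation(const std::vector<AllocRequest>& satisfied,
                                    std::uintptr_t addr) {
  for (const AllocRequest& req : satisfied) {
    if (req.result != 0 && req.result == addr) {
      return true;
    }
  }
  return false;
}
```

If the list stays short (a handful of stalled threads), the scan is likely negligible; that is an assumption, not something measured here.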
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201891679 From iwalulya at openjdk.org Tue May 23 09:44:58 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 23 May 2023 09:44:58 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME In-Reply-To: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> References: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> Message-ID: On Tue, 23 May 2023 08:40:21 GMT, Thomas Schatzl wrote: >> Please review this change which fixes the thread starvation problem during allocation for G1. >> >> The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. >> >> Starvation with an active GCLocker happens as below: >> >> 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. >> 2. GCLocker induced GC executes and frees some memory. >> 3. Thread A does not get any of that memory, but other threads also waiting for memory. >> 4. Goto 1 until the gclocker retry count has been reached. >> >> In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. 
This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. >> >> Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. >> >> Testing: Tier 1-7 > > src/hotspot/share/gc/g1/g1VMOperations.cpp line 132: > >> 130: // Any allocation requests that were handled during a previous GC safepoint but have not been observed >> 131: // by the requesting mutator thread should be reset to pending. This makes it easier for the current GC to >> 132: // treat the unclaimed memory as garbage. > > Suggestion: > > // by the requesting mutator thread should be reset to pending. This makes it easier for the current GC to > // treat the unclaimed memory as garbage. It also simplifies the initial allocation in the safepoint next. > > This might cause additional gcs. What would happen if `handle_allocation_requests` just skipped already satisfied allocations (as successful) and only if that fails reset all requests (i.e. around line 148)? This would create a dependence between handle_allocation_requests, and any collections that happen before. So we would have to deal with how these `fillerObjects` are treated by the collections, some could be invalidated. > src/hotspot/share/utilities/doublyLinkedList.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 2015, 2020, Oracle and/or its affiliates. All rights reserved. > > Suggestion: > > * Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved. > > (This is a new file, isn't it? An alternative would be to use `2015, 2023,` here.) I wasn't sure about how to approach this; it's a new file, but the code is taken from ZList.
> test/hotspot/jtreg/gc/TestAllocHumongousFragment.java line 175: > >> 173: * @library /test/lib >> 174: * >> 175: * @run main/othervm -Xlog:gc -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -Xmx1g -Xms1g > > Is this change intentional? This should be moved to a different PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201886847 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201882088 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201880343 From ayang at openjdk.org Tue May 23 15:10:16 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 May 2023 15:10:16 GMT Subject: RFR: 8308231: Faster MemAllocator::Allocation checks for verify/notification [v3] In-Reply-To: <1bcIu6lVHZiinYvnIU39_UilHWEkwRVEoPD1bIFri8w=.d3ea1458-ac12-41b3-a043-b887bc7c494b@github.com> References: <1bcIu6lVHZiinYvnIU39_UilHWEkwRVEoPD1bIFri8w=.d3ea1458-ac12-41b3-a043-b887bc7c494b@github.com> Message-ID: <6sQKFidOSICHmc1Xd6V5sQ_wFiibY7IGBLSuY4Cp348=.aae0e678-4b61-4655-aeeb-f34f2fe38676@github.com> On Tue, 23 May 2023 08:42:43 GMT, Aleksey Shipilev wrote: > the overheads of the actual calls I was expecting the same perf, since the compiler can see everything in this scope. After hiding checks from the this level, I see the same number of `call` instructions in `MemAllocator::allocate`, where everything is inlined into. Ofc, still needs full perf-eval. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14019#discussion_r1201929901 From tschatzl at openjdk.org Tue May 23 15:11:28 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 May 2023 15:11:28 GMT Subject: RFR: 8171221: Remove -XX:+CheckMemoryInitialization Message-ID: Hi all, please review this change that removes the broken (verified) -XX:+CheckMemoryInitialization debug flag. 
Interestingly there are some test cases that explicitly check this functionality without problems, but they are simply not thorough enough. Apparently this has been broken since at least 2016, and given that nobody cared to fix it since then I think it's not worth trying to salvage it here either. Testing: local compilation, gha Thanks, Thomas ------------- Commit messages: - cleanup - remove tests too - Remove flag Changes: https://git.openjdk.org/jdk/pull/14101/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14101&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8171221 Stats: 125 lines in 6 files changed: 0 ins; 125 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14101.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14101/head:pull/14101 PR: https://git.openjdk.org/jdk/pull/14101 From iwalulya at openjdk.org Tue May 23 15:12:22 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 23 May 2023 15:12:22 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v2] In-Reply-To: References: Message-ID: <-Mz564sHo22B0cUdp7KPo7Q4Xv41cDVAT3Evsx5zonM=.7f8c5d28-f900-4b76-9bb2-9ff538dd19be@github.com> > Please review this change which fixes the thread starvation problem during allocation for G1. > > The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. > > Starvation with an active GCLocker happens as below: > > 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. > 2. GCLocker induced GC executes and frees some memory. > 3. Thread A does not get any of that memory, but other threads also waiting for memory. > 4. Goto 1 until the gclocker retry count has been reached. 
> > In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. > > Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. > > Testing: Tier 1-7 Ivan Walulya has updated the pull request incrementally with two additional commits since the last revision: - Make explicit checks for unclaimed allocatiions - Thomas Review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14077/files - new: https://git.openjdk.org/jdk/pull/14077/files/7da8edfb..a85c59c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14077&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14077&range=00-01 Stats: 111 lines in 7 files changed: 45 ins; 26 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/14077.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14077/head:pull/14077 PR: https://git.openjdk.org/jdk/pull/14077 From iwalulya at openjdk.org Tue May 23 15:12:29 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 23 May 2023 15:12:29 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v2] In-Reply-To: References: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> 
<_LqFx5-ni_pIpcdkRY41C0J-hmoOXp7bZCIXGzVoxAM=.4094388b-6d9f-4833-ac52-03d23d767147@github.com> Message-ID: On Tue, 23 May 2023 09:21:04 GMT, Ivan Walulya wrote: >> Just an idea: I think it would be nicer and more understandable if the code explicitly checked if the object is part of the stalled allocation requests instead of relying on it being a filler object. >> It is a bit ugly that we use filler objects to identify objects in construction here, given that "nobody should have a reference to it" (I see the need). It is somewhat okay since it's GC. >> Another option would be having dedicated "object-in-construction" objects, which is more work. > > In the worst case, searching the satisfied allocation requests list might have performance implications. However, that is speculation for now, I haven't tried it. Added the explicit check for unclaimed allocations ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1202480223 From tschatzl at openjdk.org Tue May 23 15:12:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 23 May 2023 15:12:43 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v2] In-Reply-To: References: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> Message-ID: On Tue, 23 May 2023 09:18:17 GMT, Ivan Walulya wrote: >> src/hotspot/share/gc/g1/g1VMOperations.cpp line 132: >> >>> 130: // Any allocation requests that were handled during a previous GC safepoint but have not been observed >>> 131: // by the requesting mutator thread should be reset to pending. This makes it easier for the current GC to >>> 132: // treat the unclaimed memory as garbage. >> >> Suggestion: >> >> // by the requesting mutator thread should be reset to pending. This makes it easier for the current GC to >> // treat the unclaimed memory as garbage. It also simplifies the initial allocation in the safepoint next. 
>> >> This might cause additional gcs. What would happen if `handle_allocation_requests` just skipped already satisfied allocations (as successful) and only if that fails reset all requests (i.e. around line 148)? > > This would create a dependence between handle_allocation_requests, and any collections that happen before. So we would have to deal with how these `fillerObjects` are treated by the collections, some could be invalidated.

In the general case this is true, but here, before GC occurred, the existing allocations can still be considered valid until we know that we need a gc. Doing so potentially avoids garbage collections because then we are not going to throw away all of them and try to reallocate.

I.e. the suggestion is like:

    // keep existing allocations, they are still valid here.

    if (has_allocations) {
      if (try_satisfy_allocations()) { // satisfy non-satisfied allocations.
        return; // no gc occurred, both the previously satisfied allocations and the new allocations are valid.
      }
    }
    // We are about to do a GC. Reset all allocation requests since we are likely going to free regions containing them.
    reset_allocation_requests();

    // do gc

instead of the current code

    reset_allocation_requests(); // throw away all requests, even already satisfied ones

    if (has_allocations) {
      if (try_satisfy_allocations()) { // satisfy all allocations again
        return; // done, all satisfied (no gc occurred)
      }
      reset_allocation_requests(); // throw away all requests, prepare for GC
    }

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201980307 From iwalulya at openjdk.org Tue May 23 15:12:43 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 23 May 2023 15:12:43 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v2] In-Reply-To: References: <_55IlsA6s_kR7MNh1Q07jXluCm-sdgBBcwBO46tKEpA=.df656d33-724a-44f6-a657-98def06e5763@github.com> Message-ID: On Tue, 23 May 2023 10:08:53 GMT, Thomas Schatzl wrote: >> This would create a dependence between handle_allocation_requests, and any collections that happen before. So we would have to deal with how these `fillerObjects` are treated by the collections, some could be invalidated. > > In the general case this is true, but here, before GC occurred, the existing allocations can still be considered valid until we know that we need a gc. > Doing so potentially avoids garbage collections because then we are not going to throw away all of them and try to reallocate. > > I.e. the suggestion is like: > > // keep existing allocations, they are still valid here. > > if (has_allocations) { > if (try_satisfy_allocations()) { // satisfy non-satisfied allocations. > return; // no gc occurred, both the previously satisfied allocations and the new allocations are valid. > } > } > // We are about to do a GC. Reset all allocation requests since we are likely going to free regions containing them.
> reset_allocation_requests(); > > // do gc > > instead of the current code > > reset_allocation_requests(); // throw away all requests, even already satisfied ones > > if (has_allocations) { > if (try_satisfy_allocations()) { // satisfy all allocations again > return; // done, all satisfied (no gc occurred) > } > reset_allocation_requests(); // throw away all requests, prepare for GC > } Agreed! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1201988853 From ayang at openjdk.org Tue May 23 16:28:12 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 23 May 2023 16:28:12 GMT Subject: RFR: 8171221: Remove -XX:+CheckMemoryInitialization In-Reply-To: References: Message-ID: On Tue, 23 May 2023 13:22:21 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that removes the broken (verified) -XX:+CheckMemoryInitialization debug flag. Interestingly there are some test cases that explicitly check this functionality without problems, but they are simply not thorough enough. > > Apparently this has been broken since at least 2016, and given that nobody cared to fix it since then I think it's not worth trying to salvage it here either. > > Testing: local compilation, gha > > Thanks, > Thomas Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14101#pullrequestreview-1440147966 From shade at openjdk.org Tue May 23 16:43:08 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 May 2023 16:43:08 GMT Subject: RFR: 8171221: Remove -XX:+CheckMemoryInitialization In-Reply-To: References: Message-ID: On Tue, 23 May 2023 13:22:21 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that removes the broken (verified) -XX:+CheckMemoryInitialization debug flag. Interestingly there are some test cases that explicitly check this functionality without problems, but they are simply not thorough enough. 
>
> Apparently this has been broken since at least 2016, and given that nobody cared to fix it since then I think it's not worth trying to salvage it here either.
>
> Testing: local compilation, gha
>
> Thanks,
> Thomas

I like this, less things to care about on native alloc paths.

-------------

Marked as reviewed by shade (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14101#pullrequestreview-1440174167

From dcubed at openjdk.org Tue May 23 20:48:16 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Tue, 23 May 2023 20:48:16 GMT
Subject: Integrated: 8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
Message-ID:

Trivial fixes to ProblemList some tests:
[JDK-8308716](https://bugs.openjdk.org/browse/JDK-8308716) ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
[JDK-8308718](https://bugs.openjdk.org/browse/JDK-8308718) ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
[JDK-8308720](https://bugs.openjdk.org/browse/JDK-8308720) ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64

-------------

Commit messages:
 - 8308720: ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64
 - 8308718: ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
 - 8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64

Changes: https://git.openjdk.org/jdk/pull/14106/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14106&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8308716
  Stats: 6 lines in 3 files changed: 6 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/14106.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14106/head:pull/14106

PR: https://git.openjdk.org/jdk/pull/14106

From azvegint at openjdk.org Tue May 23 20:48:16 2023
From: azvegint at openjdk.org (Alexander Zvegintsev)
Date: Tue, 23 May 2023 20:48:16 GMT
Subject: Integrated: 8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
In-Reply-To:
References:
Message-ID:

On Tue, 23 May 2023 20:24:18 GMT, Daniel D. Daugherty wrote:

> Trivial fixes to ProblemList some tests:
> [JDK-8308716](https://bugs.openjdk.org/browse/JDK-8308716) ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
> [JDK-8308718](https://bugs.openjdk.org/browse/JDK-8308718) ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
> [JDK-8308720](https://bugs.openjdk.org/browse/JDK-8308720) ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64

Marked as reviewed by azvegint (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/14106#pullrequestreview-1440533061

From darcy at openjdk.org Tue May 23 20:48:17 2023
From: darcy at openjdk.org (Joe Darcy)
Date: Tue, 23 May 2023 20:48:17 GMT
Subject: Integrated: 8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
In-Reply-To:
References:
Message-ID:

On Tue, 23 May 2023 20:24:18 GMT, Daniel D. Daugherty wrote:

> Trivial fixes to ProblemList some tests:
> [JDK-8308716](https://bugs.openjdk.org/browse/JDK-8308716) ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
> [JDK-8308718](https://bugs.openjdk.org/browse/JDK-8308718) ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
> [JDK-8308720](https://bugs.openjdk.org/browse/JDK-8308720) ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64

Marked as reviewed by darcy (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/14106#pullrequestreview-1440535048

From dcubed at openjdk.org Tue May 23 20:48:17 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Tue, 23 May 2023 20:48:17 GMT
Subject: Integrated: 8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
In-Reply-To:
References:
Message-ID:

On Tue, 23 May 2023 20:39:02 GMT, Alexander Zvegintsev wrote:

>> Trivial fixes to ProblemList some tests:
>> [JDK-8308716](https://bugs.openjdk.org/browse/JDK-8308716) ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
>> [JDK-8308718](https://bugs.openjdk.org/browse/JDK-8308718) ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
>> [JDK-8308720](https://bugs.openjdk.org/browse/JDK-8308720) ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64
>
> Marked as reviewed by azvegint (Reviewer).

@azvegint and @jddarcy - Thanks for the fast reviews!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14106#issuecomment-1560097131

From dcubed at openjdk.org Tue May 23 20:48:18 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Tue, 23 May 2023 20:48:18 GMT
Subject: Integrated: 8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
In-Reply-To:
References:
Message-ID:

On Tue, 23 May 2023 20:24:18 GMT, Daniel D. Daugherty wrote:

> Trivial fixes to ProblemList some tests:
> [JDK-8308716](https://bugs.openjdk.org/browse/JDK-8308716) ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
> [JDK-8308718](https://bugs.openjdk.org/browse/JDK-8308718) ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
> [JDK-8308720](https://bugs.openjdk.org/browse/JDK-8308720) ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64

This pull request has now been integrated.

Changeset: ed0e956f
Author: Daniel D. Daugherty
URL: https://git.openjdk.org/jdk/commit/ed0e956fc28a54a0eb49bab70a7d010095ce2544
Stats: 6 lines in 3 files changed: 6 ins; 0 del; 0 mod

8308716: ProblemList java/util/concurrent/ScheduledThreadPoolExecutor/BasicCancelTest.java with genzgc on windows-x64
8308718: ProblemList three mlvm/indy/func/jvmti tests on windows-x64 in Xcomp mode
8308720: ProblemList java/awt/event/SequencedEvent/MultipleContextsFunctionalTest.java on macosx-x64

Reviewed-by: azvegint, darcy

-------------

PR: https://git.openjdk.org/jdk/pull/14106

From tschatzl at openjdk.org Wed May 24 09:52:04 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 24 May 2023 09:52:04 GMT
Subject: RFR: 8306920: G1: Calculate garbage bytes for evacuation failed regions from marked live bytes
Message-ID:

Hi all,

please review this refactoring that uses `G1ConcurrentMark`'s live bytes/marking to collect the amount of live bytes for evacuation failed regions instead of calculating it piecemeal while removing self-forwards.

The reason is that the functionality to keep evacuation failed regions in the remembered sets to clear them out quickly ([JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)) needs the region's live bytes to determine whether the region is retained (i.e. put into the collection set candidates).

The live bytes for a region will be required in G1's post evacuation phase 1, but currently is calculated in post evacuation phase 2. I.e. this change avoids splitting up post evacuation phase 2 and shuffling around phases (it also makes assignment of live bytes to evacuation failed regions non-incremental, which makes it imo easier to understand).

The change does add, if there is an evacuation failure, a very short serial phase that calculates the final liveness bytes for a region (that is O(#worker threads)).
The reason for reusing `ConcurrentMark`'s liveness gathering infrastructure is because it's already there and there is no (problematic) overlap with its use during marking; i.e. marking only uses live byte array entries for regions that are marked through, and evacuation failure can only happen for regions in the (candidate) collection set, which g1 never marks through.

Testing: tier1-5

Thanks,
Thomas

-------------

Commit messages:
 - Add gc log messages testing
 - Remove test parameter changes
 - Minimize changes
 - Calculate garbage bytes for evacuation failed regions from marked live bytes

Changes: https://git.openjdk.org/jdk/pull/14118/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14118&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8306920
  Stats: 57 lines in 10 files changed: 27 ins; 13 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/14118.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14118/head:pull/14118

PR: https://git.openjdk.org/jdk/pull/14118

From tschatzl at openjdk.org Wed May 24 09:58:57 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 24 May 2023 09:58:57 GMT
Subject: RFR: 8306920: G1: Calculate garbage bytes for evacuation failed regions from marked live bytes
In-Reply-To:
References:
Message-ID:

On Wed, 24 May 2023 09:38:23 GMT, Thomas Schatzl wrote:

> Hi all,
>
> please review this refactoring that uses `G1ConcurrentMark`'s live bytes/marking to collect the amount of live bytes for evacuation failed regions instead of calculating it piecemeal while removing self-forwards.
>
> The reason is that the functionality to keep evacuation failed regions in the remembered sets to clear them out quickly ([JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)) needs the region's live bytes to determine whether the region is retained (i.e. put into the collection set candidates).
>
> The live bytes for a region will be required in G1's post evacuation phase 1, but currently is calculated in post evacuation phase 2. I.e. this change avoids splitting up post evacuation phase 2 and shuffling around phases (it also makes assignment of live bytes to evacuation failed regions non-incremental, which makes it imo easier to understand).
>
> The change does add, if there is an evacuation failure, a very short serial phase that calculates the final liveness bytes for a region (that is O(#worker threads)).
> The reason for reusing `ConcurrentMark`'s liveness gathering infrastructure is because it's already there and there is no (problematic) overlap with its use during marking; i.e. marking only uses live byte array entries for regions that are marked through, and evacuation failure can only happen for regions in the (candidate) collection set, which g1 never marks through.
>
> Testing: tier1-5
>
> Thanks,
> Thomas

Fwiw, the current suggested change for JDK-8140326 can be viewed at https://github.com/openjdk/jdk/compare/pr/14118...tschatzl:submit/8140326-evacuate-retained-regions-at-any-time2?expand=1

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14118#issuecomment-1560813326

From gli at openjdk.org Wed May 24 11:03:12 2023
From: gli at openjdk.org (Guoxiong Li)
Date: Wed, 24 May 2023 11:03:12 GMT
Subject: RFR: 8194823: Serial does not account GCs caused by TLAB allocation in GC overhead limit
Message-ID:

Hi all,

This patch enables the gc overhead limit when allocating TLAB in serial gc.
The main modification is at `GenCollectedHeap::allocate_new_tlab` and the other files only adjust the parameters of the method `allocate_new_tlab`.

Thanks for the review.

Best Regards,
-- Guoxiong

-------------

Commit messages:
 - JDK-8194823

Changes: https://git.openjdk.org/jdk/pull/14120/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14120&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8194823
  Stats: 35 lines in 17 files changed: 16 ins; 0 del; 19 mod
  Patch: https://git.openjdk.org/jdk/pull/14120.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14120/head:pull/14120

PR: https://git.openjdk.org/jdk/pull/14120

From tschatzl at openjdk.org Wed May 24 11:58:01 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 24 May 2023 11:58:01 GMT
Subject: RFR: 8308766: TLAB initialization may cause div by zero
Message-ID:

Hi all,

can I have reviews for this change that fixes an FP div by zero?

In `ThreadLocalAllocBuffer::initialize()` we initialize the TLAB using current available TLAB capacity for the thread. In G1, this can be zero in some situations, leading to that div by zero (see the CR for the crash when adding an assert).

The suggested fix is to just not sample at this point. TLAB resizing will fix TLAB sizing up.

Only G1 seems to be affected as it seems to be the only gc that uses a dynamic value for the capacity available for TLAB allocation. Other GCs seem to just use total heap capacity (Z, Shenandoah) or eden capacity (Serial, Parallel). Not sure if that is actually better and I think won't result in the expected behavior (every thread should reload TLABs `target_refills()` times per mutator time); since even with G1 at TLAB resizing time eden is maximal, this hiccup at initialization does not seem too bad.

This may also be the cause for the behavior observed in https://bugs.openjdk.org/browse/JDK-8264798.
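
For illustration only, the "do not sample when the capacity is zero" idea can be sketched as below. This is a minimal sketch under stated assumptions, not the HotSpot change: `initial_alloc_fraction` and its parameters are hypothetical names, and the formula (desired size times target refills, divided by capacity) is assumed from the description above rather than taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not HotSpot code. Computes the initial allocation
// fraction for a thread and stores it in *out. Returns false and leaves *out
// untouched when the reported TLAB capacity is zero, so the caller can skip
// the sample instead of dividing by zero.
bool initial_alloc_fraction(std::size_t desired_size_words,
                            unsigned target_refills,
                            std::size_t capacity_words,
                            float* out) {
  assert(out != nullptr);
  if (capacity_words == 0) {
    return false;  // nothing to sample yet; later TLAB resizing corrects the size
  }
  *out = static_cast<float>(desired_size_words) * static_cast<float>(target_refills) /
         static_cast<float>(capacity_words);
  return true;
}
```

A caller would only feed the value into its averaging structure when the function returns true.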

Testing: gha

Thanks,
Thomas

-------------

Commit messages:
 - Initial version

Changes: https://git.openjdk.org/jdk/pull/14121/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14121&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8308766
  Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/14121.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14121/head:pull/14121

PR: https://git.openjdk.org/jdk/pull/14121

From tschatzl at openjdk.org Wed May 24 12:05:07 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 24 May 2023 12:05:07 GMT
Subject: RFR: 8171221: Remove -XX:+CheckMemoryInitialization
In-Reply-To:
References:
Message-ID:

On Tue, 23 May 2023 16:25:35 GMT, Albert Mingkun Yang wrote:

>> Hi all,
>>
>> please review this change that removes the broken (verified) -XX:+CheckMemoryInitialization debug flag. Interestingly there are some test cases that explicitly check this functionality without problems, but they are simply not thorough enough.
>>
>> Apparently this has been broken since at least 2016, and given that nobody cared to fix it since then I think it's not worth trying to salvage it here either.
>>
>> Testing: local compilation, gha
>>
>> Thanks,
>> Thomas
>
> Marked as reviewed by ayang (Reviewer).

Thanks @albertnetymk @shipilev for your reviews

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14101#issuecomment-1560990502

From tschatzl at openjdk.org Wed May 24 12:05:10 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 24 May 2023 12:05:10 GMT
Subject: Integrated: 8171221: Remove -XX:+CheckMemoryInitialization
In-Reply-To:
References:
Message-ID:

On Tue, 23 May 2023 13:22:21 GMT, Thomas Schatzl wrote:

> Hi all,
>
> please review this change that removes the broken (verified) -XX:+CheckMemoryInitialization debug flag. Interestingly there are some test cases that explicitly check this functionality without problems, but they are simply not thorough enough.
>
> Apparently this has been broken since at least 2016, and given that nobody cared to fix it since then I think it's not worth trying to salvage it here either.
>
> Testing: local compilation, gha
>
> Thanks,
> Thomas

This pull request has now been integrated.

Changeset: 65c8dbe6
Author: Thomas Schatzl
URL: https://git.openjdk.org/jdk/commit/65c8dbe693f09203f66cd25aa9179982ddc38274
Stats: 125 lines in 6 files changed: 0 ins; 125 del; 0 mod

8171221: Remove -XX:+CheckMemoryInitialization

Reviewed-by: ayang, shade

-------------

PR: https://git.openjdk.org/jdk/pull/14101

From eosterlund at openjdk.org Wed May 24 12:26:03 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Wed, 24 May 2023 12:26:03 GMT
Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences
Message-ID:

When a major GC in generational ZGC with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both an YC and OC has passed since the allocation request was installed.
The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references.
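
For illustration only, the decision described above can be sketched as follows. This is a minimal sketch under stated assumptions: `CollectionsSinceStall`, `StallDecision` and `decide` are hypothetical names, not ZGC code, and the sketch only models the rule "give up on a stalled allocation only if the completed major collection also cleared soft references".

```cpp
#include <cassert>  // only needed for usage checks such as the one described below

// Hypothetical summary of what happened since a stalled allocation request
// was installed. These names are illustrative, not ZGC's.
struct CollectionsSinceStall {
  bool young_collection_completed;
  bool old_collection_completed;
  bool soft_references_cleared;  // remembered by the driver for the last major GC
};

enum class StallDecision { KeepWaiting, ThrowOutOfMemory };

// Decide what to do with an allocation request that still cannot be satisfied.
StallDecision decide(const CollectionsSinceStall& s) {
  const bool full_cycle_passed =
      s.young_collection_completed && s.old_collection_completed;
  // As described in the message, a completed young and old collection alone
  // was previously enough to give up; the extra flag prevents that when the
  // major collection did not clear soft references.
  if (full_cycle_passed && s.soft_references_cleared) {
    return StallDecision::ThrowOutOfMemory;
  }
  return StallDecision::KeepWaiting;
}
```

With this rule, a request that arrives just before a major collection that does not clear soft references keeps waiting for a later collection that does, instead of failing immediately.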

-------------

Commit messages:
 - 8308009: Generational ZGC: OOM before clearing all SoftReferences

Changes: https://git.openjdk.org/jdk/pull/14122/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14122&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8308009
  Stats: 22 lines in 6 files changed: 6 ins; 4 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/14122.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14122/head:pull/14122

PR: https://git.openjdk.org/jdk/pull/14122

From stefank at openjdk.org Wed May 24 12:43:54 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 24 May 2023 12:43:54 GMT
Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences
In-Reply-To:
References:
Message-ID:

On Wed, 24 May 2023 12:18:19 GMT, Erik Österlund wrote:

> When a major GC in generational ZGC with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both an YC and OC has passed since the allocation request was installed.
> The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references.

Marked as reviewed by stefank (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/14122#pullrequestreview-1441748389

From coleenp at openjdk.org Wed May 24 12:50:19 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Wed, 24 May 2023 12:50:19 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v33]
In-Reply-To:
References:
Message-ID:

On Mon, 22 May 2023 14:37:46 GMT, Roman Kennke wrote:

> I guess the alternative would be to not put it under flag to begin with.

Yes, is there any reason this can't be an improvement in the default case without a flag?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1561079434

From aboldtch at openjdk.org Wed May 24 12:50:58 2023
From: aboldtch at openjdk.org (Axel Boldt-Christmas)
Date: Wed, 24 May 2023 12:50:58 GMT
Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences
In-Reply-To:
References:
Message-ID:

On Wed, 24 May 2023 12:18:19 GMT, Erik Österlund wrote:

> When a major GC in generational ZGC with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both an YC and OC has passed since the allocation request was installed.
> The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references.

I think it looks good.

Only one thing I can think of. There is an implicit assumption here that should_clear_soft_references implies should_preclean_young which is required to be spec compliant. Maybe `should_preclean_young` should start with

```c++
if (should_clear_soft_references(cause)) {
  return true;
}
```

to make this clearer. Or even short circuit the `if (should_preclean_young(...))` directly in `ZDriverMajor::collect_young`

-------------

Marked as reviewed by aboldtch (Committer).

PR Review: https://git.openjdk.org/jdk/pull/14122#pullrequestreview-1441762825

From rkennke at openjdk.org Wed May 24 14:38:23 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 24 May 2023 14:38:23 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v33]
In-Reply-To:
References:
Message-ID: <9xoNQZSU369K49QDHssVieF48owK05KpTSRzGgxFDa0=.449d7f8d-cbfa-409f-ae3f-4aeac773cc89@github.com>

On Wed, 24 May 2023 12:47:09 GMT, Coleen Phillimore wrote:

> > I guess the alternative would be to not put it under flag to begin with.
>
> Yes, is there any reason this can't be an improvement in the default case without a flag?

Dunno. A 7% worst-case performance degradation in the full-GC of Serial (and perhaps G1 and Shenandoah) might be a concern. Also, it is a design principle of the compact headers JEP that all non-trivial changes should be under a runtime flag, where the 'legacy' case is not measurably affected.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1561280369

From eosterlund at openjdk.org Wed May 24 15:41:26 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Wed, 24 May 2023 15:41:26 GMT
Subject: RFR: 8305896: Alternative full GC forwarding [v33]
In-Reply-To: <9xoNQZSU369K49QDHssVieF48owK05KpTSRzGgxFDa0=.449d7f8d-cbfa-409f-ae3f-4aeac773cc89@github.com>
References: <9xoNQZSU369K49QDHssVieF48owK05KpTSRzGgxFDa0=.449d7f8d-cbfa-409f-ae3f-4aeac773cc89@github.com>
Message-ID: <5vfFABBVLZ_Vllbn4BBT7qX-KcMXyBljCwNH8SYXnBQ=.1ab444bc-11fb-4df0-a173-de42e454972f@github.com>

On Wed, 24 May 2023 14:34:51 GMT, Roman Kennke wrote:

> > > I guess the alternative would be to not put it under flag to begin with.
> >
> >
> > Yes, is there any reason this can't be an improvement in the default case without a flag?
>
> Dunno. A 7% worst-case performance degradation in the full-GC of Serial (and perhaps G1 and Shenandoah) might be a concern. Also, it is a design principle of the compact headers JEP that all non-trivial changes should be under a runtime flag, where the 'legacy' case is not measurably affected.

In this case, I'd prefer to not have a flag. If we can convince ourselves that the impact is indeed only 7% worst case time for Serial and G1 + Shenandoah failure modes, I'd rather change the goals in the JEP to say we are willing to throw Serial GC non-failure full GC modes under the bus and risk that taking a bit longer, with the reasoning that if you rely on that level of latency or GC performance, then maybe Serial GC simply isn't for you. Does that make sense?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1561394556

From rkennke at openjdk.org Wed May 24 17:20:07 2023
From: rkennke at openjdk.org (Roman Kennke)
Date: Wed, 24 May 2023 17:20:07 GMT
Subject: Withdrawn: 8307816: Add missing STS to ZGC
In-Reply-To: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com>
References: <2zsM-KpyaLtvz_OwW1GhBBLo-VDFal7uf7ZzOaNH2uE=.1a9b33ff-ede5-4a68-a4fc-a51652199b3c@github.com>
Message-ID:

On Wed, 10 May 2023 13:43:36 GMT, Roman Kennke wrote:

> Testing in project Lilliput has revealed that ZGC is lacking one STS. Without it, ZGC could reach to already-deflated monitor when trying to fetch a displaced header, in order to get to an object's Klass* (e.g. to get its size).
>
> Testing:
> - [x] hotspot_gc

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/13904

From lmesnik at openjdk.org Wed May 24 19:06:00 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 24 May 2023 19:06:00 GMT
Subject: RFR: 8308506: Reduce testing time by removing combinations tested
In-Reply-To:
References:
Message-ID:

On Mon, 22 May 2023 11:44:54 GMT, Leo Korinth wrote:

> Juggle3.java and Juggle3Quick.java tests many combinations of array juggling on different kinds of arrays. It tests:
>
> - Primitive arrays (three for each kind of primitive)
> - Object arrays
> - Random arrays
> - A version with hashing
>
> However when testing using primitive arrays, there is --- from a gc perspective --- no extra logic tested by going through all combinations. By removing these, we can save time and put our resources on testing other stuff.
>
> I will keep three versions of primitive array tests (low, medium and high) with different kinds of arrays.
>
> Also, the random array test will still test all primitive versions of arrays.

Marked as reviewed by lmesnik (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/14078#pullrequestreview-1442573202

From lkorinth at openjdk.org Thu May 25 08:59:09 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Thu, 25 May 2023 08:59:09 GMT
Subject: RFR: 8308506: Reduce testing time by removing combinations tested
In-Reply-To:
References:
Message-ID:

On Mon, 22 May 2023 11:44:54 GMT, Leo Korinth wrote:

> Juggle3.java and Juggle3Quick.java tests many combinations of array juggling on different kinds of arrays. It tests:
>
> - Primitive arrays (three for each kind of primitive)
> - Object arrays
> - Random arrays
> - A version with hashing
>
> However when testing using primitive arrays, there is --- from a gc perspective --- no extra logic tested by going through all combinations. By removing these, we can save time and put our resources on testing other stuff.
>
> I will keep three versions of primitive array tests (low, medium and high) with different kinds of arrays.
>
> Also, the random array test will still test all primitive versions of arrays.

Thanks Thomas and Leonid!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14078#issuecomment-1562533512

From lkorinth at openjdk.org Thu May 25 08:59:10 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Thu, 25 May 2023 08:59:10 GMT
Subject: Integrated: 8308506: Reduce testing time by removing combinations tested
In-Reply-To:
References:
Message-ID:

On Mon, 22 May 2023 11:44:54 GMT, Leo Korinth wrote:

> Juggle3.java and Juggle3Quick.java tests many combinations of array juggling on different kinds of arrays. It tests:
>
> - Primitive arrays (three for each kind of primitive)
> - Object arrays
> - Random arrays
> - A version with hashing
>
> However when testing using primitive arrays, there is --- from a gc perspective --- no extra logic tested by going through all combinations. By removing these, we can save time and put our resources on testing other stuff.
>
> I will keep three versions of primitive array tests (low, medium and high) with different kinds of arrays.
>
> Also, the random array test will still test all primitive versions of arrays.

This pull request has now been integrated.

Changeset: aaa61899
Author: Leo Korinth
URL: https://git.openjdk.org/jdk/commit/aaa61899c9e246442a50941d075b74083c7c0411
Stats: 21 lines in 2 files changed: 0 ins; 21 del; 0 mod

8308506: Reduce testing time by removing combinations tested

Reviewed-by: tschatzl, lmesnik

-------------

PR: https://git.openjdk.org/jdk/pull/14078

From eosterlund at openjdk.org Thu May 25 09:00:57 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Thu, 25 May 2023 09:00:57 GMT
Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences
In-Reply-To:
References:
Message-ID:

On Wed, 24 May 2023 12:47:56 GMT, Axel Boldt-Christmas wrote:

> I think it looks good.
>
> Only one thing I can think of. There is an implicit assumption here that should_clear_soft_references implies should_preclean_young which is required to be spec compliant. Maybe `should_preclean_young` should start with
>
> ```c++
> if (should_clear_soft_references(cause)) {
>   return true;
> }
> ```
>
> to make this clearer. Or even short circuit the `if (should_preclean_young(...))` directly in `ZDriverMajor::collect_young`

Right. This is already guaranteed because
1) All the causes that trigger soft ref clearing GC, also trigger young precleaning
2) We hold the driver locker between the should_clear_soft_references and should_preclean_young, which means that if soft ref clearing was triggered due to there being a stall, the stall will be there again when we check if we should_preclean_young.

Do we want a comment explaining the above, or change the code structure to be more explicit? I personally don't mind.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14122#issuecomment-1562538005

From tschatzl at openjdk.org Thu May 25 09:51:20 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Thu, 25 May 2023 09:51:20 GMT
Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms
Message-ID:

Hi all,

can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway.

Testing: gha, manual testing as below

There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change.
Here's the problematic case:

$java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version

[0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000]
[0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000]
[0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). Total size: 2097152B
[0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000]
[0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000]
openjdk version "21-internal" 2023-09-19

(The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works)

I.e. the CDS regions are unconditionally uncommitted even though `-Xms == -Xmx`.

The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. I will file an enhancement here, as it is acceptable behavior (to me).

$ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version

[0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.048s][debug][gc,ergo,heap] Allocate CDS archive regions. Allocated 2 Committed 2
[0.048s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000]
[0.048s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff05f50, 0x0000000100000000]
[0.063s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000]

I.e. CDS allocation just commits the last few regions without caring about other existing mappings to not fragment the heap.

The next case is the new behavior:

java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version

[0.047s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.047s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.048s][debug][gc,ergo,heap] Allocate CDS archive regions. Allocated 2 Committed 0
[0.048s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000]
[0.048s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff05f50, 0x0000000100000000]
[0.049s][trace][gc,region ] G1HR CLEANUP(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.050s][trace][gc,region ] G1HR CLEANUP(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.050s][debug][gc,ergo,heap] Deallocate CDS archive regions. Freed 2 Uncommitted 0 regions.
[0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.131s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
openjdk version "21-internal" 2023-09-19

G1 does not uncommit the region blindly any more.

The next test case shows that if `-Xms` is lower than `-Xmx`, the code still properly uncommits (and not only frees the region).

java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version

[0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.048s][debug][gc,ergo,heap] Allocate CDS archive regions. Allocated 2 Committed 2
[0.048s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000]
[0.048s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff05f50, 0x0000000100000000]
[0.048s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.049s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.049s][debug][gc,ergo,heap] Deallocate CDS archive regions. Freed 2 Uncommitted 2 regions.
[0.049s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.049s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.056s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000]
[0.124s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000]
openjdk version "21-internal" 2023-09-19

And finally a "mixed" case where, to allocate the CDS archive, G1 uses a mix of already allocated regions and committed regions, and uncommits/gives back regions as expected.

java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms127m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version

[0.053s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.053s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.053s][debug][gc,ergo,heap] Allocate CDS archive regions. Allocated 2 Committed 1
[0.053s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000]
[0.053s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff05f50, 0x0000000100000000]
[0.053s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.054s][trace][gc,region ] G1HR CLEANUP(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.054s][debug][gc,ergo,heap] Deallocate CDS archive regions. Freed 2 Uncommitted 1 regions.
[0.054s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000]
[0.061s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000]
[0.138s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000]
openjdk version "21-internal" 2023-09-19

Thanks,
  Thomas

-------------

Commit messages:
 - Some comment changes
 - Remove debug code
 - initial version, uses UseNewCode

Changes: https://git.openjdk.org/jdk/pull/14145/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14145&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8232722
  Stats: 44 lines in 6 files changed: 15 ins; 4 del; 25 mod
  Patch: https://git.openjdk.org/jdk/pull/14145.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14145/head:pull/14145

PR: https://git.openjdk.org/jdk/pull/14145

From tschatzl at openjdk.org Thu May 25 10:52:00 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Thu, 25 May 2023 10:52:00 GMT
Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit
In-Reply-To:
References:
Message-ID:

On Wed, 24 May 2023 10:54:16 GMT, Guoxiong Li wrote:

> Hi all,
>
> This patch enables the gc overhead limit when allocating TLAB in serial gc.
> The main modification is at `GenCollectedHeap::allocate_new_tlab` and the other
> files only adjust the parameters of the method `allocate_new_tlab`.
>
> Thanks for the review.
>
> Best Regards,
> -- Guoxiong

Lgtm apart from parameter list formatting issues. Please fix before committing.

It's a bit unfortunate that support for this feature for one collector causes so many changes everywhere else. (But also indicates an issue with all other collectors not supporting it ;)). Maybe the parameters could be wrapped into something like an "AllocationRequest", but this is a) a separate issue, and b) needs to be discussed first with others.
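
For illustration only, a minimal sketch of what wrapping the parameters might look like. Everything here is an assumption: `AllocationRequest`, its fields, and `ToyHeap` are hypothetical names and not existing HotSpot code, and the toy `allocate_new_tlab` returns a `bool` where the real method returns a `HeapWord*`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper bundling the four allocate_new_tlab() parameters.
// Not an existing HotSpot type.
struct AllocationRequest {
  std::size_t min_size;                  // in: smallest acceptable TLAB size (words)
  std::size_t requested_size;            // in: preferred TLAB size (words)
  std::size_t actual_size;               // out: size actually handed out (words)
  bool gc_overhead_limit_was_exceeded;   // out: would be set by a collector that tracks the limit

  AllocationRequest(std::size_t min_words, std::size_t requested_words)
    : min_size(min_words),
      requested_size(requested_words),
      actual_size(0),
      gc_overhead_limit_was_exceeded(false) {}
};

// Toy stand-in for a heap: it only tracks a count of free words.
class ToyHeap {
 public:
  explicit ToyHeap(std::size_t free_words) : _free_words(free_words) {}

  // Hands out up to requested_size words, but never less than min_size.
  // Returns true on success and records the granted size in the request.
  bool allocate_new_tlab(AllocationRequest& req) {
    assert(req.min_size <= req.requested_size);
    std::size_t size = req.requested_size <= _free_words ? req.requested_size
                                                         : _free_words;
    if (size < req.min_size) {
      return false;  // not enough room even for the minimum size
    }
    _free_words -= size;
    req.actual_size = size;
    return true;
  }

 private:
  std::size_t _free_words;
};
```

With such a wrapper, adding another out-parameter would only touch the struct and the collectors that care about it, rather than every override's signature.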
src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 518:

> 516:
> 517: HeapWord* ParallelScavengeHeap::allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size,
> 518: bool* gc_overhead_limit_was_exceeded) {

Suggestion:

HeapWord* ParallelScavengeHeap::allocate_new_tlab(size_t min_size,
                                                  size_t requested_size,
                                                  size_t* actual_size,
                                                  bool* gc_overhead_limit_was_exceeded) {

Parameter list formatting

src/hotspot/share/gc/parallel/parallelScavengeHeap.hpp line 106:

> 104: protected:
> 105: HeapWord* allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size,
> 106: bool* gc_overhead_limit_was_exceeded) override;

Suggestion:

  HeapWord* allocate_new_tlab(size_t min_size,
                              size_t requested_size,
                              size_t* actual_size,
                              bool* gc_overhead_limit_was_exceeded) override;

Parameter formatting

src/hotspot/share/gc/shared/memAllocator.cpp line 325:

> 323: size_t min_tlab_size = ThreadLocalAllocBuffer::compute_min_size(_word_size);
> 324: mem = Universe::heap()->allocate_new_tlab(min_tlab_size, new_tlab_size, &allocation._allocated_tlab_size,
> 325: &allocation._overhead_limit_exceeded);

Suggestion:

mem = Universe::heap()->allocate_new_tlab(min_tlab_size,
                                          new_tlab_size,
                                          &allocation._allocated_tlab_size,
                                          &allocation._overhead_limit_exceeded);

Parameter formatting

src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 535:

> 533:
> 534: HeapWord* allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size,
> 535: bool* gc_overhead_limit_was_exceeded) override;

Suggestion:

  HeapWord* allocate_new_tlab(size_t min_size,
                              size_t requested_size,
                              size_t* actual_size,
                              bool* gc_overhead_limit_was_exceeded) override;

src/hotspot/share/gc/x/xCollectedHeap.cpp line 150:

> 148:
> 149: HeapWord* XCollectedHeap::allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size,
> 150: bool* gc_overhead_limit_was_exceeded) {

Suggestion:

HeapWord* XCollectedHeap::allocate_new_tlab(size_t min_size,
                                            size_t requested_size,
                                            size_t* actual_size,
                                            bool* gc_overhead_limit_was_exceeded) {

src/hotspot/share/gc/z/zCollectedHeap.cpp line 145:

> 143:
> 144: HeapWord* ZCollectedHeap::allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size,
> 145: bool* gc_overhead_limit_was_exceeded) {

Suggestion:

HeapWord* ZCollectedHeap::allocate_new_tlab(size_t min_size,
                                            size_t requested_size,
                                            size_t* actual_size,
                                            bool* gc_overhead_limit_was_exceeded) {

-------------

Marked as reviewed by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14120#pullrequestreview-1443605666
PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205332523
PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205333052
PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205333593
PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205333968
PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205334232
PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205334518

From iwalulya at openjdk.org Thu May 25 11:04:55 2023
From: iwalulya at openjdk.org (Ivan Walulya)
Date: Thu, 25 May 2023 11:04:55 GMT
Subject: RFR: 8308098: G1: Remove redundant checks in G1ObjectCountIsAliveClosure
In-Reply-To:
References:
Message-ID:

On Mon, 15 May 2023 13:01:00 GMT, Albert Mingkun Yang wrote:

> Simple removing unnecessary checks.
>
> Test: hotspot_gc

Marked as reviewed by iwalulya (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/13986#pullrequestreview-1443633596

From gli at openjdk.org Thu May 25 11:29:27 2023
From: gli at openjdk.org (Guoxiong Li)
Date: Thu, 25 May 2023 11:29:27 GMT
Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> This patch enables the gc overhead limit when allocating TLAB in serial gc.
> The main modification is at `GenCollectedHeap::allocate_new_tlab` and the other > files only adjust the parameters of the method `allocate_new_tlab`. > > Thanks for the review. > > Best Regards, > -- Guoxiong Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: Fix parameter list formatting issues. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14120/files - new: https://git.openjdk.org/jdk/pull/14120/files/f19d3213..fd9b3969 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14120&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14120&range=00-01 Stats: 18 lines in 6 files changed: 12 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14120.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14120/head:pull/14120 PR: https://git.openjdk.org/jdk/pull/14120 From gli at openjdk.org Thu May 25 11:29:28 2023 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 25 May 2023 11:29:28 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: On Thu, 25 May 2023 10:48:49 GMT, Thomas Schatzl wrote: >> Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix parameter list formatting issues. > > Lgtm apart from parameter list formatting issues. Please fix before committing. > > It's a bit unfortunate that support for this feature for one collector causes so many changes everywhere else. (But also indicates an issue with all other collectors not supporting it ;)). Maybe the parameters could be wrapped into something like an "AllocationRequest", but this is a) a separate issue, and b) needs to be discussed first with others. @tschatzl Thanks for the review. I fixed the formatting issues just now. 
> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 518: > >> 516: >> 517: HeapWord* ParallelScavengeHeap::allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size, >> 518: bool* gc_overhead_limit_was_exceeded) { > > Suggestion: > > HeapWord* ParallelScavengeHeap::allocate_new_tlab(size_t min_size, > size_t requested_size, > size_t* actual_size, > bool* gc_overhead_limit_was_exceeded) { > > Parameter list formatting Fixed. > src/hotspot/share/gc/parallel/parallelScavengeHeap.hpp line 106: > >> 104: protected: >> 105: HeapWord* allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size, >> 106: bool* gc_overhead_limit_was_exceeded) override; > > Suggestion: > > HeapWord* allocate_new_tlab(size_t min_size, > size_t requested_size, > size_t* actual_size, > bool* gc_overhead_limit_was_exceeded) override; > > Parameter formatting Fixed. > src/hotspot/share/gc/shared/memAllocator.cpp line 325: > >> 323: size_t min_tlab_size = ThreadLocalAllocBuffer::compute_min_size(_word_size); >> 324: mem = Universe::heap()->allocate_new_tlab(min_tlab_size, new_tlab_size, &allocation._allocated_tlab_size, >> 325: &allocation._overhead_limit_exceeded); > > Suggestion: > > mem = Universe::heap()->allocate_new_tlab(min_tlab_size, > new_tlab_size, > &allocation._allocated_tlab_size, > &allocation._overhead_limit_exceeded); > > Parameter formatting Fixed. > src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 535: > >> 533: >> 534: HeapWord* allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size, >> 535: bool* gc_overhead_limit_was_exceeded) override; > > Suggestion: > > HeapWord* allocate_new_tlab(size_t min_size, > size_t requested_size, > size_t* actual_size, > bool* gc_overhead_limit_was_exceeded) override; Fixed. 
> src/hotspot/share/gc/x/xCollectedHeap.cpp line 150: > >> 148: >> 149: HeapWord* XCollectedHeap::allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size, >> 150: bool* gc_overhead_limit_was_exceeded) { > > Suggestion: > > HeapWord* XCollectedHeap::allocate_new_tlab(size_t min_size, > size_t requested_size, > size_t* actual_size, > bool* gc_overhead_limit_was_exceeded) { Fixed. > src/hotspot/share/gc/z/zCollectedHeap.cpp line 145: > >> 143: >> 144: HeapWord* ZCollectedHeap::allocate_new_tlab(size_t min_size, size_t requested_size, size_t* actual_size, >> 145: bool* gc_overhead_limit_was_exceeded) { > > Suggestion: > > HeapWord* ZCollectedHeap::allocate_new_tlab(size_t min_size, > size_t requested_size, > size_t* actual_size, > bool* gc_overhead_limit_was_exceeded) { Fixed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14120#issuecomment-1562734243 PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205370192 PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205370232 PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205370289 PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205370349 PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205370388 PR Review Comment: https://git.openjdk.org/jdk/pull/14120#discussion_r1205370436 From ayang at openjdk.org Thu May 25 11:59:55 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 25 May 2023 11:59:55 GMT Subject: RFR: 8306920: G1: Calculate garbage bytes for evacuation failed regions from marked live bytes In-Reply-To: References: Message-ID: On Wed, 24 May 2023 09:38:23 GMT, Thomas Schatzl wrote: > but currently is calculated in post evacuation phase 2 Which task specifically in phase 2? I thought it's `RestoreRetainedRegionsTask` that calculates garbage-bytes/live-bytes, which lives in phase 1. 
> there is no (problematic) overlap with its use during marking Even if it's correct, I find it a bit hacky to reuse marking-related data structure like this, `add_to_liveness` uses `_mark_stats_cache`. (`G1CMBitMap` is also used outside conc-mark, but I'd say that it should be brought up to the heap-level; bitmap, mirroring the whole heap, is not really tied to conc-mark.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14118#issuecomment-1562776398 From ayang at openjdk.org Thu May 25 12:03:06 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 25 May 2023 12:03:06 GMT Subject: RFR: 8308098: G1: Remove redundant checks in G1ObjectCountIsAliveClosure In-Reply-To: References: Message-ID: On Mon, 15 May 2023 13:01:00 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary checks. > > Test: hotspot_gc Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13986#issuecomment-1562777598 From ayang at openjdk.org Thu May 25 12:03:07 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 25 May 2023 12:03:07 GMT Subject: Integrated: 8308098: G1: Remove redundant checks in G1ObjectCountIsAliveClosure In-Reply-To: References: Message-ID: On Mon, 15 May 2023 13:01:00 GMT, Albert Mingkun Yang wrote: > Simple removing unnecessary checks. > > Test: hotspot_gc This pull request has now been integrated. 
Changeset: 7e2e05d8 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/7e2e05d836adc8fce57af2dfb4ca12e2f3625d92 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod 8308098: G1: Remove redundant checks in G1ObjectCountIsAliveClosure Reviewed-by: tschatzl, iwalulya ------------- PR: https://git.openjdk.org/jdk/pull/13986 From stefank at openjdk.org Thu May 25 12:48:03 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 25 May 2023 12:48:03 GMT Subject: RFR: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 Message-ID: The test fails intermittently in early tiers on Windows x64. ------------- Commit messages: - 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC Changes: https://git.openjdk.org/jdk/pull/14144/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14144&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308844 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14144.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14144/head:pull/14144 PR: https://git.openjdk.org/jdk/pull/14144 From aboldtch at openjdk.org Thu May 25 12:53:54 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Thu, 25 May 2023 12:53:54 GMT Subject: RFR: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 In-Reply-To: References: Message-ID: On Thu, 25 May 2023 08:46:54 GMT, Stefan Karlsson wrote: > The test fails intermittently in early tiers on Windows x64. Marked as reviewed by aboldtch (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14144#pullrequestreview-1443821568 From stefank at openjdk.org Thu May 25 13:01:54 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 25 May 2023 13:01:54 GMT Subject: RFR: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 In-Reply-To: References: Message-ID: On Thu, 25 May 2023 08:46:54 GMT, Stefan Karlsson wrote: > The test fails intermittently in early tiers on Windows x64. Thanks for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14144#issuecomment-1562865078 From iwalulya at openjdk.org Thu May 25 13:07:58 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 25 May 2023 13:07:58 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: On Thu, 25 May 2023 11:29:27 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch enables the gc overhead limit when allocating TLAB in serial gc. >> The main modification is at `GenCollectedHeap::allocate_new_tlab` and the other >> files only adjust the parameters of the method `allocate_new_tlab`. >> >> Thanks for the review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix parameter list formatting issues. Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14120#pullrequestreview-1443854022 From ayang at openjdk.org Thu May 25 13:34:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 25 May 2023 13:34:57 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: On Thu, 25 May 2023 11:29:27 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch enables the gc overhead limit when allocating TLAB in serial gc. 
>> The main modification is at `GenCollectedHeap::allocate_new_tlab` and the other >> files only adjust the parameters of the method `allocate_new_tlab`. >> >> Thanks for the review. >> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix parameter list formatting issues. Do you have a small example to trigger `java.lang.OutOfMemoryError: GC Overhead Limit Exceeded` for Serial GC? I was under the impression that Serial doesn't support `UseGCOverheadLimit`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14120#issuecomment-1562916142 From stefank at openjdk.org Thu May 25 13:51:27 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 25 May 2023 13:51:27 GMT Subject: RFR: 8308589: gc/cslocker/TestCSLocker.java timed out Message-ID: We have found that this test is flawed and will cause deadlocks if allocations wait for a GC to complete. We tried to fix this issue by removing one source of allocations (see JDK-8308043), but there are still more reasons why the JVM might allocate memory in the test. The test has a history of causing timeouts (likely caused by deadlocks), but we're currently only seeing hangs with Generational ZGC. I propose that we turn off this test for Generational ZGC, and if the test still causes problems in other configurations then we'll reevaluate if this should be handled some other way.
------------- Commit messages: - 8308589: gc/cslocker/TestCSLocker.java timed out Changes: https://git.openjdk.org/jdk/pull/14150/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14150&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308589 Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14150.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14150/head:pull/14150 PR: https://git.openjdk.org/jdk/pull/14150 From iwalulya at openjdk.org Thu May 25 13:54:02 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Thu, 25 May 2023 13:54:02 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 09:09:06 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? > > In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. > > Testing: gha, manual testing as below > > There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. > > Here's the problematic case: > > > $java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). 
Total size: 2097152B > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000] > [0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000] > openjdk version "21-internal" 2023-09-19 > > > (The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works) > > I.e. the CDS regions are unconditionally uncommitted even through `-Xms == -Xmx`. > > The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. I will file an enhancement here, as it is acceptable behavior (to me). > > > $ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000... src/hotspot/share/gc/g1/heapRegionManager.cpp line 568: > 566: } > 567: HeapRegion* curr_region = _regions.get_by_index(curr_index); > 568: if (!curr_region->is_free()) { What happens with regions committed before this false return? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1205553922 From tschatzl at openjdk.org Thu May 25 13:55:57 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 13:55:57 GMT Subject: RFR: 8306920: G1: Calculate garbage bytes for evacuation failed regions from marked live bytes In-Reply-To: References: Message-ID: On Thu, 25 May 2023 11:57:23 GMT, Albert Mingkun Yang wrote: > > but currently is calculated in post evacuation phase 2 > > Which task specifically in phase 2? I thought it's `RestoreRetainedRegionsTask` that calculates garbage-bytes/live-bytes, which lives in phase 1. Apologies, my bad. This change series has been on my plate for so long now, and since then I've already been working on so many different things, partially far ahead, that I forget things (the following should also be written down in the comment for [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326) referenced in that PR). It is indeed `RestoreRetainedRegionsTask` in phase 1 that does that work. The problem is completely different, with a dependency clearing the card table (in post phase 1) and redirtying cards (in post phase 2), that is problematic in [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326) where during gc g1 is going to collect cards into the evacuation failed regions unconditionally. These cards (at least the ones that are kept) need to be redirtied like others. So depending on liveness g1 is going to decide whether to keep these in post phase 1 (during gc there is too little information) and push them into the global dcqs (in post phase 1, when merging the PSS). Redirtying in post phase 2 needs the DCQs complete. That means, that liveness information is required to be complete before phase 1. There is the option which separates merging and redirtying of the selected local DCQs'es into phase 2 (i.e. 
a phase that implements both selecting interesting evacuation failure regions and redirtying at the same time), but I did not want to do redirtying in two separate phases depending on source. Maybe I should implement that, and basically drop this change? (I still like this change in some way because it avoids the fairly complicated incremental liveness calculation before in `RestoreRetainedRegionsTask`, but ymmv) What is implemented is determining liveness in that (serial) phase 0. The reason is that determining liveness is very short and fairly constant time (flushing the merge stats cache basically, i.e. the mentioned O(# gc threads)). Merging the PSS can be a bit slow now (merging the local DCQs into the whole, particularly dropping unneeded DCQs'es - but it is nicely hidden in post phase 1; I think Kim has been talking about improving DCQ management a few times in the past), and moving out redirtying into a new post phase 3 can be even slower. This seemed the best tradeoff between code complexity and pause time impact. This PR already mentions that this change is fairly closely tied to the other. I already mentioned (internally) that I think it may be too small now (it was part of a more general improvement to bitmap/tams/live bytes management, i.e. "always" clearing them together), but I posted it anyway separately. I can easily merge them ([JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326) is done and tested already). > > > there is no (problematic) overlap with its use during marking > > Even if it's correct, I find it a bit hacky to reuse marking-related data structure like this, `add_to_liveness` uses `_mark_stats_cache`. > > (`G1CMBitMap` is also used outside conc-mark, but I'd say that it should be brought up to the heap-level; bitmap, mirroring the whole heap, is not really tied to conc-mark.)
I think both the sharing of `G1CMBitMap` and the live-bytes machinery (including `mark_stats_cache`) are on the same level of ugliness/hackiness; both of them are mainly marking data structures that are (mis-)used for evacuation failure rarely (even with region pinning this will only be a small fraction). We've discussed this already when using the bitmap for this recovery mechanism and agreed to hold our noses then (actually I remember that I've been fairly if not most vocal about this being really ugly). Wrt the mark bitmap not really being tied to conc-mark: Since `G1CMBitMap` is a single large reservation (ease of implementation), it has historically been managed by `G1CollectedHeap`. Large data structures always end up there :) Imho conceptually the bitmap is a marking data structure and should be "owned" by marking (at least the parts it uses). It is also called mark bitmap for a reason. I even think that tying bitmaps 1:1 to the region memory (again, done because of ease of implementation) is not advantageous apart from a simplicity POV. A large part of the heap (and bitmap) is unlikely to be marked through (e.g. young gen, large parts of humongous regions), so I think the current way of managing it is a waste of (committed) memory on longer running applications. What helps wrt memory consumption currently is the fact that committing does not actually back the reservation with physical memory, so this saves a little, but once an area has ever been touched, it's not going away at the moment unless the corresponding heap region is uncommitted. One option that looks good to me would be some global bitmap manager that both evacuation failure and marking use, making both of them temporary owners of whatever parts they use. That component could also take care of (concurrent) clearing, keeping "enough" bitmap pages committed etc. Also dropping that 1:1 mapping.
I could either * duplicate the live bytes/mark stats cache for evacuation failure if you prefer (for not much gain). * or try above suggested modification of [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326) to have split card redirtying for the gc global DCQS and the per-thread local ones (that thinking about it, seems to be the best option, but may have issues with work distribution and/or unforeseen other issues). What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14118#issuecomment-1562950742 From ayang at openjdk.org Thu May 25 13:59:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Thu, 25 May 2023 13:59:57 GMT Subject: RFR: 8308766: TLAB initialization may cause div by zero In-Reply-To: References: Message-ID: On Wed, 24 May 2023 11:50:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an FP div by zero? > > In `ThreadLocalAllocBuffer::initialize()` we initialize the TLAB using current available TLAB capacity for the thread. In G1, this can be zero in some situations, leading to that div by zero (see the CR for the crash when adding an assert). > The suggested fix is to just not sample at this point. TLAB resizing will fix TLAB sizing up. > > Only G1 seems to be affected as it seems to be the only gc that uses a dynamic value for the capacity available for TLAB allocation. Other GCs seem to just use total heap capacity (Z, Shenandoah) or eden capacity (Serial, Parallel). > Not sure if that is actually better and I think won't result in the expected behavior (every thread should reload TLABs `target_refills()` times per mutator time); since even with G1 at TLAB resizing time eden is maximal, this hiccup at initialization does not seem too bad. > > This may also be the cause for the behavior observed in https://bugs.openjdk.org/browse/JDK-8264798. 
> > Testing: gha > > Thanks, > Thomas Looking at where `_allocation_fraction` is accessed, wouldn't a variable capacity cause the alloc-amount to be miscalculated? I'd expect `capacity` to be const to more accurately track/predict #alloc-bytes. // ** sampling place ** // size_t capacity = Universe::heap()->tlab_capacity(thread()) / HeapWordSize; float alloc_frac = desired_size() * target_refills() / (float)capacity; _allocation_fraction.sample(alloc_frac); // ** where it's used ** // // Compute the next tlab size using expected allocation amount size_t alloc = (size_t)(_allocation_fraction.average() * (Universe::heap()->tlab_capacity(thread()) / HeapWordSize)); ------------- PR Comment: https://git.openjdk.org/jdk/pull/14121#issuecomment-1562956610 From tschatzl at openjdk.org Thu May 25 14:14:01 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 14:14:01 GMT Subject: RFR: 8308766: TLAB initialization may cause div by zero In-Reply-To: References: Message-ID: On Wed, 24 May 2023 11:50:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an FP div by zero? > > In `ThreadLocalAllocBuffer::initialize()` we initialize the TLAB using current available TLAB capacity for the thread. In G1, this can be zero in some situations, leading to that div by zero (see the CR for the crash when adding an assert). > The suggested fix is to just not sample at this point. TLAB resizing will fix TLAB sizing up. > > Only G1 seems to be affected as it seems to be the only gc that uses a dynamic value for the capacity available for TLAB allocation. Other GCs seem to just use total heap capacity (Z, Shenandoah) or eden capacity (Serial, Parallel). > Not sure if that is actually better and I think won't result in the expected behavior (every thread should reload TLABs `target_refills()` times per mutator time); since even with G1 at TLAB resizing time eden is maximal, this hiccup at initialization does not seem too bad. 
> > This may also be the cause for the behavior observed in https://bugs.openjdk.org/browse/JDK-8264798. > > Testing: gha > > Thanks, > Thomas > Looking at where `_allocation_fraction` is accessed, wouldn't a variable capacity cause the alloc-amount to be miscalculated? I'd expect `capacity` to be const to more accurately track/predict #alloc-bytes. > > ```c++ > // ** sampling place ** // > size_t capacity = Universe::heap()->tlab_capacity(thread()) / HeapWordSize; > float alloc_frac = desired_size() * target_refills() / (float)capacity; > _allocation_fraction.sample(alloc_frac); > > // ** where it's used ** // > // Compute the next tlab size using expected allocation amount > size_t alloc = (size_t)(_allocation_fraction.average() * > (Universe::heap()->tlab_capacity(thread()) / HeapWordSize)); > ``` Where the capacity is used, during the GC pause, in G1 `Universe::heap()->tlab_capacity` is effectively a constant, reflecting the eden size for the next mutator phase. There is a problem what to do during TLAB initialization when attaching a random thread: eden can be partially exhausted as it can happen at any time when the mutator is running: do you want to have `target_refills()` reloads until eden is exhausted, or as if the thread ran since the start of the mutator phase or something completely different. Serial and parallel calculate it as if eden were empty, Shenandoah and Z seem to use total heap capacity (they're single-generational), and G1 uses the remaining eden capacity, with different effects. (Fwiw, if there is an issue with that logic, it is pre-existing). 
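To make the zero-capacity case discussed in this thread concrete, here is a minimal sketch of the sampling step with a guard, assuming that skipping the sample is all the fix needs to do. `initial_alloc_fraction` and its parameters are hypothetical names for illustration only; they are not HotSpot code, and the real logic lives in `ThreadLocalAllocBuffer::initialize()`.

```c++
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the sampling step quoted above.
// Returns no value when the TLAB capacity is zero, so the caller can
// skip sampling instead of dividing by zero.
std::optional<float> initial_alloc_fraction(size_t desired_size_words,
                                            size_t target_refills,
                                            size_t capacity_words) {
  if (capacity_words == 0) {
    // E.g. G1 with eden exhausted while a thread attaches.
    return std::nullopt;
  }
  return (float)(desired_size_words * target_refills) / (float)capacity_words;
}
```

A caller would feed the result into the average only when a value is present, leaving the later TLAB resizing to correct the size.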
------------- PR Comment: https://git.openjdk.org/jdk/pull/14121#issuecomment-1562981802 From tschatzl at openjdk.org Thu May 25 14:33:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 14:33:03 GMT Subject: RFR: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 In-Reply-To: References: Message-ID: <07ZGzW4EsQ14zqWH_q_TrqMq6-lc902FEP5hpUTvHqY=.f218f543-f853-4c10-a8f2-044485759e7e@github.com> On Thu, 25 May 2023 08:46:54 GMT, Stefan Karlsson wrote: > The test fails intermittently in early tiers on Windows x64. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14144#pullrequestreview-1444034450 From tschatzl at openjdk.org Thu May 25 14:35:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 14:35:03 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 13:51:14 GMT, Ivan Walulya wrote: >> Hi all, >> >> can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? >> >> In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. >> >> Testing: gha, manual testing as below >> >> There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. 
>> >> Here's the problematic case: >> >> >> $java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version >> >> [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000] >> [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000] >> [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] >> [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] >> [0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). Total size: 2097152B >> [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] >> [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] >> [0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000] >> [0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000] >> openjdk version "21-internal" 2023-09-19 >> >> >> (The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works) >> >> I.e. the CDS regions are unconditionally uncommitted even through `-Xms == -Xmx`. >> >> The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. I will file an enhancement here, as it is acceptable behavior (to me). >> >> >> $ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version >> >> [0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] >> [0.048s][trace][gc,region ] G1HR AC... 
> > src/hotspot/share/gc/g1/heapRegionManager.cpp line 568: > >> 566: } >> 567: HeapRegion* curr_region = _regions.get_by_index(curr_index); >> 568: if (!curr_region->is_free()) { > > What happens with regions committed before this false return? They will be abandoned and in a weird state... I'll file a bug (this is pre-existing). It never happened because the only case where this is called is CDS loading which happens very early during execution. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1205609317 From tschatzl at openjdk.org Thu May 25 14:35:04 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 14:35:04 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 14:31:24 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/heapRegionManager.cpp line 568: >> >>> 566: } >>> 567: HeapRegion* curr_region = _regions.get_by_index(curr_index); >>> 568: if (!curr_region->is_free()) { >> >> What happens with regions committed before this false return? > > They will be abandoned and in a weird state... I'll file a bug (this is pre-existing). It never happened because the only case where this is called is CDS loading which happens very early during execution. I.e. the next region verification will complain. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1205610246 From shade at openjdk.org Thu May 25 15:03:58 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 May 2023 15:03:58 GMT Subject: RFR: 8308766: TLAB initialization may cause div by zero In-Reply-To: References: Message-ID: On Wed, 24 May 2023 11:50:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an FP div by zero? > > In `ThreadLocalAllocBuffer::initialize()` we initialize the TLAB using current available TLAB capacity for the thread. 
In G1, this can be zero in some situations, leading to that div by zero (see the CR for the crash when adding an assert). > The suggested fix is to just not sample at this point. TLAB resizing will fix TLAB sizing up. > > Only G1 seems to be affected as it seems to be the only gc that uses a dynamic value for the capacity available for TLAB allocation. Other GCs seem to just use total heap capacity (Z, Shenandoah) or eden capacity (Serial, Parallel). > Not sure if that is actually better and I think won't result in the expected behavior (every thread should reload TLABs `target_refills()` times per mutator time); since even with G1 at TLAB resizing time eden is maximal, this hiccup at initialization does not seem too bad. > > This may also be the cause for the behavior observed in https://bugs.openjdk.org/browse/JDK-8264798. > > Testing: gha > > Thanks, > Thomas OK, so this does happen when a new thread comes at unfortunate time in VM lifecycle, like on shutdown? Anyway, the fix looks okay. I think many other versions are also affected, can you please add relevant Affected-Versions to the bug? ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14121#pullrequestreview-1444103801 From eosterlund at openjdk.org Thu May 25 15:16:56 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 25 May 2023 15:16:56 GMT Subject: RFR: 8308589: gc/cslocker/TestCSLocker.java timed out In-Reply-To: References: Message-ID: On Thu, 25 May 2023 13:43:55 GMT, Stefan Karlsson wrote: > We have found that this test is flawed and will cause deadlocks if allocations wait for a GC to complete. We tried to fix this issue by removing one source of allocations (see JDK-8308043), but that there are still more reasons why the JVM might allocate memory in the test. The test has a history of causing timeouts (likely caused by deadlocks), but we're currently only seeing hangs with Generational ZGC. 
I propose that we turn off this test for Generational ZGC, and if the test still causes problems in other configurations then we'll reevaluate if this should be handled some other way. Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14150#pullrequestreview-1444128154 From gli at openjdk.org Thu May 25 15:34:03 2023 From: gli at openjdk.org (Guoxiong Li) Date: Thu, 25 May 2023 15:34:03 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: On Thu, 25 May 2023 13:31:59 GMT, Albert Mingkun Yang wrote: > Do you have a small example to trigger `java.lang.OutOfMemoryError: GC Overhead Limit Exceeded` for Serial GC? I was under the impression that Serial doesn't support `UseGCOverheadLimit`. I re-read the related code and now I think you are right. Currently, only the parallel GC supports `UseGCOverheadLimit`. In detail, the methods `GCOverheadChecker::check_gc_overhead_limit` and `AdaptiveSizePolicy::check_gc_overhead_limit` are only used by `PSScavenge::invoke_no_policy` and `PSParallelCompact::invoke_no_policy` under the condition **UseAdaptiveSizePolicy is true**. And the `UseAdaptiveSizePolicy` is only used by the parallel GC, too. Several problems need to be confirmed before continuing the work: The `UseGCOverheadLimit` is only used when `UseAdaptiveSizePolicy` is true. Is it intentional? If it is intentional and only the parallel GC uses the `UseAdaptiveSizePolicy` now, should I remove the `UseGCOverheadLimit` related code in serial GC? Or should we implement the feature about `UseAdaptiveSizePolicy` and `UseGCOverheadLimit` in serial GC (seems a large change)?
------------- PR Comment: https://git.openjdk.org/jdk/pull/14120#issuecomment-1563110554 From eosterlund at openjdk.org Thu May 25 16:09:13 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 25 May 2023 16:09:13 GMT Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences [v2] In-Reply-To: References: Message-ID: > When a major GC in generational ZGC runs with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both a YC and an OC have passed since the allocation request was installed. > The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references.
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Assert and comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14122/files - new: https://git.openjdk.org/jdk/pull/14122/files/45b2b269..abd90fc3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14122&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14122&range=00-01 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14122/head:pull/14122 PR: https://git.openjdk.org/jdk/pull/14122 From eosterlund at openjdk.org Thu May 25 16:09:30 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 25 May 2023 16:09:30 GMT Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences In-Reply-To: References: Message-ID: On Wed, 24 May 2023 12:18:19 GMT, Erik Österlund wrote: > When a major GC in generational ZGC runs with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both a YC and an OC have passed since the allocation request was installed. > The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references. I added an assert checking that we don't mess up pre-cleaning and a comment explaining why it holds.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14122#issuecomment-1563161226 From tschatzl at openjdk.org Thu May 25 16:09:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 16:09:55 GMT Subject: RFR: 8308766: TLAB initialization may cause div by zero In-Reply-To: References: Message-ID: On Thu, 25 May 2023 15:01:35 GMT, Aleksey Shipilev wrote: > OK, so this does happen when a new thread comes at unfortunate time in VM lifecycle, like on shutdown? Anyway, the fix looks okay. I think many other versions are also affected, can you please add relevant Affected-Versions to the bug? In this case, yes, a thread is attached on shutdown and you can get weird failures in other FP code (if you also enable FP exceptions, but it can leave something in a weird state apparently). I think (well I hope) it is also the cause for another similar bug (*) that caused crashes in G1 extremely intermittently (that has been closed as CNR at that point after it stopped appearing). That assert that tripped is something that I added for trying to reproduce [JDK-8264798](https://bugs.openjdk.org/browse/JDK-8264798), initially failed to do so, and then accidentally left in when testing another change.... I think it is worth cleaning up just in case. (*) That may just be wishful thinking... ------------- PR Comment: https://git.openjdk.org/jdk/pull/14121#issuecomment-1563159544 From tschatzl at openjdk.org Thu May 25 16:16:56 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 25 May 2023 16:16:56 GMT Subject: RFR: 8308766: TLAB initialization may cause div by zero In-Reply-To: References: Message-ID: On Wed, 24 May 2023 11:50:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an FP div by zero? > > In `ThreadLocalAllocBuffer::initialize()` we initialize the TLAB using current available TLAB capacity for the thread. 
In G1, this can be zero in some situations, leading to that div by zero (see the CR for the crash when adding an assert). > The suggested fix is to just not sample at this point. TLAB resizing will fix TLAB sizing up. > > Only G1 seems to be affected as it seems to be the only gc that uses a dynamic value for the capacity available for TLAB allocation. Other GCs seem to just use total heap capacity (Z, Shenandoah) or eden capacity (Serial, Parallel). > Not sure if that is actually better and I think won't result in the expected behavior (every thread should reload TLABs `target_refills()` times per mutator time); since even with G1 at TLAB resizing time eden is maximal, this hiccup at initialization does not seem too bad. > > This may also be the cause for the behavior observed in https://bugs.openjdk.org/browse/JDK-8264798. > > Testing: gha > > Thanks, > Thomas Added affects version back to JDK 8 since that code and the `tlab_capacity()` implementation are the same as they are now. Maybe other circumstances prevent this from happening. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14121#issuecomment-1563171196 From stefank at openjdk.org Thu May 25 16:23:56 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 25 May 2023 16:23:56 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: References: Message-ID: <8B3cSjxnz2cIpdG_NWwBI8KpcAjPSMUCMS0-LmJUR98=.931fb4a9-deaa-47d5-897d-483562aefcc3@github.com> On Thu, 25 May 2023 15:41:23 GMT, Erik Österlund wrote: > It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs.
This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. Could this be using CollectedHeap::keep_alive instead of performing this clear/store dance? ------------- PR Review: https://git.openjdk.org/jdk/pull/14154#pullrequestreview-1444260225 From eosterlund at openjdk.org Thu May 25 17:41:56 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Thu, 25 May 2023 17:41:56 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: <8B3cSjxnz2cIpdG_NWwBI8KpcAjPSMUCMS0-LmJUR98=.931fb4a9-deaa-47d5-897d-483562aefcc3@github.com> References: <8B3cSjxnz2cIpdG_NWwBI8KpcAjPSMUCMS0-LmJUR98=.931fb4a9-deaa-47d5-897d-483562aefcc3@github.com> Message-ID: On Thu, 25 May 2023 16:20:57 GMT, Stefan Karlsson wrote: > Could this be using CollectedHeap::keep_alive instead of performing this clear/store dance? No, because the problem isn't keeping the object alive. It's already kept alive using handles. The problem is that the snapshot of strong roots is not processed during marking, leading to ABA problems, causing things to go wrong later on. Therefore I wanted to model the solution conceptually as close as possible to other handles. Admittedly, it does look more like an exotic rain dance than ideal. Just got to clap our hands, spin around 360 degrees and then jump - then clap our hands again! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14154#issuecomment-1563275336 From coleenp at openjdk.org Thu May 25 20:34:55 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 25 May 2023 20:34:55 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: References: Message-ID: On Thu, 25 May 2023 15:41:23 GMT, Erik Österlund wrote: > It is illegal to remove strong roots concurrently without clearing them first.
A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. This is strange but if it works, it's ok with me. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14154#pullrequestreview-1444608866 From dcubed at openjdk.org Thu May 25 20:46:00 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Thu, 25 May 2023 20:46:00 GMT Subject: RFR: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 In-Reply-To: References: Message-ID: On Thu, 25 May 2023 12:58:58 GMT, Stefan Karlsson wrote: >> The test fails intermittently in early tiers on Windows x64. > > Thanks for the review! @stefank - You did your "/integrate" before the PR was marked "ready" so it didn't get executed. You need to do it again... ------------- PR Comment: https://git.openjdk.org/jdk/pull/14144#issuecomment-1563483888 From eosterlund at openjdk.org Thu May 25 20:55:54 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Thu, 25 May 2023 20:55:54 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: References: Message-ID: On Thu, 25 May 2023 20:32:16 GMT, Coleen Phillimore wrote: >> It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. 
Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. > > This is strange but if it works, it's ok with me. Thanks for the review @coleenp! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14154#issuecomment-1563493181 From aboldtch at openjdk.org Fri May 26 07:00:55 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 May 2023 07:00:55 GMT Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences [v2] In-Reply-To: References: Message-ID: On Thu, 25 May 2023 16:09:13 GMT, Erik Österlund wrote: >> When a major GC in generational ZGC runs with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both a YC and an OC have passed since the allocation request was installed. >> The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Assert and comment Marked as reviewed by aboldtch (Committer).
------------- PR Review: https://git.openjdk.org/jdk/pull/14122#pullrequestreview-1445318559 From aboldtch at openjdk.org Fri May 26 07:22:55 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 May 2023 07:22:55 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: References: Message-ID: On Thu, 25 May 2023 15:41:23 GMT, Erik Österlund wrote: > It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. src/hotspot/share/classfile/classLoaderData.cpp line 317: > 315: // No-op free handle > 316: // No-op allocate new handle using the same address > 317: NativeAccess<>::oop_store(p, obj); // Store the strong non-root Is there an issue if someone with an `OopHandle(p)` racingly reads null while a root is transitioning? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14154#discussion_r1206333887 From iwalulya at openjdk.org Fri May 26 07:44:56 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Fri, 26 May 2023 07:44:56 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 09:09:06 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive?
> > In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. > > Testing: gha, manual testing as below > > There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. > > Here's the problematic case: > > > $java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). Total size: 2097152B > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000] > [0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000] > openjdk version "21-internal" 2023-09-19 > > > (The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works) > > I.e. the CDS regions are unconditionally uncommitted even through `-Xms == -Xmx`. > > The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. 
I will file an enhancement here, as it is acceptable behavior (to me). > > > $ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000... Lgtm! ------------- Marked as reviewed by iwalulya (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14145#pullrequestreview-1445455680 From stefank at openjdk.org Fri May 26 07:56:08 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 26 May 2023 07:56:08 GMT Subject: RFR: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 In-Reply-To: References: Message-ID: On Thu, 25 May 2023 20:43:01 GMT, Daniel D. Daugherty wrote: >> Thanks for the review! > > @stefank - You did your "/integrate" before the PR was marked "ready" so it didn't > get executed. You need to do it again... @dcubed-ojdk Thanks for the notification. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14144#issuecomment-1563959489 From stefank at openjdk.org Fri May 26 07:56:10 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 26 May 2023 07:56:10 GMT Subject: Integrated: 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 In-Reply-To: References: Message-ID: <4OdryNVY_de73jyvqegiCzpnDjFfryjApS6dEwsRLIw=.9a036c84-8a91-48e7-90ce-1e092c9169f2@github.com> On Thu, 25 May 2023 08:46:54 GMT, Stefan Karlsson wrote: > The test fails intermittently in early tiers on Windows x64. This pull request has now been integrated. 
Changeset: 7c072dbd Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/7c072dbd9dd0478c901daebf053884cdd8dad369 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8308844: ProblemList gc/z/TestHighUsage.java with Generational ZGC on windows x64 Reviewed-by: aboldtch, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/14144 From ayang at openjdk.org Fri May 26 09:40:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 May 2023 09:40:57 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: On Thu, 25 May 2023 15:31:29 GMT, Guoxiong Li wrote: > The UseGCOverheadLimit is only used when UseAdaptiveSizePolicy is true. Is it intentional? I can't see any dependency between them after skimming through their specification. Worth investigating in its own ticket/PR. Some background which might be useful: "UseGCOverheadLimit can work independently of UseAdaptiveSizePolicy" from https://bugs.openjdk.org/browse/JDK-8212206 > should I remove the UseGCOverheadLimit related code in serial GC? I lean towards removing it, as it is effectively dead code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14120#issuecomment-1564108773 From tschatzl at openjdk.org Fri May 26 09:50:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 26 May 2023 09:50:03 GMT Subject: RFR: 8306920: G1: Calculate garbage bytes for evacuation failed regions from marked live bytes In-Reply-To: References: Message-ID: On Wed, 24 May 2023 09:38:23 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring that uses `G1ConcurrentMark`'s live bytes/marking to collect the amount of live bytes for evacuation failed regions instead of calculating it piecemeal while removing self-forwards.
> > The reason is that the functionality to keep evacuation failed regions in the remembered sets to clear them out quickly ([JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)) needs the region's live bytes to determine whether the region is retained (i.e. put into the collection set candidates). > > The live bytes for a region will be required in G1's post evacuation phase 1, but currently is calculated in post evacuation phase 2. I.e. this change avoids splitting up post evacuation phase 2 and shuffling around phases (it also makes assignment of live bytes to evacuation failed regions non-incremental, which makes it imo easier to understand). > > The change does add, if there is an evacuation failure, a very short serial phase that calculates the final liveness bytes for a region (that is O(#worker threads)). > The reason for reusing `ConcurrentMark`'s liveness gathering infrastructure is because it's already there and there is no (problematic) overlap with its use during marking; i.e. marking only uses live byte array entries for regions that are marked through, and evacuation failure can only happen for regions in the (candidate) collection set, which g1 never marks through. > > Testing: tier1-5 > > Thanks, > Thomas The result of an internal discussion has been to do a more limited version of [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326) that does not need these changes. So I'm retracting this PR, and will follow up with [JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326) directly. If there is need, we'll revisit this again.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14118#issuecomment-1564119704 From tschatzl at openjdk.org Fri May 26 09:50:03 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 26 May 2023 09:50:03 GMT Subject: Withdrawn: 8306920: G1: Calculate garbage bytes for evacuation failed regions from marked live bytes In-Reply-To: References: Message-ID: On Wed, 24 May 2023 09:38:23 GMT, Thomas Schatzl wrote: > Hi all, > > please review this refactoring that uses `G1ConcurrentMark`'s live bytes/marking to collect the amount of live bytes for evacuation failed regions instead of calculating it piecemeal while removing self-forwards. > > The reason is that the functionality to keep evacuation failed regions in the remembered sets to clear them out quickly ([JDK-8140326](https://bugs.openjdk.org/browse/JDK-8140326)) needs the region's live bytes to determine whether the region is retained (i.e. put into the collection set candidates). > > The live bytes for a region will be required in G1's post evacuation phase 1, but currently is calculated in post evacuation phase 2. I.e. this change avoids splitting up post evacuation phase 2 and shuffling around phases (it also makes assignment of live bytes to evacuation failed regions non-incremental, which makes it imo easier to understand). > > The change does add, if there is an evacuation failure, a very short serial phase that calculates the final liveness bytes for a region (that is O(#worker threads)). > The reason for reusing `ConcurrentMark`'s liveness gathering infrastructure is because it's already there and there is no (problematic) overlap with its use during marking; i.e. marking only uses live byte array entries for regions that are marked through, and evacuation failure can only happen for regions in the (candidate) collection set, which g1 never marks through. > > Testing: tier1-5 > > Thanks, > Thomas This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/14118 From eosterlund at openjdk.org Fri May 26 10:15:57 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 26 May 2023 10:15:57 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: References: Message-ID: On Fri, 26 May 2023 07:20:27 GMT, Axel Boldt-Christmas wrote: >> It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. > > src/hotspot/share/classfile/classLoaderData.cpp line 317: > >> 315: // No-op free handle >> 316: // No-op allocate new handle using the same address >> 317: NativeAccess<>::oop_store(p, obj); // Store the strong non-root > > Is there an issue if someone with an `OopHandle(p)` racingly reads null while a root is transitioning? Shouldn't be, but I'll remove the null store and write some more comments, to be on the safe side.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14154#discussion_r1206543643 From eosterlund at openjdk.org Fri May 26 10:16:06 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 26 May 2023 10:16:06 GMT Subject: RFR: 8308009: Generational ZGC: OOM before clearing all SoftReferences [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 06:58:28 GMT, Axel Boldt-Christmas wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> Assert and comment > > Marked as reviewed by aboldtch (Committer). Thanks for the review @xmas92! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14122#issuecomment-1564153860 From eosterlund at openjdk.org Fri May 26 10:16:08 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 26 May 2023 10:16:08 GMT Subject: Integrated: 8308009: Generational ZGC: OOM before clearing all SoftReferences In-Reply-To: References: Message-ID: <9qz71YxsyUEBgltPhmybjsiKZeIfO8A07D5bo6ShxTY=.c2bd8df8-1660-4f5f-a9e4-b93b5bed787c@github.com> On Wed, 24 May 2023 12:18:19 GMT, Erik Österlund wrote: > When a major GC in generational ZGC runs with a different cause that doesn't pre-clean and doesn't clear soft references, we ask if there are allocations stalled on old. And part of that condition is to check if we are not stalled on young. So if an allocation request comes in just before such a "weak" major GC, we will say we won't clear soft references. But after that major collection we will satisfy all the constraints to throw OOM as both a YC and an OC have passed since the allocation request was installed. > The solution is to let the driver remember if it cleared soft references or not, and only throw OOM if it cleared soft references. This pull request has now been integrated.
Changeset: d3b9b364 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/d3b9b364da8c11c9b4dd14a6451a7b24f41202e7 Stats: 30 lines in 6 files changed: 14 ins; 4 del; 12 mod 8308009: Generational ZGC: OOM before clearing all SoftReferences Reviewed-by: stefank, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/14122 From ayang at openjdk.org Fri May 26 11:44:59 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 May 2023 11:44:59 GMT Subject: RFR: 8308948: Remove unimplemented ThreadLocalAllocBuffer::reset Message-ID: Trivial removing dead code. ------------- Commit messages: - tlab-trivial Changes: https://git.openjdk.org/jdk/pull/14175/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14175&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308948 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14175.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14175/head:pull/14175 PR: https://git.openjdk.org/jdk/pull/14175 From coleenp at openjdk.org Fri May 26 11:48:17 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 26 May 2023 11:48:17 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v47] In-Reply-To: References: Message-ID: On Mon, 22 May 2023 14:37:18 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header.
>> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The c... 
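[Editorial sketch, not part of the original message.] The bit layout listed in the quoted description can be illustrated with a small, self-contained C++ fragment. This is only a minimal sketch under assumptions: the names `encode_forwarding`, `decode_forwarding`, `DecodedForwarding` and the constants are hypothetical and are not taken from the patch; only the layout itself (bits 0 and 1 set to '11', bit 2 for fallback, bit 3 selecting the target region, bits 4..31 holding the heap-word offset) comes from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants mirroring the layout described in the RFR text.
constexpr uint32_t kForwardedBits = 0x3;            // bits 0..1 == '11': object is forwarded
constexpr uint32_t kFallbackBit   = 1u << 2;        // bit 2: forwardee lives in the fallback table
constexpr unsigned kRegionShift   = 3;              // bit 3: selects target region 0 or 1
constexpr unsigned kOffsetShift   = 4;              // bits 4..31: offset in heap words
constexpr uint32_t kMaxOffset     = (1u << 28) - 1; // largest offset that fits in 28 bits

// Pack a (target region index, word offset) pair into the low 32 header bits.
inline uint32_t encode_forwarding(unsigned target_index, uint32_t offset_words) {
  assert(target_index < 2);
  assert(offset_words <= kMaxOffset);
  return kForwardedBits | (target_index << kRegionShift) | (offset_words << kOffsetShift);
}

struct DecodedForwarding {
  bool     forwarded;     // bits 0..1 were '11'
  bool     fallback;      // bit 2 was set: consult the fallback hash table instead
  unsigned target_index;  // which of the two per-region target base addresses to use
  uint32_t offset_words;  // heap words from that target base address
};

// Unpack the fields again; the caller would add offset_words to the base
// address found in the per-region table (not modelled here).
inline DecodedForwarding decode_forwarding(uint32_t bits) {
  DecodedForwarding d;
  d.forwarded    = (bits & kForwardedBits) == kForwardedBits;
  d.fallback     = (bits & kFallbackBit) != 0;
  d.target_index = (bits >> kRegionShift) & 1u;
  d.offset_words = bits >> kOffsetShift;
  return d;
}
```

In this sketch a caller would compute the forwardee address as the table entry for the object's source region and `target_index`, plus `offset_words` heap words; the fallback path is only flagged, not implemented.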
> > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Specialize full-GC loops to get UseAltGCForwarding flag check out of hot paths Is it 7% slower when not testing the flag? FTR - I ran this patch and the https://github.com/openjdk/jdk/pull/13779 with the flag on and it passes tier 1-4. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1564267051 From tschatzl at openjdk.org Fri May 26 11:48:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 26 May 2023 11:48:55 GMT Subject: RFR: 8308948: Remove unimplemented ThreadLocalAllocBuffer::reset In-Reply-To: References: Message-ID: On Fri, 26 May 2023 11:37:49 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14175#pullrequestreview-1446056756 From ayang at openjdk.org Fri May 26 12:09:58 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 May 2023 12:09:58 GMT Subject: RFR: 8047998: -XX:MaxNewSize is unnecessarily set to NewSize if NewSize is too low In-Reply-To: References: Message-ID: On Tue, 9 May 2023 03:22:52 GMT, Guoxiong Li wrote: > Hi all, > > When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set in command line, > currently, the `max_young_size/MaxNewSize` will be set to the value of `NewSize`. > > Considering the document of the `NewSize` (shown below), someone may set the `NewSize` > to a very small value and expect the JVM to adjust the value dynamically. > Then when the `MaxHeapSize` is equal to `InitialHeapSize` (set by user or ergonomics), > the `MaxNewSize` is set to the value of `NewSize`, which is small unexpectedly. 
> > > product(size_t, NewSize, ScaleForWordSize(1*M), \ > "Initial new generation size (in bytes)") \ > constraint(NewSizeConstraintFunc,AfterErgo) \ > > > This patch fixes the issue by setting the `MaxNewSize` to `NewSize` only when the `NewSize` > is larger than the original `max_young_size/MaxNewSize`. > > The title of JDK-8047998 may need to adjusted. > > Thanks for the review. > > Best Regards, > -- Guoxiong As far as I understand it, there are potentially two issues here: 1. `InitialRAMPercentage` and `InitialHeapSize` control the same attribute but they have diff default value. 2. `MaxNewSize` can be silently ignored in certain cases. For example: java -XX:+UseSerialGC -XX:InitialHeapSize=256m -XX:MaxHeapSize=256M -XX:NewSize=8M -XX:MaxNewSize=80M '-Xlog:gc,gc+heap=trace' --version Partial output: `Initial young 8388608 Maximum young 8388608` meaning that `MaxNewSize=80M` is silently ignored/discarded/overwritten, which can be surprising, especially when `-XX:InitialHeapSize=256m` (or even larger value) is derived from the default and implicit `InitialRAMPercentage`. Regardless of issue 1, maybe emitting a warning/error would be less surprising in the case of issue 2. (Your concern is mostly issue 2, right?) ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1564293339 From gli at openjdk.org Fri May 26 12:35:55 2023 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 26 May 2023 12:35:55 GMT Subject: RFR: 8047998: -XX:MaxNewSize is unnecessarily set to NewSize if NewSize is too low In-Reply-To: References: Message-ID: On Fri, 26 May 2023 12:07:14 GMT, Albert Mingkun Yang wrote: > Regardless of issue 1, maybe emitting a warning/error would be less surprising in the case of issue 2. (Your concern is mostly issue 2, right?) Yes, the main problem is `issue 2`. 
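[Editorial sketch, not part of the original messages.] The difference between the current behaviour and the rule proposed in the RFR can be written down as two tiny functions. This is a simplified model under assumptions: `current_max_young` and `proposed_max_young` are hypothetical names, the functions only cover the case discussed in the thread (`MaxHeapSize == InitialHeapSize` with `NewSize` given on the command line), and they ignore alignment and all other ergonomics.

```cpp
#include <cstddef>

// Current behaviour as described in the thread: with a fixed-size heap and
// NewSize set on the command line, the maximum young size is simply
// overwritten with NewSize.
inline size_t current_max_young(size_t new_size, size_t /* max_new_size */) {
  return new_size;
}

// Proposed behaviour: only raise the maximum to NewSize when NewSize is
// larger than the previously computed maximum; otherwise keep the maximum.
inline size_t proposed_max_young(size_t new_size, size_t max_new_size) {
  return new_size > max_new_size ? new_size : max_new_size;
}
```

With the values from the example command line above (`-XX:NewSize=8M -XX:MaxNewSize=80M`), the first function yields 8388608 bytes, matching the reported "Maximum young 8388608", while the second would keep 80M.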
------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1564323241 From eosterlund at openjdk.org Fri May 26 12:47:04 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 26 May 2023 12:47:04 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently [v2] In-Reply-To: References: Message-ID: > It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Remove null store and improve comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14154/files - new: https://git.openjdk.org/jdk/pull/14154/files/b75fce32..7aa800be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14154&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14154&range=00-01 Stats: 29 lines in 1 file changed: 19 ins; 3 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14154.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14154/head:pull/14154 PR: https://git.openjdk.org/jdk/pull/14154 From stefank at openjdk.org Fri May 26 12:54:55 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 26 May 2023 12:54:55 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 12:47:04 GMT, Erik Österlund wrote: >> It is illegal
to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Remove null store and improve comments Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14154#pullrequestreview-1446169029 From aboldtch at openjdk.org Fri May 26 13:23:03 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Fri, 26 May 2023 13:23:03 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 12:47:04 GMT, Erik Österlund wrote: >> It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots.
> > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Remove null store and improve comments Is there any issue with memory order access here? From my understanding this works because a GC which starts before `demote_strong_roots` has the property if it observes `_keep_alive == 0` then it must have observed `demote_strong_roots` (either root throughout or it observed the transition through barriers). And if a GC starts after `demote_strong_roots` it must never observe `_keep_alive != 0`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14154#issuecomment-1564385165 From ayang at openjdk.org Fri May 26 13:40:54 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 26 May 2023 13:40:54 GMT Subject: RFR: 8047998: -XX:MaxNewSize is unnecessarily set to NewSize if NewSize is too low In-Reply-To: References: Message-ID: On Tue, 9 May 2023 03:22:52 GMT, Guoxiong Li wrote: > Hi all, > > When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set in command line, > currently, the `max_young_size/MaxNewSize` will be set to the value of `NewSize`. > > Considering the document of the `NewSize` (shown below), someone may set the `NewSize` > to a very small value and expect the JVM to adjust the value dynamically. > Then when the `MaxHeapSize` is equal to `InitialHeapSize` (set by user or ergonomics), > the `MaxNewSize` is set to the value of `NewSize`, which is small unexpectedly. > > > product(size_t, NewSize, ScaleForWordSize(1*M), \ > "Initial new generation size (in bytes)") \ > constraint(NewSizeConstraintFunc,AfterErgo) \ > > > This patch fixes the issue by setting the `MaxNewSize` to `NewSize` only when the `NewSize` > is larger than the original `max_young_size/MaxNewSize`. > > The title of JDK-8047998 may need to adjusted. > > Thanks for the review.
> > Best Regards, > -- Guoxiong OK, maybe sth along these lines: if (MaxHeapSize == InitialHeapSize) { if (FLAG_IS_CMDLINE(NewSize) && FLAG_IS_CMDLINE(MaxNewSize) && NewSize != MaxNewSize) { vm_exit_during_initialization(...); } ... } I think exiting here is fine because this is an impossible constraint to satisfy, variable young-gen size in constant whole-heap setup. What do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1564410166 From stefank at openjdk.org Fri May 26 13:55:02 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 26 May 2023 13:55:02 GMT Subject: Integrated: 8308589: gc/cslocker/TestCSLocker.java timed out In-Reply-To: References: Message-ID: <-FFSHj4C3i9eSaZZ13ml798AsqlNnXbz4Y451jOTAf8=.a44adb7f-3fc1-4a04-9743-331275ae6c08@github.com> On Thu, 25 May 2023 13:43:55 GMT, Stefan Karlsson wrote: > We have found that this test is flawed and will cause deadlocks if allocations wait for a GC to complete. We tried to fix this issue by removing one source of allocations (see JDK-8308043), but there are still more reasons why the JVM might allocate memory in the test. The test has a history of causing timeouts (likely caused by deadlocks), but we're currently only seeing hangs with Generational ZGC. I propose that we turn off this test for Generational ZGC, and if the test still causes problems in other configurations then we'll reevaluate if this should be handled some other way. This pull request has now been integrated.
Changeset: cc0976bf Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/cc0976bf7fc41caa5abdaa23f4df00b1a5d5bfba Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod 8308589: gc/cslocker/TestCSLocker.java timed out Reviewed-by: eosterlund ------------- PR: https://git.openjdk.org/jdk/pull/14150 From eosterlund at openjdk.org Fri May 26 14:00:59 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 26 May 2023 14:00:59 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 12:47:04 GMT, Erik Österlund wrote: >> It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Remove null store and improve comments And whether the GC observes the old or new value of the keep alive counter doesn't matter either, as at least one thread ensures the oop location gets processed. If the GC sees the counter as 0 in this race, the mutator will process the oop location as I described. And then at least one thread has processed it.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14154#issuecomment-1564436428 From eosterlund at openjdk.org Fri May 26 14:00:56 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 26 May 2023 14:00:56 GMT Subject: RFR: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 13:19:38 GMT, Axel Boldt-Christmas wrote: > Is there any issue with memory order access here? > From my understanding this works because a GC which starts before `demote_strong_roots` has the property if it observes `_keep_alive == 0` then it must have observed `demote_strong_roots` (either root throughout or it observed the transition through barriers). And if a GC starts after `demote_strong_roots` it must never observe `_keep_alive != 0`. If the GC races with these accesses, it doesn't matter if the race hits before or after the accesses. The accesses themselves will sync with the GC. To be precise: 1. If the object relocated, the load barrier will detect that and the new value stored will point to the to-space object. 2. The store barrier will load the previous value and enqueue for marking, so that we don't miss there was an oop in there that needed to be marked. 3. After the last store, the colors will be right with both ZGC collectors. GC will never perform non-monotonic transitions away from the best color, which the store ensures we will get. If the GC processes the location before these accesses, it will ensure the above properties. Whoever gets there first doesn't matter as long as one of them will.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14154#issuecomment-1564433544 From gli at openjdk.org Fri May 26 14:06:54 2023 From: gli at openjdk.org (Guoxiong Li) Date: Fri, 26 May 2023 14:06:54 GMT Subject: RFR: 8047998: -XX:MaxNewSize is unnecessarily set to NewSize if NewSize is too low In-Reply-To: References: Message-ID: On Tue, 9 May 2023 03:22:52 GMT, Guoxiong Li wrote: > Hi all, > > When `MaxHeapSize` is equal to `InitialHeapSize` and `NewSize` is set in command line, > currently, the `max_young_size/MaxNewSize` will be set to the value of `NewSize`. > > Considering the document of the `NewSize` (shown below), someone may set the `NewSize` > to a very small value and expect the JVM to adjust the value dynamically. > Then when the `MaxHeapSize` is equal to `InitialHeapSize` (set by user or ergonomics), > the `MaxNewSize` is set to the value of `NewSize`, which is small unexpectedly. > > > product(size_t, NewSize, ScaleForWordSize(1*M), \ > "Initial new generation size (in bytes)") \ > constraint(NewSizeConstraintFunc,AfterErgo) \ > > > This patch fixes the issue by setting the `MaxNewSize` to `NewSize` only when the `NewSize` > is larger than the original `max_young_size/MaxNewSize`. > > The title of JDK-8047998 may need to adjusted. > > Thanks for the review. > > Best Regards, > -- Guoxiong > OK, maybe sth along these lines: > > ```c++ > if (MaxHeapSize == InitialHeapSize) { > if (FLAG_IS_CMDLINE(NewSize) && FLAG_IS_CMDLINE(MaxNewSize) && NewSize != MaxNewSize) { > vm_exit_during_initialization(...); > } > ... > } > ``` > > I think exiting here is fine because this is an impossible constraint to satisfy, variable young-gen size in constant whole-heap setup. What do you think? The condition `MaxHeapSize == InitialHeapSize` is often met, so I don't think it is good to abort the VM. I do worry that such exiting may break some current online systems, which is unacceptable for us (doing nothing is better than breaking the online systems).
Considering that JDK 21 is an LTS version, I don't want to make such a breaking change. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1564444723 From tschatzl at openjdk.org Fri May 26 14:11:05 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 26 May 2023 14:11:05 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v2] In-Reply-To: <-Mz564sHo22B0cUdp7KPo7Q4Xv41cDVAT3Evsx5zonM=.7f8c5d28-f900-4b76-9bb2-9ff538dd19be@github.com> References: <-Mz564sHo22B0cUdp7KPo7Q4Xv41cDVAT3Evsx5zonM=.7f8c5d28-f900-4b76-9bb2-9ff538dd19be@github.com> Message-ID: On Tue, 23 May 2023 15:12:22 GMT, Ivan Walulya wrote: >> Please review this change which fixes the thread starvation problem during allocation for G1. >> >> The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. >> >> Starvation with an active GCLocker happens as below: >> >> 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. >> 2. GCLocker induced GC executes and frees some memory. >> 3. Thread A does not get any of that memory, but other threads also waiting for memory do. >> 4. Goto 1 until the gclocker retry count has been reached. >> >> In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint.
This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. >> >> Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. >> >> Testing: Tier 1-7 > > Ivan Walulya has updated the pull request incrementally with two additional commits since the last revision: > > - Make explicit checks for unclaimed allocations > - Thomas Review Changes requested by tschatzl (Reviewer). src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 906: > 904: bool G1CollectedHeap::upgrade_to_full_collection() { > 905: GCCauseSetter compaction(this, GCCause::_g1_compaction_pause); > 906: // Reset any allocated but unclaimed allocation requests. This comment would be nice (at least in addition) in the declaration in the .hpp file. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 981: > 979: } > 980: > 981: // Attempt to satisfy allocation requests failed; reset the requests, execute a full-gc, Suggestion: // Attempt to satisfy allocation requests failed; reset the requests, execute a full gc, src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 982: > 980: > 981: // Attempt to satisfy allocation requests failed; reset the requests, execute a full-gc, > 982: // then try again Suggestion: // then try again. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 998: > 996: > 997: // Attempt to satisfy allocation requests after full-gc also failed. We reset the allocation requests > 998: // then execute a maximal compaction full-gc before retrying the allocations Suggestion: // then execute a maximal compaction full-gc before retrying the allocations.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1018: > 1016: > 1017: for (StalledAllocReq* alloc_req; iter.next(&alloc_req);) { > 1018: alloc_req->set_state(StalledAllocReq::AllocationState::Failed); It does not make a difference at this point, but maybe it is somehow useful to keep "Succeeded" requests as such. This loop seems to unconditionally make all requests "Failed". (I can't think of a situation where this would make a difference, but maybe some threads can handle null return values/OOME, while others don't). src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 526: > 524: HeapWord** result, > 525: GCCause::Cause gc_cause); > 526: Indentation seems to be off here. ------------- PR Review: https://git.openjdk.org/jdk/pull/14077#pullrequestreview-1446169635 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1206744268 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1206744894 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1206745156 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1206745569 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1206736221 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1206743379 From gli at openjdk.org Sat May 27 11:02:55 2023 From: gli at openjdk.org (Guoxiong Li) Date: Sat, 27 May 2023 11:02:55 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: <6siXqALhd9c5Ej9TqoK4h4NmApvvc5T2RmfBPrnjhXA=.6b7758f2-bbdf-4123-a83b-c0e54aa89549@github.com> On Fri, 26 May 2023 09:38:25 GMT, Albert Mingkun Yang wrote: > I can't see any dependency btw them after skimming through their specification. Worth investigation in its own ticket/PR. 
> > Some background which might be useful, "UseGCOverheadLimit can work independently of UseAdaptiveSizePolicy" from https://bugs.openjdk.org/browse/JDK-8212206 Filed https://bugs.openjdk.org/browse/JDK-8308983 to follow up. > I am leaned towards removing it, as it is effectively dead code. I also think it is good to remove it. If I hear no objection in the next several days, I will submit another issue and PR to remove it and close this one. Or should I remove it in this PR? This needs to be confirmed here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14120#issuecomment-1565350288 From kbarrett at openjdk.org Sun May 28 02:46:38 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 28 May 2023 02:46:38 GMT Subject: RFR: 8308643: Incorrect value of 'used' jvmstat counter Message-ID: Please review this change that fixes the space-used performance counters provided by ParallelGC and SerialGC. The prior fix for a problem in this area (JDK-8268265) wasn't correct. It used a static "last value" variable, for use when sampling is blocked by an in-progress GC. But there are multiple counters, so having one static "last value" variable doesn't work. (Not sure what I and my reviewers were thinking at the time. Maybe we overlooked that there are multiple counters.) The solution is to associate a "last value" with each of the counters. Also added a test of basic functionality (just accessibility) of the space counters. No regression test of the problem being fixed here, as it's hard to reliably set up. Note: SpaceCounters and CSpaceCounters are very nearly identical. It's likely possible to refactor for more code sharing, which could be done as a followup.
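The per-counter "last value" idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `Space`, `SpaceUsedSampler`, and the `gc_active` flag are hypothetical names, and a plain atomic flag stands in for whatever mechanism actually blocks sampling while a GC is in progress.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap space whose usage is only safe to read
// while no GC is running.
struct Space {
  std::size_t used_bytes = 0;
};

// One sampler per counter. The cached value is an instance member, so two
// samplers can never overwrite each other's cache (which is what goes wrong
// when a single static variable is shared by all counters).
class SpaceUsedSampler {
  const Space* _space;
  const std::atomic<bool>* _gc_active;        // assumed "sampling blocked" signal
  std::atomic<std::size_t> _last_used_in_bytes;

public:
  SpaceUsedSampler(const Space* space, const std::atomic<bool>* gc_active)
    : _space(space), _gc_active(gc_active), _last_used_in_bytes(0) {}

  // Returns the current usage when sampling is allowed, otherwise the value
  // this particular counter reported last time.
  std::size_t take_sample() {
    if (!_gc_active->load()) {
      _last_used_in_bytes.store(_space->used_bytes);
    }
    return _last_used_in_bytes.load();
  }
};
```

With one sampler for each space, a blocked sample of one space keeps returning that space's own previous value rather than a value recorded for a different space.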
Testing: mach5 tier1 ------------- Commit messages: - test basic functionality - fix CSpaceCounters - fix SpaceCounters Changes: https://git.openjdk.org/jdk/pull/14195/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14195&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308643 Stats: 150 lines in 5 files changed: 107 ins; 4 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/14195.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14195/head:pull/14195 PR: https://git.openjdk.org/jdk/pull/14195 From duke at openjdk.org Sun May 28 21:08:55 2023 From: duke at openjdk.org (giorgigagnidze16) Date: Sun, 28 May 2023 21:08:55 GMT Subject: RFR: 8308643: Incorrect value of 'used' jvmstat counter In-Reply-To: References: Message-ID: On Sun, 28 May 2023 02:08:59 GMT, Kim Barrett wrote: > Please review this change that fixes the space-used performance counters > provided by ParallelGC and SerialGC. > > The prior fix for a problem in this area (JDK-8268265) wasn't correct. It > used a static "last value" variable, for use when sampling is blocked by an > in-progress GC. But there are multiple counters, so having one static "last > value" variable doesn't work. (Not sure what I and my reviewers were thinking > at the time. Maybe we overlooked that there are multiple counters.) The > solution is to associate a "last value" with each of the counters. > > Also added a test of basic functionality (just accessibility) of the space > counters. No regression test of the problem being fixed here, as its hard to > reliably set up. > > Note: SpaceCounters and CSpaceCounters are very nearly identical. It's likely > possible to refactor for more code sharing, which could be done as a followup. > > Testing: > mach5 tier1 Marked as reviewed by giorgigagnidze16 at github.com (no known OpenJDK username). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14195#pullrequestreview-1448423580 From ayang at openjdk.org Mon May 29 08:10:02 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 29 May 2023 08:10:02 GMT Subject: RFR: 8308948: Remove unimplemented ThreadLocalAllocBuffer::reset In-Reply-To: References: Message-ID: On Fri, 26 May 2023 11:37:49 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. Thanks for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14175#issuecomment-1566727008 From ayang at openjdk.org Mon May 29 08:10:03 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 29 May 2023 08:10:03 GMT Subject: Integrated: 8308948: Remove unimplemented ThreadLocalAllocBuffer::reset In-Reply-To: References: Message-ID: On Fri, 26 May 2023 11:37:49 GMT, Albert Mingkun Yang wrote: > Trivial removing dead code. This pull request has now been integrated. Changeset: 6360b499 Author: Albert Mingkun Yang URL: https://git.openjdk.org/jdk/commit/6360b4993163c91fb5d8f0a10429e3aac1e624ac Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod 8308948: Remove unimplemented ThreadLocalAllocBuffer::reset Reviewed-by: tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/14175 From ayang at openjdk.org Mon May 29 08:13:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 29 May 2023 08:13:57 GMT Subject: RFR: 8194823: Serial GC does not account GCs caused by TLAB allocation in GC overhead limit [v2] In-Reply-To: References: Message-ID: <7PQoFkIc--n8CWcJwLSksiSaIs1wk-MXDS39QsuLLQU=.3e8d1d80-12a6-4817-af60-0af3df248988@github.com> On Thu, 25 May 2023 11:29:27 GMT, Guoxiong Li wrote: >> Hi all, >> >> This patch enables the gc overhead limit when allocating TLAB in serial gc. >> The main modification is at `GenCollectedHeap::allocate_new_tlab` and the other >> files only adjust the parameters of the method `allocate_new_tlab`. >> >> Thanks for the review. 
>> >> Best Regards, >> -- Guoxiong > > Guoxiong Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix parameter list formatting issues. I think it's best to do the removing in another ticket/PR and link the two tickets. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14120#issuecomment-1566736064 From ayang at openjdk.org Mon May 29 11:45:57 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 29 May 2023 11:45:57 GMT Subject: RFR: 8047998: -XX:MaxNewSize is unnecessarily set to NewSize if NewSize is too low In-Reply-To: References: Message-ID: <72Ws56U5CQtUvtu54vhJGMRbfO87GRq5yNgIAUkhsYo=.eab80189-566f-4625-a6b6-2216f801c44e@github.com> On Fri, 26 May 2023 14:04:03 GMT, Guoxiong Li wrote: > The condition MaxHeapSize == InitialHeapSize is often meet True, but additionally specifying both `NewSize` and `MaxNewSize` is uncommon, IMO. > I do worry that such exiting may break some current online systems Actually, I think it's desirable to fail loudly if the JVM cmd flags contain errors. Hard to say whether this is an error or not, but debugging suboptimal performance (from incorrect heap/generation-size) is much harder. Therefore, notifying developers at jvm-startup when cmd flags can't be satisfied might be appreciated. Again, this setup should be rare. (I agree this is subjective.) 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1567027624 From gli at openjdk.org Mon May 29 14:55:55 2023 From: gli at openjdk.org (Guoxiong Li) Date: Mon, 29 May 2023 14:55:55 GMT Subject: RFR: 8047998: -XX:MaxNewSize is unnecessarily set to NewSize if NewSize is too low In-Reply-To: <72Ws56U5CQtUvtu54vhJGMRbfO87GRq5yNgIAUkhsYo=.eab80189-566f-4625-a6b6-2216f801c44e@github.com> References: <72Ws56U5CQtUvtu54vhJGMRbfO87GRq5yNgIAUkhsYo=.eab80189-566f-4625-a6b6-2216f801c44e@github.com> Message-ID: On Mon, 29 May 2023 11:43:08 GMT, Albert Mingkun Yang wrote: > > The condition MaxHeapSize == InitialHeapSize is often meet > > True, but additionally specifying both `NewSize` and `MaxNewSize` is uncommon, IMO. > > > I do worry that such exiting may break some current online systems > > Actually, I think it's desirable to fail loudly if the JVM cmd flags contain errors. Hard to say whether this is an error or not, but debugging suboptimal performance (from incorrect heap/generation-size) is much harder. Therefore, notifying developers at jvm-startup when cmd flags can't be satisfied might be appreciated. Again, this setup should be rare. > > (I agree this is subjective.) I will be OK with your previous suggestion (exiting the VM) if this PR is integrated into JDK 22, which means we should integrate it into mainline after June 08, 2023 [1]. What do you think about it?
[1] https://openjdk.org/projects/jdk/21/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/13876#issuecomment-1567237493 From eosterlund at openjdk.org Mon May 29 15:38:03 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 29 May 2023 15:38:03 GMT Subject: RFR: 8308752: Generational ZGC: Avoid final marking through stack chunks Message-ID: <-4SrCIDBMltA1BGL79xit4RoKS-3fJXgR-eQRlxn9t8=.78be0a59-813b-4b08-8fd6-fc70eb65570d@github.com> Single generation ZGC switches marking strength to strong unconditionally when marking through loom stack chunks, as there is no support for final marking through nmethod oops. We missed doing the same thing in generational ZGC, which is an oversight. This patch fixes that. ------------- Commit messages: - 8308752: Generational ZGC: Avoid final marking through stack chunks Changes: https://git.openjdk.org/jdk/pull/14204/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14204&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308752 Stats: 10 lines in 1 file changed: 1 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/14204.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14204/head:pull/14204 PR: https://git.openjdk.org/jdk/pull/14204 From dholmes at openjdk.org Tue May 30 01:18:54 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 30 May 2023 01:18:54 GMT Subject: RFR: 8309048: Remove malloc locker test case In-Reply-To: References: Message-ID: On Mon, 29 May 2023 13:14:24 GMT, Leo Korinth wrote: > There is a bunch of tests that are used to test critical section/gc locker. One of the test is named malloc. In that test, JNI code is doing a loop of `malloc()` followed `sleep()` followed by a `free()`. There is no JVM lock implementation to be tested on malloc/free. Let us save test time, code complexity and confusion by removing this test. > > (Oracle) hs-tier5 testing passed on x86-64. Looks fine. I couldn't find anything explaining the history of this test.
Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14201#pullrequestreview-1449946999 From duke at openjdk.org Tue May 30 03:11:09 2023 From: duke at openjdk.org (duke) Date: Tue, 30 May 2023 03:11:09 GMT Subject: Withdrawn: 8303968: Serial: Use more precise liveness info in Young GC reference processing In-Reply-To: <5Knsro7OsD8ur2n5XzLPi9iixuFupJre_l4rmwlXo2w=.ecf391a0-176a-4693-bfdb-90992573cabd@github.com> References: <5Knsro7OsD8ur2n5XzLPi9iixuFupJre_l4rmwlXo2w=.ecf391a0-176a-4693-bfdb-90992573cabd@github.com> Message-ID: On Fri, 10 Mar 2023 13:44:18 GMT, Albert Mingkun Yang wrote: > Simple refactoring around Young-GC keep-alive-closure. > > (I went for the static-local-var approach, as other solutions that I am aware of would expose the is-alive-closure in the header.) > > Test: tier1-3 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/12975 From stefank at openjdk.org Tue May 30 07:00:54 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 30 May 2023 07:00:54 GMT Subject: RFR: 8308752: Generational ZGC: Avoid final marking through stack chunks In-Reply-To: <-4SrCIDBMltA1BGL79xit4RoKS-3fJXgR-eQRlxn9t8=.78be0a59-813b-4b08-8fd6-fc70eb65570d@github.com> References: <-4SrCIDBMltA1BGL79xit4RoKS-3fJXgR-eQRlxn9t8=.78be0a59-813b-4b08-8fd6-fc70eb65570d@github.com> Message-ID: On Mon, 29 May 2023 15:31:02 GMT, Erik Österlund wrote: > Single generation ZGC switches marking strength to strong unconditionally when marking through loom stack chunks, as there is no support for final marking through nmethod oops. We missed doing the same thing in generational ZGC, which is an oversight. This patch fixes that. Marked as reviewed by stefank (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/14204#pullrequestreview-1450199312 From eosterlund at openjdk.org Tue May 30 07:33:04 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 30 May 2023 07:33:04 GMT Subject: Integrated: 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently In-Reply-To: References: Message-ID: On Thu, 25 May 2023 15:41:23 GMT, Erik Österlund wrote: > It is illegal to remove strong roots concurrently without clearing them first. A SATB collector with concurrent root scanning assumes that when strong roots disappear from the object graph, they are cleared first, which makes SATB notice the root. All global strong roots do this. Except CLD strong roots, which are turned into non-roots by decrementing the keep_alive counter to 0, when bootstrapping weak hidden class CLDs. This is not valid behaviour. This patch tries to treat these oops like we do any other global strong handles that are unlinked from the system: clear them when they stop being strong roots. This pull request has now been integrated. Changeset: 78aac241 Author: Erik Österlund URL: https://git.openjdk.org/jdk/commit/78aac241b8a3f29111e2901e8b7fbadd502a31a9 Stats: 50 lines in 2 files changed: 50 ins; 0 del; 0 mod 8308881: Strong CLD oop handle roots are demoted to non-roots concurrently Reviewed-by: stefank, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/14154 From iwalulya at openjdk.org Tue May 30 08:38:42 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 30 May 2023 08:38:42 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v3] In-Reply-To: References: Message-ID: > Please review this change which fixes the thread starvation problem during allocation for G1. > > The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active.
In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. > > Starvation with an active GCLocker happens as below: > > 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. > 2. GCLocker induced GC executes and frees some memory. > 3. Thread A does not get any of that memory, but other threads also waiting for memory. > 4. Goto 1 until the gclocker retry count has been reached. > > In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. > > Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. 
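The announcement-table approach described above can be sketched roughly as follows. This is a simplified illustration and not the G1 code from the PR: `AnnouncementTable`, `StalledRequest`, `announce`, and `satisfy_all_at_safepoint` are hypothetical names, the heap is reduced to a count of free words, and the policy for a request that does not fit (mark it failed and continue with the next one) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical names throughout; this is not the G1 implementation.
enum class AllocState { Pending, Succeeded, Failed };

struct StalledRequest {
  std::size_t word_size;
  AllocState state;
  explicit StalledRequest(std::size_t words)
    : word_size(words), state(AllocState::Pending) {}
};

class AnnouncementTable {
  std::mutex _lock;                      // protects _queue
  std::deque<StalledRequest*> _queue;    // requests in announcement order

public:
  // A thread on the slow allocation path announces its request before it
  // joins the race to trigger the GC.
  void announce(StalledRequest* req) {
    std::lock_guard<std::mutex> guard(_lock);
    _queue.push_back(req);
  }

  std::size_t pending() {
    std::lock_guard<std::mutex> guard(_lock);
    return _queue.size();
  }

  // Run by whichever thread won the race, after the collection has freed
  // memory. Every announced request leaves the table as either Succeeded
  // or Failed, so no announcer has to wait for yet another GC cycle.
  // Returns the number of free words left over.
  std::size_t satisfy_all_at_safepoint(std::size_t free_words) {
    std::lock_guard<std::mutex> guard(_lock);
    while (!_queue.empty()) {
      StalledRequest* req = _queue.front();
      _queue.pop_front();
      if (req->word_size <= free_words) {
        free_words -= req->word_size;
        req->state = AllocState::Succeeded;
      } else {
        req->state = AllocState::Failed;  // assumed policy: fail and continue
      }
    }
    return free_words;
  }
};
```

The point of the sketch is the ordering guarantee: a request that was announced before the safepoint is resolved by that safepoint, instead of its thread competing again with later allocators for the freed memory.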
> > Testing: Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Thomas review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14077/files - new: https://git.openjdk.org/jdk/pull/14077/files/a85c59c2..51fcf016 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14077&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14077&range=01-02 Stats: 15 lines in 3 files changed: 2 ins; 0 del; 13 mod Patch: https://git.openjdk.org/jdk/pull/14077.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14077/head:pull/14077 PR: https://git.openjdk.org/jdk/pull/14077 From tschatzl at openjdk.org Tue May 30 09:20:56 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 30 May 2023 09:20:56 GMT Subject: RFR: 8309048: Remove malloc locker test case In-Reply-To: References: Message-ID: On Mon, 29 May 2023 13:14:24 GMT, Leo Korinth wrote: > There is a bunch of tests that are used to test critical section/gc locker. One of the test is named malloc. In that test, JNI code is doing a loop of `malloc()` followed `sleep()` followed by a `free()`. There is no JVM lock implementation to be tested on malloc/free. Let us save test time, code complexity and confusion by removing this test. > > (Oracle) hs-tier5 testing passed on x86-64. Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14201#pullrequestreview-1450495229 From tschatzl at openjdk.org Tue May 30 09:23:58 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 30 May 2023 09:23:58 GMT Subject: RFR: 8308643: Incorrect value of 'used' jvmstat counter In-Reply-To: References: Message-ID: On Sun, 28 May 2023 02:08:59 GMT, Kim Barrett wrote: > Please review this change that fixes the space-used performance counters > provided by ParallelGC and SerialGC. > > The prior fix for a problem in this area (JDK-8268265) wasn't correct. 
It > used a static "last value" variable, for use when sampling is blocked by an > in-progress GC. But there are multiple counters, so having one static "last > value" variable doesn't work. (Not sure what I and my reviewers were thinking > at the time. Maybe we overlooked that there are multiple counters.) The > solution is to associate a "last value" with each of the counters. > > Also added a test of basic functionality (just accessibility) of the space > counters. No regression test of the problem being fixed here, as its hard to > reliably set up. > > Note: SpaceCounters and CSpaceCounters are very nearly identical. It's likely > possible to refactor for more code sharing, which could be done as a followup. > > Testing: > mach5 tier1 Marked as reviewed by tschatzl (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14195#pullrequestreview-1450500189 From tschatzl at openjdk.org Tue May 30 10:14:11 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 30 May 2023 10:14:11 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v3] In-Reply-To: References: Message-ID: On Tue, 30 May 2023 08:38:42 GMT, Ivan Walulya wrote: >> Please review this change which fixes the thread starvation problem during allocation for G1. >> >> The starvation problem is not limited to GCLocker, however, currently, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. >> >> Starvation with an active GCLocker happens as below: >> >> 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. >> 2. GCLocker induced GC executes and frees some memory. >> 3. Thread A does not get any of that memory, but other threads also waiting for memory. >> 4. Goto 1 until the gclocker retry count has been reached. 
>> >> In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation, announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or failed in case there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered. >> >> Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. >> >> Testing: Tier 1-7 > > Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: > > Thomas review Lgtm. Please fix the remaining additional minor nits before pushing. Some of the method descriptions need to be adapted (maybe add mention of the `node_index` param and such) too. src/hotspot/share/gc/g1/g1Allocator.cpp line 75: > 73: return has_mutator_alloc_region(node_index); > 74: } > 75: Seems unused now and can be removed. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 446: > 444: // point in trying to allocate further. We'll just return null. 
> 445: log_debug(gc, alloc)("%s: Failed to allocate " > 446: SIZE_FORMAT " words", Thread::current()->name(), word_size); Suggestion: log_debug(gc, alloc)("%s: Failed to allocate %zu words", Thread::current()->name(), word_size); src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1088: > 1086: const size_t payload_size = words - CollectedHeap::filler_array_hdr_size(); > 1087: const size_t len = payload_size * HeapWordSize / sizeof(jint); > 1088: assert((int)len >= 0, "size too large " SIZE_FORMAT " becomes %d", words, (int)len); Suggestion: assert((int)len >= 0, "size too large %zu becomes %d", words, (int)len); Use `%zu` for any new code. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1113: > 1111: > 1112: size_t expand_bytes = MAX2(word_size * HeapWordSize, MinHeapDeltaBytes); > 1113: log_debug(gc, ergo, heap)("Attempt heap expansion (allocation request failed). Allocation request: " SIZE_FORMAT "B", Since this method is touched, maybe fix the `SIZE_FORMAT` here too. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1124: > 1122: } > 1123: > 1124: HeapWord* G1CollectedHeap::expand_and_allocate(size_t word_size) { I think this method is unused now. Removing it also seems to orphan `attempt_allocation_at_safepoint(size_t, bool)`. ------------- Marked as reviewed by tschatzl (Reviewer). 
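The `%zu` suggestions in this review rely on the standard printf-family length modifier for `size_t`. A minimal standalone sketch, outside HotSpot and using plain `std::snprintf` instead of the HotSpot logging macros (the helper name `format_alloc_failure` is hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a message similar to the one in the review, using %zu for the
// size_t argument so that no platform-specific format macro is needed.
std::string format_alloc_failure(const char* thread_name, std::size_t word_size) {
  char buf[128];
  std::snprintf(buf, sizeof(buf), "%s: Failed to allocate %zu words",
                thread_name, word_size);
  return std::string(buf);
}
```

Because the conversion is part of the format string itself, the message stays one contiguous literal, which is easier to read and to search for than a literal split around a macro.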
PR Review: https://git.openjdk.org/jdk/pull/14077#pullrequestreview-1450520679 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1210048939 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1210007957 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1210005617 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1210007184 PR Review Comment: https://git.openjdk.org/jdk/pull/14077#discussion_r1210046698 From ayang at openjdk.org Tue May 30 11:39:59 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 30 May 2023 11:39:59 GMT Subject: RFR: 8308643: Incorrect value of 'used' jvmstat counter In-Reply-To: References: Message-ID: On Sun, 28 May 2023 02:08:59 GMT, Kim Barrett wrote: > Please review this change that fixes the space-used performance counters > provided by ParallelGC and SerialGC. > > The prior fix for a problem in this area (JDK-8268265) wasn't correct. It > used a static "last value" variable, for use when sampling is blocked by an > in-progress GC. But there are multiple counters, so having one static "last > value" variable doesn't work. (Not sure what I and my reviewers were thinking > at the time. Maybe we overlooked that there are multiple counters.) The > solution is to associate a "last value" with each of the counters. > > Also added a test of basic functionality (just accessibility) of the space > counters. No regression test of the problem being fixed here, as its hard to > reliably set up. > > Note: SpaceCounters and CSpaceCounters are very nearly identical. It's likely > possible to refactor for more code sharing, which could be done as a followup. > > Testing: > mach5 tier1 src/hotspot/share/gc/parallel/spaceCounters.hpp line 42: > 40: PerfVariable* _capacity; > 41: PerfVariable* _used; > 42: volatile size_t _last_used_in_bytes; Why is an additional field required here? Can't `_used` fulfill the purpose? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14195#discussion_r1210143811 From coleenp at openjdk.org Tue May 30 12:16:25 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 30 May 2023 12:16:25 GMT Subject: RFR: 8305896: Alternative full GC forwarding [v47] In-Reply-To: References: Message-ID: On Mon, 22 May 2023 14:37:18 GMT, Roman Kennke wrote: >> Currently, the full-GC modes of Serial, Shenandoah and G1 GCs are forwarding objects by over-writing the object header with the new object location. Unfortunately, for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) this would not work, because the crucial class information is also stored in the header, and we could no longer iterate over objects until the headers would be restored. Also, the preserved-headers tables would grow quite large. >> >> I propose to use an alternative algorithm for full-GC (sliding-GC) forwarding that uses a special encoding so that the forwarding information fits into the lowest 32 bits of the header. >> >> It exploits the insight that, with sliding GCs, objects from one region will only ever be forwarded to one of two possible target regions. For this to work, we need to divide the heap into equal-sized regions. This is already the case for Shenandoah and G1, and can easily be overlaid for Serial GC, by assuming either the whole heap as a single region (if it fits) or by using SpaceAlignment-sized virtual regions. >> >> We also build and maintain a table that has N elements, where N is the number of regions. Each entry is two addresses, which are the start-address of the possible target regions for each source region. >> >> With this, forwarding information would be encoded like this: >> - Bits 0 and 1: same as before, we put in '11' to indicate that the object is forwarded. >> - Bit 2: Used for 'fallback'-forwarding (see below) >> - Bit 3: Selects the target region 0 or 1. 
Look up the base address in the table (see above) >> - Bits 4..31 The number of heap words from the target base address >> >> This works well for all sliding GCs in Serial, G1 and Shenandoah. The exception is in G1, there is a special mode called 'serial compaction' which acts as a last-last-ditch effort to squeeze more space out of the heap by re-forwarding the tails of the compaction chains. Unfortunately, this breaks the assumption of the sliding-forwarding-table. When that happens, we initialize a fallback table, which is a simple open hash-table, and set the Bit 2 in the forwarding to indicate that we shall look up the forwardee in the fallback-table. >> >> All the table accesses can be done unsynchronized because: >> - Serial GC is single-threaded anyway >> - In G1 and Shenandoah, GC worker threads divide up the work such that each worker does disjoint sets of regions. >> - G1 serial compaction is single-threaded >> >> The c... > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Specialize full-GC loops to get UseAltGCForwarding flag check out of hot paths I ran this patch as of and the #13779 with the flag and it passed tier5-7 linux-x64-debug. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13582#issuecomment-1568327358 From kbarrett at openjdk.org Tue May 30 13:54:58 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 30 May 2023 13:54:58 GMT Subject: RFR: 8308643: Incorrect value of 'used' jvmstat counter In-Reply-To: References: Message-ID: On Tue, 30 May 2023 11:37:29 GMT, Albert Mingkun Yang wrote: >> Please review this change that fixes the space-used performance counters >> provided by ParallelGC and SerialGC. >> >> The prior fix for a problem in this area (JDK-8268265) wasn't correct. It >> used a static "last value" variable, for use when sampling is blocked by an >> in-progress GC. 
But there are multiple counters, so having one static "last >> value" variable doesn't work. (Not sure what I and my reviewers were thinking >> at the time. Maybe we overlooked that there are multiple counters.) The >> solution is to associate a "last value" with each of the counters. >> >> Also added a test of basic functionality (just accessibility) of the space >> counters. No regression test of the problem being fixed here, as its hard to >> reliably set up. >> >> Note: SpaceCounters and CSpaceCounters are very nearly identical. It's likely >> possible to refactor for more code sharing, which could be done as a followup. >> >> Testing: >> mach5 tier1 > > src/hotspot/share/gc/parallel/spaceCounters.hpp line 42: > >> 40: PerfVariable* _capacity; >> 41: PerfVariable* _used; >> 42: volatile size_t _last_used_in_bytes; > > Why is an additional field required here? Can't `_used` fulfill the purpose? The _used counters only seem to be updated by GCs. The take_sample function reports "current" data. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14195#discussion_r1210309737 From ayang at openjdk.org Tue May 30 14:01:59 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 30 May 2023 14:01:59 GMT Subject: RFR: 8308643: Incorrect value of 'used' jvmstat counter In-Reply-To: References: Message-ID: <_ajTwaXeUNCGXRFCoc0A5dawGsbAQwIPd425fhjErts=.14eff994-14f8-417a-adc7-a5819f9a5181@github.com> On Tue, 30 May 2023 13:52:21 GMT, Kim Barrett wrote: >> src/hotspot/share/gc/parallel/spaceCounters.hpp line 42: >> >>> 40: PerfVariable* _capacity; >>> 41: PerfVariable* _used; >>> 42: volatile size_t _last_used_in_bytes; >> >> Why is an additional field required here? Can't `_used` fulfill the purpose? > > The _used counters only seem to be updated by GCs. The take_sample function reports "current" data. Can `take_sample` update `_used` instead? IOW, what's the fundamental reason for having two distinct counters here? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14195#discussion_r1210320579 From tschatzl at openjdk.org Tue May 30 14:19:56 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 30 May 2023 14:19:56 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 14:32:04 GMT, Thomas Schatzl wrote: >> They will be abandoned and in a weird state... I'll file a bug (this is pre-existing). It never happened because the only case where this is called is CDS loading which happens very early during execution. > > I.e. the next region verification will complain. I filed https://bugs.openjdk.org/browse/JDK-8309117. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1210351256 From iwalulya at openjdk.org Tue May 30 17:10:58 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 30 May 2023 17:10:58 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Tue, 30 May 2023 14:17:37 GMT, Thomas Schatzl wrote: >> I.e. the next region verification will complain. > > I filed https://bugs.openjdk.org/browse/JDK-8309117. Thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1210576387 From coleenp at openjdk.org Tue May 30 19:06:54 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 30 May 2023 19:06:54 GMT Subject: RFR: 8309048: Remove malloc locker test case In-Reply-To: References: Message-ID: <7Z_jlhWc1MUfTir_0px1NbOE2EqKghADMGMo9ML2DLo=.9dd204f2-8876-4057-bc0a-6e5c65d261f1@github.com> On Mon, 29 May 2023 13:14:24 GMT, Leo Korinth wrote: > There is a bunch of tests that are used to test critical section/gc locker. One of the test is named malloc. In that test, JNI code is doing a loop of `malloc()` followed `sleep()` followed by a `free()`. There is no JVM lock implementation to be tested on malloc/free. 
Let us save test time, code complexity and confusion by removing this test. > > (Oracle) hs-tier5 testing passed on x86-64. This looks like a nice cleanup. I think we should migrate these tests into the tests/hotspot/gc directory and reconcile duplicates with other tests that do the same thing. This could be a future RFE/improvement. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14201#pullrequestreview-1451637547 From duke at openjdk.org Tue May 30 21:38:55 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 30 May 2023 21:38:55 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 09:09:06 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? > > In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. > > Testing: gha, manual testing as below > > There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. > > Here's the problematic case: > > > $java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). 
Total size: 2097152B > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000] > [0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000] > openjdk version "21-internal" 2023-09-19 > > > (The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works) > > I.e. the CDS regions are unconditionally uncommitted even through `-Xms == -Xmx`. > > The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. I will file an enhancement here, as it is acceptable behavior (to me). > > > $ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000... @tschatzl what if the regions allocated for archive objects are not uncommitted but just added to the free list in `dealloc_archive_regions`? That would also prevent heap going below Xms and would avoid tracking committed regions. Do you see any issue if the regions are not uncommitted? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14145#issuecomment-1569137865 From ayang at openjdk.org Tue May 30 21:42:56 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 30 May 2023 21:42:56 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 09:09:06 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? > > In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. > > Testing: gha, manual testing as below > > There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. > > Here's the problematic case: > > > $java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). 
Total size: 2097152B > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000] > [0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000] > openjdk version "21-internal" 2023-09-19 > > > (The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works) > > I.e. the CDS regions are unconditionally uncommitted even through `-Xms == -Xmx`. > > The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. I will file an enhancement here, as it is acceptable behavior (to me). > > > $ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000... src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 558: > 556: // Then note how much new space we have allocated. > 557: regions_committed = 0; > 558: uint regions_allocated = 0; Why is alloc-count required? Isn't it sth like `ceil(range/region_size)`? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1210850861 From lmesnik at openjdk.org Wed May 31 05:33:55 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 31 May 2023 05:33:55 GMT Subject: RFR: 8309048: Remove malloc locker test case In-Reply-To: References: Message-ID: On Mon, 29 May 2023 13:14:24 GMT, Leo Korinth wrote: > There is a bunch of tests that are used to test critical section/gc locker. One of the test is named malloc. In that test, JNI code is doing a loop of `malloc()` followed `sleep()` followed by a `free()`. There is no JVM lock implementation to be tested on malloc/free. Let us save test time, code complexity and confusion by removing this test. > > (Oracle) hs-tier5 testing passed on x86-64. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14201#pullrequestreview-1452234684 From tschatzl at openjdk.org Wed May 31 07:07:58 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 31 May 2023 07:07:58 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Tue, 30 May 2023 21:40:35 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? >> >> In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. >> >> Testing: gha, manual testing as below >> >> There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. 
> > src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 558: > >> 556: // Then note how much new space we have allocated. >> 557: regions_committed = 0; >> 558: uint regions_allocated = 0; > > Why is alloc-count required? Isn't it sth like `ceil(range/region_size)`? This adds implicit assumptions about how the range looks like, i.e. the start address; I think it is nicer if the one allocating the memory returns this value like any other information about the allocation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1211185747 From tschatzl at openjdk.org Wed May 31 07:16:56 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 31 May 2023 07:16:56 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Tue, 30 May 2023 21:36:26 GMT, Ashutosh Mehra wrote: > @tschatzl what if the regions allocated for archive objects are not uncommitted but just added to the free list in `dealloc_archive_regions`? That would also prevent heap going below Xms and would avoid tracking committed regions. Do you see any issue if the regions are not uncommitted? No real issue, but no real advantage either. In case of -Xms != -Xmx this would mean that the `main` method will not start with heap size of -Xms, but additional regions. Given that CDS is enabled by default, this would effectively mean that unless disabled, the VM would always (effectively) start (at `main`) with an initial heap size a bit larger than requested. This is allowed (initial heap size is somewhat of a hint), but unexpected I assume. The question is, should G1 try to honor requested initial heap size or not? 
(Similar to https://bugs.openjdk.org/browse/JDK-8308854) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14145#issuecomment-1569625576 From iwalulya at openjdk.org Wed May 31 08:38:43 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 31 May 2023 08:38:43 GMT Subject: RFR: 8308507: G1: GClocker induced GCs can starve threads requiring memory leading to OOME [v4] In-Reply-To: References: Message-ID: > Please review this change which fixes the thread starvation problem during allocation for G1. > > The starvation problem is not limited to GCLocker; currently, however, it manifests as an OOME only when GCLocker is active. In other cases, the starvation only affects the "starved" thread as it may loop indefinitely. > > Starvation with an active GCLocker happens as below: > > 1. Thread A tries to allocate memory as normal, and tries to start a GC; the GCLocker is active and so the thread gets stalled waiting for the GC. > 2. GCLocker induced GC executes and frees some memory. > 3. Thread A does not get any of that memory, but other threads also waiting for memory do. > 4. Go to 1 until the gclocker retry count has been reached. > > In this change, we take the general approach to solving starvation problems with announcement tables (request queues). On slow allocation, a thread that wishes to complete an Allocation GC and then attempt an allocation announces its allocation request before proceeding to participate in a race to execute a GC safepoint. Whichever thread succeeds in executing the Allocation GC safepoint will be tasked with completing all allocation requests that were announced before the safepoint. This guarantees that all announced allocation requests are either satisfied during the safepoint, or fail if there is not enough memory to complete all requests. This effectively deals with the starvation issue and reduces the number of allocation GCs triggered.
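The announcement scheme described above can be sketched roughly as follows. This is only an illustration of the general "announce, then let the safepoint winner complete everything" idea, not code from the PR: `AllocRequest`, `AnnouncementQueue`, `announce` and `complete_all` are hypothetical names, the free-memory accounting is a plain byte counter, and requests are served first come, first served, which is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical request record; not a HotSpot type.
struct AllocRequest {
  explicit AllocRequest(std::size_t b) : bytes(b) {}
  std::size_t bytes;       // requested size in bytes
  bool completed = false;  // set once a safepoint has handled the request
  bool satisfied = false;  // true if memory was handed out for it
};

// Hypothetical announcement table (request queue).
class AnnouncementQueue {
 public:
  // A thread on the slow allocation path announces its request before it
  // takes part in the race to execute the GC safepoint.
  void announce(AllocRequest* request) {
    assert(request != nullptr);
    std::lock_guard<std::mutex> guard(lock_);
    pending_.push_back(request);
  }

  // Called by whichever thread wins the race: handle every request that was
  // announced so far. Each request is either satisfied or marked as failed.
  // Returns the number of free bytes left over.
  std::size_t complete_all(std::size_t free_bytes) {
    std::deque<AllocRequest*> batch;
    {
      std::lock_guard<std::mutex> guard(lock_);
      batch.swap(pending_);
    }
    for (AllocRequest* request : batch) {
      if (request->bytes <= free_bytes) {
        free_bytes -= request->bytes;
        request->satisfied = true;
      }
      request->completed = true;
    }
    return free_bytes;
  }

  bool empty() {
    std::lock_guard<std::mutex> guard(lock_);
    return pending_.empty();
  }

 private:
  std::mutex lock_;
  std::deque<AllocRequest*> pending_;
};
```

In this sketch no announced request is left waiting after `complete_all` returns: a request that cannot be served is marked completed but not satisfied, so its owner does not have to retry indefinitely.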
> > Note: The change also adopts ZList from ZGC and makes it available under utilities as DoublyLinkedList with slight modifications. > > Testing: Tier 1-7 Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision: Thomas review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14077/files - new: https://git.openjdk.org/jdk/pull/14077/files/51fcf016..a544c066 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14077&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14077&range=02-03 Stats: 47 lines in 4 files changed: 1 ins; 31 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/14077.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14077/head:pull/14077 PR: https://git.openjdk.org/jdk/pull/14077 From lkorinth at openjdk.org Wed May 31 08:59:11 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Wed, 31 May 2023 08:59:11 GMT Subject: RFR: 8309048: Remove malloc locker test case In-Reply-To: References: Message-ID: On Mon, 29 May 2023 13:14:24 GMT, Leo Korinth wrote: > There is a bunch of tests that are used to test critical section/gc locker. One of the test is named malloc. In that test, JNI code is doing a loop of `malloc()` followed `sleep()` followed by a `free()`. There is no JVM lock implementation to be tested on malloc/free. Let us save test time, code complexity and confusion by removing this test. > > (Oracle) hs-tier5 testing passed on x86-64. Thanks David, Thomas, Coleen and Leonid! I am looking into further cleanups... ------------- PR Comment: https://git.openjdk.org/jdk/pull/14201#issuecomment-1569773788 From lkorinth at openjdk.org Wed May 31 08:59:12 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Wed, 31 May 2023 08:59:12 GMT Subject: Integrated: 8309048: Remove malloc locker test case In-Reply-To: References: Message-ID: On Mon, 29 May 2023 13:14:24 GMT, Leo Korinth wrote: > There is a bunch of tests that are used to test critical section/gc locker. 
One of the test is named malloc. In that test, JNI code is doing a loop of `malloc()` followed `sleep()` followed by a `free()`. There is no JVM lock implementation to be tested on malloc/free. Let us save test time, code complexity and confusion by removing this test. > > (Oracle) hs-tier5 testing passed on x86-64. This pull request has now been integrated. Changeset: 88236263 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/88236263dcea96dd0cb33c15367ce6e755a949e9 Stats: 241 lines in 9 files changed: 0 ins; 240 del; 1 mod 8309048: Remove malloc locker test case Reviewed-by: dholmes, tschatzl, coleenp, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/14201 From tschatzl at openjdk.org Wed May 31 09:41:04 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 31 May 2023 09:41:04 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: <_mpxRuZcO1xk6PC_4ME58NlCOiS6PAtYup1P7ha-8N8=.d21981a3-e50c-47f3-88a6-57a1252718ff@github.com> On Tue, 30 May 2023 17:08:07 GMT, Ivan Walulya wrote: >> I filed https://bugs.openjdk.org/browse/JDK-8309117. > > Thanks Actually I checked/tested again, and the result is that these unnecessarily committed regions will stay around as free regions. No other error or something. I adapted the CR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1211376911 From ayang at openjdk.org Wed May 31 10:20:58 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 31 May 2023 10:20:58 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Thu, 25 May 2023 09:09:06 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that properly deallocates/uncommits CDS archive regions when failing to load the CDS archive? 
> > In particular this caused the nuisance mentioned in the CR where even if -Xms==-Xmx, g1 uncommitted the heap memory anyway. > > Testing: gha, manual testing as below > > There is no (existing) way to induce CDS load errors easily, so what I did was adding `-XX:+UseNewCode` in `filemap.cpp:2202` to simulate failures when enabled. Obviously I removed the flag in this change. > > Here's the problematic case: > > > $java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms128m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000ffe00000, 0x00000000fff00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR ALLOC(OLD) [0x00000000fff00000, 0x00000000fff06278, 0x0000000100000000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR INACTIVE(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.050s][debug][gc,ergo,heap] Attempt heap shrinking (CDS archive regions). Total size: 2097152B > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000ffe00000, 0x00000000ffe00000, 0x00000000fff00000] > [0.050s][trace][gc,region ] G1HR UNCOMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.057s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffd00000, 0x00000000ffd00000, 0x00000000ffe00000] > [0.129s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000000ffc00000, 0x00000000ffc00000, 0x00000000ffd00000] > openjdk version "21-internal" 2023-09-19 > > > (The `GCCardSizeInBytes` option is there to decrease the minimum heap alignment to 512kb/1M so that setting `-Xms` to an odd value in a later test works) > > I.e. the CDS regions are unconditionally uncommitted even through `-Xms == -Xmx`. > > The next case just illustrates current (pre-existing) behavior with `-Xms != -Xmx`, showing that CDS regions are always committed, leading to higher than `-Xms` memory usage. 
I will file an enhancement here, as it is acceptable behavior (to me). > > > $ java -XX:GCCardSizeInBytes=128 -Xmx128m -Xms126m -Xlog:gc+region=trace,gc+ergo+heap=debug -XX:+UseNewCode -version > > [0.048s][trace][gc,region ] G1HR COMMIT(FREE) [0x00000000fff00000, 0x00000000fff00000, 0x0000000100000000] > [0.048s][trace][gc,region ] G1HR ACTIVE(FREE) [0x00000000fff00000, 0x00000... Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14145#pullrequestreview-1452797007 From ayang at openjdk.org Wed May 31 10:21:00 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 31 May 2023 10:21:00 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: On Wed, 31 May 2023 07:05:32 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 558: >> >>> 556: // Then note how much new space we have allocated. >>> 557: regions_committed = 0; >>> 558: uint regions_allocated = 0; >> >> Why is alloc-count required? Isn't it sth like `ceil(range/region_size)`? > > This adds implicit assumptions about how the range looks like, i.e. the start address; I think it is nicer if the one allocating the memory returns this value like any other information about the allocation. I think the closure below, `set_region_to_old`, already assumes the start-addr is region-start. In fact, I believe that is an implicit assumption shared by `alloc_archive_regions`, `dealloc_archive_regions`, and `populate_archive_regions_bot_part`. I'd prefer making it explicit, but YMMV. 
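The two ways of obtaining the region count discussed in this thread can be sketched as follows. This is an illustration only, not code from the PR: `kRegionSize`, `regions_spanned` and `regions_if_aligned` are hypothetical names, and the 1 MB region size is an assumption taken from the log excerpt quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed region size; the quoted log shows 1 MB regions.
constexpr std::uintptr_t kRegionSize = 1024 * 1024;

// Number of regions touched by [start, start + byte_size), for any start.
inline std::size_t regions_spanned(std::uintptr_t start, std::size_t byte_size) {
  assert(byte_size > 0);
  std::uintptr_t first = start / kRegionSize;
  std::uintptr_t last = (start + byte_size - 1) / kRegionSize;
  return static_cast<std::size_t>(last - first + 1);
}

// ceil(byte_size / region_size): only correct if start is region-aligned,
// which is the implicit assumption being debated.
inline std::size_t regions_if_aligned(std::uintptr_t start, std::size_t byte_size) {
  assert(start % kRegionSize == 0 && "start must be region-aligned");
  return (byte_size + kRegionSize - 1) / kRegionSize;
}
```

For the range in the quoted log (start `0xffe00000`, end `0xfff06278`) both functions give two regions. For a range that starts in the middle of a region, the `ceil` formula can undercount, which is why it needs the alignment precondition to be stated somewhere.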
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14145#discussion_r1211474701 From duke at openjdk.org Wed May 31 13:15:58 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 31 May 2023 13:15:58 GMT Subject: RFR: 8232722: G1 archive region deallocation may shrink the heap below -Xms In-Reply-To: References: Message-ID: <5-VfLL5nXFl3sLdktHnNcFLOjvtKV_fWRJASUsvqbGo=.7590385f-c518-454f-895c-2f441fa13d99@github.com> On Wed, 31 May 2023 07:14:11 GMT, Thomas Schatzl wrote: > No real issue, but no real advantage either. In case of -Xms != -Xmx this would mean that the main method will not start with heap size of -Xms, but additional regions. Right, that's the current behavior as well, and not uncommitting the regions in `dealloc_archive_regions` doesn't change that. The difference appears only when the archive region mapping fails. If the regions are uncommitted, it would bring the heap size back to Xms. If not, the heap size will remain slightly above Xms, which is the same as the current behavior when mapping succeeds. The advantage is avoiding the need to track committed regions, which avoids leaking more G1-specific details into CDS code. Adding GC-policy-specific details would make it harder to implement https://bugs.openjdk.org/browse/JDK-8296263. I am trying to move G1-specific code out of CDS in https://github.com/openjdk/jdk/pull/14208. So if not uncommitting simplifies things, then why not? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14145#issuecomment-1570213437 From tschatzl at openjdk.org Wed May 31 13:45:07 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 31 May 2023 13:45:07 GMT Subject: RFR: 8140326: G1: Consider putting regions where evacuation failed into next collection set Message-ID: This change adds management of retained regions, i.e. trying to evacuate evacuation failed regions asap.
The advantage is that evacuation failed regions do not need to wait until the next marking to be cleaned out; as they are often very sparsely occupied (often being eden regions), keeping them around wastes a lot of space, potentially causing additional evacuation failures later on. Another use of this change will be region pinning: pinned regions are basically evacuation failed regions that cannot be reclaimed as long as they are pinned; however, as soon as they are unpinned, they should be reclaimed for the same reasons as well. It consists of several behavioral changes: During garbage collection: ... in the Evacuation phase: * always collect dirty cards into evacuation failed regions proactively. In tests, the number of cards/live objects per evacuation failed region is typically very small. Dirty cards are always put into the global refinement buffer immediately, assuming that we will keep most if not all evacuation failed regions. ... during Post Evacuation 2/Free Collection Set phase: * determine whether the region will be retained (kept for "immediate" evacuation) or not. Highly occupied regions are assumed to stay (mostly) live at least until the next marking, so do not bother with them. These "retained" regions are collected in a new "from retained" set in the collection set candidates and managed separately from "from marking" regions. Having the "from retained" and "from marking" sets separate in the collection set candidates is easier to manage than using a single list and then picking from it, particularly with respect to making sure that mixed gcs preferentially pick from the "from marking" list first, then second from the "from retained" list. ... determining the collection set during the pause: * during gc, the collection set is preferentially (first) populated with regions from the "from marking" candidates (these are the important regions to get cleaned out), second from the "from retained" list.
* g1 reserves up to 20% of max gc pause time for retained regions as optional candidates (this is an arbitrarily chosen value) to make sure that these are cleared out asap to free memory. There is also a minimum number of regions to take from the retained regions list. During marking ... changes to marking proper * retained regions will not be marked through during concurrent mark, i.e. they are considered outside of the snapshot. So they are added to the "root regions" during the concurrent start pause. This may be a performance issue (we can't do a gc until all root regions have been marked through), but so far, since evacuation failed regions are typically very sparsely populated, this has been very fast. ... changes to scrubbing * during scrubbing, regions may now be reclaimed. That means scrubbing needs to be aware of (more) regions being reclaimed while working on them. During mutator time: ... try to accommodate retained candidate regions in the predictions, giving them at most 20% of pause time (again an arbitrary value) Testing: multiple tier1-5 runs, with forced verification on and/or induced evacuation failure ------------- Commit messages: - Fix trailing whitespace - This change adds management of retained regions, i.e.
trying to evacuate evacuation failed Changes: https://git.openjdk.org/jdk/pull/14220/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14220&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8140326 Stats: 484 lines in 31 files changed: 372 ins; 20 del; 92 mod Patch: https://git.openjdk.org/jdk/pull/14220.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14220/head:pull/14220 PR: https://git.openjdk.org/jdk/pull/14220 From ayang at openjdk.org Wed May 31 19:28:25 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Wed, 31 May 2023 19:28:25 GMT Subject: RFR: 8308766: TLAB initialization may cause div by zero In-Reply-To: References: Message-ID: On Wed, 24 May 2023 11:50:02 GMT, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that fixes an FP div by zero? > > In `ThreadLocalAllocBuffer::initialize()` we initialize the TLAB using current available TLAB capacity for the thread. In G1, this can be zero in some situations, leading to that div by zero (see the CR for the crash when adding an assert). > The suggested fix is to just not sample at this point. TLAB resizing will fix TLAB sizing up. > > Only G1 seems to be affected as it seems to be the only gc that uses a dynamic value for the capacity available for TLAB allocation. Other GCs seem to just use total heap capacity (Z, Shenandoah) or eden capacity (Serial, Parallel). > Not sure if that is actually better and I think won't result in the expected behavior (every thread should reload TLABs `target_refills()` times per mutator time); since even with G1 at TLAB resizing time eden is maximal, this hiccup at initialization does not seem too bad. > > This may also be the cause for the behavior observed in https://bugs.openjdk.org/browse/JDK-8264798. > > Testing: gha > > Thanks, > Thomas Thanks to Thomas' explanation, now I understand why it tracks the ratio instead of the actual alloc-amount. 
It's because (eden) capacity affects the distance btw two gc-pause (in STW GC), and alloc-amount is semi-proportional to gc-distance. Therefore, the ratio more or less reflects alloc-rate, which can be used to predict alloc-amount until the next gc-pause. However, maintaining a constant number of refills btw gc-pauses seems an odd objective; preexisting issue. ------------- Marked as reviewed by ayang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14121#pullrequestreview-1454004997 From dcubed at openjdk.org Wed May 31 20:45:47 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 31 May 2023 20:45:47 GMT Subject: Integrated: 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again Message-ID: A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again ------------- Commit messages: - 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again Changes: https://git.openjdk.org/jdk/pull/14253/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14253&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309236 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14253.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14253/head:pull/14253 PR: https://git.openjdk.org/jdk/pull/14253 From bpb at openjdk.org Wed May 31 20:45:48 2023 From: bpb at openjdk.org (Brian Burkhalter) Date: Wed, 31 May 2023 20:45:48 GMT Subject: Integrated: 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again In-Reply-To: References: Message-ID: On Wed, 31 May 2023 20:34:07 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again Marked as reviewed by bpb (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14253#pullrequestreview-1454119784 From azvegint at openjdk.org Wed May 31 20:45:49 2023 From: azvegint at openjdk.org (Alexander Zvegintsev) Date: Wed, 31 May 2023 20:45:49 GMT Subject: Integrated: 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again In-Reply-To: References: Message-ID: On Wed, 31 May 2023 20:34:07 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again Marked as reviewed by azvegint (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14253#pullrequestreview-1454120005 From dcubed at openjdk.org Wed May 31 20:45:50 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 31 May 2023 20:45:50 GMT Subject: Integrated: 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again In-Reply-To: References: Message-ID: On Wed, 31 May 2023 20:38:23 GMT, Brian Burkhalter wrote: >> A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again > > Marked as reviewed by bpb (Reviewer). @bplb and @azvegint - Thanks for the fast reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14253#issuecomment-1570917645 From dcubed at openjdk.org Wed May 31 20:45:52 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 31 May 2023 20:45:52 GMT Subject: Integrated: 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again In-Reply-To: References: Message-ID: On Wed, 31 May 2023 20:34:07 GMT, Daniel D. Daugherty wrote: > A trivial fix to ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again This pull request has now been integrated. Changeset: e42a4b65 Author: Daniel D. 
Daugherty URL: https://git.openjdk.org/jdk/commit/e42a4b659a78721567e4e882a26fe2972975bc80 Stats: 2 lines in 2 files changed: 2 ins; 0 del; 0 mod 8309236: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC and Generational ZGC again Reviewed-by: bpb, azvegint ------------- PR: https://git.openjdk.org/jdk/pull/14253