From wkemper at openjdk.org Wed Feb 1 00:11:07 2023 From: wkemper at openjdk.org (William Kemper) Date: Wed, 1 Feb 2023 00:11:07 GMT Subject: RFR: Update vestigial comment Message-ID: Update out of date comment pointed out by review of #209 . ------------- Commit messages: - Update vestigial comment Changes: https://git.openjdk.org/shenandoah/pull/211/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=211&range=00 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/shenandoah/pull/211.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/211/head:pull/211 PR: https://git.openjdk.org/shenandoah/pull/211 From ysr at openjdk.org Wed Feb 1 00:18:03 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 1 Feb 2023 00:18:03 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v5] In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 14:57:41 GMT, Kelvin Nilsen wrote: >> Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: >> >> jcheck: tab > > src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 951: > >> 949: >> 950: // ShenandoahRegionChunkIterator divides the total remembered set scanning effort into ShenandoahRegionChunks >> 951: // that are assigned one at a time to worker threads. (Here, we use the terms`assignments` and `chunks` > > Typo: need a space before assignments Fixed. ------------- PR: https://git.openjdk.org/shenandoah/pull/193 From ysr at openjdk.org Wed Feb 1 00:18:30 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 1 Feb 2023 00:18:30 GMT Subject: RFR: Update vestigial comment In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 00:06:19 GMT, William Kemper wrote: > Update out of date comment pointed out by review of #209 . Marked as reviewed by ysr (Author). 
-------------

PR: https://git.openjdk.org/shenandoah/pull/211

From ysr at openjdk.org Wed Feb 1 00:17:58 2023
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Wed, 1 Feb 2023 00:17:58 GMT
Subject: RFR: 8299703: GenShen: improvements in card scanning [v6]
In-Reply-To:
References:
Message-ID:

> **Main changes:**
> 1. `process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. When reading the diffs, it might be easiest to look at the new code rather than the side-by-side view.
> 2. The ShenandoahCardCluster class has been extended with a `block_start()` method, which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above.
> 3. The ShenandoahCardCluster class's `has_object()` method has been renamed to `starts_object()`, which more closely reflects the API.
> 4. The ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket).
> 5. Added some const annotations.
>
> **Testing & Implementation Notes:**
> 6. Tested with Extremem and SPECjbb, on fastdebug, release, and product builds, with and without verification enabled.
> 7. Preliminary performance data with an Extremem workload showed a roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75); p100 (max) was only marginally down, at 2%. The trend of the change was as expected, since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs.
> 8. More performance data, from SPECjbb and several different Extremem workloads, was gathered for both phases that use the `process_clusters()` code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124.
>
> **Acknowledgments**:
> 9. Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me.
>
> **Epilogue**:
> 10. Further performance improvements are possible, but are deferred for follow-up.

Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision:

  guarantees to asserts (non-production), and warnings (production);
  review feedback.

-------------

Changes:
  - all: https://git.openjdk.org/shenandoah/pull/193/files
  - new: https://git.openjdk.org/shenandoah/pull/193/files/f804468e..46a21b95

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=05
 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=04-05

  Stats: 16 lines in 2 files changed: 9 ins; 0 del; 7 mod
  Patch: https://git.openjdk.org/shenandoah/pull/193.diff
  Fetch: git fetch https://git.openjdk.org/shenandoah pull/193/head:pull/193

PR: https://git.openjdk.org/shenandoah/pull/193

From ysr at openjdk.org Wed Feb 1 00:18:00 2023
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Wed, 1 Feb 2023 00:18:00 GMT
Subject: RFR: 8299703: GenShen: improvements in card scanning [v5]
In-Reply-To: <9PzFIkRzUD2XozIfVni9oY5guaj771FVQhjjAsDmk3o=.8abeafd9-6020-4fcc-8c88-3f6e39a31ce9@github.com>
References: <9PzFIkRzUD2XozIfVni9oY5guaj771FVQhjjAsDmk3o=.8abeafd9-6020-4fcc-8c88-3f6e39a31ce9@github.com>
Message-ID:

On Tue, 31 Jan 2023 19:19:53 GMT, Y. Srinivas Ramakrishna wrote:

>> Or maybe just remove them?
>
> I'll change them to warnings.

Done: converted to asserts (debug/optimized), and warnings (release).
------------- PR: https://git.openjdk.org/shenandoah/pull/193 From wkemper at openjdk.org Wed Feb 1 00:28:30 2023 From: wkemper at openjdk.org (William Kemper) Date: Wed, 1 Feb 2023 00:28:30 GMT Subject: RFR: Tune heuristic defaults and behavior for improved stability In-Reply-To: <9Ycj251p0-0WQPdeq4O3Rw7BujITdQkKXNa93XZ4wdc=.51e3692e-5a2e-45c1-beea-626444d9b04d@github.com> References: <4G8AQT9r0oKv4t98V5RTTS2KkRyRnlK1t7KH2nRO0FY=.cbb3343d-9452-4735-bb43-9e9ea7cf7946@github.com> <9Ycj251p0-0WQPdeq4O3Rw7BujITdQkKXNa93XZ4wdc=.51e3692e-5a2e-45c1-beea-626444d9b04d@github.com> Message-ID: On Tue, 31 Jan 2023 20:33:07 GMT, Y. Srinivas Ramakrishna wrote: >> Also, some minor changes to logging. > > src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 184: > >> 182: "increases the sensitivity. ") \ >> 183: \ >> 184: product(double, ShenandoahAdaptiveDecayFactor, 0.1, EXPERIMENTAL, \ > > Is there performance data to share to inform this change? Yes! The maximum number of degenerated cycles observed on 10 runs of `specjbb2015` went from 48 to 9. Same metric for `extremem` went from 111 to 86. The effect is less dramatic for satb mode. 
There are also notable improvements on Dacapo's `jython` and `biojava`:

  +11.58% hyperalloc_a2048_o1536/context_switch_count p=0.00537
    Control: 2515.458 (+/-249.10 ) 90
    Test:    2806.800 (+/-145.41 ) 5

  +10.97% hyperalloc_a3072_o1536/context_switch_count p=0.00002
    Control: 3780.958 (+/-377.60 ) 90
    Test:    4195.600 (+/-103.85 ) 5

  +10.14% extremem/product_replacement_p50 p=0.00147
    Control: 1.793ms (+/-125.44us) 85
    Test:    1.975ms (+/-184.30us) 5

  +5.33% hyperalloc_a2048_o1536/cpu_user p=0.00004
    Control: 287.015s (+/- 7.43s ) 90
    Test:    302.300s (+/- 7.52s ) 5

  -162.25% jython/rss_max p=0.00000
    Control: 6410130.783 (+/-65440.95 ) 85
    Test:    2444264.800 (+/-73349.32 ) 5

  -153.73% jython/minor_page_fault_count p=0.00000
    Control: 1682775.464 (+/-18501.36 ) 85
    Test:    663203.400 (+/-22693.20 ) 5

  -146.98% biojava/minor_page_fault_count p=0.00000
    Control: 1566706.957 (+/-1318.36 ) 85
    Test:    634354.600 (+/-17082.50 ) 5

  -77.56% jython/cpu_system p=0.00000
    Control: 3.864s (+/- 0.15s ) 85
    Test:    2.176s (+/- 0.12s ) 5

  -38.79% extremem-phased/calculate_addresses p=0.00000
    Control: 433.731ms (+/- 58.95ms) 795
    Test:    312.516ms (+/- 32.93ms) 12

  -18.63% extremem-phased/minor_page_fault_count p=0.00000
    Control: 4272509.681 (+/-243509.95 ) 85
    Test:    3601418.400 (+/-70657.90 ) 5

-------------

PR: https://git.openjdk.org/shenandoah/pull/209

From kdnilsen at openjdk.org Wed Feb 1 02:39:25 2023
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Wed, 1 Feb 2023 02:39:25 GMT
Subject: RFR: Update vestigial comment
In-Reply-To:
References:
Message-ID:

On Wed, 1 Feb 2023 00:06:19 GMT, William Kemper wrote:

> Update out of date comment pointed out by review of #209 .

Marked as reviewed by kdnilsen (Committer).
------------- PR: https://git.openjdk.org/shenandoah/pull/211 From kdnilsen at openjdk.org Wed Feb 1 02:41:07 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 02:41:07 GMT Subject: RFR: Fullgc should honor aging cycle Message-ID: As previously implemented, Full GC always incremented object ages. It should only increment object ages if this is an aging cycle. This was detected during code review. This is a correctness improvement without known performance implications. ------------- Commit messages: - Merge remote-tracking branch 'GitFarmBranch/full-gc-should-honor-aging-cycle' into fullgc-should-honor-aging-cycle - Full GC should only increment object age during aging cycles Changes: https://git.openjdk.org/shenandoah/pull/212/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=212&range=00 Stats: 9 lines in 2 files changed: 8 ins; 0 del; 1 mod Patch: https://git.openjdk.org/shenandoah/pull/212.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/212/head:pull/212 PR: https://git.openjdk.org/shenandoah/pull/212 From kdnilsen at openjdk.org Wed Feb 1 03:03:10 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 03:03:10 GMT Subject: RFR: Loan from old should align on region size Message-ID: This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. 
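A minimal sketch of the alignment argument, for readers following the arithmetic. It is not the HotSpot code: `OldLoan`, `plan_loan()`, and the way the unused tail of the evacuation regions is folded into the allocation supplement are assumptions made for illustration, loosely modeled on the variable names quoted in the review. It shows how each addend can be unaligned while their sum is a whole number of regions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; field names loosely mirror the review thread.
struct OldLoan {
  size_t bytes_loaned_for_young_evac;          // may not be region-aligned
  size_t bytes_reserved_for_alloc_supplement;  // may not be region-aligned
  size_t total() const {
    return bytes_loaned_for_young_evac + bytes_reserved_for_alloc_supplement;
  }
};

// Assumed flow: whole regions are set aside to cover the evacuation need, and
// the unused tail of those regions (the "remnant") is added to the supplement.
OldLoan plan_loan(size_t region_size_bytes,
                  size_t evac_bytes_needed,
                  size_t additional_regions_to_loan) {
  // Round the evacuation need up to whole regions.
  size_t regions_for_evac =
      (evac_bytes_needed + region_size_bytes - 1) / region_size_bytes;
  // Bytes in those regions that evacuation will not use.
  size_t remnant = regions_for_evac * region_size_bytes - evac_bytes_needed;
  OldLoan loan{evac_bytes_needed,
               additional_regions_to_loan * region_size_bytes + remnant};
  // The remnant cancels out, so the sum is a whole number of regions.
  assert(loan.total() % region_size_bytes == 0);
  return loan;
}
```

Under these assumptions, loaning only the supplement (the behavior described as the bug) would transfer a byte count that is generally not a multiple of the region size, whereas the sum always is.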
------------- Commit messages: - Merge remote-tracking branch 'GitFarmBranch/loan-from-old-should-align-on-region-size' into loan-from-old-should-align-on-region-size - Fix alignment error in old loans to young Changes: https://git.openjdk.org/shenandoah/pull/213/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=213&range=00 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.org/shenandoah/pull/213.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/213/head:pull/213 PR: https://git.openjdk.org/shenandoah/pull/213 From ysr at openjdk.org Wed Feb 1 03:13:24 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 1 Feb 2023 03:13:24 GMT Subject: RFR: Loan from old should align on region size In-Reply-To: References: Message-ID: <6O_KrjUOc2ZMXTZ9yffrZUaQUbGsmx6cGYwF3pMyVm8=.4554502b-8270-41fe-b9ad-fb92cdeaddf8@github.com> On Wed, 1 Feb 2023 02:57:07 GMT, Kelvin Nilsen wrote: > This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 732: > 730: working_old_available -= additional_regions_to_loan * region_size_bytes; > 731: } > 732: size_t allocation_supplement = old_bytes_reserved_for_alloc_supplement + old_bytes_loaned_for_young_evac; So, each of these may not themselves be an integral multiple of region size but their sum is (by the assert on the next line). Can we reason through why this should follow? 
It must be that at some point we end up using a fraction of sum of region sizes and then they come together here. Could you help document that, or point to the relevant split that then merges them back here? I'll try and look through the flow as well, although the calculations are quite complex to keep track of unless one knows the rationale well to see where the fractional regions come about and where the complementary fraction is then held. ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From ysr at openjdk.org Wed Feb 1 03:17:28 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 1 Feb 2023 03:17:28 GMT Subject: RFR: Loan from old should align on region size In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 02:57:07 GMT, Kelvin Nilsen wrote: > This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. Marked as reviewed by ysr (Author). ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From kdnilsen at openjdk.org Wed Feb 1 03:44:24 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 03:44:24 GMT Subject: RFR: Loan from old should align on region size In-Reply-To: <6O_KrjUOc2ZMXTZ9yffrZUaQUbGsmx6cGYwF3pMyVm8=.4554502b-8270-41fe-b9ad-fb92cdeaddf8@github.com> References: <6O_KrjUOc2ZMXTZ9yffrZUaQUbGsmx6cGYwF3pMyVm8=.4554502b-8270-41fe-b9ad-fb92cdeaddf8@github.com> Message-ID: On Wed, 1 Feb 2023 03:10:09 GMT, Y. 
Srinivas Ramakrishna wrote: >> This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. > > src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 732: > >> 730: working_old_available -= additional_regions_to_loan * region_size_bytes; >> 731: } >> 732: size_t allocation_supplement = old_bytes_reserved_for_alloc_supplement + old_bytes_loaned_for_young_evac; > > So, each of these may not themselves be an integral multiple of region size but their sum is (by the assert on the next line). > > Can we reason through why this should follow? It must be that at some point we end up using a fraction of sum of region sizes and then they come together here. Could you help document that, or point to the relevant split that then merges them back here? I'll try and look through the flow as well, although the calculations are quite complex to keep track of unless one knows the rationale well to see where the fractional regions come about and where the complementary fraction is then held. If you follow the computations, you'll see there's a "remnant" of memory that is leftover after we set aside certain regions to support young evacuations. That "remnant" is added into the allocation supplement. The remnant is what causes each of the individual values to possibly be unaligned. ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From ysr at openjdk.org Wed Feb 1 07:52:30 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna)
Date: Wed, 1 Feb 2023 07:52:30 GMT
Subject: RFR: Loan from old should align on region size
In-Reply-To:
References: <6O_KrjUOc2ZMXTZ9yffrZUaQUbGsmx6cGYwF3pMyVm8=.4554502b-8270-41fe-b9ad-fb92cdeaddf8@github.com>
Message-ID: <2E1YvOYpL6y6jLq38bJLsZGvmaQ2D0vv0n1ulUMKV-s=.134c85b7-d822-44c4-81dd-284357af89a3@github.com>

On Wed, 1 Feb 2023 03:42:08 GMT, Kelvin Nilsen wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 732:
>>
>>> 730:       working_old_available -= additional_regions_to_loan * region_size_bytes;
>>> 731:     }
>>> 732:     size_t allocation_supplement = old_bytes_reserved_for_alloc_supplement + old_bytes_loaned_for_young_evac;
>>
>> So, each of these may not themselves be an integral multiple of region size but their sum is (by the assert on the next line).
>>
>> Can we reason through why this should follow? It must be that at some point we end up using a fraction of sum of region sizes and then they come together here. Could you help document that, or point to the relevant split that then merges them back here? I'll try and look through the flow as well, although the calculations are quite complex to keep track of unless one knows the rationale well to see where the fractional regions come about and where the complementary fraction is then held.
>
> If you follow the computations, you'll see there's a "remnant" of memory that is leftover after we set aside certain regions to support young evacuations. That "remnant" is added into the allocation supplement. The remnant is what causes each of the individual values to possibly be unaligned.

Thanks, yes, that makes sense. Maybe leave a comment such as:

    // `available_loan_remnant` may contain a fractional region that was removed from the
    // whole number of regions that were loaned for young evacuation (the second addend below).
    // That loan remnant was added to the old bytes reserved for the allocation supplement (the first addend below).
    // As a result, their sum is back to being a whole number of regions; we only want to transfer
    // whole regions between the two generations.

Or something to that effect. Maybe it's clear to a reader sufficiently versed in the logic of the accounting represented here, in which case feel free to ignore the suggestion of a comment.

-------------

PR: https://git.openjdk.org/shenandoah/pull/213

From wkemper at openjdk.org Wed Feb 1 14:33:42 2023
From: wkemper at openjdk.org (William Kemper)
Date: Wed, 1 Feb 2023 14:33:42 GMT
Subject: RFR: Fullgc should honor aging cycle
In-Reply-To:
References:
Message-ID:

On Wed, 1 Feb 2023 02:35:20 GMT, Kelvin Nilsen wrote:

> As previously implemented, Full GC always incremented object ages. It should only increment object ages if this is an aging cycle. This was detected during code review. This is a correctness improvement without known performance implications.

Marked as reviewed by wkemper (Committer).

-------------

PR: https://git.openjdk.org/shenandoah/pull/212

From wkemper at openjdk.org Wed Feb 1 14:39:37 2023
From: wkemper at openjdk.org (William Kemper)
Date: Wed, 1 Feb 2023 14:39:37 GMT
Subject: RFR: Loan from old should align on region size
In-Reply-To:
References:
Message-ID: <_SjAsWegKszVM1CUpCB9h_EeR9uG-fb1QHNh48wrwV8=.0d696705-2b8c-4f78-bb41-28e2acc7c088@github.com>

On Wed, 1 Feb 2023 02:57:07 GMT, Kelvin Nilsen wrote:

> This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size.
I wonder if the missing `old_bytes_loaned_for_young_evac` explains the observation that the loans were often too conservative, resulting in evacuation failures on some workloads. Is the `old_bytes_loaned_for_young_evac` always less than the size of a region, or could it cover multiple regions?

-------------

Marked as reviewed by wkemper (Committer).

PR: https://git.openjdk.org/shenandoah/pull/213

From ysr at openjdk.org Wed Feb 1 15:41:38 2023
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Wed, 1 Feb 2023 15:41:38 GMT
Subject: RFR: Loan from old should align on region size
In-Reply-To: <2E1YvOYpL6y6jLq38bJLsZGvmaQ2D0vv0n1ulUMKV-s=.134c85b7-d822-44c4-81dd-284357af89a3@github.com>
References: <6O_KrjUOc2ZMXTZ9yffrZUaQUbGsmx6cGYwF3pMyVm8=.4554502b-8270-41fe-b9ad-fb92cdeaddf8@github.com> <2E1YvOYpL6y6jLq38bJLsZGvmaQ2D0vv0n1ulUMKV-s=.134c85b7-d822-44c4-81dd-284357af89a3@github.com>
Message-ID:

On Wed, 1 Feb 2023 07:49:23 GMT, Y. Srinivas Ramakrishna wrote:

>> If you follow the computations, you'll see there's a "remnant" of memory that is leftover after we set aside certain regions to support young evacuations. That "remnant" is added into the allocation supplement. The remnant is what causes each of the individual values to possibly be unaligned.
>
> Thanks, yes, that makes sense. Maybe leave a comment such as:
>
>
>     // `available_loan_remnant` may contain a fractional region that was removed from the
>     // whole number of regions that were loaned for young evacuation (the second addend below).
>     // That loan remnant was added to the old bytes reserved for the allocation supplement (the first addend below).
>     // As a result, their sum is back to being a whole number of regions; we only want to transfer
>     // whole regions between the two generations.
>
>
> Or something to that effect. Maybe it's clear to a reader sufficiently versed in the logic of the accounting
> represented here, in which case feel free to ignore the suggestion of a comment.

In any case, since this is a stop-gap fix in code that's soon planned to be deleted, it's probably not worth further documentation or effort. Ship it! ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From kdnilsen at openjdk.org Wed Feb 1 15:43:32 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 15:43:32 GMT Subject: RFR: Following abbreviated cycle, only increment region age if aging cycle Message-ID: <-YFR4rd-FGwIk_1GKxnY1fZ5KpQ1m8jLIut_Ehpk4Jw=.c693f733-976d-4cce-81b3-b8f7ec39755c@github.com> The previously existing code that processes regions following completion of an abbreviated cycle unconditionally increased the age of all regions that had not allocated memory during the cycle. It should only increment age of regions if this is an aging cycle. This patch fixes this behavior. ------------- Commit messages: - Following abbreviated cycle, only increment region age if aging cycle Changes: https://git.openjdk.org/shenandoah/pull/214/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=214&range=00 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/shenandoah/pull/214.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/214/head:pull/214 PR: https://git.openjdk.org/shenandoah/pull/214 From kdnilsen at openjdk.org Wed Feb 1 16:58:44 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 16:58:44 GMT Subject: RFR: Loan from old should align on region size In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 02:57:07 GMT, Kelvin Nilsen wrote: > This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. 
The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. > That's a good point. In our previous loans, we could have been missing multiple regions that had been planned to be loaned, which were not loaned because of this error which is now fixed. ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From kdnilsen at openjdk.org Wed Feb 1 16:58:45 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 16:58:45 GMT Subject: Integrated: Loan from old should align on region size In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 02:57:07 GMT, Kelvin Nilsen wrote: > This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. This pull request has now been integrated. 
Changeset: 998ea70c Author: Kelvin Nilsen URL: https://git.openjdk.org/shenandoah/commit/998ea70c079ac4b2498178aade9d84eb6b2853ea Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Loan from old should align on region size Reviewed-by: ysr, wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From wkemper at openjdk.org Wed Feb 1 18:10:38 2023 From: wkemper at openjdk.org (William Kemper) Date: Wed, 1 Feb 2023 18:10:38 GMT Subject: RFR: Following abbreviated cycle, only increment region age if aging cycle In-Reply-To: <-YFR4rd-FGwIk_1GKxnY1fZ5KpQ1m8jLIut_Ehpk4Jw=.c693f733-976d-4cce-81b3-b8f7ec39755c@github.com> References: <-YFR4rd-FGwIk_1GKxnY1fZ5KpQ1m8jLIut_Ehpk4Jw=.c693f733-976d-4cce-81b3-b8f7ec39755c@github.com> Message-ID: On Wed, 1 Feb 2023 15:36:16 GMT, Kelvin Nilsen wrote: > The previously existing code that processes regions following completion of an abbreviated cycle unconditionally > increased the age of all regions that had not allocated memory during the cycle. It should only increment age of > regions if this is an aging cycle. This patch fixes this behavior. Marked as reviewed by wkemper (Committer). ------------- PR: https://git.openjdk.org/shenandoah/pull/214 From kdnilsen at openjdk.org Wed Feb 1 18:51:42 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 18:51:42 GMT Subject: Integrated: Fullgc should honor aging cycle In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 02:35:20 GMT, Kelvin Nilsen wrote: > As previously implemented, Full GC always incremented object ages. It should only increment object ages if this is an aging cycle. This was detected during code review. This is a correctness improvement without known performance implications. This pull request has now been integrated. 
Changeset: 0a7d8934 Author: Kelvin Nilsen URL: https://git.openjdk.org/shenandoah/commit/0a7d89340e9d543d95805ac402ca0fd9e8a52d0c Stats: 9 lines in 2 files changed: 8 ins; 0 del; 1 mod Fullgc should honor aging cycle Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/212 From kdnilsen at openjdk.org Wed Feb 1 18:53:38 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 18:53:38 GMT Subject: Integrated: Following abbreviated cycle, only increment region age if aging cycle In-Reply-To: <-YFR4rd-FGwIk_1GKxnY1fZ5KpQ1m8jLIut_Ehpk4Jw=.c693f733-976d-4cce-81b3-b8f7ec39755c@github.com> References: <-YFR4rd-FGwIk_1GKxnY1fZ5KpQ1m8jLIut_Ehpk4Jw=.c693f733-976d-4cce-81b3-b8f7ec39755c@github.com> Message-ID: On Wed, 1 Feb 2023 15:36:16 GMT, Kelvin Nilsen wrote: > The previously existing code that processes regions following completion of an abbreviated cycle unconditionally > increased the age of all regions that had not allocated memory during the cycle. It should only increment age of > regions if this is an aging cycle. This patch fixes this behavior. This pull request has now been integrated. Changeset: 3e8f7657 Author: Kelvin Nilsen URL: https://git.openjdk.org/shenandoah/commit/3e8f765726944764a38c489fd8cdc7e72ace290b Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Following abbreviated cycle, only increment region age if aging cycle Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/214 From kdnilsen at openjdk.org Wed Feb 1 20:03:11 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 20:03:11 GMT Subject: RFR: Bootstrap old gc should honor aging cycle Message-ID: A bootstrap old gc should increment object and region ages if this is an aging cycle. This patch implements this fix. The patch is motivated by observations that performance degradation occurs when timely promotions do not occur because bootstrap old GCs are not incrementing ages. 
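The aging fixes discussed in this thread (Full GC, abbreviated cycles, bootstrap old GC) share one rule: ages advance only when the cycle is an aging cycle. A minimal sketch of that rule for regions follows. `RegionSketch` and `age_regions_after_cycle()` are hypothetical names for illustration, not the Shenandoah types, and the restriction to regions that did not allocate is taken from the abbreviated-cycle description.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a heap region; not the ShenandoahHeapRegion API.
struct RegionSketch {
  unsigned age;
  bool allocated_during_cycle;
};

// Increment ages only on aging cycles, and only for regions that did not
// allocate memory during the cycle. On non-aging cycles nothing changes.
void age_regions_after_cycle(std::vector<RegionSketch>& regions,
                             bool is_aging_cycle) {
  if (!is_aging_cycle) {
    return;  // the pre-fix behavior skipped this check and always aged
  }
  for (RegionSketch& r : regions) {
    if (!r.allocated_during_cycle) {
      ++r.age;
    }
  }
}
```

If ages advanced on every cycle, promotion would likely happen earlier than intended; if a cycle type never advanced them, promotion would lag, which matches the degradation described for bootstrap old GCs.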
------------- Commit messages: - Merge remote-tracking branch 'GitFarmBranch/bootstrap-old-gc-should-honor-aging-cycle' into bootstrap-old-gc-should-honor-aging-cycle - Bootstrap and OLD GC should age OLD objects Changes: https://git.openjdk.org/shenandoah/pull/215/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=215&range=00 Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod Patch: https://git.openjdk.org/shenandoah/pull/215.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/215/head:pull/215 PR: https://git.openjdk.org/shenandoah/pull/215 From wkemper at openjdk.org Wed Feb 1 20:03:12 2023 From: wkemper at openjdk.org (William Kemper) Date: Wed, 1 Feb 2023 20:03:12 GMT Subject: RFR: Bootstrap old gc should honor aging cycle In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 19:36:47 GMT, Kelvin Nilsen wrote: > A bootstrap old gc should increment object and region ages if this is an aging cycle. This patch implements this fix. > > The patch is motivated by observations that performance degradation occurs when timely promotions do not occur because bootstrap old GCs are not incrementing ages. Looks good. ------------- Marked as reviewed by wkemper (Committer). PR: https://git.openjdk.org/shenandoah/pull/215 From kdnilsen at openjdk.org Wed Feb 1 20:10:43 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 1 Feb 2023 20:10:43 GMT Subject: Integrated: Bootstrap old gc should honor aging cycle In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 19:36:47 GMT, Kelvin Nilsen wrote: > A bootstrap old gc should increment object and region ages if this is an aging cycle. This patch implements this fix. > > The patch is motivated by observations that performance degradation occurs when timely promotions do not occur because bootstrap old GCs are not incrementing ages. This pull request has now been integrated. 
Changeset: d40ba7dd Author: Kelvin Nilsen URL: https://git.openjdk.org/shenandoah/commit/d40ba7dd128a79df0a5d1401327c695ba800fed6 Stats: 6 lines in 1 file changed: 5 ins; 0 del; 1 mod Bootstrap old gc should honor aging cycle Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/215 From wkemper at openjdk.org Wed Feb 1 20:13:25 2023 From: wkemper at openjdk.org (William Kemper) Date: Wed, 1 Feb 2023 20:13:25 GMT Subject: Integrated: Update vestigial comment In-Reply-To: References: Message-ID: <9KilvNorS2gApKYb82xYRPyE7LFAF3FMa6kTlV02J_Y=.f5a95488-10d3-4271-a7fd-033e87d7033a@github.com> On Wed, 1 Feb 2023 00:06:19 GMT, William Kemper wrote: > Update out of date comment pointed out by review of #209 . This pull request has now been integrated. Changeset: 4ec2cd90 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/4ec2cd904476c97d98906ee5b623c57dba261ddb Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Update vestigial comment Reviewed-by: ysr, kdnilsen ------------- PR: https://git.openjdk.org/shenandoah/pull/211 From ysr at openjdk.org Wed Feb 1 20:40:12 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 1 Feb 2023 20:40:12 GMT Subject: RFR: Loan from old should align on region size In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 02:57:07 GMT, Kelvin Nilsen wrote: > This fixes an error detected by a recently added assertion on a different branch. When loaning memory from old-gen to young-gen, it is important the total amount of the loan be an integral number of ShenandoahHeapRegions. The total loan represents a sum of memory loaned to support young evacuations and memory loaned to support mutator allocations made while GC is active. The sum of these two values is a multiple of region size. The previous version of the code accidentally loaned only the memory dedicated to supporting mutator allocations, which is not necessarily a multiple of the region size. 
In light of your last comment, I'll be curious if and how much it may help with workloads close to the edge. ------------- PR: https://git.openjdk.org/shenandoah/pull/213 From wkemper at openjdk.org Thu Feb 2 00:12:02 2023 From: wkemper at openjdk.org (William Kemper) Date: Thu, 2 Feb 2023 00:12:02 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v6] In-Reply-To: References: Message-ID: On Wed, 1 Feb 2023 00:17:58 GMT, Y. Srinivas Ramakrishna wrote: >> **Main changes:** >> 1. `process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. For reading the diffs, it might be easiest to look at the new code, rather than view the side-by-side diffs. >> 2. the ShenandoahCardCluster class has been extended by a `block_start()` method which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above. >> 3. ShenandoahCardCluster class's `has_object()` method has been renamed `starts_object()` which more closely reflects the API. >> 4. ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket). >> 5. Added some const annotations. >> >> **Testing & Implementation Notes:** >> 6. Tested with Extremem and SpecJBB, fastdebug, release, and product builds, with and without verification enabled. >> 7. Preliminary performance data with an Extremem workload showed roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75), p100 (max) was marginally down at 2%. 
The trend of the change was as expected since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs. >> 8. More performance data with SPECjbb and several different Extremem workloads were gathered, and can be found below, including both phases that use the process_clusters code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124 below. >> >> **Acknowledgments**: >> 9. Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me. >> >> **Epilogue**: >> 10. Further performance improvements are possible, but are deferred for follow-up. > > Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: > > guarantees to asserts (non-production), and warnings (production); > review feedback. Marked as reviewed by wkemper (Committer). src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.inline.hpp line 687: > 685: // compiler & interpreter are imprecise. > 686: if (p < left && !obj->is_objArray()) { > 687: if (OBJ_MARK_IMPRECISE) { The code would be simpler if the case for handling `OBJ_MARK_IMPRECISE` were left out - perhaps moved to a descriptive comment? If we leave it in, could we name it `OBJ_MARK_ALWAYS_IMPRECISE` . The barriers do use imprecise card marks, so it's confusing to see this defined as `false`. ------------- PR: https://git.openjdk.org/shenandoah/pull/193 From ysr at openjdk.org Thu Feb 2 01:01:55 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 2 Feb 2023 01:01:55 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v7] In-Reply-To: References: Message-ID: <6jq6U6w2bBtXqY83ymDdbazRpzr1PzscJ3McDS8jy74=.48eb14d2-9ec8-4d83-a229-b4b9774a285b@github.com> > **Main changes:** > 1. 
`process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. For reading the diffs, it might be easiest to look at the new code, rather than view the side-by-side diffs. > 2. the ShenandoahCardCluster class has been extended by a `block_start()` method which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above. > 3. ShenandoahCardCluster class's `has_object()` method has been renamed `starts_object()` which more closely reflects the API. > 4. ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket). > 5. Added some const annotations. > > **Testing & Implementation Notes:** > 6. Tested with Extremem and SpecJBB, fastdebug, release, and product builds, with and without verification enabled. > 7. Preliminary performance data with an Extremem workload showed roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75), p100 (max) was marginally down at 2%. The trend of the change was as expected since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs. > 8. More performance data with SPECjbb and several different Extremem workloads were gathered, and can be found below, including both phases that use the process_clusters code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124 below. > > **Acknowledgments**: > 9. 
Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me. > > **Epilogue**: > 10. Further performance improvements are possible, but are deferred for follow-up. Y. Srinivas Ramakrishna has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 116 commits: - Merge branch 'master' into rs_scan - guarantees to asserts (non-production), and warnings (production); review feedback. - jcheck: tab - Merge branch 'master' into rs_scan - a const, some assertions, and avoid redundant scans for on-objArrays that straddle across card clusters (sic). - A couple of guarantees to catch a pesky assert that's occasionally triggering. - Merge branch 'master' into rs_scan - More const safety, some asserts, some comments. - Change type of loop variable to signed to allow correct termination for the case when start_card_index is 0. Nominal check for overflow when using signed type for card index. - Fix the direction of an address comparison, add a couple of assertions, and elaborate some comments. Passes heap verification handily now. - ... and 106 more: https://git.openjdk.org/shenandoah/compare/4ec2cd90...4b718fd4 ------------- Changes: https://git.openjdk.org/shenandoah/pull/193/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=06 Stats: 853 lines in 12 files changed: 346 ins; 268 del; 239 mod Patch: https://git.openjdk.org/shenandoah/pull/193.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/193/head:pull/193 PR: https://git.openjdk.org/shenandoah/pull/193 From ysr at openjdk.org Thu Feb 2 02:10:25 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 2 Feb 2023 02:10:25 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v8] In-Reply-To: References: Message-ID: <1v_qemYBAJ8eA45Y3KUPJjBrgxbGB_xvoVkiXem6Drk=.a46a9a5c-3024-428b-a0c4-10bd82b833c9@github.com> > **Main changes:** > 1. `process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. For reading the diffs, it might be easiest to look at the new code, rather than view the side-by-side diffs. > 2. the ShenandoahCardCluster class has been extended by a `block_start()` method which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above. > 3. ShenandoahCardCluster class's `has_object()` method has been renamed `starts_object()` which more closely reflects the API. > 4. ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket). > 5. Added some const annotations. > > **Testing & Implementation Notes:** > 6. Tested with Extremem and SpecJBB, fastdebug, release, and product builds, with and without verification enabled. > 7. Preliminary performance data with an Extremem workload showed roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75), p100 (max) was marginally down at 2%. The trend of the change was as expected since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs. > 8. 
More performance data with SPECjbb and several different Extremem workloads were gathered, and can be found below, including both phases that use the process_clusters code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124 below. > > **Acknowledgments**: > 9. Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me. > > **Epilogue**: > 10. Further performance improvements are possible, but are deferred for follow-up. Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: Review feedback, etc. Cleaned up some comments and TODO's. ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/193/files - new: https://git.openjdk.org/shenandoah/pull/193/files/4b718fd4..43a3455a Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=07 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=06-07 Stats: 82 lines in 1 file changed: 22 ins; 32 del; 28 mod Patch: https://git.openjdk.org/shenandoah/pull/193.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/193/head:pull/193 PR: https://git.openjdk.org/shenandoah/pull/193 From ysr at openjdk.org Thu Feb 2 02:10:29 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 2 Feb 2023 02:10:29 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v6] In-Reply-To: References: Message-ID: On Thu, 2 Feb 2023 00:06:21 GMT, William Kemper wrote: >> Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: >> >> guarantees to asserts (non-production), and warnings (production); >> review feedback. > > src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.inline.hpp line 687: > >> 685: // compiler & interpreter are imprecise. 
>> 686: if (p < left && !obj->is_objArray()) { >> 687: if (OBJ_MARK_IMPRECISE) { > > The code would be simpler if the case for handling `OBJ_MARK_IMPRECISE` were left out - perhaps moved to a descriptive comment? If we leave it in, could we name it `OBJ_MARK_ALWAYS_IMPRECISE` . The barriers do use imprecise card marks, so it's confusing to see this defined as `false`. Yes, I agree that the naming is confusing, plus the unnecessary code clutter. I removed that arm and associated preprocessor def'n, and fixed up the comments. ------------- PR: https://git.openjdk.org/shenandoah/pull/193 From ysr at openjdk.org Thu Feb 2 16:21:20 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 2 Feb 2023 16:21:20 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v9] In-Reply-To: References: Message-ID: <3wLaeYplCaHAcwEtECFhefbf-fJVHDUUrBcu_XeaREM=.e16811b6-5e69-4f6a-80a1-1a285b1fd01f@github.com> > **Main changes:** > 1. `process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. For reading the diffs, it might be easiest to look at the new code, rather than view the side-by-side diffs. > 2. the ShenandoahCardCluster class has been extended by a `block_start()` method which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above. > 3. ShenandoahCardCluster class's `has_object()` method has been renamed `starts_object()` which more closely reflects the API. > 4. ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket). > 5. Added some const annotations. > > **Testing & Implementation Notes:** > 6. 
Tested with Extremem and SpecJBB, fastdebug, release, and product builds, with and without verification enabled. > 7. Preliminary performance data with an Extremem workload showed roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75), p100 (max) was marginally down at 2%. The trend of the change was as expected since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs. > 8. More performance data with SPECjbb and several different Extremem workloads were gathered, and can be found below, including both phases that use the process_clusters code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124 below. > > **Acknowledgments**: > 9. Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me. > > **Epilogue**: > 10. Further performance improvements are possible, but are deferred for follow-up. Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: Relax a couple of asserts related to overflow that were just a tad too strong. 
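For readers skimming the thread, item 1 of the description (find contiguous runs of dirty cards, skip contiguous runs of clean ones) can be pictured with the rough sketch below. It is not the actual `process_clusters()` code: the card encoding (`kClean`), the flat `std::vector` card table and the name `find_dirty_runs` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical card encoding, chosen for readability only; do not assume it
// matches the values used by the real card table.
constexpr std::uint8_t kClean = 0;

// Returns the maximal runs of non-clean cards as half-open [begin, end) index ranges.
std::vector<std::pair<std::size_t, std::size_t>>
find_dirty_runs(const std::vector<std::uint8_t>& cards) {
  std::vector<std::pair<std::size_t, std::size_t>> runs;
  const std::size_t n = cards.size();
  std::size_t i = 0;
  while (i < n) {
    // Skip over a contiguous range of clean cards.
    while (i < n && cards[i] == kClean) ++i;
    if (i == n) break;
    // Extend over a contiguous range of dirty cards.
    const std::size_t begin = i;
    while (i < n && cards[i] != kClean) ++i;
    assert(begin < i);  // every recorded run covers at least one card
    runs.emplace_back(begin, i);
  }
  return runs;
}
```

With long runs, each card is visited once and the per-run work is paid once per range rather than once per card. That fits the remark in item 7 that the gains shrink when dirty and clean cards alternate in short runs.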
------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/193/files - new: https://git.openjdk.org/shenandoah/pull/193/files/43a3455a..84d0f69d Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=08 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=193&range=07-08 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/shenandoah/pull/193.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/193/head:pull/193 PR: https://git.openjdk.org/shenandoah/pull/193 From rkennke at openjdk.org Thu Feb 2 17:11:32 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 2 Feb 2023 17:11:32 GMT Subject: RFR: 8299324: inline_native_setCurrentThread lacks GC barrier for Shenandoah In-Reply-To: References: Message-ID: On Mon, 30 Jan 2023 22:32:05 GMT, William Kemper wrote: > Allow Shenandoah barrier to emit the store barrier for native memory. I believe it is safe to delete the assert on L202 because `obj` is not used here. Tested with `hotspot:hotspot_gc` and `hotspot:loom` with `JAVA_OPTIONS=-XX:+UseShenandoahGC` (and again with -XX:TieredStopAtLevel=1). It looks good to me, thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR: https://git.openjdk.org/jdk/pull/12300 From wkemper at openjdk.org Thu Feb 2 17:14:24 2023 From: wkemper at openjdk.org (William Kemper) Date: Thu, 2 Feb 2023 17:14:24 GMT Subject: RFR: 8299703: GenShen: improvements in card scanning [v9] In-Reply-To: <3wLaeYplCaHAcwEtECFhefbf-fJVHDUUrBcu_XeaREM=.e16811b6-5e69-4f6a-80a1-1a285b1fd01f@github.com> References: <3wLaeYplCaHAcwEtECFhefbf-fJVHDUUrBcu_XeaREM=.e16811b6-5e69-4f6a-80a1-1a285b1fd01f@github.com> Message-ID: On Thu, 2 Feb 2023 16:21:20 GMT, Y. Srinivas Ramakrishna wrote: >> **Main changes:** >> 1. `process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. 
For reading the diffs, it might be easiest to look at the new code, rather than view the side-by-side diffs. >> 2. the ShenandoahCardCluster class has been extended by a `block_start()` method which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above. >> 3. ShenandoahCardCluster class's `has_object()` method has been renamed `starts_object()` which more closely reflects the API. >> 4. ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket). >> 5. Added some const annotations. >> >> **Testing & Implementation Notes:** >> 6. Tested with Extremem and SpecJBB, fastdebug, release, and product builds, with and without verification enabled. >> 7. Preliminary performance data with an Extremem workload showed roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75), p100 (max) was marginally down at 2%. The trend of the change was as expected since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs. >> 8. More performance data with SPECjbb and several different Extremem workloads were gathered, and can be found below, including both phases that use the process_clusters code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124 below. >> >> **Acknowledgments**: >> 9. Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me. >> >> **Epilogue**: >> 10. 
Further performance improvements are possible, but are deferred for follow-up. > > Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: > > Relax a couple of asserts related to overflow that were just a tad too > strong. Marked as reviewed by wkemper (Committer). ------------- PR: https://git.openjdk.org/shenandoah/pull/193 From ysr at openjdk.org Thu Feb 2 18:18:20 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 2 Feb 2023 18:18:20 GMT Subject: Integrated: 8299703: GenShen: improvements in card scanning In-Reply-To: References: Message-ID: On Thu, 5 Jan 2023 20:45:00 GMT, Y. Srinivas Ramakrishna wrote: > **Main changes:** > 1. `process_clusters()` now finds and processes contiguous ranges of dirty cards, skipping over contiguous ranges of clean cards. For reading the diffs, it might be easiest to look at the new code, rather than view the side-by-side diffs. > 2. the ShenandoahCardCluster class has been extended by a `block_start()` method which returns the first object in a card (which could be co-initial with the card); this method is used by the refactored `process_clusters()` above. > 3. ShenandoahCardCluster class's `has_object()` method has been renamed `starts_object()` which more closely reflects the API. > 4. ShenandoahCardStats class has been modified to better suit the way statistics are gathered in the rewritten `process_clusters()`. The larger-grain API should also result in less overhead for gathering the statistics and might (subject to measurement) allow it to be available in product/release builds (if so, that will be done in a separate follow-up ticket). > 5. Added some const annotations. > > **Testing & Implementation Notes:** > 6. Tested with Extremem and SpecJBB, fastdebug, release, and product builds, with and without verification enabled. > 7. 
Preliminary performance data with an Extremem workload showed roughly 17-18% reduction in wall-clock durations of concurrent remembered set scanning across the distribution (p0, p25, p50, p75), p100 (max) was marginally down at 2%. The trend of the change was as expected since the gains are lost when we have a higher frequency of dirty/clean alternations with short dirty/clean runs. > 8. More performance data with SPECjbb and several different Extremem workloads were gathered, and can be found below, including both phases that use the process_clusters code. See https://github.com/openjdk/shenandoah/pull/193#issuecomment-1405191124 below. > > **Acknowledgments**: > 9. Many thanks to @kdnilsen for feedback on an earlier version of the draft PR, which helped catch a crucial misunderstanding on the role of TAMS and marked objects, and helped fix the error that had been dogging me. > > **Epilogue**: > 10. Further performance improvements are possible, but are deferred for follow-up. This pull request has now been integrated. Changeset: 75811f96 Author: Y. Srinivas Ramakrishna Committer: William Kemper URL: https://git.openjdk.org/shenandoah/commit/75811f964674902f8e12ffe255479b75bef6b6e9 Stats: 866 lines in 12 files changed: 354 ins; 286 del; 226 mod 8299703: GenShen: improvements in card scanning Reviewed-by: kdnilsen, wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/193 From jsjolen at openjdk.org Mon Feb 6 10:22:53 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 6 Feb 2023 10:22:53 GMT Subject: RFR: JDK-8301493: Replace NULL with nullptr in cpu/aarch64 In-Reply-To: References: Message-ID: <93w5_gLPwyrao5XtNgUVyj5ntkIZh2Oz4y2f51xnbos=.7caf423d-dd48-4152-8110-55def437182b@github.com> On Tue, 31 Jan 2023 11:39:27 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/aarch64.
Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Some simple fixes here src/hotspot/cpu/aarch64/frame_aarch64.inline.hpp line 174: > 172: // Then we could use the assert below. However this assert is of somewhat dubious > 173: // value. > 174: // assert(_pc != null, "no pc?"); nullptr src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp line 162: > 160: // Calling the runtime using the regular call_VM_leaf mechanism generates > 161: // code (generated by InterpreterMacroAssember::call_VM_leaf_base) > 162: // that checks that the *(rfp+frame::interpreter_frame_last_sp) == null. nullptr src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp line 160: > 158: // Calling the runtime using the regular call_VM_leaf mechanism generates > 159: // code (generated by InterpreterMacroAssember::call_VM_leaf_base) > 160: // that checks that the *(rfp+frame::interpreter_frame_last_sp) == null.
nullptr src/hotspot/cpu/aarch64/icBuffer_aarch64.cpp line 50: > 48: // (1) the value is old (i.e., doesn't matter for scavenges) > 49: // (2) these ICStubs are removed *before* a GC happens, so the roots disappear > 50: // assert(cached_value == null || cached_oop->is_perm(), "must be perm oop"); nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 143: > 141: Label L; > 142: ldr(rscratch1, Address(rthread, JavaThread::jvmti_thread_state_offset())); > 143: cbz(rscratch1, L); // if (thread->jvmti_thread_state() == null) exit; nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1313: > 1311: // if (row[2].rec == rec) { row[2].incr(); goto done; } > 1312: // if (row[2].rec != null) { count.incr(); goto done; } // overflow > 1313: // row[2].init(rec); goto done; nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1586: > 1584: cbz(rscratch1, L); > 1585: stop("InterpreterMacroAssembler::call_VM_leaf_base:" > 1586: " last_sp != null"); nullptr src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1614: > 1612: cbz(rscratch1, L); > 1613: stop("InterpreterMacroAssembler::call_VM_base:" > 1614: " last_sp != null"); nullptr src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1158: > 1156: } > 1157: > 1158: // for (scan = klass->itable(); scan->interface() != null; scan += scan_step) { nullptr src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4791: > 4789: > 4790: if (UseSimpleArrayEquals) { > 4791: Label NEXT_WORD, SHORT, TAIL03, TAIL01, A_MIGHT_BE_nullptr, A_IS_NOT_NULL; Fix this macro src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 2059: > 2057: __ tbnz(src_pos, 31, L_failed); // i.e. 
sign bit set > 2058: > 2059: // if (dst == null) return -1; nullptr src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 2077: > 2075: #ifdef ASSERT > 2076: // assert(src->klass() != null); > 2077: { nullptr src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 3698: > 3696: __ bind(done); > 3697: // r0 = 0: obj == NULL or obj is not an instanceof the specified klass > 3698: // r0 = 1: obj != NULL and obj is an instanceof the specified klass nullptr ------------- PR: https://git.openjdk.org/jdk/pull/12321 From jsjolen at openjdk.org Mon Feb 6 10:22:36 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 6 Feb 2023 10:22:36 GMT Subject: RFR: JDK-8301493: Replace NULL with nullptr in cpu/aarch64 Message-ID: Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/aarch64. Unfortunately the script that does the change isn't perfect, and so we need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. Here are some typical things to look out for: 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. An example of this:

```c++
// This function returns null
void* ret_null();
// This function returns true if *x == nullptr
bool is_nullptr(void** x);
```

Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. Thanks!
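One way to picture the kind of mechanical replacement discussed here, and why an identifier such as `A_IS_NOT_NULL` ought to survive it, is the sketch below. It is not the script used for this PR (which is not shown in the thread); the function name `replace_null_token` and the choice of `std::regex` are assumptions for illustration only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Replaces NULL only where it stands alone as a token. Because '_' counts as a
// word character, "\b" does not match between '_' and 'N', so identifiers such
// as A_IS_NOT_NULL are left untouched.
std::string replace_null_token(const std::string& line) {
  static const std::regex null_token(R"(\bNULL\b)");
  return std::regex_replace(line, null_token, "nullptr");
}
```

A whole-word match like this still rewrites NULL inside comments, log strings and macro definitions, so the three manual checks listed above would remain necessary.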
------------- Commit messages: - Fixes - Replace NULL with nullptr in cpu/aarch64 Changes: https://git.openjdk.org/jdk/pull/12321/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12321&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301493 Stats: 458 lines in 43 files changed: 0 ins; 0 del; 458 mod Patch: https://git.openjdk.org/jdk/pull/12321.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12321/head:pull/12321 PR: https://git.openjdk.org/jdk/pull/12321 From phh at openjdk.org Mon Feb 6 19:53:50 2023 From: phh at openjdk.org (Paul Hohensee) Date: Mon, 6 Feb 2023 19:53:50 GMT Subject: RFR: 8299324: inline_native_setCurrentThread lacks GC barrier for Shenandoah In-Reply-To: References: Message-ID: On Mon, 30 Jan 2023 22:32:05 GMT, William Kemper wrote: > Allow Shenandoah barrier to emit the store barrier for native memory. I believe it is safe to delete the assert on L202 because `obj` is not used here. Tested with `hotspot:hotspot_gc` and `hotspot:loom` with `JAVA_OPTIONS=-XX:+UseShenandoahGC` (and again with -XX:TieredStopAtLevel=1). Marked as reviewed by phh (Reviewer). ------------- PR: https://git.openjdk.org/jdk/pull/12300 From wkemper at openjdk.org Mon Feb 6 19:57:00 2023 From: wkemper at openjdk.org (William Kemper) Date: Mon, 6 Feb 2023 19:57:00 GMT Subject: Integrated: 8299324: inline_native_setCurrentThread lacks GC barrier for Shenandoah In-Reply-To: References: Message-ID: On Mon, 30 Jan 2023 22:32:05 GMT, William Kemper wrote: > Allow Shenandoah barrier to emit the store barrier for native memory. I believe it is safe to delete the assert on L202 because `obj` is not used here. Tested with `hotspot:hotspot_gc` and `hotspot:loom` with `JAVA_OPTIONS=-XX:+UseShenandoahGC` (and again with -XX:TieredStopAtLevel=1). This pull request has now been integrated. 
Changeset: 3ac2bedd Author: William Kemper Committer: Paul Hohensee URL: https://git.openjdk.org/jdk/commit/3ac2beddbaa4e974f6d16d578505473a2e1d2a75 Stats: 5 lines in 1 file changed: 0 ins; 4 del; 1 mod 8299324: inline_native_setCurrentThread lacks GC barrier for Shenandoah Reviewed-by: rkennke, phh ------------- PR: https://git.openjdk.org/jdk/pull/12300 From duke at openjdk.org Tue Feb 7 14:11:01 2023 From: duke at openjdk.org (Afshin Zafari) Date: Tue, 7 Feb 2023 14:11:01 GMT Subject: Integrated: 8151413: os::allocation_granularity/page_size and friends return signed values In-Reply-To: References: Message-ID: On Thu, 19 Jan 2023 10:59:02 GMT, Afshin Zafari wrote: > ### Description > os::allocation_granularity/page_size and friends return signed values > > ### Patch > - The type of the `vm_page_size` and `vm_allocation_granularity` members of the `OSInfo` class, and of their wrappers in the `os` class, is changed to `size_t`. > - Their initial value is changed from -1 to 0. > - In the setters, the *set only once* check is updated accordingly (comparing with 0 instead of -1). The check that the argument is positive is also removed. > - `== 0` (instead of `<= 0`) is now used to check whether calling the setters failed. > - All `(size_t)` casts of the getters are removed. > - In arithmetic and negation operations, the operand coming from the getters is cast to `(int)`; otherwise, the Windows builds complain. > - Explicit casts to `(int)` are added where a `jint` is needed. > - In `align_up(T size, A alignment)`, assignment of variables of type `A` to type `T` (i.e., `T t = (A) a;`) should be safe. `T : size_t` and `A : int` won't compile. Fixed appropriately. > - `"%d"` format flags are replaced with `SIZE_FORMAT`. > - The type of `CompilerToVM::Data::vm_page_size` is changed to `size_t`. > > ### Test > tier1-5: all green, except for one unrelated failure, for which a bug has already been created. > job-id: afshin-8151413-20230117-1255-40910454 This pull request has now been integrated.
Changeset: 4fe99da7 Author: Afshin Zafari Committer: Jesper Wilhelmsson URL: https://git.openjdk.org/jdk/commit/4fe99da74f557461c31293cdc48af1199dd2b85c Stats: 170 lines in 66 files changed: 7 ins; 5 del; 158 mod 8151413: os::allocation_granularity/page_size and friends return signed values Reviewed-by: stefank, ccheung, ysr ------------- PR: https://git.openjdk.org/jdk/pull/12091 From jcking at openjdk.org Tue Feb 7 14:23:40 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:23:40 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy Message-ID: - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. - Create aliases for old `mtXXX` names. - Remove `mt_number_of_types` from the enumeration. - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. - `NMTUtil` references are not updated to avoid increasing patch size. - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. This change does not update variable verbiage. That should be something done over time. 
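To make the shape of the refactoring listed above a little more concrete, here is a minimal sketch of a scoped enum with aliases for the old `mtXXX` spellings and a parser that accepts names with or without the `mt` prefix. It is not taken from the patch: the four enumerators, the `kNames` table, `kMemoryTypeCount` and `parse_memory_type` are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical subset of memory types, for illustration only.
enum class MemoryType { JavaHeap, Class, Thread, GC };

// Aliases so that code using the old mtXXX spellings keeps compiling.
constexpr MemoryType mtJavaHeap = MemoryType::JavaHeap;
constexpr MemoryType mtClass    = MemoryType::Class;
constexpr MemoryType mtThread   = MemoryType::Thread;
constexpr MemoryType mtGC       = MemoryType::GC;

struct NamedType { const char* name; MemoryType type; };
const NamedType kNames[] = {
  {"JavaHeap", MemoryType::JavaHeap},
  {"Class",    MemoryType::Class},
  {"Thread",   MemoryType::Thread},
  {"GC",       MemoryType::GC},
};

// The count is derived from the table instead of being an enumerator.
const std::size_t kMemoryTypeCount = sizeof(kNames) / sizeof(kNames[0]);

// Parses "GC" as well as "mtGC"; returns false for unknown names.
bool parse_memory_type(const std::string& text, MemoryType* out) {
  const std::string name = (text.compare(0, 2, "mt") == 0) ? text.substr(2) : text;
  for (std::size_t i = 0; i < kMemoryTypeCount; ++i) {
    if (name == kNames[i].name) {
      *out = kNames[i].type;
      return true;
    }
  }
  return false;
}
```

Keeping the count outside the enumeration means a `switch` over `MemoryType` no longer has to handle a value that is not a real memory type.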
------------- Commit messages: - Fix undefined variable reference - Cleanup memory types and allocation failure strategies Changes: https://git.openjdk.org/jdk/pull/12454/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301983 Stats: 794 lines in 80 files changed: 213 ins; 94 del; 487 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 14:33:34 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:33:34 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v2] In-Reply-To: References: Message-ID: > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. 
Justin King has updated the pull request incrementally with one additional commit since the last revision: Fix printf formatting error Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/be6c2307..c2f229ba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 14:40:49 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 14:40:49 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. 
Justin King has updated the pull request incrementally with one additional commit since the last revision: Fix refactor mistake Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/c2f229ba..fe293c42 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From coleenp at openjdk.org Tue Feb 7 14:45:56 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 7 Feb 2023 14:45:56 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:40:49 GMT, Justin King wrote: >> - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. >> - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. >> - Create aliases for old `mtXXX` names. >> - Remove `mt_number_of_types` from the enumeration. >> - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. >> - `NMTUtil` references are not updated to avoid increasing patch size. >> - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. >> - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. >> >> This change does not update variable verbiage. That should be something done over time. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Fix refactor mistake > > Signed-off-by: Justin King I don't like MemoryType. 
MEMFLAGS makes it indicative that it's a flag for some purpose and sticks out to my eyes as a template parameter, which is part of our coding style. mtWhatever is easy to spot in the code where used. I don't agree with the reasoning for this change. Moving the header file out of allocation.hpp seems good though although we still need to include allocation.hpp to get CHeapObj so not sure how much inclusion that saves. I think also having a PR with a bullet list is a good sign that you're doing too much in one change. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 15:07:26 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:07:26 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:42:14 GMT, Coleen Phillimore wrote: > I don't like MemoryType. MEMFLAGS makes it indicative that it's a flag for some purpose and sticks out to my eyes as a template parameter, which is part of our coding style. mtWhatever is easy to spot in the code where used. I don't agree with the reasoning for this change. Moving the header file out of allocation.hpp seems good though although we still need to include allocation.hpp to get CHeapObj so not sure how much inclusion that saves. MEMFLAGS implies it's flags. In C/C++ that almost always means they can be combined together and passed as a single argument. The only other usage I've seen it with is command line options. That is not the case here, it's just an enumeration of types/categories/kinds. It's not a combination of bits or a pattern. The mt prefix is preserved, you can still use mtWhatever, it's just an alias. The prefix mt, AFAIK, quite literally means memory type, hence the chosen name. MEMFLAGS isn't always a template parameter either. 
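To make the flags-versus-enumeration distinction in the reply above concrete, here is a minimal C++ sketch. Every name in it (`OpenFlags`, `has_all`, `name_of`, and the shortened `MemoryType` enumerator list) is an illustrative assumption and not HotSpot's actual declaration; only the idea of keeping `mt`-prefixed aliases for the old names is taken from the PR description.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative names only; these are not HotSpot's actual declarations.

// A genuine "flags" type: every enumerator is a distinct bit, so several
// values can be OR-ed together and passed as one argument.
enum OpenFlags : std::uint32_t {
  kRead   = 1u << 0,
  kWrite  = 1u << 1,
  kCreate = 1u << 2
};

// Returns true if every bit of `wanted` is set in `flags`.
constexpr bool has_all(std::uint32_t flags, std::uint32_t wanted) {
  return (flags & wanted) == wanted;
}

// A category type: an allocation belongs to exactly one of these, and a
// scoped enum has no built-in operator| that could combine them by accident.
enum class MemoryType : std::uint8_t { JavaHeap, Class, Thread, GC };

// Aliases keep the old mt-prefixed spelling usable at existing call sites.
constexpr MemoryType mtJavaHeap = MemoryType::JavaHeap;
constexpr MemoryType mtGC       = MemoryType::GC;

// Maps a category to a printable name.
inline const char* name_of(MemoryType type) {
  switch (type) {
    case MemoryType::JavaHeap: return "JavaHeap";
    case MemoryType::Class:    return "Class";
    case MemoryType::Thread:   return "Thread";
    case MemoryType::GC:       return "GC";
  }
  return "Unknown";
}

// Flags combine; categories only compare.
static_assert(has_all(kRead | kWrite, kWrite), "combined flags keep each bit");
static_assert(!has_all(kRead, kRead | kCreate), "a missing bit is detected");
static_assert(mtGC == MemoryType::GC, "the alias names the same enumerator");
```

With declarations along these lines, an expression such as `mtJavaHeap | mtGC` would not compile, which is the property the rename argument relies on, while old `mtXXX` spellings would keep working through the aliases.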
------------- PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 15:44:36 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 15:44:36 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v4] In-Reply-To: References: Message-ID: <9KjUQXJRYFeVtfzbRh-OY9CaQaAjwguxJJpqR92xFAo=.409e3276-9c9d-4d82-9bcc-27fad82e7074@github.com> > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. 
Justin King has updated the pull request incrementally with one additional commit since the last revision: Fix printf formatting error Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/fe293c42..61596ca6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 17:25:02 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 17:25:02 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v5] In-Reply-To: References: Message-ID: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. 
Justin King has updated the pull request incrementally with one additional commit since the last revision: Add precompiled header Signed-off-by: Justin King ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12454/files - new: https://git.openjdk.org/jdk/pull/12454/files/61596ca6..eaa44d06 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12454&range=03-04 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/12454.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12454/head:pull/12454 PR: https://git.openjdk.org/jdk/pull/12454 From stuefe at openjdk.org Tue Feb 7 17:43:35 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 7 Feb 2023 17:43:35 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:43:01 GMT, Coleen Phillimore wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix refactor mistake >> >> Signed-off-by: Justin King > > I think also having a PR with a bullet list is a good sign that you're doing too much in one change. I agree with @coleenp, sorry. I think this PR is premature. My advice would be to discuss such invasive changes on the ML first before opening a PR. Sometimes we do broad changes like this to keep the code base fresh. But it's always a tradeoff. Often the benefits of correcting the code base outweigh the cost, so we do it. But sometimes they don't. And I argue that in this case, they don't. Mental load cost: Not every name in the hotspot adheres to our naming guides, but may nevertheless be burnt into the collective brain of developers who have worked with the code base for many years. 
We talked about them, argued about them, there are tons of ML discussions and private emails and documentation surrounding them, they appear in scripts and test documentation... I'd hate to talk about the flag previously known as MEMFLAGS. It is a dividing line, and necessarily an arbitrary one. Not everyone sees this line at the same point. For a broad change that lies on this side of the line see the recent NULL->nullptr changes. Invasive, sure, but useful enough to do. Backporting cost: three LTS releases are still maintained by vendors, soon to be joined by a fourth one. Such a broad change makes backporting fixes difficult. The NULL->nullptr change was simpler in that regard because even though it spoils automatic merges, merging manually is straightforward. But renaming is a different matter. We have seen such problems in the past with Metaspace class renames, and that is just an isolated subsystem. Moreover, with MEMFLAGS, things are in flux; we may change the implementation in the future, and there are different plans for it. So I'd leave this as it is. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 18:02:52 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 18:02:52 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v3] In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:43:01 GMT, Coleen Phillimore wrote: >> Justin King has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix refactor mistake >> >> Signed-off-by: Justin King > > I think also having a PR with a bullet list is a good sign that you're doing too much in one change. > I agree with @coleenp, sorry. I think this PR is premature. My advice would be to discuss such invasive changes on the ML first before opening a PR. > > Sometimes we do broad changes like this to keep the code base fresh. But it's always a tradeoff. 
Often the benefits of correcting the code base outweigh the cost, so we do it. But sometimes they don't. And I argue that in this case, they don't. > > Mental load cost: Not every name in the hotspot adheres to our naming guides, but may nevertheless be burnt into the collective brain of developers who have worked with the code base for many years. We talked about them, argued about them, there are tons of ML discussions and private emails and documentation surrounding them, they appear in scripts and test documentation... I'd hate to talk about the flag previously known as MEMFLAGS. > > It is a dividing line, and necessarily an arbitrary one. Not everyone sees this line at the same point. For a broad change that lies on this side of the line see the recent NULL->nullptr changes. Invasive, sure, but useful enough to do. > > Backporting cost: three LTS releases are still maintained by vendors, soon to be joined by a fourth one. Such a broad change makes backporting fixes difficult. The NULL->nullptr change was simpler in that regard because even though it spoils automatic merges, merging manually is straightforward. But renaming is a different matter. We have seen such problems in the past with Metaspace class renames, and that is just an isolated subsystem. > > Moreover, with MEMFLAGS, things are in flux; we may change the implementation in the future, and there are different plans for it. So I'd leave this as it is. I can agree with the reasoning (except for backporting; that argument can be applied to any change touching an existing file): the line is arbitrary and depends on where it is drawn. It didn't take very long to do this PR, so it doesn't particularly matter to me.
------------- PR: https://git.openjdk.org/jdk/pull/12454 From jcking at openjdk.org Tue Feb 7 18:02:54 2023 From: jcking at openjdk.org (Justin King) Date: Tue, 7 Feb 2023 18:02:54 GMT Subject: Withdrawn: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy In-Reply-To: References: Message-ID: On Tue, 7 Feb 2023 14:14:07 GMT, Justin King wrote: > - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. > - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. > - Create aliases for old `mtXXX` names. > - Remove `mt_number_of_types` from the enumeration. > - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. > - `NMTUtil` references are not updated to avoid increasing patch size. > - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. > - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. > > This change does not update variable verbiage. That should be something done over time. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/12454 From andrew at openjdk.org Thu Feb 9 16:54:51 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Thu, 9 Feb 2023 16:54:51 GMT Subject: RFR: Merge jdk8u:master Message-ID: Merge jdk8u332-b04 ------------- Commit messages: - Merge tag 'jdk8u332-b04' - 8274524: SSLSocket.close() hangs if it is called during the ssl handshake - Added tag jdk8u332-b03 for changeset 7376b980d6b0 The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/9/files Stats: 154 lines in 3 files changed: 154 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/9.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u pull/9/head:pull/9 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/9 From wkemper at openjdk.org Thu Feb 9 23:59:51 2023 From: wkemper at openjdk.org (William Kemper) Date: Thu, 9 Feb 2023 23:59:51 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: Merge tag jdk-21+8 ------------- Commit messages: - Use new parallel_threads_do to handle claim tokens - ZZMerge tag 'jdk-21+8' into merge-jdk-21-8 - 8294527: Some java.security.debug options missing from security docs - 8301447: [REDO] CodeHeap has virtual methods that are not overridden - 8295486: Inconsistent constant field values observed during compilation - 8301093: C2 fails assert(ctrl == kit.control()) failed: Control flow was added although the intrinsic bailed out - 8300256: C2: vectorization is sometimes skipped on loops where it would succeed - 8301402: os::print_location gets is_global_handle assert - 8301446: Remove unused includes of gc/shared/genOopClosures - 8301459: Serial: Merge KeepAliveClosure into FastKeepAliveClosure - ... 
and 109 more: https://git.openjdk.org/shenandoah/compare/75811f96...89cd21a5 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.org/?repo=shenandoah&pr=216&range=00.0 - openjdk/jdk:master: https://webrevs.openjdk.org/?repo=shenandoah&pr=216&range=00.1 Changes: https://git.openjdk.org/shenandoah/pull/216/files Stats: 16090 lines in 821 files changed: 3884 ins; 2289 del; 9917 mod Patch: https://git.openjdk.org/shenandoah/pull/216.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/216/head:pull/216 PR: https://git.openjdk.org/shenandoah/pull/216 From wkemper at openjdk.org Fri Feb 10 00:26:31 2023 From: wkemper at openjdk.org (William Kemper) Date: Fri, 10 Feb 2023 00:26:31 GMT Subject: Integrated: Merge openjdk/jdk:master In-Reply-To: References: Message-ID: <3r-jjU4BUefzIwucvG-1mUuKA1PUXy0IqfVn1ocnEVw=.32adfb0d-a68a-412d-b458-91c5355c5798@github.com> On Thu, 9 Feb 2023 23:52:32 GMT, William Kemper wrote: > Merge tag jdk-21+8 This pull request has now been integrated. Changeset: d258034d Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/d258034d4423192a5c8471c7855951bc9a712ba5 Stats: 16090 lines in 821 files changed: 3884 ins; 2289 del; 9917 mod Merge openjdk/jdk:master ------------- PR: https://git.openjdk.org/shenandoah/pull/216 From jsjolen at openjdk.org Fri Feb 10 09:45:41 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 10 Feb 2023 09:45:41 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 Message-ID: Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. Here are some typical things to look out for: 1.
No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. An example of this:

```c++
// This function returns null
void* ret_null();
// This function returns true if *x == nullptr
bool is_nullptr(void** x);
```

Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. Thanks! ------------- Commit messages: - Fixes - Replace NULL with nullptr in cpu/x86 Changes: https://git.openjdk.org/jdk/pull/12326/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12326&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8301498 Stats: 675 lines in 54 files changed: 0 ins; 0 del; 675 mod Patch: https://git.openjdk.org/jdk/pull/12326.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12326/head:pull/12326 PR: https://git.openjdk.org/jdk/pull/12326 From jsjolen at openjdk.org Fri Feb 10 09:45:57 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 10 Feb 2023 09:45:57 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2.
Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Some things to fix Builds and sent out for testing. src/hotspot/cpu/x86/frame_x86.inline.hpp line 163: > 161: // value. > 162: // UPDATE: this constructor is only used by trace_method_handle_stub() now. > 163: // assert(_pc != null, "no pc?"); nullptr src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 233: > 231: // Calling the runtime using the regular call_VM_leaf mechanism generates > 232: // code (generated by InterpreterMacroAssember::call_VM_leaf_base) > 233: // that checks that the *(ebp+frame::interpreter_frame_last_sp) == null. nullptr src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 267: > 265: // Calling the runtime using the regular call_VM_leaf mechanism generates > 266: // code (generated by InterpreterMacroAssember::call_VM_leaf_base) > 267: // that checks that the *(ebp+frame::interpreter_frame_last_sp) == null.
nullptr src/hotspot/cpu/x86/icBuffer_x86.cpp line 60: > 58: // (1) the value is old (i.e., doesn't matter for scavenges) > 59: // (2) these ICStubs are removed *before* a GC happens, so the roots disappear > 60: // assert(cached_value == null || cached_oop->is_perm(), "must be perm oop"); nullptr src/hotspot/cpu/x86/interp_masm_x86.cpp line 303: > 301: jcc(Assembler::equal, L); > 302: stop("InterpreterMacroAssembler::call_VM_base:" > 303: " last_sp != null"); nullptr src/hotspot/cpu/x86/interp_masm_x86.cpp line 402: > 400: movptr(tmp, Address(rthread, JavaThread::jvmti_thread_state_offset())); > 401: testptr(tmp, tmp); > 402: jcc(Assembler::zero, L); // if (thread->jvmti_thread_state() == null) exit; nullptr src/hotspot/cpu/x86/interp_masm_x86.cpp line 1779: > 1777: // // main copy of decision tree, rooted at row[1] > 1778: // if (row[0].rec == rec) { row[0].incr(); goto done; } > 1779: // if (row[0].rec != null) { nullptr src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2866: > 2864: } else { > 2865: // nothing to do, (later) access of M[reg + offset] > 2866: // will provoke OS null exception if reg = null is not = src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4265: > 4263: } > 4264: > 4265: // for (scan = klass->itable(); scan->interface() != null; scan += scan_step) { nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1553: > 1551: // one less register, because needed values are on the argument stack. 
> 1552: // __ check_klass_subtype_fast_path(sub_klass, *super_klass*, temp, > 1553: // L_success, L_failure, null); nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1924: > 1922: const Register length = rcx; // transfer count > 1923: > 1924: // if (src == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1934: > 1932: __ jccb(Assembler::negative, L_failed_0); > 1933: > 1934: // if (dst == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_32.cpp line 1956: > 1954: > 1955: #ifdef ASSERT > 1956: // assert(src->klass() != null); nullptr src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2298: > 2296: // > 2297: > 2298: // if (src == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2307: > 2305: __ jccb(Assembler::negative, L_failed_0); > 2306: > 2307: // if (dst == null) return -1; nullptr src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2335: > 2333: __ load_klass(r10_src_klass, src, rklass_tmp); > 2334: #ifdef ASSERT > 2335: // assert(src->klass() != null); nullptr ------------- PR: https://git.openjdk.org/jdk/pull/12326 From jsjolen at openjdk.org Fri Feb 10 10:13:43 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 10 Feb 2023 10:13:43 GMT Subject: RFR: JDK-8301225: Replace NULL with nullptr in share/gc/shenandoah/ [v2] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shenandoah/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them.
They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Johan Sjölen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge remote-tracking branch 'origin/master' into JDK-8301225 - Fix copyright glitches - Manual fixes - Merge remote-tracking branch 'origin/master' into JDK-8301225 - Replace NULL with nullptr in share/gc/shenandoah/ ------------- Changes: https://git.openjdk.org/jdk/pull/12251/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12251&range=01 Stats: 533 lines in 60 files changed: 0 ins; 0 del; 533 mod Patch: https://git.openjdk.org/jdk/pull/12251.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12251/head:pull/12251 PR: https://git.openjdk.org/jdk/pull/12251 From jsjolen at openjdk.org Fri Feb 10 10:13:44 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 10 Feb 2023 10:13:44 GMT Subject: RFR: JDK-8301225: Replace NULL with nullptr in share/gc/shenandoah/ In-Reply-To: References: Message-ID: On Fri, 27 Jan 2023 10:19:33 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shenandoah/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Trivial merge conflict fix and copyright fix. Tier1 passes, integrating. ------------- PR: https://git.openjdk.org/jdk/pull/12251 From jsjolen at openjdk.org Fri Feb 10 14:01:53 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 10 Feb 2023 14:01:53 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
> > An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! Passes tier1. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From kvn at openjdk.org Fri Feb 10 22:53:29 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 10 Feb 2023 22:53:29 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks!
src/hotspot/cpu/x86/interp_masm_x86.cpp line 303: > 301: jcc(Assembler::equal, L); > 302: stop("InterpreterMacroAssembler::call_VM_base:" > 303: " last_sp != nullptr"); `null` ------------- PR: https://git.openjdk.org/jdk/pull/12326 From dholmes at openjdk.org Mon Feb 13 01:46:32 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 01:46:32 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: <26pQRrJfLjbpGJk3cFZcKgGNl08NnuI2gSh0exe_pQg=.f3e40d0a-b337-4f02-825d-223f4405ccd3@github.com> On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sj?len wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
src/hotspot/cpu/x86/interp_masm_x86.cpp line 271: > 269: jcc(Assembler::equal, L); > 270: stop("InterpreterMacroAssembler::call_VM_leaf_base:" > 271: " last_sp != null"); nullptr src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2857: > 2855: void MacroAssembler::null_check(Register reg, int offset) { > 2856: if (needs_explicit_null_check(offset)) { > 2857: // provoke OS null exception if reg = null by suggestion "reg is null" src/hotspot/cpu/x86/macroAssembler_x86.hpp line 96: > 94: // Support for null-checks > 95: // > 96: // Generates code that causes a null OS exception if the content of reg is null. Note use of "reg is null" here :) src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp line 1765: > 1763: in_ByteSize(-1), > 1764: in_ByteSize(-1), > 1765: (OopMapSet*)nullptr); Unnecessary cast? src/hotspot/cpu/x86/templateTable_x86.cpp line 4188: > 4186: __ bind(done); > 4187: // rax = 0: obj == null or obj is not an instanceof the specified klass > 4188: // rax = 1: obj != null and obj is an instanceof the specified klass nullptr ------------- PR: https://git.openjdk.org/jdk/pull/12326 From dholmes at openjdk.org Mon Feb 13 01:46:35 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 01:46:35 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 22:43:54 GMT, Vladimir Kozlov wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. 
nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > src/hotspot/cpu/x86/interp_masm_x86.cpp line 303: > >> 301: jcc(Assembler::equal, L); >> 302: stop("InterpreterMacroAssembler::call_VM_base:" >> 303: " last_sp != nullptr"); > > `null` These are textual code fragments so I think `nullptr` is more appropriate - as per changes made to cpu/arm code. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From dholmes at openjdk.org Mon Feb 13 01:46:37 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 13 Feb 2023 01:46:37 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 In-Reply-To: References: Message-ID: On Mon, 6 Feb 2023 10:22:09 GMT, Johan Sj?len wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
>> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 2866: > >> 2864: } else { >> 2865: // nothing to do, (later) access of M[reg + offset] >> 2866: // will provoke OS null exception if reg = null > > is not = suggestion: "reg is null" ------------- PR: https://git.openjdk.org/jdk/pull/12326 From jsjolen at openjdk.org Mon Feb 13 09:26:07 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 13 Feb 2023 09:26:07 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! 
Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:

  Some more fixes

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12326/files
  - new: https://git.openjdk.org/jdk/pull/12326/files/432ec5d5..e0747443

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12326&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12326&range=00-01

  Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/12326.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12326/head:pull/12326

PR: https://git.openjdk.org/jdk/pull/12326

From jsjolen at openjdk.org Mon Feb 13 09:26:07 2023
From: jsjolen at openjdk.org (Johan Sjölen)
Date: Mon, 13 Feb 2023 09:26:07 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86
In-Reply-To:
References:
Message-ID:

On Tue, 31 Jan 2023 11:40:19 GMT, Johan Sjölen wrote:

> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we
> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes.
>
> Here are some typical things to look out for:
>
> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright).
> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL.
> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment.
>
> An example of this:
>
> ```c++
> // This function returns null
> void* ret_null();
> // This function returns true if *x == nullptr
> bool is_nullptr(void** x);
> ```
>
> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>
> Thanks!
I added the fixes, thanks! Last commit passed tier1. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From kvn at openjdk.org Mon Feb 13 15:56:32 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 13 Feb 2023 15:56:32 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 01:31:21 GMT, David Holmes wrote: >> src/hotspot/cpu/x86/interp_masm_x86.cpp line 303: >> >>> 301: jcc(Assembler::equal, L); >>> 302: stop("InterpreterMacroAssembler::call_VM_base:" >>> 303: " last_sp != nullptr"); >> >> `null` > > These are textual code fragments so I think `nullptr` is more appropriate - as per changes made to cpu/arm code. Right. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From rkennke at openjdk.org Mon Feb 13 17:17:33 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 13 Feb 2023 17:17:33 GMT Subject: RFR: JDK-8301225: Replace NULL with nullptr in share/gc/shenandoah/ [v2] In-Reply-To: References: Message-ID: <_bx0Obwqe0niqCLLswM5rK5rQow_OMTWjcDbz1YbDBI=.2ec50989-a772-43fb-afd9-b633f729a67b@github.com> On Fri, 10 Feb 2023 10:13:43 GMT, Johan Sj?len wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shenandoah/. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. 
>> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sj?len has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge remote-tracking branch 'origin/master' into JDK-8301225 > - Fix copyright glitches > - Manual fixes > - Merge remote-tracking branch 'origin/master' into JDK-8301225 > - Replace NULL with nullptr in share/gc/shenandoah/ Looks good to me, thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR: https://git.openjdk.org/jdk/pull/12251 From wkemper at openjdk.org Mon Feb 13 22:59:38 2023 From: wkemper at openjdk.org (William Kemper) Date: Mon, 13 Feb 2023 22:59:38 GMT Subject: RFR: Enhancements for Shenandoah's generational mode [v5] In-Reply-To: References: Message-ID: > During the course of the development of the generational mode for Shenandoah, we have found this tool useful to demonstrate progress and changes in behavior. We have also found this tool useful for troubleshooting performance issues and debugging crashes. There are many changes here, but these are the highlights: > * The age and affiliation of a region are encoded in the border and shape of the region (respectively). > * Phases are encoded with different colors for different generations and whether they have degenerated. > * A mechanism to record and replay session has been added (record feature is implemented in hotspot and is not yet upstream). > * Popup windows can be opened for additional detail on regions, as well as their history. > * The legend shows the number of regions in the state described by the legend item. > * Visualizer can now 'find' VMs running Shenandoah with region sampling enabled. 
> > Many months ago we broke backward compatibility on our branch, but we have recently restored it so the time seems right for a PR. Thank you for looking and sorry for the massive number of changes. William Kemper has updated the pull request incrementally with one additional commit since the last revision: Add notes for running with JDK17+ ------------- Changes: - all: https://git.openjdk.org/shenandoah-visualizer/pull/1/files - new: https://git.openjdk.org/shenandoah-visualizer/pull/1/files/f40f81fb..dadfaff6 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-visualizer&pr=1&range=04 - incr: https://webrevs.openjdk.org/?repo=shenandoah-visualizer&pr=1&range=03-04 Stats: 8 lines in 1 file changed: 7 ins; 1 del; 0 mod Patch: https://git.openjdk.org/shenandoah-visualizer/pull/1.diff Fetch: git fetch https://git.openjdk.org/shenandoah-visualizer pull/1/head:pull/1 PR: https://git.openjdk.org/shenandoah-visualizer/pull/1 From wkemper at openjdk.org Tue Feb 14 01:09:29 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 14 Feb 2023 01:09:29 GMT Subject: RFR: Remove unused visualizer option Message-ID: This option should have been removed when we switched the region sampling to use unified logging framework. 
-------------

Commit messages:
 - Remove unused visualizer option

Changes: https://git.openjdk.org/shenandoah/pull/217/files
 Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=217&range=00
  Stats: 6 lines in 1 file changed: 0 ins; 6 del; 0 mod
  Patch: https://git.openjdk.org/shenandoah/pull/217.diff
  Fetch: git fetch https://git.openjdk.org/shenandoah pull/217/head:pull/217

PR: https://git.openjdk.org/shenandoah/pull/217

From wkemper at openjdk.org Tue Feb 14 01:19:06 2023
From: wkemper at openjdk.org (William Kemper)
Date: Tue, 14 Feb 2023 01:19:06 GMT
Subject: RFR: Merge openjdk/jdk:master
Message-ID: <3E7uJzIwQCeViqPKamAmCYmXVl-Wuw3TydL8oOMowBY=.38f14879-eab4-4217-831f-5087d91dc2a8@github.com>

Merges tag jdk-21+9

-------------

Commit messages:
 - Merge tag 'jdk-21+9' into merge-jdk-21-9
 - 8298478: (fs) Path.of should allow input to include long path prefix
 - 8301462: Convert Permission files to use lambda after JDK-8076596
 - 8302072: Parallel: Remove unimplemented ParCompactionManager::stack_push
 - 8301767: Convert virtual thread tests to JUnit
 - 8301828: Avoid unnecessary array fill after creation in javax.swing.text
 - 8302047: G1: Remove unused G1RegionToSpaceMapper::_region_granularity
 - 8301380: jdk/jfr/api/consumer/streaming/TestCrossProcessStreaming.java
 - 8301756: Missed constructor from 8301659
 - 8301862: G1: Remove G1PageBasedVirtualSpace::_executable
 - ...
and 100 more: https://git.openjdk.org/shenandoah/compare/d258034d...9b390e23 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.org/?repo=shenandoah&pr=218&range=00.0 - openjdk/jdk:master: https://webrevs.openjdk.org/?repo=shenandoah&pr=218&range=00.1 Changes: https://git.openjdk.org/shenandoah/pull/218/files Stats: 16366 lines in 466 files changed: 9712 ins; 3475 del; 3179 mod Patch: https://git.openjdk.org/shenandoah/pull/218.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/218/head:pull/218 PR: https://git.openjdk.org/shenandoah/pull/218 From dholmes at openjdk.org Tue Feb 14 06:31:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 14 Feb 2023 06:31:45 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 09:26:07 GMT, Johan Sj?len wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! 
> > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Some more fixes LGTM! Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.org/jdk/pull/12326 From stefank at openjdk.org Tue Feb 14 09:00:55 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Tue, 14 Feb 2023 09:00:55 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v5] In-Reply-To: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> References: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> Message-ID: On Tue, 7 Feb 2023 17:25:02 GMT, Justin King wrote: >> - Rename `MEMFLAGS` to `MemoryType`. `MEMFLAGS` is highly misleading as flags typically can be combined. >> - Update `MemoryType` to have enumeration names that follow the style guide, no `mt` prefix. >> - Create aliases for old `mtXXX` names. >> - Remove `mt_number_of_types` from the enumeration. >> - Shift implementation of utilities related to `MEMFLAGS` from `NMTUtil` to `MemoryTypes`. Handle missing `mt` prefix during parsing. >> - `NMTUtil` references are not updated to avoid increasing patch size. >> - Merge `AllocFailStrategy` and `AllocFailType` into `AllocationFailureStrategy`. >> - Move `MemoryType` and `AllocationFailureStrategy` to their own respective headers. >> >> This change does not update variable verbiage. That should be something done over time. > > Justin King has updated the pull request incrementally with one additional commit since the last revision: > > Add precompiled header > > Signed-off-by: Justin King FWIW, I strongly dislike the uppercase MEMFLAGS name. I wouldn't mind this rename at all. 
------------- PR: https://git.openjdk.org/jdk/pull/12454 From rkennke at openjdk.org Tue Feb 14 11:00:32 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 11:00:32 GMT Subject: RFR: Merge openjdk/jdk:master In-Reply-To: <3E7uJzIwQCeViqPKamAmCYmXVl-Wuw3TydL8oOMowBY=.38f14879-eab4-4217-831f-5087d91dc2a8@github.com> References: <3E7uJzIwQCeViqPKamAmCYmXVl-Wuw3TydL8oOMowBY=.38f14879-eab4-4217-831f-5087d91dc2a8@github.com> Message-ID: On Tue, 14 Feb 2023 01:11:52 GMT, William Kemper wrote: > Merges tag jdk-21+9 Marked as reviewed by rkennke (Lead). ------------- PR: https://git.openjdk.org/shenandoah/pull/218 From rkennke at openjdk.org Tue Feb 14 11:36:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 11:36:23 GMT Subject: RFR: Enhancements for Shenandoah's generational mode [v5] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 22:59:38 GMT, William Kemper wrote: >> During the course of the development of the generational mode for Shenandoah, we have found this tool useful to demonstrate progress and changes in behavior. We have also found this tool useful for troubleshooting performance issues and debugging crashes. There are many changes here, but these are the highlights: >> * The age and affiliation of a region are encoded in the border and shape of the region (respectively). >> * Phases are encoded with different colors for different generations and whether they have degenerated. >> * A mechanism to record and replay session has been added (record feature is implemented in hotspot and is not yet upstream). >> * Popup windows can be opened for additional detail on regions, as well as their history. >> * The legend shows the number of regions in the state described by the legend item. >> * Visualizer can now 'find' VMs running Shenandoah with region sampling enabled. >> >> Many months ago we broke backward compatibility on our branch, but we have recently restored it so the time seems right for a PR. 
Thank you for looking and sorry for the massive number of changes. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Add notes for running with JDK17+ Hi William, thank you for this! Awesome work. I have some comments, questions and suggestions. README.md line 49: > 47: Add this additional flag to an active JVM running Shenandoah: > 48: > 49: $ -Xlog:gc+region=debug:::filesize=,filecount= I am not convinced that unified logging is the right vehicle for recording visualizer region sampling information. src/main/java/org/openjdk/shenandoah/DataLogProvider.java line 2: > 1: /* > 2: * ==== This copyright notice looks wrong. Also, update copyright notices in all files that you have changed. src/main/java/org/openjdk/shenandoah/DataProvider.java line 85: > 83: private static T getMonitor(MonitoredVm vm, String key) { > 84: try { > 85: //noinspection unchecked What is this comment trying to say? src/main/java/org/openjdk/shenandoah/RegionStat.java line 76: > 74: this.showLivenessDetail = Boolean.getBoolean("show.liveness"); > 75: } > 76: //This constructor is for CounterTest Please add space after // src/main/java/org/openjdk/shenandoah/RegionStat.java line 191: > 189: g.setColor(mixAlpha(TLAB_ALLOC, liveLvl)); > 190: fillShape(g, lx, y, tlabWidth, height); > 191: // g.setColor(TLAB_ALLOC_BORDER); No need to keep commented-out code here and elsewhere. src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 67: > 65: > 66: public static void main(String[] args) throws Exception { > 67: // Command line argument parsing This method is way too big for my taste, and has way too many inline/anonymous classes. This could benefit from some refactoring and better structuring, IMO. 
src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 85:

> 83: @Override
> 84: public void paint(Graphics g) {
> 85: render.renderGraph(g);

I know that this is pre-existing (afaict), but I would strongly recommend not rendering the complex scene directly on the frame. In my experience, the animation runs much more smoothly (less flickering, better performance, less blocking of the event dispatch thread) if the scene is rendered to an offscreen image (using a separate thread), and then, when a redraw is requested on the frame (possibly by the render thread, or whenever the GUI thinks it needs to redraw), only the offscreen image is drawn on the frame. (This could be further improved by not drawing the offscreen image onto the frame at all, but instead flipping the buffers; see here for more details on this: https://docs.oracle.com/javase/tutorial/extra/fullscreen/doublebuf.html .) Maybe this should be done as a follow-up change, and not here, though.

src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 188:

> 186: toolbarPanel.setLastActionField("File selected: " + filePath[0]);
> 187:
> 188: System.out.println("Selected file: " + filePath[0]);

Is this debug output or is this useful? Not sure.

src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 284:

> 282: toolbarPanel.setSpeedSpinnerListener(speedSpinnerListener);
> 283:
> 284: ActionListener speed_0_5_Listener = new ActionListener() {

That seems an odd format for a variable name. Maybe halfSpeedListener?

src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 298:

> 296: toolbarPanel.setSpeed_0_5_Listener(speed_0_5_Listener);
> 297:
> 298: ActionListener speed_2_Listener = new ActionListener() {

doubleSpeedListener?

src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 449:

> 447: }
> 448: });
> 449: f[0].get();

Regarding the refactoring again, it seems odd that you have a final local array field here, only to allow accessing the first element.
I feel that this should go into a separate class, and a lot of other stuff too. src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 667: > 665: } > 666: > 667: public static class RenderLive extends Render { The render classes can go into their own files. src/main/java/org/openjdk/shenandoah/Snapshot.java line 54: > 52: } > 53: > 54: //decodes for 3 bits older versions of shenandoah collector Please add space after // src/main/java/org/openjdk/shenandoah/Stopwatch.java line 2: > 1: /* > 2: * ==== Bad copyright notice, again src/main/java/org/openjdk/shenandoah/ToolbarPanel.java line 2: > 1: /* > 2: * ==== And again ------------- Changes requested by rkennke (Reviewer). PR: https://git.openjdk.org/shenandoah-visualizer/pull/1 From rkennke at openjdk.org Tue Feb 14 11:36:23 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 14 Feb 2023 11:36:23 GMT Subject: RFR: Enhancements for Shenandoah's generational mode [v5] In-Reply-To: References: Message-ID: <04bKuLrSzo56TGZ6psgUvqcosYDVXwF4sDVLG2fZaXM=.95f7a35b-54d2-445d-bf36-1102299d5507@github.com> On Tue, 14 Feb 2023 11:06:18 GMT, Roman Kennke wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Add notes for running with JDK17+ > > src/main/java/org/openjdk/shenandoah/DataLogProvider.java line 2: > >> 1: /* >> 2: * ==== > > This copyright notice looks wrong. Also, update copyright notices in all files that you have changed. Why are you adding Red Hat to files that you created new? Or have you moved existing code into those new files? 
------------- PR: https://git.openjdk.org/shenandoah-visualizer/pull/1 From wkemper at openjdk.org Tue Feb 14 16:03:11 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 14 Feb 2023 16:03:11 GMT Subject: Integrated: Merge openjdk/jdk:master In-Reply-To: <3E7uJzIwQCeViqPKamAmCYmXVl-Wuw3TydL8oOMowBY=.38f14879-eab4-4217-831f-5087d91dc2a8@github.com> References: <3E7uJzIwQCeViqPKamAmCYmXVl-Wuw3TydL8oOMowBY=.38f14879-eab4-4217-831f-5087d91dc2a8@github.com> Message-ID: <7zhLyJQgfwcE22BbVq5kDI1YVcw4bohxfmztnnBcBxY=.855e74dc-e736-4eef-ab6d-e40e6871ed2d@github.com> On Tue, 14 Feb 2023 01:11:52 GMT, William Kemper wrote: > Merges tag jdk-21+9 This pull request has now been integrated. Changeset: 779314b0 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/779314b0590bdb2e2fada3522d2bc977237ec1bb Stats: 16366 lines in 466 files changed: 9712 ins; 3475 del; 3179 mod Merge openjdk/jdk:master Reviewed-by: rkennke ------------- PR: https://git.openjdk.org/shenandoah/pull/218 From kvn at openjdk.org Tue Feb 14 19:19:45 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 14 Feb 2023 19:19:45 GMT Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2] In-Reply-To: References: Message-ID: On Mon, 13 Feb 2023 09:26:07 GMT, Johan Sj?len wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory cpu/x86. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. 
An exception is made when code expressions are in a comment.
>>
>> An example of this:
>>
>> ```c++
>> // This function returns null
>> void* ret_null();
>> // This function returns true if *x == nullptr
>> bool is_nullptr(void** x);
>> ```
>>
>> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`.
>>
>> Thanks!
>
> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>
> Some more fixes

In addition to a few missed places, I noticed that you keep the cast in `mov_metadata(rbx, (Metadata*)nullptr);`. Is `Metadata*` something special that requires the cast?

And a more general question: do we need to keep some casts for `nullptr`? We kept them in previous PRs that are already pushed (#12029). Should we update our "spec" for `nullptr` to remove all casts or keep some? What is the condition for keeping one?

src/hotspot/cpu/x86/interp_masm_x86.cpp line 1785:

> 1783: // // degenerate decision tree, rooted at row[2]
> 1784: // if (row[2].rec == rec) { row[2].incr(); goto done; }
> 1785: // if (row[2].rec != null) { count.incr(); goto done; } // overflow

Missed change to nullptr.

-------------

Changes requested by kvn (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12326

From kvn at openjdk.org Tue Feb 14 19:19:49 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Tue, 14 Feb 2023 19:19:49 GMT
Subject: RFR: JDK-8301498: Replace NULL with nullptr in cpu/x86 [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 6 Feb 2023 10:21:25 GMT, Johan Sjölen wrote:

>> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Some more fixes
>
> src/hotspot/cpu/x86/interp_masm_x86.cpp line 402:
>
>> 400: movptr(tmp, Address(rthread, JavaThread::jvmti_thread_state_offset()));
>> 401: testptr(tmp, tmp);
>> 402: jcc(Assembler::zero, L); // if (thread->jvmti_thread_state() == null) exit;
>
> nullptr

Missed change to nullptr as you suggested.
> src/hotspot/cpu/x86/interp_masm_x86.cpp line 1779: > >> 1777: // // main copy of decision tree, rooted at row[1] >> 1778: // if (row[0].rec == rec) { row[0].incr(); goto done; } >> 1779: // if (row[0].rec != null) { > > nullptr Missed change to nullptr as you suggested. > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 4265: > >> 4263: } >> 4264: >> 4265: // for (scan = klass->itable(); scan->interface() != null; scan += scan_step) { > > nullptr Missed change to nullptr as you suggested. ------------- PR: https://git.openjdk.org/jdk/pull/12326 From wkemper at openjdk.org Tue Feb 14 19:26:19 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 14 Feb 2023 19:26:19 GMT Subject: RFR: Enhancements for Shenandoah's generational mode [v5] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 11:02:19 GMT, Roman Kennke wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Add notes for running with JDK17+ > > README.md line 49: > >> 47: Add this additional flag to an active JVM running Shenandoah: >> 48: >> 49: $ -Xlog:gc+region=debug:::filesize=,filecount= > > I am not convinced that unified logging is the right vehicle for recording visualizer region sampling information. We originally had a separate argument for configuring the log file, but as we started to add features to control the file's size and rotation it became clear that we were duplicating too much of the functionality already in unified logging - so we scrapped it. 
------------- PR: https://git.openjdk.org/shenandoah-visualizer/pull/1 From wkemper at openjdk.org Tue Feb 14 19:26:19 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 14 Feb 2023 19:26:19 GMT Subject: RFR: Enhancements for Shenandoah's generational mode [v5] In-Reply-To: <04bKuLrSzo56TGZ6psgUvqcosYDVXwF4sDVLG2fZaXM=.95f7a35b-54d2-445d-bf36-1102299d5507@github.com> References: <04bKuLrSzo56TGZ6psgUvqcosYDVXwF4sDVLG2fZaXM=.95f7a35b-54d2-445d-bf36-1102299d5507@github.com> Message-ID: <6pBmo0wSd7tfwIEvf60moYt7qeQ_1scIVSTAr67CaEI=.bf2154c3-eb39-4922-a2bd-76e371c5ee19@github.com> On Tue, 14 Feb 2023 11:06:53 GMT, Roman Kennke wrote: >> src/main/java/org/openjdk/shenandoah/DataLogProvider.java line 2: >> >>> 1: /* >>> 2: * ==== >> >> This copyright notice looks wrong. Also, update copyright notices in all files that you have changed. > > Why are you adding Red Hat to files that you created new? Or have you moved existing code into those new files? I'll go over the copyright notices. ------------- PR: https://git.openjdk.org/shenandoah-visualizer/pull/1 From wkemper at openjdk.org Tue Feb 14 19:33:17 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 14 Feb 2023 19:33:17 GMT Subject: RFR: Enhancements for Shenandoah's generational mode [v5] In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 11:08:53 GMT, Roman Kennke wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Add notes for running with JDK17+ > > src/main/java/org/openjdk/shenandoah/DataProvider.java line 85: > >> 83: private static T getMonitor(MonitoredVm vm, String key) { >> 84: try { >> 85: //noinspection unchecked > > What is this comment trying to say? This is an IDE directive to stop it from complaining about the unchecked cast. I can remove it. 
> src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 67: > >> 65: >> 66: public static void main(String[] args) throws Exception { >> 67: // Command line argument parsing > > This method is way too big for my taste, and has way too many inline/anonymous classes. This could benefit from some refactoring and better structuring, IMO. I'll see what I can do here. > src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 85: > >> 83: @Override >> 84: public void paint(Graphics g) { >> 85: render.renderGraph(g); > > I know that this is pre-existing (afaict), but I would strongly recommend to not render the complex stuff directly on the frame. In my experience, the animation goes much smoother (less flickering, better performance, less blocking of event dispatch thread), if the scene is rendered on an offscreen image (using a separate thread), and then, when redrawing is requested on the frame (possibly requested by the render thread, or whenever the GUI thinks it needs to redraw), only draw the offscreen image on the frame. (This could be further improved by not actually drawing on the offscreen image on the frame, but instead flip the buffers, see here for more details on this: https://docs.oracle.com/javase/tutorial/extra/fullscreen/doublebuf.html .) Maybe this should be done as a follow-up change, and not here, though, I will look into this. Agree it should be a separate change. > src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 188: > >> 186: toolbarPanel.setLastActionField("File selected: " + filePath[0]); >> 187: >> 188: System.out.println("Selected file: " + filePath[0]); > > Is this debug output or is this useful? Not sure. It's debugging. I can have this use JUL? 
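For reference, routing the debug message discussed above through java.util.logging (JUL) could look roughly like the following minimal sketch. The class `FileSelectionLogging` and its two helper methods are hypothetical names chosen for illustration and are not part of the visualizer; only the `java.util.logging` calls are standard JDK API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the visualizer class that currently prints to System.out.
public class FileSelectionLogging {
    private static final Logger LOG = Logger.getLogger(FileSelectionLogging.class.getName());

    // Builds the same text the System.out.println call produced.
    static String describeSelection(String filePath) {
        return "Selected file: " + filePath;
    }

    static void logSelectedFile(String filePath) {
        // FINE is below the default INFO threshold, so the message is
        // suppressed unless logging is configured for debugging.
        LOG.log(Level.FINE, () -> describeSelection(filePath));
    }

    public static void main(String[] args) {
        logSelectedFile("example.log");
    }
}
```

Using the `Supplier` overload means the message string is only built when the FINE level is actually enabled.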
> src/main/java/org/openjdk/shenandoah/ShenandoahVisualizer.java line 284: > >> 282: toolbarPanel.setSpeedSpinnerListener(speedSpinnerListener); >> 283: >> 284: ActionListener speed_0_5_Listener = new ActionListener() { > > That seems an odd variable name (format). Maybe halfSpeedListener ? Agreed, I'll change this. ------------- PR: https://git.openjdk.org/shenandoah-visualizer/pull/1 From jsjolen at openjdk.org Wed Feb 15 13:43:59 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 15 Feb 2023 13:43:59 GMT Subject: RFR: JDK-8301225: Replace NULL with nullptr in share/gc/shenandoah/ [v2] In-Reply-To: References: Message-ID: On Fri, 10 Feb 2023 10:13:43 GMT, Johan Sjölen wrote: >> Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shenandoah/. Unfortunately the script that does the change isn't perfect, and so we >> need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. >> >> Here are some typical things to look out for: >> >> 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). >> 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. >> 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. >> >> An example of this: >> >> ```c++ >> // This function returns null >> void* ret_null(); >> // This function returns true if *x == nullptr >> bool is_nullptr(void** x); >> ``` >> >> Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. >> >> Thanks! > > Johan Sjölen has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains five commits: > > - Merge remote-tracking branch 'origin/master' into JDK-8301225 > - Fix copyright glitches > - Manual fixes > - Merge remote-tracking branch 'origin/master' into JDK-8301225 > - Replace NULL with nullptr in share/gc/shenandoah/ There we go, let's integrate! ------------- PR: https://git.openjdk.org/jdk/pull/12251 From jsjolen at openjdk.org Wed Feb 15 13:44:01 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Wed, 15 Feb 2023 13:44:01 GMT Subject: Integrated: JDK-8301225: Replace NULL with nullptr in share/gc/shenandoah/ In-Reply-To: References: Message-ID: On Fri, 27 Jan 2023 10:19:33 GMT, Johan Sjölen wrote: > Hi, this PR changes all occurrences of NULL to nullptr for the subdirectory share/gc/shenandoah/. Unfortunately the script that does the change isn't perfect, and so we > need to comb through these manually to make sure nothing has gone wrong. I also review these changes but things slip past my eyes sometimes. > > Here are some typical things to look out for: > > 1. No changes but copyright header changed (probably because I reverted some changes but forgot the copyright). > 2. Macros having their NULL changed to nullptr, these are added to the script when I find them. They should be NULL. > 3. nullptr in comments and logs. We try to use lower case "null" in these cases as it reads better. An exception is made when code expressions are in a comment. > > An example of this: > > ```c++ > // This function returns null > void* ret_null(); > // This function returns true if *x == nullptr > bool is_nullptr(void** x); > ``` > > Note how `nullptr` participates in a code expression here, we really are talking about the specific value `nullptr`. > > Thanks! This pull request has now been integrated.
Changeset: 0c965844 Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/0c9658446d111ec944f06b7a8a4e3ae7bf53ee8d Stats: 533 lines in 60 files changed: 0 ins; 0 del; 533 mod 8301225: Replace NULL with nullptr in share/gc/shenandoah/ Reviewed-by: wkemper, kdnilsen, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/12251 From kdnilsen at openjdk.org Wed Feb 15 14:56:31 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 15 Feb 2023 14:56:31 GMT Subject: RFR: Remove unused visualizer option In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 01:02:58 GMT, William Kemper wrote: > This option should have been removed when we switched the region sampling to use the unified logging framework. Marked as reviewed by kdnilsen (Committer). ------------- PR: https://git.openjdk.org/shenandoah/pull/217 From stuefe at openjdk.org Thu Feb 16 13:00:37 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 16 Feb 2023 13:00:37 GMT Subject: RFR: JDK-8301983: Refactor MEMFLAGS and AllocFailStrategy [v5] In-Reply-To: References: <60XHlRRmnr8C-BcA_pRbLwIs7P5lqv5cayfc4ifBHeE=.12ed00cf-afc1-410a-b415-4425769be62b@github.com> Message-ID: On Tue, 14 Feb 2023 08:57:38 GMT, Stefan Karlsson wrote: > FWIW, I strongly dislike the uppercase MEMFLAGS name. I wouldn't mind this rename at all. Thinking this through again, I somewhat regret my strong opposition. It reeks too much of resisting change for resisting's sake, and we don't want that. While I still think the change is much too big, I have never liked MEMFLAGS myself either. In comments and code, it has been named "nmt category" in places, which I like more. I think a simple renaming change MEMFLAGS->xx, with xx being something like "NMTCategory" or "NMTTags" or similar, would be okay. If other refactorings are omitted, such an isolated change could be backported more easily to older releases.
------------- PR: https://git.openjdk.org/jdk/pull/12454 From wkemper at openjdk.org Thu Feb 16 17:55:11 2023 From: wkemper at openjdk.org (William Kemper) Date: Thu, 16 Feb 2023 17:55:11 GMT Subject: Integrated: Remove unused visualizer option In-Reply-To: References: Message-ID: On Tue, 14 Feb 2023 01:02:58 GMT, William Kemper wrote: > This option should have been removed when we switched the region sampling to use unified logging framework. This pull request has now been integrated. Changeset: 0dbacdfe Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/0dbacdfeb7229279a7b5f2ec8075c8bbd71a3a36 Stats: 6 lines in 1 file changed: 0 ins; 6 del; 0 mod Remove unused visualizer option Reviewed-by: kdnilsen ------------- PR: https://git.openjdk.org/shenandoah/pull/217 From duke at openjdk.org Fri Feb 17 03:15:56 2023 From: duke at openjdk.org (SUN Guoyun) Date: Fri, 17 Feb 2023 03:15:56 GMT Subject: RFR: 8276799: Implementation of JEP 422: Linux/RISC-V Port [v5] In-Reply-To: References: Message-ID: On Thu, 24 Mar 2022 07:01:43 GMT, Fei Yang wrote: >> This PR implements JEP 422: Linux/RISC-V Port [1]. >> The PR starts as a squashed merge of the https://openjdk.java.net/projects/riscv-port branch. >> >> This has been tested with jtreg tier{1,2,3,4} and jcstress on HiFive Unmatched board. Dacapo, SPECjbb2015 and SPECjvm2008 benchmark tests are also carried out regularly. So it should be good enough to run most Java programs. >> >> [1] https://openjdk.java.net/jeps/422 > > Fei Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: > > - Merge remote-tracking branch 'upstream/master' into JDK-8276799 > - Fix copyright header > - Address review comments > - Merge remote-tracking branch 'upstream/master' into JDK-8276799 > - 8276799: Implementation of JEP 422: Linux/RISC-V Port src/hotspot/cpu/riscv/riscv.ad line 8758: > 8756: %{ > 8757: // Same match rule as `far_cmpU_loop'. > 8758: match(CountedLoopEnd cmp (CmpU op1 op2)); Which testcases can test this instruct and the following instructs? match(CountedLoopEnd cmp (CmpP op1 op2)); match(CountedLoopEnd cmp (CmpN op1 op2)); match(CountedLoopEnd cmp (CmpF op1 op2)); match(CountedLoopEnd cmp (CmpD op1 op2)); I suspect this instruction is useless. ------------- PR: https://git.openjdk.org/jdk/pull/6294 From wkemper at openjdk.org Fri Feb 17 16:14:22 2023 From: wkemper at openjdk.org (William Kemper) Date: Fri, 17 Feb 2023 16:14:22 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: Merges tag jdk-21+10. ------------- Commit messages: - Replace NULL with nullptr in generational mode changes - Merge jdk-21+10 - 8302163: Speed up various String comparison methods with ArraysSupport.mismatch - 8301460: Clean up LambdaForm to reference BasicType enums directly - 8302127: Remove unused arg in write_ref_field_post - 8301225: Replace NULL with nullptr in share/gc/shenandoah/ - 8301700: Increase the default TLS Diffie-Hellman group size from 1024-bit to 2048-bit - 8301463: Code in DatagramSocket still refers to resolved JDK-8237352 - 8302325: Wrong comment in java.base/share/native/libjimage/imageFile.hpp - 8300808: Accelerate Base64 on x86 for AVX2 - ... 
and 96 more: https://git.openjdk.org/shenandoah/compare/0dbacdfe...0dd53585 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.org/?repo=shenandoah&pr=219&range=00.0 - openjdk/jdk:master: https://webrevs.openjdk.org/?repo=shenandoah&pr=219&range=00.1 Changes: https://git.openjdk.org/shenandoah/pull/219/files Stats: 45222 lines in 951 files changed: 16760 ins; 14148 del; 14314 mod Patch: https://git.openjdk.org/shenandoah/pull/219.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/219/head:pull/219 PR: https://git.openjdk.org/shenandoah/pull/219 From wkemper at openjdk.org Fri Feb 17 21:59:07 2023 From: wkemper at openjdk.org (William Kemper) Date: Fri, 17 Feb 2023 21:59:07 GMT Subject: Integrated: Merge openjdk/jdk:master In-Reply-To: References: Message-ID: On Fri, 17 Feb 2023 15:42:19 GMT, William Kemper wrote: > Merges tag jdk-21+10. This pull request has now been integrated. Changeset: c778125c Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/c778125c052cbbeaaf416053118c623c326ca098 Stats: 45222 lines in 951 files changed: 16760 ins; 14148 del; 14314 mod Merge openjdk/jdk:master ------------- PR: https://git.openjdk.org/shenandoah/pull/219 From andrew at openjdk.org Mon Feb 20 21:40:15 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Mon, 20 Feb 2023 21:40:15 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag jdk8u332-b04 for changeset 7f3e86c8 Message-ID: <171a8955-7dab-46c6-af46-702ec8804455@openjdk.org> Tagged by: Andrew John Hughes Date: 2022-02-28 03:52:41 +0000 Changeset: 7f3e86c8 Author: Alexey Bakhtin Date: 2022-02-28 03:36:59 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/7f3e86c82c6c25cb2926f178029481d1ec62f0c4 From andrew at openjdk.org Mon Feb 20 21:40:19 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Mon, 20 Feb 2023 21:40:19 GMT Subject: git: openjdk/shenandoah-jdk8u: Added tag shenandoah8u332-b04 for 
changeset fe360d8d Message-ID: Tagged by: Andrew John Hughes Date: 2023-02-20 21:38:15 +0000 Added tag shenandoah8u332-b04 for changeset fe360d8dfcf Changeset: fe360d8d Author: Andrew John Hughes Date: 2023-02-09 16:44:17 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/fe360d8dfcf31ff24245d4075ea1017732b9cdb4 From andrew at openjdk.org Mon Feb 20 21:40:35 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Mon, 20 Feb 2023 21:40:35 GMT Subject: git: openjdk/shenandoah-jdk8u: master: 3 new changesets Message-ID: <2461c6e4-b8a8-473d-ab17-c4be37932b11@openjdk.org> Changeset: 9a129e3a Author: Andrew John Hughes Date: 2022-02-23 01:58:09 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/9a129e3aa29a82364ed0c7a47a94745bbdd0fa64 Added tag jdk8u332-b03 for changeset 7376b980d6b0 ! .hgtags Changeset: 7f3e86c8 Author: Alexey Bakhtin Date: 2022-02-28 03:36:59 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/7f3e86c82c6c25cb2926f178029481d1ec62f0c4 8274524: SSLSocket.close() hangs if it is called during the ssl handshake Reviewed-by: phh, andrew ! jdk/src/share/classes/sun/security/ssl/SSLSocketImpl.java + jdk/test/sun/security/ssl/SSLSocketImpl/ClientSocketCloseHang.java Changeset: fe360d8d Author: Andrew John Hughes Date: 2023-02-09 16:44:17 +0000 URL: https://git.openjdk.org/shenandoah-jdk8u/commit/fe360d8dfcf31ff24245d4075ea1017732b9cdb4 Merge jdk8u332-b04 From andrew at openjdk.org Mon Feb 20 21:44:26 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Mon, 20 Feb 2023 21:44:26 GMT Subject: RFR: Merge jdk8u:master [v2] In-Reply-To: References: Message-ID: > Merge jdk8u332-b04 Andrew John Hughes has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk8u/pull/9/files - new: https://git.openjdk.org/shenandoah-jdk8u/pull/9/files/aa516918..fe360d8d Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=9&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk8u&pr=9&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/9.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u pull/9/head:pull/9 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/9 From andrew at openjdk.org Mon Feb 20 21:44:28 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Mon, 20 Feb 2023 21:44:28 GMT Subject: Withdrawn: Merge jdk8u:master In-Reply-To: References: Message-ID: On Thu, 9 Feb 2023 16:48:55 GMT, Andrew John Hughes wrote: > Merge jdk8u332-b04 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/shenandoah-jdk8u/pull/9 From andrew at openjdk.org Mon Feb 20 21:57:05 2023 From: andrew at openjdk.org (Andrew John Hughes) Date: Mon, 20 Feb 2023 21:57:05 GMT Subject: RFR: Merge jdk8u:master Message-ID: Merge jdk8u332-b05 ------------- Commit messages: - Merge jdk8u332-b05 - 8041523: Xerces Update: Serializer improvements from Xalan - Added tag jdk8u332-b04 for changeset f58fc9077d22 The merge commit only contains trivial merges, so no merge-specific webrevs have been generated. 
Changes: https://git.openjdk.org/shenandoah-jdk8u/pull/10/files Stats: 1230 lines in 13 files changed: 722 ins; 346 del; 162 mod Patch: https://git.openjdk.org/shenandoah-jdk8u/pull/10.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk8u pull/10/head:pull/10 PR: https://git.openjdk.org/shenandoah-jdk8u/pull/10 From mdoerr at openjdk.org Wed Feb 22 05:39:36 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 22 Feb 2023 05:39:36 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) Message-ID: Description will get added soon. ------------- Commit messages: - Initial Panama implementation. Changes: https://git.openjdk.org/jdk/pull/12708/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303040 Stats: 1973 lines in 58 files changed: 1865 ins; 1 del; 107 mod Patch: https://git.openjdk.org/jdk/pull/12708.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12708/head:pull/12708 PR: https://git.openjdk.org/jdk/pull/12708 From pminborg at openjdk.org Wed Feb 22 14:11:45 2023 From: pminborg at openjdk.org (Per Minborg) Date: Wed, 22 Feb 2023 14:11:45 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <_QETLZVuG6pkWyXvp1Plh3rOVjFn_sYEP09AE7MdsAE=.6f20102e-eaa0-4425-bc26-ab95ab6338ee@github.com> On Wed, 22 Feb 2023 05:31:46 GMT, Martin Doerr wrote: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). 
> > Big Endian support is implemented to some extent, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both an FP reg and a GP reg or stack slot (see "no partial DW rule"). These cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE).
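The Big Endian remark in the list above (storing 8 bytes and loading 4 bytes gives different values depending on byte order) can be reproduced in plain Java. The following is only an illustrative sketch using `java.nio.ByteBuffer`; the class and method names are made up for this example and have nothing to do with the PR's `VMStorage` code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianWidthDemo {
    // Store a 64-bit value, then load only the first 4 bytes with the same byte order.
    static int storeLongLoadInt(long value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).order(order);
        buf.putLong(0, value);
        return buf.getInt(0);
    }

    public static void main(String[] args) {
        long value = 0x12345678L; // fits in the low 32 bits
        // Little endian: the low half is stored first, so the 4-byte load sees the value.
        System.out.printf("little endian: 0x%08x%n", storeLongLoadInt(value, ByteOrder.LITTLE_ENDIAN));
        // Big endian: the high half is stored first, so the 4-byte load sees zero.
        System.out.printf("big endian:    0x%08x%n", storeLongLoadInt(value, ByteOrder.BIG_ENDIAN));
    }
}
```

On little endian the narrow load happens to return the stored value, which is why a mismatch between store and load widths can go unnoticed there but not on big endian.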
In the most recent version of the Panama FFM API, any memory layout (including struct and padding layouts) are always byte aligned. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Wed Feb 22 17:05:49 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 17:05:49 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: I will do a more thorough review soon. Some preliminary comments: > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. FWIW, we have to do this for Windows vararg floats as well ([here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallArranger.java#L231-L239)) This can be done by `dup`-ing the value, and using 2 `vmStore`s. (each `vmStore` corresponding to a single register/stack location). Doing something similar might be simpler than the `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT` storage types you're using right now. I'm not sure if that is related to the other limitations you mention? Might be interesting to look into. (perhaps as a separate RFE.
I don't have a big issue since the current approach stays in PPC-only code) > I had to make changes to shared code and code for other platforms: > > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > > > * PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > > * Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > > * Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! I think supplying the `BasicType` is fine. `VMReg` doesn't have any width information attached to it, and that's why a complementary `BasicType` is needed. I'm glad to see that you could make it work with the register masks for `VMStorage` :) WRT the extension of int -> long. This could potentially also be handled in Java by adding the conversion as a `Cast` binding variant, and then adding the widening casts in `CallArranger`. (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense). Since the extension seems to be a figment of the C ABI, that could be preferable, since it has the benefit of the VM code staying ABI-agnostic. This is potentially important if we want to add other ABIs in the future. But, we can also cross that bridge when we get to it (and there are probably more bridges to cross in that case too). So, up to you, really. (It's similar to the discussion surrounding floats for RISCV, if you followed that) > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. 
Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Zero-length memory segments are supposed to be resized before they are written to or read from (see [Zero-length memory segments](https://download.java.net/java/early_access/jdk20/docs/api/java.base/java/lang/foreign/MemorySegment.html#wrapping-addresses)). We shouldn't disable the check for them, as that would have far-reaching implications for the safety design of the memory access API. Can you explain a bit more about where/why/how the issue occurs? ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Wed Feb 22 17:38:55 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 17:38:55 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Wed, 22 Feb 2023 17:03:16 GMT, Jorn Vernee wrote: > (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense) FYI: https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:I2L (assuming `I2L` has the same semantics as `extsw`). Then just add a `.cast(int.class, long.class)` wherever an `int` is stored. 
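As a small illustration of the widening mentioned above: in Java, converting an `int` to a `long` is a sign-extending widening conversion, which javac compiles to the `I2L` bytecode. Whether that matches the PPC64 `extsw` instruction is the assumption stated in the message; the sketch below only shows the Java side, and the class and method names are invented for this example.

```java
public class IntToLongWidening {
    // Widening primitive conversion; compiled to the I2L bytecode.
    static long widen(int value) {
        return (long) value;
    }

    public static void main(String[] args) {
        // Negative input: the upper 32 bits become copies of the sign bit.
        System.out.printf("widen(-1)          = 0x%016x%n", widen(-1));
        // Non-negative input: the upper 32 bits are zero.
        System.out.printf("widen(0x7fffffff)  = 0x%016x%n", widen(0x7fff_ffff));
        // Zero extension, for contrast, needs an explicit helper.
        System.out.printf("toUnsignedLong(-1) = 0x%016x%n", Integer.toUnsignedLong(-1));
    }
}
```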
------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Wed Feb 22 18:28:05 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Wed, 22 Feb 2023 18:28:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <6Kr2C2phn8UoKIB6MbRn0o-0IuGo5BGFaVXVN9jS5Pg=.d59b2968-a811-487a-9775-d7a868bbbda7@github.com> src/java.base/share/classes/jdk/internal/foreign/PlatformLayouts.java line 266: > 264: * The {@code T*} native type. > 265: */ > 266: public static final ValueLayout.OfAddress C_POINTER = ValueLayout.ADDRESS.withBitAlignment(64); I think this is where the issue with the check in `MemorySegment::copy` comes from. Note how other platforms add a call to `asUnbounded` for the created layout, which makes any pointer boxed using this layout writable/readable (such as the in memory return pointer for upcalls).
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Wed Feb 22 18:34:36 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 22 Feb 2023 18:34:36 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <-TEFo_FkmQmlWADKNrtOBByV6R0wOrrsVxDtpwf7J7w=.16c3e1cc-4cd8-4157-ab2c-15b0be9454f8@github.com> On Wed, 22 Feb 2023 05:31:46 GMT, Martin Doerr wrote: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. 
This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Thanks for looking into this port! src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1359: > 1357: long size = elementCount * srcElementLayout.byteSize(); > 1358: srcImpl.checkAccess(srcOffset, size, true); > 1359: if (dstImpl instanceof NativeMemorySegmentImpl && dstImpl.byteSize() == 0) { As Jorn said, this change is not acceptable, as it allows bulk copy disregarding the segment real size. In such cases, the issue is always a missing unsafe resize somewhere in the linker code. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:37:49 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:37:49 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". 
> > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. 
It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Clean fix for NativeMemorySegmentImpl issue with byteSize 0. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12708/files - new: https://git.openjdk.org/jdk/pull/12708/files/4a5debfc..7315fd20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=00-01 Stats: 6 lines in 2 files changed: 0 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12708.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12708/head:pull/12708 PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:40:04 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:40:04 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <-TEFo_FkmQmlWADKNrtOBByV6R0wOrrsVxDtpwf7J7w=.16c3e1cc-4cd8-4157-ab2c-15b0be9454f8@github.com> References: <-TEFo_FkmQmlWADKNrtOBByV6R0wOrrsVxDtpwf7J7w=.16c3e1cc-4cd8-4157-ab2c-15b0be9454f8@github.com> Message-ID: On Wed, 22 Feb 2023 18:31:45 GMT, Maurizio Cimadamore wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> Clean fix for NativeMemorySegmentImpl issue with byteSize 0. 
> > src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1359: > >> 1357: long size = elementCount * srcElementLayout.byteSize(); >> 1358: srcImpl.checkAccess(srcOffset, size, true); >> 1359: if (dstImpl instanceof NativeMemorySegmentImpl && dstImpl.byteSize() == 0) { > > As Jorn said, this change is not acceptable, as it allows bulk copy disregarding the segment real size. In such cases, the issue is always a missing unsafe resize somewhere in the linker code. Removed this workaround. I'm glad to get rid of it. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:40:07 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:40:07 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <6Kr2C2phn8UoKIB6MbRn0o-0IuGo5BGFaVXVN9jS5Pg=.d59b2968-a811-487a-9775-d7a868bbbda7@github.com> References: <6Kr2C2phn8UoKIB6MbRn0o-0IuGo5BGFaVXVN9jS5Pg=.d59b2968-a811-487a-9775-d7a868bbbda7@github.com> Message-ID: On Wed, 22 Feb 2023 18:23:54 GMT, Jorn Vernee wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> Clean fix for NativeMemorySegmentImpl issue with byteSize 0. > > src/java.base/share/classes/jdk/internal/foreign/PlatformLayouts.java line 266: > >> 264: * The {@code T*} native type. >> 265: */ >> 266: public static final ValueLayout.OfAddress C_POINTER = ValueLayout.ADDRESS.withBitAlignment(64); > > I think this is where the issue with the check in `MemorySegment::copy` comes from. Note how other platforms add a call to `asUnbounded` for the created layout, which makes any pointer boxed using this layout writable/readable (such as the in memory return pointer for upcalls). Thanks for the hint! You found it pretty quickly. I had missed that when rebasing my early prototype. Fixed. 
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:47:05 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:47:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 04:37:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. 
This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Clean fix for NativeMemorySegmentImpl issue with byteSize 0. > > (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense) > > FYI: [master...JornVernee:jdk:I2L](https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:I2L) (assuming `I2L` has the same semantics as `extsw`). Then just add a `.cast(int.class, long.class)` wherever currently an `int` is `vmStore`d in the PPC CallArranger. Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? 
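For readers following the `I2L` versus `extsw` point: the thread states that both perform the same sign extension from 32 to 64 bits. The sketch below only illustrates that semantics in plain Java. The class and method names are made up for illustration and are not part of the patch or of the linker code.

```java
public class SignExtendSketch {
    // Widening int -> long in Java compiles to the I2L bytecode, which sign-extends.
    static long widen(int value) {
        return value;
    }

    // The same result computed by hand: move the 32 bits into the upper half of a
    // long, then shift back arithmetically so bit 31 is copied into bits 32..63.
    static long signExtendByHand(int value) {
        return ((value & 0xFFFF_FFFFL) << 32) >> 32;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int s : samples) {
            // Both columns should agree for every sample.
            System.out.printf("%11d -> 0x%016X / 0x%016X%n", s, widen(s), signExtendByHand(s));
        }
    }
}
```

Whether the extension happens in a Java-side binding or in the downcall stub, the value that reaches the 64-bit register is the same; the discussion is only about where the conversion is performed.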
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 04:56:02 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 04:56:02 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: <4KT43N8FvdfdmoAeaXw1vEBMP20MhpV5RUTZzeD2DqQ=.39e184f4-e9a3-47bd-9cad-ed9fe51a0d7b@github.com> On Wed, 22 Feb 2023 17:03:16 GMT, Jorn Vernee wrote: > I will do a more thorough review soon. Thanks a lot! > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > FWIW, we have to do this for Windows vararg floats as well ([here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallArranger.java#L231-L239)) > > This can be done by `dup`-ing the value, and using 2 `vmStore`s. (each `vmStore` corresponding to a single register/stack location). Doing something similar might be simpler than the `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT` storage types you're using right now. I'm not sure if that is related to the other limitations you mention? Might be interesting to look into. (perhaps as a separate RFE. I don't have a big issue since the current approach stays in PPC-only code) Maybe I need to think a bit more about it. I don't really like the extra cases for `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT`. On the other side, doing it in the CallArranger would break the design of factoring out the allocation from the binding generation. In addition, it seems like PPC64 is even more tricky than the Windows case. I need to pass 2 float arguments in a GP reg (or stack slot) plus one of these 2 floats in float register F13. I think this can get implemented more easily in the backend. Do you agree? 
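To make the "2 float arguments in a GP reg (or stack slot)" case more concrete, here is a minimal sketch of what the 64-bit image of two adjacent floats looks like when read as one little-endian doubleword. It is only an illustration of the memory layout, assuming little-endian byte order. The class and method names are hypothetical, and this is not how the CallArranger or the backend in the PR is written.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class HfaDoublewordSketch {
    // Returns the 64-bit image of two adjacent floats as they sit in
    // little-endian memory: the first float ends up in the low 32 bits.
    static long packLittleEndian(float first, float second) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        buf.putFloat(0, first);
        buf.putFloat(4, second);
        return buf.getLong(0);
    }

    public static void main(String[] args) {
        long doubleword = packLittleEndian(1.0f, 2.0f);
        System.out.printf("doubleword = 0x%016X%n", doubleword);
    }
}
```

A value like this is what would travel in the GP register or stack slot, while one of the two floats may additionally have to be passed in an FP register, as described in the message above.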
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 06:18:49 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 06:18:49 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> > Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". > > This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). > > Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. > > There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > I had to make changes to shared code and code for other platforms: > 1. Pass type information when creating `VMStorage` objects from `VMReg`. 
This is needed for the following reasons: > - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. > - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. > - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! > 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Remove size restriction for structs. Add TODO for Big Endian. 
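The remark that storing 8 bytes and loading 4 bytes gives different values on Big Endian and Little Endian can be reproduced with a small sketch. It uses `java.nio.ByteBuffer` purely as a stand-in for a stack slot. The class name, the helper name and the sample value are assumptions made for illustration and do not come from the patch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PartialLoadSketch {
    // Store a 64-bit value into an 8-byte slot, then load only the first 4 bytes.
    static int storeLongLoadInt(long value, ByteOrder order) {
        ByteBuffer slot = ByteBuffer.allocate(8).order(order);
        slot.putLong(0, value);
        return slot.getInt(0);
    }

    public static void main(String[] args) {
        long value = 42L;
        // Little endian: the low-order bytes come first, so the 4-byte load sees 42.
        System.out.println("LE: " + storeLongLoadInt(value, ByteOrder.LITTLE_ENDIAN));
        // Big endian: the high-order bytes come first, so the 4-byte load sees 0.
        System.out.println("BE: " + storeLongLoadInt(value, ByteOrder.BIG_ENDIAN));
    }
}
```

This is why knowing the exact width of the value matters on Big Endian: a narrower load at the same address does not return the low-order part of what was stored.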
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12708/files - new: https://git.openjdk.org/jdk/pull/12708/files/7315fd20..a4d844f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12708&range=01-02 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/12708.diff Fetch: git fetch https://git.openjdk.org/jdk pull/12708/head:pull/12708 PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 06:24:03 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 06:24:03 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) 
I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. I have removed the size restriction for structs. Passing a struct consisting of 1 char works (on Little Endian). However, passing a struct consisting of 3 chars doesn't (getting IndexOutOfBoundsException: Out of bound access on segment MemorySegment). Neither on PPC64, nor on x86. Is that known or should I file a bug for that? 
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Thu Feb 23 07:22:08 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 23 Feb 2023 07:22:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: <2lWqgBlo7fvu2MpVr3N46-03iI1Q49y1hqfd7uUBB58=.ba2a9ec5-c824-416d-a431-ca7e33019736@github.com> On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. 
>> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. 
I should add tests for the tricky corner cases like the following ones:

    EXPORT struct S_FF f_S_S_FF(float p0, float p1, float p2, float p3, float p4, float p5, float p6, float p7, float p8, float p9, float p10, float p11, struct S_FF p12, float p13) { return p12; }
    EXPORT float f_F_S_FF(float p0, float p1, float p2, float p3, float p4, float p5, float p6, float p7, float p8, float p9, float p10, float p11, struct S_FF p12, float p13) { return p13; }
    EXPORT struct S_FF f_S_SSSSSSS_FF(struct S_FF p0, struct S_FF p1, struct S_FF p2, struct S_FF p3, struct S_FF p4, struct S_FF p5, struct S_FF p6, float p7) { return p6; }
    EXPORT float f_F_SSSSSSS_FF(struct S_FF p0, struct S_FF p1, struct S_FF p2, struct S_FF p3, struct S_FF p4, struct S_FF p5, struct S_FF p6, float p7) { return p7; }

Can I add them to the existing libraries? If so, what is the correct naming scheme and what is needed to get them executed (adding the EXPORT alone is not sufficient). Or should I create a separate test for these cases? Advice will be appreciated! ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Thu Feb 23 10:18:08 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 23 Feb 2023 10:18:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <2lWqgBlo7fvu2MpVr3N46-03iI1Q49y1hqfd7uUBB58=.ba2a9ec5-c824-416d-a431-ca7e33019736@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> <2lWqgBlo7fvu2MpVr3N46-03iI1Q49y1hqfd7uUBB58=.ba2a9ec5-c824-416d-a431-ca7e33019736@github.com> Message-ID: On Thu, 23 Feb 2023 07:19:24 GMT, Martin Doerr wrote: > > Can I add them to the existing libraries? If so, what is the correct naming scheme and what is needed to get them executed (adding the EXPORT alone is not sufficient). Or should I create a separate test for these cases? Advice will be appreciated!
There are two kinds of tests for the linker. Some tests (e.g. TestDowncallXYZ and TestUpcallXYZ) execute end-to-end tests with several shapes, and make sure that things work. Then there are ABI-specific tests (e.g. see TestXYZCallArranger). These latter tests are typically used to stress-test the corner cases of specific ABIs, and they are easier to write, as you can just provide the input (some function descriptor) and then test that the resulting set of bindings is the expected one. This allows for much more in-depth testing. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Thu Feb 23 10:23:05 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 23 Feb 2023 10:23:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> On Thu, 23 Feb 2023 04:44:18 GMT, Martin Doerr wrote: > > Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? Would it be possible to generate a binding cast + move and then recognize the pattern in the HS backend and optimize it away as an `extsw`?
------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 14:53:09 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 14:53:09 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Thu, 23 Feb 2023 04:44:18 GMT, Martin Doerr wrote: > > > (I'd be happy to implement the needed changes in shared code if you want, since it touches `BindingSpecializer` which is pretty dense) > > > > > > FYI: [master...JornVernee:jdk:I2L](https://github.com/openjdk/jdk/compare/master...JornVernee:jdk:I2L) (assuming `I2L` has the same semantics as `extsw`). Then just add a `.cast(int.class, long.class)` wherever currently an `int` is `vmStore`d in the PPC CallArranger. > > Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right?
The design philosophy has been to put as much as possible on the Java side, and there are a few reasons for that:
1. For maintainability. Generated assembly is ultimately harder to debug, compared to Java code (especially in interpreted mode using `-Djdk.internal.foreign.*Linker.USE_SPEC=false`). (Though, there might also be some personal bias here)
2. Moving things to the Java side makes them visible to the JIT, which means they have the opportunity to be optimized away, or otherwise optimized together with the surrounding Java code, whereas anything put into the downcall stub is fixed.
3. If we want to intrinsify `linkToNative` in C2 later, having downcall stubs be simple and consistent across platforms makes that much easier. Anything that's special in the native code would have to be replicated by the JIT as well.
So... WRT efficiency, I think it depends.
I've found in the past that adding a few more move instructions to the downcall stub didn't visibly affect performance. This might be because the CPU is good at just aliasing the registers instead of performing an actual move, or because it's just noise next to the membar we do on the return path. Ultimately, I don't think it matters much for performance, though (you could measure). I think the maintainability/future-proofing from implementing in Java is more important. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 14:59:05 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 14:59:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) In-Reply-To: <4KT43N8FvdfdmoAeaXw1vEBMP20MhpV5RUTZzeD2DqQ=.39e184f4-e9a3-47bd-9cad-ed9fe51a0d7b@github.com> References: <4KT43N8FvdfdmoAeaXw1vEBMP20MhpV5RUTZzeD2DqQ=.39e184f4-e9a3-47bd-9cad-ed9fe51a0d7b@github.com> Message-ID: On Thu, 23 Feb 2023 04:53:34 GMT, Martin Doerr wrote: > > I will do a more thorough review soon. > > Thanks a lot! > > > > The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. > > > > > > FWIW, we have to do this for Windows vararg floats as well ([here](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/foreign/abi/x64/windows/CallArranger.java#L231-L239)) > > This can be done by `dup`-ing the value, and using 2 `vmStore`s. (each `vmStore` corresponding to a single register/stack location). Doing something similar might be simpler than the `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT` storage types you're using right now. I'm not sure if that is related to the other limitations you mention? Might be interesting to look into. (perhaps as a separate RFE. 
I don't have a big issue since the current approach stays in PPC-only code) > > Maybe I need to think a bit more about it. I don't really like the extra cases for `INTEGER_AND_FLOAT` and `STACK_AND_FLOAT`. On the other side, doing it in the CallArranger would break the design of factoring out the allocation from the binding generation. In addition, it seems like PPC64 is even more tricky than the Windows case. I need to pass 2 float arguments in a GP reg (or stack slot) plus one of these 2 floats in float register F13. I think this can get implemented more easily in the backend. Do you agree? I think the same arguments apply here. I'll have a more thorough look at the patch and then get back to you on this. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 15:02:06 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 15:02:06 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:21:05 GMT, Martin Doerr wrote: > I have removed the size restriction for structs. Passing a struct consisting of 1 char works (on Little Endian). However, passing a struct consisting of 3 chars doesn't (getting IndexOutOfBoundsException: Out of bound access on segment MemorySegment). Neither on PPC64, nor on x86. Is that known or should I file a bug for that? I think you might be running into: https://bugs.openjdk.org/browse/JDK-8303017 which was recently found. 
(If you have a simpler test case please add it to the JBS issue, in a comment) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 15:07:05 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 15:07:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> References: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> Message-ID: On Thu, 23 Feb 2023 10:20:31 GMT, Maurizio Cimadamore wrote: > > Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? > > Would it be possible to generate a biding cast + move and the recognize the pattern in the HS backend and optimize it away as a `extsw` ? At the moment, there is a forced indirection between the Java code and downcall stub, so the JIT can not do any optimizations that have to take both sides into account. The downcall stub is opaque to the JIT. 
(though, perhaps in the long run we can add intrinsification of `linkToNative` that can generate the code in the downcall stub as part of the JIT's own IR, which would solve this) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mcimadamore at openjdk.org Thu Feb 23 16:51:08 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Thu, 23 Feb 2023 16:51:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> Message-ID: <-q-cx5Voa0dNJS0d8od_Ui7KRqc4fc5VRHxaPjI07X0=.f79a8180-c247-487f-a994-4d0192482a43@github.com> On Thu, 23 Feb 2023 15:02:48 GMT, Jorn Vernee wrote: > > > Correct, `extsw` performs a `I2L` conversion. I had thought about this already, but I think my current implementation is more efficient as it combines register moves with the 64 bit extend. Your proposal would generate separate extend and move instructions, right? > > > > > > Would it be possible to generate a biding cast + move and the recognize the pattern in the HS backend and optimize it away as a `extsw` ? > > At the moment, there is a forced indirection between the Java code and downcall stub, so the JIT can not do any optimizations that have to take both sides into account. The downcall stub is opaque to the JIT. (though, perhaps in the long run we can add intrinsification of `linkToNative` that can generate the code in the downcall stub as part of the JIT's own IR, which would solve this) I meant generating `extsw` when emitting the stub (since when we emit the stub we can see the bindings). But I suppose the problem there is that the VM only sees low level bindings such as moves, it doesn't see bindings such as casts. 
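For readers following the `I2L` / `extsw` exchange above: both operations sign-extend a 32-bit integer to 64 bits. The following is only a minimal sketch of that semantics in plain Java; the class and method names are made up for illustration and are not taken from the PR or from `jdk.internal.foreign`.

```java
// Hypothetical illustration of I2L semantics; not code from the PR.
final class I2LSketch {
    // Java's int-to-long widening conversion compiles to the i2l bytecode,
    // which sign-extends the 32-bit value to 64 bits.
    static long signExtend(int value) {
        return value;
    }

    // Zero extension, shown only for contrast with sign extension.
    static long zeroExtend(int value) {
        return Integer.toUnsignedLong(value);
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(signExtend(-2))); // fffffffffffffffe
        System.out.println(Long.toHexString(zeroExtend(-2))); // fffffffe
    }
}
```

A negative `int` keeps its value after sign extension, which is what a callee expecting a 64-bit register image of a C `int` relies on.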
------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Thu Feb 23 17:14:37 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Thu, 23 Feb 2023 17:14:37 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: <-q-cx5Voa0dNJS0d8od_Ui7KRqc4fc5VRHxaPjI07X0=.f79a8180-c247-487f-a994-4d0192482a43@github.com> References: <3dzZfm_BSXLL_iWAmllX-8YXAe1NWMMPr11zJtedIWU=.8a1dd1df-4eb7-4c72-9a06-f4636e5ec3e4@github.com> <-q-cx5Voa0dNJS0d8od_Ui7KRqc4fc5VRHxaPjI07X0=.f79a8180-c247-487f-a994-4d0192482a43@github.com> Message-ID: On Thu, 23 Feb 2023 16:48:30 GMT, Maurizio Cimadamore wrote: > I meant generating extsw when emitting the stub (since when we emit the stub we can see the bindings). But I suppose the problem there is that the VM only sees low level bindings such as moves, it doesn't see bindings such as casts. Oh, sorry, I see what you mean now. But yeah, the VM stub only handles moves, and I think we want to keep it that way (for reasons outlined). The VM stub can be viewed as a low-level primitive that accepts a set of register/stack values, and moves them into the corresponding locations. It's not really supposed to be doing any kind of processing of the values it receives or returns. 
------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Fri Feb 24 04:07:06 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 24 Feb 2023 04:07:06 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. 
Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. Some more remarks about other issues: - Uploaded my simple reproducer to [JDK-8303017](https://bugs.openjdk.org/browse/JDK-8303017) - Using oversized load / stores is problematic. Don't forget that OpenJDK still supports Big Endian platforms (AIX, s390x). - Since the membar on the return path was mentioned: I think it would be good to enable UseSystemMemoryBarrier by default on operating systems which support it. Maybe we should discuss this with @robehn. 
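The remark that storing 8 bytes and loading 4 bytes yields different values depending on endianness can be reproduced in plain Java. This is a hedged sketch using `java.nio.ByteBuffer`, not the linker's actual load/store path; the class and method names are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; the real code paths live in the downcall stub and bindings.
final class OversizedAccessSketch {
    // Store a 64-bit value into an 8-byte slot, then read back only the
    // first 4 bytes of that slot using the same byte order.
    static int storeLongLoadInt(long value, ByteOrder order) {
        ByteBuffer slot = ByteBuffer.allocate(8).order(order);
        slot.putLong(0, value);
        return slot.getInt(0);
    }

    public static void main(String[] args) {
        System.out.println(storeLongLoadInt(42L, ByteOrder.LITTLE_ENDIAN)); // 42
        System.out.println(storeLongLoadInt(42L, ByteOrder.BIG_ENDIAN));    // 0
    }
}
```

On Little Endian the low-order bytes sit at the start of the slot, so the narrower load happens to see the value; on Big Endian it sees the high-order half instead.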
------------- PR: https://git.openjdk.org/jdk/pull/12708 From rehn at openjdk.org Fri Feb 24 07:15:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 07:15:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 04:03:53 GMT, Martin Doerr wrote: > * Since the membar on the return path was mentioned: I think it would be good to enable UseSystemMemoryBarrier by default on operating systems which support it. Maybe we should discuss this with @robehn. Changing the default is fine by me. There is, AFAIK, one case that needs to read the thread state a lot, and thus emits sysmembars a lot: JFR with a very high sampling rate. Other than that there are no issues that I know about. Maybe it would be good to test at what sampling interval we notice a change? Also, I think it's not the best match to keep the flag experimental when we are, in some sense, using it. Maybe diagnostic? ------------- PR: https://git.openjdk.org/jdk/pull/12708 From jvernee at openjdk.org Fri Feb 24 07:21:08 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 24 Feb 2023 07:21:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 04:03:53 GMT, Martin Doerr wrote: > * Uploaded my simple reproducer to [JDK-8303017](https://bugs.openjdk.org/browse/JDK-8303017) Thanks! > * Using oversized load / stores is problematic. Don't forget that OpenJDK still supports Big Endian platforms (AIX, s390x). You're right. I realized that it's also problematic for heap segments, for which we can't do oversized accesses.
I have another solution that splits up the loads/stores into power-of-two sized chunks: https://github.com/openjdk/panama-foreign/compare/foreign-memaccess+abi...JornVernee:panama-foreign:OOB That patch is just a POC at this point though. Also, I don't think it works for BE at the moment (need to flip the offset for BE, I think. Just like we do in Unsafe). > * The result of `NativeCallingConvention::calling_convention` is interpreted as size, but it returns the max offset. That's off by one slot. Should I file a bug for that? (PPC64 is not affected because it doesn't use the result.) I'm not sure there's an issue there. Note that the 'max offset' is computed as `reg.offset() + reg.stack_size()`, so that should get us the size we need to allocate for the stack arguments. (e.g. 2 ints being passed at offset 0 and 4, would make max offset 4 + 4 = 8, which gives the size needed for the 2 ints). Computing the max offset instead of just summing the sizes of the stack arguments is needed since stack arguments can be sparsely placed in some cases on Mac/AArch64. > * Since the membar on the return path was mentioned: I think it would be good to enable UseSystemMemoryBarrier by default on operating systems which support it. Maybe we should discuss this with @robehn. ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. 
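Two points from the message above can be sketched in plain Java: computing the stack area as the maximum of `offset + size` rather than the sum of sizes, and splitting an arbitrary byte size into power-of-two chunks. This is only an illustration under stated assumptions: `StackArg`, `stackBytesNeeded` and `powerOfTwoChunks` are hypothetical names, the 8-byte cap on chunk size is an assumption, and the greedy split is not necessarily how the linked POC does it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names exist in HotSpot or the linked POC.
final class StackArgSizeSketch {
    // Stand-in for a stack-allocated argument: byte offset and byte size.
    record StackArg(int offset, int size) {}

    // Max of (offset + size) over all stack arguments. Unlike a plain sum of
    // sizes, this also covers gaps left by sparsely placed arguments.
    static int stackBytesNeeded(List<StackArg> args) {
        int max = 0;
        for (StackArg arg : args) {
            max = Math.max(max, arg.offset() + arg.size());
        }
        return max;
    }

    // Greedy split of a byte size into power-of-two chunks of at most 8 bytes
    // (the 8-byte cap is an assumption), so no access reads past the end.
    static List<Integer> powerOfTwoChunks(int byteSize) {
        List<Integer> chunks = new ArrayList<>();
        int remaining = byteSize;
        while (remaining > 0) {
            int chunk = Math.min(8, Integer.highestOneBit(remaining));
            chunks.add(chunk);
            remaining -= chunk;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Two ints at offsets 0 and 4, as in the message above.
        System.out.println(stackBytesNeeded(List.of(new StackArg(0, 4), new StackArg(4, 4)))); // 8
        // Sparse placement: the sizes sum to 12, but 16 bytes are needed.
        System.out.println(stackBytesNeeded(List.of(new StackArg(0, 4), new StackArg(8, 8)))); // 16
        // A 3-byte struct becomes a 2-byte and a 1-byte access.
        System.out.println(powerOfTwoChunks(3)); // [2, 1]
    }
}
```

For Big Endian, the offsets at which the chunks are read would presumably need adjusting, as the message notes; the sketch does not attempt that.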
------------- PR: https://git.openjdk.org/jdk/pull/12708 From rehn at openjdk.org Fri Feb 24 07:43:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 07:43:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 07:17:30 GMT, Jorn Vernee wrote: > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. For reliability: On every Oracle tier 5 we run these test two groups: vmTestbase_nsk_jvmti, open/test/hotspot/jtreg/:hotspot_runtime For each of linux_x64_debug, windows_x64, linux_aarch64_debug with sysmembar on. I picked jvmti as this is very heavy on 'JNI'. No issues reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java which only JNI transisition with this options. For performance, the heavy lifting is on reader of thread state and the only case I have identified is JFR. I suggest changing the default now, if there is issues we have time to revert it back before RDP1. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Fri Feb 24 10:16:08 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 24 Feb 2023 10:16:08 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Fri, 24 Feb 2023 07:39:15 GMT, Robbin Ehn wrote: > > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. 
> > For reliability: On every Oracle tier 5 we run these test two groups: vmTestbase_nsk_jvmti, open/test/hotspot/jtreg/:hotspot_runtime For each of linux_x64_debug, windows_x64, linux_aarch64_debug with sysmembar on. I picked jvmti as this is very heavy on 'JNI'. No issues reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java which only JNI transisition with this options. > > For performance, the heavy lifting is on reader of thread state and the only case I have identified is JFR. > > I suggest changing the default now, if there is issues we have time to revert it back before RDP1. Thanks for your prompt reply! I agree, this is a good time to enable it. Do you want to do it? We'll support it and run tests on our machines and platforms. I think JFR users who want a very high sampling rate can switch it off. Making the flag diagnostic is fine with me, but that shouldn't get discussed in this PR :-) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From rehn at openjdk.org Fri Feb 24 10:45:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 24 Feb 2023 10:45:05 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: <1PZeZa46N1rA3R0IZs-4TIKP5d1gJYMDlWlfv4ApPV8=.6c5fb2fd-2b75-4dc8-979f-87c33bbe6d9d@github.com> On Fri, 24 Feb 2023 10:12:52 GMT, Martin Doerr wrote: > > > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. > > > > > > For reliability: On every Oracle tier 5 we run these test two groups: vmTestbase_nsk_jvmti, open/test/hotspot/jtreg/:hotspot_runtime For each of linux_x64_debug, windows_x64, linux_aarch64_debug with sysmembar on. 
I picked jvmti as this is very heavy on 'JNI'. No issues reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java which only JNI transisition with this options. > > For performance, the heavy lifting is on reader of thread state and the only case I have identified is JFR. > > I suggest changing the default now, if there is issues we have time to revert it back before RDP1. > > Thanks for your prompt reply! I agree, this is a good time to enable it. Do you want to do it? We'll support it and run tests on our machines and platforms. I think JFR users who want a very high sampling rate can switch it off. Making the flag diagnostic is fine with me, but that shouldn't get discussed in this PR :-) I will be skiing next week, if you want it now, I suggest you do it, otherwise I can when I'm back. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Sat Feb 25 09:43:04 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 25 Feb 2023 09:43:04 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <1PZeZa46N1rA3R0IZs-4TIKP5d1gJYMDlWlfv4ApPV8=.6c5fb2fd-2b75-4dc8-979f-87c33bbe6d9d@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> <1PZeZa46N1rA3R0IZs-4TIKP5d1gJYMDlWlfv4ApPV8=.6c5fb2fd-2b75-4dc8-979f-87c33bbe6d9d@github.com> Message-ID: On Fri, 24 Feb 2023 10:42:06 GMT, Robbin Ehn wrote: > > > > ~I don't think we've done that much testing with UseSystemMemoryBarrier since it was added~. I'm a bit nervous about turning it on by default since it's currently also used for JNI. Let's see what Robbin thinks. > > > > > > > > > For reliability: On every Oracle tier 5 we run these test two groups: vmTestbase_nsk_jvmti, open/test/hotspot/jtreg/:hotspot_runtime For each of linux_x64_debug, windows_x64, linux_aarch64_debug with sysmembar on. I picked jvmti as this is very heavy on 'JNI'. 
No issues reported that I know about. We also have this test: SystemMembarHandshakeTransitionTest.java which only JNI transisition with this options. > > > For performance, the heavy lifting is on reader of thread state and the only case I have identified is JFR. > > > I suggest changing the default now, if there is issues we have time to revert it back before RDP1. > > > > > > Thanks for your prompt reply! I agree, this is a good time to enable it. Do you want to do it? We'll support it and run tests on our machines and platforms. I think JFR users who want a very high sampling rate can switch it off. Making the flag diagnostic is fine with me, but that shouldn't get discussed in this PR :-) > > I will be skiing next week, if you want it now, I suggest you do it, otherwise I can when I'm back. https://github.com/openjdk/jdk/pull/12753 ------------- PR: https://git.openjdk.org/jdk/pull/12708 From eosterlund at openjdk.org Mon Feb 27 08:52:07 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 27 Feb 2023 08:52:07 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists).
>> >> Big Endian support is implemented to some extend, but not complete. E.g. structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). 
> > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. I don't think we want this to be on by default on platforms where StoreLoad fences don't cause substantial global overheads. The benefit on such platforms is rather low, and needing the last couple of nanoseconds of transition speed, seems to not be a normal use case that default settings should optimize for. Conversely, the global synchronization can be rather intrusive, especially when it involves handshakes with N threads, and you need to perform global synchronization across the entire machine, for each thread poked. I would be much more afraid of that issue out of the box, than I would be afraid of a couple of nanoseconds slower native transitions. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From mdoerr at openjdk.org Tue Feb 28 02:56:02 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 28 Feb 2023 02:56:02 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Mon, 27 Feb 2023 08:49:18 GMT, Erik Österlund wrote: > I don't think we want this to be on by default on platforms where StoreLoad fences don't cause substantial global overheads. The benefit on such platforms is rather low, and needing the last couple of nanoseconds of transition speed, seems to not be a normal use case that default settings should optimize for. Conversely, the global synchronization can be rather intrusive, especially when it involves handshakes with N threads, and you need to perform global synchronization across the entire machine, for each thread poked. I would be much more afraid of that issue out of the box, than I would be afraid of a couple of nanoseconds slower native transitions.
Hi Erik, StoreLoad fences cause substantial overhead on any multi-socket system including x86_64. The benefit may be small on single-socket systems, but can the VM distinguish? We are currently looking for benchmarks which show a negative effect of enabling it. Seems like the SPEC benchmarks don't care about it. Note that we typically use only one membarrier syscall when we handshake all threads. If you know any workload which suffers, would be great to know. David is currently also checking benchmarks. We should discuss further details in https://github.com/openjdk/jdk/pull/12753. ------------- PR: https://git.openjdk.org/jdk/pull/12708 From matthias.baesken at sap.com Tue Feb 28 09:00:03 2023 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 28 Feb 2023 09:00:03 +0000 Subject: Shenandoah info page / SapMachine In-Reply-To: <78079d15-3f40-d5f9-d0ec-03640274781f@redhat.com> References: <78079d15-3f40-d5f9-d0ec-03640274781f@redhat.com> Message-ID: Okay thanks Aleksey . Forwarding to shenandoah-dev at openjdk.org . Best regards, Matthias -----Original Message----- From: Aleksey Shipilev Sent: Tuesday, 28 February 2023 09:52 To: Baesken, Matthias Cc: Langer, Christoph ; Doerr, Martin Subject: Re: Shenandoah info page / SapMachine Hi Matthias, On 2/24/23 15:44, Baesken, Matthias wrote: > Hi Aleksey , could you please add some info about SapMachine and Shenandoah here : > > https://wiki.openjdk.org/display/shenandoah/Main > > Suggestion : > > * SAP > o Shenandoah is shipped and supported starting with SapMachine 17 I suggest raising this at shenandoah-dev at .
-- Thanks, -Aleksey From rkennke at amazon.de Tue Feb 28 11:38:53 2023 From: rkennke at amazon.de (Kennke, Roman) Date: Tue, 28 Feb 2023 12:38:53 +0100 Subject: Calls array and intrinsic stub routines in shenandoahSupport.cpp In-Reply-To: <98fa25ef-ddf9-20bf-69f9-90f7d54f9588@oracle.com> References: <98fa25ef-ddf9-20bf-69f9-90f7d54f9588@oracle.com> Message-ID: Hello Jamil, Sorry for the very late reply. The email got stuck in the moderator queue and I only noticed it now. > Volodymyr and I have implemented some new intrinsics for JDK 20 (see > OpenJDK PRs https://github.com/openjdk/jdk/pull/7702 and > https://github.com/openjdk/jdk/pull/10582). Volodymyr recently came > across the calls[] array in shenandoahSupport.cpp and was wondering > whether our new intrinsics need to be added to this array and what the > impact of having them there or not is. This array, or rather the routine that uses it, is for barrier verification in Shenandoah. It checks that intrinsics that require barriers do have the right barriers applied. We should perhaps check if this list is still current and if any work is needed. I would not expect any code to fail because of this, except when using +ShenandoahVerifyOptoBarriers. Thanks for raising this! Cheers, Roman Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879 From rkennke at amazon.de Tue Feb 28 11:41:50 2023 From: rkennke at amazon.de (Kennke, Roman) Date: Tue, 28 Feb 2023 12:41:50 +0100 Subject: Shenandoah info page / SapMachine In-Reply-To: References: <78079d15-3f40-d5f9-d0ec-03640274781f@redhat.com> Message-ID: <06971ba5-f2f5-1ebf-ae6a-452bde638544@amazon.de> Hi Matthias, Thanks for raising this! I changed the Wiki entry. Cheers, Roman > Okay thanks Aleksey . > > Forwarding to shenandoah-dev at openjdk.org .
> > Best regards, Matthias > > > > -----Original Message----- > From: Aleksey Shipilev > Sent: Tuesday, 28 February 2023 09:52 > To: Baesken, Matthias > Cc: Langer, Christoph ; Doerr, Martin > Subject: Re: Shenandoah info page / SapMachine > > Hi Matthias, > > On 2/24/23 15:44, Baesken, Matthias wrote: >> Hi Aleksey , could you please add some info about SapMachine and Shenandoah here : >> >> https://wiki.openjdk.org/display/shenandoah/Main >> >> Suggestion : >> >> * SAP >> o Shenandoah is shipped and supported starting with SapMachine 17 > > I suggest raising this at shenandoah-dev at . > > -- > Thanks, > -Aleksey > From jvernee at openjdk.org Tue Feb 28 21:06:23 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 28 Feb 2023 21:06:23 GMT Subject: RFR: 8303040: linux PPC64le: Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> References: <8b3vVrV22RuhdRoRYacXV0ZeghFGgKkC8S_z-iMrzAQ=.dd84b743-8b51-4281-8f5f-f9eff6207bc7@github.com> Message-ID: On Thu, 23 Feb 2023 06:18:49 GMT, Martin Doerr wrote: >> Implementation of "Foreign Function & Memory API" for linux on Power (Little Endian) according to "Power Architecture 64-Bit ELF V2 ABI Specification". >> >> This PR does not include code for VaList support because it's supposed to get removed by [JDK-8299736](https://bugs.openjdk.org/browse/JDK-8299736). I've kept the related tests disabled for this platform and throw an exception instead. Note that the ABI doesn't precisely specify variable argument lists. Instead, it refers to `` (2.2.4 Variable Argument Lists). >> >> Big Endian support is implemented to some extend, but not complete. E.g.
structs with size not divisible by 8 are not passed correctly (see `useABIv2` in CallArranger.java). Big Endian is excluded by selecting `ARCH.equals("ppc64le")` (CABI.java) only. >> >> There's another limitation: This PR only accepts structures with size divisible by 4. (An `IllegalArgumentException` gets thrown otherwise.) I think arbitrary sizes are not usable on other platforms, either, because `SharedUtils.primitiveCarrierForSize` only accepts powers of 2. >> >> The ABI has some tricky corner cases related to HFA (Homogeneous Float Aggregate). The same argument may need to get passed in both, a FP reg and a GP reg or stack slot (see "no partial DW rule"). This cases are not covered by the existing tests. >> >> I had to make changes to shared code and code for other platforms: >> 1. Pass type information when creating `VMStorage` objects from `VMReg`. This is needed for the following reasons: >> - PPC64 ABI requires integer types to get extended to 64 bit (also see CCallingConventionRequiresIntsAsLongs in existing hotspot code). We need to know the type or at least the bit width for that. >> - Floating point load / store instructions need the correct width to select between the correct IEEE 754 formats. The register representation in single FP registers is always IEEE 754 double precision on PPC64. >> - Big Endian also needs usage of the precise size. Storing 8 Bytes and loading 4 Bytes yields different values than on Little Endian! >> 2. It happens that a `NativeMemorySegmentImpl` is used as a raw pointer (with byteSize() == 0) while running TestUpcallScope. Hence, existing size checks don't work (see MemorySegment.java). As a workaround, I'm just skipping the check in this particular case. Please check if this makes sense or if there's a better fix (possibly as separate RFE). > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove size restriction for structs. Add TODO for Big Endian. 
This looks good overall I think, though I'll stick to my previous suggestion to try and move more logic into Java. Also, I recommend adding a test in `java/foreign/callarranger` similar to the tests already found there. They call the CallGenerator directly and then check the binding recipe. This could prove useful if we need to refactor shared code for whatever reason, since those tests run on (almost) every platform, so they can be used to do some basic sanity checking. src/hotspot/cpu/ppc/downcallLinker_ppc.cpp line 133: > 131: Register callerSP = R2, // C/C++ uses R2 as TOC, but we can reuse it here > 132: tmp = R11_scratch1, // same as shuffle_reg > 133: call_target_address = R12_scratch2; // same as _abi._scratch2 (ABIv2 requires this reg!) Do I understand correctly that the ABI requires `R12` to be the register used for the call? How does that make a difference? I guess in some cases the callee might want to know the address through which it is called (so it looks at `R12`)? src/hotspot/cpu/ppc/downcallLinker_ppc.cpp line 154: > 152: // (abi_reg_args is abi_minframe plus space for 8 argument register spill slots) > 153: assert(_abi._shadow_space_bytes == frame::abi_minframe_size, "expected space according to ABI"); > 154: int allocated_frame_size = frame::abi_minframe_size + MAX2(_input_registers.length(), 8) * BytesPerWord; This is hard-coding an assumption about the ABI that's being called. Ok for now. If it needs to be addressed in the future, it could be done by adding another field to `ABIDescriptor` like `min_stack_arg_bytes`, or something like that (which is set to zero for other ABIs). It seems to be different from `shadow_space` since it's also used by the caller to place stack arguments. src/hotspot/cpu/ppc/downcallLinker_ppc.cpp line 343: > 341: > 342: __ flush(); > 343: // Disassembler::decode((u_char*)start, (u_char*)__ pc(), tty); Leftover commented-out code? 
(note that the stub can also be disassembled with `-Xlog:foreign+downcall=trace` now) src/hotspot/cpu/ppc/foreignGlobals_ppc.cpp line 229: > 227: > 228: void ArgumentShuffle::pd_generate(MacroAssembler* masm, VMStorage tmp, int in_stk_bias, int out_stk_bias, const StubLocations& locs) const { > 229: Register callerSP = as_Register(tmp); // preset It looks like `tmp` is being used to hold the caller's SP. I'm guessing this can not be computed the same way as we do on x86 and aarch64? (based on `RBP`, `RFP_BIAS`) If you want, you could add another register argument to `pd_generate` that is just invalid/unused on other platforms. That way you could use `tmp` for the shuffling instead of having to go through the stack. (looks like `R0` is already used in some cases as a temp register) src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 137: > 135: ArgumentShuffle arg_shuffle(in_sig_bt, total_in_args, out_sig_bt, total_out_args, &in_conv, &out_conv, shuffle_reg); > 136: // The Java call uses the JIT ABI, but we also call C. > 137: int out_arg_area = MAX2(frame::jit_out_preserve_size + arg_shuffle.out_arg_bytes(), (int)frame::abi_reg_args_size); We need `frame::abi_reg_args_size` since we call `on_entry`/`on_exit` which require the stack space I guess? src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 240: > 238: __ ld(call_target_address, in_bytes(Method::from_compiled_offset()), R19_method); > 239: __ mtctr(call_target_address); > 240: __ bctrl(); Ok, I see. I guess there is some special purpose register called `CTR` which we are moving to for `bctrl` here. Does ABIv2 require that move to always come from `R12`? (from the comment in downcallLinker). (I'm trying to understand the requirements for possibly tweaking shared code). src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 347: > 345: FunctionDescriptor* fd = (FunctionDescriptor*)fd_addr; > 346: fd->set_entry(fd_addr + sizeof(FunctionDescriptor)); > 347: #endif Had to do a double take. 
Looks like we're not the only ones using the name `FunctionDescriptor` :) src/hotspot/cpu/ppc/upcallLinker_ppc.cpp line 356: > 354: } > 355: #endif > 356: //blob->print_on(tty); Leftover commented-out code? src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 68: > 66: public abstract class CallArranger { > 67: // Linux PPC64 Little Endian uses ABI v2. > 68: private static final boolean useABIv2 = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; Now that I'm here: this could be a potentially interesting case for having 2 subclasses of CallArranger: one for `useABIv2 == true` and one for `false`. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 81: > 79: new VMStorage[] { f1, f2, f3, f4, f5, f6, f7, f8 }, // FP output > 80: new VMStorage[] { r0, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12 }, // volatile GP > 81: new VMStorage[] { f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13 }, // volatile FP Note that argument registers are assumed volatile, so they don't have to be duplicated here. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 286: > 284: // "no partial DW rule": Mark first stack slot to get filled. > 285: // Note: Can only happen with forArguments = true. > 286: VMStorage overlappingReg = null; `overlappingReg` is initialized along all branches, so there's no need to assign `null` here. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/CallArranger.java line 293: > 291: } else { > 292: overlappingReg = new VMStorage(StorageType.STACK_AND_FLOAT, > 293: (short) STACK_SLOT_SIZE, (int) stackOffset - 4); I think you could remove the mixed VMStorage types here relatively easily by returning a `VMStorage[][]`, where each element is a single-element array, but then for the `needOverlapping` case add another element to the array for the extra store. 
Then when unboxing a `STRUCT_HFA`, `dup` the result of the `bufferLoad` and then do 2 `vmStore`s (one for each element). For boxing, you could just ignore the extra storage, and just `vmLoad` the first one (or, whichever one you like :)) src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/TypeClass.java line 66: > 64: } > 65: > 66: static boolean isHomogeneousFloatAggregate(MemoryLayout type, boolean useABIv2) { Note that we had to make some changes to this routine on AArch64, since it didn't properly account for nested structs/unions and arrays. See: https://github.com/openjdk/panama-foreign/pull/780 Just as a heads up, in case PPC needs changes too. src/java.base/share/classes/jdk/internal/foreign/abi/ppc64/linux/LinuxPPC64CallArranger.java line 33: > 31: * PPC64 CallArranger specialized for Linux ABI. > 32: */ > 33: public class LinuxPPC64CallArranger extends CallArranger { I don't really see the point in having a separate subclass with `CallArranger` being abstract, unless you are planning to add other implementations later? 
(edit: see also later comment in CallArranger https://github.com/openjdk/jdk/pull/12708#discussion_r1120753657) ------------- PR: https://git.openjdk.org/jdk/pull/12708 From wkemper at openjdk.org Tue Feb 28 23:58:38 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 28 Feb 2023 23:58:38 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: Merge tag jdk-21+11 ------------- Commit messages: - Merge tag 'jdk-21+11' into merge-jdk-21-11 - 8302028: Port fdlibm atan2 to Java - 8303081: Serial: Remove unused VM_MarkSweep - 8302667: Improve message format when failing to load symbols or libraries - 8303024: (fs) WindowsFileSystem.supportedFileAttributeViews can use Set.of - 8303084: G1 Heap region liveness verification has inverted return value - 8302760: Improve liveness/remembered set verification for G1 - 8303067: G1: Remove unimplemented G1FullGCScope::heap_transition - 8302880: Fix includes in g1ConcurrentMarkObjArrayProcessor files - 8302975: Remove redundant mark verification during G1 Full GC - ... and 118 more: https://git.openjdk.org/shenandoah/compare/c778125c...9be58b2d The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.org/?repo=shenandoah&pr=221&range=00.0 - openjdk/jdk:master: https://webrevs.openjdk.org/?repo=shenandoah&pr=221&range=00.1 Changes: https://git.openjdk.org/shenandoah/pull/221/files Stats: 13641 lines in 534 files changed: 8720 ins; 2537 del; 2384 mod Patch: https://git.openjdk.org/shenandoah/pull/221.diff Fetch: git fetch https://git.openjdk.org/shenandoah pull/221/head:pull/221 PR: https://git.openjdk.org/shenandoah/pull/221