From wkemper at openjdk.org Mon Jul 1 15:59:52 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 1 Jul 2024 15:59:52 GMT Subject: Integrated: 8335289: GenShen: Whitebox breakpoint GC requests may cause assertions In-Reply-To: <4mqg5wQ8EsOCwR_d7styAkrBVlxfdmm7v9o-B4G_BjI=.19d03445-aa39-4c8c-9eed-11ed356439ac@github.com> References: <4mqg5wQ8EsOCwR_d7styAkrBVlxfdmm7v9o-B4G_BjI=.19d03445-aa39-4c8c-9eed-11ed356439ac@github.com> Message-ID: On Fri, 28 Jun 2024 22:00:04 GMT, William Kemper wrote: > Clean backport, low risk. Fixes intermittent test failure. This pull request has now been integrated. Changeset: d6eacaff Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/d6eacaff28c313d72f40ef2b68b65de97fd2fbed Stats: 26 lines in 2 files changed: 20 ins; 4 del; 2 mod 8335289: GenShen: Whitebox breakpoint GC requests may cause assertions Backport-of: 71182e240ce4f4a6e3a8773f61be6b091e2d65e9 ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/63 From wkemper at openjdk.org Mon Jul 1 23:57:37 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 1 Jul 2024 23:57:37 GMT Subject: RFR: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use [v22] In-Reply-To: References: Message-ID: <0bfcP-HwsrAGE77lyBmdG4mXj5G09cwsd2l9drSB0-U=.7451d11f-8885-42d1-99f4-4828028b7b26@github.com> On Sat, 29 Jun 2024 06:44:13 GMT, Y. Srinivas Ramakrishna wrote: >> ShenandoahGCSession is intended to create a scope where the ShenandoahHeap's _gc_cause and _gc_generation field reflect the current gc cycle. We now check that we do not overwrite existing non-default settings (respectively _no_gc and nullptr). The destructor of the scope/stack object also resets these fields to their default settings, ensuring intended uses. This uncovered a situation where the scope was not entered when it should have been, which we have now fixed. 
>> >> A case of flickering of active_generation() was identified when used concurrently by mutators while it was being modified by the controller thread. To deal with this, we have carefully gone through the setting and use of the field, and found that an expedient fix for the race is to split the field into two: >> - _gc_generation is set & cleared by the controller thread whenever it enters and exits a GC scope, and services concurrent gc cycles for young or old generations. >> - _active_generation is set to the value in _gc_generation at the start of each Shenandoah GC safepoint operation so that mutator threads and load barriers always see a consistent value between safepoints. >> >> Asserts check the protocol for setting and clearing these fields. >> >> The protocol for use of the fields is that mutator threads may never use the _gc_generation field since it's subject to asynchronously changing based on actions of the coordinator thread. Mutator threads may only use the _active_generation field which changes synchronously at safepoints. Worker threads will generally use the former, but they may also use the latter as part of the load barrier. >> >> An alternative approach would be to not use a global variable for the _gc_generation indirected through the heap, but rather to pass it into the gc closures that do the work for specific phases of the GC that need to know which generation is currently subject to collector actions. This would work as well, but the changes would potentially touch more code. We would still have to have set the variable that is consulted by the load barriers, viz. _active_generation, in a mutator-safe fashion at a safepoint, like we do today. This or other alternative approaches may be investigated in the future to potentially make this protocol more self-contained and robust rather than leaking as it does today into many places in the code. 
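The split-field protocol described above can be sketched as follows. This is a minimal Java analogue under stated assumptions, not the HotSpot C++ code: `HeapState`, `GcSession`, `beginSession`, and `enterSafepoint` are hypothetical names, and try-with-resources stands in for the destructor of the C++ scope object.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of the two-field protocol; none of these names are HotSpot APIs.
final class HeapState {
    enum Generation { YOUNG, OLD, GLOBAL }

    // Set and cleared only by the controller thread when it enters or exits a GC scope.
    private final AtomicReference<Generation> gcGeneration = new AtomicReference<>(null);

    // Copied from gcGeneration at the start of a safepoint; the only field mutators may read.
    private volatile Generation activeGeneration = null;

    Generation gcGeneration() { return gcGeneration.get(); }

    Generation activeGeneration() { return activeGeneration; }

    // Models the start of a GC safepoint operation: publish a stable value for mutators.
    void enterSafepoint() { activeGeneration = gcGeneration.get(); }

    GcSession beginSession(Generation generation) { return new GcSession(this, generation); }

    // Scope object: sets the field on entry and resets it to the default on exit.
    static final class GcSession implements AutoCloseable {
        private final HeapState heap;

        GcSession(HeapState heap, Generation generation) {
            this.heap = heap;
            // Refuse to overwrite a non-default value, mirroring the asserts in the description.
            if (!heap.gcGeneration.compareAndSet(null, generation)) {
                throw new IllegalStateException("a GC session is already active");
            }
        }

        @Override
        public void close() {
            heap.gcGeneration.set(null); // back to the default when the scope ends
        }
    }
}
```

In this sketch a mutator that calls `activeGeneration()` keeps seeing the value published at the last safepoint, even after the controller has closed its session and cleared `gcGeneration`.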
>> >> *Testing*: >> - [x] code pipeline >> - [x] specjbb testing >> - [x] specjbb performance >> - [x] jtreg:hotspot_gc and jtreg:hots... > > Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: > > Disallow mutator threads from reading the asynchronously updated > _gc_generation field of ShHeap. Marked as reviewed by wkemper (Committer). ------------- PR Review: https://git.openjdk.org/shenandoah/pull/407#pullrequestreview-2152416786 From ysr at openjdk.org Tue Jul 2 01:04:35 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 2 Jul 2024 01:04:35 GMT Subject: RFR: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use [v22] In-Reply-To: References: Message-ID: On Sat, 29 Jun 2024 06:44:13 GMT, Y. Srinivas Ramakrishna wrote: >> ShenandoahGCSession is intended to create a scope where the ShenandoahHeap's _gc_cause and _gc_generation field reflect the current gc cycle. We now check that we do not overwrite existing non-default settings (respectively _no_gc and nullptr). The destructor of the scope/stack object also resets these fields to their default settings, ensuring intended uses. This uncovered a situation where the scope was not entered when it should have been, which we have now fixed. >> >> A case of flickering of active_generation() was identified when used concurrently by mutators while it was being modified by the controller thread. To deal with this, we have carefully gone through the setting and use of the field, and found that an expedient fix for the race is to split the field into two: >> - _gc_generation is set & cleared by the controller thread whenever it enters and exits a GC scope, and services concurrent gc cycles for young or old generations. >> - _active_generation is set to the value in _gc_generation at the start of each Shenandoah GC safepoint operation so that mutator threads and load barriers always see a consistent value between safepoints. 
>> >> Asserts check the protocol for setting and clearing these fields. >> >> The protocol for use of the fields is that mutator threads may never use the _gc_generation field since it's subject to asynchronously changing based on actions of the coordinator thread. Mutator threads may only use the _active_generation field which changes synchronously at safepoints. Worker threads will generally use the former, but they may also use the latter as part of the load barrier. >> >> An alternative approach would be to not use a global variable for the _gc_generation indirected through the heap, but rather to pass it into the gc closures that do the work for specific phases of the GC that need to know which generation is currently subject to collector actions. This would work as well, but the changes would potentially touch more code. We would still have to have set the variable that is consulted by the load barriers, viz. _active_generation, in a mutator-safe fashion at a safepoint, like we do today. This or other alternative approaches may be investigated in the future to potentially make this protocol more self-contained and robust rather than leaking as it does today into many places in the code. >> >> *Testing*: >> - [x] code pipeline >> - [x] specjbb testing >> - [x] specjbb performance >> - [x] jtreg:hotspot_gc and jtreg:hots... > > Y. Srinivas Ramakrishna has updated the pull request incrementally with one additional commit since the last revision: > > Disallow mutator threads from reading the asynchronously updated > _gc_generation field of ShHeap. Thanks for your review, William! ------------- PR Comment: https://git.openjdk.org/shenandoah/pull/407#issuecomment-2201583087 From ysr at openjdk.org Tue Jul 2 01:04:36 2024 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Tue, 2 Jul 2024 01:04:36 GMT Subject: Integrated: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use In-Reply-To: References: Message-ID: <4SqhOKXjyBoST4Tm4tKxuc6TdzpuuKIr2Bpe1kM8Ycs=.6064b4b0-13c2-4fa5-beaa-3d8c73065d71@github.com> On Fri, 15 Mar 2024 01:45:51 GMT, Y. Srinivas Ramakrishna wrote: > ShenandoahGCSession is intended to create a scope where the ShenandoahHeap's _gc_cause and _gc_generation field reflect the current gc cycle. We now check that we do not overwrite existing non-default settings (respectively _no_gc and nullptr). The destructor of the scope/stack object also resets these fields to their default settings, ensuring intended uses. This uncovered a situation where the scope was not entered when it should have been, which we have now fixed. > > A case of flickering of active_generation() was identified when used concurrently by mutators while it was being modified by the controller thread. To deal with this, we have carefully gone through the setting and use of the field, and found that an expedient fix for the race is to split the field into two: > - _gc_generation is set & cleared by the controller thread whenever it enters and exits a GC scope, and services concurrent gc cycles for young or old generations. > - _active_generation is set to the value in _gc_generation at the start of each Shenandoah GC safepoint operation so that mutator threads and load barriers always see a consistent value between safepoints. > > Asserts check the protocol for setting and clearing these fields. > > The protocol for use of the fields is that mutator threads may never use the _gc_generation field since it's subject to asynchronously changing based on actions of the coordinator thread. Mutator threads may only use the _active_generation field which changes synchronously at safepoints. Worker threads will generally use the former, but they may also use the latter as part of the load barrier. 
> > An alternative approach would be to not use a global variable for the _gc_generation indirected through the heap, but rather to pass it into the gc closures that do the work for specific phases of the GC that need to know which generation is currently subject to collector actions. This would work as well, but the changes would potentially touch more code. We would still have to have set the variable that is consulted by the load barriers, viz. _active_generation, in a mutator-safe fashion at a safepoint, like we do today. This or other alternative approaches may be investigated in the future to potentially make this protocol more self-contained and robust rather than leaking as it does today into many places in the code. > > *Testing*: > - [x] code pipeline > - [x] specjbb testing > - [x] specjbb performance > - [x] jtreg:hotspot_gc and jtreg:hotspot:tier1 w/fastdebug > - [x] GHA > > *... This pull request has now been integrated. Changeset: d2102347 Author: Y. Srinivas Ramakrishna URL: https://git.openjdk.org/shenandoah/commit/d2102347ea9c1199221ec33f4e721aefa1193cea Stats: 169 lines in 16 files changed: 135 ins; 1 del; 33 mod 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/407 From duke at openjdk.org Thu Jul 4 02:05:31 2024 From: duke at openjdk.org (duke) Date: Thu, 4 Jul 2024 02:05:31 GMT Subject: Withdrawn: 8321806: Shenandoah: each mutator must see FullGC or GC overhead limit is exceeded before throwing OOM In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 20:30:54 GMT, Kelvin Nilsen wrote: > Require each thread to observe unproductive Full GC before it throws OOM exception. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/16985 From duke at openjdk.org Fri Jul 5 03:25:29 2024 From: duke at openjdk.org (duke) Date: Fri, 5 Jul 2024 03:25:29 GMT Subject: Withdrawn: 8329204: Diagnostic command for zeroing unused parts of the heap In-Reply-To: References: Message-ID: On Wed, 27 Mar 2024 17:24:34 GMT, Volker Simonis wrote: > Diagnostic command for zeroing unused parts of the heap > > I propose to add a new diagnostic command `System.zero_unused_memory` which zeros out all unused parts of the heap. The name of the command is intentionally GC/heap agnostic because in the future it might be extended to also zero unused parts of the Metaspace and/or CodeCache. > > Currently `System.zero_unused_memory` triggers a full GC and afterwards zeros unused parts of the heap. Zeroing can help snapshotting technologies like [CRIU][1] or [Firecracker][2] to shrink the snapshot size of VMs/containers with running JVM processes because pages which only contain zero bytes can be easily removed from the image by making the image *sparse* (e.g. with [`fallocate -p`][3]). > > Notice that uncommitting unused heap parts in the JVM doesn't help in the context of virtualization (e.g. KVM/Firecracker) because from the host perspective they are still dirty and can't be easily removed from the snapshot image because they usually contain some non-zero data. More details can be found in my FOSDEM talk ["Zeroing and the semantic gap between host and guest"][4]. > > Furthermore, removing pages which only contain zero bytes (i.e. "empty pages") from a snapshot image not only decreases the image size but also speeds up the restore process because empty pages don't have to be read from the image file but will be populated by the kernel zero page first until they are used for the first time. This also decreases the initial memory footprint of a restored process. > > An additional argument for memory zeroing is security. 
By zeroing unused heap parts, we can make sure that secrets contained in unreferenced Java objects are deleted. This is currently impossible to achieve from Java: even if a Java program zeroes out arrays with sensitive data after use, it cannot guarantee that the GC has not already moved the corresponding object, leaving an old, unreferenced copy of that data somewhere in the heap.
>
> A prototype implementation of this proposal for Serial, Parallel, G1 and Shenandoah GC is available in the linked pull request.
>
> [1]: https://criu.org
> [2]: https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md
> [3]: https://man7.org/linux/man-pages/man1/fallocate.1.html
> [4]: https://fosdem.org/2024/schedule/event/fosdem-2024-3454-zeroing-and-the-semantic-gap-between-host-and-guest/

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/18521

From azafari at openjdk.org Fri Jul 5 11:15:25 2024
From: azafari at openjdk.org (Afshin Zafari)
Date: Fri, 5 Jul 2024 11:15:25 GMT
Subject: RFR: 8331539: [REDO] NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API [v4]
In-Reply-To:
References: <1i0PKv9mCusM6BZqXG8ULe0lRA2Nz2ix4aZHz9otNMM=.b9d2d151-883e-4cb6-be48-4ba45b49ed43@github.com>
Message-ID:

On Fri, 24 May 2024 13:46:15 GMT, Afshin Zafari wrote:

>> This PR fixes the problems that existed in the original PR (https://github.com/openjdk/jdk/pull/18745). There are two main fixes here:
>> 1- `ReservedSpace` class is changed so that the `_flag` member never changes after it is set in ctor. Since reserving memory regions may go through a try-and-fail sequence of reserve-release pairs, changing the `_flag` member at failed releases would lead to incorrect flags in subsequent reserves.
>> Also, some assertions are added to the getters of a `ReservedSpace` to check if the region is successfully reserved.
>> >> 2- In order to have adjacent regions with different flags, CDS reserves a (large) region `R` and then splits it into sub regions `R1` and `R2` (`R == <---R1---><--R2-->`). At release time, NMT tracks only `R` and ignores releasing `R1` and `R2`. This ignoring is problematic when a requested region `R` is size-aligned to `R1---R---R2` first and then the `R1` and `R2` are released (`chop_extra_memory` function is called for this). In this case, NMT ignores tracking `R1` and `R2` with false assumption that a containing `R` will be released. Therefore, `R1` and `R2` remain in the NMT reserved-regions-list and when a new reserve happens at that regions, NMT complains by raising an exception. >> >> Tests: >> mach5 tiers 1-5, {linux-x64, macosx-aarch64, windows-x64, linux-aarch64 } x {debug, non-debug} > > Afshin Zafari has updated the pull request incrementally with one additional commit since the last revision: > > more fixes. Withdrawn. ------------- PR Comment: https://git.openjdk.org/jdk/pull/19343#issuecomment-2210686409 From azafari at openjdk.org Fri Jul 5 11:15:26 2024 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 5 Jul 2024 11:15:26 GMT Subject: Withdrawn: 8331539: [REDO] NMT: add/make a mandatory MEMFLAGS argument to family of os::reserve/commit/uncommit memory API In-Reply-To: <1i0PKv9mCusM6BZqXG8ULe0lRA2Nz2ix4aZHz9otNMM=.b9d2d151-883e-4cb6-be48-4ba45b49ed43@github.com> References: <1i0PKv9mCusM6BZqXG8ULe0lRA2Nz2ix4aZHz9otNMM=.b9d2d151-883e-4cb6-be48-4ba45b49ed43@github.com> Message-ID: On Wed, 22 May 2024 08:29:05 GMT, Afshin Zafari wrote: > This PR fixes the problems existed in the original PR (https://github.com/openjdk/jdk/pull/18745). There are two main fixes here: > 1- `ReservedSpace` class is changed so that the `_flag` member never changes after it is set in ctor. 
Since reserving memory regions may go thru a try and fail sequence of reserve-release pairs, changing the `_flag` member at failed releases would lead to incorrect flags in subsequent reserves. > Also, some assertion are added to the getters of a `ReservedSpace` to check if the region is successfully reserved. > > 2- In order to have adjacent regions with different flags, CDS reserves a (large) region `R` and then splits it into sub regions `R1` and `R2` (`R == <---R1---><--R2-->`). At release time, NMT tracks only `R` and ignores releasing `R1` and `R2`. This ignoring is problematic when a requested region `R` is size-aligned to `R1---R---R2` first and then the `R1` and `R2` are released (`chop_extra_memory` function is called for this). In this case, NMT ignores tracking `R1` and `R2` with false assumption that a containing `R` will be released. Therefore, `R1` and `R2` remain in the NMT reserved-regions-list and when a new reserve happens at that regions, NMT complains by raising an exception. > > Tests: > mach5 tiers 1-5, {linux-x64, macosx-aarch64, windows-x64, linux-aarch64 } x {debug, non-debug} This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/19343 From duke at openjdk.org Mon Jul 8 05:30:40 2024 From: duke at openjdk.org (duke) Date: Mon, 8 Jul 2024 05:30:40 GMT Subject: Withdrawn: 8330171: Lazy W^X switch implementation In-Reply-To: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> References: <9eymaXovxUNFdkAkzojFQP5trwl_yyY0jE2GzcMEjR4=.02ee2ef9-c476-4c7c-9e4a-e021425c38bc@github.com> Message-ID: On Fri, 12 Apr 2024 14:40:05 GMT, Sergey Nazarkin wrote: > An alternative for preemptively switching the W^X thread mode on macOS with an AArch64 CPU. This implementation triggers the switch in response to the SIGBUS signal if the *si_addr* belongs to the CodeCache area. 
With this approach, it is now feasible to eliminate all WX guards and avoid potentially costly operations. However, no significant improvement or degradation in performance has been observed. Additionally, considering the issue with AsyncGetCallTrace, the patched JVM has been successfully operated with [asgct_bottom](https://github.com/parttimenerd/asgct_bottom) and [async-profiler](https://github.com/async-profiler/async-profiler).
>
> Additional testing:
> - [x] MacOS AArch64 server fastdebug *gtest*
> - [ ] MacOS AArch64 server fastdebug *jtreg:hotspot:tier4*
> - [ ] Benchmarking
>
> @apangin and @parttimenerd could you please check the patch on your scenarios?

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/18762

From duke at openjdk.org Mon Jul 8 08:54:38 2024
From: duke at openjdk.org (duke)
Date: Mon, 8 Jul 2024 08:54:38 GMT
Subject: RFR: 8331411: Shenandoah: Reconsider spinning duration in ShenandoahLock [v7]
In-Reply-To:
References:
Message-ID:

On Tue, 25 Jun 2024 08:20:43 GMT, Xiaolong Peng wrote:

>> ### Notes
>> While doing CAS to get the lock, the original implementation sleeps/yields once after spinning 0xFFF times, and repeats this until it gets the lock, i.e. a ```(N spins + sleep/yield) loop```. Test results suggest that more spinning results in worse performance, so we decided to change the algorithm to ```(N spins) + (yield loop)```, and to block the thread immediately if a safepoint is pending. We still needed to determine the best N value for spins, so we tested multiple possible values (0, 0x01, 0x7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF) and compared the results with the baseline data (original implementation).
>>
>> #### Test code
>>
>>     public class Alloc {
>>         static final int THREADS = 1280; //32 threads per CPU core, 40 cores
>>         static final Object[] sinks = new Object[64*THREADS];
>>         static volatile boolean start;
>>         static volatile boolean stop;
>>
>>         public static void main(String... args) throws Throwable {
>>             for (int t = 0; t < THREADS; t++) {
>>                 int ft = t;
>>                 new Thread(() -> work(ft * 64)).start();
>>             }
>>
>>             Thread.sleep(1000);
>>             start = true;
>>             Thread.sleep(30_000);
>>             stop = true;
>>         }
>>
>>         public static void work(int idx) {
>>             while (!start) { Thread.onSpinWait(); }
>>             while (!stop) {
>>                 sinks[idx] = new byte[128];
>>             }
>>         }
>>     }
>>
>> Run it like this and observe TTSP times:
>>
>>     java -Xms256m -Xmx256m -XX:+UseShenandoahGC -XX:-UseTLAB -Xlog:gc -Xlog:safepoint Alloc.java
>>
>> #### Metrics from tests(TTSP, allocation rate)
>> ##### Heavy contention(1280 threads, 32 per CPU core)
>> | Test     | SP polls | Average TTSP | 2% TRIMMEAN | MAX      | MIN   |
>> | -------- | -------- | ------------ | ----------- | -------- | ----- |
>> | baseline | 18       | 3882361      | 3882361     | 43310117 | 49197 |
>> | 0x00     | 168      | 861677       | 589036      | 46937732 | 44005 |
>> | 0x01     | 164      | 627056       | 572697      | 10004767 | 55472 |
>> | 0x07     | 163      | 650578       | 625329      | 5312631  | 53734 |
>> | 0x0F     | 164      | 590398       | 557325      | 6481761  | 56794 |
>> | 0x1F     | 144      | 814400       | 790089      | 5024881  | 56041 |
>> | 0x3F     | 137      | 830288       | 801192      | 5533538  | 54982 |
>> | 0x7F     | 132      | 1101625      | 845626      | 35425614 | 57492 |
>> | 0xFF     | 125      | 1005433      | 970988      | 6193342  | 54362 |
>>
>> ##### Light contention(40 threads, 1 per CPU core)
>> | Spins | SP polls | Average TTSP | 2% T...
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
>   Simplify code with less stacks

@pengxiaolong Your change (at version 6fd8205d851b27640a1f2ee6a29dc42cc811f530) is now ready to be sponsored by a Committer.
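
The `(N spins) + (yield loop)` acquisition scheme from the notes above can be illustrated with a small Java lock. This is a sketch under stated assumptions, not the `ShenandoahLock` code: the class `SpinThenYieldLock` and the constant `MAX_SPINS` are hypothetical, the value 0x0F is an arbitrary pick from the tested range rather than the value the PR settled on, and the "block immediately if a safepoint is pending" step has no equivalent here and is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of "(N spins) + (yield loop)"; not the HotSpot implementation.
final class SpinThenYieldLock {
    // Assumed spin budget, chosen arbitrarily from the values tested in the thread.
    private static final int MAX_SPINS = 0x0F;

    private final AtomicBoolean locked = new AtomicBoolean(false);

    // Single CAS attempt: succeeds only if the lock was free.
    boolean tryLock() {
        return locked.compareAndSet(false, true);
    }

    void lock() {
        // Phase 1: a bounded number of spins, performed once per acquisition.
        for (int i = 0; i < MAX_SPINS; i++) {
            if (tryLock()) {
                return;
            }
            Thread.onSpinWait();
        }
        // Phase 2: no further spinning, only yield between CAS attempts.
        while (!tryLock()) {
            Thread.yield();
        }
    }

    void unlock() {
        locked.set(false);
    }
}
```

The difference from the original `(N spins + sleep/yield) loop` shape is that the spin phase is not re-entered after the first yield, so a thread that has already lost the race stops burning CPU on spinning.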
------------- PR Comment: https://git.openjdk.org/jdk/pull/19570#issuecomment-2192464563 From aturbanov at openjdk.org Mon Jul 8 08:54:38 2024 From: aturbanov at openjdk.org (Andrey Turbanov) Date: Mon, 8 Jul 2024 08:54:38 GMT Subject: RFR: 8331411: Shenandoah: Reconsider spinning duration in ShenandoahLock [v7] In-Reply-To: References: Message-ID: On Tue, 25 Jun 2024 08:20:43 GMT, Xiaolong Peng wrote: >> ### Notes >> While doing CAS to get the lock, original implementation sleep/yield once after spinning 0xFFF times, and do these over and over again until get the lock successfully, it is like ```(N spins + sleep/yield) loop ```, based on test results, it seems doing more spins results in worse performance, we decided to change the algorithm to ```(N spins) + (yield loop)```, meanwhile block thread immediately if Safepoint is pending. But still need to determine the best N value for spins, tested multiple possible values: 0, 0x01, 0x7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF, and compare the results with the baseline data(original implementation). >> >> #### Test code >> >> public class Alloc { >> static final int THREADS = 1280; //32 threads per CPU core, 40 cores >> static final Object[] sinks = new Object[64*THREADS]; >> static volatile boolean start; >> static volatile boolean stop; >> >> public static void main(String... 
args) throws Throwable { >> for (int t = 0; t < THREADS; t++) { >> int ft = t; >> new Thread(() -> work(ft * 64)).start(); >> } >> >> Thread.sleep(1000); >> start = true; >> Thread.sleep(30_000); >> stop = true; >> } >> >> public static void work(int idx) { >> while (!start) { Thread.onSpinWait(); } >> while (!stop) { >> sinks[idx] = new byte[128]; >> } >> } >> } >> >> >> Run it like this and observe TTSP times: >> >> >> java -Xms256m -Xmx256m -XX:+UseShenandoahGC -XX:-UseTLAB -Xlog:gc -Xlog:safepoint Alloc.java >> >> >> #### Metrics from tests(TTSP, allocation rate) >> ##### Heavy contention(1280 threads, 32 per CPU core) >> | Test | SP polls | Average TTSP | 2% TRIMMEAN | MAX | MIN | >> | -------- | -------- | ------------ | ----------- | -------- | ----- | >> | baseline | 18 | 3882361 | 3882361 | 43310117 | 49197 | >> | 0x00 | 168 | 861677 | 589036 | 46937732 | 44005 | >> | 0x01 | 164 | 627056 | 572697 | 10004767 | 55472 | >> | 0x07 | 163 | 650578 | 625329 | 5312631 | 53734 | >> | 0x0F | 164 | 590398 | 557325 | 6481761 | 56794 | >> | 0x1F | 144 | 814400 | 790089 | 5024881 | 56041 | >> | 0x3F | 137 | 830288 | 801192 | 5533538 | 54982 | >> | 0x7F | 132 | 1101625 | 845626 | 35425614 | 57492 | >> | 0xFF | 125 | 1005433 | 970988 | 6193342 | 54362 | >> >> >> ##### Light contention(40 threads, 1 per CPU core) >> | Spins | SP polls | Average TTSP | 2% T... > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Simplify code with less stacks src/hotspot/share/gc/shenandoah/shenandoahLock.cpp line 47: > 45: void ShenandoahLock::contended_lock_internal(JavaThread* java_thread) { > 46: assert(!ALLOW_BLOCK || java_thread != nullptr, "Must have a Java thread when allowing block."); > 47: // Spin this much on multi-processor, do not spin on multi-processor. >do not spin on multi-processor Did you mean on **single**-processor? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19570#discussion_r1668249587 From shade at openjdk.org Mon Jul 8 11:04:41 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jul 2024 11:04:41 GMT Subject: RFR: 8331411: Shenandoah: Reconsider spinning duration in ShenandoahLock [v7] In-Reply-To: References: Message-ID: <8NhfPejZeG763Q6yny929XiwoBRC23XW8rGO-zSHtMM=.7acf73ff-9921-4537-b895-e94f29a569d8@github.com> On Mon, 8 Jul 2024 08:51:38 GMT, Andrey Turbanov wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> Simplify code with less stacks > > src/hotspot/share/gc/shenandoah/shenandoahLock.cpp line 47: > >> 45: void ShenandoahLock::contended_lock_internal(JavaThread* java_thread) { >> 46: assert(!ALLOW_BLOCK || java_thread != nullptr, "Must have a Java thread when allowing block."); >> 47: // Spin this much on multi-processor, do not spin on multi-processor. > >>do not spin on multi-processor > > Did you mean on **single**-processor? Yes, I think so. @pengxiaolong, please do a simple follow-up fix? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19570#discussion_r1668426904 From xpeng at openjdk.org Mon Jul 8 14:34:38 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 14:34:38 GMT Subject: RFR: 8331411: Shenandoah: Reconsider spinning duration in ShenandoahLock [v7] In-Reply-To: <8NhfPejZeG763Q6yny929XiwoBRC23XW8rGO-zSHtMM=.7acf73ff-9921-4537-b895-e94f29a569d8@github.com> References: <8NhfPejZeG763Q6yny929XiwoBRC23XW8rGO-zSHtMM=.7acf73ff-9921-4537-b895-e94f29a569d8@github.com> Message-ID: On Mon, 8 Jul 2024 11:01:51 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/gc/shenandoah/shenandoahLock.cpp line 47: >> >>> 45: void ShenandoahLock::contended_lock_internal(JavaThread* java_thread) { >>> 46: assert(!ALLOW_BLOCK || java_thread != nullptr, "Must have a Java thread when allowing block."); >>> 47: // Spin this much on multi-processor, do not spin on multi-processor. >> >>>do not spin on multi-processor >> >> Did you mean on **single**-processor? > > Yes, I think so. @pengxiaolong, please do a simple follow-up fix? Thanks, I'll fix it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19570#discussion_r1668758318 From duke at openjdk.org Mon Jul 8 16:22:33 2024 From: duke at openjdk.org (duke) Date: Mon, 8 Jul 2024 16:22:33 GMT Subject: RFR: 8335126: Shenandoah: Improve OOM handling In-Reply-To: References: Message-ID: On Wed, 26 Jun 2024 17:51:36 GMT, Kelvin Nilsen wrote: > 1. Throw OOM after failed allocation request following a Full GC (rather > than retrying as long as Full GC makes good progress because > repeatedly retrying the allocation request creates brown-out behavior > with no identified benefits on real-world workloads) > > 2. Count a successful allocation following a blocking > handle_allocation_failure() request to be good GC progress. 
> Otherwise, we increment gc_no_progress_count in full GCs that > have bad progress but successful allocations, and this causes > unwanted failure to even try a full GC in a different thread after > an out-of-memory condition might have been resolved in this thread. > > 3. Count a completed concurrent GC cycle as good progress, regardless > of how much memory it might have been able to reclaim. The fact that > concurrent GC succeeded without allocation failure and without > degeneration is considered good progress. Successful concurrent > GCs between Full GCs will reset the gc_no_progress_count to zero. > > 4. Do not count degenerated cycles as having no-progress. If a > degenerated cycle has no progress, it will upgrade to full GC. > The upgraded full GC will evaluate its own progress. We don't > want to count this "same [upgraded] cycle" twice. > > These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation. @kdnilsen Your change (at version 439d394cab24ea9577550d3e44bbb02c1486dba8) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/19912#issuecomment-2214585906 From kdnilsen at openjdk.org Mon Jul 8 16:22:34 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 8 Jul 2024 16:22:34 GMT Subject: RFR: 8335126: Shenandoah: Improve OOM handling In-Reply-To: <45BHtt51LihYveREtNv8vM0saAiwbeqWrSP8cyxm1EM=.39941e81-f71c-4333-b47c-516bb69375fb@github.com> References: <45BHtt51LihYveREtNv8vM0saAiwbeqWrSP8cyxm1EM=.39941e81-f71c-4333-b47c-516bb69375fb@github.com> Message-ID: <3hPxig-gI0Ac-H6ONjjc1NW-dmLW5erepRiVqG8J2dg=.d2c64ab7-5991-4059-a4d5-424211bd1987@github.com> On Thu, 27 Jun 2024 19:02:04 GMT, Aleksey Shipilev wrote: >> 1. 
Throw OOM after failed allocation request following a Full GC (rather >> than retrying as long as Full GC makes good progress because >> repeatedly retrying the allocation request creates brown-out behavior >> with no identified benefits on real-world workloads) >> >> 2. Count a successful allocation following a blocking >> handle_allocation_failure() request to be good GC progress. >> Otherwise, we increment gc_no_progress_count in full GCs that >> have bad progress but successful allocations, and this causes >> unwanted failure to even try a full GC in a different thread after >> an out-of-memory condition might have been resolved in this thread. >> >> 3. Count a completed concurrent GC cycle as good progress, regardless >> of how much memory it might have been able to reclaim. The fact that >> concurrent GC succeeded without allocation failure and without >> degeneration is considered good progress. Successful concurrent >> GCs between Full GCs will reset the gc_no_progress_count to zero. >> >> 4. Do not count degenerated cycles as having no-progress. If a >> degenerated cycle has no progress, it will upgrade to full GC. >> The upgraded full GC will evaluate its own progress. We don't >> want to count this "same [upgraded] cycle" twice. >> >> These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation. > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 960: > >> 958: // b) We experienced at least one Full GC (whether or not it had good progress) >> 959: // >> 960: // TODO: Rather than require a Full GC before throwing OOMError, it might be more appropriate for handle_alloc_failure() > > Pro-tip: If you find yourself writing a large TODO comment, it should probably be transplanted straight into a new issue. Thanks. 
I will create an issue for this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19912#discussion_r1668933461 From xpeng at openjdk.org Mon Jul 8 16:48:44 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 16:48:44 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock Message-ID: Hi all, This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. This PR should be trivial. Best, Xiaolong. ------------- Commit messages: - 8335904: Fix invalid comment in ShenandoahLock Changes: https://git.openjdk.org/jdk/pull/20079/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20079&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335904 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20079.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20079/head:pull/20079 PR: https://git.openjdk.org/jdk/pull/20079 From xpeng at openjdk.org Mon Jul 8 16:52:39 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 16:52:39 GMT Subject: RFR: 8331411: Shenandoah: Reconsider spinning duration in ShenandoahLock [v7] In-Reply-To: References: <8NhfPejZeG763Q6yny929XiwoBRC23XW8rGO-zSHtMM=.7acf73ff-9921-4537-b895-e94f29a569d8@github.com> Message-ID: On Mon, 8 Jul 2024 14:31:45 GMT, Xiaolong Peng wrote: >> Yes, I think so. @pengxiaolong, please do a simple follow-up fix? > > Thanks, I'll fix it. https://github.com/openjdk/jdk/pull/20079 I have created a bug and PR for it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19570#discussion_r1668969590 From kdnilsen at openjdk.org Mon Jul 8 17:09:48 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 8 Jul 2024 17:09:48 GMT Subject: RFR: 8335126: Shenandoah: Improve OOM handling [v2] In-Reply-To: References: Message-ID: > 1. 
Throw OOM after failed allocation request following a Full GC (rather > than retrying as long as Full GC makes good progress because > repeatedly retrying the allocation request creates brown-out behavior > with no identified benefits on real-world workloads) > > 2. Count a successful allocation following a blocking > handle_allocation_failure() request to be good GC progress. > Otherwise, we increment gc_no_progress_count in full GCs that > have bad progress but successful allocations, and this causes > unwanted failure to even try a full GC in a different thread after > an out-of-memory condition might have been resolved in this thread. > > 3. Count a completed concurrent GC cycle as good progress, regardless > of how much memory it might have been able to reclaim. The fact that > concurrent GC succeeded without allocation failure and without > degeneration is considered good progress. Successful concurrent > GCs between Full GCs will reset the gc_no_progress_count to zero. > > 4. Do not count degenerated cycles as having no-progress. If a > degenerated cycle has no progress, it will upgrade to full GC. > The upgraded full GC will evaluate its own progress. We don't > want to count this "same [upgraded] cycle" twice. > > These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation. 
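As an illustration of the four rules above, here is a minimal C++ sketch of the bookkeeping they describe. It is not the HotSpot code: ProgressTracker, CycleKind, on_cycle_end, on_allocation_success_after_gc and should_throw_oom are hypothetical names chosen for this example, and resetting the counter after a degenerated cycle with good progress is an assumption that the description does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for the progress bookkeeping described above.
// None of these names are HotSpot APIs.
enum class CycleKind { Concurrent, Degenerated, Full };

class ProgressTracker {
  std::size_t _no_progress_count = 0;

public:
  std::size_t no_progress_count() const { return _no_progress_count; }

  // Called when a GC cycle of the given kind finishes.
  void on_cycle_end(CycleKind kind, bool good_progress) {
    switch (kind) {
      case CycleKind::Concurrent:
        // Item 3: a completed concurrent cycle always counts as good
        // progress, regardless of how much memory it reclaimed.
        _no_progress_count = 0;
        break;
      case CycleKind::Degenerated:
        // Item 4: a degenerated cycle without progress is not counted,
        // because the upgraded Full GC evaluates its own progress.
        // Assumption: a degenerated cycle with good progress resets the count.
        if (good_progress) {
          _no_progress_count = 0;
        }
        break;
      case CycleKind::Full:
        if (good_progress) {
          _no_progress_count = 0;
        } else {
          assert(_no_progress_count < std::numeric_limits<std::size_t>::max());
          _no_progress_count++;
        }
        break;
    }
  }

  // Item 2: an allocation that succeeds after a blocking allocation-failure
  // request counts as good progress, even if the GC itself reclaimed little.
  void on_allocation_success_after_gc() { _no_progress_count = 0; }
};

// Item 1: once a Full GC has run and the retried allocation still fails,
// report out-of-memory instead of retrying again.
inline bool should_throw_oom(bool retry_failed, bool full_gc_ran) {
  return retry_failed && full_gc_ran;
}
```

In this sketch the counter only grows on a Full GC without good progress, so a degenerated cycle that upgrades to Full GC is counted once rather than twice.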
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Change TODO comment to reference JBS issue ------------- Changes: - all: https://git.openjdk.org/jdk/pull/19912/files - new: https://git.openjdk.org/jdk/pull/19912/files/439d394c..24cd0fa6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=19912&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=19912&range=00-01 Stats: 15 lines in 1 file changed: 0 ins; 14 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/19912.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19912/head:pull/19912 PR: https://git.openjdk.org/jdk/pull/19912 From shade at openjdk.org Mon Jul 8 17:48:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jul 2024 17:48:34 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 16:43:00 GMT, Xiaolong Peng wrote: > Hi all, > This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. > This PR should be trivial. > > Best, > Xiaolong. This looks okay. src/hotspot/share/gc/shenandoah/shenandoahLock.cpp line 47: > 45: void ShenandoahLock::contended_lock_internal(JavaThread* java_thread) { > 46: assert(!ALLOW_BLOCK || java_thread != nullptr, "Must have a Java thread when allowing block."); > 47: // Spin this much on multi-processor, do not spin on single-processor. I think the proper nomenclature is "uniprocessor". We can avoid all this if we say e.g. "// Spin this much, but only on multi-processor systems" ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/20079#pullrequestreview-2163983319 PR Review Comment: https://git.openjdk.org/jdk/pull/20079#discussion_r1669040268 From shade at openjdk.org Mon Jul 8 17:49:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jul 2024 17:49:34 GMT Subject: RFR: 8335126: Shenandoah: Improve OOM handling [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 17:09:48 GMT, Kelvin Nilsen wrote: >> 1. Throw OOM after failed allocation request following a Full GC (rather >> than retrying as long as Full GC makes good progress because >> repeatedly retrying the allocation request creates brown-out behavior >> with no identified benefits on real-world workloads) >> >> 2. Count a successful allocation following a blocking >> handle_allocation_failure() request to be good GC progress. >> Otherwise, we increment gc_no_progress_count in full GCs that >> have bad progress but successful allocations, and this causes >> unwanted failure to even try a full GC in a different thread after >> an out-of-memory condition might have been resolved in this thread. >> >> 3. Count a completed concurrent GC cycle as good progress, regardless >> of how much memory it might have been able to reclaim. The fact that >> concurrent GC succeeded without allocation failure and without >> degeneration is considered good progress. Successful concurrent >> GCs between Full GCs will reset the gc_no_progress_count to zero. >> >> 4. Do not count degenerated cycles as having no-progress. If a >> degenerated cycle has no progress, it will upgrade to full GC. >> The upgraded full GC will evaluate its own progress. We don't >> want to count this "same [upgraded] cycle" twice. >> >> These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. 
It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Change TODO comment to reference JBS issue The machines are rebelling! I, for one, welcome our new overlords. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19912#pullrequestreview-2163990430 From duke at openjdk.org Mon Jul 8 18:05:42 2024 From: duke at openjdk.org (duke) Date: Mon, 8 Jul 2024 18:05:42 GMT Subject: RFR: 8335126: Shenandoah: Improve OOM handling [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 17:09:48 GMT, Kelvin Nilsen wrote: >> 1. Throw OOM after failed allocation request following a Full GC (rather >> than retrying as long as Full GC makes good progress because >> repeatedly retrying the allocation request creates brown-out behavior >> with no identified benefits on real-world workloads) >> >> 2. Count a successful allocation following a blocking >> handle_allocation_failure() request to be good GC progress. >> Otherwise, we increment gc_no_progress_count in full GCs that >> have bad progress but successful allocations, and this causes >> unwanted failure to even try a full GC in a different thread after >> an out-of-memory condition might have been resolved in this thread. >> >> 3. Count a completed concurrent GC cycle as good progress, regardless >> of how much memory it might have been able to reclaim. The fact that >> concurrent GC succeeded without allocation failure and without >> degeneration is considered good progress. Successful concurrent >> GCs between Full GCs will reset the gc_no_progress_count to zero. >> >> 4. Do not count degenerated cycles as having no-progress. If a >> degenerated cycle has no progress, it will upgrade to full GC. >> The upgraded full GC will evaluate its own progress. 
We don't >> want to count this "same [upgraded] cycle" twice. >> >> These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Change TODO comment to reference JBS issue @kdnilsen Your change (at version 24cd0fa655526ae7abc68630b07ba1eb5b094187) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/19912#issuecomment-2214843005 From kdnilsen at openjdk.org Mon Jul 8 18:05:43 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 8 Jul 2024 18:05:43 GMT Subject: Integrated: 8335126: Shenandoah: Improve OOM handling In-Reply-To: References: Message-ID: On Wed, 26 Jun 2024 17:51:36 GMT, Kelvin Nilsen wrote: > 1. Throw OOM after failed allocation request following a Full GC (rather > than retrying as long as Full GC makes good progress because > repeatedly retrying the allocation request creates brown-out behavior > with no identified benefits on real-world workloads) > > 2. Count a successful allocation following a blocking > handle_allocation_failure() request to be good GC progress. > Otherwise, we increment gc_no_progress_count in full GCs that > have bad progress but successful allocations, and this causes > unwanted failure to even try a full GC in a different thread after > an out-of-memory condition might have been resolved in this thread. > > 3. Count a completed concurrent GC cycle as good progress, regardless > of how much memory it might have been able to reclaim. The fact that > concurrent GC succeeded without allocation failure and without > degeneration is considered good progress. 
Successful concurrent > GCs between Full GCs will reset the gc_no_progress_count to zero. > > 4. Do not count degenerated cycles as having no-progress. If a > degenerated cycle has no progress, it will upgrade to full GC. > The upgraded full GC will evaluate its own progress. We don't > want to count this "same [upgraded] cycle" twice. > > These changes have been tested over a variety of workloads and standard tests. These changes have also been tested with the generational mode of Shenandoah. It appears these changes provide more robust and consistent handling across a diversity of scenarios than the original implementation. This pull request has now been integrated. Changeset: 3a87eb5c Author: Kelvin Nilsen URL: https://git.openjdk.org/jdk/commit/3a87eb5c4606ce39970962895315567e8606eba7 Stats: 33 lines in 3 files changed: 13 ins; 1 del; 19 mod 8335126: Shenandoah: Improve OOM handling Reviewed-by: shade, ysr, wkemper, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/19912 From ysr at openjdk.org Mon Jul 8 18:17:55 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Mon, 8 Jul 2024 18:17:55 GMT Subject: RFR: 8327000: GenShen: Integrate updated Shenandoah implementation of FreeSet into GenShen In-Reply-To: References: Message-ID: <9sjiWgkSN4iOQKtCZekuBLi0odQe98-1zsElsuLMj4c=.e1d2608c-98e4-4c04-8996-52d405ddaedf@github.com> On Thu, 27 Jun 2024 20:38:32 GMT, Kelvin Nilsen wrote: > This pull request contains a backport of commit 9d2712dd from the openjdk/shenandoah repository. > > The commit being backported was authored by Kelvin Nilsen on 26 Jun 2024 and was reviewed by William Kemper and Y. Srinivas Ramakrishna. LGTM. ------------- Marked as reviewed by ysr (Committer). 
PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/62#pullrequestreview-2164045317 From xpeng at openjdk.org Mon Jul 8 18:29:02 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 18:29:02 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock [v2] In-Reply-To: References: Message-ID: > Hi all, > This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. > This PR should be trivial. > > Best, > Xiaolong. Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Update comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20079/files - new: https://git.openjdk.org/jdk/pull/20079/files/9d2c31f1..ad7dc6a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20079&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20079&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20079.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20079/head:pull/20079 PR: https://git.openjdk.org/jdk/pull/20079 From xpeng at openjdk.org Mon Jul 8 18:29:02 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 18:29:02 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 17:43:34 GMT, Aleksey Shipilev wrote: >> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: >> >> Update comments > > src/hotspot/share/gc/shenandoah/shenandoahLock.cpp line 47: > >> 45: void ShenandoahLock::contended_lock_internal(JavaThread* java_thread) { >> 46: assert(!ALLOW_BLOCK || java_thread != nullptr, "Must have a Java thread when allowing block."); >> 47: // Spin this much on multi-processor, do not spin on single-processor. 
> > I think the proper nomenclature is "uniprocessor". We can avoid all this if we say e.g. "// Spin this much, but only on multi-processor systems" Thanks, suggestion is taken, I updated the PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20079#discussion_r1669092329 From shade at openjdk.org Mon Jul 8 19:15:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 8 Jul 2024 19:15:35 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 18:29:02 GMT, Xiaolong Peng wrote: >> Hi all, >> This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. >> This PR should be trivial. >> >> Best, >> Xiaolong. > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Update comments Looks good and trivial. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20079#pullrequestreview-2164178482 From xpeng at openjdk.org Mon Jul 8 20:04:35 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 20:04:35 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 18:29:02 GMT, Xiaolong Peng wrote: >> Hi all, >> This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. >> This PR should be trivial. >> >> Best, >> Xiaolong. > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Update comments Thanks Aleksey! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/20079#issuecomment-2215110281 From duke at openjdk.org Mon Jul 8 20:04:35 2024 From: duke at openjdk.org (duke) Date: Mon, 8 Jul 2024 20:04:35 GMT Subject: RFR: 8335904: Fix invalid comment in ShenandoahLock [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 18:29:02 GMT, Xiaolong Peng wrote: >> Hi all, >> This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. >> This PR should be trivial. >> >> Best, >> Xiaolong. > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Update comments @pengxiaolong Your change (at version ad7dc6a36d777ddc8811ee05b8307fb6b55bc763) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20079#issuecomment-2215117981 From xpeng at openjdk.org Mon Jul 8 20:13:35 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Mon, 8 Jul 2024 20:13:35 GMT Subject: Integrated: 8335904: Fix invalid comment in ShenandoahLock In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 16:43:00 GMT, Xiaolong Peng wrote: > Hi all, > This PR is to fix an invalid comment in ShenandoahLock I introduced in https://github.com/openjdk/jdk/pull/19570/files#r1668249587, thank you turbanoff@ for catching this, it is easy to miss since the PR had been closed. > This PR should be trivial. > > Best, > Xiaolong. This pull request has now been integrated. 
Changeset: bb1f8a16 Author: Xiaolong Peng Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/bb1f8a1698553d5962569ac8912edd0d7ef010dd Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8335904: Fix invalid comment in ShenandoahLock Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/20079 From kdnilsen at openjdk.org Mon Jul 8 21:34:56 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 8 Jul 2024 21:34:56 GMT Subject: RFR: 8335930: GenShen: Reserve regions within each generation's freeset until available is sufficient Message-ID: Reserve regions until available memory within each generation's partition is sufficient to satisfy reserve request. ------------- Commit messages: - Reserve until available in partition is sufficient - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Merge branch 'openjdk:master' into master - Revert "Change behavior of max_old and min_old" - ... 
and 9 more: https://git.openjdk.org/shenandoah/compare/d2102347...b226b32a Changes: https://git.openjdk.org/shenandoah/pull/457/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=457&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335930 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/shenandoah/pull/457.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/457/head:pull/457 PR: https://git.openjdk.org/shenandoah/pull/457 From wkemper at openjdk.org Mon Jul 8 21:53:58 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 8 Jul 2024 21:53:58 GMT Subject: RFR: 8335930: GenShen: Reserve regions within each generation's freeset until available is sufficient In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 21:30:23 GMT, Kelvin Nilsen wrote: > Reserve regions until available memory within each generation's partition is sufficient to satisfy reserve request. LGTM ------------- Marked as reviewed by wkemper (Committer). PR Review: https://git.openjdk.org/shenandoah/pull/457#pullrequestreview-2164502168 From wkemper at openjdk.org Mon Jul 8 21:54:23 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 8 Jul 2024 21:54:23 GMT Subject: RFR: 8335932: GenShen: Fix old heuristic unit test Message-ID: This isn't really a unit test and it must be run with Shenandoah enabled, but it exercises a hard to reach code path. The test has suffered some bit-rot* of late that needs to be addressed. * We no longer set soft-max-capacity on the old generation. 
------------- Commit messages: - Fix old heuristic test Changes: https://git.openjdk.org/shenandoah/pull/458/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=458&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335932 Stats: 11 lines in 1 file changed: 5 ins; 3 del; 3 mod Patch: https://git.openjdk.org/shenandoah/pull/458.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/458/head:pull/458 PR: https://git.openjdk.org/shenandoah/pull/458 From kdnilsen at openjdk.org Mon Jul 8 22:06:58 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 8 Jul 2024 22:06:58 GMT Subject: RFR: 8335932: GenShen: Fix old heuristic unit test In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 21:49:51 GMT, William Kemper wrote: > This isn't really a unit test and it must be run with Shenandoah enabled, but it exercises a hard to reach code path. The test has suffered some bit-rot* of late that needs to be addressed. > > * We no longer set soft-max-capacity on the old generation. Marked as reviewed by kdnilsen (Committer). test/hotspot/gtest/gc/shenandoah/test_shenandoahOldHeuristic.cpp line 89: > 87: _heap->heap_region_iterate(&reset); > 88: _heap->old_generation()->set_capacity(ShenandoahHeapRegion::region_size_bytes() * 10); > 89: _heap->old_generation()->set_evacuation_reserve(ShenandoahHeapRegion::region_size_bytes() * 4); I'll approve these changes because I assume you've tested and are satisfied with the results. Might be good to leave some comments in this code that warn the user that failure of this test does not necessarily mean we've broken old heuristics. What causes me some concern is that setting the capacity and reserves of old generation is very "dangerous". There are "assumptions" scattered throughout the implementation related to reserves being sufficient to handle anticipated evacuation, and furthermore, any attempt to take control over capacity and reserves may be defeated by the GC's ergonomics... 
------------- PR Review: https://git.openjdk.org/shenandoah/pull/458#pullrequestreview-2164517412 PR Review Comment: https://git.openjdk.org/shenandoah/pull/458#discussion_r1669383515 From kdnilsen at openjdk.org Mon Jul 8 22:13:46 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 8 Jul 2024 22:13:46 GMT Subject: RFR: 8335932: GenShen: Fix old heuristic unit test In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 22:03:51 GMT, Kelvin Nilsen wrote: >> This isn't really a unit test and it must be run with Shenandoah enabled, but it exercises a hard to reach code path. The test has suffered some bit-rot* of late that needs to be addressed. >> >> * We no longer set soft-max-capacity on the old generation. > > test/hotspot/gtest/gc/shenandoah/test_shenandoahOldHeuristic.cpp line 89: > >> 87: _heap->heap_region_iterate(&reset); >> 88: _heap->old_generation()->set_capacity(ShenandoahHeapRegion::region_size_bytes() * 10); >> 89: _heap->old_generation()->set_evacuation_reserve(ShenandoahHeapRegion::region_size_bytes() * 4); > > I'll approve these changes because I assume you've tested and are satisfied with the results. > > Might be good to leave some comments in this code that warn the user that failure of this test does not necessarily mean we've broken old heuristics. > > What causes me some concern is that setting the capacity and reserves of old generation is very "dangerous". There are "assumptions" scattered throughout the implementation related to reserves being sufficient to handle anticipated evacuation, and furthermore, any attempt to take control over capacity and reserves may be defeated by the GC's ergonomics... The fourth argument to finish_rebuild (default value false) is a flag that is supposed to be true when evacuation_reserve() is meaningful. When this flag is true, we also assume that old_evacuation_reserve() and old_promo_reserve() are valid. If this flag is false, the freeset rebuilder will ignore the value of evacuation_reserve. 
I wonder how we make sure that the evacuation reserve we set in this "unit test" program is not ignored. (In the share-collector-reserves development branch, I get rid of this flag and assume reserves are always valid.) ------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/458#discussion_r1669388696 From wkemper at openjdk.org Mon Jul 8 23:26:10 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 8 Jul 2024 23:26:10 GMT Subject: RFR: 8335932: GenShen: Fix old heuristic unit test [v2] In-Reply-To: References: Message-ID: > This isn't really a unit test and it must be run with Shenandoah enabled, but it exercises a hard to reach code path. The test has suffered some bit-rot* of late that needs to be addressed. > > * We no longer set soft-max-capacity on the old generation. William Kemper has updated the pull request incrementally with one additional commit since the last revision: Skip destructor actions when disabled ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/458/files - new: https://git.openjdk.org/shenandoah/pull/458/files/4e7e2b38..cf114eb5 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=458&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=458&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/458.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/458/head:pull/458 PR: https://git.openjdk.org/shenandoah/pull/458 From wkemper at openjdk.org Tue Jul 9 00:00:45 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 9 Jul 2024 00:00:45 GMT Subject: RFR: 8335932: GenShen: Fix old heuristic unit test [v2] In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 22:11:31 GMT, Kelvin Nilsen wrote: >> test/hotspot/gtest/gc/shenandoah/test_shenandoahOldHeuristic.cpp line 89: >> >>> 87: _heap->heap_region_iterate(&reset); >>> 88: 
_heap->old_generation()->set_capacity(ShenandoahHeapRegion::region_size_bytes() * 10); >>> 89: _heap->old_generation()->set_evacuation_reserve(ShenandoahHeapRegion::region_size_bytes() * 4); >> >> I'll approve these changes because I assume you've tested and are satisfied with the results. >> >> Might be good to leave some comments in this code that warn the user that failure of this test does not necessarily mean we've broken old heuristics. >> >> What causes me some concern is that setting the capacity and reserves of old generation is very "dangerous". There are "assumptions" scattered throughout the implementation related to reserves being sufficient to handle anticipated evacuation, and furthermore, any attempt to take control over capacity and reserves may be defeated by the GC's ergonomics... > > The fourth argument to finish_rebuild (default value false) is a flag that is supposed to be true when evacuation_reserve() is meaningful. When this flag is true, we also assume that old_evacuation_reserve() and old_promo_reserve() are valid. If this flag is false, the freeset rebuilder will ignore the value of evacuation_reserve. > > I wonder how we make sure that the evacuation reserve we set in this "unit test" program is not ignored. > > (In the share-collector-reserves development branch, I get rid of this flag and assume reserves are always valid.) The tests are configuring the old generation with enough capacity and reserves to make sure old regions can be added to the collection set. Indeed, these tests are not true unit tests because they must be run with `-XX:+UseShenandoahGC` and if a _real_ collection happens during test execution, all bets are off. There is a lengthy comment at the top of the file explaining these risks and limitations. 
------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/458#discussion_r1669458662 From wkemper at openjdk.org Tue Jul 9 17:31:26 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 9 Jul 2024 17:31:26 GMT Subject: Integrated: 8335932: GenShen: Fix old heuristic unit test In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 21:49:51 GMT, William Kemper wrote: > This isn't really a unit test and it must be run with Shenandoah enabled, but it exercises a hard to reach code path. The test has suffered some bit-rot* of late that needs to be addressed. > > * We no longer set soft-max-capacity on the old generation. This pull request has now been integrated. Changeset: f0b87713 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/f0b877138a7dd45fc957178de23cffe6e32405a0 Stats: 12 lines in 1 file changed: 6 ins; 3 del; 3 mod 8335932: GenShen: Fix old heuristic unit test Reviewed-by: kdnilsen ------------- PR: https://git.openjdk.org/shenandoah/pull/458 From wkemper at openjdk.org Tue Jul 9 18:34:01 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 9 Jul 2024 18:34:01 GMT Subject: RFR: 8335932: GenShen: Fix old heuristic unit test Message-ID: Clean backport, test only, low risk. 
------------- Commit messages: - Backport f0b877138a7dd45fc957178de23cffe6e32405a0 Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/64/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=64&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335932 Stats: 12 lines in 1 file changed: 6 ins; 3 del; 3 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/64.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/64/head:pull/64 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/64 From wkemper at openjdk.org Tue Jul 9 18:43:36 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 9 Jul 2024 18:43:36 GMT Subject: Integrated: 8335932: GenShen: Fix old heuristic unit test In-Reply-To: References: Message-ID: On Tue, 9 Jul 2024 18:08:10 GMT, William Kemper wrote: > Clean backport, test only, low risk. This pull request has now been integrated. Changeset: 2d6289f7 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/2d6289f7a2536bb7e320a26770a59a0f3e77fa77 Stats: 12 lines in 1 file changed: 6 ins; 3 del; 3 mod 8335932: GenShen: Fix old heuristic unit test Backport-of: f0b877138a7dd45fc957178de23cffe6e32405a0 ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/64 From kdnilsen at openjdk.org Tue Jul 9 21:22:44 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 9 Jul 2024 21:22:44 GMT Subject: RFR: 8334315: Shenandoah: reduce GC logging noise Message-ID: Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs. 
-------------

Commit messages:
 - Reduce verbosity of gc log messages
 - Revert "Make GC logging less verbose"
 - Make GC logging less verbose
 - Merge branch 'openjdk:master' into master
 - Merge branch 'openjdk:master' into master
 - Merge branch 'openjdk:master' into master
 - Merge branch 'openjdk:master' into master
 - Merge branch 'openjdk:master' into master
 - Merge branch 'openjdk:master' into master
 - Merge branch 'openjdk:master' into master
 - ... and 9 more: https://git.openjdk.org/jdk/compare/8464ce6d...2a74ecc4

Changes: https://git.openjdk.org/jdk/pull/19795/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19795&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8334315
  Stats: 17 lines in 1 file changed: 0 ins; 0 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/19795.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19795/head:pull/19795

PR: https://git.openjdk.org/jdk/pull/19795

From wkemper at openjdk.org Tue Jul 9 21:35:14 2024
From: wkemper at openjdk.org (William Kemper)
Date: Tue, 9 Jul 2024 21:35:14 GMT
Subject: RFR: 8334315: Shenandoah: reduce GC logging noise
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jun 2024 15:11:52 GMT, Kelvin Nilsen wrote:

> Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs.

Marked as reviewed by wkemper (Committer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/19795#pullrequestreview-2167539722

From ysr at openjdk.org Tue Jul 9 22:01:17 2024
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Tue, 9 Jul 2024 22:01:17 GMT
Subject: RFR: 8334315: Shenandoah: reduce GC logging noise
In-Reply-To:
References:
Message-ID: <_gAlo8USEPJAToN76JC667RoUbTdq_5z4UfvReAMrvY=.08520b88-301f-4316-b5c2-560b65d2c34b@github.com>

On Wed, 19 Jun 2024 15:11:52 GMT, Kelvin Nilsen wrote:

> Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs.

-------------

Marked as reviewed by ysr (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/19795#pullrequestreview-2167578144

From xpeng at openjdk.org Tue Jul 9 23:53:23 2024
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Tue, 9 Jul 2024 23:53:23 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
Message-ID:

Hi all,
This PR improves the use of the heap lock in ShenandoahFreeSet::recycle_trash. The original observation mentioned in the [bug](https://bugs.openjdk.org/browse/JDK-8335356) was probably caused by an uncommitted/reverted change I added when Aleksey and I worked on [JDK-8331411](https://bugs.openjdk.org/browse/JDK-8331411). Even though that change was never committed, the way ShenandoahFreeSet::recycle_trash uses the heap lock is still inefficient, and we think it should be improved.
With the logs added in this commit https://github.com/openjdk/jdk/pull/20086/commits/5688ee2c0754483818a89bb7915f58a7464c2df2, I got some key metrics: the average time to acquire the heap lock is about 450 ~ 900 ns, and the average time to recycle one trash region is about 600ns (if not batched, it is 1000+ns, which might be related to branch prediction). The current implementation takes the heap lock once for every trash region; assuming there are 1000 regions to recycle, the time wasted on acquiring the heap lock is more than 0.6ms.

The PR splits the recycling process into two steps: 1. Filter out all the trash regions; 2. Recycle the trash regions in batches. I can see some benefits which improve the performance:
1. Less time spent on acquiring the heap lock, and less contention with mutators/allocators
2. Simpler loops in filtering and batch recycling, which presumably benefits CPU branch prediction

Here are some logs from a test run of the h2 benchmark:

TIP with debug log, [code link](https://github.com/openjdk/jdk/compare/master...pengxiaolong:jdk:JDK-8335356-baseline?expand=1), Average time per region: 2312 ns

[6.013s][info][gc] GC(0) Recycled 0 regions in 58675ns, break down: acquiring lock -> 0, recycling -> 0.
[6.093s][info][gc] GC(0) Recycled 641 regions in 3025757ns, break down: acquiring lock -> 260016, recycling -> 548345.
[9.354s][info][gc] GC(1) Recycled 1 regions in 61793ns, break down: acquiring lock -> 481, recycling -> 1141.
[9.428s][info][gc] GC(1) Recycled 600 regions in 1083206ns, break down: acquiring lock -> 256578, recycling -> 511334.
[12.145s][info][gc] GC(2) Recycled 35 regions in 118390ns, break down: acquiring lock -> 13703, recycling -> 27438.
[12.202s][info][gc] GC(2) Recycled 553 regions in 911747ns, break down: acquiring lock -> 209511, recycling -> 426575.
[15.086s][info][gc] GC(3) Recycled 106 regions in 218396ns, break down: acquiring lock -> 39089, recycling -> 80520.
[15.164s][info][gc] GC(3) Recycled 454 regions in 762128ns, break down: acquiring lock -> 172583, recycling -> 351263.
[18.781s][info][gc] GC(4) Recycled 119 regions in 244275ns, break down: acquiring lock -> 45741, recycling -> 92841.
[18.866s][info][gc] GC(4) Recycled 437 regions in 721638ns, break down: acquiring lock -> 162149, recycling -> 329194.
[20.735s][info][gc] GC(5) Recycled 119 regions in 244292ns, break down: acquiring lock -> 45992, recycling -> 93320.
[20.788s][info][gc] GC(5) Recycled 193 regions in 364782ns, break down: acquiring lock -> 73267, recycling -> 149695.
[21.699s][info][gc] GC(6) Recycled 0 regions in 92333ns, break down: acquiring lock -> 0, recycling -> 0.
[21.856s][info][gc] GC(6) Recycled 552 regions in 1372852ns, break down: acquiring lock -> 302017, recycling -> 621198.
[22.196s][info][gc] GC(7) Recycled 0 regions in 80586ns, break down: acquiring lock -> 0, recycling -> 0.
[22.361s][info][gc] GC(7) Recycled 531 regions in 1433166ns, break down: acquiring lock -> 365550, recycling -> 632302.
[22.720s][info][gc] GC(8) Recycled 0 regions in 74306ns, break down: acquiring lock -> 0, recycling -> 0.
[22.898s][info][gc] GC(8) Recycled 530 regions in 1331485ns, break down: acquiring lock -> 299571, recycling -> 620722.
[23.147s][info][gc] GC(9) Recycled 0 regions in 77732ns, break down: acquiring lock -> 0, recycling -> 0.
[23.331s][info][gc] GC(9) Recycled 531 regions in 1361709ns, break down: acquiring lock -> 311774, recycling -> 653233.
[24.257s][info][gc] GC(10) Recycled 0 regions in 62440ns, break down: acquiring lock -> 0, recycling -> 0.
[24.460s][info][gc] GC(10) Recycled 1480 regions in 3611397ns, break down: acquiring lock -> 906760, recycling -> 1729345.
[25.407s][info][gc] GC(11) Recycled 1 regions in 61455ns, break down: acquiring lock -> 585, recycling -> 1362.
[25.597s][info][gc] GC(11) Recycled 1438 regions in 2313578ns, break down: acquiring lock -> 546099, recycling -> 1126582.

Optimized, but without the batching optimization (basically recycling all trash with a single lock acquisition), [code link](https://github.com/openjdk/jdk/pull/20086/commits/5688ee2c0754483818a89bb7915f58a7464c2df2), Average time per region: 560 ns

[6.097s][info][gc] GC(0) Recycled 641 regions in 280280ns, break down: filtering -> 20216, taking heap lock -> 569, recycling -> 259495.
[9.568s][info][gc] GC(1) Recycled 1 regions in 10592ns, break down: filtering -> 8695, taking heap lock -> 568, recycling -> 1329.
[9.643s][info][gc] GC(1) Recycled 600 regions in 260624ns, break down: filtering -> 8023, taking heap lock -> 736, recycling -> 251865.
[12.648s][info][gc] GC(2) Recycled 34 regions in 24651ns, break down: filtering -> 9739, taking heap lock -> 524, recycling -> 14388.
[12.706s][info][gc] GC(2) Recycled 552 regions in 252231ns, break down: filtering -> 26613, taking heap lock -> 624, recycling -> 224994.
[15.579s][info][gc] GC(3) Recycled 102 regions in 50140ns, break down: filtering -> 7846, taking heap lock -> 550, recycling -> 41744.
[15.662s][info][gc] GC(3) Recycled 461 regions in 187735ns, break down: filtering -> 8851, taking heap lock -> 479, recycling -> 178405.
[19.269s][info][gc] GC(4) Recycled 117 regions in 55504ns, break down: filtering -> 8709, taking heap lock -> 548, recycling -> 46247.
[19.360s][info][gc] GC(4) Recycled 437 regions in 187981ns, break down: filtering -> 9582, taking heap lock -> 505, recycling -> 177894.
[21.269s][info][gc] GC(5) Recycled 124 regions in 57666ns, break down: filtering -> 8986, taking heap lock -> 537, recycling -> 48143.
[21.327s][info][gc] GC(5) Recycled 190 regions in 85890ns, break down: filtering -> 7768, taking heap lock -> 494, recycling -> 77628.
[22.367s][info][gc] GC(6) Recycled 547 regions in 378074ns, break down: filtering -> 11634, taking heap lock -> 714, recycling -> 365726.
[22.733s][info][gc] GC(7) Recycled 2 regions in 27277ns, break down: filtering -> 24172, taking heap lock -> 741, recycling -> 2364.
[22.895s][info][gc] GC(7) Recycled 533 regions in 339216ns, break down: filtering -> 12006, taking heap lock -> 778, recycling -> 326432.
[23.213s][info][gc] GC(8) Recycled 1 regions in 28104ns, break down: filtering -> 25781, taking heap lock -> 722, recycling -> 1601.
[23.393s][info][gc] GC(8) Recycled 529 regions in 341289ns, break down: filtering -> 10257, taking heap lock -> 682, recycling -> 330350.
[23.861s][info][gc] GC(9) Recycled 523 regions in 339914ns, break down: filtering -> 12715, taking heap lock -> 746, recycling -> 326453.
[25.120s][info][gc] GC(10) Recycled 1515 regions in 1148153ns, break down: filtering -> 13906, taking heap lock -> 739, recycling -> 1133508.
[25.873s][info][gc] GC(11) Recycled 1 regions in 13233ns, break down: filtering -> 11375, taking heap lock -> 568, recycling -> 1290.
[26.109s][info][gc] GC(11) Recycled 1237 regions in 493244ns, break down: filtering -> 12178, taking heap lock -> 557, recycling -> 480509.

With batch size of 128, [code link](https://github.com/openjdk/jdk/pull/20086/commits/ece3f3c7612560e1b66ca1e56e896488e3c593bc), Average time per region: 533 ns

[6.066s][info][gc] GC(0) Recycled 641 regions in 290048ns, break down: filtering -> 20937, recycling -> 257691, yields -> 6088.
[9.514s][info][gc] GC(1) Recycled 1 regions in 12863ns, break down: filtering -> 9186, recycling -> 1255, yields -> 1487.
[9.591s][info][gc] GC(1) Recycled 601 regions in 285321ns, break down: filtering -> 11941, recycling -> 265095, yields -> 5540.
[12.590s][info][gc] GC(2) Recycled 35 regions in 27005ns, break down: filtering -> 9873, recycling -> 14893, yields -> 1341.
[12.650s][info][gc] GC(2) Recycled 551 regions in 231127ns, break down: filtering -> 10833, recycling -> 212840, yields -> 5054.
[15.504s][info][gc] GC(3) Recycled 101 regions in 54762ns, break down: filtering -> 9759, recycling -> 42579, yields -> 1500.
[15.591s][info][gc] GC(3) Recycled 466 regions in 197675ns, break down: filtering -> 9672, recycling -> 181928, yields -> 4095.
[19.231s][info][gc] GC(4) Recycled 121 regions in 58985ns, break down: filtering -> 8601, recycling -> 47931, yields -> 1565.
[19.322s][info][gc] GC(4) Recycled 439 regions in 186173ns, break down: filtering -> 9754, recycling -> 170179, yields -> 4269.
[21.204s][info][gc] GC(5) Recycled 120 regions in 59352ns, break down: filtering -> 8120, recycling -> 48746, yields -> 1602.
[21.257s][info][gc] GC(5) Recycled 191 regions in 89995ns, break down: filtering -> 8695, recycling -> 77421, yields -> 2596.
[22.291s][info][gc] GC(6) Recycled 550 regions in 344021ns, break down: filtering -> 7845, recycling -> 325227, yields -> 7251.
[22.789s][info][gc] GC(7) Recycled 535 regions in 352193ns, break down: filtering -> 11420, recycling -> 328723, yields -> 8067.
[23.265s][info][gc] GC(8) Recycled 530 regions in 344795ns, break down: filtering -> 12356, recycling -> 321523, yields -> 7389.
[23.731s][info][gc] GC(9) Recycled 526 regions in 268314ns, break down: filtering -> 11727, recycling -> 248182, yields -> 5403.
[24.913s][info][gc] GC(10) Recycled 1520 regions in 1035594ns, break down: filtering -> 15586, recycling -> 995272, yields -> 16585.
[25.827s][info][gc] GC(11) Recycled 1 regions in 12540ns, break down: filtering -> 9191, recycling -> 1162, yields -> 1286.
[25.997s][info][gc] GC(11) Recycled 1406 regions in 593511ns, break down: filtering -> 11703, recycling -> 566761, yields -> 10335.

Batch with timed lock up to 30us, PR version, Average time per region: 1118 ns

[6.103s][info][gc] GC(0) Recycled 0 regions in 9421ns with 0 batches.
[6.189s][info][gc] GC(0) Recycled 641 regions in 570953ns with 18 batches.
[9.481s][info][gc] GC(1) Recycled 1 regions in 11745ns with 1 batches.
[9.552s][info][gc] GC(1) Recycled 597 regions in 495402ns with 16 batches.
[12.295s][info][gc] GC(2) Recycled 35 regions in 36924ns with 1 batches.
[12.353s][info][gc] GC(2) Recycled 546 regions in 443037ns with 14 batches.
[15.226s][info][gc] GC(3) Recycled 100 regions in 92537ns with 3 batches.
[15.310s][info][gc] GC(3) Recycled 463 regions in 423945ns with 14 batches.
[19.031s][info][gc] GC(4) Recycled 118 regions in 107655ns with 4 batches.
[19.121s][info][gc] GC(4) Recycled 440 regions in 359861ns with 12 batches.
[21.094s][info][gc] GC(5) Recycled 125 regions in 108325ns with 4 batches.
[21.155s][info][gc] GC(5) Recycled 191 regions in 192925ns with 6 batches.
[22.038s][info][gc] GC(6) Recycled 0 regions in 23493ns with 0 batches.
[22.213s][info][gc] GC(6) Recycled 574 regions in 748833ns with 23 batches.
[22.548s][info][gc] GC(7) Recycled 1 regions in 24498ns with 1 batches.
[22.716s][info][gc] GC(7) Recycled 532 regions in 684182ns with 21 batches.
[23.232s][info][gc] GC(8) Recycled 1 regions in 14411ns with 1 batches.
[23.455s][info][gc] GC(8) Recycled 528 regions in 715436ns with 22 batches.
[23.766s][info][gc] GC(9) Recycled 0 regions in 12247ns with 0 batches.
[23.982s][info][gc] GC(9) Recycled 703 regions in 685842ns with 22 batches.
[24.557s][info][gc] GC(10) Recycled 1 regions in 15890ns with 1 batches.
[24.760s][info][gc] GC(10) Recycled 1142 regions in 1506585ns with 47 batches.
[25.524s][info][gc] GC(11) Recycled 1 regions in 12424ns with 1 batches.
[25.731s][info][gc] GC(11) Recycled 1262 regions in 1695506ns with 52 batches.

We decided on batching with a timed lock for the following reasons:
1. We can bound how long the heap lock is held (set to up to 30us), and therefore bound how much it can impact long-tail latencies in the worst case.
2. Unlike a static batch size, the timed lock adapts to different hardware and runtimes; the batch size adjusts automatically.

Additional test:
- [x] ```make clean test TEST=hotspot_gc_shenandoah```

==============================
TEST                                              TOTAL  PASS  FAIL ERROR
   jtreg:test/hotspot/jtreg:hotspot_gc_shenandoah   261   261     0     0
==============================
TEST SUCCESS

-------------

Commit messages:
 - Code polish
 - Remove logs
 - Increase deadline to 30us
 - Timed lock
 - Remove inline and code polish
 - Polish
 - Remove empty lines
 - Polish the loops for recycling trash regions
 - Fix -Werror=reorder
 - Pre-allocate array for trash regions used in recycle_trash
 - ... and 9 more: https://git.openjdk.org/jdk/compare/a9b7f42f...dac1ae6e

Changes: https://git.openjdk.org/jdk/pull/20086/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20086&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8335356
  Stats: 18 lines in 2 files changed: 14 ins; 1 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/20086.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20086/head:pull/20086

PR: https://git.openjdk.org/jdk/pull/20086

From ysr at openjdk.org Wed Jul 10 07:37:15 2024
From: ysr at openjdk.org (Y.
Srinivas Ramakrishna)
Date: Wed, 10 Jul 2024 07:37:15 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash, [...]

Could you share any visible changes using the three different schemes (and the current baseline) with, say, SPECjbb or similar? Ideally, this affects some user-visible score or latency that we can use as a goodness metric that improves. I am a bit leery of why exactly 30 us, and not say 100 us.

Also, I am thinking that a straight count might perform as well and the time-based solution almost seems overengineered to me -- or at least I'd like to see evidence that that engineering effort is worth the resulting bang for a service level metric such as latency or throughput.

-------------

PR Review: https://git.openjdk.org/jdk/pull/20086#pullrequestreview-2168278271

From shade at openjdk.org Wed Jul 10 07:55:26 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 10 Jul 2024 07:55:26 GMT
Subject: RFR: 8334315: Shenandoah: reduce GC logging noise
In-Reply-To:
References:
Message-ID:

On Wed, 19 Jun 2024 15:11:52 GMT, Kelvin Nilsen wrote:

> Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs.

Marked as reviewed by shade (Reviewer).
-------------

PR Review: https://git.openjdk.org/jdk/pull/19795#pullrequestreview-2168317699

From shade at openjdk.org Wed Jul 10 08:06:15 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 10 Jul 2024 08:06:15 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Wed, 10 Jul 2024 07:34:39 GMT, Y. Srinivas Ramakrishna wrote:

> Also, I am thinking that a straight count might perform as well and the time-based solution almost seems overengineered to me -- or at least I'd like to see evidence that that engineering effort is worth the resulting bang for a service level metric such as latency or throughput.

I suggested the time-based approach to Xiaolong to side-step the discussion about the "reasonable" batch size. A good batch size would fluctuate between machines, heap sizes, and region counts. Since the whole point of this dance is to avoid hoarding the lock for a long time, which would increase tail latencies for allocators waiting on the same lock, it is also more reasonable to just track the time directly here.

This is not to mention that fastdebug builds would zap the unused heap, which makes cleanup orders of magnitude slower, and large batch sizes would hoard the lock way too much, deviating from the "normal" release behavior. The time-based approach accommodates this as well.
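
The time-boxed batching argued for above can be sketched in a few lines. This is an illustration only, written in Java rather than HotSpot's C++, and it is not the code from the PR: the class name `TimedBatchRecycler`, the `Runnable` work items standing in for trash regions, and the `ReentrantLock` standing in for the Shenandoah heap lock are all assumptions made for the sketch. Only the 30 us deadline comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of time-boxed batching; not the HotSpot implementation.
final class TimedBatchRecycler {
    // Longest time one batch may hold the lock: 30 us, the figure quoted in the PR.
    static final long DEADLINE_NS = 30_000L;

    // Stand-in for the Shenandoah heap lock (an assumption of this sketch).
    private final ReentrantLock heapLock = new ReentrantLock();

    // Runs every work item under the lock, dropping the lock each time the
    // per-batch deadline passes. Returns the number of batches that were needed.
    int recycleAll(List<Runnable> trashRegions) {
        int batches = 0;
        int idx = 0;
        while (idx < trashRegions.size()) {
            heapLock.lock();
            try {
                long deadline = System.nanoTime() + DEADLINE_NS;
                // Process at least one item per batch so the loop always makes progress.
                do {
                    trashRegions.get(idx++).run();
                } while (idx < trashRegions.size() && System.nanoTime() - deadline < 0);
                batches++;
            } finally {
                heapLock.unlock();
            }
            // Let other threads (for example allocators) run before the next batch.
            Thread.yield();
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Runnable> regions = new ArrayList<>();
        long[] recycled = {0};
        for (int i = 0; i < 1000; i++) {
            regions.add(() -> recycled[0]++);
        }
        int batches = new TimedBatchRecycler().recycleAll(regions);
        System.out.printf("Recycled %d regions with %d batches%n", recycled[0], batches);
    }
}
```

With this shape the number of items per batch is never configured: a slow machine or a fastdebug-style slow cleanup simply fits fewer items into each 30 us window, which is the adaptivity the time-based approach is meant to provide. The batch count printed by `main` will vary from run to run.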
-------------

PR Comment: https://git.openjdk.org/jdk/pull/20086#issuecomment-2219822121

From shade at openjdk.org Wed Jul 10 09:02:21 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 10 Jul 2024 09:02:21 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash, [...]

Found an easy workload to demonstrate the impact on max latencies on the allocation path.

    public class TimedAlloc {
        // Volatile sink keeps the allocation from being optimized away.
        static volatile Object sink;

        public static void main(String... args) throws Throwable {
            for (int c = 0; c < 10; c++) {
                run();
            }
        }

        public static void run() {
            long cur = System.nanoTime();
            long end = cur + 3_000_000_000L;  // run for 3 seconds

            long sum = 0;
            long max = 0;
            long allocs = 0;

            while (cur < end) {
                // Time a single allocation, in nanoseconds.
                long start = System.nanoTime();
                sink = new Object[10000];
                cur = System.nanoTime();
                long v = (cur - start);
                sum += v;
                max = Math.max(max, v);
                allocs++;
            }

            System.out.printf("Allocs: %15d; Avg: %8d, Max: %9d%n", allocs, (sum / allocs), max);
        }
    }

    $ java -Xms30g -Xmx30g -XX:+AlwaysPreTouch -XX:+UseShenandoahGC ../TimedAlloc.java

    # Baseline
    Allocs: 3303444; Avg: 869, Max: 572869
    Allocs: 3331117; Avg: 864, Max: 543211
    Allocs: 3368134; Avg: 854, Max: 504501
    Allocs: 3369721; Avg: 854, Max: 380658
    Allocs: 3370053; Avg: 854, Max: 471238
    Allocs: 3370687; Avg: 854, Max: 409657
    Allocs: 3370261; Avg: 854, Max: 371527
    Allocs: 3370627; Avg: 854, Max: 452230
    Allocs: 3369164; Avg: 854, Max: 473957
    Allocs: 3370499; Avg: 854, Max: 450008

    # Patched
    Allocs: 3311185; Avg: 866, Max: 58489
    Allocs: 3338693; Avg: 861, Max: 81620
    Allocs: 3378208; Avg: 852, Max: 81913
    Allocs: 3379441; Avg: 852, Max: 69214
    Allocs: 3380807; Avg: 851, Max: 186430
    Allocs: 3380362; Avg: 851, Max: 61498
    Allocs: 3380142; Avg: 851, Max: 84395
    Allocs: 3379811; Avg: 851, Max: 81258
    Allocs: 3380572; Avg: 851, Max: 55142
    Allocs: 3379696; Avg: 852, Max: 81612

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20086#issuecomment-2219949530

From ysr at openjdk.org Wed Jul 10 20:07:44 2024
From: ysr at openjdk.org (Y.
Srinivas Ramakrishna)
Date: Wed, 10 Jul 2024 20:07:44 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID: <-_kQgM-yE1H8ZNX1chP1LcmpPdZYQgf0H7ggPN9XyYU=.24304200-8fa8-410a-aeeb-059095db8d5a@github.com>

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash, [...]

Marked as reviewed by ysr (Reviewer).

Impressive and a nice demonstration of the improvements! Benchmarking with HyperAlloc may also be useful, or even just SPECjbb may show some non-linear improvements, who knows? May be worth measuring, perhaps?

Running the count-based and time-based versions on a (slow, fast) x (arm, x86) matrix of systems would be great; it may be more effort than it is worth, but I am just putting it out there.

Good data of actual measured improvements always makes me happy, though! :-) Thanks for the extra effort in collecting the data and sharing it.

Reviewed and approved, thank you!
-------------

PR Review: https://git.openjdk.org/jdk/pull/20086#pullrequestreview-2169690869
PR Comment: https://git.openjdk.org/jdk/pull/20086#issuecomment-2220964886

From shade at openjdk.org Wed Jul 10 20:07:46 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 10 Jul 2024 20:07:46 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash, [...]

Marked as reviewed by shade (Reviewer).
-------------

PR Review: https://git.openjdk.org/jdk/pull/20086#pullrequestreview-2169696730

From xpeng at openjdk.org Wed Jul 10 20:07:48 2024
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 10 Jul 2024 20:07:48 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Wed, 10 Jul 2024 08:59:35 GMT, Aleksey Shipilev wrote:

>> Hi all,
>> This PR improves the use of the heap lock in ShenandoahFreeSet::recycle_trash. The original observation mentioned in the [bug](https://bugs.openjdk.org/browse/JDK-8335356) was probably caused by an uncommitted/reverted change I added when Aleksey and I worked on [JDK-8331411](https://bugs.openjdk.org/browse/JDK-8331411). Even though that change was not committed, the way ShenandoahFreeSet::recycle_trash uses the heap lock is still inefficient, and we think it should be improved.
>> With the logs added in this commit https://github.com/openjdk/jdk/pull/20086/commits/5688ee2c0754483818a89bb7915f58a7464c2df2, I got some key metrics: the average time to acquire the heap lock is about 450 ~ 900 ns, and the average time to recycle one trash region is about 600 ns (if not batched, it is 1000+ ns, which might be related to branch prediction). The current implementation takes the heap lock once for every trash region; assuming there are 1000 regions to recycle, the time wasted on acquiring the heap lock is more than 0.6 ms.
>>
>> The PR splits the recycling process into two steps: 1. Filter out all the trash regions; 2. Recycle the trash regions in batches. I can see some benefits that improve performance:
>> 1. Less time spent on acquiring the heap lock, and less contention with mutators/allocators
>> 2. Simpler loops in filtering and batch recycling, which presumably benefits CPU branch prediction
>>
>> Here are some logs from a test run of the h2 benchmark:
>>
>> TIP with debug log, [code link](https://github.com/openjdk/jdk/compare/master...pengxiaolong:jdk:JDK-8335356-baseline?expand=1), Average time per region: 2312 ns
>>
>> [6.013s][info][gc] GC(0) Recycled 0 regions in 58675ns, break down: acquiring lock -> 0, recycling -> 0.
>> [6.093s][info][gc] GC(0) Recycled 641 regions in 3025757ns, break down: acquiring lock -> 260016, recycling -> 548345.
>> [9.354s][info][gc] GC(1) Recycled 1 regions in 61793ns, break down: acquiring lock -> 481, recycling -> 1141.
>> [9.428s][info][gc] GC(1) Recycled 600 regions in 1083206ns, break down: acquiring lock -> 256578, recycling -> 511334.
>> [12.145s][info][gc] GC(2) Recycled 35 regions in 118390ns, break down: acquiring lock -> 13703, recycling -> 27438.
>> [12.202s][info][gc] GC(2) Recycled 553 regions in 911747ns, break down: acquiring lock -> 209511, recycling -> 426575.
>> [15.086s][info][gc] GC(3) Recycled 106 regions in 218396ns, break down: acquiring lock -> 39089, recycling -> 80520.
>> [15.164s][info][gc] GC(3) Recycled 454 regions in 762128ns, brea...
>
> Found an easy workload to demonstrate the impact on max latencies on allocation path.
>
>
> public class TimedAlloc {
>     static volatile Object sink;
>
>     public static void main(String... args) throws Throwable {
>         for (int c = 0; c < 10; c++) {
>             run();
>         }
>     }
>
>     public static void run() {
>         long cur = System.nanoTime();
>         long end = cur + 3_000_000_000L;
>
>         long sum = 0;
>         long max = 0;
>         long allocs = 0;
>
>         while (cur < end) {
>             long start = System.nanoTime();
>             sink = new byte[40000];
>             cur = System.nanoTime();
>             long v = (cur - start);
>             sum += v;
>             max = Math.max(max, v);
>             allocs++;
>         }
>
>         System.out.printf("Allocs: %15d; Avg: %8d, Max: %9d%n", allocs, (sum / allocs), max);
>     }
> }
>
>
>
> $ java -Xms30g -Xmx30g -XX:+AlwaysPreTouch -XX:+UseShenandoahGC ../TimedAlloc.java
>
> # Baseline
> Allocs: 3294669; Avg: 868, Max: 445998
> Allocs: 3372764; Avg: 852, Max: 528294
> Allocs: 3342978; Avg: 861, Max: 478060
> Allocs: 3341784; Avg: 861, Max: 468640
> Allocs: 3341870; Avg: 861, Max: 494377
> Allocs: 3342338; Avg: 861, Max: 469976
> Allocs: 3340135; Avg: 862, Max: 377933
> Allocs: 3341220; Avg: 862, Max: 511117
> Allocs: 3341673; Avg: 861, Max: 494394
> Allocs: 3341495; Avg: 861, Max: 506392
>
> # Patched
> Allocs: 3311013; Avg: 867, Max: 81908
> Allocs: 3376562; Avg: 851, Max: 82006
> Allocs: 3343815; Avg: 861, Max: 36880
> Allocs: 3341872; Avg: 861, Max: 87010
> Allocs: 3341824; Avg: 861, Max: 65734
> Allocs: 3342571; Avg: 861, Max: 137444
> Allocs: 3343636; Avg: 861, Max: 81407
> Allocs: 3345381; Avg: 860, Max: 36505
> Allocs: 3343388; Avg: 861, Max: 80849
> Allocs: 3344334; Avg: 861, Max: 59562

Thanks a lot @shipilev @ysramakrishna! I'll attach more benchmark results if I get some.
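
For readers following the thread, the two-step idea from the quoted PR description (filter the trash regions first, then recycle them in batches so the lock is taken once per batch rather than once per region) can be illustrated with a small stand-alone Java sketch. This is not the HotSpot code: `Region`, `heapLock`, `lockAcquisitions`, and the batch size of 64 are hypothetical names and values chosen only for illustration, and a plain `ReentrantLock` stands in for the Shenandoah heap lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class BatchedRecycleSketch {
    /** Hypothetical stand-in for a heap region; not a HotSpot type. */
    public static final class Region {
        public boolean trash;
        public Region(boolean trash) { this.trash = trash; }
        public void recycle() { trash = false; }
    }

    /** Stand-in for the heap lock (assumption: a simple mutex is enough for the sketch). */
    public static final ReentrantLock heapLock = new ReentrantLock();
    /** Counts how often the lock was taken; only modified while holding heapLock. */
    public static int lockAcquisitions = 0;

    /** Step 1: collect the trash regions without holding the lock. */
    public static List<Region> filterTrash(List<Region> regions) {
        List<Region> trash = new ArrayList<>();
        for (Region r : regions) {
            if (r.trash) {
                trash.add(r);
            }
        }
        return trash;
    }

    /** Step 2: recycle in batches, acquiring the lock once per batch. */
    public static int recycleInBatches(List<Region> trash, int batchSize) {
        int recycled = 0;
        for (int from = 0; from < trash.size(); from += batchSize) {
            int to = Math.min(from + batchSize, trash.size());
            heapLock.lock();
            lockAcquisitions++;
            try {
                for (Region r : trash.subList(from, to)) {
                    // Re-check under the lock: the unlocked filter is only a hint.
                    if (r.trash) {
                        r.recycle();
                        recycled++;
                    }
                }
            } finally {
                heapLock.unlock();
            }
        }
        return recycled;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            regions.add(new Region(i % 2 == 0)); // every other region is trash
        }
        int recycled = recycleInBatches(filterTrash(regions), 64);
        System.out.println("Recycled " + recycled + " regions with "
                + lockAcquisitions + " lock acquisitions");
    }
}
```

With 500 trash regions and a batch size of 64, the sketch takes the lock 8 times instead of 500. Releasing the lock between batches is what gives allocating threads a chance to get in, which is the contention effect the max-latency numbers above are measuring.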
-------------

PR Comment: https://git.openjdk.org/jdk/pull/20086#issuecomment-2221097945

From duke at openjdk.org Wed Jul 10 20:07:48 2024
From: duke at openjdk.org (duke)
Date: Wed, 10 Jul 2024 20:07:48 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID: <84SUGDIEAZXiA1qIsMqDpo1RLDvEEkj8tnpp7Zh-20Y=.9b4eebec-6879-4c49-a152-1b42f7af1893@github.com>

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash [...]

@pengxiaolong Your change (at version dac1ae6e73d6a4d3165a9a7f12a36ce09524f962) is now ready to be sponsored by a Committer.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20086#issuecomment-2221099381

From wkemper at openjdk.org Wed Jul 10 20:28:41 2024
From: wkemper at openjdk.org (William Kemper)
Date: Wed, 10 Jul 2024 20:28:41 GMT
Subject: Integrated: 8336106: Genshen: Fix use of missing API in Shenandoah Old Heuristic test
Message-ID:

The recent fixes for the Shenandoah Old Heuristic test used an API that hasn't been backported. This change uses a different API available in 21 to achieve the same end.
------------- Commit messages: - Fix old heuristic test for 21 Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/65/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=65&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336106 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/65.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/65/head:pull/65 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/65 From kdnilsen at openjdk.org Wed Jul 10 20:28:47 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 10 Jul 2024 20:28:47 GMT Subject: Integrated: 8336106: Genshen: Fix use of missing API in Shenandoah Old Heuristic test In-Reply-To: References: Message-ID: On Wed, 10 Jul 2024 16:17:40 GMT, William Kemper wrote: > The recent fixes for the Shenandoah Old Heuristic test used an API that hasn't been backported. This change uses a different API available in 21 to achieve the same end. Marked as reviewed by kdnilsen (Committer). ------------- PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/65#pullrequestreview-2169784648 From wkemper at openjdk.org Wed Jul 10 20:28:51 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 10 Jul 2024 20:28:51 GMT Subject: Integrated: 8336106: Genshen: Fix use of missing API in Shenandoah Old Heuristic test In-Reply-To: References: Message-ID: On Wed, 10 Jul 2024 16:17:40 GMT, William Kemper wrote: > The recent fixes for the Shenandoah Old Heuristic test used an API that hasn't been backported. This change uses a different API available in 21 to achieve the same end. This pull request has now been integrated. 
Changeset: 596fccf7
Author: William Kemper
URL: https://git.openjdk.org/shenandoah-jdk21u/commit/596fccf7bd65d897d7a6dd5aa95d0b13e68b0e6c
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod

8336106: Genshen: Fix use of missing API in Shenandoah Old Heuristic test

Reviewed-by: kdnilsen

-------------

PR: https://git.openjdk.org/shenandoah-jdk21u/pull/65

From xpeng at openjdk.org Thu Jul 11 06:22:56 2024
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Thu, 11 Jul 2024 06:22:56 GMT
Subject: RFR: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash [...]

Based on Aleksey's benchmark, I wrote a very [simple benchmark](https://github.com/pengxiaolong/benchmarks/blob/main/allocation-latency/src/main/java/personal/xlpeng/benchmarks/allocationlatency/AllocationLatency.java) that produces an HdrHistogram. I ran commands like the ones below to generate the HdrHistogram metrics:

export JAVA_HOME=/home/xlpeng/repos/jdk-xlpeng/optimized-timed-lock
export JAVA_OPTS="-Xms30g -Xmx30g -XX:+AlwaysPreTouch -XX:+UseShenandoahGC"
./build/distributions/allocation-latency/bin/allocation-latency

export JAVA_HOME=/home/xlpeng/repos/jdk-xlpeng/baseline
./build/distributions/allocation-latency/bin/allocation-latency

Here is the HdrHistogram:

![Histogram](https://github.com/openjdk/jdk/assets/2170530/fe52c38e-6c30-442c-b4c8-2c7ee06c5278)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20086#issuecomment-2222121679

From xpeng at openjdk.org Thu Jul 11 08:50:01 2024
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Thu, 11 Jul 2024 08:50:01 GMT
Subject: Integrated: 8335356: Shenandoah: Improve concurrent cleanup locking
In-Reply-To:
References:
Message-ID:

On Mon, 8 Jul 2024 21:38:41 GMT, Xiaolong Peng wrote:

> Hi all,
> This PR is to improve the usage of heap lock in ShenandoahFreeSet::recycle_trash [...]

This pull request has now been integrated.

Changeset: b32e4a68 Author: Xiaolong Peng Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/b32e4a68bca588d908bd81a398eb3171a6876dc5 Stats: 18 lines in 2 files changed: 14 ins; 1 del; 3 mod 8335356: Shenandoah: Improve concurrent cleanup locking Reviewed-by: ysr, shade ------------- PR: https://git.openjdk.org/jdk/pull/20086 From kdnilsen at openjdk.org Thu Jul 11 23:05:24 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 11 Jul 2024 23:05:24 GMT Subject: Integrated: 8335930: GenShen: Reserve regions within each generation's freeset until available is sufficient In-Reply-To: References: Message-ID: On Mon, 8 Jul 2024 21:30:23 GMT, Kelvin Nilsen wrote: > Reserve regions until available memory within each generation's partition is sufficient to satisfy reserve request. This pull request has now been integrated. Changeset: 472e4c74 Author: Kelvin Nilsen URL: https://git.openjdk.org/shenandoah/commit/472e4c74bde1c0ceb76755165e709cefa4fef584 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8335930: GenShen: Reserve regions within each generation's freeset until available is sufficient Reviewed-by: wkemper ------------- PR: https://git.openjdk.org/shenandoah/pull/457 From kdnilsen at openjdk.org Thu Jul 11 23:25:35 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 11 Jul 2024 23:25:35 GMT Subject: RFR: 8327000: GenShen: Integrate updated Shenandoah implementation of FreeSet into GenShen [v2] In-Reply-To: References: Message-ID: > This pull request contains a backport of commit 9d2712dd from the openjdk/shenandoah repository. > > The commit being backported was authored by Kelvin Nilsen on 26 Jun 2024 and was reviewed by William Kemper and Y. Srinivas Ramakrishna. Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains two additional commits since the last revision: - Merge remote-tracking branch 'origin/master' into backport-kdnilsen-9d2712dd-master - Backport 9d2712ddf39160a618c9c839c46d2510063f45df ------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk21u/pull/62/files - new: https://git.openjdk.org/shenandoah-jdk21u/pull/62/files/7237da32..5b6a8543 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=62&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=62&range=00-01 Stats: 38 lines in 3 files changed: 26 ins; 7 del; 5 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/62.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/62/head:pull/62 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/62 From wkemper at openjdk.org Fri Jul 12 15:19:13 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 15:19:13 GMT Subject: RFR: Merge openjdk/jdk21u-dev:master [v2] In-Reply-To: References: Message-ID: > Merges tag jdk-21.0.4+6 William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah-jdk21u/pull/61/files - new: https://git.openjdk.org/shenandoah-jdk21u/pull/61/files/a0b3c6cd..a0b3c6cd Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=61&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=61&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/61.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/61/head:pull/61 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/61 From wkemper at openjdk.org Fri Jul 12 15:19:13 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 15:19:13 GMT Subject: Integrated: Merge openjdk/jdk21u-dev:master In-Reply-To: References: Message-ID: On Thu, 27 Jun 2024 14:17:09 GMT, William Kemper wrote: > Merges tag jdk-21.0.4+6 This pull request has now been integrated. Changeset: 645a4996 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/645a49960260b3f945b20e4c270cb6b953d378e4 Stats: 156 lines in 8 files changed: 5 ins; 0 del; 151 mod Merge ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/61 From wkemper at openjdk.org Fri Jul 12 16:13:32 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 16:13:32 GMT Subject: RFR: 8334491: GenShen: Revert changes to Shenandoah defaults Message-ID: <9yoShGWRixxQv9qnZyAyN9oF7zehFPiUp6r1A8UbVfQ=.3a9fa3ad-be2b-4b81-9131-4afb9c2bdf87@github.com> Clean backport. 
------------- Commit messages: - Backport 744fc265aafe2fd59c2ca53c798ee930119f5d8d Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/66/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=66&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8334491 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/66.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/66/head:pull/66 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/66 From wkemper at openjdk.org Fri Jul 12 16:21:33 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 16:21:33 GMT Subject: RFR: 8335347: GenShen: Revert change that has adaptive heuristic ignore abbreviated cycles Message-ID: Trivial conflict with not-yet-backported improvements to OOM handling. ------------- Commit messages: - Remove part of upstream change picked up in cherry-pick - Backport 7542e909d1b4fb8a9a29ab5caf61a6be792f0f03 Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/67/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=67&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335347 Stats: 26 lines in 10 files changed: 1 ins; 14 del; 11 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/67.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/67/head:pull/67 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/67 From wkemper at openjdk.org Fri Jul 12 17:22:01 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 17:22:01 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: <4u_CVXx1SmkbJR8UG4lLfkkg_3-Uai0oJTqhHn1_fSk=.3064d970-3323-4029-a5a3-beeea8c10feb@github.com> Merges tag jdk-24+6 ------------- Commit messages: - 8335946: DTrace code snippets should be generated when DTrace flags are enabled - 8335743: jhsdb jstack cannot print some information on the waiting thread - 8335935: Chained builders not sending transformed models to next 
transforms - 8334481: [JVMCI] add LINK_TO_NATIVE to MethodHandleAccessProvider.IntrinsicMethod - 8335637: Add explicit non-null return value expectations to Object.toString() - 8335409: Can't allocate and retain memory from resource area in frame::oops_interpreted_do oop closure after 8329665 - 8331725: ubsan: pc may not always be the entry point for a VtableStub - 8313909: [JVMCI] assert(cp->tag_at(index).is_unresolved_klass()) in lookupKlassInPool - 8336012: Fix usages of jtreg-reserved properties - 8335779: JFR: Hide sleep events - ... and 372 more: https://git.openjdk.org/shenandoah/compare/d8af5894...b363de8c The webrev contains the conflicts with master: - merge conflicts: https://webrevs.openjdk.org/?repo=shenandoah&pr=460&range=00.conflicts Changes: https://git.openjdk.org/shenandoah/pull/460/files Stats: 67266 lines in 1687 files changed: 41791 ins; 18146 del; 7329 mod Patch: https://git.openjdk.org/shenandoah/pull/460.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/460/head:pull/460 PR: https://git.openjdk.org/shenandoah/pull/460 From wkemper at openjdk.org Fri Jul 12 20:34:08 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 20:34:08 GMT Subject: Integrated: 8334491: GenShen: Revert changes to Shenandoah defaults In-Reply-To: <9yoShGWRixxQv9qnZyAyN9oF7zehFPiUp6r1A8UbVfQ=.3a9fa3ad-be2b-4b81-9131-4afb9c2bdf87@github.com> References: <9yoShGWRixxQv9qnZyAyN9oF7zehFPiUp6r1A8UbVfQ=.3a9fa3ad-be2b-4b81-9131-4afb9c2bdf87@github.com> Message-ID: <3Sn7r0Xspdu9gCnJ4G-paMLjsOK4YplG0-pDnX5rljg=.9ce6eb5c-0715-44c0-9cff-7071b4299852@github.com> On Fri, 12 Jul 2024 16:07:49 GMT, William Kemper wrote: > Clean backport. This pull request has now been integrated. 
Changeset: 8e2bc918 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/8e2bc918c38b17bf8383a441bf282ec9098b3287 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8334491: GenShen: Revert changes to Shenandoah defaults Backport-of: 744fc265aafe2fd59c2ca53c798ee930119f5d8d ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/66 From wkemper at openjdk.org Fri Jul 12 20:35:17 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 20:35:17 GMT Subject: Integrated: 8335347: GenShen: Revert change that has adaptive heuristic ignore abbreviated cycles In-Reply-To: References: Message-ID: On Fri, 12 Jul 2024 16:15:24 GMT, William Kemper wrote: > Trivial conflict with not-yet-backported improvements to OOM handling. This pull request has now been integrated. Changeset: acb6b5f1 Author: William Kemper URL: https://git.openjdk.org/shenandoah-jdk21u/commit/acb6b5f10e28090dd2987c7f96370f463ea4a69b Stats: 26 lines in 10 files changed: 1 ins; 14 del; 11 mod 8335347: GenShen: Revert change that has adaptive heuristic ignore abbreviated cycles Backport-of: 7542e909d1b4fb8a9a29ab5caf61a6be792f0f03 ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/67 From wkemper at openjdk.org Fri Jul 12 22:49:30 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 12 Jul 2024 22:49:30 GMT Subject: RFR: Merge openjdk/jdk:master [v2] In-Reply-To: <4u_CVXx1SmkbJR8UG4lLfkkg_3-Uai0oJTqhHn1_fSk=.3064d970-3323-4029-a5a3-beeea8c10feb@github.com> References: <4u_CVXx1SmkbJR8UG4lLfkkg_3-Uai0oJTqhHn1_fSk=.3064d970-3323-4029-a5a3-beeea8c10feb@github.com> Message-ID: > Merges tag jdk-24+6 William Kemper has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 383 commits: - Merge tag 'jdk-24+6' into merge-jdk-24+6 Added tag jdk-24+6 for changeset b363de8c - 8335946: DTrace code snippets should be generated when DTrace flags are enabled Reviewed-by: coleenp, dholmes - 8335743: jhsdb jstack cannot print some information on the waiting thread Reviewed-by: dholmes, cjplummer, kevinw - 8335935: Chained builders not sending transformed models to next transforms Reviewed-by: asotona - 8334481: [JVMCI] add LINK_TO_NATIVE to MethodHandleAccessProvider.IntrinsicMethod Reviewed-by: dnsimon - 8335637: Add explicit non-null return value expectations to Object.toString() Reviewed-by: jpai, alanb, smarks, prappo - 8335409: Can't allocate and retain memory from resource area in frame::oops_interpreted_do oop closure after 8329665 Reviewed-by: dholmes, stuefe, coleenp, shade - 8331725: ubsan: pc may not always be the entry point for a VtableStub Reviewed-by: kvn, mbaesken - 8313909: [JVMCI] assert(cp->tag_at(index).is_unresolved_klass()) in lookupKlassInPool Reviewed-by: yzheng, never - 8336012: Fix usages of jtreg-reserved properties Reviewed-by: jjg - ... and 373 more: https://git.openjdk.org/shenandoah/compare/472e4c74...24e94f53 ------------- Changes: https://git.openjdk.org/shenandoah/pull/460/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=460&range=01 Stats: 67311 lines in 1691 files changed: 41794 ins; 18185 del; 7332 mod Patch: https://git.openjdk.org/shenandoah/pull/460.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/460/head:pull/460 PR: https://git.openjdk.org/shenandoah/pull/460 From ysr at openjdk.org Sat Jul 13 01:04:34 2024 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Sat, 13 Jul 2024 01:04:34 GMT Subject: RFR: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use Message-ID: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use ------------- Commit messages: - Backport d2102347ea9c1199221ec33f4e721aefa1193cea Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/68/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=68&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8328235 Stats: 169 lines in 16 files changed: 135 ins; 1 del; 33 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/68.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/68/head:pull/68 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/68 From kdnilsen at openjdk.org Sat Jul 13 21:25:05 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sat, 13 Jul 2024 21:25:05 GMT Subject: RFR: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use In-Reply-To: References: Message-ID: On Sat, 13 Jul 2024 00:59:12 GMT, Y. Srinivas Ramakrishna wrote: > Clean backport of commit [d2102347](https://github.com/openjdk/shenandoah/commit/d2102347ea9c1199221ec33f4e721aefa1193cea) from the [openjdk/shenandoah](https://git.openjdk.org/shenandoah) repository. > > The commit being backported was authored by Y. Srinivas Ramakrishna on 2 Jul 2024 and was reviewed by William Kemper. > > **Testing:** *in progress, will be updated when completed* > - [ ] Code pipeline testing > - [x] jtreg local > - [x] [GHA](https://github.com/openjdk-bots/shenandoah-jdk21u/actions/runs/9915888290) Marked as reviewed by kdnilsen (Committer). 
------------- PR Review: https://git.openjdk.org/shenandoah-jdk21u/pull/68#pullrequestreview-2176549175 From kdnilsen at openjdk.org Mon Jul 15 14:29:43 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 15 Jul 2024 14:29:43 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v2] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 70 commits: - Remove unreferenced local variable - Fix whitespace - Remove debug instrumentation and deprecated code - Turn off instrumentation - Verifier should only count non-trashed committed regions - Ignore generation soft capacities when adjusting generation sizes Soft capacities are established, for example, by setting NewRatio or New size on the JVM command line. GenShen, for now at least, does not honor these settings. Better performance is obtained by allowing GenShen to expand and shrink generation sizes according to application behavior. This commit also tidies up various aspects of the implementation to make adjustments to generation sizing more consistent: 1. ShenandoahGlobalHeuristics::choose_global_collection_set(): share the reserves between young and old collection to maximize evacuation of garbage-first regions, regardless of whether most garbage is found in old or young 2. ShenandoahConcurrentGC::entry_final_roots(): do not balance generations before invoking finish_rebuild() because finish_rebuild will balance generations. 3. ShenandoahFreeSet::flip_to_old_gc(): invoke force_transfer_to_young() instead of transfer_to_young() so we can override soft-capacity limits 4. 
ShenandoahFullGC::phase5_epilog(): Do not invoke compute_balances() or balance_generations_after_rebuilding_free_set(). Allow the free-set rebuild() implementation to do this work in a more consistent fashion.
5. ShenandoahGeneration::adjust_evacuation_budgets(): replace transfer_to_young() with force_transfer_to_young() to avoid enforcement of soft capacity limits.
6. ShenandoahGenerationSizer::force_transfer_to_young(): new method
7. ShenandoahGenerationalFullGC::balance_generations_after_gc(): establish reserves() so that free-set rebuild() can adjust balance. Do not redundantly force transfer of regions here.
8. ShenandoahGenerationalFullGC::balance_generations_after_rebuilding_free_set(): deprecate this method.
9. ShenandoahGenerationalFullGC::compute_balances(): deprecate this method.
10. ShenandoahGenerationaStatsClosure::validate_usage() (part of Shenandoah Verification): add consistency check for generation capacities
- Fix budgeting error during freeset rebuild

  Limit the size of old-gen by memory available in the OldCollector set following find_regions_with_alloc_capacity() (rather than limiting the size of old-gen by the total capacity of the OldCollector set, which includes used memory).
- Fix whitespace
- Merge remote-tracking branch 'origin/master' into share-collector-reserves
- Merge branch 'openjdk:master' into master
- ...
and 60 more: https://git.openjdk.org/shenandoah/compare/d2102347...33eacea7 ------------- Changes: https://git.openjdk.org/shenandoah/pull/395/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=01 Stats: 1318 lines in 23 files changed: 832 ins; 240 del; 246 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Mon Jul 15 14:29:43 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 15 Jul 2024 14:29:43 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector In-Reply-To: References: Message-ID: On Mon, 12 Feb 2024 18:22:42 GMT, Kelvin Nilsen wrote: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. After merging, I'm experiencing many unsuccessful GHA tests. These were apparently introduced by the merge. ------------- PR Comment: https://git.openjdk.org/shenandoah/pull/395#issuecomment-2189646618 From wkemper at openjdk.org Mon Jul 15 16:40:23 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 15 Jul 2024 16:40:23 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v2] In-Reply-To: References: Message-ID: On Mon, 15 Jul 2024 14:29:43 GMT, Kelvin Nilsen wrote: >> Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. > > Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 70 commits: > > - Remove unreferenced local variable > - Fix whitespace > - Remove debug instrumentation and deprecated code > - Turn off instrumentation > - Verifier should only count non-trashed committed regions > - Ignore generation soft capacities when adjusting generation sizes > > Soft capacities are established, for example, by setting NewRatio or New size on the > JVM command line. GenShen, for now at least, does not honor these settings. Better > performance is obtained by allowing GenShen to expand and shrink generation sizes > according to application behavior. > > This commit also tidies up various aspects of the implementation to make adjustments > to generation sizing more consistent: > > 1. ShenandoahGlobalHeuristics::choose_global_collection_set(): share the reserves > between young and old collection to maximize evacuation of garbage-first > regions, regardless of whether most garbage is found in old or young > 2. ShenandoahConcurrentGC::entry_final_roots(): do not balance generations before > invoking finish_rebuild() because finish_rebuild will balance generations. > 3. ShenandoahFreeSet::flip_to_old_gc(): invoke force_transfer_to_young() instead > of transfer_to_young() so we can override soft-capacity limits > 4. ShenandoahFullGC::phase5_epilog(): Do not invoke compute_balances() or > balance_generations_after_rebuilding_free_set(). Allow the free-set > rebuild() implementation to do this work in a more consistent fashion. > 5. ShenandoahGeneration::adjust_evacuation_budgets(): replace transfer_to_youn() > with force_transfer_to_young() to avoid enforcement of soft capacity limits. > 6. ShenandoahGenerationSizer::force_transfer_to_young(): new method > 7. ShenandoahGenerationalFullGC::balance_generations_after_gc(): establish > reserves() so that free-set rebuild() can adjust balance. Do not redundantly > force transfer of regions here. > 8. 
ShenandoahGenerationalFullGC::balance_generations_after_rebuilding_free_set(): > deprecate this method. > 9. ShenandoahGenerationalFullGC::compute_balances(): deprecate this method. > 10. ShenandoahGenerationaStatsClosure::validate_usage() (part of Shenandoah > Verification): add consistency check for generation capacities > - Fix budgeting error during freeset rebuild > > Limit the size of old-gen by memory av... Changes requested by wkemper (Committer). src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.cpp line 179: > 177: } > 178: > 179: void ShenandoahOldHeuristics::initialize_piggyback_evacs(ShenandoahCollectionSet* collection_set, Can we continue using `mixed` instead of `piggyback` to describe collections that include young and old regions? I think `mixed` is an accepted term and `piggyback` feels a little bit too colloquial (also not sure if the phrase is commonly known outside of English). src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.cpp line 185: > 183: size_t &unfragmented_available, > 184: size_t &fragmented_available, > 185: size_t &excess_fragmented_available) { Should all of these reference parameters be members? The API feels complicated now with these three methods passing the same parameters to each other: initialize_piggyback_evacs(...) prime_collection_set(...) finalize_piggyback_evacs(...) Could everything just happen within `prime_collection_set` without so many changes to existing callers? 
------------- PR Review: https://git.openjdk.org/shenandoah/pull/395#pullrequestreview-2178189515 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1678109072 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1678113040 From kdnilsen at openjdk.org Tue Jul 16 03:35:33 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 16 Jul 2024 03:35:33 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v3] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove declaration of unused variables ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/33eacea7..3e38c8a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=02 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From ysr at openjdk.org Tue Jul 16 17:09:16 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 16 Jul 2024 17:09:16 GMT Subject: RFR: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use In-Reply-To: References: Message-ID: <0IuOTkh9pOsqcITmXLSJPXFwVCqh_zhA50MuvYXDdww=.7607596e-92dc-4f00-b95f-3c9847d249ff@github.com> On Sat, 13 Jul 2024 00:59:12 GMT, Y. 
Srinivas Ramakrishna wrote: > Clean backport of commit [d2102347](https://github.com/openjdk/shenandoah/commit/d2102347ea9c1199221ec33f4e721aefa1193cea) from the [openjdk/shenandoah](https://git.openjdk.org/shenandoah) repository. > > The commit being backported was authored by Y. Srinivas Ramakrishna on 2 Jul 2024 and was reviewed by William Kemper. > > **Testing:** > - [x] Code pipeline testing -- failure orthogonal to the changes here and being independently investigated > - [x] jtreg local > - [x] [GHA](https://github.com/openjdk-bots/shenandoah-jdk21u/actions/runs/9915888290) Thanks Kelvin. ------------- PR Comment: https://git.openjdk.org/shenandoah-jdk21u/pull/68#issuecomment-2231416298 From ysr at openjdk.org Tue Jul 16 17:09:17 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 16 Jul 2024 17:09:17 GMT Subject: Integrated: 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use In-Reply-To: References: Message-ID: On Sat, 13 Jul 2024 00:59:12 GMT, Y. Srinivas Ramakrishna wrote: > Clean backport of commit [d2102347](https://github.com/openjdk/shenandoah/commit/d2102347ea9c1199221ec33f4e721aefa1193cea) from the [openjdk/shenandoah](https://git.openjdk.org/shenandoah) repository. > > The commit being backported was authored by Y. Srinivas Ramakrishna on 2 Jul 2024 and was reviewed by William Kemper. > > **Testing:** > - [x] Code pipeline testing -- failure orthogonal to the changes here and being independently investigated > - [x] jtreg local > - [x] [GHA](https://github.com/openjdk-bots/shenandoah-jdk21u/actions/runs/9915888290) This pull request has now been integrated. Changeset: 80e796b4 Author: Y. 
Srinivas Ramakrishna URL: https://git.openjdk.org/shenandoah-jdk21u/commit/80e796b4974ba6f6c18eafc179b5b5196582084f Stats: 169 lines in 16 files changed: 135 ins; 1 del; 33 mod 8328235: GenShen: Robustify ShenandoahGCSession and fix missing use Reviewed-by: kdnilsen Backport-of: d2102347ea9c1199221ec33f4e721aefa1193cea ------------- PR: https://git.openjdk.org/shenandoah-jdk21u/pull/68 From kdnilsen at openjdk.org Tue Jul 16 20:03:35 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 16 Jul 2024 20:03:35 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v2] In-Reply-To: References: Message-ID: On Mon, 15 Jul 2024 16:32:59 GMT, William Kemper wrote: >> Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 70 commits: >> >> - Remove unreferenced local variable >> - Fix whitespace >> - Remove debug instrumentation and deprecated code >> - Turn off instrumentation >> - Verifier should only count non-trashed committed regions >> - Ignore generation soft capacities when adjusting generation sizes >> >> Soft capacities are established, for example, by setting NewRatio or New size on the >> JVM command line. GenShen, for now at least, does not honor these settings. Better >> performance is obtained by allowing GenShen to expand and shrink generation sizes >> according to application behavior. >> >> This commit also tidies up various aspects of the implementation to make adjustments >> to generation sizing more consistent: >> >> 1. ShenandoahGlobalHeuristics::choose_global_collection_set(): share the reserves >> between young and old collection to maximize evacuation of garbage-first >> regions, regardless of whether most garbage is found in old or young >> 2. ShenandoahConcurrentGC::entry_final_roots(): do not balance generations before >> invoking finish_rebuild() because finish_rebuild will balance generations. >> 3. 
ShenandoahFreeSet::flip_to_old_gc(): invoke force_transfer_to_young() instead >> of transfer_to_young() so we can override soft-capacity limits >> 4. ShenandoahFullGC::phase5_epilog(): Do not invoke compute_balances() or >> balance_generations_after_rebuilding_free_set(). Allow the free-set >> rebuild() implementation to do this work in a more consistent fashion. >> 5. ShenandoahGeneration::adjust_evacuation_budgets(): replace transfer_to_youn() >> with force_transfer_to_young() to avoid enforcement of soft capacity limits. >> 6. ShenandoahGenerationSizer::force_transfer_to_young(): new method >> 7. ShenandoahGenerationalFullGC::balance_generations_after_gc(): establish >> reserves() so that free-set rebuild() can adjust balance. Do not redundantly >> force transfer of regions here. >> 8. ShenandoahGenerationalFullGC::balance_generations_after_rebuilding_free_set(): >> deprecate this method. >> 9. ShenandoahGenerationalFullGC::compute_balances(): deprecate this method. >> 10. ShenandoahGenerationaStatsClosure::validate_usage() (part of Shenandoah >> Verification): add consistency check for generation capacities >> - Fix budget... > > src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.cpp line 179: > >> 177: } >> 178: >> 179: void ShenandoahOldHeuristics::initialize_piggyback_evacs(ShenandoahCollectionSet* collection_set, > > Can we continue using `mixed` instead of `piggyback` to describe collections that include young and old regions? I think `mixed` is an accepted term and `piggyback` feels a little bit too colloquial (also not sure if the phrase is commonly known outside of English). Good suggestion. Thanks. I'm replacing piggyback with mixed throughout (where piggyback means mixed evac). There are a few places in Shenandoah where piggyback is used for other purposes, and I'm leaving those as is. 
------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1680001115 From kdnilsen at openjdk.org Tue Jul 16 23:22:22 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 16 Jul 2024 23:22:22 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v2] In-Reply-To: References: Message-ID: On Mon, 15 Jul 2024 16:36:38 GMT, William Kemper wrote: >> Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 70 commits: >> >> - Remove unreferenced local variable >> - Fix whitespace >> - Remove debug instrumentation and deprecated code >> - Turn off instrumentation >> - Verifier should only count non-trashed committed regions >> - Ignore generation soft capacities when adjusting generation sizes >> >> Soft capacities are established, for example, by setting NewRatio or New size on the >> JVM command line. GenShen, for now at least, does not honor these settings. Better >> performance is obtained by allowing GenShen to expand and shrink generation sizes >> according to application behavior. >> >> This commit also tidies up various aspects of the implementation to make adjustments >> to generation sizing more consistent: >> >> 1. ShenandoahGlobalHeuristics::choose_global_collection_set(): share the reserves >> between young and old collection to maximize evacuation of garbage-first >> regions, regardless of whether most garbage is found in old or young >> 2. ShenandoahConcurrentGC::entry_final_roots(): do not balance generations before >> invoking finish_rebuild() because finish_rebuild will balance generations. >> 3. ShenandoahFreeSet::flip_to_old_gc(): invoke force_transfer_to_young() instead >> of transfer_to_young() so we can override soft-capacity limits >> 4. ShenandoahFullGC::phase5_epilog(): Do not invoke compute_balances() or >> balance_generations_after_rebuilding_free_set(). 
Allow the free-set >> rebuild() implementation to do this work in a more consistent fashion. >> 5. ShenandoahGeneration::adjust_evacuation_budgets(): replace transfer_to_youn() >> with force_transfer_to_young() to avoid enforcement of soft capacity limits. >> 6. ShenandoahGenerationSizer::force_transfer_to_young(): new method >> 7. ShenandoahGenerationalFullGC::balance_generations_after_gc(): establish >> reserves() so that free-set rebuild() can adjust balance. Do not redundantly >> force transfer of regions here. >> 8. ShenandoahGenerationalFullGC::balance_generations_after_rebuilding_free_set(): >> deprecate this method. >> 9. ShenandoahGenerationalFullGC::compute_balances(): deprecate this method. >> 10. ShenandoahGenerationaStatsClosure::validate_usage() (part of Shenandoah >> Verification): add consistency check for generation capacities >> - Fix budget... > > src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.cpp line 185: > >> 183: size_t &unfragmented_available, >> 184: size_t &fragmented_available, >> 185: size_t &excess_fragmented_available) { > > Should all of these reference parameters be members? The API feels complicated now with these three methods passing the same parameters to each other: > > > initialize_piggyback_evacs(...) > > prime_collection_set(...) > > finalize_piggyback_evacs(...) > > > Could everything just happen within `prime_collection_set` without so many changes to existing callers? Thanks. Good suggestion. Am pursuing this. Looks much cleaner. 
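The refactoring discussed in this review comment (state held in members instead of reference parameters threaded through three methods) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `MixedEvacBudget`, its budgeting rule, and the `leftover()` accessor are hypothetical and are not taken from shenandoahOldHeuristics.cpp. Only the general shape, a single public `prime_collection_set` that drives private initialize and finalize steps, reflects the suggestion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using std::size_t;

// Hypothetical stand-in for a heuristics class; NOT the HotSpot implementation.
class MixedEvacBudget {
public:
  // Single public entry point. Callers no longer pass size_t& budget
  // parameters between initialize/prime/finalize steps.
  // Returns the number of candidates that fit into the budget.
  size_t prime_collection_set(const std::vector<size_t>& candidate_live_bytes,
                              size_t unfragmented, size_t fragmented) {
    initialize_mixed_evacs(unfragmented, fragmented);
    size_t accepted = 0;
    for (size_t live : candidate_live_bytes) {
      if (!try_reserve(live)) {
        break;  // first candidate that does not fit ends the selection
      }
      ++accepted;
    }
    finalize_mixed_evacs();
    return accepted;
  }

  // Budget left over after the most recent call (illustrative accessor).
  size_t leftover() const { return _leftover; }

private:
  void initialize_mixed_evacs(size_t unfragmented, size_t fragmented) {
    _unfragmented_available = unfragmented;
    _fragmented_available = fragmented;
  }

  // Assumed rule for this sketch: draw from unfragmented memory first,
  // then from fragmented memory.
  bool try_reserve(size_t bytes) {
    if (bytes <= _unfragmented_available) {
      _unfragmented_available -= bytes;
      return true;
    }
    if (bytes <= _fragmented_available) {
      _fragmented_available -= bytes;
      return true;
    }
    return false;
  }

  void finalize_mixed_evacs() {
    _leftover = _unfragmented_available + _fragmented_available;
  }

  // Former reference parameters, now instance state.
  size_t _unfragmented_available = 0;
  size_t _fragmented_available = 0;
  size_t _leftover = 0;
};
```

With this shape a caller only writes something like `budget.prime_collection_set(candidates, unfragmented, fragmented)`, so changes to the intermediate bookkeeping do not ripple into call sites.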
------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1680161701 From kdnilsen at openjdk.org Tue Jul 16 23:58:41 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 16 Jul 2024 23:58:41 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v4] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. Kelvin Nilsen has updated the pull request incrementally with two additional commits since the last revision: - Simplify arguments by using instance variables in ShenandoahOldHeuristics - Use mixed evac rather than piggyback to describe old-gen evacuations ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/3e38c8a3..406d347b Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=03 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=02-03 Stats: 447 lines in 4 files changed: 51 ins; 310 del; 86 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Wed Jul 17 00:35:34 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 17 Jul 2024 00:35:34 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v5] In-Reply-To: References: Message-ID: <9-h8ac6CrQd08gP98fVB4BYMU1Z373uigTMO_C9l6Ls=.8563b787-6d5c-4006-9637-a8a205544aa9@github.com> > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove unreferenced variables ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/406d347b..ee2ab01d Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=04 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kemperw at amazon.com Wed Jul 17 23:14:52 2024 From: kemperw at amazon.com (Kemper, William) Date: Wed, 17 Jul 2024 23:14:52 +0000 Subject: Proposal to remove the experimental incremental update mode from Shenandoah Message-ID: <66706abc9ae64a29a46b8f90b3ef4f6f@amazon.com> https://bugs.openjdk.org/browse/JDK-8336685 If there are no strong objections to this, we will remove this mode in JDK24. Thank you, William From shipilev at amazon.de Thu Jul 18 07:13:00 2024 From: shipilev at amazon.de (Aleksey Shipilev) Date: Thu, 18 Jul 2024 09:13:00 +0200 Subject: Proposal to remove the experimental incremental update mode from Shenandoah In-Reply-To: <66706abc9ae64a29a46b8f90b3ef4f6f@amazon.com> References: <66706abc9ae64a29a46b8f90b3ef4f6f@amazon.com> Message-ID: <71e563ca-b789-4f2b-a2f5-7efbb5ca2f08@amazon.de> On 18.07.24 01:14, Kemper, William wrote: > https://bugs.openjdk.org/browse/JDK-8336685 I support this. It does not seem to be (widely, if ever) used, and removal will simplify Shenandoah maintenance. -Aleksey Amazon Web Services Development Center Germany GmbH Krausenstr. 
38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B Sitz: Berlin Ust-ID: DE 365 538 597 From rkennke at amazon.de Thu Jul 18 14:39:50 2024 From: rkennke at amazon.de (Kennke, Roman) Date: Thu, 18 Jul 2024 14:39:50 +0000 Subject: Proposal to remove the experimental incremental update mode from Shenandoah In-Reply-To: <66706abc9ae64a29a46b8f90b3ef4f6f@amazon.com> References: <66706abc9ae64a29a46b8f90b3ef4f6f@amazon.com> Message-ID: > https://bugs.openjdk.org/browse/JDK-8336685 > > If there are no strong objections to this, we will remove this mode in JDK24. I also support the removal of the I-U mode. It's long overdue. Thank you! Roman From kdnilsen at openjdk.org Thu Jul 18 18:00:25 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 18 Jul 2024 18:00:25 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: Message-ID: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Improve comment ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/ee2ab01d..21a5d328 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=05 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=04-05 Stats: 11 lines in 1 file changed: 1 ins; 1 del; 9 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From wkemper at openjdk.org Fri Jul 19 14:17:22 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 19 Jul 2024 14:17:22 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: Merges tag jdk-24+7 ------------- Commit messages: - 8336587: failure_handler lldb command times out on macosx-aarch64 core file - 8336091: Fix HTML warnings in the generated HTML files - 8335921: Fix HotSpot VM build without JVMTI - 8336300: DateFormatSymbols#getInstanceRef returns non-cached instance - 8336638: Parallel: Remove redundant mangle in PSScavenge::invoke - 8299080: Wrong default value of snippet lang attribute - 8334217: [AIX] Misleading error messages after JDK-8320005 - 8334781: JFR crash: assert(((((JfrTraceIdBits::load(klass)) & ((JfrTraceIdEpoch::this_epoch_method_and_class_bits()))) != 0))) failed: invariant - 8334502: gtest/GTestWrapper.java fails on armhf due to LogDecorations.iso8601_utctime_test - 8336040: Missing closing anchor element in Docs.gmk - ... 
and 452 more: https://git.openjdk.org/shenandoah/compare/d8af5894...21a6cf84 The webrev contains the conflicts with master: - merge conflicts: https://webrevs.openjdk.org/?repo=shenandoah&pr=461&range=00.conflicts Changes: https://git.openjdk.org/shenandoah/pull/461/files Stats: 71971 lines in 1877 files changed: 44597 ins; 19166 del; 8208 mod Patch: https://git.openjdk.org/shenandoah/pull/461.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/461/head:pull/461 PR: https://git.openjdk.org/shenandoah/pull/461 From duke at openjdk.org Fri Jul 19 21:30:54 2024 From: duke at openjdk.org (Henry Lin) Date: Fri, 19 Jul 2024 21:30:54 GMT Subject: RFR: 8333088: ubsan: shenandoahAdaptiveHeuristics.cpp:245:44: runtime error: division by zero Message-ID: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> Changed comparison to multiplication to prevent division by zero errors reported by ubsan. ------------- Commit messages: - 8333088: fix ubsan:shenandoahAdaptiveHeuristics divide by zero Changes: https://git.openjdk.org/jdk/pull/20161/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20161&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8333088 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20161.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20161/head:pull/20161 PR: https://git.openjdk.org/jdk/pull/20161 From shade at openjdk.org Fri Jul 19 21:30:54 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 19 Jul 2024 21:30:54 GMT Subject: RFR: 8333088: ubsan: shenandoahAdaptiveHeuristics.cpp:245:44: runtime error: division by zero In-Reply-To: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> References: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> Message-ID: On Fri, 12 Jul 2024 19:55:13 GMT, Henry Lin wrote: > Changed comparison to multiplication to 
prevent division by zero errors reported by ubsan. Looks fine. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20161#pullrequestreview-2178097697 From shade at openjdk.org Fri Jul 19 21:31:02 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 19 Jul 2024 21:31:02 GMT Subject: RFR: 8333728: ubsan: shenandoahFreeSet.cpp:1347:24: runtime error: division by zero In-Reply-To: References: Message-ID: On Fri, 12 Jul 2024 20:50:30 GMT, Henry Lin wrote: > Changed the check to require `linear` to be nonzero to prevent division by zero errors reported by `ubsan`. `linear > 0` implies that `count > 0` so the count variable is no longer necessary. This looks good, thanks. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20163#pullrequestreview-2178114899 From duke at openjdk.org Fri Jul 19 21:31:02 2024 From: duke at openjdk.org (Henry Lin) Date: Fri, 19 Jul 2024 21:31:02 GMT Subject: RFR: 8333728: ubsan: shenandoahFreeSet.cpp:1347:24: runtime error: division by zero Message-ID: Changed the check to require `linear` to be nonzero to prevent division by zero errors reported by `ubsan`. `linear > 0` implies that `count > 0` so the count variable is no longer necessary. 
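The two ubsan fixes described in these messages follow common patterns that can be sketched as below. This is a generic illustration only: the function names `ratio_exceeds` and `average_or_zero` and their parameters are hypothetical, and the code does not reproduce shenandoahAdaptiveHeuristics.cpp or shenandoahFreeSet.cpp. The first function assumes a non-negative denominator, under which `a / b > t` and `a > t * b` agree whenever `b > 0`.

```cpp
#include <cassert>
#include <cstddef>

// Pattern 1: replace a ratio comparison with a multiplication.
// Instead of evaluating (numerator / denominator > threshold), which divides
// by zero when denominator == 0, compare numerator against
// threshold * denominator. Assumes denominator >= 0.
bool ratio_exceeds(double numerator, double denominator, double threshold) {
  return numerator > threshold * denominator;
}

// Pattern 2: guard the division on the divisor itself.
// Checking the divisor directly makes a separate "count" guard unnecessary
// whenever a positive divisor implies a positive count.
std::size_t average_or_zero(std::size_t total, std::size_t linear) {
  return linear > 0 ? total / linear : 0;
}
```

Both forms avoid executing a division with a zero divisor, so a sanitizer has nothing to report on that path.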
------------- Commit messages: - 8333728: fix ubsan:shenandoahFreeSet.cpp runtime division by zero Changes: https://git.openjdk.org/jdk/pull/20163/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20163&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8333728 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20163.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20163/head:pull/20163 PR: https://git.openjdk.org/jdk/pull/20163 From nprasad at openjdk.org Mon Jul 22 13:58:44 2024 From: nprasad at openjdk.org (Neethu Prasad) Date: Mon, 22 Jul 2024 13:58:44 GMT Subject: RFR: 8335865: Shenandoah: Improve THP pretouch after JDK-8315923 Message-ID: **Notes** os::pretouch now uses madvise when available and falls back to the VM page size otherwise ([JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923)). This change therefore removes the code that sets _pretouch_heap_page_size and _pretouch_bitmap_page_size in Shenandoah. **Testing** * Ran the test on Linux 5.10 and Linux 6.x and confirmed that there is no regression. I could not reproduce the issue or the performance improvement, though. [add results] * Ran [TestTransparentHugePageUsage](https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a) for Shenandoah and verified that the test passed * Ran tier 1, tier 2, tier1_gc_shenandoah, tier2_gc_shenandoah, tier3_gc_shenandoah and hotspot_gc_shenandoah. 
------------- Commit messages: - 8335865: Shenandoah: Improve THP pretouch after JDK-8315923 Changes: https://git.openjdk.org/jdk/pull/20254/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20254&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335865 Stats: 18 lines in 1 file changed: 0 ins; 17 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20254.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20254/head:pull/20254 PR: https://git.openjdk.org/jdk/pull/20254 From rkennke at openjdk.org Mon Jul 22 16:01:37 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 22 Jul 2024 16:01:37 GMT Subject: RFR: 8333728: ubsan: shenandoahFreeSet.cpp:1347:24: runtime error: division by zero In-Reply-To: References: Message-ID: <1LqSYeeKEyJ--F9OBKB7w9VQFEJloob-wizI-sxrGUk=.6049a84c-1c5c-4cb7-bb52-d37ab8849cdc@github.com> On Fri, 12 Jul 2024 20:50:30 GMT, Henry Lin wrote: > Changed the check to require `linear` to be nonzero to prevent division by zero errors reported by `ubsan`. `linear > 0` implies that `count > 0` so the count variable is no longer necessary. Looks good, thank you! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20163#pullrequestreview-2191967705 From rkennke at openjdk.org Mon Jul 22 16:01:40 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 22 Jul 2024 16:01:40 GMT Subject: RFR: 8333088: ubsan: shenandoahAdaptiveHeuristics.cpp:245:44: runtime error: division by zero In-Reply-To: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> References: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> Message-ID: On Fri, 12 Jul 2024 19:55:13 GMT, Henry Lin wrote: > Changed comparison to multiplication to prevent division by zero errors reported by ubsan. Looks good, thank you! ------------- Marked as reviewed by rkennke (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/20161#pullrequestreview-2191965028 From duke at openjdk.org Mon Jul 22 17:31:36 2024 From: duke at openjdk.org (duke) Date: Mon, 22 Jul 2024 17:31:36 GMT Subject: RFR: 8333088: ubsan: shenandoahAdaptiveHeuristics.cpp:245:44: runtime error: division by zero In-Reply-To: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> References: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> Message-ID: On Fri, 12 Jul 2024 19:55:13 GMT, Henry Lin wrote: > Changed comparison to multiplication to prevent division by zero errors reported by ubsan. @Henry-Lin-A Your change (at version 75e5435db0300f54ec2d88684f3a6b888badff4c) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20161#issuecomment-2243461370 From duke at openjdk.org Mon Jul 22 17:31:36 2024 From: duke at openjdk.org (Henry Lin) Date: Mon, 22 Jul 2024 17:31:36 GMT Subject: Integrated: 8333088: ubsan: shenandoahAdaptiveHeuristics.cpp:245:44: runtime error: division by zero In-Reply-To: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> References: <1drlX-RUquqXP9FFKE1X3Uq6pUyRjYPdPwMtCjACUdM=.11f83216-e30e-454d-a62e-65089d6614d3@github.com> Message-ID: <95otRcxDhReKuw9Rwafx2jgE1-4abHcRxmJ0p7sbBdM=.df00951d-5fe8-4595-b552-97a3e54b7d21@github.com> On Fri, 12 Jul 2024 19:55:13 GMT, Henry Lin wrote: > Changed comparison to multiplication to prevent division by zero errors reported by ubsan. This pull request has now been integrated. 
Changeset: 34eea6a5 Author: Henry Lin Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/34eea6a5fa27121bc0e9e8ace0894833d4a9f826 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8333088: ubsan: shenandoahAdaptiveHeuristics.cpp:245:44: runtime error: division by zero Reviewed-by: shade, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/20161 From duke at openjdk.org Mon Jul 22 17:32:36 2024 From: duke at openjdk.org (duke) Date: Mon, 22 Jul 2024 17:32:36 GMT Subject: RFR: 8333728: ubsan: shenandoahFreeSet.cpp:1347:24: runtime error: division by zero In-Reply-To: References: Message-ID: On Fri, 12 Jul 2024 20:50:30 GMT, Henry Lin wrote: > Changed the check to require `linear` to be nonzero to prevent division by zero errors reported by `ubsan`. `linear > 0` implies that `count > 0` so the count variable is no longer necessary. @Henry-Lin-A Your change (at version cf6fb062f7fac3e919ce0427b71ea0bf99054ac9) is now ready to be sponsored by a Committer. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20163#issuecomment-2243461577 From duke at openjdk.org Mon Jul 22 17:32:36 2024 From: duke at openjdk.org (Henry Lin) Date: Mon, 22 Jul 2024 17:32:36 GMT Subject: Integrated: 8333728: ubsan: shenandoahFreeSet.cpp:1347:24: runtime error: division by zero In-Reply-To: References: Message-ID: On Fri, 12 Jul 2024 20:50:30 GMT, Henry Lin wrote: > Changed the check to require `linear` to be nonzero to prevent division by zero errors reported by `ubsan`. `linear > 0` implies that `count > 0` so the count variable is no longer necessary. This pull request has now been integrated. 
Changeset: b5575942 Author: Henry Lin Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/b5575942027281166676678e2081b024ec572644 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod 8333728: ubsan: shenandoahFreeSet.cpp:1347:24: runtime error: division by zero Reviewed-by: shade, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/20163 From wkemper at openjdk.org Mon Jul 22 19:06:08 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 22 Jul 2024 19:06:08 GMT Subject: Integrated: Merge openjdk/jdk:master In-Reply-To: <4u_CVXx1SmkbJR8UG4lLfkkg_3-Uai0oJTqhHn1_fSk=.3064d970-3323-4029-a5a3-beeea8c10feb@github.com> References: <4u_CVXx1SmkbJR8UG4lLfkkg_3-Uai0oJTqhHn1_fSk=.3064d970-3323-4029-a5a3-beeea8c10feb@github.com> Message-ID: On Fri, 12 Jul 2024 17:15:53 GMT, William Kemper wrote: > Merges tag jdk-24+6 This pull request has now been integrated. Changeset: e30fa929 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/e30fa929b19e8b7d6db9073fdf640938d3db39fd Stats: 67311 lines in 1691 files changed: 41794 ins; 18185 del; 7332 mod Merge ------------- PR: https://git.openjdk.org/shenandoah/pull/460 From wkemper at openjdk.org Tue Jul 23 00:18:00 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 23 Jul 2024 00:18:00 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations Message-ID: In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race. 
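The evacuation race described here can be sketched with a compare-and-swap on a forwarding pointer. This is a simplified model, not Shenandoah code: the `Object` struct, the `evacuate` function, and the `post_processed` flag (standing in for "stack chunk relativized") are all hypothetical, and real evacuation also involves allocation, copying, and backing out the losing allocation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object model for illustration only.
struct Object {
  std::atomic<Object*> forwardee{nullptr};  // forwarding pointer, null until evacuated
  int payload = 0;
  bool post_processed = false;  // stands in for "stack chunk relativized"
};

// Tries to install my_copy as the evacuated copy of from.
// Returns the canonical copy: my_copy if this thread won the race,
// otherwise the copy installed by the winning thread.
Object* evacuate(Object* from, Object* my_copy) {
  my_copy->payload = from->payload;  // speculative private copy

  Object* expected = nullptr;
  if (from->forwardee.compare_exchange_strong(expected, my_copy)) {
    // Won the race: only the winner performs the post-copy work.
    my_copy->post_processed = true;
    return my_copy;
  }

  // Lost the race: expected now holds the winner's copy. The caller is
  // expected to back out (discard) my_copy; no post-copy work is done on it.
  return expected;
}
```

In this model the losing thread's copy is never post-processed, which is the ordering the change description asks for.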
------------- Commit messages: - Only relativize stack chunks for the evacuated object that wins the race Changes: https://git.openjdk.org/jdk/pull/20288/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20288&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336944 Stats: 3 lines in 1 file changed: 1 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20288/head:pull/20288 PR: https://git.openjdk.org/jdk/pull/20288 From ysr at openjdk.org Tue Jul 23 01:33:33 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 23 Jul 2024 01:33:33 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Mon, 22 Jul 2024 23:56:46 GMT, William Kemper wrote: > In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race. ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20288#pullrequestreview-2192803782 From shade at openjdk.org Tue Jul 23 09:53:30 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jul 2024 09:53:30 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Mon, 22 Jul 2024 23:56:46 GMT, William Kemper wrote: > In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race. Current code effectively relativizes a private copy of the object, so there is no question about the races over it.
New code relativizes when that private copy is now public. This looks less safe than before. So my question is then: why is relativizing a private copy problematic? ------------- PR Review: https://git.openjdk.org/jdk/pull/20288#pullrequestreview-2193512736 From rkennke at openjdk.org Tue Jul 23 09:59:37 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 23 Jul 2024 09:59:37 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Tue, 23 Jul 2024 09:50:57 GMT, Aleksey Shipilev wrote: > Current code effectively relativizes a private copy of the object, so there is no question about the races over it. New code relativizes when that private copy is now public. This looks less safe than before. > > So my question is then: why is relativizing a private copy problematic? IIRC, it requires the Klass*. With Lilliput, it is possible that the copy that lost the CAS copied the fwd-ptr that the winning thread installed. I guess we could follow through that fwd-ptr and grab the Klass* from the winning copy, but why bother? ------------- PR Comment: https://git.openjdk.org/jdk/pull/20288#issuecomment-2244772101 From shade at openjdk.org Tue Jul 23 10:28:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jul 2024 10:28:32 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Tue, 23 Jul 2024 09:56:59 GMT, Roman Kennke wrote: > IIRC, it requires the Klass*. With Lilliput, it is possible that the copy that lost the CAS copied the fwd-ptr that the winning thread installed. I guess we could follow through that fwd-ptr and grab the Klass* from the winning copy, but why bother?
Right, I remembered this one:
https://github.com/openjdk/lilliput/blob/fdfcf46a3a4bfcf0f58f9413788aed8acc746203/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1171-L1185

I would sleep a little better if we did it similarly: reinstall the mark word into a private copy if we pulled the fwdptr already, instead of relying on concurrent relativization of the already published copy.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20288#issuecomment-2244835474

From shade at openjdk.org Tue Jul 23 11:18:45 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 23 Jul 2024 11:18:45 GMT
Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations
In-Reply-To: References: Message-ID:

On Mon, 22 Jul 2024 23:56:46 GMT, William Kemper wrote:

> In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race.

I suggest something like:

    // Additional twist for stack chunks
    if (copy_val->is_stackChunk()) {
      // We need to call into relativization code for stack chunks. For that, we need
      // a proper object with all metadata set right. There is a race with fwdptr
      // installation from the thread that might have already won the CAS and overwritten
      // the mark word now holding the fwdptr, and we have just picked it up by accident.
      // Additionally, memory copy is not atomic, so whatever we read from that mark word
      // could be partial.
      //
      // Check that original mark is still not forwarded. This means no one has installed
      // it yet, and so we overwrite mark in our private copy with a safe value, relativize,
      // and try to install the copy. In all other cases, the installation would fail.
      markWord old_mark = p->mark();
      if (!old_mark.is_marked()) {
        copy_val->set_mark(old_mark);
        ContinuationGCSupport::relativize_stack_chunk(copy_val);
      }
    }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20288#issuecomment-2244959839

From shade at openjdk.org Tue Jul 23 12:09:32 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Tue, 23 Jul 2024 12:09:32 GMT
Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations
In-Reply-To: References: Message-ID:

On Mon, 22 Jul 2024 23:56:46 GMT, William Kemper wrote:

> In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race.

Marked as reviewed by shade (Reviewer).

So, my original concern was that doing relativization on already public `StackChunk` instances exposes us to more risk. But now I am looking around and seeing that we are actually doing this during marking as well, which is an even more frequent case!
https://github.com/openjdk/jdk/blob/e83b4b236eca48d0b75094006f7f888398194fe4/src/hotspot/share/gc/shenandoah/shenandoahMark.inline.hpp#L75-L79

So I think we can go with a simple patch here.

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1185:

> 1183:     // Successfully evacuated. Our copy is now the public one!
> 1184:     shenandoah_assert_correct(nullptr, copy_val);
> 1185:     ContinuationGCSupport::relativize_stack_chunk(copy_val);

Please move it before the assert, so that we are sure relativization did not foobar the `copy_val`. Asserts in these methods verify that we return good things.
------------- PR Review: https://git.openjdk.org/jdk/pull/20288#pullrequestreview-2193791051 PR Comment: https://git.openjdk.org/jdk/pull/20288#issuecomment-2245060685 PR Review Comment: https://git.openjdk.org/jdk/pull/20288#discussion_r1687944160 From wkemper at openjdk.org Tue Jul 23 15:19:32 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 23 Jul 2024 15:19:32 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Tue, 23 Jul 2024 09:56:59 GMT, Roman Kennke wrote: > > Current code effectively relativizes a private copy of the object, so there is no question about the races over it. New code relativizes when that private copy is now public. This looks less safe than before. > > So my question is then: why is relativizing a private copy problematic? > > IIRC, it requires the Klass*. With Lilliput, it is possible that the copy that lost the CAS copied the fwd-ptr that the winning thread installed. I guess we could follow through that fwd-ptr and grab the Klass* from the winning copy, but why bother? Yes, exactly. In the crash I debugged, the `copy_val` included the forwarding pointer installed by the winning thread. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20288#issuecomment-2245536056 From wkemper at openjdk.org Tue Jul 23 15:19:34 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 23 Jul 2024 15:19:34 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Tue, 23 Jul 2024 12:06:26 GMT, Aleksey Shipilev wrote: >> In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race.
> > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1185: > >> 1183: // Successfully evacuated. Our copy is now the public one! >> 1184: shenandoah_assert_correct(nullptr, copy_val); >> 1185: ContinuationGCSupport::relativize_stack_chunk(copy_val); > > Please move it before the assert, so that we are sure relativization did not foobar the `copy_val`. Asserts in these methods verify that we return good things. Okay, I had it that way originally, but then changed it to make sure we were relativizing a valid copy. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20288#discussion_r1688249719 From wkemper at openjdk.org Tue Jul 23 16:53:50 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 23 Jul 2024 16:53:50 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations [v2] In-Reply-To: References: Message-ID: <73hYnhJmzNdiODBPRF4Zz3YcmPNHkdAc0QgXszBRPSA=.c45fcbc9-3f86-458e-859b-a29801de7592@github.com> > In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race. 
William Kemper has updated the pull request incrementally with one additional commit since the last revision: Assert that return value is correct after relativization of stack chunk ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20288/files - new: https://git.openjdk.org/jdk/pull/20288/files/ad832e7e..6943b79a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20288&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20288&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20288.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20288/head:pull/20288 PR: https://git.openjdk.org/jdk/pull/20288 From shade at openjdk.org Tue Jul 23 16:53:50 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 23 Jul 2024 16:53:50 GMT Subject: RFR: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations [v2] In-Reply-To: <73hYnhJmzNdiODBPRF4Zz3YcmPNHkdAc0QgXszBRPSA=.c45fcbc9-3f86-458e-859b-a29801de7592@github.com> References: <73hYnhJmzNdiODBPRF4Zz3YcmPNHkdAc0QgXszBRPSA=.c45fcbc9-3f86-458e-859b-a29801de7592@github.com> Message-ID: On Tue, 23 Jul 2024 16:50:32 GMT, William Kemper wrote: >> In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Assert that return value is correct after relativization of stack chunk Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/20288#pullrequestreview-2194520892 From wkemper at openjdk.org Tue Jul 23 16:53:50 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 23 Jul 2024 16:53:50 GMT Subject: Integrated: 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations In-Reply-To: References: Message-ID: On Mon, 22 Jul 2024 23:56:46 GMT, William Kemper wrote: > In some cases, different threads may race to evacuate an object. The race is won by "CAS"ing in the forwarding pointer. The threads that lose this race must "back out" their allocation. The work to relativize stack chunks should only happen for the thread that wins the evacuation race. This pull request has now been integrated. Changeset: 2f2223d7 Author: William Kemper URL: https://git.openjdk.org/jdk/commit/2f2223d7524c4405cc7ca6ab77da62016bbfa911 Stats: 3 lines in 1 file changed: 1 ins; 2 del; 0 mod 8336944: Shenandoah: Should only relativize stack chunks for successful evacuations Reviewed-by: shade, ysr ------------- PR: https://git.openjdk.org/jdk/pull/20288 From wkemper at openjdk.org Wed Jul 24 18:12:40 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 24 Jul 2024 18:12:40 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode Message-ID: We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. 
------------- Commit messages: - Remove last vestiges of incremental update mode - Missed test, remove actual IU barrier flag - Remove missed iu_barrier usages for C1 - Update test (all barriers can be enabled now for all modes) - WIP: Remove incremental update mode Changes: https://git.openjdk.org/jdk/pull/20316/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336685 Stats: 1696 lines in 69 files changed: 4 ins; 1658 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/20316.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20316/head:pull/20316 PR: https://git.openjdk.org/jdk/pull/20316 From shade at openjdk.org Wed Jul 24 18:44:31 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jul 2024 18:44:31 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 18:08:46 GMT, William Kemper wrote: > We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. > > ## Testing > * hotspot_gc_shenandoah > * dacapo > * diluvian > * extremem > * hyperalloc > * specjbb2015 > * specjvm2008 Good riddance. I have to comb through this more accurately tomorrow, but first pass comments below. src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp line 571: > 569: /* ==== Apply keep-alive barrier, if required (e.g., to inhibit weak reference resurrection) ==== */ > 570: if (ShenandoahBarrierSet::need_keep_alive_barrier(decorators, type)) { > 571: if (ShenandoahSATBBarrier) { A bit weird to replace IU with SATB barrier here. src/hotspot/share/opto/classes.hpp line 327: > 325: shmacro(ShenandoahWeakCompareAndSwapN) > 326: shmacro(ShenandoahWeakCompareAndSwapP) > 327: I think this newline is unnecessary. 
------------- PR Review: https://git.openjdk.org/jdk/pull/20316#pullrequestreview-2197480695 PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690256658 PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690269614 From kdnilsen at openjdk.org Wed Jul 24 18:55:32 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 24 Jul 2024 18:55:32 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 18:25:38 GMT, Aleksey Shipilev wrote: >> We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. >> >> ## Testing >> * hotspot_gc_shenandoah >> * dacapo >> * diluvian >> * extremem >> * hyperalloc >> * specjbb2015 >> * specjvm2008 > > src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp line 571: > >> 569: /* ==== Apply keep-alive barrier, if required (e.g., to inhibit weak reference resurrection) ==== */ >> 570: if (ShenandoahBarrierSet::need_keep_alive_barrier(decorators, type)) { >> 571: if (ShenandoahSATBBarrier) { > > A bit weird to replace IU with SATB barrier here. Will `need_keep_alive_barrier()` always be false in the absence of IU mode support? Can we replace this with an assert?
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690285365

From xpeng at openjdk.org Wed Jul 24 19:10:45 2024
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 24 Jul 2024 19:10:45 GMT
Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate
Message-ID:

[parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Because all of these operations are very lightweight, parallelism looks like overkill in the regular case without a large number of heap regions: the cost of orchestrating multiple threads can exceed the cost of the work itself. In most cases, a single thread should be more efficient. Also, when multi-threading is needed, we should maximize the utilization of all active workers for best performance.

This PR includes proposed improvements addressing the known issues:
1. Change the default value of ShenandoahParallelRegionStride to 0; when it is 0, Shenandoah automatically derives the stride value for best performance.
2. If num_regions is <= 4096, do not use worker threads at all, to avoid the overhead of multi-threading.
3. When num_regions is more than 4096, use worker threads to parallelize the workload, and derive the stride so that the workload is evenly distributed across all active workers.
4. When the number of active workers is 1, do not bother the workers; it is faster to finish the workload in the current thread (this avoids the overhead of multi-thread orchestration).

Here are some timings I collected from a test with the TIP version (I added time metrics for parallel_heap_region_iterate):

JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*"

|              | 1024 regions | 2048 regions | 4096 regions | 8192 regions | 16384 regions |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------- |
| 1024 stride  | 5785 ns      | 22194 ns     | 20953 ns     | 23008 ns     | 33013 ns      |
| 2048 stride  | N/A          | 6491 ns      | 22476 ns     | 25842 ns     | 34378 ns      |
| 4096 stride  | N/A          | N/A          | 14034 ns     | 28425 ns     | 36324 ns      |
| 8192 stride  | N/A          | N/A          | N/A          | 24359 ns     | 45231 ns      |
| 16384 stride | N/A          | N/A          | N/A          | N/A          | 53679 ns      |

Basically, when we increase the stride, fewer threads are used for parallel iteration and latency gets worse, which is expected. When the number of regions equals the stride, no multi-threading is used; processing 4096 regions with a single thread (14034 ns) is much better than with 4 threads at a 1024 stride (20953 ns).

I also tested the PR with the following JVM args:

export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahTargetNumRegions= -Xlog:gc*"

| Regions | time (ns) |
| ------- | --------- |
| 1024    | 5103      |
| 2048    | 6132      |
| 4096    | 12763     |
| 8192    | 24295     |
| 16384   | 33729     |

Overall, the performance is at or near the best fixed-stride result above, no matter how many heap regions there are.
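The stride rules listed above can be written down as a small standalone sketch. The helper names `derive_stride` and `use_workers` are hypothetical and exist only for illustration; the arithmetic loosely follows the integer ceiling-division form proposed in review later in this thread, and the 4096 threshold is the value quoted in the PR description. It is not the code that was integrated.

```cpp
#include <cstddef>

// Hypothetical helper: pick the stride used for parallel region iteration.
// configured_stride plays the role of ShenandoahParallelRegionStride, where
// 0 means "derive automatically".
size_t derive_stride(size_t configured_stride, size_t n_regions, unsigned active_workers) {
  size_t stride = configured_stride;
  if (stride == 0 && active_workers > 1) {
    // Below the threshold the whole heap fits in one stride, so no work is split.
    const size_t threshold = 4096;
    stride = (n_regions <= threshold)
                 ? threshold
                 : (n_regions + active_workers - 1) / active_workers;  // ceiling division
  }
  return stride;
}

// Hypothetical helper: worker threads are only used when there is more than
// one stride of work and more than one active worker.
bool use_workers(size_t stride, size_t n_regions, unsigned active_workers) {
  return n_regions > stride && active_workers > 1;
}
```

With 8 active workers, a 2048-region heap stays single-threaded under these rules, while a 16384-region heap is split into strides of 2048 regions, one per worker. An explicitly configured nonzero stride is passed through unchanged.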
Additional test: - [ ] `make test TEST=hotspot_gc_shenandoah` ------------- Commit messages: - Add empty line - clean - Fix build error on Windows - Revert "Add timing logs for execution of ShenandoahHeapRegionClosure" - Remove the default arg value to constructor of ShenandoahParallelHeapRegionTask - Auto derive stride for ShenandoahParallelHeapRegionTask when ShenandoahParallelRegionStride is set to 0 - Dynamic calculate stride - Add timing logs for execution of ShenandoahHeapRegionClosure - Recalibrate ShenandoahParallelRegionStride value if there is no override from JVM args Changes: https://git.openjdk.org/jdk/pull/20305/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20305&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336640 Stats: 21 lines in 2 files changed: 14 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/20305.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20305/head:pull/20305 PR: https://git.openjdk.org/jdk/pull/20305 From shade at openjdk.org Wed Jul 24 19:10:46 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jul 2024 19:10:46 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 00:42:22 GMT, Xiaolong Peng wrote: > [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. 
Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. > > This PR includes proposed improvments addressing the known issues: > 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; > 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; > 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. > 4. When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) > > There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): > > JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" > > | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | > | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | > | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | > | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | > | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | > | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | > | 16384 stride | N/A | N/A | N/A | N/A |53679 ns | > > Basically w... src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1697: > 1695: ShenandoahHeap* const _heap; > 1696: ShenandoahHeapRegionClosure* const _blk; > 1697: size_t _stride; Should be `size_t const _stride;`? 
src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1729:

> 1727: void ShenandoahHeap::parallel_heap_region_iterate(ShenandoahHeapRegionClosure* blk) const {
> 1728:   assert(blk->is_thread_safe(), "Only thread-safe closures here");
> 1729:   const uint active_workers = workers() -> active_workers();

Suggestion:

    const uint active_workers = workers()->active_workers();

src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1737:

> 1735:     // not use worker threads to avoid the overhead; otherwise cacluate the stride by num_regions/active_workers
> 1736:     // to make sure every worker thread will have same amount of workload.
> 1737:     stride = n_regions <= 4096 ? 4096 : checked_cast(ceil(checked_cast(n_regions) / checked_cast(active_workers)));

I suggest writing it like this:

    size_t stride = ShenandoahParallelRegionStride;

    if (stride == 0 && active_workers > 1) {
      // Automatically derive the stride to balance the work between threads
      // evenly. Do not try to split work if below the reasonable threshold.
      const size_t threshold = 4096;
      stride = (n_regions <= threshold) ?
                threshold :
                (n_regions + active_workers - 1) / active_workers;
    }

    if (n_regions > stride && active_workers > 1) {

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690045161
PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690045677
PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690214436

From xpeng at openjdk.org Wed Jul 24 19:10:47 2024
From: xpeng at openjdk.org (Xiaolong Peng)
Date: Wed, 24 Jul 2024 19:10:47 GMT
Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate
In-Reply-To: References: Message-ID: <_pbtLdrJiJw6Wlrim13_cg5G8qTlxiqxHLJHCpmmjVc=.928b9c4a-681e-4dd1-8c0a-d531f2000504@github.com>

On Wed, 24 Jul 2024 15:42:50 GMT, Aleksey Shipilev wrote:

>> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance.
>>
>> This PR includes proposed improvements addressing the known issues:
>> 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance;
>> 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading;
>> 3.
When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. >> 4. When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) >> >> There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): >> >> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" >> >> | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | >> | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | >> | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | >> | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | >> | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | >> | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | >> | 16384 stride | N/A | N/A | N/A | N/A ... > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1697: > >> 1695: ShenandoahHeap* const _heap; >> 1696: ShenandoahHeapRegionClosure* const _blk; >> 1697: size_t _stride; > > Should be `size_t const _stride;`? Yes, it should be const > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1737: > >> 1735: // not use worker threads to avoid the overhead; otherwise cacluate the stride by num_regions/active_workers >> 1736: // to make sure every worker thread will have same amount of workload. >> 1737: stride = n_regions <= 4096 ? 
4096 : checked_cast(ceil(checked_cast(n_regions) / checked_cast(active_workers))); > > I suggest writing it like this: > > > size_t stride = ShenandoahParallelRegionStride; > > if (stride == 0 && active_workers > 1) { > // Automatically derive the stride to balance the work between threads > // evenly. Do not try to split work if below the reasonable threshold. > const size_t threshold = 4096; > stride = (n_regions <= threshold) ? > threshold : > (n_regions + active_workers - 1) / active_workers; > } > > if (n_regions > stride && active_workers > 1) { Neat! Thanks! I'll update the PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690281649 PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690281203 From wkemper at openjdk.org Wed Jul 24 19:27:40 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 24 Jul 2024 19:27:40 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 18:25:38 GMT, Aleksey Shipilev wrote: >> We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. >> >> ## Testing >> * hotspot_gc_shenandoah >> * dacapo >> * diluvian >> * extremem >> * hyperalloc >> * specjbb2015 >> * specjvm2008 > > src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp line 571: > >> 569: /* ==== Apply keep-alive barrier, if required (e.g., to inhibit weak reference resurrection) ==== */ >> 570: if (ShenandoahBarrierSet::need_keep_alive_barrier(decorators, type)) { >> 571: if (ShenandoahSATBBarrier) { > > A bit weird to replace IU with SATB barrier here. @shipilev [The original code](https://github.com/openjdk/jdk/pull/20316/files#diff-cb01b36a8c7017c9e21645a0ff9075897e5bfa67ae37d4f0d69ccc582656ec31L71) used the function `iu_barrier` to emit the pre-write barrier for both SATB and IU modes. This only happened in the `ppc` port. 
Other platforms just invoke the function to emit the `satb_write_barrier` directly here (see [x86](https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp#L580), for example). @kdnilsen No, `need_keep_alive_barrier` may be true in SATB mode. The use of the barrier here is to make sure weak references that get loaded during mark are added to the SATB buffer. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690320078 From wkemper at openjdk.org Wed Jul 24 19:31:04 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 24 Jul 2024 19:31:04 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: References: Message-ID: <03bSRAN8T28AU2-M4IzjsBygTwG4SHrc8HUIJYLM5TE=.e4299a87-b25f-471b-9f6e-2c08e741c6f2@github.com> > We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. > > ## Testing > * hotspot_gc_shenandoah > * dacapo > * diluvian > * extremem > * hyperalloc > * specjbb2015 > * specjvm2008 William Kemper has updated the pull request incrementally with one additional commit since the last revision: Remove unintentional new line ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20316/files - new: https://git.openjdk.org/jdk/pull/20316/files/41a2deb8..ec2d6b64 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20316.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20316/head:pull/20316 PR: https://git.openjdk.org/jdk/pull/20316 From wkemper at openjdk.org Wed Jul 24 19:31:05 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 24 Jul 2024 19:31:05 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: 
References: Message-ID: On Wed, 24 Jul 2024 18:38:02 GMT, Aleksey Shipilev wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unintentional new line > > src/hotspot/share/opto/classes.hpp line 327: > >> 325: shmacro(ShenandoahWeakCompareAndSwapN) >> 326: shmacro(ShenandoahWeakCompareAndSwapP) >> 327: > > I think this newline is unnecessary. I agree. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690322953 From wkemper at openjdk.org Wed Jul 24 19:46:33 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 24 Jul 2024 19:46:33 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 00:42:22 GMT, Xiaolong Peng wrote: > [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. > > This PR includes proposed improvments addressing the known issues: > 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; > 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; > 3. 
When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. > 4. When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) > > There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): > > JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" > > | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | > | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | > | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | > | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | > | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | > | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | > | 16384 stride | N/A | N/A | N/A | N/A |53679 ns | > > Basically w... src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 323: > 321: product(uintx, ShenandoahParallelRegionStride, 0, EXPERIMENTAL, \ > 322: "How many regions to process at once during parallel region " \ > 323: "iteration. Affects heaps with lots of regions." \ Probably want a trailing space after `regions. 
"` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690340458 From xpeng at openjdk.org Wed Jul 24 19:50:44 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 24 Jul 2024 19:50:44 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: > [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. > > This PR includes proposed improvments addressing the known issues: > 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; > 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; > 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. > 4. 
When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) > > There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): > > JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" > > | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | > | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | > | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | > | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | > | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | > | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | > | 16384 stride | N/A | N/A | N/A | N/A |53679 ns | > > Basically w... Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: Add trailing space ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20305/files - new: https://git.openjdk.org/jdk/pull/20305/files/01367942..dcc1d29f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20305&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20305&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20305.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20305/head:pull/20305 PR: https://git.openjdk.org/jdk/pull/20305 From xpeng at openjdk.org Wed Jul 24 19:50:44 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Wed, 24 Jul 2024 19:50:44 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: <77ng_rxr88xCpCGmOV6NR1R_nsdqvcQw0PIzSmWdSdo=.cd4c28db-909c-4cc4-ab20-ab46d4872383@github.com> On Wed, 24 Jul 
2024 19:44:14 GMT, William Kemper wrote:

>> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Add trailing space
>
> src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 323:
>
>> 321:   product(uintx, ShenandoahParallelRegionStride, 0, EXPERIMENTAL, \
>> 322:           "How many regions to process at once during parallel region " \
>> 323:           "iteration. Affects heaps with lots of regions." \
>
> Probably want a trailing space after `regions. "`

Thanks! Just added.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20305#discussion_r1690344131

From shade at openjdk.org Wed Jul 24 20:04:03 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 24 Jul 2024 20:04:03 GMT
Subject: RFR: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration
Message-ID:

There is a problem in the interaction between the STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in the `ShCWREUT` constructor, proceed to STS and block, any waiter that waits for the end of iteration would block until the safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g. Sweeper (not really in mainline, but we have seen a case in JDK 17), or a JIT compiler patching code, then we would have a deadlock.

A solution is to initialize iteration only immediately before the use of the iterator. This will pass any pending STSes and give much less risk of deadlock, as long as nothing in nmethod handling safepoints.

Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare.
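The ordering change described above can be pictured with a small stand-alone sketch. This is not the HotSpot patch: `IterationTable`, `IterationScope` and `LazyIterationTask` are hypothetical names, the "blocking point" callback only stands in for joining the STS, and the real change may track workers differently. The only point shown is that the iteration is not registered in the constructor, so a task that blocks before doing any work does not look like an iteration in progress to a waiter.

```cpp
#include <atomic>
#include <functional>

// Hypothetical stand-in for the shared "an iteration is in progress" state
// that a waiter would poll. Not a HotSpot type.
struct IterationTable {
  std::atomic<int> active{0};
};

// RAII registration: the iteration is visible to waiters only while a scope
// object is alive.
class IterationScope {
  IterationTable& table_;
public:
  explicit IterationScope(IterationTable& t) : table_(t) { table_.active.fetch_add(1); }
  ~IterationScope() { table_.active.fetch_sub(1); }
  IterationScope(const IterationScope&) = delete;
  IterationScope& operator=(const IterationScope&) = delete;
};

// Hypothetical task: the constructor registers nothing. Registration happens
// in work(), after the point where the worker might block.
class LazyIterationTask {
  IterationTable& table_;
public:
  explicit LazyIterationTask(IterationTable& t) : table_(t) {}

  void work(const std::function<void()>& pass_blocking_point,
            const std::function<void()>& visit) {
    pass_blocking_point();         // stand-in for joining the STS; may block
    IterationScope scope(table_);  // iteration starts only immediately before use
    visit();                       // stand-in for walking the nmethod list
  }                                // scope ends here, iteration is deregistered
};
```

With this ordering, a waiter that polls `active` while the worker is still parked at the blocking point sees no iteration in progress, which is the property the proposed fix relies on.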
Additional testing: - [x] Reproducer on JDK 17u with this patch applied now does not deadlock - [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC` - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` ------------- Commit messages: - Better worker counting, since some workers can already be parked at STS - Counting threads better - Touchups - Fix Changes: https://git.openjdk.org/jdk/pull/20309/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20309&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8334482 Stats: 72 lines in 4 files changed: 25 ins; 33 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/20309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20309/head:pull/20309 PR: https://git.openjdk.org/jdk/pull/20309 From shade at openjdk.org Wed Jul 24 20:05:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 24 Jul 2024 20:05:32 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 19:50:44 GMT, Xiaolong Peng wrote: >> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. >> >> This PR includes proposed improvments addressing the known issues: >> 1. 
Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; >> 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; >> 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. >> 4. When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) >> >> There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): >> >> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" >> >> | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | >> | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | >> | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | >> | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | >> | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | >> | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | >> | 16384 stride | N/A | N/A | N/A | N/A ... > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Add trailing space Marked as reviewed by shade (Reviewer). I see double-digit-us improvements on my Mac, nice. 
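For readers following the stride discussion, the rule from the quoted PR text (items 1 to 4) together with the reviewer's suggested rewrite can be sketched as two small free functions. The function names `derive_stride` and `use_workers` are made up for illustration; the 4096 threshold and the ceiling division come from the thread, and the actual HotSpot code may differ in detail.

```cpp
#include <cstddef>

// Hypothetical helper, not HotSpot code. A configured stride of 0 means
// "derive the stride automatically", as proposed for the new default of
// ShenandoahParallelRegionStride.
size_t derive_stride(size_t configured_stride, size_t n_regions, size_t active_workers) {
  size_t stride = configured_stride;
  if (stride == 0 && active_workers > 1) {
    // Below the threshold, keep the stride at the threshold so the work is
    // not split at all; above it, spread regions evenly over the workers.
    const size_t threshold = 4096;
    stride = (n_regions <= threshold)
        ? threshold
        : (n_regions + active_workers - 1) / active_workers;  // ceiling division
  }
  return stride;
}

// Workers are used only when there is more than one stride of work and more
// than one active worker; otherwise the calling thread does everything.
bool use_workers(size_t stride, size_t n_regions, size_t active_workers) {
  return n_regions > stride && active_workers > 1;
}
```

For example, 2048 regions with 8 workers gives a stride of 4096 and no worker dispatch, while 8192 regions with 8 workers gives a stride of 1024 and parallel execution.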
# Before
[30.851s][info][gc,stats] Pause Init Mark (G) = 0.076 s (a = 272 us) (n = 281) (lvls, us = 47, 137, 199, 246, 5296)
[30.851s][info][gc,stats] Pause Init Mark (N) = 0.019 s (a = 68 us) (n = 281) (lvls, us = 20, 49, 57, 71, 617)
[30.851s][info][gc,stats] Update Region States = 0.014 s (a = 49 us) (n = 281) (lvls, us = 13, 34, 41, 51, 368)

# After
[30.301s][info][gc,stats] Pause Init Mark (G) = 0.073 s (a = 239 us) (n = 307) (lvls, us = 31, 90, 162, 217, 3586)
[30.301s][info][gc,stats] Pause Init Mark (N) = 0.009 s (a = 28 us) (n = 307) (lvls, us = 10, 21, 26, 33, 188)
[30.301s][info][gc,stats] Update Region States = 0.004 s (a = 14 us) (n = 307) (lvls, us = 4, 11, 13, 15, 176)

-------------

PR Review: https://git.openjdk.org/jdk/pull/20305#pullrequestreview-2197663679
PR Comment: https://git.openjdk.org/jdk/pull/20305#issuecomment-2248800641

From nprasad at openjdk.org Wed Jul 24 21:17:39 2024
From: nprasad at openjdk.org (Neethu Prasad)
Date: Wed, 24 Jul 2024 21:17:39 GMT
Subject: RFR: 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts
Message-ID:

**Notes**
This PR adds the following
1. info logging on number of SATB flush attempts
2. total time spent on handshaking all threads requesting them to flush their SATB buffers.

As suggested by William in [JDK-8336742](https://bugs.openjdk.org/browse/JDK-8336742), we can use handshake logging to get time spent and other stats for each handshake.

[4.515s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1035, Total completion time: 597004 ns
[4.517s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1033, Total completion time: 207402 ns

**Testing**
1. tier1, tier2 and hotspot_gc_shenandoah tests.
2.
**-Xlog:gc+stats=info** [4.058s][info][gc,stats ] Concurrent Marking = 0.080 s (a = 5351 us) (n = 15) (lvls, us = 4746, 5000, 5156, 5684, 5988) [4.058s][info][gc,stats ] SATB Flush Rendezvous = 0.013 s (a = 860 us) (n = 15) (lvls, us = 764, 814, 836, 885, 961) [4.058s][info][gc,stats ] Pause Final Mark (G) = 0.058 s (a = 3839 us) (n = 15) (lvls, us = 3047, 3320, 3867, 4121, 4930) [4.058s][info][gc,stats ] Pause Final Mark (N) = 0.054 s (a = 3592 us) (n = 15) (lvls, us = 2812, 3047, 3574, 3887, 4597) [4.058s][info][gc,stats ] Finish Mark = 0.028 s (a = 1843 us) (n = 15) (lvls, us = 1602, 1641, 1816, 1934, 2045) [4.058s][info][gc,stats ] Update Region States = 0.006 s (a = 386 us) (n = 15) (lvls, us = 375, 375, 381, 389, 413) [4.058s][info][gc,stats ] Choose Collection Set = 0.018 s (a = 1186 us) (n = 15) (lvls, us = 609, 619, 1309, 1387, 2109) [4.058s][info][gc,stats ] Rebuild Free Set = 0.001 s (a = 43 us) (n = 15) (lvls, us = 40, 41, 42, 43, 53) [4.058s][info][gc,stats ] Concurrent Weak References = 0.007 s (a = 452 us) (n = 15) (lvls, us = 420, 438, 443, 455, 487) on app termination [5.299s][info][gc,stats] GC STATISTICS: [5.299s][info][gc,stats] "(G)" (gross) pauses include VM time: time to notify and block threads, do the pre- [5.299s][info][gc,stats] and post-safepoint housekeeping. Use -Xlog:safepoint+stats to dissect. [5.299s][info][gc,stats] "(N)" (net) pauses are the times spent in the actual GC code. [5.299s][info][gc,stats] "a" is average time for each phase, look at levels to see if average makes sense. [5.299s][info][gc,stats] "lvls" are quantiles: 0% (minimum), 25%, 50% (median), 75%, 100% (maximum). [5.299s][info][gc,stats] [5.299s][info][gc,stats] All times are wall-clock times, except per-root-class counters, that are sum over [5.299s][info][gc,stats] all workers. Dividing the over the root stage time estimates parallelism. [5.299s][info][gc,stats] [5.299s][info][gc,stats] Pacing delays are measured from entering the pacing code till exiting it. 
Therefore, [5.299s][info][gc,stats] observed pacing delays may be higher than the threshold when paced thread spent more [5.299s][info][gc,stats] time in the pacing code. It usually happens when thread is de-scheduled while paced, [5.299s][info][gc,stats] OS takes longer to unblock the thread, or JVM experiences an STW pause. [5.299s][info][gc,stats] [5.299s][info][gc,stats] Higher delay would prevent application outpacing the GC, but it will hide the GC latencies [5.299s][info][gc,stats] from the STW pause times. Pacing affects the individual threads, and so it would also be [5.299s][info][gc,stats] invisible to the usual profiling tools, but would add up to end-to-end application latency. [5.299s][info][gc,stats] Raise max pacing delay with care. [5.299s][info][gc,stats] [5.299s][info][gc,stats] Concurrent Reset = 0.062 s (a = 2224 us) (n = 28) (lvls, us = 1406, 2109, 2207, 2285, 3562) [5.299s][info][gc,stats] Pause Init Mark (G) = 0.093 s (a = 3311 us) (n = 28) (lvls, us = 3047, 3242, 3281, 3340, 3632) [5.299s][info][gc,stats] Pause Init Mark (N) = 0.084 s (a = 3018 us) (n = 28) (lvls, us = 2910, 2949, 2988, 3027, 3215) [5.299s][info][gc,stats] Update Region States = 0.011 s (a = 401 us) (n = 28) (lvls, us = 361, 391, 398, 404, 445) [5.299s][info][gc,stats] Concurrent Mark Roots = 0.118 s (a = 4207 us) (n = 28) (lvls, us = 4004, 4082, 4160, 4297, 4431) [5.299s][info][gc,stats] CMR: = 5.520 s (a = 197144 us) (n = 28) (lvls, us = 175781, 183594, 195312, 203125, 253494) [5.300s][info][gc,stats] CMR: Thread Roots = 5.485 s (a = 195881 us) (n = 28) (lvls, us = 173828, 183594, 193359, 203125, 252361) [5.300s][info][gc,stats] CMR: VM Strong Roots = 0.014 s (a = 486 us) (n = 28) (lvls, us = 418, 445, 475, 523, 594) [5.300s][info][gc,stats] CMR: CLDG Roots = 0.022 s (a = 777 us) (n = 28) (lvls, us = 549, 604, 633, 711, 4005) [5.300s][info][gc,stats] Concurrent Marking = 0.147 s (a = 5241 us) (n = 28) (lvls, us = 4434, 5020, 5254, 5430, 5866) [5.300s][info][gc,stats] SATB 
Flush Rendezvous = 0.025 s (a = 905 us) (n = 28) (lvls, us = 746, 816, 846, 883, 1458) [5.300s][info][gc,stats] Pause Final Mark (G) = 0.110 s (a = 3931 us) (n = 28) (lvls, us = 3066, 3633, 3809, 4062, 6050) [5.300s][info][gc,stats] Pause Final Mark (N) = 0.101 s (a = 3589 us) (n = 28) (lvls, us = 2754, 3340, 3574, 3730, 4880) [5.300s][info][gc,stats] Finish Mark = 0.050 s (a = 1793 us) (n = 28) (lvls, us = 1387, 1582, 1777, 1914, 2456) [5.300s][info][gc,stats] Update Region States = 0.011 s (a = 388 us) (n = 28) (lvls, us = 373, 377, 381, 395, 434) 3. **-Xlog:gc+phases=info** `[2.735s][info][gc,phases] GC(0) Total SATB flush attempts : 2` ------------- Commit messages: - 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts Changes: https://git.openjdk.org/jdk/pull/20318/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20318&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8336742 Stats: 10 lines in 2 files changed: 9 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20318.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20318/head:pull/20318 PR: https://git.openjdk.org/jdk/pull/20318 From wkemper at openjdk.org Wed Jul 24 23:15:32 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 24 Jul 2024 23:15:32 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: <3giNF29cG4pQlGorfxF1KNVwGz9TNCflQYAsYwf483U=.9ca57760-9d38-43ef-bc0e-038c76a21f58@github.com> On Wed, 24 Jul 2024 19:50:44 GMT, Xiaolong Peng wrote: >> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. 
Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. >> >> This PR includes proposed improvments addressing the known issues: >> 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; >> 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; >> 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. >> 4. When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) >> >> There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): >> >> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" >> >> | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | >> | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | >> | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | >> | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | >> | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | >> | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | >> | 16384 stride | N/A | N/A | N/A | N/A ... 
> > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Add trailing space Nice work! ------------- Marked as reviewed by wkemper (Committer). PR Review: https://git.openjdk.org/jdk/pull/20305#pullrequestreview-2197944235 From ysr at openjdk.org Thu Jul 25 01:07:59 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Thu, 25 Jul 2024 01:07:59 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> Message-ID: On Wed, 24 Jul 2024 15:11:56 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment > > src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 156: > >> 154: bool top_off_collection_set(); >> 155: >> 156: // Return true iff the collection set holds at least one unpinned mixed evacuation candidate > > Same comment as the last one. The comment must not only state the value returned, but also what actions the method takes or side-effect it has on this or another interacting object's state. You want to do that for all imperative methods. (Only accessor methods may have a single line stating what they return; otherwise the effect of the method should be described.) (Applies to all the methods line 150-157.) > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 226: > >> 224: // on its next iteration and run a degenerated young cycle. >> 225: >> 226: // vmop_entry_final_updaterefs rebuilds free set in preparation for next GC. > > Did you mean to say `vmop_entry_final_roots` in this comment? Also, are the comments at line 215 and 226 really needed, since the methods do much more than just rebuilding the free set. 
> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 342: > >> 340: op_final_roots(); >> 341: >> 342: if (heap->mode()->is_generational()) { > > Can this code be factored out into a virtual method `rebuild_free_set()` that `ShenandoahGenerationalHeap` overrides? Also, can you take a look at `op_final_updaterefs` where `rebuild_free_set` is called? There may be an opportunity to refactor the code here into a method as suggested above, and to share that with `rebuild_free_set` perhaps. > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 752: > >> 750: // been set aside to hold objects evacuated from the young-gen collection set. Conservatively, this value >> 751: // equals the entire amount of live young-gen memory within the collection set, even though some of this memory >> 752: // will likely be promoted. > > It sounds like these comments should be moved (and split) into the `prepare_regions_and_collection_set()` header specification, and in its override for `ShenandoahOldGeneration`? The TODO comment at lines 727-734 should similarly move out into the method about which it talks. ------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690632111 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690610219 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690616973 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690620818 From ysr at openjdk.org Thu Jul 25 01:07:59 2024 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna)
Date: Thu, 25 Jul 2024 01:07:59 GMT
Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6]
In-Reply-To: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com>
References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com>
Message-ID:

On Thu, 18 Jul 2024 18:00:25 GMT, Kelvin Nilsen wrote:

>> Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Improve comment

Left a few suggestions; would be great to take care of the re-order thing that confuses the GitHub diffs. Not sure if restoring the old order of those methods in `shenandoahOldHeuristics.cpp` is possible/easy. See comments inline.

Flushing these for now, but will continue in that file and remaining ones in a subsequent session later today/tomorrow.

src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.cpp line 83:

> 81: #ifdef ASSERT
> 82:   if (uint(os::random()) % 100 < ShenandoahCoalesceChance) {
> 83:     return true;

The change in the order of the methods has left the diffs very confused, making it a bit difficult to read. Would it be possible to restore the old order of definition of these methods, unless there was a specific reason to reorder them?
(Maybe the reordering, if important, could be deferred to a separate PR that only does that; just to make reviewers' lives easier :-)

src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 106:

> 104:   size_t _fragmentation_last_old_region;
> 105:
> 106:   // State variables involved in construction of a mixed-evacuation collection set

Can you add a sentence describing when the state is updated, and when it is used, as well as which are inputs to the construction process and which are the results of (or updated upon completion of) the construction process and used as inputs to downstream operations?

src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 141:

> 139:   void choose_collection_set_from_regiondata(ShenandoahCollectionSet* set, RegionData* data, size_t data_size, size_t free) override;
> 140:
> 141:   // Return true iff we need to finalize mixed evacs

In addition to the return value, please add a sentence stating something like "Adds old regions to the collection set." What does "need to finalize mixed evacs" mean? Did you mean: "Returns true if and only if this step should be followed by ... (see method finalize_mixed_evacs)." In general, it's better to write comments such that they do not require chains of lookup (although in some cases it may be unavoidable).

src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 156:

> 154:   bool top_off_collection_set();
> 155:
> 156:   // Return true iff the collection set holds at least one unpinned mixed evacuation candidate

Same comment as the last one. The comment must not only state the value returned, but also what actions the method takes or what side effects it has on this or another interacting object's state. You want to do that for all imperative methods. (Only accessor methods may have a single line stating what they return; otherwise the effect of the method should be described.)
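As an illustration of the commenting convention requested in the review comments above, here is a toy class whose imperative method documents its effect as well as its return value, while the accessors get a single line. `CandidateQueue` and its members are invented for this example and do not describe what `top_off_collection_set()` or any other Shenandoah method actually does.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy type, used only to show the comment style.
class CandidateQueue {
  std::vector<size_t> _pending;    // sizes (bytes) of candidates not yet selected
  size_t _selected_bytes = 0;      // total size of candidates selected so far

public:
  explicit CandidateQueue(std::vector<size_t> sizes) : _pending(std::move(sizes)) {}

  // Accessor: returns the number of candidates not yet selected.
  size_t pending() const { return _pending.size(); }

  // Accessor: returns the total size in bytes of the selected candidates.
  size_t selected_bytes() const { return _selected_bytes; }

  // Moves candidates from the back of the pending list into the selected set
  // for as long as the selected total stays within `budget` bytes. Updates
  // `_pending` and `_selected_bytes`; has no other side effects.
  // Returns true iff this call selected at least one candidate.
  bool top_off(size_t budget) {
    bool moved = false;
    while (!_pending.empty() && _selected_bytes + _pending.back() <= budget) {
      _selected_bytes += _pending.back();
      _pending.pop_back();
      moved = true;
    }
    return moved;
  }
};
```

A reader of the header can tell from the comment alone which state `top_off` changes and what a `true` result means, without following a chain of other methods.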
src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 226: > 224: // on its next iteration and run a degenerated young cycle. > 225: > 226: // vmop_entry_final_updaterefs rebuilds free set in preparation for next GC. Did you mean to say `vmop_entry_final_roots` in this comment? src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 342: > 340: op_final_roots(); > 341: > 342: if (heap->mode()->is_generational()) { Can this code be factored out into a virtual method `rebuild_free_set()` that `ShenandoahGenerationalHeap` overrides? src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 752: > 750: // been set aside to hold objects evacuated from the young-gen collection set. Conservatively, this value > 751: // equals the entire amount of live young-gen memory within the collection set, even though some of this memory > 752: // will likely be promoted. It sounds like these comments should be moved (and split) into the `prepare_regions_and_collection_set()` header specification, and in its override for `ShenandoahOldGeneration`? ------------- Changes requested by ysr (Committer). PR Review: https://git.openjdk.org/shenandoah/pull/395#pullrequestreview-2197025746 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690638256 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1689989807 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1689995437 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1689997626 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690605746 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690609502 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1690620191 From ysr at openjdk.org Thu Jul 25 01:33:32 2024 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 25 Jul 2024 01:33:32 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 19:50:44 GMT, Xiaolong Peng wrote: >> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. >> >> This PR includes proposed improvments addressing the known issues: >> 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; >> 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; >> 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. >> 4. 
When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) >> >> There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): >> >> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" >> >> | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | >> | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | >> | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | >> | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | >> | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | >> | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | >> | 16384 stride | N/A | N/A | N/A | N/A ... > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Add trailing space Marked as reviewed by ysr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20305#pullrequestreview-2198093550 From shade at openjdk.org Thu Jul 25 08:34:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 08:34:34 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 19:25:24 GMT, William Kemper wrote: >> src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp line 571: >> >>> 569: /* ==== Apply keep-alive barrier, if required (e.g., to inhibit weak reference resurrection) ==== */ >>> 570: if (ShenandoahBarrierSet::need_keep_alive_barrier(decorators, type)) { >>> 571: if (ShenandoahSATBBarrier) { >> >> A bit weird to replace IU with SATB barrier here. 
> > @shipilev [The original code](https://github.com/openjdk/jdk/pull/20316/files#diff-cb01b36a8c7017c9e21645a0ff9075897e5bfa67ae37d4f0d69ccc582656ec31L71) used the function `iu_barrier` to emit the pre-write barrier for both SATB and IU modes. This only happened in the `ppc` port. Other platforms just invoke the function to emit the `satb_write_barrier` directly here (see [x86](https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp#L580), for example). > > @kdnilsen No, `need_keep_alive_barrier` may be true in SATB mode. The use of the barrier here is to make sure weak references that get loaded during mark are added to the SATB buffer. Oh, okay. So that weirdness is pre-existing, fine. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1691057726 From shade at openjdk.org Thu Jul 25 09:07:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 09:07:33 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: <03bSRAN8T28AU2-M4IzjsBygTwG4SHrc8HUIJYLM5TE=.e4299a87-b25f-471b-9f6e-2c08e741c6f2@github.com> References: <03bSRAN8T28AU2-M4IzjsBygTwG4SHrc8HUIJYLM5TE=.e4299a87-b25f-471b-9f6e-2c08e741c6f2@github.com> Message-ID: On Wed, 24 Jul 2024 19:31:04 GMT, William Kemper wrote: >> We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. >> >> ## Testing >> * hotspot_gc_shenandoah >> * dacapo >> * diluvian >> * extremem >> * hyperalloc >> * specjbb2015 >> * specjvm2008 > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Remove unintentional new line I like this. 
Consider another thing to clean up: src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 122: > 120: > 121: ShenandoahMarkRefsClosure mark_cl(q, rp); > 122: ShenandoahSATBAndRemarkThreadsClosure tc(satb_mq_set, nullptr); Looks like `ShenandoahSATBAndRemarkThreadsClosure` can be considerably simplified, now that we do not pass any closure to it. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20316#pullrequestreview-2198733772 PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1691098698 From shade at openjdk.org Thu Jul 25 12:06:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 12:06:33 GMT Subject: RFR: 8334315: Shenandoah: reduce GC logging noise In-Reply-To: References: Message-ID: On Wed, 19 Jun 2024 15:11:52 GMT, Kelvin Nilsen wrote: > Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs. Any reason we are not integrating this? Still noise in GC logs :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/19795#issuecomment-2250161972 From shade at openjdk.org Thu Jul 25 14:26:44 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 14:26:44 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah Message-ID: While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. 
This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. Additional testing: - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (100x) ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/20328/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8279016 Stats: 29 lines in 4 files changed: 29 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20328/head:pull/20328 PR: https://git.openjdk.org/jdk/pull/20328 From shade at openjdk.org Thu Jul 25 14:39:56 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 14:39:56 GMT Subject: RFR: 8337213: Shenandoah: Add verification for class mirrors Message-ID: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> When working in Leyden, I noticed some oddities when loading Class mirrors from the CDS archive. These are not caught by Shenandoah asserts or verifier, since the problem is due to wrong metadata. We can amend verification to test these. 
Additional testing: - [ ] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` ------------- Commit messages: - Advanced verify Changes: https://git.openjdk.org/jdk/pull/20332/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20332&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8337213 Stats: 34 lines in 2 files changed: 34 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20332.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20332/head:pull/20332 PR: https://git.openjdk.org/jdk/pull/20332 From xpeng at openjdk.org Thu Jul 25 15:59:33 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 25 Jul 2024 15:59:33 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 19:50:44 GMT, Xiaolong Peng wrote: >> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. >> >> This PR includes proposed improvments addressing the known issues: >> 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; >> 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; >> 3. 
When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. >> 4. When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) >> >> There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): >> >> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" >> >> | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | >> | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | >> | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | >> | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | >> | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | >> | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | >> | 16384 stride | N/A | N/A | N/A | N/A ... > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Add trailing space Thank you all for the reviews! I'll integrate the PR. 
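
For readers following the thread, the four rules in the quoted PR description can be sketched as below. This is only an illustration in Java, not the HotSpot C++ change itself: the class name `RegionStride`, the method names, and the ceiling-division formula for the derived stride are assumptions made for this sketch. Only the 4096-region threshold, the "0 means auto-derive" convention, and the single-worker shortcut come from the description.

```java
// Hypothetical illustration of the stride heuristics described in the PR text.
// None of these names exist in HotSpot; the real change is in shenandoahHeap.cpp.
final class RegionStride {
    // Threshold quoted in the PR description: at or below this, stay single-threaded.
    static final int SINGLE_THREAD_THRESHOLD = 4096;

    /** True when the described rules would hand the work to worker threads. */
    static boolean useWorkers(int numRegions, int activeWorkers) {
        return numRegions > SINGLE_THREAD_THRESHOLD && activeWorkers > 1;
    }

    /**
     * Stride to use for one iteration. A configured stride of 0 means "derive it".
     * The even split via ceiling division is an assumption of this sketch.
     */
    static int effectiveStride(int configuredStride, int numRegions, int activeWorkers) {
        if (configuredStride != 0) {
            return configuredStride;          // explicit user setting wins
        }
        if (!useWorkers(numRegions, activeWorkers)) {
            return numRegions;                // one chunk, processed by the current thread
        }
        return (numRegions + activeWorkers - 1) / activeWorkers; // even split, rounded up
    }

    public static void main(String[] args) {
        System.out.println(effectiveStride(0, 8192, 4));
        System.out.println(effectiveStride(0, 2048, 8));
    }
}
```

With a derived stride every active worker gets roughly one chunk, which matches the stated goal of using all active workers once parallelism is worth its orchestration cost.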
------------- PR Comment: https://git.openjdk.org/jdk/pull/20305#issuecomment-2250782835 From duke at openjdk.org Thu Jul 25 15:59:33 2024 From: duke at openjdk.org (duke) Date: Thu, 25 Jul 2024 15:59:33 GMT Subject: RFR: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate [v2] In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 19:50:44 GMT, Xiaolong Peng wrote: >> [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. >> >> This PR includes proposed improvments addressing the known issues: >> 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; >> 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; >> 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. >> 4. 
When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) >> >> There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): >> >> JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" >> >> | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | >> | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | >> | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | >> | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | >> | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | >> | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | >> | 16384 stride | N/A | N/A | N/A | N/A ... > > Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision: > > Add trailing space @pengxiaolong Your change (at version dcc1d29f8b4028bcf5e29ddd3c38e6ff1603b293) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/20305#issuecomment-2250784927 From xpeng at openjdk.org Thu Jul 25 16:07:40 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Thu, 25 Jul 2024 16:07:40 GMT Subject: Integrated: 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 00:42:22 GMT, Xiaolong Peng wrote: > [parallel_heap_region_iterate](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp#L1726-L1734) is used to execute lightweight operations on heap regions, including ShenandoahPrepareForMarkClosure, ShenandoahInitMarkUpdateRegionStateClosure, ShenandoahFinalUpdateRefsUpdateRegionStateClosure, ShenandoahResetUpdateRegionStateClosure and ShenandoahFinalMarkUpdateRegionStateClosure. Since all the operations are very lightweight, in regular cases w/o large number of heap regions, the parallelism seems to be an overkill because the cost of multi-thread orchestrating could be more expensive; In most cases, single thread should be more efficient. Also, if multiple threading is needed, we should maximize the utilization of all active workers for best performance. > > This PR includes proposed improvments addressing the known issues: > 1. Change the default value of ShenandoahParallelRegionStride to 0, when it is 0, Shenandoah will auto derive the value of stride for best performance; > 2. if num_regions is <= 4096, not use worker threads at all to avoid the overhead of multi-threading; > 3. When num_regions is more than 4096, use worker threads to parallelize the workload, derive the value of stride to evenly distribute the workload to all active workers. > 4. 
When number of active workers is 1, don't bother the workers, it is faster to finish the workload in current thread(avoid overhead of multi-threads orchestration) > > There are some time metrics I collected from test with TIP version(I added time metrics for parallel_heap_region_iterate): > > JVM args: export JAVA_OPTS="-Xms8G -Xmx8G -XX:+AlwaysPreTouch -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahParallelRegionStride= -XX:ShenandoahTargetNumRegions= -Xlog:gc*" > > | | 1024 regions | 2048 regions | 4096 regions | 8192 regions |16384 regions | > | ----------- | ------------ | ------------ | ------------ | ------------ |------------ | > | 1024 stride | 5785 ns | 22194 ns | 20953 ns | 23008 ns |33013 ns | > | 2048 stride | N/A | 6491 ns | 22476 ns | 25842 ns |34378 ns | > | 4096 stride | N/A | N/A | 14034 ns | 28425 ns |36324 ns | > | 8192 stride | N/A | N/A | N/A | 24359 ns |45231 ns | > | 16384 stride | N/A | N/A | N/A | N/A |53679 ns | > > Basically w... This pull request has now been integrated. Changeset: e74edbae Author: Xiaolong Peng Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/e74edbaea9f09169f597a470f647f3b7d10cc71b Stats: 21 lines in 2 files changed: 14 ins; 0 del; 7 mod 8336640: Shenandoah: Parallel worker use in parallel_heap_region_iterate Reviewed-by: shade, wkemper, ysr ------------- PR: https://git.openjdk.org/jdk/pull/20305 From shade at openjdk.org Thu Jul 25 16:17:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 16:17:33 GMT Subject: RFR: 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 19:15:55 GMT, Neethu Prasad wrote: > **Notes** > This PR adds the following > 1. info logging on number of SATB flush attempts > 2. total time spend on handshaking all threads requesting them to flush their SATB buffers. 
> > As suggested by William in [JDK-8336742 ](https://bugs.openjdk.org/browse/JDK-83367420), we can use handshake logging to get time spend and other stats for each handshake. > > [4.515s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1035, Total completion time: 597004 ns > [4.517s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1033, Total completion time: 207402 ns > > > **Testing** > 1. tier1, tier2 and hotspot_gc_shenandoah tests. > 2. **-Xlog:gc+stats=info** > > > [4.058s][info][gc,stats ] Concurrent Marking = 0.080 s (a = 5351 us) (n = 15) (lvls, us = 4746, 5000, 5156, 5684, 5988) > [4.058s][info][gc,stats ] SATB Flush Rendezvous = 0.013 s (a = 860 us) (n = 15) (lvls, us = 764, 814, 836, 885, 961) > [4.058s][info][gc,stats ] Pause Final Mark (G) = 0.058 s (a = 3839 us) (n = 15) (lvls, us = 3047, 3320, 3867, 4121, 4930) > [4.058s][info][gc,stats ] Pause Final Mark (N) = 0.054 s (a = 3592 us) (n = 15) (lvls, us = 2812, 3047, 3574, 3887, 4597) > [4.058s][info][gc,stats ] Finish Mark = 0.028 s (a = 1843 us) (n = 15) (lvls, us = 1602, 1641, 1816, 1934, 2045) > [4.058s][info][gc,stats ] Update Region States = 0.006 s (a = 386 us) (n = 15) (lvls, us = 375, 375, 381, 389, 413) > [4.058s][info][gc,stats ] Choose Collection Set = 0.018 s (a = 1186 us) (n = 15) (lvls, us = 609, 619, 1309, 1387, 2109) > [4.058s][info][gc,stats ] Rebuild Free Set = 0.001 s (a = 43 us) (n = 15) (lvls, us = 40, 41, 42, 43, 53) > [4.058s][info][gc,stats ] Concurrent Weak References = 0.007 s (a = 452 us) (n = 15) (lvls, us = 420, 438, 443, 455, 487) > > > on app termination > > > [5.299s][info][gc,stats] GC STATISTICS: > [5.299s][info][gc,stats] "(G)" (gross) pauses include VM time: time to notify and block threads, do the pre- > [5.299s][info][gc,stats] and post-safepoint housekee... Looks like going in right direction. 
src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 234:

> 232: timings->record_phase_time(ShenandoahPhaseTimings::conc_mark_satb_flush_rendezvous, handshake_total_time);
> 233:
> 234: log_info(gc, phases) ("Total SATB flush attempts : %d", (flushes + 1));

Instead of this, I suggest we wrap the `Handshake::execute` with `log_info(gc,phases,start)` and `log_info(gc,phases)`. Logging would tell us how many attempts were done, and give us a rough number how long it took. The benefit would be that we would have the exact timestamps when handshakes started/finished.

src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp line 60:

> 58: SHENANDOAH_PAR_PHASE_DO(conc_mark_roots, " CMR: ", f) \
> 59: f(conc_mark, "Concurrent Marking") \
> 60: f(conc_mark_satb_flush_rendezvous, " SATB Flush Rendezvous") \

The backslash at the end is a bit mis-indented.

-------------

PR Review: https://git.openjdk.org/jdk/pull/20318#pullrequestreview-2199768754
PR Review Comment: https://git.openjdk.org/jdk/pull/20318#discussion_r1691755810
PR Review Comment: https://git.openjdk.org/jdk/pull/20318#discussion_r1691744954

From wkemper at openjdk.org Thu Jul 25 16:39:33 2024
From: wkemper at openjdk.org (William Kemper)
Date: Thu, 25 Jul 2024 16:39:33 GMT
Subject: RFR: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration
In-Reply-To:
References:
Message-ID:

On Wed, 24 Jul 2024 09:10:35 GMT, Aleksey Shipilev wrote:

> There is a problem in interaction between STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in `ShCWREUT` constructor, proceed to STS and block, any waiter that waits for the end of iteration would block until safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g. Sweeper (not really in mainline, but we seen a case in JDK 17), or a JIT compiler patching code, then we would have a deadlock.
> > A solution is to initialize iteration only immediately before the use of iterator. This will pass any pending STSes and give much less risk for deadlock, as long as nothing in nmethod handling safepoints. > > Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare. > > Additional testing: > - [x] Reproducer on JDK 17u with this patch applied now does not deadlock > - [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` Marked as reviewed by wkemper (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20309#pullrequestreview-2199845759 From wkemper at openjdk.org Thu Jul 25 16:45:31 2024 From: wkemper at openjdk.org (William Kemper) Date: Thu, 25 Jul 2024 16:45:31 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah In-Reply-To: References: Message-ID: <1Rij3dn3jAEt5t0LwLv5XlnKgBWPhDVYPIePboN0DtY=.c276fca8-7c5e-4948-bc83-a98a24002cec@github.com> On Thu, 25 Jul 2024 11:26:45 GMT, Aleksey Shipilev wrote: > While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). > > Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. 
> > I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. > > This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. > > Additional testing: > - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) > - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (100x) Marked as reviewed by wkemper (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20328#pullrequestreview-2199862309 From wkemper at openjdk.org Thu Jul 25 17:10:05 2024 From: wkemper at openjdk.org (William Kemper) Date: Thu, 25 Jul 2024 17:10:05 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v3] In-Reply-To: References: Message-ID: > We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. 
> > ## Testing > * hotspot_gc_shenandoah > * dacapo > * diluvian > * extremem > * hyperalloc > * specjbb2015 > * specjvm2008 William Kemper has updated the pull request incrementally with one additional commit since the last revision: Simplify final mark now that incremental update mode is removed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20316/files - new: https://git.openjdk.org/jdk/pull/20316/files/ec2d6b64..79ceade3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=01-02 Stats: 13 lines in 1 file changed: 0 ins; 9 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/20316.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20316/head:pull/20316 PR: https://git.openjdk.org/jdk/pull/20316 From shade at openjdk.org Thu Jul 25 17:28:31 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 17:28:31 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v3] In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 17:10:05 GMT, William Kemper wrote: >> We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. >> >> ## Testing >> * hotspot_gc_shenandoah >> * dacapo >> * diluvian >> * extremem >> * hyperalloc >> * specjbb2015 >> * specjvm2008 > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Simplify final mark now that incremental update mode is removed Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/20316#pullrequestreview-2199950416 From shade at openjdk.org Thu Jul 25 18:21:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 25 Jul 2024 18:21:34 GMT Subject: RFR: 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 16:14:39 GMT, Aleksey Shipilev wrote: >> **Notes** >> This PR adds the following >> 1. info logging on number of SATB flush attempts >> 2. total time spend on handshaking all threads requesting them to flush their SATB buffers. >> >> As suggested by William in [JDK-8336742 ](https://bugs.openjdk.org/browse/JDK-83367420), we can use handshake logging to get time spend and other stats for each handshake. >> >> [4.515s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1035, Total completion time: 597004 ns >> [4.517s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1033, Total completion time: 207402 ns >> >> >> **Testing** >> 1. tier1, tier2 and hotspot_gc_shenandoah tests. >> 2. 
**-Xlog:gc+stats=info** >> >> >> [4.058s][info][gc,stats ] Concurrent Marking = 0.080 s (a = 5351 us) (n = 15) (lvls, us = 4746, 5000, 5156, 5684, 5988) >> [4.058s][info][gc,stats ] SATB Flush Rendezvous = 0.013 s (a = 860 us) (n = 15) (lvls, us = 764, 814, 836, 885, 961) >> [4.058s][info][gc,stats ] Pause Final Mark (G) = 0.058 s (a = 3839 us) (n = 15) (lvls, us = 3047, 3320, 3867, 4121, 4930) >> [4.058s][info][gc,stats ] Pause Final Mark (N) = 0.054 s (a = 3592 us) (n = 15) (lvls, us = 2812, 3047, 3574, 3887, 4597) >> [4.058s][info][gc,stats ] Finish Mark = 0.028 s (a = 1843 us) (n = 15) (lvls, us = 1602, 1641, 1816, 1934, 2045) >> [4.058s][info][gc,stats ] Update Region States = 0.006 s (a = 386 us) (n = 15) (lvls, us = 375, 375, 381, 389, 413) >> [4.058s][info][gc,stats ] Choose Collection Set = 0.018 s (a = 1186 us) (n = 15) (lvls, us = 609, 619, 1309, 1387, 2109) >> [4.058s][info][gc,stats ] Rebuild Free Set = 0.001 s (a = 43 us) (n = 15) (lvls, us = 40, 41, 42, 43, 53) >> [4.058s][info][gc,stats ] Concurrent Weak References = 0.007 s (a = 452 us) (n = 15) (lvls, us = 420, 438, 443, 455, 487) >> >> >> on app termination >> >> >> [5.299s][info][gc,stats] GC STATISTICS: >> [5.299s][info][gc,stats] "(G)" (gross) pauses include VM time: time to notify and block threads, do the p... > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 234: > >> 232: timings->record_phase_time(ShenandoahPhaseTimings::conc_mark_satb_flush_rendezvous, handshake_total_time); >> 233: >> 234: log_info(gc, phases) ("Total SATB flush attempts : %d", (flushes + 1)); > > Instead of this, I suggest we wrap the `Handshake::execute` with `log_info(gc,phases,start)` and `log_info(gc,phases)`. Logging would tell us how many attempts were done, and give us a rough number how long it took. The benefit would be that we would have the exact timestamps when handshakes started/finished. Hm. 
Maybe even better to wrap it with:

    {
      ShenandoahTimingsTracker t(ShenandoahPhaseTimings::conc_mark_satb_flush_rendezvous, true);
      Handshake::execute(&flush_satb);
    }

Note the last, new argument that allows to merge multiple events into one. We need to separately make `ShenandoahTimingsTracker` report `gc,phases` log statement, imo.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20318#discussion_r1691943083

From shade at openjdk.org Thu Jul 25 19:03:32 2024
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 25 Jul 2024 19:03:32 GMT
Subject: RFR: 8335865: Shenandoah: Improve THP pretouch after JDK-8315923
In-Reply-To:
References:
Message-ID:

On Fri, 19 Jul 2024 14:28:24 GMT, Neethu Prasad wrote:

> Ran [TestTransparentHugePageUsage](https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a) for Shenandoah and verified that test passed

I think that test works only with Serial, it hardcodes the GC config.

But I think THP pretouch still does not work well :( Look:

    % cat Wait.java
    public class Wait {
      public static void main(String... s) throws Throwable {
        System.out.println("Ready");
        System.in.read();
      }
    }

JDK 17.0.12 baseline:

    % java -Xmx100g -Xms100g -XX:+UseShenandoahGC -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch Wait.java &
    [1] 20386
    % for I in `seq 1 10`; do cat /proc/`pidof java`/smaps | grep AnonHugePages | sort | uniq; sleep 5; done
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB
    AnonHugePages: 106496000 kB

Mainline without this patch maps very slowly:

    % build/linux-aarch64-server-release/images/jdk/bin/java -Xmx100g -Xms100g -XX:+UseShenandoahGC -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch Wait.java &
    % for I in `seq 1 10`; do cat /proc/`pidof java`/smaps | grep AnonHugePages | sort | uniq; sleep 5; done
    AnonHugePages: 131072 kB
    AnonHugePages: 131072 kB
    AnonHugePages: 147456 kB
    AnonHugePages: 163840 kB
    AnonHugePages: 163840 kB
    AnonHugePages: 180224 kB
    AnonHugePages: 180224 kB
    AnonHugePages: 196608 kB
    AnonHugePages: 196608 kB
    AnonHugePages: 212992 kB

Mainline *with this patch* still gets mapped very slowly:

    % build/linux-aarch64-server-release/images/jdk/bin/java -Xmx100g -Xms100g -XX:+UseShenandoahGC -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch Wait.java &
    % for I in `seq 1 10`; do cat /proc/`pidof java`/smaps | grep AnonHugePages | sort | uniq; sleep 5; done
    AnonHugePages: 16384 kB
    AnonHugePages: 32768 kB
    AnonHugePages: 32768 kB
    AnonHugePages: 49152 kB
    AnonHugePages: 49152 kB
    AnonHugePages: 65536 kB
    AnonHugePages: 65536 kB
    AnonHugePages: 81920 kB

Please see what still goes wrongly?
-------------

PR Comment: https://git.openjdk.org/jdk/pull/20254#issuecomment-2251205753

From nprasad at openjdk.org Thu Jul 25 21:06:31 2024
From: nprasad at openjdk.org (Neethu Prasad)
Date: Thu, 25 Jul 2024 21:06:31 GMT
Subject: RFR: 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts
In-Reply-To:
References:
Message-ID:

On Thu, 25 Jul 2024 18:18:46 GMT, Aleksey Shipilev wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 234:
>>
>>> 232: timings->record_phase_time(ShenandoahPhaseTimings::conc_mark_satb_flush_rendezvous, handshake_total_time);
>>> 233:
>>> 234: log_info(gc, phases) ("Total SATB flush attempts : %d", (flushes + 1));
>>
>> Instead of this, I suggest we wrap the `Handshake::execute` with `log_info(gc,phases,start)` and `log_info(gc,phases)`. Logging would tell us how many attempts were done, and give us a rough number how long it took. The benefit would be that we would have the exact timestamps when handshakes started/finished.
>
> Hm. Maybe even better to wrap it with:
>
>
> {
>   ShenandoahTimingsTracker t(ShenandoahPhaseTimings::conc_mark_satb_flush_rendezvous, true);
>   Handshake::execute(&flush_satb);
> }
>
>
> Note the last, new argument that allows to merge multiple events into one. We need to separately make `ShenandoahTimingsTracker` report `gc,phases` log statement, imo.

Do we want something like the output below for the conc_mark_satb_flush_rendezvous phase? The log line is not useful for the other usages of ShenandoahTimingsTracker, so I'm thinking of logging only when the boolean should_aggregate is set. There will be two log lines when the number of flush attempts is 2.

[3.475s][info][gc,phases ] GC(14) Recorded time for phase SATB Flush Rendezvous as 0.615854 ms
[3.476s][info][gc,phases ] GC(14) Recorded time for phase SATB Flush Rendezvous as 0.357069 ms

We can also have separate log lines for start and end.
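
To make the proposal concrete, here is a minimal Java analogue of a scoped tracker that logs only when aggregation is requested. It is a sketch under stated assumptions, not the HotSpot `ShenandoahTimingsTracker`: the names `PhaseTimings`, `Scope` and `track`, the injected clock, and the choice to overwrite the recorded value silently in the non-aggregating case are all invented for illustration. Only the `should_aggregate` idea and the log wording come from the message above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.LongSupplier;

// Hypothetical analogue of a scoped phase timer; not the HotSpot class.
final class PhaseTimings {
    /** AutoCloseable without a checked exception, so it fits try-with-resources cleanly. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    private long totalNanos;
    private int samples;
    final List<String> log = new ArrayList<>();   // stands in for the gc,phases log

    /** Starts timing now; the returned scope records the elapsed time when closed. */
    Scope track(String phase, boolean shouldAggregate, LongSupplier clock) {
        final long start = clock.getAsLong();
        return () -> {
            long elapsed = clock.getAsLong() - start;
            if (shouldAggregate) {
                // Merge repeated events into one phase total and log each attempt.
                totalNanos += elapsed;
                samples++;
                log.add(String.format(Locale.ROOT,
                        "Recorded time for phase %s as %.6f ms", phase, elapsed / 1e6));
            } else {
                // Assumption of this sketch: plain trackers overwrite and stay silent.
                totalNanos = elapsed;
                samples = 1;
            }
        };
    }

    long totalNanos() { return totalNanos; }
    int samples() { return samples; }
}
```

A caller would wrap each handshake in `try (PhaseTimings.Scope s = timings.track("SATB Flush Rendezvous", true, System::nanoTime)) { ... }`, so two flush attempts yield two log lines and one summed phase time.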
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20318#discussion_r1692122427 From egahlin at openjdk.org Thu Jul 25 21:09:31 2024 From: egahlin at openjdk.org (Erik Gahlin) Date: Thu, 25 Jul 2024 21:09:31 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 11:26:45 GMT, Aleksey Shipilev wrote: > While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). > > Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. > > I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. > > This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
> > Additional testing: > - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) > - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 60: > 58: addToTestField(); > 59: > 60: // Let GC catch up before we stop the recording and do the old object sample We have worked hard to remove Thread.sleep, in the JFR tests, so we can run them in parallel, i.e. -conc:64, without introducing failures. I think the test was written before event streaming existed. The test could perhaps be rewritten to use an event stream, i.e: try (RecordingStream rs = new RecordingStream()) { rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity"); rs.onEvent(e -> { if (hasValidField(e)) { rs.close(); } }); rs.startAsync(); addToTestField(); rs.awaitTermination(); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1692125586 From wkemper at openjdk.org Thu Jul 25 21:39:06 2024 From: wkemper at openjdk.org (William Kemper) Date: Thu, 25 Jul 2024 21:39:06 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v4] In-Reply-To: References: Message-ID: > We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. > > ## Testing > * hotspot_gc_shenandoah > * dacapo > * diluvian > * extremem > * hyperalloc > * specjbb2015 > * specjvm2008 William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains eight additional commits since the last revision: - Merge remote-tracking branch 'openjdk/master' into remove-incremental-update-mode - Simplify final mark now that incremental update mode is removed - Remove unintentional new line - Remove last vestiges of incremental update mode - Missed test, remove actual IU barrier flag - Remove missed iu_barrier usages for C1 - Update test (all barriers can be enabled now for all modes) - WIP: Remove incremental update mode ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20316/files - new: https://git.openjdk.org/jdk/pull/20316/files/79ceade3..287b6187 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20316&range=02-03 Stats: 10966 lines in 271 files changed: 7634 ins; 1955 del; 1377 mod Patch: https://git.openjdk.org/jdk/pull/20316.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20316/head:pull/20316 PR: https://git.openjdk.org/jdk/pull/20316 From wkemper at openjdk.org Thu Jul 25 23:21:55 2024 From: wkemper at openjdk.org (William Kemper) Date: Thu, 25 Jul 2024 23:21:55 GMT Subject: RFR: 8337241: Shenandoah: Normalize include guards Message-ID: Some of the include guards are not like the others. 
------------- Commit messages: - Fix idiosyncratic include guards Changes: https://git.openjdk.org/jdk/pull/20342/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20342&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8337241 Stats: 12 lines in 4 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/20342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20342/head:pull/20342 PR: https://git.openjdk.org/jdk/pull/20342 From wkemper at openjdk.org Thu Jul 25 23:24:08 2024 From: wkemper at openjdk.org (William Kemper) Date: Thu, 25 Jul 2024 23:24:08 GMT Subject: RFR: 8337242: GenShen: Remove unnecessary copyright changes Message-ID: The Amazon copyright has been added to some files that we've hardly changed. Removing the copyright reduces the total number of files changed in the forthcoming generational mode pull request. ------------- Commit messages: - Remove unnecessary Amazon copyright from little changed files Changes: https://git.openjdk.org/shenandoah/pull/462/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=462&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8337242 Stats: 6 lines in 6 files changed: 1 ins; 5 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/462.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/462/head:pull/462 PR: https://git.openjdk.org/shenandoah/pull/462 From wkemper at openjdk.org Fri Jul 26 00:02:38 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 00:02:38 GMT Subject: Integrated: 8336685: Shenandoah: Remove experimental incremental update mode In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 18:08:46 GMT, William Kemper wrote: > We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. > > ## Testing > * hotspot_gc_shenandoah > * dacapo > * diluvian > * extremem > * hyperalloc > * specjbb2015 > * specjvm2008 This pull request has now been integrated. 
Changeset: 0584af23 Author: William Kemper URL: https://git.openjdk.org/jdk/commit/0584af23255b6b8f49190eaf2618f3bcc299adfe Stats: 1708 lines in 69 files changed: 4 ins; 1668 del; 36 mod 8336685: Shenandoah: Remove experimental incremental update mode Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/20316 From ysr at openjdk.org Fri Jul 26 01:09:39 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Fri, 26 Jul 2024 01:09:39 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v4] In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 21:39:06 GMT, William Kemper wrote: >> We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. >> >> ## Testing >> * hotspot_gc_shenandoah >> * dacapo >> * diluvian >> * extremem >> * hyperalloc >> * specjbb2015 >> * specjvm2008 > > William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge remote-tracking branch 'openjdk/master' into remove-incremental-update-mode > - Simplify final mark now that incremental update mode is removed > - Remove unintentional new line > - Remove last vestiges of incremental update mode > - Missed test, remove actual IU barrier flag > - Remove missed iu_barrier usages for C1 > - Update test (all barriers can be enabled now for all modes) > - WIP: Remove incremental update mode I'd left my review comments instead of flushing them yesterday. Not sure if they are still relevant, but here they go... fwiw. 
src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 339: > 337: product(bool, ShenandoahIUBarrier, false, DIAGNOSTIC, \ > 338: "Turn on/off I-U barriers barriers in Shenandoah") \ > 339: \ Not your change, but these doc comments should really say `"Enable blah-blah ..."` rather than `"Turn on/off blah-blah..."`. ------------- PR Review: https://git.openjdk.org/jdk/pull/20316#pullrequestreview-2198135931 PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690703426 From ysr at openjdk.org Fri Jul 26 01:09:40 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Fri, 26 Jul 2024 01:09:40 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: <03bSRAN8T28AU2-M4IzjsBygTwG4SHrc8HUIJYLM5TE=.e4299a87-b25f-471b-9f6e-2c08e741c6f2@github.com> References: <03bSRAN8T28AU2-M4IzjsBygTwG4SHrc8HUIJYLM5TE=.e4299a87-b25f-471b-9f6e-2c08e741c6f2@github.com> Message-ID: On Wed, 24 Jul 2024 19:31:04 GMT, William Kemper wrote: >> We've reason to believe that this mode is very rarely used and its maintenance has become a burden for future development. >> >> ## Testing >> * hotspot_gc_shenandoah >> * dacapo >> * diluvian >> * extremem >> * hyperalloc >> * specjbb2015 >> * specjvm2008 > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Remove unintentional new line src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 122: > 120: > 121: ShenandoahMarkRefsClosure mark_cl(q, rp); > 122: ShenandoahSATBAndRemarkThreadsClosure tc(satb_mq_set, nullptr); Because this is the only c'tor usage of this closure, may be get rid of the second argument altogether, and clean up its `do_thread()` method further above at lines 84-89 ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1690701759 From nprasad at openjdk.org Fri Jul 26 01:15:33 2024 From: nprasad at openjdk.org (Neethu Prasad) Date: Fri, 26 Jul 2024 01:15:33 GMT Subject: RFR: 8335865: Shenandoah: Improve THP pretouch after JDK-8315923 In-Reply-To: References: Message-ID: On Fri, 19 Jul 2024 14:28:24 GMT, Neethu Prasad wrote: > **Notes** > os::pretouch is now using madvice now when available and has a fall back to using vm page size [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923) > Hence removing code that sets _pretouch_heap_page_size & _pretouch_bitmap_page_size in Shenandoah. > > **Testing** > > * Ran test in Linux 5.10 and Linux 6.x and confirmed that there is no regression. I could not replicate the issue or performance improvement though. [add results] > * Ran [TestTransparentHugePageUsage](https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a) for Shenandoah and verified that test passed > * Ran tier 1, tier 2 , tier1_gc_shenandoah, tier2_gc_shenandoah, tier3_gc_shenandoah and hotspot_gc_shenandoah. > > Ran [TestTransparentHugePageUsage](https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a) for Shenandoah and verified that test passed > > I think that test works only with Serial, it hardcodes the GC config. > > But I think THP pretouch still does not work well :( Look: > > ``` > % cat Wait.java > public class Wait { > public static void main(String... 
s) throws Throwable { > System.out.println("Ready"); > System.in.read(); > } > } > ``` > > JDK 17.0.12 baseline: > > ``` > % java -Xmx100g -Xms100g -XX:+UseShenandoahGC -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch Wait.java & > [1] 20386 > > % for I in `seq 1 10`; do cat /proc/`pidof java`/smaps | grep AnonHugePages | sort | uniq; sleep 5; done > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > AnonHugePages: 106496000 kB > ``` > > Mainline without this patch maps very slowly, this is what [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923) is about, AFAIU: > > ``` > % build/linux-aarch64-server-release/images/jdk/bin/java -Xmx100g -Xms100g -XX:+UseShenandoahGC -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch Wait.java & > > % for I in `seq 1 10`; do cat /proc/`pidof java`/smaps | grep AnonHugePages | sort | uniq; sleep 5; done > AnonHugePages: 131072 kB > AnonHugePages: 131072 kB > AnonHugePages: 147456 kB > AnonHugePages: 163840 kB > AnonHugePages: 163840 kB > AnonHugePages: 180224 kB > AnonHugePages: 180224 kB > AnonHugePages: 196608 kB > AnonHugePages: 196608 kB > AnonHugePages: 212992 kB > ``` > > But mainline _with this patch_ still gets mapped very slowly: > > ``` > % build/linux-aarch64-server-release/images/jdk/bin/java -Xmx100g -Xms100g -XX:+UseShenandoahGC -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch Wait.java & > > % for I in `seq 1 10`; do cat /proc/`pidof java`/smaps | grep AnonHugePages | sort | uniq; sleep 5; done > AnonHugePages: 16384 kB > AnonHugePages: 32768 kB > AnonHugePages: 32768 kB > AnonHugePages: 49152 kB > AnonHugePages: 49152 kB > AnonHugePages: 65536 kB > AnonHugePages: 65536 kB > AnonHugePages: 81920 kB > ``` > > Please see what still goes wrongly? I'll take a look. 
I'd updated the [TestTransparentHugePageUsage](https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a) to use Shenandoah. So it was not run with Serial GC. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20254#issuecomment-2251770454 From ysr at openjdk.org Fri Jul 26 01:51:58 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Fri, 26 Jul 2024 01:51:58 GMT Subject: RFR: 8337242: GenShen: Remove unnecessary copyright changes In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 23:18:49 GMT, William Kemper wrote: > The Amazon copyright has been added to some files that we've hardly changed. Removing the copyright reduces the total number of files changed in the forthcoming generational mode pull request. One confusing blank line, but otherwise good. src/hotspot/share/gc/shenandoah/shenandoahEvacOOMHandler.inline.hpp line 29: > 27: > 28: #include "gc/shenandoah/shenandoahEvacOOMHandler.hpp" > 29: Why the blank line? ------------- Marked as reviewed by ysr (Committer). PR Review: https://git.openjdk.org/shenandoah/pull/462#pullrequestreview-2200742201 PR Review Comment: https://git.openjdk.org/shenandoah/pull/462#discussion_r1692376420 From xpeng at openjdk.org Fri Jul 26 06:46:32 2024 From: xpeng at openjdk.org (Xiaolong Peng) Date: Fri, 26 Jul 2024 06:46:32 GMT Subject: RFR: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 09:10:35 GMT, Aleksey Shipilev wrote: > There is a problem in interaction between STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in `ShCWREUT` constructor, proceed to STS and block, any waiter that waits for the end of iteration would block until safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g.
Sweeper (not really in mainline, but we seen a case in JDK 17), or a JIT compiler patching code, then we would have a deadlock. > > A solution is to initialize iteration only immediately before the use of iterator. This will pass any pending STSes and give much less risk for deadlock, as long as nothing in nmethod handling safepoints. > > Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare. > > Additional testing: > - [x] Reproducer on JDK 17u with this patch applied now does not deadlock > - [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` This is really good, but I bet my review doesn't really count. :) ------------- Marked as reviewed by xpeng (Author). PR Review: https://git.openjdk.org/jdk/pull/20309#pullrequestreview-2201055970 From shade at openjdk.org Fri Jul 26 07:56:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 07:56:33 GMT Subject: RFR: 8337241: Shenandoah: Normalize include guards In-Reply-To: References: Message-ID: <43xxolYdYfiVpy5fNZFEW6nGQngPXzrezYok2ODhp_w=.5987b214-261f-4e32-9135-87f29e9b6de2@github.com> On Thu, 25 Jul 2024 23:15:31 GMT, William Kemper wrote: > Some of the include guards are not like the others. Looks fine, with a nit. src/hotspot/share/gc/shenandoah/shenandoahController.hpp line 105: > 103: void update_gc_id(); > 104: }; > 105: #endif //SHARE_GC_SHENANDOAH_SHENANDOAHCONTROLLER_HPP Should have a space after `//`? ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/20342#pullrequestreview-2201170905 PR Review Comment: https://git.openjdk.org/jdk/pull/20342#discussion_r1692658169 From shade at openjdk.org Fri Jul 26 08:39:34 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 08:39:34 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 21:06:30 GMT, Erik Gahlin wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
>> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 60: > >> 58: addToTestField(); >> 59: >> 60: // Let GC catch up before we stop the recording and do the old object sample > > We have worked hard to remove Thread.sleep, in the JFR tests, so we can run them in parallel, i.e. -conc:64, without introducing failures. I think the test was written before event streaming existed. The test could perhaps be rewritten to use an event stream, i.e: > > try (RecordingStream rs = new RecordingStream()) { > rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity"); > rs.onEvent(e -> { > if (hasValidField(e)) { > rs.close(); > } > }); > rs.startAsync(); > addToTestField(); > rs.awaitTermination(); > } I agree using `sleep` here is not great. But I could not find a better way to do it. When I was looking at this test, I noticed that `OldObjectSample` event is only emitted when recording is stopped: https://github.com/openjdk/jdk/blob/7f11935461a6ff1ef0fac800fb4264a519e21061/src/jdk.jfr/share/classes/jdk/jfr/internal/OldObjectSample.java#L41-L42 -- this is why I thought to add `sleep` right before we stop the recording. So going async does not really help, because of this circularity: we only stop the recording when event is received, which we would not receive until recording is stopped. 
Sure enough, this times out: public static void main(String[] args) throws Exception { WhiteBox.setWriteAllObjectSamples(true); try (RecordingStream rs = new RecordingStream()) { rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity"); rs.onEvent(e -> { try { if (hasValidField(e)) { rs.close(); } } catch (Exception ex) { ex.printStackTrace(); } }); rs.startAsync(); addToTestField(); rs.awaitTermination(); } } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1692709095 From shade at openjdk.org Fri Jul 26 09:30:31 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 09:30:31 GMT Subject: RFR: 8335865: Shenandoah: Improve THP pretouch after JDK-8315923 In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 01:11:58 GMT, Neethu Prasad wrote: >> Please see what still goes wrongly? > I'll take a look. Check if this really works with e.g. Parallel first? ------------- PR Comment: https://git.openjdk.org/jdk/pull/20254#issuecomment-2252338076 From rkennke at openjdk.org Fri Jul 26 11:17:30 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 26 Jul 2024 11:17:30 GMT Subject: RFR: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration In-Reply-To: References: Message-ID: <_6nWwuQkPdkrwsBA574a1GajwI_QfKqy7mbEI-MQRS8=.aaabcdca-0247-455b-8652-5fdba4830929@github.com> On Wed, 24 Jul 2024 09:10:35 GMT, Aleksey Shipilev wrote: > There is a problem in interaction between STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in `ShCWREUT` constructor, proceed to STS and block, any waiter that waits for the end of iteration would block until safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g. Sweeper (not really in mainline, but we seen a case in JDK 17), or a JIT compiler patching code, then we would have a deadlock. 
> > A solution is to initialize iteration only immediately before the use of iterator. This will pass any pending STSes that we have seen could deadlock. As long as nothing in nmethod handling safepoints, the deadlock is resolved. I think `NoSafepointVerifier` would give us a warning if something safepoints while iteration is on. > > Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare. > > Additional testing: > - [x] Reproducer on JDK 17u with this patch applied now does not deadlock > - [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` Looks good to me! Thanks! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20309#pullrequestreview-2201587027 From shade at openjdk.org Fri Jul 26 11:23:39 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 11:23:39 GMT Subject: RFR: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 09:10:35 GMT, Aleksey Shipilev wrote: > There is a problem in interaction between STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in `ShCWREUT` constructor, proceed to STS and block, any waiter that waits for the end of iteration would block until safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g. Sweeper (not really in mainline, but we seen a case in JDK 17), or a JIT compiler patching code, then we would have a deadlock. > > A solution is to initialize iteration only immediately before the use of iterator. This will pass any pending STSes that we have seen could deadlock. As long as nothing in nmethod handling safepoints, the deadlock is resolved. 
I think `NoSafepointVerifier` would give us a warning if something safepoints while iteration is on. > > Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare. > > Additional testing: > - [x] Reproducer on JDK 17u with this patch applied now does not deadlock > - [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` Thank you! ------------- PR Comment: https://git.openjdk.org/jdk/pull/20309#issuecomment-2252547704 From shade at openjdk.org Fri Jul 26 11:23:39 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 11:23:39 GMT Subject: Integrated: 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration In-Reply-To: References: Message-ID: On Wed, 24 Jul 2024 09:10:35 GMT, Aleksey Shipilev wrote: > There is a problem in interaction between STS taken in `ShenandoahConcurrentWeakRootsEvacUpdateTask::work` and nmethod list iteration. If we start the nmethod iteration in `ShCWREUT` constructor, proceed to STS and block, any waiter that waits for the end of iteration would block until safepoint is over. If that waiter waits without a safepoint check, like anyone waiting on `CodeCache_lock`, e.g. Sweeper (not really in mainline, but we seen a case in JDK 17), or a JIT compiler patching code, then we would have a deadlock. > > A solution is to initialize iteration only immediately before the use of iterator. This will pass any pending STSes that we have seen could deadlock. As long as nothing in nmethod handling safepoints, the deadlock is resolved. I think `NoSafepointVerifier` would give us a warning if something safepoints while iteration is on. > > Unfortunately, it is really hard to reproduce in mainline, since the absence of Sweeper makes this deadlock exceedingly rare. 
> > Additional testing: > - [x] Reproducer on JDK 17u with this patch applied now does not deadlock > - [x] Linux AArch64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` This pull request has now been integrated. Changeset: 2aeb12ec Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/2aeb12ec03944c777d617d0be48982fd225b16e7 Stats: 72 lines in 4 files changed: 25 ins; 33 del; 14 mod 8334482: Shenandoah: Deadlock when safepoint is pending during nmethods iteration Reviewed-by: wkemper, xpeng, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/20309 From egahlin at openjdk.org Fri Jul 26 11:43:35 2024 From: egahlin at openjdk.org (Erik Gahlin) Date: Fri, 26 Jul 2024 11:43:35 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 08:36:30 GMT, Aleksey Shipilev wrote: >> test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 60: >> >>> 58: addToTestField(); >>> 59: >>> 60: // Let GC catch up before we stop the recording and do the old object sample >> >> We have worked hard to remove Thread.sleep, in the JFR tests, so we can run them in parallel, i.e. -conc:64, without introducing failures. I think the test was written before event streaming existed. The test could perhaps be rewritten to use an event stream, i.e: >> >> try (RecordingStream rs = new RecordingStream()) { >> rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity"); >> rs.onEvent(e -> { >> if (hasValidField(e)) { >> rs.close(); >> } >> }); >> rs.startAsync(); >> addToTestField(); >> rs.awaitTermination(); >> } > > I agree using `sleep` here is not great. But I could not find a better way to do it. 
> > When I was looking at this test, I noticed that `OldObjectSample` event is only emitted when recording is stopped: https://github.com/openjdk/jdk/blob/7f11935461a6ff1ef0fac800fb4264a519e21061/src/jdk.jfr/share/classes/jdk/jfr/internal/OldObjectSample.java#L41-L42 -- this is why I thought to add `sleep` right before we stop the recording. > > So going async does not really help, because of this circularity: we only stop the recording when event is received, which we would not receive until recording is stopped. Sure enough, this times out: > > > public static void main(String[] args) throws Exception { > WhiteBox.setWriteAllObjectSamples(true); > > try (RecordingStream rs = new RecordingStream()) { > rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity"); > rs.onEvent(e -> { > try { > if (hasValidField(e)) { > rs.close(); > } > } catch (Exception ex) { > ex.printStackTrace(); > } > }); > > rs.startAsync(); > addToTestField(); > rs.awaitTermination(); > } > } I see the problem now. As I understand it, this only happens with Shenandoah. Would it be possible to sleep only when using that GC? Or perhaps repeat on failure, increasing the sleep time with a few seconds for every iteration, starting with 0s? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1692942006 From shade at openjdk.org Fri Jul 26 11:58:31 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 11:58:31 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 11:40:51 GMT, Erik Gahlin wrote: >> I agree using `sleep` here is not great. But I could not find a better way to do it.
>> >> When I was looking at this test, I noticed that `OldObjectSample` event is only emitted when recording is stopped: https://github.com/openjdk/jdk/blob/7f11935461a6ff1ef0fac800fb4264a519e21061/src/jdk.jfr/share/classes/jdk/jfr/internal/OldObjectSample.java#L41-L42 -- this is why I thought to add `sleep` right before we stop the recording. >> >> So going async does not really help, because of this circularity: we only stop the recording when event is received, which we would not receive until recording is stopped. Sure enough, this times out: >> >> >> public static void main(String[] args) throws Exception { >> WhiteBox.setWriteAllObjectSamples(true); >> >> try (RecordingStream rs = new RecordingStream()) { >> rs.enable(EventNames.OldObjectSample).withoutStackTrace().with("cutoff", "infinity"); >> rs.onEvent(e -> { >> try { >> if (hasValidField(e)) { >> rs.close(); >> } >> } catch (Exception ex) { >> ex.printStackTrace(); >> } >> }); >> >> rs.startAsync(); >> addToTestField(); >> rs.awaitTermination(); >> } >> } > > I see the problem now. > > As I understand it, this only happens with Shenandoah. Would it be possible to sleep only when using that GC? Or perhaps repeat on failure, increasing the sleep time with a few seconds for every iteration, starting with 0s? Sure, I can try that. I would not like to specialize the test for Shenandoah, but retries with backoffs sound fine.
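The "retries with backoffs" idea discussed above could look roughly like the sketch below. It is an assumption-laden illustration, not the change that was made to the test: `BackoffRetry` and `retryWithGrowingDelay` are hypothetical names, and the attempt callback stands in for "run the recording, sleep for the given delay before stopping it, then check for a valid event". The delay starts at 0 and grows by a fixed step per iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical helper for illustration only; not the code in TestFieldInformation.
class BackoffRetry {
    // Calls attempt with delays 0, step, 2*step, ... and stops at the first success.
    // The attempt itself decides how to use the delay (e.g. sleep before stopping a recording).
    static boolean retryWithGrowingDelay(int maxAttempts, long stepMillis, LongPredicate attempt) {
        for (int i = 0; i < maxAttempts; i++) {
            long delayMillis = i * stepMillis;
            if (attempt.test(delayMillis)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Demo attempt: records the delays it was offered and "succeeds" once
        // the delay reaches two seconds. It does not actually sleep.
        List<Long> seen = new ArrayList<>();
        boolean ok = retryWithGrowingDelay(5, 1000, d -> {
            seen.add(d);
            return d >= 2000;
        });
        System.out.println("succeeded=" + ok + ", delays tried=" + seen);
    }
}
```

Starting at a zero delay keeps the common case as fast as the version without `sleep`, so collectors that never needed the wait would not pay for it.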
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1692956428 From wkemper at openjdk.org Fri Jul 26 14:15:18 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 14:15:18 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: Merges tag jdk-24+8 ------------- Commit messages: - 8336741: Optimize LocalTime.toString with StringBuilder.repeat - 8334232: Optimize C1 classes layout - 8337067: Test runtime/classFileParserBug/Bad_NCDFE_Msg.java won't compile - 8336755: Remove unused UNALIGNED field from view buffers - 8336847: Use pattern match switch in NumberFormat classes - 8336787: Examine java.text.Format API for implSpec usage - 8336679: Add @implSpec for the default implementations in Process.waitFor() - 8336815: Several methods in java.net.Socket and ServerSocket do not specify behavior when already bound, connected or closed - 8336316: JFR: Use SettingControl::getValue() instead of setValue() for ActiveSetting event - 8336485: jdk/jfr/jcmd/TestJcmdView.java RuntimeException: 'Invoked Concurrent' missing from stdout/stderr - ... 
and 149 more: https://git.openjdk.org/shenandoah/compare/b363de8c...0898ab7f The webrev contains the conflicts with master: - merge conflicts: https://webrevs.openjdk.org/?repo=shenandoah&pr=463&range=00.conflicts Changes: https://git.openjdk.org/shenandoah/pull/463/files Stats: 11673 lines in 480 files changed: 7497 ins; 2241 del; 1935 mod Patch: https://git.openjdk.org/shenandoah/pull/463.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/463/head:pull/463 PR: https://git.openjdk.org/shenandoah/pull/463 From wkemper at openjdk.org Fri Jul 26 15:32:38 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 15:32:38 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v2] In-Reply-To: References: <03bSRAN8T28AU2-M4IzjsBygTwG4SHrc8HUIJYLM5TE=.e4299a87-b25f-471b-9f6e-2c08e741c6f2@github.com> Message-ID: On Thu, 25 Jul 2024 02:24:43 GMT, Y. Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unintentional new line > > src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 122: > >> 120: >> 121: ShenandoahMarkRefsClosure mark_cl(q, rp); >> 122: ShenandoahSATBAndRemarkThreadsClosure tc(satb_mq_set, nullptr); > > Because this is the only c'tor usage of this closure, may be get rid of the second argument altogether, and clean up its `do_thread()` method further above at lines 84-89 ? Right, @shipilev also pointed this out. I've since cleaned it up. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1693258254 From wkemper at openjdk.org Fri Jul 26 15:32:41 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 15:32:41 GMT Subject: RFR: 8336685: Shenandoah: Remove experimental incremental update mode [v4] In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 02:27:20 GMT, Y. 
Srinivas Ramakrishna wrote: >> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: >> >> - Merge remote-tracking branch 'openjdk/master' into remove-incremental-update-mode >> - Simplify final mark now that incremental update mode is removed >> - Remove unintentional new line >> - Remove last vestiges of incremental update mode >> - Missed test, remove actual IU barrier flag >> - Remove missed iu_barrier usages for C1 >> - Update test (all barriers can be enabled now for all modes) >> - WIP: Remove incremental update mode > > src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp line 339: > >> 337: product(bool, ShenandoahIUBarrier, false, DIAGNOSTIC, \ >> 338: "Turn on/off I-U barriers barriers in Shenandoah") \ >> 339: \ > > Not your change, but these doc comments should really say `"Enable blah-blah ..."` rather than `"Turn on/off blah-blah..."`. Yes, next time we're in this file. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20316#discussion_r1693259588 From wkemper at openjdk.org Fri Jul 26 15:45:59 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 15:45:59 GMT Subject: RFR: 8337242: GenShen: Remove unnecessary copyright changes In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 01:48:06 GMT, Y. Srinivas Ramakrishna wrote: >> The Amazon copyright has been added to some files that we've hardly changed. Removing the copyright reduces the total number of files changed in the forthcoming generational mode pull request. > > src/hotspot/share/gc/shenandoah/shenandoahEvacOOMHandler.inline.hpp line 29: > >> 27: >> 28: #include "gc/shenandoah/shenandoahEvacOOMHandler.hpp" >> 29: > > Why the blank line? Just eliminates 1-line diff from upstream. 
------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/462#discussion_r1693275533 From wkemper at openjdk.org Fri Jul 26 15:47:04 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 15:47:04 GMT Subject: RFR: 8337241: Shenandoah: Normalize include guards [v2] In-Reply-To: References: Message-ID: > Some of the include guards are not like the others. William Kemper has updated the pull request incrementally with one additional commit since the last revision: Add missing whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20342/files - new: https://git.openjdk.org/jdk/pull/20342/files/cbf28d70..eabbddfc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20342&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20342&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/20342.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20342/head:pull/20342 PR: https://git.openjdk.org/jdk/pull/20342 From shade at openjdk.org Fri Jul 26 15:47:04 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 15:47:04 GMT Subject: RFR: 8337241: Shenandoah: Normalize include guards [v2] In-Reply-To: References: Message-ID: <53PvbF-0EUZ6qVjG-4ZpBgKbV-NzhFcnn59jpxkJsJo=.618e57b4-4860-4846-a9c6-978f6dc8ccc1@github.com> On Fri, 26 Jul 2024 15:43:35 GMT, William Kemper wrote: >> Some of the include guards are not like the others. > > William Kemper has updated the pull request incrementally with one additional commit since the last revision: > > Add missing whitespace Still good and trivial. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/20342#pullrequestreview-2202171078 From wkemper at openjdk.org Fri Jul 26 15:47:04 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 15:47:04 GMT Subject: RFR: 8337241: Shenandoah: Normalize include guards [v2] In-Reply-To: <43xxolYdYfiVpy5fNZFEW6nGQngPXzrezYok2ODhp_w=.5987b214-261f-4e32-9135-87f29e9b6de2@github.com> References: <43xxolYdYfiVpy5fNZFEW6nGQngPXzrezYok2ODhp_w=.5987b214-261f-4e32-9135-87f29e9b6de2@github.com> Message-ID: On Fri, 26 Jul 2024 07:53:22 GMT, Aleksey Shipilev wrote: >> William Kemper has updated the pull request incrementally with one additional commit since the last revision: >> >> Add missing whitespace > > src/hotspot/share/gc/shenandoah/shenandoahController.hpp line 105: > >> 103: void update_gc_id(); >> 104: }; >> 105: #endif //SHARE_GC_SHENANDOAH_SHENANDOAHCONTROLLER_HPP > > Should have a space after `//`? Aye, good catch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20342#discussion_r1693274580 From shade at openjdk.org Fri Jul 26 15:48:06 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 15:48:06 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v2] In-Reply-To: References: Message-ID: > While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). > > Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. 
This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. > > I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. > > This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. > > Additional testing: > - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) > - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Try to make tests faster / more robust ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20328/files - new: https://git.openjdk.org/jdk/pull/20328/files/0219ad66..bac48a15 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=00-01 Stats: 42 lines in 2 files changed: 32 ins; 5 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/20328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20328/head:pull/20328 PR: https://git.openjdk.org/jdk/pull/20328 From duke at openjdk.org Fri Jul 26 15:50:34 2024 From: duke at openjdk.org (duke) Date: Fri, 26 Jul 2024 15:50:34 GMT Subject: RFR: 8334315: Shenandoah: reduce GC logging noise In-Reply-To: References: Message-ID: On Wed, 19 Jun 2024 15:11:52 GMT, Kelvin Nilsen wrote: > Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs. @kdnilsen Your change (at version 2a74ecc458cdadf86ab72e96776fbd04c3d0def0) is now ready to be sponsored by a Committer. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/19795#issuecomment-2253036972 From kdnilsen at openjdk.org Fri Jul 26 15:57:36 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 15:57:36 GMT Subject: Integrated: 8334315: Shenandoah: reduce GC logging noise In-Reply-To: References: Message-ID: On Wed, 19 Jun 2024 15:11:52 GMT, Kelvin Nilsen wrote: > Qualify certain log messages as ergo or debug to remove extraneous noise from regular gc logs. This pull request has now been integrated. Changeset: f35af717 Author: Kelvin Nilsen Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/f35af7171213c09107b99104a73022853bff91b0 Stats: 17 lines in 1 file changed: 0 ins; 0 del; 17 mod 8334315: Shenandoah: reduce GC logging noise Reviewed-by: wkemper, ysr, shade ------------- PR: https://git.openjdk.org/jdk/pull/19795 From wkemper at openjdk.org Fri Jul 26 16:00:00 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 16:00:00 GMT Subject: Integrated: 8337242: GenShen: Remove unnecessary copyright changes In-Reply-To: References: Message-ID: <_WroPByAE0U2qX35Y3wPz8FC8O78XMRNgq6YCUQQlO4=.ca786bee-c0d9-4060-a2dc-c14d1a82da54@github.com> On Thu, 25 Jul 2024 23:18:49 GMT, William Kemper wrote: > The Amazon copyright has been added to some files that we've hardly changed. Removing the copyright reduces the total number of files changed in the forthcoming generational mode pull request. This pull request has now been integrated. 
Changeset: 4f96b5d2 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/4f96b5d29652c7d24adca6d77804926f2bb5ef55 Stats: 6 lines in 6 files changed: 1 ins; 5 del; 0 mod 8337242: GenShen: Remove unnecessary copyright changes Reviewed-by: ysr ------------- PR: https://git.openjdk.org/shenandoah/pull/462 From wkemper at openjdk.org Fri Jul 26 16:00:38 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 16:00:38 GMT Subject: Integrated: 8337241: Shenandoah: Normalize include guards In-Reply-To: References: Message-ID: <922UvdG6ZTUDPDxelhSZZKxkOowypnfybpyaALfcTKg=.50af8634-51f5-4a31-b5b2-b013e31890b2@github.com> On Thu, 25 Jul 2024 23:15:31 GMT, William Kemper wrote: > Some of the include guards are not like the others. This pull request has now been integrated. Changeset: 4f194f10 Author: William Kemper URL: https://git.openjdk.org/jdk/commit/4f194f10a1481cdc9df4e6338f6cabd07a34c84c Stats: 12 lines in 4 files changed: 0 ins; 0 del; 12 mod 8337241: Shenandoah: Normalize include guards Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/20342 From egahlin at openjdk.org Fri Jul 26 16:05:32 2024 From: egahlin at openjdk.org (Erik Gahlin) Date: Fri, 26 Jul 2024 16:05:32 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v2] In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 15:48:06 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. 
But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. >> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Try to make tests faster / more robust Marked as reviewed by egahlin (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20328#pullrequestreview-2202203961 From shade at openjdk.org Fri Jul 26 16:05:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 16:05:33 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v2] In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 11:55:48 GMT, Aleksey Shipilev wrote: >> I see the problem now. >> >> As I understand it, this only happens with Shenandoah. Would it be possible to sleep only when using that GC? Or perhaps repeat on failure, increasing the sleep time with a few seconds for every iteration, starting with 0s? > > Sure, I can try that. I would not like to specialize the test for Shenandoah, but retries with backoffs sound fine.
New version with backoffs posted. I am running stress tests to see if Shenandoah and G1 work well with them. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1693297605 From wkemper at openjdk.org Fri Jul 26 16:52:29 2024 From: wkemper at openjdk.org (William Kemper) Date: Fri, 26 Jul 2024 16:52:29 GMT Subject: RFR: Merge openjdk/jdk:master [v2] In-Reply-To: References: Message-ID: > Merges tag jdk-24+8 William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 160 commits: - Merge branch 'master' into merge-jdk-24+8 - 8336741: Optimize LocalTime.toString with StringBuilder.repeat Reviewed-by: liach, rriggs, naoto - 8334232: Optimize C1 classes layout Reviewed-by: phh, kvn - 8337067: Test runtime/classFileParserBug/Bad_NCDFE_Msg.java won't compile Reviewed-by: lfoltan - 8336755: Remove unused UNALIGNED field from view buffers Reviewed-by: alanb, liach, bpb - 8336847: Use pattern match switch in NumberFormat classes Reviewed-by: liach, naoto - 8336787: Examine java.text.Format API for implSpec usage Reviewed-by: liach - 8336679: Add @implSpec for the default implementations in Process.waitFor() Reviewed-by: bpb, jlu, liach - 8336815: Several methods in java.net.Socket and ServerSocket do not specify behavior when already bound, connected or closed Reviewed-by: alanb - 8336316: JFR: Use SettingControl::getValue() instead of setValue() for ActiveSetting event Reviewed-by: mgronlun - ... 
and 150 more: https://git.openjdk.org/shenandoah/compare/4f96b5d2...173f0c42 ------------- Changes: https://git.openjdk.org/shenandoah/pull/463/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah&pr=463&range=01 Stats: 11677 lines in 480 files changed: 7499 ins; 2242 del; 1936 mod Patch: https://git.openjdk.org/shenandoah/pull/463.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/463/head:pull/463 PR: https://git.openjdk.org/shenandoah/pull/463 From shade at openjdk.org Fri Jul 26 17:56:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 17:56:32 GMT Subject: RFR: 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts In-Reply-To: References: Message-ID: On Thu, 25 Jul 2024 21:03:54 GMT, Neethu Prasad wrote: >> Hm. Maybe even better to wrap it with: >> >> >> { >> ShenandoahTimingsTracker t(ShenandoahPhaseTimings::conc_mark_satb_flush_rendezvous, true); >> Handshake::execute(&flush_satb); >> } >> >> >> Note the last, new argument that allows to merge multiple events into one. We need to separately make `ShenandoahTimingsTracker` report `gc,phases` log statement, imo. > > Do we want something like below for conc_mark_satb_flush_rendezvous phase? > The log line is not useful for all the other usage of ShenandoahTimingTracker. So I'm thinking of logging only when boolean should_aggregate is set. > > There will be two log lines when number of flush attempts = 2. > > > [3.475s][info][gc,phases ] GC(14) Recorded time for phase SATB Flush Rendezvous as 0.615854 ms > [3.476s][info][gc,phases ] GC(14) Recorded time for phase SATB Flush Rendezvous as 0.357069 ms > > We can also have separate log line for start and end. I don't think we need this phase tracking to print anything special, really. What we should probably do is to feed it into `ShenandoahTimingsTracker`, and let the rest of the infra deal with it as it sees fit. Nominally, show it in `gc+stats`. 
Later, we would need to consider printing `gc,phases` logging straight from `ShenandoahTimingsTracker`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20318#discussion_r1693421509 From shade at openjdk.org Fri Jul 26 18:26:51 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 26 Jul 2024 18:26:51 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v3] In-Reply-To: References: Message-ID: > While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). > > Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. > > I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. > > This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
> > Additional testing: > - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) > - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Referencing INCLUDE_SHENANDOAHGC nominally requires macro.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20328/files - new: https://git.openjdk.org/jdk/pull/20328/files/bac48a15..b3faa095 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/20328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20328/head:pull/20328 PR: https://git.openjdk.org/jdk/pull/20328 From kdnilsen at openjdk.org Fri Jul 26 19:25:26 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 19:25:26 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v7] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Better comments as requested by code review ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/21a5d328..02ea5660 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=06 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=05-06 Stats: 20 lines in 2 files changed: 14 ins; 1 del; 5 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Fri Jul 26 19:25:28 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 19:25:28 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> Message-ID: On Wed, 24 Jul 2024 15:07:18 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment > > src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 106: > >> 104: size_t _fragmentation_last_old_region; >> 105: >> 106: // State variables involved in construction of a mixed-evacuation collection set > > Can you add a sentence describing when the state is updated, and when it is used, as well as which are inputs to the construction process and which the results of (or updated upon completion of) the construction process and used as inputs to downstream operations? Thanks. Done. 
> src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 141: > >> 139: void choose_collection_set_from_regiondata(ShenandoahCollectionSet* set, RegionData* data, size_t data_size, size_t free) override; >> 140: >> 141: // Return true iff we need to finalize mixed evacs > > In addition to the return value, please add a sentence stating something like "Adds old regions to the collection set." What does "need to finalize mixed evacs" mean? Did you mean: "Returns true if and only if this step should be followed by ... (see method finalize_mixed_evacs)." In general, it's better to write comments such that they do not require chains of lookup (although in some cases it may be unavoidable). Yes. Very good suggestions. Done. ------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693495597 PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693495840 From kdnilsen at openjdk.org Fri Jul 26 19:25:29 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 19:25:29 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> Message-ID: <5-2IZbE4IHkaDg40EZy5qoXp_4XB5wBta32_K41iCug=.ab456a0a-8bd6-44b6-a5c3-cf2c4eefca19@github.com> On Thu, 25 Jul 2024 00:54:46 GMT, Y. Srinivas Ramakrishna wrote: >> src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.hpp line 156: >> >>> 154: bool top_off_collection_set(); >>> 155: >>> 156: // Return true iff the collection set holds at least one unpinned mixed evacuation candidate >> >> Same comment as the last one. The comment must not only state the value returned, but also what actions the method takes or side-effect it has on this or another interacting object's state. You want to do that for all imperative methods. 
(Only accessor methods may have a single line stating what they return; otherwise the effect of the method should be described.) > > (Applies to all the methods line 150-157.) Here too. Done. Thanks. ------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693496020 From kdnilsen at openjdk.org Fri Jul 26 23:04:53 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 23:04:53 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v8] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Simplify invocations of freeset rebuild when possible Most invocations do not need to resize generations between prepare_to_rebuild() and finish_rebuild(), so no need to make these two independent invocations. 
------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/02ea5660..10588379 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=07 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=06-07 Stats: 37 lines in 6 files changed: 12 ins; 20 del; 5 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Fri Jul 26 23:12:51 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 23:12:51 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> Message-ID: On Thu, 25 Jul 2024 00:10:40 GMT, Y. Srinivas Ramakrishna wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 226: >> >>> 224: // on its next iteration and run a degenerated young cycle. >>> 225: >>> 226: // vmop_entry_final_updaterefs rebuilds free set in preparation for next GC. >> >> Did you mean to say `vmop_entry_final_roots` in this comment? > > Also, are the comments at line 215 and 226 really needed, since the methods do much more than just rebuilding the free set. I'll remove both of those comments. You are right, I accidentally copy/pasted inappropriate comment in the second instance. Thanks. 
------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693681705 From kdnilsen at openjdk.org Fri Jul 26 23:19:34 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 23:19:34 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v9] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove incorrect and unnecessary comments ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/10588379..7f01a7f3 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=08 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=07-08 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Fri Jul 26 23:33:00 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 23:33:00 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> Message-ID: On Thu, 25 Jul 2024 00:24:18 GMT, Y. Srinivas Ramakrishna wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 342: >> >>> 340: op_final_roots(); >>> 341: >>> 342: if (heap->mode()->is_generational()) { >> >> Can this code be factored out into a virtual method `rebuild_free_set()` that `ShenandoahGenerationalHeap` overrides? 
> > Also, can you take a look at `op_final_updaterefs` where `rebuild_free_set` is called? There may be an opportunity to refactor the code here into a method as suggested above, and to share that with `rebuild_free_set` perhaps. Thanks for asking these questions. The existing ShenandoahHeap::rebuild_free_set() method can be called from here instead of this code. I'm making this change. ------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693689110 From kdnilsen at openjdk.org Fri Jul 26 23:36:27 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 23:36:27 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v10] In-Reply-To: References: Message-ID: > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Simplify code to rebuild free set after abbreviated and old GC ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/7f01a7f3..e009c35a Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=09 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=08-09 Stats: 24 lines in 1 file changed: 0 ins; 18 del; 6 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Fri Jul 26 23:46:59 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 26 Jul 2024 23:46:59 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> 
Message-ID: On Thu, 25 Jul 2024 00:31:59 GMT, Y. Srinivas Ramakrishna wrote: >> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 752: >> >>> 750: // been set aside to hold objects evacuated from the young-gen collection set. Conservatively, this value >>> 751: // equals the entire amount of live young-gen memory within the collection set, even though some of this memory >>> 752: // will likely be promoted. >> >> It sounds like these comments should be moved (and split) into the `prepare_regions_and_collection_set()` header specification, and in its override for `ShenandoahOldGeneration`? > > The TODO comment at lines 727-734 should similarly move out into the method about which it talks. Thanks. I've moved these comments to more appropriate places. ------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693693844 From kdnilsen at openjdk.org Sat Jul 27 14:25:24 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sat, 27 Jul 2024 14:25:24 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v11] In-Reply-To: References: Message-ID: <1TjPllmhUHFwC4qLSncC33Lqpue9jIRBNY4X9rUEirw=.e34b1841-874f-4aea-a2e2-adb10613a730@github.com> > Allow young-gen Collector reserve to share memory with old-gen Collector reserve in order to support prompt processing of mixed evacuations, as constrained by ShenandoahOldEvacRatioPercent. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Cleanups requested by code review ------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/395/files - new: https://git.openjdk.org/shenandoah/pull/395/files/e009c35a..699f4098 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=10 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=395&range=09-10 Stats: 259 lines in 4 files changed: 129 ins; 130 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/395.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/395/head:pull/395 PR: https://git.openjdk.org/shenandoah/pull/395 From kdnilsen at openjdk.org Sat Jul 27 14:30:03 2024 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sat, 27 Jul 2024 14:30:03 GMT Subject: RFR: 8325673: GenShen: Share Reserves between Old and Young Collector [v6] In-Reply-To: References: <9ArkWWvSpWGPA84_TYWwLqi4TKlRYnWJUvL033Z9pT0=.91929a46-fc66-47ee-84ad-ac1f19592a44@github.com> Message-ID: On Thu, 25 Jul 2024 01:02:06 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Improve comment > > src/hotspot/share/gc/shenandoah/heuristics/shenandoahOldHeuristics.cpp line 83: > >> 81: #ifdef ASSERT >> 82: if (uint(os::random()) % 100 < ShenandoahCoalesceChance) { >> 83: return true; > > The change in the order of the methods has left the diffs very confused, making it a bit difficult to read. Would it be possible to restore the old order of definition of these methods, unless there was a specific reason to reorder them? (Maybe the reordering, if important, could be deferred to a separate PR that only does that; just to make reviewers' lives easier :-) Sorry for introducing this confusion. I've re-ordered the methods to make the diffs a bit cleaner. Some of the refactoring moved part of prime_collection_set() into a helper method. 
------------- PR Review Comment: https://git.openjdk.org/shenandoah/pull/395#discussion_r1693966020 From rkennke at openjdk.org Mon Jul 29 14:50:43 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 29 Jul 2024 14:50:43 GMT Subject: RFR: 8337213: Shenandoah: Add verification for class mirrors In-Reply-To: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> References: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> Message-ID: <_m5ztc4TsHTf7A8GO0JWEQKoeOOxNp3P0RRx9yXUDN0=.292d7389-a072-4529-8de3-1c9abdeb1a5b@github.com> On Thu, 25 Jul 2024 14:35:37 GMT, Aleksey Shipilev wrote: > When working in Leyden, I noticed some oddities when loading Class mirrors from the CDS archive. These are not caught by Shenandoah asserts or verifier, since the problem is due to wrong metadata. We can amend verification to test these. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC -XX:+ShenandoahVerify` Looks reasonable. Thanks! ------------- Marked as reviewed by rkennke (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/20332#pullrequestreview-2205195325 From rkennke at openjdk.org Mon Jul 29 14:54:51 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 29 Jul 2024 14:54:51 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v3] In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 18:26:51 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. 
But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. >> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Referencing INCLUDE_SHENANDOAHGC nominally requires macro.hpp Changes requested by rkennke (Reviewer). src/hotspot/share/jfr/leakprofiler/chains/pathToGcRootsOperation.cpp line 147: > 145: // mark word uses by Shenandoah itself, if we hit the op during the concurrent > 146: // GC cycle. > 147: return ShenandoahHeap::heap()->is_idle(); I think you need to include shenandoahHeap.inline.hpp for this? Also, do we need to block this during all of GC, even marking? Would it make sense to only block during evac/updaterefs? 
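The question above contrasts two ways of gating the JFR operation: a catch-all "GC is idle" test, and a narrower test that only refuses while forwarded objects may exist (roughly evac/updaterefs). A minimal sketch of that difference follows, assuming a toy GC-state bitmask; the `toy` namespace, the bit names and positions, and the `may_run_*` helpers are hypothetical and are not taken from `ShenandoahHeap`.

```cpp
#include <cassert>
#include <cstdint>

namespace toy {

// Hypothetical GC-state bits; names and positions are illustrative only.
enum GcStateBit : std::uint8_t {
    kHasForwarded = 1u << 0,
    kMarking      = 1u << 1,
    kEvacuation   = 1u << 2,
    kUpdateRefs   = 1u << 3,
};

struct GcState {
    std::uint8_t bits = 0;

    void set(GcStateBit b) { bits = static_cast<std::uint8_t>(bits | b); }

    // Catch-all: no GC phase of any kind is in progress.
    bool is_idle() const { return bits == 0; }

    // Narrower: forwarded (from-space) objects may currently exist.
    bool has_forwarded_objects() const { return (bits & kHasForwarded) != 0; }
};

// Conservative gate: refuse the operation during any GC activity, marking included.
inline bool may_run_catch_all(const GcState& s) { return s.is_idle(); }

// Sharper gate: refuse only while forwarded objects may exist.
inline bool may_run_precise(const GcState& s) {
    // Toy invariant: evacuation and update-refs imply forwarded objects may exist.
    assert((s.bits & (kEvacuation | kUpdateRefs)) == 0 || s.has_forwarded_objects());
    return !s.has_forwarded_objects();
}

} // namespace toy
```

In this model the two gates differ only while marking is in progress: the catch-all gate refuses, the narrower gate allows. Whether that window is actually safe for the Leak Profiler is exactly what the review is asking.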
------------- PR Review: https://git.openjdk.org/jdk/pull/20328#pullrequestreview-2205204781 PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1695377835 From shade at openjdk.org Mon Jul 29 15:34:44 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Jul 2024 15:34:44 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: > While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). > > Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. > > I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. > > This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
> > Additional testing: > - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) > - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Roman's review: more precise GC state check, more includes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20328/files - new: https://git.openjdk.org/jdk/pull/20328/files/b3faa095..e46f7bd5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=02-03 Stats: 7 lines in 1 file changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/20328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20328/head:pull/20328 PR: https://git.openjdk.org/jdk/pull/20328 From shade at openjdk.org Mon Jul 29 15:34:46 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Jul 2024 15:34:46 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v3] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 14:51:49 GMT, Roman Kennke wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Referencing INCLUDE_SHENANDOAHGC nominally requires macro.hpp > > src/hotspot/share/jfr/leakprofiler/chains/pathToGcRootsOperation.cpp line 147: > >> 145: // mark word uses by Shenandoah itself, if we hit the op during the concurrent >> 146: // GC cycle. >> 147: return ShenandoahHeap::heap()->is_idle(); > > I think you need to include shenandoahHeap.inline.hpp for this? Also, do we need to block this during all of GC, even marking? Would it make sense to only block during evac/updaterefs? One of my attempts _did_ have a more sharp `!has_forwarded_objects()` check. But then I chickened out from specializing this, and instead trusting the more catch-all `is_idle()`. 
We can do a more precise test and relax it if we see more problems. And right, we need a sharper include here, added. See new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1695442228 From rkennke at openjdk.org Mon Jul 29 15:44:32 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 29 Jul 2024 15:44:32 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 15:34:44 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
>> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Roman's review: more precise GC state check, more includes Why is it ok to simply skip PathToGcRootsOperation? AFAICT, this is (only) used in EventEmitter::emit(..), to emit events with reference chains. What is the consequence of not doing so during GC? Also, why does JFR see from-space objects to begin with? This should not be allowed. Does JFR use raw loads of references to figure out chains? If so, should it use the proper Access API instead? If not - how does it see from-space refs? ------------- PR Review: https://git.openjdk.org/jdk/pull/20328#pullrequestreview-2205331205 From shade at openjdk.org Mon Jul 29 15:57:30 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 29 Jul 2024 15:57:30 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 15:42:15 GMT, Roman Kennke wrote: > Why is it ok to simply skip PathToGcRootsOperation? AFAICT, this is (only) used in EventEmitter::emit(..), to emit events with reference chains. What is the consequence of not doing so during GC? I think this op is opportunistic, and we bail in the similar way we bail on other conditions in the same method. > Also, why does JFR see from-space objects to begin with? This should not be allowed. Does JFR use raw loads of references to figure out chains? If so, should it use the proper Access API instead? If not - how does it see from-space refs? I don't think JFR sees from-space refs. What happens that JFR sees a to-space object (maybe already passed through LRB, since we can be in evac), _tags it in markword_, and that tag starts to look like a forwarding pointer to Shenandoah. 
So when JFR code goes around and does LRB on that already-to-space object, LRB gets confused. This is why [JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194) shows "Multiple forwardings" as the failure mode. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20328#issuecomment-2256290894 From wkemper at openjdk.org Mon Jul 29 17:01:36 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 29 Jul 2024 17:01:36 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 15:34:44 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
>> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Roman's review: more precise GC state check, more includes Marked as reviewed by wkemper (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20328#pullrequestreview-2205505622 From rkennke at openjdk.org Mon Jul 29 17:12:37 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 29 Jul 2024 17:12:37 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 15:34:44 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. 
ZGC would not see it, since it does not care about mark words for its own operation. >> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Roman's review: more precise GC state check, more includes Marked as reviewed by rkennke (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20328#pullrequestreview-2205529383 From mgronlun at openjdk.org Mon Jul 29 19:51:37 2024 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 29 Jul 2024 19:51:37 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 15:34:44 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. 
All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. >> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Roman's review: more precise GC state check, more includes test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 55: > 53: > 54: // OldObjectSample is emitted at the end of the recording, and we need > 55: // to let GC catch up before we do the event. There is no reason to wait This notion of a "GC catching up" confuses these tests. Why should the GC " catch up" before a recording is stopped? If you need to use this construct here, please specify that it applies only to Shenandoah GC and not to GCs in general. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1695777218 From wkemper at openjdk.org Mon Jul 29 20:28:48 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 29 Jul 2024 20:28:48 GMT Subject: RFR: Merge openjdk/jdk:master [v2] In-Reply-To: References: Message-ID: <9mkuDXA1omdD35jkTokKP6-YDyVc3Tp7qkbinKjax6U=.132a7403-81a9-4019-9274-de0db308e2de@github.com> > Merges tag jdk-24+7 William Kemper has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
------------- Changes: - all: https://git.openjdk.org/shenandoah/pull/461/files - new: https://git.openjdk.org/shenandoah/pull/461/files/21a6cf84..21a6cf84 Webrevs: - full: https://webrevs.openjdk.org/?repo=shenandoah&pr=461&range=01 - incr: https://webrevs.openjdk.org/?repo=shenandoah&pr=461&range=00-01 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/shenandoah/pull/461.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/461/head:pull/461 PR: https://git.openjdk.org/shenandoah/pull/461 From wkemper at openjdk.org Mon Jul 29 20:28:02 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 29 Jul 2024 20:28:02 GMT Subject: Integrated: Merge openjdk/jdk:master In-Reply-To: References: Message-ID: On Fri, 26 Jul 2024 14:10:25 GMT, William Kemper wrote: > Merges tag jdk-24+8 This pull request has now been integrated. Changeset: d3439437 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/d34394378b700bf17b5aeef93dc67c4e555edb98 Stats: 11677 lines in 480 files changed: 7499 ins; 2242 del; 1936 mod Merge ------------- PR: https://git.openjdk.org/shenandoah/pull/463 From duke at openjdk.org Mon Jul 29 20:28:48 2024 From: duke at openjdk.org (duke) Date: Mon, 29 Jul 2024 20:28:48 GMT Subject: Withdrawn: Merge openjdk/jdk:master In-Reply-To: References: Message-ID: On Fri, 19 Jul 2024 14:12:27 GMT, William Kemper wrote: > Merges tag jdk-24+7 This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/shenandoah/pull/461 From wkemper at openjdk.org Mon Jul 29 22:55:30 2024 From: wkemper at openjdk.org (William Kemper) Date: Mon, 29 Jul 2024 22:55:30 GMT Subject: RFR: 8337213: Shenandoah: Add verification for class mirrors In-Reply-To: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> References: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> Message-ID: On Thu, 25 Jul 2024 14:35:37 GMT, Aleksey Shipilev wrote: > When working in Leyden, I noticed some oddities when loading Class mirrors from the CDS archive. These are not caught by Shenandoah asserts or verifier, since the problem is due to wrong metadata. We can amend verification to test these. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC -XX:+ShenandoahVerify` Marked as reviewed by wkemper (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/20332#pullrequestreview-2206234626 From ysr at openjdk.org Tue Jul 30 01:46:32 2024 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Tue, 30 Jul 2024 01:46:32 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 15:34:44 GMT, Aleksey Shipilev wrote: >> While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). >> >> Leak Profiler is notorious in using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. 
But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. >> >> I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. >> >> This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. >> >> Additional testing: >> - [x] `jdk/jfr/event/oldobject/` pass by default (100x times) >> - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoah` (1000x) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Roman's review: more precise GC state check, more includes > Leak Profiler is notorious in using the mark words for its own needs. It seems to already use external storage for some of the leak profiling work. Can it be taught -- at least when Shenandoah is being used -- to just use external storage for its book-keeping/tagging as well to avoid this interference? ------------- PR Comment: https://git.openjdk.org/jdk/pull/20328#issuecomment-2257300304 From shade at openjdk.org Tue Jul 30 08:22:33 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jul 2024 08:22:33 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Tue, 30 Jul 2024 01:43:55 GMT, Y. Srinivas Ramakrishna wrote: > > Leak Profiler is notorious in using the mark words for its own needs. 
> > It seems to already use external storage for some of the leak profiling work. Can it be taught -- at least when Shenandoah is being used -- to just use external storage for its book-keeping/tagging as well to avoid this interference? Yes, but as I mentioned in the PR body, this is a continuous whack-a-mole game. At this point, I believe bailing from unsafe modes is a saner tactic. ------------- PR Comment: https://git.openjdk.org/jdk/pull/20328#issuecomment-2257756422 From shade at openjdk.org Tue Jul 30 08:22:35 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jul 2024 08:22:35 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Mon, 29 Jul 2024 19:48:25 GMT, Markus Grönlund wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Roman's review: more precise GC state check, more includes > > test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 55: > >> 53: >> 54: // OldObjectSample is emitted at the end of the recording, and we need >> 55: // to let GC catch up before we do the event. There is no reason to wait > > This notion of a "GC catching up" confuses these tests. Why should the GC "catch up" before a recording is stopped? If you need to use this construct here, please specify that it applies only to Shenandoah GC and not to GCs in general. This applies to any concurrent GC, really. We have allocated things, GC might have been triggered, we need to wait for "old objects" to appear now. Old object sampling is done at the end of the recording. Current tests assume that "old" objects appear "instantaneously" after the allocations -- which is somewhat sane for STW collectors, but not for the concurrent ones. The fact that Shenandoah does not report old objects while GC is running is just a facet of this general problem. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1696534654 From shade at openjdk.org Tue Jul 30 08:29:32 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jul 2024 08:29:32 GMT Subject: RFR: 8337213: Shenandoah: Add verification for class mirrors In-Reply-To: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> References: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> Message-ID: On Thu, 25 Jul 2024 14:35:37 GMT, Aleksey Shipilev wrote: > When working in Leyden, I noticed some oddities when loading Class mirrors from the CDS archive. These are not caught by Shenandoah asserts or verifier, since the problem is due to wrong metadata. We can amend verification to test these. > > Additional testing: > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC -XX:+ShenandoahVerify` Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/20332#issuecomment-2257771749 From shade at openjdk.org Tue Jul 30 08:32:38 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jul 2024 08:32:38 GMT Subject: Integrated: 8337213: Shenandoah: Add verification for class mirrors In-Reply-To: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> References: <8eHostMEXZZ8g9Q_jCwNxZIk0sUiMBxM-6GpXxryMaY=.31253348-41ba-46e0-a6b8-063f73ca0a99@github.com> Message-ID: <9_eZYIls-hND_dxSgxUt0sWXo-zHwGHzCRWr_NvrZuE=.6e02f841-93ab-45e5-a271-8221e852956e@github.com> On Thu, 25 Jul 2024 14:35:37 GMT, Aleksey Shipilev wrote: > When working in Leyden, I noticed some oddities when loading Class mirrors from the CDS archive. These are not caught by Shenandoah asserts or verifier, since the problem is due to wrong metadata. We can amend verification to test these. 
> > Additional testing: > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC` > - [x] Linux x86_64 server fastdebug, `all` with `-XX:+UseShenandoahGC -XX:+ShenandoahVerify` This pull request has now been integrated. Changeset: 156f0b43 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/156f0b4332bf076165898417cf6678d2fc32df5c Stats: 34 lines in 2 files changed: 34 ins; 0 del; 0 mod 8337213: Shenandoah: Add verification for class mirrors Reviewed-by: rkennke, wkemper ------------- PR: https://git.openjdk.org/jdk/pull/20332 From rkennke at openjdk.org Tue Jul 30 08:52:31 2024 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 30 Jul 2024 08:52:31 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Tue, 30 Jul 2024 08:19:09 GMT, Aleksey Shipilev wrote: > > > Leak Profiler is notorious in using the mark words for its own needs. > > > > > > It seems to already use external storage for some of the leak profiling work. Can it be taught -- at least when Shenandoah is being used -- to just use external storage for its book-keeping/tagging as well to avoid this interference? > > Yes, but as I mentioned in PR body, this is a continuous whack-a-mole game. At this point, I believe bailing from unsafe modes is a saner tactics. The 'unsafe' part is that JFR meddles with the mark-word, and it can be argued that it shouldn't be any of JFR's business to do so. Avoiding to touch the mark-word altogether would be the sanest tactic, IMO. JFR could use a bitmap for 'marking' objects, pretty much like is done in JVMTI for very similar purpose. (See https://github.com/openjdk/jdk/blob/156f0b4332bf076165898417cf6678d2fc32df5c/src/hotspot/share/prims/jvmtiTagMap.cpp#L1487) The ObjectBitSet that is used there is well-suited for that purpose because it only allocates chunks as needed, and thus avoids allocating a full bitmap that covers all of the heap. 
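The idea of marking objects in a side structure that only allocates chunks as needed, rather than in the mark word or in a bitmap covering the whole heap, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not HotSpot's `ObjectBitSet`: the class name `SparseMarkSet`, the 8-byte granule, and the chunk size are all hypothetical choices made for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical sketch of a lazily allocated mark set keyed by object address.
class SparseMarkSet {
    static constexpr std::size_t kGranuleBytes = 8;        // assumed object alignment
    static constexpr std::size_t kBitsPerChunk = 1u << 16; // one chunk covers 512 KiB of addresses
    using Chunk = std::bitset<kBitsPerChunk>;

    // Chunk index -> bitmap for that address range; absent means "nothing marked there".
    std::unordered_map<std::uintptr_t, std::unique_ptr<Chunk>> chunks_;

    static std::uintptr_t granule(std::uintptr_t addr) {
        assert(addr % kGranuleBytes == 0 && "address must be granule-aligned");
        return addr / kGranuleBytes;
    }

public:
    // Marks the object at addr, allocating the covering chunk on first use.
    void mark(std::uintptr_t addr) {
        const std::uintptr_t g = granule(addr);
        auto& chunk = chunks_[g / kBitsPerChunk];
        if (!chunk) {
            chunk = std::make_unique<Chunk>();
        }
        chunk->set(g % kBitsPerChunk);
    }

    // Queries never allocate: a missing chunk simply means "not marked".
    bool is_marked(std::uintptr_t addr) const {
        const std::uintptr_t g = granule(addr);
        const auto it = chunks_.find(g / kBitsPerChunk);
        return it != chunks_.end() && it->second->test(g % kBitsPerChunk);
    }

    std::size_t allocated_chunks() const { return chunks_.size(); }
};
```

Because the object header is never written, a concurrent collector has nothing to misread as a forwarding pointer; the cost is a hash lookup per mark or query plus memory proportional to the address ranges actually touched.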
------------- PR Comment: https://git.openjdk.org/jdk/pull/20328#issuecomment-2257818242 From mgronlun at openjdk.org Tue Jul 30 10:15:34 2024 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 30 Jul 2024 10:15:34 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Tue, 30 Jul 2024 08:17:38 GMT, Aleksey Shipilev wrote: >> test/jdk/jdk/jfr/event/oldobject/TestFieldInformation.java line 55: >> >>> 53: >>> 54: // OldObjectSample is emitted at the end of the recording, and we need >>> 55: // to let GC catch up before we do the event. There is no reason to wait >> >> This notion of a "GC catching up" confuses these tests. Why should the GC "catch up" before a recording is stopped? If you need to use this construct here, please specify that it applies only to Shenandoah GC and not to GCs in general. > > This applies to any concurrent GC, really. > > We have allocated things, GC might have been triggered, we need to wait for "old objects" to appear now. Old object sampling is done at the end of the recording. Current tests assume that "old" objects appear "instantaneously" after the allocations -- which is somewhat sane for STW collectors, but not for the concurrent ones. The fact that Shenandoah does not report old objects while GC is running is just a facet of this general problem. Sampled oops are referenced by weak handles in the JFR sample priority queue. When the recording stops, an operation (EventEmitter) occurs that dumps the ObjectSamples in the priority queue and possibly, using an exclusive safe point operation, walks the paths-to-gc-roots. Can you explain more what you mean by "wait for old objects to appear"? Appear where? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1696706701 From shade at openjdk.org Tue Jul 30 10:46:10 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jul 2024 10:46:10 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v5] In-Reply-To: References: Message-ID: > While testing unrelated Shenandoah patch, I caught a GC assert when Leak Profiler was running ([JDK-8337194](https://bugs.openjdk.org/browse/JDK-8337194)). > > Leak Profiler is notorious for using the mark words for its own needs. We have been trying to mitigate its impact on GCs by moving to separate bitsets for tracking marked objects, or by treating "marked without fwdptr" as "JFR marked" and handling it. But this is not reliable, since things like [putting indexes in mark word](https://github.com/openjdk/jdk/blob/3baff2af6a30cc6cb2e0d4391db1cf7e61c33f64/src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp#L275-L280) sneak in. This is okay for Leak Profiler alone, since it restores the mark words after the operation completes, but that is still not enough when GC is already running. > > I say we side-step this whack-a-mole by cleanly bailing from JFR op, when we know it is unsafe to do. I thought to use `VM_Operation::doit_prologue`, but I think GC start may sneak in between checking in prologue and op start. > > This realistically only affects Shenandoah. All other STW collectors would never see what Leak Profiler did with mark words. ZGC would not see it, since it does not care about mark words for its own operation. 
> > Additional testing: > - [x] `jdk/jfr/event/oldobject/` pass by default (100x) > - [x] `jdk/jfr/event/oldobject/` pass with `-XX:+UseShenandoahGC` (1000x) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Tighten up comments prose ------------- Changes: - all: https://git.openjdk.org/jdk/pull/20328/files - new: https://git.openjdk.org/jdk/pull/20328/files/e46f7bd5..249956fc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20328&range=03-04 Stats: 10 lines in 2 files changed: 2 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/20328.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/20328/head:pull/20328 PR: https://git.openjdk.org/jdk/pull/20328 From shade at openjdk.org Tue Jul 30 10:46:10 2024 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 30 Jul 2024 10:46:10 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Tue, 30 Jul 2024 10:11:37 GMT, Markus Grönlund wrote: >> This applies to any concurrent GC, really. >> >> We have allocated things, GC might have been triggered, we need to wait for "old objects" to appear now. Old object sampling is done at the end of the recording. Current tests assume that "old" objects appear "instantaneously" after the allocations -- which is somewhat sane for STW collectors, but not for the concurrent ones. The fact that Shenandoah does not report old objects while GC is running is just a facet of this general problem. > > Sampled oops are referenced by weak handles in the JFR sample priority queue. When the recording stops, an operation (EventEmitter) occurs that dumps the ObjectSamples in the priority queue and possibly, using an exclusive safe point operation, walks the paths-to-gc-roots. Can you explain more what you mean by "wait for old objects to appear"? Appear where? 
All right, I think I misunderstood how sampling works. I assumed it happens from the GC cycle, but now I see it happens from the allocation path. Indeed sampler holds the object in weak oop storage, and they are supposed to be accessible immediately after the allocation. I tightened up comments to reflect this understanding. Looks better, or? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1696744209 From mgronlun at openjdk.org Tue Jul 30 11:15:32 2024 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 30 Jul 2024 11:15:32 GMT Subject: RFR: 8279016: JFR Leak Profiler is broken with Shenandoah [v4] In-Reply-To: References: Message-ID: On Tue, 30 Jul 2024 10:42:39 GMT, Aleksey Shipilev wrote: >> Sampled oops are referenced by weak handles in the JFR sample priority queue. When the recording stops, an operation (EventEmitter) occurs that dumps the ObjectSamples in the priority queue and possibly, using an exclusive safe point operation, walks the paths-to-gc-roots. Can you explain more what you mean by "wait for old objects to appear"? Appear where? > > All right, I think I misunderstood how sampling works. I assumed it happens from the GC cycle, but now I see it happens from the allocation path. Indeed sampler holds the object in weak oop storage, and they are supposed to be accessible immediately after the allocation. > > I tightened up comments to reflect this understanding. Looks better, or? I don't know. The backoff mechanism now introduced is to handle the special case where the LeakProfiler dump operation, even though it is performed under an exclusive safepoint, happens to coincide with some phase of Shenandoah GC - a lot of surface for handling this special case. Also, need this not be categorical for all OldObject tests? 
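For readers following the backoff discussion, the shape of the bail-out can be sketched in a few lines. This is an illustration only, not the code in the PR: `GcState`, `DumpResult` and `dump_if_safe` are hypothetical names, and the atomic flag stands in for whatever "a GC cycle is in progress" query the collector actually exposes.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-in for the collector's "cycle in progress" state.
struct GcState {
  std::atomic<bool> cycle_in_progress{false};
};

enum class DumpResult { Completed, SkippedGcActive };

// Hypothetical body of the dump operation. The check sits inside the
// operation itself rather than in a prologue, so a GC cycle cannot start
// between the check and the walk.
inline DumpResult dump_if_safe(const GcState& gc,
                               const std::function<void()>& walk_paths_to_roots) {
  assert(static_cast<bool>(walk_paths_to_roots));
  if (gc.cycle_in_progress.load(std::memory_order_acquire)) {
    // Bail out cleanly: the walk would reuse mark words the GC depends on.
    return DumpResult::SkippedGcActive;
  }
  walk_paths_to_roots();
  return DumpResult::Completed;
}
```

A caller that receives `SkippedGcActive` can either emit the samples without paths to GC roots or retry later; which of the two the real change does is not shown here.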
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20328#discussion_r1696781952 From wkemper at openjdk.org Tue Jul 30 16:59:27 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 30 Jul 2024 16:59:27 GMT Subject: RFR: Merge openjdk/jdk:master Message-ID: Merges tip (to pickup/resolve IU mode removal). ------------- Commit messages: - Remove extraneous whitespace - Merge tip of openjdk:master - 8337285: Examine java.text.DecimalFormat API for api/implXxx tag usage - 6381729: Javadoc for generic constructor doesn't document type parameter - 8332738: Debug agent can deadlock on callbackLock when using StackFrame.PopFrames - 8337300: java/lang/Process/WaitForDuration.java leaves child process behind - 8336342: Fix known X11 library locations in sysroot - 8337267: [REDO] G1: Refactor G1RebuildRSAndScrubTask - 8337165: Test jdk/jfr/event/gc/stacktrace/TestG1YoungAllocationPendingStackTrace.java failed: IndexOutOfBoundsException: Index 64 out of bounds for length 64 - 8336966: Alpine Linux x86_64 compilation error: sendfile64 - ... 
and 33 more: https://git.openjdk.org/shenandoah/compare/d3439437...c3ed9334 The webrevs contain the adjustments done while merging with regards to each parent branch: - master: https://webrevs.openjdk.org/?repo=shenandoah&pr=464&range=00.0 - openjdk/jdk:master: https://webrevs.openjdk.org/?repo=shenandoah&pr=464&range=00.1 Changes: https://git.openjdk.org/shenandoah/pull/464/files Stats: 9010 lines in 233 files changed: 4887 ins; 3360 del; 763 mod Patch: https://git.openjdk.org/shenandoah/pull/464.diff Fetch: git fetch https://git.openjdk.org/shenandoah.git pull/464/head:pull/464 PR: https://git.openjdk.org/shenandoah/pull/464 From wkemper at openjdk.org Tue Jul 30 18:12:46 2024 From: wkemper at openjdk.org (William Kemper) Date: Tue, 30 Jul 2024 18:12:46 GMT Subject: Integrated: Merge openjdk/jdk:master In-Reply-To: References: Message-ID: On Tue, 30 Jul 2024 16:48:05 GMT, William Kemper wrote: > Merges tip (to pickup/resolve IU mode removal). This pull request has now been integrated. Changeset: a4cb96f7 Author: William Kemper URL: https://git.openjdk.org/shenandoah/commit/a4cb96f7d8024e27b6451c51b6a6f129ca6a756d Stats: 9010 lines in 233 files changed: 4887 ins; 3360 del; 763 mod Merge ------------- PR: https://git.openjdk.org/shenandoah/pull/464 From wkemper at openjdk.org Wed Jul 31 19:17:09 2024 From: wkemper at openjdk.org (William Kemper) Date: Wed, 31 Jul 2024 19:17:09 GMT Subject: RFR: 8335126: Shenandoah: Improve OOM handling Message-ID: Mostly clean backport. This commit prevents tests that try to exhaust the heap from running continuous back-to-back full GCs (resulting in timeouts). 
------------- Commit messages: - Backport 3a87eb5c4606ce39970962895315567e8606eba7 Changes: https://git.openjdk.org/shenandoah-jdk21u/pull/69/files Webrev: https://webrevs.openjdk.org/?repo=shenandoah-jdk21u&pr=69&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8335126 Stats: 33 lines in 3 files changed: 13 ins; 1 del; 19 mod Patch: https://git.openjdk.org/shenandoah-jdk21u/pull/69.diff Fetch: git fetch https://git.openjdk.org/shenandoah-jdk21u.git pull/69/head:pull/69 PR: https://git.openjdk.org/shenandoah-jdk21u/pull/69