From martin.doerr at sap.com Mon Jul 2 07:55:57 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 2 Jul 2018 07:55:57 +0000
Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space

Hi Michihiro,

thanks for addressing this issue. The change looks good to me. I only have a comment on the coding style (oop.inline.hpp): "if ()" should be followed by braces "{ ... }".

Seems like a user of the forwardee needs to rely on memory_order_consume in the current implementation. I guess it will be appreciated that you're fixing this.

Please note that SAP still supports CMS in the commercial VM so this change is still relevant and we'd like to push it to jdk11 if possible. But we definitely need an OK from a CMS expert (which I'm not).

Best regards,
Martin

From: Michihiro Horie [mailto:HORIE at jp.ibm.com]
Sent: Wednesday, June 27, 2018 02:23
To: hotspot-gc-dev at openjdk.java.net
Cc: Doerr, Martin ; Kim Barrett ; Gustavo Romero
Subject: RFR: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space

Dear all,

Would you please review the following change?
Bug: https://bugs.openjdk.java.net/browse/JDK-8205908
Webrev: http://cr.openjdk.java.net/~mhorie/8205908/webrev.00/

[Current implementation]
ParNewGeneration::copy_to_survivor_space tries to move live objects to a different location. There are two patterns for copying an object, depending on whether there is space to allocate new_obj in to-space or not.

If a thread cannot find space to allocate new_obj in to-space, the thread first executes the CAS with a dummy forwarding pointer "ClaimedForwardPtr", which is a sentinel to mark an object as claimed. After succeeding in the CAS, the thread can copy new_obj into the old space. Here, suppose thread A succeeds in the CAS, while thread B fails in the CAS. When thread A finishes the copy, it replaces the dummy forwarding pointer with a real forwarding pointer. After thread B fails in the CAS, thread B returns the forwardee after waiting for the copy of the forwardee to complete. This is observable by checking that the dummy forwarding pointer has been replaced with a real forwarding pointer by thread A.

In contrast, if a thread can find space to allocate new_obj in to-space, the thread first copies new_obj and then executes the CAS with new_obj. If a thread fails in the CAS, it deallocates the copied new_obj and returns the forwardee.

Procedure of ParNewGeneration::copy_to_survivor_space:
([L****] represents the line number in src/hotspot/share/gc/cms/parNewGeneration.cpp)
1. Try to allocate space for new_obj in to-space [L1110]
2. If the allocation in to-space fails [L1117]
  2.1. Execute the CAS with the dummy forwarding pointer [L1122] ... (A)
  2.2. If the CAS fails, return the forwardee via real_forwardee() [L1123]
  2.3. If the CAS succeeds [L1128]
    2.3.1. If promotion is allowed, copy new_obj into the old area [L1129]
    2.3.2. If promotion is not allowed, forward to obj itself [L1133]
  2.4. Set new_obj as forwardee [L1142]
3. If the allocation in to-space succeeds [L1144]
  3.1. Copy new_obj [L1146]
  3.2. Execute the CAS with new_obj [L1148] ... (B)
4. Dereference new_obj for logging. The new_obj copied by each thread at step 3.1 is used instead of forwardee() [L1159]
5. If either CAS (A) or CAS (B) succeeds, return new_obj [L1163]
6. If CAS (B) fails, get the forwardee via real_forwardee(). Unallocate new_obj in to-space [L1193]
7. Return forwardee [L1203]

For reference, real_forwardee() is as shown below:

  oop ParNewGeneration::real_forwardee(oop obj) {
    oop forward_ptr = obj->forwardee();
    if (forward_ptr != ClaimedForwardPtr) {
      return forward_ptr;
    } else {
      // manually inlined for readability.
      oop forward_ptr = obj->forwardee();
      while (forward_ptr == ClaimedForwardPtr) {
        waste_some_time();
        forward_ptr = obj->forwardee();
      }
      return forward_ptr;
    }
  }

Regarding the CAS (A), there is no copy before the CAS. Dereferencing the forwardee must be allowed after obtaining the forwardee.
Regarding the CAS (B), there is a copy before the CAS. Dereferencing the forwardee must be allowed after obtaining the forwardee.

[Observation on the current implementation]
No fence is necessary before or after the CAS (A).
A release barrier is necessary before the CAS (B).
forwardee_acquire() must be used instead of forwardee() in real_forwardee().

[Performance measurement]
The critical-jOPS of SPECjbb2015 improved by 12% with this change.

Best regards,
--
Michihiro,
IBM Research - Tokyo

From thomas.schatzl at oracle.com Mon Jul 2 12:32:30 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 02 Jul 2018 14:32:30 +0200
Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink
In-Reply-To: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com>
References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com>
Message-ID: <0332d67eb5e0509367118eb99b9c84b280465918.camel@oracle.com>

Hi all,

can I have reviews for this fix that is scheduled for JDK 11?

Thanks,
Thomas

On Tue, 2018-06-26 at 19:15 +0200, Thomas Schatzl wrote:
> Hi all,
>
> can I have reviews for this bug in keeping remembered sets
> consistent between HC and HS regions, causing crashes with
> verification?
>
> The problem occurs during updating the remembered sets during the
> Remark pause. This process is parallel; it uses liveness information
> from marking to set the new remembered set states.
>
> However during marking G1 attributes all liveness information of a
> humongous object to the HS region; if that liveness information has
> not been updated yet for HC regions, and another thread is
> responsible for determining that HC region's remembered set state,
> the new remembered set state of the HC region will get a state as the
> HS region.
>
> The fix is to, for HC regions, just pass the liveness data of the HS
> region into the method that determines the new remembered set state.
> Further, in that latter method, make sure that the predicate for
> determining whether a region gets a remembered set assigned is
> completely disjoint for humongous and non-humongous regions.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8205426
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8205426/webrev/
> Testing:
> new test case, hs-tier1-4,jdk-tier1-3
>
> Thanks,
> Thomas
>

From per.liden at oracle.com Mon Jul 2 15:05:26 2018
From: per.liden at oracle.com (Per Liden)
Date: Mon, 2 Jul 2018 17:05:26 +0200
Subject: RFR: 8205924: ZGC: Premature OOME due to failure to expand backing file
Message-ID: <5564b685-22ab-c346-3fb9-f58a9aeee75b@oracle.com>

ZGC currently assumes that there will be enough space available on the backing file system to hold the max heap size (-Xmx). However, this might not be true. For example, the backing filesystem might have been misconfigured or space on that filesystem might be used by some other process. In this situation, ZGC will try (and fail) to map more memory every time a new page needs to be allocated (assuming that request can't be satisfied by the page cache). As a result, we fail to flush the page cache, which in turn means we throw a premature OOME and we continuously take the performance hit by making unnecessary fallocate() syscalls that will never succeed. We should instead detect this situation, flush the page cache and avoid making further fallocate() calls.

This issue has been seen now and then in various tests (e.g. RunThese30M and Kitchensink), typically on machines running older kernels without support for memfd_create(), where we fall back to using /dev/shm, which sometimes doesn't have enough space to hold the given max heap size (default tmpfs size is 50% of the RAM in the machine).
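A minimal sketch of the behaviour proposed above, not the actual ZGC change: remember the first failed expansion so that later requests skip the syscall, and fall back to flushing the page cache. The names BackingFile, PageCache and allocate, and the injected grow callback (a stand-in for the fallocate() call), are hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical model of the heap backing file; not ZGC's real class.
class BackingFile {
public:
  // 'grow' stands in for the fallocate() call: it receives the requested
  // file size and returns true on success.
  explicit BackingFile(std::function<bool(std::size_t)> grow)
    : _grow(std::move(grow)), _size(0), _can_expand(true) {}

  std::size_t size() const { return _size; }

  // Try to make the file at least 'new_size' bytes large.
  bool expand(std::size_t new_size) {
    if (new_size <= _size) {
      return true;            // already large enough
    }
    if (!_can_expand) {
      return false;           // a previous attempt failed, skip the syscall
    }
    if (!_grow(new_size)) {
      _can_expand = false;    // remember the failure
      return false;
    }
    _size = new_size;
    return true;
  }

private:
  std::function<bool(std::size_t)> _grow;
  std::size_t _size;
  bool _can_expand;
};

// Hypothetical page cache that only tracks how many bytes it holds.
struct PageCache {
  std::size_t cached_bytes;

  // Release everything and report how many bytes were freed.
  std::size_t flush() {
    std::size_t freed = cached_bytes;
    cached_bytes = 0;
    return freed;
  }
};

// Satisfy a request for 'bytes' by growing the file or, if that is not
// possible, by flushing the page cache.
bool allocate(BackingFile& file, PageCache& cache, std::size_t bytes) {
  if (file.expand(file.size() + bytes)) {
    return true;
  }
  return cache.flush() >= bytes;
}
```

With a callback that only succeeds up to some limit, the second and later failing requests never reach the callback again, which is the "avoid making further fallocate() calls" part of the description.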

Bug: https://bugs.openjdk.java.net/browse/JDK-8205924
Webrev: http://cr.openjdk.java.net/~pliden/8205924/webrev.0

Testing: Passed two iterations of tier{1,2,3,4,5,6} on linux-x64, passed multiple iterations of RunThese30M locally, and various manual testing to provoke the bad situations.

/Per

From HORIE at jp.ibm.com Tue Jul 3 08:25:41 2018
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Tue, 3 Jul 2018 17:25:41 +0900
Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space

Hi Martin,

Thanks a lot for your review. Sure, we need an OK from a CMS expert. Following is the new webrev:
http://cr.openjdk.java.net/~mhorie/8205908/webrev.01/

>Seems like a user of the forwardee needs to rely on memory_order_consume in the current implementation. I guess it will be appreciated that you're fixing this.
Thank you for pointing out this issue in the original implementation. I newly inserted a release at "2.4. Set new_obj as forwardee [L1142]".

Improvement of critical-jOPS in SPECjbb2015 was 10%, which is still a big number.

Best regards,
--
Michihiro,
IBM Research - Tokyo
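
The barrier placement proposed in this review can be sketched with C++11 atomics. This is only an illustration under stated assumptions, not the HotSpot code: Obj, claimed, forward_claim and forward_copy are made-up names, the forwarding pointer is modelled as a separate atomic field rather than the mark word, and std::memory_order stands in for HotSpot's own ordering primitives.

```cpp
#include <atomic>
#include <cassert>

// Simplified object: one payload word plus a forwarding pointer.
struct Obj {
  int payload = 0;
  std::atomic<Obj*> forwardee{nullptr};
};

// Sentinel meaning "claimed, copy in progress" (plays the role of
// ClaimedForwardPtr).
static Obj claimed;

// Reader side: the acquire load pairs with the release operations below,
// so the returned copy can be dereferenced safely.
Obj* real_forwardee(Obj* obj) {
  Obj* fwd = obj->forwardee.load(std::memory_order_acquire);
  while (fwd == &claimed) {
    // Spin until the claiming thread installs the real pointer.
    fwd = obj->forwardee.load(std::memory_order_acquire);
  }
  return fwd;
}

// Case (A): claim first, copy afterwards. Nothing has been written before
// the CAS, so it can be relaxed; the later store publishes the copy.
Obj* forward_claim(Obj* old_obj, Obj* new_obj) {
  Obj* expected = nullptr;
  if (!old_obj->forwardee.compare_exchange_strong(
          expected, &claimed,
          std::memory_order_relaxed, std::memory_order_relaxed)) {
    return real_forwardee(old_obj);   // another thread owns the object
  }
  new_obj->payload = old_obj->payload;                            // copy
  old_obj->forwardee.store(new_obj, std::memory_order_release);   // step 2.4
  return new_obj;
}

// Case (B): copy first, then publish the copy with a releasing CAS.
Obj* forward_copy(Obj* old_obj, Obj* new_obj) {
  new_obj->payload = old_obj->payload;                            // copy
  Obj* expected = nullptr;
  if (old_obj->forwardee.compare_exchange_strong(
          expected, new_obj,
          std::memory_order_release, std::memory_order_relaxed)) {
    return new_obj;
  }
  // Lost the race: the caller would release new_obj; use the winner's copy.
  return real_forwardee(old_obj);
}
```

In both cases the thread that loses the CAS only ever sees the copy through the acquire load in real_forwardee(), which is why the reader side needs acquire semantics once the full fences around the CAS are dropped.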

From martin.doerr at sap.com Tue Jul 3 13:51:18 2018
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 3 Jul 2018 13:51:18 +0000
Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space
Message-ID: <073dbd9f9aef42c89954e12b5ad005b9@sap.com>

Hi Michihiro,

I think oopDesc::forward_to should not be changed with this change because it is used by many GCs. If you want to add a StoreStore barrier, you could add "OrderAccess::storestore();" before "old->forward_to(new_obj);" for example.

It would be nice to have a comment for your new case in oopDesc::forward_to_atomic.

Best regards,
Martin

From kim.barrett at oracle.com Tue Jul 3 20:40:54 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 3 Jul 2018 16:40:54 -0400
Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space
Message-ID: <36321D6C-A8B7-48FB-8560-9B5807956A87@oracle.com>

> On Jul 3, 2018, at 4:25 AM, Michihiro Horie wrote:
>
> Hi Martin,
>
> Thanks a lot for your review. Sure, we need an OK from a CMS expert. Following is the new webrev:
> http://cr.openjdk.java.net/~mhorie/8205908/webrev.01/
>
> >Seems like a user of the forwardee needs to rely on memory_order_consume in the current implementation. I guess it will be appreciated that you're fixing this.
> Thank you for pointing out this issue in the original implementation. I newly inserted a release at "2.4. Set new_obj as forwardee [L1142]".
>
> Improvement of critical-jOPS in SPECjbb2015 was 10%, which is still a big number.
>
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo

CMS was deprecated in JDK 9, and has been on maintenance life-support for some time. This complex-to-review performance enhancement was proposed less than 48 hours before JDK 11 FC, and didn't receive any reviews until after FC. Because of these factors, I don't think it should be included in JDK 11. And if CMS gets removed in JDK 12 (I don't know if that will happen), then this change would be rendered entirely moot.

I haven't looked carefully at the change, though I did find one part that I don't like. The new test of "order" in forward_to_atomic not only affects CMS, but also (uselessly) affects G1.

I'm not going to be able to look at this carefully soon, as JDK 11 bug fixing has a higher priority for me. Since I think CMS might soon not be an issue, I'd really rather not look at it at all.

I think this change needs not just a CMS-expert reviewer, but someone who is willing to maintain CMS (including any potential bug tail from this change).

From per.liden at oracle.com Tue Jul 3 21:28:16 2018
From: per.liden at oracle.com (Per Liden)
Date: Tue, 3 Jul 2018 23:28:16 +0200
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
Message-ID: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com>

On Linux kernels < 3.17 (where memfd_create() syscall does not exist), ZGC falls back to searching for a suitable tmpfs mount point. If multiple mount points are found (which is the common case) we try to see if any of them matches the "preferred default" path (which is hard coded to be /dev/shm in ZGC). This works well, except on Debian and Debian derived distributions, which for some reason have chosen to use /run/shm instead of /dev/shm. As a result, ZGC will fail to initialize on some commonly used distributions (current Debian stable, Ubuntu 14.04-LTS, etc), forcing the user to manually mount a tmpfs file system on /dev/shm or use -XX:ZPath=/run/shm to explicitly select the mount point. ZGC should handle this situation much better, by having a list of preferred mount points (instead of just one) to allow for multiple alternatives covering differences between distributions. There is a high risk that this otherwise becomes a common problem, given the popularity of Debian and Debian derived distributions.

Bug: https://bugs.openjdk.java.net/browse/JDK-8206316
Webrev: http://cr.openjdk.java.net/~pliden/8206316/webrev.0

Testing: Manual testing of various mount point configurations.

/Per

From per.liden at oracle.com Tue Jul 3 22:04:39 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 00:04:39 +0200
Subject: RFR: 8206322: ZGC: Incorrect license header in gtests

When the ZGC gtests were open-sourced, the license header in these files was not updated accordingly.

Bug: https://bugs.openjdk.java.net/browse/JDK-8206322
Webrev: http://cr.openjdk.java.net/~pliden/8206322/webrev.0

/Per

From kim.barrett at oracle.com Tue Jul 3 23:41:51 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 3 Jul 2018 19:41:51 -0400
Subject: RFR: 8206322: ZGC: Incorrect license header in gtests
Message-ID: <6567E49E-2512-4D1A-B446-21FEE50F9996@oracle.com>

> On Jul 3, 2018, at 6:04 PM, Per Liden wrote:
>
> When the ZGC gtests were open-sourced, the license header in these files was not updated accordingly.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206322
> Webrev: http://cr.openjdk.java.net/~pliden/8206322/webrev.0
>
> /Per

Looks good, and trivial.

From kim.barrett at oracle.com Tue Jul 3 23:47:48 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 3 Jul 2018 19:47:48 -0400
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
In-Reply-To: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com>
References: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com>

> On Jul 3, 2018, at 5:28 PM, Per Liden wrote:
>
> On Linux kernels < 3.17 (where memfd_create() syscall does not exist), ZGC falls back to searching for a suitable tmpfs mount point. If multiple mount points are found (which is the common case) we try to see if any of them matches the "preferred default" path (which is hard coded to be /dev/shm in ZGC). This works well, except on Debian and Debian derived distributions, which for some reason have chosen to use /run/shm instead of /dev/shm. As a result, ZGC will fail to initialize on some commonly used distributions (current Debian stable, Ubuntu 14.04-LTS, etc), forcing the user to manually mount a tmpfs file system on /dev/shm or use -XX:ZPath=/run/shm to explicitly select the mount point. ZGC should handle this situation much better, by having a list of preferred mount points (instead of just one) to allow for multiple alternatives covering differences between distributions. There is a high risk that this otherwise becomes a common problem, given the popularity of Debian and Debian derived distributions.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206316
> Webrev: http://cr.openjdk.java.net/~pliden/8206316/webrev.0
>
> Testing: Manual testing of various mount point configurations.
>
> /Per

Looks good.
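
For illustration, the "list of preferred mount points" idea from 8206316 could look roughly like the sketch below. It is not the code in the webrev: select_mountpoint is a hypothetical helper, the candidate list is passed in rather than read from the system, and using a single candidate as-is is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Pick a tmpfs mount point from the ones found on the system.
// 'preferred' is ordered: earlier entries win over later ones.
// Returns an empty string if no choice can be made, in which case the
// user would have to select a path explicitly (e.g. with -XX:ZPath).
std::string select_mountpoint(const std::vector<std::string>& mounted,
                              const std::vector<std::string>& preferred) {
  if (mounted.size() == 1) {
    return mounted.front();   // assumption: a single candidate is used as-is
  }
  for (const std::string& want : preferred) {
    for (const std::string& have : mounted) {
      if (have == want) {
        return have;          // first preferred path that is mounted
      }
    }
  }
  return std::string();       // nothing preferred was found
}
```

With a preferred list of /dev/shm followed by /run/shm, a Debian-style system that only offers /run/shm among several tmpfs mounts would still get a usable default, while systems with /dev/shm keep their current behaviour.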

From per.liden at oracle.com Wed Jul 4 05:55:44 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 07:55:44 +0200
Subject: RFR: 8206322: ZGC: Incorrect license header in gtests
In-Reply-To: <6567E49E-2512-4D1A-B446-21FEE50F9996@oracle.com>
References: <6567E49E-2512-4D1A-B446-21FEE50F9996@oracle.com>
Message-ID: <23d712c2-f652-1966-d363-bbb7f2c873a3@oracle.com>

Thanks Kim!

/Per

From per.liden at oracle.com Wed Jul 4 05:57:06 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 07:57:06 +0200
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
References: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com>

Thanks Kim!

/Per

From thomas.schatzl at oracle.com Wed Jul 4 06:03:12 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 04 Jul 2018 08:03:12 +0200
Subject: RFR: 8206322: ZGC: Incorrect license header in gtests
Message-ID: <07fd3ba37713cdebdf59e500c704123330728745.camel@oracle.com>

Hi,

On Wed, 2018-07-04 at 00:04 +0200, Per Liden wrote:
> When the ZGC gtests were open-sourced, the license header in these
> files was not updated accordingly.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8206322
> Webrev: http://cr.openjdk.java.net/~pliden/8206322/webrev.0
>

good.

Thomas

From per.liden at oracle.com Wed Jul 4 06:06:13 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 08:06:13 +0200
Subject: RFR: 8206322: ZGC: Incorrect license header in gtests
In-Reply-To: <07fd3ba37713cdebdf59e500c704123330728745.camel@oracle.com>
References: <07fd3ba37713cdebdf59e500c704123330728745.camel@oracle.com>

Thanks Thomas!

/Per

From thomas.schatzl at oracle.com Wed Jul 4 06:55:29 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 04 Jul 2018 08:55:29 +0200
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
In-Reply-To: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com>
References: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com>
Message-ID: <5a83a02502ba8c42431c931dadbc8b73f385b27f.camel@oracle.com>

Hi,

On Tue, 2018-07-03 at 23:28 +0200, Per Liden wrote:
> On Linux kernels < 3.17 (where memfd_create() syscall does not
> exist), ZGC falls back to searching for a suitable tmpfs mount point.
> If multiple mount points are found (which is the common case) we try
> to see if any of them matches the "preferred default" path (which is
> hard coded to be /dev/shm in ZGC). This works well, except on Debian
> and Debian derived distributions, which for some reason have chosen
> to use /run/shm instead of /dev/shm. As a result, ZGC will
> fail to initialize on some commonly used distributions (current
> Debian stable, Ubuntu 14.04-LTS, etc), forcing the user to manually
> mount a tmpfs file system on /dev/shm or use -XX:ZPath=/run/shm to
> explicitly select the mount point. ZGC should handle this situation
> much better, by having a list of preferred mount points (instead of
> just one) to allow for multiple alternatives covering differences
> between distributions. There is a high risk that this otherwise
> becomes a common problem, given the popularity of Debian and Debian
> derived distributions.

Looking at the various support documents, this does not seem to be a very significant issue. On Debian Stretch (latest stable) kernel is 4.9 [1].
And latest Ubuntu 14.04(.05) runs on a 4.4 kernel [2].

While Jessie (previous stable) is 3.16, it is "almost" out of support (and so is 14.04), and will be even more at GA. Also memfd_create has been backported to Jessie afaict [3].

I am not sure that people that already need to go out of their way to
install latest JDK on these OSes to test, won't also consider upgrading
minor versions of their OS. (And I assume that for testing, people do not
use live systems anyway).

All in all I do not see this issue as urgent as the description makes it
seem. It does not seem to be a problematic change either (to me it
seems like an enhancement of existing code too).

> Bug: https://bugs.openjdk.java.net/browse/JDK-8206316
> Webrev: http://cr.openjdk.java.net/~pliden/8206316/webrev.0
>
> Testing: Manual testing of various mount point configurations.

looks good.

Thomas

[1] https://lists.debian.org/debian-kernel/2016/08/msg00099.html
[2] https://wiki.ubuntu.com/Kernel/Support
[3] https://manpages.debian.org/jessie-backports/manpages-dev/memfd_create.2.en.html
    (see the "other versions" table)

From per.liden at oracle.com Wed Jul 4 07:52:34 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 09:52:34 +0200
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
In-Reply-To: <5a83a02502ba8c42431c931dadbc8b73f385b27f.camel@oracle.com>
References: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com> <5a83a02502ba8c42431c931dadbc8b73f385b27f.camel@oracle.com>
Message-ID: <812cc3b6-6423-a843-46b6-9ead2ada7a58@oracle.com>

Hi Thomas,

On 07/04/2018 08:55 AM, Thomas Schatzl wrote:
> Hi,
>
> On Tue, 2018-07-03 at 23:28 +0200, Per Liden wrote:
>> On Linux kernels < 3.17 (where memfd_create() syscall does not
>> exist), ZGC falls back to searching for a suitable tmpfs mount point.
>> If multiple mount points are found (which is the common case) we try
>> to see if any of them matches the "preferred default" path (which is
>> hard coded to be /dev/shm in ZGC). This works well, except on Debian
>> and Debian derived distributions, which for some reason have chosen
>> to use /run/shm instead of /dev/shm.
>> As a result, ZGC will
>> fail to initialize on some commonly used distributions (current
>> Debian stable, Ubuntu 14.04-LTS, etc), forcing the user to manually
>> mount a tmpfs file system on /dev/shm or use -XX:ZPath=/run/shm to
>> explicitly select the mount point. ZGC should handle this situation
>> much better, by having a list of preferred mount points (instead of
>> just one) to allow for multiple alternatives covering differences
>> between distributions. There is a high risk that this otherwise
>> becomes a common problem, given the popularity of Debian and Debian
>> derived distributions.
>
> Looking at the various support documents, this does not seem to be a
> very significant issue. On Debian Stretch (latest stable) the kernel is
> 4.9 [1].
> And latest Ubuntu 14.04(.05) runs on a 4.4 kernel [2].
>
> While Jessie (previous stable) is 3.16, it is "almost" out of support
> (and so is 14.04), and will be even more at GA. Also memfd_create has
> been backported to Jessie afaict [3].
>
> I am not sure that people that already need to go out of their way to
> install latest JDK on these OSes to test, won't also consider upgrading
> minor versions of their OS. (And I assume that for testing, people do not
> use live systems anyway).
>
> All in all I do not see this issue as urgent as the description makes it
> seem. It does not seem to be a problematic change either (to me it
> seems like an enhancement of existing code too).

Thanks for digging up this information. In light of this, I agree that
this is not as urgent as I first thought. I still think we should
consider this for 11, given that I've already received reports from
people running into this issue, and the fix is pretty straightforward.

Objections or thoughts?

>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206316
>> Webrev: http://cr.openjdk.java.net/~pliden/8206316/webrev.0
>>
>> Testing: Manual testing of various mount point configurations.
>
> looks good.

Thanks for reviewing!

cheers,
Per

>
> Thomas
>
> [1] https://lists.debian.org/debian-kernel/2016/08/msg00099.html
> [2] https://wiki.ubuntu.com/Kernel/Support
> [3] https://manpages.debian.org/jessie-backports/manpages-dev/memfd_create.2.en.html
>     (see the "other versions" table)
>

From HORIE at jp.ibm.com Wed Jul 4 08:26:05 2018
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Wed, 4 Jul 2018 17:26:05 +0900
Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space
In-Reply-To: <36321D6C-A8B7-48FB-8560-9B5807956A87@oracle.com>
References: <36321D6C-A8B7-48FB-8560-9B5807956A87@oracle.com>
Message-ID:

Hi Martin, Kim,

Thank you both for your comments.
I missed the point that oopDesc::forward_to is invoked from several
callers. Using OrderAccess::storestore() before the invocation of
forward_to() would be a great idea, thanks.

>I haven't looked carefully at the change, though I did find one part
>that I don't like. The new test of "order" in forward_to_atomic not
>only affects CMS, but also (uselessly) affects G1.

Please let me confirm your point. You mean I should give
memory_order_acq_rel to forward_to_atomic, which uses tests as follows to
hold the consistent meaning of acquire/release in forward_to_atomic? I
agree it is not clear that the test with release returns the forwardee
with acquire.

oop oopDesc::forward_to_atomic(oop p, atomic_memory_order order) {
  :
  while (!oldMark->is_marked()) {
    if (order == memory_order_acq_rel) {
      curMark = cas_set_mark_raw(forwardPtrMark, oldMark, memory_order_release);
    } else {
      curMark = cas_set_mark_raw(forwardPtrMark, oldMark, order);
    }
  }
  :
  }
  if (order == memory_order_acq_rel) {
    return forwardee_acquire();
  }
  return forwardee();
}

Best regards,
--
Michihiro,
IBM Research - Tokyo

From: Kim Barrett
To: Michihiro Horie
Cc: "Doerr, Martin", "hotspot-gc-dev at openjdk.java.net", Gustavo Romero
Date: 2018/07/04 05:41
Subject: Re: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space

> On Jul 3, 2018, at 4:25 AM, Michihiro Horie wrote:
>
> Hi Martin,
>
> Thanks a lot for your review. Sure, we need an OK from a CMS expert. Following is the new webrev:
> http://cr.openjdk.java.net/~mhorie/8205908/webrev.01/
>
> >Seems like a user of the forwardee needs to rely on memory_order_consume in the current implementation. I guess it will be appreciated that you're fixing this.
> Thank you for pointing out this issue in the original implementation. I newly inserted a release at "2.4. Set new_obj as forwardee [L1142]".
>
> Improvement of critical-jOPS in SPECjbb2015 was 10%, which is still a big number.
>
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo

CMS was deprecated in JDK 9, and has been on maintenance life-support
for some time. This complex-to-review performance enhancement was
proposed less than 48 hours before JDK 11 FC, and didn't receive any
reviews until after FC. Because of these factors, I don't think it
should be included in JDK 11. And if CMS gets removed in JDK 12 (I
don't know if that will happen), then this change would be rendered
entirely moot.

I haven't looked carefully at the change, though I did find one part
that I don't like. The new test of "order" in forward_to_atomic not
only affects CMS, but also (uselessly) affects G1.
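
The ordering question in this thread can be illustrated outside HotSpot with standard C++ atomics. The following is only a sketch under stated assumptions: the Obj type and this stand-alone forward_to_atomic are hypothetical and do not mirror oopDesc or the webrev. It shows the general pattern being discussed: the winner publishes its copy with a release CAS, and a loser reads the winning forwardee with an acquire load so that the copied contents are visible before it uses them.

```cpp
#include <atomic>

// Hypothetical object with a forwarding slot; not HotSpot's oopDesc.
struct Obj {
  int payload = 0;
  std::atomic<Obj*> forwardee{nullptr};
};

// Tries to install `copy` as the forwardee of `obj`.
// Returns the forwardee that won: `copy` if this call installed it,
// otherwise the pointer installed earlier by another caller.
Obj* forward_to_atomic(Obj* obj, Obj* copy) {
  Obj* expected = nullptr;
  // Release on success: all writes that initialized `copy` happen before
  // the pointer becomes visible to threads that acquire it.
  if (obj->forwardee.compare_exchange_strong(expected, copy,
                                             std::memory_order_release,
                                             std::memory_order_relaxed)) {
    return copy;
  }
  // Acquire on the losing path: pairs with the winner's release so that the
  // winner's copy can be safely dereferenced.
  return obj->forwardee.load(std::memory_order_acquire);
}
```

The losing path is the only place that needs acquire semantics, which is the distinction the memory_order_acq_rel test in the quoted code is trying to express.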

I'm not going to be able to look at this carefully soon, as JDK 11 bug
fixing has a higher priority for me. Since I think CMS might soon not
be an issue, I'd really rather not look at it at all. I think this
change needs not just a CMS-expert reviewer, but someone who is
willing to maintain CMS (including any potential bug tail from this
change).

From erik.helin at oracle.com Wed Jul 4 08:46:51 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 4 Jul 2018 10:46:51 +0200
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
In-Reply-To: <812cc3b6-6423-a843-46b6-9ead2ada7a58@oracle.com>
References: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com> <5a83a02502ba8c42431c931dadbc8b73f385b27f.camel@oracle.com> <812cc3b6-6423-a843-46b6-9ead2ada7a58@oracle.com>
Message-ID: <8705fb90-3b25-0846-e76f-de1c5615e2ac@oracle.com>

On 07/04/2018 09:52 AM, Per Liden wrote:
> Hi Thomas,
>
> On 07/04/2018 08:55 AM, Thomas Schatzl wrote:
>> Hi,
>>
>> On Tue, 2018-07-03 at 23:28 +0200, Per Liden wrote:
>>> On Linux kernels < 3.17 (where memfd_create() syscall does not
>>> exist), ZGC falls back to searching for a suitable tmpfs mount point.
>>> If multiple mount points are found (which is the common case) we try
>>> to see if any of them matches the "preferred default" path (which is
>>> hard coded to be /dev/shm in ZGC). This works well, except on Debian
>>> and Debian derived distributions, which for some reason have chosen
>>> to use /run/shm instead of /dev/shm. As a result, ZGC will
>>> fail to initialize on some commonly used distributions (current
>>> Debian stable, Ubuntu 14.04-LTS, etc), forcing the user to manually
>>> mount a tmpfs file system
>>> on /dev/shm or use -XX:ZPath=/run/shm to
>>> explicitly select the mount point. ZGC should handle this situation
>>> much better, by having a list of preferred mount points (instead of
>>> just one) to allow for multiple alternatives covering differences
>>> between distributions. There is a high risk that this otherwise
>>> becomes a common problem, given the popularity of Debian and Debian
>>> derived distributions.
>>
>> Looking at the various support documents, this does not seem to be a
>> very significant issue. On Debian Stretch (latest stable) the kernel is
>> 4.9 [1].
>> And latest Ubuntu 14.04(.05) runs on a 4.4 kernel [2].
>>
>> While Jessie (previous stable) is 3.16, it is "almost" out of support
>> (and so is 14.04), and will be even more at GA. Also memfd_create has
>> been backported to Jessie afaict [3].
>>
>> I am not sure that people that already need to go out of their way to
>> install latest JDK on these OSes to test, won't also consider upgrading
>> minor versions of their OS. (And I assume that for testing, people do not
>> use live systems anyway).
>>
>> All in all I do not see this issue as urgent as the description makes it
>> seem. It does not seem to be a problematic change either (to me it
>> seems like an enhancement of existing code too).
>
> Thanks for digging up this information. In light of this, I agree that
> this is not as urgent as I first thought. I still think we should
> consider this for 11, given that I've already received reports from
> people running into this issue, and the fix is pretty straightforward.
>
> Objections or thoughts?

Given the fix is small, I think we should just fix it. That seems easier
than explaining to users why we did not fix this :)

The patch looks good to me, Reviewed.

Thanks,
Erik

>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206316
>>> Webrev: http://cr.openjdk.java.net/~pliden/8206316/webrev.0
>>>
>>> Testing: Manual testing of various mount point configurations.
>>
>> looks good.
>
> Thanks for reviewing!
>
> cheers,
> Per
>
>>
>> Thomas
>>
>> [1] https://lists.debian.org/debian-kernel/2016/08/msg00099.html
>> [2] https://wiki.ubuntu.com/Kernel/Support
>> [3] https://manpages.debian.org/jessie-backports/manpages-dev/memfd_create.2.en.html
>>     (see the "other versions" table)
>>

From per.liden at oracle.com Wed Jul 4 09:02:21 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 11:02:21 +0200
Subject: 8206316: ZGC: Preferred tmpfs mount point not found on Debian
In-Reply-To: <8705fb90-3b25-0846-e76f-de1c5615e2ac@oracle.com>
References: <51c4c4b7-8ec0-fd6b-dc18-a6fc4685caed@oracle.com> <5a83a02502ba8c42431c931dadbc8b73f385b27f.camel@oracle.com> <812cc3b6-6423-a843-46b6-9ead2ada7a58@oracle.com> <8705fb90-3b25-0846-e76f-de1c5615e2ac@oracle.com>
Message-ID: <6bad5283-a69a-31b1-cade-4d89bb8f2ee2@oracle.com>

On 07/04/2018 10:46 AM, Erik Helin wrote:
> On 07/04/2018 09:52 AM, Per Liden wrote:
>> Hi Thomas,
>>
>> On 07/04/2018 08:55 AM, Thomas Schatzl wrote:
>>> Hi,
>>>
>>> On Tue, 2018-07-03 at 23:28 +0200, Per Liden wrote:
>>>> On Linux kernels < 3.17 (where memfd_create() syscall does not
>>>> exist), ZGC falls back to searching for a suitable tmpfs mount point.
>>>> If multiple mount points are found (which is the common case) we try
>>>> to see if any of them matches the "preferred default" path (which is
>>>> hard coded to be /dev/shm in ZGC). This works well, except on Debian
>>>> and Debian derived distributions, which for some reason have chosen
>>>> to use /run/shm instead of /dev/shm. As a result, ZGC will
>>>> fail to initialize on some commonly used distributions (current
>>>> Debian stable, Ubuntu 14.04-LTS, etc), forcing the user to manually
>>>> mount a tmpfs file system on /dev/shm or use -XX:ZPath=/run/shm to
>>>> explicitly select the mount point.
>>>> ZGC should handle this situation
>>>> much better, by having a list of preferred mount points (instead of
>>>> just one) to allow for multiple alternatives covering differences
>>>> between distributions. There is a high risk that this otherwise
>>>> becomes a common problem, given the popularity of Debian and Debian
>>>> derived distributions.
>>>
>>> Looking at the various support documents, this does not seem to be a
>>> very significant issue. On Debian Stretch (latest stable) the kernel is
>>> 4.9 [1].
>>> And latest Ubuntu 14.04(.05) runs on a 4.4 kernel [2].
>>>
>>> While Jessie (previous stable) is 3.16, it is "almost" out of support
>>> (and so is 14.04), and will be even more at GA. Also memfd_create has
>>> been backported to Jessie afaict [3].
>>>
>>> I am not sure that people that already need to go out of their way to
>>> install latest JDK on these OSes to test, won't also consider upgrading
>>> minor versions of their OS. (And I assume that for testing, people do not
>>> use live systems anyway).
>>>
>>> All in all I do not see this issue as urgent as the description makes it
>>> seem. It does not seem to be a problematic change either (to me it
>>> seems like an enhancement of existing code too).
>>
>> Thanks for digging up this information. In light of this, I agree that
>> this is not as urgent as I first thought. I still think we should
>> consider this for 11, given that I've already received reports from
>> people running into this issue, and the fix is pretty straightforward.
>>
>> Objections or thoughts?
>
> Given the fix is small, I think we should just fix it. That seems easier
> than explaining to users why we did not fix this :)
>
> The patch looks good to me, Reviewed.

Thanks Erik!

/Per

(For the record, Thomas told me off-line that he didn't have any
objections, so I'll go ahead and push this to 11)

>
> Thanks,
> Erik
>
>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8206316
>>>> Webrev: http://cr.openjdk.java.net/~pliden/8206316/webrev.0
>>>>
>>>> Testing: Manual testing of various mount point configurations.
>>>
>>> looks good.
>>
>> Thanks for reviewing!
>>
>> cheers,
>> Per
>>
>>>
>>> Thomas
>>>
>>> [1] https://lists.debian.org/debian-kernel/2016/08/msg00099.html
>>> [2] https://wiki.ubuntu.com/Kernel/Support
>>> [3] https://manpages.debian.org/jessie-backports/manpages-dev/memfd_create.2.en.html
>>>     (see the "other versions" table)
>>>

From erik.helin at oracle.com Wed Jul 4 09:44:44 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 4 Jul 2018 11:44:44 +0200
Subject: RFR: 8205924: ZGC: Premature OOME due to failure to expand backing file
In-Reply-To: <5564b685-22ab-c346-3fb9-f58a9aeee75b@oracle.com>
References: <5564b685-22ab-c346-3fb9-f58a9aeee75b@oracle.com>
Message-ID: <58201055-4c6f-18af-2a06-183729d9cc6f@oracle.com>

On 07/02/2018 05:05 PM, Per Liden wrote:
> ZGC currently assumes that there will be enough space available on the
> backing file system to hold the max heap size (-Xmx). However, this
> might not be true. For example, the backing filesystem might have been
> misconfigured or space on that filesystem might be used by some other
> process. In this situation, ZGC will try (and fail) to map more memory
> every time a new page needs to be allocated (assuming that request can't
> be satisfied by the page cache). As a result, we fail to flush the page
> cache, which in turn means we throw a premature OOME and we continuously
> take the performance hit by making unnecessary fallocate() syscalls that
> will never succeed. We should instead detect this situation, flush the
> page cache and avoid making further fallocate() calls.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205924
> Webrev: http://cr.openjdk.java.net/~pliden/8205924/webrev.0

This looks good to me, I would have added an assert in
size_t ZPhysicalMemoryBacking::try_expand such as

+size_t ZPhysicalMemoryBacking::try_expand(size_t old_capacity, size_t new_capacity) {
+  assert(new_capacity > old_capacity, "invariant");
+  const size_t capacity = _file.try_expand(old_capacity, new_capacity - old_capacity, _granule_size);

Not because I spotted anything wrong with this patch, more because if
someone one day introduces such a bug, then it will be hell to debug
without an assert like the above one :)

In ZPageAllocator::try_ensure_unused_for_pre_mapped I would maybe have
designed ZPhysicalMemoryManager::try_ensure_unused_capacity so that it
is always valid to call (the method would just return in case _backing
isn't initialized).

I don't need to see another webrev if you just add the assert, but
please send out a new version if you rework ZPhysicalMemoryManager.

Thanks,
Erik

From per.liden at oracle.com Wed Jul 4 09:57:14 2018
From: per.liden at oracle.com (Per Liden)
Date: Wed, 4 Jul 2018 11:57:14 +0200
Subject: RFR: 8205924: ZGC: Premature OOME due to failure to expand backing file
In-Reply-To: <58201055-4c6f-18af-2a06-183729d9cc6f@oracle.com>
References: <5564b685-22ab-c346-3fb9-f58a9aeee75b@oracle.com> <58201055-4c6f-18af-2a06-183729d9cc6f@oracle.com>
Message-ID: <6c46d251-4d39-1c12-4f76-00b2f44b957f@oracle.com>

On 07/04/2018 11:44 AM, Erik Helin wrote:
> On 07/02/2018 05:05 PM, Per Liden wrote:
>> ZGC currently assumes that there will be enough space available on the
>> backing file system to hold the max heap size (-Xmx). However, this
>> might not be true. For example, the backing filesystem might have been
>> misconfigured or space on that filesystem might be used by some other
>> process.
>> In this situation, ZGC will try (and fail) to map more memory
>> every time a new page needs to be allocated (assuming that request
>> can't be satisfied by the page cache). As a result, we fail to flush
>> the page cache, which in turn means we throw a premature OOME and we
>> continuously take the performance hit by making unnecessary
>> fallocate() syscalls that will never succeed. We should instead detect
>> this situation, flush the page cache and avoid making further
>> fallocate() calls.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205924
>> Webrev: http://cr.openjdk.java.net/~pliden/8205924/webrev.0
>
> This looks good to me, I would have added an assert in
> size_t ZPhysicalMemoryBacking::try_expand such as
>
> +size_t ZPhysicalMemoryBacking::try_expand(size_t old_capacity, size_t
> new_capacity) {
> +  assert(new_capacity > old_capacity, "invariant");
> +  const size_t capacity = _file.try_expand(old_capacity, new_capacity -
> old_capacity, _granule_size);
>
> Not because I spotted anything wrong with this patch, more because if
> someone one day introduces such a bug, then it will be hell to debug
> without an assert like the above one :)

Sounds good, will add that.

>
> In ZPageAllocator::try_ensure_unused_for_pre_mapped I would maybe have
> designed ZPhysicalMemoryManager::try_ensure_unused_capacity so that it
> is always valid to call (the method would just return in case _backing
> isn't initialized).

I would prefer to keep that check in
ZPageAllocator::try_ensure_unused_for_pre_mapped(), as that function
is a special case, since it's called during construction. The underlying
ZPhysicalMemoryManager::try_ensure_unused_capacity() should never be
called if the ZPhysicalMemoryManager isn't initialized and I'd rather
crash hard than silently return if someone makes that mistake.

>
> I don't need to see another webrev if you just add the assert, but
> please send out a new version if you rework ZPhysicalMemoryManager.

Thanks for reviewing, Erik!

cheers,
Per

From erik.helin at oracle.com Wed Jul 4 10:01:09 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Wed, 4 Jul 2018 12:01:09 +0200
Subject: RFR: 8205924: ZGC: Premature OOME due to failure to expand backing file
In-Reply-To: <6c46d251-4d39-1c12-4f76-00b2f44b957f@oracle.com>
References: <5564b685-22ab-c346-3fb9-f58a9aeee75b@oracle.com> <58201055-4c6f-18af-2a06-183729d9cc6f@oracle.com> <6c46d251-4d39-1c12-4f76-00b2f44b957f@oracle.com>
Message-ID: <8c08aa56-5808-09df-e4b7-38a36731568b@oracle.com>

On 07/04/2018 11:57 AM, Per Liden wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205924
>>> Webrev: http://cr.openjdk.java.net/~pliden/8205924/webrev.0
>>
>> This looks good to me, I would have added an assert in
>> size_t ZPhysicalMemoryBacking::try_expand such as
>>
>> +size_t ZPhysicalMemoryBacking::try_expand(size_t old_capacity, size_t
>> new_capacity) {
>> +  assert(new_capacity > old_capacity, "invariant");
>> +  const size_t capacity = _file.try_expand(old_capacity, new_capacity
>> - old_capacity, _granule_size);
>>
>> Not because I spotted anything wrong with this patch, more because if
>> someone one day introduces such a bug, then it will be hell to debug
>> without an assert like the above one :)
>
> Sounds good, will add that.

Ok, good!

>>
>> In ZPageAllocator::try_ensure_unused_for_pre_mapped I would maybe have
>> designed ZPhysicalMemoryManager::try_ensure_unused_capacity so that it
>> is always valid to call (the method would just return in case _backing
>> isn't initialized).
>
> I would prefer to keep that check in
> ZPageAllocator::try_ensure_unused_for_pre_mapped(), as that function
> is a special case, since it's called during construction. The underlying
> ZPhysicalMemoryManager::try_ensure_unused_capacity() should never be
> called if the ZPhysicalMemoryManager isn't initialized and I'd rather
> crash hard than silently return if someone makes that mistake.

Ok, that sounds good to me, just keep it the way it is then.
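
The value of the suggested assert is easier to see in isolation: both capacities are size_t, so new_capacity - old_capacity silently wraps to a huge value if the arguments are ever swapped or equal-or-smaller. The sketch below is only an illustration under stated assumptions. FakeBackingFile and this free-standing try_expand are hypothetical stand-ins, not the ZGC classes, and the standard assert macro is used in place of HotSpot's two-argument one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the backing file: it can grant at most
// `available` more bytes in total.
struct FakeBackingFile {
  std::size_t available;

  // Returns how many of the requested `length` bytes could be provided.
  std::size_t try_expand(std::size_t offset, std::size_t length) {
    (void)offset;  // unused in this sketch
    const std::size_t granted = length < available ? length : available;
    available -= granted;
    return granted;
  }
};

// Tries to grow from old_capacity to new_capacity and returns the capacity
// actually reached, which may be less than requested.
std::size_t try_expand(FakeBackingFile& file,
                       std::size_t old_capacity,
                       std::size_t new_capacity) {
  // Without this check, the unsigned subtraction below would wrap around
  // for new_capacity <= old_capacity and request an enormous expansion.
  assert(new_capacity > old_capacity && "invariant");
  return old_capacity + file.try_expand(old_capacity, new_capacity - old_capacity);
}
```

A caller that gets back less than new_capacity knows the backing store is exhausted and can stop issuing further expansion requests, which is the behaviour the bug description asks for.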

>> I don't need to see another webrev if you just add the assert, but
>> please send out a new version if you rework ZPhysicalMemoryManager.
>
> Thanks for reviewing, Erik!

Since you are only adding the assert, I'm fine with this now, Reviewed.

Thanks,
Erik

> cheers,
> Per

From kim.barrett at oracle.com Wed Jul 4 23:17:00 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 4 Jul 2018 19:17:00 -0400
Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink
In-Reply-To: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com>
References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com>
Message-ID: <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com>

> On Jun 26, 2018, at 1:15 PM, Thomas Schatzl wrote:
>
> Hi all,
>
> can I have reviews for this bug in keeping remembered sets consistent
> between HC and HS regions, causing crashes with verification?
>
> The problem occurs during updating the remembered sets during the
> Remark pause. This process is parallel; it uses liveness information
> from marking to set the new remembered set states.
>
> However during marking G1 attributes all liveness information of a
> humongous object to the HS region; if that liveness information has not
> been updated yet for HC regions, and another thread is responsible for
> determining that HC region's remembered set state, the new remembered
> set state of the HC region will get a state as the HS region.
>
> The fix is to, for HC regions, just pass the liveness data of the HS
> region into the method that determines the new remembered set state.
> Further, in that latter method, make sure that the predicate for
> determining whether a region gets a remembered set assigned is
> completely disjoint for humongous and non-humongous regions.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8205426
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8205426/webrev/
> Testing:
> new test case, hs-tier1-4,jdk-tier1-3
>
> Thanks,
> Thomas

The new live_bytes_for seems like it's overly verbose and complicated,
and could instead just be something like:

size_t live_bytes_for(HeapRegion* r) {
  // For humongous regions, use liveness of associated starts region.
  HeapRegion* hr = r->is_humongous() ? r->humongous_start_region() : r;
  return _cm->liveness(hr->hrm_index()) * HeapWordSize;
}

However, I wonder if this is the right way to go?

It seems to me that the underlying problem is that we're even asking
the live_bytes question of humongous_continues regions, when nobody
really cares about the answer (after fixing the policy). We're also
computing it for young regions and for humongous_start regions
containing an objarray, where again the (fixed) policy doesn't care.

It seems to me that things would be simpler if it were the policy that
asked the live_bytes question, after it has determined that it was
interested in the answer. The only downside I can think of is that
the G1RemSetTrackingPolicy would be additionally coupled to the
G1ConcurrentMark object.

From kim.barrett at oracle.com Thu Jul 5 00:13:00 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 4 Jul 2018 20:13:00 -0400
Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list #
Message-ID:

Please review this fix of the HeapRegion gtest.

The test modifies a region's "top" to unexpected values without
ensuring that no allocation might use the region and no GC might run
while the region is in that invalid state. We solve this by executing
the test code in its very own safepoint, and by saving and then
restoring the region's top back to its original value before
completing the test.
And since we are doing all that, there's no longer any reason to
run the test in a separate VM.

CR: https://bugs.openjdk.java.net/browse/JDK-8204691

Webrev: http://cr.openjdk.java.net/~kbarrett/8204691/open.00/

Testing:
mach5 tier1 (where gtests are run). I wasn't able to reproduce the
failure, but the issues these changes address can account for the one
failure that's been reported.

From kim.barrett at oracle.com Thu Jul 5 05:15:44 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 5 Jul 2018 01:15:44 -0400
Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
In-Reply-To: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com>
References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com>
Message-ID: <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com>

> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote:
>
> Hi,
>
> Please review this small enhancement based on paper [1], that keeps the last successfully stolen queue as one of best-of-2 candidates for work stealing.
>
> Based on experiments done by Thomas Schatzl and myself, it shows positive impacts on task termination and average pause time.
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921
> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.html
>
>
> Test:
> hotspot_gc on Linux 64 (fastdebug and release)
>
>
> [1] Characterizing and Optimizing Hotspot Parallel Garbage
> Collection on Multicore Systems
> http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf
>
> Thanks,
>
> -Zhengyu

Once set, _last_stolen_queues entries are never invalidated. So we
may as well initialize the entries to queue_num+1 mod num_queues.
Then get rid of the is_valid test (and the whole notion of validity)
and the (only used once per queue_num in the webrev change) random
selection of k1.

But I think that might not be desirable. The webrev change's behavior
is to always use the queue chosen for the last steal attempt as one of
the two, even if the last steal attempt failed.
And because the choice of which of the two to try next prefers that one
when they are both empty, we may be reduced to searching with only one
random choice for a while, even though the one we keep using has
repeatedly failed to yield a result.

An alternative that might be better is, whenever a pop_global fails,
reset the associated last_stolen id to invalid. This will revert to 2
random choices until we find (at least) one with something we can
steal. Actually, it seems the referenced paper does something
similar, and the webrev code doesn't match the referenced paper.

Why do the last_queue array entries need to be padded? Why not just
add a _last_stolen_queue member to TaskQueueSuper?

I think it is a pre-existing bug that GenericTaskQueueSet::_n is of
type uint, but the associated constructor argument is of type int. I
think the constructor is wrong in this regard.

From thomas.schatzl at oracle.com Thu Jul 5 07:16:49 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 05 Jul 2018 09:16:49 +0200
Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink
In-Reply-To: <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com>
References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com>
Message-ID: <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com>

Hi Kim,

thanks for your review.

On Wed, 2018-07-04 at 19:17 -0400, Kim Barrett wrote:
> > On Jun 26, 2018, at 1:15 PM, Thomas Schatzl wrote:
> >
> > Hi all,
> >
> > can I have reviews for this bug in keeping remembered sets
> > consistent between HC and HS regions, causing crashes with
> > verification?
> >
[...]
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8205426
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8205426/webrev/
> > Testing:
> > new test case, hs-tier1-4,jdk-tier1-3
> >
> > Thanks,
> > Thomas
>
> The new live_bytes_for seems like it's overly verbose and
> complicated, and could instead just be something like:

The verbosity mainly comes from me just trying to add the comment about
the approximation somewhere fitting. Cramming it into a "?" operator
statement may cause confusion. But see further below.

>
> size_t live_bytes_for(HeapRegion* r) {
>   // For humongous regions, use liveness of associated starts region.
>   HeapRegion* hr = r->is_humongous() ? r->humongous_start_region() : r;
>   return _cm->liveness(hr->hrm_index()) * HeapWordSize;
> }
>
> However, I wonder if this is the right way to go?
>
> It seems to me that the underlying problem is that we're even asking
> the live_bytes question of humongous_continues regions, when nobody
> really cares about the answer (after fixing the policy). We're also
> computing it for young regions and for humongous_start regions
> containing an objarray, where again the (fixed) policy doesn't care.
>
> It seems to me that things would be simpler if it were the policy
> that asked the live_bytes question, after it has determined that it
> was interested in the answer. The only downside I can think of is
> that the G1RemSetTrackingPolicy would be additionally coupled to the
> G1ConcurrentMark object.

I would like to have the G1RemSetTrackingPolicy mostly self-contained
and not looking through things all over the place; however the main
issue seems to be that we actually need to ask for the liveness and
update the remembered sets for HC regions.

It would be much nicer, and remove a lot of distinction between regular
regions and humongous regions if the remembered sets were ready.
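
The liveness lookup under discussion can be sketched with plain data structures. This is a stand-alone illustration, not G1 code: the Region struct, the live_words vector and kHeapWordSize are assumptions standing in for HeapRegion, the concurrent-mark liveness table and HeapWordSize. The point it shows is the redirection: marking attributes all liveness of a humongous object to its start region, so a "continues" region must read the start region's entry rather than its own.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap region (not G1's HeapRegion).
struct Region {
  std::size_t index;              // position in the liveness table
  bool humongous;                 // start or continues region of a humongous object
  const Region* humongous_start;  // start region if humongous, otherwise nullptr
};

// Assumed word size for this sketch; HotSpot uses HeapWordSize.
const std::size_t kHeapWordSize = sizeof(void*);

// Returns the live bytes to use when deciding the remembered set state of r.
// live_words[i] holds the marked live words attributed to region i.
std::size_t live_bytes_for(const Region& r,
                           const std::vector<std::size_t>& live_words) {
  // Humongous regions (start and continues alike) share the start region's
  // liveness, so both always see the same value.
  const Region& hr = r.humongous ? *r.humongous_start : r;
  return live_words[hr.index] * kHeapWordSize;
}
```

Because both kinds of humongous region resolve to the same table entry, two threads evaluating a start region and one of its continues regions cannot arrive at different liveness values, which is the inconsistency the bug describes.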

So my master plan ;) had been to have the incremental mixed gcs ready
by jdk11 and then fairly easily implement sharing of remembered sets
between multiple regions. That not only solves this issue, but also
decreases remembered set overhead quite a bit in many dimensions (e.g.
if we process old gen regions during mixed gc in increments >> 1 anyway,
why provide that possibility; of course there are some drawbacks to that
in reducing flexibility that is not used at the moment anyway).

(Exactly the same thought came up when talking to ErikD about this
change).

There is a new webrev at
http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1 (full)
http://cr.openjdk.java.net/~tschatzl/8205426/webrev.0_to_1 (diff, but
almost useless due to many changes)

That at least separates the concerns about humongous/regular region a
bit.

Thanks,
Thomas

From thomas.schatzl at oracle.com Thu Jul 5 07:54:42 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 05 Jul 2018 09:54:42 +0200
Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
In-Reply-To: <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com>
References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com>
Message-ID: <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com>

Hi,

On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote:
> > On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote:
> >
> > Hi,
> >
> > Please review this small enhancement based on paper [1], that keeps
> > the last successfully stolen queue as one of best-of-2 candidates
> > for work stealing.
> >
> > Based on experiments done by Thomas Schatzl and myself, it shows
> > positive impacts on task termination and average pause time.
> >
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8205921
> > Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.html
> >
> >
> > Test:
> > hotspot_gc on Linux 64 (fastdebug and release)
> >
> >
> > [1] Characterizing and Optimizing Hotspot Parallel Garbage
> > Collection on Multicore Systems
> > http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf
> >
> > Thanks,
> >
> > -Zhengyu
>
> Once set, _last_stolen_queues entries are never invalidated. So we
> may as well initialize the entries to queue_num+1 mod num_queues.
> Then get rid of the is_valid test (and the whole notion of validity)
> and the (only used once per queue_num in the webrev change) random
> selection of k1.
>
> But I think that might not be desirable. The webrev change's
> behavior is to always use the queue chosen for the last steal attempt
> as one of the two, even if the last steal attempt failed. And
> because the choice of which of the two to try next prefers that one
> when they are both empty, we may be reduced to searching with only
> one random choice for a while, even though the one we keep using has
> repeatedly failed to yield a result.
>
> An alternative that might be better is, whenever a pop_global fails,
> reset the associated last_stolen id to invalid. This will revert to
> 2 random choices until we find (at least) one with something we can
> steal. Actually, it seems the referenced paper does something
> similar, and the webrev code doesn't match the referenced paper.

That may explain why my perf results differ from the paper's, which I
was planning to investigate :) Nice find.

> Why do the last_queue array entries need to be padded? Why not just
> add a _last_stolen_queue member to TaskQueueSuper?

The _last_stolen_queue is associated with a (stealing) thread, not the
queue. Multiple threads might have the same queue as their current
steal target.
One other option I discussed, instead of this array of PaddedQueueId
(which I would rename to TaskQueueThreadLocal or
TaskQueueStealLocals/Context because I can see adding more in the
future), would be to pass this around like the seed parameter to
steal_best_of_2 (and actually put the seed parameter in there too).

It's a bit weird to me to pass two different kinds of thread locals
related to work stealing in two different ways.

The padding is there to avoid potential false sharing issues, as
otherwise the last_stolen_ids of different threads end up on the same
cache line, and the writes of different threads to disjoint locations
would likely invalidate that cache line all the time. It is just to
avoid a potential performance issue here.

> I think it is a pre-existing bug that GenericTaskQueueSet::_n is of
> type uint, but the associated constructor argument is of type int. I
> think the constructor is wrong in this regard.

- please use CamelCase for the INVALID_QUEUE_ID constant.

- there are some superfluous spaces at the end-of-line, but that would
be flushed out before pushing anyway.

Thanks,
  Thomas

From thomas.schatzl at oracle.com Thu Jul 5 07:57:01 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 05 Jul 2018 09:57:01 +0200
Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list #
In-Reply-To:
References:
Message-ID:

Hi,

On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote:
> Please review this fix of the HeapRegion gtest.
>
> The test modifies a region's "top" to unexpected values without
> ensuring that no allocation might use the region and no GC might run
> while the region is in that invalid state. We solve this by
> executing the test code in its very own safepoint, and by saving and
> then restoring the region's top back to its original value before
> completing the test.
> And since we are doing all that, there's no
> longer any reason to run the test in a separate VM.

looks good, but the actual test is still run in a separate VM.
Intentional?

Thanks,
  Thomas

From rkennke at redhat.com Thu Jul 5 14:42:19 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 5 Jul 2018 16:42:19 +0200
Subject: RFR: JDK-8206407: Primitive atomic_cmpxchg_in_heap_at() in BarrierSet::Access needs to call non-oop raw method
Message-ID: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>

BarrierSet::Access:atomic_cmpxchg_in_heap_at() is currently calling
Raw::oop_atomic_cmpxchg_at() which is obviously wrong.

We've been lucky because primitive is not bound in default OpenJDK.

Even in Shenandoah land we've been lucky because primitives don't match
narrowOop and thus don't get (attempted to) encoded/decoded.

Lucky us.

Let's fix it anyway:
http://cr.openjdk.java.net/~rkennke/JDK-8206407/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8206407

Can I get reviews?

Roman

From shade at redhat.com Thu Jul 5 14:44:43 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 5 Jul 2018 16:44:43 +0200
Subject: RFR: JDK-8206407: Primitive atomic_cmpxchg_in_heap_at() in BarrierSet::Access needs to call non-oop raw method
In-Reply-To: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>
References: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>
Message-ID: <894b27f7-2eae-ff9f-6a0a-48fcac07a48a@redhat.com>

On 07/05/2018 04:42 PM, Roman Kennke wrote:
> BarrierSet::Access:atomic_cmpxchg_in_heap_at() is currently calling
> Raw::oop_atomic_cmpxchg_at() which is obviously wrong.
>
> We've been lucky because primitive is not bound in default OpenJDK.
>
> Even in Shenandoah land we've been lucky because primitives don't match
> narrowOop and thus don't get (attempted to) encoded/decoded.
>
> Lucky us.
>
> Let's fix it anyway:
> http://cr.openjdk.java.net/~rkennke/JDK-8206407/webrev.00/

Fix looks good to me.

-Aleksey

From per.liden at oracle.com Thu Jul 5 14:58:43 2018
From: per.liden at oracle.com (Per Liden)
Date: Thu, 5 Jul 2018 16:58:43 +0200
Subject: RFR: JDK-8206407: Primitive atomic_cmpxchg_in_heap_at() in BarrierSet::Access needs to call non-oop raw method
In-Reply-To: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>
References: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>
Message-ID: <42e239d7-3006-a887-e81e-4fbeae80e2be@oracle.com>

On 07/05/2018 04:42 PM, Roman Kennke wrote:
> BarrierSet::Access:atomic_cmpxchg_in_heap_at() is currently calling
> Raw::oop_atomic_cmpxchg_at() which is obviously wrong.
>
> We've been lucky because primitive is not bound in default OpenJDK.
>
> Even in Shenandoah land we've been lucky because primitives don't match
> narrowOop and thus don't get (attempted to) encoded/decoded.
>
> Lucky us.
>
> Let's fix it anyway:
> http://cr.openjdk.java.net/~rkennke/JDK-8206407/webrev.00/

Nice catch! Looks good!

/Per

>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8206407
>
> Can I get reviews?
>
> Roman
>

From rkennke at redhat.com Thu Jul 5 15:00:38 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 5 Jul 2018 17:00:38 +0200
Subject: RFR: JDK-8206407: Primitive atomic_cmpxchg_in_heap_at() in BarrierSet::Access needs to call non-oop raw method
In-Reply-To: <42e239d7-3006-a887-e81e-4fbeae80e2be@oracle.com>
References: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>
 <42e239d7-3006-a887-e81e-4fbeae80e2be@oracle.com>
Message-ID: <3513ec0b-45bb-8983-78be-1cfb81ac8e5c@redhat.com>

Hi Per,

>> BarrierSet::Access:atomic_cmpxchg_in_heap_at() is currently calling
>> Raw::oop_atomic_cmpxchg_at() which is obviously wrong.
>>
>> We've been lucky because primitive is not bound in default OpenJDK.
>>
>> Even in Shenandoah land we've been lucky because primitives don't match
>> narrowOop and thus don't get (attempted to) encoded/decoded.
>>
>> Lucky us.
>>
>> Let's fix it anyway:
>> http://cr.openjdk.java.net/~rkennke/JDK-8206407/webrev.00/
>
> Nice catch! Looks good!

Thanks for reviewing!

Does it qualify for the trivial-doesn't-have-to-wait-24h rule? I
believe it does; it's 1 line that is not even touched by default.

Roman

From rkennke at redhat.com Thu Jul 5 15:12:55 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 5 Jul 2018 17:12:55 +0200
Subject: RFR: JDK-8206272: Remove stray BarrierSetAssembler call
Message-ID: <48d19ab8-25d1-7a94-c015-8db7b30fc1aa@redhat.com>

And while we are at trivial fixes: we have a call to get a
BarrierSetAssembler* in methodHandles_x86.cpp that is subsequently not
used anywhere.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8206272
Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8206272/webrev.00/

I assume this qualifies as trivial?

Roman

From shade at redhat.com Thu Jul 5 15:17:44 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 5 Jul 2018 17:17:44 +0200
Subject: RFR: JDK-8206272: Remove stray BarrierSetAssembler call
In-Reply-To: <48d19ab8-25d1-7a94-c015-8db7b30fc1aa@redhat.com>
References: <48d19ab8-25d1-7a94-c015-8db7b30fc1aa@redhat.com>
Message-ID:

On 07/05/2018 05:12 PM, Roman Kennke wrote:
> and while we are at trivial fixes, we've a call to get a
> BarrierSetAssembler* in methodHandles_x86.cpp that is subsequently not
> used anywhere.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8206272
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8206272/webrev.00/
>
> I assume this qualifies as trivial?

I think so. Looks good!

-Aleksey

From per.liden at oracle.com Thu Jul 5 15:24:48 2018
From: per.liden at oracle.com (Per Liden)
Date: Thu, 5 Jul 2018 17:24:48 +0200
Subject: RFR: JDK-8206272: Remove stray BarrierSetAssembler call
In-Reply-To: <48d19ab8-25d1-7a94-c015-8db7b30fc1aa@redhat.com>
References: <48d19ab8-25d1-7a94-c015-8db7b30fc1aa@redhat.com>
Message-ID:

On 07/05/2018 05:12 PM, Roman Kennke wrote:
> and while we are at trivial fixes, we've a call to get a
> BarrierSetAssembler* in methodHandles_x86.cpp that is subsequently not
> used anywhere.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8206272
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8206272/webrev.00/
>
> I assume this qualifies as trivial?

Yep, looks good and trivial to me.

/Per

>
> Roman
>

From per.liden at oracle.com Thu Jul 5 15:26:01 2018
From: per.liden at oracle.com (Per Liden)
Date: Thu, 5 Jul 2018 17:26:01 +0200
Subject: RFR: JDK-8206407: Primitive atomic_cmpxchg_in_heap_at() in BarrierSet::Access needs to call non-oop raw method
In-Reply-To: <3513ec0b-45bb-8983-78be-1cfb81ac8e5c@redhat.com>
References: <7bf834d8-3fe3-9458-21a1-02eac7e86897@redhat.com>
 <42e239d7-3006-a887-e81e-4fbeae80e2be@oracle.com>
 <3513ec0b-45bb-8983-78be-1cfb81ac8e5c@redhat.com>
Message-ID: <881e0ffd-4fbc-d7ad-ba2c-8b66b12e11f5@oracle.com>

On 07/05/2018 05:00 PM, Roman Kennke wrote:
> Hi Per,
>
>>> BarrierSet::Access:atomic_cmpxchg_in_heap_at() is currently calling
>>> Raw::oop_atomic_cmpxchg_at() which is obviously wrong.
>>>
>>> We've been lucky because primitive is not bound in default OpenJDK.
>>> >>> Even in Shenandoah land we've been lucky because primitives don't match >>> narrowOop and thus don't get (attempted to) encoded/decoded. >>> >>> Lucky us. >>> >>> Let's fix it anyway: >>> http://cr.openjdk.java.net/~rkennke/JDK-8206407/webrev.00/ >> >> Nice catch! Looks good! > > Thanks for reviewing! > > Does it qualify for trivial-doesn't-have-to-wait-24h-rule? I believe it > does, it's 1 line that is not even touched by default. Fine with me (assuming it still passes tier1). /Per > > Roman > From zgu at redhat.com Thu Jul 5 16:08:55 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Jul 2018 12:08:55 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> Message-ID: <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> Hi Kim and Thomas, Thanks for reviewing. On 07/05/2018 03:54 AM, Thomas Schatzl wrote: > Hi, > > On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>> >>> Hi, >>> >>> Please review this small enhancement base on paper [1], that keeps >>> the last successfully stolen queue as one of best-of-2 candidates >>> for work stealing. >>> >>> Based on experiments done by Thomas Schatzl and myself, it shows >>> positive impacts on task termination and average pause time. 
>>> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921 >>> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.htm >>> l >>> >>> >>> Test: >>> hotspot_gc on Linux 64 (fastdebug and release) >>> >>> >>> [1] Characterizing and Optimizing Hotspot Parallel Garbage >>> Collection on Multicore Systems >>> http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf >>> >>> Thanks, >>> >>> -Zhengyu >> >> Once set, _last_stolen_queues entries are never invalidated. So we >> may as well initialize the entries to queue_num+1 mod num_queues. >> Then get rid of the is_valid test (and the whole notion of validity) >> and the (only used once per queue_num in the webrev change) random >> selection of k1. >> >> But I think that might not be desirable. The webrev change's >> behavior is to always use the queue chosen for the last steal attempt >> as one of the two, even if the last steal attempt failed. And >> because the choice of which of the two to try next prefers that one >> when they are both empty, we may be reduced to searching with only >> one random choice for a while, even though the one we keep using has >> repeatedly failed to yield a result. >> >> An alternative that might be better is, whenever a pop_global fails, >> reset the associated last_stolen id to invalid. This will revert to >> 2 random choices until we find (at least) one with something we can >> steal. Actually, it seems the referenced paper does something >> similar, and the webrev code doesn't match the referenced paper. > > That may explain why my perf results are different to the paper that I > was planning to investigate :) Nice find. Sorry, my bad. > >> Why do the last_queue array entries need to be padded? Why not just >> add a _last_stolen_queue member to TaskQueueSuper? > > The _last_stolen_queue is associated to a (stealing) thread, not the > queue. Multiple threads might have the same queue as current steal > target. 
> > One other option I discussed is instead of this array of PaddedQueueId > (which I would rename as TaskQueueThreadLocal or > TaskQueueStealLocals/Context because I can see adding more in the > future) would be passing this around like the seed parameter to > steal_best_of_2 (and actually put the seed parameter in there too). > > It's a bit weird to me to pass two different kinds of thread locals > related to work stealing two different ways. I would prefer to pass down TaskQueueStealContext, just like seed, to avoid this padded queue id array. However, it means that we have to update all call sites, which I am not comfortable to do at this time. Could we make this a future item? Updated webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html Thanks, -Zhengyu > > The padding is to avoid potential false sharing issues as otherwise > last_stolen_id's of different threads end up on the same cache line. > And the writes of different threads to disjoint locations would likely > invalidate the cache line all the time. Just to avoid a potential > performance issue here. > >> I think it is a pre-existing bug that GenericTaskQueueSet::_n is of >> type uint, but the associated constructor argument is of type int. I >> think the constructor is wrong in this regard. > > > - please use CamelCase for the INVALID_QUEUE_ID constant. > > - there are some superfluous spaces at the end-of-line, but that would > be flushed out before pushing anyway. 
> > Thanks, > Thomas > From kim.barrett at oracle.com Thu Jul 5 17:33:09 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 13:33:09 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> Message-ID: <39B54C1B-815D-4A24-A4F5-F3660FE63E05@oracle.com> > On Jul 5, 2018, at 3:54 AM, Thomas Schatzl wrote: > > Hi, > > On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>> >>> [?] >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921 >>> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.htm >>> l >>> [?] >> Why do the last_queue array entries need to be padded? Why not just >> add a _last_stolen_queue member to TaskQueueSuper? > > The _last_stolen_queue is associated to a (stealing) thread, not the > queue. Multiple threads might have the same queue as current steal > target. The stealing thread should use its own queue to obtain and record this value, e.g. _queues[queue_num]->_last_stolen_queue It seems to me the random seed could also be there, addressing your other complaint (below). That might have false sharing issues with the volatile members of the queue, but the existing _elems member have similar issues. Maybe the volatile queue members ought to be padded? > One other option I discussed is instead of this array of PaddedQueueId > (which I would rename as TaskQueueThreadLocal or > TaskQueueStealLocals/Context because I can see adding more in the > future) would be passing this around like the seed parameter to > steal_best_of_2 (and actually put the seed parameter in there too). > > It's a bit weird to me to pass two different kinds of thread locals > related to work stealing two different ways. 
> > The padding is to avoid potential false sharing issues as otherwise > last_stolen_id's of different threads end up on the same cache line. > And the writes of different threads to disjoint locations would likely > invalidate the cache line all the time. Just to avoid a potential > performance issue here. From zgu at redhat.com Thu Jul 5 18:44:30 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Jul 2018 14:44:30 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <39B54C1B-815D-4A24-A4F5-F3660FE63E05@oracle.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <39B54C1B-815D-4A24-A4F5-F3660FE63E05@oracle.com> Message-ID: <78c346e6-6c36-243a-e84d-16b2cee458d7@redhat.com> Hi Kim, On 07/05/2018 01:33 PM, Kim Barrett wrote: >> On Jul 5, 2018, at 3:54 AM, Thomas Schatzl wrote: >> >> Hi, >> >> On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>>> >>>> [?] >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921 >>>> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.htm >>>> l >>>> [?] >>> Why do the last_queue array entries need to be padded? Why not just >>> add a _last_stolen_queue member to TaskQueueSuper? >> >> The _last_stolen_queue is associated to a (stealing) thread, not the >> queue. Multiple threads might have the same queue as current steal >> target. > > The stealing thread should use its own queue to obtain and record this > value, e.g. > > _queues[queue_num]->_last_stolen_queue > > It seems to me the random seed could also be there, addressing your > other complaint (below). > Is it a bit weird to have these two fields in queue? given they have nothing to do with queue itself? I intended to use thread local for last_stolen_queue in Shenandoah, since we do have extra spaces in GCThreadLocalData. 
> That might have false sharing issues with the volatile members of the > queue, but the existing _elems member have similar issues. Maybe the > volatile queue members ought to be padded? I can see we might need to pad Age and bottom. But I don't understand why _elems member has similar issues, could you explain? Thanks, -Zhengyu > >> One other option I discussed is instead of this array of PaddedQueueId >> (which I would rename as TaskQueueThreadLocal or >> TaskQueueStealLocals/Context because I can see adding more in the >> future) would be passing this around like the seed parameter to >> steal_best_of_2 (and actually put the seed parameter in there too). >> >> It's a bit weird to me to pass two different kinds of thread locals >> related to work stealing two different ways. >> >> The padding is to avoid potential false sharing issues as otherwise >> last_stolen_id's of different threads end up on the same cache line. >> And the writes of different threads to disjoint locations would likely >> invalidate the cache line all the time. Just to avoid a potential >> performance issue here. > From kim.barrett at oracle.com Thu Jul 5 18:49:46 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 14:49:46 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> Message-ID: <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> > On Jul 5, 2018, at 12:08 PM, Zhengyu Gu wrote: > > Hi Kim and Thomas, > > Thanks for reviewing. > > On 07/05/2018 03:54 AM, Thomas Schatzl wrote: >> Hi, >> On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>> [?] 
>>> An alternative that might be better is, whenever a pop_global fails,
>>> reset the associated last_stolen id to invalid. This will revert to
>>> 2 random choices until we find (at least) one with something we can
>>> steal. Actually, it seems the referenced paper does something
>>> similar, and the webrev code doesn't match the referenced paper.
>> That may explain why my perf results are different to the paper that I
>> was planning to investigate :) Nice find.
>
> Sorry, my bad.
>
> [...]
> Updated webrev:
>
> http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html

src/hotspot/share/gc/shared/taskqueue.inline.hpp
 255     if (sz2 > sz1) {
 256       sel_k = k2;
 257       suc = _queues[k2]->pop_global(t);
 258     } else {
 259       sel_k = k1;
 260       suc = _queues[k1]->pop_global(t);
 261     }

The paper avoids the steal attempt when both potential victims have a
size of zero, e.g. insert another clause:

    } else if (sz1 == 0) {
      sel_k = k1; // Might be needed to avoid uninitialized variable warnings?
      suc = false;
    } else {
      ...

There is a race condition between obtaining the size and checking it
here, but I don't think that's important. The point is to avoid an
expensive steal attempt when it is very likely to fail.

From zgu at redhat.com Thu Jul 5 19:16:06 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 5 Jul 2018 15:16:06 -0400
Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
In-Reply-To: <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com>
References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com>
 <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com>
 <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com>
 <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com>
 <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com>
Message-ID: <75070643-0492-c620-eb11-3225506ef1b7@redhat.com>

On 07/05/2018 02:49 PM, Kim Barrett wrote:
>> On Jul 5, 2018, at 12:08 PM, Zhengyu Gu wrote:
>>
>> Hi Kim and Thomas,
>>
>> Thanks for reviewing.
>> >> On 07/05/2018 03:54 AM, Thomas Schatzl wrote: >>> Hi, >>> On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>>>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>>> [?] > >>>> An alternative that might be better is, whenever a pop_global fails, >>>> reset the associated last_stolen id to invalid. This will revert to >>>> 2 random choices until we find (at least) one with something we can >>>> steal. Actually, it seems the referenced paper does something >>>> similar, and the webrev code doesn't match the referenced paper. >>> That may explain why my perf results are different to the paper that I >>> was planning to investigate :) Nice find. >> >> Sorry, my bad. >> >> [?] >> Updated webrev: >> >> http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html > > src/hotspot/share/gc/shared/taskqueue.inline.hpp > 255 if (sz2 > sz1) { > 256 sel_k = k2; > 257 suc = _queues[k2]->pop_global(t); > 258 } else { > 259 sel_k = k1; > 260 suc = _queues[k1]->pop_global(t); > 261 } > > The paper avoids the steal attempt when both potential victims have a > size of zero, e.g. insert another clause: > > } else if (sz1 == 0) { > sel_k = k1; // Might be needed to avoid uninitialized variable warnings? > suc = false; > } else { > ... > > There is a race condition between obtaining the size and checking it > here, but I don't think that's important. The point is to avoid an > expensive steal attempt when it is very likely to fail. Yes, I missed this. 
http://cr.openjdk.java.net/~zgu/8205921/webrev.02/index.html Thanks, -Zhengyu > From kim.barrett at oracle.com Thu Jul 5 19:26:04 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 15:26:04 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <78c346e6-6c36-243a-e84d-16b2cee458d7@redhat.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <39B54C1B-815D-4A24-A4F5-F3660FE63E05@oracle.com> <78c346e6-6c36-243a-e84d-16b2cee458d7@redhat.com> Message-ID: > On Jul 5, 2018, at 2:44 PM, Zhengyu Gu wrote: > > Hi Kim, > > On 07/05/2018 01:33 PM, Kim Barrett wrote: >>> On Jul 5, 2018, at 3:54 AM, Thomas Schatzl wrote: >>> >>> Hi, >>> >>> On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>>>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>>>> >>>>> [?] >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921 >>>>> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.htm >>>>> l >>>>> [?] >>>> Why do the last_queue array entries need to be padded? Why not just >>>> add a _last_stolen_queue member to TaskQueueSuper? >>> >>> The _last_stolen_queue is associated to a (stealing) thread, not the >>> queue. Multiple threads might have the same queue as current steal >>> target. >> The stealing thread should use its own queue to obtain and record this >> value, e.g. >> _queues[queue_num]->_last_stolen_queue >> It seems to me the random seed could also be there, addressing your >> other complaint (below). > Is it a bit weird to have these two fields in queue? given they have nothing to do with queue itself? > > I intended to use thread local for last_stolen_queue in Shenandoah, since we do have extra spaces in GCThreadLocalData. I don't think it's weird. 
Both the last steal queue and the random seed are 1:1 associated with a specific queue, and are part of the implementation of operations on the queue. This is a common problem when there is a cooperating pair of class X and class "collection of X". Maybe if steal_best_of_2 were a member function of the queue, rather than implemented by the queue set operating on the data in a selected queue, it might seem more apparent that this information belongs with the queue. >> That might have false sharing issues with the volatile members of the >> queue, but the existing _elems member have similar issues. Maybe the >> volatile queue members ought to be padded? > > I can see we might need to pad Age and bottom. But I don't understand why _elems member has similar issues, could you explain? Unshared _elems may be in the same cache line as shared _age or _bottom, so reads of the _elems member may be impacted by writes to those shared members by other threads. The same is true for any new unshared members we might add. 
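
The false-sharing concern raised in this thread can be illustrated with a minimal standalone C++ sketch. This is not the HotSpot code: the 64-byte line size, `kCacheLineSize`, `PlainQueueId`, `slots_per_cache_line` and this particular definition of `PaddedQueueId` are assumptions made for illustration only, and HotSpot's own padding helpers are deliberately not shown.

```cpp
#include <cstddef>

// Assumed cache line size for illustration; real hardware may differ.
constexpr std::size_t kCacheLineSize = 64;

// Unpadded per-thread slot (hypothetical): several of these fit into one
// cache line, so writes by different threads to neighbouring array elements
// keep invalidating each other's copy of the line.
struct PlainQueueId {
  unsigned id;
};

// Padded per-thread slot (hypothetical definition): alignas rounds the size
// up to a whole cache line, so each array element lives on its own line.
struct alignas(kCacheLineSize) PaddedQueueId {
  unsigned id;
};

// Number of array elements of type T that share one cache line.
template <typename T>
constexpr std::size_t slots_per_cache_line() {
  return sizeof(T) >= kCacheLineSize ? 1 : kCacheLineSize / sizeof(T);
}

static_assert(sizeof(PaddedQueueId) == kCacheLineSize,
              "one padded slot per cache line");
```

The same reasoning applies to an unshared member that sits next to frequently written shared members of one object: the reader pays for the neighbour's writes unless the two are separated by padding.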
From kim.barrett at oracle.com Thu Jul 5 19:33:03 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 15:33:03 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <75070643-0492-c620-eb11-3225506ef1b7@redhat.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <75070643-0492-c620-eb11-3225506ef1b7@redhat.com> Message-ID: > On Jul 5, 2018, at 3:16 PM, Zhengyu Gu wrote: > >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html >> src/hotspot/share/gc/shared/taskqueue.inline.hpp >> 255 if (sz2 > sz1) { >> 256 sel_k = k2; >> 257 suc = _queues[k2]->pop_global(t); >> 258 } else { >> 259 sel_k = k1; >> 260 suc = _queues[k1]->pop_global(t); >> 261 } >> The paper avoids the steal attempt when both potential victims have a >> size of zero, e.g. insert another clause: >> } else if (sz1 == 0) { >> sel_k = k1; // Might be needed to avoid uninitialized variable warnings? >> suc = false; >> } else { >> ... >> There is a race condition between obtaining the size and checking it >> here, but I don't think that's important. The point is to avoid an >> expensive steal attempt when it is very likely to fail. > > Yes, I missed this. > > http://cr.openjdk.java.net/~zgu/8205921/webrev.02/index.html > > Thanks, > > -Zhengyu I think that makes the change accurately reflect the paper. 
Just one minor nit: extraneous whitespace in "0 )":

 258     } else if (sz1 > 0 ) {

From thomas.schatzl at oracle.com Thu Jul 5 19:47:49 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 05 Jul 2018 21:47:49 +0200
Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
In-Reply-To: <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com>
References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com>
 <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com>
 <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com>
 <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com>
 <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com>
Message-ID: <344187d3db4f3a069a70a730cc1c3b6555243f9d.camel@oracle.com>

Hi,

On Thu, 2018-07-05 at 14:49 -0400, Kim Barrett wrote:
> > On Jul 5, 2018, at 12:08 PM, Zhengyu Gu wrote:
> >
> > Hi Kim and Thomas,
> >
> > Thanks for reviewing.
> >
> > On 07/05/2018 03:54 AM, Thomas Schatzl wrote:
> > > Hi,
> > > On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote:
> > > > > On Jun 27, 2018, at 2:39 PM, Zhengyu Gu
> > > > > wrote:
> > > >
> > > > [...]
> > > > An alternative that might be better is, whenever a pop_global
> > > > fails, reset the associated last_stolen id to invalid. This
> > > > will revert to 2 random choices until we find (at least) one
> > > > with something we can steal. Actually, it seems the referenced
> > > > paper does something similar, and the webrev code doesn't match
> > > > the referenced paper.
> > >
> > > That may explain why my perf results are different to the paper
> > > that I was planning to investigate :) Nice find.
> >
> > Sorry, my bad.

I have been looking into this a bit and, with a patch of my own that
also fixes the change and some additional probes (using the
TASKQUEUE_STATS "infrastructure"), I am finally starting to get
meaningful results. More about that later.
In any case the technique looks like a nice improvement at least in steal attempts and steal/steal attempts ratio on some bigger tests, but I need to update my code again it seems :) I can add the changes to the TASKQUEUE_STATS logging later btw. > > > > [?] > > Updated webrev: > > > > http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html > > src/hotspot/share/gc/shared/taskqueue.inline.hpp > 255 if (sz2 > sz1) { > 256 sel_k = k2; > 257 suc = _queues[k2]->pop_global(t); > 258 } else { > 259 sel_k = k1; > 260 suc = _queues[k1]->pop_global(t); > 261 } > > The paper avoids the steal attempt when both potential victims have a > size of zero, e.g. insert another clause: > > } else if (sz1 == 0) { > sel_k = k1; // Might be needed to avoid uninitialized variable > warnings? > suc = false; > } else { > ... > > There is a race condition between obtaining the size and checking it > here, but I don't think that's important. The point is to avoid an > expensive steal attempt when it is very likely to fail. > There is another bug in the existing code: current Hotspot collectors all reuse a single task queue set. So since the queue id's are only initialized once at startup, there will be some initial use of a suboptimal queue. I assume Shenandoah does not need a reset because it creates new taskqueuesets whenever it needs them (and frees them afterwards). The current design of passing stealing-local information (the seed) makes it clear that the owner of that variable needs to initialize it. At this time I have no preference on Kim's suggestion to put these variables into the queue if you asked me. I would tend to encapsulate the mechanism as much as possible though. I do think if Shenandoah wants to put this information into the GCThreadLocalBlock (for what reason?) it is probably most flexible to pass these things as kind of context to the steal_best_of_2() method. 
I do not think it is desirable to have a second copy of the taskqueue* code around; I can't see how else one implementation can use the TaskQueueSet local queue ids and the other use the same information from somewhere else right now. Thanks, Thomas From zgu at redhat.com Thu Jul 5 19:56:54 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Jul 2018 15:56:54 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <39B54C1B-815D-4A24-A4F5-F3660FE63E05@oracle.com> <78c346e6-6c36-243a-e84d-16b2cee458d7@redhat.com> Message-ID: <196db311-d3d2-de16-9af2-593d82b17a29@redhat.com> On 07/05/2018 03:26 PM, Kim Barrett wrote: >> On Jul 5, 2018, at 2:44 PM, Zhengyu Gu wrote: >> >> Hi Kim, >> >> On 07/05/2018 01:33 PM, Kim Barrett wrote: >>>> On Jul 5, 2018, at 3:54 AM, Thomas Schatzl wrote: >>>> >>>> Hi, >>>> >>>> On Thu, 2018-07-05 at 01:15 -0400, Kim Barrett wrote: >>>>>> On Jun 27, 2018, at 2:39 PM, Zhengyu Gu wrote: >>>>>> >>>>>> [?] >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205921 >>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8205921/webrev.00/index.htm >>>>>> l >>>>>> [?] >>>>> Why do the last_queue array entries need to be padded? Why not just >>>>> add a _last_stolen_queue member to TaskQueueSuper? >>>> >>>> The _last_stolen_queue is associated to a (stealing) thread, not the >>>> queue. Multiple threads might have the same queue as current steal >>>> target. >>> The stealing thread should use its own queue to obtain and record this >>> value, e.g. >>> _queues[queue_num]->_last_stolen_queue >>> It seems to me the random seed could also be there, addressing your >>> other complaint (below). >> Is it a bit weird to have these two fields in queue? given they have nothing to do with queue itself? 
>> >> I intended to use thread local for last_stolen_queue in Shenandoah, since we do have extra spaces in GCThreadLocalData. > > I don't think it's weird. Both the last steal queue and the random > seed are 1:1 associated with a specific queue, and are part of the > implementation of operations on the queue. This is a common problem > when there is a cooperating pair of class X and class "collection of > X". Maybe if steal_best_of_2 were a member function of the queue, > rather than implemented by the queue set operating on the data in a > selected queue, it might seem more apparent that this information > belongs with the queue. > >>> That might have false sharing issues with the volatile members of the >>> queue, but the existing _elems member have similar issues. Maybe the >>> volatile queue members ought to be padded? >> >> I can see we might need to pad Age and bottom. But I don't understand why _elems member has similar issues, could you explain? > > Unshared _elems may be in the same cache line as shared _age or > _bottom, so reads of the _elems member may be impacted by writes to > those shared members by other threads. The same is true for any new > unshared members we might add. Ah, I thought _elems is from additional allocation, it is not a concern, but I guess there is still a chance. 
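To make the false-sharing concern concrete, here is a minimal sketch of what padding the shared queue members could look like. It is not taken from the webrev: the struct, the member names (modeled loosely on _bottom, _age, _elems and the proposed last-stolen id and seed) and the 64-byte line size are all assumptions for illustration only.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache line size; a placeholder, not a value taken from HotSpot.
constexpr std::size_t kCacheLine = 64;

// Hypothetical layout: each shared (written by other threads) member gets
// its own cache line, and the owner-only members start on a fresh line.
struct PaddedQueueFields {
  alignas(kCacheLine) std::atomic<unsigned> bottom{0};  // shared
  alignas(kCacheLine) std::atomic<unsigned> age{0};     // shared
  alignas(kCacheLine) int* elems = nullptr;             // owner-only
  int last_stolen_queue_id = -1;                        // owner-only
  int seed = 17;                                        // owner-only
};

// Returns the index of the cache line (relative to base) holding member.
inline std::size_t line_of(const void* base, const void* member) {
  std::ptrdiff_t off =
      static_cast<const char*>(member) - static_cast<const char*>(base);
  assert(off >= 0);
  return static_cast<std::size_t>(off) / kCacheLine;
}
```

With this layout, writes to bottom or age by a stealing thread do not invalidate the line holding elems, last_stolen_queue_id or seed; the cost is a larger per-queue footprint.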
Thanks, -Zhengyu > From zgu at redhat.com Thu Jul 5 19:57:30 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Jul 2018 15:57:30 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <75070643-0492-c620-eb11-3225506ef1b7@redhat.com> Message-ID: On 07/05/2018 03:33 PM, Kim Barrett wrote: >> On Jul 5, 2018, at 3:16 PM, Zhengyu Gu wrote: >> >>>> Updated webrev: >>>> >>>> http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html >>> src/hotspot/share/gc/shared/taskqueue.inline.hpp >>> 255 if (sz2 > sz1) { >>> 256 sel_k = k2; >>> 257 suc = _queues[k2]->pop_global(t); >>> 258 } else { >>> 259 sel_k = k1; >>> 260 suc = _queues[k1]->pop_global(t); >>> 261 } >>> The paper avoids the steal attempt when both potential victims have a >>> size of zero, e.g. insert another clause: >>> } else if (sz1 == 0) { >>> sel_k = k1; // Might be needed to avoid uninitialized variable warnings? >>> suc = false; >>> } else { >>> ... >>> There is a race condition between obtaining the size and checking it >>> here, but I don't think that's important. The point is to avoid an >>> expensive steal attempt when it is very likely to fail. >> >> Yes, I missed this. >> >> http://cr.openjdk.java.net/~zgu/8205921/webrev.02/index.html >> >> Thanks, >> >> -Zhengyu > > I think that makes the change accurately reflect the paper. > > Just one minor nit: extraneous whitespace in "0 )": > 258 } else if (sz1 > 0 ) { > I will fix it before push. Thanks a lot!
-Zhengyu > From kim.barrett at oracle.com Thu Jul 5 20:03:27 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 16:03:27 -0400 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: References: Message-ID: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> > On Jul 5, 2018, at 3:57 AM, Thomas Schatzl wrote: > > Hi, > > On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote: >> Please review this fix of the HeapRegion gtest. >> >> The test modifies a region's "top" to unexpected values without >> ensuring that no allocation might use the region and no GC might run >> while the region is in that invalid state. We solve this by >> executing the test code in its very own safepoint, and by saving and >> then restoring the region's top back to its original value before >> completing the test. And since we are doing all that, there's no >> longer any reason to run the test in a separate VM. > > looks good, but the actual test is still run in a separate VM. > Intentional? Unintentional. And now I'm not sure what I last ran through mach5. I'll re-test with TEST_OTHER_VM => TEST_VM. I know that failed in an obscure way earlier, but I think that was because of an unrelated recently introduced bug that's been fixed in the repo.
From kim.barrett at oracle.com Thu Jul 5 20:12:20 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 16:12:20 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <75070643-0492-c620-eb11-3225506ef1b7@redhat.com> Message-ID: > On Jul 5, 2018, at 3:57 PM, Zhengyu Gu wrote: > On 07/05/2018 03:33 PM, Kim Barrett wrote: >>> On Jul 5, 2018, at 3:16 PM, Zhengyu Gu wrote: >>> […] >>> http://cr.openjdk.java.net/~zgu/8205921/webrev.02/index.html >>> >>> Thanks, >>> >>> -Zhengyu >> I think that makes the change accurately reflect the paper. >> Just one minor nit: extraneous whitespace in "0 )": >> 258 } else if (sz1 > 0 ) { > I will fix it before push. > > Thanks a lot! > > -Zhengyu In case there's any confusion, that wasn't a "Looks good. Reviewed." There's still the padding and where to put the last steal queue discussions to be resolved.
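For readers following the thread, the victim selection under review (pick the larger of two candidate queues, and skip the steal attempt entirely when both look empty) can be sketched roughly as follows. This is a single-threaded toy, not the HotSpot code: ToyQueue, StealResult and steal_from_larger are hypothetical names, and the race between reading a size and acting on it is deliberately not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for a task queue; only size() and pop_global()
// are modeled, with no synchronization.
struct ToyQueue {
  std::deque<int> tasks;
  std::size_t size() const { return tasks.size(); }
  bool pop_global(int& t) {
    if (tasks.empty()) return false;
    t = tasks.front();
    tasks.pop_front();
    return true;
  }
};

// Outcome of one selection: whether a task was stolen and which victim
// index was selected (the sel_k of the quoted snippet).
struct StealResult {
  bool success;
  std::size_t victim;
};

// Chooses between candidates k1 and k2 by size. When both sizes are zero,
// no pop_global() is attempted, since it would very likely fail.
inline StealResult steal_from_larger(std::vector<ToyQueue>& queues,
                                     std::size_t k1, std::size_t k2, int& t) {
  assert(k1 < queues.size() && k2 < queues.size());
  std::size_t sz1 = queues[k1].size();
  std::size_t sz2 = queues[k2].size();
  if (sz2 > sz1) {
    return { queues[k2].pop_global(t), k2 };
  } else if (sz1 > 0) {
    return { queues[k1].pop_global(t), k1 };
  } else {
    return { false, k1 };  // both empty: skip the steal attempt
  }
}
```

In the real code the two indices would come from the remembered last-stolen id and a random draw; how those are stored (in the queue, or passed in as context) is exactly the open question above.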
From kim.barrett at oracle.com Thu Jul 5 20:53:47 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 5 Jul 2018 16:53:47 -0400 Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink In-Reply-To: <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com> References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com> <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com> Message-ID: <68B2A585-08B4-4741-93C6-B68D3CC801CA@oracle.com> > On Jul 5, 2018, at 3:16 AM, Thomas Schatzl wrote: > There is a new webrev at > > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1 (full) > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.0_to_1 (diff, but > almost useless due to many changes) > > That at least separates the concerns about humongous/regular region a > bit. > > Thanks, > Thomas I like this much better. It eliminates the implicit logical coupling that the before rebuild task "knows" the liveness of the starts region is good enough, without introducing physical coupling from remset to concurrentmark. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp 116 if (!r->is_old() && r->is_archive()) { I think that should be || rather than &&. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp 111 bool G1RemSetTrackingPolicy::update_before_rebuild(HeapRegion* r, size_t live_bytes) { Consider adding "assert(!r->is_humongous(), ...)". The !r->is_old() will filter them out, but we shouldn't be here at all and should have instead called the associated update_humongous function. 
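The difference between the two conditions can be seen in a small stand-alone sketch. It assumes the branch at line 116 is an early exit for regions that should not be considered for rebuild (the webrev body is not quoted here, so that is a guess), and ToyRegion, skip_with_and and skip_region are hypothetical names, not G1 code.

```cpp
#include <cassert>

// Hypothetical stand-in for HeapRegion; only the three type queries
// mentioned in the review comments are modeled.
struct ToyRegion {
  bool old;
  bool archive;
  bool humongous;
  bool is_old() const { return old; }
  bool is_archive() const { return archive; }
  bool is_humongous() const { return humongous; }
};

// Condition as quoted from the webrev: true only for regions that are
// simultaneously non-old and archive.
inline bool skip_with_and(const ToyRegion& r) {
  return !r.is_old() && r.is_archive();
}

// Condition as suggested in the review, with the proposed assert: true
// for any region that is non-old or archive.
inline bool skip_region(const ToyRegion& r) {
  assert(!r.is_humongous() && "humongous regions take the separate update path");
  return !r.is_old() || r.is_archive();
}
```

With &&, a plain young region and an old archive region both fall through to the selection logic; with ||, both are filtered out and only non-archive old regions remain.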
------------------------------------------------------------------------------ From zgu at redhat.com Thu Jul 5 23:03:29 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Jul 2018 19:03:29 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <75070643-0492-c620-eb11-3225506ef1b7@redhat.com> Message-ID: <7ba92840-d12f-0f4c-2b02-21ca690c3ca4@redhat.com> On 07/05/2018 04:12 PM, Kim Barrett wrote: >> On Jul 5, 2018, at 3:57 PM, Zhengyu Gu wrote: >> On 07/05/2018 03:33 PM, Kim Barrett wrote: >>>> On Jul 5, 2018, at 3:16 PM, Zhengyu Gu wrote: >>>> […] >>>> http://cr.openjdk.java.net/~zgu/8205921/webrev.02/index.html >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>> I think that makes the change accurately reflect the paper. >>> Just one minor nit: extraneous whitespace in "0 )": >>> 258 } else if (sz1 > 0 ) { >> I will fix it before push. >> >> Thanks a lot! >> >> -Zhengyu > > In case there's any confusion, that wasn't a "Looks good. Reviewed." There's still the > padding and where to put the last steal queue discussions to be resolved. > Got it. I am fine with placing last steal queue inside stealing thread's queue. However, I think padding fields is beyond this RFE, we should file new one to address this issue.
Thanks, -Zhengyu From zgu at redhat.com Thu Jul 5 23:22:59 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 5 Jul 2018 19:22:59 -0400 Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <344187d3db4f3a069a70a730cc1c3b6555243f9d.camel@oracle.com> References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <344187d3db4f3a069a70a730cc1c3b6555243f9d.camel@oracle.com> Message-ID: <61def7df-d441-4a65-ea04-18e282b94db9@redhat.com> Hi Thomas, > > I have been looking into this a bit and finally (with some patch from > me that fixes the changes too) and some additional probes (using the > TASKQUEUE_STATS "infrastructure") I am starting to get meaningful > results. > > More about that later. > > In any case the technique looks like a nice improvement at least in > steal attempts and steal/steal attempts ratio on some bigger tests, but > I need to update my code again it seems :) > > I can add the changes to the TASKQUEUE_STATS logging later btw. Great! Looking forward to seeing the results. > >>> >>> [?] >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~zgu/8205921/webrev.01/index.html >> >> src/hotspot/share/gc/shared/taskqueue.inline.hpp >> 255 if (sz2 > sz1) { >> 256 sel_k = k2; >> 257 suc = _queues[k2]->pop_global(t); >> 258 } else { >> 259 sel_k = k1; >> 260 suc = _queues[k1]->pop_global(t); >> 261 } >> >> The paper avoids the steal attempt when both potential victims have a >> size of zero, e.g. insert another clause: >> >> } else if (sz1 == 0) { >> sel_k = k1; // Might be needed to avoid uninitialized variable >> warnings? >> suc = false; >> } else { >> ... >> >> There is a race condition between obtaining the size and checking it >> here, but I don't think that's important. 
The point is to avoid an >> expensive steal attempt when it is very likely to fail. >> > > There is another bug in the existing code: current Hotspot collectors > all reuse a single task queue set. So since the queue id's are only > initialized once at startup, there will be some initial use of a > suboptimal queue. Technically, it is a bug. I doubt it will have material impact, cause the old value probably just as good as next random one. > > I assume Shenandoah does not need a reset because it creates new > taskqueuesets whenever it needs them (and frees them afterwards). > Shenandoah does reuse queues, we added clear() method inside our queue set implementation to clean up queue, overflow queue and buffer, etc. > The current design of passing stealing-local information (the seed) > makes it clear that the owner of that variable needs to initialize it. > > At this time I have no preference on Kim's suggestion to put these > variables into the queue if you asked me. I would tend to encapsulate > the mechanism as much as possible though. > > I do think if Shenandoah wants to put this information into the > GCThreadLocalBlock (for what reason?) it is probably most flexible to > pass these things as kind of context to the steal_best_of_2() method. I > do not think it is desirable to have a second copy of the taskqueue* > code around; I can't see how else one implementation can use the > TaskQueueSet local queue ids and the other use the same information > from somewhere else right now. Similar to what ZGC does, so we can avoid passing worker id and queue, etc. all over the places. We don't want to use gnu style thread-local, so GCThreadLocalBlock is the temporary place until compiler upgrade (?) As I mentioned in early email, I would prefer to pass TaskQueueStealLocals/Context, but I am afraid of venturing into other GCs that I am not familiar with. Thomas, seems you have made other changes/improvements, do you want to take over this RFE? I am fine with either ways. 
Thanks, -Zhengyu > > Thanks, > Thomas >

From ioi.lam at oracle.com Fri Jul 6 00:45:29 2018 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 5 Jul 2018 17:45:29 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> Message-ID: <64c6255c-7e5f-7688-6848-018504f479be@oracle.com>

Hi Jiangli,

Thank you so much for working on this. I think it's great that we can get the start-up improvement by archiving the ModuleDescriptor.

I just have some coding style comments regarding heapShared.cpp. This file contains the code for copying objects and relocating pointers. By its nature, this kind of code is usually complicated, so I think we should try to make it as easy to understand as possible.

[1] HeapShared::walk_from_field_and_archiving:

    This name is not grammatically correct. How about
    HeapShared::archive_reachable_objects_from_static_field

[2] How about changing the parameter field_offset -> static_field_offset

    When I first read the code I was confused whether it's talking
    about static or instance fields. Usually, "field"
    implies instance field, so it's better to specifically
    say "static field".

[3] This code would fail if "f" is already archived.

    473   // get the archived copy of the field referenced object
    474   oop af = MetaspaceShared::archive_heap_object(f, THREAD);
    475   WalkOopAndArchiveClosure walker(1, subgraph_info, f, af);
    476   f->oop_iterate(&walker);

[4] There's duplicated code between walk_from_field_and_archiving and
    WalkOopAndArchiveClosure::do_oop_work

    403   assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k),
    404          "must be the relocated Klass in the shared space");
    405   _subgraph_info->add_subgraph_object_klass(orig_k, relocated_k);

    - vs -

    484   assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k),
    485          "must be the relocated Klass in the shared space");
    486   subgraph_info->add_subgraph_object_klass(orig_k, relocated_k);

[5] This code is also duplicated:

    375   RawAccess::oop_store(new_p, archived);
    376   log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT,
    377             p2i(archived), p2i(new_p));

    - vs -

    395  RawAccess::oop_store(new_p, archived);
    396  log.print("=== store archived " PTR_FORMAT " in " PTR_FORMAT,
    397            p2i(archived), p2i(new_p));

[6] This code, even though it's correct, is hard to understand --
    why are we calculating the distance between the two objects?

    368  size_t delta = pointer_delta((HeapWord*)_archived_referencing_obj,
    369                               (HeapWord*)_orig_referencing_obj);
    370  T* new_p = (T*)((HeapWord*)p + delta);

    I think it would be easier to understand if we change the order of the
    two arithmetic operations:

    // new_p is the address of the same field inside _archived_referencing_obj.
    size_t field_offset_in_bytes = pointer_delta(p, _orig_referencing_obj, 1);
    T* new_p = (T*)(address(_orig_referencing_obj) + field_offset_in_bytes);

[7] I have a hard time understanding this log:

    376   log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT,
    377             p2i(archived), p2i(new_p));

    How about this?

    log.print("--- updated embedded pointer @[" PTR_FORMAT "] => " PTR_FORMAT,
              p2i(new_p), p2i(archived));

For your consideration, I've incorporated my comments above into heapShared.cpp. I've not tested it so it most likely won't build :-(

http://cr.openjdk.java.net/~iklam/misc/heapShared.old.cpp [your version]
http://cr.openjdk.java.net/~iklam/misc/heapShared.new.cpp [my version]

Please take a look and see if you like it.
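The idea behind [6] (compute the field's byte offset within the original object, then apply it to the archived copy) can be shown with a stand-alone sketch. It uses plain C++ pointers rather than HotSpot's oop, address and pointer_delta, and relocated_field_addr and ToyObject are hypothetical names. Note that in this sketch the offset is added to the base of the archived copy, which is what makes the result land inside the archived object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: given the address p of a field inside the original
// object, return the address of the same field inside the archived copy.
template <typename T>
T* relocated_field_addr(T* p, const void* orig_obj, void* archived_obj) {
  // Byte offset of the field from the start of the original object.
  std::ptrdiff_t field_offset_in_bytes =
      reinterpret_cast<const char*>(p) - static_cast<const char*>(orig_obj);
  assert(field_offset_in_bytes >= 0);
  // Same offset, applied to the archived copy.
  return reinterpret_cast<T*>(static_cast<char*>(archived_obj) +
                              field_offset_in_bytes);
}

// Toy object with a header word followed by two fields.
struct ToyObject {
  std::int64_t header;
  int a;
  int b;
};
```

This is arithmetically the same as the original p + (archived - orig) form; only the grouping changes, so that the intermediate value has an obvious meaning.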
Thanks - Ioi On 6/28/18 4:15 PM, Jiangli Zhou wrote: > This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). > > The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. > > The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. > > webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ > RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 > > Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. > > Following are the details of system module archiving, which are duplicated in above bug report. > --------------------------------------------------------------------------------------------------------------------------- > Support archiving system module graph when the initial module is unnamed module from -cp currently. > > Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. 
> > Dump time system module object archiving > ================================= > At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. > > private static SystemModules archivedSystemModules; > private static ModuleFinder archivedSystemModuleFinder; > private static String archivedMainModule; > > The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. > > 1. Starts from a given static field within the Class instance (java mirror). If the static field is a refererence field and points to a non-null java object, proceed to the next step. The static field and it's value is recorded and stored outside the archived mirror. > 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. > 3. Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object. > 4. Updates the pointer in the archived copy of referecing object to point to the current archived object. > 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. > > Runtime initialization from archived system module objects > ============================================ > VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. 
The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. > > If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. > > In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. > > Thanks, > Jiangli > > From jiangli.zhou at Oracle.COM Fri Jul 6 02:38:38 2018 From: jiangli.zhou at Oracle.COM (Jiangli Zhou) Date: Thu, 5 Jul 2018 19:38:38 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <64c6255c-7e5f-7688-6848-018504f479be@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <64c6255c-7e5f-7688-6848-018504f479be@oracle.com> Message-ID: Hi Ioi, Thanks for the review! > On Jul 5, 2018, at 5:45 PM, Ioi Lam wrote: > > Hi Jiangli, > > Thank you so much for working on this. I think it's great that we can get the > start-up improvement by archiving the ModuleDescriptor. > > I just have some coding style comments regarding heapShared.cpp. This file > contains the code for coping objects and relocating pointers. By its nature, > this kind of code is usually complicated, so I think we should try to make > it as easy to understand as possible. > > > [1] HeapShared::walk_from_field_and_archiving: > > This name is not grammatically correct. How about > HeapShared::archive_reachable_objects_from_static_field Sounds good. > > [2] How about changing the parameter field_offset -> static_field_offset > When I first read the code I was confused whether it's talking > about static or instance fields. 
Usually, "field" > implies instance field, so it's better to specifically > say "static field". Ok. > > [3] This code would fail if "f" is already archived. > > 473 // get the archived copy of the field referenced object > 474 oop af = MetaspaceShared::archive_heap_object(f, THREAD); > 475 WalkOopAndArchiveClosure walker(1, subgraph_info, f, af); > 476 f->oop_iterate(&walker); Hmmm, it's possible we might encounter an archived object during reference walking & archiving in future cases. I'll add a check. > > [4] There's duplicated code between walk_from_field_and_archiving and > WalkOopAndArchiveClosure::do_oop_work > > 403 assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k), > 404 "must be the relocated Klass in the shared space"); > 405 _subgraph_info->add_subgraph_object_klass(orig_k, relocated_k); > > - vs - > > 484 assert(relocated_k == MetaspaceShared::get_relocated_klass(orig_k), > 485 "must be the relocated Klass in the shared space"); > 486 subgraph_info->add_subgraph_object_klass(orig_k, relocated_k); I'll move the assert into add_subgraph_object_klass(). > > [5] This code is also duplicated: > > 375 RawAccess::oop_store(new_p, archived); > 376 log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT, > 377 p2i(archived), p2i(new_p)); > > - vs - > > 395 RawAccess::oop_store(new_p, archived); > 396 log.print("=== store archived " PTR_FORMAT " in " PTR_FORMAT, > 397 p2i(archived), p2i(new_p)); The first case is for an existing archived copy and the second is for a newly archived one. The different logging messages are helpful for debugging. Not sure if using a function to encapsulate the store & log is worth it in this case. Any suggestion? > > [6] This code, even though it's correct, is hard to understand -- > why are we calculating the distance between the two objects?
> > 368 size_t delta = pointer_delta((HeapWord*)_archived_referencing_obj, > 369 (HeapWord*)_orig_referencing_obj); > 370 T* new_p = (T*)((HeapWord*)p + delta); > > I think it would be easier to understand if we change the order of the > two arithmetic operations: > > // new_p is the address of the same field inside _archived_referencing_obj. > size_t field_offset_in_bytes = pointer_delta(p, _orig_referencing_obj, 1); > T* new_p = (T*)(address(_orig_referencing_obj) + field_offset_in_bytes); I think this works too. I'll change as you suggested. > > [7] I have a hard time understanding this log: > > 376 log.print("--- archived copy existing, store archived " PTR_FORMAT " in " PTR_FORMAT, > 377 p2i(archived), p2i(new_p)); > > How about this? > > log.print("--- updated embedded pointer @[" PTR_FORMAT "] => " PTR_FORMAT, > p2i(new_p), p2i(archived)); It is for the case where there is an existing copy of the archived object. Maybe "found existing archived copy" would help? > > > For your consideration, I've incorporated my comments above into heapShared.cpp. > I've not tested it so it most likely won't build :-( > > > http://cr.openjdk.java.net/~iklam/misc/heapShared.old.cpp [your version] > http://cr.openjdk.java.net/~iklam/misc/heapShared.new.cpp [my version] > > Please take a look and see if you like it. Thanks a lot! I'll take a look and incorporate your suggestions. Thanks again! Jiangli > > Thanks > - Ioi > > On 6/28/18 4:15 PM, Jiangli Zhou wrote: >> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time.
It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). >> >> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. >> >> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 >> >> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. >> >> Following are the details of system module archiving, which are duplicated in above bug report. >> --------------------------------------------------------------------------------------------------------------------------- >> Support archiving system module graph when the initial module is unnamed module from -cp currently. >> >> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. >> >> Dump time system module object archiving >> ================================= >> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. >> >> private static SystemModules archivedSystemModules; >> private static ModuleFinder archivedSystemModuleFinder; >> private static String archivedMainModule; >> >> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). 
The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references. >> >> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a refererence field and points to a non-null java object, proceed to the next step. The static field and it's value is recorded and stored outside the archived mirror. >> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step. >> 3. Follows all references within the current java object and recursively archive the sub-graph of objects starting from each reference encountered within the object. >> 4. Updates the pointer in the archived copy of referecing object to point to the current archived object. >> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime. >> >> Runtime initialization from archived system module objects >> ============================================ >> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. >> >> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. 
>>
>> In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution.
>>
>> Thanks,
>> Jiangli
>>
>>
>

From erik.helin at oracle.com Fri Jul 6 12:09:59 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 6 Jul 2018 14:09:59 +0200
Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink
In-Reply-To: <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com>
References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com> <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com>
Message-ID: <5345e792-4a60-ea6d-e0b0-79aacae0e484@oracle.com>

On 07/05/2018 09:16 AM, Thomas Schatzl wrote:
> http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1 (full)
> http://cr.openjdk.java.net/~tschatzl/8205426/webrev.0_to_1 (diff, but
> almost useless due to many changes)
>
> That at least separates the concerns about humongous/regular region a
> bit.

This version looks much better to me as well, thanks for refactoring the patch! I agree with Kim's comments and in addition I would also have "inlined" is_interesting_humongous_region into update_humongous_before_rebuild, so the `if` in update_humongous_before_rebuild becomes:

    bool is_type_array = oop(r->humongous_start_region()->bottom())->is_typeArray();
    if (is_live && is_type_array && !r->rem_set()->is_tracked()) {
      r->rem_set()->set_state_updating();
      selected_for_rebuild = true;
    }

This change makes the comment easier to follow (at least for me).

The patch also uses so-called "east-side-const" in g1ConcurrentMark.cpp, but that doesn't matter too much since g1ConcurrentMark.cpp seems to use both "west-side-const" and "east-side-const" in equal proportions (that however should probably be cleaned up).
Thanks, Erik From thomas.schatzl at oracle.com Fri Jul 6 13:10:16 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 06 Jul 2018 15:10:16 +0200 Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink In-Reply-To: <68B2A585-08B4-4741-93C6-B68D3CC801CA@oracle.com> References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com> <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com> <68B2A585-08B4-4741-93C6-B68D3CC801CA@oracle.com> Message-ID: <8f49bc7b4dbdf3255f4c2dc83354101c1b64ae2f.camel@oracle.com> Hi, On Thu, 2018-07-05 at 16:53 -0400, Kim Barrett wrote: > > On Jul 5, 2018, at 3:16 AM, Thomas Schatzl > om> wrote: > > There is a new webrev at > > > > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1 (full) > > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.0_to_1 (diff, > > but > > almost useless due to many changes) > > > > That at least separates the concerns about humongous/regular region > > a > > bit. > > > > Thanks, > > Thomas > > I like this much better. It eliminates the implicit logical coupling > that the before rebuild task "knows" the liveness of the starts > region > is good enough, without introducing physical coupling from remset to > concurrentmark. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp > 116 if (!r->is_old() && r->is_archive()) { > > I think that should be || rather than &&. > > ------------------------------------------------------------------- > ----------- > src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp > 111 bool G1RemSetTrackingPolicy::update_before_rebuild(HeapRegion* > r, size_t live_bytes) { > > Consider adding "assert(!r->is_humongous(), ...)". 
The !r->is_old() > will filter them out, but we shouldn't be here at all and should have > instead called the associated update_humongous function. > > ------------------------------------------------------------------- > ----------- > fixed all that and Erik's suggestion. New webrev: http://cr.openjdk.java.net/~tschatzl/8205426/webrev.2 (full) http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1_to_2 (diff) It passed hs-tier1-4,jdk-tier1-3 Thanks, Thomas From erik.helin at oracle.com Fri Jul 6 13:29:21 2018 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 6 Jul 2018 15:29:21 +0200 Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink In-Reply-To: <8f49bc7b4dbdf3255f4c2dc83354101c1b64ae2f.camel@oracle.com> References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com> <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com> <68B2A585-08B4-4741-93C6-B68D3CC801CA@oracle.com> <8f49bc7b4dbdf3255f4c2dc83354101c1b64ae2f.camel@oracle.com> Message-ID: <1ad12e1f-1bc4-5ab5-1d51-838ac0bd980e@oracle.com> On 07/06/2018 03:10 PM, Thomas Schatzl wrote: > New webrev: > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.2 (full) > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1_to_2 (diff) Looks good, Reviewed! 
Thanks,
Erik

> It passed hs-tier1-4,jdk-tier1-3
>
> Thanks,
>   Thomas
>

From thomas.schatzl at oracle.com Fri Jul 6 13:39:23 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 06 Jul 2018 15:39:23 +0200
Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
In-Reply-To: <61def7df-d441-4a65-ea04-18e282b94db9@redhat.com>
References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <344187d3db4f3a069a70a730cc1c3b6555243f9d.camel@oracle.com> <61def7df-d441-4a65-ea04-18e282b94db9@redhat.com>
Message-ID: <63818f4b77b2aee712e6001fd798adc97ba246bf.camel@oracle.com>

Hi Zhengyu,

On Thu, 2018-07-05 at 19:22 -0400, Zhengyu Gu wrote:
> Hi Thomas,
>
> > [..]
> > There is another bug in the existing code: current Hotspot collectors
> > all reuse a single task queue set. So since the queue id's are only
> > initialized once at startup, there will be some initial use of a
> > suboptimal queue.
>
> Technically, it is a bug. I doubt it will have material impact,
> because the old value is probably just as good as the next random one.

Agree.

[...]
> As I mentioned in an earlier email, I would prefer to pass
> TaskQueueStealLocals/Context, but I am afraid of venturing into
> other GCs that I am not familiar with.
>
> Thomas, seems you have made other changes/improvements, do you want
> to take over this RFE? I am fine with either way.

I assigned the issue to myself. :)

Thanks,
  Thomas

From zgu at redhat.com Fri Jul 6 13:44:46 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 6 Jul 2018 09:44:46 -0400
Subject: RFR(S) 8205921: Optimizing best-of-2 work stealing queue selection
In-Reply-To: <63818f4b77b2aee712e6001fd798adc97ba246bf.camel@oracle.com>
References: <904c2ea5-0935-2c4d-fcbd-6b90238b4dc4@redhat.com> <1A7B4B34-68A1-49D6-AA2D-39FD8A7502CB@oracle.com> <43e0e7278da5684daf450b9847c67362ec361b08.camel@oracle.com> <9e6d0156-ecbb-45e3-a345-fc7a0f5a14c3@redhat.com> <90E9F360-602F-4E02-9800-C5C3231D1827@oracle.com> <344187d3db4f3a069a70a730cc1c3b6555243f9d.camel@oracle.com> <61def7df-d441-4a65-ea04-18e282b94db9@redhat.com> <63818f4b77b2aee712e6001fd798adc97ba246bf.camel@oracle.com>
Message-ID:

>> Thomas, seems you have made other changes/improvements, do you want
>> to take over this RFE? I am fine with either way.
>
> I assigned the issue to myself. :)

Thank you!

-Zhengyu

>
> Thanks,
>   Thomas
>

From thomas.schatzl at oracle.com Fri Jul 6 14:11:47 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 06 Jul 2018 16:11:47 +0200
Subject: RFR (S): 8206453: Taskqueue stats should count real steal attempts, not calls to GenericTaskQueueSet::steal
Message-ID: <1992fab92d03fbbe64218aab7562a59e41d2b553.camel@oracle.com>

Hi all,

can I have reviews for a small change to how successful steals and steal attempts are gathered in the taskqueue statistics? In particular, as the subject of the CR suggests, the "steal attempts" counter currently counts the number of calls to GenericTaskQueueSet::steal(), each of which internally attempts stealing quite often.

This makes any comparison of steal attempts to successful steals misleading.

The number of calls to GenericTaskQueueSet::steal() is mostly reflected in the number of termination attempts, which is already counted elsewhere (i.e. steal attempts - successful steals), so it does not really give new information.
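
To make the distinction concrete, here is a minimal sketch of the three numbers involved. It is not the HotSpot code: ToyQueueSet and ToyStealStats are hypothetical names, and the round-robin victim selection is an assumption made only to keep the example deterministic (the real code picks victims differently).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical statistics holder; the field names are illustrative only.
struct ToyStealStats {
  std::size_t steal_calls = 0;     // calls to the outer steal() entry point
  std::size_t steal_attempts = 0;  // individual attempts on a victim queue
  std::size_t steals = 0;          // attempts that actually returned a task
};

class ToyQueueSet {
 public:
  explicit ToyQueueSet(std::size_t n) : queues_(n) {}

  void push(std::size_t queue, int task) { queues_[queue].push_back(task); }

  // One outer call may probe several victim queues before giving up.
  bool steal(std::size_t thief, int& task, ToyStealStats& stats) {
    ++stats.steal_calls;
    for (std::size_t i = 1; i < queues_.size(); ++i) {
      std::size_t victim = (thief + i) % queues_.size();
      ++stats.steal_attempts;          // count every real attempt
      if (!queues_[victim].empty()) {
        task = queues_[victim].front();
        queues_[victim].pop_front();
        ++stats.steals;
        return true;
      }
    }
    return false;                      // nothing found in any victim queue
  }

 private:
  std::vector<std::deque<int>> queues_;
};
```

With four queues and a single task in the last one, one successful call already performs three attempts, so counting calls instead of attempts understates the denominator of the success ratio.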
CR:
https://bugs.openjdk.java.net/browse/JDK-8206453
Webrev:
http://cr.openjdk.java.net/~tschatzl/8206453/webrev/
Testing:
local compilation and use with and without TASKQUEUE_STATS.

Thanks,
  Thomas

From rkennke at redhat.com Fri Jul 6 14:46:59 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 6 Jul 2018 16:46:59 +0200
Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access
Message-ID: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com>

We have several code paths going out from oop_iterate() methods that lead to GC barriers. This is not only inefficient but outright wrong. oop_iterate() is normally used by GC, and GC needs to see the raw stuff, not some resolved objects. In Shenandoah's full-GC it's fatal to attempt to read objects' forwarding pointers, because they are temporarily pointing to nowhere land.

I propose to selectively use _raw() variants of the various accessors that are used on oop_iterate() paths. This means introducing an oopDesc::int_field_raw().

I also propose to change metadata_field() accessors to always use raw access wholesale. This is only used to load the Klass* field, which is immutable and thus doesn't require barriers.

The log_* statements in instanceRefKlass.inline.hpp surely don't need barriers. I turned them into raw accessors as well.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8206457?filter=-1
Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.00/

Test: passes hotspot-tier1 here.

Can I please get a review?

Roman

From rkennke at redhat.com Fri Jul 6 15:28:11 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 6 Jul 2018 17:28:11 +0200
Subject: RFR: JDK-8204970: Remaing object comparisons need to use oopDesc::equals()
Message-ID: <2b92069f-f378-2984-8518-68230238eaec@redhat.com>

We found 2 more places where oopDesc::equals() should be used instead of raw obj==obj.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8204970
Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8204970/webrev.00/

Passes tier1 tests

Can I get a review?

Thanks,
Roman

From calvin.cheung at oracle.com Fri Jul 6 16:15:39 2018
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 06 Jul 2018 09:15:39 -0700
Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules
In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com>
References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com>
Message-ID: <5B3F95AB.7060702@oracle.com>

Hi Jiangli,

Thanks for this start-up improvement. The changes look good overall. I have the following minor comments.

1) make/hotspot/symbols/symbols-unix

 134 JVM_InitializeFromArchive

If you want the symbols to be in alphabetical order, the above should be moved after JVM_InitStackTraceElementArray.

2) metaspaceShared.cpp

1927 oop MetaspaceShared::materialize_archived_object(oop obj) {
1928   if (obj != NULL) {
1929     return G1CollectedHeap::heap()->materialize_archived_object(obj);
1930   }
1931   return NULL;
1932 }

Instead of two return statements, how about replacing lines 1928 - 1931 with the following?

    return (obj != NULL) ?
G1CollectedHeap::heap()->materialize_archived_object(obj) : NULL; 3) ArchivedModuleComboTest.java 55 Path moduleDir = Files.createTempDirectory(userDir, "mods"); I don't see anything got placed under the "mods" dir, is it by design? For the "dump with --module-path" cases, there seems to be a missing test case with "--show-module-resolution" (similar to Test case 2). 4) CheckArchivedModuleApp.java 53 if (expectArchived && wb.isShared(md)) { 54 System.out.println(name + " is archived. Expected."); 55 } else if (!expectArchived && !wb.isShared(md)) { 56 System.out.println(name + " is not archived. Expected."); 57 } else if (expectArchived) { 58 throw new RuntimeException( 59 "FAILED. " + name + " is not archived. Expect archived."); 60 } else { 61 throw new RuntimeException( 62 "FAILED. " + name + " is archived. Expect not archived."); 63 } I'd suggest the following so that the code is easier to understand: if (expectArchived) { if (wb.isShared(md)) { System.out.println(name + " is archived. Expected."); } else { throw new RuntimeException( "FAILED. " + name + " is not archived. Expect archived."); } } else { if (!wb.isShared(md)) { System.out.println(name + " is not archived. Expected."); } else { throw new RuntimeException( "FAILED. " + name + " is archived. Expect not archived."); } } 5) ArchivedModuleWithCustomImageTest.java 178 private static void printCommand(String opts[]) { 179 StringBuilder cmdLine = new StringBuilder(); 180 for (String cmd : opts) 181 cmdLine.append(cmd).append(' '); 182 System.out.println("Command line: [" + cmdLine.toString() + "]"); 183 } Consider putting the above method in ProcessTools.java so that ProcessTools.createJavaProcessBuilder() and the above test can call it and avoiding duplicate code. A separate follow-up bug to address this is fine. 6) PrintSystemModulesApp.java I don't think it is being used? 
thanks, Calvin On 6/28/18, 4:15 PM, Jiangli Zhou wrote: > This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks). In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). > > The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. > > The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. > > webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ > RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 > > Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. > > Following are the details of system module archiving, which are duplicated in above bug report. > --------------------------------------------------------------------------------------------------------------------------- > Support archiving system module graph when the initial module is unnamed module from -cp currently. > > Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. 
>
> Dump time system module object archiving
> =================================
> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving.
>
> private static SystemModules archivedSystemModules;
> private static ModuleFinder archivedSystemModuleFinder;
> private static String archivedMainModule;
>
> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references.
>
> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a reference field and points to a non-null java object, proceed to the next step. The static field and its value are recorded and stored outside the archived mirror.
> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step.
> 3. Follows all references within the current java object and recursively archives the sub-graph of objects starting from each reference encountered within the object.
> 4. Updates the pointer in the archived copy of the referencing object to point to the current archived object.
> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime.
>
> Runtime initialization from archived system module objects
> ============================================
> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized.
The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly. > > If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. > > In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. > > Thanks, > Jiangli > > From kim.barrett at oracle.com Fri Jul 6 16:33:28 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 6 Jul 2018 12:33:28 -0400 Subject: RFR (S): 8206453: Taskqueue stats should count real steal attempts, not calls to GenericTaskQueueSet::steal In-Reply-To: <1992fab92d03fbbe64218aab7562a59e41d2b553.camel@oracle.com> References: <1992fab92d03fbbe64218aab7562a59e41d2b553.camel@oracle.com> Message-ID: <93B88660-BA2D-4387-A311-B73F5EBFCC41@oracle.com> > On Jul 6, 2018, at 10:11 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for some small change to how successful steals and > steal attempts in taskqueue statistics are gathered? In particular, as > the subject of the CR suggests, the "steal attempts" counter the number > of calls to GenericTaskQueueSet::steal() which internally actually > attempts stealing quite often. > > This makes a useful comparison of steal attempts to successful steals > impossible and misleading. > > The calls to GenericTaskQueueSet::steal are mostly reflected to the > number of termination attempts that is already counted elsewhere (ie. > steal attempts - successful steals), so it does not really give new > information. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8206453 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8206453/webrev/ > Testing: > local compilation and use with and without TASKQUEUE_STATS. 
>
> Thanks,
>   Thomas

Please change record_attempt to record_steal_attempt. Otherwise, looks good. I don't need a new webrev for that renaming.

From zgu at redhat.com Fri Jul 6 16:56:05 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 6 Jul 2018 12:56:05 -0400
Subject: RFR(S) 8206467: Refactor G1ParallelCleaningTask into shared
Message-ID: <15cc2aaf-54bc-fbe2-f9a1-e66168fa7a54@redhat.com>

Hi,

Shenandoah has a similar version that was derived from G1ParallelCleaningTask, with additional time tracking of each cleaning task.

Due to code movement and renaming, the changeset appears to be large, but it is really not. Other than the renaming, the following are the actual differences from the current G1ParallelCleaningTask:

- G1StringDedupUnlinkOrOopsDoClosure is passed as a parameter.
- Counters (_string_processed, _string_removed, etc.) in StringAndSymbolCleaningTask are volatile now, because they are updated using atomic operations.
- Added ParallelCleaningTimes and ParallelCleaningTaskTimer classes for tracking times.
- ParallelCleaningTask::work() now contains time tracking code.

Bug: https://bugs.openjdk.java.net/browse/JDK-8206467
Webrev: http://cr.openjdk.java.net/~zgu/8206467/webrev.00/

Test: hotspot_gc on Linux 64 (fastdebug and release)

Thanks,

-Zhengyu

From jiangli.zhou at Oracle.COM Fri Jul 6 19:34:59 2018
From: jiangli.zhou at Oracle.COM (Jiangli Zhou)
Date: Fri, 6 Jul 2018 12:34:59 -0700
Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules
In-Reply-To: <5B3F95AB.7060702@oracle.com>
References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <5B3F95AB.7060702@oracle.com>
Message-ID:

Hi Calvin,

Thanks for the review!
Here are the updated webrevs that address the feedback from you and Ioi:

http://cr.openjdk.java.net/~jiangli/8202035/webrev_inc.01/

Full webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev_full.01/

> On Jul 6, 2018, at 9:15 AM, Calvin Cheung wrote:
>
> Hi Jiangli,
>
> Thanks for this start-up improvement. The changes look good overall. I've the following minor comments.
>
> 1) make/hotspot/symbols/symbols-unix
>
> 134 JVM_InitializeFromArchive
>
> If you want the symbols to be in alphabetical order, the above should be moved after JVM_InitStackTraceElementArray.

Fixed.

>
> 2) metaspaceShared.cpp
>
> 1927 oop MetaspaceShared::materialize_archived_object(oop obj) {
> 1928   if (obj != NULL) {
> 1929     return G1CollectedHeap::heap()->materialize_archived_object(obj);
> 1930   }
> 1931   return NULL;
> 1932 }
>
> Instead of two return statements, how about replacing lines 1928 - 1931 with the following?
>
> return (obj != NULL) ? G1CollectedHeap::heap()->materialize_archived_object(obj) : NULL;

The original format is probably slightly easier to read, so I left it unchanged. Hope that's okay with you.

>
> 3) ArchivedModuleComboTest.java
>
> 55 Path moduleDir = Files.createTempDirectory(userDir, "mods");
>
> I don't see anything got placed under the "mods" dir, is it by design?

Yes.

>
> For the "dump with --module-path" cases, there seems to be a missing test case with "--show-module-resolution" (similar to Test case 2).

When --module-path is specified at dump time, the system module graph is currently not archived. There is no need for an additional test case with --show-module-resolution in this case since all module objects are created as normal.

>
>
> 4) CheckArchivedModuleApp.java
>
> 53 if (expectArchived && wb.isShared(md)) {
> 54     System.out.println(name + " is archived. Expected.");
> 55 } else if (!expectArchived && !wb.isShared(md)) {
> 56     System.out.println(name + " is not archived.
Expected.");
> 57 } else if (expectArchived) {
> 58     throw new RuntimeException(
> 59         "FAILED. " + name + " is not archived. Expect archived.");
> 60 } else {
> 61     throw new RuntimeException(
> 62         "FAILED. " + name + " is archived. Expect not archived.");
> 63 }
>
> I'd suggest the following so that the code is easier to understand:
>
> if (expectArchived) {
>     if (wb.isShared(md)) {
>         System.out.println(name + " is archived. Expected.");
>     } else {
>         throw new RuntimeException(
>             "FAILED. " + name + " is not archived. Expect archived.");
>     }
> } else {
>     if (!wb.isShared(md)) {
>         System.out.println(name + " is not archived. Expected.");
>     } else {
>         throw new RuntimeException(
>             "FAILED. " + name + " is archived. Expect not archived.");
>     }
> }

Reformatted as suggested.

>
> 5) ArchivedModuleWithCustomImageTest.java
>
> 178 private static void printCommand(String opts[]) {
> 179     StringBuilder cmdLine = new StringBuilder();
> 180     for (String cmd : opts)
> 181         cmdLine.append(cmd).append(' ');
> 182     System.out.println("Command line: [" + cmdLine.toString() + "]");
> 183 }
>
> Consider putting the above method in ProcessTools.java so that ProcessTools.createJavaProcessBuilder() and the above test can call it and avoiding duplicate code.
> A separate follow-up bug to address this is fine.

That sounds good to me. We might need some reformatting for consolidation. I will file a follow-up RFE.

>
> 6) PrintSystemModulesApp.java
>
> I don't think it is being used?

It's used by ArchivedModuleCompareTest.java. Looks like it was missing from the earlier webrev. Thanks for catching that. The file is included in the updated webrev.

Thanks!

Jiangli

>
> thanks,
> Calvin
>
> On 6/28/18, 4:15 PM, Jiangli Zhou wrote:
>> This is a follow-up RFE of JDK-8201650 (Move iteration order randomization of unmodifiable Set and Map to iterators), which was resolved to allow Set/Map objects being archived at CDS dump time (thanks Claes and Stuart Marks).
In the current RFE, it archives the set of system ModuleReference and ModuleDescriptor objects (including their referenced objects) in 'open' archive heap region at CDS dump time. It allows reusing of the objects and bypassing the process of creating the system ModuleDescriptors and ModuleReferences at runtime for startup improvement. My preliminary measurements on linux-x64 showed ~5% startup improvement when running HelloWorld from -cp using archived module objects at runtime (without extra tuning). >> >> The library changes in the following webrev are contributed by Alan Bateman. Thanks Alan and Mandy for discussions and help. Thanks Karen, Lois and Ioi for discussion and suggestions on initialization ordering. >> >> The majority of the module object archiving code are in heapShared.hpp and heapShared.cpp. Thanks Coleen for pre-review and Eric Caspole for helping performance tests. >> >> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 >> >> Tested using tier1 - tier6 via mach5 including all new test cases added in the webrev. >> >> Following are the details of system module archiving, which are duplicated in above bug report. >> --------------------------------------------------------------------------------------------------------------------------- >> Support archiving system module graph when the initial module is unnamed module from -cp currently. >> >> Support G1 GC, 64-bit (non-Windows). Requires UseCompressedOops and UseCompressedClassPointers. >> >> Dump time system module object archiving >> ================================= >> At dump time, the following fields in ArchivedModuleGraph are set to record the system module information created by ModuleBootstrap for archiving. 
>>
>> private static SystemModules archivedSystemModules;
>> private static ModuleFinder archivedSystemModuleFinder;
>> private static String archivedMainModule;
>>
>> The archiving process starts from a given static field in ArchivedModuleGraph class instance (java mirror object). The process archives the complete network of java heap objects that are reachable directly or indirectly from the starting object by following references.
>>
>> 1. Starts from a given static field within the Class instance (java mirror). If the static field is a reference field and points to a non-null java object, proceed to the next step. The static field and its value are recorded and stored outside the archived mirror.
>> 2. Archives the referenced java object. If an archived copy of the current object already exists, updates the pointer in the archived copy of the referencing object to point to the current archived object. Otherwise, proceed to the next step.
>> 3. Follows all references within the current java object and recursively archives the sub-graph of objects starting from each reference encountered within the object.
>> 4. Updates the pointer in the archived copy of the referencing object to point to the current archived object.
>> 5. The Klass of the current java object is added to a list of Klasses for loading and initializing before any object in the archived graph can be accessed at runtime.
>>
>> Runtime initialization from archived system module objects
>> ============================================
>> VM.initializeFromArchive() is called from ArchivedModuleGraph's static initializer to initialize from the archived module information. Klasses in the recorded list are loaded, linked and initialized. The static fields in ArchivedModuleGraph class instance are initialized using the archived field values. After initialization, the archived system module objects can be used directly.
>> >> If the archived java heap data is not successfully mapped at runtime, or there is an error during VM.initializeFromArchive(), then all static fields in ArchivedModuleGraph are not initialized. In that case, system ModuleDescriptor and ModuleReference objects are created as normal. >> >> In non-CDS mode, VM.initializeFromArchive() returns immediately with minimum added overhead for normal execution. >> >> Thanks, >> Jiangli >> >> From mandy.chung at oracle.com Fri Jul 6 20:40:03 2018 From: mandy.chung at oracle.com (mandy chung) Date: Fri, 6 Jul 2018 13:40:03 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> Message-ID: <9aafedcf-4abf-77fd-e455-823c9e11c8b0@oracle.com> Hi Jiangli, On 6/28/18 4:15 PM, Jiangli Zhou wrote:> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ > RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 Good work. I'm glad to see a pretty good startup improvement. I reviewed java.base change that looks good. Mandy From jiangli.zhou at oracle.com Fri Jul 6 20:41:30 2018 From: jiangli.zhou at oracle.com (Jiangli Zhou) Date: Fri, 6 Jul 2018 13:41:30 -0700 Subject: RFR(L): 8202035: Archive the set of ModuleDescriptor and ModuleReference objects for system modules In-Reply-To: <9aafedcf-4abf-77fd-e455-823c9e11c8b0@oracle.com> References: <386DA770-8E9D-43A0-87CE-0E380977F884@oracle.com> <9aafedcf-4abf-77fd-e455-823c9e11c8b0@oracle.com> Message-ID: <39F0EBB2-3721-4A08-9661-955E4B5E6920@oracle.com> Thanks a lot for reviewing, Mandy! Jiangli > On Jul 6, 2018, at 1:40 PM, mandy chung wrote: > > Hi Jiangli, > > On 6/28/18 4:15 PM, Jiangli Zhou wrote:> webrev: http://cr.openjdk.java.net/~jiangli/8202035/webrev.00/ >> RFE: https://bugs.openjdk.java.net/browse/JDK-8202035?filter=14921 > > Good work. 
I'm glad to see a pretty good startup improvement. > > I reviewed java.base change that looks good. > > Mandy From kim.barrett at oracle.com Sat Jul 7 03:18:02 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 6 Jul 2018 23:18:02 -0400 Subject: RFR: JDK-8204970: Remaing object comparisons need to use oopDesc::equals() In-Reply-To: <2b92069f-f378-2984-8518-68230238eaec@redhat.com> References: <2b92069f-f378-2984-8518-68230238eaec@redhat.com> Message-ID: > On Jul 6, 2018, at 11:28 AM, Roman Kennke wrote: > > We found 2 more places where oopDesc::equals() should be used instead of > raw obj==obj. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8204970 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8204970/webrev.00/ > > Passes tier1 tests > > Can I get a review? > > Thanks, > Roman This looks good. How close are we to being able to remove operator== and operator!= from the oop class that is defined when CHECK_UNHANDLED_OOPS is defined? I suspect the main problem is checks for NULL? 
From kim.barrett at oracle.com Sat Jul 7 03:20:02 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 6 Jul 2018 23:20:02 -0400 Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink In-Reply-To: <8f49bc7b4dbdf3255f4c2dc83354101c1b64ae2f.camel@oracle.com> References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com> <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com> <68B2A585-08B4-4741-93C6-B68D3CC801CA@oracle.com> <8f49bc7b4dbdf3255f4c2dc83354101c1b64ae2f.camel@oracle.com> Message-ID: <29DEBCBF-A50B-44D7-BDE2-FF88BA99C3F7@oracle.com> > On Jul 6, 2018, at 9:10 AM, Thomas Schatzl wrote: > > Hi, > > On Thu, 2018-07-05 at 16:53 -0400, Kim Barrett wrote: >>> On Jul 5, 2018, at 3:16 AM, Thomas Schatzl >> om> wrote: >>> There is a new webrev at >>> >>> http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1 (full) >>> http://cr.openjdk.java.net/~tschatzl/8205426/webrev.0_to_1 (diff, >>> but >>> almost useless due to many changes) >>> >>> That at least separates the concerns about humongous/regular region >>> a >>> bit. >>> >>> Thanks, >>> Thomas >> >> I like this much better. It eliminates the implicit logical coupling >> that the before rebuild task "knows" the liveness of the starts >> region >> is good enough, without introducing physical coupling from remset to >> concurrentmark. >> >> ------------------------------------------------------------------- >> ----------- >> src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp >> 116 if (!r->is_old() && r->is_archive()) { >> >> I think that should be || rather than &&. >> >> ------------------------------------------------------------------- >> ----------- >> src/hotspot/share/gc/g1/g1RemSetTrackingPolicy.cpp >> 111 bool G1RemSetTrackingPolicy::update_before_rebuild(HeapRegion* >> r, size_t live_bytes) { >> >> Consider adding "assert(!r->is_humongous(), ...)". 
The !r->is_old() >> will filter them out, but we shouldn't be here at all and should have >> instead called the associated update_humongous function. >> >> ------------------------------------------------------------------- >> ----------- >> > > fixed all that and Erik's suggestion. > > New webrev: > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.2 (full) > http://cr.openjdk.java.net/~tschatzl/8205426/webrev.1_to_2 (diff) > > It passed hs-tier1-4,jdk-tier1-3 > > Thanks, > Thomas Looks good. From rkennke at redhat.com Sat Jul 7 10:52:44 2018 From: rkennke at redhat.com (Roman Kennke) Date: Sat, 7 Jul 2018 12:52:44 +0200 Subject: RFR: JDK-8204970: Remaing object comparisons need to use oopDesc::equals() In-Reply-To: References: <2b92069f-f378-2984-8518-68230238eaec@redhat.com> Message-ID: Am 07.07.2018 um 05:18 schrieb Kim Barrett: >> On Jul 6, 2018, at 11:28 AM, Roman Kennke wrote: >> >> We found 2 more places where oopDesc::equals() should be used instead of >> raw obj==obj. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8204970 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8204970/webrev.00/ >> >> Passes tier1 tests >> >> Can I get a review? >> >> Thanks, >> Roman > > This looks good. > > How close are we to being able to remove operator== and operator!= from the oop class that > is defined when CHECK_UNHANDLED_OOPS is defined? I suspect the main problem is > checks for NULL? The main problems are all those places where we actually want to use naked comparisons, especially inside GC code. 
In Shenandoah, we actually put checks in the == and != operators to catch unintended raw == and !=: https://builds.shipilev.net/patch-openjdk-shenandoah-jdk/2018-07-06-v255-vs-dea7ce62c7b0/src/hotspot/share/oops/oopsHierarchy.hpp.udiff.html But this requires all *intended* raw comparisons to be expressed differently. In Shenandoah we have a special unsafe_equals() method that casts the oop to HeapWord* and compares that, but we could use RawAccessBarrier::equals() for this now. These verification checks have proven to be very useful for catching bad naked ==; I'd like to upstream this soon if you agree. WDYT? Cheers, Roman From kim.barrett at oracle.com Sun Jul 8 15:52:32 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Sun, 8 Jul 2018 11:52:32 -0400 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> Message-ID: > On Jul 5, 2018, at 4:03 PM, Kim Barrett wrote: > >> On Jul 5, 2018, at 3:57 AM, Thomas Schatzl wrote: >> >> Hi, >> >> On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote: >>> Please review this fix of the HeapRegion gtest. >>> >>> The test modifies a region's "top" to unexpected values without >>> ensuring that no allocation might use the region and no GC might run >>> while the region is in that invalid state. We solve this by >>> executing the test code in its very own safepoint, and by saving and >>> then restoring the region's top back to its original value before >>> completing the test. And since we are doing all that, there's no >>> longer any reason to run the test in a separate VM.
>> >> looks good, but the actual test is still run in a separate VM. >> Intentional? > > Unintentional. And now I?m not sure what I last ran through mach5. > I?ll re-test with TEST_OTHER_VM => TEST_VM. > > I know that failed in an obscure way earlier, but I think that was because > of an unrelated recently introduced bug that?s been fixed in the repo. Verified that I really have tested in same VM. New webrev: http://cr.openjdk.java.net/~kbarrett/8204691/open.01/ The only change is TEST_OTHER_VM => TEST_VM. From thomas.schatzl at oracle.com Mon Jul 9 08:30:43 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 10:30:43 +0200 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> Message-ID: <55d3dd07125645c1f86b274a6eb7ceb28a1286d2.camel@oracle.com> Hi Kim, On Sun, 2018-07-08 at 11:52 -0400, Kim Barrett wrote: > > On Jul 5, 2018, at 4:03 PM, Kim Barrett > > wrote: > > > > > On Jul 5, 2018, at 3:57 AM, Thomas Schatzl > > > wrote: > > > > > > On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote: > > > > Please review this fix of the HeapRegion gtest. > > > > [...] > > > > > > looks good, but the actual test is still run in a separate VM. > > > Intentional? > > > > Unintentional. And now I?m not sure what I last ran through mach5. > > I?ll re-test with TEST_OTHER_VM => TEST_VM. > > > > I know that failed in an obscure way earlier, but I think that was > > because of an unrelated recently introduced bug that?s been fixed > > in the repo. > > Verified that I really have tested in same VM. New webrev: > http://cr.openjdk.java.net/~kbarrett/8204691/open.01/ > > The only change is TEST_OTHER_VM => TEST_VM. > thanks. Looks good. 
Thomas From thomas.schatzl at oracle.com Mon Jul 9 08:31:31 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 10:31:31 +0200 Subject: RFR (S): 8205426: Humongous continues remembered set does not match humongous start region one in Kitchensink In-Reply-To: <29DEBCBF-A50B-44D7-BDE2-FF88BA99C3F7@oracle.com> References: <69519db2cd7fe431357a5a05b89bba17cdd0eaaa.camel@oracle.com> <54BF88C5-5835-49B7-8E1F-E21A4E429D15@oracle.com> <15c89934f5f69de519bca42c9d7b049e621ccbae.camel@oracle.com> <68B2A585-08B4-4741-93C6-B68D3CC801CA@oracle.com> <8f49bc7b4dbdf3255f4c2dc83354101c1b64ae2f.camel@oracle.com> <29DEBCBF-A50B-44D7-BDE2-FF88BA99C3F7@oracle.com> Message-ID: <1b398fbb77953e9b8b218b9d86e844027332b24c.camel@oracle.com> Kim, Erik, thanks for your reviews. Thomas From thomas.schatzl at oracle.com Mon Jul 9 08:52:46 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 10:52:46 +0200 Subject: RFR (S): 8206453: Taskqueue stats should count real steal attempts, not calls to GenericTaskQueueSet::steal In-Reply-To: <93B88660-BA2D-4387-A311-B73F5EBFCC41@oracle.com> References: <1992fab92d03fbbe64218aab7562a59e41d2b553.camel@oracle.com> <93B88660-BA2D-4387-A311-B73F5EBFCC41@oracle.com> Message-ID: <8088066dfc637622a738def9bc955dcb9bc47760.camel@oracle.com> Hi Kim, On Fri, 2018-07-06 at 12:33 -0400, Kim Barrett wrote: > > On Jul 6, 2018, at 10:11 AM, Thomas Schatzl > com> wrote: > > [...] > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8206453 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8206453/webrev/ > > Testing: > > local compilation and use with and without TASKQUEUE_STATS. > > > > Thanks, > > Thomas > > Please change record_attempt to record_steal_attempt. > Otherwise, looks good. I don't need a new webrev for that renaming. > thanks for your review. I updated the existing webrev in-place for the second reviewer. 
Thomas From erik.helin at oracle.com Mon Jul 9 09:28:43 2018 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 9 Jul 2018 11:28:43 +0200 Subject: RFR (S): 8206453: Taskqueue stats should count real steal attempts, not calls to GenericTaskQueueSet::steal In-Reply-To: <8088066dfc637622a738def9bc955dcb9bc47760.camel@oracle.com> References: <1992fab92d03fbbe64218aab7562a59e41d2b553.camel@oracle.com> <93B88660-BA2D-4387-A311-B73F5EBFCC41@oracle.com> <8088066dfc637622a738def9bc955dcb9bc47760.camel@oracle.com> Message-ID: <07c18df7-57bd-0790-72bb-457720bf2823@oracle.com> On 07/09/2018 10:52 AM, Thomas Schatzl wrote: > Hi Kim, > > On Fri, 2018-07-06 at 12:33 -0400, Kim Barrett wrote: >>> On Jul 6, 2018, at 10:11 AM, Thomas Schatzl >> com> wrote: >>> > [...] >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8206453 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8206453/webrev/ >>> Testing: >>> local compilation and use with and without TASKQUEUE_STATS. >>> >>> Thanks, >>> Thomas >> >> Please change record_attempt to record_steal_attempt. >> Otherwise, looks good. I don't need a new webrev for that renaming. >> > > thanks for your review. I updated the existing webrev in-place for > the second reviewer. Looks good, Reviewed! 
Thanks, Erik > Thomas > From thomas.schatzl at oracle.com Mon Jul 9 09:39:01 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 11:39:01 +0200 Subject: RFR (S): 8206453: Taskqueue stats should count real steal attempts, not calls to GenericTaskQueueSet::steal In-Reply-To: <07c18df7-57bd-0790-72bb-457720bf2823@oracle.com> References: <1992fab92d03fbbe64218aab7562a59e41d2b553.camel@oracle.com> <93B88660-BA2D-4387-A311-B73F5EBFCC41@oracle.com> <8088066dfc637622a738def9bc955dcb9bc47760.camel@oracle.com> <07c18df7-57bd-0790-72bb-457720bf2823@oracle.com> Message-ID: <6c02d2281e5355d99602cfd9a0a9b78118a47866.camel@oracle.com> Hi Erik, On Mon, 2018-07-09 at 11:28 +0200, Erik Helin wrote: > On 07/09/2018 10:52 AM, Thomas Schatzl wrote: > > Hi Kim, > > > > On Fri, 2018-07-06 at 12:33 -0400, Kim Barrett wrote: > > > > On Jul 6, 2018, at 10:11 AM, Thomas Schatzl > > > cle. > > > > com> wrote: > > > > > > > > [...] > > > > CR: > > > > https://bugs.openjdk.java.net/browse/JDK-8206453 > > > > Webrev: > > > > http://cr.openjdk.java.net/~tschatzl/8206453/webrev/ > > > > Testing: > > > > local compilation and use with and without TASKQUEUE_STATS. > > > > > > > > Thanks, > > > > Thomas > > > [...] > > > thanks for your review. I updated the existing webrev in-place > > for > > the second reviewer. > > Looks good, Reviewed! > > Thanks, > Erik thanks for your review. Thanks, Thomas From thomas.schatzl at oracle.com Mon Jul 9 10:13:30 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 12:13:30 +0200 Subject: [11] RFR (XXS): 8206476: Wrong assert in phase_enum_2_phase_string() in referenceProcessorPhaseTimes.cpp Message-ID: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com> Hi all, can I have reviews to this issue found with our C++ source code analysis tool that complains about one assert not being strict enough. 
In particular:

static const char* phase_enum_2_phase_string(ReferenceProcessor::RefProcPhases phase) {
  assert(phase >= ReferenceProcessor::RefPhase1 && phase <= ReferenceProcessor::RefPhaseMax,
         "Invalid reference processing phase (%d)", phase);
  return PhaseNames[phase];
}

The second "<=" should be a "<".

Actually there is an existing (correct) macro for the whole assert. Replaced that line with the macro as follows:

@@ -80,8 +80,7 @@
 STATIC_ASSERT((REF_PHANTOM + 1) == ARRAY_SIZE(ReferenceTypeNames));

 static const char* phase_enum_2_phase_string(ReferenceProcessor::RefProcPhases phase) {
-  assert(phase >= ReferenceProcessor::RefPhase1 && phase <= ReferenceProcessor::RefPhaseMax,
-         "Invalid reference processing phase (%d)", phase);
+  ASSERT_PHASE(phase);
   return PhaseNames[phase];
 }

There is no actual failure, and there are no known failures with the change either; the reason for putting this into 11 is to get rid of unnecessary noise in source code analysis tool results.

CR:
https://bugs.openjdk.java.net/browse/JDK-8206476
Webrev:
http://cr.openjdk.java.net/~tschatzl/8206476/webrev/index.html
Testing:
hs-tier1-3,jdk-tier1

Thanks,
  Thomas

From erik.helin at oracle.com Mon Jul 9 12:05:19 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Mon, 9 Jul 2018 14:05:19 +0200
Subject: [11] RFR (XXS): 8206476: Wrong assert in phase_enum_2_phase_string() in referenceProcessorPhaseTimes.cpp
In-Reply-To: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com>
References: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com>
Message-ID: <3f5c14d4-c850-4481-1299-9c2ee1ade6c5@oracle.com>

On 07/09/2018 12:13 PM, Thomas Schatzl wrote:
> Hi all,
>
> can I have reviews to this issue found with our C++ source code
> analysis tool that complains about one assert not being strict enough.
> > In particular: > > static const char* > phase_enum_2_phase_string(ReferenceProcessor::RefProcPhases phase) { > assert(phase >= ReferenceProcessor::RefPhase1 && phase <= > ReferenceProcessor::RefPhaseMax, > "Invalid reference processing phase (%d)", phase); > return PhaseNames[phase]; > } > > The second "<=" should be a "<". > > Actually there is an existing (correct) macro for the whole assert. > Replaced that line with the macro as follows: > > @@ -80,8 +80,7 @@ > STATIC_ASSERT((REF_PHANTOM + 1) == ARRAY_SIZE(ReferenceTypeNames)); > > static const char* > phase_enum_2_phase_string(ReferenceProcessor::RefProcPhases phase) { > - assert(phase >= ReferenceProcessor::RefPhase1 && phase <= > ReferenceProcessor::RefPhaseMax, > - "Invalid reference processing phase (%d)", phase); > + ASSERT_PHASE(phase); > return PhaseNames[phase]; > } > > There is no actual failure, and there are no known failures with the > change either; the reason for putting this into 11 is to get rid of > unnecessary noise in source code analysis tool results. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8206476 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8206476/webrev/index.html Looks good, Reviewed. 
Thanks, Erik > Testing: > hs-tier1-3,jdk-tier1 > > Thanks, > Thomas > From thomas.schatzl at oracle.com Mon Jul 9 12:54:24 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 14:54:24 +0200 Subject: [11] RFR (XXS): 8206476: Wrong assert in phase_enum_2_phase_string() in referenceProcessorPhaseTimes.cpp In-Reply-To: <3f5c14d4-c850-4481-1299-9c2ee1ade6c5@oracle.com> References: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com> <3f5c14d4-c850-4481-1299-9c2ee1ade6c5@oracle.com> Message-ID: <285a4c976db53d5cf54832b8cf930d6e863cb8ad.camel@oracle.com> Hi, On Mon, 2018-07-09 at 14:05 +0200, Erik Helin wrote: > On 07/09/2018 12:13 PM, Thomas Schatzl wrote: > > Hi all, > > > > can I have reviews to this issue found with our C++ source code > > analysis tool that complains about one assert not being strict > > enough. > > > > [...] > > > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8206476 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8206476/webrev/index.html > > Looks good, Reviewed. > thanks for your review. Thomas From kim.barrett at oracle.com Mon Jul 9 14:51:51 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 9 Jul 2018 10:51:51 -0400 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: <55d3dd07125645c1f86b274a6eb7ceb28a1286d2.camel@oracle.com> References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> <55d3dd07125645c1f86b274a6eb7ceb28a1286d2.camel@oracle.com> Message-ID: > On Jul 9, 2018, at 4:30 AM, Thomas Schatzl wrote: > > Hi Kim, > > On Sun, 2018-07-08 at 11:52 -0400, Kim Barrett wrote: >> [...] >> Verified that I really have tested in same VM. New webrev: >> http://cr.openjdk.java.net/~kbarrett/8204691/open.01/ >> >> The only change is TEST_OTHER_VM => TEST_VM. >> > > thanks. Looks good. > > Thomas Thanks.
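[Editorial sketch, not part of the original messages.] The 8206476 thread above turns on an assert that used "<=" where "<" was needed before indexing PhaseNames. A minimal sketch of that off-by-one follows, assuming (as the fix implies) that the last enumerator is a count rather than a valid phase. Phase, PhaseNames, in_range_loose, in_range_strict and phase_name are hypothetical names, not the ReferenceProcessor definitions.

```cpp
#include <cstddef>

// Hypothetical phase enum; the last enumerator is a count, not a phase.
enum Phase { Phase1 = 0, Phase2, Phase3, PhaseMax };

// One name per real phase, so valid indices are 0 .. PhaseMax - 1.
static const char* const PhaseNames[] = { "Phase1", "Phase2", "Phase3" };
static_assert(sizeof(PhaseNames) / sizeof(PhaseNames[0]) ==
              static_cast<std::size_t>(PhaseMax),
              "one name per phase");

// Too loose: accepts PhaseMax, which is one past the last valid index.
inline bool in_range_loose(int phase) {
  return phase >= Phase1 && phase <= PhaseMax;
}

// Strict: accepts only values that can index PhaseNames.
inline bool in_range_strict(int phase) {
  return phase >= Phase1 && phase < PhaseMax;
}

// Returns the name, or nullptr when the value is not a real phase.
inline const char* phase_name(Phase phase) {
  return in_range_strict(phase) ? PhaseNames[phase] : nullptr;
}
```

The loose check lets PhaseMax through and would then read past the end of the table; the strict check rejects it. Keeping the range test in one shared helper, as the existing macro mentioned in the thread does, avoids repeating the comparison at each call site.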
From kim.barrett at oracle.com Mon Jul 9 14:54:34 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 9 Jul 2018 10:54:34 -0400 Subject: [11] RFR (XXS): 8206476: Wrong assert in phase_enum_2_phase_string() in referenceProcessorPhaseTimes.cpp In-Reply-To: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com> References: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com> Message-ID: > On Jul 9, 2018, at 6:13 AM, Thomas Schatzl wrote: > [...] > Actually there is an existing (correct) macro for the whole assert. > Replaced that line with the macro as follows: > > [...] > CR: > https://bugs.openjdk.java.net/browse/JDK-8206476 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8206476/webrev/index.html > Testing: > hs-tier1-3,jdk-tier1 > > Thanks, > Thomas Looks good. Thanks for spotting and using the existing helper macro. From thomas.schatzl at oracle.com Mon Jul 9 14:58:55 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 09 Jul 2018 16:58:55 +0200 Subject: [11] RFR (XXS): 8206476: Wrong assert in phase_enum_2_phase_string() in referenceProcessorPhaseTimes.cpp In-Reply-To: References: <10ff9cdf14a2265e2437b4efc4f48eddc7a7c001.camel@oracle.com> Message-ID: <90e85ae58fdc6810d4b69200b99a746394a8912a.camel@oracle.com> Hi, On Mon, 2018-07-09 at 10:54 -0400, Kim Barrett wrote: > > On Jul 9, 2018, at 6:13 AM, Thomas Schatzl > om> wrote: > > [...] > > Actually there is an existing (correct) macro for the whole assert. > > Replaced that line with the macro as follows: > > > > [...] > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8206476 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8206476/webrev/index.html > > Testing: > > hs-tier1-3,jdk-tier1 > > > > Thanks, > > Thomas > > Looks good. > > Thanks for spotting and using the existing helper macro. > thanks for your review.
Thomas From erik.helin at oracle.com Mon Jul 9 15:49:48 2018 From: erik.helin at oracle.com (Erik Helin) Date: Mon, 9 Jul 2018 17:49:48 +0200 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> Message-ID: <2aa63d5e-d148-219c-2e77-928fd0964a64@oracle.com> On 07/08/2018 05:52 PM, Kim Barrett wrote: >> On Jul 5, 2018, at 4:03 PM, Kim Barrett wrote: >> >>> On Jul 5, 2018, at 3:57 AM, Thomas Schatzl wrote: >>> >>> Hi, >>> >>> On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote: >>>> Please review this fix of the HeapRegion gtest. >>>> >>>> The test modifies a region's "top" to unexpected values without >>>> ensuring that no allocation might use the region and no GC might run >>>> while the region is in that invalid state. We solve this by >>>> executing the test code in its very own safepoint, and by saving and >>>> then restoring the region's top back to its original value before >>>> completing the test. And since we are doing all that, there's no >>>> longer any reason to run the test in a separate VM. >>> >>> looks good, but the actual test is still run in a separate VM. >>> Intentional? >> >> Unintentional. And now I'm not sure what I last ran through mach5. >> I'll re-test with TEST_OTHER_VM => TEST_VM. >> >> I know that failed in an obscure way earlier, but I think that was because >> of an unrelated recently introduced bug that's been fixed in the repo. > > Verified that I really have tested in same VM. New webrev: > http://cr.openjdk.java.net/~kbarrett/8204691/open.01/ > > The only change is TEST_OTHER_VM => TEST_VM.
Hmmm, it is (very) unfortunate if we have native code in JVM allocating Java objects and triggering garbage collections _concurrently_ with the unit tests being run (there shouldn't be any Java code running when the unit tests are executing). I understand that we have to restore the top pointer in case there is some verification for example when the JVM exits (or if we assert in a destructor etc), but do we really need to run the test in a safepoint? There is nothing wrong with running the test in a safepoint, but it seems to me that we then would have to run almost all TEST_VM tests in a safepoint? Thanks, Erik From kim.barrett at oracle.com Mon Jul 9 19:51:15 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 9 Jul 2018 15:51:15 -0400 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: <2aa63d5e-d148-219c-2e77-928fd0964a64@oracle.com> References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> <2aa63d5e-d148-219c-2e77-928fd0964a64@oracle.com> Message-ID: <92BE79FB-A982-46AB-80D7-61B459D88396@oracle.com> > On Jul 9, 2018, at 11:49 AM, Erik Helin wrote: > > On 07/08/2018 05:52 PM, Kim Barrett wrote: >>> On Jul 5, 2018, at 4:03 PM, Kim Barrett wrote: >>> >>>> On Jul 5, 2018, at 3:57 AM, Thomas Schatzl wrote: >>>> >>>> Hi, >>>> >>>> On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote: >>>>> Please review this fix of the HeapRegion gtest. >>>>> >>>>> The test modifies a region's "top" to unexpected values without >>>>> ensuring that no allocation might use the region and no GC might run >>>>> while the region is in that invalid state. We solve this by >>>>> executing the test code in its very own safepoint, and by saving and >>>>> then restoring the region's top back to its original value before >>>>> completing the test. 
And since we are doing all that, there's no >>>>> longer any reason to run the test in a separate VM. >>>> >>>> looks good, but the actual test is still run in a separate VM. >>>> Intentional? >>> >>> Unintentional. And now I'm not sure what I last ran through mach5. >>> I'll re-test with TEST_OTHER_VM => TEST_VM. >>> >>> I know that failed in an obscure way earlier, but I think that was because >>> of an unrelated recently introduced bug that's been fixed in the repo. >> Verified that I really have tested in same VM. New webrev: >> http://cr.openjdk.java.net/~kbarrett/8204691/open.01/ >> The only change is TEST_OTHER_VM => TEST_VM. > > Hmmm, it is (very) unfortunate if we have native code in JVM allocating Java objects and triggering garbage collections _concurrently_ with the unit tests being run (there shouldn't be any Java code running when the unit tests are executing). I understand that we have to restore the top pointer in case there is some verification for example when the JVM exits (or if we assert in a destructor etc), but do we really need to run the test in a safepoint? There is nothing wrong with running the test in a safepoint, but it seems to me that we then would have to run almost all TEST_VM tests in a safepoint? > > Thanks, > Erik I don't think it's quite *that* bad. As far as I can tell, TEST_VM tests have always been executed concurrently with the executing VM. That is, the VM is created (by calling JNI_CreateJavaVM), and then the same thread that made that call (which is now the main thread for the VM) executes the TEST_VM tests. That thread is a Java thread, initially "in native" (which is why we need to do the ThreadInVMfromNative transition first, before going to the safepoint). It's only a problem for tests that mess with VM data structures in unexpected ways. I guess many / most / nearly all(?) don't do that, since we haven't seen more of these kinds of failures.
But I agree that it does mean one needs to take some additional care when writing TEST_VM tests. And in fact, there are some tests that rely on that behavior, e.g. the test of OopStorage::delete_empty_blocks_concurrent(). This particular test is doing something really nasty behind the collector's back. It was trying to protect against that by using TEST_OTHER_VM, but that just narrowed the window for failures. I looked at the 3 other uses of TEST_OTHER_VM, and none of them appear to have this kind of problem. They are run in another VM because they side-effect the VM in a way that we don't necessarily want to apply to the main test runner. But they don't seem to be bashing on VM data structures in non-approved ways. From Derek.White at cavium.com Mon Jul 9 20:48:16 2018 From: Derek.White at cavium.com (White, Derek) Date: Mon, 9 Jul 2018 20:48:16 +0000 Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space In-Reply-To: References: <36321D6C-A8B7-48FB-8560-9B5807956A87@oracle.com> Message-ID: Hi Michihiro, FYI, this patch does seem to help AArch64 also on SPECjbb to a lesser degree. This was benchmarked with very large young gen, so GC overhead is kept lower than you'd see in typical applications. * Derek From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] On Behalf Of Michihiro Horie Sent: Wednesday, July 04, 2018 4:26 AM To: Kim Barrett Cc: hotspot-gc-dev at openjdk.java.net; Gustavo Romero Subject: Re: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space External Email Hi Martin, Kim, Thank you for both of your comments. I missed the point that oopDesc::forward_to is invoked from several callers. Using OrderAccess::storestore() before the invocation of forward_to() would be a great idea, thanks. >I haven't looked carefully at the change, though I did find one part >that I don't like.
The new test of "order" in forward_to_atomic not
>only affects CMS, but also (uselessly) affects G1.

Please let me confirm your point. You mean I should give memory_order_acq_rel to forward_to_atomic, which uses tests as follows to hold the consistent meaning of acquire/release in forward_to_atomic? I agree it is not clear the test with release returns the forwardee with acquire.

oop oopDesc::forward_to_atomic(oop p, atomic_memory_order order) {
  :
  while (!oldMark->is_marked()) {
    if (order == memory_order_acq_rel) {
      curMark = cas_set_mark_raw(forwardPtrMark, oldMark, memory_order_release);
    } else {
      curMark = cas_set_mark_raw(forwardPtrMark, oldMark, order);
    }
    }
    :
  }
  if (order == memory_order_acq_rel) {
    return forwardee_acquire();
  }
  return forwardee();
}

Best regards,
--
Michihiro,
IBM Research - Tokyo

From: Kim Barrett
To: Michihiro Horie
Cc: "Doerr, Martin", "hotspot-gc-dev at openjdk.java.net", Gustavo Romero
Date: 2018/07/04 05:41
Subject: Re: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space
________________________________

> On Jul 3, 2018, at 4:25 AM, Michihiro Horie wrote:
>
> Hi Martin,
>
> Thanks a lot for your review. Sure, we need an OK from a CMS expert. Following is the new webrev:
> http://cr.openjdk.java.net/~mhorie/8205908/webrev.01/
>
> >Seems like a user of the forwardee needs to rely on memory_order_consume in the current implementation. I guess it will be appreciated that you're fixing this.
> Thank you for pointing out this issue in the original implementation. I newly inserted a release at "2.4. Set new_obj as forwardee [L1142]".
>
> Improvement of critical-jOPS in SPECjbb2015 was 10%, which is still a big number.
> > > Best regards, > -- > Michihiro, > IBM Research - Tokyo CMS was deprecated in JDK 9, and has been on maintenance life-support for some time. This complex-to-review performance enhancement was proposed less than 48 hours before JDK 11 FC, and didn't receive any reviews until after FC. Because of these factors, I don't think it should be included in JDK 11. And if CMS gets removed in JDK 12 (I don't know if that will happen), then this change would be rendered entirely moot. I haven't looked carefully at the change, though I did find one part that I don't like. The new test of "order" in forward_to_atomic not only affects CMS, but also (uselessly) affects G1. I'm not going to be able to look at this carefully soon, as JDK 11 bug fixing has a higher priority for me. Since I think CMS might soon not be an issue, I'd really rather not look at it at all. I think this change needs not just a CMS-expert reviewer, but someone who is willing to maintain CMS (including any potential bug tail from this change). -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 105 bytes Desc: image001.gif URL: From HORIE at jp.ibm.com Tue Jul 10 07:36:11 2018 From: HORIE at jp.ibm.com (Michihiro Horie) Date: Tue, 10 Jul 2018 16:36:11 +0900 Subject: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space In-Reply-To: References: <36321D6C-A8B7-48FB-8560-9B5807956A87@oracle.com> Message-ID: Hi Derek, Thank you very much for testing this change in AArch64 and giving observation on the result, which makes sense. I uploaded a new webrev based on the comments from Martin and Kim. 
http://cr.openjdk.java.net/~mhorie/8205908/webrev.02/ Best regards, -- Michihiro, IBM Research - Tokyo From: "White, Derek" To: Michihiro Horie , Kim Barrett Cc: "hotspot-gc-dev at openjdk.java.net" , Gustavo Romero Date: 2018/07/10 05:48 Subject: RE: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space Hi Michihiro, FYI, this patch does seem to help AArch64 also on SPECjbb to a lesser degree. This was benchmarked with very large young gen, so GC overhead is kept lower than you'd see in typical applications. Derek From: hotspot-gc-dev [mailto:hotspot-gc-dev-bounces at openjdk.java.net] On Behalf Of Michihiro Horie Sent: Wednesday, July 04, 2018 4:26 AM To: Kim Barrett Cc: hotspot-gc-dev at openjdk.java.net; Gustavo Romero Subject: Re: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space External Email Hi Martin, Kim, Thank you for both of your comments. I missed the point that oopDesc::forward_to is invoked from several callers. Using OrderAccess::storestore() before the invocation of forward_to() would be a great idea, thanks. >I haven't looked carefully at the change, though I did find one part >that I don't like. The new test of "order" in forward_to_atomic not >only affects CMS, but also (uselessly) affects G1. Please let me confirm your point. You mean I should give memory_order_acq_rel to forward_to_atomic, which uses tests as follows to hold the consistent meaning of acquire/release in forward_to_atomic? I agree it is not clear the test with release returns the forwardee with acquire.
oop oopDesc::forward_to_atomic(oop p, atomic_memory_order order) {
  :
  while (!oldMark->is_marked()) {
    if (order == memory_order_acq_rel) {
      curMark = cas_set_mark_raw(forwardPtrMark, oldMark, memory_order_release);
    } else {
      curMark = cas_set_mark_raw(forwardPtrMark, oldMark, order);
    }
    }
    :
  }
  if (order == memory_order_acq_rel) {
    return forwardee_acquire();
  }
  return forwardee();
}

Best regards,
--
Michihiro,
IBM Research - Tokyo

From: Kim Barrett
To: Michihiro Horie
Cc: "Doerr, Martin", "hotspot-gc-dev at openjdk.java.net", Gustavo Romero
Date: 2018/07/04 05:41
Subject: Re: 8205908: Unnecessarily strong memory barriers in ParNewGeneration::copy_to_survivor_space

> On Jul 3, 2018, at 4:25 AM, Michihiro Horie wrote:
>
> Hi Martin,
>
> Thanks a lot for your review. Sure, we need an OK from a CMS expert. Following is the new webrev:
> http://cr.openjdk.java.net/~mhorie/8205908/webrev.01/
>
> >Seems like a user of the forwardee needs to rely on memory_order_consume in the current implementation. I guess it will be appreciated that you're fixing this.
> Thank you for pointing out this issue in the original implementation. I newly inserted a release at "2.4. Set new_obj as forwardee [L1142]".
>
> Improvement of critical-jOPS in SPECjbb2015 was 10%, which is still a big number.
>
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo

CMS was deprecated in JDK 9, and has been on maintenance life-support for some time. This complex-to-review performance enhancement was proposed less than 48 hours before JDK 11 FC, and didn't receive any reviews until after FC. Because of these factors, I don't think it should be included in JDK 11. And if CMS gets removed in JDK 12 (I don't know if that will happen), then this change would be rendered entirely moot.
I haven't looked carefully at the change, though I did find one part that I don't like. The new test of "order" in forward_to_atomic not only affects CMS, but also (uselessly) affects G1. I'm not going to be able to look at this carefully soon, as JDK 11 bug fixing has a higher priority for me. Since I think CMS might soon not be an issue, I'd really rather not look at it at all. I think this change needs not just a CMS-expert reviewer, but someone who is willing to maintain CMS (including any potential bug tail from this change). -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From erik.helin at oracle.com Tue Jul 10 14:34:05 2018 From: erik.helin at oracle.com (Erik Helin) Date: Tue, 10 Jul 2018 16:34:05 +0200 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: <92BE79FB-A982-46AB-80D7-61B459D88396@oracle.com> References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> <2aa63d5e-d148-219c-2e77-928fd0964a64@oracle.com> <92BE79FB-A982-46AB-80D7-61B459D88396@oracle.com> Message-ID: On 07/09/2018 09:51 PM, Kim Barrett wrote: >> On Jul 9, 2018, at 11:49 AM, Erik Helin wrote: >> >> On 07/08/2018 05:52 PM, Kim Barrett wrote: >>>> On Jul 5, 2018, at 4:03 PM, Kim Barrett wrote: >>>> >>>>> On Jul 5, 2018, at 3:57 AM, Thomas Schatzl wrote: >>>>> >>>>> Hi, >>>>> >>>>> On Wed, 2018-07-04 at 20:13 -0400, Kim Barrett wrote: >>>>>> Please review this fix of the HeapRegion gtest. >>>>>> >>>>>> The test modifies a region's "top" to unexpected values without >>>>>> ensuring that no allocation might use the region and no GC might run >>>>>> while the region is in that invalid state. 
We solve this by >>>>>> executing the test code in its very own safepoint, and by saving and >>>>>> then restoring the region's top back to its original value before >>>>>> completing the test. And since we are doing all that, there's no >>>>>> longer any reason to run the test in a separate VM. >>>>> >>>>> looks good, but the actual test is still run in a separate VM. >>>>> Intentional? >>>> >>>> Unintentional. And now I'm not sure what I last ran through mach5. >>>> I'll re-test with TEST_OTHER_VM => TEST_VM. >>>> >>>> I know that failed in an obscure way earlier, but I think that was because >>>> of an unrelated recently introduced bug that's been fixed in the repo. >>> Verified that I really have tested in same VM. New webrev: >>> http://cr.openjdk.java.net/~kbarrett/8204691/open.01/ >>> The only change is TEST_OTHER_VM => TEST_VM. >> >> Hmmm, it is (very) unfortunate if we have native code in JVM allocating Java objects and triggering garbage collections _concurrently_ with the unit tests being run (there shouldn't be any Java code running when the unit tests are executing). I understand that we have to restore the top pointer in case there is some verification for example when the JVM exits (or if we assert in a destructor etc), but do we really need to run the test in a safepoint? There is nothing wrong with running the test in a safepoint, but it seems to me that we then would have to run almost all TEST_VM tests in a safepoint? >> >> Thanks, >> Erik > > I don't think it's quite *that* bad. > > As far as I can tell, TEST_VM tests have always been executed > concurrently with the executing VM. That is, the VM is created (by > calling JNI_CreateJavaVM), and then the same thread that made that > call (which is now the main thread for the VM) executes the TEST_VM > tests. That thread is a Java thread, initially "in native" (which is > why we need to do the ThreadInVMfromNative transition first, before > going to the safepoint).
> > It's only a problem for tests that mess with VM data structures in > unexpected ways. I guess many / most / nearly all(?) don't do that, > since we haven't seen more of these kinds of failures. But I agree > that it does mean one needs to take some additional care when writing > TEST_VM tests. > > And in fact, there are some tests that rely on that behavior, e.g. > the test of OopStorage::delete_empty_blocks_concurrent(). > > This particular test is doing something really nasty behind the > collector's back. It was trying to protect against that by using > TEST_OTHER_VM, but that just narrowed the window for failures. > > I looked at the 3 other uses of TEST_OTHER_VM, and none of them appear > to have this kind of problem. They are run in another VM because they > side-effect the VM in a way that we don't necessarily want to apply > to the main test runner. But they don't seem to be bashing on VM data > structures in non-approved ways. Thanks for doing another round of checking. I'm still a bit concerned about some of our TEST_VM tests; I know that there are tests that e.g. temporarily change the values of the flags in a way that would mess up a garbage collection. But let's leave that out of this patch; if there are additional problems with other tests then those can be solved in separate patches. This patch looks good, Reviewed.
Thanks, Erik From kim.barrett at oracle.com Tue Jul 10 17:21:19 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 10 Jul 2018 13:21:19 -0400 Subject: RFR[JDK11]: 8204691: HeapRegion.apply_to_marked_objects_other_vm_test fails with assert(!hr->is_free() || hr->is_empty()) failed: Free region 0 is not empty for set Free list # In-Reply-To: References: <7BB45268-F475-41CA-BB2D-DFB2FE9AE6E4@oracle.com> <2aa63d5e-d148-219c-2e77-928fd0964a64@oracle.com> <92BE79FB-A982-46AB-80D7-61B459D88396@oracle.com> Message-ID: <7F9CFD8E-566B-483B-BE72-54C8524E4680@oracle.com> > On Jul 10, 2018, at 10:34 AM, Erik Helin wrote: > > On 07/09/2018 09:51 PM, Kim Barrett wrote: >>> On Jul 9, 2018, at 11:49 AM, Erik Helin wrote: >>> >>> […] >>> Hmmm, it is (very) unfortunate if we have native code in JVM allocating Java objects and triggering garbage collections _concurrently_ with the unit tests being run (there shouldn't be any Java code running when the unit tests are executing). I understand that we have to restore the top pointer in case there is some verification for example when the JVM exits (or if we assert in a destructor etc), but do we really need to run the test in a safepoint? There is nothing wrong with running the test in a safepoint, but it seems to me that we then would have to run almost all TEST_VM tests in a safepoint? >>> >>> Thanks, >>> Erik >> I don't think it's quite *that* bad. […] > > Thanks for doing another round of checking. I'm still a bit concerned about some of our TEST_VM tests, I know that there are tests that e.g. temporarily changes the values of the flags in a way that would mess up a garbage collection. But lets leave that out of this patch, if there are additional problems with other tests then those can be solved in separate patches. Yes. > This patch looks good, Reviewed. Thanks.
> > Thanks, > Erik From fairoz.matte at oracle.com Wed Jul 11 13:14:06 2018 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Wed, 11 Jul 2018 06:14:06 -0700 (PDT) Subject: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to disable class unloading Message-ID: <6371e589-466c-4b8e-b974-1dee8d2446c4@default> Hi, Kindly review the backport of "JDK-8114823: G1 doesn't honor request to disable class unloading" to 8u. Webrev - http://cr.openjdk.java.net/~fmatte/8114823/webrev.00/ JDK 9 bug - https://bugs.openjdk.java.net/browse/JDK-8114823 JDK 9 changeset - http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/53a14fe65414 Review thread - http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2016-June/018298.html Thanks, Fairoz From boris.ulasevich at bell-sw.com Wed Jul 11 16:17:26 2018 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Wed, 11 Jul 2018 19:17:26 +0300 Subject: [11] RFR(XS) 8207044: minimal vm build fail: missing #include Message-ID: <80f472dd-6d16-0714-5f43-1b3d9512888a@bell-sw.com> Hi all, Please review the following patch: http://cr.openjdk.java.net/~bulasevich/8207044/webrev.01 https://bugs.openjdk.java.net/browse/JDK-8207044 thanks, Boris From zgu at redhat.com Wed Jul 11 16:28:15 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 11 Jul 2018 12:28:15 -0400 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning Message-ID: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> Please review this small change to support object pinning in Epsilon GC. Pinning object in Epsilon GC is no-op, so it is simpler than doing GCLock dance.
Bug: https://bugs.openjdk.java.net/browse/JDK-8207056 Webrev: http://cr.openjdk.java.net/~zgu/8207056/webrev.00/ Test: gc/epsilon on Linux 64 (fastdebug + release) Thanks, -Zhengyu From shade at redhat.com Wed Jul 11 16:30:47 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 11 Jul 2018 18:30:47 +0200 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning In-Reply-To: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> References: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> Message-ID: On 07/11/2018 06:28 PM, Zhengyu Gu wrote: > Please review this small change to support object pinning in Epsilon GC. > > Pinning object in Epsilon GC is no-op, so it is simpler than doing GCLock dance. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8207056 > Webrev: http://cr.openjdk.java.net/~zgu/8207056/webrev.00/ This looks good to me, thanks! Make the comment more verbose: "Object pinning support: every object is implicitly pinned" -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From shade at redhat.com Wed Jul 11 16:34:59 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 11 Jul 2018 18:34:59 +0200 Subject: [11] RFR(XS) 8207044: minimal vm build fail: missing #include In-Reply-To: <80f472dd-6d16-0714-5f43-1b3d9512888a@bell-sw.com> References: <80f472dd-6d16-0714-5f43-1b3d9512888a@bell-sw.com> Message-ID: <74059557-9104-aa43-737f-023fd35ae052@redhat.com> On 07/11/2018 06:17 PM, Boris Ulasevich wrote: > Please review the following patch: > http://cr.openjdk.java.net/~bulasevich/8207044/webrev.01 > https://bugs.openjdk.java.net/browse/JDK-8207044 Looks good and trivial to me. -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From zgu at redhat.com Wed Jul 11 16:37:45 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 11 Jul 2018 12:37:45 -0400 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning In-Reply-To: References: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> Message-ID: <0a24a175-7b33-84bd-50f4-049ee9c22a8b@redhat.com> On 07/11/2018 12:30 PM, Aleksey Shipilev wrote: > On 07/11/2018 06:28 PM, Zhengyu Gu wrote: >> Please review this small change to support object pinning in Epsilon GC. >> >> Pinning object in Epsilon GC is no-op, so it is simpler than doing GCLock dance. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8207056 >> Webrev: http://cr.openjdk.java.net/~zgu/8207056/webrev.00/ > > This looks good to me, thanks! > > Make the comment more verbose: "Object pinning support: every object is implicitly pinned" Thanks for the review. Updated: http://cr.openjdk.java.net/~zgu/8207056/webrev.01/ -Zhengyu > > -Aleksey > From shade at redhat.com Wed Jul 11 16:44:57 2018 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 11 Jul 2018 18:44:57 +0200 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning In-Reply-To: <0a24a175-7b33-84bd-50f4-049ee9c22a8b@redhat.com> References: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> <0a24a175-7b33-84bd-50f4-049ee9c22a8b@redhat.com> Message-ID: <03e59567-be00-70be-c4de-ea0fb5818b7d@redhat.com> On 07/11/2018 06:37 PM, Zhengyu Gu wrote: > Updated: http://cr.openjdk.java.net/~zgu/8207056/webrev.01/ Looks good. This is actually a very simple patch, and it avoids going to GCLocker needlessly. Maybe we should push it to 11, as "Late Enhancement": http://openjdk.java.net/jeps/3#Late-Enhancement-Request-Process -Aleksey -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From kim.barrett at oracle.com Wed Jul 11 17:14:02 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 11 Jul 2018 13:14:02 -0400 Subject: [11] RFR(XS) 8207044: minimal vm build fail: missing #include In-Reply-To: <80f472dd-6d16-0714-5f43-1b3d9512888a@bell-sw.com> References: <80f472dd-6d16-0714-5f43-1b3d9512888a@bell-sw.com> Message-ID: > On Jul 11, 2018, at 12:17 PM, Boris Ulasevich wrote: > > Hi all, > > Please review the following patch: > http://cr.openjdk.java.net/~bulasevich/8207044/webrev.01 > https://bugs.openjdk.java.net/browse/JDK-8207044 > > thanks, > Boris Looks good. I think you are not yet a committer? If so, I can push it for you. From rkennke at redhat.com Wed Jul 11 17:18:31 2018 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 11 Jul 2018 19:18:31 +0200 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning In-Reply-To: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> References: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> Message-ID: <69dfdff9-6e8b-6e11-6e94-b42ca45dc93e@redhat.com> On 11.07.2018 at 18:28, Zhengyu Gu wrote: > Please review this small change to support object pinning in Epsilon GC. > > Pinning object in Epsilon GC is no-op, so it is simpler than doing > GCLock dance. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8207056 > Webrev: http://cr.openjdk.java.net/~zgu/8207056/webrev.00/ > > Test: > > gc/epsilon on Linux 64 (fastdebug + release) > > Thanks, Looks good! Roman -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From zgu at redhat.com Wed Jul 11 17:19:33 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 11 Jul 2018 13:19:33 -0400 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning In-Reply-To: <03e59567-be00-70be-c4de-ea0fb5818b7d@redhat.com> References: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> <0a24a175-7b33-84bd-50f4-049ee9c22a8b@redhat.com> <03e59567-be00-70be-c4de-ea0fb5818b7d@redhat.com> Message-ID: <93f57b73-a5e1-bed4-19b6-0c1460f86a8e@redhat.com> On 07/11/2018 12:44 PM, Aleksey Shipilev wrote: > On 07/11/2018 06:37 PM, Zhengyu Gu wrote: >> Updated: http://cr.openjdk.java.net/~zgu/8207056/webrev.01/ > > Looks good. This is actually a very simple patch, and it avoids going to GCLocker needlessly. Maybe > we should push it to 11, as "Late Enhancement": > http://openjdk.java.net/jeps/3#Late-Enhancement-Request-Process Done. -Zhengyu > > -Aleksey > From zgu at redhat.com Wed Jul 11 17:21:48 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 11 Jul 2018 13:21:48 -0400 Subject: [12] RFR(XS) 8207056: Epsilon GC to support object pinning In-Reply-To: <69dfdff9-6e8b-6e11-6e94-b42ca45dc93e@redhat.com> References: <8ffa03a3-de73-bafe-2664-3aa8a170db2d@redhat.com> <69dfdff9-6e8b-6e11-6e94-b42ca45dc93e@redhat.com> Message-ID: <8cebb738-fd55-19a4-0079-741794b62a28@redhat.com> Thanks, Roman. -Zhengyu On 07/11/2018 01:18 PM, Roman Kennke wrote: > On 11.07.2018 at 18:28, Zhengyu Gu wrote: >> Please review this small change to support object pinning in Epsilon GC. >> >> Pinning object in Epsilon GC is no-op, so it is simpler than doing >> GCLock dance. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8207056 >> Webrev: http://cr.openjdk.java.net/~zgu/8207056/webrev.00/ >> >> Test: >> >> gc/epsilon on Linux 64 (fastdebug + release) >> >> Thanks, > > > Looks good!
> > Roman > > From boris.ulasevich at bell-sw.com Wed Jul 11 18:13:02 2018 From: boris.ulasevich at bell-sw.com (Boris Ulasevich) Date: Wed, 11 Jul 2018 21:13:02 +0300 Subject: [11] RFR(XS) 8207044: minimal vm build fail: missing #include In-Reply-To: References: <80f472dd-6d16-0714-5f43-1b3d9512888a@bell-sw.com> Message-ID: Yes, push it please. Thank you! 11.07.2018 20:14, Kim Barrett wrote: >> On Jul 11, 2018, at 12:17 PM, Boris Ulasevich wrote: >> >> Hi all, >> >> Please review the following patch: >> http://cr.openjdk.java.net/~bulasevich/8207044/webrev.01 >> https://bugs.openjdk.java.net/browse/JDK-8207044 >> >> thanks, >> Boris > Looks good. > > I think you are not yet a committer? If so, I can push it for you. From vladimir.kozlov at oracle.com Thu Jul 12 21:28:51 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Thu, 12 Jul 2018 14:28:51 -0700 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. Message-ID: Including GC group since I added new method to GCConfig. http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ https://bugs.openjdk.java.net/browse/JDK-8207069 Recent Graal's changes [1] added list of GC [2] which matches Hotspot GC list [3]. I used that to fix this issue by storing enum value from Graal in AOT config header and compare it with selected GC when AOT library is loaded into Hotspot. I verified the fix with all GCs combination when compiling AOT lib and using it. I also ran our hs-tier1-3 testing which includes AOT and Graal tests. These changes are for JDK 11 so I don't need to go through Graal PR now but I would need to do that for JDK 12 to make changes in AOT code.
Thanks, Vladimir [1] https://bugs.openjdk.java.net/browse/JDK-8205824 "[GR-10514] Use whitelist for GCs supported by Graal" [2] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 [3] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 From erik.helin at oracle.com Fri Jul 13 08:18:19 2018 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 13 Jul 2018 10:18:19 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: Hi Daniel, thanks for letting us know. Since you have only set -Xms512M and -Xmx512M and you are running on JDK 10, that means you are using the G1 garbage collector, so all the calls to pool->get_memory_usage() in the loop will end up in g1MemoryPool.cpp [0], which in turn will return cached values from the recalculate_sizes code in G1MonitoringSupport [1]. Since you are running with -Xmx512M you should have gotten 1 MB sized regions (see heapRegion.cpp for details [2]), so the 5 MB _could_ mean that five regions were accounted wrongly. Do you have any kind of GC logging from the test run where you encountered the bug? The code in G1MonitoringSupport::recalculate_sizes seems messy enough that there could be a small bug in there. I'm adding hotspot-gc-dev since all GC developers might not read serviceability-dev.
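The region arithmetic above can be checked with a minimal Java sketch. It assumes 1 MB regions, as expected for a 512 MB heap, and uses the "committed" and "max" values from the exception in the quoted report. The class and method names (CommittedVsMax, excessBytes, excessRegions, rejectedByMemoryUsage) are hypothetical and exist only for illustration; the one real API used is the public java.lang.management.MemoryUsage constructor, which throws IllegalArgumentException when committed exceeds a defined max.

```java
import java.lang.management.MemoryUsage;

// Hypothetical helper, for illustration only; not part of the JDK or of any patch.
final class CommittedVsMax {
    // Assumption: 1 MB heap regions, as expected for -Xmx512M.
    static final long REGION_SIZE = 1024L * 1024L;

    // Bytes by which committed exceeds max (0 if committed is within bounds).
    static long excessBytes(long committed, long max) {
        return Math.max(0L, committed - max);
    }

    // The excess expressed in whole regions, or -1 if it is not a multiple
    // of the assumed region size.
    static long excessRegions(long committed, long max) {
        long excess = excessBytes(committed, max);
        return (excess % REGION_SIZE == 0) ? excess / REGION_SIZE : -1L;
    }

    // True if the MemoryUsage constructor rejects these values, which is
    // what surfaces as the IllegalArgumentException in the report.
    static boolean rejectedByMemoryUsage(long used, long committed, long max) {
        try {
            new MemoryUsage(0L, used, committed, max);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

With the reported values, excessRegions(542113792L, 536870912L) yields 5, i.e. exactly five regions of the assumed size, which is consistent with a whole-region accounting error rather than an arbitrary byte offset.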
Thanks, Erik [0]: http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/g1MemoryPool.cpp [1]: http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/g1MonitoringSupport.cpp#l182 [2]: http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/heapRegion.cpp#l63 On 07/12/2018 03:35 PM, Daniel Mitterdorfer wrote: > Hi, > > while working on a change in Elasticsearch, I discovered an interesting > situation related to the implementation of jmm_getMemoryUsage (see > [jdk-mem-usage]). In one of the test runs, a test failed with the following > exception: > > java.lang.IllegalArgumentException: committed = 542113792 should be < > max = 536870912 > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) > at sun.management.MemoryImpl.getMemoryUsage0(Native Method) > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) > [...] > > This happened on macOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags > specified were -Xms512M -Xmx512M. So far this failure occurred only once and I > could not reproduce it yet. > > The values reported in the exception message are: > > * "max": 536870912 = 512MB (exactly) > * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max". > > As the value of "max" is exactly what we have specified with -Xmx it seems to > indicate that the problem is the calculation of "committed". I do not > understand under which conditions this can happen thus I post this to the > mailing list in case anybody has ideas what might cause this.
> > I plan to run further tests with JVM trace logging enabled > (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be > precise) in the hope that this problem will occur again and I can provide logs > that help to debug / fix the problem. > > Searching for that error message, there is [JDK-8020530] but that one is about > *non-heap* memory usage and has already been resolved a while ago. Several > sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate > that this problem happened indeed in the wild but what I find odd is that I > could not find a single ticket in the OpenJDK bug tracker or a discussion on a > JDK mailing list about this problem. > > I'd be glad to get any pointers on what might cause this or requests for > additional info that I need to provide to help analyze this problem. > > Thanks, > Daniel > > [jdk-mem-usage] > http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 > [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530 > [apache-ignite-workaround] > https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 > [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733 > From daniel.mitterdorfer at gmail.com Fri Jul 13 08:30:17 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Fri, 13 Jul 2018 10:30:17 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: Hi Erik, > > Do you any kind of GC logging from the test run where you encountered > the bug? Unfortunately, we don't have GC logging enabled by default in our test suite so the exception trace is all I got. 
I am now repeatedly running the test suite with the original flags (-Xms512M -Xmx512M) and also added the following logging configuration: -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags As soon as I get another failure, I'll provide the full log file. Please let me know if you need any other logs (i.e. whether I should adjust my log configuration). Daniel From thomas.schatzl at oracle.com Fri Jul 13 08:33:33 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 13 Jul 2018 10:33:33 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote: > Hi Erik, > > > > Do you any kind of GC logging from the test run where you > > encountered the bug? > > Unfortunately, we don't have GC logging enabled by default in our > test suite so the exception trace is all I got. I am now repeatedly > running the test suite with the original flags (-Xms512M -Xmx512M) > and also added the following logging configuration: > > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > > As soon as I get another failure, I'll provide the full log file. > Please let me know if you need any other logs (i.e. whether I should > adjust my log configuration). I think these flags are fine. Since Erik and me strongly believe the issue is with the relevant G1 code Erik mentioned we will reassign the bug to us (he said there is already a bug reported on it). Thanks a lot, Thomas From daniel.mitterdorfer at gmail.com Fri Jul 13 14:10:37 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Fri, 13 Jul 2018 16:10:37 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: Hi, I have good news. I was able to reproduce this issue but this time I have logs. 
A test failed around 15:06:55 with the following stack trace:
java.lang.IllegalArgumentException: committed = 537919488 should be < max = 536870912
    at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
    at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
    at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)
This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10 (build 10+46). The JVM arguments were: -Xms512M -Xmx512M -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags The logs are somewhat massive (~250MB uncompressed) and available at https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0 I hope that helps identify the cause. Please let me know if you need anything else. Daniel On Fri, 13 July 2018 at 10:33, Thomas Schatzl wrote: > > On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote: > > Hi Erik, > > > > > > Do you any kind of GC logging from the test run where you > > > encountered the bug? > > > > Unfortunately, we don't have GC logging enabled by default in our > > test suite so the exception trace is all I got. I am now repeatedly > > running the test suite with the original flags (-Xms512M -Xmx512M) > > and also added the following logging configuration: > > > > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > > > > As soon as I get another failure, I'll provide the full log file. > > Please let me know if you need any other logs (i.e. whether I should > > adjust my log configuration). > > I think these flags are fine. > > Since Erik and me strongly believe the issue is with the relevant G1 > code Erik mentioned we will reassign the bug to us (he said there is > already a bug reported on it).
> > Thanks a lot, > Thomas > From goetz.lindenmaier at sap.com Tue Jul 17 09:49:24 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Tue, 17 Jul 2018 09:49:24 +0000 Subject: Test IncompatibleOptions.java is failing if ZGC is not compiled Message-ID: <5831c166cd3a4cdebcf2124d01a8c92c@sap.com> Hi, did anybody notice that the test runtime/appcds/sharedStrings/IncompatibleOptions.java is failing if openJdk is compiled the default way, i.e., with INCLUDE_ZGC=0? Best regards, Goetz. [STDOUT] Error occurred during initialization of VM Option -XX:+UseZGC not supported ----------System.err:(23/1362)---------- stdout: [Error occurred during initialization of VM Option -XX:+UseZGC not supported ]; stderr: [] exitValue = 1 java.lang.RuntimeException: 'Cannot dump shared archive when UseCompressedOops or UseCompressedClassPointers is off' missing from stdout/stderr From cthalinger at twitter.com Wed Jul 18 00:17:21 2018 From: cthalinger at twitter.com (Christian Thalinger) Date: Tue, 17 Jul 2018 20:17:21 -0400 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: References: Message-ID: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> > On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov wrote: > > Including GC group since I added new method to GCConfig. > > http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ > https://bugs.openjdk.java.net/browse/JDK-8207069 > > Recent Graal's changes [1] added list of GC [2] which matches Hotspot GC list [3]. > I used that to fix this issue by storing enum value from Graal in AOT config header and compare it with selected GC when AOT library is loaded into Hotspot. The fix is correct but too strict. For example, Serial and Parallel GC can use the same AOT library. CMS too. > > I verified the fix with all GCs combination when compiling AOT lib and using it. I also ran our hs-tier1-3 testing which includes AOT and Graal tests. 
> > These changes are for JDK 11 so I don't need to go through Graal PR now but I would need to do that for JDK 12 to make changes in AOT code. > > Thanks, > Vladimir > > [1] https://bugs.openjdk.java.net/browse/JDK-8205824 > "[GR-10514] Use whitelist for GCs supported by Graal" > [2] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 > [3] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 From vladimir.kozlov at oracle.com Wed Jul 18 04:14:32 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 17 Jul 2018 21:14:32 -0700 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> Message-ID: On 7/17/18 5:17 PM, Christian Thalinger wrote: > > >> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov wrote: >> >> Including GC group since I added new method to GCConfig. >> >> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ >> https://bugs.openjdk.java.net/browse/JDK-8207069 >> >> Recent Graal's changes [1] added list of GC [2] which matches Hotspot GC list [3]. >> I used that to fix this issue by storing enum value from Graal in AOT config header and compare it with selected GC when AOT library is loaded into Hotspot. > > The fix is correct but too strict. For example, Serial and Parallel GC can use the same AOT library. CMS too. Do you have other suggestions how to check compatibility? Thanks, Vladimir > >> >> I verified the fix with all GCs combination when compiling AOT lib and using it. I also ran our hs-tier1-3 testing which includes AOT and Graal tests. 
>> >> These changes are for JDK 11 so I don't need to go through Graal PR now but I would need to do that for JDK 12 to make changes in AOT code. >> >> Thanks, >> Vladimir >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >> "[GR-10514] Use whitelist for GCs supported by Graal" >> [2] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >> [3] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 > From cthalinger at twitter.com Wed Jul 18 13:15:47 2018 From: cthalinger at twitter.com (Christian Thalinger) Date: Wed, 18 Jul 2018 09:15:47 -0400 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> Message-ID: <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> > On Jul 18, 2018, at 12:14 AM, Vladimir Kozlov wrote: > > On 7/17/18 5:17 PM, Christian Thalinger wrote: >>> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov wrote: >>> >>> Including GC group since I added new method to GCConfig. >>> >>> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ >>> https://bugs.openjdk.java.net/browse/JDK-8207069 >>> >>> Recent Graal's changes [1] added list of GC [2] which matches Hotspot GC list [3]. >>> I used that to fix this issue by storing enum value from Graal in AOT config header and compare it with selected GC when AOT library is loaded into Hotspot. >> The fix is correct but too strict. For example, Serial and Parallel GC can use the same AOT library. CMS too. > > Do you have other suggestions how to check compatibility? I think the best way would be to use BarrierSet::Name: // Do something for each concrete barrier set part of the build. 
#define FOR_EACH_CONCRETE_BARRIER_SET_DO(f) \ f(CardTableBarrierSet) \ EPSILONGC_ONLY(f(EpsilonBarrierSet)) \ G1GC_ONLY(f(G1BarrierSet)) \ ZGC_ONLY(f(ZBarrierSet)) > > Thanks, > Vladimir > >>> >>> I verified the fix with all GCs combination when compiling AOT lib and using it. I also ran our hs-tier1-3 testing which includes AOT and Graal tests. >>> >>> These changes are for JDK 11 so I don't need to go through Graal PR now but I would need to do that for JDK 12 to make changes in AOT code. >>> >>> Thanks, >>> Vladimir >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >>> "[GR-10514] Use whitelist for GCs supported by Graal" >>> [2] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >>> [3] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Wed Jul 18 19:08:09 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 18 Jul 2018 12:08:09 -0700 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> Message-ID: On 7/18/18 6:15 AM, Christian Thalinger wrote: > > >> On Jul 18, 2018, at 12:14 AM, Vladimir Kozlov >> > wrote: >> >> On 7/17/18 5:17 PM, Christian Thalinger wrote: >>>> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov >>>> > wrote: >>>> >>>> Including GC group since I added new method to GCConfig. 
>>>> >>>> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8207069 >>>> >>>> Recent Graal's changes [1] added list of GC [2] which matches >>>> Hotspot GC list [3]. >>>> I used that to fix this issue by storing enum value from Graal in >>>> AOT config header and compare it with selected GC when AOT library >>>> is loaded into Hotspot. >>> The fix is correct but too strict. ?For example, Serial and Parallel >>> GC can use the same AOT library. ?CMS too. >> >> Do you have other suggestions how to check compatibility? > > I think the best way would be to use BarrierSet::Name: > > // Do something for each concrete barrier set part of the build. > #define FOR_EACH_CONCRETE_BARRIER_SET_DO(f)? ? ? ? ? \ > ? f(CardTableBarrierSet) ? ? ? ? ? ? ? ? ? ? ? ? ? ? \ > ? EPSILONGC_ONLY(f(EpsilonBarrierSet)) ? ? ? ? ? ? ? \ > ? G1GC_ONLY(f(G1BarrierSet)) ? ? ? ? ? ? ? ? ? ? ? ? \ > ? ZGC_ONLY(f(ZBarrierSet)) Thank you, Chris, for suggestion. To record barrier set in AOT library would require a lot more complex changes (JVMCI) not suitable for JDK 11. Currently Graal checks only GC flags. To get information about barrier set it needs to access Hotspot's data. The only simple way to relax the check is to get BarrierSet::Name value based on CollectedHeap::Name and compare them in aot library config check code. But I can't find a functionality in GC code to do that. I asked GC group. Note, we never intended to support mixed GCs with the same type of barriers. It was accidental and I am not comfortable to support such "feature". Vladimir > >> >> Thanks, >> Vladimir >> >>>> >>>> I verified the fix with all GCs combination when compiling AOT lib >>>> and using it. I also ran our hs-tier1-3 testing which includes AOT >>>> and Graal tests. >>>> >>>> These changes are for JDK 11 so I don't need to go through Graal PR >>>> now but I would need to do that for JDK 12 to make changes in AOT code. 
>>>> >>>> Thanks, >>>> Vladimir >>>> >>>> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >>>> ?"[GR-10514] Use whitelist for GCs supported by Graal" >>>> [2] >>>> http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >>>> [3] >>>> http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 > From cthalinger at twitter.com Wed Jul 18 21:25:40 2018 From: cthalinger at twitter.com (Christian Thalinger) Date: Wed, 18 Jul 2018 17:25:40 -0400 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> Message-ID: > On Jul 18, 2018, at 3:08 PM, Vladimir Kozlov wrote: > > On 7/18/18 6:15 AM, Christian Thalinger wrote: >>> On Jul 18, 2018, at 12:14 AM, Vladimir Kozlov >> wrote: >>> >>> On 7/17/18 5:17 PM, Christian Thalinger wrote: >>>>> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov >> wrote: >>>>> >>>>> Including GC group since I added new method to GCConfig. >>>>> >>>>> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ > >>>>> https://bugs.openjdk.java.net/browse/JDK-8207069 >>>>> >>>>> Recent Graal's changes [1] added list of GC [2] which matches Hotspot GC list [3]. >>>>> I used that to fix this issue by storing enum value from Graal in AOT config header and compare it with selected GC when AOT library is loaded into Hotspot. >>>> The fix is correct but too strict. For example, Serial and Parallel GC can use the same AOT library. CMS too. >>> >>> Do you have other suggestions how to check compatibility? >> I think the best way would be to use BarrierSet::Name: >> // Do something for each concrete barrier set part of the build. 
>> #define FOR_EACH_CONCRETE_BARRIER_SET_DO(f) \ >> f(CardTableBarrierSet) \ >> EPSILONGC_ONLY(f(EpsilonBarrierSet)) \ >> G1GC_ONLY(f(G1BarrierSet)) \ >> ZGC_ONLY(f(ZBarrierSet)) > > Thank you, Chris, for suggestion. > > To record barrier set in AOT library would require a lot more complex changes (JVMCI) not suitable for JDK 11. Currently Graal checks only GC flags. To get information about barrier set it needs to access Hotspot's data. Yeah, that's a problem. > > The only simple way to relax the check is to get BarrierSet::Name value based on CollectedHeap::Name and compare them in aot library config check code. But I can't find a functionality in GC code to do that. I asked GC group. > > Note, we never intended to support mixed GCs with the same type of barriers. It was accidental and I am not comfortable to support such "feature". You mean that two different GCs use the same barrier set? Yes, I agree, it would be better if each had their own. Do you want to push your current patch? > > Vladimir > >>> >>> Thanks, >>> Vladimir >>> >>>>> >>>>> I verified the fix with all GCs combination when compiling AOT lib and using it. I also ran our hs-tier1-3 testing which includes AOT and Graal tests. >>>>> >>>>> These changes are for JDK 11 so I don't need to go through Graal PR now but I would need to do that for JDK 12 to make changes in AOT code. >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >>>>> "[GR-10514] Use whitelist for GCs supported by Graal" >>>>> [2] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >>>>> [3] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From vladimir.kozlov at oracle.com Wed Jul 18 21:51:15 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 18 Jul 2018 14:51:15 -0700 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> Message-ID: <12a13dff-7474-5c5f-4026-578cc4c40ae4@oracle.com> On 7/18/18 2:25 PM, Christian Thalinger wrote: > > >> On Jul 18, 2018, at 3:08 PM, Vladimir Kozlov >> > wrote: >> >> On 7/18/18 6:15 AM, Christian Thalinger wrote: >>>> On Jul 18, 2018, at 12:14 AM, Vladimir Kozlov >>>> >>> > >>>> wrote: >>>> >>>> On 7/17/18 5:17 PM, Christian Thalinger wrote: >>>>>> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov >>>>>> >>>>> > >>>>>> wrote: >>>>>> >>>>>> Including GC group since I added new method to GCConfig. >>>>>> >>>>>> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8207069 >>>>>> >>>>>> Recent Graal's changes [1] added list of GC [2] which matches >>>>>> Hotspot GC list [3]. >>>>>> I used that to fix this issue by storing enum value from Graal in >>>>>> AOT config header and compare it with selected GC when AOT library >>>>>> is loaded into Hotspot. >>>>> The fix is correct but too strict. ?For example, Serial and >>>>> Parallel GC can use the same AOT library. CMS too. >>>> >>>> Do you have other suggestions how to check compatibility? >>> I think the best way would be to use BarrierSet::Name: >>> // Do something for each concrete barrier set part of the build. >>> #define FOR_EACH_CONCRETE_BARRIER_SET_DO(f)? ? ? ? ? \ >>> f(CardTableBarrierSet) ? ? ? ? ? ? ? ? ? ? ? ? ? ? \ >>> EPSILONGC_ONLY(f(EpsilonBarrierSet)) ? ? ? ? ? ? ? \ >>> G1GC_ONLY(f(G1BarrierSet)) ? ? ? ? ? ? ? ? ? ? ? ? \ >>> ZGC_ONLY(f(ZBarrierSet)) >> >> Thank you, Chris, for suggestion. 
>> >> To record barrier set in AOT library would require a lot more complex >> changes (JVMCI) not suitable for JDK 11. ?Currently Graal checks only >> GC flags. To get information about barrier set it needs to access >> Hotspot's data. > > Yeah, that?s a problem. > >> >> The only simple way to relax the check is to get BarrierSet::Name >> value based on CollectedHeap::Name and compare them in aot library >> config check code. But I can't find a functionality in GC code to do >> that. I asked GC group. I got answer from GC that they don't have such mapping and don't think it is needed. >> >> Note, we never intended to support mixed GCs with the same type of >> barriers. It was accidental and I am not comfortable to support such >> "feature?. > > You mean that two different GCs use the same barrier set? ?Yes, I agree, > it would be better if each had their own. > > Do you want to push your current patch? Yes. I am waiting PR review from Labs since AOT code (jaotc) is now there. Thanks, Vladimir > >> >> Vladimir >> >>>> >>>> Thanks, >>>> Vladimir >>>> >>>>>> >>>>>> I verified the fix with all GCs combination when compiling AOT lib >>>>>> and using it. I also ran our hs-tier1-3 testing which includes AOT >>>>>> and Graal tests. >>>>>> >>>>>> These changes are for JDK 11 so I don't need to go through Graal >>>>>> PR now but I would need to do that for JDK 12 to make changes in >>>>>> AOT code. 
>>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >>>>>> ?"[GR-10514] Use whitelist for GCs supported by Graal" >>>>>> [2] >>>>>> http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >>>>>> [3] >>>>>> http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 > From cthalinger at twitter.com Wed Jul 18 23:56:03 2018 From: cthalinger at twitter.com (Christian Thalinger) Date: Wed, 18 Jul 2018 19:56:03 -0400 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: <12a13dff-7474-5c5f-4026-578cc4c40ae4@oracle.com> References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> <12a13dff-7474-5c5f-4026-578cc4c40ae4@oracle.com> Message-ID: <679F74F0-473D-4BA5-A651-8C45E1F99686@twitter.com> > On Jul 18, 2018, at 5:51 PM, Vladimir Kozlov wrote: > > On 7/18/18 2:25 PM, Christian Thalinger wrote: >>> On Jul 18, 2018, at 3:08 PM, Vladimir Kozlov >> wrote: >>> >>> On 7/18/18 6:15 AM, Christian Thalinger wrote: >>>>> On Jul 18, 2018, at 12:14 AM, Vladimir Kozlov >>> wrote: >>>>> >>>>> On 7/17/18 5:17 PM, Christian Thalinger wrote: >>>>>>> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov >>> wrote: >>>>>>> >>>>>>> Including GC group since I added new method to GCConfig. >>>>>>> >>>>>>> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ >> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8207069 >>>>>>> >>>>>>> Recent Graal's changes [1] added list of GC [2] which matches Hotspot GC list [3]. >>>>>>> I used that to fix this issue by storing enum value from Graal in AOT config header and compare it with selected GC when AOT library is loaded into Hotspot. >>>>>> The fix is correct but too strict. 
For example, Serial and Parallel GC can use the same AOT library. CMS too. >>>>> >>>>> Do you have other suggestions how to check compatibility? >>>> I think the best way would be to use BarrierSet::Name: >>>> // Do something for each concrete barrier set part of the build. >>>> #define FOR_EACH_CONCRETE_BARRIER_SET_DO(f) \ >>>> f(CardTableBarrierSet) \ >>>> EPSILONGC_ONLY(f(EpsilonBarrierSet)) \ >>>> G1GC_ONLY(f(G1BarrierSet)) \ >>>> ZGC_ONLY(f(ZBarrierSet)) >>> >>> Thank you, Chris, for suggestion. >>> >>> To record barrier set in AOT library would require a lot more complex changes (JVMCI) not suitable for JDK 11. Currently Graal checks only GC flags. To get information about barrier set it needs to access Hotspot's data. >> Yeah, that's a problem. >>> >>> The only simple way to relax the check is to get BarrierSet::Name value based on CollectedHeap::Name and compare them in aot library config check code. But I can't find a functionality in GC code to do that. I asked GC group. > > I got answer from GC that they don't have such mapping and don't think it is needed. > >>> >>> Note, we never intended to support mixed GCs with the same type of barriers. It was accidental and I am not comfortable to support such "feature". >> You mean that two different GCs use the same barrier set? Yes, I agree, it would be better if each had their own. >> Do you want to push your current patch? > > Yes. I am waiting PR review from Labs since AOT code (jaotc) is now there. Sounds good. You can use me as a Reviewer, if needed. > > Thanks, > Vladimir > >>> >>> Vladimir >>> >>>>> >>>>> Thanks, >>>>> Vladimir >>>>> >>>>>>> >>>>>>> I verified the fix with all GCs combination when compiling AOT lib and using it. I also ran our hs-tier1-3 testing which includes AOT and Graal tests. >>>>>>> >>>>>>> These changes are for JDK 11 so I don't need to go through Graal PR now but I would need to do that for JDK 12 to make changes in AOT code.
>>>>>>> >>>>>>> Thanks, >>>>>>> Vladimir >>>>>>> >>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >>>>>>> "[GR-10514] Use whitelist for GCs supported by Graal" >>>>>>> [2] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >>>>>>> [3] http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 -------------- next part -------------- An HTML attachment was scrubbed... URL: From vladimir.kozlov at oracle.com Thu Jul 19 00:42:16 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 18 Jul 2018 17:42:16 -0700 Subject: [11] RFR(S) 8207069: [AOT] we should check that VM uses the same GC as one used for AOT library generation. In-Reply-To: <679F74F0-473D-4BA5-A651-8C45E1F99686@twitter.com> References: <6B58B3BC-C763-4799-98A8-16B68B0949E0@twitter.com> <2B6BD07E-7D66-4B3E-8B81-8528582FCC47@twitter.com> <12a13dff-7474-5c5f-4026-578cc4c40ae4@oracle.com> <679F74F0-473D-4BA5-A651-8C45E1F99686@twitter.com> Message-ID: <54f66d48-9d47-ca71-d06a-95b127da3146@oracle.com> Thank you, Chris Vladimir On 7/18/18 4:56 PM, Christian Thalinger wrote: > > >> On Jul 18, 2018, at 5:51 PM, Vladimir Kozlov >> > wrote: >> >> On 7/18/18 2:25 PM, Christian Thalinger wrote: >>>> On Jul 18, 2018, at 3:08 PM, Vladimir Kozlov >>>> >>> > >>>> wrote: >>>> >>>> On 7/18/18 6:15 AM, Christian Thalinger wrote: >>>>>> On Jul 18, 2018, at 12:14 AM, Vladimir Kozlov >>>>>> >>>>> > >>>>>> wrote: >>>>>> >>>>>> On 7/17/18 5:17 PM, Christian Thalinger wrote: >>>>>>>> On Jul 12, 2018, at 5:28 PM, Vladimir Kozlov >>>>>>>> >>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> Including GC group since I added new method to GCConfig. 
>>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~kvn/8207069/webrev.00/ >>>>>>>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8207069 >>>>>>>> >>>>>>>> Recent Graal's changes [1] added list of GC [2] which matches >>>>>>>> Hotspot GC list [3]. >>>>>>>> I used that to fix this issue by storing enum value from Graal >>>>>>>> in AOT config header and compare it with selected GC when AOT >>>>>>>> library is loaded into Hotspot. >>>>>>> The fix is correct but too strict. ?For example, Serial and >>>>>>> Parallel GC can use the same AOT library. CMS too. >>>>>> >>>>>> Do you have other suggestions how to check compatibility? >>>>> I think the best way would be to use BarrierSet::Name: >>>>> // Do something for each concrete barrier set part of the build. >>>>> #define FOR_EACH_CONCRETE_BARRIER_SET_DO(f)? ? ? ? ? \ >>>>> f(CardTableBarrierSet) ? ? ? ? ? ? ? ? ? ? ? ? ? ? \ >>>>> EPSILONGC_ONLY(f(EpsilonBarrierSet)) ? ? ? ? ? ? ? \ >>>>> G1GC_ONLY(f(G1BarrierSet)) ? ? ? ? ? ? ? ? ? ? ? ? \ >>>>> ZGC_ONLY(f(ZBarrierSet)) >>>> >>>> Thank you, Chris, for suggestion. >>>> >>>> To record barrier set in AOT library would require a lot more >>>> complex changes (JVMCI) not suitable for JDK 11. ?Currently Graal >>>> checks only GC flags. To get information about barrier set it needs >>>> to access Hotspot's data. >>> Yeah, that?s a problem. >>>> >>>> The only simple way to relax the check is to get BarrierSet::Name >>>> value based on CollectedHeap::Name and compare them in aot library >>>> config check code. But I can't find a functionality in GC code to do >>>> that. I asked GC group. >> >> I got answer from GC that they don't have such mapping and don't think >> it is needed. >> >>>> >>>> Note, we never intended to support mixed GCs with the same type of >>>> barriers. It was accidental and I am not comfortable to support such >>>> "feature?. >>> You mean that two different GCs use the same barrier set? ?Yes, I >>> agree, it would be better if each had their own. 
>>> Do you want to push your current patch? >> >> Yes. I am waiting PR review from Labs since AOT code (jaotc) is now there. > > Sounds good. ?You can use me as a Reviewer, if needed. > >> >> Thanks, >> Vladimir >> >>>> >>>> Vladimir >>>> >>>>>> >>>>>> Thanks, >>>>>> Vladimir >>>>>> >>>>>>>> >>>>>>>> I verified the fix with all GCs combination when compiling AOT >>>>>>>> lib and using it. I also ran our hs-tier1-3 testing which >>>>>>>> includes AOT and Graal tests. >>>>>>>> >>>>>>>> These changes are for JDK 11 so I don't need to go through Graal >>>>>>>> PR now but I would need to do that for JDK 12 to make changes in >>>>>>>> AOT code. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Vladimir >>>>>>>> >>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8205824 >>>>>>>> ?"[GR-10514] Use whitelist for GCs supported by Graal" >>>>>>>> [2] >>>>>>>> http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot/src/org/graalvm/compiler/hotspot/HotSpotGraalRuntime.java#l137 >>>>>>>> [3] >>>>>>>> http://hg.openjdk.java.net/jdk/jdk11/file/bf686c47c109/src/hotspot/share/gc/shared/collectedHeap.hpp#l173 > From manc at google.com Thu Jul 19 01:53:19 2018 From: manc at google.com (Man Cao) Date: Wed, 18 Jul 2018 18:53:19 -0700 Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS Message-ID: Hello, The Java platform team at Google has maintained a local patch to inline os::SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor this patch? It is difficult to demonstrate performance improvement in Java benchmarks. It is more of a code refactoring to better utilize modern GCC. It partly addresses the comment about inlining SpinPause() above its declaration in os.hpp. 
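
To make the motivation concrete, here is a minimal sketch of the kind of spin-wait call site where an inlined pause hint matters. It is not HotSpot code: `spin_pause()` and `spin_until_set()` are hypothetical names, the return-value convention (1 when a hint was issued, 0 otherwise) is an assumption of this sketch, and the inline assembly is only compiled for GCC/Clang on x86.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified counterpart of an inlined SpinPause().
// On x86 with GCC or Clang it issues the PAUSE instruction; elsewhere it
// does nothing and reports that no hint was given.
static inline int spin_pause() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
  __asm__ __volatile__("pause");  // tell the CPU this is a spin-wait loop
  return 1;
#else
  return 0;
#endif
}

// Spins until `flag` becomes true or `max_spins` iterations have passed.
// Returns the last observed value of the flag.
static inline bool spin_until_set(const std::atomic<bool>& flag, int max_spins) {
  assert(max_spins >= 0);
  for (int i = 0; i < max_spins; ++i) {
    if (flag.load(std::memory_order_acquire)) {
      return true;
    }
    spin_pause();  // cheap when inlined: no call/return around one instruction
  }
  return flag.load(std::memory_order_acquire);
}
```

With the helper defined in a header, the compiler can place the single `pause` instruction directly in the loop body instead of emitting a call to an out-of-line assembly routine.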
I found an interesting discussion about PAUSE and a microbenchmark in:
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html

However, the microbenchmark has a large variance in our experiment, making it difficult to tell if there's any benefit from inlining PAUSE. Inlining PAUSE does seem to reduce the variance a bit.

The patch is inlined and attached below:

diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
--- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
+++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s
@@ -63,15 +63,6 @@
         popl     %eax
         ret

-        .globl SYMBOL(SpinPause)
-        ELF_TYPE(SpinPause,@function)
-        .p2align 4,,15
-SYMBOL(SpinPause):
-        rep
-        nop
-        movl     $1, %eax
-        ret
-
         # Support for void Copy::conjoint_bytes(void* from,
         #                                       void* to,
         #                                       size_t count)
diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
--- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
+++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s
@@ -46,15 +46,6 @@

         .text

-        .globl SYMBOL(SpinPause)
-        .p2align 4,,15
-        ELF_TYPE(SpinPause,@function)
-SYMBOL(SpinPause):
-        rep
-        nop
-        movq     $1, %rax
-        ret
-
         # Support for void Copy::arrayof_conjoint_bytes(void* from,
         #                                               void* to,
         #                                               size_t count)
diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
--- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
+++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s
@@ -42,15 +42,6 @@

         .text

-        .globl SpinPause
-        .type SpinPause,@function
-        .p2align 4,,15
-SpinPause:
-        rep
-        nop
-        movl     $1, %eax
-        ret
-
         # Support for void Copy::conjoint_bytes(void* from,
         #                                       void* to,
         #                                       size_t count)
diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
--- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
+++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s
@@ -38,15 +38,6 @@

         .text

-        .globl SpinPause
-        .align 16
-        .type SpinPause,@function
-SpinPause:
-        rep
-        nop
-        movq     $1, %rax
-        ret
-
         # Support for void Copy::arrayof_conjoint_bytes(void* from,
         #                                               void* to,
         #                                               size_t count)
diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
--- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
+++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s
@@ -51,15 +51,6 @@
         movq     %fs:0x0,%rax
         ret

-        .globl SpinPause
-        .align 16
-SpinPause:
-        rep
-        nop
-        movq     $1, %rax
-        ret
-
-
         / Support for void Copy::arrayof_conjoint_bytes(void* from,
         /                                               void* to,
         /                                               size_t count)
diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp
--- a/src/hotspot/share/runtime/os.hpp
+++ b/src/hotspot/share/runtime/os.hpp
@@ -1031,6 +1031,13 @@
 // of the global SpinPause() with C linkage.
 // It'd also be eligible for inlining on many platforms.

+#if defined(X86) && !defined(_WINDOWS)
+extern "C" int inline SpinPause() {
+  __asm__ __volatile__ ("pause");
+  return 1;
+}
+#else
 extern "C" int SpinPause();
+#endif

 #endif // SHARE_VM_RUNTIME_OS_HPP

-Man
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inline_spinpause.patch
Type: text/x-patch
Size: 3778 bytes
Desc: not available

From fairoz.matte at oracle.com  Thu Jul 19 06:52:27 2018
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Wed, 18 Jul 2018 23:52:27 -0700 (PDT)
Subject: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to disable class unloading
In-Reply-To: <6371e589-466c-4b8e-b974-1dee8d2446c4@default>
References: <6371e589-466c-4b8e-b974-1dee8d2446c4@default>
Message-ID: <496e6c67-2b03-4e96-924b-eec90216ae20@default>

Hi All,

Just a gentle reminder for review request.
Thanks, Fairoz > -----Original Message----- > From: Fairoz Matte > Sent: Wednesday, July 11, 2018 6:44 PM > To: hotspot-gc-dev at openjdk.java.net > Subject: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to > disable class unloading > > Hi, > > Kindly review the backport of "JDK- 8114823: G1 doesn't honor request to > disable class unloading" to 8u > > Webrev - http://cr.openjdk.java.net/~fmatte/8114823/webrev.00/ > > JDK 9 bug - https://bugs.openjdk.java.net/browse/JDK-8114823 > > JDK 9 changeset - > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/53a14fe65414 > > Review thread - http://mail.openjdk.java.net/pipermail/hotspot-gc- > dev/2016-June/018298.html > > Thanks, > Fairoz From thomas.schatzl at oracle.com Thu Jul 19 13:27:02 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jul 2018 15:27:02 +0200 Subject: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to disable class unloading In-Reply-To: <496e6c67-2b03-4e96-924b-eec90216ae20@default> References: <6371e589-466c-4b8e-b974-1dee8d2446c4@default> <496e6c67-2b03-4e96-924b-eec90216ae20@default> Message-ID: <74e6b279260667f9dd443771def98ad9d0448358.camel@oracle.com> Hi Fairoz, On Wed, 2018-07-18 at 23:52 -0700, Fairoz Matte wrote: > Hi All, > > Just a gentle reminder for review request. - at g1RootProcessor.cpp:152, the call to process_string_table_roots() should be moved to after the process_vm_roots() to (as much as possible) keep in sync with JDK9 code. - in G1RootProcessor::process_string_table_roots(), the "if (weak_roots != NULL)" is superfluous given the assert above it (and better keeps in sync with JDK9 code). Otherwise looks good to me. 
Thanks, Thomas From erik.helin at oracle.com Thu Jul 19 14:57:25 2018 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 19 Jul 2018 16:57:25 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com> On 07/13/2018 04:10 PM, Daniel Mitterdorfer wrote: > Hi, > > I have good news. I was able to reproduce this issue but this time I > have logs. A test failed with the following stack trace around > 15:06:55 with: > > java.lang.IllegalArgumentException: committed = 537919488 should be < > max = 536870912 > > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) > > at sun.management.MemoryImpl.getMemoryUsage0(Native Method) > > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) > > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242) > > This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10 > (build 10+46). The JVM arguments were: > > -Xms512M -Xmx512M > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > > The logs are somewhat massive (~250MB uncompressed) and available at > https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0 Thanks for the logs Daniel, they helped a lot! Me and Thomas looked through the logs and the code and, as we suspected, this code is a bit buggy :/ Please see the bug for more details: https://bugs.openjdk.java.net/browse/JDK-8207200 Again, thanks for taking your time and reporting this issue and for getting us the logs, much appreciated! Erik > I hope that helps identifying the cause. Please let me know if you > need anything else. > > Daniel > On Fri, 13 Jul 2018 at 10:33, Thomas Schatzl wrote: >> >> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote: >>> Hi Erik, >>>> >>>> Do you have any kind of GC logging from the test run where you >>>> encountered the bug?
>>> >>> Unfortunately, we don't have GC logging enabled by default in our >>> test suite so the exception trace is all I got. I am now repeatedly >>> running the test suite with the original flags (-Xms512M -Xmx512M) >>> and also added the following logging configuration: >>> >>> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags >>> >>> As soon as I get another failure, I'll provide the full log file. >>> Please let me know if you need any other logs (i.e. whether I should >>> adjust my log configuration). >> >> I think these flags are fine. >> >> Since Erik and me strongly believe the issue is with the relevant G1 >> code Erik mentioned we will reassign the bug to us (he said there is >> already a bug reported on it). >> >> Thanks a lot, >> Thomas >> From fairoz.matte at oracle.com Thu Jul 19 15:34:28 2018 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Thu, 19 Jul 2018 08:34:28 -0700 (PDT) Subject: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to disable class unloading In-Reply-To: <74e6b279260667f9dd443771def98ad9d0448358.camel@oracle.com> References: <6371e589-466c-4b8e-b974-1dee8d2446c4@default> <496e6c67-2b03-4e96-924b-eec90216ae20@default> <74e6b279260667f9dd443771def98ad9d0448358.camel@oracle.com> Message-ID: <372b3602-e550-4f85-8ebd-2a2dfceb80a4@default> Hi Thomas, Thanks for the review. Here is the updated webrev http://cr.openjdk.java.net/~fmatte/8114823/webrev.01/ with suggested changes. Thanks, Fairoz > -----Original Message----- > From: Thomas Schatzl > Sent: Thursday, July 19, 2018 6:57 PM > To: Fairoz Matte ; hotspot-gc- > dev at openjdk.java.net > Subject: Re: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to > disable class unloading > > Hi Fairoz, > > On Wed, 2018-07-18 at 23:52 -0700, Fairoz Matte wrote: > > Hi All, > > > > Just a gentle reminder for review request. 
> > > - at g1RootProcessor.cpp:152, the call to process_string_table_roots() should > be moved to after the process_vm_roots() to (as much as > possible) keep in sync with JDK9 code. > > - in G1RootProcessor::process_string_table_roots(), the "if (weak_roots != > NULL)" is superfluous given the assert above it (and better keeps in sync with > JDK9 code). > > Otherwise looks good to me. > > Thanks, > Thomas > From thomas.schatzl at oracle.com Thu Jul 19 15:44:33 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 19 Jul 2018 17:44:33 +0200 Subject: [8u-backport] RFR: JDK-8114823: G1 doesn't honor request to disable class unloading In-Reply-To: <372b3602-e550-4f85-8ebd-2a2dfceb80a4@default> References: <6371e589-466c-4b8e-b974-1dee8d2446c4@default> <496e6c67-2b03-4e96-924b-eec90216ae20@default> <74e6b279260667f9dd443771def98ad9d0448358.camel@oracle.com> <372b3602-e550-4f85-8ebd-2a2dfceb80a4@default> Message-ID: <12f07c2bcaa51956486ae0d968211675903b85a1.camel@oracle.com> Hi, On Thu, 2018-07-19 at 08:34 -0700, Fairoz Matte wrote: > Hi Thomas, > > Thanks for the review. > Here is the updated webrev > http://cr.openjdk.java.net/~fmatte/8114823/webrev.01/ with suggested > changes. updates look good. Thanks, Thomas From daniel.mitterdorfer at gmail.com Thu Jul 19 17:10:09 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Thu, 19 Jul 2018 19:10:09 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com> References: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com> Message-ID: Hi Erik, I am quite happy that I could reproduce it after running the tests repeatedly for approximately a week after the first failure. Glad I could help and thank you all for you help as well! Daniel Am Do., 19. Juli 2018 um 16:57 Uhr schrieb Erik Helin : > > On 07/13/2018 04:10 PM, Daniel Mitterdorfer wrote: > > Hi, > > > > I have good news. 
I was able to reproduce this issue but this time I > > have logs. A test failed with the following stack trace around > > 15:06:55 with: > > > > java.lang.IllegalArgumentException: committed = 537919488 should be < > > max = 536870912 > > > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) > > > at sun.management.MemoryImpl.getMemoryUsage0(Native Method) > > > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) > > > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242) > > > > This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10 > > (build 10+46). The JVM arguments were: > > > > -Xms512M -Xmx512M > > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > > > > The logs are somewhat massive (~250MB uncompressed) and available at > > https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0 > > Thanks for the logs Daniel, they helped a lot! Me and Thomas looked > through the logs and the code and, as we suspected, this code is a bit > buggy :/ Please see the bug for more details: > > https://bugs.openjdk.java.net/browse/JDK-8207200 > > Again, thanks for taking your time and reporting this issue and for > getting us the logs, much appreciated! > Erik > > > I hope that helps identifying the cause. Please let me know if you > > need anything else. > > > > Daniel > > On Fri, 13 Jul 2018 at 10:33, Thomas Schatzl wrote: > >> > >> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote: > >>> Hi Erik, > >>>> > >>>> Do you have any kind of GC logging from the test run where you > >>>> encountered the bug? > >>> > >>> Unfortunately, we don't have GC logging enabled by default in our > >>> test suite so the exception trace is all I got.
I am now repeatedly > >>> running the test suite with the original flags (-Xms512M -Xmx512M) > >>> and also added the following logging configuration: > >>> > >>> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > >>> > >>> As soon as I get another failure, I'll provide the full log file. > >>> Please let me know if you need any other logs (i.e. whether I should > >>> adjust my log configuration). > >> > >> I think these flags are fine. > >> > >> Since Erik and me strongly believe the issue is with the relevant G1 > >> code Erik mentioned we will reassign the bug to us (he said there is > >> already a bug reported on it). > >> > >> Thanks a lot, > >> Thomas > >> From thomas.schatzl at oracle.com Fri Jul 20 09:58:30 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 20 Jul 2018 11:58:30 +0200 Subject: RFR (XS): 8207953: Remove dead code in G1CopyingKeepAliveClosure Message-ID: <0de8d55387eac077c43677a65bbc2c2a35407a55.camel@oracle.com> Hi all, can I have a review for this trivial change that removes dead code: There is the following code in G1CopyingKeepAliveClosure::do_oop_work: if (_g1h->is_in_cset_or_humongous(obj)) { [...] if (_g1h->is_in_g1_reserved(p)) { _par_scan_state->push_on_queue(p); } else { assert(!Metaspace::contains((const void*)p), "Unexpectedly found a pointer from metadata: " PTR_FORMAT, p2i(p)); _copy_non_heap_obj_cl->do_oop(p); } } is_in_cset_or_humongous() implies is_in_g1_reserved(), so the condition and the else-part can be removed. 
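The implication argument in the message above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyHeap`, `in_cset`, `in_reserved`, and the `Action` values are hypothetical names, not HotSpot APIs, and both predicates are applied to the same address here, whereas the real closure tests `obj` and `p` respectively. It shows only that when the outer predicate implies the inner one, the `else` branch can never run, so dropping it does not change behavior.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a heap: a reserved address range containing a
// "collection set" subrange. All names are hypothetical, not HotSpot APIs.
struct ToyHeap {
  std::uintptr_t reserved_begin, reserved_end;  // [begin, end)
  std::uintptr_t cset_begin, cset_end;          // assumed to lie inside reserved

  bool in_reserved(std::uintptr_t a) const {
    return a >= reserved_begin && a < reserved_end;
  }
  bool in_cset(std::uintptr_t a) const {
    return a >= cset_begin && a < cset_end;
  }
};

enum Action { kSkip, kPushOnQueue, kNonHeapClosure };

// Shape of the code before the cleanup: nested test with an else branch.
inline Action classify_before(const ToyHeap& h, std::uintptr_t a) {
  if (h.in_cset(a)) {
    if (h.in_reserved(a)) {
      return kPushOnQueue;
    } else {
      return kNonHeapClosure;  // unreachable while cset is inside reserved
    }
  }
  return kSkip;
}

// Shape after the cleanup: the inner test and the else branch are gone.
inline Action classify_after(const ToyHeap& h, std::uintptr_t a) {
  if (h.in_cset(a)) {
    assert(h.in_reserved(a));  // documents the implication being relied on
    return kPushOnQueue;
  }
  return kSkip;
}
```

For a heap such as `ToyHeap{100, 200, 120, 160}`, both functions return the same action for every address, and `kNonHeapClosure` is never produced.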
CR: https://bugs.openjdk.java.net/browse/JDK-8207953 Webrev: http://cr.openjdk.java.net/~tschatzl/8207953/webrev/ Testing: hs-tier1 Thanks, Thomas From kim.barrett at oracle.com Fri Jul 20 17:48:59 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 20 Jul 2018 13:48:59 -0400 Subject: RFR (XS): 8207953: Remove dead code in G1CopyingKeepAliveClosure In-Reply-To: <0de8d55387eac077c43677a65bbc2c2a35407a55.camel@oracle.com> References: <0de8d55387eac077c43677a65bbc2c2a35407a55.camel@oracle.com> Message-ID: <058157FC-81FB-4F31-8F7F-B32B4C02B601@oracle.com> > On Jul 20, 2018, at 5:58 AM, Thomas Schatzl wrote: > > Hi all, > > can I have a review for this trivial change that removes dead code: > > There is the following code in G1CopyingKeepAliveClosure::do_oop_work: > > if (_g1h->is_in_cset_or_humongous(obj)) { > [...] > if (_g1h->is_in_g1_reserved(p)) { > _par_scan_state->push_on_queue(p); > } else { > assert(!Metaspace::contains((const void*)p), > "Unexpectedly found a pointer from metadata: " > PTR_FORMAT, p2i(p)); > _copy_non_heap_obj_cl->do_oop(p); > } > } > > is_in_cset_or_humongous() implies is_in_g1_reserved(), so the condition > and the else-part can be removed. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8207953 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8207953/webrev/ > Testing: > hs-tier1 > > Thanks, > Thomas Looks good. 
From thomas.schatzl at oracle.com Fri Jul 20 20:36:02 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 20 Jul 2018 22:36:02 +0200 Subject: RFR (XS): 8207953: Remove dead code in G1CopyingKeepAliveClosure In-Reply-To: <058157FC-81FB-4F31-8F7F-B32B4C02B601@oracle.com> References: <0de8d55387eac077c43677a65bbc2c2a35407a55.camel@oracle.com> <058157FC-81FB-4F31-8F7F-B32B4C02B601@oracle.com> Message-ID: Hi Kim, On Fri, 2018-07-20 at 13:48 -0400, Kim Barrett wrote: > > On Jul 20, 2018, at 5:58 AM, Thomas Schatzl > com> wrote: > > > > Hi all, > > > > can I have a review for this trivial change that removes dead > > code: > > > > There is the following code in > > G1CopyingKeepAliveClosure::do_oop_work: > > [...] > > Thanks, > > Thomas > > Looks good. > thanks for your review. Thomas From hohensee at amazon.com Fri Jul 20 22:37:14 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 20 Jul 2018 22:37:14 +0000 Subject: RFR(L): 8196889: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions Message-ID: Please review. Bug: https://bugs.openjdk.java.net/browse/JDK-8196989 CSR: https://bugs.openjdk.java.net/browse/JDK-8196991 Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00 This webrev is marked "L" because it's a behavioral change (CSR in draft state, may I have a review of that too please?) and because the test change fanout is large. The actual code changes are "M". Passes the submit repo, Hotspot tier1, the JFR gc event tests and any other test set with "gc" or "serviceability" in the test directory name. I found it difficult to verify the accuracy of the reported values other than manually, since they can vary from run to run of the same program. I'd appreciate suggestions for how to go about writing accuracy tests. I set out originally to revamp only the MXBeans, but decided it would be incomplete if I didn't include the jstat counters and the output of the GC.heap_info jcmd option.
I can separate the latter two into their own RFEs, but I find it easier to understand it all in a single webrev and hope the reviewers will too. The basic approach is to add the new memory pools and collectors, the new jstat counters, and an archive region counter that stands in for an actual archive region set. HeapRegionSets are disjoint, so initially I tried to create a first-class archive region set (on the same level as the humongous region set), but that idea foundered on the fact that there's too much code I don't fully understand that depends on archive regions being in the existing old region set. Externally (i.e., in the MXBeans and the jstat counters), however, the old region set doesn't include archive regions (unless running in legacy mode). I used CMS's TraceCMSMemoryManagerStats class as the model for TraceConcMemoryManagerStats, which latter collects statistics on concurrent cycles. There are two STW pauses in each concurrent cycle: they are recorded separately and count as two sun.gc.collector.2 events. The humongous and archive space committed and used values are always identical, hence they are always 100% used. The revised output of jcmd GC.heap_info is in G1CollectedHeap::print_on(). I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing the result type of young_list_target_length() from size_t to uint, which latter is the type of the _young_list_target_length member. I updated the copyright date in src/hotspot/share/services/memoryService.hpp to 2018, as I neglected to do so in a previous push. Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.buck at oracle.com Sun Jul 22 01:10:35 2018 From: david.buck at oracle.com (David Buck) Date: Sun, 22 Jul 2018 10:10:35 +0900 Subject: JDK Memory Allocation In-Reply-To: References: Message-ID: <87801549-1b64-afa3-95dc-95994b0d0fea@oracle.com> Hi Max!
Your question does not seem to be related to building OpenJDK, so I have BCCed build-dev from the thread and added gc-dev. That said, I am not sure any of the development lists are really an ideal place to ask general "code walk through" questions. If really necessary, memAllocator.cpp [0] would probably be as good a place as any to start reading the source code. But unless you intend to hack on the JVM itself, trying to read this source code may not be the most productive use of your time. You may get a lot more out of reading some of the wikis [1], blogs [2], and books [3][4] that cover the HotSpot JVM in detail. Even if you ultimately chose to read the source code directly, reading these other types of resources first should really help you make better sense of what you see in the source code. Cheers, -Buck [0] http://hg.openjdk.java.net/jdk/jdk/file/b0fcf59be391/src/hotspot/share/gc/shared/memAllocator.cpp [1] https://wiki.openjdk.java.net/display/HotSpot/Main [2] https://shipilev.net/jvm-anatomy-park/ [3] https://www.goodreads.com/book/show/13227108-java-performance [4] https://www.goodreads.com/book/show/23316035-java-performance-companion On 2018/07/22 8:51, mr rupplin wrote: > Having looked for some while at the OpenJDK source code I am unable to find where the memory allocation occurs. I will be working very much with the JDK and would like to get a firm grasp on its underlying mechanisms. > > public class JustAsk > { > public static void main(String...args) > { > for(int i=0; i<100; i++) > { > new JustAsk(); > } > } > } > > This doesn't seem to rely on any of the functions in the libjli nor of the jni.h. So clearly where do we look for the handler here? 
> > Thanks, > > Your friend Max > From hohensee at amazon.com Mon Jul 23 21:33:28 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 23 Jul 2018 21:33:28 +0000 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions Message-ID: Corrected subject line: 8196889 s/b 8196989. From: hotspot-gc-dev on behalf of "Hohensee, Paul" Date: Friday, July 20, 2018 at 3:38 PM To: "hotspot-gc-dev at openjdk.java.net" , "serviceability-dev at openjdk.java.net" Subject: RFR(L): 8196889: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions Please review. Bug: https://bugs.openjdk.java.net/browse/JDK-8196989 CSR: https://bugs.openjdk.java.net/browse/JDK-8196991 Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00 This webrev is marked "L" because it's a behavioral change (CSR in draft state, may I have a review of that too please?) and because the test change fanout is large. The actual code changes are "M". Passes the submit repo, Hotspot tier1, the JFR gc event tests and any other test set with "gc" or "serviceability" in the test directory name. I found it difficult to verify the accuracy of the reported values other than manually, since they can vary from run to run of the same program. I'd appreciate suggestions for how to go about writing accuracy tests. I set out originally to revamp only the MXBeans, but decided it would be incomplete if I didn't include the jstat counters and the output of the GC.heap_info jcmd option. I can separate the latter two into their own RFEs, but I find it easier to understand it all in a single webrev and hope the reviewers will too. The basic approach is to add the new memory pools and collectors, the new jstat counters, and an archive region counter that stands in for an actual archive region set.
HeapRegionSets are disjoint, so initially I tried to create a first-class archive region set (on the same level as the humongous region set), but that idea foundered on the fact that there's too much code I don't fully understand that depends on archive regions being in the existing old region set. Externally (i.e., in the MXBeans and the jstat counters), however, the old region set doesn't include archive regions (unless running in legacy mode). I used CMS's TraceCMSMemoryManagerStats class as the model for TraceConcMemoryManagerStats, which latter collects statistics on concurrent cycles. There are two STW pauses in each concurrent cycle: they are recorded separately and count as two sun.gc.collector.2 events. The humongous and archive space committed and used values are always identical, hence they are always 100% used. The revised output of jcmd GC.heap_info is in G1CollectedHeap::print_on(). I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing the result type of young_list_target_length() from size_t to uint, which latter is the type of the _young_list_target_length member. I updated the copyright date in src/hotspot/share/services/memoryService.hpp to 2018, as I neglected to do so in a previous push. Thanks, Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.osterlund at oracle.com Wed Jul 25 13:18:11 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Wed, 25 Jul 2018 15:18:11 +0200 Subject: RFR: JDK-8204970: Remaing object comparisons need to use oopDesc::equals() In-Reply-To: <2b92069f-f378-2984-8518-68230238eaec@redhat.com> References: <2b92069f-f378-2984-8518-68230238eaec@redhat.com> Message-ID: <5B587893.4080404@oracle.com> Hi Roman, Looks good. Thanks, /Erik On 2018-07-06 17:28, Roman Kennke wrote: > We found 2 more places where oopDesc::equals() should be used instead of > raw obj==obj.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8204970 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8204970/webrev.00/ > > Passes tier1 tests > > Can I get a review? > > Thanks, > Roman > From erik.osterlund at oracle.com Wed Jul 25 13:34:08 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Wed, 25 Jul 2018 15:34:08 +0200 Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access In-Reply-To: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> References: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> Message-ID: <5B587C50.80502@oracle.com> Hi Roman, In instanceRefKlass.inline.hpp: Are these changes to logging required for Shenandoah not to crash? It appears to me that for ZGC, it would print the wrong addresses if barriers were not used. That's why I wonder if this is a strict requirement or not for Shenandoah to work. In oops/oop.cpp: Rather than changing the metadata accessors to always assume raw accesses are enough for all metadata accesses, I would prefer to have explicit metadata_field_raw accessors instead where we expect this (even if that turns out to be always). Thanks, /Erik On 2018-07-06 16:46, Roman Kennke wrote: > We have several code paths going out from oop_iterate() methods that > lead to GC barriers. This is not only inefficient but outright wrong. > oop_iterate() is normally used by GC and GC needs to see the raw stuff, > not some resolved objects. In Shenandoah's full-GC it's fatal to attempt > to read objects' forwarding pointers, because they temporarily point > to nowhere land. > > I propose to selectively use _raw() variants of the various accessors > that are used on oop_iterate() paths. This means to introduce an > oopDesc::int_field_raw(). I also propose to change metadata_field() > accessors to always use raw access wholesale. This is only used to load > the Klass* field, which is immutable and thus doesn't require barriers.
> > The log_* statements in instanceRefKlass.inline.hpp surely don't need > barriers. I turned them into raw accessors as well. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8206457?filter=-1 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.00/ > > Test: passes hotspot-tier1 here. > > Can I please get review? > > Roman > From zgu at redhat.com Wed Jul 25 15:08:40 2018 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 25 Jul 2018 11:08:40 -0400 Subject: RFR: JDK-8204970: Remaing object comparisons need to use oopDesc::equals() In-Reply-To: <5B587893.4080404@oracle.com> References: <2b92069f-f378-2984-8518-68230238eaec@redhat.com> <5B587893.4080404@oracle.com> Message-ID: Looks good to me too. Thanks, -Zhengyu On 07/25/2018 09:18 AM, Erik Österlund wrote: > Hi Roman, > > Looks good. > > Thanks, > /Erik > > On 2018-07-06 17:28, Roman Kennke wrote: >> We found 2 more places where oopDesc::equals() should be used instead of >> raw obj==obj. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8204970 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8204970/webrev.00/ >> >> Passes tier1 tests >> >> Can I get a review? >> >> Thanks, >> Roman >> > From thomas.schatzl at oracle.com Wed Jul 25 15:09:19 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 25 Jul 2018 17:09:19 +0200 Subject: RFR(M): 8205921: Optimizing best-of-2 work stealing queue selection Message-ID: <51f994e937737d50be0ca38c33ec1df85c8bfb6e.camel@oracle.com> Hi all, could I have reviews for this change to work stealing that significantly decreases the amount of unsuccessful steal attempts initially worked on by Zhengyu from RedHat [0]? After some initial reviews we agreed that I continued working on that, mostly cleaning up the code, and doing some more testing. While the change looks a lot bigger now, most of the changes were due to removing the need to pass in the "seed" parameter in from all collectors.
The change is based on one of the ideas presented in a recent paper [1], where it has been shown that work stealing in the task queues should be biased towards queues a thread already successfully stole from, as this would speed up GC pauses (and program execution) significantly. My measurements showed that the change significantly decreases the number of overall steal attempts (not necessarily termination time) and increases the relative number of successful ones. Contrary to the paper there is no clear observable (statistically significant) actual pause time improvement. This may have several reasons: - the paper mostly measures total execution time, not pause times which I am mostly interested in. :) - while the number of steal attempts decreases significantly with that change, this number is typically dwarfed by the actual pushes and pops on all applications that actually do work - and on others you might want to simply use less gc threads. At least in steady state, the work in the majority of applications I tried seem to be fairly well balanced in default configurations, so there is not much stealing going on compared to other work. Dacapo is an outlier because there is typically not much work to do at all, e.g. I measured in total like 1000(!) objects pushed on the task queue per GC in parallel GC on lusearch (with 40 worker threads, which is higher than used in the paper). - the GC statistics implementation enabled by the TASKQUEUE_STATS may be buggy or incomplete. - the test setup in the paper may not have been specified well enough. I used -XX:ParallelGCThreads=15 -Xmx -Xms. - the dacapo benchmarks may be a bit useless to use for measuring GC pause times: pause times average take 1-2ms on roughly the same machine as in the paper with the suggested heap sizes, and a significant part of that time seems to be spent on work completely unrelated to task queues. I.e. 
you can actually measure "Object Copy" which includes actual copying work next to stealing with G1. I get like 10% of total pause time spent there for e.g. lusearch (in the baseline). The paper suggests ~10% smaller total _execution_ time with only these task queue improvements (Fig 10a) which seems very hard to achieve knowing that. - while I did most of my measurements with JDK11 and G1, some very brief tests on Parallel GC and JDK8 did not show much difference. However overall I think this change is useful. :) I also spent some time on the second enhancement, i.e. limiting the number of steal attempts per steal round based on the number of active queues, which is attached to the CR. That one did not show any further measurable improvement on the number of steal attempts (within statistical significance), which makes sense if you consider that we do not spend a lot of time in stealing anyway, and the reduction of the number of threads due to that is very rare (in my tests). The paper also did not give a breakdown either. The changeset also credits Zhengyu for his work. CR: https://bugs.openjdk.java.net/browse/JDK-8205921 Webrev: http://cr.openjdk.java.net/~tschatzl/8205921/webrev/ Testing: hs-tier1-4 Thanks, Thomas P.S: sorry for taking so long. Actually the changes were lying around for some time locally... 
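The biased best-of-2 selection discussed in this message can be sketched roughly as follows. This is a single-threaded toy model, not the code in the webrev: `ToyQueueSet`, `steal_best_of_2`, `last_victim`, and the use of `std::deque` and `std::mt19937` are all assumptions made for illustration. The idea it shows is that a thief samples two candidate queues, takes from the longer one, and reuses the queue it last stole from successfully as its first candidate.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <random>
#include <vector>

// Toy, single-threaded model of biased best-of-2 victim selection.
// All names are hypothetical; real task queues are concurrent structures.
class ToyQueueSet {
 public:
  ToyQueueSet(std::size_t n, unsigned seed)
      : queues_(n), last_victim_(n, invalid()), rng_(seed) {
    assert(n >= 3);  // need the thief plus two distinct candidates
  }

  static std::size_t invalid() { return static_cast<std::size_t>(-1); }

  std::deque<int>& queue(std::size_t i) { return queues_[i]; }
  std::size_t last_victim(std::size_t thief) const { return last_victim_[thief]; }

  // Try to steal one task for `thief`; returns true and sets `task` on success.
  bool steal_best_of_2(std::size_t thief, int& task) {
    const std::size_t n = queues_.size();

    // First candidate: the queue we last stole from, otherwise a random one.
    std::size_t k1 = last_victim_[thief];
    if (k1 == invalid()) {
      do { k1 = random_index(n); } while (k1 == thief);
    }
    // Second candidate: random, distinct from the thief and from k1.
    std::size_t k2;
    do { k2 = random_index(n); } while (k2 == thief || k2 == k1);

    // Take from the longer queue; ties favor k1, the remembered victim.
    const std::size_t victim =
        queues_[k2].size() > queues_[k1].size() ? k2 : k1;
    if (queues_[victim].empty()) {
      last_victim_[thief] = invalid();  // forget the bias after a failed attempt
      return false;
    }
    task = queues_[victim].front();
    queues_[victim].pop_front();
    last_victim_[thief] = victim;       // remember where stealing worked
    return true;
  }

 private:
  std::size_t random_index(std::size_t n) {
    return std::uniform_int_distribution<std::size_t>(0, n - 1)(rng_);
  }

  std::vector<std::deque<int> > queues_;
  std::vector<std::size_t> last_victim_;
  std::mt19937 rng_;
};
```

With four queues where only queue 2 holds work, thief 0 may miss at first, but once it has stolen from queue 2 every later attempt samples that queue again and succeeds for as long as it stays non-empty.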
[0] http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2018-June/022556.html [1] Characterizing and Optimizing Hotspot Parallel Garbage Collection on Multicore Systems http://ranger.uta.edu/~jrao/papers/EuroSys18.pdf From kim.barrett at oracle.com Wed Jul 25 20:07:30 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 25 Jul 2018 16:07:30 -0400 Subject: RFR(M): 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: <51f994e937737d50be0ca38c33ec1df85c8bfb6e.camel@oracle.com> References: <51f994e937737d50be0ca38c33ec1df85c8bfb6e.camel@oracle.com> Message-ID: > On Jul 25, 2018, at 11:09 AM, Thomas Schatzl wrote: > > Hi all, > > could I have reviews for this change to work stealing that > significantly decreases the amount of unsuccessful steal attempts > initially worked on by Zhengyu from RedHat [0]? > > [...] > CR: > https://bugs.openjdk.java.net/browse/JDK-8205921 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8205921/webrev/ > Testing: > hs-tier1-4 > Looks good. One minor issue, for which I don't need a new webrev. ------------------------------------------------------------------------------ src/hotspot/share/gc/shared/taskqueue.inline.hpp 235 assert(sizeof(int) == 4, "I think this relies on that"); Use STATIC_ASSERT instead of assert. Although I think the actual requirement is INT_MAX >= m. ------------------------------------------------------------------------------ From karen.kinnear at oracle.com Wed Jul 25 21:56:00 2018 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Wed, 25 Jul 2018 17:56:00 -0400 Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS In-Reply-To: References: Message-ID: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com> Man, Thank you for your proposal. The runtime is the correct team. Could you please file an rfe under hotspot/runtime with the information below and the patch as well as any tests you have run and any performance results you have?
That will help us track this information and find you a sponsor. thanks, Karen > On Jul 18, 2018, at 9:53 PM, Man Cao wrote: > > Hello, > > The Java platform team at Google has maintained a local patch to inline os::SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor this patch? > > It is difficult to demonstrate performance improvement in Java benchmarks. It is more of a code refactoring to better utilize modern GCC. It partly addresses the comment about inlining SpinPause() above its declaration in os.hpp. > I found an interesting discussion about PAUSE and a microbenchmark in: > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html > However, the microbenchmark has a large variance in our experiment, making it difficult to tell if there's any benefit from inlining PAUSE. Inlining PAUSE does seem to reduce the variance a bit. > > The patch is inlined and attached below: > > diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > @@ -63,15 +63,6 @@ > popl %eax > ret > > - .globl SYMBOL(SpinPause) > - ELF_TYPE(SpinPause, at function) > - .p2align 4,,15 > -SYMBOL(SpinPause): > - rep > - nop > - movl $1, %eax > - ret > - > # Support for void Copy::conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > @@ -46,15 +46,6 @@ > > .text > > - .globl SYMBOL(SpinPause) > - .p2align 4,,15 > - ELF_TYPE(SpinPause, at function) > -SYMBOL(SpinPause): > - rep > - nop > - movq $1, %rax > - ret > - > # Support for void Copy::arrayof_conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s 
b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > @@ -42,15 +42,6 @@ > > .text > > - .globl SpinPause > - .type SpinPause, at function > - .p2align 4,,15 > -SpinPause: > - rep > - nop > - movl $1, %eax > - ret > - > # Support for void Copy::conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > --- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > @@ -38,15 +38,6 @@ > > .text > > - .globl SpinPause > - .align 16 > - .type SpinPause, at function > -SpinPause: > - rep > - nop > - movq $1, %rax > - ret > - > # Support for void Copy::arrayof_conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > @@ -51,15 +51,6 @@ > movq %fs:0x0,%rax > ret > > - .globl SpinPause > - .align 16 > -SpinPause: > - rep > - nop > - movq $1, %rax > - ret > - > - > / Support for void Copy::arrayof_conjoint_bytes(void* from, > / void* to, > / size_t count) > diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp > --- a/src/hotspot/share/runtime/os.hpp > +++ b/src/hotspot/share/runtime/os.hpp > @@ -1031,6 +1031,13 @@ > // of the global SpinPause() with C linkage. > // It'd also be eligible for inlining on many platforms. > > +#if defined(X86) && !defined(_WINDOWS) > +extern "C" int inline SpinPause() { > + __asm__ __volatile__ ("pause"); > + return 1; > +} > +#else > extern "C" int SpinPause(); > +#endif > > #endif // SHARE_VM_RUNTIME_OS_HPP > > -Man > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From manc at google.com Thu Jul 26 00:08:27 2018 From: manc at google.com (Man Cao) Date: Wed, 25 Jul 2018 17:08:27 -0700 Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS In-Reply-To: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com> References: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com> Message-ID: Thanks Karen for the response! I don't have a JBS account currently. I could ask my colleagues with JBS accounts to create an RFE for this issue, but probably I cannot directly post comments or performance results on JBS. I'm working on upstreaming more Google-local runtime and GC patches, so I can become an Author, according to: https://wiki.openjdk.java.net/display/general/JBS+Overview So far I have just contributed one patch: JDK-8193386. Can I just post performance results on the mailing list and someone could copy the results when creating an RFE? -Man On Wed, Jul 25, 2018 at 2:56 PM Karen Kinnear wrote: > Man, > > Thank you for your proposal. The runtime is the correct team. > > Could you please file an rfe under hotspot/runtime with the information > below and the patch as well as any > tests you have run and any performance results you have? > > That will help us track this information and find you a sponsor. > > thanks, > Karen > > On Jul 18, 2018, at 9:53 PM, Man Cao wrote: > > Hello, > > The Java platform team at Google has maintained a local patch to inline > os::SpinPause() since 2014. We would like to upstream this patch to > OpenJDK. Could someone sponsor this patch? > > It is difficult to demonstrate performance improvement in Java benchmarks. > It is more of a code refactoring to better utilize modern GCC. It partly > addresses the comment about inlining SpinPause() above its declaration in > os.hpp. 
> I found an interesting discussion about PAUSE and a microbenchmark in: > > http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html > However, the microbenchmark has a large variance in our experiment, making > it difficult to tell if there's any benefit from inlining PAUSE. Inlining > PAUSE does seem to reduce the variance a bit. > > The patch is inlined and attached below: > > diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s > @@ -63,15 +63,6 @@ > popl %eax > ret > > - .globl SYMBOL(SpinPause) > - ELF_TYPE(SpinPause, at function) > - .p2align 4,,15 > -SYMBOL(SpinPause): > - rep > - nop > - movl $1, %eax > - ret > - > # Support for void Copy::conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s > @@ -46,15 +46,6 @@ > > .text > > - .globl SYMBOL(SpinPause) > - .p2align 4,,15 > - ELF_TYPE(SpinPause, at function) > -SYMBOL(SpinPause): > - rep > - nop > - movq $1, %rax > - ret > - > # Support for void Copy::arrayof_conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s > @@ -42,15 +42,6 @@ > > .text > > - .globl SpinPause > - .type SpinPause, at function > - .p2align 4,,15 > -SpinPause: > - rep > - nop > - movl $1, %eax > - ret > - > # Support for void Copy::conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > --- a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > +++ 
b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s > @@ -38,15 +38,6 @@ > > .text > > - .globl SpinPause > - .align 16 > - .type SpinPause, at function > -SpinPause: > - rep > - nop > - movq $1, %rax > - ret > - > # Support for void Copy::arrayof_conjoint_bytes(void* from, > # void* to, > # size_t count) > diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s > @@ -51,15 +51,6 @@ > movq %fs:0x0,%rax > ret > > - .globl SpinPause > - .align 16 > -SpinPause: > - rep > - nop > - movq $1, %rax > - ret > - > - > / Support for void Copy::arrayof_conjoint_bytes(void* from, > / void* to, > / size_t count) > diff --git a/src/hotspot/share/runtime/os.hpp > b/src/hotspot/share/runtime/os.hpp > --- a/src/hotspot/share/runtime/os.hpp > +++ b/src/hotspot/share/runtime/os.hpp > @@ -1031,6 +1031,13 @@ > // of the global SpinPause() with C linkage. > // It'd also be eligible for inlining on many platforms. > > +#if defined(X86) && !defined(_WINDOWS) > +extern "C" int inline SpinPause() { > + __asm__ __volatile__ ("pause"); > + return 1; > +} > +#else > extern "C" int SpinPause(); > +#endif > > #endif // SHARE_VM_RUNTIME_OS_HPP > > -Man > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karen.kinnear at oracle.com Thu Jul 26 00:40:04 2018 From: karen.kinnear at oracle.com (Karen Kinnear) Date: Wed, 25 Jul 2018 20:40:04 -0400 Subject: Patch to inline os::SpinPause() for X86 on non-Windows OS In-Reply-To: References: <91C01F03-ECC3-4E31-89A9-5B8AA489BB3D@oracle.com> Message-ID: <85A4AE3E-62C9-4663-BC3E-BC2C64037FAB@oracle.com> Man. > On Jul 25, 2018, at 8:08 PM, Man Cao wrote: > > Thanks Karen for the response! > I don't have a JBS account currently. 
I could ask my colleagues with JBS accounts to create an RFE for this issue, but probably I cannot directly post comments or performance results on JBS. Sounds good - why don't you work with a colleague who has a JBS account until you can become an Author. > > I'm working on upstreaming more Google-local runtime and GC patches, so I can become an Author, according to: > https://wiki.openjdk.java.net/display/general/JBS+Overview > So far I have just contributed one patch: JDK-8193386. > > Can I just post performance results on the mailing list and someone could copy the results when creating an RFE? Sounds like you will be finding a colleague who can help to create the initial RFE. Feel free to work with that colleague to add updates, or wait until you have the information to only need to ask them a favor once. thanks, Karen > > -Man > > > On Wed, Jul 25, 2018 at 2:56 PM Karen Kinnear > wrote: > Man, > > Thank you for your proposal. The runtime is the correct team. > > Could you please file an rfe under hotspot/runtime with the information below and the patch as well as any > tests you have run and any performance results you have? > > That will help us track this information and find you a sponsor. > > thanks, > Karen > >> On Jul 18, 2018, at 9:53 PM, Man Cao > wrote: >> >> Hello, >> >> The Java platform team at Google has maintained a local patch to inline os::SpinPause() since 2014. We would like to upstream this patch to OpenJDK. Could someone sponsor this patch? >> >> It is difficult to demonstrate performance improvement in Java benchmarks. It is more of a code refactoring to better utilize modern GCC. It partly addresses the comment about inlining SpinPause() above its declaration in os.hpp.
>> I found an interesting discussion about PAUSE and a microbenchmark in: >> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2012-August/004352.html >> However, the microbenchmark has a large variance in our experiment, making it difficult to tell if there's any benefit from inlining PAUSE. Inlining PAUSE does seem to reduce the variance a bit. >> >> The patch is inlined and attached below: >> >> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_32.s >> @@ -63,15 +63,6 @@ >> popl %eax >> ret >> >> - .globl SYMBOL(SpinPause) >> - ELF_TYPE(SpinPause,@function) >> - .p2align 4,,15 >> -SYMBOL(SpinPause): >> - rep >> - nop >> - movl $1, %eax >> - ret >> - >> # Support for void Copy::conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> --- a/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> +++ b/src/hotspot/os_cpu/bsd_x86/bsd_x86_64.s >> @@ -46,15 +46,6 @@ >> >> .text >> >> - .globl SYMBOL(SpinPause) >> - .p2align 4,,15 >> - ELF_TYPE(SpinPause,@function) >> -SYMBOL(SpinPause): >> - rep >> - nop >> - movq $1, %rax >> - ret >> - >> # Support for void Copy::arrayof_conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> --- a/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_32.s >> @@ -42,15 +42,6 @@ >> >> .text >> >> - .globl SpinPause >> - .type SpinPause,@function >> - .p2align 4,,15 >> -SpinPause: >> - rep >> - nop >> - movl $1, %eax >> - ret >> - >> # Support for void Copy::conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> --- 
a/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> +++ b/src/hotspot/os_cpu/linux_x86/linux_x86_64.s >> @@ -38,15 +38,6 @@ >> >> .text >> >> - .globl SpinPause >> - .align 16 >> - .type SpinPause,@function >> -SpinPause: >> - rep >> - nop >> - movq $1, %rax >> - ret >> - >> # Support for void Copy::arrayof_conjoint_bytes(void* from, >> # void* to, >> # size_t count) >> diff --git a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> --- a/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> +++ b/src/hotspot/os_cpu/solaris_x86/solaris_x86_64.s >> @@ -51,15 +51,6 @@ >> movq %fs:0x0,%rax >> ret >> >> - .globl SpinPause >> - .align 16 >> -SpinPause: >> - rep >> - nop >> - movq $1, %rax >> - ret >> - >> - >> / Support for void Copy::arrayof_conjoint_bytes(void* from, >> / void* to, >> / size_t count) >> diff --git a/src/hotspot/share/runtime/os.hpp b/src/hotspot/share/runtime/os.hpp >> --- a/src/hotspot/share/runtime/os.hpp >> +++ b/src/hotspot/share/runtime/os.hpp >> @@ -1031,6 +1031,13 @@ >> // of the global SpinPause() with C linkage. >> // It'd also be eligible for inlining on many platforms. >> >> +#if defined(X86) && !defined(_WINDOWS) >> +extern "C" int inline SpinPause() { >> + __asm__ __volatile__ ("pause"); >> + return 1; >> +} >> +#else >> extern "C" int SpinPause(); >> +#endif >> >> #endif // SHARE_VM_RUNTIME_OS_HPP >> >> -Man
From thomas.schatzl at oracle.com Thu Jul 26 10:00:54 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jul 2018 12:00:54 +0200 Subject: RFR(M): 8205921: Optimizing best-of-2 work stealing queue selection In-Reply-To: References: <51f994e937737d50be0ca38c33ec1df85c8bfb6e.camel@oracle.com> Message-ID: Hi Kim, On Wed, 2018-07-25 at 16:07 -0400, Kim Barrett wrote: > > On Jul 25, 2018, at 11:09 AM, Thomas Schatzl wrote: > > > > Hi all, > > > > could I have reviews for this change to work stealing that > > significantly decreases the amount of unsuccessful steal attempts > > initially worked on by Zhengyu from RedHat [0]? > > > > [...] > > CR: > > https://bugs.openjdk.java.net/browse/JDK-8205921 > > Webrev: > > http://cr.openjdk.java.net/~tschatzl/8205921/webrev/ > > Testing: > > hs-tier1-4 > > > > Looks good. > > One minor issue, for which I don't need a new webrev. thanks for your review. I fixed this in the original webrev for another reviewer. Thanks, Thomas From rkennke at redhat.com Thu Jul 26 11:09:56 2018 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 26 Jul 2018 13:09:56 +0200 Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access In-Reply-To: <5B587C50.80502@oracle.com> References: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> <5B587C50.80502@oracle.com> Message-ID: Hi Erik, thanks for reviewing! > In instanceRefKlass.inline.hpp: > > Are these changes to logging required for Shenandoah not to crash? It > appears to me that for ZGC, it would print the wrong addresses if > barriers were not used. That's why I wonder if this is a strict > requirement or not for Shenandoah to work. We do need to avoid read-barriers on those paths. The problem is that during full-GC we temporarily don't have the fwd-ptr available (it's a sliding mark-compact algorithm). However, we can work around this by not using the base+offset variants like in the patch.
However, this seems to make the Access API unhappy at compile-time when using ON_UNKNOWN_OOP_REF. Can you check this? I've no clue where to look. > In oops/oop.cpp: > Rather than changing the metadata accessors to always assume raw > accesses are enough for all metadata accesses, I would prefer to have > explicit metadata_field_raw accessors instead where we expect this (even > if that turns out to be always). Alright, I unwound it and came up with a minimal use of metadata_field_raw(). It requires splicing Class::as_Klass() into a _raw() variant to be used from oop iterator paths. But it's not as bad as I thought. Updated patch: http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.01/ Do you like it better? Can you check the compile-error that I mentioned above? Thanks, Roman > > Thanks, > /Erik > > On 2018-07-06 16:46, Roman Kennke wrote: >> We have several code paths going out from oop_iterate() methods that >> lead to GC barriers. This is not only inefficient but outright wrong. >> oop_iterate() is normally used by GC and GC needs to see the raw stuff, >> not some resolved objects. In Shenandoah's full-GC it's fatal to attempt >> to read objects' forwarding pointers, because they're temporarily pointing >> to nowhere land. >> >> I propose to selectively use _raw() variants of the various accessors >> that are used on oop_iterate() paths. This means introducing an >> oopDesc::int_field_raw(). I also propose to change metadata_field() >> accessors to always use raw access wholesale. This is only used to load >> the Klass* field, which is immutable and thus doesn't require barriers. >> >> The log_* statements in instanceRefKlass.inline.hpp surely don't need >> barriers. I turned them into raw accessors as well. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8206457?filter=-1 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.00/ >> >> Test: passes hotspot-tier1 here. >> >> Can I please get review?
>> >> Roman >> > From thomas.schatzl at oracle.com Thu Jul 26 11:25:20 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jul 2018 13:25:20 +0200 Subject: RFR: 8204089: Timely Reducing Unused Committed Memory In-Reply-To: References: Message-ID: <244535da19fb696e8e48345f72c29a8f7478432a.camel@oracle.com> Hi Rodrigo, first, sorry for taking so long to respond in this thread. Unfortunately the JDK11 release and summer vacations made sure that we were a bit busy with other matters. I think this has now somewhat cleared up, although there is still some potential for further work for JDK11. On Tue, 2018-06-19 at 20:46 +0200, Rodrigo Bruno wrote: > Hi all, > > here is the first version of our contribution for draft JEP-8204089. > > More details at the CR. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8204089 > Webrev: > http://cr.openjdk.java.net/~tschatzl/jelastic/pgc/ > > Thanks, > Rodrigo some comments: - currently the check for whether we should shrink the heap is attached to the VMThread-loop as a kind of control thread. This works, but is a bit ugly I guess :) One option would be to add a dedicated control thread, but at least all concurrent collectors already have that (Z the ZDirector and Shenandoah the ShenandoahControlThread, and G1 its G1ConcurrentMarkThread or G1YoungRemSetSamplingThread), and it might be better to at most provide a method for this functionality instead of going full-on separate. I.e. my suggestion at the moment, since we are limited to G1 too, is to use the G1ConcurrentMarkThread to trigger this. This would make it a bit hard to support this for parallel and serial gc, but we limited ourselves to G1 support anyway.
There is a "sleep_before_next_cycle" method which could be somewhat refactored to time out "frequently" and trigger the action. - comments for the should_gc() method: - it would imho be best to put the frequency check first because to me it seems the one that most frequently returns false. - please use Ticks/Tickspan to measure time for any new code. I know that G1 uses mostly doubles to store time, but I am in the process of replacing all of the doubles with those. - looking a bit at other implementations it might be worth being able to customize what is done when idle is detected. In most cases it might be sufficient to just shrink the heap (in a new VM operation, using the existing code in G1CollectedHeap::resize_after_full_collection() that already uses Min/MaxHeapFreeRatio). A System.gc() seems very intrusive and should be optional imho after some thinking; making this optional does not seem too much work. Consider applications with more than a few GB of heap, those will be affected a lot (i.e. unresponsive for multiple seconds). A flag like UseFullGCForIdleCompaction(?) could be used here. - the change does not make the System.gc() use "Idle" (probably "Idle Time Compaction") or similar as suggested in the JEP. - there needs to be at least one junit test that tests this functionality given different options. I also reformatted the JEP text just a bit. Sorry again for the long delay in answering your request. Thanks, Thomas From thomas.schatzl at oracle.com Thu Jul 26 11:49:36 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jul 2018 13:49:36 +0200 Subject: RFR (XXS): 8208297: Allow printing of taskqueue stats if compiled in in product builds Message-ID: Hi all, please review this trivial patch that changes some log_develop_is_enabled(Trace, ...) to log_is_enabled(Trace, ...) to allow printing of task queue stats if they are collected (by setting TASKQUEUE_STATS) in product builds with Parallel/CMS without any additional changes too.
CR: https://bugs.openjdk.java.net/browse/JDK-8208297 Webrev: http://cr.openjdk.java.net/~tschatzl/8208297/webrev/ Testing: local compilation/use Thanks, Thomas From thomas.schatzl at oracle.com Thu Jul 26 11:32:46 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jul 2018 13:32:46 +0200 Subject: RFR: 8204089: Timely Reducing Unused Committed Memory In-Reply-To: <244535da19fb696e8e48345f72c29a8f7478432a.camel@oracle.com> References: <244535da19fb696e8e48345f72c29a8f7478432a.camel@oracle.com> Message-ID: Hi, at least one sentence needs further clarification. On Thu, 2018-07-26 at 13:25 +0200, Thomas Schatzl wrote: > Hi Rodrigo, > > first, sorry for taking so long to respond in this thread. > [...] > On Tue, 2018-06-19 at 20:46 +0200, Rodrigo Bruno wrote: > > Hi all, > > > > here is the first version of our contribution for draft > > JEP-8204089. > > > > some comments: > [...] > - looking a bit at other implementations it might be worth being able > to customize what is done when idle is detected. > > In most cases it might be sufficient to just shrink the heap (in a > new VM operation, using the existing code in > G1CollectedHeap::resize_after_full_collection() that already uses > Min/MaxHeapFreeRatio). A System.gc() seems very intrusive and should > be optional imho after some thinking; making this optional does not > seem too much work. > > Consider applications with more than a few GB of heap, those will be > affected a lot (i.e. unresponsive for multiple seconds). > > A flag like UseFullGCForIdleCompaction(?) could be used here. > > - the change does not make the System.gc() use "Idle" (probably "Idle > Time Compaction") or similar as suggested in the JEP. ... as GC cause... It is very important to make sure that this action is easily identifiable for users.
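The review comments on should_gc() (do the cheap frequency check first, measure time with a proper clock type, and make the full GC optional) could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the webrev: IdlePolicy, IdleAction, decide() and the system_is_idle callback are hypothetical names, std::chrono::steady_clock merely stands in for HotSpot's Ticks/Tickspan, and use_full_gc mirrors the tentative UseFullGCForIdleCompaction(?) flag.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical stand-ins; none of these names come from the webrev.
using Clock = std::chrono::steady_clock;  // plays the role of Ticks/Tickspan

enum class IdleAction { None, ShrinkHeap, FullGC };

struct IdlePolicy {
  Clock::duration       interval;        // minimum time between idle actions
  bool                  use_full_gc;     // tentative UseFullGCForIdleCompaction-style switch
  Clock::time_point     last_action;     // when the last idle action was triggered
  std::function<bool()> system_is_idle;  // potentially expensive probe (e.g. load average)

  IdleAction decide(Clock::time_point now) {
    assert(interval >= Clock::duration::zero());
    // Cheap check first: this is the one that returns "no" most often.
    if (now - last_action < interval) {
      return IdleAction::None;
    }
    // Only now pay for the more expensive idleness probe.
    if (!system_is_idle()) {
      return IdleAction::None;
    }
    last_action = now;
    // Shrinking alone is the less intrusive default; a full GC is opt-in.
    return use_full_gc ? IdleAction::FullGC : IdleAction::ShrinkHeap;
  }
};
```

A control thread that times out periodically would call decide() on each wakeup; with the frequency check first, the idleness probe runs at most once per interval.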
Thanks, Thomas From erik.osterlund at oracle.com Thu Jul 26 13:13:53 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Thu, 26 Jul 2018 15:13:53 +0200 Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access In-Reply-To: References: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> <5B587C50.80502@oracle.com> Message-ID: <5B59C911.3060200@oracle.com> Hi Roman, On 2018-07-26 13:09, Roman Kennke wrote: > Hi Erik, > thanks for reviewing! > >> In instanceRefKlass.inline.hpp: >> >> Are these changes to logging required for Shenandoah not to crash? It >> appears to me that for ZGC, it would print the wrong addresses if >> barriers were not used. That's why I wonder if this is a strict >> requirement or not for Shenandoah to work. > We do need to avoid read-barriers on those paths. The problem is that > during full-GC we temporarily don't have the fwd-ptr available (it's a > sliding mark-compact algorithm). However, we can work around this by not > using the base+offset variants like in the patch. However, this seems to > make the Access API unhappy at compile-time when using > ON_UNKNOWN_OOP_REF. Can you check this? I've no clue where to look. The reason is that wherever ON_UNKNOWN_OOP_REF is used, the backend needs to be able to determine the exact strength. And to do that, the backend needs to be able to determine if this is a referent field. And to do that, it needs a base pointer. I'm not 100% sure what I think is a good solution to this. I wonder if along the lines of introducing these resolve for read/write decorators (which it looks like we will be needing anyway), there could be a do not resolve decorator that could be passed in to determine how to resolve the access. Default for stores could be ACCESS_WRITE, for loads ACCESS_READ, for atomics ACCESS_READ | ACCESS_WRITE, and explicitly setting ACCESS_NONE meaning don't resolve this one.
Maybe the prefix ought to be RESOLVE_READ / RESOLVE_WRITE / RESOLVE_NONE instead though to be more specific. >> In oops/oop.cpp: >> Rather than changing the metadata accessors to always assume raw >> accesses are enough for all metadata accesses, I would prefer to have >> explicit metadata_field_raw accessors instead where we expect this (even >> if that turns out to be always). > Alright, I unwound it and came up with a minimal use of > metadata_field_raw(). It requires splicing Class::as_Klass() into a > _raw() variant to be used from oop iterator paths. But it's not as bad > as I thought. > > Updated patch: > http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.01/ > > Do you like it better? Can you check the compile-error that I mentioned > above? Looks better now, yeah. Thanks, /Erik > Thanks, Roman > > >> Thanks, >> /Erik >> >> On 2018-07-06 16:46, Roman Kennke wrote: >>> We have several code paths going out from oop_iterate() methods that >>> lead to GC barriers. This is not only inefficient but outright wrong. >>> oop_iterate() is normally used by GC and GC needs to see the raw stuff, >>> not some resolved objects. In Shenandoah's full-GC it's fatal to attempt >>> to read objects' forwarding pointers, because they're temporarily pointing >>> to nowhere land. >>> >>> I propose to selectively use _raw() variants of the various accessors >>> that are used on oop_iterate() paths. This means introducing an >>> oopDesc::int_field_raw(). I also propose to change metadata_field() >>> accessors to always use raw access wholesale. This is only used to load >>> the Klass* field, which is immutable and thus doesn't require barriers. >>> >>> The log_* statements in instanceRefKlass.inline.hpp surely don't need >>> barriers. I turned them into raw accessors as well. >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8206457?filter=-1 >>> Webrev: >>> http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.00/ >>> >>> Test: passes hotspot-tier1 here.
>>> >>> Can I please get review? >>> >>> Roman >>> > From yasuenag at gmail.com Thu Jul 26 13:44:30 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 26 Jul 2018 22:44:30 +0900 Subject: PING: RFR: 8207756: jstat should show CGC STW phases of ZGC In-Reply-To: <107AD8D6-1E1F-4469-9352-8CCD3C84AA67@oracle.com> References: <11866bdb-9e9d-019c-c64e-8548a79a7617@gmail.com> <107AD8D6-1E1F-4469-9352-8CCD3C84AA67@oracle.com> Message-ID: <51cc06a1-b027-4208-f2d9-327cbd009564@gmail.com> CC'ing to hotspot-gc-dev Hi Per, I've looked at ZServiceabilityCounters in zServiceability.cpp. IMHO it is unsuitable because CGC counter means STW in concurrent GC. So I think it is suitable to implement in VMOperation for ZGC. Thanks, Yasumasa On 2018/07/26 22:30, Per Lidén wrote: > Hi, > > Please have a look at the zServiceability class, which is where code like this is intended to live. > > /Per > >> On 26 Jul 2018, at 14:30, Yasumasa Suenaga wrote: >> >> PING: Could you review it? >> >>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.00/ >> >> >> Yasumasa >> >> >>> On 2018/07/18 11:59, Yasumasa Suenaga wrote: >>> Hi all, >>> Please review this change: >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8207756 >>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.00/ >>> jstat shows CGC STW phases as CGC and CGCT fields since JDK-8153333. >>> However, ZGC does not adapt to it yet, as follows: >>> ``` >>> $ jstat -gc 1234 1 1 >>> S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT CGC CGCT GCT >>> - - - - - - 514048.0 12288.0 4352.0 3731.2 0.0 0.0 - - - - - - 0.000 >>> ``` >>> I uploaded webrev which is for jdk/jdk repo. Please tell me if it >>> should be for zgc repo. >>> I'm not an author of ZGC. So I need a sponsor. >>> (I'm a Reviewer of jdk. If this change should be pushed to jdk/jdk, >>> please tell me. I will push it after reviewing.)
>>> Thanks, >>> Yasumasa > From yasuenag at gmail.com Thu Jul 26 13:52:10 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 26 Jul 2018 22:52:10 +0900 Subject: ZGC: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is working In-Reply-To: <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com> References: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com> <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com> Message-ID: <06ceb864-bca5-d89c-c54e-fbfce3585066@gmail.com> CC'ing to hotspot-gc-dev On 2018/07/26 21:30, Yasumasa Suenaga wrote: > PING: Could you review it? > >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ > > > Yasumasa > > > On 2018/07/19 23:03, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this webrev. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8207843 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ >> >> I encountered AssertionFailure when I attached HSDB to the process which is working with ZGC as below: >> >> sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap >> at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32) >> at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448) >> at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173) >> at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741) >> at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> ObjectHeap#collectLiveRegions() branches by instance type of CollectedHeap. However it does not support ZCollectedHeap. >> So I add ZCollectedHeap to it and add some methods to iterate ZPageTable.
>> >> >> Thanks, >> >> Yasumasa From rkennke at redhat.com Thu Jul 26 14:02:02 2018 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 26 Jul 2018 16:02:02 +0200 Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access In-Reply-To: <5B59C911.3060200@oracle.com> References: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> <5B587C50.80502@oracle.com> <5B59C911.3060200@oracle.com> Message-ID: On 26.07.2018 at 15:13, Erik Österlund wrote: > Hi Roman, > > On 2018-07-26 13:09, Roman Kennke wrote: >> Hi Erik, >> thanks for reviewing! >> >>> In instanceRefKlass.inline.hpp: >>> >>> Are these changes to logging required for Shenandoah not to crash? It >>> appears to me that for ZGC, it would print the wrong addresses if >>> barriers were not used. That's why I wonder if this is a strict >>> requirement or not for Shenandoah to work. >> We do need to avoid read-barriers on those paths. The problem is that >> during full-GC we temporarily don't have the fwd-ptr available (it's a >> sliding mark-compact algorithm). However, we can work around this by not >> using the base+offset variants like in the patch. However, this seems to >> make the Access API unhappy at compile-time when using >> ON_UNKNOWN_OOP_REF. Can you check this? I've no clue where to look. > > The reason is that wherever ON_UNKNOWN_OOP_REF is used, the backend > needs to be able to determine the exact strength. And to do that, the > backend needs to be able to determine if this is a referent field. And > to do that, it needs a base pointer. > > I'm not 100% sure what I think is a good solution to this. I wonder if > along the lines of introducing these resolve for read/write decorators > (which it looks like we will be needing anyway), there could be a do not > resolve decorator that could be passed in to determine how to resolve > the access.
Default for stores could be ACCESS_WRITE, for loads > ACCESS_READ, for atomics ACCESS_READ | ACCESS_WRITE, and explicitly > setting ACCESS_NONE meaning don't resolve this one. Maybe the prefix > ought to be RESOLVE_READ / RESOLVE_WRITE / RESOLVE_NONE instead though > to be more specific. We are in instanceRefKlass, and we should be able to determine the reference strength statically, and pass in the correct ON_XXX_OOP_REF decorator, right? E.g. via InstanceKlass::reference_type()? Or would that not work? From thomas.schatzl at oracle.com Thu Jul 26 14:06:42 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jul 2018 16:06:42 +0200 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions In-Reply-To: References: Message-ID: Hi Paul, Erik may not have time in the next few months to review such a large change. But it would also be better to do the changes in steps for other reviewers. Also see below. On Mon, 2018-07-23 at 21:33 +0000, Hohensee, Paul wrote: > Please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8196989 I may have missed this in the previous discussion (which has been a while), but has there been any discussion about a "Free (Region) Space" for the committed but free regions? It seems a bit random to assign free regions to the "old space", seemingly just a repeat of the current behavior (where everything has been put into "old gen"). Also, imho the second survivor space should preferably be dropped completely. :) > CSR: https://bugs.openjdk.java.net/browse/JDK-8196991 > Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00 > > This webrev is marked "L" because it's a behavioral change (CSR in > draft state, may I have a review of that too please?)
and because the > test change fanout is large. The actual code changes are "M". > > Passes the submit repo, Hotspot tier1, the JFR gc event tests and any > other test set with "gc" or "serviceability" in the test directory > name. I found it difficult to verify the accuracy of the reported > values other than manually, since they can vary from run to run of > the same program. I'd appreciate suggestions for how to go about > writing accuracy tests. > > I set out originally to revamp only the MXBeans, but decided it would > be incomplete if I didn't include the jstat counters and the output > of the GC.heap_info jcmd option. I can separate the latter two into > their own RFEs, but I find it easier to understand it all in a single > webrev and hope the reviewers will too. > > The basic approach is to add the new memory pools and collectors, the > new jstat counters, and an archive region counter that stands in for > an actual archive region set. HeapRegionSets are disjoint, so One option would be to add a HeapRegionSet tailored for archives that does not check the disjoint-criteria (it is superficially used for verification only anyway) - we already have special classes/flags for different kinds of regions (humongous/free/old) in the HeapRegionSet hierarchy. > initially I tried to create a first-class archive region set (on the > same level as the humongous region set), but that idea foundered on > the fact that there's too much code I don't fully understand that > depends on archive regions being in the existing old region set. Probably to simplify the implementation of archive regions :) This is another option, and does not look too bad actually, we only need to check and change all HeapRegion::is_old() or HeapRegion::is_old_or_humongous() checks.
Now we only need a good name for is_old_or_archive_or_humongous() because that one is a bit lengthy :) > Externally (i.e., in the MXBeans and the jstat counters), however, > the old region set doesn't include archive regions (unless running in > legacy mode). > > I used CMS's TraceCMSMemoryManagerStats class as the model for > TraceConcMemoryManagerStats, which latter collects statistics on > concurrent cycles. There are two STW pauses in each concurrent cycle: > they are recorded separately and count as two sun.gc.collector.2 > events. I would like to move serviceability code away from G1CollectedHeap.h/cpp as much as possible; e.g. it would be very nice to make G1MonitoringSupport the owner of all the serviceability related data. Also the _use_legacy_monitoring member should probably move there too. > The humongous and archive space committed and used values are always > identical, This is because, for some reason, G1 counts the memory filled with filler objects as "used". Other collectors don't. > hence they are always 100% used. You may have noticed that just recently we got a bug (https://bugs.openjdk.java.net/browse/JDK-8207200) filed against the G1 MXBeans because of races in the code particularly code to be not-racy. The reason is the really weird calculation of used/committed for eden space/survivor space/old gen and that the precondition written down in G1MonitoringSupport::recalculate_sizes() does not hold. G1 MemoryMXBeans basically fabricate some numbers as you might have noticed :), so in addition to fixing that issue with the race I am still working on improving the accuracy of the used values. Also, in the course of this change I am considering removing some other backwards-bending in returned values for G1 (the mentioned and e.g. funky stuff like assuming that adding together max-capacities of the pools gives you total heap size).
I also have a preliminary webrev for that at http://cr.openjdk.java.net/~tschatzl/8207200/webrev/ which unfortunately clashes a lot with your changes. The reason why it is a single webrev is because I am not finished yet - I tend to split it up in parts for much better reviewing at the very end only. Could we work together on first refactoring the code before adding new kinds of spaces to the MXBeans? Looking at this change and mine roughly the following issues would need to be resolved first: - find a solution for archive regions as suggested above :) At the moment, without doing the change, I would tend to make archive regions separate from old regions. - move serviceability stuff as much as possible to g1MonitoringSupport - clean up MemoryPool, remove duplicate information - provide and return sane memory pool used/committed values to the MXBeans - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" variables for every single memory pool. Use MemoryUsage structs for them. Make reading of memory pool information atomic wrt its readers (note that I think it is currently just impossible to get consistent output for other statistics like jstat) - that's JDK-8207200. - add whatever serviceability stuff for the new pools/jstat/* in steps. > The revised output of jcmd GC.heap_info is in > G1CollectedHeap::print_on(). > I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing > the result type of young_list_target_length() from size_t to uint, > which latter is the type of the _young_list_target_length member. > I updated the copyright date in > src/hotspot/share/services/memoryService.hpp to 2018, as I neglected > to do so in a previous push. Comments?
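The cleanup item about grouping used/committed per pool and making reads atomic with respect to readers could look roughly like the sketch below. It is only an illustration of the idea under stated assumptions, not the code in either webrev: PoolUsage, HeapSnapshot, MonitoringSupport and total_used() are hypothetical names, and a plain std::mutex stands in for whatever synchronization G1MonitoringSupport would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical names throughout; this is not G1MonitoringSupport.
struct PoolUsage {
  std::size_t used;
  std::size_t committed;
};

// One value object for all pools instead of separate *used/*committed fields.
struct HeapSnapshot {
  PoolUsage eden;
  PoolUsage survivor;
  PoolUsage old_gen;
};

class MonitoringSupport {
  mutable std::mutex _lock;
  HeapSnapshot       _current{};  // value-initialized: all sizes start at zero

public:
  // Writer: publish all pools in one step so that readers never observe a
  // mix of old and new values.
  void update(const HeapSnapshot& fresh) {
    assert(fresh.eden.used     <= fresh.eden.committed);
    assert(fresh.survivor.used <= fresh.survivor.committed);
    assert(fresh.old_gen.used  <= fresh.old_gen.committed);
    std::lock_guard<std::mutex> guard(_lock);
    _current = fresh;
  }

  // Reader: copy the whole snapshot under the lock, then work on the copy.
  HeapSnapshot snapshot() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _current;
  }
};

// Totals are derived from a single snapshot rather than three separate reads.
inline std::size_t total_used(const HeapSnapshot& s) {
  return s.eden.used + s.survivor.used + s.old_gen.used;
}
```

Because every reader works on its own copy, values reported for the individual pools and any totals derived from them always belong to the same update.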
Thanks, Thomas From erik.osterlund at oracle.com Thu Jul 26 14:22:34 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Thu, 26 Jul 2018 16:22:34 +0200 Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access In-Reply-To: References: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> <5B587C50.80502@oracle.com> <5B59C911.3060200@oracle.com> Message-ID: <5B59D92A.90602@oracle.com> Hi Roman, On 2018-07-26 16:02, Roman Kennke wrote: > On 26.07.2018 at 15:13, Erik Österlund wrote: >> Hi Roman, >> >> On 2018-07-26 13:09, Roman Kennke wrote: >>> Hi Erik, >>> thanks for reviewing! >>> >>>> In instanceRefKlass.inline.hpp: >>>> >>>> Are these changes to logging required for Shenandoah not to crash? It >>>> appears to me that for ZGC, it would print the wrong addresses if >>>> barriers were not used. That's why I wonder if this is a strict >>>> requirement or not for Shenandoah to work. >>> We do need to avoid read-barriers on those paths. The problem is that >>> during full-GC we temporarily don't have the fwd-ptr available (it's a >>> sliding mark-compact algorithm). However, we can work around this by not >>> using the base+offset variants like in the patch. However, this seems to >>> make the Access API unhappy at compile-time when using >>> ON_UNKNOWN_OOP_REF. Can you check this? I've no clue where to look. >> The reason is that wherever ON_UNKNOWN_OOP_REF is used, the backend >> needs to be able to determine the exact strength. And to do that, the >> backend needs to be able to determine if this is a referent field. And >> to do that, it needs a base pointer. >> >> I'm not 100% sure what I think is a good solution to this. I wonder if >> along the lines of introducing these resolve for read/write decorators >> (which it looks like we will be needing anyway), there could be a do not >> resolve decorator that could be passed in to determine how to resolve >> the access.
Default for stores could be ACCESS_WRITE, for loads >> ACCESS_READ, for atomics ACCESS_READ | ACCESS_WRITE, and explicitly >> setting ACCESS_NONE meaning don't resolve this one. Maybe the prefix >> ought to be RESOLVE_READ / RESOLVE_WRITE / RESOLVE_NONE instead though >> to be more specific. > We are in instanceRefKlass, and we should be able to determine the > reference strength statically, and pass in the correct ON_XXX_OOP_REF > decorator, right? E.g. via InstanceKlass::reference_type()? Or would > that not work? That should probably do the trick, yes. /Erik From rkennke at redhat.com Thu Jul 26 14:37:21 2018 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 26 Jul 2018 16:37:21 +0200 Subject: RFR: JDK-8206457: Code paths from oop_iterate() must use barrier-free access In-Reply-To: <5B59D92A.90602@oracle.com> References: <28394f57-3590-3d74-b660-dcfa6b4648a2@redhat.com> <5B587C50.80502@oracle.com> <5B59C911.3060200@oracle.com> <5B59D92A.90602@oracle.com> Message-ID: <78747f4d-b6c7-31ff-51b5-0f60d7586ccf@redhat.com> On 26.07.2018 at 16:22, Erik Österlund wrote: > Hi Roman, > > On 2018-07-26 16:02, Roman Kennke wrote: >> On 26.07.2018 at 15:13, Erik Österlund wrote: >>> Hi Roman, >>> >>> On 2018-07-26 13:09, Roman Kennke wrote: >>>> Hi Erik, >>>> thanks for reviewing! >>>> >>>>> In instanceRefKlass.inline.hpp: >>>>> >>>>> Are these changes to logging required for Shenandoah not to crash? It >>>>> appears to me that for ZGC, it would print the wrong addresses if >>>>> barriers were not used. That's why I wonder if this is a strict >>>>> requirement or not for Shenandoah to work. >>>> We do need to avoid read-barriers on those paths. The problem is that >>>> during full-GC we temporarily don't have the fwd-ptr available (it's a >>>> sliding mark-compact algorithm). However, we can work around this by >>>> not >>>> using the base+offset variants like in the patch.
>>>> However, this seems to
>>>> make the Access API unhappy at compile-time when using
>>>> ON_UNKNOWN_OOP_REF. Can you check this? I've no clue where to look.
>>> The reason is that wherever ON_UNKNOWN_OOP_REF is used, the backend
>>> needs to be able to determine the exact strength. And to do that, the
>>> backend needs to be able to determine if this is a referent field. And
>>> to do that, it needs a base pointer.
>>>
>>> I'm not 100% sure what I think is a good solution to this. I wonder if
>>> along the lines of introducing these resolve for read/write decorators
>>> (which it looks like we will be needing anyway), there could be a do not
>>> resolve decorator that could be passed in to determining how to resolve
>>> the access. Default for stores could be ACCESS_WRITE, for loads
>>> ACCESS_READ, for atomics ACCESS_READ | ACCESS_WRITE, and explicitly
>>> setting ACCESS_NONE meaning don't resolve this one. Maybe the prefix
>>> ought to be RESOLVE_READ / RESOLVE_WRITE / RESOLVE_NONE instead though
>>> to be more specific.
>> We are in instanceRefKlass, and we should be able to determine the
>> reference strength statically, and pass in the correct ON_XXX_OOP_REF
>> decorator, right? E.g. via InstanceKlass::reference_type() ? Or would
>> that not work?
>
> That should probably do the trick, yes.

Not 100% sure this is the correct ReferenceType -> decorators mapping?

Incremental:
http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.02.diff/
Full patch:
http://cr.openjdk.java.net/~rkennke/JDK-8206457/webrev.02/

Roman

From erik.osterlund at oracle.com Thu Jul 26 14:19:58 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 26 Jul 2018 16:19:58 +0200
Subject: PING: RFR: 8207756: jstat should show CGC STW phases of ZGC
In-Reply-To: <51cc06a1-b027-4208-f2d9-327cbd009564@gmail.com>
References: <11866bdb-9e9d-019c-c64e-8548a79a7617@gmail.com> <107AD8D6-1E1F-4469-9352-8CCD3C84AA67@oracle.com> <51cc06a1-b027-4208-f2d9-327cbd009564@gmail.com>
Message-ID: <5B59D88E.3040708@oracle.com>

Hi Yasumasa,

There is only one global instance of CollectorCounters. Do we really have to pass it around into constructors and dump it in ZDriver?
I don't see why it would be in any way problematic to just put this global instance of counters in ZServiceabilityCounters. And put the TraceCollectorStats tcs(_cgc_counters); stuff in ZServiceabilityCountersTracer or something like that.

Also I saw this: _cgc_counters = new CollectorCounters("Z stop-the-world phases", 2);

Does this assume there are two pauses for a ZGC collection cycle? There are at least 3, but possibly more:

1) Mark start
2) Mark end (this could recur multiple times due to unlucky races with mutators resurrecting weak references right before hitting mark end, causing a new concurrent phase if a large object graph was reachable from this weak ref). So we really can't know for sure how many pauses there will be.
3) Relocate start (done after reference processing and concurrent selection of relocation set)

Thanks,
/Erik

On 2018-07-26 15:44, Yasumasa Suenaga wrote:
> CC'ing to hotspot-gc-dev
>
> Hi Per,
>
> I've looked at ZServiceabilityCounters in zServiceability.cpp .
> IMHO it is unsuitable because CGC counter means STW in concurrent GC.
> So I think it is suitable to implement in VMOperation for ZGC.
>
> Thanks,
>
> Yasumasa
>
>
> On 2018/07/26 22:30, Per Lidén wrote:
>> Hi,
>>
>> Please have a look at the zServiceability class, which is where code
>> like this is intended to live.
>>
>> /Per
>>
>>> On 26 Jul 2018, at 14:30, Yasumasa Suenaga wrote:
>>>
>>> PING: Could you review it?
>>>
>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.00/
>>>
>>>
>>> Yasumasa
>>>
>>>
>>>> On 2018/07/18 11:59, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>> Please review this change:
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8207756
>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.00/
>>>> jstat shows CGC STW phases as CGC and CGCT fields since JDK-8153333 .
>>>> However, ZGC does not adapt to it yet as following:
>>>> ```
>>>> $ jstat -gc 1234 1 1
>>>> S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT CGC
>>>> CGCT GCT
>>>> - - - - - - 514048.0 12288.0 4352.0 3731.2 0.0 0.0 - - - - - -
>>>> 0.000
>>>> ```
>>>> I uploaded webrev which is for jdk/jdk repo. Please tell me if it
>>>> should be for zgc repo.
>>>> I'm not an author of ZGC. So I need a sponsor.
>>>> (I'm a Reviewer of jdk. If this change should be pushed to jdk/jdk,
>>>> please tell me. I will push it after reviewing.)
>>>> Thanks,
>>>> Yasumasa
>>

From kim.barrett at oracle.com Thu Jul 26 18:32:43 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 26 Jul 2018 14:32:43 -0400
Subject: RFR (XXS): 8208297: Allow printing of taskqueue stats if compiled in in product builds
In-Reply-To:
References:
Message-ID:

> On Jul 26, 2018, at 7:49 AM, Thomas Schatzl wrote:
>
> Hi all,
>
> please review this trivial patch that changes some
> log_develop_is_enabled(Trace, ...) to log_is_enabled(Trace, ...) to
> allow printing of task queue stats if they are collected (by setting
> TASKQUEUE_STATS) in product builds with Parallel/CMS without any
> additional changes too.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8208297
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8208297/webrev/
> Testing:
> local compilation/use
>
> Thanks,
> Thomas

Looks good, and trivial.

From thomas.schatzl at oracle.com Thu Jul 26 18:50:43 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 26 Jul 2018 20:50:43 +0200
Subject: RFR (XXS): 8208297: Allow printing of taskqueue stats if compiled in in product builds
In-Reply-To:
References:
Message-ID: <91663645e0d90db78297b9d5a84d3a84e3f14405.camel@oracle.com>

Hi Kim,

On Thu, 2018-07-26 at 14:32 -0400, Kim Barrett wrote:
> > On Jul 26, 2018, at 7:49 AM, Thomas Schatzl wrote:
> >
> > Hi all,
> >
> > please review this trivial patch that changes some
> > log_develop_is_enabled(Trace, ...) to log_is_enabled(Trace, ...) to
> > allow printing of task queue stats if they are collected (by
> > setting
> > TASKQUEUE_STATS) in product builds with Parallel/CMS without any
> > additional changes too.
> >
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8208297
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8208297/webrev/
> > Testing:
> > local compilation/use
> >
> > Thanks,
> > Thomas
>
> Looks good, and trivial.
>

thanks for your review.

Thomas

From jcbeyler at google.com Thu Jul 26 20:04:32 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 26 Jul 2018 13:04:32 -0700
Subject: RFR(S) 8208249: TriggerUnloadingByFillingMetaspace generates garbage class names
Message-ID:

Hi all,

I'm not sure this is the right list, let me know if not.

Could someone review this small webrev that puts the GeneratedClassProducer in a ThreadLocal holder to remove the data race on it:

Webrev: http://cr.openjdk.java.net/~jcbeyler/8208249/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208249

Thanks,
Jc

From rkennke at redhat.com Fri Jul 27 11:17:37 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 27 Jul 2018 13:17:37 +0200
Subject: RFR: JDK-8154343: Make SATB related code available to other GCs
Message-ID: <40bede30-fb98-2fcc-d29e-6172b515ced4@redhat.com>

This change moves SATB related code to gc/shared from where it can be used by other SATB GCs. I'm doing it in preparation for Shenandoah.

It also includes two new interfaces in CollectedHeap to make it independent from G1. One is for fetching per-thread-queue from GCThreadLocalData, and the other to support filtering.

I am not sure if the filtering is too performance sensitive to make a virtual call there. If it is, we may substitute the virtual call with something templated. I actually have such code in Shenandoah, but it's a bit ugly (it requires knowledge about GCs that use it to drive code generation). An alternative might be to move the whole filtering loop into (e.g.) CollectedHeap, from where we know the actual type and can drive a templated loop. However, I tend to think virtual call might be ok, and thought I'd start with something simple and nice.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8154343
Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8154343/webrev.00/

Testing: compiles and passes hotspot/jtreg:tier1

What do you think?
Roman

From yasuenag at gmail.com Fri Jul 27 14:31:03 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Fri, 27 Jul 2018 23:31:03 +0900
Subject: ZGC: RFR: 8207756: jstat should show CGC STW phases of ZGC
In-Reply-To: <5B59D88E.3040708@oracle.com>
References: <11866bdb-9e9d-019c-c64e-8548a79a7617@gmail.com> <107AD8D6-1E1F-4469-9352-8CCD3C84AA67@oracle.com> <51cc06a1-b027-4208-f2d9-327cbd009564@gmail.com> <5B59D88E.3040708@oracle.com>
Message-ID: <5858d3d5-7eed-18ae-a34a-5fbefd0a90ed@gmail.com>

Hi Erik, Per,

Thank you for your comment.
I uploaded new webrev. It has CGC counter in ZServiceability.

http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.01/

I know ZGC has three STW phases. They are called by VM_ZOperation. So I had added CGC counter in it at first. However I agree with you if ZServiceability should have it.

Yasumasa

On 2018/07/26 23:19, Erik Österlund wrote:
> Hi Yasumasa,
>
> There is only one global instance of CollectorCounters. Do we really have to pass it around into constructors and dump it in ZDriver?
> I don't see why it would be in any way problematic to just put this global instance of counters in ZServiceabilityCounters. And put the TraceCollectorStats tcs(_cgc_counters); stuff in ZServiceabilityCountersTracer or something like that.
>
> Also I saw this: _cgc_counters = new CollectorCounters("Z stop-the-world phases", 2);
>
> Does this assume there are two pauses for a ZGC collection cycle? There are at least 3, but possibly more:
>
> 1) Mark start
> 2) Mark end (this could recur multiple times due to unlucky races with mutators resurrecting weak references right before hitting mark end, causing a new concurrent phase if a large object graph was reachable from this weak ref). So we really can't know for sure how many pauses there will be.
> 3) Relocate start (done after reference processing and concurrent selection of relocation set)
>
> Thanks,
> /Erik
>
> On 2018-07-26 15:44, Yasumasa Suenaga wrote:
>> CC'ing to hotspot-gc-dev
>>
>> Hi Per,
>>
>> I've looked at ZServiceabilityCounters in zServiceability.cpp .
>> IMHO it is unsuitable because CGC counter means STW in concurrent GC. So I think it is suitable to implement in VMOperation for ZGC.
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2018/07/26 22:30, Per Lidén wrote:
>>> Hi,
>>>
>>> Please have a look at the zServiceability class, which is where code like this is intended to live.
>>>
>>> /Per
>>>
>>>> On 26 Jul 2018, at 14:30, Yasumasa Suenaga wrote:
>>>>
>>>> PING: Could you review it?
>>>>
>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.00/
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> On 2018/07/18 11:59, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>> Please review this change:
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8207756
>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207756/webrev.00/
>>>>> jstat shows CGC STW phases as CGC and CGCT fields since JDK-8153333 .
>>>>> However, ZGC does not adapt to it yet as following:
>>>>> ```
>>>>> $ jstat -gc 1234 1 1
>>>>> S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT CGC CGCT GCT
>>>>> - - - - - - 514048.0 12288.0 4352.0 3731.2 0.0 0.0 - - - - - - 0.000
>>>>> ```
>>>>> I uploaded webrev which is for jdk/jdk repo. Please tell me if it
>>>>> should be for zgc repo.
>>>>> I'm not an author of ZGC. So I need a sponsor.
>>>>> (I'm a Reviewer of jdk. If this change should be pushed to jdk/jdk,
>>>>> please tell me. I will push it after reviewing.)
>>>>> Thanks,
>>>>> Yasumasa
>>>
>

From kim.barrett at oracle.com Fri Jul 27 22:44:24 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 27 Jul 2018 18:44:24 -0400
Subject: RFR: JDK-8154343: Make SATB related code available to other GCs
In-Reply-To: <40bede30-fb98-2fcc-d29e-6172b515ced4@redhat.com>
References: <40bede30-fb98-2fcc-d29e-6172b515ced4@redhat.com>
Message-ID: <019BEC79-BC23-4634-85CF-DB33CBAA22CA@oracle.com>

I've started looking at this.

> On Jul 27, 2018, at 7:17 AM, Roman Kennke wrote:
>
> This change moves SATB related code to gc/shared from where it can be
> used by other SATB GCs. I'm doing it in preparation for Shenandoah.
>
> It also includes two new interfaces in CollectedHeap to make it
> independent from G1. One is for fetching per-thread-queue from
> GCThreadLocalData, and the other to support filtering.
>
> I am not sure if the filtering is too performance sensitive to make a
> virtual call there. If it is, we may substitute the virtual call with
> something templated. I actually have such code in Shenandoah, but it's a
> bit ugly (it requires knowledge about GCs that use it to drive code
> generation). An alternative might be to move the whole filtering loop
> into (e.g.) CollectedHeap, from where we know the actual type and can
> drive a templated loop. However, I tend to think virtual call might be
> ok, and thought I'd start with something simple and nice.

I'm very concerned about the possible performance impact here.
Such a change really needs to come with some performance testing.
(I'm going to run some of ours, but more would be better.)

I'm also not sure I like the proposed new interfaces generally, but
I'm still thinking about what I might like as an alternative. Performance
may drive some of that.

> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8154343
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8154343/webrev.00/
>
> Testing: compiles and passes hotspot/jtreg:tier1
>
> What do you think?
> Roman

You missed updating precompiled.hpp, which has #include "gc/g1/PtrQueue.hpp".

From rkennke at redhat.com Fri Jul 27 23:02:48 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Sat, 28 Jul 2018 01:02:48 +0200
Subject: RFR: JDK-8154343: Make SATB related code available to other GCs
In-Reply-To: <019BEC79-BC23-4634-85CF-DB33CBAA22CA@oracle.com>
References: <40bede30-fb98-2fcc-d29e-6172b515ced4@redhat.com> <019BEC79-BC23-4634-85CF-DB33CBAA22CA@oracle.com>
Message-ID: <383143e3-9759-effa-5916-dcdb20037d00@redhat.com>

On 28.07.2018 at 00:44, Kim Barrett wrote:
> I've started looking at this.
>
>> On Jul 27, 2018, at 7:17 AM, Roman Kennke wrote:
>>
>> This change moves SATB related code to gc/shared from where it can be
>> used by other SATB GCs. I'm doing it in preparation for Shenandoah.
>>
>> It also includes two new interfaces in CollectedHeap to make it
>> independent from G1. One is for fetching per-thread-queue from
>> GCThreadLocalData, and the other to support filtering.
>>
>> I am not sure if the filtering is too performance sensitive to make a
>> virtual call there. If it is, we may substitute the virtual call with
>> something templated. I actually have such code in Shenandoah, but it's a
>> bit ugly (it requires knowledge about GCs that use it to drive code
>> generation). An alternative might be to move the whole filtering loop
>> into (e.g.) CollectedHeap, from where we know the actual type and can
>> drive a templated loop. However, I tend to think virtual call might be
>> ok, and thought I'd start with something simple and nice.
>
> I'm very concerned about the possible performance impact here.
> Such a change really needs to come with some performance testing.
> (I'm going to run some of ours, but more would be better.)
>
> I'm also not sure I like the proposed new interfaces generally, but
> I'm still thinking about what I might like as an alternative. Performance
> may drive some of that.

I also don't like the new interfaces much.
Especially polluting CollectedHeap with something that GCs might not need (and in-fact, only two known GCs currently need, and only one of them is currently in mainline OpenJDK).

Maybe we can define an interface like SATBSupport that provides these new methods and have GCs pass in an implementation subclass at creation time?

With regards to performance, I think we can templatize the filter-loop: filter() would call out to SATBSupport::filter_driver() which would call back to SATBMarkQueue::filter_impl with known SATBSupportType which would then call the non-virtual and inlineable SATBSupport::retain_entry() of the specified type. This should solve all performance concerns I think.

>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8154343
>> Webrev:
>> http://cr.openjdk.java.net/~rkennke/JDK-8154343/webrev.00/
>>
>> Testing: compiles and passes hotspot/jtreg:tier1
>>
>> What do you think?
>> Roman
>
> You missed updating precompiled.hpp, which has #include "gc/g1/PtrQueue.hpp".

Ha, oops. And here I thought I'm doing something good by always building with --disable-precompiled-headers ;-)

Will come up with updated patch that implements the ideas above start of next week. Feel free to suggest improvements.

Thanks for reviewing!
Roman
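
The templated filter loop proposed in the message above might look roughly like the following. This is a minimal, self-contained sketch and not HotSpot code: `MarkQueue`, `ExampleSupport`, `filter_with` and the use of `int` entries are hypothetical stand-ins, and only the names `SATBSupport`, `filter_driver`, `filter_impl` and `retain_entry` are borrowed from the discussion.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SATB buffer; entries are plain ints here.
class MarkQueue {
  std::vector<int> _buf;
public:
  explicit MarkQueue(std::vector<int> entries) : _buf(std::move(entries)) {}

  // Filter loop templated on the policy type. Because Policy is known at
  // compile time, retain_entry() is a direct (inlineable) call per entry.
  template <typename Policy>
  void filter_impl(const Policy& policy) {
    std::size_t kept = 0;
    for (std::size_t i = 0; i < _buf.size(); ++i) {
      if (policy.retain_entry(_buf[i])) {
        _buf[kept++] = _buf[i];   // compact retained entries to the front
      }
    }
    _buf.resize(kept);
  }

  const std::vector<int>& entries() const { return _buf; }
};

// GC-facing interface: the only virtual call is made once per buffer.
class SATBSupport {
public:
  virtual ~SATBSupport() {}
  virtual void filter_driver(MarkQueue& queue) const = 0;
};

// Hypothetical GC-specific implementation; keeps even entries only.
class ExampleSupport : public SATBSupport {
public:
  // Non-virtual, so it can be inlined into filter_impl<ExampleSupport>.
  bool retain_entry(int entry) const { return entry % 2 == 0; }

  void filter_driver(MarkQueue& queue) const override {
    queue.filter_impl(*this);   // Policy is deduced as ExampleSupport
  }
};

// Helper for illustration: filter a list of entries through a support object.
std::vector<int> filter_with(const SATBSupport& support, std::vector<int> entries) {
  MarkQueue queue(std::move(entries));
  support.filter_driver(queue);
  return queue.entries();
}
```

With this shape there is one virtual dispatch per buffer (`filter_driver`), while the per-entry `retain_entry()` call is resolved statically, which is the property the proposal relies on. Whether that is enough to address the performance concern would still need measuring.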

From kim.barrett at oracle.com Fri Jul 27 23:17:53 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 27 Jul 2018 19:17:53 -0400
Subject: RFR: JDK-8154343: Make SATB related code available to other GCs
In-Reply-To: <383143e3-9759-effa-5916-dcdb20037d00@redhat.com>
References: <40bede30-fb98-2fcc-d29e-6172b515ced4@redhat.com> <019BEC79-BC23-4634-85CF-DB33CBAA22CA@oracle.com> <383143e3-9759-effa-5916-dcdb20037d00@redhat.com>
Message-ID: <8DF187C0-EAF8-422B-B6B8-073A48EB4622@oracle.com>

> On Jul 27, 2018, at 7:02 PM, Roman Kennke wrote:
>
> On 28.07.2018 at 00:44, Kim Barrett wrote:
>>
>> I'm also not sure I like the proposed new interfaces generally, but
>> I'm still thinking about what I might like as an alternative. Performance
>> may drive some of that.
>
> I also don't like the new interfaces much. Especially polluting
> CollectedHeap with something that GCs might not need (and in-fact, only
> two known GCs currently need, and only one of them is currently in
> mainline OpenJDK).
>
> Maybe we can define an interface like SATBSupport that provides these
> new methods and have GCs pass in an implementation subclass at creation
> time?
>
> With regards to performance, I think we can templatize the filter-loop:
> filter() would call out to SATBSupport::filter_driver() which would call
> back to SATBMarkQueue::filter_impl with known
> SATBSupportType which would then call the non-virtual and inlineable
> SATBSupport::retain_entry() of the specified type. This should solve all
> performance concerns I think.

I have in mind something that I think will be simpler than that, but need
to spend some time working through the details. I'll let you know what
I come up with.

>> You missed updating precompiled.hpp, which has #include "gc/g1/PtrQueue.hpp".
>
> Ha, oops.
> And here I thought I'm doing something good by always building
> with --disable-precompiled-headers ;-)

Our automated build system does both, else I would have missed it too, since I always do my local builds without precompiled headers.

I think the simplest solution in this case is to just remove the two offending includes. I don't think there are enough references to those classes to make "including" them everywhere worthwhile anyway.

From rkennke at redhat.com Sat Jul 28 11:13:47 2018
From: rkennke at redhat.com (Roman Kennke)
Date: Sat, 28 Jul 2018 13:13:47 +0200
Subject: RFR: JDK-8154343: Make SATB related code available to other GCs
In-Reply-To: <8DF187C0-EAF8-422B-B6B8-073A48EB4622@oracle.com>
References: <40bede30-fb98-2fcc-d29e-6172b515ced4@redhat.com> <019BEC79-BC23-4634-85CF-DB33CBAA22CA@oracle.com> <383143e3-9759-effa-5916-dcdb20037d00@redhat.com> <8DF187C0-EAF8-422B-B6B8-073A48EB4622@oracle.com>
Message-ID:

On 28.07.2018 at 01:17, Kim Barrett wrote:
>> On Jul 27, 2018, at 7:02 PM, Roman Kennke wrote:
>>
>> On 28.07.2018 at 00:44, Kim Barrett wrote:
>>>
>>> I'm also not sure I like the proposed new interfaces generally, but
>>> I'm still thinking about what I might like as an alternative. Performance
>>> may drive some of that.
>>
>> I also don't like the new interfaces much. Especially polluting
>> CollectedHeap with something that GCs might not need (and in-fact, only
>> two known GCs currently need, and only one of them is currently in
>> mainline OpenJDK).
>>
>> Maybe we can define an interface like SATBSupport that provides these
>> new methods and have GCs pass in an implementation subclass at creation
>> time?
>>
>> With regards to performance, I think we can templatize the filter-loop:
>> filter() would call out to SATBSupport::filter_driver() which would call
>> back to SATBMarkQueue::filter_impl with known
>> SATBSupportType which would then call the non-virtual and inlineable
>> SATBSupport::retain_entry() of the specified type.
>> This should solve all
>> performance concerns I think.
>
> I have in mind something that I think will be simpler than that, but need
> to spend some time working through the details. I'll let you know what
> I come up with.

Ok, I'll wait for you then :-)

Cheers,
Roman

From thomas.schatzl at oracle.com Mon Jul 30 11:33:08 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 30 Jul 2018 13:33:08 +0200
Subject: RFR(S) 8208249: TriggerUnloadingByFillingMetaspace generates garbage class names
In-Reply-To:
References:
Message-ID: <5b9aa5a3df5b7d2a69c23071ac2cead77c03f3f3.camel@oracle.com>

Hi,

On Thu, 2018-07-26 at 13:04 -0700, JC Beyler wrote:
> Hi all,
>
> I'm not sure this is the right list, let me know if not.
>
> Could someone review this small webrev that puts the
> GeneratedClassProducer in a ThreadLocal holder to remove the data
> race on it:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208249/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208249
>
> Thanks,
> Jc

looks good to me.

Thomas

From thomas.schatzl at oracle.com Mon Jul 30 13:03:20 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 30 Jul 2018 15:03:20 +0200
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To:
References:
Message-ID: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>

Hi Paul,

did some prototyping and wanted to show you the results and get your input:

On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
> [...]
> Could we work together on first refactoring the code before adding
> new
> kinds of spaces to the MXBeans?
>
> Looking at this change and mine roughly the following issues would
> need to be resolved first:
> - find a solution for archive regions as suggested above :) At the
> moment, without doing the change, I would tend to make archive
> regions separate from old regions.

I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/

> - move serviceability stuff as much as possible to
> g1MonitoringSupport

Preliminary webrev:
http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/

I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit.

Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too).

It would be nice if something similar could be made for the concurrent Trace*Stats.

> - clean up MemoryPool, remove duplicate information
> - provide and return sane memory pool used/committed values to the
> MXBeans
> - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
> variables
> for every single memory pool. Use MemoryUsage structs for them. Make
> reading of memory pool information atomic wrt to its readers (note
> that I think it is currently just impossible to get consistent output
> for other statistics like jstat) - that's JDK-8207200.
> - add whatever serviceability stuff for the new pools/jstat/* in
> steps.

Thanks,
Thomas

From hohensee at amazon.com Mon Jul 30 19:18:27 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 30 Jul 2018 19:18:27 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>
References: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>
Message-ID: <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>

At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones.

Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it.

I'd not have thought of making a G1MonitoringScope, looks good.

Thanks,

Paul

On 7/30/18, 6:04 AM, "Thomas Schatzl" wrote:

    Hi Paul,

    did some prototyping and wanted to show you the results and get your input:

    On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
    > [...]
    > Could we work together on first refactoring the code before adding
    > new
    > kinds of spaces to the MXBeans?
    >
    > Looking at this change and mine roughly the following issues would
    > need to be resolved first:
    > - find a solution for archive regions as suggested above :) At the
    > moment, without doing the change, I would tend to make archive
    > regions separate from old regions.

    I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/

    > - move serviceability stuff as much as possible to
    > g1MonitoringSupport

    Preliminary webrev:
    http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/

    I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit.

    Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too).

    It would be nice if something similar could be made for the concurrent Trace*Stats.

    > - clean up MemoryPool, remove duplicate information
    > - provide and return sane memory pool used/committed values to the
    > MXBeans
    > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
    > variables
    > for every single memory pool. Use MemoryUsage structs for them. Make
    > reading of memory pool information atomic wrt to its readers (note
    > that I think it is currently just impossible to get consistent output
    > for other statistics like jstat) - that's JDK-8207200.
    > - add whatever serviceability stuff for the new pools/jstat/* in
    > steps.

    Thanks,
    Thomas

From jcbeyler at google.com Mon Jul 30 19:28:34 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 30 Jul 2018 12:28:34 -0700
Subject: RFR (XS) 8169004: arguments/TestTargetSurvivorRatioFlag.java has redundant @requires tag
Message-ID:

Hi all,

Could I get a few reviews for a really small webrev?

Webrev: http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8169004

Thanks!
Jc

From jcbeyler at google.com Mon Jul 30 19:34:23 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 30 Jul 2018 12:34:23 -0700
Subject: RFR (L) 8208246: flags duplications in vmTestbase_vm_g1classunloading tests
Message-ID:

Hi all,

Could I get a review for:

Webrev: http://cr.openjdk.java.net/~jcbeyler/8208246/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208246

Basically, I removed the duplicate flags in the various tests.

Additional notes due to the number of files changed:
- I used an awk script to remove any duplicate line that was not just an empty commented line and occurred after a line containing @run.
- I then ran a script to check that all lines removed from the tests were still present in the test files (sanity check of my script)
- Finally, I ran the tests and they still pass via make run-test-only on my dev machine

Thanks,
Jc

From hohensee at amazon.com Mon Jul 30 23:26:57 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 30 Jul 2018 23:26:57 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>
References: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>
Message-ID: <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com>

A couple nits on http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/.

g1CollectedHeap.cpp: in initialize_serviceability(), memory_managers(), and memory_pools(), use g1mm() instead of _g1mm.

g1MonitoringSupport.cpp: there's an extra newline after ~G1MonitoringSupport().

Otherwise looks good.

Paul

On 7/30/18, 12:18 PM, "Hohensee, Paul" wrote:

    At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones.

    Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it.

    I'd not have thought of making a G1MonitoringScope, looks good.

    Thanks,

    Paul

    On 7/30/18, 6:04 AM, "Thomas Schatzl" wrote:

        Hi Paul,

        did some prototyping and wanted to show you the results and get your input:

        On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
        > [...]
        > Could we work together on first refactoring the code before adding
        > new
        > kinds of spaces to the MXBeans?
        >
        > Looking at this change and mine roughly the following issues would
        > need to be resolved first:
        > - find a solution for archive regions as suggested above :) At the
        > moment, without doing the change, I would tend to make archive
        > regions separate from old regions.

        I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/

        > - move serviceability stuff as much as possible to
        > g1MonitoringSupport

        Preliminary webrev:
        http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/

        I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit.
Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too).

It would be nice if something similar could be made for the concurrent Trace*Stats.

> - clean up MemoryPool, remove duplicate information
> - provide and return sane memory pool used/committed values to the
> MXBeans
> - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" variables
> for every single memory pool. Use MemoryUsage structs for them. Make
> reading of memory pool information atomic with respect to its readers (note
> that I think it is currently just impossible to get consistent output
> for other statistics like jstat) - that's JDK-8207200.
> - add whatever serviceability stuff for the new pools/jstat/* in
> steps.

Thanks,
Thomas

From thomas.schatzl at oracle.com Tue Jul 31 07:38:28 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 31 Jul 2018 09:38:28 +0200
Subject: RFR (XS) 8169004: arguments/TestTargetSurvivorRatioFlag.java has redundant @requires tag
In-Reply-To:
References:
Message-ID: <1d6a2dcf1c19e7c2a823c533b445a3a03c3b4d65.camel@oracle.com>

Hi,

On Mon, 2018-07-30 at 12:28 -0700, JC Beyler wrote:
> Hi all,
>
> Could I get a few reviews for a really small webrev?
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8169004
>
> Thanks!
> Jc

looks good and trivial. I will sponsor it.
Thanks,
Thomas

From thomas.schatzl at oracle.com Tue Jul 31 07:49:35 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 31 Jul 2018 09:49:35 +0200
Subject: RFR (L) 8208246: flags duplications in vmTestbase_vm_g1classunloading tests
In-Reply-To:
References:
Message-ID: <3aeac869b5e66b43a554e665a30dfbb176d4109f.camel@oracle.com>

Hi,

On Mon, 2018-07-30 at 12:34 -0700, JC Beyler wrote:
> Hi all,
>
> Could I get a review for:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208246/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208246
>
> Basically, I removed the duplicate flags in the various tests.
>
> Additional notes due to the number of files changed:
> - I used an awk script to remove any duplicate line that was not
> just an empty commented line and occurred after a line
> containing @run.
> - I then ran a script to check that all lines removed from the
> tests were still present in the test files (sanity check of my
> script)
> - Finally, I ran the tests and they still pass via
> make run-test-only on my dev machine
>

looks good.

Thomas

From jcbeyler at google.com Tue Jul 31 16:50:54 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 31 Jul 2018 09:50:54 -0700
Subject: RFR (S) 8069343: Improve gc/g1/TestHumongousCodeCacheRoots.java to use jtreg @requires
Message-ID:

Hi all,

I cleaned up the TestHumongousCodeCacheRoots to no longer try to do client and then server test runs.

Let me know what you think:

Webrev: http://cr.openjdk.java.net/~jcbeyler/8069343/webrev.01/
Bug: https://bugs.openjdk.java.net/browse/JDK-8069343

Thanks!
Jc
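The webrev for 8069343 is linked rather than quoted, so the actual @requires line is not shown in this thread. Purely as an illustration of the general approach, a jtreg test can state its prerequisite in the header comment and let the harness skip it when the condition is false, instead of trying client and server runs from within the test. In the sketch below the class name RequiresSketch, the choice of the vm.gc.G1 property, and the @run line are assumptions made for the example; they are not taken from the change under review.

```java
/*
 * @test
 * @summary Hypothetical sketch only; not the header from the 8069343 webrev.
 * @requires vm.gc.G1
 * @run main/othervm -XX:+UseG1GC RequiresSketch
 */

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class RequiresSketch {

    // Returns the names of the collectors the running VM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // When run under jtreg, the @requires line above decides whether this
        // test is selected at all, so the body does not probe VM variants itself.
        System.out.println("Collectors in this VM: " + collectorNames());
    }
}
```

Run directly with java, the header is just a comment; only jtreg evaluates the @requires expression.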
From igor.ignatyev at oracle.com Tue Jul 31 16:51:30 2018
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 31 Jul 2018 09:51:30 -0700
Subject: RFR(S) 8208249: TriggerUnloadingByFillingMetaspace generates garbage class names
In-Reply-To: <5b9aa5a3df5b7d2a69c23071ac2cead77c03f3f3.camel@oracle.com>
References: <5b9aa5a3df5b7d2a69c23071ac2cead77c03f3f3.camel@oracle.com>
Message-ID: <517A4B8E-0B65-4A5E-9154-347CF7758910@oracle.com>

Hi JC,

looks good to me.

-- Igor

> On Jul 30, 2018, at 4:33 AM, Thomas Schatzl wrote:
>
> Hi,
>
> On Thu, 2018-07-26 at 13:04 -0700, JC Beyler wrote:
>> Hi all,
>>
>> I'm not sure this is the right list, let me know if not.
>>
>> Could someone review this small webrev that puts the
>> GeneratedClassProducer in a ThreadLocal holder to remove the data
>> race on it:
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208249/webrev.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208249
>>
>> Thanks,
>> Jc
>
> looks good to me.
>
> Thomas

From igor.ignatyev at oracle.com Tue Jul 31 16:57:54 2018
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 31 Jul 2018 09:57:54 -0700
Subject: RFR (XS) 8169004: arguments/TestTargetSurvivorRatioFlag.java has redundant @requires tag
In-Reply-To: <1d6a2dcf1c19e7c2a823c533b445a3a03c3b4d65.camel@oracle.com>
References: <1d6a2dcf1c19e7c2a823c533b445a3a03c3b4d65.camel@oracle.com>
Message-ID: <109345B8-B4F8-4803-8B82-6BF984C3A2C9@oracle.com>

Hi JC,

would you mind removing similar redundancy in other tests? e.g. 'vm.opt.AggressiveOpts=="false" | vm.opt.AggressiveOpts=="null"' in a few test/hotspot/jtreg/gc/g1/ tests can be replaced w/ 'vm.opt.AggressiveOpts != true'.

the parentheses around simple predicate seem redundant to me as well.

Thanks,
-- Igor

> On Jul 31, 2018, at 12:38 AM, Thomas Schatzl wrote:
>
> Hi,
>
> On Mon, 2018-07-30 at 12:28 -0700, JC Beyler wrote:
>> Hi all,
>>
>> Could I get a few reviews for a really small webrev?
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8169004
>>
>> Thanks!
>> Jc
>
> looks good and trivial. I will sponsor it.
>
> Thanks,
> Thomas
>

From igor.ignatyev at oracle.com Tue Jul 31 16:59:56 2018
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 31 Jul 2018 09:59:56 -0700
Subject: RFR (L) 8208246: flags duplications in vmTestbase_vm_g1classunloading tests
In-Reply-To: <3aeac869b5e66b43a554e665a30dfbb176d4109f.camel@oracle.com>
References: <3aeac869b5e66b43a554e665a30dfbb176d4109f.camel@oracle.com>
Message-ID: <53F9816E-33FD-4A8D-A736-34D13F84C2AA@oracle.com>

Hi JC,

the fix looks good to me. thanks for fixing it.

Cheers,
-- Igor

> On Jul 31, 2018, at 12:49 AM, Thomas Schatzl wrote:
>
> Hi,
>
> On Mon, 2018-07-30 at 12:34 -0700, JC Beyler wrote:
>> Hi all,
>>
>> Could I get a review for:
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208246/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208246
>>
>> Basically, I removed the duplicate flags in the various tests.
>>
>> Additional notes due to the number of files changed:
>> - I used an awk script to remove any duplicate line that was not
>> just an empty commented line and occurred after a line
>> containing @run.
>> - I then ran a script to check that all lines removed from the
>> tests were still present in the test files (sanity check of my
>> script)
>> - Finally, I ran the tests and they still pass via
>> make run-test-only on my dev machine
>>
>
> looks good.
>
> Thomas
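The awk script mentioned in the quoted notes was not posted to the list. The rule it applied can be sketched roughly as follows, here in Java rather than awk. Several details are assumptions made for the example and not statements about JC's script: an "empty commented line" is taken to be a line holding only `*` or `//`, the set of already seen lines is reset at every `@run`, and de-duplication stops at the closing `*/` so that ordinary source lines such as repeated `}` are never dropped. The class name RunFlagDedup is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the script used for 8208246.
final class RunFlagDedup {

    // Assumption: an "empty commented line" contains only "*" or "//".
    static boolean isEmptyComment(String line) {
        String t = line.trim();
        return t.equals("*") || t.equals("//");
    }

    // Removes lines that repeat an earlier line within the same @run block.
    static List<String> dedupAfterRun(List<String> lines) {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        boolean afterRun = false;
        for (String line : lines) {
            if (line.contains("@run")) {
                // A new @run command starts a fresh block of flags.
                seen.clear();
                afterRun = true;
                out.add(line);
            } else if (line.contains("*/")) {
                // End of the header comment: stop de-duplicating.
                afterRun = false;
                out.add(line);
            } else if (afterRun && !isEmptyComment(line) && !seen.add(line)) {
                // Duplicate flag line after @run: drop it.
                continue;
            } else {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> header = List.of(
            "/*",
            " * @run main/othervm",
            " *      -XX:+WhiteBoxAPI",
            " *      -XX:+WhiteBoxAPI",
            " */");
        dedupAfterRun(header).forEach(System.out::println);
    }
}
```

Comparing the input and output line lists afterwards gives the same kind of sanity check the second bullet describes: every removed line should still occur at least once in the result.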
From jcbeyler at google.com Tue Jul 31 17:18:18 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 31 Jul 2018 10:18:18 -0700
Subject: RFR(S) 8208249: TriggerUnloadingByFillingMetaspace generates garbage class names
In-Reply-To: <517A4B8E-0B65-4A5E-9154-347CF7758910@oracle.com>
References: <5b9aa5a3df5b7d2a69c23071ac2cead77c03f3f3.camel@oracle.com> <517A4B8E-0B65-4A5E-9154-347CF7758910@oracle.com>
Message-ID:

Thanks both!

Here is the new webrev if someone could push it :-):
http://cr.openjdk.java.net/~jcbeyler/8208249/webrev.01/

Thanks again!
Jc

On Tue, Jul 31, 2018 at 9:51 AM Igor Ignatyev wrote:
> Hi JC,
>
> looks good to me.
>
> -- Igor
>
> > On Jul 30, 2018, at 4:33 AM, Thomas Schatzl wrote:
> >
> > Hi,
> >
> > On Thu, 2018-07-26 at 13:04 -0700, JC Beyler wrote:
> >> Hi all,
> >>
> >> I'm not sure this is the right list, let me know if not.
> >>
> >> Could someone review this small webrev that puts the
> >> GeneratedClassProducer in a ThreadLocal holder to remove the data
> >> race on it:
> >>
> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208249/webrev.00/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8208249
> >>
> >> Thanks,
> >> Jc
> >
> > looks good to me.
> >
> > Thomas
>

--

Thanks,
Jc

From jcbeyler at google.com Tue Jul 31 17:31:39 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 31 Jul 2018 10:31:39 -0700
Subject: RFR (L) 8208246: flags duplications in vmTestbase_vm_g1classunloading tests
In-Reply-To: <53F9816E-33FD-4A8D-A736-34D13F84C2AA@oracle.com>
References: <3aeac869b5e66b43a554e665a30dfbb176d4109f.camel@oracle.com> <53F9816E-33FD-4A8D-A736-34D13F84C2AA@oracle.com>
Message-ID:

Hi all,

Here is a webrev ready for a push if someone could do it:
http://cr.openjdk.java.net/~jcbeyler/8208246/webrev.01/

Thanks both for the reviews!
Jc

On Tue, Jul 31, 2018 at 10:00 AM Igor Ignatyev wrote:
> Hi JC,
>
> the fix looks good to me.
thanks for fixing it.
>
> Cheers,
> -- Igor
>
> On Jul 31, 2018, at 12:49 AM, Thomas Schatzl wrote:
>
> Hi,
>
> On Mon, 2018-07-30 at 12:34 -0700, JC Beyler wrote:
>
> Hi all,
>
> Could I get a review for:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208246/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208246
>
> Basically, I removed the duplicate flags in the various tests.
>
> Additional notes due to the number of files changed:
> - I used an awk script to remove any duplicate line that was not
> just an empty commented line and occurred after a line
> containing @run.
> - I then ran a script to check that all lines removed from the
> tests were still present in the test files (sanity check of my
> script)
> - Finally, I ran the tests and they still pass via
> make run-test-only on my dev machine
>
> looks good.
>
> Thomas
>

--

Thanks,
Jc

From jcbeyler at google.com Tue Jul 31 18:18:09 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 31 Jul 2018 11:18:09 -0700
Subject: RFR (XS) 8169004: arguments/TestTargetSurvivorRatioFlag.java has redundant @requires tag
In-Reply-To: <109345B8-B4F8-4803-8B82-6BF984C3A2C9@oracle.com>
References: <1d6a2dcf1c19e7c2a823c533b445a3a03c3b4d65.camel@oracle.com> <109345B8-B4F8-4803-8B82-6BF984C3A2C9@oracle.com>
Message-ID:

Hi Igor,

Here is a webrev that does the 'X == false | X == null' -> 'X != true' for all the ones I could find across the tests. I also fixed the parenthesis issue.

Not sure if it was important, but I updated the copyright year for files that did not have 2018.

http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.01/jdk12-test3.changeset

Let me know what you think,
Jc

On Tue, Jul 31, 2018 at 9:57 AM Igor Ignatyev wrote:
> Hi JC,
>
> would you mind removing similar redundancy in other tests? e.g.
> 'vm.opt.AggressiveOpts=="false" | vm.opt.AggressiveOpts=="null"' in a few
> test/hotspot/jtreg/gc/g1/ tests can be replaced w/
> 'vm.opt.AggressiveOpts != true'.
>
> the parentheses around simple predicate seem redundant to me as well.
>
> Thanks,
> -- Igor
>
> > On Jul 31, 2018, at 12:38 AM, Thomas Schatzl wrote:
> >
> > Hi,
> >
> > On Mon, 2018-07-30 at 12:28 -0700, JC Beyler wrote:
> >> Hi all,
> >>
> >> Could I get a few reviews for a really small webrev?
> >>
> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.00/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8169004
> >>
> >> Thanks!
> >> Jc
> >
> > looks good and trivial. I will sponsor it.
> >
> > Thanks,
> > Thomas
> >
>

--

Thanks,
Jc

From hohensee at amazon.com Tue Jul 31 18:45:14 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 31 Jul 2018 18:45:14 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com>
References: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com> <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com>
Message-ID:

A few small things for http://cr.openjdk.java.net/~tschatzl/8208498/webrev/, otherwise looks good.

collectionSetChooser.cpp: Doesn't !r->is_old() include is_archive()?

g1CollectedHeap.hpp: Add archive_region_add(), archive_region_remove(), and old_set_bulk_remove(). In non_young_capacity_bytes(), use old_regions_count(), humongous_regions_count(), and archive_regions_count().

g1CollectedHeap.cpp: Use old_set_add() and friends where possible. "// humongous regions set." -> "// humongous and archive region sets."

On 7/30/18, 4:27 PM, "Hohensee, Paul" wrote:

A couple nits on http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/.
g1CollectedHeap.cpp: in initialize_serviceability(), memory_managers(), and memory_pools(), use g1mm() instead of _g1mm.

g1MonitoringSupport.cpp: there's an extra newline after ~G1MonitoringSupport().

Otherwise looks good.

Paul

On 7/30/18, 12:18 PM, "Hohensee, Paul" wrote:

At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones.

Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it.

I'd not have thought of making a G1MonitoringScope, looks good.

Thanks,

Paul

On 7/30/18, 6:04 AM, "Thomas Schatzl" wrote:

Hi Paul,

did some prototyping and wanted to show you the results and get your input:

On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
> [...]
> Could we work together on first refactoring the code before adding new
> kinds of spaces to the MXBeans?
>
> Looking at this change and mine roughly the following issues would
> need to be resolved first:
> - find a solution for archive regions as suggested above :) At the
> moment, without doing the change, I would tend to make archive
> regions separate from old regions.
I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/

> - move serviceability stuff as much as possible to
> g1MonitoringSupport

Preliminary webrev: http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/

I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit.

Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too).

It would be nice if something similar could be made for the concurrent Trace*Stats.

> - clean up MemoryPool, remove duplicate information
> - provide and return sane memory pool used/committed values to the
> MXBeans
> - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" variables
> for every single memory pool. Use MemoryUsage structs for them. Make
> reading of memory pool information atomic with respect to its readers (note
> that I think it is currently just impossible to get consistent output
> for other statistics like jstat) - that's JDK-8207200.
> - add whatever serviceability stuff for the new pools/jstat/* in
> steps.

Thanks,
Thomas

From igor.ignatyev at oracle.com Tue Jul 31 19:13:50 2018
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 31 Jul 2018 12:13:50 -0700
Subject: RFR (XS) 8169004: arguments/TestTargetSurvivorRatioFlag.java has redundant @requires tag
In-Reply-To:
References: <1d6a2dcf1c19e7c2a823c533b445a3a03c3b4d65.camel@oracle.com> <109345B8-B4F8-4803-8B82-6BF984C3A2C9@oracle.com>
Message-ID:

Hi JC,

you should change second year instead of adding another one[*].
otherwise the fix looks good to me.

-- Igor

[*] http://mail.openjdk.java.net/pipermail/jdk7-dev/2010-May/001321.html

> On Jul 31, 2018, at 11:18 AM, JC Beyler wrote:
>
> Hi Igor,
>
> Here is a webrev that does the 'X == false | X == null' -> 'X != true' for all the ones I could find across the tests.
> I also fixed the parenthesis issue.
> Not sure if it was important, but I updated the copyright year for files that did not have 2018.
>
> http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.01/jdk12-test3.changeset
>
> Let me know what you think,
> Jc
>
>
> On Tue, Jul 31, 2018 at 9:57 AM Igor Ignatyev wrote:
> Hi JC,
>
> would you mind removing similar redundancy in other tests? e.g. 'vm.opt.AggressiveOpts=="false" | vm.opt.AggressiveOpts=="null"' in a few test/hotspot/jtreg/gc/g1/ tests can be replaced w/ 'vm.opt.AggressiveOpts != true'.
>
> the parentheses around simple predicate seem redundant to me as well.
>
> Thanks,
> -- Igor
>
> > On Jul 31, 2018, at 12:38 AM, Thomas Schatzl wrote:
> >
> > Hi,
> >
> > On Mon, 2018-07-30 at 12:28 -0700, JC Beyler wrote:
> >> Hi all,
> >>
> >> Could I get a few reviews for a really small webrev?
> >>
> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.00/
> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8169004
> >>
> >> Thanks!
> >> Jc
> >
> > looks good and trivial. I will sponsor it.
> >
> > Thanks,
> > Thomas
> >
>
>
> --
>
> Thanks,
> Jc

From kim.barrett at oracle.com Tue Jul 31 19:23:04 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 31 Jul 2018 15:23:04 -0400
Subject: RFR: 8072498: Multi-thread JNI weak reference processing
Message-ID:

Please review this change to WeakProcessor to support processing by multiple threads in parallel. This change uses the WorkGang infrastructure to provide tasking and thread management. The number of threads to use may be determined ergonomically, based on ReferencesPerThread.
At this time, only G1 makes use of this change. CMS is deprecated, so we're not spending effort on enhancements of it. ParallelGC doesn't use the WorkGang mechanism for parallelism, leading to the same issues here as led to not making ParallelGC's j.l.r.Reference processing use an ergonomically determined number of threads. We should fix JDK-8204951 before trying to make ParallelGC use the parallel form of WeakProcessor.

As part of this, introduced WeakProcessorPhases (to enumerate and manipulate the phases) and WeakProcessorPhaseTimes (for collecting and reporting timing information for the phases).

CR:
https://bugs.openjdk.java.net/browse/JDK-8072498

Webrev:
http://cr.openjdk.java.net/~kbarrett/8072498/open.00/

Testing:
mach5 tier1-3, hs-tier4-5.

Local and mach5 testing of TestGCBasherWithG1 and TestSystemGCWithG1, with the tests modified to use a smaller non-default value of -XX:ReferencesPerThread, to examine the logging output and verify multi-threaded execution.

From jcbeyler at google.com Tue Jul 31 19:27:52 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 31 Jul 2018 12:27:52 -0700
Subject: RFR (XS) 8169004: arguments/TestTargetSurvivorRatioFlag.java has redundant @requires tag
In-Reply-To:
References: <1d6a2dcf1c19e7c2a823c533b445a3a03c3b4d65.camel@oracle.com> <109345B8-B4F8-4803-8B82-6BF984C3A2C9@oracle.com>
Message-ID:

Hi Igor,

Ah, sorry about that. Here is the webrev with fixed copyrights then:
http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.02/

I'll wait for a second review since now the webrev that Thomas did review has changed (or wait until Thomas redoes a review :-)).

Thanks!
Jc

On Tue, Jul 31, 2018 at 12:13 PM Igor Ignatyev wrote:
> Hi JC,
>
> you should change second year instead of adding another one[*]. otherwise
> the fix looks good to me.
>
> -- Igor
>
> [*] http://mail.openjdk.java.net/pipermail/jdk7-dev/2010-May/001321.html
>
> On Jul 31, 2018, at 11:18 AM, JC Beyler wrote:
>
> Hi Igor,
>
> Here is a webrev that does the 'X == false | X == null' -> 'X != true' for
> all the ones I could find across the tests.
> I also fixed the parenthesis issue.
>
> Not sure if it was important, but I updated the copyright year for files
> that did not have 2018.
>
> http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.01/jdk12-test3.changeset
>
> Let me know what you think,
> Jc
>
>
> On Tue, Jul 31, 2018 at 9:57 AM Igor Ignatyev wrote:
>
>> Hi JC,
>>
>> would you mind removing similar redundancy in other tests? e.g.
>> 'vm.opt.AggressiveOpts=="false" | vm.opt.AggressiveOpts=="null"' in a few
>> test/hotspot/jtreg/gc/g1/ tests can be replaced w/
>> 'vm.opt.AggressiveOpts != true'.
>>
>> the parentheses around simple predicate seem redundant to me as well.
>>
>> Thanks,
>> -- Igor
>>
>> > On Jul 31, 2018, at 12:38 AM, Thomas Schatzl wrote:
>> >
>> > Hi,
>> >
>> > On Mon, 2018-07-30 at 12:28 -0700, JC Beyler wrote:
>> >> Hi all,
>> >>
>> >> Could I get a few reviews for a really small webrev?
>> >>
>> >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8169004/webrev.00/
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8169004
>> >>
>> >> Thanks!
>> >> Jc
>> >
>> > looks good and trivial. I will sponsor it.
>> >
>> > Thanks,
>> > Thomas
>> >
>>
>
> --
>
> Thanks,
> Jc
>

--

Thanks,
Jc

From daniel.daugherty at oracle.com Tue Jul 31 20:09:20 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 31 Jul 2018 16:09:20 -0400
Subject: RFR(XS): 8208605 Fix for 8199868 breaks tier1 build
Message-ID:

Greetings,

I have a proposed fix for the following bug:

    JDK-8208605 Fix for 8199868 breaks tier1 build
    https://bugs.openjdk.java.net/browse/JDK-8208605

Webrev: http://cr.openjdk.java.net/~dcubed/8208605-webrev/0_for_jdk_jdk/

I'm looking for a single (R)eviewer for this trivial change.

This fix is tested by local builds on my Solaris-X64 server and by a "mach5 remote-build-and-test --job builds-tier1,hs-tier1".

Thanks, in advance, for any questions, comments or suggestions...

Dan

From kim.barrett at oracle.com Tue Jul 31 20:12:43 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 31 Jul 2018 16:12:43 -0400
Subject: RFR(XS): 8208605 Fix for 8199868 breaks tier1 build
In-Reply-To:
References:
Message-ID: <058A4E7D-8704-40FE-98D6-85FC4752296D@oracle.com>

> On Jul 31, 2018, at 4:09 PM, Daniel D. Daugherty wrote:
>
> Greetings,
>
> I have a proposed fix for the following bug:
>
>     JDK-8208605 Fix for 8199868 breaks tier1 build
>     https://bugs.openjdk.java.net/browse/JDK-8208605
>
> Webrev: http://cr.openjdk.java.net/~dcubed/8208605-webrev/0_for_jdk_jdk/
>
> I'm looking for a single (R)eviewer for this trivial change.
>
> This fix is tested by local builds on my Solaris-X64 server and
> by a "mach5 remote-build-and-test --job builds-tier1,hs-tier1".
>
> Thanks, in advance, for any questions, comments or suggestions...
>
> Dan

Looks good, and trivial.

From daniel.daugherty at oracle.com Tue Jul 31 20:14:32 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 31 Jul 2018 16:14:32 -0400
Subject: RFR(XS): 8208605 Fix for 8199868 breaks tier1 build
In-Reply-To: <058A4E7D-8704-40FE-98D6-85FC4752296D@oracle.com>
References: <058A4E7D-8704-40FE-98D6-85FC4752296D@oracle.com>
Message-ID: <4fba5c2c-61a1-c69c-3995-534650f52aa7@oracle.com>

On 7/31/18 4:12 PM, Kim Barrett wrote:
>> On Jul 31, 2018, at 4:09 PM, Daniel D.
Daugherty wrote:
>>
>> Greetings,
>>
>> I have a proposed fix for the following bug:
>>
>>     JDK-8208605 Fix for 8199868 breaks tier1 build
>>     https://bugs.openjdk.java.net/browse/JDK-8208605
>>
>> Webrev: http://cr.openjdk.java.net/~dcubed/8208605-webrev/0_for_jdk_jdk/
>>
>> I'm looking for a single (R)eviewer for this trivial change.
>>
>> This fix is tested by local builds on my Solaris-X64 server and
>> by a "mach5 remote-build-and-test --job builds-tier1,hs-tier1".
>>
>> Thanks, in advance, for any questions, comments or suggestions...
>>
>> Dan
> Looks good, and trivial.

Kim, thanks for the fast review!

I'll wait for the test builds to finish since it would be ironic if my build fix broke a different build... :-)

Dan

From zgu at redhat.com Tue Jul 31 20:52:31 2018
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 31 Jul 2018 16:52:31 -0400
Subject: RFR(XS): 8208605 Fix for 8199868 breaks tier1 build
In-Reply-To:
References:
Message-ID:

Thanks for fixing this, Dan.

-Zhengyu

On 07/31/2018 04:09 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have a proposed fix for the following bug:
>
>     JDK-8208605 Fix for 8199868 breaks tier1 build
>     https://bugs.openjdk.java.net/browse/JDK-8208605
>
> Webrev: http://cr.openjdk.java.net/~dcubed/8208605-webrev/0_for_jdk_jdk/
>
> I'm looking for a single (R)eviewer for this trivial change.
>
> This fix is tested by local builds on my Solaris-X64 server and
> by a "mach5 remote-build-and-test --job builds-tier1,hs-tier1".
>
> Thanks, in advance, for any questions, comments or suggestions...
>
> Dan
>

From kim.barrett at oracle.com Tue Jul 31 23:07:02 2018
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 31 Jul 2018 19:07:02 -0400
Subject: 8208611: Refactor SATBMarkQueue filtering to allow GC-specific filters
Message-ID: <928D8480-FF84-4EBA-A127-8C43AAD746CA@oracle.com>

Please review this change to the implementation of SATB mark queue filtering to permit a GC-specific filter to be provided.
This is a preliminary step toward being able to share (most of) the SATB mark queue code between G1 and Shenandoah.

We introduce a new abstract class, SATBMarkQueueFilter, which is responsible for filtering a queue. A SATBMarkQueueSet contains the filter object, which is constructed and installed by the GC-specific initialization of the qset.

The former filter function has been changed to be a function template, with a function argument that provides the filter-out / retain decision. The GC-specific filter object calls that template with an appropriate function for the GC.

For G1, it all gets nicely inlined (with gcc; I haven't looked at the generated code for other compilers), so the only overhead for this refactoring is the replacement of a (possibly inlined, but probably not) ordinary function call with a virtual function call per buffer completion.

CR:
https://bugs.openjdk.java.net/browse/JDK-8208611

Webrev:
http://cr.openjdk.java.net/~kbarrett/8208611/open.00/

Testing:
mach5 tier1-3, hs-tier4-5.

From mearvk at outlook.com Sun Jul 22 16:12:00 2018
From: mearvk at outlook.com (mr rupplin)
Date: Sun, 22 Jul 2018 16:12:00 +0000
Subject: JDK Memory Allocation
In-Reply-To: <87801549-1b64-afa3-95dc-95994b0d0fea@oracle.com>
References: , <87801549-1b64-afa3-95dc-95994b0d0fea@oracle.com>
Message-ID:

David.

Thanks. Seems that you're right. In the JDK9 stack they seem to have struck onto a different formula. Let us know if you have that one too or also.

Kthxbye,

Mr. Rupplin
/sr software developer

________________________________
From: David Buck
Sent: Saturday, July 21, 2018 9:10 PM
To: mr rupplin
Cc: hotspot-gc-dev at openjdk.java.net
Subject: Re: JDK Memory Allocation

Hi Max!

Your question does not seem to be related to building OpenJDK, so I have BCCed build-dev from the thread and added gc-dev. That said, I am not sure any of the development lists are really an ideal place to ask general "code walk through" questions.
If really necessary, memAllocator.cpp [0] would probably be as good a place as any to start reading the source code. But unless you intend to hack on the JVM itself, trying to read this source code may not be the most productive use of your time. You may get a lot more out of reading some of the wikis [1], blogs [2], and books [3][4] that cover the HotSpot JVM in detail. Even if you ultimately choose to read the source code directly, reading these other types of resources first should really help you make better sense of what you see in the source code.

Cheers,
-Buck

[0] http://hg.openjdk.java.net/jdk/jdk/file/b0fcf59be391/src/hotspot/share/gc/shared/memAllocator.cpp
[1] https://wiki.openjdk.java.net/display/HotSpot/Main
[2] https://shipilev.net/jvm-anatomy-park/
[3] https://www.goodreads.com/book/show/13227108-java-performance
[4] https://www.goodreads.com/book/show/23316035-java-performance-companion

On 2018/07/22 8:51, mr rupplin wrote:
> Having looked for some while at the OpenJDK source code I am unable to find where the memory allocation occurs. I will be working very much with the JDK and would like to get a firm grasp on its underlying mechanisms.
>
> public class JustAsk
> {
>     public static void main(String...args)
>     {
>         for(int i=0; i<100; i++)
>         {
>             new JustAsk();
>         }
>     }
> }
>
> This doesn't seem to rely on any of the functions in the libjli nor of the jni.h. So clearly where do we look for the handler here?
>
> Thanks,
>
> Your friend Max
>